How to select the number of time segments that contain an event? - influxdb

I am trying to implement a "time spent on platform" metric, grouped by user and day.
My test data has 15 events for each of two users, and those 15 events are split among three days. However, the five events for a particular user/day combo all happen at exactly the same moment, so for the purposes of my "time spent" calculation they should only be counted as a single "time unit". I'm defining a "time unit" as a minute that contains at least one event for a user.
Here is my attempt so far:
SELECT SUM(x) FROM (SELECT COUNT(score_value) as x FROM user_scores GROUP BY time(1m),user_id) GROUP BY time(1d),user_id
name: user_scores
tags: user_id=123
time sum
---- ---
1518134400000000000 5
1518220800000000000 5
1518307200000000000 5
1518393600000000000
name: user_scores
tags: user_id=456
time sum
---- ---
1518134400000000000 5
1518220800000000000 5
1518307200000000000 5
I can see why this is the expected result set, but it is not the data I'm looking for. Since the five events for a single user/day combo all happen in exactly the same minute, the sum values in the results should all be 1.
So, I need a way to change SELECT COUNT(score_value) as x FROM user_scores GROUP BY time(1m),user_id into something that returns 0 or 1 depending on whether any events occur in that minute.

I figured it out, what works is as follows:
SELECT COUNT(x) FROM (SELECT COUNT(score_value) as x FROM user_scores GROUP BY time(1m),user_id) WHERE x > 0 GROUP BY time(1d),user_id
Basically I changed the outer SELECT SUM(x) to SELECT COUNT(x) and added the where x > 0.
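To sanity-check what the corrected query should return, here is a minimal Python sketch of the same idea: count the distinct minutes that contain at least one event, per user and day. The function name and the (user_id, unix_seconds) input shape are assumptions for illustration, not anything InfluxDB provides.

```python
from collections import defaultdict
from datetime import datetime, timezone

def active_minutes_per_day(events):
    """events: iterable of (user_id, unix_seconds).

    Returns {(user_id, 'YYYY-MM-DD'): number of distinct minutes with an event}.
    """
    minutes_seen = defaultdict(set)
    for user_id, ts in events:
        minute = ts - ts % 60  # truncate to the start of the minute
        day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        minutes_seen[(user_id, day)].add(minute)
    return {key: len(minutes) for key, minutes in minutes_seen.items()}

# Five events at the same instant, plus one event two minutes later.
events = [("123", 1518134400)] * 5 + [("123", 1518134520)]
print(active_minutes_per_day(events))
```

The five simultaneous events collapse into one minute, which is the effect COUNT(x) with WHERE x > 0 has in the outer query.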

Related

query order by 2nd column when score is equal

I'm using this formula to sort the player list by the points in column G, highest first.
=QUERY(A2:G;"select * where A is not null order by G desc";0)
Some of the players have equal total points, but not equal times. Points are earned over different rounds, based on what time they finished.
If the players have equal points, I want to sort by a second column (their total finishing time) in column H.
example:
Both players finished 1st & 2nd. The total time has a difference of 1 minute. Player 2 should be ordered first based on his total time.
Note that I can't directly order by "Total Time" due to the point system in the background.
Player Round1 Round2 Points Total Time
1 3min 1min 10 4min
2 1min 2min 10 3min
Found it!
=QUERY(A2:H;"select * where A is not null order by G desc, H asc";0)
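For comparison, the same two-level ordering (points descending, then total time ascending as a tie-breaker) can be sketched in Python. The dictionary keys are made up for illustration; the values come from the example table in the question.

```python
# Rows from the example: both players have 10 points, player 2 is faster.
players = [
    {"player": 1, "points": 10, "total_minutes": 4},
    {"player": 2, "points": 10, "total_minutes": 3},
]

# Negating the points sorts them descending; total time stays ascending.
ranked = sorted(players, key=lambda p: (-p["points"], p["total_minutes"]))
print([p["player"] for p in ranked])
```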

Influxdb - how to calculate the sum of per-second differences at the minute level

I want to query the sum per minute from the result obtained from another query that calculates the difference between subsequent values.
select sum(ph1), sum(ph2), sum(ph3) from (select
non_negative_difference(day_chan1) as ph1,
non_negative_difference(day_chan2) as ph2,
non_negative_difference(day_chan3) as ph3
from electricity)
group by time(1m) tz('Europe/Dublin')
For example, if I get the following from the subquery
time ph1 ph2 ph3
---- --- --- ---
2017-04-02T14:40:38Z 0 0 2
2017-04-02T14:41:38Z 1 1 1
2017-04-02T14:41:39Z 0 0 2
2017-04-02T14:42:38Z 1 1 1
2017-04-02T14:42:39Z 0 1 2
I want to sum them up into
time ph1 ph2 ph3
---- --- --- ---
2017-04-02T14:40:00Z 0 0 2
2017-04-02T14:41:00Z 1 1 3
2017-04-02T14:42:00Z 1 2 3
But what I get from the query is the error "aggregate function required inside the call to non_negative_difference", even though the subquery returns results when I run it on its own.
I was also looking for this for a long time and finally found the solution:
select sum(ph1), sum(ph2), sum(ph3) from (select
The outer select is fine. Now we add an aggregate function inside each non_negative_difference call, as the error message also indicates. I assume you want to sum everything.
non_negative_difference(sum(day_chan1)) as ph1,
non_negative_difference(sum(day_chan2)) as ph2,
non_negative_difference(sum(day_chan3)) as ph3
from electricity
If we don't add the following line, the inner query will also be grouped by 1m. We don't want that: if a value is missing, the way Influx calculates the sum leads to a very large difference. So we group the subquery by the smallest interval you have (e.g. 1s).
group by time(1s))
Finally you can group the outer query by the interval you would like the values to be added together.
group by time(1m) tz('Europe/Dublin')
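To check the expected numbers from the question, here is a rough Python sketch of the two steps: take non-negative differences between consecutive values, then sum the differences per minute. The function names mirror the InfluxQL ones but are plain Python written for illustration. Timestamps are treated as UTC, and the tz() clause is ignored because it does not move minute boundaries.

```python
from collections import defaultdict
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def non_negative_difference(values):
    """Differences between consecutive values, with negative results dropped."""
    diffs = (b - a for a, b in zip(values, values[1:]))
    return [d for d in diffs if d >= 0]

def sum_per_minute(rows):
    """rows: iterable of (timestamp, ph1, ph2, ph3); returns sums per minute."""
    totals = defaultdict(lambda: [0, 0, 0])
    for stamp, *phases in rows:
        minute = datetime.strptime(stamp, STAMP_FORMAT).replace(second=0)
        for i, value in enumerate(phases):
            totals[minute][i] += value
    return {m.strftime(STAMP_FORMAT): tuple(t) for m, t in sorted(totals.items())}

# The subquery output shown in the question.
rows = [
    ("2017-04-02T14:40:38Z", 0, 0, 2),
    ("2017-04-02T14:41:38Z", 1, 1, 1),
    ("2017-04-02T14:41:39Z", 0, 0, 2),
    ("2017-04-02T14:42:38Z", 1, 1, 1),
    ("2017-04-02T14:42:39Z", 0, 1, 2),
]
print(sum_per_minute(rows))
```

The per-minute sums match the table the question asks for.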

Count the number of times a tag appears in influxdb

I have an influxdb database called 'cars' that counts the cars that pass by my building. One tag for this is called 'make', where make = {ford, toyota...}. I'd like to count the number of times each car has passed by my building, by counting the number of entries with each tag.
Is this possible in InfluxDB?
Edit:
This does not work:
> select count(make) from my_series where time > now()-10d group by make;
name: my_series
tags: make=
time count
---- -----
1524058438416676920 764724
A query like this will give you the number of entries per make for the last hour:
SELECT COUNT(somefield) FROM cars WHERE time > now() - 1h GROUP BY make
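If the points are already available client-side, the same per-tag count is easy to cross-check in Python. This is only a sketch: the list-of-dicts shape and the helper name are assumptions, not what any InfluxDB client returns.

```python
from collections import Counter

def count_by_tag(points, tag_key):
    """Count how many points carry each value of the given tag."""
    return Counter(point[tag_key] for point in points)

# Hypothetical sample points, each with a 'make' tag.
points = [{"make": "ford"}, {"make": "toyota"}, {"make": "ford"}]
counts = count_by_tag(points, "make")
print(counts)
```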

How to query difference between series with different tags in one measurement in InfluxDB?

For example, temperature data from two different locations is stored in one measurement, like this:
time temperature location
---- ---- ----
1 15 A
2 20 B
3 17 A
4 18 B
How can I calculate the difference between the temperatures at A and B with a query?
I.e., I'm expecting something like “SELECT a.temperature - b.temperature FROM measurement WHERE location = ‘A’ as a WHERE location = ‘B’ as b”
Thanks
You can use subqueries for that, e.g.:
SELECT last("temp_a") - last("temp_b") FROM (
SELECT "gauge" AS "temp_a" FROM "measurement" WHERE ("location"='A') AND $timeFilter
),(
SELECT "gauge" AS "temp_b" FROM "measurement" WHERE ("location"='B') AND $timeFilter
) GROUP BY time($__interval) fill(null)
Influx query language does not support functions across measurements.
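What the subquery approach computes can be sketched in Python: within each time bucket, take the last temperature seen at A and at B and subtract them, leaving a gap when either location has no point in the bucket (like fill(null)). The function name and the integer timestamps are assumptions taken from the toy table in the question.

```python
def last_difference(points, interval):
    """points: iterable of (time, temperature, location).

    Returns {bucket_start: last(A) - last(B)}, or None for a bucket
    in which one of the two locations has no point.
    """
    latest = {}  # bucket_start -> {location: last temperature seen}
    for t, temperature, location in sorted(points):
        bucket = t - t % interval
        latest.setdefault(bucket, {})[location] = temperature
    return {
        bucket: (vals["A"] - vals["B"]) if "A" in vals and "B" in vals else None
        for bucket, vals in sorted(latest.items())
    }

# The sample data from the question.
points = [(1, 15, "A"), (2, 20, "B"), (3, 17, "A"), (4, 18, "B")]
print(last_difference(points, 10))  # one bucket covering all four points
print(last_difference(points, 2))   # narrower buckets leave gaps
```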

Influxdb: How to get count of number of results in a group by query

Is there any way that I can get the total number of results / points / records in a group by query result?
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z 2
2015-08-18T00:24:00Z 2
I expect a count of 3 in this case. Even though I can calculate the number of results from the time period and the interval (12m) here, I would like to know whether it is possible to do so with a query to the database.
You can use multiple aggregates in a single query.
Using your example, wrap the query in a select count(*) from (<inner query>):
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m))
name: h2o_feet
--------------
time count_count
1970-01-01T00:00:00Z 3
However, if the grouping returns empty rows, they will not be counted.
For example, counting over the table below will result in a count of 2 rather than 3:
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z
2015-08-18T00:24:00Z 2
To include empty rows in your count you will need to add fill(1) to your query like this:
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) fill(1))
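The calculation from the time period and interval that the question mentions can be sketched in Python as a cross-check. This assumes that GROUP BY time() buckets are aligned to multiples of the interval counted from the Unix epoch and that both range bounds are inclusive; the function name is made up for illustration.

```python
from datetime import datetime, timezone

def count_time_buckets(start, end, interval_seconds):
    """Number of epoch-aligned buckets touched by the inclusive range [start, end]."""
    first = int(start.timestamp()) // interval_seconds
    last = int(end.timestamp()) // interval_seconds
    return last - first + 1

start = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)
end = datetime(2015, 8, 18, 0, 30, tzinfo=timezone.utc)
bucket_count = count_time_buckets(start, end, 12 * 60)
print(bucket_count)
```

Unlike the COUNT(*) subquery without fill(), this counts empty buckets too, because it never looks at the data.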
You will need to do some manual work. Run it directly,
$ influx -execute "select * from measurement_name" -database="db_name" | wc -l
This will return 4 more than the actual number of rows.
Here is an example,
luvpreet#DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles" | wc -l
5
luvpreet#DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles"
name: yprices
time price
---- -----
1493626629063286219 2
luvpreet#DHARI-Inspiron-3542:~/www$
So, I think you can now see why you need to subtract 4 from the value.
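Instead of subtracting a fixed number, the CLI output could also be parsed so that only data rows are counted. The sketch below assumes the layout shown above (a name line, a column header, a line of dashes, then the rows); the function name is hypothetical and the text would come from capturing the influx command's output.

```python
def count_data_rows(cli_output):
    """Count the rows that follow each dashed separator line in influx CLI output."""
    rows = 0
    in_rows = False
    for line in cli_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("----"):
            in_rows = True          # data rows start after the dashes
        elif not stripped or stripped.startswith("name:"):
            in_rows = False         # blank line or next series header
        elif in_rows:
            rows += 1
    return rows

# The output from the example above.
sample = """name: yprices
time price
---- -----
1493626629063286219 2
"""
print(count_data_rows(sample))
```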
