Query result as a set of interval ranges in PostgreSQL (Rails)

I have a timestamp column for which I need to calculate the time difference from the current time and then group the results into a set of intervals.
For the time difference in hours, I have written this query:
result = ActiveRecord::Base.connection.exec_query("SELECT id,(EXTRACT(EPOCH FROM CURRENT_TIMESTAMP - image_retouch_items.created_at)/3600)::INTEGER AS latency FROM image_retouch_items WHERE status= 0;");
The result of my query is
"id" "latency"
104 5928
106 5917
158 5751
162 5736
95 5940
85 5950
How can I get the result as a set of hour intervals, so that each row whose time difference falls within a range such as 0-24 hours increments the count for that range? For example:
interval count
0-24 2
24-48 3
48-72 0
How can I get that in a single query?
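One way to get such counts in a single query is to group on the latency divided by the interval width. The sketch below shows the idea with Python's built-in sqlite3 module standing in for PostgreSQL so that it runs without a server; the `latencies` table and its sample values are hypothetical. In PostgreSQL the same GROUP BY should be applicable to the latency expression from the query above, though that has not been tested here.

```python
import sqlite3

# Hypothetical latencies in hours (id, latency); values are invented so
# that more than one 24-hour bucket is populated.
rows = [(1, 3), (2, 20), (3, 25), (4, 30), (5, 47), (6, 100)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE latencies (id INTEGER, latency INTEGER)")
conn.executemany("INSERT INTO latencies VALUES (?, ?)", rows)

# Integer division by 24 gives the bucket index; multiplying back by 24
# gives the lower bound of the bucket each row falls into.
query = """
    SELECT (latency / 24) * 24 AS bucket_start,
           COUNT(*) AS count
    FROM latencies
    GROUP BY bucket_start
    ORDER BY bucket_start
"""
bucket_counts = [
    (f"{start}-{start + 24}", count)
    for start, count in conn.execute(query)
]
conn.close()

for label, count in bucket_counts:
    print(label, count)
```

Buckets with no rows (48-72 in this sample) do not appear in the output. If zero counts are needed, one option in PostgreSQL would be a LEFT JOIN against generate_series, which is not shown here.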

Related

Ruby on Rails: Group by data, find maximum value, minimum value and average value

We have a PostgreSQL table like this:
id name value conversation_id
1 first_response 29 2
2 conversation_resolved 98 2
6 conversation_resolved 8 4
7 first_response 17 5
8 conversation_resolved 40 5
9 conversation_resolved 1041049 2
10 conversation_resolved 1046116 4
In Ruby on Rails, we need to compute values based on the following steps:
Group the values by conversation_id.
Find the maximum and minimum value of each group.
Sum the maximum values.
Sum the minimum values.
Get the result of maximum value - minimum value.
Get the average value.
Is it possible to do this in a single query using ActiveRecord?
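The steps above can be sketched as one SQL statement with a grouped subquery. The example uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table name `metrics` is hypothetical. "Average" is ambiguous in the question; here it is assumed to mean the average of the per-conversation (maximum - minimum) differences.

```python
import sqlite3

# Sample rows from the question: (id, name, value, conversation_id).
rows = [
    (1, "first_response", 29, 2),
    (2, "conversation_resolved", 98, 2),
    (6, "conversation_resolved", 8, 4),
    (7, "first_response", 17, 5),
    (8, "conversation_resolved", 40, 5),
    (9, "conversation_resolved", 1041049, 2),
    (10, "conversation_resolved", 1046116, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics "
    "(id INTEGER, name TEXT, value INTEGER, conversation_id INTEGER)"
)
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", rows)

# The inner query finds max/min per conversation; the outer query sums
# them, subtracts the sums, and averages the per-group differences.
query = """
    SELECT SUM(max_value),
           SUM(min_value),
           SUM(max_value) - SUM(min_value),
           AVG(max_value - min_value)
    FROM (
        SELECT conversation_id,
               MAX(value) AS max_value,
               MIN(value) AS min_value
        FROM metrics
        GROUP BY conversation_id
    ) AS per_conversation
"""
summary = conn.execute(query).fetchone()
conn.close()

print(summary)
```

The SQL uses only standard aggregates, so it should carry over to PostgreSQL, where it could presumably be run through exec_query as in the first question.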

InfluxQL time calculations return no records

I'd like to query InfluxDB using InfluxQL and exclude any rows from 0 to 5 minutes after the hour.
Seems pretty easy to do using the time field (the number of nanoseconds since the epoch) and a little modulus math. But the problem is that any WHERE clause with even the simplest calculation on time returns zero records.
How can I get what I need if I can't perform calculations on time? How can I exclude any rows from 0 to 5 minutes after the hour?
# Returns 10 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE time > 0 LIMIT 10
# Returns 0 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE (time/1) > 0 LIMIT 10
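If the filter cannot be expressed in the WHERE clause, one fallback is to fetch the rows and apply the modulus check on the client. The sketch below assumes rows arrive as (nanoseconds since the epoch, value) pairs, that "after the hour" refers to UTC hours, and that the excluded range is 0 up to but not including 5 minutes; the sample rows are hypothetical.

```python
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_HOUR = 3600 * NANOS_PER_SECOND
FIVE_MINUTES_NS = 5 * 60 * NANOS_PER_SECOND


def outside_first_five_minutes(time_ns: int) -> bool:
    """Return True when the timestamp is 5 or more minutes past the hour."""
    return time_ns % NANOS_PER_HOUR >= FIVE_MINUTES_NS


# Hypothetical rows: (time in ns since the epoch, value).
rows = [
    (0, "a"),                                   # 00:00 -> excluded
    (4 * 60 * NANOS_PER_SECOND, "b"),           # 00:04 -> excluded
    (5 * 60 * NANOS_PER_SECOND, "c"),           # 00:05 -> kept
    (NANOS_PER_HOUR + 30 * 60 * NANOS_PER_SECOND, "d"),  # 01:30 -> kept
]

kept = [value for time_ns, value in rows if outside_first_five_minutes(time_ns)]
print(kept)
```

This moves the work out of the database, so it only suits result sets small enough to transfer in full.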

Conditional sum of contiguous columns

I'd like to sum the total time allocated per task. Time has been provided either in minutes or in hours.
For example Task A should be:
30 minutes + 2 hours x 60 minutes/hour + 150 minutes = 300
What formula should I use on B2 to get such a result, considering that I have many more than 4 people?
Try
=sumproduct(n(C2:I2)*if(D2:J2="minutes",1,60))
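The arithmetic behind that formula can be checked with a small Python sketch. It assumes the row alternates between an amount cell and a unit cell, and, like the formula, treats any unit other than "minutes" as hours; the function name `total_minutes` is made up for illustration.

```python
def total_minutes(cells):
    """Sum alternating amount/unit cells, converting everything to minutes."""
    amounts = cells[0::2]  # every other cell, starting with the first amount
    units = cells[1::2]    # the unit cell to the right of each amount
    return sum(
        amount * (1 if unit == "minutes" else 60)
        for amount, unit in zip(amounts, units)
    )


# Task A from the question: 30 minutes, 2 hours, 150 minutes.
task_a = [30, "minutes", 2, "hours", 150, "minutes"]
print(total_minutes(task_a))
```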

Sliding window aggregate Big Query 15 minute aggregation

I have a table like this
Row time viewCount
1 00:00:00 31
2 00:00:01 44
3 00:00:02 78
4 00:00:03 71
5 00:00:04 72
6 00:00:05 73
7 00:00:06 64
8 00:00:07 70
I would like to aggregate this into
Row time viewCount
1 00:00:00 31
2 00:15:00 445
3 00:30:00 700
4 00:45:00 500
5 01:00:04 121
6 01:15:00 475
.
.
.
Please help. Thanks in advance
Supposing that you actually have a TIMESTAMP column, you can use an approach like this:
#standardSQL
SELECT
  TIMESTAMP_SECONDS(
    UNIX_SECONDS(timestamp) -
    MOD(UNIX_SECONDS(timestamp), 15 * 60)
  ) AS time,
  SUM(viewCount) AS viewCount
FROM `project.dataset.table`
GROUP BY time;
It relies on conversion to and from Unix seconds in order to compute the 15-minute intervals. Note, however, that unlike Mikhail's solution it will not produce a row with a zero count for an empty 15-minute interval (it's not clear whether this matters to you).
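The flooring step can be illustrated outside BigQuery with a few lines of Python. This is only a sketch of the arithmetic: subtracting the remainder modulo 900 seconds maps every timestamp to the start of its 15-minute interval. The sample data is hypothetical.

```python
from collections import defaultdict

BUCKET_SECONDS = 15 * 60


def bucket_start(unix_seconds: int) -> int:
    """Floor a Unix timestamp to the start of its 15-minute interval."""
    return unix_seconds - unix_seconds % BUCKET_SECONDS


# Hypothetical samples: (unix_seconds, viewCount).
samples = [(0, 31), (1, 44), (899, 5), (900, 7), (1799, 3), (2700, 10)]

totals = defaultdict(int)
for unix_seconds, view_count in samples:
    totals[bucket_start(unix_seconds)] += view_count

result = sorted(totals.items())
print(result)
```

No sample falls in the interval starting at 1800 seconds, so that interval is missing from the result rather than reported with a zero count.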
Below is for BigQuery Standard SQL.
Note: you provided a simplified example of your data and the query below follows it, so instead of aggregating every 15 minutes it aggregates every 2 seconds. This lets you easily test and play with it. It can easily be adjusted to 15 minutes by changing SECOND to MINUTE in 3 places and 2 to 15 in 3 places.
Also, this example uses the TIME data type for the time field, as in your example, so it is limited to a single 24-hour period. Most likely your real data has DATETIME or TIMESTAMP; in that case you will also need to replace all TIME_* functions with the respective DATETIME_* or TIMESTAMP_* functions.
So, finally, the query is:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT TIME '00:00:00' time, 31 viewCount UNION ALL
  SELECT TIME '00:00:01', 44 UNION ALL
  SELECT TIME '00:00:02', 78 UNION ALL
  SELECT TIME '00:00:03', 71 UNION ALL
  SELECT TIME '00:00:04', 72 UNION ALL
  SELECT TIME '00:00:05', 73 UNION ALL
  SELECT TIME '00:00:06', 64 UNION ALL
  SELECT TIME '00:00:07', 70
),
period AS (
  SELECT MIN(time) min_time, MAX(time) max_time, TIME_DIFF(MAX(time), MIN(time), SECOND) diff
  FROM `project.dataset.table`
),
checkpoints AS (
  SELECT TIME_ADD(min_time, INTERVAL step SECOND) start_time, TIME_ADD(min_time, INTERVAL step + 2 SECOND) end_time
  FROM period, UNNEST(GENERATE_ARRAY(0, diff + 2, 2)) step
)
SELECT start_time time, SUM(viewCount) viewCount
FROM checkpoints c
JOIN `project.dataset.table` t
  ON t.time >= c.start_time AND t.time < c.end_time
GROUP BY start_time
ORDER BY start_time, time
and the result is:
Row time viewCount
1 00:00:00 75
2 00:00:02 149
3 00:00:04 145
4 00:00:06 134
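The checkpoint logic can be traced with a short Python sketch on the same sample data. It mirrors the query's steps (find the period, generate checkpoints every 2 seconds, join rows to the checkpoint whose half-open range contains them) and is meant as an illustration only, not as a replacement for the BigQuery version.

```python
from datetime import datetime, timedelta

STEP_SECONDS = 2  # the sample step; 15 * 60 would give 15-minute intervals

# Sample rows from the question: (time, viewCount).
rows = [
    ("00:00:00", 31), ("00:00:01", 44), ("00:00:02", 78), ("00:00:03", 71),
    ("00:00:04", 72), ("00:00:05", 73), ("00:00:06", 64), ("00:00:07", 70),
]
parsed = [(datetime.strptime(text, "%H:%M:%S"), count) for text, count in rows]

# period: earliest time, latest time, and the span between them in seconds.
min_time = min(moment for moment, _ in parsed)
max_time = max(moment for moment, _ in parsed)
diff = int((max_time - min_time).total_seconds())

# checkpoints: (start_time, end_time) pairs, one every STEP_SECONDS.
checkpoints = [
    (min_time + timedelta(seconds=step),
     min_time + timedelta(seconds=step + STEP_SECONDS))
    for step in range(0, diff + STEP_SECONDS + 1, STEP_SECONDS)
]

# Join each row to the checkpoint with start_time <= time < end_time and sum.
aggregated = []
for start_time, end_time in checkpoints:
    counts = [count for moment, count in parsed if start_time <= moment < end_time]
    if counts:  # like the inner JOIN, checkpoints without rows are dropped
        aggregated.append((start_time.strftime("%H:%M:%S"), sum(counts)))

for time_label, view_count in aggregated:
    print(time_label, view_count)
```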

Influxdb - Subtracting value from previous row, group by time

Is it possible to get individual (per-interval) values from cumulative data?
The following query returns the output shown below it:
SELECT mean("value") FROM "statsd_value" WHERE "type_instance" = 'counts' AND time > now() - 5m GROUP BY time(10s) fill(none)
TimeStamp Value
1463393810 0
1463393820 10
1463393830 23
1463393840 34
1463393850 67
1463393860 90
1463393870 104
Basically, the data above is cumulative. I want to get the individual values from it, like this:
TimeStamp Value
1463393820 10
1463393830 13
1463393840 11
1463393850 33
1463393860 23
1463393870 14
Is it possible to write a query that returns the data this way?
InfluxQL provides a difference function that will give you the functionality that you're looking for.
The query would look like this:
SELECT difference(mean("value")) FROM "statsd_value" WHERE "type_instance" = 'counts' AND time > now() - 5m GROUP BY time(10s) fill(none)
TimeStamp Value
1463393820 10
1463393830 13
1463393840 11
1463393850 33
1463393860 23
1463393870 14
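As a quick check of what difference computes, the same subtraction of each value from the one after it can be written in a few lines of Python. This is only a sketch of the arithmetic on the cumulative values from the question, not of how InfluxDB implements it.

```python
def difference(values):
    """Return the change between each value and the one before it."""
    return [current - previous for previous, current in zip(values, values[1:])]


# Cumulative values from the question's first result table.
cumulative = [0, 10, 23, 34, 67, 90, 104]
individual = difference(cumulative)
print(individual)
```

The result has one element fewer than the input, which matches the output above starting at the second timestamp.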
