Related
I have a database in SPSS structured like the following table:
ID
Gender
Age
Var1
Var...
1
0
7
3
...
2
1
8
4
...
3
1
9
5
...
4
1
9
2
...
I want to select only the first n (e.g.: 150) cases, where Gender = 1 and Age = 9, so in the table above the 3. and 4. case. How can I do it? Thanks!
compute filter_ = $sysmis.
compute counter_ = 0.
if $casenum=1 and (Gender = 1 and Age = 9) counter_ =1 .
do if $casenum <> 1.
if ~(Gender = 1 and Age = 9) counter_ = lag(counter).
if (Gender = 1 and Age = 9) counter_ = lag(counter) +1.
end if.
compute filter_ = (Gender = 1 and Age = 9 and counter<= 150).
execute.
I am not sure if this is the most efficient way, but it gets the job done. We use the counter_ variable to assign an order number for each record which satisfies the condition ("counting" records with meet the criteria, from the top of the file downwards). Then create a filter of the first 150 such records.
The below will select the first 150 cases where gender=1 AND age=9 (assuming 150 cases meet that criteria).
N 150.
SELECT IF (Gender=1 AND Age=9).
EXE .
Flipping the order of N and SELECT IF () would yield the same result. You can read more about N in the IBM documentation
I have a table which contains a feedback about a product.It has feedback type (positive ,negative) which is a text column, date on which comments made. I need to get total count of positive ,negative feedback for particular time period . For example if the date range is 30 days, I need to get total count of positive ,negative feedback for 4 weeks , if the date range is 6 months , I need to get total count of positive ,negative feedback for each month. How to group the count based on date.
+------+------+----------+----------+---------------+--+--+--+
| Slno | User | Comments | type | commenteddate | | | |
+------+------+----------+----------+---------------+--+--+--+
| 1 | a | aaaa | positive | 22-jun-2016 | | | |
| 2 | b | bbb | positive | 1-jun-2016 | | | |
| 3 | c | qqq | negative | 2-jun-2016 | | | |
| 4 | d | ccc | neutral | 3-may-2016 | | | |
| 5 | e | www | positive | 2-apr-2016 | | | |
| 6 | f | s | negative | 11-nov-2015 | | | |
+------+------+----------+----------+---------------+--+--+--+
Query i tried is
SELECT type, to_char(commenteddate,'DD-MM-YYYY'), Count(type) FROM comments GROUP BY type, to_char(commenteddate,'DD-MM-YYYY');
Here's a kick at the can...
Assumptions:
you want to be able to switch the groupings to weekly or monthly only
the start of the first period will be the first date in the feedback data; intervals will be calculated from this initial date
output will show feedback value, time period, count
time periods will not overlap so periods will be x -> x + interval - 1 day
time of day is not important (time for commented dates is always 00:00:00)
First, create some sample data (100 rows):
drop table product_feedback purge;
create table product_feedback
as
select rownum as slno
, chr(65 + MOD(rownum, 26)) as userid
, lpad(chr(65 + MOD(rownum, 26)), 5, chr(65 + MOD(rownum, 26))) as comments
, trunc(sysdate) + rownum + trunc(dbms_random.value * 10) as commented_date
, case mod(rownum * TRUNC(dbms_random.value * 10), 3)
when 0 then 'positive'
when 1 then 'negative'
when 2 then 'neutral' end as feedback
from dual
connect by level <= 100
;
Here's what my sample data looks like:
select *
from product_feedback
;
SLNO USERID COMMENTS COMMENTED_DATE FEEDBACK
1 B BBBBB 2016-08-06 neutral
2 C CCCCC 2016-08-06 negative
3 D DDDDD 2016-08-14 positive
4 E EEEEE 2016-08-16 negative
5 F FFFFF 2016-08-09 negative
6 G GGGGG 2016-08-14 positive
7 H HHHHH 2016-08-17 positive
8 I IIIII 2016-08-18 positive
9 J JJJJJ 2016-08-12 positive
10 K KKKKK 2016-08-15 neutral
11 L LLLLL 2016-08-23 neutral
12 M MMMMM 2016-08-19 positive
13 N NNNNN 2016-08-16 neutral
...
Now for the fun part. Here's the gist:
find out what the earliest and latest commented dates are in the data
include a query where you can set the time period (to "WEEKS" or "MONTHS")
generate all of the (weekly or monthly) time periods between the min/max dates
join the product feedback to the time periods (commented date between start and end) with an outer join in case you want to see all time periods whether or not there was any feedback
group the joined result by feedback, period start, and period end, and set up a column to count one of the 3 possible feedback values
x
with
min_max_dates -- get earliest and latest feedback dates
as
(select min(commented_date) min_date, max(commented_date) max_date
from product_feedback
)
, time_period_interval
as
(select 'MONTHS' as tp_interval -- set the interval/time period here
from dual
)
, -- generate all time periods between the start date and end date
time_periods (start_of_period, end_of_period, max_date, time_period) -- recursive with clause - fun stuff!
as
(select mmd.min_date as start_of_period
, CASE WHEN tpi.tp_interval = 'WEEKS'
THEN mmd.min_date + 7
WHEN tpi.tp_interval = 'MONTHS'
THEN ADD_MONTHS(mmd.min_date, 1)
ELSE NULL
END - 1 as end_of_period
, mmd.max_date
, tpi.tp_interval as time_period
from time_period_interval tpi
cross join
min_max_dates mmd
UNION ALL
select CASE WHEN time_period = 'WEEKS'
THEN start_of_period + 7 * (ROWNUM )
WHEN time_period = 'MONTHS'
THEN ADD_MONTHS(start_of_period, ROWNUM)
ELSE NULL
END as start_of_period
, CASE WHEN time_period = 'WEEKS'
THEN start_of_period + 7 * (ROWNUM + 1)
WHEN time_period = 'MONTHS'
THEN ADD_MONTHS(start_of_period, ROWNUM + 1)
ELSE NULL
END - 1 as end_of_period
, max_date
, time_period
from time_periods
where end_of_period <= max_date
)
-- now put it all together
select pf.feedback
, tp.start_of_period
, tp.end_of_period
, count(*) as feedback_count
from time_periods tp
left outer join
product_feedback pf
on pf.commented_date between tp.start_of_period and tp.end_of_period
group by tp.start_of_period
, tp.end_of_period
, pf.feedback
order by pf.feedback
, tp.start_of_period
;
Output:
negative 2016-08-06 2016-09-05 6
negative 2016-09-06 2016-10-05 7
negative 2016-10-06 2016-11-05 8
negative 2016-11-06 2016-12-05 1
neutral 2016-08-06 2016-09-05 6
neutral 2016-09-06 2016-10-05 5
neutral 2016-10-06 2016-11-05 11
neutral 2016-11-06 2016-12-05 2
positive 2016-08-06 2016-09-05 17
positive 2016-09-06 2016-10-05 16
positive 2016-10-06 2016-11-05 15
positive 2016-11-06 2016-12-05 6
-- EDIT --
New and improved, all in one easy to use procedure. (I will assume you can configure the procedure to make use of the query in whatever way you need.) I made some changes to simplify the CASE statements in a few places and note that for whatever reason using a LEFT OUTER JOIN in the main SELECT results in an ORA-600 error for me so I switched it to INNER JOIN.
CREATE OR REPLACE PROCEDURE feedback_counts(p_days_chosen IN NUMBER, p_cursor OUT SYS_REFCURSOR)
AS
BEGIN
OPEN p_cursor FOR
with
min_max_dates -- get earliest and latest feedback dates
as
(select min(commented_date) min_date, max(commented_date) max_date
from product_feedback
)
, time_period_interval
as
(select CASE
WHEN p_days_chosen BETWEEN 1 AND 10 THEN 'DAYS'
WHEN p_days_chosen > 10 AND p_days_chosen <=31 THEN 'WEEKS'
WHEN p_days_chosen > 31 AND p_days_chosen <= 365 THEN 'MONTHS'
ELSE '3-MONTHS'
END as tp_interval -- set the interval/time period here
from dual --(SELECT p_days_chosen as days_chosen from dual)
)
, -- generate all time periods between the start date and end date
time_periods (start_of_period, end_of_period, max_date, tp_interval) -- recursive with clause - fun stuff!
as
(select mmd.min_date as start_of_period
, CASE tpi.tp_interval
WHEN 'DAYS'
THEN mmd.min_date + 1
WHEN 'WEEKS'
THEN mmd.min_date + 7
WHEN 'MONTHS'
THEN mmd.min_date + 30
WHEN '3-MONTHS'
THEN mmd.min_date + 90
ELSE NULL
END - 1 as end_of_period
, mmd.max_date
, tpi.tp_interval
from time_period_interval tpi
cross join
min_max_dates mmd
UNION ALL
select CASE tp_interval
WHEN 'DAYS'
THEN start_of_period + 1 * ROWNUM
WHEN 'WEEKS'
THEN start_of_period + 7 * ROWNUM
WHEN 'MONTHS'
THEN start_of_period + 30 * ROWNUM
WHEN '3-MONTHS'
THEN start_of_period + 90 * ROWNUM
ELSE NULL
END as start_of_period
, start_of_period
+ CASE tp_interval
WHEN 'DAYS'
THEN 1
WHEN 'WEEKS'
THEN 7
WHEN 'MONTHS'
THEN 30
WHEN '3-MONTHS'
THEN 90
ELSE NULL
END * (ROWNUM + 1)
- 1 as end_of_period
, max_date
, tp_interval
from time_periods
where end_of_period <= max_date
)
-- now put it all together
select pf.feedback
, tp.start_of_period
, tp.end_of_period
, count(*) as feedback_count
from time_periods tp
inner join -- currently a bug that prevents the procedure from compiling with a LEFT OUTER JOIN
product_feedback pf
on pf.commented_date between tp.start_of_period and tp.end_of_period
group by tp.start_of_period
, tp.end_of_period
, pf.feedback
order by tp.start_of_period
, pf.feedback
;
END;
Test the procedure (in something like SQLPlus or SQL Developer):
var x refcursor
exec feedback_counts(10, :x)
print :x
I am really sorry for this. I am new oracle and I have created following block which is resulting output and then error.
First cursor is generating output and then error and so second cursor is not generating output.
Please help for error free output .
thanks and regards
Error:
END;
Error report:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at line 36
06502. 00000 - "PL/SQL: numeric or value error%s"
*Cause:
*Action:
Block:
DECLARE
c_dbuser SYS_REFCURSOR;
c_dbuser1 SYS_REFCURSOR;
temp_dbuser FIT_SCHEMA.fxf_inspt_insp_main%ROWTYPE;
year_date varchar2(20):='2014';
typeOfGraph varchar2(50):='InspectionByInspection';
division varchar2(20):='Division';
subDiv varchar2(20):='ALL';
emp varchar2(20):='';
TYPE t_name IS RECORD( --Error is at this line :(
inspectiondate varchar2(20),
totalcount number(38),
inspectionDesc varchar2(20)
);
r_name t_name; -- name record
TYPE t_monthwise is record (
month_number varchar2(20),
inspcount number(38),
inspectionDesc varchar2(20)
);
t_month t_monthwise;
BEGIN
fit_schema.My_manager(typeOfGraph,year_date,division,subDiv,emp,c_dbuser,c_dbuser1);
LOOP
FETCH c_dbuser INTO r_name ;
EXIT WHEN c_dbuser%NOTFOUND;
dbms_output.put_line( r_name.inspectiondate ||' '||r_name.totalcount||' '||r_name.inspectionDesc);
END LOOP;
LOOP
FETCH c_dbuser1 INTO t_month ;
EXIT WHEN c_dbuser1%NOTFOUND;
dbms_output.put_line( t_month.month_number ||' '||t_month.inspcount||' '||t_month.inspectionDesc);
END LOOP;
CLOSE c_dbuser;
CLOSE c_dbuser1;
END;
Output :
13-DEC-2014 1 3#CPLD Only
13-DEC-2014 4 0#Class Only
14-DEC-2014 1 0#Class Only
15-DEC-2014 2 0#Class Only
16-DEC-2014 1 0#Class Only
17-DEC-2014 1 7#Negative Class
17-DEC-2014 9 0#Class Only
19-DEC-2014 15 0#Class Only
22-DEC-2014 1 11#65% Rule
23-DEC-2014 1 8#XLGH & Class
30-DEC-2014 1 0#Class Only
31-DEC-2014 1 10#Mixed Articles
31-DEC-2014 3 0#Class Only
02-JAN-2015 2 0#Class Only
05-JAN-2015 2 0#Class Only
07-JAN-2015 2 9#XLGH Only
07-JAN-2015 3 1#Class & Reweigh
07-JAN-2015 1 10#Mixed Articles
07-JAN-2015 4 0#Class Only
07-JAN-2015 2 11#65% Rule
08-JAN-2015 5 0#Class Only
08-JAN-2015 1 9#XLGH Only
09-JAN-2015 1 3#CPLD Only
09-JAN-2015 1 11#65% Rule
09-JAN-2015 4 0#Class Only
09-JAN-2015 1 1#Class & Reweigh
12-JAN-2015 1 5#CCD Only
12-JAN-2015 1 3#CPLD Only
19-JAN-2015 1 11#65% Rule
20-JAN-2015 1 0#Class Only
21-JAN-2015 4 0#Class Only
23-JAN-2015 1 10#Mixed Articles
26-JAN-2015 1 7#Negative Class
26-JAN-2015 2 0#Class Only
27-JAN-2015 1 3#CPLD Only
27-JAN-2015 3 0#Class Only
27-JAN-2015 1 6#CCD & Class
28-JAN-2015 6 0#Class Only
29-JAN-2015 3 0#Class Only
29-JAN-2015 1 5#CCD Only
30-JAN-2015 1 3#CPLD Only
30-JAN-2015 1 1#Class & Reweigh
01-FEB-2015 3 0#Class Only
02-FEB-2015 1 4#CPLD & Class
02-FEB-2015 1 8#XLGH & Class
04-FEB-2015 1 0#Class Only
Queries from Procedure :-
SELECT TO_CHAR(MAINTABLE.INSP_CREATED_TMSTP,'dd-MON-yyyy') inspectiondate, --varchar2
COUNT(MAINTABLE.insp_id) AS totalinspection, --count is number
MAINTABLE.insp_type_id ||'#'||ISNPTYPETABLE.INSP_TYPE_DESC AS INSPECTIONDESCRIPTION-- varchar2
FROM FIT_SCHEMA.fxf_inspt_insp_main MAINTABLE
JOIN FIT_SCHEMA.fxf_inspt_emp_detail EMPTABLE
ON(MAINTABLE.inspector_emp_nbr=EMPTABLE.inspector_emp_nbr)
JOIN FIT_SCHEMA.fxf_inspt_drop_down DROPDOWNTABLE
ON (emptable.division_id = DROPDOWNTABLE.drop_down_id)
JOIN FIT_SCHEMA.fxf_inspt_insp_type ISNPTYPETABLE
ON(MAINTABLE.insp_type_id =isnptypetable.insp_type_id)
WHERE dropdowntable.is_active_flg = 1
AND MAINTABLE.STATUS_ID = 8
AND MAINTABLE.insp_type_id IN (0,1,3,4,5,6,7,8,9,10,11,12)
AND TRUNC(MAINTABLE.insp_created_tmstp) between TRUNC(EMPTABLE.EFFECTIVE_FROM_TMSTP)
And Nvl(Trunc(Emptable.Effective_To_Tmstp),Sysdate) AND UPPER(dropdowntable.drop_down_grp) =UPPER('division') AND TRUNC(MAINTABLE.insp_created_tmstp) between '01-JUN-14' AND '31-MAY-15' GROUP BY TO_CHAR(MAINTABLE.INSP_CREATED_TMSTP,'dd-MON-yyyy')
, TRUNC(MAINTABLE.INSP_CREATED_TMSTP), MAINTABLE.insp_type_id ||'#'|| ISNPTYPETABLE.INSP_TYPE_DESC ORDER BY TRUNC(MAINTABLE.INSP_CREATED_TMSTP)
SELECT TO_CHAR(MAINTABLE.insp_created_tmstp,'MM') AS MONTHNUMBER,
COUNT(MAINTABLE.INSP_ID) AS INSPCOUNT ,
MAINTABLE.insp_type_id ||'#'
||ISNPTYPETABLE.INSP_TYPE_DESC AS INSPECTIONDESCRIPTION
FROM FIT_SCHEMA.fxf_inspt_insp_main MAINTABLE
JOIN FIT_SCHEMA.fxf_inspt_emp_detail EMPTABLE
ON(MAINTABLE.inspector_emp_nbr=EMPTABLE.inspector_emp_nbr)
JOIN FIT_SCHEMA.fxf_inspt_drop_down DROPDOWNTABLE
ON (emptable.division_id = DROPDOWNTABLE.drop_down_id)
JOIN FIT_SCHEMA.fxf_inspt_insp_type ISNPTYPETABLE
ON(MAINTABLE.insp_type_id =isnptypetable.insp_type_id)
WHERE dropdowntable.is_active_flg = 1
AND MAINTABLE.insp_type_id IN (0,1,3,4,5,6,7,8,9,10,11,12)
AND dropdowntable.is_active_flg = 1
AND MAINTABLE.STATUS_ID = 8
AND TRUNC(MAINTABLE.insp_created_tmstp) between TRUNC(EMPTABLE.EFFECTIVE_FROM_TMSTP)
And Nvl(Trunc(Emptable.Effective_To_Tmstp),Sysdate) AND UPPER(dropdowntable.drop_down_grp) =UPPER('division')
AND TRUNC(MAINTABLE.insp_created_tmstp) between '01-JUN-14'
AND '31-MAY-15' GROUP BY TO_CHAR(MAINTABLE.insp_created_tmstp,'MM'), MAINTABLE.insp_type_id ||'#'||ISNPTYPETABLE.INSP_TYPE_DESC ORDER BY TO_CHAR(MAINTABLE.insp_created_tmstp,'MM')
Line 36 is:
FETCH c_dbuser INTO r_name ;
So one of the values being returned in your c_dbuser ref cursor is too big. Your date format mask will only generate string 11 characters long so inspectiondate is (more than) adequate, and the count seems unlikely to exceed number(38) so totalcount also seems fine. Which suggests that inspectionDesc is too small for some of the values in the data.
You're generating that from:
MAINTABLE.insp_type_id ||'#'||ISNPTYPETABLE.INSP_TYPE_DESC
If the total length of that concatenated string exceeds 20 characters then you'll get an error on fetch.
If INSP_TYPE_DESC is defined as varchar2(20) then you aren't allowing any room for the rest; you need one extra character for the fixed #, and at least two for the insp_type_id - more if the highest ID can be three digits or more. If it isn't constrained then you'll need to decide the highest number you expect to ever see and allow for that. So if you expect up to three digits for an ID, define inspectionDesc as varchar2(24), etc.
I need to join two tables (well, actually two views) so that for every selected row of the left view, there is a count of rows from the right view. That sounds to me like a LEFT JOIN, but in SQLite (this test database) and a LEFT JOIN query:
SELECT TARGET.session_id session_id, TARGET.labeltype_id labeltype_id, TARGET.label_id label_id, count(SECONDARY.label_id) NOlabels
FROM segment_extended TARGET LEFT JOIN segment_extended SECONDARY
WHERE TARGET.session_id = SECONDARY.session_id AND TARGET.lt_name= "Word" AND SECONDARY.lt_name ="Comments"
AND ((SECONDARY.start <= TARGET.start AND TARGET.END <= SECONDARY.END) OR (TARGET.start <= SECONDARY.start AND SECONDARY.END <= TARGET.END))
AND TARGET.label != '' AND SECONDARY.label != ''
GROUP BY TARGET.session_id,TARGET.labeltype_id, TARGET.label_id;
I get only a small subset of what I would expect:
2 3 3 1
2 3 9 1
A more extended query gives the correct result:
SELECT session_id, labeltype_id, label_id, max(NOlabels) NOlabels
FROM (SELECT TARGET.session_id session_id, TARGET.labeltype_id labeltype_id, TARGET.label_id label_id, count(SECONDARY.label_id) NOlabels
FROM segment_extended TARGET , segment_extended SECONDARY
WHERE TARGET.session_id = SECONDARY.session_id AND TARGET.lt_name= "Word" AND SECONDARY.lt_name ="Comments"
AND ((SECONDARY.start <= TARGET.start AND TARGET.END <= SECONDARY.END) OR (TARGET.start <= SECONDARY.start AND SECONDARY.END <= TARGET.END))
AND TARGET.label != '' AND SECONDARY.label != ''
GROUP BY TARGET.session_id,TARGET.labeltype_id, TARGET.label_id
UNION
SELECT TARGET.session_id session_id, TARGET.labeltype_id labeltype_id, TARGET.label_id label_id, 0 NOlabels
FROM segment_extended TARGET
WHERE TARGET.lt_name= "Word"
AND TARGET.label != ''
GROUP BY TARGET.session_id,TARGET.labeltype_id, TARGET.label_id)
GROUP BY session_id, labeltype_id, label_id
ORDER BY session_id,labeltype_id, label_id
session_id labeltype_id label_id NOlabels
2 3 2 0
2 3 3 1
2 3 4 0
2 3 5 0
2 3 7 0
2 3 8 0
2 3 9 1
2 3 10 0
but it seems unnecessarily complicated. What am I doing wrong with the left join?
When doing a left join you have to count the null values from the left join as 0 records but still include them. You can accomplish this with a CASE construct in the inner query, then using the SUM aggregate function in the outer group-by.
SELECT session_id, labeltype_id, label_id, sum(has_label) NOlabels
FROM (
SELECT TARGET.session_id session_id, TARGET.labeltype_id labeltype_id, TARGET.label_id label_id, CASE WHEN SECONDARY.label_id is NULL then 0 else 1 END has_label
FROM
segment_extended TARGET
LEFT JOIN
segment_extended SECONDARY on
TARGET.session_id = SECONDARY.session_id
AND SECONDARY.lt_name ="Comments"
AND ((
SECONDARY.start <= TARGET.start AND TARGET.END <= SECONDARY.END)
OR (TARGET.start <= SECONDARY.start AND SECONDARY.END <= TARGET.END))
AND SECONDARY.label != ''
WHERE TARGET.lt_name= "Word" AND TARGET.label != '')
GROUP BY session_id, labeltype_id, label_id
Your join is not a left join.
A left join adds NULL values for the right table if there are no rows that match the join condition.
However, you query does not have a join condition, and the WHERE condition is not affect by the LEFT JOIN clause.
Replace WHERE with ON.
I have some difficulties with mySQL commands that I want to do.
SELECT a.timestamp, name, count(b.name)
FROM time a, id b
WHERE a.user = b.user
AND a.id = b.id
AND b.name = 'John'
AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
GROUP BY a.timestamp
This is my current output statement.
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:32:22 John 2
2010-11-16 10:35:12 John 7
2010-11-16 10:36:34 John 1
2010-11-16 10:37:45 John 2
2010-11-16 10:48:26 John 8
2010-11-16 10:55:00 John 9
2010-11-16 10:58:08 John 2
How do I group them into 5 minutes interval results?
I want my output to be like
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:30:00 John 2
2010-11-16 10:35:00 John 10
2010-11-16 10:40:00 John 0
2010-11-16 10:45:00 John 8
2010-11-16 10:50:00 John 0
2010-11-16 10:55:00 John 11
This works with every interval.
PostgreSQL
SELECT
TIMESTAMP WITH TIME ZONE 'epoch' +
INTERVAL '1 second' * round(extract('epoch' from timestamp) / 300) * 300 as timestamp,
name,
count(b.name)
FROM time a, id
WHERE …
GROUP BY
round(extract('epoch' from timestamp) / 300), name
MySQL
SELECT
timestamp, -- not sure about that
name,
count(b.name)
FROM time a, id
WHERE …
GROUP BY
UNIX_TIMESTAMP(timestamp) DIV 300, name
I came across the same issue.
I found that it is easy to group by any minute interval is
just dividing epoch by minutes in amount of seconds and then either rounding or using floor to get ride of the remainder. So if you want to get interval in 5 minutes you would use 300 seconds.
SELECT COUNT(*) cnt,
to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300)
AT TIME ZONE 'UTC' as interval_alias
FROM TABLE_NAME GROUP BY interval_alias
interval_alias cnt
------------------- ----
2010-11-16 10:30:00 2
2010-11-16 10:35:00 10
2010-11-16 10:45:00 8
2010-11-16 10:55:00 11
This will return the data correctly group by the selected minutes interval; however, it will not return the intervals that don't contains any data. In order to get those empty intervals we can use the function generate_series.
SELECT generate_series(MIN(date_trunc('hour',timestamp_column)),
max(date_trunc('minute',timestamp_column)),'5m') as interval_alias FROM
TABLE_NAME
Result:
interval_alias
-------------------
2010-11-16 10:30:00
2010-11-16 10:35:00
2010-11-16 10:40:00
2010-11-16 10:45:00
2010-11-16 10:50:00
2010-11-16 10:55:00
Now to get the result with interval with zero occurrences we just outer join both result sets.
SELECT series.minute as interval, coalesce(cnt.amnt,0) as count from
(
SELECT count(*) amnt,
to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300)
AT TIME ZONE 'UTC' as interval_alias
from TABLE_NAME group by interval_alias
) cnt
RIGHT JOIN
(
SELECT generate_series(min(date_trunc('hour',timestamp_column)),
max(date_trunc('minute',timestamp_column)),'5m') as minute from TABLE_NAME
) series
on series.minute = cnt.interval_alias
The end result will include the series with all 5 minute intervals even those that have no values.
interval count
------------------- ----
2010-11-16 10:30:00 2
2010-11-16 10:35:00 10
2010-11-16 10:40:00 0
2010-11-16 10:45:00 8
2010-11-16 10:50:00 0
2010-11-16 10:55:00 11
The interval can be easily changed by adjusting the last parameter of generate_series. In our case we use '5m' but it could be any interval we want.
You should rather use GROUP BY UNIX_TIMESTAMP(time_stamp) DIV 300 instead of round(../300) because of the rounding I found that some records are counted into two grouped result sets.
For postgres, I found it easier and more accurate to use the
date_trunc
function, like:
select name, sum(count), date_trunc('minute',timestamp) as timestamp
FROM table
WHERE xxx
GROUP BY name,date_trunc('minute',timestamp)
ORDER BY timestamp
You can provide various resolutions like 'minute','hour','day' etc... to date_trunc.
The query will be something like:
SELECT
DATE_FORMAT(
MIN(timestamp),
'%d/%m/%Y %H:%i:00'
) AS tmstamp,
name,
COUNT(id) AS cnt
FROM
table
GROUP BY ROUND(UNIX_TIMESTAMP(timestamp) / 300), name
Not sure if you still need it.
SELECT FROM_UNIXTIME(FLOOR((UNIX_TIMESTAMP(timestamp))/300)*300) AS t,timestamp,count(1) as c from users GROUP BY t ORDER BY t;
2016-10-29 19:35:00 | 2016-10-29 19:35:50 | 4 |
2016-10-29 19:40:00 | 2016-10-29 19:40:37 | 5 |
2016-10-29 19:45:00 | 2016-10-29 19:45:09 | 6 |
2016-10-29 19:50:00 | 2016-10-29 19:51:14 | 4 |
2016-10-29 19:55:00 | 2016-10-29 19:56:17 | 1 |
You're probably going to have to break up your timestamp into ymd:HM and use DIV 5 to split the minutes up into 5-minute bins -- something like
select year(a.timestamp),
month(a.timestamp),
hour(a.timestamp),
minute(a.timestamp) DIV 5,
name,
count(b.name)
FROM time a, id b
WHERE a.user = b.user AND a.id = b.id AND b.name = 'John'
AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
GROUP BY year(a.timestamp),
month(a.timestamp),
hour(a.timestamp),
minute(a.timestamp) DIV 12
...and then futz the output in client code to appear the way you like it. Or, you can build up the whole date string using the sql concat operatorinstead of getting separate columns, if you like.
select concat(year(a.timestamp), "-", month(a.timestamp), "-" ,day(a.timestamp),
" " , lpad(hour(a.timestamp),2,'0'), ":",
lpad((minute(a.timestamp) DIV 5) * 5, 2, '0'))
...and then group on that
How about this one:
select
from_unixtime(unix_timestamp(timestamp) - unix_timestamp(timestamp) mod 300) as ts,
sum(value)
from group_interval
group by ts
order by ts
;
I found out that with MySQL probably the correct query is the following:
SELECT SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 ) AS ts_CEILING,
SUM(value)
FROM group_interval
GROUP BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 )
ORDER BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 ) DESC
Let me know what you think.
select
CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUNd(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2)) AS '5MINDATE'
,count(something)
from TABLE
group by CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUNd(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2))
This will do exactly what you want.
Replace
dt - your datetime
c - call field
astro_transit1 - your table
300 as seconds for each time gap increase
SELECT
FROM_UNIXTIME(300 * ROUND(UNIX_TIMESTAMP(r.dt) / 300)) AS 5datetime,
(SELECT
r.c
FROM
astro_transit1 ra
WHERE
ra.dt = r.dt
ORDER BY ra.dt DESC
LIMIT 1) AS first_val
FROM
astro_transit1 r
GROUP BY UNIX_TIMESTAMP(r.dt) DIV 300
LIMIT 0 , 30
Based on #boecko answer for MySQL, I used a CTE (Common Table Expression) to accelerate the query execution time :
so this :
SELECT
`timestamp`,
`name`,
count(b.`name`)
FROM `time` a, `id` b
WHERE …
GROUP BY
UNIX_TIMESTAMP(`timestamp`) DIV 300, name
becomes :
WITH cte AS (
SELECT
`timestamp`,
`name`,
count(b.`name`),
UNIX_TIMESTAMP(`timestamp`) DIV 300 AS `intervals`
FROM `time` a, `id` b
WHERE …
)
SELECT * FROM cte GROUP BY `intervals`
In a large amount of data, the speed is accelerated by more than 10!
As timestamp and time are reserved in MySQL, don't forget to use `...` on each table and column name !
Hope it will help some of you.