Calculate difference between two dates and group and count the results - ruby-on-rails

If I have a table with a field date_completed, I would like to calculate the difference in months between this date and today:
(date_completed.year * 12 + date_completed.month) - (Date.today.year * 12 + Date.today.month)
and then group the results based on the number of months e.g.
No. of Months => Count
1 => 10
2 => 5
3 => 6
etc.
Is it possible to calculate the difference between two dates in a database and then group and count the results?

A PostgreSQL-specific query would look something like the one below; it uses PostgreSQL's extract and age functions (refer to the documentation):
select extract(year from age(date_completed)) * 12 +
       extract(month from age(date_completed)) as months,
       count(*)
from learn
group by months;
Sample output:
"months"|"count"
2| 2
3| 1
4| 1
The example uses a table named learn; change it to your own table name.
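For illustration only (not Rails or SQL), the same (year * 12 + month) arithmetic and grouping can be sketched in plain Python, using hypothetical date_completed values:

```python
from collections import Counter
from datetime import date

def months_between(earlier, later):
    # whole-month difference via (year * 12 + month) arithmetic, as in the question
    return (later.year * 12 + later.month) - (earlier.year * 12 + earlier.month)

# hypothetical date_completed values
completed = [date(2023, 1, 15), date(2023, 2, 1), date(2023, 2, 20), date(2023, 3, 5)]
today = date(2023, 4, 1)

# group by the month difference and count, like the SQL GROUP BY above
counts = Counter(months_between(d, today) for d in completed)
```

Here counts maps each month difference to how many rows fall into it, mirroring the months/count columns of the SQL output.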

Related

Postgresql sum all unique values from previous dates

Let's say, for simplicity's sake, I have the following table:
id amount p_id date
------------------------------------------------
1 5 1 2020-01-01T01:00:00
2 10 1 2020-01-01T01:10:00
3 15 2 2020-01-01T01:20:00
4 10 3 2020-01-01T03:30:00
5 10 4 2020-01-01T03:50:00
6 20 1 2020-01-01T03:40:00
Here's a sample response I want:
{
"2020-01-01T01:00:00": 25, -- this is from adding records with ids: 2 and 3
"2020-01-01T03:00:00": 55 -- this is from adding records with ids: 3,4,5 and 6
}
I want to get the total (sum(amount)) of all unique p_id's grouped by the hour.
The row chosen per p_id is the one with the latest date. So, for example, the first value in the response above doesn't include id 1, because the record with id 2 has the same p_id and a later date.
The tricky part is that I want to include the amount for every p_id whose latest date falls at or before the hour in question. For example, in the second entry of the response (key "2020-01-01T03:00:00"), even though id 3 has a timestamp in an earlier hour, it is the latest row for p_id 2 and is therefore included in the sum for "2020-01-01T03:00:00". The row with id 6, however, overrides id 2, since both share p_id 1.
In other words: always take the latest amount seen so far for each p_id, and compute the sum for every distinct hour found in the table.
Create a CTE that includes row_number() over (partition by p_id, date_trunc('hour',"date") order by "date" desc) as pid_hr_seq
Then write your query against that CTE with where pid_hr_seq = 1.
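To make the intended result concrete, here is a plain-Python sketch of the "latest amount per p_id, summed per hour" logic using the sample rows from the question (illustrative only, not the SQL answer itself):

```python
from datetime import datetime

rows = [
    # (id, amount, p_id, date)
    (1, 5, 1, "2020-01-01T01:00:00"),
    (2, 10, 1, "2020-01-01T01:10:00"),
    (3, 15, 2, "2020-01-01T01:20:00"),
    (4, 10, 3, "2020-01-01T03:30:00"),
    (5, 10, 4, "2020-01-01T03:50:00"),
    (6, 20, 1, "2020-01-01T03:40:00"),
]

def hourly_sums(rows):
    parsed = [(amt, pid, datetime.fromisoformat(ts)) for _, amt, pid, ts in rows]
    hour = lambda dt: dt.replace(minute=0, second=0, microsecond=0)
    result = {}
    for h in sorted({hour(dt) for _, _, dt in parsed}):
        # latest row per p_id among everything dated in or before this hour
        latest = {}
        for amt, pid, dt in parsed:
            if hour(dt) <= h and (pid not in latest or dt > latest[pid][0]):
                latest[pid] = (dt, amt)
        result[h.isoformat()] = sum(a for _, a in latest.values())
    return result
```

Running hourly_sums(rows) reproduces the sample response: 25 for the 01:00 hour and 55 for the 03:00 hour.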

Treat Null value as zero

I have a query to influxdb, something like:
SELECT last("Shop1.balance")+last("Shop2.balance")+last("Shop2.balance") + last("Shop2.balance") FROM "balances" WHERE $timeFilter GROUP BY time($__interval) fill(previous)
where I receive in Grafana a total balance for all shops. It works fine until I add a new shop with new data: the new shop's balance can be null for some interval in history, and then the whole calculated value becomes null. I cannot reorganize my database, but maybe I can change my query so that null intervals are treated as zero in the total sum.
This works as you've asked:
SELECT last("Shop1.balance") + last("Shop2.balance") + last("Shop3.balance")
FROM "balances"
WHERE $timeFilter
GROUP BY time($__interval) fill(0)
Demo
select last(balance1) as last_balance_1, last(balance2) as last_balance_2
from balance
where time > '2018-03-07T10:44:14.0000000Z' and time < '2018-03-07T10:44:52.0000000Z'
group by time(7s)
Result:
name: balance
time last_balance_1 last_balance_2
---- -------------- --------------
2018-03-07T10:44:13Z 1 2
2018-03-07T10:44:20Z 2 3
2018-03-07T10:44:27Z 4
2018-03-07T10:44:34Z 5 6
2018-03-07T10:44:41Z 7
2018-03-07T10:44:48Z 8 9
select last(balance1) as last_balance_1, last(balance2) as last_balance_2
from balance
where time > '2018-03-07T10:44:14.0000000Z' and time < '2018-03-07T10:44:52.0000000Z'
group by time(7s) fill(0)
Result:
name: balance
time last_balance_1 last_balance_2
---- -------------- --------------
2018-03-07T10:44:13Z 1 2
2018-03-07T10:44:20Z 2 3
2018-03-07T10:44:27Z 4 0
2018-03-07T10:44:34Z 5 6
2018-03-07T10:44:41Z 7 0
2018-03-07T10:44:48Z 8 9
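The effect of fill(0) — substituting zero where a series has no point in an interval — can be mimicked in plain Python when summing possibly-missing balances (hypothetical values, for illustration only):

```python
# balances reported in one GROUP BY interval; Shop2 has no point yet (null)
interval = {"Shop1": 4, "Shop2": None, "Shop3": 7}

# adding a null would poison the whole total, so coalesce each missing
# value to zero first, which is exactly what fill(0) does per interval
total = sum(v if v is not None else 0 for v in interval.values())
```

Without the coalescing step the total for that interval would be undefined, which is the null result the question describes.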

Find total price given the quantity for each month and the rate for each month

Question: What is the best formula to add quantity multiplied by different rates based on dates?
Example:
Rate 1: 1/1/2017 - 4/30/2017 $5
Rate 2: 5/1/2017 - 6/30/2017 $6
Rate 3: 7/1/2017 - 12/31/2017 $7
Apples per month
Jan: 10
Feb: 30
Mar: 10
Apr: 50
May: 40
Jun: 70
Jul: 80
Aug: 40
Sep: 20
Oct: 10
Nov: 40
Dec: 100
The date is 5/9/2017. The formula would find that the date starts with Rate 2 and ends with Rate 3.
So, the calculation would be:
((40 + 70) * $6) + ((80 + 40 + 20 + 10 + 40 + 100) * $7) = $2,690.00
What is the best formula to calculate this way? Use SUMPRODUCT? I am baffled...
Format a lookup table for rates, in which the first column is the date on which each rate goes into effect:
+----------+---+
| 1/1/2017 | 5 |
| 5/1/2017 | 6 |
| 7/1/2017 | 7 |
+----------+---+
Let's say the above is A1:B3. Now you can lookup the rate for any date (say a date is in C9) with
=vlookup(C9, A1:B3, 2, True)
Here vlookup searches for the date in the first column and returns the value from the second column for the nearest match that is less than or equal to the search key.
Then you can use sumproduct like this:
=sumproduct(D9:D15, arrayformula(vlookup(C9:C15, A1:B3, 2, True)))
Here the rate is looked up for each date in the range C9:C15 (so those should be the first day of each month).
Finally, you want to do all of this given a date like 5/9/2017. Suppose this date is in E1. First, get the first day of its month, say in F1:
=date(2017, month(E1), 1)
Then filter the arrays of dates and amounts by the condition that the date is at least F1: filter(C:C, C:C >= F1). The final result will look like
=sumproduct(filter(D:D, C:C >= F1), arrayformula(vlookup(filter(C:C, C:C >= F1), A1:B3, 2, True)))
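The same lookup-then-sumproduct logic can be sketched in Python (illustrative, not a spreadsheet formula), using the rate table and monthly quantities from the question:

```python
from datetime import date

# rate lookup table: (effective-from date, rate), mirroring A1:B3
rates = [(date(2017, 1, 1), 5), (date(2017, 5, 1), 6), (date(2017, 7, 1), 7)]
# apples per month of 2017, from the question
apples = {1: 10, 2: 30, 3: 10, 4: 50, 5: 40, 6: 70,
          7: 80, 8: 40, 9: 20, 10: 10, 11: 40, 12: 100}

def rate_for(d):
    # approximate-match VLOOKUP: latest effective date <= d
    return max((r for r in rates if r[0] <= d), key=lambda r: r[0])[1]

given = date(2017, 5, 9)
cutoff = date(given.year, given.month, 1)  # first day of the given month

# sum quantity * applicable rate over months from the cutoff onward
total = sum(qty * rate_for(date(2017, m, 1))
            for m, qty in apples.items()
            if date(2017, m, 1) >= cutoff)
```

For 5/9/2017 this yields 2690, matching the worked calculation in the question.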

Influxdb: How to get count of number of results in a group by query

Is there any way that I can get the count of the total number of results / points / records in a GROUP BY query result?
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z 2
2015-08-18T00:24:00Z 2
I expect the count as 3 in this case. Even though I can calculate the number of results using the time period and interval (12m) here, I would like to know whether it is possible to do so with a query to database.
You can use multiple aggregates in a single query.
Using your example, wrap it in a select count(*) from (<inner query>):
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m))
name: h2o_feet
--------------
time count_count
1970-01-01T00:00:00Z 3
However, if the GROUP BY returns empty rows, they will not be counted.
For example, counting over the table below results in a count of 2 rather than 3:
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z
2015-08-18T00:24:00Z 2
To include empty rows in your count, add fill(1) to the inner query like this:
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) fill(1))
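The inner GROUP BY time(12m) produces one row per 12-minute bucket, and the outer COUNT(*) counts those rows. A plain-Python sketch of the same bucketing, using the 30-minute window from the question (for illustration only):

```python
from datetime import datetime, timedelta

start = datetime(2015, 8, 18, 0, 0)
# sample points inside the window, two per 12-minute bucket
points = [start + timedelta(minutes=m) for m in (0, 6, 12, 18, 24, 30)]

# group points into 12-minute buckets, counting per bucket
buckets = {}
for p in points:
    b = start + timedelta(minutes=12 * ((p - start) // timedelta(minutes=12)))
    buckets[b] = buckets.get(b, 0) + 1

num_rows = len(buckets)  # what the outer COUNT(*) reports
```

With these points there are three buckets of two points each, matching the expected count of 3.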
You will need to do some manual work. Run it directly:
$ influx -execute "select * from measurement_name" -database="db_name" | wc -l
This returns 4 more than the actual number of rows.
Here is an example,
luvpreet#DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles" | wc -l
5
luvpreet#DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles"
name: yprices
time price
---- -----
1493626629063286219 2
luvpreet#DHARI-Inspiron-3542:~/www$
So now you can see why you subtract 4 from the value: the output includes the measurement name, the column headers, the separator line, and a blank line.

Using sum when grouping one or more columns in LINQ

I have 2 tables like below.
For comments to vote
VoteId VoteValue UserId CommentId DateAdded
1 1 1 1 10/11/2013
2 1 5 1 10/14/2013
3 1 9 2 09/08/2013
4 1 11 3 01/03/2014
For users that take point values
PointId Date PointValue UserId
1 10/11/2013 1 1
2 10/14/2013 1 5
3 09/08/2013 1 9
4 01/03/2014 1 11
I need to find the 10 users who received the most votes each month across all comments. First I tried to write LINQ like this:
var object = (db.Comments.
Where(c => c.ApplicationUser.Id == comment.ApplicationUser.Id).
FirstOrDefault()).ToList();
I can't work out how to use Sum and add the points to my table. Any help?
I hope it's clear.
First you should extract the month from the datetime value, then group by month, order the groups by the summed votes descending, and use Take(10) at the end.
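The steps above — group by month and user, sum, then take the top 10 — can be sketched in Python using the sample vote rows from the question (a sketch of the logic only, not the C# LINQ itself):

```python
from collections import defaultdict
from datetime import date

# (vote_value, user_id, date_added) rows from the votes table
votes = [
    (1, 1, date(2013, 10, 11)),
    (1, 5, date(2013, 10, 14)),
    (1, 9, date(2013, 9, 8)),
    (1, 11, date(2014, 1, 3)),
]

# sum vote values per (year, month, user)
totals = defaultdict(int)
for value, user, d in votes:
    totals[(d.year, d.month, user)] += value

# collect users per month and keep the 10 with the highest totals
by_month = defaultdict(list)
for (y, m, user), s in totals.items():
    by_month[(y, m)].append((user, s))
top10 = {month: sorted(users, key=lambda u: -u[1])[:10]
         for month, users in by_month.items()}
```

In LINQ the same shape would be a GroupBy on the month (and user), a Sum over the vote values, an OrderByDescending on that sum, and a Take(10).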