Difference between actual samples vs filled values - InfluxDB

I expected a subquery with GROUP BY time intervals and fill(previous) to behave identically to an equivalent materialized dataset, but a query against the subquery returns unexpected results.
Environment
InfluxDB v1.6.3
What Happened
Here is a sample dataset:
# DDL
CREATE DATABASE myapp
# DML
# CONTEXT-DATABASE: myapp
reqcnt,endpoint=a value=1 1563361000000000000
reqcnt,endpoint=b value=1 1563361000000000000
reqcnt,endpoint=a value=2 1563361005000000000
reqcnt,endpoint=b value=2 1563361005000000000
reqcnt,endpoint=a value=3 1563361010000000000
reqcnt,endpoint=b value=3 1563361010000000000
You can import the above data by executing:
$ influx -import /path/to/dataset -precision ns
The data is simple: three values per endpoint, measured every 5 seconds. But because I want a dataset with per-second values, I wrote a subquery that fills missing values with the previous ones:
> SELECT LAST(value) AS lv FROM reqcnt WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s), endpoint fill(previous)
name: reqcnt
tags: endpoint=a
time lv
---- --
1563361000000000000 1
1563361001000000000 1
1563361002000000000 1
1563361003000000000 1
1563361004000000000 1
1563361005000000000 2
1563361006000000000 2
1563361007000000000 2
1563361008000000000 2
1563361009000000000 2
1563361010000000000 3
name: reqcnt
tags: endpoint=b
time lv
---- --
1563361000000000000 1
1563361001000000000 1
1563361002000000000 1
1563361003000000000 1
1563361004000000000 1
1563361005000000000 2
1563361006000000000 2
1563361007000000000 2
1563361008000000000 2
1563361009000000000 2
1563361010000000000 3
So far so good. Now I'd like to sum across endpoints:
> SELECT SUM(value) FROM (SELECT LAST(value) AS value FROM reqcnt WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s), endpoint fill(previous)) WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s)
name: reqcnt
time sum
---- ---
1563361000000000000 1
1563361001000000000 1
1563361002000000000 1
1563361003000000000 1
1563361004000000000 1
1563361005000000000 1
1563361006000000000 1
1563361007000000000 1
1563361008000000000 1
1563361009000000000 1
1563361010000000000 1
1563361000000000000 1
1563361001000000000 1
1563361002000000000 1
1563361003000000000 1
1563361004000000000 1
1563361005000000000 1
1563361006000000000 1
1563361007000000000 1
1563361008000000000 1
1563361009000000000 1
1563361010000000000 1
1563361005000000000 2
1563361006000000000 2
1563361007000000000 2
1563361008000000000 2
1563361009000000000 2
1563361010000000000 2
1563361005000000000 2
1563361006000000000 2
1563361007000000000 2
1563361008000000000 2
1563361009000000000 2
1563361010000000000 8
Why does the result contain multiple values for the same timestamp? And why 34 values, when I summed two series of 11 values each?
What I expected
> SELECT SUM(value) FROM (SELECT LAST(value) AS value FROM reqcnt WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s), endpoint fill(previous)) WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s)
name: reqcnt
time sum
---- ---
1563361000000000000 2
1563361001000000000 2
1563361002000000000 2
1563361003000000000 2
1563361004000000000 2
1563361005000000000 2
1563361006000000000 4
1563361007000000000 4
1563361008000000000 4
1563361009000000000 4
1563361010000000000 6
Creating a per-second dataset and querying it returns the expected result:
# DDL
CREATE DATABASE myapp
# DML
# CONTEXT-DATABASE: myapp
reqcnt2,endpoint=a value=1 1563361000000000000
reqcnt2,endpoint=b value=1 1563361000000000000
reqcnt2,endpoint=a value=1 1563361001000000000
reqcnt2,endpoint=b value=1 1563361001000000000
reqcnt2,endpoint=a value=1 1563361002000000000
reqcnt2,endpoint=b value=1 1563361002000000000
reqcnt2,endpoint=a value=1 1563361003000000000
reqcnt2,endpoint=b value=1 1563361003000000000
reqcnt2,endpoint=a value=1 1563361004000000000
reqcnt2,endpoint=b value=1 1563361004000000000
reqcnt2,endpoint=a value=2 1563361005000000000
reqcnt2,endpoint=a value=2 1563361005000000000
reqcnt2,endpoint=b value=2 1563361006000000000
reqcnt2,endpoint=a value=2 1563361006000000000
reqcnt2,endpoint=b value=2 1563361007000000000
reqcnt2,endpoint=a value=2 1563361007000000000
reqcnt2,endpoint=b value=2 1563361008000000000
reqcnt2,endpoint=a value=2 1563361008000000000
reqcnt2,endpoint=b value=2 1563361009000000000
reqcnt2,endpoint=a value=2 1563361009000000000
reqcnt2,endpoint=a value=3 1563361010000000000
reqcnt2,endpoint=b value=3 1563361010000000000
Importing the above and querying the dataset shows the expected results:
> SELECT SUM(value) FROM reqcnt2 WHERE time >= 1563361000000000000 AND time <= 1563361010000000000 GROUP BY time(1s)
name: reqcnt2
time sum
---- ---
1563361000000000000 2
1563361001000000000 2
1563361002000000000 2
1563361003000000000 2
1563361004000000000 2
1563361005000000000 2
1563361006000000000 4
1563361007000000000 4
1563361008000000000 4
1563361009000000000 4
1563361010000000000 6
So what is the difference? Should a filled dataset not be queried the same way as an actual dataset? Is there a proper way to query such a subquery?
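As a workaround until the subquery behaves as expected, the fill-then-sum semantics can be reproduced client-side. A minimal Ruby sketch (hypothetical data mirroring the reqcnt points above, with timestamps in seconds for brevity); note that a straightforward fill gives both endpoints the value 2 at second 5, so the sum there is 4:

```ruby
# Sparse samples per endpoint (timestamp in seconds => value),
# mirroring the reqcnt dataset above.
samples = {
  "a" => { 1563361000 => 1, 1563361005 => 2, 1563361010 => 3 },
  "b" => { 1563361000 => 1, 1563361005 => 2, 1563361010 => 3 },
}

t0, t1 = 1563361000, 1563361010

# Forward-fill each endpoint to per-second resolution, i.e. fill(previous).
filled = samples.transform_values do |points|
  last = nil
  (t0..t1).each_with_object({}) do |t, h|
    last = points[t] if points.key?(t)
    h[t] = last
  end
end

# Sum across endpoints per timestamp.
sums = (t0..t1).to_h { |t| [t, filled.values.sum { |series| series[t] }] }

sums.each { |t, v| puts "#{t} #{v}" }
# seconds 0-4 sum to 2, seconds 5-9 sum to 4, second 10 sums to 6
```

This is only a client-side sketch of the expected semantics, not an InfluxDB feature; it trades query-side convenience for pulling the raw points into the application.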

Related

Custom query to fetch all entries of a table and that only contains first of many duplicates based on a specific column

I have a Location model and the table looks like:
id | name    | vin | ip_address     | created_at                 | updated_at
---|---------|-----|----------------|----------------------------|----------------------------
0  | default | 0   | 0.0.0.0/0      | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
1  | admin   | 1   | 10.108.150.143 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
2  | V122    | 122 | 10.108.150.122 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
3  | V123    | 123 | 10.108.150.123 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
4  | V124    | 124 | 10.108.150.124 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
5  | V122    | 122 | 10.108.150.122 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
6  | V125    | 122 | 10.108.150.125 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
My method in the Location model:
def self.find_all_non_duplicate
  return self.find(:all, :conditions => "id <> 1")
end
I want to fetch all entries of the locations table except the entry with id = 1, keeping only the first of any duplicates based on the ip_address column.
Since the ip_address values of id = 2 and id = 5 are duplicates, I want to keep the first one, i.e. id = 2.
The expected result is:
id | name    | vin | ip_address     | created_at                 | updated_at
---|---------|-----|----------------|----------------------------|----------------------------
0  | default | 0   | 0.0.0.0/0      | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
2  | V122    | 122 | 10.108.150.122 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
3  | V123    | 123 | 10.108.150.123 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
4  | V124    | 124 | 10.108.150.124 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
6  | V125    | 122 | 10.108.150.125 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
The entries with ids 1 and 5 are to be ignored.
What you need is DISTINCT ON, proposed for RoR quite recently here but not yet merged, as pointed out by @engineersmnky. In its raw SQL form it will look like this:
select distinct on (ip_address) *
from test
where id<>1
order by ip_address,created_at;
Which would translate to RoR's
self.where("id <> 1").distinct_on(:ip_address)
or, until the new feature gets accepted:
self.where("id <> 1").select("distinct on (ip_address) *")
Full db-side test:
drop table if exists test cascade;
create table test (
id serial primary key,
name text,
vin integer,
ip_address inet,
created_at timestamp,
updated_at timestamp);
insert into test
(id,name,vin,ip_address,created_at,updated_at)
values
(0,'default', 0,'0.0.0.0/0'::inet,'2021-11-08 11:54:26.822623'::timestamp,'2021-11-08 11:54:26.822623'::timestamp),
(1,'admin', 1,'10.108.150.143'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(2,'V122', 122,'10.108.150.122'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(3,'V123', 123,'10.108.150.123'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(4,'V124', 124,'10.108.150.124'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(5,'V122', 122,'10.108.150.122'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(6,'V125', 122,'10.108.150.125'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp);
select distinct on (ip_address) *
from test where id<>1
order by ip_address,created_at;
-- id | name    | vin | ip_address     | created_at                 | updated_at
-- ---+---------+-----+----------------+----------------------------+----------------------------
--  0 | default |   0 | 0.0.0.0/0      | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
--  2 | V122    | 122 | 10.108.150.122 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  3 | V123    | 123 | 10.108.150.123 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  4 | V124    | 124 | 10.108.150.124 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  6 | V125    | 122 | 10.108.150.125 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
-- (5 rows)
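Until `distinct_on` lands, and for adapters without DISTINCT ON, the same dedup can also be done Ruby-side after loading the rows. A sketch on struct stand-ins (hypothetical data, not the real model):

```ruby
# LocationRow is a hypothetical stand-in for the Location model.
LocationRow = Struct.new(:id, :name, :ip_address, :created_at)

locations = [
  LocationRow.new(0, "default", "0.0.0.0/0",      "2021-11-08 11:54:26.822623"),
  LocationRow.new(1, "admin",   "10.108.150.143", "2021-11-08 11:54:26.82885"),
  LocationRow.new(2, "V122",    "10.108.150.122", "2021-11-08 11:54:26.82885"),
  LocationRow.new(5, "V122",    "10.108.150.122", "2021-11-08 11:54:26.82885"),
  LocationRow.new(6, "V125",    "10.108.150.125", "2021-11-08 11:54:26.82885"),
]

# Drop id 1, order so the earliest row per ip_address comes first
# (id as a tiebreaker), then keep the first row for each ip_address.
result = locations
  .reject { |l| l.id == 1 }
  .sort_by { |l| [l.ip_address, l.created_at, l.id] }
  .uniq(&:ip_address)

p result.map(&:id) # => [0, 2, 6]
```

With the real model this would be roughly `Location.where("id <> 1").sort_by { ... }.uniq { |l| l.ip_address }`, trading the database-side dedup for loading all rows into memory.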

Merge / Combine Two Active Record Results (has_and_belongs_to_many and belongs_to)

I think this is perhaps straightforward, but I can't seem to figure it out.
I have a Model that has the following associations:
has_and_belongs_to_many :locations, join_table: :model_locations
belongs_to :location_from, class_name: "Location", foreign_key: "location_from_id"
belongs_to :location_to, class_name: "Location", foreign_key: "location_to_id"
So model.locations can return 0, 1 or multiple records, while model.location_from and model.location_to are always present and are single records.
What I am looking for is a combined result of all of these. I know there is a convoluted SQL query to do this, but it would be nice to have a simple Active Record statement. I have looked at merge() and << but neither seems to work.
For reference, the SQL output from the has_and_belongs_to_many:
Location Load (0.6ms) SELECT "locations".* FROM "locations" INNER JOIN "model_locations" ON "locations"."id" = "model_locations"."location_id" WHERE "model_locations"."model_id" = $1 [["model_id", 17]]
The preferred answer is via Active Record, but raw SQL will do the trick too.
UPDATE
Progress added in an answer below; I will still accept answers that complement mine.
I am halfway there (sort of). I added an answer here so as not to make the question too long.
Here is my data:
irb(main):169:0> Location.find_by_sql('SELECT "model_locations".* FROM "model_locations"')
Location Load (0.5ms) SELECT "model_locations".* FROM "model_locations"
+----+---------------------+-------------+
| id | model_id | location_id |
+----+---------------------+-------------+
| | 17 | 50 |
| | 17 | 51 |
| | 10 | 24 |
| | 19 | 11 |
| | 19 | 5 |
| | 19 | 51 |
+----+---------------------+-------------+
6 rows in set
irb(main):174:0> Model.select(:id, :location_from_id, :location_to_id)
Model Load (0.7ms) SELECT "models"."id", "models"."location_from_id", "models"."location_to_id" FROM "models"
+----+------------------+----------------+
| id | location_from_id | location_to_id |
+----+------------------+----------------+
| 17 | 1 | 5 |
| 18 | 50 | 24 |
| 10 | 3 | 8 |
| 1 | 50 | 11 |
| 19 | 1 | 5 |
| 20 | 1 | 11 |
| 21 | 11 | 5 |
+----+------------------+----------------+
7 rows in set
So for example:
Model 17 has Locations 50, 51 AND 1, 5
Location 11 has Models 19, 1 AND 20, 21
So I can find the Model Locations:
'SELECT "locations".* FROM "locations" LEFT JOIN "model_locations" ON "locations"."id" = "model_locations"."location_id" WHERE "model_locations"."model_id" = 17 OR "locations"."id" IN (1,5)'
This works great - I get my 4 locations. However, I can't get the reverse to work:
'SELECT "models".* FROM "models" WHERE "models"."location_from_id" = 11 OR "models"."location_to_id" = 11 INNER JOIN "model_locations" ON "model_locations"."model_id" = "models"."id" WHERE "models_locations"."location_id" = 11'
This fails at the INNER JOIN:
PG::SyntaxError: ERROR: syntax error at or near "INNER"
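A Ruby-side merge avoids the broken single statement entirely. The sketch below uses plain structs and the sample ids from the irb output above (hypothetical stand-ins, not the real models):

```ruby
# ModelRec is a hypothetical stand-in for the Model class.
ModelRec = Struct.new(:id, :location_from_id, :location_to_id)

# [model_id, location_id] pairs from model_locations.
join_rows = [[17, 50], [17, 51], [10, 24], [19, 11], [19, 5], [19, 51]]
models = [
  ModelRec.new(17, 1, 5),  ModelRec.new(18, 50, 24), ModelRec.new(10, 3, 8),
  ModelRec.new(1, 50, 11), ModelRec.new(19, 1, 5),   ModelRec.new(20, 1, 11),
  ModelRec.new(21, 11, 5),
]

# All location ids for model 17: habtm join rows plus the from/to columns.
m17 = models.find { |m| m.id == 17 }
habtm_ids = join_rows.select { |model_id, _| model_id == 17 }.map { |_, location_id| location_id }
location_ids = (habtm_ids + [m17.location_from_id, m17.location_to_id]).uniq
p location_ids # => [50, 51, 1, 5]

# The reverse: all model ids touching location 11.
via_join = join_rows.select { |_, location_id| location_id == 11 }.map { |model_id, _| model_id }
via_cols = models.select { |m| m.location_from_id == 11 || m.location_to_id == 11 }.map(&:id)
model_ids = (via_join + via_cols).uniq
p model_ids # => [19, 1, 20, 21]
```

With the real models the forward direction is roughly `(model.locations.to_a + [model.location_from, model.location_to]).compact.uniq`, and the reverse is two separate queries (the habtm side plus a `where` on the two foreign-key columns) merged the same way.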

How can I count the total votes in the worker table in Rails [duplicate]

This question already has answers here:
Rails has_many association count child rows
(5 answers)
Closed 8 years ago.
I'm confused when I try to count the total votes in my votes table in Rails.
serviceproviders has_many votes
votes belongs_to serviceproviders
I tried like this :
sp = Serviceprovider.joins(:votes).group_by(&:id).count
but it doesn't get the right output.
example output I want is:
If Jhon Doe has 5 rows of votes in the table, I want to get a total of 5 votes when I query. Can anyone give me an idea of how to execute the query? Thank you!
Update:
Thank you for those answers.
I tried this in my rails c.
vote = Vote.joins(:serviceprovider).group(:serviceprovider_id).count
and I got the results: {108=>2, 109=>1}
My question is: how can I get the top 10 highest votes?
Here is the table:
app_development=# select * from votes;
id | city | created_at | updated_at | service_provider_id
----+---------+----------------------------+----------------------------+---------------------
1 | B\'lore | 2015-02-19 17:35:58.061324 | 2015-02-19 17:35:58.083479 | 3
2 | Kol | 2015-02-19 17:35:58.103013 | 2015-02-19 17:35:58.123405 | 2
3 | Mum | 2015-02-19 17:35:58.11242 | 2015-02-19 17:35:58.125345 | 2
4 | Kochin | 2015-02-19 17:35:58.136139 | 2015-02-19 17:35:58.167971 | 1
5 | Mum | 2015-02-19 17:35:58.145833 | 2015-02-19 17:35:58.170319 | 1
6 | Chennai | 2015-02-19 17:35:58.156755 | 2015-02-19 17:35:58.171996 | 1
(6 rows)
app_development=# select * from service_providers;
id | name | created_at | updated_at
----+------+----------------------------+----------------------------
1 | MTS | 2015-02-19 17:35:57.837508 | 2015-02-19 17:35:57.837508
2 | HCL | 2015-02-19 17:35:57.923479 | 2015-02-19 17:35:57.923479
3 | ACL | 2015-02-19 17:35:57.934414 | 2015-02-19 17:35:57.934414
You need the following query to obtain the desired result:
Vote.joins(:service_provider)
.group(:service_provider_id)
.order("count_all desc")
.limit(10)
.count
Tested in the Rails console:
[arup#app]$ rails c
Loading development environment (Rails 4.1.1)
[1] pry(main)> Vote.joins(:service_provider).group(:service_provider_id).order("count_all desc").limit(2).count
(2.0ms) SELECT COUNT(*) AS count_all, service_provider_id AS service_provider_id FROM "votes" INNER JOIN "service_providers" ON "service_providers"."id" = "votes"."service_provider_id" GROUP BY service_provider_id ORDER BY count_all desc LIMIT 2
=> {1=>3, 2=>2}
[2] pry(main)>
Try this:
sp = ServiceProvider.find_by_name("Jhon Doe")
votes = sp.votes.count
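For getting the top N out of an already-computed counts hash (like the `{108=>2, 109=>1}` above), plain Ruby works too; a sketch on made-up data:

```ruby
counts = { 108 => 2, 109 => 1, 110 => 7 } # serviceprovider_id => vote count (made up)

# Enumerable#max_by(n) returns the n largest pairs in descending order.
top = counts.max_by(10) { |_id, votes| votes }.to_h
p top # => {110=>7, 108=>2, 109=>1}
```

This only makes sense when the hash is small; for large tables the `order("count_all desc").limit(10)` approach above keeps the work in the database.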

Rails - Get Column Value with Group By

I have the following tables with entries that might look like:
Table: sessions
ID | CREATED_AT | UPDATED_AT | USER_ID | GROUP_ID
---|-------------------------|-------------------------|---------|----------
27 | 2014-07-01 23:02:16 | 2014-07-01 23:03:18 | 1 | 1
28 | 2014-07-02 16:55:25 | 2014-07-02 17:31:40 | 1 | 2
29 | 2014-07-07 20:31:13 | 2014-07-07 20:34:17 | 1 | 3
Table: groups
ID | NAME | CREATED_AT | UPDATED_AT
---|-------------------|-------------------------|------------------------
1 | Marching | 2013-12-17 19:45:28 | 2013-12-17 19:45:28
2 | Reaching | 2014-02-07 17:29:59 | 2014-02-07 17:29:59
3 | Picking | 2014-03-11 21:38:56 | 2014-03-11 21:38:56
And I have the following query in Rails:
Session.joins(:group).select('groups.name').where(:user_id => 1).group('sessions.group_id').count
Which returns the following keys and values:
=> {2=>7, 1=>3, 3=>1} (The "key" is the group_id and the "value" is the # of times it occurs.)
My question is: Instead of returning the "id" as the key, is it possible for me to return the groups.name instead? Which would look like:
=> {"Reaching"=>7, "Marching"=>3, "Picking"=>1}
If not, would I have to loop through and re-query again based on each group_id?
Thanks very much.
This should work if you have everything set up in your models.
data = Group.joins(:sessions).select('name, count(*) as occurrence_count').where('sessions.user_id = ?', 1).group('groups.name')
Then you can access it like this
data.first.occurrence_count
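Grouping by `groups.name` instead of the id should also key the counts by name directly (e.g. `Session.joins(:group).where(user_id: 1).group('groups.name').count`). If you keep the `occurrence_count` approach, the rows fold into the asked-for hash shape; a plain-Ruby sketch with struct rows standing in for the query result:

```ruby
# SessionRow is a hypothetical stand-in for one row of the grouped query result.
SessionRow = Struct.new(:name, :occurrence_count)

data = [
  SessionRow.new("Reaching", 7),
  SessionRow.new("Marching", 3),
  SessionRow.new("Picking", 1),
]

# Fold rows into the {name => count} hash the question asks for.
by_name = data.to_h { |row| [row.name, row.occurrence_count] }
p by_name # => {"Reaching"=>7, "Marching"=>3, "Picking"=>1}
```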

Rails fetch row with latest update

I have such db table structure:
id | currency_list_id | direction_id | value | updated_at
and i have such data:
1 | 1 | 1 | 8150 | 09-08-2010 01:00:00
1 | 1 | 2 | 8250 | 09-08-2010 01:00:00
1 | 2 | 1 | 8150 | 06-08-2010 01:00:00
1 | 2 | 2 | 8150 | 06-08-2010 01:00:00
1 | 1 | 1 | 8150 | 09-08-2010 15:00:00
1 | 1 | 2 | 8250 | 09-08-2010 15:00:00
The currency rate in the exchanger is set almost every day and can be set more than once per day, but it may also have last been set some days ago. I need to fetch the latest actual data.
How can I fetch only the latest actual data in Rails (Ruby)?
In my example result will be:
1 | 2 | 1 | 8150 | 06-08-2010 01:00:00
1 | 2 | 2 | 8150 | 06-08-2010 01:00:00
1 | 1 | 1 | 8150 | 09-08-2010 15:00:00
1 | 1 | 2 | 8250 | 09-08-2010 15:00:00
How can I do this?
I tried:
#currencies = CurrencyValue.find(:all, :conditions => {:currency_list_id => id}, :order => :updated_at)
but that fetches all data for a given currency_list_id, ordered. How do I fetch only the last 2 ordered rows?
#currencies = CurrencyValue.find(:all, :conditions => {:currency_list_id => id}, :order => :updated_at).last(2)
I think :). Can't check this right now.
I think what you need may be a GROUP BY. There's another question that explains it here:
How to get the latest record in each group using GROUP BY?
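The linked GROUP BY idea can also be sketched in plain Ruby: group rows by (currency_list_id, direction_id) and keep the row with the greatest updated_at. The struct rows below are stand-ins for the sample table data above:

```ruby
require "time"

CurrencyRow = Struct.new(:currency_list_id, :direction_id, :value, :updated_at)

rows = [
  CurrencyRow.new(1, 1, 8150, Time.parse("2010-08-09 01:00:00")),
  CurrencyRow.new(1, 2, 8250, Time.parse("2010-08-09 01:00:00")),
  CurrencyRow.new(2, 1, 8150, Time.parse("2010-08-06 01:00:00")),
  CurrencyRow.new(2, 2, 8150, Time.parse("2010-08-06 01:00:00")),
  CurrencyRow.new(1, 1, 8150, Time.parse("2010-08-09 15:00:00")),
  CurrencyRow.new(1, 2, 8250, Time.parse("2010-08-09 15:00:00")),
]

# Keep only the latest row per (currency_list_id, direction_id) pair.
latest = rows
  .group_by { |r| [r.currency_list_id, r.direction_id] }
  .values
  .map { |group| group.max_by(&:updated_at) }

latest.each { |r| puts [r.currency_list_id, r.direction_id, r.value, r.updated_at].join(" | ") }
```

For small tables this in-memory grouping is fine; otherwise a SQL GROUP BY with MAX(updated_at) (as in the linked question) keeps the filtering in the database.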
