I am using Neo4j to build a social network application. The data model has a FRIEND relationship between two USER nodes. I need to get all of my friends ordered by displayName (which has a unique index).
I need pagination for this query. I will send the last name from the list returned by the previous query, and I want to limit each page to 20 names.
MATCH (u:USER{displayName:{id}})-[:FRIEND]-(f:USER)
RETURN f
ORDER BY f.displayName
LIMIT 20;
What is the best way to do this? Will SKIP work here, sending SKIP 0, SKIP 1*20, SKIP 2*20, ...?
I think you can page the query with SKIP and LIMIT (Cypher has no two-argument LIMIT):
ORDER BY f.displayName SKIP START_POSITION LIMIT PAGE_SIZE;
For example:
ORDER BY f.displayName SKIP 0 LIMIT 20;
ORDER BY f.displayName SKIP 20 LIMIT 20;
Yes, you can use the SKIP clause to do what you want. In the following, I assume that you provide the page value (starting at 0) as a parameter.
MATCH (u:USER{displayName:{id}})-[:FRIEND]-(f:USER)
RETURN f
ORDER BY f.displayName
SKIP {page} * 20
LIMIT 20;
Note that this technique is not foolproof if the list of friends can change during paging.
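Since the question mentions sending the last name from the previous page, here is a minimal plain-Ruby sketch (no Neo4j involved) of why offset paging can skip an entry when the list changes between requests, and how paging by the last seen name avoids that. The names offset_page, keyset_page and PAGE_SIZE and the sample data are assumptions made for illustration, and the sketch relies on displayName being unique, as stated in the question.

```ruby
PAGE_SIZE = 3 # hypothetical page size, kept small for the demo

# Offset paging: sort, skip page * PAGE_SIZE entries, take one page.
def offset_page(names, page)
  names.sort.drop(page * PAGE_SIZE).first(PAGE_SIZE)
end

# Keyset paging: sort, keep only names after the last one already seen.
def keyset_page(names, last_seen = nil)
  sorted = names.sort
  sorted = sorted.select { |name| name > last_seen } unless last_seen.nil?
  sorted.first(PAGE_SIZE)
end

friends = %w[alice bob carol dave erin frank]
first_page = offset_page(friends, 0)      # ["alice", "bob", "carol"]

# "alice" is unfriended before the second request arrives.
friends_later = friends - ["alice"]

p offset_page(friends_later, 1)               # ["erin", "frank"], "dave" is skipped
p keyset_page(friends_later, first_page.last) # ["dave", "erin", "frank"]
```

In Cypher, the second approach would roughly correspond to filtering on f.displayName being greater than the last name received (passed as a parameter) and keeping LIMIT 20, in place of SKIP.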
I am trying to count distinct sessionIds from a measurement. Since sessionId is a tag, I count the distinct entries in a "parent" query, because distinct() doesn't work on tags.
In the subquery, I use a group by sessionId limit 1 to still benefit from the index (if there is a more efficient technique, I'm all ears, but I'd still like to understand what's going on).
I have those two variants:
> select count(distinct(sessionId)) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 3757
> select count(sessionId) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 4206
To my understanding, those should return the same number, since group by sessionId limit 1 already returns distinct sessionIds (in the form of groups).
And indeed, if I execute:
select * from UserSession group by sessionId limit 1
I have 3757 results (groups), not 4206.
In fact, as soon as I put this in a subquery and re-select fields in a parent query, some sessionIds occur multiple times in the final result. Not all of them, since there are 17549 rows in total, but some do.
This is the sign that the limit 1 is somewhat working, but some sessionId still get multiple entries when re-selected. Maybe some kind of undefined behaviour?
I can confirm that I get the same result.
In my experience, nested queries do not always deliver what you expect or want.
Depending on how you use this you could retrieve a list of all values for a tag with:
SHOW TAG VALUES FROM UserSession WITH KEY=sessionId
Or to get the cardinality (number of distinct values for a tag):
SHOW TAG VALUES EXACT CARDINALITY FROM UserSession WITH KEY=sessionId
This returns a single row with a single column, count, containing a number. You can drop the EXACT modifier if an estimate is good enough; see SHOW TAG VALUES CARDINALITY in the InfluxDB documentation.
I want to limit the ActiveRecord objects being returned to 20, then apply a where() that returns a subset of those 20 objects. I currently know that only 10 of them fulfil the second column's criteria.
e.g. of ideal behaviour:
o = Object.limit(20)
o.where(column: criteria).count
=> 10
But instead, ActiveRecord still looks for 20 objects that fulfil the where() criteria, searching outside the original 20 objects that limit() would have returned on its own.
How can I get the desired response?
One way to shrink the search space is to use a nested query, so that the condition is applied only to the first N records rather than to all records. In SQL this would be done like this:
select * from (select * from table order by ORDERING_FIELD limit 20) as first_rows where column = value;
The query above only checks the condition against 20 rows from the table. The alias first_rows is arbitrary; several databases require one on a subquery in FROM. Notice that I have added an ORDERING_FIELD; this is required because, without an explicit ordering, the database may return rows in a different order each time you run the query.
To do something similar in Rails, you could try the following:
Object.where(id: Object.order(:id).limit(20).select(:id)).where(column: criteria)
This will execute a query similar to the following:
SELECT [objects].* FROM [objects] WHERE [objects].[id] IN (SELECT TOP (20) [objects].[id] FROM [objects] ORDER BY [objects].id ASC) AND [objects].[column] = criteria
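To make the difference between the two orderings of operations concrete, here is a small plain-Ruby sketch that works on an in-memory array instead of a database. The Row struct, the status column and the sample data are assumptions chosen only for illustration.

```ruby
# Hypothetical record type standing in for a table row.
Row = Struct.new(:id, :status)

# Filter first, then limit: this is what limit(20).where(...) effectively does.
def filter_then_limit(rows, n)
  rows.select { |row| row.status == "active" }.sort_by(&:id).first(n)
end

# Limit first, then filter: this is what the nested query does.
def limit_then_filter(rows, n)
  rows.sort_by(&:id).first(n).select { |row| row.status == "active" }
end

# 40 rows, every second one "active".
rows = (1..40).map { |i| Row.new(i, i.even? ? "active" : "inactive") }

puts filter_then_limit(rows, 20).size # 20: matches are pulled from all 40 rows
puts limit_then_filter(rows, 20).size # 10: only the first 20 rows are searched
```

The second method mirrors the subquery: the ordering and the limit are applied before the condition, so the condition never sees rows outside the first 20.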
I am using this query to find the 2nd largest element. I am querying the value column.
Booking.where("value < ?", Booking.maximum(:value)).last
Is there a better query than this, or any alternative?
PS: value is not unique. There could be two elements with the same value.
This should work.
Booking.select("DISTINCT value").order('value DESC').offset(1).limit(1)
Which will generate this query :
SELECT DISTINCT value FROM "bookings" ORDER BY value DESC LIMIT 1 OFFSET 1
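You can use offset and last. Note that if the largest value occurs more than once, this returns another row with that same value rather than the second-largest distinct value:
Booking.order(:value).offset(1).last
Which will produce the following SQL statement: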
SELECT `bookings`.* FROM `bookings`
ORDER BY `bookings`.`value` DESC
LIMIT 1 OFFSET 1
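Because the question notes that value is not unique, the following plain-Ruby sketch contrasts the two ideas on an in-memory array: taking the second row after sorting versus taking the second-largest distinct value. The method names and the sample values are assumptions for illustration only.

```ruby
# Second entry after sorting in descending order (duplicates count as rows).
def second_row(values)
  values.sort.reverse[1]
end

# Second-largest distinct value, or nil if there are fewer than two distinct values.
def second_largest_distinct(values)
  top_two = values.uniq.max(2) # Array#max(n) returns the n largest, largest first
  top_two.size == 2 ? top_two.last : nil
end

values = [10, 40, 40, 25]

p second_row(values)              # 40, because the maximum appears twice
p second_largest_distinct(values) # 25
```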
I've rewritten this question as my previous explanation was causing confusion.
In the SQL world, you have an initial record set that you apply a query to. The output of this query is the result set. Generally, the initial record set is an entire table of records and the result set is the records from the initial record set that match the query ruleset.
I have a use case where I need my application to occasionally operate on only a subset of records in a table. If a table has 10,000 records in it, I'd like my application to behave like only the first 1,000 records exist. These should be the same 1,000 records each time. In other words, I want the initial record set to be the first 1,000 devices in a table (when ordered by primary key), and the result set the resulting records from these first 1,000 devices.
Some solutions have been proposed, and it's revealed that my initial description was not very clear. To be more explicit, I am not trying to implement pagination. I'm also not trying to limit the number of results I receive (which .limit(1,000) would indeed achieve).
Thanks!
This is the line in your question that I don't understand:
"This causes issues though with both of the calls, as limit limits the results of the query, not the database rows that the query is performed on."
This is not a Rails thing, this is a SQL thing.
Device.limit(n) runs SELECT * FROM devices LIMIT n
Limit always returns a subset of the queried result set.
Would first(n) accomplish what you want? It will both order the result set ascending by the PK and limit the number of results returned.
ActiveRecord query methods can be chained together, so once you have your subset you can refine it with additional conditions.
my_subset = Device.where(family: "Phone")
# SQL: SELECT * FROM Device WHERE `family` = "Phone"
my_results = my_subset.where(style: "Touchscreen")
# SQL: SELECT * FROM Device WHERE `family` = "Phone" AND `style` = "Touchscreen"
Which can also be written as:
my_results = Device.where(family: "Phone").where(style: "Touchscreen")
my_results = Device.where(family: "Phone", style: "Touchscreen")
# SQL: SELECT * FROM Device WHERE `family` = "Phone" AND `style` = "Touchscreen"
From your question, if you'd like to select the first 1,000 rows (ordered by primary key, pkey) and then query against that, you'll need to do the following (the alias first_devices is arbitrary, but MySQL and PostgreSQL require one on a derived table):
my_results = Device.find_by_sql("SELECT *
FROM (SELECT * FROM devices ORDER BY pkey ASC LIMIT 1000) AS first_devices
WHERE `more_searching` = 'happens here'")
You could specifically ask for a set of IDs:
Device.where(id: (1..4).to_a)
That will construct a WHERE clause like:
WHERE id IN (1,2,3,4)
user = SkillUser.find_all_by_skill_id(skill_id)
user.size
gives me: 1 2 2 1 3 1 3 1 3 2 1 1 3
How can I get the biggest value (in this case 3) out of this row of numbers?
Thanks for help
If you want the maximum of an attribute called rating, you can use the maximum calculation method on your ActiveRecord::Relation:
SkillUser.maximum(:rating)
If you want to count the number of users per skill id, try:
SkillUser.count(:group => :skill_id).max_by { |skill_id,count| count }
This gives you both the skill_id and the number of users for the skill with most users.
For a more efficient way (by doing the whole calculation in SQL), try:
SkillUser.limit(1).reverse_order.count(:group => :skill_id, :order => :count)
# Giving the SQL:
# => SELECT COUNT(*) AS count_all, "skill_users"."skill_id" AS skill_id
# FROM "skill_users" GROUP BY "skill_users"."skill_id"
# ORDER BY "skill_users"."count" DESC LIMIT 1
Be aware that count must be called last because it doesn't return an ActiveRecord::Relation that you could scope further.
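The counting-and-max_by step from the first variant can be tried without a database. This is a plain-Ruby sketch on an array of skill ids; the method name most_common_skill and the sample ids are assumptions for illustration.

```ruby
# Count how often each skill id occurs, then pick the pair with the highest count.
# Returns [skill_id, count], or nil for an empty list.
def most_common_skill(skill_ids)
  counts = skill_ids.each_with_object(Hash.new(0)) { |id, tally| tally[id] += 1 }
  counts.max_by { |_skill_id, count| count }
end

skill_id, count = most_common_skill([1, 2, 2, 3, 3, 3])
puts "skill #{skill_id} has #{count} users" # skill 3 has 3 users
```

The hash built here plays the role of the grouped count result, a mapping from skill_id to the number of users.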
You should use ActiveRecord::Calculations (http://ar.rubyonrails.org/classes/ActiveRecord/Calculations/ClassMethods.html) for performance reasons:
1.9.3-194 (main):0 > User.maximum(:id)
(1.6ms) SELECT MAX("users"."id") AS max_id FROM "users"
=> 3
1. The fastest way to find a single maximum value in an unsorted list of integers is to scan the list from left to right and remember the largest value so far.
2. If you sort the list first, you get the additional benefit of easily finding the 2nd, 3rd etc. largest values as well.
3. If you take one of the "maximum" methods hidden in ruby ... you should check what the implementors are doing to pick the max and compare it to 1. and 2. above :-)
Explanations:
to 1. Doing it this way, you only have to visit each value in the list exactly once and compare it once to the maximum so far.
to 2. Sorting costs O(n*log n) operations on average for a list with n entries. Obviously this is more than the O(n) of solution 1., but you get a bit more out of it.
to 3. Well... I prefer knowing what happens, but your preferences might vary.
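Approaches 1. and 2. can be sketched in a few lines of plain Ruby, using the numbers from the question as input. The method names linear_max and top_n_by_sorting are assumptions for illustration, not existing library methods.

```ruby
# Approach 1: a single left-to-right scan that remembers the largest value so far.
def linear_max(values)
  largest = nil
  values.each do |value|
    largest = value if largest.nil? || value > largest
  end
  largest
end

# Approach 2: sort once, then read off the n largest values.
def top_n_by_sorting(values, n)
  values.sort.reverse.first(n)
end

sizes = [1, 2, 2, 1, 3, 1, 3, 1, 3, 2, 1, 1, 3]

p linear_max(sizes)          # 3
p top_n_by_sorting(sizes, 3) # [3, 3, 3]
```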