any way to limit result by node - neo4j

I have this query:
MATCH (user:Users)-[buy:Sales]->(item:Items)<-[buy2:Sales]- (user2:Users)-[buy_other:Sales]->(item2:Items)
where item.category = item2.category
return
user.mail, item2.id
The idea is to get items that the first user could be interested in because another user (user2) also bought them, but I want to limit the results to at most 2 item2 ids per user.
I know I can limit results in general, with LIMIT 10 for example, but then those 10 results could all be for the same user.
Any help? thanks in advance

You can do it by adding a COLLECT step and taking the first n items of the collected list.
MATCH (user:Users)-[buy:Sales]->(item:Items)<-[buy2:Sales]- (user2:Users)-[buy_other:Sales]->(item2:Items)
WHERE item.category = item2.category
// this is where you collect and get some items of it
WITH user,COLLECT(item2)[0..2] AS item2s
UNWIND item2s AS item2
RETURN
user.mail, item2.id
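
To make the collect-and-slice step easier to follow, here is a minimal plain Ruby sketch of the same idea, run on an in-memory list instead of the graph. The sample rows and the helper name limit_per_user are made up for illustration; they are not part of Neo4j or of the original query.

```ruby
# Hypothetical rows, as the MATCH would produce them before the WITH clause:
# one [user_mail, item2_id] pair per matched path.
rows = [
  ["ann@example.com", 11],
  ["ann@example.com", 12],
  ["ann@example.com", 13],
  ["bob@example.com", 21],
]

# Group by user (like WITH user, COLLECT(item2)), keep the first max_items
# entries of each group (like the [0..2] slice), then flatten the groups
# back into rows (like UNWIND).
def limit_per_user(rows, max_items)
  rows
    .group_by { |mail, _item_id| mail }
    .flat_map do |mail, pairs|
      pairs.first(max_items).map { |_mail, item_id| [mail, item_id] }
    end
end

limited = limit_per_user(rows, 2)
```

With the sample data, the third item for the first user is dropped while the second user keeps their single item, which is the per-user cap the question asks for.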

Related

Duplicates in the result of a subquery

I am trying to count distinct sessionIds from a measurement. Since sessionId is a tag, I count the distinct entries in a "parent" query, because distinct() doesn't work on tags.
In the subquery, I use a group by sessionId limit 1 to still benefit from the index (if there is a more efficient technique, I have ears wide open but I'd still like to understand what's going on).
I have those two variants:
> select count(distinct(sessionId)) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 3757
> select count(sessionId) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 4206
To my understanding, those should return the same number, since group by sessionId limit 1 already returns distinct sessionIds (in the form of groups).
And indeed, if I execute:
select * from UserSession group by sessionId limit 1
I have 3757 results (groups), not 4206.
In fact, as soon as I put this in a subquery and re-select fields in a parent query, some sessionIds occur multiple times in the final result. Not all of them do, since there are 17549 rows in total, but some do.
This is the sign that the limit 1 is somewhat working, but some sessionId still get multiple entries when re-selected. Maybe some kind of undefined behaviour?
I can confirm that I get the same result.
In my experience using nested queries does not always deliver what you expect/want.
Depending on how you use this you could retrieve a list of all values for a tag with:
SHOW TAG VALUES FROM UserSession WITH KEY=sessionId
Or to get the cardinality (number of distinct values for a tag):
SHOW TAG VALUES EXACT CARDINALITY FROM UserSession WITH KEY=sessionId
This returns a single row with a single column, count, containing a number. You can remove the EXACT modifier if you only need an estimate: see SHOW TAG VALUES CARDINALITY in the InfluxDB documentation.

'ORDER BY' results per row in cypher query(Neo4j)

This question is a follow-on to an earlier question, which has two answers.
Now I need to modify this query to return the items related to each HashTag, ordered by createdDate (all of those items have a createdDate property).
I've written this query:
MATCH (r:RateableEntity)<-[:TAG]-(h:HashTag:Featured)
WITH h, COUNT(h) AS Count
ORDER BY Count DESC
SKIP 2
LIMIT 3
WITH h, Count, h.tag as Name,
[(h)-[:TAG]->(m:RateableEntity {audience: 'world'}) | m][..3] AS Items
UNWIND Items as row
RETURN row, Name, Count, COLLECT(row.id)
ORDER BY row.createdDate
But the results are:
Name row.id Count
"vanessa" "cdd14968-404c-41e9-84d5-bf147030a023" 14
"vanessa" "qwd14968-2344-41e9-84d5-bftt34534566" 14
"vanessa" "cd14968-404c-41e9-84d5-certt4545455g" 14
"hash" "b7e74f38-44e4-4b7f-b2c4-8301023ffa9b" 15
"hash" "edr34334-2995-4202-b178-bb2a6f230ab0" 15
"hash" "htth5548-404c-41e9-84d5-bf147030a023" 15
"new" "oljj4968-2344-41e9-84d5-bftt34534566" 3
"new" "werr4968-404c-41e9-84d5-certt4545455" 3
"new" "be545b38-44e4-4b7f-b2c4-8301023ffa9b" 3
I can see that the count is correct and that SKIP and LIMIT work as I want, but I get 3 rows per hashtag instead of one row with 3 ids.
Also ORDER BY is not working.
Any idea? ideas appreciated.
UPDATE:
Actually the result of this query will be nodes and after that, in my code, I'm mapping to this, so still it's not what I want

Order By String Property Value and Pagination in Neo4j

I am using neo4j to create a social network application. The data model has a FRIEND relationship between two USER nodes. I need to get all the friends of mine ordered by displayName (Unique Indexed).
I need pagination for this query. I will send the last name from the list I got from the previous query results. And I want to limit each page to 20 names.
MATCH (u:USER{displayName:{id}})-[:FRIEND]-(f:USER)
RETURN f
ORDER BY f.displayName
LIMIT 20;
What is the best way to do this? Will SKIP work here, sending SKIP 0, SKIP 1*20, SKIP 2*20, ...?
I think you can write the query this way:
ORDER BY f.displayName LIMIT START_POSITION , LAST_POSITION;
For example:
ORDER BY f.displayName LIMIT 0 , 20;
ORDER BY f.displayName LIMIT 21 , 40;
Yes, you can use the SKIP clause to do what you want. In the following, I assume that you provide the page value (starting at 0) as a parameter.
MATCH (u:USER{displayName:{id}})-[:FRIEND]-(f:USER)
RETURN f
ORDER BY f.displayName
SKIP {page} * 20
LIMIT 20;
Note that this technique is not foolproof if the list of friends can change during paging.
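
The page arithmetic behind SKIP {page} * 20 can be sketched in plain Ruby on an in-memory list. This is only an illustration of the windowing, not driver code: skip_for, page_of and the generated friend names are hypothetical.

```ruby
PAGE_SIZE = 20

# Number of rows to skip for a zero-based page number,
# mirroring SKIP {page} * 20 in the query above.
def skip_for(page, page_size = PAGE_SIZE)
  raise ArgumentError, "page must be >= 0" if page.negative?
  page * page_size
end

# In-memory stand-in for ORDER BY ... SKIP ... LIMIT on a list of names.
def page_of(names, page, page_size = PAGE_SIZE)
  names.sort.drop(skip_for(page, page_size)).take(page_size)
end

# Hypothetical data: 45 friends, so pages 0 and 1 are full and page 2 is partial.
names = (1..45).map { |i| format("friend%02d", i) }
third_page = page_of(names, 2)
```

Because each page is cut from the current sorted list, a friend added or removed between two requests shifts every later window, which is the caveat mentioned above.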

Ruby on Rails - Selecting a range from a query

I'm doing a query to get all the purchases from the db. For example
orders = PurchaseOrders.all
In the same query, how can I select only the first hundred orders (1-100), or just the next hundred (101-200), and so on?
Thank you
You can use limit and offset:
PurchaseOrders.limit(200).offset(100)
which skips the first 100 records and then returns up to 200 (records 101-300). More info here. Or with take:
PurchaseOrders.offset(100).take(400)
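which skips the first 100 records and returns up to the next 400 as an array.
For the first 100 records:
orders = PurchaseOrders.first(100)
and the last 100 records:
orders = PurchaseOrders.last(100)
or specific records by ID (this fetches the records with IDs 100 and 201, not the range between them):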
orders = PurchaseOrders.find([100, 201])
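
Since the question is phrased in terms of record ranges (1-100, 101-200), a small helper can translate such a range into the two numbers that limit and offset expect. This is a plain Ruby sketch; offset_and_limit is a hypothetical name, not an Active Record method.

```ruby
# Converts a 1-based, inclusive record range such as 101..200 into the
# offset and limit values that would be passed to .offset and .limit.
def offset_and_limit(range)
  raise ArgumentError, "range must start at 1 or later" if range.first < 1
  { offset: range.first - 1, limit: range.size }
end

first_hundred = offset_and_limit(1..100)
next_hundred  = offset_and_limit(101..200)
```

Assuming the PurchaseOrders model from the question, the values could then be applied as PurchaseOrders.offset(next_hundred[:offset]).limit(next_hundred[:limit]), ideally together with an explicit order so the windows are stable.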

How to select data for defined page and total count of records?

I have a table with paginated data and this is the way I select data for each page:
@visitors = EventsVisitor
.select('visitors.*, events_visitors.checked_in, events_visitors.checkin_date, events_visitors.source, events_visitors.id AS ticket_id')
.joins(:visitor)
.order(order)
.where(:event_id => params[:event_id])
.where(filter_search)
.where(mode)
.limit(limit)
.offset(offset)
Also to build table pagination I need to know total count of records. Currently my solution for this is very rough:
total = EventsVisitor
.select('count(*) as count, events_visitors.*')
.joins(:visitor)
.order(order)
.where(:event_id => params[:event_id])
.where(filter_search)
.where(mode)
.first()
.count
So my question is as follows - What is the optimal ruby way to select limited data for the current page and total count of records?
I noticed that if I call @visitors.count, an additional SQL query is generated:
SELECT COUNT(count_column) FROM (SELECT 1 AS count_column FROM `events_visitors` INNER JOIN `visitors` ON `visitors`.`id` = `events_visitors`.`visitor_id` WHERE `events_visitors`.`event_id` = 1 LIMIT 15 OFFSET 0) subquery_for_count
First, I do not understand why an additional query is sent to count data that we already have: once the rows have been loaded into @visitors, we can count them in Ruby without sending another query to the DB.
Second, I thought there might be something like .total_count that generates a similar count(*) query but without that useless limit/offset?
You should remove limit and offset from the relation using except:
http://guides.rubyonrails.org/active_record_querying.html#except
See how kaminari does it
https://github.com/kaminari/kaminari/blob/92052eedf047d65df71cc0021a9df9df1e2fc36e/lib/kaminari/models/active_record_relation_methods.rb#L11
So it might be something like
total = @visitors.except(:offset, :limit, :order).count
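
Once the unpaginated total is known, the number of pages for the table follows from simple integer arithmetic. A minimal sketch, assuming the page size is the same limit value used in the query; total_pages is a hypothetical helper, not a Rails or kaminari method.

```ruby
# Number of pages needed to show total_count records, per_page at a time.
# Integer ceiling division: a partially filled last page still counts.
def total_pages(total_count, per_page)
  raise ArgumentError, "per_page must be positive" unless per_page.positive?
  (total_count + per_page - 1) / per_page
end

pages = total_pages(31, 15)
```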
