MATCH (n)
RETURN DISTINCT n
ORDER BY n.name
SKIP 5
LIMIT 10
When I write a query like this, it does not always return 10 results, because the limit is applied first and DISTINCT then filters the results, so DISTINCT only works on 10 rows. How can I change the query so that it first returns DISTINCT results and then limits them to 10? I'd like to get 10 results every time.
Does this do what you want?
MATCH (n)
WITH DISTINCT n
ORDER BY n.name
RETURN n
SKIP 5
LIMIT 10
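To see why deduplicating before paging matters, here is a small Python model of that pipeline (plain Python, not Neo4j code). The `rows` list is made-up data standing in for whatever MATCH produces, and `paged` is a hypothetical helper named only for this sketch.

```python
# Hypothetical rows as a MATCH might produce them: 20 names, each seen twice.
rows = [f"node{i:02d}" for i in range(20)] * 2

def paged(rows, skip, limit):
    """Model of WITH DISTINCT ... ORDER BY ... RETURN ... SKIP ... LIMIT."""
    distinct = set(rows)               # WITH DISTINCT n
    ordered = sorted(distinct)         # ORDER BY n.name
    return ordered[skip:skip + limit]  # SKIP / LIMIT

page = paged(rows, skip=5, limit=10)
print(page)
```

Because duplicates are removed before the slice is taken, the page holds 10 distinct entries whenever at least 15 distinct rows exist.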
I have 2 subqueries that return 2 sets of users (each query returns one set of users).
First query:
MATCH (:User {user_id: "69b3315a-ba4a-4021-94e1-0f494f9b957f"})-->(first_set_of_users)
RETURN first_set_of_users
Second query:
MATCH (:User {user_id: "69b3315a-ba4a-4021-94e1-0f494f9b957f"})<-[:LIKES]-(likers)-[:LIKES]->(v)
WITH DISTINCT v
MATCH (second_set_of_users)-[:LIKES]->(v)
RETURN second_set_of_users, COUNT(*) AS recoWeight
ORDER BY recoWeight DESC
What I want to return in the end is all users from second_set_of_users minus those in first_set_of_users, ordered by recoWeight DESC.
How can I do that in a single query? Everything I tried led to Cartesian products of the two queries and took forever, while each independent query takes less than a second.
MATCH (:User {user_id: "69b3315a-ba4a-4021-94e1-0f494f9b957f"})-->(first_set_of_users)
WITH collect(first_set_of_users) AS list_of_first_set_of_users
MATCH (:User {user_id: "69b3315a-ba4a-4021-94e1-0f494f9b957f"})<-[:LIKES]-(likers)-[:LIKES]->(v)
WITH DISTINCT v, list_of_first_set_of_users
MATCH (second_set_of_users)-[:LIKES]->(v)
WITH second_set_of_users, COUNT(*) AS recoWeight
WHERE NOT second_set_of_users IN list_of_first_set_of_users
RETURN second_set_of_users, recoWeight
ORDER BY recoWeight DESC
Explanation:
The WITH clause passes the result of the first query, collected into a list, on to the second query.
WHERE NOT ... IN then filters the result of the second query against that list.
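The same collect, count, filter and sort steps can be sketched in plain Python. This is only a model of the logic, not Neo4j code, and the user names and rows are invented for illustration.

```python
from collections import Counter

# Made-up stand-ins for the two result sets.
first_set = ["alice", "bob"]                     # users linked from the start user
second_rows = ["bob", "carol", "carol", "dave",  # one row per (user)-[:LIKES]->(v) match
               "carol", "dave", "erin"]

excluded = set(first_set)        # collect(first_set_of_users)
weights = Counter(second_rows)   # COUNT(*) AS recoWeight, grouped by user
recommended = sorted(
    ((user, w) for user, w in weights.items() if user not in excluded),  # WHERE NOT ... IN
    key=lambda pair: pair[1],
    reverse=True,                # ORDER BY recoWeight DESC
)
print(recommended)
```

Collecting the first set once and carrying it along as a single value is what avoids the Cartesian product: the second part of the query never multiplies its rows by the rows of the first.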
I want to compute indegree and outdegree and return a graph that connects the top 5 indegree nodes to the top 5 outdegree nodes. I have written the following code:
match (a:Port1)<-[r]-()
return a.id as NodeIn, count(r) as Indegree
order by Indegree DESC LIMIT 5
union
match (n:Port1)-[r]->()
return n.id as NodeOut, count(r) as Outdegree
order by Outdegree DESC LIMIT 5
union
match p=(u:Port1)-[:LinkTo*1..]->(t:Port1)
where u.id in NodeIn and t.id in NodeOut
return p
I get this error:
All sub queries in an UNION must have the same column names (line 4, column 1 (offset: 99)) "union"
What changes do I need to make to the code?
There are a few things we can improve.
The matches you're doing aren't the most efficient way to get the incoming and outgoing degrees of a node.
Also, UNION can only combine query results with identical columns. In this case we don't even need UNION: we can use WITH to pipe results from one part of the query to another, and COLLECT() the nodes you need in between.
Try this query:
match (a:Port1)
with a, size((a)<--()) as Indegree
order by Indegree DESC LIMIT 5
with collect(a) as NodesIn
match (a:Port1)
with NodesIn, a, size((a)-->()) as Outdegree
order by Outdegree DESC LIMIT 5
with NodesIn, collect(a) as NodesOut
unwind NodesIn as NodeIn
unwind NodesOut as NodeOut
// we now have a cartesian product between both lists
match p=(NodeIn)-[:LinkTo*1..]->(NodeOut)
return p
Be aware that this performs two NodeLabelScans of :Port1 nodes and takes a cross product of the top 5 of each, so there are 25 variable-length path matches. That can be expensive, as it generates all possible paths from each NodeIn to each NodeOut.
If you only want the shortest connection between each pair, you might try replacing the variable-length match with a shortestPath() call, which returns only the shortest path found between each pair of nodes:
...
match p = shortestPath((NodeIn)-[:LinkTo*1..]->(NodeOut))
return p
Also, make sure your direction is correct. You're matching the nodes with the highest indegree and getting an outgoing path from them to the nodes with the highest outdegree. That seems like it might be backwards to me, but you know your requirements best.
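For a feel of where the 25 path matches come from, here is a rough Python model of the degree ranking and the double UNWIND. The edge list is invented for illustration and none of this is Neo4j API.

```python
from collections import Counter
from itertools import product

# Hypothetical directed edges (source, target): node i links to every higher-numbered node.
edges = [(i, j) for i in range(8) for j in range(i + 1, 8)]

out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)

# most_common(5) plays the role of ORDER BY ... DESC LIMIT 5.
nodes_in = [node for node, _ in in_degree.most_common(5)]
nodes_out = [node for node, _ in out_degree.most_common(5)]

# Two UNWINDs in a row yield every (NodeIn, NodeOut) combination.
pairs = list(product(nodes_in, nodes_out))
print(len(pairs))
```

Each of those pairs triggers its own path search, which is why the cost depends so heavily on whether all paths or only the shortest one is requested.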
I currently have this query:
START n=node(*)
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN p, count(p)
LIMIT 5
Say there are 42 people in FooManGroup. I want to return 5 of these people, along with a count of 42.
Is this possible to do in one query?
Running this now returns 5 rows, which is fine, but a count of 104, which is the total number of nodes of any type in my DB.
Any suggestions?
You can use a WITH clause to count the persons, followed by an identical MATCH clause to match each person. Note that the query has to start from the p nodes rather than from some n that matches any node in the graph:
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
WITH count(p) as personsInGroup
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN p, personsInGroup
LIMIT 5
It may not be the best or most elegant way to do this, but it works. If you use Cypher 2.0, it can be a bit more compact:
MATCH (p:Person)-[:is_member]->(g:Group {name: 'FooManGroup'})
WITH count(p) as personsInGroup
MATCH (p:Person)-[:is_member]->(g:Group {name: 'FooManGroup'})
RETURN p, personsInGroup
LIMIT 5
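By convention, relationship types are written in upper case in Cypher, so :is_member would be :IS_MEMBER, which I think is more readable. Relationship types are case-sensitive, so this only matches if the relationships are stored under that name: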
MATCH (p:Person)-[:IS_MEMBER]->(g:Group {name: 'FooManGroup'})
WITH count(p) as personsInGroup
MATCH (p:Person)-[:IS_MEMBER]->(g:Group {name: 'FooManGroup'})
RETURN p, personsInGroup
LIMIT 5
Try this:
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN count(p), collect(p)[0..5]
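In Python terms, that query does roughly the following. This is a sketch on invented data; the `members` list stands in for the matched Person nodes.

```python
# Hypothetical group membership: 42 people in the group.
members = [f"person{i}" for i in range(42)]

total = len(members)   # count(p)
sample = members[:5]   # collect(p)[0..5], end index exclusive
print(total, sample)
```

Note the shape of the result: this variant returns a single row holding the count and a list of 5 people, whereas the WITH-based queries return 5 rows that each repeat the count.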
I ran into the following problem when combining two Cypher queries on console.neo4j.org.
The query:
MATCH (p1:Crew)-[r_pfq]->(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
RETURN distinct(p1) AS person, count(r_pfq) AS friend_score, collect(fq.name) AS friends
ORDER BY friend_score DESC
LIMIT 10
works fine, as does
MATCH (f:Crew)<-[r_fqf]-(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
WITH distinct(f), count(r_fqf) AS weight
ORDER BY weight DESC
LIMIT 10
MATCH f<--(p:Crew)
RETURN distinct(p) AS person, sum(weight) AS friend_score, collect(f.name) AS friends
ORDER BY friend_score DESC
LIMIT 10
Now when I try to combine the query results using the UNION command, i.e.
MATCH (p1:Crew)-[r_pfq]->(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
RETURN distinct(p1) AS person, count(r_pfq) AS friend_score, collect(fq.name) AS friends
ORDER BY friend_score DESC
LIMIT 10
UNION
MATCH (f:Crew)<-[r_fqf]-(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
WITH distinct(f), count(r_fqf) AS weight
ORDER BY weight DESC
LIMIT 10
MATCH f<--(p:Crew)
RETURN distinct(p) AS person, sum(weight) AS friend_score, collect(f.name) AS friends
ORDER BY friend_score DESC
LIMIT 10
I get the error
Error: org.neo4j.graphdb.NotFoundException: Unknown identifier `weight`.
Can anyone explain why these query results cannot be combined, and how to combine them properly? Why is the identifier known when the two queries run separately but unknown in a UNION-combined query?
EDIT
The following simpler query is basically equivalent, except that the second query in the UNION does not ORDER BY weight. We are already ordering by the derived friend_score, so that seemed redundant. Also, for a variable to be used in the ORDER BY clause it has to be in the RETURN clause, but the first query in the UNION has no weight variable, so returning it would violate the requirement that both sides of a UNION have the same columns.
In addition, the second query has a second WITH clause, because variables used in an ORDER BY clause (like friend_score) must be defined before the RETURN clause.
MATCH (p1:Crew)-[r_pfq]->(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
RETURN DISTINCT (p1) AS person, count(r_pfq) AS friend_score, collect(fq.name) AS friends
ORDER BY friend_score DESC
LIMIT 10
UNION
MATCH (p:Crew)-->(f:Crew)<-[r_fqf]-(fq:Crew)
WHERE fq.name IN ["Neo", "Morpheus"]
WITH f, count(r_fqf) AS weight, p
WITH f, sum(weight) AS friend_score, p
RETURN DISTINCT (p) AS person, friend_score, collect(DISTINCT (f).name) AS friends
ORDER BY friend_score DESC
LIMIT 10
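The column rule can be modelled with a small Python function. This is a sketch only: `union` is a hypothetical helper, the rows are invented, and the error text merely echoes the style of the Neo4j message.

```python
def union(left, right):
    """Model of UNION: both sides need the same column names; duplicate rows are dropped."""
    rows = left + right
    if not rows:
        return []
    columns = set(rows[0])
    for row in rows:
        if set(row) != columns:
            raise ValueError("all sub queries in a UNION must have the same column names")
    seen, result = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result

left = [{"person": "Trinity", "friend_score": 2, "friends": ("Neo", "Morpheus")}]
right = [
    {"person": "Trinity", "friend_score": 2, "friends": ("Neo", "Morpheus")},
    {"person": "Cypher", "friend_score": 1, "friends": ("Neo",)},
]
combined = union(left, right)
print(combined)
```

Passing a right-hand side with a `weight` column instead of `friend_score` makes this model raise, which mirrors why weight is kept out of the second RETURN.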
MATCH (n)
RETURN DISTINCT id(n) as nid, n.name
ORDER BY n.name
SKIP 5
LIMIT 10
I'd like to get distinct nids and their name properties, but instead the query applies DISTINCT to the whole row, i.e. to the pair "nid, n.name". How can I get distinct nids together with the names of the nodes that have those nids?
I assume you're looking for the collect function:
MATCH (n)
RETURN id(n) as nid, collect(n.name)
SKIP 5
LIMIT 10
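What collect does here can be pictured as grouping in Python. The rows below are invented, with one nid showing up twice as can happen when a pattern matches the same node more than once.

```python
from collections import defaultdict

# Hypothetical (nid, name) rows; nid 2 appears twice.
rows = [(1, "a"), (2, "b"), (2, "c"), (3, "d")]

names_by_nid = defaultdict(list)
for nid, name in rows:
    names_by_nid[nid].append(name)  # collect(n.name), grouped by nid

print(dict(names_by_nid))
```

Each nid then appears exactly once, with all of its names gathered into one list, rather than once per distinct (nid, name) pair.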