How to skip an unwound list - neo4j

I have the following query
MATCH (n:Mob)
WITH COUNT(n) as total, COLLECT(n) as nodes
UNWIND nodes as node
WITH total, node
WHERE 8000 < node.order < 8100
RETURN node, total
SKIP 10
LIMIT 1
Right now, this query is giving me this error.
If I remove the SKIP part it works.
So my overall question is, how do I SKIP some of the records?

This was mainly a misunderstanding on my part. If you want to filter before bunching them together, then perform the COLLECT at a later stage.
Working code:
MATCH (n:Mob)
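// aggregate before unwinding: with n as a grouping key, COUNT(n) would be 1 for every row
WITH COUNT(n) AS total, COLLECT(n) AS nodes
UNWIND nodes AS node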
WITH total, node
WHERE node.order > 1000
WITH total, node
SKIP 10
LIMIT 5
WITH collect(node) as nodes, total
RETURN nodes, total

Related

Cascade update of nodes in tree

I have a graph that is an acyclic tree of arbitrary depth. I need to count the number of descendants of each node, including the node itself. The final result should look something like this:
9
|\
4 4
|\ \
2 1 3
|   |\
1   1 1
So for each node, this number is the sum of its children's numbers plus 1.
How can it be done in one query?
I could come up with something like this:
MATCH (n)
SET n.count = SIZE((n)<-[:PARENT*0..]-());
But that means a subquery for each node. With over 1,300,000 nodes, it takes ages.
A better way would be to set 1 on each leaf and ascend to the root, calculating each node along the way. Is it possible to do that in one query?
I'd go for
MATCH (start)<-[:PARENT*0..]-(n)
RETURN id(start), count(n) as numberOfChildren
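which counts, for each start node, how many nodes are reachable through incoming PARENT relationships (the start node itself is included because of the *0.. lower bound). I don't know how it performs on really large graphs, though; my test graph has only a few hundred nodes.
You could already optimize your query by limiting the number of paths it processes, for example by starting only from nodes that have children and expanding only down to the leaves: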
MATCH (n)
WHERE EXISTS((n)<-[:PARENT]-())
MATCH path=(n)<-[:PARENT*0..]-(m)
WHERE NOT EXISTS((m)<-[:PARENT]-())
UNWIND nodes(path) AS node
WITH n, COUNT(DISTINCT node) AS count
SET n.count = count
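The bottom-up idea from the question (give every leaf a count of 1, then ascend toward the root) can also be pictured outside the database. The following is a minimal Python sketch, not Cypher and not a Neo4j API. It assumes the tree has been exported as a hypothetical `parent_of` mapping from each node id to its parent id (`None` for the root), and the node ids are made up to match one reading of the example tree.

```python
from collections import deque


def subtree_sizes(parent_of):
    """Return {node: number of descendants, including the node itself}.

    parent_of is a hypothetical export of the :PARENT relationships:
    it maps every node id to its parent id, or to None for the root.
    """
    # Every node starts at 1 because it counts itself.
    size = {node: 1 for node in parent_of}

    # Number of children that have not yet reported their size.
    pending = {node: 0 for node in parent_of}
    for node, parent in parent_of.items():
        if parent is not None:
            pending[parent] += 1

    # Leaves have no pending children, so they are ready immediately.
    ready = deque(node for node, count in pending.items() if count == 0)

    while ready:
        node = ready.popleft()
        parent = parent_of[node]
        if parent is None:
            continue  # the root has nobody to report to
        size[parent] += size[node]
        pending[parent] -= 1
        if pending[parent] == 0:
            ready.append(parent)  # all children reported, parent is final

    return size


# One reading of the example tree, with made-up node ids.
example = {
    "root": None,
    "a": "root", "b": "root",
    "a1": "a", "a2": "a",
    "a1x": "a1",
    "b1": "b",
    "b1x": "b1", "b1y": "b1",
}
sizes = subtree_sizes(example)
```

Each node is visited once and each parent link is followed once, which is the property the question is after. Whether the same shape can be expressed efficiently in a single Cypher statement is a separate matter that this sketch does not settle.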

Filter nodes after receiving total number

So I have the following query:
MATCH (n:Mob)
WITH count(n) as total, collect(n) as nodes
WITH nodes, total
UNWIND nodes as node
WHERE node.order > 8000 AND node.order < 8100
RETURN node, total
What I'm trying to do is get the total number of nodes with the label Mob as a number, and then filter the nodes that are actually returned, so that I end up with a subset of all the nodes.
This currently gives me the error Invalid input 'H': expected 'i/I'. Is there any way to do what I want in one query, or does it need to be split into two?
You need to have a WITH clause between the UNWIND and WHERE clauses, because WHERE cannot follow UNWIND directly. This should work:
MATCH (n:Mob)
WITH COUNT(n) as total, COLLECT(n) as nodes
UNWIND nodes as node
WITH total, node
WHERE 8000 < node.order < 8100
RETURN total, node
However, this is simpler if you are OK with getting a single list of suitable nodes instead of multiple return records:
MATCH (n:Mob)
RETURN
COUNT(n) AS total,
[m IN COLLECT(n) WHERE 8000 < m.order < 8100] AS nodes
[UPDATE]
If you also want to do the equivalent of SKIP and LIMIT (assuming the SKIP and LIMIT counts are passed as parameters skip and limit):
MATCH (n:Mob)
RETURN
COUNT(n) AS total,
[m IN COLLECT(n) WHERE 8000 < m.order < 8100][$skip..($skip+$limit)] AS nodes
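For intuition, the slice in the query above behaves like SKIP and LIMIT applied to the already filtered list: the lower bound is inclusive, the upper bound is exclusive, and a range that runs past the end of the list is truncated. The sketch below models this in plain Python rather than Cypher. The `orders` data and the `page` helper are made up for illustration.

```python
def page(items, skip, limit):
    """Model of the Cypher list slice items[skip..(skip + limit)].

    The start index is inclusive and the end index is exclusive.
    """
    return items[skip:skip + limit]


# Hypothetical stand-in for the `order` property of the :Mob nodes.
orders = list(range(7990, 8110))

total = len(orders)                                 # COUNT(n)
matching = [o for o in orders if 8000 < o < 8100]   # the list comprehension
nodes = page(matching, skip=10, limit=5)            # [$skip..($skip+$limit)]
last_page = page(matching, skip=97, limit=5)        # runs past the end
```

Note that `total` is computed before filtering, so it still reflects every node, while `nodes` holds only the requested page of the filtered list.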
Adding to cybersam's answer: to SKIP/LIMIT a list, slice it with [x..y], where x is the start index (inclusive) and y is the end index (exclusive), for example:
MATCH (n:Mob)
RETURN [m IN COLLECT(n) WHERE 8000 < m.order < 8100][0..10] AS nodes, COUNT(n) AS total
Or
MATCH (n:Mob)
WITH COUNT(n) as total, COLLECT(n) as nodes
UNWIND nodes as node
WITH total, node
WHERE node.order > 1000
WITH total, node
SKIP 10
LIMIT 5
WITH collect(node) as nodes, total
RETURN nodes, total

Timetree specific nodes in range

I'm having difficulty getting all nodes in a specific time range. I have two types of nodes attached to the timetree: Tweet nodes and News nodes.
I want all the Tweet nodes. I'm using this query (stopped after 10+ minutes):
CALL ga.timetree.events.range({start: 148029120000, end: 1480896000000, relationshipType: "LAST_UPDATE", resolution: 'DAY'})
YIELD node
MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(:Tweet)<-[:POSTS]-(m:TwitterUser)
RETURN id(a), id(m), count(r) AS NumRetweets
ORDER BY NumRetweets DESC
But this takes a lot longer than the simple query (8 seconds):
MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(:Tweet)<-[:POSTS]-(m:TwitterUser)
RETURN id(a), id(m), count(r) AS NumRetweets
ORDER BY NumRetweets DESC
Actually, with my data, the two queries should return the same nodes, so I don't understand the big time difference.
The problem with your first query is that you're not doing anything with the results of the timetree query. It just wastes cycles and bloats the intermediate rows with data that is never used.
You need to take the :Tweet nodes returned from your timetree query and include them into the next part of your query.
CALL ga.timetree.events.range({start: 148029120000, end: 1480896000000, relationshipType: "LAST_UPDATE", resolution: 'DAY'})
YIELD node
WITH node as tweet
WHERE tweet:Tweet
MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(tweet)<-[:POSTS]-(m:TwitterUser)
RETURN id(a), id(m), count(r) AS NumRetweets
ORDER BY NumRetweets DESC
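The cost of the unused YIELD can be pictured as a cross product: a MATCH that does not mention the yielded variable is evaluated again for every yielded row. The following Python model (not Neo4j code, with made-up rows and user names) shows how that multiplies both the row count and the aggregated counts, and how binding the yielded node into the pattern avoids it.

```python
from collections import Counter

# Hypothetical data: four nodes yielded by the time tree call, and three
# retweets stored as (retweeter, original_poster, retweeted_tweet).
yielded = ["t1", "t2", "t3", "t4"]
retweets = [
    ("alice", "bob", "t1"),
    ("alice", "bob", "t2"),
    ("carol", "bob", "t9"),  # retweeted tweet is not among the yielded nodes
]

# First query: the MATCH never mentions `node`, so the whole pattern is
# matched again for every yielded row and all pairings are kept.
unused = [(node, rt) for node in yielded for rt in retweets]

# Corrected query: the yielded node is bound into the pattern, so a
# retweet only survives on the row whose node it actually targets.
bound = [(node, rt) for node in yielded for rt in retweets if rt[2] == node]

# Model of `RETURN id(a), id(m), count(r)`: group by the two users.
unused_counts = Counter((rt[0], rt[1]) for _, rt in unused)
bound_counts = Counter((rt[0], rt[1]) for _, rt in bound)
```

In this toy data the unbound version produces 12 rows instead of 2, and the count for the alice/bob pair comes out four times too large. With a wide time range and a large retweet graph, the same multiplication would be far more expensive.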

How can Cypher Impose a Maximum Number of Hops Only Counting a Specific Type of Node?

I know that in Neo4J, Cypher can be used to filter results based on a maximum number of hops between two nodes, like this:
MATCH (a:Word)-[relationships*..3]-(b:Word)
RETURN a, relationships, b
LIMIT 5
This will return nodes (a and b) that are both of type Word, and that are within 3 total hops of each other (through all node types, and all relationship types).
I'm wondering whether Cypher can be made to only count specific types of nodes when it's counting to that maximum of 3 hops in the above example.
For example, in this chain of nodes:
(a:Word) ---> (b:Definition) ---> (c:Word) ---> (d:Definition) ---> (e:Definition) ---> (f:Word) ---> (g:Definition) ---> (h:Word)
There are 7 total hops between nodes a and h. However, there are only 3 Word hops between them.
Is it possible for Cypher to impose a maximum number of hops in this way?
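You can filter the nodes of each path by label and count them. Since NODES(path) includes both endpoints, a path with at most 3 Word hops contains at most 4 Word nodes. On Neo4j 4.0 and later, where FILTER() was removed, the list comprehension [n IN NODES(path) WHERE n:Word] can be used instead. For example:
MATCH path = (a:Word)-[relationships*..10]-(b:Word)
WHERE SIZE( FILTER(n IN NODES(path) WHERE 'Word' IN LABELS(n)) ) <= 4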
RETURN a, relationships, b
LIMIT 5
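When choosing the threshold for a query like this, keep in mind that the node list of a path includes both endpoints, so a path with k Word hops contains k + 1 Word nodes. The following Python sketch (not Cypher) checks this against the example chain from the question. The `word_hops` helper is made up for illustration.

```python
def word_hops(labels):
    """Count hops between consecutive Word nodes along one path.

    labels is the list of node labels in path order, endpoints included.
    """
    word_nodes = sum(1 for label in labels if label == "Word")
    return max(word_nodes - 1, 0)


# Labels of nodes a..h from the example chain in the question.
chain = ["Word", "Definition", "Word", "Definition",
         "Definition", "Word", "Definition", "Word"]

total_hops = len(chain) - 1          # every relationship counts
only_word_hops = word_hops(chain)    # only Word nodes count
within_limit = only_word_hops <= 3   # maximum of 3 Word hops
```

For this chain the sketch gives 7 total hops and 3 Word hops, matching the numbers stated in the question, so the chain falls within a limit of 3 Word hops even though it has 4 Word nodes.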

Cypher to return total node count as well as a limited set

Is it possible to extract in a single cypher query a limited set of nodes and the total number of nodes?
match (n:Molecule) with n, count(*) as nb limit 10 return {N: nb, nodes: collect(n)}
The above query properly returns the nodes, but returns 1 as number of nodes. I certainly understand why it returns 1, since there is no grouping, but can't figure out how to correct it.
The following query returns the counter for the entire number of rows (which I guess is what was needed). Then it matches again and limits your search, but the original counter is still available since it is carried through via the WITH-statement.
MATCH
(n:Molecule)
WITH
count(*) AS cnt
MATCH
(n:Molecule)
WITH
n, cnt LIMIT 10
RETURN
{ N: cnt, nodes:collect(n) } AS molecules
Here is an alternate solution:
match (n:Molecule) return {nodes: collect(n)[0..5], n: size(collect(n))}
This took 84 ms for 30k nodes. It is shorter, but not as efficient as the query proposed by wassgren above.
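The reason the first attempt reports 1 is that every non-aggregated expression in a WITH or RETURN acts as a grouping key, so `with n, count(*)` counts each node on its own. A rough Python model of the two groupings, using made-up node names rather than real :Molecule nodes, looks like this:

```python
from collections import Counter

# Hypothetical stand-ins for the matched :Molecule nodes.
molecules = [f"mol{i}" for i in range(30)]

# Model of `WITH n, count(*) AS nb`: n is the grouping key,
# so there is one group per node and every group has size 1.
per_node = Counter(molecules)

# Model of `WITH count(*) AS cnt`: there is no grouping key,
# so all rows fall into a single group.
cnt = len(molecules)

# Model of `LIMIT 10` followed by collect(n), with cnt carried along.
# Which 10 nodes Cypher keeps is unspecified without ORDER BY;
# taking the first 10 here is only a convenience of the model.
result = {"N": cnt, "nodes": molecules[:10]}
```

This is why the two-step query works: the count is taken while no other column is in scope, and only afterwards are the individual nodes matched again and limited.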

Resources