I am writing an api to return neo4j data. For my case I get all nodes matching.
API takes in userId, limit and offset and return a list of data matching that condition.
I found one solution Cypher to return total node count as well as a limited set but it is pretty old. Not sure if this is still the best way to do it.
Performance is same as firing 2 separate queries, atleast then one of them would be cached by neo4j after couple of runs.
Match(u:WorkstationUser {id: "alw:44807"})-[:HAS_ACCESS_TO]->(p) return distinct(p) skip 0 limit 10
Match(u:WorkstationUser {id: "alw:44807"})-[:HAS_ACCESS_TO]->(p) return count(distinct(p))
I want the result to be something like
{
items: [ {}, {}], # query 1
total: 100, # query 2
limit: 10, # can get from input
skip: 0 # can get from input
}
This will depend a bit on how much information you need from the nodes for which you want the count, and whether you need to get distinct results or not.
If distinct results are not needed, and you don't need to do any additional filtering on the relationship or node at the other end (no filtering of the label or properties of the node), then you can use the size() of the pattern which will use the degree information of the relationships present on the node, which is more efficient as you never have to actually expand out the relationships:
MATCH (u:WorkstationUser {id: "alw:44807"})
WITH u, size((u)-[:HAS_ACCESS_TO]->(p)) as total
MATCH (u)-[:HAS_ACCESS_TO]->(p)
RETURN p, total
SKIP 0 LIMIT 10
However if distinct results are needed, or you need to filter the node by label or properties, then you will have to expand all the results to get the total. If there aren't too many results (millions or billions) then you can collect the distinct nodes, get the size of the collection, then UNWIND the results and page:
MATCH (:WorkstationUser {id: "alw:44807"})-[:HAS_ACCESS_TO]->(p)
WITH collect(DISTINCT p) as pList
WITH pList, size(pList) as total
UNWIND pList as p
RETURN p, total
SKIP 0 LIMIT 10
Related
Stuck in finding the avg number of relationships each node has. I know how to find the number of nodes and the number of relationships in total. But I can't combine them into one query. Please help
You can use this approach. This is for getting this info for all nodes regardless of label:
MATCH (n)
WITH n, size((n)--()) as relCount
RETURN avg(relCount) as averageRelCount
If you're trying to return this information in addition to total nodes and total relationships, then you should read this knowledge base article on getting fast counts from the counts store. It can get you those totals, though it can't get you the average rel count as above.
You can combine them by using the counts store early in the query, and the average relationship part at the end.
Here's how you might use it if you use APOC Procedures to select the counts you want from the store:
CALL apoc.meta.stats() YIELD nodeCount, relCount as totalRelCount
MATCH (n)
WITH n, size((n)--()) as relCount, nodeCount, totalRelCount
RETURN avg(relCount) as averageRelCount, nodeCount, totalRelCount
I want to make a cypher query that do below tasks:
there is a given start node, and I want to get all related nodes in 2 hops
sort queried nodes by hops asc, and limit it with given number
and get all relations between result of 1.
I tried tons of queries, and I made below query for step 1, 2
MATCH path=((start {eid:12018})-[r:REAL_CALL*1..2]-(end))
WITH start, end, path
ORDER BY length(path) ASC
RETURN start, collect(distinct end)[..10]
But when I try to get relationships in queried path with below query, it returns all relationships in the path :
MATCH path=((start {eid:12018})-[r:REAL_CALL*1..2]-(end))
WITH start, end, path
ORDER BY length(path) ASC
RETURN start, collect(distinct end)[..10], relationships(path)
I think I have to match again with result of first match instead of get relationships from path directly, but all of my attempts have failed.
How can I get all relationships between queried nodes?
Any helps appreciate, thanks a lot.
[EDITED]
Something like this may work for you:
MATCH (start {eid:12018})-[rels:REAL_CALL*..2]-(end)
RETURN start, end, COLLECT(rels) AS rels_collection
ORDER BY
REDUCE(s = 2, rs in rels_collection | CASE WHEN SIZE(rs) < s THEN SIZE(rs) ELSE s END)
LIMIT 10;
The COLLECT aggregation function will generate a collection (of relationship collections) for each distinct start/end pair. The LIMIT clause limits the returned results to the first 10 start/end pairs, based on the ORDER BY clause. The ORDER BY clause uses REDCUE to calculate the minimum size of each path to a given end node.
I have a Neo4j database with thousands of nodes.
I'm using this query to find nodes which contains some text inside the desired field:
MATCH (n:MYNODE)
WHERE n.myfield CONTAINS {textToSearch}
RETURN n
ORDER BY n.myfield ASC
LIMIT 50
This query works, and returns the first 50 results ordere by n.myfield.
Let's say 340 nodes match the search criteria: the first 50 get returned. Is there a way to return also the total count? I would like to have the 50 nodes along with the total count (340) for displaying purposes.
I would do a second query like this:
MATCH (n:MYNODE)
WHERE n.myfield CONTAINS {textToSearch}
RETURN count(n)
Is there a way to avoid a second query and include this result in the first one? Neo4j should find all the 340 nodes before limiting them to 50 in the first query, so is there a way to intercept the nodes count before the LIMIT clause is applied and return it aswell?
How about something like this. Order the result and put it in a collection. Then return the size of the collection and the first 50 items in the collection.
MATCH (n:MYNODE)
WHERE n.myfield CONTAINS {textToSearch}
WITH n
ORDER BY n.myfield
WITH COLLECT(n) as matched
RETURN size(matched), matched[..50]
I have graph: (:Sector)<-[:BELONGS_TO]-(:Company)-[:PRODUCE]->(:Product).
I'm looking for the query below.
Start with (:Sector). Then match first 50 companies in that sector and for each company match first 10 products.
First limit is simple. But what about limiting products.
Is it possible with cypher?
UPDATE
As #cybersam suggested below query will return valid results
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
MATCH (c)-[:PRODUCE]->(p:Product)
WITH c, (COLLECT(p))[0..10] AS products
RETURN c, products
However this solution doesn't scale as it still traverses all products per company. Slice applied after each company products collected. As number of products grows query performance will degrade.
Each returned row of this query will contain: a sector, one of its companies (at most 50 per sector), and a collection of up to 10 products for that company:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH s, (COLLECT(c))[0..50] AS companies
UNWIND companies AS company
MATCH (company)-[:PRODUCE]->(p:Product)
WITH s, company, (COLLECT(p))[0..10] AS products;
Updating with some solutions using APOC Procedures.
This Neo4j knowledge base article on limiting results per row describes a few different ways to do this.
One way is to use apoc.cypher.run() to execute a limited subquery per row. Applied to the query in question, this would work:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
CALL apoc.cypher.run('MATCH (c)-[:PRODUCE]->(p:Product) WITH p LIMIT 10 RETURN collect(p) as products', {c:c}) YIELD value
RETURN c, value.products AS products
The other alternative mentioned is using APOC path expander procedures, providing the label on a termination filter and a limit:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
CALL apoc.path.subgraphNodes(c, {maxLevel:1, relationshipFilter:'PRODUCE>', labelFilter:'/Product', limit:10}) YIELD node
RETURN c, collect(node) AS products
Is it possible to extract in a single cypher query a limited set of nodes and the total number of nodes?
match (n:Molecule) with n, count(*) as nb limit 10 return {N: nb, nodes: collect(n)}
The above query properly returns the nodes, but returns 1 as number of nodes. I certainly understand why it returns 1, since there is no grouping, but can't figure out how to correct it.
The following query returns the counter for the entire number of rows (which I guess is what was needed). Then it matches again and limits your search, but the original counter is still available since it is carried through via the WITH-statement.
MATCH
(n:Molecule)
WITH
count(*) AS cnt
MATCH
(n:Molecule)
WITH
n, cnt LIMIT 10
RETURN
{ N: cnt, nodes:collect(n) } AS molecules
Here is an alternate solution:
match (n:Molecule) return {nodes: collect(n)[0..5], n: length(collect(n))}
84 ms for 30k nodes, shorter but not as efficient as the above one proposed by wassgren.