I wonder if anyone can advise how to adjust the following query so that it returns one relationship with a count of the number of actual relationships rather than every relationship? I have some nodes with many relationships and it's killing the graph's performance.
MATCH (p:Provider{countorig: "XXXX"})-[r:supplied]-(i:Importer)
RETURN p, i limit 100
Many thanks
To return the relationship name along with a count, change your "return" statement, like this:
MATCH (p:Provider{countorig: "XXXX"})-[r:supplied]-(i:Importer)
RETURN type(r), count(r)
Using type(r) will return the type of the relationship, which looks to be "supplied" in your example. And then count(r) is just using the built-in function to count the number of occurrences of that relationship in the query.
Related
match(m:master_node:Application)-[r]-(k:master_node:Server)-[r1]-(n:master_node)
where (m.name contains '' and (n:master_node:DeploymentUnit or n:master_node:Schema))
return distinct m.name,n.name
Hi,I am trying to get total number of records for the above query.How I change the query using count function to get the record count directly.
Thanks in advance
The following query uses the aggregating funtion COUNT. Distinct pairs of m.name, n.name values are used as the "grouping keys".
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
RETURN m.name, n.name, COUNT(*) AS cnt
I assume that m.name contains '' in your query was an attempt to test for the existence of m.name. This query uses the EXISTS() function to test that more efficiently.
[UPDATE]
To determine the number of distinct n and m pairs in the DB (instead of the number of times each pair appears in the DB):
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
WITH DISTINCT m.name AS n1, n.name AS n2
RETURN COUNT(*) AS cnt
Some things to consider for speeding up the query even further:
Remove unnecessary label tests from the MATCH pattern. For example, can we omit the master_node label test from any nodes? In fact, can we omit all label testing for any nodes without affecting the validity of the result? (You will likely need a label on at least one node, though, to avoid scanning all nodes when kicking off the query.)
Can you add a direction to each relationship (to avoid having to traverse relationships in both directions)?
Specify the relationship types in the MATCH pattern. This will filter out unwanted paths earlier. Once you do so, you may also be able to remove some node labels from the pattern as long as you can still get the same result.
Use the PROFILE clause to evaluate the number of DB hits needed by different Cypher queries.
You can find examples of how to use count in the Neo4j docs here
In your case the first example where:
count(*)
Is used to return a count of each returned item should work.
How can I match using Cypher by any relation amount? For example, in a database there are Person. If this person has any son, they will be related by the relation [:SON].
This query will return each Person that has a son:
match (p:Person)-[:SON]->(:Person)
return p
Knowing this, how can I match the number of Person that only has 2 sons?
You can use size() in the where clause, like so:
match (p:Person) where size((p)-[:SON]->())=2 return p
You can use the count function to get the count of sons for each person. Then filter out the persons based on the count.
MATCH (p:Person)-[r:SON]->(:Person)
WITH p, count(r) as sons
WHERE sons=2
RETURN p
I am sending a cypher query through php.
match (n:person)-[:watched]->(m:movie)
where m.Title in $mycollection
return count(distinct n.id);
this returns the number of people who have watched movies in my collection.
I actually want to return the list of names, and return n.name works fine.
When I try to return n.name and count(distinct n.id) at the same time, I lose the total count and get the count per row.
match (n:person)-[:watched]->(m:movie)
where m.Title in $mycollection
return n.name, count(distinct n.id);
does not work. The count column appears as 1 for each row.
As I'm using php, I've also tried:
$count = $result->getNodesCount();
to no avail. So I'm using php to count the array. But it feels like Cypher should be able to do it, right?
return n.name, count(distinct n.name) means "return each distinct n.name value and its number of distinct values". The number must always be 1, since a distinct value is, obviously, distinct.
If you are actually looking for the number of times each person had an outgoing relationship to a movie whose title is in $mycollection, do this instead (where count(*) counts the number of times a given n.name was matched):
MATCH (n:person)-->(m:movie)
WHERE m.Title in $mycollection
RETURN n.name, count(*);
Note that the above query omits the [watched] pattern found in your query, since that syntax (with no colon before watched) does no filtering at all. It merely assigns the relationship to a variable named watched, but that variable is not otherwise used, and is therefore superfluous.
If you had intended to use watched as the relationship type, then do this instead:
MATCH (n:person)-[:watched]->(m:movie)
WHERE m.Title in $mycollection
RETURN n.name, count(*);
This modified query returns the number of times each person watched a movie whose title is in $mycollection
MATCH (p:Product), (s:Student), (b:Boy), (a:Attribute)
RETURN count(distinct(p)), count(distinct(s)), count(distinct(b)), count(distinct(a))
I want to know how many counts of each node types in the graph using this query. However, the Neo4j Browser gives a warning saying that this query produces a cartesian product. Is there a better way to write the query?
Yes. You want to make sure your query uses the NodeCountFromCountStore operator (you can view this in the query plan if you EXPLAIN the query, so you can check before you actually execute).
The tricky part of this is that the only way for this plan to be used is if you match to all nodes of a label, then get the count (no other variables in your WITH or RETURN!).
You can try this approach, which unions queries together, and keeps the NodeCountFromStore by adding the label column after you get the count:
match (n:Product)
with count(n) as count
return 'Product' as label, count
union all
match (n:Student)
with count(n) as count
return 'Student' as label, count
union all
match (n:Boy)
with count(n) as count
return 'Boy' as label, count
union all
match (n:Attribute)
with count(n) as count
return 'Attribute' as label, count
To get a variety of statistics for your DB, including a count of the number of nodes for every label, you can use the APOC function apoc.meta.stats.
The following query gets just the label node counts, returning a map of label names to node counts:
CALL apoc.meta.stats() YIELD labels
RETURN labels;
I'm trying to find the number of nodes of a certain kind in my database that are connected to more than one other node of another kind. In my case, it's place nodes connected to several name nodes. I have a query that works:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p,count(n) as counts
WHERE counts > 1
RETURN p;`
However, that only returns the place nodes, and ideally I'd like it to return all the nodes and edges involved. I've found a question on returning variables from before the WITH, but if I include any of the other variables I've defined, the query returns no responses, i.e. this query returns nothing:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p, count(n) as counts, rels
WHERE counts > 1
RETURN p;
I don't know how to return the information that I want without changing the results of the query. Any help would be much appreciated
The reason your second query returns nothing is because its WITH clause specifies as aggregation "grouping keys" both p and rels. Since each rels path has only a single n value, counts would always be 1.
Something like this might work for you:
MATCH path=(p:Place)-[:Called]->(:Name)
WITH p, COLLECT(path) as paths
WHERE SIZE(paths) > 1
RETURN p, paths;
This returns each matching Place node and all its paths.
Try this:
MATCH (p:Place)-[c:Called]->(n:Name)
WHERE size((p)-[:Called]->(:Name)) > 1
WITH p,count(n) as counts, collect(n) AS names, collect(c) AS calls
RETURN p, names, calls, counts ORDER BY counts DESC;
This query makes use of Cypher's collect() function to create lists of the names and called relationships for each place that has more than Called relationship with a Name node.