Whenever I use a query to get the count of a specific node, I always get a number greater than 1, even though only one distinct node of that type exists.
Sample query:
MATCH (p)-[rel]->(v:myDistinctNode) RETURN COUNT(v)
Output: 80
MATCH (p)-[rel]->(v:myDistinctNode) RETURN COUNT(DISTINCT v)
Output: 1
I see different results when using DISTINCT, but I cannot use DISTINCT all the time. Why am I seeing this, and how can I avoid it? Thanks!
Neo4j Kernel-Version: 3.5.14
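The short answer is that COUNT(v) counts rows rather than distinct nodes, so you need to collapse the duplicate rows before counting, for example with a collect statement.
MATCH (p)-[rel]->(v:myDistinctNode) WITH collect(DISTINCT v) AS nodes RETURN size(nodes)
This should return one.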
I'm not a Cypher expert, but I believe the reason it doesn't work is that the Cypher result is like a table with one column for p, one for rel, and one for v, and one row per matched path. Even though v is a single node, there are still 80 rows that contain v.
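To make the row model concrete, here is a small Python sketch (not Neo4j code) that mimics the result table. The 80 rows and the node names are invented to match the numbers in the question, so treat it as an illustration of the counting behaviour only.

```python
# Model of the rows produced by MATCH (p)-[rel]->(v:myDistinctNode).
# Assumption: 80 different p nodes all point at the same single v node.
rows = [{"p": f"p{i}", "rel": f"rel{i}", "v": "the_only_v"} for i in range(80)]

# COUNT(v) adds one for every row in which v is not null.
count_v = sum(1 for row in rows if row["v"] is not None)

# COUNT(DISTINCT v) removes duplicate values of v before counting.
count_distinct_v = len({row["v"] for row in rows})

# collect(v) folds all rows into a single row holding a list, so counting
# that row gives 1 regardless of how many distinct nodes the list holds.
collected = [[row["v"] for row in rows]]
count_collected_rows = len(collected)

print(count_v, count_distinct_v, count_collected_rows)
```

The last value shows why counting a collected list is not a reliable distinct count: it would still be 1 if the list held several different nodes.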
match(m:master_node:Application)-[r]-(k:master_node:Server)-[r1]-(n:master_node)
where (m.name contains '' and (n:master_node:DeploymentUnit or n:master_node:Schema))
return distinct m.name,n.name
Hi, I am trying to get the total number of records for the above query. How can I change the query to use the count function and get the record count directly?
Thanks in advance
The following query uses the aggregating function COUNT. Distinct pairs of m.name, n.name values are used as the "grouping keys".
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
RETURN m.name, n.name, COUNT(*) AS cnt
I assume that m.name contains '' in your query was an attempt to test for the existence of m.name. This query uses the EXISTS() function to test that more efficiently.
[UPDATE]
To determine the number of distinct n and m pairs in the DB (instead of the number of times each pair appears in the DB):
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
WITH DISTINCT m.name AS n1, n.name AS n2
RETURN COUNT(*) AS cnt
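The difference between the two queries above can be sketched in Python (this models the result rows, it is not Neo4j code). The application and node names are hypothetical.

```python
from collections import Counter

# Hypothetical (m.name, n.name) pairs, one per matched path.
rows = [
    ("app1", "schemaA"),
    ("app1", "schemaA"),  # same pair reached through a second Server
    ("app1", "unitB"),
    ("app2", "schemaA"),
]

# RETURN m.name, n.name, COUNT(*): one output row per distinct pair,
# carrying the number of paths that produced that pair.
per_pair = Counter(rows)

# WITH DISTINCT ... RETURN COUNT(*): how many distinct pairs exist.
distinct_pairs = len(set(rows))

for (m_name, n_name), cnt in per_pair.items():
    print(m_name, n_name, cnt)
print("distinct pairs:", distinct_pairs)
```

The first form answers "how often does each pair occur", the second answers "how many pairs are there".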
Some things to consider for speeding up the query even further:
Remove unnecessary label tests from the MATCH pattern. For example, can we omit the master_node label test from any nodes? In fact, can we omit all label testing for any nodes without affecting the validity of the result? (You will likely need a label on at least one node, though, to avoid scanning all nodes when kicking off the query.)
Can you add a direction to each relationship (to avoid having to traverse relationships in both directions)?
Specify the relationship types in the MATCH pattern. This will filter out unwanted paths earlier. Once you do so, you may also be able to remove some node labels from the pattern as long as you can still get the same result.
Use the PROFILE clause to evaluate the number of DB hits needed by different Cypher queries.
You can find examples of how to use count in the Neo4j docs.
In your case, the first example, which uses
count(*)
to return a count of each returned item, should work.
I have rerun the query below multiple times over the last two days. The Neo4j interface says it is running, but it seems to run endlessly. I have run other queries, which have all returned an output. I left this query running for 9 hours and it still had not finished. I'm not sure what the issue is but would appreciate any help.
I'm running Neo4j-community-2.3.12 which is an older version but it should work as I am following a tutorial and the rest of the queries work fine.
Cypher script - which is very basic:
match p=(ione)-[:ResponseTo*]->(itwo)
where length(p)=9 with p
match (u)-[:CreateChat]->(i)
where i in nodes(p)
return count(distinct u);
Image of query running endlessly:
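This query looks like an endless loop: the unbounded pattern [:ResponseTo*] can end up expanding paths of every length before the length check is applied.
Instead of getting all the paths and checking the length afterwards, I would suggest matching only paths of the desired length (9).
Also, consider adding labels in the path query.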
match p=(ione)-[:ResponseTo*9]->(itwo)
with p
match (u)-[:CreateChat]->(i)
where i in nodes(p)
return count(distinct u);
As Raj noted, you will want to use labels in this, as right now this is doing an all nodes scan which isn't performant.
We can also make sure the second match is more performant by ensuring we start i with the previously matched nodes, rather than applying that as a filter after the match:
match p=(ione)-[:ResponseTo*9]->(itwo)
unwind nodes(p) as i
with DISTINCT i
match (u)-[:CreateChat]->(i)
return count(distinct u);
I have two queries which are almost the same, as shown below (the only difference is the r: in front of FOR in Query 1).
Query 1: MATCH p=()-[r:FOR]->() RETURN count(p)
Query 2: MATCH p=()-[FOR]->() RETURN count(p)
When I run these queries against my Neo4j server, they return different results. The count from Query 1 is around 1/3 of the count from Query 2. I guess this is because Query 1 has 'combined' the results while Query 2 didn't (e.g. a-[FOR]->c and b-[FOR]->c were combined into 1 record), but that is just a guess. I have searched Google and the Neo4j documentation with no luck. Can anyone explain the difference?
Thanks in advance.
MATCH p=()-[r:FOR]->() RETURN count(p)
This query binds the FOR relationship to the r variable (though it doesn't use it).
MATCH p=()-[FOR]->() RETURN count(p)
This query binds any relationship (i.e. of any type) to the FOR variable.
The correct syntax for specifying the relationship type in Cypher is :XXX, with the leading colon. The correct version of the second query would actually be:
MATCH p=()-[:FOR]->() RETURN count(p)
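As a rough illustration of why the counts differ, here is a Python sketch (not Neo4j code) over a made-up set of relationships. The node names and the OWNS and LIKES types are hypothetical.

```python
# Hypothetical relationships as (start, type, end) triples.
relationships = [
    ("a", "FOR", "c"),
    ("b", "FOR", "c"),
    ("a", "OWNS", "d"),
    ("b", "LIKES", "d"),
    ("c", "OWNS", "d"),
    ("d", "LIKES", "a"),
]

# [r:FOR] and [:FOR] both filter on the relationship type FOR.
typed = [rel for rel in relationships if rel[1] == "FOR"]

# [FOR] has no colon, so FOR is only a variable name and
# relationships of every type are matched.
untyped = list(relationships)

print(len(typed), len(untyped))
```

With this sample data the typed match sees a third of the relationships, which mirrors the ratio reported in the question.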
How can I update a random set of nodes in Neo4j? I tried the following:
match (Firstgraph)
with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph)
order by rand();
match (G1:FirstGraph)
where id(G1)=Id
set G1.Version=5
My idea is to get a random set and then update it, but I got the error:
Expected exactly one statement per query but got: 2
Thanks for your help.
Let's work out what the problem is. First of all, your error:
Expected exactly one statement per query but got: 2
It comes from the query itself: you wrote two statements in a single query, which is why you get this error.
match (Firstgraph)
with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph)
order by rand();
match (G1:FirstGraph)
where id(G1)=Id
set G1.Version=5
This is not a valid query, because ; is the end-of-statement marker, so nothing can follow it in the same query. A UNION would not help here either: every part of a UNION must end with a RETURN of the same columns, and Id from the first part is not visible in the second. Instead, you can chain both steps in one statement using WITH (the limit of 10 is only an example size):
match (G1:FirstGraph)
with G1 order by rand() limit 10
set G1.Version=5
return G1.name, G1.Version, id(G1)
Also, if you want to match a random set of nodes, you can simply do this (in this example each node has a 50% chance of being selected):
Match (node) Where rand() > 0.5 return node
You can then do whatever you want with node by continuing the query with WITH instead of RETURN.
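The two ways of picking random nodes behave differently, which the following Python sketch tries to show (it models the selection only, it is not Neo4j code). The node ids, the seed, and the sample size k are assumptions made for the illustration, and the fixed-size variant corresponds to ordering by rand() and adding a LIMIT, which the question did not include.

```python
import random

# Hypothetical node ids standing in for the nodes matched by MATCH (node).
node_ids = list(range(1000))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# WHERE rand() > 0.5: each node is kept independently with roughly a 50%
# chance, so the size of the result varies from run to run.
kept_by_filter = [n for n in node_ids if rng.random() > 0.5]

# ORDER BY rand() LIMIT k: shuffle, then take exactly k nodes.
k = 10
kept_by_limit = rng.sample(node_ids, k)

print(len(kept_by_filter), len(kept_by_limit))
```

Use the probability filter when an approximate fraction of the nodes is enough, and the fixed-size form when you need an exact number of nodes.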
I am trying to search for a keyword in all the indexes I have in my graph database.
Below is the query:
start n=node:Users(Name="Hello"),
m=node:Location(LocationName="Hello")
return n,m
I get the nodes only if the keyword "Hello" is present in both indexes (Users and Location); I do not get any results if "Hello" is missing from either index.
Could you please let me know how to modify this Cypher query so that I get results if "Hello" is present in any of the index keys (Name or LocationName)?
In 2.0 you can use UNION and have two separate queries like so:
start n=node:Users(Name="Hello")
return n
UNION
start n=node:Location(LocationName="Hello")
return n;
The problem with the way you have written the query is that it calculates a cartesian product of pairs between n and m, so if either n or m isn't found, no results are returned. If one n is found and two ms are found, then you get 2 results (with a repeating n). This is similar to how the FROM clause works in SQL: if you have an empty table called empty and you do select * from x, empty; then you'll get 0 results, unless you do an outer join of some sort.
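The cartesian product behaviour can be reproduced in a few lines of Python (a model of the result rows, not Neo4j code). The lookup results below are hypothetical.

```python
from itertools import product

# Hypothetical index lookups: one matching user, no matching location.
users = ["user:Hello"]
locations = []

# start n=..., m=... return n, m produces every (n, m) combination.
pairs = list(product(users, locations))

# UNION of two separate queries keeps rows from either side and
# removes duplicates.
union = list(dict.fromkeys(users + locations))

# One n and two ms give two rows, with n repeated in each.
two_rows = list(product(["n1"], ["m1", "m2"]))

print(pairs)
print(union)
print(two_rows)
```

Because one side of the product is empty, pairs is empty as well, while the union still contains the user that was found.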
Unfortunately, it's somewhat difficult to do this in 1.9. I've tried many iterations of things like WITH collect(n) as n, etc., but it boils down to the cartesian product thing at some point, no matter what.