I'm starting out with Neo4j and graphs, and I'm trying to do the following:
I have to find the difference between the number of users (each user is a node) and the number of different names they have. I have 16 nodes, and each one has its own name (Name is one of its properties), but some of them share the same name (for example, node A has (Name: Amanda, City: Roma) and node B has (Name: Amanda, City: Paris)), so the count of names will be lower because some of them are repeated.
I have tried this:
match (n) with n, count(n) as c return sum(c)
That gives me the number of nodes. Then I tried this:
match (n) with n, count(n) as nodeC with n, count( distinct n.Name) as
nameC return sum(nodeC) as sumN, sum(nameC) as sumC, sumN-sumC
But it doesn't work (I'm not even sure I'm getting the names correctly, because when I try that part separately it doesn't work either).
I think this is what you are looking for:
MATCH (n)
RETURN COUNT(n) - COUNT(DISTINCT n.Name) AS diff;
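To see what that single aggregation computes, here is a small Python model of it. This is only a sketch, not Cypher: the `nodes` list is made-up sample data, `count_minus_distinct` is a hypothetical helper, and it assumes (as Cypher aggregations do) that nodes without the property are skipped when counting distinct values.

```python
# Hypothetical sample data standing in for user nodes; not from the real graph.
nodes = [
    {"Name": "Amanda", "City": "Roma"},
    {"Name": "Amanda", "City": "Paris"},
    {"Name": "Bruno", "City": "Lisbon"},
    {"City": "Oslo"},  # a node without a Name property
]

def count_minus_distinct(nodes, key="Name"):
    """Model of COUNT(n) - COUNT(DISTINCT n.<key>)."""
    node_count = len(nodes)
    # COUNT(DISTINCT ...) ignores nulls, so skip nodes lacking the property.
    distinct_values = {n[key] for n in nodes if n.get(key) is not None}
    return node_count - len(distinct_values)

diff = count_minus_distinct(nodes)
print(diff)
```

Property keys are case-sensitive, so asking for `name` when the nodes store `Name` finds no values at all; in this model `count_minus_distinct(nodes, key="name")` simply returns the node count.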
I want to get all the nodes that have specific labels. My code:
match (n) where labels(n)=["Person","Actor","Old"] return n
Although there are nodes that satisfy this condition, I do not get any results.
This is why:
labels(n) returns a list of strings, but not necessarily in a specific order, and comparing two lists with = only succeeds when the elements are in the same order.
You can try this:
WHERE n:Person AND n:Actor AND n:Old
Or
WHERE ALL(l IN ["Person", "Actor", "Old"] WHERE l IN labels(n))
Lists do not compare as equal if the items are not in the same order. So first of all, list all the label combinations in your database so you can see how the labels are arranged.
match (n)
return distinct labels(n)
Then you will see in which order the labels you are looking for, ["Person","Actor","Old"], are actually stored.
If you are trying to find nodes that carry all the labels in that list, in any order, then this query will work for you.
match (n)
where all(lbl in ["Person","Actor","Old"] where lbl in labels(n))
return n
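If you like using APOC functions, here is an alternative query. Note that it matches only nodes whose labels are exactly those three (in any order), with no extra labels: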
match (n)
where apoc.coll.isEqualCollection(["Person","Actor","Old"], labels(n))
return n
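The three checks discussed above behave differently, which a small Python model can make concrete. This is a sketch rather than Cypher; the function names and the sample label lists are made up for illustration, and `same_collection` assumes that apoc.coll.isEqualCollection compares elements and their counts regardless of order.

```python
from collections import Counter

wanted = ["Person", "Actor", "Old"]

def equals_in_order(labels):
    # Models labels(n) = [...]: list equality is order-sensitive.
    return labels == wanted

def has_all(labels):
    # Models ALL(l IN wanted WHERE l IN labels(n)): extra labels are allowed.
    return all(l in labels for l in wanted)

def same_collection(labels):
    # Models apoc.coll.isEqualCollection: same elements, any order, no extras.
    return Counter(labels) == Counter(wanted)

reordered = ["Actor", "Old", "Person"]
with_extra = ["Old", "Person", "Actor", "Director"]

print(equals_in_order(reordered), has_all(reordered), same_collection(reordered))
print(has_all(with_extra), same_collection(with_extra))
```

So the choice between the ALL(...) form and the APOC form depends on whether nodes with additional labels should be returned.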
We have a large graph (over 1 billion edges) that has multiple relationship types between nodes.
In order to check the number of nodes that have a single unique relationship between nodes (i.e. a single relationship between two nodes per type, which otherwise would not be connected) we are running the following query:
MATCH (n)-[:REL_TYPE]-(m)
WHERE size((n)-[]-(m))=1 AND id(n)>id(m)
RETURN COUNT(DISTINCT n) + COUNT(DISTINCT m)
To demonstrate a similar result, the sample code below can be run on the movie graph after running :play movies in an empty graph, resulting in 4 nodes (in this case we are asking for nodes with 3 types of relationships):
MATCH (n)-[]-(m)
WHERE size((n)-[]-(m))=3 AND id(n)>id(m)
RETURN COUNT(DISTINCT n) + COUNT(DISTINCT m)
Is there a better/more efficient way to query the graph?
The following query is more performant, since it only scans each relationship once [whereas size((n)--(m)) will cause relationships to be scanned multiple times]. It also specifies a relationship direction to filter out half of the relationship scans, and to avoid the need for comparing native IDs.
MATCH (n)-->(m)
WITH n, m, COUNT(*) AS cnt
WHERE cnt = 3
RETURN COUNT(DISTINCT n) + COUNT(DISTINCT m)
NOTE: It is not clear what you are using the COUNT(DISTINCT n) + COUNT(DISTINCT m) result for, but be aware that it is possible for some nodes to be counted twice after the addition.
[UPDATE]
If you want to get the actual number of distinct nodes that pass your filter, here is one way to do that:
MATCH (n)-->(m)
WITH n, m, COUNT(*) AS cnt
WHERE cnt = 3
WITH COLLECT(n) + COLLECT(m) AS nodes
UNWIND nodes AS node
RETURN COUNT(DISTINCT node)
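The double counting mentioned in the note is easy to see in a small Python model. The `pairs` list below is made-up data standing in for the (n, m) rows that pass the cnt = 3 filter; it is a sketch of the arithmetic, not of how Neo4j executes the query.

```python
# Hypothetical (n, m) pairs that survived the cnt = 3 filter.
pairs = [("a", "b"), ("b", "c")]

# COUNT(DISTINCT n) + COUNT(DISTINCT m): "b" appears on both sides.
summed = len({n for n, _ in pairs}) + len({m for _, m in pairs})

# COLLECT(n) + COLLECT(m), then UNWIND and COUNT(DISTINCT node).
combined = [n for n, _ in pairs] + [m for _, m in pairs]
distinct_nodes = len(set(combined))

print(summed, distinct_nodes)
```

Here the sum reports 4 while only 3 different nodes are involved, because the node in the middle is counted once as a start node and once as an end node.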
How can I know how many nodes and edges are involved in a MATCH? Is there another way besides Explain / Profile Match?
If you mean how many nodes are matched in a path, such as a variable-length path, then you can assign a path variable for this:
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN p, length(p) as pathLength, length(p) + 1 as numberOfNodesInPath
You can also use nodes(p) and relationships(p) to get the collection of nodes and relationships that make up the path, and you can use size() on those collections to get their size.
Cypher has a COUNT() function that counts the number of elements, as in this query:
MATCH (n)
RETURN COUNT(n);
This query will count all nodes in your database.
You can find more information in the Cypher manual, under aggregating functions.
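The following Cypher snippet should return the number of distinct nodes, and the number of relationships summed over all matched paths, for any given MATCH clause. Just replace <your code here> with your MATCH pattern, making sure the pattern is bound to a path variable named p. Note that a relationship that appears in several paths is counted once per path in relCount.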
MATCH <your code here>
WITH COLLECT(NODES(p)) AS ns, SUM(SIZE(RELATIONSHIPS(p))) AS relCount
UNWIND ns AS nodeList
UNWIND nodeList AS node
RETURN COUNT(DISTINCT node) AS nodeCount, relCount;
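As a rough illustration of what the two aggregates in that snippet add up, here is a Python model. The `paths` list is made-up data where each path is written as its sequence of node ids; this is an assumption-laden sketch and does not touch a real database.

```python
# Hypothetical matched paths, each given as its sequence of node ids.
paths = [["a", "b", "c"], ["a", "b", "d"]]

# COLLECT(NODES(p)), two UNWINDs, then COUNT(DISTINCT node).
node_count = len({node for path in paths for node in path})

# SUM(SIZE(RELATIONSHIPS(p))): a path with k nodes has k - 1 relationships.
rel_count = sum(len(path) - 1 for path in paths)

print(node_count, rel_count)
```

Both sample paths share the a-b hop, so that hop contributes twice to `rel_count`, while each node is counted only once in `node_count`.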
Background
Hi all, I am currently trying to write a cypher statement that allows me to find a set of paths on a map from a starting point. I want my search result to always return connecting streets within 5 nodes. Optionally, if there's a nearby hospital, I would like my search pattern to also indicate nearby hospitals.
Main Problem
Because there isn't always a nearby hospital to the current street, sometimes my optional match search pattern comes back as null. Here's the current cypher statement I'm using:
MATCH path=(a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WHERE ALL (x IN nodes(path) WHERE (x:Street))
WITH DISTINCT nodes(path) + nodes(optionalPath) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
However, this syntax only works if optionalPath contains nodes. If it doesn't, the expression nodes(path) + nodes(optionalPath) adds null to a list and I get no records. This is true even when the nodes(path) term does contain nodes.
What's the best way to get around this problem?
You can use COALESCE to replace a NULL with some other value. For example:
MATCH path=(:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
WHERE ALL (x IN nodes(path) WHERE x:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodes(path) + COALESCE(nodes(optionalPath), []) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
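The effect of COALESCE here can be modelled in a few lines of Python. This is only a sketch of the null handling: `cypher_concat` and `coalesce` are hypothetical helpers that imitate Cypher's list concatenation and COALESCE, and the street names are placeholder values.

```python
def cypher_concat(left, right):
    """Model of Cypher's list + list, where a null operand makes the result null."""
    if left is None or right is None:
        return None
    return left + right

def coalesce(*values):
    """Return the first non-null argument, like Cypher's COALESCE."""
    return next((v for v in values if v is not None), None)

path_nodes = ["street1", "street2"]  # stands in for nodes(path)
optional_nodes = None                # nodes(optionalPath) when no hospital matched

without_fix = cypher_concat(path_nodes, optional_nodes)
with_fix = cypher_concat(path_nodes, coalesce(optional_nodes, []))

print(without_fix, with_fix)
```

Without the fallback the whole expression collapses to null, and unwinding null produces no rows; with the empty-list fallback the street nodes survive.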
I have also made a few other improvements:
The WHERE clause was moved up to directly follow the first MATCH. This eliminates the unwanted path values immediately. In your original query the WHERE was attached to the OPTIONAL MATCH, so the second match was always attempted for every path (even unwanted ones), and the predicate only decided whether optionalPath was bound or null; it did not remove the unwanted paths themselves. (But it is actually not clear if you even need the WHERE clause at all; for example, if the CONNECTED_TO relationship is only used between Street nodes.)
The DISTINCT in your WITH clause would have prevented duplicate n collections, but the collections internally could have had duplicate paths. This was probably not what you wanted.
It seems you don't really want the path, just all the street nodes within 5 steps, plus any connected hospitals. So I would simplify your query to just that, and then condense the 3 columns down to 1.
MATCH (a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH collect(a) + collect(b) + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
If Streets can be indirectly connected (with a hospital in between), then I'd adjust it like this:
MATCH (a:Street {id: 123})-[:CONNECTED_TO]-(b:Street)
WITH a as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodez + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
It's a bit more verbose, but just says exactly what you want (and also adds the start node to the hospital check list)
I'm struggling with a problem despite having read a lot of documentation... I'm trying to find my graph's root node (or nodes; there may be several top nodes) and count their immediate children (all relationships are typed :BELONGS_TO).
My graph looks like this (cf. attached screenshot). I have been trying the following query, which works as long as the root node has only ONE incoming relationship, and does not when it has more than one. (I'm not really familiar with the Cypher language yet.)
MATCH (n:Somelabel) WHERE NOT (()-[:BELONGS_TO]->(n:Somelabel)) RETURN n
Any help would be much appreciated! (I haven't even tried to count the root nodes' immediate children yet, which would be "2" according to my graph.)
The correct query was given by cybersam:
MATCH (n:Somelabel) WHERE NOT (n)-[:BELONGS_TO]->() RETURN n;
MATCH (n:Somelabel)<-[:BELONGS_TO]-(c:Somelabel)
WHERE NOT (n)-[:BELONGS_TO]->() RETURN n, count(c);
Based on your diagram, it looks like you are actually looking for "leaf" nodes. This query will search for all Somelabel nodes that have no outgoing BELONGS_TO relationships, and return each such node along with a count of the number of distinct nodes that have a BELONGS_TO relationship pointing to that node.
MATCH (n:Somelabel)
WHERE NOT (n)-[:BELONGS_TO]->()
OPTIONAL MATCH (m)-[:BELONGS_TO]->(n)
RETURN n, COUNT(DISTINCT m);
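The logic of that query can be checked against a toy graph in Python. The edge list below is invented for illustration (it is not the graph from the screenshot), with each BELONGS_TO relationship written as a (child, parent) pair; treat it as a sketch of the filtering and counting steps only.

```python
from collections import defaultdict

# Hypothetical BELONGS_TO edges as (child, parent) pairs.
belongs_to = [("b", "a"), ("c", "a"), ("d", "b")]
nodes = {"a", "b", "c", "d"}

# WHERE NOT (n)-[:BELONGS_TO]->(): keep nodes with no outgoing edge.
has_outgoing = {child for child, _ in belongs_to}
tops = nodes - has_outgoing

# OPTIONAL MATCH (m)-[:BELONGS_TO]->(n) ... COUNT(DISTINCT m)
incoming = defaultdict(set)
for child, parent in belongs_to:
    incoming[parent].add(child)

child_counts = {n: len(incoming[n]) for n in tops}
print(child_counts)
```

In this toy graph only "a" has no outgoing BELONGS_TO edge, and it has two direct children. A top node with no children at all would still appear with a count of 0, which is what the OPTIONAL MATCH is for.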
If you are actually looking for all "root" nodes, your original query would have worked.
As a sanity check, if you have a specific node that you believe is a "leaf" node (let's say it has an id value of 123), this query should return a single row with null values for r and m. If you get non-null results, then you actually have outgoing relationships.
MATCH (n {id:123})
OPTIONAL MATCH (n)-[r]->(m)
RETURN r, m