I'm learning Cypher and I created a 'Crime investigation' project on Neo4j.
I'm trying to return each parent who has exactly two sons/daughters in total, where every member of the family has committed a crime.
So, in order to get this in the graph, I executed this query:
match(:Crime)<-[:PARTY_TO]-(p:Person)-[:FAMILY_REL]->(s:Person)-[:PARTY_TO]->(:Crime)
where size((p)-[:FAMILY_REL]->())=2
return p, s
The FAMILY_REL relationship links a Person (p) to their sons, and the PARTY_TO relationship links a Person to the Crime nodes they have committed.
The query above is not working as it should. It returns parents with more than two sons, and also sons who themselves have just one son.
What is wrong with the logic of the query?
SIZE((p)-[:FAMILY_REL]->()) counts all children of p, including ones who have committed no crimes.
This query should work better, as it only counts children who are criminals (DISTINCT keeps a child who is party to several crimes from being counted more than once):
MATCH (:Crime)<-[:PARTY_TO]-(p:Person)-[:FAMILY_REL]->(s:Person)-[:PARTY_TO]->(:Crime)
WITH p, COLLECT(DISTINCT s) AS badKids
WHERE SIZE(badKids) = 2
RETURN p, badKids
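To make the counting concrete, here is a small pure-Python model of the MATCH / WITH ... COLLECT steps. It is only a sketch: the names and crime IDs are made up, and it imitates the row semantics rather than running anything in Neo4j. MATCH yields one row per (parent crime, child, child crime) combination, so a child who is party to several crimes shows up in several rows.

```python
from collections import defaultdict

# Hypothetical toy graph: parent -> children (FAMILY_REL), person -> crimes (PARTY_TO).
family_rel = {"Ann": ["Bob", "Cat", "Dan"], "Eve": ["Fay", "Gus"]}
party_to = {
    "Ann": ["c1"], "Bob": ["c2"], "Cat": ["c3", "c4"],  # Dan has no crimes
    "Eve": ["c5"], "Fay": ["c6"], "Gus": ["c7"],
}

def match_rows():
    """One (parent, child) row per parent crime and child crime, like the MATCH clause."""
    rows = []
    for parent, children in family_rel.items():
        for _parent_crime in party_to.get(parent, []):
            for child in children:
                for _child_crime in party_to.get(child, []):
                    rows.append((parent, child))
    return rows

def parents_with_two_bad_kids(distinct):
    """Group rows by parent, collect the children, keep parents with exactly two."""
    kids = defaultdict(list)
    for parent, child in match_rows():
        if not distinct or child not in kids[parent]:
            kids[parent].append(child)
    return {p: ks for p, ks in kids.items() if len(ks) == 2}
```

With this data, parents_with_two_bad_kids(False) keeps only Eve, because Cat's two crimes put her in Ann's list twice. parents_with_two_bad_kids(True) keeps both Ann and Eve. Note that Ann has three children in total: counting only criminal children does not check the total, so the original size(...) = 2 condition would still be needed if the total must also be two.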
If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get the nodes for each of the returned relationships, so that the result has this format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?
RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
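And you'll see you get four rows: one each for the two single-relationship paths, plus two more for the two-relationship path from r1 to r3. That is not what you want, because there are only two distinct relationships.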
You could modify that to return only distinct values, but you're still over-searching the graph.
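The row count can be checked with a small pure-Python sketch that enumerates the same paths. It is only a model of the matching, not Neo4j itself, and the tuple layout for a relationship is my own choice.

```python
# Directed relationships of the three-node example: r1 -> r2 -> r3.
edges = [("r1", "TEMP_REL", "r2"), ("r2", "TEMP_REL", "r3")]

def paths(max_len=10):
    """Enumerate every directed path of 1..max_len relationships, from any start node."""
    found = []

    def extend(path):
        found.append(path)
        if len(path) < max_len:
            for edge in edges:
                # Continue from the last node; a relationship is used at most once per path.
                if edge[0] == path[-1][2] and edge not in path:
                    extend(path + [edge])

    for edge in edges:
        extend([edge])
    return found

rows = [rel for path in paths() for rel in path]  # models UNWIND rels AS rel
distinct_rows = sorted(set(rows))                 # models RETURN DISTINCT
```

Here paths() finds three paths (r1 to r2, r2 to r3, and r1 to r3), rows has four entries, and distinct_rows has two. The duplicates come from every relationship being revisited once per path that contains it.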
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)
Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to
I have created many nodes in Neo4j. They all have the same attributes, user_id and item_id. The code I used is as follows:
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
CREATE (main:Main_table {USER_ID: row.user_id,
ITEM_ID: row.item_id}
);
CREATE INDEX ON :Main_table(USER_ID);
CREATE INDEX ON :Main_table(ITEM_ID);
Now I want to create relationship between the nodes with the same user_id or item_id. For example, if node A, B and C have the same USER_ID, I want to create (A)-[:EDGE]->(B), (A)-[:EDGE]->(C) and (B)-[:EDGE]->(C). In order to achieve this goal, I tried the following code:
MATCH (a:Main_table),(b:Main_table)
WHERE a.USER_ID = b.USER_ID
CREATE (a)-[:USER_EDGE]->(b);
MATCH (a:Main_table),(b:Main_table)
WHERE a.ITEM_ID = b.ITEM_ID
CREATE (a)-[:ITEM_EDGE]->(b);
But due to the large amount of data (3000000 nodes, 100000 users), this process is very slow. How can I complete it quickly? Any help would be greatly appreciated!
Your query is causing a cartesian product, and the Cypher planner does not use indexes to optimize node lookups involving node property comparisons.
A query like this (instead of your USER_EDGE query) may be faster, as it does not cause a cartesian product:
MATCH (a:Main_table)
WITH a.USER_ID AS id, COLLECT(a) AS mains
UNWIND mains AS a
UNWIND mains AS b
WITH a, b
WHERE ID(a) < ID(b)
MERGE (a)-[:USER_EDGE]->(b)
This query uses the aggregating function COLLECT to collect the nodes that have the same USER_ID value, and uses the ID(a) < ID(b) test to ensure that a and b are not the same node and also to prevent duplicate relationships (in opposite directions).
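The grouping and pairing logic can be sketched in plain Python. This is only a model under assumed data: the node ids and USER_ID values are hypothetical, and combinations() stands in for the double UNWIND plus the ID(a) < ID(b) filter.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical rows: (internal node id, USER_ID).
nodes = [(0, "u1"), (1, "u2"), (2, "u1"), (3, "u1"), (4, "u2")]

def user_edges(nodes):
    """Return the (a, b) id pairs that would receive a USER_EDGE relationship."""
    groups = defaultdict(list)  # models WITH a.USER_ID AS id, COLLECT(a) AS mains
    for node_id, user_id in nodes:
        groups[user_id].append(node_id)
    edges = []
    for ids in groups.values():
        # combinations() over sorted ids yields each pair once with a < b,
        # the same pairs the double UNWIND plus ID(a) < ID(b) keeps.
        edges.extend(combinations(sorted(ids), 2))
    return edges
```

For these five nodes, user_edges(nodes) gives four pairs: (0, 2), (0, 3), (2, 3) for u1 and (1, 4) for u2. Only nodes inside the same group are ever paired, instead of comparing all 25 combinations of a and b. A group of n nodes still produces n(n-1)/2 relationships, so large groups remain expensive.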
I'm learning Cypher and I created a "Criminal investigation" project on Neo4j.
I'm trying to run a query that outputs each Person that has two children (Person), where both of the children have committed a crime. To test this, I used a Person (p) with p.name = "Lillian", because I know this person has two children and only one of them has committed a crime.
To do this, I ran the following query, which should return something if Lillian has two children who committed crimes and nothing otherwise:
match (p:Person)-[r:FAMILY_REL]->(s:Person)
where p.name = "Lillian"
and size((p)-[:FAMILY_REL]->()-[:PARTY_TO]->(:Crime))=2 and size((p)-[:FAMILY_REL]->()) = 2
return p, s
Since I already knew Lillian has only one child who committed a crime, the query should not have returned anything, but it returned both of her children.
I'm guessing the wrong part of the query is here:
where /*...*/ and size((p)-[:FAMILY_REL]->()-[:PARTY_TO]->(:Crime))=2
I think this is counting just the number of children instead of the number of children who have committed crimes.
What would be the correct way to do this?
Give this a try:
MATCH (p:Person)
WHERE p.name = "Lillian" AND size((p)-[:FAMILY_REL]->()) = 2
WITH p, [(p)-[:FAMILY_REL]->(child) WHERE (child)-[:PARTY_TO]->(:Crime) | child] as childCriminals
WHERE size(childCriminals) = 2
UNWIND childCriminals as s
RETURN p, s
Note that this will only work if Lillian has exactly two children, and both have been party to a crime.
As for why your query wasn't working: size() on a pattern counts matching paths, not distinct children. It's likely that one of the children was party to two crimes, which gives two paths and so satisfies the = 2 check.
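The difference between counting paths and counting children can be shown with a tiny Python model. The child names and crime IDs are invented for illustration; only Lillian's name comes from the question.

```python
# Hypothetical data: Lillian has two children; only Tom has crimes, and he has two.
children = {"Lillian": ["Tom", "Sue"]}
crimes = {"Tom": ["c1", "c2"], "Sue": []}

def path_count(parent):
    """Models size((p)-[:FAMILY_REL]->()-[:PARTY_TO]->(:Crime)): one per matching path."""
    return sum(len(crimes.get(child, [])) for child in children.get(parent, []))

def criminal_children(parent):
    """Models the pattern comprehension: one entry per child with at least one crime."""
    return [child for child in children.get(parent, []) if crimes.get(child)]
```

With this data, path_count("Lillian") is 2 even though criminal_children("Lillian") contains only Tom, which is the situation that lets the original WHERE clause pass.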
I have created the same relationship between 2 nodes several times in Neo4j.
It was a mistake as it makes the visualization less clear.
Now, I would like to replace those duplicate relationships between 2 nodes with one single relationship. It would be great if we could keep the number of original relationships in a property "count" on the new relationship.
What would be an efficient way to solve this problem?
I have about 100,000 relationships and I am a bit worried about the time it would take.
Here is a quick example to make the problem clearer:
I have:
Node A -- R1 -- Node B
Node A -- R2 -- Node B
And I would like to have
Node A -- R {count : 2} -- Node B
Thanks!
I assume these relationships don't have any properties and that the direction of the relationships doesn't matter.
You can combine these relationships with Cypher Query as shown:
MATCH (p:Node)-[r]-(c:Node)
WHERE ID(p) > ID(c)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[:R{count:count}]->(c)
If you want to merge relationships having the same directions only then you can use the following query:
MATCH (p:Node)-[r]->(c:Node)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[newrel:R{count:count}]->(c)
If you want to merge the properties as well, you can use the APOC plugin's apoc.refactor.mergeRelationships procedure.
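As a rough illustration of what the two queries compute, here is a pure-Python sketch that collapses duplicate relationships into counts. The (start, end) pairs are hypothetical, and the code only models the grouping, not the delete and create steps.

```python
from collections import Counter

# Hypothetical duplicate relationships as (start, end) pairs.
rels = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]

def merge_directed(rels):
    """Models the second query: group by (start, end), keeping direction."""
    return Counter(rels)

def merge_undirected(rels):
    """Models the first query: normalise each pair so that direction is ignored."""
    return Counter(tuple(sorted(pair)) for pair in rels)
```

Here merge_directed(rels) keeps ("A", "B") with count 2 and ("B", "A") with count 1 as separate entries, while merge_undirected(rels) folds them into a single ("A", "B") entry with count 3. Which endpoint becomes the start node is arbitrary in the undirected case: the sketch puts the smaller name first, while the Cypher query puts the node with the higher ID first.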
I'm moving my complex user database where users can be on one of many teams, be friends with each other and more to Neo4j. Doing this in a RDBMS was painful and slow, but is simple and blazing with Neo4j. :)
I was hoping there is a way to query for
a relationship that is 1 hop away and
another relationship that is 2 hops away
from the same query.
START n=node:myIndex(user='345')
MATCH n-[:IS_FRIEND|ON_TEAM*2]-m
RETURN DISTINCT m;
The reason is that users that are friends are one edge from each other, but users linked by teams are linked through that team node, so they are two edges away. This query does IS_FRIEND*2 and ON_TEAM*2, which gets teammates (yeah) and friends of friends (boo).
Is there a succinct way in Cypher to get both differing length relations in a single query?
I rewrote it to return a collection:
start person=node(1)
match person-[:IS_FRIEND]-friend
with person, collect(distinct friend) as friends
match person-[:ON_TEAM*2]-teammate
with person, friends, collect(distinct teammate) as teammates
return person, friends + filter(dupcheck in teammates: not(dupcheck in friends)) as teammates_and_friends
http://console.neo4j.org/r/oo4dvx
Thanks for putting together the sample db, Werner.
I have created a small test database at http://console.neo4j.org/?id=sqyz7i
I have also created a query which will work as you described:
START n=node(1)
MATCH n-[:IS_FRIEND]-m
WITH collect(distinct id(m)) as a, n
MATCH n-[:ON_TEAM*2]-m
WITH collect(distinct id(m)) as b, a
START n=node(*)
WHERE id(n) in a + b
RETURN n
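Both answers boil down to taking the union of two sets: users one IS_FRIEND hop away, and users two ON_TEAM hops away through a shared team. A minimal Python model of that idea, using made-up user and team ids, could look like this:

```python
# Hypothetical graph: undirected IS_FRIEND pairs and (member, team) ON_TEAM pairs.
is_friend = {("u1", "u2"), ("u2", "u3")}
on_team = {("u1", "t1"), ("u4", "t1"), ("u2", "t1")}

def friends(user):
    """Users exactly one IS_FRIEND hop away, in either direction."""
    return {b if a == user else a for a, b in is_friend if user in (a, b)}

def teammates(user):
    """Users two ON_TEAM hops away: user -> team -> other member."""
    teams = {team for member, team in on_team if member == user}
    return {member for member, team in on_team if team in teams and member != user}

def friends_and_teammates(user):
    """Set union removes users who are both a friend and a teammate."""
    return friends(user) | teammates(user)
```

For u1 this returns u2 (a friend who is also a teammate, listed once) and u4 (a teammate only). u3 is a friend of a friend and is left out, which is the behaviour the question asks for.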