Cypher: Create relationships between nodes based on a common property key id - neo4j

I'm brand new to Cypher (and Stack Overflow) and am having trouble creating relationships between nodes based on shared property keys.
I would like to do something like this:
MATCH (a:Person)-->()<--(b:Country)
WHERE HAS (a.id) AND HAS (b.id) AND a.id=b.id
CREATE (a)-[:LIVES]->(b);
to create a relationship between Country node and Person nodes where they share the same id.
The above creates no errors when run but doesn't create any relationships either and I know that the ids should match.
Many thanks!!
EDIT:
I think I know what is going wrong: I'm asking to match nodes that already have a relationship to each other, but no relationships are set up yet, hence 0 results. I have now tried:
MATCH (a:Person),
(b:Country)
WHERE HAS (a.id) AND HAS (b.id) AND a.id=b.id
CREATE (a)-[:LIVES]->(b);
and the query is running. It's a big data set so might take a while......

That worked. Had to reduce the size of my data set (down from 64k nodes) as Neo4j was taking way too long to process but once I had a smaller set it worked fine.
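For anyone hitting the same slowdown: without an index, matching every Person against every Country is a cartesian product, which is probably why the full data set took so long. A minimal sketch of one way around it, assuming the legacy index syntax of the Neo4j 2.x/3.x era (newer releases use CREATE INDEX FOR ... ON ... instead):

```cypher
// Legacy (2.x/3.x) syntax; on newer versions the equivalent is
// CREATE INDEX FOR (c:Country) ON (c.id)
CREATE INDEX ON :Country(id);

// With the index in place, each Person can be looked up against Country by id
// instead of being compared with every Country node.
MATCH (a:Person)
MATCH (b:Country)
WHERE a.id = b.id
MERGE (a)-[:LIVES]->(b);
```

MERGE is used here instead of CREATE so that re-running the query does not add a second :LIVES relationship between the same pair. The HAS checks are dropped because a comparison against a missing property evaluates to null and the row is filtered out anyway.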

One minor addition for future Googlers:
Per the help files as of version 3.4,
the has() function has been superseded by exists() and has been removed.
The new code should read:
MATCH (a:Person),
(b:Country)
WHERE EXISTS (a.id) AND EXISTS (b.id) AND a.id=b.id
CREATE (a)-[:LIVES]->(b);
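The same kind of change happened again later. As far as I know, exists() on a property was deprecated in Neo4j 4.3 and removed in 5.0 in favour of IS NOT NULL. A sketch of the query in that form:

```cypher
// Property existence check written with IS NOT NULL instead of exists()
MATCH (a:Person), (b:Country)
WHERE a.id IS NOT NULL AND b.id IS NOT NULL AND a.id = b.id
CREATE (a)-[:LIVES]->(b);
```

Strictly speaking the two null checks are redundant, since a.id = b.id is never true when either side is missing, but they keep the intent explicit.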

Related

How could i use this SQL on cypher(neo4j)

Hi, how can I translate this SQL query into a Cypher query?
SELECT n.enginetype, n.Rocket20, n.Yearlong, n.DistanceOn
FROM TIMETAB AS n
JOIN PLANEAIR AS p ON (n.tailnum = p.tailNum)
If I need to create a relationship or anything else before running that query, please explain how to do that too. Thanks.
Here's a good guide for comparing SQL with Cypher and showing the equivalent Cypher for some SQL queries.
If we were to translate this directly, we'd use :PLANEAIR and :TIMETAB node labels (though I'd recommend using better names for these), and we'll need a relationship between them. Let's call it :RELATION.
Joins in SQL tend to be replaced with relationships between nodes, so we'll need to create these patterns in your graph:
(:PLANEAIR)-[:RELATION]->(:TIMETAB)
There are several ways to get your data into the graph, usually through LOAD CSV. The general approach is to MERGE your :PLANEAIR and :TIMETAB nodes on some id or unique property (maybe tailNum?), use ON CREATE SET ... after the MERGE to add the rest of the properties to the node when it's created, and then MERGE the relationship between the nodes.
The MERGE section of the developers manual should be helpful here, though I'd recommend reading through the entire dev manual anyway.
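As a rough sketch of that approach: the file names planeair.csv and timetab.csv and their column headers below are assumptions for illustration, not something taken from your data.

```cypher
// Hypothetical file: one row per plane, with a tailNum column.
LOAD CSV WITH HEADERS FROM 'file:///planeair.csv' AS row
MERGE (p:PLANEAIR {tailNum: row.tailNum});

// Hypothetical file: one row per timetable entry, keyed by tailNum.
LOAD CSV WITH HEADERS FROM 'file:///timetab.csv' AS row
MATCH (p:PLANEAIR {tailNum: row.tailNum})
MERGE (n:TIMETAB {tailNum: row.tailNum})
ON CREATE SET n.enginetype = row.enginetype,
              n.Rocket20   = row.Rocket20,
              n.Yearlong   = row.Yearlong,
              n.DistanceOn = row.DistanceOn
MERGE (p)-[:RELATION]->(n);
```

Note that LOAD CSV reads every value as a string, so numeric columns would need toInteger() or toFloat() if you want to compare them as numbers.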
With this in place, the Cypher equivalent query is:
MATCH (p:PLANEAIR)-[:RELATION]->(n:TIMETAB)
RETURN n.enginetype, n.Rocket20, n.Yearlong, n.DistanceOn
Now this is just a literal translation of your SQL query. You may want to reconsider your model, however, as I'm not sure how much value there is in keeping time-related data for a plane separate from its node. You may just want to have all of the :TIMETAB properties on the :PLANEAIR node and do away with the :TIMETAB nodes completely. Of course your queries and use cases should guide how to model that data best.
EDIT
As for creating the relationships between :PLANEAIR and :TIMETAB nodes (again, I recommend using better labels for these, and maybe even keeping all time-related properties on a :Plane node instead of a separate one): provided you already have those nodes created, you'll need to do a joining match. It will help to have unique constraints on :PLANEAIR(tailNum) and :TIMETAB(tailNum) (or indexes, if this isn't supposed to be a unique property):
CREATE CONSTRAINT ON (p:PLANEAIR)
ASSERT p.tailNum IS UNIQUE;
CREATE CONSTRAINT ON (n:TIMETAB)
ASSERT n.tailNum IS UNIQUE;
Now we're ready to create the relationships
MATCH (p:PLANEAIR)
MATCH (n:TIMETAB)
WHERE p.tailNum = n.tailNum
CREATE (p)-[:RELATION]->(n)
REMOVE n.tailNum
Now that the relationships are created, and :TIMETAB tailNum property removed, we can drop the unique constraint on :TIMETAB(tailNum), since the relationship to :PLANEAIR is all we need.
DROP CONSTRAINT ON (n:TIMETAB)
ASSERT n.tailNum IS UNIQUE

How do I refactor data two neo4j nodes to a relationship?

I'm experimenting with a graph database (neo4j). I have two CSVs that I imported into a neo4j datastore. I'm a little shaky on the neo terminology, so forgive me. Let's say I have:
Customer (AccountNumber, CustomerName) and
CustomerGroups (AccountNumber, GroupName).
I would like to create a new Node called groups which is comprised of the distinct GroupName from CustomerGroups. I'll call it Group.
I then want to create relationships "HAS_GROUP" from Customer to Group using the common AccountNumber from CustomerGroups.
Once the above is completed, I could delete CustomerGroups as it's no longer needed.
I'm just stuck at the syntax. I can get the distinct groups from CustomerGroups with:
MATCH (n:CustomerGroups) RETURN DISTINCT n.GROUP_NAME
and I get back about 50 distinct groups, but can't figure how to add the create statement to the results and CREATE g:Group {GroupName: n.GROUP_NAME}
I then know my followup question is how to do the MATCH to the new group using the old table with common account numbers.
FYI: I've indexed the AccountNumber in both Nodes. Both Customer and CustomerGroups have over 5 Million nodes. Not bad for a laptop (2 min to import using neo4j-import). I was impressed!
Thanks for any help you can give!
Instead of creating a CustomerGroups label and creating nodes for that, you should be able to define relationships that you would like to create in your neo4j-import. It would certainly be a lot faster too. See:
http://neo4j.com/docs/stable/import-tool-header-format.html
To your question, you could probably do something like:
MATCH (cg:CustomerGroups)
MATCH (customer:Customer {AccountNumber: cg.AccountNumber}), (group:Group {GroupName: cg.GroupName})
CREATE (customer)-[:IN_GROUP]->(group)
You'd definitely want to make sure you have indexes on :Customer(AccountNumber) and :Group(GroupName) first. But even then it would still be much slower than doing it as part of your initial import.
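Also, you may or may not want MERGE instead of CREATE. MERGE avoids creating a duplicate relationship if the same customer and group are matched more than once.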
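Putting the whole refactoring together, a minimal sketch might look like the following. It assumes the property is called GroupName (the question also uses GROUP_NAME, so adjust to whatever your import produced) and uses the HAS_GROUP relationship name from the question.

```cypher
// 1. Create one :Group node per distinct group name.
MATCH (cg:CustomerGroups)
WITH DISTINCT cg.GroupName AS groupName
MERGE (:Group {GroupName: groupName});

// 2. Link each customer to its group via the shared AccountNumber.
MATCH (cg:CustomerGroups)
MATCH (c:Customer {AccountNumber: cg.AccountNumber})
MATCH (g:Group {GroupName: cg.GroupName})
MERGE (c)-[:HAS_GROUP]->(g);

// 3. Once the relationships exist, drop the helper nodes.
MATCH (cg:CustomerGroups)
DETACH DELETE cg;
```

With around 5 million nodes, steps 2 and 3 will likely be too large for a single transaction, so in practice you would run them in batches (for example by adding a LIMIT and repeating).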

neo4j relationships between nodes using a common property (id) without creating more nodes

I have been trying to match together two different kinds of nodes, (Process) and (ProcessFramework), that have the same id. I want to set a relationship between them (called :SAME).
The query works but it creates 3 times the number of processes that I wanted and I can't figure out why:
MATCH (p:Process)
MATCH (pcf:ProcessFramework)
WHERE HAS (p.id) AND HAS (pcf.pcf_id) AND p.id=pcf.pcf_id
MERGE (p)<-[:same]-(pf)
return p,pcf
I also tried it with CREATE UNIQUE instead of MERGE but got the same result. I am not a developer, so maybe I am making a very obvious mistake, but I really can't see it!
Thanks!
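You have a typo in your query: you use pf where you should use pcf in the MERGE. Because pf is not bound to anything, MERGE treats it as a new node in the pattern and creates an extra, unlabeled node whenever it finds no existing match.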
I'd also change it to this and make sure you have an index on :ProcessFramework(pcf_id)
MATCH (p:Process)
MATCH (pcf:ProcessFramework)
WHERE p.id=pcf.pcf_id
MERGE (p)<-[:same]-(pcf)
return p,pcf
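A short sketch of the index creation, plus one way to find and remove the stray nodes the typo left behind. This assumes the legacy 2.x/3.x index syntax and that nothing other than :ProcessFramework nodes is supposed to point at a Process via :same.

```cypher
// Legacy syntax; newer versions use CREATE INDEX FOR (n:ProcessFramework) ON (n.pcf_id)
CREATE INDEX ON :ProcessFramework(pcf_id);

// Inspect first: how many non-ProcessFramework nodes are attached via :same?
MATCH (p:Process)<-[r:same]-(x)
WHERE NOT x:ProcessFramework
RETURN count(x);

// If that number matches the unexpected extras, remove them and their relationships.
MATCH (p:Process)<-[r:same]-(x)
WHERE NOT x:ProcessFramework
DELETE r, x;
```

The DELETE will fail rather than silently remove data if one of those nodes still has other relationships, which is a useful safety net here.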

Creating unique relationships in Neo4j using py2neo get_or_create

I am trying to create a graph from some nodes and relationships using py2neo. Say I'm creating a family tree.
I am creating nodes using get_or_create() so that my script doesn't create duplicates if I supply the same value again.
How can I do the same for a relationship? I can't find any reference to a get_or_create() like function for a relationship.
I want to publish (Joe)-[:son]->(John)
the first time it creates 2 nodes joe and john and a link between them.
if I re run my script, as the nodes are unique, they aren't published but a new relationship is created.
This gives me a graph with 2 nodes and n relationships where n is the number of times I run the script.
I also tried using cypher and I get the same issue. It keeps on creating relationships.
Can anyone suggest a way to solve this problem?
Thanks.
I don't know py2neo (so there may be a wrapper for this function), but the way to achieve that would be using a Cypher MERGE which it looks like you have to run using the raw Cypher statement:
cypher_merge_result = neo4j.CypherQuery(graph_db,
"MERGE (s:Person{name:'Joe'})-[:SON]->(f:Person{name:'John'})")
That will create 2 Person nodes and 1 SON relationship, no matter how many times that you run it.
The Neo4j documentation is here, and it is important to understand how MERGE works: a partial match of the MERGE pattern causes the whole pattern to be created. So if your Person nodes already exist, you should match them in advance to avoid creating duplicate Persons, i.e.:
cypher_merge_result = neo4j.CypherQuery(graph_db,
"MATCH (s:Person{name:'Joe'}), (f:Person{name:'John'}) MERGE (s)-[:SON]->(f)")
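If you can't be sure whether the Person nodes exist yet, a common pattern is to MERGE each node on its own and then MERGE the relationship. A minimal sketch of the raw Cypher, which you would pass as the query string in the same way as above:

```cypher
// Each MERGE matches or creates just one piece of the pattern,
// so existing nodes are reused and nothing is duplicated on re-runs.
MERGE (s:Person {name: 'Joe'})
MERGE (f:Person {name: 'John'})
MERGE (s)-[:SON]->(f)
```

This works whether zero, one, or both of the nodes already exist, at the cost of assuming that name alone identifies a Person.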

Assumptions regarding Node ID strings in Neo4j - cypher

In my recent question, Modeling conditional relationships in neo4j v.2 (cypher), the answer has led me to another question regarding my data model and the Cypher syntax to represent it. Let's say that in my model there is a node CLT1 that I'll call the Source node. CLT1 has relationships to 286 other Target nodes. This is a model of a target node:
CREATE
(Abnormally_high:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{Prop1:'x',Prop2:'y',Prop3:'z'})
Key point: I am assuming the string after the CREATE clause is the ID of this target node.
The ID is significant because its content has domain-specific meaning and is queryable.
In this case it's the phrase "Abnormally_high".
I made this assumption based on the movie database example.
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
The first strings after CREATE definitely have domain-specific meaning!
In my earlier post I discuss Problem 2. I find that Problem 2 arises because among the 286 target nodes, there are many instances where at least one other Target node shares the identical ID. In this instance, the ID is "Abnormally_high". The other Target nodes may differ in the value of any of Label1 - Label10 or the associated properties.
Apparently, Cypher doesn't like that. In Problem 2, I was discussing ways to deal with the fact that Cypher doesn't like using the same node ID multiple times even though the labels or properties are different.
My problem is my assumptions about the Target node ID.
AM I RIGHT?
I am now thinking that I could instead use this....
CREATE (CLT1_target_1:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{name:'Abnormally_high',Prop2:'y',Prop3:'z'})
If indeed the first string after the CREATE clause is an ID, then all I have to do is put a unique target node identifier.... like CLT1_target_1 and increment up to CLT1_target_286. If I do this, then I can have the name as a property and change whatever label or property I want.
Do I have this right?
You are wrong. In Cypher, a node name (like "Abnormally_high") is just a variable name that exists for the lifetime of the query (and sometimes not even that long). The node name used in a Cypher query is never persisted in any way, and can be any valid identifier. That is also why the same name cannot be reused for two different nodes within one query.
Also, in neo4j, the term "ID" has a specific meaning. The neo4j DB will automatically assign a (currently) unique integer ID to each new node. You have no control over the ID value assigned to a node. And when a node is deleted, neo4j can reassign its ID to a new node.
You should read the neo4j manual (available at docs.neo4j.org), especially the section on Cypher, to get a better understanding.
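To make that concrete, here is a small sketch (the labels and property values are only illustrative). The variables x and y disappear once the statement finishes; what can be queried afterwards is the name property.

```cypher
// x and y are throwaway variables; only labels and properties are stored.
CREATE (x:Label1 {name: 'Abnormally_high', Prop2: 'y'})
CREATE (y:Label2 {name: 'Abnormally_high', Prop2: 'q'});

// Both nodes share the same name property but get different internal IDs.
MATCH (n {name: 'Abnormally_high'})
RETURN id(n), labels(n), n.Prop2;
```

So storing the domain-specific phrase in a property such as name, as in your second attempt, is the right direction; the variable itself (CLT1_target_1 and so on) only needs to be unique within a single query. Note that id() is the function from the neo4j versions discussed here; I believe recent releases deprecate it in favour of elementId().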
