Creating Nodes and Relationships, after Matching Nodes using ID - neo4j

I have 3 types of nodes - Challenge, Entry and User, and 2 relationships between these nodes: (Entry)-[POSTED_BY]->(User) and (Entry)-[PART_OF]->(Challenge).
Here is what I'm trying to accomplish:
- I have an existing Challenge, and I can identify it by its ID (internal neo4j id)
- I also have an existing User, and I can identify it by its id (not the same as neo4j internal id)
- Given the existing Challenge and User, I need to create a new Entry node, and link it with the Challenge node with a PART_OF relationship.
- In addition, I need to link the new Entry with the User node by the POSTED_BY relationship
- I want to achieve the above in a single Cypher query statement, if possible
This is what I'm trying:
MATCH (c:Challenge)
WHERE (id(c) = 240),
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;
However, this is failing as it seems I cannot match two nodes with the above syntax. However, if I match the Challenge with a non-internal property, say name, it seems to work:
MATCH (c:Challenge {name: "Challenge Name"),
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;
However, as I mentioned above, I want to match the neo4j internal Node ID for the Challenge, and I'm not sure if there's a way to match that other than by using the WHERE id(c) = 232 clause.

Your syntax is almost correct, but you don't need the comma between the WHERE and the next MATCH.
MATCH (c:Challenge)
WHERE id(c) = 240
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;

Related

Conditional partial merge of pattern into graph

I'm trying to create a relationship that connects a person to a city -> state -> country without recreating the city/state/country nodes and relationships if they do already exist - so I'd end-up with only one USA node in my graph for example
I start with a person
CREATE (p:Person {name:'Omar', Id: 'a'})
RETURN p
then I'd like to turn this into an apoc.do.case statement with apoc
or turn it into one merge statement using unique the constraint that creates a new node if no node is found or otherwise matches an existing node
// first case where the city/state/country all exist
MATCH (locality:Locality{name:"San Diego"})-[:SITUATED_IN]->(adminArea:AdministrativeArea { name: 'California' })-[:SITUATED_IN]->(country:Country { name: 'USA' })
MERGE (p)-[:SITUATED_IN]->(locality)-[:SITUATED_IN]->(adminArea)-[:SITUATED_IN]->(country)
return p
// second case where only state/country exist
MATCH (adminArea:AdministrativeArea { name: 'California' })-[:SITUATED_IN]->(country:Country { name: 'USA' })
MERGE (p)-[:SITUATED_IN]->(locality:Locality{name:"San Diego"})-[:SITUATED_IN]->(adminArea)-[:SITUATED_IN]->(country)
return p
// third case where only country exists
MATCH (country:Country { name: 'USA' })
MERGE (p)-[:SITUATED_IN]->(locality:Locality{name:"San Diego"})-[:SITUATED_IN]->(adminArea:AdministrativeArea { name: 'California' })-[:SITUATED_IN]->(country)
return p
// last case where none of city/state/country exist, so I have to create all nodes + relations
MERGE (p)-[:SITUATED_IN]->(locality:Locality{name:"San Diego"})-[:SITUATED_IN]->(adminArea:AdministrativeArea { name: 'California' })-[:SITUATED_IN]->(country:Country { name: 'USA' })
return p
The key here is I only want to end-up with one (California)->(USA). I don't want those nodes & relationships to get duplicated
Your queries that use MATCH never specify which Person you want. Variable names like p only exist for the life of a query (and sometimes not even that long). So p is unbound in your MATCH queries, and can result in your MERGE clauses creating empty nodes. You need to add MATCH (p:Person {Id: 'a'}) to the start of those queries (assuming all people have unique Id values).
It should NOT be the responsibility of every single query to ensure that all needed localities exist and are connected correctly -- that is way too much complexity and overhead for every query. Instead, you should create the appropriate localities and inter-locality relationships separately -- before you need them. If fact, it should be the responsibility of each query that creates a locality to create all the relationships associated with it.
A MERGE will only not create the specified pattern if every single thing in the pattern already exists, so to avoid duplicates a MERGE pattern should have at most 1 thing that might not already exist. So, a MERGE pattern should have at most 1 relationship, and if it has a relationship then the 2 end nodes should already be bound (by MATCH clauses, for example).
Once the Locality nodes and the inter-locality relationships exist, you can add a person like this:
MATCH (locality:Locality {name: "San Diego"})
MERGE (p:Person {Id: 'a'}) // create person if needed, specifying a unique identifier
ON CREATE SET p.name = 'Omar'; // set other properties as needed
MERGE (p)-[:SITUATED_IN]->(locality) // create relationship if necessary
The above considerations should help you design the code for creating the Locality nodes and the inter-locality relationships.
Finally, the solution I used is much simpler, it's a series of merges.
match (person:Person {Id: 'Omar'}) // that should be present in the graph
merge (country:Country {name: 'USA'})
merge (state:State {name: 'California'})-[:SITUATED_IN]->(country)
merge (city:City {name: 'Los Angeles'})-[:SITUATED_IN]->(state)
merge (person)-[:SITUATED_IN]->(city)
return person;

Is there a way to Traverse neo4j Path using Cypher for different relationships

I want to Traverse a PATH in neo4j (preferably using Cypher, but I can write neo4j managed extensions).
Problem -
For any starting node (:Person) I want to traverse hierarchy like
(me:Person)-[:FRIEND|:KNOWS*]->(newPerson:Person)
if the :FRIEND outgoing relationship is present then the path should traverse that, and ignore any :KNOWS outgoing relationships, if :FRIEND relationship does not exist but :KNOWS relationship is present then the PATH should traverse that node.
Right now the problem with above syntax is that it returns both the paths with :FRIEND and :KNOWS - I am not able to filter out a specific direction based on above requirement.
1. Example data set
For the ease of possible further answers and solutions I note my graph creating statement:
CREATE
(personA:Person {name:'Person A'})-[:FRIEND]->(personB:Person {name: 'Person B'}),
(personB)-[:FRIEND]->(personC:Person {name: 'Person C'}),
(personC)-[:FRIEND]->(personD:Person {name: 'Person D'}),
(personC)-[:FRIEND]->(personE:Person {name: 'Person E'}),
(personE)-[:FRIEND]->(personF:Person {name: 'Person F'}),
(personA)-[:KNOWS]->(personG:Person {name: 'Person G'}),
(personA)-[:KNOWS]->(personH:Person {name: 'Person H'}),
(personH)-[:KNOWS]->(personI:Person {name: 'Person I'}),
(personI)-[:FRIEND]->(personJ:Person {name: 'Person J'});
2. Scenario "Optional Match"
2.1 Solution
MATCH (startNode:Person {name:'Person A'})
OPTIONAL MATCH friendPath = (startNode)-[:FRIEND*]->(:Person)
OPTIONAL MATCH knowsPath = (startNode)-[:KNOWS*]->(:Person)
RETURN friendPath, knowsPath;
If you do not need every path to all nodes of the entire path, but only the whole, I recommend using shortestPath() for performance reasons.
2.1 Result
Note the missing node 'Person J', because it owns a FRIENDS relationship to node 'Person I'.
3. Scenario "Expand paths"
3.1 Solution
Alternatively you could use the Expand paths functions of the APOC user library. Depending on the next steps of your process you can choose between the identification of nodes, relationships or both.
MATCH (startNode:Person {name:'Person A'})
CALL apoc.path.subgraphNodes(startNode,
{maxLevel: -1, relationshipFilter: 'FRIEND>', labelFilter: '+Person'}) YIELD node AS friendNodes
CALL apoc.path.subgraphNodes(startNode,
{maxLevel: -1, relationshipFilter: 'KNOWS>', labelFilter: '+Person'}) YIELD node AS knowsNodes
WITH
collect(DISTINCT friendNodes.name) AS friendNodes,
collect(DISTINCT knowsNodes.name) AS knowsNodes
RETURN friendNodes, knowsNodes;
3.2 Explanation
line 1: defining your start node based on the name
line 2-3: Expand from the given startNode following the given relationships (relationshipFilter: 'FRIEND>') adhering to the label filter (labelFilter: '+Person').
line 4-5: Expand from the given startNode following the given relationships (relationshipFilter: 'KNOWS>') adhering to the label filter (labelFilter: '+Person').
line 7: aggregates all nodes by following the FRIEND relationship type (omit the .name part if you need the complete node)
line 8: aggregates all nodes by following the KNOWS relationship type (omit the .name part if you need the complete node)
line 9: render the resulting groups of nodes
3.3 Result
╒═════════════════════════════════════════════╤═════════════════════════════════════════════╕
│"friendNodes" │"knowsNodes" │
╞═════════════════════════════════════════════╪═════════════════════════════════════════════╡
│["Person A","Person B","Person C","Person E",│["Person A","Person H","Person G","Person I"]│
│"Person D","Person F"] │ │
└─────────────────────────────────────────────┴─────────────────────────────────────────────┘
MATCH p = (me:Person)-[:FRIEND|:KNOWS*]->(newPerson:Person)
WITH p, extract(r in relationships(p) | type(r)) AS types
RETURN p ORDER BY types asc LIMIT 1
This is a matter of interrogating the types of outgoing relationships for each node and then making a prioritized decision on which relationships to retain leveraging some nested case logic.
Using the small graph above
MATCH path = (a)-[r:KNOWS|FRIEND]->(b)
WITH a, COLLECT([type(r),a,r,b]) AS rels
WITH a,
rels,
CASE WHEN filter(el in rels WHERE el[0] = "FRIEND") THEN filter(el in rels WHERE el[0] = "FRIEND")
ELSE CASE WHEN filter(el in rels WHERE el[0] = "KNOWS") THEN filter(el in rels WHERE el[0] = "KNOWS") ELSE [''] END END AS search
UNWIND search AS s
RETURN s[1] AS a, s[2] AS r, s[3] AS b
I believe this returns your expected result:
Based on your logic, there should be no traversal to Person G or Person H from Person A, as there is a FRIEND relationship from Person A to Person B that takes precedence.
However there is a traversal from Person H to Person I because of the existence of the singular KNOWS relationship, and then a subsequent traversal from Person I to Person J.

Neo4j Passing distinct nodes through WITH in Cypher

I have the following query, where there are 3 MATCHES, connected with WITH, searching through 3 paths.
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(ent:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, ent
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH id, ent
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE id.name = sym.name
RETURN id.name
The query returns two distinct id and one distinct entry, and 7 distinct sym.
The problem is that since in the second MATCH I pass "WITH id, entry", and two distinct id were found, two instances of entry is passed to the third match instead of 1, and the run time of the third match unnecessarily gets doubled at least.
I am wondering if anyone know how I should write this query to just make use of one single instance of entry.
Your best bet will be to aggregate id, but then you'll need to adjust your logic in the third part of your query accordingly:
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(ent:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, ent
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH collect(id.name) as names, ent
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE sym.name in names
RETURN sym.name

CREATE UNIQUE in neo4j produces duplicate nodes

According to the neo4j documentation:
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match
what it can, and create what is missing. CREATE UNIQUE will always
make the least change possible to the graph — if it can use parts of
the existing graph, it will.
This sounds great, but CREATE UNIQUE doesn't seem to follow the 'least possible change' rule. e.g., here is some Cypher to create two people:
CREATE (n:Person {name: 'Alice'})
CREATE (n:Person {name: 'Bob'})
CREATE INDEX ON :Person(name)
and here's two CREATE UNIQUE statements, to create a relationship between those people. Since both people already exist in the graph, only the relationships should be newly created:
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)-[:knows]->(b:Person {name: 'Bob'})
RETURN a
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)<-[:knows]-(b:Person {name: 'Bob'})
RETURN a
After this, the graph should look like
(Alice)<---KNOWS--->(Bob).
But when you run a MATCH query:
MATCH (a:Person)
RETURN a
it seems that the graph now looks like
(Bob)
(Bob)--KNOWS-->(Alice)--KNOWS-->(Bob);
two extra Bobs have been created.
I looked a bit through the other Cypher commands, but none of them seem intended for this use case: create a link between existing node A and existing node B if B exists, and otherwise create a link between existing node A and a newly created node B. How can this problem best be solved within the Cypher framework?
This query should do what you want (if you always want to end up with a single knows relationship between the 2 nodes):
MATCH (a:Person {name: 'Alice'})
MERGE (b:Person {name: 'Bob'})
MERGE (a)-[:knows]->(b)
RETURN a;
Here is how you can do it with CREATE UNIQUE
MATCH (a:Person {name: 'Alice'}), (b:Person {name:'Bob'})
CREATE UNIQUE (a)-[:knows]->(b), (b)-[:knows]->(a)
You need 2 match clauses otherwise you are always creating the node in the CREATE UNIQUE statement, not matching existing nodes.

Cypher, create unique relationship for one node

I want to add a "created by" relationship on nodes in my database. Any node should be able of having this relationship but there can never be more than one.
Right now my query looks something like this:
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.name='Node name', n.attribute='value'
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
As I have understood Cypher there is no way to achieve what I want, the current query will only make sure there are no unique relationships based on TWO nodes, not ONE. So, this will create more CREATED_BY relationships when run for another User and I want to limit the outgoing CREATED_BY relationship to just one for all nodes.
Is there a way to achieve this without running multiple queries involving program logic?
Thanks.
Update
I tried to simplyfy the query by removing implementation details, if it helps here's the updated query based on cybersams response.
MERGE (c:Company {name: 'Test Company'})
ON CREATE SET c.uuid='db764628-5695-40ee-92a7-6b750854ebfa', c.created_at='2015-02-23 23:08:15', c.updated_at='2015-02-23 23:08:15'
WITH c
OPTIONAL MATCH (c)
WHERE NOT (c)-[:CREATED_BY]-()
CREATE (c)-[:CREATED_BY {date: '2015-02-23 23:08:15'}]->(u:User {token: '32ba9d2a2367131cecc53c310cfcdd62413bf18e8048c496ea69257822c0ee53'})
RETURN c
Still not working as expected.
Update #2
I ended up splitting this into two queries.
The problem I found was that there was two possible outcomes as I noticed.
The CREATED_BY relationship was created and (n) was returned using OPTIONAL MATCH, this relationship would always be created if it didn't already exist between (n) and (u), so when changing the email attribute it would re-create the relationship.
The Node (n) was not found (because of not using OPTIONAL MATCH and the WHERE NOT (c)-[:CREATED_BY]-() clause), resulting in no relationship created (yay!) but without getting the (n) back the MERGE query looses all it's meaning I think.
My Solution was the following two queries:
MERGE (n:Node {name: 'Name'})
ON CREATE SET
SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)-[r:CREATED_BY]-()
RETURN c, r
Then I had program logic check the value of r, if there was no relationship I would run the second query.
MATCH (n:Node {name: 'Name'})
MATCH (u:User {email: 'my#email.com'})
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
Unfortunately I couldn't find any real solution to combining this in one single query with Cypher. Sam, thanks! I've selected your answer even though it didn't quite solve my problem, but it was very close.
This should work for you:
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)
WHERE NOT (n)-[:CREATED_BY]->()
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(:User {email: 'my#mail.com'})
RETURN n;
I've removed the starting MATCH clause (because I presume you want to create a CREATED_BY relationship even when that User does not yet exist in the DB), and simplified the ON CREATE to remove the redundant setting of the name property.
I have also added an OPTIONAL MATCH that will only match an n node that does not already have an outgoing CREATED_BY relationship, followed by a CREATE UNIQUE clause that fully specifies the User node.

Resources