Linking Optional Nodes with Cypher Query in Neo4j 2.0 - neo4j

I'm trying to figure out the correct way to attach newly created nodes to additional nodes that may or may not exist. Basically, CREATE A and if B exists, LINK B to A and RETURN A. If B doesn't exist just RETURN A.
This is my Cypher query (the extra WITH clauses are because this is part of a larger query and I'm trying to make sure this sample code works the same way):
CREATE (a:A { foo: "bar" })
WITH a
OPTIONAL MATCH (b:B)
WHERE a.foo = b.foo
CREATE UNIQUE b-[:LINK]->a
WITH a
RETURN a
This doesn't work because CREATE UNIQUE fails when b is NULL. Other than breaking it up into multiple queries, is there a way to accomplish this?

I think you need to hack it with foreach...
CREATE (a:A { foo: "bar" })
WITH a
OPTIONAL MATCH (b:B)
WHERE a.foo = b.foo
WITH a, collect(b) as bs
FOREACH(b in bs | CREATE UNIQUE b-[:LINK]->a)
WITH a
RETURN a
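On Neo4j versions where CREATE UNIQUE is no longer available (it was removed in 4.0), the same collect-and-FOREACH trick can be sketched with MERGE instead. This is a minimal sketch under that assumption, not something tested against the larger query from the question:

```cypher
CREATE (a:A { foo: "bar" })
WITH a
OPTIONAL MATCH (b:B)
WHERE a.foo = b.foo
// collect() skips nulls, so bs is an empty list when no B matched
WITH a, collect(b) AS bs
// MERGE stands in for CREATE UNIQUE here; the body runs once per matched b
FOREACH (b IN bs | MERGE (b)-[:LINK]->(a))
RETURN a
```

When no B matches, the FOREACH body never runs, while the row carrying a survives and is returned.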

Related

apoc.load.jdbc check row property before create

I'm using apoc.load.jdbc to get data from an Oracle database and create nodes from the rows. Here is the code:
call apoc.load.driver('oracle.jdbc.driver.OracleDriver')
WITH "jdbc:oracle:thin:@10.82.14.170:1521/ORACLE" as url
CALL apoc.load.jdbc(url,"select * from Patients",[],{credentials:{user:'KCB',password:'123'}}) YIELD row
Create (p:Person) set p=row
return p
That code works fine, but I want to check a row property before setting it. Something like:
If (row.ID!=p.ID)
{
set p=row
}
Else{Not set}
How can I do that with my code? Thanks a lot!
As @TomažBratanič mentions in his answer, your desired conditional check makes no sense. That is, unless you also replace your CREATE clause.
Your query uses CREATE to always create a new p with no properties. So row.ID <> p.ID will always be true, and you'd always be executing the SET clause.
However, I believe your real intention is to avoid changing an existing Person node (and to avoid creating a duplicate Person node for the same person). So, below is a query that uses MERGE and ON CREATE to do that. I assume that people have unique ID values.
CALL apoc.load.driver('oracle.jdbc.driver.OracleDriver')
CALL apoc.load.jdbc(
"jdbc:oracle:thin:@10.82.14.170:1521/ORACLE",
"select * from Patients",[],{credentials:{user:'KCB',password:'123'}}
) YIELD row
MERGE (p:Person {ID: row.ID})
ON CREATE SET p = row
RETURN p
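To see the effect of ON CREATE without the JDBC source, here is a small sketch that feeds two hand-written rows through the same MERGE. The row data (ID 1 and the name values) is made up for illustration:

```cypher
// Hypothetical rows standing in for the JDBC result; both share ID 1.
UNWIND [{ID: 1, name: "first load"}, {ID: 1, name: "second load"}] AS row
MERGE (p:Person {ID: row.ID})
// Runs only for the row that actually creates the node.
ON CREATE SET p = row
RETURN p.ID AS id, p.name AS name
```

The first row should create the node and copy its properties. The second row should match the existing node, skip the ON CREATE branch, and leave name unchanged.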
Also, you should consider creating an index (or uniqueness constraint) on :Person(ID) to optimize the lookup of existing Person nodes.
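A sketch of that uniqueness constraint follows. The exact syntax depends on the Neo4j version, so treat both forms as assumptions to check against your server. The constraint name person_id_unique is made up:

```cypher
// Older syntax (Neo4j 3.x and 4.x):
CREATE CONSTRAINT ON (p:Person) ASSERT p.ID IS UNIQUE;

// Newer syntax (Neo4j 4.4 and later); person_id_unique is a hypothetical name:
CREATE CONSTRAINT person_id_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.ID IS UNIQUE;
```

A uniqueness constraint is backed by an index, so a separate index on :Person(ID) is not needed once the constraint exists.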
You can use a CASE expression to achieve this:
call apoc.load.driver('oracle.jdbc.driver.OracleDriver')
WITH "jdbc:oracle:thin:@10.82.14.170:1521/ORACLE" as url
CALL apoc.load.jdbc(url,"select * from Patients",[],{credentials:{user:'KCB',password:'123'}}) YIELD row
Create (p:Person) set p = CASE WHEN row.ID <> p.id THEN row ELSE null END
return p
However, this statement does not make sense, because you always create a new Person, so the row.ID will never be the same as p.id.

Cypher Query where 2 different labels do not contain a relationship to a 3rd label/node

I have 3 labels, A, B, and Z. A & B both have a relationship to Z. I want to find all the A nodes that do not share any Z nodes with B.
Currently, a normal query where the relationship DOES exist works.
MATCH (a:A)-[:rel1]->(z:Z)<-[:rel2]-(b:B { uuid: {<SOME ID>} })
RETURN DISTINCT a
But when I do
MATCH (a:A)
WHERE NOT (a)-[:rel1]->(z:Z)<-[:rel2]-(b:B { uuid: {<SOME ID>} })
RETURN DISTINCT a
It throws an error
Neo4j::Server::CypherResponse::ResponseError: z not defined
Not sure if the syntax for this is incorrect, I tried WHERE NOT EXIST() but no luck.
The query is part of a larger one called through a rails app using neo4jrb / (Neo4j::Session.query)
This is a problem to do with the scope of your query. When you describe a node in a MATCH clause like the below
MATCH (n:SomeLabel)
You're telling Cypher to look for a node with the label SomeLabel and bind it to the variable n for the rest of the query. At the end of the query, you can return the values stored in this node using RETURN n (unless you drop n by not including it in a WITH clause).
Later on in your query, if you want to MATCH another node, you can do it in reference to n, so for example:
MATCH (m:SomeOtherLabel)-[:SOME_RELATIONSHIP]-(n)
This will match a node with the label SomeOtherLabel that is connected (in any direction) to n, and bind it to the variable m for the rest of the query.
You can only assign nodes to variables like this in MATCH, OPTIONAL MATCH, MERGE, CREATE and (sort of) in WITH and UNWIND clauses (someone correct me here if I've missed one, I suppose you also do this in list comprehensions and FOREACH clauses).
In your second query, you are trying to find a node with the label A which is not connected to a node with the label Z. However, the way you have written the query, you are actually saying: find a node with label A which is not connected via a rel1 relationship to the node already stored as z. This fails (and as shown, Neo4j complains that z is not defined), because you can't introduce a new variable like this in the WHERE clause.
To correct the error, remove the variable z and make sure the variable b is bound to your node before the WHERE clause. You can still keep the :Z label in the pattern, like the below.
MATCH (a:A)
MATCH (b:B { uuid: {<SOME ID>} })
WHERE NOT (a)-[:rel1]->(:Z)<-[:rel2]-(b) // changed this line
RETURN DISTINCT a
And with a bit of luck, this will now work.
You get the error because z is an identifier used in the WHERE clause that has not been bound earlier in the query.
Since you know b already I would match it first and then use it in your where clause. You don't need to assign :Z an identifier, simply using the node label will suffice.
MATCH (b:B { uuid: {<SOME ID>} })
WITH b
MATCH (a:A)
WHERE NOT (a)-[:rel1]->(:Z)<-[:rel2]-(b)
RETURN DISTINCT a
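On newer Neo4j versions (4.0 and later, as far as I know), an existential subquery is another way to write the same filter. Inside EXISTS { } new variables such as z are allowed, which is exactly what the plain WHERE pattern rejects. A sketch, assuming the id is passed as a parameter named $uuid (a hypothetical name) in the newer $param syntax:

```cypher
MATCH (b:B { uuid: $uuid })
MATCH (a:A)
// z is local to the subquery, so it does not need to be bound beforehand.
WHERE NOT EXISTS {
  MATCH (a)-[:rel1]->(z:Z)<-[:rel2]-(b)
}
RETURN DISTINCT a
```

On older versions that lack this syntax, the pattern predicate with an anonymous (:Z) node remains the way to go.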

Neo4j Cypher: Create a relationship only if the end node exists

Building on this similar question, I want the most performant way to handle this scenario.
MERGE (n1{id:<uuid>})
SET n1.topicID = <unique_name1>
IF (EXISTS((a:Topic{id:<unique_name1>})) | CREATE UNIQUE (n1)-[:HAS]->(a))
MERGE (n2{id:<uuid>})
SET n2.topicID = <unique_name2>
IF (EXISTS((a:Topic{id:<unique_name2>})) | CREATE UNIQUE (n2)-[:HAS]->(a))
Unfortunately, IF doesn't exist, and EXISTS can't be used to match or find a unique node.
I can't use OPTIONAL MATCH, because then CREATE UNIQUE will throw a null exception (as much as I wish it would ignore null parameters)
I can't use MATCH, because if the topic doesn't exist, I will lose all my rows.
I can't use MERGE, because I don't want to create the node if it doesn't exist yet.
I can't use APOC, because I have no guarantee that it will be available for use on our Neo4j server.
The best solution I have right now is
MERGE (a:TEST{id:1})
WITH a
OPTIONAL MATCH (b:TEST{id:2})
// collect b so that there are no nulls, and rows aren't lost when no match
WITH a, collect(b) AS c
FOREACH(n IN c | CREATE UNIQUE (a)-[:HAS]->(n))
RETURN a
However, this seems kinda complicated and needs 2 WITHs for what is essentially CREATE UNIQUE RELATION if start and end node exist (and in the plan there is an eager). Is it possible to do any better? (Using Cypher 3.1)
You can simplify this quite a bit:
MERGE (a:TEST{id:1})
WITH a
MATCH (b:TEST{id:2})
CREATE UNIQUE (a)-[:HAS]->(b)
RETURN a;
The (single) WITH clause serves to split the query into 2 "sub-queries".
So, if the MATCH sub-query fails, it only aborts its own sub-query (and any subsequent ones) but does not roll back the previous successful MERGE sub-query.
Note, however, that whenever a final sub-query fails, the RETURN clause would return nothing. You will have to determine if this is acceptable.
Because the above RETURN clause would only return something if b exists, it might make more sense for it to return b, or the path. Here is an example of the latter (p will be assigned a value even if the path already existed):
MERGE (a:TEST{id:1})
WITH a
MATCH (b:TEST{id:2})
CREATE UNIQUE p=(a)-[:HAS]->(b)
RETURN p;
[UPDATE]
In neo4j 4.0+, CREATE UNIQUE is no longer supported, so MERGE needs to be used instead.
Also, if you want to return a even if b does not exist, you can use the APOC procedure apoc.do.when:
MERGE (a:TEST{id:1})
WITH a
OPTIONAL MATCH (b:TEST{id:2})
CALL apoc.do.when(
b IS NOT NULL,
'MERGE (a)-[:HAS]->(b)',
'',
{a: a, b: b}) YIELD value
RETURN a;
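If APOC is not available (as in the question), a correlated subquery can play the same role on Neo4j 4.1 and later. This is a sketch under that version assumption. The alias linked is a made-up name and is only there because this form of subquery has to return something:

```cypher
MERGE (a:TEST {id: 1})
WITH a
CALL {
  WITH a
  MATCH (b:TEST {id: 2})
  MERGE (a)-[:HAS]->(b)
  // Aggregating always yields exactly one row, even when no b matched,
  // so the outer row for a is kept.
  RETURN count(b) AS linked
}
RETURN a, linked
```

Here linked is 0 when b does not exist, so the caller can tell whether the relationship was handled without losing a.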

neo4j fail a query on a missing match instead of returning empty array

Given the below cypher query
MATCH (u:User {id: {userId}}), (b:B {id: {bId}})
CREATE (c:C), (c)-[:HAS_USER]->(u), (b)-[:SOME_REL]->(c)
As you can see, I'm creating a node C that must have relationships with 2 things, some node b and some user.
What happens is I get an empty array when u or b does not exist, but I'd like neo4j to respond with a fail instead of an empty array. That makes things easier for me to know which node is missing. Is it possible to 'force' a fail when the match clause doesn't return anything?
That is how it works: if the MATCH finds nothing, it produces no rows, so the rest of the query (including the CREATE) does not run. That is why OPTIONAL MATCH exists: it yields null for the missing parts instead of dropping the row.
Edit: add return at the end like this
MATCH (u:User {id: {userId}}), (b:B {id: {bId}})
CREATE (c:C), (c)-[:HAS_USER]->(u), (b)-[:SOME_REL]->(c)
RETURN 'success'
So if you get 'success' back, the MATCH found what it was looking for; if you get nothing back, it didn't.
edit 2:
OPTIONAL MATCH (u:User {id: {userId}})
OPTIONAL MATCH (b:B {id: {bId}})
with *,CASE when u is not null and b is not null then [1] else [] end as exists
FOREACH (x in exists | CREATE (c:C), (c)-[:HAS_USER]->(u), (b)-[:SOME_REL]->(c))
RETURN u,b
So now we use OPTIONAL MATCH, so the query doesn't stop when a node is not found. Each node gets its own OPTIONAL MATCH, because a single OPTIONAL MATCH over both patterns would set both u and b to null if either one were missing. Then the CASE expression yields a one-element list only when both the User and B exist, so the FOREACH creates the node and relationships only in that case. In the end we return u and b and check which of them, if any, is null.

Neo4j indices slow when querying across 2 labels

I've got a graph where each node has label either A or B, and an index on the id property for each label:
CREATE INDEX ON :A(id);
CREATE INDEX ON :B(id);
In this graph, I want to find the node(s) with id "42", but I don't know a-priori the label. To do this I am executing the following query:
MATCH (n {id:"42"}) WHERE (n:A OR n:B) RETURN n;
But this query takes 6 seconds to complete. However, doing either of:
MATCH (n:A {id:"42"}) RETURN n;
MATCH (n:B {id:"42"}) RETURN n;
Takes only ~10ms.
Am I not formulating my query correctly? What is the right way to formulate it so that it takes advantage of the installed indices?
Here is one way to use both indices. result will be a collection of matching nodes.
OPTIONAL MATCH (a:B {id:"42"})
OPTIONAL MATCH (b:A {id:"42"})
RETURN
(CASE WHEN a IS NULL THEN [] ELSE [a] END) +
(CASE WHEN b IS NULL THEN [] ELSE [b] END)
AS result;
You should use PROFILE to verify that the execution plan for your neo4j environment uses the NodeIndexSeek operation for both OPTIONAL MATCH clauses. If not, you can use the USING INDEX clause to give a hint to Cypher.
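As a sketch of what such a hint can look like for one of the labels (this is the hint syntax I would expect to work, but check it against your Neo4j version):

```cypher
PROFILE
MATCH (n:A)
// Ask the planner to use the index on :A(id) for this lookup.
USING INDEX n:A(id)
WHERE n.id = "42"
RETURN n
```

In the PROFILE output, look for NodeIndexSeek rather than AllNodesScan or NodeByLabelScan.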
You should use UNION to make sure that both indexes are used. In your question you almost had the answer.
MATCH (n:A {id:"42"}) RETURN n
UNION
MATCH (n:B {id:"42"}) RETURN n
;
This will work. To verify, put PROFILE or EXPLAIN before your query and check that the indexes are used.
Indexes are built and used per node label and property, so your query has to be written the same way to use them. A query without a label in the MATCH pattern scans all nodes, which gives the timings you saw.
