WHERE NOT() in cypher neo4j query - neo4j

I am having trouble with a simple cypher query. The query is:
MATCH (u:user { google_id : 'example_user' })--(rm:room)--(a:area),
(c:category { name : 'culture:Yoruba' })--(o:object)
WHERE NOT (a-[:CONTAINS]->o)
RETURN DISTINCT o.id
The "WHERE NOT..." clause seems to be ignored: I get back nodes that have incoming :CONTAINS relationships from the area nodes. If I take out the NOT, I correctly get back only the nodes that have this a-->o relationship.
I think I have a weak understanding of NOT().

Trad,
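The query is returning just what you asked it to. The WHERE clause is evaluated once for each (a, o) combination, and in your example at the link there are three areas. None of the objects is contained by the first two areas, so every object passes the NOT test for those areas and all three nodes are returned. If you change the RETURN line to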
RETURN a.area_number, o.id
you will see this.
I don't know about your larger problem context, but if you want to know about objects that aren't in any area, then the query
MATCH (o:object)
WHERE NOT (o)<-[:CONTAINS]-()
RETURN o.id
will accomplish the task.
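If the goal is instead the objects in the category that none of this user's areas contain (an assumption about the intent, which the question does not confirm), one possible sketch moves the whole user-to-area path inside the NOT, so the test is made once per object rather than once per (a, o) pair:

```cypher
// Assumed intent: objects in the category that no area reachable from the user contains.
MATCH (c:category { name: 'culture:Yoruba' })--(o:object)
WHERE NOT (:user { google_id: 'example_user' })--(:room)--(:area)-[:CONTAINS]->(o)
RETURN DISTINCT o.id
```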
Grace and peace,
Jim

Related

How can I make this neo4j query faster?

I have a database that contains these four nodes:
Store, Guitar, GuitarModel, Accessory
*Guitar refers to a specific guitar that a person can own/play
optional match (a:Store), (b:Guitar), (c:GuitarModel), (d:Accessory)
where a.StoreNumber ="1234" and (a)-[:ContainsGuitar]->(b) and
(b)-[:IS_OF_MODEL]->(c) and
((d)-[:COMES_STANDARD]-(c) OR (d)-[:COMES_OPTIONAL]-(c) OR (d)-[:COMES_OPTION_UPGRADE]-(c) OR (d)-[:COMES_UPGRADE]-(c))
return b.name, collect(d.name)
My issue is that this query is pretty slow: it takes about 120,000 ms to run.
I have 67,000 nodes and 131,000 relationships.
Am I doing something wrong that is making this slow?
Do you have an index/constraint on :Store(StoreNumber)?
Why are you using only an OPTIONAL MATCH? You can combine MATCH and OPTIONAL MATCH.
Why are you putting your pattern in the WHERE clause? You should put it directly in a MATCH.
I think your query creates a cartesian product between the nodes, which is why it's so slow.
Can you try this query:
MATCH
(:Store { StoreNumber:"1234" })-[:ContainsGuitar]->(b)
RETURN
b.name,
[(b)-[:IS_OF_MODEL]->(:GuitarModel)-[:COMES_STANDARD|COMES_OPTIONAL|COMES_OPTION_UPGRADE|COMES_UPGRADE]-(d:Accessory) | d.name]
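A minimal sketch of the first three suggestions, assuming the index syntax of recent Neo4j versions (4.x or later) and a hypothetical index name store_number. The required part of the pattern goes into MATCH, so the planner can start from the indexed Store lookup, and only the accessories are optional:

```cypher
// Hypothetical index name; lets the planner look up the store directly.
CREATE INDEX store_number IF NOT EXISTS
FOR (s:Store) ON (s.StoreNumber);

// Required path in MATCH, optional accessories in OPTIONAL MATCH.
MATCH (:Store { StoreNumber: "1234" })-[:ContainsGuitar]->(b:Guitar)-[:IS_OF_MODEL]->(c:GuitarModel)
OPTIONAL MATCH (c)-[:COMES_STANDARD|COMES_OPTIONAL|COMES_OPTION_UPGRADE|COMES_UPGRADE]-(d:Accessory)
RETURN b.name, collect(d.name)
```

Unlike the pattern comprehension above, this variant drops guitars that have no IS_OF_MODEL relationship. Guitars whose model has no accessories still appear, with an empty list, because collect() skips nulls.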

How to show only the nodes connected to the ones selected using Cypher of Neo4J?

I want to see only the nodes labeled ':Context' that are connected to all (or the maximum) number of the nodes labeled ':Concept'.
I'm currently using a query:
match (c:Concept), (ctx:Context), (c)-[r]->(ctx) where (c.name = 'italy' or c.name = 'pick') return ctx,c;
This gives out the following result:
How would I remove all the unnecessary green nodes (they are the ones of the ':Context' type) and only leave those, which are connected both to "pick" and "italy" :Concept nodes?
I also want to be able to perform the same search for 3 nodes and more. Can't understand what's the best way to do that (with or without APOC).
This query below works:
match (c1:Concept{name:"italy"})-->(ctx:Context)<--(c2:Concept{name:"pick"}) return ctx;
but only for 2 items. What if I want to do the same for 3 or more?
this one is too slow:
match (c1:Concept{name:"italy"})-->(ctx:Context)<--(c2:Concept{name:"pick"})-->(ctx:Context)<--(c3:Concept{name:"novice"}) return ctx;
Thanks!
You can match as many branches as you want in additional matches like this
match (c1:Concept{name:"italy"})-->(ctx:Context)<--(c2:Concept{name:"pick"})
match (c3:Concept{name:"france"})-->(ctx)
return ctx;
Still, I would recommend using parameters, as they are more reusable. Assuming you have the parameter 'list' that contains ["italy", "france", "pick"]:
MATCH (c:Concept)
WHERE c.name in $list
WITH COLLECT(c) as concepts
MATCH (ctx:Context)
WHERE ALL(c in concepts WHERE (c)-->(ctx))
RETURN ctx
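Another way to express the same requirement is to count, per context, how many of the listed concepts point at it. This is only a sketch: it assumes the names in $list are distinct and that each name identifies a single :Concept node.

```cypher
// Assumes $list holds distinct names and each name identifies one :Concept node.
MATCH (c:Concept)-->(ctx:Context)
WHERE c.name IN $list
WITH ctx, count(DISTINCT c) AS matched
WHERE matched = size($list)
RETURN ctx
```

To get the contexts connected to the maximum number of listed concepts rather than to all of them, drop the final WHERE and use RETURN ctx, matched ORDER BY matched DESC instead.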

Cypher - Neo4j Query Profiling

I have some questions regarding Neo4j's Query profiling.
Consider below simple Cypher query:
PROFILE
MATCH (n:Consumer {mobileNumber: "yyyyyyyyy"}),
(m:Consumer {mobileNumber: "xxxxxxxxxxx"})
WITH n,m
MATCH (n)-[r:HAS_CONTACT]->(m)
RETURN n,m,r;
and output is:
So according to Neo4j's Documentation:
3.7.2.2. Expand Into
When both the start and end node have already been found, expand-into
is used to find all connecting relationships between the two nodes.
Query.
MATCH (p:Person { name: 'me' })-[:FRIENDS_WITH]->(fof)-->(p) RETURN fof
So here in the above query (in my case), first of all, it should find both the StartNode & the EndNode before finding any relationships. But unfortunately, it's just finding the StartNode, and then going to expand all connected :HAS_CONTACT relationships, which results in not using "Expand Into" operator. Why does this work this way? There is only one :HAS_CONTACT relationship between the two nodes. There is a Unique Index constraint on :Consumer{mobileNumber}. Why does the above query expand all 7 relationships?
Another question is about the Filter operator: why does it require 12 db hits although all nodes and relationships have already been retrieved? Why does this operation require 12 db calls for just 6 rows?
Edited
This is the complete Graph I am querying:
Also I have tested different versions of same above query, but the same Query Profile result is returned:
1
PROFILE
MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"})
MATCH (m:Consumer{mobileNumber: "xxxxxxxxxxx"})
WITH n,m
MATCH (n)-[r:HAS_CONTACT]->(m)
RETURN n,m,r;
2
PROFILE
MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"}), (m:Consumer{mobileNumber: "xxxxxxxxxxx"})
WITH n,m
MATCH (n)-[r:HAS_CONTACT]->(m)
RETURN n,m,r;
3
PROFILE
MATCH (n:Consumer{mobileNumber: "yyyyyyyyy"})
WITH n
MATCH (n)-[r:HAS_CONTACT]->(m:Consumer{mobileNumber: "xxxxxxxxxxx"})
RETURN n,m,r;
The query you are executing and the example provided in the Neo4j documentation for Expand Into are not the same. The example query starts and ends at the same node.
If you want the planner to find both nodes first and see if there is a relationship then you could use shortestPath with a length of 1 to minimize the DB hits.
PROFILE
MATCH (n:Consumer {mobileNumber: "yyyyyyyyy"}),
(m:Consumer {mobileNumber: "xxxxxxxxxxx"})
WITH n,m
MATCH Path=shortestPath((n)-[r:HAS_CONTACT*1]->(m))
RETURN n,m,r;
Why does this do this?
This behaviour appears to come from how the query planner searches the database in response to your Cypher query. Cypher provides an interface for searching and operating on the graph (alternatives include the Java API, etc.); queries are handled by the query planner and then turned into graph operations by Neo4j's internals. It makes sense that the query planner will look for what is likely to be the most efficient way to search the graph (hence why we love neo), so just because a Cypher query is written one way, it won't necessarily search the graph the way we imagine it will.
The documentation on this seemed a little sparse (or rather, I couldn't find it); any links or further explanations would be much appreciated.
Examining your query, I think you're trying to say this:
"Find two nodes each with a :Consumer label, n and m, with contact numbers x and y respectively, using the mobileNumber index. If you find them, try and find a -[:HAS_CONTACT]-> relationship from n to m. If you find the relationship, return both nodes and the relationship, else return nothing."
Running this query in this way requires a cartesian product to be created (i.e., a little table of all combinations of n and m - in this case only one row - but for other queries potentially many more), and then relationships to be searched for between each of these rows.
Rather than doing that, since a MATCH clause must be met in order to continue with the query, neo knows that the two nodes n and m must be connected via the -[:HAS_CONTACT]-> relationship if the query is to return anything. Thus, the most efficient way to run the query (and avoid the cartesian product) is as below, which is what your query can be simplified to.
"Find a node n with the :Consumer label and value x for the indexed property mobileNumber, which is connected via a -[:HAS_CONTACT]-> relationship to a node m with the :Consumer label and value y for its property mobileNumber. If found, return both nodes and the relationship, else return nothing."
So, rather than perform two index searches, a cartesian product and a set of expand into operations, neo performs only one index search, an expand all, and a filter.
You can see the result of this simplification by the query planner through the presence of AUTOSTRING parameters in your query profile.
How to Change Query to Implement Search as Desired
If you want to change the query so that it must use an Expand Into operation, make the requirement for the relationship optional, or use explicitly iterative execution. Both queries below will produce the initially expected query profiles.
Optional example:
PROFILE
MATCH (n:Consumer{mobileNumber: "xxx"})
MATCH (m:Consumer{mobileNumber: "yyy"})
WITH n,m
OPTIONAL MATCH (n)-[r:HAS_CONTACT]->(m)
RETURN n,m,r;
Iterative example:
PROFILE
MATCH (n1:Consumer{mobileNumber: "xxx"})
MATCH (m:Consumer{mobileNumber: "yyy"})
UNWIND COLLECT(n1) AS n
MATCH (n)-[r:HAS_CONTACT]->(m)
RETURN n,m,r;

Cypher Query not returning nonexistent relationships

I have a graph database where there are user and interest nodes which are connected by IS_INTERESTED relationship. I want to find interests which are not selected by a user. I wrote this query and it is not working
OPTIONAL MATCH (u:User{userId : 1})-[r:IS_INTERESTED] -(i:Interest)
WHERE r is NULL
Return i.name as interest
According to answers to similar questions on SO (like this one), the above query is supposed to work. However, in this case it returns null. But when I run the following query, it works as expected:
MATCH (u:User{userId : 1}), (i:Interest)
WHERE NOT (u) -[:IS_INTERESTED] -(i)
return i.name as interest
The reason I don't want to run the above query is because Neo4j gives a warning:
This query builds a cartesian product between disconnected patterns.
If a part of a query contains multiple disconnected patterns, this
will build a cartesian product between all those parts. This may
produce a large amount of data and slow down query processing. While
occasionally intended, it may often be possible to reformulate the
query that avoids the use of this cross product, perhaps by adding a
relationship between the different parts or by using OPTIONAL MATCH
(identifier is: (i))
What am I doing wrong in the first query where I use OPTIONAL MATCH to find nonexistent relationships?
1) MATCH looks for the pattern as a whole, and if it cannot find the pattern in its entirety, it returns nothing.
2) I think that this query will be effective:
// Take all user interests
MATCH (u:User{userId: 1})-[r:IS_INTERESTED]-(i:Interest)
WITH collect(i) as interests
// Check what interests are not included
MATCH (ni:Interest) WHERE NOT ni IN interests
RETURN ni.name
When your OPTIONAL MATCH query does not find a match, both r AND i must be NULL. After all, since there is no relationship, there is no way to get the nodes that it points to.
A WHERE directly after the OPTIONAL MATCH is pulled into the evaluation.
If you want to post-filter you have to use a WITH in between.
MATCH (u:User{userId : 1})
OPTIONAL MATCH (u)-[r:IS_INTERESTED] -(i:Interest)
WITH r,i
WHERE r is NULL
Return i.name as interest
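Putting these points together, one possible sketch (not taken from the answers above) anchors the query on the Interest nodes, so that i is always bound, and makes only the link to the user optional. The WITH keeps the WHERE from being folded into the OPTIONAL MATCH:

```cypher
// i comes from a plain MATCH, so it is never null.
MATCH (i:Interest)
// r stays null when this interest has no IS_INTERESTED link to user 1.
OPTIONAL MATCH (i)-[r:IS_INTERESTED]-(:User { userId: 1 })
WITH i, r
WHERE r IS NULL
RETURN i.name AS interest
```

This starts from a single connected pattern, so it should not trigger the cartesian product warning, at the cost of one optional expansion per Interest node.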

Cypher path querying (using Neo4j)

I have a graph database that contains a pattern like this one:
(n1)-[:a]->(n2),
(n1)-[:b]->(n2),
(n1)-[:c]->(n2),
(n1)-[:e]->(n2),
(n1)-[:d]->(n3),
(n2)-[:b]->(n4)
And I want to get all graphs that match this pattern:
MATCH p={
(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4),
(n1)-[:b]->(n2)<-[:c]-(n1),
(n1)-[:e]->(n2)
}
RETURN p
Is it possible? I've searched a little but I haven't found how to do it.
I know we can use "|" to give alternative types, like this:
()-[:a|b]->()
but there is no "&", and path assignment only works on patterns that are written without ",".
Thanks
EDIT:
If it could help, here is another example of what I'm seeking:
In a database with movies, persons and relationships like ACTED_IN, KNOWS, FRIEND and HATE,
I want all the graphs containing an actor "Actor1" (who ACTED_IN a movie "M") who KNOWS "Person1", is a FRIEND of "Person2" and has a HATE relationship to "Person3", who ACTED_IN the same movie "M".
A UNION like the one in the answer from "Michael Hunger" does not work because it returns multiple subgraphs rather than whole graphs. Moreover, some of those subgraphs might not be correct answers for the bigger pattern.
Your query will be very inefficient, as you don't restrict your search to a set of start nodes, either with labels or with label+property combinations.
You can use UNION for that:
MATCH p=(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4) RETURN p
UNION
MATCH p=(n1)-[:b]->(n2)<-[:c]-(n1) RETURN p
UNION
MATCH p=(n1)-[:e]->(n2) RETURN p
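If every part of the pattern has to hold at once for the same nodes, which is what the question's edit asks for, a single MATCH with comma-separated patterns may be closer to the intent than UNION. The sketch below gives each part its own path name (p1, p2 and p3 are names chosen here for illustration), since one path variable cannot span a comma:

```cypher
// All three patterns must hold for the same n1 and n2; each path gets its own name.
MATCH p1 = (n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4),
      p2 = (n1)-[:b]->(n2)<-[:c]-(n1),
      p3 = (n1)-[:e]->(n2)
RETURN p1, p2, p3
```

Each returned row then describes one complete occurrence of the whole pattern. As noted above, adding labels or property filters to at least one node would keep the search from scanning the entire graph.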

Resources