Cypher path querying (using Neo4j) - neo4j

I have a graph datebase so that there is in it some pattern like this one:
(n1)-[:a]->(n2),
(n1)-[:b]->(n2),
(n1)-[:c]->(n2),
(n1)-[:e]->(n2),
(n1)-[:d]->(n3),
(n2)-[:b]->(n4)
And I want to have all graph with this pattern
MATCH p={
(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4),
(n1)-[:b]->(n2)<-[:c]-(n1),
(n1)-[:e]->(n2)
}
RETURN p
Is it possible? I've search a little but I haven't found how to do it.
I know we can use "|" for a type like this
()-[:a|b]->()
but there is no "&" and the path assigning only works on pattern which are written without ",".
Thanks
EDIT:
If it could help, here is another example of what I'm seeking:
In a database with movies, person and relations like ACTED_IN, KNOWS, FRIEND and HATE
I want all the graphs containing an actor "Actor1" (who ACTED_IN a movie "M") who KNOWS "Person1", FRIEND "Person2" and HATE "Person3" which ACTED_IN the same movie "M".
An UNION like the one in the answer of "Michael Hunger" does not work because we have multiple subgraphs and not graphs. Moreover, some subgraph might not be correct answers for the bigger pattern.

Your query will be very inefficient, as you don't restrict your search to a set of start nodes neither with labels or label+property combinations !!!!
You can use UNION for that:
MATCH p=(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4) RETURN p
UNION
MATCH p=(n1)-[:b]->(n2)<-[:c]-(n1) RETURN p
UNION
MATCH p=(n1)-[:e]->(n2) RETURN p

Related

Neo4j: Exclude certain nodes in variable path relationship

I've a graph database consisting of two types of nodes - persons and businesses, and one type of relationship - payment.
A person may pay either another person, or another business. Likewise, a business may pay a person or a business. That is, all these four types of paths are possible
(person)-[:PAYS]->(person)
(person)-[:PAYS]->(business)
(business)-[:PAYS]->(person)
(business)-[:PAYS]->(business)
In a use case of detecting possible money laundering, I would like to extract cases where payment made by a person went through several businesses before reaching another person. That is (omitting the relationship for convenience):
(person)-(business)-(business)-(business)-(person)
My cypher query should therefore look something like this:
(person)-[:PAYS*0..3]-(person)
However, this will also return me the following relationship, which isn't what I want:
(person)-(business)-(person)-(business)-(person)
What can I do to exclude (person) from the variable length relationship [:PAYS*0..3]?
I've followed the solution given here and tried this:
MATCH path((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path
However, this query ran for a long time before giving an output of zero results (which isn't correct). Another obvious solution is to change my relationship to make a distinction between [:PAYS_BUSINESS] and [:PAYS_PERSON], but I would find out if a solution is possible without changing my graph schema.
The reason that
MATCH path=((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path
does not result in anything seems to be that the first and the last node are persons
if you want to find the paths from :person to :person with only :business in between, you could do this
MATCH path=((p1:Person)-[:PAYS*1..3]-(p2:Person))
WHERE ALL(n IN nodes(path)[1..-1] WHERE n:Business)
RETURN path
You may all want to look at the apoc.path.expand and apoc.path.expandConfig procedures (https://neo4j.com/labs/apoc/4.1/overview/apoc.path/). Powerful, but you introduce a dependency on the APOC library.
5 minutes after I posted this question, I thought of and tried a possible solution that seems to work. Not sure if this is against the rules, but here's a possible way out of my own problem (in case someone else is facing the same problem):
MATCH x=(p1:person)-[:PAYS]-(b1:business)
WITH *
MATCH y=(b1:business)-[:PAYS*..3]-(b2:business)-[:PAYS]-(p2:person)
RETURN x, y
You might want to look at how I handled this with X-linked inheritance. In that use case you aggregate the sex of the parent (M or F) and can then excluded MM from the aggregated string since a man never passes an X to his son.
http://stumpf.org/genealogy-blog/graph-databases-in-genealogy
The query exclude all MM concatenated strings, rather accepted anything except MM:
match p=(n:Person{RN:32})<-[:father|mother*..99]-(m) with m, reduce(status ='', q IN nodes(p)| status + q.sex) AS c, reduce(srt2 ='|', q IN nodes(p)| srt2 + q.RN + '|') AS PathOrder where c=replace(c,'MM','') return distinct m.fullname as Fullname
In your case its P and B (person or business).

Neo4j: Find Person who acted in and directed any movies

I want to display all Person who act as well as direct a movie. It does not matter if the person direct a movie but does not act in the movie. As long as there are edges ACTED_IN and DIRECTED exist on a node, the query will display the result.
I have tried several Cypher queries. This one I believe show the nearest result I intend to:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE exists( (p)-[:DIRECTED]->() )
RETURN distinct *
Now, the issue is, one of the result shows "James Marshall" ACTED_IN "A Few Good Men" but he also DIRECTED two different movies which are "Ninja Assasin" and "V for Vendetta".
My current outcome only display "James Marshall" ACTED_IN "A Few Good Men" and does not show the other two movies he DIRECTED. So, how can I improve my Cypher?
You can first match on persons that have the relationships you need (this will be a degree check), then MATCH on the pattern using both relationships at once (which would be an OR match for the relationships in question otherwise):
MATCH (p:Person)
WHERE (p)-[:ACTED_IN]->() AND (p)-[:DIRECTED->()
MATCH path = (p)-[:ACTED_IN | DIRECTED]->(m:Movie)
RETURN path
You can use the solution in this post
The key point is to add another relationships in the other way around (right to the left to the movie), and then add in the WHERE clause the condition.
MATCH (a1:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2:Person)
WHERE exists( (a2)-[:DIRECTED]->(m) ) RETURN a2, m

Neo4j Cypher query - relationship with an "or"

I'm trying to get a named relationship with an or in the query. I'm thinking the query should look similar to:
MATCH (A:person)-[B (:ACTED_IN|:DIRECTED)]->(C:person) RETURN A, B, C
but no matter how I put in the parens I get an error. I suppose a UNION would do the trick but was hoping that there was some way of doing it similar to the above. TIA.
EDIT: This does what I want but seems not the way to do it.
MATCH (A:person)-[B]->(C:person) WHERE type(B)="ACTED_IN" OR type(B)="DIRECTED" RETURN A,B,C
I am a new user so I do not have the option to comment on questions yet. I am guessing you are trying to get the person who either acted in or directed the movie. Its described in official Cypher Documentation: Match on multiple relationship types.
With Demo Movie Data on Neo4j to get persons from The Matrix movie, I would use this:
MATCH (TheMatrix { title: 'The Matrix' })<-[rel:ACTED_IN|:DIRECTED]-(person)
RETURN person.name, rel

Cypher Query not returning nonexistent relationships

I have a graph database where there are user and interest nodes which are connected by IS_INTERESTED relationship. I want to find interests which are not selected by a user. I wrote this query and it is not working
OPTIONAL MATCH (u:User{userId : 1})-[r:IS_INTERESTED] -(i:Interest)
WHERE r is NULL
Return i.name as interest
According to answers to similar questions on SO (like this one), the above query is supposed to work.However,in this case it returns null. But when running the following query it works as expected:
MATCH (u:User{userId : 1}), (i:Interest)
WHERE NOT (u) -[:IS_INTERESTED] -(i)
return i.name as interest
The reason I don't want to run the above query is because Neo4j gives a warning:
This query builds a cartesian product between disconnected patterns.
If a part of a query contains multiple disconnected patterns, this
will build a cartesian product between all those parts. This may
produce a large amount of data and slow down query processing. While
occasionally intended, it may often be possible to reformulate the
query that avoids the use of this cross product, perhaps by adding a
relationship between the different parts or by using OPTIONAL MATCH
(identifier is: (i))
What am I doing wrong in the first query where I use OPTIONAL MATCH to find nonexistent relationships?
1) MATCH is looking for the pattern as a whole, and if can not find it in its entirety - does not return anything.
2) I think that this query will be effective:
// Take all user interests
MATCH (u:User{userId: 1})-[r:IS_INTERESTED]-(i:Interest)
WITH collect(i) as interests
// Check what interests are not included
MATCH (ni:Interest) WHERE NOT ni IN interests
RETURN ni.name
When your OPTIONAL MATCH query does not find a match, then both r AND i must be NULL. After all, since there is no relationship, there is no way get the nodes that it points to.
A WHERE directly after the OPTIONAL MATCH is pulled into the evaluation.
If you want to post-filter you have to use a WITH in between.
MATCH (u:User{userId : 1})
OPTIONAL MATCH (u)-[r:IS_INTERESTED] -(i:Interest)
WITH r,i
WHERE r is NULL
Return i.name as interest

Neo4J contains all Query

I'm brand new to neo4j and graph databases.
I'm trying to create a query I would describe as a 'contains all' however I think I'm very far away and not sure how to progress
MATCH (movie:Movie {name:'tropic thunder'})-[:stars_in]-(actors)
-[:guest_stars_in]-(movie2)
RETURN movie2.name
Let's say
MATCH (movie:Movie{name:'tropic thunder'})-[:stars_in]-(actors)
returns 5 actors
I'm looking to match exactly (all 5 actors -> same 5 actors as guest stars) or as a subset (all 5 actors are a subset of a movie which has 10 guest stars).
Hope that makes sense. Thanks for your help :D
The first thing I would point out is that you should call the variable actor instead of actors. It may seem picky, but it's a common confusion with Cypher. With the MATCH you are matching one sub-pattern at a time.
So to start out let's find each movie2 and get an array of the actors in question:
MATCH (movie:Movie {name:'tropic thunder'})-[:stars_in]-(actor)
-[:guest_stars_in]-(movie2)
RETURN movie2.name, collect(actor)
A first instinct might be to extend the path like so:
MATCH (movie:Movie {name:'tropic thunder'})-[:stars_in]-(actor)
-[:guest_stars_in]-(movie2)-[:guest_starts_in]-(actor2)
But again, we're matching every possible match of that path in the database. So for each actor, we're going to match all possible actor2s, which would lead to duplicates.
What we can do, though, is to take our first query and change the RETURN to a WITH in order to pass our data onto a second part of the query:
MATCH (movie:Movie {name:'tropic thunder'})-[:stars_in]-(actor)
-[:guest_stars_in]-(movie2)
WITH movie2, collect(actor) AS original_movie_actors
MATCH movie2-[:guest_stars_in]-(guest_star)
RETURN movie2.name, original_movie_actors, collect(guest_star) AS guest_stars
This gives us
a list of movies in question
the list of the actors who both stared in "tropic thunder" and guest stared in the movie in question
all guest stars for the movie in question
From here you could probably figure it out in your programming language of choice. But we can figure this out in Cypher too:
MATCH (movie:Movie {name:'tropic thunder'})-[:stars_in]-(actor)
-[:guest_stars_in]-(movie2)
WITH movie2, collect(actor) AS original_movie_actors
MATCH movie2-[:guest_stars_in]-(guest_star)
WITH movie2, original_movie_actors, collect(guest_star) AS guest_stars
RETURN
movie.name,
ALL(guest_star IN guest_stars WHERE guest_star IN original_movie_actors) AS all_matched,
length(original_movie_actors) / length(guest_stars) AS percentage_match
I threw in a percentage_match as a double-check and in case that's useful

Resources