I'm trying to find all the relationships of the nodes which have one specific relationship. People can be connected to events which in turn are connected to churches. I'm interested in the people who are connected as witnesses to events (marriages) in the following manner:
(p:person)-[:ACTED_AS_BEKENDE]-(e:event)
What I'm struggling with is that when I write a simple MATCH statement with a WHERE clause (see below), I only get the events to which people were connected via this specific relationship.
MATCH (p:person)--(e:event)--(c:church)
WHERE (p:person)-[:ACTED_AS_BEKENDE]-(e:event)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
To reiterate: I need a query which first selects the people who are tied to events via the [:ACTED_AS_BEKENDE] edge and then retrieves all the events to which these people were tied.
Do you need something like this?
MATCH (p:person)-[:ACTED_AS_BEKENDE]-(:event)
WITH p
MATCH (p)--(e:event)--(c:church)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
This will first find all persons that are ACTED_AS_BEKENDE, and for them it will find the events and churches as you wanted
Related
So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars
Given that I'm very new to Neo4j. I have a schema which looks like the below image:
Here Has nodes are different for example Passport, Merchant, Driving License, etc. and also these nodes are describing the customer node (looking for future scope of filtering customers based on these nodes).
SIMILAR is a self-relation meaning there exists a customer with ID:1 is related to another customer with ID:2 with a score of 2800.
I have the following questions:
Is this a good schema given the condition of the future scope I mentioned above, or getting all the properties in a single customer node is viable? (Different nodes may have array of items as well, for example: ()-[:HAS]->(Phone) having {active: "+91-1231241", historic_phone_numbers: ["+91-121213", "+91-1231421"]})
I want to get the customer along with describing nodes in relation to other customers. For that, I tried the below query (w/o number of relation more than 1):
// With number_of_relation > 1
MATCH (searched:Customer)-[r:SIMILAR]->(matched:Customer)
WHERE r.score > 2700
WITH searched, COLLECT(matched.customer_id) AS MatchedList, count(r) as cnt
WHERE cnt > 1
UNWIND MatchedList AS matchedCustomer
MATCH (person:Customer {customer_id: matchedCustomer})-[:HAS|:LIVES_IN|:IS_EMPLOYED_BY]->(related)
RETURN searched, person, related
Result what I got is below, notice one customer node not having its describing nodes:
// without number_of_relation > 1
// second attempt - for a sample customer_id
MATCH (matched)<-[r:SIMILAR]-(c)-[:HAS|:LIVES_IN|:IS_EMPLOYED_BY]->(b)
WHERE size(keys(b)) > 0
AND c.customer_id = "1b093559-a39b-4f95-889b-a215cac698dc"
AND r.score > 2700
RETURN b AS props, c AS src_cust, r AS relation, matched
Result I got are below, notice related nodes are not having their describing nodes:
If I had two describing nodes with some property (some may have a list) upon which I wanted to query and build the expected graph specified in point 2 above, how can I do that?
I want the database to find a similar customer given the describing nodes. Example: A customer {name: "Dave"} has phone {active_number: "+91-12345"} is similar to customer {name: "Mike"} has phone {active_number: "+91-12345"}. How can get started with this?
If something is unclear, please ask. I can explain with examples.
[EDITED]
Yes, the schema seems fine, except that you should not use the same HAS relationship type between different node label pairs.
The main problem with your first query is that its top MATCH clause uses a directional relationship pattern, ()-->(), which does not allow all Customer nodes to have a chance to be the searched node (because some nodes may only be at the tail end of SIMILAR relationships). This tweaked query should work better:
MATCH (searched:Customer)-[r:SIMILAR]-(matched:Customer)
WHERE r.score > 2700
WITH searched, COLLECT(matched) AS matchedList
WHERE SIZE(matchedList) > 1
UNWIND matchedList AS person
MATCH (person)-[:HAS|LIVES_IN|IS_EMPLOYED_BY]->(pDesc)
WITH searched, person, COLLECT(pDesc) AS personDescribers
MATCH (searched)-[:HAS|LIVES_IN|IS_EMPLOYED_BY]->(sDesc)
RETURN searched, person, personDescribers, COLLECT(sDesc) AS searchedDescribers
It's not clear what you want are trying to do.
To get all Customers who have the same phone number:
MATCH (c:Customer)-[:HAS_PHONE]-(p:Phone)
WHERE p.activeNumber = '+91-12345'
WITH p.activeNumber AS phoneNumber, COLLECT(c) AS customers
WHERE SIZE(customers) > 1
RETURN phoneNumber, customers
I have two types of labels User and Conversation
User have outgoing relation with Conversation
(u:User)-[:IS_PARTICIPANT]->(c:Conversation)
Multiple Users can be participant of the Conversation.
Now what I am trying to query is for conversation that is specifically between two users only.
for ex.
MATCH (p1:User {name: 'Tom'})-[:IS_PARTICIPANT]->(c:Conversation)<-[:IS_PARTICIPANT]-(p2:User {name:"Jerry"})
return c
Above query does return conversation between two users Tom and Jerry but it will also return this conversation even if there is other user Tweety in that specific conversation. Is there a way in cypher we can specifically get conversation where only specific users are participant of and not others.
Find the node, and find out how many others are connected to it:
MATCH (p1:User {name: 'Tom'})
-[:IS_PARTICIPANT]->(c:Conversation)<-[:IS_PARTICIPANT]
-(p2:User {name:"Jerry"})
MATCH (c)<-[:IS_PARTICIPANT]-(u:User)
WITH c,
COUNT(u) AS countUser WHERE countUser = 2
return c
If nodes are more then two:
WITH ["Tom", "Jerry", "Tweety"] as names
MATCH (p:User)-[:IS_PARTICIPANT]->(c:Conversation)
WHERE p.name IN names
WITH distinct c,
names
MATCH (c)<-[:IS_PARTICIPANT]-(u:User)
WHERE u.name in names
WITH distinct c,
names,
count(distinct u) as countUser
WHERE countUser = size(names)
RETURN c
There are a few tricks we can use here.
First, given an arbitrary-sized collections of names (where you know all the names correspond with :Users), this knowledge base entry can be helpful for determining when all of the given nodes have a relationship to the same node.
Second, if the :IS_PARTICIPANT relationship always only connects :User and :Conversation nodes, we can use size(()-[:IS_PARTICIPANT]->(c)) to efficiently get the number of incoming :IS_PARTICIPANT relationships to a conversation without having to pay the cost of actually expanding those relationships.
WITH ["Tom", "Jerry", "Tweety"] as names // should be parameterized instead
WITH names, size(names) as requiredCount
MATCH (u:User)
WHERE u.name in names
WITH u, requiredCount
MATCH (u)-[:IS_PARTICIPANT]->(c:Conversation)
WITH requiredCount, c, count(u) as matches
WHERE requiredCount = matches and size(()-[:IS_PARTICIPANT]->(c)) = requiredCount
RETURN c
Do you have something like this in mind?
MATCH (p1:User {name: 'Tom'})-[:IS_PARTICIPANT]->(c:Conversation)<-[:IS_PARTICIPANT]-(p2:User {name: 'Jerry'})
WHERE NOT (c:Conversation)<-[:IS_PARTICIPANT]-(:User {name: 'The Dog'})
RETURN c
Here, the User The Dog is not a participant of the conversation.
I am using Neo4j CE 3.1.1 and I have a relationship WRITES between authors and books. I want to find the N (say N=10 for example) books with the largest number of authors. Following some examples I found, I came up with the query:
MATCH (a)-[r:WRITES]->(b)
RETURN r,
COUNT(r) ORDER BY COUNT(r) DESC LIMIT 10
When I execute this query in the Neo4j browser I get 10 books, but these do not look like the ones written by most authors, as they show only a few WRITES relationships to authors. If I change the query to
MATCH (a)-[r:WRITES]->(b)
RETURN b,
COUNT(r) ORDER BY COUNT(r) DESC LIMIT 10
Then I get the 10 books with the most authors, but I don't see their relationship to authors. To do so, I have to write additional queries explicitly stating the name of a book I found in the previous query:
MATCH ()-[r:WRITES]->(b)
WHERE b.title="Title of a book with many authors"
RETURN r
What am I doing wrong? Why isn't the first query working as expected?
Aggregations only have context based on the non-aggregation columns, and with your match, a unique relationship will only occur once in your results.
So your first query is asking for each relationship on a row, and the count of that particular relationship, which is 1.
You might rewrite this in a couple different ways.
One is to collect the authors and order on the size of the author list:
MATCH (a)-[:WRITES]->(b)
RETURN b, COLLECT(a) as authors
ORDER BY SIZE(authors) DESC LIMIT 10
You can always collect the author and its relationship, if the relationship itself is interesting to you.
EDIT
If you happen to have labels on your nodes (you absolutely SHOULD have labels on your nodes), you can try a different approach by matching to all books, getting the size of the incoming :WRITES relationships to each book, ordering and limiting on that, and then performing the match to the authors:
MATCH (b:Book)
WITH b, SIZE(()-[:WRITES]->(b)) as authorCnt
ORDER BY authorCnt DESC LIMIT 10
MATCH (a)-[:WRITES]->(b)
RETURN b, a
You can collect on the authors and/or return the relationship as well, depending on what you need from the output.
You are very close: after sorting, it is necessary to rediscover the authors. For example:
MATCH (a:Author)-[r:WRITES]->(b:Book)
WITH b,
COUNT(r) AS authorsCount
ORDER BY authorsCount DESC LIMIT 10
MATCH (b)<-[:WRITES]-(a:Author)
RETURN b,
COLLECT(a) AS authors
ORDER BY size(authors) DESC
I'm very new to Neo4J and I can't get this simple query work.
The data I have looks like this:
(a)-[:likes]->(b)
(a)-[:likes]->(c)
Now I'd like to extract a list with everyone who likes someone else.
Tried
match (u)-[:likes]->(p) return u order by p.id desc;
This gives me a duplicate of (a).
I tried using distinct:
match (u)-[:likes]->(p) return distinct u order by p.id desc;
This gives me 'variable p undefined'.
I know that if I drop the ordering, distinct works and gives me (a) once.
But how can I work with distinct and order by in the same time?
Consider why your query isn't working:
Without the distinct, you have rows with each pairing of u and p. When you use DISTINCT, how is it supposed to order when there are multiple lines for the same u, matching to multiple p's? That's an impossible task.
If you change it to order by u.id instead, then it works just fine.
I do encourage you to use labels, by the way, to restrict your query only to relevant nodes. You can also rework your query to prevent it from emitting duplicates and avoid the need for DISTINCT completely.
If we assume the nodes you're interested in are labeled with :Person, your query might be:
MATCH (p:Person)
WHERE EXISTS( (p)-[:likes]-() )
RETURN p ORDER BY p.id DESC