I have the following data structure in a Neo4j database:
     USER
    /   |
   /    |
LIST    |
   \    |
    \   |
   CONTACT
That is, USER has a relationship with LIST and LIST has a relationship with CONTACT, but in some cases USER may also have a direct relationship with CONTACT (not always). Now I want to delete a CONTACT's data. I have written the following query:
MATCH (b:USER { id: {id} } )-[relationship01]->(pl:LIST { id: {listId} } )
OPTIONAL MATCH (pl)-[cnpt:USER_LIST]->(cn:CONTACTS {id: {contactId} } )
DELETE cnpt, cn;
This query deletes the CONTACT and its relationship with LIST. But in some cases I also have to delete its relationship with USER. To handle this, I have written the following query:
MATCH (b:USER { id: {id} } )-[relationship01]->(pl:LIST { id: {listId} } )
OPTIONAL MATCH (pl)-[cnpt:USER_LIST]->(cn:CONTACTS {id: {contactId} } )
OPTIONAL MATCH (b)-[bur]->(cnx:CONTACTS {id: {contactId} } )
DELETE cnpt, cn, bur, cnx;
This query deletes the CONTACT along with its relationships to LIST and USER, but the problem is that it throws an error if there is no relationship between the CONTACT and the USER.
How can I solve this problem?
Thanks in advance.
You can't delete a node until all its relationships are deleted, which is why there is a shorthand for deleting all of a node's relationships, then the node itself: DETACH DELETE
So all you have to do is this:
MATCH (:USER { id: {id} } )-->(pl:LIST { id: {listId} } )
OPTIONAL MATCH (pl)-[:USER_LIST]->(cn:CONTACTS {id: {contactId} } )
DETACH DELETE cn;
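One note on syntax: the {id} placeholder form used above was removed in Neo4j 4.0 in favour of $id. A minimal sketch of the same query for newer servers, assuming the parameters are still named id, listId and contactId:

```cypher
// Same shape as the query above, with $param placeholders (required on Neo4j 4.0+).
MATCH (:USER {id: $id})-->(pl:LIST {id: $listId})
OPTIONAL MATCH (pl)-[:USER_LIST]->(cn:CONTACTS {id: $contactId})
// DETACH DELETE removes every relationship on cn (to the LIST and, if present,
// to the USER) before removing the node; if cn is null nothing is deleted.
DETACH DELETE cn;
```

Because DETACH DELETE handles all relationships of the contact, there should be no need to match the optional USER relationship separately.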
I have created the following nodes in neo4j (1 million of them):
CREATE (p:Person { name: 'user1', email: ['user1@gmail.com', 'user1@yahoo.com'] }) RETURN p
CREATE (p:Person { name: 'user2', email: ['user2@gmail.com', 'user2@yahoo.com'] }) RETURN p
...
CREATE (p:Person { name: 'user1000000', email: ['user1000000@gmail.com', 'user1000000@yahoo.com'] }) RETURN p
I have created the following indexes:
CREATE BTREE INDEX i1 FOR (n:Person) ON (n.name)
CREATE BTREE INDEX i2 FOR (n:Person) ON (n.email)
With the above data, the following query takes 2ms to complete and I can concurrently execute about 2800 such queries per second on my desktop.
MATCH (p:Person) WHERE p.name = 'user10' RETURN DISTINCT p.name
But the following query takes 710ms to complete and I can concurrently execute only about 5 such queries per second on my desktop.
MATCH (p:Person) WHERE 'user10@gmail.com' IN p.email RETURN DISTINCT p.name
Is there any way to speed up the second query and also increase the throughput?
Edit 1:
I tried to use separate nodes for email as suggested by @jose_bacoy in his answer.
I created the following nodes:
CREATE (m1:mail { email: 'user1@gmail.com' })
CREATE (m2:mail { email: 'user1@yahoo.com' })
CREATE (p:Person { name: 'user1' })
CREATE (p) - [:attribute] -> (m1)
CREATE (p) - [:attribute] -> (m2)
RETURN p
...
CREATE (m1:mail { email: 'user1000000@gmail.com' })
CREATE (m2:mail { email: 'user1000000@yahoo.com' })
CREATE (p:Person { name: 'user1000000' })
CREATE (p) - [:attribute] -> (m1)
CREATE (p) - [:attribute] -> (m2)
RETURN p
and indexed them as follows:
CREATE BTREE INDEX i1 FOR (n:Person) ON (n.name)
CREATE BTREE INDEX i2 FOR (n:mail) ON (n.email)
The speed is also good. Latency: 4ms, throughput 1850 queries per second.
The problem with this is that the following query performs very badly.
MATCH (p:Person) - [:attribute] -> (m1:mail)
MATCH (p) - [:attribute] -> (m2:mail)
WHERE m1.email = 'user10@gmail.com' OR m2.email = 'user10@yahoo.com'
RETURN DISTINCT p.name
On my desktop, the latency is about 5s and the throughput is less than 1 per second.
Edit 2:
I modified the query as suggested by Charchit Kapoor below. Following is the query I used.
MATCH (p:Person) - [:attribute] -> (m:mail)
WHERE m.email IN ['user10@gmail.com', 'user10@yahoo.com']
RETURN DISTINCT p.name
This query has a latency of about 4ms and a throughput of about 2600 queries per second.
Your data model is not aligned with your query. Email is stored as a list property on the Person node, and you are searching inside that list. Below is a script that changes your data model from Person.email to a relationship Person -[:HAS_EMAIL]-> Email. The APOC procedure apoc.periodic.iterate divides your Person nodes into batches and runs them in parallel for efficiency. Ensure that you have APOC installed.
For each batch it creates the (Person)->(Email) relationships and then removes the email property from Person. You can change the batch size (10k per batch) to suit your setup. You will also want a uniqueness constraint (which is backed by an index) on Email.email. I will leave it up to you how to do that.
CALL apoc.periodic.iterate(
"MATCH (p:Person) RETURN p as person;",
"WITH person
UNWIND person.email as email
MERGE (e:Email {email: email})
MERGE (person)-[:HAS_EMAIL]->(e)
SET person.email = null;",
{batchSize:10000, parallel:true, retries:3});
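For the uniqueness constraint mentioned above, a minimal sketch, assuming Neo4j 4.4 or later and a constraint name (email_unique) that I made up for illustration. It is best created before running the migration so that MERGE can use its backing index:

```cypher
// Hypothetical constraint name; FOR ... REQUIRE is the Neo4j 4.4+ form.
// Older 4.x servers use: ON (e:Email) ASSERT e.email IS UNIQUE
CREATE CONSTRAINT email_unique IF NOT EXISTS
FOR (e:Email) REQUIRE e.email IS UNIQUE;
```

If the same address can appear on more than one Person, parallel batches may contend for the same Email node, so setting parallel:false in the iterate call is the more cautious choice in that case.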
After doing this and creating the index on Email.email, profiling shows that the BTREE index is being used:
PROFILE MATCH (p:Person) -[:HAS_EMAIL] -> (e:Email)
WHERE e.email = 'user10@gmail.com'
RETURN DISTINCT p.name
BTREE INDEX e:Email(email) WHERE
email = $autostring_0
Previously, the plan showed NodeByLabelScan followed by a Filter on $autostring_0 IN p.email. Even if you create an index on a list property, it is not used for this kind of IN lookup.
Your second query can be structured differently, first find all the relevant emails and then find the related users:
MATCH (m1:mail)
WHERE m1.email IN ['user10@gmail.com', 'user10@yahoo.com']
MATCH (p)-[:attribute]->(m1)
RETURN DISTINCT p.name
I have a linked list of Posts in a Group. A Group has a FIRST_POST and then subsequent NEXT_POST relationships. The GraphQL typedef looks like this:
type Group {
id: ID!
name: String!
membersList: [User] @relation(direction: IN, name: "MEMBER_OF")
engagesIn: Conversation @relation(direction: OUT, name: "ENGAGES_IN")
posts: [Post] @cypher(statement: """
MATCH (g:Group {id: this.id})-[:FIRST_POST|NEXT_POST*]->(post:Post)
RETURN post
""")
createdAt: DateTime
updatedAt: DateTime
}
The cypher query MATCH (g:Group {id: this.id})-[:FIRST_POST|NEXT_POST*]->(post:Post) RETURN post returns all of a specific Group's posts. I want to create a new Post and add it to the Group in my schema's typedef. Here's one stab at it that I know is not correct, but is where my current efforts are:
CreatePost(body: String!, userId: ID!): Post
@cypher(
statement: """
MATCH (u:User {id: $userId})-[:MEMBER_OF]->(g:Group)-[:FIRST_POST|NEXT_POST*]->(lp:Post) return Last(collect(p))
CREATE (p:Post { id: apoc.create.uuid(), body: $body, createdAt: datetime(), updatedAt: datetime() })
WITH p, lp
MATCH (u:User)
WHERE u.id = $userId
CREATE (p)<-[:WROTE]-(u)
CREATE (p)-[:NEXT_POST]->(lp)
RETURN u, p
""")
This is a custom mutation that accepts post body and userId arguments. The line MATCH (u:User {id: $userId})-[:MEMBER_OF]->(g:Group)-[:FIRST_POST|NEXT_POST*]->(lp:Post) return Last(collect(p)) is meant to return the last post, though I know I cannot return here and I haven't found a way to alias that MATCH. Ultimately I need to CREATE the NEXT_POST relationship off of the last NEXT_POST in that group with CREATE (p)-[:NEXT_POST]->(<LAST POST IN GROUP>). The first line also uses the userId to find the group, as Users are MEMBER_OF only one Group and Posts belong to only one Group.
So, some confusion on my part. Someone else created the schema with FIRST_POST being the most recently created post rather than the first post created. After renaming it to NEWEST_POST, I can now do this:
MATCH (u:User {id: $userId})-[:MEMBER_OF]->(g:Group)-[rel:NEWEST_POST]->(previousNewestPost:Post)
DELETE rel
CREATE (p:Post { id: apoc.create.uuid(), body: 'test new post', createdAt: datetime(), updatedAt: datetime() })
WITH p, g, previousNewestPost
MATCH (u:User)
WHERE u.id = $userId
CREATE (p)<-[:WROTE]-(u)
CREATE (p)<-[:NEWEST_POST]-(g)
CREATE (p)-[:NEXT_POST]->(previousNewestPost)
RETURN u, p
to create a new post and update the relationships accordingly.
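For use inside the CreatePost mutation, a slightly tightened sketch of the same idea, assuming the NEWEST_POST rename above and the $body and $userId arguments from the mutation signature. It reuses u from the first MATCH instead of matching the user again, and returns only the Post, since that is the mutation's declared return type:

```cypher
// Find the user's group and its current head post.
MATCH (u:User {id: $userId})-[:MEMBER_OF]->(g:Group)-[rel:NEWEST_POST]->(previousNewestPost:Post)
// Detach the old head pointer, then insert the new post at the head of the list.
DELETE rel
CREATE (p:Post {id: apoc.create.uuid(), body: $body, createdAt: datetime(), updatedAt: datetime()})
CREATE (u)-[:WROTE]->(p)
CREATE (g)-[:NEWEST_POST]->(p)
CREATE (p)-[:NEXT_POST]->(previousNewestPost)
RETURN p
```

Note that this, like the query above, creates nothing when the group has no NEWEST_POST yet, so the very first post in a group would need separate handling.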
I have the following Neo4j Cypher query, which checks whether a relationship exists between a User and an entity and returns a boolean result:
MATCH (u:User) WHERE u.id = {userId} MATCH (entity) WHERE id(entity) = {entityGraphId} RETURN EXISTS( (u)<-[:OWNED_BY]-(entity) )
Please help me rewrite this query so that it accepts a collection of {entityGraphIds} instead of a single {entityGraphId} and checks whether a relationship exists between the User and any of the entities with these {entityGraphIds}.
For example, I have user1 and entity1, entity2. user1 has a relationship with entity2. I'll pass {user.id} as {userId} and {entity1.id, entity2.id} as {entityGraphIds}, and this query should return true.
I believe you can simply use the IN operator. Considering these parameters:
:params {userId: 1, entityGraphIds : [2,3,4]}
Then, the query:
MATCH (u:User) WHERE u.id = {userId}
MATCH (entity) WHERE id(entity) IN ({entityGraphIds})
RETURN EXISTS( (u)<-[:OWNED_BY]-(entity) )
EDIT:
If you are trying to return true when :User is connected to at least 1 entity, then you can simplify your query to:
OPTIONAL MATCH (u:User)<-[:OWNED_BY]-(entity:Entity)
WHERE u.id = {userId} AND id(entity) IN ({entityGraphIds})
RETURN u IS NOT NULL
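If you need exactly one boolean row regardless of how many entities match, aggregating with count is one option. A minimal sketch, assuming the same userId and entityGraphIds parameters (written here in the $param form) and the ownsAny alias, which is my own name:

```cypher
MATCH (u:User {id: $userId})
// Keep only OWNED_BY neighbours whose internal id is in the supplied list.
OPTIONAL MATCH (u)<-[:OWNED_BY]-(entity)
WHERE id(entity) IN $entityGraphIds
// count() ignores nulls and collapses the result to a single row.
RETURN count(entity) > 0 AS ownsAny
```

Since there is no grouping key, this returns a single row even when the user is not found, in which case ownsAny is false.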
I'm trying to write a query where I get the :LIKES relationships.
(:USER)
|
[:CREATED]
|
(:POST)<-[:LIKES]-(:USER)
|
[:RESHARED]
|
(:POST)<-[:LIKES]-(:USER)
I was trying something along the lines of:
MATCH (u:USER {name: "Lamoni"})-[:CREATED]-(p:POST)
OPTIONAL MATCH (p)<-[:LIKES]-(u2:USER)
OPTIONAL MATCH (p)<-[:RESHARED]-(p2:POST)<-[:LIKES]-(u3:USER)
Any ideas on an optimal way to do this and be able to order them by a property called created_at in a descending order?
Thanks!
If the POST structure always looks like this you can try:
// match the whole user-post-post path
MATCH (u:USER {name: "Lamoni"})-[:CREATED]-(p_direct:POST)-[:RESHARED]-(p_shared:POST)
WITH u, p_direct, p_shared
OPTIONAL MATCH (p_direct)<-[:LIKES]-(u2:USER)
OPTIONAL MATCH (p_shared)<-[:LIKES]-(u3:USER)
RETURN u.name, p_direct.xyz, collect(u2.name), p_shared.xyz, collect(u3.name)
If you just want all USERs that like a POST by a given USER (independent of whether the POST was created or reshared), you can also collect all the POSTs:
MATCH (u:USER {name: "Lamoni"})-[:CREATED|RESHARED*1..2]-(p:POST)
WITH u, p
OPTIONAL MATCH (p)<-[:LIKES]-(u2:USER)
WITH u, p, u2
ORDER BY u2.created_at DESC
RETURN u.name, p, collect(u2.name)
For example, I created two linked nodes:
create (a:ACTOR {id: "a1", name: "bruce wellis"})
create (m:MOVIE {id: "m1", title: "die hardest"})
create (a)-[:ACTED_IN]->(m)
1. From this cypher query:
match (a:ACTOR {id: "a1"})
with a
optional match (m:MOVIE {id: "m1"})
set m += {
title: "die easier"
}
return a;
I get this result:
+-----------------------------------------+
| a |
+-----------------------------------------+
| Node[1000]{name:"bruce wellis",id:"a1"} |
+-----------------------------------------+
1 row
Properties set: 1
The query successfully returned the actor node.
2. (UPDATED) But if the OPTIONAL MATCH for MOVIE finds nothing:
match (a:ACTOR {id: "a1"})
with a
optional match (m:MOVIE {id: "mm"})
set m += {
title: "die easier"
}
return a;
I get this error:
CypherTypeException: Expected m to be a node or a relationship, but it was :`null`.
How can I make the second query return the matched actor?
A MATCH that fails to match anything will always return no rows.
So, in #2, since the second MATCH failed, it returns no rows.
You could use OPTIONAL MATCH in place of the second MATCH, and you should see results.
[EDITED]
For the Updated question, this (somewhat ugly) workaround should work:
MATCH (a:ACTOR {id: "a1"})
WITH a
OPTIONAL MATCH (m:MOVIE {id: "mm"})
WITH a, COLLECT(m) AS cm
FOREACH(m IN cm | SET m += {title: "die easier"})
RETURN a;
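A commonly used variant of the same trick skips the COLLECT and loops over a one-element or empty list built with CASE. This is only a sketch under the same assumptions as above (same labels and ids); the loop variable _ is a throwaway name:

```cypher
MATCH (a:ACTOR {id: "a1"})
OPTIONAL MATCH (m:MOVIE {id: "mm"})
// The list is empty when m is null, so the SET is skipped instead of failing.
FOREACH (_ IN CASE WHEN m IS NULL THEN [] ELSE [1] END |
  SET m += {title: "die easier"}
)
RETURN a;
```

Either form returns the actor row whether or not the movie exists.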