I'm trying to implement a soft delete in Neo4j. The graph described in Cypher from Alice's viewpoint is as such:
(clyde:User)<-[:FOLLOWS]-(alice:User)-[:LIKES]->(bob:User)
Instead of actually deleting a node and its relationships, I'm:
changing its label so it can no longer be looked up directly, i.e. dropping its User label and adding a _User label (notice the underscore);
replacing its relationships so it can't be reached anymore by my normal queries, e.g. deleting its :FOLLOWS relationships and replacing them with :_FOLLOWS relationships.
So this is basically the equivalent of moving a row to an archiving table in a relational database. I figured this is a pretty efficient approach because you're effectively never visiting the parts of the graph that have been soft-deleted. Also, you don't have to modify any of your existing queries.
The result of soft-deleting Alice should be this:
(clyde:User)<-[:_FOLLOWS]-(alice:_User)-[:_LIKES]->(bob:User)
My first attempt at the query was this:
match (user:User {Id: 1})
optional match (user)-[follows:FOLLOWS]->(subject)
remove user:User set user:_User
delete follows
create (user)-[:_FOLLOWS]->(subject);
The problem is that when this user is not following anyone, the query tries to create a relationship between user and null because the second match is optional, so it gives me this error: Other node is null.
My second attempt was this:
match (user:User {Id: 1})
remove user:User set user:_User
optional match (user)-[follows:FOLLOWS]->(subject)
foreach (f in filter(f in collect({r: follows, n: subject}) where f.r is not null) | delete f.r create (user)-[:_FOLLOWS]->(f.n));
So I'm putting the relationship and the subject into a map, collecting these maps in a collection, throwing every "empty" map away and looping over the collection. But this query gives me this error:
SyntaxException: Invalid input '.': expected an identifier character, node labels, a property map, whitespace or ')' (line 1, column 238)
Does anyone know how I can fix this?
Thanks,
Jan
Could you change the label first and then match for relationships? Then you should be able to use 'non-optional' match, and not have to deal with the cases where there are no follows relationships, something like
MATCH (user:User {Id: 1})
REMOVE user:User SET user:_User
WITH user
MATCH (user)-[follows:FOLLOWS]->(subject)
DELETE follows
CREATE (user)-[:_FOLLOWS]->(subject)
Or you could carry the user, follows and subject and filter on where subject is not null. Something like
MATCH (user:User {Id: 1})
OPTIONAL MATCH (user)-[follows:FOLLOWS]->(subject)
REMOVE user:User SET user:_User
WITH user, follows, subject
WHERE subject IS NOT NULL
DELETE follows
CREATE (user)-[:_FOLLOWS]->(subject)
Edit:
If the problem is that you want to do this for more than one kind of relationship, then you could try
MATCH (user:User {Id: 1})
REMOVE user:User SET user:_User
WITH user
MATCH (user)-[f:FOLLOWS]->(other)
DELETE f
CREATE (user)-[:_FOLLOWS]->(other)
WITH user LIMIT 1
MATCH (user)-[l:LIKES]->(other)
DELETE l
CREATE (user)-[:_LIKES]->(other)
You can keep extending it with other relationship types, just be sure to limit user when you carry, since multiple matches (user)-[r]->(other) means there are multiple results for user, or you'll run the next query part multiple times.
I don't think there is a generic way to do it in cypher since you can't dynamically build the relationship type (i.e. CREATE (a)-[newRel:"_"+type(oldRel)]->(b) doesn't work)
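Since the relationship type cannot be computed inside Cypher, one possible workaround is to build the query text in application code, one statement per relationship type. Below is a minimal Python sketch of that idea, assuming the underscore-prefix convention from the question. The function name soft_delete_queries and the $id parameter are hypothetical, and the resulting strings still have to be sent through whatever driver you use.

```python
import re

# Relationship types are spliced into the query text, so only plain
# identifiers are accepted to avoid injecting arbitrary Cypher.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def soft_delete_queries(rel_types):
    """Return one Cypher statement per outgoing relationship type, followed by
    a statement that swaps the label. Hypothetical helper, not a driver API."""
    queries = []
    for rel_type in rel_types:
        if not _IDENTIFIER.fullmatch(rel_type):
            raise ValueError(f"unsafe relationship type: {rel_type!r}")
        queries.append(
            "MATCH (user:User {Id: $id})-[r:" + rel_type + "]->(other) "
            "DELETE r "
            "CREATE (user)-[:_" + rel_type + "]->(other)"
        )
    # Relabel last, so the MATCH on :User above can still find the node.
    queries.append("MATCH (user:User {Id: $id}) REMOVE user:User SET user:_User")
    return queries


for query in soft_delete_queries(["FOLLOWS", "LIKES"]):
    print(query)
```

Because each relationship type gets its own statement, a type with no matches simply does nothing and cannot cut the others short. The statements should run inside a single transaction so the soft delete stays atomic. Like the queries above, this sketch only handles outgoing relationships and does not copy relationship properties.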
Is something like that what you are looking for or am I misunderstanding your question?
I would like to get a node, delete all outgoing relationships of a certain type and then add back relationships.
The problem I have is that once I grab the node, it still maintains its previous relationships even after the delete, so instead of having 1 it keeps doubling whatever it has: 1->2->4->8 etc.
Sample graph:
CREATE (a:Basic {name:'a'})
CREATE (b:Basic {name:'b'})
CREATE (c:Basic {name:'c'})
CREATE (a)-[:TO]->(b)
CREATE (a)-[:SO]->(c)
The query below deletes the previous relationships and then adds the new ones. (This is just a brief sample; in reality it wouldn't add back the same relationships, but would more than likely point to a different node.)
MATCH (a:Basic {name:'a'})
WITH a
OPTIONAL MATCH (a)-[r:TO|SO]->()
DELETE r
WITH a
MATCH (b:Basic {name:'b'})
CREATE (a)-[:TO]->(b)
WITH a
MATCH (c:Basic {name:'c'})
CREATE (a)-[:SO]->(c)
If I change the CREATE to MERGE then it solves the problem, but it feels odd to have to merge when I know that I just deleted all the relationships. Is there a way to update "a" midway through the query so it reflects the changes? I would like to keep it in one query.
The behavior you observed is due to the subtle fact that the OPTIONAL MATCH clause generated 2 rows of data, which caused all subsequent operations to be done twice.
To force there to be only a single row of data after the DELETE clause, you can use WITH DISTINCT a (instead of WITH a) right after the DELETE clause, like this:
MATCH (a:Basic {name:'a'})
OPTIONAL MATCH (a)-[r:TO|SO]->()
DELETE r
WITH DISTINCT a
MATCH (b:Basic {name:'b'})
CREATE (a)-[:TO]->(b)
WITH a
MATCH (c:Basic {name:'c'})
CREATE (a)-[:SO]->(c)
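The row-doubling described above can be illustrated with a toy model. This is only a Python sketch of the "one execution per row" idea, not how Neo4j is implemented; the helper name count_creates is made up for the illustration.

```python
# Toy model of Cypher's row pipeline: each clause runs once per incoming row.
rows = [{"a": "a"}]          # MATCH (a:Basic {name:'a'}) yields 1 row

outgoing = ["TO", "SO"]      # a has two outgoing relationships
# OPTIONAL MATCH (a)-[r:TO|SO]->() yields one row per relationship: 2 rows
rows = [{**row, "r": r} for row in rows for r in outgoing]


def count_creates(rows):
    """A CREATE clause executes once for every row that reaches it."""
    return len(rows)


# WITH a: drops the r column, but still passes on one row per input row.
with_a = [{"a": row["a"]} for row in rows]

# WITH DISTINCT a: identical rows collapse into a single row.
with_distinct_a = [dict(t) for t in {tuple(sorted(r.items())) for r in with_a}]

print(count_creates(with_a))           # 2
print(count_creates(with_distinct_a))  # 1
```

With two rows reaching each CREATE, every run creates two relationships per type, which is why the count doubles on each execution. With DISTINCT only one row survives, so each CREATE runs once.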
I am using UNWIND to create multiple nodes in Neo4j. The problem is that if one of the nodes is a duplicate, it is rejected and the entire query fails. I want to be able to create multiple relationships between the same nodes if they already exist, e.g. a friend could receive multiple invitations from the same person. So I have an array of objects [{email: xxx#mail.com},{email:yyy#mymail.com},...] to be invited, and the email of the sponsor, sponsorEmail. There is a uniqueness constraint on email, so an attempt to create a duplicate fails and rejects the entire query. The following works fine when there is no duplicate.
MATCH (s {email: 'sponsor#gmail.com'})
UNWIND $arrayOfObjects as invitees
CREATE (i:Invitee) MERGE (s)-[r:INVITED {since: timestamp()}]->(i)
SET i=invitees
I have tried substituting MERGE for the CREATE, thinking that the MERGE would find a match and proceed to create the relationship, but it did not work: I still get the duplicate error. Short of cleaning the arrayOfObjects before executing the query, is there another way to do this? What I want is for the duplicate not to fail but to create a relationship with the existing Invitee node.
You need to MERGE the :Invitee node along with the invitee email. As it is now, you are creating empty :Invitee nodes, and only after they are created setting the email address. You need to MERGE them with the email address. (also you should use a label on your first MATCH, otherwise it does an AllNodesScan...I'll assume it's :Invitee for now, but please replace with whatever label makes sense).
MATCH (s:Invitee {email: 'sponsor#gmail.com'})
UNWIND $arrayOfObjects as invitee
MERGE (i:Invitee {email:invitee.email})
CREATE (s)-[r:INVITED {since: timestamp()}]->(i)
[EDITED]
Your MERGE could match any existing Invitee, and your SET could try to change its email value to a non-unique value. This is probably why you get the constraint violation.
If you want only a single INVITED relationship (with the latest timestamp) for each invitee, this query may do what you want:
MATCH (s {email: 'sponsor#gmail.com'})
UNWIND $arrayOfObjects as invitee
MERGE (i:Invitee {email: invitee.email})
ON CREATE SET i = invitee
MERGE (s)-[r:INVITED]->(i)
ON CREATE SET r.since = timestamp()
This query assumes that every map in arrayOfObjects contains a unique email property value. (You should also create an index or uniqueness constraint on :Invitee(email) to speed up the first MERGE.)
The MERGE (s)-[r:INVITED {since: timestamp()}]->(i) clause (which specifies the current timestamp) is flawed, since it would not detect an existing relationship with an older since value -- so it would almost always create a new relationship. The MERGE (s)-[r:INVITED]->(i) clause would only create a relationship if none exists.
Or, if you want to keep track of the timestamps of every invitation, you could make the since value be an array of timestamps, like this:
MATCH (s {email: 'sponsor#gmail.com'})
UNWIND $arrayOfObjects as invitee
MERGE (i:Invitee {email: invitee.email})
ON CREATE SET i = invitee
MERGE (s)-[r:INVITED]->(i)
ON CREATE SET r.since = [timestamp()]
ON MATCH SET r.since = r.since + timestamp()
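To see why putting the timestamp inside the MERGE pattern defeats matching, here is a small Python stand-in for MERGE on a relationship. It is a sketch under simplifying assumptions (an in-memory list instead of a graph, integers instead of timestamp()), and merge_relationship is a made-up helper, not a Neo4j API.

```python
def merge_relationship(store, start, end, rel_type, props):
    """Toy MERGE: reuse an existing relationship only if start, end, type and
    every property listed in the pattern match; otherwise create a new one."""
    for rel in store:
        same_ends = (rel["start"], rel["end"], rel["type"]) == (start, end, rel_type)
        if same_ends and all(rel["props"].get(k) == v for k, v in props.items()):
            return rel, False
    rel = {"start": start, "end": end, "type": rel_type, "props": dict(props)}
    store.append(rel)
    return rel, True


# MERGE (s)-[r:INVITED {since: timestamp()}]->(i), run at two different times:
# the since value differs, so the second run never matches the first.
with_timestamp = []
for now in (1000, 2000):
    merge_relationship(with_timestamp, "s", "i", "INVITED", {"since": now})
print(len(with_timestamp))  # 2

# MERGE (s)-[r:INVITED]->(i) with ON CREATE / ON MATCH keeping a list instead.
history = []
for now in (1000, 2000):
    rel, created = merge_relationship(history, "s", "i", "INVITED", {})
    if created:
        rel["props"]["since"] = [now]                        # ON CREATE SET
    else:
        rel["props"]["since"] = rel["props"]["since"] + [now]  # ON MATCH SET
print(len(history), history[0]["props"]["since"])  # 1 [1000, 2000]
```

The first loop ends with two relationships, the second with a single relationship carrying both timestamps.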
Both answers submitted are good answers. I am submitting this answer with a few nuances for those who come along later. The main objective of the question was to be able to create multiple invitations to a friend who has not yet accepted and be able to visualize those invitations. The following is what I settled on:
WITH ['tom#abc.com', 'tony#mymail.com', 'michael#gmail.com'] AS coll
UNWIND coll AS invitee
WITH DISTINCT invitee
MATCH (s:Sponsor {email: 'mary#gmail.com'})
MERGE (i:Invitee {email: invitee})
CREATE (s)-[r:INVITED {since: timestamp()}]->(i)
RETURN r;
This allowed me to create multiple relationships, one for each invite sent to the same person, but only if sent at different times, which I can easily view.
I have a database in which I have Entity nodes, User nodes, and a couple of relationships including LIKES, POSTED_BY. I'm trying to write a query to achieve this objective:
Find all Entity nodes that a particular user LIKES or those that have been POSTED_BY that User
Note that I have simplified my query; in reality I have a bunch of other conditions similar to the above.
I'm trying to use a COLLECT clause to aggregate the list of all Entity nodes, and build on that line by line.
MATCH (e)<-[:LIKES]-(me:User{id: 'rJVbpcqzf'} )
WITH me, COLLECT(e) AS all_entities
MATCH (e)-[:POSTED_BY]->(me)
WITH me, all_entities + COLLECT(e) AS all_entities
UNWIND all_entities AS e
WITH DISTINCT e
RETURN e;
This seems to be returning the correct list ONLY if there is at least one Entity that the user has liked (i.e., if the first COLLECT returns a non-empty list). However, if there is no Entity that I have liked, the entire query returns empty.
Any suggestions on what I'm missing here?
Use OPTIONAL MATCH:
MATCH (me:User {id: 'rJVbpcqzf'})
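// LIKES points from the user to the entity and POSTED_BY from the entity to the user, so match in either direction
OPTIONAL MATCH (me)-[:LIKES|POSTED_BY]-(e)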
RETURN collect(DISTINCT e) AS all_entities
Notes:
Instead of collecting and unwinding, you can simply use DISTINCT. You can also use DISTINCT with collect.
You can also use multiple relationship types, i.e. the LIKES|POSTED_BY for the relationship type here.
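The reason the original query returned nothing can be shown with a toy row model. This is a Python sketch only, with made-up helpers match and optional_match and a hypothetical entity name post1; it mimics how a plain MATCH drops rows that find nothing while OPTIONAL MATCH keeps them with a null.

```python
def match(rows, expand):
    """MATCH: a row that finds no match is dropped."""
    return [{**row, "e": e} for row in rows for e in expand(row)]


def optional_match(rows, expand):
    """OPTIONAL MATCH: a row that finds no match survives with e = None."""
    out = []
    for row in rows:
        found = expand(row)
        out.extend({**row, "e": e} for e in (found or [None]))
    return out


liked = {"me": []}            # the user has liked nothing
posted = {"me": ["post1"]}    # but has posted one entity

start = [{"me": "me"}]
after_match = match(start, lambda row: liked[row["me"]])
after_optional = optional_match(start, lambda row: liked[row["me"]])
print(len(after_match), len(after_optional))  # 0 1

# With zero rows left, the later MATCH on POSTED_BY has nothing to run against.
print(match(after_match, lambda row: posted[row["me"]]))  # []
```

Once the first MATCH leaves zero rows, every later clause, including the second MATCH and the COLLECT, has no input, so the whole query comes back empty.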
I have a database with the entities person (name, age) and project (name).
Can I write a Cypher query that tells me whether a given node is a person or a project?
For example, consider that I have these two instances of each:
Node (name = Alice, age= 20)
Node (name = Bob, age = 31)
Node (name = project1)
Node (name = project2)
-I want to know whether there is a way to just say project1 and have it tell me that this is a project,
-or to query Alice and have it tell me that this is a person.
Thanks
So your use case is to search things by name, and those things can be of several types instead of a single type.
Just to note, in general, this is not what Neo4j is built for. Typically in Neo4j queries you know the type of the thing you're searching for, and you're exploring relationships between that thing (or things) to figure out associations or data derived from that.
That said, there are ways to do this, though it's worth going through the rest of your use cases and seeing if Neo4j is really the best tool for what you're trying to do.
Whenever you're querying by a property, you either want a unique constraint on the label/property, or an index on the label/property. Note that you need a combination of a label and a property for this; you cannot blindly ask for a node with a property without specifying a label and get good performance, as it will have to do a scan of all nodes in your database (there are some older manual indexes in Neo4j, but I'm not sure if these will continue to be supported; the schema indexes are recommended by the developers).
There is a workaround to this, as Neo4j allows multiple labels on the same node. If you only expect to query certain types by name (for example, only projects and people), you might create a :Named label, and set that label on all :Project and :Person nodes (and any other labels where it should apply). You can then create an index on :Named.name. That way your query would be something like:
MATCH (n:Named)
WHERE n.name = 'blah'
WITH LABELS(n) as types
WITH [type IN types WHERE type <> 'Named'] AS labels
RETURN labels
Keep in mind that you haven't specified if a name should be unique among node types, so it could be possible for a :Person or a :Project or multiple :Persons to have the same name, unsure how that affects what should happen on your end. If every named thing ought to have a unique name, you should create a unique constraint on :Named.name (though again, it's on you to ensure that every node you create that ought to be :Named has the :Named label applied on creation).
You should use node labels (like Person and Project) to represent node "types".
For example, to create a person and a project:
CREATE (:Person {name: 'Alice', age: 20})
CREATE (:Project {name: 'project1'})
To find the project(s) named 'Fred':
MATCH (p:Project {name: 'Fred'})
RETURN p;
To get a collection of the labels of node n, you can invoke the LABELS(n) function. You can then look in that collection to see if the label you are looking for is in there. For example, if your Cypher query somehow obtains a node n, then this snippet would return n if and only if it has the Person label:
.
.
.
WHERE 'Person' IN LABELS(n)
RETURN n;
[UPDATED]
If you want to find all nodes with the name property value of "Fred":
MATCH (n {name: 'Fred'})
...
If you want to find all relationships with the name property value of "Fred":
MATCH ()-[r {name: 'Fred'}]-()
...
If you want to match both in a single query, you have many ways to do that, depending on your exact use case. For example, if you want a cartesian product of the matching nodes and relationships:
OPTIONAL MATCH (n {name: 'Fred'})
OPTIONAL MATCH ()-[r {name: 'Fred'}]-()
...
I want to add a "created by" relationship on nodes in my database. Any node should be able to have this relationship, but there can never be more than one per node.
Right now my query looks something like this:
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.name='Node name', n.attribute='value'
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
As I understand Cypher, there is no way to achieve what I want: the current query only makes sure there are no duplicate relationships between TWO specific nodes, not that ONE node has at most one such relationship. So this will create more CREATED_BY relationships when run for another User, and I want to limit the outgoing CREATED_BY relationship to just one for every node.
Is there a way to achieve this without running multiple queries involving program logic?
Thanks.
Update
I tried to simplify the query by removing implementation details. If it helps, here's the updated query based on cybersam's response.
MERGE (c:Company {name: 'Test Company'})
ON CREATE SET c.uuid='db764628-5695-40ee-92a7-6b750854ebfa', c.created_at='2015-02-23 23:08:15', c.updated_at='2015-02-23 23:08:15'
WITH c
OPTIONAL MATCH (c)
WHERE NOT (c)-[:CREATED_BY]-()
CREATE (c)-[:CREATED_BY {date: '2015-02-23 23:08:15'}]->(u:User {token: '32ba9d2a2367131cecc53c310cfcdd62413bf18e8048c496ea69257822c0ee53'})
RETURN c
Still not working as expected.
Update #2
I ended up splitting this into two queries.
The problem I found was that there were two possible outcomes:
The CREATED_BY relationship was created and (n) was returned when using OPTIONAL MATCH. This relationship would always be created if it didn't already exist between (n) and (u), so changing the email attribute would re-create the relationship.
The node (n) was not found (because of not using OPTIONAL MATCH together with the WHERE NOT (c)-[:CREATED_BY]-() clause), resulting in no relationship being created (yay!), but without getting (n) back the MERGE query loses all its meaning, I think.
My Solution was the following two queries:
MERGE (n:Node {name: 'Name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)-[r:CREATED_BY]-()
RETURN n, r
Then I had program logic check the value of r; if there was no relationship, I would run the second query.
MATCH (n:Node {name: 'Name'})
MATCH (u:User {email: 'my#email.com'})
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
Unfortunately I couldn't find any real solution to combining this in one single query with Cypher. Sam, thanks! I've selected your answer even though it didn't quite solve my problem, but it was very close.
This should work for you:
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)
WHERE NOT (n)-[:CREATED_BY]->()
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(:User {email: 'my#mail.com'})
RETURN n;
I've removed the starting MATCH clause (because I presume you want to create a CREATED_BY relationship even when that User does not yet exist in the DB), and simplified the ON CREATE to remove the redundant setting of the name property.
I have also added an OPTIONAL MATCH that will only match an n node that does not already have an outgoing CREATED_BY relationship, followed by a CREATE UNIQUE clause that fully specifies the User node.