How to set all relationships of a certain type -- replacing the old ones if necessary -- in Neo4j? - neo4j

I have a node id for an event, and list of node ids for users that are hosting the event. I want to update these (:USER)-[:HOSTS]->(:EVENT) relationships. I dont just want to add the new ones, I want to remove the old ones as well.
NOTE: this is coffeescript syntax where #{} is string interpolation, and str() will escape any characters for me.
Right now I'm querying all the hosts:
MATCH (u:USER)-[:HOSTS]->(:EVENT {id:#{str(eventId)}})
RETURN u.id
Then I'm determining which hosts are new and need to be added and which ones are old and need to be removed. For the old ones, I remove them
MATCH (:HOST {id:#{str(host.id)}})-[h:HOSTS]->(:EVENT {id:#{str(eventId)}})
DELETE h
And for the new ones, I add them:
MATCH (e:EVENT {id: #{str(eventId)}})
MERGE (u:USER {id:#{str(id)}})
SET u.name =#{str(name)}
MERGE (u)-[:HOSTS]->(e)
So my question is, can I do this more efficiently all in one query? I want want to set the new relationships, getting rid of any previous relationships that arent in the new set.

If I understand your question correctly, you can achieve your objective in a single query by introducing WITH and FOREACH. On a sample graph created by
CREATE (_1:User { name:"Peter" }),(_2:User { name:"Paul" }),(_3:User { name:"Mary" })
CREATE (_4:Event { name:"End of the world" })
CREATE _1-[:HOSTS]->_4, _2-[:HOSTS]->_4
you can remove the no longer relevant hosts, and add the new hosts, as such
WITH ["Peter", "Mary"] AS hosts, "End of the world" AS eventId
MATCH (event:Event { name:eventId })<-[r:HOSTS]-(u:User)
WHERE NOT u.name IN hosts
DELETE r
WITH COLLECT(u.name) AS oldHosts, hosts, event
WITH FILTER(h IN hosts
WHERE NOT h IN oldHosts) AS newHosts, event, oldHosts
FOREACH (n IN newHosts |
MERGE (nh:User { name:n })
MERGE nh-[:HOSTS]->event
)
I have made some assumptions, at least including
The new host (:User) of the event may already exists, therefore MERGE (nh:User { name:n }) and not CREATE.
The old [:HOSTS]s should be disconnected from the event, but not removed from the database.
Your coffee script stuff can be translated into parameters, and you can translate my pseudo-parameters into parameters. In my sample query I simulate parameters with the first line, but you may need to adapt the syntax according to how you actually pass the parameters to the query (I can't turn Coffee into Cypher).
Click here to test the query. Change the contents of the hosts array to ["Peter", "Paul"], or to ["Peter", "Dragon"], or whatever value makes sense to you, and rerun the query to see how it works. I've used name rather than id to catch the nodes, and again, I've simulated parameters, but you might be able to translate the query to the context from which you want to execute it.
Edit:
Re comment, if you want the query to also match events that don't have any hosts you need to make the -[:HOSTS]- part of the pattern optional. Do so by braking the MATCH clause in two:
MATCH (event:Event { name:eventId })
OPTIONAL MATCH event<-[r:HOSTS]-(u:User)
The rest of the query is the same.

Related

Create doesn't make all nodes and relationships appear

I just downloaded and installed Neo4J. Now I'm working with a simple csv that is looking like that:
So first I'm using this to merge the nodes for that file:
LOAD CSV WITH HEADERS FROM 'file:///Athletes.csv' AS line
MERGE(Rank:rank{rang: line.Rank})
MERGE(Name:name{nom: line.Name})
MERGE(Sport:sport{sport: line.Sport})
MERGE(Nation:nation{pays: line.Nation})
MERGE(Gender: gender{genre: line.Gender})
MERGE(BirthDate:birthDate{dateDeNaissance: line.BirthDate})
MERGE(BirthPlace: birthplace{lieuDeNaissance: line.BirthPlace})
MERGE(Height: height{taille: line.Height})
MERGE(Pay: pay{salaire: line.Pay})
and this to create some constraint for that file:
CREATE CONSTRAINT ON(name:Name) ASSERT name.nom IS UNIQUE
CREATE CONSTRAINT ON(rank:Rank) ASSERT rank.rang IS UNIQUE
Then I want to display to which country the athletes live to. For that I use:
Create(name)-[:WORK_AT]->(nation)
But I have have that appear:
I would like to know why I have that please.
I thank in advance anyone that takes time to help me.
Several issues come to mind:
If your CREATE clause is part of your first query: since the CREATE clause uses the variable names name and nation, and your MERGE clauses use Name and Nation (which have different casing) -- the CREATE clause would just create new nodes instead of using the Name and Nation nodes.
If your CREATE clause is NOT part of your first query: your CREATE clause would just create new nodes (since variable names, even assuming they had the same casing, are local to a query and are not stored in the DB).
Solution: You can add this clause to the end of the first query:
CREATE (Name)-[:WORK_AT]->(Nation)
Yes, Agree with #cybersam, it's the case sensitive issue of 'name' and 'nation' variables.
My suggesttion:
MERGE (Name)-[:WORK_AT]->(Nation)
I see that you're using MERGE for nodes, so just in case any values of Name or Nation duplicated, you should use MERGE instead of CREATE.

Property values can only be of primitive types or arrays thereof in Neo4J Cypher query

I have the following params set:
:params "userId":"15229100-b20e-11e3-80d3-6150cb20a1b9",
"contextNames":[{"uid":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","name":"zhora"}],
"statements":[{"text":"oranges apples bananas","concepts":["orange","apple","banana"],
"mentions":[],"timestamp":15481867295710000,"name":"# banana","uid":"34232870-1e7f-11e9-8609-a7f6b478c007",
"uniqueconcepts":[{"name":"orange","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"apple","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"banana","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000}],"uniquementions":[]}],"timestamp":15481867295710000,"conceptsRelations":[{"from":"orange","to":"apple","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710000,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"apple","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"orange","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":4,"weight":2}],"mentionsRelations":[]
Then when I make the following query:
MATCH (u:User {uid: $userId})
UNWIND $contextNames as contextName
MERGE (context:Context {name:contextName.name,by:u.uid,uid:contextName.uid})
ON CREATE SET context.timestamp=$timestamp
MERGE (context)-[:BY{timestamp:$timestamp}]->(u)
WITH u, context
UNWIND $statements as statement
CREATE (s:Statement {name:statement.name, text:statement.text, uid:statement.uid, timestamp:statement.timestamp})
CREATE (s)-[:BY {context:context.uid,timestamp:s.timestamp}]->(u)
CREATE (s)-[:IN {user:u.id,timestamp:s.timestamp}]->(context)
WITH u, s, context, statement
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName}) ON CREATE SET c.uid=apoc.create.uuid()
CREATE (c)-[:BY {context:context.uid,timestamp:s.timestamp,statement:s.suid}]->(u)
CREATE (c)-[:OF {context:context.uid,user:u.uid,timestamp:s.timestamp}]->(s)
CREATE (c)-[:AT {user:u.uid,timestamp:s.timestamp,context:context.uid,statement:s.uid}]->(context) )
WITH u, s
UNWIND $conceptsRelations as conceptsRelation MATCH (c_from:Concept{name: conceptsRelation.from}) MATCH (c_to:Concept{name: conceptsRelation.to})
CREATE (c_from)-[:TO {context:conceptsRelation.context,statement:conceptsRelation.statement,user:u.uid,timestamp:conceptsRelation.timestamp, uid:apoc.create.uuid(), gapscan:conceptsRelation.gapscan, weight: conceptsRelation.weight}]->(c_to)
RETURN DISTINCT s.uid
But when I run it, I get this error:
Neo.ClientError.Statement.TypeError
Property values can only be of primitive types or arrays thereof
Anybody knows why it's coming up? My params seem to be set correctly, I didn't see they couldn't be used in this way... Thanks!
Looks like the problem is here:
...
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName})
...
uniqueconcepts in your parameter is a list of objects, not a list of strings, so when attempting to MERGE conceptName, it errors out as conceptName isn't a primitive type (or array or primitive types). I think you'll want to use uniqueConcept instead of conceptName, and in your MERGE use name:uniqueConcept.name. Check for other usages of the elements of statement.uniqueconcepts.
This answer is for other n00bs like me that are trying to put a composite datatype into a property without reading the friendly manual, and get the error above. Google points here, so I felt appropriate to add this answer.
Specifically, I wanted to store a list [(datetime, event), ...] of tuples into a property of a relation.
Potential encountered errors are:
Neo.ClientError.Statement.TypeError: Property values can only be of primitive types or arrays thereof
Neo.ClientError.Statement.TypeError: Neo4j only supports a subset of Cypher types for storage as singleton or array properties. Please refer to section cypher/syntax/values of the manual for more details.
The bottom line is well summarized in this forum post by a Neo4j staff member:
Neo4j doesn't allow maps as properties (no sub-properties allowed, basically), and though lists are allowed as properties, they cannot be lists of maps (or lists of lists for that matter).
Basically I was trying to bypass the natural functionality of the DB. There seem to be 2 workarounds:
Dig your heels in as suggested here, and store the property as e.g. a JSON string
Rethink the design, and model these kind of properties into the graph (i.e. being more specific with the nodes)
After a little rethinking I came up with a much simpler data model that didn't require composite properties in relations. Although option 1 may have its uses, when we have to insist against a well-designed system (which neo4j is), that is usually an indicator that we should change course.
Andres

Merging a set of nodes onto one (only in query)

I am currently investigating how to model a bitemporal graph in neo4j. Unfortunately noone seems to have publicly undertaken this before.
One particular thing I am looking at is whether I can store in a new node only those values that have changed and then express a query that would merge all those values ordered by a given timestamp:
This creates the data I am playing with:
CREATE (:P1 {id: '1'})<-[:EXPANDS {date:5200, recorded:5100}]-(:P1Data {name:'Joe', wage: 3000})
// New data, recorded 2014-10-1 for 2015-1-1
MATCH (p:P1 {id: '1'}) CREATE (:P1Data { wage:3100 })-[:EXPANDS { date:5479, recorded: 5387}]->(p)
Now, I can get a history for a given point in time so far, e.g. like
MATCH (:P1 { id: '1' })<-[x:EXPANDS]-(d:P1Data)
WHERE x.recorded < 6000
WITH {date: x.date, data:d} as data
RETURN data
ORDER BY data.date DESC
What I would like to achieve is to merge the name and wage values such that I get a whole view of the data at a given point in time. The answer may also be that this is not really possible.
(PS: I say only in query, because I found a refactor function in apoc which does merge nodes, but that procedure actually merges and persists the node, while I would just want to query it).
As with most things, you can do it using REDUCE like so:
MATCH (:P1 { id: '1' })<-[x:EXPANDS]-(d:P1Data)
WITH x.date AS date, d AS data
ORDER BY date
WITH COLLECT(data) AS datas
WITH REDUCE(s = {}, y IN datas|
{name: COALESCE(y.name, s.name),
wage: COALESCE(y.wage, s.wage)})
AS most_recent_fields
RETURN most_recent_fields.name AS name, most_recent_fields.wage AS wage
You can do it in descending order instead (swap s and y inside the COALESCE statements if so), but there isn't really a way to shortcut processing the entire set of results from your queried time back to the start.
UPDATE: This will, of course, generate a Map and not a Node, but if you only want the properties and don't want to create a permanent record, a Map is actually better suited to your needs.
EXTENDED: If you don't want to specify which keys to use, you can do it without REDUCE like this instead:
MATCH (:P1 { id: '1' })<-[x:EXPANDS]-(d:P1Data)
WITH x.date AS date, d AS data
ORDER BY date
WITH COLLECT(data) AS datas
CREATE (t:Temp)
FOREACH(data IN datas|
SET t += data)
DELETE t
RETURN t
This approach does create a node, but if you DELETE it right before you RETURN it, it won't persist at all. += ensures that pre-existing properties aren't removed, only overwritten if the data node has existing values.

Cypher query - Optional Create

I am trying to create a social network-like structure.
I would like to create a timeline of posts which looks like this
(user:Person)-[:POSTED]->(p1:POST)-[:PREV]->[p2:POST]...
My problem is the following.
Assuming a post for a user already exists, I can create a new post by executing the following cypher query
MATCH (user:Person {id:#id})-[rel:POSTED]->(prev_post:POST)
DELETE rel
CREATE (user)-[:POSTED]->(post:POST {post:"#post", created:timestamp()}),
(post)-[:PREV]->(prev_post);
Assuming, the user has not created a post yet, this query fails. So I tried to somehow include both cases (user has no posts / user has at least one post) in one update query (I would like to insert a new post in the "post timeline")
MATCH (user:Person {id:"#id"})
OPTIONAL MATCH (user)-[rel:POSTED]->(prev_post:POST)
CREATE (post:POST {post:"#post2", created:timestamp()})
FOREACH (o IN CASE WHEN rel IS NOT NULL THEN [rel] ELSE [] END |
DELETE rel
)
FOREACH (o IN CASE WHEN prev_post IS NOT NULL THEN [prev_post] ELSE [] END |
CREATE (post)-[:PREV]->(o)
)
MERGE (user)-[:POSTED]->(post)
Is there any kind of if-statement (or some type of CREATE IF NOT NULL) to avoid using a foreach loop two times (the query looks a litte bit complicated and I know that the loop will only run 1 time)?.
However, this was the only solution, I could come up with after studying this SO post. I read in an older post that there is no such thing as an if-statement.
EDIT: The question is: Is it even good to include both cases in one query since I know that the "no-post case" will only occur once and that all other cases are "at least one post"?
Cheers
I've seen a solution to cases like this in some articles. To use a single query for all cases, you could create a special terminating node for the list of posts. A person with no posts would be like:
(:Person)-[:POSTED]->(:PostListEnd)
Now in all cases you can run the query:
MATCH (user:Person {id:#id})-[rel:POSTED]->(prev_post)
DELETE rel
CREATE (user)-[:POSTED]->(post:POST {post:"#post", created:timestamp()}),
(post)-[:PREV]->(prev_post);
Note that the no label is specified for prev_post, so it can match either (:POST) or (:PostListEnd).
After running the query, a person with 1 post will be like:
(:Person)-[:POSTED]->(:POST)-[:PREV]->(:PostListEnd)
Since the PostListEnd node has no info of its own, you can have the same one node for all your users.
I also do not see a better solution than using FOREACH.
However, I think I can make your query a bit more efficient. My solution essentially merges the 2 FOREACH tests into 1, since prev_postand rel must either be both NULL or both non-NULL. It also combines the CREATE and the MERGE (which should have been a CREATE, anyway).
MATCH (user:Person {id:"#id"})
OPTIONAL MATCH (user)-[rel:POSTED]->(prev_post:POST)
CREATE (user)-[:POSTED]->(post:POST {post:"#post2", created:timestamp()})
FOREACH (o IN CASE WHEN prev_post IS NOT NULL THEN [prev_post] ELSE [] END |
DELETE rel
CREATE (post)-[:PREV]->(o)
)
In the Neo4j v3.2 developer manual it specifies how you can create essentially a composite key made of multiple node properties at this link:
CREATE CONSTRAINT ON (n:Person) ASSERT (n.firstname, n.surname) IS NODE KEY
However, this is only available for the Enterprise Edition, not Community.
"CASE" is as close to an if-statement as you're going to get, I think.
The FOREACH probably isn't so bad given that you're likely limited in scope. But I see no particular downside to separating the query into two, especially to keep it readable and given the operations are fairly small.
Just my two cents.

CREATE UNIQUE with labels and properties

I'm using Neo4j 2.0.0-M06. Just learning Cypher and reading the docs. In my mind this query would work, but I should be so lucky...
I'm importing tweets to a mysql-database, and from there importing them to neo4j. If a tweet is already existing in the Neo4j database, it should be updated.
My query:
MATCH (y:Tweet:Socialmedia) WHERE
HAS (y.tweet_id) AND y.tweet_id = '123'
CREATE UNIQUE (n:Tweet:Socialmedia {
body : 'This is a tweet', tweet_id : '123', tweet_userid : '321', tweet_username : 'example'
} )
Neo4j says: This pattern is not supported for CREATE UNIQUE
The database is currently empty on nodes with the matching labels, so there are no tweets what so ever in the Neo4j database.
What is the correct query?
You want to use MERGE for this query, along with a unique constraint.
CREATE CONSTRAINT on (t:Tweet) ASSERT t.tweet_id IS UNIQUE;
MERGE (t:Tweet {tweet_id:'123'})
ON CREATE
SET t:SocialMedia,
t.body = 'This is a tweet',
t.tweet_userid = '321',
t.tweet_username = 'example';
This will use an index to lookup the tweet by id, and do nothing if the tweet exists, otherwise it will set those properties.
I would like to point that one can use a combination of
CREATE CONSTRAINT and then a normal
CREATE (without UNIQUE)
This is for cases where one expects a unique node and wants to throw an exception if the node unexpectedly exists. (Far cheaper than looking for the node before creating it).
Also note that MERGE seems to take more CPU cycles than a CREATE. (It also takes more CPU cycles even if an exception is thrown)
An alternative scenario covering CREATE CONSTRAINT, CREATE and MERGE (though admittedly not the primary purpose of this post).

Resources