I am trying to create a social network-like structure.
I would like to create a timeline of posts which looks like this
(user:Person)-[:POSTED]->(p1:POST)-[:PREV]->[p2:POST]...
My problem is the following.
Assuming a post for a user already exists, I can create a new post by executing the following cypher query
MATCH (user:Person {id:#id})-[rel:POSTED]->(prev_post:POST)
DELETE rel
CREATE (user)-[:POSTED]->(post:POST {post:"#post", created:timestamp()}),
(post)-[:PREV]->(prev_post);
Assuming, the user has not created a post yet, this query fails. So I tried to somehow include both cases (user has no posts / user has at least one post) in one update query (I would like to insert a new post in the "post timeline")
MATCH (user:Person {id:"#id"})
OPTIONAL MATCH (user)-[rel:POSTED]->(prev_post:POST)
CREATE (post:POST {post:"#post2", created:timestamp()})
FOREACH (o IN CASE WHEN rel IS NOT NULL THEN [rel] ELSE [] END |
DELETE rel
)
FOREACH (o IN CASE WHEN prev_post IS NOT NULL THEN [prev_post] ELSE [] END |
CREATE (post)-[:PREV]->(o)
)
MERGE (user)-[:POSTED]->(post)
Is there any kind of if-statement (or some type of CREATE IF NOT NULL) to avoid using a foreach loop two times (the query looks a litte bit complicated and I know that the loop will only run 1 time)?.
However, this was the only solution, I could come up with after studying this SO post. I read in an older post that there is no such thing as an if-statement.
EDIT: The question is: Is it even good to include both cases in one query since I know that the "no-post case" will only occur once and that all other cases are "at least one post"?
Cheers
I've seen a solution to cases like this in some articles. To use a single query for all cases, you could create a special terminating node for the list of posts. A person with no posts would be like:
(:Person)-[:POSTED]->(:PostListEnd)
Now in all cases you can run the query:
MATCH (user:Person {id:#id})-[rel:POSTED]->(prev_post)
DELETE rel
CREATE (user)-[:POSTED]->(post:POST {post:"#post", created:timestamp()}),
(post)-[:PREV]->(prev_post);
Note that the no label is specified for prev_post, so it can match either (:POST) or (:PostListEnd).
After running the query, a person with 1 post will be like:
(:Person)-[:POSTED]->(:POST)-[:PREV]->(:PostListEnd)
Since the PostListEnd node has no info of its own, you can have the same one node for all your users.
I also do not see a better solution than using FOREACH.
However, I think I can make your query a bit more efficient. My solution essentially merges the 2 FOREACH tests into 1, since prev_postand rel must either be both NULL or both non-NULL. It also combines the CREATE and the MERGE (which should have been a CREATE, anyway).
MATCH (user:Person {id:"#id"})
OPTIONAL MATCH (user)-[rel:POSTED]->(prev_post:POST)
CREATE (user)-[:POSTED]->(post:POST {post:"#post2", created:timestamp()})
FOREACH (o IN CASE WHEN prev_post IS NOT NULL THEN [prev_post] ELSE [] END |
DELETE rel
CREATE (post)-[:PREV]->(o)
)
In the Neo4j v3.2 developer manual it specifies how you can create essentially a composite key made of multiple node properties at this link:
CREATE CONSTRAINT ON (n:Person) ASSERT (n.firstname, n.surname) IS NODE KEY
However, this is only available for the Enterprise Edition, not Community.
"CASE" is as close to an if-statement as you're going to get, I think.
The FOREACH probably isn't so bad given that you're likely limited in scope. But I see no particular downside to separating the query into two, especially to keep it readable and given the operations are fairly small.
Just my two cents.
Related
I am new to Neo4j and I have a relatively complex (but small) database which I have simplified to the following:
The first door has no key, all other doors have keys, the window doesn't require a key. The idea is that if a person has key:'A', I want to see all possible paths they could take.
Here is the code to generate the db
CREATE (r1:room {name:'room1'})-[:DOOR]->(r2:room {name:'room2'})-[:DOOR {key:'A'}]->(r3:room {name:'room3'})
CREATE (r2)-[:DOOR {key:'B'}]->(r4:room {name:'room4'})-[:DOOR {key:'A'}]->(r5:room {name:'room5'})
CREATE (r4)-[:DOOR {key:'C'}]->(r6:room {name:'room6'})
CREATE (r2)-[:WINDOW]->(r4)
Here is the query I have tried, expecting it to return everything except for room6, instead I have an error which means I really don't know how to construct the query.
with {key:'A'} as params
match (n:room {name:'room1'})-[r:DOOR*:WINDOW*]->(m)
where r.key=params.key or not exists(r.key)
return n,m
To be clear, I don't need my query debugged so much as help understanding how to write it correctly.
Thanks!
This should work for you:
WITH {key:'A'} AS params
MATCH p=(n:room {name:'room1'})-[:DOOR|WINDOW*]->(m)
WHERE ALL(r IN RELATIONSHIPS(p) WHERE NOT EXISTS(r.key) OR r.key=params.key)
RETURN n, m
With your sample data, the result is:
╒════════════════╤════════════════╕
│"n" │"m" │
╞════════════════╪════════════════╡
│{"name":"room1"}│{"name":"room2"}│
├────────────────┼────────────────┤
│{"name":"room1"}│{"name":"room3"}│
├────────────────┼────────────────┤
│{"name":"room1"}│{"name":"room4"}│
├────────────────┼────────────────┤
│{"name":"room1"}│{"name":"room5"}│
└────────────────┴────────────────┘
I have the following params set:
:params "userId":"15229100-b20e-11e3-80d3-6150cb20a1b9",
"contextNames":[{"uid":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","name":"zhora"}],
"statements":[{"text":"oranges apples bananas","concepts":["orange","apple","banana"],
"mentions":[],"timestamp":15481867295710000,"name":"# banana","uid":"34232870-1e7f-11e9-8609-a7f6b478c007",
"uniqueconcepts":[{"name":"orange","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"apple","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"banana","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000}],"uniquementions":[]}],"timestamp":15481867295710000,"conceptsRelations":[{"from":"orange","to":"apple","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710000,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"apple","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"orange","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":4,"weight":2}],"mentionsRelations":[]
Then when I make the following query:
MATCH (u:User {uid: $userId})
UNWIND $contextNames as contextName
MERGE (context:Context {name:contextName.name,by:u.uid,uid:contextName.uid})
ON CREATE SET context.timestamp=$timestamp
MERGE (context)-[:BY{timestamp:$timestamp}]->(u)
WITH u, context
UNWIND $statements as statement
CREATE (s:Statement {name:statement.name, text:statement.text, uid:statement.uid, timestamp:statement.timestamp})
CREATE (s)-[:BY {context:context.uid,timestamp:s.timestamp}]->(u)
CREATE (s)-[:IN {user:u.id,timestamp:s.timestamp}]->(context)
WITH u, s, context, statement
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName}) ON CREATE SET c.uid=apoc.create.uuid()
CREATE (c)-[:BY {context:context.uid,timestamp:s.timestamp,statement:s.suid}]->(u)
CREATE (c)-[:OF {context:context.uid,user:u.uid,timestamp:s.timestamp}]->(s)
CREATE (c)-[:AT {user:u.uid,timestamp:s.timestamp,context:context.uid,statement:s.uid}]->(context) )
WITH u, s
UNWIND $conceptsRelations as conceptsRelation MATCH (c_from:Concept{name: conceptsRelation.from}) MATCH (c_to:Concept{name: conceptsRelation.to})
CREATE (c_from)-[:TO {context:conceptsRelation.context,statement:conceptsRelation.statement,user:u.uid,timestamp:conceptsRelation.timestamp, uid:apoc.create.uuid(), gapscan:conceptsRelation.gapscan, weight: conceptsRelation.weight}]->(c_to)
RETURN DISTINCT s.uid
But when I run it, I get this error:
Neo.ClientError.Statement.TypeError
Property values can only be of primitive types or arrays thereof
Anybody knows why it's coming up? My params seem to be set correctly, I didn't see they couldn't be used in this way... Thanks!
Looks like the problem is here:
...
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName})
...
uniqueconcepts in your parameter is a list of objects, not a list of strings, so when attempting to MERGE conceptName, it errors out as conceptName isn't a primitive type (or array or primitive types). I think you'll want to use uniqueConcept instead of conceptName, and in your MERGE use name:uniqueConcept.name. Check for other usages of the elements of statement.uniqueconcepts.
This answer is for other n00bs like me that are trying to put a composite datatype into a property without reading the friendly manual, and get the error above. Google points here, so I felt appropriate to add this answer.
Specifically, I wanted to store a list [(datetime, event), ...] of tuples into a property of a relation.
Potential encountered errors are:
Neo.ClientError.Statement.TypeError: Property values can only be of primitive types or arrays thereof
Neo.ClientError.Statement.TypeError: Neo4j only supports a subset of Cypher types for storage as singleton or array properties. Please refer to section cypher/syntax/values of the manual for more details.
The bottom line is well summarized in this forum post by a Neo4j staff member:
Neo4j doesn't allow maps as properties (no sub-properties allowed, basically), and though lists are allowed as properties, they cannot be lists of maps (or lists of lists for that matter).
Basically I was trying to bypass the natural functionality of the DB. There seem to be 2 workarounds:
Dig your heels in as suggested here, and store the property as e.g. a JSON string
Rethink the design, and model these kind of properties into the graph (i.e. being more specific with the nodes)
After a little rethinking I came up with a much simpler data model that didn't require composite properties in relations. Although option 1 may have its uses, when we have to insist against a well-designed system (which neo4j is), that is usually an indicator that we should change course.
Andres
I trying to set up a scheme for web-clicks, where each node is a (:Click), which links to the click that precedes it by a [:PREV]-edge and the (:Session) that owns it by a [:GEN]-edge. In the end this should happen procedural, a new transaction/insert when a new click is made. While I have no problem generating the involved objects, I cannot figure out how to dynamically select last (:Click) and link it to the current created one.
Generate a session with 2 clicks:
CREATE (s:Session {name:'S0'})
CREATE (c1:Click {name:'C1', click:1}), (c1)<-[:GEN]-(s)
CREATE (c2:Click {name:'C2', click:2}), (c2)<-[:GEN]-(s), (c1)<-[:PREV]-(c2);
generate one other click in separated transaction:
MERGE (s:Session {name:'S0'})
CREATE (c3:Click {name:'C3', click:3}),
(c3)<-[:GEN]-(s) //(c2)<-[:PREV]-(c3);
for the commented out link, I cannot use the c2-variable as it is scope-local to the previous transaction.
Now I thought to try something like this to dynamically find the last generated node on the same session and link it
MERGE (s:Session {name:'S0'})
CREATE (c3:Click {name:'C3', click:3}), (c3)<-[:GEN]-(s)
MATCH (s)-[:GEN]->(c_prevs:Click)
WITH c_prevs
ORDER BY c_prevs.click DESC LIMIT 1
CREATE (head(c_prevs))<-[:PREV]-(c3)
Unfortunately this won't work for me with any Cypher-construct I came up with so far.
If I understand you can get the last :Click node on the same session this way:
match (:Session {name:'S0'})-[:GEN]->(c:Click)
where not (:Click)-[:PREV]->(c)
return c
That is: Get the node from the same session that does not have an incoming [PREV] relationship. Will return c2
╒═══════════════════════╕
│"c" │
╞═══════════════════════╡
│{"name":"C2","click":2}│
└───────────────────────┘
For your specific case a query like the following should work:
merge (s:Session {name:'S0'})
with s
match (s)-[:GEN]->(last:Click)
where not (:Click)-[:PREV]->(last)
create (c3:Click {name:'C3', click:3}),
(c3)<-[:GEN]-(s),
(last)<-[:PREV]-(c3)
I found the answer to my question to be the following
MATCH (s:Session {name:'S0'})
CREATE (c3:Click {name:'C3', click:3})
WITH s, c3
MATCH (s)-[:GEN]->(c_prev:Click)
WITH c_prev, c3, s
ORDER BY c_prev.click DESC LIMIT 1
WITH c_prev, c3, s
CREATE (c_prev)<-[:PREV]-(c3), (c3)<-[:GEN]-(s)
which is chaining through the nodes as variables s, c3 and last_c with the WITH keyword. Unfortunately this involves a lot of repetition, as every WITH in principle is a part-separator in the query, so I learned.
This also allows to carry over already MERGED/CREATED nodes, which might help to ensure their existence.
EDIT:
This problem seems to be even more complicated if clicks should be generated prozedural, thus using one cypher-statement to insert and link any click.
my solution looks like the following
MERGE (s:Session {name: $session_name})
WITH s
CREATE (c:Click {name: $click_name, click: $click_count})
WITH s, c
OPTIONAL MATCH (s)-[:GEN]->(c_prev:Click)
WITH c_prev, c, s
ORDER BY c_prev.click DESC LIMIT 1
WITH c_prev, c, s
FOREACH (o IN CASE WHEN c_prev IS NOT NULL THEN ['1'] ELSE [] END |
CREATE (c_prev)<-[:PREV]-(c)
)
WITH s, c
CREATE (c)<-[:GEN]-(s)
with executing this statement for {$session_name, $click_name, $click_count} =[{'AAA', 'C1', 1}, {'AAA', 'C2', 2}, {'AAA', 'C3', 3}].
Notice that I had to work around the returning empty node-list by explicitly catching this condition and then not executing the subsequent connection statement with the FOREACH-loop on an empty list. This does not only look very ugly, I sincerely think there should be a better way to expressively specify this desired behavior through Cypher in the near future.
I think this is a long question for what's likely a simple answer. However I thought it wise to include the full context in case there's something wrong with my query logic (excuse the formatting if it's off - I've renamed the vars and it may be malformed, I need help with the theory and not the structure)
An organisation can have a sub office
(o:Organisation)-[:sub_office]->(an:Organisation)
Or a head office
(o)-[:head_office]->(ho:Organisation)
Persons in different sub offices can be employees or ex-employee
EX1
(o)-[:employee]->(p:Person{name:'person1'})<-[:ex_employee]-(an)
Persons can be related to other people through the management relationships. These management links can be variable length.
EX2
(o)-[:employee]->(p:Person{name:'person2'})-[:managed]->(p:Person{name:'person3'})<-[:ex_employee]-(an)
(o)-[:ex_employee]->(p:Person{name:'person4'})-[:managed]->(p:Person{name:'NOT_RETURNED1'})-[:managed]->(p:Person{name:'person5'})<-[:employee]-(an)
(o)-[:ex_employee]->(p:Person{name:'person6'})<-[:managed]-(p:Person{name:'NOT_RETURNED2'})<-[:managed]-(p:Person{name:'person8'})<-[:employee]-(an)
(o)-[:ex_employee]->(p:Person{name:'person9'})-[:managed]->(p:Person{name:'NOT_RETURNED4'})-[:managed]->(p:Person{name:'NOT_RETURNED5'})<-[:managed]-(p:Person{name:'person11'})<-[:employee]-(an)
....
I'm querying:
-organisation,
-sub office,
-how they're related
These are all working fine (I think...)
The issues I'm having is with returning Persons associated with the orgs (employees or ex employees) and their relationships to the organisation but only if they are connected to the other organisation directly (as in EX1) or through a managed chain (all of EX2 - I've tried to make it clearer by marking the Persons who won't be returned by the query as name 'NOT_RETURNED')
I've created the following:
MATCH (queryOrganisation:Organisation{name:'BigCorp'})-[orgRel]-(relatedOrganisation:Organisation)
WITH queryOrganisation, orgRel, relatedOrganisation
MATCH (queryOrganisation)-[employmentRel]->(queryPerson:Person)
OPTIONAL MATCH (queryPerson)<-[relatedOrgRel]-(relatedOrganisation)
OPTIONAL MATCH (queryPerson)-[:managed*1..]-(relatedPerson:Person)<-[relatedOrgRel]-(relatedOrganisation)
WITH queryOrganisation, orgRel, relatedOrganisation, employmentRel, queryPerson, relatedOrgRel, relatedPerson
WHERE NOT queryOrganisation.name = relatedOrganisation.name
RETURN ID(queryOrganisation) as queryOrganisationID,
ID(startNode(orgRel))as startNodeId, type(orgRel)as orgRel, ID(endNode(orgRel))as endNodeId,
ID(relatedOrganisation)as relatedOrganisationId, relatedOrganisation.name as relatedOrganisationName
COLLECT({
queryPerson:{endpoint:{ID:ID(queryPerson)}, endpointrelationship: type(employmentRel)},
relatedPerson:{endpoint:{ID:coalesce(ID(relatedPerson),ID(queryPerson))}, endpointrelationship:type(relatedOrgRel)}
}) as rels
I would have expected all the collected results to look like:
{
"startEmp":{
"ID":2715,
"startrelationship":"employee"
},
"relatedEmp":{
"ID":2722,
"endrelationship":"ex employee"
}
}
However the directly connected node results (same node ID) appear like:
{
"startEmp":{
"ID":2716,
"startrelationship":"employee"
},
"relatedEmp":{
"ID":2716,
"endrelationship":null
}
}
Why is that null appearing for type(relatedOrgRel)? Am I misunderstanding whats happening in the OPTIONAL MATCH and the relatedOrgRel gets overwritten by null during the second OPTIONAL MATCH? If so, how can I remedy?
Thanks
No, the OPTIONAL MATCHes cannot overwrite variables that are already defined.
I think the cause of the problems is when your second OPTIONAL MATCH doesn't match anything, but this is partially covered up by the COALESCE used in the collecting of persons in your return hides some of the conseque:
...
relatedPerson:{endpoint:{ID:coalesce(ID(relatedPerson),ID(queryPerson))}, endpointrelationship:type(relatedOrgRel)}
...
If relatedPerson is null, as it will be if your second OPTIONAL MATCH fails, then you're falling back to the id of queryPerson, but since you're not using a COALESCE for relatedOrgRel, this will still be null. You'll need a COALESCE here, or otherwise you'll need to figure out a better way to deal with the null variables in your OPTIONAL MATCHES in cases where they fail.
I have a node id for an event, and list of node ids for users that are hosting the event. I want to update these (:USER)-[:HOSTS]->(:EVENT) relationships. I dont just want to add the new ones, I want to remove the old ones as well.
NOTE: this is coffeescript syntax where #{} is string interpolation, and str() will escape any characters for me.
Right now I'm querying all the hosts:
MATCH (u:USER)-[:HOSTS]->(:EVENT {id:#{str(eventId)}})
RETURN u.id
Then I'm determining which hosts are new and need to be added and which ones are old and need to be removed. For the old ones, I remove them
MATCH (:HOST {id:#{str(host.id)}})-[h:HOSTS]->(:EVENT {id:#{str(eventId)}})
DELETE h
And for the new ones, I add them:
MATCH (e:EVENT {id: #{str(eventId)}})
MERGE (u:USER {id:#{str(id)}})
SET u.name =#{str(name)}
MERGE (u)-[:HOSTS]->(e)
So my question is, can I do this more efficiently all in one query? I want want to set the new relationships, getting rid of any previous relationships that arent in the new set.
If I understand your question correctly, you can achieve your objective in a single query by introducing WITH and FOREACH. On a sample graph created by
CREATE (_1:User { name:"Peter" }),(_2:User { name:"Paul" }),(_3:User { name:"Mary" })
CREATE (_4:Event { name:"End of the world" })
CREATE _1-[:HOSTS]->_4, _2-[:HOSTS]->_4
you can remove the no longer relevant hosts, and add the new hosts, as such
WITH ["Peter", "Mary"] AS hosts, "End of the world" AS eventId
MATCH (event:Event { name:eventId })<-[r:HOSTS]-(u:User)
WHERE NOT u.name IN hosts
DELETE r
WITH COLLECT(u.name) AS oldHosts, hosts, event
WITH FILTER(h IN hosts
WHERE NOT h IN oldHosts) AS newHosts, event, oldHosts
FOREACH (n IN newHosts |
MERGE (nh:User { name:n })
MERGE nh-[:HOSTS]->event
)
I have made some assumptions, at least including
The new host (:User) of the event may already exists, therefore MERGE (nh:User { name:n }) and not CREATE.
The old [:HOSTS]s should be disconnected from the event, but not removed from the database.
Your coffee script stuff can be translated into parameters, and you can translate my pseudo-parameters into parameters. In my sample query I simulate parameters with the first line, but you may need to adapt the syntax according to how you actually pass the parameters to the query (I can't turn Coffee into Cypher).
Click here to test the query. Change the contents of the hosts array to ["Peter", "Paul"], or to ["Peter", "Dragon"], or whatever value makes sense to you, and rerun the query to see how it works. I've used name rather than id to catch the nodes, and again, I've simulated parameters, but you might be able to translate the query to the context from which you want to execute it.
Edit:
Re comment, if you want the query to also match events that don't have any hosts you need to make the -[:HOSTS]- part of the pattern optional. Do so by braking the MATCH clause in two:
MATCH (event:Event { name:eventId })
OPTIONAL MATCH event<-[r:HOSTS]-(u:User)
The rest of the query is the same.