Neo4j Cypher: How to stop duplicate SET if multiple CREATES - neo4j

I have a complex Cypher query that creates multiple nodes and increments some counters on those nodes. For the sake of example, here is a simplified version of what I am trying to do:
START a = node(1), e = node(2)
CREATE a-[r1]->(b {})-[r2]->(c {}), e-[r3]->b-[r4]->(d{})
SET a.first=a.first+1, e.second=e.second+1
RETURN b
The issue is that because there are two CREATE commands the SET commands run twice and the values are incremented by 2 instead of 1 as intended. I have looked to see if I can merge the multiple CREATE statements and I cannot.
My initial idea is to split the different creates into separate statements in a batch query, but I was wondering whether there is another option.

Where are you executing this query? What version of neo4j are you using?
I went to console.neo4j.org and successfully ran the following and it correctly added one to both a.first and e.second:
START a = node(1), e = node(2)
CREATE a-[r:KNOWS]->b-[r2:KNOWS]->c, e-[:KNOWS]->b-[:KNOWS]->d
SET a.first=a.first+1, e.second=e.second+1
RETURN b

Related

Neo4j Browser - Bug on create node?

I don't know whether this could be an issue or a bug.
When you create a node without a variable, the server creates "zombie" gray nodes.
Example: CREATE (Expedient {id_exp: 'MAPSAN_0004', name: 'contractual', dateCreation: '16/03/2022', dateExpiration:'16/03/2023'})
Run it two or three times.
For each command, the console result shows:
Created 1 node, set 4 properties, completed after 2 ms.
But when you try to match the nodes, for example:
MATCH (e:Expedient) RETURN e
you will not find the "zombie" nodes. Only if you run:
MATCH (e) RETURN e
will you see all the nodes.
Is this the expected behaviour?
Thanks in advance, community!
You are almost there, just missing a colon to define the node label.
CREATE (:Expedient
{id_exp: 'MAPSAN_0004', name: 'contractual', dateCreation: '16/03/2022', dateExpiration:'16/03/2023'})
If you don't include the colon, Cypher treats the name as a variable (not a label) that you can refer to later in the same statement, like e here:
CREATE (e:Expedient {id_exp: 'MAPSAN_0004', name: 'contractual', dateCreation: '16/03/2022', dateExpiration:'16/03/2023'})
CREATE (e)-[:REL]->(:Node)
Another thing: you might want to store the dates using the date type rather than as strings.
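As a rough illustration of the date suggestion, here is a minimal Python sketch that converts the strings from the question into ISO 8601 form, which Cypher's date() function accepts. It assumes the values are in DD/MM/YYYY order (the "16" in the example suggests so); the helper name to_iso_date, the parameter names, and the query in the comment are made up for this example.

```python
from datetime import datetime


def to_iso_date(text: str) -> str:
    """Convert a 'DD/MM/YYYY' string into ISO 8601 'YYYY-MM-DD'."""
    return datetime.strptime(text, "%d/%m/%Y").date().isoformat()


# Hypothetical parameters for a statement such as:
#   CREATE (:Expedient {id_exp: $id_exp,
#                       dateCreation: date($created),
#                       dateExpiration: date($expires)})
params = {
    "id_exp": "MAPSAN_0004",
    "created": to_iso_date("16/03/2022"),
    "expires": to_iso_date("16/03/2023"),
}
```

Storing real dates rather than strings should make comparisons and sorting on those properties behave chronologically.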

Getting node/edge creation/removal statistics

I am running a Python (3.8) script which uses pip library neo4j 4.0.0 to interact with a community edition neo4j 4.1.1 server.
I am running many queries which use MERGE to update or create if nodes and relationships don't exist.
So far this is working well, as the database is getting the data as intended.
From my script's side, I would like however to know how many nodes and edges were created in each query.
The issue is that in these Cypher queries that I send to the database from my script, I call MERGE more than one time and also use APOC procedures (though the APOC ones are just for updating labels, they don't create entities).
Here is an example of a query:
comment_field_names: List[str] = list(threads[0].keys())
cypher_single_properties: List[str] = []
for c in comment_field_names:
    cypher_single_properties.append("trd.{0} = {1}.{0}".format(c, "trd_var"))
cypher_property_string: str = ", ".join(cypher_single_properties)
with driver.session() as session:
    crt_stmt = ("UNWIND $threads AS trd_var "
                "MERGE (trd:Thread {thread_id:trd_var.thread_id}) "
                "ON CREATE SET PY_REPLACE "
                "ON MATCH SET PY_REPLACE "
                "WITH trd AS trd "
                "CALL apoc.create.addLabels(trd, [\"Comment\"]) YIELD node "
                "WITH trd as trd "
                "MERGE (trd)-[r:THREAD_IN]->(Domain {domain_id:trd.domain_id}) "
                "ON CREATE SET r.created_utc = trd.created_utc "
                "ON MATCH SET r.created_utc = trd.created_utc "
                "RETURN distinct 'done' ")
    crt_params = {"threads": threads}
    # Insert the individual properties we recorded earlier.
    crt_stmt = crt_stmt.replace("PY_REPLACE", cypher_property_string)
    run_res = session.run(crt_stmt, crt_params)
This works fine and the nodes get created with the properties passed from the threads Dict which is passed through the variable crt_params to UNWIND.
However, the Result instance in run_res does not have any ResultSummary inside it with a SummaryCounters instance for me to access statistics of created nodes and relations.
I suspect this is because of:
"RETURN distinct 'done' "
However, I am not sure if this is the reason.
Hoping someone may be able to help me set up my queries so that, no matter the number of MERGE operations I perform, I get the statistics for the whole query that was sent in crt_stmt.
Thank you very much.
With earlier versions of the neo4j driver, you could write n = result.summary().counters.nodes_created, but from 4.0 onwards the summary() method no longer exists.
According to https://neo4j.com/docs/api/python-driver/current/breaking_changes.html, Result.summary() has been replaced with Result.consume(), which consumes all remaining records in the buffer and returns the ResultSummary.
You can get all the counters with counters = run_res.consume().counters
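A minimal sketch of how this could be wrapped up, assuming run_res is the Result returned by session.run() in the 4.x driver. The helper name creation_stats is made up for this example; the attribute names are those of the driver's SummaryCounters class.

```python
def creation_stats(result):
    """Return creation/removal counters for a finished query.

    `result` is assumed to be a neo4j Result (driver 4.x), for example the
    value returned by session.run(). consume() discards any unread records
    and returns the ResultSummary.
    """
    counters = result.consume().counters
    return {
        "nodes_created": counters.nodes_created,
        "nodes_deleted": counters.nodes_deleted,
        "relationships_created": counters.relationships_created,
        "relationships_deleted": counters.relationships_deleted,
        "properties_set": counters.properties_set,
    }
```

The counters describe the statement as a whole, so several MERGE clauses in one query are summed together. It is probably safest to call creation_stats(run_res) inside the with driver.session() block, before the session is closed.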

Create doesn't make all nodes and relationships appear

I just downloaded and installed Neo4j. I'm now working with a simple CSV file.
First, I use this to merge the nodes from that file:
LOAD CSV WITH HEADERS FROM 'file:///Athletes.csv' AS line
MERGE (Rank:rank {rang: line.Rank})
MERGE (Name:name {nom: line.Name})
MERGE (Sport:sport {sport: line.Sport})
MERGE (Nation:nation {pays: line.Nation})
MERGE (Gender:gender {genre: line.Gender})
MERGE (BirthDate:birthDate {dateDeNaissance: line.BirthDate})
MERGE (BirthPlace:birthplace {lieuDeNaissance: line.BirthPlace})
MERGE (Height:height {taille: line.Height})
MERGE (Pay:pay {salaire: line.Pay})
and this to create some constraints:
CREATE CONSTRAINT ON(name:Name) ASSERT name.nom IS UNIQUE
CREATE CONSTRAINT ON(rank:Rank) ASSERT rank.rang IS UNIQUE
Then I want to show which country each athlete belongs to. For that I use:
Create(name)-[:WORK_AT]->(nation)
But not all of the nodes and relationships appear.
I would like to know why this happens.
Thanks in advance to anyone who takes the time to help me.
Several issues come to mind:
If your CREATE clause is part of your first query: since the CREATE clause uses the variable names name and nation, and your MERGE clauses use Name and Nation (which have different casing) -- the CREATE clause would just create new nodes instead of using the Name and Nation nodes.
If your CREATE clause is NOT part of your first query: your CREATE clause would just create new nodes (since variable names, even assuming they had the same casing, are local to a query and are not stored in the DB).
Solution: You can add this clause to the end of the first query:
CREATE (Name)-[:WORK_AT]->(Nation)
Yes, I agree with @cybersam: the problem is the case sensitivity of the 'name' and 'nation' variables.
My suggestion:
MERGE (Name)-[:WORK_AT]->(Nation)
Since you are using MERGE for the nodes, you should also use MERGE instead of CREATE for the relationship, in case any Name or Nation values are duplicated.
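To make the two points above concrete (matching variable case, and keeping everything in one statement), here is a hedged sketch of how the combined query might be sent from Python. Only the Name and Nation clauses from the question are kept for brevity; the constant and function names are made up for this example, and session is assumed to be a neo4j driver session.

```python
# The variables Name and Nation are spelled exactly as in the MERGE clauses
# (Cypher variables are case-sensitive), and the relationship is merged in
# the same statement, where those variables are still in scope.
LINK_ATHLETES_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///Athletes.csv' AS line
MERGE (Name:name {nom: line.Name})
MERGE (Nation:nation {pays: line.Nation})
MERGE (Name)-[:WORK_AT]->(Nation)
"""


def link_athletes(session):
    """Run the combined statement; `session` is assumed to be a driver session."""
    return session.run(LINK_ATHLETES_QUERY)
```

Running the relationship clause as a separate query would not work, because the variables from the first query no longer exist by then.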

How to add a column to all connected nodes - mnesia table

I am trying to add a new column to an existing mnesia table. For that, I use the following code.
test()->
    Transformer =
        fun(X)-> % when is_record(X, user) -> %previous users
            #userss{name = X#user.name,
                    age = X#user.age,
                    email = X#user.email,
                    year = 1990}
        end,
    AF = mnesia:transform_table(user, Transformer,record_info(fields, userss),userss),
    mnesia:sync_transaction(AF).
I have two records:
-record(user,{name,age,email}).
-record(users,{name,age,email,year}).
I want to update the table on all connected nodes, but it fails:
{aborted,{badarg,{aborted,{"Bad transform function",user,
                           #Fun<test.2.61379004>,'otherserver@192.168.169.1',
                           {badfun,#Fun<test.2.61379004>}}},
         [],infinity,mnesia}}
What is the problem here?
The problem is that an anonymous function can only be called on nodes where the module that defines it is loaded. I guess you loaded the module containing the test function only on one node in the cluster - you need to load it on all nodes for this to work. You can use the nl command ("network load") instead of l in the Erlang shell for that:
nl(my_module).
nl and other commands are described here.

Too much time importing data and creating nodes

I have recently started with Neo4j and graph databases.
I am using this API to persist my model. Everything is done and working, but my problem is efficiency.
First, the scenario. I have a couple of XML documents which translate to some nodes and relationships between them. As I have read that this API does not yet support batch insertion, I am creating the nodes and relationships one at a time.
This is the code I am using to create a node:
var newEntry = new EntryNode { hash = incremento++.ToString() };
var result = client.Cypher
    .Merge("(entry:EntryNode {hash: {_hash} })")
    .OnCreate()
    .Set("entry = {newEntry}")
    .WithParams(new
    {
        _hash = newEntry.hash,
        newEntry
    })
    .Return(entry => new
    {
        EntryNode = entry.As<Node<EntryNode>>()
    });
I understand that creating all the nodes takes time, but I do not understand why the time to create each one grows so fast. I have run some tests and I am stuck: creating an EntryNode initially takes 0.2 seconds, but once 500 have been created it takes ~2 seconds.
I have also created an index on EntryNode(hash) manually in the console before inserting any data, and tested both versions, with and without the index.
Am I doing something wrong? Is this time normal?
EDITED:
@Tatham
Thanks for the answer, it really helped. Now I am using the FOREACH statement in Neo4jClient to create 1000 nodes in just 2 seconds.
On a related topic, now that I create the nodes this way, I also want to create relationships. This is the code I am trying right now, but I get some errors.
client.Cypher
    .Match("(e:EntryNode)")
    .Match("(p:EntryPointerNode)")
    .ForEach("(n in {set} | " +
        "FOREACH (e in (CASE WHEN e.hash = n.EntryHash THEN [e] END) " +
        "FOREACH (p in pointers (CASE WHEN p.hash = n.PointerHash THEN [p] END) "+
        "MERGE ((p)-[r:PointerToEntry]->(ee)) )))")
    .WithParam("set", nodesSet)
    .ExecuteWithoutResults();
What I want it to do is, given a list of pairs of strings, get the nodes (which are unique) whose "hash" property equals each string and create a relationship between them. I have tried a couple of variants of this query, but I can't seem to find the solution.
Is this possible?
This approach is going to be very slow because you do a separate HTTP call to Neo4j for every node you are inserting. Each call is then a transaction. Finally, you are also returning the node back, which is probably a waste.
There are two options for doing this in batches instead.
From https://stackoverflow.com/a/21865110/211747, you can do something like this, where you pass in a set of objects and then FOREACH through them in Cypher. This means one, larger, HTTP call to Neo4j and then executing in a single transaction on the DB:
FOREACH (n in {set} | MERGE (c:Label {Id : n.Id}) SET c = n)
http://docs.neo4j.org/chunked/stable/query-foreach.html
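The same batching idea can be sketched outside the .NET client. The following Python outline is an illustration only, not the Neo4jClient API: it splits the rows into fixed-size chunks and sends one query per chunk. The names BATCH_QUERY, chunked and insert_in_batches are made up for this example, session is assumed to be a neo4j Python driver session, and $set is the newer parameter syntax (the query above uses the older {set} form).

```python
from typing import Any, Dict, Iterator, List

# Query adapted from the FOREACH example above, using $set instead of {set}.
BATCH_QUERY = "FOREACH (n IN $set | MERGE (c:Label {Id: n.Id}) SET c = n)"


def chunked(rows: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `rows` holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_in_batches(session, rows: List[Dict[str, Any]], size: int = 1000) -> int:
    """Send one query per batch and return the number of queries sent."""
    sent = 0
    for batch in chunked(rows, size):
        # One round trip for the whole batch instead of one per node.
        session.run(BATCH_QUERY, {"set": batch}).consume()
        sent += 1
    return sent
```

With 2500 rows and a batch size of 1000, this would send three queries instead of 2500. The batch size is a tuning knob, not a recommendation.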
The other option, coming soon, is that you will be able to write something like this in Cypher:
LOAD CSV WITH HEADERS FROM 'file://c:/temp/input.csv' AS n
MERGE (c:Label { Id : n.Id })
SET c = n
https://github.com/davidegrohmann/neo4j/blob/2.1-fix-resource-failure-load-csv/community/cypher/cypher/src/test/scala/org/neo4j/cypher/LoadCsvAcceptanceTest.scala
