Neo4j - poor performance

Every day I receive a full CSV file with all customers, and I need to insert or update them in the Neo4j database. I have already created indexes on the hash field.
This is the query I am using:
MERGE (XFEWYX:CUSTOMER_BSELLER {hash:'xyz#hotmail.com'} ) ON CREATE SET XFEWYX.hash = 'xyz#hotmail.com',XFEWYX.name = 'XYZ ',XFEWYX.birthdate = '1975-05-20T00:00:00',XFEWYX.id = '1770852',XFEWYX.nick = 'CLARISSA',XFEWYX.documentNumber = 'XYZ',XFEWYX.email = 'clarissajuridica#hotmail.com' with XFEWYX
MERGE (WHHEKX:EMAIL {hash:'clarissajuridica#hotmail.com'} ) ON CREATE SET WHHEKX.hash = 'clarissajuridica#hotmail.com',WHHEKX.email = 'clarissajuridica#hotmail.com' with XFEWYX,WHHEKX
MERGE (JJKONT:DOCUMENT {hash:'06845078700'} ) ON CREATE SET JJKONT.document = 'XYZ',JJKONT.hash = '06845078700' with XFEWYX,WHHEKX,JJKONT
MERGE (MERUCB:PHONE {hash:'NoneNone'} ) ON CREATE SET MERUCB.areaCode = 'None',MERUCB.hash = 'NoneNone',MERUCB.number = 'None' with XFEWYX,WHHEKX,JJKONT,MERUCB
MERGE (BOORBT:PHONE {hash:'XYZ'} ) ON CREATE SET BOORBT.areaCode = '21',BOORBT.hash = 'XYZ',BOORBT.number = 'XYZ' with XFEWYX,WHHEKX,JJKONT,MERUCB,BOORBT
MERGE (XBLZNF:PHONE {hash:'XYZ'} ) ON CREATE SET XBLZNF.areaCode = '21',XBLZNF.hash = 'XYZ',XBLZNF.number = 'XYZ' with XFEWYX,WHHEKX,JJKONT,MERUCB,BOORBT,XBLZNF
MERGE (XFEWYX)-[:REGISTERED_WITH {optin:'false'}]->(WHHEKX)
MERGE (XFEWYX)-[:DOCUMENT]->(JJKONT)
MERGE (XFEWYX)-[:COMMERCIAL_PHONE]->(MERUCB)
MERGE (XFEWYX)-[:PHONE]->(XBLZNF)
MERGE (XFEWYX)-[:CELL_PHONE]->(BOORBT)
Does anyone have a better approach for running this query?

I would try using the PROFILE command. You might put a LIMIT on your LOAD CSV (I assume you're using LOAD CSV) for the testing.
I would also check out this article:
http://www.markhneedham.com/blog/2014/10/23/neo4j-cypher-avoiding-the-eager/
Some of that has been fixed in recent versions of Neo4j, but you have an awful lot of MERGEs there, so you probably could stand to split some of that up and process your CSV file more than once.
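As a rough sketch of what splitting the work into several passes could look like with LOAD CSV: the file name and the column names (customer_hash, name, email, optin) are hypothetical, since the question does not show the CSV layout, and only the customer and email parts are shown. Each pass touches one kind of node or relationship, and the last pass can use MATCH because both ends already exist.

```cypher
// Pass 1: customer nodes only. File name and columns are assumptions.
LOAD CSV WITH HEADERS FROM 'file:///customers.csv' AS row
MERGE (c:CUSTOMER_BSELLER {hash: row.customer_hash})
ON CREATE SET c.name = row.name,
              c.email = row.email;

// Pass 2: email nodes only. Rows without an email are skipped,
// because MERGE cannot match on a null property value.
LOAD CSV WITH HEADERS FROM 'file:///customers.csv' AS row
WITH row WHERE row.email IS NOT NULL
MERGE (e:EMAIL {hash: row.email})
ON CREATE SET e.email = row.email;

// Pass 3: relationships between nodes created in the earlier passes.
LOAD CSV WITH HEADERS FROM 'file:///customers.csv' AS row
MATCH (c:CUSTOMER_BSELLER {hash: row.customer_hash})
MATCH (e:EMAIL {hash: row.email})
MERGE (c)-[r:REGISTERED_WITH]->(e)
ON CREATE SET r.optin = row.optin;
```

The phone and document nodes would follow the same pattern. Prefixing any one pass with PROFILE should show whether the index on hash is actually being used.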

Related

How to check whether a node exists, and create it if it does not, during a .csv import into Neo4j

I am importing some .csv files into my Neo4j database, but the data about people is spread across three different files. When I import a person from another file that has more data, I get an error while creating the Persona nodes, because nodes with the same dni (which has a constraint) already exist in my database.
So I want to create the new node or, if it exists, retrieve its pointer to create relationships with other nodes that I keep creating while I import.
I have tried several things I found on the internet, but I still can't find the solution.
Here is my code:
LOAD CSV WITH HEADERS FROM 'file:/D:/ACCOUNT.csv' AS line FIELDTERMINATOR ';'
MERGE (persona :Persona { dni: line.DNI,
nombre: line.NOMBRE,
sexo: line.SEXO,
fechaNacimiento: line.FNACIMIENTO,
direccion: line.DIRECCION
})
I have tried with APOC and with WITH, but I still can't find the solution.
When this code finds an existing node with the Persona label and the same dni as the one being imported, it gives me an error.
To get this working, you'll have to understand how MERGE works. The statement
MERGE (persona :Persona { dni: line.DNI, nombre: line.NOMBRE, sexo: line.SEXO,
fechaNacimiento: line.FNACIMIENTO,direccion: line.DIRECCION
})
will create a new Persona node for every distinct combination of the above properties. So, for a node with the same dni but different values for the other properties, MERGE finds no match, tries to create a second node, and fails on the uniqueness constraint on dni. To fix this, merge the nodes on their dni alone, and then set the remaining properties like this:
MERGE (persona :Persona { dni: line.DNI })
ON CREATE
SET persona.nombre = line.NOMBRE,
persona.sexo = line.SEXO,
persona.fechaNacimiento = line.FNACIMIENTO,
persona.direccion = line.DIRECCION
The above query will ignore setting properties if a matching node is found. To set some properties when a match is found, use ON MATCH, like this:
MERGE (persona :Persona { dni: line.DNI })
ON CREATE
SET persona.nombre = line.NOMBRE,
persona.sexo = line.SEXO,
persona.fechaNacimiento = line.FNACIMIENTO,
persona.direccion = line.DIRECCION
ON MATCH
SET persona.direccion = line.DIRECCION // example only: set whichever properties should be refreshed on a match
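Because MERGE binds persona whether the node was created or matched, the same variable can be used straight away to attach relationships, which is what the question asks for. A minimal sketch, assuming a hypothetical Cuenta label, a hypothetical CUENTA column in the file, and a hypothetical TIENE relationship type (none of these appear in the question):

```cypher
LOAD CSV WITH HEADERS FROM 'file:/D:/ACCOUNT.csv' AS line FIELDTERMINATOR ';'
// Match or create the person on the constrained key only.
MERGE (persona:Persona { dni: line.DNI })
ON CREATE
SET persona.nombre = line.NOMBRE,
    persona.sexo = line.SEXO,
    persona.fechaNacimiento = line.FNACIMIENTO,
    persona.direccion = line.DIRECCION
ON MATCH
// Keep the stored value when the new file has nothing for this field.
SET persona.direccion = coalesce(line.DIRECCION, persona.direccion)
// persona is bound in both cases, so it can anchor a relationship.
// Cuenta, CUENTA and TIENE are hypothetical names for illustration.
MERGE (cuenta:Cuenta { numero: line.CUENTA })
MERGE (persona)-[:TIENE]->(cuenta)
```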

Neo4j ver. 4.1.1 Cypher Syntax Error "USING PERIODIC COMMIT"

The Neo4j "lessons learned" example from NASA references a GitHub project showing how NASA used Neo4j to create a lessons learned database. It is from 2015, and I've been struggling to get the Cypher code running on newer versions of Neo4j (I'm learning as I go).
The error message I am getting is
My modified version is below.
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
DELETE n,r;
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/llis.csv'
AS line
WITH line
Limit 1
RETURN line
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/llis.csv' AS line
WITH line, SPLIT(line.LessonDate, '-') AS date
CREATE (lesson:Lesson { name: ToInteger(line.`LessonId`) } )
SET lesson.year = ToInteger(date[0]),
lesson.month = ToInteger(date[1]),
lesson.day = ToInteger(date[2]),
lesson.title = (line.Title),
lesson.abstract = (line.Abstract),
lesson.lesson = (line.Lesson),
lesson.org = (line.MissionDirectorate),
lesson.safety = (line.SafetyIssue),
lesson.url = (line.url)
MERGE (submitter:Submitter { name: toUpper(line.Submitter1) })
MERGE (center:Center { name: toUpper(line.Organization) })
MERGE (topic:Topic { name: ToInteger(line.Topic) })
MERGE (category:Category { name: toUpper(line.Category) })
CREATE (topic)-[:Contains]->(lesson)
CREATE (submitter)-[:Wrote]->(lesson)
CREATE (lesson)-[:OccurredAt]->(center)
CREATE (lesson)-[:InCategory]->(category);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/topicCategory.csv' AS line
MATCH (topic:Topic { name: ToInteger(line.Topic) })
MATCH (category:Category { name: toUpper(line.Category) })
CREATE (topic)-[:AssociatedTo]->(category)
;
// Topic, Correlation.
// Adds a relation to each topic using their correlation
// as a property of the relationship
// Load.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/topicCorr.csv' AS line
MATCH (topic:Topic), (topic2:Topic)
WHERE topic.name = ToInteger(line.Topic) AND topic2.name = ToInteger(line.ToTopic)
MERGE (topic)-[c:CorrelatedTo {corr : ToFloat(line.Correlation)}]-(topic2)
;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/topicTerms.csv' AS line
MATCH (topic:Topic { name: ToInteger(line.Topic) })
MERGE (term:Term { name: toUpper(line.Terms) })
CREATE (term)-[r:RankIn {rank : ToInteger(line.Rank)}]->(topic)
;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/topicLabels.csv' AS line
MATCH (topic:Topic { name: ToInteger(line.Topic) })
SET topic.label = line.Label
;
The original Cypher code and data are in David Meza's repo.
I'd just like to get something ready to demo to our company tomorrow morning to 'sell' the 'lessons learned' and usage of a graph database. Any help will be appreciated.
Thanks
Did you try
DELETE n,r
on line 3?
By the way, the store can also be purged like this, which is probably faster:
MATCH (n)
DETACH DELETE n
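A few other things in the script may also matter on Neo4j 4.x. The sketch below rests on my understanding of that version and is not confirmed anywhere in this thread. As far as I know, each USING PERIODIC COMMIT has to begin its own statement, so the statement before it needs a terminating semicolon. Periodic commit is rejected on a query that writes nothing, so the one-row preview should drop it. In Neo4j Browser 4.x the import statement also has to be prefixed with :auto so that it runs in an auto-committing transaction.

```cypher
// Purge the store; the semicolon ends the statement cleanly.
MATCH (n)
DETACH DELETE n;

// Preview one row. No periodic commit here, since nothing is written.
LOAD CSV WITH HEADERS FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/llis.csv' AS line
RETURN line
LIMIT 1;

// Import. :auto is a Neo4j Browser command (assumed client), not Cypher itself.
:auto USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///Users/kevin/Neo4j/doctopics-master/data/topicLabels.csv' AS line
MATCH (topic:Topic { name: toInteger(line.Topic) })
SET topic.label = line.Label;
```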

Neo4j syntax for WITH statement

I am trying to auto-generate a text file that can be run to create the nodes and relationships for a Neo4j graph.
The text file is being created in two sections:
First the nodes are created in a For loop (6000 nodes) with a result like this:
create(SystemLogic_d6:FB {type:"SUB_DINT", instanceName:"d6", section:"SystemLogic"})
create(SystemLogic_d5:FB {type:"SUB_DINT", instanceName:"d5", section:"SystemLogic"})
create(SystemLogic_d7:FB {type:"ADD_DINT", instanceName:"d7", section:"SystemLogic"})
create(SystemLogic_d8:FB {type:"SEL", instanceName:"d8", section:"SystemLogic"})
Then, in the next section of the text file, relationships are created in another For loop with a result like this:
MATCH (SystemLogic_d8:FB), (SystemLogic_d12:FB) WHERE SystemLogic_d8.instanceName = "d8" AND SystemLogic_d12.instanceName = "d12" CREATE (SystemLogic_d8)-[: c]->(SystemLogic_d12)
MATCH (SystemLogic_d17:FB), (SystemLogic_d18:FB) WHERE SystemLogic_d17.instanceName = "d17" AND SystemLogic_d18.instanceName = "d18" CREATE (SystemLogic_d17)-[: c]->(SystemLogic_d18)
MATCH (SystemLogic_d16:FB), (SystemLogic_d17:FB) WHERE SystemLogic_d16.instanceName = "d16" AND SystemLogic_d17.instanceName = "d17" CREATE (SystemLogic_d16)-[: c]->(SystemLogic_d17)
MATCH (SystemLogic_d11:FB), (SystemLogic_d5:FB) WHERE SystemLogic_d11.instanceName = "d11" AND SystemLogic_d5.instanceName = "d5" CREATE (SystemLogic_d11)-[: c]->(SystemLogic_d5)
This gives the error "WITH is required between CREATE and MATCH".
I tried inserting a WITH in between, as suggested in this answer:
Neo4j Cypher WITH is required between CREATE and MATCH
which gives a result like this:
MATCH (SystemLogic_d8:FB), (SystemLogic_d12:FB) WITH SystemLogic_d8,SystemLogic_d12 WHERE SystemLogic_d8.instanceName = "d8" AND SystemLogic_d12.instanceName = "d12" CREATE (SystemLogic_d8)-[: c]->(SystemLogic_d12)
MATCH (SystemLogic_d17:FB), (SystemLogic_d18:FB) WITH SystemLogic_d17,SystemLogic_d18 WHERE SystemLogic_d17.instanceName = "d17" AND SystemLogic_d18.instanceName = "d18" CREATE (SystemLogic_d17)-[: c]->(SystemLogic_d18)
MATCH (SystemLogic_d16:FB), (SystemLogic_d17:FB) WITH SystemLogic_d16,SystemLogic_d17 WHERE SystemLogic_d16.instanceName = "d16" AND SystemLogic_d17.instanceName = "d17" CREATE (SystemLogic_d16)-[: c]->(SystemLogic_d17)
MATCH (SystemLogic_d11:FB), (SystemLogic_d5:FB) WITH SystemLogic_d11,SystemLogic_d5 WHERE SystemLogic_d11.instanceName = "d11" AND SystemLogic_d5.instanceName = "d5" CREATE (SystemLogic_d11)-[: c]->(SystemLogic_d5)
MATCH (SystemLogic_FBI_1407:FB), (SystemLogic_FBI_1408:FB) WITH SystemLogic_FBI_1407,SystemLogic_FBI_1408 WHERE SystemLogic_FBI_1407.instanceName = "FBI_1407" AND SystemLogic_FBI_1408.instanceName = "FBI_1408" CREATE (SystemLogic_FBI_1407)-[: c]->(SystemLogic_FBI_1408)
But I still get the same error.
I also tried putting the WITH statement after the CREATE statement, but that gives another error.
Is it possible to import and run multiple node and relationship creation statements in this fashion?
It works fine for creating the nodes, but I am new to Neo4j and Cypher, and I am not sure whether my syntax is incorrect or whether you simply can't create multiple relationships in this fashion.
Thanks for your help.
You need to separate the statements with a semicolon. Please refer to the following queries:
create(SystemLogic_d8:FB {type:"SEL", instanceName:"d8", section:"SystemLogic"});
create(SystemLogic_d9:FB {type:"SEL", instanceName:"d8", section:"SystemLogic"});
MATCH (SystemLogic_d2:FB), (SystemLogic_d21:FB) WHERE SystemLogic_d2.instanceName = "d8" AND SystemLogic_d21.instanceName = "d12" CREATE (SystemLogic_d2)-[: c]->(SystemLogic_d21);
MATCH (SystemLogic_d1:FB), (SystemLogic_d12:FB) WHERE SystemLogic_d1.instanceName = "d8" AND SystemLogic_d12.instanceName = "d12" CREATE (SystemLogic_d1)-[: c]->(SystemLogic_d12)
If you have only CREATE statements, no semicolons are needed and the file will run as is.
But if you combine MATCH and CREATE, you need to separate the statements with semicolons.
@Raj's answer is valid. However, as you are already binding the nodes to variables in your CREATE statements, you do not need to MATCH them again to create the relationships.
Your file could then be:
create(SystemLogic_d6:FB {type:"SUB_DINT", instanceName:"d6", section:"SystemLogic"})
create(SystemLogic_d5:FB {type:"SUB_DINT", instanceName:"d5", section:"SystemLogic"})
create(SystemLogic_d7:FB {type:"ADD_DINT", instanceName:"d7", section:"SystemLogic"})
create(SystemLogic_d8:FB {type:"SEL", instanceName:"d8", section:"SystemLogic"})
CREATE (SystemLogic_d8)-[:c]->(SystemLogic_d6)
CREATE (SystemLogic_d7)-[:c]->(SystemLogic_d6)
CREATE (SystemLogic_d8)-[:c]->(SystemLogic_d5)

Neo4j Load csv on create set if not null

Is there a way in Neo4j to use ON CREATE SET with an IF NOT NULL?
I have a CSV file as such:
employeeid,firstname,lastname,suffix,title
1,john,baker,,mr
2,ellie,johnston,,mrs,
3,bob,smith,jr,,
My current load statement:
LOAD CSV from 'file://file' AS line
WITH line
MERGE (a:Employee {id:TOINT(line.`employeeid`)})
ON CREATE SET a.firstname = line.`firstname`, a.lastname = line.`lastname`, a.suffix = line.`suffix`
How would I change this so it won't set an attribute if null but still set those that have values?
Setting a property to null does not actually create it, so null values are skipped while the properties that do have values are still set.
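This relies on the empty field actually arriving as null. Depending on the Neo4j version and how the file is quoted, an empty field may instead come through as an empty string, in which case it can be mapped to null explicitly. A minimal sketch, assuming the file is read WITH HEADERS (needed to address columns by name) and using a hypothetical file name:

```cypher
// 'employees.csv' is a placeholder file name.
LOAD CSV WITH HEADERS FROM 'file:///employees.csv' AS line
// toInteger is the newer name for TOINT; older versions may need toInt.
MERGE (a:Employee {id: toInteger(line.employeeid)})
ON CREATE SET
  a.firstname = line.firstname,
  a.lastname = line.lastname,
  // An empty string becomes null, and a null property is simply not stored.
  a.suffix = CASE WHEN line.suffix = '' THEN null ELSE line.suffix END
```

If line.suffix is already null, the comparison is not true, so the ELSE branch passes the null through unchanged.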

RNeo4j Error: 400 Bad Request

I am not sure why I am getting the error below, but I suppose it's something that I am doing wrong.
First, you can grab my dataset by downloading the file dataset.r from this link and loading it into your session with dget("dataset.r").
In my case, I would do dat = dget("dataset.r").
The code below is what I am using to load data into Neo4j.
library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")
graph$version
# make sure that the graph is clean -- you should back up first!!!
clear(graph, input = FALSE)
## ensure the constraints
addConstraint(graph, "School", "unitid")
addConstraint(graph, "Topic", "topic_id")
## create the query
## BE CAREFUL OF WHITESPACE between KEY:VALUE pairs for parameters!!!
query = "
MERGE (s:School {unitid:{unitid},
instnm:{instnm},
obereg:{obereg},
carnegie:{carnegie},
applefeeu:{applfeeu},
enrlft:{enrlft},
applcn:{applcn},
admssn:{admssn},
admit_rate:{admit_rate},
ape:{ape},
sat25:{sat25},
sat75:{sat75} })
MERGE (t:Topic {topic_id:{topic_id},
topic:{topic} })
MERGE (s)-[:HAS_TOPIC {score:{score} }]->(t)
"
for (i in 1:nrow(dat)) {
## status
cat("starting row ", i, "\n")
## run the query
cypher(graph,
query,
unitid = dat$unitid[i],
instnm = dat$instnm[i],
obereg = dat$obereg[i],
carnegie = dat$carnegie[i],
applfeeu = dat$applfeeu[i],
enrlft = dat$enrlt[i],
applcn = dat$applcn[i],
admssn = dat$admssn[i],
admit_rate = dat$admit_rate[i],
ape = dat$apps_per_enroll[i],
sat25 = dat$sat25[i],
sat75 = dat$sat75[i],
topic_id = dat$topic_id[i],
topic = dat$topic[i],
score = dat$score[i] )
} #endfor
I can successfully load the first 49 records of my data frame dat, but the load errors out on the 50th row.
This is the error that I receive:
starting row 50
Show Traceback
Rerun with Debug
Error: 400 Bad Request
{"message":"Node 1477 already exists with label School and property \"unitid\"=[110680]","exception":"CypherExecutionException","fullname":"org.neo4j.cypher.CypherExecutionException","stacktrace":["org.neo4j.cypher.internal.compiler.v2_1.spi.ExceptionTranslatingQueryContext.org$neo4j$cypher$internal$compiler$v2_1$spi$ExceptionTranslatingQueryContext$$translateException(ExceptionTranslatingQueryContext.scala:154)","org.neo4j.cypher.internal.compiler.v2_1.spi.ExceptionTranslatingQueryContext$ExceptionTranslatingOperations.setProperty(ExceptionTranslatingQueryContext.scala:121)","org.neo4j.cypher.internal.compiler.v2_1.spi.UpdateCountingQueryContext$CountingOps.setProperty(UpdateCountingQueryContext.scala:130)","org.neo4j.cypher.internal.compiler.v2_1.mutation.PropertySetAction.exec(PropertySetAction.scala:51)","org.neo4j.cypher.internal.compiler.v2_1.mutation.MergeNodeAction$$anonfun$exec$1.apply(MergeNodeAction.scala:80)","org.neo4j.cypher.internal.compiler.v2_1
Here is my session info:
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RNeo4j_1.2.0
loaded via a namespace (and not attached):
[1] RCurl_1.95-4.1 RJSONIO_1.2-0.2 tools_3.1.0
And it's worth noting that I am using Neo4j 2.1.3.
Thanks for any help in advance.
This is an issue with how MERGE works. By setting the score property within the MERGE clause itself here...
MERGE (s)-[:HAS_TOPIC {score:{score} }]->(t)
...MERGE tries to create the entire pattern, and thus your uniqueness constraint is violated. Instead, do this:
MERGE (s)-[r:HAS_TOPIC]->(t)
SET r.score = {score}
I was able to import all of your data after making this change.
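Applied to the full query, the change might look like the sketch below. Merging each node on its constrained key alone and moving the other properties to ON CREATE SET is my own extension of the same idea, not something the answer above reports testing. It is meant to keep MERGE from attempting to create a second School when any non-key property differs between rows. The {param} placeholders follow the Neo4j 2.1 syntax used in the question.

```cypher
// Match or create the school on its unique key only.
MERGE (s:School {unitid:{unitid}})
ON CREATE SET s.instnm = {instnm},
              s.obereg = {obereg},
              s.carnegie = {carnegie}
              // the remaining School properties follow the same pattern
// Same approach for the topic.
MERGE (t:Topic {topic_id:{topic_id}})
ON CREATE SET t.topic = {topic}
// Merge the bare relationship, then set the score outside the MERGE pattern.
MERGE (s)-[r:HAS_TOPIC]->(t)
SET r.score = {score}
```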

Resources