Create node only if it doesn't exist in py2neo v4 - py2neo

My understanding of neo4j is that we can use Merge to add a node in such a way that we won't get duplicates. The py2neo v4 documentation says that
For a GraphObject, create and merge are an identical operation
Yet when I execute the following code, I get 3 nodes in my graph:
graph.create(Node("Person", id="Bob"))
graph.create(Node("Person", id="Bob"))
graph.create(Node("Person", id="Bob"))
How can I add a node while also checking that the node doesn't already exist?
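The three calls above use create, which always adds a new node; "create only if absent" is what Cypher's MERGE does when it is keyed on an identifying property. Below is a minimal sketch, assuming a label of Person and a key property of id as in the question. It only builds a parameterised MERGE statement; merge_node_query is a hypothetical helper written for this illustration, not part of py2neo.

```python
import re

# Labels and property keys cannot be passed as Cypher parameters, so they are
# validated before being interpolated into the query text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def merge_node_query(label, key, properties):
    """Build a parameterised Cypher MERGE keyed on a single property.

    Hypothetical helper for illustration; it returns (query, params) and
    does not talk to a database.
    """
    if not _IDENT.match(label) or not _IDENT.match(key):
        raise ValueError("label and key must be simple identifiers")
    if key not in properties:
        raise KeyError(key)
    query = (
        f"MERGE (n:{label} {{{key}: $key_value}}) "
        "SET n += $props "
        "RETURN n"
    )
    params = {"key_value": properties[key], "props": dict(properties)}
    return query, params


query, params = merge_node_query("Person", "id", {"id": "Bob"})
print(query)
print(params)
```

The resulting query and parameters can be handed to whatever runs Cypher in your setup, for example something like graph.run(query, **params) in py2neo. py2neo v4's graph.merge also appears to accept a primary label and key for plain Node objects, along the lines of graph.merge(Node("Person", id="Bob"), "Person", "id"), but check that against the documentation for your installed version.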

Related

py2neo (Neo4j) bulk operation

I have created a nice graph using Neo4j Desktop from a local CSV file that I added to the project. Now I'm trying to do the same thing automatically from Python, using the bulk methods documented at https://py2neo.org/2021.1/bulk/index.html (create/merge nodes/relationships).
I have 2 questions:
1) Do I need to use a create method first (create nodes, for example) and then a merge method (merge nodes, in this example), or can I use merge nodes from the beginning?
I have tried using merge only and got some weird results with a large sample size.
2) After creating the nodes and relationships, how can I change the visualization of the nodes (show some value in the node)?
*If there is another way to use Python to create a graph from a big CSV file, I would like to hear it. Thanks!

How to update existing specific node in graphdb by loading updated CSV file in neo4J apoc

I am having a problem updating nodes by loading a recently updated CSV file into Neo4j. Since it is a large file, I think an APOC procedure needs to be used. I have managed to update existing nodes by loading the updated file without APOC, but I need to do the update in parallel using APOC. Here are the file contents.
Original file:
ID,SHOPNAME,DIVISION,DISTRICT,THANA
1795,ARAFAT DISTRIBUTION,RAJSHAHI,JOYPURHAT,Panchbibi
1796,CONNECT DISTRIBUTION,DHAKA,GAZIPUR,Gazipur Sadar
1797,HUMAYUN KABIR,DHAKA,DHAKA,Demra
I created nodes from this CSV.
Then I have another, updated file, u.csv. Its contents are given below:
ID,SHOPNAME,DIVISION,DISTRICT,THANA
1795,ABC,RAJSHAHI,JOYPURHAT,Panchbibi
1796,XYZ,DHAKA,GAZIPUR,Gazipur Sadar
1797,HUMAYUN KABIR,DHAKA,DHAKA,Demra
Without APOC, my query was:
LOAD CSV FROM "file:///u.csv" AS line
MERGE (c:Agent {ID:line[0]})
ON MATCH SET c.SHOPNAME = line[1]
RETURN c
This query updated the desired column, except that I also got a blank node:
{"ID":"ID"}
My first question is why a new blank node is created and how I can solve this.
Now I want to do this update for a large file, so I used an APOC procedure for batch processing.
With APOC, my query was:
CALL apoc.periodic.iterate('LOAD CSV WITH HEADERS FROM "file:///u.csv" AS line return line','MERGE (p:Agent{ID:TOINTEGER(line.ID)}) ON MATCH SET p.SHOPNAME=TOINTEGER(line.SHOPNAME) ' ,{batchSize:10000, iterateList:true, parallel:true});
But this did not update the specific nodes. Instead it created two new nodes with the same IDs, so I get 5 nodes rather than 3:
{"ID":1795}
{"ID":1796}
I am very new to Neo4j but trying to learn. Kindly help me solve the problem.
I am using Neo4j 3.5.6 and APOC 3.5.0.4.
I see 2-3 possible issues here:
Regarding duplicate nodes: You used the TOINTEGER function in one data load query and not in the other, so the nodes are duplicated: one Agent node whose ID has the data type string, and another Agent node whose ID has the data type integer.
Suggestion: Use the TOINTEGER function in both queries or in neither.
Regarding blank nodes:
In your second query, you set the node property only if a node is found (i.e. ON MATCH).
But as we saw in the first case, the MERGE does not match any of the previous nodes and creates a new node every time, and no property is set on creation. So there will be nodes with no SHOPNAME.
Suggestion: Either add ON CREATE to the MERGE query, or remove ON MATCH from the MERGE query and update the node every time. Adding ON CREATE is the recommended and more efficient way.
Please find below the query with ON CREATE:
MERGE (c:Agent {ID:line[0]})
ON CREATE SET
c.SHOPNAME = line[1]
You are also converting SHOPNAME to an integer with TOINTEGER in your APOC query. This will not work: TOINTEGER returns null for a string that cannot be parsed as a number, such as ABC.
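On the {"ID":"ID"} node specifically: LOAD CSV without WITH HEADERS returns the header line as an ordinary row, so line[0] of that row is the string ID and MERGE creates a node for it. Using WITH HEADERS in that query as well (and line.ID instead of line[0]) avoids it. Python's csv module behaves analogously, so the sketch below uses it as a stand-in to show both effects discussed above: the header row leaking into the data, and string keys never matching integer keys. It is plain Python for illustration, not Neo4j, and the variable names are made up for this example.

```python
import csv
import io

# Contents of u.csv as given in the question.
CSV_TEXT = """ID,SHOPNAME,DIVISION,DISTRICT,THANA
1795,ABC,RAJSHAHI,JOYPURHAT,Panchbibi
1796,XYZ,DHAKA,GAZIPUR,Gazipur Sadar
1797,HUMAYUN KABIR,DHAKA,DHAKA,Demra
"""

# No header handling (comparable to LOAD CSV ... AS line):
# the header line comes back as just another row.
rows = list(csv.reader(io.StringIO(CSV_TEXT)))
ids_without_headers = [row[0] for row in rows]

# Header handling (comparable to LOAD CSV WITH HEADERS ... AS line):
# the header line names the columns and is not returned as data.
records = list(csv.DictReader(io.StringIO(CSV_TEXT)))
ids_with_headers = [record["ID"] for record in records]

# Keys loaded as strings never equal the same keys converted to integers,
# which is why a MERGE on the converted value finds nothing to match.
string_keys = set(ids_with_headers)
int_keys = {int(value) for value in ids_with_headers}
overlap = string_keys & int_keys

print(ids_without_headers)
print(ids_with_headers)
print(overlap)
```

The first list starts with the literal string ID, which corresponds to the blank node; the empty overlap corresponds to the duplicate nodes.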

Can we make a graph object in py2neo and using other graph object's merge function for merging the graphs?

I want to merge a graph into my graph database using py2neo. Can we build a graph object, add all the required nodes and relationships to it, and then merge it into the database using the merge function of the object that represents the graph database?
Yes, I was able to merge a subgraph into the main graph database. We can build a subgraph by taking the union of all the nodes and relationships to be created, and then merge that subgraph into the graph with graph.merge(). Thank you for considering the question; I found the answer.

Creating unique relationships in Neo4j using py2neo get_or_create

I am trying to create a graph from some nodes and relationships using py2neo. Say I'm creating a family tree.
I am creating nodes using get_or_create() so that my script doesn't create duplicates if I supply the same value again.
How can I do the same for a relationship? I can't find any reference to a get_or_create() like function for a relationship.
I want to publish (Joe)-[:son]->(John).
The first time, it creates 2 nodes, Joe and John, and a link between them.
If I rerun my script, the nodes are unique so they aren't published again, but a new relationship is created.
This gives me a graph with 2 nodes and n relationships, where n is the number of times I run the script.
I also tried using Cypher and got the same issue: it keeps creating relationships.
Can anyone suggest a way to solve this problem?
Thanks.
I don't know py2neo (so there may be a wrapper for this), but the way to achieve that is a Cypher MERGE, which it looks like you have to run as a raw Cypher statement:
cypher_merge_result = neo4j.CypherQuery(graph_db,
"MERGE (s:Person{name:'Joe'})-[:SON]->(f:Person{name:'John'})")
Starting from a graph that contains neither person, that will create 2 Person nodes and 1 SON relationship, no matter how many times you run it.
Neo4J documentation is here and it is important to understand how it works, as a partial match in the MERGE statement will cause the whole pattern to be created. So if your Person nodes already exist, you should match them in advance to avoid duplicate Persons being created, i.e.
cypher_merge_result = neo4j.CypherQuery(graph_db,
"MATCH (s:Person{name:'Joe'}), (f:Person{name:'John'}) MERGE (s)-[:SON]->(f)")

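The difference between merging a whole pattern and matching the nodes first can be sketched with a toy in-memory model. This is plain Python written for illustration, not Neo4j or py2neo; TinyGraph and its methods are hypothetical, and the model only captures the "match the whole pattern or create all of it" rule described in the answer above.

```python
class TinyGraph:
    """Toy graph: nodes are names, relationships are (start, type, end) tuples."""

    def __init__(self):
        self.nodes = []  # index in this list serves as the node id
        self.rels = []   # a list, so duplicate relationships stay visible

    def create_node(self, name):
        self.nodes.append(name)
        return len(self.nodes) - 1

    def match_nodes(self, name):
        return [i for i, n in enumerate(self.nodes) if n == name]

    def merge_pattern(self, start_name, rel_type, end_name):
        """Models MERGE (s {name})-[:TYPE]->(f {name}): all or nothing."""
        for s, t, e in self.rels:
            if (t == rel_type and self.nodes[s] == start_name
                    and self.nodes[e] == end_name):
                return  # the whole pattern already exists, nothing to do
        # No full match: every part of the pattern is created, nodes included.
        s = self.create_node(start_name)
        e = self.create_node(end_name)
        self.rels.append((s, rel_type, e))

    def match_then_merge_rel(self, start_name, rel_type, end_name):
        """Models MATCH (s), (f) MERGE (s)-[:TYPE]->(f): only the rel is merged."""
        for s in self.match_nodes(start_name):
            for e in self.match_nodes(end_name):
                if (s, rel_type, e) not in self.rels:
                    self.rels.append((s, rel_type, e))


# Case 1: both people exist, then the whole pattern is merged twice.
g1 = TinyGraph()
g1.create_node("Joe")
g1.create_node("John")
g1.merge_pattern("Joe", "SON", "John")
g1.merge_pattern("Joe", "SON", "John")

# Case 2: both people exist, nodes are matched first, relationship merged 3 times.
g2 = TinyGraph()
g2.create_node("Joe")
g2.create_node("John")
for _ in range(3):
    g2.match_then_merge_rel("Joe", "SON", "John")

print(len(g1.nodes), len(g1.rels))
print(len(g2.nodes), len(g2.rels))
```

In the first case the pre-existing Joe and John are left unconnected and a second pair is created alongside them; in the second case the node count stays at two and the relationship is created once, however often the script is rerun.
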
Cypher query: Finding most common path between two nodes filtered by no of relationship

I am new to Neo4j and am using version 2.0.3.
Currently I am working on a project where we need to find the most common path traversed by users.
DB: there are some fixed checkpoints. Each checkpoint is a node.
When a user moves from one checkpoint to another, I create a directional relationship between the 2 checkpoints.
The relationship name is current_time_username,
so the checkpoint with the most relationships is the most visited.
Now I want to find the most common path between 2 checkpoints using Cypher, not the Java API.
I have already checked Cypher's shortestPath function, but it does not work for this.
