I have a list of gene names, genes_list, and a list of tuples, genes2, in which each tuple pairs a gene with a gene it targets. I connected to my local database and created the 20244 nodes labeled GEN, each with a name property.
I am trying to write a script that creates a relationship for each pair of nodes (using the variables tupla[0] and tupla[1]) in a Neo4j graph, but I can't get the for loop to work over the list of tuples. Any advice? I'm still learning how to use this library, so any pointers would be great. Regards!
from py2neo import Node,Relationship,Graph, database,NodeMatcher
import time
import pandas as pd
genes_list=pd.read_csv("Gen_list.txt",delimiter="\t",header=None)
genes_list=genes_list[0].tolist()
for name in genes_list:
    graph.run("CREATE(:GEN{name:$name})",name=name)
genes=pd.read_csv(r"C:\Users\espin\OneDrive\Escritorio\MCI\SCRIPTS\dorothea_final.csv", delimiter="\t",header=None)
genes2=list(genes.to_records(index=False))
for tupla in genes2:
    existing_u1 = matcher.match("GEN").where(name=tupla[0]).first()
    existing_u2 = matcher.match("GEN").where(name=tupla[1]).first()
    graph.merge(existing_u1,"REGULATES", existing_u2)
I figured it out. For anyone who wants to try this and is using py2neo v4, use graph.run():
for tupla in genes2:
    graph.run("MATCH(a:GEN{name:$name}) MATCH(b:GEN{name:$name1}) CREATE (a)-[:REGULATES]->(b)",name=tupla[0],name1=tupla[1])
Remember that the query must be the first argument, followed by the $variables as keyword arguments separated by commas.
At least this works when the nodes already exist, and it does not duplicate them.
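If the script may be run more than once, swapping CREATE for MERGE on the relationship should keep it from adding a second REGULATES edge between the same pair. A minimal sketch, assuming `run` is any callable with the same calling convention as `graph.run` (query first, then keyword parameters). The names `link_genes`, `LINK_QUERY`, and `recorder`, as well as the example gene pairs, are made up for illustration:

```python
# Match both existing nodes, then MERGE the relationship so that a second
# run of the script does not create a duplicate edge.
LINK_QUERY = (
    "MATCH (a:GEN {name: $source}) "
    "MATCH (b:GEN {name: $target}) "
    "MERGE (a)-[:REGULATES]->(b)"
)


def link_genes(run, pairs):
    """Issue one query per pair and return the number of queries issued.

    `run` is assumed to behave like py2neo's graph.run(query, **params).
    Only the first two fields of each pair are used.
    """
    issued = 0
    for pair in pairs:
        # str() converts numpy string scalars (as produced by
        # DataFrame.to_records) into plain Python strings.
        run(LINK_QUERY, source=str(pair[0]), target=str(pair[1]))
        issued += 1
    return issued


# Stand-in for graph.run that only records its calls, so the sketch can be
# tried without a database. The gene pairs are illustrative only.
calls = []


def recorder(query, **params):
    calls.append((query, params))


issued = link_genes(recorder, [("TP53", "MDM2"), ("MYC", "CDK4")])
print(issued)
```

With a real connection, `link_genes(graph.run, genes2)` would replace the call that uses `recorder`.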
I just downloaded and installed Neo4j. Now I'm working with a simple CSV file that looks like this:
So first I'm using this to merge the nodes for that file:
LOAD CSV WITH HEADERS FROM 'file:///Athletes.csv' AS line
MERGE(Rank:rank{rang: line.Rank})
MERGE(Name:name{nom: line.Name})
MERGE(Sport:sport{sport: line.Sport})
MERGE(Nation:nation{pays: line.Nation})
MERGE(Gender: gender{genre: line.Gender})
MERGE(BirthDate:birthDate{dateDeNaissance: line.BirthDate})
MERGE(BirthPlace: birthplace{lieuDeNaissance: line.BirthPlace})
MERGE(Height: height{taille: line.Height})
MERGE(Pay: pay{salaire: line.Pay})
and this to create some constraint for that file:
CREATE CONSTRAINT ON(name:Name) ASSERT name.nom IS UNIQUE
CREATE CONSTRAINT ON(rank:Rank) ASSERT rank.rang IS UNIQUE
Then I want to show which country each athlete belongs to. For that I use:
Create(name)-[:WORK_AT]->(nation)
But this is what I get:
I would like to know why this happens, please.
Thanks in advance to anyone who takes the time to help me.
Several issues come to mind:
If your CREATE clause is part of your first query: since the CREATE clause uses the variable names name and nation, and your MERGE clauses use Name and Nation (which have different casing) -- the CREATE clause would just create new nodes instead of using the Name and Nation nodes.
If your CREATE clause is NOT part of your first query: your CREATE clause would just create new nodes (since variable names, even assuming they had the same casing, are local to a query and are not stored in the DB).
Solution: You can add this clause to the end of the first query:
CREATE (Name)-[:WORK_AT]->(Nation)
Yes, I agree with @cybersam: it's a case-sensitivity issue with the 'name' and 'nation' variables.
My suggestion:
MERGE (Name)-[:WORK_AT]->(Nation)
I see that you're using MERGE for the nodes, so in case any Name or Nation values are duplicated, you should use MERGE instead of CREATE for the relationship as well.
Is it possible in Neo4j or SDN4 to create/emulate something similar to a PostgreSQL sequence database object?
I need this thread safe functionality in order to be able to ask it for next, unique Long value. I'm going to use this value as a surrogate key for my entities.
UPDATED
I don't want to go with UUIDs because I have to expose these IDs in my web application's URL parameters, and with UUIDs my URLs look awful. I want to use plain Long values for IDs, like StackOverflow does, for example:
stackoverflow.com/questions/42228501/neo4j-sdn-4-emulate-sequence-objectnot-uuid
This can be done with user procedures and functions. As an example:
package sequence;
import org.neo4j.procedure.*;
import java.util.concurrent.atomic.AtomicInteger;
public class Next {
    private static AtomicInteger sequence = new AtomicInteger(0);
    @UserFunction
    public synchronized Number next() {
        return sequence.incrementAndGet();
    }
}
The problem with this example is that when the server is restarted the counter is reset to zero.
So it is necessary to keep the last value of the counter. This can be done using these examples:
https://maxdemarzi.com/2015/03/25/triggers-in-neo4j/
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/trigger/Trigger.java
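One way to keep the counter across restarts without a custom plugin is to store it on a node and increment it in a single Cypher statement. This is a different technique from the Java function above, sketched under assumptions: the `Sequence` label, its `name` and `value` properties, and the helper `next_id` are hypothetical, and `run_scalar` stands for any callable that executes a query with keyword parameters and returns the single value it yields (for example a thin wrapper around a driver session). Whether the increment is safe with concurrent writers depends on Neo4j's locking behavior in your version, so that is worth verifying before relying on it.

```python
# Create the counter node on first use, otherwise increment it, and return
# the new value in the same statement.
NEXT_ID_QUERY = (
    "MERGE (s:Sequence {name: $name}) "
    "ON CREATE SET s.value = 1 "
    "ON MATCH SET s.value = s.value + 1 "
    "RETURN s.value AS value"
)


def next_id(run_scalar, name):
    """Return the next value of the named sequence as an int."""
    return int(run_scalar(NEXT_ID_QUERY, name=name))


class FakeStore:
    """In-memory stand-in that mimics what the query is meant to do."""

    def __init__(self):
        self.values = {}

    def __call__(self, query, name):
        self.values[name] = self.values.get(name, 0) + 1
        return self.values[name]


store = FakeStore()
first = next_id(store, "question")
second = next_id(store, "question")
print(first, second)
```

Because the value lives in the graph rather than in a static field, it survives a server restart.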
No. As far as I'm aware there isn't any similar functionality to sequences or auto increment identifiers in Neo4j. This question has also been asked a few times in the past.
The APOC project might be worth checking out for this though. There seems to be a request to add it.
If your main interest is in having a way to generate unique IDs, and you do not care if the unique IDs are strings, then you should consider using the APOC facilities for generating UUIDs.
There is an APOC function that generates a UUID, apoc.create.uuid. In older versions of APOC, this is a procedure that must be invoked using the CALL syntax. For example, to create and return a single Foo node with a new UUID:
CREATE (f:Foo {uuid: apoc.create.uuid()})
RETURN f;
There is also an APOC procedure, apoc.create.uuids(count), that generates a specified number of UUIDs. For example, to create and return 5 Foo nodes with new UUIDs:
CALL apoc.create.uuids(5) YIELD uuid
CREATE (f:Foo {uuid: uuid})
RETURN f;
The simplest way in Neo4j is to disable id reuse and use the node graph ID as a sequence.
https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
Table A.83. dbms.ids.reuse.types.override
Description: Specified names of id types (comma separated) that should be reused. Currently only 'node' and 'relationship' types are supported.
Valid values: dbms.ids.reuse.types.override is a list separated by "," where items are one of NODE, RELATIONSHIP
Default value: [RELATIONSHIP, NODE]
I need to create a python function such that it adds nodes and relationship to a graph and returns the number of created nodes and relationships.
I have added the nodes and relationship using graph.cypher.execute().
arr_len = len(dic_st[story_id]['PER'])
for j in dic_st[story_id]['PER']:
    graph.cypher.execute("MERGE (n:PER {name:{name}})",name = j[0].upper()) #creating the nodes of PER in the story
    print j[0]
for j in range(0,arr_len):
    for k in range(j+1,arr_len):
        graph.cypher.execute("MATCH (p1:PER {name:{name1}}), (p2:PER {name:{name2}}) WHERE upper(p1.name)<>upper(p2.name) CREATE UNIQUE (p1)-[r:in_same_doc {st_id:{st_id}}]-(p2)", name1=dic_st[story_id]['PER'][j][0].upper(),name2=dic_st[story_id]['PER'][k][0].upper(),st_id=story_id) #linking the edges for PER nodes
What I need is to return the number of new nodes and relationships created.
From the Neo4j documentation I learned that MERGE supports "ON CREATE" and "ON MATCH" in Cypher, but that hasn't been very useful here.
The Neo4j browser interface does show the number of nodes and relationships updated. This is what I need to return, but I can't find a way to access it.
Any help please.
If you need the exact counts of properties either created or updated, you have to use MATCH with CREATE, or MATCH with SET, and then count the size of the results. MERGE may not tell you which ones were updated and which ones were created.
When you post your query against the Cypher endpoint of the neo4j REST API without using py2neo, you can include the argument "includeStats": true in your post request to get the node/relationship statistics. See this question for an example.
As far as I can tell, py2neo currently does not support additional parameters for the Cypher query (even though it is using the same API endpoints under the hood).
In Python, you could do something like this (using the requests and json packages):
import requests
import json
payload = {
    "statements": [{
        "statement": "CREATE (t:Test) RETURN t",
        "includeStats": True
    }]
}
r = requests.post('http://your_server_host:7474/db/data/transaction/commit',
                  data=json.dumps(payload))
print(r.text)
The response will include statistics about the number of nodes created etc.
{
    "stats":{
        "contains_updates":true,
        "nodes_created":1,
        "nodes_deleted":0,
        "properties_set":1,
        "relationships_created":0,
        "relationship_deleted":0,
        "labels_added":1,
        "labels_removed":0,
        "indexes_added":0,
        "indexes_removed":0,
        "constraints_added":0,
        "constraints_removed":0
    }
}
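To return the counts from a Python function, the stats object can be read from the parsed response. A minimal sketch, assuming the full response body has a top-level `results` list with one entry per statement, each carrying a `stats` object like the one above. The helper name `created_counts` and the sample body are made up for illustration:

```python
import json


def created_counts(response_text):
    """Sum nodes_created and relationships_created over all statements.

    Assumes the body looks like {"results": [{"stats": {...}}, ...]}.
    Missing keys are treated as zero.
    """
    body = json.loads(response_text)
    nodes = 0
    relationships = 0
    for result in body.get("results", []):
        stats = result.get("stats", {})
        nodes += stats.get("nodes_created", 0)
        relationships += stats.get("relationships_created", 0)
    return nodes, relationships


# Illustrative body shaped like the assumption above; a real call would
# pass r.text from the request instead.
sample = json.dumps({
    "results": [{"stats": {"nodes_created": 1, "relationships_created": 0}}],
    "errors": [],
})
counts = created_counts(sample)
print(counts)
```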
After executing your query using x = session.run(...) you can use x.summary.counters to get the statistics noted in Martin Perusse's answer. See the documentation here.
In older versions the counters are available as a "private" field under x._summary.counters.
I'm trying to build a social network, and it's my first web project.
I'm using a Neo4j database and the py2neo module.
Now I want to find a node in my database and change some of its properties.
I'm using the code below, and it runs with no errors, but it doesn't change anything in my database and I have no idea why.
Please help me if you can.
from py2neo import Graph
graph=Graph()
def edit_name(Uname,name):
    person=graph.merge_one("Person","username",Uname)
    person.cast(fname=name)
Cast is for casting general Python objects to py2neo objects. For example, if you wanted to cast a Python dictionary to a py2neo Node object, you'd do:
from py2neo import Graph, Node
graph = Graph()
d = {'name':'Nicole', 'age':24}
nicole = Node.cast('Person', d)
However, you still need to pass nicole to Graph.create to actually create the node in the database:
graph.create(nicole)
Then, if you later retrieve this node from the database with Graph.merge_one and want to update properties:
nicole = graph.merge_one('Person', 'name', 'Nicole')
nicole['hair'] = 'blonde'
Then you need to push those changes to the graph; cast is inappropriate for updating properties on something that is already a py2neo Node object:
nicole.push()
TL;DR:
from py2neo import Graph
graph = Graph()
def edit_username(old_name, new_name):
    person = graph.merge_one('Person', 'username', old_name)
    person['username'] = new_name
    person.push()
merge_one will either return a matching node, or, if no matching node exists, create and return a new one. So, in your case, a matching node probably already exists.
I am working on a Ruby on Rails project that reads and parses a fairly big text file (around 100k lines) and builds Neo4j nodes from that data (I am using Neography).
This is the Neo4j related fraction of the code I wrote:
d= Neography::Rest.new.execute_query("MATCH (n:`Label`) WHERE (n.`name`='#{id}') RETURN n")
d= Neography::Node.load(d, @neo)
p= Neography::Rest.new.create_node("name" => "#{id}")
Neography::Rest.new.add_label(p, "LabelSample")
d=Neography::Rest.new.get_node(d)
Neography::Rest.new.create_relationship("belongs_to", p, d)
So, what I want to do is: search the already populated db for the node with the same "name" field as the parsed data, create a new node for this data, and finally create a relationship between the two of them.
Obviously this code simply takes way too much time to execute.
So I tried with Neography's batch, but I ran into some issues.
p = Neography::Rest.new.batch [:create_node, {"name" => "#{id}"}]
gave me an "undefined method `split' for nil:NilClass" in
id["self"].split('/').last
d=Neography::Rest.new.batch [:get_node, d]
gives me a Neography::UnknownBatchOptionException for get_node
I am not even sure this will save me enough time either.
I also tried different ways to do this, using Batch Import for example, but I couldn't find a way to get the already created node I need from the db.
As you can see I'm kinda new to this so any help will be appreciated.
Thanks in advance.
You might be able to do this with pure cypher (or neography generated cypher). Something like this perhaps:
MATCH (n:Label) WHERE n.name={id}
WITH n
CREATE (p:LabelSample {name: n.name})-[:belongs_to]->n
Note that I'm using CREATE, but if you don't want to create duplicate LabelSample nodes you could do:
MATCH (n:Label) WHERE n.name={id}
WITH n
MERGE (p:LabelSample {name: n.name})
CREATE p-[:belongs_to]->n
Note that I'm using params, which are generally recommended for performance (though this is just one query, so it's not as big of a deal)
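For around 100k lines, one round trip per line is usually the bottleneck, so the same parameterized query can be applied to many ids at once with UNWIND. The sketch below is in Python rather than Ruby and only illustrates the batching idea. The names `batched` and `link_all`, the batch size, and the `run` callable (anything that accepts a query string plus keyword parameters) are assumptions, and the query uses the newer `$param` syntax instead of `{param}`:

```python
# One query handles a whole list of ids: UNWIND turns the list parameter
# into rows, and the MATCH / MERGE / CREATE steps run once per row.
BATCH_QUERY = (
    "UNWIND $ids AS id "
    "MATCH (n:Label {name: id}) "
    "MERGE (p:LabelSample {name: id}) "
    "CREATE (p)-[:belongs_to]->(n)"
)


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def link_all(run, ids, size=1000):
    """Send one query per batch and return the number of queries sent."""
    sent = 0
    for batch in batched(ids, size):
        run(BATCH_QUERY, ids=batch)
        sent += 1
    return sent


# Stand-in runner that records the batches instead of contacting a server.
sent_batches = []
queries_sent = link_all(
    lambda query, ids: sent_batches.append(ids),
    ["a", "b", "c", "d", "e"],
    size=2,
)
print(queries_sent, sent_batches)
```

The batch size is a tuning knob: larger batches mean fewer round trips but bigger transactions.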