I'm trying to create a node with its relationships in a test environment. I already have the node with its relationships in another database. Is it somehow possible to get a CREATE statement that inserts the node into the test environment?
I believe it is possible to do this in plain Cypher, but it is quite tricky. I have previously used Cypher to generate command snippets in the form of import-cypher -o [filename] [query] to export data. This can be adapted to your use case, but there are some design decisions that have to be made:
How do we identify nodes?
Do relationships have properties?
To get started, let's create an example node:
CREATE (:Label1:Label2 {prop1: 'string', prop2: 123, prop3: true})
To generate this from the database, use the following command:
MATCH (n)
WITH
reduce(
acc = '', label IN labels(n) |
acc + ':`' + label + '`')
AS labels,
reduce(
acc = '', key IN keys(n) |
acc + '`' + key + '`: ' +
CASE n[key] = true WHEN true THEN 'true' ELSE
CASE n[key] = false WHEN true THEN 'false' ELSE
CASE toInteger(n[key]) = n[key] WHEN true THEN n[key] ELSE
CASE toFloat(n[key]) = n[key] WHEN true THEN n[key] ELSE
"'" + n[key] + "'" END END END END
+ ', ') AS properties
WITH
labels,
substring(properties, 0, length(properties) - 2) AS properties
RETURN
'CREATE (' + labels + ' {' + properties + '})'
This query results in a CREATE command that is essentially the same as the one we have started with:
CREATE (:`Label1`:`Label2` {`prop1`: 'string', `prop2`: 123, `prop3`: true})
To connect neighbouring nodes, we'll need some ids - I'll improve this answer tomorrow based on the feedback I get.
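If the node data has already been fetched by a client, the same string building can be done outside Cypher. Below is a minimal Python sketch, not part of the original answer. The helper names cypher_literal and create_statement are made up for illustration, and it assumes property values are only booleans, numbers, or strings and that labels and keys contain no backticks.

```python
def cypher_literal(value):
    """Render a Python value as a Cypher literal (booleans, numbers, strings only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # Escape backslashes and single quotes so the string literal stays valid.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def create_statement(labels, properties):
    """Build a CREATE statement for one node from its labels and properties."""
    # Assumption: labels and keys contain no backticks themselves.
    label_part = "".join(":`" + label + "`" for label in labels)
    prop_part = ", ".join(
        "`" + key + "`: " + cypher_literal(value) for key, value in properties.items()
    )
    return "CREATE (" + label_part + " {" + prop_part + "})"


statement = create_statement(
    ["Label1", "Label2"], {"prop1": "string", "prop2": 123, "prop3": True}
)
print(statement)
# CREATE (:`Label1`:`Label2` {`prop1`: 'string', `prop2`: 123, `prop3`: true})
```

Unlike the Cypher version, this escapes single quotes inside string values, which the query above does not attempt.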
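Then you can use separate sessions, one per Neo4j database, and try out the process: run MATCH (n:Label) RETURN n, extract the properties, and put them into an insert query. How you extract the properties depends on the language.
These were the steps I followed:
query = "MATCH (n) RETURN n"
result = session1.run(query)
for record in result:
    node = record["n"]
    # convert the node into a dict to get its properties
    props = dict(node)
    # labels cannot be parameters, so they are written into the query text
    labels = "".join(":`" + label + "`" for label in node.labels)
    # then insert them with an insert query (MERGE would need a key to match on)
    insert_query = "CREATE (n" + labels + ") SET n = $props RETURN n"
    session2.run(insert_query, props=props)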
This algorithm was implemented in python-neo4j.
Hope this helps!
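The query-building step can be separated from the driver calls so it can be checked without a database. This is only a sketch: build_insert is a hypothetical helper, not something the driver provides. It assumes the properties are passed as a single map parameter named props while the labels are written into the query text, since labels cannot be parameterized.

```python
def build_insert(labels, properties):
    """Return (query, parameters) that recreate one node on the target database."""
    # Labels are backtick-quoted; a backtick inside a label is doubled to escape it.
    # Sorting only makes the generated text deterministic.
    label_part = "".join(
        ":`" + label.replace("`", "``") + "`" for label in sorted(labels)
    )
    query = "CREATE (n" + label_part + ") SET n = $props RETURN n"
    return query, {"props": dict(properties)}


query, params = build_insert({"Person", "Employee"}, {"name": "Ann", "age": 41})
print(query)
print(params)
```

With the sessions from the steps above, the result would then be run as session2.run(query, params).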
So, I have a tree of Person and I'm trying to query a subtree of a given node in the tree (root) and limit the subtree levels returned (max):
with "A" as root, 3 as max
match (a:Person {name: root})-[:PARENT*1..max]->(c:Person)
return a, c
But it's giving me this error:
Invalid input 'max': expected "$", "]", "{" or <UNSIGNED_DECIMAL_INTEGER> (line 2, column 44 (offset: 71))
"match (a:Person {name: root})-[:PARENT*1..max]->(c:Person)"
Both root and max will be an input, so in the code I've tried parameterizing those values:
result = tx.run(
"""
match (a:Person {name: $root})-[:PARENT*1..$max]->(c:Person)
return a, c
""",
{"root": "A", "max": 2}
)
but:
code: Neo.ClientError.Statement.SyntaxError} {message: Parameter maps cannot be used in MATCH patterns (use a literal map instead, eg. "{id: {param}.id}
(Based on #nimrod serok's answer)
I guess, I can just sanitize the max and manually interpolate it into the query string. But I'm wondering if there's a cleaner way to do it.
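One way to insert max into the query string in code is plain concatenation. The value must be converted to a string first, and root can stay a regular parameter:
result = tx.run(
'''
match (a:Person {name: $root})-[:PARENT*1..''' + str(int(max)) + ''']->(c:Person)
return a, c
''',
root=root
)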
or use the equivalent query, but with format (again keeping root as a parameter so it is quoted correctly):
result = tx.run(
'''
match (a:Person)-[:PARENT*1..{}]->(c:Person)
WHERE a.name = $root
return a, c
'''.format(int(max)),
root=root
)
If you want to use the UI instead, you can use:
:param max => 2
As can be seen here
One of the simplest ways is to interpolate the values into the query string with a JavaScript template literal. Quote the string value, but not the depth, which must be a bare integer.
e.g.
result = tx.run(
`
match (a:Person {name:'${root}'})-[:PARENT*1..${max}]->(c:Person)
return a, c
`)
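Since the upper bound of a variable-length pattern has to be written into the query text, it is worth validating it before interpolating. Here is a small Python sketch of the "sanitize and interpolate" idea from the question. The function name subtree_query is made up for illustration, and root is still passed as an ordinary parameter.

```python
def subtree_query(max_depth):
    """Return the subtree query with a validated depth bound written into it."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int):
        raise TypeError("max_depth must be an integer")
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    return (
        "MATCH (a:Person {name: $root})-[:PARENT*1.." + str(max_depth) + "]->(c:Person) "
        "RETURN a, c"
    )


query = subtree_query(3)
print(query)
```

It would be used as tx.run(subtree_query(max), root=root). Anything that is not a positive integer is rejected before it can reach the query text.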
Assume that in an application, the user gives us a graph and we want to consider it as a pattern and find all occurrences of the pattern in the neo4j database. If we knew what the pattern is, we could write the pattern as a Cypher query and run it against our database. However, now we do not know what the pattern is beforehand and receive it from the user in the form of a graph. How can we perform a pattern matching on the database based on the given graph (pattern)? Is there any apoc for that? Any external library?
One way of doing this is to decompose your input graph into edges and create a dynamic cypher from it. I have worked on this quite some time ago, and the solution below is not perfect but indicates a possible direction.
For example, if you feed this graph:
and you take the id(node) of each node in the graph (I am not taking the relationship ids; this is one of the imperfections), then this query
WITH $nodeids AS selection
UNWIND selection AS s
WITH COLLECT (DISTINCT s) AS selection
WITH selection,
SPLIT(left('a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z',SIZE(selection)*2-1),",") AS nodeletters
WITH selection,
nodeletters,
REDUCE (acc="", nl in nodeletters |
CASE acc
WHEN "" THEN acc+nl
ELSE acc+','+nl
END) AS rtnnodes
MATCH (n) WHERE id(n) IN selection
WITH COLLECT(n) AS nodes,selection,nodeletters,rtnnodes
UNWIND nodes AS n
UNWIND nodes AS m
MATCH (n)-[r]->(m)
WITH DISTINCT "("
+nodeletters[REDUCE(x=[-1,0], i IN selection | CASE WHEN i = id(n) THEN [x[1], x[1]+1] ELSE [x[0], x[1]+1] END)[0]]
+TRIM(REDUCE(acc = '', p IN labels(n)| acc + ':'+ p))+")-[:"+type(r)+"]->("
+ nodeletters[REDUCE(x=[-1,0], i IN selection | CASE WHEN i = id(m) THEN [x[1], x[1]+1] ELSE [x[0], x[1]+1] END)[0]]
+TRIM(REDUCE(acc = '', p IN labels(m)| acc + ':'+ p))+")" as z,rtnnodes
WITH COLLECT(z) AS parts,rtnnodes
WITH REDUCE(y=[], x in range(0, size(parts)-1) | y + replace(parts[x],"[","[r" + (x+1))) AS parts2,
REDUCE (acc="", x in range(0, size(parts)-1) | CASE acc WHEN "" THEN acc+"r"+(x+1) ELSE acc+",r"+(x+1) END) AS rtnrels,
rtnnodes
RETURN
REDUCE (acc="MATCH ",p in parts2 |
CASE acc
WHEN "MATCH " THEN acc+p
ELSE acc+','+p
END)+
" RETURN "+
rtnnodes+","+rtnrels+
" LIMIT "+$limit
AS cypher
returns something like
cypher: "MATCH (a:Person)-[r1:DRIVES]->(b:Car),(a:Person)-[r2:KNOWS]->(c:Person) RETURN a,b,c,r1,r2 LIMIT 50"
which you can feed to the next query.
In Graphileon, you can just select the nodes, and the result will be visualized as well.
Disclosure: I work for Graphileon.
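If the user's pattern arrives in application code rather than as node ids in the database, the same decomposition into edges can be done client-side. The Python sketch below is an illustration under assumptions, not the Graphileon implementation: pattern_to_cypher is a hypothetical helper, the pattern is given as a dict of node id to labels plus a list of (source, type, target) edges, and relationship properties are ignored.

```python
import string


def pattern_to_cypher(nodes, edges, limit=50):
    """Turn a small pattern graph into a MATCH query string.

    nodes: dict mapping a node id to its list of labels
    edges: list of (source_id, relationship_type, target_id) tuples
    """
    if not edges:
        raise ValueError("the pattern needs at least one edge")
    if len(nodes) > len(string.ascii_lowercase):
        raise ValueError("this sketch supports at most 26 pattern nodes")
    # Give every pattern node a one-letter variable: a, b, c, ...
    names = dict(zip(nodes, string.ascii_lowercase))

    def node_text(node_id):
        labels = "".join(":" + label for label in nodes[node_id])
        return "(" + names[node_id] + labels + ")"

    parts = []
    rel_names = []
    for index, (source, rel_type, target) in enumerate(edges, start=1):
        rel = "r" + str(index)  # r1, r2, ... one variable per edge
        rel_names.append(rel)
        parts.append(
            node_text(source) + "-[" + rel + ":" + rel_type + "]->" + node_text(target)
        )

    returned = list(names.values()) + rel_names
    return (
        "MATCH " + ",".join(parts)
        + " RETURN " + ",".join(returned)
        + " LIMIT " + str(int(limit))
    )


cypher = pattern_to_cypher(
    {10: ["Person"], 11: ["Car"], 12: ["Person"]},
    [(10, "DRIVES", 11), (10, "KNOWS", 12)],
)
print(cypher)
```

For this input the sketch produces the same query text as the sample output shown above. Labels and relationship types are written into the query unescaped, so they should come from a trusted source.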
I have used patterns in genealogy queries.
The X-chromosome is not transmitted from father to son. As you traverse a family tree you can use the reduce function to create a concatenated string of the sex of the ancestor. You can then accept results that lack MM (father-son). This query gives all the descendants inheriting the ancestor's (RN=32) X-chromosome.
match p=(n:Person{RN:32})<-[:father|mother*..99]-(m)
with m, reduce(status ='', q IN nodes(p)| status + q.sex) AS c
where c=replace(c,'MM','')
return distinct m.fullname as Fullname
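The filter in the WHERE clause is easy to misread, so here is the same test as a tiny Python sketch. It assumes each chain is the concatenated sex of the people on the path, ancestor first; the function name inherits_x is made up for illustration.

```python
def inherits_x(sex_chain):
    """True if a chain like 'MFFM' (ancestor first) has no father-son step."""
    # Same test as `c = replace(c, 'MM', '')` in the query: removing 'MM'
    # leaves the string unchanged only when 'MM' never occurs.
    return sex_chain.replace("MM", "") == sex_chain


chains = ["MF", "MFM", "MMF", "FMFM", "FMM"]
carriers = [chain for chain in chains if inherits_x(chain)]
print(carriers)  # ['MF', 'MFM', 'FMFM']
```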
I am developing other pattern specific queries as part of a Neo4j PlugIn for genealogy. These will include patterns of triangulation groups.
GitHub repository for Neo4j Genealogy PlugIn
I have a hierarchy in the following format:
Application.A <-[:belongs_to {roles:instance}]-Instance.A1-[:depends_on {roles:host}]-> Host.vm1
implying Application A has an instance A1 which is running on Host vm1 with the relation "belongs_to" and "depends_on".
Instance.A1-[:depends_on {roles:instance}]-> Database db1 <-[:belongs_to]-Instance.dbNode1-[:depends_on {roles:host}]-> Host.vm2
implying instance A1 of Application A depends on Database db1 that has an Instance dbNode1 running on Host vm2.
I am able to write individual cyphers and process the result in my java API.
I am trying to write a single Cypher query that will take an Application as input (in this case A) and return the whole hierarchy.
Something like this....
A.A1.vm1.db1.dbNode1.vm2
Is this doable? If yes, would appreciate some pointers.
Thanks.
It is certainly possible.
I would advise against putting, in your relationships, properties that would need to be used for matching purposes -- since Cypher does not allow you to index them yet. You should just have specific relationship types.
Also, to indicate which instance of the db1 DB is being used by a1, you really need a direct relationship between a1 and dbNode1. You cannot only have a relationship between a1 and db1, since it would not be clear which instance of db1 is being used by a1.
Here is an example of what you could do:
MATCH
(a1:Instance)-[:is_instance_of]->(a:Application {name: "MyApp"}),
(a1)-[:runs_on_host]->(vm1:Host),
(a1)-[:uses_db_type]->(db1:Database),
(a1)-[:uses_db_instance]->(dbNode1:Instance)-[:is_instance_of]->(db1),
(dbNode1)-[:runs_on_host]->(vm2:Host)
RETURN a.name + "." + a1.name + "." + vm1.name + "." + db1.name + "." + dbNode1.name + "." + vm2.name AS result;
Note that this simple query would not match an app instance that does not use a DB. If you need to match such instances as well, you can use an OPTIONAL MATCH clause:
MATCH
(a1:Instance)-[:is_instance_of]->(a:Application {name: "MyApp"}),
(a1)-[:runs_on_host]->(vm1:Host)
OPTIONAL MATCH
(a1)-[:uses_db_type]->(db1:Database),
(a1)-[:uses_db_instance]->(dbNode1:Instance)-[:is_instance_of]->(db1),
(dbNode1)-[:runs_on_host]->(vm2:Host)
RETURN
CASE WHEN db1 IS NULL THEN a.name + "." + a1.name + "." + vm1.name
ELSE a.name + "." + a1.name + "." + vm1.name + "." + db1.name + "." + dbNode1.name + "." + vm2.name
END
AS result;
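If you would rather return the individual names and assemble the string in the API layer, the CASE logic is only a few lines. A Python sketch for illustration (the question uses Java, and hierarchy_path is a hypothetical helper, not part of any Neo4j API):

```python
def hierarchy_path(app, instance, host, db=None, db_instance=None, db_host=None):
    """Join the names into one dotted path, leaving out the DB part when absent."""
    parts = [app, instance, host]
    if db is not None:
        # Mirrors the ELSE branch of the CASE expression.
        parts += [db, db_instance, db_host]
    return ".".join(parts)


with_db = hierarchy_path("A", "A1", "vm1", "db1", "dbNode1", "vm2")
without_db = hierarchy_path("A", "A1", "vm1")
print(with_db)     # A.A1.vm1.db1.dbNode1.vm2
print(without_db)  # A.A1.vm1
```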
I have modeled my neo4j database according to this answer by Nicole White in this link
and I also successfully tested the cypher query
MATCH (a:Stop {name:'A'}), (d:Stop {name:'D'})
MATCH route = allShortestPaths((a)-[:STOPS_AT*]-(d)),
stops = (a)-[:NEXT*]->(d)
RETURN EXTRACT(x IN NODES(route) | CASE WHEN x:Stop THEN 'Stop ' + x.name
WHEN x:Bus THEN 'Bus ' + x.id
ELSE '' END) AS itinerary,
REDUCE(d = 0, x IN RELATIONSHIPS(stops) | d + x.distance) AS distance
against a small test graph with 10 nodes.
But my original graph which contains about 2k nodes and 6k relationships causes trouble with the query. The query simply stops and I get an error:
java.lang.OutOfMemoryError: Java heap space
Can you help me to optimize my query or any other solution?
Thank you
Try to introduce a WITH to limit the calculation of :NEXT paths to only those pairs of a, d that are known to be connected by a shortest path. It's also good practice to supply an upper limit for variable-length path matches; I'm using 100 here as an example (note that *..100 means "up to 100 hops", whereas *100 would mean "exactly 100 hops"):
MATCH route = allShortestPaths(
(a:Stop {name:'A'})-[:STOPS_AT*..100]-(d:Stop {name:'D'})
)
WITH route, a, d
MATCH stops = (a)-[:NEXT*..100]->(d)
RETURN EXTRACT(x IN NODES(route) | CASE WHEN x:Stop THEN 'Stop ' + x.name
WHEN x:Bus THEN 'Bus ' + x.id
ELSE '' END) AS itinerary,
REDUCE(d = 0, x IN RELATIONSHIPS(stops) | d + x.distance) AS distance
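For readers unfamiliar with EXTRACT and REDUCE, the RETURN clause does roughly what the following Python sketch does for one result row. The dict layout of the nodes and relationships here is an assumption made for illustration; the driver's own node and relationship types look different.

```python
def summarize(route_nodes, stop_relationships):
    """Build the itinerary labels and the total distance for one result row."""
    itinerary = []
    for node in route_nodes:
        if "Stop" in node["labels"]:
            itinerary.append("Stop " + str(node["name"]))
        elif "Bus" in node["labels"]:
            itinerary.append("Bus " + str(node["id"]))
        else:
            itinerary.append("")  # same as the ELSE '' branch
    # REDUCE(d = 0, x IN RELATIONSHIPS(stops) | d + x.distance)
    distance = sum(rel["distance"] for rel in stop_relationships)
    return itinerary, distance


route = [
    {"labels": ["Stop"], "name": "A"},
    {"labels": ["Bus"], "id": 1},
    {"labels": ["Stop"], "name": "D"},
]
stops = [{"distance": 2}, {"distance": 3}, {"distance": 4}]
itinerary, distance = summarize(route, stops)
print(itinerary, distance)
```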
The following examples are taken from the Neo4j documentation found here.
Using Cypher, it is possible to remove a single, known label using a Cypher statement like so:
MATCH (n { name: 'Peter' })
REMOVE n:German
RETURN n
You can also remove multiple labels like so:
MATCH (n { name: 'Peter' })
REMOVE n:German:Swedish
RETURN n
So how would one remove all labels from a node using simple Cypher statements?
You can also try this way using doIt method from apoc library:
match (n {name: 'Peter'})
call apoc.cypher.doIt(
"match (o)" +
" where ID(o) = " + ID(n) +
" remove "+reduce(a="o",b in labels(n) | a+":"+b) +
" return (o);",
null)
yield value
return value
There's no syntax for that yet! Labels are usually things that are known quantities, so you can list them all out if you want. There's no dynamic way to remove them all, though.
So, how about a two-step Cypher approach? Use Cypher to generate some Cypher statements and then execute those statements in the shell.
You could try something like this to generate the batch cypher statements
match (n)
return distinct "match (n"
+ reduce( lbl_str= "", l in labels(n) | lbl_str + ":" + l)
+ ") remove n"
+ reduce( lbl_str= "", l in labels(n) | lbl_str + ":" + l)
+ ";"
The output should look something like this...
match (n:Label_1:Label_2) remove n:Label_1:Label_2;
match (n:Label_1:Label_3) remove n:Label_1:Label_3;
match (n:Label_2:Label_4) remove n:Label_2:Label_4;
You would probably want to remove any duplicates and depending on your data there could be quite a few.
Not exactly what you are looking for but I think it would get you to the same end state using just cypher and the neo4j shell.
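If the label combinations are fetched by a client first (for example with MATCH (n) RETURN DISTINCT labels(n)), the statements can also be generated and de-duplicated there. A minimal Python sketch, assuming the label lists have already been retrieved; remove_label_statements is a made-up helper name:

```python
def remove_label_statements(label_sets):
    """Return one 'match ... remove ...;' statement per distinct label combination."""
    statements = []
    seen = set()
    for labels in label_sets:
        key = tuple(sorted(labels))  # label order does not matter
        if not key or key in seen:   # skip unlabeled nodes and duplicates
            continue
        seen.add(key)
        suffix = "".join(":" + label for label in key)
        statements.append("match (n" + suffix + ") remove n" + suffix + ";")
    return statements


label_sets = [["Label_1", "Label_2"], ["Label_2", "Label_1"], ["Label_1", "Label_3"], []]
statements = remove_label_statements(label_sets)
for statement in statements:
    print(statement)
```

Here the second list is the same combination as the first in a different order, so only two statements come out.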
Shiny NEW and improved cypher below...
I edited this down to something that works in the browser alone. I think this is a much better solution. It is still two steps, but it produces a single statement that can be copied and pasted into the browser.
match (n)
with distinct labels(n) as Labels
with reduce(lbl_str="", l in Labels | lbl_str + ":" + l) as Labels
order by Labels
with collect(Labels) as Labels
with Labels, range(0,length(Labels) - 1) as idx
unwind idx as i
return "match (n" + toString(i) + Labels[i] + ")" as Statement
union
match (n)
with distinct labels(n) as Labels
with reduce(lbl_str="", l in Labels | lbl_str + ":" + l) as Labels
order by Labels
with collect(Labels) as Labels
with Labels, range(0,length(Labels) - 1) as idx
unwind idx as i
return "remove n" + toString(i) + Labels[i] as Statement
which produces output like this...
match (n0:Label_A)
match (n1:Label_B)
match (n2:Label_C:Label_D)
match (n3:Label_E)
remove n0:Label_A
remove n1:Label_B
remove n2:Label_C:Label_D
remove n3:Label_E
which can then be copied and pasted into the Neo4j browser.