Remove all labels for Neo4j Node - neo4j

The following examples are taken from the Neo4j documentation found here.
Using Cypher, it is possible to remove a single, known label using a Cypher statement like so:
MATCH (n { name: 'Peter' })
REMOVE n:German
RETURN n
You can also remove multiple labels like so:
MATCH (n { name: 'Peter' })
REMOVE n:German:Swedish
RETURN n
So how would one remove all labels from a node using simple Cypher statements?

You can also do this with the apoc.cypher.doIt procedure from the APOC library:
match (n {name: 'Peter'})
call apoc.cypher.doIt(
"match (o)" +
" where ID(o) = " + ID(n) +
" remove "+reduce(a="o",b in labels(n) | a+":"+b) +
" return (o);",
null)
yield value
return value
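The string concatenation in the apoc.cypher.doIt call above can also be done on the client side. The following is a minimal sketch, not part of the original answer: the helper names `escape_label` and `build_remove_all_labels` are hypothetical, and it assumes you have already fetched the node's labels (for example with `labels(n)`). It also handles the case of a node with no labels, where a bare `remove o` would not be valid Cypher.

```python
def escape_label(label):
    """Quote a label with backticks, doubling any backtick inside it."""
    return "`" + label.replace("`", "``") + "`"


def build_remove_all_labels(labels, var="n"):
    """Return a REMOVE clause covering every label, or "" if there are none."""
    if not labels:
        return ""
    return "REMOVE " + var + "".join(":" + escape_label(l) for l in labels)


clause = build_remove_all_labels(["German", "Swedish"])
print(clause)
```

The resulting clause would be appended to a MATCH that binds the same variable name.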

There's no syntax for that yet! Labels are usually things that are known quantities, so you can list them all out if you want. There's no dynamic way to remove them all, though.

So, how about a two-step Cypher approach? Use Cypher to generate some Cypher statements, and then execute those statements in the shell.
You could try something like this to generate the batch Cypher statements:
match (n)
return distinct "match (n"
+ reduce( lbl_str= "", l in labels(n) | lbl_str + ":" + l)
+ ") remove n"
+ reduce( lbl_str= "", l in labels(n) | lbl_str + ":" + l)
+ ";"
The output should look something like this...
match (n:Label_1:Label_2) remove n:Label_1:Label_2;
match (n:Label_1:Label_3) remove n:Label_1:Label_3;
match (n:Label_2:Label_4) remove n:Label_2:Label_4;
You would probably want to remove any duplicates and depending on your data there could be quite a few.
Not exactly what you are looking for but I think it would get you to the same end state using just cypher and the neo4j shell.
Shiny NEW and improved cypher below...
I edited this down to something that works in the browser alone. I think this is a much better solution. It is still two steps, but it produces a single statement that can be copied and pasted into the browser.
match (n)
with distinct labels(n) as Labels
with reduce(lbl_str="", l in Labels | lbl_str + ":" + l) as Labels
order by Labels
with collect(Labels) as Labels
with Labels, range(0,size(Labels) - 1) as idx
unwind idx as i
return "match (n" + toString(i) + Labels[i] + ")" as Statement
union
match (n)
with distinct labels(n) as Labels
with reduce(lbl_str="", l in Labels | lbl_str + ":" + l) as Labels
order by Labels
with collect(Labels) as Labels
with Labels, range(0,size(Labels) - 1) as idx
unwind idx as i
return "remove n" + toString(i) + Labels[i] as Statement
which produces output like this...
match (n0:Label_A)
match (n1:Label_B)
match (n2:Label_C:Label_D)
match (n3:Label_E)
remove n0:Label_A
remove n1:Label_B
remove n2:Label_C:Label_D
remove n3:Label_E
which can then be copied and pasted into the Neo4j browser.
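For comparison, the same statement generation can be sketched on the client side. This is only an illustration under assumptions: `build_statements` is a hypothetical helper, and the label combinations are assumed to have been fetched already (for example with `match (n) return distinct labels(n)`). Unlike the Cypher version, it skips nodes without labels, for which an empty remove clause would be invalid.

```python
def label_suffix(labels):
    """Turn ["A", "B"] into ":A:B"."""
    return "".join(":" + label for label in labels)


def build_statements(label_sets):
    """Build numbered match lines followed by the corresponding remove lines."""
    # Deduplicate and sort the label combinations; drop empty ones.
    suffixes = sorted({label_suffix(ls) for ls in label_sets if ls})
    matches = ["match (n%d%s)" % (i, s) for i, s in enumerate(suffixes)]
    removes = ["remove n%d%s" % (i, s) for i, s in enumerate(suffixes)]
    return matches + removes


statements = build_statements(
    [["Label_A"], ["Label_B"], ["Label_C", "Label_D"], ["Label_A"], ["Label_E"], []]
)
print("\n".join(statements))
```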

Related

Cypher How to use variables in a query

So, I have a tree of Person and I'm trying to query a subtree of a given node in the tree (root) and limit the subtree levels returned (max):
with "A" as root, 3 as max
match (a:Person {name: root})-[:PARENT*1..max]->(c:Person)
return a, c
But it's giving me this error:
Invalid input 'max': expected "$", "]", "{" or <UNSIGNED_DECIMAL_INTEGER> (line 2, column 44 (offset: 71))
"match (a:Person {name: root})-[:PARENT*1..max]->(c:Person)"
Both root and max will be an input, so in the code I've tried parameterizing those values:
result = tx.run(
"""
match (a:Person {name: $root})-[:PARENT*1..$max]->(c:Person)
return a, c
""",
{"root": "A", "max": 2}
)
but:
code: Neo.ClientError.Statement.SyntaxError} {message: Parameter maps cannot be used in MATCH patterns (use a literal map instead, eg. "{id: {param}.id}
(Based on @nimrod serok's answer)
I guess, I can just sanitize the max and manually interpolate it into the query string. But I'm wondering if there's a cleaner way to do it.
One way to insert the parameters in the code is:
result = tx.run(
'''
match (a:Person {name: "''' + root + '''"})-[:PARENT*1..''' + str(max) + ''']->(c:Person)
return a, c
'''
)
or use the equivalent query, but with format:
result = tx.run(
'''
match (a:Person)-[:PARENT*1..{}]->(c:Person)
WHERE a.name="{}"
return a, c
'''.format(max, root)
)
If you want to use the UI instead, you can use:
:param max => 2
As can be seen here
One of the simplest ways is to interpolate the values directly into the query string. In JavaScript you can do this with a template literal,
e.g.
result = tx.run(
`
match (a:Person {name:'${root}'})-[:PARENT*1..${max}]->(c:Person)
return a, c
`)
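The "sanitize max and interpolate it" idea from the question can be sketched as follows. This is a minimal sketch under assumptions: `build_subtree_query` is a hypothetical helper, only the hop bound is interpolated (after checking that it is a positive integer), and the name is still passed as the `$root` parameter, which Cypher does accept inside the property map.

```python
def build_subtree_query(max_depth):
    """Return the subtree query with a validated upper hop bound."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int):
        raise TypeError("max_depth must be an int")
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    return (
        "MATCH (a:Person {name: $root})-[:PARENT*1.."
        + str(max_depth)
        + "]->(c:Person)\n"
        "RETURN a, c"
    )


query = build_subtree_query(3)
params = {"root": "A"}
print(query)
# With the transaction object from the question this would be run as:
# result = tx.run(query, params)
```

Because the bound is checked to be an integer before it reaches the string, user input cannot inject arbitrary Cypher through it.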

Pattern Matching in Neo4j

Assume that in an application, the user gives us a graph and we want to consider it as a pattern and find all occurrences of the pattern in the neo4j database. If we knew what the pattern is, we could write the pattern as a Cypher query and run it against our database. However, now we do not know what the pattern is beforehand and receive it from the user in the form of a graph. How can we perform a pattern matching on the database based on the given graph (pattern)? Is there any apoc for that? Any external library?
One way of doing this is to decompose your input graph into edges and create a dynamic cypher from it. I have worked on this quite some time ago, and the solution below is not perfect but indicates a possible direction.
For example, if you feed this graph:
and you take the id(node) values from the graph (I am not taking the relationship ids, which is one of the imperfections),
this query
WITH $nodeids AS selection
UNWIND selection AS s
WITH COLLECT (DISTINCT s) AS selection
WITH selection,
SPLIT(left('a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z',SIZE(selection)*2-1),",") AS nodeletters
WITH selection,
nodeletters,
REDUCE (acc="", nl in nodeletters |
CASE acc
WHEN "" THEN acc+nl
ELSE acc+','+nl
END) AS rtnnodes
MATCH (n) WHERE id(n) IN selection
WITH COLLECT(n) AS nodes,selection,nodeletters,rtnnodes
UNWIND nodes AS n
UNWIND nodes AS m
MATCH (n)-[r]->(m)
WITH DISTINCT "("
+nodeletters[REDUCE(x=[-1,0], i IN selection | CASE WHEN i = id(n) THEN [x[1], x[1]+1] ELSE [x[0], x[1]+1] END)[0]]
+TRIM(REDUCE(acc = '', p IN labels(n)| acc + ':'+ p))+")-[:"+type(r)+"]->("
+ nodeletters[REDUCE(x=[-1,0], i IN selection | CASE WHEN i = id(m) THEN [x[1], x[1]+1] ELSE [x[0], x[1]+1] END)[0]]
+TRIM(REDUCE(acc = '', p IN labels(m)| acc + ':'+ p))+")" as z,rtnnodes
WITH COLLECT(z) AS parts,rtnnodes
WITH REDUCE(y=[], x in range(0, size(parts)-1) | y + replace(parts[x],"[","[r" + (x+1))) AS parts2,
REDUCE (acc="", x in range(0, size(parts)-1) | CASE acc WHEN "" THEN acc+"r"+(x+1) ELSE acc+",r"+(x+1) END) AS rtnrels,
rtnnodes
RETURN
REDUCE (acc="MATCH ",p in parts2 |
CASE acc
WHEN "MATCH " THEN acc+p
ELSE acc+','+p
END)+
" RETURN "+
rtnnodes+","+rtnrels+
" LIMIT "+$limit
AS cypher
returns something like
cypher: "MATCH (a:Person)-[r1:DRIVES]->(b:Car),(a:Person)-[r2:KNOWS]->(c:Person) RETURN a,b,c,r1,r2 LIMIT 50"
which you can feed to the next query.
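The decomposition into edges can also be done in application code when the pattern graph arrives from the user rather than from the database. The following is a minimal sketch, not the answer's implementation: `build_pattern_query` and its input shapes (a dict of node id to labels, and a list of `(source, type, target)` tuples) are assumptions made for illustration. Like the query above, it is limited to 26 nodes because it names them with single letters.

```python
import string


def build_pattern_query(nodes, edges, limit=50):
    """nodes: {node_id: [labels]}; edges: [(src_id, rel_type, dst_id)]."""
    if not edges:
        raise ValueError("the pattern needs at least one edge")
    if len(nodes) > len(string.ascii_lowercase):
        raise ValueError("at most 26 nodes are supported")
    # One variable letter per node, in insertion order: a, b, c, ...
    letters = {nid: string.ascii_lowercase[i] for i, nid in enumerate(nodes)}

    def node(nid):
        return "(" + letters[nid] + "".join(":" + l for l in nodes[nid]) + ")"

    parts, rels = [], []
    for i, (src, rel_type, dst) in enumerate(edges, start=1):
        rels.append("r%d" % i)
        parts.append("%s-[r%d:%s]->%s" % (node(src), i, rel_type, node(dst)))
    returned = list(letters.values()) + rels
    return (
        "MATCH " + ",".join(parts)
        + " RETURN " + ",".join(returned)
        + " LIMIT " + str(int(limit))
    )


pattern = build_pattern_query(
    {10: ["Person"], 11: ["Car"], 12: ["Person"]},
    [(10, "DRIVES", 11), (10, "KNOWS", 12)],
)
print(pattern)
```

Labels and relationship types are interpolated unescaped here, so they should be validated first if they come from untrusted input.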
In Graphileon, you can just select the nodes, and the result will be visualized as well.
Disclosure : I work for Graphileon
I have used patterns in genealogy queries.
The X-chromosome is not transmitted from father to son. As you traverse a family tree you can use the reduce function to create a concatenated string of the sex of the ancestor. You can then accept results that lack MM (father-son). This query gives all the descendants inheriting the ancestor's (RN=32) X-chromosome.
match p=(n:Person{RN:32})<-[:father|mother*..99]-(m)
with m, reduce(status ='', q IN nodes(p)| status + q.sex) AS c
where c=replace(c,'MM','')
return distinct m.fullname as Fullname
I am developing other pattern specific queries as part of a Neo4j PlugIn for genealogy. These will include patterns of triangulation groups.
GitHub repository for Neo4j Genealogy PlugIn
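To make the filter in the query above concrete: `c = replace(c, 'MM', '')` is true exactly when the concatenated sex string contains no "MM", that is, no father-to-son step. A small sketch of the same test, where `x_can_be_inherited` is a hypothetical helper name and the string is assumed to list sexes from the ancestor down to the descendant:

```python
def x_can_be_inherited(sex_path):
    """True if no father-to-son step ("MM") occurs along the path."""
    return sex_path.replace("MM", "") == sex_path


# Mother -> son -> daughter: the X chromosome can be passed on at each step.
print(x_can_be_inherited("FMF"))
# Mother -> son -> son: the second step is father to son, so it is excluded.
print(x_can_be_inherited("FMM"))
```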

Weird result returning conditional CASE WHEN value from apoc.do.when

We have the following graph,
where gray nodes (:Conversation) represent conversations between users (pink :User nodes). I created a query that tries to find the current conversation between a set of people and, if it does not exist, creates it. In both cases the conversation must be returned.
Here is its code:
MATCH (u1:User {login:"User_1"})
MATCH (u2:User {login:"User_2"})
MATCH (u3:User {login:"User_3"})
OPTIONAL MATCH
(conv:Conversation)-[:CONDUCTED_BY]->(u1),
(conv)-[:CONDUCTED_BY]->(u2),
(conv)-[:CONDUCTED_BY]->(u3)
WHERE NOT EXISTS {
MATCH (conv)-[:CONDUCTED_BY]->(u:User)
WHERE NOT u IN [u1, u2, u3]
}
CALL apoc.do.when(conv IS NULL,
"WITH $u1 AS u1, $u2 AS u2, $u3 AS u3 " +
"CREATE (conv:Conversation) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u1) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u2) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u3) " +
"RETURN conv AS conv",
"RETURN $conv AS conv", {u1:u1, u2:u2, u3:u3, conv:conv}) YIELD value
[...WEIRD PART...]
Explanation:
OPTIONAL MATCH - tries to find the current conversation between Users 1, 2 and 3 (conversations 71 and 72)
WHERE NOT EXISTS - excludes conversations between these users that also include others, such as User_4 (conversation 72)
We end up with only the one conversation we are interested in: 71
...and now the weird part comes in at [...WEIRD PART...]
If we replace [...WEIRD PART...] with code
RETURN value.conv
everything is fine. Before I arrived at this solution, however, I was struggling with different code, in which the APOC parameter map did not include conv:conv and the else-query was just "":
WITH CASE WHEN conv IS NULL THEN value ELSE conv END AS conv
RETURN conv
That part created a new conversation between users 1, 2 and 3 every time the query ran.
However if I replaced it with just
RETURN value
it worked correctly, by which I mean it did not create a new conversation between users 1, 2 and 3 if one already existed.
ISSUE: I do not understand why the following code
MATCH (u1:User {login:"User_1"})
MATCH (u2:User {login:"User_2"})
MATCH (u3:User {login:"User_3"})
OPTIONAL MATCH
(conv:Conversation)-[:CONDUCTED_BY]->(u1),
(conv)-[:CONDUCTED_BY]->(u2),
(conv)-[:CONDUCTED_BY]->(u3)
WHERE NOT EXISTS {
MATCH (conv)-[:CONDUCTED_BY]->(u:User)
WHERE NOT u IN [u1, u2, u3]
}
CALL apoc.do.when(conv IS NULL,
"WITH $u1 AS u1, $u2 AS u2, $u3 AS u3 " +
"CREATE (conv:Conversation) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u1) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u2) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u3) " +
"RETURN conv",
"", {u1:u1, u2:u2, u3:u3}) YIELD value
WITH CASE WHEN conv IS NULL THEN value ELSE conv END AS conv
RETURN conv
could be responsible for such weird behavior.
This was a bug in the namespacing of existential subqueries. Since conv is reused to mean something else, Cypher rewrites all but the last two occurrences of conv to conv#x and the last two to conv#y in order to distinguish between them. Here x and y are the positions of the first occurrence of conv with a specific meaning. This rewriting was not properly propagated to the inner subquery.
It has been fixed here: https://github.com/neo4j/neo4j/commit/2890463ad6d2f323bfbad5cf453f14b42f51c830 and will be included in the next patch release of Neo4j 4.0.
Thanks for the clarification, I can reproduce this, and it's definitely not expected. This looks like a bug to me.
We can circumvent the issue by renaming the alias used with your CASE to be conv2 or anything other than conv. This should work:
MATCH (u1:User {login:"User_1"})
MATCH (u2:User {login:"User_2"})
MATCH (u3:User {login:"User_3"})
OPTIONAL MATCH
(conv:Conversation)-[:CONDUCTED_BY]->(u1),
(conv)-[:CONDUCTED_BY]->(u2),
(conv)-[:CONDUCTED_BY]->(u3)
WHERE NOT EXISTS {
MATCH (conv)-[:CONDUCTED_BY]->(u:User)
WHERE NOT u IN [u1, u2, u3]
}
CALL apoc.do.when(conv IS NULL,
"WITH $u1 AS u1, $u2 AS u2, $u3 AS u3 " +
"CREATE (conv:Conversation) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u1) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u2) " +
"MERGE (conv)-[:CONDUCTED_BY]->(u3) " +
"RETURN conv",
"", {u1:u1, u2:u2, u3:u3}) YIELD value
WITH CASE WHEN conv IS NULL THEN value ELSE conv END AS conv2
RETURN conv2
I'll raise this with our engineers to confirm and start on a bug fix.

How to get the CREATE command of a Neo4j node?

I'm trying to create a Node with its relationships in a test environment. I got the node with its relationships in another database. Is it somehow possible to get a CREATE statement that inserts the node in the test environment?
I believe it is possible to do this in plain Cypher, but it is quite tricky. I have previously used Cypher to generate command snippets in the form of import-cypher -o [filename] [query] to export data. This can be adapted to your use case, but there are some design decisions that have to be made:
How do we identify nodes?
Do relationships have properties?
To get started, let's create an example node:
CREATE (:Label1:Label2 {prop1: 'string', prop2: 123, prop3: true})
To generate this from the database, use the following command:
MATCH (n)
WITH
reduce(
acc = '', label IN labels(n) |
acc + ':`' + label + '`')
AS labels,
reduce(
acc = '', key IN keys(n) |
acc + '`' + key + '`: ' +
CASE n[key] = true WHEN true THEN 'true' ELSE
CASE n[key] = false WHEN true THEN 'false' ELSE
CASE toInteger(n[key]) = n[key] WHEN true THEN n[key] ELSE
CASE toFloat(n[key]) = n[key] WHEN true THEN n[key] ELSE
"'" + n[key] + "'" END END END END
+ ', ') AS properties
WITH
labels,
substring(properties, 0, size(properties) - 2) AS properties
RETURN
'CREATE (' + labels + ' {' + properties + '})'
This query results in a CREATE command that is essentially the same as the one we have started with:
CREATE (:`Label1`:`Label2` {`prop1`: 'string', `prop2`: 123, `prop3`: true})
To connect neighbouring nodes, we'll need some ids - I'll improve this answer tomorrow based on the feedback I get.
Then you can use separate Neo4j sessions and try out the process: run MATCH (n:Label) RETURN n, then extract the properties and put them into an insert query. How you extract the properties depends on the language.
These were the steps I followed:
query = "MATCH (n) return n"
result = session1.run(query)
loop the result object and convert it into dict and get the properties
then insert them into an insert query
insert_query = "MERGE/CREATE (n:{properties}) return n"
session2.run(insert_query)
This algorithm was implemented in python-neo4j.
Hope this helps!
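The "extract the properties and put them into an insert query" step might look like the sketch below. It is an illustration under assumptions rather than the answerer's code: `build_create` is a hypothetical helper, labels are interpolated with backtick quoting because labels cannot be passed as parameters, and the properties are passed as a single map parameter (`$props`), which avoids having to format each value as a literal.

```python
def build_create(labels, properties, var="n"):
    """Return (query, params) that recreate a node with these labels and properties."""
    label_part = "".join(
        ":`" + label.replace("`", "``") + "`" for label in labels
    )
    query = "CREATE (" + var + label_part + " $props) RETURN " + var
    return query, {"props": dict(properties)}


create_query, create_params = build_create(
    ["Label1", "Label2"],
    {"prop1": "string", "prop2": 123, "prop3": True},
)
print(create_query)
# With the second session from the steps above this would be run as:
# session2.run(create_query, create_params)
```

Relationships are not covered by this sketch; recreating them needs some way to identify the nodes at both ends in the target database.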

Optimize cypher query to avoid cartesian product

The purpose of the query is pretty trivial. For a given node ID (userId), I want to return all nodes in the graph that have a relationship to it within X hops, and I want to aggregate and return the distances (dist, a property set on each relationship) between them.
I came up with this:
MATCH p=shortestPath((user:FOLLOWERS{userId:{1}})-[r:follow]-(f:FOLLOWERS))
WHERE f <> user
RETURN (f.userId) as userId,
reduce(s = '', rel IN r | s + rel.dist + ',') as dist,
length(r) as hop
The userId ({1}) is given as input and is indexed.
I believe I have a cartesian product here. How would you suggest avoiding it?
You can make the cartesian product less onerous by creating an index on :FOLLOWERS(userId) to speed up one of the two "legs" of the cartesian product:
CREATE INDEX ON :FOLLOWERS(userId);
Even though this will not get rid of the cartesian product, it will run in O(N log N) time, which is much faster than O(N ^ 2).
By the way, your r relationship needs to be variable-length in order for your query to work. You should specify a reasonable upper bound (which depends on your DB) to assure that the query will finish in a reasonable time and not run out of memory. For example:
MATCH p=shortestPath((user:FOLLOWERS { userId: 1 })-[r:follow*..5]-(f:FOLLOWERS))
WHERE f <> user
RETURN (f.userId) AS userId,
REDUCE (s = '', rel IN r | s + rel.dist + ',') AS dist,
LENGTH(p) AS hop;

Resources