Replacing a string on whole database - neo4j

The &amp string has somehow made it through different imports into the db, on many different node attributes and relationship attributes. How do I replace all &amp strings with a regular & char?
I don't know all the possible property names that I can filter on.

If you want to make this efficient, you can use CALL { ... } IN TRANSACTIONS OF X ROWS, which commits the updates in batches.
The :auto prefix is needed if you want to run this query in the Neo4j Browser.
This line:
WITH n, [x in keys(n) WHERE n[x] CONTAINS '&amp'] AS keys
is needed to avoid calling replace() on a property that is not of String type, in which case Neo4j throws an exception.
Full query:
:auto MATCH (n)
CALL {
WITH n
WITH n, [x in keys(n) WHERE n[x] CONTAINS '&amp'] AS keys
CALL apoc.create.setProperties(n, keys, [k in keys | replace(n[k], '&amp', '&')])
YIELD node
RETURN node
} IN TRANSACTIONS OF 100 ROWS
RETURN count(*)
If you're using a Neo4j cluster, you will need to run this on the leader of the database with the bolt connection (not using the neo4j:// protocol).
The same query for the relationships:
:auto MATCH (n)-[r]->(x)
CALL {
WITH r
WITH r, [x in keys(r) WHERE r[x] CONTAINS '&amp'] AS keys
CALL apoc.create.setRelProperties(r, keys, [k in keys | replace(r[k], '&amp', '&')])
YIELD rel
RETURN rel
} IN TRANSACTIONS OF 100 ROWS
RETURN count(*)
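Before running the update, it can help to preview which properties would be touched. The following is a minimal read-only sketch, not part of the original answer, that reuses the same key filter and only counts matches. It assumes the stored text is literally '&amp'; if the values actually contain the full entity '&amp;', search for that instead, otherwise the replacement leaves a stray semicolon behind.

```cypher
// Dry run: list label/property combinations that contain '&amp' (no writes)
MATCH (n)
WITH n, [k IN keys(n) WHERE n[k] CONTAINS '&amp'] AS keys
WHERE size(keys) > 0
RETURN labels(n) AS labels, keys, count(*) AS nodes
ORDER BY nodes DESC
```

The same pattern works for relationships by matching ()-[r]->() and grouping on type(r) instead of labels(n).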

You can find the answer in this documentation:
https://neo4j.com/labs/apoc/4.3/overview/apoc.create/apoc.create.setRelProperties/
For example, the query below replaces &amp with & in all properties of all nodes.
MATCH (p)
// collect keys (or properties) in node p and look for properties with &amp
WITH p, [k in keys(p) WHERE p[k] CONTAINS '&amp'] AS keys WHERE size(keys) > 0
// use apoc function to update the values in each prop key
CALL apoc.create.setProperties(p, keys, [k in keys | replace(p[k], '&amp', '&')])
YIELD node
RETURN node

Related

How to find and update node properties in neo4j to a new text

In my Neo4j DB, I want to write a query that finds all node properties containing a substring and updates that substring.
For example:
I have 10000 nodes like:
CREATE (n:Person {name: 'andyLovesNEO4J', title: 'NEO4J.BLAH'})
but the property names are all different.
I want to find the properties that contain 'NEO4J' and update that value to 'SQL' corresponding to them. Basically, update the strings that contain that substring.
Similar to Charchit's answer, you can use this APOC procedure, but without UNWIND.
You can pass the property keys and the new values as two lists to the APOC procedure.
MATCH (p:Person)
WITH p, keys(p) AS keys
CALL apoc.create.setProperties(p, keys, [k in keys | replace(p[k], 'NEO4J', 'SQL')])
YIELD node
RETURN node;
You can try this:
MATCH (p:Person)
UNWIND keys(properties(p)) AS keys
CALL apoc.create.setProperty(p, keys, replace(p[keys], 'NEO4J', 'SQL'))
YIELD node
RETURN DISTINCT node
Fetch the nodes, unwind the property keys, and then set the new value for the property using apoc.create.setProperty.
If you want to find the required nodes, and only update the necessary keys, try this:
MATCH (p) WHERE ANY (k IN keys(p) WHERE apoc.map.get(properties(p),k) CONTAINS 'NEO4J')
WITH p, [k IN keys(p) WHERE apoc.map.get(properties(p),k) CONTAINS 'NEO4J' | k] as keys
CALL apoc.create.setProperties(p, keys, [k in keys | replace(p[k], 'NEO4J', 'SQL')])
YIELD node
RETURN node;
Here, we have removed the Person label so that every node is checked, and we filter to keep only the relevant properties. I am using the setProperties procedure, as suggested by jose_bacoy in his answer, to avoid unnecessary complexity.

Update nodes by a list of ids and values in one cypher query

I have a list of IDs and a list of values. I want to match each node by its ID and set a property to the corresponding value.
With just one node that is super basic:
MATCH (n) WHERE n.id='node1' SET n.name='value1'
But I have a list of IDs ['node1', 'node2', 'node3'] and the same number of values ['value1', 'value2', 'value3'] (for simplicity I used a pattern, but the values and IDs vary a lot). My first approach was to use the query above and call the database once per node. But this is no longer appropriate, since I now have thousands of IDs, which would result in thousands of requests.
I came up with this approach, which iterates over each entry in both lists and sets the values. The first node from the node list has to get the first value from the value list, and so on.
MATCH (n) WHERE n.id IN["node1", "node2"]
WITH n, COLLECT(n) as nodeList, COLLECT(["value1","value2"]) as valueList
UNWIND nodeList as nodes
UNWIND valueList as values
FOREACH (index IN RANGE(0, size(nodeList)) | SET nodes.name=values[index])
RETURN nodes, values
The problem with this query is that every node gets the same value (the last of the value list). The reason is in the last part, SET nodes.name=values[index]: I can't use the index on the left side, as nodes[index].name doesn't work and the database throws an error if I do so. I tried it with nodeList, node and n. Nothing worked. I'm not sure if this is the right way to achieve the goal; maybe there is a more elegant way.
Create pairs from the ids and values first, then use UNWIND and a simple MATCH .. SET query:
// The first line will likely come from parameters instead
WITH ['node1', 'node2', 'node3'] AS ids,['value1', 'value2', 'value3'] AS values
// range() is inclusive at both ends, so stop at size(ids) - 1
WITH [i in range(0, size(ids) - 1) | {id:ids[i], value:values[i]}] as pairs
UNWIND pairs AS pair
MATCH (n:Node) WHERE n.id = pair.id
SET n.value = pair.value
The line
WITH [i in range(0, size(ids) - 1) | {id:ids[i], value:values[i]}] as pairs
combines two concepts: list comprehensions and maps. Using a list comprehension (with the WHERE clause omitted), it converts a list of indexes into a list of maps with id and value keys.
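If the client can build the pairs itself, the list comprehension can be skipped altogether. The sketch below assumes a query parameter named $pairs (a hypothetical name, not from the answer) holding a list of maps such as [{id: 'node1', value: 'value1'}, ...], and keeps the Node label and the value property used in the answer above.

```cypher
// $pairs is an assumed parameter: a list of {id, value} maps sent by the client
UNWIND $pairs AS pair
MATCH (n:Node {id: pair.id})
SET n.value = pair.value
RETURN count(n) AS updated
```

This sends all updates in a single request, and the returned count shows how many of the supplied ids actually matched a node.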

cypher to combine nodes and relationships into a single column

So as a complication to this question, I basically want to do
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() RETURN DISTINCT n, r
And I want to return n and r as one column with no repeat values. However, running
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() UNWIND n+r AS x RETURN DISTINCT x
gives a "Type mismatch: expected List but was Relationship (line 1, column 47)" error. And this query
MATCH (n:TEST) RETURN DISTINCT n UNION MATCH ()-[n]->() RETURN DISTINCT n
Puts nodes and relationships in the same column, but the context from the first match is lost in the second half.
So how can I return all matched nodes and relationships as one minimal list?
UPDATE:
This is the final modified version of the answer query I am using
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {properties:properties(r), id:id(r), type:type(r), startNode:id(startNode(r)), endNode:id(endNode(r))})} as n
There are a couple ways to handle this, depending on if you want to hold these within lists, or within maps, or if you want a map projection of a node to include its relationships.
If you're using Neo4j 3.1 or newer, then map projection is probably the easiest approach. Using this, we can output the properties of a node and include its relationships as a collected property:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r)} as n
Here's what you might do if you wanted each row to be its own pairing of a node and a single one of its relationships as a list:
...
RETURN [n, r] as pair
And as a map:
...
RETURN {node:n, rel:r} as pair
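For the single-column form the question asks for, one possible sketch (not part of the original answer) is to wrap the node and the relationship in a list literal before unwinding, which avoids the "expected List" type error raised by n+r. Because of the OPTIONAL MATCH, r can be null, so nulls are filtered out after de-duplication.

```cypher
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
// A list literal may mix nodes and relationships; unwind it into one column
UNWIND [n, r] AS x
WITH DISTINCT x
WHERE x IS NOT NULL
RETURN x
```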
EDIT
As far as returning more data from each relationship, if you check the Code results tab, you'll see that the id, relationship type, and start and end node ids are included, and accessible from your back-end code.
However, if you want to explicitly return this data, then we just need to include it in the query, using another map projection for each relationship:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {.*, id:id(r), type:type(r), startNode:startNode(r), endNode:endNode(r)})} as n

Neo4j load data relations with unknown labels

I have 4 Labels (A, B, C, D). All of them have a single Property {id}.
Now I have a file with relations which I would like to load. Every row has this structure:
{id_1}, {type_of_relations}, {id_2}
How can I create the relations?
My non-working guess is:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/data.csv" AS line
FIELDTERMINATOR ','
MATCH (a:A{id:line.id_1} OR a:B{id:line.id_1} OR a:C{id:line.id_1} OR a:D{id:line.id_1})
MATCH (b:A{id:line.id_2} OR b:B{id:line.id_2} OR b:C{id:line.id_2} OR b:D{id:line.id_2})
MERGE (a)-[:line.type_of_relations]->(b)
You cannot parameterize the relationship type in Cypher.
However, you can do this using the apoc.create.relationship procedure in Neo4j apoc procedures:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///file.csv" AS row
MATCH (a) WHERE a.id = row.id_1
MATCH (b) WHERE b.id = row.id_2
CALL apoc.create.relationship(a, row.type_of_relations, {}, b) YIELD rel
RETURN count(*) AS num
The procedure takes a parameter for the relationship type which allows for creating dynamic relationship types.
I don't think you can do that. For a number of reasons.
create (f:bar {name:'NewUserA'})
create (f:foo {name:'NewUserA'})
match (f:foo {name:'NewUserA'} or f:bar {name:'NewUserA'}) return f;
Code
Invalid input 'o': expected whitespace, comment, ')' or a relationship pattern (line 1, column 32 (offset: 31))
"match (f:foo {name:'NewUserA'} or f:bar {name:'NewUserA'}) return f".
So there is a problem on the match at any rate.
If the id is globally unique, then you can ignore the label and just match on the id. That will take care of your 'or' problem.
match (f) where f.name='NewUserA' match (t) where t.name='NewUserA' return f,t
would give you the nodes.
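If the match should still be restricted to the four labels from the question, label checks can be combined with OR in a WHERE clause rather than inside the node pattern. This is a minimal sketch; the id value 'some-id' is a placeholder. Note that a match without a fixed label generally cannot use a label-based index, so it may scan all nodes.

```cypher
// Label predicates are allowed in WHERE, unlike OR inside the pattern itself
MATCH (a)
WHERE (a:A OR a:B OR a:C OR a:D) AND a.id = 'some-id'
RETURN a
```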
That being said, when coding parameterized queries RELATIONSHIP_TYPE is one of the items you cannot parameterize. From the docs:
5.5. Parameters
[..]
Parameters can not be used for property names, relationship types and labels, since these patterns are part of the query structure that is compiled into a query plan.
[..]
So you may need to look to ways of building your MERGE as a string somewhere else (awk is your friend) and then running that in the shell.

neo4j cypher, iterate over the result

I have a neo4j database and I would like to use the result of a part of the cypher code (a set of node ids) to use in the second part:
Something like:
MATCH ()-[:KNOWS]->(b)
FOREACH (n IN distinct(id(b))| SET n :label)
In a pure cypher code, is there a way to loop over the result "distinct(id(b))" and apply to each element another query?
Two problems with the original query:
You have to have a collection to use FOREACH.
You bind n to a node id, and you can't set labels on node ids, only on nodes.
You can use FOREACH to set labels by doing
MATCH ()-[:KNOWS]->(b)
WITH collect (distinct b) as bb
FOREACH (b IN bb | SET b:MyLabel)
In this case you don't need to do it as a collection, you can just do
MATCH ()-[:KNOWS]->(b)
WITH distinct b
SET b:MyLabel
And in general, you can pipe results to an additional query part with WITH.
I obtained the needed result with:
MATCH ()-[:KNOWS]->(b)
WITH DISTINCT (b)
RETURN id(b)

Resources