cypher - add/remove properties across all nodes of same label - neo4j

With following example data:
node1:Person {id: 1, name: 'NameOne'}
node2:Person {id: 2, name: 'NameTwo', age: 42}
question is: if it is possible to standardize properties across all nodes of label Person to the list ['id','name','age','lastname'] so that missing properties are added to the nodes with default empty value and using cypher only?
I have tied using apoc.map.merge({first},{second}) yield value procedure as following:
match (p:Person)
call apoc.map.merge(proeprties(p),{id:'',name:'',age:'',lastname:''}) yield value
return value
however I got this error:
There is no procedure with the name apoc.map.merge registered for
this database instance. Please ensure you've spelled the procedure
name correctly and that the procedure is properly deployed.
although I can confirm I have apoc in place
bash-4.3# ls -al /var/lib/neo4j/plugins/apoc-3.1.0.3-all.jar
-rw-r--r-- 1 root root 1319762 Dec 14 02:19 /var/lib/neo4j/plugins/apoc-3.1.0.3-all
and it is shown in apoc.help
neo4j-sh (?)$ call apoc.help("map.merge");
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| type | name | text | signature | roles | writes |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "function" | "apoc.map.merge" | "apoc.map.merge(first,second) - merges two maps" | "apoc.map.merge(first :: MAP?, second :: MAP?) :: (MAP?)" | <null> | <null> |
| "function" | "apoc.map.mergeList" | "apoc.map.mergeList([{maps}]) yield value - merges all maps in the list into one" | "apoc.map.mergeList(maps :: LIST? OF MAP?) :: (MAP?)" | <null> | <null> |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows
47 ms

Note that these are functions now, so you don't need to call them using CALL or YIELD like procedures. This should work:
match (p:Person)
RETURN apoc.map.merge(properties(p),{id:'',name:'',age:'',lastname:''})
Keep in mind that this query will only affect what is returned, since you haven't used SET to update the node properties.
You can use the += operator to update a node's properties instead of using apoc.map.merge:
match (p:Person)
set p += {id:'',name:'',age:'',lastname:''}
Keep in mind that both this and apoc.map.merge will replace existing values, so you'll be blanking out id, name, age, and lastname for all persons.
At this time I don't believe there is functionality in Neo4j or APOC to merge in properties while keeping existing properties instead of replacing. That said, there are some workarounds you might use.
COALESCE() is a useful function for this, as it allows you to supply defaults to use in case a value is null.
For example, you might use this to update the properties for all :Persons, using the supplied empty string as a default if the properties are null:
match (p:Person)
with {id:COALESCE(p.id, ''), name:COALESCE(p.name, ''), age:COALESCE(p.age, ''),
lastname:COALESCE(p.lastname, '')} as newProps
set p += newProps

Related

Cypher - loading results while optionally adding count of relationships

I'm using a graph database to establish a relationship between folders, their children and users (be it owners or sharers of the folder).
Here is an example of my structure. Where orange are folders and blue are users. -
What I want my query to achieve: It should return direct children of the folder under query, and while doing so determine if the child folder being returned is being shared.
My query
MATCH (:Folder { name: 'Nick Hamill' })-[:CHILD]->(children:Folder)
WITH children
OPTIONAL MATCH path = (children)<-[*]-(:User)
UNWIND RELATIONSHIPS(path) AS r WITH children, r
WHERE TYPE(r) = 'SHARES'
RETURN children AS model, COUNT(r) > 0 AS shared
So the query works brilliantly (perhaps a little optimisation needed?) when there is a related user (see below), however, the query fails to return any result if there is no user relationship. I personally can't see why this is because it's an optional match, and surely the count could just return empty?
╒══════════════════════════════════════════════════════════════════════╤════════╕
│"model" │"shared"│
╞══════════════════════════════════════════════════════════════════════╪════════╡
│{"name":"Dr. Denis Abshire","created_at":"2019-10-11 13:54:58","id":"c│true │
│f5e084f-d963-35d3-9c6f-fe29b86f6d43","updated_at":"2019-10-11 13:54:58│ │
│"} │ │
└──────────────────────────────────────────────────────────────────────┴────────┘
The query should be relatively self-explanatory but for the sake of clarity here's some expected outputs -
| Query Folder | Returned Folder | Shared? |
|----------------------|------------------|---------|
| Miss Dessie Oritz II | Nick Hamill | TRUE |
| Nick Hamill | Dr Denis Abshire | TRUE |
| Samara Russell | Shemar Huels PhD | FALSE |
| Shemar Huels PhD | Hazle Ward | FALSE |
I'm running neo4j 3.5.11 community edition. I feel like this should be a fairly easy solution, I'm just meeting the limits of my extremely limited cypher knowledge.
Appreciate any help!
I don't undertsand why you are using this in your query :
OPTIONAL MATCH path = (children)<-[*]-(:User)
UNWIND RELATIONSHIPS(path) AS r WITH children, r
WHERE TYPE(r) = 'SHARES'
With (children)<-[*]-(:User) you are searching all the path (without restriction on its size) between the children & User nodes.
And with the WHERE TYPE(r) = 'SHARES' you only want the SHARES relationship ...
So your query will work on this kind of pattern : (children)<-[:CHILD]-(:Folder)<-[:CHILD]-(:Folder)<-[:SHARES]-(:User)
Is it what you want ?
If so, can you try this query :
MATCH (:Folder { name: 'Nick Hamill' })-[:CHILD]->(children:Folder)
RETURN children AS model, size((children)<-[:CHILD*0..]-(:Folder)<-[:SHARES]-(:User)) > 0 AS shared

In neo4j, a query to count the number of distinct structures

In neo4j my database consists of chains of nodes. For each distinct stucture/layout (does graph theory has a better word?), I want to count the number of chains. For example, the database consists of 9 nodes and 5 relationships as this:
(:a)->(:b)
(:b)->(:a)
(:a)->(:b)
(:a)->(:b)->(:b)
where (:a) is a node with label a. Properties on nodes and relationships are irrelevant.
The result of the counting should be:
------------------------
| Structure | n |
------------------------
| (:a)->(:b) | 2 |
| (:b)->(:a) | 1 |
| (:a)->(:b)->(:b) | 1 |
------------------------
Is there a query that can achieve this?
Appendix
Query to create test data:
create (:a)-[:r]->(:b), (:b)-[:r]->(:a), (:a)-[:r]->(:b), (:a)-[:r]->(:b)-[:r]->(:b)
EDIT:
Thanks for the clarification.
We can get the equivalent of what you want, a capture of the path pattern using the labels present:
MATCH path = (start)-[*]->(end)
WHERE NOT ()-->(start) and NOT (end)-->()
RETURN [node in nodes(path) | labels(node)[0]] as structure, count(path) as n
This will give you a list of the labels of the nodes (the first label present for each...remember that nodes can be multi-labeled, which may throw off your results).
As for getting it into that exact format in your example, that's a different thing. We could do this with some text functions in APOC Procedures, specifically apoc.text.join().
We would need to first add formatting around the extraction of the first label to add the prefixed : as well as the parenthesis. Then we could use apoc.text.join() to get a string where the nodes are joined by your desired '->' symbol:
MATCH path = (start)-[*]->(end)
WHERE NOT ()-->(start) and NOT (end)-->()
WITH [node in nodes(path) | labels(node)[0]] as structure, count(path) as n
RETURN apoc.text.join([label in structure | '(:' + label + ')'], '->') as structure, n

Is it possible to match node by id, that lies in the properties of other node or relationship?

I've got a problem with simple matching.
For example,
I have some node
start startNode = node(0)
It has a relationship with another one. One of the relationship's properties is idOfThirdNode with id(thirdNode).
I found out that start point = node( ) get only digits as arguments and any toInt(rel.idOfThirdNode) is not available at all, as other match(point:_Node) where id(point) = rel.idOfThirdNode
Find node by property is not a problem. But it isn't possible to set new duplicate id-property.
Have this problem any decision or only saving this property in model and begining of new matching with this property like id?
Edit:
Earlier I have had in result of such action:
start startNode = node({0})
optional match startNode-[r:REL]-(relNode: _Node)
return distinct startNode, id(r) as linkId, id(relNode) as nodeId,
r.idOfthirdNode as point
beautiful table with nulls in some fields
______________________________________
| StartNode| linkId | nodeId | point |
--------------------------------------
| startNode| 1 | 2 | null |
| info | | | |
-------------------------------------
| startNode| 3 | 4 | 5 |
| info | | | |
But now this "where" make disabled all null matching
start startNode = node({0})
optional match startNode-[r:REL]-(relNode: _Node), (pointNode:_Node)
where id(pointNode) = r.idOfthirdNode
return distinct startNode, id(r) as linkId, id(relNode) as nodeId,
collect({pointNode.name, id:id(pointNode)}) as point
and I get only second line.
You should be able to do something like this:
MATCH (point:_Node), (node:Label)
WHERE ID(point) = node.idOfThirdNode
RETURN *
But I've never actually seen that done because relationships are so much better than foreign keys
This should work for you:
START startNode = node(0)
MATCH (startNode)-[rel]->(secondNode), (thirdNode:_Node)
WHERE ID(thirdNode) = rel.idOfThirdNode
RETURN startNode, secondNode, thirdNode

Express abscence of edges in Cypher

how do I express the following in Cypher
"Return all nodes with at least one incoming edge of type A and no outgoing edges".
Best Regards
You can use a pattern to exclude nodes from the result subset like this:
MATCH ()-[:A]->(n) WHERE NOT (n)-->() RETURN n
Try
MATCH (n)
WHERE ()-[:A]->n AND NOT n-->()
RETURN n
or
MATCH ()-[:A]->(n)
WHERE NOT n-->()
RETURN DISTINCT n
Edit
Pattern expressions can be used both for pattern matching and as predicates for filtering. If used in the MATCH clause, the paths that answer the pattern are included in the result. If used for filtering, in the WHERE clause, the pattern serves as a limiting condition on the paths that have previously been matched. The result is limited, not extended to include the filter condition. When a pattern is used as a predicate for filtering, the negation of that predicate is also a predicate that can be used as a filter condition. No path answers to the negation of a pattern (if there is such a thing) so negations of patterns cannot be used in the MATCH clause. The phrase
Return all nodes with at least one incoming edge of type A and no outgoing edges
involves two patterns on nodes n, namely any incoming relationship [:A] on n and any outgoing relationship on n. The second must be interpreted as a pattern for a predicate filter condition since it involves a negation, not any outgoing relationship on n. The first, however, can be interpreted either as a pattern to match along with n, or as another pattern predicate filter condition.
These two interpretations give rise to the two cypher queries above. The first query matches all nodes and uses both patterns to filter the result. The second matches the incoming relationship on n along with n and uses the second pattern to filter the results.
The first query will match every node only once before the filtering happens. It will therefore return one result item per node that meets the criteria. The second query will match the pattern any incoming relationship [:A] on n once for each path, i.e. once for each incoming relationship on n. It may therefore contain a node multiple times in the result, hence the DISTINCT keyword to remove doubles.
If the items of interest are precisely the nodes, then using both patterns for predicates in the WHERE clause seems to me the correct interpretation. It is also more efficient since it needs to find only zero or one incoming [:A] on n to resolve the predicate. If the incoming relationships are also of interest, then some version of the second query is the right choice. One would need to bind the relationship and do something useful with it, such as return it.
Below are the execution plans for the two queries executed on a 'fresh' neo4j console.
First query:
----
Filter
|
+AllNodes
+----------+------+--------+-------------+------------------------------------------------------------------------------------------------------------------------------+
| Operator | Rows | DbHits | Identifiers | Other |
+----------+------+--------+-------------+------------------------------------------------------------------------------------------------------------------------------+
| Filter | 0 | 0 | | (nonEmpty(PathExpression((17)-[ UNNAMED18:A]->(n), true)) AND NOT(nonEmpty(PathExpression((n)-[ UNNAMED36]->(40), true)))) |
| AllNodes | 6 | 7 | n, n | |
+----------+------+--------+-------------+------------------------------------------------------------------------------------------------------------------------------+
Second query:
----
Distinct
|
+Filter
|
+TraversalMatcher
+------------------+------+--------+-------------+--------------------------------------------------------------+
| Operator | Rows | DbHits | Identifiers | Other |
+------------------+------+--------+-------------+--------------------------------------------------------------+
| Distinct | 0 | 0 | | |
| Filter | 0 | 0 | | NOT(nonEmpty(PathExpression((n)-[ UNNAMED30]->(34), true))) |
| TraversalMatcher | 0 | 13 | | n, UNNAMED8, n |
+------------------+------+--------+-------------+--------------------------------------------------------------+

Cypher query returns duplicated results

I have the following graph setup:
start root=node(0)
create (F {name:'FRAME'}), (I {name: 'INTERACTION'}), (A {name: 'A'}), (B {name: 'B'}),
root-[:ROOT]->F, F-[:FRAME_INTERACTION]->I, I-[:INTERACTION_ACTOR]->A, I-[:INTERACTION_ACTOR]->B
And the following query returns duplicated results:
START actor=node:node_auto_index(name='A')
MATCH actor<-[:INTERACTION_ACTOR]-interaction-[:INTERACTION_ACTOR]->actor2,
frame-[:FRAME_INTERACTION]->interaction
RETURN frame, interaction
Query Results
+-----------------------------------------------------+
| frame | interaction |
+-----------------------------------------------------+
| Node[1]{name:"FRAME"} | Node[2]{name:"INTERACTION"} |
| Node[1]{name:"FRAME"} | Node[2]{name:"INTERACTION"} |
+-----------------------------------------------------+
2 rows
52 ms
Even if I add one more start node trying to limit the results, I have the same:
START actor=node:node_auto_index(name='A'), frame=node:node_auto_index(name='FRAME')
MATCH actor<-[:INTERACTION_ACTOR]-interaction-[:INTERACTION_ACTOR]->actor2,
frame-[:FRAME_INTERACTION]->interaction
RETURN frame, interaction
I would like to understand why the query returns duplicated results.
I know that it is possible to return unique results by using distinct, but is it possible to change the query in order to return only one result by matching path, without applying an additional operation (distinct)?
(setup and query can be tested at http://console.neo4j.org/?id=q2e0ay)
If you add actor2 to your return list you'll see what the problem is:
frame interaction actor actor2
(7 {name:"FRAME"}) (8 {name:"INTERACTION"}) (9 {name:"A"}) (9 {name:"A"})
(7 {name:"FRAME"}) (8 {name:"INTERACTION"}) (9 {name:"A"}) (10 {name:"B"})
Actor "A" is being included as a value for actor2! But this makes sense when you think about it, because nowhere in your query did you tell neo4j that actor and actor2 need to be distinct entities.
Luckily it's easy to do:
START actor=node:node_auto_index(name='A')
MATCH actor<-[:INTERACTION_ACTOR]-interaction-[:INTERACTION_ACTOR]->actor2,
frame-[:FRAME_INTERACTION]->interaction
WHERE actor <> actor2 //like this!
RETURN frame, interaction

Resources