I'm working on a project modeling product use patterns, and I'm having trouble identifying the best way to make an exact match to a single pattern.
In the model I have several "product_pattern" nodes acting as the center or hub to several nodes representing various products. Each product node is unique and can be connected to any product_pattern. A series of product pattern nodes may look like:
--pattern_1
--- Product A
--- Product B
--- Product C
--pattern_2
--- Product A
--- Product B
--- Product C
--- Product D
--pattern_3
--- Product B
--- Product C
I would like to query the graph for product_patterns that use products B and C and ONLY B and C. If I just used:
Start b = node(16), c = node(37)
MATCH (b)<-[:PRODUCT_USED]-(n)-[:PRODUCT_USED]->(c)
RETURN n
I would get all product_patterns back, because they all have relationships to B and C. To remove matches that have relationships beyond the ones I'm querying for, I foresee two strategies:
Create a property on each pattern node, product_pattern.num_products, to use in a WHERE clause after the initial MATCH. In this case num_products would have to equal 2 for nodes with relationships ONLY to B and C. My concern is that I have to inspect the properties of every returned node, and popular products will make the candidate list much larger.
Create a WHERE NOT clause for every other product in the graph that I don't want a relationship to. This is not ideal and most likely traverses the entire graph.
Are there any elegant ways to confirm that your query returns exactly the relationship match you ask for and not nodes that match your query but also have additional relationships?
Can you try this:
Start b = node(16), c = node(37)
MATCH (b)<-[:PRODUCT_USED]-(n)-[:PRODUCT_USED]->(c)
WHERE length((n)-[:PRODUCT_USED]->()) = 2
RETURN n
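The exact-match idea can also be checked outside the database with a minimal Python sketch. The dictionary below is a hypothetical in-memory stand-in for the three patterns in the question (not Neo4j data), and `exact_matches` is an illustrative helper name. A pattern qualifies only if its set of products equals the requested set, which is what requiring exactly two PRODUCT_USED relationships on top of the B and C match amounts to.

```python
# Hypothetical in-memory copy of the example patterns from the question.
patterns = {
    "pattern_1": {"A", "B", "C"},
    "pattern_2": {"A", "B", "C", "D"},
    "pattern_3": {"B", "C"},
}


def exact_matches(patterns, wanted):
    """Return the names of patterns whose products are exactly `wanted`."""
    wanted = set(wanted)
    # Set equality rejects patterns with extra products as well as missing ones.
    return sorted(name for name, products in patterns.items() if products == wanted)


print(exact_matches(patterns, ["B", "C"]))
```

Only pattern_3 survives here, because pattern_1 and pattern_2 contain B and C plus additional products.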
Consider a query similar to this:
MATCH p=(b:label{ID:"M04"})-[r:Edge*2..2]-(d:label{ID:"S02"})
RETURN p LIMIT 10
Let me call the intermediate node c. All relationships from b to c and from c to d have the same type, Edge, and carry an EdgeID property. Between any two adjacent nodes there are several parallel Edge relationships, each with a different EdgeID value, and mostly the same EdgeID values recur on the relationships to the next node.
For example, the graph looks like this:
(b)-[:Edge{EdgeID:1}]->(c)-[:Edge{EdgeID:1}]->(d)
(b)-[:Edge{EdgeID:2}]->(c)-[:Edge{EdgeID:2}]->(d)
(b)-[:Edge{EdgeID:3}]->(c)-[:Edge{EdgeID:3}]->(d)
....
The query returns many different relationships from b to c, but always the same single relationship from c to d:
(b)-[:Edge{EdgeID:1}]->(c)-[:Edge{EdgeID:1}]->(d)
(b)-[:Edge{EdgeID:2}]->(c)-[:Edge{EdgeID:1}]->(d)
(b)-[:Edge{EdgeID:3}]->(c)-[:Edge{EdgeID:1}]->(d)
....
I want to return the paths with the relations having the same EdgeID. So for example with LIMIT 1 I want to return only one among the above rows, for example
(b)-[:Edge{EdgeID:123123}]->(c)-[:Edge{EdgeID:123123}]->(d)
(not necessarily that ID)
With LIMIT 2 I want to return two, for example:
(b)-[:Edge{EdgeID:123123}]->(c)-[:Edge{EdgeID:123123}]->(d)
(b)-[:Edge{EdgeID:872346}]->(c)-[:Edge{EdgeID:872346}]->(d)
How can I do that?
You should be able to add the condition that the relationships in the path have the same property value:
MATCH p=(b:label{ID:"M04"})-[:Edge*2]-(d:label{ID:"S02"})
WHERE relationships(p)[0].EdgeID = relationships(p)[1].EdgeID
RETURN p LIMIT 10
And if you need this kind of restriction to be in place for arbitrary length paths, then you can do:
MATCH p=(b:label{ID:"M04"})-[:Edge*6]-(d:label{ID:"S02"})
WITH p, relationships(p)[0].EdgeID as edgeID
WHERE all(rel in tail(relationships(p)) WHERE rel.EdgeID = edgeID)
RETURN p LIMIT 10
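To make the filtering logic concrete, here is a minimal Python sketch, assuming the parallel relationships are reduced to plain lists of EdgeID values. The lists and the helper names `same_id_paths` and `all_same_id` are hypothetical and only mirror the two WHERE conditions above.

```python
from itertools import product

# Hypothetical EdgeID values on the parallel relationships b->c and c->d.
edges_b_c = [1, 2, 3]
edges_c_d = [1, 2, 3]


def same_id_paths(first_hop, second_hop):
    """Two-hop case: keep only combinations whose EdgeIDs are equal."""
    return [(e1, e2) for e1, e2 in product(first_hop, second_hop) if e1 == e2]


def all_same_id(path_edge_ids):
    """Arbitrary-length case: every EdgeID must equal the first one."""
    first = path_edge_ids[0]
    return all(edge_id == first for edge_id in path_edge_ids[1:])


print(same_id_paths(edges_b_c, edges_c_d))
print(all_same_id([5, 5, 5]), all_same_id([5, 5, 6]))
```

Without the equality check, the unfiltered product would contain all nine combinations, which is the mixed output described in the question.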
I am working with bill of materials (BOM) and part data in a Neo4j database.
There are 3 types of nodes in my graph:
(ItemUsageInstance) these are the elements of the bill of materials tree
(Item) one exists for each unique item on the BOM tree
(Material)
The relationships are:
(ItemUsageInstance)-[CHILD_OF]->(ItemUsageInstance)
(ItemUsageInstance)-[INSTANCE_OF]->(Item)
(Item)-[MADE_FROM]->(Material)
The schema is pictured below:
Here is a simplified picture of the data. (Diagram with nodes repositioned to enhance visibility):
What I would like to do is find subtrees of adjacent ItemUsageInstances whose Items are all made from the same Material.
The query I have so far is:
MATCH (m:Material)
WITH m AS m
MATCH (m)<-[:MADE_FROM]-(i1:Item)<-[]-(iui1:ItemUsageInstance)-[:CHILD_OF]->(iui2:ItemUsageInstance)-[]->(i2:Item)-[:MADE_FROM]->(m) RETURN iui1, i1, iui2, i2, m
However, this only returns one such subtree, the adjacent nodes in the middle of the graph that have a common Material of "M0002". Also, the rows of the results are separate entries, one for each parent-child pair in the subtree:
╒══════════════════════════╤══════════════════════╤══════════════════════════╤══════════════════════╤═══════════════════════╕
│"iui1" │"i1" │"iui2" │"i2" │"m" │
╞══════════════════════════╪══════════════════════╪══════════════════════════╪══════════════════════╪═══════════════════════╡
│{"instance_id":"inst5002"}│{"part_number":"p003"}│{"instance_id":"inst7003"}│{"part_number":"p004"}│{"material_id":"M0002"}│
├──────────────────────────┼──────────────────────┼──────────────────────────┼──────────────────────┼───────────────────────┤
│{"instance_id":"inst7002"}│{"part_number":"p003"}│{"instance_id":"inst7003"}│{"part_number":"p004"}│{"material_id":"M0002"}│
├──────────────────────────┼──────────────────────┼──────────────────────────┼──────────────────────┼───────────────────────┤
│{"instance_id":"inst7001"}│{"part_number":"p002"}│{"instance_id":"inst7002"}│{"part_number":"p003"}│{"material_id":"M0002"}│
└──────────────────────────┴──────────────────────┴──────────────────────────┴──────────────────────┴───────────────────────┘
I was expecting a second subtree, which happens to also be a linked list, to be included. This second subtree consists of ItemUsageInstances inst7006, inst7007, inst7008 at the far right of the graph. For what it's worth, not only are these adjacent instances made from the same Material, they are all instances of the same Item.
I confirmed that every ItemUsageInstance node has an [INSTANCE_OF] relationship to an Item node:
MATCH (iui:ItemUsageInstance) WHERE NOT (iui)-[:INSTANCE_OF]->(:Item) RETURN iui
(returns 0 records).
Also confirmed that every Item node has a [MADE_FROM] relationship to a Material node:
MATCH (i:Item) WHERE NOT (i)-[:MADE_FROM]->(:Material) RETURN i
(returns 0 records).
Confirmed that inst7008 is the only ItemUsageInstance without an outgoing [CHILD_OF] relationship.
MATCH (iui:ItemUsageInstance) WHERE NOT (iui)-[:CHILD_OF]->(:ItemUsageInstance) RETURN iui
(returns 1 record: {"instance_id":"inst7008"})
inst5000 and inst7001 are the only ItemUsageInstances without an incoming [CHILD_OF] relationship
MATCH (iui:ItemUsageInstance) WHERE NOT (iui)<-[:CHILD_OF]-(:ItemUsageInstance) RETURN iui
(returns 2 records: {"instance_id":"inst7001"} and {"instance_id":"inst5000"})
I'd like to collect/aggregate the results so that each row is a subtree. I saw this example of how to collect() and got the array method to work. But it still has duplicate ItemUsageInstances in it. (The "map of items" discussed there failed completely...)
Any insights as to why my query is only finding one subtree of adjacent item usage instances with the same material?
What is the best way to aggregate the results by subtree?
Finding the roots is easy: MATCH (root:ItemUsageInstance) WHERE NOT ()<-[:CHILD_OF]-(root)
And for the children, you can include the root by specifying a min distance of 0 (default is 1).
MATCH p=(root)<-[:CHILD_OF*0..25]-(ins), (m:Material)<-[:MADE_FROM]-(:Item)<-[:INSTANCE_OF]-(ins)
And then assuming only one item-material per instance, aggregate everything based on material (You can't aggregate in an aggregate, so use WITH to get the depth before collecting the depth with the node)
WITH ins, SIZE(NODES(p)) as depth, m RETURN COLLECT({node:ins, depth:depth}) as instances, m as material
So, all together
MATCH (root:ItemUsageInstance),
p=(root)<-[:CHILD_OF*0..25]-(ins),
(m:Material)<-[:MADE_FROM]-(:Item)<-[:INSTANCE_OF]-(ins)
WHERE NOT ()<-[:CHILD_OF]-(root)
AND NOT (m:Material)<-[:MADE_FROM]-(:Item)<-[:INSTANCE_OF]-()<-[:CHILD_OF]-(ins)
MATCH p2=(ins)<-[:CHILD_OF*1..25]-(cins)
WHERE ALL(n in NODES(p2) WHERE (m)<-[:MADE_FROM]-(:Item)<-[:INSTANCE_OF]-(n))
WITH ins, cins, SIZE(NODES(p2)) as depth, m ORDER BY depth ASC
RETURN ins as collection_head, ins+COLLECT(cins) as instances, m as material
In your pattern, you don't account for situations like the link between inst_5001 and inst_7001. Inst_5001 doesn't have any links to any part usages, but your match pattern requires that both usages have such a link. I think this is where you're going off track. You find the inst_5002 tree because it happens to have a link to a usage, as your pattern requires.
In terms of "aggregating by subtree", I would return the ID of the root of the tree (e.g. id(iui1)) and then count(*) the rest, to show how many subtrees a given root participates in.
Here is my heavily edited query:
MATCH path = (cinst:ItemUsageInstance)-[:CHILD_OF*1..]->(pinst:ItemUsageInstance), (m:Material)<-[:MADE_FROM]-(:Item)<-[:INSTANCE_OF]-(pinst)
WHERE ID(cinst) <> ID(pinst) AND ALL (x in nodes(path) WHERE ((x)-[:INSTANCE_OF]->(:Item)-[:MADE_FROM]->(m)))
WITH nodes(path) as insts, m
UNWIND insts AS instance
WITH DISTINCT instance, m
RETURN collect(instance), m
It returns what I was expecting:
╒═════════════════════════════════════════════════════════════════════════════════════════════════════════════╤═══════════════════════╕
│"collect(instance)" │"m" │
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════╪═══════════════════════╡
│[{"instance_id":"inst7002"},{"instance_id":"inst7003"},{"instance_id":"inst7001"},{"instance_id":"inst5002"}]│{"material_id":"M0002"}│
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────────────┤
│[{"instance_id":"inst7007"},{"instance_id":"inst7008"},{"instance_id":"inst7006"}] │{"material_id":"M0001"}│
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────────────┘
The one limitation is that it does not distinguish the root of the subtree from the children. Ideally the list of {"instance_id"} would be sorted by depth in the tree.
I have in my graph places and persons as labels, and a relationship "knows_the_place". Like:
(person)-[knows_the_place]->(place)
A person usually knows multiple places.
Now I want to find persons with a "strong" relationship via places, meaning they have many places in common. For example, I want to query all pairs of persons that share at least 3 different places, with something like this (not working!) query:
MATCH
(a:person)-[:knows_the_place]->(x:place)<-[:knows_the_place]-(b:person),
(a:person)-[:knows_the_place]->(y:place)<-[:knows_the_place]-(b:person),
(a:person)-[:knows_the_place]->(z:place)<-[:knows_the_place]-(b:person)
WHERE NOT x=y and y=z
RETURN a, b
How can I do this with a Neo4j query?
Bonus question:
Instead of only showing the persons who have x places in common with another person, it would be even better to get an ordered list like:
a shares 7 places with b
c shares 5 places with b
d shares 2 places with e
f shares 1 places with a
...
Thanks for your help!
Here you go:
MATCH (a:person)-[:knows_the_place]->(x:place)<-[:knows_the_place]-(b:person)
WITH a, b, count(x) AS count
WHERE count >= 3
RETURN a, b, count
To order:
MATCH (a:person)-[:knows_the_place]->(x:place)<-[:knows_the_place]-(b:person)
RETURN a, b, count(x) AS count
ORDER BY count(x) DESC
You can also do both by adding an ORDER BY to the end of the first query.
Keep in mind that this query is a cartesian product of a and b so it will examine every combination of person nodes, which may be not great performance-wise if you have a lot of person nodes. Neo4j 2.3 should warn you about these sorts of queries.
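The counting and ordering can be illustrated with a minimal Python sketch, assuming a hypothetical `knows` mapping from person to the set of places they know (made-up data, not taken from the question). One difference from the queries above: the sketch counts each unordered pair once, whereas the MATCH reports every pair in both orders.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: which places each person knows.
knows = {
    "a": {"p1", "p2", "p3", "p4"},
    "b": {"p1", "p2", "p3"},
    "c": {"p1", "p9"},
}


def shared_place_counts(knows):
    """Count the common places for each unordered pair of persons."""
    counts = Counter()
    for x, y in combinations(sorted(knows), 2):
        common = len(knows[x] & knows[y])
        if common:  # pairs with nothing in common would not match at all
            counts[(x, y)] = common
    return counts


counts = shared_place_counts(knows)
# Highest counts first, keeping only pairs with at least 3 shared places.
strong = [(pair, n) for pair, n in counts.most_common() if n >= 3]
print(strong)
```

Comparing every pair of persons is also where the cost comes from, which matches the performance caveat above.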
What I'm looking for
With variable length relationships (see here in the neo4j manual), it is possible to have a variable number of relationships of a certain type between two nodes.
# Cypher
match (g1:Group)-[:sub_group*]->(g2:Group) return g1, g2
I'm looking for the same thing with nodes, i.e. a way to query for two nodes with a variable number of nodes in between, but with a label condition on the nodes rather than the relationships:
# Looking for something like this in Cypher:
match (g1:Group)-->(:Group*)-->(g2:Group) return g1, g2
Example
I would use this mechanism, for example, to find all (direct or indirect) members of a group within a group structure.
# Looking for something like this in Cypher:
match (group:Group)-->(:Group*)-->(member:User) return member
Take, for example, this structure:
group1:Group
|-------> group2:Group -------> user1:User
|-------> group3:Group
|--------> page1:Page -----> group4:Group -----> user2:User
In this example, user1 is a member of group1 and group2, but user2 is only member of group4, not member of the other groups, because a non-Group labeled node is in between.
Abstraction
A more abstract pattern would be a kind of repeat operator |...|* in Cypher:
# Looking for repeat operator in Cypher:
match (g1:Group)|-[:is_subgroup_of]->(:Group)|*-[:is_member_of]->(member:User)
return member
Does anyone know of such a repeat operator? Thanks!
Possible Solution
One solution I've found is to put a condition on the nodes using where, but I hope there is a better (and shorter) solution out there!
# Cypher
match path = (member:User)<-[*]-(g:Group{id:1})
where all(node in tail(nodes(path)) where ('Group' in labels(node)))
return member
Explanation
In the above query, all(node in tail(nodes(path)) where ('Group' in labels(node))) is one single where condition, which consists of the following key parts:
all: ALL(x in coll where pred): TRUE if pred is TRUE for all values in coll
nodes(path): NODES(path): Returns the nodes in path
tail(): TAIL(coll): coll except first element. I'm using this because the first node is a User, not a Group.
Reference
See Cypher Cheat Sheet.
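The intended semantics (follow a path only while the intermediate nodes are Groups) can be sketched in plain Python. This is a minimal illustration, not Neo4j code: the `labels` and `children` dictionaries are a hypothetical copy of the example structure, the `members` helper is an illustrative name, and attaching page1 to group3 is an assumption, since the diagram does not make its parent clear.

```python
from collections import deque

# Hypothetical copy of the example structure; edges point from container to content.
labels = {
    "group1": "Group", "group2": "Group", "group3": "Group", "group4": "Group",
    "page1": "Page", "user1": "User", "user2": "User",
}
children = {
    "group1": ["group2", "group3"],
    "group2": ["user1"],
    "group3": ["page1"],  # assumption: page1 hangs off group3
    "page1": ["group4"],
    "group4": ["user2"],
}


def members(start):
    """Users reachable from `start` through Group nodes only."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            if labels[child] == "User":
                found.add(child)
            elif labels[child] == "Group":
                queue.append(child)  # keep walking only through groups
            # any other label (e.g. Page) blocks the path
    return found


print(members("group1"))
```

Starting from group1 this finds user1 but not user2, because the only route to user2 passes through page1.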
How about this:
MATCH (:Group {id:1})<-[:IS_SUBGROUP_OF|:IS_MEMBER_OF*]-(u:User)
RETURN DISTINCT u
This will:
find all subtrees of the group with ID 1
only traverse the relationships IS_SUBGROUP_OF and IS_MEMBER_OF in incoming direction (meaning sub-groups or users that belong to the group with ID 1 or one of its sub-groups)
only return nodes which have an IS_MEMBER_OF relationship to a group in the subtree
and discard duplicate results (users who belong to more than one of the groups in the tree would otherwise appear multiple times)
I know this relies on relationships types rather than node labels, but IMHO this is a more graphy approach.
Let me know if this would work or not.
For example:
a-[r]->b: there are multiple relationships r between the two nodes, and each r.userId is unique
(e.g. a-[r:R {userId:"user1"}]->b and a-[r:R {userId:"user2"}]->b),
and the same for a-[r]->c.
The situation is that one a-[r]->b relationship has r.userId = admin, but no a-[r]->c relationship does.
How can I return only c?
I tried this Cypher:
"MATCH (a:SomeLabel)-[r:SomeR]->(any:SomeLabel) "
"WHERE id(a)=0 AND r.userId <> \"admin\" "
"RETURN any";
But this also returns b, because a->b has other relationships with r.userId=xxxx.
How can I write the Cypher so that it returns only nodes that have no relationship with userId="admin"?
If it's not clear what I mean, please let me know. I need your help with this case. Thanks.
I drew a picture below: multiple relationships named sr, each with different properties (userId is unique).
I want to find all nodes related to node A that are not connected to it by sr {userId:admin}, which I underlined in red. In the picture, node B has the relationship sr {userId:admin}, so I want to return only node C, not node B.
For showing simple representations of graph problems, graphgists are really helpful as people can explore the data.
I've created one based on your description: http://gist.neo4j.org/?94ef056e41153b116e4f
To your problem, you can collect all usernames involved in the relationships per pair of nodes and filter based on those:
MATCH (a { name:'A' })-[r:sr]->(b)
WITH a,b, collect(r.name) AS usernames
WHERE NOT 'admin' IN usernames
RETURN a, b
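The collect-then-filter step can be mirrored in a minimal Python sketch. The relationship list is hypothetical data modeled on the question (node B has an admin relationship, node C does not), and `targets_without` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical relationships leaving node A: (target node, userId on the relationship).
relationships = [
    ("B", "user1"), ("B", "admin"),
    ("C", "user1"), ("C", "user2"),
]


def targets_without(relationships, banned):
    """Group userIds per target, then drop targets that have the banned one."""
    users_by_target = defaultdict(set)
    for target, user_id in relationships:
        users_by_target[target].add(user_id)
    return sorted(t for t, users in users_by_target.items() if banned not in users)


print(targets_without(relationships, "admin"))
```

Filtering row by row, as in the original attempt, keeps B because its non-admin relationship still passes the test. Grouping first lets a single admin relationship exclude the whole node.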
Your question is pretty unclear. My interpretation is that you want to find nodes c that are not connected to a node a with a relationship of type R.
You basically want to do a negative match aka search for a pattern that does not exist. Negative patterns can be retrieved using a where not:
MATCH (a:SomeLabel), (c:SomeLabel)
WHERE ID(a)=0 AND NOT (a)-[:R]->(c)
RETURN c
This returns a list of all SomeLabel nodes not being connected to a.
See http://docs.neo4j.org/chunked/stable/query-where.html#query-where-patterns in the reference manual.