I'm working with Users assigned to a Grid location:
(User)-[:PICK_UP]->(Grid)
With the query
MATCH (u:User)-[:PICK_UP]->(g:Grid)-[:TO]-(g2:Grid)<-[:PICK_UP]-(u2:User)
RETURN g,g2,u,u2
I have the result
In the image I have two groups of nodes that represent the grids and their neighbors, with users shown as red nodes. I would like to group the users that are near each other by creating relationships from them to a Spot node.
E.g. with the first group: grids 34, 40, 41, with the users 1, 4, 5, 9. I would like to group the users in my query so I can get the result [user1, u4, u5, u9], and then assign those users to a Spot, like this
Any suggestions??
Thank you !!
The thing to keep in mind is that your (u:User)-[:PICK_UP]->(g:Grid)-[:TO]-(g2:Grid)<-[:PICK_UP]-(u2:User) is matching a specific path, and while you see two groups in the graphical display, there are actually overlapping paths there. Viewing your result in table mode might be helpful.
So onto answering your question! Firstly, this was a tricky one, but a really cool one. I think I've got a good solution:
MATCH path=(grid:Grid)-[:TO]-(other_grid:Grid)
WITH CASE WHEN ID(grid) < ID(other_grid) THEN ID(other_grid) ELSE ID(grid) END AS id_to_reject
WITH collect(DISTINCT id_to_reject) AS ids_to_reject
MATCH (grid:Grid)
WHERE NOT(ID(grid) IN ids_to_reject)
CREATE (spot:Spot)
WITH grid, spot
MATCH (grid)-[:TO|PICK_UP*1..6]-(user:User)
MERGE (user)-[:AT_SPOT]->(spot)
The first thing that the query does is compare all Grid nodes which are related to each other. For each of these pairs it passes on the ID() of the Grid node which is greater. The IDs which aren't in the list are therefore the smallest in the group and can act as a representative of the group. For each one of these representative Grid nodes we create a Spot node.
Using that node, it finds all User nodes within six hops via both TO and PICK_UP relationships. That should give all users in the group (both the users of our representative grid as well as the users of the other grids).
Then it's a simple matter to MERGE a relationship from each user to the Spot.
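To make the grouping idea concrete outside the database, here is a minimal Python sketch of the same intent: treat the TO relationships as undirected links, find each connected group of grids, and collect the users picked up in that group under the smallest grid id. It is only an illustration on in-memory data. The helper name group_users_by_spot is hypothetical, and the exact links between the grids are assumed, since the question only shows them in an image.

```python
from collections import defaultdict, deque

def group_users_by_spot(grid_edges, pick_ups):
    """Return {representative_grid: sorted users} for each connected group of grids.

    grid_edges: iterable of (grid, grid) pairs, treated as undirected TO links.
    pick_ups:   dict mapping user -> grid (the PICK_UP relationship).
    """
    neighbours = defaultdict(set)
    for a, b in grid_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Grids that have users but no TO link still form their own group.
    for grid in pick_ups.values():
        neighbours.setdefault(grid, set())

    users_at = defaultdict(list)
    for user, grid in pick_ups.items():
        users_at[grid].append(user)

    seen = set()
    spots = {}
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects one connected group of grids.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            grid = queue.popleft()
            members.append(grid)
            for nxt in neighbours[grid]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        # The smallest grid id represents the group, as in the Cypher answer.
        spots[min(members)] = sorted(u for g in members for u in users_at[g])
    return spots
```

Calling it with the assumed links [(34, 40), (40, 41)] and the users from the question placed on those grids yields one entry keyed by grid 34 that lists all four users, which corresponds to one Spot per group.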
I have the following setup
Person is part of an organisation
Person attends meeting
Meeting is held in a location
More than one person can attend a meeting
More than one person can be part of same organisation
Persons from different organisation can attend same meeting
Multiple meetings can be held at the same location
Of all the locations, there is a very oft-used one (the home base).
Meaning that when I "Expand a spanning Tree", and when I hit that location, my graph "explodes"
Example code I use:
MATCH (p:Person {pcode: 123456})
MATCH (terminator:Location) WHERE terminator.LocCode = 1
CALL apoc.path.spanningTree(p, {
minLevel: 1,
maxLevel: 3,
terminatorNodes: terminator
})
YIELD path
RETURN path
;
My hope when using terminatorNodes is that the path would stop at that particular node and ignore everything that's "beyond" .. but that's not what happens, in actual fact I see all the nodes "beyond"
I have tried using endNodes too, but there it looks like the code bombs out as soon as it bumps into that particular node and stops spanning trees everywhere else too!
I would like to obtain the same effect for a particular organisation too (mine!) but one step at a time!
What I am really trying to achieve is to retrieve all Persons connected to a starting person via meetings.
I.e. "Start Person" A attends a meeting with another 3 people from different organisations, then I want to see those Persons returned, and their organisation, and then all the people linked to their organisation.
The above is just a start, in the sense that I have other Node labels to deal with but with the same aim.
Couldn't you use a depth search in your case?
MATCH path = (p:Person {pcode: 123456})-[:RELATIONSHIP_NAME*1..3]->(terminator:Location)
WHERE terminator.LocCode = 1
RETURN path
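The behaviour the question hopes for, expanding outward from a person but not continuing past a chosen location, can be sketched in plain Python as a breadth-first traversal that records a terminator node when it reaches it but never expands it. This is only an illustration of the intended semantics on an assumed in-memory adjacency dict. It is not how apoc.path.spanningTree is implemented, and the function name and node names are hypothetical.

```python
from collections import deque

def expand_with_terminators(graph, start, terminators, max_level):
    """Breadth-first expansion that stops at terminator nodes.

    graph: dict mapping node -> iterable of neighbouring nodes.
    Returns {node: level} for every node reached within max_level hops.
    """
    levels = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Do not go deeper than max_level, and never expand past a terminator.
        if levels[node] >= max_level or node in terminators:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in levels:
                levels[nxt] = levels[node] + 1
                queue.append(nxt)
    return levels
```

With the home-base location passed in terminators, the location itself appears in the result, but the meetings and people that are only reachable through it do not.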
I have the following graph:
I would like to get all contractors, subcontractors and clients, starting from David.
So I thought of a query like this:
MATCH (a:contractor)-[*0..1]->(b)-[w:works_for]->(c:client) return a,b,c
This would return:
(0:contractor {name:"David"}) (0:contractor {name:"David"}) (56:client {name:"Sarah"})
(0:contractor {name:"David"}) (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})
Which returns the desired result. The issue here is performance.
If the DB contains millions of records and I leave (b) without a label, the query will take forever. If I add a label to (b) such as (b:subcontractor) I won't hit millions of rows but I will only get results with subcontractors:
(0:contractor {name:"David"}) (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})
Is there a more efficient way to do this?
link to graph example: https://console.neo4j.org/r/pry01l
There are some things to consider with your query.
The relationship type is not specified. Is it the case that the only relationships from contractor nodes are works_for and hired? If not, you should constrain the relationship types being matched in your query. For example:
MATCH (a:contractor)-[:works_for|:hired*0..1]->(b)-[w:works_for]->(c:client)
RETURN a,b,c
The fact that (b) is unlabelled does not mean that every node in the graph will be matched. It will be reached either as a result of traversing the works_for or hired relationships if specified, or any relationship from :contractor, or via the works_for relationship.
If you do want to label it, and you have a hierarchy of types, you can assign multiple labels to nodes and just use the most general one in your query. For example, you could have a label such as ExternalStaff as the generic label, and then further add Contractor or SubContractor to distinguish individual nodes. Then you can do something like
MATCH (a:contractor)-[:works_for|:hired*0..1]->(b:ExternalStaff)-[w:works_for]->(c:client)
RETURN a,b,c
Depends really on your use cases.
I have a problem in which there a number of nodes A,B,C,D
where
B-->A
C-->B
D-->B
and the relation between them is children.
Now I want to query Neo4j to find which nodes, from a list of labels (B, C, D), exist at the bottom of the graph.
I am making a bot application. In the neo4j database relations would be stored between different terms.
Like :dog-->:animal
:labra-->:dog
:germanShepard-->:dog
Now if a user asks a question such as "tell me about dog", I should be able to get the dog label data, and if the user asks "tell me about labra dog", I should be able to get the labra label data. I am breaking the user input into tokens and then trying to find which label is at the bottom.
You can try something like
Match (a:Label) where not (a)<--(:Label) return a
(should work but I didn't test it)
As mentioned in my comment, using a unique label for every single node is going to be costly in the long run, and is going to impact your lookup speed on your queries.
So, if I'm understanding your use case correctly, you're breaking up user input into tokens, and the tokens should match to nodes on the same path in your graph. You want to find the label on the "bottom" of the graph, basically a leaf node, though in your description child nodes point toward their parent. I'll assume it's a :Parent relationship from the child to the parent node.
Here's a query which might do what you want. We'll assume you pass in the list of tokens as a parameter {tokens}. Please review the developer documentation for using parameters.
UNWIND {tokens} as token
MATCH (n)
WHERE token IN labels(n)
AND NOT ()-[:Parent]->(n)
RETURN n
This will ensure the nodes you return are not themselves parents of any other node.
However, if you instead wanted to be able to return nodes even if they were parents of other nodes, then we could return the node that is farthest from the root node. This requires a :Root node at the root of your entire graph. For your example in your description, :Root would be the parent of :animal.
UNWIND {tokens} as token
MATCH (n)
WHERE token IN labels(n)
MATCH (n)-[r:Parent*]->(:Root)
RETURN n
ORDER BY SIZE(r) DESC
LIMIT 1
Keep in mind that this query isn't guaranteed to work when there are multiple nodes with the same distance to the :Root. For example, if "germanShepard" and "labra" were given as elements of the tokens list, only one of the corresponding nodes would be returned because of the LIMIT 1, with no guarantee of which node would be returned.
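The "farthest from the root" idea, including the tie case just described, can be illustrated with a small Python sketch. It assumes the hierarchy is held as a plain child-to-parent dict with no cycles. The function names are hypothetical and the data is the dog example from the question. Unlike the LIMIT 1 query, it returns every token that ties for the greatest depth.

```python
def depth(term, parent_of):
    """Number of parent links from term up to the top of the hierarchy."""
    hops = 0
    while term in parent_of:
        term = parent_of[term]
        hops += 1
    return hops

def deepest_terms(tokens, parent_of):
    """Return every known token that sits farthest from the root, sorted by name."""
    # Keep only tokens that actually appear in the hierarchy.
    known = {t for t in tokens if t in parent_of or t in parent_of.values()}
    if not known:
        return []
    best = max(depth(t, parent_of) for t in known)
    return sorted(t for t in known if depth(t, parent_of) == best)
```

For the input "tell me about labra dog", split into tokens, only labra and dog are known terms, and labra is the deeper of the two.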
I want to determine groups of users who have common interests.
Data Model and Characteristics
User and Interest are node labels and represent unique nodes
LIKES is the relationship among them, (User)-[:LIKES]->(Interest)
All properties of nodes are indexed
Relation nature can be characterized as many to many between the nodes
There are 300+ interests and 120,000+ users
I used the following query to count, for one given interest, the users who also like each of the other interests:
MATCH (u:User)-[:LIKES]-(i:Interest)
WHERE i.name = "Baking"
WITH u
MATCH (u)-[:LIKES]-(i:Interest)
WHERE i.name <> "Baking"
RETURN i.name, COUNT(u) AS userCount
ORDER BY userCount DESC
I tried writing a query that handles 3 common interests, but that made it slower. I think this is not a good, scalable design. Can anyone help?
It may not be plausible, but the end goal is to calculate n x n combinations of interests.
Maybe you should limit the interests and only take the top five or something?
Also, I don't know your data model, but is each interest a unique node? That would speed up the query: the relation [has interest]->(baking) then points to the same node for every user, and you can just start from baking to get all the users.
Maybe flip your query and start from the interest (Cypher is strange), or you can force the query to use indexes.
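As a rough illustration of the n x n goal, the pairwise counts can be built in a single pass over the users instead of one query per interest. The Python sketch below assumes the LIKES data has already been exported into a dict of user to interests. The function name and sample data are hypothetical, and this does not address how to fetch the data from Neo4j efficiently.

```python
from collections import Counter
from itertools import combinations

def interest_pair_counts(likes):
    """Count how many users share each pair of interests.

    likes: dict mapping user -> collection of interest names.
    """
    counts = Counter()
    for interests in likes.values():
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(set(interests)), 2):
            counts[pair] += 1
    return counts
```

With about 300 interests there are at most 300 * 299 / 2 = 44,850 distinct pairs, so the resulting table stays small even with 120,000 users. The work is driven by how many interests each user likes.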
The answer to this question shows how to get a list of all nodes connected to a particular node via a path of known relationship types.
As a follow up to that question, I'm trying to determine if traversing the graph like this is the most efficient way to get all nodes connected to a particular node via any path.
My scenario: I have a tree of groups (group can have any number of children). This I model with IS_PARENT_OF relationships. Groups can also relate to any other groups via a special relationship called role playing. This I model with PLAYS_ROLE_IN relationships.
The most common question I want to ask is MATCH (n {name: "xxx"})-[*]->(o) RETURN o.name, but this seems to be extremely slow on even a small number of nodes (4000 nodes takes 5s to return an answer). Note that the graph may contain cycles (n-IS_PARENT_OF->o, n<-PLAYS_ROLE_IN-o).
Is connectedness via any path not something that can be indexed?
As a first point, because you are not using a label and an indexed property for your starting node, the query first has to visit ALL the nodes in the graph and open each node's PropertyContainer to check whether it has a name property with the value "xxx".
Secondly, if you know an approximate maximum depth of parentship, you may want to limit the depth of the search.
I would suggest you add a label of your choice to your nodes and index the name property.
Use label, e.g. :Group for your starting point and an index for :Group(name)
Then Neo4j can quickly find your starting point without scanning the whole graph.
You can easily see where the time is spent by prefixing your query with PROFILE.
Do you really want all arbitrarily long paths from the starting point? Or just all pairs of connected nodes?
If the latter then this query would be more efficient.
MATCH (n:Group)-[:IS_PARENT_OF|:PLAYS_ROLE_IN]->(m:Group)
RETURN n,m
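If the underlying question is simply "which nodes can be reached from this starting node", each node only needs to be visited once, which keeps the work bounded even when the graph contains cycles. An unbounded variable-length pattern, by contrast, may end up enumerating many distinct paths to the same node. The Python sketch below shows the visit-once idea on an assumed in-memory edge list. The function name and group names are hypothetical, and it says nothing about how Neo4j executes the query internally.

```python
from collections import defaultdict, deque

def reachable_from(edges, start):
    """Return the set of nodes reachable from start by following directed edges."""
    outgoing = defaultdict(list)
    for source, target in edges:
        outgoing[source].append(target)

    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in outgoing[node]:
            # The visited set keeps cycles from being walked repeatedly.
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    # The start node itself is not reported as a result.
    visited.discard(start)
    return visited
```

Both relationship types from the question (IS_PARENT_OF and PLAYS_ROLE_IN) can be fed in as plain (source, target) pairs, since only the direction matters for reachability here.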