I'm new to Neo4j and have been playing with an idea about people moving houses in order to learn more about Cypher. This is what I have currently:
Each Person [:OWNS] a House
Each House [:ISIN] a Street
A Person [:WANTS] (to live in) a Street
The aim is to find a complete 'chain'
If I run
MATCH (s:Street)<-[:WANTS]-(p:Person)-[:OWNS]->(h:House) RETURN s,h,p
This returns me the complete chain linking right back to the person.
What I'm trying to do is only return complete chains and not broken ones.
I've also tried
MATCH (s:Street)<-[:WANTS]-(p:Person)-[:OWNS]->(h:House)-[:ISIN]->(s) RETURN s,h,p
but this never returns results. Any thoughts?
UPDATE
I got the last query returning results by doing this
MATCH (s:Street)<-[:WANTS]-(p:Person)-[:OWNS]->(h:House)-[:ISIN]->(s1: Street) RETURN s,h,p
But am not sure if this is what I want.
I just want to return circular results so I can see complete house moving chains. Ultimately based on one person so I'll need to put a WHERE in there.
I will try more queries tomorrow with a larger dataset.
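Within a single MATCH, Cypher only guarantees that the same relationship is not traversed twice; a node variable may appear more than once in a pattern. Your second query is therefore valid, and it returns nothing only when no person wants the street their own house is in. If you prefer, you can break out the last hop into a WHERE clause, which expresses the same condition: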
MATCH (s:Street)<-[:WANTS]-(p:Person)-[:OWNS]->(h:House)
WHERE (h)-[:ISIN]->(s)
RETURN s,h,p
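To find an actual moving chain rather than a person who wants their own street, each person's wanted street has to be matched against someone else's house. A minimal sketch for the simplest case, a two-person swap, assuming the labels and relationship types from the question and a hypothetical name property on Person:

```cypher
// Two-person swap: p1 wants the street p2 lives in, and vice versa.
// The `name` property is an assumption, not part of the original model.
MATCH (p1:Person {name: 'Alice'})-[:WANTS]->(:Street)<-[:ISIN]-(:House)<-[:OWNS]-(p2:Person),
      (p2)-[:WANTS]->(:Street)<-[:ISIN]-(:House)<-[:OWNS]-(p1)
RETURN p1, p2
```

Longer chains would need one more hop group per extra person. Dropping the filter on p1 lists every swap, with each pair appearing twice (once per direction).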
Related
My website has pages dedicated to some events, represented by nodes in neo4j. Those events possess sub-events which are relationships under neo4j, and which correspond to links on the source page to the target page. I currently have a search engine that highlights the links to the searched events, but it is flawed by cycles in the data model. Indeed it is highlighting all the links that contain cyclic references to the current page if this page contains any link to the searched event.
The aim is therefore to have a query that is able to flag the nodes and relationships which are related to the searched events, without flagging a path only because of cyclic relationships.
I've created a small dataset representative of the issue that you can build using this query:
CREATE
(r:Event:Searched {name:'R', tag:1}),
(d:Event:Searched {name:'D', tag:1}),
(o:Event {name:'O'}),
(a:Event {name:'A'}),
(b:Event {name:'B'}),
(c:Event {name:'C'}),
(e:Event {name:'E'}),
(o)-[:hasEvent]->(a),
(o)-[:hasEvent]->(e),
(o)-[:hasEvent]->(r),
(o)-[:hasEvent]->(c),
(a)-[:hasEvent]->(b),
(b)-[:hasEvent]->(o),
(c)-[:hasEvent]->(d)
Which produces the following graph:
My aim is to have a query that only fetches nodes C and O, but not A or B, as the only reason they are flagged is that O is already flagged.
My current query that I need to fix is the following:
MATCH path=(upper:Event)-[:hasEvent*]->(source:Event:Searched)
RETURN upper
I hope you can help me, I couldn't manage to make similar questions' answers work on my specific case.
Ideally, the solution shouldn't be too computationally intensive, as my real model is quite big (2,300,000 nodes and 9,500,000 relationships), and the current indexing in the search engine is already quite slow.
Thank you in advance for your help
You could try the apoc.path.expandConfig procedure. It has a uniqueness option, which can be configured so that a node cannot be traversed more than once.
MATCH (n:Event)
CALL apoc.path.expandConfig(n, {
relationshipFilter: "hasEvent>",
labelFilter: "/Searched",
uniqueness: "NODE_GLOBAL"
}) YIELD path
RETURN [n IN nodes(path) WHERE NOT n:Searched] AS upper
However, this query will still return A and B, because MATCH (n:Event) starts the expansion from every node (regardless of the relationship between O and A). If I understood you correctly, you don't want to start from all nodes but from a specific one ("pages dedicated to some events"). So you might want to start with, e.g., MATCH (n:Event {name: "O"}), which will return only O and C.
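Put together, the single-start variant might look like the sketch below. It assumes the APOC plugin is installed and uses the sample dataset from the question; the UNWIND and DISTINCT steps are an addition here, used only to flatten the paths into one row per flagged node.

```cypher
// Start from the current page only, instead of from every Event.
MATCH (n:Event {name: 'O'})
CALL apoc.path.expandConfig(n, {
  relationshipFilter: 'hasEvent>',   // follow outgoing hasEvent only
  labelFilter: '/Searched',          // stop at, and only return paths ending in, Searched nodes
  uniqueness: 'NODE_GLOBAL'          // never visit a node twice, so cycles back to O are cut
}) YIELD path
UNWIND nodes(path) AS node
WITH DISTINCT node
WHERE NOT node:Searched
RETURN node.name AS upper
```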
I am trying to learn Neo4j, so I took a travel app as a use case, but I am not sure about the optimal way to solve it. Any help will be appreciated.
Thanks in advance.
Consider a use case in which I have to travel from one place (PLACE A) to another (PLACE C) by train, but there is no direct connection between the two places, so we have to change trains in PLACE B.
Two places are connected via an IS_CONNECTED relation (the green nodes in the image).
If there is an IS_CONNECTED relation between two places, then both nodes also have an outgoing CONNECTED_VIA relation to a common train, which shows how they are connected (the red nodes in the image).
My question is: how are we supposed to know that we have to change trains at PLACE B?
My understanding is:
First, we check whether the two places are connected via IS_CONNECTED relationships:
match (start:place{name:"heidelberg"}), (end:place{name:"frankfurt"})
MATCH path = (start)-[:IS_CONNECTED*..]->(end)
RETURN path
This will show that these two places are connected.
Then we check whether PLACE A and PLACE C are directly connected, using the query
match (p:place{name:"heidelberg"})-[:CONNECTED_VIA]->(q)<-[:CONNECTED_VIA]-(t:place{name:"frankfurt"})
return q
This will return nothing, because there is no direct connection.
My brain stopped functioning after this. I have been trying to figure it out for the past 3 days. I am sorry if I look so confused.
Please click here for the image of what I am referring to.
You'll want to use variable-length relationships in your :CONNECTED_VIA match, and then get the :place nodes that are in your path. It's usually a good idea to set an upper bound, whatever makes sense in your graph.
Then we can filter the nodes in your path to keep only the ones that are :place nodes (labels are case-sensitive, so the filter must use the same lowercase label as the MATCH).
match path = (p:place{name:"heidelberg"})-[:CONNECTED_VIA*..4]-(t:place{name:"frankfurt"})
return path, [node in nodes(path)[1..-1] where node:place] as connectionPlaces
And if you're only interested in the shortest paths, you may want to check the shortestPath() or allShortestPaths() functions.
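As a rough sketch of the shortest-path variant, assuming the lowercase place label and the names from the question (the variable names start and finish and the upper bound of 4 are arbitrary choices):

```cypher
// Look up both endpoints first, then ask for one shortest path between them.
MATCH (start:place {name: 'heidelberg'}), (finish:place {name: 'frankfurt'})
MATCH path = shortestPath((start)-[:CONNECTED_VIA*..4]-(finish))
// Drop the two endpoints and keep only the intermediate place nodes.
RETURN [node IN nodes(path)[1..-1] WHERE node:place] AS connectionPlaces
```

Swapping shortestPath for allShortestPaths would return every path of minimal length instead of a single one.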
One last thing to note...when determining if two locations are connected, if all you need is a true or false if they're connected, you can use the EXISTS() function to return whether such a pattern exists:
match (start:place{name:"heidelberg"}), (end:place{name:"frankfurt"})
return exists((start)-[:IS_CONNECTED*..5]->(end))
The answer to this question shows how to get a list of all nodes connected to a particular node via a path of known relationship types.
As a follow up to that question, I'm trying to determine if traversing the graph like this is the most efficient way to get all nodes connected to a particular node via any path.
My scenario: I have a tree of groups (group can have any number of children). This I model with IS_PARENT_OF relationships. Groups can also relate to any other groups via a special relationship called role playing. This I model with PLAYS_ROLE_IN relationships.
The most common question I want to ask is MATCH (n {name: "xxx"})-[*]->(o) RETURN o.name, but this seems to be extremely slow on even a small number of nodes (4000 nodes - takes 5s to return an answer). Note that the graph may contain cycles (n-IS_PARENT_OF->o, n<-PLAYS_ROLE_IN-o).
Is connectedness via any path not something that can be indexed?
First, because you are not using a label and an indexed property for your starting node, the query has to scan ALL the nodes in the graph and open each one's PropertyContainer to check whether it has a name property with the value "xxx".
Secondly, if you know an approximate maximum depth of parentship, you may want to limit the depth of the search.
I would suggest you add a label of your choice to your nodes and index the name property.
Use a label, e.g. :Group, for your starting point and an index on :Group(name).
Then Neo4j can quickly find your starting point without scanning the whole graph.
You can easily see where the time is spent by prefixing your query with PROFILE.
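A minimal sketch of these suggestions combined, assuming Neo4j 4.x or later for the index syntax (older versions use CREATE INDEX ON :Group(name)). The index name group_name and the depth limit of 5 are placeholders, not values from the question.

```cypher
// One-off schema change: index the lookup property on the new label.
CREATE INDEX group_name IF NOT EXISTS FOR (g:Group) ON (g.name);

// PROFILE shows where the time goes; the start node should now come from an index seek.
PROFILE
MATCH (n:Group {name: 'xxx'})-[:IS_PARENT_OF|PLAYS_ROLE_IN*1..5]->(o:Group)
RETURN DISTINCT o.name;
```

DISTINCT is there because, in a graph with cycles or several routes, the same group can be reached along more than one path.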
Do you really want all arbitrarily long paths from the starting point? Or just all pairs of connected nodes?
If the latter then this query would be more efficient.
MATCH (n:Group)-[:IS_PARENT_OF|PLAYS_ROLE_IN]->(m:Group)
RETURN n,m
My database contains about 300k nodes and 350k relationships.
My current query is:
start n=node(3) match p=(n)-[r:move*1..2]->(m) where all(r2 in relationships(p) where r2.GameID = STR(id(n))) return m;
The nodes touched in this query are all of the same kind, they are different positions in a game. Each of the relationships contains a property "GameID", which is used to identify the right relationship if you want to pass the graph via a path. So if you start traversing the graph at a node and follow the relationship with the right GameID, there won't be another path starting at the first node with a relationship that fits the GameID.
There are nodes that have hundreds of in and outgoing relationships, some others only have a few.
The problem is, that I don't know how to tell Cypher how to do this. The above query works for a depth of 1 or 2, but it should look like [r:move*] to return the whole path, which is about 20-200 hops.
But if I raise the values, the queries won't finish. I think that Cypher looks at each outgoing relationship at every single path depth relating to the start node, but as I already explained, there is only one right path. So it should do some kind of a DFS search instead of a BFS search. Is there a way to do so?
I would consider configuring a relationship index for the GameID property. See http://docs.neo4j.org/chunked/milestone/auto-indexing.html#auto-indexing-config.
Once you have done that, you can try a query like the following (I have not tested this):
START n=node(3), r=relationship:rels(GameID = 3)
MATCH (n)-[r*1..]->(m)
RETURN m;
Such a query would limit the relationships considered by the MATCH clause to just the ones with the GameID you care about. And getting that initial collection of relationships would be fast, because of the indexing.
As an aside: since Neo4j reuses its internally-generated IDs (for nodes that are deleted), storing those IDs as GameIDs will make your data unreliable (unless you never delete any such nodes). You may want to generate and use your own unique IDs, store them in your nodes, and use them for your GameIDs; if you do this, you should also create a uniqueness constraint for your own IDs, which will, as a nice side effect, automatically create an index for them.
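A sketch of what that could look like, with a hypothetical label Position and property uid standing in for whatever the real model uses. The constraint is written in Neo4j 5 syntax; 4.x uses ON (p:Position) ASSERT p.uid IS UNIQUE instead.

```cypher
// Hypothetical label `Position` and property `uid`; adjust to your model.
CREATE CONSTRAINT position_uid IF NOT EXISTS
FOR (p:Position) REQUIRE p.uid IS UNIQUE;

// Give every node that lacks one its own stable identifier.
MATCH (p:Position)
WHERE p.uid IS NULL
SET p.uid = randomUUID();
```

The relationships' GameID values would then be rewritten to hold the uid of the starting position rather than its internal ID.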
I've been playing around with Neo4j and have a problem for which I do not have a solution, hence my question here.
For my particular problem I'll describe a simplified version that captures the essence. Suppose I have a graph of locations that are connected either directly or via a detour:
direct: (A)-[:GOES_TO]->(B)
indirect: (A)-[:GOES_THROUGH]->(C)-[:COMES_BACK_TO]->(B)
If I want to have everything between "Go" and the "Finish" with a GOES_TO relationship I can easily use the Cypher query:
START a=node:NODE_IDX(Id = "Go"), b=node:NODE_IDX(Id = "Finish")
MATCH a-[r:GOES_TO*]->b
RETURN a,r,b
Here, NODE_IDX is an index on the nodes (Id).
Where I get stuck is when I want to have all the paths between "Go" and "Finish" that are not GOES_TO relationships but rather multiple GOES_THROUGH-->()-->COMES_BACK_TO relationship combinations (of variable depth).
I do not want to filter out the GOES_TO relationships because there are many more relationships among the nodes, and I do not want to accommodate removing all of them (dynamically). Is it possible to have a variable-depth, multi-relationship MATCH that I envisage?
Thanks!
Let me restate what I believe is being asked.
"If there is a path of the form (a)-[:X]->(b), find all other paths from a to b."
The answer is simple:
MATCH p=(a)-[:X]->(b), q=(a)-[r*]->(b)
WHERE p<>q
RETURN r;
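If the goal is specifically paths made only of alternating GOES_THROUGH and COMES_BACK_TO hops, one possible sketch is to match both types at variable depth and then check the order of the types along each path. The Location label and the upper bound of 20 are assumptions for illustration; the Id property comes from the question.

```cypher
// Match any mix of the two types, then keep only strictly alternating paths.
MATCH p = (a:Location {Id: 'Go'})-[:GOES_THROUGH|COMES_BACK_TO*2..20]->(b:Location {Id: 'Finish'})
WHERE length(p) % 2 = 0   // a whole number of detours
  AND all(i IN range(0, length(p) - 1)
          WHERE type(relationships(p)[i]) =
                CASE i % 2 WHEN 0 THEN 'GOES_THROUGH' ELSE 'COMES_BACK_TO' END)
RETURN p
```

Other relationship types between the nodes are never expanded, so nothing has to be filtered out dynamically. The type check runs after expansion, though, so a tight upper bound matters on larger graphs.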