Cypher: Multiple independent queries in one call - neo4j

In my Neo4j 2.0 server database I have a forest, i.e. a set of trees. One of my use cases is to get the child nodes of an arbitrary subset of tree nodes.
For instance, I have the root nodes
root1 root2 root3 root4
and now I want the child nodes of root1 and root4, and I need to know which children belong to which root. Each query on its own is a simple Cypher MATCH query. But since I use the Neo4j server, I would like to keep the number of database calls low for the sake of performance. So I am looking for a way to tell Cypher "give me the child nodes of root1 and root4 and tell me which node belongs to which root in the result". That is, I am thinking of a kind of map, or a collection of result sets where the first element holds the child nodes of the first root, the second element the child nodes of the second root, and so on.
Is there a way to do this in Cypher or will I have to fall back to a server plugin here?
Thank you and best regards!
Edit:
To clarify: My main concern is that I need to know which children belong to which root. As an example, consider the small graph generated by this command:
create (r1:ROOT {name:"root1"}),
(r2:ROOT {name:"root2"}),
(c11:CHILD {name:"child1_1"}),
(c12:CHILD {name:"child1_2"}),
(c13:CHILD {name:"child1_3"}),
(c21:CHILD {name:"child2_1"}),
(c22:CHILD {name:"child2_2"}),
(c23:CHILD {name:"child2_3"}),
(r1)-[:HAS_CHILD]->(c11),
(r1)-[:HAS_CHILD]->(c12),
(r1)-[:HAS_CHILD]->(c13),
(r2)-[:HAS_CHILD]->(c21),
(r2)-[:HAS_CHILD]->(c22),
(r2)-[:HAS_CHILD]->(c23)
Here, root1 and root2 each have three children.
To get the children of root1 I would issue the following query:
MATCH (r:ROOT)-[:HAS_CHILD]->(c) WHERE r.name='root1' RETURN collect(c)
Now I know the children of root root1.
The question is: What would a query look like that fetches the children of root1 AND root2, where the result shows which child belongs to which root? Because clearly the query
MATCH (r:ROOT)-[:HAS_CHILD]->(c) WHERE r.name='root1' OR r.name='root2' RETURN collect(c)
would give me the children of both roots. But now I would not know which root had which children. So what can I do?

You should give us more details, but a query like this (after adjusting properties and relationships) should do what you want:
MATCH (child) <-[:HAS_CHILD]- (root:ROOT)
WHERE root.name IN ['root1','root4']
RETURN child, root
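To see directly which children belong to which root, the rows can also be grouped per root with collect(). This is a minimal sketch, assuming the sample graph from the question (labels ROOT and CHILD, property name); adjust the names in the IN list as needed:

```cypher
// One row per root: the root's name plus the list of its children's names.
MATCH (root:ROOT)-[:HAS_CHILD]->(child)
WHERE root.name IN ['root1', 'root2']
RETURN root.name AS root, collect(child.name) AS children
```

With the sample data this should yield one row for root1 and one for root2, each carrying its own list of three children. The order inside each list is not guaranteed.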

Related

Get all nodes with a specific type of relationship to a root node

I have a rather large and complex graph in Neo4j (millions of nodes and relationships of various types). I want to get all child nodes (at all depths) of a specific root node, but only via a specific type of relationship.
I have tried: Match (n:NODE_TYPE)-[*:REL_TYPE]->(r:NODE_TYPE {id:SPECIFIC_ID}) return n
But I get a syntax error for specifying a type on the relationship.
Querying the whole graph without specifying the relationship type takes a really long time, and nodes could be reached through paths that eventually lead to the root node but use other relationship types (which is not what I want for my use case).
You need to swap the order of the relationship type and the wildcard operator:
Match (n:NODE_TYPE)-[:REL_TYPE*]->(r:NODE_TYPE {id:SPECIFIC_ID})
return n
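On a large graph it may also help to bound the depth and deduplicate the result, since the same node can be reached over several paths. The following is only a sketch: NODE_TYPE, REL_TYPE and SPECIFIC_ID are the placeholders from the question, and the upper bound of 10 hops is an arbitrary value chosen for illustration.

```cypher
// The relationship type comes first, then '*' with an optional range.
// Assumption: a maximum depth of 10 is enough for this tree.
MATCH (n:NODE_TYPE)-[:REL_TYPE*1..10]->(r:NODE_TYPE {id: SPECIFIC_ID})
RETURN DISTINCT n
```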

Neo4j query to ignore parent nodes which don't satisfy a condition but keep the same structure

I have a tree-like structure and I'm trying to write a Cypher query which will replace the parent node with the child if the parent node does not have a certain relation.
For example, the query MATCH (c)-[:CHILD_OF*]->(p {id:"123"}) return c returns a structure like this (we don't care what the other nodes are; the structure is the only thing that needs to be preserved):
()<-(A)
()<-()<-(B)<-()<-(C)
()<-(D)<-(E)<-()<-(F)
\-(G)<-()<-(H)
How could I get the query to ignore all nodes without a certain property but keep the same structure, like so:
(A)
(B)<-(C)
(D)<-(E)<-(F)
(G)<-(H)
You should take a look at the procedures for creating virtual nodes and relationships in APOC Procedures.
These will allow you to create virtual relationships, that will not be saved to the graph, but will be present and viewable in your query.
The tricky part will be creating those new virtual relationships. You'll likely be filtering down nodes in all paths to the nodes you're interested in. At that point you may need to use apoc.coll.pairsMin() in order to get each adjacent pair of nodes in the collection on a row so you can create the virtual relationships between them.
After all the virtual relationships are created (in the same cypher query), match from the root node using those virtual relationships, and you should see the graph you want.
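The steps above could be sketched roughly as follows. This is an untested sketch, not a definitive solution: it assumes that APOC is installed, that the nodes to keep carry a hypothetical property called keep, and that the virtual relationships may reuse the type name CHILD_OF. It returns the virtual relationships directly.

```cypher
// Assumption: nodes to keep carry a hypothetical property `keep`.
MATCH path = (leaf)-[:CHILD_OF*]->(p {id: "123"})
WHERE NOT ()-[:CHILD_OF]->(leaf)              // start only from leaf nodes
// Drop the nodes we are not interested in, preserving path order.
WITH [n IN nodes(path) WHERE n.keep IS NOT NULL] AS kept
// pairsMin turns [a, b, c] into [[a, b], [b, c]].
UNWIND apoc.coll.pairsMin(kept) AS pair
WITH DISTINCT pair[0] AS child, pair[1] AS parent
// Virtual relationship: visible in the result, never written to the graph.
CALL apoc.create.vRelationship(child, 'CHILD_OF', {}, parent) YIELD rel
RETURN child, rel, parent
```

Note that a path with only one kept node produces no pair, so such a node (like (A) in the example) would have to be returned separately.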

How to query parents-children tree in neo4j?

I have a tree and I would like to get all nodes at every level. The depth of the tree could be anything.
node(1)<-[PARENT]-node(2)<-[PARENT]-node(3)<-[PARENT]-node(4)
node(1)<-[PARENT]-node(5)<-[PARENT]-node(6)
node(2)<-[PARENT]-node(7)
node(5)<-[PARENT]-node(8)
node(2)<-[PARENT]-node(9)
so,
node(1) has two children node(2) and node(5)
node(2) has three children node(3),node(7) and node(9)
node(5) has two children node(6) and node(8)
node(3) has one child node(4)
This is an example of the tree. I would like to get all nodes at every level in a separate map. I tried many different Cypher queries, but could not figure out a way to do it. I would like to write a single Cypher query for this operation, if anyone can help.
I figured out a simple query which keeps track of relationships, but in Java, template.query() returns a Result<Map<String, Object>>, which is not ideal because I have to extract the nodes and relationships from that result. Here is the query:
match p=(n)<-[r:PARENT*]-(b) return relationships(p);
This returns all relationships in every path. From that list, I have to build up the tree in Java to maintain the parent-children relationships.
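One possible way to let Cypher do the grouping is to aggregate by path length, or by parent. The two queries below are a hedged sketch: they assume the root can be identified by a hypothetical property id with value 1 (the question only shows node(1)), and they have not been run against the asker's data.

```cypher
// Nodes per level: the path length from the root is the level.
// Assumption: the root carries a hypothetical property `id` = 1.
MATCH p = (root {id: 1})<-[:PARENT*]-(n)
RETURN length(p) AS level, collect(n) AS nodes
ORDER BY level;

// Parent -> children map: one row per parent with all direct children.
MATCH (parent)<-[:PARENT]-(child)
RETURN parent, collect(child) AS children;
```

For the example tree, the first query should put node(2) and node(5) at level 1 and node(4) at level 3. The root itself is not included; using [:PARENT*0..] instead would add it at level 0.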

Is a DFS Cypher Query possible?

My database contains about 300k nodes and 350k relationships.
My current query is:
start n=node(3) match p=(n)-[r:move*1..2]->(m) where all(r2 in relationships(p) where r2.GameID = STR(id(n))) return m;
The nodes touched in this query are all of the same kind; they are different positions in a game. Each relationship has a property "GameID", which is used to identify the right relationship when traversing the graph along a path. So if you start traversing the graph at a node and follow the relationship with the right GameID, there won't be another path starting at the first node with a relationship that matches the GameID.
Some nodes have hundreds of incoming and outgoing relationships, while others have only a few.
The problem is that I don't know how to tell Cypher to do this. The above query works for a depth of 1 or 2, but it should look like [r:move*] to return the whole path, which is about 20-200 hops.
But if I raise the values, the queries won't finish. I think that Cypher looks at each outgoing relationship at every single path depth relative to the start node, but as I already explained, there is only one right path. So it should do some kind of depth-first search (DFS) instead of a breadth-first search (BFS). Is there a way to do so?
I would consider configuring a relationship index for the GameID property. See http://docs.neo4j.org/chunked/milestone/auto-indexing.html#auto-indexing-config.
Once you have done that, you can try a query like the following (I have not tested this):
START n=node(3), r=relationship:rels(GameID = 3)
MATCH (n)-[r*1..]->(m)
RETURN m;
Such a query would limit the relationships considered by the MATCH clause to just the ones with the GameID you care about. And getting that initial collection of relationships would be fast, because of the indexing.
As an aside: since Neo4j reuses its internally generated IDs (for nodes that are deleted), storing those IDs as GameIDs will make your data unreliable (unless you never delete any such nodes). You may want to generate your own unique IDs, store them in your nodes, and use them for your GameIDs. If you do this, you should also create a uniqueness constraint for your own IDs; as a nice side effect, this automatically creates an index for them.
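A uniqueness constraint of that kind could look like the sketch below. The label Position and the property uid are hypothetical names chosen for illustration; the question does not say how the nodes are labelled.

```cypher
// Assumptions: positions carry a hypothetical label `Position` and a
// hypothetical property `uid` holding the application-generated ID.
// Creating the constraint also creates an index on :Position(uid).
CREATE CONSTRAINT ON (p:Position) ASSERT p.uid IS UNIQUE
```

This is the syntax of the Neo4j 2.x era; recent Neo4j versions spell it CREATE CONSTRAINT FOR (p:Position) REQUIRE p.uid IS UNIQUE.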

Slow performing cypher query that creates nodes to group existing nodes by property values

I have a performance issue with a modifying Cypher query. There is an origin node that has a huge number of outgoing relationships to child nodes. These child nodes all have a key property. The goal is to create new nodes between the origin and the child nodes to group all child nodes which share the same key property value. A plot of that idea can be found at the neo4j console: http://console.neo4j.org/?id=vinntj
I use the query together with spring-data-neo4j 2.2.2.RELEASE and neo4j 1.9.2 embedded. The parameter for that query must be a node id and the result of that query should be the modified root node.
The query currently looks like (a bit more complex than in the linked neo4j console):
START root=node({0})
MATCH (root)-[r:LEAF]->(child)
SET root.__type__='my.GroupedRoot'
DELETE r
WITH child.`custom-GROUP` AS groupingKey, root AS origin, child AS leaf
CREATE UNIQUE (origin)-[:GROUP]->(group{__type__:'my.Group',key:'GROUP',value:groupingKey,origin:ID(origin)})-[:LEAF]->(leaf)
RETURN DISTINCT origin
The property custom-GROUP is the key to group by. In SDN it is represented by a DynamicProperties object. I annotated it to be indexed as well as the groupingKey and origin property of the created group node.
With 5000 child nodes it takes ~50 s to group them, with 10000 nodes ~90 s, with 20000 nodes ~380 s, and with 30000 nodes more than 50 min! This looks much worse than linear scaling to me. But my goal is O(n) scaling, so that 500000+ child nodes can be processed in under 30 min. I assume that the CREATE UNIQUE part of the query causes the problem, because for each new group node it has to check which group nodes have already been created, and the number of nodes to check grows with the number of child nodes already grouped.
Does someone have an idea how to make this query faster, or how to do the same thing faster with another query?
If the CREATE UNIQUE is indeed the problem, then this will first create the groups, then map to them.
START root=node(*)
MATCH (root)-[r:LEAF]->(child)
WHERE HAS (root.key) AND root.key='root'
WITH DISTINCT child.key AS groupingKey, root as origin
CREATE UNIQUE (origin)-[:GROUP]->(intermediate { key:groupingKey,origin:ID(origin)})
WITH groupingKey, origin, intermediate
MATCH (origin)-[r:LEAF]->(leaf)
WHERE leaf.key = groupingKey
DELETE r
CREATE (intermediate)-[:LEAF]->(leaf)
RETURN DISTINCT origin
The console is not letting me view the execution plan for either of our queries for some reason so I don't know for sure if it will help.
You might also consider indexing the roots so that you aren't having to do a "WHERE" on all of the nodes. You could just check an index for key=root.
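With the legacy auto-index available in Neo4j 1.9, such a lookup might look like the sketch below. It assumes that node auto-indexing has been enabled for the property key in neo4j.properties (node_auto_indexing=true and node_keys_indexable=key); that configuration is an assumption and is not shown in the question.

```cypher
// Assumes node auto-indexing is enabled for the property `key`.
// The index lookup replaces the scan over node(*) plus the WHERE filter.
START root=node:node_auto_index(key='root')
MATCH (root)-[r:LEAF]->(child)
RETURN root, count(child)
```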
EDIT: An alternative to the above query is the following, which uses COLLECT to avoid having to match the leaf nodes a second time.
START root=node(*)
MATCH (root)-[r:LEAF]->(child)
WHERE HAS (root.key) AND root.key='root'
DELETE r
WITH DISTINCT child.key AS groupingKey, root as origin, COLLECT(child) as children
CREATE UNIQUE (origin)-[:GROUP]->(intermediate { key:groupingKey,origin:ID(origin)})
WITH groupingKey, origin, intermediate, children
FOREACH(leaf IN children : CREATE (intermediate)-[:LEAF]->(leaf))
RETURN DISTINCT origin
Well, I have now stopped using this kind of Cypher query on such a large amount of data. I implemented the same functionality using the traversal API to extract the groupable items and the Neo4jTemplate to create the new nodes and relationships. Now 50000 items can be grouped in 5474 ms instead of ~1 h with the previous Cypher query. This is a very big improvement.
