When using LIMIT with ORDER BY, every node with the selected label still gets scanned (even with an index).
For example, let's say I have the following:
MERGE (:Test {name:'b'})
MERGE (:Test {name:'c'})
MERGE (:Test {name:'a'})
MERGE (:Test {name:'d'})
Running the following gets us :Test {name: 'a'}, however using PROFILE we can see the entire list get scanned, which obviously will not scale well.
MATCH (n:Test)
RETURN n
ORDER BY n.name
LIMIT 1
I have a few sorting options available for this label. The order of nodes within these sorts should not change often. However, I can't cache these lists because each list is personalized for a user, e.g. a user may have hidden :Test {name:'b'}.
Is there a golden rule for something like this? Would creating pointers from node to node for each sort option be a good option here? Something like
(n {name:'a'})-[:ABC_NEXT]->(n {name:'b'})-[:ABC_NEXT]->(n {name:'c'})-...
Would I be able to have multiple sort pointers? Would that be overkill?
Ref:
https://neo4j.com/blog/moving-relationships-neo4j/
http://www.markhneedham.com/blog/2014/04/19/neo4j-cypher-creating-relationships-between-a-collection-of-nodes-invalid-input/
Here's what I ended up doing for anyone interested:
// connect nodes
MATCH (n:Test)
WITH n
ORDER BY n.name
WITH COLLECT(n) AS nodes
FOREACH(i in RANGE(0, size(nodes)-2) |
FOREACH(node1 in [nodes[i]] |
FOREACH(node2 in [nodes[i+1]] |
CREATE UNIQUE (node1)-[:IN_ORDER_NAME]->(node2))))
// create list, point first item to list
CREATE (l:List { name: 'name' })
WITH l
MATCH (n:Test) WHERE NOT (n)<-[:IN_ORDER_NAME]-()
MERGE (l)-[:IN_ORDER_NAME]->(n)
// getting 10 nodes sorted alphabetically
MATCH (:List { name: 'name' })-[:IN_ORDER_NAME*]->(n)
RETURN n
LIMIT 10
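For the per-user hiding mentioned in the question, one possible sketch is to filter while walking the chain. The :User label, its id property, the :HIDDEN relationship, and the $userId parameter are all assumptions made for illustration and are not part of the original model. Like the query above, it relies on the chain being expanded in order.

```cypher
// Hypothetical: (u:User)-[:HIDDEN]->(n:Test) marks nodes this user has hidden
MATCH (u:User {id: $userId})
MATCH (:List { name: 'name' })-[:IN_ORDER_NAME*]->(n:Test)
WHERE NOT (u)-[:HIDDEN]->(n)
RETURN n
LIMIT 10
```

Because hidden nodes are skipped rather than unlinked, the same chain can serve every user, at the cost of walking past whatever that user has hidden.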
So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
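As a rough sketch, such a constraint could be declared as follows. The exact syntax depends on the Neo4j version, so treat this as an assumption to check against your installation.

```cypher
// Neo4j 3.x / 4.x syntax
CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE

// Newer releases (4.4 and later) use the FOR ... REQUIRE form instead:
// CREATE CONSTRAINT FOR (p:Person) REQUIRE p.name IS UNIQUE
```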
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS_CAR]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars
I am trying to return a set of nodes from two sessions, with the condition that a returned node must not be present in a third session. I am using the following code, but it is not working as intended.
MATCH (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
UNWIND ['abc1', 'abc2'] as session_id
MATCH (target:Session {session_id: session_id})-[r:HAS_PRODUCT]->(product:Product)
where p<>product
WITH distinct product.products_id as products_id, r
RETURN products_id, count(r) as score
ORDER BY score desc
This query was supposed to return all nodes present in abc1 and abc2 but not in abc3. However, it does not exclude all products present in abc3. Is there any way I can get it working?
UPDATE 1:
I tried to simplify it by removing UNWIND, like this:
match (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
MATCH (target:Session {session_id: 'abc1'})-[r:HAS_PRODUCT]->(product:Product)
where product <> p
WITH distinct product.products_id as products_id
RETURN products_id
Even this is not working. It returns all items present in abc1 without removing those that are already in abc3. It seems like where product <> p is not working correctly.
I would suggest checking whether the nodes are in a list and, to prove out the approach, starting with a very simple example.
Here is a simple Cypher query showing one way to do it. This approach can then be extended into the complex query.
// get first two product IDs as a list
MATCH (p:Product)
WITH p LIMIT 2
WITH COLLECT(ID(p)) as list
RETURN list
// now show two more product IDs which are not in that list
MATCH (p:Product)
WITH p LIMIT 2
WITH COLLECT(ID(p)) as list
MATCH (p2:Product)
WHERE NOT ID(p2) in list
RETURN ID(p2) LIMIT 2
Note: I'm comparing the ID() of each node instead of the entire node; the db hits are the same, but it may perform better.
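Applied to the sessions in the question, the list approach might look like the sketch below. It reuses the labels and properties from the question and compares nodes directly rather than their IDs; it has not been run against the asker's data. The original where p <> product fails because every (p, product) pair is its own row, so a product is kept as long as at least one product in abc3 differs from it.

```cypher
// Collect everything in the session to exclude
MATCH (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
WITH collect(p) AS excluded
// Score products from the other sessions that are not in that list
MATCH (s:Session)-[:HAS_PRODUCT]->(product:Product)
WHERE s.session_id IN ['abc1', 'abc2'] AND NOT product IN excluded
RETURN product.products_id AS products_id, count(*) AS score
ORDER BY score DESC
```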
I have currently visualised a graph between myself and a number of other people.
My Current query is:
MATCH (p)-[:emailed]->(m)
WITH p,count(m) as rels, collect(m) as Contact
WHERE rels > 2
RETURN p,Contact, rels
It creates a pretty complex graph as per image below:
[Image: Messy Graph]
You can manually remove them by directly clicking on them as per below:
[Image: Manually hide node from visualisation]
Which then results in a very different looking graph.
Q. How do I change my query to automatically show the graph visualisation without the nodes that I wish to remove? (i.e. by editing the query, so I don't have to manually remove each one)
By doing either
A) Adding a list of the specific node IDs to ignore in the query, OR
B) (Ideally) Excluding all nodes that meet a criterion on a node property
In this case: ignore any node whose slug property contains 'myname'
MATCH (p)-[:emailed]->(m)
WITH p,count(m) as rels, collect(m) as Contact
WHERE rels > 2 AND NOT WHERE p.slug Contains 'Mahdi'
RETURN p,Contact, rels
Thanks for any help!
I would change it slightly. If you collect the actual :emailed relationships rather than just counting the nodes they are connected to, you can use them in your result set. Then, if you turn off autocomplete as JeromeB suggests above, you will actually see some relationships. If you turn off autocomplete in your current query, there will only be nodes and no relationships, which I don't think is what you are after (unless of course it is).
You should also check that the p.slug property exists when testing with CONTAINS; otherwise, if the property does not exist, that row will not produce any results.
MATCH (p:User)-[r:emailed]->(m:User)
WITH p, COLLECT(r) as rels, COLLECT(m) as contact
WHERE (NOT p.slug CONTAINS 'Mahdi' OR NOT EXISTS(p.slug))
AND size(rels) > 2
RETURN p, contact, rels
I would also add a label to the nodes in the match and an index on the slug property.
The autocomplete is 'Connect result nodes' in the Gear tab.
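Option A from the question (ignoring specific node IDs) can be sketched the same way. The IDs 101 and 102 below are made-up placeholders, and the :User label is the same assumption as in the query above.

```cypher
MATCH (p:User)-[r:emailed]->(m:User)
WHERE NOT id(p) IN [101, 102]   // placeholder node IDs to hide
WITH p, collect(r) AS rels, collect(m) AS contact
WHERE size(rels) > 2
RETURN p, contact, rels
```

Add a similar condition on m if the hidden nodes should also disappear as contacts.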
I am trying to implement the pointer functionality from https://neo4j.com/blog/moving-relationships-neo4j/ to use it as a team order machine. See http://imgur.com/a/MViF0 for a model. I am using this Cypher query.
MERGE (list:LIST)
WITH list
MATCH (u) WHERE ID(u) IN [421, 419, 420]
MERGE (team:TEAM{name: u.name})
MERGE (team)-[:PARTOF]->(list)
WITH collect(team) AS elems, list
FOREACH (n IN RANGE(0, LENGTH(elems)-2) |
FOREACH (prec IN [elems[n]] |
FOREACH (next IN [elems[n+1]] |
MERGE (prec)-[:NEXT]->(next))))
with list
MATCH (elem:TEAM) WHERE NOT (elem)<-[:NEXT]-()
MERGE (list)-[:POINTER]->(elem)
Now this works quite nicely, but I have only one problem. This line:
MATCH (u) WHERE ID(u) IN [421, 419, 420]
returns my original teams ordered by id, but I would like the order to follow the sequence given in the [421, 419, 420] list, like a function that does "return * order by my array input".
Keep in mind that it should work for any number of teams; this is just an example. Also, my original team node isn't labeled as a team but as something else, so we make a duplicate every time. Any input appreciated, thanks.
Try using UNWIND:
MERGE (list:LIST)
WITH list
UNWIND [421, 419, 420] as uid
MATCH (u) WHERE id(u) = uid
MERGE (team:TEAM{name: u.name})
...
[Update] Of course, it is also possible to compute the position of each node in the list explicitly:
MERGE (list:LIST)
WITH list, [3871013, 3871011, 3871012] as ids
MATCH (u) WHERE ID(u) IN ids
WITH list, u,
[x IN RANGE(0, size(ids)-1) WHERE ids[x] = id(u)] AS orderIndex
ORDER BY orderIndex[0] // Sort by node position in the array of identifiers
MERGE (team:TEAM{name: u.name})
...
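A variation on the same idea is to unwind the list positions rather than the IDs, so each row carries its own index. This is only a sketch: the IDs are the placeholders from the question, and it returns the ordered names instead of building the list.

```cypher
WITH [421, 419, 420] AS ids
// One row per position in the input list
UNWIND range(0, size(ids) - 1) AS idx
MATCH (u) WHERE id(u) = ids[idx]
RETURN u.name AS name, idx
ORDER BY idx
```

Replacing the RETURN with WITH u, idx ORDER BY idx lets the MERGE and collect steps continue from there.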
I have the nodes (a:charlie), (b:economy), and (c:bicycle). I want to create this pattern:
create (a:charlie)-[x:wants_make]->(b:economy)-[y:by_using]->(c:bicycle)
But it gives me a Cartesian product. I considered skipping node (b) and giving the [x:wants_make] relationship a property instead, but node (b) has many other relationships in the same (economic) context. What I want is the pattern above.
Any suggestion?
If your query looks like this:
MATCH (a:charlie), (b:economy), (c:bicycle)
MERGE (a)-[:wants_make]->(b)-[:by_using]->(c);
then it is saying both of these things:
Create a wants_make relationship between every charlie node and every economy node.
Create a by_using relationship between every economy node and every bicycle node.
So, if the numbers of charlie, economy, and bicycle nodes are C, E, and B, this results in (C * E * B) merges, which is a Cartesian product of a Cartesian product.
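To see how many rows (and therefore MERGE attempts) that MATCH produces on your own data, a quick check like this can help. With, say, 2 charlie, 3 economy, and 4 bicycle nodes it would report 24.

```cypher
// Counts the rows the three-way MATCH generates; each row triggers one MERGE
MATCH (a:charlie), (b:economy), (c:bicycle)
RETURN count(*) AS rows
```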
Also, your data model seems to be wrong. For example, it seems much more reasonable to have a Person label instead of a charlie label.
A more reasonable query might look something like this:
MERGE (a:Person {name: 'Charlie Brown'})
MERGE (c:Bicycle {id: 123})
MERGE (a)-[:wants_make]->(b:Economy)
MERGE (b)-[:by_using]->(c);
This query avoids Cartesian products by being more specific about the first and last nodes in the path, and it also avoids creating nodes and relationships that already exist.
And, going even further, you might want to combine wants_make, Economy, and by_using into a single economizes_by_using relationship:
MERGE (a:Person {name: 'Charlie Brown'})
MERGE (c:Bicycle {id: 123})
MERGE (a)-[:economizes_by_using]->(c);
You might need to break up your query a bit:
MATCH (a:charlie), (b:economy), (c:bicycle)
MERGE (a)-[:wants_make]->(b)
MERGE (b)-[:by_using]->(c)