Neo4j: why is the start node not found in the result set?

I haven't found a question about this or found any comment in the Neo4j manual.
This query returns the start node:
start n = node:node_auto_index(subject_id='A1')
match (n)-[]->()<-[]-(n)
return distinct n.subject_id;
==> +--------------+
==> | n.subject_id |
==> +--------------+
==> | "A1" |
==> +--------------+
==> 1 row
But this query does not return the start node. Is there any way to make it return the start node along with the other matching nodes?
start n = node:node_auto_index(subject_id='A1')
match (n)-[]->()<-[]-(s)
where s.subject_id = 'A1'
return distinct s.subject_id;
==> +--------------+
==> | s.subject_id |
==> +--------------+
==> +--------------+
==> 0 row
Just to be sure I have the syntax right, the previous query works on nodes other than the start node:
start n = node:node_auto_index(subject_id='A1')
match (n)-[]->()<-[]-(s)
where s.subject_id = 'B2'
return distinct s.subject_id;
==> +--------------+
==> | s.subject_id |
==> +--------------+
==> | "B2" |
==> +--------------+
==> 1 row

I think you ran into identifier uniqueness in Cypher paths.
Within a single path, two different identifiers (unless they are bound up front) won't point to the same node.
In your first example both ends of the path are bound to the same node, and in the last example you have two different nodes, one bound to n and the other bound to s.
In the second example the same node would end up bound to both n and s, which Cypher does not do within a path.

Related

Why does the number of nodes keep increasing in neo4j even though we don't create any nodes?

This is the number of nodes before I create the new one:
neo4j-sh (0)$ match n return n;
==> +------------------------------------------------------------------------+
==> | n |
==> +------------------------------------------------------------------------+
==> | Node[0]{} |
==> | Node[1]{address:"rioeduardo92#gmail.com",comment:"home",person_id:"1"} |
==> | Node[2]{address:"rioeduardo92#yahoo.com",comment:"work",person_id:"1"} |
==> | Node[3]{person_id:"1",name:"Rio"} |
==> +------------------------------------------------------------------------+
After I created a new node, it was assigned id 300:
neo4j-sh (0)$ create (n:lolo{color:'blue'}) return n;
==> +-------------------------+
==> | n |
==> +-------------------------+
==> | Node[300]{color:"blue"} |
==> +-------------------------+
Thank you
It's not the number of nodes that is increasing, but the internal node id. If, for example, you created a lot of nodes and then deleted them, your new node might have taken the next highest id (300) because the old ids haven't been recycled yet.
This is why you should never count on the internal node id to serve as an identifier/key for your nodes.
start n=node(*) return count(n)
should give you the true number of nodes in your graph.
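To make the difference between "highest id" and "node count" concrete, here is a minimal Python sketch. It is a toy model, not Neo4j's actual id allocator: the `ToyStore` class and its counter are hypothetical, and the only assumption is the one stated above, that deleted ids are not reused right away.

```python
import itertools


class ToyStore:
    """Toy node store: ids come from a counter and are never reused here."""

    def __init__(self):
        # Hypothetical allocator for illustration only, not Neo4j's.
        self._next_id = itertools.count()
        self.nodes = {}

    def create(self, **props):
        node_id = next(self._next_id)
        self.nodes[node_id] = props
        return node_id

    def delete(self, node_id):
        del self.nodes[node_id]


store = ToyStore()
ids = [store.create() for _ in range(300)]  # ids 0..299
for node_id in ids[4:]:                     # delete all but the first four
    store.delete(node_id)

new_id = store.create(color="blue")
print(new_id)            # 300: the next unused id
print(len(store.nodes))  # 5: the actual number of nodes
```

The new node gets id 300 even though only five nodes exist, which matches what the shell output in the question shows.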

neo4j creates another same instance on selecting data

My nodes are structured like this:
==> | Node[613]{name:"The Bigos",fs_id:"51a8e1a12fc6e7ef6d121077"}
==> | Node[614]{name:"Maceraperest",fs_id:"51bafb3d498ed54bd4c7fa8c"}
==> | Node[616]{name:"Viking",fs_id:"51bafe1de4b090ea9dceb20e"}
==> | Node[618]{name:"Metro Gross Market",fs_id:"51bb426c498e47af428ca013"}
When I try to create these nodes again, a PHP script I wrote checks fs_id to find out whether the node already exists. If it does, the script returns the existing node and does not create a new one.
The problem is that even though it does not create new nodes, the console shows that it did.
==> | Node[613]{name:"The Bigos",fs_id:"51a8e1a12fc6e7ef6d121077"}
==> | Node[613]{name:"The Bigos",fs_id:"51a8e1a12fc6e7ef6d121077"}
==> | Node[613]{name:"The Bigos",fs_id:"51a8e1a12fc6e7ef6d121077"}
==> | Node[614]{name:"Maceraperest",fs_id:"51bafb3d498ed54bd4c7fa8c"}
==> | Node[614]{name:"Maceraperest",fs_id:"51bafb3d498ed54bd4c7fa8c"}
==> | Node[614]{name:"Maceraperest",fs_id:"51bafb3d498ed54bd4c7fa8c"}
==> | Node[616]{name:"Viking",fs_id:"51bafe1de4b090ea9dceb20e"}
==> | Node[616]{name:"Viking",fs_id:"51bafe1de4b090ea9dceb20e"}
==> | Node[616]{name:"Viking",fs_id:"51bafe1de4b090ea9dceb20e"}
==> | Node[618]{name:"Metro Gross Market",fs_id:"51bb426c498e47af428ca013"}
==> | Node[618]{name:"Metro Gross Market",fs_id:"51bb426c498e47af428ca013"}
==> | Node[618]{name:"Metro Gross Market",fs_id:"51bb426c498e47af428ca013"}
Look at the node ids: they are the same! And if I explore node 618, for example, in the data browser, it returns a single node. Also, the query
start n=node(618) return n;
returns a single row. But the query below returns multiple rows with the same node id, and the row count increases each time I test the above nodes for existence.
start n=node(331) match n-[:BEEN]->(venues) return venues order by id(venues);
It might be nothing, but I'm curious whether Neo4j is somehow using extra memory for this or whether it is just something like a caching system.
You probably just have multiple BEEN relationships, and each of those relationships yields another result row.
If you just want one row per venue, do this:
start n=node(331)
match n-[:BEEN]->(venues)
return distinct venues;
To see the different relationships, use:
start n=node(331)
match n-[rel:BEEN]->(venues)
return venues,collect(rel);
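The effect can be mimicked outside Neo4j. The following Python sketch uses made-up relationship ids (10 to 13) and assumes, as suggested above, that node 331 has several BEEN relationships to the same venue; it only illustrates why one row appears per relationship.

```python
from collections import defaultdict

# Hypothetical data: (start node id, relationship id, type, end node id).
relationships = [
    (331, 10, "BEEN", 613),
    (331, 11, "BEEN", 613),
    (331, 12, "BEEN", 613),
    (331, 13, "BEEN", 614),
]

# One row per matching relationship, like `return venues`.
rows = [end for start, rel_id, rel_type, end in relationships
        if start == 331 and rel_type == "BEEN"]

# Like `return distinct venues`.
distinct_rows = sorted(set(rows))

# Like `return venues, collect(rel)`.
rels_by_venue = defaultdict(list)
for start, rel_id, rel_type, end in relationships:
    if start == 331 and rel_type == "BEEN":
        rels_by_venue[end].append(rel_id)

print(rows)                 # [613, 613, 613, 614]
print(distinct_rows)        # [613, 614]
print(dict(rels_by_venue))  # {613: [10, 11, 12], 614: [13]}
```

No node is duplicated here. Only the result rows repeat, once per relationship.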

Potential inconsistency in creating relationships in cypher when we don't use directions

I am confused by the way relationships are created through Cypher. I was under the impression that _src-[:likes]-_dst creates a bidirectional relationship, but it looks like that is not the case, as _src-[:likes]-_dst == _src<-[:likes]-_dst (example provided below).
Let's say I create the following graph using the _src-[:likes]-_dst notation (using '-' as opposed to '->'):
create
(_u1 {type:"User",`name`:"u1",userId:'u1' }) , ( _u2 {type:"User",`name`:"u2",userId:'u2'} ) , ( _u3 {type:"User",`name`:"u3",userId:'u3' }) , ( _u4 {type:"User",`name`:"u4",userId:'u4' }) , ( _u5 {type:"User",`name`:"u5",userId:'u5'}) , (_u6 {type:"User",`name`:"u6",userId:'u6'}),
(_f1 {type:"Item",`name`:"f1",itemId:'f1' }) , ( _f2 {type:"Item",`name`:"f2",itemId:'f2' }) , ( _f3 {type:"Item",`name`:"f3",itemId:'f3' }) , ( _f4 {type:"Item",`name`:"f4",itemId:'f4'}) , (_f5 {type:"Item",`name`:"f5",itemId:'f5'}),
_u1-[:`likes`{likeValue:3}]-_f1 , _u1-[:`likes` {likeValue:13}]-_f2 , _u1-[:`likes` {likeValue:1}]-_f3 , _u1-[:`likes` {likeValue:5}]-_f4,
_u2-[:`likes`{likeValue:7}]-_f1 , _u2-[:`likes` {likeValue:13}]-_f2 , _u2-[:`likes` {likeValue:1}]-_f3,
_u3-[:`likes`{likeValue:5}]-_f1 , _u3-[:`likes` {likeValue:8}]-_f2 , _u4-[:`likes`{likeValue:5}]-_f1
,_u5-[:`likes` {likeValue:8}]-_f2,_u6-[:`likes` {likeValue:8}]-_f2;
My impression was that this tells Neo4j to create a bidirectional relationship. Now, look at the following query:
neo4j-sh (?)$ start n=node(*) match n-[:likes]->m where has(n.type) and n.type='User' return n,m;
==> +-------+
==> | n | m |
==> +-------+
==> +-------+
==> 0 row
But the opposite works
neo4j-sh (?)$ start n=node(*) match n-[r]->m where has(n.type) and n.type="Item" return n,m limit 3;
==> +-----------------------------------------------------------------------------------------+
==> | n | m |
==> +-----------------------------------------------------------------------------------------+
==> | Node[7]{type:"Item",name:"f1",itemId:"f1"} | Node[4]{type:"User",name:"u4",userId:"u4"} |
==> | Node[7]{type:"Item",name:"f1",itemId:"f1"} | Node[3]{type:"User",name:"u3",userId:"u3"} |
==> | Node[7]{type:"Item",name:"f1",itemId:"f1"} | Node[2]{type:"User",name:"u2",userId:"u2"} |
==> +-----------------------------------------------------------------------------------------+
The question is: why is a-[:likes]-b equivalent to a<-[:likes]-b?
Now I create two more nodes and a relationship as instructed in the Cypher manual:
create (_u7 {type:"User",`name`:"u7",userId:'u7' });
create (_f7 {type:"Item",`name`:"f7",itemId:'f7' });
start src=node(*),dst=node(*) where src.name='u7' and dst.name='f7' create src-[:likes{likeValue:3}]-dst;
neo4j-sh (?)$ start n=node(*) match n-[r]->m where has(n.type) and n.type="User" return n,m limit 3;
==> +-------+
==> | n | m |
==> +-------+
==> +-------+
==> 0 row
Same results: we can't query from User to Item, but we can from Item to User.
Now, if I use the following method, things change:
create (_u {type:"User",`name`:"u8",userId:'u8' }) , ( _f {type:"User",`name`:"f8",userId:'f8'} ), _u-[:likes{likeValue:2}]-_f;
neo4j-sh (?)$ start n=node(*) match n-[r]->m where has(n.type) and n.type="User" return n,m limit 3;
==> +-------------------------------------------------------------------------------------------+
==> | n | m |
==> +-------------------------------------------------------------------------------------------+
==> | Node[19]{type:"User",name:"f8",userId:"f8"} | Node[18]{type:"User",name:"u8",userId:"u8"} |
==> +-------------------------------------------------------------------------------------------+
What is going on? These are my questions:
1- Why does create _src-[:likes]-_dst not create a bidirectional relationship?
2- If it can't, then why even allow _src-[:likes]-_dst for relationship creation? Why not force people to use directions when creating relationships?
3- What is the difference between the two methods I used to create relationships (u7-f7 and u8-f8)?
You can't create a bidirectional relationship using _src-[:likes]-_dst.
In Neo4j, a relationship must have exactly one direction. So to represent a bidirectional relationship, you have two options:
a) Create the relationship with a direction but ignore the direction when querying (_src-[:likes]-_dst will match both directions when it is part of a match clause)
b) Create two relationships, one in each direction
It appears that if you execute a create without a direction, such as _src-[:likes]-_dst, an incoming relationship is created for _src.
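Option (a) can be pictured with a small Python sketch. The `stored` list and both helper functions are hypothetical and exist only for illustration. The data is stored as Item to User to mirror the output observed in the question.

```python
# Each stored relationship has exactly one direction: (start, type, end).
# Hypothetical data, stored as Item -> User to mirror the observed output.
stored = [("f1", "likes", "u1"), ("f2", "likes", "u1")]


def match_directed(node, rel_type):
    """Like `node-[:rel_type]->m`: follow outgoing relationships only."""
    return [end for start, t, end in stored if start == node and t == rel_type]


def match_undirected(node, rel_type):
    """Like `node-[:rel_type]-m`: follow relationships in either direction."""
    out = [end for start, t, end in stored if start == node and t == rel_type]
    inc = [start for start, t, end in stored if end == node and t == rel_type]
    return out + inc


print(match_directed("u1", "likes"))    # []
print(match_directed("f1", "likes"))    # ['u1']
print(match_undirected("u1", "likes"))  # ['f1', 'f2']
```

The directed lookup from the user finds nothing, just like the User to Item query in the question, while the undirected lookup finds both items regardless of how the relationship was stored.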

Neo4j Cypher: Find exact match to array Node property in WHERE clause

Given a Neo4j node with an array property, how do I create a Cypher query to return only the node(s) that match an array literal?
Using the console I created a node with the array property called "list":
neo4j-sh (0)$ create n = {list: [1,2,3]};
==> +-------------------+
==> | No data returned. |
==> +-------------------+
==> Nodes created: 1
==> Properties set: 1
==> 83 ms
neo4j-sh (0)$ start n=node(1) return n;
==> +-----------------------+
==> | n |
==> +-----------------------+
==> | Node[1]{list:[1,2,3]} |
==> +-----------------------+
==> 1 row
==> 1 ms
However, my query does not return the Node that was just created given a WHERE clause that matches an array literal:
neo4j-sh (0)$ start n=node(1) where n.list=[1,2,3] return n;
==> +---+
==> | n |
==> +---+
==> +---+
==> 0 row
==> 0 ms
It's entirely possible I'm mis-using Cypher. Any tips on doing exact array property matching in Cypher would be helpful.
The console is always running the latest SNAPSHOT builds of Neo4j. The version refers to the Cypher syntax parser; we will point that out more clearly :)
Now, there has been some fixing around the array handling in Cypher, see https://github.com/neo4j/community/pull/815 and https://github.com/neo4j/community/issues/818, which are probably the changes that make the console work. This was merged in after 1.8.M07, so in order to get it to work locally, please download one of the latest 1.8-SNAPSHOT builds, build it from GitHub, or wait for 1.8.M08, which is due very soon.
/peter
Which version of Neo4j are you using?
Your same code works for me in 1.8M07.
http://console.neo4j.org/?id=p9cy6l
Update:
I get the same result (no results) in a local install via the web client. Maybe it's a web client issue?

understanding cypher output

I have a graph like this:
(2)<-[0:CHILD]-(1)-[1:CHILD]->(3)
In words: Node 1,2 and 3 (all with names); Edges 0 and 1
I wrote the following Cypher query:
START nodes = node(1,2,3), relationship = relationship(0,1)
RETURN nodes, relationship
and got as a result:
==> +-----------------------------------------------+
==> | nodes | relationship |
==> +-----------------------------------------------+
==> | Node[1]{name->"Risikogruppe2"} | :CHILD[0] {} |
==> | Node[1]{name->"Risikogruppe2"} | :CHILD[1] {} |
==> | Node[2]{name->"Beruf 1"} | :CHILD[0] {} |
==> | Node[2]{name->"Beruf 1"} | :CHILD[1] {} |
==> | Node[3]{name->"Beruf 2"} | :CHILD[0] {} |
==> | Node[3]{name->"Beruf 2"} | :CHILD[1] {} |
==> +-----------------------------------------------+
==> 6 rows, 0 ms
Now my question:
Why do I get every node twice and every relationship three times? I just want to get each of them once.
Thanks for your time ^^
The way Cypher works is very similar to SQL. When you declare your variables in the START clause, you're effectively doing the equivalent of from nodes, relationships in SQL, treating them as tables. You're getting a cartesian product of all possible values of the two because you're not using a match or where to filter them, so it's basically like:
select *
from nodes, relationships
where you forgot to put the foreign key relationship between the tables.
In Cypher, you usually do this with a match:
start n=node(1,2,3), r=relationship(0,1)
match n-[r]-m // find where the n nodes and the r relationships point (to m)
return *
But since you have no match, you get a cartesian product.
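The same arithmetic can be sketched in Python. This is only an analogy, not how Cypher is implemented: the `relationships` dict encodes the graph from the question, relationship 0 between nodes 1 and 2 and relationship 1 between nodes 1 and 3.

```python
from itertools import product

nodes = [1, 2, 3]
# Relationship id -> (start node, end node), taken from the graph in the question.
relationships = {0: (1, 2), 1: (1, 3)}

# No match clause: every node is paired with every relationship.
cartesian = list(product(nodes, relationships))

# With `match n-[r]-m`: keep only pairs where n is an endpoint of r.
matched = [(n, r) for n, r in cartesian if n in relationships[r]]

print(len(cartesian))  # 6
print(matched)         # [(1, 0), (1, 1), (2, 0), (3, 1)]
```

Three nodes times two relationships gives the six rows from the question. Adding the match condition keeps only the four pairs that are actually connected.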
You should only see the nodes and relationships once, unless you do some matching.
Tried to reproduce your problem, but I haven't been able to.
http://tinyurl.com/cobd8oq
Is it possible for you to create a console.neo4j.org example of your problem?
Thanks,
Andrés
