Neo4j Cypher Query to Get User Feed - neo4j

Assume that my project is Facebook. I want to display a feed that consists of both my own status updates and my friends' status updates.
Here are the relationships:
User KNOWS user
User UPDATES_STATUS status
This is how I get my friends' status updates:
START me = node(1) MATCH me-[:KNOWS]-()-[:UPDATES_STATUS]->friendsStatusUpdates RETURN friendsStatusUpdates
And this is how I get my own status updates:
START me = node(1) MATCH me-[:UPDATES_STATUS]->myStatusUpdates RETURN myStatusUpdates
Both queries work fine but I need a single query that combines these two.

Here is the answer I got from Google Groups:
START me = node(1) MATCH me-[:KNOWS*0..1]-()-[:UPDATES_STATUS]->statusUpdate RETURN DISTINCT statusUpdate
All I had to do was add the *0..1 depth indicator to the KNOWS relationship so that the match covers both depth 0 (me) and depth 1 (my friends).
Edit: I had to add DISTINCT because without it the query includes the depth-0 nodes twice, which results in duplicates.
An alternative query that returns the same results using a WITH clause:
START me = node(1)
MATCH me-[:KNOWS*0..1]-friend
WITH DISTINCT friend
MATCH friend-[:UPDATES_STATUS]->statusUpdate
RETURN DISTINCT statusUpdate
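For readers on newer Neo4j releases, where START is deprecated and nodes in a pattern must be written in parentheses, the same idea might be sketched as follows. The :User label and the id property used to find the starting user are assumptions for illustration; the original looks the node up by internal id with node(1).

```cypher
// Sketch only: the :User label and the id property are assumed, not taken from the original.
MATCH (me:User {id: 1})-[:KNOWS*0..1]-(friend)-[:UPDATES_STATUS]->(statusUpdate)
// *0..1 lets friend be me (0 hops) or a direct friend (1 hop).
RETURN DISTINCT statusUpdate
```

DISTINCT is kept for the same reason as in the queries above.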

Create a HIMSELF relationship from each user node to itself; then the query is:
START me = node(1) MATCH me-[:KNOWS|HIMSELF]-()-[:UPDATES_STATUS]->friendsStatusUpdates RETURN friendsStatusUpdates

http://docs.neo4j.org/chunked/milestone/introduction-pattern.html
START me = node(1)
MATCH me-[:UPDATES_STATUS*1..2|KNOWS]-myStatusUpdates
RETURN myStatusUpdates
In case *1..2 won't work together with the | operator, do this:
START me = node(1)
MATCH friendsStatusUpdates2-[?:UPDATES_STATUS]-me-[:KNOWS]-()-[:UPDATES_STATUS]->friendsStatusUpdates
RETURN DISTINCT friendsStatusUpdates, friendsStatusUpdates2
Just edit the RETURN statement with some aggregation function so that you get one status per row.

Related

Do not return set of nodes from a specific path in Cypher

I am trying to return a set of nodes from 2 sessions, with the condition that a returned node must not be present in another (third) session. I am using the following code, but it is not working as intended.
MATCH (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
UNWIND ['abc1', 'abc2'] as session_id
MATCH (target:Session {session_id: session_id})-[r:HAS_PRODUCT]->(product:Product)
where p<>product
WITH distinct product.products_id as products_id, r
RETURN products_id, count(r) as score
ORDER BY score desc
This query was supposed to return all nodes present in abc1 & abc2 but not in abc3. This query is not excluding all products present in abc3. Is there any way I can get it working?
UPDATE 1:
I tried to simplify it without UNWIND, like this:
match (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
MATCH (target:Session {session_id: 'abc1'})-[r:HAS_PRODUCT]->(product:Product)
where product <> p
WITH distinct product.products_id as products_id
RETURN products_id
Even this is not working. It returns all items present in abc1 without removing those that are already in abc3. It seems that where product <> p is not working correctly.
I would suggest checking whether the nodes are in a list and, to prove out the approach, starting with a very simple example.
Here is a simple Cypher query showing one way to do it. This approach can then be extended to the more complex query:
// get first two product IDs as a list
MATCH (p:Product)
WITH p LIMIT 2
WITH COLLECT(ID(p)) as list
RETURN list
// now show two more product IDs which are not in that list
MATCH (p:Product)
WITH p LIMIT 2
WITH COLLECT(ID(p)) as list
MATCH (p2:Product)
WHERE NOT ID(p2) in list
RETURN ID(p2) LIMIT 2
Note: I'm using the ID() of the nodes instead of the entire node, same dbhits but may be more performant...
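The reason where product <> p does not exclude anything is that the two MATCH clauses produce one row per (p, product) pair. A product from abc1 is dropped only on the row where it is paired with itself, and it survives on every row where it is paired with a different abc3 product. Applying the list idea above to the sessions from the question might look like the sketch below. It assumes the same labels and properties as the question and has not been run against the asker's data.

```cypher
// Collect the products of the excluded session into a single list first.
MATCH (:Session {session_id: 'abc3'})-[:HAS_PRODUCT]->(p:Product)
WITH collect(p) AS excluded
// Then match the other sessions and keep only products missing from that list.
MATCH (target:Session)-[r:HAS_PRODUCT]->(product:Product)
WHERE target.session_id IN ['abc1', 'abc2']
  AND NOT product IN excluded
RETURN product.products_id AS products_id, count(r) AS score
ORDER BY score DESC
```

Because collect() without grouping keys always yields one row, the second MATCH still runs (with an empty list) when abc3 has no products.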

cql doesnt pass node in WITH statement

I'm using the following query to count the number of created users and to create a user if a user with that id doesn't exist:
MERGE (uc:UserCounter)
ON CREATE SET uc.count = 0
WITH uc
MATCH (u:User{id:X})
WITH uc, count(u) as counts
MERGE (u:User{id:X})
ON CREATE SET uc.count = uc.count+1, u.id = uc.count, u.creation_ts = TIMESTAMP()
RETURN counts
I'm also returning counts to see whether the user existed before or not. This query gives me back (no rows). After some debugging, I came to the conclusion that the uc node is not being passed through to the end. What am I missing?
This looks like the same problem as this question: if the user doesn't exist yet, the MATCH will not return any row despite the count() aggregation. You'll need an OPTIONAL MATCH for it to work instead.
Your query seems off though: why would you match/merge on id X, then overwrite it on creation with the current count? It's also doubtful that it would work correctly when executed concurrently.
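A minimal sketch of the OPTIONAL MATCH variant is shown below. It assumes the id is passed as a parameter named $userId (a hypothetical name; on Neo4j versions before 3.0 the {userId} form would be needed instead), and it leaves out the u.id = uc.count assignment questioned above. It does not address the concurrency concern.

```cypher
MERGE (uc:UserCounter)
ON CREATE SET uc.count = 0
WITH uc
// OPTIONAL MATCH keeps the row (with existing = null) when no such user exists yet.
OPTIONAL MATCH (existing:User {id: $userId})
// count() ignores nulls, so counts is 0 for a new user and 1 for an existing one.
WITH uc, count(existing) AS counts
MERGE (u:User {id: $userId})
ON CREATE SET uc.count = uc.count + 1, u.creation_ts = timestamp()
RETURN counts
```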

Linked list query takes too long

I have built a linked list model using Neo4j. Here is a representation:
A user has a list of Events and each one has two attributes: date and done. Given a particular time, I would like to set all previous events' done attribute to true.
My current query is this one:
MATCH (user:User {id: {myId} })-[rel:PREV*]->(event:Event {done:false})
WHERE event.date <= {eventTime}
SET event.done = true;
This query takes 12 sec when the list has 500 events and I would like to make it faster. One possibility would be to stop the query once it finds an event which is already done, but I don't know how to do it.
You can use shortestPath for this and it will be much faster. In general, you should never use [:REL_TYPE*] because it does an exhaustive search for every path of any length between the nodes.
I created your data:
CREATE (:User {id:1})-[:PREV]->(:Event {id:1, date:1450806880004, done:false})-[:PREV]->(:Event {id:2, date:1450806880003, done:false})-[:PREV]->(:Event {id:3, date:1450806880002, done:true})-[:PREV]->(:Event {id:4, date:1450806880002, done:true});
Then, the following query will find all previous Event nodes in a particular User's linked list where done=false and the date is less than or equal to, say, 1450806880005.
MATCH p = shortestPath((u:User)-[:PREV*]->(e:Event))
WHERE u.id = 1 AND
e.done = FALSE AND
e.date <= 1450806880005
RETURN p;
This yields:
p
[(6:User {id:1}), (6)-[6:PREV]->(7), (7:Event {date:1450806880004, done:false, id:1})]
[(6:User {id:1}), (6)-[6:PREV]->(7), (7:Event {date:1450806880004, done:false, id:1}), (7)-[7:PREV]->(8), (8:Event {date:1450806880003, done:false, id:2})]
So you can see it's returning two paths, one that terminates at Event with id=1 and another that terminates at Event with id=2.
Then you can do something like this:
MATCH p = shortestPath((u:User)-[:PREV*]->(e:Event))
WHERE u.id = 1 AND e.done = FALSE AND e.date <= 1450806880005
FOREACH (event IN TAIL(NODES(p)) | SET event.done = TRUE)
RETURN p;
I'm using TAIL here because it grabs all the nodes except for the first one (since we don't want to update this property for the User node). Now all of the done properties have been updated on the Event nodes:
p
[(6:User {id:1}), (6)-[6:PREV]->(7), (7:Event {date:1450806880004, done:true, id:1})]
[(6:User {id:1}), (6)-[6:PREV]->(7), (7:Event {date:1450806880004, done:true, id:1}), (7)-[7:PREV]->(8), (8:Event {date:1450806880003, done:true, id:2})]
EDIT: And don't forget the super fun bug where the shortestPath function silently sets the maximum hop limit to 15 in Neo4j < 2.3.0. See
ShortestPath doesn't find any path without max hops limit
Find all events between 2 dates
So if you're on Neo4j < 2.3.0, you'll want to do:
MATCH p = shortestPath((u:User)-[:PREV*..1000000000]->(e:Event))
WHERE u.id = 1 AND e.done = FALSE AND e.date <= 1450806880005
FOREACH (event IN TAIL(NODES(p)) | SET event.done = TRUE)
RETURN p;
Your question is fairly vague with performance issues and targets, but one critical thing for performance is creating an index on properties you are examining. In your case, that would mean creating an index on both the done property and the date property:
CREATE INDEX ON :Event(done)
CREATE INDEX ON :Event(date)
Additionally, your query retrieves all events in the entire history of a user, as seen in:
-[rel:PREV*]->
You could cap the depth, such as
-[rel:PREV*..20]->
to prevent complete traversal. That might not give you the outcome you're looking for, but it would prevent long-running queries if you have an extreme number of nodes in your linked list (you haven't specified how large that list could get, so I have no idea if this will actually help).
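The asker's idea of stopping at the first event that is already done can also be written as a predicate over the whole path. The sketch below is one possible formulation, not a tested fix: it assumes that every event behind the first done event is itself done, and it uses $-style parameters (Neo4j 3.0 or later; older versions need the {myId} form from the question). Whether the planner actually stops expanding early depends on the Neo4j version, so it is worth checking with PROFILE.

```cypher
MATCH p = (user:User {id: $myId})-[:PREV*]->(event:Event)
// tail() skips the User node; the path is kept only while every event on it is not done.
WHERE ALL(e IN tail(nodes(p)) WHERE e.done = false)
  AND event.date <= $eventTime
SET event.done = true
```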

How to update Nodes within a random manner in Neo4j

How can I update a random set of nodes in Neo4j? I tried the following:
match (Firstgraph)
with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph)
order by rand();
match (G1:FirstGraph)
where id(G1)=Id
set G1.Version=5
My idea is to get a random set and then update it, but I got the error:
Expected exactly one statement per query but got: 2
Thanks for your help.
Let's find out what the problem is here. First of all, your error:
Expected exactly one statement per query but got: 2
This comes from your query: if we check it, we see that you put two queries in the same statement, which is why you get this error.
match (Firstgraph) with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph) order by
rand(); match (G1:FirstGraph) where id(G1)=Id set G1.Version=5
This is not a valid query because you can't use ; inside a query. It is the end-of-query marker, so you can't run another query after it. You can, however, use UNION:
match (Firstgraph) with id(Firstgraph) as Id
return
Firstgraph.name, Firstgraph.version,id(Firstgraph) order by rand()
UNION
match (G1:FirstGraph) where id(G1)=Id set G1.Version=5
Also, if you want to match a random set of nodes, you can simply do this (this example gives each node a 50% chance of being selected):
Match (node) Where rand() > 0.5 return node
And then do whatever you want with the node using WITH.
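Note that UNION requires both parts to return the same columns, and a variable such as Id is not visible across it, so selecting and updating in a single statement may be the simpler route. The sketch below is one way to do that under stated assumptions: FirstGraph is treated as a node label (as in the second half of the asker's query), and the sample size of 10 is arbitrary.

```cypher
MATCH (g:FirstGraph)
// Attach a random sort key to every node, then keep the first 10.
WITH g, rand() AS r
ORDER BY r
LIMIT 10
SET g.Version = 5
RETURN g.name, g.Version
```

Replacing the WITH ... LIMIT part with WHERE rand() > 0.5 gives the probabilistic selection described above instead of a fixed-size sample.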

Neo4j cypher liking a status update

I am trying to implement a liking mechanism for the demo newsfeed shown here http://docs.neo4j.org/chunked/stable/cypher-cookbook-newsfeed.html
So basically when a user clicks like on a status update I want to link the user node to the status update node. However I want to search for the status update node through the author's node. Hence I use something like the following:
MATCH (n:user)-[:STATUSUPDATE]->(m)-[:NEXT*]->(o)
WHERE n.username = "pewpewlasers" AND (m.permalink = "acode" OR o.permalink = "acode")
MATCH (p:user)
WHERE id(p)=1
CREATE (p)-[x:LIKED]->(o)
return x
Basically what I am trying to achieve is, finding a status update node through the author's node, and then looking for the update with a permalink code.
When found I want to connect the liker user's node to the status update through a LIKED relationship.
However, you probably already see the problems with this cypher.
This cypher requires that the permalink is one of the nodes that is connected with NEXT relationship, otherwise if the first node (connected by the STATUSUPDATE relationship) contains the permalink, it selects all status update nodes connected by the NEXT relationship. The user will thus end up liking all posts. What is probably required is like the following:
MATCH (n:user)-[:STATUSUPDATE]->(m)-[:NEXT*]->(o)
WHERE n.username = "pewpewlasers" AND m.permalink = "acode"
-- IF THE ABOVE GIVES AN OUTPUT THEN --
MATCH (p:user)
WHERE id(p)=1
CREATE (p)-[x:LIKED]->(m)
return x
-- ELSE --
MATCH (n:user)-[:STATUSUPDATE]->(m)-[:NEXT*]->(o)
WHERE n.username = "pewpewlasers" AND o.permalink = "acode"
MATCH (p:user)
WHERE id(p)=1
CREATE (p)-[x:LIKED]->(o)
return x
Here's a way to get around your problem.
START p = node(1) // Do you really want to use node numbers?
MATCH (n:user {username : 'pewpewlasers'})-[:STATUSUPDATE|:NEXT*]->(m {permalink : 'acode'})
CREATE (p)-[x:LIKED]->(m)
By using the '|' and multimatch in the relationship part of the MATCH clause, the 'm' node is able to match any part of the status update chain. If you really are going to use node numbers (the output of the 'id()' function) to get your liking user node, it's probably faster to do it as shown above.
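On versions where START is no longer available, the same approach might be sketched with MATCH only, as below. This is an illustration rather than the answerer's exact query: it swaps CREATE for MERGE so that repeating the like does not add a second LIKED relationship, and it keeps the lookup by id(p) = 1 from the question.

```cypher
// m can be the first status update or any later one reached through NEXT.
MATCH (n:user {username: 'pewpewlasers'})-[:STATUSUPDATE|NEXT*]->(m {permalink: 'acode'})
MATCH (p:user)
WHERE id(p) = 1
// MERGE (an assumption, not in the original) creates the relationship only if it is missing.
MERGE (p)-[x:LIKED]->(m)
RETURN x
```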

Resources