I am following this article to set up a system where users can follow each other and also become friends: http://neo4j.com/docs/stable/cypher-cookbook-newsfeed.html
A user (A) can friend another user (B), in which case A automatically follows B. User A can also follow B without adding B as a friend, so the feed results need to distinguish the two cases. If A and B are not confirmed friends, A should get only the status updates from B that are marked public. If A is a confirmed friend of B, A should get all status updates from B. Even as B's friend, A can also unfollow B's feed (the typical Facebook model?). So basically I need to check who A follows and grab their updates, while also checking whether A has access to each of those status updates.
Is there an easy Cypher query to implement this? Or do you have a better model in mind? Assuming all updates are public, the following query should work. How would you add a privacy-setting dimension to it if there are friends-only posts too?
MATCH (me { name: 'Joe' })-[rels:FOLLOWS*0..1]-(anotherUser)
WITH anotherUser
MATCH (anotherUser)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)
RETURN anotherUser.name AS name, statusupdates.date AS date, statusupdates.text AS text
ORDER BY statusupdates.date DESC LIMIT 3
Yes, you can implement all of these requirements; it seems to me to boil down to a few extra, carefully chosen WHERE clauses.
Here's your base query, with modifications:
MATCH (me { name: 'Joe' })-[rels:FOLLOWS*0..1]-(anotherUser)
WITH anotherUser
MATCH (anotherUser)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)
WHERE statusupdates.visibility='PUBLIC'
RETURN anotherUser.name AS name, statusupdates.date AS date, statusupdates.text AS text
ORDER BY statusupdates.date DESC LIMIT 3
Here, I've just added a WHERE to check for visibility=PUBLIC (which I made up, because the sample app doesn't specify those things; that would have to be part of your model one way or another).
You might consider doing that query along with a UNION to another query, which would be intended to fetch only those status updates from friends. (If it's a friend, then it doesn't matter what the visibility is)
MATCH (me { name: 'Joe' })-[:FRIEND]-(friend)-[:STATUS|NEXT*1..]->(statusupdates)
RETURN statusupdates
ORDER BY statusupdates.date DESC LIMIT 3;
Instead of using UNION, you could also combine the two queries with an OPTIONAL MATCH clause on the second pattern. Either way, your query needs to return every status update that is a public post from someone you follow, a post from a friend, or both. Conceptually it's easy to break that into those two separate cases and then UNION the two result sets together.
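As a rough sketch of the UNION approach, the two cases might be combined as below. It assumes that FOLLOWS points from the follower to the followed user, and it reuses the made-up visibility property from above. UNION requires both parts to return the same column names, so the friends part returns name, date and text here rather than the whole node.

```cypher
// Case 1: public updates from anyone Joe follows
// (assumption: FOLLOWS is directed follower -> followed)
MATCH (me { name: 'Joe' })-[:FOLLOWS]->(followed)-[:STATUS|NEXT*1..]->(statusupdates)
WHERE statusupdates.visibility = 'PUBLIC'
RETURN followed.name AS name, statusupdates.date AS date, statusupdates.text AS text
UNION
// Case 2: every update from Joe's confirmed friends, whatever its visibility
MATCH (me { name: 'Joe' })-[:FRIEND]-(friend)-[:STATUS|NEXT*1..]->(statusupdates)
RETURN friend.name AS name, statusupdates.date AS date, statusupdates.text AS text
```

UNION (as opposed to UNION ALL) drops duplicate rows, so an update that is both public and posted by a friend shows up once. Sorting and limiting the combined rows is left out of the sketch; how to do that over a whole UNION depends on the Neo4j version.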
I am trying to write a query which looks for potential friends in a Neo4j db based on common friends and interests.
I don't want to post the whole query (part of school assignment), but this is the important part
MATCH (me:User {firstname: "Name"}), (me)-[:FRIEND]->(friend:User)<-[:FRIEND]-(potential:User), (me)-[:MEMBER]->(i:Interest)
WHERE NOT (potential)-[:FRIEND]->(me)
WITH COLLECT(DISTINCT potential) AS potentialFriends,
COLLECT(DISTINCT friend) AS friends,
COLLECT(i) as interests
UNWIND potentialFriends AS potential
/*
#HANDLING_FINDINGS
Here I count common friends, interests and try to find relationships between
potential friends too -- hence the collect/unwind
*/
RETURN potential,
commonFriends,
commonInterests,
(commonFriends+commonInterests) as totalPotential
ORDER BY totalPotential DESC
LIMIT 10
In the section #HANDLING_FINDINGS I use the found potential friends to find relationships between each other and calculate their potential (i.e. sum of shared friends and common interests) and then order them by potential.
The problem is that there might be users with no friends, and I would like to recommend friends to them as well.
My question - can I somehow insert a few random users into the "potential" findings if their count is below 10 so that everyone gets a recommendation?
I have tried something like this
...
UNWIND potentialFriends AS potential
CASE
WHEN (count(potential) < 10 )
...
But that produced an error as soon as it hit start of the CASE. I think that case can be used only as part of a command like return? (maybe just return)
Edit with 2nd related question:
I was already thinking of matching all users and then ranking them based on common friends/interests, but wouldn't searching through the whole DB be intensive?
A CASE expression can be used wherever a value is needed, but it cannot be used as a complete clause.
With respect to your main question, you can put a WITH clause like the following between your existing WITH and UNWIND clauses:
WITH friends, interests,
CASE WHEN SIZE(potentialFriends) < 10 THEN {randomFriends} ELSE potentialFriends END AS potentialFriends
If the size of the potentialFriends collection is less than 10, the CASE expression assigns the value of the {randomFriends} parameter to potentialFriends.
As for your second question: yes, it would be expensive.
I work in a company that makes a social game where our users can have friends and can make content that based on popularity shows up on highscores.
I am trying to find out whether we can move some of our data to a graph database like neo4j and one of the things I can't figure out is how to implement a highscore system in a graph database. I basically want to make queries like this:
Get list of movies/artbooks/photos content created by friends ordered by content with most likes.
Get list of movies/artbooks/photos content created by ALL USERS in the last 7 days ordered by content with most likes.
What kind of data modeling and queries should we do to implement this?
The data model I was planning was to have users as nodes, with the content made by a user linked to that user as a list of connected content nodes (the latest one linked directly to the user). But how do I get highscores into such a model?
Thanks.
Here is one possible model:
(f:User {name: "Fred"})-[:CREATED]->(c:Content {created: 2345, type: "Music"})
(m:User {name: "Mary"})-[:LIKES {score:5}]->(c:Content)
(f)-[:KNOWS]->(m);
To get the content created by all Users since a specific timestamp, in descending order by the number of likes, you can use the following query. The OPTIONAL MATCH is used to avoid filtering out Content with no likes.
MATCH (c:Content)
WHERE c.created > 1234
OPTIONAL MATCH ()-[l:LIKES]->(c)
RETURN c, COUNT(l) AS num_likes
ORDER BY num_likes DESC;
Here is a console that illustrates this.
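For the other query in the question (content created by friends, ordered by likes), a sketch along the same lines could look like this. It assumes that the KNOWS relationship in the model above stands for friendship and may be traversed in either direction; "Fred" is just the example user from the model.

```cypher
// Content created by Fred's friends, most-liked first.
// Assumption: KNOWS represents friendship and direction does not matter.
MATCH (me:User {name: "Fred"})-[:KNOWS]-(friend:User)-[:CREATED]->(c:Content)
// OPTIONAL MATCH keeps Content that has no likes yet (counted as 0)
OPTIONAL MATCH ()-[l:LIKES]->(c)
RETURN c, COUNT(l) AS num_likes
ORDER BY num_likes DESC;
```

A WHERE c.type = "Music" (or similar) after the first MATCH would restrict the list to one kind of content.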
I have been reading about Graph databases and want to know if this type of structure is applicable to it:
Company > Has user Accounts > Accounts send out facebook posts (which are available to all users)
Up to here - I think this makes sense - yes it would be a good use of Graph. A post has a relationship to any accounts and you can find out the direction both ways - posts for a company and which posts were sent by which users or companies.
However
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time
Accounts are getting results for each post (likes/friends) which I need to store on a daily basis
I need to find out how many likes a company received (on any given day)
I also need to find out how many likes a user received
I need to find out how many likes a user received per post
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
I struggle at this point, unless you are storing lots of date-value property lists per node. Is that the way you would do it? If I wanted to find out the latter two points, for example, would it be as efficient as an RDBMS?
Here is a very simple example of a Graph data model that seems to cover your stated use cases. (Since nodes can have multiple labels, all Company and User nodes are also Entity nodes -- to simplify the model.)
(:Company:Entity {id:100})-[:HAS_USER]->(:User:Entity {id: 200})
(:Entity)-[:SENT]->(:Post {date: 123, msg: "I like cats!"})
(:Entity)-[:LIKES {date: 234}]->(:Post)
Your use cases:
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time.
How to count all users:
MATCH (u:User)
RETURN COUNT(*);
How to count a company's users:
MATCH (c:Company {id:100})-[:HAS_USER]->(u:User)
RETURN COUNT(*);
I need to find out how many likes a company received (on any given day)
MATCH (c:Company {id: 100})-[:SENT]->(p:Post)<-[:LIKES {date:234}]-()
RETURN COUNT(*)
I also need to find out how many likes a user received
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN COUNT(*);
I need to find out how many likes a user received per post
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN p, COUNT(*)
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
You do not need to explicitly group likes by date (if that is what you mean). Such "groupings" can easily be obtained with an appropriate query (e.g., the query above that counts a company's likes on a given day).
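As an illustration of grouping at query time, the following sketch counts a user's likes per day. It assumes that the date property on each LIKES relationship holds a day-granularity value, as in the sample model; Cypher groups implicitly by the non-aggregated column.

```cypher
// Likes received by user 200, one row per day.
// Assumption: l.date is stored at day granularity.
MATCH (u:User {id: 200})-[:SENT]->(p:Post)<-[l:LIKES]-()
RETURN l.date AS day, COUNT(*) AS likes
ORDER BY day;
```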
It seems I am too tired to find the solution; maybe someone has a hint for me.
I have built a graph in Neo4j which I connect to via neo4jphp (everyman). In the browser the graph looks OK: every user exists once and may belong to several groups.
While creating the users I use MERGE to avoid duplicates, like this:
MERGE (user:PERSON {
firstname: "Max",
name: "Muster",
password:"e52ddddddd9afb7b373f9da437",
title:"something",
login:"Nick",
status:"active"
})
ON CREATE SET user.uuid = "'.uniqid().'" // PHP function for a UUID
return user;
This works well as I see the correct number of users even when I resend the query or reload the page.
The users are connected with a query towards groups like this
MATCH (user:PERSON), (team:GROUP)
WHERE user.name= "Muster" AND user.firstname="Max" AND team.name="LOCAL_USER"
CREATE (user)-[:IS_MEMBER_OF {role:"user", status:"active"}]->(team);
Checking this in the GUI of Neo4J shows a correct graph (at least from what I can see). I have the right amount of users and their relations.
When I query the graph directly by Cypher in the browser GUI like this
MATCH (user:PERSON {status: "active"})-[relation:IS_MEMBER_OF{status:"active"}]->(team:GROUP {name:"LOCAL_USER"} )
RETURN user
ORDER BY user.name;
I get the correct number of users.
When I use the neo4jphp lib (everyman), I receive some users twice: the result set has several elements with the same user. I couldn't figure out why the two behave differently, but I assume that I might have messed up the relations somehow. Still, I am wondering why the same Cypher query returns a different number of records when sent via the GUI than via the everyman lib. I would need a hint on how to change the queries so that I get only one record per user, as every user is connected to the LOCAL_USER group only once.
Thanks for pushing me to the right direction.
I think you get duplicates because you have multiple paths (e.g. multiple teams) for the user.
Use RETURN DISTINCT user.
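Applied to the query from the question, that would look roughly like the first statement below. The second statement is only a guess at a possible cause: if the duplicates come from repeated IS_MEMBER_OF relationships (for example because the CREATE statement ran more than once), using MERGE on the relationship would avoid adding a second copy.

```cypher
// One row per user, however many matching paths exist
MATCH (user:PERSON {status: "active"})-[relation:IS_MEMBER_OF {status: "active"}]->(team:GROUP {name: "LOCAL_USER"})
RETURN DISTINCT user
ORDER BY user.name;

// Assumption: duplicates stem from re-running CREATE.
// MERGE only creates the relationship if it does not exist yet.
MATCH (user:PERSON), (team:GROUP)
WHERE user.name = "Muster" AND user.firstname = "Max" AND team.name = "LOCAL_USER"
MERGE (user)-[:IS_MEMBER_OF {role: "user", status: "active"}]->(team);
```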
Your import statement is the wrong way round. Instead of your approach,
MERGE on a unique id (e.g. login in your case) and set the other properties with ON CREATE SET ...
Also, use parameters rather than literal values in your query strings!
MERGE (user:PERSON {login:{login}})
ON CREATE SET
user.firstname = {firstname}, user.name= {name}, user.password = {password},
user.title={title}, user.status = {status}, user.uuid = {uuid}
RETURN user;
I have read the Neo4j manual and saw the numerous short examples regarding movie graph. I have also installed it locally and played with the cypher.
Here is the setup:
I have the following nodes: Movies (with name and id, owned by friend), Actors (with name and id), Directors (with name and id), Genre (with id and name)
Relations are: Actors acted in Movies (1 movie - many actors), Directors directed a movie (1 director per movie, but a director can direct many movies), and Movies have several genres (many to many)
1) Owned by friend: in the LOAD CSV example they model USA as a node rather than a property, and I don't know why. Is there a logical reason why it is better to make it a node rather than a property, as I did?
2)
What I want to search is similar to the answer given to this question:
Nearest nodes to a give node, assigning dynamically weight to relationship types
However - I do not have a weight on the relationship and its more of a "go find the first give nodes connected to it"
Given that the "owned by friend" can only be owned by 1 person.
If given the movie title "Spider-Man" (which, for example purposes, is owned by Frank), go find the next occurrence of a movie that is owned by John.
So after reading the Neo4j manual, I believe that I don't need to specify which relationship to traverse, but can just go find the next movie that meets my criteria, right?
So Following the above link
MATCH (n:Start { title: 'Spider-Man' }),
(n)-[:CONNECTED*0..2]-(x)
RETURN x
So: go to the Spider-Man node and find x as long as it is connected. But I got stumped by *0..2, because it is a range. What if I just want to say "go find me the first one that is owned by John"?
3) Following up on #2: how do I insert the filter "owned by John"?
There are a number of things in your question that don't quite make sense. Here's a stab at an answer.
1) Making 'USA' a node rather than a property is useful if you want to search based on country. If 'USA' is a node, you are able to limit your search by starting at the 'USA' node. If you don't care to do this, then it doesn't really matter. It may also save a small amount of space for longer country names to store the name once and link to it via relationships.
2) Your example doesn't match your described graph. I can't really speak to it without a better example.
3) This is probably easy to answer once you improve your example.
OK. Based on the comments to the answer, here's what you need. To find one movie owned by John that is connected via common actors, directors, etc to the movie Spider-man owned by Frank (that is, sub-graphs like (movie)<--(actor)-->(movie) ) you can write:
MATCH (n:Movie {title : 'Spider-Man', owned_by : 'Frank'})<-[*2]->(m:Movie {owned_by : 'John'})
RETURN m LIMIT 1
If you want more results, alter or remove the LIMIT on the RETURN clause. If you want to allow longer chains like (movie)<--(actor)-->(movie)<--(director)-->(movie), you can increase the number of relationships matched (the *2) to 4, 6, 8, etc. You probably shouldn't write the relationship part of the MATCH clause as an unbounded -[*]-, though: Cypher will not traverse the same relationship twice within one match, so it cannot loop forever, but the number of paths to explore can explode on a well-connected graph.
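To get at the "nearest" movie rather than an arbitrary match, one option is to give the pattern a bounded range and order by path length. This is only a sketch: it keeps the owned_by property from the query above, and the upper bound of 4 is an arbitrary choice.

```cypher
// Closest movie owned by John, within 2 to 4 hops of Frank's Spider-Man.
// The upper bound (4) is an assumption; raise it to allow longer chains.
MATCH p = (n:Movie {title: 'Spider-Man', owned_by: 'Frank'})-[*2..4]-(m:Movie {owned_by: 'John'})
RETURN m, length(p) AS hops
ORDER BY hops
LIMIT 1
```

length(p) is the number of relationships in the path, so the row with the fewest hops comes first.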