I work in a company that makes a social game where our users can have friends and can make content that based on popularity shows up on highscores.
I am trying to find out whether we can move some of our data to a graph database like neo4j and one of the things I can't figure out is how to implement a highscore system in a graph database. I basically want to make queries like this:
Get list of movies/artbooks/photos content created by friends ordered by content with most likes.
Get list of movies/artbooks/photos content created by ALL USERS in the last 7 days ordered by content with most likes.
What kind of data modeling and queries should we do to implement this?
The datamodel I was planning to do was to have users as nodes and the content made by a user linked to the user as a list of connected content nodes with the latest one linked to the user, but how do I get highscores into such a model.
Thanks.
Here is one possible model:
(f:User {name: "Fred"})-[:CREATED]->(c:Content {created: 2345, type: "Music"})
(m:User {name: "Mary"})-[:LIKES {score:5}]->(c:Content)
(f)-[:KNOWS]->(m);
To get the content created by all Users since a specific timestamp, in descending order by the number of likes, you can use the following query. The OPTIONAL MATCH is used to avoid filtering out Content with no likes.
MATCH (c:Content)
WHERE c.created > 1234
OPTIONAL MATCH ()-[l:LIKES]->(c)
RETURN c, COUNT(l) AS num_likes
ORDER BY num_likes DESC;
Here is a console that illustrates this.
Related
I am working on a dating app where users can "like" or "dislike" other users and get matched.
As you can imagine the most important query of the app would be:
Give me a stack of nearby user profiles that I have NOT liked/disliked before.
I tried to work on this with a document database (Firestore) and figured it's simply not suitable for such kind of application and hence landed in the graph database world which is new and fascinating to me.
I understand that by nature a graph database retrieves data by tracing through the relationships and make relationships first-class citizens. My question now is that what if the nodes that I am trying to get are those with no relationship from the given node? What would the query look like? Can anyone provide an example query?
Edit:
- added nearby criteria to the query statement
This is definitely possible, here is a query example :
MATCH (me:Profile {name: "Chris"})
MATCH (other:Profile) WHERE NOT (other)-[:LIKES]->(me)
As stated in the comments of your original question, on a large dataset it might not scale well, that said it is pretty uncommon that you would use only one criteria for matching, for example, the list of possible profiles to match from can be grouped by :
geolocation
profiles in depth 2 ( who is liking me, then find who other people they like, do those people like me ? )
shared interests
age group
skin color
...
I have a database containing movies and users. Users can follow each other and rate movies. I have a user with the id 599. I want to find all the movies he hasn't watched that the other users he follows have watched. Here's what I tried so far. I do get a result and it works but the numbers don't really add up so I think there might be something off.
MATCH (u:user {id:599}),(m2:Movie), (u2:user),(u)-[r]->(u2),(u2)-->(m2)
WHERE not (u)-->(m2)
RETURN m2.title, m2.movieId,u2.id;
As #Inversefalcon stated, perhaps your query needs to be specific about the relationship types, in case there are multiple types of relationships between the same nodes.
For example (assuming your data model has the relationship types FOLLOWS and WATCHED):
MATCH (u:user {id:599})-[:FOLLOWS]->(u2:user)-[:WATCHED]->(m2:Movie)
WHERE not (u)-[:WATCHED]->(m2)
RETURN m2.title, m2.movieId, u2.id;
If this does not solve your issue, then please provide some sample data and what your expected number of results is.
I have been reading about Graph databases and want to know if this type of structure is applicable to it:
Company > Has user Accounts > Accounts send out facebook posts (which are available to all users)
Up to here - I think this makes sense - yes it would be a good use of Graph. A post has a relationship to any accounts and you can find out the direction both ways - posts for a company and which posts were sent by which users or companies.
However
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time
Accounts are getting results for each post (likes/friends) which I need to store on a daily basis
I need to find out how many likes a company received (on any given day)
I also need to find out how many likes a user received
I need to find out how many likes a user received per post
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
I struggle at this point unless you are storing lots of date-value property lists per node. Is that the way you would do it? If I wanted to find out the later 2 points for example would it be as efficient as a RDBMS?
Here is a very simple example of a Graph data model that seems to cover your stated use cases. (Since nodes can have multiple labels, all Company and User nodes are also Entity nodes -- to simplify the model.)
(:Company:Entity {id:100})-[:HAS_USER]->(:User:Entity {id: 200})
(:Entity)-[:SENT]->(:Post {date: 123, msg: "I like cats!"})
(:Entity)-[:LIKES {date: 234}]->(:Post)
Your use cases:
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time.
How to count all users:
MATCH (u:User)
RETURN COUNT(*);
How to count a company's users:
MATCH (c:Company {id:100})-[:HAS_USER]->(u:User)
RETURN COUNT(*);
I need to find out how many likes a company received (on any given day)
MATCH (c:Company {id: 100})-[:SENT]->(p:Post)<-[:LIKES {date:234}]-()
RETURN COUNT(*)
I also need to find out how many likes a user received
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN COUNT(*);
I need to find out how many likes a user received per post
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN p, COUNT(*)
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
You do not need to explicitly group likes by date (if that is what you mean). Such "groupings" can be easily obtained by the appropriate query (e.g., in #2 above).
I have read the Neo4j manual and saw the numerous short examples regarding movie graph. I have also installed it locally and played with the cypher.
Here is the setup:
I have the following nodes: Movies (with name and id, owned by friend), Actors(with name and ids) Directors (with names and id), Genre (with id and name)
Relations are: Actors acted in Movies (1 movie - many actors), Directors directed a movie (1 director per movie but a director can direct many movies), and Movies has several genre "(many to many)
1) Owned by friend I dont know why but following the LOAD CSV example they put USA as a node rather than a property but is there a logical reason why its better to put it as a node rather than a property like i did?
2)
What I want to search is similar to the answer given to this question:
Nearest nodes to a give node, assigning dynamically weight to relationship types
However - I do not have a weight on the relationship and its more of a "go find the first give nodes connected to it"
Given that the "owned by friend" can only be owned by 1 person.
If given movie title "Spider-Man" (which for example purpose is owned by frank) go find the next occurrence of a movie that is owned by John.
So after reading Neo4j I believe that I dont need to specify which relationship is needed to traverse but just go find the next movie that meets my criteria, right?
So Following the above link
MATCH (n:Start { title: 'Spider-Man' }),
(n)-[:CONNECTED*0..2]-(x)
RETURN x
So go to node Spider-Man and go find me X as long as it is connected but I got stump by *0..2 because its the range...what if I just say "go find me the first you that means the own by John"
3) following up to #2 - how do i insert the fitler "own by john" ?
There are a number of things in your question that don't quite make sense. Here's a stab at an answer.
1) Making 'USA' a node rather than a property is useful if you want to search based on country. If 'USA' is a node, you are able to limit your search by starting at the 'USA' node. If you don't care to do this, then it doesn't really matter. It may also save a small amount of space for longer country names to store the name once and link to it via relationships.
2) Your example doesn't match your described graph. I can't really speak to it without a better example.
3) This is probably easy to answer once you improve your example.
OK. Based on the comments to the answer, here's what you need. To find one movie owned by John that is connected via common actors, directors, etc to the movie Spider-man owned by Frank (that is, sub-graphs like (movie)<--(actor)-->(movie) ) you can write:
MATCH (n:Movie {title : 'Spider-Man', owned_by : 'Frank'})<-[*2]->(m:Movie {owned_by : 'John'})
RETURN m LIMIT 1
If you want more responses, alter or remove the LIMIT on the RETURN clause. If you want to allow chains that pass through chains like (movie)<--(actor)-->(movie)<--(director)-->(movie), you can increase the number of relationships matched (the *2) to 4, 6, 8, etc. You probably shouldn't just write the relationship part of the MATCH clause as -[*]-, because this could get into infinite loops.
I'm very new to neo4j and to graph database in general. I'm prototyping an app, and I don't know how should i write these queries
I've this domain:
User
Restaurant
Review
TypeOfFood
So a Restarurant have one or many TypeOfFood, the User leaves reviews about restaurants. The User have some preferred foods, matching the TypeOfFood a restaurant sell. Also Users are related to each other with the typically friend relationship.
Some of the queries I'm trying to write:
Give me all the restaurants that my friends have rated with 3 or more stars that make the kind of food I like (exclude those restaurants that I already reviewed)
Suggest me friends I may know (I guess this should be something like "all the friends that are friends of my friends but no yet mine, order by something)
Using Neo4j's Cypher query language you could write your queries like this:
Selecting the top-20 best rated restaurants, sorted by stars and number of reviews
start user=(users,name,'Nico')
match user-[:FRIEND]->friend-[r,:RATED]->restaurant-[:SERVES]->food,
user-[:LIKES]->food,user-[:RATED]->rated_by_me
where r.stars > 3
return restaurant.name, avg(r.stars), count(*)
order by avg(r.stars) desc, count(*) desc
limit 20
Friends of a Friend
start user=(users,name,'Nico')
match user-[:FRIEND]->friend->[:FRIEND]->foaf
return foaf, foaf.name
You can execute these cypher queries in the Neo4j Webadmin Console on your dataset, but also in the neo4j-shell, remotely via the Cypher-Rest-Plugin via Spring Data Graph.
There is also a screencast discussing similar queries in cypher.
You can also use Gremlin, Neo4j-Traversers or manual traversing via getRelationships if you'd like.