I'm very new to Neo4j and to graph databases in general. I'm prototyping an app, and I don't know how I should write these queries.
I have this domain:
User
Restaurant
Review
TypeOfFood
So a Restaurant has one or many TypeOfFood, and the User leaves reviews about restaurants. The User has some preferred foods, matching the TypeOfFood a restaurant sells. Users are also related to each other through the typical friend relationship.
Some of the queries I'm trying to write:
Give me all the restaurants that my friends have rated with 3 or more stars that make the kind of food I like (exclude those restaurants that I already reviewed)
Suggest friends I may know (I guess this should be something like "all the friends of my friends who are not yet my friends", ordered by something)
Using Neo4j's Cypher query language you could write your queries like this:
Selecting the top-20 best rated restaurants, sorted by stars and number of reviews
start user=(users,name,'Nico')
match user-[:FRIEND]->friend-[r,:RATED]->restaurant-[:SERVES]->food,
user-[:LIKES]->food,user-[:RATED]->rated_by_me
where r.stars >= 3
return restaurant.name, avg(r.stars), count(*)
order by avg(r.stars) desc, count(*) desc
limit 20
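The query above is written in the early Cypher syntax and does not actually filter out restaurants the user has already rated. A minimal sketch in current Cypher syntax follows. It assumes the labels `User`, `Restaurant` and `TypeOfFood`, the relationship types used above, and a `stars` property on `RATED`; none of these names are confirmed by the question, so adjust them to your model.

```cypher
// Assumed model:
//   (:User)-[:FRIEND]->(:User)
//   (:User)-[:RATED {stars}]->(:Restaurant)
//   (:Restaurant)-[:SERVES]->(:TypeOfFood)<-[:LIKES]-(:User)
MATCH (user:User {name: 'Nico'})-[:FRIEND]->(friend:User)-[r:RATED]->(restaurant:Restaurant)
WHERE r.stars >= 3
  // keep only restaurants serving at least one food the user likes
  AND (restaurant)-[:SERVES]->(:TypeOfFood)<-[:LIKES]-(user)
  // exclude restaurants the user has already rated
  AND NOT (user)-[:RATED]->(restaurant)
RETURN restaurant.name AS name, avg(r.stars) AS avg_stars, count(r) AS num_ratings
ORDER BY avg_stars DESC, num_ratings DESC
LIMIT 20
```

Checking the food preference as a predicate in `WHERE`, rather than matching it as part of the pattern, avoids counting the same rating once per liked food when a restaurant serves several of them.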
Friends of a Friend
start user=(users,name,'Nico')
match user-[:FRIEND]->friend-[:FRIEND]->foaf
return foaf, foaf.name
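The friends-of-a-friend query above also returns people who are already friends, and it has no ordering. A possible sketch in current Cypher syntax that excludes existing friends and ranks suggestions by the number of mutual friends is shown here. The `User` label and the choice to treat `FRIEND` as undirected are assumptions.

```cypher
// Assumed model: (:User {name})-[:FRIEND]-(:User), direction ignored
MATCH (user:User {name: 'Nico'})-[:FRIEND]-(friend:User)-[:FRIEND]-(foaf:User)
WHERE foaf <> user                      // never suggest the user themselves
  AND NOT (user)-[:FRIEND]-(foaf)       // skip people who are already friends
RETURN foaf.name AS suggestion, count(DISTINCT friend) AS mutual_friends
ORDER BY mutual_friends DESC
LIMIT 10
```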
You can execute these Cypher queries on your dataset in the Neo4j Webadmin Console, in the neo4j-shell, remotely via the Cypher-Rest-Plugin, or via Spring Data Graph.
There is also a screencast discussing similar queries in cypher.
You can also use Gremlin, Neo4j-Traversers or manual traversing via getRelationships if you'd like.
Related
I have created a knowledge graph with the nodes and relationships pictured. Each person has any number of jobs and skills connected to them, and each Job and Skill can have any number of People connected to it. I would like to be able to search for a particular job (e.g. Security Architect) and return a list of all the people who have been employed_as that job and all of the skills that each person is skilled_in. I have created a query which retrieves these results; however, a new row is created in the result for each skill, duplicating the person details each time. This is the query I have which retrieves those results.
MATCH (j:Job {job_title: "Security Architect"})<-[p_rel:employed_as]-(p:Person)-[skilled_in]->(s:Skill) return p,s,p_rel
Is it possible to create a query that returns all of the skill nodes connected to a person as a single list with the details of that person?
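Since you need all skills on a single line, you can collect the skills per person. (Note the colon in [:skilled_in]: without it, skilled_in is just a variable name and the pattern matches any relationship type.)
MATCH (j:Job {job_title: "Security Architect"})<-[p_rel:employed_as]-(p:Person)-[:skilled_in]->(s:Skill)
RETURN p, p_rel, collect(s) AS skills_per_person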
I am working on a dating app where users can "like" or "dislike" other users and get matched.
As you can imagine the most important query of the app would be:
Give me a stack of nearby user profiles that I have NOT liked/disliked before.
I tried to build this with a document database (Firestore) and found it simply isn't suitable for this kind of application, so I landed in the graph database world, which is new and fascinating to me.
I understand that a graph database by nature retrieves data by traversing relationships and treats relationships as first-class citizens. My question is: what if the nodes I am trying to get are those with no relationship from the given node? What would the query look like? Can anyone provide an example query?
Edit:
- added nearby criteria to the query statement
This is definitely possible. Here is an example query:
MATCH (me:Profile {name: "Chris"})
MATCH (other:Profile)
WHERE other <> me AND NOT (me)-[:LIKES]->(other)
RETURN other
As noted in the comments on your question, this might not scale well on a large dataset. That said, it is uncommon to match on only one criterion. For example, the pool of candidate profiles can be narrowed by:
geolocation
profiles at depth 2 (who likes me, which other people do they like, and do those people like me?)
shared interests
age group
skin color
...
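As a rough sketch of combining the "not yet liked or disliked" check with the nearby criterion from the question, the query below filters candidates by distance first. It assumes each `Profile` has a `location` property holding a geographic point, that dislikes are stored as a `DISLIKES` relationship, and that `point.distance` is available (Neo4j 5; older versions expose the same calculation as `distance`). The 10 km radius and the limit are arbitrary illustration values.

```cypher
// Assumptions: Profile.location is a WGS-84 point; DISLIKES is a hypothetical relationship type
MATCH (me:Profile {name: "Chris"})
MATCH (other:Profile)
WHERE other <> me
  AND point.distance(me.location, other.location) < 10000   // metres for geographic points
  AND NOT (me)-[:LIKES|DISLIKES]->(other)                    // nothing swiped on yet
RETURN other, point.distance(me.location, other.location) AS dist
ORDER BY dist
LIMIT 50
```

Narrowing by location before checking for missing relationships keeps the anti-pattern test limited to a small candidate set instead of every profile in the database.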
I work in a company that makes a social game where our users can have friends and can make content that based on popularity shows up on highscores.
I am trying to find out whether we can move some of our data to a graph database like neo4j and one of the things I can't figure out is how to implement a highscore system in a graph database. I basically want to make queries like this:
Get list of movies/artbooks/photos content created by friends ordered by content with most likes.
Get list of movies/artbooks/photos content created by ALL USERS in the last 7 days ordered by content with most likes.
What kind of data modeling and queries should we do to implement this?
The data model I was planning was to have users as nodes, with the content made by a user attached as a linked list of content nodes, the latest one linked to the user. But how do I get highscores into such a model?
Thanks.
Here is one possible model:
(f:User {name: "Fred"})-[:CREATED]->(c:Content {created: 2345, type: "Music"})
(m:User {name: "Mary"})-[:LIKES {score:5}]->(c:Content)
(f)-[:KNOWS]->(m);
To get the content created by all Users since a specific timestamp, in descending order by the number of likes, you can use the following query. The OPTIONAL MATCH is used to avoid filtering out Content with no likes.
MATCH (c:Content)
WHERE c.created > 1234
OPTIONAL MATCH ()-[l:LIKES]->(c)
RETURN c, COUNT(l) AS num_likes
ORDER BY num_likes DESC;
Here is a console that illustrates this.
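For the first use case in the question (content created by friends, ordered by likes), a similar sketch against the same model could look like this. It treats `KNOWS` as the friend relationship and ignores its direction, which is an assumption about how friendships are stored.

```cypher
// Content created by Fred's friends, most-liked first
MATCH (me:User {name: "Fred"})-[:KNOWS]-(friend:User)-[:CREATED]->(c:Content)
OPTIONAL MATCH ()-[l:LIKES]->(c)        // keep content that has no likes yet
RETURN c, friend.name AS creator, count(l) AS num_likes
ORDER BY num_likes DESC;
```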
I have been reading about Graph databases and want to know if this type of structure is applicable to it:
Company > Has user Accounts > Accounts send out facebook posts (which are available to all users)
Up to here - I think this makes sense - yes it would be a good use of Graph. A post has a relationship to any accounts and you can find out the direction both ways - posts for a company and which posts were sent by which users or companies.
However
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time
Accounts are getting results for each post (likes/friends) which I need to store on a daily basis
I need to find out how many likes a company received (on any given day)
I also need to find out how many likes a user received
I need to find out how many likes a user received per post
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
I struggle at this point unless you are storing lots of date-value property lists per node. Is that the way you would do it? If I wanted to find out the latter two points, for example, would it be as efficient as an RDBMS?
Here is a very simple example of a Graph data model that seems to cover your stated use cases. (Since nodes can have multiple labels, all Company and User nodes are also Entity nodes -- to simplify the model.)
(:Company:Entity {id:100})-[:HAS_USER]->(:User:Entity {id: 200})
(:Entity)-[:SENT]->(:Post {date: 123, msg: "I like cats!"})
(:Entity)-[:LIKES {date: 234}]->(:Post)
Your use cases:
Users get added and deleted on a daily basis and I need a record store of how many there were at a given time.
How to count all users:
MATCH (u:User)
RETURN COUNT(*);
How to count a company's users:
MATCH (c:Company {id:100})-[:HAS_USER]->(u:User)
RETURN COUNT(*);
I need to find out how many likes a company received (on any given day)
MATCH (c:Company {id: 100})-[:SENT]->(p:Post)<-[:LIKES {date:234}]-()
RETURN COUNT(*)
I also need to find out how many likes a user received
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN COUNT(*);
I need to find out how many likes a user received per post
MATCH (u:User {id:200})-[:SENT]->(p:Post)<-[:LIKES]-()
RETURN p, COUNT(*)
You would need to store Likes as a group and then date-value - can you even have "sub" properties?
You do not need to explicitly group likes by date (if that is what you mean). Such "groupings" can easily be obtained with an appropriate query (e.g., the query above that counts a company's likes on a given day).
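To illustrate how such a grouping falls out of the query, here is a sketch that returns a per-day like count for one company. It assumes the `date` property on `LIKES` holds a day-granularity value, as in the model above; with finer-grained timestamps you would first truncate them to a day.

```cypher
// Likes received by company 100, one row per day
MATCH (c:Company {id: 100})-[:SENT]->(:Post)<-[l:LIKES]-()
RETURN l.date AS day, count(l) AS num_likes
ORDER BY day;
```

Returning a non-aggregated column (`l.date`) next to an aggregate makes Cypher group by that column implicitly, so no separate grouping structure has to be stored.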
I need to implement a suggestion system for my project.
In this system we should recommend people based on parameters such as current city, education, friends of friends, etc.
I have designed this by creating (or updating) MAY_KNOW relationships when users edit their profile or become friends with someone, and I retrieve them with MATCH u-[r:MAY_KNOW]-x RETURN * ORDER BY r.weight so that people can find those most similar to them.
But I think this is not best practice, because the MAY_KNOW relationships from/to each user could soon reach millions, and scanning and sorting them would be costly.
Do you have a better idea?
It depends a bit on the data structure. I assume there are relationships to cities, education facilities and friends, so you don't actually need to store MAY_KNOW relationships, since they can be inferred?
It also depends on whether you want to create a cross product between all your users (how many are there?) and how you would want to filter out unrelated people.
Perhaps check out this blog post from Max: http://maxdemarzi.com/2013/04/19/match-making-with-neo4j/
So something like this query might work (depending on the data volume I'd rewrite it in the Java API).
match (p:Person {id: {user_id}})
match (p)-[:LIVES_IN]->(:City)<-[:LIVES_IN]-(other)
match (p)-[:GRADUATED]->(:School)<-[:GRADUATED]-(other)
match (p)-[:KNOWS]->(:Person)<-[:KNOWS]-(other)
RETURN other
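Note that the query above only returns people who satisfy all three patterns at once. If the goal is a ranked list where each shared criterion adds to a score, one possible sketch is the following. It uses the current `$user_id` parameter syntax instead of the legacy `{user_id}` form, ignores the direction of `KNOWS`, and uses a simple unweighted score; all of these are illustration choices, not part of the original answer.

```cypher
MATCH (p:Person {id: $user_id})
// Collect candidates per criterion, starting from p so no cross product is built
OPTIONAL MATCH (p)-[:LIVES_IN]->(:City)<-[:LIVES_IN]-(c:Person)
WITH p, collect(DISTINCT c) AS same_city
OPTIONAL MATCH (p)-[:GRADUATED]->(:School)<-[:GRADUATED]-(s:Person)
WITH p, same_city, collect(DISTINCT s) AS same_school
OPTIONAL MATCH (p)-[:KNOWS]-(:Person)-[:KNOWS]-(f:Person)
WITH p, same_city, same_school, collect(DISTINCT f) AS friends_of_friends
// Each person appears once per list, so the count is the number of criteria matched
UNWIND same_city + same_school + friends_of_friends AS other
WITH p, other, count(*) AS score
WHERE other <> p AND NOT (p)-[:KNOWS]-(other)
RETURN other, score
ORDER BY score DESC
LIMIT 20
```

Because the suggestions are computed at query time from relationships that already exist, there is no need to maintain and sort millions of stored MAY_KNOW relationships.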