Neo4j - data modeling - neo4j

I am building some test application with Neo4j. I want to model small social network and try to find:
All friends of user X
Friends of user's X friends, who likes beer
I stuck with modeling "know" relation. Let's take 3 users A, B and C. Is it enough to define only one relation between, them ex.
A knows B
B knows C
or I have to make 'bidirectional' relations and explicitly provide that
A knows B so B knows A
B knows C so C knows B
it will increase the number of relations, but maybe it is needed.
The same question is related to favorite drink.
A likes beer
should I also define?
beer is liked by A

If you want to be able to differentiate between a knowing b and b knowing a, then you need to have two relationships. Otherwise, at query time, you can easily get both by omitting the direction.
Similarly, with beer being liked, you really only need to define one direction.
For a real example: The facebook/linkedin model where connections are mutual only needs one direction/relationship, but the twitter model where one person can follow another (but the other person doesn't have to follow them back), you'd need two relationships--one for each direction.
Update with some query examples:
CREATE
(joe {name:"Joe"}),
(jim {name:"Jim"}),
(bob {name:"Bob"}),
(beer {name:"Beer"}),
joe-[:friends_with]-jim,
joe-[:friends_with]-bob,
bob-[:likes]->beer;
For the friends:
START person=node:node_auto_index(name="Joe")
MATCH (person)-[:friends_with]-(friend)
RETURN person, friend;
And the friends who like beer:
START person=node:node_auto_index(name="Joe"), beer=node:node_auto_index(name="Beer")
MATCH (person)-[:friends_with]-(friend)-[:likes]->(beer)
RETURN person, friend, beer;

Related

Bidirectional Data modeling issue in neo4j

I have two nodes, A and B,
A talks to B and B talks to A, (A)-[:talksTo]-(B)
A has a sentiment value towards B, and B has a sentiment value towards A.
So there is the problem, I need A to B relationship to store a value that the B to A relationship will also want to store (same key).
So I will try to do queries such as, MATCH (A:person)-[:talksTo]-(B:person) where A.sentiment < -2 return A;
So here A's sentiment toward B will be different the B's sentiment toward A, thus the needed separation.
I have tried to make unique key names to specify direction - but that makes queries difficult unless I can query with a wild card ex: ... where A.Asentiment < -2 would be queried as ... where A.*sentiment < -2
Another way I can think of to do this is make two different graphs, 1) A talks to B graph and B talks to A graph... but this would make queries tricky as I may get back more then one node for single node queries OR if I have to update a single node key:value to something else. I would prefer to have one node name per person.
Any ideas?
I don't know that this is a solution, but I don't think I understand enough so it might be a foil for better understanding:
MATCH (A:Person)-[dir1:talksTo]->(B:Person), (A)<-[dir2:talksTo]-(B)
WHERE dir1.sentiment < 2
RETURN A, B

Neo4j first node to meet relationship in a movie model

I have read the Neo4j manual and saw the numerous short examples regarding movie graph. I have also installed it locally and played with the cypher.
Here is the setup:
I have the following nodes: Movies (with name and id, owned by friend), Actors(with name and ids) Directors (with names and id), Genre (with id and name)
Relations are: Actors acted in Movies (1 movie - many actors), Directors directed a movie (1 director per movie but a director can direct many movies), and Movies has several genre "(many to many)
1) Owned by friend I dont know why but following the LOAD CSV example they put USA as a node rather than a property but is there a logical reason why its better to put it as a node rather than a property like i did?
2)
What I want to search is similar to the answer given to this question:
Nearest nodes to a give node, assigning dynamically weight to relationship types
However - I do not have a weight on the relationship and its more of a "go find the first give nodes connected to it"
Given that the "owned by friend" can only be owned by 1 person.
If given movie title "Spider-Man" (which for example purpose is owned by frank) go find the next occurrence of a movie that is owned by John.
So after reading Neo4j I believe that I dont need to specify which relationship is needed to traverse but just go find the next movie that meets my criteria, right?
So Following the above link
MATCH (n:Start { title: 'Spider-Man' }),
(n)-[:CONNECTED*0..2]-(x)
RETURN x
So go to node Spider-Man and go find me X as long as it is connected but I got stump by *0..2 because its the range...what if I just say "go find me the first you that means the own by John"
3) following up to #2 - how do i insert the fitler "own by john" ?
There are a number of things in your question that don't quite make sense. Here's a stab at an answer.
1) Making 'USA' a node rather than a property is useful if you want to search based on country. If 'USA' is a node, you are able to limit your search by starting at the 'USA' node. If you don't care to do this, then it doesn't really matter. It may also save a small amount of space for longer country names to store the name once and link to it via relationships.
2) Your example doesn't match your described graph. I can't really speak to it without a better example.
3) This is probably easy to answer once you improve your example.
OK. Based on the comments to the answer, here's what you need. To find one movie owned by John that is connected via common actors, directors, etc to the movie Spider-man owned by Frank (that is, sub-graphs like (movie)<--(actor)-->(movie) ) you can write:
MATCH (n:Movie {title : 'Spider-Man', owned_by : 'Frank'})<-[*2]->(m:Movie {owned_by : 'John'})
RETURN m LIMIT 1
If you want more responses, alter or remove the LIMIT on the RETURN clause. If you want to allow chains that pass through chains like (movie)<--(actor)-->(movie)<--(director)-->(movie), you can increase the number of relationships matched (the *2) to 4, 6, 8, etc. You probably shouldn't just write the relationship part of the MATCH clause as -[*]-, because this could get into infinite loops.

How to query recommendation using Cypher

I'm trying to query Book nodes for recommendation by Cypher.
I want to recommend A:Book and C:Book for A:User.
i'm sorry I need some graph to explain this question, but I could't up graph image because my lepletion lacks for upload function.
I wrote query below.
match (u1:User{uid:'1003'})-->(o1:Order)-->(b1:Book)<--(o2:Order)
<--(u2:User)-->(o3:Order)-->(b2:Book)
return b2
This query return all Books(A,B,C,D) dispite cypher's Uniqueness.
I expect to only return A:Book and C:Book.
Is this behavior Neo4j' specification?
How do I get expected return? Thanks, everyone.
environment:
Neo4j ver.v2.0.0-RC1
Using Neo4j Server with REST API
Without the sample graph its hard to say why you get something back when you expected something else. You can share a sample graph by including a create statement that would generate said graph, or by creating it in Neo4j console and putting the link in your question. Here is an example of the latter: console.neo4j.org/r/fnnz6b
In the meantime, you probably want to declare the type of the relationships in your pattern. If a :User has more than one type of outgoing relationships you will be excluding those other paths based on the labels of the nodes on the other end, which is much less efficient than to only traverse the right relationships to begin with.
To my mind its not clear whether (u:User)-->(o:Order)-->(b:Book) means that a user has one or more orders, and each order consists of one or more books; or if it means only that a user ordered a book. If you can share a sample, hopefully that will be clear too.
Edit:
Great, so looking at the graph: You get B and D back because others who bought B also bought D, and others who bought D also bought B, which is your criterion for recommendation. You can add a filter in the WHERE clause to exclude those books that the user has already bought, something like
WHERE NOT (u1)-[:BUY]->()-[:CONTAINS]->(b2)
This will give you A, C, C back, since there are two matching paths to C. It's probably not important to get two result items for C, so you can either limit the return to give only distinct values
RETURN DISTINCT(b2)
or group the return values by counting the matching paths for each result as a 'recommendation score'
RETURN b2, COUNT(b2) as score
Also, if each order only [CONTAINS] one book, you could try modelling without order, just (:User)-[:BOUGHT]->(:Book).

Find friend that has the highest number of common friends.. and returning his/her friends that are not in your list

I need to find most connected friend's friend node with help of a Cypher query.
In the example below, given B, I need a query to return the "Salad" node but not the "Bacon" node. For this particular case, I have picked up node C (as opposed to node A), as the most connected friend. This is because B&C share most friends. Then I have picked up the friend of C that are not B's friend's list so that friend node (salad) can be recommended.
Given B, I need a Cypher query to return "Salad" node in Neo4j. Per Stefan's suggestion I have added Neo4j Console data here. Thanks.
I assume you know the id of node B and C beforehand.
start b=node(<id_of_b>), c=node(<id_of_c>)
match c-[:LIKES]->stuff
where not(b-[:LIKES]->stuff)
return stuff
This should give you a list of items that C likes not B not, aka the "salad" node.
For future questions please consider setting up your data set on http://console.neo4j.org, this makes your question much clearer.

neo4j complex pattern searching

I'm new to NEO4J and I need help on a specific problem. Or an answer if it's even possible.
SETUP:
We have 2 distinct type of nodes: users (A,B,C,D) and Products (1,2,3,4,5,6,7,8)
Next we have 2 distinct type of relationships between users and products where a users WANTS a Product and where a product is OWNED BY a user.
1,2 is owned by A
3,4 is owned by B
5,6 is owned by C
7,8 is owned by D
Now
B wants 1
C wants 3
D wants 5
So for now, I have no problems and I created the graph data with no difficulty. My questions starts here. We have a circle, when A wants product 8.
A-[:WANTS]->8-[:OWNEDBY]->D-[:WANTS]->5-[:OWNEDBY]->C-[:WANTS]->3-[:OWNEDBY]->B-[:WANTS]->1-[:OWNEDBY]->A
So we have a distinct pattern, U-[:WANTS]->P-[:OWNEDBY]->U
Now what I want to do is to find the paths toward the start node (initiating user that wants a product) following that pattern.
How do I define this using Cypher? Or do I need another way?
Thanks upfront.
i got a feeling this can be hacked with reduce and counting every even (every second) relationship:
MATCH p=A-[:OWNEDBY|WANTS*..20]->X
WITH r in relationships(p)
RETURN type(r),count(r) as cnt,
WHERE cnt=10;
or maybe counting all paths where the number of rels is even:
MATCH p=A-[:OWNEDBY|WANTS*..]->X
RETURN p,reduce(total, r in relationships(p): total + 1) as tt
WHERE tt%2=0;
but you graph must have the strict pattern, where all incoming relationship from the set of ownedby and wants must be different from all outgoin relationships from the same set. in other words, this pattern can not exist: A-[:WANTS]->B-[:WANTS]->C or A-[:OWNEDBY]->B-[:OWNEDBY]->C
the query is probably wrong in syntax, but the logic can be implemented in cypher whn you will play more with it.
or use gremlin, I think I saw somewhere a gremlin query, where you could define a pattern and than loop n-times via that pattern further till the end node.
I've played around with this and created http://console.neo4j.org/?id=qq9v1 showing the sample graph. I've found the following cypher statement solving the issue:
start a=node(1)
match p=( a-[:WANTS]->()-[:WANTS|OWNEDBY*]-()-[:OWNEDBY]->a )
return p
order by length(p) desc
limit 1
There is just one glitch: the intermediate part [:WANTS|OWNEDBY*] does not mandate alternating WANT and OWNEDBY chains. As #ulkas stated, this should not be an issue if you take care during data modelling.
You might also look into http://api.neo4j.org/current/org/neo4j/graphalgo/GraphAlgoFactory.html to apply graph algorithms from Java code. You might use unmangaged extensions to provide REST access to that.

Resources