Cypher: Is it possible to find creepy people following my friends? - neo4j

Let's say I've pulled down the Twitter graph local to myself into Neo4j. I want to find people who follow more of my friends than would be expected. More specifically, I want to find people who follow the people I follow, with the results sorted so that the person following the highest number of my friends comes first. Is this possible in Cypher?

Here's a console example:
http://console.neo4j.org/r/p36cgj
create (me {n:"me"}), (fo1 {n:"fo1"}), (fo2 {n:"fo2"}), (fo3 {n:"fo3"}), (fr1 {n:"fr1"}),
(fr2 {n:"fr2"}), (fr3 {n:"fr3"}),
fo1-[:follows]->me, fo2-[:follows]->me, fo3-[:follows]->me, me-[:follows]->fr1,
me-[:follows]->fr2, me-[:follows]->fr3, fo1-[:follows]->fr1, fo2-[:follows]->fr2,
fo1-[:follows]->fr2, fo1-[:follows]->fr3;
start me=node:node_auto_index(n="me")
match me-[:follows]->friends<-[:follows]-follower-[:follows]->me
return follower, count(friends) as creepinessFactor, length(me-[:follows]->()) as countIFollow
order by creepinessFactor desc;
I'm curious to hear the results, btw. :P
You could also throw in a where like:
where not(me-[:follows]->follower)
To avoid getting friends within your circle.
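To make the counting concrete, here is a minimal Python sketch of the same logic over an in-memory edge list. The edges mirror the console example above; the function name `creepy_followers` and the `exclude_followed` flag are made up for illustration and are not part of any Neo4j API. Like the Cypher pattern, it only considers people who also follow me.

```python
from collections import Counter

# Edges from the console example, as (follower, followed) pairs.
FOLLOWS = {
    ("fo1", "me"), ("fo2", "me"), ("fo3", "me"),
    ("me", "fr1"), ("me", "fr2"), ("me", "fr3"),
    ("fo1", "fr1"), ("fo1", "fr2"), ("fo1", "fr3"),
    ("fo2", "fr2"),
}

def creepy_followers(edges, me, exclude_followed=True):
    """Rank my followers by how many of the people I follow they also follow."""
    friends = {dst for src, dst in edges if src == me}
    my_followers = {src for src, dst in edges if dst == me}
    counts = Counter(
        src for src, dst in edges
        if dst in friends and src in my_followers
    )
    if exclude_followed:
        # Same idea as the extra WHERE clause: skip people I already follow.
        for person in friends:
            counts.pop(person, None)
    # Highest count first; ties broken by name so the order is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

ranking = creepy_followers(FOLLOWS, "me")
```

Here `ranking` holds `(follower, creepinessFactor)` pairs, most "creepy" first. A follower who follows none of my friends (fo3 in this data) does not appear at all, just as the Cypher pattern would not match them.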

Related

Return all users given user chats with and the latest message in conversation

My relationships look like this:
A-[:CHATS_WITH]->B denotes that user A has sent at least one message to user B.
Then the messages:
A-[:FROM]->message-[:SENT_TO]->B
and vice versa
B-[:FROM]->message-[:SENT_TO]->A
and so on.
Now I would like to select all users a given user chats with, together with the latest message between the two.
So far I have managed to get all messages between two users with this query:
MATCH (me:user)-[:CHATS_WITH]->(other:user) WHERE me.nick = 'bazo'
WITH me, other
MATCH me-[:FROM|:SENT_TO]-(m:message)-[:FROM|:SENT_TO]-other
RETURN other,m ORDER BY m.timestamp DESC
How can I return just the latest message for each conversation?
Taking what you already have, do you just want to add LIMIT 1 to the end of the query?
The preferred way in a graph store is to manually manage a linked list that models the interaction stream, in which case you'd just select the head or tail of the list. This plays to the graph's strength (traversal) rather than reading data out of every Message node.
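As a rough illustration of the linked-list idea, the Python sketch below keeps a pointer to the newest message of a conversation and links each message to the one before it. The class and attribute names (`Conversation`, `Message`, `previous`) are invented for this example; in Neo4j the pointers would be relationships that you maintain yourself on every write.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Message:
    text: str
    timestamp: int
    # Plays the role of a "previous message" relationship in the graph.
    previous: Optional["Message"] = None

class Conversation:
    """Keeps only a pointer to the newest message; older ones are reached by traversal."""

    def __init__(self):
        self.head = None

    def append(self, text, timestamp):
        # The new message becomes the head and links back to the old head.
        self.head = Message(text, timestamp, previous=self.head)
        return self.head

    def latest(self):
        # Constant-time lookup: no scan over all messages, no sorting.
        return self.head

    def history(self):
        # Walk the list from newest to oldest.
        node = self.head
        while node is not None:
            yield node
            node = node.previous

chat = Conversation()
chat.append("hi", 1)
chat.append("how are you?", 2)
latest_text = chat.latest().text
all_texts = [m.text for m in chat.history()]
```

The trade-off is that every new message has to update the head pointer, but reading the latest message no longer depends on how many messages exist.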
EDIT - Last message to each distinct contact.
I think you'll have to collect all the messages into an ordered collection and then return the head, but this sounds like it could get very slow if you have many friends or messages.
MATCH (me:user)-[:CHATS_WITH]->(other:user) WHERE me.nick = 'bazo'
WITH me, other
MATCH me-[:FROM|:SENT_TO]-(m:message)-[:FROM|:SENT_TO]-other
WITH other, m
ORDER BY m.timestamp DESC
RETURN other, HEAD(COLLECT(m))
See: Neo Linked Lists and Neo Modelling a Newsfeed.
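The order-then-take-the-head approach can be sketched in plain Python as follows. The tuple layout `(other_user, timestamp, text)` and the function name `latest_per_contact` are assumptions made for this example only.

```python
def latest_per_contact(messages):
    """Return {other_user: (timestamp, text)} holding the newest message per contact.

    `messages` is an iterable of (other_user, timestamp, text) tuples.
    """
    latest = {}
    # Newest first, like ORDER BY m.timestamp DESC.
    for other, timestamp, text in sorted(messages, key=lambda m: m[1], reverse=True):
        # The first message seen for a contact is the newest one,
        # which corresponds to taking the head of the collected list.
        latest.setdefault(other, (timestamp, text))
    return latest

sample = [
    ("alice", 10, "hello"),
    ("bob", 12, "lunch?"),
    ("alice", 15, "see you later"),
    ("bob", 11, "hey"),
]
result = latest_per_contact(sample)
```

Note that every message is sorted before anything is discarded, which is the cost the answer warns about when there are many friends or messages.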

Cypher: Query to Combine Collection for InFeed Ads (Show 1 Ad Every X Amount of Posts)?

I am trying to build a Cypher query which allows me to build in-feed ads:
An example is how, in the Facebook mobile app, an ad appears inside the feed every X posts (let's say 1 ad for every 5 posts in the same feed).
So far I have this: "MATCH (P:Post), (A:Ad) RETURN P, A"
Post would be the User's Posts.
Ad would be ads to put inside a User's feed.
I'm able to get both collections, but I'm lost on how to combine them to create an effect similar to in-feed ads.
What is your actual use-case?
Do you have a feed of Ads somewhere and want to merge it with user's posts?
How do you model ad-feeds and post-feeds?
You probably also have Ad-Publishers, Categories etc? Same for posts?
So something like this:
MATCH (u:User {login:"john"})-[:POSTED]->(p:Post)
WITH p
LIMIT 20
MATCH (:Publisher {id:"3829472"})-[:PUBLISHED]->(ad:Ad)<-[:AD_CATEGORY]-(c)-[:POST_CATEGORY]->(p)
RETURN p, CASE WHEN rand() < 0.2 THEN ad ELSE null END
You should probably look into graph modeling.
For actual cypher questions check the manual and the refcard.
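The query above attaches an ad to a post at random, which gives one ad per five posts only on average. If the feed needs an ad after exactly every X posts, one option is to fetch posts and ads separately and merge them in application code. The sketch below is one way to do that in Python; the function name `interleave_ads` and its `every` parameter are hypothetical.

```python
from itertools import cycle

def interleave_ads(posts, ads, every=5):
    """Return a feed with one ad inserted after each run of `every` posts."""
    if every < 1:
        raise ValueError("every must be at least 1")
    # Reuse ads in order if there are fewer ads than slots; no ads means no slots.
    ad_source = cycle(ads) if ads else None
    feed = []
    for position, post in enumerate(posts, start=1):
        feed.append(post)
        if ad_source is not None and position % every == 0:
            feed.append(next(ad_source))
    return feed

posts = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
feed = interleave_ads(posts, ["ad1", "ad2"], every=3)
```

With `every=3`, an ad follows the third and sixth posts, and the trailing seventh post is left without one.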

Neo4j / Cypher : order by and where, know the position of the result in the sort

Is it possible to have an ORDER BY on a property together with a WHERE clause and know the index/position of the result?
I mean, when sorting with ORDER BY, we need to be able to know the position of a result in the sort.
Imagine a scoreboard with 1 million user nodes. I do an ORDER BY on user.score with a WHERE on "name = user_name", and I want to know the current rank of the user. I can't find how to do this using ORDER BY.
start game=node(1)
match game-[:has_child_user]->user
with user
order by user.score
with user
where user.name = "my_user"
return user , "the position in the sort";
the expected result would be :
node_user | rank
(i don't want to fetch one million entries at client side to know the current rank/position of a node in the ORDER BY!)
This functionality does not exist today in Cypher. Do you have an example of what this would look like in SQL? Would the below be something that fits the bill? (just a sketch, not working!)
(your code)
start game=node(1)
match game-[:has_child_user]->user
with user
order by user.score
(+ this code)
with user, index() as rank
return user.name, rank;
If you have more thoughts or want to start hacking on this please open an issue at https://github.com/neo4j/neo4j/issues
For the time being, there is a workaround you can use:
start n=node(0),rank_node=node(1)
match n-[r:rank]->rn
where rn.score <= rank_node.score
return rank_node,count(*) as pos;
For live example see: http://console.neo4j.org/?id=bela20
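The workaround replaces "sort, then find the position" with "count how many nodes have a score less than or equal to mine", so only a single number is returned to the client. A minimal Python sketch of that counting logic follows; the `scores` dictionary and the function name `rank_of` are made up for illustration.

```python
def rank_of(scores, name):
    """1-based position of `name` in ascending score order.

    Counts entries whose score is <= the target's score, mirroring
    `where rn.score <= rank_node.score ... count(*)` in the workaround.
    Tied entries therefore share the highest position of their group.
    """
    target = scores[name]
    return sum(1 for score in scores.values() if score <= target)

scores = {"ann": 10, "bob": 30, "cid": 20, "my_user": 20}
position = rank_of(scores, "my_user")
```

For a descending scoreboard, where the highest score is rank 1, the comparison would be flipped to `>=`.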

Can Neo4j be effectively used to show a collection of nodes in a sortable and filterable table?

I realise this may not be ideal usage, but apart from all the graphy goodness of Neo4j, I'd like to show a collection of nodes, say, People, in a tabular format that has indexed properties for sorting and filtering.
I'm guessing the type of a node can be stored as a link, say Bob -> type -> Person, which would allow us to retrieve all People.
Are the following possible to do efficiently (indexed?) and in a scalable manner?
Retrieve all People nodes and display all of their names, ages, cities of birth, etc. (NOTE: some of this data will be properties, some links to other nodes, which could be denormalised as properties for the sake of table display and simplicity.)
Show me all People sorted by Age
Show me all People with Age < 30
Also a quick how to do the above (or a link to some place in the docs describing how) would be lovely
Thanks very much!
Oh and if the above isn't a good idea, please suggest a storage solution which allows both graph-like retrieval and relational-like retrieval
If you want to operate on these person nodes, you can put them into an index (the default is Lucene) and then retrieve and sort the nodes using Lucene (see, for instance, "How do I sort Lucene results by field value using a HitCollector?" on how to do a custom sort in Java). This will get you, for instance, People sorted by Age. The code in Neo4j could look like:
Transaction tx = neo4j.beginTx();
idxManager = neo4j.index()
personIndex = idxManager.forNodes('persons')
personIndex.add(meNode,'name',meNode.getProperty('name'))
personIndex.add(youNode,'name',youNode.getProperty('name'))
tx.success()
tx.finish()
// Prepare a custom Lucene query context with the Neo4j API
query = new QueryContext( 'name:*' ).sort( new Sort(new SortField( 'name',SortField.STRING, true ) ) )
results = personIndex.query( query )
For combining index lookups and graph traversals, Cypher is a good choice, e.g.
START people = node:people_index("name:E*") MATCH people-[r]->() RETURN people.name, r.age ORDER BY r.age ASC
in order to return data on both the node and the relationships.
Sure, that's easily possible with the Neo4j query language Cypher.
For example:
start cat=node:Types(name='Person')
match cat<-[:IS_A]-person-[born:BORN]->city
where person.age > 30
return person.name, person.age, born.date, city.name
order by person.age asc
limit 10
You can experiment with it in our cypher console.
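To show why an index on a property helps with both requests ("sorted by Age" and "Age < 30"), here is a small Python sketch that keeps people ordered by age and answers the range filter with a binary search instead of a full scan. It is only an analogy for what a sorted index does; the sample data and the helper name `younger_than` are invented for this example.

```python
from bisect import bisect_left

people = [
    {"name": "Bob", "age": 42, "city": "Oslo"},
    {"name": "Eve", "age": 27, "city": "Lyon"},
    {"name": "Dan", "age": 35, "city": "Bonn"},
    {"name": "Amy", "age": 19, "city": "Cork"},
]

# Built once, like an index: the rows ordered by age, plus the bare keys.
by_age = sorted(people, key=lambda p: p["age"])
ages = [p["age"] for p in by_age]

def younger_than(limit):
    """All people with age < limit, already sorted by age."""
    # bisect_left finds the first position whose age is >= limit.
    return by_age[:bisect_left(ages, limit)]

sorted_names = [p["name"] for p in by_age]
under_30 = [p["name"] for p in younger_than(30)]
```

The sort cost is paid when the index is built, so both the sorted listing and the range filter stay cheap at read time.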

How to get friends of friends that have the same interest?

Getting friends of friends is pretty easy. I've got this, which seems to work great.
g.v(1).in('FRIEND').in('FRIEND').filter{it != g.v(1)}
But what I want to do is only get friends of friends that have the same interests. Below I want Joe to be suggested Moe but not Noe because they do not have the same interest.
You simply need to extend your gremlin traversal to go over the LIKES edges too:
g.v(1).in('FRIEND').in('FRIEND').filter{it != g.v(1)}.dedup(). \
as('friend').in('LIKES').out('LIKES').filter{it == g.v(1)}. \
back('friend').dedup()
Basically this goes out to friends of friends, as you had before, and saves the position in the pipe under the name friend. It then goes out to mutual likes and searches for the original source node. If it finds one, it jumps back to friend. The dedup() just removes duplicates and may speed up traversals.
The directionality of this may not be 100% correct as you haven't indicated direction of edges in your diagram.
Does this have to be in Gremlin? If Cypher is acceptable, you can do:
START s=node(Joe)
MATCH s-[:FRIEND]-()-[:FRIEND]-fof, s-[:LIKES]-()-[:LIKES]-fof
WHERE s != fof
RETURN fof
This query suggests friends of friends even when they share no likes with you, but candidates with common likes rank higher, because the number of shared likes is added to the score used in ORDER BY.
MATCH (me:User{userid:'34219'})
MATCH (me)-[:FRIEND]-()-[:FRIEND]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
WITH me,
potentialFriend,
SIZE((potentialFriend)-[:LIKES]->()<-[:LIKES]-(me)) AS sameInterest,
friendsInCommon
WHERE NOT (me)-[:FRIEND]-(potentialFriend)
RETURN potentialFriend, sameInterest, friendsInCommon,
friendsInCommon + sameInterest AS score
ORDER BY score DESC;
If you only want people with common likes, add the following condition (combined with the existing WHERE using AND):
WHERE sameInterest > 0
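The scoring in the last query can be sketched in plain Python to make clear what each number means. The sketch assumes friendships are stored symmetrically in a dictionary; Joe, Moe and Noe come from the question, while "Ann", the interests, and the function name `suggest_friends` are made up for illustration.

```python
from collections import Counter

def suggest_friends(me, friends, likes, require_shared_interest=False):
    """Return (candidate, sameInterest, friendsInCommon, score) rows, best first."""
    my_friends = friends.get(me, set())
    in_common = Counter()
    for friend in my_friends:
        for candidate in friends.get(friend, set()):
            # Skip myself and people who are already my friends.
            if candidate != me and candidate not in my_friends:
                in_common[candidate] += 1
    rows = []
    for candidate, friends_in_common in in_common.items():
        same_interest = len(likes.get(me, set()) & likes.get(candidate, set()))
        if require_shared_interest and same_interest == 0:
            continue  # the optional sameInterest > 0 condition
        rows.append((candidate, same_interest, friends_in_common,
                     same_interest + friends_in_common))
    # Highest score first; ties broken by name for a stable order.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows

friends = {"Joe": {"Ann"}, "Ann": {"Joe", "Moe", "Noe"},
           "Moe": {"Ann"}, "Noe": {"Ann"}}
likes = {"Joe": {"jazz"}, "Moe": {"jazz"}, "Noe": {"golf"}}

everyone = suggest_friends("Joe", friends, likes)
shared_only = suggest_friends("Joe", friends, likes, require_shared_interest=True)
```

Without the filter, Noe is still suggested but ranks below Moe; with it, only Moe remains, which is the behaviour the original question asks for.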

Resources