Neo4j similar paths

I want to write a query that takes all users (without a precondition such as a list of user ids) and finds the most common similar paths, for example the top 10 user flows.
For example:
User u1 has events: a,b,c,d
User u2 has events: b,d,e
Each event is a node with a property event-type.
The result should look like:
[a,b,e] - 100 users
[a,c,f] - 80 users
[b,d,t] - 50 users
.......
The data that generated the first aggregated row in the result could be, for example:
user 1: a,b,c,e
user 2: a,b,e,f
.........
user 100: a,c,t,b,g,e
I wonder if this link can help:
http://neo4j.com/docs/stable/rest-api-graph-algos.html#rest-api-execute-a-dijkstra-algorithm-with-equal-weights-on-relationships

Here is a Cypher query that returns all the Event nodes that user 1 and user 2 have in common (in a single row):
MATCH (u1:User {id: 1}) -[:HAS]-> (e:Event) <-[:HAS]- (u2:User {id: 2})
RETURN u1, u2, COLLECT(e);
[Added by MichaelHunger; modified by cybersam] For your additional question try:
// Specify the user ids of interest. This would normally be a query parameter.
WITH [1,2,3] as ids
MATCH (u1:User) -[:HAS]-> (e:Event)
// Only match events for users with one of the specified ids.
WHERE u1.id IN ids
// Count # of distinct user ids per event, and count # of input ids
WITH e, size(collect(distinct u1.id)) as n_users, size(ids) AS n_ids
// Only match when the 2 counts are the same
WHERE n_users = n_ids
RETURN e;
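
Neither query above ranks flows across all users, which is what the question asks for. As a rough sketch of that part, under several assumptions that are not in the question: each Event carries a hypothetical `ts` property that orders a user's events, the event type is stored under the property name `event-type`, and a "path" means any ordered triple of events that need not be adjacent (as in the example rows, where user 100's a,c,t,b,g,e counts toward [a,b,e]). It also relies on collect() keeping the order set by the preceding ORDER BY.

```cypher
// Build each user's ordered event flow. `ts` is an assumed ordering property.
MATCH (u:User)-[:HAS]->(e:Event)
WITH u, e ORDER BY e.ts
WITH u, collect(e.`event-type`) AS flow
// Enumerate every ordered triple of positions i < j < k within the flow.
// Flows shorter than three events produce empty ranges and drop out.
UNWIND range(0, size(flow) - 3) AS i
UNWIND range(i + 1, size(flow) - 2) AS j
UNWIND range(j + 1, size(flow) - 1) AS k
// Count each triple at most once per user, then rank by number of users.
WITH DISTINCT u, [flow[i], flow[j], flow[k]] AS path
RETURN path, count(u) AS users
ORDER BY users DESC
LIMIT 10;
```

The number of triples grows cubically with the length of a flow, so for long event histories it would probably be necessary to cap the flow length or restrict the triples to adjacent events.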

Related

Preserving a query result for the duration of the query in Neo4j Cypher

I am using Neo4j 3.5.2 Desktop with Node.js. I am trying to update a user record's properties and add/remove relationships with other nodes in the same query.
My query looks like this:
MATCH (user:Dealer {email: $paramObj.email})
SET user += apoc.map.clean($paramObj, ["email","vehicles"],[])
WITH user, $paramObj.vehicles AS vehicles
UNWIND vehicles AS vehicle
MATCH(v:Vehicles {name:vehicle})
MERGE (user)-[r:SUPPLY_PARTS_FOR]->(v)
ON CREATE SET r.since = timestamp()
WITH vehicles,user
MATCH (user)-[r:SUPPLY_PARTS_FOR]->(v)
WHERE NOT apoc.coll.contains(vehicles,v.name)
DELETE r
WITH $paramObj.email AS dealeremail
MATCH (user:Dealer {email: dealeremail})
RETURN user
The issue is that the query returns an empty 'user' array whenever the MATCH that looks for vehicle relationships (r) to delete yields zero rows.
How do I preserve the original 'user' result, or keep the email address so I can redo the match? I tried WITH $paramObj.email AS dealeremail, but it seems I cannot carry dealeremail forward either, although I thought I could.

The problem comes from the MATCH returning zero rows: once a clause yields no rows, nothing after it has a row to work on, so the final RETURN is empty. It dawned on me that an OPTIONAL MATCH, by contrast, still returns a single row with null values when nothing matches. So I changed the MATCH that searches for a relationship to delete into an OPTIONAL MATCH:
MATCH (user:Dealer {email: $paramObj.email})
SET user += apoc.map.clean($paramObj, ["email","vehicles"],[])
WITH user, $paramObj.vehicles AS vehicles
UNWIND vehicles AS vehicle
MATCH(v:Vehicles {name:vehicle})
MERGE (user)-[r:SUPPLY_PARTS_FOR]->(v)
ON CREATE SET r.since = timestamp()
WITH vehicles,user
OPTIONAL MATCH (user)-[r:SUPPLY_PARTS_FOR]->(v)
WHERE NOT apoc.coll.contains(vehicles,v.name)
DELETE r
RETURN user
This did the trick
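
Two caveats with the query above: after the UNWIND there is one row per vehicle, so RETURN user can repeat the same user, and an empty vehicles list (or a vehicle name that matches no Vehicles node) still produces zero rows. A possible variant, sketched here without having been run against the asker's data, deletes stale relationships first and uses FOREACH for the merge so that the user row always survives:

```cypher
MATCH (user:Dealer {email: $paramObj.email})
SET user += apoc.map.clean($paramObj, ["email", "vehicles"], [])
WITH user, $paramObj.vehicles AS vehicles
// Remove relationships to vehicles no longer in the list.
// OPTIONAL MATCH keeps the row (with r = null) when there is nothing to delete.
OPTIONAL MATCH (user)-[r:SUPPLY_PARTS_FOR]->(old:Vehicles)
WHERE NOT old.name IN vehicles
DELETE r
// Collapse back to a single row per user before adding relationships.
WITH DISTINCT user, vehicles
OPTIONAL MATCH (v:Vehicles) WHERE v.name IN vehicles
// FOREACH over a zero- or one-element list acts as a conditional MERGE.
FOREACH (x IN CASE WHEN v IS NULL THEN [] ELSE [v] END |
  MERGE (user)-[s:SUPPLY_PARTS_FOR]->(x)
  ON CREATE SET s.since = timestamp())
RETURN DISTINCT user
```

Deleting a null relationship is a no-op in Cypher, which is why the DELETE after the OPTIONAL MATCH is safe.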

Cypher return a node with the given neighbors

I'm trying to write a Cypher query that finds a single node with the label :CONVERSATION that has a given set of neighbor nodes. The neighbor nodes are users with the label :USER and a property called "username".
The query is given a list of usernames, and the goal is to find a conversation node that has as neighbors all the users whose username is in that list.
I have tried some queries, but they don't return what I want. Does anyone have an idea what the query might look like?

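Assuming you pass the usernames as a {users} parameter (newer Neo4j versions write it as $users) and the relationship between users and conversations is named IN_CONVERSATION (adjust the labels and the name property to your schema; the question uses :CONVERSATION, :USER and username):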
MATCH (c:Conversation)
WHERE ALL( x IN {users} WHERE (:User {name:x} )-[:IN_CONVERSATION]->(c) )
RETURN c
If you want to test the query in the Neo4j browser, for example, you can simulate the parameter with WITH:
WITH ["adam","john","sally"] AS users
MATCH (c:Conversation)
WHERE ALL( x IN users WHERE (:User {name:x} )-[:IN_CONVERSATION]->(c) )
RETURN c
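Another solution is to match the users first and collect them before testing each conversation (the aggregation has to happen in a WITH, since collect() cannot be used inside WHERE):
MATCH (u:User) WHERE u.name IN {users}
WITH collect(u) AS us
MATCH (c:Conversation)
WHERE ALL( x IN us WHERE (x)-[:IN_CONVERSATION]->(c) )
RETURN c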
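
A counting formulation is another option. The sketch below assumes the same labels, the same IN_CONVERSATION relationship, that name values are unique per user, and that the input list has no duplicates:

```cypher
WITH ["adam", "john", "sally"] AS users
MATCH (u:User)-[:IN_CONVERSATION]->(c:Conversation)
WHERE u.name IN users
// A conversation qualifies when it matched as many distinct users as were requested.
WITH c, users, count(DISTINCT u) AS matched
WHERE matched = size(users)
RETURN c;
```

Like the ALL-based queries, this returns conversations that contain all the listed users, whether or not other users take part as well.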

Neo4j Get nodes and node count simultaneously

I'm relatively new to Neo4j, so I apologize if there is an obvious answer to this question. I have a database with User nodes, Account nodes and ASSIGNED_TO relationships between them. I have a query (below) that gets the users and their assigned accounts, but I also want a count of all users found in the same query, regardless of the SKIP/LIMIT result. What seems to be happening is that the user count is based on the OPTIONAL MATCH result, not on the result of the MATCH.
I have 3 users and 3 accounts in the database with 2 users assigned to 2 accounts and one user assigned only to one account.
This is the query:
MATCH (user:User)
WITH user
OPTIONAL MATCH (user)-[assigned:ASSIGNED_TO]-(account:Account)
RETURN user, count(user) as userCount, collect(account) as accounts
SKIP 0 LIMIT 25
This is the result:
user      userCount   accounts
{id: 2}   1           [{id: 2}]
{id: 1}   2           [{id: 2}, {id: 1}]
{id: 3}   2           [{id: 1}, {id: 3}]
I want the userCount value to be 3 for all rows. If I change 'count(user)' to 'count(DISTINCT user)' I get 1 for userCount. I want to avoid running 2 separate queries if possible.

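In the original RETURN, user is the grouping key, so count(user) counts the rows the OPTIONAL MATCH produced for that one user (one per account). Counting before the OPTIONAL MATCH and carrying the result forward avoids this; a collect-unwind pair should do the trick: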
MATCH (user:User)
WITH collect(user) as users, count(DISTINCT user) as userCount
UNWIND users as user
OPTIONAL MATCH (user)-[assigned:ASSIGNED_TO]-(account:Account)
RETURN user, userCount, collect(account) as accounts
SKIP 0 LIMIT 25

// Get user count
MATCH (user:User) WITH count(user) as userCount
// Get user
MATCH (user:User)
// To optimize a query, first apply the pagination
WITH user, userCount SKIP 0 LIMIT 25
// The other part of query
OPTIONAL MATCH (user)-[assigned:ASSIGNED_TO]-(account:Account)
RETURN user,
userCount,
collect(distinct account) as accounts

Neo4j: match a node with multiple nodes connected to it, return them as an array

For this example, assume a User node has many friends (they are Users as well).
Let's also assume that I'm looking up the user by id, which is unique.
How can I query to get one row back with the friends as an array property?
Example:
MATCH (user:User {id: "some-id"})-[:FriendsWith]->(friend:User)
RETURN user, friend
I expected the result to be an array of length one,
like this: [{user: data, friend: [array of users]}]
But instead I got rows like [{user: , friend: }, {user: , friend: }]
with the user duplicated in each row.

You can use the collect function to create a collection:
MATCH (user:User {id: "some-id"})-[:FriendsWith]->(friend:User)
RETURN user, collect(friend.name) AS friends
Aggregation functions group implicitly: every non-aggregated expression in the RETURN (here, user) acts as the grouping key, so you get one row per user.
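
To get exactly one row shaped like the [{user, friends}] result the question asks for, including when the user has no friends at all, one possible variant (a sketch, not run against the asker's data) uses OPTIONAL MATCH and packs the result into a map:

```cypher
MATCH (user:User {id: "some-id"})
// OPTIONAL MATCH keeps the user row even when there are no friends.
OPTIONAL MATCH (user)-[:FriendsWith]->(friend:User)
// collect() skips nulls, so a friendless user gets an empty list.
WITH user, collect(friend) AS friends
RETURN {user: user, friends: friends} AS result
```

Aggregating in a WITH before building the map keeps the grouping key explicit.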

server requirements and optimizations for my data model ( 1.5 billion relationships )

My data model is fairly simple:
(n:User)-[:WANTS]->(c:Card)<-[:HAS]-(o:User)
Whenever a user updates a card in his wants list, I create outgoing :FOLLOWS connections to users who also have that card in their haves list. At the same time, I also create incoming :FOLLOWS connections from users who want cards in the user's have list like so:
// update my total wants
MATCH (u:User)-[w:WANTS]-()
WHERE u.id = 1
WITH u, SUM(w.qty) AS wqty
SET u.wqty = wqty
RETURN wqty;
// delete all my incoming and outgoing follows
MATCH (u1:User {id: 1})-[f:FOLLOWS]-() DELETE f;
// outgoing follows
MATCH (u1:User)-[w:WANTS]->(c:Card)<-[h:HAS]-(u2:User)
WHERE u1.id = 1 AND u1.id <> u2.id
WITH u1, u2, (CASE WHEN h.qty > w.qty THEN w.qty ELSE h.qty END) AS haves
WITH u1.id AS id1, u2.id AS id2, SUM(haves) as weight
MATCH (uf:User), (ut:User)
WHERE uf.id = id1 AND ut.id = id2
MERGE (uf)-[f:FOLLOWS {weight: weight}]->(ut)
ON MATCH SET f.weight = weight;
// incoming follows
MATCH (u1:User)-[h:HAS]->(c:Card)<-[w:WANTS]-(u2:User)
WHERE u1.id = 1 AND u1.id <> u2.id
WITH u1, u2, (CASE WHEN h.qty > w.qty THEN w.qty ELSE h.qty END) AS haves
WITH u1.id AS id1, u2.id AS id2, SUM(haves) as weight
MATCH (uf:User), (ut:User)
WHERE uf.id = id1 AND ut.id = id2
MERGE (uf)<-[f:FOLLOWS {weight: weight}]-(ut)
ON MATCH SET f.weight = weight;
I decided to include this hard-coded :FOLLOWS relationship every time a user updates something in his inventory because I tried querying the trade potential based on the cards they had and the query was very expensive. This way the users will be able to check trade potentials by doing the following query:
MATCH (u1:User {id: 1})-[f1:FOLLOWS]->(u2:User)-[f2:FOLLOWS]->(u1)
RETURN u2.id, f1.weight AS num_cards_i_need, f2.weight AS num_cards_they_need
This works very fast on my test database, which only has the incoming/outgoing follow relationships calculated for 1 user.
Now on to the problem. I have a small number of nodes: 50k users and 14k cards. However, each user follows 30k other users on average, making roughly 1.5 billion relationships. The data store for this is expected to be around 20-30GB once I'm done loading it into Neo4j.
My question is: do I need to be able to load the whole database into memory in order to achieve fast reads as well as fast and frequent writes of the follows relationships? Let's say I don't have the resources to rent a large-memory instance on Amazon and I'm limited to conventional server hardware. What optimizations do I need so that I can read and write the :FOLLOWS relationships very fast?
I obviously have memory for the nodestore and I also have some memory for relationshipstores for users->cards relationship, but not for users->users relationship. Can I choose which ones get loaded to memory so in effect they are "warm"?
