Cypher UNION - How to apply to collected lists - neo4j

Consider the following use of the Cypher UNION clause:
MATCH (user:User)-[]-(org:Organization)
WHERE org.size > 100
RETURN collect({
user.name,
user.age
}) AS userList
UNION
MATCH (user:User)-[]-(family:Family)
WHERE family.mood = "Happy"
RETURN collect({
user.name,
user.age
}) AS userList
The UNION does not work: this query returns users only from the first MATCH. I suspect it's because of the collect statements; however, the project's design requires the data to be collected. Is there a way to create a union of the collections, or perhaps to collect after the union?

Your query will work just fine, except that you should 1) return a valid map (dictionary) literal and 2) wrap the UNION in a CALL subquery, so that you can collect after the union. A valid map looks like this:
RETURN {
name: user.name,
age: user.age
} AS userList
See sample below:
CALL {
MATCH (user:user{id:"some_id"})
RETURN {
id: user.id,
age: user.age
} AS userList
UNION
MATCH (user:user{id:"some_id2"})
RETURN {
id: user.id,
age: user.age
} AS userList
}
RETURN collect(userList) as userList
Result:
╒══════════════════════════════════════════════════════════╕
│"userList" │
╞══════════════════════════════════════════════════════════╡
│[{"id":"some_id","age":null},{"id":"some_id2","age":null}]│
└──────────────────────────────────────────────────────────┘
I am using neo4j version 4.4.3
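Applied to the query from the question, the same approach might look like the sketch below. It assumes Neo4j 4.0 or later (for CALL subqueries) and the labels and properties shown in the question; the name userEntry is only illustrative.

```cypher
// Each branch returns one map per user; UNION removes identical maps
CALL {
  MATCH (user:User)-[]-(org:Organization)
  WHERE org.size > 100
  RETURN {name: user.name, age: user.age} AS userEntry
  UNION
  MATCH (user:User)-[]-(family:Family)
  WHERE family.mood = "Happy"
  RETURN {name: user.name, age: user.age} AS userEntry
}
// Collect only after the union has been taken
RETURN collect(userEntry) AS userList
```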

You can use apoc.coll.union from the APOC library to create a union of two lists, like this:
MATCH (user:User)-[]-(org:Organization)
WHERE org.size > 100
WITH collect({
name: user.name,
age: user.age
}) AS userList1
MATCH (user:User)-[]-(family:Family)
WHERE family.mood = "Happy"
WITH userList1, collect({
name: user.name,
age: user.age
}) AS userList2
RETURN apoc.coll.union(userList1, userList2) AS userList
The function apoc.coll.union will not include duplicates. If you want to include duplicates, use apoc.coll.unionAll.
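A quick way to see the difference between the two functions is to call them on literal lists. This is only a sketch and assumes the APOC library is installed; the list values and aliases are made up for illustration.

```cypher
// Literal lists stand in for the collected user lists
WITH [1, 2, 3] AS listA, [3, 4] AS listB
RETURN apoc.coll.union(listA, listB)    AS withoutDuplicates, // 3 appears once
       apoc.coll.unionAll(listA, listB) AS withDuplicates     // 3 appears twice
```

The order of the elements returned by apoc.coll.union should not be relied upon.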

Related

Cypher join informations from different tables into a single one

I'm new to Cypher and I'm struggling with this problem:
I have these two queries
MATCH (u:UserNode)-[:PROMOTER_OF*1..]->(c:UserNode)
WHERE u.promoterActualRole IN ["GOLD","RUBY","SAPPHIRE","BRONZE","EMERALD", "DIAMOND"]
AND datetime(c.promoterStartActivity) >= datetime("2021-02-01T00:00:00Z")
AND datetime(c.promoterStartActivity)<= datetime("2021-05-31T23:59:59Z")
AND c.promoterEnabled = true
AND u.firstName="Gianvito"
WITH distinct u as user, count(c) as num_promoter
WHERE num_promoter >= 150
RETURN user.firstName as name, user.email as email, num_promoter
which returns a table like this:
name | email | num_promoter
Gianvito | gianvito#email.com | 1475
and
MATCH (u:UserNode)-[:PROMOTER_OF*1..]->(c:UserNode)
WHERE u.promoterActualRole IN ["GOLD","RUBY","SAPPHIRE","BRONZE","EMERALD", "DIAMOND"]
AND datetime(c.subscriptionDate) >= datetime("2021-02-01T00:00:00Z")
AND datetime(c.subscriptionDate)<= datetime("2021-05-31T23:59:59Z")
AND c.kycStatus = "OK"
AND u.firstName="Gianvito"
WITH distinct u as user, count(c) as num_swaggy
WHERE num_swaggy >= 1
RETURN user.firstName as name, user.email as email , num_swaggy
name | email | num_swaggy
Gianvito | gianvito#email.com | 1820
I would like to merge these two results into a single table.
I tried a UNION, but that way I can only get a single table with two different rows, with the common information duplicated and null for the values that are not present.
How can I obtain a table like this one?
name | email | num_promoter | num_swaggy
Gianvito | gianvito#email.com | 1475 | 1820
If you're using Neo4j 4.x or higher, you can UNION the results of the queries in a subquery, and outside of it perform a sum() to get the results into a single row per user:
CALL {
MATCH (u:UserNode)-[:PROMOTER_OF*1..]->(c:UserNode)
WHERE u.promoterActualRole IN ["GOLD","RUBY","SAPPHIRE","BRONZE","EMERALD", "DIAMOND"]
AND datetime(c.promoterStartActivity) >= datetime("2021-02-01T00:00:00Z")
AND datetime(c.promoterStartActivity)<= datetime("2021-05-31T23:59:59Z")
AND c.promoterEnabled = true
AND u.firstName="Gianvito"
WITH u as user, count(c) as num_promoter
WHERE num_promoter >= 150
RETURN user, num_promoter, 0 as num_swaggy
UNION
MATCH (u:UserNode)-[:PROMOTER_OF*1..]->(c:UserNode)
WHERE u.promoterActualRole IN ["GOLD","RUBY","SAPPHIRE","BRONZE","EMERALD", "DIAMOND"]
AND datetime(c.subscriptionDate) >= datetime("2021-02-01T00:00:00Z")
AND datetime(c.subscriptionDate)<= datetime("2021-05-31T23:59:59Z")
AND c.kycStatus = "OK"
AND u.firstName="Gianvito"
WITH u as user, count(c) as num_swaggy
WHERE num_swaggy >= 1
RETURN user, 0 as num_promoter, num_swaggy
}
WITH user, sum(num_promoter) as num_promoter, sum(num_swaggy) as num_swaggy
RETURN user.firstName as name, user.email as email , num_promoter, num_swaggy
Also you don't need to use DISTINCT when you're performing any aggregation, since the grouping key will become distinct automatically as a result of the aggregation.
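As a small illustration of that last point, the sketch below aggregates over literal values only (no graph data is assumed). Each distinct value of the grouping key produces exactly one row, so adding DISTINCT would change nothing.

```cypher
// 'a' occurs twice in the input but yields a single row after aggregation
UNWIND ['a', 'a', 'b'] AS name
WITH name, count(*) AS occurrences
RETURN name, occurrences
ORDER BY name
// rows: 'a' with 2, 'b' with 1
```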

Neo4j Cypher query for a linked list to conditionally create a NEWEST_REPLY vs modifying NEWEST_REPLY to NEXT_REPLY

I have a linked list of replies off of a Post, with the "first" reply in the list having a NEWEST_REPLY relationship to the Post and subsequent replies having a NEXT_REPLY relationship. The query to get that graph:
MATCH (p:Post {id: $postId})-[:NEWEST_REPLY|NEXT_REPLY*]->(r:Reply)
return p, r
I want to create a Cypher query that either:
1) Creates a reply and creates the NEWEST_REPLY relationship when there are no replies, OR
2) Creates a reply, deletes the current NEWEST_REPLY relationship, creates a NEXT_REPLY relationship to the previous NEWEST_REPLY and a NEWEST_REPLY relationship to the new Reply.
This statement:
MATCH (p:Post {id: $postId})-[rel:NEWEST_REPLY]->(previousNewestReply:Reply)
DELETE rel
CREATE (r:Reply { id: apoc.create.uuid(), body: $body, createdAt: datetime(), updatedAt: datetime() })
WITH r, p, previousNewestReply
MATCH (u:User)
WHERE u.id = $userId
CREATE (r)<-[:WROTE]-(u)
CREATE (r)<-[:NEWEST_REPLY]-(p)
CREATE (r)-[:NEXT_REPLY]->(previousNewestReply)
RETURN u, p
achieves number 2.
What I now need to do is conditionally run this statement if the rel in MATCH (p:Post {id: $postId})-[rel:NEWEST_REPLY]->(previousNewestReply:Reply) exists, but if it does not exist, just create NEWEST_REPLY for the first time as well as creating the reply and the User-[:WROTE]->Reply relationship. I'm new to cypher and digging into MERGE, CASE, predicate functions and apoc.when() and not sure which would be the simplest and most appropriate.
Here's an attempt at using CASE:
MATCH (p:Post {id: "db7ee38c-fe60-430e-a7c7-0b2514401343"})
RETURN
CASE EXISTS( (p)-[rel:NEWEST_REPLY]->(replies:Reply) )
WHEN true THEN DELETE rel CREATE (r:Reply { id: apoc.create.uuid(), body: "new with CASE1", createdAt: datetime(), updatedAt: datetime() }) WITH r, p, replies MATCH (u:User) WHERE u.id = "e14d409e-d970-4c5c-9cc7-3b224c774835" CREATE (r)<-[:WROTE]-(u) CREATE (r)<-[:NEWEST_REPLY]-(p) CREATE (r)-[:NEXT_REPLY]->(replies)
WHEN false THEN CREATE (r:Reply { id: apoc.create.uuid(), body: "new with CASE2", createdAt: datetime(), updatedAt: datetime() }) WITH r, p, previousNewestReply MATCH (u:User) WHERE u.id = "e14d409e-d970-4c5c-9cc7-3b224c774835" CREATE (r)<-[:WROTE]-(u) CREATE (r)<-[:NEWEST_REPLY]-(p) END
AS result;
And running into the following SyntaxError:
Invalid input 'r': expected whitespace, comment, '{', node labels, MapLiteral, a parameter, a parameter (old syntax), a relationship pattern, '(', '.', '[', '^', '*', '/', '%', '+', '-', "=~", IN, STARTS, ENDS, CONTAINS, IS, '=', '~', "<>", "!=", '<', '>', "<=", ">=", AND, XOR, OR, WHEN, ELSE or END (line 4, column 24 (offset: 145))
"WHEN true THEN DELETE rel CREATE (r:Reply { id: apoc.create.uuid(), body: "new with CASE1", createdAt: datetime(), updatedAt: datetime() }) WITH r, p, replies MATCH (u:User) WHERE u.id = "e14d409e-d970-4c5c-9cc7-3b224c774835" CREATE (r)<-[:WROTE]-(u) CREATE (r)<-[:NEWEST_REPLY]-(p) CREATE (r)-[:NEXT_REPLY]->(replies)"
My sense is that the logic I am attempting in either THEN statement is too complex for a CASE. Is there a more appropriate way to essentially do an if/else based on whether or not the NEWEST_REPLY relationship exists off of a specific Post?
[UPDATED]
This query should work for you:
MATCH (p:Post), (u:User)
WHERE p.id = $postId AND u.id = $userId
OPTIONAL MATCH (p)-[rel:NEWEST_REPLY]->(prevNewest:Reply)
CREATE (u)-[:WROTE]->(r:Reply {id: apoc.create.uuid(), body: "foo", createdAt: datetime(), updatedAt: datetime()})<-[:NEWEST_REPLY]-(p)
FOREACH(_ IN CASE WHEN rel IS NOT NULL THEN [1] END | DELETE rel CREATE (r)-[:NEXT_REPLY]->(prevNewest))
I assume postId and userId are passed as parameters. Also, you should create indexes on :Post(id) and :User(id) to speed up the query.
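The indexes could be created roughly as follows. This sketch assumes the Neo4j 4.x (or later) syntax, and the index names post_id and user_id are arbitrary; on Neo4j 3.x the older form CREATE INDEX ON :Post(id) applies instead.

```cypher
// One single-property index per lookup used in the query
CREATE INDEX post_id FOR (p:Post) ON (p.id);
CREATE INDEX user_id FOR (u:User) ON (u.id);
```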
You can do this to delete any existing [:NEWEST_REPLY] rels:
MATCH (p:Post {id: $postId})
OPTIONAL MATCH (p)-[rel:NEWEST_REPLY]->(previousNewestReply:Reply)
WITH p,previousNewestReply,
// create a collection of size 1 or 0
CASE WHEN NOT rel IS NULL THEN [rel] ELSE [] END AS toBeDeleted
// loop through the collection
FOREACH( tbd IN toBeDeleted | DELETE tbd )
WITH p,previousNewestReply
.....

CQL Syntax exception

Neo.ClientError.Statement.SyntaxError: Invalid input ')': expected
whitespace or a relationship pattern (line 66, column 100 (offset:
1898)) "CREATE (z:Subscription{ subscriptionId: subs.subscriptionId,
startDate: subs.startDate, endDate:''})<-[r:ASSOCIATION]-(y:Person
{nationalIdentityNumber: subs.nationalIdentityNumber, name: subs.name,
surname: subs.surname, fathername: subs.fathername , nationality:
subs.nationality, passportNo: subs.passportNo, birthdate:
subs.birthdate})"
I want to create/merge nodes and relationships of the types Person, Subscription and Line.
If the same subscription already exists, I should check its startDate: if the new data's start date is later than the old one's, I should create a new Subscription and also set the old subscription's end date.
UNWIND [{
msisdn:'99658321564',
name:'Lady',
surname:'Camble',
fatherName:'Aeron',
nationality:'EN',
passportNo:'PN-1234224',
birthDate:'12-05-1979',
nationalIdentityNumber:'112124224',
subscriptionId:'2009201999658321564',
startDate:'20-09-2019 12:00:12'
},{msisdn:'99658363275',
name:'John',
surname:'Mckeen',
fatherName:'Frank',
nationality:'EN',
passportNo:'PN-126587',
birthDate:'15-08-1998',
nationalIdentityNumber:'2548746542',
subscriptionId:'1506201999658363275',
startDate:'15-06-2019 13:00:12'}
{
msisdn:'99658321564',
name:'Lady',
surname:'Camble',
fatherName:'Aeron',
nationality:'EN',
passportNo:'PN-1234224',
birthDate:'12-05-1979',
nationalIdentityNumber:'112124224',
subscriptionId:'2009201999658321564',
startDate:'31-11-2019 12:00:12'
}
] as subs
MERGE (y:Person {nationalIdentityNumber: subs.nationalIdentityNumber, name: subs.name, surname: subs.surname, fathername: subs.fathername , nationality: subs.nationality, passportNo: subs.passportNo, birthdate: subs.birthdate })
MERGE (t:Subscription{subscriptionId:subs.subscriptionId })
MERGE (y)-[rel:ASSOCIATION]-(t)
ON MATCH SET
t.endDate = (case when t.startDate <subs.startDate then subs.startDate else ''
end)
MATCH (t:Subscription) where t.subscriprionId=subs.subscriprionId and
(CASE
WHEN t.endDate=subs.startDate then
CREATE (z:Subscription{ subscriptionId: subs.subscriptionId, startDate: subs.startDate, endDate:''})-[r:ASSOCIATION]-(y:Person {nationalIdentityNumber: subs.nationalIdentityNumber, name: subs.name, surname: subs.surname, fathername: subs.fathername , nationality: subs.nationality, passportNo: subs.passportNo, birthdate: subs.birthdate})
END)
RETURN y
UNWIND[...] as subs
MERGE (y:Person {nationalIdentityNumber: subs.nationalIdentityNumber, name: subs.name, surname: subs.surname, fatherName: subs.fatherName , nationality: subs.nationality, passportNo: subs.passportNo, birthDate: subs.birthDate })
MERGE (t:Subscription{subscriptionId:subs.subscriptionId,startDate:subs.startDate,endDate:''})
MERGE (y)-[rel:ASSOCIATION]-(t)
MERGE(x:Subscription{subscriptionId:subs.subscriptionId, endDate:''})
SET
x.endDate = (case when x.startDate < subs.startDate then subs.startDate else null end);
The CQL should look like this. Thanks to my co-worker.
You're trying to have conditional Cypher clauses through a CASE statement, and that won't work. You can't do a nested CREATE (or any other Cypher clause) in a CASE.
You can, however, use a trick with FOREACH and CASE to mimic an if conditional. That should work in your case, as you only want to execute a CREATE under certain conditions. Since you have already matched the y node for the person, just reuse (y) in that CREATE instead of trying to define the entire node again from labels and properties; that won't work properly.
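A minimal sketch of the FOREACH and CASE trick, independent of the subscription data: the :Demo label, the score value and the highScore property are all hypothetical. The CASE yields a one-element list when the condition holds and an empty list otherwise, so the body of the FOREACH runs either once or not at all.

```cypher
WITH 5 AS score
MERGE (n:Demo {id: 1})
// Runs the SET once if score > 3, otherwise iterates over [] and does nothing
FOREACH (_ IN CASE WHEN score > 3 THEN [1] ELSE [] END |
  SET n.highScore = true
)
RETURN n.highScore AS highScore
```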
If you need more advanced conditional logic, that's available via the conditional procedures in APOC Procedures.
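One of those conditional procedures is apoc.do.when, which runs one of two write queries depending on a condition. The sketch below assumes APOC is installed; the :Demo label and the score value are hypothetical and not taken from the question.

```cypher
WITH 5 AS score
CALL apoc.do.when(
  score > 3,
  // executed when the condition is true
  'CREATE (n:Demo {kind: "high", score: score}) RETURN n',
  // executed when the condition is false
  'CREATE (n:Demo {kind: "low", score: score}) RETURN n',
  // values made available to the inner queries
  {score: score}
) YIELD value
RETURN value.n AS n
```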

Neo4j - UNION of 3 different queries

I have a problem with one composed query, which has three parts:
1) Get direct friends
2) Get friends of friends
3) Get others - just fill up space to limit
So it should always return a limited number of users, ordered by direct friends, friends of friends and others. The first two parts are very fast, no problem there, but the last part is slow and gets slower as the db grows in size. There are indexes on Person.number and Person.createdAt.
Does anyone have an idea how to improve or rewrite this query, to be more performant?
MATCH (me:Person { number: $number })-[r:KNOWS]-(contact:Person { registered: "true" }) WHERE contact.number <> $number AND (r.state = "contact" OR r.state = "declined")
MATCH (contact)-[:HAS_AVATAR]-(avatar:Avatar { primary: true })
WITH contact, avatar
RETURN contact AS friend, avatar, contact.createdAt AS rank
ORDER BY contact.createdAt DESC
UNION
MATCH (me:Person { number: $number })-[:KNOWS]-(friend)-[:KNOWS { state: "accepted" }]-(friend_of_friend:Person { registered: "true" }) WHERE NOT friend.username = 'default' AND NOT (me)-[:KNOWS]-(friend_of_friend)
MATCH (friend_of_friend)-[:HAS_AVATAR]-(avatar:Avatar { primary: true })
OPTIONAL MATCH (friend_of_friend)-[rel:KNOWS]-(friend)
RETURN friend_of_friend AS friend, avatar, COUNT(rel) AS rank
ORDER BY rank DESC
UNION
MATCH (me:Person { number: $number })
MATCH (others:Person { registered: "true" }) WHERE others.number <> $number AND NOT (me)-[:KNOWS]-(others) AND NOT (me)-[:KNOWS]-()-[:KNOWS { state: "accepted" }]-(others:Person { registered: "true" })
MATCH (others)-[:HAS_AVATAR]->(avatar:Avatar { primary: true })
OPTIONAL MATCH (others)-[rel:KNOWS { state: "accepted" }]-()
WITH others, rel, avatar
RETURN others AS friend, avatar, COUNT(rel) AS rank
ORDER BY others.createdAt DESC
SKIP $skip
LIMIT $limit
Here are some profiles:
https://i.stack.imgur.com/LfNww.png
https://i.stack.imgur.com/0EO0r.png
The final solution was to break the whole query into three and call them separately. In our case it won't reach the 3rd query 99% of the time, and the first two are super fast. It seems that even when it reaches the 3rd stage it is still fast, so maybe the UNION was what slowed the whole thing down the most.
const contacts = await this.neo4j.readQuery(`...`);
if (contacts.records.length < limit) {
  const friendOfFriend = await this.neo4j.readQuery(`...`);
  if (contacts.records.length + friendOfFriend.records.length < limit) {
    const others = await this.neo4j.readQuery(`...`);
  }
}
// merge all results
You're doing a lot of work in that third query before the limit. You may want to move the ordering and LIMIT up sooner.
It's also going to be more efficient to pre-match to the friends (and friends of friends) in a single MATCH pattern; we can use *0..1 as an optional relationship to a potential next node.
And just a bit of style advice, I find it a good idea to reserve plurals for lists/collections and otherwise use singular, as you will only have a single one of those nodes per row.
Try this out for the third part:
MATCH (me:Person { number: $number })
OPTIONAL MATCH (me)-[:KNOWS]-()-[:KNOWS*0..1 { state: "accepted" }]-(other:Person {registered:"true"})
WITH collect(DISTINCT other) as excluded
MATCH (other:Person { registered: "true" }) WHERE other.createdAt < dateTime() AND other.number <> $number AND NOT other IN excluded
WITH other
ORDER BY other.createdAt DESC
SKIP $skip
LIMIT $limit
MATCH (other)-[:HAS_AVATAR]->(avatar:Avatar { primary: true })
WITH other, avatar, size((other)-[:KNOWS { state: "accepted" }]-()) AS rank
RETURN other AS friend, avatar, rank
If we know the type of createdAt then we can add a modification that may trigger index-backed ordering which could improve this.
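To illustrate the *0..1 pattern on its own, here is a stripped-down sketch that reuses the Person label, KNOWS relationship and $number parameter from the question and leaves out the state and registered filters. With zero hops, other is the same node as friend (a direct friend); with one hop, other is a friend of a friend.

```cypher
// One pattern covers both direct friends (0 hops) and friends of friends (1 hop)
MATCH (me:Person {number: $number})-[:KNOWS]-(friend:Person)-[:KNOWS*0..1]-(other:Person)
RETURN DISTINCT other
```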

Cypher query and multi-pass references

I have the following Cypher query that looks for the Permission for User via Role:
MATCH (p:Permission)<-[:CONTAINS]-(r:Role)<-[:HAS]-(u:User)
WHERE u.id = {userId} AND p.type = {permissionType} AND p.code = {permissionCode}
RETURN p
This query works fine.
Also, the User can have a direct relationship with the Permission:
(p:Permission)<-[:HAS]-(u:User)
How to extend the original query in order to also look for the Permission that is directly associated with the User?
You can try this:
MATCH (p:Permission)<-[:HAS|:CONTAINS*1..2]-(u:User)
WHERE u.id = {userId} AND p.type = {permissionType} AND p.code = {permissionCode}
RETURN p
Cheers
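Note that the {param} parameter syntax used above was removed in Neo4j 4.0. On 4.0 or later, the same idea could be written with $param syntax, as in the sketch below. The DISTINCT is an addition, on the assumption that the same Permission may be reachable both directly and via a Role.

```cypher
MATCH (p:Permission)<-[:HAS|CONTAINS*1..2]-(u:User)
WHERE u.id = $userId AND p.type = $permissionType AND p.code = $permissionCode
// DISTINCT avoids returning the same Permission once per matching path
RETURN DISTINCT p
```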
