Returning tasks for each session in Neo4J (list of lists) - neo4j

I'm using Neo4J for a mentor platform I'm building and I'm stumped by the following:
Given the following nodes and properties:
Mentor{ login, ... }
Mentee{ login, ... }
Session{ notes, ... }
Task{ complete, name }
And the following associations:
// each session has 1 mentor and 1 mentee
(Mentor)<-[:HAS]-(Session)-[:HAS]->(Mentee)
// each task is FOR one person (a Mentor or Mentee)
// each task is FROM one Session
(Session)<-[:FROM]-(Task)-[:FOR]->(Mentor or Mentee)
What's the best way to query this data to produce an API response in the following shape? Similarly, is this a reasonable way to model the data? Maybe something with coalesce?
{
  mentor: { login: '...', /* ... */ },
  mentee: { login: '...', /* ... */ },
  sessions: [
    {
      notes,
      /* ... */
      mentorTasks: [{ id, name, complete }],
      menteeTasks: [{ id, name, complete }]
    }
  ]
}
I first tried:
MATCH (mentor:Mentor{ github: "mentorlogin" })
MATCH (session:Session)-[:HAS]->(mentee:Mentee{ github: "menteelogin" })
OPTIONAL MATCH (mentor)<-[:FOR]-(mentorTask:Task)-[:FROM]->(session)
OPTIONAL MATCH (mentee)<-[:FOR]-(menteeTask:Task)-[:FROM]->(session)
RETURN
mentor,
mentee,
session,
COLLECT(DISTINCT mentorTask) as mentorTasks,
COLLECT(DISTINCT menteeTask) as menteeTasks
ORDER BY session.date DESC
But that's janky - The mentor and mentee data is returned many times, and it's completely gone if the mentee has no sessions.
This seems more appropriate, but I'm not sure how to fold in the tasks:
MATCH (mentor:Mentor{ github: "mentorlogin" })
MATCH (mentee:Mentee{ github: "menteelogin" })
OPTIONAL MATCH (session:Session)-[:HAS]->(mentee)
OPTIONAL MATCH (mentor)<-[:FOR]-(mentorTask:Task)-[:FROM]->(session)
OPTIONAL MATCH (mentee)<-[:FOR]-(menteeTask:Task)-[:FROM]->(session)
RETURN
mentor,
mentee,
COLLECT(DISTINCT session) as sessions
EDIT: Working, thanks to a prompt response from Graphileon. I made a few modifications:
changed the MATCH statement so it returns the mentor and mentee even if there are no sessions
sorted the sessions by date (most recent first)
returned all node properties, instead of whitelisting
MATCH (mentor:Mentor{ github: $mentorGithub })
MATCH (mentee:Mentee{ github: $menteeGithub })
RETURN DISTINCT {
mentor: mentor{ .*, id: toString(id(mentor)) },
mentee: mentee{ .*, id: toString(id(mentee)) },
sessions: apoc.coll.sortMaps([(mentor:Mentor)<-[:HAS]-(session:Session)-[:HAS]->(mentee:Mentee) |
session{
.*,
id: toString(id(session)),
mentorTasks: [
(session)<-[:FROM]-(task:Task)-[:FOR]->(mentor) |
task{ .*, id: toString(id(task)) }
],
menteeTasks: [
(session)<-[:FROM]-(task:Task)-[:FOR]->(mentee) |
task{ .*, id: toString(id(task)) }
]
}
], "date")
} AS result

Presuming you have data that matches your model, you can do something along these lines, with nested pattern comprehensions:
MATCH (mentor:Mentor)<-[:HAS]-(:Session)-[:HAS]->(mentee:Mentee)
RETURN DISTINCT {
mentor: {id:id(mentor), name: mentor.name},
mentee: {id:id(mentee), name: mentee.name},
sessions: [(mentor:Mentor)<-[:HAS]-(session:Session)-[:HAS]->(mentee:Mentee) |
{ id: id(session),
name: session.name,
mentorTasks: [(session)<-[:FROM]-(task:Task)-[:FOR]->(mentor) |
{id:id(task), name: task.name}
],
menteeTasks: [(session)<-[:FROM]-(task:Task)-[:FOR]->(mentee) |
{id:id(task), name: task.name}
]
}
]
} AS myResult
returning
{
"mentor": {
"name": "Mentor Jill",
"id": 211
},
"sessions": [
{
"menteeTasks": [
{
"id": 223,
"name": "Task D"
},
{
"id": 220,
"name": "Task C"
},
{
"id": 219,
"name": "Task B"
}
],
"name": "Session 1",
"id": 208,
"mentorTasks": [
{
"id": 213,
"name": "Task A"
}
]
}
],
"mentee": {
"name": "Mentee Joe",
"id": 212
}
}
Note that using the pattern comprehensions, you can avoid the OPTIONAL matches. If a pattern comprehension does not find anything, it returns []
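The empty-list behaviour can be pictured outside the database as well. The following is only a rough Python sketch of the same nesting, not how Neo4j evaluates the query; the in-memory `sessions` and `tasks` lists and the `nest` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the graph (illustrative data only).
sessions = [
    {"id": 208, "name": "Session 1"},
    {"id": 209, "name": "Session 2"},  # deliberately has no tasks
]
tasks = [
    {"id": 213, "name": "Task A", "session": 208, "for": "mentor"},
    {"id": 219, "name": "Task B", "session": 208, "for": "mentee"},
    {"id": 220, "name": "Task C", "session": 208, "for": "mentee"},
]


def nest(sessions, tasks):
    """Group tasks under their session, split by who each task is for."""
    by_key = defaultdict(list)
    for t in tasks:
        by_key[(t["session"], t["for"])].append({"id": t["id"], "name": t["name"]})
    return [
        {
            "id": s["id"],
            "name": s["name"],
            # A missing key falls back to [], like an unmatched comprehension.
            "mentorTasks": by_key.get((s["id"], "mentor"), []),
            "menteeTasks": by_key.get((s["id"], "mentee"), []),
        }
        for s in sessions
    ]


result = nest(sessions, tasks)
print(result)
```

A session without tasks still appears in the output, just with two empty lists, which is the shape the API response in the question asks for.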

Related

OData GroupBy and Select

Given the following example table CustomerOrders:
Id  CustomerId  Customer  Product
1   1           Alice     Pizza
2   1           Alice     Pasta
3   2           Bob       Burger
In C# I was able to use the following LINQ query to produce a nice List<Customer> result with a nested orders collection for every customer:
List<CustomerOrders> queryResult = GetCustomerOrders();
return queryResult
.GroupBy(x => x.CustomerId)
.Select(x => new Customer
{
Id = x.First().CustomerId,
Customer = x.First().Customer,
Orders = x.ToList()
})
.ToList();
Now I want to achieve this result directly with an OData query in the client application, to get the following JSON result:
[
{
"id": 1,
"customer": "Alice",
"orders": [ "Pizza", "Pasta" ]
},
{
"id": 2,
"customer": "Bob",
"orders": [ "Burger" ]
}
]
Is there a way to translate this query to OData?
GroupBy in OData is similar to SQL: only the aggregates and the grouping columns are returned, and we lose access to the individual items. Using group by we can therefore return each grouping with a count of its orders, but not the array of orders.
If your schema has a Customer entity and there is a collection navigation property from Customer to Orders, then we do not need to use grouping at all:
~/Customers?$expand=Orders($select=Product)&$select=Id,Name
The output is structured in a similar manner and should resemble something like this:
{
"@odata.context": "~/$metadata#Customers(Id,Name,Orders(Product))",
"value": [
{
"Id": 1,
"Name": "Alice",
"Orders": [{"Product": "Pizza"},
{"Product": "Pasta"}]
},
{
"Id": 2,
"Name": "Bob",
"Orders": [{"Product": "Burger"}]
}
]
}
A key concept in OData is that the shape of the overall graph should not be modified; it is deliberately designed to always maintain the structure of the entities that are returned. This means that the definition ($metadata) document is always correct, and the only things missing from this response are the fields that were not requested.
If you need the output in the client exactly as described, you can expose it as a custom function on the controller:
[EnableQuery]
public IQueryable<CustomerSummary> GetCustomersWithOrderSummary()
{
    List<CustomerOrders> queryResult = GetCustomerOrders();
    return queryResult
        .GroupBy(x => x.CustomerId)
        .Select(x => new CustomerSummary
        {
            Id = x.Key,
            Customer = x.First().Customer,
            // Project each order down to its product name
            Orders = x.Select(o => o.Product)
        })
        // LINQ to Objects yields IEnumerable<T>; convert it to match the return type
        .AsQueryable();
}
If using GroupBy, the closest response we can get is this:
~/CustomerOrders?$apply=groupby((CustomerId,Customer),aggregate($count as Orders))
But here we will return a count of the orders, and not an array of the product values as expected:
{
"@odata.context": "~/$metadata#CustomerOrders(CustomerId,Customer,Orders)",
"value": [
{
"@odata.id": null,
"CustomerId": 1,
"Customer": "Alice",
"Orders": 2
},
{
"@odata.id": null,
"CustomerId": 2,
"Customer": "Bob",
"Orders": 1
}
]
}
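The difference between the two result shapes can be sketched away from OData entirely. This is plain Python over the example CustomerOrders rows, not an OData client call; the `group_rows` helper is a name invented for illustration.

```python
from itertools import groupby
from operator import itemgetter

# The example CustomerOrders table as plain dictionaries.
rows = [
    {"Id": 1, "CustomerId": 1, "Customer": "Alice", "Product": "Pizza"},
    {"Id": 2, "CustomerId": 1, "Customer": "Alice", "Product": "Pasta"},
    {"Id": 3, "CustomerId": 2, "Customer": "Bob", "Product": "Burger"},
]


def group_rows(rows):
    """Return (aggregate-style counts, nested-style order lists)."""
    key = itemgetter("CustomerId", "Customer")
    counts, nested = [], []
    # groupby needs sorted input; sorted() is stable, so order within a group is kept.
    for (customer_id, name), group in groupby(sorted(rows, key=key), key=key):
        items = list(group)
        # What a groupby/aggregate gives you: grouping columns plus an aggregate.
        counts.append({"CustomerId": customer_id, "Customer": name, "Orders": len(items)})
        # What the question wants: the individual items kept per group.
        nested.append({"id": customer_id, "customer": name,
                       "orders": [r["Product"] for r in items]})
    return counts, nested


counts, nested = group_rows(rows)
print(counts)
print(nested)
```

The first list corresponds to the `$apply=groupby(...)` response, the second to what `$expand` or a custom function has to provide.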

How to create dynamic node relation in neo4j for dynamic data?

I was able to create Author nodes directly from the JSON file. The challenge is how to link the data, specifically linking "Author" to "organization". Since the data is dynamic, we cannot generalize it. I have tried using a CSV file, but that breaks down when dynamic data comes in. For example, one JSON record contains 2 organizations and 3 authors, and the next record will be different; different JSON records have different authors and organizations to link. organization/1 represents organization 1 and organization/2 represents organization 2. Any help or hint would be great. Thank you. Please find the JSON file below.
"Author": [
{
"seq": "3",
"type": "abc",
"identifier": [
{
"idtype:auid": "10000000"
}
],
"familyName": "xyz",
"indexedName": "MI",
"givenName": "T",
"preferredName": {
"familyName": "xyz1",
"givenName": "a",
"initials": "T.",
"indexedName": "bT."
},
"emailAddressList": [],
"degrees": [],
"#id": "https:abc/2009127993/author/person/3",
"hasAffiliation": [
"https:abc/author/organization/1"
],
"organization": [
[
{
"identifier": [
{
"#type": "idtype:uuid",
"#subtype": "idsubtype:affiliationInstanceId",
"#value": "aff2"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:afid",
"#value": "12345"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:dptid"
}
],
"organizations": [],
"addressParts": [],
"sourceText": "",
"text": " Medical University School of Medicine",
"#id": "https:abc/author/organization/1"
}
],
[
{
"identifier": [
{
"#type": "idtype:uuid",
"#subtype": "idsubtype:affiliationInstanceId",
"#value": "aff1"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:afid",
"#value": "7890"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:dptid"
}
],
"organizations": [],
"addressParts": [],
"sourceText": "",
"text": "K University",
"#id": "https:efg/author/organization/2"
}
]
Hi, I see that the organisation is part of the Author data, so you have to model it likewise, for instance as (Author)-[:AFFILIATED_WITH]->(Organisation).
You can load the data with apoc.load.json, which supports a stream of author objects.
I did some checks on your JSON structure with this cypher query:
call apoc.load.json("file:///Users/keesv/work/check.json") yield value
unwind value as record
WITH record.Author as author
WITH author.identifier[0].`idtype:auid` as authorId,author, author.organization[0] as organizations
return authorId, author, organizations
To get this working you will need to include APOC in the plugins directory and add the following two lines to the apoc.conf file (create one if it is not there) in the 'conf' directory.
apoc.import.file.enabled=true
apoc.import.file.use_neo4j_config=false
I also see a nested array for the organisations in the output. Why is that, and what does it mean?
Finally, I also see in the JSON that an organisation can have a reference to other organisations.
explanation
In my query I use UNWIND to unwind the base Author array. This means you get a 'record' to work with for every author.
With a MERGE or CREATE statement you can now create an Author node with the correct properties. With the FOREACH construct you can walk over all the Organization entries, create/merge an Organization node for each, and create the relationship between the Author and the Organization.
Here is a 'pseudo' example:
call apoc.load.json("file:///Users/keesv/work/check.json") yield value
unwind value as record
WITH record.Author as author
WITH author.identifier[0].`idtype:auid` as authorId,author, author.organization[0] as organizations
// creating the Author node
MERGE (a:Author { id: authorId })
SET a.familyName = author.familyName
...
// walk over the organizations
// determine
FOREACH (org in organizations |
MERGE (o:Organization { id: ... })
SET o.name = org.text
...
MERGE (a)-[:AFFILIATED_WITH]->(o)
// if needed you can also do a nested FOREACH here to process the Org Org relationship
)
Here is the JSON file I used; I had to change something at the start and the end:
[
{
"Author":{
"seq":"3",
"type":"abc",
"identifier":[
{
"idtype:auid":"10000000"
}
],
"familyName":"xyz",
"indexedName":"MI",
"givenName":"T",
"preferredName":{
"familyName":"xyz1",
"givenName":"a",
"initials":"T.",
"indexedName":"bT."
},
"emailAddressList":[
],
"degrees":[
],
"#id":"https:abc/2009127993/author/person/3",
"hasAffiliation":[
"https:abc/author/organization/1"
],
"organization":[
[
{
"identifier":[
{
"#type":"idtype:uuid",
"#subtype":"idsubtype:affiliationInstanceId",
"#value":"aff2"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:afid",
"#value":"12345"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:dptid"
}
],
"organizations":[
],
"addressParts":[
],
"sourceText":"",
"text":" Medical University School of Medicine",
"#id":"https:abc/author/organization/1"
}
],
[
{
"identifier":[
{
"#type":"idtype:uuid",
"#subtype":"idsubtype:affiliationInstanceId",
"#value":"aff1"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:afid",
"#value":"7890"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:dptid"
}
],
"organizations":[
],
"addressParts":[
],
"sourceText":"",
"text":"K University",
"#id":"https:efg/author/organization/2"
}
]
]
}
}
]
IMPORTANT: create unique constraints for Author.id and Organization.id!
In this way you can process any JSON file with an unknown number of author elements and an unknown number of affiliated organisations.
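The UNWIND plus FOREACH walk described above amounts to flattening a list of lists. A minimal Python sketch of that flattening over the same JSON structure follows; it assumes the "#id" field is used as the organization key (the Cypher above leaves that choice open), and `affiliation_pairs` is a hypothetical helper name.

```python
def affiliation_pairs(records):
    """Return (author_id, org_id, org_name) for every organization of every author."""
    pairs = []
    for record in records:
        author = record["Author"]
        author_id = author["identifier"][0]["idtype:auid"]
        # "organization" is a list of lists, so both levels are walked.
        for group in author.get("organization", []):
            for org in group:
                # Assumption: "#id" identifies the organization.
                pairs.append((author_id, org["#id"], org["text"].strip()))
    return pairs


# Trimmed-down version of the JSON file shown above.
sample = [
    {
        "Author": {
            "identifier": [{"idtype:auid": "10000000"}],
            "organization": [
                [{"text": " Medical University School of Medicine",
                  "#id": "https:abc/author/organization/1"}],
                [{"text": "K University",
                  "#id": "https:efg/author/organization/2"}],
            ],
        }
    }
]

pairs = affiliation_pairs(sample)
print(pairs)
```

Each tuple corresponds to one AFFILIATED_WITH relationship, however many authors or organizations a record happens to contain.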

How to add sorting for field object of graphql type which refers to different graphql type?

I am using a Neo4j DB and pattern comprehension to return the values. I have 2 types, Person and Friend:
(p:Person)-[:FRIEND_WITH]->(f:Friend)
type Person {
  id: String
  name: String
  friends: [Friend]
}
type Friend {
  id: String
  name: String
}
type Query {
persons( limit:Int = 10): [Person]
friends( limit:Int = 10): [Friend]
}
What I want to do is return the friends list (a field of the Person type) in ascending order when the "persons" query executes. For example:
{
"data": {
"persons": {
"id": "1",
"name": "Timothy",
"friends": [
{
"id": "c3ef473",
"name": "Adam"
},
{
"id": "ef4e373",
"name": "Bryan"
},
{
"id": "e373ln45",
"name": "Craig"
},
How should I do it? I researched sorting, but did not find anything specific about sorting an array of objects when using pattern comprehension in Neo4j. Any suggestions would be really helpful!
I used the sortBy function of lodash to return the result in ascending order.
And here is the GraphQL resolver query:
persons(_, params) {
let query = `MATCH (p:Person)
RETURN p{
.id,
.name,
friends: [(p)-[:FRIEND_WITH]->(f:Friend) | f{.*}]
}
LIMIT $limit;`;
return dbSession().run(query, params)
.then(result => {
return result.records.map(record => {
let item = record.get("p");
item.friends = sortBy(item.friends, [function(i) {
return i.name;
}]);
return item;
})
})
}
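The same post-processing step, sorting each nested friends list after the query returns, can be sketched in Python. This only mirrors what the lodash sortBy call does; `sort_friends` and the sample record are illustrative, not part of the resolver.

```python
from operator import itemgetter


def sort_friends(persons):
    """Return copies of the person records with friends sorted by name (ascending)."""
    return [
        {**p, "friends": sorted(p.get("friends", []), key=itemgetter("name"))}
        for p in persons
    ]


# Illustrative record, shaped like one row of the "persons" result.
persons = [
    {
        "id": "1",
        "name": "Timothy",
        "friends": [
            {"id": "e373ln45", "name": "Craig"},
            {"id": "c3ef473", "name": "Adam"},
            {"id": "ef4e373", "name": "Bryan"},
        ],
    }
]

sorted_persons = sort_friends(persons)
print([f["name"] for f in sorted_persons[0]["friends"]])
```

Sorting in the application keeps the Cypher simple, at the cost of doing the work once per returned record.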

Simulating a join in ElasticSearch

Assume there are documents in an ES index that have two fields, user_id and action_id. How to count users such that there are documents both with action_id = 1 and action_id = 2?
Equivalent SQL would be
SELECT COUNT(DISTINCT `a`.`user_id`)
FROM `action` AS `a`
JOIN `action` AS `b` ON `a`.`user_id` = `b`.`user_id`
WHERE `a`.`action_id` = 1
AND `b`.`action_id` = 2
The only way I have found is to request all unique user_ids for each of these action_ids separately and intersect the resulting sets on the ES client. This approach has to transfer megabytes of data from ES, so I'm searching for an alternative.
You can do it like this:
first you have a query that filters your documents with actions 1 and 2 only (I have no idea if you can have other action types)
then the magic is with aggregations
the first aggregation is a terms one for user_id, so that you can do individual calculations per user
then you use a cardinality sub-aggregation to count the number of distinct actions per user. Since the query is for actions 1 and 2 that number can only be 1 or 2
then you use a bucket_selector sub-aggregation to only keep those users that have the cardinality result of 2.
{
"size": 0,
"query": {
"bool": {
"should": [
{
"terms": {
"action_id": [
1,
2
]
}
}
]
}
},
"aggs": {
"users": {
"terms": {
"field": "user_id",
"size": 10
},
"aggs": {
"actions": {
"cardinality": {
"field": "action_id"
}
},
"actions_count_bucket_filter": {
"bucket_selector": {
"buckets_path": {
"totalActions": "actions"
},
"script": "totalActions >= 2"
}
}
}
}
}
}
The result will look like this:
"aggregations": {
"users": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 1,
"doc_count": 2,
"actions": {
"value": 2
}
},
{
"key": 5,
"doc_count": 2,
"actions": {
"value": 2
}
}
]
}
}
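The keys are the user_ids that have both action 1 and action 2. Note that the terms aggregation is limited to "size": 10 buckets here, so increase it if more users need to be considered. The bucket_selector aggregation is available in ES 2.x and later.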
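The logic of that aggregation pipeline (filter, bucket per user, count distinct actions, keep buckets with a count of 2) can be traced in a few lines of Python. This is only an in-memory illustration of the steps, not an Elasticsearch client call; the `docs` list and the function name are invented for the example.

```python
from collections import defaultdict


def users_with_all_actions(docs, wanted=(1, 2)):
    """Return user_ids that have at least one document for every wanted action_id."""
    wanted = set(wanted)
    actions_by_user = defaultdict(set)
    for doc in docs:
        # The "terms" query: only documents with a wanted action_id pass.
        if doc["action_id"] in wanted:
            # The "terms" aggregation buckets by user; the set plays the
            # role of the "cardinality" sub-aggregation.
            actions_by_user[doc["user_id"]].add(doc["action_id"])
    # The "bucket_selector": keep users whose distinct-action count is high enough.
    return sorted(u for u, acts in actions_by_user.items() if len(acts) >= len(wanted))


# Illustrative documents.
docs = [
    {"user_id": 1, "action_id": 1},
    {"user_id": 1, "action_id": 2},
    {"user_id": 2, "action_id": 1},
    {"user_id": 2, "action_id": 1},
    {"user_id": 5, "action_id": 2},
    {"user_id": 5, "action_id": 1},
    {"user_id": 7, "action_id": 3},
]

matching = users_with_all_actions(docs)
print(matching, len(matching))
```

The length of the returned list is the count the SQL query in the question computes.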

How to use parameters correctly in Transactional Cypher HTTP

I'm a bit stuck. I'm trying to use parameters in my http request in order to reduce the overhead but I just can't figure out why this won't work:
{
"statements" : [ {
"statement" : "MATCH (n:Person) WHERE n.name = {name} SET n.dogs={dogs} RETURN n",
"parameters" : [{
"name" : "Andres",
"dogs":5
},{
"name" : "Michael",
"dogs":3
},{
"name" : "Someone",
"dogs":2
}
]
}]
}
I've tried just opening a transaction with a statement and feeding the separate 'rows' in as parameters on subsequent requests before I /commit, but no joy.
I know that multiple nodes can be created in a similar way, from the examples in the manual.
What am I missing?
I've since modified an answer from this post, which seems to work by using a FOREACH statement to allow for multiple 'sets' of parameters.
{
"statements" : [
{
"parameters": {
"props": [
{
"userid": "177032492760",
"username": "John"
},
{
"userid": "177032492760",
"username": "Mike"
},
{
"userid": "100007496328",
"username": "Wilber"
}
]
},
"statement": "FOREACH (p in {props} | MERGE (user:People {id:p.userid}) ON CREATE SET user.name = p.username) "
}
]
}
You can also use UNWIND for this case:
The statement:
UNWIND {props} AS prop
MERGE (user:People {id: prop.id}) // and the rest of your query
The parameters:
{"props":[ {"id": 1234, "name": "John"},{"id": 4567, "name": "Chris"}]}
This is what is used on Graphgen to load the generated graphs in a local database from the webapp. http://graphgen.neoxygen.io
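Putting the statement and the parameters together, the request body is one statement with one parameter map whose "props" entry holds all the rows. Here is a small Python sketch that only builds and serializes that body; it sends nothing, `build_payload` is a hypothetical helper, and the `{props}` placeholder follows the older parameter syntax used in this thread (newer Neo4j versions write `$props`).

```python
import json


def build_payload(rows):
    """Build a transactional-endpoint request body that passes all rows as one parameter."""
    statement = (
        "UNWIND {props} AS prop "
        "MERGE (user:People {id: prop.id}) "
        "ON CREATE SET user.name = prop.name"
    )
    # "parameters" is a single map, not a list of maps; the list lives under "props".
    return {"statements": [{"statement": statement, "parameters": {"props": rows}}]}


payload = build_payload([
    {"id": 1234, "name": "John"},
    {"id": 4567, "name": "Chris"},
])
body = json.dumps(payload)
print(body)
```

This is the main difference from the first attempt in the question, where "parameters" was itself an array.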
