How can I return two collections with a Cypher query in the Neo4j .NET client (Neo4jClient)?

I'd like to return two collections in one query "tags" and "items" where each tag can have 0..many items. It looks like if I use the projection, it will assume a single collection with two columns rather than two collections, is that correct? Is there a better way to run this search query?
I'm getting "the query response contains columns Tags, Items however ...anonymous type does not contain settable properties to receive this data"
var query = client
.Cypher
.StartWithNodeIndexLookup("tags", "tags_fulltext", keyword)
.Match("tags<-[:TaggedWith]-items")
.Return((items, tags) => new
{
Tags = tags.As<Tag>(),
Items = items.As<Item>()
});
var results = await query.ResultsAsync;
return new SearchResult
{
Items = results.Select(x => x.Items).ToList(),
Tags = results.Select(x => x.Tags).Distinct().ToList()
};

Option 1
Scenario: You want to retrieve all of the tags that match a keyword, then for each of those tags, retrieve each of the items (in a way that still links them to the tag).
First up, this line:
.StartWithNodeIndexLookup("tags", "tags_fulltext", keyword)
Should be:
.StartWithNodeIndexLookup("tag", "tags_fulltext", keyword)
That is, the identity should be tag not tags. That's because the START clause results in a set of nodes which are each a tag, not a set of nodes called tags. Semantics, but it makes things simpler in the next step.
Now that we're calling it tag instead of tags, we update our MATCH clause to:
.Match("tag<-[:TaggedWith]-item")
That says "for each tag in the set, go and find each item attached to it". Again, 'item' is singular.
Now let's return it:
.Return((tag, item) => new
{
Tag = tag.As<Tag>(),
Items = item.CollectAs<Item>()
});
Here, we take each 'item' and collect them into a set of 'items'. My usage of singular vs plural in that code is very specific.
The resulting Cypher table looks something like this:
-------------------------
| tag | items |
-------------------------
| red | A, B, C |
| blue | B, D |
| green | E, F, G |
-------------------------
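As a rough illustration of what the collect step does, here is a plain-Python sketch (no Neo4j involved) that groups flat tag/item rows into one row per tag. The sample rows mirror the table above, and the function name collect_by_tag is made up for this example.

```python
from collections import defaultdict

# Flat (tag, item) rows, as the MATCH would produce before aggregation.
rows = [
    ("red", "A"), ("red", "B"), ("red", "C"),
    ("blue", "B"), ("blue", "D"),
    ("green", "E"), ("green", "F"), ("green", "G"),
]

def collect_by_tag(rows):
    """Group items under their tag, similar to RETURN tag, collect(item)."""
    grouped = defaultdict(list)
    for tag, item in rows:
        grouped[tag].append(item)
    return dict(grouped)

result = collect_by_tag(rows)
for tag, items in result.items():
    print(tag, items)
```

The non-aggregated column (tag) acts as the grouping key, which is why each tag ends up with its own list of items.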
Final code:
var query = client
.Cypher
.StartWithNodeIndexLookup("tag", "tags_fulltext", keyword)
.Match("tag<-[:TaggedWith]-item")
.Return((tag, item) => new
{
Tag = tag.As<Tag>(),
Items = item.CollectAs<Item>()
});
That's not what fits into your SearchResult though.
Option 2
Scenario: You want to retrieve all of the tags that match a keyword, then all of the items that match any of those tags, but you don't care about linking the two together.
Let's go back to the Cypher query:
START tag=node:tags_fulltext('keyword')
MATCH tag<-[:TaggedWith]-item
RETURN tag, item
That would produce a Cypher result table like this:
--------------------
| tag | item |
--------------------
| red | A |
| red | B |
| red | C |
| blue | B |
| blue | D |
| green | E |
| green | F |
| green | G |
--------------------
You want to collapse each of these to a single, unrelated list of tags and items.
We can use collect to do that:
START tag=node:tags_fulltext('keyword')
MATCH tag<-[:TaggedWith]-item
RETURN collect(tag) AS Tags, collect(item) AS Items
-----------------------------------------------------------------------------
| tags | items |
-----------------------------------------------------------------------------
| red, red, red, blue, blue, green, green, green | A, B, C, B, D, E, F, G |
-----------------------------------------------------------------------------
We don't want all of those duplicates though, so let's just collect the distinct ones:
START tag=node:tags_fulltext('keyword')
MATCH tag<-[:TaggedWith]-item
RETURN collect(distinct tag) AS Tags, collect(distinct item) AS Items
--------------------------------------------
| tags | items |
--------------------------------------------
| red, blue, green | A, B, C, D, E, F, G |
--------------------------------------------
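To make the effect of collect(distinct ...) concrete, here is a small plain-Python sketch that collapses the same rows into two unrelated, de-duplicated lists. It only simulates the aggregation; the helper name collect_distinct is an assumption for this example, not part of any Neo4j API.

```python
# Flat (tag, item) rows from the un-aggregated result table.
rows = [
    ("red", "A"), ("red", "B"), ("red", "C"),
    ("blue", "B"), ("blue", "D"),
    ("green", "E"), ("green", "F"), ("green", "G"),
]

def collect_distinct(values):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(values))

tags = collect_distinct(tag for tag, _ in rows)
items = collect_distinct(item for _, item in rows)

print(tags)
print(items)
```

Because there is no grouping column here, everything collapses into a single row with two lists.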
With the Cypher working, turning it into .NET is an easy translation:
var query = client
.Cypher
.StartWithNodeIndexLookup("tag", "tags_fulltext", keyword)
.Match("tag<-[:TaggedWith]-item")
.Return((tag, item) => new
{
Tags = tag.CollectDistinct<Tag>(),
Items = item.CollectDistinct<Item>()
});
Summary
Always start with the Cypher
Always start with the Cypher
When you have working Cypher, the .NET implementation should be almost one-for-one
Problems?
I've typed all of this code out in a textbox with no VS support and I haven't tested any of it. If something crashes, please report the full exception text and query on our issues page. Tracking crashes here is hard. Tracking crashes without the full exception text, message, stack trace and so forth just consumes my time by making it harder to debug, and reducing how much time I can spend helping you otherwise.


Pivoting data in Cypher

I've just gotten into working with a Neo4J database, and I'm having a hard time figuring out a good way to pivot some data that I'm working with.
I have a basic query that looks like this:
MATCH (n:DATA) WHERE n.status =~ "SUCCESS" return n.group as Group, n.label as Label, avg(toFloat(n.durationMillis)/60000) as Minutes
which produces tall, narrow data like this:
|Group |Label |Minutes|
|-------|-------|-------|
|group1 |label1 |1.0 |
|group1 |label2 |2.0 |
|group1 |label3 |5.0 |
|group2 |label1 |3.0 |
|group2 |label3 |2.0 |
...
What I would like to do is pivot this data to provide a short wide view as a summary table:
| Group | label1 | label2 | label3 |
| ----- | ------ | ------ | ------ |
|group1 | 1.0 | 2.0 | 5.0 |
|group2 | 3.0 | - | 2.0 |
...
Is there a simple way to do this with Cypher?
In order for Neo4j tools (like the Neo4j Browser) to generate a visualization that looks like a pivot table from a Cypher query, the query would have to hardcode the headings for each "column" -- since a Cypher query cannot dynamically generate the names of the values it returns. That is, your RETURN clause would have to look something like RETURN Group, label1, label2, label3.
Now, if you do happen to know all the possible labels beforehand, then you can indeed perform a simple query that returns your pivot table. For example:
MATCH (n:DATA)
WHERE n.status =~ "SUCCESS"
WITH n.group as Group, n.label AS l, AVG(n.durationMillis/60000.0) AS m
WITH Group, apoc.map.fromLists(COLLECT(l), COLLECT(m)) AS lmMap
RETURN Group,
lmMap['label1'] AS label1,
lmMap['label2'] AS label2,
lmMap['label3'] AS label3
The APOC function apoc.map.fromLists returns a map generated from lists of keys and values. If a Group does not have a particular label, its cell value will be null.
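If the labels are not known beforehand, one alternative is to keep the tall query and pivot on the client. The following is a minimal Python sketch of that idea, assuming the rows have already been fetched as (Group, Label, Minutes) tuples; the sample data comes from the question and the pivot function is hypothetical.

```python
# Tall rows as returned by the original query: (Group, Label, Minutes).
rows = [
    ("group1", "label1", 1.0),
    ("group1", "label2", 2.0),
    ("group1", "label3", 5.0),
    ("group2", "label1", 3.0),
    ("group2", "label3", 2.0),
]

def pivot(rows):
    """Return (sorted labels, {group: {label: minutes or None}})."""
    labels = sorted({label for _, label, _ in rows})
    table = {}
    for group, label, minutes in rows:
        # Start each group with every label set to None (a missing cell).
        table.setdefault(group, dict.fromkeys(labels))[label] = minutes
    return labels, table

labels, table = pivot(rows)
print(["Group"] + labels)
for group, cells in table.items():
    print([group] + [cells[label] for label in labels])
```

Missing combinations come out as None, which plays the same role as the null cells in the Cypher version.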

cypher - add/remove properties across all nodes of same label

With following example data:
node1:Person {id: 1, name: 'NameOne'}
node2:Person {id: 2, name: 'NameTwo', age: 42}
The question is: is it possible, using Cypher only, to standardize properties across all nodes with the label Person to the list ['id','name','age','lastname'], so that missing properties are added to the nodes with a default empty value?
I have tried using the apoc.map.merge({first},{second}) yield value procedure as follows:
match (p:Person)
call apoc.map.merge(properties(p),{id:'',name:'',age:'',lastname:''}) yield value
return value
however I got this error:
There is no procedure with the name apoc.map.merge registered for
this database instance. Please ensure you've spelled the procedure
name correctly and that the procedure is properly deployed.
although I can confirm I have apoc in place
bash-4.3# ls -al /var/lib/neo4j/plugins/apoc-3.1.0.3-all.jar
-rw-r--r-- 1 root root 1319762 Dec 14 02:19 /var/lib/neo4j/plugins/apoc-3.1.0.3-all
and it is shown in apoc.help
neo4j-sh (?)$ call apoc.help("map.merge");
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| type | name | text | signature | roles | writes |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "function" | "apoc.map.merge" | "apoc.map.merge(first,second) - merges two maps" | "apoc.map.merge(first :: MAP?, second :: MAP?) :: (MAP?)" | <null> | <null> |
| "function" | "apoc.map.mergeList" | "apoc.map.mergeList([{maps}]) yield value - merges all maps in the list into one" | "apoc.map.mergeList(maps :: LIST? OF MAP?) :: (MAP?)" | <null> | <null> |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows
47 ms
Note that these are functions now, so you don't need to call them using CALL or YIELD like procedures. This should work:
match (p:Person)
RETURN apoc.map.merge(properties(p),{id:'',name:'',age:'',lastname:''})
Keep in mind that this query will only affect what is returned, since you haven't used SET to update the node properties.
You can use the += operator to update a node's properties instead of using apoc.map.merge:
match (p:Person)
set p += {id:'',name:'',age:'',lastname:''}
Keep in mind that both this and apoc.map.merge will replace existing values, so you'll be blanking out id, name, age, and lastname for all persons.
At this time I don't believe there is functionality in Neo4j or APOC to merge in properties while keeping existing properties instead of replacing. That said, there are some workarounds you might use.
COALESCE() is a useful function for this, as it allows you to supply defaults to use in case a value is null.
For example, you might use this to update the properties for all :Persons, using the supplied empty string as a default if the properties are null:
match (p:Person)
with p, {id:COALESCE(p.id, ''), name:COALESCE(p.name, ''), age:COALESCE(p.age, ''),
lastname:COALESCE(p.lastname, '')} as newProps
set p += newProps
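The difference between overwriting with defaults and filling in only the missing values can be sketched in plain Python, with dictionaries standing in for node properties. This only illustrates the semantics described above; the function names overwrite and fill_missing are made up for the example.

```python
DEFAULTS = {"id": "", "name": "", "age": "", "lastname": ""}

def overwrite(props, defaults):
    """Like SET p += defaults: the right-hand values replace existing ones."""
    return {**props, **defaults}

def fill_missing(props, defaults):
    """Like COALESCE(p.key, default) per key: existing values are kept."""
    merged = dict(props)
    for key, default in defaults.items():
        if merged.get(key) is None:
            merged[key] = default
    return merged

node1 = {"id": 1, "name": "NameOne"}
node2 = {"id": 2, "name": "NameTwo", "age": 42}

print(overwrite(node2, DEFAULTS))     # every listed property is blanked
print(fill_missing(node1, DEFAULTS))  # only age and lastname are added
print(fill_missing(node2, DEFAULTS))  # only lastname is added
```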

Getting Multiple results from different relationships with Cypher

I am sure this question has been asked but I can't find it.
I have a social graph and I want to be able to show people suggestions based on 3 different relationships in one result.
I have 3 different nodes (Skill, Interest, Title)
Each person has a relationship of SKILL_OF, INTEREST_OF, and IS_TITLED respectively.
I would like to have a single (unique if possible) result set that matches the person, then finds people that have the same skills, interests, and job title.
I tried to start with 2 results (and then wanted to add title on after) but here is what I have.
MATCH (p:Person { username:'wkolcz' })-[INTEREST_OF]->(Interest)<-[i:INTEREST_OF]-(f:Person)
MATCH(p)-[SKILL_OF]->(s:Skill)<-[sk:SKILL_OF]-(sf:Person)
RETURN f.first_name,f.last_name, sf.first_name, sf.last_name, i, s
I tried to make the matching person the same variable but, as you experts know, that failed. I got a result set but it doesn't make sense to me how I could then display it.
I would like a single list of first_name, last_name, username from the 2, and bonus points if I could get the matches also returned (i and s) so I could display the matching results (This person also has skill(s) in X or This person also has interest in X).
Thanks and let me know!
[EDITED]
This turned out to be a very interesting problem.
I provide a solution that:
Only returns a single result row for every person.
Displays all the interests and skills shared by that person and wkolcz as separate collections. (I presume that people in the DB can have multiple interests and skills.)
The solution finds all the people with shared interests and/or skills in a single MATCH clause.
MATCH (p:Person { username:'wkolcz' })-[r1:INTEREST_OF|SKILL_OF]->(n)<-[r2:INTEREST_OF|SKILL_OF]-(f)
WHERE TYPE(r1) = TYPE(r2)
WITH f, COLLECT(TYPE(r1)) AS ts, COLLECT(n.name) AS names
RETURN f.first_name, f.last_name, f.username,
REDUCE(s = { interests: [], skills: []}, i IN RANGE(0, LENGTH(ts)-1) | CASE
WHEN ts[i] = "INTEREST_OF"
THEN { interests: s.interests + names[i], skills: s.skills }
ELSE { interests: s.interests, skills: s.skills + names[i]} END ) AS shared;
Here is a console that shows these sample results:
+---------------------------------------------------------------------------------------------+
| f.first_name | f.last_name | f.username | shared |
+---------------------------------------------------------------------------------------------+
| "Fred" | "Smith" | "fsmith" | {interests=[Bird Watching], skills=[]} |
| "Oscar" | "Grouch" | "ogrouch" | {interests=[Bird Watching, Politics], skills=[]} |
| "Wilma" | "Jones" | "wjones" | {interests=[Bird Watching], skills=[Woodworking]} |
+---------------------------------------------------------------------------------------------+
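For readers who find the REDUCE expression hard to follow, here is a rough Python equivalent of that one step. It walks two parallel lists (relationship types and node names) and sorts each name into the matching bucket. The sample ts and names values are hypothetical, chosen only to show both branches, and split_shared is an invented name.

```python
from functools import reduce

# Parallel lists, as built by COLLECT(TYPE(r1)) and COLLECT(n.name).
# Hypothetical sample values for one person.
ts = ["INTEREST_OF", "SKILL_OF", "INTEREST_OF"]
names = ["Bird Watching", "Woodworking", "Politics"]

def split_shared(ts, names):
    """Fold the parallel lists into {'interests': [...], 'skills': [...]}."""
    def step(acc, pair):
        rel_type, name = pair
        key = "interests" if rel_type == "INTEREST_OF" else "skills"
        # Build a new accumulator rather than mutating, as REDUCE does.
        return {**acc, key: acc[key] + [name]}
    return reduce(step, zip(ts, names), {"interests": [], "skills": []})

shared = split_shared(ts, names)
print(shared)
```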

Neo4j Cypher - How to Count Multiple Property Values With Cypher Efficiently And Paginate Properly

I am struggling to get the proper cypher that is both efficient and allows pagination through skip and limit.
Here is the simple scenario: I have the related nodes (company)<-[A]-(set)<-[B]-(job) where there are multiple instances of (set) with distinct (job) instances related to them. The (job) nodes have a specific status property that can hold one of several states. We need to count the number of (job) nodes in a particular state per (set) and use skip and limit to paginate on the distinct (set) nodes.
So we can get a very efficient query for job.status counts using this.
match (c:Company {id: 'MY.co'})<-[:type_of]-(s:Set)<-[:job_for]-(j:Job)
return s.Description, j.Status, count(*) as StatusCount;
This gives us rows of Set.Description, Job.Status, and the job status count. But we get multiple rows per Set, one for each Job.Status, which is not conducive to paging over distinct sets. Something like:
s.Description j.Status StatusCount
-------------------+--------------+----------------
Set 1 | Unassigned | 10
Set 1 | Completed | 2
Set 2 | Unassigned | 3
Set 1 | Reviewed | 10
Set 3 | Completed | 4
Set 2 | Reviewed | 7
What we are trying to achieve with the proper cypher is result rows based on distinct Sets. Something like this:
s.Description Unassigned Completed Reviewed
-------------------+--------------+-------------+----------
Set 1 | 10 | 2 | 10
Set 2 | 3 | 0 | 7
Set 3 | 0 | 4 | 0
This would then allow us to paginate over Sets using skip and limit properly.
I have tried many different approaches and cannot seem to find the right combination for this type of result. Anyone have any ideas? Thanks!
** EDIT - Using the answer provided by Michael, here's how to get the status count values in Java **
match (c:Company {id: 'MY.co'})<-[:type_of]-(s:Set)<-[:job_for]-(j:Job)
with s, j.Status as Status,count(*) as StatusCount
return s.Description, collect({Status:Status,StatusCount:StatusCount}) as StatusCounts;
List<Object> statusMaps = (List<Object>) row.get("StatusCounts");
for(Object statusEntry : statusMaps ) {
Map<String,Object> statusMap = (Map<String,Object>) statusEntry;
String status = (String) statusMap.get("Status");
Number count = (Number) statusMap.get("StatusCount");
}
You can use WITH and aggregation, and optionally a map result:
match (c:Company {id: 'MY.co'})<-[:type_of]-(s:Set)<-[:job_for]-(j:Job)
with s, j.Status as Status,count(*) as StatusCount
return s.Description, collect([Status,StatusCount]);
or
match (c:Company {id: 'MY.co'})<-[:type_of]-(s:Set)<-[:job_for]-(j:Job)
with s, j.Status as Status,count(*) as StatusCount
return s.Description, collect({Status:Status,StatusCount:StatusCount});
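Once there is one row per Set, paging over distinct sets becomes straightforward. The Python sketch below simulates that shape on the client, using the sample counts from the question. It is only an illustration of the idea: the names counts_per_set and page are hypothetical, and ordering sets by description is an assumption made so the pages are stable.

```python
# (Set description, job status, count) rows from the per-status query.
rows = [
    ("Set 1", "Unassigned", 10), ("Set 1", "Completed", 2),
    ("Set 2", "Unassigned", 3), ("Set 1", "Reviewed", 10),
    ("Set 3", "Completed", 4), ("Set 2", "Reviewed", 7),
]
STATUSES = ("Unassigned", "Completed", "Reviewed")

def counts_per_set(rows):
    """Collapse the rows into one entry per set, defaulting counts to 0."""
    per_set = {}
    for description, status, count in rows:
        per_set.setdefault(description, dict.fromkeys(STATUSES, 0))[status] = count
    return per_set

def page(per_set, skip, limit):
    """Apply skip/limit to distinct sets, ordered by description."""
    ordered = sorted(per_set.items())
    return ordered[skip:skip + limit]

per_set = counts_per_set(rows)
for description, counts in page(per_set, skip=0, limit=2):
    print(description, counts)
```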

Get Node ID's in Neo4j using Python

I have recently begun using Neo4j and am struggling to understand how things work. I am trying to create relationships between nodes that I created earlier in my script. The Cypher query that I found looks like it should work, but I don't know how to get the IDs to replace the #'s:
START a= node(#), b= node(#)
CREATE UNIQUE a-[r:POSTED]->b
RETURN r
If you want to use plain cypher, the documentation has a lot of usage examples.
When you create nodes you can return them (or just their ids by returning id(a)), like this:
CREATE (a {name:'john doe'}) RETURN a
This way you can keep the id around to add relationships.
If you want to attach relationships later, you should not use the internal IDs of the nodes to reference them from an external system. They can, for example, be re-used if you delete and create nodes.
You can either search for a node by scanning over all and filtering using WHERE or add an index to your database, e.g. if you add an auto_index on name:
START n = node:node_auto_index(name='john doe')
and continue from there. Neo4j 2.0 will support index lookup transparently so that MATCH and WHERE should be as efficient.
If you are using python, you can also take a look at py2neo which provides you with a more pythonic interface while using cypher and the REST interface to communicate with the server.
This could be what you are looking for:
START n = node(*) , x = node(*)
Where x<>n
CREATE UNIQUE n-[r:POSTED]->x
RETURN r
It will create POSTED relationship between all the nodes like this
+-----------------------+
| r |
+-----------------------+
| (0)-[10:POSTED]->(1) |
| (0)-[10:POSTED]->(2) |
| (0)-[10:POSTED]->(3) |
| (1)-[10:POSTED]->(0) |
| (1)-[10:POSTED]->(2) |
| (1)-[10:POSTED]->(3) |
| (2)-[10:POSTED]->(0) |
| (2)-[10:POSTED]->(1) |
| (2)-[10:POSTED]->(3) |
| (3)-[10:POSTED]->(0) |
| (3)-[10:POSTED]->(1) |
| (3)-[10:POSTED]->(2) |
And if you don't want a relation between the reference node(0) and the other nodes, you can make the query like this
START n = node(*), x = node(*)
WHERE x<>n AND id(n)<>0 AND id(x)<>0
CREATE UNIQUE n-[r:POSTED]->x
RETURN r
and the result will be like that:
+-----------------------+
| r |
+-----------------------+
| (1)-[10:POSTED]->(2) |
| (1)-[10:POSTED]->(3) |
| (2)-[10:POSTED]->(1) |
| (2)-[10:POSTED]->(3) |
| (3)-[10:POSTED]->(1) |
| (3)-[10:POSTED]->(2) |
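To see why the two queries produce 12 and 6 relationships respectively, the pairing can be sketched in plain Python. Every ordered pair of distinct nodes gets a relationship, and excluding node 0 removes all pairs that involve it. The node IDs 0 to 3 match the result tables above; posted_pairs is an invented helper.

```python
from itertools import permutations

node_ids = [0, 1, 2, 3]

def posted_pairs(node_ids, skip_reference=False):
    """Ordered (n, x) pairs with n != x, optionally leaving out node 0."""
    ids = [i for i in node_ids if not (skip_reference and i == 0)]
    return list(permutations(ids, 2))

all_pairs = posted_pairs(node_ids)
without_reference = posted_pairs(node_ids, skip_reference=True)

print(len(all_pairs), all_pairs)
print(len(without_reference), without_reference)
```

The number of pairs grows as n * (n - 1), so this pattern is only practical on small graphs.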
On the client side using Javascript I post the cypher query:
start n = node(*) WHERE n.name = '" + a.name + "' return n
and then parse the id number from response "self" in the form of:
server_url:7474/db/data/node/node_id
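The same parsing step can be sketched in Python. This assumes the "self" value is a URL whose last path segment is the numeric node ID, as in the form shown above; the host name and ID in the example are placeholders and node_id_from_self is a hypothetical helper.

```python
from urllib.parse import urlparse

def node_id_from_self(self_url):
    """Extract the trailing numeric node ID from a node's 'self' URL."""
    path = urlparse(self_url).path.rstrip("/")
    return int(path.rsplit("/", 1)[-1])

# Placeholder URL in the server_url:7474/db/data/node/node_id form.
example = "http://localhost:7474/db/data/node/42"
node_id = node_id_from_self(example)
print(node_id)
```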
After hours of trying to figure this out, I finally found what I was looking for. I was struggling with how nodes were getting returned and found that
userId=person[0][0][0].id
would return what I wanted. Thanks for all your help though!
Using py2neo, the way I've found that is really useful is to use the remote module.
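from py2neo import Graph, remote
graph = Graph()
# CREATE requires a directed relationship, so give it an arrow
graph.run('CREATE (a)-[r:POSTED]->(b)')
# evaluate() returns the first value of the first record
a = graph.run('MATCH (a)-[r]-(b) RETURN a').evaluate()
a_id = remote(a)._id
# look up the node on the other end of the relationship by a's ID
b = graph.run('MATCH (a)-[r]-(b) WHERE ID(a) = {num} RETURN b', num=a_id).evaluate()
b_id = remote(b)._id
# match both nodes by ID and add a second, directed relationship
graph.run('MATCH (a)-[r]-(b) WHERE ID(a)={num1} AND ID(b)={num2} CREATE (a)-[x:UPDATED]->(b)', num1=a_id, num2=b_id)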
The remote function takes in a py2neo Node object and has an _id attribute that you can use to return the current ID number from the graph database.
