Aggregating relationships via cypher

I am fairly certain I have seen it somewhere but all keywords I have tried came up empty.
I have a graph that connects persons and companies via documents:
(:Person/:Company)-[ ]-(:Document)-[ ]-(:Person/:Company)
What I would like to do is return a graph that shows the connection between persons and companies directly with the relationship strength based on the number of connections between them.
I get the data with
MATCH (p)-[]-(d:Document)-[]-(c)
WHERE (p:Person OR p:Company) AND (c:Person OR c:Company)
WITH p,c, count(d) as rel
RETURN p,rel,c
However, in the Neo4j Browser the nodes appear without any relationships. Is there a way to achieve this, or do I have to create some kind of meta relationship?

If you install APOC Procedures, you'll be able to create virtual relationships which are used for visualization but aren't actually stored in the db.
MATCH (p)-[]-(d:Document)-[]-(c)
WHERE (p:Person OR p:Company) AND (c:Person OR c:Company)
AND id(p) < id(c)
WITH p,c, count(d) as relStrength
CALL apoc.create.vRelationship(p,'REL',{strength:relStrength}, c) YIELD rel
RETURN p,rel,c
I also added a predicate on ids of p and c so you don't repeat the same two nodes with p and c switched.
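If the aggregated relationship should be stored rather than exist only for visualization, one option is to materialize it. The following is a minimal sketch and not part of the original answer: the relationship type CONNECTED_TO and the property name strength are hypothetical, and count(DISTINCT d) assumes each shared document should count only once.

```cypher
// Sketch: persist one aggregated relationship per pair of persons/companies.
// CONNECTED_TO and strength are illustrative names, not from the original graph.
MATCH (p)--(d:Document)--(c)
WHERE (p:Person OR p:Company) AND (c:Person OR c:Company)
  AND id(p) < id(c)                 // keep each pair once
WITH p, c, count(DISTINCT d) AS strength
MERGE (p)-[r:CONNECTED_TO]-(c)      // undirected MERGE reuses an existing relationship in either direction
SET r.strength = strength
RETURN p, r, c
```

Because MERGE is used, rerunning the query updates the stored strength instead of creating duplicate relationships. On recent Neo4j versions id() is deprecated; elementId() may be a suitable replacement in the inequality.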

Related

Cypher - given relationship, get the nodes

If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get nodes for each of acquired relationship to get result in format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?

RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
And you'll see you get four rows. Which is not what you want because there are only two distinct relationships.
You could modify that to return only distinct values, but you're still over-searching the graph.
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)

Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to
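To stay closer to the original question, the start node and the DISTINCT step can be kept. This is a sketch under the assumption that each relationship should appear once; the aliases fromNode, relType, and toNode are illustrative choices rather than names from the question.

```cypher
// Sketch: anchored at the start node from the question, one row per distinct relationship.
MATCH (:Label1 {prop1: "start node"})-[rels*1..10]->()
UNWIND rels AS rel
WITH DISTINCT rel                   // a relationship can occur in many paths
RETURN startNode(rel) AS fromNode,
       type(rel)      AS relType,
       endNode(rel)   AS toNode
```

Anchoring the pattern at a labeled start node also avoids expanding paths from every node in the graph.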

Excluding "symmetric" results in Neo4j

I want to query a Neo4j graph for a structure that includes two interchangeable nodes, but I don't want two unique responses for each of the "symmetric" responses.
How do I express in Cypher that two nodes are interchangeable?
An example:
I want to look for the following structure in the graph with the following query:
MATCH (c:Customer)-[]->(p:Purchase)
MATCH (c:Customer)-[]->(q:Purchase)
MATCH (p)-[]->(m:Company)
MATCH (q)-[]->(m:Company)
RETURN DISTINCT c, p, q, m
The default behavior would be for Neo4j to return the following two graphs:
(i.e. The assignment of p and q to Purchase1 and Purchase2 are reversed)
How do I express that the elements p and q in my query are interchangeable, and I only need one of the above responses?

To prevent those kinds of results, you would typically have an inequality based on the node ids:
WHERE id(p) < id(q)
That said, you may be able to write this query a little more cleanly like this (provided you want all purchases between a customer and a company where that customer made at least two purchases from the company):
MATCH (c:Customer)-->(p:Purchase)-->(m:Company)
WITH c, m, collect(p) as purchases, count(p) as purchaseCount
WHERE purchaseCount >= 2
RETURN c, m, purchases
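If the pairwise shape of the original query is needed, the id inequality can be applied to it directly. A minimal sketch, assuming the same labels as in the question:

```cypher
// Sketch: each unordered pair of purchases (p, q) is returned once,
// because only the ordering with the smaller id first passes the filter.
MATCH (c:Customer)-->(p:Purchase)-->(m:Company),
      (c)-->(q:Purchase)-->(m)
WHERE id(p) < id(q)
RETURN c, p, q, m
```

The strict inequality also rules out p and q being the same node.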

Neo4j - Filter out All Nodes in a Graph that contain a Certain Relationship

Background:
I have a graph with Company nodes joined to each other through one or more relationships.
Looks like this:
Trying to Achieve:
I want to keep all nodes where the relationship between them is "competes_on_backlinks" BUT if another relationship of "links_to" exists between them THEN filter out that node.
I have Tried:
MATCH p =(c:Company {name:"example.com"})-[r]-(b:Company)
where NONE (x IN relationships(p) WHERE type(x) ="links_to")
RETURN p
The above query produces the graph in the image.
That filters out nodes where the ONLY relationship between them is "links_to", but it does not remove the nodes where there is another relationship as well (as you can see in the image).
So many other attempts but the same result.
Any idea how to do this?

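The path p in your query contains exactly one relationship, so NONE only tests that single relationship and never looks at other relationships between the same two nodes. The NOT operator can be used to exclude a pattern instead: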
MATCH p = (c:Company {name:"F"})-[:competes_on_backlinks]-(b:Company)
WHERE NOT (b)-[:links_to]-(c)
RETURN p
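To check the behaviour, a tiny test graph can help. This is a sketch with made-up data: the companies A and B are hypothetical, and B is the one that both competes with F and has a links_to relationship to it.

```cypher
// Hypothetical sample data: F competes with A and B, and also links to B.
CREATE (f:Company {name: "F"}),
       (a:Company {name: "A"}),
       (b:Company {name: "B"})
CREATE (f)-[:competes_on_backlinks]->(a)
CREATE (f)-[:competes_on_backlinks]->(b)
CREATE (f)-[:links_to]->(b);

// Same query as above; B is excluded because a links_to relationship exists.
MATCH p = (c:Company {name: "F"})-[:competes_on_backlinks]-(b:Company)
WHERE NOT (b)-[:links_to]-(c)
RETURN p
```

On this data the query should return only the path between F and A.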

Filtering out nodes based on outgoing relationship in Cypher query (Similar to SQL outer join)

I have a simple database with three types of nodes (t: transcripts, f: protein families, and g: genes). There are two types of relationships: PFAM_MRNA (t)-[r]->(f) and Parent (t)-[p]->(g).
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t1'})
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t2'})
(g:Gene{Name:'g2'})<-[p:Parent]-(t:transcript{Name:'t3'})
(g:Gene{Name:'g3'})<-[p:Parent]-(t:transcript{Name:'t4'})
(g:Gene{Name:'g4'})<-[p:Parent]-(t:transcript{Name:'t5'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t1'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t2'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t3'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t4'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t5'})
(f:PFAM{ID:'PF1040'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t4'})
(f:PFAM{ID:'PF1040'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t5'})
Next, I am trying to get the transcripts (and their Parent genes) connected to PF0752 but get rid of the transcripts (and their Parent genes) that are also connected to PF1040.
So, my Cypher query looks like
MATCH (f)<-[rel:PFAM_MRNA]-(t)-[p:Parent]->(g)
WHERE f.ID IN ['PF0752']
AND NOT f.ID IN ['PF1040']
RETURN *
However, I got a graph like
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t1'})
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t2'})
(g:Gene{Name:'g2'})<-[p:Parent]-(t:transcript{Name:'t3'})
(g:Gene{Name:'g3'})<-[p:Parent]-(t:transcript{Name:'t4'})
(g:Gene{Name:'g4'})<-[p:Parent]-(t:transcript{Name:'t5'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t1'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t2'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t3'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t4'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t5'})
Instead of
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t1'})
(g:Gene{Name:'g1'})<-[p:Parent]-(t:transcript{Name:'t2'})
(g:Gene{Name:'g2'})<-[p:Parent]-(t:transcript{Name:'t3'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t1'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t2'})
(f:PFAM{ID:'PF0752'})<-[r:PFAM_MRNA]-(t:transcript{Name:'t3'})
Any hint or idea on how to make it work is really appreciated.

You can add a WHERE NOT clause on a pattern from t to the PF1040 protein:
MATCH (f:PFAM {ID: 'PF0752'}), (pf:PFAM {ID:'PF1040'})
MATCH (f)<-[rel:PFAM_MRNA]-(t)-[p:Parent]->(g)
WHERE NOT (pf)<-[:PFAM_MRNA]-(t)
RETURN *
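One caveat with matching the PF1040 node up front is that the query returns nothing at all if that node does not exist. A possible variant, sketched here under the assumption that the labels and property names are as in the question, puts the excluded family inside the pattern predicate:

```cypher
// Sketch: exclude transcripts that are also linked to PF1040,
// without requiring the PF1040 node to be matched first.
MATCH (f:PFAM {ID: 'PF0752'})<-[rel:PFAM_MRNA]-(t)-[p:Parent]->(g)
WHERE NOT (t)-[:PFAM_MRNA]->(:PFAM {ID: 'PF1040'})
RETURN *
```

With the sample data from the question, this should keep t1, t2, and t3 (with genes g1 and g2) and drop t4 and t5.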

Neo4j cypher query with variable relationship path length

I'm moving my complex user database where users can be on one of many teams, be friends with each other and more to Neo4j. Doing this in a RDBMS was painful and slow, but is simple and blazing with Neo4j. :)
I was hoping there is a way to query for
a relationship that is 1 hop away and
another relationship that is 2 hops away
from the same query.
START n=node:myIndex(user='345')
MATCH n-[:IS_FRIEND|ON_TEAM*2]-m
RETURN DISTINCT m;
The reason is that users that are friends are one edge from each other, but users linked by teams are linked through that team node, so they are two edges away. This query does IS_FRIEND*2 and ON_TEAM*2, which gets teammates (yeah) and friends of friends (boo).
Is there a succinct way in Cypher to get both differing length relations in a single query?

I rewrote it to return a collection:
start person=node(1)
match person-[:IS_FRIEND]-friend
with person, collect(distinct friend) as friends
match person-[:ON_TEAM*2]-teammate
with person, friends, collect(distinct teammate) as teammates
return person, friends + filter(dupcheck in teammates: not(dupcheck in friends)) as teammates_and_friends
http://console.neo4j.org/r/oo4dvx
Thanks for putting together the sample db, Werner.

I have created a small test database at http://console.neo4j.org/?id=sqyz7i
I have also created a query which will work as you described:
START n=node(1)
MATCH n-[:IS_FRIEND]-m
WITH collect(distinct id(m)) as a, n
MATCH n-[:ON_TEAM*2]-m
WITH collect(distinct id(m)) as b, a
START n=node(*)
WHERE id(n) in a + b
RETURN n
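The queries above use the older START clause and filter() function, which are no longer available in recent Neo4j versions. A rough modern equivalent can be sketched with UNION, which removes duplicate rows. The label User is an assumption here, as is the idea that user is a node property (the question only shows it as an index key).

```cypher
// Sketch: friends (1 hop) and teammates (2 hops via a team node) in one result.
// :User and the user property are assumptions, not taken from the original graph.
MATCH (n:User {user: '345'})-[:IS_FRIEND]-(m)
RETURN m
UNION
MATCH (n:User {user: '345'})-[:ON_TEAM*2]-(m)
RETURN m
```

Plain UNION (as opposed to UNION ALL) returns a person only once even if they are both a friend and a teammate.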
