Any length relationship between two nodes that share the same label - neo4j

I'm trying to find a query that will show me any length relationship that exists between two nodes that share the same index. Basically, if there is any overlap between for a specific label. My graph is Pretty simple and not particularly large:
(m:`Campaign`), (n:`Politician`), (o:`Assistant`), (p:`Staff`), (q:`Aid`), (s:`Contributor`)
(m)<-[:Campaigns_for]-(n)
(o)<-[:works_for]-(m)
(p)<-[:works_for]-(o)
(q)<-[:volunteers_for]-(p)
(m)<-[:contributes_to]-(s)
I want to find all the shared nodes and their relationships between Campaigns.
so far i have:
MATCH (n:`Campaign`)-[r*]-(m:`Campaign`)
RETURN n,count(r) as R,m
ORDER BY R DESC
but it's not returning everyhing I want, I want in addition to the counts, the labels of each relationship and the names of the nodes in between.

Assuming that "names of the nodes" means "return the name property of the node" (you could always substitute in "labels(n)" if you're after labels), then something like this might work, but you have some aggregation going on here so you may need to parse a bit:
MATCH p =(a:Campaign)-[r*]-(b:Campaign)
RETURN a, length(relationships(p)) AS count, b, extract(x IN relationships(p)| type(x)), extract(x IN nodes(p)| x.name)
ORDER BY count DESC
I'm also assuming that when you say "not returning everything you want", you mean that in addition to what's currently returned in your result set, you want just those other items you listed.
Keep in mind it might also be possible to have a cycle in your graph (not knowing too much about your particular graph), so, you may want to check your beginning and ending nodes.

Related

depth in nodes of the same label in custom query

I have in my database Company with:
#Relationship(type = "COLLECTS_FROM")
private Set<Company> subCompanies
What I'm trying to achieve is to create two queries:
get the company with limited depth 2 for nodes of the same type (so company which collects from different companies, which collects from different companies)
get the company with full depth for nodes of the same type (similar to previous, but just all the way down without any limitations.
I had a few attempts to solve this and here are those:
first approach was to use findById(ID,2) and findById(ID,999999), but this query has a huge performance drop and I ran into StackOverflow exception, that's why I'm trying to create my own methods similar to what I found here.
second approach was to use custom query (which were actually working in case of different node types):
#Query("Match (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN company,relation,secondCompany")
Company customerWithCustomDepth(String name);
Unfortunately while using the same node types I ran into the IncorrectResultSizeDataAccessException: Incorrect result size: expected at most 1 and when I changed return type to the list 3 records were returned (actually the first one of them is what I would like to get, but it seems like specifying secondCompany in the return statement populates nodes that I don't want to have in my return type.
on the third approach I did similar to the second, but it also returned 3 rows (the first one is what I wanted).
#Query("Match p = (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN p")
and fourth one once again 3 companies, instead of a single one (the first one is what I wanted)
#Query("MATCH (company:Company {name:$name} ) WITH company MATCH p=(company)-[COLLECTS_FROM*0..2]->(m) RETURN p")
And that's basically it. I feel like I'm close, but a little detail is missing. Maybe instead of trying to fix the query, I should somehow limit the result to get the first row from the Query? Any help will be really appreciated.
update after receiving #cybersam answer:
The general purpose of this query is to create a tree graph, that's why I would like to get a company with their sub-companies inside (and companies in the sub-companies), not only the parent one.
I could use second, or third approach and just grab the first element on the server side:
result.stream()
.findFirst()
.orElseThrow();
but I believe it should be able to do it on the service side.
To return the Company with the specified name if it is connected by a path of 1 to 2 COLLECTS_FROM relationships to descendant Company nodes:
#Query("MATCH p=(c:Company {name:$name})-[:COLLECTS_FROM*..2]->(:Company) WHERE ALL(n IN NODES(p) WHERE 'Company' IN LABELS(n)) RETURN c")
Company customerWithCustomDepth(String name);
I assume that you do not want to get a result if there are no such descendants, which is why the variable-length relationship pattern does not use *0..2.
Note: If you want to test for paths of exactly length 2, change ..2 to just 2.
The corresponding query for paths of any length can be obtained by just changing ..2 to ... However, variable-length relationship pattern with no upper bound are not recommended, since they can take "forever" to run or run out of memory.

Neo4j query for complete paths

I have the following structure.
CREATE
(`0` :Sentence {`{text`:'This is a sentence'}}) ,
(`1` :Word {`{ text`:'This' }}) ,
(`2` :Word {`{text`:'is'}}) ,
(`3` :Sentence {`{'text'`:'Sam is a dog'}}) ,
(`0`)-[:`RELATED_TO`]->(`1`),
(`0`)-[:`RELATED_TO`]->(`2`),
(`3`)-[:`RELATED_TO`]->(`2`)
schema example
So my question is this. I have a bunch of sentences that I have decomposed into word objects. These word objects are all unique and therefore will point to different sentences. If I perform a search for one word it's very easy to figure out all of the sentences that word is related to. How can I structure a query to figure out the same information for two words instead of one.
I would like to submit two or more words and find a path that includes all words submitted picking up all sentences of interest.
I just remembered an alternate approach that may work better. Compare the PROFILE on this query with the profiles for the others, see if it works better for you.
WITH {myListOfWords} as wordList
WITH wordList, size(wordList) as wordCnt
MATCH (s)-[:RELATED_TO]->(w:Word)
WHERE w.text in wordList
WITH s, wordCnt, count(DISTINCT w) as cnt
WHERE wordCnt = cnt
RETURN s
Unfortunately it's not a very pretty approach, it basically comes down to collecting :Word nodes and using the ALL() predicate to ensure that the pattern you want holds true for all elements of the collection.
MATCH (w:Word)
WHERE w.text in {myListOfWords}
WITH collect(w) as words
MATCH (s:Sentence)
WHERE ALL(word in words WHERE (s)-[:RELATED_TO]->(word))
RETURN s
What makes this ugly is that the planner isn't intelligent enough right now to infer that when you say MATCH (s:Sentence) WHERE ALL(word in words ... that the initial matches for s ought to come from the match from the first w in your words collection, so it starts out from all :Sentence nodes first, which is a major performance hit.
So to get around this, we have to explicitly match from the first of the words collection, and then use WHERE ALL() for the remaining.
MATCH (w:Word)
WHERE w.text in {myListOfWords}
WITH w, size(()-[:RELATED_TO]->(w)) as rels
WITH w ORDER BY rels ASC
WITH collect(w) as words
WITH head(words) as head, tail(words) as words
MATCH (s)-[:RELATED_TO]->(head)
WHERE ALL(word in words WHERE (s)-[:RELATED_TO]->(word))
RETURN s
EDIT:
Added an optimization to order your w nodes by the degree of their incoming :RELATED_TO relationships (this is a degree lookup on very few nodes), as this will mean the initial match to your :Sentence nodes is the smallest possible starting set before you filter for relationships from the rest of the words.
As an alternative, you could consider using manual indexing (also called "legacy indexing") instead of using Word nodes and RELATED_TO relationships. Manual indexes support "fulltext" searches using lucene.
There are many apoc procedures that help you with this.
Here is an example that might work for you. In this example, I assume case-insensitive comparisons are OK, you retain the Sentence nodes (and their text properties), and you want to automatically add the text properties of all Sentence nodes to a manual index.
If you are using neo4j 3.2+, you have to add this setting to the neo4j.conf file to make some expensive apoc.index procedures (like apoc.index.addAllNodes) available:
dbms.security.procedures.unrestricted=apoc.*
Execute this Cypher code to initialize a manual index named "WordIndex" with the text text from all existing Sentence nodes, and to enable automatic indexing from that point onwards:
CALL apoc.index.addAllNodes('WordIndex', {Sentence: ['text']}, {autoUpdate: true})
YIELD label, property, nodeCount
RETURN *;
To find (case insensitively) the Sentence nodes containing all the words in a collection (passed via a $words parameter), you'd execute a Cypher statement like the one below. The WITH clause builds the lucene query string (e.g., "foo AND bar") for you. Caveat: since lucene's special boolean terms (like "AND" and "OR") are always in uppercase, you should make sure the words you pass in are lowercased (or modify the WITH clause below to use the TOLOWER()` function as needed).
WITH REDUCE(s = $words[0], x IN $words[1..] | s + ' AND ' + x) AS q
CALL apoc.index.search('WordIndex', q) YIELD node
RETURN node;

NEO4J nodes under a node filter by relationship

How can we add a relationship to the query.
Say A-[C01]-B-[C02]-D and A-[C01]-B-[C03]-E
C01 C02 C03 are relationship codes I want to get output
B E
because I want only nodes that can be reached unbroken by C01 or C03
How can I get this result in Cypher?
You may want to clarify, what you're asking for seems like a very simple case of matching. You may want to provide some more info, such as node labels and how you're matching to your start nodes, since without these we have to make things up for example code.
MATCH (a:Thing)
WHERE a.ID = 123
WITH a
MATCH (a)-[:C01|C03*]->(b:Thing)
RETURN b
The key here is specifying multiple relationship types to traverse, using * for multiplicity, so it will match on all nodes that can be reached by any chain of those relationships.

Neo4j - return results from match results starting from specific node

Lets say i have nodes that are connected in FRIEND relationship.
I want to query 2 of them each time, so i use SKIP and LIMIT to maintain this.
However, if someone adds a FRIEND in between calls, this messes up my results (since suddenly the 'whole list' is pushed 1 index forward).
For example, lets say i had this list of friends (ordered by some parameter):
A B C D
I query the first time, so i get A B (skipped 0 and limited 2).
Then someone adds a friend named E, list is now E A B C D.
now the second query will return B C (skipped 2 and limited 2). Notice B returned twice because the skipping method is not aware of the changes that the DB had.
Is there a way to return 2 each time starting considering the previous query? For example, if i knew that B was last returned from the query, i could provide it to the query and it would query the 2 NEXT, getting C D (Which is correct) instead of B C.
I tried finding a solution and i read about START and indexes but i am not sure how to do this.
Thanks for your time!
You could store a timestamp when the FRIEND relationship was created and order by that property.
When the FRIEND relationship is created, add a timestamp property:
MATCH (a:Person {name: "Bob"}), (b:Person {name: "Mike"})
CREATE (a)-[r:FRIEND]->(b)
SET r.created = timestamp()
Then when you are paginating through friends two at a time you can order by the created property:
MATCH (a:Person {name: "Bob"})-[r:FRIEND]->(friends)
RETURN friends SKIP {page_number_times_page_size} LIMIT {page_size}
ORDER BY r.created
You can parameterize this query with the page size (the number of friends to return) and the number of friends to skip based on which page you want.
Sorry, if It's not exactly answer to you question. On my previous project I had experience of modifying big data. It wasn't possible to modify everything with one query so I needed to split it in batches. First I started with skip limit. But for some reason in some cases it worked unpredictable (not modified all the data). And when I become tired of finding the reason I changed my approach. I used Java for querying database. So I get all the ids that I needed to modify in first query. And after this I run through stored ids.

neo4j cypher: "stacking" nodes from query result

Considering the existence of three types of nodes in a db, connected by the schema
(a)-[ra {qty}]->(b)-[rb {qty}]->(c)
with the user being able to have some of each in their wishlist or whatever.
What would be the best way to query the database to return a list of all the nodes the user has on their wishlist, considering that when he has an (a) then in the result the associated (b) and (c) should also be returned after having multiplied some of their fields (say b.price and c.price) for the respective ra.qty and rb.qty?
NOTE: you can find the same problem without the variable length over here
Assuming you have users connected to the things they want like so:
(user:User)-[:WANTS]->(part:Part)
And that parts, like you describe, have dependencies on other parts in specific quantities:
CREATE
(a:Part) -[:CONTAINS {qty:2}]->(b:Part),
(a:Part) -[:CONTAINS {qty:3}]->(c:Part),
(b:Part) -[:CONTAINS {qty:2}]->(c:Part)
Then you can find all parts, and how many of each, you need like so:
MATCH
(user:User {name:"Steven"})-[:WANTS]->(part),
chain=(part)-[:CONTAINS*1..4]->(subcomponent:Part)
RETURN subcomponent, sum( reduce( total=1, r IN relationships(chain) | total * r.rty) )
The 1..4 term says to look between 1-4 sub-components down the tree. You can obv. set that to whatever you like, including "1..", infinite depth.
The second term there is a bit complex. It helps to try the query without the sum to see what it does. Without that, the reduce will do the multiplying of parts that you want for each "chain" of dependencies. Adding the sum will then aggregate the result by subcomponent (inferred from your RETURN clause) and sum up the total count for that subcomponent.
Figuring out the price is then an excercise of multiplying the aggregate quantities of each part. I'll leave that as an exercise for the reader ;)
You can try this out by running the queries in the online console at http://console.neo4j.org/

Resources