depth in nodes of the same label in custom query - neo4j

I have in my database Company with:
#Relationship(type = "COLLECTS_FROM")
private Set<Company> subCompanies
What I'm trying to achieve is to create two queries:
get the company with limited depth 2 for nodes of the same type (so company which collects from different companies, which collects from different companies)
get the company with full depth for nodes of the same type (similar to previous, but just all the way down without any limitations.
I had a few attempts to solve this and here are those:
first approach was to use findById(ID,2) and findById(ID,999999), but this query has a huge performance drop and I ran into StackOverflow exception, that's why I'm trying to create my own methods similar to what I found here.
second approach was to use custom query (which were actually working in case of different node types):
#Query("Match (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN company,relation,secondCompany")
Company customerWithCustomDepth(String name);
Unfortunately while using the same node types I ran into the IncorrectResultSizeDataAccessException: Incorrect result size: expected at most 1 and when I changed return type to the list 3 records were returned (actually the first one of them is what I would like to get, but it seems like specifying secondCompany in the return statement populates nodes that I don't want to have in my return type.
on the third approach I did similar to the second, but it also returned 3 rows (the first one is what I wanted).
#Query("Match p = (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN p")
and fourth one once again 3 companies, instead of a single one (the first one is what I wanted)
#Query("MATCH (company:Company {name:$name} ) WITH company MATCH p=(company)-[COLLECTS_FROM*0..2]->(m) RETURN p")
And that's basically it. I feel like I'm close, but a little detail is missing. Maybe instead of trying to fix the query, I should somehow limit the result to get the first row from the Query? Any help will be really appreciated.
update after receiving #cybersam answer:
The general purpose of this query is to create a tree graph, that's why I would like to get a company with their sub-companies inside (and companies in the sub-companies), not only the parent one.
I could use second, or third approach and just grab the first element on the server side:
result.stream()
.findFirst()
.orElseThrow();
but I believe it should be able to do it on the service side.

To return the Company with the specified name if it is connected by a path of 1 to 2 COLLECTS_FROM relationships to descendant Company nodes:
#Query("MATCH p=(c:Company {name:$name})-[:COLLECTS_FROM*..2]->(:Company) WHERE ALL(n IN NODES(p) WHERE 'Company' IN LABELS(n)) RETURN c")
Company customerWithCustomDepth(String name);
I assume that you do not want to get a result if there are no such descendants, which is why the variable-length relationship pattern does not use *0..2.
Note: If you want to test for paths of exactly length 2, change ..2 to just 2.
The corresponding query for paths of any length can be obtained by just changing ..2 to ... However, variable-length relationship pattern with no upper bound are not recommended, since they can take "forever" to run or run out of memory.

Related

Match where AND on the same relationship

Given the following schema, the node in the middle is called Business and the sattelite ones are called BusinessType. The relationship is one to many as one Business has many BusinessType nodes. What I'm trying to do is get all Business nodes where the BusinessType equals a set of codes. The problem is that I'm getting back Business nodes which only have one part of the criteria. For example:
MATCH (business:Business {uuid: 'f872e7d4-1105-11ea-9461-4cedfb791f65'})-[r:BUSINESS_HAS_TYPE]->(bt:BusinessType)
return *
returns the result bellow.
MATCH (business:Business)
MATCH (business)-[r:BUSINESS_HAS_TYPE]->(bt:BusinessType)
WHERE bt.slug = 'rail-doors' AND bt.slug = 'pergolas'
RETURN business
What i'm after in the query above is to get all Business nodes where their BusinessType matches 'rail-doors' AND 'pergolas'. This returns null for some reason though clearly there's a potential match as shown from the previous query.
I also tried using ['rail-doors','pergolas'] but that's clearly wrong cause it returns any Business that has either one of the two.
Any input is welcome
A single node cannot have 2 different values for the same property at the same time. You want a query that looks for each desired value across all related :BusinessType nodes.
Try this query, instead:
MATCH (business:Business)-[:BUSINESS_HAS_TYPE]->(bt:BusinessType)
WITH business, COLLECT(bt.slug) AS slugs
WHERE ALL(x IN ['rail-doors', 'pergolas'] WHERE x IN slugs)
RETURN business

Find nodes in Neo4j that are at the end of all paths and not in another

We are in a POC for Neo4j. The use case is a dashboard where we only bring back opportunities for a seller that they are qualified for and have not already taken an action on. Currently there are 3 criteria and we are looking to add two more. The corresponding SQL is 3 pages so we are looking at a better way as when we add the next criteria, 2 more nodes paths in Neo, will be a bear in SQL. When I run the query below I get back a different amount of rows than the SQL. the buys returned must be at the end of all 3 paths and not be in the 4th. I hope you can point out where I went wrong. If this is a good query then I have a data problem.
Here is the query:
//oportunities dashboard
MATCH (s:SellerRep)-[:SELLS]->(subCat:ProductSubCategory)<-[:IS_FOR_SUBCAT]-(b:Buy)
MATCH (s:SellerRep)-[:SELLS_FOR]->(o:SellerOrg)-[:HAS_SELLER_TYPE]->(st:SellerType)<-[:IS_FOR_ST]-(b:Buy)
MATCH (s:SellerRep)-[:SELLS_FOR]->(o:SellerOrg)-[:IS_IN_SC]->(sc:SellerCommunity)<-[:IS_FOR_SC]-(b:Buy)
WHERE NOT (s:SellerRep)-[:PLACED_BID]->(:Bid)-[:IS_FOR_BUY]->(b:Buy)
AND s.sellerRepId = 217722 and b.currBuyStatus = 'Open'
RETURN b.buyNumber, b.buyDesc, st.sellerType, sc.communtiyName, subCat.subCategoryName+' - '+subCat.desc as sub_cat
If it helps, here is the data model:
POC Data model
Thanks for any help.
A WHERE clause only filters the immediately preceding MATCH clause.
Since you placed your WHERE clause after the third MATCH clause, the first 2 MATCH clauses are not bound to a specific SellerRep or Buy node, and are therefore bringing in more ProductSubCategory and SellerType nodes than you intended.
The following query is probably closer to what you intended:
MATCH (s:SellerRep)-[:SELLS]->(subCat:ProductSubCategory)<-[:IS_FOR_SUBCAT]-(b:Buy)
WHERE s.sellerRepId = 217722 AND b.currBuyStatus = 'Open' AND NOT (s:SellerRep)-[:PLACED_BID]->(:Bid)-[:IS_FOR_BUY]->(b:Buy)
MATCH (s)-[:SELLS_FOR]->(o:SellerOrg)-[:HAS_SELLER_TYPE]->(st:SellerType)<-[:IS_FOR_ST]-(b)
MATCH (o)-[:IS_IN_SC]->(sc:SellerCommunity)<-[:IS_FOR_SC]-(b)
RETURN b.buyNumber, b.buyDesc, st.sellerType, sc.communtiyName, subCat.subCategoryName+' - '+subCat.desc as sub_cat
NOTE: Your second and third MATCH clauses both started with (s:SellerRep)-[:SELLS_FOR]->(o:SellerOrg). I simplified the same logic by having my third MATCH clause just start with (o). Hopefully, you actually intended to force both clauses to refer to the same SellerOrg node.

Neo4j relate nodes by same property

I have a Neo4J DB up and running with currently 2 Labels: Company and Person.
Each Company Node has a Property called old_id.
Each Person Node has a Property called company.
Now I want to establish a relation between each Company and each Person where old_id and company share the same value.
Already tried suggestions from: Find Nodes with the same properties in Neo4J and
Find Nodes with the same properties in Neo4J
following the first link I tried:
MATCH (p:Person)
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
resulting in no change at all and as suggested by the second link I tried:
START
p=node(*), c=node(*)
WHERE
HAS(p.company) AND HAS(c.old_id) AND p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN p, c;
resulting in a runtime >36 hours. Now I had to abort the command without knowing if it would eventually have worked. Therefor I'd like to ask if its theoretically correct and I'm just impatient (the dataset is quite big tbh). Or if theres a more efficient way in doing it.
This simple console shows that your original query works as expected, assuming:
Your stated data model is correct
Your data actually has Person and Company nodes with matching company and old_id values, respectively.
Note that, in order to match, the values must be of the same type (e.g., both are strings, or both are integers, etc.).
So, check that #1 and #2 are true.
Depending on the size of your dataset you want to page it
create constraint on (c:Company) assert c.old_id is unique;
MATCH (p:Person)
WITH p SKIP 100000 LIMIT 100000
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN count(*);
Just increase the skip value from zero to your total number of people in 100k steps.

Any length relationship between two nodes that share the same label

I'm trying to find a query that will show me any length relationship that exists between two nodes that share the same index. Basically, if there is any overlap between for a specific label. My graph is Pretty simple and not particularly large:
(m:`Campaign`), (n:`Politician`), (o:`Assistant`), (p:`Staff`), (q:`Aid`), (s:`Contributor`)
(m)<-[:Campaigns_for]-(n)
(o)<-[:works_for]-(m)
(p)<-[:works_for]-(o)
(q)<-[:volunteers_for]-(p)
(m)<-[:contributes_to]-(s)
I want to find all the shared nodes and their relationships between Campaigns.
so far i have:
MATCH (n:`Campaign`)-[r*]-(m:`Campaign`)
RETURN n,count(r) as R,m
ORDER BY R DESC
but it's not returning everyhing I want, I want in addition to the counts, the labels of each relationship and the names of the nodes in between.
Assuming that "names of the nodes" means "return the name property of the node" (you could always substitute in "labels(n)" if you're after labels), then something like this might work, but you have some aggregation going on here so you may need to parse a bit:
MATCH p =(a:Campaign)-[r*]-(b:Campaign)
RETURN a, length(relationships(p)) AS count, b, extract(x IN relationships(p)| type(x)), extract(x IN nodes(p)| x.name)
ORDER BY count DESC
I'm also assuming that when you say "not returning everything you want", you mean that in addition to what's currently returned in your result set, you want just those other items you listed.
Keep in mind it might also be possible to have a cycle in your graph (not knowing too much about your particular graph), so, you may want to check your beginning and ending nodes.

neo4j cypher: "stacking" nodes from query result

Considering the existence of three types of nodes in a db, connected by the schema
(a)-[ra {qty}]->(b)-[rb {qty}]->(c)
with the user being able to have some of each in their wishlist or whatever.
What would be the best way to query the database to return a list of all the nodes the user has on their wishlist, considering that when he has an (a) then in the result the associated (b) and (c) should also be returned after having multiplied some of their fields (say b.price and c.price) for the respective ra.qty and rb.qty?
NOTE: you can find the same problem without the variable length over here
Assuming you have users connected to the things they want like so:
(user:User)-[:WANTS]->(part:Part)
And that parts, like you describe, have dependencies on other parts in specific quantities:
CREATE
(a:Part) -[:CONTAINS {qty:2}]->(b:Part),
(a:Part) -[:CONTAINS {qty:3}]->(c:Part),
(b:Part) -[:CONTAINS {qty:2}]->(c:Part)
Then you can find all parts, and how many of each, you need like so:
MATCH
(user:User {name:"Steven"})-[:WANTS]->(part),
chain=(part)-[:CONTAINS*1..4]->(subcomponent:Part)
RETURN subcomponent, sum( reduce( total=1, r IN relationships(chain) | total * r.rty) )
The 1..4 term says to look between 1-4 sub-components down the tree. You can obv. set that to whatever you like, including "1..", infinite depth.
The second term there is a bit complex. It helps to try the query without the sum to see what it does. Without that, the reduce will do the multiplying of parts that you want for each "chain" of dependencies. Adding the sum will then aggregate the result by subcomponent (inferred from your RETURN clause) and sum up the total count for that subcomponent.
Figuring out the price is then an excercise of multiplying the aggregate quantities of each part. I'll leave that as an exercise for the reader ;)
You can try this out by running the queries in the online console at http://console.neo4j.org/

Resources