Returning nodes with specific relationship - neo4j

I have a db of drugs and manufacturers and I want to find all manufacturers who have produced multiple drugs. How can I get only the manufacturers and the drugs they have produced?
I'm currently using
match (a:Brand), (c:Manufacturer) where size((c)-[:PRODUCED]->()) >1 return a,c;
which returns manufacturers with more than one drug produced, but also all drugs, regardless of manufacturer.

This query uses the aggregating function, COLLECT, to return a record for each manufacturer who makes multiple brands, along with a collection of those brands:
MATCH (m:Manufacturer)-[:PRODUCED]->(b:Brand)
WITH m, COLLECT(b) AS brands
WHERE SIZE(brands) > 1
RETURN m, brands;

Sounds like you only need to select the manufacturers, like so:
MATCH (c:Manufacturer) WHERE size((c)-[:PRODUCED]->()) > 1 RETURN c;

Related

How do I self reference within a table in Neo4j?

I have a few tables loaded up in Neo4j. I have gone through some tutorials and come up with this cypher query.
MATCH (n:car_detail)
RETURN COUNT(DISTINCT n.model_year), n.model, n.maker_name
ORDER BY COUNT(DISTINCT n.model_year) desc
This query gave me all the cars that were continued or discontinued. The logic is that a count of one means the model was discontinued, and anything higher means it was continued.
My table car_detail has cars which were built in different years. I want to create a relationship saying, for example,
"Audi A4 2011" - (:CONTINUED) -> "Audi A4 2015" - (:CONTINUED) -> "Audi A4 2016"
So it sounds like you want to match to the model and make of the car, ordered by the model year ascending, and to create relationships between those nodes.
We can use APOC Procedures as a shortcut for creating the linked list through the ordered and collected nodes. You'll need to install APOC (the version that matches your Neo4j version) to take advantage of this capability, as the pure Cypher approach is quite ugly.
The query would look something like this:
MATCH (n:car_detail)
WITH n
ORDER BY n.model_year
WITH collect(n) as cars, n.model as model, n.maker_name as maker
WHERE size(cars) > 1
CALL apoc.nodes.link(cars, 'CONTINUED')
RETURN cars
The key here is that after we order the nodes, we aggregate the nodes with respect to the model and maker, which act as your grouping key (when aggregating, the non-aggregation variables become the grouping key for the aggregation). This means your ordered cars are going to be grouped per make and model, so all that's left is to use APOC to create the relationships linking the nodes in the list.
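To make the sort-then-group step concrete, here is a minimal plain-Python sketch of the same idea. It is only a model of the logic, not APOC or the Neo4j driver, and the sample rows and the helper name continued_links are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for (:car_detail) nodes.
cars = [
    {"maker_name": "Audi", "model": "A4", "model_year": 2015},
    {"maker_name": "Audi", "model": "A4", "model_year": 2011},
    {"maker_name": "Audi", "model": "A4", "model_year": 2016},
    {"maker_name": "Audi", "model": "TT", "model_year": 2014},
]

def continued_links(rows):
    """Return (older, newer) pairs for consecutive years of each make and model."""
    groups = defaultdict(list)
    # Sort first, then group: each group's list keeps the sorted order.
    for row in sorted(rows, key=lambda r: r["model_year"]):
        groups[(row["maker_name"], row["model"])].append(row)
    links = []
    for group in groups.values():
        if len(group) > 1:  # mirrors WHERE size(cars) > 1
            # Pair each car with the next one in its group, like a linked list.
            links.extend(zip(group, group[1:]))
    return links

for older, newer in continued_links(cars):
    print(older["model"], older["model_year"], "-[:CONTINUED]->", newer["model_year"])
```

The (maker_name, model) tuple plays the role of the grouping key, and the single TT row produces no link because its group has only one entry.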
You can just find both cars with MATCH and then connect them:
e.g.
MATCH (c1:car_detail)
where c1.model = 'Audi A4 2011'
MATCH (c2:car_detail)
where c2.model = 'Audi A4 2015'
CREATE (c1)-[:CONTINUED]->(c2);
etc.

Nodes with relationship to multiple nodes

I want to get the Persons that know everyone in a group of persons which know some specific places.
This:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(DISTINCT b) as persons
Match (a:Person)
WHERE ALL(b in persons WHERE (a)-[:knows]->(b))
RETURN a
works, but the second part does a full node label scan before applying the WHERE clause, which is extremely slow: in a bigger db it takes 8-9 seconds. I also tried this:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
Match (a:Person)-[:knows]->(b)
RETURN a
This only needs 2ms, however it returns all persons that know any person of group b, instead of those that know everyone.
So my question is: Is there an effective/fast query to get what I want?
We have a knowledge base article for this kind of query that shows a few approaches.
One of these is to match to :Persons who know members of the group, and then count the number of times each of those persons shows up in the results. Provided there aren't multiple :knows relationships between the same two people, if the count is equal to the size of the collection of people from your first match, then that person must know all of the people in the collection.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(b) as persons
UNWIND persons as b // so we have the entire list of persons along with each person
WITH size(persons) as total, b
MATCH (a:Person)-[:knows]->(b)
WITH total, a, count(a) as knownCount
WHERE total = knownCount
RETURN a
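If the counting trick is not obvious, this small plain-Python sketch models it on an in-memory edge list. The names, the edges and the helper knows_everyone are hypothetical, and it makes the same assumption as the query: at most one :knows relationship per pair.

```python
from collections import Counter

# Hypothetical (a)-[:knows]->(b) edges; the names are illustrative only.
knows = [
    ("Ann", "Bob"), ("Ann", "Cid"),
    ("Dee", "Bob"),
    ("Eve", "Bob"), ("Eve", "Cid"), ("Eve", "Zoe"),
]
group = {"Bob", "Cid"}  # stands in for the persons found by the first MATCH

def knows_everyone(edges, group):
    """People whose number of edges into the group equals the group size."""
    # Assumes at most one edge per (a, b) pair, as the answer requires.
    hits = Counter(a for a, b in edges if b in group)
    return sorted(a for a, n in hits.items() if n == len(group))

print(knows_everyone(knows, group))
```

Dee is dropped because she only reaches one of the two group members, and Eve's extra edge to Zoe is ignored because Zoe is not in the group.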
Here is a simpler Cypher query that also compares counts -- the same basic idea used by @InverseFalcon.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'}), (a:Person)-[:knows]->(b)
WITH COLLECT({a:a, b:b}) as data, COUNT(DISTINCT b) AS total
UNWIND data AS d
WITH total, d.a AS a, COUNT(d.b) AS bCount
WHERE total = bCount
RETURN a

Cypher query help: Order query results by content of property array

I have a bunch of venues in my Neo4j DB. Each venue object has the property 'catIds', which is an array containing the Ids for the types of venue it is. I want to query the database so that I get all Venues, ordered by how many of their catIds match a list of Ids that I give the query. I hope that makes sense :)
Please, could someone point me in the direction of how to write this query?
Since you're working in a graph database you could think about modeling your data in the graph, not in a property where it's hard to get at it. For example, in this case you might create a bunch of (v:venue) nodes and a bunch of (t:type) nodes, then link them by an [:is] relation. Each venue is linked to one or more type nodes. Each type node has an 'id' property: {id:'t1'}, {id:'t2'}, etc.
Then you could do a query like this:
match (v:venue)-[r:is]->(:type) return v, count(r) as n order by n desc;
This finds all your venues, along with ALL their type relations and returns them ordered by how many type-relations they have.
If you only want to get nodes of particular venue types on your list:
match (v:venue)-[r:is]-(t:type) where t.id in ['t1','t2'] return v, count(r) as n order by n desc;
And if you want ALL venues but rank ordered according to how well they fit your list, as I think you were looking for:
match (v:venue) optional match (v)-[r:is]->(t:type) where t.id in ['t1','t2'] return v, count(r) as n order by n desc;
The match will get all your venues; the optional match will find relations on your list if the node has any. If a node has no links on your list, the optional match binds r to null; count(r) ignores nulls and returns 0, so that venue sorts to the bottom.
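As a rough illustration of what that last query computes, here is a plain-Python sketch that ranks venues by how many of their ids are on the wanted list. It works directly on a catIds-style list rather than on type nodes, and the venue names, ids and the helper rank_venues are made up.

```python
# Hypothetical venues, each carrying a catIds list as in the question.
venues = [
    {"name": "Cafe", "catIds": ["t1", "t3"]},
    {"name": "Bar", "catIds": ["t1", "t2"]},
    {"name": "Gym", "catIds": ["t4"]},
]
wanted = {"t1", "t2"}

def rank_venues(venues, wanted):
    """Return (name, match_count) pairs, best match first; non-matching venues get 0."""
    scored = [
        (v["name"], sum(1 for c in v["catIds"] if c in wanted))
        for v in venues
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(rank_venues(venues, wanted))
```

Every venue stays in the result, and a venue with no matching id simply gets a count of 0 and lands at the end.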

Neo4j: multiple counts from multiple matches

Given a neo4j schema similar to
(:Person)-[:OWNS]-(:Book)-[:CATEGORIZED_AS]-(:Category)
I'm trying to write a query to get the count of books owned by each person as well as the count of books in each category so that I can calculate the percentage of books in each category for each person.
I've tried queries along the lines of
match (p:Person)-[:OWNS]-(b:Book)-[:CATEGORIZED_AS]-(c:Category)
where p.name in []
with p, b, c
match (p)-[:OWNS]-(b2:Book)-[:CATEGORIZED_AS]-(c2:Category)
with p, b, c, b2
return p.name, b.name, c.name,
count(distinct b) as count_books_in_category,
count(distinct b2) as count_books_total
But the query plan is absolutely horrible when trying to do the second match. I've tried to figure out different ways to write the query so that I can do the two different counts, but haven't figured out anything other than doing two matches. My schema isn't really about people and books. The :CATEGORIZED_AS relationship in my example is actually a few different relationship options, specified as [:option1|option2|option3]. So in my 2nd match I repeat the relationship options so that my total count is constrained by them.
Ideas? This feels similar to Neo4j - apply match to each result of previous match but there didn't seem to be a good answer for that one.
UNWIND is your friend here. First, calculate the total books per person, collecting them as you go.
Then unwind them so you can match which categories they belong to.
Aggregate by category and person, and you should get the number of books in each category for each person, alongside that person's total:
match (p:Person)-[:OWNS]->(b:Book)
with p,collect(b) as books, count(b) as total
with p,total,books
unwind books as book
match (book)-[:CATEGORIZED_AS]->(c)
return p,c, count(book) as subtotal, total
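The percentage itself is then just subtotal divided by total. Here is a minimal plain-Python sketch of that arithmetic over hypothetical (person, book, category) rows; the data and the helper category_shares are assumptions for illustration, not part of any Neo4j API.

```python
from collections import Counter

# Hypothetical (person, book, category) rows, one per OWNS + CATEGORIZED_AS path.
rows = [
    ("Ann", "Dune", "SciFi"),
    ("Ann", "Solaris", "SciFi"),
    ("Ann", "Neuromancer", "SciFi"),
    ("Ann", "Emma", "Classic"),
    ("Bob", "Dune", "SciFi"),
]

def category_shares(rows):
    """Map (person, category) to the percentage of that person's books in it."""
    books = {}
    for person, book, _ in rows:
        books.setdefault(person, set()).add(book)  # total = distinct books owned
    # set(rows) drops duplicate rows so each book counts once per category.
    subtotal = Counter((person, category) for person, _, category in set(rows))
    return {
        (person, category): 100.0 * n / len(books[person])
        for (person, category), n in subtotal.items()
    }

print(category_shares(rows))
```

A book that sits in two categories counts toward both subtotals but only once toward the total, so the shares for one person can add up to more than 100.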

How to make simple recommendation in Neo4j

I'm working on a simple demo in neo4j where I want to use recommendations based on orders and who has bought what. I've created a graph here: http://console.neo4j.org/?id=jvqr95.
Basically I have many relations like:
(o:Order)-[:INCLUDES]->(p:Product)
An order can have multiple products.
Given a specific product id, I would like to find other products that were in an order containing the product with that id, and I would like to order them by the number of orders each product is in.
I've tried the following:
MATCH (p:Product)--(o)-[:INCLUDES]->(p2:Product)--(o2)
WHERE p.name = "chocolate"
RETURN p2, COUNT(DISTINCT o2)
but that doesn't give me the result I want. For that query I expected to get chips back with a count of 2, but I only get a count of 1.
And for the following query:
MATCH (p:Product)--(o)-[:INCLUDES]->(p2:Product)--(o2)
WHERE p.name = "chips"
RETURN p2, COUNT(DISTINCT o2)
I expect to get chocolate and ball back where each has a count of 1, but I don't get anything back. What have I missed?
You're matching on too many things in your initial MATCH.
MATCH (o:Order)-[:INCLUDES]->(p:Product { name:'ball' })
MATCH (o)-[:INCLUDES]->(p2:Product)
WHERE p2 <> p
MATCH (o2:Order)-[:INCLUDES]->(p2)
RETURN p2.name AS Product, COUNT(DISTINCT o2) AS Count
ORDER BY Count DESC
In English: "Match on orders that include a specific product. For these orders, get all included products that are not the initial product. For these products, match on orders that they are included in. Return the product name and the count of how many orders it's been included in."
http://console.neo4j.org/?id=q49sx6
http://console.neo4j.org/?id=uy3t9e
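For readers who want to check the logic without a database, here is a plain-Python sketch of the same three steps. The orders are invented and are not the data behind the console links; also_bought is a hypothetical helper. It counts each order once per product, which is what COUNT(DISTINCT ...) would do in Cypher.

```python
from collections import Counter

# Hypothetical orders: order id -> set of product names.
orders = {
    "o1": {"chocolate", "chips"},
    "o2": {"chocolate", "chips", "ball"},
    "o3": {"chips"},
}

def also_bought(orders, product):
    """Products that share an order with `product`, ranked by how many orders include them."""
    # Steps 1 and 2: other products found in any order containing `product`.
    candidates = set()
    for items in orders.values():
        if product in items:
            candidates |= items - {product}
    # Step 3: count every order that includes each candidate, once per order.
    counts = Counter()
    for items in orders.values():
        for name in items & candidates:
            counts[name] += 1
    return counts.most_common()

print(also_bought(orders, "chocolate"))
```

Note that the count covers all orders containing the recommended product, not only the ones shared with the starting product, which matches the wording of the answer above.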
