Find nodes with the same relationships as the initial node - neo4j

I have customers (id, name, type), commerces (id, name, type) and relationships between them (idcustomer, idcommerce, quantity) that indicate that a customer has bought in a commerce, and the quantity bought.
I want to find the nodes that have the same relationships as the origin node. That is, if customer 1 bought in commerces id=10 and id=11, I want to get the other customers who have bought in (at least) exactly the same commerces as customer 1, in order to recommend the rest of their commerces.
Right now I have the following command, but it doesn't work: it returns all customers that have bought in any one of the commerces where customer 1 bought, not only those who bought in all of them.
START m=node:id(id="1") MATCH (m)-[:BUY]->(commerces)<-[:BUY]-(customers) RETURN customers;
Example
Customer 1 bought commerce 10, 11
Customer 2 bought commerce 10, 3
Customer 3 bought commerce 10, 11, 4
Customer 4 bought commerce 5, 8, 10
The result I want is Customer 3, in order to recommend commerce 4.
Thank you.

Here is one solution.
The first query gets all of the commerces the start node m bought in; that is the collect(commerce) of the first WITH clause, named mCos.
The second query gets the commerces each customer shares with m; that is the customerCommerces of the second WITH clause.
Then the WHERE clause eliminates the customers who share only a proper subset of the commerces m bought in, so the query returns only the customers who share all of them with m.
START m=node:id(id="1")
MATCH (m)-[:BUY]->(commerce)
WITH collect(commerce) AS mCos
START m=node:id(id="1")
MATCH (m)-[:BUY]->(commerce)<-[:BUY]-(customer)
WITH mCos, customer, collect(commerce) AS customerCommerces
WHERE length(mCos) = length(customerCommerces)
RETURN customer
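To make the counting trick easier to follow, here is a minimal Python sketch of the same idea outside Neo4j, using the purchases from the question's example. The names `purchases` and `recommend` are made up for illustration; this is not part of the Cypher answer, only a way to check the logic by hand.

```python
# Purchases taken from the example in the question: customer id -> set of commerce ids.
purchases = {
    1: {10, 11},
    2: {10, 3},
    3: {10, 11, 4},
    4: {5, 8, 10},
}


def recommend(origin, purchases):
    """Return {customer: commerces to recommend} for every customer who
    bought in at least all the commerces the origin customer bought in."""
    origin_commerces = purchases[origin]
    result = {}
    for customer, commerces in purchases.items():
        if customer == origin:
            continue
        # Commerces this customer shares with the origin (customerCommerces in the query).
        shared = commerces & origin_commerces
        # Same test as the WHERE clause: the shared set is as large as the origin's set.
        if len(shared) == len(origin_commerces):
            result[customer] = commerces - origin_commerces
    return result


recommendations = recommend(1, purchases)
print(recommendations)  # {3: {4}}
```

Only customer 3 shares both commerces 10 and 11 with customer 1, so commerce 4 is the recommendation.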

Related

Product recommendation cypher

I have the following simple graph -
I wish to build a simple recommendation system on the basis of the following example:
Consider that we have invoice 1 with an Article "Apple".
We also have invoice 2 which has "Apple" and "Oranges".
Customer of invoice 1 should be recommended "Oranges".
Basically, when a customer adds an item to an invoice, we need to recommend articles that were added to another invoice that shares at least one article with the current invoice, and that are not already in the current invoice.
Another way to say this -
When an article A exists in Invoice 1 AND Invoice 2 also contains article A, then list all other articles in Invoice 2 provided they do not exist in Invoice 1.
However, as a complete beginner I'm unable to figure out how to write the Cypher query. Any help on how to write such a query?
Something like the following should work to start with:
MATCH (i:Invoice)-[]-(a:Article)-[]-(:Invoice)-[]-(b:Article)
WHERE i.invoiceNumber = 123
RETURN b;
What it does is start from the invoice, navigate through the articles connected to that invoice, and continue on to the other invoices that share those articles. From there it collects all the articles connected to those invoices.
(This assumes that you are using unique Article nodes and connecting the invoices to them.)
For a given customer (let's say Customer1), you can use the query below to get the other customers and the recommended food, based on any food that Customer1 ordered in common with those customers.
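As a rough illustration of that traversal, here is a small Python sketch over the Apple/Oranges example from the question. The `invoices` dictionary and the `recommend_articles` function are hypothetical names; the sketch also filters out articles already on the current invoice, as the question requires.

```python
# Invoice contents from the example in the question: invoice number -> article names.
invoices = {
    1: {"Apple"},
    2: {"Apple", "Oranges"},
}


def recommend_articles(current, invoices):
    """Collect articles from other invoices that share at least one article
    with the current invoice, leaving out articles already on it."""
    current_articles = invoices[current]
    recommended = set()
    for number, articles in invoices.items():
        if number == current:
            continue
        if articles & current_articles:  # the other invoice shares an article
            recommended |= articles - current_articles
    return recommended


oranges = recommend_articles(1, invoices)
print(oranges)  # {'Oranges'}
```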
MATCH (c1:Customer {name: 'Customer1'})<-[:GENERATED_FOR]-(:Invoice)<-[:ITEMIZED_IN]-(:Article)-[:TYPE]->(f:FoodArticle)
WITH c1, collect(f) as food
MATCH (c2:Customer)<-[:GENERATED_FOR]-(:Invoice)<-[:ITEMIZED_IN]-(:Article)-[:TYPE]->(f2:FoodArticle)
WHERE c1 <> c2 AND f2 in food
WITH c2, food, collect(f2) as food2
WITH c2, [fd IN food WHERE NOT fd IN food2] as recommendations
WHERE size(recommendations) > 0
RETURN c2.name, recommendations
First, get all the food that Customer1 has ordered.
Next, find all customers that have at least one food contained in Customer1's food.
For each such customer (customer2), collect the food they share with Customer1 (food2).
Create a list of recommended food from the items found in Customer1's food list but not in customer2's food list.
Return customer2's name and the recommended food, but only if at least one food in Customer1's list is not found in customer2's list (food2).
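The steps above can be sketched in plain Python to check the logic on a small data set. The `orders` data and the function name `recommendations_for_others` are invented for this illustration and do not come from the original graph.

```python
# Hypothetical data: customer name -> set of foods ordered.
orders = {
    "Customer1": {"pizza", "salad", "soup"},
    "Customer2": {"pizza"},
    "Customer3": {"salad", "soup", "pizza"},
    "Customer4": {"burger"},
}


def recommendations_for_others(origin, orders):
    """Mirror the five steps of the query for the given origin customer."""
    food = orders[origin]                # step 1: everything the origin ordered
    result = {}
    for customer, ordered in orders.items():
        if customer == origin:
            continue
        food2 = ordered & food           # steps 2-3: foods shared with the origin
        if not food2:
            continue                     # no food in common, so not considered
        missing = food - food2           # step 4: in origin's list but not in food2
        if missing:                      # step 5: keep only non-empty recommendations
            result[customer] = missing
    return result


recs = recommendations_for_others("Customer1", orders)
print(recs)
```

With this data only Customer2 is returned, with salad and soup as recommendations: Customer3 already shares everything and Customer4 shares nothing.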

Filtering a 2-level hierarchy with contains in ODATA-v4

Assume the relationships between Race, Team and Car Manufacturer are all many-to-many: a race can feature many teams, a team can have many car manufacturers, car manufacturers can sponsor many teams, and teams can enter many races.
Using OData v4, how can I select all races featuring cars by specified manufacturers?
If I wanted to select all the races including the teams with id 475 or 476, I would form my OData query as
Race?$expand=Team($select=id,name)&$filter=((Team/any(c:((c/id eq 475) or (c/id eq 476)))))
But how would I form my URL if I wanted to select all races featuring teams that use the car manufacturers ford or chevy?
In SQL I would just do:
SELECT *
FROM   race
WHERE  id IN (SELECT raceid
              FROM   race_team
              WHERE  teamid IN (SELECT teamid
                                FROM   team_carmanufacturer
                                WHERE  carid IN (SELECT id
                                                 FROM   carmanufacturer
                                                 WHERE  name IN ('ford', 'chevy'))))
race_team and team_carmanufacturer are just the many-to-many mapping tables in the database.
You could try something like this:
Race?$filter=Team/any(y:y/Manufacturer/any(z:z/name eq 'ford' or z/name eq 'chevy' ))&$expand=Team($expand=Manufacturer)
It's a bit hard to get this correct without an endpoint.
Also make sure your controller supports an expansion depth of at least 2.
TBH, I don't think this is a very practical approach over time. Consider using a static endpoint that takes your manufacturer brands as a parameter.
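If it helps to reason about the nested any without an endpoint, the filter can be read as two nested "exists" checks. Here is a small Python sketch over made-up in-memory data; the `races` list and the `features_manufacturer` function are assumptions for illustration, not part of any OData library.

```python
# Hypothetical in-memory data shaped like the Race -> Team -> Manufacturer model.
races = [
    {"id": 1, "Team": [{"id": 475, "Manufacturer": [{"name": "ford"}]}]},
    {"id": 2, "Team": [{"id": 476, "Manufacturer": [{"name": "toyota"}]}]},
    {"id": 3, "Team": [
        {"id": 477, "Manufacturer": [{"name": "honda"}]},
        {"id": 478, "Manufacturer": [{"name": "chevy"}, {"name": "toyota"}]},
    ]},
]

wanted = {"ford", "chevy"}


def features_manufacturer(race, wanted):
    """True if any team in the race has any manufacturer in `wanted`,
    i.e. Team/any(y: y/Manufacturer/any(z: z/name eq ...))."""
    return any(
        any(m["name"] in wanted for m in team["Manufacturer"])
        for team in race["Team"]
    )


matching_ids = [race["id"] for race in races if features_manufacturer(race, wanted)]
print(matching_ids)  # [1, 3]
```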
Cheers

Iterate over a list in Neo4j

I am working on a Neo4j database and I want to replicate the scenario described below.
I have 2 node types, Product and Customer. In the Customer node I am storing the customer id and a list of products, and in the Product node I am storing only the productid.
Customer has values {custId:1,products:[1,2,3,4]}
Product has values {productid:1},{productid:2},{productid:3},{productid:4}
Now what I want to do is this:
I need to replace all these ids with autogenerated ids after adding the nodes to the graph database, something like set custId=ID(customer) and productId=ID(product). What I am stuck on is how to iterate over the list of products in the Customer node and change each product id to the autogenerated id.
Any help is appreciated.
Storing the product IDs that the database generates automatically in an array property on the customer is the wrong idea, in every sense.
The graph approach is to establish a relationship between each Customer node and its corresponding Product nodes, and then delete the products property from Customer and the productid property from Product:
MATCH (Customer:Customer)
UNWIND Customer.products as prodID
MATCH (Product:Product {productid: prodID})
MERGE (Customer)-[r:hasProduct]->(Product)
WITH Customer, count(Product) as mergedProduct
REMOVE Customer.products
WITH count(Customer) as totalMerged
MATCH (Product:Product)
REMOVE Product.productid
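To show what the query does to the data, here is a rough Python sketch that turns the list property into explicit edges, using the values from the question. The dictionaries and the `link_products` function are illustrative stand-ins for nodes and relationships, not Neo4j API calls, and unlike the query the sketch keeps productid so the edges stay readable.

```python
# Values from the question, modelled as plain dictionaries.
customers = [{"custId": 1, "products": [1, 2, 3, 4]}]
products = [{"productid": 1}, {"productid": 2}, {"productid": 3}, {"productid": 4}]


def link_products(customers, products):
    """Replace each customer's `products` list with (customer, type, product) edges."""
    known = {p["productid"] for p in products}
    edges = []
    for customer in customers:
        # pop() plays the role of UNWIND followed by REMOVE Customer.products.
        for prod_id in customer.pop("products", []):
            if prod_id in known:                      # MATCH: the product must exist
                edge = (customer["custId"], "hasProduct", prod_id)
                if edge not in edges:                 # MERGE: never create duplicates
                    edges.append(edge)
    return edges


edges = link_products(customers, products)
print(edges)
print(customers)  # [{'custId': 1}]
```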

Joining four tables but excluding duplicates

I am trying to join four tables (users, user_payments, content_type and media_content) but I always get duplicates. Instead of seeing, for example, that user Smith purchased media_content_id_purchase 5011 for a price of 3.99 and streamed media_content_stream_id 5000 for a price of 0.001 per min, I get
multiple combinations, such as media_content_id_purchase 5011 costing 3.99, 1.99, 6.99, etc., together with a media_content_id_stream that also has all sorts of prices.
This is my query:
select u.surname, up.media_content_id_purchase, ct.purchase_price, up.media_content_id_stream, ct.stream_price, ct.min_price
from users u, user_payments up, content_type ct, media_content mc
where u.user_ID=up.user_ID_purchase and
up.media_content_ID_purchase=mc.media_content_ID or up.media_content_ID_purchase is null and
ct.content_type_ID=mc.content_type_ID;
My goal is to display each user and what they have consumed with the corresponding prices.
Thanks!!!
Perhaps you should try using SELECT DISTINCT?
http://www.w3schools.com/sql/sql_distinct.asp
As described there, SELECT DISTINCT returns only the different (distinct) rows.
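Here is a minimal sketch of what DISTINCT does, using Python's built-in sqlite3 module and a made-up `purchases` table (not the tables from the question). It only collapses rows that are identical in every selected column, so rows that pair the same content with different prices would still all be returned.

```python
import sqlite3

# Hypothetical single-table example, only to show the effect of DISTINCT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (surname TEXT, media_content_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("Smith", 5011, 3.99),
        ("Smith", 5011, 3.99),  # exact duplicate row
        ("Smith", 5011, 1.99),  # same content, different price
    ],
)

all_rows = conn.execute(
    "SELECT surname, media_content_id, price FROM purchases ORDER BY price"
).fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT surname, media_content_id, price FROM purchases ORDER BY price"
).fetchall()

print(len(all_rows))   # 3
print(distinct_rows)   # [('Smith', 5011, 1.99), ('Smith', 5011, 3.99)]
```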

How to create a dimensional model with different metrics depending on the hierarchical level

I need to create a dimensional environment for sales analysis for a retail company.
The hierarchy that will be present in my Sales fact is:
1 - Country
1.1 - Region
1.1.1 - State
1.1.1.1 - City
1.1.1.1.1 - Neighbourhood
1.1.1.1.1.1 - Store
1.1.1.1.1.1.1 - Section
1.1.1.1.1.1.1.1 - Category
1.1.1.1.1.1.1.1.1 - Subcategory
1.1.1.1.1.1.1.1.1.1 - Product
Metrics such as Number of Sales, Revenue and Average Ticket (Revenue / Number of Sales) make sense up to the Subcategory level, because if I reach the Product level the aggregation composition will need to change (I guess).
Also, metrics such as Productivity, which is Revenue / Number of Store Staff, won't make sense in this fact table, because they only work up to the Store level (also, I guess).
I'd like to know the best way to resolve this, because all of these metrics are about Sales, but some make sense only up to a specific level of my hierarchy and others don't.
Waiting for a reply, and thanks in advance!
You should split your hierarchy into 2 dimensions, Stores and Products.
The Stores dimension is all about the location of the sale, and you can put the number of employees in this dimension:
Store_Key  STORE   Neighbourhood  City  Country  Num_Staff
1          Store1  4th Street     LA    US       10
2          Store2  Main Street    NY    US       2
The products dimension looks like
Product_Key  Prod_Name      SubCat    Category   Unit_Cost
1            Cheese Sticks  Dairy     Food       $2.00
2            Timer          Software  Computing  $25.00
Then your fact table has a record for each sale, and is keyed to the above dimensions:
Store_Key  Product_Key  Date       Quantity  Tot_Amount
1          1            31/7/2014  5         $10.00      (store1 sells 5 cheese)
1          2            31/7/2014  1         $25.00      (store1 sells 1 timer)
2          1            31/7/2014  3         $6.00       (store2 sells 3 cheese)
2          2            31/7/2014  1         $25.00      (store2 sells 1 timer)
Now that your data is in place you can use your reporting tool to get the measures you need. Example SQL is something like below
SELECT store.STORE,
       SUM(fact.Tot_Amount) AS revenue,
       COUNT(*) AS num_sales,
       SUM(fact.Tot_Amount) / store.Num_Staff AS Productivity
FROM tbl_Store store, tb_Fact fact
WHERE fact.Store_Key = store.Store_Key
GROUP BY store.STORE, store.Num_Staff
should return the following result
STORE   revenue  num_sales  Productivity
Store1  $35.00   2          3.5
Store2  $31.00   2          15.5
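To check the numbers, the sample rows can be loaded into an in-memory SQLite database from Python. This is only a sketch under some assumptions: the table names tbl_Store and tb_Fact are taken from the query above, the Date column and the Products dimension are left out because the store-level query does not need them, and amounts are stored as plain numbers rather than formatted currency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Store (Store_Key INTEGER PRIMARY KEY, STORE TEXT, Num_Staff INTEGER);
CREATE TABLE tb_Fact (Store_Key INTEGER, Product_Key INTEGER, Quantity INTEGER, Tot_Amount REAL);
INSERT INTO tbl_Store VALUES (1, 'Store1', 10), (2, 'Store2', 2);
INSERT INTO tb_Fact VALUES (1, 1, 5, 10.00), (1, 2, 1, 25.00), (2, 1, 3, 6.00), (2, 2, 1, 25.00);
""")

# Store-level measures: Num_Staff comes from the dimension, so it is part of the GROUP BY.
rows = conn.execute("""
SELECT store.STORE,
       SUM(fact.Tot_Amount)                   AS revenue,
       COUNT(*)                               AS num_sales,
       SUM(fact.Tot_Amount) / store.Num_Staff AS productivity
FROM tbl_Store store
JOIN tb_Fact fact ON fact.Store_Key = store.Store_Key
GROUP BY store.STORE, store.Num_Staff
ORDER BY store.STORE
""").fetchall()

for row in rows:
    print(row)
# ('Store1', 35.0, 2, 3.5)
# ('Store2', 31.0, 2, 15.5)
```

Because the staff count lives in the Stores dimension rather than in the fact table, Productivity is only computed when the query groups at the Store level, which is the point of splitting the hierarchy.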

Resources