Count associations from a where clause active record - ruby-on-rails

I have a list of Parents who have many Children and each Child has one and only one skill. The parents have a many-to-many relationship to children and Children can have the same skill.
I am trying to get a list of all parents and add a count field of how many children they have with a particular skill. Is this possible with active record?
My current solution is:
Parent.joins(:children).select('parents.*, COUNT(*) AS child_count').group('parents.id').where(children: {skill_name: skill})
This however doesn't return Parents with a count of 0 for child_count. Is there any way to accomplish this with Active Record? I want to return JSON for every parent with a count of how many of their children have a specific skill.

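.joins(:children) generates an INNER JOIN, which excludes parents with 0 children. You should change it to an OUTER JOIN. The association form of joins cannot do this (unless you want to perform eager loading, too), although Rails 5 and later also provide left_outer_joins. Assuming that your join table is named parents_children, you can build the join yourself. Both joins have to be LEFT OUTER JOINs, and the skill filter has to move out of the where clause into the ON condition; otherwise the parents without a matching child are filtered out again. For the same reason, select COUNT(children.id) rather than COUNT(*), since COUNT(*) reports 1 for a parent whose only row is the NULL-padded one:
.joins("LEFT OUTER JOIN parents_children ON parents_children.parent_id = parents.id LEFT OUTER JOIN children ON parents_children.child_id = children.id AND children.skill_name = #{Parent.connection.quote(skill)}")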
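To make the effect of the join type and filter placement concrete, here is a minimal sketch of the underlying SQL, run through Python's sqlite3 module rather than ActiveRecord. The table layout follows the names used above; the name column and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def count_children_with_skill(conn, skill):
    """Return {parent name: number of that parent's children with `skill`}."""
    rows = conn.execute(
        """
        SELECT parents.name, COUNT(children.id) AS child_count
        FROM parents
        LEFT OUTER JOIN parents_children
               ON parents_children.parent_id = parents.id
        LEFT OUTER JOIN children
               ON children.id = parents_children.child_id
              AND children.skill_name = ?   -- filter in ON, not in WHERE
        GROUP BY parents.id
        ORDER BY parents.id
        """,
        (skill,),
    ).fetchall()
    return dict(rows)


# Hypothetical sample data: Ann has three children, Bob one, Cat none.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE children (id INTEGER PRIMARY KEY, skill_name TEXT);
    CREATE TABLE parents_children (parent_id INTEGER, child_id INTEGER);
    INSERT INTO parents VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO children VALUES (1, 'chess'), (2, 'chess'), (3, 'piano');
    INSERT INTO parents_children VALUES (1, 1), (1, 2), (1, 3), (2, 3);
    """
)
counts = count_children_with_skill(conn, "chess")
print(counts)
```

Counting children.id instead of * is what lets a parent with no matching child come back with 0: the outer join pads that parent's row with NULLs, and COUNT ignores NULL values.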

Related

Cypher: get all the relationships of node with a specific relationship

I'm trying to find all the relationships of the nodes which have one specific relationship. People can be connected to events which in turn are connected to churches. I'm interested in the people who are connected as witnesses to events (marriages) in the following manner:
(p:person)-[:ACTED_AS_BEKENDE]-(e:event)
What I'm struggling with is that when I write a simple MATCH statement with a WHERE clause (see below), I only get the events to which people were connected via this specific relationship.
MATCH (p:person)--(e:event)--(c:church)
WHERE (p:person)-[:ACTED_AS_BEKENDE]-(e:event)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
To reiterate: I need a query which first selects the people who are tied to events via the [:ACTED_AS_BEKENDE] edge and then retrieves all the events to which these people were tied.
Do you need something like this?
MATCH (p:person)-[:ACTED_AS_BEKENDE]-(:event)
WITH p
MATCH (p)--(e:event)--(c:church)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
This first finds every person who has an ACTED_AS_BEKENDE relationship to some event and then, for those people, finds all the events and churches they are connected to, as you wanted.
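The two-stage logic of that query can be sketched outside Neo4j as well. The Python below is only an illustration over a hypothetical in-memory edge list (the people, events, churches and the other relationship types are invented); it is not how Neo4j evaluates the query.

```python
# Hypothetical toy graph: (person, relationship type, event) edges.
person_event_edges = [
    ("anna", "ACTED_AS_BEKENDE", "marriage_1"),
    ("anna", "ACTED_AS_PARENT", "baptism_1"),
    ("bert", "ACTED_AS_GROOM", "marriage_1"),
]
# Hypothetical event -> church connections.
event_church = {"marriage_1": "st_john", "baptism_1": "st_mary"}


def events_of_witnesses(edges, churches, witness_type="ACTED_AS_BEKENDE"):
    # Stage 1 (first MATCH ... WITH p): people with at least one witness edge.
    witnesses = {person for person, rel, _ in edges if rel == witness_type}
    # Stage 2 (second MATCH): every event of those people, via any edge type,
    # as long as the event is connected to a church.
    return sorted(
        {
            (person, event, churches[event])
            for person, _, event in edges
            if person in witnesses and event in churches
        }
    )


print(events_of_witnesses(person_event_edges, event_church))
```

Anna is returned with both her marriage and her baptism event, while Bert, who never acted as a witness, is dropped entirely.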

How to insert multiple properties with same name but different values in neo4j relation?

I have 2 nodes, person and job. The relation between them is VIEWED.
I need to store the list of timestamps at which the person viewed the job as a property of the VIEWED relation.
The best way to do this would be to use additional nodes (with label :Viewing) with a relationship to the :Person and the :Job nodes.
If you keep the timestamps on the relationship instead, you can use MERGE in Cypher to ensure that only a single VIEWED relationship exists between a Person and Job pair, and the ON CREATE and ON MATCH clauses to either initialize or append to the relationship's timestamp list.
For example:
MATCH (p:Person), (j:Job)
WHERE p.id = 123 AND j.id = 987
MERGE (p)-[r:VIEWED]->(j)
ON CREATE SET r.times = [datetime()]
ON MATCH SET r.times = r.times + datetime()
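The create-or-append behaviour of MERGE with ON CREATE and ON MATCH can be pictured with a plain dictionary. This is a rough Python analogy, not Neo4j code; the record_view helper and the viewed store are hypothetical names.

```python
from datetime import datetime, timezone


def record_view(viewed, person_id, job_id, when=None):
    """Keep one VIEWED entry per (person, job) pair and append each timestamp."""
    when = when or datetime.now(timezone.utc)
    key = (person_id, job_id)
    if key not in viewed:
        # Corresponds to ON CREATE: start the list with the first timestamp.
        viewed[key] = {"times": [when]}
    else:
        # Corresponds to ON MATCH: append to the existing list.
        viewed[key]["times"] = viewed[key]["times"] + [when]
    return viewed[key]


viewed = {}
first = datetime(2024, 1, 1, tzinfo=timezone.utc)
second = datetime(2024, 1, 2, tzinfo=timezone.utc)
record_view(viewed, 123, 987, first)
record_view(viewed, 123, 987, second)
print(viewed[(123, 987)]["times"])
```

However many times the pair is recorded, the store holds a single entry for it, which is the guarantee MERGE gives for the relationship.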

How to generate relationships using property information [Neo4j]

I have imported a CSV in which each node has 3 columns: id, parent_id, and title. This is a simple tree structure I had in MySQL. Now I need to create the relationships between those nodes based on the parent_id data, so each connected pair of nodes will have 2 relationships, parent and child. I'm really new to Neo4j; any suggestions?
I tried the following, but no luck:
MATCH (b:Branch {id}), (bb:Branch {parent_id})
CREATE (b)-[:PARENT]->(bb)
It seems as though your Cypher is very close. The first thing you will want to do is create an index on the id and parent_id properties for the label Branch.
CREATE INDEX ON :Branch(id)
CREATE INDEX ON :Branch(parent_id)
Once you have indexes created you want to match all of the nodes with the label Branch (I would limit this with a specific value to start to make sure you create exactly what you want) and for each find the corresponding parent by matching on your indexed attributes.
MATCH (b:Branch), (bb:Branch)
WHERE b.id = ???
AND b.parent_id = bb.id
CREATE (b)-[:PARENT]->(bb)
Once you have proved this out on one branch and you get the results you expect I would run it for more branches at once. You could still choose to do it in batches depending on the number of branches in your graph.
After you have created all of the :PARENT relationships you could optionally remove all of the parent_id properties.
MATCH (b:Branch)-[:PARENT]->(:Branch)
WHERE exists(b.parent_id)
REMOVE b.parent_id
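As a rough picture of what the indexed match does, the sketch below joins each imported row to its parent through a lookup table keyed on id, which plays the role of the index on :Branch(id). It is plain Python over hypothetical rows, not Neo4j code, and build_parent_edges is an invented helper name.

```python
def build_parent_edges(branches):
    """Return (child id, "PARENT", parent id) tuples for rows with a known parent."""
    # Lookup table keyed on id; stands in for the index on :Branch(id).
    by_id = {branch["id"]: branch for branch in branches}
    edges = []
    for branch in branches:
        parent = by_id.get(branch["parent_id"])
        if parent is not None:  # root rows have no parent and get no edge
            edges.append((branch["id"], "PARENT", parent["id"]))
    return edges


# Hypothetical rows as they might come out of the CSV import.
branches = [
    {"id": 1, "parent_id": None, "title": "root"},
    {"id": 2, "parent_id": 1, "title": "left"},
    {"id": 3, "parent_id": 1, "title": "right"},
    {"id": 4, "parent_id": 3, "title": "leaf"},
]
print(build_parent_edges(branches))
```

Without the lookup table, every row would have to be compared with every other row, which is the cost the index avoids in the graph.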

Neo4j indexing for large number of nodes

I am learning the basics of Neo4j and am looking at the following credit card fraud example: https://linkurio.us/stolen-credit-cards-and-fraud-detection-with-neo4j. The Cypher query that finds the stores where all compromised users shopped is:
MATCH (victim:person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
MATCH (victim)-[t:HAS_BOUGHT_AT]->(othermerchants)
WHERE t.status = "Undisputed" AND t.time < r.time
WITH victim, othermerchants, t ORDER BY t.time DESC
RETURN DISTINCT othermerchants.name as suspicious_store, count(DISTINCT t) as count, collect(DISTINCT victim.name) as victims
ORDER BY count DESC
However, when the number of users increases (say, to millions of users), this query may become slow, since the initial match has to traverse all nodes labeled person. Is it possible to speed up the query by assigning properties to nodes instead of transactions? I tried removing the status property from the relationships and adding it to the nodes (users, not merchants), so in my case person has one additional property, status. However, when I run the query with the constraint WHERE victim.status = "Disputed", it doesn't return anything. I assume I did a lot of things wrong, but would appreciate comments. For example,
MATCH (victim:person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE victim.status = "Disputed"
returns the correct number of disputed transactions. The same holds when querying the number of undisputed transactions separately. However, when the two are combined, they yield an empty set.
If my approach is mistaken, how can I speed up queries over a large number of nodes (that is, avoid traversing all nodes in the first step)? I will be working with a data set with similar properties but around 100 million users, so I would like to index users on additional properties.
[Edited]
Moving the status property from the relationship to the person node does not seem to be the right approach, since I presume the same person can be a customer of multiple merchants.
Instead, you can reify the relationship as a node (let's label it purchase), as in:
(:person)-[:HAS_PURCHASE]->(:purchase)-[:BOUGHT_AT]->(merchant)
The purchase nodes can have the status property. You just have to create the index:
CREATE INDEX ON :purchase(status)
Also, you can put the time property in the new purchase nodes.
With the above, your query would become:
MATCH (victim:person)-[:HAS_PURCHASE]->(pd:purchase)-[:BOUGHT_AT]->(merchant)
WHERE pd.status = "Disputed"
MATCH (victim)-[:HAS_PURCHASE]->(pu:purchase)-[:BOUGHT_AT]->(othermerchants)
WHERE pu.status = "Undisputed" AND pu.time < pd.time
WITH victim, othermerchants, pu ORDER BY pu.time DESC
RETURN DISTINCT othermerchants.name as suspicious_store, count(DISTINCT pu) as count, collect(DISTINCT victim.name) as victims
ORDER BY count DESC
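To show what the reified model computes, here is a small Python sketch in which each purchase is its own record carrying status and time, just like the purchase nodes. The sample data and the suspicious_stores helper are hypothetical, and the linear scan for disputed purchases is the step that the index on :purchase(status) would serve in Neo4j.

```python
from collections import defaultdict

# Hypothetical purchase records: (person, merchant, status, time).
purchases = [
    ("ann", "shop_x", "Undisputed", 1),
    ("ann", "shop_y", "Disputed", 2),
    ("bob", "shop_x", "Undisputed", 3),
    ("bob", "shop_z", "Disputed", 4),
    ("bob", "shop_w", "Undisputed", 5),
]


def suspicious_stores(purchases):
    """Rank merchants by distinct undisputed purchases made before a dispute."""
    by_person = defaultdict(list)
    for purchase_id, (person, merchant, status, time) in enumerate(purchases):
        by_person[person].append((purchase_id, merchant, status, time))

    earlier = defaultdict(set)  # merchant -> distinct undisputed purchase ids
    victims = defaultdict(set)  # merchant -> distinct victim names
    for person, _, status, disputed_at in purchases:
        if status != "Disputed":
            continue
        for purchase_id, merchant, other_status, time in by_person[person]:
            if other_status == "Undisputed" and time < disputed_at:
                earlier[merchant].add(purchase_id)
                victims[merchant].add(person)

    ranked = sorted(earlier, key=lambda m: (-len(earlier[m]), m))
    return [(m, len(earlier[m]), sorted(victims[m])) for m in ranked]


print(suspicious_stores(purchases))
```

Because status lives on the purchase rather than on the person, the same person can hold disputed and undisputed purchases at once, which is why moving status onto the person node returned nothing.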

Selecting Related Entities When No Association Exists

EF4 is great for pulling in associated data but what do you do when the association is not explicit? An example illustrates my situation:
MasterTable has a child1Id and child2Id column.
There are two tables Child1 and Child2 with corresponding primary key child1Id and child2Id. There are Master, Child1 and Child2 entities.
There is no foreign key or entity framework association between Master and Child1 / Child2 tables or entities.
How can I select the master records and corresponding child records from the two child tables when all I have are the matching child Ids in the master?
I can't retrofit a relationship or association.
Richard
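You must select them manually with LINQ to Entities. Here is how to do a left join from the master to both child tables (the entity set names Masters, Child1s and Child2s are assumed, since the question does not give them):
var query = from m in context.Masters
            where m.Id == 1 // example filter on a single master record
            // Join on the plain id columns; no association is required.
            join c1 in context.Child1s on m.child1Id equals c1.child1Id into child1Join
            from x1 in child1Join.DefaultIfEmpty()
            join c2 in context.Child2s on m.child2Id equals c2.child2Id into child2Join
            from x2 in child2Join.DefaultIfEmpty()
            select new
            {
                Master = m,
                Child1 = x1, // null when no matching Child1 row exists
                Child2 = x2  // null when no matching Child2 row exists
            };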
By the way, if your entities have a property that holds the primary key value of another entity, you can create the relation in the EF designer. In that case you will be able to use navigation properties.
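The left join by matching ids can also be pictured without any ORM. The Python sketch below pairs each master row with its child rows through plain id lookups; the rows and the left_join_children helper are hypothetical and only mirror the column names from the question.

```python
def left_join_children(masters, child1_rows, child2_rows):
    """Pair each master with its Child1/Child2 row, or None when there is no match."""
    child1_by_id = {row["child1Id"]: row for row in child1_rows}
    child2_by_id = {row["child2Id"]: row for row in child2_rows}
    return [
        {
            "master": master,
            # .get returns None for a missing id, like DefaultIfEmpty in a left join.
            "child1": child1_by_id.get(master["child1Id"]),
            "child2": child2_by_id.get(master["child2Id"]),
        }
        for master in masters
    ]


# Hypothetical rows; the second master points at a Child2 id that does not exist.
masters = [
    {"id": 1, "child1Id": 10, "child2Id": 20},
    {"id": 2, "child1Id": 11, "child2Id": 99},
]
child1_rows = [{"child1Id": 10, "name": "a"}, {"child1Id": 11, "name": "b"}]
child2_rows = [{"child2Id": 20, "name": "c"}]
joined = left_join_children(masters, child1_rows, child2_rows)
print(joined[1]["child2"])
```

Every master row survives the join, and a missing child simply shows up as None instead of removing the master from the result.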
