I have the following HQL query:
select c.device from Choice c
group by c.device
Now I want to count the number of groups in the result not the number of devices per group.
I tried:
select count(distinct c.device) from Choice c
group by c.device
but this give the number of distinct devices in each group. This is something like [2,3,4]. But I need 2+3+4.
How do I get the number of groups with HQL?
You would have to do a count of a count, which is not supported in HQL.
SQL
In SQL it would look something like this:
select count(innerQuery.counted)
from (select count(d.id) as counted
from Choice c inner join Device d
on c.device_id = d.id
group by d.id) as innerQuery
In the example above the outer query selects from a sub-query which returns the groups. The outer query then does a count on the generated counted column.
HQL
An alternative is to do a count and then get the size of the list.
Choice.executeQuery('select count(c.device) from Choice c group by c.device').size()
There's a performance penalty because the counting is done on the client-side.
Related
I’m trying to join three tables. Two of these, N1 and N2, share an ID (PK) but they differ in the number of rows, so I have written two coalesce statements to clean up the PK and a related field:
SELECT
(SELECT COALESCE(N1.ID_Year, N2.ID_Year)) AS ID_Year,
(SELECT COALESCE(N1.ReceiverUniqueID, N2.ReceiverUniqueID)) AS AllNetworkRUID,
[…]
FROM N1
FULL OUTER JOIN N2
ON N1.ID_Year_RU = N2.ID_Year_RU
What I need to do is now join the third table, N3, to the second COALESCE field, ie to ‘AllNetworkRUID’, something like:
JOIN N3
ON N3.ID = AllnetworkRUID .
I’ve not been able to work out how to make that happen, of if it’s even possible.
Any suggestions?
We are aware of how map join and SMBM join works reducing the execution time( eliminating reduce phase i.e eliminating shuffle).
Ex: For join between two tables
select a.col1,b.col2 from
a join b on a.col1=b.col1
(both the tables are bucketed on col1 into same no of buckets)
But while joining with 3 or more tables on different columns,
Ex:
Select a. col1,b.col3,c.col2,d.date from
a join b on a.id=b.id join c on a.state=b.state join d on c.date=d.date
A scenario like this, how bucketing will help, if we don't want to split up the query in multiple smaller queries.
I may be asking too much but I want to do a left outer join between two cores
and get data from A only where B does not have related data.
Following is exactly my equivalent SQL query (for simplicity I have removed other conditions),
1. SELECT A.* FROM A AS A
WHERE A.ID NOT IN (SELECT B.A_ID FROM B AS B WHERE B.STATUS_ID != 1)
I understand that solr join is actually subquery, I need data from only A.
It would be very easy if the not was not there in where condition for sub query.
For example,
2. SELECT A.* FROM A AS A
WHERE A.ID IN (SELECT B.A_ID FROM B AS B WHERE B.STATUS_ID != 1)
I can have q={!join from=aId to=id fromIndex=b}(-statusId:1).
How can I do a nagete here, i.e. solr query for 1
I have graph: (:Sector)<-[:BELONGS_TO]-(:Company)-[:PRODUCE]->(:Product).
I'm looking for the query below.
Start with (:Sector). Then match first 50 companies in that sector and for each company match first 10 products.
First limit is simple. But what about limiting products.
Is it possible with cypher?
UPDATE
As #cybersam suggested below query will return valid results
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
MATCH (c)-[:PRODUCE]->(p:Product)
WITH c, (COLLECT(p))[0..10] AS products
RETURN c, products
However this solution doesn't scale as it still traverses all products per company. Slice applied after each company products collected. As number of products grows query performance will degrade.
Each returned row of this query will contain: a sector, one of its companies (at most 50 per sector), and a collection of up to 10 products for that company:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH s, (COLLECT(c))[0..50] AS companies
UNWIND companies AS company
MATCH (company)-[:PRODUCE]->(p:Product)
WITH s, company, (COLLECT(p))[0..10] AS products;
Updating with some solutions using APOC Procedures.
This Neo4j knowledge base article on limiting results per row describes a few different ways to do this.
One way is to use apoc.cypher.run() to execute a limited subquery per row. Applied to the query in question, this would work:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
CALL apoc.cypher.run('MATCH (c)-[:PRODUCE]->(p:Product) WITH p LIMIT 10 RETURN collect(p) as products', {c:c}) YIELD value
RETURN c, value.products AS products
The other alternative mentioned is using APOC path expander procedures, providing the label on a termination filter and a limit:
MATCH (s:Sector)<-[:BELONGS_TO]-(c:Company)
WITH c
LIMIT 50
CALL apoc.path.subgraphNodes(c, {maxLevel:1, relationshipFilter:'PRODUCE>', labelFilter:'/Product', limit:10}) YIELD node
RETURN c, collect(node) AS products
I have 3 oracle tables. A joins to B and B joins to C. I want all records from A irerspective of whether a corresponding record exists in B or C. I wrote a query like this:
select a.name from a,b,c where a.a_id = b.b_id(+) and b.b_id = c.c_id(+)
This query does not seem right to me, particularly with the second join. What will exactly happen if there is a record in A but correspondingly nothing in B and C? Will it still fetch the record?
For some reason the above query returns same count of records as select a.name from a
So I am guessing that the query is right? Also is there a better way to rewrite the query?
I presume the better query can be
Select a.name from A a left join B b on a.a_id=b.b_id inner join C c on b.b_id=c.c_id
This should give the result as you have expected
http://rajanmaharjan.com.np