JPQL: joining two entities with no direct relation

I have an issue when trying to join two tables that have neither a foreign key nor a direct entity relation between them in my Java code. I am using the following JPQL query:
SELECT p FROM P p, OM orgm WHERE p.o.id = orgm.o.id and p.u.id = orgm.u.id and orgm.ma = true and p.u.id = ? AND p.o.id IN (:oId);
But this is translated into a MySQL query containing a "cross join", which I expect to be expensive.
What I need is a similar query that produces an inner join between the two tables in the generated MySQL query.
I tried using the "WITH" clause, but it does not seem to work with an inner join here.
Please advise what can be done in this scenario.
Thanks in advance.
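For what it's worth, a comma-separated FROM list with the equality predicates in WHERE describes the same result set as an explicit INNER JOIN ... ON, and MySQL documents JOIN, CROSS JOIN and INNER JOIN as syntactic equivalents. The sketch below checks the result equivalence with Python's built-in sqlite3 rather than MySQL. The tables p and om and the columns o_id, u_id and ma are hypothetical stand-ins for the P and OM entities, and the literal IN list replaces the :oId parameter.

```python
import sqlite3

# Hypothetical schema standing in for the P and OM entities.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p  (id INTEGER PRIMARY KEY, o_id INTEGER, u_id INTEGER);
    CREATE TABLE om (id INTEGER PRIMARY KEY, o_id INTEGER, u_id INTEGER, ma INTEGER);
    INSERT INTO p  VALUES (1, 10, 100), (2, 20, 100), (3, 10, 200);
    INSERT INTO om VALUES (1, 10, 100, 1), (2, 20, 100, 0), (3, 10, 200, 1);
""")

# Implicit join: comma-separated FROM list, join predicates in WHERE.
implicit_rows = conn.execute("""
    SELECT p.id
    FROM p, om
    WHERE p.o_id = om.o_id AND p.u_id = om.u_id
      AND om.ma = 1 AND p.u_id = ? AND p.o_id IN (10, 20)
    ORDER BY p.id
""", (100,)).fetchall()

# Explicit join: the same predicates moved into the ON clause.
explicit_rows = conn.execute("""
    SELECT p.id
    FROM p
    INNER JOIN om ON p.o_id = om.o_id AND p.u_id = om.u_id
    WHERE om.ma = 1 AND p.u_id = ? AND p.o_id IN (10, 20)
    ORDER BY p.id
""", (100,)).fetchall()

print(implicit_rows == explicit_rows)
```

Running EXPLAIN on the generated MySQL statement is the way to confirm how the join is actually executed in your setup.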

Related

Why does Hive warn that this subquery would cause a Cartesian product?

According to Hive's documentation it supports NOT IN subqueries in a WHERE clause, provided that the subquery is an uncorrelated subquery (does not reference columns from the main query).
However, when I attempt to run the trivial query below, I get an error FAILED: SemanticException Cartesian products are disabled for safety reasons.
-- sample data
CREATE TEMPORARY TABLE foods (name STRING);
CREATE TEMPORARY TABLE vegetables (name STRING);
INSERT INTO foods VALUES ('steak'), ('eggs'), ('celery'), ('onion'), ('carrot');
INSERT INTO vegetables VALUES ('celery'), ('onion'), ('carrot');
-- the problematic query
SELECT *
FROM foods
WHERE foods.name NOT IN (SELECT vegetables.name FROM vegetables)
Note that if I use an IN clause instead of a NOT IN clause, it actually works fine, which is perplexing because the query evaluation structure should be the same in either case.
Is there a workaround for this, or another way to filter values from a query based on their presence in another table?
This is Hive 2.3.4 btw, running on an Amazon EMR cluster.
Not sure why you would get that error. One workaround is to use NOT EXISTS:
SELECT f.*
FROM foods f
WHERE NOT EXISTS (SELECT 1
FROM vegetables v
WHERE v.name = f.name)
or a LEFT JOIN:
SELECT f.*
FROM foods f
LEFT JOIN vegetables v ON v.name = f.name
WHERE v.name IS NULL
You get a Cartesian join because that is how Hive executes this query. The vegetables table is very small (three rows in the sample data), so it is broadcast to perform the cross join (most probably a map join; check the plan). Hive performs the cross (map) join first and then applies the filter. The explicit LEFT JOIN with a filter, as @VamsiPrabhala suggested, forces a left join instead, but in this case it works out the same, because the table is so small that the cross join is cheap.
Execute EXPLAIN on your query and you will see what is exactly happening.
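The three formulations from this thread can be compared side by side. This is a minimal sketch using Python's built-in sqlite3, not Hive, so it only illustrates the SQL semantics and says nothing about Hive's plans. It also shows one semantic difference that is a plausible reason engines treat NOT IN specially: once the subquery returns a NULL, NOT IN yields no rows, while NOT EXISTS and the LEFT JOIN are unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foods (name TEXT);
    CREATE TABLE vegetables (name TEXT);
    INSERT INTO foods VALUES ('steak'), ('eggs'), ('celery'), ('onion'), ('carrot');
    INSERT INTO vegetables VALUES ('celery'), ('onion'), ('carrot');
""")

QUERIES = {
    "not_in": """
        SELECT f.name FROM foods f
        WHERE f.name NOT IN (SELECT v.name FROM vegetables v)
        ORDER BY f.name""",
    "not_exists": """
        SELECT f.name FROM foods f
        WHERE NOT EXISTS (SELECT 1 FROM vegetables v WHERE v.name = f.name)
        ORDER BY f.name""",
    "left_join": """
        SELECT f.name FROM foods f
        LEFT JOIN vegetables v ON v.name = f.name
        WHERE v.name IS NULL
        ORDER BY f.name""",
}

def run_all():
    """Run every formulation and return {label: list of food names}."""
    return {label: [row[0] for row in conn.execute(sql)]
            for label, sql in QUERIES.items()}

before = run_all()

# A single NULL in the subquery changes what NOT IN returns.
conn.execute("INSERT INTO vegetables VALUES (NULL)")
after = run_all()

print(before)
print(after)
```

With the sample data all three agree; after the NULL is inserted only the NOT IN variant changes, so NOT EXISTS is usually the safer rewrite when the column is nullable.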

Unusual Joins SQL

I am converting code written by a former employee to work in a new database. In doing so I came across some joins I have never seen before; I do not fully understand how they work or whether they need to be written this way.
The joins look like this:
From Table A
Join(Table B
Join Table C
on B.Field1 = C.Field1)
On A.Field1 = B.Field1
Does this code function differently from something like this:
From Table A
Join Table B
On A.Field1 = B.Field1
Join Table C
On B.Field1 = C.Field1
If there is a difference please explain the purpose of the first set of code.
All of this is done in SQL Server 2012. Thanks in advance for any help you can provide.
I could create a temp table and then join to that. But why use up the cycles/RAM on additional storage and indexes if I can just do it on the fly?
I ran across this scenario today in SSRS: a user wanted to see all the individuals granted access through an AD group. The user was using a cursor and some temp tables to get the users out of AD and then joining each user to each SSRS object (folders, reports, linked reports) associated with the AD group. I simplified the whole thing with CROSS APPLY and a subquery.
GroupMembers table columns: GroupName, UserID, UserName, AccountType, AccountTypeDesc
SSRSOjbects_Permissions table columns: Path, PathType, RoleName, RoleDesc, Name (AD group name)
The query needs to return each individual in an AD group associated with each report. Basically a Cartesian product of users to reports within a subset of data. The easiest way to do this looks like this:
select
G.GroupName, G.UserID, G.UserName, G.AccountType, G.AccountTypeDesc,
[Path], PathType, RoleName, RoleDesc
from
GroupMembers G
cross apply
(select
[Path], PathType, RoleName, RoleDesc
from
SSRSOjbects_Permissions
where
Name = G.GroupName) S;
You could achieve this with a temp table and some outer joins, but why waste system resources?
I have seen this kind of join before; it is the MS Access style for handling multi-table joins. In MS Access you need to nest each subsequent join in its own level of parentheses. So, for example, this T-SQL join:
SELECT a.columna, b.columnb, c.columnc
FROM tablea AS a
LEFT JOIN tableb AS b ON a.id = b.id
LEFT JOIN tablec AS c ON a.id = c.id
has to be written like this in MS Access:
SELECT a.columna, b.columnb, c.columnc
FROM ((tablea AS a) LEFT JOIN tableb AS b ON a.id = b.id) LEFT JOIN tablec AS c ON a.id = c.id
So, yes, I believe you are right in your assumption.
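Whether the nested form behaves differently depends on the join type. For plain inner joins the two shapes return the same rows, but once an outer join is involved the nesting controls which pair of tables is joined first. A small sketch with Python's built-in sqlite3 (not SQL Server); the tables a, b, c and the column f1 are hypothetical, and a derived table stands in for the parenthesized join so the example does not depend on nested-join syntax support:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f1 INTEGER);
    CREATE TABLE b (f1 INTEGER);
    CREATE TABLE c (f1 INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (2);
    INSERT INTO c VALUES (1);
""")

# Derived table equivalent of "(B JOIN C ON B.Field1 = C.Field1)".
NESTED_BC = """(SELECT b.f1 AS b_f1, c.f1 AS c_f1
                FROM b JOIN c ON b.f1 = c.f1) AS bc"""

def rows(sql):
    return conn.execute(sql).fetchall()

# Inner joins only: flat and nested shapes agree.
inner_flat = rows("""
    SELECT a.f1, b.f1, c.f1 FROM a
    JOIN b ON a.f1 = b.f1
    JOIN c ON b.f1 = c.f1
    ORDER BY a.f1""")
inner_nested = rows(f"""
    SELECT a.f1, bc.b_f1, bc.c_f1 FROM a
    JOIN {NESTED_BC} ON a.f1 = bc.b_f1
    ORDER BY a.f1""")

# With a LEFT JOIN the shapes differ: the flat form's trailing inner join
# discards the unmatched row from a, the nested form keeps it.
outer_flat = rows("""
    SELECT a.f1, b.f1, c.f1 FROM a
    LEFT JOIN b ON a.f1 = b.f1
    JOIN c ON b.f1 = c.f1
    ORDER BY a.f1""")
outer_nested = rows(f"""
    SELECT a.f1, bc.b_f1, bc.c_f1 FROM a
    LEFT JOIN {NESTED_BC} ON a.f1 = bc.b_f1
    ORDER BY a.f1""")

print(inner_flat == inner_nested, outer_flat == outer_nested)
```

So if the original code uses only inner joins, flattening it should be safe; if any of the nested joins is an outer join, the parentheses may be there on purpose.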

Impala join with or query

I am trying to perform a join in impala as such:
Select * from Table1 t1
left outer join Table2 t2 on (t1.column1 = t2.column1 OR t1.column2 = t2.column2)
But I get the following error:
NotImplementedException: Join with 't2' requires at least one conjunctive equality predicate.
To perform a Cartesian product between two tables, use a CROSS JOIN.
I have tried using a CROSS JOIN but it does not work either.
Is it possible to perform or queries on a join in Impala? Is there a work around?
I have tried it using an AND query and it runs successfully.
Any help or advice is appreciated.
As suggested on the Impala JIRA, you can try rewriting your query with a UNION ALL clause. Unfortunately you'll have to do the deduplication following the UNION ALL manually.
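One way the rewrite could look is sketched below with Python's built-in sqlite3, not Impala, so it only checks that the rewritten query returns the same rows as the OR join. Each branch joins on a single equality, and a third branch restores the Table1 rows with no match at all, which the LEFT OUTER JOIN would have kept. The id columns and the sample rows are assumptions for illustration. Plain UNION is used for the deduplicated variant; with UNION ALL a pair matching on both columns appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT);
    INSERT INTO t1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    INSERT INTO t2 VALUES (10, 'a', 'q'), (20, 'p', 'y'), (30, 'a', 'x');
""")

# Reference result: the original OR join (accepted by SQLite).
or_join = conn.execute("""
    SELECT t1.id, t2.id FROM t1
    LEFT OUTER JOIN t2
        ON (t1.column1 = t2.column1 OR t1.column2 = t2.column2)
""").fetchall()

# One equality join per OR term, plus the unmatched rows of t1.
BRANCHES = """
    SELECT t1.id AS t1_id, t2.id AS t2_id
    FROM t1 JOIN t2 ON t1.column1 = t2.column1
    {op}
    SELECT t1.id, t2.id
    FROM t1 JOIN t2 ON t1.column2 = t2.column2
    {op}
    SELECT t1.id, NULL
    FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.column1 = t1.column1)
      AND NOT EXISTS (SELECT 1 FROM t2 WHERE t2.column2 = t1.column2)
"""

union_all = conn.execute(BRANCHES.format(op="UNION ALL")).fetchall()
union = conn.execute(BRANCHES.format(op="UNION")).fetchall()

print(len(or_join), len(union_all), len(union))
print(set(or_join) == set(union))
```

The deduplication is only safe when the selected columns identify a row pair uniquely, as the two id columns do here.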

Rails: Joining by one table OR another table

I'm looking for a more efficient way to write an ActiveRecord query. I want to get all instances of a model that either join one table or another table. Both is easy, but either is difficult.
Right now, I have the following two queries:
across_clues = Clue.joins(:across_cells)
down_clues = Clue.joins(:down_cells)
(Followed by the unsatisfactory clues = (across_clues + down_clues).uniq.sort_by{|clue| clue.id} )
I'm wondering how to write a single query that will give me the union of both of my queries. That way I can let Postgres do the heavy lifting and keep Rails from getting its hands dirty.
I know how to get the intersection of the two sets:
bad_clues = Clue.joins(:across_cells, :down_cells)
but I haven't seen a good way to get their union. Any help would be appreciated and loved!
(For posterity)
I used UNION DISTINCT according to shiva's answer, but just slightly modified it to be less hard-coded:
across_query = Clue.joins(:across_cells).to_sql
down_query = Clue.joins(:down_cells).to_sql
clues = Clue.find_by_sql("(#{across_query}) UNION DISTINCT (#{down_query})")
It works!
The key is that you need to use find_by_sql and UNION DISTINCT.
I am a MySQL guy, so here is how I would do it:
Clue.find_by_sql("(SELECT clue.* FROM clue
INNER JOIN across_cell ON across_cell.clue_id=clue.id)
UNION DISTINCT
(SELECT clue.* FROM clue
INNER JOIN down_cell ON down_cell.clue_id = clue.id)")
What about
across_clues = Clue.joins(:across_cells)
down_clues = Clue.joins(:down_cells)
Clue.where do
(id.in across_clues.select{id}) | (id.in down_clues.select{id})
end
with Squeel?
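For comparison, the union of the two joins can also be expressed as a single SELECT with two EXISTS tests, which avoids both the UNION and the duplicate rows the joins produce. The sketch below uses Python's built-in sqlite3 rather than Rails and Postgres; the singular table names follow the MySQL answer above, the label column and sample rows are made up, and plain UNION is used because SQLite does not accept the UNION DISTINCT spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clue (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE across_cell (id INTEGER PRIMARY KEY, clue_id INTEGER);
    CREATE TABLE down_cell (id INTEGER PRIMARY KEY, clue_id INTEGER);
    INSERT INTO clue VALUES
        (1, 'only across'), (2, 'only down'), (3, 'both'), (4, 'neither');
    INSERT INTO across_cell (clue_id) VALUES (1), (1), (3);
    INSERT INTO down_cell (clue_id) VALUES (2), (3), (3);
""")

# UNION of the two joins, as in the answers above.
union_ids = [row[0] for row in conn.execute("""
    SELECT clue.id FROM clue
    INNER JOIN across_cell ON across_cell.clue_id = clue.id
    UNION
    SELECT clue.id FROM clue
    INNER JOIN down_cell ON down_cell.clue_id = clue.id
    ORDER BY 1
""")]

# Single SELECT: keep a clue if it has a cell in either table.
exists_ids = [row[0] for row in conn.execute("""
    SELECT clue.id FROM clue
    WHERE EXISTS (SELECT 1 FROM across_cell WHERE across_cell.clue_id = clue.id)
       OR EXISTS (SELECT 1 FROM down_cell WHERE down_cell.clue_id = clue.id)
    ORDER BY clue.id
""")]

print(union_ids, exists_ids)
```

In ActiveRecord the EXISTS form would go into a where clause with an SQL string; that part is not shown here.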

Excluding data using NOT IN in the join using DB2, SQL PL

I need assistance in formulating the correct approach to a query.
I have staff members that I need to give work to. If they're not available on a date, they're excluded from the group of staff members that can get work. I think it's clear what I'm trying to do, but it's incorrect syntax:
INNER JOIN mySchema."STAFF" S
ON RS.STAFF_ID = S.STAFF_ID
AND RS.STAFF_ID NOT IN (SELECT SU.STAFF_ID
FROM mySchema."STAFF_UNAVAIL" SU
WHERE SU.UNAVAIL_DT = OUTSTANDING_DATE)
Any ideas on how one could achieve a NOT IN in a join without actually doing it in the join?
Put it in a WHERE clause after the joins:
INNER JOIN mySchema."STAFF" S
ON RS.STAFF_ID = S.STAFF_ID
...any other joins...
WHERE RS.STAFF_ID NOT IN (SELECT SU.STAFF_ID
FROM mySchema."STAFF_UNAVAIL" SU
WHERE SU.UNAVAIL_DT = OUTSTANDING_DATE)
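The WHERE-clause version can be tried end to end with a small sketch. It uses Python's built-in sqlite3 rather than DB2; roster_staff is a hypothetical table behind the RS alias, the names and dates are invented, and OUTSTANDING_DATE is passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE roster_staff (staff_id INTEGER);   -- hypothetical RS table
    CREATE TABLE staff_unavail (staff_id INTEGER, unavail_dt TEXT);
    INSERT INTO staff VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO roster_staff VALUES (1), (2), (3);
    INSERT INTO staff_unavail VALUES (2, '2024-01-15'), (3, '2024-01-16');
""")

def available_staff(outstanding_date):
    """Names of rostered staff who are not unavailable on the given date."""
    sql = """
        SELECT s.name
        FROM roster_staff rs
        INNER JOIN staff s ON rs.staff_id = s.staff_id
        WHERE rs.staff_id NOT IN (SELECT su.staff_id
                                  FROM staff_unavail su
                                  WHERE su.unavail_dt = ?)
        ORDER BY s.name
    """
    return [row[0] for row in conn.execute(sql, (outstanding_date,))]

print(available_staff("2024-01-15"))
print(available_staff("2024-01-17"))
```

Because the join is an INNER JOIN, filtering in WHERE removes the same rows that the condition in the ON clause was meant to remove.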
