Can I join a table to a coalesce field? - join

I’m trying to join three tables. Two of these, N1 and N2, share an ID (PK) but they differ in the number of rows, so I have written two coalesce statements to clean up the PK and a related field:
SELECT
(SELECT COALESCE(N1.ID_Year, N2.ID_Year)) AS ID_Year,
(SELECT COALESCE(N1.ReceiverUniqueID, N2.ReceiverUniqueID)) AS AllNetworkRUID,
[…]
FROM N1
FULL OUTER JOIN N2
ON N1.ID_Year_RU = N2.ID_Year_RU
What I need to do now is join the third table, N3, to the second COALESCE field, i.e. to ‘AllNetworkRUID’, something like:
JOIN N3
ON N3.ID = AllNetworkRUID
I’ve not been able to work out how to make that happen, or if it’s even possible.
Any suggestions?
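One possible approach, offered as a sketch rather than a definitive answer: a column alias from the SELECT list generally cannot be referenced in an ON clause of the same query, but the COALESCE expression itself can be repeated there (ON N3.ID = COALESCE(N1.ReceiverUniqueID, N2.ReceiverUniqueID)), or the first join can be wrapped in a derived table so that AllNetworkRUID becomes a real column. The example below shows the derived-table variant using Python's sqlite3 module. The N3.Label column and all row values are made up for illustration, and the FULL OUTER JOIN is emulated with LEFT JOIN plus UNION ALL only because older SQLite builds do not support FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE N1 (ID_Year TEXT, ReceiverUniqueID INTEGER, ID_Year_RU TEXT);
    CREATE TABLE N2 (ID_Year TEXT, ReceiverUniqueID INTEGER, ID_Year_RU TEXT);
    CREATE TABLE N3 (ID INTEGER, Label TEXT);  -- Label is a hypothetical column
    INSERT INTO N1 VALUES ('A_2020', 1, 'A_2020_1'), ('B_2020', 2, 'B_2020_2');
    INSERT INTO N2 VALUES ('B_2020', 2, 'B_2020_2'), ('C_2020', 3, 'C_2020_3');
    INSERT INTO N3 VALUES (1, 'one'), (2, 'two'), (3, 'three');
""")

query = """
    SELECT merged.ID_Year, merged.AllNetworkRUID, N3.Label
    FROM (
        -- Stand-in for: N1 FULL OUTER JOIN N2 ON N1.ID_Year_RU = N2.ID_Year_RU
        SELECT COALESCE(N1.ID_Year, N2.ID_Year) AS ID_Year,
               COALESCE(N1.ReceiverUniqueID, N2.ReceiverUniqueID) AS AllNetworkRUID
        FROM N1
        LEFT JOIN N2 ON N1.ID_Year_RU = N2.ID_Year_RU
        UNION ALL
        -- Rows that exist only in N2
        SELECT N2.ID_Year, N2.ReceiverUniqueID
        FROM N2
        LEFT JOIN N1 ON N1.ID_Year_RU = N2.ID_Year_RU
        WHERE N1.ID_Year_RU IS NULL
    ) AS merged
    -- The alias is now an ordinary column of the derived table
    JOIN N3 ON N3.ID = merged.AllNetworkRUID
    ORDER BY merged.AllNetworkRUID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

On an engine that supports FULL OUTER JOIN, the inner query can presumably stay exactly as written in the question; the only essential change is wrapping it so that N3 joins against the named column.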

Related

How does bucketing help with more than two tables, if at all? (Hive Sort Merge Bucket Join)

We are aware of how map join and SMB map join reduce execution time (by eliminating the reduce phase, i.e. eliminating the shuffle).
Ex: for a join between two tables
select a.col1,b.col2 from
a join b on a.col1=b.col1
(both tables are bucketed on col1 into the same number of buckets)
But when joining 3 or more tables on different columns,
Ex:
select a.col1,b.col3,c.col2,d.date from
a join b on a.id=b.id join c on a.state=c.state join d on c.date=d.date
In a scenario like this, how will bucketing help if we don't want to split the query into multiple smaller queries?

SQL Join :: Fetching records outside join condition

I have 2 tables A and B
A
B
The requirement is to join both tables on the id column and, in addition, if the fetched name value has another record with a different id, that record should also be fetched, as in the screenshot below.
Output :
Requirements:
Table B is terabytes in size, so a single join of both tables would be preferable.
The query needs to be executed on Hive.
I'm not familiar with HiveQL, but with regular SQL, you'll need to join table B to itself a second time as part of the query.
select
b_name.id, b_name.name
from
#table_A a
join #table_B b -- This table gets the "name" value for lookup
on (a.id=b.id)
join #table_B b_name -- This is the table you want to pull your "output" from
on (b.name=b_name.name)
This query essentially says that you need to find the value of the "name" column in table B, where there is a matching ID in table A, and then lookup all the rows with that name value in table B.
You can join the same table multiple times. So in the query below, b1 will give you all the names for the ids in A, and b2 is joined by name, to get you all the extra ids that are not in A.
select
b2.*
from
A
inner join B b1 on b1.id = A.id
inner join B b2 on b2.name = b1.name
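The double join above can be tried on toy data. This is only a sketch: the original tables were posted as screenshots, so the rows below are invented, and it runs on SQLite through Python's sqlite3 module rather than on Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample rows; the real tables were only shown as images
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER, name TEXT);
    INSERT INTO A VALUES (1);
    INSERT INTO B VALUES (1, 'x'), (2, 'x'), (3, 'y');
""")

self_join_rows = conn.execute("""
    SELECT b2.id, b2.name
    FROM A
    INNER JOIN B b1 ON b1.id = A.id        -- names for the ids present in A
    INNER JOIN B b2 ON b2.name = b1.name   -- every B row sharing those names
    ORDER BY b2.id
""").fetchall()
print(self_join_rows)
```

Id 2 is returned even though it is not in A, because it shares the name 'x' with id 1. If several ids in A map to the same name, the same B row will appear more than once, so a SELECT DISTINCT may be needed.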

Solr outer join / not join query

I may be asking too much but I want to do a left outer join between two cores
and get data from A only where B does not have related data.
Following is exactly my equivalent SQL query (for simplicity I have removed other conditions),
1. SELECT A.* FROM A AS A
WHERE A.ID NOT IN (SELECT B.A_ID FROM B AS B WHERE B.STATUS_ID != 1)
I understand that a Solr join is actually a subquery, and I need data only from A.
It would be very easy if the NOT were not in front of the subquery in the WHERE condition.
For example,
2. SELECT A.* FROM A AS A
WHERE A.ID IN (SELECT B.A_ID FROM B AS B WHERE B.STATUS_ID != 1)
I can have q={!join from=aId to=id fromIndex=b}(-statusId:1).
How can I negate the join here, i.e. what is the Solr query for 1?

Pig: Outer join on more than 2 relations

I want to do an outer join involving 3 tables. I tried with this:
features = JOIN group_event by group left outer, group_session by group, group_order by group;
I want all the rows of group_event to be present in the output even if one or both of the other 2 relations have no match for them.
The command above does not work, which is expected according to the documentation (http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#JOIN+%28outer%29):
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.
The split works and can be done like:
features1 = JOIN group_event by group left outer, group_session by group;
features2 = JOIN features1 by group_event::group left outer, group_order by group;
Any ideas on how to do this in a single command? (It would be useful when joining an even larger number of tables.)
I think at some point we need to trust the documentation: don't try a multi-way outer join in a single command.
Why? How should the following line work?
JOIN a BY a1 LEFT OUTER, b BY b1, c BY c1
Is the LEFT OUTER working for both tables, or just the first one? If the former, then should the LEFT OUTER between b and c remove all records not matched in b? Or in a? The more you look for it, the less sense it makes, doesn't it?
What you want to do is JOIN the relation a with b into ab, and then ab with c. If you think about it, it is not natural to do this within a single command because of the intermediate state ab.
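The two-step idea (a with b into ab, then ab with c) can be illustrated outside Pig with chained two-way left outer joins. The sketch below is an analogy only: it uses SQL on SQLite through Python's sqlite3 module, the column is named grp because GROUP is a reserved word in SQL, and the count columns and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the three Pig relations
    CREATE TABLE group_event   (grp TEXT, event_count INTEGER);
    CREATE TABLE group_session (grp TEXT, session_count INTEGER);
    CREATE TABLE group_order   (grp TEXT, order_count INTEGER);
    INSERT INTO group_event   VALUES ('g1', 5), ('g2', 7), ('g3', 2);
    INSERT INTO group_session VALUES ('g1', 3);
    INSERT INTO group_order   VALUES ('g1', 1), ('g2', 4);
""")

features = conn.execute("""
    SELECT e.grp, e.event_count, s.session_count, o.order_count
    FROM group_event e
    LEFT OUTER JOIN group_session s ON s.grp = e.grp  -- step 1, like features1
    LEFT OUTER JOIN group_order o ON o.grp = e.grp    -- step 2, like features2
    ORDER BY e.grp
""").fetchall()
print(features)
```

Each join is still a two-way operation applied to the result of the previous one, so every group_event row survives and unmatched columns come back as NULL (None in Python).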

oracle outer join query

I have 3 Oracle tables. A joins to B and B joins to C. I want all records from A irrespective of whether a corresponding record exists in B or C. I wrote a query like this:
select a.name from a,b,c where a.a_id = b.b_id(+) and b.b_id = c.c_id(+)
This query does not seem right to me, particularly with the second join. What will exactly happen if there is a record in A but correspondingly nothing in B and C? Will it still fetch the record?
For some reason the above query returns the same count of records as select a.name from a
So I am guessing that the query is right? Also is there a better way to rewrite the query?
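I presume a better query would be
Select a.name from A a left join B b on a.a_id=b.b_id left join C c on b.b_id=c.c_id
This should give the result you expect: both joins are outer joins, so rows of A survive even when B or C has no match. (An inner join to C would drop every row of A that has no match in B.)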
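A quick way to check how the second join behaves is to run both variants on toy data. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than Oracle, with ANSI join syntax instead of (+), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented rows: one A row matched in B and C, one in B only, one in neither
    CREATE TABLE a (a_id INTEGER, name TEXT);
    CREATE TABLE b (b_id INTEGER);
    CREATE TABLE c (c_id INTEGER);
    INSERT INTO a VALUES (1, 'in_b_and_c'), (2, 'in_b_only'), (3, 'in_a_only');
    INSERT INTO b VALUES (1), (2);
    INSERT INTO c VALUES (1);
""")

# Both joins outer: every row of A is kept
both_left = [r[0] for r in conn.execute("""
    SELECT a.name FROM a
    LEFT JOIN b ON a.a_id = b.b_id
    LEFT JOIN c ON b.b_id = c.c_id
    ORDER BY a.a_id
""")]

# Second join inner: rows of A without a match in C are filtered out
left_then_inner = [r[0] for r in conn.execute("""
    SELECT a.name FROM a
    LEFT JOIN b ON a.a_id = b.b_id
    INNER JOIN c ON b.b_id = c.c_id
    ORDER BY a.a_id
""")]

print(both_left)
print(left_then_inner)
```

With LEFT JOIN at both steps, all three names come back, which matches the observation that the (+) query returns the same count as select a.name from a. With an INNER JOIN to C, only the row matched in both B and C remains.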