Folks,
I've had a pretty thorough search before posting and couldn't see this answered anywhere previously. Perhaps it isn't possible... I'm using SQL Server 2008 R2.
Anyway, thanks in advance for looking/helping.
I have two tables that I'd like to join.
Table1 (t1):
Account------Name--------Amount
12345-------account1-----10000.00
12346-------account2-----20000.00
Table2 (t2):
ID-----Account---extraData
10-----12345-----ZZ100
20-----12345-----ZZ250
30-----12345-----ZZ400
10-----12346-----ZZ150
20-----12346-----ZZ200
I'm trying to return the following from the above tables:
t1.Account---t1.Name------ID1(t2.ID=10)---ID2(t2.ID=20)----SUM(Amount)
12345--------account1-------ZZ100------------ZZ250-------------10000.00
12346--------account2-------ZZ150------------ZZ200-------------20000.00
I have tried various joins of sorts and a union, but can't seem to get the results above. Most result in either nothing, or the Amount column returning as double the required result.
My starting point is:
Select t1.Account, t1.Name, t2A.extraData, t2B.extraData, SUM(t1.AMOUNT)
from table1 t1
join table2 t2A on t1.Account = t2A.Account and t2A.ID = '10'
join table2 t2B on t1.Account = t2B.Account and t2B.ID = '20'
Group by t1.Account, t1.Name, t2A.extraData, t2B.extraData
I've reduced the code and complexity of the query for this thread, but the problem is as above. I have no control over the table structure as they form part of an accounting system that I can't amend (I could, but I'd upset one or two people!).
Hopefully I've explained the issue clearly enough. It seems like it should be simple, but I can't seem to fathom it - perhaps I've just been staring too long. Anyway, thanks in advance for your assistance.
Edit: to change the code to reflect the first response highlighting a mistake in my posting.
Please try this. I think this helps you to achieve your result.
DECLARE @ids varchar(max)
SELECT @ids=STUFF((SELECT DISTINCT ', [' + CAST(ID AS VARCHAR(10))+']'
FROM t2
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,' ')
SELECT @ids
EXECUTE ('SELECT
Account,Name,'+@ids+',Amount
FROM
(SELECT t1.Account,Name,ID,ExtraData,SUM(Amount) AS Amount
FROM t1 t1 INNER JOIN t2 t2 ON t1.Account=t2.Account
GROUP BY t1.Account,Name,ID,ExtraData) AS SourceTable
PIVOT
(
MAX(ExtraData)
FOR ID IN ('+@ids+')
) AS PivotTable;')
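If only IDs 10 and 20 are ever needed, a static alternative to the dynamic PIVOT is conditional aggregation: collapse t2 to one row per account first, then join that to t1 so the Amount is not multiplied. This is a different technique from the PIVOT above, and the sketch below is only run against an in-memory SQLite database through Python (an assumption made so that it is runnable), not against SQL Server 2008 R2. The MAX(CASE ...) pattern is standard SQL, but treat the exact syntax as something to verify on your own server.

```python
import sqlite3

# In-memory SQLite stand-in for the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (Account INTEGER, Name TEXT, Amount REAL);
CREATE TABLE t2 (ID INTEGER, Account INTEGER, extraData TEXT);
INSERT INTO t1 VALUES (12345, 'account1', 10000.00), (12346, 'account2', 20000.00);
INSERT INTO t2 VALUES (10, 12345, 'ZZ100'), (20, 12345, 'ZZ250'), (30, 12345, 'ZZ400'),
                      (10, 12346, 'ZZ150'), (20, 12346, 'ZZ200');
""")

# Collapse t2 to one row per account before joining, so each t1 row
# is matched exactly once and SUM(Amount) is not doubled.
pivot_sql = """
SELECT t1.Account, t1.Name, p.ID1, p.ID2, SUM(t1.Amount) AS Amount
FROM t1
JOIN (
    SELECT Account,
           MAX(CASE WHEN ID = 10 THEN extraData END) AS ID1,
           MAX(CASE WHEN ID = 20 THEN extraData END) AS ID2
    FROM t2
    GROUP BY Account
) AS p ON p.Account = t1.Account
GROUP BY t1.Account, t1.Name, p.ID1, p.ID2
ORDER BY t1.Account
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

The trade-off is that the ID list is hard-coded; the dynamic version is still the way to go if the set of IDs changes.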
I have the need to join a huge table (10 million plus rows) to a lookup table (15k plus rows) with an OR condition. Something like:
SELECT t1.a, t1.b, nvl(t1.c, t2.c), nvl(t1.d, t2.d)
FROM table1 t1
JOIN table2 t2 ON t1.c = t2.c OR t1.d = t2.d;
This is because table1 can have c or d as NULL, and I'd like to join on whichever is available, leaving out the rest. The query plan says there is a Nested Loop, which I realize is because of the OR condition. Is there a clean, efficient way of solving this problem? I'm using Redshift.
EDIT: I am trying to run this with a UNION, but it doesn't seem to be any faster than before.
If you have a preferred column you can NVL() (aka COALESCE()) them and join on that.
SELECT t1.a, t1.b, nvl(t1.c, t2.c), nvl(t1.d, t2.d)
FROM table1 t1
JOIN table2 t2
ON t1.c = NVL(t2.c,t2.d);
I'd also suggest that you should set the lookup table to DISTSTYLE ALL to ensure that the larger table is not redistributed.
[ Also, 10 million rows isn't big for Redshift. Not trying to be snotty just saying that we get excellent performance on Redshift even when querying (and joining) tables with hundreds of billions of rows. ]
How about doing two (left) joins? With such a small lookup table, performance shouldn't be too bad.
SELECT t1.a, t1.b, nvl(t1.c, t2.c), nvl(t1.d, t3.d)
FROM table1 t1
LEFT JOIN table2 t2 ON t1.d = t2.d and t1.c is null
LEFT JOIN table2 t3 ON t1.c = t3.c and t1.d is null
Your original query only returns rows that match at least one of c or d in the lookup table. If that's not guaranteed you may need to add filters...for example rows in t1 where both c and d are null or have values not present in table2.
You don't really need the null checks in the joins, but they might make the query slightly faster.
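To make the two-left-joins idea concrete, here is a minimal sketch with made-up sample rows. It runs on an in-memory SQLite database through Python rather than on Redshift (an assumption, purely so it can be executed), and it uses COALESCE in place of NVL. The WHERE clause is one possible form of the extra filter mentioned above: it drops table1 rows that found no match through either join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (a INTEGER, b TEXT, c INTEGER, d TEXT);
CREATE TABLE table2 (c INTEGER, d TEXT);
-- Hypothetical sample data: each table1 row has c, d, or neither.
INSERT INTO table1 VALUES (1, 'x', 100, NULL), (2, 'y', NULL, 'D2'), (3, 'z', NULL, NULL);
INSERT INTO table2 VALUES (100, 'D1'), (200, 'D2');
""")

two_joins_sql = """
SELECT t1.a, t1.b, COALESCE(t1.c, t2.c), COALESCE(t1.d, t3.d)
FROM table1 t1
LEFT JOIN table2 t2 ON t1.d = t2.d AND t1.c IS NULL
LEFT JOIN table2 t3 ON t1.c = t3.c AND t1.d IS NULL
-- Keep only rows that matched through at least one of the joins,
-- mirroring the inner-join behaviour of the original OR query.
WHERE t2.c IS NOT NULL OR t3.c IS NOT NULL
ORDER BY t1.a
"""
joined_rows = conn.execute(two_joins_sql).fetchall()
for row in joined_rows:
    print(row)
```

Row 3 has neither c nor d, so it matches nothing and is filtered out; rows 1 and 2 each pick up their missing column from the lookup table.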
I'm still a novice at SQL and I need to run a report which JOINs 3 tables. The third table has duplicates of fields I need. So I tried to join with a distinct option but that didn't work. Can anyone suggest the right code I could use?
My Code looks like this:
SELECT
C.CUSTOMER_CODE
, MS.SALESMAN_NAME
, SUM(C.REVENUE_AMT)
FROM C_REVENUE_ANALYSIS C
JOIN M_CUSTOMER MC ON C.CUSTOMER_CODE = MC.CUSTOMER_CODE
/* This following JOIN is the issue. */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE = (SELECT SALESMAN_CODE FROM M_SALESMAN WHERE COMP_CODE = '00')
WHERE REVENUE_DATE >= :from_date
AND REVENUE_DATE <= :to_date
GROUP BY C.CUSTOMER_CODE, MS.SALESMAN_NAME
I also tried a different variation to get a DISTINCT.
/* I also tried this variation to get a distinct */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE =
(SELECT distinct(SALESMAN_CODE) FROM M_SALESMAN)
Please can anyone help? I would truly appreciate it.
Thanks in advance.
select distinct
c.customer_code,
ms.salesman_name,
SUM(c.revenue_amt)
FROM
c_revenue_analysis c,
m_customer mc,
m_salesman ms
where
c.customer_code = mc.customer_code
AND mc.salesman_code = ms.salesman_code
AND ms.comp_code = '00'
AND c.revenue_date BETWEEN :from_date AND :to_date
group by
c.customer_code, ms.salesman_name
The above will return every distinct combination of customer code, salesman name and SUM of revenue amount where the c.customer_code matches an mc.customer_code AND that same mc record matches an ms.salesman_code AND that ms record has a comp_code of '00' AND the revenue_date is between the from and to variables. The whole result is then grouped by customer code and salesman name, so each customer/salesman pair comes back as a single row with its summed revenue.
To explain, if you're just doing a straight JOIN, you don't need the JOIN keywords. I find they tend to complicate things; you only need them if you're doing an "odd" join, like a LEFT/RIGHT join. I don't know your data model so the above MIGHT still return duplicates but, if so, let me know.
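As a rough illustration of why the comp_code filter matters, the sketch below builds tiny versions of the three tables with invented rows, where salesman S1 appears once per company code. It uses explicit JOIN syntax and an in-memory SQLite database through Python; both are assumptions for the sake of a runnable example, since the question doesn't say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE C_REVENUE_ANALYSIS (CUSTOMER_CODE TEXT, REVENUE_AMT REAL, REVENUE_DATE TEXT);
CREATE TABLE M_CUSTOMER (CUSTOMER_CODE TEXT, SALESMAN_CODE TEXT);
CREATE TABLE M_SALESMAN (SALESMAN_CODE TEXT, SALESMAN_NAME TEXT, COMP_CODE TEXT);
-- Hypothetical data: salesman S1 is duplicated across COMP_CODE '00' and '01'.
INSERT INTO C_REVENUE_ANALYSIS VALUES
    ('C1', 100.0, '2020-01-10'), ('C1', 50.0, '2020-01-20'), ('C2', 70.0, '2020-02-01');
INSERT INTO M_CUSTOMER VALUES ('C1', 'S1'), ('C2', 'S2');
INSERT INTO M_SALESMAN VALUES ('S1', 'Alice', '00'), ('S1', 'Alice', '01'), ('S2', 'Bob', '00');
""")

base_sql = """
SELECT c.CUSTOMER_CODE, ms.SALESMAN_NAME, SUM(c.REVENUE_AMT)
FROM C_REVENUE_ANALYSIS c
JOIN M_CUSTOMER mc ON c.CUSTOMER_CODE = mc.CUSTOMER_CODE
JOIN M_SALESMAN ms ON mc.SALESMAN_CODE = ms.SALESMAN_CODE {comp_filter}
WHERE c.REVENUE_DATE BETWEEN :from_date AND :to_date
GROUP BY c.CUSTOMER_CODE, ms.SALESMAN_NAME
"""
params = {"from_date": "2020-01-01", "to_date": "2020-01-31"}

# Without the filter, every revenue row is matched to both S1 rows and counted twice.
unfiltered = conn.execute(base_sql.format(comp_filter=""), params).fetchall()
# Restricting the salesman join to one company code leaves one match per revenue row.
filtered = conn.execute(
    base_sql.format(comp_filter="AND ms.COMP_CODE = '00'"), params
).fetchall()
print(unfiltered, filtered)
```

The point is that DISTINCT does not help once the rows have already been multiplied inside the SUM; the duplicate salesman rows have to be excluded in the join condition itself.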
I am trying to perform a join in impala as such:
Select * from Table1 t1
left outer join Table2 t2 on (t1.column1 = t2.column1 OR t1.column2 = t2.column2)
But I get the following error:
NotImplementedException: Join with 't2' requires at least one conjunctive equality predicate.
To perform a Cartesian product between two tables, use a CROSS JOIN.
I have tried using a CROSS JOIN but it does not work either.
Is it possible to use OR conditions in a join in Impala? Is there a workaround?
I have tried it using an AND condition and it runs successfully.
Any help or advice is appreciated.
As suggested on the Impala JIRA, you can try rewriting your query with a UNION ALL clause. Unfortunately you'll have to do the deduplication following the UNION ALL manually.
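One possible shape for such a rewrite is sketched below. It uses plain UNION (which deduplicates) instead of UNION ALL plus a manual dedup step, and it adds a third branch so that unmatched Table1 rows are kept, as the LEFT OUTER JOIN would keep them. The id columns and sample rows are invented for illustration, and the sketch is only run against in-memory SQLite through Python, which accepts the OR join and so lets the two results be compared. Whether Impala accepts every part of it (in particular the NOT EXISTS subqueries) is an assumption to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (id INTEGER, column1 TEXT, column2 TEXT);
CREATE TABLE Table2 (id INTEGER, column1 TEXT, column2 TEXT);
-- Hypothetical rows: 1 matches on both columns, 2 on column1, 3 on column2, 4 on neither.
INSERT INTO Table1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z'), (4, 'd', 'w');
INSERT INTO Table2 VALUES (10, 'a', 'x'), (20, 'b', 'q'), (30, 'r', 'z');
""")

# Reference result: the OR join that Impala rejects.
or_join_sql = """
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM Table1 t1
LEFT OUTER JOIN Table2 t2
  ON (t1.column1 = t2.column1 OR t1.column2 = t2.column2)
ORDER BY 1, 2
"""

# Rewrite: one equality join per OR branch, plus the unmatched Table1 rows.
union_sql = """
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM Table1 t1 JOIN Table2 t2 ON t1.column1 = t2.column1
UNION
SELECT t1.id, t2.id
FROM Table1 t1 JOIN Table2 t2 ON t1.column2 = t2.column2
UNION
SELECT t1.id, NULL
FROM Table1 t1
WHERE NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t1.column1 = t2.column1)
  AND NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t1.column2 = t2.column2)
ORDER BY 1, 2
"""
or_rows = conn.execute(or_join_sql).fetchall()
union_rows = conn.execute(union_sql).fetchall()
print(union_rows)
```

Note that UNION also collapses rows that are identical in every selected column, so this only matches the OR join when the selected columns identify each pair of rows uniquely (here, through the two id columns).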
I'm working on a project that uses SQLAnywhere and I found this Query:
update MY_TABLE table1
set table1.column1 = table3.id
from MY_TABLE table2, MY_OTHER_TABLE table3
where table2.some_col = table3.some_col and table2.other_col is null;
The problem is that table1, which is updated, and table2/table3, which are joined, do not have any link or constraint between them. Table1 is completely independent of the other two.
So as far as I can understand it, if the condition in the last line is met for at least one row, then ALL rows of table1 will be updated because then the join-statement is always true.
Am I right or am I missing something?
Short answer: Yes. I agree ;)
Long answer: It really looks like there is no connection between table 1 and tables 2 and 3. So based on your input above, I'd expect the mentioned behaviour.
Also I'd remove the implicit JOIN here, as it might cause confusion.
I need to generate such SQL using Propel build criteria:
"SELECT *
FROM `table1`
LEFT JOIN table2 ON ( table1.OBJECT_ID = table2.ID )
LEFT JOIN table3 ON ( table1.OBJECT_ID = table3.ID )
LEFT JOIN table4 ON ( table4.USER_ID = table2.ID
OR table4.USER_ID = table3.AUTHOR_ID )"
Is it possible to make a join with an OR condition? Or is there maybe some other way?
Propel 1.5
Table1Query::create()
->leftJoinTable2()
->leftJoinTable3()
->useTable2Query()
->leftJoinTable4()
->endUse()
->condition('cond1', Table4::USER_ID . ' = ' . Table2::ID)
->condition('cond2', Table4::USER_ID . ' = ' . Table3::AUTHOR_ID)
->combine(array('cond1', 'cond2'), Criteria::LOGICAL_OR, 'onClause')
->setJoinCondition('Table4', 'onClause')
->find();
useTable2Query() is necessary because your information seems to imply that Table4 is related to Table2 and not to Table1, and so joining Table4 directly to Table1 will result in a series of fatal Propel errors. The "use" functionality bridges that relationship.
The first two joins (table2, table3) are easy, if I recall correctly. Just make table1.OBJECT_ID non-required in your schema, and the left join will be used automatically.
Not immediately sure about the OR join. If you get stuck, one way to do it is to use the above in a raw query, and then "hydrate" objects from the resultset. Another way (very good for complex queries that are a pain to express in an ORM) is to create a database view for the above, and then add a new table in your schema for the view. It's cheating a bit for sure, but for some really complex master-detail things I did in a large symfony project, it was great - and it made query debugging really easy as well.