Snowflake query with column alias in join failing

This query works in one account but throws an error (below) in another. Is there a parameter that affects this behaviour? Both accounts have the same Snowflake version (6.41.2).
Error Message
with cte_one as (
select 'aaa' a
,'bbb' b
)
select t1.a || t1.b as c
from cte_one t1
join cte_one t2 on c = t2.a || t2.b
;

This will work.
with cte_one as (
select 'aaa' a
,'bbb' b
), t3 as
(
select t1.a || t1.b as c
from cte_one t1
)
select * from t3
join cte_one t2 on c = t2.a || t2.b
;
   C    |  A  |  B
--------+-----+-----
 aaabbb | aaa | bbb
In the original query, the alias C does not exist yet when the compiler resolves the join predicate. As humans we can read what the original SQL is trying to do and it makes sense; the compiler sees it differently.
Out of curiosity I went to SQL Fiddle and tried the original SQL on Oracle, Postgres, and MS SQL Server. They all reported errors saying C does not exist.
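As a rough, engine-neutral illustration of the two portable ways around this, here is a sketch run against SQLite through Python's sqlite3 module. SQLite stands in for Snowflake purely as an assumption for the demo; the SQL itself is the workaround from above plus a variant that repeats the expression instead of using the alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Workaround 1: compute the alias in its own CTE so it exists before the join.
via_cte = conn.execute("""
    with cte_one as (
        select 'aaa' a, 'bbb' b
    ), t3 as (
        select t1.a || t1.b as c
        from cte_one t1
    )
    select t3.c, t2.a, t2.b
    from t3
    join cte_one t2 on t3.c = t2.a || t2.b
""").fetchall()

# Workaround 2: repeat the expression in the join predicate instead of the alias.
via_expr = conn.execute("""
    with cte_one as (
        select 'aaa' a, 'bbb' b
    )
    select t1.a || t1.b as c
    from cte_one t1
    join cte_one t2 on t1.a || t1.b = t2.a || t2.b
""").fetchall()

print(via_cte)
print(via_expr)
```

Neither form depends on alias resolution inside the join predicate, so both should behave the same regardless of how an account is configured.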

The provided code sample works as-is:
Environment:
SELECT CURRENT_REGION(), CURRENT_VERSION();
-- AZURE_WESTEUROPE 6.41.2
Query profile:
explain using tabular
with cte_one as (
select 'aaa' a ,'bbb' b union all select 'x', 'y'
)
select t1.a || t1.b as c, 1
from cte_one t1
join cte_one t2
on c = t2.a || t2.b;
Output:

There is a feature for resolving aliases from the select list in the join predicate. The rollout of this feature has not yet been completed for all accounts. You should engage Snowflake Support and ask them to fix the configuration discrepancies on your accounts.
I would suggest disabling it on both accounts and using Greg's workaround.

Interesting mysql query

I have 2 different tables.
My goal is to find people who use the same ip address with different names.
Table 1 - logs
Fields: member_id, ip_adress
Table 2 - members
Fields: id, name, last_name
You can use GROUP_CONCAT for that:
SELECT
ip_adress, GROUP_CONCAT(name)
FROM logs
LEFT JOIN
members ON logs.member_id = members.id
GROUP BY
ip_adress
In MySQL 8.x you can use ROW_NUMBER() to identify which IP addresses have multiple members.
For example:
select id, name, last_name
from (
select m.*,
row_number() over(partition by l.ip_address
order by m.name, m.last_name) as rn
from members m
join logs l on l.member_id = m.id
) x
where rn = 2
EDIT FOR MYSQL 5.7
Since MySQL 5.x doesn't have window functions, you can do:
select m.*
from members m
join logs l on l.member_id = m.id
where l.ip_address in (
select l.ip_address
from members m
join logs l on l.member_id = m.id
group by l.ip_address
having min(m.name) <> max(m.name) or min(m.last_name) <> max(m.last_name)
)
See Running Example at DB Fiddle.
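For anyone who wants to try the MIN/MAX variant without a MySQL server, here is a small sketch using Python's sqlite3 module. The sample rows are invented for illustration, and the column is spelled ip_address as in the answer above (the question spells it ip_adress).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table members (id integer primary key, name text, last_name text);
    create table logs (member_id integer, ip_address text);
    insert into members values (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Doe');
    insert into logs values (1, '10.0.0.1'), (2, '10.0.0.1'), (3, '10.0.0.2');
""")

# Keep only members whose IP address is shared by at least two different names.
shared = conn.execute("""
    select distinct m.id, m.name, m.last_name, l.ip_address
    from members m
    join logs l on l.member_id = m.id
    where l.ip_address in (
        select l2.ip_address
        from members m2
        join logs l2 on l2.member_id = m2.id
        group by l2.ip_address
        having min(m2.name) <> max(m2.name)
            or min(m2.last_name) <> max(m2.last_name)
    )
    order by m.id
""").fetchall()

print(shared)
```

Ann and Bob share 10.0.0.1 and are returned; Cy is alone on 10.0.0.2 and is filtered out.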

Find records with ID in array of IDS and keep the order of records matching that of IDs [duplicate]

I have a simple SQL query in PostgreSQL 8.3 that grabs a bunch of comments. I provide a sorted list of values to the IN construct in the WHERE clause:
SELECT * FROM comments WHERE (comments.id IN (1,3,2,4));
This returns comments in an arbitrary order, which in my case happens to be ids like 1,2,3,4.
I want the resulting rows sorted like the list in the IN construct: (1,3,2,4).
How to achieve that?
You can do it quite easily with VALUES (), () (introduced in PostgreSQL 8.2).
Syntax will be like this:
select c.*
from comments c
join (
values
(1,1),
(3,2),
(2,3),
(4,4)
) as x (id, ordering) on c.id = x.id
order by x.ordering
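If the id list comes from application code, the (id, ordering) pairs can be generated rather than typed by hand. A minimal sketch, assuming a Python client and using SQLite as a stand-in database; the helper name fetch_in_order and the body column are made up for the example. SQLite does not accept a column list on a derived-table alias, so the pairs go into a CTE instead of the inline join.

```python
import sqlite3

def fetch_in_order(conn, ids):
    """Return comments rows whose id is in `ids`, in the order of `ids`."""
    if not ids:
        return []
    # One "(?, ?)" pair per id: the id itself and its position in the list.
    pairs = ", ".join("(?, ?)" for _ in ids)
    params = [value for pos, id_ in enumerate(ids, start=1) for value in (id_, pos)]
    sql = f"""
        with x (id, ordering) as (values {pairs})
        select c.id, c.body
        from comments c
        join x on c.id = x.id
        order by x.ordering
    """
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("create table comments (id integer primary key, body text)")
conn.executemany("insert into comments values (?, ?)",
                 [(i, f"comment {i}") for i in range(1, 6)])

rows = fetch_in_order(conn, [1, 3, 2, 4])
print(rows)
```

The ids are passed as bound parameters, so the list is only written once and never spliced into the SQL text.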
In Postgres 9.4 or later, this is simplest and fastest:
SELECT c.*
FROM comments c
JOIN unnest('{1,3,2,4}'::int[]) WITH ORDINALITY t(id, ord) USING (id)
ORDER BY t.ord;
WITH ORDINALITY was introduced in Postgres 9.4.
No need for a subquery, we can use the set-returning function like a table directly. (A.k.a. "table-function".)
A string literal to hand in the array instead of an ARRAY constructor may be easier to implement with some clients.
For convenience (optionally), copy the column name we are joining to ("id" in the example), so we can join with a short USING clause to only get a single instance of the join column in the result.
Works with any input type. If your key column is of type text, provide something like '{foo,bar,baz}'::text[].
Detailed explanation:
PostgreSQL unnest() with element number
Because this is hard to find and deserves to be better known: in MySQL this can be done much more simply, though I don't know whether it works in other SQL dialects.
SELECT * FROM `comments`
WHERE `comments`.`id` IN ('12','5','3','17')
ORDER BY FIELD(`comments`.`id`,'12','5','3','17')
With Postgres 9.4 this can be done a bit shorter:
select c.*
from comments c
join (
select *
from unnest(array[43,47,42]) with ordinality
) as x (id, ordering) on c.id = x.id
order by x.ordering;
Or a bit more compact without a derived table:
select c.*
from comments c
join unnest(array[43,47,42]) with ordinality as x (id, ordering)
on c.id = x.id
order by x.ordering
Removing the need to manually assign/maintain a position to each value.
With Postgres 9.5 or later this can be done using array_position():
with x (id_list) as (
values (array[42,48,43])
)
select c.*
from comments c, x
where id = any (x.id_list)
order by array_position(x.id_list, c.id);
The CTE is used so that the list of values only needs to be specified once. If that is not important this can also be written as:
select c.*
from comments c
where id in (42,48,43)
order by array_position(array[42,48,43], c.id);
I think this way is better:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY id=1 DESC, id=3 DESC, id=2 DESC, id=4 DESC
Another way to do it in Postgres would be to use the idx function.
SELECT *
FROM comments
ORDER BY idx(array[1,3,2,4], comments.id)
Don't forget to create the idx function first, as described here: http://wiki.postgresql.org/wiki/Array_Index
In PostgreSQL:
select *
from comments
where id in (1,3,2,4)
order by position(id::text in '1,3,2,4')
On researching this some more I found this solution:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY CASE "comments"."id"
WHEN 1 THEN 1
WHEN 3 THEN 2
WHEN 2 THEN 3
WHEN 4 THEN 4
END
However this seems rather verbose and might have performance issues with large datasets.
Can anyone comment on these issues?
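On the verbosity point: if the query is built by application code anyway, the CASE expression can be generated from the id list. A small sketch in Python; the helper name order_by_case is hypothetical, and it only builds the clause text.

```python
def order_by_case(column, ids):
    """Build an ORDER BY CASE clause that ranks rows in the order of `ids`."""
    # int() coerces each id to an integer (raising on non-numeric text),
    # so no raw text reaches the SQL.
    whens = " ".join(
        f"WHEN {int(id_)} THEN {pos}" for pos, id_ in enumerate(ids, start=1)
    )
    return f"ORDER BY CASE {column} {whens} END"

clause = order_by_case('"comments"."id"', [1, 3, 2, 4])
print(clause)
```

This only hides the verbosity from the person writing the query; the statement sent to the server is just as long.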
To do this, I think you should probably have an additional "ORDER" table which defines the mapping of IDs to order (effectively doing what your response to your own question said), which you can then use as an additional column on your select which you can then sort on.
In that way, you explicitly describe the ordering you desire in the database, where it should be.
Without a SEQUENCE; requires 8.4 or later (window functions and built-in unnest):
select * from comments c
join
(
select id, row_number() over() as id_sorter
from (select unnest(ARRAY[1,3,2,4]) as id) as y
) x on x.id = c.id
order by x.id_sorter
SELECT * FROM "comments" JOIN (
SELECT 1 as "id",1 as "order" UNION ALL
SELECT 3,2 UNION ALL SELECT 2,3 UNION ALL SELECT 4,4
) j ON "comments"."id" = j."id" ORDER BY j."order"
or if you prefer evil over good:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY POSITION(',' || "comments"."id"::text || ',' IN ',1,3,2,4,')
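One caveat for the string-position variants: the comma delimiters around both the id and the list matter as soon as ids have more than one digit. The sketch below imitates the substring search in Python, using the id list from the MySQL example as sample data. Note that Python's str.find is 0-based and returns -1 for no match, whereas SQL POSITION is 1-based and returns 0.

```python
id_list = [12, 5, 3, 17]
plain = ",".join(str(i) for i in id_list)  # "12,5,3,17"
wrapped = "," + plain + ","                # ",12,5,3,17,"

# Plain substring search: id 2 is not in the list, yet it is found inside "12".
false_hit = plain.find("2")

# Delimited search: only whole ids can match.
no_hit = wrapped.find(",2,")

# Positions of the real ids grow in list order, so they can serve as a sort key.
positions = [wrapped.find(f",{i},") for i in id_list]

print(false_hit, no_hit, positions)
```

With single-digit ids the undelimited form happens to work, which is why the problem is easy to miss.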
And here's another solution that works and uses a constant table (http://www.postgresql.org/docs/8.3/interactive/sql-values.html):
SELECT * FROM comments AS c,
(VALUES (1,1),(3,2),(2,3),(4,4) ) AS t (ord_id,ord)
WHERE (c.id IN (1,3,2,4)) AND (c.id = t.ord_id)
ORDER BY ord
But again I'm not sure that this is performant.
I've got a bunch of answers now. Can I get some voting and comments so I know which is the winner!
Thanks All :-)
create sequence serial start 1;
select * from comments c
join (select unnest(ARRAY[1,3,2,4]) as id, nextval('serial') as id_sorter) x
on x.id = c.id
order by x.id_sorter;
drop sequence serial;
[EDIT]
unnest is not yet built in to 8.3, but you can create one yourself (the beauty of any*):
create function unnest(anyarray) returns setof anyelement
language sql as
$$
select $1[i] from generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
That function works with any element type:
select unnest(array['John','Paul','George','Ringo']) as beatle
select unnest(array[1,3,2,4]) as id
Slight improvement over the version that uses a sequence I think:
CREATE OR REPLACE FUNCTION in_sort(anyarray, out id anyelement, out ordinal int)
LANGUAGE SQL AS
$$
SELECT $1[i], i FROM generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
SELECT
*
FROM
comments c
INNER JOIN (SELECT * FROM in_sort(ARRAY[1,3,2,4])) AS in_sort
USING (id)
ORDER BY in_sort.ordinal;
select * from comments where comments.id in
(select unnest(ids) from bbs where id=19795)
order by array_position((select ids from bbs where id=19795),comments.id)
Here, bbs is the main table, which has a column called ids; ids is an array that stores the comments.id values.
Tested on PostgreSQL 9.6.
Let's get a visual impression of what was already said. For example, you have a table with some tasks:
SELECT a.id,a.status,a.description FROM minicloud_tasks as a ORDER BY random();
 id |   status   |   description
----+------------+------------------
  4 | processing | work on postgres
  6 | deleted    | need some rest
  3 | pending    | garden party
  5 | completed  | work on html
And you want to order the list of tasks by its status.
The status is a list of string values:
(processing, pending, completed, deleted)
The trick is to give each status value an integer and order the list numerically:
SELECT a.id,a.status,a.description FROM minicloud_tasks AS a
JOIN (
VALUES ('processing', 1), ('pending', 2), ('completed', 3), ('deleted', 4)
) AS b (status, id) ON (a.status = b.status)
ORDER BY b.id ASC;
Which leads to:
 id |   status   |   description
----+------------+------------------
  4 | processing | work on postgres
  3 | pending    | garden party
  5 | completed  | work on html
  6 | deleted    | need some rest
Credit: @user80168
I agree with all the other posters who say "don't do that" or "SQL isn't good at that". If you want to sort by some facet of comments, then add another integer column to one of your tables to hold your sort criteria and sort by that value, e.g. "ORDER BY comments.sort DESC". If you want to sort these in a different order every time, then SQL won't be for you in this case.

Error in Hive Query while joining tables

I cannot get past the equality check with the Hive query below.
I have 3 tables and I want to join them. I am trying the query below, but I get this error:
FAILED: Error in semantic analysis: Line 3:40 Both left and right aliases encountered in JOIN 'visit_date'
select t1.*, t99.* from table1 t1 JOIN
(select v3.*, t3.* from table2 v3 JOIN table3 t3 ON
( v3.AS_upc= t3.upc_no AND v3.start_dt <= t3.visit_date AND v3.end_dt >= t3.visit_date AND v3.adv_price <= t3.comp_price ) ) t99 ON
(t1.comp_store_id = t99.cpnumber AND t1.AS_store_nbr = t99.store_no);
EDITED based on help from FuzzyTree:
1st:
We tried rewriting the query above using BETWEEN in a WHERE clause, but it returned no rows.
When we removed the date BETWEEN condition, we got some output based on "v3.adv_price <= t3.comp_price", but without the date filter:
select t1.*, t99.* from table1 t1 JOIN
(select v3.*, t3.* from table2 v3 JOIN table3 t3 on (v3.AS_upc= t3.upc_no)
where v3.adv_price <= t3.comp_price
) t99 ON
(t1.comp_store_id = t99.cpnumber AND t1.AS_store_nbr = t99.store_no);
2nd:
Next we tried filtering on only one date:
select t1.*, t99.* from table1 t1 JOIN
(select v3.*, t3.* from table2 v3 JOIN table3 t3 on (v3.AS_upc= t3.upc_no)
where v3.adv_price <= t3.comp_price and v3.start_dt <= t3.visit_date
) t99 ON
(t1.comp_store_id = t99.cpnumber AND t1.AS_store_nbr = t99.store_no);
So now it shows some results, but if we apply both the start and end date filters, it shows no results.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins
Only equality joins, outer joins, and left semi joins are supported in
Hive. Hive does not support join conditions that are not equality
conditions as it is very difficult to express such conditions as a
map/reduce job.
Try moving your inequalities to the where clause
select t1.*, t99.* from table1 t1 JOIN
(select v3.*, t3.* from table2 v3 JOIN table3 t3 on (v3.AS_upc= t3.upc_no)
where t3.visit_date between v3.start_dt and v3.end_dt
and v3.adv_price <= t3.comp_price
) t99 ON
(t1.comp_store_id = t99.cpnumber AND t1.AS_store_nbr = t99.store_no);
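For an inner join, moving the inequalities from ON to WHERE should not change the result, which can be checked on a small data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for Hive. The table and column names follow the question, the rows are invented, and dates are stored as ISO-8601 strings (which compare correctly as text); that storage format is an assumption, not something the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table2 (AS_upc text, start_dt text, end_dt text, adv_price real);
    create table table3 (upc_no text, visit_date text, comp_price real);
    insert into table2 values ('u1', '2014-01-01', '2014-01-31', 5.0),
                              ('u2', '2014-02-01', '2014-02-28', 9.0);
    insert into table3 values ('u1', '2014-01-15', 6.0),  -- in range, price ok
                              ('u1', '2014-03-01', 6.0),  -- outside date range
                              ('u2', '2014-02-10', 8.0);  -- in range, price too low
""")

# Original shape: every condition inside the ON clause.
in_on = conn.execute("""
    select v3.AS_upc, t3.visit_date
    from table2 v3
    join table3 t3
      on v3.AS_upc = t3.upc_no
     and v3.start_dt <= t3.visit_date
     and v3.end_dt >= t3.visit_date
     and v3.adv_price <= t3.comp_price
""").fetchall()

# Hive-friendly shape: equality in ON, inequalities in WHERE.
in_where = conn.execute("""
    select v3.AS_upc, t3.visit_date
    from table2 v3
    join table3 t3 on v3.AS_upc = t3.upc_no
    where t3.visit_date between v3.start_dt and v3.end_dt
      and v3.adv_price <= t3.comp_price
""").fetchall()

print(in_on, in_where)
```

If the rewritten query returns nothing on real data while each filter works alone, it may be worth checking how the three date columns are stored and compared.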

Can a DB2 WITH statement be used as part of an UPDATE or MERGE?

I need to update some rows in a DB table. Identifying the rows to be updated involves a series of complicated statements, which I managed to boil down to a series of WITH statements. Now that I have the correct data values, I need to update the table.
Since I managed to get these values with a WITH statement, I was hoping to use it in the UPDATE/MERGE. A simplified example follows:
with data1
(
ID_1
)
as
(
Select ID
from ID_TABLE
where ID > 10
)
,
cmedb.data2
(
MIN_ORIGINAL_ID
,OTHER_ID
)
as
(
Select min(ORIGINAL_ID)
,OTHER_ID
from OTHER_ID_TABLE
where OTHER_ID in
(
Select distinct ID_1
From data1
)
group by OTHER_ID
)
select MIN_ORIGINAL_ID
,OTHER_ID
from cmedb.data2
Now I have the two columns of data, I want to use them to update a table. So instead of having the select at the bottom, I've tried all sorts of combinations of merges and updates, including having the WITH statement above the UPDATE/MERGE, or as part of the UPDATE/MERGE statement. The following is what comes closest in my mind to what I want to do:
merge into ID_TABLE as it
using
(
select MIN_ORIGINAL_ID
,OTHER_ID
from cmedb.data2
) AS SEL
ON
(
it.ID = sel.OTHER_ID
)
when matched then
update
set it.ORIGINAL_ID = sel.MIN_ORIGINAL_ID
So it doesn't work. I'm unsure if this is even possible, as I've found no examples on the internet using WITH statements in combination with UPDATE or MERGE. I have examples of WITH statements being used in conjunction with INSERT, so believe it might be possible.
If anyone can help it would be great, and please let me know if I've left out any information that would be useful to solve the problem.
Disclaimer: The example I've provided is a boiled down version of what I'm trying to do, and may not actually make any sense!
As @Andrew White says, you can't use a common table expression in a MERGE statement.
However, you can eliminate the common table expressions with nested subselects. Here is your example select statement, rewritten using nested subselects:
select min_original_id, other_id
from (
select min(original_id), other_id
from other_id_table
where other_id in (
select distinct id_1 from (select id from id_table where id > 10) AS DATA1 (ID_1)
)
group by other_id
) AS T (MIN_ORIGINAL_ID, OTHER_ID);
This is somewhat convoluted (the exact statement could be written better), but I realize that you were just giving a simplified example.
You may be able to rewrite your MERGE statement using nested subselects instead of common table expressions. It is certainly syntactically possible.
For example:
merge into other_id_table x
using (
select min_original_id, other_id
from (
select min(original_id), other_id
from other_id_table
where other_id in (
select distinct id_1 from (select id from id_table where id > 10) AS DATA1 (ID_1)
)
group by other_id
) AS T (MIN_ORIGINAL_ID, OTHER_ID)
) as y
on y.other_id = x.other_id
when matched
then update set other_id = y.min_original_id;
Again, this is convoluted, but it shows you that it is at least possible.
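To convince yourself that the nested form selects the same rows as the WITH form, both can be run side by side on toy data. This sketch uses SQLite via Python's sqlite3 module rather than DB2, and the rows are invented. SQLite does not accept a column list after a derived-table alias, so the columns are named with AS inside the subselects instead of AS DATA1 (ID_1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table id_table (id integer, original_id integer);
    create table other_id_table (original_id integer, other_id integer);
    insert into id_table values (5, null), (11, null), (12, null);
    insert into other_id_table values (100, 11), (90, 11), (70, 12), (60, 5);
""")

# Version with common table expressions, as in the question.
with_cte = conn.execute("""
    with data1 (id_1) as (
        select id from id_table where id > 10
    ),
    data2 (min_original_id, other_id) as (
        select min(original_id), other_id
        from other_id_table
        where other_id in (select distinct id_1 from data1)
        group by other_id
    )
    select min_original_id, other_id from data2 order by other_id
""").fetchall()

# Same logic with each CTE inlined as a nested subselect.
nested = conn.execute("""
    select min_original_id, other_id
    from (
        select min(original_id) as min_original_id, other_id
        from other_id_table
        where other_id in (
            select distinct id_1
            from (select id as id_1 from id_table where id > 10) as data1
        )
        group by other_id
    ) as t
    order by other_id
""").fetchall()

print(with_cte, nested)
```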
A way to use a WITH statement with UPDATE (and INSERT too) is the SELECT FROM UPDATE form:
WITH TEMP_TABLE AS (
SELECT [...]
)
SELECT * FROM FINAL TABLE (
UPDATE TABLE_A SET (COL1, COL2) = (SELECT [...] FROM TEMP_TABLE)
WHERE [...]
);
I'm looking up the grammar now but I am pretty sure the answer is no. At least not in the version of DB2 I last used. Take a peek at the update and merge doc pages for their syntax. Even if you see the fullselect in the syntax you can't use with as that is explicitly separate according to the select doc page.
If you're running DB2 V8 or later, there's an interesting SQL hack here that allows you to UPDATE/INSERT in a query with a WITH statement. For inserts & updates that require a lot of preliminary data prepping, I find this method offers a lot of clarity.
Edit: one correction. Selecting from UPDATE statements was introduced in V9, I believe, so the above will work for inserts on V8 or later and for updates on V9 or later.
Put the CTEs into a view, and select from the view in the merge. You get a clean, readable view that way, and a clean, readable merge.
Another method is to simply substitute your WITH queries and just use subselects.
For example, if you had (and I tried to include a somewhat complex example with some WHERE logic, an aggregate function (MAX) and a GROUP BY, just to show it more real world):
WITH
Q1 AS (
SELECT
A.X,
A.Y,
A.Z,
MAX(A.W) AS W
FROM
TABLEB B
INNER JOIN TABLEA A ON B.X = A.X AND B.Y = A.Y AND B.Z = A.Z
WHERE A.W <= DATE('2013-01-01')
GROUP BY
A.X,
A.Y,
A.Z
),
Q2 AS (
SELECT
A.X,
A.Y,
A.Z,
A.W,
MAX(A.V) AS V
FROM
Q1
INNER JOIN TABLEA A ON Q1.X = A.X AND Q1.Y = A.Y AND Q1.Z = A.Z AND Q1.W = A.W
GROUP BY
A.X,
A.Y,
A.Z,
A.W
)
SELECT
B.U,
A.T
FROM
Q2
INNER JOIN TABLEA A ON Q2.X = A.X AND Q2.Y = A.Y AND Q2.Z = A.Z AND Q2.W = A.W AND Q2.V = A.V
RIGHT OUTER JOIN TABLEB B ON Q2.X = B.X AND Q2.Y = B.Y AND Q2.Z = B.Z
... you could turn this into something appropriate for a MERGE INTO by doing the following:
1. Remove the WITH at the top.
2. Remove the comma from the end of the Q1 block (after the closing parenthesis).
3. Take the Q1 AS from before the opening parenthesis and put it after the closing parenthesis, then move the AS in front of the Q1 (so the block ends with ") AS Q1").
4. Cut this new Q1 block and paste it into the Q2 block after the FROM, replacing the Q1 there with the query in your clipboard. NOTE: leave the other references to Q1 (in the inner join keys) alone, of course.
5. Now you have a bigger Q2 query. Do steps 3 and 4 again, this time replacing the Q2 (after the FROM) in your main select with the bigger Q2 query in your clipboard.
In the end, you'll have a straight SELECT query that looks like this (reformatted to show proper indentation):
SELECT
B.U,
A.T
FROM
(SELECT
A.X,
A.Y,
A.Z,
A.W,
MAX(A.V) AS V
FROM
(SELECT
A.X,
A.Y,
A.Z,
MAX(A.W) AS W
FROM
TABLEB B
INNER JOIN TABLEA A ON B.X = A.X AND B.Y = A.Y AND B.Z = A.Z
WHERE A.W <= DATE('2013-01-01')
GROUP BY
A.X,
A.Y,
A.Z) AS Q1
INNER JOIN TABLEA A ON Q1.X = A.X AND Q1.Y = A.Y AND Q1.Z = A.Z AND Q1.W = A.W
GROUP BY
A.X,
A.Y,
A.Z,
A.W) AS Q2
INNER JOIN TABLEA A ON Q2.X = A.X AND Q2.Y = A.Y AND Q2.Z = A.Z AND Q2.W = A.W AND Q2.V = A.V
RIGHT OUTER JOIN TABLEB B ON Q2.X = B.X AND Q2.Y = B.Y AND Q2.Z = B.Z
I have done this in my own personal experience (just now actually) and it works perfectly.
Good luck.

LINQ to SQL: making a "double IN" query crashes

I need to do the following thing:
var a = from c in DB.Customers
where (from t1 in DB.Table1 where t1.Date >= DateTime.Now
select t1.ID).Contains(c.ID) &&
(from t2 in DB.Table2 where t2.Date >= DateTime.Now
select t2.ID).Contains(c.ID)
select c
It doesn't want to run. I get the following error:
Timeout expired. The timeout period
elapsed prior to completion of the
operation or the server is not
responding.
But when I try to run:
var a = from c in DB.Customers
where (from t1 in DB.Table1 where t1.Date >= DateTime.Now
select t1.ID).Contains(c.ID)
select c
Or:
var a = from c in DB.Customers
where (from t2 in DB.Table2 where t2.Date >= DateTime.Now
select t2.ID).Contains(c.ID)
select c
It works! I'm sure that both IN subqueries return some customer IDs.
In case this is an efficiency issue, it would be a good idea to look at the SQL query that LINQ to SQL produces (in debug mode, place the mouse cursor over a). In any case, you could try rewriting the query using join. Something like this should do the trick:
var a = from c in DB.Customers
join t1 in DB.Table1 on c.ID equals t1.ID
join t2 in DB.Table2 on c.ID equals t2.ID
where t1.Date >= DateTime.Now && t2.Date >= DateTime.Now
select c
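One thing to watch with the join rewrite: if Table1 or Table2 holds several matching rows for the same customer, the join returns that customer once per combination, whereas the Contains (IN) form returns each customer once. The sketch below shows the effect at the SQL level using SQLite through Python's sqlite3 module. The table names mirror the question, the rows and the fixed "now" value are invented, and it says nothing about the exact SQL that LINQ to SQL generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (ID integer primary key);
    create table Table1 (ID integer, Date text);
    create table Table2 (ID integer, Date text);
    insert into Customers values (1), (2);
    insert into Table1 values (1, '2030-01-01'), (1, '2030-02-01'), (2, '2030-01-01');
    insert into Table2 values (1, '2030-01-01');
""")
now = "2025-01-01"  # fixed stand-in for DateTime.Now, to keep the demo repeatable

# IN form: each customer appears at most once.
with_in = conn.execute("""
    select c.ID from Customers c
    where c.ID in (select ID from Table1 where Date >= ?)
      and c.ID in (select ID from Table2 where Date >= ?)
""", (now, now)).fetchall()

# JOIN form: customer 1 matches two Table1 rows, so it appears twice.
with_join = conn.execute("""
    select c.ID from Customers c
    join Table1 t1 on c.ID = t1.ID
    join Table2 t2 on c.ID = t2.ID
    where t1.Date >= ? and t2.Date >= ?
""", (now, now)).fetchall()

# JOIN form with DISTINCT: same rows as the IN form.
with_join_distinct = conn.execute("""
    select distinct c.ID from Customers c
    join Table1 t1 on c.ID = t1.ID
    join Table2 t2 on c.ID = t2.ID
    where t1.Date >= ? and t2.Date >= ?
""", (now, now)).fetchall()

print(with_in, with_join, with_join_distinct)
```

In LINQ the corresponding fix would be to apply Distinct() to the joined query before using it.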
It's not necessarily crashing but rather is likely producing an inefficient query that is timing out. A good thing to do is to run the SQL Server Profiler to see the actual query being emitted in SQL and then to do some analysis on that.
I found the problem. It's in my NEWID() order by method, because I want to get random results. When I remove it, it works fine. How can I use NEWID()?