I am working within Snowflake and some of the IDs in a given table and column are delimited by semicolons. Despite this delimiter the tables should still be joined. Any attempt to join the table is usually met with an error of some sort.
Below I have an example of what I have attempted to do.
table_A table_B
+----------+----------+ +----------+----------+
| some_id | F_Name | | some_id | L_Name |
+----------+----------+ +----------+----------+
| 12345 | John | |12345;4321| Doe |
+----------+----------+ +----------+----------+
Attempted SQL Statement
select table_A.some_id, table_A.F_name, table_B.L_Name
from table_A
left join table_B on table_A.some_id like '%'||table_B.some_id||'%'
The source for this particular problem has shown up here but it doesn't seem to work. It may just not be possible to do joins in this particular way.
https://community.snowflake.com/s/question/0D50Z00008zPLLx/join-with-partial-string-match
Any help is most appreciated.
Your approach is correct but your use of LIKE clause isn't. You are looking for table_A.some_id as a partial part of table_B.some_id
So your query should be :
WITH table_A AS (
SELECT '12345' AS some_id, 'John' AS F_NAME
),
table_B AS (
SELECT '12345;4321' AS some_id, 'Doe' AS L_NAME
)
SELECT
table_A.some_id,
table_A.F_name,
L_Name
FROM table_A
LEFT JOIN table_B
ON table_B.some_id LIKE '%'||table_A.some_id||'%'
;
If you use a CTE to flatten the second table and then filter out duplicates
WITH expand AS (
SELECT
f.seq as seq,
f.index as index
f.value as join_token
t.l_name
FROM table_b t,
lateral split_to_table(t.some_id, ';') f
)
SELECT
a.some_id,
a.f_name,
f.l_name
FROM table_a AS a
LEFT JOIN expand AS f
ON a.some_id = f.join_token
QUALIFY row_number() over (partition by f.seq ORDER BY f.index) == 1
I have a simple SQL query in PostgreSQL 8.3 that grabs a bunch of comments. I provide a sorted list of values to the IN construct in the WHERE clause:
SELECT * FROM comments WHERE (comments.id IN (1,3,2,4));
This returns comments in an arbitrary order which in my happens to be ids like 1,2,3,4.
I want the resulting rows sorted like the list in the IN construct: (1,3,2,4).
How to achieve that?
You can do it quite easily with (introduced in PostgreSQL 8.2) VALUES (), ().
Syntax will be like this:
select c.*
from comments c
join (
values
(1,1),
(3,2),
(2,3),
(4,4)
) as x (id, ordering) on c.id = x.id
order by x.ordering
In Postgres 9.4 or later, this is simplest and fastest:
SELECT c.*
FROM comments c
JOIN unnest('{1,3,2,4}'::int[]) WITH ORDINALITY t(id, ord) USING (id)
ORDER BY t.ord;
WITH ORDINALITY was introduced with in Postgres 9.4.
No need for a subquery, we can use the set-returning function like a table directly. (A.k.a. "table-function".)
A string literal to hand in the array instead of an ARRAY constructor may be easier to implement with some clients.
For convenience (optionally), copy the column name we are joining to ("id" in the example), so we can join with a short USING clause to only get a single instance of the join column in the result.
Works with any input type. If your key column is of type text, provide something like '{foo,bar,baz}'::text[].
Detailed explanation:
PostgreSQL unnest() with element number
Just because it is so difficult to find and it has to be spread: in mySQL this can be done much simpler, but I don't know if it works in other SQL.
SELECT * FROM `comments`
WHERE `comments`.`id` IN ('12','5','3','17')
ORDER BY FIELD(`comments`.`id`,'12','5','3','17')
With Postgres 9.4 this can be done a bit shorter:
select c.*
from comments c
join (
select *
from unnest(array[43,47,42]) with ordinality
) as x (id, ordering) on c.id = x.id
order by x.ordering;
Or a bit more compact without a derived table:
select c.*
from comments c
join unnest(array[43,47,42]) with ordinality as x (id, ordering)
on c.id = x.id
order by x.ordering
Removing the need to manually assign/maintain a position to each value.
With Postgres 9.6 this can be done using array_position():
with x (id_list) as (
values (array[42,48,43])
)
select c.*
from comments c, x
where id = any (x.id_list)
order by array_position(x.id_list, c.id);
The CTE is used so that the list of values only needs to be specified once. If that is not important this can also be written as:
select c.*
from comments c
where id in (42,48,43)
order by array_position(array[42,48,43], c.id);
I think this way is better :
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY id=1 DESC, id=3 DESC, id=2 DESC, id=4 DESC
Another way to do it in Postgres would be to use the idx function.
SELECT *
FROM comments
ORDER BY idx(array[1,3,2,4], comments.id)
Don't forget to create the idx function first, as described here: http://wiki.postgresql.org/wiki/Array_Index
In Postgresql:
select *
from comments
where id in (1,3,2,4)
order by position(id::text in '1,3,2,4')
On researching this some more I found this solution:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY CASE "comments"."id"
WHEN 1 THEN 1
WHEN 3 THEN 2
WHEN 2 THEN 3
WHEN 4 THEN 4
END
However this seems rather verbose and might have performance issues with large datasets.
Can anyone comment on these issues?
To do this, I think you should probably have an additional "ORDER" table which defines the mapping of IDs to order (effectively doing what your response to your own question said), which you can then use as an additional column on your select which you can then sort on.
In that way, you explicitly describe the ordering you desire in the database, where it should be.
sans SEQUENCE, works only on 8.4:
select * from comments c
join
(
select id, row_number() over() as id_sorter
from (select unnest(ARRAY[1,3,2,4]) as id) as y
) x on x.id = c.id
order by x.id_sorter
SELECT * FROM "comments" JOIN (
SELECT 1 as "id",1 as "order" UNION ALL
SELECT 3,2 UNION ALL SELECT 2,3 UNION ALL SELECT 4,4
) j ON "comments"."id" = j."id" ORDER BY j.ORDER
or if you prefer evil over good:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY POSITION(','+"comments"."id"+',' IN ',1,3,2,4,')
And here's another solution that works and uses a constant table (http://www.postgresql.org/docs/8.3/interactive/sql-values.html):
SELECT * FROM comments AS c,
(VALUES (1,1),(3,2),(2,3),(4,4) ) AS t (ord_id,ord)
WHERE (c.id IN (1,3,2,4)) AND (c.id = t.ord_id)
ORDER BY ord
But again I'm not sure that this is performant.
I've got a bunch of answers now. Can I get some voting and comments so I know which is the winner!
Thanks All :-)
create sequence serial start 1;
select * from comments c
join (select unnest(ARRAY[1,3,2,4]) as id, nextval('serial') as id_sorter) x
on x.id = c.id
order by x.id_sorter;
drop sequence serial;
[EDIT]
unnest is not yet built-in in 8.3, but you can create one yourself(the beauty of any*):
create function unnest(anyarray) returns setof anyelement
language sql as
$$
select $1[i] from generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
that function can work in any type:
select unnest(array['John','Paul','George','Ringo']) as beatle
select unnest(array[1,3,2,4]) as id
Slight improvement over the version that uses a sequence I think:
CREATE OR REPLACE FUNCTION in_sort(anyarray, out id anyelement, out ordinal int)
LANGUAGE SQL AS
$$
SELECT $1[i], i FROM generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
SELECT
*
FROM
comments c
INNER JOIN (SELECT * FROM in_sort(ARRAY[1,3,2,4])) AS in_sort
USING (id)
ORDER BY in_sort.ordinal;
select * from comments where comments.id in
(select unnest(ids) from bbs where id=19795)
order by array_position((select ids from bbs where id=19795),comments.id)
here, [bbs] is the main table that has a field called ids,
and, ids is the array that store the comments.id .
passed in postgresql 9.6
Lets get a visual impression about what was already said. For example you have a table with some tasks:
SELECT a.id,a.status,a.description FROM minicloud_tasks as a ORDER BY random();
id | status | description
----+------------+------------------
4 | processing | work on postgres
6 | deleted | need some rest
3 | pending | garden party
5 | completed | work on html
And you want to order the list of tasks by its status.
The status is a list of string values:
(processing, pending, completed, deleted)
The trick is to give each status value an interger and order the list numerical:
SELECT a.id,a.status,a.description FROM minicloud_tasks AS a
JOIN (
VALUES ('processing', 1), ('pending', 2), ('completed', 3), ('deleted', 4)
) AS b (status, id) ON (a.status = b.status)
ORDER BY b.id ASC;
Which leads to:
id | status | description
----+------------+------------------
4 | processing | work on postgres
3 | pending | garden party
5 | completed | work on html
6 | deleted | need some rest
Credit #user80168
I agree with all other posters that say "don't do that" or "SQL isn't good at that". If you want to sort by some facet of comments then add another integer column to one of your tables to hold your sort criteria and sort by that value. eg "ORDER BY comments.sort DESC " If you want to sort these in a different order every time then... SQL won't be for you in this case.
sorry for the late response.
For a key in table A, there may be 2 or more records present in tables B and C. That is, one another column in these tables will have a date value which would be making the keys unique. So I want to extract the record that has maximum date value. And that's why I am using the max function. I know that the subquery which I have coded should not be included in the ON clause and it would do the filtering before the join statement. So eventually I want to know how to mention the max clause in the query.
Example:
Table A
Key - AAAAA
Table B:
Record 1
Key - AAAAA
Date - 2017-10-01
Record 2
Key - AAAAA
Date - 2017-10-05
I want the only the record AAAAA/2017-10-05 to be selected from the table B
Basically records from table A where A.c3 = 'Y' should be extracted first (assume it gives 500 records)
Then join these 500 records with tables B and C (left outer, to have all the matching records and the non-matching records should have nulls in the columns from the tables B and C)
In tables B and C, if more than 1 record present with different dates, the maximum date field should be extracted.
Hence final output should contain 500 records.
This is all you need for what you describe
SELECT A.A1, A.A2, B.B1, B.B2, C.C1, C.C2
FROM TABLE1 A
LEFT OUTER JOIN TABLE2 B
ON A.A1 = B.B1
LEFT OUTER JOIN TABLE3 C
ON A.A1 = C.C1
WHERE A.C3 = ‘Y’
These lines are causing your problem...basically forcing your outer joins to an inner joins.
AND B.C3 = (SELECT MAX(B3) FROM TABLE2 T1
WHERE T1.B1 = B.B1)
AND C.C3 = (SELECT MAX(C3) FROM TABLE3 T1
WHERE T1.C1 = C.C1)
If there's no match in B or C , then B.C3 and/or C.C3 will be NULL and NULL can't be = to anything (or <> to anything for that matter)
What are you trying to accomplish with the above that you've not included in the question?
Just do it?
SELECT A.A1, A.A2, B.B1, B.B2, C.C1, C.C2
FROM TABLE1 A
LEFT OUTER JOIN TABLE2 B
ON A.A1 = B.B1
LEFT OUTER JOIN TABLE3 C
ON A.A1 = C.C1
WHERE A.C3 = 'Y' and (B.B1 is null or C.B1 is null)
I want to join two tables on a key based on LAG function. My query doesn't work though. I get an error:
Msg 4108, Level 15, State 1, Line 13 Windowed functions can only appear in the SELECT or ORDER BY clauses.
I shall appreciate any suggestion on how to tackle it.
**Table A**
Key
1
2
3
and so on...
**Table B**
MaxKey | Something
3 | A
5 | B
8 | C
**Expected Results**
Key|Something
1 A
2 A
3 A
4 B
5 B
6 C
SELECT
tabA.Key
,tabB.[Something]
,LAG (tabB.MaxKey,1,1) OVER (ORDER BY tabB.MaxKey) AS MinKey
,tabB.[MaxKey]
FROM TableA as tabA
LEFT JOIN TableB as tabB
ON tabA.Key > tabB.MinKey AND tabA.Key <= tabB.MaxKey
I think you can solve this using an outer apply like this:
select * from TableA a
outer apply (
select top 1 something
from TableB b
where b.maxkey >= a.[key]
) oa
Sample SQL Fiddle
Another option is to modify your query to do the lag in a derived table, I believe this might work too:
SELECT
tabA.[Key]
,tabB.[Something]
,MinKey
,tabB.[MaxKey]
FROM TableA as tabA
LEFT JOIN (
SELECT
[Something]
,LAG (MaxKey,1,1) OVER (ORDER BY MaxKey) AS MinKey
,[MaxKey]
FROM TableB) tabB
ON tabA.[key] >= tabB.MinKey AND tabA.[key] <= tabB.MaxKey
ORDER BY tabA.[key]