php vlookup in sql - xml-parsing

I have a table like this in sql
ID NAME SIZE GROUP1 GROUP2 SIZE2
1 casa xl 1 2
2 casa l 1 2
I'd like to obtain a table like this
ID NAME SIZE GROUP1 GROUP2 SIZE2
1 casa xl 1 2 l
2 casa l 1 2 xl
So the values of GROUP1 and GROUP2 identify the IDs of the rows that share the same NAME but have a different SIZE.
How can I do this?

Join the table to itself, matching each row to the other ID in its group (the one that is not the row's own ID):
select
t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE as SIZE2
from
TheTable t
inner join TheTable t2 on t2.ID = case t.GROUP1 when t.ID then t.GROUP2 else t.GROUP1 end
To select from table1 and insert it into table2:
insert into table2
select
t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE
from
table1 t
inner join table1 t2 on t2.ID = case t.GROUP1 when t.ID then t.GROUP2 else t.GROUP1 end
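To try the self-join without a database server, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the column types are assumptions made for illustration; the table name and data come from the question.

```python
import sqlite3

# In-memory SQLite database (an assumption; the question does not name an engine).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TheTable (ID INTEGER PRIMARY KEY, NAME TEXT, SIZE TEXT, "
    "GROUP1 INTEGER, GROUP2 INTEGER)"
)
conn.executemany(
    "INSERT INTO TheTable VALUES (?, ?, ?, ?, ?)",
    [(1, "casa", "xl", 1, 2), (2, "casa", "l", 1, 2)],
)

# Each row is joined to the "other" ID of its group: if GROUP1 is the row's
# own ID, look up GROUP2, otherwise look up GROUP1.
rows = conn.execute(
    """
    SELECT t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE AS SIZE2
    FROM TheTable t
    INNER JOIN TheTable t2
        ON t2.ID = CASE t.GROUP1 WHEN t.ID THEN t.GROUP2 ELSE t.GROUP1 END
    ORDER BY t.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Note that this only works when each group has exactly two members, as in the sample data.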

Related

Error while running hive join query

SELECT A.* , B.* FROM
(SELECT ID,DATE FROM APPLE) A
INNER JOIN
(SELECT ID,MAX(DATE) AS MAXDATE FROM APPLE GROUP BY ID) A1
ON A.ID = A.ID AND A.DATE = A1.MAXDATE
WHERE A.DATE > CURRENT_DATE
LEFT OUTER JOIN (
SELECT ID,NAME FROM BANANA) B
ON A.ID = B.ID
WHERE B.NAME IN ('USA','GBR') LIMIT 10;
Error: Error while compiling statement: FAILED: ParseException line
22:0 missing EOF at 'LEFT' near 'CURRENT_DATE'
(state=42000,code=40000)
Your problem is that you have a WHERE clause in the middle of your SQL statement. You can either move it into the nested query for A or add it to the WHERE clause at the end. You probably also want to move the filter on the B table inside its nested query, because putting it in a WHERE clause at the end of the statement effectively turns the left join into an inner join.
either
SELECT A.* , B.* FROM
(SELECT ID,DATE FROM APPLE WHERE DATE > CURRENT_DATE) A
INNER JOIN
(SELECT ID,MAX(DATE) AS MAXDATE FROM APPLE GROUP BY ID) A1
ON A.ID = A1.ID AND A.DATE = A1.MAXDATE
LEFT OUTER JOIN (
SELECT ID,NAME FROM BANANA WHERE NAME IN ('USA','GBR') ) B
ON A.ID = B.ID
LIMIT 10;
or
SELECT A.* , B.* FROM
(SELECT ID,DATE FROM APPLE) A
INNER JOIN
(SELECT ID,MAX(DATE) AS MAXDATE FROM APPLE GROUP BY ID) A1
ON A.ID = A1.ID AND A.DATE = A1.MAXDATE
LEFT OUTER JOIN (
SELECT ID,NAME FROM BANANA WHERE NAME IN ('USA','GBR') ) B
ON A.ID = B.ID
WHERE A.DATE > CURRENT_DATE
LIMIT 10;
The WHERE clause (A.DATE > CURRENT_DATE) should be inside the first SELECT. Also note that your join condition is A.ID = A.ID instead of A.ID = A1.ID.
SELECT
A.* , B.*
FROM
(SELECT ID,DATE FROM APPLE WHERE DATE > CURRENT_DATE) A
INNER JOIN
(SELECT ID,MAX(DATE) AS MAXDATE FROM APPLE GROUP BY ID) A1
ON
A.ID = A1.ID AND A.DATE = A1.MAXDATE
LEFT OUTER JOIN
(SELECT ID,NAME FROM BANANA) B
ON
A.ID = B.ID
WHERE B.NAME IN ('USA','GBR') LIMIT 10;
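The remark about a trailing WHERE turning the left join into an inner join can be checked on a toy data set. The sketch below uses SQLite through Python's sqlite3 module rather than Hive, and the tiny apple and banana tables with their contents are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature versions of the two tables.
conn.execute("CREATE TABLE apple (id INTEGER)")
conn.execute("CREATE TABLE banana (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO apple VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO banana VALUES (?, ?)", [(1, "USA"), (2, "FRA")])

# Filter in the outer WHERE: rows without a matching banana have b.name NULL,
# fail the IN test, and disappear, so the result equals an inner join.
filtered_outside = conn.execute(
    """
    SELECT a.id, b.name
    FROM apple a
    LEFT OUTER JOIN banana b ON a.id = b.id
    WHERE b.name IN ('USA', 'GBR')
    ORDER BY a.id
    """
).fetchall()

# Filter inside the nested query: every apple row survives, and unmatched
# rows simply carry NULL for name.
filtered_inside = conn.execute(
    """
    SELECT a.id, b.name
    FROM apple a
    LEFT OUTER JOIN (
        SELECT id, name FROM banana WHERE name IN ('USA', 'GBR')
    ) b ON a.id = b.id
    ORDER BY a.id
    """
).fetchall()

print(filtered_outside)
print(filtered_inside)
```

Which of the two behaviours is wanted depends on whether rows without a matching country should still be returned.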

Transpose rows into columns in BigQuery using standard sql [duplicate]

Good morning,
I'm trying to transpose some data in BigQuery. I've looked at similar questions on Stack Overflow, but the suggested way to do this seems to be legacy SQL (using group_concat_unquoted) rather than standard SQL. I would use legacy SQL, but I've had issues with nested data in the past, so I have used standard SQL only since then.
Here's my example. To give some context, I'm trying to map out some customer journeys, which look like this:
uniqueid | page_flag | order_of_pages
A | Collection| 1
A | Product | 2
A | Product | 3
A | Login | 4
A | Delivery | 5
B | Clearance | 1
B | Search | 2
B | Product | 3
C | Search | 1
C | Collection| 2
C | Product | 3
However I'd like to transpose the data so it looks like this:
uniqueid | 1 | 2 | 3 | 4 | 5
A | Collection | Product | Product | Login | Delivery
B | Clearance | Search | Product | NULL | NULL
C | Search | Collection | Product | NULL | NULL
I've tried using multiple left joins, but the query below fails with the error shown after it:
select a.uniqueid,
b.page_flag as page1,
c.page_flag as page2,
d.page_flag as page3,
e.page_flag as page4,
f.page_flag as page5
from
(select distinct uniqueid,
(case when uniqueid is not null then 1 end) as page_hit1,
(case when uniqueid is not null then 2 end) as page_hit2,
(case when uniqueid is not null then 3 end) as page_hit3,
(case when uniqueid is not null then 4 end) as page_hit4,
(case when uniqueid is not null then 5 end) as page_hit5
from `mytable`) a
LEFT JOIN (
SELECT *
from `mytable`) b on a.uniqueid = b.uniqueid
and a.page_hit1 = b.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) c on a.uniqueid = c.uniqueid
and a.page_hit2 = c.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) d on a.uniqueid = d.uniqueid
and a.page_hit3 = d.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) e on a.uniqueid = e.uniqueid
and a.page_hit4 = e.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) f on a.uniqueid = f.uniqueid
and a.page_hit5 = f.order_of_pages
Error: Query exceeded resource limits for tier 1. Tier 13 or higher required.
I've also looked at using the ARRAY functions, but I've never used them before and I'm not sure whether they only transpose the other way around. Any advice would be grand.
Thank you
For BigQuery Standard SQL:
#standardSQL
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
You can test this with the dummy data from your question:
#standardSQL
WITH `mytable` AS (
SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
SELECT 'A', 'Product', 2 UNION ALL
SELECT 'A', 'Product', 3 UNION ALL
SELECT 'A', 'Login', 4 UNION ALL
SELECT 'A', 'Delivery', 5 UNION ALL
SELECT 'B', 'Clearance', 1 UNION ALL
SELECT 'B', 'Search', 2 UNION ALL
SELECT 'B', 'Product', 3 UNION ALL
SELECT 'C', 'Search', 1 UNION ALL
SELECT 'C', 'Collection', 2 UNION ALL
SELECT 'C', 'Product', 3
)
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
The result is:
uniqueid p1 p2 p3 p4 p5
A Collection Product Product Login Delivery
B Clearance Search Product null null
C Search Collection Product null null
Depending on your needs, you can also consider the approach below (not a pivot, though):
#standardSQL
SELECT uniqueid,
STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
Run against the same dummy data as above, the result is:
uniqueid journey
A Collection>Product>Product>Login>Delivery
B Clearance>Search>Product
C Search>Collection>Product
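The pivot above hard-codes five columns. If the number of pages is not known in advance, the column list can be generated before the query is run. Below is a minimal sketch of that idea, assuming SQLite via Python's sqlite3 module in place of BigQuery, and using MAX(CASE WHEN ...) as the portable equivalent of MAX(IF(...)). The generated column names p1..pN follow the answer; everything else about the setup is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (uniqueid TEXT, page_flag TEXT, order_of_pages INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A", "Collection", 1), ("A", "Product", 2), ("A", "Product", 3),
        ("A", "Login", 4), ("A", "Delivery", 5),
        ("B", "Clearance", 1), ("B", "Search", 2), ("B", "Product", 3),
        ("C", "Search", 1), ("C", "Collection", 2), ("C", "Product", 3),
    ],
)

# Find how many pivot columns are needed, then build one conditional
# aggregate per position. The interpolated values are integers produced
# by range(), not user input.
max_page = conn.execute("SELECT MAX(order_of_pages) FROM mytable").fetchone()[0]
columns = ",\n       ".join(
    f"MAX(CASE WHEN order_of_pages = {n} THEN page_flag END) AS p{n}"
    for n in range(1, max_page + 1)
)
query = (
    f"SELECT uniqueid,\n       {columns}\n"
    "FROM mytable\nGROUP BY uniqueid\nORDER BY uniqueid"
)

pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

The same generated SELECT list should carry over to BigQuery with IF(...) in place of CASE, although that has not been verified here.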

How to query a table which has a parent child relation

Situation:
I have a table called "word" which contains a word with the associated translations.
| ID | name | lang_id | parent_id |
|----|----------|---------|-----------|
| 1 | screw | 1 | null |
| 2 | schraube | 2 | 1 |
| 3 | vis | 3 | 1 |
So screw is the main word which has no parent. The other data sets have an association to the parent with the parent_id.
What I want:
I need a query that returns both the word I typed in and its translation in the target language.
I want to get datasets 2 and 3 if I query the word "schraube" from German to French.
I want to get datasets 1 and 3 if I query the word "screw" from English to French.
...
What I tried:
select word.id, word.name, word.lang_id, word.parent_id
from word
left join word w2 on word.parent_id = w2.parent_id
WHERE w2.name = 'screw';
-- and word.lang_id = 2
Unfortunately the result doesn't contain the word I typed. Also this displays all datasets, not only the ones with the specific language.
You can modify the query below to get your answer.
DECLARE @FromLanguageId SMALLINT = 2; --German
DECLARE @ToLanguageId SMALLINT = 3; --French
DECLARE @NAME NVARCHAR(300) = 'schraube';
--DECLARE @FromLanguageId SMALLINT = 1; --English
--DECLARE @ToLanguageId SMALLINT = 3; --French
--DECLARE @NAME NVARCHAR(300) = 'screw';
--Get the matching record
;
WITH ctematch
AS (
--gets the matching record (child or parent)
SELECT match.*
FROM [word] match
WHERE match.NAME LIKE @NAME),
--Join its siblings, parent and children
ctefamilydata
AS (SELECT *
FROM ctematch match
UNION
--Parent
SELECT parent.*
FROM ctematch match
INNER JOIN [word] parent
ON match.[parent_id] = parent.[id]
UNION
--Child
SELECT child.*
FROM ctematch match
INNER JOIN [word] child
ON child.[parent_id] = match.[id]
UNION
--Siblings
SELECT siblings.*
FROM ctematch match
INNER JOIN [word] siblings
ON match.[parent_id] = siblings.[parent_id])
--Filter and get the data
SELECT *
FROM ctefamilydata Cte
WHERE Cte.[lang_id] = @ToLanguageId
OR Cte.[lang_id] = @FromLanguageId
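A shorter variant of the same lookup is sketched below. It is not the query from the answer: it assumes the hierarchy is only one level deep (every translation points directly at its main word), so that COALESCE(parent_id, id) identifies the family a word belongs to. It runs on SQLite through Python's sqlite3 module, and the helper name translate is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word (id INTEGER PRIMARY KEY, name TEXT, "
    "lang_id INTEGER, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO word VALUES (?, ?, ?, ?)",
    [(1, "screw", 1, None), (2, "schraube", 2, 1), (3, "vis", 3, 1)],
)


def translate(name, from_lang, to_lang):
    """Return the typed word and its translation (hypothetical helper)."""
    return conn.execute(
        """
        SELECT w.id, w.name, w.lang_id
        FROM word m
        JOIN word w
          -- same family: the main word's id, or the id itself for a main word
          ON COALESCE(w.parent_id, w.id) = COALESCE(m.parent_id, m.id)
        WHERE m.name = ?
          AND m.lang_id = ?
          AND w.lang_id IN (?, ?)
        ORDER BY w.id
        """,
        (name, from_lang, from_lang, to_lang),
    ).fetchall()


german_to_french = translate("schraube", 2, 3)
english_to_french = translate("screw", 1, 3)
print(german_to_french)
print(english_to_french)
```

With deeper hierarchies this shortcut no longer holds, and a recursive CTE would be needed instead.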

Rewrite subquery with join

How would I go about rewriting the subquery below using a join?
select name
from person p
where exists (select *
from friends r, person p2
where r.name1 = p.name and p2.name = r.name2 and p.address = p2.address)
select distinct -- distinct keeps one row per person, as exists does
p1.name
from
friends r
inner join person p1 on (p1.name=r.name1)
inner join person p2 on (p2.name=r.name2 and p2.address=p1.address)
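One difference between the two forms is worth checking: EXISTS returns each person at most once, while a plain join returns one row per matching friend. The sketch below shows this on invented sample data, using SQLite through Python's sqlite3 module; the names and addresses are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, address TEXT)")
conn.execute("CREATE TABLE friends (name1 TEXT, name2 TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("alice", "addr1"), ("bob", "addr1"), ("carol", "addr1"), ("dave", "addr2")],
)
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [("alice", "bob"), ("alice", "carol"), ("dave", "alice")],
)

# Original form: a person is listed once if any friend shares the address.
with_exists = conn.execute(
    """
    SELECT name FROM person p
    WHERE EXISTS (SELECT * FROM friends r, person p2
                  WHERE r.name1 = p.name AND p2.name = r.name2
                    AND p.address = p2.address)
    ORDER BY name
    """
).fetchall()

join_sql = """
    SELECT {distinct} p1.name
    FROM friends r
    INNER JOIN person p1 ON p1.name = r.name1
    INNER JOIN person p2 ON p2.name = r.name2 AND p2.address = p1.address
    ORDER BY p1.name
"""
# Plain join: alice appears once per friend living at her address.
plain_join = conn.execute(join_sql.format(distinct="")).fetchall()
# Join with DISTINCT: the same rows as the EXISTS form, provided names are unique.
distinct_join = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

print(with_exists, plain_join, distinct_join)
```

If two different people can share a name, DISTINCT on name alone would merge them, so the equivalence assumes name is a key of person.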

Select rows where column A is not unique and column B is unique?

Suppose I have the following table:
MyTable
id INTEGER PRIMARY KEY
column_a TEXT
column_b TEXT
Now I want to return all rows where column_a is not unique, but where column_b is unique. So if I have the following data in the table:
id column_a column_b
1 A x
2 B x
3 A y
4 A x
5 B x
6 C z
I want the SQL statement to return this:
id column_a column_b
1 A x
3 A y
because column_a is the same in both rows but column_b differs. The rows with column_a="B" have the same value in column_b, so they should not be returned. And the row with column_a="C" has a unique column_a, so it shouldn't be returned either. How would I do that?
I've gotten halfway there with the following SQL:
SELECT *
FROM MyTable
JOIN
(
SELECT column_a, column_b
FROM MyTable
GROUP BY column_a
HAVING COUNT(*) >= 2
) TmpTable
ON MyTable.column_a = TmpTable.column_a
WHERE MyTable.column_b != TmpTable.column_b
but that omits the last of the rows that I want to return, so in the above example it would only return
id column_a column_b
1 A x
SELECT MIN(id),
column_a,
column_b
FROM MyTable
WHERE column_a IN (SELECT column_a
FROM MyTable
GROUP BY column_a
HAVING COUNT(DISTINCT column_b) >= 2)
GROUP BY column_a,
column_b
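The query above can be checked against the sample data from the question. This is a minimal sketch using SQLite through Python's sqlite3 module; the alias first_id and the ORDER BY are additions made here so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (id INTEGER PRIMARY KEY, column_a TEXT, column_b TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "A", "x"), (2, "B", "x"), (3, "A", "y"),
     (4, "A", "x"), (5, "B", "x"), (6, "C", "z")],
)

# Inner query: column_a values that occur with at least two different
# column_b values. Outer query: one row (the lowest id) per distinct
# (column_a, column_b) pair among those values.
result = conn.execute(
    """
    SELECT MIN(id) AS first_id, column_a, column_b
    FROM MyTable
    WHERE column_a IN (SELECT column_a
                       FROM MyTable
                       GROUP BY column_a
                       HAVING COUNT(DISTINCT column_b) >= 2)
    GROUP BY column_a, column_b
    ORDER BY first_id
    """
).fetchall()

for row in result:
    print(row)
```

Rows 2 and 5 are excluded because B only ever occurs with x, and row 6 because C occurs once; row 4 is folded into row 1 by MIN(id).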
