[BigQuery] Cannot GROUP BY field references from SELECT list - join

When running a query with GROUP BY in Google BigQuery, it failed with:
Cannot GROUP BY field references from SELECT list alias xxx
I tried many times to work out the rules behind this, but couldn't.
My investigation is below:
a> Create tables and insert values
Create tables FFNR_A and FFNR_B:
CREATE TABLE FFNR_A (A1 INT NOT NULL, A2 INT NOT NULL, A3 INT NOT NULL);
CREATE TABLE FFNR_B (B1 INT NOT NULL, B2 INT NOT NULL, B3 INT NOT NULL, B4 INT NOT NULL);
INSERT INTO FFNR_A VALUES (0, 3, 1);
INSERT INTO FFNR_A VALUES (1, 0, 2);
INSERT INTO FFNR_A VALUES (2, 1, 1);
INSERT INTO FFNR_A VALUES (3, 2, 2);
INSERT INTO FFNR_A VALUES (5, 3, 0);
INSERT INTO FFNR_A VALUES (6, 3, 2);
INSERT INTO FFNR_A VALUES (7, 4, 1);
INSERT INTO FFNR_A VALUES (8, 4, 3);
INSERT INTO FFNR_B VALUES (1, 1, 2, 0);
INSERT INTO FFNR_B VALUES (2, 2, 3, 0);
INSERT INTO FFNR_B VALUES (3, 2, 4, 0);
INSERT INTO FFNR_B VALUES (4, 1, 5, 0);
INSERT INTO FFNR_B VALUES (5, 7, 0, 0);
INSERT INTO FFNR_B VALUES (6, 8, 2, 0);
INSERT INTO FFNR_B VALUES (7, 7, 1, 0);
INSERT INTO FFNR_B VALUES (8, 8, 3, 0);
INSERT INTO FFNR_B VALUES (0, 1, 3, 0);
b> Run Query
-- Cannot GROUP BY field references from SELECT list alias B1 at [3:60]
SELECT A0.`A1`, B1.`B1`,
FROM `xxx`.`FFNR_B` B1, `xxx`.`FFNR_A` A0
WHERE (A0.`A2` = B1.`B1`) AND (A0.`A2` = B1.`B1`)
GROUP BY B1.`B1`, A0.`A1`
LIMIT 2;
-- Works
SELECT A0.`A1`, B1.`B2`,
FROM `xxx`.`FFNR_B` B1, `xxx`.`FFNR_A` A0
WHERE (A0.`A2` = B1.`B1`) AND (A0.`A2` = B1.`B1`)
GROUP BY B1.`B2`, A0.`A1`
LIMIT 2;
-- Replace table alias B1 with A1 and column A1 with A2:
-- still fails when the alias matches a column name
SELECT A0.`A2`, A1.`B1`,
FROM `xxx`.`FFNR_B` A1, `xxx`.`FFNR_A` A0
WHERE (A0.`A1` = A1.`B1`) AND (A0.`A1` = A1.`B1`)
GROUP BY A1.`B1`, A0.`A2`
LIMIT 2;
I couldn't find anything about this in the BigQuery docs.
Can you give me any suggestions about the rules for GROUP BY?
Or is it a bug in BigQuery?
Thanks

Concluding the discussion from the comments:
-- Cannot GROUP BY field references from SELECT list alias B1 at [3:60]
SELECT A0.`A1`, B1.`B1`,
FROM `xxx`.`FFNR_B` B1, `xxx`.`FFNR_A` A0
WHERE (A0.`A2` = B1.`B1`) AND (A0.`A2` = B1.`B1`)
GROUP BY B1.`B1`, A0.`A1`
LIMIT 2;
As @Mikhail Berlyant said, the GROUP BY is confused by B1 being both a table alias and a column name in the output. A table alias should not be the same as a column name used in GROUP BY. To avoid this issue, use table aliases that differ from the column names, or simply use GROUP BY B1, A1 or GROUP BY 1, 2. This is a limitation in BigQuery, not a bug. Refer to the documentation on groupable data types for more information.
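To make the workaround concrete, here is a small runnable sketch using Python's sqlite3. SQLite itself does not enforce BigQuery's restriction, so the point is only the query shape: once the table alias (here `b`) no longer shadows the column `B1`, the same GROUP BY is unambiguous.

```python
import sqlite3

# Minimal subset of the tables from the question.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE FFNR_A (A1 INT NOT NULL, A2 INT NOT NULL, A3 INT NOT NULL);
CREATE TABLE FFNR_B (B1 INT NOT NULL, B2 INT NOT NULL, B3 INT NOT NULL, B4 INT NOT NULL);
INSERT INTO FFNR_A VALUES (0,3,1),(1,0,2),(2,1,1),(3,2,2);
INSERT INTO FFNR_B VALUES (1,1,2,0),(2,2,3,0),(3,2,4,0);
""")
rows = con.execute("""
    SELECT a.A1, b.B1          -- alias 'b' no longer collides with column B1
    FROM FFNR_B b, FFNR_A a
    WHERE a.A2 = b.B1
    GROUP BY b.B1, a.A1
    ORDER BY b.B1
    LIMIT 2
""").fetchall()
print(rows)  # [(2, 1), (3, 2)]
```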


PostgreSQL migration from id(bigint) to id(uuid) while preserving relations

I have many tables and a lot of data, and I would like to change all primary key types from bigint to uuid. I have no idea how to preserve the relations. Here is an example.
I am planning to:
Add new column uuid of type uuid
Rename id column to id_obsolete
Rename uuid to id
Now I somehow need to preserve the relations. For example, if I have a users table with a person_id (bigint) column, it should be migrated to person_id (uuid) while maintaining the relationship. How can I do this without losing any data?
First, let's create sample data:
create table employees
(
id int4,
emp_name varchar,
emp_surname varchar
);
insert into employees (id, emp_name, emp_surname) values (100, 'Tim', 'James');
insert into employees (id, emp_name, emp_surname) values (101, 'Bella', 'Tucker');
insert into employees (id, emp_name, emp_surname) values (102, 'Ryan', 'Metcalfe');
insert into employees (id, emp_name, emp_surname) values (103, 'Dominic', 'King');
create table transactions
(
id int4,
emp_id int4,
tran_date date,
total int4
);
insert into transactions (id, emp_id, tran_date, total) values (1, 100, now(), 120);
insert into transactions (id, emp_id, tran_date, total) values (2, 100, now(), 195);
insert into transactions (id, emp_id, tran_date, total) values (3, 100, now(), 250);
insert into transactions (id, emp_id, tran_date, total) values (4, 100, now(), 50);
insert into transactions (id, emp_id, tran_date, total) values (5, 101, now(), 70);
insert into transactions (id, emp_id, tran_date, total) values (6, 101, now(), 125);
insert into transactions (id, emp_id, tran_date, total) values (7, 102, now(), 600);
insert into transactions (id, emp_id, tran_date, total) values (8, 102, now(), 15);
insert into transactions (id, emp_id, tran_date, total) values (9, 102, now(), 90);
insert into transactions (id, emp_id, tran_date, total) values (10, 103, now(), 10);
insert into transactions (id, emp_id, tran_date, total) values (11, 103, now(), 60);
insert into transactions (id, emp_id, tran_date, total) values (12, 103, now(), 155);
insert into transactions (id, emp_id, tran_date, total) values (13, 103, now(), 30);
create table item_sales
(
id int4,
emp_id int4,
process_date date,
price int4
);
insert into item_sales (id, emp_id, process_date, price) values (1, 100, now(), 5);
insert into item_sales (id, emp_id, process_date, price) values (2, 101, now(), 7);
insert into item_sales (id, emp_id, process_date, price) values (3, 102, now(), 12);
insert into item_sales (id, emp_id, process_date, price) values (4, 101, now(), 5);
insert into item_sales (id, emp_id, process_date, price) values (5, 103, now(), 9);
insert into item_sales (id, emp_id, process_date, price) values (6, 102, now(), 12);
insert into item_sales (id, emp_id, process_date, price) values (7, 100, now(), 9);
insert into item_sales (id, emp_id, process_date, price) values (8, 101, now(), 5);
insert into item_sales (id, emp_id, process_date, price) values (9, 100, now(), 9);
insert into item_sales (id, emp_id, process_date, price) values (10, 103, now(), 1);
insert into item_sales (id, emp_id, process_date, price) values (11, 102, now(), 6);
We need to convert the id field on employees to uuid and update all the emp_id fields on the other tables. First, create the new uuid fields:
ALTER TABLE employees
ADD id_new uuid;
ALTER TABLE item_sales
ADD emp_id_new uuid;
ALTER TABLE transactions
ADD emp_id_new uuid;
Update the new uuid field using the uuid_generate_v4() function:
update employees
set id_new = uuid_generate_v4();
Then update the other tables using a join:
update item_sales sls
set emp_id_new = emp.id_new
from employees emp
where emp.id = sls.emp_id;
update transactions trn
set emp_id_new = emp.id_new
from employees emp
where emp.id = trn.emp_id;
After that, we can drop the old id and emp_id fields and rename the new fields to the old names.
ALTER TABLE employees
DROP COLUMN id;
ALTER TABLE item_sales
DROP COLUMN emp_id;
ALTER TABLE transactions
DROP COLUMN emp_id;
ALTER TABLE employees
RENAME COLUMN id_new TO id;
ALTER TABLE item_sales
RENAME COLUMN emp_id_new TO emp_id;
ALTER TABLE transactions
RENAME COLUMN emp_id_new TO emp_id;
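The whole sequence can be sketched end-to-end with Python's sqlite3 on a cut-down schema. SQLite has no uuid_generate_v4(), so Python's uuid module stands in and the uuids are stored as text; DROP COLUMN requires SQLite 3.35 or newer.

```python
import sqlite3
import uuid

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employees (id int, emp_name text);
CREATE TABLE transactions (id int, emp_id int, total int);
INSERT INTO employees VALUES (100, 'Tim'), (101, 'Bella');
INSERT INTO transactions VALUES (1, 100, 120), (2, 101, 70);

ALTER TABLE employees ADD id_new text;
ALTER TABLE transactions ADD emp_id_new text;
""")
# stand-in for uuid_generate_v4(): one fresh uuid per employee row
for (old_id,) in con.execute("SELECT id FROM employees").fetchall():
    con.execute("UPDATE employees SET id_new = ? WHERE id = ?",
                (str(uuid.uuid4()), old_id))
con.executescript("""
UPDATE transactions SET emp_id_new =
    (SELECT id_new FROM employees WHERE employees.id = transactions.emp_id);
ALTER TABLE employees DROP COLUMN id;
ALTER TABLE transactions DROP COLUMN emp_id;
ALTER TABLE employees RENAME COLUMN id_new TO id;
ALTER TABLE transactions RENAME COLUMN emp_id_new TO emp_id;
""")
# the relationship survives: each transaction still joins to its employee
rows = con.execute("""
    SELECT e.emp_name, t.total FROM transactions t
    JOIN employees e ON e.id = t.emp_id ORDER BY t.total
""").fetchall()
print(rows)  # [('Bella', 70), ('Tim', 120)]
```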

PostgreSQL: Get latest value before date

Let's say I have the following Inventory table.
id item_id stock_amount Date
(1, 1, 10, '2020-01-01T00:00:00')
(2, 1, 9, '2020-01-02T00:00:00')
(3, 1, 8, '2020-01-02T10:00:00')
(4, 3, 11, '2020-01-03T00:00:00')
(5, 3, 13, '2020-01-04T00:00:00')
(6, 4, 7, '2020-01-05T00:00:00')
(7, 2, 12, '2020-01-06T00:00:00')
Basically, for each day I want the sum of stock_amount across each unique item_id, excluding the current day's stock amounts; for each item_id, the latest earlier row should be used. This is to calculate the starting stock on each day. So the result in this case would be:
Date starting_amount
'2020-01-01T00:00:00' 0
'2020-01-02T00:00:00' 10
'2020-01-03T00:00:00' 8
'2020-01-04T00:00:00' 19 -- # -> 11 + 8 (id 5 + id 3)
'2020-01-05T00:00:00' 21 -- # -> 13 + 8
'2020-01-06T00:00:00' 28 -- # -> 7 + 13 + 8
Any help would be greatly appreciated.
Using nested subqueries like this:
select
Date,
coalesce(sum(stock_amount), 0) starting_amount
from
(
select
row_number() over(partition by i1.Date, item_id order by i2.Date desc) i,
i1.Date,
i2.item_id,
i2.stock_amount
from
(select distinct date_trunc('day', Date) as Date from Inventory) i1
left outer join
Inventory i2
on i2.Date < i1.Date
) s
where i = 1
group by Date
order by Date
For each day and item, this sorts the earlier rows by date in descending order and keeps only the first (latest) one; summing those latest values per day gives the starting amount.
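Here is the same query checked against the sample data using Python's sqlite3, with date(...) standing in for Postgres's date_trunc('day', ...):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Inventory (id int, item_id int, stock_amount int, Date text);
INSERT INTO Inventory VALUES
 (1, 1, 10, '2020-01-01T00:00:00'),
 (2, 1,  9, '2020-01-02T00:00:00'),
 (3, 1,  8, '2020-01-02T10:00:00'),
 (4, 3, 11, '2020-01-03T00:00:00'),
 (5, 3, 13, '2020-01-04T00:00:00'),
 (6, 4,  7, '2020-01-05T00:00:00'),
 (7, 2, 12, '2020-01-06T00:00:00');
""")
rows = con.execute("""
SELECT Date, COALESCE(SUM(stock_amount), 0) AS starting_amount
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY i1.Date, i2.item_id
                              ORDER BY i2.Date DESC) AS i,
           i1.Date, i2.item_id, i2.stock_amount
    FROM (SELECT DISTINCT date(Date) AS Date FROM Inventory) i1
    LEFT JOIN Inventory i2 ON i2.Date < i1.Date
) s
WHERE i = 1
GROUP BY Date
ORDER BY Date
""").fetchall()
print(rows)  # matches the expected starting_amount table above
```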

Use Sheet's Array formula to count values in each row

When I apply an array formula for:
=count(D3:AA3)
It looks like this:
=ArrayFormula(if(row(A:A)=1,"Count",Count(D1:D:AA1:AA)))
Too many ":" (colons)?
I could (manually) paste the =count(D3:AA3) ...down every row, but I'd like it to be automated.
Here is a formula to count all the numeric values (COUNT does exactly that) row-wise:
={
"Count";
MMULT(
ARRAYFORMULA(--(ISNUMBER(F2:O))),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
}
You can replace F2:O with the range you have the data in.
Update: Count is in column A:A, sum in column B:B, avg in column C:C, and the single-cell avg (without using the count and sum columns) in column D:D. Cells F2:N contain random data, some numeric, some text (text is ignored).
Here is a formula for the row-wise sum of numeric values:
={
"Sum";
MMULT(
ARRAYFORMULA(IF(ISNUMBER(F2:O), F2:O, 0)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
}
Here is the formula for the row-wise average if you have the count and sum columns:
={
"AVG";
ARRAYFORMULA(IF(A2:A = 0, 0, B2:B / A2:A))
}
And the row-wise average in a single cell, without using the count and sum columns:
={
"AVG one single formula";
ARRAYFORMULA(
IF(
MMULT(
--(ISNUMBER(F2:O)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
) = 0,
0,
MMULT(
IF(ISNUMBER(F2:O), F2:O, 0),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
) / MMULT(
--(ISNUMBER(F2:O)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
)
)
}
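The MMULT trick above is ordinary matrix multiplication: an indicator matrix (1 where a cell is numeric, 0 otherwise) times a column of ones yields per-row counts, and using the values themselves yields per-row sums. A plain-Python sketch of the same idea, with made-up sample rows standing in for F2:O:

```python
# Rows of mixed numeric/text data, like F2:O in the sheet.
rows = [
    [1, "x", 3],
    ["a", "b", "c"],
    [2.5, 4, "n/a"],
]

def is_number(v):
    return isinstance(v, (int, float))

# indicator matrix: the --(ISNUMBER(range)) part
indicator = [[1 if is_number(v) else 0 for v in r] for r in rows]
# column of ones: the SEQUENCE(COLUMNS(range), 1, 1, 0) part
ones = [1] * len(rows[0])

# MMULT(indicator, ones) -> per-row counts
counts = [sum(a * b for a, b in zip(r, ones)) for r in indicator]
# MMULT(IF(ISNUMBER(...), values, 0), ones) -> per-row sums
sums = [sum(v for v in r if is_number(v)) for r in rows]
# guarded division, as in the AVG formula
avgs = [s / c if c else 0 for s, c in zip(sums, counts)]
print(counts, sums, avgs)  # [2, 0, 2] [4, 0, 6.5] [2.0, 0, 3.25]
```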

Attempting a transpose by performing multiple joins of table on subsets of same table in Hive

I'm attempting to perform a transpose on the column date by performing multiple joins of my table data_A on subsets of the same table:
Here's the code to create my test dataset, which contains duplicate records for every value of count:
create table database.data_A (member_id string, x1 int, x2 int, count int, date date);
insert into table database.data_A
select 'A0001',1, 10, 1, '2017-01-01'
union all
select 'A0001',1, 10, 2, '2017-07-01'
union all
select 'A0001',2, 20, 1, '2017-01-01'
union all
select 'A0001',2, 20, 2, '2017-07-01'
union all
select 'B0001',3, 50, 1, '2017-03-01'
union all
select 'C0001',4, 100, 1, '2017-04-01'
union all
select 'D0001',5, 200, 1, '2017-10-01'
union all
select 'D0001',5, 200, 2, '2017-11-01'
union all
select 'D0001',5, 200, 3, '2017-12-01'
union all
select 'D0001',6, 500, 1, '2017-10-01'
union all
select 'D0001',6, 500, 2, '2017-11-01'
union all
select 'D0001',6, 500, 3, '2017-12-01'
union all
select 'D0001',7, 1000, 1, '2017-10-01'
union all
select 'D0001',7, 1000, 2, '2017-11-01'
union all
select 'D0001',7, 1000, 3, '2017-12-01';
I'd like to transpose the data into this:
member_id x1 x2 date1 date2 date3
'A0001', 1, 10, '2017-01-01' '2017-07-01' .
'A0001', 2, 20, '2017-01-01' '2017-07-01' .
'B0001', 3, 50, '2017-03-01' . .
'C0001', 4, 100, '2017-04-01' . .
'D0001', 5, 200, '2017-10-01' '2017-11-01' '2017-12-01'
'D0001', 6, 500, '2017-10-01' '2017-11-01' '2017-12-01'
'D0001', 7, 1000, '2017-10-01' '2017-11-01' '2017-12-01'
My first program (which was not successful):
create table database.data_B as
select a.member_id, a.x1, a.x2, a.date_1, b.date_2, c.date_3
from (select member_id, x1, x2, date as date_1 from database.data_A where count=1) as a
left join
(select member_id, date as date_2 from database.data_A where count=2) as b
on (a.member_id=b.member_id)
left join
(select member_id, date as date_3 from database.data_A where count=3) as c
on (a.member_id=c.member_id);
The query below will do the job, using conditional aggregation:
select
member_id,
x1,
x2,
max(case when count=1 then date else '.' end) as date1,
max(case when count=2 then date else '.' end) as date2,
max(case when count=3 then date else '.' end) as date3
from data_A
group by member_id, x1, x2
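Here is that conditional-aggregation pivot run against a slice of the sample rows using Python's sqlite3 (column names follow the Hive table; in SQLite neither count nor date is a reserved word, so no quoting is needed):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_A (member_id text, x1 int, x2 int, count int, date text)")
con.executemany("INSERT INTO data_A VALUES (?,?,?,?,?)", [
    ('A0001', 1, 10, 1, '2017-01-01'), ('A0001', 1, 10, 2, '2017-07-01'),
    ('B0001', 3, 50, 1, '2017-03-01'),
    ('D0001', 5, 200, 1, '2017-10-01'), ('D0001', 5, 200, 2, '2017-11-01'),
    ('D0001', 5, 200, 3, '2017-12-01'),
])
rows = con.execute("""
    SELECT member_id, x1, x2,
           MAX(CASE WHEN count=1 THEN date ELSE '.' END) AS date1,
           MAX(CASE WHEN count=2 THEN date ELSE '.' END) AS date2,
           MAX(CASE WHEN count=3 THEN date ELSE '.' END) AS date3
    FROM data_A
    GROUP BY member_id, x1, x2
    ORDER BY member_id, x1
""").fetchall()
for r in rows:
    print(r)
```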

Join two tables with SUM and COUNT

I am attempting to join two tables and also get a SUM and a COUNT. I need the SUM of Qty from the history table and the COUNT of SN for each PN and LOC from the rota table.
History table:
create table history (
code int(10) primary key,
PN varchar(10) not null,
LOC varchar(10) not null,
Qty int(10) not null);
insert into history values (1, 'T1', 'AAA', 1);
insert into history values (2, 'A1', 'BBB', 2);
insert into history values (3, 'J1', 'CCC', 3);
insert into history values (4, 'A2', 'AAA', 1);
insert into history values (5, 'J2', 'BBB', 2);
insert into history values (6, 'A3', 'CCC', 3);
insert into history values (7, 'J3', 'AAA', 4);
insert into history values (8, 'T1', 'BBB', 5);
insert into history values (9, 'A1', 'CCC', 1);
insert into history values (10, 'J2', 'AAA', 3);
insert into history values (11, 'J2', 'BBB', 4);
insert into history values (12, 'A1', 'CCC', 3);
insert into history values (13, 'J2', 'AAA', 5);
Rota table
create table rota (
code int(10) primary key,
PN varchar(10) not null,
SN varchar(10) not null,
LOC varchar(10) not null);
insert into rota values (1, 'T1', 't1a', 'AAA');
insert into rota values (2, 'A1', 'a1a', 'BBB');
insert into rota values (3, 'J1', 'j1a', 'CCC');
insert into rota values (4, 'A2', 'a2a', 'AAA');
insert into rota values (5, 'J2', 'j2a', 'BBB');
insert into rota values (6, 'A3', 'a3a', 'CCC');
insert into rota values (7, 'J3', 'j3a', 'AAA');
insert into rota values (8, 'T1', 't1b', 'BBB');
insert into rota values (9, 'A1', 'a1b', 'CCC');
insert into rota values (10, 'J2', 'j2b', 'AAA');
insert into rota values (11, 'J2', 'j2c', 'BBB');
insert into rota values (12, 'A1', 'a1c', 'CCC');
insert into rota values (13, 'J2', 'j2d', 'AAA');
insert into rota values (14, 'J2', 'j2e', 'AAA');
insert into rota values (15, 'J2', 'j2f', 'AAA');
The desired result is the following table
PN LOC SUM(QTY) COUNT(SN)
A1 BBB 2 1
A1 CCC 4 2
A2 AAA 1 1
A3 CCC 3 1
J1 CCC 3 1
J2 AAA 8 4
J2 BBB 6 2
J3 AAA 4 1
T1 AAA 1 1
T1 BBB 5 1
Use sub-queries like so:
select a.pn, a.loc, a.q, b.c from
(select h.pn, h.loc, sum(qty) q from history h group by h.pn, h.loc) a
join
(select r.pn, r.loc, count(sn) c from rota r group by r.pn, r.loc) b
on a.pn = b.pn and a.loc = b.loc
order by a.pn;
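The same two-subquery approach, verified with Python's sqlite3 against a slice of the sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE history (code int PRIMARY KEY, PN text, LOC text, Qty int);
CREATE TABLE rota (code int PRIMARY KEY, PN text, SN text, LOC text);
INSERT INTO history VALUES (2,'A1','BBB',2),(9,'A1','CCC',1),(12,'A1','CCC',3),
                           (10,'J2','AAA',3),(13,'J2','AAA',5);
INSERT INTO rota VALUES (2,'A1','a1a','BBB'),(9,'A1','a1b','CCC'),(12,'A1','a1c','CCC'),
                        (10,'J2','j2b','AAA'),(13,'J2','j2d','AAA'),
                        (14,'J2','j2e','AAA'),(15,'J2','j2f','AAA');
""")
rows = con.execute("""
    SELECT a.pn, a.loc, a.q, b.c FROM
    (SELECT h.pn, h.loc, SUM(Qty) q FROM history h GROUP BY h.pn, h.loc) a
    JOIN
    (SELECT r.pn, r.loc, COUNT(sn) c FROM rota r GROUP BY r.pn, r.loc) b
    ON a.pn = b.pn AND a.loc = b.loc
    ORDER BY a.pn, a.loc
""").fetchall()
print(rows)  # sums from history paired with SN counts from rota
```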
(I tried it on Oracle since I don't have MySQL available right now, but it should work as-is or with minor adaptations.)
