"lateral join" equivalent in KDB? - join

How do you "unpack" an array-valued column in kdb?
I have a table T which includes an array-valued column C,
and want a result in which each row of T is duplicated once for every entry in its column C, with each duplicate containing one of the values from that column.
In PostgreSQL this would be a "lateral join".

Assuming you have the following table:
q)t:([]a:`a`b`c`d;b:(1,();(2;3);4,();(5;6;7)))
q)t
a b
-------
a ,1
b 2 3
c ,4
d 5 6 7
And you want a duplicated row for each value in column b, you can use ungroup (lowercase, since q is case-sensitive) to get:
q)ungroup t
a b
---
a 1
b 2
b 3
c 4
d 5
d 6
d 7

The simplest way to flatten nested columns is with ungroup. It also works when there are multiple nested columns, provided the lists within each row all have the same length.
q)show tab:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;4 5))
a b c
----------
a ,`d ,1
b `e`f 2 3
c `g`h 4 5
q)ungroup tab
a b c
-----
a d 1
b e 2
b f 3
c g 4
c h 5
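To make the "same length in each row" condition concrete, here is a rough Python model of what ungroup does to the table above. It is only an illustration of the behaviour shown in this answer, not how kdb+ implements it; the function name and the `nested` parameter are made up for the sketch.

```python
def ungroup(rows, nested):
    """Expand every column named in `nested` in lockstep, one output row per element."""
    out = []
    for row in rows:
        lengths = {len(row[c]) for c in nested}
        if len(lengths) != 1:
            # Lists in this row differ in length; q reports this as 'length.
            raise ValueError("length")
        for i in range(lengths.pop()):
            # Nested columns contribute their i-th element, other columns are repeated.
            out.append({k: (v[i] if k in nested else v) for k, v in row.items()})
    return out


tab = [
    {"a": "a", "b": ["d"], "c": [1]},
    {"a": "b", "b": ["e", "f"], "c": [2, 3]},
    {"a": "c", "b": ["g", "h"], "c": [4, 5]},
]
flat = ungroup(tab, nested=("b", "c"))
for row in flat:
    print(row)
```

Each input row yields as many output rows as its nested lists have elements, which is why the lists within a row must agree in length.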
The downsides of this approach are that every nested column is ungrouped, and that the command fails if the lists within a row have different lengths:
q)show tab2:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;1#4))
a b c
----------
a ,`d ,1
b `e`f 2 3
c `g`h ,4 / different length lists
q)ungroup tab2
'length
[0] ungroup tab2
^
One possible solution to ungroup by a single column is the following, which takes the table and a column name and duplicates each row once per element of that row's value in the column:
q)f:{[t;c]@[t where count each r;c;:;raze r:t c]} / repeat each row once per element, then overwrite column c with the razed values
q)f[tab2;`c]
a b c
--------
a ,`d 1
b `e`f 2
b `e`f 3
c `g`h 4
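For readers less used to q, the same single-column idea can be sketched in Python: repeat the row once per element of the chosen column and leave every other column (nested or not) untouched. The function name `explode` is just a label for this sketch.

```python
def explode(rows, col):
    """Duplicate each row once per element of row[col]; other columns are copied as-is."""
    out = []
    for row in rows:
        for value in row[col]:
            # Replace only the chosen column with the single element.
            out.append({**row, col: value})
    return out


tab2 = [
    {"a": "a", "b": ["d"], "c": [1]},
    {"a": "b", "b": ["e", "f"], "c": [2, 3]},
    {"a": "c", "b": ["g", "h"], "c": [4]},
]
flat2 = explode(tab2, "c")
for row in flat2:
    print(row)
```

Column b stays nested in the result, matching the output shown for f[tab2;`c].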

Related

Formula to list data in two or more column ranges, excluding blank rows

a
b
c
d
e
f
result
1
q
4
r
q
2
w
5
t
w
3
e
6
e
r
7
r
8
r
t
The column ranges must start at row 3 and run to the end of the column.
The ranges shown in the image are B3:B and D3:D. Please ignore columns A and C.
The result must not contain blank cells.
Try:
=QUERY(FLATTEN(TRANSPOSE(FILTER(A3:D,MOD(COLUMN(A3:D),2)=0))),"where Col1 is not null")

Google sheets - summarize matrix values

I have data like below:
1 2 3 A B C
4 5 6 A B C
7 C
3 4 B C
8 9 C B
1 2 3 C A B
The values 1-9 are assigned to the letters A, B, C, so for instance in the first row A=1, B=2, C=3, etc.
I need to sum all the values above per letter to get totals for A, B and C.
How do I start on this problem?
Final results:
A 7
B 22
C 29
Edit:
https://docs.google.com/spreadsheets/d/1kzOtV9_SE5s7DaA_6YcZq6z08FpfXaIAjukyISew49U/edit?usp=sharing
This should do your summary provided the ranges are the same dimensions.
=QUERY({FLATTEN(Arkusz1!F2:H)\FLATTEN(Arkusz1!B2:D)};"select Col1,SUM(Col2) where Col1<>'' group by Col1 order by SUM(Col2) desc label Col1'Summary'")
I have put it in a new tab called MK_Help.
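As a cross-check of the expected totals, the pairing-and-summing logic can be sketched in Python. The sketch assumes each number pairs with the letter in the same position of its row, reading left to right, which is how the sample data in the question lines up; in the sheet itself the pairing comes from the two ranges having identical dimensions.

```python
from collections import defaultdict

# Each entry is (numbers, letters) for one row of the question's sample data.
rows = [
    ([1, 2, 3], ["A", "B", "C"]),
    ([4, 5, 6], ["A", "B", "C"]),
    ([7], ["C"]),
    ([3, 4], ["B", "C"]),
    ([8, 9], ["C", "B"]),
    ([1, 2, 3], ["C", "A", "B"]),
]

totals = defaultdict(int)
for numbers, letters in rows:
    # Pair the n-th number with the n-th letter, like flattening both ranges side by side.
    for number, letter in zip(numbers, letters):
        totals[letter] += number

summary = sorted(totals.items())
print(summary)  # [('A', 7), ('B', 22), ('C', 29)]
```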

How to get unique values in one column based on unique values in other column

How can I transform this
A P 1
A Q 2
A P 1
B P 1
B Q 2
B R 3
C P 1
C P 1
C Q 2
Into this:
A P 1
A Q 2
B P 1
B Q 2
B R 3
C P 1
C Q 2
More info:
Each value in column A appears with several values in column B, and each column B value has a corresponding value in column C. The same column B value can occur more than once for a given column A value. I want to filter (or use some other tool) so that each column B value, together with its column C value, appears only once per column A value.
There are also other columns which are different values per entry but I don't care about those.
Use a simple UNIQUE:
=UNIQUE(A1:C)
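Applied to a multi-column range, UNIQUE keeps the first occurrence of each distinct row. A small Python sketch of that row-wise de-duplication, using the sample data from the question:

```python
rows = [
    ("A", "P", 1),
    ("A", "Q", 2),
    ("A", "P", 1),
    ("B", "P", 1),
    ("B", "Q", 2),
    ("B", "R", 3),
    ("C", "P", 1),
    ("C", "P", 1),
    ("C", "Q", 2),
]

# dict.fromkeys drops repeated rows while preserving first-seen order.
unique_rows = list(dict.fromkeys(rows))
for row in unique_rows:
    print(*row)
```

Because whole rows are compared, this only collapses duplicates when all three columns match, so any extra columns with differing values would need to be left out of the range.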

calculate difference in counts in 2 columns for corresponding values in other column

I have 4 columns with values (A and C are products that mostly overlap, B and D are counts). What I'd like to do is, for the values that occur in both A and C, calculate the difference between B and D and put the result in E and F.
So for example 5028421938592 count is 6 in column B and count is 2 in column D.
The result would then be in column E: 5028421938592 and column F: 4
5028421928548 3 5028421928548 1
5028421938592 6 3259190205192 7
5028421997131 1 5028421938592 2
5028421995748 4 5028421995748 1
I suggest in E1:
=unique(query({A:B;C:D},"select Col1 where Col1 is not null order by Col1"))
and in F1 and copied down to suit:
=iferror(vlookup(E1,A:B,2,0),0)-iferror(vlookup(E1,C:D,2,0),0)
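The two formulas amount to a lookup with a default of 0 on each side. A Python sketch of the same logic on the sample data, treating the product codes as strings (an assumption made here for illustration):

```python
# Counts from columns A:B and C:D of the sample data.
left = {
    "5028421928548": 3,
    "5028421938592": 6,
    "5028421997131": 1,
    "5028421995748": 4,
}
right = {
    "5028421928548": 1,
    "3259190205192": 7,
    "5028421938592": 2,
    "5028421995748": 1,
}

# Every product from either side, sorted; a missing count is treated as 0.
diff = {p: left.get(p, 0) - right.get(p, 0) for p in sorted(left.keys() | right.keys())}

# Restrict to products present on both sides, as the question asks.
both = {p: d for p, d in diff.items() if p in left and p in right}
print(both)
```

Note that the formulas list every product, including those found on only one side; the `both` dictionary shows the narrower result for products that occur in both columns.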

How to create a listing of unique names from a list

I have a list of names in column A and numbers (1-4) in Column B.
I need a formula to extract the names to Columns D, E, F, etc., where if multiple names have the same number (1-4) each name would appear in its own column.
Have this        Need this result
Column           Column
A     B          C  D     E     F
greg  1          3  Tim
hank  2          2  Hank  Mike
mike  2          1  Greg
tim   3
If C1 has '3', then the following will give you all names that have 3 in column B
=transpose(filter(A:A,B:B=c1))
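The filter-then-transpose step is a group-by on the number. A short Python sketch of the grouping, using the names from the question:

```python
from collections import defaultdict

pairs = [("greg", 1), ("hank", 2), ("mike", 2), ("tim", 3)]

# Collect the names for each number, in the order they appear in the list.
by_number = defaultdict(list)
for name, number in pairs:
    by_number[number].append(name)

# One output row per number, with the names spread across columns.
for number in (3, 2, 1):
    print(number, *by_number[number])
```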
