How to approach joining a single dataset twice to another dataset - join

I'm curious to learn best practices when joining a single dataset to another dataset, twice.
df1:
id1  id2  rank
11   12   1
11   13   2
22   14   1
22   15   2
df2:
id type
11 apple
22 pear
12 peach
13 berry
14 grape
15 banana
Resulting dataset:
id1  type1  id2  type2   rank
11   apple  12   peach   1
11   apple  13   berry   2
22   pear   14   grape   1
22   pear   15   banana  2
In production here are the dataset sizes:
df1: 3B Rows
df2: 120M Rows
My first thought would be to cache df2 and then join it twice but I'm not sure if there is anything else I should do.
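The logic of the double join can be sketched in plain Python. This only illustrates the result shape and is not a recommendation for a specific engine: the question does not say which framework is used, and the names `type_by_id` and `join_twice` are hypothetical. The point it shows is that the smaller dataset is materialized once and reused for both key columns, which is the same idea as caching df2 before joining it twice.

```python
# Sample data from the question (plain Python stand-ins for df1 and df2).
df1 = [
    {"id1": 11, "id2": 12, "rank": 1},
    {"id1": 11, "id2": 13, "rank": 2},
    {"id1": 22, "id2": 14, "rank": 1},
    {"id1": 22, "id2": 15, "rank": 2},
]
df2 = [
    {"id": 11, "type": "apple"},
    {"id": 22, "type": "pear"},
    {"id": 12, "type": "peach"},
    {"id": 13, "type": "berry"},
    {"id": 14, "type": "grape"},
    {"id": 15, "type": "banana"},
]

# Build the lookup side once; both joins reuse it (hypothetical helper name).
type_by_id = {row["id"]: row["type"] for row in df2}


def join_twice(left, lookup):
    """Inner-join `left` to `lookup` on id1, then again on id2."""
    result = []
    for row in left:
        # Inner-join semantics: keep the row only if both keys are found.
        if row["id1"] in lookup and row["id2"] in lookup:
            result.append(
                {
                    "id1": row["id1"],
                    "type1": lookup[row["id1"]],
                    "id2": row["id2"],
                    "type2": lookup[row["id2"]],
                    "rank": row["rank"],
                }
            )
    return result


joined = join_twice(df1, type_by_id)
for row in joined:
    print(row)
```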

Related

Google sheet COUNTIF compare 2 cells in 2 differents columns

I have this kind of data in a Google Sheets spreadsheet:
NAME AGE VALUE1 VALUE2
John 18 1 5
Tyron 22 5 4
May 18 1 6
Lewis 25 8 9
Donald 18 6 7
I want to count the rows where VALUE2 > VALUE1 and AGE = 18.
I tried something like this:
=COUNTIFS(B0:B10;18; B0:B10; D0:D10>C0:C10)
but it doesn't work.
Can someone help?
Try the formula below:
=SUMPRODUCT(B2:B6=18,D2:D6>C2:C6)
FILTER() will also work.
=ArrayFormula(SUM(--(FILTER(A2:A6,B2:B6=18,D2:D6>C2:C6)<>"")))
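As a cross-check of what both formulas compute, here is a minimal Python sketch over the sample rows from the question. It is not part of either answer; the variable names are made up for illustration. Each row contributes 1 only when both conditions hold, which is what multiplying the two true/false columns does.

```python
# Sample rows from the question: (NAME, AGE, VALUE1, VALUE2).
rows = [
    ("John", 18, 1, 5),
    ("Tyron", 22, 5, 4),
    ("May", 18, 1, 6),
    ("Lewis", 25, 8, 9),
    ("Donald", 18, 6, 7),
]

# Count rows where AGE is 18 and VALUE2 is greater than VALUE1.
count = sum(
    1
    for _name, age, value1, value2 in rows
    if age == 18 and value2 > value1
)
print(count)  # 3 (John, May and Donald)
```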

how to count sum for specific year in google sheet

I have the data below in Google Sheets:
jumlah tanggal
1 Rp15.000 15-Apr-2020
2 Rp15.000 15-Mei-2020
3 Rp15.000 15-Jun-2020
4 Rp15.000 15-Jul-2020
5 Rp15.000 18-Agu-2020
6 Rp15.000 15-Sep-2020
7 Rp15.000 20-Okt-2020
8 Rp15.000 18-Nov-2020
9 Rp15.000 12-Des-2020
10 Rp15.000 11-Jan-2021
11 Rp15.000 15-Feb-2021
12 Rp15.000 15-Mar-2021
How can I sum column "jumlah" for the year 2020 only?
[update]
I have already tried some of the answers, but I still get an error.
You could try:
=sumproduct(year(C2:C)=2020, B2:B)
You will need to exclude row 1 as C1 is not a date and year(C1) will give an error.
This formula creates a column of true/false values depending on whether the year is 2020 or not. It then multiplies that column by the values in column B and adds it all up.
Thank you to everyone who tried to help. I tried every solution above, but none of them worked. After searching for several days, I found the solution.
Here's the formula:
=SUMPRODUCT((YEAR(C3:C)=2020)*B3:B)
Here's the link to the site:
https://www.excel-easy.com/examples/sumproduct.html
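The mask-and-multiply idea behind these SUMPRODUCT formulas can be sketched in Python. This is an illustration only: it assumes the "tanggal" column holds real dates (represented here as `datetime.date` values) and the "jumlah" column holds plain numbers, and the variable names are made up.

```python
from datetime import date

# Sample data from the question: (jumlah, tanggal).
rows = [
    (15000, date(2020, 4, 15)),
    (15000, date(2020, 5, 15)),
    (15000, date(2020, 6, 15)),
    (15000, date(2020, 7, 15)),
    (15000, date(2020, 8, 18)),
    (15000, date(2020, 9, 15)),
    (15000, date(2020, 10, 20)),
    (15000, date(2020, 11, 18)),
    (15000, date(2020, 12, 12)),
    (15000, date(2021, 1, 11)),
    (15000, date(2021, 2, 15)),
    (15000, date(2021, 3, 15)),
]

# 1 where the year is 2020, 0 otherwise (the true/false column).
mask = [1 if tanggal.year == 2020 else 0 for _jumlah, tanggal in rows]

# Multiply the mask by the amounts and add everything up.
total_2020 = sum(m * jumlah for m, (jumlah, _tanggal) in zip(mask, rows))
print(total_2020)  # 135000 (nine rows of 15000)
```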

Getting column data and re-arranging it in rows, with a pattern, using formulas

In a workbook with multiple sheets, I have Sheet1 with, for example, the following (the number of rows varies and the data is entered manually):
Sheet1
     A        B       C
1    APPLE    ORANGE  LEMON
2    bravo    chair   mars
3    charlie  table   jupiter
4    alpha    box     venus
5    delta            saturn
6    foxtrot
I would like some help in constructing Sheet2 via formulas so that it rearranges data from Sheet1 as follows
Sheet2 (Desired result)
A B
1 APPLE
2 bravo
3 charlie
4 alpha
5 delta
6 foxtrot
7
8 ORANGE
9 chair
10 table
11 box
12
13 LEMON
14 mars
15 jupiter
16 venus
17 saturn
It probably needs some combination of QUERY(), ARRAYFORMULA(), TRANSPOSE() and/or INDEX(), but I need some help getting started and arranging the data into fewer columns (and more rows) as shown. Please note that the number of rows (or columns) in Sheet1 will keep changing, so Sheet2 needs to adapt to that.
Thank you.
You can try the following formula:
=ArrayFormula(
{FILTER(
FLATTEN(TRANSPOSE(IF(ROW(A:F)=1;A:F;"")));
FLATTEN(TRANSPOSE(A:F))<>"")
\FILTER(
FLATTEN(TRANSPOSE(IF(ROW(A:F)<>1;A:F;"")));
FLATTEN(TRANSPOSE(A:F))<>"")}
)
Use the version above if your locale uses a semicolon as the function argument separator.
If it uses a comma, change it to:
=ArrayFormula(
{FILTER(
FLATTEN(TRANSPOSE(IF(ROW(A:F)=1,A:F,""))),
FLATTEN(TRANSPOSE(A:F))<>"")
,FILTER(
FLATTEN(TRANSPOSE(IF(ROW(A:F)<>1,A:F,""))),
FLATTEN(TRANSPOSE(A:F))<>"")}
)
The formula will run faster if you constrain the rows, for example by using a bounded range such as A1:F100 instead of A:F.
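To make the formula's steps easier to follow, here is a rough Python equivalent. It is a sketch, not a translation the answer provides: the function name `rearrange` and the list-of-lists layout are assumptions. It walks the grid column by column (the effect of FLATTEN over TRANSPOSE), skips blank cells (the FILTER condition), and puts row-1 cells in the first output column and all other cells in the second.

```python
# Sheet1 sample from the question; "" marks an empty cell.
sheet1 = [
    ["APPLE", "ORANGE", "LEMON"],
    ["bravo", "chair", "mars"],
    ["charlie", "table", "jupiter"],
    ["alpha", "box", "venus"],
    ["delta", "", "saturn"],
    ["foxtrot", "", ""],
]


def rearrange(grid):
    """Return (header, value) pairs, reading the grid column by column."""
    out = []
    n_cols = max((len(row) for row in grid), default=0)
    for c in range(n_cols):
        for r, row in enumerate(grid):
            cell = row[c] if c < len(row) else ""
            if cell == "":
                continue  # blank cells are filtered out
            # Row 1 (index 0) goes to the first column, the rest to the second.
            out.append((cell, "") if r == 0 else ("", cell))
    return out


sheet2 = rearrange(sheet1)
for header, value in sheet2:
    print(f"{header:8}{value}")
```

Because the grid is read as given, adding rows or columns to `sheet1` changes the output without any change to the function.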

how to make a new table from an existing table and add new column in spreadsheet

I want to ask about a spreadsheet (Google Sheets) formula. I have data that looks like this:
A B C D
------------------------------------------------
1 | UserId fruitType media PriceStatus
2 | 3 Apple Bag Paid
3 | 7 Banana Bag Paid
4 | 7 Apple Bag Paid
5 | 43 Banana Bag Paid
6 | 43 Apple Bag FREE
7 | 43 Apple Cart Credit
Note:
My data consists of only 2 types of fruit: Apple and Banana
2 types of media: Bag and Cart
3 types of PriceStatus: Paid, Credit, and Free
The column letters (A-D) and row numbers (1-7) above match the spreadsheet. I want to produce a table that looks like this:
UserId Apple Banana
3 1 0
7 1 1
43 2 1
Is it possible to make a new table like this in a spreadsheet? I tried using VLOOKUP but could not get it to work. Can someone help me with this?
Try:
=QUERY(A1:D, "select A,count(A) where A is not null group by A pivot B", 1)
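The group-and-pivot step in that query can be sketched in Python as a count per (UserId, fruitType) pair. This is an illustration with made-up variable names, not the answer's method. Combinations that never occur count as 0 here, which matches the desired table in the question.

```python
from collections import Counter

# Sample rows from the question: (UserId, fruitType, media, PriceStatus).
rows = [
    (3, "Apple", "Bag", "Paid"),
    (7, "Banana", "Bag", "Paid"),
    (7, "Apple", "Bag", "Paid"),
    (43, "Banana", "Bag", "Paid"),
    (43, "Apple", "Bag", "FREE"),
    (43, "Apple", "Cart", "Credit"),
]

# Count rows per (UserId, fruitType): the "group by A pivot B" part.
counts = Counter((user_id, fruit) for user_id, fruit, _media, _status in rows)

fruits = sorted({fruit for _uid, fruit, _media, _status in rows})
user_ids = sorted({uid for uid, _fruit, _media, _status in rows})

# One output row per user: [UserId, count for each fruit]; missing pairs are 0.
table = [[uid] + [counts[(uid, fruit)] for fruit in fruits] for uid in user_ids]

print(["UserId"] + fruits)
for line in table:
    print(line)
```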

Find a missing value or a different one in Excel between 2 couple of 2 columns

I have an Excel table with 4 columns, something like this:
A B C D
11 Id Value Id2 Value2
12 2 10 2 10
13 4 11 4 11
14 1 100 1 10
I'd like to compare each Id and Value with the Id2 and Value2 on the same row to find out whether a value was mistyped for its id.
In this case, Value2 for Id2 "1" (cell D14) is wrong, because in row 14 the value that corresponds to Id 1 is 100. I have to do this for something like 2000 rows!
In cell E2, type this:
=IF(CONCATENATE(A2;B2)=CONCATENATE(C2;D2);"OK";"ERROR")
If it doesn't work, try replacing ; with ,
It should work.
Regards,
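The same row-by-row check can be sketched in Python. This is an illustration, not the answer's formula: the function name `check_row` is made up, and it compares the two fields separately instead of concatenating them. Comparing separately avoids one corner case of joining values without a delimiter, where the pairs (1, 23) and (12, 3) would both become "123" and be reported as equal.

```python
# Sample rows from the question: (Id, Value, Id2, Value2).
rows = [
    (2, 10, 2, 10),
    (4, 11, 4, 11),
    (1, 100, 1, 10),
]


def check_row(row):
    """Return "OK" when (Id, Value) matches (Id2, Value2), else "ERROR"."""
    id1, value1, id2, value2 = row
    return "OK" if (id1, value1) == (id2, value2) else "ERROR"


statuses = [check_row(row) for row in rows]
for row, status in zip(rows, statuses):
    print(row, status)
```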
