PowerBI - Showing all values from one dataset after join - join

I appreciate that this must be a fairly simple issue to overcome; however, I have tried all join types with no success.
My data is structured in two Excel files, one for 2022 and one for 2021. Headings are roughly the same on both:
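ID   | Name  | 2021 Quantity | 2021 Assessment
-----+-------+---------------+----------------
1234 | Name1 | 32            | High
5678 | Name2 | 9             | Low
9112 | Name3 | 1             | Medium
and the same for 2022:
ID   | Name  | 2022 Quantity | 2022 Assessment
-----+-------+---------------+----------------
3456 | Name1 | 14            | Medium
7891 | Name3 | 23            | Medium
1001 | Name4 | 1             | Low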
I can join both sets on the Name field; however, the 2021 file contains some names that are not in the 2022 file, and vice versa. I am interested in the 2022 file as my primary source, and would like to show, in a table, all records along with the 2021 quantity if there is one (if not, show a blank). The output should look something like the below:
ID   | Name  | 2022 Quantity | 2022 Assessment | 2021 Quantity
-----+-------+---------------+-----------------+--------------
1234 | Name1 | 32            | High            | 14
5678 | Name2 | 9             | Low             |
I have experimented with one-to-many and many-to-many joins and various filters; however, every output seems to filter out the records where there is no match.

The first option is to append the tables (in PowerQuery: Home Ribbon/Append Queries). Before that, rename the columns so that both tables use the same names, and add a year column to each table (in PowerQuery: Add Column/Custom Column).
Then, we can just create a pivot table.
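The append-then-pivot idea can be sketched outside Power BI as well. This is only an illustration in plain Python, not what PowerQuery does internally: the quantities are copied from the question's two tables, and the names `rows_2021`, `rows_2022`, `appended` and `pivot` are made up for the example.

```python
# Quantities copied from the question's sample tables; field names are illustrative.
rows_2021 = [
    {"Name": "Name1", "Quantity": 32},
    {"Name": "Name2", "Quantity": 9},
    {"Name": "Name3", "Quantity": 1},
]
rows_2022 = [
    {"Name": "Name1", "Quantity": 14},
    {"Name": "Name3", "Quantity": 23},
    {"Name": "Name4", "Quantity": 1},
]

# Step 1: add a Year column to each table, then append them into one table.
appended = [dict(r, Year=2021) for r in rows_2021] + [
    dict(r, Year=2022) for r in rows_2022
]

# Step 2: pivot, giving one row per Name and one column per Year.
pivot = {}
for row in appended:
    pivot.setdefault(row["Name"], {})[row["Year"]] = row["Quantity"]

for name in sorted(pivot):
    # A year with no record shows up as None (a blank), not as a dropped row.
    print(name, pivot[name].get(2022), pivot[name].get(2021))
```

Because no join is involved, a name that exists in only one year still gets a row in the pivot.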
The second option is to create a table with unique Names and set relationships between that table and the original ones by the name columns.
In PowerQuery:
Right-click on the 2021 table/Reference
Remove all columns except Names
Do the same for the 2022 table
Append those two tables as the new one
Remove duplicates in the new table
In PowerBI
Set relationships using the name columns
Create a pivot table
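The second option amounts to building a small lookup table of distinct names and reading each year's quantity through it. A rough Python sketch of that idea follows, assuming each name occurs at most once per year; `qty_2021`, `qty_2022`, `unique_names` and `report` are hypothetical names, not Power BI objects.

```python
# Quantity per name for each year, copied from the question's sample tables.
qty_2021 = {"Name1": 32, "Name2": 9, "Name3": 1}
qty_2022 = {"Name1": 14, "Name3": 23, "Name4": 1}

# Keep only the names from both tables, append them, and remove duplicates.
unique_names = sorted(set(qty_2021) | set(qty_2022))

# The relationships: every name looks up its quantity in each yearly table.
# dict.get returns None when a name is missing, which plays the role of a blank.
report = [(name, qty_2022.get(name), qty_2021.get(name)) for name in unique_names]

for line in report:
    print(line)
```

Driving the report from the unique-names table is what keeps the rows that have no match in one of the years.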

Related

Explode each row into multiple rows by splitting a column of a given computed range

I was recently tasked with 'exploding' each row in a given range with respect to the split value of one of the columns, i.e. going from
Name    | Interests        | Age
--------+------------------+----
John    | swimming, movies | 31
Mary    | basketball       | 26
Richard | football, music  | 21
to:
Name    | Interest   | Age
--------+------------+----
John    | swimming   | 31
John    | movies     | 31
Mary    | basketball | 26
Richard | football   | 21
Richard | music      | 21
It's a little similar to a Cartesian product, only one of the terms needs to be computed on the basis of the value in the Interests column. I eventually solved it using an Apps Script function, but I'm wondering if it could be easily solved using a regular formula.
Note that the input range in my case was a product of another formula (a QUERY(...), to be exact), so not necessarily contiguous or addressable within the spreadsheet.
Any ideas?
try:
=INDEX(QUERY(SPLIT(FLATTEN(A1:A&"×"&SPLIT(B1:B, ", ", )&"×"&C1:C), "×"),
"where Col3 is not null"))
You can use the custom "UNPIVOT" function found on this sheet. File>Make a Copy to grab the script. Also here on github.
=ARRAYFORMULA(UNPIVOT(A2:A,"V",SPLIT(B2:B,", ",0),"B",C2:C,"V"))
You would then QUERY() the output to eliminate the rows where there was nothing in the second column.
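For comparison, the same "explode" transformation written as a short Python sketch. It is not a translation of either formula; the function name `explode` and the `", "` separator are assumptions made for the example, and the rows are the ones from the question.

```python
rows = [
    ("John", "swimming, movies", 31),
    ("Mary", "basketball", 26),
    ("Richard", "football, music", 21),
]


def explode(rows, sep=", "):
    """Return one output row per interest, repeating the name and age."""
    out = []
    for name, interests, age in rows:
        for interest in interests.split(sep):
            out.append((name, interest, age))
    return out


exploded = explode(rows)
for row in exploded:
    print(row)
```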

How to create a cumulative report based on differences between data updated daily in google sheets?

I am trying to create a report from another report (the source sheet). :)
The source sheet updates daily automatically by inserting new rows with progress on sales on top of the rows completed a day before:
Date  | Product | Units sold
------+---------+-----------
11/15 | A       | 35
11/15 | B       | 12
11/15 | C       | 18
11/14 | A       | 30
11/14 | C       | 11
11/14 | B       | 10
11/13 | F       | 88
11/12 | B       | 7
11/12 | A       | 29
11/12 | C       | 10
11/11 | C       | 8
11/11 | A       | 29
11/11 | B       | 3
The "Units sold" column is cumulative meaning that a newer record on a certain product will show a greater or equal value to a previous record on that specific product.
New products appear in the source sheet when entering the company and they disappear from it when they are sold out, pretty much randomly. (e.g. product "F" that showed up and sold out in the same day)
In the first column of the source report, I found an existing formula that concatenates date and product and is used by other reports.
To solve this, in the results report I put the same concatenation of date and product in column T. Then, in the results column of my new report, I used the following formula: =vlookup(T2,Source!$A2:$C$10000,3,0)-vlookup(T2,Source!$A3:$C$10000,3,0) with the intention of obtaining the difference between the units sold up to the last day and the units sold up to the day before it, or, better said, the number of units of each product sold on a specific date. Finally, by using a column of =year() and one of =month() applied to the date column and a couple of pivot tables, I was able to obtain the value of the daily increment for each month and/or year.
The problem I couldn't find a solution for is that when the source sheet updates, the new rows inserted with the freshest data shift the cell references in the vlookup formula I used in the results sheet.
Please help me find a way to "pin down" the cell references in the vlookup formula so that they are not affected by the new row insertions.
Thank you!
To "pin down" the references you can use INDIRECT. The range is passed as a text string, so it is not adjusted when rows are inserted.
Example:
A1:A >>> INDIRECT("A1:A")
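Separately from pinning the references, the daily increment the question is after can be computed by keying on date and product instead of on row position. A minimal Python sketch under two assumptions that are mine, not the asker's: dates are zero-padded "MM/DD" strings within one year (so they sort correctly as text), and a product's first appearance counts its whole cumulative value as that day's increment. The function name `daily_increments` is hypothetical.

```python
# (date, product, cumulative units sold), newest rows first as in the source sheet.
source = [
    ("11/15", "A", 35), ("11/15", "B", 12), ("11/15", "C", 18),
    ("11/14", "A", 30), ("11/14", "C", 11), ("11/14", "B", 10),
    ("11/13", "F", 88),
    ("11/12", "B", 7), ("11/12", "A", 29), ("11/12", "C", 10),
    ("11/11", "C", 8), ("11/11", "A", 29), ("11/11", "B", 3),
]


def daily_increments(rows):
    """Map (date, product) to units sold on that date."""
    last_seen = {}   # latest cumulative value seen so far, per product
    increments = {}
    # Walk the rows oldest first; assumes zero-padded MM/DD dates in one year.
    for date, product, cumulative in sorted(rows, key=lambda r: r[0]):
        # Assumption: a product seen for the first time starts from 0.
        increments[(date, product)] = cumulative - last_seen.get(product, 0)
        last_seen[product] = cumulative
    return increments


increments = daily_increments(source)
print(increments[("11/15", "A")])
```

Since nothing here depends on where a row sits, inserting new rows at the top of the source does not change earlier results.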

Google Sheets availability chart formula

I am putting an overtime sheet together in which staff can show their availability for the Saturday in a table with name and date, using a simple Y/N. I also have another table for the hours each person has accumulated.
When several staff members say Y to their availability (we have two members of staff in), I would like two cells to display the names of the staff who have the fewest hours to their name.
Column A is the name
Column B is the hours they have worked
Column C is the checkbox, checked meaning they will work overtime.
The following formula will return the two willing to work overtime with the least hours.
=query(A:C,"select * where A is not null and C = TRUE order by B limit 2")
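The logic of that query (filter on the checkbox, order by hours ascending, keep two rows) can be sketched in Python as follows. The staff names and hours are invented sample data, and `pick_overtime` is a hypothetical helper, not part of Google Sheets.

```python
# (name, hours worked, willing to work overtime); invented sample data.
staff = [
    ("Alice", 12.0, True),
    ("Bob", 4.5, False),
    ("Carol", 7.0, True),
    ("Dave", 9.5, True),
]


def pick_overtime(staff, limit=2):
    """Return the names of the willing staff with the fewest hours."""
    # Mirrors "where A is not null and C = TRUE".
    available = [(name, hours) for name, hours, willing in staff if name and willing]
    # Mirrors "order by B" (ascending) followed by "limit 2".
    available.sort(key=lambda item: item[1])
    return [name for name, _ in available[:limit]]


print(pick_overtime(staff))
```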

how to split one column into multiple variable columns in parse.com

I have the following records in my database
fields: brand |model |attributes
----------------------------
record 1) apple |iPhone|6s space grey sprint
record 2) audi | a6 |quattro coupe
How can I store the attributes in different columns, with version/color/carrier columns for the first record and engine/type columns for the second record? The table should have 5 columns in total for the first record and 4 columns for the second record.
How do I achieve this? Should I split the table? If there are a million products and each has a varying number of attributes, then the number of columns in the table will be very large. What's the efficient way of doing this?
You can store the related attributes in a second table.
record|attribute|value
------+---------+-----
1 |version |6s
1 |color |space grey
1 |carrier |sprint
2 |engine |quattro
2 |type |coupe
This would allow for an item to have any number of (optional) attributes.
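A small Python sketch of this two-table layout, using in-memory structures instead of Parse classes. The names `products`, `attributes` and `attributes_for` are made up for illustration; the rows are the ones from the question and the table above.

```python
# Main table: one row per product, with only the fixed columns.
products = {
    1: {"brand": "apple", "model": "iPhone"},
    2: {"brand": "audi", "model": "a6"},
}

# Second table: one row per (record, attribute, value).
attributes = [
    (1, "version", "6s"),
    (1, "color", "space grey"),
    (1, "carrier", "sprint"),
    (2, "engine", "quattro"),
    (2, "type", "coupe"),
]


def attributes_for(record_id):
    """Collect all attribute rows that belong to one product."""
    return {attr: value for rec, attr, value in attributes if rec == record_id}


print(products[1], attributes_for(1))
print(products[2], attributes_for(2))
```

Adding a new kind of attribute then only means adding rows, not columns.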

Combining the select clause in query function with Indirect function. - Google Spreadsheets

So basically I need a dynamic select statement which changes references when dragged across rows or columns.
Example of what I need.
=sum(query('Sheet1'!$A$1:$F$621, Indirect("Select"&$F&"Where A='ABC' AND B="&"Sheet2!"&$A1)))/20
Sample Sheet 1 (Data Sheet), described in text since I am not allowed to use images until I reach rep 10 lol :)
Column 1 - Sales Sites (ABC, DEF, GHI.....)
Column 2 - Sales Roles (SM, ASM, SE.....)
Column 3 - Sales in Month Jan
Column 4 - Sales in Month Feb
Column 5 - Sales in Month Mar
Sample Sheet 2 (Desired Output)
Description (in pivot terms):
Site wise (filter)
Role wise (Rows)
Month wise (Columns - Sum of Jan, Feb etc)
Value (Sum of Jan/20)--To get day wise sales numbers
FYI:
I have tried using pivots, but Google Spreadsheets doesn't allow the use of calculated fields in pivots in any manner (for the /20 in the formula), hence I am trying to achieve the same results with a formula.
I know a table on the basis of the pivot table could help solve this problem, but to make it more efficient I am trying to avoid using 2 tables.
Many Thanks for your help in advance, please let me know if you need additional info to understand the scenario.
