Scorecard to show total average per month - google-sheets

I have a data source (coming from a Google Sheet) of engagements that has two columns, Submitted Date and ID.
Each row is a unique engagement.
I want to show a single Scorecard widget that has the total average # of engagements per month. For example, if:
2020-01 - 5 rows / engagements
2020-02 - 7 rows / engagements
2020-03 - 4 rows / engagements
Then the scorecard would show an average of 5.33 rows/engagements per month, i.e. (5 + 7 + 4) / 3.
Here is some sample data:
| Submitted Date | ID |
|----------------|------|
| 2020-01-02 | ID01 |
| 2020-01-05 | ID02 |
| 2020-01-10 | ID03 |
| 2020-01-12 | ID04 |
| 2020-01-21 | ID05 |
| 2020-02-01 | ID06 |
| 2020-02-02 | ID07 |
| 2020-02-05 | ID08 |
| 2020-02-15 | ID09 |
| 2020-02-16 | ID10 |
| 2020-02-17 | ID11 |
| 2020-02-21 | ID12 |
| 2020-03-10 | ID13 |
| 2020-03-15 | ID14 |
| 2020-03-20 | ID15 |
| 2020-03-25 | ID16 |
I know I can pre-process this data in another sheet in Google to create a table that shows # of rows per month and then in Data Studio I can create an average of that. I am trying to avoid doing that.

In pseudocode, the formula below is COUNT(ID) / COUNT_DISTINCT(Year Month) (in this case, 16 / 3):
COUNT(ID) / COUNT_DISTINCT(TODATE(Submitted Date, "%Y%m"))

Since each row is a unique engagement, first, I would extract the year-month from the Submitted Date column. Then I would count their occurrences and get the average value.
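The same arithmetic can be checked outside Data Studio. The following is only a minimal Python sketch of the idea (count rows, then divide by the number of distinct year-months); the function name `average_per_month` is made up for illustration, and the dates are copied from the sample table in the question.

```python
from collections import Counter

# "Submitted Date" values copied from the sample table in the question.
submitted_dates = [
    "2020-01-02", "2020-01-05", "2020-01-10", "2020-01-12", "2020-01-21",
    "2020-02-01", "2020-02-02", "2020-02-05", "2020-02-15", "2020-02-16",
    "2020-02-17", "2020-02-21",
    "2020-03-10", "2020-03-15", "2020-03-20", "2020-03-25",
]


def average_per_month(dates):
    """Return the row count divided by the number of distinct year-months."""
    # The first 7 characters of an ISO date ("YYYY-MM") identify the month.
    per_month = Counter(date[:7] for date in dates)
    if not per_month:
        return 0.0  # no rows, so avoid dividing by zero
    return sum(per_month.values()) / len(per_month)


print(round(average_per_month(submitted_dates), 2))  # 5.33 (16 rows / 3 months)
```

Note that, like the formula, this only counts months that have at least one row; a month with zero engagements would not lower the average.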

Related

Union Vertical Blending in Data Studio

I want to blend several tables into one table. All of the tables have the same columns, so I'm thinking of stacking them vertically, like a UNION.
My data source is Google Sheets.
The data will look like this:
Table1
| Type | Object | Amount |
|:---- |:---------:| ------:|
| Tech | PC | $100 |
| Tech | Keyboard | $50 |
| Tech | Mouse | $60 |
Table2
| Type | Object | Amount |
|:----- |:-----------------------:| ------:|
| Sales | Sales Incentives | $1000 |
| Sales | Meeting with Client | $400 |
| Sales | Visiting stores | $80 |
While the desired output would be:
| Type | Object | Amount |
|:----- |:-----------------------:| ------:|
| Sales | Sales Incentives | $1000 |
| Sales | Meeting with Client | $400 |
| Sales | Visiting stores | $80 |
| Tech | PC | $100 |
| Tech | Keyboard | $50 |
| Tech | Mouse | $60 |
Can anyone help me with this? Thank you.
I just got the answer:
You can use the blending FULL OUTER JOIN and use the formula:
COALESCE(Name (Source #1),Name (Source #2),Name (Source #3))
Thank you to Mehdi Oidjida for the help.
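To see why a full outer join plus COALESCE behaves like a vertical union, here is a rough Python sketch. It is not Data Studio code: the helper names `coalesce` and `union_via_full_outer_join` are hypothetical, and it assumes the join key (here `Object`) never matches between the two tables, so every row ends up with values from only one side. Amounts are written as plain integers for simplicity.

```python
def coalesce(*values):
    """Return the first value that is not None, mirroring SQL-style COALESCE."""
    return next((v for v in values if v is not None), None)


def union_via_full_outer_join(left, right, key):
    """Full outer join two lists of dicts on `key`, then coalesce each column."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    # dict.fromkeys drops duplicates while keeping first-seen order.
    all_keys = dict.fromkeys([*left_by_key, *right_by_key])
    columns = dict.fromkeys(col for row in [*left, *right] for col in row)
    merged = []
    for k in all_keys:
        l_row = left_by_key.get(k, {})
        r_row = right_by_key.get(k, {})
        # Only one side has data for an unmatched key; coalesce picks it.
        merged.append({c: coalesce(l_row.get(c), r_row.get(c)) for c in columns})
    return merged


table1 = [
    {"Type": "Tech", "Object": "PC", "Amount": 100},
    {"Type": "Tech", "Object": "Keyboard", "Amount": 50},
    {"Type": "Tech", "Object": "Mouse", "Amount": 60},
]
table2 = [
    {"Type": "Sales", "Object": "Sales Incentives", "Amount": 1000},
    {"Type": "Sales", "Object": "Meeting with Client", "Amount": 400},
    {"Type": "Sales", "Object": "Visiting stores", "Amount": 80},
]

for row in union_via_full_outer_join(table1, table2, key="Object"):
    print(row)
```

If the same key value did appear in both tables, the two rows would collapse into one and the left-hand values would win, which is why the choice of join key matters for this trick.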

Google Spreadsheet : sort greatest values from a sort/uniq/sumifs

I use Google Spreadsheet to keep track of my wine cellar, with a simple sheet listing the number of bottles, the name of the wine, and where it's from:
+--------------+------------+-------------+
| # of bottles | Wine | Appellation |
+--------------+------------+-------------+
| 2 | Talbot | St Julien |
| 16 | Marbuzet | St Estephe |
| 1 | Terrebrune | Bandol |
| 10 | Madiniere | Cote Rotie |
+--------------+------------+-------------+
I'd like to get a roundup of the appellations I have the most of, sorted by number of bottles, e.g.:
+--------------+-------------+
| # of bottles | Appellation |
+--------------+-------------+
| 16 | St Estephe |
| 10 | Cote Rotie |
| ... | ... |
+--------------+-------------+
I know how to get the sorted list of appellations (=SORT(UNIQUE($C$2:$C$999)), with the wine origin in column C) and the matching number of bottles (=SUMIFS(A:A,C:C,<cell with appellation name>)), but I'm stuck at sorting by the number of bottles instead.
With QUERY
=QUERY(A:C,"select sum(A),C group by C order by sum(A) desc",1)
To rename the header:
=QUERY(A:C,"select sum(A),C group by C order by sum(A) desc label sum(A) '# of bottles'",1)
With SORT and SUMIF
=ArrayFormula(SORT({SUMIF(C:C,UNIQUE(C2:C),A:A),UNIQUE(C2:C)},1,FALSE))
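For readers who want to sanity-check what these formulas compute, here is a small Python sketch of the same group, sum, and sort-descending logic. The function name `bottles_by_appellation` is invented for this example, and the rows are the four sample rows from the question.

```python
from collections import defaultdict

# (number of bottles, wine, appellation) rows from the question's sheet.
cellar = [
    (2, "Talbot", "St Julien"),
    (16, "Marbuzet", "St Estephe"),
    (1, "Terrebrune", "Bandol"),
    (10, "Madiniere", "Cote Rotie"),
]


def bottles_by_appellation(rows):
    """Sum bottles per appellation and sort by the total, largest first."""
    totals = defaultdict(int)
    for bottles, _wine, appellation in rows:
        totals[appellation] += bottles
    # Equivalent to "group by C order by sum(A) desc" in the QUERY version.
    return sorted(
        ((count, name) for name, count in totals.items()),
        key=lambda pair: pair[0],
        reverse=True,
    )


for count, name in bottles_by_appellation(cellar):
    print(count, name)
```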

Partial transpose of Sheet

I have a Google Sheet with this format:
+---------+---------+---------+------------+------------+------------+------------+--------+--------+
| Field_A | Field_B | Field_C | 24/09/2019 | 25/09/2019 | 26/09/2019 | 27/09/2019 | day... | day... |
+---------+---------+---------+------------+------------+------------+------------+--------+--------+
| ValX | ValY | ValZ | Val1 | Val2 | Val3 | Val4 | | |
| ValW | ValY | ValZ | Val5 | Val6 | Val7 | Val8 | | |
+---------+---------+---------+------------+------------+------------+------------+--------+--------+
The first 3 columns are specific fields, and every other column corresponds to one specific day in a given (and static) range.
I need to convert the table in the following format:
+---------+---------+---------+------------+-----------+
| Field_A | Field_B | Field_C | Date | DateValue |
+---------+---------+---------+------------+-----------+
| ValX | ValY | ValZ | 24/09/2019 | Val1 |
| ValX | ValY | ValZ | 25/09/2019 | Val2 |
| ValX | ValY | ValZ | 26/09/2019 | Val3 |
| ... | | | | |
+---------+---------+---------+------------+-----------+
Basically, the first 3 columns are carried over as-is, but the day columns are transposed (is that even the correct term?) into 2 values:
The date
The value in the cell related to date
Is this something that can be achieved with a formula, or do I need to create a bound Apps Script?
Here is a sample Sheet demo: https://docs.google.com/spreadsheets/d/1cprzD96i-4NQ8tieA_nwd8s43yKF-M8Kww4yWNfB6tg/edit#gid=505040170
In Sheet Start you can see the initial data and format: 3 static columns and one column for every day.
In Sheet End you can see the output format I'm looking for, the same 3 static columns, but the date and cell value related to date are transposed as a row.
You can see the formula I used: a TRANSPOSE for every row, where I select the days for the IV column and one row at a time for the V column.
For the 3 static columns, I replicated the Formula for every instance of the day related to that row.
This works, but it requires a lot of manual work to set up every single TRANSPOSE. I'm wondering if there is a more automatic way of doing this (other than Apps Script, which I'm already planning to use if no other solution is available).
=ARRAYFORMULA(TRIM(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
IF(Start!D2:F<>""; "♦"&TRANSPOSE(QUERY(TRANSPOSE(Start!A2:C&"♠");;999^99))&
TEXT(Start!D1:F1; "dd/mm/yyyy")&"♠"&Start!D2:F; ));;999^99));;999^99); "♦")); "♠")))
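The reshaping this formula performs is usually called an unpivot (wide to long). As a plain illustration of the target logic rather than a replacement for the formula, here is a short Python sketch; the function name `unpivot` and the `id_count` parameter are made up for this example, and the data is the sample table from the question.

```python
def unpivot(header, rows, id_count=3):
    """Turn one column per date into one output row per (record, date) pair."""
    date_names = header[id_count:]
    long_rows = []
    for row in rows:
        ids = row[:id_count]  # the static fields are repeated on every output row
        for date, value in zip(date_names, row[id_count:]):
            if value != "":  # skip empty cells, like the IF(...<>"") in the formula
                long_rows.append([*ids, date, value])
    return long_rows


header = ["Field_A", "Field_B", "Field_C",
          "24/09/2019", "25/09/2019", "26/09/2019", "27/09/2019"]
rows = [
    ["ValX", "ValY", "ValZ", "Val1", "Val2", "Val3", "Val4"],
    ["ValW", "ValY", "ValZ", "Val5", "Val6", "Val7", "Val8"],
]

for long_row in unpivot(header, rows):
    print(long_row)
```

Each input row with four date columns produces four output rows, so the two sample rows become eight rows of the form `[Field_A, Field_B, Field_C, Date, DateValue]`.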

Time span accumulating fact tables design

I need to design a star schema for order processing. The progress of an order looks like this:
Customer C places an order for item I with quantity 100
Factory F1 takes part of the order, with quantity 30
Factory F2 takes part of the order, with quantity 20
Buy 50 items from the market
F1 delivers 20 items
F1 delivers 7 items
F1 cancels the contract (we need to buy 3 more items from the market)
F2 delivers 20 items
Buy 3 items from the market
Complete the order
How can I design a fact table in this case, given that the number of steps is not fixed and the events do not all have the same data types?
I'm sorry for my bad English.
The definition of an Accumulating Snapshot Fact table according to Kimball is:
summarizes the measurement events occurring at predictable steps between the beginning and the end of a process.
For this particular use case I would go with a Transaction Fact Table, because the events (steps) are unpredictable. It is more like an event fact table, similar to logs or audits.
| order_key | date_key | full_datetime | entity_key (customer, factory, etc. varchar) | entity_type | state | quantity |
|-----------|----------|---------------------|----------------------------------------------|-------------|----------|----------|
| 1 | 20190602 | 2019-06-02 04:30:00 | C1 | customer | request | 100 |
| 1 | 20190602 | 2019-06-02 05:30:00 | F1 | factory | receive | 30 |
| 1 | 20190602 | 2019-06-02 05:30:00 | F2 | factory | receive | 20 |
| 1 | 20190602 | 2019-06-02 05:40:00 | Company? | company | buy | 50 |
| 1 | 20190603 | 2019-06-03 06:40:00 | F1 | factory | deliver | 20 |
| 1 | 20190603 | 2019-06-03 02:40:00 | F1 | factory | deliver | 7 |
| 1 | 20190603 | 2019-06-03 04:40:00 | F1 | factory | deliver | 3 |
| 1 | 20190603 | 2019-06-03 06:40:00 | F1 | factory | cancel | |
| 1 | 20190604 | 2019-06-04 07:40:00 | F2 | factory | deliver | 20 |
| 1 | 20190604 | 2019-06-04 07:40:00 | Company? | company | buy | 3 |
| 1 | 20190604 | 2019-06-04 09:40:00 | Company? | company | complete | 100 |
I'm not sure about your reporting needs, as they were not specified, but assuming you need to measure lags/durations between unpredictable steps, you could PIVOT and use dynamic SQL to create the required view (see the question "SQL Server dynamic PIVOT query?").
Let me know if you come up with something different, as I'm interested in this particular use case. Good luck.
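To make the event-style design a bit more concrete, here is a rough Python sketch of two things such a table could answer: total quantity per state, and the lag between two states of an order. This is only an illustration under assumptions: the rows are a subset copied from the table above, the helper names `quantity_by_state` and `lag_between` are hypothetical, and a real warehouse would do this in SQL rather than Python.

```python
from collections import defaultdict
from datetime import datetime

# (order_key, full_datetime, entity_key, entity_type, state, quantity)
# A subset of the rows from the table above.
events = [
    (1, "2019-06-02 04:30:00", "C1", "customer", "request", 100),
    (1, "2019-06-02 05:30:00", "F1", "factory", "receive", 30),
    (1, "2019-06-02 05:30:00", "F2", "factory", "receive", 20),
    (1, "2019-06-02 05:40:00", "Company?", "company", "buy", 50),
    (1, "2019-06-04 07:40:00", "F2", "factory", "deliver", 20),
    (1, "2019-06-04 07:40:00", "Company?", "company", "buy", 3),
    (1, "2019-06-04 09:40:00", "Company?", "company", "complete", 100),
]


def quantity_by_state(rows, order_key):
    """Sum the quantity of every event of one order, grouped by state."""
    totals = defaultdict(int)
    for key, _ts, _entity, _etype, state, quantity in rows:
        if key == order_key:
            totals[state] += quantity
    return dict(totals)


def lag_between(rows, order_key, start_state, end_state):
    """Time from the first `start_state` event to the first `end_state` event."""
    def first_time(state):
        # Raises ValueError if the order has no event in this state.
        return min(
            datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
            for key, ts, _entity, _etype, s, _quantity in rows
            if key == order_key and s == state
        )
    return first_time(end_state) - first_time(start_state)


print(quantity_by_state(events, 1))
print(lag_between(events, 1, "request", "complete"))
```

Because every step is just another row, adding a new kind of event does not require a schema change, which is the main reason to prefer this shape when the steps are unpredictable.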

Automated way to create a confusion matrix in Google Sheets?

I have a table of this form in Google Sheets:
+---------+------------+--------+
| item_id | prediction | actual |
+---------+------------+--------+
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 0 |
| 4 | 0 | 1 |
| 5 | 0 | 0 |
| 6 | 1 | 1 |
+---------+------------+--------+
And I'd like to know if there's an automated way to get this kind of summary, with the counts of items that fit the criteria specified in that row/column combination:
+----------+--------------+--------------+-------+
| | prediction=0 | prediction=1 | total |
+----------+--------------+--------------+-------+
| actual=0 | 1 | 1 | 2 |
| actual=1 | 1 | 3 | 4 |
+----------+--------------+--------------+-------+
| total | 2 | 4 | |
+----------+--------------+--------------+-------+
I've been doing this somewhat manually in Google Sheets by using COUNTIFS, but I'm wondering if there's a built-in way? I tried using pivot tables, but couldn't figure out how to get the calculated fields to show the data I want.
A coworker figured it out - you can get this by creating a pivot table with actual as the rows and prediction as the columns, and setting the value to item_id summarized by COUNTUNIQUE.
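For comparison, the same counts can be reproduced with a few lines of Python. This is just a sketch to verify the pivot table's numbers; the function name `confusion_counts` is made up for this example, and the rows are the sample data from the question.

```python
from collections import Counter

# (item_id, prediction, actual) rows from the question's table.
rows = [
    (1, 1, 1),
    (2, 1, 1),
    (3, 1, 0),
    (4, 0, 1),
    (5, 0, 0),
    (6, 1, 1),
]


def confusion_counts(data):
    """Count items for each (actual, prediction) combination."""
    return Counter((actual, prediction) for _item_id, prediction, actual in data)


counts = confusion_counts(rows)
for actual in (0, 1):
    row_total = counts[(actual, 0)] + counts[(actual, 1)]
    print(f"actual={actual}", counts[(actual, 0)], counts[(actual, 1)], row_total)
```

Counting rows per combination gives the same result as COUNTUNIQUE on item_id as long as each item_id appears only once, which is the case in the sample data.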
