Automated way to create a confusion matrix in Google Sheets? - google-sheets

I have a table of this form in Google Sheets:
+---------+------------+--------+
| item_id | prediction | actual |
+---------+------------+--------+
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 0 |
| 4 | 0 | 1 |
| 5 | 0 | 0 |
| 6 | 1 | 1 |
+---------+------------+--------+
And I'd like to know if there's an automated way to get this kind of summary, with the counts of items that fit the criteria specified in that row/column combination:
+----------+--------------+--------------+-------+
| | prediction=0 | prediction=1 | total |
+----------+--------------+--------------+-------+
| actual=0 | 1 | 1 | 2 |
| actual=1 | 1 | 3 | 4 |
+----------+--------------+--------------+-------+
| total | 2 | 4 | |
+----------+--------------+--------------+-------+
I've been doing this somewhat manually in Google Sheets by using COUNTIFS, but I'm wondering if there's a built-in way? I tried using pivot tables, but couldn't figure out how to get the calculated fields to show the data I want.

A coworker figured it out - you can get this by creating a pivot table with the correct columns and rows, and setting the value to item_id summarized by COUNTUNIQUE.

Related

How to use FILTER to return rows if its value is different than the above row in Google Sheets?

I'm trying to make a "friendly" view of a more complicated sheet by filtering out unnecessary data. I want one of the filter conditions to be to return a value only if that value is different than the previous row's value.
I've tried doing a filter, such as =filter(B:B, B2 <> B1) which obviously doesn't resolve. How do I do an offset reference so that the results are similar to as follows:
Sample Data
| Row | Category | Subcategory |
| --- | -------- | ----------- |
| 1 | Accounts | Ac1 |
| 2 | Accounts | Ac2 |
| 3 | Accounts | Ac3 |
| 4 | Accounts | Ac4 |
| 5 | Feedback | FbA |
| 6 | Feedback | FbB |
| 7 | Feedback | FbC |
| 8 | Feedback | FbD |
| 9 | Profile | PfOne |
| 10 | Profile | PfTwo |
| 11 | Profile | PfThree |
Desired Result
| Row | Category | Subcategory |
| --- | -------- | ----------- |
| 1 | Accounts | Ac1 |
| 5 | Feedback | FbA |
| 9 | Profile | PfOne |
Here is another approach (assuming that your posted "Row" column is not actually part of your data, but that your "Category" header is, in fact, in A1):
=ArrayFormula({A1:B1;FILTER(A2:B,A2:A<>INDIRECT("A1:A"&ROWS(A:A)-1))})
Try:
={A1:C1;FILTER(A2:C,B1:INDEX(B:B,COUNTA(B:B),1)<>B2:B)}
go for:
=SORTN(A1:C, 9^9, 2, 2, 1)

Google Sheets - Conditional Formatting a dynamically resizing `Total` Row

I am trying to conditional format a dynamically resizing Total row of multiple tables stacked one above the other. The number of cells in the Total row may change depending upon the selection in a Year dropdown.
A | B | C | D | E | F | G | H | I | J | K | L | M |
<Year dropdown>
|Issues|2020-Jan|2020-Feb|2020-Mar|2020-Apr|2020-May|2020-Jun| | | | | | |
--------------------------------------------------------------------------------
| abc | 9 | 2 | 2 | 1 | 3 | 8 | | | | | | |
| def | 1 | 3 | 7 | 1 | 5 | 3 | | | | | | |
| ghi | 2 | 1 | 3 | 1 | 1 | 2 | | | | | | |
--------------------------------------------------------------------------------
|Total | 12 | 6 | 12 | 3 | 9 | 13 | | | | | | |
--------------------------------------------------------------------------------
I am using this formula to highlight the Total row. However, it is highlighting the entire row depending upon what range is set in the Conditional Formatting > Apply to Range box. I want to highlight only the numeric cells i.e. exclude blank cells.
=ISNUMBER(SEARCH("total",$B1))
This highlights the entire Total row including any blank cells. I modified it to exclude any blank cells in row, but no joy!
=AND(ISNUMBER(SEARCH("total",$B1)), SEARCH("total",$B1)<>""))
You can also do it like this, relying on relative addressing to modify the formula as it's effectively dragged across the formatted area:
=and(isnumber(search("Total",$B3)),B3<>"")
Use formula:
=(ISNUMBER(SEARCH("total",$B1))+(($B1:$Z1<>"")))>1

Formula to randomly assign names in a given group with Google Sheets

I would like to create a formula in the below column Assigned so that a random name matching the same group as the current name is automatically inserted.
Names in the Assigned column must match the group of the current name and must not repeat throughout the list. I can't assign the current name with it's own name either. Any suggestions on how to do this?
I guess it is worth posting this. Although it might not be practical for large groups, it is a formula solution that works in Excel and Google sheets:
=INDEX($A$2:$A$8,SMALL(IF(($B$2:$B$8=B2)*($A$2:$A$8<>A2)*(COUNTIF(C$1:C1,$A$2:$A$8)=0),ROW($A$2:$A$8)-1),
RANDBETWEEN(1,SUM(($B$2:$B$8=B2)*($A$2:$A$8<>A2)*(COUNTIF(C$1:C1,$A$2:$A$8)=0)))))
entered as an array formula using CtrlShiftEnter
Here is an example of a successful match:
and an unsuccessful one:
As you can see, Mike, Jack and Fred have paired up together leaving Dave on his own, likewise with Lucy and Harry.
In Excel you may have to press F9 a few times to get a successful result - in Google sheets you just have to keep changing something, or set it to update every minute while you make a cup of tea.
It was my first time to work on something related to unique values, it took time but I learned lot from this question.
Answer of #Tom-Sharpe gave me an inspiration that it can be done, thanks to Tom for that. I tried and here it is.
I have checked it on some random data
+------------+-------+-------------+------------+
| Name | Group | RAND | Assigned |
| Malynda | 1 | 0.644382728 | Boonie |
| Boonie | 1 | 0.167369621 | Venus |
| Venus | 1 | 0.547165865 | Malynda |
| Jamal | 2 | 0.193081046 | Cora |
| Cora | 2 | 0.399459181 | Jamal |
| Alaster | 2 | 0.910498559 | Enrika |
| Enrika | 2 | 0.45819549 | Melisandra |
| Melisandra | 2 | 0.612592303 | Alaster |
| Petunia | 3 | 0.957104883 | Lawton |
| Mariam | 3 | 0.602293619 | Grenville |
| Caterina | 3 | 0.695342797 | Mariam |
| Stace | 3 | 0.942926886 | Caterina |
| Perle1 | 3 | 0.787227158 | Stace |
| Lawton | 3 | 0.315403693 | Perle1 |
| Grenville | 3 | 0.515276361 | Petunia |
| Elia | 4 | 0.655404975 | Catarina |
| Agosto | 4 | 0.322045058 | Fidela |
| Fidela | 4 | 0.490635045 | Agosto |
| Catarina | 4 | 0.121053081 | Elia |
| Elliot | 5 | 0.994137138 | Eddie |
| Mae | 5 | 0.349103119 | Wadsworth |
| Farleigh | 5 | 0.645375865 | Mae |
| Trudey | 5 | 0.473849475 | Farleigh |
| Gwenneth | 5 | 0.678186154 | Trudey |
| Wadsworth | 5 | 0.254168853 | Gwenneth |
| Eddie | 5 | 0.02103249 | Elliot |
| Denyse | 6 | 0.294801945 | Fayina |
| Tracie | 6 | 0.113670327 | Denyse |
| Aili | 6 | 0.901562575 | Tracie |
| Fayina | 6 | 0.029515522 | Alain |
| Mort | 6 | 0.938536467 | Perle |
| Alain | 6 | 0.389741125 | Aili |
| Perle | 6 | 0.513800791 | Mort |
| Mathew | 6 | 0.972656521 | #N/A |
| Benton | 7 | 0.423710316 | Bret |
| Bret | 7 | 0.127478128 | Benton |
| Mayne | 7 | 0.701027869 | Kirbee |
| Derry | 7 | 0.564710572 | Marje |
| Kirbee | 7 | 0.510258205 | Derry |
| Marje | 7 | 0.600908601 | Mayne |
| Devin | 7 | 0.718740939 | #N/A |
| Wilbert | 8 | 0.763761013 | Griswold |
| Brandice | 8 | 0.482092682 | Marty |
| Griswold | 8 | 0.111418464 | Brandice |
| Brandais | 8 | 0.594020577 | Fair |
| Kim | 8 | 0.727863883 | Brandais |
| Cam | 8 | 0.858246187 | Kim |
| Fair | 8 | 0.640979168 | Wilbert |
| Ardath | 8 | 0.883008322 | Cam |
| Marty | 8 | 0.339506717 | Ardath |
+------------+-------+-------------+------------+
D2 contains the below formula
=INDEX(
$A$2:$A$51,
MATCH(
MIN(IF(
(COUNTIF($D$1:D1,$A$2:$A$51)=0)*
($B$2:$B$51=B2)*
($A$2:$A$51<>A2),
$C$2:$C$51)
),
$C$2:$C$51,0)
)
C2 contains just the RAND() function but it is pasted as values so that it doesn't update on every calculation, but if you want you can keep it as a function
There are some #N/A values returned by the formula, it arrives when there is no other person left that can be assigned.
Check it on your data and let me know if it working correctly.
Note that the formula is assuming that names are unique and are sorted based on the groups.
If you're fluent with javascript you could attempt to use Google Apps Script to handle this by going to the toolbar Tools > Script Editor. Another way would be to record a macro via Tools > Macros > Record Macro. The script I generated via this method is below, however I couldn't figure out how to prevent assigning the current name with it's own name. My workaround for that is to select the range of assigned names in the same group and applying Data > Randomize Range until the assignment is satisfactory.
function AppendRandomNameByGroup() {
//Sort based on integer in group column
var spreadsheet = SpreadsheetApp.getActive();
var sheet = spreadsheet.getActiveSheet();
sheet.getRange(1, 1, sheet.getMaxRows(), sheet.getMaxColumns()).activate();
spreadsheet.getActiveRange().offset(1, 0, spreadsheet.getActiveRange().getNumRows() - 1).sort({column: spreadsheet.getActiveRange().getColumn() + 1, ascending: true});
spreadsheet.getRange('C2').activate();
//Duplicate names to assigned names column
spreadsheet.getRange('A2:A1000').copyTo(spreadsheet.getActiveRange(), SpreadsheetApp.CopyPasteType.PASTE_NORMAL, false);
var sheet = spreadsheet.getActiveSheet();
sheet.getRange(1, 1, sheet.getMaxRows(), sheet.getMaxColumns()).activate();
//Filter for each group and randomize items in 'assigned' column
sheet = spreadsheet.getActiveSheet();
sheet.getRange(1, 1, sheet.getMaxRows(), sheet.getMaxColumns()).createFilter();
spreadsheet.getRange('B1').activate();
var criteria = SpreadsheetApp.newFilterCriteria()
.setHiddenValues(['', '2', '3'])
.build();
spreadsheet.getActiveSheet().getFilter().setColumnFilterCriteria(2, criteria);
spreadsheet.getRange('C2:C1000').activate();
spreadsheet.setCurrentCell(spreadsheet.getRange('C4'));
spreadsheet.getActiveRange().randomize();
spreadsheet.getRange('B1').activate();
criteria = SpreadsheetApp.newFilterCriteria()
.setHiddenValues(['1', '', '3'])
.build();
spreadsheet.getActiveSheet().getFilter().setColumnFilterCriteria(2, criteria);
spreadsheet.getRange('C2:C1000').activate()
.randomize();
spreadsheet.getRange('B1').activate();
criteria = SpreadsheetApp.newFilterCriteria()
.setHiddenValues(['2', '', '1'])
.build();
spreadsheet.getActiveSheet().getFilter().setColumnFilterCriteria(2, criteria);
spreadsheet.getRange('C2:C1000').activate();
spreadsheet.setCurrentCell(spreadsheet.getRange('C8'));
spreadsheet.getActiveRange().randomize();
spreadsheet.getActiveSheet().getFilter().remove();
};
If you're trying to set-up a secret santa of some kind, another possible alternative to using scripts would be this Google Sheets Add-On that does something similar including the group functionality that you desire. However instead of assigning a recipient name in-sheet, it requires you to input corresponding e-mail addresses for each name and proceeds to email each participant with their assigned name.
Secret Santa
Sorry that this doesn't fit all your criteria.

SUM of text in cells based on rules in another sheet

I have created this Sheets to test this formula:
| | A | B | C | D |
| 1 | Object | Yes | Maybe | No |
| 2 | Object 1 | 50 | 25 | 0 |
| 3 | Object 2 | 20 | 10 | 0 |
| 4 | Object 3 | 20 | 10 | 0 |
| 5 | Object 4 | 10 | 5 | 0 |
Rules
| | A | B | C | D | E | F | G |
| 1 | Article | Object 1 | Object 2 | Object 3 | Object 4 | Total | |
| 2 | Article 1 | 50 | 20 | 20 | 10 | 100 | |
| 3 | Article 2 | Yes | Yes | Yes | Yes | 100 | |
| 4 | Article 3 | Yes | No | No | Yes | 60 | |
| 5 | Article 4 | No | Yes | Yes | No | 40 | |
| 6 | Test | No | Yes | Yes | No | #VALUE! | |
| 7 | Test2 | Yes | Yes | No | Yes | 50 | |
| 8 | Test3 | Yes | Yes | No | Yes | 70 | * |
* This works partially, but if No is selected the next Yes won't be calculated and breaks
if first Object is not Yes. The example says 70 but should be 80.
Sheet
https://docs.google.com/spreadsheets/d/1ydSfa4dpkTdcvwPPqGLRdQ9r-JZstB-hYS7J7tondUs/edit?usp=sharing
What I want to achieve is that the values listed in Rules! should correspond to Yes/No in Sheet! when it adds up the SUM.
For example in Sheet!, if I select Yes, Yes, No, Yes it should add up to 50 + 20 + 0 + 10 = 80. As the first Yes equals 50, followed by 20, 20, 10, and any No equals 0.
I know very basics formulas when it comes to spreadsheets, and what I have tried so far is the following, and it is also where I get stuck.
I want it to read B8 through E8, see how many Yes is listed, and if any Yes is listed, compare it to B2 through B5.
=SUMIF(B8:E8,"Yes",Rules!B2:B5)
The closest I have come is by ignoring the Rules sheet and to put the rules directly into the formula by repeating IF statements. This way it works, but I would still prefer to have the rules set by the Rules sheet.
=IF(B10="Yes",50+IF(C10="Yes",20+IF(D10="Yes",20+IF(E10="Yes",10,0))))
What I try it probably very incorrect, but as I said I have no idea how to proceed or fix it.
Does anyone have a suggestion for me?
Or if you need further explanation of what I want to achieve, if something is not clear, please let me know and I will try to explain.
Maybe you are after SUMPRODUCT:
=sumproduct(B$2:E$2,B3:E3="Yes")
in F3 of Sheet, copied down to suit, or perhaps:
=sumproduct(transpose(Rules!B$2:B$5),B3:E3="Yes")

How do you SUM two fields from two tables, even when the field in the second table could be null?

I have the following tables:
products.rb
# has_many :sales
+----+----------+----------+-------+
| id | name | quantity | price |
+----+----------+----------+-------+
| 1 | Pencil | 30 | 1.0 |
| 2 | Pen | 50 | 1.5 |
| 3 | Notebook | 100 | 2.0 |
+----+----------+----------+-------+
sales.rb
# belongs_to :product
+----+----------+------------+
| id | quantity | product_id |
+----+----------+------------+
| 1 | 10 | 1 |
| 2 | 2 | 1 |
| 3 | 5 | 1 |
| 4 | 2 | 2 |
| 5 | 10 | 2 |
+----+----------+------------+
I'd like to know, first, how many items I have left, regardless of their type. The answer is of course 151, but that'd be cheating. I could simply make a SUM of both tables individually, then put them together to know the final number, but I'm wondering if this could be done via activerecord in a single command.
I tried the following:
Product.includes(:sales).group('products.id').sum('products.quantity - sales.quantity')
but I get:
=> {1=>73, 2=>88, 3=>0}
which is understandable, as it is going through each one to do the sum like this:
+-------------------+----------------+-----+
| products.quantity | sales.quantity | sum |
+-------------------+----------------+-----+
| 30 | 10 | 20 |
| 30 | 2 | 28 |
| 30 | 5 | 25 |
+-------------------+----------------+-----+
which equals 73.
Anyway, how could this be achieved with ActiveRecord? I want to know the total number of items, but I'd also like to know the total of each type.
I'm not familiar of any ActiveRecord way to achieve what you want but you can try mixing a little sql in there
Product
.group('products.id')
.sum('products.quantity - (SELECT SUM(sales.quantity) AS sales_quantity FROM sales WHERE sales.product_id = products.id)')

Resources