Google Spreadsheets: CORREL() and MMULT() with missing cases/blanks - google-sheets

So I've got the following formula to correlate two ranges:
=ROUND(CORREL(ARRAYFORMULA(MMULT('E0:Sample'!$D$2:$AY,TRANSPOSE(SIGN(COLUMN(('E0:Sample'!$D$2:$AY)))))),FILTER(OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2),NOT(ISBLANK(OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2))))),3)
The formula works fine, as long as there are no blanks in 'E0:Sample'!$D$2:$AY. Otherwise the error message Function MMULT parameter 1 expects number values. But '' is a empty and cannot be coerced to a number. is thrown.
I´ve tried to filter() for empty rows, but the filter-function won't work since the ranges differ.
How do I solve this without the best way?
Thanks!

It's difficult to test your complete formula, but I did a test on a mini-version of matrix multiply and it seems that you can use the N function the same way as you can in Excel. Here is my mini-test:-
=ArrayFormula(MMULT(n(B1:G1),n(A1:A6)))
where both ranges contain a mix of numbers, alphas and blanks. Non-numeric cells are treated as zeroes.
Reference
I'm not totally clear about the context for this - I think you're trying to get the row sums from your large 2D array by using the mmult - if this is correct I think my answer is OK because the blanks would contribute nothing to the sums. Since CORREL ignores blanks in the second range, you don't need to filter at all?
I did eventually set up some test data for your formula, and my formula ended up like this:-
=ROUND(CORREL(ARRAYFORMULA(MMULT(n('E0:Sample'!$D$2:$AY),TRANSPOSE(SIGN(COLUMN(('E0:Sample'!$D$2:$AY)))))),OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2)),3)

Related

What's going on with my max value operation in google sheets?

I'm doing a very simple max function to find the max between 2 cells, it works on the first few lines, but doesn't work the rest of the way down.
You'll see in the pic the max functions are in column R and only find the max between cells in column P and Q.
What you can't see is Column P is data input manually, while column Q references a different cell that contains a formula.
Why is this not working? thanks
The issue with your ranges is that the values in range Q1:Q are NOT numbers.
Reading the official page about the MAX function:
Each value argument must be a cell, a number, or a range containing numbers. Cells without numbers or ranges are ignored. Entering text values will cause MAX to return the #VALUE! error. To allow text values, use MAXA.
Because they are not numbers they are considered as 0.
So, when using =max(P2,Q2) the result appears realistic.
But not always.
Do test your values using =ISNUMBER()
You can correct the formulas by formatting values as numbers.
Just using a value alone didn't work for me. Wrapping my range in an arrayformula with a nested value formula worked. For example:
=max(ArrayFormula((value($R2:$R))))
This is probably because my entire column is calculated using formulas. I had to apply the value formula to the entire array.

Is there a way to specify an input is a single cell in Google Sheets?

I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of the first space and number, and the second pulls out all the digits in a string of " #hr" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE, to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involved concatenating your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCHES and QUERY's to do computations. Doing a SUM over a RANGE is a lot more compact of a formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and if you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.

How to apply multiple conditions from one cell for SUMIF | GoogleSheets

Do you have an idea for a function that would sum the amounts from table 2 based on Unique_nr from Table 1?
I tried to do it this way:
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{SPLIT(A3,",")},F3:F9))) <---doesn't work
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{"8-1","9-1"},F3:F9))) <----It works
Theoretically the SPLIT() function gives the same result as I type manually, but unfortunately it doesn't work.
I would like to do this with one function for the entire range of data
https://docs.google.com/spreadsheets/d/1JGvFIZIE6c_D0A2Z4xCWf7pxqVft4Zsb6S-45_d9LY4/edit?usp=sharing
You were almost in the correct way
First of all, remove the double quotes from the cells.
By default SPLIT will Divide text around a specified character or string and it means there will be an extra step in order to use this output to another function, it's possible that your cell had an extra character and the TRIM function will solve it.
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{trim(SPLIT(A3,","))},F3:F9)))
You can use VLOOKUP and SUM as a different approach
As you mentioned SPLIT is a good approach to treat comma-separated cells. In order to avoid unexpected spaces TRIM is a good option (it's optional) as well as IFNA in order to fill that cell in case there's not a match.
=ArrayFormula(SUM(IFNA(vlookup(trim(split(A3,",")),E3:F9,2,0))))
If you can't find a better option, you can use a dragable formula:
=ArrayFormula(SUM($F$3:F*(ISNUMBER(SEARCH($E$3:E,A3)))))

How to use AVERAGEIFS within ARRAYFORMULA

I am trying to use AVERAGEIFS inside ARRAYFORMULA. Looking at other questions, I have come to the conclusion that it is not possible without using QUERY function.
My intention is to average the values of a column whenever they share the same ID.
I think this question comes pretty close to what I need, but I haven't been able to replicate and adapt its solution on my own sheet.
In this sheet I show the result I expect (I got it by dragging the formula). I've also reviewed the Query Language Reference, unsuccessfully.
Thanks a lot for your time and effort.
So the formula should be
=ArrayFormula(iferror(sumif(A2:A,A2:A,B2:B)/countif(A2:A,A2:A)))
Note that if there were any text values in the points column, this would still return a result (because count would be greater than zero) - you could instead use
=ArrayFormula(if(isnumber(B2:B),(sumif(A2:A,A2:A,B2:B)/countif(A2:A,A2:A)),""))
If you had a mixture of rows with text and rows with numbers for any ID, this would return a smaller result than the avg or average formula. This is a limitation of this method. You can't put an extra condition in (that column B has to contain a number) because you would need countifs and countifs isn't array-friendly. It still seems strange that AFAIK countif and sumif are the only functions out of this family that are array-friendly while countifs, sumifs, averageif etc. are not.
you can do:
=ARRAYFORMULA(IFERROR(VLOOKUP(A2:A; QUERY(A2:B; "select A,avg(B) group by A"); 2; )))

Is there a way to use ARRAYFORMULA to find the most-recent even input of a column?

SOLVED EDIT
Thank you for the help. Solution here.
ORIGINAL POST
I have made a google sheet to describe the issue I am facing linked here (https://docs.google.com/spreadsheets/d/1yK6ZAX8BFnEqiuQO9HIxuY0l62ewDDccj-8EN1r2i2w/edit?usp=sharing).
I will also describe in words, below, the problem I am facing, along with the solutions I have tried.
The data of column A are random single-digit (0-9). I would like column B to show the most recent even number from column A, but only up to a specific row. That specific row is the row corresponding to the row of the cell in column B. In other words, in cell B7, I want to find the most recently entered even number of column A, specifically only on the range A2:A7 (A1 contains a column header).
This is actually a pretty simple formula, and I can get the desired outputs by simply checking if the value in a cell in column A is even and then returning the value of that cell if it is, or the output of the cell above if it isn't. So the formula would look something like: ​=IF(ISEVEN(A7),A7,B6)​
However, my problem is that the length of the data in column A will be growing as more data are entered, and my current solution of using the fill handle to copy the formula to new cells is inelegant and time-consuming. So my desired solution is to use an array formula entered into the first cell of column B (B2), capable of returning the same value as the other formula. The formula I tried to enter to perform this was the following: ​=ARRAYFORMULA(IF(ISEVEN(A2:A),A2:A,INDIRECT(ADDRESS(ROW(A2:A)-1,2))))​
However, as some of my previous work with arrays has taught me, not all formulas iterate as expected down the array. The formula seems to be able to return the correct output on lines which are already even, but it is unable to return the expected most-recently entered even number for all the other lines. It appears that the formula is not able to appropriately interpret the ​value_if_false​ argument of the ​IF​ formula.
I'm a little new to scripting, so I'm still trying to learn, but I also tried to dabble around with custom functions to no avail. I'm still wet behind the ears when it comes to coding, which is why I've been so lenient on the built-in formulas of Google Sheets, but I fear I may have reached the limit of what Sheets formulas can do.
I am open to trying new approaches, but my only real constraint is that I would really like for this to be a one-touch (or even better no-touch) solution, hope that's not too far beyond the scope of this issue. Any assistance would be much appreciated.
EDIT
After rubber-ducking the problem here, I went back and tried to use the OFFSET formula, hoping I could get it to play nicely with the array formula. Alas, I was unable, but I thought I should at least post my progress here for reference.
Attempt with offset
Still working at it!
Doing a vlookup on the row number seems to work for me
=ArrayFormula(if(A2:A="","",vlookup(row(A2:A),{if(iseven(A2:A),row(A2:A)),A2:A},2)))
Note: if there are no even numbers in range for some rows, it will produce #N/A for those rows.

Resources