Do you have an idea for a function that would sum the amounts from table 2 based on Unique_nr from Table 1?
I tried to do it this way:
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{SPLIT(A3,",")},F3:F9))) <---doesn't work
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{"8-1","9-1"},F3:F9))) <----It works
In theory, SPLIT() returns the same values that I type manually, but unfortunately it doesn't work.
I would like to do this with one function for the entire range of data.
https://docs.google.com/spreadsheets/d/1JGvFIZIE6c_D0A2Z4xCWf7pxqVft4Zsb6S-45_d9LY4/edit?usp=sharing
You were almost there.
First of all, remove the double quotes from the cells.
SPLIT divides text around a specified character or string, so its output may need an extra step before you pass it to another function. It's possible that your cell had an extra character (for example, a space after the comma), and the TRIM function will solve that.
=SUM(ARRAYFORMULA(SUMIF(E3:E9,{trim(SPLIT(A3,","))},F3:F9)))
You can use VLOOKUP and SUM as a different approach
As you mentioned, SPLIT is a good way to handle comma-separated cells. TRIM is optional but guards against unexpected spaces, and IFNA handles the case where a value has no match.
=ArrayFormula(SUM(IFNA(vlookup(trim(split(A3,",")),E3:F9,2,0))))
If you can't find a better option, you can use a draggable formula:
=ArrayFormula(SUM($F$3:F*(ISNUMBER(SEARCH($E$3:E,A3)))))
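For the "one function for the entire range" part of the question, a possible sketch uses BYROW with LAMBDA to apply the TRIM/SPLIT formula above to each row. The range A3:A9 for Table 1 is an assumption (adjust it to your data), and the name cell is just an illustrative parameter:

```
=BYROW(A3:A9, LAMBDA(cell, IF(cell="", , SUM(ARRAYFORMULA(SUMIF(E3:E9, TRIM(SPLIT(cell, ",")), F3:F9))))))
```

The IF(cell="", , ...) guard leaves empty rows blank instead of calling SPLIT on an empty cell.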
I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of the first space and number, and the second pulls out all the digits in a string of " #hrs" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involved concatenating your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
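As a small illustration of the two patterns, here they are applied to a single made-up string ("Kitchen 3hrs" is a hypothetical cell value, not taken from the sample sheet):

```
=REGEXEXTRACT("Kitchen 3hrs", "(.*) \d")
=VALUE(REGEXEXTRACT("Kitchen 3hrs", " (\d+)hrs"))
```

The first returns the text Kitchen; the second returns the number 3, because VALUE converts the extracted text "3" to a number.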
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCHES and QUERY's to do computations. Doing a SUM over a RANGE is a lot more compact of a formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and if you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.
I'm extracting text from filename cells into separate metadata field cells. So far I have done this successfully using the REGEXEXTRACT formula, as seen below.
=REGEXEXTRACT(A1, "TILEABLE|ROOM|MAIN|FLOORSHOT|SWATCH|ANGLED")
However, some metadata fields that include multiple words require that a space or other character be placed between words. I'm trying to figure out how to use SUBSTITUTE or REPLACE in conjunction with REGEXEXTRACT to find a phrase and replace it with a different version. Ex. Replace "TOPDOWN" with "Top Down" or replace "1TO1" with "1-to-1".
Depending on your purpose, one formula might be better than another. If you want to list the substituted values of this string in a column, you can chain as many phrases as you want using SUBSTITUTE and REGEXEXTRACT.
Each REGEXEXTRACT returns one phrase you are looking for and SUBSTITUTE replaces it. The array literal would normally lay the results out in a row, so TRANSPOSE is used to display them in a column. This is a simple example:
=TRANSPOSE({SUBSTITUTE(REGEXEXTRACT(A1,"TOPDOWN"),"TOPDOWN","Top Down"),SUBSTITUTE(REGEXEXTRACT(A1,"SHIRTS"),"SHIRTS","Shirts1")})
try:
=SUBSTITUTE(SUBSTITUTE(REGEXEXTRACT(A1,
"TOPDOWN|1TO1|TILEABLE|ROOM|MAIN|FLOORSHOT|SWATCH|ANGLED"),
"TOPDOWN", "Top Down"),
"1TO1", "1-to-1")
=ArrayFormula(IF(A1:B6<0,0,A1:B6))
The range is referred to twice. Is it possible to do this with a single reference within a formula?
Perhaps something akin to IFERROR, like IFCONDITION(range, condition, result_if_condition)
The use case is that the range itself is in many cases computed using complex arrangements, so it becomes quite inconvenient/unwieldy when that same complex arrangement needs to be inserted into multiple places.
Sample sheet.
In this particular case, you can use
=ArrayFormula(text(A1:A6,"0;\0"))
so that any negative numbers are displayed as zero.
Since the result is a string, it may need to be coerced to a number for use in further calculations.
This was first suggested to me by #barry houdini - here is an example of it in use (in Excel).
EDIT by OP (as in comment below):
Here is the link: https://support.google.com/docs/answer/56470. So if you wanted blank cells to be zero, you would set the 4th part to \0, i.e. =ArrayFormula(text(A1:A7,"0;\0;\0;\0")), because a blank cell is not a number.
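Another option, assuming the LET function is available in your version of Google Sheets (it was added after this question was asked), is to name the range once and reuse the name. The name r is arbitrary:

```
=ArrayFormula(LET(r, A1:B6, IF(r<0, 0, r)))
```

Here A1:B6 could be replaced by any complex expression, which then only appears once, and the results stay numeric rather than text.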
So I've got the following formula to correlate two ranges:
=ROUND(CORREL(ARRAYFORMULA(MMULT('E0:Sample'!$D$2:$AY,TRANSPOSE(SIGN(COLUMN(('E0:Sample'!$D$2:$AY)))))),FILTER(OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2),NOT(ISBLANK(OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2))))),3)
The formula works fine, as long as there are no blanks in 'E0:Sample'!$D$2:$AY. Otherwise the error message Function MMULT parameter 1 expects number values. But '' is a empty and cannot be coerced to a number. is thrown.
I've tried to filter() for empty rows, but the filter function won't work since the ranges differ.
What is the best way to solve this?
Thanks!
It's difficult to test your complete formula, but I did a test on a mini-version of matrix multiply and it seems that you can use the N function the same way as you can in Excel. Here is my mini-test:-
=ArrayFormula(MMULT(n(B1:G1),n(A1:A6)))
where both ranges contain a mix of numbers, alphas and blanks. Non-numeric cells are treated as zeroes.
Reference
I'm not totally clear about the context for this - I think you're trying to get the row sums from your large 2D array by using the mmult - if this is correct I think my answer is OK because the blanks would contribute nothing to the sums. Since CORREL ignores blanks in the second range, you don't need to filter at all?
I did eventually set up some test data for your formula, and my formula ended up like this:-
=ROUND(CORREL(ARRAYFORMULA(MMULT(n('E0:Sample'!$D$2:$AY),TRANSPOSE(SIGN(COLUMN(('E0:Sample'!$D$2:$AY)))))),OFFSET('E0:Sample'!$D$2:$D,0,ROW()-2)),3)
I have a column of numbers. I want to know if there are any duplicates. I don't need to know how many or what their value is. I just want to know if there are any.
The best way I could figure out was to have another column of equal height to the column of numbers, with the formula:
=countif(A:A,A1)>1
So this will put a TRUE next to every number that has one or more duplicates in the list.
From here I need to see if this second column contains a TRUE.
So I have a final cell with this formula in it:
=lookup(true, B:B)
This always displays FALSE, even when there are duplicates in the list, with corresponding "TRUE" values next to them in column B.
Also, is there a simpler way of solving this problem?
Note: I can get it to work if the single cell result simply does an =OR(B:B) but I still want to know why my first way won't work and if there is an all around simpler way of doing this.
you can use both =unique(A:A) and also =counta(unique(A:A))
note: the A:A is just a dummy array i threw in for example, replace with whatever column you want to refer to.
to get a final yes or no, you could nest it together by putting =if(eq(counta(A:A),counta(unique(A:A))),"No Duplicates", "Contains Duplicates")
I'm not sure whether simpler (I am confident the formula could be simplified!) but copy/pasting the following might be deemed so:
=sum(if(ARRAYFORMULA(countif(A:A,A1:A)>1),1,0))
This should return 0 only if there are no duplicates. If a single entry is repeated twice (three instances) and all other values are unique, the result should be 3.
The TRUE behaviour is curious, as it is not what I expected and it differs from Excel, where true would be converted to TRUE, which normally indicates an automatic change from text to function. I don't have an explanation, but it may be connected with lookup, because the boolean behaves as I would expect in, say, an if formula.
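As a possible single-cell alternative for a column of numbers, the count of all entries can be compared with the count of distinct entries. This is only a sketch and assumes the numbers are in column A:

```
=COUNTA(A:A)<>COUNTUNIQUE(A:A)
```

This returns TRUE when at least one value appears more than once. As for the helper-column approach, one likely factor is that LOOKUP is documented to work properly only on sorted data, which a column of mixed TRUE/FALSE values generally is not. An exact-match test on column B avoids that assumption:

```
=COUNTIF(B:B, TRUE)>0
```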