I am having some difficulties summing up some values in Google Sheets. In my spreadsheet, values and bonuses from multiple other tabs are combined into one cell (cell B1 in this example). The format of each "unit" of data is Name,5%xxx (where "Name" is the name of the item, "5%" is the value I want to add, almost always a percentage, and "xxx" separates one unit from the next). As you can see in cell B1, there are two instances where "Parkour" receives a bonus to sum up (from different sources).
B1: Parkour,5%xxxParkour (Subskill: Sense of Balance),10%xxxParkour,2%xxx
A2: Parkour, B2: 0.07
A3: Parkour (Subskill: Sense of Balance), B3: (blank)
A4: H2H Combat: Parkour, B4: (blank)
The formula in cell B2 is:
=IFERROR(SUM(ARRAYFORMULA(IFERROR(VALUE(MID(FILTER(SPLIT(TEXTJOIN("",TRUE,filter(B$1,regexmatch(B$1,$A2)=TRUE)),"xxx"),SEARCH($A2,SPLIT(TEXTJOIN("",TRUE,filter(B$1,regexmatch(B$1,$A2)=TRUE)),"xxx"))),len($A2)+2,1000)),""))),"")
(Dragged down through the rest of the list) (Could not figure out how to make the formula "in line" on the question.)
Expected Results:
B2 = .07 (Working)
B3 = .1 (Not working)
B4 = Blank (Working)
The goal of the formula is to look into cell B1 and split everything out by "xxx", then filter the resulting array down to the items that exactly match the line item in column A, then split again by the comma and add up those values. It worked for the first line item, but not the second. (I am unsure why, but I strongly believe it has something to do with the parentheses. When I removed the parentheses from the name in column A (and adjusted cell B1 to match), it worked. However, given the structure of the data, the parentheses are required, and I need to find a way for the formula to work with them.)
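The suspicion about the parentheses is consistent with how regular expressions behave: REGEXMATCH treats its second argument as a pattern, and in a pattern unescaped parentheses form a group instead of matching a literal "(" and ")". A minimal Python sketch of the same effect, assuming Python's re module is an acceptable stand-in for the Sheets regex engine on this simple case (the variable names are illustrative only):

```python
import re

cell_b1 = "Parkour,5%xxxParkour (Subskill: Sense of Balance),10%xxxParkour,2%xxx"
name = "Parkour (Subskill: Sense of Balance)"

# Used as a raw pattern, "(" and ")" delimit a group, so the pattern really
# looks for the text "Parkour Subskill: Sense of Balance" (no parentheses).
raw_match = re.search(name, cell_b1)

# Escaping the name makes every character match literally.
escaped_match = re.search(re.escape(name), cell_b1)

print(raw_match is not None)      # False
print(escaped_match is not None)  # True
```

In Sheets, the analogous fix would presumably be to escape the special characters in the name before passing it to REGEXMATCH, or to use a comparison that does not interpret the name as a pattern.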
When I removed the IFERROR wrap around it in cell B3, I get this error note:
Function SUM parameter 1 expects number values. But " is a text and cannot be coerced to a number.
Any help is greatly appreciated.
You may find it useful to combine SPLIT with QUERY like this. It will group the names and sum the percentages:
=QUERY(INDEX (IFERROR(SPLIT(FLATTEN(INDEX(SPLIT(B1:B100,"xxx"))),","))),"SELECT Col1,SUM(Col2) where Col1 is not null group by Col1")
PS: I invented a couple of extra lines for the example.
UPDATE
I had thought you had a different goal; try this formula instead. Starting from the table generated by QUERY, I used VLOOKUP to match the first column and return the second one:
=INDEX(IFERROR (VLOOKUP(A2:A,QUERY(INDEX (SPLIT(FLATTEN(SPLIT(B1,"xxx")),",")),"SELECT Col1,SUM(Col2) where Col1 is not null group by Col1"),2,0)))
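The split, group, and look-up steps of that formula can be sketched in Python. This is only an illustration of the logic, not a translation of the formula: sum_bonuses is a hypothetical helper, every value is assumed to be a percentage, and the name is taken to be everything before the last comma.

```python
from collections import defaultdict

def sum_bonuses(cell: str) -> dict[str, float]:
    """Split on "xxx", then on the last comma, and total the percentages per name."""
    totals: dict[str, float] = defaultdict(float)
    for unit in cell.split("xxx"):
        if not unit:
            continue  # the trailing "xxx" leaves an empty piece
        name, _, value = unit.rpartition(",")
        totals[name] += float(value.rstrip("%")) / 100
    return dict(totals)

totals = sum_bonuses("Parkour,5%xxxParkour (Subskill: Sense of Balance),10%xxxParkour,2%xxx")

# Look-up step: names without an entry give a blank, like IFERROR around VLOOKUP.
for wanted in ["Parkour", "Parkour (Subskill: Sense of Balance)", "H2H Combat: Parkour"]:
    print(wanted, round(totals[wanted], 2) if wanted in totals else "")
```

Because the names are compared as plain strings here, the parentheses need no special treatment.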
I want an ArrayFormula in C1 that gives the required result, as shown.
Entry sheet:
(Column C is my required column)
Date Entered is the date when the Name is assigned a group, i.e. a, b, c, d, e, f.
Criteria:
The value of count is based purely on Date Entered (if john is assigned a on the lowest date (10-Jun), his count value is 1; if rose is assigned a on the second-lowest date (17-Jun), her count value is 2).
The value of count does not change when the data is sorted in any manner, because the Date Entered values are permanent and do not change.
A new entry can have any date, not necessarily the highest one (if a new entry with the name Rydu is assigned a on 9-Jun, its count value becomes 1, john's (10-Jun) becomes 2, and so on).
Example:
After I sort the data in any random order say like this:
Random ordered sheet:
(Count value remains permanent)
And when I add new entries in between (rows 4 and 14) and after the last row (row 17):
Random Ordered sheet:
(It doesn't matter where I add them)
I already have an ArrayFormula which gives the required result:
={"AF Formula1"; ArrayFormula(IF(B2:B="", "", COUNTIFS(B$2:B, "="&B2:B, D$2:D, "<"&D2:D)+1))}
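What the COUNTIFS version computes can be sketched in Python: for each row, count the rows in the same group with an earlier date and add 1. The names john, rose and Rydu and their days come from the question; the year, the row for "mike", and the helper name counts_by_date are assumptions made for the example.

```python
from datetime import date

def counts_by_date(rows):
    """rows: list of (name, group, date_entered). Returns one count per row."""
    return [
        1 + sum(1 for _, g2, d2 in rows if g2 == group and d2 < entered)
        for _, group, entered in rows
    ]

rows = [
    ("john", "a", date(2021, 6, 10)),
    ("rose", "a", date(2021, 6, 17)),
    ("mike", "b", date(2021, 6, 12)),  # hypothetical row in another group
]
print(counts_by_date(rows))  # [1, 2, 1]

# A new entry with an earlier date shifts the later ones in the same group.
rows.append(("Rydu", "a", date(2021, 6, 9)))
print(counts_by_date(rows))  # [2, 3, 1, 1]
```

The result depends only on the group and date values, so reordering the rows reorders the counts with them but never changes the count attached to a given row.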
I'm not looking for another ArrayFormula as a solution. What I want to know is what is wrong with my own ArrayFormula, and how I can correct it.
I tried to work out my own ArrayFormula, but it's not working:
I have a formula for each cell:
=RANK($D2,FILTER($D$2:$D, $B$2:$B=$B2),1)
I figured out that FILTER doesn't work with ArrayFormula, so I had to take a different approach.
I took help from an answer to my previous question (the ArrayFormula at H3), which was similar because in both cases the per-cell FILTER formula returns more than one value. (It was actually answered by player0.)
Using the same technique, I came up with this formula, which works absolutely fine:
=RANK($D2, ARRAYFORMULA(TRANSPOSE(SPLIT(VLOOKUP($B2, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ","))), 1)
Now when I tried converting it to ArrayFormula:
($D2 to $D2:$D & $B2 to $B2:$B)
=ARRAYFORMULA(RANK($D2:$D,TRANSPOSE(SPLIT(VLOOKUP($B2:$B, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ",")), 1))
It gives me the error "Did not find value '' in VLOOKUP evaluation". I figured out that the problem occurs in VLOOKUP only when I change $B2 to $B2:$B.
I'm sure VLOOKUP works with ArrayFormula, so I fail to understand where my formula is going wrong. Please help me correct my ArrayFormula.
Here is the editable sheet link
if I understand correctly, you are trying to "rank" B column based on D column dates in such a way that dates are in theoretical ascending order, so if you randomize your dataset, the "rank" of each entry would stay the same and not change based on the randomness you introduce.
therefore the correct formula would be:
={"fx"; INDEX(IFNA(VLOOKUP(B2:B&D2:D,
{INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1),
IFERROR(1/(1/COUNTIFS(
INDEX(SORT(B2:D, 3, 1),,1),
INDEX(SORT(B2:D, 3, 1),,1), ROW(B2:B), "<="&ROW(B2:B))))}, 2, 0)))}
{"fx"; ...} array of 2 tables (header & actual table) under each other eg. ;
outer shorter INDEX or longer ARRAYFORMULA (doesnt matter which one) is needed coz we are processing an array
IFNA for removing possible #N/A errors from VLOOKUP function when VLOOKUP fails to find a match
we VLOOKUP joint B and D column B2:B&D2:D in our virtual table {} and return the second column 2 if there is an exact match 0
our virtual table {INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1), ...} we VLOOKUP from is constructed with 2 columns next to each other eg. ,
we are getting the first column by creating an array of 2 columns {B2:B&D2:D, D2:D} next to each other where we SORT this array by date/2nd column 2, in ascending order 1 but all we need after sorting is the 1st column so we use INDEX where we bring all rows ,, and the first column 1
now lets take a look at how we are getting the 2nd column of our virtual table by using COUNTIFS which will mimic the "rank"
IFERROR(1/(1/ is used to remove all zero values from the output (all empty rows would have 0 in it as the "rank")
under COUNTIFS we put 2 pairs of arguments: "if column is equal to column" and "if row is larger or equal to next row increment it by 1" ROW(B2:B), "<="&ROW(B2:B))
for "if column is equal to column" we do this twice and use range B2:D and sort it by date/3rd column 3 in ascending order 1 and of this we again need only the 1st column so we INDEX it and return all rows ,, and first column 1
with this formula you can add, remove or randomize your dataset and you will always get the right value for each of your rows
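The idea behind that formula (sort by date, keep a running count per group, then look each original row up by its joined group and date key) can be sketched in Python. This is an illustration under assumptions, not the formula itself: the function name is hypothetical, dates are reduced to plain day numbers, and each group and date combination is assumed to be unique.

```python
from collections import Counter

def rank_by_sorted_running_count(rows):
    """rows: list of (group, day). Returns the per-row count in input order."""
    running = Counter()
    lookup = {}
    # Walk the rows in ascending date order, counting occurrences per group.
    for group, day in sorted(rows, key=lambda r: r[1]):
        running[group] += 1
        lookup[(group, day)] = running[group]
    # Map the counts back onto the rows in their original (possibly random) order.
    return [lookup[(group, day)] for group, day in rows]

rows = [("a", 17), ("b", 12), ("a", 10), ("a", 9)]
print(rank_by_sorted_running_count(rows))  # [3, 1, 2, 1]
```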
as for why your formula doesnt work... to not get #N/A error for vlookup you would need to define the end row of the range but still, the result wont be as you would expect coz formula is not the right one for this job.
as mentioned there are functions that are not supported under AF like SUM,AND,OR and then there are also functions which work but in a different way like IFS or with some limitations like SPLIT,GOOGLEFINANCE,etc.
I have answered you on the tab in your shared sheet called My Practice thusly:
You cannot split a two column array as you have attempted to do in cell CI2. That is why your formula does not work. You can only split a ONE column array.
I understand you are trying to learn, but attempting to use complicated formulas like that is going to make it harder I'm afraid.
Google sheets user here.
I am using the MINIFS formula to return the lowest match (out of multiple possible matches). Is there a way I can use ARRAYFORMULA as well to auto-populate an entire column, so I don't need to copy the same formula down the whole column?
Sample data below:
Columns D and J contain manually entered data. Column I contains the formula(s).
Essentially what I want to do here is:
Look at Column D - sees the name "Tom"
Sees that "Tom" has 3 scores 100, 90, 70 in Column J
Formula slaps "70" back into Column I because that is the lowest score
Repeats logic for "John" and "Mary"
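The steps above can be sketched in Python as a per-name minimum that is then written back next to every row. Tom's scores come from the question; the scores for John and Mary are made up for the example, and the list names are illustrative.

```python
names = ["Tom", "John", "Tom", "Mary", "Tom", "John"]   # column D
scores = [100, 85, 90, 60, 70, 95]                       # column J

# First pass: lowest score seen for each name.
lowest = {}
for name, score in zip(names, scores):
    lowest[name] = min(score, lowest.get(name, score))

# Second pass: one result per row, which is what column I should show.
column_i = [lowest[name] for name in names]
print(column_i)  # [70, 85, 70, 60, 70, 85]
```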
Note: The actual data type for column J and I is a date instead of a number. But it is easier to illustrate the problem this way.
I can do this elegantly with the formula =minifs(J:J,D:D,D2) in row 2, and the same with D3, D4, D5, D6, etc. in the rows below.
However, I have to drag the formula down the entire column manually. This is a problem because my colleagues often insert rows in between (and forget to copy and paste the formula into column I). Is there a way I can auto-populate the entire column, like I could with an ARRAYFORMULA?
Assuming your data is in A2:C, you can get the min or max of each row this way (you can also add a condition in the query):
=query(transpose(query(transpose(A2:C),"select " & "min(Col"&arrayformula(textjoin("),min(Col",,row(A2:C)-1))&")")),"select Col2")
https://docs.google.com/spreadsheets/d/1Ia05jywxlvT2amFDG4vQhYOd0lo68FKdOY733MzU-MQ/copy
I have a working formula that I need to drag to autofill down a column and want to make it into an array formula:
=AVERAGEIF(INDIRECT("A2:A"&ROW()), ">=0",INDIRECT("A2:A"&ROW()))
So if you put this formula in column B it will take the values in column A and continually average them going down, skipping any values that are less than 0. Here is an example screenshot: https://i.imgur.com/nRq8hAH.png
How can I make an array formula for this?
This formula comes close but I couldn't figure out how to add the ">=0" conditional:
=ArrayFormula(IF(LEN(A2:A),SUMIF(ROW(A2:A),"<="&ROW(A2:A),A2:A)/COUNTIF(ROW(A2:A),"<="&ROW(A2:A)),))
Lambda Update
There is no longer any need to use ArrayFormula for this.
=MAP(SEQUENCE(COUNTA(A2:A)),
LAMBDA(rowOff,
AVERAGEIF(OFFSET(A2,0,0,rowOff),">=0"))
)
How?
For each element rowOff in 1..# items in column:
Use AVERAGEIF to get the average of everything starting at the top and taking rowOff rows, keeping only values >=0 (that is, excluding everything below 0)
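The same loop can be written out in Python as a rough sketch. The input values are hypothetical, and returning None where no value qualifies is an assumption standing in for the error AVERAGEIF would give in that case.

```python
def running_average_non_negative(values):
    """For each prefix of values, average only the entries that are >= 0."""
    out = []
    for row_off in range(1, len(values) + 1):
        kept = [v for v in values[:row_off] if v >= 0]
        out.append(sum(kept) / len(kept) if kept else None)
    return out

print(running_average_non_negative([2, -1, 4, -3, 0, 0]))
# [2.0, 2.0, 3.0, 3.0, 2.0, 1.5]
```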
Old solution
Here's a single formula that can go into B2 (no need to drag), but it's fairly complicated:
=ArrayFormula(IFERROR(IF(LEN(A2:A),MMULT(TRANSPOSE((SEQUENCE(COUNTA(A2:A),1,2)<=TRANSPOSE(SEQUENCE(COUNTA(A2:A),1,2)))*FILTER(A2:A,LEN(A2:A))),--(FILTER(A2:A,LEN(A2:A))>0))/COUNTIFS(SEQUENCE(COUNTA(A2:A)),"<="&SEQUENCE(COUNTA(A2:A)),FILTER(A2:A,LEN(A2:A)),">=0"),"")))
Readable:
=ArrayFormula(IFERROR(
  IF(
    LEN(A2:A),
    MMULT(
      TRANSPOSE(
        (SEQUENCE(COUNTA(A2:A),1,2)<=
          TRANSPOSE(SEQUENCE(COUNTA(A2:A),1,2))
        )*FILTER(A2:A,LEN(A2:A))
      ),
      --(FILTER(A2:A,LEN(A2:A))>0)
    )/
    COUNTIFS(
      SEQUENCE(COUNTA(A2:A)),
      "<="&SEQUENCE(COUNTA(A2:A)),
      FILTER(A2:A,LEN(A2:A)),
      ">=0"
    ),
    ""
  )
))
How?
We can achieve a running sum using MMULT on a Lower Triangular Matrix of size COUNTA(A2:A) of all 1's and all non blanks of A2:A, which we filter out if the number is negative. In this case, it produces {2;2;6;6;6;6}.
The COUNTIFS() produces an array of the number of elements we want to divide by. Here, it's {1;1;2;2;3;4}
Then ignore any blanks at the end with IF.
Blank out any errors with IFERROR. (#DIV/0! errors can happen if the leading numbers are negative.)
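A small Python sketch of the triangular-matrix idea, written with plain lists so each step is visible. The input values are hypothetical, picked so that the running sums and counts reproduce the two arrays quoted above; the variable names are illustrative.

```python
values = [2, -1, 4, -3, 0, 0]
n = len(values)

# Zero out the negatives so they do not contribute to the sum.
kept = [v if v > 0 else 0 for v in values]

# Lower triangular matrix of ones: row r selects elements 0..r.
lower = [[1 if col <= row else 0 for col in range(n)] for row in range(n)]

# Matrix-vector product gives the running sum.
running_sum = [sum(lower[r][c] * kept[c] for c in range(n)) for r in range(n)]

# Number of non-negative elements seen so far (the COUNTIFS part).
running_count = [sum(1 for v in values[: r + 1] if v >= 0) for r in range(n)]

# Dividing gives the running average; None stands in for a division error.
averages = [s / c if c else None for s, c in zip(running_sum, running_count)]

print(running_sum)    # [2, 2, 6, 6, 6, 6]
print(running_count)  # [1, 1, 2, 2, 3, 4]
print(averages)       # [2.0, 2.0, 3.0, 3.0, 2.0, 1.5]
```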
Perhaps, this formula can help:
=ARRAYFORMULA(AVERAGE(IF($A$2:A2>=0,$A$2:A2,"")))
I've got the following Google spreadsheet:
item have ready need1 need2 need3
A 1 2 1
B 1 2 1 1
C 2 2
etc
I want to fill ready column as follows:
find the first column in need1, ..., needN range which has a non-empty value
if the value found is less than or equal to the value in have column, set ready column to something cheerful (e.g. yes)
if the value found is larger than the value in have column, don't do anything
So above input, when processed should look like this:
item have ready need1 need2 need3
A 1 2 1
B 1 2 1 1
C 2 yes 2
For the first step I found a suggested solution, which did not work for me:
=INDEX( SORT( FILTER( D10:H10 , LEN( D10:H10 ) ) ,
FILTER( COLUMN( D10:H10 ) , LEN( D10:H10 ) ) , 0 ) , 1 )
(it returns #REF!) Not sure what's wrong with it or how to proceed to the next step.
Thanks in advance!
If you know how many need columns you have, or even just how many columns are on the sheet, this is quite straightforward. If not and you need to look at the entire row, you might have to redesign a bit to avoid a circular reference from the cell with the formula being part of that row.
Your second two steps are fairly simple either way - you want one of two results based on a condition, so you're going to want to use =IF. Your condition is that the 'need' number is less than or equal to the 'have' number, and you want it to say 'yes' if that's true, and nothing if it isn't. So, that gives us:
=IF(need<=have, "Yes", "")
The examples below assume your table above starts from cell A1 in the top left, and that the last column in your sheet is Z
Next we need to find 'need' and 'have'. Finding 'have' is pretty easy - it's just the number in column B.
Finding 'need' is slightly more complicated. You've got the right idea using INDEX and FILTER, but your formula seems a little overcomplicated. Basically we can use FILTER to filter out the blank values, and INDEX to find the first one that is left. First, FILTER:
The range you want to filter from is everything in the same row from column D to column Z (or whatever the final column is), and the condition you want to filter for is that those same cells are not blank. For the formula you're typing into cell C2, that gives us:
=FILTER(D2:Z2, D2:Z2<>"")
Next, INDEX: If you give INDEX an array, a row number, and a column number, it will tell you what is in the cell where that row and column meet. As we've filtered out the blanks, we just want whatever is left in the first column of our filtered array, which gives us:
=INDEX(FILTER(D2:Z2, D2:Z2<>""), 1, 1)
Or, as we only have one row in our array, and INDEX is pretty smart, simply:
=INDEX(FILTER(D2:Z2, D2:Z2<>""), 1)
So to bring it all together, our final formula for cell C2 is:
=IF(INDEX(FILTER(D2:Z2, D2:Z2<>""), 1)<=B2, "Yes", "")
Then just drag the formula down for as many rows as you need. If your sheet is or becomes wider, just change Z to whatever your last column is.
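The logic of the final formula can be sketched in Python for one row at a time. The function name ready and the sample rows are hypothetical, None stands for a blank cell, and returning an empty string when every need cell is blank is an assumption (in the sheet, FILTER with no matches gives an error instead).

```python
def ready(have, needs):
    """Return "Yes" if the first non-blank need is <= have, otherwise ""."""
    # FILTER + INDEX(..., 1): first value that is not blank, if any.
    first = next((n for n in needs if n is not None), None)
    # IF(need <= have, "Yes", "")
    return "Yes" if first is not None and first <= have else ""

print(ready(2, [None, 2, None]))     # Yes
print(repr(ready(1, [2, 1, None])))  # ''
```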
When you don't know the size of a range, use functions row, column, rows, columns.
Simple formula
Here's an example of what you are looking for:
=if(INDEX(FILTER(OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1),OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1)<>""),1)<=B2,"yes","")
this part of formula:
OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1)
returns the range starting from given cell (D2) to the end of Sheet (COLUMNS(1:1)-column(D2)+1)
ArrayFormula
I suggest using ArrayFormula, it'll expand automatically:
=ARRAYFORMULA(if(REGEXEXTRACT(SUBSTITUTE(trim(transpose(query(transpose(OFFSET(D2,,,COUNTA(A2:A),COLUMNS(1:1)-column(D2)+1)),,COLUMNS(OFFSET(D2,,,COUNTA(A2:A),COLUMNS(1:1)-column(D2)+1)))))," ",", "),"\d+")*1<=OFFSET(B2,,,COUNTA(A2:A)),"yes",""))
It assumes that 'Item' column has no blank values.
The solution from @Max Makhrov works, and has the advantage of using a single formula for the whole column.
However, it assumes that all of the columns to the right of your ready column (D onward) are need_ columns.
The solution from @dmusgrave also works, provided you remove the extra "=" before INDEX:
=IF(INDEX(FILTER(D2:Z2,D2:Z2<>""),1)<=B2,"Yes","").
However, it makes the same assumption, and also limits at column Z.
Such assumptions seem reasonable, but if they are limiting you, here's how you can have any number of need_ columns starting right of your ready column:
=IF(INDEX(FILTER(INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1, 4)="need")))&row() ), INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1,4)="need")))&row() )<>""),1)<=B2,"Yes","")
The idea is simply to replace D2:Z2 (in @dmusgrave's solution) with:
INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1, 4)="need")))&row() )
Explanation: You start from D at current row, and you go until the last need_ column on the same current row.
CHAR(68) is D, to which you add the number of columns titled need.*, minus one (hence the 67).
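The column-letter arithmetic can be checked with a short Python sketch. The header list and the function name last_need_column are illustrative assumptions; chr plays the role of CHAR.

```python
def last_need_column(header_row):
    """Letter of the last need column, assuming the first one is in column D."""
    need_count = sum(1 for h in header_row if h[:4] == "need")
    # chr(68) is "D"; adding the count minus one gives 67 + need_count.
    return chr(67 + need_count)

headers = ["item", "have", "ready", "need1", "need2", "need3"]
print(last_need_column(headers))  # F
```

Like the CHAR version, this only produces a valid single-letter column up to Z, that is, for at most 23 need columns.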
Using the same logic, you can easily make your formula more robust or generic, for example to handle need_ columns that do not start immediately to the right of the ready column.