Within my Google Sheet I need to find a value in a specific range (say A15:A45) when the value in the same row of another column (say column D) exceeds a specific cell value (say C20). How do I write the syntax?
I tried combining VLOOKUP and IF, and I also tried QUERY. However, I think I am making a mistake in the arguments: either I get an error or I get no result.
Assuming C20 and column D are numeric (other than a label), then perhaps:
=query(A:D,"select A where D > "&C20&"")
VLOOKUP would not be suitable as this does not "look to its left" and an INDEX/MATCH combination would not be suitable for multiple results.
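To check the logic outside the sheet, here is a rough Python model of what that QUERY is meant to return. The function name, the row layout (a, b, c, d) and the sample data are made up for illustration; skipping text cells only approximates how QUERY treats stray text in a numeric column.

```python
def select_where_greater(rows, threshold):
    """Model of: select A where D > threshold. Each row is (a, b, c, d)."""
    return [
        a
        for a, _b, _c, d in rows
        # Skip non-numeric cells (for example a header label in column D).
        if isinstance(d, (int, float)) and d > threshold
    ]


# Hypothetical sample data; 10 stands in for the value of C20.
rows = [
    ("apple", None, None, 5),
    ("pear", None, None, 12),
    ("plum", None, None, 20),
    ("header", None, None, "label"),
]
result = select_where_greater(rows, 10)
print(result)
```

Every row whose column D value exceeds the threshold is returned, which is why a multi-result function such as QUERY fits better here than INDEX/MATCH.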
I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of the space that precedes the number, and the second pulls out all the digits in a string of " #hrs" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involves stacking your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
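As a way to see what the two REGEXEXTRACT patterns and the grouped sum are doing, here is a small Python sketch using the same regular expressions. The sample cell contents are assumptions (the real sheet may differ), and the function name is hypothetical.

```python
import re
from collections import defaultdict

AREA = re.compile(r"(.*) \d")      # text in front of the space and number
HOURS = re.compile(r" (\d+)hrs")   # digits in a string like " 3hrs"


def hours_per_area(cells):
    """Sum the hours for each area, skipping cells that do not match."""
    totals = defaultdict(int)
    for cell in cells:
        area, hours = AREA.search(cell), HOURS.search(cell)
        if area and hours:
            # int() plays the role of VALUE: the regex result is text.
            totals[area.group(1)] += int(hours.group(1))
    return dict(totals)


# Hypothetical cell contents, already stacked into one long column.
cells = ["Kitchen 3hrs", "Living room 2hrs", "Kitchen 1hrs", ""]
totals = hours_per_area(cells)
print(totals)
```

Looking up totals["Kitchen"] corresponds to typing a room name into B6 in the QUERY version.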
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCH and QUERY calls to do computations. A SUM over a range is a far more compact formula than a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this depends on where you get your inputs from and whether you have any say in their format.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.
Using google sheets, I'm trying to pull the earliest dates for unique values using the query function.
The Raw data looks like this
I want to pull the data so that I only get the first test completed for each unique identifier provided that test was done within 2 days prior or after their ward admission. So it should spit out something like this:
The data range I need
I am using the following formula, which is nearly what I want; it's just including multiple rows for each unique identifier:
=Query(Sheet1!1:952,"Select A,B,C,D,E where C > -2 and C < 2 and C is not null and E is not null Order By D",1)
Results I'm getting with the above formula
I feel like I'm nearly there, I just need to somehow only pull the minimum date values instead of them all. Any help would be really appreciated!
It's actually easier to use SORTN for this than QUERY. Although QUERY can return the minimum of an individual column (e.g. the date) within a group, it doesn't return the values of the other columns from the row where that minimum occurs, which is what you want. SORTN has an option to ignore rows that are duplicates with respect to a particular sort key or keys (in this case, Unique Identifier).
=sortn(sort(filter(A2:E,C2:C<2),4,1,1,1),999,2,4,1)
(If column C can be negative, add another condition to the filter.)
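The idea behind the filter, sort and de-duplicate steps can be sketched in Python as follows. This is only a model under assumed data: the row layout (identifier, test date, days from admission), the sample rows and the function name are all hypothetical, and the window check mirrors the "C > -2 and C < 2" condition from the question.

```python
from datetime import date


def first_test_per_id(rows, window=2):
    """Keep the earliest in-window test for each identifier.

    Each row is (identifier, test_date, days_from_admission).
    """
    kept = {}
    # Sort by date so the first row seen for an identifier is its earliest.
    for ident, test_date, offset in sorted(rows, key=lambda r: r[1]):
        if -window < offset < window and ident not in kept:
            kept[ident] = (ident, test_date, offset)
    return list(kept.values())


# Hypothetical sample rows.
rows = [
    ("P1", date(2021, 3, 4), 1),
    ("P1", date(2021, 3, 3), 0),
    ("P2", date(2021, 3, 1), -1),
    ("P2", date(2021, 2, 20), -10),
    ("P3", date(2021, 3, 9), 5),
]
earliest = first_test_per_id(rows)
print(earliest)
```

Unlike a grouped minimum, each kept entry is a whole row, so the other columns stay attached to the earliest date.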
I am trying to use AVERAGEIFS inside ARRAYFORMULA. Looking at other questions, I have come to the conclusion that it is not possible without using QUERY function.
My intention is to average the values of a column whenever they share the same ID.
I think this question comes pretty close to what I need, but I haven't been able to replicate and adapt its solution on my own sheet.
In this sheet I show the result I expect (I got it by dragging the formula). I've also reviewed the Query Language Reference, unsuccessfully.
Thanks a lot for your time and effort.
So the formula should be
=ArrayFormula(iferror(sumif(A2:A,A2:A,B2:B)/countif(A2:A,A2:A)))
Note that if there were any text values in the points column, this would still return a result (because count would be greater than zero) - you could instead use
=ArrayFormula(if(isnumber(B2:B),(sumif(A2:A,A2:A,B2:B)/countif(A2:A,A2:A)),""))
If you had a mixture of rows with text and rows with numbers for any ID, this would return a smaller result than avg or AVERAGE would, because the text rows are still counted in the denominator. This is a limitation of this method. You can't add an extra condition (that column B has to contain a number) because you would need COUNTIFS, and COUNTIFS isn't array-friendly. It still seems strange that, AFAIK, COUNTIF and SUMIF are the only functions in this family that are array-friendly, while COUNTIFS, SUMIFS, AVERAGEIF etc. are not.
You can do:
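A short Python model of the SUMIF/COUNTIF division makes the limitation concrete. The function name and sample data are invented for illustration; the model assumes SUMIF skips text while COUNTIF counts every row with a matching ID, as described above.

```python
from collections import defaultdict


def average_by_id(ids, points):
    """Return, for every row, sum-of-numbers / count-of-rows for its ID."""
    total = defaultdict(float)
    count = defaultdict(int)
    for ident, value in zip(ids, points):
        count[ident] += 1                      # COUNTIF counts every row
        if isinstance(value, (int, float)):
            total[ident] += value              # SUMIF ignores text values
    return [total[ident] / count[ident] for ident in ids]


# Hypothetical data: ID "b" has one numeric row and one text row.
ids = ["a", "b", "a", "b", "c"]
points = [10, 4, 20, "n/a", 7]
averages = average_by_id(ids, points)
print(averages)
```

For ID "b" the model gives 2.0 rather than 4.0, because the text row still adds to the count.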
=ARRAYFORMULA(IFERROR(VLOOKUP(A2:A; QUERY(A2:B; "select A,avg(B) group by A"); 2; )))
I'm trying to style the first instance of a value in a column. I found this custom formula through googling:
=COUNTIF($A1:$A100,$A1)=1
but this styles the last instance of the value, and I'm not sure why.
try this formula:
=COUNTIF($A$1:$A1,$A1)=1
THE COUNTIF SOLUTION:
The formula =COUNTIF($A$1:$A1,$A1)=1, as suggested by Max, is a common solution to this problem. It is a variation of the formula for finding duplicates: =COUNTIF($A:$A,$A1)>1.
COUNTIF DRAWBACK:
One drawback of the COUNTIF formula is that it relies on the first parameter, $A$1:$A1, for the conditional formatting to evaluate correctly. The formula works the same in conditional formatting as it would if you physically put it in B1 and then copied it down the whole column. The first copy in B1 will appear as the original formula =COUNTIF($A$1:$A1,$A1)=1, but the one in B2 will appear as =COUNTIF($A$1:$A2,$A2)=1.
This can be a real problem and result in false positives or maybe the conditional formatting not working at all if you are doing any sorting, cutting and pasting, dragging and dropping rows or cells, etc.
THE MATCH SOLUTION:
An improved version of this formula that eliminates the possibility of false positives and prevents the range from automatically being updated when it has been sorted, copied, cut, dragged, dropped, etc is as follows:
=MATCH($A1,INDIRECT("$A:$A"),0)=ROW()
EXPLANATION OF MATCH SOLUTION:
The only purpose of INDIRECT is to prevent the range from automatically updating. If you would prefer it to update when you copy and paste, you can instead use: =MATCH($A1,$A:$A,0)=ROW(). The key to this formula working properly is that the second parameter of MATCH covers the entire column, so when it finds the position of the first parameter, that position can be compared to the row number. If there are duplicates within column A, MATCH only returns the position of the first instance. Since the second parameter is the entire column, the position it returns is also the row number of the first instance. So the second part of the formula, =ROW(), compares the first instance's row number to the row number of the current cell; if they are identical, the entire formula returns TRUE.
ADAPT MATCH SOLUTION TO FIND DUPLICATES (after first instance):
The MATCH formula can also be adapted to find all duplicates after the first entry (basically the inverse) by changing the last part of the formula from =ROW() to <ROW(). So the duplicate-finding formula would be: =MATCH($A1,INDIRECT("$A:$A"),0)<ROW()
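The three conditional-formatting rules discussed in this thread can be compared with a small Python model. It is only a sketch: the function names and sample column are made up, and the model assumes the column is shorter than the 100-row window of the original formula.

```python
def flags_original(values):
    """Model of =COUNTIF($A1:$A100,$A1)=1: the range starts at the current
    row, so it counts from that row downward and flags the LAST instance."""
    return [values[i:].count(v) == 1 for i, v in enumerate(values)]


def flags_countif(values):
    """Model of =COUNTIF($A$1:$A1,$A1)=1: counts from the top down to the
    current row, so only the FIRST instance has a count of 1."""
    return [values[: i + 1].count(v) == 1 for i, v in enumerate(values)]


def flags_match(values):
    """Model of =MATCH($A1,$A:$A,0)=ROW(): the 1-based position of the
    first occurrence is compared with the current row number."""
    return [values.index(v) + 1 == row for row, v in enumerate(values, start=1)]


# Hypothetical column A contents.
column = ["x", "y", "x", "z", "y"]
print(flags_original(column))
print(flags_countif(column))
print(flags_match(column))
```

The first function shows why the formula from the question styles the last instance, while the other two agree on the first instance.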
I'd like to quickly include or exclude an entire range of values in a SUM.
Presently I'm SUMing select cells for a grand total: [E19] =SUM(E13,E20,E30,E45,E55,E70,E80)
These are in turn SUMs of selected ranges:
... [E30] =SUM(E31:E44), [E55] =SUM(E56:E69), ...etc.
I would like to toggle the inclusion of one of these ranges in the Grand Total.
It seemed the best way to do it was this:
[E45] =SUMIF(D45,"☑",E46:E54)
In short, in cell E45 I'd like to SUM E46 to E54 only if D45 contains a ☑.
However, Google Docs' SUMIF seems to only work with matched ranges: =SUMIF(D46:D54,"☑",E46:E54)
Is there a way to SUM a range only if a specific value exists in a single cell?
You're right about SUMIF: it allows you to sum values from a range that meet certain criteria (checked against another range of the same length). For example, if you had two columns called "status" and "price", you could use it to sum all the prices for a given status.
What you're trying to do can be done, instead, with the use of the IF function:
=IF(D45="☑";SUM(E46:E54);0)
If the condition specified in the first argument is true, it will return the second argument, that is, the sum. Otherwise, it will return the third argument, 0.
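The same toggle can be written as a tiny Python sketch, with a hypothetical function name and made-up values standing in for D45 and E46:E54:

```python
def toggled_sum(flag, values, mark="☑"):
    """Model of =IF(D45="☑";SUM(E46:E54);0): sum only when the flag cell
    contains the check mark, otherwise contribute 0."""
    return sum(values) if flag == mark else 0


# Hypothetical values for E46:E54.
amounts = [5, 10, 15]
included = toggled_sum("☑", amounts)
excluded = toggled_sum("", amounts)
print(included, excluded)
```

Because the excluded case returns 0, the result can be added to the grand total unconditionally.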
After working through the logic to share the issue, I wound up identifying a solution. Rather than trying to force SUMIF to check a single cell against a range, I just nested the 1:1 SUMIF inside my 'Grand SUM': =SUM(E13,E20,E30,SUMIF(D45,"☑",E45),E55,E70,E80).