I have a row of random numbers and want to show their rank order (smallest first):
In this example, the random numbers are in row 2 and their order is in row 1. Note that columns A and C both have the same values and therefore have the same ranking.
I am not opposed to doing this as a function, but I am looking for an elegant solution.
Turns out there is a rank function!
=rank(A2,$A2:$F2,TRUE)
Using the RANK function, I believe you'll wind up with no rank of 4 in your sample scenario, given the two ties at rank 3.
If you don't care about that, you can use an array formula that will deliver all results at once. Clear all of Row 1 and then put this formula in cell A1:
=ArrayFormula(RANK(FILTER(2:2,2:2<>""),FILTER(2:2,2:2<>""),1))
This formula will deliver all results and it will continue to rank if more numbers are added into Row 2.
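To illustrate the tie behaviour described above, here is a minimal Python sketch of ascending ranking where tied values share a rank and the next rank is skipped. The function name and the sample numbers are made up for illustration (chosen so that the first and third values tie at rank 3); they are not taken from the original sheet.

```python
def competition_rank(values):
    """Ascending rank: ties share a rank, and the following rank is skipped."""
    # A value's rank is 1 plus the count of strictly smaller values.
    return [1 + sum(other < v for other in values) for v in values]


# Hypothetical row 2 values; the 1st and 3rd entries are tied.
row2 = [5, 1, 5, 8, 3, 9]
print(competition_rank(row2))  # → [3, 1, 3, 5, 2, 6]  (no rank 4)
```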
If you do care about missing rankings when there is a tie, you can still use an array formula; it's just a bit more complex. Again, clear all of Row 1 and place the following formula in A1:
=ArrayFormula(VLOOKUP(FILTER(2:2,2:2<>""),{SORT(UNIQUE(TRANSPOSE(FILTER(2:2,2:2<>"")))),SEQUENCE(COUNTA(UNIQUE(TRANSPOSE(FILTER(2:2,2:2<>"")))))},2,FALSE))
This is one of many approaches.
The section between the curly brackets { } builds a two-column virtual array: on the left, an ascending sorted list of the unique numbers; on the right, a sequence that starts at 1 and runs up to the number of elements in that list.
FILTER considers only those cells in Row 2 (i.e., 2:2) that are not null, which will continue to expand as new cells are added.
VLOOKUP looks up each Row 2 value in the first column of the virtual array held between those curly brackets and returns the number in column 2 (i.e., the rank).
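The same idea (a sorted unique list paired with 1..n, then a lookup per value) can be sketched in Python. This is only a model of the logic, not of how Sheets evaluates the formula; the function name and sample values are hypothetical.

```python
def dense_rank(values):
    """Rank without gaps: ties share a rank and the next rank follows directly."""
    # Sorted unique values paired with 1..n, like {SORT(UNIQUE(...)), SEQUENCE(...)}.
    lookup = {value: rank for rank, value in enumerate(sorted(set(values)), start=1)}
    # One exact-match lookup per original value, like VLOOKUP(..., 2, FALSE).
    return [lookup[v] for v in values]


row2 = [5, 1, 5, 8, 3, 9]  # hypothetical data with one tie
print(dense_rank(row2))  # → [3, 1, 3, 4, 2, 5]
```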
This formula assumes that any cells in Row 2 will be contiguous and will be real numbers. If for some reason that isn't the case, the following error controls will allow for gaps or non-numerical entries in Row 2 while still returning the rank of all numbers:
=ArrayFormula(IFERROR(VLOOKUP(2:2,{SORT(UNIQUE(TRANSPOSE(FILTER(2:2,ISNUMBER(2:2))))),SEQUENCE(COUNTA(UNIQUE(TRANSPOSE(FILTER(2:2,ISNUMBER(2:2))))))},2,FALSE)))
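As a rough Python model of that error-controlled version: only numeric cells are ranked, and anything else (blanks, text) yields an empty result in its position, much as IFERROR leaves the cell blank. The helper names and sample cells are assumptions for illustration.

```python
def is_number(cell):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def dense_rank_tolerant(cells):
    """Dense-rank the numeric cells; return '' for gaps and non-numeric entries."""
    numbers = sorted({c for c in cells if is_number(c)})
    lookup = {value: rank for rank, value in enumerate(numbers, start=1)}
    return [lookup[c] if is_number(c) else "" for c in cells]


print(dense_rank_tolerant([5, "", "n/a", 1, 5]))  # → [2, '', '', 1, 2]
```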
I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of the last space that is followed by a digit (the .* is greedy), and the second pulls out the digits from a string of " #hrs" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involves stacking your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
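For readers who want to see the extraction logic outside of Sheets, here is a small Python sketch using the same two patterns. The cell contents ("Kitchen 3hrs" and so on) are invented, since the sample sheet's data is not reproduced here, and unlike REGEXEXTRACT (which returns an error on a non-matching cell) this sketch simply skips cells that don't match.

```python
import re
from collections import defaultdict

AREA = re.compile(r"(.*) \d")      # text before the last space-plus-digit
HOURS = re.compile(r" (\d+)hrs")   # digits in a " #hrs" string


def extract_rows(columns):
    """Stack the columns into one list, then pull (area, hours) from each cell."""
    stacked = [cell for column in columns for cell in column]
    rows = []
    for cell in stacked:
        area = AREA.search(cell)
        hours = HOURS.search(cell)
        if area and hours:
            # int() plays the role of VALUE: regex results are text.
            rows.append((area.group(1), int(hours.group(1))))
    return rows


def total_by_area(rows):
    """One possible further calculation: total hours per area."""
    totals = defaultdict(int)
    for area, hours in rows:
        totals[area] += hours
    return dict(totals)


# Hypothetical data: two columns of "<area> <n>hrs" cells.
columns = [["Kitchen 3hrs", "Garden 1hrs"], ["Living room 2hrs", "Kitchen 4hrs"]]
rows = extract_rows(columns)
print(rows)
print(total_by_area(rows))
```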
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCH and QUERY calls to do computations. Doing a SUM over a RANGE is a lot more compact of a formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and if you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.
I have a Google Sheet, and I'm trying to see if it's possible to get a consecutive count outputted in a third column based on the values of two other columns.
My columns are:
Column A: Will have a handful of text values that are "grouped" together. Likely around 30 of the same value, until it changes to another value. In the image above, these are text1, and text2.
Column B: Will have one of 3 values assigned to each value in column A. In the image above, these are id1, id2, id3.
Column C: Will output a consecutive count based on the values of the first two columns. My hope is that if there are multiple ID1,ID2 in consecutive order, they'll repeat that first +1 value; while ID3 is always plus 1 to the count. This is what I am trying to show in column C in the layout image above.
I've hit a wall with trying to accomplish this with various COUNTIF iterations.
Thanks for any help, or any better ideas to accomplish something similar.
(I'm hoping for a formula, but open to being pointed into a direction for a script if that's the only way).
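You can try the following formula in C2, filled down (it uses semicolons as argument separators; replace them with commas if your locale uses commas):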
=IF(A2=A1;IF(OR(B2="id3";B2<>B1);C1+1;C1);1)
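The row-by-row logic of that IF formula can be modelled in Python as follows. This is a sketch of what the formula computes, under the assumption that rows arrive as (column A, column B) pairs; the sample rows are invented and may not match the layout in the question's image.

```python
def running_count(rows):
    """Mirror of =IF(A2=A1, IF(OR(B2="id3", B2<>B1), C1+1, C1), 1) per row."""
    counts = []
    prev_group = prev_id = None
    for group, id_ in rows:
        if group != prev_group:
            count = 1                   # new group in column A: restart at 1
        elif id_ == "id3" or id_ != prev_id:
            count = counts[-1] + 1      # id3, or a change of id: add 1
        else:
            count = counts[-1]          # same non-id3 id repeated: keep the count
        counts.append(count)
        prev_group, prev_id = group, id_
    return counts


# Hypothetical rows of (column A, column B).
rows = [
    ("text1", "id1"), ("text1", "id1"), ("text1", "id2"),
    ("text1", "id3"), ("text1", "id3"),
    ("text2", "id1"), ("text2", "id1"), ("text2", "id3"),
]
print(running_count(rows))  # → [1, 1, 2, 3, 4, 1, 1, 2]
```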
It is also possible to do this as an array formula. I used offset ranges for column B in the first Countifs to check for a change in value but this made it a little awkward to get equal-sized arrays:
=ArrayFormula(if(A2:A="","",
countifs({"";B2:B}<>{B2:B;""},true,{A2:A;""},A2:A,row(A:A),"<"&row(A2:A),{B2:B;""},"<>id3")+
countifs(A2:A,A2:A,row(A2:A),"<="&row(A2:A),B2:B,"=id3")
))
Is there a way to sort a Google Sheet by the order in which values are entered into a data validation criteria?
I want to sort the sheet in ascending order (Low, Medium, High) or descending order (High, Medium, Low), not in alphabetical order (High, Low, Medium and Medium, Low, High respectively).
Aaron. The easiest way would be to use a helper column (which you can hide later if you like) wherein you assign numerical values to your Low, Medium and High (presumably 1, 2 and 3 respectively). Then you sort using the numerical column. It's fairly easy to write a one-cell array formula that would assign the numerical values to your labels. The numerical column need not be beside the label column; it can be any column.
Without seeing an actual sample sheet, I can't show you. But hopefully the concept is clear, and you can take it from there.
Added description after sheet was shared:
In the example sheet, Sheet1 Column A contained the Priority in words (Low, Medium, High) and Column B contained "other data." I placed the following array formula into C1:
=ArrayFormula({"Priority Val";IF(A2:A="","",VLOOKUP(A2:A,Data!A:B,2,FALSE))})
The formula is an array formula, hence the ArrayFormula() wrap.
Inside this are curly brackets {} which allow the building of arrays that are not "of a type." In this case, the header is listed first ("Priority Val"). The semicolon means "place the next part underneath." Then a VLOOKUP references every value in Column A (i.e., the priority words) against a simple chart in a second sheet named "Data." In that "Data" sheet, Column A lists your exact words (Low, Medium, High) and Column B lists 1, 2, 3, since VLOOKUP searches the first column of Data!A:B and returns the second. The IF() function just checks to see if a row in Sheet1!A:A is blank. If so, a null is assigned before trying the VLOOKUP; otherwise, every blank row would show an #N/A error.
If you want to make it even more air tight, it's good practice to wrap VLOOKUP in IFERROR(), just in case you misspell something in Sheet1!A:A. That would look like this:
=ArrayFormula({"Priority Val";IF(A2:A="","",IFERROR(VLOOKUP(A2:A,Data!A:B,2,FALSE)))})
And you can avoid misspelling by applying data validation to Sheet1!A2:A, referencing Data!A:A as the only allowable answers. This is not strictly necessary; but I have done it in the sample sheet to show you.
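The helper-column idea, reduced to its essentials, looks like this in Python: map each label to a number and sort on that number instead of on the label. The mapping mirrors the "Data" lookup chart described above; the row contents are hypothetical.

```python
# Label-to-number chart, playing the role of the "Data" sheet.
PRIORITY_VALUE = {"Low": 1, "Medium": 2, "High": 3}


def sort_by_priority(rows, descending=False):
    """Sort (priority, other_data) rows by the numeric value of the priority label."""
    return sorted(rows, key=lambda row: PRIORITY_VALUE[row[0]], reverse=descending)


# Hypothetical sheet rows: (priority label, other data).
rows = [("High", "a"), ("Low", "b"), ("Medium", "c"), ("Low", "d")]
print(sort_by_priority(rows))
# → [('Low', 'b'), ('Low', 'd'), ('Medium', 'c'), ('High', 'a')]
print(sort_by_priority(rows, descending=True))
# → [('High', 'a'), ('Medium', 'c'), ('Low', 'b'), ('Low', 'd')]
```

A misspelled label raises a KeyError here, which is the Python counterpart of the #N/A that IFERROR suppresses in the formula.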
I have a QUERY that seems to be treating AND more like OR. In other words, when the value of Col11=TRUE and the value of Col12=7, the results are displayed as though Col12=8. Am I missing something? I've tried adding quotes around the variables, parentheses around the two criteria, and spacing around the =. What else is there?
Col11 is only TRUE or FALSE values and Col12 is only numeric values from 1-8.
=QUERY({$A$3:$AJ},"SELECT Col3,Col10 where Col11=TRUE and Col12=8",0)
Here's a link to my sheet. It's buried in a larger formula in AK2
AK6 is a good example. It shows U U. It should only show U. It is treating X6 as though its value is 8 when it is actually 7.
I believe I worked out what is happening.
You are getting two 'U's because I think your inner array is returning multiple rows for Col3='R2-D2', one row where Col23=TRUE and Col24=8, and then another row where Col27=TRUE and Col28=8.
I'm not positive, but I think the values in AK don't relate specifically to the values in that specific row, but instead relate to an array queried across all of your data rows. So as the outer ArrayFormula works down the column, the inner array (with multiple VLOOKUP/ArrayFormula/Queries) is still a large subset of the whole data range. That's assuming I've understood your complex formula correctly - my apologies if I've misunderstood something.
I've added a Heroes-TEST sheet to your sheet. It only has ten rows, all of the R2-D2 data from your Heroes tab. The columns are collapsed for visibility. See what happens when you highlight all the row data below Row3 and press delete - and then UNDO. The two 'U's in column AK become one, because there is only one row of data to query through now.
Your original formula is in AK2.
Let me know if this has helped.
Is there a formula to randomize a column of data which keeps each item represented only once (has the same items)?
So:
APPLES
PEARS
BERRIES
Might come out as
PEARS
BERRIES
APPLES
Randbetween formulas no good here, as you might get two 'PEAR's.
There is a new "randomize range" feature available in the context menu after selecting a range.
The following approach implements the idea of pnuts, but without creating a column filled with random numbers:
=query({A2:A20, arrayformula(randbetween(0, 1e20 + row(A2:A20)))}, "select Col1 order by Col2", 0)
Here A2:A20 is the range to be permuted. The arrayformula generates a random integer for each. The query sorts the array by those random integers, but does not put the random numbers in the spreadsheet.
The entropy of randbetween is 64 bits, so collisions are extremely unlikely. And even if two random numbers happen to be equal, that will not generate repetitions; sorting by whatever column never does that. It only means the corresponding pair of entries will appear in their original order.
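A short Python sketch of the same technique (attach a random key to each item, sort by the key, discard the key) shows why ties cannot produce repeats: sorting only reorders the items. The function name and the key_fn parameter are introduced here for illustration.

```python
import random


def shuffle_by_keys(items, key_fn=random.random):
    """Permute items by sorting them on one freshly drawn key per item."""
    keyed = [(key_fn(), item) for item in items]
    # Sort on the key only; Python's sort is stable, so equal keys keep
    # their original relative order.
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]


fruits = ["APPLES", "PEARS", "BERRIES"]
print(shuffle_by_keys(fruits))  # some permutation of the three items

# Worst case: every key collides. Nothing is duplicated; order is unchanged.
print(shuffle_by_keys(fruits, key_fn=lambda: 0))  # → ['APPLES', 'PEARS', 'BERRIES']
```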
Came across this while looking for a formula to generate a set of random unique integers and ended up devising my own, so I'm leaving it here for anyone else looking for the same:
=SORT(SEQUENCE(A$1),RANDARRAY(A$1),FALSE) where A$1 is the count of integers to generate (expressed here as a cell reference because I like to create sheets where I can input a number in a cell rather than changing the formula, but this can of course be just a number.)
This can be expanded by adding the three other fields to SEQUENCE as explained in the function's documentation, or by wrapping it in an ARRAYCONSTRAIN to limit the count of entries returned without changing the minimum or maximum values of the generated entries. Hope all this makes sense!
I adopted a similar approach to user6655984 before I found this post.
RANDARRAY seemed to be a neat call-once solution.
I had similar demands. Formula based, randomized return order, ability to have only unique records or not as the whim took me.
Right clicking to randomize range meant user interaction I didn't want and the data is dynamic.
I built in the random numbers into a query data range on the fly.
I get the flexibility of QUERY (I can easily expand the range, add returned columns, add filter criteria, etc.), I don't have to show the random numbers at all, I can wrap it in UNIQUE if desired, and it re-randomizes with each recalc.
Have some data in column A2:A.
To see the inline data range:
={RANDARRAY(ROWS($A$2:$A)),$A$2:$A}
Query (inc duplicates), filter out empty.
=QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE Col2<>'' ORDER BY Col1 ",0)
Same but wrapped by unique.
=UNIQUE(QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE Col2<>'' ORDER BY Col1 ",0))
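In Python terms, the three formulas above amount to roughly the following: pair each cell with a random number, drop the empty cells, order by the random number, and optionally de-duplicate. The function name, parameters, and sample data are assumptions made for this sketch.

```python
import random


def randomized(cells, unique=False, rng=None):
    """Return the non-empty cells in random order, optionally without duplicates."""
    rng = rng or random.Random()
    # One random key per row, like {RANDARRAY(ROWS(...)), A2:A}.
    keyed = [(rng.random(), cell) for cell in cells]
    # WHERE Col2 <> '' ORDER BY Col1
    ordered = [cell for _, cell in sorted(keyed, key=lambda pair: pair[0]) if cell != ""]
    if unique:
        # Keep the first occurrence of each value, as UNIQUE does.
        ordered = list(dict.fromkeys(ordered))
    return ordered


column_a = ["APPLES", "PEARS", "", "BERRIES", "PEARS"]  # hypothetical data
print(randomized(column_a))               # 4 items, both PEARS kept
print(randomized(column_a, unique=True))  # 3 items, one PEARS
```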
Hope it helps someone, even if years later. :)
Matt