How do I create a formula that recommends a matchup of 2 names? - google-sheets

I’m trying to make a formula that accomplishes this:
From a list of numbers, find the pairs of numbers with the lowest difference and second lowest difference, and write their differences elsewhere(I don’t know how to do this)
Use those numbers to find and use the value(a name)of the cell to their left(I don’t know how to do this with duplicate numbers)
How would I go about this?
The final product should include all 4 numbers, names, and both differences.
In the picture below, the pink shaded section is the goal.

What I did is to first SORT the values:
=SORT(A2:B,2,1)
Then, wrapped it in LAMBDA (in order not to repeat those formula again and again), and started "scanning" those values to find the differences. Since it's already sorted, the "smallest" will be found in this comparison.
For this, I used MAP. Used SEQUENCE, COUNTA and INDEX in order to find the amount of values in the first column; and SEQUENCE will help me do the "navigation". With the variable resultant "p" I searched each row with the next one (p+1). Resulting in something like this:
=LAMBDA(sorted,MAP(SEQUENCE(COUNTA(INDEX(sorted,,1))-1),LAMBDA(p,{INDEX(sorted,p,1),INDEX(sorted,p+1,1),INDEX(sorted,p+1,2)-INDEX(sorted,p,2)})))(SORT(A2:B,2,1))
With that, you're already one step away. Just with SORTN you can find the four smallest differences:
=LAMBDA(diff,SORTN(diff,4,1,3,1))
(LAMBDA(sorted,MAP(SEQUENCE(COUNTA(INDEX(sorted,,1))-1),LAMBDA(p,{INDEX(sorted,p,1),INDEX(sorted,p+1,1),INDEX(sorted,p+1,2)-INDEX(sorted,p,2)})))
(SORT(A2:B,2,1)))
I hope it's already useful! You can split those values in two columns, obviously; just using INDEX to access the desired row of this formula; or just locate the formula somewhere and refer to those cells. Let me know if it's useful!

you can try:
=LAMBDA(z,SORTN(MAP(INDEX(z,,1),INDEX(z,,2),LAMBDA(a,b,{a,b,ABS(XLOOKUP(a,A:A,B:B,)-XLOOKUP(b,A:A,B:B,))})),4,0,3,1))(INDEX(SPLIT(QUERY(UNIQUE(MAP(LAMBDA(z,FLATTEN(z& "🐠" &TRANSPOSE(z)))(FILTER(A2:A, A2:A<>"")),LAMBDA(y,JOIN("🐠",SORT(UNIQUE(TRANSPOSE(SPLIT(y,"🐠")))))))),"Select Col1 Where Col1 CONTAINS '🐠'"),"🐠")))

Related

Why is my COUNTA formula counting too high?

I know a similar but different question has been asked, but I don’t know how to use query or where to learn, so I figured I’d ask…
I’d like to use this formula:
=COUNTA(FILTER($L:$L,OFFSET($L:$L,0,1)="Draw",$L:$L=$E2))
L:L is a range of names, which it should test against E2:E, a range of unique names. I also need to ensure that the cell in the column to the right of each instance of the name in E2:E has a specific word in it (“Draw”). I tried using AND, but this just sets everything to 1, even when one of them has 3 draws. There are no losses, so the loss column should show up as all 0s, but it is instead 1s. The infuriating thing about this is that the draw column is all right for some reason. Picture below. For reference, Losses(as well as draws and wins)are the end goal.
image
COUNT is meant for counting numeric values, COUNTA for non empty cells. COUNTIF or COUNTIFS would probably be your best choice. Try with:
=COUNTIFS($L:$L,E2,OFFSET($L:$L,0,1),"Draw")

Is there a way to specify an input is a single cell in Google Sheets?

I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of the first space and number, and the second pulls out all the digits in a string of " #hr" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE, to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involved concatenating your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCHES and QUERY's to do computations. Doing a SUM over a RANGE is a lot more compact of a formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and if you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.

Excel/Sheets Consecutive Count Based on Two Conditions (function?)

I have a Google Sheet, and I'm trying to see if it's possible to get a consecutive count outputted in a third column based on the values of two other columns.
My columns are:
Column A: Will have a handful of text values that are "grouped" together. Likely around 30 of the same value, until it changes to another value. In the image above, these are text1, and text2.
Column B: Will have one of 3 values assigned to each value in column A. In the image above, these are id1, id2, id3.
Column C: Will output a consecutive count based on the values of the first two columns. My hope is that if there are multiple ID1,ID2 in consecutive order, they'll repeat that first +1 value; while ID3 is always plus 1 to the count. This is what I am trying to show in column C in the layout image above.
I've hit a wall with trying to accomplish this with various COUNTIF iterations.
Thanks for any help, or any better ideas to accomplish something similar.
(I'm hoping for a formula, but open to being pointed into a direction for a script if that's the only way).
You can try following formula:
=IF(A2=A1;IF(OR(B2="id3";B2<>B1);C1+1;C1);1)
It is also possible to do this as an array formula. I used offset ranges for column B in the first Countifs to check for a change in value but this made it a little awkward to get equal-sized arrays:
=ArrayFormula(if(A2:A="","",
countifs({"";B2:B}<>{B2:B;""},true,{A2:A;""},A2:A,row(A:A),"<"&row(A2:A),{B2:B;""},"<>id3")+
countifs(A2:A,A2:A,row(A2:A),"<="&row(A2:A),B2:B,"=id3")
))

Randomize cells in Google Sheets

Is there a formula to randomize a column of data which keeps each item represented only once (has the same items)?
So:
APPLES
PEARS
BERRIES
Might come out as
PEARS
BERRIES
APPLES
Randbetween formulas no good here, as you might get two 'PEAR's.
There is a new "randomize range" feature available in the context menu after selecting a range:
]
The following approach implements the idea of pnuts, but without creating a column filled with random numbers:
=query({A2:A20, arrayformula(randbetween(0, 1e20 + row(A2:A20)))}, "select Col1 order by Col2", 0)
Here A2:A20 is the range to be permuted. The arrayformula generates a random integer for each. The query sorts the array by those random integers, but does not put the random numbers in the spreadsheet.
The entropy of randbetween is 64 bits, so collisions are extremely unlikely. And even if two random numbers happen to be equal, that will not generate repetitions; sorting by whatever column never does that. It only means the corresponding pair of entries will appear in their original order.
Came across this while looking for a formula to generate a set of random unique integers and ended up devising my own, so I'm leaving it here for anyone else looking for the same:
=SORT(SEQUENCE(A$1),RANDARRAY(A$1),FALSE) where A$1 is the count of integers to generate (expressed here as a cell reference because I like to create sheets where I can input a number in a cell rather than changing the formula, but this can of course be just a number.)
This can be expanded by adding the three other fields to SEQUENCE as explained in the function's documentation, or by wrapping it in an ARRAYCONSTRAIN to limit the count of entries returned without changing the minimum or maximum values of the generated entries. Hope all this makes sense!
I adopted a similar approach to user6655984 before I found this post.
RANDARRAY seemed to be a neat call once solution.
I had similar demands. Formula based, randomized return order, ability to have only unique records or not as the whim took me.
Right clicking to randomize range meant user interaction I didn't want and the data is dynamic.
I built in the random numbers into a query data range on the fly.
I get the flexibility of query (can easily expand the range, add returned columns filter criteria etc), I don't have to show the random numbers at all and can wrap it in UNIQUE if desired, it re-randomizes with each recalc.
Have some data in column A2:A.
To see the inline data range.
={RANDARRAY(ROWS($A$2:$A)),$A$2:$A}
Query (inc duplicates), filter out empty.
=QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE COL2<>'' ORDER BY Col1 ",0)
Same but wrapped by unique.
=UNIQUE(QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE COL2<>'' ORDER BY Col1 ",0))
Hope it helps someone, even if years later. :)
Matt

Average calories of "moderate" food

I keep a spreadsheet with caloric information about food. Each row represents a product labelled as {negative, small, moderate} and the number of calories.
I included a shared copy of the spreadsheet.
Table
Shared table
I would like to calculate the average of the numbers that are in the same row as the keyword 'moderate'. For instance, I would like to obtain something like (890+914+731+1159+789)/5=897. I have tried =AVERAGE(B3,B7:B10) and it works but it needs to be modified when I add another product.
The expected output is in red. I want to obtain such output using formulas.
Thanks in advance
Maybe you can also consider the use of AVERAGEIF() as it seems the function is built for situations like this:
=averageif(A2:A; "moderate"; B2:B)
When you insert new row, formula will be modified automatically. Also, your formula is wrong, missing semicolon:
=AVERAGE(B3; B7:B10)
^
You can Use a IF Statement. Then you automatically do the average of the numbers IF the respective value is moderate. This is better, because it will work, if you have 5 or 5000 Moderates
Just use: =AVERAGE(FILTER(B2:B10;(A2:A10="moderate")))
You could also use SUMIF with COUNTIF, it's a more basic formula, not as powerful as FILTER, but just as funcitonal. Also to use unlimited ROWs, you can remove the number on the second part of the RANGE, this for any formula, in this case your formula would be as such:
=SUMIF(A2:A;"moderate";B2:B)/COUNTIF(A2:A;"moderate")

Resources