Random selection with a bias towards certain outcomes (i.e. 60/40) - google-sheets

Let's say I have 2 lists and I would like to randomly select a winner between them, but the winner should come from list A 60% of the time and from list B 40% of the time. How can that be done in Google Sheets?
You can randomly select names from a list using this formula:
INDEX(A2:A, RANDBETWEEN(1, COUNTA(A2:A)))

Without knowing some more information on your setup here is a general formula that does what you're describing:
=IF(RAND()<=0.6,INDEX(A2:A, RANDBETWEEN(1, COUNTA(A2:A))),INDEX(B2:B, RANDBETWEEN(1, COUNTA(B2:B))))
Essentially it rolls a random number between 0 and 1. If the number is less than or equal to 0.6 (which happens 60% of the time), it selects a random name from column A; otherwise (the remaining 40% of the time) it selects one from column B.
You can also replace the "0.6" with A1 in my example to make the weight dynamic. Changing A1 to 75%, for example, makes the formula check whether the random value is less than or equal to 0.75.
EDIT: The image shows the wrong condition. As was pointed out to me, you want less than or equal to 0.6, not greater than; I had the weights flipped.
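To sanity-check the 60/40 split outside of Sheets, here is a minimal Python sketch of the same two-step idea (pick the list by weight, then pick a name uniformly). The function name weighted_pick, the sample names and the fixed seed are made up for illustration; they are not part of the original answer.

```python
import random

def weighted_pick(list_a, list_b, weight_a=0.6, rng=random):
    """Pick from list_a with probability weight_a, otherwise from list_b."""
    # rng.random() is uniform on [0, 1), so this branch is taken
    # roughly weight_a of the time (the IF(RAND()<=0.6, ...) step).
    source = list_a if rng.random() < weight_a else list_b
    # Uniform pick inside the chosen list, like
    # INDEX(range, RANDBETWEEN(1, COUNTA(range))).
    return rng.choice(source)

# Hypothetical sample data and seed, only so the run is repeatable.
rng = random.Random(42)
list_a = ["Ann", "Bob", "Cy"]
list_b = ["Dee", "Eve"]
picks = [weighted_pick(list_a, list_b, 0.6, rng) for _ in range(10_000)]
share_a = sum(p in list_a for p in picks) / len(picks)
print(f"share of winners from list A: {share_a:.3f}")
```

Over many draws the printed share should settle near 0.6. Whether the comparison is < or <= makes no practical difference for a continuous random number.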

Related

Conditional formatting at row level

I have 2000 rows of cost price data. In each row, I would like to apply a color scale to quickly highlight cost prices (from low to high). However, I would like the color scale comparison logic each time to be applied within a specific row. So row 12 data should not be compared to row 13 data for instance. How can I do this without creating 2000 rules stipulating each row?
I have done it for the first row as below:
D3:BL3
However, when I try $D3:$BL2000 and hit "Done", the $ signs just disappear, meaning the formatting logic isn't applied at row level but across all rows (so e.g. row 4 is compared to e.g. row 100).
The built-in color scale option in Google Sheets can't be applied row by row. You can simulate it with custom formula rules built on MIN, MAX and QUARTILE. Here is an example:
=(D1=MAX($D1:$Z1))*(D1<>"")
=(D1>QUARTILE($D1:$Z1,3))*(D1<>"")
=(D1>QUARTILE($D1:$Z1,2))*(D1<>"")
=(D1>QUARTILE($D1:$Z1,1))*(D1<>"")
=(D1>MIN($D1:$Z1))*(D1<>"")
=(D1=MIN($D1:$Z1))*(D1<>"")
PS: remember to order the rules accordingly, with the highest values at the top (green in my example) and the lowest values at the bottom.
PPS: you could do something similar with RANK or LARGE/SMALL, depending on your data.
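As a rough illustration of how the six ordered rules bucket one row at a time, here is a small Python sketch. It assumes the first matching rule wins, that blanks are represented as None, and that QUARTILE corresponds to the "inclusive" interpolation method of statistics.quantiles; the function name row_buckets is hypothetical.

```python
from statistics import quantiles

def row_buckets(row):
    """Return, per cell, the number (1-6) of the first rule that matches.

    Rule 1 is the row maximum, rule 6 the row minimum; blanks (None) get None.
    Every row is evaluated on its own, never against other rows.
    """
    values = [v for v in row if v is not None]
    if len(values) < 2:
        # A lone value is both MAX and MIN; rule 1 is checked first.
        return [None if v is None else 1 for v in row]
    # Assumption: QUARTILE(range, 1..3) matches the inclusive method.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    lo, hi = min(values), max(values)

    def bucket(v):
        if v is None:
            return None
        if v == hi:
            return 1
        if v > q3:
            return 2
        if v > q2:
            return 3
        if v > q1:
            return 4
        if v > lo:
            return 5
        return 6

    return [bucket(v) for v in row]

# Two rows on very different scales are still coloured independently.
sheet = [[1, 2, 3, 4, 5], [100, None, 300]]
print([row_buckets(r) for r in sheet])
```

Because the rules are checked top to bottom, a cell equal to the row maximum never falls through to the quartile rules, which is why the rule order matters.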

Google Sheets - Multiply field by field three to the left

My title might not be very specific, so I'm going to try to explain a little better.
Each row of the sheet has a name (Column A) and a certain number of values (Column B), which get added together to a total in Column C. Furthermore, Columns D, E and F contain the values I want the total in Column C multiplied by. I fill in columns A to F manually, but I would like a function to calculate the columns I've called x, y and z total (G, H and I).
I see a pattern in this, I just can't figure out the syntax to get Sheets to see it as well.
The pattern I'm envisioning is that, for each row, columns G, H and I take the value 3 fields to their left and multiply it by Column C in the same row.
Is this somehow achievable? I tried finding a solution online but I guess I don't know how to word myself.
Here's a picture to maybe make everything a little clearer
This would save me a lot of time, given that I have over a hundred different rows this calculation needs to be performed on...
If something is not clear, please feel free to write a comment. I'll be following this thread quite liberally.
Thanks in advance!
You can put one formula in the first cell of each total column:
cell G2: =ARRAYFORMULA(if(len(C2:C),C2:C*D2:D,))
cell H2: =ARRAYFORMULA(if(len(C2:C),C2:C*E2:E,))
cell I2: =ARRAYFORMULA(if(len(C2:C),C2:C*F2:F,))
I created a sheet with the same results you had before, but this time you don't need vertical columns, just say in the # of Values column how many numbers you should have below. You just need to input the values with the grey columns.
Note: This is assuming you will always have growing vertical numbers like 1,2,3,4,5. In the new sheet you just need to set 5 in the column and it will calculate the result.
Please make a copy of this sheet and edit as you wish.
Sheet
You can use a single, simpler formula for this in cell G2
=ARRAYFORMULA(if(C2:C="",,C2:C*{D2:D,E2:E,F2:F}))
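For readers who think in code rather than array formulas, the single-formula version behaves roughly like the Python sketch below: each row's total is multiplied by every multiplier to its right, and rows with a blank total stay blank. The row layout (total, d, e, f), the use of None for blanks and the function name are assumptions for illustration.

```python
def totals_by_multiplier(rows):
    """For each (total, d, e, f) row return [total*d, total*e, total*f].

    Rows whose total is blank (None) produce blanks, mirroring
    IF(C2:C="", , ...) in the array formula.
    """
    result = []
    for total, *multipliers in rows:
        if total is None:
            result.append([None] * len(multipliers))
        else:
            result.append([total * m for m in multipliers])
    return result

# Hypothetical sample: columns C, D, E, F of three rows.
rows = [(10, 1, 2, 3), (None, 4, 5, 6), (2, 0.5, 1, 1.5)]
for out in totals_by_multiplier(rows):
    print(out)
```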

Counting the Number of Empty Cells between Non-Empty Cells in Google Sheets

I'm trying to count the number of empty cells that exist in a column between each non-empty cell but haven't been able to work out how.
Using this, I'm also trying to find the largest "empty distances" and locate the cell in the center of these distances.
The sheet I'm working with lists a set of marker colors and denotes the ones that are owned out of the full set of colors. I'm trying to find the largest ranges of missing colors and then find the colors in the middle of those ranges in order to find a handful of markers that would best help to fill out the spectrum.
Columns 1-6 are information; column 7 marks whether the color is owned:
I may have an answer that helps you.
I could only get it to work using a helper column, but someone may know how to eliminate that requirement.
The helper column creates an array, basically listing the row numbers of the rows that have an "x" in your column B.
The main formula then measures the gap between each of these listed row numbers. It also checks the gap before the first "x", and after the last "x". Note that I have the data starting on row 2, which complicates the formula, but makes the sample sheet clearer - this can easily be changed to row 1 if you prefer.
={F2-1;
query(ArrayFormula(if(isnumber(F3:F),F3:F-F2:F-1,"")),
"select Col1 where Col1 > 0",0);
counta(A2:A)-indirect("F"&COUNTA(F$2:F))}
See a sample sheet here:
https://docs.google.com/spreadsheets/d/19QUFGRqTT6BqOsBrEBpTIxQCeNdRa5mzXhxQpCZ8sV4/edit?usp=sharing
Then I used a second formula to calculate the max gap between "x"s, (or before the first or after the last x).
Note that calculating the midpoint of the gaps, and looking up the corresponding mid-point colour, is something that can be added to this answer if you share a copy of your sheet with editing enabled.
Let me know if this helps. I'll add more explanation to describe what the formula is doing tomorrow.
And I'll provide a second tab with the formulas adjusted to work with data beginning on row 1.
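The helper-column idea (list the positions of the marked rows, then subtract neighbours) can be sketched in Python as follows. This is only an illustration under assumptions: positions are 0-based, "owned" is a list of booleans, and, like the QUERY filter above, interior gaps of length 0 are dropped while the leading and trailing gaps are always reported. The function name gap_lengths is hypothetical.

```python
def gap_lengths(owned):
    """Lengths of the unmarked runs: before the first mark, between marks, after the last."""
    # Equivalent of the helper column: positions of the marked rows.
    marked = [i for i, flag in enumerate(owned) if flag]
    if not marked:
        return [len(owned)]  # nothing marked: one gap covering everything
    leading = marked[0]
    # Difference between neighbouring positions, minus one, keeping only real gaps.
    interior = [b - a - 1 for a, b in zip(marked, marked[1:]) if b - a - 1 > 0]
    trailing = len(owned) - 1 - marked[-1]
    return [leading, *interior, trailing]

owned = [False, True, False, False, True, True, False]
gaps = gap_lengths(owned)
print(gaps, "largest gap:", max(gaps))
```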
You can also get the lengths of the gaps using Frequency:
=ArrayFormula(frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),if((B1:B20="X")*(A1:A20<>""),row(B1:B20))))
but finding the centres of the gaps and allowing for equal-sized gaps is more difficult.
This should find the position of the "X" at the end of the longest gap:
=ArrayFormula(
sum(frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),
if((B1:B20="X")*(A1:A20<>""),row(B1:B20)))*(sequence(countif(B1:B20,"X")+1,1)<=
match(max(frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),
if((B1:B20="X")*(A1:A20<>""),row(B1:B20)))),frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),
if((B1:B20="X")*(A1:A20<>""),row(B1:B20))),0)))+
countif(sequence(countif(B1:B20,"X")+1,1),"<="&
match(max(frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),
if((B1:B20="X")*(A1:A20<>""),row(B1:B20)))),frequency(if((B1:B20<>"X")*(A1:A20<>""),row(B1:B20)),
if((B1:B20="X")*(A1:A20<>""),row(B1:B20))),0))
)
and then it should just be a case of working backwards from there to the centre of the longest gap. However, the formula needs further refinement to deal with the following cases:
(1) Where the longest gap is after the last "X"
(2) Where there is a tie for the longest gap
(3) Where there is a need to list the longest, second longest, third longest gap etc.
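Outside of a formula, those three cases are easier to handle. The Python sketch below is one possible approach, not a translation of the formula above: it scans for runs of unowned rows (including one after the last mark), records each run's centre, and sorts so that ties and the second or third longest gaps come out in order. Positions are 0-based, even-length gaps use the lower of the two middle rows, and the function name is made up.

```python
def ranked_gaps(owned):
    """Return (length, start, centre) for every run of unowned rows, longest first."""
    gaps = []
    start = None
    # The extra True acts as a sentinel so a run after the last mark is closed too.
    for i, flag in enumerate(list(owned) + [True]):
        if not flag and start is None:
            start = i                      # a new gap begins here
        elif flag and start is not None:
            length = i - start
            centre = start + (length - 1) // 2
            gaps.append((length, start, centre))
            start = None
    # Longest first; ties are kept and ordered by where they start.
    return sorted(gaps, key=lambda g: (-g[0], g[1]))

owned = [True, False, False, False, True, False, True, False, False, False]
for length, start, centre in ranked_gaps(owned):
    print(f"gap of {length} starting at {start}, centre at {centre}")
```

The centre positions can then be used to look up the colour names in the corresponding rows.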

How do I create an arrayformula to calculate the median of three different values from one row?

How do you create an arrayformula to calculate the median of three different values from one row in google sheet?
For example, I want to do ARRAYFORMULA for MEDIAN formula.
=ARRAYFORMULA(MEDIAN(A2:A,$B$2,$C$2))
where
A2: start date - 2020-07-22 10:00
B2: start hour - 8:00
C2: end hour - 17:30
The result of MEDIAN(A2:A,B2,C2) is 10:00, but the ARRAYFORMULA version does not work (the result is 00:00:00).
Is it possible to use MEDIAN in an array formula? Or is there another way to do this?
Mean and median are not the same. Mean is essentially AVERAGE, while MEDIAN is the "middle-most" value in a set:
=AVERAGE(2,1,9) returns 4 [i.e., (1+2+9)/3 ]
=MEDIAN(2,1,9) returns 2 (i.e., the value in the middle if all the numbers were lined up from lowest to highest)
If you only have to compare three columns, you could get by with a sort of "brute force" value-to-value comparison array. For instance, in D2:
=ArrayFormula(IF(((A2:A-INT(A2:A)<B2:B)*(A2:A-INT(A2:A)>C2:C))+((A2:A-INT(A2:A)>B2:B)*(A2:A-INT(A2:A)<C2:C)),A2:A-INT(A2:A),IF(((B2:B<A2:A-INT(A2:A))*(B2:B>C2:C))+((B2:B>A2:A-INT(A2:A))*(B2:B<C2:C)),B2:B,C2:C)))
In English, this says:
"If the first value is lower than the second AND higher than the third, OR if the first value is higher than the second AND lower than the third, it is the median. Return the first value.
If neither of those is true, check to see if the second value is lower than the first AND higher than the third, OR if the second value is higher than the first AND lower than the third. If so, it is the median. Return the second value.
If nothing has been true so far, then the third value must be the median. Return the third value."
The one additional thing you'll notice is that I use A2:A-INT(A2:A). This gets rid of the date portion and leaves only the time, since in Google Sheets dates are whole numbers while times are decimal portions less than 1. So removing the INTeger (i.e., whole) part of the cell value leaves only the decimal portion, which is the time. This is necessary so that the comparisons are like-for-like.
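The same "strip the date, then take the middle of three" logic can be sketched in Python, where sorting three values and taking the second one gives the median directly. This is an illustration only; the function name and the sample timestamps other than 2020-07-22 10:00 are invented.

```python
from datetime import datetime, time

def median_of_three(stamps, start_hour, end_hour):
    """For each timestamp, return the median of (its time of day, start_hour, end_hour)."""
    result = []
    for stamp in stamps:
        # stamp.time() drops the date part, like A2 - INT(A2) in the sheet.
        candidates = sorted([stamp.time(), start_hour, end_hour])
        result.append(candidates[1])  # middle of three sorted values
    return result

stamps = [
    datetime(2020, 7, 22, 10, 0),   # inside working hours
    datetime(2020, 7, 22, 7, 15),   # before the start hour (hypothetical row)
    datetime(2020, 7, 22, 19, 0),   # after the end hour (hypothetical row)
]
for t in median_of_three(stamps, time(8, 0), time(17, 30)):
    print(t.strftime("%H:%M"))
```

In effect the median clamps each time to the start/end window: times inside the window are kept, and times outside it snap to the nearer boundary.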
arrayformula for average:
=ARRAYFORMULA(TEXT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(N(A2:C*1)),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))&"")),
"select Col2"), "hh:mm"))
arrayformula for MEDIAN now possible:
=IFERROR(BYROW(A2:C10, LAMBDA(xx, TEXT(MEDIAN(xx), "hh:mm"))))

How to eliminate highlighting duplicates in google sheets conditional formatting

I have a spreadsheet where I need to conditionally format/highlight the lowest 3 scores in a row to reflect dropped scores that are part of a Total calculation. I'm using the SMALL function to successfully calculate the Total: =SUM(A2:I2)-SMALL(A2:I2,1)-SMALL(A2:I2,2)-SMALL(A2:I2,3). But when I try to use the SMALL function in the Custom Formula field of the conditional format, it highlights 0,60,60,60 and not 0,60,60.
119 101 60 100 0 109 60 60 112 TOTAL:601
If four of the values are 0, it will highlight all four 0s. If 60 is the lowest score and there are 4 or more scores of 60, it will highlight all of them and not reflect that only 3 of the scores are actually dropped.
Is there another way (custom formula) that can only highlight the lowest 3 scores in the row even when the 3rd lowest may have duplicates in the row?
I've come up with this formula (assuming values start in A1) which unfortunately is a bit long
=OR(A1<SMALL($A1:$I1,3),AND(A1=SMALL($A1:$I1,3),COUNTIF($A1:A1,SMALL($A1:$I1,3))<=(3-COUNTIF($A1:$I1,"<"&SMALL($A1:$I1,3)))))
or
=OR(A1<SMALL($A1:$I1,3),AND(A1=SMALL($A1:$I1,3),(COUNTIF($A1:A1,SMALL($A1:$I1,3))+COUNTIF($A1:$I1,"<"&SMALL($A1:$I1,3))<=3)))
The logic is that it highlights all cells which are less than the third smallest value, then any values (starting from the left) which are equal to the third smallest value until the total equals three.
I've changed the second row to show that it selects the second zero instead of the second 60.
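The left-to-right tie-breaking can be checked with a short Python sketch that follows the same logic as the custom formula. It assumes the row has at least three numeric values; the function name dropped_flags is hypothetical, and the first sample row is the one from the question.

```python
def dropped_flags(row, drop=3):
    """Flag exactly `drop` cells: everything below the drop-th smallest value,
    then ties with that value from the left until `drop` cells are flagged."""
    threshold = sorted(row)[drop - 1]          # SMALL(row, 3)
    below = sum(v < threshold for v in row)    # COUNTIF(row, "<" & threshold)
    allowed_ties = drop - below                # how many ties may still be flagged
    flags, ties_seen = [], 0
    for v in row:
        if v < threshold:
            flags.append(True)
        elif v == threshold:
            ties_seen += 1                     # COUNTIF($A1:A1, threshold)
            flags.append(ties_seen <= allowed_ties)
        else:
            flags.append(False)
    return flags

row = [119, 101, 60, 100, 0, 109, 60, 60, 112]
flags = dropped_flags(row)
print(flags)
# Total after dropping the flagged scores, as in the SUM - SMALL formula.
print(sum(v for v, dropped in zip(row, flags) if not dropped))
```

For the sample row this flags the 0 and the first two 60s, leaving the third 60 in the total.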
