VLOOKUP with wildcard and find Nth occurrence?

I'm setting up a Google Sheet that will calculate the most effective purchase size of specific agricultural inputs (fertilizer, chemical, etc). I set up the price data in its own tab with a separate row for each input name + size.
To keep it easy for the user I'd like to require only the input name, # of gallons per acre, and acres and then have a formula spit out the total cost and most effective purchase (bulk if > X gallons, X # of 250 gallon containers + X 55 drums, etc). How can I use the input name plus a wildcard to find the appropriate purchase size?
https://docs.google.com/spreadsheets/d/1bMOPuk2qhmVuJT7vE_ni3KFxfcgKvwTwkM4p50xQF_0/edit?usp=sharing
I tried:
=ArrayFormula(iferror(INDEX('Data (Current)'!H2:H,SMALL(IF($A2&"*"='Data (Current)'!A2:A,ROW('Data (Current)'!A2:A)-1),1))))
...but it returns blank so I'm guessing the reference $A2&"*" to the input name isn't working properly. When I replace it with a string found in the 'Data (Current)' tab then it works fine.
I expected the output to be the smallest value (in this case I think it's 5). Then when I change the last number to 2 or 3 it will find the next smallest value, in this case, 55 or 250. Then I can use simple formulas to interact with that and finish the spreadsheet.
Unfortunately, the actual output is nothing, or "".

Sorry if this isn't what you're looking for, as I had some trouble understanding your question.
Presuming what you want is essentially this:
I want to buy Y quantity of item.
I can buy item at cheaper prices if I buy in higher quantities, although sometimes they have a minimum order quantity.
What is the most optimal combination of the options I have to minimize the price I pay?
I'm unsure if there's a simple solution for this within Google Sheets alone. This might be treading more into Apps Script territory.
However, that's not to say that it's impossible. I've "brute-forced" a solution with an iterative approach for the "Chelated Calcium" product: https://docs.google.com/spreadsheets/d/1YSBiSx0IMr4T0R11Dqb-tqOhH4AOTTAWeH2yQfT4X5w
First, list the data in a standardized manner. This includes giving each same product something easy to look it up by. For example, on the Data (Current) tab, I've added 3 columns:
Product Common Name - This is used so that all items of different quantities can be found easily, without needing wildcards.
Gallons - Much easier to parse the data if it's explicitly laid out.
Minimum Order Gallons - This is your threshold for Bulk. I've set it at an arbitrary 20,000 gallons for Chelated Calcium.
The data here is ordered least-effective first. How you do this will be up to you. In this case, I sorted by the Retail Cost Per Ounce parameter from your sheet, highest first. This eliminates any guesswork about which of the options are most effective, since you can just traverse your options in order. Note: The way I've laid out the formulas will only work if all rows for the same product are directly next to each other. It won't work if there are other products between them.
On the Field Level Tool tab, standardize your inputs to the Gallons unit. I do this in Total Gallons Needed column (I multiply anything with a "GAL" with 1, and "QUART" with 0.25).
For each item, determine the row numbers where the product begins and ends. This is marked by columns L (Least Efficient Index) and M (Most Efficient Index). I got these results by using the MATCH function.
Set up the iterations, from 0 to N-1. On this sheet, I've set up N=5 iterations, which means that it can traverse 5 different options of the same product only. Since Chelated Calcium only has 4 different options (5 Gal, 30 Gal, 250 Gal, Bulk), 5 is more than enough for this product. If you have products with more options, you may want to have more iterations.
The iterations are on the right side of the Field Level Tool tab.
In your case, you might want to put it on a different tab since the place I put it makes the file look very messy.
In each iteration, I perform the following steps:
To Fulfill - How many gallons still need to be purchased by this iteration?
ThisIndex - What is the row number of this iteration? This is determined by Most Efficient Index - Iteration Number. Remember that since we sorted in order of ascending efficiency, this means that the iteration starts with the most efficient option it can find first. There is a check to make sure that it only outputs a value if it is between the range [Least Efficient Index, Most Efficient Index]. Otherwise, it will be blank to avoid miscalculations by intruding into another product in the Data (Current) tab.
Retail Price, Minimum Gals, Gallons per Order - Simple data extraction for easy usage in the iteration, using INDEX (and indirectly, MATCH by virtue of ThisIndex).
Order - This formula does a couple of things, outlined below:
It checks whether there still remains a valid choice of product at this iteration. It does this by checking whether ThisIndex still exists. If the product doesn't exist, then it will be nulled. This is accomplished by using the IF function.
It will determine if there is a minimum threshold that must be met to purchase this choice. You can see in the 0th iteration, for example, that there is a minimum quantity of 20,000 gallons. If To Fulfill quantity is greater than or equal to the threshold OR there is no threshold, then a purchase is quantified by this column. The mathematics are simply to divide the To Fulfill amount by the Gallons per Order amount to determine the number of orders of this particular product choice. If there is a threshold but the To Fulfill amount doesn't meet it, then this iteration is skipped with a 0 order value.
If the item is already on its least efficient choice (ThisIndex == Least Efficient Index), it will do a CEILING function to ensure that the order is fulfilled. If not, it will do a FLOOR function instead. This is because you cannot order 3.5 units of an item, so they have to be rounded either up or down.
Expenditure - This is simply Order multiplied by the Retail Price, or how much money you spend in this iteration.
Remaining - How much of the product is left unfulfilled at the end of this iteration, to be used as To Fulfill for the next iteration.
Note: If you see formulas that are of the form =IF(ThisIndex, [calculations_here],), that is simply a check to nullify that calculation if ThisIndex is invalid.
Copy the iterations as many times as you want to the right. Something nice to do is to force the iterations to do a CEILING on the very last one to ensure that you never under-buy.
Generate a user-readable string for the purchase suggestion. You can see this on the Suggested Purchase column.
Calculate the Gallons Bought with a simple SUMPRODUCT over all the iterations.
Calculate the total expenditure with a simple SUM over all the iterations.
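The iteration steps above can also be sketched outside of Sheets. This is a minimal Python sketch of the same greedy traversal (most efficient option first, FLOOR everywhere except CEILING on the least efficient option). It is not the formulas from the linked sheet: the field names (gallons, price, min_gallons) are hypothetical, and the prices are made up for illustration.

```python
import math

def plan_purchase(options, gallons_needed):
    """Greedy purchase plan.

    options: list of dicts ordered least-efficient first (as on the data tab),
             each with 'name', 'gallons' (per order), 'price' (per order) and
             'min_gallons' (0 means no minimum threshold). Hypothetical names.
    """
    to_fulfill = gallons_needed
    orders = []
    # Walk from the most efficient option (last) down to the least efficient (first).
    for index in range(len(options) - 1, -1, -1):
        option = options[index]
        if to_fulfill <= 0:
            units = 0  # nothing left to buy
        elif option["min_gallons"] and to_fulfill < option["min_gallons"]:
            units = 0  # threshold not met, skip this option
        elif index == 0:
            # Least efficient option: round up so the order is never under-bought.
            units = math.ceil(to_fulfill / option["gallons"])
        else:
            # Whole units only; the leftover goes to the next iteration.
            units = math.floor(to_fulfill / option["gallons"])
        orders.append((option["name"], units, units * option["price"]))
        to_fulfill -= units * option["gallons"]  # "Remaining" column
    total_cost = sum(cost for _, _, cost in orders)
    gallons_bought = gallons_needed - to_fulfill
    return orders, gallons_bought, total_cost

# Made-up prices, for illustration only.
OPTIONS = [
    {"name": "5 Gal", "gallons": 5, "price": 100, "min_gallons": 0},
    {"name": "30 Gal", "gallons": 30, "price": 540, "min_gallons": 0},
    {"name": "250 Gal", "gallons": 250, "price": 4000, "min_gallons": 0},
    {"name": "Bulk", "gallons": 1, "price": 14, "min_gallons": 20000},
]

orders, bought, cost = plan_purchase(OPTIONS, 322)
print(orders, bought, cost)
```

With 322 gallons needed, Bulk is skipped because the threshold isn't met, and the remainder cascades through the 250, 30, and 5 gallon options. Like the sheet, this is a greedy pass and is not guaranteed to find the cheapest possible combination.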
I hope this is what you were looking for. Regardless, it's at least a fun exercise on how much you can abuse Sheets. ;)

Related

How to find optimal scores in a score database in Google Sheets?

I have a database of scores (C2:C937) which range from 4000-6500. I have the records for each score (D2:D937) and the differential between the 2 (E2:E937). I am trying to find a formula to tell me which score is closest to going over 5000. I was thinking to have a formula that checks the sums of the C column values and their corresponding E column values, and then sorting by which values have the most to improve. I don't know if it's possible to make a loop of sorts in sheets (kind of like a for loop in Java).
The image below shows a set of values as explained above. The result I'm looking for would return whichever value is already closest to 5000 and has the highest chance of going over it.
Here is a set of expected results. In this dataset, "Yanmega 4" would never be returned since it is already over 5000 in the C column. "Zangoose 1" and "Yanmega 3" would be first since they are close to going over 5000 and have enough leeway above 5000 (~400 points of leeway above it). "Zangoose 2" would be next since it has ~300 over 5000 meaning getting over 5000 should be easier than some of the others. "Zangoose 4" would be next since only 28 points need to be gained to get more than 5000, and there are 105 points of wiggle room as well. "Zangoose 3" is last since it only has 15 points of wiggle room based upon the current record.
Here is what I had started. The first line is convoluted and was my attempt at creating a loop. The second line is a good start, I just don't know where to go from there. Both lines are unfinished.
=IF(IF(LARGE($C$2:$C$937,1)<5000,LARGE($C$2:$C$937,1),IF(LARGE($C$2:$C$937,2)<5000,LARGE($C$2:$C$937,2),LARGE($C$2:$C$937,3))+INDIRECT("E"&match(E954,$E$1:$E$940,0))<=5000,))
=IF(LARGE($C$2:$C$937,1)<5000,)

First, I think your logic is not correct. I would rather have Zangoose 4 first, then Yanmega 3, then Zangoose 2 or Zangoose 3...
you can use a formula like:
=INDEX(B3:B8,MATCH(SMALL(IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100),ROW($A$1:$A$6)),IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100),0),)
and enter the function as an array formula (with Ctrl+Shift+Enter) in as many cells as needed (paste the formula in the top cell, fix it, enter it as an array formula, then select all destination cells, press F2 to edit the formula, and press Ctrl+Shift+Enter),
adjusting C3:C8 to your C928:C933 and so on, and making ROW($A$1:$A$6) as big as your data.
The problem is that the same array is evaluated over and over, so with a large number of rows things will get very slow. You'd be better off using an extra column with the formula
=IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100)
and then Filter/sort the data with VBA or Filters (there are excellent youtube videos on the subject, as long as company policies (if any) allow usage of VBA).
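The helper-column idea can be sketched in Python as well. This is only an illustration of the metric from the formula above, (5000 - score) * 100 / record for scores under 5000 and 100 otherwise, followed by a sort. The row layout (name, score, record) and the sample values are assumptions, not data from the question.

```python
def improvement_metric(score, record, target=5000):
    """Same metric as the helper column: smaller means closer to the target."""
    if score < target:
        return (target - score) * 100 / record
    return 100  # already at or over the target, pushed to the end

def rank_candidates(rows, target=5000):
    """rows: list of (name, score, record) tuples; returns them sorted by the metric."""
    return sorted(rows, key=lambda row: improvement_metric(row[1], row[2], target))

# Hypothetical sample rows, for illustration only.
rows = [
    ("Entry B", 4600, 5400),
    ("Entry C", 5100, 5200),
    ("Entry A", 4972, 5105),
]
for name, score, record in rank_candidates(rows):
    print(name, round(improvement_metric(score, record), 3))
```

Computing the metric once per row and sorting afterwards avoids re-evaluating the whole array for every output cell, which is the slowdown the answer warns about.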

Arrayformulas and how to format them using COUNTIF and IF statements

Long-time reader, first-time poster! I recently took on the task of creating a new roster for something and I am currently at an impasse using ARRAYFORMULA with IF statements and COUNTIF. The code blocks are rather large to post, however, I have included a copy of the roster here.
EDIT: This is for the Knight+ Roster tab, which calculates eligibility for promotions. If the column B blacklist is not current (= 0), columns F and G contain a keyword, the column J date exceeds 6 days, and the count for column P is greater than or equal to 1, then the statement is True and they are eligible for promotion, taking into account their Knight rank, which has different requirements.
If you look at R31, it is the exact same formula without Array, and its statement is True, vs L31 where it is false.
Also, as far as the count if portion, if you look at the W column, it shows counting 4, when in fact it should be counting 1.
Any help reducing my migraine from trying to solve this riddle would be appreciated, even if it turns out I am just making this harder on myself than I need to and should simply be using individual columns/rows for formulas. I want to reduce the amount of editing I have to do over the long run in case requirements change down the road, and to make upkeep much simpler for people with no coding experience by having them worry about 1 formula rather than 1000 rows of formulas.

Google Spreadsheets Repeat Function Nth Times & Sum Results

I have the following function
=IF(RAND()<0.25,1,0)
RAND() returns a decimal value between 0 and 1, and the idea is that an item has a 25% chance of getting a 1: if RAND() is less than 0.25 it's a hit and gets a 1, otherwise a 0. Now let's say I need to do this 100 times and add up all the 1s that were created, which in this case will average around 25. How do I do this in Google Spreadsheets?
Basically, I'm looking for a way to repeat a function n times and sum the results.
I have looked around everywhere (youtube, google forums) and have not found any solutions.

I may as well put this as an answer because it tries to address the broader question of whether you can repeat a function (say) 100 times. The answer is, yes if the function is compatible with an array formula. Rand can't be used in this way because it doesn't take any arguments (neither do some other functions like countifs for some reason). But you could get round it by using Randbetween instead and providing it with 100 array elements. These are multiplied by zero so don't actually affect the answer, but Google Sheets still evaluates the function 100 times:
=ArrayFormula(sum(if(randbetween(0,A1:A100*0+99)<25,1,0)))
or
=Sumproduct(if(randbetween(0,A1:A100*0+99)<25,1,0))
The result is each time you force this to re-calculate (by changing something in the range A1:A100 or by setting File -> Spreadsheet Settings -> (Tab) Calculation -> Recalculation to every minute) it will give an answer around 25.
To make it more resilient (allowing any value in A1:A100, including error values), you could try
=ArrayFormula(sum(if(randbetween(0,iferror(A1:A100/0,0)+99)<25,1,0)))
or
=Sumproduct(if(randbetween(0,iferror(A1:A100/0,0)+99)<25,1,0))
I don't know why I didn't do this in the first place
=ArrayFormula(sum(if(randbetween(0,row(A1:A100)*0+99)<25,1,0)))
then this easily allows for a variable range
=ArrayFormula(sum(if(randbetween(0,row(indirect("A1:A"&H1))*0+99)<25,1,0)))
where the number in H1 doesn't have to be limited to the number of rows in the sheet.
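For comparison, the same experiment is easy to sketch in Python. This is not a Sheets formula, just an illustration of what the array formulas above compute: n independent draws from 0 to 99, counting those below 25. The function name and the fixed seed are assumptions made for the example.

```python
import random

def count_hits(trials, rng):
    """Count how many of `trials` draws land below 25 (a 25% chance each)."""
    # rng.randint(0, 99) < 25 mirrors RANDBETWEEN(0, 99) < 25 in the formulas.
    return sum(1 for _ in range(trials) if rng.randint(0, 99) < 25)

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
hits = count_hits(100, rng)
print(hits)  # varies with the seed, but should be somewhere around 25
```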

Okay, so I found a very convoluted answer. If someone finds a better one, please let me know.
The first thing, as the user |'-'| commented, was to create a range on a separate sheet.
Since I know that I will not be looking up more than 200 values at once, I filled a range of 200 rows with this formula.
=IF(RAND()<0.25,1,0)
This will create the initial list of random values.
The next step is to generate a randomizer seed, which is basically a random number within the range you created. You can do this with
=RANDBETWEEN(1,200)
This should be on the same column as what you are trying to sum up later.
Next you want to create a dynamic string that you can access via arrayFormula later.
="Randomizer!B"&B12&":B"&B12+B3
In my case I had the 200 random numbers on a sheet called Randomizer. Notice the &, which is how you concatenate strings. In my example B12 is the reference to the =RANDBETWEEN(1,200), and B3 is how many times I want the randomness to occur. It can be any value, as long as the seed plus that count stays within the range of precomputed values.
Finally, refer to this string using =SUM(ARRAYFORMULA(INDIRECT(B13))). INDIRECT lets you refer to a string as a cell reference, and this is how I was able to create a dynamic range to calculate from.
I will say the advantage of this method is that it's super fast to calculate, since the random numbers have been pre-computed.
The idea is that it keeps creating random ranges from the precomputed random numbers and then sums those ranges, essentially calculating random numbers n times.
Hope this helps someone.
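Here is a rough Python sketch of the precomputed-pool idea, in case it helps to see it outside of Sheets. The function names are hypothetical, and unlike =RANDBETWEEN(1,200) the start position here is deliberately limited so that the window never runs past the end of the pool.

```python
import random

def make_pool(size, rng, p=0.25):
    """Precompute `size` values, each 1 with probability p, otherwise 0."""
    return [1 if rng.random() < p else 0 for _ in range(size)]

def sum_window(pool, count, rng):
    """Sum `count` consecutive precomputed values from a random start position."""
    if count > len(pool):
        raise ValueError("count cannot exceed the pool size")
    # Bound the start so that start + count stays inside the pool.
    start = rng.randint(0, len(pool) - count)
    return sum(pool[start:start + count])

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
pool = make_pool(200, rng)
print(sum_window(pool, 100, rng))
```

One thing to keep in mind with this approach: overlapping windows reuse the same precomputed values, so repeated sums are not independent of each other until the pool itself is regenerated.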

Google Sheets: Dense Ranking from sorted values

I have a simple table with 3 columns:
[Name] [Score] [Rank]
For the 3rd column, I'm using the following formula to rank each row according to the score:
=RANK(C9,$C$9:$C$28,0)
The problem is that the formula isn't returning the values I'd expect. For example on the last row it returns 19 when it should be 5.
I found other formulas for ranking (RANK.EQ, etc.) but same issue happens.
Here is the Google Sheet to see it in context:
https://docs.google.com/spreadsheets/d/1P1m7UHPPIcQLQkzpnk-SI1y7-0mhKytCWDjA6FJzFrM/edit?usp=sharing
Any guidance appreciated

The results you want can be achieved with a simple MATCH formula:
=match(round(C9,0),NamedRange1,0)
Provided an array (named NamedRange1 for above) is created, say with:
=sort(unique(round(C9:C28,0)),1,0)

I think the result is as intended. Check this Ranking Wikipedia page (called 'standard competition ranking'). It says:
Standard competition ranking ("1224" ranking)
In competition ranking, items that compare equal receive the same
ranking number, and then a gap is left in the ranking numbers. The
number of ranking numbers that are left out in this gap is one less
than the number of items that compared equal. Equivalently, each
item's ranking number is 1 plus the number of items ranked above it.
This ranking strategy is frequently adopted for competitions, as it
means that if two (or more) competitors tie for a position in the
ranking, the position of all those ranked below them is unaffected
(i.e., a competitor only comes second if exactly one person scores
better than them, third if exactly two people score better than them,
fourth if exactly three people score better than them, etc.).
Thus if A ranks ahead of B and C (which compare equal) which are both
ranked ahead of D, then A gets ranking number 1 ("first"), B gets
ranking number 2 ("joint second"), C also gets ranking number 2
("joint second") and D gets ranking number 4 ("fourth").
What you want is 'dense ranking' and it can be achieved by pnuts's answer or something like this:
set G9 to 1
set G10 to =if(round(C10,0)<round(C9,0), G9+1, G9)
copy G10 and paste it into G11:G28
Sample sheet is here.
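To make the difference between the two ranking schemes concrete, here is a small Python sketch. It assumes the scores are already rounded and sorted in descending order, as the G9/G10 steps above require; the function names are made up for the example.

```python
def competition_ranks(sorted_scores):
    """Standard competition ("1224") ranking: 1 plus the number of strictly higher scores."""
    return [1 + sum(1 for other in sorted_scores if other > score)
            for score in sorted_scores]

def dense_ranks_sorted(sorted_scores):
    """Dense ranking on descending scores: bump the rank only when the score drops."""
    ranks = []
    for i, score in enumerate(sorted_scores):
        if i == 0:
            ranks.append(1)              # "set G9 to 1"
        elif score < sorted_scores[i - 1]:
            ranks.append(ranks[-1] + 1)  # score dropped: G9+1
        else:
            ranks.append(ranks[-1])      # tie: same rank as the row above
    return ranks

scores = [98, 95, 95, 90]
print(competition_ranks(scores))   # [1, 2, 2, 4]
print(dense_ranks_sorted(scores))  # [1, 2, 2, 3]
```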

Thanks to @pnuts and @sangboklee for your solutions. I think I have a good solution now. It is pnuts's solution, just simplified:
=match(round($C9,0),sort(unique(round($C$9:$C$28,0)),1,false),0)
This essentially "embeds" the created array within a single formula, that can be applied to all rows. And as a bonus, the values don't even have to be sorted.
Please check for correctness folks, but I think this works. I've updated the linked Google Sheet from the original question description (it's "Solution 2b").
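For anyone who wants to sanity-check the logic, here is the same idea as a short Python sketch: round the scores, build a descending list of unique values, and look up each score's position in it. The helper round_half_up is an assumption made so that rounding matches ROUND for non-negative scores (Python's built-in round() rounds halves to even).

```python
import math

def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves rounding up."""
    return math.floor(x + 0.5)

def dense_ranks(scores):
    """Dense rank of each score; the input does not need to be sorted."""
    rounded = [round_half_up(s) for s in scores]
    ladder = sorted(set(rounded), reverse=True)               # sort(unique(round(...)),1,false)
    position = {value: i + 1 for i, value in enumerate(ladder)}
    return [position[value] for value in rounded]             # match(round(score),ladder,0)

print(dense_ranks([90.2, 85, 85.4, 70, 90]))  # [1, 2, 2, 3, 1]
```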

Logic of fixed number of records per page alphabetical pagination structure

The easiest way to explain this question is by example. See the following two images of the browse links on a particular website:
Basically, the way that it works is that there are a set number of records per page, and it works "backwards" in some manner to break down the browse pages into an appropriate number of ranges. So when there are relatively more records (as in the case of those starting with an "A"), there are more ranges, and more pages, than when there are fewer records ("X"). I am developing in Ruby on Rails, but would also be interested in some perspective on the logic here. Thanks!

The simplest way to visualize this is to think about the "deepest" groups all having 10 elements each, so split all your records into groups of 10.
Now, each group of 10 should be referenced to by an upper level group.
Each group of 10 of those should be referenced to by an even higher level group.
Finally, you'll reach the highest level group.
For any group, you take the first n letters of the first and last elements in its tree, where n is its depth. So for a group at depth 1, you take the 1st character of the very first element (recursively go deeper until you are at the sparsest branches) as the start of its range and the 1st character of the last element as the end of its range.
I could mock it up in PHP if that would help you get what you need, in case you can't quite grasp the concept here.
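In the meantime, here is a rough two-level sketch of the idea in Python (not PHP or Ruby, just for illustration). The function names are hypothetical, and it only builds two levels: pages of records at depth 2, grouped into top-level sections at depth 1. A deeper tree would repeat the same grouping step.

```python
def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def range_label(records, depth):
    """Label a group by the first `depth` letters of its first and last records."""
    return f"{records[0][:depth]}-{records[-1][:depth]}"

def browse_tree(records, page_size=10):
    """Return [(section_label, [page_label, ...]), ...] for sorted records."""
    pages = chunk(sorted(records), page_size)  # deepest groups: the actual pages
    sections = chunk(pages, page_size)         # each section references up to page_size pages
    tree = []
    for section in sections:
        flat = [record for page in section for record in page]
        tree.append((range_label(flat, 1),
                     [range_label(page, 2) for page in section]))
    return tree

names = ["Baker", "Adams", "Clark", "Allen", "Brown"]
print(browse_tree(names, page_size=2))
# [('A-B', ['Ad-Al', 'Ba-Br']), ('C-C', ['Cl-Cl'])]
```

Letters with more records naturally end up with more pages and more ranges, because the page size is fixed and the ranges are derived from whatever records land on each page.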
