How does Monte Carlo Exploring Starts work? - machine-learning

Flow Chart
I'm having trouble understanding the 4th and 5th step in the flowchart.
Am I right to say that the Q value of a particular state and action is the same as the state-action pair value of that same state and action?
For the 4th step, does 'calculate return for a state-action pair' mean the same as finding the state-action pair value of that particular state?
For the 5th step, 'update the Q function by taking the average of returns' is confusing. From what I understand, the Q function is basically the state-action pair values put in a table (the Q table). To update it means to adjust the state-action pair values of the individual states and their respective actions (e.g. state 1 action 1, state 3 action 1, state 3 action 2, and so on). I'm not sure what 'average of returns' means, though. Is it asking me to take the average of the returns after x episodes? (From my understanding, the return is the sum of rewards in one episode, so AVG = sum of returns / x.) And what do I do with that average? I'm a little confused when they say 'update the Q function', because the Q function consists of many parameters that must be updated (the individual state-action pair values), and I'm not sure which one they are referring to.
Also, what is the point of calculating the average of returns, since the state-action pair value for a particular state and action will always be the same (e.g. if I always take action 3 in state 4, I will always get value = 2)?
Thanks :)
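A minimal Python sketch of steps 4 and 5 may help here. It assumes first-visit Monte Carlo with each episode given as a list of (state, action, reward) tuples; the function name mc_update, the dictionaries, and the toy episodes are illustrative and not taken from the flowchart. The return for a state-action pair is the discounted sum of rewards from that step to the end of the episode, and each pair's Q value is the running average of the returns observed for that pair only.

```python
from collections import defaultdict

returns_sum = defaultdict(float)   # total of returns seen per (state, action)
returns_count = defaultdict(int)   # number of returns seen per (state, action)
Q = {}                             # the Q table: one entry per (state, action)


def mc_update(episode, gamma=1.0):
    """Update Q from one finished episode (first-visit Monte Carlo)."""
    # Remember the first time step at which each pair appears.
    first_visit = {}
    for t, (state, action, _reward) in enumerate(episode):
        first_visit.setdefault((state, action), t)

    G = 0.0  # return accumulated backwards from the end of the episode
    for t in reversed(range(len(episode))):
        state, action, reward = episode[t]
        G = gamma * G + reward                      # step 4: return from time t
        if first_visit[(state, action)] == t:
            key = (state, action)
            returns_sum[key] += G
            returns_count[key] += 1
            Q[key] = returns_sum[key] / returns_count[key]  # step 5: average


# Two hypothetical episodes that start with the same pair (s1, a1)
# but continue with different actions in s2.
mc_update([("s1", "a1", 0.0), ("s2", "a1", 1.0)])
mc_update([("s1", "a1", 0.0), ("s2", "a2", 3.0)])
print(Q)
```

In this toy run the pair (s1, a1) gets return 1 in the first episode and 3 in the second, because the action taken later in s2 differs, so its averaged value is 2. That variation between episodes is the reason the returns are averaged instead of taken from a single episode.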

Related

How to find optimal scores in a score database in Google Sheets?

I have a database of scores (C2:C937) which range from 4000-6500. I have the records for each score (D2:D937) and the differential between the 2 (E2:E937). I am trying to find a formula to tell me which score is closest to going over 5000. I was thinking to have a formula that checks the sums of the C column values and their corresponding E column values, and then sorting by which values have the most to improve. I don't know if it's possible to make a loop of sorts in sheets (kind of like a for loop in Java).
The image below shows a set of values as explained above. The result I would be looking for would return whichever value has the highest chance of going over 5000, that is, the one that is already the closest.
Here is a set of expected results. In this dataset, "Yanmega 4" would never be returned since it is already over 5000 in the C column. "Zangoose 1" and "Yanmega 3" would be first since they are close to going over 5000 and have enough leeway above 5000 (~400 points of leeway above it). "Zangoose 2" would be next since it has ~300 over 5000 meaning getting over 5000 should be easier than some of the others. "Zangoose 4" would be next since only 28 points need to be gained to get more than 5000, and there are 105 points of wiggle room as well. "Zangoose 3" is last since it only has 15 points of wiggle room based upon the current record.
Here is what I had started. The first line is convoluted and was my attempt at creating a loop. The second line is a good start, but I don't know where to go from there. Both lines are unfinished.
=IF(IF(LARGE($C$2:$C$937,1)<5000,LARGE($C$2:$C$937,1),IF(LARGE($C$2:$C$937,2)<5000,LARGE($C$2:$C$937,2),LARGE($C$2:$C$937,3))+INDIRECT("E"&match(E954,$E$1:$E$940,0))<=5000,))
=IF(LARGE($C$2:$C$937,1)<5000,)
First, I think your logic is not correct. I would rather have Zangoose 4 first, then Yanmega 3, then Zangoose 2 or Zangoose 3...
You can use a formula like:
=INDEX(B3:B8,MATCH( SMALL( IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100),ROW($A$1:$A$6) ), IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100),0),)
and enter it as an array formula (with Ctrl+Shift+Enter) in as many cells as needed (paste the formula in the top cell, fix it, enter it as an array formula, then select all destination cells, press F2 to edit the formula, and press Ctrl+Shift+Enter again),
adjusting C3:C8 to your C928:C933 and so on, and making ROW($A$1:$A$6) as large as your data.
The problem is that too many calls are made recursively, and with a large number of rows things will get very slow. You would do better to use an extra column with the formula
=IF($C$3:$C$8<5000,(5000-$C$3:$C$8)*100/$D$3:$D$8,100)
and then Filter/sort the data with VBA or Filters (there are excellent youtube videos on the subject, as long as company policies (if any) allow usage of VBA).
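The helper-column idea can be sketched outside the spreadsheet as well. The Python below is only an illustration: the rows are made-up data, and the names closeness and d_value are assumptions. It applies the same expression as the helper formula, (5000 - C) * 100 / D for scores under 5000 and 100 otherwise, and then sorts ascending so the smallest value comes first.

```python
THRESHOLD = 5000

# Hypothetical rows: (name, score from column C, value from column D).
rows = [
    ("Entry A", 4972, 5105),
    ("Entry B", 4800, 5400),
    ("Entry C", 5100, 5200),  # already over the threshold
]


def closeness(score, d_value):
    """Mirror of the helper-column formula: smaller means closer to 5000."""
    if score < THRESHOLD:
        return (THRESHOLD - score) * 100 / d_value
    return 100  # same sentinel the formula uses for scores at or over 5000


# Sort ascending by the helper value, like sorting on the extra column.
ranked = sorted(rows, key=lambda row: closeness(row[1], row[2]))
ranked_names = [name for name, _score, _d in ranked]
print(ranked_names)
```

Computing the helper value once per row and sorting on it avoids re-evaluating the IF expression inside every SMALL/MATCH call, which is the slowdown the answer warns about.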

I would like a numeric list to indicate movement when a value in the same row changes position

An example of what I'm asking for
I'm asking if there's a way to "mark" in column B when a value changes position in column C.
For example, if new data came in and EXAMPLE1 moved to cell C8, the number in cell B8 would show that it has a lower position.
Sheet : https://docs.google.com/spreadsheets/d/1pLYqhkLuAS8ZgjPnsZTZyU5yuPXVgm0IRUPIF2FBKkM/edit?usp=sharing
Google Sheets formulas work best with numeric values. From the screenshot you provided, I saw players, and I assume each player comes with other numeric values such as score, age, weight, etc.
With the aid of those numeric values, you can derive a ranking by finding who is ranked number 1.
Here's an example:
As you can see from the picture above, assuming that the greater the age, the higher the rank, you can use COUNTIF to count how many other values are bigger, and combine it with COUNT for the subtraction. That builds the sequence from number 1 for the oldest down to the last place for the youngest.
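The counting idea can be sketched in Python as follows. This is an illustrative sketch, not the answerer's formula: the player names, the values, and the function names are hypothetical. A rank is computed as 1 plus the number of larger values (the COUNTIF-style count), and movement is the old rank minus the new rank, so a negative number means the entry dropped.

```python
def ranks_by_value(values):
    """Rank each name as 1 + how many other values are strictly larger."""
    return {
        name: 1 + sum(1 for other in values.values() if other > value)
        for name, value in values.items()
    }


def movement(old_values, new_values):
    """Positive = moved up, negative = moved down, 0 = unchanged."""
    old_ranks = ranks_by_value(old_values)
    new_ranks = ranks_by_value(new_values)
    return {name: old_ranks[name] - new_ranks[name] for name in new_values}


# Hypothetical snapshots before and after new data arrives.
before = {"EXAMPLE1": 25, "EXAMPLE2": 30, "EXAMPLE3": 20}
after = {"EXAMPLE1": 18, "EXAMPLE2": 30, "EXAMPLE3": 20}

moves = movement(before, after)
print(moves)
```

In a sheet, the same comparison would need the previous ranks stored somewhere (for example in a helper column), since a formula alone cannot remember what a cell used to contain.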
Hope you find it helpful

Excel formula for counting combinations of checked boxes

I need help with a formula that counts unique combinations of events from a checkbox form. Users check how many times they completed a task X, Y or Z. The program counts how many events were logged (under '#'), then counts the unique combinations of events and spits out the combo count (under 'Combinations').
For the sake of clarification, I'll refer to each category by its name and each numbered checkbox as X_1, X_2, etc.
Here are the design criteria:
Count unique combinations between two separate events (e.g. [X_1,Y_1])
Once a single instance of an event is counted, you cannot use it again (e.g. X_1 cannot be paired up twice => [X_1,Y_1], [X_1,Y_2])
However, you can pair multiple instances of the same event to other unique events (e.g. [X_1,Y_1], [X_2,Y_2], [X_3,Z_1])
Combinations cannot be made between multiple instances of the same event (i.e. [X_1,X_2] is not valid)
So in the example above, the correct number of combinations should be 3 (i.e. three unique combos of events with each individual event counted only once). I've built two formulas. The first (H2) uses INT and COUNTIF functions to count number of checked boxes column-by-column. It yields an incorrect answer of two.
=INT(COUNTIF(C2:C4,true)/2)+INT(COUNTIF(D2:D4,true)/2)+INT(COUNTIF(E2:E4,true)/2)+INT(COUNTIF(F2:F4,true)/2)+INT(COUNTIF(G2:G4,true)/2)
The second (H3) uses the INT and SUM function to estimate a total from the data container in column A. It yields an incorrect answer of 4.
=INT(SUM($A$2:$A$18)/2)
I believe the MOD function may work well in addition to the COUNTIF function. Go column-by-column, count unique combinations, and any remainder will count towards finding an odd event in the next column.
Any help is appreciated. Thank you for reading.
Does this formula work, in H5? I think it gives the correct number of possible pairings.
=MIN((MAX(1,C10)+MAX(1,D10)+MAX(1,E10)+MAX(1,F10)+MAX(1,G10) - COUNT(C10:G10)),
INT(COUNTIF(C2:G9,TRUE)/2))
It uses a helper row, which totals the number of different users for each possible attempt, in cells C10:G10. This only needs to be entered once.
The formula in C10 is:
=COUNTIF(C2:C9,true)
and it needs to be dragged across under all of the checkboxes, i.e. from C10 to G10.
Here is my result.
It would be possible to eliminate the helper row, doing the totals below the checkboxes, but it would make the formula very messy I think.
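The design criteria from the question can also be checked outside the spreadsheet. The Python sketch below is one reading of those criteria, not a translation of the formula above: it assumes the input is simply the number of checked boxes per event type, and the function name max_cross_pairs is hypothetical. When every pair must combine two different event types and each checked box is used at most once, the largest possible number of pairs is the smaller of "half of all boxes, rounded down" and "all boxes except those of the most frequent type".

```python
def max_cross_pairs(counts):
    """Largest number of pairs that each join two different event types.

    counts maps an event type (e.g. "X") to its number of checked boxes.
    """
    total = sum(counts.values())
    if total == 0:
        return 0
    largest = max(counts.values())
    # Either we run out of boxes (total // 2), or the most frequent type
    # runs out of partners from the other types (total - largest).
    return min(total // 2, total - largest)


# The question's example pairing: [X_1,Y_1], [X_2,Y_2], [X_3,Z_1].
example = max_cross_pairs({"X": 3, "Y": 2, "Z": 1})
print(example)
```

With three X events, two Y events and one Z event this gives 3, which matches the count the question expects.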

VLOOKUP with wildcard and find Nth occurrence?

I'm setting up a Google Sheet that will calculate the most effective purchase size of specific agricultural inputs (fertilizer, chemical, etc). I set up the price data in its own tab with a separate row for each input name + size.
To keep it easy for the user I'd like to require only the input name, # of gallons per acre, and acres and then have a formula spit out the total cost and most effective purchase (bulk if > X gallons, X # of 250 gallon containers + X 55 drums, etc). How can I use the input name plus a wildcard to find the appropriate purchase size?
https://docs.google.com/spreadsheets/d/1bMOPuk2qhmVuJT7vE_ni3KFxfcgKvwTwkM4p50xQF_0/edit?usp=sharing
I tried:
=ArrayFormula(iferror(INDEX('Data (Current)'!H2:H,SMALL(IF($A2&"*"='Data (Current)'!A2:A,ROW('Data (Current)'!A2:A)-1),1))))
...but it returns blank so I'm guessing the reference $A2&"*" to the input name isn't working properly. When I replace it with a string found in the 'Data (Current)' tab then it works fine.
=ArrayFormula(iferror(INDEX('Data (Current)'!H2:H,SMALL(IF($A2&"*"='Data (Current)'!A2:A,ROW('Data (Current)'!A2:A)-1),1))))
I expected the output to be the smallest value (in this case I think it's 5). Then when I change the last number to 2 or 3 it will find the next smallest value, in this case, 55 or 250. Then I can use simple formulas to interact with that and finish the spreadsheet.
Unfortunately, the actual output is nothing, or "".
Sorry if this isn't what you're looking for, as I had some trouble understanding your question.
Presuming what you want is essentially this:
I want to buy Y quantity of item.
I can buy item at cheaper prices if I buy in higher quantities, although sometimes they have a minimum order quantity.
What is the most optimal combination of the options I have to minimize the price I pay?
I'm unsure if there's a simple solution for this within Google Sheets alone. This might be treading more into Apps Script territory.
However, that's not to say that it's impossible. I've "brute-forced" a solution to the above with an iterative-like approach, for the "Chelated Calcium" product: https://docs.google.com/spreadsheets/d/1YSBiSx0IMr4T0R11Dqb-tqOhH4AOTTAWeH2yQfT4X5w
First, list the data in a standardized manner. This includes giving each same product something easy to look it up by. For example, on the Data (Current) tab, I've added 3 columns:
Product Common Name - This is used so that all items of different quantities can be found easily, without needing wildcards.
Gallons - Much easier to parse the data if it's explicitly laid out.
Minimum Order Gallons - This is your threshold for Bulk. I've set it at an arbitrary 20,000 gallons for Chelated Calcium.
The data here is ordered least-effective first. How you do this will be up to you. In this case, I sorted by the Retail Cost Per Ounce parameter from your sheet, highest first. This eliminates any guesswork about which of the options are most effective, since you can just traverse your options in order. Note: The way I've laid out the formulas will only work IFF the same products are directly next to each other. It won't work if there are other products between them.
On the Field Level Tool tab, standardize your inputs to the Gallons unit. I do this in Total Gallons Needed column (I multiply anything with a "GAL" with 1, and "QUART" with 0.25).
For each item, determine the row numbers where the product begins and ends. This is marked by columns L (Least Efficient Index) and M (Most Efficient Index). I got these results by using the MATCH function.
Set up the iterations, from 0 to N-1. On this sheet, I've set up N=5 iterations, which means that it can traverse 5 different options of the same product only. Since Chelated Calcium only has 4 different options (5 Gal, 30 Gal, 250 Gal, Bulk), 5 is more than enough for this product. If you have products with more options, you may want to have more iterations.
The iterations are on the right side of the Field Level Tool tab.
In your case, you might want to put it on a different tab since the place I put it makes the file look very messy.
In each iteration, I perform the following steps:
To Fulfill - How many gallons still need to be purchased by this iteration?
ThisIndex - What is the row number of this iteration? This is determined by Most Efficient Index - Iteration Number. Remember that since we sorted in order of ascending efficiency, this means that the iteration starts with the most efficient option it can find first. There is a check to make sure that it only outputs a value if it is between the range [Least Efficient Index, Most Efficient Index]. Otherwise, it will be blank to avoid miscalculations by intruding into another product in the Data (Current) tab.
Retail Price, Minimum Gals, Gallons per Order - Simple data extraction for easy usage in the iteration, using INDEX (and indirectly, MATCH by virtue of ThisIndex).
Order - This formula does a couple of things, outlined below:
It checks whether there still remains a valid choice of product at this iteration. It does this by checking whether ThisIndex still exists. If the product doesn't exist, then it will be nulled. This is accomplished by using the IF function.
It will determine if there is a minimum threshold that must be met to purchase this choice. You can see in the 0th iteration, for example, that there is a minimum quantity of 20,000 gallons. If To Fulfill quantity is greater than or equal to the threshold OR there is no threshold, then a purchase is quantified by this column. The mathematics are simply to divide the To Fulfill amount by the Gallons per Order amount to determine the number of orders of this particular product choice. If there is a threshold but the To Fulfill amount doesn't meet it, then this iteration is skipped with a 0 order value.
If the item is already on its least efficient choice (ThisIndex == Least Efficient Index), it will do a CEILING function to ensure that the order is fulfilled. If not, it will do a FLOOR function instead. This is because you cannot order 3.5 units of an item, so they have to be rounded either up or down.
Expenditure - This is simply Order multiplied by the Retail Price, or how much money you spend in this iteration.
Remaining - How much of the product is left unfulfilled at the end of this iteration, to be used as To Fulfill for the next iteration.
Note: If you see formulas that are of the form =IF(ThisIndex, [calculations_here],), that is simply a check to nullify that calculation if ThisIndex is invalid.
Copy the iterations as many times as you want to the right. Something nice to do is to force the iterations to do a CEILING on the very last one to ensure that you never under-buy.
Generate a user-readable string for the purchase suggestion. You can see this on the Suggested Purchase column.
Calculate the Gallons Bought with a simple SUMPRODUCT over all the iterations.
Calculate the total expenditure with a simple SUM over all the iterations.
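The iteration described in the steps above can be sketched in Python. This is a simplified illustration and not the sheet's actual formulas: the option sizes follow the ones named in the answer, but every price, the assumption that bulk is sold by the single gallon, and all names in the code are hypothetical. Options are visited from most efficient to least efficient; each one is skipped if its minimum order is not met, otherwise as many whole units as fit are ordered (rounded down), and the last option rounds up so the order is never under-filled.

```python
import math

# Hypothetical options, most efficient first. Prices are made up.
OPTIONS = [
    {"name": "Bulk", "gallons": 1, "price": 10.0, "min_gallons": 20000},
    {"name": "250 Gal", "gallons": 250, "price": 3000.0, "min_gallons": 0},
    {"name": "30 Gal", "gallons": 30, "price": 400.0, "min_gallons": 0},
    {"name": "5 Gal", "gallons": 5, "price": 75.0, "min_gallons": 0},
]


def plan_purchase(gallons_needed, options):
    """Return (orders, total_cost, remaining) for a greedy purchase plan."""
    remaining = gallons_needed      # "To Fulfill" for the current iteration
    orders = []
    total_cost = 0.0
    for index, option in enumerate(options):
        is_last = index == len(options) - 1
        # Skip when nothing is left or the minimum order is not met.
        if remaining <= 0 or remaining < option["min_gallons"]:
            continue
        units = remaining / option["gallons"]
        # Round up only on the least efficient option, otherwise round down.
        count = math.ceil(units) if is_last else math.floor(units)
        if count == 0:
            continue
        orders.append((option["name"], count))
        total_cost += count * option["price"]        # "Expenditure"
        remaining -= count * option["gallons"]       # "Remaining"
    return orders, total_cost, remaining


print(plan_purchase(310, OPTIONS))
print(plan_purchase(263, OPTIONS))
```

A negative remaining value means slightly more was bought than needed, which is what the final round-up is for. Like the sheet, this greedy pass is not guaranteed to find the cheapest combination in every case; it just walks the options in efficiency order.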
I hope this is what you were looking for. Regardless, it's at least a fun exercise on how much you can abuse Sheets. ;)

Google Sheets: Dense Ranking from sorted values

I have a simple table with 3 columns:
[Name] [Score] [Rank]
For the 3rd column, I'm using the following formula to rank each row according to the score:
=RANK(C9,$C$9:$C$28,0)
The problem is that the formula isn't returning the values I'd expect. For example on the last row it returns 19 when it should be 5.
I found other formulas for ranking (RANK.EQ, etc.) but same issue happens.
Here is the Google Sheet to see it in context:
https://docs.google.com/spreadsheets/d/1P1m7UHPPIcQLQkzpnk-SI1y7-0mhKytCWDjA6FJzFrM/edit?usp=sharing
Any guidance appreciated
The results you want can be achieved with a simple MATCH formula:
=match(round(C9,0),NamedRange1,0)
Provided an array (named NamedRange1 for above) is created, say with:
=sort(unique(round(C9:C28,0)),1,0)
I think the result is as intended. Check this Ranking Wikipedia page (called 'standard competition ranking'). It says:
Standard competition ranking ("1224" ranking)
In competition ranking, items that compare equal receive the same
ranking number, and then a gap is left in the ranking numbers. The
number of ranking numbers that are left out in this gap is one less
than the number of items that compared equal. Equivalently, each
item's ranking number is 1 plus the number of items ranked above it.
This ranking strategy is frequently adopted for competitions, as it
means that if two (or more) competitors tie for a position in the
ranking, the position of all those ranked below them is unaffected
(i.e., a competitor only comes second if exactly one person scores
better than them, third if exactly two people score better than them,
fourth if exactly three people score better than them, etc.).
Thus if A ranks ahead of B and C (which compare equal) which are both
ranked ahead of D, then A gets ranking number 1 ("first"), B gets
ranking number 2 ("joint second"), C also gets ranking number 2
("joint second") and D gets ranking number 4 ("fourth").
What you want is 'dense ranking' and it can be achieved by pnuts's answer or something like this:
set G9 to 1
set G10 to =if(round(C10,0)<round(C9,0), G9+1, G9)
copy G10 and paste it into G11:G28
Sample sheet is here.
Thanks to @pnuts and @sangboklee for your solutions. I think I have a good solution now. It is pnuts's solution, just simplified:
=match(round($C9,0),sort(unique(round($C$9:$C$28,0)),1,false),0)
This essentially "embeds" the created array within a single formula, that can be applied to all rows. And as a bonus, the values don't even have to be sorted.
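For comparison, here is a small Python sketch of the two ranking schemes discussed in this thread. The scores are made up and the function names are illustrative. Dense ranking follows the same idea as the MATCH/SORT/UNIQUE formula: round the scores, take the distinct values in descending order, and use each score's position in that list. Standard competition ranking is 1 plus the number of strictly higher scores. Note that Python's round() rounds exact halves to the nearest even number, which can differ from the spreadsheet ROUND function, so the sample avoids values ending in .5.

```python
def dense_ranks(scores):
    """Dense ("1223") ranking on rounded scores, highest score first."""
    rounded = [round(score) for score in scores]
    distinct_desc = sorted(set(rounded), reverse=True)
    # Position in the distinct, descending list is the dense rank.
    return [distinct_desc.index(value) + 1 for value in rounded]


def competition_ranks(scores):
    """Standard competition ("1224") ranking on rounded scores."""
    rounded = [round(score) for score in scores]
    return [1 + sum(1 for other in rounded if other > value) for value in rounded]


# Hypothetical scores; the first three round to the same value.
scores = [98.2, 97.6, 97.9, 90.1, 85.0]

print(dense_ranks(scores))
print(competition_ranks(scores))
```

The last score gets rank 3 under dense ranking and rank 5 under competition ranking, which is the same kind of gap the question ran into with RANK.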
Please check for correctness folks, but I think this works. I've updated the linked Google Sheet from the original question description (it's "Solution 2b").
