How do I create a function that will run input-dependent sums or outcomes in Google Sheets? - google-sheets

I have built a spreadsheet for a game I play. Its purpose is to demonstrate the amount of points earned after each round. The amount of points earned is dependent on the win/loss of that round, and then the points compound or retract accordingly.
There are 24 matches played in this game, with a possible result of win or loss for each round, so there are 48 different individual round results that can occur. The complicated thing is that each round's points depend on the win/loss of that round as well as on the previous round's individual earnings. For example, if Round 2 is won and earns 120 points, and Round 3 is won, it earns 150 points. But if Round 3 is lost, 120 points is earned.
I am looking to build a program or function that will compute the final score for every one of the 16,777,216 (2^24) possible combinations of outcomes.
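Roughly, the brute-force enumeration I have in mind would look like the sketch below (Python pseudocode rather than a Sheets formula; the per-round point values are placeholders, since the real payout logic lives in my sheet):

from itertools import product

ROUNDS = 24

def next_points(prev, won):
    # Placeholder rule standing in for my sheet's real payout logic:
    # a win builds on the previous round's earnings (e.g. 120 -> 150),
    # a loss repeats the previous round's earnings (e.g. 120 -> 120).
    return prev + 30 if won else prev

def final_score(outcomes):
    points = 0
    for won in outcomes:          # outcomes is a tuple of 24 True/False values
        points = next_points(points, won)
    return points

# All 2**24 = 16,777,216 win/loss sequences and their final scores
scores = [final_score(o) for o in product((True, False), repeat=ROUNDS)]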
Thanks in advance!

Related

How do I combine two electromagnetic readings to predict the position of a sensor?

I have an electromagnetic sensor and electromagnetic field emitter.
The sensor will read power from the emitter. I want to predict the position of the sensor using the reading.
Let me simplify the problem: suppose the sensor and the emitter are in a 1-dimensional world where there is only a position X (not X, Y, Z), and the emitter emits power as a function of distance squared.
From the painted image below, you will see that the emitter is drawn as a circle and the sensor is drawn as a cross.
E.g. if the sensor is 5 meters away from the emitter, the reading you get on the sensor will be 5^2 = 25. So the correct position will be either 0 or 10, because the emitter is at position 5.
So, with one emitter, I cannot know the exact position of the sensor. I only know that there is a 50% chance it's at 0 and a 50% chance it's at 10.
So if I have two emitters like the following image:
I will get two readings, and I can know exactly where the sensor is. If the readings are 25 and 16, I know the sensor is at 10.
So from this fact, I want to use 2 emitters to locate the sensor.
Now that I've explained the situation, my problems are these:

1. The emitter has a more complicated function of the distance; it's not just distance squared. It also has noise, so I'm trying to model it using machine learning.
2. In some areas the emitter doesn't work well. E.g. if you are between 3 and 4 meters away, the emitter will always give you a fixed reading of 9 instead of going from 9 to 16.
3. When I train the machine learning model with 2 inputs, the prediction is very accurate. E.g. if the input is 25,36, the output will be position 0. But it means that after training, I cannot move the emitters at all. If I move one of the emitters further apart, the prediction breaks immediately, because the reading will be something like 25,49 when the right emitter moves 1 meter to the right, and the prediction can be anything because the model has not seen this input pair before. And I cannot afford to train the model on all possible distances between the 2 emitters.
4. The emitters can be slightly non-identical. The difference will be in scale, e.g. one of the emitters might give a 10% bigger reading. But you can ignore this problem for now.
My question is: how do I make the model work when the emitters are allowed to move? Give me some ideas.
Some of my ideas:

1. I think I have to figure out the positions of the two emitters relative to each other dynamically. But after knowing the position of both emitters, how do I tell that to the model?
2. I have tried training each emitter separately instead of pairing them as input. But that means there are many positions that cause conflicts: when you get reading=25, the model will predict the average of 0 and 10, because both are valid positions for reading=25.
3. You might suggest training to predict distance instead of position. That would be possible if there were no problem 2, but because of problem 2, the prediction between 3 and 4 meters away will be wrong. The model will get 9 as input, and the output will be the average distance, 3.5 meters or somewhere between 3 and 4 meters.
4. Use the model to predict a position probability density function instead of predicting the position. E.g. when the reading is 9, the model should predict a uniform density function from 3 to 4 meters, and then you can combine the 2 density functions from the 2 readings somehow. But I think it's not going to be as accurate as modeling the 2 emitters together, because the density function can be quite complicated; we cannot assume a normal distribution or even a uniform distribution.
5. Use some kind of optimizer to predict the position separately for each emitter, based on the assumption that both predictions must be the same. If the predictions are not the same, the optimizer must try to move the predictions so that they land on exactly the same point. Maybe reinforcement learning where the actions are "move left", "move right", etc.
I told you my ideas so that they might evoke some ideas in you. These are my best attempts so far, but none of them solves the issue elegantly.
So ideally, I would want an end-to-end model that is fed the 2 readings and gives me the position, even when the emitters have been moved. How would I go about that?
PS. The emitters are only allowed to move before usage. During usage or prediction, the model can assume that the emitters will not be moved anymore. This allows you time to run an emitter-position calibration algorithm before usage. Maybe this is a helpful thing for you to know.
You're confusing memoizing a function with training a model; the former is merely recalling previous results, while the latter is the province of AI. To train with two emitters, you need to give the model useful input data and appropriate labels (right answers), and design your model topology so that it can be trained to a useful functional response for cases it has never seen.
Let the first emitter be at position 0 by definition. Your data then consists of the position of the second emitter and the two readings. The label is the sensor's position. Your given examples would look like this:
emit2  read1  read2  sensor
1      25     36     0
1      25     16     5
2      25     49     0
1.5    25     9      5      (any distance 3 < d < 4 always reads as 3^2 = 9)
Since you know that you have a squared relationship in the underlying physics, you need to include quadratic capability in your model. To handle noise, you'll want some dampening capability, such as an extra node or two in a hidden layer after the first. For more complex relationships, you'll need other topologies, non-linear activation functions, etc.
Can you take it from there?
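If it helps, a minimal sketch of that setup (assuming scikit-learn; the four training rows are just the illustrative examples from the table above, and the hidden-layer sizes and other settings are placeholders you would tune on real calibration data):

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Each row: [emit2_position, reading1, reading2]; label: sensor position.
X = np.array([
    [1.0, 25.0, 36.0],
    [1.0, 25.0, 16.0],
    [2.0, 25.0, 49.0],
    [1.5, 25.0,  9.0],
])
y = np.array([0.0, 5.0, 0.0, 5.0])

# Quadratic features give the model the squared-distance relationship directly;
# the small hidden layers provide extra capacity for noise and dead zones.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8, 4), max_iter=5000, random_state=0),
)
model.fit(X, y)

print(model.predict([[1.0, 25.0, 36.0]]))  # should land near 0 given enough real data

In practice you would generate many more rows spanning the emitter spacings you care about, so the model can interpolate between spacings it has not seen exactly.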

Google Sheets - Incorrect result

I am confused with a Google Sheet I created.
https://docs.google.com/spreadsheets/d/1k0osuq_WFztRxNGcxXBhG5Hi6LSrHj8A5RCEwMnQZUs/edit?usp=sharing
These are bike times taken to complete 90 km, split into 5 km chunks.
The interesting thing is that I input the time for each 5 km chunk and calculate the speed in km/h from it. Then I calculate the total time taken using SUM and the average speed using AVERAGE. However this is incorrect: for Zell-am-See 2017 I get an average speed of 31 km/h when it should be around 28 km/h.
I can't seem to find the error. Initially I thought it was due to rounding, but even if I change the data format to scientific nothing changes.
It is an incorrect assumption about the mathematics. You cannot average the averages: you need total distance over total time, because the lower speeds affect the average more than the higher speeds do, since you spend more time at them. You might want to Google "harmonic mean" for more.
For example, suppose you go 120 km at 40 km/h, and then ride back at 30 km/h. You have traveled 240 km in 7 hours. Your average rate is under 35 km/h.
EDIT: Total distance over total time is the way to go. But if you want to satisfy yourself that it is the harmonic mean you want, add a column F to the right of your speeds, and in F3 say =1/E3, and drag that on down through F20. In F21 say =1/AVERAGE(F3:F20), and behold you have the harmonic mean, which is the desired answer.
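If you want to check the arithmetic outside the sheet, here is a small sketch (Python, with made-up 5 km split times rather than your actual data):

# Hypothetical times (hours) for equal 5 km splits; not the sheet's real data.
times_h = [0.125, 0.15, 0.2, 0.18]
dist_km = 5.0

speeds = [dist_km / t for t in times_h]                   # per-split speeds in km/h

naive_avg = sum(speeds) / len(speeds)                     # what AVERAGE over the speed column gives
true_avg  = (dist_km * len(times_h)) / sum(times_h)       # total distance / total time
harmonic  = len(speeds) / sum(1 / v for v in speeds)      # harmonic mean of the speeds

print(naive_avg, true_avg, harmonic)  # true_avg equals harmonic; naive_avg comes out higher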

What kind of heuristics for BFS use to solve this 'game' (find path)?

I want to solve a 'game'.
I have 5 circles, and we can rotate each circle left or right (by 90 degrees).
Example:
Goal: 1,2,3,...,14,15,16
Example starting situation: 16,15,14,...,3,2,1
I'm using BFS to find a path, but I can't come up with a good heuristic function (none of my attempts works well). I tried Manhattan distance and others... (Maybe the idea is fine but something is wrong with my implementation.) Please help!
One trick you might try is to do a breadth-first search backward from the goal state. Stop it early. Then you can terminate your (forward from the initial state) search once you've hit a state seen by the backward search.
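A minimal sketch of that meet-in-the-middle idea (Python; neighbors(state) is a hypothetical function returning the states reachable by one quarter-turn of any circle, states are assumed hashable, and quarter-turns are reversible, so the same neighbors function works backward):

from collections import deque

def backward_frontier(goal, neighbors, depth):
    # BFS backward from the goal, stopped early; returns state -> distance to goal.
    seen = {goal: 0}
    queue = deque([goal])
    while queue:
        state = queue.popleft()
        if seen[state] == depth:
            continue
        for nxt in neighbors(state):
            if nxt not in seen:
                seen[nxt] = seen[state] + 1
                queue.append(nxt)
    return seen

def forward_search(start, goal, neighbors, backward_depth=6):
    # Forward BFS that terminates as soon as it reaches a state the backward search saw.
    back = backward_frontier(goal, neighbors, backward_depth)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in back:
            return seen[state] + back[state]   # total moves via the first meeting state
        for nxt in neighbors(state):
            if nxt not in seen:
                seen[nxt] = seen[state] + 1
                queue.append(nxt)
    return None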
The sum of Manhattan distances from pieces to their goals is a decent baseline heuristic for the forward A* search. You can do rather better by adding the number of turns needed to get 1-8 into their places to the number of turns needed to get 9-16 into theirs; each of these state spaces is small enough (half a billion states or so) to precompute.
One heuristic that you could use is the cumulative number of turns that it takes to move each individual segment to its designated spot. The individual values would range from zero (the item is in its spot) to five (moving corner to corner). The total for the goal configuration is zero.
One has to be careful using this heuristic, because going from the initial configuration to the desired configuration may require moves that temporarily increase this cumulative number of turns.
Finding a solution may require an exhaustive search. You need to memoize or use another DP technique to avoid solving the same position multiple times.
A simple conservative (admissible) heuristic would be:

1. For each number 1 <= i <= 16, find the minimum number of rotations needed to put i back in its correct position (disregarding all other numbers).
2. Take the maximum over all these minimums.

This amounts to reporting the minimum number of rotations needed to position the "worst" number correctly, and will therefore never overestimate the number of moves needed (since fixing all numbers' positions simultaneously requires at least as many moves as fixing any one of them).
It may, however, underestimate the number of moves needed by a long way. You can get more sophisticated by calculating, for each number 1 <= i <= 16 and for each wheel 1 <= j <= 5, the minimum number of rotations of wheel j needed by any sequence of moves that positions i correctly. For each wheel j, you can then take a separate maximum over all numbers i, and finally add these 5 maxima together, since they are all independent. (This may be less than the previous heuristic, but you are always allowed to take the greater of the two, so this won't be a problem.)
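As a sketch of the first, simpler heuristic (Python; min_rotations is a hypothetical precomputed table, with min_rotations[i][pos] being the fewest rotations needed to bring number i home from position pos while ignoring every other number):

def admissible_heuristic(state, min_rotations):
    # state[pos] is the number currently at position pos.
    best = 0
    for pos, number in enumerate(state):
        best = max(best, min_rotations[number][pos])
    return best  # never overestimates: fixing every number costs at least this much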

Is there a way to summarize the features of many time series?

I'm trying to detect characteristics of time series over a very big region composed of many smaller subregions (in my case, pixels). I don't know much about this, so the only approach I can come up with is an averaged time series for the entire region, although I know this would conceal many features by averaging.
I'm just wondering if there are any widely used techniques that can detect the common features of a suite of time series, like pattern recognition or time series classification?
Any ideas/suggestions are much appreciated!
Thanks!
Some extra explanation: I'm dealing with remote sensing images spanning several years with a time step of 7 days. So for each pixel there is an associated time series, with values extracted from that pixel on different dates. So if I define a region consisting of many pixels, is there a way to detect or extract some common features characterizing all or most of the time series of the pixels within this region? Such as the shape of the time series, or a date around which there's an obvious increase in the values?
You could compute the correlation matrix for the pixels. This would simply be:
import numpy as np

corr = np.zeros((npix, npix))
for i in range(npix):
    for j in range(npix):
        # normalized inner product of the two pixel time series
        corr[i, j] = np.sum(data[i] * data[j]) / np.sqrt(np.sum(data[i]**2) * np.sum(data[j]**2))
If you want more information, you can compute this as a function of time, i.e. divide your time series into blocks (say minutes) and compute the correlation for each of them. Then you can see how the correlation changes over time.
If the correlation changes a lot, you may be more interested in the cross-power spectrum of the pixels. This is defined as
cpow[i, j, :] = np.fft.fft(data[i]) * np.conj(np.fft.fft(data[j]))
This will tell you how much pixel i and j tend to change together on various time-scales. For example, they could be moving in unison in time-scales of a second (1 Hz), but also have changes on a time-scale of, say, 10 seconds which are not correlated with each other.
It all depends on what you need, really.
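For completeness, a small sketch of the block-wise correlation mentioned above (Python/NumPy; data is assumed to have shape (npix, ntime), and the block length is a placeholder you would choose for your 7-day cadence):

import numpy as np

def blockwise_corr(data, block_len):
    # Correlation of every pixel pair, computed separately in consecutive time blocks.
    npix, ntime = data.shape
    nblocks = ntime // block_len
    out = np.zeros((nblocks, npix, npix))
    for b in range(nblocks):
        chunk = data[:, b * block_len:(b + 1) * block_len]
        norm = np.sqrt(np.sum(chunk ** 2, axis=1))
        out[b] = (chunk @ chunk.T) / np.outer(norm, norm)
    return out  # out[b, i, j]: correlation of pixels i and j in block b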

Algorithm for variability analysis

I work with a lot of histograms. In particular, these histograms are of basecalls along segments on the human genome.
Each point along the x-axis is one of the four nitrogenous bases (A, C, T, G) that compose DNA, and the y-axis represents how many times a base was able to be "called" (or recognized by a sequencer machine, so as to sequence the genome, which is simply determining the identity of each base along the genome).
Many of these histograms display roughly linear dropoffs (when the machines aren't able to get sufficient read depth) that fall to 0 (or almost 0) from plateau-like regions. When the score drops to zero, it means the sequencer isn't able to determine the identity of the base. If you've seen the double helix before, it means the sequencer can't figure out the identity of one half of a rung of the helix. Certain regions of the genome are more difficult to characterize than others.

Bases (or x data points) with high numbers of basecalls, on the order of >=100, are able to be definitively identified. For example, if there were a total of 250 calls for one base, and we had 248 T's called, 1 G called, and 1 A called, we would call that a T. Regions with 0 basecalls are of concern because then we've got to infer the identity of the low-read region from neighboring regions.

Is there a straightforward algorithm for assigning these plots a score that reflects this tendency? See box.net/shared/nbygq2x03u for an example histogram.
You could just use the count of base positions where the read depth was 0. The slope of that dropoff line could also be a useful indicator (a steep negative slope means a sharp drop from the plateau).
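A rough sketch of both measures (Python/NumPy; depth is a 1-D array of basecall counts per position, and the sliding-window length used for the slope is a placeholder):

import numpy as np

def variability_scores(depth, window=50):
    # depth: 1-D array of basecall counts per genome position.
    depth = np.asarray(depth, dtype=float)
    zero_count = int(np.sum(depth == 0))   # positions with no coverage at all

    # Slope of a least-squares line fit in a sliding window; the most negative
    # value marks the sharpest dropoff from a plateau.
    steepest = 0.0
    x = np.arange(window)
    for start in range(0, len(depth) - window + 1):
        slope = np.polyfit(x, depth[start:start + window], 1)[0]
        steepest = min(steepest, slope)
    return zero_count, steepest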
