Predict delay before next retry on a rate-limited API - machine-learning

I'm using an API that rate-limits me, forcing me to wait and retry my request if I hit a URL too frequently. Let's say for the sake of argument that I don't know what the specific rate limit threshold is (and don't want to hardcode one into my app even if I did know).
I can make an API call and either watch it succeed, or get back a response saying I've been rate limited, then try again shortly, and I can record the total time elapsed for all tries before I was able to successfully complete an API call.
What algorithms would be a good fit for predicting the minimum time I need to wait before retrying an API call after hitting the rate limit?

Use something like a one sided binary search.
See page 134 of The Algorithm Design Manual:
Now suppose we have an array A consisting of a run of 0's, followed by
an unbounded run of 1's, and would like to identify the exact point of
transition between them. Binary search on the array would provide the
transition point in ⌈lg n ⌉ tests, if we had a bound n on the
number of elements in the array. In the absence of such a bound, we can
test repeatedly at larger intervals (A[1], A[2], A[4], A[8], A[16], ...)
until we find a first non-zero value. Now we have a window containing
the target and can proceed with binary search. This one-sided binary
search finds the transition point p using at most 2 ⌈lg(p)⌉
comparisons, regardless of how large the array actally is. One-sided
binary search is most useful whenever we are looking for a key that
probably lies close to our current position.

Related

Prometheus increase not handling process restarts

I am trying to figure out the behavior of Prometheus' increase() querying function with process restarts.
When there is a process restart within a 2m interval and I query:
sum(increase(my_metric_total[2m]))
I get a value less than expected.
For example, in a simple experiment I mock:
3 lcm_restarts
1 process restart
2 lcm_restarts
All within a 2 minute interval.
Upon querying:
sum(increase(lcm_restarts[2m]))
I receive a value of ~4.5 when I am expecting 5.
lcm_restarts graph
sum(increase(lcm_restarts[2m])) result
Could someone please explain?
Pretty concise and well-prepared first question here. Please keep this spirit!
When working with counters, functions as rate(), irate() and also increase() are adjusting on resets due to restarts. Other than the name suggests, the increase() function does not calculate the absolute increase in the given time frame but is a different way to write rate(metric[interval]) * number_of_seconds_in_interval. The rate() function takes the first and the last measurement in a series and calculates the per-second increase in the given time. This is the reason why you may observe non-integer increases even if you always increase in full numbers as the measurements are almost never exactly at the start and end of the interval.
For more details about this, please have a look at the prometheus docs for the increase() function. There are also some good hints on what and what not to do when working with counters in the robust perception blog.
Having a look at your label dimensions, I also think that counter resets don't apply to your constructed example. There is one label called reason that changed between the restarts and so created a second time series (not continuing the existing one). Here you are also basically summing up the rates of two different time series increases that (for themselves) both have their extrapolation happening.
So basically there isn't really anything wrong what you are doing, you just shouldn't rely on getting highly precise numbers out of prometheus for your use case.
Prometheus may return unexpected results from increase() function due to the following reasons:
Prometheus may return fractional results from increase() over integer counter because of extrapolation. See this issue for details.
Prometheus may return lower than expected results from increase(m[d]) because it doesn't take into account possible counter increase between the last raw sample just before the specified lookbehind window [d] and the first raw sample inside the lookbehind window [d]. See this article and this comment for details.
Prometheus skips the increase for the first sample in a time series. For example, increase() over the following series of samples would return 1 instead of 11: 10 11 11. See these docs for details.
These issues are going to be fixed according to this design doc. In the mean time it is possible to use other Prometheus-like systems such as VictoriaMetrics, which are free from these issues.

Real-time pipeline feedback loop

I have a dataset with potentially corrupted/malicious data. The data is timestamped. I'm rating the data with a heuristic function. After a period of time I know that all new data items coming with some IDs needs to be discarded and they represent a significant portion of data (up to 40%).
Right now I have two batch pipelines:
First one just runs the rating over the data.
The second one first filters out the corrupted data and runs the analysis.
I would like to switch from batch mode (say, running every day) into an online processing mode (hope to get a delay < 10 minutes).
The second pipeline uses a global window which makes processing easy. When the corrupted data key is detected, all other records are simply discarded (also using the discarded keys from previous days as a pre-filter is easy). Additionally it makes it easier to make decisions about the output data as during the processing all historic data for a given key is available.
The main question is: can I create a loop in a Dataflow DAG? Let's say I would like to accumulate quality-rates given to each session window I process and if the rate sum is over X, some a filter function in earlier stage of pipeline should filter out malicious keys.
I know about side input, I don't know if it can change during runtime.
I'm aware that DAG by definition cannot have cycle, but how achieve same result without it?
Idea that comes to my mind is to use side output to mark ID as malicious and make fake unbounded output/input. The output would dump the data to some storage and the input would load it every hour and stream so it can be joined.
Side inputs in the Beam programming model are windowed.
So you were on the right path: it seems reasonable to have a pipeline structured as two parts: 1) computing a detection model for the malicious data, and 2) taking the model as a side input and the data as a main input, and filtering the data according to the model. This second part of the pipeline will get the model for the matching window, which seems to be exactly what you want.
In fact, this is one of the main examples in the Millwheel paper (page 2), upon which Dataflow's streaming runner is based.

Is this a correct implementation of Q-Learning for Checkers?

I am trying to understand Q-Learning,
My current algorithm operates as follows:
1. A lookup table is maintained that maps a state to information about its immediate reward and utility for each action available.
2. At each state, check to see if it is contained in the lookup table and initialise it if not (With a default utility of 0).
3. Choose an action to take with a probability of:
(*ϵ* = 0>ϵ>1 - probability of taking a random action)
1-ϵ = Choosing the state-action pair with the highest utility.
ϵ = Choosing a random move.
ϵ decreases over time.
4. Update the current state's utility based on:
Q(st, at) += a[rt+1, + d.max(Q(st+1, a)) - Q(st,at)]
I am currently playing my agent against a simple heuristic player, who always takes the move that will give it the best immediate reward.
The results - The results are very poor, even after a couple hundred games, the Q-Learning agent is losing a lot more than it is winning. Furthermore, the change in win-rate is almost non-existent, especially after reaching a couple hundred games.
Am I missing something? I have implemented a couple agents:
(Rote-Learning, TD(0), TD(Lambda), Q-Learning)
But they all seem to be yielding similar, disappointing, results.
There are on the order of 10²⁰ different states in checkers, and you need to play a whole game for every update, so it will be a very, very long time until you get meaningful action values this way. Generally, you'd want a simplified state representation, like a neural network, to solve this kind of problem using reinforcement learning.
Also, a couple of caveats:
Ideally, you should update 1 value per game, because the moves in a single game are highly correlated.
You should initialize action values to small random values to avoid large policy changes from small Q updates.

If I called arc4random_uniform(6) at 5 o'clock OR 5:01, would I get the same number?

I'm making an iOS dice game and one beta tester said he liked the idea that the rolls were already predetermined, as I use arc4random_uniform(6). I'm not sure if they are. So leaving aside the possibility that the code may choose the same number consecutively, would I generate a different number if I tapped the dice in 5 or 10 seconds time?
Your tester was probably thinking of the idea that software random number generators are in fact pseudo-random. Their output is not truly random as a physical process like a die roll would be: it's determined by some state that the generators hold or are given.
One simple implementation of a PRNG is a "linear congruential generator": the function rand() in the standard library uses this technique. At its core, it is a straightforward mathematical function, and each output is generated by feeding in the previous one as input. It thus takes a "seed" value, and -- this is what your tester was thinking of -- the sequence of output values that you get is completely determined by the seed value.
If you create a simple C program using rand(), you can (must, in fact) use the companion function srand() (that's "seed rand") to give the LCG a starting value. If you use a constant as the seed value: srand(4), you will get the same values from rand(), in the same order, every time.
One common way to get an arbitrary -- note, not random -- seed for rand() is to use the current time: srand(time(NULL)). If you did that, and re-seeded and generated a number fast enough that the return of time() did not change, you would indeed see the same output from rand().
This doesn't apply to arc4random(): it does not use an LCG, and it does not share this trait with rand(). It was considered* "cryptographically secure"; that is, its output is indistinguishable from true, physical randomness.
This is partly due to the fact that arc4random() re-seeds itself as you use it, and the seeding is itself based on unpredictable data gathered by the OS. The state that determines the output is entirely internal to the algorithm; as a normal user (i.e., not an attacker) you don't view, set, or otherwise interact with that state.
So no, the output of arc4random() is not reliably repeatable by you. Pseudo-random algorithms which are repeatable do exist, however, and you can certainly use them for testing.
*Wikipedia notes that weaknesses have been found in the last few years, and that it may no longer be usable for cryptography. Should be fine for your game, though, as long as there's no money at stake!
Basically, it's random. No it is not based around time. Apple has documented how this is randomized here: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/arc4random_uniform.3.html

How does OpenTSDB downsample data

I have a 2 part question regarding downsampling on OpenTSDB.
The first is I was wondering if anyone knows whether OpenTSDB takes the last end point inclusive or exclusive when it calculates downsampling, or does it count the end data point twice?
For example, if my time interval is 12:30pm-1:30pm and I get DPs every 5 min starting at 12:29:44pm and my downsample interval is summing every 10 minute block, does the system take the DPs from 12:30-12:39 and summing them, 12:40-12:49 and sum them, etc or does it take the DPs from 12:30-12:40, then from 12:40-12:50, etc. Yes, I know my data is off by 15 sec but I don't control that.
I've tried to calculate it by hand but the data I have isn't helping me. The numbers I'm calculating aren't adding up to the above, nor is it matching what the graph is showing. I don't have access to the system that's pushing numbers into OpenTSDB so I can't setup dummy data to check.
The second question is how does downsampling plot its points on the graph from my time range and downsample interval? I set downsample to sum 10 min blocks. I set my range to be 12:30pm-1:30pm. The graph shows the first point of the downsampled graph to start at 12:35pm. That makes logical sense.I change the range to be 12:24pm-1:29pm and expected the first point to start at 12:30 but the first point shown is 12:25pm.
Hopefully someone can answer these questions for me. In the meantime, I'll continue trying to find some data in my system that helps show/prove how downsampling should work.
Thanks in advance for your help.
Downsampling isn't currently working the way you expect, although since this is a reasonable and commonly made expectations, we are thinking of changing this in a later release of OpenTSDB.
You're assuming that if you ask for a "10 min sum", the data points will be summed up within each "round" (or "aligned") 10 minute block (e.g. 12:30-12:39 then 12:40-12:49 in your example), but that's not what happens. What happens is that the code will start a 10-minute block from whichever data point is the first one it finds. So if the first one is at time 12:29:44, then the code will sum all subsequent data points until 600 seconds later, meaning until 12:39:44.
Within each 600 second block, there may be a varying number of data points. Some blocks may have more data points than others. Some blocks may have unevenly spaced data points, e.g. maybe all the data points are within one second of each other at the beginning of the 600s block. So in order to decide what timestamp will result from the downsampling operation, the code uses the average timestamp of all the data points of the block.
So if all your data points are evenly spaced throughout your 600s block, the average timestamp will fall somewhere in the middle of the block. But if you have, say, all the data points are within one second of each other at the beginning of the 600s block, then the timestamp returned will reflect that by virtue of being an average. Just to be clear, the code takes an average of the timestamps regardless of what downsampling function you picked (sum, min, max, average, etc.).
If you want to experiment quickly with OpenTSDB without writing to your production system, consider setting up a single-node OpenTSDB instance. It's very easy to do as is shown in the getting started guide.

Resources