Amazon SDB - PUTS per second limit explained? - amazon-simpledb

I believe the maximum number of PUT requests per second to Amazon's SimpleDB is 300?
What happens when I throw 500 or 1,000 requests at it? Are they queued on the Amazon side, do I get 504s, or should I build my own queuing server on EC2?

The max request volume is not a fixed number, but a combination of factors. There is a per-domain throttling policy but there seems to be some room for bursting requests before throttling kicks in. Also, every SimpleDB node handles many domains and every domain is handled by multiple nodes. The load on the node handling your request also contributes to your max request volume. So you can get higher throughput (in general) during off-peak hours.
If you send more requests than SimpleDB is willing or able to service, you will get back a 503 HTTP code. 503 Service unavailable responses are business as usual and should be retried. There is no request queuing going on within SimpleDB.
If you want to get the absolute maximum available throughput, you have to be able to (or have a SimpleDB client that can) micro-manage your request transmission rate. When the 503 response rate reaches about 10%, back off your request volume and then build it back up gradually. Also, spreading the requests across multiple domains is the primary means of scaling.
I wouldn't recommend building your own queuing server on EC2. I would try to get SimpleDB to handle the request volume directly. An extra layer could smooth things out, but it won't let you handle higher load.
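To illustrate the retry-and-back-off behaviour described above, here is a minimal Python sketch. The send_put_request callable is hypothetical (standing in for whatever issues one signed PutAttributes/BatchPutAttributes call and returns the HTTP status code), and the delay constants are only a starting point, not SimpleDB-documented values:

    import random
    import time

    # Minimal sketch: retry 503s with exponential back-off plus jitter.
    # send_put_request is a hypothetical helper that issues a single
    # SimpleDB PUT call and returns the HTTP status code it received.
    def put_with_retry(send_put_request, max_attempts=8):
        for attempt in range(max_attempts):
            status = send_put_request()
            if status != 503:
                return status  # success, or a non-retryable error
            # 503 means SimpleDB is shedding load: wait, then try again.
            delay = min(30.0, (2 ** attempt) * 0.1) + random.uniform(0, 0.1)
            time.sleep(delay)
        raise RuntimeError("giving up after repeated 503 responses")

In a high-throughput client you would additionally track the fraction of recent responses that were 503s and throttle the overall send rate (as described above), rather than only backing off per request.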

I would use the work done at Netflix as an inspiration for high throughput writes:
http://practicalcloudcomputing.com/post/313922691/5-steps-simpledb-performance

Related

Large percent of requests in CLRThreadPoolQueue

We have an ASP.NET MVC application hosted in an Azure App Service. After running the profiler to help diagnose possible slow requests, we were surprised to see this:
An unusually high percentage of slow requests in the CLRThreadPoolQueue. We've now run multiple profiling sessions, and each comes back with between 40% and 80% in the CLRThreadPoolQueue (something we'd never seen in previous profiles). CPU each time was below 40%, and after checking our metrics we aren't getting sudden spikes in requests.
The majority of the requests listed as slow are very simple API calls. We've added response caching and made them async. The only thing they do is hit a database looking for a single record. We've checked the metrics on the database, and the average query run time is around 50 ms or less. Application Insights for these requests confirms this and shows that the database query doesn't take place until the very end of the request timeline (I assume this is the request sitting in the queue).
Recently we started including SignalR in a portion of our application. It's not fully in use, but it is in the code base. We have since switched to using Azure SignalR Service and saw no change. The addition of SignalR is the only "major" change/addition we've made since encountering this issue.
I understand we can scale up and/or increase the minWorkerThreads. However, this feels like I'm just treating the symptom not the cause.
Things we've tried:
Finding the most frequent requests and making them async (they weren't before)
Response caching to frequent requests
Using Azure SignalR Service rather than hosting SignalR in the same web app
Running memory dumps and contacting Azure support (they found nothing)
Scaling up to an S3
Profiling with and without thread report
-- None of these steps have resolved our issue --
How can we determine what requests and/or code is causing requests to pile up in the CLRThreadPoolQueue?
We encountered a similar problem; I guess SignalR must internally be using up a lot of threads or some other contended resource.
We did three things that helped a lot:
Call ThreadPool.SetMinThreads(400, 1) on app startup to make sure that the threadpool has enough threads to handle all the incoming requests from the start
Create a second App Service with the same code deployed to it. In the javascript, set the SignalR URL to point to that second instance. That way, all the SignalR requests go to one app service, and all the app's HTTP requests go to the other. Obviously this requires a SignalR backplane to be set up, but assuming your app service has more than 1 instance you'll have had to do this anyway
Review the code for any synchronous code paths (e.g. making a non-async call to the database or to an API) and convert them to async code paths

Microsoft Graph API - Throttling

Is there a specific number of requests/minute (specific to a tenant) that an application can make to Microsoft Graph APIs before requests start getting throttled?
No, not specific to a tenant (at least not for the Outlook-related parts of the Graph). Throttling is done per user, per app. The threshold is 10,000 requests every 10 minutes.
https://blogs.msdn.microsoft.com/exchangedev/2017/04/07/throttling-coming-to-outlook-api-and-microsoft-graph/
For non-Outlook stuff, I'm not sure what the limits are. All Graph has to say about it is here:
https://developer.microsoft.com/en-us/graph/docs/concepts/throttling
The takeaway here is that you should not depend on a specific threshold, since we can always change it if we need to in order to protect the integrity of the service. Ensure that your app can gracefully handle being throttled by handling the 429 error response properly.
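As an illustration of that last point, here is a hedged Python sketch (using the requests library) of honouring the Retry-After header Graph sends with a 429. get_token is a hypothetical helper returning a valid access token, and the retry limit is arbitrary:

    import time
    import requests

    # Sketch: retry a Microsoft Graph GET when throttled (HTTP 429),
    # waiting the number of seconds given in the Retry-After header.
    def graph_get(url, get_token, max_retries=5):
        headers = {"Authorization": "Bearer " + get_token()}
        for _ in range(max_retries):
            resp = requests.get(url, headers=headers)
            if resp.status_code != 429:
                resp.raise_for_status()
                return resp.json()
            # Throttled: honour the server's requested wait, then retry.
            time.sleep(int(resp.headers.get("Retry-After", "5")))
        raise RuntimeError("still throttled after retries")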

Youtube Reports API HttpError 429 FreeQuotaGroupCLIENT_PROJECT-100

I'm pulling youtube analytics by youtube bulk reports.
https://developers.google.com/youtube/reporting/v1/reports/
Everything works fine except when we have many users; then we encounter:
<HttpError 429 when requesting https://youtubereporting.googleapis.com/v1/media/CHANNEL/****/jobs/****/reports/***?alt=media returned "Insufficient tokens for quota group and limit 'FreeQuotaGroupCLIENT_PROJECT-100s' of service 'youtubereporting.googleapis.com', using the limit by ID '****'.">
I know there is a limit on the number of API calls per 100 seconds.
Is there any way to increase this limit? Since the quota group is called FreeQuotaGroupCLIENT_PROJECT-100s, there might be a paid quota or something similar.
If not, what's the best way to handle the fallback? We can't simply sleep, because there are many parallel processes and they won't wait for one another.
Thank you.
The 429 status code indicates that the user has sent too many requests in a given amount of time ("rate limiting"). Check this related SO post which states that:
Receiving a status 429 is not an error, it is the other server "kindly" asking you to please stop spamming requests. Obviously, your rate of requests has been too high and the server is not willing to accept this.
You should not seek to "dodge" this, or even try to circumvent server security settings by trying to spoof your IP, you should simply respect the server's answer by not sending too many requests.
If everything is set up properly, you will also have received a "Retry-after" header along with the 429 response. This header specifies the number of seconds you should wait before making another call. The proper way to deal with this "problem" is to read this header and to sleep your process for that many seconds.
The discovery response does not change frequently; cache it locally, or retry using exponential back-off. Either way, you need to slow down the rate at which you are sending requests.
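As a rough illustration of that approach, the sketch below wraps the report download in exponential back-off with jitter, keyed off the same google-api-python-client HttpError shown in the question. download_report is a hypothetical wrapper around the failing media download call, and the retry limits are arbitrary:

    import random
    import time

    from googleapiclient.errors import HttpError

    # Sketch: retry the YouTube Reporting API download with exponential
    # back-off when the quota group returns HTTP 429.
    def download_with_backoff(download_report, max_attempts=6):
        for attempt in range(max_attempts):
            try:
                return download_report()
            except HttpError as err:
                if err.resp.status != 429:
                    raise  # only retry rate-limit errors
                # Sleep 2^attempt seconds plus jitter so parallel workers
                # don't all retry at the same moment.
                time.sleep((2 ** attempt) + random.uniform(0, 1))
        raise RuntimeError("quota still exhausted after retries")

The jitter matters here precisely because of the parallel processes mentioned in the question: it spreads the retries out instead of having every worker hit the quota again at the same moment.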

Old HTTP requests arrive to server behind AWS ELB

We had an interesting event the other day where a burst of month-old HTTP requests arrived at our ELB and from there at one of our coupled servers. We could tell that the requests were old from a timestamp we send from our client app (and from the fact they carried no relevant data :) ).
Our system is hosted on AWS, using a group of EC2 instances behind an ELB that communicates with the instances over HTTP. Our client app runs on iOS.
One thing to note: the old requests were dated to a day on which we had a server crash, which led to a heavy load on our remaining servers (resulting in a lot of hung HTTP requests, i.e., requests that were never processed).
Also, although the group of old requests originally spanned several minutes (which we know from the timestamps), they all arrived in a single bulk the other day (this is from the ELB metrics).
We are trying to figure out how or where these requests could have stacked up, and to understand why it happened when it did.
Any insights, similar experiences or suggestions would be appreciated, as we've failed to find similar events on the web. Thanks!

Cost of continuous replications vs one-shot replications (using TouchDB and Cloudant)

We have an app that uses Cloudant as a remote server. However, from previous experience Cloudant is not completely compatible with TouchDB's continuous replications, so our alternative for now is to manually trigger one-shot replications at a fixed frequency. We would like to know whether that approach is going to cost us more money than continuous replications, since continuous replications use longpoll and don't need to query the server often. In other words, does each one-shot pull replication with Cloudant as the remote cost us a GET request?
Thank you,
Paul
I think the issue you refer to is [1].
Cloudant's replication is 100% compatible with CouchDB. In this instance, TouchDB's logs indicate the iOS network stack passed on incomplete JSON to TouchDB. It's not clear who was to blame in this case for the replication failure.
[1] https://github.com/couchbaselabs/TouchDB-iOS/issues/241
For the cost question, a one-shot pull replication will result in a GET to the _changes feed each time it happens, plus the other requests required to replicate. This _changes request will be counted as a light HTTP request against your Cloudant account.
However, whether this works out as more or fewer requests overall depends on the number of changes coming down from the remote server. It's also important to remember that the number of _changes calls is very small relative to the number of other calls involved (e.g., getting the content of the changes themselves, particularly if there are many attachments).
While this question is specific to TouchDB, and I mention specific behaviours of that codebase, this answer deals with the requests involved in replication between any two systems speaking the CouchDB replication protocol[2].
[2] http://www.dataprotocols.org/en/latest/couchdb_replication.html
Let's take a contrived example: 1 update per 10-second window to the source database for the replication, where a TouchDB database is the target. Let's take a 5-minute poll vs. a continuous replication. For simplicity of call-counting, let's also take attachments out of the picture. We'll also assume the device has a constant network connection.
For the continuous case, every 10 seconds TouchDB will receive an update in the _changes feed. This causes the longpoll connection to close. TouchDB then runs through the changes, requesting the updates from the source database; one or more GET requests on the remote server. While this is happening, TouchDB has to open up another longpoll request to _changes. So in a five-minute period, you'd end up with perhaps 30 calls to _changes, plus all the calls to get documents and record checkpoints.
Compare this with a one-shot replication every five minutes. You'd receive notification of the 30 updates in a single _changes feed call. TouchDB implements an optimisation[3] whereby it will call _all_docs to get updated documents for first-generation (1-) revs, so you might end up with a single call to get all 30 documents (not possible in the continuous case, where changes arrive one at a time). Then you have the checkpoint documents to record. At best that's fewer than 5 HTTP calls; at most, about a third of the continuous case, as you've avoided the extra _changes requests.
[3] https://github.com/couchbaselabs/TouchDB-iOS/wiki/Replication-Algorithm#performance
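To make the comparison concrete, here is a quick back-of-envelope count in Python under the example's assumptions (one update every 10 seconds, a 5-minute window, attachments and checkpoint writes ignored); the figures are illustrative, not measured:

    # Back-of-envelope request counts for the contrived example above.
    window_s = 5 * 60
    updates = window_s // 10            # 30 changes in the window

    # Continuous: each change closes the longpoll, so roughly one
    # _changes call per change, plus one document GET per change.
    continuous_calls = updates + updates        # ~60

    # One-shot every 5 minutes: one _changes call, and (thanks to the
    # _all_docs optimisation for 1- revs) possibly one bulk fetch.
    oneshot_calls = 1 + 1                       # ~2

    print("continuous ~", continuous_calls, "calls per 5 minutes")
    print("one-shot   ~", oneshot_calls, "calls per 5 minutes")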
It comes down to the frequency of updates you expect to the source database. One-shot replication is likely to provide a smoother price curve, as you're in better control of the number of requests you make.
A further question is how often connections will drop because of the network disconnects which happen regularly with mobile devices. TouchDB's continuous replications will fire back up each time the user comes online (if added via the _replicator database). This is a further source of unpredictable costs.
However, the benefits from more immediate visibility of changes may certainly be worth the uncertainty.
