I got this issue while trying to clear the events for the primary calendar ID.
The issue occurs with my first_account#gmail.com account, but with my other account, second_account#gmail.com, clearing events works.
Can anyone please help me with this?
A probable root cause is heavy network traffic. Since you are just testing, you can retry until you get a successful response.
If you encounter this during development, the suggested action for this error is to use exponential backoff.
Exponential backoff is a standard error handling strategy for network
applications in which the client periodically retries a failed request
over an increasing amount of time. If a high volume of requests or
heavy network traffic causes the server to return errors, exponential
backoff may be a good strategy for handling those errors. Conversely,
it is not a relevant strategy for dealing with errors unrelated to
rate-limiting, network volume or response times, such as invalid
authorization credentials or file not found errors.
Used properly, exponential backoff increases the efficiency of
bandwidth usage, reduces the number of requests required to get a
successful response, and maximizes the throughput of requests in
concurrent environments.
Create requests are not idempotent. A simple retry is insufficient and
may result in duplicate entities. Check whether the entity exists
before retrying.
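As a rough illustration, here is a minimal backoff sketch in Python; the make_request callable and TransientError class are placeholders for however your client surfaces retryable errors (rate limits, 5xx responses). This should only wrap idempotent calls such as clearing a calendar; for creates, check whether the entity already exists before retrying.

```python
import random
import time

# Hypothetical exception type: raise it from make_request() for retryable
# failures such as 403 rateLimitExceeded, 429, or 5xx responses.
class TransientError(Exception):
    pass

def call_with_backoff(make_request, max_retries=5, max_delay=64.0):
    """Retry make_request() with exponential backoff plus random jitter."""
    for attempt in range(max_retries + 1):
        try:
            return make_request()
        except TransientError:
            if attempt == max_retries:
                raise  # give up and surface the error
            # Sleep roughly 1s, 2s, 4s, 8s, ... plus up to 1s of jitter.
            time.sleep(min(2 ** attempt + random.uniform(0, 1), max_delay))
```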
Related
I use a time series database to report some network metrics, such as the download time or DNS lookup time for some endpoints. However, sometimes the measurement fails, for example if the endpoint is down or if there is a network issue. In these cases, what should be done according to best practices? Should I report an impossible value, like -1, or just not write anything at all to the database?
The problem I see with not writing anything is that I cannot tell whether my test is no longer running or whether there is a problem with the endpoint/network.
The best practice is to capture the failures in their own time series for separate analysis.
Failures or bad readings will skew the series, so they should be filtered out or replaced with a projected value for 'normal' events. The beauty of a time series is that one measure (time) is globally common, so it is easy to project between two known points when one is missing.
The failure information is also important, as it is an early indicator of issues or outages on your target. You can record the network error and other diagnostic information to find trends and determine whether it is the client or your server having the issue. Furthermore, several instances can be deployed to monitor the same target so that they cancel out each other's noise.
You can also monitor a known endpoint like Google's 204 page to check network connectivity. If all the monitors report an error connecting to your site but not to the known endpoint, your server is indeed down.
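A minimal sketch of that idea in Python, assuming the endpoints being measured are HTTP and a hypothetical tsdb_write(measurement, fields, tags) helper standing in for whatever client your time series database provides:

```python
import time

import requests  # assumption: the endpoints being measured are HTTP

def probe(url, tsdb_write, timeout=5.0):
    """Write good readings and failures to separate series."""
    start = time.monotonic()
    try:
        resp = requests.get(url, timeout=timeout)
        elapsed = time.monotonic() - start
        # Normal readings keep the main series clean for projections.
        tsdb_write("download_time", {"seconds": elapsed},
                   {"endpoint": url, "status": resp.status_code})
    except requests.RequestException as exc:
        # Failures get their own series with diagnostic tags, so a missing
        # point means "probe not running" rather than "probe failed".
        tsdb_write("probe_failure", {"count": 1},
                   {"endpoint": url, "error": type(exc).__name__})
```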
I understand AWS SQS is highly reliable, but there is still a chance that the network between our server and the AWS datacenter gets disconnected from time to time.
Is there any way or tool to prevent this kind of error, e.g. by caching the request locally and resending it once the network is available again?
The AWS SDK has built-in retries to handle transient errors - you can configure the retry policy based on your needs. This should handle the most common types of network errors when dealing with AWS.
If the AWS service you depend on is down for longer than the retry policy will handle, then you need to decide how to handle that - you could fall back to other AWS region(s) (it is very unlikely the service is down in multiple regions), surface failures to your service's callers, cache locally, drop the requests, or do something else entirely.
Handling failure cases is highly variable, and depends on the use case and needs of the system - there's definitely no "one right way". As a note, I've used each of the above failure mode suggestions I made on different systems I've built that depend on AWS.
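For example, with boto3 (Python) the retry behaviour is set through the client config; the queue URL and attempt count below are placeholders:

```python
import boto3
from botocore.config import Config
from botocore.exceptions import BotoCoreError, ClientError

# Built-in retries for transient errors; tune max_attempts to your needs.
sqs = boto3.client("sqs", config=Config(retries={"max_attempts": 10,
                                                 "mode": "standard"}))

def send(message_body, queue_url="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"):
    try:
        return sqs.send_message(QueueUrl=queue_url, MessageBody=message_body)
    except (BotoCoreError, ClientError):
        # Retries exhausted: fall back to another region, cache locally,
        # or surface the failure to callers, depending on your system.
        raise
```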
We have one swarm cluster with 3 managers and 10 workers for performance testing. When 100 concurrent requests (create service) are sent to one swarm manager, dockerd accepts all the requests and dispatches them to the workers. But if we increase the number of concurrent requests, the dockerd error log says:
Error creating service serviceXXX: rpc error: code = 4 desc = context deadline exceeded
Is there a default maximum number of concurrent requests that dockerd can handle, defined in the code? How can we increase the number of concurrent requests that dockerd can process successfully?
The daemon is version 17.03.
As commented in issue 29987, this error message is not very explicit:
I think whenever we encounter a context deadline exceeded error, we should rewrite it to a coherent explanation of what timed out, and perhaps list possible reasons that could cause the timeout (loss of quorum, etc).
When working on docker/docker-e2e, I had problems where things were timing out, causing context deadline exceeded errors, but the root cause of the timeout was some other error that was getting ignored, superseded, or otherwise buried.
As detailed in issue 33631:
This error can have various causes (See this search for existing issues mentioning this error).
The error itself is quite generic, and could mean that the manager was not able to communicate with other managers in the cluster.
From just the error, it's not easy to discover why it fails to communicate (it can be a bad network connection; other managers may not have properly re-joined the cluster, so you lost quorum; or, for example, the managers may not have had a static IP address and the IP address changed, which is currently not supported).
You can see a similar case here, but this also happens with fewer queries.
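If the bottleneck is simply the burst of creates from the test client, one workaround (an assumption on my part, not something from the issues above) is to throttle client-side concurrency, for example with the Docker SDK for Python; the image, service names, and worker count are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

import docker  # Docker SDK for Python

client = docker.from_env()

def create_service(index):
    # Only worth retrying after confirming the managers still have quorum;
    # "context deadline exceeded" can also mean a manager is unreachable.
    return client.services.create("nginx:alpine", name="service%d" % index)

# Throttle the load test instead of firing all creates at once.
with ThreadPoolExecutor(max_workers=20) as pool:
    services = list(pool.map(create_service, range(100)))
```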
I'm looking for suggestions on the best way to handle network connection issues for an iPhone app (iOS9/Swift2/Xcode7), to give the best user experience, since we know that mobile data networks are unreliable. I have my coding options in place, but I'd like to know what has worked well for other experienced developers. There's lots of info out there, but nothing I could find specific to a strategy for what to do when there is a connection failure.
Here is the basic strategy for dealing with failed connections that I'd like to implement (along with questions):
App sends request to api.myserver.com and the request fails
Wait X second(s) and try request to api.myserver.com again (how many tries and at what time interval would you suggest?)
Try pinging some other server (e.g. google.com) to see if we can access a resource other than api.myserver.com
If we can successfully ping google.com then we know our internet is working, so we try once again to ping api.myserver.com
If this last ping fails then we alert the user that we can't communicate for some reason and to try again later
I'm using the philosophy outlined in this SO answer recommended by an Apple tech, which in general means you always check the connection to your server first, using Reachability as a separate check to ensure phone hardware is available.
At any time during this process if Reachability is false then we would put our request in a queue to be tried again when the phone hardware connection was restored.
I think I've got a handle on the code involved, but I'm looking for insights like "this is what worked for our app and gives a good user experience during connection issues...and was approved for use in the Apple app store...". I understand the concepts of trying/retrying connections in the case of failure and alerting the user (my code already does this successfully), but I'm still not solid on a good policy for how many times I should try to reconnect and at what intervals.
For most of the apps I have worked on it was useful to define a couple of categories of requests which have different rules. For each category consider if retries are appropriate and how long you can really afford to wait before considering the request(s) a failure.
At the most sensitive end are blocking requests, things which the user must allow to complete before they can proceed: sign in, checkout, some editing actions, etc. For these it is often not worth retrying(1), and failures need to be communicated to the user immediately: if the device is offline, let the user decide when to try again; if the request fails, you've probably already made the user wait too long. Since failures tend to block the user, they usually also need to be communicated prominently.
Less sensitive are usually user-initiated but non-blocking actions: pull-to-refresh, loading details of a selected collection item, or performing a search. Your user might be waiting to see the results but is probably free to give up or navigate elsewhere in the app and check back later. Failures still need to be communicated so users can choose to try again or at least know to stop waiting, but the notification of those failures can be less prominent. Here retries start to make sense. I usually start by trying to define a time limit from the user's perspective - how long will they wait before the app feels broken - and then let that be your limit for how long a request can wait for connectivity or make any number of retries in response to failed connections.
Even less sensitive are requests triggered only indirectly by your user: polling for updates, loading non-essential images, warming caches. These you might retry, but the impact of failure is often so low that it may not matter.
Of all of those requests your retry policy really only impacts #2 so I would make sure you actually have requests of that type before worrying about it. Assuming those do actually apply to your app...
Wait X second(s) and try request to api.myserver.com again (how many tries and at what time interval would you suggest?)
I would set some interval here (in the tens to hundreds of milliseconds depending on your normal api performance) to avoid an accidental flood of requests. I don't want to suggest a precise number when I don't have a solid justification for it.
My experience has been that optimizing this value is unlikely to make a perceptible difference to your users because requests often take hundreds of milliseconds to fail and users are only willing to wait for a few thousand milliseconds so making 1 or 5 or 10 requests in that time doesn't really change the final outcome. If you are able to set different expectations with your users then your results may vary.
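To make that concrete, here is a language-agnostic sketch (in Python for brevity) of retrying within a fixed time budget rather than a fixed retry count; send_request and the 4-second budget are assumptions to adapt to your app:

```python
import time

def fetch_with_budget(send_request, budget_seconds=4.0, base_delay=0.2):
    """Retry send_request() until it succeeds or the user-facing budget runs out."""
    deadline = time.monotonic() + budget_seconds
    attempt = 0
    while True:
        try:
            return send_request()
        except ConnectionError:
            attempt += 1
            delay = base_delay * (2 ** attempt)
            if time.monotonic() + delay >= deadline:
                raise  # out of budget: surface the failure so the user can decide
            time.sleep(delay)
```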
Try pinging some other server (i.e. google.com) to see if we can access a resource other than api.myserver.com
If we can successfully ping google.com then we know our internet is working, so we try once again to ping api.myserver.com
I would not assume that this is true nor do I think that making an extra request to a third party will help you make useful predictions about when to attempt to reach your own systems. This seems like extra work to build and maintain and likely to be a source of misleading results more than valuable information. In what scenario do you imagine this provides useful information to your app or its user?
Maybe not the answer you're looking for, hopefully it's still useful.
Disclaimer: my experience is biased toward apps with a fairly simple set of REST or RPC style network requests. If you're working on a problem which calls for streaming data, P2P connections, or some other scenario then don't start with these assumptions.
(1) One end note here because I see it as a source of failures so often: These requests should really be idempotent. Yes, even those POSTs creating new resources, checking out your cart, or whatever. When you cannot safely repeat a request you'll eventually see cases where the request completed but the client never got the acknowledgement so it looked like a failure. It's much easier to recover through a retry (automatic or user triggered) of the same request than to detect and recover from duplicate requests.
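One common way to get that idempotence (my own suggestion, not something prescribed above) is a client-generated idempotency key that the server de-duplicates on; the header name, endpoint, and server-side support for it are assumptions here:

```python
import uuid

import requests  # generic HTTP client, used here for illustration

def create_order(payload, url="https://api.example.com/orders", attempts=3):
    # Generate the key once per logical operation, so resending the same
    # POST after a lost acknowledgement cannot create a duplicate order.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(attempts):
        try:
            return requests.post(url, json=payload, headers=headers, timeout=5)
        except requests.RequestException as exc:
            last_error = exc  # safe to retry: the server treats repeats as one request
    raise last_error
```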
For better network performance, in my application I ping a Google server before every API request; if it is reachable I call my server's API, otherwise I show a no-network alert.
You have to do the same even on a Wi-Fi network, because Wi-Fi reachability only checks for Wi-Fi connectivity, not for internet access.
I believe the maximum number of PUT requests to Amazon's SimpleDB is 300?
What happens when I throw 500 or 1,000 requests at it? Are they queued on the Amazon side, do I get 504s, or should I build my own queuing server on EC2?
The max request volume is not a fixed number, but a combination of factors. There is a per-domain throttling policy but there seems to be some room for bursting requests before throttling kicks in. Also, every SimpleDB node handles many domains and every domain is handled by multiple nodes. The load on the node handling your request also contributes to your max request volume. So you can get higher throughput (in general) during off-peak hours.
If you send more requests than SimpleDB is willing or able to service, you will get back a 503 HTTP code. 503 Service unavailable responses are business as usual and should be retried. There is no request queuing going on within SimpleDB.
If you want to get the absolute maximum available throughput, you have to be able to (or have a SimpleDB client that can) micromanage your request transmission rate. When the 503 response rate reaches about 10% you have to back off your request volume and subsequently build it back up. Also, spreading the requests across multiple domains is the primary means of scaling.
I wouldn't recommend building your own queuing server on EC2. I would try to get SimpleDB to handle the request volume directly. An extra layer could smooth things out, but it won't let you handle higher load.
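A rough sketch of that 503-rate watching in Python; send_item() is a hypothetical callable that performs one PutAttributes request and returns the HTTP status code, and the window size and delays are guesses to tune:

```python
import collections
import time

recent = collections.deque(maxlen=100)   # sliding window of recent status codes
delay = 0.0                              # current pause between requests

def throttled_put(send_item, item, max_attempts=5):
    """Send one item, retrying 503s and adjusting the send rate."""
    global delay
    for _ in range(max_attempts):
        status = send_item(item)
        recent.append(status)
        if recent.count(503) / len(recent) > 0.10:
            delay = min(max(delay * 2, 0.05), 2.0)   # back off the request volume
        else:
            delay = max(delay - 0.01, 0.0)           # slowly build it back up
        time.sleep(delay)
        if status != 503:
            return status                            # 503s are simply retried
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```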
I would use the work done at Netflix as an inspiration for high throughput writes:
http://practicalcloudcomputing.com/post/313922691/5-steps-simpledb-performance