I'm running a web application that uses the Microsoft Graph API. Today we encountered the following error, which severely impacted our application for a whole day:
"error": {
    "code": "ErrorTooManyObjectsOpened",
    "message": "Too many concurrent connections opened., The process failed to get the correct properties.",
    "innerError": {
        "request-id": "removed",
        "date": "2017-12-13T17:01:14"
    }
}
Please note that the request-id was removed.
Let me summarize what our web application does.
Basically, we have 2 email folders that we are actively subscribed to, Junk and Folder A.
If anything hits Folder A, we strip the body of the email message and then move the message to Folder B. The subscription on our Junk folder also strips the body and sends them over to Folder B.
Sometimes the webhook subscription service skips messages that arrive at the same time, so we have 2 cron jobs on our server that run a script every 5 minutes to check Junk and Folder A for messages. My assumption is therefore that the cron jobs run about 288*2 times per day in total. Not counting our subscriptions to the folders, we usually get around 200-300 email messages per day.
Unfortunately, Microsoft's Graph error codes page does not explain this code. I would really appreciate it if anyone could explain what it means and how to prevent it from happening.
This is occurring because your application is exceeding the throttling thresholds.
There are several different throttling metrics that can affect Microsoft Graph requests. For a high-level overview, see the Microsoft Graph throttling guidance. Since in this case you're hitting Exchange Online via Graph, you can find more specific information under "What throttling values do I need to take into consideration?" in the Exchange documentation.
Architecturally, you are making a lot of unnecessary calls into the API. Rather than having both a subscription and a scheduled job, you should use just the webhook subscription and the /delta endpoint.
Each call to the /delta endpoint gives you a token that can be used to fetch any changes to a given resource since the token was issued. So whether 1 email came in or 1,000, you only get the new emails.
Once you're using /delta to find your changes, you use the webhook only as a "trigger". When you receive a webhook notification, you can ignore its contents and instead issue a request to /delta. This ensures that you capture every incoming email even if you didn't receive a separate webhook notification for each one.
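The "webhook as trigger" pattern described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: `get_json` is a hypothetical callable standing in for an authenticated HTTP GET that returns the parsed JSON body, and the function names are made up for illustration. The `@odata.nextLink` and `@odata.deltaLink` properties are the ones Graph returns from delta queries.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical helper type: takes a URL, performs an authenticated GET,
# and returns the decoded JSON body as a dict.
GetJson = Callable[[str], Dict]


def collect_changes(get_json: GetJson, start_url: str) -> Tuple[List[Dict], str]:
    """Follow @odata.nextLink pages until the page carrying @odata.deltaLink.

    Returns the changed messages plus the delta link to store for next time.
    """
    messages: List[Dict] = []
    url = start_url
    while True:
        page = get_json(url)
        messages.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            url = page["@odata.nextLink"]  # more pages in this round
            continue
        # The last page of a round carries the link for the next round.
        return messages, page["@odata.deltaLink"]


def on_webhook_notification(get_json: GetJson, saved_delta_link: str):
    """Ignore the notification payload; ask /delta what actually changed."""
    return collect_changes(get_json, saved_delta_link)
```

The first round would start from the folder's messages/delta URL; every later round starts from the delta link saved by the previous one, so a single request covers all messages that arrived in between, however many notifications were delivered or dropped.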
There is a bug. After making 500 message move requests, a "cannot copy/move error" occurs. Subsequently, a "429: Too many concurrent connections opened" error occurs. Most applications miss the first error because you continually get the 429 error afterwards.
If you let the application "rest" for 30 minutes, the throttle resets itself and you can continue on. I do not think there is a time limit for hitting the 500 moves. My application did 500 moves after 6.5 hours and then we started getting the error.
And if you keep retrying the move call during the 30-minute rest period, the throttle never resets. Also, the Retry-After value in the response is null, so that doesn't help you.
If you find a workaround, please let me know. We are trying a few things, like setting the category and then manually moving the messages. I am also investigating creating a rule that moves them for us, or some other job. I cannot find a way to execute a rule from the Graph API.
See this issue for more information: Outlook API Throttling documentation #144. The more people who report this issue, the sooner it will hopefully be resolved.
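One way to avoid retrying during the rest period is to gate all move calls behind a cooldown timer. The sketch below assumes the 30-minute figure observed above, which is an observation rather than a documented limit; the `MoveGate` class and its method names are hypothetical.

```python
import time

# Rest period observed above; an assumption, not a documented Graph limit.
REST_SECONDS = 30 * 60


class MoveGate:
    """Blocks move attempts for a fixed rest period after a throttle error."""

    def __init__(self, rest_seconds: float = REST_SECONDS, clock=time.monotonic):
        self._rest = rest_seconds
        self._clock = clock          # injectable so the gate can be tested
        self._blocked_until = None   # None means no throttle seen yet

    def record_throttle(self) -> None:
        """Call this when a move request comes back throttled."""
        self._blocked_until = self._clock() + self._rest

    def may_move(self) -> bool:
        """True if no throttle was seen, or the rest period has elapsed."""
        return self._blocked_until is None or self._clock() >= self._blocked_until
```

The caller would check `may_move()` before each move request and queue the message for later when it returns False, so that no request is sent while the throttle is supposed to be resetting.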
Related
I'm trying to figure out how I can implement a retry policy in a Twilio Studio Flow. I see that they have an example, but it only supports a delay of no more than 10 seconds.
I want something that I can use to retry when my webhooks service is down. I did set up the sample from the Twilio docs, but it only seems to work when you want a delay of no more than 10 seconds, and I need it to pause for an hour or two. So say the HTTP POST step fails because the webhooks service is offline: I want it to pause for an hour and try again, then pause for 2 hours, then 3, then 4, and so on, trying again each time. The point is that I don't want to lose the user's response.
What I am trying to do is not lose any of the user responses from a survey if my webhooks application goes down. We saw this happen in production for a couple of hours and we lost survey responses from 200 users.
If this is not possible, is there a way I can reach back out to Twilio logs and get access to the responses that failed while the webhooks service was down? I recall running into something where you can pull back the logs, which could then be used to identify the ones that failed.
This kind of logic isn't really built into Studio. Ten-second waits are typically the most you will see, because both Twilio Functions and the HTTP Request widget time out at that point.
If you wish to include this kind of wait then you will need some sort of workaround where you go into a send & wait for reply widget (which ignores responses from your customers with some additional logic) and has a timeout set to the amount of time you want to wait. You can then transition to the webhook request again and re-attempt.
Alternatively, you can create a utility which uses the Execution resource to find all the failed flows for a given time period so you can choose how best to move forward.
I'm starting to use the Microsoft Graph threat assessment API to report phishing website URLs.
(Ref: https://learn.microsoft.com/en-us/graph/api/informationprotection-post-threatassessmentrequests?view=graph-rest-1.0&tabs=http)
My use case is automatic reporting plus manual reporting via a Slack command.
But the throttling is very strict, so I get a 429 response immediately.
"code": "ActivityLimitReached",
"message": "The client application has been throttled for reaching an activity limit. The request may be repeated after a delay, the length of which may be specified in a 'Retry-After' header.",
Does anyone know a workaround for the throttling?
As far as I can confirm, the default throttling limit is 1 request per 15 minutes per resource
(although the limit per tenant is 150 requests per 15 minutes).
Ref: https://learn.microsoft.com/en-us/graph/throttling#information-protection
I would try the following best practices to avoid and handle throttling.
When you implement error handling, use the HTTP error code 429 to detect throttling. The failed response includes the Retry-After response header. Backing off requests using the Retry-After delay is the fastest way to recover from throttling because Microsoft Graph continues to log resource usage while a client is being throttled.
1. Wait the number of seconds specified in the Retry-After header.
2. Retry the request.
3. If the request fails again with a 429 error code, you are still being throttled. Continue to use the recommended Retry-After delay and retry the request until it succeeds.
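The back-off steps above can be sketched as a small retry loop. This is a minimal sketch under stated assumptions: `send` is a hypothetical callable that performs the request and returns `(status, headers, body)`, the Retry-After value is expected as a number of seconds, and `default_delay` is an arbitrary fallback for when the header is missing or not numeric.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (HTTP status, response headers, decoded body).
Response = Tuple[int, Mapping[str, str], object]


def call_with_retry_after(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_delay: float = 60.0,   # assumed fallback, not a documented value
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry on HTTP 429, waiting the number of seconds given in Retry-After."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body            # not throttled: hand back the result
        if attempt == max_attempts - 1:
            break                          # out of attempts: do not sleep again
        try:
            delay = float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            delay = default_delay          # header missing or not a number
        sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Capping the number of attempts is a design choice for the sketch; a background job could instead requeue the request and try again on its next run.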
The Microsoft Graph team already provides throttling guidelines, and they are straightforward. Please go through that doc, look at the best practices for avoiding and handling throttling, and consider whether throttling your own calls or batching them suits your scenario, so that you can optimize the calls.
If the Retry-After header doesn't exist, it will be tough, because backing off on that header is the way to handle throttling unless an alternative is provided. If you still want Microsoft to implement this feature, consider creating a new UserVoice request.
Update: #rung created a new uservoice on this.
I am working on a similar use case. I am planning to submit the message ID/URL/file to Microsoft for phishing assessment using the API. I am stuck with an error that shows:
"code": "Unauthorized",
"message": "Required authentication information is either missing or not valid for the resource.",
I would highly appreciate your help!
When establishing a call using Twilio, a great many POSTs are made to https://eventgw.twilio.com/v2/EndpointMetrics, and I've noticed in our JavaScript error tracking service that some of our users get 403s from these calls. In the last week, 19 users have generated 2,600 errors. Can anyone from Twilio tell me what these calls are, whether the errors are benign and can be ignored, and whether there is any way to disable them? For a single short call I'm seeing 34 separate POST requests to this endpoint.
Can you check whether your client's Access Token or Capability Token had expired when you got these errors?
I have a use case where I need to poll the OneNote API approximately every minute in order to respond to text added to pages by the user.
(Aside: I'd LOVE to use webhooks to get notifications only when something changes, but that's only supported for consumer notebooks at this time, as far as I can tell.)
Polling with this frequency works for a few users (5 or so), but then, with more users who authorized the same Microsoft application, the app seems to hit an application-level rate limit and begins receiving 429 Too Many Requests responses.
How can I ensure polling will still work as the number of users grows? And are there any rate limits that can be made public or shared in confidence for valid use cases?
It is possible to register for webhooks on the SharePoint notebooks as OneDrive items: when a notebook page gets updated, the notificationUrl fires, and you can then use delta calls to determine which sections (section.one files) have been updated.
I would then use the OneNote API to get the pages in the updated notebook sections: GET https://www.onenote.com/api/v1.0/me/notes/sections/{id}/pages
An alternative would be to treat the SharePoint drive as a WebDAV server and use the PROPFIND method with the getlastmodified property to poll the drive and determine which sections of the various notebooks have been updated.
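For the WebDAV alternative, the polling side could be sketched as follows. The request body and the multistatus response layout follow the WebDAV standard (RFC 4918); the function names are hypothetical, and actually sending the PROPFIND request (with authentication and a Depth header) is left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Dict, List

# Body of a PROPFIND request that asks only for getlastmodified.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<D:propfind xmlns:D="DAV:"><D:prop><D:getlastmodified/></D:prop></D:propfind>'
)

DAV = "{DAV:}"  # ElementTree notation for the DAV: namespace


def parse_last_modified(multistatus_xml: str) -> Dict[str, datetime]:
    """Map each href in a 207 Multi-Status body to its getlastmodified time."""
    root = ET.fromstring(multistatus_xml)
    result: Dict[str, datetime] = {}
    for response in root.iter(DAV + "response"):
        href = response.findtext(DAV + "href")
        stamp = response.findtext(f".//{DAV}getlastmodified")
        if href and stamp:
            # getlastmodified uses the HTTP date format, e.g. "... GMT"
            result[href] = parsedate_to_datetime(stamp)
    return result


def changed_since(modified: Dict[str, datetime], last_poll: datetime) -> List[str]:
    """Return the hrefs modified after the previous poll, sorted by path."""
    return sorted(h for h, t in modified.items() if t > last_poll)
```

`last_poll` needs to be a timezone-aware datetime (for example in UTC) so that it can be compared with the parsed timestamps.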
But I agree it would be easier if OneNote webhooks were extended to SharePoint.
When I make API calls to the server, I'm getting intermittent 404 errors for various data (grades, role IDs, terms) that don't recur the next time I make the same call. The data is there on the server, viewable by the same user, and is often returned successfully, but not every time. The same user context returns data successfully for other calls.
Any ideas what could be causing this?
I'm using the Valence API with the Python client library and our 9.4.1 SP18 instance of Desire2Learn in a non-interactive script.
More detail: the text returned with the bad 404s is " ErrorThe system cannot find the path specified."
It would help enormously to gather data about your case. In particular, packet traces that show successful calls from your client alongside unsuccessful ones would be very useful to see. If you are quite certain that you're forming the calls in the right way each time you make them (and I see no reason from your description why you shouldn't be), then the behaviour you're noticing seems to point to a wider network or configuration issue: sometimes your calls get through the web service layer properly, and sometimes they do not. That suggests the problem lies not in the way you're using the API but in the way the service is able to receive the request.
I would encourage you, especially if you can gather data to provide showing this behaviour, to open a support incident with Desire2Learn's help desk in conjunction with your Approved Support Contact, or your Partner Manager (depending on whether you're a D2L client or a D2L partner).