We have an application that sends mail merge campaigns via the Graph API. The app tracks timings and benchmarks per campaign. Some of our users run into an issue where their mail merge campaign sometimes sends very slowly. Normally, for a campaign with about 1k recipients, the average send time per email is under 1-2 seconds. But sometimes we see averages of around 17-30 seconds per email sent through the Graph API. For 1k recipients, that is a long time to process a mail merge campaign. More often than not, the campaign then takes 2-4 hours in total to finish. Note that this isn't necessarily tied to the number of emails or recipients: we've seen it happen with anywhere from 500 to 2k recipients.
I'm not sure whether the issue here is with the Graph API or with throttling on the user's mailbox (sending limits). I suspect it's the former because users will often resend the same campaign when they see that the first one is very slow. What ends up happening is that the second campaign finishes very quickly, even before the slow campaign has processed half of its recipients.
It almost seems like the server processing my first set of Graph API requests is warming up. Is there any configuration that I can set in the API request that may help with this? I couldn't find anything in the documentation referencing this.
Has anyone else had experience with this issue? Is there some configuration that the tenant can adjust or review that may be a root cause of this?
I got this error:
The request cannot be completed because you have exceeded your quota.
and I cannot understand it: does YouTube limit the number of requests? That is, can I not build my project using the API with my own channel? If so, what is the point of the YouTube Data API? If I am already limited at the development stage, what will happen when users come in? Will my project go down within 5 minutes?
I also cannot understand how I could have made 10,000 requests in a day, given that I only worked on localhost for about 3 hours. Is this possible?
Indeed, Google's Developers Console shows text like Queries per day, but that's very misleading (and may well be reported to Google as a Web UI bug).
You have to be aware that the YouTube Data API's quota system does not count the number of endpoint calls you make during a day; it counts the cumulative number of quota units that each of your endpoint calls costs.
For example, if you have 10000 units of quota allocated for daily usage, you may very easily exceed this upper bound after only 100 calls to the Search.list API endpoint, since each such call costs 100 units.
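To make the unit accounting concrete, here is a rough sketch of the arithmetic. The cost table is an assumption based on my reading of Google's published quota costs (100 units for search.list, 1 unit for most simple list calls such as videos.list); check the current quota calculator before relying on these numbers.

```python
from typing import Dict, Iterable, Tuple

# Assumed per-call costs in quota units; verify against Google's quota cost table.
UNIT_COSTS: Dict[str, int] = {
    "search.list": 100,
    "videos.list": 1,
    "channels.list": 1,
}

DAILY_QUOTA = 10_000  # default allocation mentioned above


def quota_used(calls: Iterable[Tuple[str, int]]) -> int:
    """Sum the quota units consumed by (method, number_of_calls) pairs."""
    return sum(UNIT_COSTS[method] * count for method, count in calls)


def calls_until_exhausted(method: str, quota: int = DAILY_QUOTA) -> int:
    """How many calls to a single method fit into the given quota."""
    return quota // UNIT_COSTS[method]
```

With these assumed costs, three hours of testing a search feature on localhost is easily enough to use up the daily allocation.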
Many API users find the default quota allocation of 10000 units quite constraining, even during the development stage of their apps. To tackle this issue, I recommend two things:
Develop your app so that it caches the API responses it receives from the endpoints it calls; this way, during development (and later in production too, albeit with different logic), repeated calls to endpoints do not result in actual API requests but are served from the app's local cache.
Apply for a quota extension using Google's official form; be aware that, in the experience of users of this forum, Google's answer usually does not arrive quickly.
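The caching suggestion could look roughly like the sketch below. It is a minimal in-memory cache with a time-to-live, not a recommendation of any particular library; the class name CachedClient and the fetch callable (which stands in for whatever function actually calls the API) are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedClient:
    """Serve repeated identical calls from a local cache instead of the API."""

    def __init__(
        self,
        fetch: Callable[[str, Dict[str, Any]], Any],  # hypothetical: performs the real API call
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Tuple[Any, ...], Tuple[float, Any]] = {}

    def get(self, endpoint: str, **params: Any) -> Any:
        # Parameter values must be hashable so they can form part of the cache key.
        key = (endpoint, tuple(sorted(params.items())))
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # cache hit: no quota units spent
        response = self._fetch(endpoint, params)
        self._store[key] = (now, response)
        return response
```

During development a long TTL (or a cache persisted to disk) keeps quota usage low; in production the TTL would need to reflect how fresh the data has to be.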
We have a multitenant application that pulls the top 100 messages from users' inboxes and sent items once every minute. The code has been working fine and hasn't changed in the last week, but for the last 24 hours or so we have been getting intermittent gateway timeouts after 30 seconds, across different users and tenants, when pulling the messages from Graph.
Is there any way to reach out to support to fix this?
If you reduce the number of messages you ask for in one request and also batch your requests to Graph, you should no longer see the 504 gateway timeouts. More on batching here: https://learn.microsoft.com/en-us/graph/json-batching
Also, what select parameters are you using in your query? Are you using expands or any open extensions? These can put additional load on the request too.
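As a rough illustration of the batching idea, the sketch below only builds the JSON bodies you would POST to the $batch endpoint; it makes no network calls. The page size of 25, the selected fields, and the /users/{id}/mailFolders/{folder}/messages paths are assumptions for illustration, and the limit of 20 requests per batch is taken from the JSON batching documentation linked above.

```python
from typing import Any, Dict, List, Sequence

MAX_BATCH_SIZE = 20  # documented per-batch request limit for Graph JSON batching


def build_batches(
    user_ids: Sequence[str],
    top: int = 25,  # assumed smaller page size than the original 100
    select: str = "id,subject,receivedDateTime",  # assumed field list
) -> List[Dict[str, Any]]:
    """Build $batch payloads that read inbox and sent items for each user."""
    requests: List[Dict[str, str]] = []
    for user_id in user_ids:
        for folder in ("inbox", "sentitems"):  # well-known folder names
            requests.append(
                {
                    "id": str(len(requests) + 1),  # must be unique within a batch
                    "method": "GET",
                    "url": (
                        f"/users/{user_id}/mailFolders/{folder}/messages"
                        f"?$top={top}&$select={select}"
                    ),
                }
            )
    # Split into chunks that respect the per-batch limit.
    return [
        {"requests": requests[i : i + MAX_BATCH_SIZE]}
        for i in range(0, len(requests), MAX_BATCH_SIZE)
    ]
```

Each returned dictionary would be sent as the JSON body of one POST to https://graph.microsoft.com/v1.0/$batch; individual responses inside a batch can still fail on their own, so each one needs its status checked.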
For the past week or so, we've been experiencing 504 Gateway Timeout errors while fetching email messages from the MS Graph API. Before that, the same application ran for over a month without experiencing that error, at least not with any significant frequency.
We are using V1.0 of the MS Graph API
Our query is fairly simple:
$top=100&$orderBy=lastModifiedDateTime desc&$filter=lastModifiedDateTime lt 2019-09-09T19:27:55Z and parentFolderId ne 'JunkEmail'
We get the timeout for users who have large volumes of data (> 100K email messages), but occasionally also for users with smaller volumes (around 18K email messages). Volume has not changed much between the time when the system was working and now, when we see many timeouts.
We've tried simplifying the query, reducing the number of messages we request etc., but that seems to have only limited and intermittent impact.
My question - What can we do to eliminate/significantly reduce the possibility of getting the 504, Gateway Timeout error from the MS Graph API?
I suspect that since we are asking for messages without a folder filter, we may be stressing the query engine. Just a hunch, and if anyone has real insight into the MS Graph API, I'd love to know whether that is possible. Also, any information that helps us better understand what is going on under the hood would be much appreciated.
Update 1 (2019-09-13 15:44:00 EST) - Here is a visualization of a set of fetch requests made by the app over a period of approximately 12 hours. The pink bars are the number of successful fetches, and the light blue ones are the failed requests (all having 504, Gateway Timeout as the failure code). As you can see, when the app starts it has a number of failures, which eventually reduce and go away. Then from around 4:30AM to 9:30AM, there are a number of failures, which eventually subside. Almost all failures happen while fetching messages for one user, who has a very large mailbox (> 220K messages). I realize this is a small data set, and am happy to generate one that runs for a longer period of time if that helps. Also, the app in question is running on our Azure tenant, as part of an Azure Function app, in the "East US" location.
Update 2 (2019-09-16 09:32:00 EST) - We ran the system for the last 3 days and here is a visualization of the fetch requests made by the app during that time. The blue bars are successful fetches, and the pink bars are failed fetches (all having 504, Gateway Timeout as the failure code). The summary is that except for a small window from 11PM to 2AM on the first night, no request succeeded for this one particular user with a large mailbox. In effect, that means that in spite of retry logic etc., we are unable to process that user's data.
Microsoft Graph can be slow at times and will throttle occasionally.
I'd advise you let the Graph SDK do the hard work to save you from writing code to handle all this yourself.
Use the Microsoft Graph client library version 1.17.0+, as it introduced automatic retry on 504 errors. It also handles throttling (code 429) when it occurs.
The point I am trying to make is that you can either retry on a 504 or 429 yourself or delegate that responsibility to an SDK.
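If you do handle retries yourself, the logic might look something like this sketch. It is not what the SDK does internally; the send callable is a hypothetical stand-in for your HTTP call and is assumed to return a (status, headers, body) tuple, and the attempt count and backoff values are arbitrary.

```python
import time
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status code, headers, body)
RETRYABLE_STATUSES = {429, 504}


def send_with_retry(
    send: Callable[[], Response],  # hypothetical: performs one HTTP request
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry on 429/504, honouring Retry-After when the server sends it."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or last_attempt:
            return status, headers, body
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)  # server-provided wait in seconds
        else:
            delay = base_delay * (2 ** attempt)  # exponential backoff fallback
        sleep(delay)
    raise AssertionError("unreachable")
```

The caller still has to decide what to do when the last attempt also returns 429 or 504, which is exactly the situation described in Update 2 of the question.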
Good to hear that the retry is helping. I've got a couple of options to try:
1) Change your query and move the ordering responsibilities to the client. $orderBy=lastModifiedDateTime desc and the filter require indices to be created, which increases the load on the mailbox. Doing client-side ordering may be better for these large mailboxes.
2) Use delta query (with your filter) to sync and get incremental changes. You will have to add a folder hierarchy sync. You may be able to make parallel calls. I suspect that this will give you much better performance after the initial sync.
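Both options can be sketched together as below. This is only an outline: get_json is a hypothetical callable that fetches a URL and returns the parsed JSON page, and the sort assumes every lastModifiedDateTime is a UTC ISO 8601 string in the same format, so that string order matches time order.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Message = Dict[str, Any]


def sync_delta(
    get_json: Callable[[str], Dict[str, Any]],  # hypothetical HTTP GET + JSON parse
    start_url: str,
) -> Tuple[List[Message], Optional[str]]:
    """Follow @odata.nextLink pages until the service returns @odata.deltaLink."""
    items: List[Message] = []
    url = start_url
    while True:
        page = get_json(url)
        items.extend(page.get("value", []))
        next_link = page.get("@odata.nextLink")
        if next_link:
            url = next_link  # more pages in this sync round
            continue
        # The last page carries the link to use for the next incremental sync.
        return items, page.get("@odata.deltaLink")


def newest_first(messages: List[Message]) -> List[Message]:
    """Client-side replacement for $orderBy=lastModifiedDateTime desc."""
    return sorted(messages, key=lambda m: m["lastModifiedDateTime"], reverse=True)
```

On the first run start_url would be the delta endpoint for a folder; on later runs it would be the stored delta link, so only changes since the previous sync come back.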
I encountered the same issue. 504 error while trying to get all messages. After a thorough inspection I figured that in our case the problem was draft items. In some cases they were throwing errors. After adding filter "isDraft eq false" 504 stopped and we're getting all messages. Turns out that some drafts are broken. They won't show up in OWA or Outlook and in our case the one that was messing with the query was stored under parentFolderId that was non-existent, which is a huge problem in and of itself in my opinion.
This question is regarding the O365 Activity Management API
We are using the API to retrieve audit log notifications from multiple channels (AzureAD, Outlook, SharePoint, etc.) for very large tenants, meaning that we need to retrieve potentially millions of notifications over a relatively short timespan.
O365 gathers audit notifications into a series of "blobs", each of which contains a number of individual notifications (JSON messages). To my understanding, which comes partly from correspondence with the API's dev team and partly from reading the docs, these blobs should contain a "considerable" number of notifications, so that they act as a sort of batching mechanism for the actual web requests.
In our approach, we request blob URLs for an interval of an hour, and then make a request for each individual blob.
However, having tested with a number of different tenants and different PublisherIdentifiers, we only seem to get around 2.5 messages per blob on average, no matter the total number of notifications "waiting" to be fetched.
This becomes a major issue for the larger tenants, as it puts a strain on the SIEM solution running the fetcher logic (a Python service) due to the number of requests needed, and it also gives us throttling issues with the API itself.
In effect, we simply cannot fetch the audit notifications fast enough to keep up within the retention period. Had the blobs contained more notifications each, we would be fine, as the total amount of data (in MBs) is not that large.
A "funny" thing is that if we use the visual query tool within the Admin Center of the tenant, it searches and retrieves the notifications very fast.
My questions
Has anyone had any experience with this issue, or perhaps had a better "batch performance"?
Does anyone have any ideas as to what we could try to get a better performance?
As mentioned we have been in direct contact with the dev team and the program manager in Redmond. They have been very helpful with other issues we had, but they referred us to support for this specific issue - who in turn referred us to the forums / community. We currently do not have access to premium support...
Example request for content blobs for an hour
https://manage.office.com/api/v1.0/{tenantid}/activity/feed/subscriptions/content?contentType=Audit.Exchange&PublisherIdentifier={pub.id}&startTime=2017-12-03T10:31:24&endTime=2017-12-03T11:31:24
When retrieving the individual blobs, we just use the URLs given to us by the above request.
You can avoid throttling by appending "?PublisherIdentifier={Tenant ID}" to the contentUri in the retrieve content get request.
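A small helper for this might look like the following sketch. The function name is made up, and it uses urllib.parse so that a contentUri which already carries a query string is handled as well; whether the service ever returns such URIs is an assumption on my part.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_publisher_identifier(content_uri: str, publisher_id: str) -> str:
    """Return content_uri with PublisherIdentifier set in its query string."""
    parts = urlsplit(content_uri)
    # Keep any existing parameters, but drop a previous PublisherIdentifier.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "publisheridentifier"
    ]
    query.append(("PublisherIdentifier", publisher_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

You would apply this to each contentUri from the content listing before issuing the GET for that blob.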
How can I add a PublisherId to a GetBlob call to the Office365 Rest API to avoid throttling?
I have been working with the Office 365 Management Activity API for the past 6 months, and I have faced this kind of issue before. It occurs if you try to get all the audit log contents from your Office 365 tenant at a fixed interval, which results in throttling. For your information, it is not possible to avoid throttling issues (resource overuse) for large active tenants.
To overcome these issues, you can create and deploy a web application in the cloud and register it with the Office 365 Management Activity API as a webhook.
Whenever the Office 365 tenant wraps the activity logs into an Azure Blob, it will immediately send the blob details to your registered web application. You can refer to this link to learn how to enable a webhook for a web application. Once you receive the blob details from the Office 365 tenant, extract the logs from the Azure Blob and save them in your own blob storage or in a SQL / NoSQL database.
I had a similar issue. Pulling down logs would take longer than the interval of time allotted to the Python script and the script would start overlapping itself or would fall behind when trying to pull logs for a SIEM implementation.
https://github.com/IntegralDefense/o365_log_fetch
I'm a little late to this post, but by using asyncio in Python 3.5+ together with aiohttp, you can make concurrent calls to the O365 Management API and pull down the logs much faster. I did some testing and retrieved logs for a 13 hour window (Audit.Exchange, Audit.AzureActiveDirectory, and Audit.Sharepoint). It took around 20 minutes using 'requests' and making the API calls sequentially. After implementing asyncio/aiohttp, the same time frame took just under 2 minutes (500,000+ individual events were pulled from the data located at several thousand content blobs/locations).
I've been running the script in 10 minute intervals and usually the script completes in < 10 seconds.
The script I pasted above also supports pagination. So if you get a content list that was truncated in the response from Microsoft, the script will keep reaching out and pulling down more content locations.
At this time, the documentation isn't up to speed, but hopefully that will be caught up soon.
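The concurrency pattern described above can be sketched as follows. This is not the linked script; it is a stripped-down outline in which fetch is a hypothetical coroutine (in practice it would wrap an aiohttp GET of one content blob) and the concurrency cap of 10 is an arbitrary value chosen to stay clear of throttling.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def fetch_all(
    urls: Sequence[str],
    fetch: Callable[[str], Awaitable[Any]],  # hypothetical: downloads one blob
    max_concurrency: int = 10,  # assumed cap, tune against throttling
) -> List[Any]:
    """Fetch all URLs concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> Any:
        async with semaphore:
            return await fetch(url)

    # gather preserves the order of the input URLs in its result list.
    return await asyncio.gather(*(bounded(url) for url in urls))
```

It would be driven with something like asyncio.run(fetch_all(content_uris, fetch_blob)), where content_uris comes from the content listing and fetch_blob is your own download coroutine.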
I have gone through the documentation provided at http://www.fedex.com/us/developer/web-services/index.html but I am not able to find how many times I can query the FedEx APIs.
Does anyone have any idea or experience with this?
I don't want to put a bug in production code, so taking precautions.
Documentation I follow
Thanks
There are no hard API limits for web services. FedEx does audit logs so they will shut you down if you're sending too many requests, especially with tracking.
Seems like they do have rate limits as of now (2022):
According to this page
The throttling limit is set to 250 transaction over 10 seconds. If
this limit is reached in the first few seconds, HTTP error code 429
Too many requests will be returned and transactions will be restricted
until 10 seconds is reached; transactions will then resume again. For
example, if we receive 250 requests in the first four seconds, an HTTP
error code 429 Too many requests - ‘We have received too many requests
in a short duration. Please wait a while to try again.’ will be
returned and transactions will be restricted for the next six seconds
and then resume again.
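To stay under a limit like the one quoted, a client-side limiter along these lines could be used. It is a sketch under assumptions: it enforces a sliding window of 250 calls per 10 seconds, which is stricter than the fixed window the quote seems to describe, and the class name and injectable clock/sleep parameters are my own.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Block until a call is allowed under max_calls per period seconds."""

    def __init__(
        self,
        max_calls: int = 250,  # figure taken from the quoted limit
        period: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def acquire(self) -> None:
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest call expires, then re-check.
            self._sleep(self._period - (now - self._calls[0]))
```

Call limiter.acquire() right before each request; you would still want to handle a 429 response, since other processes using the same credentials count towards the same limit.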