Does the number of rows count towards the API rate limit in Google Fusion Tables importRows?

I have been getting intermittent 500 errors while batch-uploading simple row data to Google Fusion Tables via the v2 API, using the importRows method.
We have tried throttling and backing off, but the patterns seem to indicate that we are going over quota even with small numbers of requests and fairly slow rates.
I can see in the API console that it is limited to 200 requests per 100 seconds (other posts mention a 0.5 requests/second rate limit).
We are, sadly, about to abandon the Fusion Tables API and rebuild the entire project on something else because of the unpredictable nature of the 500 errors. (Sometimes the insert happens and sometimes it doesn't, even when an error is returned, which means retrying runs the risk of duplicate inserts.)
It occurred to me: since we are uploading 1,000 rows per request, does each request count as 1,000 requests against the quota?
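Roughly, each batch goes through something like the sketch below (a simplified illustration, not our production code; the table ID, token, CSV chunks, and the exact pacing/retry values are placeholders, and the URL is the v2 media-upload endpoint for importRows):

    import time
    import requests

    # Placeholder values - not real credentials or table IDs.
    TABLE_ID = "YOUR_TABLE_ID"
    ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"
    IMPORT_URL = ("https://www.googleapis.com/upload/fusiontables/v2/tables/"
                  + TABLE_ID + "/import")

    def import_rows(csv_chunk, max_retries=5):
        """Send one importRows request (~1,000 rows of CSV), backing off on errors."""
        for attempt in range(max_retries):
            resp = requests.post(
                IMPORT_URL,
                params={"delimiter": ","},
                headers={
                    "Authorization": "Bearer " + ACCESS_TOKEN,
                    "Content-Type": "application/octet-stream",
                },
                data=csv_chunk.encode("utf-8"),
            )
            if resp.status_code == 200:
                return resp.json().get("numRowsReceived")
            # Note: a 500 does not guarantee the rows were NOT inserted,
            # so a blind retry like this can produce duplicates.
            time.sleep(2 ** attempt)
        raise RuntimeError("importRows still failing after retries")

    # Pace requests well under 200 per 100 seconds.
    csv_chunks = ["name,value\nfoo,1\nbar,2\n"]  # stand-in for real 1,000-row CSV chunks
    for chunk in csv_chunks:
        import_rows(chunk)
        time.sleep(2)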

Are you uploading media files when you make an importRows request? It could be that you're exceeding the table storage limit (250 MB per table). You may want to check your code and data payloads against this and the other Fusion Tables limitations.
Here's a good reference on the limits of Fusion Tables:
What are the technical limitations when using Fusion Tables?

Related

YouTube Data API Quota Wrongly Calculated, Quota Exceeded

I have a very simple use case: I am using the v3 YouTube Data API to get the list of comments. I am just fetching the list of videos and then fetching the comments (at a frequency of 5 seconds) to get updated messages, using the page token as needed to minimize the load and computation.
Today, after some time of internally testing the application, I started getting the quota exceeded exception. I know YouTube provides 10,000 units by default, and since reading the comments (and the videos as well) costs just 1 unit, I would expect to see similar numbers.
However, the usage seems to be calculated incorrectly.
Following are the request details:
As you can see, there are 2,895 total LiveChatMessages -> List requests.
However, when I go to IAM -> Quotas, it showed 14k earlier, then 12.6k, in quota usage.
There seems to be some problem either with the computation or with the documentation that defines the units for the queries. Can someone help, please?
PS: I am just using the two APIs mentioned above in the screenshot. Both are list calls.
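For context, the polling loop is essentially the following (a simplified sketch with the Python API client; the API key and liveChatId are placeholders, and the real application also fetches the video list first):

    import time
    from googleapiclient.discovery import build  # google-api-python-client

    # Placeholder values - not from the real application.
    API_KEY = "YOUR_API_KEY"
    LIVE_CHAT_ID = "SOME_LIVE_CHAT_ID"

    youtube = build("youtube", "v3", developerKey=API_KEY)

    page_token = None
    while True:
        # One liveChatMessages.list call per poll; the quota cost of this
        # call is exactly what is in dispute below.
        kwargs = {"liveChatId": LIVE_CHAT_ID, "part": "snippet,authorDetails"}
        if page_token:
            kwargs["pageToken"] = page_token
        resp = youtube.liveChatMessages().list(**kwargs).execute()

        for item in resp.get("items", []):
            print(item["snippet"].get("displayMessage"))

        page_token = resp.get("nextPageToken")
        time.sleep(5)  # poll roughly every 5 seconds, as described above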
As you can see, there are 2,895 total LiveChatMessages -> List requests. However, when I go to IAM -> Quotas, it showed 14k earlier, then 12.6k, in quota usage.
Yes, I can see that there are 2,895 requests, but how do you know what the quota costs are for those requests? You are using the YouTube Live Streaming API for those requests, not the YouTube Data API.
There is no documentation of the quota cost for the YouTube Live Streaming API calls. If Google says you used all your quota, then you probably have.
I would post an issue over on the issue forum asking them to document the quota cost for these calls: Issue forum

How can I solve the "Read Request Limit Error"?

I receive this error while trying to export from my datagrid to Google Sheets. How can I solve it?
Don't make too many requests too quickly.
You are either exceeding your quota or you are making too many requests too quickly.
Also, look into batch requests:
https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/batchUpdate
You may be trying to make a call to the API for every single cell you update, which is an easy way to run into the above error.
If you must do it on a cell by cell basis, you would have to insert a small delay between requests. Bear in mind that although the usage page says:
This version of the Google Sheets API has a limit of 500 requests per 100 seconds per project, and 100 requests per 100 seconds per user. Limits for reads and writes are tracked separately. There is no daily usage limit.
That does not mean you can make 100 requests in 1 second and then wait 99 seconds; doing so will give you a quota error like the one you are running into. You would have to put in, for example, a one-second delay between requests.
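As an illustration of the batch approach (a rough sketch with the Python client; the spreadsheet ID, ranges, and credential setup are placeholders you would replace with your own):

    import time
    from googleapiclient.discovery import build  # google-api-python-client

    # Placeholders - adapt to your own spreadsheet and OAuth/service-account setup.
    SPREADSHEET_ID = "YOUR_SPREADSHEET_ID"
    service = build("sheets", "v4")  # assumes default credentials are configured

    def write_batch(updates):
        """updates: list of (A1 range, 2-D values) pairs, written in ONE request."""
        body = {
            "valueInputOption": "RAW",
            "data": [{"range": rng, "values": vals} for rng, vals in updates],
        }
        return service.spreadsheets().values().batchUpdate(
            spreadsheetId=SPREADSHEET_ID, body=body
        ).execute()

    # One request covers many cells instead of one request per cell;
    # a short delay between batches keeps you under the per-100-second limits.
    write_batch([("Sheet1!A1:B2", [[1, 2], [3, 4]])])
    time.sleep(1)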

YouTube Data API limit reached! How can I increase this limit?

My application uses the YouTube Data API v3, and every day I hit the 10k query-cost limit, at which point my application is unusable.
I understand the old default limit was 1M (100x!), and I can find NO way to increase it. I completed the Google form titled "YouTube Data API Services – Exceptions form", which is apparently the only line of contact according to Cloud support, and have not received any update in over two weeks. Is there ANY other way to increase it?
I had no idea a Google API would be the largest hurdle in my project. Thanks! :)

Throttling of OneNote (Graph) API

We have developed an importing solution for one of our clients. It parses and converts data contained in many OneNote notebooks, to required proprietary data structures, for the client to store and use within another information system.
There is a substantial amount of data across many notebooks, requiring a considerable number of Graph API queries to retrieve it all.
In essence, we built a bulk-importing (batch-process, essentially) solution, which goes through all OneNote notebooks under a client's account, parses the section and page data of each, and downloads and stores all page content - including linked documents and images. The linked documents and images require the largest number of Graph API queries.
When performing these imports, the Graph API throttling issue arises. After a certain time, even though we are sending queries at a relatively low rate, we start getting 429 errors.
Regarding data volume, the average section size of a client notebook is 50-70 pages. Each page contains links to about 5 documents for download, on average. Thus it requires up to 70 + 350 = 420 requests to retrieve all the page content and files of a single notebook section. And our client has many such sections in a notebook. In turn, there are many notebooks.
In total, there are approximately 150 such sections across several notebooks that we need to import for our client. Given the stats above, this means that our import needs to make an estimated total of 60,000-65,000 Graph API queries.
To avoid flooding the Graph API service and to keep within the throttling limits, we have experimented a lot and gradually decreased our request rate to just one query every 4 seconds - that is, at most 900 Graph API requests per hour.
This already makes each section import noticeably slow, but it is bearable, even though it means our full import would take up to 72 continuous hours to complete.
However, even with our throttling logic at this rate implemented and proven working, we still get 429 "too many requests" errors from the Graph API after about 1 hour 10 minutes, i.e. about 1,100 consecutive queries. As a result, we are unable to proceed with our import on all the remaining, unfinished notebook sections. This lets us import only a few sections consecutively, and we then have to wait some arbitrary while before we can manually attempt to continue the importing.
So this is the problem we seek help with - especially from Microsoft representatives. Can Microsoft provide a way for us to perform this import of these 60,000-65,000 pages and documents at a reasonably fast query rate, without getting throttled, so we can just get the job done as one continuous batch process for our client? For example, as a separate access point (dedicated service endpoint), perhaps time-constrained, e.g. configured for our use within a certain period, so that we could perform all the necessary imports within that window?
For additional information, we currently load the data using the following Graph API URLs (placeholders for the actual values are shown in uppercase between curly braces); a simplified sketch of how we pace these calls follows the list:
Pages under the notebook section:
https://graph.microsoft.com/v1.0/users/{USER}/onenote/sections/{SECTION_ID}/pages?...
Content of a page:
https://graph.microsoft.com/v1.0/users/{USER}/onenote/pages/{PAGE_ID}/content
A file (document or image), e.g. a link from the page content:
https://graph.microsoft.com/v1.0/users/{USER}/onenote/resources/{RESOURCE_ID}/$value
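In simplified form (Python here rather than our actual implementation; the user, IDs, and token are placeholders), the pacing works like this:

    import time
    import requests

    # Placeholder values - not from the real import job.
    GRAPH = "https://graph.microsoft.com/v1.0"
    USER = "someone@example.com"
    SECTION_ID = "SECTION_ID"
    HEADERS = {"Authorization": "Bearer ACCESS_TOKEN"}

    def paced_get(url, delay=4.0):
        """One Graph API call, then wait so we stay at ~1 request per 4 seconds."""
        resp = requests.get(url, headers=HEADERS)
        time.sleep(delay)
        resp.raise_for_status()
        return resp

    # Pages under a notebook section (pagination via @odata.nextLink omitted).
    pages_url = GRAPH + "/users/" + USER + "/onenote/sections/" + SECTION_ID + "/pages"
    pages = paced_get(pages_url).json()["value"]

    for page in pages:
        # Content of a page (HTML).
        content_url = GRAPH + "/users/" + USER + "/onenote/pages/" + page["id"] + "/content"
        html = paced_get(content_url).text
        # ...then parse the HTML for resource links and download each one via
        # /users/{USER}/onenote/resources/{RESOURCE_ID}/$value, also paced.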
Which call is most likely to cause the throttling?
What can you retrieve before being throttled - just the page IDs (150 calls total) or page IDs + content (about 10,000 calls)? If the latter, can you store the results (e.g. in a SQL database) so that you don't have to make these calls again?
If you can get page IDs + content, can you then access the resources using preAuthenticated=true? (Maybe this is less likely to be throttled.) I don't actually download images offline, as I usually deal with ink or print.
I find the OneNote API is very sensitive to multiple calls made without waiting for them to complete; I find more than 12 simultaneous calls via a curl multi technique problematic. Once you get throttled, if you don't back off immediately you can be throttled for a long, long time. I usually have my scripts bail if I get too many 429s in a row (I have it set to bail for 10 minutes after 10 consecutive 429s).
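Illustratively, that back-off-and-bail logic could look something like this (a sketch, not the actual script; the 10-in-a-row and 10-minute values are the ones mentioned above, everything else is assumed):

    import time
    import requests

    def get_with_bail(url, headers, max_429_in_a_row=10, bail_seconds=600):
        """GET a Graph URL, backing off on 429 and bailing after too many in a row."""
        consecutive_429 = 0
        while True:
            resp = requests.get(url, headers=headers)
            if resp.status_code != 429:
                return resp
            consecutive_429 += 1
            if consecutive_429 >= max_429_in_a_row:
                # Too many 429s in a row: stop hammering the API for a while.
                time.sleep(bail_seconds)
                consecutive_429 = 0
            else:
                # Honour Retry-After when the service sends it, else pause briefly.
                time.sleep(int(resp.headers.get("Retry-After", 5)))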
We now have the solution released and working in production. It turns out that adding ?preAuthenticated=true to the page content requests does indeed return the page content with the resource links (for the contained documents and images) in a different format. Querying those resource links then seems not to impact the API throttling counters, as we have had no 429 errors since.
We even managed to bring the call interval down from 4 seconds to 2, without any problems. So I have marked codeeye's answer as the accepted one.
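For anyone following the same route, the change amounts to requesting the page content with the preAuthenticated flag and then downloading the resource URLs embedded in the returned HTML (a sketch only; the user, page ID, and token are placeholders):

    import requests

    # Placeholder values.
    GRAPH = "https://graph.microsoft.com/v1.0"
    HEADERS = {"Authorization": "Bearer ACCESS_TOKEN"}

    # Ask for the page content with pre-authenticated resource links.
    html = requests.get(
        GRAPH + "/users/USER/onenote/pages/PAGE_ID/content",
        params={"preAuthenticated": "true"},
        headers=HEADERS,
    ).text

    # The returned HTML now carries pre-authenticated URLs for the page's
    # images and files, which can be downloaded directly - and, per the
    # experience above, without tripping the throttling in the same way.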

Ways to pull (potentially) large amounts of data from Twitter

I've been playing around with the Twitter API using Twitter4j. I am trying to pull data given a keyword and date; an example of a query I would run using the REST API would be
bagels since:2014-12-27
Which would give me all tweets containing the keyword 'bagels' since 2014-12-27.
This works in theory, but I quickly exceeded the rate limits, since each query returns up to 100 results and only 180 queries are allowed within a 15-minute interval. There are many keywords that return more than 18k results.
Is there a better way to pull large amounts of data from Twitter? I looked at the Streaming API but I don't know if I can pull data from a certain date range.
There are a few things you can do to improve your rates:
Make sure your count is maxed at 100, which it looks like you're doing.
Use Application-Only authorization - it increases your rate limit to 450.
Use the max_id and since_id parameters to page through data and avoid re-querying results you've already received; see the Working with Timelines docs, and the sketch after this list, to see what I mean.
Consider using Gnip if you're willing to pay to remove rate limits.
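For example, paging with max_id looks roughly like this (a sketch against the v1.1 search endpoint using app-only auth and plain HTTP rather than Twitter4j; the bearer token is a placeholder):

    import time
    import requests

    SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"
    HEADERS = {"Authorization": "Bearer APP_ONLY_BEARER_TOKEN"}  # placeholder

    def search_all(query, since_date):
        """Yield all available tweets for a query, paging backwards with max_id."""
        max_id = None
        while True:
            params = {"q": query + " since:" + since_date, "count": 100}
            if max_id is not None:
                params["max_id"] = max_id
            statuses = requests.get(SEARCH_URL, headers=HEADERS,
                                    params=params).json().get("statuses", [])
            if not statuses:
                return
            yield from statuses
            # Ask for tweets strictly older than the oldest one we just saw.
            max_id = min(t["id"] for t in statuses) - 1
            # App-only auth allows 450 search requests / 15 min, i.e. ~1 every 2 s.
            time.sleep(2)

    for tweet in search_all("bagels", "2014-12-27"):
        print(tweet["text"])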
