Why limited number of next page tokens? - youtube-api

Through a script I can collect the sequence of videos that search.list returns. The maxResults parameter was set to 50. The total number of items is large, but there are not enough next page tokens to retrieve all the desired results. Is there any way to retrieve all the returned items, or is this a YouTube restriction?
Thank you.

No, the number of search results you can retrieve is limited.
The total number of results you are allowed to retrieve seems to have been reduced to 500 (in the past it was limited to 1000). The API does not allow you to retrieve more than that from a single query. To get more, try running several queries with different parameters, such as publishedAfter, publishedBefore, order, type, or videoCategoryId, or vary the query terms, and keep track of the distinct video IDs returned.
For reference, see:
https://code.google.com/p/gdata-issues/issues/detail?id=4282
By the way, "totalResults" is an estimate and its value can change on the next page call.
See: YouTube API v3 totalResults field is returning 1 000 000 when it shoudn't
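The suggestion above (run several differently parameterized queries and track the distinct video IDs) can be sketched roughly as follows. This is only an illustration: `fetch_page` is a hypothetical callable standing in for whatever HTTP client issues the search.list request, and the sketch assumes each search result item carries its ID under `item["id"]["videoId"]`, as video-type search results do.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical signature: takes (query_params, page_token) and returns the
# decoded JSON body of one search.list response.
FetchPage = Callable[[Dict[str, str], Optional[str]], dict]


def collect_video_ids(fetch_page: FetchPage,
                      queries: Iterable[Dict[str, str]]) -> List[str]:
    """Page through every query and return distinct video IDs in first-seen order."""
    seen = set()
    ordered = []
    for params in queries:
        token = None
        while True:
            page = fetch_page(params, token)
            for item in page.get("items", []):
                video_id = item.get("id", {}).get("videoId")
                if video_id and video_id not in seen:
                    seen.add(video_id)
                    ordered.append(video_id)
            # When the API stops returning nextPageToken, this query is exhausted.
            token = page.get("nextPageToken")
            if not token:
                break
    return ordered
```

Overlapping queries cost extra quota, so it is probably worth choosing parameter variations that partition the results rather than repeat them.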

Related

Youtube Data API V3 Number of results not equal to maxResults/pageInfo.resultsPerPage

Good day. I'm trying to get the top gaming videos within the specified publish dates, ordered by view count. The number of items returned in the JSON response is less than 50, even though I set the maxResults parameter to 50 and the response reports pageInfo.resultsPerPage equal to 50. pageInfo.totalResults is also much larger than 50, so I would expect 50 items in the result. As an example, I'm using the following URL:
https://www.googleapis.com/youtube/v3/search?key={API_Key}&part=snippet&maxResults=50&order=viewCount&publishedAfter=2017-02-01T00%3A00%3A00Z&publishedBefore=2017-02-01T00%3A01%3A59Z&type=video&videoCategoryId=20
The last time I ran this query, the result had 20 items even though there are 161,307 total results. Is there a solution that ensures I always get a number of items equal to maxResults, if possible? I hope someone can help me with this. Thank you very much.
EDIT: I know how to use the page token, but I don't want to spend more than one request to get the items I need (as the Data API has limited credits per day). The issue I'm trying to resolve is making sure that I always get 50 items every time I invoke the request.
Although there are 160k+ total results, your publishedAfter/publishedBefore filters cut the number returned down to the videos published in that time range, which is 20.
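To see how narrow the filter in the question is, the two timestamps from the URL (with `%3A` decoded to `:`) can be compared directly. This is a small sketch; `window_seconds` is a name made up for illustration, and it only measures the window, it does not confirm how many videos fall inside it.

```python
from datetime import datetime, timezone


def window_seconds(published_after: str, published_before: str) -> float:
    """Return the length in seconds of a publishedAfter/publishedBefore window."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # the UTC timestamp format used in the question's URL
    start = datetime.strptime(published_after, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(published_before, fmt).replace(tzinfo=timezone.utc)
    return (end - start).total_seconds()


print(window_seconds("2017-02-01T00:00:00Z", "2017-02-01T00:01:59Z"))  # 119.0
```

The window is just under two minutes long, so a small number of matching items is plausible regardless of what totalResults reports.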
Well, I got 50 results with the URL request that you provided. Just make sure that you set the right values for the parameters you use, especially publishedAfter. You can also use the pageToken parameter to get the next page, that is, the next 50 results.
For more information, check this SO question on how to use pageToken

Is there any limit to the number of rows returned by API?

I am making a bulk call with 30 posts and daily data for all of them. Are there any limits on the number of rows that the API will return?
I am having problems getting the results.
Can anyone please help?
YouTube doesn't return any rows ... it's not relational data. That may sound like a pedantic thing to point out, but it's crucial for this next point: the API will return 50 videos at a time, along with tokens to get more results based on the same query, up to a total of 500 ... because the data isn't relational, you can't just "select all rows" that match certain criteria. Rather, it is probabilistically determining relevance to your search parameters, and after about 500 results the algorithms don't have enough certainty to make additional results relevant.
So in your case, where you can change the date as needed (to allow the algorithms to be more specific), you'll want to do a series of calls; perhaps one at a time (since you have to paginate anyway to get more than 50 results, it's probably not that much more expensive in terms of network bandwidth).

How can I get all results from the YouTube API (search API) response

As the title says.
I make a request like this:
https://www.googleapis.com/youtube/v3/search?key=AIzaSyDuxczhyyvHWfxKuF3ygW9p0GWmKlvWLYc&part=id,snippet&publishedAfter=2014-12-09T00:00:00Z&publishedBefore=2014-12-11T00:00:00Z&videoCategoryId=GCSG93LXRvICYgRElZ&type=video&maxResults=50&pageToken=
The total result count is 1000000, but I can only get a maximum of 500 results (10 pages, 50 results per page).
On the 10th page there is no nextPageToken property to go to the next page.
I don't know why.
How can I get all of the results?
YouTube imposes a soft limit of about 500. There is no direct way to get more than that through the API.
Full details: https://code.google.com/p/gdata-issues/issues/detail?id=4282
Relevant Excerpt:
"We can't provide more than ~500 search results for any arbitrary YouTube query via the API without the quality of the search results severely degrading (duplicates, etc.).
The v1/v2 GData API was updated back in November to limit the number of search results returned to 500. If you specify a start-index of 500 or more, you won't get back any results.
This was supposed to have also gone into effect for the v3 API (which uses a different method of paging through results) but it apparently was not pushed out, so it is still possible to retrieve up to 1000 search results in v3—the last 500 of which are usually of bad quality.
The change to limit v3 to 500 search results will be pushed out sometime in the near future. There will no longer be nextPageTokens returned once you hit 500 results.
I understand that the totalResults that are returned is much higher than 500 in all of these cases, but that is not the same thing as saying that we can effectively return all X million possible results. It's meant as an estimate of the total size of the set of videos that match a query and normally isn't very useful."
Updated - How to get around the 500 result max soft limit
Use the filters 'publishedAfter' and 'publishedBefore' to break your query up into a loop of queries by day/week/month until no more results are returned. Each periodic query should return fewer than 500 results, and together they will cover the full set.
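A minimal sketch of the windowing step, assuming UTC timestamps formatted like the ones in the question's URL. `date_windows` is a hypothetical helper name. Each yielded pair would be passed as publishedAfter and publishedBefore in a separate, fully paginated query.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def date_windows(start: datetime, end: datetime,
                 step: timedelta) -> Iterator[Tuple[str, str]]:
    """Yield consecutive (publishedAfter, publishedBefore) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # clamp the final window to the overall end
        yield cursor.strftime(fmt), upper.strftime(fmt)
        cursor = upper


start = datetime(2014, 12, 9, tzinfo=timezone.utc)
end = datetime(2014, 12, 11, tzinfo=timezone.utc)
for after, before in date_windows(start, end, timedelta(days=1)):
    print(after, before)
```

If a window still reaches the 500-result cap, it can be split again with a smaller step. Adjacent windows share a boundary timestamp, so deduplicating by video ID is a sensible precaution in case the boundaries are treated inclusively.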
By the way, there's documentation of this limit for channelId (still not for videoCategoryId):
https://developers.google.com/youtube/v3/docs/search/list#channelId
Note: Search results are constrained to a maximum of 500 videos if your request specifies a value for the channelId parameter and sets the type parameter value to video, ...

Accessing an item beyond start_index=1000 in a YouTube user upload feed

I am currently trying to pull data about videos from a YouTube user upload feed. This feed contains all of the videos uploaded by a certain user, and is accessed from the API by a request to:
http://gdata.youtube.com/feeds/api/users/USERNAME/uploads
Where USERNAME is the name of the YouTube user who owns the feed.
However, I have encountered problems when trying to access feeds which are longer than 1000 videos. Since each request to the API can return 50 items, I am iterating through the feed using max-results and start-index as follows:
http://gdata.youtube.com/feeds/api/users/USERNAME/uploads?start-index=1&max-results=50&orderby=published
http://gdata.youtube.com/feeds/api/users/USERNAME/uploads?start-index=51&max-results=50&orderby=published
And so on, incrementing start-index by 50 on each call. This works perfectly up until:
http://gdata.youtube.com/feeds/api/users/USERNAME/uploads?start-index=1001&max-results=50&orderby=published
At which point I receive a 400 error informing me that 'You cannot request beyond item 1000.' This confused me as I assumed that the query would have only returned 50 videos: 1001-1050 in the order of most recently published. Having looked through the documentation, I discovered this:
Limits on result counts and accessible results
...
For any given query, you will not be able to retrieve more than 1,000
results even if there are more than that. The API will return an error
if you try to retrieve greater than 1,000 results. Thus, the API will
return an error if you set the start-index query parameter to a value
of 1001 or greater. It will also return an error if the sum of the
start-index and max-results parameters is greater than 1,001.
For example, if you set the start-index parameter value to 1000, then
you must set the max-results parameter value to 1, and if you set the
start-index parameter value to 980, then you must set the max-results
parameter value to 21 or less.
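The rule quoted above boils down to requiring that start-index plus max-results not exceed 1,001. As a small sketch (the function name and defaults are chosen here for illustration), the largest permitted max-results for a given start-index is:

```python
def allowed_max_results(start_index: int, page_size: int = 50,
                        last_item: int = 1000) -> int:
    """Largest max-results value that keeps start-index + max-results <= last_item + 1."""
    if start_index < 1:
        raise ValueError("start-index is 1-based")
    remaining = last_item + 1 - start_index  # items left before the cap
    return max(0, min(page_size, remaining))


print(allowed_max_results(980))   # 21
print(allowed_max_results(1000))  # 1
print(allowed_max_results(1001))  # 0, i.e. the request would be rejected
```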
I am at a loss about how to access a generic user's 1001st last uploaded video and beyond in a consistent fashion, since they cannot be indexed using only max-results and start-index. Does anyone have any useful suggestions for how to avoid this problem? I hope that I've outlined the difficulty clearly!
Getting all the videos for a given account is supported, but you need to make sure that your request for the uploads feed is going against the backend database and not the search index. Because you're including orderby=published in your request URL, you're going against the search index. Search index feeds are limited to 1000 entries.
Get rid of the orderby=published and you'll get the data you're looking for. The default ordering of the uploads feed is reverse-chronological anyway.
This is a particularly easy mistake to make, and we have a blog post up explaining it in more detail:
http://apiblog.youtube.com/2012/03/keeping-things-fresh.html
The nice thing is that this is something that will no longer be a problem in version 3 of the API.

Twitter search API results

I'm using the Twitter API atom format
http://search.twitter.com/search.atom?q=Name&:)&since:year-month-date&rpp=1500
but it's only returning 100 tweets. I tried using the JSON format as well, but it also returned only 100 results. Am I doing something wrong that limits me to 100 results?
Yes, you're limited on the number of results per page. In order to get more results, you have to use the page parameter like so:
http://search.twitter.com/search.atom?q=Name&:)&since:year-month-date&rpp=1500&page=2
EDIT
rpp: the number of tweets to return per page, up to a max of 100. E.g., http://search.twitter.com/search.atom?lang=en&q=devo&rpp=15
page: the page number to return, up to a max of roughly 1500 results (based on rpp * page)
Source: http://search.twitter.com/api/
In other words, your rpp won't work as you expect because the max is 100.
My suggestion:
Make a request to the API and retrieve 100 results at a time.
Use a loop to check whether the result count equals 100.
If it does, make a new request for page 2.
Test again, checking the number of items, until the result set is smaller than 100.
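The loop suggested above might look roughly like this for the old page-based API. `fetch_page` is a hypothetical callable that requests one page (with rpp set to the page size) and returns the parsed tweets. The 1500 cap reflects the "roughly 1500 results" figure quoted from the documentation.

```python
from typing import Callable, List


def fetch_all_pages(fetch_page: Callable[[int], List[dict]],
                    page_size: int = 100,
                    max_results: int = 1500) -> List[dict]:
    """Request pages 1, 2, ... until a short page arrives or the cap is reached."""
    results: List[dict] = []
    page = 1
    while len(results) < max_results:
        batch = fetch_page(page)
        results.extend(batch)
        if len(batch) < page_size:  # a short (or empty) page means nothing is left
            break
        page += 1
    return results
```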
The Twitter Search API has changed, including in the naming of the parameters: for instance, rpp is now count and the page parameter was removed in favor of max_id, a parameter based on a timeline concept:
"To use max_id correctly, an application’s first request to a
timeline endpoint should only specify a count. When processing this
and subsequent responses, keep track of the lowest ID received. This
ID should be passed as the value of the max_id parameter for the next
request, which will only return Tweets with IDs lower than or equal to
the value of the max_id parameter."
https://developer.twitter.com/en/docs/tweets/timelines/guides/working-with-timelines
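A rough sketch of the max_id pattern described in the quote. `fetch` is a hypothetical callable taking `(count, max_id)` and returning a list of tweet dicts that each carry an integer `id`. Because max_id is inclusive according to the quote, this sketch passes the lowest ID minus one on the next request to avoid fetching the boundary tweet again. That subtraction is an adjustment made here, not part of the quoted text.

```python
from typing import Callable, List, Optional


def fetch_timeline(fetch: Callable[[int, Optional[int]], List[dict]],
                   count: int = 100,
                   max_requests: int = 10) -> List[dict]:
    """Walk backwards through a timeline using max_id until it runs dry."""
    tweets: List[dict] = []
    max_id: Optional[int] = None  # the first request specifies only a count
    for _ in range(max_requests):
        batch = fetch(count, max_id)
        if not batch:
            break
        tweets.extend(batch)
        lowest = min(t["id"] for t in batch)
        max_id = lowest - 1  # assumption: skip the tweet we already have
    return tweets
```

The `max_requests` bound is only there to keep the loop from running indefinitely against rate limits.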
The updated link to the Twitter search api is:
https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html
Remember that not all tweets are indexed and if you are using the non-commercial version, you are limited to a 7-day search.
