YouTube Analytics API pagination - youtube-api

I am using the YouTube Analytics API (v1-rev18-1.15.0-rc). I tried to get some channel reports using the video dimension. According to the API documentation, it has a limit of maxResults <= 10. I set startIndex and maxResults as below, but the second query returns nothing for the following code.
First page returns 10 rows.
query.setMaxResults(10);
query.setStartIndex(1);
Using the same query object, the second page returns nothing (resultTable.Rows is null)
query.setStartIndex(11);
result = query.execute();
result.getRows() == null; // true
I tried creating a new query object each time and setting maxResults to a smaller number such as 3, but neither worked. In the queries I tested, even for dimensions without a maxResults limit, such as the day dimension, it returned null rows whenever startIndex was > 1, even if it was the first query. Did I miss anything?
I just found that the pagination works in Content Owner Reports, not in Channel Reports.

The maxResults <= 10 limitation only applies to some sort orders, such as by views or watch time. When sorting on any of the dimensions, such as day or country, the maximum number of results is much higher.
But when I try it out, maxResults seems to behave like an end index:
startIndex=1 maxResults=10 -> result: 1..10
startIndex=2 maxResults=10 -> result: 3..11
startIndex=5 maxResults=10 -> result: 9..14
startIndex=10 maxResults=10 -> result: 19
startIndex=11 maxResults=10 -> result: none
startIndex=1 maxResults=20 -> result: 1..20
startIndex=10 maxResults=20 -> result: 19..29
startIndex=20 maxResults=20 -> result: 39
startIndex=21 maxResults=20 -> result: none
The effective start also seems to follow the formula startIndex*2 - 1. This looks like a bug to me.
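The numbers listed above fit a simple model, sketched here in Python. It is only a fit to these observations, not documented API behavior, and the function name observed_range is made up for illustration.

```python
def observed_range(start_index, max_results):
    """Model of the rows the API appeared to return (1-based, inclusive).

    Assumption fitted to the observations listed above, not documented
    behavior: the effective first row is 2*start_index - 1 and the last
    row is start_index + max_results - 1.
    """
    first = 2 * start_index - 1
    last = start_index + max_results - 1
    if first > last:
        return None  # corresponds to the "none" cases
    return (first, last)


for start, size in [(1, 10), (2, 10), (5, 10), (10, 10), (11, 10), (10, 20), (21, 20)]:
    print(start, size, observed_range(start, size))
```

If the model holds, a request only returns rows while startIndex <= maxResults, which would explain why asking for startIndex=11 with maxResults=10 comes back empty.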

Related

Microsoft Graph "messages" delta request truncates too many results with date filter

I think I've found a bug with the date filtering on the delta API.
I'm finding on one of the email accounts I'm working with using Office 365 Graph API that the "messages" graph API delta request is returning a different number of items than are actually in a folder for the expected time range. There are 150,000 items covering 10 years in the folder but delta only returns the last 5,000-ish items covering the last 60 or so days.
Paging Works Fine
When querying the Graph API for the folder "Inbox", it reports 154,045 total items and 57,456 unread items.
IUserMailFoldersCollectionPage foldersPage =
await client.Users[mailboxid].MailFolders.Request().GetAsync();
I can skip over 10,000, 50,000 or more messages using paging.
model.messages = await client.Users[mailboxid].MailFolders[folderid].Messages.Request().Top(top)
.Skip(skip).GetAsync();
Delta with Date Filter doesn't work
But when looping with nextToken and deltaTokens, the deltaToken appears after 5000 or so email messages. Basically it seems like it's only returning results for the last couple months even though the filter is saying find messages for the last 20 years.
Here is the example for how we generate the Delta request. The time is hardcoded here but in reality it is a variable.
var sFilter = $"receivedDateTime ge {DateTimeOffset.UtcNow.AddYears(-20).ToString("yyyy-MM-dd")}";
model.messages = await client.Users[mailboxid].MailFolders[folderid].Messages.Delta().Request()
.Header("Prefer", "odata.maxpagesize=" + maxpagesize)
.Filter(sFilter)
.OrderBy("receivedDateTime desc")
.GetAsync();
And then on each paging operation I do the following. "nexttoken" is either the next or delta link depending on what came back from the first request.
model.messages = new MessageDeltaCollectionPage();
model.messages.InitializeNextPageRequest(client, nexttoken);
model.messages = await model.messages.NextPageRequest
.Header("Prefer", "odata.maxpagesize=" + maxpagesize)
.GetAsync();
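For reference, the paging loop described above can be sketched in Python roughly as follows. This is a minimal sketch, not the SDK's implementation: fetch_page is a hypothetical callable that takes a URL and returns the parsed JSON response, and the fake pages stand in for real HTTP calls. The only names taken from Graph itself are the response properties value, @odata.nextLink and @odata.deltaLink.

```python
def collect_delta(fetch_page, first_url):
    """Follow @odata.nextLink until a page without one is returned.

    fetch_page is a hypothetical callable: URL -> parsed JSON dict.
    Returns the collected items and the delta link (or None if absent).
    """
    items, url = [], first_url
    while True:
        page = fetch_page(url)
        items.extend(page.get("value", []))
        next_link = page.get("@odata.nextLink")
        if next_link is None:
            # The final page carries a delta link instead of a next link.
            return items, page.get("@odata.deltaLink")
        url = next_link


# Stand-in for real HTTP calls: two fake pages keyed by URL.
fake_pages = {
    "page1": {"value": [{"id": "a"}, {"id": "b"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "c"}], "@odata.deltaLink": "delta1"},
}
messages, delta_link = collect_delta(fake_pages.__getitem__, "page1")
print(len(messages), delta_link)
```

Counting the items collected before the delta link appears is a quick way to confirm whether a mailbox is affected by the truncation described here.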
Delta without Filter works
If I do the exact same code for delta above but remove the "Filter" operation on date, then I get all the messages in the folder.
This isn't a great solution, since I normally only need messages from the last year or two, and if there are 15 years of messages it is a huge waste to query everything.
Update on 12/3/2019
I'm still getting this issue. I recently switched back to trying to use Delta again whereas before I was querying everything from the server even though I might only need the last month of data. But that's super wasteful.
This code works fine for most mailboxes but sometimes I encounter a mailbox with this issue.
My code looks like this.
string sStartingTime = startingTime.ToString("yyyy'-'MM'-'dd'T'HH':'mm':'ss") + "Z";
var messageCollectionPage = await client.Users[mailboxsource.GetMailboxIdFromAccountID()].MailFolders[folder.Id].Messages.Delta().Request()
.Filter("receivedDateTime+ge+" + Uri.EscapeDataString(sStartingTime))
.Select(select)
.Header("Prefer", "odata.maxpagesize=" + preferredPageSize)
.OrderBy("receivedDateTime desc")
.GetAsync(cancellationToken);
At around 5000 results the Delta request just stops returning results even though there are 66K items in the folder.
Paul, my peers confirmed there is indeed a 5000-item limit if you apply $filter to a delta query of the message resource.
Within the next day, the docs will also be updated with this information. Thank you for your patience and support!

AdWords Scripts. report ORDER BY not working

This function is supposed to show in which hours adverts are clicked most often.
It works fine, but I have a problem sorting the result by "HourOfDay". When I add ORDER BY HourOfDay to the end of the query, I get an error.
function exportReportToSpreadsheet() {
var spreadsheet = SpreadsheetApp.create('INSERT_REPORT_NAME_HERE');
var report = AdWordsApp.report("SELECT Clicks, Impressions, AverageCpc, HourOfDay FROM ACCOUNT_PERFORMANCE_REPORT DURING LAST_MONTH ORDER BY HourOfDay");
report.exportToSheet(spreadsheet.getActiveSheet());
Logger.log("Report available at " + spreadsheet.getUrl());
}
exportReportToSpreadsheet();
Does anyone know what is wrong with ORDER BY in AdWordsApp.report?
https://developers.google.com/adwords/scripts/docs/reference/adwordsapp/adwordsapp_report
According to AWQL query language documentation it should work as expected.
https://developers.google.com/adwords/api/docs/guides/awql#using_awql_with_reports
BUG?
You cannot sort reports. From the AWQL documentation:
ORDER BY and LIMIT (sorting and paging) are NOT supported for reports.
Including these clauses in a query will generate an error.
Ordering is only possible when you use the different entities' selectors, e.g. to iterate over campaigns sorted by clicks you could do
var campaignIterator = AdWordsApp
.campaigns()
.forDateRange("LAST_MONTH")
.orderBy("Clicks DESC")
.get();

Getting PlaylistItems by Page Number in YouTube API?

Okay, let's say I have a YouTube playlist with 500 items in it. YouTube's PlaylistItems endpoint only allows you to retrieve 50 items at a time:
https://developers.google.com/youtube/v3/docs/playlistItems/list
After 50 items, it gives you a nextPageToken which you can use to specify in your query to get the next page. Doing this, you could iterate through the entire playlist to get all 500 items in 10 queries.
However, what if I only wanted to get the last page? Page 10?
In YouTube's V2 API, you could have told it to start the index at position 451, and then it would give you the results for 451-500. This doesn't seem to be an option in their V3 API. Now, it seems if I wanted to get just page 10, I would have to iterate through the entire playlist once again, throw out the first 9 pages, and then just take the 10th page.
This seems like a HUGE waste of resources and the cURL operations alone could be a killer.
So is it possible to set the starting index in the V3 API like in the V2 API?
You can still use a start index but you have to generate the corresponding page token yourself.
As far as we can tell from observation, page tokens are basically a byte sequence encoded in base64, with the first byte always being 8 and the last two being 16 and 0. We generate tokens somewhat like this (using Python 3):
import base64
i = 451  # start index
k = i // 128  # high part of the index
i -= 128 * (k - 1)  # low 7 bits, plus 128 as a continuation flag
b = [8, i]
if k > 1 or i > 127: b += [k]
b += [16, 0]
t = base64.b64encode(bytes(b)).decode('utf8').strip('=')
The last operation removes the trailing '=' characters that are used to pad incomplete blocks in base64. The result ('CMMDEAA') is your page token.
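The byte layout above looks like a protobuf-style varint (7 bits per byte, high bit set on every byte except the last). That is an assumption drawn from the observed tokens, not a documented format, and it may change without notice. Under that assumption, a reusable version could be sketched as below; the function name page_token is made up for illustration.

```python
import base64


def page_token(index):
    """Build a page token for a start index, following the observed layout.

    Assumption: the index is stored as a protobuf-style varint between a
    leading byte 8 and the trailing bytes 16, 0.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    varint = []
    while True:
        low7 = index & 0x7F
        index >>= 7
        if index:
            varint.append(low7 | 0x80)  # more bytes follow
        else:
            varint.append(low7)  # last byte, high bit clear
            break
    raw = bytes([8] + varint + [16, 0])
    return base64.b64encode(raw).decode("ascii").rstrip("=")


print(page_token(451))  # CMMDEAA
```

Unlike the snippet above, this version emits a single index byte for indices below 128, which is the shortest varint form; whether the API accepts both forms is untested here.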

getting random rows with yql?

I want to use javascript to fetch data with yql from flickr,
e.g.
select id from flickr.photos.search(10) where text = 'music' and license=4
however, I would like to fetch 10 random rows, rather than the latest, since the latest tend to be 10 photos all from the same person.
Is that possible in YQL itself (I suspect not),
or any workarounds that could bring the same effect?
(it does not have to be complete random, the main thing I want to avoid is to get 10 photos from the same poster)
To get only results from unique owners, you can use the unique() function.
My suggestion would be to query for a larger result set (more likely to have 10 unique people) then call unique() followed by truncate() to limit to 10 results, as below.
select id from flickr.photos.search(100) where text = 'music' and
license=4 | unique(field="owner") | truncate(count=10)
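To make the effect of the two piped functions concrete, here is a rough Python equivalent. It is a sketch that assumes unique() keeps the first row seen for each owner; the sample rows and the function name unique_then_truncate are made up for illustration.

```python
def unique_then_truncate(rows, field, count):
    """Keep the first row seen for each value of `field`, then cut to `count` rows."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept[:count]


# Made-up sample rows; real results would come from flickr.photos.search.
photos = [
    {"id": "1", "owner": "alice"},
    {"id": "2", "owner": "alice"},
    {"id": "3", "owner": "bob"},
    {"id": "4", "owner": "carol"},
]
print(unique_then_truncate(photos, "owner", 2))
```

Because deduplication runs before truncation, asking for a larger initial result set raises the chance that 10 distinct owners survive.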

YQL sample query returns max 260 results

I am following YQL sample query
select * from local.search(500) where query="sushi" and location="san francisco, ca"
but I get at most 260 results instead of 500. I also tried adding limit 500 after the where clause and using different keywords, but I always get a maximum of 260 results. How do I increase it?
The underlying API that the local.search table uses (Yahoo! Local Search Web Service) has restrictions on the number of results returned.
The results parameter (the number of results "per page") has a maximum value of 20.
The start parameter (the offset at which to start) has a maximum value of 250.
Since you ask for the first 500 results, YQL makes multiple queries against the Local Search API returning 20 results at a time. Therefore the start values are 1, 21, 41, ... 241. This brings back 260 results, as you have seen.
Since the YQL query asks for more results, the next start value is tried (261) which is beyond the allowed range so the underlying service returns an error (with the message "invalid value: start (261) must be between 1 and 250"). If you turn on "diagnostics" in the YQL console, you will see the "Bad Request" being returned.
Nothing you do to the query will bring back more results than the underlying service allows.
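The arithmetic behind the 260 figure can be checked with a few lines of Python. This is a sketch that only uses the two limits stated above and assumes every request returns a full page; the names PAGE_SIZE, MAX_START and reachable_results are made up for illustration.

```python
PAGE_SIZE = 20   # maximum "results" per request, as stated above
MAX_START = 250  # maximum allowed "start" offset, as stated above


def reachable_results(requested):
    """Return (start_values, total) for the requests that stay within the limits."""
    starts = []
    start = 1
    while start <= MAX_START and len(starts) * PAGE_SIZE < requested:
        starts.append(start)
        start += PAGE_SIZE
    total = min(requested, len(starts) * PAGE_SIZE)
    return starts, total


starts, total = reachable_results(500)
print(starts[-1], len(starts), total)  # 241 13 260
```

The last allowed start value is 241, so 13 requests of 20 rows each give 260 rows, and the next start value (261) falls outside the permitted range.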
I figured it out: I was missing the paging offset, so starting from 0 works:
select * from local.search(0,500) where query="sushi" and location="san francisco, ca"
