YouTube Data API v3 Comment Thread Discrepancy - youtube

I have been trying to get a list of comments out using the new V3 Data API with mixed results.
For some videos, you only get a subset of comments out. I have noticed this on a few videos, but for this specific case I will use video ID = U55NGD9Jm7M
You can find all the comments on this video in the WebUI here: https://www.youtube.com/all_comments?v=U55NGD9Jm7M
At the time of posting, there were 5,499 comments on this video.
API Results:
When querying https://www.googleapis.com/youtube/v3/commentThreads?part=id,snippet,replies&textFormat=plainText&maxResults=100&videoId=U55NGD9Jm7M&key={YOUR_API_KEY} I am only getting about 317 comments (including Paging, and counting all replies) (sorted chronologically).
Verification Research:
If you select "Top Comments" from the drop down and then scroll down and click "More" over and over again, you get over 1,000 comments (I stopped at about 1,000)
If you then select "Newest First" from the dropdown and repeat the process (more ... more ... more) you will find that there are about 317 comments before you are unable to show any more comments.
I find it quite odd that there is a discrepancy in the UI, but thankful that the API lines up with part of the UI. Has anyone else noticed this? Is there a way to get the full text of all 5,499 comments?
Thanks!
Jason
Follow-up 1
As a follow-up, I was able to isolate one comment using View->Source (Thread ID z12wzfzhtybgz13kj22ocvsz2unrtn1qj04) and fetch all the information for this comment from the API here: https://www.googleapis.com/youtube/v3/commentThreads?part=id%2Csnippet%2Creplies&id=z12wzfzhtybgz13kj22ocvsz2unrtn1qj04&maxResults=100&key={YOUR_API_KEY}
It even mentions the correct VideoID that the comment is associated with. However, when you query by Video, this comment ID is not returned.
Follow-up 2
I refreshed the All Comments page in the web UI, and a significantly different list of comments was returned.

The commentThreads.list call returns at most 100 results per request (see maxResults in the documentation). To get more comment threads, pass the nextPageToken from your initial response as the pageToken parameter of a subsequent API call.
For example:
https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=U55NGD9Jm7M&maxResults=100&key=API_KEY
gives you 100 comment threads, and the nextPageToken is Cg0Qk9fa7fHgxgIgACgBEhQIARCY49LZ5eDGAhi4rNGIrZrGAhgCIGM. If you include that token as pageToken in a new API call, like so:
https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=U55NGD9Jm7M&maxResults=100&pageToken=Cg0Qk9fa7fHgxgIgACgBEhQIARCY49LZ5eDGAhi4rNGIrZrGAhgCIGM&key=API_KEY
You get a completely different set of comment threads. You can double check this by specifying order=time in both API calls. You'll see that the earliest comment threads for both calls are different, and you won't find the comment Thread ID for either in the other call's results. To get even more comment threads, you take the nextPageToken from the newer call's results and do the same thing again (until the call doesn't give you another nextPageToken, meaning you're on the last page, and there are no more comment threads to return).
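The paging loop described above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function names `iter_comment_threads` and `http_get_json` are made up for illustration, and the `fetch` argument is an assumption added so the loop can be exercised without network access. The request parameters (`part`, `videoId`, `maxResults`, `order`, `pageToken`, `key`) and the `nextPageToken` response field are the ones the API uses.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def http_get_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_comment_threads(video_id, api_key, fetch=http_get_json):
    """Yield every comment thread the API returns for a video, page by page.

    `fetch` is a hypothetical hook (url -> decoded JSON) used for testing.
    """
    page_token = None
    while True:
        params = {
            "part": "snippet",
            "videoId": video_id,
            "maxResults": 100,
            "order": "time",
            "key": api_key,
        }
        if page_token:
            # The token from the previous response goes in as pageToken.
            params["pageToken"] = page_token
        page = fetch(API_URL + "?" + urllib.parse.urlencode(params))
        yield from page.get("items", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break  # no nextPageToken means this was the last page
```

Calling `list(iter_comment_threads("U55NGD9Jm7M", API_KEY))` would collect the threads from all pages, where `API_KEY` stands for your own key.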

Related

Twilio conversation JS SDK - a proper way of fetching user conversations

I'm trying to display all user's conversations sorted by last message creation date and I'm a little bit confused.
I see the getSubscribedConversations method in the docs (https://media.twiliocdn.com/sdk/js/conversations/releases/1.1.0/docs/Client.html#getSubscribedConversations__anchor) but it says nothing about page size or sorting. It returns a paginator, so I assume it doesn't return all conversations at once.
On the other hand, I see some examples in Twilio's GitHub projects where conversations are added to the list only by listening for the conversationAdded event (which indeed fires even for previously created conversations), but that doesn't seem like a clean solution: if a user belongs to 50 conversations, should I handle every single event and rerender the list 50 times?
To sum up, I have the following questions:
Does getSubscribedConversations return all of a user's conversations at once?
If not, what is the default page size, and is it possible to change it (together with the sorting)?
If getSubscribedConversations does return a paginator, wouldn't it break if I add a conversation from the conversationAdded event in the meantime?
I can't answer all your questions but I can give some insight on a couple -
From what I can tell, getSubscribedConversations returns 50 Conversations. I have not found a way to change that limit or sort it (I'm not entirely sure in what order Twilio returns them even).
For a project I'm working on we need Conversations sorted in order of recent message. The way I'm currently dealing with it is by storing the most recent message on an attribute on the Conversation. I also initialize the app by loading all the conversations with a recursive function.
Hope that sheds some light for you.

YouTube Data API v3 - Comment threads request doesn't return all comments

For certain videos, the call to commentThreads does not return the complete set of top-level comments. For example, the following call returns only 16 top-level comments when in fact the video has 3458 total comments, and there are clearly more than 16 top-level comments.
https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=h_9-3Fj3ZdI&key=[KEY]&maxResults=50
I have run into this issue on a couple of videos, and in both cases the result seems to break on a comment that is in fact missing from the web UI. I have looked around and tried multiple approaches (e.g. trying to skip it using start-index), but I haven't found a solution. You can verify the missing comment in the above result: 'z131gd2gwqnby3lea23byp3yyt3pshcqb04'
I don't know if this is still important, but I can get all the comments. You need to use the nextPageToken to get all comment threads (that is, more than maxResults), and the comments request to get the replies.
You can verify this in Python as well. See the following code:
import requests

api_key = '...'
# First page only; pass the response's nextPageToken back as pageToken for later pages.
url = 'https://youtube.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=h_9-3Fj3ZdI&maxResults=100&key=' + api_key
response = requests.get(url=url).json()
print(response)
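The second half of that answer, fetching replies through the comments request, could look roughly like this. It is a hedged sketch: `fetch_replies` and `http_get_json` are illustrative names, and the injectable `fetch` argument is an assumption added for testing. It relies on the `comments` endpoint accepting `parentId`, `maxResults` and `pageToken`.

```python
import json
import urllib.parse
import urllib.request

COMMENTS_URL = "https://www.googleapis.com/youtube/v3/comments"


def http_get_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def fetch_replies(parent_id, api_key, fetch=http_get_json):
    """Return all replies to one top-level comment, following page tokens.

    `fetch` is a hypothetical hook (url -> decoded JSON) used for testing.
    """
    replies, page_token = [], None
    while True:
        params = {
            "part": "snippet",
            "parentId": parent_id,  # ID of the top-level comment
            "maxResults": 100,
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        page = fetch(COMMENTS_URL + "?" + urllib.parse.urlencode(params))
        replies.extend(page.get("items", []))
        page_token = page.get("nextPageToken")
        if not page_token:
            return replies
```

The `parent_id` here is the `id` of a thread's top-level comment, as returned by commentThreads.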

YouTube Content ID API : AssetSearch Issue

I am working with the YouTube Content ID API to fetch assets that were added today.
First I explored the API in the YouTube Content ID API explorer and found that asset search suited my criteria. I provided the required parameters and got a response, but it included only 25 results each time, so I used the NextPageToken received from the response to get the next assets. So far so good, but I noticed that ResultsPerPage varies from request to request, which confused me. I had assumed that ResultsPerPage indicates the total number of assets for the particular content owner and planned my code around that, but now I'm unable to decide how to proceed.
Can anyone help me understand this?
Both the totalResults and ResultsPerPage are not reliable, according to employees at YouTube. You can only rely on the data that comes through. You can verify this with your TAM/Partner Manager.
In order to get the real count of your assets (I'm assuming you're using AssetSearch?), you have to keep paginating until there's no "nextPageToken" in the response, and count your results.
By the way, if you set the parameter "maxResults=50" in your request, you'll get 50 per page (until there's less than 50 left to display, which should only happen on the last page, given that you have a number of assets not divisible by 50).
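Counting by pagination, as suggested above, can be sketched like this. The sketch deliberately ignores `pageInfo` and only counts what actually arrives. `count_results` and `fetch_page` are hypothetical names: `fetch_page(token)` is assumed to perform one authorized AssetSearch request (with `token` as the page token, or none for the first page) and return the decoded response, with results under `items`.

```python
def count_results(fetch_page):
    """Count items across all pages, ignoring totalResults/resultsPerPage.

    fetch_page(token) is a hypothetical callable returning one decoded page.
    """
    total, token = 0, None
    while True:
        page = fetch_page(token)
        total += len(page.get("items", []))
        token = page.get("nextPageToken")
        if not token:
            return total  # last page: no nextPageToken in the response
```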

Empty response when startindex >= 100

After a lot of debugging, it finally occurred to me that YouTube seemingly only issues the first 100 comments when the v2 YouTube API is used to get comments. I finally tried using:
curl -Lk -X GET "http://gdata.youtube.com/feeds/api/videos/MShbP3OpASA/comments?alt=json&start-index=100&max-results=50"
And all I get is a response without an entry parameter. That is to say, I do not receive an error response or something like that - I get a perfectly good response, but without the entry parameter.
Digging a little deeper, in my response the value for openSearch$totalResults is 100, so in accordance to this resource this seems to be the expected result (although it tells about some kind of error message which I don't get?).
But here comes the kicker: When I use
curl -Lk -X GET "http://gdata.youtube.com/feeds/api/videos/MShbP3OpASA/comments?alt=json&start-index=1&max-results=50&orderby=published"
openSearch$totalResults equals 3141, the actual count of the comments.
Now here is my question: since the v2 API was officially deprecated about a week ago, is it possible that Google just set a limit on the comments, so that only the first 100 comments are accessible? Since the v3 API does not allow for comment retrieval, that would be a real bummer for me.
Does anyone have any ideas?
I've figured out how to retrieve all the comments using the navigation links embedded in the json response.
Suppose you retrieve the first page using a link like this (Python here, but you get the point):
r'https://gdata.youtube.com/feeds/api/videos/' + aVideoID + r'/comments?alt=json&start-index=1&max-results=50&prettyprint=true&orderby=published'
Embedded in the json under "feed" (and before the comments) will be a four element array called "link". The fourth element will be called "rel": "next" and under "href" there will be a link you can use to get the next 50 comments. The link will look something like:
https://gdata.youtube.com/feeds/api/videos/fH0cEP0mvlU/comments?alt=json&orderby=published&alt=json&start-token=EgkI2NqyoZDRvgIosK%2FPosPRvgIw653cmsXRvgI4AUAC&max-results=50&orderby=published
for an original URL of:
https://gdata.youtube.com/feeds/api/videos/fH0cEP0mvlU/comments?alt=json&start-index=1&max-results=50&prettyprint=true&orderby=published
If you follow the next link it will return similar json to the original link, with another 50 comments. Continue this process over and over until you get all the comments (in my code I check for both the absence of this item in the json or zero comments in the json to determine when to stop).
You need the "&orderby=published" in the original URL because otherwise the "next" links eventually grow to be too large and cause an error (something in the token the API uses to track which comments you've seen in the default orderby takes a lot of space). Something about the published orderby keeps the "start-token" small, whereas after about 500 comments with the default orderby you will start getting 414 Request URI too long errors.
Hope this helps.
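A minimal sketch of that link-following loop, assuming the JSON layout described above (a "link" array under "feed" whose items carry "rel" and "href", and comments under "entry"). `next_link` and `iter_comments` are illustrative names, and `fetch` is a hypothetical callable that takes a URL and returns the decoded JSON. Rather than relying on the "next" item being the fourth element, the sketch searches the array by "rel".

```python
def next_link(feed_json):
    """Return the href of the rel="next" link, or None if there is none."""
    for link in feed_json.get("feed", {}).get("link", []):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def iter_comments(first_url, fetch):
    """Yield comment entries, following "next" links until they run out.

    fetch(url) is a hypothetical callable returning the decoded JSON feed.
    """
    url = first_url
    while url:
        data = fetch(url)
        entries = data.get("feed", {}).get("entry", [])
        if not entries:
            break  # zero comments in the response: stop
        yield from entries
        url = next_link(data)  # None when there is no "next" link: stop
```

Both stop conditions from the answer are covered: a missing "next" link and a page with no comments.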

YouTube API "published" filter doesn't seem to work

I'm trying to use the YouTube API to return videos that were recently published, but the filter I'm using doesn't seem to work as expected.
This API call only returns two videos whereas there should be tons more that were published after March 1st:
https://gdata.youtube.com/feeds/api/videos?q=&fields=entry[xs:dateTime(published)%20%3E%20xs:dateTime('2013-03-01T12:00:00.000Z')]
However, if I add a query string, then many more results are returned. For example:
https://gdata.youtube.com/feeds/api/videos?q=surfing&fields=entry[xs:dateTime(published)%20%3E%20xs:dateTime('2013-03-01T12:00:00.000Z')]
Anyone know why? Is there another approach I should be using to just get me the latest videos published regardless of query string?
I understand your confusion, but that's not what the fields= parameter is used for. The documentation should hopefully clear things up, but to summarize, using fields= in that manner is equivalent to making a request without the fields= parameter and then filtering the results of that request so that it only includes the entries that match your filter.
So if your request without fields= would normally return 25 specific videos, adding fields= to it will give you a response that includes somewhere between 0 and 25 videos—all the non-matching videos are filtered out.
You can request a feed of recently published videos without any other filters using http://gdata.youtube.com/feeds/api/videos?v=2&orderby=published
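To illustrate why a fields= filter can only shrink a page, here is a small Python sketch of the behaviour described above. It is a local simulation with made-up data, not a call to the API: `apply_fields_filter` is a hypothetical stand-in for the filtering the server applies after it has already picked the page of results.

```python
from datetime import datetime, timezone


def apply_fields_filter(entries, published_after):
    """Keep only entries published after the cutoff, as the filter would."""
    return [e for e in entries if e["published"] > published_after]


# Hypothetical page of 25 results, as the unfiltered request would return it.
page = [
    {"id": f"video{i}",
     "published": datetime(2013, 1 + i % 4, 15, tzinfo=timezone.utc)}
    for i in range(25)
]
cutoff = datetime(2013, 3, 1, 12, 0, tzinfo=timezone.utc)
matching = apply_fields_filter(page, cutoff)

# The filtered page can never hold more entries than the original page.
print(len(page), len(matching))
```

The filter never reaches beyond the 25 entries the request selected, which is why an empty query with fields= can come back with only a couple of videos.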

Resources