Youtube Analytics API - How to get all the video stats for a given channel and date? - youtube-api

We need to build our own reporting database for our YouTube channel to measure channel and video performance.
To support this, we implemented an ETL job that extracts data through the YouTube Analytics API, using the Python code below.
import requests

def GetAnalyticsData(extractDate, accessToken, channelId):
    # The ids filter must be URL-encoded: channel==<id>
    channelId = 'channel%3D%3D{0}'.format(channelId)
    headers = {'Authorization': 'Bearer {}'.format(accessToken),
               'accept': 'application/json'}
    url = 'https://youtubeanalytics.googleapis.com/v2/reports?dimensions={dimensions}&endDate={enddate}&ids={ids}&maxResults={maxresults}&metrics={metrics}&startDate={startdate}&alt={alt}&sort={sort}'.format(
        dimensions='video',
        ids=channelId,
        enddate=extractDate,
        startdate=extractDate,
        metrics='views%2Ccomments%2Clikes%2Cdislikes%2Cshares%2CestimatedMinutesWatched%2CsubscribersGained%2CsubscribersLost%2CannotationClicks%2CannotationClickThroughRate%2CaverageViewDuration%2CaverageViewPercentage%2CannotationCloseRate%2CannotationImpressions%2CannotationClickableImpressions%2CannotationClosableImpressions%2CannotationCloses',
        maxresults=200,
        alt='json',
        sort='-views'
    )
    return requests.get(url, headers=headers)
We call this API every day and get all the video metrics, sorted by views in descending order.
This only partially solves our need, because it returns only 200 videos; if we set maxResults higher than 200, it returns a 400 error code.
The challenge is: how do we get all videos for a given date and channel?
Thanks in advance.
Regards,
Guna

I'm not very familiar with the YouTube Analytics API, but it seems that you are looking for the startIndex parameter:
startIndex
integer
The 1-based index of the first entity to retrieve. (The default value is 1.) Use this parameter as a pagination mechanism along with the max-results parameter.
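As a rough illustration of how startIndex could drive pagination, the sketch below keeps requesting pages of 200 rows and advances startIndex until a short page comes back. The names fetch_all_rows and fetch_page are hypothetical: fetch_page(start_index, max_results) is assumed to issue the reports request from the question with an added startIndex query parameter and to return the rows list from the JSON response. Whether the API honours startIndex for the video dimension should be verified against a real channel.

```python
def fetch_all_rows(fetch_page, page_size=200):
    """Collect report rows page by page using a 1-based startIndex."""
    rows = []
    start_index = 1  # the API documents startIndex as 1-based
    while True:
        page = fetch_page(start_index, page_size) or []
        rows.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        start_index += page_size
    return rows


# Stand-in for the real HTTP call, so the loop can be tried offline.
def fake_fetch_page(start_index, max_results):
    data = [["video{}".format(i), i] for i in range(450)]
    return data[start_index - 1:start_index - 1 + max_results]


all_rows = fetch_all_rows(fake_fetch_page)
print(len(all_rows))
```

Passing the page fetcher in as a function keeps the loop independent of the HTTP details, so it can be exercised without an access token.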

Related

How to avoid omissions in video information acquisition when using the YouTube Data API?

Assumption / What I want to achieve
I want to use YouTube Data API V3 to get the video ID without any omissions, and find out if the cause of the trouble is in the code or in the video settings of YouTube (API side).
Problem
The following code is used to get the video information from YouTube Data API, but the number of IDs I got did not match the number of videos that are actually posted.
from apiclient.discovery import build

# `youtube` is assumed to be a client created with build("youtube", "v3", ...);
# its construction was not shown in the original post.
id = "UCD-miitqNY3nyukJ4Fnf4_A"  # sampleID
token_check = None
nextPageToken = None
id_info = []
while True:
    if token_check is not None:
        nextPageToken = token_check
    Search_Video = youtube.search().list(
        part = "id",
        channelId = id,
        maxResults = 50,
        order = 'date',
        safeSearch = "none",
        pageToken = nextPageToken
    ).execute()
    for ID_check in Search_Video.get("items", []):
        if ID_check["id"]["kind"] == "youtube#video":
            id_info.append(ID_check["id"]["videoId"])
    try:
        token_check = Search_Video["nextPageToken"]
    except KeyError:
        # No nextPageToken: this was the last page.
        print(len(id_info))  # check number of IDs
        break
I also used the YouTube Data API function to get the videoCount information of the channel, and noticed that the value of videoCount did not match the number of IDs obtained by the code above, which is why I posted this.
According to the channels() API, this channel has 440 videos, but the code above gets only 412 videos (at 10:30 a.m. JST).
Supplemental Information
・Python 3.9.0
・YouTube Data API v3
You have to acknowledge that the Search.list API endpoint does not have a crisp behavior. That means you should not expect precise results from it. Google does not document this behavior as such, but this forum has many posts from users experiencing that.
If you want to obtain all the IDs of videos uploaded by a given channel then you should employ the following two-step procedure:
Step 1: Obtain the ID of the Uploads Playlist of a Channel.
Invoke the Channels.list API endpoint with its request parameter id set to the ID of the channel of interest (or, alternatively, with its request parameter mine set to true) to obtain that channel's uploads playlist ID, contentDetails.relatedPlaylists.uploads.
def get_channel_uploads_playlist_id(youtube, channel_id):
    response = youtube.channels().list(
        fields = 'items/contentDetails/relatedPlaylists/uploads',
        part = 'contentDetails',
        id = channel_id,
        maxResults = 1
    ).execute()
    items = response.get('items')
    if items:
        return items[0] \
            ['contentDetails'] \
            ['relatedPlaylists'] \
            .get('uploads')
    else:
        return None
Do note that the function get_channel_uploads_playlist_id only needs to be called once to obtain the uploads playlist ID of a given channel; subsequently, use that ID as many times as needed.
Step 2: Retrieve All IDs of Videos of a Playlist.
Invoke the PlaylistItems.list API endpoint, queried with its request parameter playlistId set to the ID obtained from get_channel_uploads_playlist_id:
def get_playlist_video_ids(youtube, playlist_id):
    request = youtube.playlistItems().list(
        fields = 'nextPageToken,items/snippet/resourceId',
        playlistId = playlist_id,
        part = 'snippet',
        maxResults = 50
    )
    videos = []
    is_video = lambda item: \
        item['snippet']['resourceId']['kind'] == 'youtube#video'
    video_id = lambda item: \
        item['snippet']['resourceId']['videoId']
    while request:
        response = request.execute()
        items = response.get('items', [])
        assert len(items) <= 50
        videos.extend(map(video_id, filter(is_video, items)))
        request = youtube.playlistItems().list_next(
            request, response)
    return videos
Do note that, when using Google's APIs Client Library for Python (as you do), API result set pagination is trivially simple: just use the list_next method of the Python API object corresponding to the respective paginated API endpoint (as shown above):
request = API_OBJECT.list(...)
while request:
    response = request.execute()
    ...
    request = API_OBJECT.list_next(
        request, response)
Also note that I used the fields request parameter twice above. This is good practice: ask the API only for the info that is of actual use.
One important note: the PlaylistItems.list endpoint does not return items that correspond to private videos of a channel when invoked with an API key. This is the case when your youtube object was constructed by calling the function apiclient.discovery.build with the parameter developerKey.
PlaylistItems.list returns items corresponding to private videos only to the channel owner. This is the case when the youtube object is constructed by calling the function apiclient.discovery.build with the parameter credentials, and credentials refer to the channel that owns the respective playlist.
An additional important note: according to Google staff, there is, by design, an upper limit of 20,000 on the number of items returned via the PlaylistItems.list endpoint when queried for a given channel's uploads playlist. This is unfortunate, but a fact.

How to find if a youtube channel is currently live streaming without using search?

I'm working on a website that loads the live streams of multiple YouTube channels. At first I was trying to figure out a way to do this without using YouTube's API, but I have decided to give in.
To find whether a channel is live streaming and to get the live stream links I've been using:
https://www.googleapis.com/youtube/v3/search?part=snippet&channelId={CHANNEL_ID}&eventType=live&maxResults=10&type=video&key={API_KEY}
However, with the minimum quota being 10,000 units and each search costing 100, I'm only able to do about 100 searches before I exceed my quota limit, which doesn't help at all. I ended up exceeding the quota limit in about 10 minutes. :(
Does anyone know of a better way to figure out if a channel is currently live streaming and what the live stream links are, using as minimal quota points as possible?
I want to reload youtube data for each user every 3 minutes, save it into a database, and display the information using my own api to save server resources as well as quota points.
Hopefully someone has a good solution to this problem!
If nothing can be done about links just determining if the user is live without using 100 quota points each time would be a big help.
Since the question only specified that Search API quotas should not be used in finding out if the channel is streaming, I thought I would share a sort of work-around method. It might require a bit more work than a simple API call, but it reduces API quota use to practically nothing:
I used a simple Perl GET request to retrieve a Youtube channel's main page. Several unique elements are found in the HTML of a channel page that is streaming live:
The number of live viewers tag, e.g. <li>753 watching</li>.
The LIVE NOW badge tag: <span class="yt-badge yt-badge-live" >Live now</span>.
To ascertain whether a channel is currently streaming live requires a simple match to see if the unique HTML tag is contained in the GET request results. Something like: if ($get_results =~ /$unique_html/) (Perl). Then, an API call can be made only to a channel ID that is actually streaming, in order to obtain the video ID of the stream.
The advantage of this is that you already know the channel is streaming, instead of using thousands of quota points to find out. My test script successfully identifies whether a channel is streaming, by looking in the HTML code for: <span class="yt-badge yt-badge-live" > (note the weird extra spaces in the code from Youtube).
I don't know what language OP is using, or I would help with a basic GET request in that language. I used Perl, and included browser headers, User Agent and cookies, to look like a normal computer visit.
Youtube's robots.txt doesn't seem to forbid crawling a channel's main page, only the community page of a channel.
Let me know what you think about the pros and cons of this method, and please comment with what might be improved rather than disliking if you find a flaw. Thanks, happy coding!
2020 UPDATE
The yt-badge-live badge seems to have been deprecated; it no longer reliably shows whether the channel is streaming. Instead, I now check the HTML for this string:
{"text":" watching"}
If I get a match, it means the page is streaming. (Non-streaming channels don't contain this string.) Again, note the weird extra whitespace. I also escape all the quotation marks since I'm using Perl.
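For readers not using Perl, the same check could be sketched in Python along these lines. This is only an illustration of the approach described above: the function names are hypothetical, the marker is copied as it appears in this answer (its exact whitespace may need adjusting), the User-Agent value is arbitrary, and YouTube may change its page markup at any time.

```python
import urllib.request

LIVE_MARKER = '{"text":" watching"}'  # string quoted in the answer above


def page_looks_live(html, marker=LIVE_MARKER):
    """Return True if the channel page HTML contains the live marker."""
    return marker in html


def fetch_channel_page(channel_id, timeout=10):
    """Download the public channel page (no API quota involved)."""
    url = "https://www.youtube.com/channel/" + channel_id
    # A browser-like User-Agent, as the answer suggests; the value is arbitrary.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Offline check with hand-written snippets instead of a real page.
print(page_looks_live('... {"text":" watching"} ...'))
print(page_looks_live('<html>no stream here</html>'))
```

In real use, the result of fetch_channel_page(channel_id) would be passed to page_looks_live, and an API call would be made only for channels where it returns True.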
Here are my two suggestions:
Check my answer where I explain how you can retrieve videos from channels that are livestreaming.
Another option could be to use the following URL and make a request each time you want to check whether there is a livestream.
https://www.youtube.com/channel/<CHANNEL_ID>/live
Where CHANNEL_ID is the ID of the channel you want to check for a livestream [1].
[1] Note that the URL may not work for all channels (that depends on the channel itself).
For example, if you check the channel_id UC7_YxT-KID8kRbqZo7MyscQ - link to this channel livestreaming - https://www.youtube.com/channel/UC4nprx9Vd84-ly7N-1Ce6Og/live, this channel will show if he is livestreaming, but, with his channel id UC4nprx9Vd84-ly7N-1Ce6Og - link to this channel livestreaming -, it will show his main page instead.
Adding to the answer by Bman70, I tried eliminating the need to make a costly search request once you know the channel is streaming live. I did this using two indicators in the HTML response from the channel pages of channels that are streaming live.
function findLiveStreamVideoId(channelId, cb){
  $.ajax({
    url: 'https://www.youtube.com/channel/'+channelId,
    type: "GET",
    headers: {
      'Access-Control-Allow-Origin': '*',
      'Accept-Language': 'en-US, en;q=0.5'
    }}).done(function(resp) {
      //one method to find live video
      let n = resp.search(/\{"videoId[\sA-Za-z0-9:"\{\}\]\[,\-_]+BADGE_STYLE_TYPE_LIVE_NOW/i);
      //If found
      if(n>=0){
        let videoId = resp.slice(n+1, resp.indexOf("}",n)-1).split("\":\"")[1]
        return cb(videoId);
      }
      //If not found, then try another method to find live video
      n = resp.search(/https:\/\/i.ytimg.com\/vi\/[A-Za-z0-9\-_]+\/hqdefault_live.jpg/i);
      if (n >= 0){
        let videoId = resp.slice(n,resp.indexOf(".jpg",n)-1).split("/")[4]
        return cb(videoId);
      }
      //No streams found
      return cb(null, "No live streams found");
    }).fail(function() {
      return cb(null, "CORS Request blocked");
    });
}
However, there's a tradeoff. This method confuses a recently ended stream with currently live streams. A workaround for this issue is to get status of the videoId returned from Youtube API (costs a single unit from your quota).
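The status check mentioned above could look roughly like this. It is a sketch that assumes the check is done with videos.list (part=snippet) and the snippet.liveBroadcastContent field, which the answer does not spell out; is_currently_live is a hypothetical helper name. The function works on the decoded JSON response, so it does not depend on how the request is sent.

```python
def is_currently_live(videos_response):
    """True if the first item of a videos.list response is live right now."""
    items = videos_response.get("items", [])
    if not items:
        return False  # unknown, deleted, or private video
    snippet = items[0].get("snippet", {})
    # liveBroadcastContent is "live", "upcoming", or "none"
    return snippet.get("liveBroadcastContent") == "live"


# Hand-written responses shaped like videos.list output (part=snippet).
live = {"items": [{"id": "abc", "snippet": {"liveBroadcastContent": "live"}}]}
ended = {"items": [{"id": "abc", "snippet": {"liveBroadcastContent": "none"}}]}
print(is_currently_live(live), is_currently_live(ended))
```

A recently ended stream would then be filtered out because its liveBroadcastContent is no longer "live".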
I found youtube API to be very restrictive given the cost of search operation. Apparently the accepted answer did not work for me as I found the string on non live streams as well. Web scraping with aiohttp and beautifulsoup was not an option since the better indicators required javascript support. Hence I turned to selenium. I looked for the css selector
#info-text
and then searched its text for the string Started streaming or watching now.
To reduce load on my tiny server that would have otherwise required lot more resources, I moved this test of functionality to a heroku dyno with a small flask app.
# import flask dependencies
import os
from flask import Flask, request, make_response, jsonify
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

base = "https://www.youtube.com/watch?v={0}"
delay = 3

# initialize the flask app
app = Flask(__name__)

# default route
@app.route("/")
def index():
    return "Hello World!"

# create a route for webhook
@app.route("/islive", methods=["GET", "POST"])
def is_live():
    chrome_options = Options()
    chrome_options.binary_location = os.environ.get('GOOGLE_CHROME_BIN')
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--remote-debugging-port=9222')
    driver = webdriver.Chrome(executable_path=os.environ.get('CHROMEDRIVER_PATH'), chrome_options=chrome_options)
    url = request.args.get("url")
    # accept either a full watch URL or a bare video ID
    if "youtube.com" in url:
        video_id = url.split("?v=")[-1]
    else:
        video_id = url
    # build the watch URL from the video ID (not from the raw input)
    url = base.format(video_id)
    print(url)
    response = { "url": url, "is_live": False, "ok": False, "video_id": video_id }
    driver.get(url)
    try:
        element = WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#info-text")))
        result = element.text.lower().find("Started streaming".lower())
        if result != -1:
            response["is_live"] = True
        else:
            result = element.text.lower().find("watching now".lower())
            if result != -1:
                response["is_live"] = True
        response["ok"] = True
        return jsonify(response)
    except Exception as e:
        print(e)
        return jsonify(response)
    finally:
        driver.close()

# run the app
if __name__ == "__main__":
    app.run()
You'll however need to add the following buildpacks in settings
https://github.com/heroku/heroku-buildpack-google-chrome
https://github.com/heroku/heroku-buildpack-chromedriver
https://github.com/heroku/heroku-buildpack-python
Set the following Config Vars in settings
CHROMEDRIVER_PATH=/app/.chromedriver/bin/chromedriver
GOOGLE_CHROME_BIN=/app/.apt/usr/bin/google-chrome
You can find the supported Python runtimes in Heroku's documentation, but anything below Python 3.9 should be fine, since Selenium had problems with improper use of the is operator.
I hope youtube will provide better alternatives than workarounds.
I know this is an old thread, but I thought I'd share my way of checking, for example to grab the status code for use in an app.
This is for a single channel, but you could easily wrap it in a foreach.
<?php
#####
$ytchannelID = "UCd0BTXriKLvOs1ANx3puZ3Q";
#####
$ytliveurl = "https://www.youtube.com/channel/".$ytchannelID."/live";
$ytchannelLIVE = '{"text":" watching now"}';
$contents = file_get_contents($ytliveurl);
if ( strpos($contents, $ytchannelLIVE) !== false ){http_response_code(200);} else {http_response_code(201);}
unset($ytliveurl);
?>
Adding onto the other answers here, I use a GET request to https://www.youtube.com/c/<CHANNEL_NAME>/live and then search for "isLive":true (rather than {"text":" watching"})

Youtube Data API searching for all videos on a channel returns double results

I'm writing a PHP script that retrieves all videos of a channel using the Google APIs Client Library for PHP and youtube.search.list with these params:
part = snippet
maxResults = 50
channelId = ...
type = video
order = rating
pageToken = [nextPageToken]
Scrolling through all the paginated results, I see that the search sometimes returns videos that were already present in previous pages.
I have tested thoroughly, and the calls for the next pages are correct, with the right nextPageToken.
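Until the cause is known, one pragmatic way to cope with repeated items is to de-duplicate by video ID while walking the pages. The sketch below is in Python rather than PHP and only illustrates the idea; unique_video_ids is a hypothetical helper, and the pages are assumed to be decoded search.list responses. It does not explain or prevent the duplicates, and it cannot recover videos the search omits.

```python
def unique_video_ids(pages):
    """Yield each videoId once, in first-seen order, across result pages."""
    seen = set()
    for page in pages:
        for item in page.get("items", []):
            video_id = item.get("id", {}).get("videoId")
            if video_id and video_id not in seen:
                seen.add(video_id)
                yield video_id


# Two hand-written pages where "b" shows up again on the second page.
page1 = {"items": [{"id": {"videoId": "a"}}, {"id": {"videoId": "b"}}]}
page2 = {"items": [{"id": {"videoId": "b"}}, {"id": {"videoId": "c"}}]}
ids = list(unique_video_ids([page1, page2]))
print(ids)
```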

Retrieve number of items in playlist with YouTube API v3

I'm transferring my iOS code from YouTube Data API v2 to v3.
I would like to retrieve the number of items in a playlist with YouTube API v3.
With the data API v2, I could call a search query, resulting in a list of playlist items with the property "size", which represented the size of the playlist, see:
https://gdata.youtube.com/feeds/api/playlists/snippets?q=GoogleDevelopers&max-results=10&v=2&alt=jsonc
With API v3, I haven't found a way to achieve this. I tried queries, like below, not resulting in the desired output.
https://www.googleapis.com/youtube/v3/playlists?part=snippet&id=RD0KSOMA3QBU0&key={YOUR_API_KEY}
The only way to get the number of items is running a query for each playlist, retrieving the full playlist items. This is an overkill, however, and results in a too high API usage.
https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&playlistId=RD0KSOMA3QBU0&key={YOUR_API_KEY}
Any help appreciated!
Thanks
Thanks for that. It helped me find a solution, which is now to call
https://www.googleapis.com/youtube/v3/playlists?part=contentDetails&id={Comma Separated ID's}&key={YOUR_API_KEY}
and use the option to pass a comma-separated list of the playlists to evaluate. In the returned string, I grab the "itemCount" parameter for each playlist.
Thanks
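A small Python sketch of this approach, for illustration only: playlists_url and item_counts are hypothetical helper names, the playlist IDs and API key are placeholders, and the sample response is hand-written in the shape the answer describes (one contentDetails.itemCount per playlist).

```python
from urllib.parse import urlencode


def playlists_url(playlist_ids, api_key):
    """Build a playlists.list URL for several playlist IDs at once."""
    query = urlencode({
        "part": "contentDetails",
        "id": ",".join(playlist_ids),
        "key": api_key,
    })
    return "https://www.googleapis.com/youtube/v3/playlists?" + query


def item_counts(playlists_response):
    """Map each playlist ID to its contentDetails.itemCount."""
    return {
        item["id"]: item["contentDetails"]["itemCount"]
        for item in playlists_response.get("items", [])
    }


# Hand-written response in the shape of playlists.list output.
sample = {"items": [
    {"id": "PL_one", "contentDetails": {"itemCount": 91}},
    {"id": "PL_two", "contentDetails": {"itemCount": 7}},
]}
counts = item_counts(sample)
print(playlists_url(["PL_one", "PL_two"], "YOUR_API_KEY"))
print(counts)
```

Batching the IDs into one request is what keeps the quota cost low compared with listing the items of every playlist.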
Unfortunately, there is no way to get number of items in a playlist without requesting contentDetails in the parts (which isn't supported when listing playlists).
I have used the following V3 API call to get the playlist videos, and it works for me.
https://www.googleapis.com/youtube/v3/playlistItems?part=contentDetails&maxResults=50&playlistId=PLFPg_IUxqnZM3uua-YwStHJ1qmlQKBnh0&key={YOUR_API_KEY_HERE}
There are no details available for the "playlistId" parameter in the YouTube documentation, but it works for getting playlist videos.
This JavaScript retrieves the number 91, which is the total number of clips (total = data.pageInfo.totalResults) in the YouTube playlist PLE9F62B21C75C054C.
<script>
function numberOfClips(pid){
  $.get(
    "https://www.googleapis.com/youtube/v3/playlistItems",{
      part : 'snippet',
      maxResults : 2,
      playlistId : pid,
      key: 'YOUR API v.3 KEY',
      fields: 'pageInfo(totalResults)'
    },
    function(data) {
      total = data.pageInfo.totalResults;
      $('#total').html('Total number of clips: ' + total);
      st = JSON.stringify(data,'',2);
      $('#area1').val(st);
    });
}
</script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js"></script>
<body onload="numberOfClips('PLE9F62B21C75C054C')">
<p id="total"></p>
<textarea id='area1' style='width:400px;height:100px;'></textarea>
I would not recommend relying on either itemCount in the playlists query or pageInfo.totalResults in the playlistItems query, because both values may differ from the size of the retrieved list of playlist items.
see also: https://stackoverflow.com/questions/37899490/totalresults-in-playlistitems-do-not-match-items-size

YouTube Data API v3: invalid search filters and/or restrictions for certain Freebase topics

I am trying to do a YouTube topic search using the YouTube Data API v3. I've found that there are many seemingly popular Freebase topics for which YouTube returns no results even though the safeSearch parameter is set to "none". For instance, when I try to search YouTube for the topic "alcoholic beverage" (/m/012mj) the API throws an exception.
from apiclient.discovery import build
from apiclient.errors import HttpError

DEVELOPER_KEY = "blah blah"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                developerKey=DEVELOPER_KEY)
topic_id = "/m/012mj"
try:
    search_response = youtube.search().list(
        topicId=topic_id,
        type="video",
        part="id",
        safeSearch="none",
        q="",
    ).execute()
except HttpError as e:
    print(e)
<HttpError 400 when requesting https://www.googleapis.com/youtube/v3/search?topicId=%2Fm%2F012mj&q=&safeSearch=none&part=id&key=AIzaSyC7MDamoleicn233r8mTyK2sohcV4A3Aq8&alt=&type=video returned "Invalid combination of search filters and/or restrictions.">
Any suggestions?
UPDATE
This no longer returns a 400 error. The API now returns a search response with no items. At least this helps me differentiate between an error and a search response with no results. However, it still seems strange that YouTube won't return search results for this topic even with safeSearch set to "none".
Have you considered setting the max-results parameter? I have no experience with the YouTube flavor of the Freebase api, but have read that they recommend limiting the results for videos to 10 rather than accepting the default of 25. Otherwise it is odd that they would send a 400 (bad request) rather than a 401 or so.
