YouTube Data API: Get Latest Video ID From Channel Excluding Live Streams
Solution 1:
You'll have to iterate your call to the PlaylistItems.list API endpoint (using pagination) and manually filter out the videos that are live streams.
def get_non_livestream_videos(youtube, video_ids):
    assert len(video_ids) <= 50
    if not video_ids: return []

    response = youtube.videos().list(
        fields = 'items(id,liveStreamingDetails)',
        part = 'id,liveStreamingDetails',
        maxResults = len(video_ids),
        id = ','.join(video_ids),
    ).execute()

    items = response.get('items', [])
    assert len(items) <= len(video_ids)

    not_live = lambda video: \
        not video.get('liveStreamingDetails')
    video_id = lambda video: video['id']

    return map(video_id, filter(not_live, items))

def get_latest_video_id(youtube, playlistId):
    request = youtube.playlistItems().list(
        fields = 'nextPageToken,items/snippet/resourceId',
        playlistId = playlistId,
        maxResults = 50,
        part = 'snippet'
    )

    is_video = lambda item: \
        item['snippet']['resourceId']['kind'] == 'youtube#video'
    video_id = lambda item: \
        item['snippet']['resourceId']['videoId']

    while request:
        response = request.execute()

        items = response.get('items', [])
        assert len(items) <= 50

        videos = map(video_id, filter(is_video, items))
        if videos:
            videos = get_non_livestream_videos(youtube, videos)
            if videos: return videos[0]

        request = youtube.playlistItems().list_next(
            request, response)

    return None
Note that above I used the fields request parameter to get from the API only the info that's actually needed.
Also note that you may have to elaborate a bit on the function get_non_livestream_videos, since the Videos.list API endpoint, queried with its id parameter as a comma-separated list of video IDs, may well alter the order of the items it returns w.r.t. the given order of the IDs in video_ids.
Another important note: if you're running the code above under Python 3 (your question does not mention this), then make sure you have the following configuration code inserted at the top of your script:
import sys

if sys.version_info[0] >= 3:
    from builtins import map as builtin_map
    map = lambda *args: list(builtin_map(*args))
This is needed since, under Python 3, the builtin function map returns an iterator, whereas under Python 2, map returns a list.
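To see why this matters for the code above, here is a minimal sketch (the sample IDs are made-up placeholders). Under Python 3 a bare map object is truthy even when it yields nothing, and it cannot be indexed, so checks such as `if videos:` and `videos[0]` only behave as intended once the result has been turned into a list:

```python
# Illustrative only: run under Python 3; the IDs are made-up placeholders.
ids = []

lazy = map(str.upper, ids)         # Python 3: a lazy iterator
eager = list(map(str.upper, ids))  # what the list-returning shim produces

# An iterator object is truthy even when it would yield no items...
lazy_is_truthy = bool(lazy)
# ...whereas an empty list is falsy, which is what `if videos:` relies on.
eager_is_truthy = bool(eager)

# Indexing (as in `videos[0]`) also requires a list, not an iterator.
first = list(map(str.upper, ['abc', 'def']))[0]
```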
Here is the code that solves the issue I mentioned above w.r.t. Videos.list altering the order of the items returned relative to the order of the IDs given by the argument video_ids of the function get_non_livestream_videos:
import sys

if sys.version_info[0] >= 3:
    from builtins import map as builtin_map
    map = lambda *args: list(builtin_map(*args))

class MergeVideoListsError(Exception): pass

def merge_video_lists(video_ids, video_res):
    pair0 = lambda pair: pair[0]
    pair1 = lambda pair: pair[1]

    video_ids = sorted(
        enumerate(video_ids), key = pair1)
    video_res.sort(
        key = lambda video: video['id'])

    def error(video_id):
        raise MergeVideoListsError(
            "unexpected video resource of ID '%s'" % video_id)

    def do_merge():
        N = len(video_ids)
        R = len(video_res)
        assert R <= N

        l = []
        i, j = 0, 0
        while i < N and j < R:
            v = video_ids[i]
            r = video_res[j]
            s = v[1]
            d = r['id']
            if s == d:
                l.append((v[0], r))
                i += 1
                j += 1
            elif s < d:
                i += 1
            else:
                error(d)

        if j < R:
            error(video_res[j]['id'])

        return l

    video_res = do_merge()
    video_res.sort(key = pair0)
    return map(pair1, video_res)

def println(*args):
    for a in args:
        sys.stdout.write(str(a))
    sys.stdout.write('\n')

def test_merge_video_lists(ids, res, val):
    try:
        println("ids: ", ids)
        println("res: ", res)
        r = merge_video_lists(ids, res)
        println("merge: ", r)
    except MergeVideoListsError as e:
        println("error: ", e)
        r = str(e)
    finally:
        println("test: ", "OK" \
            if val == r \
            else "failed")

TESTS = ((
    ['c', 'b', 'a'],
    [{'id': 'c'}, {'id': 'a'}, {'id': 'b'}],
    [{'id': 'c'}, {'id': 'b'}, {'id': 'a'}]
),(
    ['c', 'b', 'a'],
    [{'id': 'b'}, {'id': 'c'}],
    [{'id': 'c'}, {'id': 'b'}]
),(
    ['c', 'b', 'a'],
    [{'id': 'a'}, {'id': 'c'}],
    [{'id': 'c'}, {'id': 'a'}]
),(
    ['c', 'b', 'a'],
    [{'id': 'a'}, {'id': 'b'}],
    [{'id': 'b'}, {'id': 'a'}]
),(
    ['c', 'b', 'a'],
    [{'id': 'z'}, {'id': 'b'}, {'id': 'c'}],
    "unexpected video resource of ID 'z'"
),(
    ['c', 'b', 'a'],
    [{'id': 'a'}, {'id': 'z'}, {'id': 'c'}],
    "unexpected video resource of ID 'z'"
),(
    ['c', 'b', 'a'],
    [{'id': 'a'}, {'id': 'b'}, {'id': 'z'}],
    "unexpected video resource of ID 'z'"
))

def main():
    for i, t in enumerate(TESTS):
        if i: println()
        test_merge_video_lists(*t)

if __name__ == '__main__':
    main()

# $ python merge-video-lists.py
# ids: ['c', 'b', 'a']
# res: [{'id': 'c'}, {'id': 'a'}, {'id': 'b'}]
# merge: [{'id': 'c'}, {'id': 'b'}, {'id': 'a'}]
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'b'}, {'id': 'c'}]
# merge: [{'id': 'c'}, {'id': 'b'}]
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'a'}, {'id': 'c'}]
# merge: [{'id': 'c'}, {'id': 'a'}]
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'a'}, {'id': 'b'}]
# merge: [{'id': 'b'}, {'id': 'a'}]
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'z'}, {'id': 'b'}, {'id': 'c'}]
# error: unexpected video resource of ID 'z'
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'a'}, {'id': 'z'}, {'id': 'c'}]
# error: unexpected video resource of ID 'z'
# test: OK
#
# ids: ['c', 'b', 'a']
# res: [{'id': 'a'}, {'id': 'b'}, {'id': 'z'}]
# error: unexpected video resource of ID 'z'
# test: OK
The code above is a standalone program (running under both Python v2 and v3) that implements a merging function, merge_video_lists.
You'll have to use this function within the function get_non_livestream_videos by replacing the line:
    return map(video_id, filter(not_live, items))
with:
    return map(video_id, merge_video_lists(
        video_ids, filter(not_live, items)))
for Python 2. For Python 3 the replacement would be:
    return map(video_id, merge_video_lists(
        video_ids, list(filter(not_live, items))))
Alternatively, instead of replacing the return statement, just precede that statement with this one:
    items = merge_video_lists(video_ids, items)
This latter variant is better, since it also validates the video IDs returned by the API: if there is an ID that is not in video_ids, then merge_video_lists throws a MergeVideoListsError exception indicating the culprit ID.
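If the sort-and-merge approach feels heavier than needed, a dictionary lookup gives the same reordering. The sketch below is an alternative technique, not the merge_video_lists function from this answer: the helper name reorder_by_ids and the use of ValueError are hypothetical choices, and it assumes the IDs in video_ids are unique.

```python
def reorder_by_ids(video_ids, video_res):
    """Return the resources in video_res in the order given by video_ids.

    Hypothetical helper (not part of the original answer). IDs that have
    no resource in video_res are skipped; a resource whose ID is not in
    video_ids raises ValueError. Assumes the IDs in video_ids are unique.
    """
    # Index the returned resources by their ID for direct lookup.
    by_id = {video['id']: video for video in video_res}

    # Validate: the API should not return an ID that was not requested.
    unexpected = set(by_id) - set(video_ids)
    if unexpected:
        raise ValueError(
            "unexpected video resource of ID '%s'" % sorted(unexpected)[0])

    # Walk the requested IDs in their original order.
    return [by_id[i] for i in video_ids if i in by_id]
```

Unlike merge_video_lists, this variant does not sort video_res in place, so the caller's list is left untouched.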
To obtain all videos that are exactly N days old, excluding live streams, use the function below:
def get_days_old_video_ids(youtube, playlistId, days = 7):
    from datetime import date, datetime, timedelta
    n_days = date.today() - timedelta(days = days)

    request = youtube.playlistItems().list(
        fields = 'nextPageToken,items(snippet/resourceId,contentDetails/videoPublishedAt)',
        part = 'snippet,contentDetails',
        playlistId = playlistId,
        maxResults = 50
    )

    def parse_published_at(item):
        details = item['contentDetails']
        details['videoPublishedAt'] = datetime.strptime(
            details['videoPublishedAt'],
            '%Y-%m-%dT%H:%M:%SZ') \
            .date()
        return item

    def find_if(cond, items):
        for item in items:
            if cond(item):
                return True
        return False

    n_days_eq = lambda item: \
        item['contentDetails']['videoPublishedAt'] == n_days
    n_days_lt = lambda item: \
        item['contentDetails']['videoPublishedAt'] < n_days
    is_video = lambda item: \
        item['snippet']['resourceId']['kind'] == 'youtube#video'
    video_id = lambda item: \
        item['snippet']['resourceId']['videoId']

    videos = []

    while request:
        response = request.execute()

        items = response.get('items', [])
        assert len(items) <= 50

        # remove the non-video entries in 'items'
        items = filter(is_video, items)

        # replace each 'videoPublishedAt' with
        # its corresponding parsed date object
        items = map(parse_published_at, items)

        # terminate loop when found a 'videoPublishedAt' < n_days
        done = find_if(n_days_lt, items)

        # retain only the items with 'videoPublishedAt' == n_days
        items = filter(n_days_eq, items)

        # add to 'videos' the IDs of videos in 'items' that are not live streams
        videos.extend(get_non_livestream_videos(youtube, map(video_id, items)))

        if done: break

        request = youtube.playlistItems().list_next(
            request, response)

    return videos
The function get_days_old_video_ids above needs both filter and map to return lists, therefore the configuration code above has to be updated to:
import sys

if sys.version_info[0] >= 3:
    from builtins import map as builtin_map
    map = lambda *args: list(builtin_map(*args))
    from builtins import filter as builtin_filter
    filter = lambda *args: list(builtin_filter(*args))
Do note that get_days_old_video_ids relies on the following undocumented property of the result set produced by PlaylistItems.list: for the uploads playlist of a channel, the items returned by PlaylistItems.list are in reverse chronological order (newest first) by contentDetails.videoPublishedAt.
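Since this ordering is undocumented, it may be worth guarding against it silently changing. The following is a minimal sketch of such a check; the helper name is_newest_first is hypothetical, and it expects the raw items of one page, that is, before videoPublishedAt has been replaced with a parsed date object:

```python
from datetime import datetime

def is_newest_first(items):
    """Tell whether playlist items are sorted newest first by
    contentDetails.videoPublishedAt (hypothetical helper).

    Expects raw API items, with 'videoPublishedAt' still a string
    of the form '2021-03-09T12:00:00Z'.
    """
    stamps = [
        datetime.strptime(
            item['contentDetails']['videoPublishedAt'],
            '%Y-%m-%dT%H:%M:%SZ')
        for item in items
    ]
    # Each timestamp must be no older than the one that follows it.
    return all(a >= b for a, b in zip(stamps, stamps[1:]))
```

One possible use is an assertion on each page right after `items = filter(is_video, items)`, so that a change in the API's ordering fails loudly instead of ending the loop too early.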
Therefore you have to make sure that the argument playlistId of get_days_old_video_ids is the ID of the uploads playlist of your channel. Usually, a channel ID and its corresponding uploads playlist ID are related by s/^UC([0-9a-zA-Z_-]{22})$/UU\1/.
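The substitution above translates to Python as in the sketch below. The helper name uploads_playlist_id is hypothetical, and the mapping itself is only the usual convention: the documented way to get the uploads playlist ID is the contentDetails.relatedPlaylists.uploads property of the channel resource.

```python
import re

def uploads_playlist_id(channel_id):
    """Derive the uploads playlist ID from a channel ID by swapping
    the 'UC' prefix for 'UU' (hypothetical helper).

    Returns None when channel_id does not have the usual shape,
    i.e. 'UC' followed by 22 characters from [0-9a-zA-Z_-].
    """
    m = re.match(r'^UC([0-9a-zA-Z_-]{22})$', channel_id)
    return 'UU' + m.group(1) if m else None
```

Returning None for a non-matching ID (rather than passing the input through unchanged) makes it harder to hand a channel ID to PlaylistItems.list by mistake.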
Also note that get_days_old_video_ids returns the IDs of videos that are exactly days old. If you need the IDs of videos that are at most days old, then define:
    n_days_ge = lambda item: \
        item['contentDetails']['videoPublishedAt'] >= n_days
and replace n_days_eq with n_days_ge.
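The difference between the two predicates can be seen on a few made-up items. This is only an illustrative sketch: the IDs and dates are invented, "today" is pinned to a fixed date so the result is reproducible, and the dates are assumed to be already parsed into date objects (as parse_published_at does).

```python
from datetime import date, timedelta

today = date(2021, 3, 10)           # fixed, so the example is reproducible
n_days = today - timedelta(days=7)  # 2021-03-03

# Made-up items, newest first, with already-parsed publication dates.
items = [
    {'id': 'v1', 'contentDetails': {'videoPublishedAt': date(2021, 3, 9)}},
    {'id': 'v2', 'contentDetails': {'videoPublishedAt': date(2021, 3, 3)}},
    {'id': 'v3', 'contentDetails': {'videoPublishedAt': date(2021, 3, 1)}},
]

n_days_eq = lambda item: \
    item['contentDetails']['videoPublishedAt'] == n_days
n_days_ge = lambda item: \
    item['contentDetails']['videoPublishedAt'] >= n_days

# Exactly 7 days old: only the video published on 2021-03-03.
exactly = [item['id'] for item in items if n_days_eq(item)]
# At most 7 days old: everything published on or after 2021-03-03.
at_most = [item['id'] for item in items if n_days_ge(item)]
```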
One more thing to note: at the top of the function get_non_livestream_videos above, I added the statement:
    if not video_ids: return []
so as to avoid processing an empty video_ids list.