I have a video that I want to download from YouTube but I also want the chat of the entire live stream.
youtube-dl \
--limit-rate '0.25M' \
--retries '3' \
--no-overwrites \
--call-home \
--write-info-json \
--write-description \
--write-thumbnail \
--all-subs \
--convert-subs 'srt' \
--write-annotations \
--add-metadata \
--embed-subs \
--download-archive '/archive/videos/gamer_Archive/gamer_Archive.ytdlarchive' \
--format 'bestvideo+bestaudio/best' \
--merge-output-format 'mkv' \
--output '/archive/videos/gamer_Archive/%(upload_date)s_%(id)s/gamer_Archive_%(upload_date)s_%(id)s_%(title)s.%(ext)s' \
'https://www.youtube.com/watch?v=HPmhA3FpQNA' ;
I am obviously not the owner of the video, and I cannot seem to use the API to pull the chat. How do I pull the video and all of the chat logs for it? Can I do this using youtube-dl? I tried --add-metadata, --embed-subs, and everything else I could find online in the hope that the chat is written somewhere, but I cannot find it.
I did it using this library: https://pypi.org/project/chat-replay-downloader/
Commands to do it (Linux, bash, python):
$ python -m venv venv
$ source venv/bin/activate
$ pip install chat-replay-downloader
$ chat_replay_downloader https://www.youtube.com/watch?v=yOP-VT0Q9mk >> chat.txt
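If you want the video and the chat in one go, the two tools pair up in a small script. This is just a sketch reusing the flags and URL from the question; trim the youtube-dl options to taste:
# Grab the video itself with youtube-dl
youtube-dl \
--format 'bestvideo+bestaudio/best' \
--merge-output-format 'mkv' \
--output '%(upload_date)s_%(id)s_%(title)s.%(ext)s' \
'https://www.youtube.com/watch?v=HPmhA3FpQNA'
# Save the live chat replay next to it
chat_replay_downloader 'https://www.youtube.com/watch?v=HPmhA3FpQNA' >> chat.txt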
When running Dataflow on GCP, the job does not use the number of workers specified in "numWorkers".
In the console I can see that "TargetWorkers" equals "numWorkers", but the actual number of workers does not.
How do I get it to work as intended?
This is the command to deploy the template.
mvn -Pdataflow-runner compile exec:java -Dexec.mainClass=fqdn.of.main.Class \
-Dexec.args=" \
--project=my_gcp_project_name \
--stagingLocation=gs://my_product/staging/ \
--templateLocation=gs://my_product/template/Template \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=90 \
--workerMachineType=n1-highmem-4 \
--enableStreamingEngine=true \
"
Even though you are specifying the flag "--autoscalingAlgorithm=NONE", you are also enabling Streaming Engine, which does its own autoscaling.
I made a test using this Quickstart with Streaming Engine enabled:
mvn -Pdataflow-runner compile exec:java \
-Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args="--project=support-data-ez \
--stagingLocation=gs://datacert/staging/ \
--output=gs://datacert/output \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=90 \
--workerMachineType=n1-highmem-4 \
--enableStreamingEngine=true \
"
I observed the same behavior.
You also have to verify that no quota has been exceeded.
When a quota is exceeded, Dataflow discards the "--numWorkers=90" flag and determines an appropriate number of workers on its own, so consider this factor too.
mvn -Pdataflow-runner compile exec:java \
-Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args="--project=support-data-ez \
--stagingLocation=gs://datacert/staging/ \
--output=gs://datacert/output \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=58 \
--workerMachineType=n1-standard-1 \
--enableStreamingEngine=false \
"
As you can see, when using a number of workers (CPUs) that does not exceed my quota and disabling the Streaming Engine feature, I was able to get all the workers I specified.
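You can also check how close you are to the CPU quota for the job's region before launching. A sketch; us-central1 is just an example region, use the one your job runs in:
# Show limit and current usage for the CPUS quota in the region
gcloud compute regions describe us-central1 | grep -B1 -A1 "metric: CPUS"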
I'm rather new to Docker and I'm trying to make a simple Dockerfile that combines an alpine image with a python one.
This is what the Dockerfile looks like:
FROM alpine
RUN apk update &&\
apk add -q --progress \
bash \
bats \
curl \
figlet \
findutils \
git \
make \
mc \
nodejs \
openssh \
sed \
wget \
vim
ADD ./src/ /home/src/
WORKDIR /home/src/
FROM python:3.7.4-slim
When running:
docker build -t alp-py .
the image builds as normal.
When I run
docker run -it alp-py bash
I can access the bash, but when I cd to /home/ and ls, it shows an empty directory:
root@5fb77bbc81a1:/# cd home
root@5fb77bbc81a1:/home# ls
root@5fb77bbc81a1:/home#
I've already tried changing ADD to COPY and also trying:
COPY . /home/src/
but nothing works.
What am I doing wrong? Am I missing something?
Thanks!
There is no such thing as "combining 2 images". You should see images as different virtual machines (only for the purpose of understanding the concept, because they are more than that). You cannot combine them.
In your example you can start directly with the python image and install the tools you need on top of it:
FROM python:3.7.4-slim
RUN apt-get update &&\
apt-get install -y \
bash \
bats \
curl \
figlet \
findutils \
git \
make \
mc \
nodejs \
openssh-client \
sed \
wget \
vim
ADD ./src/ /home/src/
WORKDIR /home/src/
I didn't test whether all the packages are available, so you might want to do a bit of research and adjust the names in case you get errors.
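Once it builds, a quick check (reusing the alp-py tag from your question) that your sources actually made it into the final image:
docker build -t alp-py .
docker run --rm alp-py ls /home/src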
When you use two FROM statements in your Dockerfile you are creating a multi-stage build. That is useful if you want a final image that doesn't contain your source code, only the binaries of your product (the first stage builds the source, and the second copies only the binaries from the first).
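Your original file, by contrast, is a two-stage build where only the final python:3.7.4-slim stage gets tagged, which is exactly why /home showed up empty. You can see it like this (alp-py-orig and Dockerfile.orig are made-up names for a copy of your original Dockerfile):
# Rebuild the original two-FROM Dockerfile for comparison
docker build -t alp-py-orig -f Dockerfile.orig .
docker run --rm alp-py-orig ls /home           # empty: the alpine stage is discarded
docker run --rm alp-py-orig python --version   # the final image is plain Python 3.7.4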
I have the following scenario. I want to use TensorFlow for ML and OpenCV for some image processing. I recently learned about Docker and found out that both TF and OpenCV are available as Docker images. I can easily pull an image and run e.g. a TensorFlow script. Is there a way to somehow merge what both images offer, or run one on top of the other? I want to write a piece of code that uses both OpenCV and TensorFlow. Is there a way to achieve this?
Or, more generically: Docker image A has the Python package AA preinstalled, and Docker image B has the Python package BB. How can I write a script that uses functions from both AA and BB?
Really simple: build your own Docker image with both TF and OpenCV. Example Dockerfile (based on janza/docker-python3-opencv):
FROM python:3.7
LABEL maintainer="John Doe"
RUN apt-get update && \
apt-get install -y \
build-essential \
cmake \
git \
wget \
unzip \
yasm \
pkg-config \
libswscale-dev \
libtbb2 \
libtbb-dev \
libjpeg-dev \
libpng-dev \
libtiff-dev \
libavformat-dev \
libpq-dev && \
pip install numpy && \
pip install tensorflow
WORKDIR /
ENV OPENCV_VERSION="3.4.2"
RUN wget https://github.com/opencv/opencv/archive/${OPENCV_VERSION}.zip \
&& unzip ${OPENCV_VERSION}.zip \
&& mkdir /opencv-${OPENCV_VERSION}/cmake_binary \
&& cd /opencv-${OPENCV_VERSION}/cmake_binary \
&& cmake -DBUILD_TIFF=ON \
-DBUILD_opencv_java=OFF \
-DWITH_CUDA=OFF \
-DWITH_OPENGL=ON \
-DWITH_OPENCL=ON \
-DWITH_IPP=ON \
-DWITH_TBB=ON \
-DWITH_EIGEN=ON \
-DWITH_V4L=ON \
-DBUILD_TESTS=OFF \
-DBUILD_PERF_TESTS=OFF \
-DCMAKE_BUILD_TYPE=RELEASE \
-DCMAKE_INSTALL_PREFIX=$(python3.7 -c "import sys; print(sys.prefix)") \
-DPYTHON_EXECUTABLE=$(which python3.7) \
-DPYTHON_INCLUDE_DIR=$(python3.7 -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
-DPYTHON_PACKAGES_PATH=$(python3.7 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") \
.. \
&& make install \
&& rm /${OPENCV_VERSION}.zip \
&& rm -r /opencv-${OPENCV_VERSION}
Of course, I don't know your exact requirements for this project, and there is some chance this Dockerfile won't work for you as-is; just adjust it to your needs. But I do recommend building from the ground up (basing it on an existing image of some Linux distribution). That way you have full control over what is installed and in which versions, without the redundant extras often found in third-party images (I'm not saying they are bad, but for most use cases large parts of them are unnecessary).
There is also an already combined Docker image on the official hub:
https://hub.docker.com/r/fbcotter/docker-tensorflow-opencv/
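Pulling it and poking around is a quick way to check whether it already covers your needs (assuming the image ships a bash shell):
docker pull fbcotter/docker-tensorflow-opencv
docker run -it fbcotter/docker-tensorflow-opencv bash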
If you really want to keep them separate, I guess you could link the running containers of those images. Containers for a linked service are reachable at a hostname identical to the alias, or the service name if no alias was specified. But you would have to implement some kind of logic to call a package in another container (probably possible, but difficult and complex).
Docker Networking
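For completeness, with user-defined networks that would look roughly like this (ml-net and the container names are made up; sleep infinity just keeps the detached container alive):
# Containers on the same user-defined bridge reach each other by name
docker network create ml-net
docker run -d --name tf --network ml-net tensorflow/tensorflow sleep infinity
docker run -it --name cv --network ml-net janza/docker-python3-opencv bash
# inside 'cv', the other container is reachable at the hostname 'tf'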
I would like to build OpenCV 3 from scratch with Anaconda 3. I tried to find instructions online but could not find any. I would appreciate it if anyone could point me in the right direction.
Thanks
Let's assume you are on Linux, that you want to install OpenCV in /foo/opencv, and that Anaconda 3 is installed in /foo/anaconda3. This should do it:
mkdir -p /foo/opencv/src
cd /foo/opencv/src
wget https://github.com/opencv/opencv/archive/3.2.0.zip
unzip 3.2.0.zip
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/foo/opencv \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_opencv_python2=OFF \
-DBUILD_opencv_python3=ON \
-DPYTHON3_EXECUTABLE=/foo/anaconda3/bin/python \
-DPYTHON3_INCLUDE_DIR=/foo/anaconda3/include/python3.6m \
-DWITH_1394=OFF -DWITH_VTK=OFF -DWITH_CUDA=OFF -DWITH_OPENMP=ON \
-DWITH_OPENCL=OFF -DWITH_MATLAB=OFF -DBUILD_SHARED_LIBS=ON \
-DBUILD_PERF_TESTS=OFF -DBUILD_TESTS=OFF \
../opencv-3.2.0
make
make install
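As a quick sanity check, the Anaconda interpreter should now be able to import the bindings. They are installed under the prefix, so you may need to extend PYTHONPATH first (the exact site-packages path below is an assumption, check your install tree):
export PYTHONPATH=/foo/opencv/lib/python3.6/site-packages:$PYTHONPATH
/foo/anaconda3/bin/python -c "import cv2; print(cv2.__version__)"   # expect 3.2.0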
I have the following ffmpeg CLI command, which does not produce the effect described in the documentation. Could this be a bug, or do I have something wrong in the command?
ffmpeg \
-y \
-i small.mp4 \
-i monkey/monkey_%04d.png \
-filter_complex "[0:v][1:v]overlay=enable='between(t,1,5)'[out1]" \
-map '[out1]' \
output.mp4
I expect it to overlay the #1 stream on top of #0 between seconds 1 and 5.
You may download the test tarball from this link:
https://drive.google.com/file/d/0BxIQVP1zErDPYXRveG9hN0c0Qjg/view?usp=sharing
It includes assets for the test case.
The build I tried with:
ffmpeg-3.0.2-64bit-static (available online)
FFmpeg is a time-based processor, i.e. it aligns packets by timestamp, so you have to align the start of the image sequence with the start of the overlay window.
ffmpeg \
-y \
-i small.mp4 \
-i monkey/monkey_%04d.png \
-filter_complex "[1:v]setpts=PTS-STARTPTS+(1/TB)[1v]; \
[0:v][1v]overlay=enable='between(t,1,5)'[out1]" \
-map '[out1]' \
output.mp4
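The setpts=PTS-STARTPTS part rebases the image sequence so its first frame sits at t=0, and +(1/TB) then delays it by exactly 1 second, so the sequence begins right when the enable window opens. As a sketch, the generic form for an overlay that starts at T seconds and lasts D seconds (T=1, D=4 reproduces the command above):
T=1; D=4
ffmpeg -y -i small.mp4 -i monkey/monkey_%04d.png \
-filter_complex "[1:v]setpts=PTS-STARTPTS+($T/TB)[1v]; \
[0:v][1v]overlay=enable='between(t,$T,$T+$D)'[out1]" \
-map '[out1]' output.mp4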