I am using Pandoc 2.13.
If I run the following
pandoc -s foo.md \
-f markdown_phpextra+autolink_bare_uris+raw_tex \
--toc \
-V title="Pandoc Lunch and Learn" \
-V linkcolor:blue \
-V mainfont="DejaVu Serif" \
-V monofont="DejaVu Sans Mono" \
-V geometry:letterpaper \
-V geometry:margin=2cm \
-V documentclass:article \
-o foo.pdf
My links are blue, as expected.
However, if I try to use -V linkcolor:red (or any other color, for that matter), the links still end up blue. If I use -V urlcolor=red, the link colors change as I'd expect. Why doesn't linkcolor work? Per the manual, it seems like it should: https://pandoc.org/MANUAL.html#variables-for-latex
EDIT:
This is foo.md
# Foo
## Foo Bar Baz
[This link][1] will always be blue, even when I pass `-V linkcolor:red`
[1]: https://stackoverflow.com/questions/70835461/pandoc-v-linkcolor-not-working-correctly-when-generating-pdf-with-latex
Note that the link is blue when foo.md is run through Pandoc 2.13 with the following command-line args:
pandoc -s foo.md \
-f markdown_phpextra+autolink_bare_uris+raw_tex \
--toc \
-V title="Pandoc Lunch and Learn" \
-V linkcolor:red \
-V mainfont="DejaVu Serif" \
-V monofont="DejaVu Sans Mono" \
-V geometry:letterpaper \
-V geometry:margin=2cm \
-V documentclass:article \
-o foo.pdf
Result: the link is still blue.
The linkcolor option of the hyperref package changes the colour of internal links, e.g. links to another section of your document.
However, your link gets converted to an \href{...}{...} macro, so its colour has to be set via the urlcolor option.
For more information about these hyperref options, see section "3.5 Extension options" of the hyperref user manual.
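So to get red links in this document, set urlcolor as well. A minimal sketch of the adjusted invocation (the remaining options from the original command are omitted here but stay as they were):
pandoc -s foo.md \
--toc \
-V linkcolor:red \
-V urlcolor=red \
-o foo.pdf
Here linkcolor:red affects internal links such as the table-of-contents entries produced by --toc, while urlcolor=red is what actually colours the \href generated for the external link in foo.md.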
When running Dataflow on GCP, Dataflow does not use the number of workers specified in "numWorkers".
On the console, I can see that "TargetWorkers" is the same number as "numWorkers", but the actual number of workers is not.
How do I get it to work as intended?
This is the command to deploy the template.
mvn -Pdataflow-runner compile exec:java -Dexec.mainClass=fqdn.of.main.Class \
-Dexec.args=" \
--project=my_gcp_project_name \
--stagingLocation=gs://my_product/staging/ \
--templateLocation=gs://my_product/template/Template \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=90 \
--workerMachineType=n1-highmem-4 \
--enableStreamingEngine=true \
"
Even though you are specifying the flag "--autoscalingAlgorithm=NONE", you are also using Streaming Engine, which is a type of autoscaling.
I made a test using this Quickstart. When using Streaming Engine:
mvn -Pdataflow-runner compile exec:java \
-Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args="--project=support-data-ez \
--stagingLocation=gs://datacert/staging/ \
--output=gs://datacert/output \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=90 \
--workerMachineType=n1-highmem-4 \
--enableStreamingEngine=true \
"
I experienced the same behavior.
You also have to verify whether any quota has been reached.
When a quota is exceeded, Dataflow will ignore the flag "--numWorkers=90" and determine an appropriate number of workers on its own. So, consider this factor too.
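For reference, 90 n1-highmem-4 workers need 90 x 4 = 360 vCPUs, which is above the default regional CPU quota of many projects. A quick way to check (a sketch; substitute the region your job actually runs in) is to look at the quotas section printed by:
gcloud compute regions describe us-central1
The quotas list in that output shows each metric (for example CPUS) with its limit and current usage.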
mvn -Pdataflow-runner compile exec:java \
-Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args="--project=support-data-ez \
--stagingLocation=gs://datacert/staging/ \
--output=gs://datacert/output \
--runner=DataflowRunner \
--autoscalingAlgorithm=NONE \
--numWorkers=58 \
--workerMachineType=n1-standard-1 \
--enableStreamingEngine=false \
"
As you can see, when using a number of workers (or CPUs) that does not exceed my quota and disabling the Streaming Engine feature, I was able to use all the workers specified.
I have the following scenario. I want to use TensorFlow for ML and OpenCV for some image processing. I recently learned about Docker and found out that both TF and OpenCV are dockerized. I can easily pull an image and run, e.g., a TensorFlow script. Is there a way to somehow merge what both images offer? Or run on top of them? I want to write a piece of code that uses both OpenCV and TensorFlow. Is there a way to achieve this?
Or, in a more generic sense: Docker image A has Python package AA preinstalled. Docker image B has Python package BB. How can I write a script that uses functions from both AA and BB?
Really simple. Build your own Docker image with both TF and OpenCV. Example Dockerfile (based on janza/docker-python3-opencv):
FROM python:3.7
LABEL maintainer="John Doe"
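# Build tools and libraries needed to compile OpenCV from source, plus NumPy and TensorFlow from pip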
RUN apt-get update && \
apt-get install -y \
build-essential \
cmake \
git \
wget \
unzip \
yasm \
pkg-config \
libswscale-dev \
libtbb2 \
libtbb-dev \
libjpeg-dev \
libpng-dev \
libtiff-dev \
libavformat-dev \
libpq-dev && \
pip install numpy && \
pip install tensorflow
WORKDIR /
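# Fetch the OpenCV sources and build them against the Python 3.7 interpreter of the base image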
ENV OPENCV_VERSION="3.4.2"
RUN wget https://github.com/opencv/opencv/archive/${OPENCV_VERSION}.zip \
&& unzip ${OPENCV_VERSION}.zip \
&& mkdir /opencv-${OPENCV_VERSION}/cmake_binary \
&& cd /opencv-${OPENCV_VERSION}/cmake_binary \
&& cmake -DBUILD_TIFF=ON \
-DBUILD_opencv_java=OFF \
-DWITH_CUDA=OFF \
-DWITH_OPENGL=ON \
-DWITH_OPENCL=ON \
-DWITH_IPP=ON \
-DWITH_TBB=ON \
-DWITH_EIGEN=ON \
-DWITH_V4L=ON \
-DBUILD_TESTS=OFF \
-DBUILD_PERF_TESTS=OFF \
-DCMAKE_BUILD_TYPE=RELEASE \
-DCMAKE_INSTALL_PREFIX=$(python3.7 -c "import sys; print(sys.prefix)") \
-DPYTHON_EXECUTABLE=$(which python3.7) \
-DPYTHON_INCLUDE_DIR=$(python3.7 -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
-DPYTHON_PACKAGES_PATH=$(python3.7 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") \
.. \
&& make install \
&& rm /${OPENCV_VERSION}.zip \
&& rm -r /opencv-${OPENCV_VERSION}
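To sanity-check the result (a sketch; the tag tf-opencv is just a placeholder), build the image and confirm that both packages import in the same interpreter:
docker build -t tf-opencv .
docker run --rm tf-opencv python3.7 -c "import cv2, tensorflow as tf; print(cv2.__version__, tf.__version__)"
Your own script that uses both libraries can then be copied into the image or bind-mounted at run time.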
Of course, I don't know your exact requirements for this project, and there is some probability that this Dockerfile won't work for you. Just adjust it to your needs. But I recommend building the image from the ground up (just basing it on an already existing image of some Linux distribution). Then you have full control over what is installed and in which versions, without the redundant stuff that is often found in third-party images (I'm not saying they are bad, but for most use cases large parts are redundant).
There is also an already combined Docker image on the official hub:
https://hub.docker.com/r/fbcotter/docker-tensorflow-opencv/
If you really want to keep them separate, I guess you could link running containers of those images. Containers for a linked service are reachable at a hostname identical to the alias, or the service name if no alias was specified. But you would have to implement some kind of logic to use a package from the other container (probably possible, but difficult and complex).
Docker Networking
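If you do go the multi-container route, a minimal sketch with the Docker CLI (a user-defined network, on which containers can reach each other by container name; image names are the public ones mentioned above, and this assumes both images let you override their default command):
docker network create ml-net
docker run -d --name opencv --network ml-net janza/docker-python3-opencv sleep infinity
docker run --rm --network ml-net tensorflow/tensorflow \
  python -c "import socket; print(socket.gethostbyname('opencv'))"
That only shows the containers can resolve each other; you would still need your own RPC or HTTP layer to actually call OpenCV code in one container from TensorFlow code in the other.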
I have a video that I want to download from YouTube but I also want the chat of the entire live stream.
youtube-dl \
--limit-rate '0.25M' \
--retries '3' \
--no-overwrites \
--call-home \
--write-info-json \
--write-description \
--write-thumbnail \
--all-subs \
--convert-subs 'srt' \
--write-annotations \
--add-metadata \
--embed-subs \
--download-archive '/archive/videos/gamer_Archive/gamer_Archive.ytdlarchive' \
--format 'bestvideo+bestaudio/best' \
--merge-output-format 'mkv' \
--output '/archive/videos/gamer_Archive/%(upload_date)s_%(id)s/gamer_Archive_%(upload_date)s_%(id)s_%(title)s.%(ext)s' \
'https://www.youtube.com/watch?v=HPmhA3FpQNA' ;
I am obviously not the owner of the video, and I cannot seem to use the API to pull the chat. How do I pull the video and all of the chat logs for the video? Can I do this using youtube-dl? I tried --add-metadata, --embed-subs, and everything else I could find online in the hope that the chat is written out somewhere, but I cannot find it.
I did it using this library: https://pypi.org/project/chat-replay-downloader/
Commands to do it (Linux, bash, python):
$ python -m venv venv
$ source venv/bin/activate
$ pip install chat-replay-downloader
$ chat_replay_downloader https://www.youtube.com/watch?v=yOP-VT0Q9mk >> chat.txt
I am trying to run the CVB on a directory of plain text files, following the procedure outlined below. However, I am not able to see the vectordump (step 6). Run without the "-c csv" flag, the generated file is empty. However, if I use the flag "-c csv", the generated file starts with a series of numbers followed by an alphabetically ordered series of unigrams (see below):
#1,10,1163,12,121,13,14,141,1462,15,16,17,185,1901,197,2,201,2227,23,283,298,3,331,35,4,402,4351,445,5,57,58,6,68,7,9,987,a.m,ab,abc,abercrombie,abercrombies,ability
Can someone point out what I am doing wrong?
Thank you.
0: Set Paths
> export HDFS_PATH=/path/to/hdfs/
> export LOCAL_PATH=/path/to/localfs
1: Put docs in HDFS using hadoop fs -put <localsrc> ... <dst>
> hadoop fs -put $LOCAL_PATH/test $HDFS_PATH/rawdata
2: Generate sequence files (of Text) from a directory
> mahout seqdirectory \
-i $HDFS_PATH/rawdata \
-o $HDFS_PATH/sequenced \
-c UTF-8 -chunk 5
3- Generate sparse Vector from Text sequence files
> mahout seq2sparse \
-i $HDFS_PATH/sequenced \
-o $HDFS_PATH/sparseVectors \
-ow --maxDFPercent 85 --namedVector --weight tf
4- rowid: Map SequenceFile<Text,VectorWritable> to {SequenceFile<IntWritable,VectorWritable>, SequenceFile<IntWritable,Text>}
> mahout rowid \
-i $HDFS_PATH/sparseVectors/tfidf-vectors \
-o $HDFS_PATH/matrix
5- run cvb
> mahout cvb \
-i $HDFS_PATH/matrix/matrix \
-o $HDFS_PATH/test-lda \
-k 100 -ow -x 40 \
-dict $HDFS_PATH/sparseVectors/dictionary.file-0 \
-dt $HDFS_PATH/test-lda-topics \
-mt $HDFS_PATH/test-lda-model
6- Dump vectors from a sequence file to text
> mahout vectordump \
-i $HDFS_PATH/test-lda-topics/part-m-00000 \
-o $LOCAL_PATH/vectordump \
-vs 10 -p true \
-d $HDFS_PATH/sparseVectors/dictionary.file-0 \
-dt sequencefile \
-sort $HDFS_PATH/test-lda-topics/part-m-00000 \
-c csv
> cat $LOCAL_PATH/vectordump
The problem was in step 4. In step 3 I am generating TF vectors (--weight tf), but in step 4 I was running the rowid job (which converts the <Text, VectorWritable> tuples of tf-vectors to the <IntWritable, VectorWritable> tuples that cvb expects) on tfidf-vectors.
So changing step 4 from:
> mahout rowid \
-i $HDFS_PATH/sparseVectors/tfidf-vectors \
-o $HDFS_PATH/matrix
to
> mahout rowid \
-i $HDFS_PATH/sparseVectors/tf-vectors \
-o $HDFS_PATH/matrix
fixes the problem.