Python Docker API: containers.run doesn't print output to the console - docker

I have pulled the alpine image.
I have built the container image.
I am trying to run the image, but I do not see any output on my console. Does anyone know what's wrong?
If I run it using docker run, I can see the output.
Python version is 2.7.10.
import docker

dockerClient = docker.from_env()
image = dockerClient.images.pull('alpine')
dockerClient.images.build(path="build/", tag="alpine_tests")
dockerClient.containers.run('alpine_tests', 'pwd')

If you really want to see the output immediately, use docker build in the CLI, or subprocess.call(['docker', 'build']) in Python.
With the Python SDK, the stdout and stderr messages are usually unnecessary when a call succeeds. They matter only when something crashes or fails.
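# Python 3 (on Python 2, add: from __future__ import print_function)
import json

import docker

# docker build
client = docker.from_env()
try:
    client.images.build(
        path='build/',
        tag='alpine_tests',
    )
except docker.errors.BuildError as exc:
    # Print the build log only when the build fails, then re-raise.
    for log in exc.build_log:
        msg = log.get('stream')
        if msg:
            print(msg, end='')
    raise
# docker push
ret = client.images.push('alpine_tests')
# The last line of the push output is a JSON object that carries any error.
last_line = ret.splitlines()[-1]
info = json.loads(last_line)
error = info.get('error')
if error:
    print(error)
    exit(1)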
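For the containers.run call in the question, it may help to know that with detach left at its default (False) the SDK returns the container's output as bytes instead of printing it. The following is a minimal sketch under that assumption, written for Python 3; run_and_print is a hypothetical helper name, not part of the SDK.

```python
def run_and_print(client, image, command):
    """Run a container to completion and echo what it wrote."""
    # With detach=False (the default), containers.run blocks until the
    # container exits and returns its logs as bytes; nothing is printed.
    output = client.containers.run(image, command, remove=True)
    text = output.decode("utf-8", errors="replace")
    print(text, end="")
    return text
```

It would be called as run_and_print(docker.from_env(), 'alpine_tests', 'pwd').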

Related

Airflow - failing XCOM push when using Alpine image

I want to run a KubernetesPodOperator in Airflow that reads a file and sends its content to XCom.
The definition looks like:
read_file = DefaultKubernetesPodOperator(
    image = 'alpine:3.16',
    cmds = ['bash', '-cx'],
    arguments = ['cat file.json >> /airflow/xcom/return.json'],
    name = 'some-name',
    task_id = 'some_name',
    do_xcom_push = True,
    image_pull_policy = 'IfNotPresent',
)
But I am getting: INFO - stderr from command: cat: can't open '/***/xcom/return.json': No such file or directory
When I use ubuntu:22.04 it works, but I want to make it faster by using a smaller (Alpine) image. Why does it not work with Alpine, and how can I overcome that?

Powershell: Issue redirecting output from error stream when using docker

I am working on a set of build scripts which are called from an Ubuntu-hosted CI environment. The PowerShell build script calls jest via react-scripts via npm. Unfortunately, jest doesn't use stderr correctly and writes non-errors to that stream.
I have redirected the error stream using 3>&1 2>&1, and this works fine from plain PowerShell Core ($LASTEXITCODE is 0 after running, and no content from stderr is written in red).
However, when I introduce docker via docker run, the build script misbehaves: it outputs the line that should be redirected from the error stream in red and crashes, i.e. something like: docker : PASS src/App.test.js. Error: Process completed with exit code 1.
Can anyone suggest what I am doing wrong? I'm a bit stumped. I include the sample PowerShell call below:
function Invoke-ShellExecutable
{
    param (
        [ScriptBlock]
        $Command
    )
    $Output = Invoke-Command $Command -NoNewScope | Out-String
    if($LASTEXITCODE -ne 0) {
        $CmdString = $Command.ToString().Trim()
        throw "Process [$($CmdString)] returned a failure status code [$($LASTEXITCODE)]. The process may have outputted details about the error."
    }
    return $Output
}
Invoke-ShellExecutable {
    ($env:CI = "true") -and (npm run test:ci)
} 3>&1 2>&1

Why is my containerized Selenium application failing only in AWS Lambda?

I'm trying to get a function to run in AWS Lambda that uses Selenium and Firefox/geckodriver in order to run. I've decided to go the route of creating a container image, and then uploading and running that instead of using a pre-configured runtime. I was able to create a Dockerfile that correctly installs Firefox and Python, downloads geckodriver, and installs my test code:
FROM alpine:latest
RUN apk add firefox python3 py3-pip
RUN pip install requests selenium
RUN mkdir /app
WORKDIR /app
RUN wget -qO gecko.tar.gz https://github.com/mozilla/geckodriver/releases/download/v0.28.0/geckodriver-v0.28.0-linux64.tar.gz
RUN tar xf gecko.tar.gz
RUN mv geckodriver /usr/bin
COPY *.py ./
ENTRYPOINT ["/usr/bin/python3","/app/lambda_function.py"]
The Selenium test code:
#!/usr/bin/env python3
import util
import os
import sys
import requests

def lambda_wrapper():
    api_base = f'http://{os.environ["AWS_LAMBDA_RUNTIME_API"]}/2018-06-01'
    response = requests.get(api_base + '/runtime/invocation/next')
    request_id = response.headers['Lambda-Runtime-Aws-Request-Id']
    try:
        result = selenium_test()
        # Send result back
        requests.post(api_base + f'/runtime/invocation/{request_id}/response', json={'url': result})
    except Exception as e:
        # Error reporting
        import traceback
        requests.post(api_base + f'/runtime/invocation/{request_id}/error', json={'errorMessage': str(e), 'traceback': traceback.format_exc(), 'logs': open('/tmp/gecko.log', 'r').read()})
        raise

def selenium_test():
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.options import Options
    options = Options()
    options.add_argument('-headless')
    options.add_argument('--window-size 1920,1080')
    ffx = Firefox(options=options, log_path='/tmp/gecko.log')
    ffx.get("https://google.com")
    url = ffx.current_url
    ffx.close()
    print(url)
    return url

def main():
    # For testing purposes, currently not using the Lambda API even in AWS so that
    # the same container can run on my local machine.
    # Call lambda_wrapper() instead to get geckodriver logs as well (not informative).
    selenium_test()

if __name__ == '__main__':
    main()
I'm able to successfully build this container on my local machine with docker build -t lambda-test . and then run it with docker run -m 512M lambda-test.
However, the exact same container crashes with an error when I try and upload it to Lambda to run. I set the memory limit to 1024M and the timeout to 30 seconds. The traceback says that Firefox was unexpectedly killed by a signal:
START RequestId: 52adeab9-8ee7-4a10-a728-82087ec9de30 Version: $LATEST
/app/lambda_function.py:29: DeprecationWarning: use service_log_path instead of log_path
ffx = Firefox(options=options, log_path='/tmp/gecko.log')
Traceback (most recent call last):
File "/app/lambda_function.py", line 45, in <module>
main()
File "/app/lambda_function.py", line 41, in main
lambda_wrapper()
File "/app/lambda_function.py", line 12, in lambda_wrapper
result = selenium_test()
File "/app/lambda_function.py", line 29, in selenium_test
ffx = Firefox(options=options, log_path='/tmp/gecko.log')
File "/usr/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
RemoteWebDriver.__init__(
File "/usr/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/usr/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/usr/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status signal
END RequestId: 52adeab9-8ee7-4a10-a728-82087ec9de30
REPORT RequestId: 52adeab9-8ee7-4a10-a728-82087ec9de30 Duration: 20507.74 ms Billed Duration: 21350 ms Memory Size: 1024 MB Max Memory Used: 131 MB Init Duration: 842.11 ms
Unknown application error occurred
I had it upload the geckodriver logs as well, but there wasn't much useful information in there:
1608506540595 geckodriver INFO Listening on 127.0.0.1:41597
1608506541569 mozrunner::runner INFO Running command: "/usr/bin/firefox" "--marionette" "-headless" "--window-size 1920,1080" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofileQCapHy"
*** You are running in headless mode.
How can I even begin to debug this? The fact that the exact same container behaves differently depending upon where it's run seems fishy to me, but I'm not knowledgeable enough about Selenium, Docker, or Lambda to pinpoint exactly where the problem is.
Is my docker run command not accurately recreating the environment in Lambda? If so, then what command would I run to better simulate the Lambda environment? I'm not really sure where else to go from here, seeing as I can't actually reproduce the error locally to test with.
If anyone wants to take a look at the full code and try building it themselves, the repository is here - the lambda code is in lambda_function.py.
As for prior research, this question a) is about ChromeDriver and b) has no answers from over a year ago. The link from that one only has information about how to run a container in Lambda, which I'm already doing. This answer is almost my problem, but I know that there's not a version mismatch because the container works on my laptop just fine.
I have exactly the same problem and a possible explanation.
I think what you want is not possible for the time being.
According to the AWS DevOps Blog, Firefox relies on the fallocate system call and on /dev/shm.
However, AWS Lambda does not mount /dev/shm, so Firefox crashes when it tries to allocate memory. Unfortunately, this behavior cannot be disabled in Firefox.
If you can live with Chromium, there is a --disable-dev-shm-usage option for chromedriver that disables the use of /dev/shm and writes shared memory files to /tmp instead.
chromedriver works fine for me on AWS Lambda, if that is an option for you.
According to the AWS DevOps Blog, you can also use AWS Fargate to run Firefox/geckodriver.
There is an entry in the AWS forum from 2015 that requests mounting /dev/shm in Lambdas, but nothing has happened since then.
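As a rough sketch of the two ideas above, the snippet below checks whether /dev/shm is usable in the current environment and applies the --disable-dev-shm-usage flag to an options object. The helper names and the extra --headless flag are assumptions made for illustration. apply_flags only relies on the object having an add_argument method, which Selenium's Chrome Options class provides.

```python
import os

# "--disable-dev-shm-usage" comes from the answer above; "--headless" is an
# added assumption for an environment without a display.
LAMBDA_CHROMIUM_FLAGS = ["--headless", "--disable-dev-shm-usage"]


def dev_shm_available(path="/dev/shm"):
    """Return True if the shared-memory directory exists and is writable."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def apply_flags(options, flags=LAMBDA_CHROMIUM_FLAGS):
    """Add each flag to an options object that exposes add_argument()."""
    for flag in flags:
        options.add_argument(flag)
    return options
```

With Selenium installed, options would typically be selenium.webdriver.chrome.options.Options(), passed on as webdriver.Chrome(options=options). Logging dev_shm_available() both locally and inside Lambda is one way to see whether the two environments differ in this respect.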

How to use Snakemake container for htslib (bgzip + tabix)

I have a pipeline which uses a global singularity image and rule-based conda wrappers.
However, some of the tools don't have wrappers (i.e. htslib's bgzip and tabix).
Now I need to learn how to run jobs in containers.
The official documentation says:
"Allowed image urls entail everything supported by singularity (e.g., shub:// and docker://)."
I tried the following image from Singularity Hub, but I get an error.
Minimal reproducible example:
config.yaml
# Files
REF_GENOME: "c_elegans.PRJNA13758.WS265.genomic.fa"
GENOME_ANNOTATION: "c_elegans.PRJNA13758.WS265.annotations.gff3"
Snakefile
# Directories------------------------------------------------------------------
configfile: "config.yaml"
# Setting the names of all directories
dir_list = ["REF_DIR", "LOG_DIR", "BENCHMARK_DIR", "QC_DIR", "TRIM_DIR", "ALIGN_DIR", "MARKDUP_DIR", "CALLING_DIR", "ANNOT_DIR"]
dir_names = ["refs", "logs", "benchmarks", "qc", "trimming", "alignment", "mark_duplicates", "variant_calling", "annotation"]
dirs_dict = dict(zip(dir_list, dir_names))
GENOME_INDEX=config["REF_GENOME"]+".fai"
VEP_ANNOT=config["GENOME_ANNOTATION"]+".gz"
VEP_ANNOT_INDEX=config["GENOME_ANNOTATION"]+".gz.tbi"
# Singularity with conda wrappers
singularity: "docker://continuumio/miniconda3:4.5.11"
# Rules -----------------------------------------------------------------------
rule all:
    input:
        expand('{REF_DIR}/{GENOME_ANNOTATION}{ext}', REF_DIR=dirs_dict["REF_DIR"], GENOME_ANNOTATION=config["GENOME_ANNOTATION"], ext=['', '.gz', '.gz.tbi']),
        expand('{REF_DIR}/{REF_GENOME}{ext}', REF_DIR=dirs_dict["REF_DIR"], REF_GENOME=config["REF_GENOME"], ext=['','.fai']),
rule download_references:
    params:
        ref_genome=config["REF_GENOME"],
        genome_annotation=config["GENOME_ANNOTATION"],
        ref_dir=dirs_dict["REF_DIR"]
    output:
        os.path.join(dirs_dict["REF_DIR"],config["REF_GENOME"]),
        os.path.join(dirs_dict["REF_DIR"],config["GENOME_ANNOTATION"]),
        os.path.join(dirs_dict["REF_DIR"],VEP_ANNOT),
        os.path.join(dirs_dict["REF_DIR"],VEP_ANNOT_INDEX)
    resources:
        mem=80000,
        time=45
    log:
        os.path.join(dirs_dict["LOG_DIR"],"references","download.log")
    singularity:
        "shub://biocontainers/tabix"
    shell: """
        cd {params.ref_dir}
        wget ftp://ftp.wormbase.org/pub/wormbase/releases/WS265/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS265.genomic.fa.gz
        bgzip -d {params.ref_genome}.gz
        wget ftp://ftp.wormbase.org/pub/wormbase/releases/WS265/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS265.annotations.gff3.gz
        bgzip -d {params.genome_annotation}.gz
        grep -v "#" {params.genome_annotation} | sort -k1,1 -k4,4n -k5,5n -t$'\t' | bgzip -c > {params.genome_annotation}.gz
        tabix -p gff {params.genome_annotation}.gz
        """
rule index_reference:
    input:
        os.path.join(dirs_dict["REF_DIR"],config["REF_GENOME"])
    output:
        os.path.join(dirs_dict["REF_DIR"],GENOME_INDEX)
    resources:
        mem=2000,
        time=30,
    log:
        os.path.join(dirs_dict["LOG_DIR"],"references", "faidx_index.log")
    wrapper:
        "0.64.0/bio/samtools/faidx"
Error
Building DAG of jobs...
Pulling singularity image shub://biocontainers/tabix.
WorkflowError:
Failed to pull singularity image from shub://biocontainers/tabix:
FATAL: While pulling shub image: failed to get manifest for: shub://biocontainers/tabix: the requested manifest was not found in singularity hub
File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/deployment/singularity.py", line 88, in pull
It appears this is a problem with the container?
(snakemake) [moldach#arc CONTAINER_TROUBLESHOOT]$ singularity pull shub://biocontainers/tabix
FATAL: While pulling shub image: failed to get manifest for: shub://biocontainers/tabix: the requested manifest was not found in singularity hub
In fact, I experience this problem with other biocontainers containers.
For example, I also need a container to do bowtie2 indexing. This is the error I get from biocontainers/bowtie2, compared with another developer's container of the same tool, comics/bowtie2:
(snakemake) [moldach#arc CONTAINER_TROUBLESHOOT]$ singularity pull docker://biocontainers/bowtie2
FATAL: While making image from oci registry: failed to get checksum for docker://biocontainers/bowtie2: Error reading manifest latest in docker.io/biocontainers/bowtie2: manifest unknown: manifest unknown
(snakemake) [moldach#arc CONTAINER_TROUBLESHOOT]$ singularity pull docker://comics/bowtie2
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob a02a4930cb5d done
Does anyone know why?
BioContainers does not allow latest as a tag for its containers, so you need to specify the tag to be used.
From their docs:
The BioContainers community had decided to remove the latest tag. Then, the following command docker pull biocontainers/crux will fail. Read more about this decision in Getting started with Docker
When no tag is specified, the pull defaults to the latest tag, which does not exist for these images. See here for bowtie2's tags. Usage like this will work:
singularity pull docker://biocontainers/bowtie2:v2.4.1_cv1
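To make the default-tag behavior concrete, here is a small sketch that reports which tag an image reference would resolve to, so an untagged reference can be caught before a pull is attempted. split_image_ref is a hypothetical helper written for this illustration. It mirrors the usual "no tag means latest" convention and is not how Singularity or Docker parse references internally.

```python
def split_image_ref(ref):
    """Split 'docker://repo/name:tag' into (name, tag); tag defaults to 'latest'."""
    # Drop a URI scheme such as docker:// or shub:// if present.
    if "://" in ref:
        ref = ref.split("://", 1)[1]
    # A digest reference (name@sha256:...) pins the image, so no tag default applies.
    if "@" in ref:
        name, digest = ref.split("@", 1)
        return name, digest
    # Only a colon after the last slash separates a tag; an earlier colon
    # belongs to a registry port such as localhost:5000.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        return ref[:colon], ref[colon + 1:]
    return ref, "latest"
```

A reference whose tag comes back as latest, such as docker://biocontainers/bowtie2, is the case that fails for BioContainers images.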
Using another container solves the issue. However, the fact that I'm getting errors from biocontainers is troubling, given that these containers are both very common and used as examples in the literature, so I will award the top answer to whoever can solve that specific issue.
As it happens, using stackleader/bgzip-utility solves the issue of actually running this rule in a container.
container:
"docker://stackleader/bgzip-utility"
Once again, for those coming to this post, it's probably best to test any container first before running snakemake, e.g. singularity pull docker://stackleader/bgzip-utility.

When trying to do a docker pull using exec.Command() in golang, I am not seeing the progress bars

I am using the Go exec package to execute a docker pull debian command:
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

func main() {
	cmd := exec.Command("docker", "pull", "debian")
	stdout, _ := cmd.StdoutPipe()
	cmd.Start()
	// Print each line of the command's stdout as it arrives.
	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	cmd.Wait()
}
But it never shows me the progress bar. It only shows an update when a step is fully complete. For larger images over a GB, it is hard to see whether progress is being made. This is what it shows:
e9afc4f90ab0: Pulling fs layer
e9afc4f90ab0: Verifying Checksum
e9afc4f90ab0: Download complete
e9afc4f90ab0: Pull complete
Is it possible to get output similar to what I see when I run docker pull debian in the terminal, or something that I can use to show progress?
e9afc4f90ab0: Downloading [==========> ] 10.73MB/50.39MB
As David mentioned, you should rather use the official Docker Engine SDK to interact with docker.
Initialize the docker client
cli, _ := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
Pull the image
reader, _ := cli.ImagePull(context.Background(), "hello-world", types.ImagePullOptions{})
Parse the json stream
id, isTerm := term.GetFdInfo(os.Stdout)
_ = jsonmessage.DisplayJSONMessagesStream(reader, os.Stdout, id, isTerm, nil)
You will get the same output as the docker CLI provides when you run docker pull hello-world.
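For comparison, the same JSON event stream can be read from Python. This is a sketch that assumes the Docker SDK for Python (the docker package) is installed and a daemon is reachable. format_pull_event and pull_with_progress are hypothetical helper names. The event keys used here (id, status, progress) are those the pull stream normally carries, and any of them may be absent from a given event.

```python
def format_pull_event(event):
    """Render one decoded pull event as a single console line."""
    parts = []
    if event.get("id"):
        parts.append(event["id"] + ":")
    parts.append(event.get("status", ""))
    if event.get("progress"):
        parts.append(event["progress"])
    return " ".join(parts)


def pull_with_progress(repository, tag="latest"):
    """Stream pull events from the Docker daemon and print each one."""
    import docker  # imported lazily so format_pull_event works without the SDK

    api = docker.from_env().api  # low-level API client
    # stream=True yields events as they arrive; decode=True parses the JSON.
    for event in api.pull(repository, tag=tag, stream=True, decode=True):
        print(format_pull_event(event))
```

Unlike the Go helper above, this prints one line per event and does not redraw progress bars in place.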

Resources