Why does Ansible always replace double quotes with single quotes in templates? - docker

I am trying to generate Dockerfiles with an Ansible template - see the role source and the template in Ansible Galaxy and Github.
I need to generate a standard Dockerfile line like:
...
VOLUME ["/etc/postgresql/9.4"]
...
However, when I put this in the input file:
...
instruction: CMD
value: "[\"/etc/postgresql/{{postgresql_version}}\"]"
...
It ends up rendered like:
...
VOLUME ['/etc/postgresql/9.4']
...
and I lose the double quotes (which renders the Dockerfiles useless).
Any help? How can I convince Jinja not to substitute " with '? I tried \", the |safe filter, even {% raw %} - it just keeps doing it!
Update:
Here is how to reproduce the issue:
Go get the peruncs.docker role from galaxy.ansible.com or Github (link is given above)
Write up a simple playbook (say demo.yml) with the content below and run: ansible-playbook -v demo.yml. The -v option will let you see the temp directory where the generated Dockerfile with the broken content goes, so you can examine it. Generating the Docker image does not need to succeed; just try to get the Dockerfile right.
- name: Build docker image
  hosts: localhost
  vars:
    - somevar: whatever
    - image_tag: "blabla/booboo"
    - docker_copy_files: []
    - docker_file_content:
        - instruction: CMD
          value: '["/usr/bin/runit", "{{somevar}}"]'
  roles:
    - peruncs.docker
Thanks in advance!

Something in Ansible appears to be recognizing that as valid Python, so it gets transformed into a Python list and then serialized using Python's str(), which is why you end up with the single-quoted values (see the small sketch after the example below).
An easy way to work around this is to stick a space at the beginning of the value, which seems to prevent it from getting converted into Python:
- name: Build docker image
  hosts: localhost
  vars:
    - somevar: whatever
    - image_tag: "blabla/booboo"
    - docker_copy_files: []
    - docker_file_content:
        - instruction: CMD
          value: ' ["/usr/bin/runit", "{{somevar}}"]'
  roles:
    - peruncs.docker
This results in:
CMD ["/usr/bin/runit", "whatever"]
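Here is a minimal Python sketch of that explanation; ast.literal_eval stands in for Ansible's own safe evaluation step, which is an assumption about the exact code path rather than Ansible's real implementation:

import ast

rendered = '["/etc/postgresql/9.4"]'  # what the Jinja2 template produces
as_list = ast.literal_eval(rendered)  # stand-in for Ansible's safe evaluation
print(as_list)                        # ['/etc/postgresql/9.4'] - str() of a Python list uses single quotes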

Related

Rule in snakemake using singularity: unterminated quoted string

I'm running a Snakemake pipeline that loads a container for one specific rule:
rule counts:
    params:
        transcriptome=os.environ["INDEX"],
        outdir=(os.environ["OUTDIR"] + "/counts/"),
        indir=(os.environ["INDIR"] + "{sample}"),
        name=lambda wildcards: SAMPLES[wildcards.sample]
    output:
        (os.environ["OUTDIR"] + "counts/" + "{sample}" + "/outs/web_summary.html")
    container:
        "docker://marcusczi/cellranger_clean"
    shell:
        """
        cellranger count --id={wildcards.sample} --transcriptome={params.transcriptome} --fastqs={params.indir} --sample={params.name}
        mkdir -p {params.outdir}
        mv ./{wildcards.sample}/ {params.outdir}
        """
A dry run looks fine, and I'm sure the rule itself works (I tried it without the container). However, when I run it with the container I get this error:
Activating singularity image /some/path/.snakemake/singularity/c288fbc3fef5771f055a688c6678c24d.simg
/bin/sh: syntax error: unterminated quoted string
[ 1.228141] reboot: Power down
And then it waits for the missing files, and fails.
I think the answer to this situation might be related to this previous question, but I have tried everything I can think of in terms of escaping characters (except for the wildcards and variables within curly brackets, because I'm guessing those should be fine, and if not, why am I even using Snakemake? :-( ). The paths for the directories I'm using are valid and exist, and the name and the "sample" wildcard have the shape "sample_123", nothing fancy.
It's also worth saying that there are no single or double quotes in any of these variables.
Thank you!!
Software and OS:
I am on macOS Catalina 10.15.5, running Snakemake 5.20.1, and I have been using the beta version of Singularity for macOS (3.3.0-rc.1.658.g7427b73f1.dirty).
Running singularity outside Snakemake:
I tried running the container with Singularity outside Snakemake; the software I'm trying to run starts, but then complains that there is no space left on disk (which is not true). I'm running Singularity as: sudo singularity run -B "$(pwd):$(pwd)" docker://marcusczi/cellranger_clean
I think this latest error might mean either 1) I'm not running Singularity as I should, or 2) it is a false statement of what is happening, since cellranger (the software I'm trying to run) often has misleading error messages.
Minimal reproducible example:
If you install Snakemake, you should be able to reproduce my error by running snakemake -j1 --use-singularity in the same directory as the Snakefile.
Snakefile:
rule all:
    input:
        "output.txt"

rule counts:
    output:
        "output.txt"
    container:
        "docker://marcusczi/cellranger_clean"
    shell:
        """
        cellranger count --help
        echo "hurray!" > {output}
        """

How to exclude Multiple folders with the EXTRA_ARGS variable?

With the current syntax of Bitbucket's EXTRA_ARGS variable, one directory is excluded from deployment like this:
EXTRA_ARGS: '--exclude=YOUR_DESIRE_FOLDER_PATH/*'
(Bitbucket Pipeline - how to exclude files or folders?)
But how to exclude multiple directories?
First, note that not every pipe has support for the --exclude option, as some pipes are just wrappers around CLI tools like rsync or sftp. However, if you use the rsync-deploy pipe, you should be able to pass multiple --exclude options:
script:
  - pipe: atlassian/rsync-deploy:0.3.2
    variables:
      USER: 'ec2-user'
      SERVER: '127.0.0.1'
      REMOTE_PATH: '/var/www/build/'
      LOCAL_PATH: 'build'
      DEBUG: 'true'
      EXTRA_ARGS: '--exclude=*.txt --exclude=src/*'
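As a rough mental model of why several --exclude flags work: a wrapper pipe like this essentially splits EXTRA_ARGS into separate arguments and appends them to the underlying rsync command line. The Python sketch below only illustrates that idea; the splitting behaviour and the exact rsync flags are assumptions, not the pipe's actual source:

import shlex

extra_args = "--exclude=*.txt --exclude=src/*"

# Each --exclude ends up as its own argument on the rsync command line.
cmd = ["rsync", "-az"] + shlex.split(extra_args) + ["build/", "ec2-user@127.0.0.1:/var/www/build/"]
print(cmd)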

Artifact not being published in bitbucket pipeline

I'm doing a quite trivial Java build in a Bitbucket Pipeline. The only twist is that it is in a repository subdirectory.
My bitbucket-pipelines.yml:
pipelines:
  default:
    - step:
        caches:
          - gradle
        script: # Modify the commands below to build your repository.
          # You must commit the Gradle wrapper to your repository
          # https://docs.gradle.org/current/userguide/gradle_wrapper.html
          - bash "./foo bar/gradlew" -p "./foo bar" distTar
          - ls ./foo\ bar/build -R
          - echo 'THE END'
        artifacts:
          - ./foo bar/build/distributions/xxx.tar
My ls confirms that xxx.tar is in the expected location:
....
./foo bar/build/distributions:
brigitte.tar
....
but the artifacts page is empty.
Found it! It should be
# ...
artifacts:
  - foo bar/build/distributions/brigitte.tar
Artifact paths are not real paths, so the "dot slash" at the beginning was invalidating my path. Shame that it was not raised as a warning!
Extending the existing answer, I'd like to highlight the docs fragment that speaks about this:
https://support.atlassian.com/bitbucket-cloud/docs/use-artifacts-in-steps/#Introduction
You can use glob patterns to define artifacts. Glob patterns that start with a * will need to be put in quotes. Note: As these are glob patterns, path segments “.” and “..” won’t work. Use paths relative to the build directory.
So that's it: artifacts can't be outside the build directory, and artifact definitions in the pipelines must not contain . and .. path segments.
Absolute paths starting with / will not work either!
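To make those rules concrete, here is a tiny checker sketch; it only illustrates the documented constraints and is in no way Bitbucket's actual validation logic:

def looks_like_valid_artifact_path(path: str) -> bool:
    segments = path.split("/")
    if path.startswith("/"):                 # absolute paths don't work
        return False
    if "." in segments or ".." in segments:  # "." and ".." segments don't work
        return False
    return True

print(looks_like_valid_artifact_path("./foo bar/build/distributions/xxx.tar"))    # False
print(looks_like_valid_artifact_path("foo bar/build/distributions/brigitte.tar")) # True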
It is truly shameful how obscure this is. Why wouldn't the pipelines throw an error if that declaration is invalid? Damn, Bitbucket!

Jenkins job builder: conditionally include builder and publisher

I have a set of Jenkins jobs that are substantially the same. I have created a job template that creates them all. However, some have builders that others don't (e.g. the first in the chain doesn't copy artifacts from another project), and some have publishers that others don't (they don't all have JUnit tests).
I would like to conditionally include these modules depending on a variable, but I can't find a way of doing this:
- I can't use a Jinja2 template to include or exclude one item in a list.
- Including empty variables typically causes the build to fail.
- I could include YAML files, but I would need to include all of the builders section, and I would need one for each job, meaning a lot of repetition.
Is this possible? I would like to include the commented-out section below in some of the jobs.
builders:
  - shell: |
      echo Removing working directory from previous run
      rm -rf ${{WORKSPACE}}/css-build/working
  # - copyartifact:
  #     project: "{previous-project}"
  #     whichbuild: last-successful
  #     optional: "{copy-optional}"
  - shell: |
      {init-shell}
      ${{WORKSPACE}}/css-build/build-util.sh {shell-args} ${{WORKSPACE}}/{location} -w ${{WORKSPACE}}/css-build/working
Well, here is the workaround:
Define a new module (in this case it will be a builder) with a different name from the original. If the omit property is set to true, don't do anything; otherwise, do what would have happened anyway.
import collections

def optional_copy(registry, xml_parent, data):
    # Skip this builder entirely when the omit flag is set
    if data['omit'].lower() == 'true':
        return
    else:
        # Otherwise dispatch the same data as a regular copyartifact builder
        new_data = collections.OrderedDict()
        new_data['copyartifact'] = data
        registry.dispatch('builder', xml_parent, new_data)
Register it to jjb in setup.py:
from setuptools import setup

setup(
    name='JJB config',
    py_modules=['optionals'],
    entry_points={
        'jenkins_jobs.builders': [
            'optional-copy=optionals:optional_copy'
        ]
    }
)
Then, in your yaml, you can use the optional-copy module and the omit property:
builders:
  - shell: |
      echo Removing working directory from previous run
      rm -rf "{working-dir}"
  - optional-copy:
      omit: "{omit-copy}"
      project: "{prev}"
      whichbuild: last-successful
  - shell: |
      {init-shell}
      ${{WORKSPACE}}/css-build/build-util.sh -u {diirt-version} {shell-args} -p ${{WORKSPACE}}/{location} -w "{working-dir}"
I have got a workaround for your issue that does not require extending the job builder, but it requires the availability of the Conditional BuildStep plugin in Jenkins.
Example for optional builders:
- job-template:
    id: my-custom-template
    builders:
      - conditional-step:
          condition-kind: always
          steps: "{obj:optional_builders|[]}"
With this you can add builders to your job using an optional_builders variable (if you want to).
jobs:
  # With optional_builders
  - my-custom-template:
      optional_builders:
        - copyartifact:
            project: "{previous-project}"
            whichbuild: last-successful
  # Without optional_builders
  - my-custom-template:
Example for optional publishers:
publishers:
  - conditional-publisher:
      - condition-kind: always
        action: "{obj:optional_publishers|[]}"

How to get Task ID from within ECS container?

Hello, I am interested in retrieving the Task ID from inside a running container which lives on an EC2 host machine.
The AWS ECS documentation states there is an environment variable ECS_CONTAINER_METADATA_FILE with the location of this data, but it will only be set/available if the ECS_ENABLE_CONTAINER_METADATA variable is set to true upon cluster/EC2 instance creation. I don't see where this can be done in the AWS console.
Also, the docs state that this can be done by setting it to true inside the host machine, but that would require restarting the Docker agent.
Is there any other way to do this without having to go inside the EC2 instance to set this and restart the Docker agent?
This doesn't work for newer Amazon ECS container agent versions anymore, and in fact it's now much simpler and also enabled by default. Please refer to this documentation, but here's a TL;DR:
If you're using Amazon ECS container agent version 1.39.0 and higher, you can just do this inside the docker container:
curl -s "$ECS_CONTAINER_METADATA_URI_V4/task" \
| jq -r ".TaskARN" \
| cut -d "/" -f 3
Here's a list of container agent releases, but if you're using :latest, you're definitely fine.
The technique I'd use is to set the environment variable in the container definition.
If you're managing your tasks via CloudFormation, the relevant YAML looks like this:
Taskdef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ...
    ContainerDefinitions:
      - Name: some-name
        ...
        Environment:
          - Name: AWS_DEFAULT_REGION
            Value: !Ref AWS::Region
          - Name: ECS_ENABLE_CONTAINER_METADATA
            Value: 'true'
This technique helps you keep everything straightforward and reproducible.
If you need metadata programmatically and don't have access to the metadata file, you can query the agent's metadata endpoint:
curl http://localhost:51678/v1/metadata
Note that if you're getting this information as a running task, you may not be able to connect to the loopback device, but you can connect to the EC2 instance's own IP address.
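For example, a small Python sketch of that approach could look like the following; it reads the instance's private IP from the EC2 instance metadata service (IMDSv1 style, so it assumes IMDSv2 tokens are not enforced) and then hits the agent's introspection endpoint on port 51678, which must be reachable from the task:

import requests

# EC2 instance metadata: the instance's own private IP (assumes IMDSv1 is available)
instance_ip = requests.get(
    "http://169.254.169.254/latest/meta-data/local-ipv4", timeout=2
).text

# ECS agent introspection endpoint on the instance
agent_metadata = requests.get(f"http://{instance_ip}:51678/v1/metadata", timeout=2).json()
print(agent_metadata)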
We set it with the so-called user data, which is executed at the start of the machine. There are multiple ways to set it, for example: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html#user-data-console
It could look like this:
#!/bin/bash
cat <<'EOF' >> /etc/ecs/ecs.config
ECS_CLUSTER=ecs-staging
ECS_ENABLE_CONTAINER_METADATA=true
EOF
Important: Adjust the ECS_CLUSTER above to match your cluster name, otherwise the instance will not connect to that cluster.
Previous answers are correct; here is another way of doing this:
From the EC2 instance where the container is running, run this command:
curl http://localhost:51678/v1/tasks | python -mjson.tool |less
From the AWS ECS CLI documentation:
Command:
aws ecs list-tasks --cluster default
Output:
{
    "taskArns": [
        "arn:aws:ecs:us-east-1:<aws_account_id>:task/0cc43cdb-3bee-4407-9c26-c0e6ea5bee84",
        "arn:aws:ecs:us-east-1:<aws_account_id>:task/6b809ef6-c67e-4467-921f-ee261c15a0a1"
    ]
}
To list the tasks on a particular container instance
This example command lists the tasks of a specified container instance, using the container instance UUID as a filter.
Command:
aws ecs list-tasks --cluster default --container-instance f6bbb147-5370-4ace-8c73-c7181ded911f
Output:
{
    "taskArns": [
        "arn:aws:ecs:us-east-1:<aws_account_id>:task/0cc43cdb-3bee-4407-9c26-c0e6ea5bee84"
    ]
}
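If you would rather stay in Python than shell out to the AWS CLI, the same lookups can be done with boto3 (assuming boto3 is installed and credentials and the region are configured):

import boto3

ecs = boto3.client("ecs")

# All tasks in the cluster
print(ecs.list_tasks(cluster="default")["taskArns"])

# Tasks on one container instance, same filter as the CLI example above
response = ecs.list_tasks(
    cluster="default",
    containerInstance="f6bbb147-5370-4ace-8c73-c7181ded911f",
)
print(response["taskArns"])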
My ECS solution, as bash and Python snippets. Logging calls print debug output to sys.stderr, while print() is used to pass the value back to the shell script.
#!/bin/bash
TASK_ID=$(python3.8 get_ecs_task_id.py)
echo "TASK_ID: ${TASK_ID}"
Python script - get_ecs_task_id.py
import json
import logging
import os
import sys

import requests

# logging configuration
# file_handler = logging.FileHandler(filename='tmp.log')
# redirecting to stderr so I can pass back extracted task id in STDOUT
stdout_handler = logging.StreamHandler(stream=sys.stderr)
# handlers = [file_handler, stdout_handler]
handlers = [stdout_handler]

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s",
    handlers=handlers,
    datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)


def get_ecs_task_id(host):
    path = "/task"
    url = host + path
    headers = {"Content-Type": "application/json"}
    r = requests.get(url, headers=headers)
    logger.debug(f"r: {r}")
    d_r = json.loads(r.text)
    logger.debug(d_r)
    ecs_task_arn = d_r["TaskARN"]
    ecs_task_id = ecs_task_arn.split("/")[2]
    return ecs_task_id


def main():
    logger.debug("Extracting task ID from $ECS_CONTAINER_METADATA_URI_V4")
    logger.debug("Inside get_ecs_task_id.py, redirecting logs to stderr")
    logger.debug("so that I can pass the task id back in STDOUT")
    host = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    ecs_task_id = get_ecs_task_id(host)
    # This print statement passes the string back to the bash wrapper, don't remove
    logger.debug(ecs_task_id)
    print(ecs_task_id)


if __name__ == "__main__":
    main()
