Getting an error while trying to use a command under the lifecycle tag on kubernetes - docker

I'm successfully running Kubernetes, gcloud, and Postgres, but I want to make some modifications after pod startup. I'm trying to move some files, so I tried these three options:
1
image: paunin/postgresql-cluster-pgsql
lifecycle:
  postStart:
    exec:
      command: [/bin/cp /var/lib/postgres/data /tmpdatavolume/]
2
image: paunin/postgresql-cluster-pgsql
lifecycle:
  postStart:
    exec:
      command:
        - "cp"
        - "/var/lib/postgres/data"
        - "/tmpdatavolume/"
3
image: paunin/postgresql-cluster-pgsql
lifecycle:
  postStart:
    exec:
      command: ["/bin/cp "]
      args: ["/var/lib/postgres/data","/tmpdatavolume/"]
With options 1 and 2, I get the same error (from kubectl get events):
Killing container with docker id f436e40f5df2: PostStart handler: Error executing in Docker Container: -1
With option 3, it won't even let me upload the YAML file, and gives me this error:
error validating "postgres-master.yaml": error validating data: found invalid field args for v1.ExecAction; if you choose to ignore these errors, turn validation off with --validate=false
Any help would be appreciated! Thanks.
P.S. I only pasted part of my YAML file, since I wasn't getting any errors until I added those new lines.

Here's the document about lifecycle hooks you might find useful.
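Your option 1 won't work and should give you the error you saw: YAML parses the unquoted contents of the brackets as a single string, so the container tries to run a program literally named "/bin/cp /var/lib/postgres/data /tmpdatavolume/". It should be ["/bin/cp","/var/lib/postgres/data","/tmpdatavolume/"] instead. Option 2 is also a correct way to specify it. Can you kubectl exec into your pod and run those commands to see what error messages they generate? Do something like kubectl exec <pod-name> -i -t -- bash -il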
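To see why a command given as one string fails while separate elements work, here is a small sketch you can run in any POSIX shell. It uses /bin/echo as a harmless stand-in for /bin/cp (an assumption for illustration only), and it mimics the exec hook's behaviour of running the first element directly as a program, with no shell splitting it into words.

```shell
#!/bin/sh
# Option 1 style: the whole string is ONE element, so it is looked up
# as a program name (including the spaces) and cannot be found.
if "/bin/echo hello world" 2>/dev/null; then
  single_status=0
else
  single_status=$?
fi

# Corrected style: program and arguments are separate elements.
separate_output=$("/bin/echo" "hello" "world")

echo "single element exit status: $single_status"
echo "separate elements output:   $separate_output"
```

The first call exits non-zero because no file with that full name exists; the second prints the two arguments as expected.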
The error message shown in option 3 means that you're not passing a valid configuration to the API server. To learn the API definition, see v1.Lifecycle and after a few clicks into its child fields you'll find args isn't valid under lifecycle.postStart.exec.
Alternatively, you can find those API definitions using kubectl explain, e.g. kubectl explain pods.spec.containers.lifecycle.postStart.exec in this case.
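One thing worth checking when you run the command by hand inside the pod: if /var/lib/postgres/data is a directory (the question doesn't say, so this is an assumption), plain cp refuses to copy it and exits non-zero, which would also make the postStart handler fail. The sketch below reproduces that in scratch directories; the paths are hypothetical stand-ins, not the real pod layout.

```shell
#!/bin/sh
# Scratch directories standing in for /var/lib/postgres/data and /tmpdatavolume.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/tmpdatavolume"
echo "sample" > "$work/data/file.txt"

# Without -r, cp will not copy a directory and returns a non-zero status.
if cp "$work/data" "$work/tmpdatavolume/" 2>/dev/null; then
  plain_cp=ok
else
  plain_cp=failed
fi

# With -r, the directory is copied recursively.
if cp -r "$work/data" "$work/tmpdatavolume/"; then
  recursive_cp=ok
else
  recursive_cp=failed
fi

# Confirm the file arrived before cleaning up.
if [ -f "$work/tmpdatavolume/data/file.txt" ]; then
  copied=yes
else
  copied=no
fi
rm -rf "$work"

echo "cp without -r: $plain_cp"
echo "cp with -r:    $recursive_cp (file copied: $copied)"
```

If that turns out to be the cause, adding "-r" as an extra element of the command array would be the corresponding change in the hook.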

Related

Deploying Cloud Run via YAML gives error spec.template.spec.containers should contain exactly 1 container

When deploying a Cloud Run service via a YAML file from the command line, it fails with this error.
ERROR: (gcloud.run.services.replace) spec.template.spec.containers should contain exactly 1 container
This is because the documentation for adding an environment variable is wrong, or confusing at best.
The env key should be nested inside the same list item as image, not placed directly under the containers node as it says here:
https://cloud.google.com/run/docs/configuring/environment-variables#yaml
This is correct:
- image: us-east1-docker.pkg.dev/proj/repo/image:r1
  env:
  - name: SOMETHING
    value: Xyz

How can I make a custom command run always with `when: always`

I have a circle config which includes the following custom command:
remove-circle-ip:
  description: "remove current Circle CI box IP from inbound security group rules for DB"
  steps:
    - aws-white-list-circleci-ip/remove:
        tag-key: circleci
        tag-value: whitelistmeplease
        port: 5432
which I use in my job as follows:
jobs:
  test:
    docker:
      - image: nikolaik/python-nodejs:python3.8-nodejs12
        environment:
          AWS_DEFAULT_REGION: us-east-2
    steps:
      - setup
      - install-python-deps
      - add-circle-ip
      - run:
          name: run tests
          command: |
            poetry run coverage run --source='.' manage.py test
      - run:
          name: remove circle IP
          command: remove-circle-ip
          when: always
I'd like the step for remove circle IP to run even if the tests which run before it fail. I can't seem to figure out the syntax for this. Previously, I had just used - remove-circle-ip to run the command rather than putting a run block, i.e.:
jobs:
  test:
    docker:
      ...
    steps:
      - setup
      - ...
      - add-circle-ip
      - ...
      - remove-circle-ip
but couldn't figure out how to specify when: always if I did it that way.
But now, when switching to calling my command as part of a run block, it fails with "remove-circle-ip: command not found"
So how can I make this command always run even if steps before fail?
I'm fairly new to CircleCI so there may be a better way to do this, or maybe this shouldn't be done at all, however something similar was done (before I joined) to a project I'm working on. It was achieved by making every step report success, whether it actually succeeded or failed, which allows the command at the end to always run. The commands are all terminal commands, so they just have || true at the end. I'm not sure how you would achieve that with a more complex command or using a builtin command.
In our case the steps that can fail are optional and we don't care if they actually fail or not. However if you want to report the failure I think that you should be able to store the failure from a previous step somewhere, and add a final step that reports it.
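As a rough sketch of the two ideas above, here is what they look like in plain shell. The run_tests function is a hypothetical stand-in for the real test command, and nothing here is CircleCI-specific.

```shell
#!/bin/sh
# Hypothetical stand-in for a test command that fails with status 3.
run_tests() { return 3; }

# Pattern 1: swallow the failure so the step always reports success.
run_tests || true
swallowed_status=$?
echo "status after '|| true': $swallowed_status"

# Pattern 2: remember the real status, do the cleanup, report at the end.
status=0
run_tests || status=$?
echo "cleanup would run here regardless of the test result"
echo "recorded test status: $status"
# A real final step would finish with: exit "$status"
```

The first pattern hides the failure completely; the second keeps it around so a later step can still fail the job once the cleanup is done.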

Error: endorsement failure during invoke. response: status:500 message:"error in simulation: failed to execute transaction [duplicate]

I just reinstalled Fabric Samples v2.2.0 from Hyperledger Fabric repository according to the documentation.
When I run the asset-transfer-basic application located in the fabric-samples/asset-transfer-basic/application-javascript directory with node app.js, the wallet is created and an admin and a user are registered. But when it then tries to invoke the function as given in app.js, it shows this error:
error: [Transaction]: Error: No valid responses from any peers. Errors:
peer=peer0.org1.example.com:7051, status=500, message=error in simulation: failed to execute transaction
aa705c10403cb65cecbd360c13337d03aac97a8f233a466975773586fe1086f6: could not launch chaincode basic_1.0:b359a077730d7
f44d6a437ad49d1da951f6a01c6d1eed4f85b8b1f5a08617fe7: error starting container: error starting container:
API error (404): network _test not found
Response of a transaction to invoke a function
This error never occurred before. But after reinstalling Docker and the Hyperledger Fabric fabric-samples, it can never find the network _test.
N.B.: Before reinstalling, the name of the network was net_test. Now, when I run docker network ls, it shows a network called docker_test. I am using Windows Subsystem for Linux (WSL) version 1.
NETWORK ID     NAME          DRIVER    SCOPE
b7ac05456f46   bridge        bridge    local
acaa5856b871   docker_test   bridge    local
866f58b9078d   host          host      local
4812f94efb15   none          null      local
How can I fix the issue occurring when I try to run the application?
In my opinion, the CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE setting seems to be wrong.
You can check it in docker-compose.yaml or core.yaml.
1. docker-compose.yaml
I'll use fabric-samples/test-network as the example, since that matches your situation.
Look for CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE in docker-compose.yaml.
In your case, ${COMPOSE_PROJECT_NAME} was probably not set, so it expanded to an empty string and the network name became _test.
Make sure the variable is set correctly, or change the value to your actual network name.
# hyperledger/fabric-samples/test-network/docker/docker-compose-test-net.yaml
# based v2.2
...
  peer0.org1.example.com:
    container_name: peer0.org1.example.com
    image: hyperledger/fabric-peer:2.2
    environment:
      - CORE_VM_ENDPOINT=unix:///host/var/run/docker.sock
      # - CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE=${COMPOSE_PROJECT_NAME}_test
      - CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE=docker_test
...
2. core.yaml
If you have not set the value for the peer in docker-compose.yaml, check the core.yaml that the peer references.
You can find the NetworkMode parameter in core.yaml:
# core.yaml
...
vm:
  docker:
    hostConfig:
      # NetworkMode: host
      NetworkMode: docker_test
...
If neither is set, the default value is used. However, since _test appears in your log, a wrong value has been set in one of these two places, and you need to change it to the value you intended.
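The empty-variable effect described above is easy to reproduce in a plain shell, which expands an unset variable to an empty string the same way. This is only an illustration of the substitution, not of Docker Compose itself.

```shell
#!/bin/sh
# Mimics the value of CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE=${COMPOSE_PROJECT_NAME}_test

# Variable not set: the prefix disappears and only "_test" is left.
unset COMPOSE_PROJECT_NAME
unset_result="${COMPOSE_PROJECT_NAME}_test"
echo "unset:   $unset_result"

# Variable set to "net": the expected network name comes back.
COMPOSE_PROJECT_NAME=net
set_result="${COMPOSE_PROJECT_NAME}_test"
echo "set=net: $set_result"
```

The first result matches the "network _test not found" message in the question, and the second matches the net_test network that existed before the reinstall.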
This issue is related to Docker networking. To complement nezuko's answer:
Create a file and name it ".env" in the same directory where your docker-compose file exists.
Add the following line in it:
COMPOSE_PROJECT_NAME=net
Use docker-compose up to update the container with the new configurations.
Or bring the HL network down (./network.sh down) and up (./network.sh up), restarting the test-network.
Otherwise you'll still get the same error even after creating ".env" file.
More explanation about docker networking
Run ./network.sh down
then
export COMPOSE_PROJECT_NAME=net
afterwards
./network.sh up
I copied this from someone. This one worked for me!
Please create a file named ".env" in the same directory where your docker-compose file exists. Add the following line to the ".env" file:
COMPOSE_PROJECT_NAME=net
This worked for me
export COMPOSE_PROJECT_NAME=net

Trying to install logging on google cloud run but it's failing

I am trying to follow these instructions to log correctly from Java through Logback to Cloud Run:
https://cloud.google.com/logging/docs/setup/java
With JDK 8 I get Jetty errors about missing ALPN, so I moved to the Docker image openjdk:10-jre-slim,
and my Dockerfile is simple:
FROM openjdk:10-jre-slim
RUN mkdir -p ./webpieces
COPY . ./webpieces/
COPY config/logback.cloudrun.xml ./webpieces/config/logback.xml
WORKDIR "/webpieces"
ENTRYPOINT ./bin/customerportal -http.port=:$PORT -hibernate.persistenceunit=cloud-production
The only difference is that I switched the image from openjdk:8-jdk-alpine, which worked fine!
When I deploy to Google Cloud I get this error:
Deploying container to Cloud Run service [staging-customerportal] in project [orderly-gcp] region [us-west1]
⠏ Deploying... Cloud Run error: Invalid argument error. Invalid ENTRYPOINT. [name: "gcr.io/orderly-gcp/customerportal2#sha256:6c1c2e7531684d8f50a3120f1de60cade841ab1d9069b704ee3fd8499c5b7779"
error: "Invalid command \"/bin/sh\": file not found"
].
X Deploying... Cloud Run error: Invalid argument error. Invalid ENTRYPOINT. [name: "gcr.io/orderly-gcp/customerportal2#sha256:6c1c2e7531684d8f50a3120f1de60cade841ab1d9069b704ee3fd8499c5b7779"
error: "Invalid command \"/bin/sh\": file not found"
].
. Routing traffic...
Deployment failed
ERROR: (gcloud.run.deploy) Cloud Run error: Invalid argument error. Invalid ENTRYPOINT. [name: "gcr.io/orderly-gcp/customerportal2#sha256:6c1c2e7531684d8f50a3120f1de60cade841ab1d9069b704ee3fd8499c5b7779"
error: "Invalid command \"/bin/sh\": file not found"
].
However, when I run locally to test, I get this error about a project ID being required, so the logging setup itself seems to be working. SIDE QUESTION: How can I simulate this project ID so I can still run locally?
03:10:08,650 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [CLOUD]
03:10:09,868 |-ERROR in ch.qos.logback.core.joran.spi.Interpreter#14:13 - RuntimeException in Action for tag [appender] java.lang.IllegalArgumentException: A project ID is required for this service but could not be determined from the builder or the environment. Please set a project ID using the builder.
at java.lang.IllegalArgumentException: A project ID is required for this service but could not be determined from the builder or the environment. Please set a project ID using the builder.
at at com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
at at com.google.cloud.ServiceOptions.<init>(ServiceOptions.java:285)
at at com.google.cloud.logging.LoggingOptions.<init>(LoggingOptions.java:98)
at at com.google.cloud.logging.LoggingOptions$Builder.build(LoggingOptions.java:92)
at at com.google.cloud.logging.LoggingOptions.getDefaultInstance(LoggingOptions.java:52)
at at com.google.cloud.logging.logback.LoggingAppender.getLoggingOptions(LoggingAppender.java:246)
at at com.google.cloud.logging.logback.LoggingAppender.getProjectId(LoggingAppender.java:209)
at at com.google.cloud.logging.logback.LoggingAppender.start(LoggingAppender.java:194)
at at ch.qos.logback.core.joran.action.AppenderAction.end(AppenderAction.java:90)
at at ch.qos.logback.core.joran.spi.Interpreter.callEndAction(Interpreter.java:309)
at at ch.qos.logback.core.joran.spi.Interpreter.endElement(Interpreter.java:193)
at at ch.qos.logback.core.joran.spi.Interpreter.endElement(Interpreter.java:179)
at at ch.qos.logback.core.joran.spi.EventPlayer.play(EventPlayer.java:62)
at at ch.qos.logback.core.joran.GenericConfigurator.doConfigure(GenericConfigurator.java:165)
at at ch.qos.logback.core.joran.GenericConfigurator.doConfigure(GenericConfigurator.java:152)
at at ch.qos.logback.core.joran.GenericConfigurator.doConfigure(GenericConfigurator.java:110)
Java 10 is EOL, and the official images have been removed. More detail here
Prefer a Java 11 version.
Also, some image variants are slimmed down and don't install bash by default (to reduce their size), so you have to install it yourself.
For a local run, I don't recommend using a JSON key file (in general, don't use a JSON key file except for automated systems outside GCP) because of the security concerns: key rotation, secure storage, and so on.
To set the project, simply run gcloud config set project MY_PROJECT. You don't need credentials for this.
Since your current question is how to simulate the project ID for local testing:
You should download a service account key file from https://console.cloud.google.com/iam-admin/serviceaccounts/project?project=MY_PROJECT, make it accessible inside the Docker container, and activate it via
gcloud auth activate-service-account --key-file my_service_account.json
gcloud config set project MY_PROJECT
This problem may be due to the fact that the base image lacks the shell being invoked (Alpine, for example, doesn't have bash), hence the "/bin/sh" error. A solution could be to remove the dependency on the shell itself, either by not using it or by using exec instead.
In my case I solved the problem by using a more complete base image instead of, for instance, Alpine.
HTH
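For context on why /bin/sh shows up in the error at all: Docker runs a shell-form ENTRYPOINT (a plain string, as in the Dockerfile above) through /bin/sh -c, and that shell is also what expands $PORT. The sketch below imitates that wrapping locally; printf stands in for ./bin/customerportal so it runs anywhere, and the port value is an arbitrary example.

```shell
#!/bin/sh
# Shell-form ENTRYPOINT is executed roughly as: /bin/sh -c "<the ENTRYPOINT string>"
# The inner shell is what turns $PORT into its value.
PORT=8080
export PORT
expanded=$(/bin/sh -c 'printf "%s\n" "-http.port=:$PORT"')
echo "argument seen by the program: $expanded"
```

So an image in which /bin/sh cannot be found will reject this style of ENTRYPOINT, whereas the exec (JSON array) form does not need a shell but also does not expand $PORT on its own.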

Docker image deployed to Google Compute Engine keeps restarting

I built an image with Google Cloud Build using Docker Compose. In my cloudbuild.yml file I have the following steps:
1. Build the Docker image using Docker Compose
2. Tag the built image
3. Create an instance template
4. Create an instance group
Now here is the problem: every time a new instance gets built, the container created from the image keeps restarting and never actually boots up. Despite this, I can build the image and start it as a container on the instance, independently of the image from Cloud Build.
I managed to find some clues from the logs:
E1219 19:13:52 7f28dce6d700 api_server.cc:184 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
oauth2.cc:289 Getting auth token from metadata server docker
I also got a clue by running the following on the instance:
docker start -a -i <container_id>
Output: Unrecognized input header: 99
The cloudbuild.yml file looks like (I've replaced some variables with ...):
#cloudbuild.yaml
steps:
  - name: 'docker/compose:1.22.0'
    args: ['-f', 'docker/docker-compose.tb.prod.yml', 'up', '-d']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['tag', 'tb:latest', '...']
  - name: 'gcr.io/cloud-builders/gcloud'
    args: [
      'beta', 'compute', '--project=...', 'instance-templates', 'create-with-container',
      'tb-app-staging-${COMMIT_SHA}',
      '--machine-type=n1-standard-2', '--network=...', '--network-tier=PREMIUM', '--metadata=google-logging-enabled=true',
      '--maintenance-policy=MIGRATE', '--service-account=...',
      '--scopes=https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append',
      '--tags=http-server,https-server', '--image=cos-stable-69-10895-62-0', '--image-project=cos-cloud', '--boot-disk-size=20GB', '--boot-disk-type=pd-standard',
      '--container-restart-policy=always', '--labels=container-vm=cos-stable-69-10895-62-0',
      '--boot-disk-device-name=...',
      '--container-image=...',
    ]
  - name: 'gcr.io/cloud-builders/gcloud'
    args: [
      'beta', 'compute', '--project=...', 'instance-groups',
      'managed', 'rolling-action', 'start-update',
      'tb-app-staging',
      '--version',
      'template=...',
      '--zone=europe-west1-b',
      '--max-surge=20',
      '--max-unavailable=9999'
    ]
images: ['...']
timeout: 1200s
I found the issue, and I'll answer this question myself in case someone else runs into it.
The problem was that my docker-compose.yml sets stdin_open and tty to true, but my cloudbuild.yml file did not pick those settings up and failed silently (annoying!).
To fix the issue, use the --container-stdin and --container-tty flags on the create-with-container command.
More details can be found on the google docs https://cloud.google.com/compute/docs/containers/configuring-options-to-run-containers
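As a dry-run sketch of where those two flags go, the script below only assembles and prints the command; it does not call gcloud. The template name tb-app-staging-EXAMPLE is a hypothetical placeholder, the image is left elided as in the question, and the remaining flags from the question are omitted for brevity.

```shell
#!/bin/sh
# Dry run: builds the argument list and prints the command instead of executing it.
set -- beta compute instance-templates create-with-container \
  "tb-app-staging-EXAMPLE" \
  "--container-image=..." \
  --container-stdin \
  --container-tty \
  --container-restart-policy=always

# Join the arguments into one printable line.
cmd="gcloud $*"
echo "$cmd"
```

In the cloudbuild.yml from the question, the equivalent change would be two extra entries, '--container-stdin' and '--container-tty', in the args list of the create-with-container step.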
I had a similar issue; the cause was setting USER in the Dockerfile. I was changing the user to 'node', which is a user available in the official Node.js images, but that does not work on Google Cloud containers.
FROM node:current-buster-slim
USER node

Resources