"Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable." - google-cloud-run

I'm trying to build a container image that I will later use to update the code inside of a virtual machine. The Docker image works fine: I can build and run it from my terminal. However, I keep getting an error when I try to deploy it to Cloud Run: "Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable." How can I fix this error?
The build log contains this:
Deploying container to Cloud Run service [SERVICE] in project [PROJECT_ID] region [REGION]
Deploying...
Creating Revision.......................................................................................................................................................................failed
Deployment failed
ERROR: (gcloud.run.deploy) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
The revision log contains this:
{
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"status": {
"code": 9,
"message": "Ready condition status changed to False for Revision {REVISION_NAME} with message: Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.\n\nLogs URL:{URL_LINK}"
},
"serviceName": "run.googleapis.com",
"resourceName": "{REVISION_NAME}",
"response": {
"metadata": {
"name": "{REVISION_NAME}",
"namespace": "{NAMESPACE}",
"selfLink": "{SELFLINK}",
"uid": "{UID}",
"resourceVersion": "{RESOURCEVER}",
"generation": 1,
"creationTimestamp": "{TIMESTAMP}",
"labels": {
"serving.knative.dev/route": "{SERVICE}",
"serving.knative.dev/configuration": "{SERVICE}",
"serving.knative.dev/configurationGeneration": "15",
"serving.knative.dev/service": "{SERVICE}",
"serving.knative.dev/serviceUid": "{SERVICE_UID}",
"cloud.googleapis.com/location": "{REGION}"
},
"annotations": {
"run.googleapis.com/client-name": "gcloud",
"serving.knative.dev/creator": "{NAMESPACE}@cloudbuild.gserviceaccount.com",
"client.knative.dev/user-image": "gcr.io/{PROJECT_ID}/{IMAGE}",
"run.googleapis.com/client-version": "357.0.0",
"autoscaling.knative.dev/maxScale": "100"
},
"ownerReferences": [
{
"kind": "Configuration",
"name": "{SERVICE}",
"uid": "{UID}",
"apiVersion": "serving.knative.dev/v1",
"controller": true,
"blockOwnerDeletion": true
}
]
},
"apiVersion": "serving.knative.dev/v1",
"kind": "Revision",
"spec": {
"containerConcurrency": 80,
"timeoutSeconds": 300,
"serviceAccountName": "{NAMESPACE}-compute@developer.gserviceaccount.com",
"containers": [
{
"image": "gcr.io/{PROJECT_ID}/{IMAGE}",
"ports": [
{
"name": "h2c",
"containerPort": 8080
}
],
"resources": {
"limits": {
"cpu": "1000m",
"memory": "512Mi"
}
}
}
]
},
"status": {
"observedGeneration": 1,
"conditions": [
{
"type": "Ready",
"status": "False",
"reason": "HealthCheckContainerError",
"message": "Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.\n\nLogs URL:{LOG_LINK}",
"lastTransitionTime": "{TIME}"
},
{
"type": "Active",
"status": "Unknown",
"reason": "Reserve",
"lastTransitionTime": "{TIME}",
"severity": "Info"
},
{
"type": "ContainerHealthy",
"status": "False",
"reason": "HealthCheckContainerError",
"message": "Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.\n\nLogs URL:{LOG_LINK}",
"lastTransitionTime": "{TIME}"
},
{
"type": "ResourcesAvailable",
"status": "True",
"lastTransitionTime": "{TIME}"
},
{
"type": "Retry",
"status": "True",
"reason": "ImmediateRetry",
"message": "System will retry after 0:00:00 from lastTransitionTime for attempt 0.",
"lastTransitionTime": "{TIME}",
"severity": "Info"
}
],
"logUrl": "{LOG_LINK}",
"imageDigest": "gcr.io/{PROJECT_ID}/{IMAGE_SHA}"
},
"@type": "type.googleapis.com/google.cloud.run.v1.Revision"
}
},
"insertId": "{ID}",
"resource": {
"type": "cloud_run_revision",
"labels": {
"location": "{REGION}",
"configuration_name": "{SERVICE}",
"service_name": "{SERVICE}",
"project_id": "{PROJECT_ID}",
"revision_name": "{REVISION_NAME}"
}
},
"timestamp": "{TIME}",
"severity": "ERROR",
"logName": "projects/{PROJECT_ID}/logs/cloudaudit.googleapis.com%2Fsystem_event",
"receiveTimestamp": "{TIME}"
}
This is my cloudbuild.yaml:
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/PROJECT_ID/IMAGE', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/PROJECT_ID/IMAGE']
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  entrypoint: gcloud
  args: ['run', 'deploy', 'SERVICE-NAME', '--image', 'gcr.io/PROJECT_ID/IMAGE', '--region', 'REGION', '--port', '8080']
images:
- gcr.io/PROJECT_ID/IMAGE
This is my Dockerfile:
FROM python:3.9.7-slim-buster
WORKDIR /app
COPY . .
CMD [ "python3", "hello.py" ]
This is the code in hello.py:
print("Hello World")

When Cloud Run starts your container, it sends a health check to the container. Your container does not respond to the health check, so Cloud Run determines that your service is failing.
Cloud Run requires that a container run a process that listens for and responds to HTTP requests.
Your hello.py file only prints a message to stdout and then exits. It never starts a process that listens for requests.
A very simple example that converts your example into a working program:
import os
from flask import Flask

app = Flask(__name__)

@app.route('/')
def home():
    return "Hello world"

if __name__ == '__main__':
    # Cloud Run passes the port to listen on in the PORT environment variable.
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
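Note: You will need to add a requirements.txt file to your build to include Flask. Create requirements.txt in the same location as the Dockerfile, and install it in the image, for example with a RUN pip install -r requirements.txt step after the COPY line in the Dockerfile.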
requirements.txt:
Flask==2.0.1
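For comparison, here is a minimal sketch of the same idea using only the Python standard library, so no requirements.txt is needed. The names HelloHandler, resolve_port, make_server and main are hypothetical and chosen for illustration. The sketch assumes plain HTTP/1.1 traffic and is not meant as a production server.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a plain-text greeting."""

    def do_GET(self):
        body = b"Hello World\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def resolve_port(environ=os.environ, default=8080):
    """Read the port from PORT, falling back to a default for local runs."""
    return int(environ.get("PORT", default))


def make_server(port):
    """Bind on all interfaces so requests from outside the container arrive."""
    return HTTPServer(("0.0.0.0", port), HelloHandler)


def main():
    server = make_server(resolve_port())
    server.serve_forever()  # blocks, so the container process stays alive
```

Calling main() as the last line of hello.py would make this the process that the Dockerfile's CMD starts.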

Related

Filebeat 7.10.1 add_docker_metadata adds only container.id

I'm using Filebeat 7.10.1 installed on the host system (not in a Docker container), running as a service under root,
configured according to https://www.elastic.co/guide/en/beats/filebeat/current/add-docker-metadata.html
and https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-container.html
Filebeat config, filebeat.yml:
filebeat.inputs:
- type: container
  enabled: true
  paths:
    - '/var/lib/docker/containers/*/*.log'
  processors:
    - add_docker_metadata: ~
setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
setup.kibana:
output.logstash:
  hosts: ["<logstash_host>:5044"]
Started container:
docker run --rm -d -l my-label --label com.example.foo=bar -p 80:80 nginx
Filebeat picks up the logs and successfully sends them to the endpoint (in my case Logstash, which forwards them to Elasticsearch), but the JSON generated by Filebeat contains only container.id, without container.name, container.labels and container.image.
It looks like this (copy-pasted from Kibana):
{
"_index": "logstash-2021.02.10",
"_type": "_doc",
"_id": "s4a4i3cB8j0XLXFVuyMm",
"_version": 1,
"_score": null,
"_source": {
"@version": "1",
"ecs": {
"version": "1.6.0"
},
"@timestamp": "2021-02-10T11:33:54.000Z",
"host": {
"name": "<some_host>"
},
"input": {
"type": "container"
},
"tags": [
"beats_input_codec_plain_applied"
],
"log": {
.....
},
"stream": "stdout",
"container": {
"id": "15facae2115ea57c9c99c13df815427669e21053791c7ddd4cd0c8caf1fbdf8c-json.log"
},
"agent": {
"version": "7.10.1",
"ephemeral_id": "adebf164-0b0d-450f-9a50-11138e519a27",
"id": "0925282e-319e-49e0-952e-dc06ba2e0c43",
"name": "<some_host>",
"type": "filebeat",
"hostname": "<some_host>"
}
},
"fields": {
"log.timestamp": [
"2021-02-10T11:33:54.000Z"
],
"@timestamp": [
"2021-02-10T11:33:54.000Z"
]
},
"highlight": {
"log.logger_name": [
"@kibana-highlighted-field@gw_nginx@/kibana-highlighted-field@"
]
},
"sort": [
1612956834000
]
}
What am I doing wrong? How do I configure Filebeat to send container.name, container.labels and container.image?
After looking at the Filebeat debug output and the paths on the filesystem, I resolved the issue.
Reason: the symlink /var/lib/docker -> /data/docker produces unexpected behavior.
Solution:
filebeat.inputs:
- type: container
  enabled: true
  paths:
    - '/data/docker/containers/*/*.log' # use the real path
  processors:
    - add_docker_metadata:
        match_source_index: 3 # index of the path segment that holds the container ID
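Why index 3 helps can be illustrated with a small model of the lookup. This is only a sketch: it assumes the processor splits the log path on "/", ignores empty segments, and uses a default index of 4 (the value the Filebeat documentation gives for /var/lib/docker/containers/<id>/*.log). The function name and the shortened container ID are hypothetical.

```python
def container_id_from_path(log_path, match_source_index=4):
    """Model of the lookup: split on '/', drop empty segments, pick one by index."""
    segments = [s for s in log_path.split("/") if s]
    return segments[match_source_index]


CID = "15facae2115e"  # shortened, hypothetical container ID

default_layout = f"/var/lib/docker/containers/{CID}/{CID}-json.log"
real_path = f"/data/docker/containers/{CID}/{CID}-json.log"

# Default layout: index 4 is the directory named after the container.
print(container_id_from_path(default_layout))
# Real path is one segment shorter: index 4 is now the log file name.
print(container_id_from_path(real_path))
# Index 3 points at the container directory again.
print(container_id_from_path(real_path, 3))
```

Under these assumptions the second call returns the file name ending in -json.log, which matches the container.id value in the event shown in the question. An ID of that form would not match any running container, which would explain the missing name, labels and image.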

How to directly mount external NFS share/volume in kubernetes(1.10.3)

I am using Kubernetes v1.10.3. I have an external NFS server which I am able to mount on any physical machine. I want to mount this NFS share directly into a pod/container. I tried, but I get an error every time. I don't want to use privileged mode. Kindly help me fix this.
ERROR: MountVolume.SetUp failed for volume "nfs" : mount failed: exit
status 32 Mounting command: systemd-run Mounting arguments:
--description=Kubernetes transient mount for /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
--scope -- mount -t nfs 10.225.241.137:/stagingfs/alt/ /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
Output: Running scope as unit run-43393.scope. mount: wrong fs type,
bad option, bad superblock on 10.225.241.137:/stagingfs/alt/, missing
codepage or helper program, or other error (for several filesystems
(e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program)
In some cases useful info is found in syslog - try dmesg | tail or so.
NFS server : mount -t nfs 10.X.X.137:/stagingfs/alt /alt
I added two things for the volume, but I get the error every time.
First:
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
Second:
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
---------------------complete yaml --------------------------------
{
"kind": "Deployment",
"apiVersion": "extensions/v1beta1",
"metadata": {
"name": "jboss",
"namespace": "staging",
"selfLink": "/apis/extensions/v1beta1/namespaces/staging/deployments/jboss",
"uid": "6a85e235-68b4-11e8-8181-00163eeb9788",
"resourceVersion": "609891",
"generation": 2,
"creationTimestamp": "2018-06-05T11:34:32Z",
"labels": {
"k8s-app": "jboss"
},
"annotations": {
"deployment.kubernetes.io/revision": "2"
}
},
"spec": {
"replicas": 1,
"selector": {
"matchLabels": {
"k8s-app": "jboss"
}
},
"template": {
"metadata": {
"name": "jboss",
"creationTimestamp": null,
"labels": {
"k8s-app": "jboss"
}
},
"spec": {
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
"containers": [
{
"name": "jboss",
"image": "my.abc.com/alt:7.1_1.1",
"resources": {},
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent",
"securityContext": {
"privileged": true
}
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": "25%",
"maxSurge": "25%"
}
},
"revisionHistoryLimit": 10,
"progressDeadlineSeconds": 600
},
"status": {
"observedGeneration": 2,
"replicas": 1,
"updatedReplicas": 1,
"readyReplicas": 1,
"availableReplicas": 1,
"conditions": [
{
"type": "Available",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:45Z",
"lastTransitionTime": "2018-06-05T11:35:45Z",
"reason": "MinimumReplicasAvailable",
"message": "Deployment has minimum availability."
},
{
"type": "Progressing",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:46Z",
"lastTransitionTime": "2018-06-05T11:34:32Z",
"reason": "NewReplicaSetAvailable",
"message": "ReplicaSet \"jboss-8674444985\" has successfully progressed."
}
]
}
}
Regards
Anupam Narayan
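As stated in the error log:
for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program
According to a related question, you might be missing the nfs-common package, which you can install using sudo apt install nfs-common. The kubelet runs the mount command on the node itself, so the package must be installed on every node where the pod may be scheduled.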
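A quick way to check a node for the helper mentioned in the error message is sketched below. The function name and the searched directories are assumptions for illustration. The script has to run on the Kubernetes node, not inside the pod.

```python
import os


def find_mount_helper(fs_type="nfs", search_dirs=("/sbin", "/usr/sbin")):
    """Return the path of an executable mount.<fs_type> helper, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, f"mount.{fs_type}")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    helper = find_mount_helper()
    if helper:
        print(f"found {helper}")
    else:
        print("mount.nfs helper not found; the NFS client package is probably missing")
```

If the helper is reported as missing, installing the NFS client package on that node is the likely fix.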

How Docker will resolve hostname or IP present in properties file?

I have 2 Spring Boot microservice applications: a web application and a metastore application. This is the properties file for my web application.
spring:
  thymeleaf:
    prefix: classpath:/static/
  application:
    name: web-server
  profiles:
    active: native
server:
  port: ${port:8383}
---
host:
  metadata: http://10.**.**.***:5011
Dockerfile for web application:
FROM java:8-jre
MAINTAINER **** <******>
ADD ./ms.console.ivu-ivu.1.0.1.jar /app/
CMD chmod +x /app/*
CMD ["java","-jar", "/app/ms.console.web-web.1.0.1.jar"]
EXPOSE 8383
Dockerfile for metadata application:
FROM java:8-jre
MAINTAINER ******* <********>
ADD config/* /deploy/config/
CMD chmod +x ./deploy/config/*
COPY ./ms.metastore.1.0.1.jar /deploy/
CMD chmod +x ./deploy/ms.metastore.1.0.1.jar
CMD ["java","-jar","./deploy/ms.metastore.1.0.1.jar"]
EXPOSE 5011
I am using Mesos and Marathon for cluster management. The Marathon app definition for the metastore is:
{
"id": "/ms-metastore",
"cmd": null,
"cpus": 1,
"mem": 2000,
"disk": 0,
"instances": 0,
"acceptedResourceRoles": [
"*"
],
"container": {
"type": "DOCKER",
"docker": {
"forcePullImage": true,
"image": "*****/****:ms-metastore",
"parameters": [],
"privileged": true
},
"volumes": [],
"portMappings": [
{
"containerPort": 5011,
"hostPort": 0,
"labels": {},
"protocol": "tcp",
"servicePort": 10000
}
]
},
"networks": [
{
"mode": "container/bridge"
}
],
"portDefinitions": [],
"fetch": [
{
"uri": "file:///etc/docker.tar.gz",
"extract": true,
"executable": false,
"cache": false
}
]
}
Marathon app definition for the web application:
{
"id": "/ms-console",
"cmd": null,
"cpus": 1,
"mem": 2000,
"disk": 0,
"instances": 0,
"acceptedResourceRoles": [
"*"
],
"container": {
"type": "DOCKER",
"docker": {
"forcePullImage": true,
"image": "****/****:ms-console",
"parameters": [],
"privileged": true
},
"volumes": [],
"portMappings": [
{
"containerPort": 8383,
"hostPort": 0,
"labels": {},
"protocol": "tcp",
"servicePort": 10000
}
]
},
"networks": [
{
"mode": "container/bridge"
}
],
"portDefinitions": [],
"fetch": [
{
"uri": "file:///etc/docker.tar.gz",
"extract": true,
"executable": false,
"cache": false
}
]
}
In the web application I connect to the metastore using an IP address that is hard-coded in the properties file. I created Docker images for both and ran them on my server. The metastore server is now running on a different machine, so my web application is unable to reach this IP.
All you need to do is publish port 5011 as a host port on the machine running the metadata server, using -p:
docker run -d -p 5011:5011 metadata_image ....
Your web application should then be able to access the metadata server at http://$different_machine_ip:5011/, where $different_machine_ip is the IP of the machine running the metadata server.
However, since the two are tightly coupled, I would suggest running the web app and the metadata server on the same machine if your metadata server is stateless.
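To confirm from the web application's machine that the published port is reachable, a small TCP check can help. This is only a sketch: the function name is hypothetical, and a successful connection shows only that something is listening on the port, not that the metadata service itself is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

For example, can_connect(ip, 5011) with the metadata machine's IP should return True once the container is started with -p 5011:5011.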

Mesos Marathon(ctl) Debugging - "Abnormal executor termination: unknown container"

I'm still new to Mesos, but am trying to figure out the best way to debug a Mesos application I'm attempting to develop. I'm getting the error message "Abnormal executor termination: unknown container" through the web application, and am unsure how to get more descriptive error messages to figure out what's going on. The error message would seem to indicate it can't find the Docker image, but I know for a fact it's referencing the correct image that is installed and running.
{
"id": "pgprimary",
"cmd": null,
"cpus": 1,
"mem": 128,
"disk": 0,
"instances": 1,
"container": {
"docker": {
"image": "example/postgres:centos7-10.0-1.6.0",
"network": "BRIDGE",
"parameters": [{
"key": "hostname",
"value": "pgprimary"
}],
"portMappings": [
]
},
"type": "DOCKER",
"volumes": [
{
"hostPath": "/mnt/nfsfileshare/pgdata",
"containerPath": "/pgdata",
"mode": "RW"
}
]
},
"env": {
"PG_MODE": "primary",
"PG_USER": "testuser",
"PG_PASSWORD": "testuser",
"PG_DATABASE": "userdb",
"PG_ROOT_PASSWORD": "password",
"PG_PRIMARY_USER": "primaryuser",
"PG_PRIMARY_PASSWORD": "password",
"PG_PRIMARY_PORT": "5432"
},
"labels": {},
"healthChecks": [
{
"protocol": "COMMAND",
"command": {
"value": "/usr/pgsql-10/bin/pg_isready --host=pgprimary.marathon.mesos"
},
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
]
}
The command I'm using to deploy the Marathon app:
marathonctl -h http://10.0.2.15:8080 app create postgres.json
It is not the image that Marathon cannot find, but the Docker containerizer.
Enable the Docker containerizer on each agent, then restart the mesos-slave service so the setting takes effect:
echo 'docker,mesos' > /etc/mesos-slave/containerizers
See "Provisioning Containers with the Docker Containerizer":
https://mesosphere.github.io/marathon/docs/native-docker.html

How to pull docker image with marathon which need to be authorized

I want to deploy a Docker container with Marathon. If the image does not require authorization, it is pulled normally, but when I try to pull an image from a repository that requires authorization, the task deployment fails with this response:
Failed to launch container: Failed to run 'docker -H unix:///var/run/docker.sock pull example.com/web:laest': exited with status 1; stderr='Error response from daemon: repository example.com/web not found: does not exist or no pull access '
I changed the permissions of the /var/run/docker.sock file to 777 on the node and on the master, but the issue persists, so permissions do not seem to be the root cause. If I run "docker login" on the node and pull the image manually, the Marathon task then runs correctly. My Marathon JSON is below:
{
"id": "/web",
"cmd": "docker login --username='sam' --passwoer='123456' example.com/web:latest",
"cpus": 0.3,
"mem": 32,
"disk": 0,
"instances": 1,
"env": {
"EMAIL_USE_TLS": "False",
"DATABASE_URI": "mysql://user:123456@RDS:3306/test"
},
"container": {
"type": "DOCKER",
"volumes": [
{
"containerPath": "/data/supervisor/",
"hostPath": "/data/workspace/logs/supervisor/",
"mode": "RW"
}
],
"docker": {
"image": "daocloud.io/gizwits2015/gwaccounts:1.6.0",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 0,
"hostPort": 0,
"servicePort": 10000,
"protocol": "tcp",
"labels": {}
}
],
"privileged": false,
"parameters": [
{
"key": "add-host",
"value": "RDS:10.66.125.161"
}
],
"forcePullImage": false
}
},
"portDefinitions": [
{
"port": 10000,
"protocol": "tcp",
"name": "default",
"labels": {}
}
]
}
How can I pull an image that requires authorization with Marathon?
You should read: https://mesosphere.github.io/marathon/docs/native-docker-private-registry.html
Follow step 1, and in step 2 replace the uris section with
"fetch" : [
{
"uri" : "https://path.to/file",
"extract" : true,
"outputFile" : "dockerConfig.tar.gz"
}
]
I've written a more detailed explanation here: http://blog.itaysk.com/2017/05/22/using-a-custom-private-docker-registry-with-marathon
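The archive that the fetch entry points at can be produced in several ways. The sketch below assumes that step 1 of the linked guide amounts to archiving the .docker directory that docker login creates in the user's home directory. The function name is hypothetical, and the default output name simply mirrors the outputFile value above.

```python
import tarfile
from pathlib import Path


def archive_docker_config(home_dir, output_file="dockerConfig.tar.gz"):
    """Pack <home_dir>/.docker into a gzip-compressed tar archive."""
    docker_dir = Path(home_dir) / ".docker"
    config = docker_dir / "config.json"
    if not config.is_file():
        raise FileNotFoundError(f"{config} not found; run docker login first")
    with tarfile.open(str(output_file), "w:gz") as archive:
        # arcname keeps ".docker" as the top-level entry, so extracting the
        # archive recreates .docker/config.json in the task sandbox.
        archive.add(str(docker_dir), arcname=".docker")
    return Path(output_file)
```

The resulting file would then be uploaded to a location every agent can reach and referenced in the uri field. It contains registry credentials, so it should not be placed somewhere publicly readable.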

Resources