GCP Cloud Run Docker service not reachable when created by continuous deployment - docker

I am currently getting myself a bit familiar with GCP, and I have set up a Cloud Run Docker-based web service with a simple Python script:
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/test", methods=['GET'])
def get_all_users():
    x = 5
    y = 5
    return jsonify("{} + {} = {}".format(x, y, x + y))

if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True, port=8080)
with this Dockerfile:
FROM python:3.8
ENV PORT 8080
ENV PYTHONUNBUFFERED True
RUN pip install flask gunicorn
WORKDIR /server
COPY main.py .
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
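As a sanity check, the image can also be built and exercised locally before any Cloud Run deployment (the tag name cloudrun-test below is arbitrary):
docker build -t cloudrun-test .
docker run --rm -e PORT=8080 -p 8080:8080 cloudrun-test
# in a second shell:
curl http://localhost:8080/test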
In the Cloud Run menu, when I use the option to deploy a revision from an existing container, I am able to connect to my web service, e.g. via Postman, without any problem.
However, if I use the continuous deployment of new revisions, I cannot reach my service anymore.
In both cases my build triggers are activated and run through correctly. So basically, in the first option above I use a valid Docker image, while the second option has problems.
One point I discovered is that every time I choose continuous deployment, it reports an error once I have set it up.
I can fix this in the trigger, because it seems that the trigger tries to build the image using a yaml file by default.
I can change this to Dockerfile, and then the build passes and the error message disappears. However, no new container is deployed, even if I push a new commit to the target branch (the trigger is executed, but the Cloud Run service is not affected).
As said, if I do not use continuous deployment but just deploy a Cloud Run instance with a Docker image, it works. So the Dockerfile and the Python script are fine, but I have no idea what I did wrong.
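For comparison, a cloudbuild.yaml along these lines is roughly what a trigger needs in order to both build from the Dockerfile and deploy the resulting image to Cloud Run (a sketch only; the service name, image path, and region are placeholders):
steps:
  # Build the image from the Dockerfile
  - name: "gcr.io/cloud-builders/docker"
    args: ["build", "-t", "gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA", "."]
  # Push it to the registry
  - name: "gcr.io/cloud-builders/docker"
    args: ["push", "gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA"]
  # Point the Cloud Run service at the freshly built image
  - name: "gcr.io/google.com/cloudsdktool/cloud-sdk"
    entrypoint: gcloud
    args: ["run", "deploy", "my-service",
           "--image", "gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA",
           "--region", "us-central1", "--platform", "managed"]
images: ["gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA"]
Without a deploy step like the last one, a trigger that only builds the image will never update the Cloud Run service.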

Related

How to filter Google Logging by COMMIT_SHA for a container built through Google Build?

I have a Compute Engine VM running COS (Container-Optimized OS). The container is Debian running a Python 3 application.
I have a trigger in Google Cloud Build which takes my source code, builds a Docker image, and places the image in Artifact Registry. Then I go to Compute Engine and start a VM using the container option, specifying the container I want.
Once the container is running, I can see the logs as expected in Logging. There are cases where I could have multiple image versions running and I'd like to filter by $COMMIT_SHA.
To be clear: I want to filter by log items from a running Google Compute VM and NOT the logs from Google Build.
I've tried:
Filter for google cloud build with a given commit_sha
How to pass build arguments to docker build command?
Using a cloudbuild.yaml passing in the $COMMIT_SHA to an ENV variable and attempting to have Python log the string:
cloudbuild.yaml:
steps:
  - name: "gcr.io/cloud-builders/docker"
    args: ["build", "-t",
           "us-central1-docker.pkg.dev/.../my_python_app:latest",
           "--build-arg", "gcp_build_sha=${COMMIT_SHA}",
           "."]
images: ["us-central1-docker.pkg.dev/.../my_python_app:latest"]
Dockerfile:
FROM debian:latest
WORKDIR /application
COPY . .
# A build arg must be declared with ARG before ENV can reference it
ARG gcp_build_sha
ENV BUILD_SHA=$gcp_build_sha
...
CMD python3 my_python_app.py
my_python_app.py:
import logging
import os

logging.basicConfig(level=logging.DEBUG)
...
logging.info("BUILD SHA IS: %s", os.getenv("BUILD_SHA"))
...
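With the SHA baked into the environment and logged at startup like this, a Logs Explorer filter along these lines should isolate the log entries of one image version (a sketch; abc1234 stands in for an actual commit SHA):
resource.type="gce_instance"
textPayload:"BUILD SHA IS: abc1234"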

Cloud Run Deploy fails: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable

I have a project which I had previously deployed successfully to Google Cloud Run, set up with a trigger such that pushing to the repo's main branch on GitHub would automatically deploy it. It worked great.
Then I renamed the GitHub repo, which meant deleting the trigger and creating a new one, and now I cannot get it working again.
Every time, the build succeeds but deployment fails with this error in Cloud Build:
Step #2 - "Deploy": ERROR: (gcloud.run.services.update) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
I have not changed anything other than the repo name, leading me to believe the fix is not with my code, but I tried some changes there anyway.
I have looked into the solutions set forth in this post. However, I believe I am listening on the correct port.
My app is using Python and Flask, and contains this:
if __name__ == "__main__":
    app.run(debug=False, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
This should use the PORT env var (8080) and otherwise default to 8080. I also tried hard-coding port=8080.
I tried explicitly exposing the port in the Dockerfile, which also did not work:
FROM python:3.7
#Copy files into docker image dir, and make that the current working dir
COPY . /docker-image
WORKDIR /docker-image
RUN pip install -r requirements.txt
CMD ["flask", "run", "--host", "0.0.0.0"]
EXPOSE 8080
Cloud Run does seem to be using port 8080. If I dig into the response, I see this nested under Response.spec.container.0:
ports: [
  0: {
    containerPort: 8080
    name: "http1"
  }
]
All that said, if I look at the logs, it shows "Now running on Port 5000".
I have no idea where that Port 5000 is coming from or being set, but trying to change the ports in Python/Flask and the Dockerfile to 5000 leads to the same errors.
How do I get it to run on Port 8080? It's very strange to me that this was working FINE prior to renaming the repo and creating a new trigger. How is this setup different? The Trigger does not give an option to set the port so I'm not sure how that caused this error.
You have mixed two things up. The flask run command's default port is indeed 5000. If you want to change it, you need to add the --port parameter to your flask run command:
CMD ["flask", "run", "--host", "0.0.0.0", "--port", "8080"]
In addition, flask run uses the Flask runtime and completely ignores the standard Python entrypoint if __name__ == "__main__":. If you want to use that entrypoint, use the Python runtime instead:
CMD ["python", "<main file>.py"]
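If you would rather keep flask run but honor the PORT variable that Cloud Run injects, one option is a shell-form CMD, since the exec form above does not expand environment variables (a sketch):
# Shell form so that $PORT is expanded when the container starts
CMD exec flask run --host 0.0.0.0 --port $PORT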

Invoking a Lambda function that has a Python library dependency in Docker container, deployed with Zappa; 500 response code error

I am looking to invoke AWS Lambda functions, deployed using Zappa. The Lambda functions will have dependencies that are too large for the usual zip file approach (even with "slim_handler": true).
To do this I am using Zappa with Docker containers, where the container holds the larger dependencies and the Lambda functions call upon the container as needed.
The application is a Flask app, with the usual app.py code that routes to a function. Deploying a dockerized flask app with Zappa is pretty straightforward, and I can do this successfully, as long as there is only vanilla Python in the app.py. But as soon as I add Python that depends on a library, I get the following error when attempting to invoke the Lambda function via the AWS API Gateway (as setup by Zappa):
Warning! Status check on the deployed lambda failed. A GET request to '/' yielded a 500 response code.
Here are the steps I am taking. I am using Feast as the Python library I want to get working, with the dockerized Flask app pushed to Lambda with Zappa.
I create the Dockerfile that works alongside Zappa for deploying to AWS Lambda.
I am using the steps outlined here to create this Dockerfile:
FROM amazon/aws-lambda-python:3.8

ARG FUNCTION_DIR="/var/task/"

COPY ./ ${FUNCTION_DIR}

# Set up the Python environment
RUN pip install pipenv
RUN pip install -r requirements.txt

# Grab the Zappa handler.py and put it in the working directory
RUN ZAPPA_HANDLER_PATH=$( \
        python -c "from zappa import handler; print(handler.__file__)" \
    ) \
    && echo $ZAPPA_HANDLER_PATH \
    && cp $ZAPPA_HANDLER_PATH ${FUNCTION_DIR}

CMD ["handler.lambda_handler"]
Briefly, this uses a base image provided by AWS, copies my application code into the image, and sets up my Python environment with pipenv. The Zappa specific steps are at the bottom, where handler.py is manually added to the Docker image. "The lambda_handler function contains all the Zappa magic that routes API Gateway requests to the corresponding Flask function." The handler is specified in the CMD so that it runs whenever a Docker container using this image is started.
Here is my requirements.txt file:
zappa
flask
feast
Since I want Feast to be available, I install it inside the virtual environment:
Installing Feast
These steps are from here:
Install Feast:
pip install feast
Create a Feature Repo:
feast init feature_repo
cd feature_repo
Register feature definitions and deploy the feature store:
feast apply
Load features into my online store:
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
To test that Feast is successfully deployed on Lambda, I will fetch the feature vectors for inference over the AWS API Gateway by invoking a Lambda function. So I need the necessary Feast code callable from Flask:
Fetching Feature Vectors for Inference
I set up the Feast code inside the usual Flask app.py:
from flask import Flask
from feast import FeatureStore

app = Flask(__name__)

@app.route('/')
def index():
    store = FeatureStore(repo_path="feature_repo/")
    feature_vector = store.get_online_features(
        feature_refs=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
            "driver_hourly_stats:avg_daily_trips",
        ],
        entity_rows=[{"driver_id": 1001}],
    ).to_dict()
    return feature_vector

# We only need this for local development.
if __name__ == '__main__':
    app.run()
The expected response from this is:
{
    'driver_id': [1001],
    'driver_hourly_stats__conv_rate': [0.49274],
    'driver_hourly_stats__acc_rate': [0.92743],
    'driver_hourly_stats__avg_daily_trips': [72],
}
I also install Zappa inside the virtual environment:
pipenv install zappa
...here is my zappa_settings.json:
"zappa_test": {
"app_function": "app.app",
"project_name": "app",
"runtime": "python3.8",
"s3_bucket": "zappa-f8s0d8fs0df",
"aws_region" : "us-east-2",
"lambda_description": "Zappa + Docker + Flask"
}
The S3 bucket number shown here is fake.
Also installing Flask inside the virtual environment:
pipenv install flask
Building the Image
I run the following to build the image:
zappa save-python-settings-file zappa_test
docker build -t zappa_test:latest .
I then push to ECR:
aws ecr create-repository --repository-name zappa_test --image-scanning-configuration scanOnPush=true
Re-tag the image:
docker tag zappa_test:latest XXXXX.dkr.ecr.us-east-1.amazonaws.com/zappa_test:latest
Get authenticated to push to ECR:
aws ecr get-login-password | docker login --username AWS --password-stdin XXXXX.dkr.ecr.us-east-1.amazonaws.com
...and push the image:
docker push XXXXX.dkr.ecr.us-east-1.amazonaws.com/zappa_test:latest
Finally, I deploy the lambda function:
zappa deploy zappa_test -d XXXXX.dkr.ecr.us-east-1.amazonaws.com/zappa_test
So altogether, this is how one uses Zappa to push a Docker image to AWS Lambda in order to invoke a Lambda function over the AWS API Gateway. In this case, I am trying to invoke a Lambda function that fetches Feast data via Flask, with the dependencies made available inside a Docker container.
But, upon running the deploy command I get:
Warning! Status check on the deployed lambda failed. A GET request to '/' yielded a 500 response code.
IMPORTANT:
I have confirmed that all of these steps work if I put something simple inside the app.py like this:
@app.route('/')
def index():
    return "Hello, world!", 200
Everything is credentialized on AWS correctly, the image deploys, and I can call it perfectly through the API endpoint provided by Zappa (via AWS Gateway). I only get the mentioned error when I put the Feast code inside app.py, as I've shown above.
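For reference, the Lambda-side traceback behind a 500 like this can usually be pulled up by tailing the function's CloudWatch logs through Zappa (assuming the stage name from the settings above):
zappa tail zappa_test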

gcloud Docker error caused by taking user input

I am trying to deploy a Python app using Docker on Google Cloud.
After typing the command gcloud run deploy --image gcr.io/id/name, I get this error:
ERROR: (gcloud.run.deploy) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
Logs explorer:
TEST_MODE = input()
EOFError: EOF when reading a line
I know this error is caused by trying to take user input, and with Docker this command solves the error:
docker run -t -i
Any idea how to run this with gcloud?
Your example does not run a server and so it's not accepted by Cloud Run.
Cloud Run expects a server to be running on PORT (generally this evaluates to 8080 but you should not assume this).
While it's reasonable to want to run arbitrary containers on Cloud Run, the service expects something to respond via HTTP.
One option would be to simply jam an HTTP server into your container that listens on PORT and run your Python app alongside it, but Python is single-threaded, so this is not easy to do. Plus, it's considered an anti-pattern to run multiple processes in a single container.
Therefore I propose the following:
Rewrite your app to return the input as an HTTP GET:
main.py:
from flask import Flask

app = Flask(__name__)

@app.route('/hello/<name>')
def hello(name):
    return "Hello {name}".format(name=name)

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080, debug=True)
Test it:
python3 main.py
* Running on http://127.0.0.1:8080/ (Press CTRL+C to quit)
NOTE Flask is running on localhost (127.0.0.1) and on port 8080. We need to change the host when we run it in a container.
NAME="Freddie"
curl http://localhost:8080/hello/${NAME}
Hello Freddie
Or browse: http://localhost:8080/hello/Freddie
Containerize this:
Dockerfile:
FROM python:3.10-rc-slim
WORKDIR /app
RUN pip install flask gunicorn
COPY main.py .
ENV PORT 8080
CMD exec gunicorn --bind 0.0.0.0:$PORT main:app
NOTE ENV PORT 8080 sets the environment variable PORT to a value of 8080 unless we specify otherwise (we'll do that next).
NOTE The image uses gunicorn as a runtime host for Flask. This time the Flask service is bound to 0.0.0.0, which permits it to be accessible from outside the container (which we need), and it uses the value of PORT.
Then:
# Build
docker build --tag=66458821 --file=./Dockerfile .
# Run
PORT="9999"
docker run \
--rm --interactive --tty \
--env=PORT=${PORT} \
--publish=8080:${PORT} \
66458821
[INFO] Starting gunicorn 20.0.4
[INFO] Listening at: http://0.0.0.0:9999 (1)
NOTE Because of --env=PORT=${PORT}, Flask now runs on 0.0.0.0:9999, but we remap this to port 8080 on the host. This is just to show how the PORT variable is used by the container.
Test it (using the commands as before)!
Publish it and gcloud run deploy ...
Test it!

Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable

I built my container image, but when I try to deploy it from the gcloud command line or the Cloud Console, I get the following error: "Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable."
In your code, you probably aren't listening for incoming HTTP requests, or you're listening for incoming requests on the wrong port.
As documented in the Cloud Run container runtime contract, your container must listen for incoming HTTP requests on the port that is defined by Cloud Run and provided in the $PORT environment variable.
If your container fails to listen on the expected port, the revision health check will fail, the revision will be in an error state and the traffic will not be routed to it.
For example, in Node.js with Express, you should use:
const port = process.env.PORT || 8080;
app.listen(port, () => {
  console.log('Hello world listening on port', port);
});
In Go:
port := os.Getenv("PORT")
if port == "" {
    port = "8080"
}
log.Fatal(http.ListenAndServe(fmt.Sprintf(":%s", port), nil))
In Python:
app.run(port=int(os.environ.get("PORT", 8080)), host='0.0.0.0', debug=True)
Another reason may be the one I observed: the Docker image may not contain the required code to run the application.
I had a Node application written in TypeScript. To dockerize it, all I needed to do was compile the code with tsc and run docker build, but I thought gcloud builds submit would take care of that: picking up the compiled code as suggested by the Dockerfile in conjunction with the .dockerignore, building it, and submitting it to the repository.
But all it actually did was copy my source code and submit it to Cloud Build, where, per the Dockerfile, it dockerized my source code rather than the compiled code.
So remember to include a build step in the Dockerfile if your source code is in a language that requires compilation, as sketched below.
Keep in mind that adding the build step to the Dockerfile can increase the image size every time you push an image to the repository; that storage is what Google charges you for.
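A minimal multi-stage sketch, assuming a TypeScript app whose tsc output lands in ./dist with an entrypoint of dist/index.js (both assumptions here); the build stage compiles inside the image while the toolchain stays out of the final image, which also keeps the size down:
# Build stage: compile the TypeScript source inside the image
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npx tsc            # emits compiled JS into ./dist (assumed tsconfig outDir)

# Runtime stage: ship only the compiled output and its dependencies
FROM node:18-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
ENV PORT 8080
CMD ["node", "dist/index.js"]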
Another possibility is that the Docker image ends with a command that takes time to complete. By the time deployment starts, the server is not yet running, and the health check hits a blank.
What kind of command would that be? Usually any command that runs the server in dev mode. For Scala/sbt it would be sbt run; in Node it would be something like npm run dev. In short, make sure the container runs only the packaged production build.
I was exposing a port in the Dockerfile; removing that fixed my problem. Google injects the PORT environment variable, so the project will pick that variable up.
We can also specify the port number used by the image from the command line.
If we are using Cloud Run, we can use the following:
gcloud run deploy --image gcr.io/<PROJECT_ID>/<APP_NAME>:<APP_VERSION> --max-instances=3 --port <PORT_NO>
Where
<PROJECT_ID> is the project ID
<APP_NAME> is the app name
<APP_VERSION> is the app version
<PORT_NO> is the port number
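For example (the project ID, app name, version, and port below are placeholders):
gcloud run deploy --image gcr.io/my-project/my-app:v1 --max-instances=3 --port 8080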
Cloud Run generates a default YAML file which has a hard-coded default port in it:
spec:
  containerConcurrency: 80
  timeoutSeconds: 300
  containers:
    - image: us.gcr.io/project-test/express-image:1.0
      ports:
        - name: http1
          containerPort: 8080
      resources:
        limits:
          memory: 256Mi
          cpu: 1000m
So we need to listen on that same port 8080, or change the containerPort in the YAML file and redeploy.
A possible solution could be:
build locally
push the image to Google Cloud
deploy on Cloud Run
With commands:
docker build -t gcr.io/project-name/image-name .
docker push gcr.io/project-name/image-name
gcloud run deploy tag-name --image gcr.io/project-name/image-name
