Managing traffic on Google Cloud Run - google-cloud-run

I'm currently using Cloud Build & Cloud Run in my project.
For CI/CD, I use a cloudbuild.yaml file to define the steps needed to deploy a new revision of my Cloud Run service.
After several issues with traffic management, I had to add a step to my Cloud Build file specifying that 100% of the traffic should go to the latest built revision. That is weird, but it works, so... be it 👍
cloudbuild.yaml step
# Allocate 100% of the traffic to that new revision
- name: gcr.io/cloud-builders/gcloud
  args: ['run', 'services', 'update-traffic', '${_APP_NAME}', '--to-revisions=LATEST=100', '--platform', 'gke', '--cluster', 'xxxxxx', '--cluster-location', 'xxxxxxxxxxx', '--namespace', 'xxxxxxxx']
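For context, a minimal sketch of the deploy step that would typically precede that traffic step (the image name is a placeholder and the cluster values are masked, as above):
# Deploy the new revision to Cloud Run for Anthos/GKE
- name: gcr.io/cloud-builders/gcloud
  args: ['run', 'deploy', '${_APP_NAME}', '--image', 'gcr.io/$PROJECT_ID/${_APP_NAME}:$COMMIT_SHA', '--platform', 'gke', '--cluster', 'xxxxxx', '--cluster-location', 'xxxxxxxxxxx', '--namespace', 'xxxxxxxx']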
Unfortunately, even though the traffic is set correctly, the old revision is always kept "alive" 👎.
What is strange is that this does not happen when I run the same gcloud command remotely.
Have you run into this issue before?
Thanks for the help :)

Related

How can I ensure traffic is being served only on my new Cloud Run service deployment?

I am seeing in my Stackdriver logs that I am running an old revision of a Cloud Run service, while the console shows only one newer revision as live.
In this case I have a Cloud Run container built in one GCP project but deployed in a second project, using the fully specified image name. (My attempts at Terraform were sidetracked into an auth catch-22, so until I have restored my determination, I am deploying manually.) I don't know whether this wrinkle of the two projects is relevant to my question; otherwise all the auth works.
In brief, why might I be seeing old deployments receiving traffic minutes after the new deployment is made? Even more than 30 minutes later, traffic is still reaching the old deployment.
There are a few things to take into account here:
Try explicitly telling Cloud Run to migrate all traffic to the latest revision. You can do that by adding the following to your service .yaml file:
metadata:
  name: SERVICE
spec:
  ...
  traffic:
  - latestRevision: true
    percent: 100
Try always adding the :latest tag when building a new image, so instead of only having, let's say, gcr.io/project/newimage, it would be gcr.io/project/newimage:latest. This way you ensure the latest image is being used rather than a previously, automatically assigned tag.
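As a sketch of what that could look like in cloudbuild.yaml (the image name here is just an example), tag both the commit SHA and latest and push both:
steps:
# Build the image with an explicit :latest tag in addition to the commit SHA
- name: gcr.io/cloud-builders/docker
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/newimage:latest', '-t', 'gcr.io/$PROJECT_ID/newimage:$COMMIT_SHA', '.']
# Have Cloud Build push both tags after the steps complete
images:
- 'gcr.io/$PROJECT_ID/newimage:latest'
- 'gcr.io/$PROJECT_ID/newimage:$COMMIT_SHA'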
If neither fixes your issue, then please provide the logs, as there might be something useful that indicates the root cause. (Also let us know if you are using any caching configuration.)
You can tell Cloud Run to route all traffic to the latest instance with gcloud run services update-traffic [SERVICE NAME] --to-latest. That will route all traffic to the latest deployment, and update the traffic allocation as you deploy new instances.
You might not want to use this if you need to validate the service after deployment and before "opening the floodgates", or if you're doing canary deployments.
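For the canary case, a hedged sketch of what a manual split could look like in the service's traffic block (the revision names here are made up):
traffic:
- revisionName: SERVICE-00002-abc
  percent: 10
- revisionName: SERVICE-00001-xyz
  percent: 90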

Getting timeout on Cloud Build: tried all sorts of things

Hello, I have exhausted all sorts of options I found on the web and nothing seems to work for me.
I am pushing changes to a repo which is already set up for Cloud Build.
This is my YAML file for the Cloud Build trigger:
steps:
- name: "gcr.io/cloud-builders/gcloud"
  args: ["app", "deploy", ".yaml"]
  timeout: 1200s
timeout: 1500s
It also triggers one extra "Google Cloud Storage" build in parallel, which I can't figure out where it is coming from. It has no ties to the code repo I have. This build times out in 10 minutes and hence causes my main build to fail.
Even if I set a timeout of 1200s on my step, it seems to have no impact on this Cloud Storage build. It times out at 10 minutes.
TIA
I already have timeouts at the step and build level which are higher than 10 minutes, but this does not seem to do the magic on the "Google Cloud Storage" build.
I can tell you why you have this error, but not how to fix it...
Why? Because the gcloud app deploy command uses Cloud Build:
During deployment, the Cloud Build service builds a container image of your application to run in the App Engine standard environment. Learn more in Managing build images.
The Cloud Build default timeout parameter is 10 minutes. That's why.
How to fix?
If you use the flexible environment, try building your container separately. Then create a Dockerfile which only uses your built container (FROM gcr.io/MyProjectID/containerName) and use runtime: custom in your app.yaml file.
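A minimal sketch of that app.yaml (assuming the flexible environment; the Dockerfile next to it would contain only the FROM line above):
# app.yaml pointing App Engine flexible at a prebuilt container
runtime: custom
env: flex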
If you use the standard environment, what is your service? Is it big? Does it have a lot of dependencies?
This solved it:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args: ['-c', 'gcloud config set app/cloud_build_timeout 1800 && gcloud app deploy my-service/my-config.yaml']
It seems the step-level timeout did not work for me, but this one is cool. Phewwww.
Thanks all, guys!

Spring Cloud Data Flow with Docker images from private repo - imagePullSecrets not being used. Can't pull image

So I am unable to launch a custom task application stored in a private Docker repo. All my Docker images in Kubernetes are pulled from this private repo, so the imagePullSecrets works fine, but it seems it is not being used by Spring Cloud Data Flow when deploying the task to Kubernetes. If I inspect the pod, there is no imagePullSecret set.
The error I get is:
xxxxx- no basic auth credentials
The server has been deployed with the ENV variable which the guide states will fix this
- name: SPRING_CLOUD_DEPLOYER_KUBERNETES_IMAGE_PULL_SECRET
  value: regcred
I have even tried to add custom properties on a per-application basis.
I have read through the guide HERE
I am running the following versions:
Kubernetes 1.15 &
I have been stuck on this issue for weeks and simply can't find a solution. I'm hoping somebody has seen this issue and managed to solve it before?
Is there something else I'm missing?
So I found that if I do the following, it pulls the image (it seems I put this in the wrong place, as the documentation doesn't clearly specify where and how).
But using the global environment variable as stated above still does not seem to work.
Using the environment variable SPRING_CLOUD_DEPLOYER_KUBERNETES_IMAGE_PULL_SECRET also didn't work for me.
An alternative that made it work in my case is adding the following to the application.yaml of the SCDF Server in Kubernetes:
application.yaml
spring:
  cloud:
    dataflow:
      task:
        platform:
          kubernetes:
            accounts:
              default:
                imagePullSecret: <your_secret>
Or, when you are using a custom SCDF image like I do, you can of course specify it as an argument:
deployment.yaml
[...]
command: ["java", "-jar", "spring-cloud-dataflow-server.jar"]
args:
- --spring.cloud.dataflow.task.platform.kubernetes.accounts.default.imagePullSecret=<your_secret>
[...]
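Either way, you can check whether the setting took effect by inspecting a launched task pod: its spec should then contain the secret, roughly like this (using the regcred name from above):
spec:
  imagePullSecrets:
  - name: regcred
  ...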
More details on https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/

OSSEC_HIDS Kubernetes Deployment

Which would be the best HIDS (host-based intrusion detection system) to deploy on Kubernetes on Google Cloud Platform?
I want to build the Docker image on debian:stable-slim.
So I have been testing ossec-docker and wazuh-docker;
here are the repos, respectively:
OSSEC: https://github.com/Atomicorp/ossec-docker
WAZUH: https://github.com/wazuh/wazuh-docker
The wazuh-api=3.7.2-1 package is broken, as I am unable to get it to install on debian:stable-slim
with Node.js 6.10.0 or higher, even though it only needs Node.js version >= 4.6.0;
the API still fails to install.
I would need to know if anyone can suggest a host-based intrusion detection system which I can configure and deploy on Docker/Kubernetes. If you have a GitHub repo link, I would really appreciate it.
Wazuh has a repository for Kubernetes. Right now, it is focused on AWS, but I think you just need to change the volumes configuration (it is implemented for AWS EBS) and it should work in GCP.
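For example, the EBS-backed volumes could presumably be replaced with plain PersistentVolumeClaims that GKE backs with GCE persistent disks; a rough sketch (name and size are made up):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wazuh-manager-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi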

How does Spread know to update the image in Kubernetes?

I want to set up GitLab CD to Kubernetes, and I read this article.
However, I am wondering: how would my K8s cluster be updated with my latest Docker images?
For example, in my .gitlab-ci.yaml file I will have build, test, and release stages that ultimately update my cloud Docker images. By setting up the deploy stage as instructed in the article:
deploy:
  stage: deploy
  image: redspreadapps/gitlabci
  script:
    - null-script
would Spread then know to "magically" update my K8s cluster (perhaps by re-pulling all images and performing rolling updates), as long as I set up my directory structure of K8s resources as specified by Spread?
I don't have a direct answer, but from looking at the Spread project it seems pretty dead: the last commit was in August last year, there are a bunch of open issues, and it does not support any of the newer Kubernetes constructs (e.g. Deployments).
The typical way to update images in Kubernetes nowadays is to run a command like kubectl set image deployment/<deployment-name> <container>=<image>. This will in turn perform a rolling update on the Deployment, shutting down one Pod at a time and replacing it with the new image. See this doc.
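For illustration, the field such a rolling update changes in the Deployment manifest (the names and image below are hypothetical):
spec:
  template:
    spec:
      containers:
      - name: web
        # updated by: kubectl set image deployment/myapp web=gcr.io/my-project/myapp:v2
        image: gcr.io/my-project/myapp:v2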
Since Spread is from before that, I assume they must use a rolling update of a replication controller, with a command like kubectl rolling-update NAME -f FILE, picking up the new image from the configuration file in their project folder (assuming it changed). See this doc.
