Apache server runs with docker run but Kubernetes pod fails with CrashLoopBackOff

My application uses the apache2 web server. Due to restrictions in the Kubernetes cluster, I do not have root privileges inside the pod, so I have changed the default port of apache2 from 80 to 8080 to be able to run it as a non-root user.
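(For reference, a port change like this is typically done along these lines during the image build; this is only a sketch and assumes the standard Debian apache2 config layout.)
# switch the listen port from 80 to 8080 so apache2 can bind it without root
sed -ri 's/^Listen 80$/Listen 8080/' /etc/apache2/ports.conf
sed -ri 's/<VirtualHost \*:80>/<VirtualHost *:8080>/' /etc/apache2/sites-available/000-default.conf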
My problem is that once I build the Docker image and run it locally it works fine, but when I deploy it to the Kubernetes cluster it keeps failing with:
Action '-D FOREGROUND' failed.
resulting in CrashLoopBackOff.
So basically the apache2 server is not able to run in the pod as a non-root user, even though it runs fine locally with docker run.
Any help is appreciated.
I am attaching my deployment and service files for reference:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: &DeploymentName app
spec:
  replicas: 1
  selector:
    matchLabels: &appName
      app: *DeploymentName
  template:
    metadata:
      name: main
      labels:
        <<: *appName
    spec:
      securityContext:
        fsGroup: 2000
        runAsUser: 1000
        runAsGroup: 3000
      volumes:
        - name: var-lock
          emptyDir: {}
      containers:
        - name: *DeploymentName
          image: image:id
          ports:
            - containerPort: 8080
          volumeMounts:
            - mountPath: /etc/apache2/conf-available
              name: var-lock
            - mountPath: /var/lock/apache2
              name: var-lock
            - mountPath: /var/log/apache2
              name: var-lock
            - mountPath: /mnt/log/apache2
              name: var-lock
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 180
            periodSeconds: 60
          livenessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 300
            periodSeconds: 180
          imagePullPolicy: Always
          tty: true
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          envFrom:
            - configMapRef:
                name: *DeploymentName
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 1
              memory: 2Gi
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: &hpaName app
spec:
  maxReplicas: 1
  minReplicas: 1
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: *hpaName
  targetCPUUtilizationPercentage: 60
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app
  name: app
spec:
  selector:
    app: app
  ports:
    - protocol: TCP
      name: http-web-port
      port: 80
      targetPort: 8080
    - protocol: TCP
      name: https-web-port
      port: 443
      targetPort: 443
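One difference between the two environments is the pod's securityContext; the same non-root UID/GID can be reproduced locally with docker run (a sketch using the runAsUser/runAsGroup values from the deployment above and the image placeholder image:id):
# run the same image locally with the pod's non-root UID/GID
docker run --rm --user 1000:3000 -p 8080:8080 image:id
# permission errors on the apache2 log/lock/config directories
# usually show up immediately in the container output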

CrashLoopBackOff is a common error in Kubernetes, indicating a pod that keeps crashing and being restarted in a backoff loop.
The CrashLoopBackOff error can be caused by a variety of issues, including:
Insufficient resources: a lack of resources prevents the container from starting
Locked file: a file is already locked by another container
Locked database: the database is being used and locked by other pods
Failed reference: a reference to scripts or binaries that are not present in the container
Setup error: an issue with the init container setup in Kubernetes
Config loading error: the server cannot load its configuration file
Misconfigurations: a general file system misconfiguration
Connection issues: DNS or kube-dns cannot connect to a third-party service
Deploying failed services: an attempt to deploy services/applications that have already failed (e.g. due to a lack of access to other services)
To narrow down a Kubernetes CrashLoopBackOff error, start with the pod's events and the logs of the crashed container; the Kubernetes documentation on debugging pods covers this in more detail.
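A quick first pass (a generic sketch; replace the pod name with your own):
# events usually show why the container keeps restarting
kubectl describe pod <pod-name>
# logs of the last crashed container instance
kubectl logs <pod-name> --previous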

Related

Unable to configure k8 ingress on GKE to run solr

I am trying to set up Solr 8.0 on GKE. I can run it successfully on my local instance, but when I configured it on GKE it keeps returning a 502 error.
Here's my deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: solr
  namespace: api
  labels:
    app: solr
spec:
  replicas: 1
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: solr
  template:
    metadata:
      labels:
        app: solr
    spec:
      containers:
        - name: app
          image: solr:8
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8983
          resources:
            limits:
              cpu: 250m
              ephemeral-storage: 1Gi
              memory: 512Mi
            requests:
              cpu: 250m
              ephemeral-storage: 1Gi
              memory: 512Mi
          livenessProbe:
            initialDelaySeconds: 20
            httpGet:
              path: /
              port: http
Service:
apiVersion: v1
kind: Service
metadata:
  name: solr
  namespace: api
  labels:
    app: solr
spec:
  type: ClusterIP
  ports:
    - name: solr
      port: 8080
      targetPort: 8983
  selector:
    app: solr
and ingress:
- host: solr.*****.***
  http:
    paths:
      - pathType: ImplementationSpecific
        backend:
          service:
            name: solr
            port:
              name: http
Things I have tried so far:
I have tried running the service on different ports and on the default ports.
I can exec into the pod and access Solr through the command line; it works fine.
Using port forwarding, kubectl port-forward --namespace api my-pod-name 8080:8983, I can access the Solr admin dashboard using the temporary URL that Google provides. But when I use the subdomain created for Solr, it keeps giving me a 502 Server Error:
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
The logs show the error failed_to_pick_backend when I open the subdomain that I added.
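One way to narrow this down (a sketch; the ingress name and pod name in angle brackets are placeholders) is to compare what the load balancer health check sees with what the Service actually exposes:
# backend health annotations that GKE attaches to the ingress
kubectl describe ingress <ingress-name> -n api
# confirm the port name/number the ingress backend refers to
kubectl get svc solr -n api -o yaml
# check what Solr returns on the path the LB health check probes (200 vs 3xx)
kubectl port-forward --namespace api <solr-pod-name> 8080:8983 &
curl -i http://localhost:8080/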

Configuring Rails application in Kubernetes

I am configuring a Rails application in Kubernetes. I am using Redis, Sidekiq and a Postgres DB. Below is the YAML I am using.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: dev-app
  name: test-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: Dev-app
    spec:
      nodeSelector:
        cloud.io/sec-zone-green: "true"
      containers:
        - name: dev-application
          image: hub.docker.net/appautomation/dev.app.1.0:latest
          command: ["/bin/sh"]
          args: ["-c", "while true; do echo test; sleep 20;done"]
          resources:
            limits:
              memory: 8Gi
              cpu: 5
            requests:
              memory: 8Gi
              cpu: 5
          ports:
            - containerPort: 3000
        - name: dev-app-nginx
          image: hub.docker.net/appautomation/dev.nginx.1.0:latest
          resources:
            limits:
              memory: 4Gi
              cpu: 4
            requests:
              memory: 4Gi
              cpu: 4
          ports:
            - containerPort: 80
        - name: dev-app-redis
          image: hub.docker.net/appautomation/dev.redis.1.0:latest
          resources:
            limits:
              memory: 4Gi
              cpu: 4
            requests:
              memory: 4Gi
              cpu: 4
          ports:
            - containerPort: 6379
In kubectl I am not seeing any error, but when I try to fetch logs from the pod I get the output below. I can see three containers are built inside the pod. I exec'd into my dev-application container and ran rails s to check whether the server is running, but I get "/usr/local/bundle/gems/redis-3.3.5/lib/redis/connection/ruby.rb:229:in `getaddrinfo': getaddrinfo: Name or service not known (SocketError)". How do I check whether my application is linked with Redis and Nginx? Is my YAML configuration correct, or do I need to use something like depends_on in my YAML file?
kubectl get pods
NAME READY STATUS RESTARTS AGE
dev-database-57b6ff5997-mgdhm 1/1 Running 0 11d
test-deployment-5f59864c8b-4t5b7 3/3 Running 0 8m44s
kubectl logs test-deployment-5f59864c8b-4t5b7
error: a container name must be specified for pod test-deployment-5f59864c8b-4t5b7, choose one of: [dev-application dev-app-nginx dev-app-redis]
Service YAML file:
apiVersion: v1
kind: Service
metadata:
  namespace: Dev-app
  name: test-deployment
spec:
  selector:
    app: Dev-app
  ports:
    - name: Dev-application
      protocol: TCP
      port: 3001
      targetPort: 3000
    - name: redis
      port: 6379
      targetPort: 6379
You are not running the containers the right way. Ideally a pod should run a single application; only put multiple containers inside a single pod or deployment when they genuinely need to run together.
You should be deploying each of these as a single container in its own pod or deployment instead of three in one.
For the logs issue, you can check a specific container's logs using:
kubectl logs test-deployment-5f59864c8b-4t5b7
error: a container name must be specified for pod test-deployment-5f59864c8b-4t5b7, choose one of: [dev-application dev-app-nginx dev-app-redis]
The -c flag is used to check a specific container's logs:
kubectl logs test-deployment-5f59864c8b-4t5b7 -c <one of: dev-application, dev-app-nginx, dev-app-redis>
Ideally, in a distributed setup, you run Redis as a standalone pod or deployment so that all services can use it. Here you are running Redis alongside your application, so if Redis crashes your application will be restarted automatically (Kubernetes behaviour).
Likewise, if the application crashes, Redis will be restarted too, since Kubernetes restarts the pod when any of the containers inside it fails.
I am getting "/usr/local/bundle/gems/redis-3.3.5/lib/redis/connection/ruby.rb:229:in `getaddrinfo': getaddrinfo: Name or service not known (SocketError.
If you are getting this error, check that you have set the proper Redis host in the application code. Since Redis, Nginx and the application are all running in the same pod, they can reach each other over localhost, so Redis will be available to the application at localhost:6379.
If you want to debug further, try using the exec command to get a shell inside the pod:
kubectl exec -it test-deployment-5f59864c8b-4t5b7 -c dev-application -- /bin/bash
This way you will be inside the container and can test the connection to Redis using the CLI.
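For example, once inside the dev-application container, something along these lines (a sketch; it assumes redis-cli or ruby is available in that image) can confirm whether Redis is reachable:
# Redis runs in the same pod, so it should answer on localhost
redis-cli -h 127.0.0.1 -p 6379 ping      # expect: PONG
# or, without redis-cli, test the TCP port from Ruby
ruby -e 'require "socket"; TCPSocket.new("127.0.0.1", 6379); puts "redis port open"'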
Update:
Redis deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  type: ClusterIP
  ports:
    - port: 6379
      name: redis
  selector:
    app: redis
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: redis
spec:
  selector:
    matchLabels:
      app: redis
  serviceName: redis
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redislabs/rejson
          args: ["--appendonly", "no", "--loadmodule"]
          ports:
            - containerPort: 6379
              name: redis
          volumeMounts:
            - name: redis-volume
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: redis-volume
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
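After applying a manifest like the one above, the application reaches Redis through the service DNS name instead of localhost. A quick way to verify (a sketch; the file name, namespace and pod name are placeholders, and it assumes redis-cli is available in the application image):
kubectl apply -f redis.yaml
# from the application pod, Redis should now answer on the service name
kubectl exec -it <app-pod> -n <namespace> -- redis-cli -h redis -p 6379 ping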

Can't access my local kubernetes service over the internet

Implementation Goal
Expose Zookeeper instance, running on kubernetes, to the internet.
(configuration & version information provided at the bottom)
Implementation Attempt
I currently have a minikube cluster running on Ubuntu 14.04, backed by Docker containers.
I'm running a bare-metal k8s cluster, and I'm trying to expose a ZooKeeper service to the internet. Since my cluster is not running on a cloud provider, I set up MetalLB to provide a network load-balancer implementation for my ZooKeeper service.
On startup everything looks good, an external IP is assigned and I can access it from the same host via a curl command.
$ kubectl get pods -n metallb-system
NAME READY STATUS RESTARTS AGE
controller-5c9894b5cd-9gh8m 1/1 Running 0 5h59m
speaker-j2z8q 1/1 Running 0 5h59m
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.xxx.xxx.xxx <none> 443/TCP 6d19h
zk-cs LoadBalancer 10.xxx.xxx.xxx 172.1.1.x 2181:30035/TCP 56m
zk-hs LoadBalancer 10.xxx.xxx.xxx 172.1.1.x 2888:30664/TCP,3888:31113/TCP 6m15s
When I curl the above-mentioned external IPs, I get a valid response:
$ curl -D- "http://172.1.1.x:2181"
curl: (52) Empty reply from server
So far it all looks good: I can access the LB from outside the cluster with no issues, but this is where my lack of Kubernetes/networking knowledge gets me. I'm finding it impossible to expose this LB to the internet. I tried running minikube tunnel, which I had high hopes for, only to be deeply disappointed.
Running a curl command from another node, whilst minikube tunnel is running will just see the request time out.
$ curl -D- "http://172.1.1.x:2181"
curl: (28) Failed to connect to 172.1.1.x port 2181: Timed out
At this point, as I mentioned before, I'm stuck.
Is there any way that I can get this service exposed to the internet without giving my soul to AWS or GCP?
Any help will be greatly appreciated.
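Some checks that help separate a MetalLB layer 2 announcement problem from a plain routing problem (a sketch; the interface name is a placeholder, and 172.1.1.x stands for the assigned external IP as in the output above):
# from the node that cannot reach the service: is the MetalLB address routable at all?
ip route get 172.1.1.x
# does anything answer ARP for the LoadBalancer IP on the local L2 segment?
arping -I <interface> 172.1.1.x
# check the MetalLB speaker logs for announcement errors
kubectl -n metallb-system logs daemonset/speaker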
Service Configuration
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  selector:
    app: zk
  ports:
    - port: 2888
      targetPort: 2888
      name: server
      protocol: TCP
    - port: 3888
      targetPort: 3888
      name: leader-election
      protocol: TCP
  clusterIP: ""
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  selector:
    app: zk
  ports:
    - name: client
      protocol: TCP
      port: 2181
      targetPort: 2181
  type: LoadBalancer
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: zookeeper
          imagePullPolicy: Always
          image: "library/zookeeper:3.6"
          resources:
            requests:
              memory: "1Gi"
              cpu: "0.5"
          ports:
            - containerPort: 2181
              name: client
            - containerPort: 2888
              name: server
            - containerPort: 3888
              name: leader-election
          volumeMounts:
            - name: datadir
              mountPath: /var/lib/zookeeper
            - name: zoo-config
              mountPath: /conf
      volumes:
        - name: zoo-config
          configMap:
            name: zoo-config
      securityContext:
        fsGroup: 2000
        runAsUser: 1000
        runAsNonRoot: true
  volumeClaimTemplates:
    - metadata:
        name: datadir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: zoo-config
  namespace: default
data:
  zoo.cfg: |
    tickTime=10000
    dataDir=/var/lib/zookeeper
    clientPort=2181
    initLimit=10
    syncLimit=4
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2
        addresses:
          - 172.1.1.1-172.1.1.10
minikube: v1.13.1
docker: 18.06.3-ce
You can do it with minikube, but the idea of minikube is just to test things in your local environment. By default it does not have the correct iptables rules, and yes, you can adjust that, but if your goal is simply to run without any cloud provider, I would highly recommend using kubeadm (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
This tool gives you a very customizable cluster configuration, and you will be able to sort out your networking problems without headaches.
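A rough sketch of that route (the CIDR, manifest file names and the CNI choice are placeholders, not exact commands to copy):
# on the machine that will become the control-plane node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# install a CNI plugin (e.g. flannel or calico) with the manifest from its docs
kubectl apply -f <cni-manifest.yaml>
# install MetalLB with the manifest from its installation docs,
# then reuse the address-pool ConfigMap shown in the question
kubectl apply -f <metallb-manifest.yaml>
kubectl apply -f <metallb-config.yaml>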

Kubernetes - nginx-ingress is crashing after file upload via php

I'm running a Kubernetes cluster on Google Cloud Platform via their Kubernetes Engine. The cluster version is 1.13.11-gke.14. The PHP application pod contains 2 containers: Nginx as a reverse proxy and php-fpm (7.2).
In Google Cloud, a TCP load balancer is used, with internal routing handled by the Nginx ingress.
The problem is:
when I upload a bigger file (17 MB), the ingress crashes with this error:
W 2019-12-01T14:26:06.341588Z Dynamic reconfiguration failed: Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E 2019-12-01T14:26:06.341658Z Unexpected failure reconfiguring NGINX:
W 2019-12-01T14:26:06.345575Z requeuing initial-sync, err Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I 2019-12-01T14:26:06.354869Z Configuration changes detected, backend reload required.
E 2019-12-01T14:26:06.393528796Z Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E 2019-12-01T14:26:08.077580Z healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I 2019-12-01T14:26:12.314526990Z 10.132.0.25 - [10.132.0.25] - - [01/Dec/2019:14:26:12 +0000] "GET / HTTP/2.0" 200 541 "-" "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)" 99 1.787 [bap-staging-bap-staging-80] [] 10.102.2.4:80 553 1.788 200 5ac9d438e5ca31618386b35f67e2033b
E 2019-12-01T14:26:12.455236Z healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I 2019-12-01T14:26:13.156963Z Exiting with 0
Here is the YAML configuration of the Nginx ingress controller. The configuration is the default created by GitLab's system, which sets up the cluster on its own.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2019-11-24T17:35:04Z"
  generation: 3
  labels:
    app: nginx-ingress
    chart: nginx-ingress-1.22.1
    component: controller
    heritage: Tiller
    release: ingress
  name: ingress-nginx-ingress-controller
  namespace: gitlab-managed-apps
  resourceVersion: "2638973"
  selfLink: /apis/apps/v1/namespaces/gitlab-managed-apps/deployments/ingress-nginx-ingress-controller
  uid: bfb695c2-0ee0-11ea-a36a-42010a84009f
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx-ingress
      release: ingress
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/port: "10254"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app: nginx-ingress
        component: controller
        release: ingress
    spec:
      containers:
        - args:
            - /nginx-ingress-controller
            - --default-backend-service=gitlab-managed-apps/ingress-nginx-ingress-default-backend
            - --election-id=ingress-controller-leader
            - --ingress-class=nginx
            - --configmap=gitlab-managed-apps/ingress-nginx-ingress-controller
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          name: nginx-ingress-controller
          ports:
            - containerPort: 80
              name: http
              protocol: TCP
            - containerPort: 443
              name: https
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          resources: {}
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            runAsUser: 33
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /etc/nginx/modsecurity/modsecurity.conf
              name: modsecurity-template-volume
              subPath: modsecurity.conf
            - mountPath: /var/log/modsec
              name: modsecurity-log-volume
        - args:
            - /bin/sh
            - -c
            - tail -f /var/log/modsec/audit.log
          image: busybox
          imagePullPolicy: Always
          name: modsecurity-log
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /var/log/modsec
              name: modsecurity-log-volume
              readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: ingress-nginx-ingress
      serviceAccountName: ingress-nginx-ingress
      terminationGracePeriodSeconds: 60
      volumes:
        - configMap:
            defaultMode: 420
            items:
              - key: modsecurity.conf
                path: modsecurity.conf
            name: ingress-nginx-ingress-controller
          name: modsecurity-template-volume
        - emptyDir: {}
          name: modsecurity-log-volume
I have no idea what else to try. I'm running the cluster on 3 nodes (2x 1 vCPU, 1.5 GB RAM and 1x preemptible 2 vCPU, 1.8 GB RAM), all of them on SSD drives.
Any time I upload the image, disk I/O goes crazy.
(Charts: Disk IOPS and Disk I/O during the upload)
Thanks for your help.
Found the solution. The nginx-ingress pod contained ModSecurity too. All requests were analyzed by ModSecurity, and bigger uploaded files caused those crashes. It wasn't actually a crash; it just consumed so much CPU and I/O that health check responses for all the other pods slowed down. The solution is to configure ModSecurity correctly or disable it.
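If disabling it is acceptable, one possible way (a sketch; it assumes the controller reads the gitlab-managed-apps/ingress-nginx-ingress-controller ConfigMap referenced by --configmap above and honours the standard ingress-nginx enable-modsecurity key):
# turn ModSecurity off in the controller ConfigMap
kubectl -n gitlab-managed-apps patch configmap ingress-nginx-ingress-controller \
  --type merge -p '{"data":{"enable-modsecurity":"false"}}'
# recreate the controller pods so the change is definitely picked up
kubectl -n gitlab-managed-apps delete pod -l app=nginx-ingress,component=controller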

Kubernetes Ingress get Unhealthy backend services on Google Kubernetes Engine

I'm trying to deploy two services on Google Container Engine, and I have created a cluster with 3 nodes.
My Docker images are in a private Docker Hub repo, so I have created a secret and used it in the Deployments. The ingress creates a load balancer in the Google Cloud console, but it shows that the backend services are not healthy, and in the Kubernetes section under Workloads it says "Does not have minimum availability".
I'm new to Kubernetes; what could the problem be?
Here are my yamls:
Deployment.yaml:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: pythonaryapp
  labels:
    app: pythonaryapp
spec:
  replicas: 1 #We always want more than 1 replica for HA
  selector:
    matchLabels:
      app: pythonaryapp
  template:
    metadata:
      labels:
        app: pythonaryapp
    spec:
      containers:
        - name: pythonaryapp #1st container
          image: docker.io/arycloud/docker_web_app:pythonaryapp #Dockerhub image
          ports:
            - containerPort: 8080 #Exposes the port 8080 of the container
          env:
            - name: PORT #Env variable key passed to container that is read by app
              value: "8080" # Value of the env port.
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 2
            timeoutSeconds: 2
            successThreshold: 2
            failureThreshold: 10
      imagePullSecrets:
        - name: docksecret
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: pythonaryapp1
  labels:
    app: pythonaryapp1
spec:
  replicas: 1 #We always want more than 1 replica for HA
  selector:
    matchLabels:
      app: pythonaryapp1
  template:
    metadata:
      labels:
        app: pythonaryapp1
    spec:
      containers:
        - name: pythonaryapp1 #1st container
          image: docker.io/arycloud/docker_web_app:pythonaryapp1 #Dockerhub image
          ports:
            - containerPort: 8080 #Exposes the port 8080 of the container
          env:
            - name: PORT #Env variable key passed to container that is read by app
              value: "8080" # Value of the env port.
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 2
            timeoutSeconds: 2
            successThreshold: 2
            failureThreshold: 10
      imagePullSecrets:
        - name: docksecret
---
And here's services.yaml:
kind: Service
apiVersion: v1
metadata:
  name: pythonaryapp
spec:
  type: NodePort
  selector:
    app: pythonaryapp
  ports:
    - protocol: TCP
      port: 8080
---
kind: Service
apiVersion: v1
metadata:
  name: pythonaryapp1
spec:
  type: NodePort
  selector:
    app: pythonaryapp1
  ports:
    - protocol: TCP
      port: 8080
---
And Here's my ingress.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: mysvcs
spec:
  rules:
    - http:
        paths:
          - path: /
            backend:
              serviceName: pythonaryapp
              servicePort: 8080
          - path: /<name>
            backend:
              serviceName: pythonaryapp1
              servicePort: 8080
Update:
Here's the Flask service code:
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World, from Python Service.', 200

if __name__ == '__main__':
    app.run()
And when I run the container from its Docker image, it returns a 200 status code at the root path /.
Thanks in advance!
Have a look at this post; it might contain helpful tips for your issue.
For example, I see a readiness probe but not a liveness probe in your config files.
That post suggests that "Does not have minimum availability" in k8s can be the result of a CrashLoopBackOff caused by a failing liveness probe.
In GKE, the ingress is implemented by a GCP load balancer. The GCP LB checks the health of the service by calling the service address on the root path '/'. Make sure that your container responds with 200 on the root, or alternatively change the LB backend service health check route (you can do that in the GCP console).
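As a concrete example of that second option (a sketch; the health check name is a placeholder you would look up first, and it assumes the LB uses a gcloud compute health check):
# find the health check the ingress created for the backend service
gcloud compute health-checks list
# point it at a path the app actually serves with 200 (here the Flask root path /)
gcloud compute health-checks update http <health-check-name> --request-path=/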
