I'm stressing my kubernetes API and I found out that every request is creating a process inside the Worker Node.
Deployment YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
name: ${KUBE_APP_NAME}-deployment
namespace: ${KUBE_NAMESPACE}
labels:
app_version: ${KUBE_APP_VERSION}
spec:
replicas: 2
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
selector:
matchLabels:
app_name: ${KUBE_APP_NAME}
template:
metadata:
labels:
app_name: ${KUBE_APP_NAME}
spec:
containers:
- name: ${KUBE_APP_NAME}
image: XXX:${KUBE_APP_VERSION}
imagePullPolicy: Always
env:
- name: MONGODB_URI
valueFrom:
secretKeyRef:
name: mongodb
key: uri
- name: JWT_PASSWORD
valueFrom:
secretKeyRef:
name: jwt
key: password
ports:
- containerPort: 80
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
imagePullSecrets:
- name: regcred
Apache Bench used ab -p payload.json -T application/json -c 10 -n 2000
Why is this?
It's hard to answer your questions if this is normal that the requests are being kept open.
We don't know what exactly is your payload and how big it is. We also don't know if the image that you are using is handling those correctly.
You should use verbose=2 ab -v2 <host> and check what it taking so long.
You are using Apache Bench with -c 10 -n 2000 options which means there will be:
-c 10 concurrent connections at a time,
-n 2000 request total
You could use -k to enable HTTP KeepAlive
-k
Enable the HTTP KeepAlive feature, i.e., perform multiple requests within one HTTP session. Default is no KeepAlive.
It would be easier if you provided the output of using the ab.
As for the Kubernetes part.
We can read a definition of a pod available at Viewing Pods and Nodes:
A Pod is a Kubernetes abstraction that represents a group of one or more application containers (such as Docker or rkt), and some shared resources for those containers
...
The containers in a Pod share an IP Address and port space, are always co-located and co-scheduled, and run in a shared context on the same Node.
Related
I am using Cassandra image w.r.t.
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: cassandra
labels:
app: cassandra
spec:
serviceName: cassandra
replicas: 3
selector:
matchLabels:
app: cassandra
template:
metadata:
labels:
app: cassandra
spec:
terminationGracePeriodSeconds: 1800
containers:
- name: cassandra
image: gcr.io/google-samples/cassandra:v13
imagePullPolicy: Always
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7001
name: tls-intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
resources:
limits:
cpu: "500m"
memory: 1Gi
requests:
cpu: "500m"
memory: 1Gi
securityContext:
capabilities:
add:
- IPC_LOCK
lifecycle:
preStop:
exec:
command:
- /bin/sh
- -c
- nodetool drain
env:
- name: MAX_HEAP_SIZE
value: 512M
- name: HEAP_NEWSIZE
value: 100M
- name: CASSANDRA_SEEDS
value: "cassandra-0.cassandra.default.svc.cluster.local"
- name: CASSANDRA_CLUSTER_NAME
value: "K8Demo"
- name: CASSANDRA_DC
value: "DC1-K8Demo"
- name: CASSANDRA_RACK
value: "Rack1-K8Demo"
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
# These volume mounts are persistent. They are like inline claims,
# but not exactly because the names need to match exactly one of
# the stateful pod volumes.
volumeMounts:
- name: cassandra-data
mountPath: /cassandra_data
# These are converted to volume claims by the controller
# and mounted at the paths mentioned above.
# do not use these in production until ssd GCEPersistentDisk or other ssd pd
volumeClaimTemplates:
- metadata:
name: cassandra-data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: fast
resources:
requests:
storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
type: pd-ssd
Now I need to add below line to cassandra-env.sh in postStart or in cassandra yaml file:
-JVM_OPTS="$JVM_OPTS
-javaagent:$CASSANDRA_HOME/lib/cassandra-exporter-agent-<version>.jar"
Now I was able to achieve this, but after this step, Cassandra requires a restart but as it's already running as a pod, I don't know how to restart the process. So is there any way that this step is done prior to running the pod and not after it is up?
I was suggested below solution:-
This won’t work. Commands that run postStart don’t impact the running container. You need to change the startup commands passed to Cassandra.
The only way that I know to do this is to create a new container image in the artifactory based on the existing image. and pull from there.
But I don't know how to achieve this.
There is a kubernetes cluster with 100 nodes, I have to clean the specific images manually, I know the kubelet garbage collect may help, but it isn't applied in my case.
After browsing the internet , I found a solution - docker in docker, to solve my problem.
I just wanna remove the image in each node one time, is there any way to run a job in each node one time?
I checked the kubernetes labels and podaffinity, but still no ideas, any body could help?
Also, I tried to use daemonset to solve the problem, but turns out that it can only remove the image for a part of nodes instead of all nodes, I don't what might be the problem...
here is the daemonset example:
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: test-ds
labels:
k8s-app: test
spec:
selector:
matchLabels:
k8s-app: test
template:
metadata:
labels:
k8s-app: test
spec:
containers:
- name: test
env:
- name: DELETE_IMAGE_NAME
value: "nginx"
image: busybox
command: ['sh', '-c', 'curl --unix-socket /var/run/docker.sock -X DELETE http://localhost/v1.39/images/$(DELETE_IMAGE_NAME)']
securityContext:
privileged: true
volumeMounts:
- mountPath: /var/run/docker.sock
name: docker-sock-volume
ports:
- containerPort: 80
volumes:
- name: docker-sock-volume
hostPath:
# location on host
path: /var/run/docker.sock
If you want to run you job on single specific Node you can us the Nodeselector in POD spec
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: test
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: test
image: busybox
args:
- /bin/sh
- -c
- date; echo Hello from the Kubernetes cluster
restartPolicy: OnFailure
nodeSelector:
name: node3
daemon set ideally should resolve your issues, as it creates the PODs on each available Node in the cluster.
You can read more about the affinity at here : https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
nodeSelector provides a very simple way to constrain pods to nodes
with particular labels. The affinity/anti-affinity feature, greatly
expands the types of constraints you can express. The key enhancements
are
The affinity/anti-affinity language is more expressive. The language
offers more matching rules besides exact matches created with a
logical AND operation;
You can use the Affinity in Job YAML something like
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
containers:
- name: with-node-affinity
image: k8s.gcr.io/pause:2.0
Update
Now if you have issue with the Deamon affinity with the Job is also useless, as Job will create the Single POD which will get schedule to Single node as per affinity. Either create 100 job with different affinity rules or you use Deployment + Affinity to schedule the Replicas on different nodes.
We will create one Deployment with POD affinity and make sure, multiple PODs of a single deployment won't get scheduled on one Node.
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-deployment
labels:
app: test
spec:
replicas: 100
selector:
matchLabels:
app: test
template:
metadata:
labels:
app: test
spec:
containers:
- name: test
image: <Image>
ports:
- containerPort: 80
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- test
topologyKey: "kubernetes.io/hostname"
Try using this deployment template and replace your image here. You can reduce replicas first to 10 instead of 100 to check it's spreading PODs or not.
Read more at : https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#an-example-of-a-pod-that-uses-pod-affinity
Extra :
You can also write and use your custom CRD : https://github.com/darkowlzz/daemonset-job which will behave as daemon set and job
I am configuring rails application in kubernates.I am using redis,sidekiq and Postgres DB.Below the yaml I am using.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
namespace: dev-app
name: test-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: Dev-app
spec:
nodeSelector:
cloud.io/sec-zone-green: "true"
containers:
- name: dev-application
image: hub.docker.net/appautomation/dev.app.1.0:latest
command: ["/bin/sh"]
args: ["-c", "while true; do echo test; sleep 20;done"]
resources:
limits:
memory: 8Gi
cpu: 5
requests:
memory: 8Gi
cpu: 5
ports:
- containerPort: 3000
- name: dev-app-nginx
image: hub.docker.net/appautomation/dev.nginx.1.0:latest
resources:
limits:
memory: 4Gi
cpu: 4
requests:
memory: 4Gi
cpu: 4
ports:
- containerPort: 80
- name: dev-app-redis
image: hub.docker.net/appautomation/dev.redis.1.0:latest
resources:
limits:
memory: 4Gi
cpu: 4
requests:
memory: 4Gi
cpu: 4
ports:
- containerPort: 6379
In kubectl I am not seeing any error.But when I try to execute logs in pods I am getting below.I could see three containers build internally.I have executed my dev-application and tried rails s to check server is running or not.But I am getting "/usr/local/bundle/gems/redis-3.3.5/lib/redis/connection/ruby.rb:229:in `getaddrinfo': getaddrinfo: Name or service not known (SocketError." How to check my application linked with redis and nginx? My yaml configuration is correct? or I need to use depends on in my yaml file.
kubectl get pods
NAME READY STATUS RESTARTS AGE
dev-database-57b6ff5997-mgdhm 1/1 Running 0 11d
test-deployment-5f59864c8b-4t5b7 3/3 Running 0 8m44s
kubectl logs test-deployment-5f59864c8b-4t5b7
error: a container name must be specified for pod test-deployment-5f59864c8b-4t5b7, choose one of: [dev-application dev-app-nginx dev-app-redis]
Service yams file
apiVersion: v1
kind: Service
metadata:
namespace: Dev-app
name: test-deployment
spec:
selector:
app: Dev-app
ports:
- name: Dev-application
protocol: TCP
port: 3001
targetPort: 3000
- name: redis
port: 6379
targetPort: 6379
you are not running right way container. ideally POD running must be single application if require multiple container then and then use the multiple container inside the single POD or deployment.
you should be deploying single container in single POD or deployment instead of 3 in single.
for logs issue you check specific container logs using
kubectl logs test-deployment-5f59864c8b-4t5b7
error: a container name must be specified for pod test-deployment-5f59864c8b-4t5b7, choose one of: [dev-application dev-app-nginx dev-app-redis]
-c is used to check the specific container logs
kubectl logs test-deployment-5f59864c8b-4t5b7 -c <any one name dev-application dev-app-nginx dev-app-redis>
ideally distributed system structure goes like you run the standalone POD or deployment of the REDIS so all the services can use it here you are running your application redis if Redis crash your application will auto-restart (Kubernetes behavior).
If application crash Redis will auto-restart as Kubernetes auto-restart whole if any of container fails inside the POD.
I am getting "/usr/local/bundle/gems/redis-3.3.5/lib/redis/connection/ruby.rb:229:in `getaddrinfo': getaddrinfo: Name or service not known (SocketError.
if you are getting this error check you have set the proper host path into the application code. If all the Redis, Nginx and application running in single container you connect with any or service over the localhost. So Redis will be running at localhost 6379 for application
if you want to further debug you try using the exec command to go inside the pod and check the
kubectl exec -it test-deployment-5f59864c8b-4t5b7 -c dev-application -- /bin/bash
by this way, you will be inside the container and test out the connections to Redis using CLI.
Update :
Redis deployment.yaml
apiVersion: v1
kind: Service
metadata:
name: redis
spec:
type: ClusterIP
ports:
- port: 6379
name: redis
selector:
app: redis
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
name: redis
spec:
selector:
matchLabels:
app: redis
serviceName: redis
replicas: 1
template:
metadata:
labels:
app: redis
spec:
containers:
- name: redis
image: redislabs/rejson
args: ["--appendonly", "no", "--loadmodule"]
ports:
- containerPort: 6379
name: redis
volumeMounts:
- name: redis-volume
mountPath: /data
volumeClaimTemplates:
- metadata:
name: redis-volume
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
I am developing a game development platform that allows users to run their game servers within my Kubernetes cluster. What is everything that I need to restrict / configure to prevent malicious users from gaining access to resources they should not be allowed to access such as internal pods, Kubernetes access keys, image pull secrets, etc?
I'm currently looking at Network Policies to restrict access to internal IP addresses, but I'm not sure if they would still be able to enumerate DNS addresses to sensitive internal architecture. Would they still be able to somehow find out how my MongoDB, Redis, Kafka pods are configured?
Also, I'm aware Kubernetes puts an API token at the /var/run/secrets/kubernetes.io/serviceaccount/token path. How do I disable this token from being created? Are there other sensitive files I need to remove / disable?
I've been researching everything I can think of, but I want to make sure that I'm not missing anything.
Pods are defined within a Deployment with a Service, and exposed via Nginx Ingress TCP / UDP ConfigMap. Example Configuration:
---
metadata:
labels:
app: game-server
name: game-server
spec:
replicas: 1
selector:
matchLabels:
app: game-server
template:
metadata:
labels:
app: game-server
spec:
containers:
- image: game-server
name: game-server
ports:
- containerPort: 7777
resources:
requests:
cpu: 500m
memory: 500M
imagePullSecrets:
- name: docker-registry-image-pull-secret
---
metadata:
labels:
app: game-server
service: game-server
name: game-server
spec:
ports:
- name: tcp
port: 7777
selector:
app: game-server
TL;DR: How do I run insecure, end user-defined Pods within my Kubernetes cluster safely?
I was able to get a kubernetes job up and running on AKS (uses docker hub image to process a biological sample and then upload the output to blob storage - this is done with a bash command that I provide in the args section of my yaml file). However, I have 20 samples, and would like to spin up 20 nodes so that I can process the samples in parallel (one sample per node). How do I send each sample to a different node? The "parallelism" option in a yaml file processes all of the 20 samples on each of the 20 nodes, which is not what I want.
Thank you for the help.
if you want each instance of the job to be on a different node, you can use daemonSet, thats exactly what it does, provisions 1 pod per worker node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd-elasticsearch
namespace: kube-system
labels:
k8s-app: fluentd-logging
spec:
selector:
matchLabels:
name: fluentd-elasticsearch
template:
metadata:
labels:
name: fluentd-elasticsearch
spec:
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd-elasticsearch
image: k8s.gcr.io/fluentd-elasticsearch:1.20
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
Another way of doing that - using pod antiaffinity:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- zk
topologyKey: "kubernetes.io/hostname"
The requiredDuringSchedulingIgnoredDuringExecution field tells the Kubernetes Scheduler that it should never co-locate two Pods which have app label as zk in the domain defined by the topologyKey. The topologyKey kubernetes.io/hostname indicates that the domain is an individual node. Using different rules, labels, and selectors, you can extend this technique to spread your ensemble across physical, network, and power failure domains
How/where the samples are stored? You could load them (or a pointer to the actual sample) into a queue like Kafka and let the application retrieve each sample once and upload it to the blob after computation. You can then even assure that if a computation fails, another pod will pick it up and restart the computation.