Kubectl fails after a long time - docker

I have set up a small cluster (one physical machine as the master and two VMs as the nodes). I have now created an NFS directory to share as a persistent volume:
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: nfs  # reference name
    spec:
      capacity:
        storage: 100Mi
      accessModes:
        - ReadWriteMany
      nfs:
        server: 192.168.57.1
        path: "/mnt/shardisk"
and a claim that binds to it:
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-pvc
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 50Mi
and finally a simple pod to use it:
    kind: Pod
    apiVersion: v1
    metadata:
      name: nginx-nfs
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: test-pvc
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
              name: "http-server"
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: storage
I created the cluster on the physical machine and joined it from the VMs. I used Calico for the pod network because Flannel fails to start (if someone knows why, it would be wonderful to solve that too).
Now, if I run
kubectl describe pod, I see that everything works fine, and the same goes for kubectl logs nginx-nfs, but if I run kubectl exec -it nginx-nfs /bin/bash
everything freezes for a very long time, and after that I get this:
Error from server: error dialing backend: dial tcp 10.0.2.15:10250: getsockopt: connection timed out

I have "solved" it: I use Kubernetes on two different LANs, so admin.conf contains an IP that does not match the current IP, and it does not work. I solved it by creating the VMs on a network internal to the host and NATing a static IP onto it.
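For anyone hitting the same timeout: 10.0.2.15 in the error looks like the default guest address of a VirtualBox NAT adapter, which every VM gets and which the master cannot reach. If that is the case, another possible fix is to tell each kubelet which address to register. The following is only a sketch, assuming a recent kubeadm (v1beta3 config API) and a second, host-only adapter on the 192.168.57.0/24 network that the NFS server above uses. The node address, API server port, and token are hypothetical placeholders.

```yaml
# join-config.yaml (hypothetical file name), used as: kubeadm join --config join-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: "192.168.57.1:6443"   # assumption: master reachable on the host-only network
    token: "abcdef.0123456789abcdef"         # placeholder, use the token printed by kubeadm init
    unsafeSkipCAVerification: true           # acceptable for a throwaway lab only
nodeRegistration:
  kubeletExtraArgs:
    node-ip: "192.168.57.10"                 # hypothetical address of this VM on the host-only adapter
```

After the node rejoins, kubectl get nodes -o wide should list the host-only address under INTERNAL-IP instead of 10.0.2.15, and kubectl exec / kubectl logs should then dial an address the master can actually reach.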

Related

Can't access my local kubernetes service over the internet

Implementation Goal
Expose Zookeeper instance, running on kubernetes, to the internet.
(configuration & version information provided at the bottom)
Implementation Attempt
I currently have a minikube cluster running on Ubuntu 14.04, backed by Docker containers.
I'm running a bare-metal k8s cluster, and I'm trying to expose a ZooKeeper service to the internet. Since my cluster is not running on a cloud provider, I set up MetalLB to provide a network load-balancer implementation for my ZooKeeper service.
On startup everything looks good: an external IP is assigned and I can access it from the same host via a curl command.
    $ kubectl get pods -n metallb-system
    NAME                          READY   STATUS    RESTARTS   AGE
    controller-5c9894b5cd-9gh8m   1/1     Running   0          5h59m
    speaker-j2z8q                 1/1     Running   0          5h59m
    $ kubectl get svc
    NAME         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
    kubernetes   ClusterIP      10.xxx.xxx.xxx   <none>        443/TCP                         6d19h
    zk-cs        LoadBalancer   10.xxx.xxx.xxx   172.1.1.x     2181:30035/TCP                  56m
    zk-hs        LoadBalancer   10.xxx.xxx.xxx   172.1.1.x     2888:30664/TCP,3888:31113/TCP   6m15s
When I curl the above-mentioned external IPs, I get a valid response (an empty reply, since ZooKeeper does not speak HTTP, but the TCP connection itself succeeds):
    $ curl -D- "http://172.1.1.x:2181"
    curl: (52) Empty reply from server
So far it all looks good: I can access the LB from outside the cluster with no issues. But this is where my lack of Kubernetes/networking knowledge gets me. I'm finding it impossible to expose this LB to the internet. I've tried running minikube tunnel, which I had high hopes for, only to be deeply disappointed.
Running a curl command from another node while minikube tunnel is running just sees the request time out.
    $ curl -D- "http://172.1.1.x:2181"
    curl: (28) Failed to connect to 172.1.1.x port 2181: Timed out
At this point, as I mentioned before, I'm stuck.
Is there any way that I can get this service exposed to the internet without giving my soul to AWS or GCP?
Any help will be greatly appreciated.
Service Configuration
    apiVersion: v1
    kind: Service
    metadata:
      name: zk-hs
      labels:
        app: zk
    spec:
      selector:
        app: zk
      ports:
        - port: 2888
          targetPort: 2888
          name: server
          protocol: TCP
        - port: 3888
          targetPort: 3888
          name: leader-election
          protocol: TCP
      clusterIP: ""
      type: LoadBalancer
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: zk-cs
      labels:
        app: zk
    spec:
      selector:
        app: zk
      ports:
        - name: client
          protocol: TCP
          port: 2181
          targetPort: 2181
      type: LoadBalancer
    ---
    apiVersion: policy/v1beta1
    kind: PodDisruptionBudget
    metadata:
      name: zk-pdb
    spec:
      selector:
        matchLabels:
          app: zk
      maxUnavailable: 1
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: zk
    spec:
      selector:
        matchLabels:
          app: zk
      serviceName: zk-hs
      replicas: 1
      updateStrategy:
        type: RollingUpdate
      podManagementPolicy: OrderedReady
      template:
        metadata:
          labels:
            app: zk
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: "app"
                        operator: In
                        values:
                          - zk
                  topologyKey: "kubernetes.io/hostname"
          containers:
            - name: zookeeper
              imagePullPolicy: Always
              image: "library/zookeeper:3.6"
              resources:
                requests:
                  memory: "1Gi"
                  cpu: "0.5"
              ports:
                - containerPort: 2181
                  name: client
                - containerPort: 2888
                  name: server
                - containerPort: 3888
                  name: leader-election
              volumeMounts:
                - name: datadir
                  mountPath: /var/lib/zookeeper
                - name: zoo-config
                  mountPath: /conf
          volumes:
            - name: zoo-config
              configMap:
                name: zoo-config
          securityContext:
            fsGroup: 2000
            runAsUser: 1000
            runAsNonRoot: true
      volumeClaimTemplates:
        - metadata:
            name: datadir
          spec:
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 10Gi
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: zoo-config
      namespace: default
    data:
      zoo.cfg: |
        tickTime=10000
        dataDir=/var/lib/zookeeper
        clientPort=2181
        initLimit=10
        syncLimit=4
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: metallb-system
      name: config
    data:
      config: |
        address-pools:
        - name: default
          protocol: layer2
          addresses:
          - 172.1.1.1-172.1.1.10
minikube: v1.13.1
docker: 18.06.3-ce
You can do it with minikube, but minikube is meant only for testing things in your local environment. So, by default, it does not have the correct iptables setup. Yes, you can adjust that, but if your goal is only to run without any cloud provider, I highly recommend you use kubeadm (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
This tool gives you a very customizable cluster configuration, and you will be able to sort out your network problems without headaches.
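One more thing to check once the cluster runs on real nodes: in layer2 mode MetalLB answers ARP for the pool addresses on the nodes' own network, so the pool has to come from a subnet that other machines can actually reach. The 172.1.1.x range in the question is probably only routable on the minikube host itself. A minimal sketch in the same ConfigMap format as the question, assuming (hypothetically) that the nodes sit on a 192.168.1.0/24 LAN and that .240 to .250 are outside the DHCP range:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      # assumption: free addresses on the same LAN as the nodes
      - 192.168.1.240-192.168.1.250
```

This only makes the service reachable from the LAN. Reaching it from the internet would still need something like a port forward on your router to the assigned external IP, which depends entirely on your network.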

Use a docker registry from the kubernetes cluster on which it is deployed

I have an on-prem kubernetes cluster and I want to deploy to it a docker registry from which the cluster nodes can download images. In my attempts to do this, I've tried several methods of identifying the service: a NodePort, a LoadBalancer provided by MetalLB in Layer2 mode, its Flannel network IP (referring to the IP that, by default, would be on the 10.244.0.0/16 network), and its cluster IP (referring to the IP that, by default, would be on the 10.96.0.0/16 network). In every case, connecting to the registry via docker failed.
I performed a cURL against the IP and realized that while the requests were resolving as expected, the tcp dial step was consistently taking 63.15 +/- 0.05 seconds, followed by the HTTP(s) request itself completing in an amount of time that is within margin of error for the tcp dial. This is consistent across deployments with firewall rules varying from a relatively strict set to nothing except the rules added directly by kubernetes. It is also consistent across network architectures ranging from a single physical server with VMs for all cluster nodes to distinct physical hardware for each node and a physical switch. As mentioned previously, it is also consistent across the means by which the service is exposed. It is also consistent regardless of whether I use an ingress-nginx service to expose it or expose the docker registry directly.
Further, when I deploy another pod to the cluster, I am able to reach the pod at its cluster IP without any delays, but I do encounter an identical delay when trying to reach it at its external LoadBalancer IP or at a NodePort. No delays besides expected network latency are encountered when trying to reach the registry from a machine that is not a node on the cluster, e.g., using the LoadBalancer or NodePort.
As a matter of practice, my main inquiry is what is the "correct" way to do what I am attempting to do? Furthermore, as an academic matter, I would also like to know the source of the very long, very consistent delay that I've been seeing?
My deployment yaml file has been included below for reference. The ingress handler is ingress-nginx.
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: registry-pv-claim
      namespace: docker-registry
      labels:
        app: registry
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: docker-registry
      namespace: docker-registry
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: docker-registry
      template:
        metadata:
          labels:
            app: docker-registry
        spec:
          containers:
            - name: docker-registry
              image: registry:2.7.1
              env:
                - name: REGISTRY_HTTP_ADDR
                  value: ":5000"
                - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
                  value: "/var/lib/registry"
              ports:
                - name: http
                  containerPort: 5000
              volumeMounts:
                - name: image-store
                  mountPath: "/var/lib/registry"
          volumes:
            - name: image-store
              persistentVolumeClaim:
                claimName: registry-pv-claim
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: docker-registry
      namespace: docker-registry
      labels:
        app: docker-registry
    spec:
      selector:
        app: docker-registry
      ports:
        - name: http
          port: 5000
          targetPort: 5000
    ---
    apiVersion: v1
    items:
      - apiVersion: extensions/v1beta1
        kind: Ingress
        metadata:
          annotations:
            nginx.ingress.kubernetes.io/proxy-body-size: "0"
            nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
            nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
            kubernetes.io/ingress.class: docker-registry
          name: docker-registry
          namespace: docker-registry
        spec:
          rules:
            - host: example-registry.com
              http:
                paths:
                  - backend:
                      serviceName: docker-registry
                      servicePort: 5000
                    path: /
          tls:
            - hosts:
                - example-registry.com
              secretName: tls-secret
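Note for readers on newer clusters: the extensions/v1beta1 Ingress API was removed in Kubernetes 1.22, so the Ingress above will not apply there as written. A sketch of what I believe is the equivalent object in networking.k8s.io/v1, keeping the same names, host, and annotations as the question. The only additions are the pathType field, which the v1 API requires, and the restructured backend.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: docker-registry
  namespace: docker-registry
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    # kept from the question; newer setups may prefer spec.ingressClassName
    kubernetes.io/ingress.class: docker-registry
spec:
  tls:
    - hosts:
        - example-registry.com
      secretName: tls-secret
  rules:
    - host: example-registry.com
      http:
        paths:
          - path: /
            pathType: Prefix        # required in networking.k8s.io/v1
            backend:
              service:
                name: docker-registry
                port:
                  number: 5000
```

This does not address the connection delay itself. It only keeps the manifest usable on current API versions.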
For future visitors: it seems your issue is related to Flannel.
The whole problem is described here:
https://github.com/kubernetes/kubernetes/issues/88986
https://github.com/coreos/flannel/issues/1268
including a workaround:
https://github.com/kubernetes/kubernetes/issues/86507#issuecomment-595227060

Kubernetes stateful apps with bare metal

I am pretty new to Kubernetes and cloud computing. I am working with bare-metal servers at home (actually virtual servers on VirtualBox) and trying to run a stateful app with a StatefulSet. I have 1 master and 2 worker nodes, and I am trying to run a database application on this cluster, so each node has 1 pod. I am very confused about volumes. I use a hostPath volume (code below), but the volumes work separately (they are not synchronized), so when I reach my 2 pods they behave differently (same app, but they run like 2 different servers).
How can I run that app in 2 synchronized pods?
I've tried to synchronize the volume files between the 2 workers. I also tried to synchronize the volume files with a Deployment. I've tried to do this with volume provisioning (persistent volumes and persistent volume provisioning).
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: cloud
    spec:
      selector:
        matchLabels:
          app: cloud
      serviceName: "cloud"
      replicas: 2
      template:
        metadata:
          labels:
            app: cloud
        spec:
          containers:
            - name: cloud
              image: owncloud:v2
              imagePullPolicy: Never
              ports:
                - containerPort: 80
                  name: web
              volumeMounts:
                - name: cloud-volume
                  mountPath: /var/www/html/
          volumes:
            - name: cloud-volume
              hostPath:
                path: /volumes/cloud/
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: cloud
    spec:
      selector:
        app: cloud
      type: LoadBalancer
      ports:
        - protocol: TCP
          port: 80
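A hostPath volume is a directory on whichever node the pod lands on, so two pods on two workers each get their own, unrelated /volumes/cloud/. One common way to give both pods the same files is a network filesystem mounted ReadWriteMany. The following is only a sketch of that idea, not a tested setup: the NFS server address, export path, object names, and size are all hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cloud-nfs                # hypothetical name
spec:
  capacity:
    storage: 5Gi                 # hypothetical size
  accessModes:
    - ReadWriteMany              # lets pods on different nodes mount it together
  storageClassName: ""           # no class, so it is only bound statically
  nfs:
    server: 192.168.1.100        # assumption: an NFS server reachable from both workers
    path: /exports/cloud         # assumption: an existing export
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloud-shared             # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: cloud-nfs          # bind to the volume defined above
  resources:
    requests:
      storage: 5Gi
```

In the StatefulSet, the cloud-volume entry would then use persistentVolumeClaim with claimName: cloud-shared instead of hostPath. Keep in mind that sharing files does not by itself make a database safe for two concurrent writers. The database usually needs its own replication, or a single instance that both web pods talk to.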

Kubernetes Persistent Volume and hostpath

I was experimenting with Kubernetes Persistent Volumes. I can't find a clear explanation in the Kubernetes documentation, and the behaviour is not what I expect, so I would like to ask here.
I configured the following Persistent Volume and Persistent Volume Claim.
    kind: PersistentVolume
    apiVersion: v1
    metadata:
      name: store-persistent-volume
      namespace: test
    spec:
      storageClassName: hostpath
      capacity:
        storage: 2Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/Volumes/Data/data"
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: store-persistent-volume-claim
      namespace: test
    spec:
      storageClassName: hostpath
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
and the following Deployment and Service configuration.
    kind: Deployment
    apiVersion: apps/v1beta2
    metadata:
      name: store-deployment
      namespace: test
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s-app: store
      template:
        metadata:
          labels:
            k8s-app: store
        spec:
          volumes:
            - name: store-volume
              persistentVolumeClaim:
                claimName: store-persistent-volume-claim
          containers:
            - name: store
              image: localhost:5000/store
              ports:
                - containerPort: 8383
                  protocol: TCP
              volumeMounts:
                - name: store-volume
                  mountPath: /data
    ---
    #------------ Service ----------------#
    kind: Service
    apiVersion: v1
    metadata:
      labels:
        k8s-app: store
      name: store
      namespace: test
    spec:
      type: LoadBalancer
      ports:
        - port: 8383
          targetPort: 8383
      selector:
        k8s-app: store
As you can see, I defined '/Volumes/Data/data' as the Persistent Volume and expect it to be mounted at '/data' in the container.
So I am assuming that whatever is in '/Volumes/Data/data' on the host should be visible in the '/data' directory in the container. Is this assumption correct? Because this is definitely not happening at the moment.
My second assumption is that whatever I save at '/data' should be visible on the host, which is also not happening.
I can see from the Kubernetes console that everything started correctly (Persistent Volume, Claim, Deployment, Pod, Service...).
Am I understanding the persistent volume concept correctly at all?
PS. I am trying this on a Mac with Docker (18.05.0-ce-mac67 (25042), Edge channel); maybe it should not work on a Mac?
Thanks for any answers.
Assuming you are using a multi-node Kubernetes cluster, you should be able to see the data mounted locally at /Volumes/Data/data on the specific worker node that the pod is running on.
You can check which worker your pod is scheduled on with the command kubectl get pods -o wide -n test
Please note that, as per the Kubernetes docs, a HostPath PersistentVolume is for single-node testing only: local storage is not supported in any way and WILL NOT WORK in a multi-node cluster.
It does work in my case.
As you are using hostPath, you should check this '/data' directory on the worker node where the pod is running.
As the answer above said, run 'kubectl get po -n test -o wide' and you will see the node the pod is hosted on. Then, if you SSH into that worker, you can see the volume.
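If the data has to live on one particular node, a local PersistentVolume with node affinity makes that explicit, so the scheduler places the pod where the directory actually is. A minimal sketch, assuming a hypothetical node name worker-1 and a hypothetical storage class name local-storage. Also note that PersistentVolumes are cluster-scoped, so the namespace field on the PV in the question has no effect.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: store-local-volume          # hypothetical name
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage   # hypothetical class; the claim must request the same name
  local:
    path: /Volumes/Data/data        # must already exist on the node named below
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1          # hypothetical node name, see kubectl get nodes
```

A claim with storageClassName: local-storage would bind to this volume, and any pod using that claim is then scheduled onto worker-1.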

Unable to create Persistent Volume Claim for PetSet on CoreOS

Trying to set up PetSet using Kube-Solo
In my local dev environment, I have set up Kube-Solo with CoreOS. I'm trying to deploy a Kubernetes PetSet that includes a Persistent Volume Claim Template as part of the PetSet configuration. This configuration fails and none of the pods are ever started. Here is my PetSet definition:
    apiVersion: apps/v1alpha1
    kind: PetSet
    metadata:
      name: marklogic
    spec:
      serviceName: "ml-service"
      replicas: 2
      template:
        metadata:
          labels:
            app: marklogic
          annotations:
            pod.alpha.kubernetes.io/initialized: "true"
        spec:
          terminationGracePeriodSeconds: 30
          containers:
            - name: 'marklogic'
              image: {ip address of repo}:5000/dcgs-sof/ml8-docker-final:v1
              imagePullPolicy: Always
              command: ["/opt/entry-point.sh", "-l", "/opt/mlconfig.sh"]
              ports:
                - containerPort: 7997
                  name: health-check
                - containerPort: 8000
                  name: app-services
                - containerPort: 8001
                  name: admin
                - containerPort: 8002
                  name: manage
                - containerPort: 8040
                  name: sof-sdl
                - containerPort: 8041
                  name: sof-sdl-xcc
                - containerPort: 8042
                  name: ml8042
                - containerPort: 8050
                  name: sof-sdl-admin
                - containerPort: 8051
                  name: sof-sdl-cache
                - containerPort: 8060
                  name: sof-sdl-camel
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
              lifecycle:
                preStop:
                  exec:
                    # exec form takes one array element per argument; no shell splits the string
                    command: ["/etc/init.d/MarkLogic", "stop"]
              volumeMounts:
                - name: ml-data
                  mountPath: /var/opt/MarkLogic
      volumeClaimTemplates:
        - metadata:
            name: ml-data
            annotations:
              volume.alpha.kubernetes.io/storage-class: anything
          spec:
            accessModes: [ "ReadWriteMany" ]
            resources:
              requests:
                storage: 1Gi
In the Kubernetes dashboard, I see the following error message:
SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "ml-data-marklogic-0", which is unexpected.
It seems that being unable to create the Persistent Volume Claim is also preventing the image from ever being pulled from my local repository. Additionally, the Kubernetes Dashboard shows the request for the Persistent Volume Claims, but the state is continuously "pending".
I have verified the issue is with the Persistent Volume Claim. If I remove that from the PetSet configuration the deployment succeeds.
I should note that I was using MiniKube prior to this and would see the same message, but once the image was pulled and the pod(s) started the claim would take hold and the message would go away.
I am using
Kubernetes version: 1.4.0
Docker version: 1.12.1 (on my mac) & 1.10.3 (inside the CoreOS vm)
Corectl version: 0.2.8
Kube-Solo version: 0.9.6
I am not familiar with kube-solo.
However, the issue here might be that you are attempting to use dynamic volume provisioning, a feature that is in beta and that does not have specific support for volumes in your environment.
The best way around this would be to manually create the persistent volumes that it expects to find, so that the PersistentVolumeClaim can bind to them.
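As a rough sketch of what "creating the volumes manually" could look like for the two replicas in the question: one PersistentVolume per expected claim (ml-data-marklogic-0 and ml-data-marklogic-1), each at least as large as the 1Gi request and offering the same access mode. The volume names and host directories below are hypothetical, and I have not verified whether a pre-created volume binds while the alpha storage-class annotation is still present on the claim template, so it may need to be removed.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ml-data-pv-0              # hypothetical name
spec:
  capacity:
    storage: 1Gi                  # matches the 1Gi request in volumeClaimTemplates
  accessModes:
    - ReadWriteMany               # matches the claim's access mode
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/ml-data-0         # hypothetical directory inside the CoreOS VM
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ml-data-pv-1              # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/ml-data-1         # hypothetical directory inside the CoreOS VM
```

With a single-VM setup like Kube-Solo, hostPath is enough for local development; kubectl get pv,pvc should show the claims move from Pending to Bound once matching volumes exist.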
The same error happened to me, and I found clues about the following config (covering volumeClaimTemplates and StorageClass) in the Slack group and in this pull request:
    volumeClaimTemplates:
      - metadata:
          name: cassandra-data
          annotations:
            volume.beta.kubernetes.io/storage-class: standard
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 1Gi
    ---
    kind: StorageClass
    apiVersion: storage.k8s.io/v1beta1
    metadata:
      namespace: kube-system
      name: standard
      annotations:
        storageclass.beta.kubernetes.io/is-default-class: "true"
      labels:
        kubernetes.io/cluster-service: "true"
    provisioner: kubernetes.io/host-path

Resources