I have a problem: I can't connect to an external database that is on the same network but on another node. My application is built on Spring Boot, and the problem occurs when it tries to connect to the database. This is the stack trace of the error:
22:40:56.393 [main] INFO com.zaxxer.hikari.HikariDataSource - HikariPool-1 - Starting...
22:41:07.480 [main] ERROR com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Exception during pool initialization.
org.postgresql.util.PSQLException: The connection attempt failed.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
at org.postgresql.Driver.makeConnection(Driver.java:402)
at org.postgresql.Driver.connect(Driver.java:261)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
...
...
Caused by: java.net.SocketTimeoutException: connect timed out
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
22:41:07.482 [main] WARN o.h.e.j.e.i.JdbcEnvironmentInitiator - HHH000342: Could not obtain connection to query metadata
org.postgresql.util.PSQLException: The connection attempt failed.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
at org.postgresql.Driver.makeConnection(Driver.java:402)
I'm using minikube version: v1.26.0
Command used to start minikube:
minikube start --driver=hyperkit
My deployment YAML file:
apiVersion: v1
kind: Service
metadata:
  name: web-application
spec:
  selector:
    role: webapp
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-gastronomia
spec:
  selector:
    matchLabels:
      role: webapp
  replicas: 1
  template:
    metadata:
      labels:
        role: webapp
        env: production
    spec:
      containers:
        - name: app-gastronomia
          image: localhost:5000/app-gastronomia:1.0
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
My Dockerfile:
FROM openjdk:11.0.7-jre-slim
COPY build/libs/register-0.0.1-SNAPSHOT.jar app.jar
EXPOSE 8001
ENTRYPOINT ["java","-jar","app.jar"]
In Docker everything works fine; I have no problems.
Ping from an Ubuntu pod:
root@ubuntu:/# ping -c 3 172.16.0.180
PING 172.16.0.180 (172.16.0.180) 56(84) bytes of data.
--- 172.16.0.180 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2072ms
Ping from my computer's terminal:
ping -c 3 172.16.0.180
PING 172.16.0.180 (172.16.0.180): 56 data bytes
64 bytes from 172.16.0.180: icmp_seq=0 ttl=63 time=78.136 ms
64 bytes from 172.16.0.180: icmp_seq=1 ttl=63 time=74.404 ms
64 bytes from 172.16.0.180: icmp_seq=2 ttl=63 time=74.904 ms
--- 172.16.0.180 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 74.404/75.815/78.136/1.654 ms
Investigating, I created a Service and an Endpoints object pointing to the external IP; here is my file:
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  clusterIP: None
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP
---
kind: Endpoints
apiVersion: v1
metadata:
  name: database
subsets:
  - addresses:
      - ip: 172.16.0.180
    ports:
      - port: 5432
Pinging from the Ubuntu pod:
root@ubuntu:/# ping -c 3 database
PING database.default.svc.cluster.local (172.16.0.180) 56(84) bytes of data.
--- database.default.svc.cluster.local ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2032ms
Any solution for how to connect from my application (pod) inside minikube to the database outside?
EDIT
I forgot to mention that I work from home; I access the database through a VPN.
This really depends on how you've set up Minikube. Normally, it is running inside a VM which is isolated from the network other local containers are running on.
There are two simple ways to go about solving your issue:
Run the DB on Minikube as well, which will allow the application to access it using a Kubernetes-internal DNS address.
Expose the DB to the public internet using Ngrok or a similar solution. This approach is much less recommended as it needlessly exposes your DB to the internet just so you can access it from Minikube.
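If the database is reachable from inside the cluster under a DNS name (for example through the selector-less database Service and Endpoints shown above, or a Postgres Service running in Minikube itself), the Spring Boot datasource can be pointed at that name instead of a raw IP. A minimal sketch, assuming the database Service in the default namespace and a hypothetical database called mydb with placeholder credentials; the SPRING_DATASOURCE_* variables are Spring Boot's standard relaxed-binding overrides for spring.datasource.*:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-gastronomia
spec:
  selector:
    matchLabels:
      role: webapp
  template:
    metadata:
      labels:
        role: webapp
    spec:
      containers:
        - name: app-gastronomia
          image: localhost:5000/app-gastronomia:1.0
          env:
            # Placeholder database name and credentials, for illustration only.
            - name: SPRING_DATASOURCE_URL
              value: jdbc:postgresql://database.default.svc.cluster.local:5432/mydb
            - name: SPRING_DATASOURCE_USERNAME
              value: myuser
            - name: SPRING_DATASOURCE_PASSWORD
              value: mypassword
Note that this only helps once the cluster itself has a route to 172.16.0.180; with the VPN terminating on the host, the HyperKit VM behind Minikube may still have no route, which is consistent with the ping failing from the pod but succeeding from the host's terminal.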
Related
I am using the WSL2-based engine for Docker, and I have enabled Kubernetes v1.19.3.
I have several Kubernetes services and pods running and I want to connect to a website hosted on the WSL2 VM. How can I determine the IP address for that VM that I can use to connect from a pod?
I ran hostname -I on the VM and got an IP address for the machine.
I created a service and an endpoint:
apiVersion: v1
kind: Service
metadata:
  name: test-viewer
spec:
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8280
---
apiVersion: v1
kind: Endpoints
metadata:
  name: test-viewer
subsets:
  - addresses:
      - ip: 172.17.159.34
    ports:
      - port: 8280
I tried to use curl from one of the pods and got the following error:
curl http://test-viewer.default.svc.cluster.local:8080/index.html --output somefile
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
curl: (7) Failed connect to test-viewer.default.svc.cluster.local:8080; No route to host
I run the following command on the WSL machine without issue:
curl http://172.17.159.34:8280/index.html --output somefile
Your problem is the port specified on the Service.
So either change the Service port to 80, or use port 8080 in your curl command:
apiVersion: v1
kind: Service
metadata:
  name: test-viewer
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8280
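Because this Service has no selector, Kubernetes will not manage Endpoints for it automatically, so the Endpoints object from the question must keep exactly the same name for the Service to forward traffic to the WSL2 address. A minimal sketch, reusing the IP from the question:
apiVersion: v1
kind: Endpoints
metadata:
  name: test-viewer   # must match the Service name
subsets:
  - addresses:
      - ip: 172.17.159.34   # the WSL2 VM address from hostname -I
    ports:
      - port: 8280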
I have two applications, nginx and redis, where nginx uses redis to cache some data so the redis address must be configured in nginx.
On the one hand, I could first apply the redis deployment, get its IP, and then apply the nginx deployment to set up the two applications in my minikube.
But on the other, to simplify installation in the Kubernetes Dashboard for QA, I want to create a single Kubernetes YAML file (like GoogleCloudPlatform/microservices-demo/kubernetes-manifests.yaml) to deploy these two applications in two separate Pods. However, if I do it by means of environment variables, I cannot get the redis address.
So how do I achieve it?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-master
  labels:
    app: redis
spec:
  selector:
    matchLabels:
      app: redis
      role: master
      tier: backend
  replicas: 2
  template:
    metadata:
      labels:
        app: redis
        role: master
        tier: backend
    spec:
      containers:
        - name: master-c
          image: docker.io/redis:alpine
          ports:
            - containerPort: 6379
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector: # Defines how the Deployment finds which Pods to manage.
    matchLabels:
      app: my-nginx
  template:
    metadata: # Defines what the newly created Pods are labeled.
      labels:
        app: my-nginx
        tier: frontend
    spec:
      terminationGracePeriodSeconds: 5
      containers:
        - name: my-nginx # Defines container name
          image: my-nginx:dev # docker image load -i my-nginx-docker_image.tar
          imagePullPolicy: Never # Always, IfNotPresent (default), Never
          ports:
          env:
            - name: NGINX_ERROR_LOG_SEVERITY_LEVEL
              value: debug
            - name: MY_APP_REDIS_HOST
              # How to use the IP address of the POD with redis-master labeled that is created by the previous deployment?
              value: 10.86.50.235
              # https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
              # valueFrom:
              #   fieldRef:
              #     fieldPath: status.podIP # this is the current POD IP
            - name: MY_APP_CLIENT_ID
              value: client_id
            - name: MY_APP_CLIENT_SECRET
              # https://kubernetes.io/docs/concepts/configuration/secret
              value: client_secret
---
# https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service
apiVersion: v1
kind: Service
# https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors
# https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/
# metadata - Data that helps uniquely identify the object, including a name string, UID, and optional namespace
metadata:
  name: my-nginx
spec:
  type: NodePort
  selector:
    # Defines a proper selector for your pods with corresponding `.metadata.labels` field.
    # Verify it using: kubectl get pods --selector app=my-nginx || kubectl get pod -l app=my-nginx
    # Make sure the service points to correct pod by, for example, `kubectl describe pod -l app=my-nginx`
    app: my-nginx
  ports:
    # By default and for convenience, the `targetPort` is set to the same value as the `port` field.
    - name: http
      port: 6080
      targetPort: 80
      # By default and for convenience, the Kubernetes control plane will allocate a port from a range (default: 30000-32767)
      nodePort: 30080
    - name: https
      port: 6443
      targetPort: 443
      nodePort: 30443
Added some network output:
Microsoft Windows [Version 10.0.18362.900]
(c) 2019 Microsoft Corporation. All rights reserved.
PS C:\Users\ssfang> kubectl get pods
NAME READY STATUS RESTARTS AGE
my-nginx-pod 1/1 Running 9 5d14h
redis-master-7db899bccb-npl6s 1/1 Running 3 2d15h
redis-master-7db899bccb-rgx47 1/1 Running 3 2d15h
C:\Users\ssfang> kubectl exec redis-master-7db899bccb-npl6s -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
C:\Users\ssfang> kubectl exec my-nginx-pod -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
C:\Users\ssfang> kubectl -n kube-system get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller-admission ClusterIP 10.108.221.2 <none> 443/TCP 7d11h
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 7d17h
C:\Users\ssfang> kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 172.17.0.2:53,172.17.0.5:53,172.17.0.2:9153 + 3 more... 7d17h
C:\Users\ssfang> kubectl get ep kube-dns --namespace=kube-system -o=yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2020-07-09T02:08:35Z"
  creationTimestamp: "2020-07-01T09:34:44Z"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: KubeDNS
  managedFields:
    - apiVersion: v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:endpoints.kubernetes.io/last-change-trigger-time: {}
          f:labels:
            .: {}
            f:k8s-app: {}
            f:kubernetes.io/cluster-service: {}
            f:kubernetes.io/name: {}
        f:subsets: {}
      manager: kube-controller-manager
      operation: Update
      time: "2020-07-09T02:08:35Z"
  name: kube-dns
  namespace: kube-system
  resourceVersion: "523617"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-dns
subsets:
  - addresses:
      - nodeName: minikube
        targetRef:
          kind: Pod
          namespace: kube-system
          resourceVersion: "523566"
          uid: ed3a9f46-718a-477a-8804-e87511db16d1
      - ip: 172.17.0.5
        nodeName: minikube
        targetRef:
          kind: Pod
          name: coredns-546565776c-hmm5s
          namespace: kube-system
          resourceVersion: "523616"
          uid: ae21c65c-e937-4e3d-8a7a-636d4f780855
    ports:
      - name: dns-tcp
        port: 53
        protocol: TCP
      - name: metrics
        port: 9153
        protocol: TCP
      - name: dns
        port: 53
        protocol: UDP
C:\Users\ssfang> kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 7d20h
my-nginx-service NodePort 10.98.82.96 <none> 6080:30080/TCP,6443:30443/TCP 7d13h
PS C:\Users\ssfang> kubectl describe pod/my-nginx-pod | findstr IP
IP: 172.17.0.8
IPs:
IP: 172.17.0.8
PS C:\Users\ssfang> kubectl describe service/my-nginx-service | findstr IP
IP: 10.98.82.96
C:\Users\ssfang> kubectl describe pod/my-nginx-65ffdfb5b5-dzgjk | findstr IP
IP: 172.17.0.4
IPs:
IP: 172.17.0.4
Take two Pods with nginx, for example, to inspect the network:
C:\Users\ssfang> kubectl exec my-nginx-pod -it -- bash
# How to install nslookup, dig, host commands in Linux
apt-get install dnsutils -y # In ubuntu
yum install bind-utils -y # In RHEL/Centos
root@my-nginx-pod:/etc# apt update && apt-get install -y dnsutils iputils-ping
root@my-nginx-pod:/etc# nslookup my-nginx-service
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: my-nginx-service.default.svc.cluster.local
Address: 10.98.82.96
root@my-nginx-pod:/etc# nslookup my-nginx-pod
Server: 10.96.0.10
Address: 10.96.0.10#53
** server can't find my-nginx-pod: SERVFAIL
root@my-nginx-pod:/etc# ping -c3 -W60 my-nginx-pod
PING my-nginx-pod (172.17.0.8) 56(84) bytes of data.
64 bytes from my-nginx-pod (172.17.0.8): icmp_seq=1 ttl=64 time=0.011 ms
64 bytes from my-nginx-pod (172.17.0.8): icmp_seq=2 ttl=64 time=0.021 ms
64 bytes from my-nginx-pod (172.17.0.8): icmp_seq=3 ttl=64 time=0.020 ms
--- my-nginx-pod ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2065ms
rtt min/avg/max/mdev = 0.011/0.017/0.021/0.005 ms
root@my-nginx-pod:/etc# ping -c3 -W20 my-nginx-service
PING my-nginx-service.default.svc.cluster.local (10.98.82.96) 56(84) bytes of data.
--- my-nginx-service.default.svc.cluster.local ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2060ms
root@my-nginx-pod:/etc# ping -c3 -W20 my-nginx-pod.default.svc.cluster.local
ping: my-nginx-pod.default.svc.cluster.local: Name or service not known
root@my-nginx-pod:/etc# ping -c3 -W20 my-nginx-service.default.svc.cluster.local
PING my-nginx-service.default.svc.cluster.local (10.98.82.96) 56(84) bytes of data.
--- my-nginx-service.default.svc.cluster.local ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2051ms
C:\Users\ssfang> kubectl exec my-nginx-65ffdfb5b5-dzgjk -it -- bash
root@my-nginx-65ffdfb5b5-dzgjk:/etc# ping -c3 -W20 my-nginx-pod.default.svc.cluster.local
ping: my-nginx-pod.default.svc.cluster.local: Name or service not known
root@my-nginx-65ffdfb5b5-dzgjk:/etc# ping -c3 -W20 my-nginx-service.default.svc.cluster.local
ping: my-nginx-service.default.svc.cluster.local: Name or service not known
root@my-nginx-65ffdfb5b5-dzgjk:/etc# ping -c3 -W20 172.17.0.8
PING 172.17.0.8 (172.17.0.8) 56(84) bytes of data.
64 bytes from 172.17.0.8: icmp_seq=1 ttl=64 time=0.195 ms
64 bytes from 172.17.0.8: icmp_seq=2 ttl=64 time=0.039 ms
64 bytes from 172.17.0.8: icmp_seq=3 ttl=64 time=0.039 ms
--- 172.17.0.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2055ms
rtt min/avg/max/mdev = 0.039/0.091/0.195/0.073 ms
C:\Users\ssfang> ssh -o StrictHostKeyChecking=no -i C:\Users\ssfang\.minikube\machines\minikube\id_rsa docker@10.86.50.252 &:: minikube ssh
_ _
_ _ ( ) ( )
___ ___ (_) ___ (_)| |/') _ _ | |_ __
/' _ ` _ `\| |/' _ `\| || , < ( ) ( )| '_`\ /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )( ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)
$ ping default.svc.cluster.local
ping: bad address 'default.svc.cluster.local'
$ ping my-nginx-pod.default.svc.cluster.local
ping: bad address 'my-nginx-pod.default.svc.cluster.local'
$ ping my-nginx-service.default.svc.cluster.local
ping: bad address 'my-nginx-service.default.svc.cluster.local'
$ nslookup whoami
Server: 10.86.50.1
Address: 10.86.50.1:53
** server can't find whoami: NXDOMAIN
** server can't find whoami: NXDOMAIN
$ ping -c3 -W20 172.17.0.8
PING 172.17.0.8 (172.17.0.8): 56 data bytes
64 bytes from 172.17.0.8: seq=0 ttl=64 time=0.053 ms
64 bytes from 172.17.0.8: seq=1 ttl=64 time=0.035 ms
64 bytes from 172.17.0.8: seq=2 ttl=64 time=0.040 ms
--- 172.17.0.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.035/0.042/0.053 ms
$ ping -c3 -W20 172.17.0.4
PING 172.17.0.4 (172.17.0.4): 56 data bytes
64 bytes from 172.17.0.4: seq=0 ttl=64 time=0.070 ms
64 bytes from 172.17.0.4: seq=1 ttl=64 time=0.039 ms
64 bytes from 172.17.0.4: seq=2 ttl=64 time=0.038 ms
--- 172.17.0.4 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.038/0.049/0.070 ms
Hardcoding an IP address is not good practice. Instead, you can create a Service for redis as well and configure the Service DNS name in your nginx deployment, using the Kubernetes DNS naming scheme my-svc.my-namespace.svc.cluster-domain.example. Your nginx will then communicate with the redis container through this Service.
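A minimal sketch of that approach, assuming a Service named redis-master in the default namespace that selects the Pods created by the redis-master Deployment above (the Service name itself is the only new piece here):
apiVersion: v1
kind: Service
metadata:
  name: redis-master
spec:
  selector:
    # Matches the Pod template labels of the redis-master Deployment in the question.
    app: redis
    role: master
    tier: backend
  ports:
    - port: 6379
      targetPort: 6379
In the my-nginx Deployment, MY_APP_REDIS_HOST can then be set to redis-master.default.svc.cluster.local (or simply redis-master from within the same namespace) instead of the hard-coded 10.86.50.235, and the Service resolves and load-balances to whichever redis Pod is currently running.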
I am trying to setup a high-availability RabbitMQ cluster of nodes in my Kubernetes cluster as a StatefulSet so that my data (e.g. queues, messages) persist even after restarting all of the nodes simultaneously. Since I'm deploying the RabbitMQ nodes in Kubernetes, I understand that I need to include an external persistent volume for the nodes to store data in so that the data will persist after a restart. I have mounted an Azure Files Share into my containers as a volume at the directory /var/lib/rabbitmq/mnesia.
When starting with a fresh (empty) volume, the nodes start up without any issues and successfully form a cluster. I can open the RabbitMQ management UI and see that any queue I create is mirrored on all of the nodes, as expected, and the queue (plus any messages in it) will persist as long as there is at least 1 active node. Deleting pods with kubectl delete pod rabbitmq-0 -n rabbit will cause the node to stop and then restart, and the logs show that it successfully syncs with any remaining/active node so everything is fine.
The problem I have encountered is that when I simultaneously delete all RabbitMQ nodes in the cluster, the first node to start up will have the persisted data from the volume and tries to re-cluster with the other two nodes which are, of course, not active. What I expected to happen was that the node would start up, load the queue and message data, and then form a new cluster (since it should notice that no other nodes are active).
I suspect that there may be some data in the mounted volume that indicates the presence of other nodes which is why it tries to connect with them and join the supposed cluster, but I haven't found a way to prevent that and am not certain that this is the cause.
There are two different error messages: one in the pod description (kubectl describe pod rabbitmq-0 -n rabbit) when the RabbitMQ node is in a crash loop and another in the pod logs. The pod description error output includes the following:
exited with 137:
20:38:12.331 [error] Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only
Error: unable to perform an operation on node 'rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local']
rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local:
* connected to epmd (port 4369) on rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* epmd reports: node 'rabbit' not running at all
no other nodes on rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-345-rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: xxxxxxxxxxxxxxxxx
and the logs output the following info:
Config file(s): /etc/rabbitmq/rabbitmq.conf
Starting broker...2020-06-12 20:39:08.678 [info] <0.294.0>
node : rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash : xxxxxxxxxxxxxxxxx
log(s) : <stdout>
database dir : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
...
2020-06-12 20:48:39.015 [warning] <0.294.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,['rabbit@rabbitmq-2.rabbitmq-internal.rabbit.svc.cluster.local','rabbit@rabbitmq-1.rabbitmq-internal.rabbit.svc.cluster.local','rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2020-06-12 20:48:39.015 [info] <0.294.0> Waiting for Mnesia tables for 30000 ms, 0 retries left
2020-06-12 20:49:09.341 [info] <0.44.0> Application mnesia exited with reason: stopped
2020-06-12 20:49:09.505 [error] <0.294.0>
2020-06-12 20:49:09.505 [error] <0.294.0> BOOT FAILED
2020-06-12 20:49:09.505 [error] <0.294.0> ===========
2020-06-12 20:49:09.505 [error] <0.294.0> Timeout contacting cluster nodes: ['rabbit@rabbitmq-2.rabbitmq-internal.rabbit.svc.cluster.local',
2020-06-12 20:49:09.505 [error] <0.294.0> 'rabbit@rabbitmq-1.rabbitmq-internal.rabbit.svc.cluster.local'].
...
BACKGROUND
==========
This cluster node was shut down while other nodes were still running.
2020-06-12 20:49:09.506 [error] <0.294.0>
2020-06-12 20:49:09.506 [error] <0.294.0> This cluster node was shut down while other nodes were still running.
2020-06-12 20:49:09.506 [error] <0.294.0> To avoid losing data, you should start the other nodes first, then
2020-06-12 20:49:09.506 [error] <0.294.0> start this one. To force this node to start, first invoke
To avoid losing data, you should start the other nodes first, then
start this one. To force this node to start, first invoke
"rabbitmqctl force_boot". If you do so, any changes made on other
cluster nodes after this one was shut down may be lost.
What I've tried so far is clearing the /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local/nodes_running_at_shutdown file contents, and fiddling with config settings such as the volume mount directory and erlang cookie permissions.
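For reference, one StatefulSet setting that is sometimes suggested for this restart-everything-at-once scenario, and which is not part of the manifests below, is podManagementPolicy: Parallel: with the default OrderedReady policy only rabbitmq-0 is started after a full shutdown, and since it never becomes Ready while waiting for its peers' Mnesia tables, rabbitmq-1 and rabbitmq-2 are never started. A sketch of where it would go (excerpt only; the rest of the StatefulSet stays as below):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbit
spec:
  serviceName: rabbitmq-internal
  # Hypothetical addition, not in the original manifest: start all replicas at the
  # same time instead of one by one, so peer nodes are reachable while each broker
  # waits for its Mnesia tables.
  podManagementPolicy: Parallel
  replicas: 3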
Below are the relevant deployment files and config files:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbit
spec:
  serviceName: rabbitmq-internal
  revisionHistoryLimit: 3
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      name: rabbitmq
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10
      containers:
        - name: rabbitmq
          image: rabbitmq:0.13
          lifecycle:
            postStart:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - >
                    until rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} node_health_check; do sleep 1; done;
                    rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} set_policy ha-all "" '{"ha-mode":"all", "ha-sync-mode": "automatic"}'
          ports:
            - containerPort: 4369
            - containerPort: 5672
            - containerPort: 5671
            - containerPort: 25672
            - containerPort: 15672
          resources:
            requests:
              memory: "500Mi"
              cpu: "0.4"
            limits:
              memory: "600Mi"
              cpu: "0.6"
          livenessProbe:
            exec:
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status", "--erlang-cookie", "$(RABBITMQ_ERLANG_COOKIE)"]
            initialDelaySeconds: 60
            periodSeconds: 60
            timeoutSeconds: 15
          readinessProbe:
            exec:
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status", "--erlang-cookie", "$(RABBITMQ_ERLANG_COOKIE)"]
            initialDelaySeconds: 20
            periodSeconds: 60
            timeoutSeconds: 10
          envFrom:
            - configMapRef:
                name: rabbitmq-cfg
          env:
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
            - name: RABBITMQ_NODENAME
              value: "rabbit@$(HOSTNAME).rabbitmq-internal.$(NAMESPACE).svc.cluster.local"
            - name: K8S_SERVICE_NAME
              value: "rabbitmq-internal"
            - name: RABBITMQ_DEFAULT_USER
              value: user
            - name: RABBITMQ_DEFAULT_PASS
              value: pass
            - name: RABBITMQ_ERLANG_COOKIE
              value: my-cookie
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - name: my-volume-mount
              mountPath: "/var/lib/rabbitmq/mnesia"
      imagePullSecrets:
        - name: my-secret
      volumes:
        - name: my-volume-mount
          azureFile:
            secretName: azure-rabbitmq-secret
            shareName: my-fileshare-name
            readOnly: false
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-cfg
  namespace: rabbit
data:
  RABBITMQ_VM_MEMORY_HIGH_WATERMARK: "0.6"
---
kind: Service
apiVersion: v1
metadata:
  namespace: rabbit
  name: rabbitmq-internal
  labels:
    app: rabbitmq
spec:
  clusterIP: None
  ports:
    - name: http
      protocol: TCP
      port: 15672
    - name: amqp
      protocol: TCP
      port: 5672
    - name: amqps
      protocol: TCP
      port: 5671
  selector:
    app: rabbitmq
---
kind: Service
apiVersion: v1
metadata:
  namespace: rabbit
  name: rabbitmq
  labels:
    app: rabbitmq
    type: LoadBalancer
spec:
  selector:
    app: rabbitmq
  ports:
    - name: http
      protocol: TCP
      port: 15672
      targetPort: 15672
    - name: amqp
      protocol: TCP
      port: 5672
      targetPort: 5672
    - name: amqps
      protocol: TCP
      port: 5671
      targetPort: 5671
Dockerfile:
FROM rabbitmq:3.8.4
COPY conf/rabbitmq.conf /etc/rabbitmq
COPY conf/enabled_plugins /etc/rabbitmq
USER root
COPY conf/.erlang.cookie /var/lib/rabbitmq
RUN /bin/bash -c 'ls -ld /var/lib/rabbitmq/.erlang.cookie; chmod 600 /var/lib/rabbitmq/.erlang.cookie; ls -ld /var/lib/rabbitmq/.erlang.cookie'
rabbitmq.conf
## cluster formation settings
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.service_name = rabbitmq-internal
cluster_formation.k8s.hostname_suffix = .rabbitmq-internal.rabbit.svc.cluster.local
cluster_formation.node_cleanup.interval = 60
cluster_formation.node_cleanup.only_log_warning = true
cluster_partition_handling = autoheal
queue_master_locator=min-masters
## general settings
log.file.level = debug
## Mgmt UI secure/non-secure connection settings (secure not implemented yet)
management.tcp.port = 15672
## RabbitMQ entrypoint settings (will be injected below when image is built)
Thanks in advance!
What I want to achieve:
We have an on-premise Kafka cluster. I want to set up KSQLDB in OpenShift and connect it to the brokers of the on-premise Kafka cluster.
The problem:
When I try to start the KSQLDB server with the command "/usr/bin/ksql-server-start /etc/ksqldb/ksql-server.properties" I get the error message:
[2020-05-14 15:47:48,519] ERROR Failed to start KSQL (io.confluent.ksql.rest.server.KsqlServerMain:60)
io.confluent.ksql.util.KsqlServerException: Could not get Kafka cluster configuration!
at io.confluent.ksql.services.KafkaClusterUtil.getConfig(KafkaClusterUtil.java:90)
at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.isKafkaAuthorizerEnabled(KsqlAuthorizationValidatorFactory.java:81)
at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.create(KsqlAuthorizationValidatorFactory.java:51)
at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:624)
at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:544)
at io.confluent.ksql.rest.server.KsqlServerMain.createExecutable(KsqlServerMain.java:98)
at io.confluent.ksql.rest.server.KsqlServerMain.main(KsqlServerMain.java:56)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1589471268517) timed out at 1589471268518 after 1 attempt(s)
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at io.confluent.ksql.services.KafkaClusterUtil.getConfig(KafkaClusterUtil.java:60)
... 6 more
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1589471268517) timed out at 1589471268518 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
My configuration:
I set up my Dockerfile on the basis of this image: https://hub.docker.com/r/confluentinc/ksqldb-server; ports 9092, 9093, 8080, 8082 and 443 are open.
My Service YAML looks like this:
kind: Service
apiVersion: v1
metadata:
  name: social-media-dev
  namespace: abc
  selfLink: xyz
  uid: xyz
  resourceVersion: '1'
  creationTimestamp: '2020-05-14T09:47:15Z'
  labels:
    app: social-media-dev
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
spec:
  ports:
    - name: social-media-dev
      protocol: TCP
      port: 9092
      targetPort: 9092
      nodePort: 31364
  selector:
    app: social-media-dev
    deploymentconfig: social-media-dev
  clusterIP: XX.XX.XXX.XXX
  type: LoadBalancer
  externalIPs:
    - XXX.XX.XXX.XXX
  sessionAffinity: None
  externalTrafficPolicy: Cluster
status:
  loadBalancer:
    ingress:
      - ip: XX.XX.XXX.XXX
My ksql-server.properties file includes the following information:
listeners: http://0.0.0.0:8082
bootstrap.servers: X.X.X.X:9092, X.X.X.Y:9092, X.X.X.Z:9092
What I have tried so far:
I tried to connect from within my pod to a broker and it worked: (timeout 1 bash -c '</dev/tcp/X.X.X.X/9092 && echo PORT OPEN || echo PORT CLOSED') 2>/dev/null
result: PORT OPEN
I also played around with the listeners setting, but then the error message just got shorter, with only "Could not get Kafka cluster configuration!" and without the timeout error.
I tried switching the Service from LoadBalancer to NodePort, but also without success.
Do you have any ideas about what I could try next?
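In case it is useful, the selector-less Service plus Endpoints pattern from the earlier questions is sometimes used to give external brokers a stable in-cluster DNS name; a rough sketch with placeholder names and broker IPs (this is not from the original setup, and it does not by itself address the brokers' advertised listeners):
kind: Service
apiVersion: v1
metadata:
  name: kafka-external   # hypothetical name
  namespace: abc
spec:
  clusterIP: None
  ports:
    - port: 9092
      targetPort: 9092
      protocol: TCP
---
kind: Endpoints
apiVersion: v1
metadata:
  name: kafka-external   # must match the Service name
  namespace: abc
subsets:
  - addresses:
      - ip: X.X.X.X   # placeholder on-premise broker addresses
      - ip: X.X.X.Y
      - ip: X.X.X.Z
    ports:
      - port: 9092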
UPDATE:
With an upgrade to Cloudera CDH6, the Cloudera Kafka cluster now also works with Kafka Streams. Hence I was able to connect from my KSQLDB cluster in OpenShift to the on-premise Kafka cluster.
I will also describe my final way of connecting to the kerberized Kafka cluster here, as I struggled a lot to get it running:
Getting Kerberos tickets and establishing connections via SSL
ksql-server.properties (the SASL_SSL part of it):
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
ssl.truststore.location=truststore.jks
ssl.truststore.password=password
ssl.truststore.type=JKS
ssl.ca.location=cert
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka" principal="myprincipal";
serviceName="kafka"
producer.ssl.endpoint.identification.algorithm=HTTPS
producer.security.protocol=SASL_SSL
producer.ssl.truststore.location=truststore.jks
producer.ssl.truststore.password=password
producer.sasl.mechanism=GSSAPI
producer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka" principal="myprincipal";
consumer.ssl.endpoint.identification.algorithm=HTTPS
consumer.security.protocol=SASL_SSL
consumer.ssl.truststore.location=truststore.jks
consumer.ssl.truststore.password=password
consumer.sasl.mechanism=GSSAPI
consumer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka" principal="myprincipal";
Then set up Sentry rules accordingly:
HOST=[HOST]->CLUSTER=kafka-cluster->action=idempotentwrite
HOST=[HOST]->TRANSACTIONALID=[ID]->action=describe
HOST=[HOST]->TRANSACTIONALID=[ID]->action=write
I have had a problem with FTPS (FileZilla) and Kubernetes for weeks.
CONTEXT:
I have a school project with Kubernetes and FTPS.
I need to create an FTPS server in Kubernetes on port 21, and it needs to run on Alpine Linux.
So I created an image of my ftps-alpine server using a Docker container.
I tested whether it works properly on its own:
Using docker run --name test-alpine -itp 21:21 test_alpine
I get this output in FileZilla:
Status: Connecting to 192.168.99.100:21…
Status: Connection established, waiting for welcome message…
Status: Initializing TLS…
Status: Verifying certificate…
Status: TLS connection established.
Status: Logged in
Status: Retrieving directory listing…
Status: Calculating timezone offset of server…
Status: Timezone offset of server is 0 seconds.
Status: Directory listing of “/” successful
It works successfully; FileZilla sees the files within my FTPS directory.
I am good for now (it works in active mode).
PROBLEM:
So what I wanted was to use my image in my Kubernetes cluster (I use Minikube).
When I connect my Docker image to an Ingress/Service/Deployment in Kubernetes, I get this:
Status: Connecting to 192.168.99.100:30894…
Status: Connection established, waiting for welcome message…
Status: Initializing TLS…
Status: Verifying certificate…
Status: TLS connection established.
Status: Logged in
Status: Retrieving directory listing…
Command: PWD
Response: 257 “/” is the current directory
Command: TYPE I
Response: 200 Switching to Binary mode.
Command: PORT 192,168,99,1,227,247
Response: 500 Illegal PORT command.
Command: PASV
Response: 227 Entering Passive Mode (172,17,0,5,117,69).
Command: LIST
Error: The data connection could not be established: EHOSTUNREACH - No route to host
Error: Connection timed out after 20 seconds of inactivity
Error: Failed to retrieve directory listing
SETUP:
ingress.yaml:
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
  namespace: default
  name: ingress-controller
spec:
  backend:
    serviceName: my-nginx
    servicePort: 80
  backend:
    serviceName: ftps-alpine
    servicePort: 21
ftps-alpine.yml:
apiVersion: v1
kind: Service
metadata:
  name: ftps-alpine
  labels:
    run: ftps-alpine
spec:
  type: NodePort
  ports:
    - port: 21
      targetPort: 21
      protocol: TCP
      name: ftp21
    - port: 20
      targetPort: 20
      protocol: TCP
      name: ftp20
  selector:
    run: ftps-alpine
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ftps-alpine
spec:
  selector:
    matchLabels:
      run: ftps-alpine
  replicas: 1
  template:
    metadata:
      labels:
        run: ftps-alpine
    spec:
      containers:
        - name: ftps-alpine
          image: test_alpine
          imagePullPolicy: Never
          ports:
            - containerPort: 21
            - containerPort: 20
WHAT DID I TRY:
When I saw the error message "Error: The data connection could not be established: EHOSTUNREACH - No route to host", I googled it and found this message:
FTP in passive mode: EHOSTUNREACH - No route to host
And I already run my FTPS server in active mode.
I changed the vsftpd.conf file and my Service:
vsftpd.conf:
seccomp_sandbox=NO
pasv_promiscuous=NO
listen=NO
listen_ipv6=YES
anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
use_localtime=YES
xferlog_enable=YES
connect_from_port_20=YES
chroot_local_user=YES
#secure_chroot_dir=/vsftpd/empty
pam_service_name=vsftpd
pasv_enable=YES
pasv_min_port=30020
pasv_max_port=30021
user_sub_token=$USER
local_root=/home/$USER/ftp
userlist_enable=YES
userlist_file=/etc/vsftpd.userlist
userlist_deny=NO
rsa_cert_file=/etc/ssl/private/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.pem
ssl_enable=YES
allow_anon_ssl=NO
force_local_data_ssl=YES
force_local_logins_ssl=YES
ssl_tlsv1=YES
ssl_sslv2=NO
ssl_sslv3=NO
allow_writeable_chroot=YES
#listen_port=21
I changed my Kubernetes NodePorts to 30020 and 30021 and added them to the container ports.
I changed the pasv min and max ports.
I added pasv_address with my minikube IP.
Nothing worked.
Question:
How can I get the same successful result as in the first log, but in my Kubernetes cluster?
If you have any clarifying questions, no problem.
UPDATE:
Thanks to coderanger, I have made progress, and now there is this problem:
Status: Connecting to 192.168.99.100:30894...
Status: Connection established, waiting for welcome message...
Status: Initializing TLS...
Status: Verifying certificate...
Status: TLS connection established.
Status: Logged in
Status: Retrieving directory listing...
Command: PWD
Response: 257 "/" is the current directory
Command: TYPE I
Response: 200 Switching to Binary mode.
Command: PASV
Response: 227 Entering Passive Mode (192,168,99,100,178,35).
Command: LIST
Error: The data connection could not be established: ECONNREFUSED - Connection refused by server
It works with the following change:
apiVersion: v1
kind: Service
metadata:
  name: ftps-alpine
  labels:
    run: ftps-alpine
spec:
  type: NodePort
  ports:
    - port: 21
      targetPort: 21
      nodePort: 30025
      protocol: TCP
      name: ftp21
    - port: 20
      targetPort: 20
      protocol: TCP
      nodePort: 30026
      name: ftp20
    - port: 30020
      targetPort: 30020
      nodePort: 30020
      protocol: TCP
      name: ftp30020
    - port: 30021
      targetPort: 30021
      nodePort: 30021
      protocol: TCP
      name: ftp30021
  selector:
    run: ftps-alpine
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ftps-alpine
spec:
  selector:
    matchLabels:
      run: ftps-alpine
  replicas: 1
  template:
    metadata:
      labels:
        run: ftps-alpine
    spec:
      containers:
        - name: ftps-alpine
          image: test_alpine
          imagePullPolicy: Never
          ports:
            - containerPort: 21
            - containerPort: 20
            - containerPort: 30020
            - containerPort: 30021
and for the vsftpd.conf:
seccomp_sandbox=NO
pasv_promiscuous=NO
listen=YES
listen_ipv6=NO
anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
use_localtime=YES
xferlog_enable=YES
connect_from_port_20=YES
chroot_local_user=YES
#secure_chroot_dir=/vsftpd/empty
pam_service_name=vsftpd
pasv_enable=YES
pasv_min_port=30020
pasv_max_port=30021
user_sub_token=$USER
local_root=/home/$USER/ftp
userlist_enable=YES
userlist_file=/etc/vsftpd.userlist
userlist_deny=NO
rsa_cert_file=/etc/ssl/private/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.pem
ssl_enable=YES
allow_anon_ssl=NO
force_local_data_ssl=YES
force_local_logins_ssl=YES
ssl_tlsv1=YES
ssl_sslv2=NO
ssl_sslv3=NO
allow_writeable_chroot=YES
#listen_port=21
pasv_address=#minikube_ip#
First, you need to fix your passive port range to actually be port 20, as you set in the Service:
pasv_min_port=20
pasv_max_port=20
And then you need to override pasv_address to match whatever IP the user will be connecting to; pick one of your node IPs.