Access Kubernetes Docker for Desktop from Jenkins

I've already installed Kubernetes with Docker for Desktop and it's working fine. Then I installed and configured a Jenkins container.
Now I want to deploy from the Jenkins container to Kubernetes.
I installed the Kubernetes Continuous Deploy plugin and configured the credential using the output of the kubectl config view command:
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://localhost:6445
  name: docker-for-desktop-cluster
contexts:
- context:
    cluster: docker-for-desktop-cluster
    user: docker-for-desktop
  name: docker-for-desktop
current-context: docker-for-desktop
kind: Config
preferences: {}
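As a side note, kubectl config view redacts certificate data by default; if the plugin needs a self-contained kubeconfig with credentials embedded, a flattened, unredacted copy can be exported roughly like this:

# Export the active context as a single, self-contained kubeconfig.
# --minify keeps only the current context, --flatten inlines certificate files,
# --raw disables the default redaction of secret data.
kubectl config view --minify --flatten --raw > jenkins-kubeconfig.yaml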
When I try to deploy to Kubernetes from Jenkins, I get this error:
ERROR: ERROR: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Deployment] with name: [sudoku] in namespace: [default] failed.
hudson.remoting.ProxyException: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Deployment] with name: [sudoku] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:206)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162)
at com.microsoft.jenkins.kubernetes.KubernetesClientWrapper$DeploymentUpdater.getCurrentResource(KubernetesClientWrapper.java:404)
at com.microsoft.jenkins.kubernetes.KubernetesClientWrapper$DeploymentUpdater.getCurrentResource(KubernetesClientWrapper.java:392)
at com.microsoft.jenkins.kubernetes.KubernetesClientWrapper$ResourceUpdater.createOrApply(KubernetesClientWrapper.java:358)
at com.microsoft.jenkins.kubernetes.KubernetesClientWrapper.apply(KubernetesClientWrapper.java:157)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.doCall(DeploymentCommand.java:168)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.call(DeploymentCommand.java:122)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.call(DeploymentCommand.java:105)
at hudson.FilePath.act(FilePath.java:1165)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand.execute(DeploymentCommand.java:67)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand.execute(DeploymentCommand.java:46)
at com.microsoft.jenkins.azurecommons.command.CommandService.runCommand(CommandService.java:88)
at com.microsoft.jenkins.azurecommons.command.CommandService.execute(CommandService.java:96)
at com.microsoft.jenkins.azurecommons.command.CommandService.executeCommands(CommandService.java:75)
at com.microsoft.jenkins.azurecommons.command.BaseCommandContext.executeCommands(BaseCommandContext.java:77)
at com.microsoft.jenkins.kubernetes.KubernetesDeploy.perform(KubernetesDeploy.java:42)
at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.build(MavenModuleSetBuild.java:945)
at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:896)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1810)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.remoting.ProxyException: java.net.ConnectException: Failed to connect to /localhost:6445
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:240)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:158)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:256)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:134)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:113)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:54)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
at okhttp3.RealCall.execute(RealCall.java:77)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:379)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:313)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:296)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:770)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:195)
... 26 more
Caused by: hudson.remoting.ProxyException: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at okhttp3.internal.platform.Platform.connectSocket(Platform.java:125)
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:238)
... 55 more
Obviously the error is in the Kubernetes address, but I don't know how to retrieve the correct one.
I tried the one I can see in Kubernetes under Service > Kubernetes > Deploy, but it doesn't work. Where can I find the correct one?

Your problem lies in the kubeconfig file, specifically in the following section:
server: https://localhost:6445
That address is of course valid in the context of your local machine, but not when you try to use it to reach the Kubernetes cluster from inside a Jenkins node/agent container. Each container has its own hostname, so from inside a container https://localhost points back at the container itself.
Solution:
replace 'localhost' with either 'host.docker.internal' or 'docker.for.win.localhost' (if you are on Windows), or simply with the local IP address of your machine, like this:
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://docker.for.win.localhost:6445
  name: docker-for-desktop-cluster
contexts:
- context:
    cluster: docker-for-desktop-cluster
    user: docker-for-desktop
  name: docker-for-desktop
current-context: docker-for-desktop
...
You can read more on Docker Desktop networking here
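A quick way to sanity-check the change before re-running the Jenkins job is to call the API server from inside the Jenkins container using the new address (the container name jenkins below is a placeholder; adjust it and the port to your setup):

# Query the Kubernetes API server from inside the Jenkins container via the
# Docker Desktop host alias instead of localhost. -k skips TLS verification,
# matching insecure-skip-tls-verify in the kubeconfig above.
docker exec -it jenkins curl -k https://host.docker.internal:6445/version
# A JSON payload with the server version means the address is reachable.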

Related

Failed to connect to spark-master:7077

I am trying to deploy my Spark application on Kubernetes. I followed the steps below:
Installed the Spark Kubernetes operator:
helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator
helm install gcp-spark-operator spark-operator/spark-operator
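Before going further it is worth confirming that the operator actually came up; a rough check, assuming the chart's default labels, could look like this:

# Verify the Helm release and the operator pod. The label selector is an
# assumption about the chart's defaults and may need adjusting.
helm status gcp-spark-operator
kubectl get pods -l app.kubernetes.io/name=spark-operator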
Created spark-app.py:
from pyspark.sql.functions import *
from pyspark.sql import *
from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

if __name__ == "__main__":
    spark = SparkSession.builder.appName('spark-on-kubernetes-test').getOrCreate()

    data2 = [("James", "", "Smith", "36636", "M", 3000),
             ("Michael", "Rose", "", "40288", "M", 4000),
             ("Robert", "", "Williams", "42114", "M", 4000),
             ("Maria", "Anne", "Jones", "39192", "F", 4000),
             ("Jen", "Mary", "Brown", "", "F", -1)
             ]

    schema = StructType([
        StructField("firstname", StringType(), True),
        StructField("middlename", StringType(), True),
        StructField("lastname", StringType(), True),
        StructField("id", StringType(), True),
        StructField("gender", StringType(), True),
        StructField("salary", IntegerType(), True)
    ])

    df = spark.createDataFrame(data=data2, schema=schema)
    df.printSchema()
    df.show(truncate=False)

    print("program is completed !")
Then I created the new image with my application:
FROM bitnami/spark
USER root
WORKDIR /app
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY spark-app.py .
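For completeness, the image referenced later in the manifest would be built and pushed roughly like this (the registry and tag are placeholders that mirror the manifest):

# Build the application image and publish it where the cluster can pull it.
# "test/spark-k8s-app:1.0" must match spec.image in spark-application.yaml.
docker build -t test/spark-k8s-app:1.0 .
docker push test/spark-k8s-app:1.0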
Then I created the spark-application.yaml file:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: pyspark-app
namespace: default
spec:
type: Python
mode: cluster
image: "test/spark-k8s-app:1.0"
imagePullPolicy: Always
mainApplicationFile: local:///app/spark-app.py
sparkVersion: 3.3.0
restartPolicy:
type: OnFailure
onFailureRetries: 3
onFailureRetryInterval: 10
onSubmissionFailureRetries: 5
onSubmissionFailureRetryInterval: 20
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
serviceAccount: spark
labels:
version: 3.3.0
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
executor:
cores: 1
instances: 1
memory: "512m"
labels:
version: 3.3.0
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
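Deploying and watching the application then boils down to something like the following (the driver pod name matches the one used in the logs below):

# Submit the SparkApplication and follow the driver as it starts.
kubectl apply -f spark-application.yaml
kubectl get sparkapplication pyspark-app -n default
kubectl logs -f pyspark-app-driver -n default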
But when I try to deploy the YAML file, I get the error below:
$ kubectl logs pyspark-app-driver
 13:04:01.12
 13:04:01.12 Welcome to the Bitnami spark container
 13:04:01.12 Subscribe to project updates by watching https://github.com/bitnami/containers
 13:04:01.12 Submit issues and feature requests at https://github.com/bitnami/containers/issues
 13:04:01.12
22/09/24 13:04:04 INFO SparkContext: Running Spark version 3.3.0
22/09/24 13:04:04 INFO ResourceUtils: ==============================================================
22/09/24 13:04:04 INFO ResourceUtils: No custom resources configured for spark.driver.
22/09/24 13:04:04 INFO ResourceUtils: ==============================================================
22/09/24 13:04:04 INFO SparkContext: Submitted application: ml_framework
22/09/24 13:04:04 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 512, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
22/09/24 13:04:04 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
22/09/24 13:04:04 INFO ResourceProfileManager: Added ResourceProfile id: 0
22/09/24 13:04:04 INFO SecurityManager: Changing view acls to: root
22/09/24 13:04:04 INFO SecurityManager: Changing modify acls to: root
22/09/24 13:04:04 INFO SecurityManager: Changing view acls groups to:
22/09/24 13:04:04 INFO SecurityManager: Changing modify acls groups to:
22/09/24 13:04:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users
with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
22/09/24 13:04:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/09/24 13:04:05 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
22/09/24 13:04:05 INFO SparkEnv: Registering MapOutputTracker
22/09/24 13:04:05 INFO SparkEnv: Registering BlockManagerMaster
22/09/24 13:04:05 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/09/24 13:04:05 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/09/24 13:04:05 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
22/09/24 13:04:05 INFO DiskBlockManager: Created local directory at /var/data/spark-019ba05b-dba8-4350-a281-ffa35b54d840/blockmgr-20d5c478-8e93-42f6-85a9-9ed070f50b2b
22/09/24 13:04:05 INFO MemoryStore: MemoryStore started with capacity 117.0 MiB
22/09/24 13:04:05 INFO SparkEnv: Registering OutputCommitCoordinator
22/09/24 13:04:06 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/09/24 13:04:06 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
22/09/24 13:04:10 WARN TransportClientFactory: DNS resolution failed for spark-master:7077 took 4005 ms
22/09/24 13:04:10 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master spark-master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$1.run(StandaloneAppClient.scala:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Failed to connect to spark-master:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:202)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:198)
... 4 more
Caused by: java.net.UnknownHostException: spark-master
at java.net.InetAddress.getAllByName0(InetAddress.java:1287)
at java.net.InetAddress.getAllByName(InetAddress.java:1199)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:156)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:153)
at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:41)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:61)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:53)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:55)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:31)
at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:106)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:206)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:46)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:180)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:166)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:990)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:516)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
22/09/24 13:04:26 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
22/09/24 13:04:30 WARN TransportClientFactory: DNS resolution failed for spark-master:7077 took 4006 ms
22/09/24 13:04:30 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master spark-master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$1.run(StandaloneAppClient.scala:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Failed to connect to spark-master:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:202)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:198)
... 4 more
Caused by: java.net.UnknownHostException: spark-master
at java.net.InetAddress.getAllByName0(InetAddress.java:1287)
at java.net.InetAddress.getAllByName(InetAddress.java:1199)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:156)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:153)
at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:41)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:61)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:53)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:55)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:31)
at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:106)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:206)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:46)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:180)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:166)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:990)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:516)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
How can I resolve this?

Error: x509: certificate signed by unknown authority, kind cluster

I am running into a strange issue: docker pull works, but when using kubectl create or apply -f with a kind cluster, I get the error below
Warning Failed 20m kubelet, kind-control-plane Failed to pull image "quay.io/airshipit/kubernetes-entrypoint:v1.0.0": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/airshipit/kubernetes-entrypoint:v1.0.0": failed to copy: httpReaderSeeker: failed open: failed to do request: Get https://d3uo42mtx6z2cr.cloudfront.net/sha256/b5/b554c0d094dd848c822804a164c7eb9cc3d41db5f2f5d2fd47aba54454d95daf?Expires=1587558576&Signature=Tt9R1O4K5zI6hFG9GYt-tLAWkwlQyLoAF0NDNouFnff2ywZnPlMSo2x2aopKcQJ5cAMYYTHvYBKm2Zwk8W80tE9cRet1PfP6CnAmo2lzsYzKnRRWbgQhgsyJK8AmAvKzw7iw6lbYdP91JjEiUcpfjMAj7dMPj97tpnEnnd72kljRew8VfgBhClblnhNFvfR9fs9lRS7wNFKrZ1WUSGpNEEJZjNcc9zBNIbOyKeDPfvIpdJ6OthQMJ3EKaFEFfVN6asiyz3lOgM2IMjJ0uBI2ChhCyDx7YHTdNZCOoYAEmw8zo5Ma0n8EQpX3EwU1qSR0IwoGNawF0qV6tFAZi5lpbQ__&Key-Pair-Id=APKAJ67PQLWGCSP66DGA: x509: certificate signed by unknown authority
Here is the ~/.kube/config:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EUXlNakV4TVRNd09Gb1hEVE13TURReU1ERXhNVE13T0Zvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTDlvCkNiYlFYblBxbXpUV0hwdnl6ZXdPcWo5L0NCSmFLV1lrSEVCZzJHcXhjWnFhWG92aVpOdkt3NVZsQmJvTUlSOTMKVUxiWGFVeFl4MHJyQ3pWanNKU09lWDd5VjVpY3JTOXRZTkF1eHhPZzBMM1F3SElxUEFKWkY5b1JwWG55VnZMcwpIcVBDQ2ZRblhBYWRpM3VsM2J5bjcrbVFhcU5mV0NSQkZhRVJjcXF5cDltbzduRWZ2YktybVM0TUdIUHN3eUV0CkYxeXJjc041Vlo5QkM5TWlXZnhEY1dUL2R5SXIrUjFtL3hWYlU0aGNjdkowYi9CQVJ3aUhVajNHVFpnYUtmbGwKNUE5elJsVFRNMjV6c0t5WHVLOFhWcFJlSTVCNTNqUUo3VGRPM0lkc0NqelNrbnByaFI0YmNFcll5eVNhTWN6cgo4c1l0RHNWYmtWOE9rd0pFTnlNQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFHdEFyckYrKzdjOGZEN09RZWxHWldxSHpyazkKbFQ2MHhmZjBtTzRFdWI3YUxzZGdSTmZuSjB5UDRVelhIeXBkZEhERFhzUHFzVHZzZ2h6MXBPNFQrVTVCVmRqQQpGWjdxZW9iUWN2NkhnSERZSjhOdy9sTHFObGMyeUtPYVJSNTNsbjRuWERWYkROaTcyeEJTbUlNN0hhOFJQSVNFCmttTndHeHFKQVM3UmFOanN0SDRzbC9LR2xKcUowNFdRZnN0b1lkTUY4MERuc0prYlVuSkQyb29oOGVHTlQ5WGsKOTZPbGdoa05yZ09ybmFOR2hTZlQxYjlxdDJZOFpGUlRrKzhhZGNNczlHWW50RzZZTW1WRzVVZDh0L1phbVlRSwpIWlJ6WDRxM3NoY1p3NWRmR2JZUmRPelVTZkhBcE9scHFOQ1FmZGxyOWMyeDMxdkRpOW4vZE9RMHVNbz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
server: https://127.0.0.1:32768
name: kind-kind
contexts:
- context:
cluster: kind-kind
user: kind-kind
name: kind-kind
current-context: kind-kind
kind: Config
preferences: {}
users:
- name: kind-kind
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJSWNDdHVsWUhYaVl3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TURBME1qSXhNVEV6TURoYUZ3MHlNVEEwTWpJeE1URXpNVEJhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQTArZ0JKWHBpUncxK09WaGEKVjU0bG5IMndHTzRMK1hIZjBnUjFadU01MnUwUFV3THQ5cDNCd2d5WDVHODhncUFIMmh3K1c4U2lYUi9WUUM5MgpJd3J3cnc1bFlmcTRrWDZhWEcxdFZLRjFsU2JMUHd4Nk4vejFMczlrbnlRb2piMHdXZkZ2dUJrOUtCMjJuSVozCmdOUEZZVmNVcWwyM2s3ck5yL0xzdGZncEJoVTRaYWdzbCsyZG53Qll2MVh4Z1M1UGFuTGxUcFVYODIxZ3RzQ0QKbUN1aFFyQlQzdzZ0NXlqUU5MSGNrZ3M4Y1JXUkdxZFNnZGMrdGtYczkzNDdoSzRjazdHYUw0OHFBMTgzZzBXKwpNZEllcDR3TUxGbU9XTCtGS2Q5dC83bXpMbjJ5RWdsRXlvNjFpUWRmV2s1S2Q1c1BqQUtVZXlWVTIrTjVBSlBLCndwaGFyUUlEQVFBQm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFGNXp5d1hSaitiakdzSG1OdjgwRXNvcXBjOEpSdVY4YnpNUQoxV0dkeGl3Mzk3TXBKVHFEaUNsTlZoZjZOOVNhVmJ2UXp2dFJqWW5yNmIybi9HREdvZDdyYmxMUWJhL2NLN1hWCm1ubTNHTXlqSzliNmc0VGhFQjZwUGNTa25yckRReFFHL09tbXE3Ulg5dEVCd2RRMHpXRGdVOFU0R0t3a3NyRmgKMFBYNE5xVnAwdHcyaVRDeE9lU0FpRnBCQ0QzS3ZiRTNpYmdZbHNPUko5S0Y3Y00xVkpuU0YzUTNZeDNsR3oxNgptTm9JanVHNWp2a3NDejc3TlFIL3Ztd2dXRXJLTndCZ0NDeEVQY1BjNFRZREU1SzBrUTY1aXc1MzR6bHZuaW5JCjZRTGYvME9QaHRtdC9FUFhRSU5PS0dKWEpkVFo1ZU9JOStsN0lMcGROREtkZjlGU3pNND0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMCtnQkpYcGlSdzErT1ZoYVY1NGxuSDJ3R080TCtYSGYwZ1IxWnVNNTJ1MFBVd0x0CjlwM0J3Z3lYNUc4OGdxQUgyaHcrVzhTaVhSL1ZRQzkySXdyd3J3NWxZZnE0a1g2YVhHMXRWS0YxbFNiTFB3eDYKTi96MUxzOWtueVFvamIwd1dmRnZ1Qms5S0IyMm5JWjNnTlBGWVZjVXFsMjNrN3JOci9Mc3RmZ3BCaFU0WmFncwpsKzJkbndCWXYxWHhnUzVQYW5MbFRwVVg4MjFndHNDRG1DdWhRckJUM3c2dDV5alFOTEhja2dzOGNSV1JHcWRTCmdkYyt0a1hzOTM0N2hLNGNrN0dhTDQ4cUExODNnMFcrTWRJZXA0d01MRm1PV0wrRktkOXQvN216TG4yeUVnbEUKeW82MWlRZGZXazVLZDVzUGpBS1VleVZVMitONUFKUEt3cGhhclFJREFRQUJBb0lCQUZzYWsrT1pDa2VoOVhLUwpHY1V4cU5udTc1YklRVDJ0UjV6emJjWWVTdkZrbWdJR2NHaG15cmF5MDFyU3VDRXd6QzlwbFNXL0ZFOFZNSW0zCjNnS1M0WWRobVJUV3hpTkhXdllCMWM5YzIwQ1V2UzBPSUQyUjg1ZDhjclk0eFhhcXIrNzdiaHlvUFRMU0U0Q1kKRHlqRDQwaEdPQXhHM25ZVkNmbHJaM21VaDQ2bEo4YlROcXB5UzFCcVdNZnZwekt1ZDB6TElmMWtTTW9Cbm1XeQo0RzBrNC9qWVdEOWNwdGtSTGxvZXp5WVlCMTRyOVdNQjRENkQ5eE84anhLL0FlOEQraTl2WCtCaUdGOURSYllJCmVVQmRTQzE2QnQybW5XWGhXMmhSRFFqRmR2dzJIQ0gxT0ppcVZuWUlwbGJEcjFYVXI1NzFYWTZQMFJlQ0JRc3kKOUZpMG44RUNnWUVBMUQ3Nmlobm5YaEZyWFE2WkVEWnp3ZGlwcE5mbHExMGJxV0V5WUVVZmpPd2p3ZnJ4bzVEYgppUmoySm5Fei96bDhpVDFEbmh3bFdWZlBNbWo3bUdFMVYwMkFWSkJoT20vdU1tZnhYVmcvWGwxOVEzODdJT0tpCjBKSmdabGZqVjEyUGdRU3NnbnRrckdJa3dPcisrOUFaL3R0UVVkVlU0bFgxWWRuazZ5T1V6YWNDZ1lFQS81Y1kKcHJxMVhuNGZBTUgxMzZ2dVhDK2pVaDhkUk9xS1Zia2ZtWUw0dkI0dG9jL2N1c1JHaGZPaTZmdEZCSngwcDhpSgpDU1ZCdzIxbmNHeGRobDkwNkVjZml2ZG0vTXJlSmlyQmFlMlRRVWdsMjh1cmU3MWJEdXpjbWMrQVRQa1VXVDgyCmJpaDM5b3A1SEo5N2NlU3JVYU5zRTgxaEdIaXNSSzJEL2pCTjU0c0NnWUVBcUExeHJMVlQ5NnlOT1BKZENYUkQKOWFHS3VTWGxDT2xCQkwwYitSUGlKbCsyOUZtd3lGVGpMc3RmNHhKUkhHMjFDS2xFaDhVN1lXRmdna2FUcDVTWQplcGEzM0wwdzd1Yy9VQlB6RFhqWk8rdUVTbFJNU2Y2SThlSmtoOFJoRW9UWElrM0VGZENENXVZU3VkbVhxV1NkCm9LaWdFUnQ4Q1hZTVE3MFdQNFE5eHhNQ2dZQnBkVTJ0bGNJNkQrMzQ0UTd6VUR5VWV1OTNkZkVjdTIyQ3UxU24KZ1p2aCtzMjNRMDMvSGZjL1UreTNnSDdVelQxdzhWUmhtcWJNM1BwZUw4aFRKbFhWZFdzMWFxbHF5c1hvbDZHZwpkRzlhODByenF0REJ5THFtcU9MSThBNHZOR0xLQkVRUUpkQ0J3RmNDa1dkYzhnNGlMRHp1MnNJaVY4QTB3aWVCCkhTczN5d0tCZ1FDeXl2Tk45enk5S3dNOW1nMW5GMlh3WUVzMzB4bmsrNXJmTGdRMzQvVm1sSVA5Y1cyWS9oTWQKWnVlNWd4dnlYREcrZW9GU3Njc281TmcwLytWUDI0Sjk0cGJIcFJWV3FIWENvK2gxZjBnKzdET2p0dWp2aGVBRwpSb240NmF1clJRSG5HUStxeldWcWtpS2l1dDBybFpHd2ZzUGs4eWptVjcrWVJuamxES1hUWUE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
I ran into a similar issue (I think) on OpenShift - I could pull images, but I couldn't push or get k8s to pull them. To resolve it, I had to update the docker config at /etc/sysconfig/docker and add the registry as an insecure registry. For openshift, the default route was required.
OPTIONS=' <some existing config stuff here> --insecure-registry=<fqdn-of-your-registry>'
Then systemctl restart docker to have the changes take effect.
You might also need to create a Docker pull secret with your credentials in Kubernetes to allow it to access the registry. Details here
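A hedged sketch of such a pull secret (all values are placeholders):

# Create a registry credential secret in the namespace the pods run in.
kubectl create secret docker-registry regcred \
  --docker-server=<your-registry-host> \
  --docker-username=<user> \
  --docker-password=<password> \
  --docker-email=<email>
# Pods then reference it via spec.imagePullSecrets: [{name: regcred}].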

Jenkins agent pod fails to run on kubernetes

I have installed Jenkins on a GKE cluster using the stable Helm chart.
I am able to access the UI and log in.
However, when trying to run a simple job, the agent pod fails to be created.
The logs are not very informative on this:
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.523+0000 [id=184] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-ld008, template=PodTemplate{inheritFrom='', name='default', slaveConnectTimeout=0, label='jenkins-kos-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins/agent', command='', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='500m', resourceRequestMemory='1Gi', resourceLimitCpu='4000m', resourceLimitMemory='8Gi', envVars=[ContainerEnvVar [getValue()=http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins, getKey()=JENKINS_URL]]}]}
jenkins-kos-58586644f9-vh278 jenkins java.lang.IllegalStateException: Pod has terminated containers: jenkins/default-ld008 (jnlp)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:166)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:187)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:127)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:132)
jenkins-kos-58586644f9-vh278 jenkins at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:290)
jenkins-kos-58586644f9-vh278 jenkins at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
jenkins-kos-58586644f9-vh278 jenkins at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.FutureTask.run(FutureTask.java:266)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
jenkins-kos-58586644f9-vh278 jenkins at java.lang.Thread.run(Thread.java:748)
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.524+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-ld008
jenkins-kos-58586644f9-vh278 jenkins Terminated Kubernetes instance for agent jenkins/default-ld008
jenkins-kos-58586644f9-vh278 jenkins Disconnected computer default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.559+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent jenkins/default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.560+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:56.009+0000 [id=53
Here are the kubernetes events
0s Normal Scheduled pod/default-zkwp4 Successfully assigned jenkins/default-zkwp4 to gke-kos-nodepool1-kq69
0s Normal Pulled pod/default-zkwp4 Container image "docker.io/istio/proxyv2:1.4.0" already present on machine
0s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Normal Pulled pod/default-zkwp4 Container image "jenkins/jnlp-slave:3.27-1" already present on machine
0s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Normal Pulled pod/default-zkwp4 Container image "docker.io/istio/proxyv2:1.4.0" already present on machine
1s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Warning Unhealthy pod/default-zkwp4 Readiness probe failed: Get http://10.15.2.113:15020/healthz/ready: dial tcp 10.15.2.113:15020: connect: connection refused
0s Warning Unhealthy pod/default-zkwp4 Readiness probe failed: Get http://10.15.2.113:15020/healthz/ready: dial tcp 10.15.2.113:15020: connect: connection refused
0s Normal Killing pod/default-zkwp4 Killing container with id docker://istio-proxy:Need to kill Pod
The TCP port for agent communication is fixed to 50000
Using jenkins/jnlp-slave:3.27-1 for the agent image.
Any ideas what might be causing this?
UPDATE 1: Here is a gist with the description of the failed agent.
UPDATE 2: Managed to pinpoint the actual error in the jnlp logs using stackdriver (although not aware of the root cause yet)
"SEVERE: Failed to connect to http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins/tcpSlaveAgentListener/: Connection refused (Connection refused)
UPDATE 3: Here comes the weird(est) part: from a pod I spun up within the jenkins namespace:
/ # dig +short jenkins-kos.jenkins.svc.cluster.local
10.14.203.189
/ # nc -zv -w 3 jenkins-kos.jenkins.svc.cluster.local 8080
jenkins-kos.jenkins.svc.cluster.local (10.14.203.189:8080) open
/ # curl http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins/tcpSlaveAgentListener/
Jenkins
UPDATE 4: I can confirm that this occurs on a GKE cluster using istio 1.4.0 but NOT on another one using 1.1.15
You can disable the sidecar proxy for agents.
Go to Manage Jenkins -> Configuration -> Kubernetes Cloud.
Select the Annotations option and enter the annotation value below:
sidecar.istio.io/inject: "false"
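Once the annotation is in place, a quick way to confirm it took effect is to list the containers of a freshly provisioned agent pod; only jnlp should show up, with no istio-proxy sidecar. A rough check (the namespace and pod naming follow the logs above):

# List agent pods in the jenkins namespace together with their container names.
kubectl get pods -n jenkins \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
# An agent pod showing only "jnlp" means the istio sidecar was not injected.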

Java : Connecting to Redis cluster running in minikube

I have a Redis cluster with 3 masters and 3 slaves running in minikube.
PS D:\redis\main\kubernetes-redis-cluster> kubectl exec -ti redis-1-2723908297-prjq5 -- /bin/bash
root@redis-1:/data# redis-cli -p 7000 -c
127.0.0.1:7000> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_ping_sent:9131
cluster_stats_messages_pong_sent:9204
cluster_stats_messages_meet_sent:3
cluster_stats_messages_sent:18338
cluster_stats_messages_ping_received:9202
cluster_stats_messages_pong_received:9134
cluster_stats_messages_meet_received:2
cluster_stats_messages_received:18338
127.0.0.1:7000> cluster nodes
de9a4780d93cb7eab8b77abdaaa96a081adcace3 172.17.0.7:7000@17000 slave ee4deab0525d054202e612b317924156ff587021 0 1509960302577 4 connected
b3a3c05225e0a7fe8ae683dd4316e724e7a7daa6 172.17.0.5:7000@17000 myself,master - 0 1509960301000 2 connected 5461-10922
8bebd48850ec77db322ac51501d59314582865a3 172.17.0.6:7000@17000 master - 0 1509960302000 3 connected 10923-16383
ee4deab0525d054202e612b317924156ff587021 172.17.0.4:7000@17000 master - 0 1509960303479 1 connected 0-5460
28a1c75e9976bc375a13e55160f2aae48defb242 172.17.0.8:7000@17000 slave b3a3c05225e0a7fe8ae683dd4316e724e7a7daa6 0 1509960302477 5 connected
32e9de12324b8571a6256285682fa066d79161ab 172.17.0.9:7000@17000 slave 8bebd48850ec77db322ac51501d59314582865a3 0 1509960302000 6 connected
127.0.0.1:7000>
I am able to set/fetch key/values via redis-cli without any issue.
Now I am trying to connect to the Redis cluster from a simple Java program running from Eclipse.
I understand that I need to forward the port, so I executed the command below.
kubectl port-forward redis-0-334270214-fd4k0 7000:7000
Now, when I execute the program below,
import java.util.HashSet;
import java.util.Set;

import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class Main {
    public static void main(String[] args) {
        GenericObjectPoolConfig config = new GenericObjectPoolConfig();
        config.setMaxTotal(500);
        config.setMaxIdle(500);
        config.setMaxWaitMillis(60000);
        config.setTestOnBorrow(true);
        config.setMaxWaitMillis(20000);

        Set<HostAndPort> jedisClusterNode = new HashSet<HostAndPort>();
        jedisClusterNode.add(new HostAndPort("192.168.99.100", 31695));

        JedisCluster jc = new JedisCluster(jedisClusterNode, config);
        jc.set("prime", "1 is primeee");
        String keyVal = jc.get("prime");
        System.out.println(keyVal);
    }
}
I get the exception below:
Exception in thread "main" redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
at redis.clients.util.Pool.getResource(Pool.java:53)
at redis.clients.jedis.JedisPool.getResource(JedisPool.java:226)
at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnectionFromSlot(JedisSlotBasedConnectionHandler.java:66)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:116)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:141)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:141)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:141)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:141)
at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:31)
at redis.clients.jedis.JedisCluster.set(JedisCluster.java:103)
at com.redis.main.Main.main(Main.java:25)
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: connect timed out
at redis.clients.jedis.Connection.connect(Connection.java:207)
at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:93)
at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1767)
at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:106)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:819)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:429)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
at redis.clients.util.Pool.getResource(Pool.java:49)
... 10 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at redis.clients.jedis.Connection.connect(Connection.java:184)
... 17 more
The Redis services are created and running as well:
PS C:\Users\rootmn> kubectl get services
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 10.0.0.1 <none> 443/TCP 11h
redis-0 10.0.0.105 <pending> 7000:31695/TCP,17000:31596/TCP 10h
redis-1 10.0.0.7 <pending> 7000:30759/TCP,17000:30646/TCP 10h
redis-2 10.0.0.167 <pending> 7000:32591/TCP,17000:30253/TCP 10h
redis-3 10.0.0.206 <pending> 7000:31644/TCP,17000:31798/TCP 10h
redis-4 10.0.0.244 <pending> 7000:30186/TCP,17000:32701/TCP 10h
redis-5 10.0.0.35 <pending> 7000:30628/TCP,17000:32396/TCP 10h
Telnet to the Redis IP and port works fine.
Am I doing something wrong here? What could be causing this issue?
You need to define the service with type NodePort. In the YAML file below, update the selector section to match your Redis pod's labels.
apiVersion: v1
kind: Service
metadata:
  name: redis-master-nodeport
  labels:
    app: redis
spec:
  ports:
  - port: 7000
  selector:
    app: redis
    role: master
    tier: backend
  type: NodePort
Create the service: kubectl.exe create -f redis-master-service.yaml
Then check the service; it will show you the assigned node port, e.g.:
kubectl.exe get svc redis-master-nodeport
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-master-nodeport 10.0.0.62 <nodes> 6379:30277/TCP 16s
Now you can connect to Redis using <minikube ip>:30277.
Hope this helps.
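To verify the NodePort from outside the cluster, something along these lines should work, assuming redis-cli is installed locally and using the example port from the output above:

# Connect to the cluster through the NodePort exposed by minikube.
# 30277 is the example port shown above; substitute the one your service got.
redis-cli -c -h "$(minikube ip)" -p 30277 ping
redis-cli -c -h "$(minikube ip)" -p 30277 set prime "1 is prime"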

Kubernetes cannot pull from insecure registry and cannot run container from local image on offline cluster

I am working on an offline cluster (the machines have no internet access), deploying Docker images using Ansible and docker-compose scripts.
My servers run CentOS 7.
I have set up an insecure Docker registry on the machines. We are going to change environments, and I am installing Kubernetes in order to manage my pool of containers.
I followed this guide to install Kubernetes:
https://severalnines.com/blog/installing-kubernetes-cluster-minions-centos7-manage-pods-services
After the installation, I tried to launch a test pod, which I created with
kubectl create -f nginx.yml
Here is the YAML:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: [my_registry_addr]:[my_registry_port]/nginx:v1
    ports:
    - containerPort: 80
I used kubectl describe to get more information on what was wrong:
Name: nginx
Namespace: default
Node: [my node]
Start Time: Fri, 15 Sep 2017 11:29:05 +0200
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
nginx:
Container ID:
Image: [my_registry_addr]:[my_registry_port]/nginx:v1
Image ID:
Port: 80/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned nginx to [my kubernet node]
1m 1m 2 {kubelet [my kubernet node]} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_addr]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"
54s 54s 1 {kubelet [my kubernet node]} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"[my_registry_addr]:[my_registry_port]\""
8s 8s 1 {kubelet [my kubernet node]} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/[my_registry_addr]/images. You may want to check your internet connection or if you are behind a proxy."
Then I went to my node and used journalctl -xe:
sept. 15 11:22:02 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:02.350930396+02:00" level=info msg="{Action=create, LoginUID=4294967295, PID=11555}"
sept. 15 11:22:17 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:17.351536727+02:00" level=warning msg="Error getting v2 registry: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
sept. 15 11:22:17 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:17.351606330+02:00" level=error msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
sept. 15 11:22:32 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:32.353946452+02:00" level=error msg="Not continuing with pull after error: Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"
sept. 15 11:22:32 [my_node_ip] kubelet[11555]: E0915 11:22:32.354309 11555 docker_manager.go:2161] Failed to create pod infra container: ErrImagePull; Skipping pod "nginx_default(8b5c40e5-99f4-11e7-98db-f8bc12456ee4)": Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving
sept. 15 11:22:32 [my_node_ip] kubelet[11555]: E0915 11:22:32.354390 11555 pod_workers.go:184] Error syncing pod 8b5c40e5-99f4-11e7-98db-f8bc12456ee4, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"
sept. 15 11:22:44 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:44.350708175+02:00" level=error msg="Handler for GET /v1.24/images/[my_registry_ip]:[my_registry_port]/json returned error: No such image: [my_registry_ip]:[my_registry_port]"
I am sure that my Docker configuration is good, because I use it every day with Ansible or Mesos.
The Docker version is 1.12.6 and the Kubernetes version is 1.5.2.
What can I do now? I didn't find any configuration key for this use case.
When I saw that pulling was failing, I manually pulled the image on all the nodes. I added a tag to ensure that Kubernetes would not try to pull by default, and set imagePullPolicy: IfNotPresent.
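For reference, pre-loading the image on a node so that IfNotPresent finds it locally looks roughly like this (the names mirror the manifest above):

# On each node: make the image available under the exact name the pod spec uses,
# so that imagePullPolicy: IfNotPresent never has to contact a registry.
docker pull [my_registry_addr]:[my_registry_port]/nginx:v1   # or: docker load -i nginx-v1.tar
docker images | grep nginx                                   # the name:tag must match the manifest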
The syntax for specifying the docker image is:
[docker_registry]/[image_name]:[image_tag]
In your manifest file, you have used ":" to separate the docker repository host from the port the repository is listening on. The default port for a private docker registry is, I believe, 5000.
So change your image declaration from
Image: [my_registry_addr]:[my_registry_port]/nginx:v1
to
Image: [my_registry_addr]/nginx:v1
Also, check the network connectivity from the worker node to your docker registry by doing a ping.
ping [my_registry_addr]
If you still want to check whether port 443 is open on the registry, you can do a TCP check on that port on the host running the docker registry:
curl telnet://[my_registry_addr]:443
Hope that helps.
I finally found what the problem was.
To work, Kubernetes needs a pause container. Kubernetes was trying to pull the pause container from the internet.
I pushed a custom pause container to my registry and configured Kubernetes to use that image as its pause container.
After that, Kubernetes worked like a charm.
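A hedged sketch of that setup on CentOS 7 nodes (the pause image name and the kubelet config path are assumptions; the flag itself is kubelet's --pod-infra-container-image):

# Push a pause image to the offline registry (assumes the image is already present locally).
docker tag gcr.io/google_containers/pause-amd64:3.0 [my_registry_addr]:[my_registry_port]/pause-amd64:3.0
docker push [my_registry_addr]:[my_registry_port]/pause-amd64:3.0
# Point kubelet at it on every node, e.g. in /etc/kubernetes/kubelet:
#   KUBELET_ARGS="--pod-infra-container-image=[my_registry_addr]:[my_registry_port]/pause-amd64:3.0"
systemctl restart kubelet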
