Can't connect to Cassandra cluster, but can connect to single node?

I am facing a weird situation while trying to connect an external Play application to a Cassandra cluster (which is running in Docker containers on Mesos).
The problem is:
If I only have a single Cassandra node, I can connect to it properly from the Play application. But if I add a second node, I am no longer able to connect to any node.
I am bringing up the nodes as follows:
First node (SEED)
{
"id": "cassandra-seed",
"constraints": [["hostname", "CLUSTER", "docker-sl-vm"]],
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:latest",
"network": "BRIDGE",
"portMappings": [ {"containerPort": 9042,"protocol": "tcp"} ]
}
},
"env": {
"CASSANDRA_SEED_COUNT": "1"
},
"cpus": 0.5,
"mem": 512.0,
"instances": 1,
"backoffSeconds": 1,
"backoffFactor": 1.15,
"maxLaunchDelaySeconds": 3600
}
At this point, I am able to connect the Play app to cassandra-seed.
CASSANDRA NODE2
{
"id": "cassandra",
"constraints": [["hostname", "CLUSTER", "docker-sl-vm"]],
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:latest",
"network": "BRIDGE",
"portMappings": [ {"containerPort": 9042,"protocol": "tcp"} ]
}
},
"env": {
"CASSANDRA_SEED_COUNT": "1",
"CASSANDRA_SEEDS": "cassandra-seed.marathon.mesos"
},
"cpus": 0.5,
"mem": 512.0,
"instances": 1,
"backoffSeconds": 1,
"backoffFactor": 1.15,
"maxLaunchDelaySeconds": 3600
}
After this node comes up, I can connect neither to it nor to cassandra-seed.
nodetool status result:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns  Host ID                               Rack
UJ  172.17.0.3  92.91 KB  256     ?     ccf83479-beed-44f5-9e36-6c997fd8855c  rack1
UN  172.17.0.2  96.96 KB  256     ?     1e42609d-ba3f-4c35-80c2-424a095b4db7  rack1
It looks like after the second node is up, Cassandra is no longer bound to the address and the host stops seeing it. What should I do?
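For reading the nodetool output above: the two-letter code at the start of each row combines the Status and State legends printed in its header, so UJ is Up/Joining and UN is Up/Normal. Both addresses are also in 172.17.0.0/16, Docker's default bridge range, which is normally not reachable from outside the Docker host. The following is a minimal Python sketch that decodes those codes; the function name decode_nodetool_rows is hypothetical, and it assumes rows shaped like the ones shown here.

```python
# Legends copied from the header lines of the nodetool status output.
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def decode_nodetool_rows(output):
    """Return (address, status, state) for each node row in `nodetool status` text."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Node rows start with a two-letter code such as UN or UJ;
        # header and legend lines do not match and are skipped.
        if (len(parts) >= 2 and len(parts[0]) == 2
                and parts[0][0] in STATUS and parts[0][1] in STATE):
            rows.append((parts[1], STATUS[parts[0][0]], STATE[parts[0][1]]))
    return rows


sample = """\
-- Address Load Tokens Owns Host ID Rack
UJ 172.17.0.3 92.91 KB 256 ? ccf83479-beed-44f5-9e36-6c997fd8855c rack1
UN 172.17.0.2 96.96 KB 256 ? 1e42609d-ba3f-4c35-80c2-424a095b4db7 rack1
"""

for address, status, state in decode_nodetool_rows(sample):
    print(address, status, state)
```

Read this way, the second node (172.17.0.3) was still joining when the status was taken.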

Related

Unable to spin dockerized cassandra cluster on Mesosphere DC/OS

Does anyone have an idea how to create a Cassandra cluster on Mesosphere DC/OS using Docker?
The issue is that new Cassandra containers keep getting started every few seconds.
It seems that Marathon is failing to get the health status of the newly created containers, because it keeps creating new ones continuously. In the DC/OS GUI, the service's debug view shows:
State: TASK_FAILED
Message: Container terminated with signal Broken pipe
When I check on the machine, the containers are up and running, and new containers are being created repeatedly every minute or two.
Why doesn't Marathon get the correct response from the container, indicating that it has started successfully, so that it can stop creating new ones?
I am sharing my current JSON configuration for the service.
Cassandra.json
{
"id": "/cassandra",
"acceptedResourceRoles": [
"*"
],
"backoffFactor": 1.15,
"backoffSeconds": 1,
"container": {
"portMappings": [
{
"containerPort": 8000,
"hostPort": 0,
"protocol": "tcp",
"servicePort": 10003,
"name": "main"
}
],
"type": "DOCKER",
"volumes": [],
"docker": {
"image": "cassandra:3.9",
"forcePullImage": false,
"privileged": false,
"parameters": []
}
},
"cpus": 3,
"disk": 10000,
"instances": 1,
"maxLaunchDelaySeconds": 300,
"mem": 6000,
"gpus": 0,
"networks": [
{
"mode": "container/bridge"
}
],
"requirePorts": false,
"upgradeStrategy": {
"maximumOverCapacity": 1,
"minimumHealthCapacity": 1
},
"killSelection": "YOUNGEST_FIRST",
"unreachableStrategy": {
"inactiveAfterSeconds": 0,
"expungeAfterSeconds": 0
},
"fetch": [],
"constraints": []
}
DC/OS open source version 1.13
Marathon Version 1.8.194
Please help if anyone has an idea of what's going on. I can share further details if needed.

How will Docker resolve a hostname or IP present in a properties file?

I have two Spring Boot microservice applications: a web application and a metastore application. This is the properties file for my web application.
spring:
  thymeleaf:
    prefix: classpath:/static/
  application:
    name: web-server
  profiles:
    active: native
server:
  port: ${port:8383}
---
host:
  metadata: http://10.**.**.***:5011
Dockerfile for web application:
FROM java:8-jre
MAINTAINER **** <******>
ADD ./ms.console.ivu-ivu.1.0.1.jar /app/
# RUN executes at build time; only the last CMD in a Dockerfile takes effect.
RUN chmod +x /app/*
CMD ["java","-jar", "/app/ms.console.web-web.1.0.1.jar"]
EXPOSE 8383
Dockerfile for metadata application:
FROM java:8-jre
MAINTAINER ******* <********>
ADD config/* /deploy/config/
# RUN executes at build time; only the last CMD in a Dockerfile takes effect.
RUN chmod +x ./deploy/config/*
COPY ./ms.metastore.1.0.1.jar /deploy/
RUN chmod +x ./deploy/ms.metastore.1.0.1.jar
CMD ["java","-jar","./deploy/ms.metastore.1.0.1.jar"]
EXPOSE 5011
I am using Mesos and Marathon for cluster management. The Marathon script for the metastore is:
{
"id": "/ms-metastore",
"cmd": null,
"cpus": 1,
"mem": 2000,
"disk": 0,
"instances": 0,
"acceptedResourceRoles": [
"*"
],
"container": {
"type": "DOCKER",
"docker": {
"forcePullImage": true,
"image": "*****/****:ms-metastore",
"parameters": [],
"privileged": true
},
"volumes": [],
"portMappings": [
{
"containerPort": 5011,
"hostPort": 0,
"labels": {},
"protocol": "tcp",
"servicePort": 10000
}
]
},
"networks": [
{
"mode": "container/bridge"
}
],
"portDefinitions": [],
"fetch": [
{
"uri": "file:///etc/docker.tar.gz",
"extract": true,
"executable": false,
"cache": false
}
]
}
The Marathon script for the web application is:
{
"id": "/ms-console",
"cmd": null,
"cpus": 1,
"mem": 2000,
"disk": 0,
"instances": 0,
"acceptedResourceRoles": [
"*"
],
"container": {
"type": "DOCKER",
"docker": {
"forcePullImage": true,
"image": "****/****:ms-console",
"parameters": [],
"privileged": true
},
"volumes": [],
"portMappings": [
{
"containerPort": 8383,
"hostPort": 0,
"labels": {},
"protocol": "tcp",
"servicePort": 10000
}
]
},
"networks": [
{
"mode": "container/bridge"
}
],
"portDefinitions": [],
"fetch": [
{
"uri": "file:///etc/docker.tar.gz",
"extract": true,
"executable": false,
"cache": false
}
]
}
In the web application, I connect to the metastore using a hard-coded IP (set in the properties file). I created Docker images for both and ran them on my server. The metastore server is now running on a different machine, so my web application is unable to reach it at this IP.
All you need to do here is publish 5011 as the host port on the metadata server running on the "different machine" using -p:
docker run -d -p 5011:5011 metadata_image ....
Now your web application should be able to access the metadata server at http://$different_machine_ip:5011/
where $different_machine_ip is the metadata server's IP.
However, since they need to be tightly coupled, I would suggest you run the web app and the metadata server on the same machine if your metadata server is stateless.
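Before changing the web application's properties, it can help to confirm that the published port is actually reachable from the machine running the web app. Below is a minimal Python sketch of such a check; the function name can_connect is hypothetical, and the address in the usage comment is a placeholder for the metadata server's IP.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example usage (placeholder address, substitute the metadata server's IP):
#   can_connect("192.0.2.10", 5011)
```

A False result points at the network path or the port publishing rather than at the Spring configuration.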

Mesos Marathon(ctl) Debugging - "Abnormal executor termination: unknown container"

I'm still new to Mesos, but am trying to figure out the best way to debug a Mesos application I'm attempting to develop. I'm getting the error message "Abnormal executor termination: unknown container" through the web application, and am unsure how to get more descriptive error messages to figure out what's going on. The error message would seem to indicate it can't find the Docker image, but I know for a fact it's referencing the correct image that is installed and running.
{
"id": "pgprimary",
"cmd": null,
"cpus": 1,
"mem": 128,
"disk": 0,
"instances": 1,
"container": {
"docker": {
"image": "example/postgres:centos7-10.0-1.6.0",
"network": "BRIDGE",
"parameters": [{
"key": "hostname",
"value": "pgprimary"
}],
"portMappings": [
]
},
"type": "DOCKER",
"volumes": [
{
"hostPath": "/mnt/nfsfileshare/pgdata",
"containerPath": "/pgdata",
"mode": "RW"
}
]
},
"env": {
"PG_MODE": "primary",
"PG_USER": "testuser",
"PG_PASSWORD": "testuser",
"PG_DATABASE": "userdb",
"PG_ROOT_PASSWORD": "password",
"PG_PRIMARY_USER": "primaryuser",
"PG_PRIMARY_PASSWORD": "password",
"PG_PRIMARY_PORT": "5432"
},
"labels": {},
"healthChecks": [
{
"protocol": "COMMAND",
"command": {
"value": "/usr/pgsql-10/bin/pg_isready --host=pgprimary.marathon.mesos"
},
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
]
}
The command I'm using to deploy the Marathon app:
marathonctl -h http://10.0.2.15:8080 app create postgres.json
It is not the image but Docker itself that Marathon cannot find.
Specify the use of the Docker containerizer:
echo 'docker,mesos' > /etc/mesos-slave/containerizers
Provisioning Containers with the Docker Containerizer
https://mesosphere.github.io/marathon/docs/native-docker.html

How to pull a Docker image that requires authorization with Marathon

I want to deploy a Docker container with Marathon. If the Docker image does not require authorization, it is pulled normally, but when I try to pull an image from a repository that requires authorization, the task deployment fails. The response is:
Failed to launch container: Failed to run 'docker -H unix:///var/run/docker.sock pull example.com/web:laest': exited with status 1; stderr='Error response from daemon: repository example.com/web not found: does not exist or no pull access '
I changed the permissions of the /var/run/docker.sock file to 777 on the node and on the master, but the issue still appeared, so permissions do not seem to be the root cause. When I run "docker login" on the node and pull the image manually, the Marathon task runs correctly. My Marathon JSON is below:
{
"id": "/web",
"cmd": "docker login --username='sam' --passwoer='123456' example.com/web:latest",
"cpus": 0.3,
"mem": 32,
"disk": 0,
"instances": 1,
"env": {
"EMAIL_USE_TLS": "False",
"DATABASE_URI": "mysql://user:123456#RDS:3306/test"
},
"container": {
"type": "DOCKER",
"volumes": [
{
"containerPath": "/data/supervisor/",
"hostPath": "/data/workspace/logs/supervisor/",
"mode": "RW"
}
],
"docker": {
"image": "daocloud.io/gizwits2015/gwaccounts:1.6.0",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 0,
"hostPort": 0,
"servicePort": 10000,
"protocol": "tcp",
"labels": {}
}
],
"privileged": false,
"parameters": [
{
"key": "add-host",
"value": "RDS:10.66.125.161"
}
],
"forcePullImage": false
}
},
"portDefinitions": [
{
"port": 10000,
"protocol": "tcp",
"name": "default",
"labels": {}
}
]
}
How can I pull an image that requires authorization with Marathon?
You should read: https://mesosphere.github.io/marathon/docs/native-docker-private-registry.html
Follow step 1, and in step 2 replace the uris section with
"fetch" : [
{
"uri" : "https://path.to/file",
"extract" : true,
"outputFile" : "dockerConfig.tar.gz"
}
]
I've written a more detailed explanation here: http://blog.itaysk.com/2017/05/22/using-a-custom-private-docker-registry-with-marathon
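Step 1 of the linked Marathon guide amounts to archiving the .docker folder that "docker login" creates, so that the fetch entry can download and extract it. As a rough Python sketch of that packaging step (the function name pack_docker_config is hypothetical, and it assumes the folder already contains a config.json written by docker login):

```python
import tarfile
from pathlib import Path


def pack_docker_config(docker_dir, archive_path):
    """Write a gzip-compressed tar of docker_dir, stored under the name '.docker'."""
    docker_dir = Path(docker_dir)
    if not (docker_dir / "config.json").is_file():
        raise FileNotFoundError(
            f"no config.json in {docker_dir}; run 'docker login' first"
        )
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps the top-level entry named '.docker' wherever the source lives,
        # so the credentials end up in a '.docker' folder after extraction.
        archive.add(str(docker_dir), arcname=".docker")
    return Path(archive_path)


# Example usage:
#   pack_docker_config(Path.home() / ".docker", "docker.tar.gz")
```

The resulting archive then has to be hosted at whatever location the "uri" field of the fetch entry points to.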

How to set up Cassandra Docker cluster in Marathon with BRIDGE network?

I have a production DC/OS (v1.8.4) cluster and I am trying to set up a Cassandra cluster inside it. I use Marathon (v1.3.0) to deploy Cassandra nodes. I use the official Docker image of Cassandra, specifically version 2.2.3.
First Case: Deploy Cassandra using HOST mode network - Everything OK
In this case, I first deploy a node that I call cassandra-seed, and it attaches to a physical host with IP 10.32.0.6. From the stdout log of Marathon for this service I can see that "Node /10.32.0.6 state jump to normal" and that listen_address and broadcast_address are set to 10.32.0.6. If I check the mesos-dns records using "_cassandra-seed._tcp.marathon.mesos SRV" on a master node, I can see that the IP that resolves for this service is 10.32.0.6. The node is fully functional and I manage to create a test database.
{
"id": "/cassandra-seed",
"cpus": 1.5,
"mem": 8192,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:2.2.3",
"network": "HOST",
"ports": [7199,7000,7001,9160,9042],
"requirePorts": true,
"privileged": true
}
},
"constraints": [ ["hostname","UNIQUE"] ],
"env": { "CASSANDRA_CLUSTER_NAME": "democluster" }
}
Now I add one more Cassandra node using a separate deployment, providing 10.32.0.6 as the seed (set "CASSANDRA_SEEDS": "10.32.0.6" in the env section of the deployment JSON). The new node gets the IP of another physical host (same pattern as before) and manages to gossip with the seed node. Thus, we have a functioning Cassandra cluster.
{
"id": "/cassandra",
"cpus": 1.5,
"mem": 8192,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:2.2.3",
"network": "HOST",
"ports": [7199,7000,7001,9160,9042],
"requirePorts": true,
"privileged": true
}
},
"constraints": [ ["hostname","UNIQUE"] ],
"env": {
"CASSANDRA_CLUSTER_NAME": "democluster",
"CASSANDRA_SEEDS": "10.32.0.6"
}
}
Second Case: Deploy Cassandra using BRIDGE mode network - Houston we have a problem
In this case, I also deploy a first cassandra-seed node, and it attaches to a physical host with IP 10.32.0.6. However, now in the stdout log of the service in Marathon I can see that "Node /172.17.0.2 state jump to normal" and that listen_address and broadcast_address are set to 172.17.0.2. 172.17.0.2 is the IP of the Docker container (found using docker inspect). However, if I check the mesos-dns records using "_cassandra-seed._tcp.marathon.mesos SRV" on a master node, I can see that the IP that resolves for this service is 10.32.0.6. The node is fully functional and I manage to create a test database.
{
"id": "/cassandra-seed",
"cpus": 1.5,
"mem": 8192,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:2.2.3",
"network": "BRIDGE",
"portMappings": [
{"containerPort": 7000, "hostPort": 7000, "servicePort": 0 },
{"containerPort": 7001, "hostPort": 7001, "servicePort": 0 },
{"containerPort": 7199, "hostPort": 7199, "servicePort": 0 },
{"containerPort": 9042, "hostPort": 9042, "servicePort": 0 },
{"containerPort": 9160, "hostPort": 9160, "servicePort": 0 }
],
"privileged": true
}
},
"constraints": [ [ "hostname", "UNIQUE" ] ],
"env": {"CASSANDRA_CLUSTER_NAME": "democluster"}
}
Now I add one more Cassandra node using a separate deployment, providing 10.32.0.6 as the seed. The new node attaches to another host and gets the IP of its container (Node /172.17.0.2 state jump to normal). The result is that the new node cannot gossip with the seed.
{
"id": "/cassandra",
"cpus": 1.5,
"mem": 8192,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"docker": {
"image": "cassandra:2.2.3",
"network": "BRIDGE",
"portMappings": [
{"containerPort": 7000, "hostPort": 7000, "servicePort": 0 },
{"containerPort": 7001, "hostPort": 7001, "servicePort": 0 },
{"containerPort": 7199, "hostPort": 7199, "servicePort": 0 },
{"containerPort": 9042, "hostPort": 9042, "servicePort": 0 },
{"containerPort": 9160, "hostPort": 9160, "servicePort": 0 }
],
"privileged": true
}
},
"constraints": [ [ "hostname", "UNIQUE" ] ],
"env": {
"CASSANDRA_CLUSTER_NAME": "democluster",
"CASSANDRA_SEEDS": "10.32.0.6"
}
}
The question is: how can I make the two nodes gossip in the second case? Which IP should I provide as the seed to the second node so that it can find the first one? 172.17.0.2 is the Docker container IP and cannot be reached by the second node. For example, could the Cassandra instance on the seed node get the IP of the physical host, just like in host network mode?
Thank you in advance!
When forming a Cassandra cluster in bridge network mode, take care of the following settings:
1. Set the values below to the host IP (not the container IP):
seeds: public_ip
broadcast_address: public_ip
broadcast_rpc_address: public_ip
2. Set listen_address to the container IP:
listen_address: 172.17.x.x
3. Set rpc_address to 0.0.0.0 (don't use localhost).
This way we can actually form a Cassandra cluster using a bridge network.
Give it a try. Make sure the required ports are accessible from the outside world.
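The settings listed above can be summarized in a small Python sketch that builds them for one node. This is only an illustration: the function name bridge_mode_settings is hypothetical, and 10.32.0.7 is an assumed IP for the second physical host, which the question does not state. Note that in a real cassandra.yaml the seeds value sits under seed_provider parameters rather than at the top level.

```python
def bridge_mode_settings(host_ip, container_ip, seed_host_ips):
    """Return cassandra.yaml-style settings for a node behind a Docker bridge."""
    return {
        # Addresses advertised to other nodes and clients: the host IPs.
        "seeds": ",".join(seed_host_ips),
        "broadcast_address": host_ip,
        "broadcast_rpc_address": host_ip,
        # Address the process binds to inside the container.
        "listen_address": container_ip,
        # Accept client connections on all interfaces inside the container.
        "rpc_address": "0.0.0.0",
    }


# Second node: assumed host IP 10.32.0.7, seed on host 10.32.0.6 (from the question).
settings = bridge_mode_settings("10.32.0.7", "172.17.0.2", ["10.32.0.6"])
for key, value in settings.items():
    print(f"{key}: {value}")
```

The point of the split is that the container IP is only used for binding, while everything other machines see is a host IP that the port mappings forward into the container.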
