How to do sidecar container communication in an ECS task?

I have an ECS task where I have the main container and a sidecar container. I'm creating the task on EC2 and the network mode is bridge. My main container needs to talk to the sidecar container. But I am unable to do so.
My task definition is:
[
  {
    "name": "my-sidecar-container",
    "image": "ECR image name",
    "memory": 256,
    "cpu": 256,
    "essential": true,
    "portMappings": [
      {
        "containerPort": 50051,
        "hostPort": 50051,
        "protocol": "tcp"
      }
    ],
    "links": [
      "app"
    ]
  },
  {
    "name": "app",
    "image": "<app image URL here>",
    "memory": 256,
    "cpu": 256,
    "essential": true
  }
]
The sidecar is a gRPC server.
To check whether I can list the gRPC endpoints, I run the following from my main app container, and it does not work:
root@my-main-app# ./grpcurl -plaintext localhost:50051 list
Failed to dial target host "localhost:50051": dial tcp 127.0.0.1:50051: connect: connection refused
But if I use the EC2 instance's private IP, it works, e.g.
root@my-main-app# ./grpcurl -plaintext 10.0.56.69:50051 list
grpc.reflection.v1alpha.ServerReflection
health.v1.Health
server.v1.MyServer
So it is definitely a networking issue. Wondering how to fix it!

If you're using bridge mode with links, you need to use the link name as the address instead of localhost. You would also need to link the sidecar container to the app container (your task definition currently does the opposite: the sidecar links to the app) and then use the sidecar's link name as the address.
If you were using awsvpc mode, the containers in a task share a network namespace, so you would use localhost:<containerPort> to communicate between containers in the same task.
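Back in bridge mode, a minimal sketch of the reversed link, reusing the container names from the question (the grpcurl hostname follows from the link alias):
[
  {
    "name": "my-sidecar-container",
    "image": "ECR image name",
    "memory": 256,
    "cpu": 256,
    "essential": true,
    "portMappings": [
      { "containerPort": 50051, "hostPort": 50051, "protocol": "tcp" }
    ]
  },
  {
    "name": "app",
    "image": "<app image URL here>",
    "memory": 256,
    "cpu": 256,
    "essential": true,
    "links": [
      "my-sidecar-container"
    ]
  }
]
From inside the app container, the sidecar should then answer at its link name:
./grpcurl -plaintext my-sidecar-container:50051 list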

Related

How can I deal with the changing service IP in a Mesos + Marathon combination?

I am currently running a Docker service with the Mesos + Marathon combination.
This means that the IP address of the container is constantly changing.
For example, to put mongodb on Marathon, you would use the JSON below.
The hostPort setting controls which port is exposed on the host. From time to time the service shuts down, restarts, and comes back with a different IP.
While researching a method called Mesos-DNS, and while studying the docker command, I learned that you can look up a service's IP by an alias name if you specify a network alias in Docker.
I thought this method would make the service easier to reach without using Mesos-DNS.
However, Marathon launches the Docker service from a JSON definition like the one below.
I am asking because I do not know which option, keyword, or method lets me specify the Docker network alias there.
{
  "id": "mongodbTest",
  "instances": 1,
  "cpus": 2,
  "mem": 2048.0,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mongo:latest",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 27017,
          "hostPort": 0,
          "servicePort": 0,
          "protocol": "tcp"
        }
      ]
    },
    "volumes": [
      {
        "containerPath": "/etc/mesos-mg",
        "hostPath": "/var/data/mesos-mg",
        "mode": "RW"
      }
    ]
  }
}
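One hedged sketch, not from the original question: Marathon's docker.parameters array passes arbitrary flags through to docker run, and --net-alias only works on a user-defined network, so this assumes a network named app-net was created on each agent beforehand (docker network create app-net) and may conflict with Marathon's own BRIDGE networking and port mappings:
"docker": {
  "image": "mongo:latest",
  "network": "BRIDGE",
  "parameters": [
    { "key": "net", "value": "app-net" },
    { "key": "net-alias", "value": "mongodb" }
  ]
}
With that in place, other containers attached to app-net could reach the service as mongodb:27017 regardless of the container's current IP.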

Running Chronos docker image in BRIDGE mode

I've been putting together a POC mesos/marathon system that I am using to launch and control docker images.
I have a Vagrant virtual machine running in VirtualBox on which I run docker, marathon, zookeeper, mesos-master and mesos-slave processes, with everything working as expected.
I decided to add Chronos into the mix and initially I started with it running as a service on the vagrant VM, but then opted to switch to running it in a docker container using the mesosphere/chronos image.
I have found that I can get the container image to start and run successfully when I specify HOST network mode for the container, but when I change to BRIDGE mode I run into problems.
In BRIDGE mode, the chronos framework registers successfully with mesos (I can see the entry on the frameworks page of the mesos UI), but it looks as though the framework itself doesn't know that the registration was successful. The mesos master log is full of messages like:
I1009 09:47:35.876454 3131 master.cpp:2094] Received SUBSCRIBE call for framework 'chronos-2.4.0' at scheduler-16d21dac-b6d6-49f9-90a3-bf1ba76b4b0d@172.17.0.59:37318
I1009 09:47:35.876832 3131 master.cpp:2164] Subscribing framework chronos-2.4.0 with checkpointing enabled and capabilities [ ]
I1009 09:47:35.876924 3131 master.cpp:2174] Framework 20151009-094632-16842879-5050-3113-0001 (chronos-2.4.0) at scheduler-16d21dac-b6d6-49f9-90a3-bf1ba76b4b0d@172.17.0.59:37318 already subscribed, resending acknowledgement
This implies some sort of configuration/communication issue but I have not been able to work out exactly what the root of the problem is. I'm not sure if there is any way to confirm if the acknowledgement from mesos is making it back to chronos or to check the status of the communication channels between the components.
I've done a lot of searching and I can find posts by folks who have encountered the same issue, but I haven't found a detailed explanation of what needs to be done to correct it.
For example, I found one post which mentions a problem that was resolved, implying the user successfully ran their chronos container in bridge mode, but their description of the resolution was vague. There was also another post, but the change it suggested did not resolve the issue that I am seeing.
Finally, there was a post by someone at ILM who had what sounds like exactly my problem. The resolution appeared to involve a fix to Mesos that introduced two new environment variables, LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT (on top of LIBPROCESS_IP and LIBPROCESS_PORT), but I can't find a decent explanation of what values should be assigned to any of these variables, so I have yet to work out whether the change will resolve the issue I am having.
It's probably worth mentioning that I've also posted a couple of questions on the chronos-scheduler group, but I haven't had any responses to these.
If it's of any help, the versions of software I'm running are as follows (the volume mount in the JSON below allows me to provide the values of other parameters [e.g. master, zk_hosts] as files, without having to keep changing the JSON):
Vagrant: 1.7.4
VirtualBox: 5.0.2
Docker: 1.8.1
Marathon: 0.10.1
Mesos: 0.24.1
Zookeeper: 3.4.5
The JSON that I am using to launch the chronos container is as follows:
{
  "id": "chronos",
  "cpus": 1,
  "mem": 1024,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/chronos",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 4400,
          "hostPort": 0,
          "servicePort": 4400,
          "protocol": "tcp"
        }
      ]
    },
    "volumes": [
      {
        "containerPath": "/etc/chronos/conf",
        "hostPath": "/vagrant/vagrantShared/chronos",
        "mode": "RO"
      }
    ]
  },
  "cmd": "/usr/bin/chronos --http_port 4400",
  "ports": [
    4400
  ]
}
If anyone has any experience of using chronos in a configuration like this then I'd appreciate any help that you might be able to provide in resolving this issue.
Regards,
Paul Mateer
I managed to work out the answer to my problem (with a little help from the sample framework here), so I thought I should post a solution to help anyone else who runs into the same issue.
The chronos service (and also the sample framework) was configured to communicate with zookeeper on the IP associated with the docker0 interface on the host (vagrant) VM (in this case 172.17.42.1).
Zookeeper would report the master as being available on 127.0.1.1, which was the IP address of the host VM that the mesos-master process started on, but although this IP address could be pinged from the container, any attempt to connect to specific ports would be refused.
The solution was to start the mesos-master with the --advertise_ip parameter and specify the IP of the docker0 interface. This meant that although the service started on the host machine, it would appear as though it had been started on the docker0 interface.
Once this was done, communications between mesos and the chronos framework started completing, and the tasks scheduled in chronos ran successfully.
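A minimal sketch of that invocation, reusing the docker0 IP from this setup; the zookeeper URL and work directory are placeholders for whatever the master already uses:
mesos-master --zk=zk://localhost:2181/mesos --work_dir=/var/lib/mesos --quorum=1 --advertise_ip=172.17.42.1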
Running Mesos 1.1.0 and Chronos 3.0.1, I was able to successfully configure Chronos in BRIDGE mode by explicitly setting LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT, and by pinning its second port to a fixed hostPort. That isn't ideal, but it was the only way I could find to make it advertise its port to Mesos properly:
{
  "id": "/core/chronos",
  "cmd": "LIBPROCESS_ADVERTISE_IP=$(getent hosts $HOST | awk '{ print $1 }') LIBPROCESS_ADVERTISE_PORT=$PORT1 /chronos/bin/start.sh --hostname $HOST --zk_hosts master-1:2181,master-2:2181,master-3:2181 --master zk://master-1:2181,master-2:2181,master-3:2181/mesos --http_credentials ${CHRONOS_USER}:${CHRONOS_PASS}",
  "cpus": 0.1,
  "mem": 1024,
  "disk": 100,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "mesosphere/chronos:v3.0.1",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 9900,
          "hostPort": 0,
          "servicePort": 0,
          "protocol": "tcp",
          "labels": {}
        },
        {
          "containerPort": 9901,
          "hostPort": 9901,
          "servicePort": 0,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": true,
      "parameters": [],
      "forcePullImage": true
    }
  },
  "env": {
    "CHRONOS_USER": "admin",
    "CHRONOS_PASS": "XXX",
    "PORT1": "9901",
    "PORT0": "9900"
  }
}
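For context (an inference from Marathon's documented behaviour, not something stated in the original answer): Marathon injects $HOST and $PORT0/$PORT1 into the container environment, with $PORT1 corresponding to the second entry in portMappings, so LIBPROCESS_ADVERTISE_PORT resolves to the pinned hostPort 9901 and LIBPROCESS_ADVERTISE_IP to the agent's address, which is what libprocess then advertises to the Mesos master.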

Mesos cannot deploy container from private Docker registry

I have a private Docker registry that is accessible at https://docker.somedomain.com (over standard port 443, not 5000). My infrastructure includes a Mesosphere setup with the docker containerizer enabled. I am trying to deploy a specific container to a Mesos slave via Marathon; however, this always fails, with Mesos failing the task almost immediately and no data in the stderr and stdout of that sandbox.
I tried deploying an image from the standard Docker Registry and that appears to work fine. I'm having trouble figuring out what is wrong. My private Docker registry does not require password authentication (turned off for debugging this), and if I shell into the Mesos slave instance and sudo su as root, I can run a 'docker pull docker.somedomain.com/services/myapp' successfully every time.
Here is my Marathon post data for starting the task:
{
  "id": "myapp",
  "cpus": 0.5,
  "mem": 64.0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "docker.somedomain.com/services/myapp:2",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 7000, "hostPort": 0, "servicePort": 0, "protocol": "tcp" }
      ]
    },
    "volumes": [
      {
        "containerPath": "application.yml",
        "hostPath": "/var/myapp/application.yml",
        "mode": "RO"
      }
    ]
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "portIndex": 0,
      "path": "/",
      "gracePeriodSeconds": 5,
      "intervalSeconds": 20,
      "maxConsecutiveFailures": 3
    }
  ]
}
I've been stuck on this for almost a day now, everything I've tried seems to be yielding the same result. Any insights on this would be much appreciated.
My versions:
Mesos: 0.22.1
Marathon: 0.8.2
Docker: 1.6.2
So this turns out to be an issue with volumes:
"volumes": [
  {
    "containerPath": "/application.yml",
    "hostPath": "/var/myapp/application.yml",
    "mode": "RO"
  }
]
Mounting onto the root path of the container may be legal in Docker, but Mesos appears not to handle this behavior. Modifying the containerPath to a non-root path resolves it, i.e.
"volumes": [
{
"containerPath": "/var",
"hostPath": "/var/myapp",
"mode": "RW"
}
]
If it is a problem between Marathon and the registry, the answer should be in the http logs of your registry. If Marathon connects, there will be an entry. And the Mesos master log should contain a clue as well.
It doesn't really sound like a problem between Marathon and the registry, though. Are you sure you have 'docker,mesos' in /etc/mesos-slave/containerizers?
Did you, despite having no authentication, try to follow Using a Private Docker Repository?
To supply credentials to pull from a private repository, add a .dockercfg to the uris field of your app. The $HOME environment variable will then be set to the same value as $MESOS_SANDBOX so Docker can automatically pick up the config file.
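A minimal sketch of that documented approach; the URL is a hypothetical location for the .dockercfg, which Mesos fetches into the task sandbox before Docker runs:
{
  "id": "myapp",
  "cpus": 0.5,
  "mem": 64.0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "docker.somedomain.com/services/myapp:2",
      "network": "BRIDGE"
    }
  },
  "uris": [
    "https://config.somedomain.com/.dockercfg"
  ]
}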

Docker's containers communication using Consul

I have read about service discovery for Docker using Consul, but I can't understand it.
Could you explain how I can run two Docker containers, discover the host of the second container from the first using Consul, and send a message to it?
You would need to run the Consul Agent in client mode inside each Docker container. Each Docker container will need a Consul service definition file to let the Agent know to advertise its service to the Consul servers.
They look like this:
{
  "service": {
    "name": "redis",
    "tags": ["master"],
    "address": "127.0.0.1",
    "port": 8000,
    "checks": [
      {
        "script": "/usr/local/bin/check_redis.py",
        "interval": "10s"
      }
    ]
  }
}
And a Service Health Check to monitor the health of the service. Something like this:
{
  "check": {
    "id": "redis",
    "name": "Redis",
    "script": "/usr/local/bin/check_redis_ping_returns_pong.sh",
    "interval": "10s"
  }
}
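A hedged sketch of wiring this up, assuming the definition files are saved under /etc/consul.d and a Consul server is reachable at 10.0.0.10 (both the path and the address are placeholders):
consul agent -data-dir=/tmp/consul -config-dir=/etc/consul.d -join=10.0.0.10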
In the other Docker container, your code would find the Redis service either via Consul's DNS interface (which listens on port 8600 by default, not the HTTP port 8500) or via the Consul servers' HTTP API:
dig @localhost -p 8600 redis.service.consul
curl $CONSUL_SERVER/v1/health/service/redis?passing
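A hedged usage sketch for the HTTP route, assuming jq is installed; it extracts the address and port of the first passing instance from the health response:
curl -s $CONSUL_SERVER/v1/health/service/redis?passing \
  | jq -r '.[0].Service.Address + ":" + (.[0].Service.Port | tostring)'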

Binding a port to a host interface using the REST API

The documentation for the commandline interface says the following:
To bind a port of the container to a specific interface of the host
system, use the -p parameter of the docker run command:
General syntax
docker run -p [([<host_interface>:[host_port]])|(<host_port>):]<container_port>[/udp] <image>
When no host interface is provided, the port is bound to
all available interfaces of the host machine (aka INADDR_ANY, or
0.0.0.0). When no host port is provided, one is dynamically allocated. The possible combinations of options for a TCP port are the following.
So I was wondering how I do the same but with the REST API?
With POST /containers/create I tried:
"PortSpecs": ["5432:5432"] this seems to expose the port but not bind it to the host interface.
"PortSpecs": ["5432"] gives me the same result as the previous one.
"PortSpecs": ["0.0.0.0:5432:5432"] this returns the error Invalid hostPort: 0.0.0.0 which makes sense.
When I do sudo docker ps the container shows 5432/tcp which should be 0.0.0.0:5432/tcp.
Inspecting the container gives me the following:
"NetworkSettings": {
"IPAddress": "172.17.0.25",
"IPPrefixLen": 16,
"Gateway": "172.17.42.1",
"Bridge": "docker0",
"PortMapping": null,
"Ports": {
"5432/tcp": null
}
}
Full inspect can be found here.
This is an undocumented feature. I found my answer on the mailing list:
When creating the container you have to set ExposedPorts:
"ExposedPorts": { "22/tcp": {} }
When starting your container you need to set PortBindings:
"PortBindings": { "22/tcp": [{ "HostPort": "11022" }] }
There already is an issue on github about this.
Starting containers with PortBindings in the HostConfig was deprecated in v1.10 and removed in v1.12.
Both these configuration parameters should now be included when creating the container.
POST /containers/create
{
  "Image": "<image id>",
  "ExposedPorts": {
    "22/tcp": {}
  },
  "HostConfig": {
    "PortBindings": { "22/tcp": [{ "HostPort": "" }] }
  }
}
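A hedged usage sketch of that request, assuming the daemon is listening on the default Unix socket and a curl build with --unix-socket support; the image name is a placeholder:
curl -s --unix-socket /var/run/docker.sock \
  -H 'Content-Type: application/json' \
  -X POST http://localhost/containers/create \
  -d '{"Image": "ubuntu:latest", "ExposedPorts": {"22/tcp": {}}, "HostConfig": {"PortBindings": {"22/tcp": [{"HostPort": ""}]}}}'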
I know this question has already been answered; I used the above solution, and here is how I did it in Java using Docker Java Client v3.2.5:
// Bind the container port to the chosen host port (equivalent to "hostPort:containerPort")
PortBinding portBinding = PortBinding.parse(hostPort + ":" + containerPort);

// Port bindings live in the HostConfig, mirroring the REST API's HostConfig.PortBindings
HostConfig hostConfig = HostConfig.newHostConfig()
        .withPortBindings(portBinding);

// Expose the container port and attach the host config when creating the container
CreateContainerResponse container =
        dockerClient.createContainerCmd(imageName)
                .withHostConfig(hostConfig)
                .withExposedPorts(ExposedPort.parse(containerPort + "/tcp"))
                .exec();
