Infinispan's JGroups not joining the same cluster in Docker services - docker

(The query section is further below, towards the middle.)
Cross posted at https://developer.jboss.org/message/982355
Environment:
Infinispan 9.13,
Embedded cache in a cluster with jgroups,
Single file store,
Using JGroups inside Docker services in a single docker host/daemon (Not in AWS yet).
Infinispan.xml below:
<jgroups><stack-file name="external-file" path="${path.to.jgroups.xml}"/></jgroups>
Application = 2 webapps + database
Issue:
When I deploy the 2 webapps in separate Tomcats directly on a machine (not Docker yet), the Infinispan manager initializing the cache (in each webapp) forms a cluster using JGroups, i.e. it works. But with the exact same configuration (and the same channel name in JGroups), when the webapps are deployed as services in Docker, they do not join the same cluster; instead each is a separate cluster with just one member in its view (logs below).
The services are Docker containers built from images (Linux + Tomcat + webapp) and are launched using docker compose v3.
I have tried the instructions at https://github.com/belaban/jgroups-docker (a container with JGroups and a couple of demos). It suggests either using --network=host mode for the Docker services (this does work, but we cannot use it because the config files would need separate ports if we scale), or setting external_addr in jgroups.xml to the Docker host's IP address (this is NOT working, and the query is how to make it work).
It's not a timing issue: I also tried putting a significant delay before starting the 2nd service deployed in the stack, but each app's Infinispan cluster still has just one member in its view (that container itself). Calling cacheManager.getMembers() also shows just one entry inside each app (it should show 2).
Log showing just one member in first app:
org.infinispan.remoting.transport.jgroups.JGroupsTransport.receiveClusterView ISPN000094: Received new cluster view for channel CHANNEL_NAME: [FirstContainerId-6292|0] (1) [FirstContainerId-6292].
org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded ISPN000079: Channel CHANNEL_NAME local address is FirstContainerId-6292, physical addresses are [10.xx.yy.zz:7800]
Log showing just one member in second app:
org.infinispan.remoting.transport.jgroups.JGroupsTransport.receiveClusterView ISPN000094: Received new cluster view for channel CHANNEL_NAME: [SecondContainerId-3502|0] (1) [SecondContainerId-3502]
29-Apr-2018 11:47:42.357 INFO [localhost-startStop-1] org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded ISPN000079: Channel CHANNEL_NAME local address is 58cfa4b95c16-3502, physical addresses are [10.xx.yy.zz:7800]
The docker compose V3 is below and shows the overlay network:
version: "3"
services:
app1:
image: app1:version
ports:
- "fooPort1:barPort"
volumes:
- "foo:bar"
networks:
- webnet
app2:
image: app2:version
ports:
- "fooPort2:barPort"
volumes:
- "foo:bar"
networks:
- webnet
volumes:
dbdata:
networks:
webnet:
Deployed using: docker stack deploy --compose-file docker-compose.yml OurStack
The JGroups.xml has the relevant config part below:
<TCP
external_addr="${ext-addr:docker.host.ip.address}"
bind_addr="${jgroups.tcp.address:127.0.0.1}"
bind_port="${jgroups.tcp.port:7800}"
enable_diagnostics="false"
thread_naming_pattern="pl"
send_buf_size="640k"
sock_conn_timeout="300"
bundler_type="sender-sends-with-timer"
thread_pool.min_threads="${jgroups.thread_pool.min_threads:1}"
thread_pool.max_threads="${jgroups.thread_pool.max_threads:10}"
thread_pool.keep_alive_time="60000"/>
<MPING bind_addr="${jgroups.tcp.address:127.0.0.1}"
mcast_addr="${jgroups.mping.mcast_addr:228.2.4.6}"
mcast_port="${jgroups.mping.mcast_port:43366}"
ip_ttl="${jgroups.udp.ip_ttl:2}"/>
The code is similar to:
DefaultCacheManager manager = new DefaultCacheManager(jgroupsConfigFile.getAbsolutePath());
Cache someCache = manager.getCache("SOME_CACHE").getAdvancedCache().withFlags(Flag.IGNORE_RETURN_VALUES);
Query:
How do we deploy with docker-compose (as two services in Docker containers) and the jgroups.xml above so that the Infinispan caches in the two webapps join and form a cluster, so both apps can access the same data that either one reads/writes in the cache? Right now they connect to the same channel name, but each becomes a cluster with just one member, even when we point jgroups at external_addr.
Tried so far:
Putting a delay in the second service's startup so the first has enough time to advertise.
Running JGroups in Docker: the belaban/jgroups containers can be deployed as two services in a stack using docker compose, and they do form a cluster (chat.sh inside the container shows a 2-member view).
Tried --net=host, which works but is infeasible for us. Tried external_addr=docker.host.ip in jgroups.xml, which would be the ideal solution, but it's not working (the log above is from that attempt).
Thanks! Will try to provide any specific info if required.

Apparently external_addr="${ext-addr:docker.host.ip.address}" does not resolve (or resolves to null), so the bind_addr of 127.0.0.1 is used instead. Is docker.host.ip.address set by you (e.g. as an env variable)?
The external_addr attribute should point to a valid IP address.
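For illustration only, one way to make that substitution resolve is to pass the ext-addr system property (and a non-loopback bind address) to the JVM inside each container, for example via CATALINA_OPTS in the compose file. This is a hedged sketch, not part of the original setup: the IP 192.168.1.10 is a placeholder for the real Docker host address, and it assumes the Tomcat startup script in the image picks up CATALINA_OPTS.
services:
  app1:
    image: app1:version
    environment:
      # ext-addr and jgroups.tcp.address are the property names referenced in the jgroups.xml above.
      # 192.168.1.10 is a placeholder for the Docker host IP; SITE_LOCAL asks JGroups to bind to the
      # container's own site-local interface instead of the 127.0.0.1 default.
      - CATALINA_OPTS=-Dext-addr=192.168.1.10 -Djgroups.tcp.address=SITE_LOCAL
    networks:
      - webnet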

Related

Docker container R/W permissions to access remote TrueNAS SMB share

I've been banging my head against the wall trying to sort out permissions issues when running a container that uses a remote SMB share for storing configuration files.
I found this post and answer but still can't seem to get things to work:
docker-add-network-drive-as-volume-on-windows
For the below YAML code, yes everything is formatted correctly. I just copied this from my reddit post and the indents are not showing correctly now.
My set-up is as follows:
Running Proxmox as my hypervisor with:
TrueNAS Scale as the NAS
Debian VM for hosting Docker
The TrueNAS VM has a single pool, with 1 dataset for SMB shares and 1 dataset for NFS shares (implemented for troubleshooting purposes)
I have credentials steve:steve (1000:1000) with supersecurepassword and Full Control ACL permissions on the SMB share. I can access this share via Windows and the CLI, and all expected operations behave as expected.
On the Debian host, I have created user steve:steve (1000:1000) with supersecurepassword.
I have been able to successfully mount and map the share within the Debian host using cifs with:
//192.168.10.206/dockerdata /mnt/dockershare cifs uid=1000,gid=1000,vers=3.0,credentials=/root/.truenascreds 0 0
The credentials are:
username=steve
password=supersecurepassword
I can read/write from CLI through the mount point, view files, modify files, etc.
I have also successfully mounted and read/written the share with these additional options:
file_mode=0777,dir_mode=0777,noexec,nosuid,nosetuids,nodev
Now here's where I start having problems. I can create a container using docker compose or Portainer (manual creation, and a stack for compose), but I run into database errors as the container attempts to start.
version: "2.1"
services:
babybuddytestsmbmount:
image: lscr.io/linuxserver/babybuddy:latest
container_name: babybuddytestsmbmount
environment:
- PUID=1000
- PGID=1000
- TZ=America/New_York
- CSRF_TRUSTED_ORIGINS=http://127.0.0.1:8000,https://babybuddy.domain.com
ports:
- 1801:8000
restart: unless-stopped
volumes:
- /mnt/dockershare/babybuddy2:/config
Docker will create all folders and files and start the container, but the web UI returns a server 500 error. The logs show these database errors, which result in a large number of exceptions:
sqlite3.OperationalError: database is locked
django.db.utils.OperationalError: database is locked
django.db.migrations.exceptions.MigrationSchemaMissing: Unable to create the django_migrations table (database is locked)
I also tried mounting the SMB share in a docker volume using the following:
version: "2.1"
services:
babybuddy:
image: lscr.io/linuxserver/babybuddy:latest
container_name: babybuddy
environment:
- PUID=1000
- PGID=1000
- TZ=America/New_York
- CSRF_TRUSTED_ORIGINS=http://127.0.0.1:8000,https://babybuddy.domain.com
ports:
- 1800:8000
restart: unless-stopped
volumes:
- dockerdata:/config
volumes:
dockerdata:
driver_opts:
type: "cifs"
o: "username=steve,password=supersecurepassword,uid=1000,gid=1000,file_mode=0777,dir_mode=0777,noexec,nosuid,nosetuids,nodev,vers=3.0"
device: "//192.168.10.206/dockerdata"
I have also tried this under options:
o: "username=steve,password=supersecurepassword,uid=1000,gid=1000,rw,vers=3.0"
Docker again is able to create the container, create & mount the volume, and create all folders and files, but it encounters the same DB errors indicated above.
I believe this is because the container is trying to access the SMB share as root, which TrueNAS does not permit. I have verified that all files and folders are under the correct ownership, and during troubleshooting I have also stopped the container, recursively chown'd and chgrp'd the dataset to root:root, and restarted the container; no dice. Changing the SMB credentials on the Debian host to root results in a failure to connect.
To make sure I didn't have a different issue causing problems, I was able to successfully start the container locally on the host as well as using a remote NFS share from the same TrueNAS VM.
I have also played with the dataset permissions, changing owners within TrueNAS, attempting permissions without ACL, etc.
Each of these variations was done with a fresh dataset for SMB, a wipe and recreation of Docker, and a reinstall of Debian.
Any help or suggestions would be greatly appreciated.
Edit: I also tried this with Ubuntu as the Docker host and attempted to have Docker run under the steve user, with disastrous results.
I expected to be able to mount the SMB share from my TrueNAS system on my Debian Docker host machine, but instead I encounter write errors in the database files that are part of the container. Local Docker instances and NFS mounts work fine.
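(Not from the original post, just a speculative variation: SQLite's "database is locked" errors on CIFS mounts are often related to byte-range locking, so a volume definition that adds the cifs nobrl mount option might be worth trying.)
volumes:
  dockerdata:
    driver_opts:
      type: "cifs"
      # nobrl stops the cifs client from forwarding byte-range locks to the server,
      # a common workaround for SQLite lock errors on SMB shares
      o: "username=steve,password=supersecurepassword,uid=1000,gid=1000,vers=3.0,nobrl"
      device: "//192.168.10.206/dockerdata"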

How to configure RabbitMQ for message persistence in Docker swarm?

How can I configure RabbitMQ to retain messages on node restart in docker swarm?
I've marked the queues as durable and I'm setting the message's delivery mode to 2. I'm mounting /var/lib/rabbitmq/mnesia to a persistent volume. I've docker exec'd to verify that rabbitmq is indeed creating files in said folder, and all seems well. Everything works in my local machine using docker-compose.
However, when the container crashes, docker swarm creates a new one, and this one seems to initialize a new Mnesia database instead of using the old one. The database's name seems to be related to the container's id. It's just a single node, I'm not configuring any clustering.
I haven't changed anything in rabbitmq.conf, except for the cluster_name, since it seemed to be related to the folder created, but that didn't solve it.
Relevant section of the docker swarm configuration:
rabbitmq:
  image: rabbitmq:3.9.11-management-alpine
  networks:
    - default
  environment:
    - RABBITMQ_DEFAULT_PASS=password
    - RABBITMQ_ERLANG_COOKIE=cookie
    - RABBITMQ_NODENAME=rabbit
  volumes:
    - rabbitmq:/var/lib/rabbitmq/mnesia
    - rabbitmq-conf:/etc/rabbitmq
  deploy:
    placement:
      constraints:
        - node.hostname==foomachine
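One commonly suggested remedy (not part of the original question) is to pin the service's hostname: RabbitMQ derives its node name, and therefore its data directory name, from the hostname, and swarm gives every replacement container a new random hostname. A minimal sketch, with the hostname value chosen arbitrarily:
rabbitmq:
  image: rabbitmq:3.9.11-management-alpine
  # A fixed hostname keeps the node name (rabbit@rabbitmq) stable across container
  # replacements, so the node reuses the same mnesia directory from the mounted volume.
  hostname: rabbitmq
  volumes:
    - rabbitmq:/var/lib/rabbitmq/mnesia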

Docker compose/swarm 3: docker file path , build, container name, links, migration

I have a project with a docker-compose file and want to migrate to V3, but when I deploy with
docker stack deploy --compose-file=docker-compose.yml vertx
it does not understand the build path, links, container names...
My file is located here:
https://github.com/armdev/vertx-spring/blob/master/docker-compose.yml
version: '3'
services:
  eureka-node:
    image: eureka-node
    build: ./eureka-node
    container_name: eureka-node
    ports:
      - '8761:8761'
    networks:
      - vertx-network
  postgres-node:
    image: postgres-node
    build: ./postgres-node
    container_name: postgres-node
    ports:
      - '5432:5432'
    networks:
      - vertx-network
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: socnet
      POSTGRES_DB: socnet
  vertx-node:
    image: vertx-node
    build: ./vertx-node
    container_name: vertx-node
    links:
      - postgres-node
      - eureka-node
    ports:
      - '8585:8585'
    networks:
      - vertx-network
networks:
  vertx-network:
    driver: overlay
When I run docker-compose up it works, but with stack deploy it does not.
How do I define the path to the Dockerfile?
docker stack deploy works only with images, not with builds.
This means that you will have to push your images (created with the build process) to an image registry; docker stack deploy will later download the images and run them.
Here you have an example of how it was done for a PHP application.
You have to pay attention to parts 1, 3 and 4.
The articles are about PHP, but they can easily be applied to any other language.
The swarm mode "docker service" interface has a few fundamental differences in how it manages containers. You are no longer directly running containers like with "docker run", and it is assumed that you will be doing this in a distributed environment more often than not.
I'll break down the answer by these specific things you listed.
It does not understand build path, links, container names...
Links
The link option has been deprecated for quite some time in favor of the network service discovery feature introduced alongside the "docker network" feature. You no longer need to specify links to/from containers. Instead, you simply need to ensure that all containers are on the same network, and then they can discover each other by container name or "network alias".
docker-compose will put all your containers into the same network by default, and it sets up the compose service name as an alias. That means if you have a service called 'postgres-node', you can reach it via DNS by the name 'postgres-node'.
Container Names
The "docker service" interface allows you to declare a desired state. "I want x number of identical services". Since the interface must support x number of instances of a service, it doesn't allow you to choose the specific container name. Instead, you get to choose the service name. In the case of 'docker stack deploy', the service name defined under the services key in your docker-compose.yml file will be used, but it will also prepend the stack name to the service name.
In most cases, I would argue that overriding the container name in a docker-compose.yml file is unnecessary, even when using regular containers via docker-compose up.
If you need a different name for network service discovery purposes, add a different alias or use the service name alias that you get when using docker-compose or docker stack deploy.
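For example, a compose service can be given an extra DNS alias on a network like this (the alias name db is just illustrative, not from the question):
services:
  postgres-node:
    image: postgres-node
    networks:
      vertx-network:
        # other containers on vertx-network can also resolve this service as "db"
        aliases:
          - db
networks:
  vertx-network:
    driver: overlay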
build path
Because swarm mode was built to be a distributed system, building an image in place locally isn't something that "docker stack deploy" was meant to do. Instead, you should build and push your image to a registry that all nodes in your cluster can access.
In the case where you are using a single node swarm "cluster", you should be able to use the docker-compose build option to get the images built locally, and then use docker stack deploy.
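A sketch of that workflow, assuming a registry at registry.example.com (a placeholder): keep both build and image in the compose file, run docker-compose build and docker-compose push, and then docker stack deploy pulls the pushed image by name on every node.
version: '3'
services:
  vertx-node:
    # "docker-compose build" uses the build path and tags the result with the image name;
    # "docker-compose push" publishes it; "docker stack deploy" ignores build and pulls the image.
    build: ./vertx-node
    image: registry.example.com/vertx-node:1.0
    ports:
      - '8585:8585'
    networks:
      - vertx-network
networks:
  vertx-network:
    driver: overlay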

Host names are not set in docker compose

I created simple compose config to try Postgres BDR replication.
I expect the containers to have the service names I defined as their hostnames, and I expect one container to be able to resolve and reach another by that hostname. I expect this to be true because of this:
https://docs.docker.com/compose/networking/
My config:
version: '2'
services:
  bdr1:
    image: bdr
    volumes:
      - /var/lib/postgresql/data1:/var/lib/postgresql/data
    ports:
      - "5001:5432"
  bdr2:
    image: bdr
    volumes:
      - /var/lib/postgresql/data2:/var/lib/postgresql/data
    ports:
      - "5002:5432"
But in reality both containers get rubbish hostnames and are not reachable by container names:
Creating network "bdr_default" with the default driver
Creating bdr_bdr1_1
Creating bdr_bdr2_1
Attaching to bdr_bdr1_1, bdr_bdr2_1
bdr1_1 | Hostname: 938e0585fee2
bdr2_1 | Hostname: 7153165f4d5b
Is it a bug, or did I do something wrong?
I use Ubuntu 14.04.4 LTS, Docker version 1.10.1, build 9e83765, docker-compose version 1.6.0, build d99cad6
docker-compose gives you the option of scaling services up or down, meaning you can launch multiple instances of the same service. That is at least one reason why the hostnames are not just the service names. You will notice that if you scale bdr1 to 2 instances, you will then have bdr_bdr1_1 and bdr_bdr1_2 containers.
You can work around this inside the containers that were started up by docker-compose in at least two ways:
If a service refers to another service, you can use the links section, for example making bdr1 link to bdr2. In this case, when you are inside bdr1 you can reach the host bdr2 by name. I have not tried what happens when you scale up bdr2 in this case.
You can force the hostname of a container internally to the name you want by using the hostname section. For example, if you add hostname: bdr1 to bdr1, then you can internally connect to bdr1, which is itself (a minimal sketch follows after these options).
You can possibly achieve similar result with the networks section, but I have not yet used it myself so I don't know for sure.
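A minimal sketch of the hostname workaround, reusing the services from the question (everything else unchanged):
services:
  bdr1:
    image: bdr
    # inside this container, "bdr1" resolves to the container itself
    hostname: bdr1
  bdr2:
    image: bdr
    hostname: bdr2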
The hostname inside the container should be the short container id, so this is correct (note there was a bug with Compose 1.6.0 and the short container id, so you should use at least version 1.6.2). Also, /etc/hosts is no longer used; there is now an embedded DNS server that handles resolving names to container IP addresses.
The container is discoverable by other containers with 3 names: the container name, the container short id, and the service name.
However, the other container may not be available immediately when the first one starts. You can use depends_on to set the order.
If you are testing the discovery, try using ping, and make sure to retry, because the name may not resolve immediately.

Docker compose 'scale' command is not scaling across multiple machines

I have a 2-machine swarm cluster. I have installed the simple Docker compose demo from here on one of the machines. However, when I try to scale the application with the docker-compose scale web=5 command, it only scales on the current machine and does not create any of the new web containers on the other machine in the swarm cluster as expected.
In every example I've seen from others, the scale command just works, and nothing was mentioned about additional configuration needed to get it to scale across multiple nodes.
Not sure what else to try. I get the same result when running the scale command from either machine.
Please let me know what further information I can provide.
I see now there were two issues causing my scale commands to fail; however, it is still not working even with proper multi-host networking set up.
When scaling a container from a compose application that was linked to another container in that same compose app: this was failing because I was joining the containers with the deprecated(?) "links" functionality rather than using the new multi-host networking functionality. Apparently, "links" can only work on a single machine and cannot be scaled across multiple machines. (I'm fairly sure this is the case, but I could be wrong.)
When attempting to scale an unlinked container: this was actually working as expected. I had forgotten that I had other containers running on the machine I was expecting Docker to scale out my container to. Thus the Swarm scheduler just put the newly scaled containers onto the current machine, since the current machine was the least utilized. (This was on a 2-machine swarm cluster.)
EDIT - Actual Solution
Okay, it looks like the final problem was that I cannot scale the part of the compose app that uses build to create its image rather than specifying the image with image.
I suppose this makes sense, because the machine it is trying to scale that container to doesn't have the build files available to create the image, but I had assumed Docker Compose/Swarm would be smart enough to figure that out and somehow copy them across machines.
So the solution is to build that image beforehand with docker build and then either push it to the public Docker Hub or your own private registry, and have the Docker compose file specify that image with image rather than trying to create it with build.
One thing you could do is label the web containers (such as com.mydomain.myapp.category=web) and make a soft anti-affinity rule for the label (such as affinity:com.mydomain.myapp.category!=~web). This tells Swarm to prefer scheduling another container with com.mydomain.myapp.category=web on a host that doesn't already contain such a container (but to schedule it on a host that already has one if there is no other option).
The modified Docker Compose file in that repository would be something like:
web:
  build: .
  volumes:
    - .:/code
  links:
    - redis
  expose:
    - "5000"
  environment:
    - "affinity:com.mydomain.myapp.category!=~web"
  labels:
    - "com.mydomain.myapp.category=web"
redis:
  image: redis
lb:
  image: tutum/haproxy
  links:
    - web
  ports:
    - "80:80"
  environment:
    - BACKEND_PORT=5000
    - BALANCE=roundrobin
