My Redis stops working and fails to restart - Docker

I started Redis via Docker on my server (managed with Traefik), but my Redis always ends up failing and I don't understand why.
The only method I have found to get it running again is to delete /data/dump.rdb and restart Redis.
Here is what the Redis log shows when it fails:
* Connecting to MASTER 19XXXXXXXXX
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
* Master replied to PING, replication can continue...
* Partial resynchronization not possible (no cached master)
* Full resync from master: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ:1
* MASTER <-> REPLICA sync: receiving 54992 bytes from master to disk
* MASTER <-> REPLICA sync: Flushing old data
* MASTER <-> REPLICA sync: Loading DB in memory
# Wrong signature trying to load DB from file
# Failed trying to load the MASTER synchronization DB from disk: Invalid argument
* Reconnecting to MASTER 19XXXXXXXXX after failure
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
# Failed to read response from the server: Connection reset by peer
# Master did not respond to command during SYNC handshake
* Connecting to MASTER 19XXXXXXXXX
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
Redis Docker Compose:
version: '3'
services:
  redis:
    image: "redis:7.0"
    container_name: "redis"
    command: "redis-server"
    networks:
      - "traefik"
    restart: "always"
    ports:
      - '6379:6379'
    volumes:
      - "/root/redis/data:/data"
    labels:
      - "traefik.enable=true"
      - "com.centurylinklabs.watchtower.enable=false"

Here is a sample docker-compose.yml file for running a Redis server using Docker Compose:
version: '3'
services:
  redis:
    image: redis
    ports:
      - "6379:6379"
The YAML above uses the official redis image from Docker Hub, exposes the service on port 6379, and maps it to port 6379 on the host machine.
It's a minimal example. Hope it helps.
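If you also want the dump file persisted across container restarts, as in your original file, a slightly fuller sketch (reusing the image, host path, and restart policy from your own compose file) might look like this:
version: '3'
services:
  redis:
    image: "redis:7.0"
    command: "redis-server"
    restart: "always"
    ports:
      - "6379:6379"
    volumes:
      # dump.rdb is written to this host directory, so it survives container restarts
      - "/root/redis/data:/data"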

Related

redis fails after some time

I'm running a Redis container with no replication. When I let the server run for some time, it starts repeating this error:
Timeout connecting to the MASTER...
Reconnecting to MASTER xxx.xxx.xxx.xxx:8886 after failure
MASTER <-> REPLICA sync started
Non blocking connect for SYNC fired the event.
Master replied to PING, replication can continue...
I also tried to set up a replica, which produced the same error with an additional line:
Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
redis:
  image: docker.io/bitnami/redis:7.0
  environment:
    # ALLOW_EMPTY_PASSWORD is recommended only for development.
    - ALLOW_EMPTY_PASSWORD=yes
    - REDIS_DISABLE_COMMANDS=FLUSHDB,FLUSHALL
    - REDIS_REPLICATION_MODE=master
  ports:
    - '6379:6379'
  volumes:
    - 'redis_data:/bitnami/redis/data'
I'm experiencing the same problem, wondering what it can be.

Not able to connect to redis cluster from docker container

I am trying to set up a Redis cluster. The configuration is the bare minimum setup of a cluster, and I am using docker-compose to run my application. But it always throws the following error. However, when I connect to Redis from an external tool, it connects successfully.
ClusterAllFailedError: Failed to refresh slots cache
const db = new Redis.Cluster([{
  host: "redis",
  port: 6379
}])
I can see the Redis master and replica instances running as Docker containers:
master
container ID - 8a7f4d9fc877
image - bitnami/redis:latest
ports - 0.0.0.0:32862->6379/tcp
name - mogus_redis_1
slave
container ID - f04433e04de5
image - bitnami/redis:latest
ports - 0.0.0.0:32863->6379/tcp
name - mogus_redis-replica_1
yml file
redis:
  image: "bitnami/redis:latest"
  ports:
    - 6379
  environment:
    REDIS_REPLICATION_MODE: master
    ALLOW_EMPTY_PASSWORD: "yes"
  volumes:
    - redis-data:/bitnami
redis-replica:
  image: "bitnami/redis:latest"
  ports:
    - 6379
  depends_on:
    - redis
  environment:
    REDIS_REPLICATION_MODE: slave
    REDIS_MASTER_HOST: redis
    REDIS_MASTER_PORT_NUMBER: 6379
    ALLOW_EMPTY_PASSWORD: "yes"
You are not using a Redis Cluster here, just a single master with a replica. Redis.Cluster refreshes its slots cache with CLUSTER SLOTS, which a non-cluster instance cannot answer, hence the ClusterAllFailedError. That being the case, your app should just use the single-instance class, which I assume is something like this:
const db = new Redis({
  host: "redis",
  port: 6379
})

Docker container issue: containers can't communicate

I'm a beginner in Docker. I wrote a simple docker-compose.yml file to run two service containers: the first for a Node app and the other for Redis. The issue is that my app server is unable to connect to the Redis container. Here is my code:
version: '3'
services:
  redis:
    image: redis
    ports:
      - "6379:6379"
    networks:
      - test
  app_server:
    image: app_server
    depends_on:
      - redis
    links:
      - redis
    ports:
      - "4004:4004"
    networks:
      - test
networks:
  test:
Output:
Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED
Looks like your webapp is connecting to 127.0.0.1/localhost instead of redis. So it's not a Docker issue, but rather a programming issue within your web app. You could add an environment variable to your webapp (something like REDIS_HOST) and then set that parameter in the compose file. This of course requires your web application to read the Redis host from the environment variable.
Example environment variable assignment in compose:
webapp:
  image: my_web_app
  environment:
    - REDIS_HOST=redIS_HOST=redis
Again, this requires that your web app actually reads the REDIS_HOST environment variable in its code.
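Wired into the compose file from your question, that might look roughly like this (the app_server service, port, and test network are taken from your file; REDIS_HOST is just the example variable name from above):
app_server:
  image: app_server
  depends_on:
    - redis
  environment:
    # "redis" is the compose service name, resolvable on the shared "test" network
    - REDIS_HOST=redis
  ports:
    - "4004:4004"
  networks:
    - test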
127.0.0.1:6379 connects to the current container's localhost, not to the Redis container.
With your docker-compose file, you can connect to Redis via the Redis container's name, because docker-compose automatically creates a Docker bridge network, which lets you reach another container by its name.
Run docker inspect to see the Redis container's name - for example, if the current Redis container name is redis_abc, you can connect to Redis via redis_abc:6379. Or, more simply, add container_name: redis_server to the docker-compose file to get a fixed container name.
https://docs.docker.com/network/bridge/
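A minimal sketch of that container_name suggestion (the name redis_server is just an illustration, use whatever your app expects to connect to):
redis:
  image: redis
  # fixed name; other containers on the same network can then reach redis_server:6379
  container_name: redis_server
  ports:
    - "6379:6379"
  networks:
    - test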

RabbitMQ on Docker. Not creating cookies / can't find host?

I'm trying to follow a basic tutorial to start using RabbitMQ, converting from "docker run" to docker-compose. http://josuelima.github.io/docker/rabbitmq/cluster/2017/04/19/setting-up-a-rabbitmq-cluster-on-docker.html
Here's my docker-compose file:
version: '3'
services:
  rabbit1:
    image: rabbitmq:3.6.6-management
    restart: unless-stopped
    hostname: rabbit1
    ports:
      - "4369:4369"
      - "5672:5672"
      - "15672:15672"
      - "25672:25672"
      - "35197:35197"
    environment:
      - RABBITMQ_USE_LONGNAME=true
      - RABBITMQ_LOGS=/var/log/rabbitmq/rabbit.log
    volumes:
      - "/nfs/docker/rabbit/data1:/var/lib/rabbitmq"
      - "/nfs/docker/rabbit/data1/logs:/var/log/rabbitmq"
When trying to see if I can connect (and also remove the guest account), I get this error:
Error: unable to connect to node rabbit@rabbit1: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@rabbit1]
rabbit@rabbit1:
* connected to epmd (port 4369) on rabbit1
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
* suggestion: is the Erlang distribution using TLS?
current node details:
- node name: 'rabbitmq-cli-41@rabbit1.no-domain'
- home dir: /var/lib/rabbitmq
- cookie hash: WjJle1otRdldm4Wso6HGfg==
Looking at the persistent data, it doesn't appear to be creating a cookie (whether or not I use the RABBITMQ_ERLANG_COOKIE variable), and I'm not convinced that the domain is being handled properly.
RabbitMQ docs are useless for this.

How to get my machine back to swarm manager status?

I have two AWS instances:
production-01
docker-machine-master
I SSH into docker-machine-master and run docker stack deploy -c deploy/docker-compose.yml --with-registry-auth production and I get this error:
this node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again
My guess is the swarm manager went down at some point and this new instance spun up somehow, keeping the same information/configuration minus the swarm manager info. Maybe the internal IP changed or something. I'm making that guess because the launch times differ by months; the production-01 instance was launched 6 months earlier. I wouldn't know for sure because I am new to AWS, Docker, and this project.
I want to deploy code changes to the production-01 instance but I don't have ssh keys to do so. Also, my hunch is that production-01 is a replica noted in the docker-compose.yml file.
I'm the only dev on this project so any help would be much appreciated.
Here's a copy of my docker-compose.yml file with names changed.
version: '3'
services:
  database:
    image: postgres:10
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    deploy:
      replicas: 1
    volumes:
      - db:/var/lib/postgresql/data
  aservicename:
    image: 123.456.abc.amazonaws.com/reponame
    ports:
      - 80:80
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DJANGO_SETTINGS_MODULE: name.settings.production
      DEBUG: "true"
    deploy:
      mode: global
    logging:
      driver: awslogs
      options:
        awslogs-group: aservicename
  cron:
    image: 123.456.abc.amazonaws.com/reponame
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DOCKER_SETTINGS_MODULE: name.settings.production
    deploy:
      replicas: 1
    command: /name/deploy/someshellfile.sh
    logging:
      driver: awslogs
      options:
        awslogs-group: cron
networks:
  default:
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: 192.168.100.0/24
volumes:
  db:
    driver: rexray/ebs
I'll assume you only have the one manager, and that production-01 is a worker.
If docker info shows Swarm: inactive and you don't have backups of the Swarm raft log, then you'll need to create a new swarm with docker swarm init.
Be sure it has the rexray/ebs driver by checking docker plugin ls. All nodes will need that plugin driver to use the db volume.
If you can't SSH to production-01 then there will be no way to have it leave and join the new swarm. You'd need to deploy a new worker node and shutdown that existing server.
Then you can docker stack deploy that app again and it should reconnect the db volume.
Note 1: Don't redeploy the stack on new servers if it's still running on the production-01 worker, as it would fail because the ebs volume for db will still be connected to production-01.
Note 2: For anything beyond learning, it's best to run three managers (managers are also workers by default). That way, if one node gets killed, you still have a working solution.
