I'm running a Redis container with no replication. When I let the server run for some time, it starts repeating this error:
Timeout connecting to the MASTER... Reconnecting to MASTER
xxx.xxx.xxx.xxx:8886 after failure MASTER <-> REPLICA sync started Non blocking connect for SYNC fired the event. Master replied to PING,
replication can continue...
I also tried to set up a replica, which produced the same error plus an additional line:
Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
redis:
  image: docker.io/bitnami/redis:7.0
  environment:
    # ALLOW_EMPTY_PASSWORD is recommended only for development.
    - ALLOW_EMPTY_PASSWORD=yes
    - REDIS_DISABLE_COMMANDS=FLUSHDB,FLUSHALL
    - REDIS_REPLICATION_MODE=master
  ports:
    - '6379:6379'
  volumes:
    - 'redis_data:/bitnami/redis/data'
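For anything beyond local development, a minimal sketch of the same service with a password instead of ALLOW_EMPTY_PASSWORD (this uses the REDIS_PASSWORD variable the Bitnami image supports; the value is a placeholder, and everything else stays as above):

redis:
  image: docker.io/bitnami/redis:7.0
  environment:
    # Placeholder secret; replace with a real value or a secret reference.
    - REDIS_PASSWORD=change-me
    - REDIS_DISABLE_COMMANDS=FLUSHDB,FLUSHALL
    - REDIS_REPLICATION_MODE=master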
I'm experiencing the same problem, wondering what it can be.
Related
I started a Redis via Docker on my server (managed with Traefik), but my Redis always ends up failing and I don't understand why.
The only way I've found to get it running again is to delete /data/dump.rdb and restart Redis.
* Connecting to MASTER 19XXXXXXXXX
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
* Master replied to PING, replication can continue...
* Partial resynchronization not possible (no cached master)
* Full resync from master: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ:1
* MASTER <-> REPLICA sync: receiving 54992 bytes from master to disk
* MASTER <-> REPLICA sync: Flushing old data
* MASTER <-> REPLICA sync: Loading DB in memory
# Wrong signature trying to load DB from file
# Failed trying to load the MASTER synchronization DB from disk: Invalid argument
* Reconnecting to MASTER 19XXXXXXXXX after failure
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
# Failed to read response from the server: Connection reset by peer
# Master did not respond to command during SYNC handshake
* Connecting to MASTER 19XXXXXXXXX
* MASTER <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
Redis Docker Compose:
version: '3'
services:
  redis:
    image: "redis:7.0"
    container_name: "redis"
    command: "redis-server"
    networks:
      - "traefik"
    restart: "always"
    ports:
      - '6379:6379'
    volumes:
      - "/root/redis/data:/data"
    labels:
      - "traefik.enable=true"
      - "com.centurylinklabs.watchtower.enable=false"
Sample docker-compose.yml file for running a Redis server using Docker Compose
version: '3'
services:
  redis:
    image: redis
    ports:
      - "6379:6379"
The above YAML uses the official redis image from Docker Hub, exposes the service on port 6379, and maps it to port 6379 on the host machine.
It's a minimal example; hope it helps.
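If you also want the data to survive container restarts, a slightly extended sketch (the redis_data volume name is just an example; /data is where the official image writes its dump.rdb):

version: '3'
services:
  redis:
    image: redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data   # persist the dump across container recreation
volumes:
  redis_data: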
Context: I have a master-worker system built on a Celery + RabbitMQ stack.
The system is dockerized (the worker service is not shown here):
version: '2'
services:
  rabbit:
    hostname: rabbit
    image: rabbitmq:latest
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=mypass
    ports:
      - "5672:5672"
  master:
    build:
      context: .
      dockerfile: dockerfile
    volumes:
      - .:/app
    links:
      - rabbit
    depends_on:
      - rabbit
When I execute docker-compose up, everything is OK!
Problem: I can't use docker-compose up; I need to run docker-compose up master and docker-compose up worker (two separate commands for the master and worker machines). So when I execute docker-compose up master, the container launches, but it hangs!
Research: I have found out that it hangs on task submission: result = longtime_add.delay(count), where longtime_add is a task.
Full code: https://github.com/waryak/MastersDiploma/tree/vlad
Also, please edit the title; I feel it needs a clearer one.
A couple of quick points: (1) I didn't see the expected output messages for the producer broker URL that you have on GitHub; (2) I couldn't find where /src/network was added to your PYTHONPATH; and (3) the code that loads the producer broker URL in celery.py looks wrong, as it is looking for the CONFIG variable and not PRODUCE_BROKER_URL as it is named in the variables.env file. The reason the producer would time out is that it can't connect to the broker, so you're on the right track by printing out the producer and worker broker URLs. It may be easier for you to try hardcoding the broker URL in the producer first:
from celery import Celery

# Credentials, hostname, and port taken from the compose file above.
app = Celery(broker='amqp://admin:mypass@rabbit:5672')
app.send_task(name='messages.tasks.longtime_add', kwargs={})
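If you'd rather keep the URL in variables.env than hardcode it, a minimal sketch of wiring that file through Compose so the producer can read PRODUCE_BROKER_URL from its environment (the URL value shown is illustrative):

master:
  build:
    context: .
    dockerfile: dockerfile
  env_file:
    - variables.env   # assumed to contain PRODUCE_BROKER_URL=amqp://admin:mypass@rabbit:5672
  depends_on:
    - rabbit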
I just tried docker-compose up rabbit master and it worked. That's strange, because I can see no external differences in the broker logs or any other logs. Also, the Docker documentation assured me that all dependent services would be launched...
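Worth noting: depends_on only controls start order; it does not wait for RabbitMQ to actually accept connections. A sketch of gating the master on broker readiness, assuming Compose file format 2.1 and that rabbitmqctl status is an acceptable readiness probe:

version: '2.1'
services:
  rabbit:
    hostname: rabbit
    image: rabbitmq:latest
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=mypass
    healthcheck:
      test: ["CMD", "rabbitmqctl", "status"]   # exits non-zero until the broker is up
      interval: 10s
      timeout: 10s
      retries: 5
  master:
    build:
      context: .
      dockerfile: dockerfile
    depends_on:
      rabbit:
        condition: service_healthy   # wait for the healthcheck, not just container start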
I'm trying to follow a basic tutorial to start using RabbitMQ, converting from "docker run" to docker-compose. http://josuelima.github.io/docker/rabbitmq/cluster/2017/04/19/setting-up-a-rabbitmq-cluster-on-docker.html
Here's my docker-compose file:
version: '3'
services:
  rabbit1:
    image: rabbitmq:3.6.6-management
    restart: unless-stopped
    hostname: rabbit1
    ports:
      - "4369:4369"
      - "5672:5672"
      - "15672:15672"
      - "25672:25672"
      - "35197:35197"
    environment:
      - RABBITMQ_USE_LONGNAME=true
      - RABBITMQ_LOGS=/var/log/rabbitmq/rabbit.log
    volumes:
      - "/nfs/docker/rabbit/data1:/var/lib/rabbitmq"
      - "/nfs/docker/rabbit/data1/logs:/var/log/rabbitmq"
Trying to see if I can connect (and also remove the guest account), I get this error:
Error: unable to connect to node rabbit@rabbit1: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@rabbit1]
rabbit@rabbit1:
* connected to epmd (port 4369) on rabbit1
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
* suggestion: is the Erlang distribution using TLS?
current node details:
- node name: 'rabbitmq-cli-41@rabbit1.no-domain'
- home dir: /var/lib/rabbitmq
- cookie hash: WjJle1otRdldm4Wso6HGfg==
Looking at the persistent data, it doesn't appear to be creating a cookie (whether or not I use the RABBITMQ_ERLANG_COOKIE variable), and I'm not convinced that the domain is being handled properly.
RabbitMQ docs are useless for this.
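For what it's worth, a minimal sketch of pinning the cookie explicitly via the RABBITMQ_ERLANG_COOKIE variable mentioned above (only the cookie-related parts are shown; the value is a placeholder):

version: '3'
services:
  rabbit1:
    image: rabbitmq:3.6.6-management
    hostname: rabbit1
    environment:
      # Placeholder value; every node (and any rabbitmqctl invocation)
      # must use exactly the same cookie.
      - RABBITMQ_ERLANG_COOKIE=some-shared-secret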
I have two AWS instances:
production-01
docker-machine-master
I SSH into docker-machine-master and run docker stack deploy -c deploy/docker-compose.yml --with-registry-auth production, and I get this error:
this node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again
My guess is the swarm manager went down at some point and this new instance spun up somehow, keeping the same information/configuration minus the swarm manager info. Maybe the internal IP changed or something. I'm making that guess because the launch times differ by months; the production-01 instance was launched 6 months earlier. I wouldn't know for sure, because I am new to AWS, Docker, and this project.
I want to deploy code changes to the production-01 instance, but I don't have SSH keys to do so. Also, my hunch is that production-01 is a replica noted in the docker-compose.yml file.
I'm the only dev on this project so any help would be much appreciated.
Here's a copy of my docker-compose.yml file with names changed.
version: '3'
services:
  database:
    image: postgres:10
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    deploy:
      replicas: 1
    volumes:
      - db:/var/lib/postgresql/data
  aservicename:
    image: 123.456.abc.amazonaws.com/reponame
    ports:
      - 80:80
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DJANGO_SETTINGS_MODULE: name.settings.production
      DEBUG: "true"
    deploy:
      mode: global
    logging:
      driver: awslogs
      options:
        awslogs-group: aservicename
  cron:
    image: 123.456.abc.amazonaws.com/reponame
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DOCKER_SETTINGS_MODULE: name.settings.production
    deploy:
      replicas: 1
    command: /name/deploy/someshellfile.sh
    logging:
      driver: awslogs
      options:
        awslogs-group: cron
networks:
  default:
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: 192.168.100.0/24
volumes:
  db:
    driver: rexray/ebs
I'll assume you only have the one manager, and that production-01 is a worker.
If docker info shows Swarm: inactive and you don't have backups of the Swarm raft log, then you'll need to create a new swarm with docker swarm init.
Be sure it has the rexray/ebs driver by checking docker plugin ls. All nodes will need that plugin driver to use the db volume.
If you can't SSH to production-01, then there will be no way to have it leave and join the new swarm. You'd need to deploy a new worker node and shut down that existing server.
Then you can docker stack deploy that app again and it should reconnect the db volume.
Note 1: Don't redeploy the stack on new servers if it's still running on the production-01 worker; it would fail because the EBS volume for db will still be attached to production-01.
Note 2: For anything beyond learning, it's best to run three managers (managers are also workers by default). That way, if one node gets killed, you still have a working swarm.
I want to send logs from one Rancher service (e.g. my_service) to another Rancher service running the ELK stack, using the syslog driver.
I am setting up my stack via docker-compose, more or less as follows:
elk-custom:
  # image: elk-custom
  build:
    context: .
    dockerfile: Dockerfile-elk
  ports:
    - 5601:5601
    - 9200:9200
    - 5044:5044
    - 5151:5151
    - 5152:5152

my_service:
  image: some_image_from_my_local_registry
  depends_on:
    - elk-custom
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://elk-custom:514"
However, on the stack dashboard, for my_service I get:
my_service (Expected state running but got error: Error response from daemon: failed to initialize logging driver: dial tcp: lookup elk-custom on 10.0.2.3:53: server misbehaving)
Is there anything additional needed to make the specific logging (elk-custom) service discoverable?
A few things are going on there that are problematic. First, if you are doing a build, you have to use either a Git or HTTP remote URL or an S3-based context.
Please see the docs: http://rancher.com/docs/rancher/v1.6/en/cattle/rancher-compose/#builds
Typically, you build an image and deploy it as a service; builds are more of a development concern and are less used on the production side of things.
The next thing is that in a multi-host setup, you will have trouble routing on the Rancher network. I would recommend deploying a Logstash collector on every node, with its syslog ingress in host networking mode. Then you can point the logging driver at the localhost syslog target. Each Logstash collector would then forward either to another Logstash for filtering or straight to the Elasticsearch cluster.
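A rough sketch of that layout, assuming Compose v2 syntax (the image tag, port, and pipeline details are illustrative, not taken from the original stack):

version: '2'
services:
  logstash-collector:
    image: logstash:5            # illustrative tag; any Logstash with a syslog/tcp input works
    network_mode: host           # the syslog ingress listens directly on each node
    # the Logstash pipeline (not shown) would listen on tcp/5140 and forward
    # to the Elasticsearch cluster (or to another Logstash for filtering)

  my_service:
    image: some_image_from_my_local_registry
    logging:
      driver: syslog
      options:
        # localhost resolves on the node the container runs on,
        # so the Docker daemon no longer has to look up elk-custom
        # when it initializes the logging driver
        syslog-address: "tcp://localhost:5140"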