Elastic in docker stack/swarm - docker

I have a swarm of two nodes:
[ra@speechanalytics-test ~]$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
mlwwmkdlzbv0zlapqe1veq3uq speechanalytics-preprod Ready Active 18.09.3
se717p88485s22s715rdir9x2 * speechanalytics-test Ready Active Leader 18.09.3
I am trying to run a container with Elasticsearch in a stack. Here is my docker-compose.yml file:
version: '3.4'
services:
  elastic:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.7.0
    environment:
      - cluster.name=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    deploy:
      placement:
        constraints:
          - node.hostname==speechanalytics-preprod
volumes:
  esdata:
    driver: local
After starting the stack with
docker stack deploy preprod -c docker-compose.yml
the container crashes within 20 seconds:
docker service logs preprod_elastic
...
| OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
| OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0
| [2019-04-03T16:41:30,044][WARN ][o.e.b.JNANatives ] [unknown] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
| [2019-04-03T16:41:30,049][WARN ][o.e.b.JNANatives ] [unknown] This can result in part of the JVM being swapped out.
| [2019-04-03T16:41:30,049][WARN ][o.e.b.JNANatives ] [unknown] Increase RLIMIT_MEMLOCK, soft limit: 16777216, hard limit: 16777216
| [2019-04-03T16:41:30,050][WARN ][o.e.b.JNANatives ] [unknown] These can be adjusted by modifying /etc/security/limits.conf, for example:
| OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
| # allow user 'elasticsearch' mlockall
| OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0
| elasticsearch soft memlock unlimited
| [2019-04-03T16:41:02,949][WARN ][o.e.b.JNANatives ] [unknown] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
| elasticsearch hard memlock unlimited
| [2019-04-03T16:41:02,954][WARN ][o.e.b.JNANatives ] [unknown] This can result in part of the JVM being swapped out.
| [2019-04-03T16:41:30,050][WARN ][o.e.b.JNANatives ] [unknown] If you are logged in interactively, you will have to re-login for the new limits to take effect.
| [2019-04-03T16:41:02,954][WARN ][o.e.b.JNANatives ] [unknown] Increase RLIMIT_MEMLOCK, soft limit: 16777216, hard limit: 16777216
On both nodes I have:
ra@speechanalytics-preprod:~$ sysctl vm.max_map_count
vm.max_map_count = 262144
Any ideas how to fix this?

The memlock errors you're seeing from Elasticsearch are a common issue that is not unique to Docker; they occur when Elasticsearch is told to lock its memory but is unable to do so. You can circumvent the error by removing the following environment variable from the docker-compose.yml file:
- bootstrap.memory_lock=true
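With that line removed, the environment section of your service would simply be (a minimal sketch of the remaining settings from your file):
environment:
  - cluster.name=single-node
  - "ES_JAVA_OPTS=-Xms512m -Xmx512m"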
Memlock may be used with Docker Swarm Mode, but with some caveats.
Not all options that work with docker-compose (Docker Compose) work with docker stack deploy (Docker Swarm Mode), and vice versa, despite both sharing the docker-compose YAML syntax. One such option is ulimits:, which, when used with docker stack deploy, will be ignored with a warning message like so:
Ignoring unsupported options: ulimits
My guess is that with your docker-compose.yml file, Elasticsearch runs fine with docker-compose up, but not with docker stack deploy.
With Docker Swarm Mode, by default, the Elasticsearch instance as you have defined it will have trouble with memlock. Setting ulimits for Docker Swarm services is currently not officially supported. There are ways to work around the issue, though.
If the host is Ubuntu, unlimited memlock can be enabled across the docker service (see here and here). This can be achieved via the commands:
echo -e "[Service]\nLimitMEMLOCK=infinity" | SYSTEMD_EDITOR=tee systemctl edit docker.service
systemctl daemon-reload
systemctl restart docker
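If you prefer to write the drop-in by hand instead of piping through systemctl edit, the equivalent override (a sketch; systemctl edit normally stores it at the standard drop-in path shown in the comment) looks like this, followed by the same daemon-reload and restart:
# /etc/systemd/system/docker.service.d/override.conf
[Service]
LimitMEMLOCK=infinity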
However, setting memlock to infinity is not without its drawbacks, as spelt out by Elastic themselves here.
Based on my testing, the solution works on Docker 18.06, but not on 18.09. Given the inconsistency and the possibility of Elasticsearch failing to start, the better option would be to not use memlock with Elasticsearch when deploying on Swarm. Instead, you can opt for any of the other methods mentioned in Elasticsearch Docs to achieve similar results.
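For example, one of the alternatives in those docs is to avoid swapping at the host level rather than locking memory; a rough sketch (adapt to your distribution, and persist via /etc/fstab and /etc/sysctl.conf if you want it to survive reboots):
sudo swapoff -a                   # disable swap entirely until the next reboot
sudo sysctl -w vm.swappiness=1    # or just make the kernel very reluctant to swap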

Related

GitLab docker cannot fork "Resource temporarily unavailable"

I tried to run a gitlab-ce Docker container on an Ubuntu server, version 22.04.
The log output of docker logs --follow gitlab results in
execute[/opt/gitlab/bin/gitlab-ctl start alertmanager] action run
[execute] /opt/gitlab/bin/gitlab-ctl: fork: retry: Resource temporarily unavailable
even though monitoring with htop shows enough memory available. Docker exited with error code 137. My docker-compose.yml file looks like:
version: "3.7"
gitlab:
image: gitlab/gitlab-ce:latest
container_name: gitlab
restart: "no"
ports:
- "8929:8929"
- "2289:22"
hostname: "gitlab.example.com"
environment:
GITLAB_OMNIBUS_CONFIG: |
external_url "https://gitlab.example.com"
nginx['listen_port'] = 8929
nginx['listen_https'] = false
gitlab_rails['gitlab_shell_ssh_port'] = 2289
volumes:
- ./volumes/gitlab/config:/etc/gitlab
- ./volumes/gitlab/logs:/var/log/gitlab
- ./volumes/gitlab/data:/var/opt/gitlab
shm_size: "256m"
I am using docker version 20.10.16. Other images work fine with docker. The output of ulimit -a is
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1029348
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 62987
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I had the same problem with a vServer which looks pretty much like your machine.
I guess that the problem is a limit on the number of processes that can run at the same time. You are probably limited to 400, but you need more than that to run your Compose network.
cat /proc/user_beancounters | grep numproc
The response is formatted like this: held, maxheld, barrier, limit
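For illustration only (hypothetical numbers), a host that is about to hit the limit might show something like this, with held almost at the barrier/limit:
numproc    392    400    400    400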
If you run this command, you should see that you are very close to the limit (if my assumption is right).
Check out this link; they talk about Java, but the general problem is the same:
https://debianforum.de/forum/viewtopic.php?t=180774

Db2 on docker swarm memory consumption

I am using DB2 on Docker with a non-root installation.
Even if I set INSTANCE_MEMORY to the minimum, it seems to reserve 4G of RAM on the server.
How can DB2 be made aware of the limits set in the docker-compose file, given that I run the database in a Docker Swarm cluster as a stack?
The DB2 version is 11.1.4FP4.
docker --version
Docker version 18.03.1-ce, build 9ee9f40
When I look at the docker stats, it uses only about 80MiB.
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8282c5d0c9e7 db2wse_db2.1.waalf6vljuapnxlvzhf2cb0uv 0.21% 76.83MiB / 1GiB 7.50% 0B / 0B 408MB / 6.86GB 56
My docker-compose.yml file
version: "3.3"
services:
db2:
image: docker-registry.ju.globaz.ch:5000/db2wse:11.1.4fp4
networks:
- network-db
deploy:
mode: replicated
replicas: 1
resources:
limits:
memory: 1G
networks:
network-db:
external: true
Any idea? It is very frustrating.
Thanks a lot.
I found the problem.
It was a kernel configuration in /etc/sysctl.conf.
I had this:
cat /etc/sysctl.conf |grep vm.
vm.swappiness=10
vm.overcommit_memory=2
vm.dirty_ratio=2
vm.dirty_background_ratio=1
I removed everything that had been set for DB2 (put back the default configuration), and now I can take advantage of all the RAM on the hosts.
I kept this:
cat /etc/sysctl.conf |grep vm.
vm.swappiness=10
vm.max_map_count=262144
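For reference, after editing /etc/sysctl.conf the remaining values can be re-applied and checked without a reboot:
sudo sysctl -p                  # reload /etc/sysctl.conf
sysctl vm.overcommit_memory     # should now report the kernel default (0 on most systems)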
Thanks

Neo4j Docker Insufficient Memory

I'm having this weird issue with Neo4j in Docker. This is my docker-compose file:
version: '3'
services:
  neo4j:
    ports:
      - "7473:7473"
      - "7474:7474"
      - "7687:7687"
    volumes:
      - neo4j_data:/data
    image: neo4j:3.3
volumes:
  neo4j_data: {}
I'm using Docker Toolbox on Windows 10. I have tested this on two different machines and it works perfectly. However, on one machine, the container always crashes a few seconds after creation. Here's the log for this container:
$ docker container logs database_neo4j_1
Active database: graph.db
Directories in use:
home: /var/lib/neo4j
config: /var/lib/neo4j/conf
logs: /var/lib/neo4j/logs
plugins: /var/lib/neo4j/plugins
import: /var/lib/neo4j/import
data: /var/lib/neo4j/data
certificates: /var/lib/neo4j/certificates
run: /var/lib/neo4j/run
Starting Neo4j.
2018-11-18 12:50:41.954+0000 WARN Unknown config option: causal_clustering.discovery_listen_address
2018-11-18 12:50:41.965+0000 WARN Unknown config option: causal_clustering.raft_advertised_address
2018-11-18 12:50:41.965+0000 WARN Unknown config option: causal_clustering.raft_listen_address
2018-11-18 12:50:41.967+0000 WARN Unknown config option: ha.host.coordination
2018-11-18 12:50:41.968+0000 WARN Unknown config option: causal_clustering.transaction_advertised_address
2018-11-18 12:50:41.968+0000 WARN Unknown config option: causal_clustering.discovery_advertised_address
2018-11-18 12:50:41.969+0000 WARN Unknown config option: ha.host.data
2018-11-18 12:50:41.970+0000 WARN Unknown config option: causal_clustering.transaction_listen_address
2018-11-18 12:50:42.045+0000 INFO ======== Neo4j 3.3.9 ========
2018-11-18 12:50:42.275+0000 INFO Starting...
2018-11-18 12:50:48.632+0000 INFO Bolt enabled on 0.0.0.0:7687.
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 262160 bytes for Chunk::new
# An error report file with more information is saved as:
# /var/lib/neo4j/hs_err_pid6.log
#
# Compiler replay data is saved as:
# /var/lib/neo4j/replay_pid6.log
Looking at the additional log file /var/lib/neo4j/hs_err_pid6.log revealed the following information:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 262160 bytes for Chunk::new
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (allocation.cpp:390), pid=6, tid=0x00007fee96f9bae8
#
# JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.9.0
# Distribution: Custom build (Tue Oct 23 11:27:22 UTC 2018)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
As it turns out, my Docker machine was set to only 1GB of RAM, while the minimum requirement for Neo4j (according to their website) is 2GB. I was able to solve the problem by replacing my default Docker machine according to this guide and giving the new one 4GB of memory.
Essentially, I did the following:
$ docker-machine rm default
$ docker-machine create -d virtualbox --virtualbox-cpu-count=2 --virtualbox-memory=4096 --virtualbox-disk-size=50000 default
You may also need to restart Docker:
docker-machine stop
exit
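To confirm the new machine really has the larger allocation, something like the following should work (a sketch; the --format flag needs a reasonably recent Docker client):
docker-machine start default
eval $(docker-machine env default)       # point the current shell at the new machine
docker info --format '{{.MemTotal}}'     # total memory visible to the engine, in bytes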
I haven't found anything about this problem online so far, so maybe this helps someone someday =).

cannot connect to container with docker-compose

I'm using docker 1.12 and docker-compose 1.12, on OSX.
I created a docker-compose.yml file which runs two containers:
- the first, named spark, builds and runs a sparkjava application
- the second, named behave, runs some functional tests on the API exposed by the first container.
version: "2"
services:
behave:
build:
context: ./src/test
container_name: "behave"
links:
- spark
depends_on:
- spark
entrypoint: ./runtests.sh spark:9000
spark:
build:
context: ./
container_name: "spark"
ports:
- "9000:9000"
As recommended by the Docker Compose documentation, I use a simple shell script to test if the spark server is ready. This script is named runtests.sh and runs in the container named "behave". It is launched by docker-compose (see above):
#!/bin/bash
# This scripts waits for the API server to be ready before running functional tests with Behave
# the parameter should be the hostname for the spark server
set -e
host="$1"
echo "runtests host is $host"
until curl -L "http://$host"; do
    >&2 echo "Spark server is not ready - sleeping"
    sleep 5
done
>&2 echo "Spark server is up - starting tests"
behave
The DNS resolution does not seem to work. curl makes a request to spark.com instead of a request to my container named "spark".
UPDATE:
By setting an alias for my link (links: -spark:myserver), I've seen the DNS resolution is not done by Docker: I received an error message from a corporate network equipment (I'm running this from behind a corporate proxy, with Docker for Mac). Here is an extract of the output:
Recreating spark
Recreating behave
Attaching to spark, behave
behave | runtests host is myserver:9000
behave | % Total % Received % Xferd Average Speed Time Time Time Current
behave | Dload Upload Total Spent Left Speed
100 672 100 672 0 0 348 0 0:00:01 0:00:01 --:--:-- 348
behave | <HTML><HEAD>
behave | <TITLE>Network Error</TITLE>
behave | </HEAD>
behave | <BODY>
behave | ...
behave | <big>Network Error (dns_unresolved_hostname)</big>
behave | Your requested host "myserver" could not be resolved by DNS.
behave | ...
behave | </BODY></HTML>
behave | Spark server is up - starting tests
To solve this, I added a no_proxy environment variable listing the name of the container I wanted to reach.
In the Dockerfile for the behave container, I have:
ENV http_proxy=http://proxy.mycompany.com:8080
ENV https_proxy=http://proxy.mycompany.com:8080
ENV no_proxy=127.0.0.1,localhost,spark
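The same variables could also be set from docker-compose.yml instead of baking them into the image, e.g. for the behave service (a sketch assuming the same proxy host as in the Dockerfile):
  behave:
    build:
      context: ./src/test
    environment:
      - http_proxy=http://proxy.mycompany.com:8080
      - https_proxy=http://proxy.mycompany.com:8080
      - no_proxy=127.0.0.1,localhost,spark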

docker-compose swarm: force containers to run on specific hosts

I am trying to run a cluster application on different virtual machines using standalone Swarm and docker-compose file version '2'. The overlay network is set up, but I want to force certain containers to run on specific hosts.
The documentation gives the following advice, but with this parameter I was not able to start any container at all:
environment:
  - "constraint:node==node-1"
ERROR: for elasticsearch1 Cannot create container for service elasticsearch1: Unable to find a node that satisfies the following conditions
[available container slots]
[node==node-1]
Should we register hosts as node-1, node-2, ... or is that done by default?
[root@ux-test14 ~]# docker node ls
Error response from daemon: 404 page not found
[root@ux-test14 ~]# docker run swarm list
[root@ux-test14 ~]#
[root@ux-test14 ~]# docker info
Containers: 8
Running: 6
Paused: 0
Stopped: 2
Images: 8
Server Version: swarm/1.2.5
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 2
ux-test16.rs: 10.212.212.2:2375
 └ ID: JQPG:GKFF:KJZJ:AY3N:NHPZ:HD6J:SH36:KEZR:2SSH:XF65:YW3N:W4DG
 └ Status: Healthy
 └ Containers: 4 (4 Running, 0 Paused, 0 Stopped)
 └ Reserved CPUs: 0 / 2
 └ Reserved Memory: 0 B / 3.888 GiB
 └ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
 └ UpdatedAt: 2016-09-05T11:11:31Z
 └ ServerVersion: 1.12.1
ux-test17.rs: 10.212.212.3:2375
 └ ID: Z27V:T5NU:QKSH:DLNK:JA4M:V7UX:XYGH:UIL6:WFQU:FB5U:J426:7XIR
 └ Status: Healthy
 └ Containers: 4 (2 Running, 0 Paused, 2 Stopped)
 └ Reserved CPUs: 0 / 2
 └ Reserved Memory: 0 B / 3.888 GiB
 └ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
 └ UpdatedAt: 2016-09-05T11:11:17Z
 └ ServerVersion: 1.12.1
Plugins:
Volume:
Network:
Swarm:
NodeID:
Is Manager: false
Node Address:
Security Options:
Kernel Version: 3.10.0-327.28.3.el7.x86_64
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 7.775 GiB
Name: 858ac2fdd225
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
WARNING: No kernel memory limit support
My first answer was about "swarm mode". You've since clarified that you're using legacy Swarm and added more info, so:
The constraint you list assumes that you have a host named node-1, but your hosts are named ux-test16.rs and ux-test17.rs. Just use those names instead of node-1 in your constraint. E.g.:
environment:
  - "constraint:node==ux-test16.rs"
The environment-variable constraint is only valid for the legacy (standalone) version of Swarm. The newer "Swarm Mode" uses the mode or constraint options instead (not environment variables).
To enforce one and only one task (container) per node, use mode=global.
docker service create --name proxy --mode global nginx
The default mode is replicated which means that the swarm manager will create tasks (containers) across all available nodes to meet the number specified in the --replicas option. Eg:
docker service create --name proxy --replicas 5 nginx
To enforce other constraints based on hostname (node), label, role, or ID, use the --constraint option. E.g.:
docker service create --name proxy --constraint "node.hostname!=node01" nginx
See https://docs.docker.com/engine/reference/commandline/service_create/#/specify-service-constraints
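If you would rather not tie a service to a literal hostname, swarm mode also lets you label nodes and constrain on the label; a sketch with a made-up label name (role=search):
docker node update --label-add role=search <node-hostname>
docker service create --name proxy --constraint "node.labels.role==search" nginx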
EDIT sept 2016:
Something else. docker-compose is not currently supported in "swarm mode". Swarm mode understands the new dab format instead. There is a way to convert docker-compose files to dab but it's experimental and not to be relied on at this point. It's better to create a bash script that calls all the docker service create ... directly.
EDIT March 2017:
As of docker 1.13 (17.03), docker-compose can now be used to provision swarm environments directly without having to deal with the dab step.
Related issue - I had a recent Swarm project with a mixture of worker nodes (3 x Linux + 4 x Windows). My containers needed to run on a specific OS, but not on any specific node. Swarm mode now supports specifying an OS under "constraints" in docker-compose files. No need to create labels for each node:
version: '3'
services:
  service_1:
    restart: on-failure
    image: 'service_1'
    deploy:
      placement:
        constraints:
          - node.platform.os == windows
  junittestsuite:
    restart: on-failure
    image: 'junit_test_suite:1.0'
    command: ant test ...
    deploy:
      placement:
        constraints:
          - node.platform.os == linux
