Limit MarkLogic memory consumption in docker container

The project I am working on develops a Java service that uses MarkLogic 9 in the backend.
We are running a Jenkins build server that executes, among other things, several MarkLogic tests written in XQuery.
For those tests MarkLogic is running in a docker container on the Jenkins host (which is running Ubuntu Linux).
The Jenkins host has 12 GB of RAM and 8 GB of swap configured.
Recently I have noticed that the MarkLogic instance running in the container uses a huge amount of RAM (up to 10 GB).
As there are often other build jobs running in parallel, the Jenkins host starts to swap, sometimes even eating up all swap,
so that MarkLogic reports it cannot get more memory.
Obviously, this situation leads to failed builds quite often.
To analyse this further I ran some tests on my PC running Docker for Windows and found that the MarkLogic tests
can be run successfully with 5-6 GB of RAM. The MarkLogic logs show that it sees all of the host's memory and wants to use all of it.
But as we have other build processes running on that host this behaviour is not desirable.
My question: is there any way to tell MarkLogic not to use so much memory?
We are preparing the Docker image during the build, so we could modify some configuration, but it has to be scriptable somehow.

The issue of the container not detecting the memory limit correctly has been identified, and should be addressed in a forthcoming release.
In the meantime, you might be able to mitigate the issue by:
changing the group cache sizing from automatic to manual and setting cache sizes appropriate for the allocated resources. There are a variety of ways to set these configs, whether deploying and setting configs from an ml-gradle project, making your own Management API REST calls (a sketch follows below), or programmatically:
admin:group-set-cache-sizing
admin:group-set-compressed-tree-cache-partitions
admin:group-set-compressed-tree-cache-size
admin:group-set-expanded-tree-cache-partitions
admin:group-set-expanded-tree-cache-size
admin:group-set-list-cache-partitions
admin:group-set-list-cache-size
reducing the in-memory-limit
The in-memory-limit setting specifies the maximum number of fragments in an in-memory stand. An in-memory stand contains the latest version of any new or changed fragments. Periodically, in-memory stands are written to disk as a new stand in the forest. Also, if a stand accumulates a number of fragments beyond this limit, it is automatically saved to disk by a background thread.
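
For the manual cache sizing route, here is a minimal Python sketch using the MarkLogic Management REST API. The host, port, credentials, and the concrete property names and values are assumptions; check the /manage/v2/groups documentation for your MarkLogic version before reusing it.

import requests
from requests.auth import HTTPDigestAuth

# Assumed endpoint and credentials for the MarkLogic Management API (port 8002).
MANAGE_URL = "http://localhost:8002/manage/v2/groups/Default/properties"
AUTH = HTTPDigestAuth("admin", "admin")

# Example values only: switch to manual sizing and cap the main group caches (sizes in MB).
properties = {
    "cache-sizing": "manual",
    "list-cache-size": 1024,
    "compressed-tree-cache-size": 512,
    "expanded-tree-cache-size": 1024,
}

resp = requests.put(MANAGE_URL, json=properties, auth=AUTH)
resp.raise_for_status()
print("Group cache properties updated; MarkLogic may restart to apply them.")

Since the image is prepared during the build, a script like this (or the equivalent curl calls) can be run right after the container comes up.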

Related

Parallel Docker Container Creation

I am using a Docker setup that consists of 14 different containers. Every container gets a cpu_limit of 2 and a mem_limit of 2g.
To create and run these containers, I've written a Python script that uses the docker-py library. As of now, the containers are created sequentially, which takes approximately 2 minutes.
Now I'm thinking about parallelizing the process. So instead of doing this (it's pseudocode):
for container in containers_to_start:
    create_container(container)
I do
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
pool.map(create_container, containers_to_start)
And as a result the 14 containers are created 2x faster. BUT: the applications within the containers take significantly longer to boot. At the end of the day, I don't really gain much: the time until every application is reachable is more or less the same, with or without multithreading.
But I don't really know why, because every container gets the same amount of CPU and memory resources, so I would expect the same boot time no matter how many containers are starting at the same time. Clearly this is not the case. Maybe I'm missing some knowledge here; any explanation would be greatly appreciated.
System Specs
CPU: Intel i7 @ 2.90 GHz
32 GB RAM
I am using Windows 10 with Docker installed on the WSL2 backend.
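
For reference, a self-contained sketch of the threaded approach described above, using the docker SDK for Python (docker-py); the image, command, and resource values are placeholders rather than the actual setup from the question:

import docker
from multiprocessing.dummy import Pool as ThreadPool

client = docker.from_env()

# Placeholder specs standing in for the 14 real services.
containers_to_start = [
    {"name": f"service-{i:02d}", "image": "alpine", "command": "sleep 300"}
    for i in range(14)
]

def create_container(spec):
    # Apply roughly the limits from the question: 2 CPUs and 2 GB of RAM per container.
    return client.containers.run(
        spec["image"],
        spec["command"],
        name=spec["name"],
        detach=True,
        mem_limit="2g",
        nano_cpus=2 * 10**9,  # 2 CPUs
    )

pool = ThreadPool(4)
started = pool.map(create_container, containers_to_start)
pool.close()
pool.join()
print(f"Started {len(started)} containers")

Note that the thread pool only parallelizes the API calls to the Docker daemon; the containers themselves still boot on the same shared host resources.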

Should Docker release all memory when all containers are closed?

I am debugging a possible memory leak in a web service I have running as a Docker network. The service has a Javascript front end, Flask REST API, Dask worker pool, the spaCy natural language toolkit...the works. I see intermittent running-out-of-memory problems and I'm trying to get a handle on what could be going on.
I can run this system on my laptop, a MacBook Pro with 16 GB of memory where I am using Docker Desktop. When there are no containers running, Activity Monitor shows com.docker.hyperkit using about 12 GB. Then I launch the Docker network, which ultimately runs 14 containers to house the various components. I perform a fairly large batch job in the Docker network. It runs for an hour, during which time com.docker.hyperkit's memory creeps up to around 18 GB. This is not surprising; this is a memory-intensive service. But when I stop all the containers in the network, I would expect com.docker.hyperkit's memory usage to drop back to 12 GB. Instead it stays at 18 GB. The only way I can get it back to 12 GB is to restart Docker Desktop.
Is this expected behavior? It looks like a memory leak in Docker.
No, it should not release the memory, and yes, this is expected behavior.
There is no way to run Docker containers natively on macOS, so you run them inside a virtual machine. A VM gets memory assigned to it, which it assigns to processes running inside that VM. When those processes inside the VM exit, the resources are released back to the VM, but not back to the parent macOS. That's just how VMs work, and the fact that it didn't take all of the memory up to the limit specified in the Docker preferences immediately on startup is an impressive feat in itself.
The containers themselves are processes running within this VM, and they will release all of their memory back to the VM upon exit. If you run something like docker run --rm busybox free you'll likely see the memory being used and freed within the VM.
For more details on this, there are several extensive threads in the GitHub issues. Most of the comments on these threads appear to be from users assuming macOS is running the containers, rather than a VM that runs the containers. Even completely idle, that VM will use some resources to run the kernel, container runtime daemons, volume sharing code, port forwarding code, etc. There's a lot of magic under the covers to make Docker not look like a VM to the user, so that you can just pass paths and connect to ports on the macOS side. The most helpful comment in the thread to me is here: https://github.com/moby/hyperkit/issues/231#issuecomment-448416559
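
If you want to script the check mentioned above, here is a tiny sketch using the docker SDK for Python; it runs the same busybox free command the answer suggests, so the figures you see are the Linux VM's memory, not macOS's:

import docker

client = docker.from_env()

# Equivalent of `docker run --rm busybox free`: the output reflects the
# memory of the Linux VM that hosts the containers, not the macOS host.
output = client.containers.run("busybox", "free", remove=True)
print(output.decode())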

Each docker container runs a separate JVM at run-time?

We are running hundreds of Mule Java 8 apps on a machine currently. A single JVM appears to be running that shares hundreds of megabytes of JARs with the apps at run-time. If we were to run each app in a Docker container, would each container run a separate JVM at run-time? If so, this would incur a major RAM penalty!
Yes, each container would run its own java process and thus its own JVM.
You might consider partitioning your apps, though, so that rather than going from many apps on one server/VM to many containers each running one app, you go to some number of containers each running several apps on one server.
Yes, you'd have to duplicate the shared JARs for each container. Yes, you'd increase CPU, RAM and network traffic.
But you'd gain more flexibility in duplicating app servers for scale-out and in moving them onto different machines to better match their CPU, memory and bandwidth needs.

Slow install / upgrade through Helm (for Kubernetes)

Our application consists of circa 20 modules. Each module contains a (Helm) chart with several deployments, services and jobs. Some of those jobs are defined as Helm pre-install and pre-upgrade hooks. Altogether there are probably about 120 yaml files, which eventually result in about 50 running pods.
During development we are running Docker for Windows version 2.0.0.0-beta-1-win75 with Docker 18.09.0-ce-beta1 and Kubernetes 1.10.3. To simplify management of our Kubernetes yaml files we use Helm 2.11.0. Docker for Windows is configured to use 2 CPU cores (of 4) and 8 GB RAM (of 24 GB).
When creating the application environment for the first time, it takes more than 20 minutes to become available. This seems far too slow; we are probably making an important mistake somewhere. We have tried to improve the (re)start time, but to no avail. Any help or insights to improve the situation would be greatly appreciated.
A simplified version of our startup script:
#!/bin/bash
# Start some infrastructure
helm upgrade --force --install infrastructure modules/infrastructure/chart
# Start ~20 modules in parallel
helm upgrade --force --install module01 modules/module01/chart &
[...]
helm upgrade --force --install module20 modules/module20/chart &
await_modules()  # waits for the background helm releases to finish
Executing the same startup script again later to 'restart' the application still takes about 5 minutes. As far as I know, unchanged objects are not modified at all by Kubernetes. Only the circa 40 hooks are run by Helm.
Running a single hook manually with docker run is fast (~3 seconds). Running that same hook through Helm and Kubernetes regularly takes 15 seconds or more.
Some things we have discovered and tried are listed below.
Linux staging environment
Our staging environment consists of Ubuntu with native Docker. Kubernetes is installed through minikube with --vm-driver none.
Contrary to our local development environment, the staging environment retrieves the application code through a (deprecated) gitRepo volume for almost every deployment and job. Understandably, this only seems to worsen the problem. Starting the environment for the first time takes over 25 minutes; restarting it takes about 20 minutes.
We tried replacing the gitRepo volume with a sidecar container that retrieves the application code as a TAR. Although we have not modified the whole application, initial tests indicate this is not particularly faster than the gitRepo volume.
This situation can probably be improved with an alternative type of volume that enables sharing of code between deployments and jobs. We would rather not introduce more complexity, though, so we have not explored this avenue any further.
Docker run time
Executing a single empty alpine container through docker run alpine echo "test" takes roughly 2 seconds. This seems to be overhead of the setup on Windows. That same command takes less than 0.5 seconds on our Linux staging environment.
Docker volume sharing
Most of the containers - including the hooks - share code with the host through a hostPath. The command docker run -v <host path>:<container path> alpine echo "test" takes 3 seconds to run. Using volumes seems to increase runtime by approximately 1 second.
Parallel or sequential
Sequential execution of the commands in the startup script does not improve startup time. Nor does it drastically worsen it.
IO bound?
Windows Task Manager indicates that IO is at 100% when executing the startup script. Our hooks and application code are not IO intensive at all, so the IO load seems to originate from Docker, Kubernetes or Helm. We have tried to find the bottleneck, but were unable to pinpoint the cause.
Reducing IO through ramdisk
To test the premise of being IO bound further, we replaced /var/lib/docker with a ramdisk in our Linux staging environment. Starting the application with this configuration was not significantly faster.
To compare Kubernetes with Docker, you need to consider that Kubernetes will run more or less the same Docker command in a final step. Before that happens, many other things take place:
authentication and authorization, creating objects in etcd, locating suitable nodes for the pods, scheduling them, provisioning storage, and more.
Helm itself also adds overhead to the process, depending on the size of the chart.
I recommend reading One year using Kubernetes in production: Lessons learned. The author explains what they achieved by switching to Kubernetes, as well as the differences in overhead:
Cost calculation
Looking at costs, there are two sides to the story. To run Kubernetes, an etcd cluster is required, as well as a master node. While these are not necessarily expensive components to run, this overhead can be relatively expensive when it comes to very small deployments. For these types of deployments, it’s probably best to use a hosted solution such as Google's Container Service.
For larger deployments, it’s easy to save a lot on server costs. The overhead of running etcd and a master node aren’t significant in these deployments. Kubernetes makes it very easy to run many containers on the same hosts, making maximum use of the available resources. This reduces the number of required servers, which directly saves you money. When running Kubernetes sounds great, but the ops side of running such a cluster seems less attractive, there are a number of hosted services to look at, including Cloud RTI, which is what my team is working on.

What are the recommended settings for the jvm memory of RestHeart?

The documentation does not specify the memory needed for the JVM, nor does the post on performance.
RESTHeart runs on Java 8, and Java 8 uses the Metaspace memory model, which usually does not need JVM memory tuning at all.
We at softinstigate.com usually run RESTHeart with Docker on the AWS ECS service.
We configure RESTHeart threading as follows:
# Number of I/O threads created for non-blocking tasks. at least 2. suggested value: core*2
io-threads: 2
# Number of threads created for blocking tasks (such as ones involving db access). suggested value: core*16
worker-threads: 8
On AWS ECS we set a soft memory limit of 1 GB for the Docker container running RESTHeart, and we have never had a memory issue (even under heavy load).
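
As a rough helper based purely on the rules of thumb in the comments above (io-threads ~ cores * 2, worker-threads ~ cores * 16), candidate values can be derived from the core count like this:

import os

# Derive suggested RESTHeart thread settings from the rules of thumb above.
# This only prints candidate values; it does not configure RESTHeart itself.
cores = os.cpu_count() or 1
io_threads = max(2, cores * 2)   # at least 2
worker_threads = cores * 16
print(f"io-threads: {io_threads}")
print(f"worker-threads: {worker_threads}")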
