Dump Docker Container Memory - docker

I'm working on in-memory encryption of dockerized workloads. Step 1 is to dump the memory of a simple container with zero protection. I'm happy to dump that memory either from the host node or from a terminal inside the container itself.
I've seen a lot of samples, but none of them actually work. See this repo: https://github.com/drcrook1/confidential_k8s_test
Ideally, I can dump the memory, find the value of "my_array" from the run.py file, and also see the value of the env variable "SECRET_ONE".
Thanks for any tips!
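One possible starting point, sketched in Python and assuming a Linux host: a container's processes are ordinary host processes, so their memory can be read through /proc/<pid>/maps and /proc/<pid>/mem, and their initial environment through /proc/<pid>/environ. The host PID can be looked up with docker inspect --format '{{.State.Pid}}' <container>. The function names below are hypothetical, reading another process's memory needs root (or ptrace rights), and a raw byte search will only find values that are stored in memory in the same byte form as the needle, which may not hold for "my_array".

```python
import re
from typing import Iterator, NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


# One line of /proc/<pid>/maps: "start-end perms offset dev inode [path]"
_MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S{4})\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def parse_maps(text: str) -> Iterator[Region]:
    """Yield the mapped regions described by the text of /proc/<pid>/maps."""
    for line in text.splitlines():
        m = _MAPS_LINE.match(line)
        if m:
            yield Region(int(m.group(1), 16), int(m.group(2), 16),
                         m.group(3), m.group(4))


def parse_environ(raw: bytes) -> dict:
    """Split /proc/<pid>/environ (NUL-separated KEY=VALUE entries)."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_in_memory(pid: int, needle: bytes) -> list:
    """Return addresses where needle occurs in readable regions of pid.

    Only the first hit per region is reported; unreadable regions are skipped.
    """
    hits = []
    with open(f"/proc/{pid}/maps") as f:
        regions = list(parse_maps(f.read()))
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        for r in regions:
            if not r.perms.startswith("r") or r.path in ("[vvar]", "[vsyscall]"):
                continue
            try:
                mem.seek(r.start)
                data = mem.read(r.end - r.start)
            except (OSError, ValueError, OverflowError):
                continue  # region could not be read; move on
            offset = data.find(needle)
            if offset != -1:
                hits.append(r.start + offset)
    return hits
```

For example, parse_environ(open(f"/proc/{pid}/environ", "rb").read()).get("SECRET_ONE") would show the variable as it was set when the process started, and find_in_memory(pid, b"some-known-value") reports where a known byte string sits in memory.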

Related

Where should production-critical and non-production, non-critical data be stored?

I was asked this question in an interview and I'm not sure of the correct answer, so I would like your suggestions.
I was asked whether we should persist production-critical data inside the Docker instance or outside of it. What would my choice be, and what are the reasons for it?
Would your answer differ if the data is non-production and non-critical?
Back your answers with reasons.
Most data should be managed externally to containers and container images. I tend to view data constrained to a container as temporary (intermediate|discardable) data. Otherwise, if it's being captured but it's not important to my business, why create it?
The name "container" is misleading. Containers aren't like VMs where there's a strong barrier (isolation) between VMs. When you run multiple containers on a single host, you can enumerate all their processes using ps aux on the host.
There are good arguments for maintaining separation between processes and data and running both within a single container makes it more challenging to retain this separation.
Unlike processes, files in container layers are more isolated though. Although the layers are manifest as files on the host OS, you can't simply ls a container layer's files from the host OS. This makes accessing the data in a container more complex. There's also a performance penalty for effectively running a file system atop another file system.
While it's common and trivial to move container images between machines (viz docker push and docker pull), it's less easy to move containers between machines. This isn't generally a problem for moving processes as these (config aside) are stateless and easy to move and recreate, but your data is state and you want to be able to move this data easily (for backups, recovery) and increasingly to move amongst a dynamic pool of nodes that perform processing upon it.
Less importantly but not unimportantly, it's relatively easy to perform the equivalent of a rm -rf * with Docker by removing containers (docker container rm ...) and thereby deleting the application and your data.
The two most basic considerations here:
Whenever a container gets deleted, everything in the container filesystem is lost.
It's extremely common to delete containers; it's required to change many startup options or to update a container to a newer image.
So you don't really want to keep anything "in the container" as its primary data storage: it's inaccessible from outside the container, and will get lost the next time there's a critical security update and you must delete the container.
In plain Docker, I'd suggest keeping
...in the image: your actual application (the compiled binary or its interpreted source as appropriate; this does not go in a volume)
...in the container: /tmp
...in a bind-mounted host directory: configuration files you need to push into the container at startup time; directories of log files produced by the container (things where you as an operator need to directly interact with the files)
...in either a named volume or bind-mounted host directory: persistent data the container records in the filesystem
On this last point, consider trying to avoid this layer altogether; keeping data in a database running "somewhere else" (could be another container, a cloud service like RDS, ...) simplifies things like backups and simplifies running multiple replicas of the same service. A host directory is easier to back up, but in some environments (macOS) it's unacceptably slow.
My answers don't change here for "production" vs. "non-production" or "critical" vs. "non-critical", with limited exceptions you can justify by saying "it's okay if I lose this data" ("because it's not the master copy of it").

Docker container with ELK stack to browse nginx and tomcat log files

I am trying to debug a production failure involving (multiple) nginx and tomcat logs. I have copied the logs to my dev machine. What is the easiest way for me to import these logs into an elastic/ELK stack to sift through quickly? (Currently, I'm making do with less commands across multiple windows)
So far I've found only generic docker containers (like https://elk-docker.readthedocs.io/) that require me to install filebeat and configure it. However, since my data is static, I would prefer a simpler installation.
What I did earlier is create the ELK stack with docker-compose and ingest the data via 'nc' (netcat). An example can be found at: https://github.com/deviantony/docker-elk
You might want to adjust the logstash config so that it reads and parses your data correctly. If the number of files is not too big, you can nc them one by one; otherwise you can write a small script around it, in bash for example, to loop through the files.
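A minimal sketch of such a loop, written in Python rather than bash. It assumes the Logstash pipeline exposes a plain TCP input; the default port 5000 used here is an assumption (check the port your logstash config actually listens on), and send_logs is a hypothetical helper name.

```python
import socket


def send_logs(paths, host="localhost", port=5000):
    """Stream each file to a Logstash TCP input; return total bytes sent.

    One connection is opened per file, which mirrors running
    `nc host port < file` once for each log file.
    """
    sent = 0
    for path in paths:
        with socket.create_connection((host, port)) as sock, open(path, "rb") as f:
            # Read in chunks so large log files are not loaded into memory at once.
            for chunk in iter(lambda: f.read(65536), b""):
                sock.sendall(chunk)
                sent += len(chunk)
    return sent
```

A call such as send_logs(["nginx-access.log", "catalina.out"]) would push both files through the same pipeline; the file names are only placeholders.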

Spring boot is consuming too much RAM

I have created some services in Spring Boot. I have 11 fat jars and I deploy them in Docker containers. My concern is that every jar consumes between 1 and 1.5 GB of RAM while idle. I check the RAM by running:
docker stats containername
At first I thought the problem was the Java base image, so I tried switching to one based on Alpine, but nothing changed, so I think the problem is my jar. Is there a way to change the amount of RAM the jar uses? Or is this behavior normal because every jar has an embedded Tomcat? Or would it be better to group some jars together, deploy them as wars, and use a single Tomcat per group? Can someone share their experience?
Thanks in advance.
This is how Java behaves in general. The JVM takes as much memory as you give it, and it runs a process called garbage collection (see "What is the garbage collector in Java") to free up space once it decides it should do so.
However, if you don't tell your JVM how much memory it can use, it will use the system defaults, which depend on your system's memory and the number of cores you have. You can verify this using the following command (see "How is the default Java heap size determined"):
java -XX:+PrintFlagsFinal -version | grep HeapSize
On my machine, that's an initial heap memory of 256MiB and a maximum heap size of 4GiB. However, that doesn't mean that your application needs it.
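As a rough illustration of where such numbers can come from: commonly documented HotSpot defaults are an initial heap of 1/64 of physical memory and a maximum heap of 1/4 of physical memory. The sketch below only applies those two fractions; the 16 GiB machine size is an assumption chosen because it reproduces the figures above, and real defaults vary with JVM version and flags.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def default_heap_sizes(physical_bytes):
    """Return (initial, maximum) heap sizes under the 1/64 and 1/4 rule."""
    initial = physical_bytes // 64
    maximum = physical_bytes // 4
    return initial, maximum


# Assumed machine size: 16 GiB of physical memory.
initial, maximum = default_heap_sizes(16 * GIB)
print(initial // MIB, "MiB initial,", maximum // GIB, "GiB maximum")
```

With 11 services on one host, each JVM applies this calculation independently, which is one reason the combined footprint can grow quickly when no limits are set.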
A good way of measuring your memory usage is with a monitoring tool like jvisualvm. Additionally, you could use the Spring Boot Actuator metrics endpoint (/metrics, or /actuator/metrics in newer versions) to see the heap memory usage as well.
Your heap memory usage will normally have a sawtooth pattern (Why a sawtooth shaped graph), where the memory is gradually being used, and eventually freed by the garbage collector.
What is left over after a garbage collection is usually the set of objects that cannot be destroyed because they're still in use. You could see this as your working memory. Now, to configure your -Xmx you'll have to see how your application behaves after trying it out:
Configure it below your normal memory usage and your application will go out of memory, throwing an OutOfMemoryError.
Configure it too low but above your minimal memory usage, and you will see a huge performance hit, due to the garbage collector continuously having to free memory.
Configure it too high and you'll reserve memory you won't need in most cases, wasting resources.
In my case, monitoring shows that my application reserves about 1GiB of memory for heap usage, while it only uses about 30MiB after a garbage collection. That means the -Xmx value is far too high, so we could change it to different values and see how the application behaves.
People often prefer to work in powers of 2 (even though there is no limitation, as seen in jvm heap setting pattern). In my case, I need to go with at least 30MiB, since that's the amount of memory my application uses at all times. So that means I could try -Xmx32m, see how it performs, and adjust if it goes out of memory or performs worse.
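The arithmetic behind that first guess can be sketched as follows. The helper names are hypothetical, and the result is only a starting value to test, not a recommendation.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def first_xmx_guess(live_mib):
    """Turn the heap in use after GC (in MiB) into an -Xmx flag to try first."""
    return f"-Xmx{next_power_of_two(live_mib)}m"


print(first_xmx_guess(30))  # -Xmx32m
```

If the application then throws OutOfMemoryError or spends most of its time in garbage collection, the next step up would be first_xmx_guess(33), which gives -Xmx64m.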
You can limit the JVM heap inside a Docker container by passing -e JAVA_OPTS="-Xmx64M -Xms64M".
Dockerfile:
FROM openjdk:8-jre-alpine
VOLUME /var/lib/mysql
ADD /build/libs/application.jar app.jar
ENTRYPOINT exec java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar
Running the image:
docker run -d --name container-name -p 9100:9100 -e JAVA_OPTS="-Xmx512M -Xms512M" imagename:tag
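Here I set a 512 MB heap; you can set 1g or whatever your application requires. After running the container this way, check its memory usage again. The heap will be capped at 512 MB, although the container as a whole can use somewhat more, because the JVM also allocates non-heap memory (metaspace, thread stacks, and so on).
After taking a look at the openjdk Docker Hub image documentation, it seems that you can influence the default heap size by setting -XX:MaxRAM=...: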
RAM limit is supported by Windows Server containers, but currently JVM
cannot detect it. To prevent excessive memory allocations,
-XX:MaxRAM=... option must be specified with the value that is not bigger than a containers RAM limit.
From the oracle docs:
Default Heap Size: Unless the initial and maximum heap sizes are specified on the command line, they are calculated based on the amount of memory on the machine.

Unwanted revert back to original of dbms.memory.heap.max_size

I'm using Neo4j (v. 3.1.0) in a Docker container. I tried to update the whole database with a single query and hit this error:
There is not enough memory to perform the current task. Please try
increasing 'dbms.memory.heap.max_size' in the neo4j configuration
(normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop,
found through the user interface) or if you are running an embedded
installation increase the heap by using '-Xmx' command line flag, and
then restart the database.
So I went to set the config file entries:
dbms.memory.heap.initial_size=512M
dbms.memory.heap.max_size=512M
I set them both to 2048M (I've read that these two should match). But after saving and restarting the container, the entries revert to their original 512M values. To make sure it's not a Docker issue, I added a comment line to the config, and it sticks. That means the values are being reset intentionally. But why? Is it a limitation imposed by Docker? My hardware has enough memory!
If you are using the standard docker image, the /docker_entrypoint.sh will set the memory based on environment variables or default it to 512M.
setting "dbms.memory.heap.initial_size" "${NEO4J_dbms_memory_heap_maxSize:-512M}"
setting "dbms.memory.heap.max_size" "${NEO4J_dbms_memory_heap_maxSize:-512M}"
When you create the container, add --env NEO4J_dbms_memory_heap_maxSize=2048 to the docker run command.
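The two entrypoint lines quoted above rely on the shell's ${VAR:-default} expansion, which falls back to the default when the variable is unset or empty. A small Python sketch of that rule shows why hand edits to the config file appear to revert: on every start the value is recomputed from the environment, not read from the file. The function name is hypothetical, and the sketch models only the expansion, not the rest of the entrypoint script.

```python
def resolve_setting(env, name, default):
    """Mimic the shell expansion ${name:-default}.

    The default is used when the variable is missing or set to an empty string.
    """
    value = env.get(name)
    return value if value else default


# Container started without the variable: the default wins on every start.
print(resolve_setting({}, "NEO4J_dbms_memory_heap_maxSize", "512M"))

# Container started with --env NEO4J_dbms_memory_heap_maxSize=2048
print(resolve_setting({"NEO4J_dbms_memory_heap_maxSize": "2048"},
                      "NEO4J_dbms_memory_heap_maxSize", "512M"))
```

Passing os.environ as the env argument would apply the same rule to the real process environment.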

Is it "safe" to commit a running container in docker?

As the title says: is it safe, meaning the proper way?
Safe = consistent, no data loss, a professional, legitimate approach.
I hope some experienced Docker users can share their experience.
Q. Is commit safe for running Docker containers (with the exception of rapidly changing real-time workloads and databases)? Your own commentary is appreciated.
A yes or no answer with a comment is fine. Thanks.
docker commit captures the container's filesystem changes; it does not capture process memory or the contents of mounted volumes, and by default it pauses the container while the image is written. As long as you don't depend on external mounts/Docker volumes or external servers (externally connected DBs?), you should not get into trouble stopping, restarting, and committing containers. Read on for more depth on this topic.
A good first question to ask is: how does Docker store the changes a container makes to its disk at runtime? The original state of the container's filesystem comes from the image, and the container cannot write to that image. Instead, Docker records a diff of what has changed in the container relative to the image.
Docker uses a technology called a "union filesystem", which creates a diff layer on top of the initial state of the Docker image.
This diff (the container's writable layer) is stored on the host's disk, not in memory; it survives stopping and restarting the container and disappears when you delete the container. When you use docker commit, the writable layer is stored in a new image. However, I don't recommend this: the state of the new image is not described in a Dockerfile and cannot easily be regenerated by a rebuild. Writing a new Dockerfile should not be hard, so that is always the way to go for me personally.
When your container works with mounted volumes or external servers/DBs, you may want to temporarily stop the services inside the container before committing, so that nothing gets out of sync. With a Dockerfile, you can run a bootstrap shell script inside the container to open connections, perform checks, and initialize the running process, so your application is set up reproducibly. Running a committed container makes this harder.
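The layering described above can be illustrated with a toy model. This is only a conceptual sketch in Python with hypothetical names: real storage drivers work on directories and files, not dictionaries. It shows the three ideas involved: the image layer is never modified, writes and deletions land in a separate diff, and a commit merges the two into a new image.

```python
class Container:
    """Toy union filesystem: a read-only image plus a writable diff layer."""

    def __init__(self, image):
        self.image = dict(image)   # lower layer, treated as read-only
        self.diff = {}             # files created or changed at runtime
        self.deleted = set()       # image files hidden by a deletion

    def read(self, path):
        if path in self.deleted:
            raise FileNotFoundError(path)
        if path in self.diff:      # the upper layer shadows the image
            return self.diff[path]
        if path in self.image:
            return self.image[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.deleted.discard(path)
        self.diff[path] = data     # the image itself is left untouched

    def delete(self, path):
        self.read(path)            # raises if the file does not exist
        self.diff.pop(path, None)
        if path in self.image:
            self.deleted.add(path)  # mark the image file as hidden

    def commit(self):
        """Merge image and diff into the contents of a new image."""
        merged = {p: d for p, d in self.image.items() if p not in self.deleted}
        merged.update(self.diff)
        return merged


image = {"/app/run.py": "v1", "/etc/conf": "a"}
c = Container(image)
c.write("/etc/conf", "b")
c.write("/tmp/x", "scratch")
c.delete("/app/run.py")
print(c.commit())
```

Discarding the Container object without calling commit() loses the diff, which corresponds to removing a container; the original image dictionary is unchanged either way.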
