Compiled C program runs 100x slower in a Docker container - docker

I recently wrote a C program and compiled it natively (on Ubuntu 20.04) with the musl-gcc wrapper. When I run it natively, it takes roughly 4 seconds to execute.
Then I wanted to see how long it takes inside a container. For this I wrote a small Dockerfile that copies the compiled binary into the image and executes it. Since I compiled the code using musl-gcc, I couldn't find a way to compile it in the Dockerfile, so I decided to copy the compiled file instead. This is the Dockerfile I wrote:
FROM ubuntu:20.04
ADD program /bin/
CMD ["/bin/program"]
It takes a lot of time to run. Since the program also has some file-system activity, I decided to try to mount it to a temporary file system by running it like this:
docker run --tmpfs /run:rw,noexec,nosuid,size=65536k image-name
But even with a temporary file system it takes about 500 seconds to execute (roughly 125 times slower).
Is there something I'm missing that could speed up the execution of this program in a container?

Related

Can I run scripts from a docker build context without a copy?

I want to build on top of a windows docker container by installing a couple programs. The files total .5 GB and I want to keep the layers as small as possible. I am hoping I can run the setup files from the build-context, and then have the build-context swept away at the end so I don't have a needless copy of the source files for the setup.exe embedded in my container layers. However, I have not found an example where this is the case. Instead I mostly see people run a COPY command to a temporary build folder, run their setup, then remove the folder. Won't those files still be in the container layers because the COPY command creates a new layer when it's done?
I don't know if the container can see the build-context directly. I was hoping for some magical folder filled with the build-context files so I could run a script using it, but haven't found anything.
It seems like the alternative is to create a private file-server and perform a RUN that can download them from that private server and unpack them, run the install, and remove them (all as 1 docker step). I understand this would make it more available to others who need to rerun the build, but I'm not convinced we'll need to rerun it. It's not likely to change as the container will build patches for a legacy application. Just seems like a lot to host files on a private, public-facing server for something that will get called once every couple years if ever.
So are these my two options?
Make a container with needless copies of source files embedded within
Host the files on a private file server and download/install/remove them
Or am I missing another option or point about how the containers work?
It's a long shot, as Windows is tricky when it comes to file systems, but you could do it this way:
In your Dockerfile, use a COPY command, run the install, then RUN del ... to remove the installation files
Build your image: docker build -t my-large-image:latest .
Run your image: docker run --name my-large-container my-large-image:latest
Stop the container
Export your container filesystem: docker export my-large-container > my-large-container.tar
Import the filesystem into a new image: cat my-large-container.tar | docker import - my-small-image
The caveat is that you need to run the container once, which might not be what you want. Also, I haven't tested this with Windows containers, sorry.
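To illustrate why the COPY-then-delete approach does not shrink the image while export/import does, here is a toy model in Python. It only does arithmetic on layer sizes: the 500 MB installer figure comes from the question, while the base and install sizes are made-up assumptions, and `image_size` is a hypothetical helper, not a Docker API.

```python
def image_size(layers):
    """Stored image size is the sum of all layer sizes (in MB)."""
    return sum(size for _, size in layers)

# Hypothetical sizes in MB; only the 500 MB installer figure is from the question.
copy_then_delete = [
    ("base image", 4000),
    ("COPY installers", 500),
    # The delete in this later layer only hides the files; the COPY layer
    # above still stores them.
    ("RUN setup, then delete installers", 300),
]

# Export/import collapses the final filesystem into one layer, so the
# deleted installers are gone for real.
flattened = [("imported filesystem", 4000 + 300)]

saved = image_size(copy_then_delete) - image_size(flattened)
print(saved)  # size of the installers that no longer ship in the image
```

One trade-off of the flattened image, under the same model: it is a single new layer, so it no longer shares the base layers with other images on the host.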
I usually do the download or copy in one step, then in the next step I do the silent installation and remove the installer.
# escape=`
FROM mcr.microsoft.com/dotnet/framework/wcf:4.8-windowsservercore-ltsc2016
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
ADD https://download.visualstudio.microsoft.com/download/pr/6afa582f-fa26-4a73-8cb9-194321e85f8d/ecea51ead62beb7acc73ad9799511ffdb3083ad384fe04ec50e2cbecfb426482/VS_RemoteTools.exe VS_RemoteTools_x64.exe
RUN Start-Process .\\VS_RemoteTools_x64.exe -ArgumentList @('/install','/quiet','/norestart') -NoNewWindow -Wait; `
Remove-Item -Path C:/VS_RemoteTools_x64.exe -Force;
But otherwise, I don't think you can mount a custom volume while it's being built.
I didn't find a satisfactory answer to this. Docker seems designed for only the modern era and assumes you'll be able to download what you need via scripts and tools hitting APIs and file servers. The easiest option I found that I eventually went with was to host the files on a private file server or service (in my case, AWS S3).
I really wish there was a way to have files hosted by the docker daemon in some way, eg. if it acted like a temporary server that you could get data from via http instead of needing to COPY the files and create a layer. Alas, I found no such feature.
Taking this route made my container about a GB smaller.

Trying to port application to docker nanoserver container. Running exe fails with exit code -1073741515 (Dependency missing)

I'm currently trying to port my image optimizer application to a NanoServer docker image. One of the tools my image optimizer uses is truepng.exe. (Can be downloaded here: http://x128.ho.ua/clicks/clicks.php?uri=TruePNG_0625.zip)
I simply created a nanoserver container and mounted a folder that contained truepng.exe:
docker run --rm -it -v C:\data:C:\data mcr.microsoft.com/windows/nanoserver:2004-amd64
When I now run truepng.exe I expect some output regarding command line arguments missing:
C:\MyLocalWindowsMachine>truepng
TruePNG 0.6.2.5 : PNG Optimizer
by x128 (2010-2017)
x128@ua.fm
...
However when I call this from inside the nanoserver docker container I basically see no output:
C:\data>truepng
C:\data>echo %ERRORLEVEL%
-1073741515
As you can see above, the exit code is set to -1073741515. According to this it typically means that there's a dependency missing.
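For reference, the negative number is the signed view of a 32-bit NTSTATUS value. A minimal Python sketch of the conversion follows; the function name and the one-entry lookup table are illustrative only and not part of any tool mentioned here.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"

# Tiny lookup table for this example; it covers only the code discussed here.
KNOWN_CODES = {
    "0xC0000135": "STATUS_DLL_NOT_FOUND",
}

code = to_ntstatus(-1073741515)
print(code, KNOWN_CODES.get(code, "unknown"))  # 0xC0000135 STATUS_DLL_NOT_FOUND
```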
I then downloaded https://github.com/lucasg/Dependencies to see the dependencies of truepng:
It seems to depend on 5 DLLs. Looking these up I found that there's apparently something called 'Reverse Forwarders': https://cloudblogs.microsoft.com/windowsserver/2015/11/16/moving-to-nano-server-the-new-deployment-option-in-windows-server-2016/
According to the following post though they should already be included in nanoserver: https://social.technet.microsoft.com/Forums/en-US/5b36a6d3-84c9-4940-8b7a-9e2a38468291/reverse-forwarders-package-in-tp5?forum=NanoServer
After all this investigation I also tried manually copying the DLLs from my local machine (system32) into the container, without any success (it just kept breaking other things, like the copy command, which required me to recreate the container). I also copied the files from SysWOW64, but this didn't help either.
I'm currently quite stuck on how to proceed, as I'm not even sure whether the tool is missing dependencies or something else is going on. Is there a way to investigate which DLLs are missing when a tool starts?
Kind regards,
Devedse
Edit 1: Idea from @CherryDT
I tried running gflags (https://social.msdn.microsoft.com/Forums/en-US/f004a7e5-9024-4555-9ada-e692fbc3160d/how-to-start-quotloader-snapsquot?forum=vcgeneral) which gave the following output:
C:\data>"C:\data\gflags.exe" /i TruePNG.exe +sls
Current Registry Settings for TruePNG.exe executable are: 00000000
After this I tried running Dbgview.exe, this however never resulted in a log file being written:
C:\data>"C:\data\DebugView\Dbgview.exe" /v /l debugview-log.txt /g /n
C:\data>
I also started TruePNG.exe again, but again, no log file was written.
I tried querying the EventLogs using a dotnet core application, but this resulted in the following exception:
Unhandled exception. System.InvalidOperationException: Cannot open log Application on computer '.'. This function is not supported on this system.
at System.Diagnostics.EventLogInternal.OpenForRead(String currentMachineName)
at System.Diagnostics.EventLogInternal.GetEntryAtNoThrow(Int32 index)
at System.Diagnostics.EventLogEntryCollection.GetEntryAtNoThrow(Int32 index)
at System.Diagnostics.EventLogEntryCollection.EntriesEnumerator.MoveNext()
at EventLogReaderTest.ConsoleApp.Program.Main(String[] args) in C:\data\EventLogReaderTest.ConsoleApp\Program.cs:line 22
Windows Nano Server is tiny and only supports 64-bit applications, tools, and agents. The missing dependency in this case is the entire x86 emulation layer (WoW64), as TruePNG is a 32-bit application.
Windows Server Core contains WoW64 and other components missing from Nano Server. Use a Windows Server Core image instead.
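One way to check an executable's architecture before picking a base image is to read the Machine field from its PE header. Below is a minimal Python sketch, assuming the standard PE layout (the offset of the PE signature is stored at 0x3C, and the Machine field follows the signature directly). `pe_machine` and `fake_pe` are hypothetical helper names, and the demo uses a synthetic header rather than the real TruePNG.exe.

```python
import struct

# Machine field values from the PE format: 0x014C is i386, 0x8664 is AMD64.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data: bytes) -> str:
    """Return a readable architecture name for the bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: little-endian 32-bit offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 16-bit Machine field is the first field after the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"other (0x{machine:04X})")

def fake_pe(machine: int) -> bytes:
    """Build a header-only buffer for demonstration (not a runnable exe)."""
    header = bytearray(0x40)
    header[:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    return bytes(header) + b"PE\0\0" + struct.pack("<H", machine)

print(pe_machine(fake_pe(0x014C)))  # x86 (32-bit)
```

For a real file, something like `pe_machine(open(r"C:\data\truepng.exe", "rb").read())` should report the architecture; a 32-bit result means the binary needs WoW64 and therefore will not start on Nano Server.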
Example command:
docker run --rm -it -v C:\Temp:C:\Temp mcr.microsoft.com/windows/servercore:2004 C:\Temp\TruePNG.exe
Yields the expected output:
TruePNG 0.6.2.5 : PNG Optimizer
by x128 (2010-2017)
x128@ua.fm
TruePNG {options} files
options:
/f# PNG delta filters 0=None, 1=Sub, 2=Up, 3=Average, 4=Paeth, 5=Mixed
/fe PNG extra filters, overrides /f switch
/i# PNG interlace method 0=None, 1=Adam7 (default input)
/g# PNG gamma 0=Remove, 1=Apply & Remove, 2=Keep (default)
[...]

Is there a reason why people write everything to the DockerFile instead of a separate shell script?

I somehow don't like the RUN x && y && z ... syntax we currently use in the Dockerfile. As far as I understand, I could just run a shell script instead, like RUN xyz.sh, and do the same tasks in my favorite language. Does the latter have any disadvantage?
Update:
In addition to the point made by David about the complexity, I believe writing everything in the Dockerfile makes it easier to share (thus creating a survivorship bias in what you see). Namely, on Docker Hub, most of the time you have a "Dockerfile" tab to quickly get an idea of how the image is built. If the author uses COPY and RUN xyz.sh, he/she would have to host the script elsewhere, or the Dockerfile alone becomes meaningless.
CMD is executed at runtime, that is when the container is created from the image. RUN is a build time instruction. So the question is actually why people run things with RUN instead of CMD at runtime. (You can of course COPY script.sh /script.sh then RUN bash /script.sh)
If you do things like installing dependencies at runtime, starting a container could take a lot of time. When scaling up your service, this would make auto-scaling useless, because new containers can't start fast enough to absorb the peak.
At build time, RUN can be cached, so next time the build will be a lot faster.
Because of the way the Docker file system works, creating 10 containers from the same image takes only a little more space than creating 1 container. So you save disk space by installing packages in the image, whereas if you install them at runtime, each container keeps its own copy on disk.
RUN executes commands in a new layer and creates a new image. This happens when you build the image using docker build.
CMD specifies a default command and parameters to be run when a container is launched from the image.
In summary, RUN and CMD are not interchangeable: RUN runs when an image is built, CMD when a container is launched.

Compile inside Docker container without huge container sizes

I'm creating an auto-testing service for my university. I need to take student code, put it into the project directory, and run tests.
This needs to be done for multiple different languages in an extensible way.
My initial plan:
Have a "base image" for each language (i.e. install the language runtime on buildpack-deps:stretch)
Take user files & pre-made project structure
Put user files into the correct location in the project
Build an image of the project extending the base image
Run the container. It will compile the project and run tests.
Save test results to the database, stop & delete the image
Rinse repeat for every submission
When testing manually, the image sizes are huge! Almost 1.5GB in size! I'm installing the runtime for one language, and I was testing with Hello World - so the project wasn't big either.
This "works", but feels very inefficient. I'm also very new to Docker – is there a better way to do this?
Cheers
In this specific application, I'd probably compile the program inside a container and not build an image out of it (since you're throwing it away immediately, and the compilation and testing is the important part and, unusually, you don't need the built program for anything after that).
If you assume that the input file gets into the container somehow, then you can write a script that does the building and testing:
#!/bin/sh
cd /project/src/student
tar xzf "/app/$1"
cd ../..
make
...
curl ??? # send the test results somewhere
Then your Dockerfile just builds this into an image, without any specific student code in it:
FROM buildpack-deps:stretch
RUN apt-get update && apt-get install ...
RUN adduser user
COPY build_and_test.sh /usr/local/bin
USER user
ADD project-structure.tar.gz /project
Then when you actually go to run it, you can use the docker run -v option to inject the submitted code.
docker run --rm -v $HOME/submissions:/app theimage \
build_and_test.sh student_name.tar.gz
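If the testing service itself is written in Python, the same invocation could be driven per submission roughly like this. This is only a sketch: `docker_run_command`, `run_submission`, and the `timeout` value are hypothetical, while the image name, mount point, and script name are the ones used above.

```python
import os
import subprocess

def docker_run_command(submission, submissions_dir, image="theimage"):
    """Build the argument list for the docker run invocation shown above."""
    return [
        "docker", "run", "--rm",
        # Bind-mount the host directory holding the submissions at /app.
        "-v", f"{os.path.abspath(submissions_dir)}:/app",
        image, "build_and_test.sh", submission,
    ]

def run_submission(submission, submissions_dir, timeout=300):
    """Run one submission and return the container's exit code.

    Needs a working Docker installation and the image built beforehand;
    the timeout (in seconds) is an arbitrary illustrative value.
    """
    cmd = docker_run_command(submission, submissions_dir)
    return subprocess.run(cmd, timeout=timeout).returncode

print(docker_run_command("student_name.tar.gz", "submissions"))
```

Passing the command as a list avoids shell quoting problems with unusual file names in student uploads.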
In your original solution, note that the biggest things are likely to be the language runtime, C toolchain, and associated header files, and so while you get an apparently huge image, all of these things come from layers in the base image and so are shared across the individual builds (it's not taking up quite as much space as you think).
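The sharing effect can be made concrete with a toy calculation in Python. The layer names and sizes below are invented for illustration (chosen to add up to roughly the 1.5GB reported in the question), and `disk_usage` is a hypothetical helper that counts each distinct layer once.

```python
def disk_usage(images):
    """Sum each distinct layer once, since identical layers are stored once."""
    unique = {}
    for layers in images.values():
        for layer_id, size in layers:
            unique[layer_id] = size
    return sum(unique.values())

# Hypothetical layer sizes in MB.
base = [("buildpack-deps", 900), ("language runtime", 550)]

# Ten per-submission images, each adding a tiny layer on top of the same base.
images = {
    f"submission-{i}": base + [(f"student-code-{i}", 1)]
    for i in range(10)
}

# What the per-image sizes add up to if every image is counted in full.
apparent = sum(size for layers in images.values() for _, size in layers)

print(apparent, disk_usage(images))  # 14510 1460
```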

Mounted docker volumes corrupting files

I think this is machine related, but I'm not sure. I'm using the most current docker toolbox with docker 1.10.3 on OSX
I have a project using a Dockerfile, which copies code into the container like this:
[...]
COPY . /code
VOLUME /code
WORKDIR /code
[...]
For faster local development (test execution), we mount the current directory in the compose file
[...]
volumes:
- .:/code
[...]
and execute
docker-compose -f docker-compose.yml -f docker-compose.testing.yml run web py.test
Now, it looks like I have two different folders/files:
When running the container and looking inside a file with vi, everything looks the same as on the host. But after changing files and executing our tests (pytest, specifically), the Python interpreter reads garbage, so it can't execute the tests.
Example
the end of a file looks like this (which got copied in the Dockerfile into the container):
post_save.connect(backup_something, sender=SomeSender, dispatch_uid='backup_something') foobar
this obviously raises an error when executing, so I change it to
post_save.connect(backup_something, sender=SomeSender, dispatch_uid='backup_something')
the file looks fine now, both from the host and inside the container.
Executing pytest, it still reads the content of the copied code, breaking the tests locally for me.
If I change even more, it's neither the copied nor the mounted file, so stuff breaks at random positions:
File "/code/some_code.py", line 69
dispatch_uid='backup_
^
SyntaxError: EOL while scanning string literal
(tail shows correct syntax etc, there is definitely nothing broken with the code)
Is there something wrong with our setup or is it just my machine being broken somehow? I tried restarting and recreating the docker machine but this doesn't help.
I would try to mount in read only mode and then double check the filesystem type if there's something strange.
Years ago there was a bug with ntfs-3g corrupting files, maybe it's something similar (obviously not ntfs because we are on OS X)
I have no experience with Docker Toolbox on OS X, but I think you may have ended up with a union mount.
If that is the case, the solution would be to move files or mount point so that files won't be shadowed.
This article may be relevant:

Resources