Does `strace -f` work differently when run inside a docker container? - docker

Assume the following:
I have a program myprogram inside a docker container
I'm running the docker container with
docker run --privileged=true my-label/my-container
Inside the container - the program is being run with:
strace -f -e trace=desc ./myprogram
What I see is that strace (despite having -f on) doesn't seem to follow all the child processes.
I see the following output from strace
[pid 10] 07:36:46.668931 write(2, "..\n"..., 454 <unfinished ...>
<stdout of ..>
<stdout other output - but I don't see the write commands - so probably from a child process>
[pid 10] 07:36:46.669684 write(2, "My final output\n", 24 <unfinished ...>
<stdout of My final output>
What I want to see is the other write calls.
Since I'm using -f, I should be seeing those writes too.
What I think is happening is that running inside docker makes the process handling and security different.
My question is: Does strace -f work differently when run inside a docker container?
Note that this application starts and stops within 2 seconds, so the tracing tool has to follow the application's whole lifecycle, as strace does. Connecting to a long-running server process won't work.

It turns out strace truncates string output: by default only the first 32 characters of each string argument are printed. You have to explicitly ask for more, e.g. with -s 800.
strace -s 800 -ff ./myprogram
You can also ask strace explicitly for just the write calls with -e write.
strace -s 800 -ff -e write ./myprogram
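The difference -f makes is easy to reproduce with a toy program that forks: the child's write(2) calls happen under a different pid, and strace only reports them when it follows forks. A minimal sketch (the program and its messages are made up for illustration; save it and run it under strace -f -s 800 -e trace=write):

```python
import os

# The child performs its own write(2) syscall on the pipe; under
# strace, that call is only visible when -f follows the fork.
r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child process: writes happen under a different pid.
    os.close(r)
    os.write(w, b"hello from child pid %d\n" % os.getpid())
    os._exit(0)

# Parent: read what the child wrote, then reap it.
os.close(w)
child_output = os.read(r, 1024)
os.close(r)
os.waitpid(pid, 0)
print(child_output.decode(), end="")
```

Traced with -f, the write from the child appears tagged with its own [pid NNN]; without -f, only the parent's syscalls show up.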

Related

Custom docker image logs are incomplete

I have a task where I need to build a Docker Compose stack for WordPress with HTTPS support. For that I have custom-built images based on Ubuntu 20.04, patterned on their official image equivalents. In short, everything works on these custom images except the logging. I have two projects: one using the official images and one using the custom-built images.
The actual log inside the container (cat /var/log/apache2/error.log) looks like this:
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 192.168.80.3. Set the 'ServerName' directive globally to suppress this message
[Wed Feb 15 16:36:15.477371 2023] [mpm_prefork:notice] [pid 8] AH00163: Apache/2.4.41 (Ubuntu) configured -- resuming normal operations
[Wed Feb 15 16:36:15.477397 2023] [core:notice] [pid 8] AH00094: Command line: '/usr/sbin/apache2 -D FOREGROUND'
whereas docker logs only gives output like this:
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 192.168.80.3. Set the 'ServerName' directive globally to suppress this message
I have the same issue with 3 custom-built containers.
Their CMDs or ENTRYPOINTs look like this:
CMD ["apache2ctl", "-D", "FOREGROUND"]
ENTRYPOINT ["sh", "-c", "nginx -g 'daemon off;' & exec tail -f /var/log/nginx/*.log"]
ENTRYPOINT ["sh", "-c", "/usr/sbin/mysqld --init-file=/docker-entrypoint-initdb.d/setup.sql --log-error-verbosity=3 --general_log=1 --general_log_file=/var/log/mysql/general.log --log-error=/var/log/mysql/error.log & exec tail -f /var/log/mysql/*.log"]
I need to use tail for 2 of the 3 containers because without it there are no logs whatsoever.
In the case of apache2ctl there is no need for tail, but the result is the same for all of these containers: the logs are displayed only once and appear to be incomplete; they do not update in real time the way they do for the official images.
I also built a custom Ubuntu-based image with Python 3 and a simple script that prints in a loop, and in that case it prints forever (the logs keep updating), the same behavior as in a regular shell.
I have spent a lot of time testing different possibilities and I'm getting nowhere, so I am here to ask why this happens with these custom-built images, and how I can make the logging continuous and complete, as it is for the official images.

who and w commands in CentOS 8 Docker container

While playing with CentOS 8 in a Docker container, I found that the output of the who and w commands is always empty.
[root@9e24376316f1 ~]# who
[root@9e24376316f1 ~]# w
01:01:50 up 7:38, 0 users, load average: 0.00, 0.04, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
This happens even when I'm logged in as a different user in a second terminal.
When I try to write to that user, it shows:
[root@9e24376316f1 ~]# write test
write: test is not logged in
Is this because of Docker? Does it work in some way that prevents sessions from seeing each other?
Or is it some other issue? I would really appreciate an explanation.
These utilities obtain the information about current logins from the utmp file (/var/run/utmp). You can easily check that in ordinary circumstances (e.g. on the desktop system) this file contains something like the following string (here qazer is my login and tty7 is a TTY where my desktop environment runs):
$ cat /var/run/utmp
tty7:0qazer:0�o^�
while in the container this file is (usually) empty:
$ docker run -it centos
[root@5e91e9e1a28e /]# cat /var/run/utmp
[root@5e91e9e1a28e /]#
Why?
The utmp file is usually modified by programs which authenticate the user and start the session: login(1), sshd(8), lightdm(1). However, the container engine cannot rely on them, as they may be absent in the container file system, so "logging in" and "executing on behalf of" is implemented in the most primitive and straightforward manner, avoiding relying on anything inside the container.
When any container is started, or any command is exec'd inside it, the container engine just spawns a new process, arranges some security settings, calls setgid(2)/setuid(2) to forcibly (without any authentication) alter the process's UID/GID, and then executes the required binary (the entry point, the command, and so on) within this process.
Say, I start the CentOS container running its main process on behalf of UID 42:
docker run -it --user 42 centos
and then try to execute sleep 1000 inside it:
docker exec -it $CONTAINER_ID sleep 1000
The container engine will perform something like this:
[pid 10170] setgid(0) = 0
[pid 10170] setuid(42) = 0
...
[pid 10170] execve("/usr/bin/sleep", ["sleep", "1000"], 0xc000159740 /* 4 vars */) = 0
There will be no writes to /var/run/utmp, thus it will remain empty, and who(1)/w(1) will not find any logins inside the container.
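For the curious, the fixed-size records that who(1) reads from /var/run/utmp can be decoded by hand. The sketch below assumes glibc's struct utmp layout on 64-bit Linux (384-byte records, as declared in bits/utmp.h); the layout and the fabricated sample record are assumptions for illustration, not something the container guarantees:

```python
import struct

# Assumed glibc struct utmp layout on x86-64 Linux: 384 bytes per
# record (ut_type, pad, ut_pid, ut_line, ut_id, ut_user, ut_host,
# ut_exit, ut_session, ut_tv, ut_addr_v6, unused). Verify against
# your libc's bits/utmp.h before relying on it.
UTMP = struct.Struct("<hhi32s4s32s256shhiii4i20s")
USER_PROCESS = 7  # ut_type value for an interactive login

def parse_utmp(data):
    """Yield (ut_type, pid, line, user) for each utmp record."""
    for off in range(0, len(data) - UTMP.size + 1, UTMP.size):
        ut_type, _pad, pid, line, _id, user, *_rest = UTMP.unpack_from(data, off)
        yield (ut_type, pid,
               line.rstrip(b"\0").decode(),
               user.rstrip(b"\0").decode())

# A fabricated record standing in for a real desktop login entry:
fake = UTMP.pack(USER_PROCESS, 0, 1234, b"tty7", b":0", b"qazer",
                 b"", 0, 0, 0, 0, 0, 0, 0, 0, 0, b"")
for record in parse_utmp(fake):
    print(record)
```

Run against a container's (empty) /var/run/utmp, the parser yields nothing, which is exactly why who(1) prints nothing.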

dumpcap stops logging after 1184 files?

Recently I came across a Linux application design. The intent of the application is to log Ethernet frames via dumpcap <> on Linux. It is implemented as below:
Create a new process using fork()
Call dumpcap <> via execl() as shown below
a. execl("/bin/sh", "/bin/sh", "-c", dumpcap<>, NULL);
b. sudo dumpcap -i "eth0" -B 1 -b filesize:5 -w "/mnt/Test_1561890567.pcapng" -t -q
Send SIGTERM to kill the process
The problem we are facing is that whenever we run the command from the process, dumpcap stops logging after 1184 or 1185 files. The process and thread are still alive, and the command is visible in top.

How to know if my program is completely started inside my docker with compose

In my CI chain I execute end-to-end tests after a "docker-compose up". Unfortunately my tests often fail because, even if the containers are properly started, the programs inside the containers are not.
Is there an elegant way to verify that my setup is completely started before running my tests?
You could poll the required services to confirm they are responding before running the tests.
curl has inbuilt retry logic or it's fairly trivial to build retry logic around some other type of service test.
#!/bin/bash
await(){
  local url=${1}
  local seconds=${2:-30}
  curl --max-time 5 --retry 60 --retry-delay 1 \
    --retry-max-time ${seconds} "${url}" \
    || exit 1
}
docker-compose up -d
await http://container_ms1:3000
await http://container_ms2:3000
run-ze-tests
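The same retry idea works in any language; here is a standard-library Python sketch (the URL names and timing values are illustrative, matching the shell version above):

```python
import time
import urllib.error
import urllib.request

def await_url(url, timeout=30.0, delay=1.0):
    """Poll url until it answers, or until timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status < 500:
                    return True  # service is up and responding
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; retry after a short pause
        time.sleep(delay)
    return False

# e.g. after `docker-compose up -d`:
# if not await_url("http://container_ms1:3000"):
#     raise SystemExit("service never came up")
```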
The alternative to polling is an event-based system.
If all your services push notifications to an external service (scaeda gave the example of a log file, or you could use something like Amazon SNS), your services emit a "started" event. You can then subscribe to those events and run whatever you need once everything has started.
Docker 1.12 added the HEALTHCHECK Dockerfile instruction. Maybe this is available via Docker events?
If you have control over the docker engine in your CI setup you could execute docker logs [Container_Name] and read out the last line which could be emitted by your application.
RESULT=$(docker logs [Container_Name] 2>&1 | grep [Search_String])
logs output example:
Agent pid 13
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
To parse a specific line:
RESULT=$(docker logs ssh_jenkins_test 2>&1 | grep Enter)
result:
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)

docker exec command doesn't return after completing execution

I started a docker container based on an image which has a file run.sh in it. Within a shell script, I use docker exec as shown below:
docker exec <container-id> sh /test.sh
test.sh completes execution, but docker exec does not return until I press Ctrl+C. As a result, my shell script never ends. Any pointers to what might be causing this?
I could get it working by adding the -it parameters:
docker exec -it <container-id> sh /test.sh
Mine works like a charm with this command. Maybe you only forgot the path to the binary (/bin/sh)?
docker exec 7bd877d15c9b /bin/bash /test.sh
File location at
/test.sh
File Content:
#!/bin/bash
echo "Hi"
echo
echo "This works fine"
sleep 5
echo "5"
Output:
ArgonQQ#Terminal ~ docker exec 7bd877d15c9b /bin/bash /test.sh
Hi
This works fine
5
ArgonQQ#Terminal ~
My case is a script a.sh with content like:
php test.php &
If I execute it like
docker exec container1 a.sh
it also never returns.
After half a day of googling and trying, I changed a.sh to
php test.php >/tmp/test.log 2>&1 &
and it works!
So it seems related to stdin/stdout/stderr. Please try adding the redirection:
>/tmp/test.log 2>&1
And please note that my test.php is an endless-loop script that monitors a specified process and restarts it if it goes down, so test.php never exits.
As described here, this "hanging" behavior occurs when you have processes that keep stdout or stderr open.
To prevent this from happening, each long-running process should:
be executed in the background, and
close both stdout and stderr or redirect them to files or /dev/null.
I would therefore make sure that any processes already running in the container, as well as the script passed to docker exec, conform to the above.
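The same fix can be shown from the launching side: if a script starts its long-running worker with stdin/stdout/stderr detached, docker exec has nothing left to wait on and returns as soon as the script itself exits. A minimal sketch, where `sleep` stands in for the real daemon:

```python
import subprocess

# Start the long-running process with all three standard streams
# detached; the launching script can then exit immediately, and
# `docker exec` is not kept alive by an inherited pipe.
proc = subprocess.Popen(
    ["sleep", "60"],         # stand-in for the real daemon
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    start_new_session=True,  # detach from our process group
)
print("launched daemon with pid", proc.pid)
```

This is the subprocess equivalent of the `>/tmp/test.log 2>&1 &` shell redirection shown earlier.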
OK, I got it.
docker stop a590382c2943
docker start a590382c2943
and then it's fine:
docker exec -ti a590382c2943 echo "5"
returns immediately, whether or not -it is added.
Actually, in my program the daemon held stdin, stdout and stderr open. So I changed my Python daemon as follows, and things work like a charm:
import os
import sys
import time

if __name__ == '__main__':
    # do the UNIX double-fork magic, see Stevens' "Advanced
    # Programming in the UNIX Environment" for details (ISBN 0201563177)
    try:
        pid = os.fork()
        if pid > 0:
            # exit first parent
            os._exit(0)
    except OSError as e:
        print("fork #1 failed: %d (%s)" % (e.errno, e.strerror))
        os._exit(0)

    # decouple from parent environment
    #os.chdir("/")
    os.setsid()
    os.umask(0)

    # redirect stdin/stdout/stderr to /dev/null
    si = open('/dev/null', 'r')
    so = open('/dev/null', 'a+')
    se = open('/dev/null', 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    # do second fork
    while True:
        try:
            pid = os.fork()
            if pid == 0:
                serve()
            if pid > 0:
                print("Server PID %d, Daemon PID: %d" % (pid, os.getpid()))
                os.wait()
                time.sleep(3)
        except OSError as e:
            #print("fork #2 failed: %d (%s)" % (e.errno, e.strerror))
            os._exit(0)