Cross-site dupe (no answer).
Basically I have to set up incremental backup system for mariadb. Everything is living in docker containers, and I'm trying to use wal-g (similar to how easy I can set it up with postgres and WAL archiving) together with mariabackup. Backup task is executed within separate docker container based on mariadb:10.8.3-jammy image with wal-g binary added (I'm running wal-g backup-push). During debugging I use exporting to data folder (mounted as docker volume to /bak).
I'm presenting more details below, but let me begin with the actual problem: incremental backups are large. If I disable all DB clients and repeat the same backup a few times in a row (so nothing changes between them), initial backup is approx. 30
Mb and all consequent incremental backups are 30 Mb each.
# Initial
$ sudo docker compose run --rm db_backup wal-g backup-push
[+] Running 1/0
⠿ Container reporting-db-1 Running 0.0s
INFO: 2022/11/20 19:24:01.865762 FILE PATH: stream_20221120T192400Z/stream.lz4
INFO: 2022/11/20 19:24:01.865984 Backup sentinel: {"StartLocalTime":"2022-11-20T19:24:00.366507Z","StopLocalTime":"2022-11-20T19:24:01.86589Z","UncompressedSize":350375370,"CompressedSize":227476734,"Hostname":"b08f12bbbb68"}
# Incremental
$ sudo docker compose run --rm db_backup wal-g backup-push
[+] Running 1/0
⠿ Container reporting-db-1 Running 0.0s
INFO: 2022/11/20 19:24:07.109202 FILE PATH: stream_20221120T192405Z/stream.lz4
INFO: 2022/11/20 19:24:07.109509 Backup sentinel: {"StartLocalTime":"2022-11-20T19:24:05.639259Z","StopLocalTime":"2022-11-20T19:24:07.109321Z","UncompressedSize":28104928,"CompressedSize":14609954,"Hostname":"3d9854550254"}
If I provide an empty database as data source, then the result is much more comical: both initial and incremental backups are of the same size (+- a few Kb).
I supposed that mariabackup dumps system tables, which use Aria storage engine and do not support incremental backup (source). However, manual verification seems to disagree: I tried converting all mysql.* tables to InooDB (ALTER TABLE ... engine=innodb) and re-creating the backup, getting 1.5x larger incremental dumps (though compressed results are slightly better).
# Initial
$ sudo docker compose run --rm db_backup wal-g backup-push
[+] Running 1/0
⠿ Container reporting-db-1 Running 0.0s
INFO: 2022/11/20 18:56:06.254782 FILE PATH: stream_20221120T185602Z/stream.lz4
INFO: 2022/11/20 18:56:06.254984 Backup sentinel: {"StartLocalTime":"2022-11-20T18:56:02.328882Z","StopLocalTime":"2022-11-20T18:56:06.254911Z","UncompressedSize":366630153,"CompressedSize":229793575,"Hostname":"843e27ceff9e"}
# Incremental 1
$ sudo docker compose run --rm db_backup wal-g backup-push
[+] Running 1/0
⠿ Container reporting-db-1 Running 0.0s
INFO: 2022/11/20 18:56:15.123822 FILE PATH: stream_20221120T185610Z/stream.lz4
INFO: 2022/11/20 18:56:15.124035 Backup sentinel: {"StartLocalTime":"2022-11-20T18:56:10.974406Z","StopLocalTime":"2022-11-20T18:56:15.12395Z","UncompressedSize":44359711,"CompressedSize":13798326,"Hostname":"26a6e38afb59"}
# Incremental 2
$ sudo docker compose run --rm db_backup wal-g backup-push
[+] Running 1/0
⠿ Container reporting-db-1 Running 0.0s
INFO: 2022/11/20 19:06:57.626672 FILE PATH: stream_20221120T190653Z/stream.lz4
INFO: 2022/11/20 19:06:57.626904 Backup sentinel: {"StartLocalTime":"2022-11-20T19:06:53.667772Z","StopLocalTime":"2022-11-20T19:06:57.626823Z","UncompressedSize":44360402,"CompressedSize":13799019,"Hostname":"184342c39610"}
Unfortunately, I cannot ask my client to switch to convenient DBMS and have to do something here. I do not want to store these 30 Mb for every backup, if it can be avoided.
Is my reasoning ok? What else can cause this weird behaviour?
Can I just convert all system tables to InnoDB? I found evidence that it can be harmful on mysql 5.7, but cannot find more recent references. It'd resolve the problem, because then everything will support incremental backup properly. (Dupe? Not really).
Are there any alternative backup solutions that can handle described situation better?
Can I give up and prevent mariabackup from backing system tables up? I doubt it can be a viable solution (because the more complete the backup is, the easier to live with it), but may be wrong.
Side questions:
How can I examine the binary stream outputted by mariabackup somehow to confirm that system tables are the actual problem (and perhaps find out which tables exactly)?
What can cause the dump size fluctuations? Whenever I run multiple incremental backups in a row, the compressed and uncompressed size is slightly different every time (it can either increase or decrease compared to the previous run). Backup should be a deterministic process, and all actions above are performed on the same database without any modifications in between (I started from local dump which was loaded to new mariadb container with clean volumes, and no clients have access to that instance, so nothing should differ) - then why do I observe this? I checked with mariabackup without wal-g tool, and the size is stable then. It is introduced on some higher level, and this is less interesting.
Everything described above reproduces with plain mariabackup as well, generating approx. 27 Mb files per incremental backup.
mariabackup wrapper script:
last_lsns=$(ls /bak/lsns/ | sort -rn | head -n1)
if [ -n "$last_lsns" ]; then
ex="/bak/lsns/lsn_$(date +%s)"
mkdir -p "$ex"
mariabackup -H"$WEB_DB_HOST" -uroot -p"$MYSQL_ROOT_PASSWORD" --backup \
--stream=xbstream --datadir=/var/lib/mysql \
--incremental-basedir=/bak/lsns/$last_lsns --extra-lsndir=$ex
else
mkdir -p /bak/lsns/initial
mariabackup -H$WEB_DB_HOST -uroot -p"$MYSQL_ROOT_PASSWORD" --backup \
--stream=xbstream --datadir=/var/lib/mysql \
--extra-lsndir=/bak/lsns/initial
fi
This script is used as WALG_STREAM_CREATE_COMMAND. I have also
WALG_MYSQL_DATASOURCE_NAME='root:$MYSQL_ROOT_PASSWORD#tcp($REPORTING_DB_HOST:$REPORTING_DB_PORT)/$REPORTING_DATABASE'
WALG_FILE_PREFIX='/bak/foo'
and these settings (in fact written in compose file, but it's probably not important) seem to be correct (backups are created as expected ad written to proper directories).
Here are storage types used:
> select table_schema, table_name, engine from information_schema.tables where table_schema <> 'performance_schema' and engine <> 'MEMORY';
+--------------------+---------------------------+--------+
| table_schema | table_name | engine |
+--------------------+---------------------------+--------+
| information_schema | ALL_PLUGINS | Aria |
| information_schema | CHECK_CONSTRAINTS | Aria |
| information_schema | COLUMNS | Aria |
| information_schema | EVENTS | Aria |
| information_schema | OPTIMIZER_TRACE | Aria |
| information_schema | PARAMETERS | Aria |
| information_schema | PARTITIONS | Aria |
| information_schema | PLUGINS | Aria |
| information_schema | PROCESSLIST | Aria |
| information_schema | ROUTINES | Aria |
| information_schema | SYSTEM_VARIABLES | Aria |
| information_schema | TRIGGERS | Aria |
| information_schema | VIEWS | Aria |
| mysql | slow_log | CSV |
| mysql | db | Aria |
| mysql | help_relation | Aria |
| mysql | general_log | CSV |
| mysql | innodb_index_stats | InnoDB |
| mysql | servers | Aria |
| mysql | time_zone_transition_type | Aria |
| mysql | gtid_slave_pos | InnoDB |
| mysql | time_zone | Aria |
| mysql | roles_mapping | Aria |
| mysql | transaction_registry | InnoDB |
| mysql | procs_priv | Aria |
| mysql | proxies_priv | Aria |
| mysql | global_priv | Aria |
| mysql | func | Aria |
| mysql | innodb_table_stats | InnoDB |
| mysql | help_topic | Aria |
| mysql | time_zone_leap_second | Aria |
| mysql | help_keyword | Aria |
| mysql | time_zone_transition | Aria |
| mysql | event | Aria |
| mysql | columns_priv | Aria |
| mysql | tables_priv | Aria |
| mysql | time_zone_name | Aria |
| mysql | plugin | Aria |
| mysql | table_stats | Aria |
| mysql | index_stats | Aria |
| mysql | proc | Aria |
| mysql | help_category | Aria |
| mysql | column_stats | Aria |
| test_reporting | merchant_configs | InnoDB |
| test_reporting | masterdata_prediction | InnoDB |
| test_reporting | aggregator_config | InnoDB |
| test_reporting | masterdata | InnoDB |
| test_reporting | fixed_costs | InnoDB |
| test_reporting | timeline | InnoDB |
| test_reporting | migrations | InnoDB |
| test_reporting | user_analytics | InnoDB |
| test_reporting | affiliate | InnoDB |
| test_reporting | vat_config | InnoDB |
| sys | sys_config | Aria |
| reporting | merchant_configs | InnoDB |
| reporting | masterdata_prediction | InnoDB |
| reporting | aggregator_config | InnoDB |
| reporting | masterdata | InnoDB |
| reporting | fixed_costs | InnoDB |
| reporting | timeline | InnoDB |
| reporting | migrations | InnoDB |
| reporting | user_analytics | InnoDB |
| reporting | affiliate | InnoDB |
| reporting | vat_config | InnoDB |
| reporting_web | record_change | InnoDB |
| reporting_web | jwt_expiry | InnoDB |
| reporting_web | forecasting | InnoDB |
| reporting_web | users | InnoDB |
| reporting_web | alembic_version | InnoDB |
| reporting_web | merchant_analytics | InnoDB |
| reporting_web | email_config | InnoDB |
| reporting_web | user_company | InnoDB |
| reporting_web | user_detail | InnoDB |
+--------------------+---------------------------+--------+
As referenced MDEV-23614, Aria system tables cannot be incrementally backed up.
As you saw, system tables can be changed to InnoDB in 10.4+. Azure do this by default.
Two small issues prevent this being default under --enforce-storage-engine=InnoDB, both related to help tables:
CREATE TABLE IF NOT EXISTS help_relation has a FK reference to help_keyword, but help_keyword isn't created (easy fix, swap order in mysql_system_tables.sql)
In fill_help_tables.sql, lock tables help_topic write, help_category write.. once its InnoDB, returns ER_WRONG_LOCK_OF_SYSTEM_TABLE, can be commented out, but really is a bug that needs reporting/fixing.
To save space, help table and the proc tables are the biggest ones.
help tables, are optional for HELP command syntax. And could be truncated/removed.
proc, by default, contains a bunch of gis functions you may not need.
both could be converted to InnoDB.
Alternatives:
Use binary logs as a PITR alternate mechanism and perform less incremental mariabackups further apart.
Contribute a patch to mariabackup.
Related
I have two containers running in a swarm. Each exposes a /stats endpoint which I am trying to scrape.
However, using the swarm port obviously results in the queries being load balanced and therefore the stats are all intermingled:
+--------------------------------------------------+
| Server |
| +-------------+ +-------------+ |
| | | | | |
| | Container A | | Container B | |
| | | | | |
| +-------------+ +-------------+ |
| \ / |
| \ / |
| +--------------+ |
| | | |
| | Swarm Router | |
| | | |
| +--------------+ |
| v |
+-------------------------|------------------------+
|
A Stats
B Stats
A Stats
B Stats
|
v
I want to keep the load balancer for application requests, but also need a direct way to make requests to each container to scrape the stats.
+--------------------------------------------------+
| Server |
| +-------------+ +-------------+ |
| | | | | |
| | Container A | | Container B | |
| | | | | |
| +-------------+ +-------------+ |
| | \ / | |
| | \ / | |
| | +--------------+ | |
| | | | | |
| | | Swarm Router | | |
| v | | v |
| | +--------------+ | |
| | | | |
+--------|----------------|----------------|-------+
| | |
A Stats | B Stats
A Stats Normal Traffic B Stats
A Stats | B Stats
| | |
| | |
v | v
A dynamic solution would be ideal, but since I don't intend to do any dynamic scaling something like hardcoded ports for each container would be fine:
::8080 Both containers via load balancer
::8081 Direct access to container A
::8082 Direct access to container B
Can this be done with swarm?
From inside an overlay network you can get IP-addresses of all replicas with tasks.<service_name> DNS query:
; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> -tA tasks.foo_test
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19860
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;tasks.foo_test. IN A
;; ANSWER SECTION:
tasks.foo_test. 600 IN A 10.0.1.3
tasks.foo_test. 600 IN A 10.0.1.5
tasks.foo_test. 600 IN A 10.0.1.6
This is mentioned in the documentation.
Also, if you use Prometheus to scrape those endpoints for metrics, you can combine the above with dns_sd_configs to set the targets to scrape (here is an article how). This is easy to get running but somewhat limited in features (especially in large environments).
A more advanced way to achieve the same is to use dockerswarm_sd_config (docs, example configuration). This way the list of endpoints will be gathered by querying Docker daemon, along with some useful labels (i.e. node name, service name, custom labels).
While less than ideal, you can introduce a microservice that acts as an intermediary to the other containers that are exposing /stats. This microservice would have to be configured with the individual endpoints and operate in the same network as said endpoints.
This doesn't bypass the load balancer, but instead makes it so it does not matter.
The intermediary could roll-up the information or you could make it more sophisticated by passing a list of opaque identifiers which the caller can then use to individually query the intermediary.
It is slightly "anti-pattern" in the sense that you have a highly coupled "stats" proxy that must be configured to be able to hit each endpoint.
That said, it is good in the sense that you don't have to expose individual containers outside of the proxy. From a security perspective, this may be better because you're not leaking additional information out of your swarm.
You can try to publish a specific container port on a host machine
,add to your services:
ports:
- target: 8081
published: 8081
protocol: tcp
mode: host
Background: I have a setup where many different scalable services connect to their databases via a connection pool (per instance). These services run within a Docker Swarm.
In my current database setup, this ends up looking as follows (using PostgreSQL in this example):
PID | Database | User | Application | Client | ...
... | db1 | app | --standard JDBC driver string-- | x | ...
... | db1 | app | --standard JDBC driver string-- | y | ...
... | db1 | app | --standard JDBC driver string-- | y | ...
... | ... | app | ... | x | ...
... | ... | app | ... | x | ...
... | db2 | app | --standard JDBC driver string-- | y | ...
... | db2 | app | --standard JDBC driver string-- | y | ...
... | ... | app | ... | x | ...
... | ... | app | ... | x | ...
What I would like to do, effectively, is provide the current Docker Swarm container name, including scaling identifier, to the DBMS to be able to better monitor the database connections, i.e.:
PID | Database | User | Application | Client | ...
... | db1 | app | books-service.1 | x | ...
... | db1 | app | books-service.2 | y | ...
... | db1 | app | books-service.3 | y | ...
... | ... | app | ... | x | ...
... | ... | app | ... | x | ...
... | db2 | app | checkout-service.2 | y | ...
... | db2 | app | checkout-service.2 | y | ...
... | ... | app | ... | x | ...
... | ... | app | ... | x | ...
(obviously, setting the connection string is trivial - it's getting the information to set that is the issue)
Since my applications are managed by Docker Swarm (and sometime in the future, likely Kubernetes), I cannot manually set this value via environment (as I do not perform a docker run).
When running docker ps on a given Swarm node, I see the following output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
59cdbf724091 my-docker-registry.net/books-service:latest "/bin/sh -c '/usr/bi…" 7 minutes ago Up 7 minutes books-service.1.zabvo1jal0h2xya9qfftnrnej
0eeeee15e92a my-docker-registry.net/checkout-service:latest "/bin/sh -c 'exec /u…" 8 minutes ago Up 8 minutes checkout-service.2.189s7d09m0q86y7fdf3wpy0vc
Of note is the NAMES column, which includes an identifier for the actual instance of the given container (or image, however you'd prefer to look at it). Do note that this name is not the hostname of the container, which by default is the container ID.
I know there are ways to determine if an application is running inside Docker (e.g. using /proc/1/cgroup), but that doesn't help me either as those also only list the container ID.
Is there a way to get this value from inside a Docker container that is being run in a swarm?
What you need (books-service.1) is a combination of a swarm service name and a task slot. Both of these can be passed to the container as environment variables, as well as a full task name (books-service.1.zabvo1jal0h2xya9qfftnrnej):
version: "3.0"
services:
test:
image: debian:buster
command: cat /dev/stdout
environment:
SERVICE_NAME: '{{ .Service.Name }}'
TASK_SLOT: '{{ .Task.Slot }}'
TASK_NAME: "{{ .Task.Name }}"
After passing these you can either strip the TASK_NAME to get what you need or combine the SERVICE_NAME with the TASK_SLOT. Come to think of it, you can combine them right in the template:
version: "3.0"
services:
test:
image: debian:buster
command: cat /dev/stdout
environment:
MY_NAME: '{{ .Service.Name }}.{{.Task.Slot}}'
Other possible placeholders can be found here.
i did an upgrade on my server for my Docker MARIADB with:
docker-compose pull
docker-compose up -d
My version before:
Server version: 10.2.14-MariaDB-10.2.14+maria~jessie mariadb.org binary distribution
SHOW VARIABLES LIKE "%version%";
+-------------------------+--------------------------------------+
| Variable_name | Value |
+-------------------------+--------------------------------------+
| innodb_version | 5.7.21 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 10.2.14-MariaDB-10.2.14+maria~jessie |
| version_comment | mariadb.org binary distribution |
| version_compile_machine | x86_64 |
| version_compile_os | debian-linux-gnu |
| version_malloc_library | system |
| version_ssl_library | OpenSSL 1.0.1t 3 May 2016 |
| wsrep_patch_version | wsrep_25.23 |
+-------------------------+--------------------------------------+
My version now:
Server version: 10.3.9-MariaDB-1:10.3.9+maria~bionic mariadb.org binary distribution
+---------------------------------+------------------------------------------+
| Variable_name | Value |
+---------------------------------+------------------------------------------+
| innodb_version | 10.3.9 |
| protocol_version | 10 |
| slave_type_conversions | |
| system_versioning_alter_history | ERROR |
| system_versioning_asof | DEFAULT |
| version | 10.3.9-MariaDB-1:10.3.9+maria~bionic |
| version_comment | mariadb.org binary distribution |
| version_compile_machine | x86_64 |
| version_compile_os | debian-linux-gnu |
| version_malloc_library | system |
| version_source_revision | ca26f91bcaa21933147974c823852a2e1c2e2bd7 |
| version_ssl_library | OpenSSL 1.1.0g 2 Nov 2017 |
| wsrep_patch_version | wsrep_25.23 |
+---------------------------------+------------------------------------------+
So it seems it was a upgrade from 10.2 to 10.3.
Upgrading from MariaDB 10.2 to MariaDB 10.3
Now i get the following error in "docker-compose logs"
2018-09-28 13:03:38 0 [Warning] InnoDB: Table mysql/innodb_table_stats has length mismatch in the column name table_name. Please run mysql_upgrade
2018-09-28 13:03:38 0 [Warning] InnoDB: Table mysql/innodb_index_stats has length mismatch in the column name table_name. Please run mysql_upgrade
The database is working as expected. What to do to get rid of this error?
While I was writing the question I was able to fix it myself. If you also facing this problem:
connect to the docker database container:
docker exec -u 0 -i -t CONTAINER_NAME /bin/bash
run mysql_upgrade like written in the error message:
mysql_upgrade --user=root --password=xxyy --host=localhost
I did a restart of the docker compose with:
docker-compose stop
docker-compose start
I found numerous sites explaining ssh port forwarding, ssh reverse proxy, ssh multiplexing etc. involving sshpiper, sslh, running a ssh socks server, just configuring the local SSH server an so on..
so I'm quite puzzled right now and might ask a very common and/or simple question:
As you might already guess from the title I want to set up a git server (GitLab) inside a docker container listening for SSH connections on port 22 without having to use a different port for default ssh operations (terminal, scp, etc..) on the host (as suggested here)
I.e.
ssh alice#myserver.org should still be possible as well as
git clone git#myserver.com:path/to/project
and I don't want to do any setup on the client computer
If you prefer a picture:
+------ myserver.org --------+
| +----+ +- typical -+ |
+--------+ alice#myserver.org:22 | | | | SSH | |
| client | ----------------------> -+--+----+---->| service | |
+--------+ all names but `git` | | ? | +-----------+ |
| | | |
| | ? | +- docker --+ |
+--------+ git#myserver.org:22 | | | | with | |
| client | ----------------------> -+--+----+---->| GitLab | |
+--------+ only user `git` | | | | | |
| +----+ +-----------+ |
+----------------------------+
Can you tell me what's the recommended/most common way to do this? This question sounds promising but the answer seems to configure the client (which I want to avoid)
This project may help you.
https://github.com/tg123/sshpiper.
SSH Piper works as a proxy-like ware, and route connections by username, src ip , etc.
+---------+ +------------------+ +-----------------+
| | | | | |
| Bob +----ssh -l bob----+ | SSH Piper +-------------> Bob' machine |
| | | | | | | |
+---------+ | | | | +-----------------+
+---> pipe-by-name--+ |
+---------+ | | | | +-----------------+
| | | | | | | |
| Alice +----ssh -l alice--+ | +-------------> Alice' machine |
| | | | | |
+---------+ +------------------+ +-----------------+
Downstream SSH Piper Upstream
First of all, thanks for reading TheDockerExperts blog , hope our articles help you! Let me explain how we do SSH proxy in our company.
We have HAproxy that listens TCP 22 port and sends traffic to GitLab server, on host we have custom SSH port. Unfortunately as we use TCP balancing in this case, there is no way to create balancer based on domain names and users. You can take small VPS , spin up HAproxy on it and use it to balance your GIT traffic.
Hope this will help you!
I may be asking a very beginner level question but I need a way to distinguish process under docker and that under non-docker in a box. The 'ps' command command output gives me a feeling that process is running in linux box and cannot confirm if same is under hood of docker.
In the same context is it possible / feasible that process under docker be started with docker root file system.
Is the same feasible or there any other solution for same?
You can identify Docker process via the process tree on the Docker host (or on the VM if using docker for mac/windows)
The parent process to 2924(haproxy) is 2902
The parent process to 2902(haproxy-start) is 2881
2881 will be docker-container which is managed by a dockerd process
To view your process listing in a tree format use ps -ejH or pstree (available in the psmisc package)
To get a quick list of whats running under dockerd
/ # pstree $(pgrep dockerd)
dockerd-+-docker-containe-+-docker-containe-+-java---17*[{java}]
| | `-8*[{docker-containe}]
| |-docker-containe-+-sinopia-+-4*[{V8 WorkerThread}]
| | | |-{node}
| | | `-4*[{sinopia}]
| | `-8*[{docker-containe}]
| |-docker-containe-+-node-+-4*[{V8 WorkerThread}]
| | | `-{node}
| | `-8*[{docker-containe}]
| |-docker-containe-+-tinydns
| | `-8*[{docker-containe}]
| |-docker-containe-+-dnscache
| | `-8*[{docker-containe}]
| |-docker-containe-+-apt-cacher-ng
| | `-8*[{docker-containe}]
| `-20*[{docker-containe}]
|-2*[docker-proxy---6*[{docker-proxy}]]
|-docker-proxy---5*[{docker-proxy}]
|-2*[docker-proxy---4*[{docker-proxy}]]
|-docker-proxy---8*[{docker-proxy}]
`-28*[{dockerd}]
Show the parents of a PID (-s)
/ # pstree -aps 3744
init,1
`-dockerd,1721 --pidfile=/run/docker.pid -H unix:///var/run/docker.sock --swarm-default-advertise-addr=eth0
`-docker-containe,1728 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim ...
`-docker-containe,3711 8d923b3235eb963b735fda847b745d5629904ccef1245d4592cc986b3b9b384a...
`-java,3744 -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp/zookeeper/bin/../build/cl
|-{java},4174
|-{java},4175
|-{java},4176
|-{java},4177
|-{java},4190
|-{java},4208
|-{java},4209
|-{java},4327
|-{java},4328
|-{java},4329
|-{java},4330
|-{java},4390
|-{java},4416
|-{java},4617
|-{java},4625
|-{java},4629
`-{java},4632
Show all children of docker, including namespace changes (-S):
/ # pstree -apS $(pgrep dockerd)
dockerd,1721 --pidfile=/run/docker.pid -H unix:///var/run/docker.sock --swarm-default-advertise-addr=eth0
|-docker-containe,1728 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim ...
| |-docker-containe,3711 8d923b3235eb963b735fda847b745d5629904ccef1245d4592cc986b3b9b384a...
| | |-java,3744,ipc,mnt,net,pid,uts -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp/zookeeper/bin/../build/cl
| | | |-{java},4174
| | | |-{java},4175
| | | |-{java},4629
| | | `-{java},4632
| | |-{docker-containe},3712
| | `-{docker-containe},4152
| |-docker-containe,3806 49125f8274242a5ae244ffbca121f354c620355186875617d43876bcde619732...
| | |-sinopia,3841,ipc,mnt,net,pid,uts
| | | |-{V8 WorkerThread},4063
| | | |-{V8 WorkerThread},4064
| | | |-{V8 WorkerThread},4065
| | | |-{V8 WorkerThread},4066
| | | |-{node},4062
| | | |-{sinopia},4333
| | | |-{sinopia},4334
| | | |-{sinopia},4335
| | | `-{sinopia},4336
| | |-{docker-containe},3814
| | `-{docker-containe},4038
| |-docker-containe,3846 2a756d94c52d934ba729927b0354014f11da6319eff4d35880a30e72e033c05d...
| | |-node,3910,ipc,mnt,net,pid,uts lib/dnsd.js
| | | |-{V8 WorkerThread},4204
| | | |-{V8 WorkerThread},4205
| | | |-{V8 WorkerThread},4206
| | | |-{V8 WorkerThread},4207
| | | `-{node},4203
The command lxc-ls and the command lxc-ps may be installable on your Linux distribution. This will allow you to list the running LXC containers and the processes running within those containers respectively. You should be able to link the output from lxc-ls to lxc-ps using streams and get a list of all containerized processes.
The big caveat is that you specified Docker and not every Docker instance is running on LXC nor is it necessarily a localhost process. Docker defines an API that can be called to list remote Docker instances, so this technique will not help with enumerating processes on remote machines as well.
In windows docker behave little bit different.
It's processes are not run as child of parent process, but running as separate process on the host.
They can be viewed by (for example), powershell, like
Get-Process powershell
For example, getting processes on the host when running microsoft/iis container will include additional powershell process (since ms/iis container runs powershell as a main executable process).