jstack and other tools on google cloud dataflow VMs - google-cloud-dataflow

Is there a way to run jstack on the VMs created for Dataflow jobs?
I'm trying to see where the job spends most of the CPU time and I can't find it installed.
Thanks,
G

A workaround which I found to work:
1) Log on to the machine.
2) Find the docker container that runs "python -m taskrunne" using sudo docker ps.
3) Connect to the container using sudo docker exec -i -t 9da88780f555 bash (replacing the container id with the one found in step 2).
4) Install openjdk-7-jdk using apt-get install openjdk-7-jdk.
5) Find the process id of the java executable.
6) Run /usr/bin/jstack 1437 (replacing 1437 with the PID found in step 5).
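Put together, the in-container sequence looks roughly like this (a sketch; the container id and PID are placeholders you read off the docker ps and ps output):
sudo docker ps                                       # step 2: note the container id
sudo docker exec -i -t <container-id> bash           # step 3: open a shell inside it
apt-get update && apt-get install -y openjdk-7-jdk   # step 4
ps aux | grep java                                   # step 5: note the PID of the java process
/usr/bin/jstack <pid> > /tmp/threads.txt             # step 6: write the thread dump to a file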

This GitHub issue update includes some basic instructions for getting profiles using the --enableProfilingAgent option.

This doesn't answer the "and other tools" part of your question, but:
Dataflow workers run a local http server that you can use to get some info. Instead of using jstack you can get a thread dump with this:
curl http://localhost:8081/threadz
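Since that server is local to the worker, you would typically ssh into the VM first (as described in the other answers) and run the curl there; a minimal sketch, assuming the default port:
gcutil ssh <worker-vm-name>                               # then, on the worker:
curl -s http://localhost:8081/threadz > /tmp/threadz.txt  # save the thread dump to a file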

I'm not familiar with jstack but based on a quick Google search it looks like jstack is a tool that runs independently from the JVM and just takes a PID. So you can do the following while your job is running.
ssh into one of the VMs using gcutil ssh
Install jstack on the VM.
Run ps -aux | grep java to identify the PID of the java process.
Run jstack using the PID you identified.
Would that work for you? Are you trying to run jstack from within your code so as to profile it automatically?
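If it is the manual route, the sequence would look roughly like this (a sketch, assuming a Debian-based worker image where jstack ships with the JDK package):
gcutil ssh <worker-vm-name>
sudo apt-get update && sudo apt-get install -y openjdk-7-jdk   # provides /usr/bin/jstack
ps aux | grep java                                             # note the PID of the java process
jstack <pid> > /tmp/stack.txt                                  # write the thread dump somewhere you can inspect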

Related

Detailed Step by Step for Installing Docker on Synology DS120j (aarch64)?

I have scrubbed the internet for detailed instructions on how to install Docker specifically on a Synology DS120j (aarch64, running DSM 7.1.4) and still need some help.
For confirmation I checked,
uname -m
aarch64
It looks to be possible to install Docker on this non-Intel machine, but so far the instructions I've read are not as specific (step by step) as I apparently need them to be, because something's not quite working.
The end use is installing and running Home Assistant on this machine (which requires Docker) as an alternative to a Raspberry Pi 4, because those are so hard to find and the DS120j seems to be an economical alternative (I have Homebridge successfully running on it and it's working great).
Though it looks like I was able to (sort of) install Docker, I cannot access the Docker GUI / it does not show up in my Package Center. I'm not sure how I can install Home Assistant without the Docker GUI.
So I'm not sure what I've done right and wrong along the way. I have tried multiple methods (from four posts on Stack Overflow) to get to this point, but I might need to start from scratch, which I'm completely prepared to do!
I have tried steps 1 - 6 from this Stack Overflow post, including trying the automatic script (and also a second automatic script from a link that was posted further down in the post replies).
Can I install Docker on arm8 based Synology Nas
Any detailed step-by-step instructions or insights would be greatly appreciated.
I have held off for two weeks posting about this, but am doing so now because there seem to be a lot of people buying these cheap DS120j NAS machines to run Homebridge / Home Assistant and other servers in place of Raspberry Pis.
Thank you!
M
This is one script that I have tried:
#!/bin/sh
#/bin/wget -O - https://raw.githubusercontent.com/wdmomoxx/catdriver/master/install-docker.sh | /bin/sh
/bin/wget https://raw.githubusercontent.com/wdmomoxx/catdriver/master/catdsm-docker.tgz
tar -xvpzf catdsm-docker.tgz -C /
rm catdsm-docker.tgz
PATH=/opt/sbin:/opt/bin:/opt/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
/opt/etc/init.d/S60dockerd
sudo docker run -d --network=host -v "/run/docker.sock:/var/run/docker.sock" portainer/portainer:linux-arm64
echo "猫盘群晖Docker安装完成"
echo "浏览器输入群晖IP:9000进入Docker UI"
I also ran:
sudo mkdir -p /volume1/#Docker/lib
sudo mkdir /docker
sudo mount -o bind "/volume1/#Docker/lib" /docker
There is also this tip from the comments in the above Stack Overflow post, which I do not understand:
"Then set the data-root in /etc/docker/daemon.json: { "data-root": "/docker" }"
I have also created a docker user group and added my username to the group, but I'm not sure if Docker is set up to associate the two.
However, when I ssh into the DiskStation and run
docker --version
I do get
Docker version 20.10.0, build 7287ab3
but I cannot seem to see or launch the Docker GUI.
I see there is talk on the net about using Portainer, but I'm not sure how to get that running either.
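For reference, my understanding from the script above is that the "Docker UI" it mentions is Portainer, started with this line and then opened in a browser at the NAS IP on port 9000 (I have not confirmed this):
sudo docker run -d --network=host -v "/run/docker.sock:/var/run/docker.sock" portainer/portainer:linux-arm64
# then browse to http://<NAS-IP>:9000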

How to execute command from one docker container to another

I'm creating an application that will allow users to upload video files that will then be put through some processing.
I have two containers.
Nginx container that serves the website where users can upload their video files.
Video processing container that has FFmpeg and some other processing stuff installed.
What I want to achieve: I need container 1 to be able to run a bash script on container 2.
One possibility as far as I can see is to make them communicate over HTTP via an API. But then I would need to install a web server in container 2 and write an API which seems a bit overkill.
I just want to execute a bash script.
Any suggestions?
You have a few options, but the first two that come to mind are:
In container 1, install the Docker CLI and bind mount /var/run/docker.sock (you need to specify the bind mount from the host when you start the container). Then, inside the container, you should be able to use docker commands against the bind-mounted socket as if you were executing them from the host (you might also need to chmod the socket inside the container to allow a non-root user to do this). A sketch of this appears at the end of this answer.
You could install SSHD on container 2, and then ssh in from container 1 and run your script. The advantage here is that you don't need to make any changes inside the containers to account for the fact that they are running in Docker and not bare metal. The down side is that you will need to add the SSHD setup to your Dockerfile or the startup scripts.
Most of the other ideas I can think of are just variants of option (2), with SSHD replaced by some other tool.
Also be aware that Docker networking is a little strange (at least on Mac hosts), so you need to make sure that the containers are using the same docker-network and are able to communicate over it.
Warning:
To be completely clear, do not use option 1 outside of a lab or very controlled dev environment. It is taking a socket that has full authority over the Docker runtime on the host and granting unchecked access to it from a container. Doing that makes it trivially easy to break out of the Docker sandbox and compromise the host system. About the only place I would consider it acceptable is as part of a full-stack integration test setup that will only be run ad hoc by a developer. It's a hack that can be a useful shortcut in some very specific situations, but the drawbacks cannot be overstated.
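For illustration only, and with the warning above firmly in mind, option 1 boils down to something like this (a sketch; the image, container, and script names are placeholders):
# on the host: start container 1 with the Docker socket bind mounted
docker run -d --name web -v /var/run/docker.sock:/var/run/docker.sock my-nginx-image

# inside container 1 (which needs the docker CLI installed in its image):
docker exec video-processor /opt/scripts/process.sh /videos/upload.mp4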
I wrote a Python package especially for this use case.
Flask-Shell2HTTP is a Flask extension to convert a command-line tool into a RESTful API with a mere 5 lines of code.
Example Code:
from flask import Flask
from flask_executor import Executor
from flask_shell2http import Shell2HTTP
app = Flask(__name__)
executor = Executor(app)
shell2http = Shell2HTTP(app=app, executor=executor, base_url_prefix="/commands/")
shell2http.register_command(endpoint="saythis", command_name="echo")
shell2http.register_command(endpoint="run", command_name="./myscript")
It can then be called easily like this:
$ curl -X POST -H 'Content-Type: application/json' -d '{"args": ["Hello", "World!"]}' http://localhost:4000/commands/saythis
You can use this to create RESTful micro-services that can execute pre-defined shell commands/scripts with dynamic arguments asynchronously and fetch the result.
It supports file upload, callback functions, reactive programming and more. I recommend you check out the Examples.
Running a docker command from a container is not straightforward and not really a good idea (in my opinion), because:
You'll need to install docker in the container (and do Docker-in-Docker stuff)
You'll need to share the unix socket, which is not a good thing if you don't know exactly what you're doing.
So, this leaves us with two solutions:
Install ssh in your container and execute the command through ssh
Share a volume and have a process that watches for something to trigger your batch (see the sketch below)
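The shared-volume option can be as simple as a trigger file - a sketch, with made-up paths and script names, assuming both containers mount the same volume at /shared:
# container 2: watch the shared volume for a trigger file
while true; do
  if [ -f /shared/run.trigger ]; then
    rm /shared/run.trigger
    /opt/scripts/process.sh         # your processing script
  fi
  sleep 5
done

# container 1: kick off the batch by creating the trigger file
touch /shared/run.trigger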
It was mentioned here before, but a reasonable, semi-hacky option is to install SSH in both containers and then use ssh to execute commands on the other container:
# install SSH, if you don't have it already
sudo apt install openssh-server
# start the ssh service
sudo service ssh start
# start the daemon
sudo /usr/sbin/sshd -D &
Assuming you don't want to always be root, you can add a default user (in this case, 'foobob'):
useradd -m --no-log-init --system --uid 1000 foobob -s /bin/bash -g sudo -G root
#change password
echo 'foobob:foobob' | chpasswd
Do this on both the source and target containers. Now you can execute a command from container_1 to container_2.
# obtain container-id of target container using 'docker ps'
ssh foobob@<container-id> << "EOL"
echo 'hello bob from container 1' > message.txt
EOL
You can automate the password with ssh-agent, or you can use something a bit more hacky with sshpass (install it first using sudo apt install sshpass):
sshpass -p 'foobob' ssh foobob@<container-id>
I believe
docker exec -it <container_name> <command>
should work, even inside the container.
You could also try to mount docker.sock into the container you want to execute the command from:
docker run -v /var/run/docker.sock:/var/run/docker.sock ...
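With the socket mounted (and the docker CLI installed in that image), the call from inside the container would then be something like:
docker exec <other-container-name> bash /path/to/script.sh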

Docker - How to test if a service is running during image creation

I'm pretty green regarding docker and find myself facing the following problem:
I'm trying to create a Dockerfile to generate an image with my company's software on it. During the installation of that software, the install process checks if ssh is running with the following command:
if [ $(pgrep sshd | wc -l) -eq 0 ]; then
I should probably clarify that I'm installing and starting OpenSSH during that same process.
Can you check at all whether a service is running during image creation?
I cannot skip that step, as it is executed as part of a self-extracting mechanism.
Any clue toward the right direction would be appreciated.
An image cannot run services. You are just creating all the necessary things needed for your container to run, like installing databases, servers, or copying some config files etc in the Dockerfile. The last step in the Dockerfile is where you can give instructions on what to do when you issue a docker run command. A script or command can be specified using CMD or ENTRYPOINT in the Dockerfile.
To answer your question, during the image creation process, you cannot check whether a service is running or not. When the container is started, docker will execute the script or command that you can specify in the CMD or ENTRYPOINT. You can use that script to check if your services are running or not and take necessary action after that.
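As a sketch of that idea (the script name and install command are illustrative, not from the question):
#!/bin/sh
# entrypoint.sh - start sshd, wait until it is running, then launch the installer
service ssh start
until pgrep sshd > /dev/null; do
  sleep 1
done
exec ./install
with ENTRYPOINT ["./entrypoint.sh"] as the last line of the Dockerfile.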
It is possible to run services during image creation. All processes are killed once a RUN command completes. A service will not keep running between RUN commands. However, each RUN command can start services and use them.
If an image creation command needs a service, start the service and then run the command that depends on the service, all in one RUN command.
RUN sudo service ssh start \
&& ssh localhost echo ok \
&& ./install
The first line starts the ssh server and succeeds with the server running.
The second line tests if the ssh server is up.
The third line is a placeholder: the 'install' command can use the localhost ssh server.
In case the service fails to start, the docker build command will fail.

Avoid docker exec zombie processes when connecting to containers via bash

Like most docker users, I periodically need to connect to a running container and execute various arbitrary commands via bash.
I'm using 17.06-CE with an ubuntu 16.04 image, and as far as I understand, the only way to do this without installing ssh into the container is via docker exec -it <container_name> bash
However, as is well-documented, for each bash shell process you generate, you leave a zombie process behind when your connection is interrupted. If you connect to your container often, you end up with 1000s of idle shells - a most undesirable outcome!
How can I ensure these zombie shell processes are killed upon disconnection - as they would be over ssh?
One way is to make sure a proper init process runs as PID 1 in your container.
In recent versions of Docker there is an --init option to docker run that should do this. It uses tini as the init process, which can also be used directly with previous versions.
Another option is something like the phusion-baseimage project that provides a base docker image with this capability and many others (might be overkill).
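For example (a sketch, using the image mentioned in the question):
docker run --init -it ubuntu:16.04 bash
# --init makes tini PID 1 inside the container, so defunct (zombie) processes get reaped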

Julia cluster using docker

I am trying to connect to docker containers using the default SSHManager.
These containers only have a running sshd, with public key authentication, and julia installed.
Here is my Dockerfile:
FROM rastasheep/ubuntu-sshd
RUN apt-get update && apt-get install -y julia
RUN mkdir -p /root/.ssh
ADD id_rsa.pub /root/.ssh/authorized_keys
I am running the container using:
sudo docker run -d -p 3333:22 -it --name julia-sshd julia-sshd
And then in the host machine, using the julia repl, I get the following error:
julia> import Base:SSHManager
julia> addprocs(["root@localhost:3333"])
stdin: is not a tty
Worker 2 terminated.
ERROR (unhandled task failure): EOFError: read end of file
Master process (id 1) could not connect within 60.0 seconds.
exiting.
I have tested that I can connect to the container via ssh without password.
I have also tested that in julia repl I can add a regular machine with julia installed to the cluster and it works fine.
But I cannot get these two things working together. Any help or suggestions will be appreciated.
I recommend you to also deploy the Master in a Docker container. It makes your environment easily and fully reproducible.
I'm working on a way of deploying Workers in Docker containers on demand, i.e. the Master deployed in a container can deploy further DockerizedJuliaWorkers. It is similar to https://github.com/gsd-ufal/Infra.jl but assumes that Master and Workers run on the same host, which makes things less hard.
It is ongoing work and I plan to finish in the next few weeks. In a nutshell:
1) You'll need a simple DockerBackend and a wrapper to transparently run containers, set up SSH, and call addprocs with all the low-level parameters (i.e., the DockerizedJuliaWorker.jl file):
https://github.com/NaelsonDouglas/DistributedMachineLearningThesis/tree/master/src/docker
2) Read here how to build the Docker image (Dockerfile is included):
https://github.com/NaelsonDouglas/DistributedMachineLearningThesis
Please tell me if you have any suggestion on how to improve it.
Best,
André Lage.
