Flume data transfer monitoring with file name

I am monitoring flume using below command
flume-ng agent -n agent_name -c conf -f conf/config.conf -Dflume.monitoring.type=http -Dflume.monitoring.port=34545
I am not getting the file name in the metrics.
How do I monitor both the source and the sink with the file name and other details?
How do I store the results in a database?
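With the HTTP monitoring type used above, Flume exposes its counters as JSON at http://<host>:34545/metrics. A rough sketch for polling that endpoint and appending the raw JSON to a database could look like the following; the sqlite3 table, localhost address, and 60-second interval are assumptions, and the built-in counters are per-component (events received, channel size, etc.), so they still will not include file names:
# Hypothetical polling loop: fetch the Flume metrics JSON and append it to a local SQLite table
while true; do
  metrics=$(curl -s http://localhost:34545/metrics)
  # Escape single quotes so the JSON can be embedded in a SQL string literal
  escaped=$(printf '%s' "$metrics" | sed "s/'/''/g")
  sqlite3 flume_metrics.db "CREATE TABLE IF NOT EXISTS metrics (ts TEXT, json TEXT); INSERT INTO metrics VALUES (datetime('now'), '$escaped');"
  sleep 60
done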

Related

Copy file from localhost to docker container on remote server

I have a large file on my laptop (localhost). I would like to copy this file to a docker container which is located on a remote server. I know how to do it in two steps, i.e. I first copy the file to my remote server and then I copy the file from remote server to the docker container. But, for obvious reasons, I want to avoid this.
A similar question which has a complicated answer is covered here: Copy file from remote docker container
However, in that question the direction is reversed: the file is copied from the remote container to localhost.
Additional request: is it possible that this upload can be done piece-wise or that in case of a network failure I can resume the upload from where it stopped, instead of having to upload the entire file again? I ask because the file is fairly large, ~13GB.
From https://docs.docker.com/engine/reference/commandline/cp/#corner-cases and https://www.cyberciti.biz/faq/howto-use-tar-command-through-network-over-ssh-session/ you would just do:
tar Ccf $(dirname SRC_PATH) - $(basename SRC_PATH) | ssh you@host docker exec -i CONTAINER tar Cxf DEST_PATH -
or
tar Ccf $(dirname SRC_PATH) - $(basename SRC_PATH) | ssh you@host docker cp - CONTAINER:DEST_PATH
Or untested, no idea if this works:
DOCKER_HOST=ssh://you@host docker cp SRC_PATH CONTAINER:DEST_PATH
This will work if you are running a *nix server and a container with an SSH server in it.
You can create a local tunnel on the remote server by following these steps:
mkfifo host_to_docker
netcat -lkp your_public_port < host_to_docker | nc docker_ip_address 22 > host_to_docker &
The first command creates a named pipe, which you can verify with file host_to_docker.
The second uses netcat, the greatest network utility of all time: it accepts a TCP connection and forwards it to another netcat instance, relaying the underlying SSH traffic to the SSH server running in the container and writing its responses back to the pipe we created.
The last step is:
scp -P your_public_port payload.tar.gz user@remote_host:/dest/folder
You can use the DOCKER_HOST environment variable and rsync to achieve your goal.
First, you set DOCKER_HOST, which causes your Docker client (i.e., the docker CLI utility) to connect to the remote server's Docker daemon over SSH. This probably requires you to create an ssh-config entry for the destination server.
export DOCKER_HOST="ssh://<your-host-name>"
Next, you can use docker exec in conjunction with rsync to copy your data into the target container. This requires you to obtain the container ID via, e.g., docker ps. Note that rsync must be installed in the container.
rsync -ar -e 'docker exec -i' <local-source-path> <container-id>:/<destination-in-the-container>
Since rsync is used, this will also allow you to resume interrupted uploads later, provided the appropriate flags are used.
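For the ~13GB resume requirement from the question, the relevant rsync flags would be something like this (untested over the docker exec transport, but the flags themselves are standard rsync):
# --partial keeps partially transferred files instead of deleting them on failure,
# --append-verify resumes such a file by appending and re-checksumming the existing part,
# --progress shows per-file progress for the large upload
rsync -ar --partial --append-verify --progress -e 'docker exec -i' <local-source-path> <container-id>:/<destination-in-the-container>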

How do I automate backing up MYSQL database container

I have a MySQL database container running on a CentOS server. How do I automate backing up the database outside the container?
As mentioned in the comments, make sure you use volumes for the data folder.
For backing up:
Create a bash script on the host machine and make it executable:
#!/bin/bash
# Timestamp used in the backup file name
DATE=$(date '+%Y-%m-%d_%H-%M-%S')
# Dump all databases from the container into a dated .sql file on the host
docker exec <container name> /usr/bin/mysqldump -u <putdatabaseusername here> -p<PutyourpasswordHere> --all-databases > /<path to desired backup location>/$DATE.sql
# If the dump succeeded, delete backups older than 10 days
if [[ $? == 0 ]]; then
    find /<path to desired backup location>/ -mtime +10 -exec rm {} \;
fi
Change the following:
<container name> to the actual db container name
<putdatabaseusername here> to the db user
<PutyourpasswordHere> to the db password
Create a directory for the backup files and replace /<path to desired backup location>/ with the actual path.
Create a cronjob on the host machine that executes the script at the desired time/period (see the example entry below).
Note that this script retains backups for 10 days; change the number to reflect your needs.
Important note: this script stores the password in the file, so use a secure way in production.
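For reference, a crontab entry along these lines would run the script nightly; the script path, schedule, and log file here are placeholders, not values from the answer:
# Run the backup script every day at 02:00 (edit the host crontab with: crontab -e)
0 2 * * * /path/to/backup-script.sh >> /var/log/mysql-backup.log 2>&1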

Where should I store log files of an application running in Container(docker/kubernetes)?

I have to port an application (C/C++) to a Docker container. The application writes log files to the file system.
How and where should I store log files when the application runs in a container?
You have a couple of options:
Have your application write to /dev/stdout and/or /dev/stderr.
Write to a log file on the filesystem.
What it boils down to is "log collection" and how you want to go about collecting this data. Logging to stdout and stderr may be the simplest, but it doesn't help you if you want to acquire and analyze very specific log data about your application.
Docker supports multiple "log drivers" which help extract, transform, and load ("ETL") the log data to an external source. See: https://docs.docker.com/config/containers/logging/configure/
For all of my applications I use Fluentd and the docker log collector so I can do "smart" things with the data (https://fluentd.org).
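As an illustration of the log-driver route, a run command could look like this; the image name, tag, and Fluentd address are placeholders, not something from the answer:
# Send this container's stdout/stderr to a Fluentd daemon instead of the default json-file driver
docker run --log-driver=fluentd --log-opt fluentd-address=localhost:24224 --log-opt tag="myapp" myapp-image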
I would recommend the following for writing the application's log files so that they also appear on stdout and stderr when you run the command
$ docker logs <container-name>
This requires a change in the Dockerfile, and it is what you can see in most official images such as nginx and httpd.
# forward request and error logs to docker log collector
RUN ln -sf /dev/stdout /var/log/<app-name>/access.log \
&& ln -sf /dev/stderr /var/log/<app-name>/error.log

How do you get the swap space configured for Docker for Mac via command line?

Docker for Mac has a GUI interface for setting the cpu, ram, and swap values for the Hypervisor which hosts containers. All 3 of these settings are visible by opening up the docker menu and going to Preferences -> Advanced.
In addition, the CPU and RAM settings can be parsed from the output of the docker info command, but the swap information is not listed there. Nor could I find any other docker CLI utilities that output the swap setting info for Docker for Mac.
How do you obtain this swap setting info via command line tools?
Docker for Mac stores the settings in an iso file located at: ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/config.iso.
If Docker for Mac is running, this file will be in use and cannot be mounted using hdiutil, so the following sequence of commands is suggested to get the swap settings:
cp ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/config.iso /tmp/config.iso
# Mount the iso image
hdiutil mount /tmp/config.iso > /dev/null 2>&1
# Parse the swap information from the config file (json format) using python
cat /Volumes/config/config | python -c "import sys, json; print(json.load(sys.stdin)['swap']['entries']['size']['content'])"
# Unmount the iso image
hdiutil eject /Volumes/config > /dev/null 2>&1
rm /tmp/config.iso
Note: If Docker for Mac is not running, you do not need to copy the iso file from its original location; you can simply mount it in place. If you try mounting the config.iso file in its original location while Docker for Mac is running, you will get an error as follows:
hdiutil: mount failed - Resource temporarily unavailable
An example of the output is below:
2048M

GnuParallel: Parallelizing a script over a cluster, script writes files to the Master node

I have a simple bash script which takes as input a list of directory names in a text file. It traverses these directories one by one, copies the output of pwd to a file, and moves this file to a results directory. I can easily parallelize this script on my 4-core machine using GNU Parallel. The bash script (myScript.sh) is given below:
#!/bin/bash
par_func (){
    name=$1
    cd /home/zahaib/parentFolder/$name
    pwd > $name.txt
    mv $name.txt /home/zahaib/result/
    cd /home/zahaib/parentFolder
}
export -f par_func
parallel -a /home/zahaib/folderList.txt -j 10 par_func
Now I want to parallelize the same script on a cluster. All the worker nodes have mounted the home directory of the Master node, so I can see the output of ls /home/zahaib/ on all worker nodes.
I tried using --env to export par_func. I also have a list of worker nodes in a workerList.txt file. My initial idea was to invoke parallel by changing the last line in the script above to the following:
parallel -vv --env par_func --slf /home/zahaib/workerList.txt -a /home/zahaib/folderList.txt -j 10 par_func
However, this does not seem to work and the shell on the Master node just hangs after I run ./myScript.sh. What am I missing here?
The contents of my folderList.txt are as follows:
docs
dnload
driver
pics
music
.
.
and the contents of my workerList.txt are as follows:
2//usr/bin/ssh zahaib@node-1
2//usr/bin/ssh zahaib@node-2
2//usr/bin/ssh zahaib@node-3
From your description you are doing the right thing, so you may have hit a bug.
Try minimizing workerList.txt and folderList.txt and then run:
parallel -D ...
(And also check out the option --result, which might be useful to you.)
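As a rough, untested sketch of how --results could fit the setup in the question (it relies on the shared home directory being visible on the workers, as stated above):
# Store each job's stdout/stderr under /home/zahaib/result instead of moving files by hand;
# the -D debug output mentioned above can help show where the remote invocation hangs
parallel --results /home/zahaib/result --env par_func --slf /home/zahaib/workerList.txt \
    -a /home/zahaib/folderList.txt -j 10 par_func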
