We are using the browserless Docker image to read content on certain websites. For a few websites, CPU usage increases in proportion to the number of open sessions.
Specifics
Docker image: browserless/chrome:1.35-chrome-stable
Script to reproduce once the container is up and running:
#!/bin/bash
HOST='localhost:3000'
curl_new_session() {
  curl -s -XPOST http://$HOST/webdriver/session -d '{"desiredCapabilities":{"browserName":"chrome","platform":"ANY","chromeOptions":{"binary":"","args":["--window-size=1400,900","--no-sandbox","--headless"]},"goog:chromeOptions":{"args":["--window-size=1400,900","--no-sandbox","--headless"]}}}' | jq -r '.sessionId'
}
# we open the session and keep it running
curl_visit_url() {
  local id=$1
  local url=$2
  echo "http://$HOST/webdriver/session/$id/url"
  curl -s http://$HOST/webdriver/session/$id/url -d '{"url":"'$url'"}' | jq '.'
}
for i in {1..5}
do
  id=$(curl_new_session)
  echo $id
  curl_visit_url $id 'http://monday.com' &
  sleep 0.5
  echo '.'
done
This specific site (monday.com) uses too much CPU, but this can happen with other sites as well.
Question
Considering we encounter such websites from time to time, what's the best way to handle them?
We want the sessions to be kept alive, not closed right after opening them.
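One pragmatic approach, sketched below under stated assumptions (a container named `browserless`, an 80% CPU threshold, and the standard WebDriver delete-session endpoint; none of these come from the question): poll the container's CPU with `docker stats` and recycle a session only when the limit is crossed, reopening it afterwards so the pool stays alive.

```shell
#!/bin/bash
# Helper: does a docker-stats CPU reading like "85.3%" exceed a numeric limit?
# The "NN.N%" format matches `docker stats --format '{{.CPUPerc}}'`.
cpu_over_limit() {
  local reading=$1 limit=$2
  awk -v c="${reading%\%}" -v l="$limit" 'BEGIN { exit !(c > l) }'
}

# Sketch of a recycle loop; $HOST and $id follow the reproduction script above,
# and DELETE /webdriver/session/<id> is the standard WebDriver call to close a
# session. The container name "browserless" is an assumption.
# if cpu_over_limit "$(docker stats --no-stream --format '{{.CPUPerc}}' browserless)" 80; then
#   curl -s -X DELETE "http://$HOST/webdriver/session/$id"   # drop the heavy session
#   id=$(curl_new_session)                                   # reopen a fresh one
# fi
```

This trades the "keep every session alive" goal against runaway CPU: only sessions on offending sites get recycled, the rest stay open.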
I have a simple .NET Core console program that simply counts forever:
class Program
{
    static void Main(string[] args)
    {
        int counter = 1;
        while (true)
        {
            counter++;
            Console.WriteLine(counter);
        }
    }
}
I containerized it with Docker and ran it via VS Docker run.
But every time I execute > docker logs -f dockerID, it starts counting from scratch (1, 2, 3, ...). I expected this command to show the logs continuing from the last integer counted!
Does "docker logs -f" cause a new instance of my application to run every time?
When running docker help logs you see this:
--tail string Number of lines to show from the end of the logs
(default "all")
So you should do:
docker logs --tail 100 -f dockerID
Additional reference: docker logs documentation > Options
docker logs -f should work like tail -f, so it should show just new logs.
Therefore, if you're seeing the logs repeat from the start, it looks like your program is being restarted in a loop.
The only thing that comes to mind is a little trick. It's not an elegant solution, but it can be useful in similar situations:
Define your shell history time format to match the docker logs timestamp format:
export HISTTIMEFORMAT="%Y-%m-%dT%H:%M:%S "
Get the date of your last docker logs command with a simple script:
LAST_DOCKER_LOG=`history | grep " docker logs" | tac | head -1 | awk '{print $2}'`
Get logs after your last docker logs execution:
docker logs --since $LAST_DOCKER_LOG <your_docker>
In one line:
docker logs -t --since `history | grep " docker logs" | tac | head -1 | awk '{print $2}'` <your_docker>
Note: the history timestamps and the docker log timestamps have to be in the same timezone.
In any case, I'd take a look at why your program is restarting. Maybe the docker logs -t timestamp option can help you track it down.
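A simpler variant of the same idea, assuming write access to /tmp (the path is an assumption): record the time of each check in a small state file instead of mining shell history, then feed it to --since.

```shell
# Remember when we last looked at the logs; fall back to the epoch on first run.
STATE=/tmp/last_log_check
SINCE=$(cat "$STATE" 2>/dev/null || echo "1970-01-01T00:00:00")
date +%Y-%m-%dT%H:%M:%S > "$STATE"

# Show only what arrived since the previous invocation (container name elided):
# docker logs --since "$SINCE" <your_docker>
```

As with the history trick, the recorded timestamps and docker's log times must be in the same timezone.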
When I run docker-compose up, I get these logs:
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:46Z","tags":["reporting","browser-driver","warning"],"pid":6,"message":"Enabling the Chromium sandbox provides an additional layer of protection."}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:46Z","tags":["reporting","warning"],"pid":6,"message":"Generating a random key for xpack.reporting.encryptionKey. To prevent pending reports from failing on restart, please set xpack.reporting.encryptionKey in kibana.yml"}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:46Z","tags":["status","plugin:reporting#7.3.1","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:46Z","tags":["info","task_manager"],"pid":6,"message":"Installing .kibana_task_manager index template version: 7030199."}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:46Z","tags":["info","task_manager"],"pid":6,"message":"Installed .kibana_task_manager index template: version 7030199 (API version 1)"}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:47Z","tags":["info","migrations"],"pid":6,"message":"Creating index .kibana_1."}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:47Z","tags":["info","migrations"],"pid":6,"message":"Pointing alias .kibana to .kibana_1."}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:47Z","tags":["info","migrations"],"pid":6,"message":"Finished in 254ms."}
kibana_1 | {"type":"log","#timestamp":"2019-09-09T21:41:47Z","tags":["listening","info"],"pid":6,"message":"Server running at http://0:5601"}
Is there some configuration I can use so that it only emits JSON? I'm looking for it to omit the "kibana_1 | " prefix before each line.
And of course, ideally it could fold that into the JSON instead, like {"source":"kibana_1", ...}.
Note: I'm not sure docker-compose supports this out of the box, but you can look at Docker logging drivers.
What you could do is use the cut command piping output from docker-compose logs -f.
Here is an example below:
docker-compose logs -f kibana | cut -d"|" -f2
..
{"type":"log","#timestamp":"2019-08-11T03:44:01Z","tags":["status","plugin:xpack_main#6.8.1","info"],"pid":1,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Request Timeout after 3000ms"}
{"type":"log","#timestamp":"2019-08-11T03:44:01Z","tags":["status","plugin:graph#6.8.1","info"],"pid":1,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Request Timeout after 3000ms"}
{"type":"log","#timestamp":"2019-08-11T03:44:01Z","tags":["status","plugin:searchprofiler#6.8.1","info"],"pid":1,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Request Timeout after 3000ms"}
..
The cut -d"|" -f2 command splits each line on the | character and prints the second field, i.e. everything after the first |.
You can take it a step further by also dropping the leading space (although I'm sure there are better ways to do this). Careful: cut -d" " -f2 keeps only the second space-separated field, so the JSON gets truncated at the first space inside it, as the output below shows; cut -d" " -f2- would keep everything after the first space.
docker-compose logs -f kibana | cut -d"|" -f2 | cut -d" " -f2
..
{"type":"log","#timestamp":"2019-08-11T03:47:53Z","tags":["status","plugin:maps#6.8.1","error"],"pid":1,"state":"red","message":"Status
{"type":"log","#timestamp":"2019-08-11T03:47:53Z","tags":["status","plugin:index_management#6.8.1","error"],"pid":1,"state":"red","message":"Status
{"type":"log","#timestamp":"2019-08-11T03:47:53Z","tags":["status","plugin:index_lifecycle_management#6.8.1","error"],"pid":1,"state":"red","message":"Status
..
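Addressing the ideal case from the question (folding the prefix into the JSON as a "source" field), here is a sed-only sketch. It assumes the compose prefix is a service name of word characters followed by " | " and a JSON object, and that colors are off (docker-compose logs --no-color):

```shell
# Rewrite 'kibana_1  | {...}' into '{"source":"kibana_1",...}'.
fold_prefix() {
  sed -E 's/^([A-Za-z0-9_-]+) *\| \{/{"source":"\1",/'
}

# docker-compose logs --no-color -f kibana | fold_prefix
```

Lines that don't match the prefix-plus-JSON shape pass through unchanged, which keeps non-JSON startup noise visible.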
I run iperf3 and get performance results like this in the terminal:
[ 4] local 10.0.1.8 port 34355 connected to 10.0.1.1 port 5201
49,49,0,0,35500
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 2.19 MBytes 18.4 Mbits/sec 0 69.3 KBytes
CPU Utilization: local/sender 2.8% (0.7%u/3.3%s), remote/receiver 1.4% (0.6%u/0.9%s)
I want to use only certain values in a bash script later. What I want is this:
35500,18.4,2.8
As far as I know, I can use grep to print only the bandwidth:
./src/iperf3 -c 10.0.1.1 -d -t 1 -V | grep -Po '[0-9.]*(?= Mbits/sec)'
But is it possible to obtain "35500,18.4,2.8" using grep, and if so, how?
Thank you for the answers.
grep with the -P (Perl-regex) option allows you to combine multiple regexes with alternation:
$ grep -Po '(?<=,)[0-9]+$|[0-9.]*(?= Mbits/sec)|(?<=local\/sender )[^%]*' file | paste -d, - - -
35500,18.4,2.8
So your command would be:
$ ./src/iperf3 -c 10.0.1.1 -d -t 1 -V | grep -Po '(?<=,)[0-9]+$|[0-9.]*(?= Mbits/sec)|(?<=local\/sender )[^%]*' | paste -d, - - -
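If you'd rather not lean on Perl lookarounds, here is an awk sketch written against the exact sample output above. It assumes the "49,49,0,0,35500" control line always ends with the port, and that if several Mbits/sec lines appear, the last one is the figure you want:

```shell
parse_iperf() {
  awk '/^[0-9,]+$/     { n = split($0, a, ","); port = a[n] }  # bare numeric line: port is the last field
       /Mbits\/sec/    { for (i = 2; i <= NF; i++) if ($i == "Mbits/sec") bw = $(i - 1) }
       /local\/sender/ { for (i = 1; i < NF; i++) if ($i == "local/sender") { cpu = $(i + 1); sub(/%$/, "", cpu) } }
       END             { print port "," bw "," cpu }'
}

# ./src/iperf3 -c 10.0.1.1 -d -t 1 -V | parse_iperf
```

Field scanning (rather than fixed positions) keeps this tolerant of the column widths varying between runs.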
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
2.6.18-164.el5
I am trying to use the watch command combined with netstat to see the two programs using certain ports.
However, the command I'm using below doesn't work for both words:
watch -n1 "netstat -upnlt | grep gateway\|MultiMedia"
Is this the correct way to grep for both program names?
If I use one it's OK, but both together don't work.
For the grep, the pattern needs to be quoted:
grep "gateway\|MultiMedia"
So perhaps try:
watch -n1 'netstat -upnlt | grep "gateway\|MultiMedia"'
There's also the newer way of doing things... grep -E is nice and portable (or egrep, which is simply shorthand for grep -E on Linux and BSD), so you don't have to escape the |. From the man pages:
-E Interpret pattern as an extended regular expression (i.e. force
grep to behave as egrep).
So...
watch "netstat -upnlt | grep -E 'gateway|MultiMedia'"
or
watch "netstat -upnlt | egrep 'gateway|MultiMedia'"
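The quoting point can be checked in isolation, without netstat (the three sample lines below are made up; the behavior shown is GNU grep's):

```shell
# Quoted BRE alternation and -E both match each program name:
printf 'gateway\nMultiMedia\nother\n' | grep "gateway\|MultiMedia"
printf 'gateway\nMultiMedia\nother\n' | grep -E 'gateway|MultiMedia'
# Each prints:
# gateway
# MultiMedia
```

Without the quotes, the shell strips the backslash before grep ever sees it, so grep gets a literal "gateway|MultiMedia" pattern and matches nothing.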
I had a similar problem monitoring an ssh connection.
> netstat -tulpan|grep ssh
tcp 0 0 192.168.2.52:58072 192.168.2.1:22 ESTABLISHED 31447/ssh
However, watch -n 1 'netstat -tulpan|grep ssh' shows no output (apart from watch's own header).
If I change it to watch -n 1 'netstat -tulpan|grep ":22"' I get the required output line. It seems as if the -p option is ignored when netstat is run through watch. Strange.