I am using haproxy to loadbalance my MQTT brokers cluster. Each MQTT Broker can handle up to 1,00,000 Connections easily. But the problem i am facing with haproxy is that is only handling upto30k connections per node. Whenever if any node is reaching near 32k connections, the haproxy CPU Would suddenly spike to 100% and now all connections start dropping.
The problem with this is, that for every 30k connection, i have to roll another MQTT broker. How can I increase it to at least 60k connections per MQTT broker node?
My Virtual Machine: 1 Cpu, 2 GB Ram. I have tried increasing CPU counts, and faced the same problem.
My config –
bind 0.0.0.0:1883
maxconn 1000000
mode tcp
#sticky session load balancing – new feature
stick-table type string len 32 size 200k expire 30m
stick on req.payload(0,0),mqtt_field_value(connect,client_identifier)
option clitcpka # For TCP keep-alive
option tcplog
timeout client 600s
timeout server 2h
timeout check 5000
server mqtt1 10.20.236.140:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt2 10.20.236.142:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt3 10.20.236.143:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
I have also tuned system params
sysctl -w net.core.somaxconn=60000
sysctl -w net.ipv4.tcp_max_syn_backlog=16384
sysctl -w net.core.netdev_max_backlog=16384
sysctl -w net.ipv4.ip_local_port_range='1024 65535'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16777216'
sysctl -w net.ipv4.tcp_wmem='1024 4096 16777216'
modprobe ip_conntrack
sysctl -w net.nf_conntrack_max=1000000
sysctl -w net.netfilter.nf_conntrack_max=1000000
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
sysctl -w net.ipv4.tcp_max_tw_buckets=1048576
sysctl -w net.ipv4.tcp_fin_timeout=15
tee -a /etc/security/limits.conf << EOF
root soft nofile 1048576
root hard nofile 1048576
haproxy soft nproc 1048576
haproxy hard nproc 1048576
EOF
Output of Haproxy haproxy -v
HAProxy version 2.4.18-1ppa1~focal 2022/07/27 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.18.html
Running on: Linux 5.4.0-122-generic #138-Ubuntu SMP Wed Jun 22 15:00:31 UTC 2022 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-96Se88/haproxy-2.4.18=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1
DEBUG =
Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC +PROMEX -MEMORY_PROFILING
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_THREADS=64, default=1).
Built with OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020
Running on OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.34 2019-11-21
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 9.4.0
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
Available services : prometheus-exporter
Available filters :
[SPOE] spoe
[CACHE] cache
[FCGI] fcgi-app
[COMP] compression
[TRACE] trace
I don't see anything in your config that should account for why you can't achieve more than 30K connections without the CPU spiking. I am not sure those tuned system parameters are doing you any good.
For reference, I am successfully running HAProxy (vanilla docker image haproxy:2.4.17) with 2 CPU and 4G memory and can reach my maxconn of 150K without pegging the CPU.)
Outside of the per instance scaling problem you are having, it looks like another problem you may have is you are assuming 1 HAProxy to 1 MQTT broker.
The problem with this is, that for every 30k connection, i have to roll another MQTT broker.
In my experience it is better to run multiple HAProxy nodes in front of a single broker. (Not sure what your constraints are here.) Having more than one instance of HAProxy per backend is critical. When you lose an instance, you will only lose a fraction of your traffic. (When, not if.) This is the critical piece here, because when the traffic gets dropped the clients will all try to re-connect at the same time. That is how you accidentally DDoS yourself because of a lost virtual machine or pod.
At your current vertical scale (CPU and memory) you can get to 30K connections. That would be 34 nodes if you set maxconn 30K. That should be easily doable if you run them as Kubernetes pods as a deployment with something like ELB in front of your HAProxy cluster. One way or another, you need to figure out how to run HAProxy as a cluster without duplicating your backends.
A good rule of thumb is "scale out when scaling up no longer works".
Related
Honestly, I have no idea what I am doing, so be gentle with me. I am trying to use uwsgi to run my django application on a aws ubuntu instance. I have a virtual environment with python3.7 running, but when I try to run uwsgi. I get this in the logs:
*** Starting uWSGI 2.0.14 (64bit) on [Sun Jan 5 14:51:32 2020] ***
compiled with version: 5.4.0 20160609 on 20 October 2016 05:56:34
os: Linux-4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018
nodename: ip-172-31-41-139
machine: x86_64
clock source: unix
detected number of CPU cores: 1
current working directory: /
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
chdir() to /home/ubuntu/web/graff
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 3804
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /home/ubuntu/web/graffuwsgi.sock fd 3
Python version: 3.5.2 (default, Oct 8 2019, 13:06:37) [GCC 5.4.0 20160609]
Set PythonHome to /home/ubuntu/.virtualenvs/graff
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
Here is my uwsgi.conf
# file: /etc/init/graffuwsgi.conf
description "uWSGI server for graff"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec /usr/local/bin/uwsgi --home /home/ubuntu/web/graff/ --socket /home/ubuntu/web/graffuwsgi.sock --chmod-socket=666 --module=graff.wsgi --pythonpath /home/ubuntu/web -H /home/ubuntu/.virtualenvs/graff --logto /home/ubuntu/web/logs/graffuwsgi.log --chdir=/home/ubuntu/web/graff --chmod-socket=666
It seems like python3.5 just doesn't work anymore. I feel like I've had to replace python3.5 with 3.7 in several places lately to fix various bugs, and I have it in my head that if I can get uwsgi to run python3.7 instead of 3.5 then that will solve this error too. Anyway, any help is much appreciated.
Looks like your uwsgi is compiled with different python version, make sure you compile with python 3.5
PYTHON=python3.5 uwsgi --build-plugin "/usr/src/uwsgi/plugins/python python35"
mv python35_plugin.so /usr/lib/uwsgi/plugins/python35_plugin.so
chmod 644 /usr/lib/uwsgi/plugins/python35_plugin.so
you can follow this guide:
https://www.paulox.net/2017/04/04/how-to-use-uwsgi-with-python3-6-in-ubuntu/
The source of error is PythonHome (pyhome, venv, home) setting.
See at the official Python docs on Environment variables
Is your path /home/ubuntu/.virtualenvs/graff suited for Python requirements?
In short:
Use pythonpath (pp) and let the system to find modules.
You can set several repeated options in your uwsgi config to custom modules search, eg:
pythonpath = /opt/web2py/
pythonpath = /opt/anaconda/lib/python3.9/site-packages/
So, I had the same error. And it was fixed by editing uwsgi vassal config.
I thing the core of problem that virtual envs are created with symlinks or have some inner relative paths, so an isolated process inside uwsgi could not find modules.
I was able to flash a micropython binary which I'd cross compiled some 6 months ago, and it was working fine. It was built from master branch at that point of time, and I did not save the code, nor the binary.
Today, when I again compiled, the binary is having problem at a point. So I want to revert back to the old binary, only problem is I'm not sure what commitID/build the master was at at that point of time ~6 months ago when my compiled binary which works fine was created.
I do have an ESP which has that binary flashed into it. So I was thinking if there is a way to retrieve the binary from the ESP?
Please let me know if this can be done somehow via ampy, etc..
Or suggest me some workaround. I'm already trying to find out the approximate commit around that time, and would cross compile again, which I'm not sure if would work as expected.
Regardless of which firmware you loaded onto your ESP8266 module (NodeMCU, MicroPython, Arduino, etc.) you can use esptool.py to dump the flash content to a file like so:
./esptool.py -p PORT -b 460800 read_flash 0 0x200000 flash_contents.bin
read_flash is the command, 0x200000 the argument for the upper memory bound (2MB).
For reading the firmware as a BIN file
For reading the firmware as a BIN file you need FIRST to connect correct the FTDI with the pins on the IR module
FTDI to IR Module as follows
FTDI 3.3 V to IR 3.3 V,
FTDI GND to IR GND,
FTDI GND to IR IO0 (flash mode - IMPORTANT otherwise it will not work),
FTDI RX to IR TXD,
FTDI TX to IR RXD
Then run the command (if the COM port is 5 and the name to extract the bin is flash-contents, otherwise you replace them to match your COM and the name you wish to have) – important the baud rate to be 9600
esptool.py -p COM5 -b 9600 read_flash 0 0x200000 flash_contents.bin
Below is the outcome for me (running under python 3.10.2 on windows 11):
PS F:\> esptool.py -p COM5 -b 9600 read_flash 0 0x200000 flash_contents.bin
esptool.py v3.2
Serial port COM5
Connecting....
Detecting chip type... Unsupported detection protocol, switching and trying
again...
Connecting...
Detecting chip type... ESP8266
Chip is ESP8266EX
Features: WiFi
Crystal is 26MHz
MAC: 10:52:1c:f8:b7:c7
Stub is already running. No upload is necessary.
2097152 (100 %)
2097152 (100 %)
Read 2097152 bytes at 0x0 in 2215.2 seconds (7.6 kbit/s)...
Hard resetting via RTS pin...
PS F:\>
Remember the esptool.py -p COM5 -b 9600 read_flash 0 0x200000 flash_contents.bin is for 2MB memory
but it is well run with esptool.py -p COM5 -b 9600 read_flash 0 0x100000 flash_contents.bin for 1MB memory as it was in my IR Module
I reduced the speed of reading the flash memory of my esp8266
460800 for "46080" I took a zero.
and successful
My system is a windows 10
C:\Users\POSITIVO\Downloads\esptool-master\esptool-master>esptool.py -p COM6 -b 46080 read_flash 0 0x400000 flash_contents3.bin
esptool.py v3.0-dev
Serial port COM6
Connecting....
Detecting chip type... ESP8266
Chip is ESP8266EX
Features: WiFi
Crystal is 26MHz
MAC: 2c:3a:e8:42:b9:f7
Uploading stub...
Running stub...
Stub running...
4194304 (100 %)
4194304 (100 %)
Read 4194304 bytes at 0x0 in 937.7 seconds (35.8 kbit/s)...
Hard resetting via RTS pin...
i'm a beginner.
i use a RPI3 and a buildroot build system and try to enable wireless.
I followed several links without success.
In particulary, i followed this link : https://delog.wordpress.com/2014/10/10/wireless-on-raspberry-pi-with-buildroot/
and verify my linux kernel wireless options are activated, but no results.
However, the options i used on the buildroot . config file seems to be good :
debian-host:/build/buildroot# egrep -i "wireless|wpa|80211" .config
# BR2_PACKAGE_WIRELESS_REGDB is not set
BR2_PACKAGE_WIRELESS_TOOLS=y
BR2_PACKAGE_WIRELESS_TOOLS_LIB=y
BR2_PACKAGE_WPA_SUPPLICANT=y
BR2_PACKAGE_WPA_SUPPLICANT_NL80211=y
BR2_PACKAGE_WPA_SUPPLICANT_AP_SUPPORT=y
BR2_PACKAGE_WPA_SUPPLICANT_WIFI_DISPLAY=y
# BR2_PACKAGE_WPA_SUPPLICANT_MESH_NETWORKING is not set
BR2_PACKAGE_WPA_SUPPLICANT_AUTOSCAN=y
BR2_PACKAGE_WPA_SUPPLICANT_EAP=y
BR2_PACKAGE_WPA_SUPPLICANT_HOTSPOT=y
BR2_PACKAGE_WPA_SUPPLICANT_DEBUG_SYSLOG=y
BR2_PACKAGE_WPA_SUPPLICANT_WPS=y
BR2_PACKAGE_WPA_SUPPLICANT_CLI=y
BR2_PACKAGE_WPA_SUPPLICANT_WPA_CLIENT_SO=y
BR2_PACKAGE_WPA_SUPPLICANT_PASSPHRASE=y
# BR2_PACKAGE_WPA_SUPPLICANT_DBUS_OLD is not set
# BR2_PACKAGE_WPA_SUPPLICANT_DBUS_NEW is not set
BR2_PACKAGE_WPAN_TOOLS=y
I installed a minibian an another rpi3, i noticed a firmware was used and i installed it by a :
apt-get install firmware-brcm80211
If the firmware is not installed, I noticed that iwlist wlan0 scan have empty results. On my RPI3-buildroot-system, after booting, lsmod show no modules.
I need to load manually by modprobe or by /etc/modules. So i load the same modules used on minibian, so i did (i loaded bluetooth mods also)
uname -a
Linux pi3 4.9.13-rt12-v7 #1 SMP Mon Mar 20 14:04:21 CET 2017 armv7l GNU/Linux
pwd
/lib/modules/4.9.13-rt12-v7/kernel/drivers
find . -name "*brcm*.ko"
./net/wireless/broadcom/brcm80211/brcmfmac/brcmfmac.ko
./net/wireless/broadcom/brcm80211/brcmutil/brcmutil.ko
modprobe 8192cu
modprobe brcmfmac
modprobe brcmutil
modprobe hci_uart
modprobe bnep
and the lsmod show :
lsmod
Module Size Used by Not tainted
8192cu 581125 0
hci_uart 19956 0
btbcm 7992 1 hci_uart
bnep 12051 0
bluetooth 364941 3 hci_uart,btbcm,bnep
brcmfmac 222136 0
brcmutil 9156 1 brcmfmac
cfg80211 543530 1 brcmfmac
rfkill 20944 2 bluetooth,cfg80211
ipv6 405794 18 [permanent]
but
iwlist wlan0 scan
wlan0 Interface doesn't support scanning.
I don't arrive to have the same result as my minibian distro with my apt-get ..
What is the way to retrieve the buildroot process to have the same result that on my minibian ?
I've forget something ?
Thanks for helping me.
When using make menuconfig, enable rpi-wifi-firmware under Target packages > Hardware handling > Firmware to include the firmware files.
For WiFi on the Raspberry Pi, you need to enable the following packages in your defconfig file :
BR2_PACKAGE_WPA_SUPPLICANT=y
BR2_PACKAGE_RPI_WIFI_FIRMWARE=y
If you want to reference any other things, I have an older buildroot external tree here for the Raspberry Pi.
I submit my code to a spark stand alone cluster. Submit command is like below:
nohup ./bin/spark-submit \
--master spark://ES01:7077 \
--executor-memory 4G \
--num-executors 1 \
--total-executor-cores 1 \
--conf "spark.storage.memoryFraction=0.2" \
./myCode.py 1>a.log 2>b.log &
I specify the executor use 4G memory in above command. But use the top command to monitor the executor process, I notice the memory usage keeps growing. Now the top Command output is below:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12578 root 20 0 20.223g 5.790g 23856 S 61.5 37.3 20:49.36 java
My total memory is 16G so 37.3% is already bigger than the 4GB I specified. And it is still growing.
Use the ps command , you can know it is the executor process.
[root#ES01 ~]# ps -awx | grep spark | grep java
10409 ? Sl 1:43 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4G -Xmx4G -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --ip ES01 --port 7077 --webui-port 8080
10603 ? Sl 6:16 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4G -Xmx4G -XX:MaxPermSize=256m org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://ES01:7077
12420 ? Sl 10:16 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://ES01:7077 --conf spark.storage.memoryFraction=0.2 --executor-memory 4G --num-executors 1 --total-executor-cores 1 /opt/flowSpark/sparkStream/ForAsk01.py
12578 ? Sl 21:03 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4096M -Xmx4096M -Dspark.driver.port=52931 -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler#10.79.148.184:52931 --executor-id 0 --hostname 10.79.148.184 --cores 1 --app-id app-20160511080701-0013 --worker-url spark://Worker#10.79.148.184:52660
Below are the code. It is very simple so I do not think there is memory leak
if __name__ == "__main__":
dataDirectory = '/stream/raw'
sc = SparkContext(appName="Netflow")
ssc = StreamingContext(sc, 20)
# Read CSV File
lines = ssc.textFileStream(dataDirectory)
lines.foreachRDD(process)
ssc.start()
ssc.awaitTermination()
The code for process function is below. Please note that I am using HiveContext not SqlContext here. Because SqlContext do not support window function
def getSqlContextInstance(sparkContext):
if ('sqlContextSingletonInstance' not in globals()):
globals()['sqlContextSingletonInstance'] = HiveContext(sparkContext)
return globals()['sqlContextSingletonInstance']
def process(time, rdd):
if rdd.isEmpty():
return sc.emptyRDD()
sqlContext = getSqlContextInstance(rdd.context)
# Convert CSV File to Dataframe
parts = rdd.map(lambda l: l.split(","))
rowRdd = parts.map(lambda p: Row(router=p[0], interface=int(p[1]), flow_direction=p[9], bits=int(p[11])))
dataframe = sqlContext.createDataFrame(rowRdd)
# Get the top 2 interface of each router
dataframe = dataframe.groupBy(['router','interface']).agg(func.sum('bits').alias('bits'))
windowSpec = Window.partitionBy(dataframe['router']).orderBy(dataframe['bits'].desc())
rank = func.dense_rank().over(windowSpec)
ret = dataframe.select(dataframe['router'],dataframe['interface'],dataframe['bits'], rank.alias('rank')).filter("rank<=2")
ret.show()
dataframe.show()
Actually I found below code will cause the problem:
# Get the top 2 interface of each router
dataframe = dataframe.groupBy(['router','interface']).agg(func.sum('bits').alias('bits'))
windowSpec = Window.partitionBy(dataframe['router']).orderBy(dataframe['bits'].desc())
rank = func.dense_rank().over(windowSpec)
ret = dataframe.select(dataframe['router'],dataframe['interface'],dataframe['bits'], rank.alias('rank')).filter("rank<=2")
ret.show()
Because If I remove these 5 line. The code can run all night without showing memory increase. But adding them will cause the memory usage of executor grow to a very high number.
Basically the above code is just some window + grouby in SparkSQL. So is this a bug?
Disclaimer: this answer isn't based on debugging, but more on observations and the documentation Apache Spark provides
I don't believe that this is a bug to begin with!
Looking at your configurations, we can see that you are focusing mostly on the executor tuning, which isn't wrong, but you are forgetting the driver part of the equation.
Looking at the spark cluster overview from Apache Spark documentaion
As you can see, each worker has an executor, however, in your case, the worker node is the same as the driver node! Which frankly is the case when you run locally or on a standalone cluster in a single node.
Further, the driver takes 1G of memory by default unless tuned using spark.driver.memory flag. Furthermore, you should not forget about the heap usage from the JVM itself, and the Web UI that's been taken care of by the driver too AFAIK!
When you delete the lines of code you mentioned, your code is left without actions as map function is just a transformation, hence, there will be no execution, and therefore, you don't see memory increase at all!
Same applies on groupBy as it is just a transformation that will not be executed unless an action is being called which in your case is agg and show further down the stream!
That said, try to minimize your driver memory and the overall number of cores in spark which is defined by spark.cores.max if you want to control the number of cores on this process, then cascade down to the executors. Moreover, I would add spark.python.profile.dump to your list of configuration so you can see a profile for your spark job execution, which can help you more with understanding the case, and to tune your cluster more to your needs.
As I can see in your 5 lines, maybe the groupBy is the issue , would you try with reduceBy, and see how it performs.
See here and here.
I have a little office network and I'm experiencing a huge internet link latency. We have a simple network topology: a computer configured as router running ubuntu server 10.10, 2 network cards (one to internet link, other to office network) and a switch connecting 20 computers. I have a huge tcpdump log collected at the router and I would like to plot a histogram with the RTT time of all TCP streams to try to find out the best solution to this latency problem. So, could somebody tell me how to do it using wireshark or other tool?
Wireshark or tshark can give you the TCP RTT for each received ACK packet using tcp.analysis.ack_rtt which measures the time delta between capturing a TCP packet and the ACK for that packet.
You need to be careful with this as most of your ACK packets will be from your office machines ACKing packets received from the internet, so you will be measuring the RTT between your router seeing the packet from the internet and seeing the ACK from your office machine.
To measure your internet RTT you need to look for ACKS from the internet (ACKing data sent from your network). Assuming your office machines have IP addresses like 192.168.1.x and you have logged all the data on the LAN port of your router you could use a display filter like so:
tcp.analysis.ack_rtt and ip.dst==192.168.1.255/24
To dump the RTTs into a .csv for analysis you could use a tshark command like so;
tshark -r router.pcap -Y "tcp.analysis.ack_rtt and ip.dst==192.168.1.255/24" -e tcp.analysis.ack_rtt -T fields -E separator=, -E quote=d > rtt.csv
The -r option tells tshark to read from your .pcap file
The -Y option specifies the display filter to use (-R without -2 is deprecated)
The -e option specifies the field to output
The -T options specify the output formatting
You can use the mergecap utility to merge all your pcap files into one one file before running this command. Turning this output into a histogram should be easy!
Here's the 5-min perlscript inspired by rupello's answer:
#!/usr/bin/perl
# For a live histogram of rtt latencies, save this file as /tmp/hist.pl and chmod +x /tmp/hist.pl, then run:
# tshark -i wlp2s0 -Y "tcp.analysis.ack_rtt and ip.dst==192.168.0.0/16" -e tcp.analysis.ack_rtt -T fields -E separator=, -E quote=d | /tmp/hist.pl
# Don't forget to update the interface "wlp2s0" and "and ip.dst==..." bits as appropriate, type "ip addr" to get those.
#t[$m=0]=20;
#t[++$m]=10;
#t[++$m]=5;
#t[++$m]=2;
#t[++$m]=1;
#t[++$m]=0.9;
#t[++$m]=0.8;
#t[++$m]=0.7;
#t[++$m]=0.6;
#t[++$m]=0.5;
#t[++$m]=0.4;
#t[++$m]=0.3;
#t[++$m]=0.2;
#t[++$m]=0.1;
#t[++$m]=0.05;
#t[++$m]=0.04;
#t[++$m]=0.03;
#t[++$m]=0.02;
#t[++$m]=0.01;
#t[++$m]=0.005;
#t[++$m]=0.001;
#t[++$m]=0;
#h[0]=0;
while (<>) {
s/\"//g; $n=$_; chomp($n); $o++;
for ($i=$m;$i>=0;$i--) { if ($n<=$t[$i]) { $h[$i]++; $i=-1; }; };
if ($i==-1) { $h[0]++; };
print "\033c";
for (0..$m) { printf "%6s %6s %8s\n",$t[$_],sprintf("%3.2f",$h[$_]/$o*100),$h[$_]; };
}
The newer versions of tshark seem to work better with a "stdbuf -i0 -o0 -e0 " in front of the "tshark".
PS Does anyone know if wireshark has DNS and ICMP rtt stats built in or how to easily get those?
2018 Update: See https://github.com/dagelf/pping