How to speed up sonarqube analysis job? - jenkins

I have a Java-based application with a huge amount of source code (~1M lines). I am using Jenkins with sonar-runner-2.4 to run the analysis, including code coverage and test-case counts. I recently upgraded the SonarQube server from 5.4 to 6.3.1. Before the upgrade, the job took 9 hours to complete the whole analysis (still a very long time, but acceptable); after upgrading to SonarQube 6.3.1, the same job takes 13 hours for the same analysis.
How do I improve the analysis time, at least back to the earlier 9 hours?
EDIT
Here are the JAVA_OPTS for my sonarqube-6.3.1 instance:
sonar.web.javaOpts=-Xmx6G -Xms2G -XX:MaxPermSize=1G -XX:+HeapDumpOnOutOfMemoryError -Djava.net.preferIPv4Stack=true
Available Hardware :
$lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 5
CPU MHz: 1596.000
BogoMIPS: 3999.44
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s): 4-7
Available Memory :
$free -m
total used free shared buff/cache available
Mem: 128714 58945 66232 430 3535 68298
Swap: 32767 957 31810
sonar-project.properties for the long running job:

As you haven't really given many details, I can't really give many details in the answer, but the simple answer is that you have to make the scan do less work.
Look at your codebase. Is your scan processing generated classes? Is it scanning test classes? Is it scanning classes that have little real business logic? If you answer "yes" to any of those, consider excluding those classes.
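For instance, generated and test code can usually be kept out of the scan with SonarQube's standard exclusion properties. A minimal sketch of what that could look like in your sonar-project.properties (the path patterns below are only placeholders, since I don't know your project layout):

# Skip generated sources and build output
sonar.exclusions=**/target/generated-sources/**,**/build/**
# Keep test sources out of the main analysis
sonar.test.exclusions=**/src/test/**
# Don't compute coverage for generated packages
sonar.coverage.exclusions=**/generated/**

Re-run the job after each change and compare the analysis time, so you can tell which part of the codebase is actually responsible for the extra hours.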
Look at the SonarQube plugins you're using. Are you running every possible plugin you can run? Are there some heuristics you don't need to run, or perhaps you could run less frequently?

Related

Ubuntu 18.04: GNU parallel can't find and use the full number of cores on a local system

I am using GNU parallel (version 20200522) on Ubuntu Linux 18.04.4 and running jobs on all cores of a local server minus 2 cores, that is I am using the -j-2 parameter.
find /folder/ -type f -iname "*.pdf" | parallel -j-2 --nice 2 "script.sh {1} {1/.}; mv -f -v {1} /folder2/; mv -f {1/.}.txt /folder3/" :::: -
However, the program shows
Error: Cannot run any jobs.
I tried using the -j100% parameter and I have seen that it uses just 1 core (job), so I deduce that, for GNU parallel, 100% of the available cores on this system is just one core.
If I use the -j5 parameter (which does not imply autodetection of the total number of cores), everything is alright, parallel launches 5 jobs and uses 5 cores.
The interesting part is that the file /root/.parallel/tmp/sshlogin/MACHINE_NAME/cpuspec contains the following:
1
6
6
which means, I think, that GNU parallel should see 6 available cores.
I have tried deleting the cpuspec file and running parallel again to redetect the total number of cores, but the cpuspec file and the behavior of the program remain the same.
On other systems, deleting the cpuspec file solved all issues, but on this particular system it is not working. The virtual machine was copied from another server with a different configuration, which is why I needed to delete the cpuspec file.
What should I do to get GNU parallel to correctly detect the number of cores on the system, so that I can use the -j-2 parameter?
Update 21.07:
After deleting the folder with the cpuspec file once again, running the parallel --number-of-sockets/cores/threads commands, and using the -S 6/: parameter just once, the problem seems to have resolved itself. Now GNU parallel correctly detects the number of cores and the -j-2 parameter works.
I am not sure exactly what fixed it, but I am not able to reproduce the bug anymore.
Ole, thank you for your answer. If I meet the bug again or if I am able to reproduce it, I will post it here.
And here is the output to the commands:
parallel --number-of-sockets
1
parallel --number-of-cores
6
parallel --number-of-threads
6
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
stepping : 2
microcode : 0xffffffff
cpu MHz : 2397.218
cache size : 15360 KB
physical id : 0
siblings : 6
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl cpuid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm pti fsgsbase bmi1 avx2 smep bmi2 erms xsaveopt
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 4794.43
clflush size : 64
cache_alignment : 64
address sizes : 42 bits physical, 48 bits virtual
power management:
And it repeats itself for 5 more cores.
You may have found a bug. Please post the output of:
cat /proc/cpuinfo
parallel --number-of-sockets
parallel --number-of-cores
parallel --number-of-threads
Also see if you can make an MCVE.
As a workaround you can use -S 6/: to force GNU Parallel to detect 6 cores on your system.
find /folder/ -type f -iname "*.pdf" |
parallel -S 6/: -j-2 --nice 2 "script.sh {1} {1/.}; mv -f -v {1} /folder2/; mv -f {1/.}.txt /folder3/"
(Also, :::: - can be left out completely: if there is no ::: or :::: on the command line, GNU Parallel reads from stdin.)

jvm in kubernetes/docker running out of memory faster than standalone

We are moving our JDK 1.8u131 JVM servers to a Kubernetes/Docker environment.
We have a few JVM servers running in standalone VMs and a few running in the Kubernetes/Docker environment, and both types are present in production.
Under the same load, the Kubernetes/Docker JVMs run out of memory, whereas the JVMs in the VMs run fine without issues.
We used the exact SAME JVM parameters in the VMs and in the containers.
Any ideas how to fix this issue?
Here are the options:
Environment:
JAVA_MEM_OPTS: -Xms2048M -Xmx2048M
-XX:MaxPermSize=256M -XX:+ExitOnOutOfMemoryError -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/heapdumps/${HOSTNAME}_$(date +%Y%m%d_%H_%M_%S).hprof
JAVA_GC_OPTS: -Dnogclogging=true -XX:+PrintGC -XX:+PrintGCDetails
2018-12-07T15:43:21.42043862Z {Heap before GC invocations=2880 (full 625):
2018-12-07T15:43:21.420465613Z  PSYoungGen total 435712K, used 249344K
2018-12-07T15:43:21.420469712Z   eden space 249344K, 100% used
2018-12-07T15:43:21.420472561Z   from space 186368K, 0% used
2018-12-07T15:43:21.420475332Z   to space 228352K, 0% used
2018-12-07T15:43:21.420477921Z  ParOldGen total 1398272K, used 1397679K
2018-12-07T15:43:21.420480674Z   object space 1398272K, 99% used
2018-12-07T15:43:21.420483127Z  Metaspace used 229431K, capacity 249792K, committed 249968K, reserved 1271808K
2018-12-07T15:43:21.420485549Z   class space used 24598K, capacity 27501K, committed 27544K, reserved 1048576K
2018-12-07T15:43:22.628605014Z 2018-12-07T15:43:21.420+0000: 124733.208: ] ] 1647023K->1646334K(1833984K), ], 1.2079201 secs] [Times: user=1.98 sys=0.01, real=1.21 secs]
2018-12-07T15:43:22.62868917Z Heap after GC invocations=2880 (full 625):
2018-12-07T15:43:22.628794768Z  PSYoungGen total 435712K, used 248654K
2018-12-07T15:43:22.628799885Z   eden space 249344K, 99% used
2018-12-07T15:43:22.628803713Z   from space 186368K, 0% used
2018-12-07T15:43:22.628807485Z   to space 228352K, 0% used
2018-12-07T15:43:22.628811115Z  ParOldGen total 1398272K, used 1397679K
2018-12-07T15:43:22.62881498Z   object space 1398272K, 99% used
2018-12-07T15:43:22.628818943Z  Metaspace used 229431K, capacity 249792K, committed 249968K, reserved 1271808K
2018-12-07T15:43:22.628827543Z   class space used 24598K, capacity 27501K, committed 27544K, reserved 1048576K
2018-12-07T15:43:22.628831766Z }
2018-12-07T15:43:22.632712004Z {Heap before GC invocations=2881 (full 626):
2018-12-07T15:43:22.63273803Z  PSYoungGen total 435712K, used 249344K
2018-12-07T15:43:22.632742051Z   eden space 249344K, 100% used
2018-12-07T15:43:22.63274617Z   from space 186368K, 0% used
2018-12-07T15:43:22.632752151Z   to space 228352K, 0% used
2018-12-07T15:43:22.632756279Z  ParOldGen total 1398272K, used 1397679K
2018-12-07T15:43:22.632760269Z   object space 1398272K, 99% used
2018-12-07T15:43:22.632764456Z  Metaspace used 229431K, capacity 249792K, committed 249968K, reserved 1271808K
2018-12-07T15:43:22.632768599Z   class space used 24598K, capacity 27501K, committed 27544K, reserved 1048576K
2018-12-07T15:43:23.164683101Z 2018-12-07T15:43:22.632+0000: 124734.420:
SERVER RESTARTS HERE
Did you set your container memory resource requests and limits? JDK 8u131 doesn't know that it is running inside a container; it still sees the host VM's resources. That could be why your JVM inside the container is killed immediately.
There's a good article from Red Hat from 2017:
https://developers.redhat.com/blog/2017/03/14/java-inside-docker/
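As a concrete illustration of what that means in practice (a sketch, not your exact setup): give the container an explicit resources.requests/resources.limits memory value with headroom above the 2 GB heap (your own GC log shows Metaspace alone committing roughly 250 MB on top of it). On 8u131+ you can additionally let the JVM derive its sizing from the cgroup limit instead of a fixed -Xmx, using the experimental flags:

# Alternative to a fixed -Xmx: size the heap from the container's cgroup memory
# limit (experimental in 8u131; replaced by -XX:+UseContainerSupport in later JDKs)
JAVA_MEM_OPTS: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2 \
               -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError

Either way, make sure the Kubernetes memory limit is comfortably larger than heap + Metaspace + thread stacks; if the limit equals -Xmx, the kernel OOM killer can terminate the container even though the heap itself never exceeded its bound.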

dask jobqueue worker failure at startup 'Resource temporarily unavailable'

I'm running Dask over SLURM via jobqueue and I have been getting 3 errors pretty consistently...
Basically my question is: what could be causing these failures? At first glance the problem is that too many workers are writing to disk at once, or that my workers are forking into many other processes, but it's pretty difficult to track down. I can ssh into the node but I'm not seeing an abnormal number of processes, and each node has a 500 GB SSD, so I shouldn't be writing excessively.
Everything below this is just information about my configurations and such
My setup is as follows:
cluster = SLURMCluster(cores=1, memory=f"{args.gbmem}GB", queue='fast_q', name=args.name,
env_extra=["source ~/.zshrc"])
cluster.adapt(minimum=1, maximum=200)
client = await Client(cluster, processes=False, asynchronous=True)
I suppose I'm not even sure whether processes=False should be set.
I run this starter script via sbatch with 4 GB of memory, 2 cores (-c) (even though I expect to need only 1), and 1 task (-n). This sets off all of my jobs via the SLURMCluster config from above. I dumped my SLURM submission scripts to files and they look reasonable.
Each job is not complex; it is a subprocess.call() of a compiled executable that takes 1 core and 2-4 GB of memory. I require the client call and further calls to be asynchronous because I have a lot of conditional computations. So each worker, when loaded, should consist of 1 Python process, 1 running executable, and 1 shell.
Imposed by the scheduler we have
>> ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 512
-n: file descriptors 1024
-l: locked-in-memory size (kbytes) 64
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 1031203
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: unlimited
And each node has 64 cores, so I don't really think I'm hitting any limits.
I'm using a jobqueue.yaml file that looks like:
slurm:
name: dask-worker
cores: 1 # Total number of cores per job
memory: 2 # Total amount of memory per job
processes: 1 # Number of Python processes per job
local-directory: /scratch # Location of fast local storage like /scratch or $TMPDIR
queue: fast_q
walltime: '24:00:00'
log-directory: /home/dbun/slurm_logs
I would appreciate any advice at all! Full log is below.
FORK BLOCKING IO ERROR
distributed.nanny - INFO - Start Nanny at: 'tcp://172.16.131.82:13687'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/dbun/.local/share/pyenv/versions/3.7.0/lib/python3.7/multiprocessing/forkserver.py", line 250, in main
pid = os.fork()
BlockingIOError: [Errno 11] Resource temporarily unavailable
distributed.dask_worker - INFO - End worker
Aborted!
CANT START NEW THREAD ERROR
https://pastebin.com/ibYUNcqD
BLOCKING IO ERROR
https://pastebin.com/FGfxqZEk
EDIT:
Another piece of the puzzle:
It looks like dask_worker is running multiple multiprocessing.forkserver calls? Does that sound reasonable?
https://pastebin.com/r2pTQUS4
This problem was caused by having ulimit -u too low.
As it turns out, each worker has a few processes associated with it, and the Python ones have multiple threads. In the end you end up with approximately 14 threads per worker that count against your ulimit -u. Mine was set to 512, and with a 64-core system I was likely hitting ~896 (64 × 14). In other words, the maximum number of threads per process I could have had would have been about 8 (512 / 64).
Solution:
In .zshrc (.bashrc) I added the line
ulimit -u unlimited
Haven't had any problems since.
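If you want to confirm this on a node before raising the limit, you can compare the per-user thread count against ulimit -u. A rough sketch (the dask-worker process name is just an assumption about how your workers show up in ps):

# The per-user limit that fork() is checked against (threads count towards it)
ulimit -u
# Total threads owned by the current user on this node (ps -eLf prints one line per thread)
ps -eLf | awk -v u="$USER" '$1 == u' | wc -l
# Threads belonging to the dask workers specifically
ps -eLf | grep '[d]ask-worker' | wc -l

Once the count approaches the limit, fork() fails with EAGAIN, which Python surfaces as the BlockingIOError: [Errno 11] Resource temporarily unavailable shown above.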

Why does my garbage collection log show 3.8GB as the max available heap size while I have allocated 4GB as the max heap size?

I have a 64-bit hotspot JDK version 1.7.0 installed on a 64-bit RHEL 6 machine. I use the following JVM options for my tomcat application.
CATALINA_OPTS="${CATALINA_OPTS} -Dfile.encoding=UTF8 -Dorg.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false -Duser.timezone=EST5EDT"
# General Heap sizing
CATALINA_OPTS="${CATALINA_OPTS} -Xms4096m -Xmx4096m -XX:NewSize=2048m -XX:MaxNewSize=2048m -XX:PermSize=512m -XX:MaxPermSize=512m -XX:+UseCompressedOops -XX:+DisableExplicitGC"
# Enable the CMS GC policy
CATALINA_OPTS="${CATALINA_OPTS} -XX:+UseConcMarkSweepGC -XX:CMSWaitDuration=15000 -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:+CMSConcurrentMTEnabled -XX:+CMSScavengeBeforeRemark -XX:+CMSClassUnloadingEnabled"
# Verbose Garbage Collection Logging
CURRENT_DATE=`date +%Y%m%d%H%M%S`
CATALINA_OPTS="${CATALINA_OPTS} -verbose:gc -XX:+PrintGCDetails -Xloggc:${CATALINA_BASE}/logs/gc-${CURRENT_DATE}.log -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution"
When I do a garbage collection analysis, the GC logs show a maximum available heap of only 3.8 GB instead of the 4 GB allocated to the JVM. Why is that?
New Generation (2048M) consists of 80% Eden (1638.4M) and two Survivor Spaces (10% or 204.8M each):
Heap
par new generation total 1887488K, used 134226K [0x00000006fae00000, 0x000000077ae00000, 0x000000077ae00000)
eden space 1677824K, 8% used [0x00000006fae00000, 0x00000007031148e0, 0x0000000761480000)
from space 209664K, 0% used [0x0000000761480000, 0x0000000761480000, 0x000000076e140000)
to space 209664K, 0% used [0x000000076e140000, 0x000000076e140000, 0x000000077ae00000)
concurrent mark-sweep generation total 2097152K, used 242K [0x000000077ae00000, 0x00000007fae00000, 0x00000007fae00000)
At any time, one of the survivor spaces is empty (see Generations).
So the useful heap size is 1638.4 + 204.8 + 2048 = 3891.2 MB ≈ 3.8 GB, which is what the GC log reports.
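You can verify this against the running JVM rather than trusting the log alone; the standard JDK 7 tools show the same numbers (the PID below is a placeholder):

# Heap configuration and per-space usage of the live Tomcat process
jmap -heap <pid>
# Or print the final flag values; include your CATALINA_OPTS to see the tuned sizes
java ${CATALINA_OPTS} -XX:+PrintFlagsFinal -version | grep -E 'MaxHeapSize|NewSize|SurvivorRatio'

Tools that report "maximum available heap" count only eden + one survivor space + old generation, because the second survivor space is always kept empty for copying; that is exactly the ~205 MB difference between 4 GB and 3.8 GB.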

Tracking memory usage of piped commands with valgrind

I have a couple processes running a tool I've written that are joined by pipes, and I would like to measure their collected memory usage with valgrind. So far, I have tried something like:
$ valgrind --tool=massif --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script
Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:
foo | foo > /dev/null
Valgrind doesn't seem to capture the collected memory usage of this the way I expect. If I use top to track this, I get (for the sake of argument) 10% memory usage on the first foo, and then another 10% accumulates on the second foo before myProcesses.script completes. This is the sort of thing I want to measure: the usage of both processes. Valgrind instead returns the following error:
Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.
Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?
The numbers that top returns while polling seem hand-wavy, to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those, as well.
EDIT - Fixed typo with valgrind option.
EDIT 2 - For some reason, it appears that the option --pages-as-heap is giving us trouble with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor on the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap got testing working again. Thanks to MrGomez for the great help.
With the corrected valgrind invocation given in your edit, this seems to just work for me in Valgrind 3.6.1. My invocation:
<me>#harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
--trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
--massif-out-file=myProcesses".%p" ./testscript.sh
==21067== Massif, a heap profiler
==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21067== Command: ./testscript.sh
==21067==
==21068== Massif, a heap profiler
==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21068== Command: /bin/ls
==21068==
==21070== Massif, a heap profiler
==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Massif, a heap profiler
==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Command: /bin/sleep 5
==21069==
==21070== Command: /usr/bin/less
==21070==
==21068==
(END) ==21069==
==21070==
==21067==
The contents of my test script, testscript.sh:
ls | sleep 5 | less
Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):
desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
cmd: ./testscript.sh
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=0
mem_heap_B=118784
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
...
#-----------
snapshot=18
#-----------
time=108269
mem_heap_B=1708032
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=peak
n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
n3: 1474560 0x4015E42: mmap (mmap.S:62)
n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
n0: 12288 in 1 place, below massif's threshold (01.00%)
n0: 28672 in 3 places, all below massif's threshold (01.00%)
n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
n0: 20480 in 2 places, all below massif's threshold (01.00%)
n0: 233472 0xFFFFFFFF: ???
#-----------
snapshot=19
#-----------
time=108269
mem_heap_B=1703936
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=20
#-----------
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
Massif continues to complain about heap allocations in the remainder of my files. Note this is very similar to your error.
I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I used the defaults from ./configure) will fix the issue.
Either way, this seems to be expected with Massif.
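If --pages-as-heap=yes keeps tripping that assert on your binaries (as in your EDIT 2), one fallback, assuming heap-level profiling is enough for your purposes, is to run Massif in its default mode and summarise each per-PID output file with ms_print:

valgrind --tool=massif --trace-children=yes \
         --massif-out-file=myProcesses.%p ./myProcesses.script
# One massif output file per process in the pipeline; print the top of each report
for f in myProcesses.*; do ms_print "$f" | head -n 30; done

You lose the page-level (mmap/brk) view, but you still get a peak heap figure per process that can be summed across the pipeline, and it is repeatable in a way that polling top is not.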
Some programs allow you to preload the libmemusage.so library and get a report of the memory allocations that were recorded:
$ LD_PRELOAD=libmemusage.so less /etc/passwd
Memory usage summary: heap total: 36212, heap peak: 35011, stack peak: 15008
total calls total memory failed calls
malloc| 39 5985 0
realloc| 3 64 0 (nomove:2, dec:0, free:0)
calloc| 238 30163 0
free| 51 11546
Histogram for block sizes:
0-15 128 45% ==================================================
16-31 13 4% =====
32-47 105 37% =========================================
48-63 2 <1%
64-79 4 1% =
80-95 5 1% =
96-111 3 1% =
112-127 3 1% =
160-175 1 <1%
192-207 1 <1%
208-223 2 <1%
256-271 1 <1%
432-447 1 <1%
560-575 1 <1%
656-671 1 <1%
768-783 1 <1%
944-959 1 <1%
1024-1039 2 <1%
1328-1343 1 <1%
2128-2143 1 <1%
3312-3327 1 <1%
7952-7967 1 <1%
8240-8255 1 <1%
Though I must admit that it doesn't always work -- LD_PRELOAD=libmemusage.so ls never reports anything, for example -- and I wish I knew the conditions that allow it to work or not work.
