All the commands below were run as the root user. To find the PID of Jenkins, I ran:
# ps aux | grep jenkins
Then, with that PID, I ran:
# pmap -x [PID]
Here's the result I got from the command.
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000       4       0       0 r-x-- java
0000000000600000       4       4       4 r---- java
0000000000601000       4       4       4 rw--- java
0000000000b3e000     312     216     216 rw---   [ anon ]
...
00007ffc29848000    1156      32      32 rw---   [ stack ]
00007ffc29976000       8       4       0 r-x--   [ anon ]
ffffffffff600000       4       0       0 r-x--   [ anon ]
---------------- ------- ------- -------
total kB        10027288 1172504 1163812
So, Jenkins seems to be taking approximately 9.6 gigabytes. Currently there are around 35 items in Jenkins, and only 8 of them are built periodically, on a daily basis. I don't believe there is any reason for Jenkins to consume this much memory, so I see three possibilities:
That I figured out the memory usage in a wrong way (the pmap command did not deliver the right figure),
or there is really a problem with the Jenkins configuration
or it is just natural to consume this amount with that number of items
Any Jenkins experts out there? I do need your help.
I'm not a Jenkins expert, but I have some knowledge of Linux memory management and Java applications.
You said "Jenkins seems to be taking approximately 9.6 gigabytes", but that is not the right way to read the memory consumption.
The 9.6 GiB figure is virtual memory, that is, address space the OS has reserved for the process (check your Jenkins Java heap options, such as -Xmx). RSS (Resident Set Size) is the real memory usage, and here it is 1172504 kB, roughly 1.1 GiB.
So my answer is close to your third option: it is natural for Jenkins to consume this amount with that number of items.
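To make the distinction concrete, here is a minimal Python sketch that reads the "total kB" line from the pmap -x output in the question and converts the virtual and resident columns to GiB. The function names are made up for illustration, and it assumes the three numbers appear in the order Kbytes, RSS, Dirty, as in the header of that output.

```python
def parse_pmap_total(line):
    """Parse the 'total kB' summary line printed by `pmap -x`."""
    fields = line.split()
    if fields[:2] != ["total", "kB"]:
        raise ValueError("not a pmap total line")
    # Assumed column order: Kbytes (virtual), RSS (resident), Dirty.
    kbytes, rss, dirty = (int(value) for value in fields[2:5])
    return {"virtual_kb": kbytes, "rss_kb": rss, "dirty_kb": dirty}


def kb_to_gib(kb):
    """Convert kibibytes to gibibytes."""
    return kb / (1024 * 1024)


totals = parse_pmap_total("total kB 10027288 1172504 1163812")
print(f"virtual:  {kb_to_gib(totals['virtual_kb']):.2f} GiB")
print(f"resident: {kb_to_gib(totals['rss_kb']):.2f} GiB")
```

For the numbers in the question this prints about 9.56 GiB virtual and 1.12 GiB resident.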
I hope this will help you.
I'm currently reading Mastering Embedded Linux Programming and I'm on the chapter that covers bootloaders, more specifically U-Boot for the Beaglebone Black.
I have built a cross-compiler and I'm able to build U-Boot; however, I can't make it run the way it is described in the book.
After some experimentation and Googling, I can make it work by writing MLO and u-boot.img in raw mode (using these commands).
However, if I put the files in a FAT32 MBR boot partition, the Beaglebone will not boot. It only shows a string of C's, which indicates that it is trying to get its bootloader from the serial interface and has decided it cannot boot from the SD card.
I have also studied this answer. According to that answer I should be doing everything correctly. I've tried to experiment with the MMC raw mode options in the U-Boot build configuration, but I've not been able to find a change that works.
I feel like there must be something obvious I'm missing, but I can't figure it out. Are there any things I can try to debug this further?
Update: some more details on the partition tables.
When using the "raw way" of putting MLO and u-boot.img on the SD card, I have not created any partitions at all. This works:
$ sudo sfdisk /dev/sda -l
Disk /dev/sda: 117,75 GiB, 126437294080 bytes, 246947840 sectors
Disk model: MassStorageClass
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
When trying to use a boot partition, which does not work, I have this configuration:
$ sudo sfdisk /dev/sda -l
Disk /dev/sda: 117,75 GiB, 126437294080 bytes, 246947840 sectors
Disk model: MassStorageClass
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x3d985ec3
Device Boot Start End Sectors Size Id Type
/dev/sda1 * 2048 133119 131072 64M c W95 FAT32 (LBA)
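One way to cross-check what sfdisk reports is to read the partition table straight from the first sector of the card. The following is only a sketch in Python: the function names are made up for illustration, the device path is whatever your card shows up as, and it relies on nothing more than the standard MBR layout (four 16-byte entries at offset 446, signature 0x55AA at offset 510). It does not say anything about what the board's ROM code actually requires.

```python
import struct


def parse_mbr(sector):
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i:446 + 16 * (i + 1)]
        # Entry layout: status, CHS start, type, CHS end, LBA start, sector count.
        status, _chs_first, ptype, _chs_last, start, count = struct.unpack(
            "<B3sB3sII", raw)
        if ptype == 0:
            continue  # unused slot
        entries.append({
            "index": i + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "start": start,
            "sectors": count,
        })
    return entries


def read_first_sector(device):
    """Read the first 512 bytes of a block device (usually needs root)."""
    with open(device, "rb") as handle:
        return handle.read(512)
```

Calling parse_mbr(read_first_sector("/dev/sda")) on the card above should report one bootable entry of type 0x0c starting at sector 2048 with 131072 sectors, if the table on disk matches the sfdisk listing.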
Update 2: The contents of the boot partition are exactly the same two files that I use for the raw writes, so they are confirmed to work:
$ ls -al
total 1000
drwxr-xr-x 2 peter peter 16384 Jan 1 1970 .
drwxr-x---+ 3 root root 4096 Jul 18 08:44 ..
-rw-r--r-- 1 peter peter 108184 Jul 14 13:56 MLO
-rw-r--r-- 1 peter peter 893144 Jul 14 13:56 u-boot.img
Update 3: I have already tried the following U-Boot options to try to get it to work (in the SPL / TPL menu):
"Support FAT filesystems": This is enabled by default. I can't really find a good reference for the U-Boot options, but I am guessing this is what enables booting from a FAT partition (which is what I'm trying to do).
"MMC raw mode: by sector": I have disabled this. As expected, this breaks booting in raw mode, which is the only thing I have got working so far.
"MMC raw mode: by partition": I have tried enabling this and using partition 1 to load U-Boot from. I'm not sure how to interpret this option. I assumed raw mode does not require partitions, but this asks which partition to use...
In general, if anyone can point me to a U-Boot configuration reference, that would already be very helpful. Right now, I'm just randomly turning things on and off that sound like they may help.
After migrating Informix binaries to a new server by OS-level cloning, running oninit -vy produced a warning that some chunks could not be opened. I then asked the system administrator to link the missing chunks and ran oninit -vy again; this time it warned that those chunks are bad chunks. What is the reason for this? Could a mistake have been made when the chunks were re-configured on the new server?
nwnhost#nwn$oninit -vy
Reading configuration file '/informix/strim/inf11/etc/onconfig'...succeeded
Creating /INFORMIXTMP/.infxdirs...succeeded
Checking config parameters...succeeded
Allocating and attaching to shared memory...succeeded
Creating resident pool 1629910 kbytes...succeeded
Allocating 6606044 kbytes for buffer pool of 2K page size...succeeded
Allocating 19267600 kbytes for buffer pool of 8K page size...succeeded
Creating infos file "/informix/strim/inf11/etc/.infos.ocs_test"...succeeded
Linking conf file "/informix/strim/inf11/etc/.conf.ocs_test"...succeeded
Initializing rhead structure...succeeded
Writing to infos file...succeeded
Initialization of Encryption...succeeded
Initializing ASF...succeeded
Initializing Dictionary Cache and SPL Routine Cache...succeeded
Bringing up ADM VP...succeeded
Creating VP classes...succeeded
Forking main_loop thread...succeeded
Initializing DR structures...succeeded
Forking 1 'soctcp' listener threads...succeeded
Starting tracing...succeeded
Initializing 128 flushers...succeeded
Initializing SDS Server network connections...succeeded
Initializing log/checkpoint information...succeeded
Initializing dbspaces...succeeded
Opening primary chunks...Bad Primary Chunk '/dev/chunk1186'.
Bad Primary Chunk '/dev/chunk1188'.
Bad Primary Chunk '/dev/chunk1265'.
Bad Primary Chunk '/dev/chunk1279'.
Bad Primary Chunk '/dev/chunk1317'.
Bad Primary Chunk '/dev/chunk1319'.
Bad Primary Chunk '/dev/chunk1320'.
succeeded
Validating chunks...succeeded
Initialize Async Log Flusher...succeeded
Starting B-tree Scanner...succeeded
Init ReadAhead Daemon...succeeded
Initializing DBSPACETEMP list...succeeded
Checking database partition index...succeeded
Initializing dataskip structure...succeeded
Checking for temporary tables to drop...succeeded
Updating Global Row Counter...succeeded
Forking onmode_mon thread...succeeded
Creating periodic thread...succeeded
Creating periodic thread...succeeded
Starting scheduling system...succeeded
Verbose output complete: mode = 5
Here is the onstat -d output for those chunks:
nwnhost#nwn$onstat -d | egrep 'chunk1188|chunk1186|chunk1265|chunk1279|chunk1317|chunk1319|chunk1320'
7be211028 1252 36 48 2097125 0 PD-B-- /dev/chunk1186
7be211428 1254 36 48 2097125 0 PD-B-- /dev/chunk1188
7be22d028 1331 37 48 2097139 0 PD-B-- /dev/chunk1265
7be22fc28 1345 38 48 2097000 0 PD-B-- /dev/chunk1279
7be241228 1383 48 48 2097139 0 PD-B-- /dev/chunk1317
7be241628 1385 38 48 2097139 0 PD-B-- /dev/chunk1319
7be241828 1386 37 48 2097000 0 PD-B-- /dev/chunk1320
nwnhost#nwn$
I was able to resolve the above error by bringing those chunks back online with the command below.
onspaces -s [dbspace_name] -p [pathname] -o [offset] -O
For example:
onspaces -s dbspace1 -p /dev/chunk1186 -o 96 -O
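With several chunks affected, the commands can be generated from the onstat -d lines instead of typed by hand. The Python sketch below is illustrative only and rests on assumptions that should be checked against your instance: that the chunk lines have the eight columns shown above (address, chunk number, dbspace number, offset, size, free, flags, path), that a "D" as the second flag character marks a chunk that is down, and that the offset column is in 2 KB pages, which is at least consistent with offset 48 becoming -o 96 in the example. The dbspace_names mapping is hypothetical and has to be filled in from the dbspace section of onstat -d.

```python
def down_chunk_commands(onstat_lines, dbspace_names, page_kb=2):
    """Build onspaces commands for chunks flagged as down in onstat -d lines."""
    commands = []
    for line in onstat_lines:
        fields = line.split()
        if len(fields) != 8:
            continue  # not a chunk line in the assumed 8-column layout
        _addr, _chunk, dbs, offset, _size, _free, flags, path = fields
        # Assumption: second flag character 'D' means the chunk is down.
        if len(flags) < 2 or flags[1] != "D":
            continue
        # Unknown dbspace numbers keep the placeholder from the template.
        name = dbspace_names.get(int(dbs), "[dbspace_name]")
        # Assumption: offset is in pages of page_kb KB; onspaces -o takes KB.
        offset_kb = int(offset) * page_kb
        commands.append(f"onspaces -s {name} -p {path} -o {offset_kb} -O")
    return commands


lines = [
    "7be211028 1252 36 48 2097125 0 PD-B-- /dev/chunk1186",
    "7be211428 1254 36 48 2097125 0 PD-B-- /dev/chunk1188",
]
for command in down_chunk_commands(lines, {36: "dbspace1"}):
    print(command)
```

With dbspace 36 mapped to the name dbspace1, the first line yields the same command as the example above. Review the generated commands before running any of them.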
I'm trying to understand why the limits have decided a task needs to be killed, and how it's doing the accounting. When my GCE Docker container kills a process, it shows something like:
Task in /404daacfcf6b9e55f71b3d7cac358f0dc921a2d580eed460c2826aea8e43f05e killed as a result of limit of /404daacfcf6b9e55f71b3d7cac358f0dc921a2d580eed460c2826aea8e43f05e
memory: usage 2097152kB, limit 2097152kB, failcnt 74571
memory+swap: usage 0kB, limit 18014398509481983kB, failcnt 0
kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
Memory cgroup stats for /404daacfcf6b9e55f71b3d7cac358f0dc921a2d580eed460c2826aea8e43f05e: cache:368KB rss:2096784KB rss_huge:0KB mapped_file:0KB writeback:0KB inactive_anon:16KB active_anon:2097040KB inactive_file:60KB active_file:36KB unevictable:0KB
[ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[ 4343] 0 4343 5440 65 15 0 0 bash
[ 4421] 0 4421 265895 6702 77 0 0 npm
[ 4422] 0 4422 12446 2988 28 0 0 gunicorn
[ 4557] 0 4557 739241 346035 1048 0 0 gunicorn
[ 4560] 0 4560 1086 24 8 0 0 sh
[ 4561] 0 4561 5466 103 15 0 0 bash
[14594] 0 14594 387558 168790 672 0 0 node
Memory cgroup out of memory: Kill process 4557 (gunicorn) score 662 or sacrifice child
Killed process 4557 (gunicorn) total-vm:2956964kB, anon-rss:1384140kB, file-rss:0kB
Supposedly the memory hit a 2GB usage limit, and something needs to die. According to the cgroup stats, I appear to have 2GB of usage in active_anon and rss.
When I look at the table of process stats, I don't see where the 2GB is:
For rss, I see the two major processes 346035 + 168790 = 514MB?
For total_vm, I see three major processes 265895 + 739241 + 387558 = 1.4GB?
But when it decides to kill the gunicorn process, it says it had 3GB of Total VM and 1.4GB of Anon RSS. I don't see how this follows from the above numbers at all...
For most of its life, according to top, the gunicorn process appears to hum along with 555m RES and 2131m VIRT and 22% MEM * 2.5GB box = 550MB of memory usage. (I haven't yet been able to time it properly to peek at top values at the time it dies...)
Can someone help me understand this?
Under what accounting do these sum to 2GB of usage? (virtual? rss? something else?)
Is there something else besides top/ps I should use to track how much memory a process is using for the purposes of docker's killing it?
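As far as I know, total_vm and rss in that table are counted in 4 kB pages (see https://stackoverflow.com/a/43611576), not in kB.
So for PID 4557:
rss=346035 means anon-rss:1384140kB (= 346035 * 4 kB)
total_vm=739241 means total-vm:2956964kB (= 739241 * 4 kB)
That accounts for the numbers in the kill message, and applying the same factor to the rss of every process in the table brings the total to about 2 GB, which matches the limit.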
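The conversion can be checked against the whole process table from the question. This is a small Python sketch, not an official tool: the regular expression is written for the column layout shown in that log, and the 4 kB page size is an assumption that holds on typical x86-64 systems.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages

# [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(\d+)\]\s+\d+\s+\d+\s+(\d+)\s+(\d+)\s+\d+\s+\d+\s+-?\d+\s+(\S+)")


def parse_oom_table(text):
    """Return pid, name, total_vm and rss (both in kB) for each process row."""
    rows = []
    for match in ROW.finditer(text):
        pid, total_vm, rss, name = match.groups()
        rows.append({
            "pid": int(pid),
            "name": name,
            "total_vm_kb": int(total_vm) * PAGE_KB,
            "rss_kb": int(rss) * PAGE_KB,
        })
    return rows


TABLE = """
[ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[ 4343] 0 4343 5440 65 15 0 0 bash
[ 4421] 0 4421 265895 6702 77 0 0 npm
[ 4422] 0 4422 12446 2988 28 0 0 gunicorn
[ 4557] 0 4557 739241 346035 1048 0 0 gunicorn
[ 4560] 0 4560 1086 24 8 0 0 sh
[ 4561] 0 4561 5466 103 15 0 0 bash
[14594] 0 14594 387558 168790 672 0 0 node
"""

rows = parse_oom_table(TABLE)
for row in rows:
    print(row["pid"], row["name"], row["total_vm_kb"], row["rss_kb"])
print("sum of rss:", sum(row["rss_kb"] for row in rows), "kB")
```

The rss column sums to 2098828 kB, which is within 0.1% of the 2097152 kB limit in the log, and the row for PID 4557 reproduces the 1384140 kB and 2956964 kB from the kill message.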
I have a Java-based application with a very large code base (~1M lines of source code). I am using Jenkins with sonar-runner-2.4 to run an analysis with code coverage and test case counts. I upgraded the SonarQube server from 5.4 to 6.3.1. Before the upgrade this job took 9 hours to complete the whole analysis (still a very long time, but acceptable); after the upgrade to SonarQube 6.3.1 the same job takes 13 hours to complete the same analysis.
How do I improve the analysis time, at least back to the earlier 9 hours?
EDIT
Here is my JAVA_OPTS for sonarqube-6.3.1 instance
sonar.web.javaOpts=-Xmx6G -Xms2G -XX:MaxPermSize=1G -XX:+HeapDumpOnOutOfMemoryError -Djava.net.preferIPv4Stack=true
Available Hardware :
$lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 5
CPU MHz: 1596.000
BogoMIPS: 3999.44
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s): 4-7
Available Memory :
$free -m
total used free shared buff/cache available
Mem: 128714 58945 66232 430 3535 68298
Swap: 32767 957 31810
sonar-project.properties for the long running job:
sonar-project.properties
As you haven't really given many details, I can't really give many details in the answer, but the simple answer is that you have to make the scan do less work.
Look at your codebase. Is your scan processing generated classes? Is it scanning test classes? Is it scanning classes that have little real business logic? If you answer "yes" to any of those, consider excluding those classes.
Look at the SonarQube plugins you're using. Are you running every possible plugin you can run? Are there some heuristics you don't need to run, or perhaps you could run less frequently?
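Before deciding what to exclude, it can help to see where the lines actually are. The Python sketch below walks a source tree and counts lines of .java files in three rough buckets. The bucket rules (a path component containing "generated", or a component named "test" or "tests") are assumptions for illustration and will need adjusting to your project layout; the exclusions themselves are then configured in SonarQube, for example through the sonar.exclusions property.

```python
import os
from collections import Counter


def classify(relative_path):
    """Put a file into a rough bucket based on its path components."""
    parts = [part.lower() for part in relative_path.split(os.sep)]
    if any("generated" in part for part in parts):
        return "generated"
    if "test" in parts or "tests" in parts:
        return "test"
    return "main"


def count_java_lines(root):
    """Count lines of .java files under root, grouped by bucket."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".java"):
                continue
            full_path = os.path.join(dirpath, filename)
            bucket = classify(os.path.relpath(full_path, root))
            # Read as bytes so odd encodings do not break the count.
            with open(full_path, "rb") as handle:
                totals[bucket] += sum(1 for _line in handle)
    return totals
```

Running count_java_lines(".") from the project root gives a quick idea of how much of the ~1M lines is generated or test code, and so how much an exclusion could save.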
I have a couple of processes running a tool I've written that are joined by pipes, and I would like to measure their combined memory usage with valgrind. So far, I have tried something like:
$ valgrind --tool=massif --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script
Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:
foo | foo > /dev/null
Valgrind doesn't seem to capture the collected memory usage of this the way I expect. If I use top to track this, I get (for sake of argument) 10% memory usage on the first foo, and then another 10% collects on the second foo before the myProcesses.script completes. This is the sort of thing I want to measure: the usage of both processes. Valgrind instead returns the following error:
Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.
Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?
The numbers that top returns while polling seem hand-wavy, to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those, as well.
EDIT - Fixed typo with valgrind option.
EDIT 2 - For some reason, it appears that the option --pages-as-heap is giving us troubles with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor in the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap helped get testing working again. Thanks to MrGomez for the great help.
With the corrected valgrind options from your edit, this seems to just work for me in Valgrind 3.6.1. My invocation:
<me>#harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
--trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
--massif-out-file=myProcesses".%p" ./testscript.sh
==21067== Massif, a heap profiler
==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21067== Command: ./testscript.sh
==21067==
==21068== Massif, a heap profiler
==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21068== Command: /bin/ls
==21068==
==21070== Massif, a heap profiler
==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Massif, a heap profiler
==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Command: /bin/sleep 5
==21069==
==21070== Command: /usr/bin/less
==21070==
==21068==
(END) ==21069==
==21070==
==21067==
The contents of my test script, testscript.sh:
ls | sleep 5 | less
Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):
desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
cmd: ./testscript.sh
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=0
mem_heap_B=118784
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
...
#-----------
snapshot=18
#-----------
time=108269
mem_heap_B=1708032
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=peak
n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
n3: 1474560 0x4015E42: mmap (mmap.S:62)
n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
n0: 12288 in 1 place, below massif's threshold (01.00%)
n0: 28672 in 3 places, all below massif's threshold (01.00%)
n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
n0: 20480 in 2 places, all below massif's threshold (01.00%)
n0: 233472 0xFFFFFFFF: ???
#-----------
snapshot=19
#-----------
time=108269
mem_heap_B=1703936
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=20
#-----------
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
Massif continues to complain about heap allocations in the remainder of my files. Note this is very similar to your error.
I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I used this with the defaults hanging off ./configure) will fix the issue.
Either way, this seems to be expected with Massif.
Some programs allow you to preload the libmemusage.so library and get a report of the memory allocations that were recorded:
$ LD_PRELOAD=libmemusage.so less /etc/passwd
Memory usage summary: heap total: 36212, heap peak: 35011, stack peak: 15008
         total calls   total memory   failed calls
 malloc|          39           5985              0
realloc|           3             64              0  (nomove:2, dec:0, free:0)
 calloc|         238          30163              0
   free|          51          11546
Histogram for block sizes:
0-15 128 45% ==================================================
16-31 13 4% =====
32-47 105 37% =========================================
48-63 2 <1%
64-79 4 1% =
80-95 5 1% =
96-111 3 1% =
112-127 3 1% =
160-175 1 <1%
192-207 1 <1%
208-223 2 <1%
256-271 1 <1%
432-447 1 <1%
560-575 1 <1%
656-671 1 <1%
768-783 1 <1%
944-959 1 <1%
1024-1039 2 <1%
1328-1343 1 <1%
2128-2143 1 <1%
3312-3327 1 <1%
7952-7967 1 <1%
8240-8255 1 <1%
Though I must admit that it doesn't always work -- LD_PRELOAD=libmemusage.so ls never reports anything, for example -- and I wish I knew the conditions that allow it to work or not work.