Jenkins server java.exe memory is growing very fast

We're running a Jenkins server with a few slaves that run the builds. Lately, more and more builds are running at the same time.
I see the memory of the java.exe process on the Jenkins server just increasing, and it does not decrease even after the jobs have finished.
Any idea why this happens?
We're running Jenkins ver. 1.501.
Is there a way to make the Jenkins server service wait until the last job has finished, then restart automatically?

I can't seem to find a reference on this (still posting an answer because it's too long for comments ;-) ) but this is what I've observed using the Oracle JVM:
If more memory than currently reserved is needed, the JVM reserves more. So far so good. What it doesn't seem to do is release the memory once it's not needed anymore. You can watch this behaviour by turning on the heap size indicator in Eclipse.
I'd say the same happens with Jenkins. A running Jenkins with only a few projects can easily pass the 1 GB mark. If you have a lot of concurrent builds, Jenkins needs a lot of memory at some point. After the builds are done and the heap usage has decreased, the JVM keeps the memory reserved. It is practically "empty" but still claimed by the JVM, so it's unavailable to other processes.
Again: it's just an observation. I'd be happy if someone with deeper insight into Java memory management would back this up (or disprove it).
As for a practical solution, I'd say you're going to have to live with it to some extent. Jenkins IS very hungry for memory. Restarting it solves the problem only temporarily. At least it should stop claiming memory at some point, because the "empty" reserved memory should be reused. If it doesn't, this really sounds like a memory leak and would be a bug.

Jenkins' [excessive] use of memory without bounds seems to be a common observation. The Jenkins Wiki gives some suggestions for "I'm getting OutOfMemoryErrors".
We have also found that the Monitoring Plugin is useful for keeping an eye on the memory usage and helping us know if we might need to restart Jenkins soon.
Is there a way to make the Jenkins server service wait until the last job has finished, then restart automatically?
Check out the Restart Safely Plugin.

Related

Application slow down due to zombie process?

We face application downtime while uploading files to Azure storage via Arc.
There is no specific code error, but we get a timeout.
It is resolved once the Azure web app is restarted.
It happens intermittently.
Since we cannot find the root cause, we asked whether there is an issue on the Azure side.
The Microsoft team says the system health is OK but points to accumulated zombie processes: EPMD and inet_gethost.
On searching, I understand that these are created by the Erlang runtime.
Please let me know if we have some process to kill these zombie processes from time to time?
Also, do they contribute to the application downtime?
Thanks
Please let me know if we have some process to kill these zombie processes from time to time?
If you're running a sensible init process, these zombie processes should be correctly reaped. This can often be a problem if you run Erlang as the top-level process inside a container, for instance. Can you give more detail of your environment?
Also, do they really contribute to the application downtime?
Depends on how many of them there are, but probably not, no.
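To see how many zombies have actually accumulated, and under which parent, the process table can be inspected. The following is only a sketch: it assumes a Linux host where `ps -eo pid,ppid,stat,comm` is available, and the sample output and the function name `find_zombies` are made up for illustration. Note that a zombie is already dead and cannot be killed; it goes away only when its parent (or init, after the parent exits) reaps it.

```python
# Sketch: list zombie processes from `ps -eo pid,ppid,stat,comm` output.
# The sample text below is invented for illustration, not real output.

SAMPLE_PS_OUTPUT = """\
  PID  PPID STAT COMMAND
    1     0 Ss   beam.smp
   42     1 Z    inet_gethost <defunct>
   43     1 Z    epmd <defunct>
   50     1 S    sh
"""


def find_zombies(ps_output):
    """Return (pid, ppid, command) for every process whose state starts with Z."""
    zombies = []
    for line in ps_output.splitlines():
        parts = line.split(None, 3)
        # Skip the header row and any malformed lines.
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        pid, ppid, stat, command = parts
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), command.strip()))
    return zombies


for pid, ppid, command in find_zombies(SAMPLE_PS_OUTPUT):
    print(f"zombie pid={pid} parent={ppid} command={command}")
```

The parent PID is the interesting column: if all zombies share the same parent and that parent is the top-level process in a container, that matches the situation described above.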

Jenkins: jobs in queue are stuck and not triggered to be restarted

For a while now, our Jenkins has been experiencing critical problems. Jobs hang and our job scheduler does not trigger the builds. After a Jenkins service restart, everything is back to normal, but after some period of time all the problems return (this period can be a week, a day, or even less). Any idea where we can start looking? I'd appreciate any help on this issue.
Muatik has made a good point in his comment: the recommended approach is to run jobs on agent (slave) nodes. If you already do that, you can look at:
Jenkins master machine CPU, RAM and hard disk usage. Access the machine and/or use plugin like Java Melody. I have seen missing graphics in the builds test results and stuck builds due to no hard disk space. You could also have hit the limit of RAM or CPU for the slaves/jobs you are executing. You may need more heap space.
Look at Jenkins log files, start with severe exceptions. If the files are too big or you see logrotate exceptions, you can change the logging levels, so that fewer exceptions are logged. For more details see my article on this topic. Try to fix exceptions that you see logged.
Go through recently made changes that can be the cause of such behavior, for example, new plugins, changes to config files (jenkins.xml)?
Look at TCP connections. Run netstat -a. Are there suspicious connections (CLOSE_WAIT status)?
Delete old builds that you do not need.
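The netstat check in the list above can be sketched as a small script that counts connections stuck in CLOSE_WAIT (the state name netstat prints) per remote host. This assumes the state is the last column and the foreign address the one before it, as in plain `netstat -a` output; the sample text and the function name `close_wait_peers` are invented for illustration.

```python
from collections import Counter

# Invented sample of `netstat -a` style output, for illustration only.
SAMPLE_NETSTAT_OUTPUT = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.9:51234          ESTABLISHED
tcp        1      0 10.0.0.5:8080           10.0.0.7:40022          CLOSE_WAIT
tcp        1      0 10.0.0.5:8080           10.0.0.7:40023          CLOSE_WAIT
"""


def close_wait_peers(netstat_output):
    """Count TCP connections in CLOSE_WAIT, grouped by remote host."""
    peers = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only consider TCP rows whose last column is the CLOSE_WAIT state.
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue
        if fields[-1] != "CLOSE_WAIT":
            continue
        # Drop the port from "host:port" to group by host.
        remote_host = fields[-2].rsplit(":", 1)[0]
        peers[remote_host] += 1
    return peers


for host, count in close_wait_peers(SAMPLE_NETSTAT_OUTPUT).most_common():
    print(f"{host}: {count} connection(s) in CLOSE_WAIT")
```

A count that keeps growing for the same remote host between runs would be the suspicious pattern to look for.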
We had been facing this issue for the last 4 months and tried everything: changing CPU and memory resources, increasing the desired nodes in the ASG. But nothing seemed to work.
Solution:
1. Go to Manage Jenkins -> System Configuration -> Maven project configurations
2. In the "Usage" field, select "Only build jobs with label expressions matching this node"
Doing this resolved it, and Jenkins is working like a rocket now :)

How can I reduce Jenkins master/slave communication?

I have set up a Jenkins master (linux) and slave (Windows) on Amazon EC2. My build is running on the slave and it is considerably slower than what I was seeing on a similar desktop machine (c4.large).
Checking into the load on the slave, neither the CPU cores, nor the memory is fully used (CPU at 60%, 30% each, memory stable at about 2.5GB)
However, I'm starting to think the bottleneck is in the network traffic. I can see that there seems to be a lot of network traffic going on between master and slave. It averages at about 500Kbs going out and 250Kbs coming in, but there are regular spikes going into the multiple Mbs.
I've searched around, but I can't really figure out what data Jenkins is sending.
The job is doing quite a bit of logging, but this is not the only source. There's a console log of about 15MB for a 2 hour build, which would translate to roughly (15*1024*8)/(2*60*60) = 17Kbs
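The estimate above can be reproduced with a few lines. This is just the same arithmetic written out, with the observed 500 Kbs outgoing average from the question used for comparison; the function name is made up.

```python
def average_kbit_per_second(size_mb, duration_hours):
    """Average rate in kilobits per second for size_mb megabytes sent over duration_hours."""
    kilobits = size_mb * 1024 * 8          # MB -> KB -> kilobits
    seconds = duration_hours * 60 * 60     # hours -> seconds
    return kilobits / seconds


log_rate = average_kbit_per_second(15, 2)
observed_rate = 500  # average outgoing rate quoted in the question, in Kbs

print(f"console log accounts for about {log_rate:.1f} Kbs")
print(f"observed traffic is about {observed_rate / log_rate:.0f}x that")
```

So the raw log size alone explains only a small fraction of the observed average.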
So my questions:
What else is Jenkins communicating between master and slaves?
How can I reduce this?
Update: Just to be sure, I disabled all logging during the build. And it turns out that was the source of the traffic. I don't understand why there's so much traffic given my calculations above. Maybe there's some kind of polling/acknowledging going on that's causing a lot of overhead, I still don't know.
It's apparently also not the main source of the slowdown, but still, I would like to find a way to reduce this network traffic as this is going to impact the AWS bill.

Memory profiling on Google Cloud Dataflow

What would be the best way to debug memory issues of a dataflow job?
My job was failing with a GC OOM error, but when I profile it locally I cannot reproduce the exact scenarios and data volumes.
I'm running it now on 'n1-highmem-4' machines, and I don't see the error anymore, but the job is very slow, so obviously using a machine with more RAM is not the solution :)
Thanks for any advice,
G
Please use the options --dumpHeapOnOOM and --saveHeapDumpsToGcsPath (see docs).
This will only help if one of your workers actually OOMs. Additionally, if it's not OOMing but you observe high memory usage nevertheless, you can try running jmap (for example, jmap -dump:format=b,file=heap.hprof PID) on the harness process on the worker to obtain a heap dump at runtime.

building multiple jobs in jenkins performance

In Jenkins I have 100 java projects. Each has its own build file.
Every time, I want to clear the build file and compile all the source files again.
Using the bulkbuilder plugin, I tried compiling all the jobs, with 100 jobs running in parallel.
But performance is very bad. If a job takes 1 min individually, in the batch it takes 20 mins. The larger the batch, the longer it takes. I am running this on a powerful server, so memory and CPU are not a problem.
Please suggest how to overcome this and what configuration needs to be done in Jenkins.
I am launching Jenkins using the war file.
Thanks.
Even though you say you have enough memory and CPU resources, you seem to imply there is some kind of bottleneck when you increase the number of parallel running jobs. I think this is understandable. Even though I am not a java developer, I think most of the java build tools are able to parallelize build internally. I.e. building a single job may well consume more than one CPU core and quite a lot of memory.
Because of this I suggest you need to monitor your build server and experiment with different batch sizes to find an optimal number. You should execute e.g. "vmstat 5" while builds are running and see if you have idle cpu left. Also keep an eye on the disk I/O. If you increase the batch size but disk I/O does not increase, you are consuming all of the I/O capacity and it probably will not help much if you increase the batch size.
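One way to compare batch sizes is by throughput (jobs finished per minute) rather than by how long a single job takes inside the batch. The sketch below assumes you have timed a few batches yourself; the numbers in `measurements` are invented placeholders, not results from any real server, and the function names are hypothetical.

```python
def throughput(batch_size, wall_clock_minutes):
    """Jobs completed per minute for one timed batch."""
    return batch_size / wall_clock_minutes


def best_batch_size(measurements):
    """Pick the batch size with the highest throughput.

    measurements maps batch size -> wall-clock minutes until the whole batch finished.
    """
    return max(measurements, key=lambda size: throughput(size, measurements[size]))


# Hypothetical timings: replace with your own measurements.
measurements = {1: 1.0, 5: 2.0, 10: 3.0, 20: 8.0, 50: 30.0}

for size, minutes in sorted(measurements.items()):
    print(f"batch of {size:>2}: {throughput(size, minutes):.2f} jobs/min")
print("best batch size:", best_batch_size(measurements))
```

With these placeholder numbers throughput peaks at a batch of 10 and falls off beyond that, which is the kind of knee the vmstat and disk I/O observations should help explain.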
When you have found the optimal batch size (i.e. how many executors to configure for the build server), you can maybe tweak other things to make things faster:
Try to spend as little time checking out code as possible. Instead of deleting workspace before build starts, configure the SCM plugin to remove files that are not under version control. If you use git, you can use a local reference repo or do a shallow clone or something like that.
You can also try to speed things up by using SSD disks.
You can get more servers, run Jenkins slaves on them and utilize the cpu and I/O capacity of multiple servers instead of only one.
