Jenkins Builds - PhantomJS was not killed in 2000 ms, sending SIGKILL

We have started seeing intermittent issues on our Jenkins build server when running jasmine unit tests with Karma.
We see the following error:
PhantomJS was not killed in 2000 ms, sending SIGKILL
This usually goes away the next time we run the build, and we may not see the issue again for a couple of days.
We don't see this when running tests locally, so I'm wondering what could be different about our Jenkins environment that could cause this.
Can anyone offer any suggestions please?
Thanks

Related

How to diagnose slow startup time in Cloud Run containers?

I am running some services with Google Cloud Run. While performance has been satisfactory, there's a recurrent issue with extremely slow startup time, which leads to occasionally dropped requests when new containers can't spin up in time.
Currently, with first gen execution environments and startup CPU boost enabled, Google's dashboard reports around 18 to 50 seconds of startup time. Image is based on ruby:3.0.2, and it runs a Ruby on Rails 6 application. In a development environment, startup (timed from run to container accepting requests) never seems to take more than 5 seconds.
I want to know what tools are available to diagnose this issue, and if there are any obvious pitfalls with my specific case that I might be missing.
I've tried changing the service's configuration options, to no avail. The biggest suspect is a startup bash script that handles migrations on the first boot, and asset compiling in development. However, I've tried building with an empty script, and the problem persists. I also think the container images might be too large (around 700 MB), but I haven't gotten around to slimming them down, nor have I found evidence that this is the problem.

node webpack hangs. How to debug?

I am trying to build the ORO Platform JS assets. In a non-Docker environment it works like a charm, but in Docker (either during docker build or at container execution) the build process stops and hangs at 100% CPU.
67% [0] building 1416/1470 modules 54 active ... ndles/orotask/sidebar_widgets/assigned_tasks/css/styles.scss
The build does not necessarily hang on the exact same file, and it does succeed on some occasions.
I've tried to reduce the process to a minimum by removing Happy, and tested with --max-old-space-size=4096, but no luck.
Sources : https://github.com/oroinc/platform/tree/master/build
How would you recommend debugging this?
Thanks
There is a known issue where a Node.js process hangs when it is run as the root user. As far as I know, there is no workaround for now. Consider using another user to build the assets.
If that's not the case, please review the Troubleshooting section in OroAssetBundle; that might help.

Apache Beam Dataflow Jobs started failing with: Workflow failed

I've been running batch jobs for over a week now with DataflowRunner without a problem but all of a sudden starting from today the jobs started failing with the error message below. The workers don't seem to start and there's no log in stackdriver at all.
Anything I'm missing here?
Dataflow SDK version: 2.0.0
Submitted job: 2017-08-29_09_43_20-9537473353894635176
2017-08-29 16:44:24 ERROR MonitoringUtil$LoggingHandler:101 - 2017-08-29T16:44:22.277Z: (54a5da9d57fd266d): Workflow failed.
EDIT:
If I remove --zone=europe-west2-b from the batch run, it works, which indicates that there might be something wrong with this zone.
I took a look at your job. It failed because it couldn't get quota to bring up the workers. Likely you do not have quota in that zone. This error is not reported back correctly, but it should be fixed in the next release.

Deploy Glassfish : V3 cannot process this command at this time, please wait

I'm trying to deploy my ear with Jenkins on my Glassfish server with the following command:
nohup /home/hadrienmp/bin/glassfish3/bin/asadmin deploy --force metier-ear.ear > Output.out 2> Error.err < /dev/null &
Sometimes it works, but most of the time I get the following error message: 'V3 cannot process this command at this time, please wait'.
My Glassfish server hosts web services that are used by other developers. I don't know whether that can prevent Glassfish from deploying my ear correctly.
Do you have any idea what I might be doing wrong?
In the end I figured it out. I have 4 projects that depend on each other to deploy every night. What each build did was restart the Glassfish after the deploy to avoid any memory issue.
I think that the builds run too quickly for the Glassfish to have fully restarted. To fix this I just run my 4 builds 30 minutes apart from each other.
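As an alternative to spacing the builds by a fixed interval, the deploy could be retried until the server accepts it. The following is only a minimal bash sketch: `retry_until_ok` is a hypothetical helper (not part of asadmin or Jenkins), the attempt count and delay are illustrative, and it assumes the deploy command exits with a non-zero status when Glassfish refuses it.

```shell
#!/usr/bin/env bash
# retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or ATTEMPTS is exhausted.
# Hypothetical helper for illustration only.
retry_until_ok() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    # Sleep between attempts, but not after the last one.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (assumption: asadmin exits non-zero when the deploy is refused):
# retry_until_ok 10 60 /home/hadrienmp/bin/glassfish3/bin/asadmin deploy --force metier-ear.ear
```

Retrying only helps if the failure is transient, as the restart explanation above suggests; it would not fix a deploy that fails for another reason.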

please wait while jenkins is restarting- waiting long

I updated some plugins and restarted Jenkins, but now it says:
Please wait while Jenkins is restarting
Your browser will reload automatically when Jenkins is ready.
It is taking too long (I have been waiting for the last 40 minutes). I have only 1 project with around 20 builds. I have restarted Jenkins many times before and it worked fine, but now it is stuck.
Is there any way to kill or suspend Jenkins to avoid this wait?
I had a very similar issue when using Jenkins' built-in restart function. To fix it I killed the service (with crossed fingers), but somehow it kept serving the "Please wait" page. I guess it is served by a separate thread, but since I could not see any running java or jenkins processes, I restarted the server to stop it.
After the reboot Jenkins worked, but it was not updated. To make it work I ran the update again and restarted the jenkins service manually. It took less than a minute and worked just fine.
Jenkins seems to have a number of bugs related to restarting, and at least one is unresolved: jenkins issue
Windows ONLY.
None of the solutions here worked for me, and restarting the server was not an option. If you are in the same situation:
I had to kill java.exe and restart the jenkins service. After I did this, Jenkins reloaded several times and then went back to normal.
I was stuck on the Jenkins restarting page for 10-ish minutes until I did this.
Hope this helps.
Running this in the command line helped me:
service jenkins restart
I had a similar issue when updating plugins from the plugin update page with the restart Jenkins option checked. Jenkins only showed the waiting message for a long time.
I solved the issue by restoring the .bak files of the plugins that I tried to update to .jpi files.
I did the following on my Jenkins server:
cd $JENKINS_HOME/plugins/
sudo mv git.bak git.jpi
.
. (more plugin files)
.
sudo mv ldap.bak ldap.jpi
sudo /sbin/service jenkins restart
Check Event Viewer.
I found that my Java died.
Faulting application java.exe, version 7.0.250.17, time stamp 0x51c4b3fd, faulting module ntdll.dll, version 6.0.6002.18541, time stamp 0x4ec3e39f, exception code 0xc0000374, fault offset 0x000abc4f, process id 0x1188, application start time 0x01cee4f42968bc81.
Finally I found that it's a problem with Jenkins 1.540. Don't use it.
https://issues.jenkins-ci.org/browse/JENKINS-20630
I faced the same issue after upgrading some plugins on Windows. jenkins.err.log showed this error:
Exception in thread "main" java.io.IOException: Jenkins has failed to create a temporary file in C:\Users\builder\AppData\Local\Temp\
at Main.extractFromJar(Main.java:350)
at Main._main(Main.java:194)
at Main.main(Main.java:91)
Caused by: java.io.IOException: There is not enough space on the disk
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(Unknown Source)
at Main.extractFromJar(Main.java:347)
... 2 more
The problem was that the TEMP folder of the jenkins user had lots of temporary files. After cleaning that folder jenkins restarted correctly.
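The report above is from Windows, but the same "not enough space" failure can happen on any host. As a hedged illustration for a Linux or macOS Jenkins server, the sketch below checks the free space on the filesystem that holds a directory before a restart. The function names `free_kb` and `has_free_kb` and the threshold are hypothetical; only standard `df` and `awk` are used.

```shell
#!/usr/bin/env bash
# free_kb DIR
# Prints the free space, in kilobytes, on the filesystem holding DIR.
free_kb() {
  # -P forces POSIX single-line output; -k reports 1024-byte blocks.
  # Column 4 of the second line is the "Available" value.
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_kb DIR MIN_KB
# Succeeds if at least MIN_KB kilobytes are free for DIR.
has_free_kb() {
  [ "$(free_kb "$1")" -ge "$2" ]
}

# Example (the 512 MB threshold is an arbitrary illustration):
# has_free_kb "${TMPDIR:-/tmp}" 524288 || echo "temp directory is low on space" >&2
```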
I just performed a restart on the server. That fixed the issue!
Run this in Command Prompt:
C:\>service jenkins restart
Or
open the Services console (Win + R, then services.msc), find Jenkins, and click Restart.
For me, the cause seemed to be having lots of old job build logs hanging around. To clean them up, I ran:
cd $JENKINS_HOME/jobs
find . -type d -name 'builds' | xargs -n 1 bash -c 'rm -rf "$0"/[1-9]*'
Then I stopped and started Jenkins again, and it came up within a minute.
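Because that cleanup deletes build records irreversibly, it may be worth listing what it would remove first. The following is a minimal dry-run sketch; `list_numbered_builds` is a hypothetical name, and the function only prints paths without deleting anything.

```shell
#!/usr/bin/env bash
# list_numbered_builds JOBS_DIR
# Prints every entry the cleanup would delete: anything whose name starts
# with a digit 1-9 directly inside a directory named "builds".
list_numbered_builds() {
  # -prune stops find from descending into a matched build directory,
  # so only the top-level numbered entries are printed.
  find "$1" -path '*/builds/[1-9]*' -prune -print
}

# Example:
# list_numbered_builds "$JENKINS_HOME/jobs" | wc -l
```

Entries such as symlinks or files whose names do not start with a digit are not listed, matching the `[1-9]*` glob in the original command.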
Credit to: https://stackoverflow.com/a/39230597/2255242
This is an old thread, but my personal recommendation is to WAIT before attempting to do anything (such as restarting the service).
I wasted hours once trying to fix something that turned out to be not an issue in the first place. In the end, I messed things up and wasted a lot of time.
Just because you see errors in the logs doesn't necessarily mean that you need to take action.
The upgrade took about 45 minutes in the end for me. All I did at one point was refresh my browser window. It can take a while.
Just my opinion
On Win 10: Stopping with the service command from the command line reported failure to stop the service, but I was able to stop it from services.msc (running as administrator). The updates were applied. Sorry, no definitive answer from me. YMMV.
I used TCPView and killed the processes that were using port 8080. They were all java.exe instances from Jenkins. I killed all of them and restarted the Jenkins service.
Try restarting it from the Windows Services console; it will work.
I observed the same issue after installing a plugin and opting to restart Jenkins when no jobs are running.
When I looked at the jenkins server process, it was running fine and no issues.
On restarting the jenkins service using the below command and reloading the browser, Jenkins was up.
sudo service jenkins restart
If Jenkins is taking an unusually long time to restart, the best recourse is to check the generated logs to see what may be wrong. However, even that may be of little help, because many plugins try to be "quiet" by default, even if they are furiously working to load content. So if all else fails, you may have to resort to manually disabling plugins.
However, here is a free tip: some plugins are known to be messy. For example, we observed the Job Config History plugin writing hundreds of thousands of records for both job configuration changes AND agent changes. Removing this plugin and deleting the configHistory folder fixed one problem where our startup literally took more than 4 hours.
In our case, the problem was we were launching ephemeral agents (via docker and/or kubernetes). Each new "agent" was treated as a configuration change. With thousands of agents per day, it didn't take long to fill up a substantial part of the disk with history that never was effectively cleared.
There are other plugins that leak data in this way. And you can also create self-inflicted wounds, e.g. by using a standalone process to remove "obsolete" files. An example where we were "bitten" is a process that tried to discard old build records, but did an incomplete job - and was "warring" with the running Jenkins process. Jenkins will try breaking its neck to load a build.xml record that is empty or incomplete.
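One way to look for the empty build records mentioned above is a simple `find`. This is a minimal sketch with a hypothetical function name, `find_empty_build_records`; it only detects zero-byte build.xml files, not records that are truncated but non-empty.

```shell
#!/usr/bin/env bash
# find_empty_build_records JOBS_DIR
# Prints the path of every zero-byte build.xml under JOBS_DIR.
find_empty_build_records() {
  # "-size 0" matches only files that occupy zero blocks, i.e. empty files.
  find "$1" -type f -name build.xml -size 0
}

# Example:
# find_empty_build_records "$JENKINS_HOME/jobs"
```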
Three more tips:
You can install the monitoring plugin. Often when the jenkins UI proper didn't start, we were able to see the /monitoring in action.
Likewise, /userContent can often be loaded even when the rest of the UI is not fully up.
Don't rule out bad actors. It just takes one aggressive script that tries, e.g. to load the entire build history and ship it back via a REST call to effectively deny service to all other UI users.
I fixed it by editing a file named hudson.model.UpdateCenter.xml, located in /var/lib/jenkins.
I changed the URL to https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
Finally, I restarted Jenkins. That solved my problem.
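The edit to the update center file can also be scripted. The sketch below makes several assumptions: the file keeps each `<url>...</url>` element on a single line, the new URL contains no `|`, `&`, or backslash characters, and `set_update_center_url` is a hypothetical helper name. It keeps a `.orig` backup next to the file so the change is easy to revert.

```shell
#!/usr/bin/env bash
# set_update_center_url FILE NEW_URL
# Replaces the contents of every single-line <url>...</url> element in FILE.
set_update_center_url() {
  local file=$1 url=$2
  # Keep a backup copy; -p preserves mode and timestamps.
  cp -p "$file" "$file.orig" || return 1
  # Read from the backup and overwrite the original in place, so the
  # original file keeps its owner and permissions.
  sed "s|<url>[^<]*</url>|<url>${url}</url>|" "$file.orig" > "$file"
}

# Example (path and URL taken from the answer above):
# set_update_center_url /var/lib/jenkins/hudson.model.UpdateCenter.xml \
#   https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
```

Jenkins still needs to be restarted afterwards for the new URL to take effect, as described above.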
