I performed a manual clean by deleting job folders directly in the filesystem, and now I have a stuck running job that I cannot abort.
I've tried the answers here to force it to stop, but they don't work, since Jenkins can no longer find the job in the system.
Additionally, when I click on the running job I get a 404 error:
"Problem accessing <route_to_job_that_doesnt_exist_anymore>"
Reason: Not found
Is there something I can do to abort this running job without restarting the server?
A way to stop a build (as in actually aborting it) is to append /stop to the job URL, after the build number.
Example: http://myjenkins/project/123/stop
If this doesn't work, there is also the "hard kill": instead of adding /stop you add /kill. I believe you need admin access for that POST action.
I don't know, though, whether it works for jobs that no longer exist on the Jenkins host because their folders are missing from the filesystem.
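Both endpoints are plain POSTs, so they can also be hit from a script. A minimal sketch, assuming API-token authentication (the user name and token are placeholders; depending on your Jenkins version and security settings you may also need a CSRF crumb):

# safe abort, same as clicking the X in the UI
curl -X POST -u <user>:<api-token> http://myjenkins/project/123/stop
# hard kill, if the safe abort hangs (admin rights likely required)
curl -X POST -u <user>:<api-token> http://myjenkins/project/123/kill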
My Jenkins instance has stopped saving changes made under 'Manage Jenkins > System Configuration'.
In an attempt to solve it, I recently upgraded to Jenkins 2.346.3 (including all the plugins).
Unfortunately, the behavior persists, and the System Log only shows:
Error while serving http://<jenkins_url>/configSubmit
java.lang.ClassCastException: java.lang.Integer cannot be cast to hudson.model.Describable
at hudson.util.DescribableList.get(DescribableList.java:128)
at hudson.util.DescribableList.rebuild(DescribableList.java:170)
at jenkins.model.GlobalNodePropertiesConfiguration.configure(GlobalNodePropertiesConfiguration.java:24)
at jenkins.model.Jenkins.configureDescriptor(Jenkins.java:4017)
at jenkins.model.Jenkins.doConfigSubmit(Jenkins.java:3981)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:397)
Caused: java.lang.reflect.InvocationTargetException
at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:401)
at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:409)
at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:207)
<snippet>
Any idea on the possible cause?
UPDATE
After two attempts at restarting Jenkins without the config.xml, I succeeded in getting 'Manage Jenkins > System Configuration' to behave as expected.
After the first attempt, I reverted to the old configuration file because all the security-related configuration was missing, and I ended up raising the ticket https://issues.jenkins.io/browse/JENKINS-69548
On the second attempt, I did what I described in the ticket comment https://issues.jenkins.io/browse/JENKINS-69548?focusedCommentId=430091&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-430091 (i.e. re-introduced the security-related configuration).
You probably have a corrupted config.xml from your old installation. Try deleting config.xml (back it up first), which lives in $JENKINS_HOME (if you haven't changed the default JENKINS_HOME, in most cases that is USER_HOME/.jenkins, i.e. ~/.jenkins), and restarting Jenkins. If that succeeds, you can start reconfiguring, or move the configs over from your backup.
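A minimal sketch of that procedure on a Linux box, assuming a service-based install and the default JENKINS_HOME (adjust paths to your setup):

service jenkins stop
# keep a backup so you can diff against it or restore it later
cp ~/.jenkins/config.xml ~/config.xml.bak
rm ~/.jenkins/config.xml
service jenkins start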
I have Jest tests that run against a dockerized Neo4j database, and sometimes they fail on CircleCI. The error message for all 25+ of them is:
thrown: "Exceeded timeout of 5000 ms for a hook.
#*******api: Use jest.setTimeout(newTimeout) to increase the timeout value, if this is a long-running test."
Since they fail only sometimes, maybe once in 25 runs, I am wondering whether jest.setTimeout will solve the issue. I was able to make them fail locally by setting jest.setTimeout(10), but I am not sure how to debug this further, or whether something other than the small default timeout (5000 ms) could be the issue here. I would understand if 1 in 25 failed, or a few, or if the other suites failed too, but it is only ever this single file, with all the tests within it failing; it is always the same file, never any other file, for this reason.
Additional information: locally, that single file runs in under 1000 ms, connected to the staging database, which is huge compared to the dockerized one that has only a few files at the time of running.
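For what it's worth, I understand the timeout can also be raised globally from the command line instead of calling jest.setTimeout in each file. A sketch, assuming a Jest version recent enough to support the flag (30000 ms is an arbitrary example value):

npx jest --testTimeout=30000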
For anyone who sees this, I was able to solve this by adding the --maxWorkers=2 flag to the test command in my CircleCI config. See here for details: https://support.circleci.com/hc/en-us/articles/360005442714-Your-test-tools-are-smart-and-that-s-a-problem-Learn-about-when-optimization-goes-wrong-
Naman's answer is perfect! I couldn't believe it but it really solved my problem. Just to be extra clear on how to do it:
I changed the test script in my package.json from jest to jest --maxWorkers=2. Then I pushed, and it solved my error.
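If you'd rather try it before editing package.json, the same flag works as a one-off from the shell (npx here is just an assumption about how Jest is invoked in your project):

npx jest --maxWorkers=2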
(26.07.2016) I am using TFS 2015 Update 3 in a VM.
When I try to queue a build through the web interface or from Team Explorer, it fails with an error.
I then restart all TFS-related services in services.msc, and after some time it starts working again.
This happens far too often.
I have a custom pool running.
Is there a way to debug this behaviour?
Examining the Log files
Link to Worker log file
Link to Agent log file
The exception occurs in this order:
Checking if artifacts directory exists C:\workspaces\agent\_work\2\a
Deleting artifacts directory
System.ComponentModel.Win32Exception (0x80004005): The directory is not empty
at Microsoft.TeamFoundation.Common.FileSpec.DeleteDirectoryLongPath(String path, Boolean recursive, Boolean followJunctionPoints)
The weird thing is that queueing a new build works most of the time; this happens only sporadically.
It could be that I had a file from that folder open in Notepad with many tabs open. I will observe whether this issue persists and report back.
If this is happening sporadically, it might be that a long path exists here in artifacts:
C:\workspaces\agent\_work\2\a
Or, there was a cancelled build which left the artifact directory half cleaned which exposed a bug in cleaning.
The 2.x agent isn't subject to long-path issues (it's .NET Core) but only works with TFS 2017+:
https://github.com/Microsoft/vsts-agent
We can troubleshoot but it would be good to get to 2017+ (2018 QU3 is out) with a 2.x agent.
If that's not an option, message me and we can dig into what I think is a cancel / state bug.
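In the meantime, a manual workaround is to clear the half-cleaned artifacts directory from an elevated command prompt before re-queueing the build. A sketch, using the path from the log above (double-check it on your agent; C:\empty is a scratch folder I'm inventing for the long-path trick):

rd /s /q C:\workspaces\agent\_work\2\a
REM if rd chokes on long paths, mirroring an empty folder over the target
REM is a common way to force-delete the contents
mkdir C:\empty
robocopy C:\empty C:\workspaces\agent\_work\2\a /MIR
rd C:\workspaces\agent\_work\2\a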
My Jenkins setup is not overly complicated; there are just over 200 jobs. The problem I'm having is as follows:
The jobs folder is mounted on an NFS drive.
Some of the jobs create their log file fine, but then the file loses its permissions completely (they become 000), resulting in an error on the console about log file permissions.
I've checked and rechecked permissions on the folder and on all the jobs, but nothing there stands out that could explain the cause of the problem. It's not an issue on its own, but some of the jobs are quite important, and without manually fixing the permissions (shown below), they can't be debugged.
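For context, the manual fix I apply each time is just resetting the mode on the affected build log. A sketch, with placeholder job name and build number (the path assumes a standard $JENKINS_HOME/jobs layout):

chmod 644 $JENKINS_HOME/jobs/<job-name>/builds/<build-number>/log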
Any hints would be welcome.
I had the same issue; it's more than likely that your version control system (e.g. Perforce or SVN) sets log files to read-only when synced.
An easy workaround is to add an "Execute Windows batch command" build step where you cd into the directory containing your log file and clear its read-only attribute.
Use commands like the following (both values are placeholders for your own paths):
cd <directory-containing-log-file>
attrib -r <logfile>
This clears the read-only attribute on your log file and allows your jobs to write to it. I'm sure there are other ways of dealing with this issue, but this is a quick and easy one.
I have restarted Jenkins using the following:
service jenkins stop
service jenkins start
After that, I can see that some jobs are missing from the GUI.
I also tried going to the job URL directly: http://<jenkins_url>/job/<JOBNAME>/
Unfortunately, it also gives:
HTTP ERROR 404
Problem accessing /job/<JOBNAME>/. Reason:
Not Found
Powered by Jetty://
I also performed a "Reload Configuration from Disk", with no luck.
I checked the config.xml file and I can see it is corrupted; the file is around 110 MB. Why did this file get corrupted? How can I trace the cause?
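A quick check that confirms the file is malformed, assuming xmllint is available on the server and $JENKINS_HOME points at the Jenkins home directory:

ls -lh $JENKINS_HOME/config.xml
xmllint --noout $JENKINS_HOME/config.xml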
Can anyone give me any pointers on how to troubleshoot this problem?
I had the same symptoms, but I'm using a Homebrew installation of Jenkins.
The Jenkins machine was shut down improperly, likely from a power outage, so when it came back up it was basically a clean instance. No jobs and no system configurations.
The following solution isn't for your exact use case, but it does solve the problem for some users who return to Jenkins to find it without any jobs.
The solution basically involves checking whether you have started the Jenkins service incorrectly or from the wrong place.
...
On to the specific homebrew issue:
For whatever reason, the homebrew.mxcl.jenkins.plist file was found in ~/Library/LaunchDaemons/
It belongs in ~/Library/LaunchAgents/ only.
If this happens, it can be solved as follows:
Stop the service:
sudo launchctl unload ~/Library/LaunchDaemons/homebrew.mxcl.jenkins.plist
Reload the correct file, located in ~/Library/LaunchAgents/, by first trying the following line in case it's already loaded:
launchctl unload ~/Library/LaunchAgents/homebrew.mxcl.jenkins.plist
Note: the above line may yell at you if it's not running, which is ok.
Start it up again:
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.jenkins.plist
If all looks good when Jenkins loads again, you can and should delete homebrew.mxcl.jenkins.plist in ~/Library/LaunchDaemons/:
sudo rm ~/Library/LaunchDaemons/homebrew.mxcl.jenkins.plist
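To confirm the agent is now loaded from the right place, you can check what launchd knows about it (the grep pattern is just the label from the plist):

launchctl list | grep homebrew.mxcl.jenkins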