I've created a job (weather forecasting) that puts a heavy load on the machine, mostly CPU and memory, for a long(er) time. When I run the job from the CLI I can still use my browser without stuttering, but when I run the same job from cron, there are stutters all over the place.
I think this is related to the way the kernel's CFS scheduler groups processes (by tty). See e.g. here for documentation.
That link does give some pointers on how to fix it, possibly, but I was wondering whether anyone has already done this and what the results were.
Linux xyz 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
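A minimal sketch for checking whether autogrouping is active at all, assuming a Linux kernel built with CONFIG_SCHED_AUTOGROUP. The function name `autogroup_status` is made up for illustration. As far as sched(7) describes it, an autogroup is created per session rather than strictly per tty, and the nice value of a group can be changed by writing to `/proc/<pid>/autogroup`. Whether doing so cures the stutter for a cron job is an assumption to test, not a confirmed fix.

```shell
#!/bin/sh
# Report whether CFS autogrouping is enabled on this kernel.
# (autogroup_status is a hypothetical helper name, not a system command.)
autogroup_status() {
    f=/proc/sys/kernel/sched_autogroup_enabled
    if [ ! -r "$f" ]; then
        echo "unavailable"   # not Linux, or kernel built without autogroup support
    elif [ "$(cat "$f")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

autogroup_status

# Show the autogroup (and its nice value) of the current shell, if supported.
if [ -r "/proc/$$/autogroup" ]; then
    cat "/proc/$$/autogroup"
fi

# Assumption to try at the top of the cron job's script: lower the priority
# of the job's own autogroup, as sched(7) shows.
#   echo 10 > /proc/self/autogroup
```

Running the same check from the interactive shell and from the cron job would show whether the two end up in different autogroups.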
The Jenkins instance that we use for ETL automation stopped, and we restarted the service from the command prompt. I want to investigate what caused it to stop, but the Jenkins system log file only shows today's entries. How can I see the logs from previous days? Please help.
If you are using a Linux machine, the logs will be in /var/log/jenkins/jenkins.log unless you have set a custom location. If logrotate is configured, older logs will have been archived, and you may need to unzip them to see previous entries.
Take a look at this documentation for more info
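As a rough sketch of the "unzip and check" step, the function below searches the current log and any rotated copies in one pass. It assumes the rotated files sit next to jenkins.log and share its name prefix (for example jenkins.log.1.gz), which depends on the logrotate configuration. `search_jenkins_logs` is a made-up helper name.

```shell
#!/bin/sh
# Search the current and rotated Jenkins logs for a pattern.
# Usage: search_jenkins_logs PATTERN [LOG_DIR]
# (hypothetical helper; LOG_DIR defaults to the location named above)
search_jenkins_logs() {
    pattern=$1
    log_dir=${2:-/var/log/jenkins}
    for f in "$log_dir"/jenkins.log*; do
        [ -f "$f" ] || continue
        # gzip -cdf decompresses .gz files and passes plain files through unchanged
        gzip -cdf "$f" | grep -e "$pattern" | awk -v f="$f" '{ print f ": " $0 }'
    done
}
```

For example, `search_jenkins_logs SEVERE` would print every matching line prefixed with the file it came from.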
I've been running batch jobs with DataflowRunner for over a week without a problem, but starting today the jobs suddenly fail with the error message below. The workers don't seem to start and there are no logs in Stackdriver at all.
Anything I'm missing here?
Dataflow SDK version: 2.0.0
Submitted job: 2017-08-29_09_43_20-9537473353894635176
2017-08-29 16:44:24 ERROR MonitoringUtil$LoggingHandler:101 - 2017-08-29T16:44:22.277Z: (54a5da9d57fd266d): Workflow failed.
EDIT:
If I remove --zone=europe-west2-b from the batch run, it works, which suggests that something might be wrong with this zone.
I took a look at your job. It failed because it couldn't get quota to bring up the workers. Most likely you do not have quota in that zone. This error is not reported back correctly, but that should be fixed in the next release.
I am using MS Release Management with agent-based release templates for my releases.
The releases fail with the following error message, which occurs randomly for different actions in the deployment sequence.
"Communication with the deployer was lost during the deployment. Please make sure (1) the deployer machine has not rebooted during installation and (2) the component timeout is sufficient to copy the files from the drop location to the deployer machine and install the package."
Please note:
1) Deployer machine was not rebooted during the deployment
2) I have created custom components for all actions and set each component's timeout to 60 minutes
When I restart the release, it succeeds without any error. What could be the cause of this error, and what is the solution?
Experts please share your views on this.
I think you must have an encrypted field in your deployment component. Just re-enter the value for that encrypted field.
You could refer to this: https://lajak.wordpress.com/2015/12/14/release-management-communication-with-the-deployer-was-lost-during-the-deployment-the-parameter-is-incorrect/
I updated some plugins and restarted Jenkins, but now it says:
Please wait while Jenkins is restarting
Your browser will reload automatically when Jenkins is ready.
It is taking too long (I have been waiting for the last 40 minutes). I have only 1 project with around 20 builds. I have restarted Jenkins many times before and it worked fine, but now it is stuck.
Is there any way to kill or suspend Jenkins to avoid this wait?
I had a very similar issue when using Jenkins' built-in restart function. To fix it I killed the service (with crossed fingers), but somehow it kept serving the "Please wait" page. I guess it is served by a separate thread, but since I could not see any running java or jenkins processes, I rebooted the server to stop it.
After the reboot Jenkins worked, but it was not updated. To make it work I ran the update again and restarted the Jenkins service manually. It took less than a minute and worked just fine...
Jenkins seems to have a number of bugs related to restarting, at least one of them unresolved: Jenkins issue
Windows ONLY....
None of the solutions here worked for me, and restarting the server was not an option. If you are in the same situation:
I had to kill java.exe and restart the Jenkins service. After I did this, Jenkins reloaded several times and then went back to normal.
I was stuck on the Jenkins restarting page for 10-ish minutes until I did this.
Hope this helps.
Running this in the command line helped me:
service jenkins restart
I had a similar issue when updating plugins from the plugin update page with the restart Jenkins option checked. Jenkins only showed the waiting message for a long time.
I solved the issue by restoring the .bak files of the plugins that I had tried to update to .jpi files.
I did the following on my Jenkins server:
cd $JENKINS_HOME/plugins/
sudo mv git.bak git.jpi
.
. (more plugin files)
.
sudo mv ldap.bak ldap.jpi
sudo /sbin/service jenkins restart
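When many plugins were updated at once, the renames above can be done in a loop. This is only a sketch: `restore_plugin_backups` is a made-up helper name, and it assumes that every .bak file in the plugins directory is a backup you actually want restored over the newer .jpi file.

```shell
#!/bin/sh
# Restore every plugin backup (*.bak) to its active name (*.jpi).
# Usage: restore_plugin_backups PLUGINS_DIR
# (hypothetical helper; overwrites an existing .jpi of the same name)
restore_plugin_backups() {
    plugins_dir=$1
    for bak in "$plugins_dir"/*.bak; do
        [ -f "$bak" ] || continue          # no backups present
        jpi="${bak%.bak}.jpi"              # git.bak -> git.jpi
        mv -f -- "$bak" "$jpi" && echo "restored $jpi"
    done
}
```

It would be called as `restore_plugin_backups "$JENKINS_HOME/plugins"` (with sudo if the directory requires it, as in the commands above), followed by the same service restart.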
Check Event Viewer.
I found that my Java process had crashed.
Faulting application java.exe, version 7.0.250.17, time stamp 0x51c4b3fd, faulting module ntdll.dll, version 6.0.6002.18541, time stamp 0x4ec3e39f, exception code 0xc0000374, fault offset 0x000abc4f, process id 0x1188, application start time 0x01cee4f42968bc81.
Finally I found that it's a problem with Jenkins 1.540. Don't use it.
https://issues.jenkins-ci.org/browse/JENKINS-20630
I faced the same issue after upgrading some plugins on Windows. jenkins.err.log showed this error:
Exception in thread "main" java.io.IOException: Jenkins has failed to create a temporary file in C:\Users\builder\AppData\Local\Temp\
at Main.extractFromJar(Main.java:350)
at Main._main(Main.java:194)
at Main.main(Main.java:91)
Caused by: java.io.IOException: There is not enough space on the disk
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(Unknown Source)
at Main.extractFromJar(Main.java:347)
... 2 more
The problem was that the TEMP folder of the jenkins user had lots of temporary files. After cleaning that folder jenkins restarted correctly.
I just performed a restart of the server. That fixed the issue!
In the command prompt, execute this:
C:\>service jenkins restart
Or
open the list of services currently running on your machine (Win + R), search for Jenkins, and click Restart.
For me, the cause seemed to be having lots of old job build logs hanging around. To clean them up, I ran:
cd $JENKINS_HOME/jobs
find . -type d -name builds -print0 | xargs -0 -n 1 bash -c 'rm -rf "$0"/[1-9]*'
Then I stopped and started Jenkins again, and it came up within a minute.
Credit to: https://stackoverflow.com/a/39230597/2255242
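Before deleting anything, it may help to see which jobs actually hold the most build records. The sketch below only counts and deletes nothing. `count_builds` is a made-up helper name, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
#!/bin/sh
# Print "COUNT PATH" for every builds directory under JOBS_DIR, largest first.
# Only entries whose names start with 1-9 are counted, matching the rm pattern above.
# (hypothetical helper; read-only)
count_builds() {
    jobs_dir=$1
    find "$jobs_dir" -type d -name builds | while IFS= read -r dir; do
        n=$(find "$dir" -mindepth 1 -maxdepth 1 -name '[1-9]*' | wc -l | tr -d ' ')
        echo "$n $dir"
    done | sort -rn
}
```

For example, `count_builds "$JENKINS_HOME/jobs" | head` lists the ten jobs with the most build entries.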
This is an old thread, but my personal recommendation is to WAIT before attempting to do anything (such as restarting the service, etc).
I wasted hours once trying to fix something that turned out to be not an issue in the first place. In the end, I messed things up and wasted a lot of time.
Just because you see errors in the logs doesn't necessarily mean that you need to take action.
The upgrade took about 45 minutes in the end for me. All I did at one point was refresh my browser window. It can take a while.
Just my opinion
On Win 10: Stopping with the service command from the command line reported failure to stop the service, but I was able to stop it from services.msc (running as administrator). The updates were applied. Sorry, no definitive answer from me. YMMV.
I used TCPView and killed the processes that were using port 8080. Basically they were all java.exe processes from Jenkins. I killed all of them and restarted the Jenkins service.
Try restarting it from the Windows Services console; it will work.
I observed the same issue after installing a plugin and opting to restart Jenkins when no jobs are running.
When I looked at the Jenkins server process, it was running fine with no issues.
After restarting the Jenkins service with the command below and reloading the browser, Jenkins was up.
sudo service jenkins restart
If Jenkins is taking an unusually long time to restart, the best recourse is to check the generated logs to see what may be wrong. However, even that may be of little help because many plugins try to be "quiet" by default, even if they are furiously working to load content. So if all else fails, you may have to resort to manually disabling plugins.
However, here is a free tip: some plugins are known to be messy. For example, we observed the Job Config History plugin writing hundreds of thousands of records for both job configuration changes AND agent changes. Removing this plugin and deleting the configHistory folder fixed one problem where our startup literally took > 4 hours.
In our case, the problem was we were launching ephemeral agents (via docker and/or kubernetes). Each new "agent" was treated as a configuration change. With thousands of agents per day, it didn't take long to fill up a substantial part of the disk with history that never was effectively cleared.
There are other plugins that leak data in this way. And you can also create self-inflicted wounds, e.g. by using a standalone process to remove "obsolete" files. An example where we were "bitten" is a process that tried to discard old build records, but did an incomplete job - and was "warring" with the running Jenkins process. Jenkins will try breaking its neck to load a build.xml record that is empty or incomplete.
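A quick way to look for the empty build.xml records mentioned above is sketched below. It only catches zero-byte files; a truncated but non-empty record would need an XML well-formedness check instead. `find_empty_build_records` is a made-up helper name.

```shell
#!/bin/sh
# List zero-byte build.xml files under a Jenkins jobs directory.
# Usage: find_empty_build_records JOBS_DIR
# (hypothetical helper; read-only, does not detect truncated non-empty files)
find_empty_build_records() {
    find "$1" -type f -name build.xml -size 0
}
```

For example, `find_empty_build_records "$JENKINS_HOME/jobs"` prints one path per empty record, which can then be inspected before deciding whether to remove the affected build directory.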
Three more tips:
You can install the monitoring plugin. Often when the jenkins UI proper didn't start, we were able to see the /monitoring in action.
Likewise, /userContent can often be loaded even when the rest of the UI is not fully up.
Don't rule out bad actors. It just takes one aggressive script that tries, e.g. to load the entire build history and ship it back via a REST call to effectively deny service to all other UI users.
I fixed it by editing the file hudson.model.UpdateCenter.xml, located in /var/lib/jenkins.
I changed the URL to https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
Finally, I restarted Jenkins. That solved my problem.
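The edit can also be scripted. The sketch below assumes the file stores the update site as a single `<url>...</url>` element and that the new URL contains no `|` or `&` characters (both are special to the sed expression used here). `set_update_center_url` is a made-up helper name. The original file is kept next to it with an .orig suffix.

```shell
#!/bin/sh
# Replace the <url> value in hudson.model.UpdateCenter.xml, keeping a backup.
# Usage: set_update_center_url FILE NEW_URL
# (hypothetical helper; assumes NEW_URL contains no '|' or '&')
set_update_center_url() {
    file=$1
    new_url=$2
    cp -p -- "$file" "$file.orig"          # keep the original for rollback
    sed "s|<url>[^<]*</url>|<url>$new_url</url>|" "$file.orig" > "$file"
}
```

It would be called as `set_update_center_url /var/lib/jenkins/hudson.model.UpdateCenter.xml <new URL>`, followed by a Jenkins restart as described above.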