Getting "Jenkins is going to shutdown" message during long jobs

Sometimes, when one of our longer builds is running (around 2 hours), Jenkins will start displaying the "Jenkins is going to shutdown ..." message. And no, this has not been done by an admin (me).
When I last saw this, I checked the console output of the running job, and it was still churning through its tests and running normally. It was not hung.
Then later, I checked again, and the console had the "BUILD SUCCESSFUL" message, followed by "Pausing (Preparing for shutdown)" - and it just sat there.
So I clicked the kill job button and got the "Aborted by ..." message.
Then 15 seconds later it displayed "Click here to forcibly terminate running steps". I did that. It then displayed "Terminating withAnt".
Then 15 seconds later, it displayed "Click here to forcibly kill entire build". Which I did, and Jenkins returned to normal operation and cleared the "going to shutdown" message.
WHAT IS GOING ON!
One related note: because too much state was bleeding through between our JUnit tests, we recently added the forkmode="perTest" setting to the Ant JUnit task. This has resulted in random tests failing with a "vm exited unexpectedly" message, and it hits different tests each time (which is a PITA, since we can no longer count on the Test Failed status in Jenkins meaning anything). And no, I'm not sure whether that has always happened when the Jenkins job has the termination problem.

Well, I think I figured this out.
The system was running low on disk space. So SOME jobs that used more were triggering this problem - and others would run without a problem.
When I finally received a low-disk-space error in one of the logs, I did some cleaning (and found a bunch of files that were supposed to have been deleted). Since then, this error has stopped occurring.
NOPE! Still happening
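Even though freeing disk space did not cure the problem for good, it is cheap to rule out before each long build. Here is a minimal sketch of such a check. The 5 GiB threshold and the checked path are assumptions for illustration, not values from the setup above.

```python
import shutil


def free_space_report(path=".", min_free_gib=5.0):
    """Return (free_gib, ok) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    free_gib = usage.free / (1024 ** 3)
    return free_gib, free_gib >= min_free_gib


if __name__ == "__main__":
    # Hypothetical threshold; pick one that matches what your largest job needs.
    free_gib, ok = free_space_report(".", min_free_gib=5.0)
    status = "OK" if ok else "LOW"
    print(f"{status}: {free_gib:.1f} GiB free")
```

Running this as the first build step (and failing the build when the result is LOW) would at least tell you whether disk space is involved the next time the message appears.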

MSIX sideloaded app is slow to start after update

I am using MSIX packaging to deploy .NET desktop applications. The app is built by Azure Pipelines and the installation package is deployed to a shared folder on a file server.
When I run the .appinstaller, the dialog opens and applies updates as it should. But then the dialog closes, and nothing happens for over 1 minute. Then the app starts.
TEST 1 - Normal user
Looking in the event log, there is first this warning:
App manifest validation warning: Declared namespace
http://schemas.microsoft.com/developer/appx/2015/build is
inapplicable, it will be ignored during manifest processing.
Then several messages like
error 0x5: Deleting file \\?\C:\Program
Files\WindowsApps\Deleted\8b7d5c25-92aa-4962-9e74-93b9685ce2ca-test_2021.1005.1225.1455_x64__002e9dkagpm7g28acfe13-edc2-4d9d-8a69-d5d9687e0573\MyApp\MyApp.exe
failed.
After 1 minute there is this warning:
Warning: There were 129 additional files that failed to be deleted
under the folder \\?\C:\Program Files\WindowsApps\Deleted.
It seems that the process tries, and retries, to delete the old files for over 1 minute, then gives up.
How can I allow MSIX to delete the files without giving it administrator rights?
TEST 2 - Administrator user
I did a second test, this time on a different machine, and logged in as an administrator.
The update dialog finished the update and closed after 12s.
Then nothing happened for 5 minutes(!)
I believe I clicked the Start button or something, then suddenly the app started.
Examining the log did not show any warnings about failed file deletions.
Only this warning:
App manifest validation warning: Declared namespace
http://schemas.microsoft.com/developer/appx/2015/build is
inapplicable, it will be ignored during manifest processing.
During the 5 minutes there were no log entries at all.
These were the last 2 log entries, made after 5 minutes:
14-10-2021 10:10:12
UpdateUsingAppInstallerOperation operation on a package with main
parameter
8b7d5c25-92aa-4962-9e74-93b9685ce2ca-test_2021.1013.1518.1578_x64__002e9dkagpm7g
and Options 0 and 0. See http://go.microsoft.com/fwlink/?LinkId=235160
for help diagnosing app deployment issues.
14-10-2021 10:10:13
The bundle streaming reader was created successfully for bundle
8b7d5c25-92aa-4962-9e74-93b9685ce2ca-test_2021.1013.1518.1578_neutral_~_002e9dkagpm7g.
Started deployment
Conclusion
Looking at Task Manager and ProcMon, I can see that the app starts right after the update dialog closes. However, the process is a Background Process, invisible to the user.
While googling, I came across these posts describing the same problem:
https://techcommunity.microsoft.com/t5/msix-deployment/app-does-not-launch-immediately-after-installation-but-after-a/m-p/1972161
https://techcommunity.microsoft.com/t5/msix-deployment/winforms-exe-in-msix-package-does-not-startup-after-auto-update/m-p/965978
You can't give it admin rights. MSIX installations always run per user.
This sounds like a machine-related problem. Can you reproduce the same behavior on other machines? (Try virtual machines if you don't have access to a separate physical machine.)
I never found a solution for this. My workaround is to turn off the update dialog by not including ShowPrompt="true".
Then the app seems to launch as it should even if there are updates.
However, there is a new problem - the first time the user starts the app after an update has been released, the auto-update does not happen. It only gets applied the second time the app starts. This is by design apparently...
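For anyone who wants to apply the workaround in a build step, here is a rough sketch that strips the ShowPrompt attribute from the OnLaunch element of an .appinstaller file. The sample XML, its namespace and the Uri are assumptions made up for illustration; check them against your own file.

```python
import xml.etree.ElementTree as ET

# Hypothetical .appinstaller content, for illustration only.
SAMPLE = """<AppInstaller
    xmlns="http://schemas.microsoft.com/appx/appinstaller/2018"
    Uri="file://server/share/MyApp.appinstaller" Version="1.0.0.0">
  <UpdateSettings>
    <OnLaunch HoursBetweenUpdateChecks="0" ShowPrompt="true" />
  </UpdateSettings>
</AppInstaller>"""


def drop_show_prompt(xml_text):
    """Remove ShowPrompt from every OnLaunch element.

    Returns (root_element, number_of_attributes_removed).
    """
    root = ET.fromstring(xml_text)
    removed = 0
    for elem in root.iter():
        # Strip the "{namespace}" prefix so the check works for any schema year.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "OnLaunch" and "ShowPrompt" in elem.attrib:
            del elem.attrib["ShowPrompt"]
            removed += 1
    return root, removed


if __name__ == "__main__":
    _, count = drop_show_prompt(SAMPLE)
    print(f"removed ShowPrompt from {count} element(s)")
```

Note that ElementTree may rewrite namespace prefixes when it serializes the tree, so review the output before publishing it to the share.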

Jenkins script not running - stalled/stuck?

I am trying to run a python script on Jenkins - I already tried it in jupyter and it works fine there.
When I try running it in Jenkins, I unfortunately don't get any error message; I just get an endlessly rotating wheel.
The console output circle at the top is gray, and I get the following circling wheel:
Since I don't see any error message, I am not really sure where to start. I don't think it is a matter of being more patient: it takes ~5 minutes to run it all in Jupyter, but I left it running overnight in Jenkins and the script still hadn't finished the next day. It isn't an authorization error either, since I've had those before and I got an error message telling me what to fix.
In case this is relevant - the progress bar seems to be almost done, but in red:
Any ideas on what could be going wrong?
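One way to narrow this down is to run the script through a small wrapper that enforces a time limit, so a hang turns into a visible failure instead of an endless spinner. This is only a diagnostic sketch: the script name and the 15 minute limit are assumptions, and `-u` just asks Python for unbuffered output so that anything printed before the hang reaches the console.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, output); exit_code is None on timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # Output captured before the timeout may arrive as bytes or str.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial


if __name__ == "__main__":
    # "my_script.py" is a hypothetical name; 900 s is an arbitrary limit.
    code, output = run_with_timeout([sys.executable, "-u", "my_script.py"], 900)
    print(output)
    sys.exit(1 if code is None else code)
```

If the wrapper reports a timeout, the last lines of the captured output show roughly where the script stopped making progress, for example a prompt waiting for input that never arrives in a non-interactive job.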

instruments[34247:1345307] Attempting to set event horizon when core is not engaged, request ignored

I get this error during ui automation and am unable to resolve it.
This stops my automation flow ...
instruments[34247:1345307] Attempting to set event horizon when core
is not engaged, request ignored
In my experience, this message is related to the startup or shutdown of the individual processes that enable UIAutomation; nothing in your JavaScript code or environment really has an effect on it. It's just a sporadic error that comes from somewhere in Apple's software.
Sometimes it happens at the beginning of the run (in which case your javascript code will never be executed), or at the end of the run (in which case your code has already run). If you are seeing this error at the end of a test run and your code did not fully execute, then the real error is probably happening sometime before this -- you are just seeing Apple's error as well on the test shutdown.

Jenkins quits at midnight

I am running Jenkins from cmd, not as a service, because I need to do GUI testing. It works fine when I start it up, and I can do everything I want. But I schedule a task for around 4am, and when I come back, Jenkins hasn't lasted until 4am. From the console, it seems to just quit at 12am.
At first I thought it was a computer environment problem. But it still happens after I set my computer to never sleep, and I set the hard drive to never sleep as well. I locked my computer around 7pm, and Jenkins seems to keep running until 12am.
Any idea on what is happening?
Look at the Jenkins logs. They're either in the directory where jenkins.war was or in a subdirectory; look for jenkins*out* and jenkins*err* files with the right timestamp. I can't check the exact location and names right now, sorry.
Seems you were running jenkins from C:\, congrats for cluttering your C drive root ;). To help clean it up, copy jenkins.war to C:\jenkins\ or something and run it again to see what all it creates under there, so you know what to clean up.
Also, running it from C drive root might have somehow interfered with some Windows maintenance task or something, which caused it to abort.
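If it helps, the log hunt can be sketched as a small script that lists files matching those two name patterns, newest first. The patterns come from the suggestion above and may not match your installation exactly; the directory is whatever folder you started Jenkins from.

```python
import glob
import os
import time


def recent_jenkins_logs(directory):
    """Return files matching jenkins*out* / jenkins*err*, newest first."""
    patterns = ("jenkins*out*", "jenkins*err*")
    paths = set()
    for pattern in patterns:
        # "**" with recursive=True also searches subdirectories.
        found = glob.glob(os.path.join(directory, "**", pattern), recursive=True)
        paths.update(found)
    files = [p for p in paths if os.path.isfile(p)]
    return sorted(files, key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    # "." is a placeholder; point this at the folder Jenkins was started from.
    for path in recent_jenkins_logs("."):
        stamp = time.strftime("%Y-%m-%d %H:%M", time.localtime(os.path.getmtime(path)))
        print(stamp, path)
```

A file whose last-modified time is right around midnight is the one to read first.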

Gerrit - Application Error - Intraline difference not available due to server error

For one of our gerrit projects, while navigating the file differences we get this error:
Application Error
Intraline difference not available due to server error
[Continue]
It doesn't happen for all projects; so far we've seen the error on only one project.
I searched Google and the Gerrit documentation. I found a reference in the source code, but I don't know what causes the error or how it can be resolved.
The web page with the error contains a "Continue" button. Once clicked it will take you to the file you selected, but the error is annoying.
Do you know how to fix this?
This happens while Gerrit computes and caches the intraline difference of a file between two commits. The default timeout is 5 seconds. If the file is huge and the computation takes longer than the timeout, the worker thread is terminated, an error message is shown, and no intraline difference is displayed for the file pair.
The following steps should fix it.
Add this to etc/gerrit.config:
[cache "diff_intraline"]
timeout = 15000 ms # or any other duration you prefer
Restart the Gerrit service.
Run the SSH command "gerrit flush-caches" as a user with the Flush Caches global capability:
ssh -p port userxxx@host gerrit flush-caches
After that it should work.
Cause of the error:
It is a result of Gerrit taking too long to diff the file, and marking the diff in one of its caches as non-available.
The relevant error log is here:
[2012-06-08 11:14:08,547] WARN com.google.gerrit.server.patch.IntraLineLoader : 5000 ms timeout reached for IntraLineDiff in project xxxxxxx on commit 354dd67ad54578cf801d8cda64a4ae8484ebb0b7 for path xxxxxxx.java comparing bf9fbc21520af7bfd0841c8b9f955ca6e215b059..f6b9c7992c12cfdca253acd033966f98f70f3543. Killing IntraLineDiff-6
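To find out which files are hitting the timeout, the warning lines can be picked out of the error log with a small parser. This is a sketch based only on the single log line shown above, so the pattern is an assumption and may need adjusting for other Gerrit versions (it also assumes paths without spaces).

```python
import re

# Pattern derived from the one sample warning above; not an official format.
TIMEOUT_WARNING = re.compile(
    r"(?P<timeout_ms>\d+) ms timeout reached for IntraLineDiff "
    r"in project (?P<project>\S+) on commit (?P<commit>[0-9a-f]+) "
    r"for path (?P<path>\S+)"
)


def parse_timeout_warning(line):
    """Return a dict with timeout_ms, project, commit and path, or None."""
    match = TIMEOUT_WARNING.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["timeout_ms"] = int(info["timeout_ms"])
    return info


if __name__ == "__main__":
    sample = (
        "[2012-06-08 11:14:08,547] WARN "
        "com.google.gerrit.server.patch.IntraLineLoader : 5000 ms timeout "
        "reached for IntraLineDiff in project xxxxxxx on commit "
        "354dd67ad54578cf801d8cda64a4ae8484ebb0b7 for path xxxxxxx.java "
        "comparing bf9fbc21520af7bfd0841c8b9f955ca6e215b059.."
        "f6b9c7992c12cfdca253acd033966f98f70f3543. Killing IntraLineDiff-6"
    )
    print(parse_timeout_warning(sample))
```

Counting the parsed paths over a whole log shows whether a handful of huge files are responsible, which helps when choosing a new timeout value.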
