Docker publishLocal sbt task stuck

The publishLocal task in sbt sometimes gets stuck on the docker*Version tasks, until the PC is rebooted. Rebooting the PC seems to help, but sometimes needs to be done more than once. What is causing this, and how can we fix it?
sbt subproject/Docker/publishLocal
[...]
[info] Wrote C:\full\path\to\project\subproject\target\scala-2.13\subproject_2.13-0.0.0-SNAPSHOT.pom
| => subproject / dockerVersion 44s
| => subproject / dockerApiVersion 44s
The publishLocal task remains stuck on the docker*Version sub-tasks indefinitely.
Stopping and re-starting the publishLocal does not help. Rebooting the PC usually helps.
Running on Win10 / Win11 with latest patches, WSL 2 Ubuntu installed. DockerDesktop 20.10.22. Scala 2.13.9, sbt 1.7.1, "sbt-native-packager" % "1.8.0", Eclipse Adoptium Java 11.0.15.
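One way to narrow this down is to check whether the docker CLI itself hangs outside sbt. As far as I know, sbt-native-packager resolves dockerVersion and dockerApiVersion by calling docker version with a --format template, so treat the exact templates in the example comments as an assumption. The sketch below runs any command under a time limit and reports the outcome; probe is a hypothetical helper name, and it relies on GNU timeout (available in WSL 2 Ubuntu).

```shell
# probe SECONDS COMMAND [ARGS...]
# Runs COMMAND under GNU timeout and prints one of: ok, hung, failed.
# "probe" is a hypothetical helper, not part of sbt or Docker.
probe() {
  local limit="$1"
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
  case $? in
    0)   echo "ok" ;;
    124) echo "hung" ;;    # GNU timeout exits with 124 when the limit is hit
    *)   echo "failed" ;;
  esac
}

# Example (assumes the docker CLI is on PATH, e.g. via Docker Desktop's WSL integration):
#   probe 15 docker version --format '{{.Server.Version}}'
#   probe 15 docker version --format '{{.Server.APIVersion}}'
```

If the probe prints hung, the problem is most likely between the docker CLI and the daemon rather than in sbt, and restarting Docker Desktop may be enough instead of rebooting the PC.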

Related

NestJS cli very slow in Docker container on Windows with Visual Studio Code

The response from the nest cli command from NestJS (npm i -g @nestjs/cli) in a Docker Development container with Visual Studio Code on Windows 10 is suddenly very slow. At first it works fine but at some point, for instance after deleting a directory in the src folder, the nest command gets very slow.
Example:
node ➜ /workspaces/Servers/terminal-server (master ✗) $ time nest --help
[...]
real 0m44.576s
user 0m6.239s
sys 0m4.407s
Yarn is used as the package manager. NPM is used to install the nest cli globally (npm i -g @nestjs/cli):
Software | Version | Running in container | Running on W10 host
NPM | 8.1.2 | X |
NodeJS | v16.13.1 | X |
Yarn | 1.22.15 | X |
Typescript | 4.5.2 | X |
Nest | 8.1.6 | X |
Visual Studio Code | 1.63.2 | | X
Docker Desktop | 4.3.1 | | X
It looks like the line const localCommandLoader = local_binaries_1.loadLocalBinCommandLoader(); in /usr/local/share/npm-global/bin/nest is causing the delay.
Edit:
Compiling is also very slow. As you can see, it started at 8:57:20 and finished at 9:00:17. And this is compiling the default scaffolding.
[8:57:20 AM] Starting compilation in watch mode...
[8:59:43 AM] Found 0 errors. Watching for file changes.
[Nest] 5197 - 12/23/2021, 9:00:17 AM LOG [NestFactory] Starting Nest application...
[Nest] 5197 - 12/23/2021, 9:00:17 AM LOG [InstanceLoader] AppModule dependencies initialized +67ms
[Nest] 5197 - 12/23/2021, 9:00:17 AM LOG [RoutesResolver] AppController {/}: +42ms
[Nest] 5197 - 12/23/2021, 9:00:17 AM LOG [RouterExplorer] Mapped {/, GET} route +8ms
[Nest] 5197 - 12/23/2021, 9:00:17 AM LOG [NestApplication] Nest application successfully started +8ms
I did the same on WSL:
[10:03:48 AM] Starting compilation in watch mode...
[10:03:53 AM] Found 0 errors. Watching for file changes.
[Nest] 1998 - 12/23/2021, 10:03:54 AM LOG [NestFactory] Starting Nest application...
[Nest] 1998 - 12/23/2021, 10:03:54 AM LOG [InstanceLoader] AppModule dependencies initialized +62ms
[Nest] 1998 - 12/23/2021, 10:03:54 AM LOG [RoutesResolver] AppController {/}: +14ms
[Nest] 1998 - 12/23/2021, 10:03:54 AM LOG [RouterExplorer] Mapped {/, GET} route +6ms
[Nest] 1998 - 12/23/2021, 10:03:54 AM LOG [NestApplication] Nest application successfully started +9ms
For the Docker image I've selected the Node.js & TypeScript image. Would it be better to just use a plain image and install everything manually?
Or is there a way to get the response time of nest back to normal?
TL;DR If you insist on booting into Windows for development but want to use VSCode and dev containers, try running everything inside a Linux VM. I got a 96% reduction in the time taken for key dev tasks.
Summary
I get a 96% reduction in startup / recompile time for a dev-container NestJS TypeScript project on npm run start:dev when running VSCode and Docker in a Linux VM inside Windows 10, compared with running VSCode directly on Windows 10 with Docker and WSL2, i.e. 7s vs 2m 50s.
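The 96% figure follows from the startup timings in the comparison table (170s vs 7s). A quick check of the arithmetic, with pct_reduction as a hypothetical helper name:

```shell
# Percentage reduction from the slow timing to the fast timing, one decimal place.
# pct_reduction is a hypothetical helper name.
pct_reduction() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", (slow - fast) / slow * 100 }'
}

pct_reduction 170 7   # startup: 2m 50s vs 7s, prints 95.9
```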
Dev Containers
I think dev containers are great, and I love using them on GitHub Codespaces, but I found the experience on Win 10 to be painfully slow.
In a multi-developer team, dev containers can offer a consistent experience between developers and reduce the risk of different setups on different developers' machines. But that's worthless if it takes more than a few seconds to rebuild the app every time you save a file while running in 'watch' mode.
What slows NestJS / TypeScript in Dev Containers
After various investigations I'm convinced the speed penalty is in the dev container accessing the host's filesystem.
NestJS cli is frustratingly eager to load everything it can do, before it even parses the command line args, so that's a big hit if filesystem access is slow.
And then the TypeScript compilation is obviously heavily dependent on filesystem speed. So this is the other area that grinds to a halt. Even on a relatively small NestJS project with few additional external dependencies!
What's fastest
The fastest setup was running Linux in a VM on my Windows machine, with VSCode and Docker both inside that VM. Commands running inside the dev container (in Docker inside the Linux VM) can then access the code 'hosted' on the Linux VM very quickly.
Comparison Table
Activity | Codespaces, browser | Codespaces, Win10 remote | VSCode + Win Docker on Win 10 with WSL2 | VSCode + docker in Linux VM in Win 10
npx nest i | 1.16s | 1.16s | 14.2s | 1.12s
npm run start:dev startup | 10s | 8s | 170s | 7s
npm run start:dev update a file | 3s | 2s | 38s | 2s
rm -rf node_modules ; time npm install | 30s | 28s | 85s | 27s
Setups used:
Codespaces
    2core 4GB instance
Win 10
    AMD Ryzen 7 3700X 8 Core @ 4.16GHz
    64GB 2400MHz RAM
    1TB SSD
    Internet: 27Mb down / 5Mb up
Linux VM (inside Win 10 above)
    VMware 16.2.3
    Ubuntu 22.04
dev container is the recommended Node.js & Typescript container with VARIANT: "16-bullseye"
    with mariadb server running inside
To get an active-developer experience, I ran each command a few times until the timing settled down and recorded the most representative time.
Anything that doesn't use 'time' in the command was timed by hand and so has a +/- 1 sec error.
time npx nest i
    the time the command shows at the end
npm run start:dev startup
    timing is from hitting enter to the first log of LOG [NestFactory] Starting Nest application...
npm run start:dev update a file
    timing is from saving an update to a watched file, to the next log of LOG [NestFactory] Starting Nest application...
rm -rf node_modules ; time npm install
    just another filesystem dependent task to compare the environments
    the time the command shows at the end
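The "run it a few times until the timing settles" approach can be scripted. This is a minimal sketch rather than the method used for the table above: time_runs is a hypothetical helper, and it assumes GNU date (for the %N nanosecond format), which the Debian-based dev container provides.

```shell
# time_runs N COMMAND [ARGS...]
# Runs COMMAND N times and prints the wall-clock seconds of each run, one per line.
# Hypothetical helper; relies on GNU date for nanosecond timestamps.
time_runs() {
  local runs="$1"
  shift
  local i=0 start end
  while [ "$i" -lt "$runs" ]; do
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    awk -v ns="$((end - start))" 'BEGIN { printf "%.2f\n", ns / 1e9 }'
    i=$((i + 1))
  done
}

# Example: time_runs 5 npx nest i
```

Running the same call in each environment gives a column of timings per setup, which makes it easier to see when the numbers have settled.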

Jest hangs when running in docker with some CPU / RAM config

Some days ago I upgraded my codebase from jest 26 to 27.
Running tests in my local environment worked like a charm but, when I tried to run them on my CI machine, the tests "never" stop.
Actually, in CI, the process exited correctly when trying to run tests sequentially but not in parallel, eg with --runInBand --detectOpenHandles --forceExit.
I tried to build and run the same docker locally... and it worked.
By randomly changing some of Docker's CPU / RAM settings, I reproduced the same result locally: the process hangs.
Tests hanging:
Running top in docker:
As you can see, it's not a problem of CPU / RAM considering what top says.
Do you have any hints?
Do you need more information?
After some days debugging, I figured out that the problem was the amount of RAM and some memory leaks in the tests.
I advise you to use the node --expose-gc ./node_modules/.bin/jest --logHeapUsage command to dive into the problems you can have. https://jestjs.io/docs/cli#--logheapusage
This post helped me out as well: https://chanind.github.io/javascript/2019/10/12/jest-tests-memory-leak.html
TL;DR:
I fixed memory leaks by downgrading node from 16.13.2 to 16.10.0.
Details:
Based on Mattia Larentis's answer:
I had the same problem. The code and tests were written for a client app, using these packages:
node: 16.13.2
jest: 27.4.7
ts-jest: 27.1.3
typescript: 4.5.5
After debugging and analysing jest tests, I detected memory leaks too, using:
node --expose-gc ./node_modules/.bin/jest --logHeapUsage
Then I analysed node heap snapshots (for more info see Your Jest Tests are Leaking Memory).
I found a lot of strings, containing transpiled TS code.
I went to ts-jest & jest repos to look for issues about memory leaks.
I found these:
ts-jest: Module caching memory leak
jest: [Bug]: Memory consumption issues on Node JS 16.11.0+
node: vm Script memory leak in Node.js 14 / 16
Then I downgraded Node from 16.13.2 to 16.10.0 and checked for memory leaks again using
node --expose-gc ./node_modules/.bin/jest --logHeapUsage
and found that the problem was gone.
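To compare heap usage before and after a change such as the Node downgrade, the per-file heap figures can be pulled out of the Jest report. A small sketch, assuming the "(NN MB heap size)" suffix that --logHeapUsage appends to each result line; heap_sizes is a hypothetical helper name.

```shell
# heap_sizes: read Jest output on stdin and print the heap size in MB reported
# for each test file, in the order the files finished.
# Assumes the "(NN MB heap size)" suffix added by --logHeapUsage.
heap_sizes() {
  grep -oE '[0-9]+ MB heap size' | awk '{ print $1 }'
}

# Example (Jest writes its report to stderr, hence 2>&1):
#   node --expose-gc ./node_modules/.bin/jest --runInBand --logHeapUsage 2>&1 | heap_sizes
```

With --runInBand all files run in one process, so a column of numbers that keeps climbing from file to file points to a leak, while a flat or sawtooth pattern suggests memory is being reclaimed.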

docker codeception unit tests run time is too long on Mac

In our project we use Docker containers for local development and for running tests. vendor/bin/codecept run unit takes almost 7 minutes for devs who work on Mac, but only 17 seconds for devs on Linux. Any idea what could cause this? I believe the issue is in some Docker configuration on Mac.

Docker hanging requiring reboot

We are running docker 1.7.1, build 786b29d on RHEL 6.7. Recently we have had multiple times when the docker daemon locked up and we had to reboot the machine to get it back.
A typical scenario is that a container that has been running fine for weeks suddenly starts throwing errors. Sometimes we can restart the container and all is well. But other times all docker commands hang, restarting the daemon fails, and I see this in a ps:
4 Z root 4895 1 0 80 0 - 0 exit Aug23 ? 00:01:24 [docker]
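The Z in that ps row marks a zombie process. A quick way to list every process in a similar state is to filter on the state column; the sketch below also includes D (uninterruptible sleep), which is the state a task blocked on I/O is in. filter_stuck and stuck_processes are hypothetical helper names.

```shell
# filter_stuck: keep only rows whose state column (first field) starts with
# Z (zombie) or D (uninterruptible sleep, usually blocked on I/O).
filter_stuck() {
  awk '$1 ~ /^[ZD]/'
}

# stuck_processes: list such processes on the current machine as
# "state pid command". Each -o is passed separately because a field
# written with "=" (empty header) must end its argument.
stuck_processes() {
  ps -e -o stat= -o pid= -o comm= | filter_stuck
}
```

Neither state can be cleared with kill, which is consistent with the daemon restart failing and a reboot being needed.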
Looking in the system log I've seen this:
device-mapper: ioctl: unable to remove open device docker-253:6-1048578-317bb6ad40cded3fbfd752d95551861c2e4ef08dffc1186853fea0e85da6b12b
INFO: task docker:16676 blocked for more than 120 seconds.
Not tainted 2.6.32-573.12.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
docker D 000000000000000b 0 16676 1 0x00000080
ffff88035ef13ea8 0000000000000082 ffff88035ef13e70 ffff88035ef13e6c
ffff88035ef13e28 ffff88062fc29a00 0000376c85170937 ffff8800283759c0
0000000000000400 00000001039d40c7 ffff8803000445f8 ffff88035ef13fd8
Call Trace:
[] __mutex_lock_slowpath+0x96/0x210
[] ? wake_up_process+0x15/0x20
[] mutex_lock+0x2b/0x50
[] sync_filesystems+0x26/0x150
[] sys_sync+0x17/0x40
[] system_call_fastpath+0x16/0x1b
The latest docker version is 1.12.1 and we are on 1.7.1. Can or should I install a new version? 1.7.1 is the version yum installs. If I did want a new version how would I install that (sorry if that is a dumb question, I am not a sys admin).
Googling, I found this on a Red Hat site: "Red Hat does not recommend running any version of Docker on any RHEL 6 releases." We have been running docker on RHEL 6 for a few years, so this confuses me. Upgrading to RHEL 7 is not really an option for us right now.
Can anyone shed any light on these issues? We need docker to work reliably without having to reboot often.
Docker 1.7.1 is really old by today's standards. There have been hundreds of bugs fixed, enhancements to driver stacks, security patches, and valuable features added in the versions since. It looks like you're having an issue with your storage stack, and there is a good chance this is fixed in a newer version.
Docker has stated that default versions in package management systems like yum and apt can be way out of date, and that you should use their repo. The best way to do this is add their Yum repo information to your system so you can install it like other packages. The instructions are here: Installation on Red Hat Enterprise Linux.
Note: This will allow you to install Docker, and the service will be called docker, but the package is docker-engine. This has confused some people in the past.
yum install docker-engine
Docker has also provided a script that does this to make things easier (run as admin/root):
curl -fsSL https://get.docker.com/ | sh
Don't use a RHEL6 based system.
RHEL6 uses a 2.6 kernel with backported fixes to keep Docker working. Docker would normally require a 3.10+ kernel. Docker dropped support for RHEL6 from v1.8 on so it's unlikely there will be any more packages for it.
If you must use RHEL6, don't use the default loopback devicemapper for storage. Set up an LVM thin pool for Docker to use.

Jenkins won't start after plugin installation *and does not log anything*

I installed Jenkins' Gradle plugin and used the automatic restart option via the Jenkins web interface. Jenkins seemed to hang on the "restarting..." page, so I finally tried to manually restart the Jenkins service on the server (64-bit Debian 7) using service jenkins restart.
Now, Jenkins is no longer running at all (verified with ps -ef | grep -i [J]enkins and service jenkins status), and when I try service jenkins [re]start, I see an [ ok ] message but nothing else seems to happen. I've deleted /var/log/jenkins/jenkins.log, and each time I try a service start (or restart), the log file reappears, but it's blank (ls -lA shows that the file was recently made, but cat produces no output). I also tried rebooting the server, with no effect. I finally deleted the Gradle folders under /var/lib/jenkins/plugins, which also did not appear to make a difference.
How do I even begin to approach this problem? Should I just re-install Jenkins?
EDIT: System info:
> uname -a
Linux AUC-Workstation1 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64 GNU/Linux
According to dpkg -l, I'm using Debian's jenkins package, version 1.617.
EDIT 2: I'm actually using the jenkins package provided directly by Jenkins, as per the instructions here.
I just had a problem where multiple Jenkins plugins were breaking Jenkins startup (after an upgrade) and here is the procedure I followed to resolve the issue, which might work for other plugin startup issues.
I'm working on an Ubuntu server, but I expect that this would work for Debian if it's going to work at all - I encourage others to adjust the procedure:
logged into the server and switched to the jenkins user (sudo su jenkins in my case)
went to the main jenkins directory
renamed plugins to plugins.problems_YYYYMMDD
    previously, I attempted to disable the plugins, but this did not work for me (system still would not start)
created an empty directory plugins
restarted jenkins (sudo service jenkins restart)
    In my case, this started just fine
iteratively followed the following procedure to add plugins back in
    copied 1 or more plugins from plugins.problems_YYYYMMDD/ to plugins/
    restarted jenkins
    went to the plugin center and installed updates as available
        sometimes I needed to install updates in a particular order due to dependencies
    evaluated results in 'Manage Old Data'
        I think I'm facing some manual updates of the old data
Note: if you know which plugins are likely the problem, then it is easier to just disable or temporarily (re)move them rather than (re)moving all of the plugins!
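The procedure above can be sketched as two small shell helpers. Both names are hypothetical, the Jenkins home is passed in rather than assumed, and the restart itself is left as a manual step.

```shell
# quarantine_plugins JENKINS_HOME
# Moves JENKINS_HOME/plugins aside to plugins.problems_YYYYMMDD and creates an
# empty plugins directory in its place. Hypothetical helper; run it as the
# jenkins user while Jenkins is stopped.
quarantine_plugins() {
  local home="$1"
  local backup="$home/plugins.problems_$(date +%Y%m%d)"
  if [ -e "$backup" ]; then
    echo "refusing to overwrite $backup" >&2
    return 1
  fi
  mv "$home/plugins" "$backup" && mkdir "$home/plugins"
}

# restore_plugin JENKINS_HOME BACKUP_DIR PLUGIN_NAME
# Copies one plugin file or directory back so it can be checked after the next restart.
restore_plugin() {
  cp -r "$2/$3" "$1/plugins/"
}

# Example, using the home directory mentioned in the question:
#   quarantine_plugins /var/lib/jenkins
#   sudo service jenkins restart
```

Restoring plugins one at a time, with a restart after each, makes it clear which plugin stops Jenkins from starting.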
I never did figure out the initial problem, but I did get Jenkins working again, sort of.
I uninstalled Jenkins (using apt-get purge) and then re-installed it. This time it failed to start because it needed Java 7, but I apparently only had Java 6 installed (this surprised me, because I thought I had previously configured Jenkins to use Java 7 on that machine). So I installed openjdk-7-jdk and openjdk-7-jre, set JAVA and JAVA_HOME appropriately in the Jenkins config file, and started the service again. This allowed Jenkins to start.

Resources