Regarding scikit-learn installation - Docker

Hello, and thank you for looking at this. I have been using Docker to create various multi-arch builds, and I have noticed some interesting behavior. When running buildx for image creation, scikit-learn 0.21.3 takes a very long time to download/install (about 2.5 hours). However, when using the regular docker build command for a single architecture, it only takes about 10 minutes. The reason I have to use this specific version of scikit-learn is an error from my application, which is unable to find the sklearn.utils.linear_assignment_ module.
I receive this error:
ModuleNotFoundError: No module named 'sklearn.utils.linear_assignment_'
0.21.3 is the only version I have been able to run so far. That said, I have found that more recent versions do install much faster, but they no longer include the linear assignment module, which is a dependency of my application. Any help or guidance would be greatly appreciated.
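For reference, a minimal Dockerfile sketch of this kind of pin (the base image and build packages are illustrative choices on my part, not prescribed anywhere):

    # Sketch only: pin a scikit-learn version that still ships
    # sklearn.utils.linear_assignment_ (the module was removed in 0.23).
    FROM python:3.7-slim

    # Compilers are needed whenever pip finds no prebuilt wheel for the
    # target architecture -- the usual case under buildx/QEMU emulation,
    # which is also why the multi-arch build spends hours compiling
    # C/Cython sources instead of minutes unpacking a wheel.
    RUN apt-get update && apt-get install -y --no-install-recommends \
            build-essential gfortran \
        && rm -rf /var/lib/apt/lists/*

    RUN pip install --no-cache-dir numpy scipy \
        && pip install --no-cache-dir scikit-learn==0.21.3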

Related

Is a Dockerized dev environment good for maintaining legacy software?

Let's say I have an old, unmaintained application that lives on a VPS (e.g. a Symfony 3 PHP app that relies on PHP 5).
If some changes are needed, I have to clone the app to my desktop, build it, make the change, and re-deploy. As time goes on, recreating the desktop dev environment gets harder - in this example I can't simply build the app, because the PHP 7 in my CLI breaks the build process.
I tried to dockerize the app, so I added Ubuntu 18 to my docker-compose file... and it doesn't work, as the latest Ubuntu with PHP 5 support is 14.04. 14.04 is also the oldest (official) version available on Docker Hub. But will it still be available in 3 years? If not, Docker won't be able to build the container.
So, my question is: is Docker the right tool to solve the described problem at all?
If so, should I back up the Docker images that my build relies on?
If not, besides proper maintenance, what tool is better?
You can install PHP 5 on newer Ubuntu versions, but it means adding an external repository.
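A rough sketch of that route (the widely used ondrej/php PPA and the exact package list are assumptions on my part):

    FROM ubuntu:18.04

    # Pull PHP 5.6 from an external repository, since Ubuntu 18.04's own
    # archives no longer carry PHP 5.
    RUN apt-get update \
        && apt-get install -y --no-install-recommends software-properties-common \
        && add-apt-repository -y ppa:ondrej/php \
        && apt-get update \
        && apt-get install -y --no-install-recommends \
            php5.6-cli php5.6-xml php5.6-mbstring \
        && rm -rf /var/lib/apt/lists/*

    CMD ["php", "-v"]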
You could also create your own Docker image containing only the libraries you want. If so, I'd advise trying Alpine as a base image. There is a bit of a learning curve to adapt, but once you do you'll have a small image tailored to your needs.
Given that containers let you isolate processes and configuration with a minimal footprint compared to a VM, I think it is the best option. Tailoring and maintaining your own image is not that expensive in terms of maintenance if you document it correctly, and it will allow you to always have a system that matches all your precise requirements.
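On the backup question: images your build depends on can be archived independently of Docker Hub, for example:

    # Keep a local tarball of a build-critical base image:
    docker pull ubuntu:14.04
    docker save ubuntu:14.04 | gzip > ubuntu-14.04.tar.gz

    # Restore it later, even if the tag ever disappears from Docker Hub:
    gunzip -c ubuntu-14.04.tar.gz | docker load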

node webpack hangs. How to debug?

I am trying to build ORO Platform JS assets. In a non-Docker environment it works like a charm, but in Docker (either during docker build or container execution) the build process stops and hangs at 100% CPU.
67% [0] building 1416/1470 modules 54 active ... ndles/orotask/sidebar_widgets/assigned_tasks/css/styles.scss
The build process does not necessarily hang on the exact same file, and the build even succeeds on some occasions.
I've tried to reduce the process to a minimum by removing Happy, and tested with --max-old-space-size=4096, but no luck.
Sources: https://github.com/oroinc/platform/tree/master/build
How would you recommend debugging this?
Thanks
There is a known issue where a Node.js process hangs when you run it as the root user. As far as I know, there is no workaround for now. Consider using another user to build the assets.
If that's not the case, please review the Troubleshooting section in OroAssetBundle; that might help.
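A minimal sketch of the non-root approach (the image tag and build commands are illustrative):

    FROM node:12

    WORKDIR /app
    COPY --chown=node:node . .

    # Build as the unprivileged "node" user that ships with the official
    # image, instead of the default root user.
    USER node
    RUN npm ci && npm run build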

What is the difference between running a quick start version of Hyperledger Iroha and building Iroha?

The documentation provided on the site https://iroha.readthedocs.io highlights two different sections, titled Building Iroha and Quick Start Guide (which runs an example test version of Hyperledger Iroha). If any experts here could explain the difference between these two to me, I would be thankful.
Thanks!
The Quick Start Guide provides instructions on how to run Iroha on Docker - it is the fastest and easiest way.
On the other hand, building Iroha from scratch is not really complicated either, because you just need to copy a few commands, and almost all dependencies are downloaded automatically by vcpkg or CMake.
About other differences:
When you use the Docker image: it is faster to set up, it is harder to make a mistake, and it is more likely that the Docker version is fully working.
When you build from scratch: you need to read more and find the dependencies yourself (they are listed for Debian-based Linux distributions, but for Manjaro you need to find them on your own). You also need to wait longer. And most importantly - you cannot be sure that your version will work, or even compile (if something has changed in the dependency libraries).
Personally, despite all those disadvantages, I prefer to build manually, because I prefer to compile on my system without an extra layer like Docker.
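For concreteness, the Docker route boils down to roughly the following (container names, ports and mounted paths are illustrative; follow the Quick Start Guide for the exact values):

    # Iroha keeps its state in PostgreSQL, so start that first:
    docker network create iroha-network
    docker run --name some-postgres -d --network=iroha-network \
        -e POSTGRES_PASSWORD=mysecretpassword postgres:9.5

    # Then run the prebuilt daemon with the example configuration mounted in:
    docker run --name iroha -d --network=iroha-network -p 50051:50051 \
        -v $(pwd)/iroha/example:/opt/iroha_data hyperledger/iroha:latest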

GCR push very slow

For the past few days, the gcloud docker -- push command has become terribly slow, sometimes taking up to 10 minutes to push a simple change (like a change to the default CMD in the Dockerfile).
I saw there was a post a while back (link), but sadly without any good explanation of why it's slow and/or how to resolve it.
As a side note, I'm building the images using Jenkins on Ubuntu 16.04 with gcloud version 197.0.0.
Does anyone else experience the same issues?
Sounds like you might be hitting docker/cli#840.
What version of docker are you using, and does updating to 18.03 fix your issue?
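To see what the build node is actually running, and to apply the suggested upgrade, something like this should do (the apt route assumes Docker's official repository is configured):

    # Report client and daemon versions on the Jenkins node:
    docker version --format 'client: {{.Client.Version}}  server: {{.Server.Version}}'

    # Upgrade the engine on Ubuntu 16.04:
    sudo apt-get update && sudo apt-get install --only-upgrade docker-ce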

Compile Tensorflow from source with Docker to get CPU speed up

I am looking for a way to set up or modify an existing Docker image for installing TensorFlow such that the SSE4, AVX, AVX2, and FMA instructions can be utilized for CPU speed-up. So far I have found how to install from source using bazel (How to Compile Tensorflow... and CPU instructions not compiled...). Neither of these explains how to do it within Docker. So I think what I am looking for is what to add to an existing Docker image (one that installs without these options) so that you get a version of TensorFlow compiled with the CPU options enabled. The existing Docker images do not do this because they want the image to run on as many machines as possible.

I am using Ubuntu 14.04 on a Linux PC. I am new to Docker, but I have installed TensorFlow natively and have it working without the CPU warnings I get when I use the Docker images. I may not need this for speed, but I have seen posts claiming the speed-up can be significant. I searched for existing Docker images that do this and could not find anything. I also need this to work with GPU, so it needs to be compatible with nvidia-docker.
I just found this Docker support for bazel, and it might provide an answer; however, I do not understand it well enough to know for sure. I believe it is saying that you cannot build TensorFlow with bazel inside a Dockerfile - you have to build a Docker image using bazel. Is my understanding correct, and is this the only way to get a Docker image with TensorFlow compiled from source? If so, I could still use help on how to do it while keeping the other dependencies I would get from an existing TensorFlow Docker image.
Dockerfiles that build with CPU support can be found here.
Hope that helps! I spent many a late night here on Stack Overflow and in GitHub issues. Now it's my turn to give back! :)
The GPU stuff in particular is really hairy - especially when enabling the XLA/JIT/AOT stuff as well as the Graph Transform Tool.
There are lots of hacks embedded in my Dockerfiles. Feel free to review them and ask me questions!
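For a sense of what such a Dockerfile contains, here is a stripped-down sketch (the TensorFlow and bazel versions and the flag set are illustrative and must match each other; this is not taken verbatim from the linked repository):

    FROM ubuntu:16.04

    RUN apt-get update && apt-get install -y --no-install-recommends \
            build-essential curl git unzip zip python3-dev python3-pip \
        && rm -rf /var/lib/apt/lists/* \
        && pip3 install numpy wheel

    # Install a bazel release compatible with the TensorFlow tag below
    # (version numbers illustrative):
    RUN curl -fsSLO https://github.com/bazelbuild/bazel/releases/download/0.10.0/bazel-0.10.0-installer-linux-x86_64.sh \
        && bash bazel-0.10.0-installer-linux-x86_64.sh \
        && rm bazel-0.10.0-installer-linux-x86_64.sh

    RUN git clone --depth 1 --branch v1.8.0 \
        https://github.com/tensorflow/tensorflow /tensorflow
    WORKDIR /tensorflow

    # Accept the configure defaults non-interactively, then compile with
    # the CPU features that the stock wheels leave out:
    ENV PYTHON_BIN_PATH=/usr/bin/python3
    RUN yes "" | ./configure \
        && bazel build -c opt \
            --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma \
            //tensorflow/tools/pip_package:build_pip_package \
        && bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip \
        && pip3 install /tmp/pip/tensorflow-*.whl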
The contributing guidelines mention building TensorFlow from source with Docker to run the unit tests:
Refer to the CPU-only developer Dockerfile and GPU developer Dockerfile for the required packages. Alternatively, use the said Docker images, e.g., tensorflow/tensorflow:nightly-devel and tensorflow/tensorflow:nightly-devel-gpu, for development to avoid installing the packages directly on your system.
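Using those images looks roughly like this (the mount path is illustrative):

    # Interactive shell in the CPU devel image, with a local TensorFlow
    # checkout mounted in:
    docker run -it -w /tensorflow -v $HOME/tensorflow:/tensorflow \
        tensorflow/tensorflow:nightly-devel bash

    # GPU variant, run through nvidia-docker:
    nvidia-docker run -it -w /tensorflow -v $HOME/tensorflow:/tensorflow \
        tensorflow/tensorflow:nightly-devel-gpu bash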
