How to develop a model on a CPU before migrating - docker

I am building a multivariate LSTM to model longitudinal data with pytorch.
I have installed Graphcore pytorch (3.1, which includes poplar and popart) and tools from docker. Rather than installing an IPU immediately, can I develop the model on the CPU to start with before adding or migrating to an IPU? When I issue any gc-* command it reports no IPU available, which I know is true!
I generally prefer to run in bare metal [Ubuntu 20.04 LTS AMD 1950X Threadripper] rather than via VMs. Do I need a Graphcore account to do this, so I can sign the licence agreement etc? I guess that is implied in the docker application.

Related

Cannot infer on Movidius (NCS2) using OpenVINO Workbench through Docker: Drivers setup failed?

I am trying to run some inferences using the OpenVINO Workbench Docker image https://hub.docker.com/r/openvino/workbench . Everything works well using my CPU as targeted device (Configuration -> Select Environment). But I get the following error when I select my Intel Movidius Myriad X VPU (a Neural Compute Stick 2):
"Cannot infer this model on Intel(R) Movidius(TM) Neural Compute Stick 2 (NCS 2). Possible causes: Drivers setup failed. Update the drivers or run inference on a CPU." (cf attached screenshot).
I did not change the start_workbench.sh script. Here are my execution params:
./start_workbench.sh -IMAGE_NAME openvino/workbench -TAG latest -ENABLE_MYRIAD -DETACHED -ASSETS_DIR /hdd-raid0/openvino_workbench
However, I can play with the NCS2 using the classification or cross check commands provided by https://hub.docker.com/r/openvino/ubuntu18_dev.
Any idea ?
Thxxxx!
This is how you can use a Docker* Image for Intel® Vision Accelerator Design with Intel® Movidius™ VPUs: https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_docker_linux.html
Kindly navigate to the specific topic. You will found that there are a few additional steps to be done before NCS2 can be used with Docker.

ML training process is not on GPU

I just moved from AWS to Google Cloud Platform because of its lower GPU price. I followed the instructions on the website creating a compute engine instance with a K80 GPU, installed the latest-versioned Tensorflow, keras, cuda driver and cnDNN, everything goes very well. However when I try to train my model, the training process is still on CPU.
NVIDIA-SMI 387.26 Driver Version: 387.26
Cuda compilation tools, release 9.1, V9.1.85
Tensorflow version -1.4.1
cudnn cudnn-9.1-linux-x64-v7
Ubuntu 16.04
Perhaps you installed the cpu version of tensorflow?
Now, Google Cloud's compute engine has a VM OS image with all the needed software installed for an easier/faster way to get started: https://cloud.google.com/deep-learning-vm/

How to run Vagrant with nvidia docker as provider

I'm part of a team developing a machine learning application.
currently we're using Vagrant with a Docker provider as a uniform dev environment.
We want to utilize the GPUs on our computers when we play around during development, and I found that Nvidia released nvidia-docker to enable that for a simple docker container.
How can I use nvidia-docker as a provider for Vagrant?
If it's not possible, is there any equivalent solution?
It is important for us to develop on top of the same docker image that we deploy since we depend on multiple interacting opensource libraries, and we want to manage them in one place
(no dependencies breaking when deploying)

Docker on embedded systems, why not?

There was a project thrown my way recently that involves the orchestration of several (Linux capable) embedded devices, deploying software to them, and allowing for the applications to be updated when the code base updates in a git repo.
The initial thought was to make a standard image for each device, and I set out, attempting to install docker on an UDOO Quad and an Intel Edison to start, but without any success up to this point.
My thinking is that it seems to be a good idea to install Docker on embedded devices--but if that's the case, surely it would have been ported by now. The only group out there that seems to be making these efforts is Resin.io.
Is there something I'm missing, or is there a clear reason why Docker doesn't make sense on embedded devices? If there isn't a reason, and it does make sense to run Docker on embedded systems, is there something I've overlooked out there: are there any sources of discussion on porting, or how-to's that cover this?
I have considered running docker on embedded devices (a mips system), but didn't go that way. There are some problems with it, in my humble view:
Docker is implemented in Golang. There is currently no available tool chain for mips to compile go. You will need to create the tool chain yourself using gcc-go.
The size of docker is larger than lxc. In a desktop computer this is not a problem, but the embedded device has limited flash storage.
Docker uses some quite up-to-date feature of linux kernel. Sometimes the kernel version on embedded devices are not so new and back-port is needed to make it work.
The docker image has to be built on the same architecture as the run time environment. It means that if you want to run a docker container on Raspberry Pi, the docker image has to be built on an ARM-architecture system. QEMU can be used to build docker image in the cloud, but it doesn't support all CPU architectures used in embedded system. (for example, it currently doesn't support MIPS)
In the end, lxc was chosen for the specific task of running a container on embedded device. It has limited features compared to docker, but currently it suits the requirement of the project.
As of year 2019, I would like to update this answer since I did port docker to embedded system with ARM cpu. With the price of flash usage, memory usage, by using docker you will have container management, image management, and many ready to run images from docker hub. So the decision is a balance between cost and features.
Here is an update for 2018:
You can work with Docker on embedded devices such as Raspberry Pi and Orange Pi quite easily now because of advancements in the development of Raspbian and Armbian operating system images. Specifically, both types of devices and their respective OS images now support kernels that are of sufficiently high enough versions to install Docker without any problems (at least version 3.10, though both now offer 4.x+ versions).
Your desire for faster rates of change can definitely be realized by using embedded Docker. I can say from experience that I have tested and regularly run the approach you describe. Basically, you start with a base operating system image such as Raspbian or Armbian, tweak that operating system enough that it's secure and has Docker installed, and then you use Docker for handling development iteration and application updates.
As an aside, if you are interested in running Docker on embedded Linux devices, then I recommend you check out a free, open-source, MIT-licensed command line tool I wrote to help developers work with embedded Docker on multiple devices at once: https://github.com/ForwardLoopLLC/floopcli .
Even if you are not interested in the tool itself, the documentation for the tool describes several patterns for working with Dockerized applications across multiple devices in multiple languages: https://docs.forward-loop.com/floopcli/master/index.html . The materials there should serve as a starting point for porting applications to Docker and then deploying them on embedded devices. The documentation also addresses some embedded device subtleties, such as differences between ARMv6 and ARMv7. Hopefully this helps you get started!
There is a great article on LinkedIn describing his experience with that
https://www.linkedin.com/pulse/whale-jar-when-running-docker-embedded-linux-good-thing-fletcher#pulse-comments-urn:li:article:7736487387895237975
Often embedded systems have a very slow rate of change. Docker works well on a minimum build then layering on top. If you want to sacrifice the overhead of running docker on a minimum embedded system for docker's ability to have a build system and steady rate of change then you could explorer it.

How to install Torch on windows 8.1?

Torch is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.
Q:
Is there a way to install torch on MS Windows 8.1?
I got it installed and running on Windows (although not 8.1, but I don't expect the process to be different) following instructions in this repository; it's now deprecated, but wasn't deprecated few months ago when I built it. The new instructions point to torch/torch7 repository, but it has a different structure and I haven't been able to build it on Windows yet.
There are instructions on how to install Torch7 from luarocks, but you may run into issues on windows as well; I haven't tried this process. It seems like there is no official support for Windows yet, but some work is being done by contributors (there is a link to a pull request in that thread).
Based on my experience, compiling that deprecated repo may be your best option on Windows at the moment.
Update (7/9/2015): I've recently submitted several changes that fix compilation issues with mingw, so you may try the most recent version of torch7 and follow the build instructions in the ticket. Note that the changes only apply to the core lib and additional libraries may need similar changes.
This webpage hosted by New York University recommends installing a Linux virtual machine in order to run Torch7 on Windows through Linux. Another option would off course be to install a Linux dist in parallel with Windows 8.
Otherwise, if you don't mind running an older version of Torch, there is a Windows installer for Torch5 at SourceForge.
I think to use a GPU from inside the virtual machine, the processor and the motherboard should not only support VT-x , but VT-d should be supported too.
But the question is, if I use a CPU with VT-d supported, do you think there will be a significant loss in PCIe connections efficiency?
From what I understand,
VT-d is important if I want to give the virtual machines direct access to my hardware components (like PCI Express cards). Like directly attach graphics card to vm instead of host machine. Isn't that mean that the PCIe connections efficiency will be the same just like if it was the host?

Resources