XGB via Scikit learn API doesn't seem to be running on GPU although compiled to run on GPU - machine-learning

Although XGB is compiled to run on the GPU, it doesn't seem to run on the GPU when called through the Scikit learn API.
Please advise whether this is expected behaviour.

As far as I can tell, the Scikit learn API does not currently support GPU. You need to use the learning API (e.g. xgboost.train(...)). This also requires you to first convert your data into an xgboost DMatrix.
Example:
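import xgboost
# "grow_gpu" selects the GPU tree updater (needs a GPU-enabled build)
params = {"updater": "grow_gpu"}
# The learning API expects the training data wrapped in a DMatrix
train = xgboost.DMatrix(x_train, label=y_train)
clf = xgboost.train(params, train, num_boost_round=10)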
UPDATE:
The Scikit Learn API now supports GPU via the **kwargs argument:
http://xgboost.readthedocs.io/en/latest/python/python_api.html#id1

I couldn't get this working with the pip-installed XGBoost, but I pulled the most recent XGBoost from GitHub (git clone --recursive https://github.com/dmlc/xgboost) and compiled it with the PLUGIN_UPDATER_GPU flag, which allowed me to use the GPU with the sklearn API. I also had to change some NVCC flags for my GTX 960: the defaults caused build errors, and then runtime errors due to an architecture mismatch. After it built, I installed it with pip install -e python-package/ from within the repo directory. To use the Scikit learn API (with either grow_gpu or grow_gpu_hist):
import xgboost as xgb
model = xgb.XGBClassifier(
    max_depth=5,
    objective='binary:logistic',
    **{"updater": "grow_gpu"}
)
model.fit(train_x, train_y)
If anyone is interested in the process to fix the build with the GPU flag, here is the process that I went through on Ubuntu 14.04.
i) git clone --recursive https://github.com/dmlc/xgboost
ii) cd into xgboost and run make -j4 to build the multi-threaded CPU version, if no GPU is desired
iii) To build with GPU support, edit make/config.mk to enable PLUGIN_UPDATER_GPU
iv) Edit the Makefile: in the NVCC section (line 101), use the flag --gpu-architecture=sm_xx for your GPU's compute capability (5.2 for the GTX 960), i.e. comment out the original line and add the new one:
#CODE = $(foreach ver,$(COMPUTE),-gencode arch=compute_$(ver),code=sm_$(ver))
CODE = --gpu-architecture=sm_52
v) Run ./build.sh. It should report that it completed in multi-threaded mode; otherwise the NVCC build probably failed (or another error occurred; look further up in the output for it)
vi) In the virtualenv (if desired), in the same directory, run pip install -e python-package/
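As a small illustration of step iv, the sm_xx suffix is just the compute capability with the dot removed. The helper below is a hypothetical convenience (its name is not part of XGBoost or CUDA) that builds the flag from a capability string such as the 5.2 mentioned above.

```python
def nvcc_arch_flag(compute_capability):
    """Build the nvcc architecture flag for a capability like "5.2".

    Hypothetical helper for illustration only.
    """
    major, minor = compute_capability.split(".")
    # "5.2" becomes "sm_52"
    return f"--gpu-architecture=sm_{int(major)}{int(minor)}"


if __name__ == "__main__":
    print(nvcc_arch_flag("5.2"))
```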
These are some things that resolved nvcc errors for me:
i) Installing/updating the CUDA Toolkit by downloading the CUDA Toolkit .deb from Nvidia (version 8.0 worked for me, and is required in some cases?).
ii) Installing/updating cuda:
sudo apt-get update
sudo apt-get install cuda
iii) Add nvcc to your path. Mine was in /usr/local/cuda/bin/
iv) A restart may be required if running nvidia-smi does not work due to some of the cuda/driver/toolkit updates.
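Before rebuilding, it may help to confirm that the tools from steps iii and iv are actually visible. This is a minimal sketch using only the Python standard library; the function name is made up for illustration, and it only checks the PATH, not whether the driver and toolkit versions match.

```python
import shutil


def find_tools(names=("nvcc", "nvidia-smi")):
    """Return a dict mapping each tool name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If nvcc is reported as missing, adding its directory (for example /usr/local/cuda/bin/, as above) to PATH is the first thing to try.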

Related

Drake Mathematical Program Tutorial

I am running Drake on Ubuntu 20.04 under WSL2.
I use Python 3.8.10 and Drake 1.2.0.
I tried running the "Mathematical Program Tutorial" obtained from Deepnote on my PC, but the Ipopt solver behaves unexpectedly and does not give the expected results.
The first error occurs in the section that uses the Ipopt solver:
all components of the solution are printed as "nan".
The second error concerns "get_solver_details().status":
RuntimeError: The solver_details has not been set yet.
I can see both errors in "Demo on manually choosing a solver" in the tutorial.
The result is the following:
SolutionResult.kUnknownError
x* = [nan nan]
Solver is IPOPT
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-12-2d1b3835c54a> in <module>
25 print("x* = ", result.GetSolution(x))
26 print("Solver is ", result.get_solver_id().name())
---> 27 print("Ipopt solver status: ", result.get_solver_details().status,
28 ", meaning ", result.get_solver_details().ConvertStatusToString())
RuntimeError: The solver_details has not been set yet.
Thank you in advance.
P.S.
I installed pydrake in a venv with the following pip commands:
python3 -m venv env
env/bin/pip install --upgrade pip
env/bin/pip install drake
sudo apt-get install --no-install-recommends \
libpython3.8 libx11-6 libsm6 libxt6 libglib2.0-0
source env/bin/activate
I just downloaded the folder "Tutorial" from Deepnote and put it under env.
Then I ran it with Jupyter Notebook:
jupyter notebook
and opened env/Tutorials/mathematical_program.ipynb
It turns out that the pip drake == 1.2.0 version has a bug in the IpoptSolver compilation.
As a work-around, you can use SnoptSolver instead, or else use the https://drake.mit.edu/from_binary.html release (unpacking a zipped binary, instead of using pip).
It's possible that the pydrake.solvers.ipopt.IpoptSolver class (which is a wrapper around the https://coin-or.github.io/Ipopt/ library) does not run correctly under WSL2, due to using some odd libc API which doesn't work on Windows. We will need more information to reproduce the problem and try to debug.
Can you state exactly how you installed pydrake (i.e., show us the command lines you used)? Was it via pip (https://drake.mit.edu/pip.html) or just via binary (https://drake.mit.edu/from_binary.html)?
Can you state exactly how you ran Jupyter (the command line) to launch the notebook? Was it python3 -m pydrake.tutorials or something else?
Looks like this may not be tied to WSL, but instead to the pip build (or just the binary build). I ran into this on Ubuntu 20.04 (no WSL). Per the Drake Slack, I filed an issue:
https://github.com/RobotLocomotion/drake/issues/17162
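Independently of the underlying bug, the notebook cell can be made to report the failure instead of crashing. The sketch below assumes only the two methods that appear in the traceback and output above (get_solution_result() and get_solver_details()); the helper name describe_solver_status is hypothetical and not part of Drake.

```python
def describe_solver_status(result):
    """Return a printable status line without raising when details are missing.

    `result` is assumed to be the object returned by a Drake solve call.
    """
    summary = f"solution result: {result.get_solution_result()}"
    try:
        details = result.get_solver_details()
    except RuntimeError as err:
        # Raised when the solver never filled in its details, e.g.
        # "The solver_details has not been set yet."
        return f"{summary}; solver details unavailable ({err})"
    return (
        f"{summary}; solver status: {details.status}, "
        f"meaning {details.ConvertStatusToString()}"
    )
```

Calling print(describe_solver_status(result)) in place of the two failing print lines should at least show the solution result, which helps when reporting the issue.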

Pytorch errors: "received an invalid combination of arguments" in Jupyter Notebook

I'm trying to learn Pytorch, but whenever I try an online tutorial (https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py), I get errors when running certain functions, but only in Jupyter Notebook.
When running
x = torch.empty(5, 3)
I get an error:
module 'torch' has no attribute 'empty'
Furthermore, when running
x = torch.zeros(5, 3, dtype=torch.long)
I get the error:
module 'torch' has no attribute 'long'
Some other functions work fine like:
x = torch.rand(5, 3)
But generally, most code I try to run seems to run into an error really quickly. I couldn't find any resolution online.
When I go into my docker container and simply run python in the shell, I can run these lines just fine with no errors.
I'm running pytorch in a Docker image that I extended from a fastai image, as it already included things like jupyter notebook and pytorch. I used anaconda to update everything, and committed it to a new image for myself.
I have absolutely no idea what the issue could be. I've tried updating packages through anaconda, pip, aptitude in my docker container, and making sure to commit my changes, but nothing seems to work. I also tried creating a new kernel with python 3.7 as I noticed that my Jupyter Notebook only runs in 3.6.4, and when I run python in the shell it is at 3.7.
I've also tried getting different docker images and extending them with what I need, but all images that I've tried have had errors with anaconda where it gets stuck on "Solving environment" step.
OK, so the fix for me was to update pytorch through conda using the following command:
conda update pytorch
If it's not installed yet, I've gotten it to work in other environments by simply installing it through conda:
conda install pytorch
Kind of stupid that I didn't try this earlier, but I was confused about the difference between conda and pip.
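Since the shell and the notebook behaved differently, a quick way to see whether they use the same interpreter and the same installed package is to run the same snippet in both and compare. This is a sketch using only the standard library (Python 3.8 or newer is assumed for importlib.metadata); the function name is made up for illustration.

```python
import importlib.util
import sys
from importlib import metadata


def describe_package(name):
    """Report the running interpreter and where `name` would be imported from."""
    spec = importlib.util.find_spec(name)
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,
        "python": sys.version.split()[0],
        "location": spec.origin if spec else None,
        "version": version,
    }


if __name__ == "__main__":
    # Run once in a notebook cell and once in the shell, then compare.
    print(describe_package("torch"))
```

If the two outputs show different interpreters or package locations, the notebook kernel is picking up a different (possibly older) install than the shell.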

Keras with TensorFlow backend not using GPU

I built the gpu version of the docker image https://github.com/floydhub/dl-docker with keras version 2.0.0 and tensorflow version 0.12.1. I then ran the mnist tutorial https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py but realized that keras is not using GPU. Below is the output that I have
root@b79b8a57fb1f:~/sharedfolder# python test.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-09-06 16:26:54.866833: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866863: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866870: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Can anyone let me know if there are some settings that need to be made before keras uses the GPU? I am very new to all this, so do let me know if I need to provide more information.
I have installed the pre-requisites as mentioned on the page
Install Docker following the installation guide for your platform: https://docs.docker.com/engine/installation/
I am able to launch the docker image
docker run -it -p 8888:8888 -p 6006:6006 -v /sharedfolder:/root/sharedfolder floydhub/dl-docker:cpu bash
GPU Version Only: Install Nvidia drivers on your machine either from Nvidia directly or follow the instructions here. Note that you don't have to install CUDA or cuDNN. These are included in the Docker container.
I am able to run the last step
cv@cv-P15SM:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.66 Mon May 1 15:29:16 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
GPU Version Only: Install nvidia-docker: https://github.com/NVIDIA/nvidia-docker, following the instructions here. This will install a replacement for the docker CLI. It takes care of setting up the Nvidia host driver environment inside the Docker containers and a few other things.
I am able to run the step here
# Test nvidia-smi
cv@cv-P15SM:~$ nvidia-docker run --rm nvidia/cuda nvidia-smi
Thu Sep 7 00:33:06 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780M    Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   55C    P0    N/A /  N/A |    310MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
I am also able to run the nvidia-docker command to launch a gpu supported image.
What I have tried
I have tried the following suggestions below
Check if you have completed step 9 of this tutorial ( https://github.com/ignaciorlando/skinner/wiki/Keras-and-TensorFlow-installation ). Note: Your file paths may be completely different inside that docker image, you'll have to locate them somehow.
I appended the suggested lines to my bashrc and have verified that the bashrc file is updated.
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda-8.0' >> ~/.bashrc
I also added the following lines to my Python file:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="0"
Both steps, done separately or together unfortunately did not solve the issue. Keras is still running with the CPU version of tensorflow as its backend. However, I might have found the possible issue. I checked the version of my tensorflow via the following commands and found two of them.
This is the CPU version
root@08b5fff06800:~# pip show tensorflow
Name: tensorflow
Version: 1.3.0
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: tensorflow-tensorboard, six, protobuf, mock, numpy, backports.weakref, wheel
And this is the GPU version
root@08b5fff06800:~# pip show tensorflow-gpu
Name: tensorflow-gpu
Version: 0.12.1
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: mock, numpy, protobuf, wheel, six
Interestingly, the output shows that keras is using tensorflow version 1.3.0 which is the CPU version and not 0.12.1, the GPU version
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import tensorflow as tf
print('Tensorflow: ', tf.__version__)
Output
root@08b5fff06800:~/sharedfolder# python test.py
Using TensorFlow backend.
Tensorflow: 1.3.0
I guess now I need to figure out how to have keras use the gpu version of tensorflow.
It is never a good idea to have both tensorflow and tensorflow-gpu packages installed side by side (the one single time it happened to me accidentally, Keras was using the CPU version).
Regarding "I guess now I need to figure out how to have keras use the gpu version of tensorflow":
You should simply remove both packages from your system, and then re-install tensorflow-gpu [UPDATED after comment]:
pip uninstall tensorflow tensorflow-gpu
pip install tensorflow-gpu
Moreover, it is puzzling why you seem to use the floydhub/dl-docker:cpu container, while according to the instructions you should be using the floydhub/dl-docker:gpu one...
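To check for this side-by-side situation without reading two pip show outputs, something like the following could be used. It is a sketch that assumes Python 3.8 or newer (for importlib.metadata, so it would not run on the Python 2.7 setup shown above), and both function names are made up for illustration.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def has_conflict(versions):
    """True when more than one of the given distributions is installed."""
    return sum(v is not None for v in versions.values()) > 1


if __name__ == "__main__":
    found = installed_versions(["tensorflow", "tensorflow-gpu"])
    print(found)
    if has_conflict(found):
        print("Both packages are installed; uninstall both and reinstall only one.")
```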
I had a similar issue: keras didn't use my GPU. I had tensorflow-gpu installed in conda according to the instructions, but after installing keras the GPU was simply not listed as an available device. I realized that installing keras adds the tensorflow package! So I had both the tensorflow and tensorflow-gpu packages. I found that there is a keras-gpu package available. After completely uninstalling keras, tensorflow and tensorflow-gpu, and then installing tensorflow-gpu and keras-gpu, the problem was solved.
In the future, you can try using virtual environments to separate tensorflow CPU and GPU, for example:
conda create --name tensorflow python=3.5
activate tensorflow
pip install tensorflow
AND
conda create --name tensorflow-gpu python=3.5
activate tensorflow-gpu
pip install tensorflow-gpu
This worked for me:
Install tensorflow v2.2.0
pip install tensorflow==2.2.0
Also remove tensorflow-gpu (if it's present)

Ubuntu Docker image with minimal mono runtime in order to run F# app

I need to build a "slim" docker image which only contains the mono runtime in order to execute a pre-compiled F# app. In other words, I want to create the leanest possible image for executing mono apps, without any of the additional stuff for compiling/building apps. I am using Ubuntu:16.04 as my base image (which weighs in at around 47MB).
If I try to install mono on top of that image (using apt-get install mono-devel), then the image grows to a whopping 500MB. This of course happens because the entire set of mono development tools is installed.
How can I proceed to only create an image containing the mono runtime? Is there a way to install just the mono runtime through apt-get?
I'm answering the question as it is stated:
How can I proceed to only create an image containing the mono runtime?
For that, the answer is yes. There is a package for just the runtime called mono-runtime. In addition to that, there is an apt option to ignore installing recommended packages (usually docs and other stuff that may not be necessary for a runtime) with --no-install-recommends. Combining the two, we can get down to around 240 MB on the Ubuntu base:
FROM ubuntu
RUN apt update && apt install -qy --no-install-recommends mono-runtime libfsharp-core4.3-cil
As also mentioned in the comments, there are some more minimal images based on Alpine Linux that may be of interest, such as https://hub.docker.com/r/frolvlad/alpine-mono/ (which at the moment is around 200 MB).

How to fly a Parrot Bebop drone in tum simulator?

I am working on a school project. I have completed the http://bebop-autonomy.readthedocs.io/en/indigo-devel/running.html installation up to running the driver.
My current package structure is this:
$ mkdir -p ~/bebop_ws/src && cd ~/bebop_ws
$ catkin init
$ git clone https://github.com/AutonomyLab/bebop_autonomy.git src/bebop_autonomy
# Update rosdep database and install dependencies (including parrot_arsdk)
$ rosdep update
$ rosdep install --from-paths src -i
# Build the workspace
$ catkin build -DCMAKE_BUILD_TYPE=RelWithDebInfo
I need a simulator to simulate the Bebop drone. I have installed Gazebo 2.x with ROS Indigo. Then, for simulation, I followed the instructions below to set up tum simulator, but permission is denied at the roscd step.
$ roscd
$ git clone https://github.com/tum-vision/tum_simulator.git
$ export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:`pwd`/tum_simulator
$ rosmake cvg_sim_gazebo_plugins
$ rosmake message_to_tf
If I skip the roscd step and clone tum_simulator, when I run this I am getting the following error :
#"[ rosmake ] WARNING: The following args could not be parsed as stacks or packages: ['cvg_sim_gazebo_plugins']
#[ rosmake ] ERROR: No arguments could be parsed into valid package or stack names.
"
Can somebody help me start the tum simulator, either with your own solution or by fixing what I am doing? If I can use https://github.com/dougvk/tum_simulator, in which directory should I clone the tum_simulator repository?
Most likely I arrived a few years late, but I think it may be useful for future visitors to the page. My colleagues and I, at the University of Sannio in Benevento, Italy, developed a ROS package for simulating the behavior of the Parrot Bebop 2. The ROS package is compatible with both the Indigo Igloo and Kinetic Kame distros of ROS. Moreover, we are working to make the code compatible with Parrot Sphinx (it simulates the firmware behavior, thus making it possible to put it in the control loop). Here is the link to the repository on GitHub:
https://github.com/gsilano/BebopS
EDIT January 5, 2020
Now, the ROS package is compatible with the Melodic distro of ROS and Gazebo. Moreover, the package is compatible with ROS Kinetic and both the 7th and 9th releases of Gazebo.
Unfortunately the tum_simulator package is for Parrot AR-Drone quadcopters and does not support Bebop. When you are using a simulator, you do not need to run the driver since the simulator replaces the driver and provides a similar API as if you were using the real hardware.
You can remove bebop_autonomy from your workspace (even better, start from scratch in a new workspace) and follow the instructions on how to compile/run tum_simulator on ROS Indigo. The documentation for ardrone_autonomy, the ROS driver for Parrot AR-Drone can be found here.
I was also looking for a similar simulator. In the end, I added the model of the Parrot Bebop 2 to the ETH RotorS package. You may also give it a try.
git clone https://github.com/ayushgaud/rotors_simulator.git
Currently, I am using their controller but I have added a Gazebo model with most parameters correctly modeled and the front camera mounted.
