What is the relationship between PyTorch and Torch?

There are two PyTorch repositories:
https://github.com/hughperkins/pytorch
https://github.com/pytorch/pytorch
The first clearly requires Torch and Lua and is a wrapper, but the second doesn't reference the Torch project except in its name.
How is it related to the Lua Torch?

Here is a short comparison of PyTorch and Torch.
Torch:
A tensor library like NumPy, but unlike NumPy it has strong GPU support.
You use Torch through Lua (yes, you need a good understanding of Lua), and for that you will need the LuaRocks package manager.
PyTorch:
No need for the LuaRocks package manager, no need to write code in Lua. And because we are using Python, we can develop deep learning models with great flexibility. We can also exploit major Python packages like SciPy, NumPy, Matplotlib and Cython together with PyTorch's own autograd.
There is a detailed discussion of this on the PyTorch forum. Adding to that, both PyTorch and Torch use THNN: Torch provides Lua wrappers to the THNN library, while PyTorch provides Python wrappers for the same library.
You get PyTorch's dynamic recurrent nets, weight sharing and memory efficiency, together with the flexibility of interfacing with C and the speed of Torch.
For more insights, have a look at the discussion thread here.
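To make the Python side concrete, here is a minimal sketch of NumPy interop plus autograd (assuming a reasonably recent PyTorch release; the API has evolved since the early versions discussed here):

import numpy as np
import torch

# A NumPy array becomes a Torch tensor (sharing the same memory)
x = torch.from_numpy(np.arange(4, dtype=np.float32))
w = torch.ones(4, requires_grad=True)   # autograd tracks operations on w

loss = (w * x).sum()   # the graph is built dynamically as you compute
loss.backward()        # autograd computes d(loss)/dw
print(w.grad)          # tensor([0., 1., 2., 3.])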

Just to clarify the confusion between the two pytorch repositories:
pytorch/pytorch is very similar to (Lua) Torch, but in Python. So it's a wrapper over THNN. It was also written by Facebook.
hughperkins/pytorch: I came across this repo when I was developing in Torch before PyTorch existed, but I have never used it, so I'm not quite sure whether it is a Python wrapper over (Lua) Torch (which is in turn a wrapper over THNN) or a wrapper directly over THNN and Lua. In either case, it is not the original version of Torch. It was written by Hugh Perkins when there was no Python alternative for Torch.
If you are wondering which one to go for, I would definitely recommend pytorch/pytorch, as it communicates directly with THNN, is written by the people who made THNN, and is continuously maintained. hughperkins/pytorch does not seem to be maintained anymore.

Related

Is there a native library written in Julia for Machine Learning?

I have started using Julia. I read that it is faster than C.
So far I have seen some libraries like Knet and Flux, but both are for deep learning.
There is also the PyCall package to use Python inside Julia.
But I am interested in machine learning too. So I would like to use SVM, Random Forest, KNN, XGBoost, etc., but in Julia.
Is there a native library written in Julia for Machine Learning?
Thank you
A lot of algorithms are just plain available through dedicated packages, like BayesNets.jl.
For "classical machine learning" there is MLJ.jl, a pure-Julia machine learning framework written by the Alan Turing Institute, with very active development.
For neural networks, Flux.jl is the way to go in Julia. It is also very active, GPU-ready, and allows all the exotic combinations that exist in the Julia ecosystem, like DiffEqFlux.jl, a package that combines Flux.jl and DifferentialEquations.jl.
Also keep an eye on Zygote.jl, a source-to-source automatic differentiation package that will serve as a backend for Flux.jl.
Of course, if you're more comfortable with Python ML tools you still have TensorFlow.jl and ScikitLearn.jl, but the OP asked for pure Julia packages, and those are just Julia wrappers of Python packages.
Have a look at this kNN implementation and this one for XGBoost.
There are SVM implementations, but they are outdated and unmaintained (search for SVM.jl). But, really, consider other algorithms for much better prediction quality and model-construction performance. Have a look at the OLS (orthogonal least squares) and OFR (orthogonal forward regression) algorithm family. You will easily find detailed algorithm descriptions that are easy to code in any suitable language. However, there is currently no Julia implementation that I am aware of; I found only Matlab implementations and made my own Java implementation some years ago. I have plans to port it to Julia, but that is currently not a priority and may take some years. Meanwhile, why not code it yourself? You won't find any other language that makes it easier to code a prototype and then turn it into a highly efficient production algorithm running heavy loads on a CUDA-enabled GPGPU.
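To show how little code the core loop needs, here is a minimal sketch of the classic OFR selection step based on the standard error-reduction-ratio formulation. It is in Python/NumPy purely for illustration (the thread is about Julia, and the function name, tolerance and stopping rule are my own choices), a sketch rather than a reference implementation:

import numpy as np

def ofr_select(P, y, n_terms):
    # Greedy orthogonal forward regression (error-reduction-ratio variant).
    # P: (N, M) matrix of candidate regressors; y: (N,) target vector.
    yy = y @ y
    selected, basis, errs = [], [], []
    for _ in range(n_terms):
        best_err, best_i, best_w = -1.0, -1, None
        for i in range(P.shape[1]):
            if i in selected:
                continue
            # Orthogonalize candidate i against the already-selected basis
            w = P[:, i].astype(float)
            for b in basis:
                w -= (b @ w) / (b @ b) * b
            ww = w @ w
            if ww < 1e-12:          # candidate is (numerically) dependent
                continue
            g = (w @ y) / ww        # least-squares coefficient on w
            err = g * g * ww / yy   # error reduction ratio of this term
            if err > best_err:
                best_err, best_i, best_w = err, i, w
        if best_i < 0:
            break                   # no usable candidates left
        selected.append(best_i)
        basis.append(best_w)
        errs.append(best_err)
    return selected, errs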
I recommend this quite new publication, to start with: Nonlinear identification using orthogonal forward regression with nested optimal regularization

H2O ML Python cheatsheet OR comparison between H2O using Python and scikit-learn

I wanted to ask if anyone has come across an H2O machine learning Python cheatsheet, or a comparison between H2O using Python and scikit-learn.
It would be very helpful, since I am a scikit-learn guy.
There is an H2OFrame / Pandas DataFrame munging cheatsheet here.
The "EEG Eyestate" demo was written for both H2O and scikit-learn, so that's the closest thing to a side-by-side comparison that I can point you to.
There are some Python tutorials here, which demonstrate basic usage of the supervised H2O algorithms (and grid search) in Python.
Taylor Smith created the skutil module, which allows you to use H2O models more easily with sklearn pipelines.
For algorithms and examples in Python showing how to use each parameter, go here (the main H2O user documentation) and look at the Algorithms section:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/index.html
For Python-specific material, go to the docs website and search for 'Python' on the page; there's a box specifically with Python resources:
http://docs.h2o.ai
You can use H2O models as elements of an sklearn pipeline.
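As a small taste of the API differences, here is a hedged side-by-side sketch of fitting a gradient boosted classifier in both libraries (the CSV file and the "label" column are made up; the H2O calls follow its standard estimator API):

import h2o
import pandas as pd
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from sklearn.ensemble import GradientBoostingClassifier

df = pd.read_csv("train.csv")   # hypothetical training data
X, y = df.drop(columns="label"), df["label"]

# scikit-learn: in-memory data, fit(X, y)
skl_model = GradientBoostingClassifier(n_estimators=50).fit(X, y)

# H2O: data lives in the H2O cluster as an H2OFrame
h2o.init()
hf = h2o.H2OFrame(df)
hf["label"] = hf["label"].asfactor()   # mark the target as categorical
h2o_model = H2OGradientBoostingEstimator(ntrees=50)
h2o_model.train(x=[c for c in hf.columns if c != "label"],
                y="label", training_frame=hf)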

Is it possible to use libsvm in cuda?

I wonder if I can use libsvm with CUDA.
I am looking for the best parameters by cross-validation, so I have to run the same code around 4000 times with different parameters.
I wonder if I can run the cross-validation in parallel with CUDA,
instead of using:
for i in range(4000):
    predict(parameters[i])
find_best_parameter()
On the official webpage of the libsvm software you can find this sentence:
Python, R, MATLAB, Perl, Ruby, Weka, Common LISP, CLISP, Haskell, OCaml, LabVIEW, and PHP interfaces. C# .NET code and CUDA extension is available.
And there is a link to a GPU implementation:
http://mklab.iti.gr/project/GPU-LIBSVM
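Also note that the ~4000 runs are independent of each other, so even without a GPU you can parallelize the search across CPU cores. A minimal sketch using scikit-learn, whose SVC is built on libsvm (the toy data and parameter grid are just placeholders):

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data standing in for the real problem
X, y = make_classification(n_samples=500, random_state=0)

# n_jobs=-1 evaluates the (candidate, fold) pairs on all CPU cores
grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1, 1]}
search = GridSearchCV(SVC(), grid, cv=5, n_jobs=-1).fit(X, y)
print(search.best_params_)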

Online time series algorithms implemented in R/python/MOA

I am looking for implemented online-learning time series algorithms. Do R, Python, MOA, or any other tools have these kinds of algorithms implemented?
TIA!
It's a little bit late, but in case someone is looking for the answer, I will share what I know:
PYTHON: sklearn clustering algorithms.
MiniBatchKMeans and Birch: both algorithm implementations have a partial_fit method, allowing you to stream data through them in incremental updates (enabling online learning).
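A minimal sketch of that streaming pattern (the random mini-batches stand in for a real data stream):

import numpy as np
from sklearn.cluster import MiniBatchKMeans

model = MiniBatchKMeans(n_clusters=3, random_state=0)

# Each iteration pretends to receive a fresh mini-batch from a stream
for _ in range(100):
    chunk = np.random.rand(50, 2)   # hypothetical incoming batch
    model.partial_fit(chunk)        # incremental (online) update

print(model.cluster_centers_)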
JAVA: MOA framework.
There are a lot of well-known stream clustering algorithms implemented (CluStream, DenStream, etc.). You can use them via:
the terminal
the user interface (see the clustering demo)
code (the Java API)
See the 'Downloads' section on the MOA website, or check the source code directly on GitHub.
R: streamMOA, an R package that acts as a wrapper for the MOA (Java) classes. See the manual.

OpenCV GPU support and TBB

I am going to train my Haar classifier for flowers (which I am highly skeptical about). I have been following the CodingRobin tutorial for everything.
http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
Now, it has been emphasized that I should use GPU support, multithreading, etc., otherwise the training is going to take days. I am going to use pre-built libraries, and therefore the pre-built opencv_traincascade utility.
I want to ask beforehand: will I be able to leverage GPU support if I use the pre-built libs, given that I install CUDA?
Where does TBB fit into the whole picture?
Do you recommend building the whole library from scratch with TBB and CUDA support enabled, or would that be a waste?
Note: I am using OpenCV 2.4.11, and I am a complete beginner with OpenCV.
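One quick way to check what a pre-built OpenCV binary was actually compiled with is to inspect its build information. A minimal check from the Python bindings (the substring filter is just an illustration; the exact report layout varies between OpenCV versions):

import cv2

# Print the compile-time configuration of the installed OpenCV build;
# the report includes lines such as "Use TBB:" and "Use Cuda:".
info = cv2.getBuildInformation()
for line in info.splitlines():
    if "cuda" in line.lower() or "tbb" in line.lower():
        print(line.strip())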

Resources