Fastest way to compute cosine similarity on a GPU - machine-learning

So I have a huge TF-IDF matrix with more than a million records, and I would like to compute the cosine similarity of this matrix with itself. I am using Colab to run the code, but I am not sure how to best make use of the GPU that Colab provides.
Sequentially run code:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

tf = TfidfVectorizer()
tfidf_matrix = tf.fit_transform(df['categories'])
cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)
Is there a way to parallelise the code using JIT or anything else?

Try simple PyTorch code like in this example from the sentence-transformers library: https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/util.py#L31
or just import the function.
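One caveat at your scale: a full million-by-million similarity matrix will not fit in GPU memory, so a blocked version helps. Below is a minimal sketch of that idea (assumptions: a CUDA device is available and the rows are dense float tensors; in practice you would densify the TF-IDF rows or use lower-dimensional embeddings, and the block size of 1024 is just an illustrative choice):

import torch

def cos_sim_blocked(emb, block=1024):
    # Normalize rows once; cosine similarity then reduces to a dot product.
    emb = torch.nn.functional.normalize(emb, dim=1).to('cuda')
    for start in range(0, emb.shape[0], block):
        # Yield a (block, N) slab of similarities instead of the full N x N matrix.
        yield start, emb[start:start + block] @ emb.T

emb = torch.rand(10_000, 300)          # toy stand-in for the TF-IDF rows
for start, sims in cos_sim_blocked(emb):
    values, idx = sims.topk(5, dim=1)  # e.g. keep each row's 5 nearest neighbours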
Consider the cuML library, which uses CUDA acceleration:
https://docs.rapids.ai/api/cuml/nightly/api.html
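A hedged sketch of the cuML route (assumptions: a RAPIDS install, and that cuml.metrics.pairwise_distances supports metric='cosine' in your version; cosine similarity is then 1 minus the cosine distance):

import cupy as cp
from cuml.metrics import pairwise_distances

X = cp.random.rand(10_000, 300, dtype=cp.float32)    # toy stand-in for the matrix rows
sims = 1.0 - pairwise_distances(X, metric='cosine')  # computed on the GPU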

Related

Computing the normal cdf with custom pydrake types?

I'm writing a MathematicalProgram that requires evaluating the likelihood of sampling a normal distribution and getting a value close to a certain number. Ideally, I would want to write something like this, which uses the scipy.stats norm class:
from scipy.stats import norm
import numpy as np

def log_likelihood_cost(x):
    cdf_upper = norm.cdf(x + eps)
    cdf_lower = norm.cdf(x - eps)
    observation_likelihood = cdf_upper - cdf_lower
    return -np.log(observation_likelihood)

prog.AddCost(log_likelihood_cost, vars=x)
However, scipy doesn't play nicely with numpy arrays of dtype "object": the above code throws an error because the "isnan" operation can't be safely applied. I can think of a few not very satisfying workarounds:
Instead of integrating around the value, using the probability density itself as the basis for the cost. This is bad because I'd like the cost to be positive, and the density is unbounded if we vary sigma.
Approximating the integral with numpy code, perhaps using the trapezoidal rule, maybe also caching the cdf over a grid and interpolating. This feels a bit like reinventing the wheel, and it'd also be much more bug-prone.
Ideally, I'd avoid resorting to either of those. I've looked into kGaussian and the like, but there didn't seem to be any methods to compute a cdf, at least among those exposed in Python. What would be the pydrake way of doing this?
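One possible workaround, sketched below under loud assumptions: the normal CDF is 0.5 * (1 + erf(x / sqrt(2))), and erf has rational approximations built only from arithmetic and exp, which drake's scalar types support. The sketch assumes pydrake.math.exp dispatches on AutoDiffXd and that comparing an AutoDiffXd against a float works by value; verify both against your drake version.

import numpy as np
import pydrake.math as dmath

def erf_approx(x):
    # Abramowitz & Stegun 7.1.25 (3-term), absolute error < 2.5e-5, valid for x >= 0.
    p, a1, a2, a3 = 0.47047, 0.3480242, -0.0958798, 0.7478556
    k = 1.0 / (1.0 + p * x)
    return 1.0 - (a1 * k + a2 * k * k + a3 * k * k * k) * dmath.exp(-x * x)

def normal_cdf(z):
    t = z / np.sqrt(2.0)
    # erf is odd, so reflect negative arguments onto the x >= 0 branch.
    return 0.5 * (1.0 + erf_approx(t)) if t >= 0 else 0.5 * (1.0 - erf_approx(-t))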

KNN classifier taking too much time even on gpu

I am classifying MNIST digits using KNN on Kaggle, but the last step is taking too much time to execute, even though the MNIST data is just 15 MB. It is still running as I write this. Can you point out any problem in my code? Thanks.
import numpy as np   # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import os
print(os.listdir("../input"))

# Loading dataset
train = pd.read_csv('../input/mnist_train.csv')
test = pd.read_csv('../input/mnist_test.csv')
X_train = train.drop('label', axis=1)
y_train = train['label']
X_test = test.drop('label', axis=1)
y_test = test['label']

from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
accuracy
There isn't anything wrong with your code per se. KNN is just a slow algorithm: every prediction computes distances between the query and all training images, which is expensive at this scale, and the problem is large enough that your CPU cache can't be used effectively.
Without using a different library or coding your own GPU kernel, you can probably get a speed boost by replacing
clf=KNeighborsClassifier(n_neighbors=3)
with
clf=KNeighborsClassifier(n_neighbors=3, n_jobs=-1)
to at least use all of your cores.
You are not actually using the GPU on Kaggle: scikit-learn's KNeighborsClassifier does not support GPUs.
To use the GPU for KNN you need a library that supports it, such as simbsig, where you specify the device explicitly; otherwise it defaults to CPU. The documentation is here: https://simbsig.readthedocs.io/en/latest/KNeighborsClassifier.html
# simbsig's KNeighborsClassifier, not sklearn's
knn = KNeighborsClassifier(n_neighbors=3, device='gpu')
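Another option, if you want GPU KNN behind a scikit-learn-style interface, is RAPIDS cuML. A sketch, assuming a RAPIDS installation and the same variables as the question's code:

from cuml.neighbors import KNeighborsClassifier as cuKNN

clf = cuKNN(n_neighbors=3)
clf.fit(X_train, y_train)             # accepts numpy/pandas input, computes on the GPU
accuracy = clf.score(X_test, y_test)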

Why it's only working with setting kernel: 'rbf' in SVM Classifier?

from sklearn.model_selection import GridSearchCV
from sklearn import svm

params_svm = {
    'kernel': ['linear', 'rbf', 'poly'],
    'C': [0.1, 0.5, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1, 10]
}
svm_clf = svm.SVC()
estimator_svm = GridSearchCV(svm_clf, param_grid=params_svm, cv=4, verbose=1, scoring='accuracy')
estimator_svm.fit(data, labels)
print(estimator_svm.best_params_)
estimator_svm.best_score_

# data.shape is (891, 9); labels.shape is (891,).
# Both are numeric arrays (2-D and 1-D respectively).
When I use GridSearchCV with only 'rbf', it gives the best parameter combination in just 2.7 seconds!
But when I include 'poly' or 'linear' in the kernel list, separately or together with 'rbf', it takes far too long to produce output (no result even after 15-20 minutes), which makes me think I am doing something wrong. I am new to (supervised) machine learning and am not able to find any bug in the code, so I don't understand what's going on behind the scenes.
Can anyone explain to me what I am doing wrong?
No, you are not doing anything wrong in your code. There are several factors that come into play here:
SVC is a complex classifier which requires computing a kernel value between each pair of points in the dataset.
The complexity also varies with the kernel. I am not sure, but I think it is O(n_samples^2 * n_features) for the rbf kernel, while it is O(n_samples * n_features) for the linear kernel. So it is not the case that just because the rbf kernel finishes quickly, the linear kernel will finish in a similar time.
The time taken also depends drastically on the dataset and the data patterns present in it. For example, an rbf kernel may converge quickly with, say, C = 0.5, while a polynomial kernel may take drastically longer to converge for the same value of C.
Also, without using the kernel cache the running time increases a lot. In this answer, the author mentions it might increase to O(n_samples^3 * n_features).
Here is the official documentation from sklearn about SVM complexity. See this section about practical tips on using SVMs as well.
You can set verbose to True to see the progress of your classifier and how it is trained.
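As a quick diagnostic, one sketch (reusing the question's data and labels variables) is to time each kernel separately with a capped iteration count, so a fit that fails to converge cannot hang forever:

import time
from sklearn import svm

for kernel in ['linear', 'rbf', 'poly']:
    clf = svm.SVC(kernel=kernel, C=1, gamma=0.1,
                  cache_size=500,    # MB of kernel cache; the default is 200
                  max_iter=200_000)  # hard stop instead of the unbounded default (-1)
    t0 = time.time()
    clf.fit(data, labels)
    print(f'{kernel}: {time.time() - t0:.1f}s')

For the linear kernel specifically, svm.LinearSVC uses the liblinear solver and usually scales much better than SVC(kernel='linear').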
References
GridSearchCV goes to endless execution using SVC
Computational complexity of SVM
Official Documentation of SVM for scikit-learn

facial expression classification in real time using SVM

I am currently working on a project where I have to extract the facial expression of a user (only one user at a time, from a webcam), such as sad or happy.
My method for classifying facial expressions is:
Use OpenCV to detect the face in the image
Use ASM and Stasm to get the facial feature points
Now I'm trying to do the facial expression classification itself.
Is SVM a good option? And if it is, how can I start with SVM: how am I going to train the SVM for each emotion using these landmarks?
Yes, SVMs have repeatedly been shown to perform well on this task. There have been dozens (if not hundreds) of papers describing such procedures.
For example:
Simple paper
Longer paper
Poster about it
More complex example
Some basic sources on SVMs themselves can be found at http://www.support-vector-machines.org/ (book titles, software links, etc.).
And if you are just interested in using them rather than understanding them, you can pick up one of the basic libraries:
libsvm http://www.csie.ntu.edu.tw/~cjlin/libsvm/
svmlight http://svmlight.joachims.org/
If you are already using OpenCV, I suggest you use its built-in SVM implementation. Training/saving/loading in Python is as follows; C++ has a corresponding API to do the same in about the same amount of code. It also has train_auto to find the best parameters.
# OpenCV 2.x API (cv2.SVM was removed in OpenCV 3, where cv2.ml.SVM_create() replaces it)
import numpy as np
import cv2
samples = np.array(np.random.random((4, 5)), dtype=np.float32)
labels = np.array(np.random.randint(0, 2, 4), dtype=np.float32)
svm = cv2.SVM()
svmparams = dict(kernel_type=cv2.SVM_LINEAR,
                 svm_type=cv2.SVM_C_SVC,
                 C=1)
svm.train(samples, labels, params=svmparams)
testresult = np.float32([svm.predict(s) for s in samples])
print samples
print labels
print testresult
svm.save('model.xml')
loaded = svm.load('model.xml')
and output
#print samples
[[ 0.24686454 0.07454421 0.90043277 0.37529686 0.34437731]
[ 0.41088378 0.79261768 0.46119651 0.50203663 0.64999193]
[ 0.11879266 0.6869216 0.4808321 0.6477254 0.16334397]
[ 0.02145131 0.51843268 0.74307418 0.90667248 0.07163303]]
#print labels
[ 0. 1. 1. 0.]
#print testresult
[ 0. 1. 1. 0.]
So you provide the n flattened shape models as samples and n labels, and you are good to go. You probably don't even need the ASM part: just apply some orientation-sensitive filters such as Sobel or Gabor, concatenate the matrices, flatten them, and feed them directly to the SVM (a sketch of this pipeline follows below). You can probably get around 70-90% accuracy.
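A minimal sketch of that filter-based pipeline using the modern OpenCV 3+ ml module (toy random data standing in for face crops; the 64x64 size and the two-class labels are assumptions for illustration only):

import cv2
import numpy as np

def sobel_features(face):
    # Orientation-sensitive responses, flattened and concatenated into one vector.
    gx = cv2.Sobel(face, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(face, cv2.CV_32F, 0, 1)
    return np.concatenate([gx.ravel(), gy.ravel()])

faces = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(4)]
samples = np.stack([sobel_features(f) for f in faces]).astype(np.float32)
labels = np.array([[0], [1], [0], [1]], dtype=np.int32)  # toy emotion labels

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.train(samples, cv2.ml.ROW_SAMPLE, labels)
print(svm.predict(samples)[1].ravel())  # predict() returns (retval, results)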
As someone said, CNNs are an alternative to SVMs. Here are some links that implement LeNet-5. So far, I find SVMs much simpler to get started with.
https://github.com/lisa-lab/DeepLearningTutorials/
http://www.codeproject.com/Articles/16650/Neural-Network-for-Recognition-of-Handwritten-Digi
-edit-
Landmarks are just n (x, y) vectors, right? So why don't you try putting them into an array of size 2n and simply feeding them directly to the code above?
For example, 3 training samples of 4 landmarks (0,0), (10,10), (50,50), (70,70):
samples = [[0,0,10,10,50,50,70,70],
           [0,0,10,10,50,50,70,70],
           [0,0,10,10,50,50,70,70]]
labels = [0., 1., 2.]  # 0 = happy, 1 = angry, 2 = disgust
You could check this code to get an idea of how this could be done using SVM.
You can find the algorithm explained here.

How to do a Gaussian filtering in 3D

How do I do Gaussian smoothing in the third dimension?
I have this detection pyramid: votes accumulated at four scales, with objects found at each peak.
I already smoothed each of them in 2D, and I read in my papers that I need to filter the third dimension with sigma = 1, which I haven't tried before; I am not even sure what it means.
I figured out how to do it in Matlab, and I need something similar in OpenCV/C++.
Matlab raw values (figure not shown).
Matlab result after M0 = smooth3(M0,'gaussian') (figure not shown).
Gaussian filters are separable, so you apply a 1D filter along each dimension in turn:
for (dim = 0; dim < D; dim++)
    tensor = gaussian_filter(tensor, dim);
I would recommend OpenCV for an implementation of a Gaussian filter (and image processing in general) in C++.
Note that this assumes that your pyramid levels are all of the same size.
You can have your own functions that sample your scale-space pyramid on the fly while convolving the third dimension, but if you have enough memory I believe that it would be faster to scale up your coarser level to have the same size of the finest level.
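To make the separable idea concrete, a short Python/SciPy sketch (assuming the four same-size levels are stacked into one 3D array; the same logic ports directly to C++):

import numpy as np
from scipy.ndimage import gaussian_filter1d

# votes: (H, W, S) array, S = 4 scales, each level already smoothed in 2D
votes = np.random.rand(128, 128, 4)

# Smooth across the scale axis only, with the paper's sigma = 1
smoothed = gaussian_filter1d(votes, sigma=1.0, axis=2)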
Long ago (in 2008-2009) I developed a small C++ template library to apply some simple transformations and convolution filters. The library's source can be found in the Linderdaum Engine; it has nothing to do with the rest of the engine and does not use any of the engine's features. The license is MIT, so do whatever you want with it.
Take a look at the Linderdaum source code (http://www.linderdaum.com) under Src/Linderdaum/Images/VolumeLib.*
The function to prepare the kernel is PrepareGaussianFilter(), and MakeScalarVolumeConvolution() applies the filter. It is easy to adapt the library to different data sources because the I/O is implemented using callback functions.
