Training models on Apple M1’s ANE?

Is it possible to train models on Apple M1’s Neural Engine (ANE)?
Is it possible to perform basic matrix multiplication on the ANE?

Related

feature extraction, selection, and classification concepts

I know that support vector machines, random forests, and logistic regression are well-known machine learning (ML) algorithms for classification.
I'm confused about the terminology of feature extraction, feature selection, and classification.
Are the above ML algorithms used for extracting features rather than for selecting them?
Do these ML algorithms include both the feature extraction and the classification process?
Does the result of training the ML algorithm (accuracy, specificity, sensitivity, ...) tell us the result of classifying a disease after the feature extraction?
Regarding your confusion about the three terminologies:
Feature extraction: when you want to create new features out of raw data (say you have a transaction_day column but you are only interested in the month, so you create a new column "transaction_month" out of "transaction_day").
Feature selection: you have many features but want to select only the important ones (how many of them is another topic to be studied). This can speed up learning and, with the right strategy, you would not sacrifice accuracy in many applications.
Classification: a family of supervised (labeled) machine learning tasks whose goal is to assign observations to known classes (for example, emails to the spam or normal class).
Note: some machine learning algorithms like Lasso have built-in feature selection, but for others, a large feature coefficient after training usually indicates the importance of that feature (read more about recursive feature elimination, RFE).
You may also find a good discussion in this post.
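To make the selection side concrete, here is a minimal scikit-learn sketch of both routes mentioned in the note above: built-in selection via an L1 (Lasso-style) penalty, and RFE. The dataset is synthetic and the model choices are illustrative, not prescriptive.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression

# Synthetic data: 20 features, only 5 of which are informative.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

# Built-in selection: an L1-penalized (Lasso-style) model drives the
# coefficients of uninformative features to exactly zero.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
selector = SelectFromModel(l1_model).fit(X, y)
print("kept by L1 penalty:", np.flatnonzero(selector.get_support()))

# Recursive feature elimination (RFE): repeatedly refit the model and
# drop the feature with the smallest coefficient until 5 remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)
print("kept by RFE:", np.flatnonzero(rfe.support_))
```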

CoreML on-device model training with tabular data

I'm trying to build an app that makes suggestions (distinct classes) based on a table with 4 features: latitude, longitude, time, and weekday.
The training data of my app is 100% personal, so it doesn't really make sense to pre-train the model. I want to be able to train on device. I know Core ML 3 supports updating neural networks and kNN classifiers, but does this really help me with my tabular data?
Other tabular classifiers like boosted trees, random forests, etc. unfortunately can't be trained on device. Are there alternatives to Core ML for on-device training of those simpler machine learning algorithms? Or can Core ML somehow already do what I want?
Unfortunately, I'm not really an expert in neural networks.
Just because Core ML doesn't provide something doesn't mean it's impossible. :-) You can use existing libraries or implement the algorithm yourself.
If you're looking to build a logistic regression classifier, this is fairly easy to implement by hand. (You can even use a neural network with a single layer for this and still use Core ML.)
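To illustrate "implement it by hand": below is a minimal logistic regression trained with plain gradient descent, written in Python/NumPy for readability (the same few lines port directly to Swift for on-device use). The feature layout mirrors the question, but the data, hyperparameters, and helper names are made up for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=1000):
    """Binary logistic regression via batch gradient descent.
    X: (n_samples, n_features), y: (n_samples,) of 0/1 labels.
    A multi-class version would swap the sigmoid for a softmax."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)   # log-loss gradient w.r.t. w
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical rows: latitude, longitude, hour of day, weekday index.
X = np.array([[37.77, -122.42,  9.0, 1],
              [37.78, -122.41, 18.5, 5]])
y = np.array([0, 1])

X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)  # standardize features
w, b = train_logistic(X, y)
print(sigmoid(X @ w + b))  # class probabilities for the training rows
```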

Why do we use metric learning when we can classify

So far, I have read some highly cited metric learning papers. The general idea of such papers is to learn a mapping such that mapped data points with the same label lie close to each other and far from samples of other classes. To evaluate such techniques they report the accuracy of a KNN classifier on the generated embedding. So my question is: if we have a labelled dataset and we are interested in increasing the accuracy of a classification task, why don't we learn a classifier on the original data points? I mean, instead of finding a new embedding which suits a KNN classifier, we can learn a classifier that fits the (non-embedded) data points. Based on what I have read so far, the classification accuracy of such classifiers is much better than that of metric learning approaches. Is there a study that shows metric learning + KNN performs better than fitting a (good) classifier, at least on some datasets?
Metric learning models CAN BE classifiers. So I will answer the question of why we need metric learning for classification.
Let me give you an example. Suppose you have a dataset with millions of classes and some classes have only a handful of examples, say fewer than 5. If you use classifiers such as SVMs or ordinary CNNs, you will find it impossible to train, because those classifiers (discriminative models) will totally ignore the classes with few examples.
But for metric learning models, this is not a problem, since they are based on generative models.
By the way, the large number of classes is itself a challenge for discriminative models.
Such real-life challenges inspire us to explore better models.
As @Tengerye mentioned, you can use models trained with metric learning for classification. KNN is the simplest approach, but you can also take the embeddings of your data and train another classifier on them, be it KNN, SVM, a neural network, etc. The use of metric learning in this case is to map the original input space to another one that is easier for a classifier to handle.
Apart from discriminative models being hard to train when the data is unbalanced, or even worse, has very few examples per class, they cannot easily be extended to new classes.
Take facial recognition, for example: if facial recognition models are trained as classification models, they only work for the faces they have seen and won't work for any new face. Of course, you could add images of the faces you wish to recognize and retrain or fine-tune the model if possible, but this is highly impractical. On the other hand, facial recognition models trained with metric learning can generate embeddings for new faces, which can easily be added to the KNN index, and your system can then identify the new person given his/her image.
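Here is a minimal sketch of that "embed, then classify" pattern. The embed() function is a hypothetical stand-in for a trained metric-learning model (just a fixed random projection here); the classifier on top is an ordinary scikit-learn KNN, and all data is synthetic.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def embed(faces):
    # Stand-in for a metric-learning model (e.g. a CNN trained with a
    # triplet or contrastive loss); here just a fixed 512->128 random
    # projection so the example runs end to end.
    rng = np.random.default_rng(0)  # fixed seed: same projection every call
    W = rng.standard_normal((faces.shape[1], 128))
    return faces @ W

# Gallery of known people: raw 512-d feature vectors plus identity labels.
gallery = np.random.default_rng(1).standard_normal((10, 512))
labels = ["alice"] * 5 + ["bob"] * 5

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(embed(gallery), labels)

# Enrolling a new person needs no retraining of the embedding model:
# embed their image(s), append to the gallery, and refit the cheap KNN.
query = np.random.default_rng(2).standard_normal((1, 512))
print(knn.predict(embed(query)))
```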

Confusion in machine learning concept for the object detection when using Aggregate Channel Features

I have one confusion in my mind regarding the machine learning concept for object detection.
The two main modules in object detection are proposal extraction and detection.
For the proposal extraction module:
I want to use Aggregate Channel Features (ACF) for proposal extraction. This algorithm needs training (positive and negative samples), and then we can do testing.
For the object detection module:
Let's say I am using a Convolutional Neural Network (CNN).
Now my question is: can I train the ACF first with 80% of the samples from the dataset, test its performance, and make it ready to put in the pipeline? Then I split the dataset again, let's say now choosing 40% for training the CNN architecture. This 40% of the dataset first goes through the trained ACF to extract proposals, and the CNN is then trained on these proposals according to their labels for object detection.
Is this approach correct?
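For what it's worth, here is a sketch of the two-stage split scheme the question describes. The train_acf, extract_proposals, and train_cnn functions are hypothetical placeholders; only the data flow is meant literally.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins; real ACF/CNN training is out of scope here.
def train_acf(samples): return "acf-model"
def extract_proposals(acf, samples): return [("roi", s) for s in samples]
def train_cnn(proposals): return "cnn-model"

dataset = list(range(1000))  # placeholder for the labeled image set

# Stage 1: 80/20 split to train and then test the ACF proposal extractor.
acf_train, acf_test = train_test_split(dataset, test_size=0.2, random_state=0)
acf = train_acf(acf_train)  # evaluate on acf_test before freezing it

# Stage 2: a fresh 40% split; the frozen ACF generates proposals, and
# the CNN is trained on those proposals according to their labels.
cnn_train, _ = train_test_split(dataset, train_size=0.4, random_state=1)
cnn = train_cnn(extract_proposals(acf, cnn_train))
```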

OpenCV Haar classifier - is it an SVM

I'm using an OpenCV Haar classifier in my work, but I keep reading conflicting reports on whether the OpenCV Haar classifier is an SVM or not. Can anyone clarify whether it uses an SVM? Also, if it does not use an SVM, what advantages does the Haar method offer over an SVM approach?
SVM and boosting (AdaBoost, GentleBoost, etc.) are feature classification strategies/algorithms. Support Vector Machines solve a complex optimization problem, often using kernel functions, which allow us to separate samples by working in a much higher-dimensional feature space. Boosting, on the other hand, is a strategy based on combining lots of "cheap" classifiers in a smart way, which leads to very fast classification. Those weak classifiers can even be SVMs.
Haar-like features are a kind of feature based on integral images and very suitable for computer vision problems.
That is, you can combine Haar features with either of the two classification schemes.
It isn't an SVM. Here is the documentation:
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#haar-feature-based-cascade-classifier-for-object-detection
It uses boosting (supporting AdaBoost and a variety of other similar methods -- all based on boosting).
The important difference is the speed of evaluation: cascade classifiers, with their stage-based boosting algorithms, allow very fast evaluation with high accuracy (and in particular support training with many negatives), at a better speed/accuracy balance point than an SVM for this particular application.
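For reference, a minimal sketch of running OpenCV's boosted (not SVM) Haar cascade from Python. It assumes an input image named people.jpg exists; the frontal-face cascade file ships with the opencv-python package.

```python
import cv2

# Load the pre-trained boosted Haar cascade bundled with opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("people.jpg")                 # assumed to exist
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # cascades work on grayscale

# detectMultiScale slides the cascade over the image at several scales;
# each boosting stage can reject a window early, which is what makes
# evaluation so fast compared to running a full classifier everywhere.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
```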
