I'm confused about how I should implement the "feature extraction" step.
I want to use SVMs for object recognition in images.
There's a sample among Emgu's examples that ships with an XML file containing the features of a cat,
and I've been trying for a week to figure out how they did it and what methods they used.
I came across this page, which lists the steps:
http://experienceopencv.blogspot.com/2011/02/learning-deformable-models-with-latent.html
but it's complicated and I couldn't reproduce it myself.
I'm lost! Can anyone suggest an appropriate feature extraction method that is compatible with SVM learning?
Accord has an SVM example, but it's about handwriting and doesn't deal with color images.
Any helpful links?
Thanks.
All feature extraction methods are compatible with SVMs; you just need to choose one. Pick one, extract the features, and feed those features into the SVM. An explanation of what feature extraction is can be found here: http://en.wikipedia.org/wiki/Feature_extraction
In particular, concentrate on the Gabor filter, which is an advanced extractor used for face recognition and object recognition.
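As a rough illustration of that pipeline, here is a minimal sketch using OpenCV's Gabor kernels and a scikit-learn SVM. The `images`, `labels`, and `test_image` names are hypothetical placeholders for your own data, and the kernel parameters are just example values:

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def gabor_features(gray):
    # Filter the image with a small bank of Gabor kernels at 4 orientations
    # and summarize each filter response by its mean and standard deviation.
    feats = []
    for theta in np.arange(0, np.pi, np.pi / 4):
        kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5, 0, ktype=cv2.CV_32F)
        response = cv2.filter2D(gray, cv2.CV_32F, kernel)
        feats += [response.mean(), response.std()]
    return feats

# images: list of grayscale training images; labels: one class id per image (placeholders)
X = np.array([gabor_features(img) for img in images])
clf = SVC(kernel="rbf").fit(X, labels)
prediction = clf.predict([gabor_features(test_image)])
```

Any other descriptor (HOG, SIFT bag-of-words, etc.) slots into the same structure: compute a fixed-length vector per image, then train the SVM on those vectors.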
I am planning to do some classification/segmentation on whole-slide images. Since the images are huge, I was wondering what methods can be applied to process them. So far I've come across techniques that split the image into multiple parts, process those parts, and combine the results. I would like to know whether that is a good approach and what better alternatives exist. Any reference to existing literature would be of great help.
pyvips has a feature for generating patches from slide images efficiently.
This benchmark shows how it works. It can generate about 25,000 64x64 patches a second in the 8 basic orientations from an SVS file:
https://github.com/libvips/pyvips/issues/100#issuecomment-493960943
It's handy for training. I don't know how that compares to the other patch generation systems people use.
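For a rough idea of what tiling with pyvips looks like (a minimal sketch, not the benchmark's exact code; the filename is a placeholder, and a real pipeline would sample or stream rather than touch every tile):

```python
import pyvips

# pyvips delegates SVS reading to OpenSlide when it is available
slide = pyvips.Image.new_from_file("slide.svs")

patch_size = 64
for y in range(0, slide.height - patch_size + 1, patch_size):
    for x in range(0, slide.width - patch_size + 1, patch_size):
        patch = slide.crop(x, y, patch_size, patch_size)
        # e.g. patch.write_to_file(f"patch_{x}_{y}.png")
```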
To read these images, the standard library is OpenSlide [https://openslide.org/api/python/]. With OpenSlide you can read, e.g., patches or thumbnails.
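For example (a minimal sketch; the path and coordinates are placeholders):

```python
import openslide

slide = openslide.OpenSlide("slide.svs")

# Read a 512x512 patch at pyramid level 0; the location is given
# in level-0 coordinates, and read_region returns an RGBA PIL image.
patch = slide.read_region((10000, 10000), 0, (512, 512))

# Read a small thumbnail of the whole slide as a PIL image.
thumb = slide.get_thumbnail((1024, 1024))
```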
For basic image processing operations like filtering, libvips and its Python binding pyvips are quick and convenient to use [https://libvips.github.io/pyvips/vimage.html].
If you need to pass data (like random patches) to a machine learning model, I would personally suggest "PyDmed". When training, e.g., a classifier or a generative model, the loading speed of "PyDmed" is suitable for feeding batches of data to GPU(s).
Here is the link to PyDmed public repo:
https://github.com/amirakbarnejad/PyDmed
Here is the link to PyDmed quick start:
https://amirakbarnejad.github.io/Tutorial/tutorial_section1.html
As akbarnejad mentioned, my preference is to use openslide.
I usually end up writing bespoke dataloaders to feed into PyTorch models. They use openslide to first do some simple segmentation, thresholding a low-resolution (thumbnail) image of the slide to get patch coordinates, and then pull out the relevant tissue patches to feed into the training model, roughly as in the sketch below.
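A stripped-down sketch of that idea (the threshold, sizes, and filename are hypothetical and would need tuning per stain/scanner):

```python
import numpy as np
import openslide

slide = openslide.OpenSlide("slide.svs")

# Threshold a grayscale thumbnail: tissue is usually darker than the white background.
thumb = np.asarray(slide.get_thumbnail((512, 512)).convert("L"))
tissue = np.argwhere(thumb < 200)  # (row, col) coordinates; 200 is a made-up threshold

# Map thumbnail coordinates back to level-0 slide coordinates.
scale_x = slide.dimensions[0] / thumb.shape[1]
scale_y = slide.dimensions[1] / thumb.shape[0]

# Sample a few tissue locations and pull out full-resolution patches.
rng = np.random.default_rng(0)
for row, col in tissue[rng.choice(len(tissue), size=8, replace=False)]:
    patch = slide.read_region((int(col * scale_x), int(row * scale_y)), 0, (256, 256))
```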
There are a few good examples of this, and tools that try to make it simpler, for both PyTorch and Keras:
PyTorch: wsi-preprocessing
Keras: Deeplearning-digital-pathology
Both: deep-openslide
I have a class whose features differ only slightly from another class's:
e.g., this image has a buckle in it (consider that one class): https://6c819239693cc4960b69-cc9b957bf963b53239339d3141093094.ssl.cf3.rackcdn.com/1000006329245-822018-Black-Black-1000006329245-822018_01-345.jpg
but this image is quite similar to it and has no buckle:
https://sc01.alicdn.com/kf/HTB1ASpYSVXXXXbdXpXXq6xXFXXXR/latest-modern-classic-chappal-slippers-for-men.jpg
I am a little confused about which model to use in these kinds of cases, one that actually learns fine-grained, pixel-level differences.
Any thoughts would be appreciated.
Thanks!
I have already tried Inception, ResNet, and similar models.
With a small amount of training data (around 300-400 images per class), can we reach a good recall/precision/F1 score?
You might want to look into transfer learning due to the small dataset. You can use a transferred ResNet model as a feature extractor and try a YOLO (You Only Look Once) algorithm on top of it, looking through each window (see sliding-window implementations using ConvNets) for a belt buckle, and classifying the image based on that.
Based on my understanding of your dataset, to follow the above approach you will need to re-annotate your dataset as per the requirements of the YOLO algorithm.
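For the transfer-learning part alone, a minimal PyTorch sketch of using a pretrained ResNet as a frozen feature extractor with a new two-class head (buckle / no buckle) might look like this; the YOLO re-annotation and detection side is a separate step:

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Start from an ImageNet-pretrained ResNet-50.
model = models.resnet50(pretrained=True)

# Freeze the backbone so only the new head is trained on the small dataset.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a two-class head.
model.fc = nn.Linear(model.fc.in_features, 2)

# Optimize only the new head's parameters.
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```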
To look at an example of the above approach, visit https://mc.ai/implementing-yolo-using-resnet-as-feature-extractor/
Edit: If you have an XML-annotated dataset and need to convert it to CSV to follow the above example, use https://github.com/datitran/raccoon_dataset
Happy modelling.
Problem:
I have a "face" images database of multiple persons, in which for each person I have multiple images(each have something different in it in terms of facial expression like smiling, thinking, simple etc).
While testing, I am having a testing data set of "smiling face image" of persons for whom image already exist in database but images in database and test data set are not exactly same (i.e. two images of same person smiling at different time, out of which one is in database and other is in test data set).
Now, the problem is my application detects the person correctly but in facial expressions it mis-matches ex.: in place of "smiling face" sometimes it gives "simple face".
PS: Efficiency in terms of finding exact person is 100% but facial expression mis-match is a problem.
The algorithm I am using:
Image Normalization and enhancement
SURF Feature Detection and matching
Can anyone suggest what may have gone wrong, or a better algorithm/approach for this problem?
Is there a better algorithm than SURF for comparing two images?
I would use other face recognition algorithms, for example LBP + SVM (see the sketch after the links below).
You can use face-rec.org to read about face recognition algorithms, or the results page of the "Labeled Faces in the Wild" benchmark:
http://vis-www.cs.umass.edu/lfw/results.html
If you're using OpenCV, you can check out OpenCV's module for face recognition:
http://docs.opencv.org/trunk/modules/contrib/doc/facerec/
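Here is a minimal sketch of the LBP + SVM idea, assuming scikit-image and scikit-learn, with hypothetical `train_faces` / `train_labels` arrays of aligned grayscale face crops and their expression labels:

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray, points=8, radius=1):
    # Uniform LBP over the whole face, summarized as a normalized histogram.
    # (A stronger variant computes histograms over a grid of cells and concatenates them.)
    lbp = local_binary_pattern(gray, points, radius, method="uniform")
    hist, _ = np.histogram(lbp.ravel(), bins=points + 2, range=(0, points + 2))
    return hist / hist.sum()

X = np.array([lbp_histogram(face) for face in train_faces])
clf = SVC(kernel="linear").fit(X, train_labels)
```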
I am currently working on a project where I have to extract the facial expression of a user (only one user at a time, from a webcam), like sad or happy.
The best possibility I found so far:
I used OpenCV for face detection.
Some users on an OpenCV board suggested looking at AAMs (Active Appearance Models) and ASMs (Active Shape Models), but all I found were papers.
So I'm using Active Shape Models via Stasm, which gives me access to 77 different points within the face, as in the picture.
Now I want to know:
What is the best learning method to use on the Cohn-Kanade database to classify the emotions (happy, ...)?
What is the best method to classify facial expressions in a video in real time?
Look here for a similar solution, with a video and a description of the algorithm: http://www2.isr.uc.pt/~pedromartins/ , under "Identity and Expression Recognition on Low Dimensional Manifolds" (2009).
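For the Cohn-Kanade question above, one common baseline (a sketch, assuming the Stasm points have already been normalized, e.g. by inter-ocular distance) is to flatten the 77 landmark coordinates and train an SVM; `landmarks` and `labels` are hypothetical names for your arrays:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# landmarks: (n_samples, 77, 2) array of Stasm points per face
# labels: one emotion class per sample (happy, sad, ...)
X = landmarks.reshape(len(landmarks), -1)
clf = SVC(kernel="rbf", gamma="scale")
print(cross_val_score(clf, X, labels, cv=5))  # rough accuracy estimate
```

For real-time video, the same classifier can be applied per frame to the points Stasm returns, optionally smoothing predictions over a few frames.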
I am working with SVM-light and would like to use it to train a classifier for object detection. I figured out the syntax to start a training run:
svm_learn example2/train_induction.dat example2/model
My problem: how can I build "train_induction.dat" from a set of positive and negative pictures?
There are two parts to this question:
What feature representation should I use for object detection in images with SVMs?
How do I create an SVM-light data file with (whatever feature representation)?
For an intro to the first question, see Wikipedia's outline. Bag-of-words models based on SIFT, or sometimes SURF or HOG features, are fairly standard.
For the second, it depends a lot on what language / libraries you want to use. The features can be extracted from the images using something like OpenCV, vlfeat, or many others. You can then convert those features to the SVM-light format as described on the SVM-light homepage (no anchors on that page; search for "The input file").
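As a rough Python sketch of the whole conversion, assuming scikit-image for HOG extraction (the filenames are placeholders, and the +1/-1 labels and 1-based sparse `index:value` lines follow the SVM-light input format):

```python
from skimage.feature import hog
from skimage.io import imread

def to_svmlight_line(label, features):
    # SVM-light format: <label> <index>:<value> ... with 1-based feature indices.
    pairs = " ".join(f"{i + 1}:{v:.6f}" for i, v in enumerate(features) if v != 0)
    return f"{label} {pairs}"

with open("example2/train_induction.dat", "w") as f:
    for path, label in [("pos1.png", 1), ("neg1.png", -1)]:  # hypothetical file list
        # Images should all be the same size so the HOG vectors have equal length.
        image = imread(path, as_gray=True)
        features = hog(image, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        f.write(to_svmlight_line(label, features) + "\n")
```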
If you update with what language and library you want to use, we can give more specific advice.