I am trying to implement the OpenCV LBPHFaceRecognizer() and make it work for the images of digits from the MNIST dataset. These images are 28 x 28 px and look like this:
But for this task I need an haarcascade.xml file which is able to recognize digits. In the OpenCV package I only find xml files which are suited for facerecognition and russian plate numbers.
Here is my code, I basicly just need to replace the cascadePath = "haarcascade_frontalface_default.xml" with an apropriate xml for digits, but where do I get one?
All in all I want to test facerecognition with numbers instead of faces. So an input image where a "1" is shown should be able to recognize all other "1"`s in the dataset.
For this, you need to train a cascade. Here two link to explain how to do this :
1 This is the Opencv documentation for opencv_traincascade which is the opencv app to train cascade (generate .xml)
2 This is a useful tutorial to train cascade with opencv. It explains what to do and give some tricks to generate input file.
Related
I want to build a database with Yolo and this is my first time working with deep learning
how can I build a database for Yolo and train it?
How do I get the weights of the classifications?
Is it too difficult for someone new to Deep Learning?
Yes you can do it with ease!! and welcome to the Deep learning Community. You are welcome.
First download the darknet folder from Link
Go inside the folder and type make in command prompt
git clone https://github.com/pjreddie/darknet
cd darknet
make
Define these files -
data/custom.names
data/images
data/train.txt
data/test.txt
Now its time to label the images using LabelImg and save it in YOLO format which will generate corresponding label .txt files for the images dataset.
Labels of our objects should be saved in data/custom.names.
Using the script you can split the dataset into train and test-
import glob, os
dataset_path = '/media/subham/Data1/deep_learning/usecase/yolov3/images'
# Percentage of images to be used for the test set
percentage_test = 20
# Create and/or truncate train.txt and test.txt
file_train = open('train.txt', 'w')
file_test = open('test.txt', 'w')
# Populate train.txt and test.txt
counter = 1
index_test = round(100 / percentage_test)
for pathAndFilename in glob.iglob(os.path.join(dataset_path, "*.jpg")):
title, ext = os.path.splitext(os.path.basename(pathAndFilename))
if counter == index_test+1:
counter = 1
file_test.write(dataset_path + "/" + title + '.jpg' + "\n")
else:
file_train.write(dataset_path + "/" + title + '.jpg' + "\n")
counter = counter + 1
For train our object detector we can use the existing pre trained weights that are already trained on huge data sets. From here we can download the pre trained weights to the root directory.
Create a yolo-custom.data file in the custom_data directory which should contain information regarding the train and test data sets
classes=2
train=custom_data/train.txt
valid=custom_data/test.txt
names=custom_data/custom.names
backup=backup/
Now we have to make changes in our yolov3.cfg for training our model. For two classes. Based on the required performance we can select the YOLOv3 configuration file. For this example we will be using yolov3.cfg. We can duplicate the file from cfg/yolov3.cfg to custom_data/cfg/yolov3-custom.cfg
The maximum number of iterations for which our network should be trained is set with the param max_batches=4000. Also update steps=3200,3600 which is 80%, 90% of max_batches.
We will need to update the classes and filters params of [yolo] and [convolutional] layers that are just before the [yolo] layers.
In this example since we have a single class (tesla) we will update the classes param in the [yolo] layers to 1 at line numbers: 610, 696, 783
Similarly we will need to update the filters param based on the classes count filters=(classes + 5) * 3. For two classes we should set filters=21 at line numbers: 603, 689, 776
All the configuration changes are made to custom_data/cfg/yolov3-custom.cfg
Now, we have defined all the necessary items for training the YOLOv3 model. To train-
./darknet detector train custom_data/detector.data custom_data/cfg/yolov3-custom.cfg darknet53.conv.74
Also you can mark bounded boxes of objects in images for training Yolo right in your web browser, just open url. This tool is deployed to GitHub Pages.
Use this popular forked darknet repository https://github.com/AlexeyAB/darknet. The author describes many steps that will help you to build and use your own Yolo detector model.
How to build your own custom dataset and train it? Follow this step . He suggests to use Yolo Mark labeling tool to build your dataset, but you can also try another tool as described in here and here.
How to get the weights? The weights will be stored in darknet/backup/ directory after every 1000 iterations (you can adjust this value later). The link above explains everything about how to make and use the weights file.
I don't think it will be so difficult if you already know math, statistic and programming. Learning the basic neural network like perceptron, MLP then move to modern Machine Learning is a good start. Then you might want to expand your knowledge to Computer Vision related or NLP related area
Depending on what kind of OS you have. You can either hit up https://github.com/AlexeyAB/darknet [especially for Windows] or stick to https://github.com/pjreddie/darknet.
Steps to do so:
1) Setup darknet as detailed in the posts.
2) I used LabelIMG to label my images. make sure that the format
you save the images is in YOLO. If you save using the PascalVOC format or others you can write scripts to change it to the format that darknet expects.[YOLO]. Also, make sure that you do not change your labels file. If you want to add new labels, at it at the end of the file, not in between. YOLO format is quite different, so your previously labelled images may get messed up if you make changes in between the classes.
3)The weights will be generated as you train your model in a specific folder in darknet.[If you need more details I am happy to help answer that]. You can download the .74 file in YOLO and start training. The input to train needs a built darknet.exe a cfg file a .74 file and your training data location/access.
The setup is draconian, the process itself is not.
To build your own dataset, you should use LabelImg. It's a free and very easy software which will produce for you all the files you need to build a dataset. In fact, because you are working with yolo, you need a txt file for each of your image which will contain important information like bbox coordinates, label name. All these txt files are automatically produced by LabelImg so all you have to do is open the directory which contains all your images with LabelImg, and start the labelisation. Then, well you will have all your txt files, you will also need to create some other files in order to start training (see https://blog.francium.tech/custom-object-training-and-detection-with-yolov3-darknet-and-opencv-41542f2ff44e).
I have an image with 8 channels.I have a conventional algorithm where weights are added to each of these channels to get an output as '0' or '1'.This works fine with several samples and complex scenarios. I would like implement the same in Machine Learning using CNN method.
I am new to ML and started looking out the tutorials which seem to be exclusively dealing with image processing problems- Hand writing recognition,Feature extraction etc.
http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/neural_networks.html
I have setup the Keras with Theano as background.Basic Keras samples are working without problem.
What steps do I require to follow in order achieve the same result using CNN ? I do not comprehend the use of filters,kernels,stride in my use case.How do we provide Training data to Keras if the pixel channel values and output are in the below form?
Pixel#1 f(C1,C2...C8)=1
Pixel#2 f(C1,C2...C8)=1
Pixel#3 f(C1,C2...C8)=0 .
.
Pixel#N f(C1,C2...C8)=1
I think you should treat this the same way you use CNN to do semantic segmentation. For an example look at
https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf
You can use the same architecture has they are using but for the first layer instead of using filters for 3 channels use filters for 8 channels.
For the loss function you can use the same loos function or something that is more specific for binary loss.
There are several implementation for keras but with tensorflow
backend
https://github.com/JihongJu/keras-fcn
https://github.com/aurora95/Keras-FCN
Since the input is in the form of channel values,that too in sequence.I would suggest you to use Convolution1D. Here,you are taking each pixel's channel values as the input and you need to predict for each pixel.Try this
eg :
Conv1D(filters, kernel_size, strides=1, padding='valid')
Conv1D()
MaxPooling1D(pool_size)
......
(Add many layers as you want)
......
Dense(1)
use binary_crossentropy as the loss function.
I want to know that how can i train cascade classifier to detect only eyelashes or nose feature points in DLIB and [OPENCV][2]#
To be more clear i just want to extract some particular feature points to text file.
i tried extracting features but to no avail it gives all 68 points.
[2]: http://opencv.org/#I want to know that how can i train cascade classifier to detect only eyelashes or nose feature points in [A][1] and [B][2]#
1. To be more clear i just want to extract some particular feature points to text file.
2. i tried extracting features but to no avail it gives all 68 points.
For Dlib python api starting point should be this sample http://dlib.net/face_landmark_detection.py.html
As you see - it has face detection and shape prediction:
dets = detector(img, 1)
...
shape = predictor(img, d)
The shape object contain face shape as a list of feature point coordinates - parts. Each part is one point, for example shape.part(30) is a tip of nose. You can see their numbers on sample pictures from this blog
As I understand, you need simply save this points into file, that can be done like this:
with open("sample_file.txt", "w") as f:
for i in range(30, 32):
f.write("{};{}\n".format(i, shape.part(i)))
Where 30-32 are part numbers that you want to write to file
here is the deal.
I am trying to make an SVM based POS tagger.
The feature vectors for the SVM was created with the help of format converters.
Now here is a screenshot of the training file that I am using.
http://tinypic.com/r/n4fn2r/8
I have 25 labels for various POS tags. when i use the java implementation or the command line tools for prediction i get the following results.
http://tinypic.com/r/2dtw5ky/8
I have tried with all the kernels available but it gave more or less the same results.
This is happening even when the training file is used as the testing file.
please help me out here..!!
P.S. I cannot share more than two links. Thus here is a snippet of the model file
svm_type c_svc
kernel_type rbf
gamma 0.000548546
nr_class 25
total_sv 431
rho -0.929467 1.01073 1.0531 1.03472 1.01585 0.953263 1.03027 -0.921365 0.984535 1.02796 1.01266 1.03374 0.949463 0.977925 0.986551 -0.920912 0.940926 -0.955562 0.975386 -0.981959 -0.884042 0.0516955 -0.980884 -0.966095 0.995091 1.023 1.01489 1.00308 0.948314 1.01137 -0.845876 0.968034 1.0076 1.00064 1.01335 0.942633 0.965703 0.979212 -0.861236 0.935055 -0.91739 0.970223 -0.97103 0.0743777 0.970321 -0.971215 -0.931582 0.972377 0.958193 0.931253 0.825797 0.954894 -0.972884 -0.941726 0.945077 0.922366 0.953999 -1.00503 0.840985 0.882229 -0.961742 0.791631 -0.984971 0.855911 -0.991528 -0.951211 -0.962096 -0.99213 -0.99708 -0.957557 -0.308987 -0.455442 -0.94881 -0.995319 -0.974945 -0.964637 -0.902152 -0.955258 -1.05287 -1.00614 -0.
update
Just trained the SVM with svm type as c-SVC and kernel type as linear. Which gave a non-zero(although very poor) accuracy.
As mentioned by #Pedrom, parameter choice is absolutely crucial when training SVMs. I suggest you have a look at this practical guide. Also, 431 words is nowhere near enough to train a 25-class model. You will definitely need more data.
That said, 0% accuracy is indeed odd. Can you please show us the commands you are using to train and evaluate the model?
I haven't found any method to train new latent svm detector models using openCV. I'm currently using the existing models given in the xml files, but I would like to train my own.
Is there any method for doing so?
Thank you,
Gil.
As of now only DPM-detection is implemented in OpenCV, not training.
If you want to train your own models, the most reliable approach is to use Felzenszwalb's and Girshick's matlab code (most of the heavy stuff is implemented in C) (http://www.cs.berkeley.edu/~rbg/latent/)(http://www.rossgirshick.info/latent/) It is reliable and works reasonably fast
If you want to do it in C-only, there is an implementation here (http://libccv.org/doc/doc-dpm/) that I haven't tried myself.
I think there is a function in the octave version of the author's code here
(Octave Version of DPM). It is in step #5,
mat2opencvxml('./INRIA/inriaperson_final.mat', 'inriaperson_cascade_cv.xml');
I will try it and let you know about the result.
EDIT
I tried to convert the .mat file from the octave version i mentioned before to .xml file, and compared the result with the built in opencv .xml model and the construction of the 2 xmls was different (tags, #components,..), it seems that this version of octave dpm generates xml files for later opencv version (i am using 2.4).
VOC-release3.1 is the one matches opencv2.4.14. I tried to convert the already trained model from this version using mat2xml function available in opencv and the result xml file is successfully loaded and working with opencv. Here are some helpful links:
mat2xml code
VOC-release-3.1
How To Train DPM on a New Object