I am new to the machine learning domain. We have some videos of a cattle farm, and all cattle are detected by YOLOv2.
At one frame per minute, we have to feed every detected cattle image to a model and determine whether each animal is lying or standing. The problem is much like cat vs. dog, only here it is lying vs. standing.
Can someone please suggest a computationally inexpensive image classifier to accomplish this objective?
I am planning to train the model with 200 images per class, and 70-80% accuracy is good enough at this moment.
Take a look at classifiers designed for mobile phones. They are usually quite fast and still decently accurate. The most common one is MobileNet, but there are newer variants nowadays. You do need to fine-tune the classifier, though. An example of how to do this with Keras can be found here: https://towardsdatascience.com/transfer-learning-using-mobilenet-and-keras-c75daf7ff299 .
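For concreteness, here is a minimal sketch of that fine-tuning approach, assuming a recent TensorFlow 2.x install; the directory names cattle_data/train and cattle_data/val are placeholders for your own data:

# Minimal MobileNetV2 fine-tuning sketch (assumes recent TensorFlow 2.x;
# the directory names are placeholders for your own dataset).
import tensorflow as tf

# Expects folders named after the two classes, e.g.
# cattle_data/train/lying and cattle_data/train/standing.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "cattle_data/train", image_size=(224, 224), batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "cattle_data/val", image_size=(224, 224), batch_size=32)

# Pretrained backbone without its 1000-class ImageNet head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the backbone; train only the new head

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # lying vs. standing
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)

With only 200 images per class, freezing the backbone like this is usually the safer choice; unfreezing the top few layers for a second, low-learning-rate pass can squeeze out a bit more accuracy if needed.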
If you want a computationally inexpensive image classifier, you should retrain the model: train a new model that is suitable for cattle prediction only.
That means you need to prepare two classes of images (lying cattle and standing cattle).
After preparing the images, build a new model with a 2-class classifier head, rather than the 80-class or 20-class classifier that YOLOv2 usually ships with. Since you want a computationally inexpensive model, you don't need the other classes in your classifier.
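As a concrete illustration of such a small 2-class network (not the asker's setup; the layer sizes here are illustrative and untuned), a from-scratch Keras model could look like this, assuming TensorFlow 2.6+:

# Small from-scratch CNN for a 2-class (lying/standing) problem.
# With only ~200 images per class, keep the network small and
# consider augmentation to limit overfitting.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # regularization for the small dataset
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])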
Context of my problem:
I'm performing hyperparameter tuning using GridSearchCV from scikit-learn on my random forest regressor. To alleviate overfitting, I found that maybe I should use a pruning technique. I checked the docs and found the ccp_alpha parameter, which refers to pruning; I also found an example about pruning in decision trees.
My question:
Since I'm looking for the best parameters of the random forest (GridSearchCV), how should I input the ccp_alpha value? Should I include it before or after the GridSearchCV, considering that every time I perform GridSearchCV the structure of the model changes? Do you have any references or articles?
My point of view:
To me it makes more sense to perform hyperparameter tuning first and then add ccp_alpha (pruning) before training and testing this "best model", but I'm not sure...
Since ccp_alpha is also a parameter to tune, it should be a part of your CV. Your other parameters depend on that too.
It is a regularization parameter (like lambda in Lasso/Ridge regression) thus a high value gives you very small trees.
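Concretely, that just means putting ccp_alpha into the same param_grid as your other hyperparameters, so every combination is cross-validated jointly; a minimal sketch (the grid values are illustrative, and X_train/y_train stand in for your own data):

# Tune ccp_alpha jointly with the other hyperparameters so every
# combination is evaluated under the same cross-validation.
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 10],
    "ccp_alpha": [0.0, 0.001, 0.01],  # 0.0 means no pruning
}
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X_train, y_train)  # X_train/y_train: your training data
print(search.best_params_)

Tuning ccp_alpha separately after the grid search would give you a pruning value chosen for a model structure the search never evaluated, which is exactly the interaction the joint grid avoids.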
I am working on building a custom facial recognition system for our office.
I am planning to use Google FaceNet.
Now, my question: you can find or build your own version of the FaceNet model in Keras or PyTorch, that's not an issue. But regarding creating the dataset, I want to know the best practices for capturing photos of a person when I don't have any prior photo of them; all I have is a camera and the person. Should I create variance by changing the lighting conditions, orientation, or face size?
A properly trained FaceNet model should already be somewhat invariant to lighting conditions, pose, and other features that should not be part of identifying a face. At least, that is what is claimed in a draft of the FaceNet paper. If you only intend to compare feature vectors generated by the network, and to recognize a small group of people, your own dataset likely does not have to be particularly large.
Personally, I have done something quite similar to what you are trying to achieve, for a group of around 100 people. The dataset consisted of one image per person, and I used a 1-NN (nearest-neighbour) classifier on the generated feature vectors. While I do not remember the exact results, it worked quite well. The pretrained network's architecture was different from FaceNet's, but the overall idea was the same.
The only way to truly answer your question though would be to experiment and see how well things work out in practice.
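To make the 1-NN idea concrete, here is a small sketch; embed() is a placeholder for whatever FaceNet implementation you end up with, and the distance threshold is an assumption you would tune on your own data:

# 1-nearest-neighbour matching of face embeddings.
# embed() is a placeholder for your FaceNet forward pass; it should
# map an aligned face image to a fixed-length feature vector.
import numpy as np

def identify(query_image, gallery, embed, threshold=1.0):
    """gallery: dict mapping person name -> reference embedding."""
    q = embed(query_image)
    names = list(gallery)
    dists = [np.linalg.norm(q - gallery[n]) for n in names]
    best = int(np.argmin(dists))
    # Reject matches that are too far away as "unknown".
    if dists[best] > threshold:
        return None, dists[best]
    return names[best], dists[best]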
I am exploring new architectures for LSTMs. I have looked into a few commonly used datasets, such as IMDB movie reviews and sine waves, but haven't found a good, generalizable dataset. If MNIST is the "hello world" of convolutional networks, then what would be the equivalent dataset for LSTMs?
You can look at tasks where people use simpler models, like HMMs, and try running an LSTM on them.
For example, you can try running the POS-tagging code (the pos_* part) from lazyprogrammer's course (there is a script that downloads and handles the data). That code contains models that use LSTMs in TensorFlow/Theano as well as HMMs (and even logistic regression, which does not take the sequential nature of the data into account).
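If you want something closer to a "hello world" you can run immediately, the IMDB dataset bundled with Keras is probably the most common LSTM starting point; a minimal sketch, assuming TensorFlow 2.x (hyperparameters are illustrative):

# A minimal LSTM baseline on the IMDB dataset bundled with Keras,
# often treated as the "hello world" of sequence classification.
import tensorflow as tf

vocab, maxlen = 10000, 200
(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.imdb.load_data(num_words=vocab)
x_tr = tf.keras.preprocessing.sequence.pad_sequences(x_tr, maxlen=maxlen)
x_te = tf.keras.preprocessing.sequence.pad_sequences(x_te, maxlen=maxlen)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab, 32),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_tr, y_tr, validation_data=(x_te, y_te),
          epochs=3, batch_size=64)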
Keras provides several pretrained models, as mentioned here:
https://keras.io/applications/
These applications are pretrained networks, like the following:
Xception
VGG16
VGG19
ResNet50
InceptionV3
MobileNet
I know that VGG16 and VGG19 are fairly old networks compared to the others. However, is there a simple way to find out which model is the strongest or has the most weights?
One can look at the number of layers by simply executing something like:
model = applications.ResNet50(...)
print(len(model.layers))
However, this does not give any information about the number of weights, or about the complexity (e.g. ResNet is residual, while VGG19 is not).
These models are implemented based on the corresponding original papers, which you can also find referenced in the Keras documentation.
For the detailed pros and cons of each model, you should read the papers. A newer model is not always better for every application.
For model size, you can see the number of weights in each layer by:
[w.size for w in model.get_weights()]
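If you just want one number per model, Keras also exposes model.count_params(); for example (weights=None avoids downloading anything just to count parameters):

# Compare total parameter counts without downloading any weights.
from tensorflow.keras import applications  # or: from keras import applications

for build in (applications.VGG16, applications.MobileNet,
              applications.ResNet50):
    m = build(weights=None)
    print(m.name, m.count_params())

Note that parameter count alone is not a proxy for strength: VGG16 has far more weights than ResNet50 yet is generally the weaker classifier.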
I want a human-body detection method to use on a quadcopter drone so it can track a person.
This problem depends on several factors:
Computational resources.
Quality of the images.
How much accuracy you expect from the algorithm.
That said, the easiest way to implement this is a cascade classifier, which is implemented in OpenCV. You can train your own model or use one of the trained models that ship with OpenCV. This method supports three feature types: HOG, LBP, and Haar. It is based on the paper Viola and Jones published in 2001. Detection runs at near real-time speed on an ordinary computer.
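For illustration, a minimal OpenCV sketch of that cascade approach, using the full-body Haar cascade bundled with the opencv-python package (the image path is a placeholder):

# Detect people with OpenCV's bundled full-body Haar cascade.
import cv2

# haarcascade_fullbody.xml ships with the opencv-python package.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_fullbody.xml")

img = cv2.imread("frame.jpg")  # placeholder path to a drone frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
bodies = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)

# Draw a box around each detected body and save the result.
for (x, y, w, h) in bodies:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("frame_detected.jpg", img)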
If you need a more accurate method, you can try DPM (deformable part models) based approaches. There are many released versions of this method on the internet. The detection speed is roughly 2 Hz.
If you need even more accuracy, I suggest going with CNNs (convolutional neural networks). Of course, you will need more computational resources (a GPU or high-spec CPUs).