I am doing binary classification using the classifiers from scikit-learn. I would ideally like to optimize the AUC directly rather than use cross-entropy or log loss as a proxy. Are there any classifiers that can do this? I would happily move to another library if that would help.
I have a dataset on which I need to apply binary classification to predict a target value. I applied five algorithms to the training set: Logistic Regression, Naive Bayes, KNN, SVM, and Decision Trees. Logistic Regression gives me the highest accuracy, but I did not preprocess my dataset. Should I train all five algorithms again after preprocessing, or is it certain that Logistic Regression will still give the highest accuracy on the preprocessed training data?
No one can tell you whether the results will be different after pre-processing, since we don't know what pre-processing you will be doing. But best practice is to pre-process the dataset uniformly so that all algorithms are trained on the same data.
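A minimal sketch (not from the answer) of what uniform pre-processing could look like with scikit-learn Pipelines; X_train, y_train and the choice of StandardScaler are illustrative assumptions:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "naive_bayes": GaussianNB(),
        "knn": KNeighborsClassifier(),
        "svm": SVC(),
        "decision_tree": DecisionTreeClassifier(),
    }

    for name, clf in models.items():
        # The same pre-processing step sits inside every pipeline,
        # so all five algorithms see identically transformed data.
        pipe = make_pipeline(StandardScaler(), clf)
        scores = cross_val_score(pipe, X_train, y_train, cv=5, scoring="accuracy")
        print(name, scores.mean())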
Being new to deep learning, I am struggling to understand the difference between different state-of-the-art algorithms and their uses. For example, how are ResNet or VGG different from YOLO or the R-CNN family? Are they sub-components of these detection models? Also, are SSDs another family like YOLO or R-CNN?
ResNet is a family of neural networks (built on residual functions). A lot of neural networks use the ResNet architecture, for example:
ResNet18, ResNet50
Wide ResNet50
ResNeSt
and many more...
It is commonly used as a backbone (also called an encoder or feature extractor) for image classification, object detection, object segmentation and many more tasks.
There are other families of nets like VGG, EfficientNet, etc.
Faster R-CNN/R-CNN, YOLO and SSD are more like "pipelines" for object detection. For example, Faster R-CNN uses a backbone for feature extraction (like ResNet50) and a second network called an RPN (Region Proposal Network).
Take a look at this article, which presents the most common "pipelines" for object detection.
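As a minimal illustration of the backbone idea (this sketch and the Keras API choice are my own, not part of the answer), ResNet50 can be used purely as a feature extractor by dropping its classification head:

    import numpy as np
    import tensorflow as tf

    # ResNet50 without its classification head acts as a backbone / feature extractor.
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", input_shape=(224, 224, 3))

    # A detection pipeline such as Faster R-CNN would pass these feature maps to
    # further heads (e.g. an RPN); here we only show the backbone producing features.
    dummy_images = np.random.rand(1, 224, 224, 3).astype("float32")
    features = backbone(dummy_images)
    print(features.shape)  # (1, 7, 7, 2048) feature map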
I've been training a convolutional network with 7 learnable layers for about 20 hours. What are some general ways to tell if the network has converged or still needs training?
Here's a histogram of the parameters of the first convolutional layer:
Here's a graph of the loss and accuracy of the training and test set:
As long as the scores keep improving on both the training and test sets, you are on the right track and still moving toward a local/global minimum. When the curves change direction (training loss keeps going down while test loss goes up) or both scores stagnate, it is time to stop.
BUT
Be careful when accuracy is your evaluation metric: the model can behave anomalously, for example predicting the most frequent class for every input while accuracy still looks good. This problem can be exposed by using another evaluation metric such as F1 or log loss, which will reveal such issues during training.
Also, for imbalanced data you can use strategies that counteract the negative effects of the imbalance, such as class weights in softmax cross-entropy in TensorFlow; you can find an implementation there.
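A minimal TensorFlow sketch of class-weighted softmax cross-entropy (the weight values and tensor shapes are illustrative assumptions):

    import tensorflow as tf

    # Hypothetical class weights: up-weight the rare class.
    class_weights = tf.constant([1.0, 5.0])  # [majority class, minority class]

    def weighted_softmax_cross_entropy(labels_onehot, logits):
        # Per-example weight is the weight of that example's true class.
        per_example_w = tf.reduce_sum(labels_onehot * class_weights, axis=-1)
        ce = tf.nn.softmax_cross_entropy_with_logits(labels=labels_onehot, logits=logits)
        return tf.reduce_mean(per_example_w * ce)

    # Example usage with a batch of 3 examples and 2 classes.
    labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
    logits = tf.constant([[2.0, -1.0], [0.5, 0.3], [-0.2, 1.1]])
    print(weighted_softmax_cross_entropy(labels, logits))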
Can anyone please tell me how to save the trained parameters of a convolutional net so that it can predict future, unseen images?
In neural nets, we can save the parameters (weights and biases) and then run the forward-prop function using these saved parameters to predict. But in conv nets, how do we do it, given that we are not defining a lot of the parameters ourselves but TensorFlow is defining them for us?
Thanks
Convolutional networks are just another type of neural network. Even in "normal" neural networks, one doesn't typically specify weights and biases manually. Rather, they are learned through training (e.g., via backpropagation), then typically saved to a file for later use.
TensorFlow is not defining the weights and biases of your CNN for you. You are either learning them using TensorFlow or loading them from a file. If you want to save your trained TensorFlow model, the process is explained in the TensorFlow documentation.
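As a minimal sketch of that process using tf.keras (the tiny model and file name are illustrative assumptions, not your actual network):

    import tensorflow as tf

    # A tiny CNN; the learned convolution kernels and biases live inside these layers.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # ... train with model.fit(...) ...

    model.save("my_cnn.h5")  # saves the architecture and all learned weights/biases

    # Later, for unseen images: restore the saved parameters and run forward prop.
    restored = tf.keras.models.load_model("my_cnn.h5")
    # predictions = restored.predict(new_images)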
In almost every case I come across, GPUs are involved when executing any part of deep learning. Why is that?
This has to do with GPU architecture versus CPU architecture. It turns out that gaming requires a lot of matrix multiplications, so GPU architecture was optimized for these types of operations; specifically, GPUs are optimized for high-rate floating-point arithmetic. More on this here.
It so happens that neural networks are mostly matrix multiplications.
For example:
y = sigma(W_o · sigma(W_h · x + b_h) + b_o)
is the mathematical formulation of a simple neural network with one hidden layer. W_h is a matrix of weights that multiplies your input x, to which we add a bias b_h. The linear expression W_h·x + b_h can be computed as a single matrix multiplication. The sigma is a nonlinear activation like the sigmoid. The outer sigma wraps yet another matrix multiplication, the output weights W_o applied to the hidden activations (plus a bias b_o). Hence GPUs.
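A minimal NumPy sketch of that forward pass (layer sizes and the use of the sigmoid are illustrative assumptions), showing that it reduces to matrix multiplications:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative sizes: 4 inputs, 8 hidden units, 3 outputs.
    x = np.random.rand(4)
    W_h, b_h = np.random.rand(8, 4), np.random.rand(8)
    W_o, b_o = np.random.rand(3, 8), np.random.rand(3)

    h = sigmoid(W_h @ x + b_h)  # hidden layer: a matrix multiplication plus bias
    y = sigmoid(W_o @ h + b_o)  # output layer: another matrix multiplication plus bias
    print(y)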