In general, a more complicated neural network (say, an object-classification CNN with 128 layers) requires more resources (time, number of GPUs) to train than a less complicated one (for example, an object-classification CNN with 32 layers). I found a link with a very nice summary of different types of CNNs and the resources required to train them:
https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
However, after training is complete, when we're actually using these neural networks (say, an autonomous car using a trained CNN to help navigate), do more complicated, more accurate neural networks require more computational resources (CPU, memory, etc.) to run than less complicated, less accurate ones?
I'm asking a generic question; the neural networks are not limited to object classification and can also include networks for NLP or other areas.
If the answer is "it depends", can you provide some examples of more complicated, more accurate neural networks using more resources to run than less complicated/accurate ones?
There is a CVPR 2017 paper, Speed/accuracy trade-offs for modern convolutional object detectors (Huang et al.), that compares different feature extractors combined with several detection architectures the authors call "meta-architectures". They compare the resulting models in terms of speed, memory usage, and accuracy.
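As a rough illustration of the general point (a hypothetical sketch, assuming PyTorch and torchvision are installed), you can compare the parameter count and single-image inference time of a shallow and a deep ResNet; the deeper model needs more memory and more compute just to run:

    import time
    import torch
    import torchvision.models as models

    # Compare a shallow and a deep ResNet (standard torchvision model names).
    for name, ctor in [("resnet18", models.resnet18), ("resnet152", models.resnet152)]:
        model = ctor(weights=None).eval()  # architecture only, no pretrained weights
        n_params = sum(p.numel() for p in model.parameters())

        x = torch.randn(1, 3, 224, 224)  # one dummy RGB image
        with torch.no_grad():
            start = time.perf_counter()
            model(x)
            elapsed = time.perf_counter() - start

        print(f"{name}: {n_params / 1e6:.1f}M params, {elapsed * 1000:.0f} ms per image on CPU")

The deeper network reports several times more parameters and a noticeably slower forward pass, which is exactly the inference-time cost the question asks about.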
Being new to deep learning, I am struggling to understand the difference between different state-of-the-art algorithms and their uses. For example, how are ResNet or VGG different from YOLO or the R-CNN family? Are they subcomponents of these detection models? Also, are SSDs another family like YOLO or R-CNN?
ResNet is a family of neural networks (using residual functions). A lot of neural networks use a ResNet architecture, for example:
ResNet18, ResNet50
Wide ResNet50
ResNeSt
and many more...
It is commonly used as a backbone (also called an encoder or feature extractor) for image classification, object detection, object segmentation, and more.
There are other families of networks, like VGG, EfficientNet, etc.
Faster R-CNN/R-CNN, YOLO, and SSD are more like "pipelines" for object detection. For example, Faster R-CNN uses a backbone for feature extraction (like ResNet50) plus a second network called an RPN (Region Proposal Network).
Take a look at this article, which presents the most common "pipelines" for object detection.
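To make the backbone/pipeline split concrete, here is a minimal sketch (my assumption, using torchvision's detection models) showing Faster R-CNN as a pipeline with a ResNet50 feature extractor plugged in:

    import torch
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # The detection "pipeline": ResNet50 backbone + FPN + RPN + ROI heads.
    model = fasterrcnn_resnet50_fpn(weights=None).eval()

    print(type(model.backbone).__name__)  # the ResNet50-based feature extractor
    print(type(model.rpn).__name__)       # the Region Proposal Network

    with torch.no_grad():
        out = model([torch.rand(3, 480, 640)])  # list of images in, list of detections out
    print(out[0].keys())                        # boxes, labels, scores

Swapping the backbone (say, for a lighter one) changes the feature extractor while the detection pipeline around it stays the same.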
Disclaimer: I also posted this question on CrossValidated, but it is not receiving any attention. If this is not the place for it, I will gladly remove it.
As I understand it, the only difference between them is the way the two networks are trained: deep autoencoders are trained in the same way as a single-layer neural network, while stacked autoencoders are trained with a greedy, layer-wise approach. Hugo Larochelle confirms this in the comments of this video. I wonder: is this the ONLY difference? Any pointers?
The terminology in the field isn't fixed, well-cut, and clearly defined, and different researchers can mean different things or attach different aspects to the same terms. Example discussions:
What is the difference between Deep Learning and traditional Artificial Neural Network machine learning? (some people think that 2 layers is deep enough, some mean 10+ or 100+ layers).
Multi-layer perceptron vs deep neural network (mostly synonyms, but there are researchers who prefer one over the other).
As for AEs: according to various sources, deep autoencoder and stacked autoencoder are exact synonyms; e.g., here's a quote from "Hands-On Machine Learning with Scikit-Learn and TensorFlow":
Just like other neural networks we have discussed, autoencoders can
have multiple hidden layers. In this case they are called stacked
autoencoders (or deep autoencoders).
Later on, the author discusses two methods of training an autoencoder and uses both terms interchangeably.
I would agree that the perception of the term "stacked" is that an autoencoder can be extended with new layers without retraining, but this is actually true regardless of how the existing layers were trained (jointly or separately). Also, regardless of the training method, researchers may or may not call it deep. So I wouldn't focus too much on the terminology; it may stabilize some day, but not right now.
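To illustrate that the distinction is about training rather than architecture, here is a hypothetical PyTorch sketch (layer sizes are arbitrary assumptions): the same multi-layer autoencoder can be trained jointly in one go, and greedy layer-wise pretraining would only change the training schedule, not the final network:

    import torch
    import torch.nn as nn

    # A stacked/deep autoencoder: two encoding layers, two decoding layers.
    encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                            nn.Linear(256, 64), nn.ReLU())
    decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                            nn.Linear(256, 784))

    # Joint ("deep") training: optimize all layers at once on the reconstruction loss.
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
    x = torch.randn(32, 784)  # a dummy batch
    loss = nn.functional.mse_loss(decoder(encoder(x)), x)
    loss.backward()
    opt.step()

    # Greedy ("stacked") training would instead fit one (encode, decode) pair at a
    # time on the previous layer's codes, then optionally fine-tune the whole stack;
    # either way you end up with the same architecture.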
Use on a quadcopter drone to track a human body.
This problem depends on many factors:
Computational resources.
Quality of images.
How much accuracy you expect from the algorithm.
By the way, the easiest way to implement such an algorithm is the Cascade Classifier implemented in OpenCV. You can train your own model or use one of the pretrained models that ship with the OpenCV files. This method supports three feature types: HOG, LBP, and Haar. It is based on the paper Viola and Jones published in 2001. On an ordinary computer, detection runs in near real time.
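For reference, a minimal sketch of that OpenCV route (assuming opencv-python is installed; haarcascade_fullbody.xml ships with OpenCV's data files, and "frame.jpg" is a placeholder for a frame from the drone camera):

    import cv2

    # Load the pretrained full-body Haar cascade bundled with OpenCV.
    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_fullbody.xml")

    img = cv2.imread("frame.jpg")  # placeholder: one camera frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Returns (x, y, w, h) boxes; tune scaleFactor/minNeighbors for your
    # speed/accuracy trade-off.
    bodies = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)

    for (x, y, w, h) in bodies:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detections.jpg", img)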
If you need a more accurate method, you can try DPM (deformable part models) based methods. There are many released versions of this method on the internet. The detection speed is about 2 Hz.
If you need even more accuracy, I suggest you move on to CNNs (convolutional neural networks). Of course, you will need more computational resources (a GPU or high-spec CPUs).
After reading a few papers on deep learning and deep belief networks, I got a basic idea of how they work, but I am still stuck on the last step, i.e., the classification step.
Most of the implementations I found on the Internet deal with generation (MNIST digits).
Is there some explanation (or code) available somewhere that talks about classifying images (preferably natural images or objects) using DBNs?
Also, some pointers in that direction would be really helpful.
The basic idea
These days, the state of the art in deep learning for image classification problems (e.g. ImageNet) is usually a "deep convolutional neural network" (deep ConvNet), roughly following the ConvNet configuration of Krizhevsky et al. (AlexNet).
For inference (classification), you feed an image into the input side of the network (with depth 3, for RGB), crunch through a series of convolution filters, and the network spits out a 1000-dimensional vector at the output. That setup is specific to ImageNet, which focuses on classifying 1000 categories of images, so the 1000-d vector holds the scores for how likely it is that the image fits each category.
Training the neural net is only slightly more complex. For training, you basically run classification repeatedly, and every so often you run backpropagation (see Andrew Ng's lectures) to improve the convolution filters in the network. Essentially, backpropagation asks: "What did the network classify correctly or incorrectly? For the misclassified examples, let's fix the network a little bit."
Implementation
Caffe is a very fast open-source implementation of deep convolutional neural networks (faster than cuda-convnet from Krizhevsky et al.).
The Caffe code is pretty easy to read; there's basically one C++ file per type of network layer (e.g. convolutional layers, max-pooling layers, etc).
You should use a softmax layer (http://en.wikipedia.org/wiki/Softmax_activation_function) on top of the network you have used for generation, and use backpropagation to fine-tune the final network.
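As a minimal sketch of that idea (written in PyTorch purely as an assumption; the original answer predates it), you put a linear classification layer on top of the pretrained features and fine-tune everything with backpropagation against a cross-entropy loss, which applies the softmax internally:

    import torch
    import torch.nn as nn

    # Stand-in for the network trained for generation; the sizes (784 -> 128)
    # and the 10 classes (e.g. MNIST digits) are arbitrary assumptions.
    features = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
    classifier = nn.Linear(128, 10)
    model = nn.Sequential(features, classifier)

    # CrossEntropyLoss applies log-softmax internally; use torch.softmax on the
    # logits at inference time if you want probabilities.
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # dummy batch
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()  # backpropagation fine-tunes all layers, not just the new head
    opt.step()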
These days, people have started using SVMs as the classification layer.
Deep learning is evolving very rapidly and in many directions.
Where is ANN classification (regression) better than SVM? Some real-world examples?
There are many applications where they're better, many where they're comparable, and many where they're worse. It also depends on whom you ask. It is hard to say definitively which type of data or application favors which method.
An example where ANNs, in particular convolutional neural networks, work better than SVMs is digit classification on MNIST. Another such case is the work of Geoff Hinton's group on speech recognition using Deep Belief Networks.
Recently I read a paper proving a theoretical equivalence between ANNs and SVMs. However, ANNs are usually slower than SVMs.
I am just finishing some out-of-the-box comparisons between support vector machines and neural networks on several popular regression and classification datasets. First results, in short: SVMs learn fast and predict slowly; neural networks learn slowly but predict fast and have very lightweight models. Concerning accuracy/loss, both methods seem to be on par.
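In the same out-of-the-box spirit, here is a small, hypothetical scikit-learn comparison on the built-in digits dataset (default hyperparameters; illustrative, not a benchmark):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)  # a small MNIST-like dataset
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    for name, clf in [("SVM", SVC()), ("ANN", MLPClassifier(max_iter=1000))]:
        clf.fit(X_tr, y_tr)
        print(name, "test accuracy:", round(clf.score(X_te, y_te), 3))

Both typically land in the high nineties here, consistent with "on par" accuracy; the interesting differences show up in fit/predict time and model size.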
It will largely depend, as the two have different trade-offs and design criteria. There has been some work showing the relationship, and some say equivalence, as seen in other answers to this question. Below is another reference which draws links between these two techniques in machine learning:
Ronan Collobert and Samy Bengio. 2004. Links between perceptrons, MLPs and SVMs. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML '04). ACM, New York, NY, USA, 23. DOI: https://doi.org/10.1145/1015330.1015415