I am a beginner in deep learning, and while studying it I came across the term "channel" several times, such as quantization channel, input channels, and output channels. Nevertheless, I am still confused about what it means and how it differs from a feature map. Can anyone clarify this for me? Thanks in advance!
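For anyone with the same question, a minimal sketch (assuming PyTorch; the shapes are illustrative) makes the relationship concrete: the out_channels of a convolution is exactly the number of feature maps it produces, so "feature map" is just the usual name for one output channel of a conv layer.

    import torch
    import torch.nn as nn

    # An RGB image has 3 input channels; batch and spatial sizes here are arbitrary.
    x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

    # This layer reads 3 input channels and produces 16 output channels.
    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

    y = conv(x)
    print(y.shape)  # torch.Size([1, 16, 32, 32]) -> 16 feature maps, one per output channel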
I was trying to understand a research paper called "DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution". What I didn't understand is how the "Switchable Atrous Convolution" works and why it is used. I know what atrous/dilated convolution is, but what is switchable here? How is it determined?
I had spent days trying to grasp the concept before asking this question here.
Here are the links I collected and read (they might help you as well):
Official research paper on arXiv
Medium blog for a high-level overview
Python implementation of Switchable Atrous Convolution (official GitHub repo)
I really value your time.
Thank you.
I am answering my own question in the hope that it will be helpful for other people.
SAC works like a soft switch, essentially a mixing coefficient that decides how much information to take from each of two atrous convolutions (with different atrous rates) and blends them. Because the switch "S" is computed by a 1x1 convolution with trainable parameters, the network can learn the optimal mixing coefficient. In effect, the algorithm looks at the image twice with different receptive fields (different atrous rates) to capture the semantic-level information that matters for object detection and semantic/instance segmentation.
These images helped me a lot in making sense of it.
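To make the mechanism concrete, here is a minimal sketch of the soft-switch idea, assuming PyTorch. The channel count and atrous rates are illustrative, and unlike the official implementation this sketch does not share weights between the two branches or pool features before computing the switch; it only shows the mixing.

    import torch
    import torch.nn as nn

    class SoftSwitchAtrousConv(nn.Module):
        """Simplified sketch: a learned per-pixel switch S in (0, 1) mixes two
        3x3 convolutions that differ only in their atrous (dilation) rate."""

        def __init__(self, channels):
            super().__init__()
            # S comes from a trainable 1x1 convolution followed by a sigmoid.
            self.switch = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
            self.conv_rate1 = nn.Conv2d(channels, channels, 3, padding=1, dilation=1)
            self.conv_rate3 = nn.Conv2d(channels, channels, 3, padding=3, dilation=3)

        def forward(self, x):
            s = self.switch(x)  # mixing coefficient, learned end to end
            return s * self.conv_rate1(x) + (1 - s) * self.conv_rate3(x)

    x = torch.randn(1, 64, 56, 56)
    print(SoftSwitchAtrousConv(64)(x).shape)  # torch.Size([1, 64, 56, 56])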
I'm quite confused. Is it possible to find the main word or substring in a sentence (with the help of a training set)? I'm parsing job vacancies and trying to build a text-mining app that could guess which skills are mentioned in the text. Maybe this is a task for some kind of global text search against a skills dictionary, but I'm really curious: can a neural network help?
As you have guessed, I'm a newbie at ML.
Word2Vec is a basic application of neural networks that can create numerical representations of words, which you could use to build a smarter interpretation of your sentences.
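As a minimal sketch (assuming gensim, with a toy corpus standing in for your tokenized vacancy texts):

    from gensim.models import Word2Vec

    # Toy corpus standing in for tokenized vacancy texts.
    sentences = [
        ["looking", "for", "python", "developer", "with", "django", "experience"],
        ["java", "developer", "needed", "spring", "and", "sql", "skills"],
        ["data", "analyst", "with", "sql", "and", "python", "knowledge"],
    ]

    model = Word2Vec(sentences, vector_size=32, window=3, min_count=1, epochs=50)

    # Words that appear in similar contexts end up with similar vectors.
    print(model.wv.most_similar("python", topn=3))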
More interestingly, an LSTM can handle context and identify key words in sentences, as in this paper: http://www.clsp.jhu.edu/~guoguo/papers/icassp2015_myhotword.pdf . It describes identifying key words in sentences to allow for faster, more useful applications of voice-recognition software. Here is the GitHub repo: https://github.com/MajerMartin/lstm-dtw-keyword-spotting . It's too much to explain in this post, but that should keep you busy and get you started training a neural network for keyword identification.
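If you want a feel for the sequence-labelling framing, here is a minimal sketch assuming PyTorch; the SkillTagger model, the data, and the labels are made up for illustration. Each token is classified as skill (1) or not-skill (0):

    import torch
    import torch.nn as nn

    class SkillTagger(nn.Module):
        """Toy sequence labeller: for each token, predict skill (1) or not (0)."""

        def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.classify = nn.Linear(2 * hidden_dim, 2)

        def forward(self, token_ids):
            h, _ = self.lstm(self.embed(token_ids))
            return self.classify(h)  # (batch, seq_len, 2) logits per token

    # Made-up batch: 2 sentences of 5 token ids from a vocabulary of 100 words.
    tokens = torch.randint(0, 100, (2, 5))
    logits = SkillTagger(vocab_size=100)(tokens)
    print(logits.shape)  # torch.Size([2, 5, 2])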
Short answer: no, NNs cannot help.
Long answer: Maybe they can if you really, REALLY want them to and have tons of time and skill.
The problem is that neural networks are built to handle numbers, not words.
Most types of neural networks rely on being able to decide whether two values are close to each other, and that is still not easy with strings in a language context.
So if you don't want to spend the next few years researching neural networks, I would look for a different approach ;)
I am a newbie in deep learning, so I am not sure what it is capable of. I am wondering whether it is possible to use DL for pattern recognition. More specifically: given many images containing different people wearing the same clothes or shoes, can we tell that certain patterns in these images are the same, i.e. that the people are wearing the same clothes or shoes?
If so, what is the pipeline, from data preparation through to classification/prediction? Are there any reference papers or blogs you would recommend? Thanks in advance!
From what I can gather from your problem statement, if you only need some general notion of similarity you should go for an unsupervised clustering algorithm. It will create groups in your data set based on hidden similarities, but you will have to label the clusters yourself, which might not be very useful in your case.
Another approach could be to manually label your images with the categories of interest, for instance same background, same shoes, same clothes, etc., and train a multi-class convolutional neural network.
A neural network would probably give you better results, or you can even use the deep features the network has learned: extract features from your data with the trained net and run a clustering algorithm on those features.
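A rough sketch of that last option, assuming torchvision and scikit-learn; the random tensor stands in for a batch of preprocessed images, and the pretrained weights are downloaded on first use:

    import torch
    from sklearn.cluster import KMeans
    from torchvision.models import resnet18

    # Pretrained CNN as a feature extractor: drop the final classification layer.
    backbone = resnet18(weights="IMAGENET1K_V1")
    backbone.fc = torch.nn.Identity()
    backbone.eval()

    # Stand-in for a batch of 8 preprocessed images (3x224x224 each).
    images = torch.randn(8, 3, 224, 224)

    with torch.no_grad():
        features = backbone(images).numpy()  # (8, 512) deep features

    # Group the images by similarity of their deep features.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)
    print(labels)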
This is an interesting time to be learning about deep learning. These tutorials might help you.
Let me know if I can help further.
I'm doing some research on machine learning algorithms that would be useful for processing image data and using them for recognition purposes. I've stumbled across SpikeNET and thought it had potential. However, their example code is very confusing (the comments are in French), and since I am on a Windows box I cannot compile the project without fiddling around in Cygwin too much.
If anyone has any further information on spiking neuron technology, or on any other well-researched machine learning techniques that yield good results, I would be very interested.
Thanks in advance.
Well, for object detection the more "standard" state-of-the-art approaches are Haar cascades and SIFT features.
As for "working code": have you spent any time poking around OpenCV? It is a very complete computer vision library that can help you along the way. Perhaps start here?
I am working with OpenCV on a project used for recognition, and I had a general question regarding the API and its terms. I've looked online and couldn't find anything specific on this, but I was wondering what the differences are between Discrete AdaBoost, Real AdaBoost, LogitBoost, and Gentle AdaBoost. If anyone could point me to a pros-and-cons comparison or a general description of these, I could research which would be useful.
Update
I have added a link to a PowerPoint file that goes over the different variations of the boosting techniques. Hope this helps someone else out there.
AdaBoost PowerPoint
Thanks in advance
There isn't really a simple "always use technique X" answer, otherwise there wouldn't be a need for all the others. You really have to understand the details and experiment.
See the OpenCV discussion and a list of papers and technical summaries.
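If you do want to experiment, here is a minimal sketch of switching between the four variants through OpenCV's Python ml bindings; the toy data is made up, and older OpenCV versions exposed this class as CvBoost instead:

    import cv2
    import numpy as np

    # Toy 2-class data: 100 samples with 2 features each.
    rng = np.random.RandomState(0)
    samples = rng.rand(100, 2).astype(np.float32)
    labels = (samples[:, 0] > 0.5).astype(np.int32)

    # The four variants the OpenCV API exposes.
    for name, boost_type in [
        ("Discrete", cv2.ml.Boost_DISCRETE),
        ("Real", cv2.ml.Boost_REAL),
        ("LogitBoost", cv2.ml.Boost_LOGIT),
        ("Gentle", cv2.ml.Boost_GENTLE),
    ]:
        boost = cv2.ml.Boost_create()
        boost.setBoostType(boost_type)
        boost.train(samples, cv2.ml.ROW_SAMPLE, labels)
        _, preds = boost.predict(samples)
        acc = (preds.flatten().astype(np.int32) == labels).mean()
        print(f"{name}: training accuracy {acc:.2f}")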