Please can anyone explain the main differences and the relations between these techniques? On the one hand, in many tutorials image segmentation is used as the basis of blob detection. But on the other hand, blob detection algorithms such as connected-component labeling are equivalent to region-growing methods, which are related to image segmentation.
They are distinct concepts; however, they do sometimes overlap.
Let me try to explain it in layman's terms:
Blob detection refers to a specific application of image processing techniques, whose purpose is to isolate (one or more) objects (aka. regions) in the input image;
Image segmentation refers to a classification of image processing techniques used to partition an image into smaller segments (groups of pixels).
Image segmentation has many applications, and one of them happens to be object detection. This is where the confusion usually surfaces, because now these two terms mean similar things:
The application of image segmentation techniques for object detection is exactly what blob detection is all about.
So I believe the main difference is: image segmentation refers to a vast group of techniques, and blob detection refers to an application of those techniques.
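To make that relationship concrete, here is a minimal sketch with OpenCV ("coins.png" is a placeholder input): a segmentation step (Otsu thresholding) followed by connected-component labeling, which is one basic form of blob detection.

```python
# Sketch: segmentation (thresholding) followed by blob detection
# (connected-component labeling). "coins.png" is a placeholder file name.
import cv2

img = cv2.imread("coins.png", cv2.IMREAD_GRAYSCALE)

# Segmentation: partition the image into foreground and background pixels.
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Blob detection: group the foreground pixels into connected regions (blobs).
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)

# Label 0 is the background; every other label is one blob.
for i in range(1, num_labels):
    x, y, w, h, area = stats[i]
    print(f"blob {i}: area={area}, bbox=({x}, {y}, {w}, {h})")
```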
I am using Mask R-CNN to solve an object detection problem, with an implementation of Mask R-CNN on Python 3, cv2, Keras, and TensorFlow. I am trying to identify the damaged area of a truck. The results I get are good when I run the model on images that have no shadow or reflection from the surroundings, but the model fails on images that contain a shadow or some other reflection. I have tried some image processing techniques, namely 1. converting the images to grayscale and 2. color processing, but neither of them gave good results.
Please suggest what I can do to minimize false-positive results.
The problem with training custom classifiers is that even if you have enough images of the object itself, there isn't enough data of that same object in different contexts and backgrounds.
I'd suggest augmenting the data by applying various distortions, including artificial shadows and reflections. This will give you more data with different contexts and help minimize false-positive results.
There are several tools for doing this. One of them is albumentations: https://github.com/albumentations-team/albumentations. It allows you to create many kinds of image augmentations, including random shadows.
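As a rough sketch of what such a pipeline could look like (the parameter values are only illustrative and "truck.jpg" is a placeholder file name):

```python
# Sketch of a shadow/reflection-style augmentation pipeline with albumentations.
import albumentations as A
import cv2

transform = A.Compose([
    A.RandomShadow(p=0.5),               # overlay random shadow polygons
    A.RandomSunFlare(p=0.2),             # simulate bright flares / reflections
    A.RandomBrightnessContrast(p=0.5),   # vary the lighting conditions
    A.HorizontalFlip(p=0.5),
])

image = cv2.imread("truck.jpg")
augmented = transform(image=image)["image"]
cv2.imwrite("truck_augmented.jpg", augmented)
```

For Mask R-CNN you would also pass the masks and bounding boxes through the same transform (albumentations supports mask and bbox targets), so the annotations stay consistent with the augmented images.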
I am working on a limited number of large images, each of which can be 3072x3072 pixels. To train a semantic segmentation model using FCN or U-Net, I construct a large training set in which each training image is 128x128.
In the prediction stage, I cut a large image into small pieces of the same 128x128 size as the training set, feed these small pieces into the trained model, and get the predicted masks. Afterwards, I stitch these small patches together to obtain the mask for the whole image. Is this the right mechanism to perform semantic segmentation on large images?
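For reference, a rough sketch of the tile-and-stitch procedure described above, using plain numpy, assuming the image size is divisible by the tile size and that `model` stands in for the trained FCN/U-Net:

```python
# Sketch: predict a full-image mask by tiling with non-overlapping 128x128 patches.
import numpy as np

def predict_by_tiles(model, image, tile=128):
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = image[y:y + tile, x:x + tile]
            pred = model.predict(patch[np.newaxis, ...])[0]  # shape (tile, tile, 1)
            mask[y:y + tile, x:x + tile] = pred[..., 0]
    return mask
```

In practice the tiles are often made to overlap, with the predictions averaged (or feathered) in the overlap region, which reduces seam artifacts at the patch borders.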
Your solution is often used for this kind of problem. However, I would argue that it depends on the data whether it truly makes sense. Let me give you two examples you can still find on Kaggle.
If you wanted to mask certain parts of satellite images, you would probably get away with this approach without a drop in accuracy. These images are highly repetitive and there's likely no correlation between the segmented area and where in the original image it was taken from.
If you wanted to segment a car from its background, it wouldn't be desirable to break it into patches. Over several layers the network will learn the global distribution of a car in the frame. It's very likely that the mask is positive in the middle and negative in the corners of the image.
Since you didn't give any specifics about what you're trying to solve, I can only give a general recommendation: try to keep the input images as large as your hardware allows. In many situations I would rather downsample the original images than break them down into patches.
Concerning the recommendation of curio1729, I can only advise against training on small patches and testing on the original images. While it's technically possible thanks to fully convolutional networks, you're changing the data to an extent that might very likely hurt performance. CNNs are known for their extraction of local features, but there's a large amount of global information that is learned over the abstraction of multiple layers.
Input image data:
I would not advise feeding the big image (3072x3072) directly into Caffe.
A batch of small images will fit better into memory, and parallel processing also comes into play.
Data augmentation will also be feasible.
Output for the big image:
As for the output for the big image, you had better recast the input size of the FCN to 3072x3072 during the test phase, because the layers of an FCN can accept inputs of any size.
Then you will get a 3072x3072 segmented image as output.
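As a minimal illustration of why this works, here is a sketch in Keras (the idea is framework-independent and applies to Caffe as well): with only convolution and pooling layers and no Flatten/Dense, the spatial input size can be left unspecified, so the same weights run on 128x128 training patches and on the full 3072x3072 test image.

```python
# Sketch of a fully convolutional model whose spatial input size is unspecified.
from tensorflow.keras import Input, Model, layers

inputs = Input(shape=(None, None, 3))        # height and width left unspecified
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.UpSampling2D(2)(x)
outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)   # per-pixel mask
model = Model(inputs, outputs)
```

Whether testing at a much larger resolution than the training patches actually performs well is a separate question, as the other answer points out.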
I was reading about image segmentation, and I understood that it is the first step in image analysis. But I also read that if I am using SURF or SIFT to detect and extract features there is no need for segmentation. Is that true? Is there a need for segmentation if I am using SURF?
The dependency between segmentation and recognition is a bit more complex. Clearly, knowing which pixels of the image belong to your object makes recognition easier. However, this relationship also works in the other direction: knowing what is in the image makes it easier to do segmentation. For simplicity, though, I will only speak about a simple pipeline where segmentation is performed first (for instance based on a simple color model) and each of the segments is then processed.
Your question specifically asks about the SURF features. However, in this context, what is important is that SURF is a local descriptor, i.e. it describes small image patches around detected keypoints. Keypoints should be points in the image where information relevant to your recognition problem can be found (interesting parts of the image), but also points that can reliably be detected in a repeatable fashion on all images of objects belonging to the class of interest. As a result, a local descriptor only cares about the pixels around points selected by the keypoint detector and for each such keypoint extracts a small feature vector. On the other hand a global descriptor will consider all pixels within some area, typically a segment, or the whole image.
Therefore, to perform recognition in an image using a global descriptor, you need to first select the area (segment) from which you want your features to be extracted. These features would then be used to recognize the content of the segment. The situation is a bit different with a local descriptor, since it describes local patches that the keypoint detector determines as relevant. As a result, you get multiple feature vectors for multiple points in the image, even if you do not perform segmentation. Each of these feature vectors tells you something about the content of the image, and you can try to assign each local feature vector to a "class" and gather their statistics to understand the content of the image. Such a simple model is called the bag-of-words model.
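For illustration, a minimal sketch of extracting local features without any prior segmentation (SIFT is used here because SURF is patented and missing from default OpenCV builds; "object.jpg" is a placeholder file name):

```python
# Sketch: local keypoint descriptors extracted with no segmentation step.
import cv2

img = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# One 128-dimensional descriptor per detected keypoint, no segmentation needed.
print(len(keypoints), descriptors.shape)   # e.g. 500 keypoints -> (500, 128)
```

Clustering these descriptors (for example with k-means) and histogramming the cluster assignments per image is essentially the bag-of-words model mentioned above.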
I have read an article regarding brain tumor segmentation. That article describes some methods for segmenting brain tumor cells from normal brain cells: pre-processing, segmentation, and feature extraction. But I couldn't understand what the difference between segmentation and feature extraction is. I googled it as well, but still didn't understand. Can anyone please explain the basic concept of these methods?
Segmentation is usually understood as the decomposition of a whole into parts. In particular, decomposing or partitioning an image into homogeneous regions.
Feature extraction is a broader concept, which can be described as finding areas with specific properties, such as corners, but it can also be any set of measurements, be they scalar, vector, or otherwise. Those features are commonly used for pattern recognition and classification.
A typical processing scheme could be to segment the cells out of the image, then characterize their shape by means of, say, edge-smoothness features, and finally tell normal cells from diseased ones.
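As a rough sketch of that scheme with scikit-image ("cells.png" is a placeholder input):

```python
# Sketch: segmentation produces regions, feature extraction then measures them.
from skimage import filters, io, measure

image = io.imread("cells.png", as_gray=True)

# Segmentation: decompose the image into regions (Otsu thresholding
# followed by connected-component labeling).
labels = measure.label(image > filters.threshold_otsu(image))

# Feature extraction: measure shape properties of each region, to be fed to a classifier.
for region in measure.regionprops(labels):
    print(region.label, region.area, region.eccentricity, region.solidity)
```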
Image Segmentation vs. Feature Localization
• Image Segmentation: if R_1, ..., R_n are the segmented regions, then
1. each R_i is usually connected, i.e. all pixels in R_i are connected (8-connected or 4-connected);
2. R_i ∩ R_j = ∅ for i ≠ j, i.e. the regions are disjoint;
3. ∪_{i=1}^{n} R_i = I, where I is the entire image, i.e. the segmentation is complete.
• Feature Localization: a coarse localization of image features based on proximity and compactness – more effective than image segmentation.
Feature extraction is a prerequisite for image segmentation.
When you face a project that requires segmenting a particular shape or structure in an image, one procedure to apply is to extract the relevant features of that region so that you can differentiate it from the other regions.
A simple and basic feature commonly used in image segmentation is intensity, so you can form different groups of structures based on the intensity they show in the image.
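For instance, a small sketch of grouping pixels purely by intensity, here with K-means on the gray values (K=3 and "scan.png" are arbitrary placeholders):

```python
# Sketch: group pixels into intensity classes with K-means.
import cv2
import numpy as np

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
samples = img.reshape(-1, 1).astype(np.float32)

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, centers = cv2.kmeans(samples, 3, None, criteria, 5,
                                cv2.KMEANS_RANDOM_CENTERS)

# Each pixel now belongs to one of 3 intensity groups
# (e.g. background, healthy tissue, suspicious region).
segmented = labels.reshape(img.shape)
```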
Feature extraction is used for classification, and the relevant and significant features are used for labeling the different classes inside an image.
Recently I began to study computer vision. In a series of texts, I encountered "segmentation", which creates groups of pixels (superpixels) based on pixel features.
Here is an example:
But I'm not sure how superpixels are used in the first place. Do we use them for a paint-like visualization, or for some kind of object detection?
The segmentation you did for this image is not useful, since it does not split off anything meaningful. But consider this segmentation, for example:
It splits the duck and the other objects from the background.
You can find some useful applications of image segmentation here: https://en.wikipedia.org/wiki/Image_segmentation#Applications
Usually superpixels alone are not enough to perform segmentation; they can be the first step, but further processing needs to be done to obtain the final segmentation.
In one of the papers I have read, they use seam processing to measure the energy of the edges.
There is another paper by Jitendra Malik about using superpixels in segmentation.
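Here is a minimal sketch of computing superpixels with SLIC from scikit-image, which could serve as that first step ("duck.jpg" is a placeholder and the parameters are only illustrative):

```python
# Sketch: compute SLIC superpixels and visualize their boundaries.
from skimage import io, segmentation

image = io.imread("duck.jpg")
superpixels = segmentation.slic(image, n_segments=200, compactness=10)

# Draw the superpixel boundaries on the image to visualize the grouping.
marked = segmentation.mark_boundaries(image, superpixels)
io.imsave("superpixels.png", (marked * 255).astype("uint8"))
```

The superpixels are then typically merged or classified by a further step (for example a graph-based grouping) to obtain the final segmentation, which is the additional processing mentioned above.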