Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
The LBP feature has the drawback that it is not too robust on flat image areas.
Questions:
What is a flat image?
What do we mean by "not being robust on flat image areas"?
An image region is said to be "flat" if it has a nearly uniform intensity. In other words, the variance of the intensity values within the region is very low.
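As a rough illustration of that definition, here is a minimal sketch that flags a region as flat when its intensity variance falls below a cutoff. The function name `is_flat` and the threshold value are assumptions chosen for the example, not a standard.

```python
import numpy as np

def is_flat(region, var_threshold=4.0):
    """Return True when the intensity variance of `region` is below the threshold.

    `var_threshold` is a hypothetical cutoff; a sensible value depends on
    the noise level of the camera and the intensity range of the image.
    """
    return float(np.var(np.asarray(region, dtype=np.float64))) < var_threshold

rng = np.random.default_rng(0)
# Nearly uniform patch: constant level 120 plus noise with sigma of about 1.
flat_patch = 120.0 + rng.normal(0.0, 1.0, size=(16, 16))
# Strongly varying patch: uniformly random 8-bit intensities.
textured_patch = rng.integers(0, 256, size=(16, 16))

print(is_flat(flat_patch), is_flat(textured_patch))
```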
The LBP feature is not robust on "flat" image areas since it is based on intensity differences. Within flat image regions, the intensity differences are of small magnitude and highly affected by image noise. Moreover, they are ignorant of the actual intensity level at the location they are computed on.
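To make this concrete, here is a small sketch of the basic 3x3 LBP operator (neighbour >= centre gives a 1 bit). The bit ordering is an arbitrary choice for the example; libraries may order the neighbours differently.

```python
import numpy as np

def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: bit i is set when neighbour i >= centre."""
    center = patch[1, 1]
    # Neighbours in clockwise order, starting at the top-left pixel.
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(int(n >= center) << i for i, n in enumerate(neighbours))

# Flat patch: one grey level of noise on the centre flips every bit.
flat = np.full((3, 3), 100)
noisy_flat = flat.copy()
noisy_flat[1, 1] += 1

# Patch on a strong vertical edge: the same noise changes nothing.
edge = np.array([[50, 50, 200],
                 [50, 100, 200],
                 [50, 50, 200]])
noisy_edge = edge.copy()
noisy_edge[1, 1] += 1

print(lbp_code(flat), lbp_code(noisy_flat))   # codes differ completely
print(lbp_code(edge), lbp_code(noisy_edge))   # codes are identical
# The code also ignores the absolute level: a flat patch at 200 looks like one at 100.
print(lbp_code(np.full((3, 3), 200)))
```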
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I understand that convert -unsharp from ImageMagick uses unsharp masking to sharpen the image. What kind of algorithm is behind convert -adaptive-sharpen? When I want to sharpen my landscape images, which algorithm should I use? What are the advantages and disadvantages of the two algorithms?
I'm not an expert on the algorithm, but both operations achieve the same goal by creating a "mask" to scale the intensity of the sharpening. They differ in how they generate the "mask", and in the arithmetic operations.
With -unsharp
Given...
For demonstration, let's break this down into channels.
Create a "mask" by applying a Gaussian blur.
Apply the gain of the inverse mask if threshold applies.
Ta-Da
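The steps above can be sketched on a single channel with NumPy. This is the textbook unsharp-mask formula, not ImageMagick's actual implementation; the function names and the `sigma`, `amount`, and `threshold` parameters are assumptions for illustration only.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D float array, with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def unsharp_mask(channel, sigma=1.0, amount=1.0, threshold=0.0):
    """Sharpen one channel: add back the detail that the blur removed."""
    channel = channel.astype(np.float64)
    blurred = gaussian_blur(channel, sigma)          # the "mask"
    detail = channel - blurred                       # what the blur removed
    # Apply the gain only where the difference exceeds the threshold.
    gain = np.where(np.abs(detail) > threshold, amount, 0.0)
    return np.clip(channel + gain * detail, 0.0, 255.0)

# A vertical step edge: dark on the left, bright on the right.
step = np.full((8, 16), 50.0)
step[:, 8:] = 200.0
sharpened = unsharp_mask(step, sigma=1.0, amount=1.0)
```

On the step image, the pixels on either side of the edge are pushed further apart (overshoot), while pixels far from the edge stay unchanged.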
With -adaptive-sharpen
Given...
For demonstration, let's break this down into channels (again).
Create "mask" by applying edge detection, and then Gaussian blur.
Apply sharpen, but scale the intensity against the above mask.
Fin
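A rough single-channel sketch of the description above, again in NumPy. It follows the steps as listed in this answer and is not a reproduction of ImageMagick's code; the gradient-magnitude edge detector and all names and parameters are assumptions.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D float array, with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def adaptive_sharpen(channel, sigma=1.0, amount=1.0):
    """Sharpen strongly near edges and barely at all in smooth areas."""
    channel = channel.astype(np.float64)
    # Edge detection (assumed here: gradient magnitude), then Gaussian blur.
    grad_y, grad_x = np.gradient(channel)
    mask = gaussian_blur(np.hypot(grad_x, grad_y), sigma)
    peak = mask.max()
    if peak > 0:
        mask = mask / peak                      # scale the mask to [0, 1]
    # Ordinary sharpening detail, scaled per pixel by the edge mask.
    detail = channel - gaussian_blur(channel, sigma)
    return np.clip(channel + amount * mask * detail, 0.0, 255.0)

# Step edge at column 16, plus one slightly noisy pixel in the flat area.
image = np.full((8, 24), 50.0)
image[:, 16:] = 200.0
image[4, 1] = 52.0
result = adaptive_sharpen(image)
```

In this example the edge is sharpened about as much as with plain unsharp masking, but the isolated noisy pixel in the flat area is left almost untouched, which is the behaviour you want for sky or blurred backgrounds.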
Which command will give the better results for normal outdoor images?
That depends on the subject matter. A good rule of thumb is to use -adaptive-sharpen if the image contains large empty areas (sky, sea, grass, etc.) or a bokeh/blurred background. Otherwise -unsharp will work just fine.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I wrote a 2D simulation (very similar to the Atari OpenAI games) in pygame, which I need for a reinforcement learning project. I'd like to train a neural network using mainly image data, i.e. screenshots of the pygame game board.
I am able to make those screenshots, but:
- Is it possible to gather this image data (or, more precisely, the corresponding RGB image matrix) without rendering the whole playing field to the screen?
As far as I can tell, this is possible in pyglet, but I would like to avoid rewriting the whole simulation.
Basically, yes. You don't have to actually draw anything to the screen surface.
Once you have a Surface, you can use methods like get_at, the PixelArray class or the surfarray module to access the RGB(A) values of each pixel.
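A minimal sketch of that idea, assuming pygame and NumPy are installed. The board size and the red rectangle are placeholders for whatever your simulation draws; no window is opened.

```python
import numpy as np
import pygame

WIDTH, HEIGHT = 64, 48  # hypothetical board size

# An off-screen Surface: draw the game board here instead of on the display.
board = pygame.Surface((WIDTH, HEIGHT), 0, 32)
board.fill((0, 0, 0))
pygame.draw.rect(board, (255, 0, 0), pygame.Rect(10, 5, 20, 10))

# surfarray.array3d copies the pixels into a (width, height, 3) array;
# transpose it to the (height, width, 3) layout most ML code expects.
frame = np.transpose(pygame.surfarray.array3d(board), (1, 0, 2))
print(frame.shape)
```

If the simulation currently draws straight onto the display surface, passing the target Surface into the drawing code lets the same code serve both on-screen play and off-screen training. For single pixels, `board.get_at((x, y))` works as well but is much slower than grabbing the whole array.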
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I am working on image segmentation and object detection, and I think they do the same thing (they both localize and recognize objects in the raw picture). Is there any benefit in using object detection at all, given that deeplab_V3+ does the job with better performance than any object detection algorithm?
You can look at the deeplab_V3+ demo here.
In object detection, the method localizes and classifies the object in the image based on bounding box coordinates. However, in image segmentation, the model also detects the exact boundaries of the object, which usually makes it a bit slower. They both have their own uses. In many applications (e.g. face detection), you only want to detect some specific objects in images and don't necessarily care about the exact boundaries of them. But in some applications (e.g. medical images), you want the exact boundaries of a tumor for example. Also we can consider the process of preparing the data for these tasks:
classification: we only provide a label for each image
localization: we provide a bounding box (4 elements) for each image
detection: we should provide a bounding box and a label for each object
segmentation: we need to define the exact boundaries of each object (semantic segmentation)
So segmentation requires more work, both in preparing the data and in training an (encoder-decoder) model. Which one to choose depends on the purpose of your task.
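One way to see the difference in annotation effort: a segmentation mask already contains the bounding box, but not the other way around. The sketch below derives a box from a binary mask; the function name and the (x_min, y_min, x_max, y_max) box convention are assumptions for the example.

```python
import numpy as np

def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Segmentation label: one value per pixel (here a small blob with a notch).
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1
mask[2, 3] = 0

# Detection label: just four numbers (plus a class label) per object.
box = mask_to_box(mask)
print(box)
```

The box needs 4 values per object, while the mask needs one value per pixel, which is why segmentation data takes longer to annotate.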
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I need to calculate the color histogram of images in order to get a feature for finding similarity between images.
(description: https://stackoverflow.com/a/844113/5142270 and https://en.wikipedia.org/wiki/Color_histogram).
The only problem I am facing is deciding how to scale the images so that they all have the same number of pixels. Is there a standard image size (in pixels) that researchers use for this purpose, when there are thousands of images that can be of any dimension? I searched a lot for how to scale the images, but was unable to find out what is supposed to be done.
Thanks in advance.
You can try using pyramids.
You basically don't have 1 'golden number' of pixels, but you do your feature finding on an image 1/2 the size, and 1/4 and 1/8 and so on, so your feature detection will not be size dependent.
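A minimal pyramid sketch in NumPy, assuming 2x2 block averaging as the downsampling step (real pyramids usually smooth with a Gaussian first). The helper names and the 8-bin histogram are choices made for this example.

```python
import numpy as np

def downsample_half(image):
    """Halve height and width by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(image, levels=3):
    """Return [full size, 1/2, 1/4, ...] with `levels` reduced copies."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels):
        pyramid.append(downsample_half(pyramid[-1]))
    return pyramid

def normalized_histogram(channel, bins=8):
    """Histogram divided by the pixel count, so images of any size are comparable."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / hist.sum()

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 48, 3))       # stand-in for a real RGB image
pyramid = build_pyramid(image, levels=3)
features = [normalized_histogram(level[..., 0]) for level in pyramid]  # red channel per level
```

Note that dividing each histogram by the number of pixels already makes it independent of image size, so for pure color histograms you may not need to rescale at all.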
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
A lot of the research papers I am reading these days just abstractly write image1-image2.
I imagine they mean grayscale images. But how do I extend this to color images?
Do I take the intensities and subtract? Should I compute these intensities by taking the average, or by taking the weighted average as illustrated here?
Also I would prefer if you could quote the source of this as well preferably from a research paper or a textbook.
Edit: I am working on motion detection, where there are tons of algorithms that create a background model of the video (image) and then subtract the current frame (again an image) from this model. We check whether this difference exceeds a given threshold, in which case we classify the pixel as a foreground pixel. So far I have been subtracting the intensities directly, but I don't know whether another approach is possible.
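For reference, the thresholded difference described in the edit can be written for both grayscale and color frames as in the sketch below. Using the Euclidean distance across the RGB channels is one possible extension to color, assumed here for illustration; the function name and threshold value are also assumptions.

```python
import numpy as np

def foreground_mask(frame, background, threshold=30.0):
    """Mark pixels whose difference from the background model exceeds the threshold."""
    # Convert to float first: subtracting uint8 arrays would wrap around.
    diff = frame.astype(np.float64) - background.astype(np.float64)
    if diff.ndim == 3:
        distance = np.linalg.norm(diff, axis=2)   # color: distance in RGB space
    else:
        distance = np.abs(diff)                   # grayscale: absolute difference
    return distance > threshold

background = np.full((10, 10, 3), 100, dtype=np.uint8)
frame = background.copy()
frame[2:4, 2:4] = (200, 100, 100)                 # a small "moving object"
mask = foreground_mask(frame, background)
```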
Subtracting directly in RGB space, or after converting to grayscale, can miss useful information and at the same time introduce many unwanted outliers. You may not need the subtraction operation at all. By investigating the intensity difference between background and object in all three channels, you can determine the range of the background in each channel and simply set the pixels within that range to zero. This study demonstrated that such a method is robust against non-salient motion (such as moving leaves) in the presence of shadows in various environments.
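One possible reading of this approach, as a NumPy sketch: estimate a per-pixel, per-channel range from a few background-only frames, then zero out every pixel that falls inside the range in all three channels. This is an interpretation, not the method from the linked study; the function names and the `margin` parameter are assumptions.

```python
import numpy as np

def background_range(background_frames, margin=5):
    """Per-pixel, per-channel [low, high] range seen in background-only frames."""
    stack = np.stack(background_frames).astype(np.int32)
    return stack.min(axis=0) - margin, stack.max(axis=0) + margin

def suppress_background(frame, low, high):
    """Zero the pixels that lie inside the background range in all three channels."""
    values = frame.astype(np.int32)
    is_background = np.all((values >= low) & (values <= high), axis=2)
    result = frame.copy()
    result[is_background] = 0
    return result, ~is_background

# Three background frames that differ only by slight flicker.
frames = [np.full((4, 4, 3), level, dtype=np.uint8) for level in (98, 100, 102)]
low, high = background_range(frames)

current = np.full((4, 4, 3), 100, dtype=np.uint8)
current[1, 1] = (100, 100, 180)                   # differs in the blue channel only
cleaned, foreground = suppress_background(current, low, high)
```

Because the test is per channel, the pixel at (1, 1) is kept even though two of its three channels match the background.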