This is a continuation of my older question.
Given a non-rectangular region, how do I enumerate all of its pixels and arrange them as a single vector? The order doesn't matter (though it should be deterministic). Is there any fast way (or at least a standard function), or is my best approach to iterate over the pixels in the image and pick only those inside the ROI?
It would be an additional plus if it were possible to restore the region data from that vector later.
You can use the numpy.where() function for this. You don't have to iterate through the pixels.
I will continue from your last question. In the accepted answer, a mask is created and you have drawn a polygon on it to define the mask region. What you have to do is simply find where the pixel value is 255 in that mask image.
ROI_pixel_locations = np.transpose(np.where(mask[:,:,0]==255))
This will give you an Mx2 array of (row, col) locations.
If you are using OpenCV 2.4.4 or higher, it has a function cv2.findNonZero() for exactly this purpose (note that it returns points in (x, y) order rather than (row, col)).
NOTE:
In the previous question, the accepted answer creates a 3-channel mask image. If it were a single-channel mask image, you would only need to write:
ROI_pixel_locations = np.transpose(np.where(mask==255))
But for the AND operation, you would modify the line as follows:
masked_image = cv2.bitwise_and(image,image, mask=mask)
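If what you ultimately want is the pixel values themselves gathered into a single vector (rather than their locations), boolean indexing gives you that directly; a minimal sketch, assuming the same image and 3-channel mask as above:
roi_values = image[mask[:, :, 0] == 255]  # Nx3 array: one BGR triple per ROI pixel, in deterministic row-major order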
Extracting the region
You can use numpy.nonzero()
mask_1c = mask[:, :, 0]
indexes = mask_1c.nonzero()
mask_1c is there because, in your previous question, you had a 3-channel mask image.
Storing as a vector
If you'd rather store the content as a single array (instead of a tuple of arrays):
indexes_v = np.array(indexes).T # Returns an Nx2 matrix
Using the region
Let's say you then wanted to invert that region, for example:
image[indexes[0], indexes[1], :] = 255 - image[indexes[0], indexes[1], :]
where I've assumed that the image is of type np.uint8 (has a max of 255).
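Restoring the region
If you also kept the mask's shape, the stored Nx2 index array is enough to rebuild the mask (and hence the region) later; a minimal sketch, reusing mask_1c and indexes_v from above:
restored_mask = np.zeros(mask_1c.shape, dtype=np.uint8)  # same shape as the original single-channel mask
restored_mask[indexes_v[:, 0], indexes_v[:, 1]] = 255    # set the stored (row, col) locations back to white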
How to use the free scaling parameter (alpha) when dealing with getOptimalNewCameraMatrix and stereoRectify: should one use the same value?
As far as I understand it, a few things that led me to this question are worth listing:
In getOptimalNewCameraMatrix, OpenCV doc says "alpha Free scaling parameter between 0 (when all the pixels in the undistorted image are valid) and 1 (when all the source image pixels are retained in the undistorted image)" [sounds to me like 1 = retain source pixels = minimize loss]
In stereoRectify, OpenCV doc says "alpha Free scaling parameter.... alpha=0 means that ... (no black areas after rectification). alpha=1 means that ... (no source image pixels are lost)"
So in the end, alpha seems to be a parameter that may "act" the same way in both functions? (1 = no source pixels lost, it sounds like, though I'm not sure.)
As far as I understand it, after calibrateCamera one may want to call getOptimalNewCameraMatrix (computing new matrices as outputs) and then stereoRectify (using the newly computed matrices as inputs): does one want to use the same alpha?
Are these 2 alphas the same? Or does one want to use 2 different alphas?
The alphas are the same.
The choice of value depends entirely on the application. Ask yourself:
Does the application need to see all the input pixels to do its job (because, for example, it must use all the "extra" FOV near the image edges, or because you know that the scene's subject that's of interest to the application may be near the edges and you can't lose even a pixel of it)?
Yes: choose alpha=1
No: choose a value of alpha that keeps the "interesting" portion of the image predictably inside the undistorted image.
In the latter case (again, depending on the application) you may need to compute the boundary of the undistorted image within the input one. This is just a poly-curve, which can be approximated by a polygon to any level of accuracy you need, down to the pixel. Or you can use a mask.
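If it helps, here is a small sketch of that last idea. The variable names (K, dist, image_size, alpha) are placeholders for the outputs of your own calibration, not part of any existing code:

import cv2
import numpy as np

# K, dist and image_size=(width, height) are assumed to come from a prior calibrateCamera call
new_K, valid_roi = cv2.getOptimalNewCameraMatrix(K, dist, image_size, alpha, image_size)

# One way to get the valid-pixel region as a mask rather than a box:
# undistort an all-white image; whatever stays white is covered by source pixels
white = np.full((image_size[1], image_size[0]), 255, dtype=np.uint8)
valid_mask = cv2.undistort(white, K, dist, None, new_K) > 0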
I have an array of data from a grayscale image, from which I have segmented sets of contiguous points of a certain intensity value.
Currently I am doing a naive bounding box routine where I find the minimum and maximum (x, y) [row, col] points. This obviously does not provide the smallest possible box that contains the set of points, as is easily demonstrated by rotating a rectangle so that its longest axis is no longer aligned with a principal axis.
What I wish to do is find the minimum-area oriented bounding box. This seems to be possible using an algorithm known as rotating calipers; however, implementations of this algorithm seem to rely on having a set of vertices to begin with. Some details on this algorithm: https://www.geometrictools.com/Documentation/MinimumAreaRectangle.pdf
My main issue is finding the vertices within the data that I currently have. I believe I need to at least find candidate vertices in order to reduce the number of iterations I am performing, since the number of points is relatively large and treating the interior points as if they were vertices is unnecessary if I can figure out a way to exclude them.
Here is some example data that I am working with:
Here's the segmented scene using the naive algorithm, where it segments out the central objects relatively well due to the objects mostly being aligned with the image axes:
In red, you can see the current bounding boxes that I am drawing using 2 vertices: the top-left and bottom-right corners of each group of points I have found.
The rotation part is where my current approach fails: since I am defining the bounding box using only two points, anything that is rotated and not axis-aligned will occupy much more area than necessary to encapsulate the points.
Here's an example with rotated objects in the scene:
Here's the current naive segmentation's performance on that scene, which is drawing larger than necessary boxes around the rotated objects:
Ideally the result would be bounding boxes aligned with the longest axis of the points that are being segmented, which is what I am having trouble implementing.
Here's an image roughly showing what I am really looking to accomplish:
You can also notice unnecessary segmentation done in the image around the borders as well as some small segments, which should be removed with some further heuristics that I have yet to develop. I would also be open to alternative segmentation algorithm suggestions that provide a more robust detection of the objects I am interested in.
I am not sure if this question will be completely clear, therefore I will try my best to clarify if it is not obvious what I am asking.
It's late, but this might still help. This is what you need to do:
expand pixels so that small segments connect into larger bodies
find connected bodies
select a sample of pixels from each body
find the MBR (oriented minimum bounding rectangle) for each selected set
For the first step you can perform dilation. It's somewhat like DBSCAN clustering. For step 3 you can simply select random pixels from a uniform distribution. Obviously, the more pixels you keep, the more accurate the MBR will be. I tested this in MATLAB:
% import image as a matrix of 0s and 1s
oI = ~im2bw(rgb2gray(imread('vSb2r.png'))); % original image
% expand pixels
dI = imdilate(oI,strel('disk',4)); % dilated
% find connected bodies of pixels
CC = bwconncomp(dI);
L = labelmatrix(CC) .* uint8(oI); % labeled
% mark some random pixels
rI = rand(size(oI))<0.3;
sI = L.* uint8(rI) .* uint8(oI); % sampled
% find MBR for each set of connected pixels
% (getMBR is not a built-in; it stands for your minimum-area-rectangle
%  routine, e.g. an implementation of rotating calipers)
for i = 1:CC.NumObjects
    [Y,X] = find(sI == i);
    mbr(i) = getMBR( X, Y );
end
You can also remove some unneeded pixels using some more processing and morphological operations:
remove holes
find boundaries
find skeleton
In MATLAB:
I = imfill(I, 'holes');
I = bwmorph(I,'remove');
I = bwmorph(I,'skel',Inf); % iterate until the skeleton is stable
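For reference, a rough equivalent of the same pipeline in Python/OpenCV, in case you are not working in MATLAB. The input file name is taken from the snippet above, the threshold step is an assumption about your data, and cv2.minAreaRect already performs the rotating-calipers fit, so the random sampling step is optional here:

import cv2
import numpy as np

img = cv2.imread('vSb2r.png', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# 1. expand pixels so nearby fragments connect
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
dilated = cv2.dilate(binary, kernel)

# 2. find connected bodies on the dilated image
num_labels, labels = cv2.connectedComponents(dilated)

# 3-4. fit an oriented minimum-area rectangle to each body's original pixels
for label in range(1, num_labels):
    ys, xs = np.where((labels == label) & (binary > 0))
    if len(xs) < 3:
        continue
    points = np.column_stack((xs, ys)).astype(np.float32)
    rect = cv2.minAreaRect(points)                 # ((cx, cy), (w, h), angle)
    box = cv2.boxPoints(rect).astype(np.int32)
    cv2.polylines(img, [box.reshape(-1, 1, 2)], True, 255, 2)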
I read a similar question on Stack Overflow. I tried, but I still cannot understand how it works.
I read the OpenCV documentation for cv::HoughCircles; here is the explanation of the dp parameter:
Inverse ratio of the accumulator resolution to the image resolution. For example, if dp=1 , the accumulator has the same resolution as the input image. If dp=2 , the accumulator has half as big width and height.
Here is my question: if dp = 1, the accumulator is the same size as the image, so there is a consistent one-to-one match between pixels in the image and positions in the accumulator. But if dp = 2, how does the matching work?
Thanks in advance.
There is no such thing as a one-to-one match here. You have an image with pixels and a Hough space, which is used for voting for circles. This parameter is just a convenient way to specify the size of the Hough space relative to the image size.
Please take a look at this answer for more details.
EDIT:
Your image has (x,y)-coordinates. Your circle Hough space has (a,b,r)-coordinates, where (a,b) are the circle centers and r are the radii. Let's say you find an edge pixel. Now you vote for each circle that could pass through this edge pixel. I found this nice picture of a Hough space with a single vote, i.e. a single edge pixel (continuous case). In practice this vote happens within the 3D accumulator matrix; you can think of it as a rasterization of this continuous case.
Now, as already mentioned, the dp parameter defines the size of this accumulator matrix relative to your image size. The bigger the dp parameter, the lower the resolution of your rasterization. It's like taking photos at different resolutions: if you downsize your photo, multiple pixels are merged into a single one. The same happens if you shrink your accumulator matrix, i.e. increase your dp parameter. Multiple votes for different circle centers (which lie next to each other) and radii (which are of similar size) are then merged, i.e. you get less accurate circle parameters but a more "robust" voting.
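To make the "merging" concrete, here is a rough illustration of the binning only; this is not OpenCV's actual implementation, just the idea:

import numpy as np

dp = 2
height, width = 480, 640
acc_2d = np.zeros((int(height / dp), int(width / dp)), dtype=np.int32)  # one radius slice of the (a, b, r) accumulator

def vote(a, b):
    acc_2d[int(b / dp), int(a / dp)] += 1  # candidate centre (a, b) maps to bin (b/dp, a/dp)

vote(100, 200)
vote(101, 201)  # with dp=2 these two nearby candidates fall into the same bin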
Please be aware that the OpenCV implementation is a little more complicated (it uses the Hough gradient method instead of the standard Hough transform), but the considerations above still apply.
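For completeness, a usage sketch showing where dp enters (OpenCV 3+ constant names; 'coins.png' is a placeholder file name):

import cv2

img = cv2.medianBlur(cv2.imread('coins.png', cv2.IMREAD_GRAYSCALE), 5)

# dp=1: accumulator at full image resolution; dp=2: half the width/height,
# so nearby centre/radius candidates share bins (coarser but more robust voting)
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=2, minDist=20,
                           param1=100, param2=30, minRadius=5, maxRadius=100)
if circles is not None:
    for x, y, r in circles[0]:
        print(f"centre=({x:.0f}, {y:.0f}) radius={r:.0f}")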
When given an image such as this:
And not knowing the color of the object in the image, I would like to be able to automatically find the best H, S and V ranges to threshold the object itself, in order to get a result such as this:
In this example, I manually found the values and thresholded the image using cv::inRange. The output I'm looking for is the best H, S and V ranges (min and max values each, six integer values in total) to threshold the given object in the image, without knowing in advance what color the object is. I need to use these values later on in my code.
Keypoints to remember:
- All given images will be of the same size.
- All given images will have the same dark background.
- All the objects I'll put in the images will be of full color.
I could brute force over all possible permutations of the 6 HSV range values, threshold each one, and find a clever way to figure out when the best blob was found (blob size, maybe?). That seems like a very cumbersome, slow and highly ineffective solution, though.
What would be a good way to approach this? I did some research and found that OpenCV has some machine learning capabilities, but I need to have the actual 6 values at the end of the process, not just a thresholded image.
You could create a small 2 layer neural network for the task of dynamic HSV masking.
Steps:
create/generate ground-truth annotations for each image and the HSV range of the required object
design a small neural network with at least 1 conv layer and 1 FC layer
Input: the m×n mask of the image after applying the HSV range from the ground truth
Output: an m×n binary mask of the image
Post-processing: multiply the mask with the original image to get the required object highlighted
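Once you have such a binary mask (from the network above or from any other segmentation), the six H/S/V min/max values the question asks for can be read off the masked pixels. A minimal sketch, where image and mask are assumed to be the original BGR image and the binary object mask:

import cv2
import numpy as np

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
object_pixels = hsv[mask > 0]                                    # Nx3 array of H, S, V values

# percentiles are a bit more robust to stray pixels than a plain min/max
lower = np.percentile(object_pixels, 1, axis=0).astype(np.uint8)
upper = np.percentile(object_pixels, 99, axis=0).astype(np.uint8)

thresholded = cv2.inRange(hsv, lower, upper)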
I developed an algorithm for border tracing of objects in an image. The algorithm is capable of tracing all the objects in an image and returns the result so that you don't have to slice an image with multiple objects to use them with the algorithm.
So basically I begin by finding a threshold value, then get the binary image after threshold and then run the algorithm on it.
The algorithm is below:
1. Find the first pixel that belongs to any object.
2. Trace that object (has its own algorithm).
3. Get the minimum-area square (bounding box) that contains that object.
4. Mark all the pixels in that square as 0 (erase it from the binary image).
5. Repeat from step 1 until there aren't any objects left.
This algorithm worked perfectly with objects that are far from each other, but when I tried with the image attached, I got the result attached also.
The problem is that the square is near the circle, and part of one object lies inside the bounding square of the other, so that part is deleted because the program thinks it is part of the first object.
I would appreciate it if anyone has a solution to this issue.
Thanks!
A quick-and-dirty method is to sort the bounding boxes in ascending order by area before erasing the shapes. That way smaller shapes are removed first, which will reduce the number of overlapping objects. This will be sufficient if you have only convex shapes.
Pseudocode:
calculate all bounding boxes of shapes
sort boxes by area (smallest area first)
foreach box in list:
    foreach pixel in box:
        set pixel to 0
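A minimal Python/OpenCV sketch of this quick-and-dirty variant (it uses findContours to get the boxes purely for illustration; binary is assumed to be your thresholded 0/255 image, and the return signature is OpenCV 4.x):

import cv2

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = sorted((cv2.boundingRect(c) for c in contours), key=lambda b: b[2] * b[3])

for x, y, w, h in boxes:          # smallest area first
    binary[y:y + h, x:x + w] = 0  # erase the whole box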
A method guaranteed to work for arbitrary shapes is to fill the box using a mask of the object. You already create a binary image, so you can use this as the mask.
Pseudocode:
foreach box in list:
    foreach pixel in box:
        if (pixel in mask == white): set pixel to 0
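A minimal sketch of the mask-based erase in Python/NumPy. Here binary is assumed to be your thresholded 0/255 image, and object_masks/boxes are assumed to come from your tracing step, one full-size 0/255 mask and one (x, y, w, h) box per traced object:

for obj_mask, (x, y, w, h) in zip(object_masks, boxes):
    roi = binary[y:y + h, x:x + w]
    roi[obj_mask[y:y + h, x:x + w] > 0] = 0  # clear only the traced object's own pixels, not everything in the box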
You can try using the Canny edge detection technique to resolve this issue.
You can find more about it at the following URL:
http://homepages.inf.ed.ac.uk/rbf/HIPR2/canny.htm