I have a hard time solving the issue with mask creation.My image is large,
40959px X 24575px and im trying to create a mask for it.
I noticed that i dont have a problem for images up to certain size(I tested about 33000px X 22000px), but for dimensions larger than that i get an error inside my mask(Error is that it gets black in the middle of the polygon and white region extends itself to the left edge.Result should be without black area inside polygon and no white area extending to the left edge of image).
So my code looks like this:
pixel_points_list = latLonToPixel(dataSet, lat_lon_pairs)
print pixel_points_list
# This is the list im getting
#[[213, 6259], [22301, 23608], [25363, 22223], [27477, 23608], [35058, 18433], [12168, 282], [213, 6259]]
image = cv2.imread(in_tmpImgFilePath,-1)
print image.shape
#Value of image.shape: (24575, 40959, 4)
mask = np.zeros(image.shape, dtype=np.uint8)
roi_corners = np.array([pixel_points_list], dtype=np.int32)
print roi_corners
#contents of roi_corners_array:
"""
[[[ 213 6259]
[22301 23608]
[25363 22223]
[27477 23608]
[35058 18433]
[12168 282]
[ 213 6259]]]
"""
channel_count = image.shape[2]
ignore_mask_color = (255,)*channel_count
cv2.fillPoly(mask, roi_corners, ignore_mask_color)
cv2.imwrite("mask.tif",mask)
And this is the mask im getting with those coordinates(minified mask):
You see that in the middle of the mask the mask is mirrored.I took those points from pixel_points_list and drawn them on coordinate system and im getting valid polygon, but when using fillPoly im getting wrong results.
Here is even simpler example where i have only 4(5) points:
roi_corners = array([[ 213 6259]
[22301 23608]
[35058 18433]
[12168 282]
[ 213 6259]])
And i get
Does anyone have a clue why does this happen?
Thanks!
The issue is in the function CollectPolyEdges, called by fillPoly (and drawContours, fillConvexPoly, etc...).
Internally, it's assumed that the point coordinates (of integer type int32) have meaningful values only in the 16 lowest bits. In practice, you can draw correctly only if your points have coordinates up to 32768 (which is exactly the maximum x coordinate you can draw in your image.)
This can't be considered as a bug, since your images are extremely large.
As a workaround, you can try to scale your mask and your points by a given factor, fill the poly on the smaller mask, and then re-scale the mask back to original size
As #DanMaĆĄek pointed out in the comments, this is in fact a bug, not fixed, yet.
In the bug discussion, there is another workaround mentioned. It consists on drawing using multiple ROIs with size less than 32768, correcting coordinates for each ROI using the offset parameter in fillPoly.
Related
I have two images and I need to place the second image inside the first image. The second image can be resized, rotated or skewed such that it covers a larger area of the other images as possible. As an example, in the figure shown below, the green circle need to be placed inside the blue shape:
Here the green circle is transformed such that it covers a larger area. Another example is shown below:
Note that there may be some multiple results. However, any similar result is acceptable as shown in the above example.
How do I solve this problem?
Thanks in advance!
I tested the idea I mentioned earlier in the comments and the output is almost good. It may be better but it takes time. The final code was too much and it depends on one of my old personal projects, so I will not share. But I will explain step by step how I wrote such an algorithm. Note that I have tested the algorithm many times. Not yet 100% accurate.
for N times do this:
1. Copy from shape
2. Transform it randomly
3. Put the shape on the background
4-1. It is not acceptable if the shape exceeds the background. Go to
the first step.
4.2. Otherwise we will continue to step 5.
5. We calculate the length, width and number of shape pixels.
6. We keep a list of the best candidates and compare these three
parameters (W, H, Pixels) with the members of the list. If we
find a better item, we will save it.
I set the value of N to 5,000. The larger the number, the slower the algorithm runs, but the better the result.
You can use anything for Transform. Mirror, Rotate, Shear, Scale, Resize, etc. But I used warpPerspective for this one.
im1 = cv2.imread(sys.path[0]+'/Back.png')
im2 = cv2.imread(sys.path[0]+'/Shape.png')
bH, bW = im1.shape[:2]
sH, sW = im2.shape[:2]
# TopLeft, TopRight, BottomRight, BottomLeft of the shape
_inp = np.float32([[0, 0], [sW, 0], [sW, sH], [0, sH]])
cx = random.randint(5, sW-5)
ch = random.randint(5, sH-5)
o = 0
# Random transformed output
_out = np.float32([
[random.randint(-o, cx-1), random.randint(1-o, ch-1)],
[random.randint(cx+1, sW+o), random.randint(1-o, ch-1)],
[random.randint(cx+1, sW+o), random.randint(ch+1, sH+o)],
[random.randint(-o, cx-1), random.randint(ch+1, sH+o)]
])
# Transformed output
M = cv2.getPerspectiveTransform(_inp, _out)
t = cv2.warpPerspective(shape, M, (bH, bW))
You can use countNonZero to find the number of pixels and findContours and boundingRect to find the shape size.
def getSize(msk):
cnts, _ = cv2.findContours(msk, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnts.sort(key=lambda p: max(cv2.boundingRect(p)[2],cv2.boundingRect(p)[3]), reverse=True)
w,h=0,0
if(len(cnts)>0):
_, _, w, h = cv2.boundingRect(cnts[0])
pix = cv2.countNonZero(msk)
return pix, w, h
To find overlaping of back and shape you can do something like this:
make a mask from back and shape and use bitwise methods; Change this section according to the software you wrote. This is just an example :)
mskMix = cv2.bitwise_and(mskBack, mskShape)
mskMix = cv2.bitwise_xor(mskMix, mskShape)
isCandidate = not np.any(mskMix == 255)
For example this is not a candidate answer; This is because if you look closely at the image on the right, you will notice that the shape has exceeded the background.
I just tested the circle with 4 different backgrounds; And the results:
After 4879 Iterations:
After 1587 Iterations:
After 4621 Iterations:
After 4574 Iterations:
A few additional points. If you use a method like medianBlur to cover the noise in the Background mask and Shape mask, you may find a better solution.
I suggest you read about Evolutionary Computation, Metaheuristic and Soft Computing algorithms for better understanding of this algorithm :)
I am using an object detection machine learning model (only 1 object). It working well in case there are a few objects in image. But, if my image has more than 300 objects, it can't recognize anything. So, I want to divide it into two parts or four parts without crossing any object.
I used threshold otsu and get this threshold otsu image. Actually I want to divide my image by this line expect image. I think my model will work well if make predictions in each part of image.
I tried to use findContour, and find contourArea bigger than a half image area, draw it into new image, get remain part and draw into another image. But most of contour area can't reach 1/10 image area. It is not a good solution.
I thought about how to detect a line touch two boundaries (top and bottom), how can I do it?
Any suggestion is appreciate. Thanks so much.
Since your region of interests are separated already, you can use connectedComponents to get the bounding boxes of these regions. My approach is below.
img = cv2.imread('circles.png',0)
img = img[20:,20:] # remove the connecting lines on the top and the left sides
_, img = cv2.threshold(img,0,1,cv2.THRESH_BINARY)
labels,stats= cv2.connectedComponentsWithStats(img,connectivity=8)[1:3]
plt.imshow(labels,'tab10')
plt.show()
As you can see, two regions of interests have different labels. All we need to do is to get the bounding boxes of these regions. But first, we have to get the indices of the regions. For this, we can use the size of the areas, because after the background (blue), they have the largest areas.
areas = stats[1:,cv2.CC_STAT_AREA] # the first index is always for the background, we do not need that, so remove the background index
roi_indices = np.flip(np.argsort(areas))[0:2] # this will give you the indices of two largest labels in the image, which are your ROIs
# Coordinates of bounding boxes
left = stats[1:,cv2.CC_STAT_LEFT]
top = stats[1:,cv2.CC_STAT_TOP]
width = stats[1:,cv2.CC_STAT_WIDTH]
height = stats[1:,cv2.CC_STAT_HEIGHT]
for i in range(2):
roi_ind = roi_indices[i]
roi = labels==roi_ind+1
roi_top = top[roi_ind]
roi_bottom = roi_top+height[roi_ind]
roi_left = left[roi_ind]
roi_right = roi_left+width[roi_ind]
roi = roi[roi_top:roi_bottom,roi_left:roi_right]
plt.imshow(roi,'gray')
plt.show()
For your information, my method is only valid for 2 regions. In order to split into 4 regions, you would need some other approach.
I am working on some leaf images using OpenCV (Java). The leaves are captured on a white paper and some has shadows like this one:
Of course, it's somehow the extreme case (there are milder shadows).
Now, I want to threshold the leaf and also remove the shadow (while reserving the leaf's details).
My current flow is this:
1) Converting to HSV and extracting the Saturation channel:
Imgproc.cvtColor(colorMat, colorMat, Imgproc.COLOR_RGB2HSV);
ArrayList<Mat> channels = new ArrayList<Mat>();
Core.split(colorMat, channels);
satImg = channels.get(1);
2) De-noising (median) and applying adaptiveThreshold:
Imgproc.medianBlur(satImg , satImg , 11);
Imgproc.adaptiveThreshold(satImg , satImg , 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 401, -10);
And the result is this:
It looks OK, but the shadow is causing some anomalies along the left boundary. Also, I have this feeling that I am not using the white background to my benefit.
Now, I have 2 questions:
1) How can I improve the result and get rid of the shadow?
2) Can I get good results without working on saturation channel?. The reason I ask is that on most of my images, working on L channel (from HLS) gives way better results (apart from the shadow, of course).
Update: Using the Hue channel makes threshdolding better, but makes the shadow situation worse:
Update2: In some cases, the assumption that the shadow is darker than the leaf doesn't always hold. So, working on intensities won't help. I'm looking more toward a color channels approach.
I don't use opencv, instead I was trying to use matlab image processing toolbox to extract the leaf. Hopefully opencv has all the processing functions for you. Please see my result below. I did all the operations in your original image channel 3 and channel 1.
First I used your channel 3, threshold it with 100 (left top). Then I remove the regions on the border and regions with the pixel size smaller than 100, filling in the hole in the leaf, the result is shown in right top.
Next I used your channel 1, did the same thing as I did in channel 3, the result is shown in left bottom. Then I found out the connected regions (there are only two as you can see in the left bottom figure), remove the one with smaller area (shown in right bottom).
Suppose the right top image is I1, and the right bottom image is I, the leaf is extracted by implement ~I && I1. The leaf is:
Hope it helps. Thanks
I tried two different things:
1. other thresholding on the saturation channel
2. try to find two contours: shadow and leaf
I use c++ so your code snippets will look a little different.
trying otsu-thresholding instead of adaptive thresholding:
cv::threshold(hsv_imgs,mask,0,255,CV_THRESH_BINARY|CV_THRESH_OTSU);
leading to following images (just OTSU thresholding on saturation channel):
the other thing is computing gradient information (i used sobel, see oppenCV documentation), thresholding that and after an opening-operator I used findContours giving something like this, not useable yet (gradient contour approach):
I'm trying to do the same thing with photos of butterflies, but with more uneven and unpredictable backgrounds such as this. Once you've identified a good portion of the background (e.g. via thresholding, or as we do, flood filling from random points), what works well is to use the GrabCut algorithm to get all those bits you might miss on the initial pass. In python, assuming you still want to identify an initial area of background by thresholding on the saturation channel, try something like
import cv2
import numpy as np
img = cv2.imread("leaf.jpg")
sat = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:,:,1]
sat = cv2.medianBlur(sat, 11)
thresh = cv2.adaptiveThreshold(sat , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh.jpg", thresh)
h, w = img.shape[:2]
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
grabcut_mask = thresh/255*3 #background should be 0, probable foreground = 3
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("GrabCut1.jpg", img*grabcut_mask[...,None])
This actually gets rid of the shadows for you in this case, because the edge of the shadow actually has high saturation levels, so is included in the grab cut deletion. (I would post images, but don't have enough reputation)
Usually, however, you can't trust shadows to be included in the background detection. In this case you probably want to compare areas in the image with colour of the now-known background using the chromacity distortion measure proposed by Horprasert et. al. (1999) in "A Statistical Approach for Real-time Robust Background Subtraction and Shadow Detection". This measure takes account of the fact that for desaturated colours, hue is not a relevant measure.
Note that the pdf of the preprint you find online has a mistake (no + signs) in equation 6. You can use the version re-quoted in Rodriguez-Gomez et al (2012), equations 1 & 2. Or you can use my python code below:
def brightness_distortion(I, mu, sigma):
return np.sum(I*mu/sigma**2, axis=-1) / np.sum((mu/sigma)**2, axis=-1)
def chromacity_distortion(I, mu, sigma):
alpha = brightness_distortion(I, mu, sigma)[...,None]
return np.sqrt(np.sum(((I - alpha * mu)/sigma)**2, axis=-1))
You can feed the known background mean & stdev as the last two parameters of the chromacity_distortion function, and the RGB pixel image as the first parameter, which should show you that the shadow is basically the same chromacity as the background, and very different from the leaf. In the code below, I've then thresholded on chromacity, and done another grabcut pass. This works to remove the shadow even if the first grabcut pass doesn't (e.g. if you originally thresholded on hue)
mean, stdev = cv2.meanStdDev(img, mask = 255-thresh)
mean = mean.ravel() #bizarrely, meanStdDev returns an array of size [3,1], not [3], so flatten it
stdev = stdev.ravel()
chrom = chromacity_distortion(img, mean, stdev)
chrom255 = cv2.normalize(chrom, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX).astype(np.uint8)[:,:,None]
cv2.imwrite("ChromacityDistortionFromBackground.jpg", chrom255)
thresh2 = cv2.adaptiveThreshold(chrom255 , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh2.jpg", thresh2)
grabcut_mask[...] = 3
grabcut_mask[thresh==0] = 0 #where thresh == 0, definitely background, set to 0
grabcut_mask[np.logical_and(thresh == 255, thresh2 == 0)] = 2 #could try setting this to 2 or 0
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("final_leaf.jpg", grabcut_mask[...,None]*img)
I'm afraid with the parameters I tried, this still removes the stalk, though. I think that's because GrabCut thinks that it looks a similar colour to the shadows. Let me know if you find a way to keep it.
I've ran in to an issue concerning generating floating point coordinates from an image.
The original problem is as follows:
the input image is handwritten text. From this I want to generate a set of points (just x,y coordinates) that make up the individual characters.
At first I used findContours in order to generate the points. Since this finds the edges of the characters it first needs to be ran through a thinning algorithm, since I'm not interested in the shape of the characters, only the lines or as in this case, points.
Input:
thinning:
So, I run my input through the thinning algorithm and all is fine, output looks good. Running findContours on this however does not work out so good, it skips a lot of stuff and I end up with something unusable.
The second idea was to generate bounding boxes (with findContours), use these bounding boxes to grab the characters from the thinning process and grab all none-white pixel indices as "points" and offset them by the bounding box position. This generates even worse output, and seems like a bad method.
Horrible code for this:
Mat temp = new Mat(edges, bb);
byte roi_buff[] = new byte[(int) (temp.total() * temp.channels())];
temp.get(0, 0, roi_buff);
int COLS = temp.cols();
List<Point> preArrayList = new ArrayList<Point>();
for(int i = 0; i < roi_buff.length; i++)
{
if(roi_buff[i] != 0)
{
Point tempP = bb.tl();
tempP.x += i%COLS;
tempP.y += i/COLS;
preArrayList.add(tempP);
}
}
Is there any alternatives or am I overlooking something?
UPDATE:
I overlooked the fact that I need the points (pixels) to be ordered. In the method above I simply do scanline approach to grabbing all the pixels. If you look at the 'o' for example, it would grab first the point on the left hand side, then the one on the right hand side. I would need them to be ordered by their neighbouring pixels since I want to draw paths with the points later on (outside of opencv).
Is this possible?
You should look into implementing your own connected components labelling. The concept is very simple: you scan the first line and assign unique labels to each horizontally connected strip of pixels. You basically check for every pixel if it is connected to its left neighbour and assign it either that neighbour's label or a new label. In the second row you do the same, but you also check against the pixels above it. Sometimes you need a label merge: two strips that were not connected in the previous row are joined in the current row. The way to deal with this is either to keep a list of label equivalences or use pointers to labels (so you can easily do a complete label change for an object).
This is basically what findContours does, but if you implement it yourself you have the freedom to go for 8-connectedness and even bridge a single-pixel or two-pixel gap. That way you get "almost-connected components labelling". It looks like you need this for the "w" in your example picture.
Once you have the image labelled this way, you can push all the pixels of a single label to a vector, and order them something like this. Find the top left pixel, push it to a new vector and erase it from the original vector. Now find the pixel in the original vector closest to it, push it to the new vector and erase from the original. Continue until all pixels have been transferred.
It will not be very fast this way, but it should be a start.
i have a 128x128 array of elevation data (elevations from -400m to 8000m are displayed using 9 colors) and i need to resize it to 512x512. I did it with bicubic interpolation, but the result looks weird. In the picture you can see original, nearest and bicubic. Note: only the elevation data are interpolated not the colors themselves (gamut is preserved). Are those artifacts seen on the bicubic image result of my bad interpolation code or they are caused by the interpolating of discrete (9 steps) data?
http://i.stack.imgur.com/Qx2cl.png
There must be something wrong with the bicubic code you're using. Here's my result with Python:
The black border around the outside is where the result was outside of the palette due to ringing.
Here's the program that produced the above:
from PIL import Image
im = Image.open(r'c:\temp\temp.png')
# convert the image to a grayscale with 8 values from 10 to 17
levels=((0,0,255),(1,255,0),(255,255,0),(255,0,0),(255,175,175),(255,0,255),(1,255,255),(255,255,255))
img = Image.new('L', im.size)
iml = im.load()
imgl = img.load()
colormap = {}
for i, color in enumerate(levels):
colormap[color] = 10 + i
width, height = im.size
for y in range(height):
for x in range(width):
imgl[x,y] = colormap[iml[x,y]]
# resize using Bicubic and restore the original palette
im4x = img.resize((4*width, 4*height), Image.BICUBIC)
palette = []
for i in range(256):
if 10 <= i < 10+len(levels):
palette.extend(levels[i-10])
else:
palette.extend((i, i, i))
im4x.putpalette(palette)
im4x.save(r'c:\temp\temp3.png')
Edit: Evidently Python's Bicubic isn't the best either. Here's what I was able to do by hand in Paint Shop Pro, using roughly the same procedure as above.
While bicubic interpolation can sometimes generate interpolating values outside the original range (can you verify if this is happening to you?) It really seems like you may have a bug, but it is hard to say without looking at the code. As a general rule the bicubic solution should be smoother than the nearest neighbor solution.
Edit: I take that back, I see no interpolating values outside the original range in your images. Still, I think the strange part is the "jaggedness" you get when using bicubic, you may want to double check that.