Filling holes in 3D volume - image-processing

I am trying to find libraries that can fill small cavity inside a volume as well as a hole that is through the volume like a tube. I had tried SimpleITK but didn't got any success with that. I had tried all the GrayScale Morphological operation there but these holes are not getting filled up.
I would like to know the solution for this problem.
import SimpleITK as sitk
image = sitk.ReadImage("volume.mha")
filt_1 = sitk.GrayscaleFillholeImageFilter()
filt_2 = sitk.GrayscaleMorphologicalClosingImageFilter()
output_1 = filt_1.Execute(image)
output_2 = filt_2.Execute(image)
The filters are being created in such a manner with default parameters and are applied then on the input image.
Thanks and Regards
Vaibhav

A little bit late, however, this answer could be use for the ones dealing with the same problem:
import SimpleITK as sitk
image = sitk.ReadImage("volume.mha")
#be sure that you have a binary or grayscale image
binary_image = sitk.BinaryThreshold(image, 0.5, 1, 1, 0)
dims = segmentation.GetDimension()
# then apply a fillhole algorithm
filledMask = sitk.BinaryFillhole(segmentation)
# since the method before perfoms a dilation, try a erosion afterwards
erodedMask = sitk.BinaryErode(filledMask, [3]*dims)

Related

PySimpleGUI/tk: drawing OpenCV image into sg.Graph efficiently

An image captured from a camera is stored into a numpy ndarray object with a shape of (1224,1024,3).
This format is very convenient for using OpenCV methods over it.
I was looking for the way to draw it into (or onto) an sg.Graph element of PySimpleGUI.
The method I have found worked, but was very inefficient:
def draw_img(self, img):
# turn the image into a PIL image object:
pil_im = Image.fromarray(img)
# use PIL to convert the image into an in-memory PNG file
with BytesIO() as output:
pil_im.save(output, format="PNG")
png = output.getvalue()
# remove any previous elements from the canvas of our sg.Graph:
self.image_element.erase()
# add an image into the sg.Graph element
self.image_element.draw_image(data=png, location=(0, self.img_sz[1]))
The reason for being inefficient is clearly because we are encoding the raw image into PNG.
However I could not find any better way to do this! In my case, I had to show every frame coming from the camera, and it was way too slow.
So what is a better way to do it?
I could not find any 'by the book' solution, but I have found a working way.
The idea came from a discussion in this page. With it, I had to copy the original code of draw_image in PySimpleGUI and modify it.
def draw_img(self, img):
# turn our ndarray into a bytesarray of PPM image by adding a simple header:
# this header is good for RGB. for monochrome, use P5 (look for PPM docs)
ppm = ('P6 %d %d 255 ' % (self.img_sz[0], self.img_sz[1])).encode('ascii') + img.tobytes()
# turn that bytesarray into a PhotoImage object:
image = tk.PhotoImage(width=self.img_sz[0], height=self.img_sz[1], data=ppm, format='PPM')
# for first time, create and attach an image object into the canvas of our sg.Graph:
if self.img_id is None:
self.img_id = self.image_element.Widget.create_image((0, 0), image=image, anchor=tk.NW)
# we must mimic the way sg.Graph keeps a track of its added objects:
self.image_element.Images[self.img_id] = image
else:
# we reuse the image object, only changing its content
self.image_element.Widget.itemconfig(self.img_id, image=image)
# we must update this reference too:
self.image_element.Images[self.img_id] = image
Using this method, I could achieve a great performance, so I decided to share this solution with the community. Hoping this will help anybody!
Thank you for helping me. I modified your plan.
my code:
def image_source(path,width=128,height=128):
img = Image.open(path).convert('RGB')#PIL image
image = img.resize((width, height))
ppm = ('P6 %d %d 255 ' % (width, height)).encode('ascii') + image.tobytes()
return ppm
image1=image_source(r'Z:\xxxx6.png')
layout = [.... sg.Image(data=image1) ...

Image preprocessing - Train and test image data are not being read in corresponding order

I am trying to train a CNN model for an image processing problem statement. - I am facing a major issue in the preprocessing stage, where the train datasets of both train_rain and train_no_rain are not in the order I wish for them to be. This is affecting the performance of my model, as it is important for my model to ID an image with rain streaks and then the same image without them.
Any solutions to this issue?
Here are the samples of what I am trying to imply -
Say after reading the datasets as shown below:
path_1 = "gdrive/My Drive/Rain100H/train/rainy"
train_rain = []
no_train_rain = 0
gauss_img = []
for img in glob.glob(path_1+"/*.png"):
im = cv.imread(img)
im = cv.resize(im,(128,128))
#Gaussian Blur
im_gb = cv.GaussianBlur(im,(5,5),0)
gauss_img.append(im_gb)
cv.waitKey()
no_train_rain+=1
train_rain.append(im)
train_no_rain = []
no_train_no_rain = 0
path_2 = "gdrive/My Drive/Rain100H/train/no rain"
for img in glob.glob(path_2+"/*.png"):
im = cv.imread(img)
im = cv.resize(im,(128,128))
cv.waitKey()
no_train_no_rain+=1
train_no_rain.append(im)
Now I want to read the first images from train_rain and train_no_rain, AFTER converting them to numpy arrays. and I did that using this -
import matplotlib.pyplot as plt
first image from train_rain
plt.imshow(train_rain[1])
first image from train_no_rain
plt.imshow(train_no_rain[1])
But ideally, the first image in train_no_rain should be:
PS: The datasets have all the images beforehand, it's just that they are not being read in a particular order.
Any sort of help would be much appreciated :)

Extracting Semi-structured Text from a complex UI (Golf Simulator)

I'm fairly new to the world of OCR, OpenCV, Tesseract etc and was hoping to get some advice or a nudge in the right direction for a project I'm working on. For context, I practice golf at an indoor simulator that is powered by Full Swing Golf. My goal is to build an app (preferably iphone, but desktop is fine too) that will be able to grab the data provided by the simulator and process it however I'd like. The overall workflow would look something like:
Set up iPhone or laptop camera to watch the simulator screen.
Hit ball
Statistics Screen is displayed that looks more or less like:
Detect that the Statistics Screen has been displayed and grab all relevant data:
| Distance | Launch | Back Spin | Club Speed | Carry | To Pin | Direction | Ball Speed | Side Spin | Club Face | Club Path |
|----------|--------|-----------|------------|-------|--------|-----------|------------|-----------|-----------|-----------|
| 345 | 13 | 3350 | 135 | 335 | 80 | 2.4 | 190 | 350 | 4.3 | 1.6 |
5-?: Save the data to my app, keep track of it over time etc...
Attempts So Far:
It seemed like OpenCV's matchTemplate would be a simple way to find all of the headings in the image (Distance, Launch etc...) and it does seem to work when the image and template are both the perfect resolution. However, as this will be an iPhone app, the quality is not something I can really guarantee (within reason). Moreso, the screen will almost never be straight-on as it appears above. Most likely, the camera will be off to the side and we will have to de-skew accordingly. I've attempted to use the following image to work on my deskewing logic to no avail:
Finding the reference points in order to deskew via getPerspectiveTransform and warpPerspective has proven to be incredibly difficult due to the above issues with matching templates.
I've also tried dynamically adjusting for scale with code resembling the following:
def findTemplateLocation(image_path):
template = cv2.imread(image_path)
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
w, h = template.shape[::-1]
threshold = 0.65
loc = []
for scale in np.linspace(0.1, 2, 20)[::-1]:
resized = imutils.resize(template, width=int(template.shape[1] * scale))
w, h = resized.shape[::-1]
res = cv2.matchTemplate(image_gray, resized, cv2.TM_CCOEFF_NORMED)
loc = np.where(res >= threshold)
if len(list(zip(*loc[::-1]))) > 0:
break
if loc and len(list(zip(*loc[::-1]))) > 0:
adjusted_w = int(w/scale)
adjusted_h = int(h/scale)
print(str(adjusted_w) + " " + str(adjusted_h) + " " + str(scale))
ret = []
for pt in zip(*loc[::-1]):
ret.append({'width': w, 'height': h, 'location': pt})
return ret
return None
This still returns a ton of false positives.
I'm hoping to get some advice on how to approach this problem with a clean slate. I'm open to any language / workflow.
If it does seem that I'm on the right track, my current code is at https://gist.github.com/naderhen/9ec8d45f13d92507131d5bce0e84fad8 . Would really appreciate any suggestions for the best next steps.
Thanks for any help you can provide!
EDIT: Additional Resources
I've uploaded a number of videos and still photos from my time at the indoor simulator this weekend: https://www.dropbox.com/sh/5vub2mi4rvunyaw/AAAY1_7Q_WBV4JvmDD0dEiTDa?dl=0
I tried to get a number of different angles, with different lighting etc. Please let me know if I can provide any other resources that may help.
So, I tried two different methods:
Contour detection - This seemed to be the most obvious method since the statistics screen is the primary part of the image and is present in all your images. Although it does work with two of the three images, it might not be very robust with the parameters. Here are the steps that I tried for contour:
First, get the the image in grayscale or take one of the Value channel in HSV. Then, threshold the image using either Otsu or Adaptive Thresholding. After playing with a lot of the associated parameters, I got satisfactory results, which would basically mean nice whole statistics screen in white on a black background. After this, sort the contours like this:
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[1]
# Sort the contours to avoid unnecessary comparison in the for loop below
cntsSorted = sorted(contours, key=lambda x: cv2.contourArea(x), reverse=True)
for cnt in cntsSorted[0:20]:
peri = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.04 * peri, True)
if len(approx) == 4 and peri > 10000:
cv2.drawContours(sorted_image, cnt, -1, (0, 255, 0), 10)
Feature Detection and Matching: Since using contours wasn't robust enough, I tried another method which I've worked on for a similar problem as yours. This method is fairly robust, much faster (I tried this on an android phone 2 years back and it would do the job in less than a second for a 1280 x 760 image). However, after trying on your work cases, I figured that your images are pretty vague. What I mean by that is, you have two images in your question that have fairly similar primaries and it works for that but the images you posted in comments are very different from these and hence it doesn't find suitable number of good matches (at least 10 in my case). If you can post a nice set of images that you actually will encounter, I will update this answer with my results on the new set. More importantly, the images of the scene evidently have change in perspective which shouldn't be an issue assuming you are able to get a very good source image (as the first one in your question). However, the change in lighting conditions can be a pain. I'd suggest either using different color spaces such as HSV, Lab and Luv instead of BGR.
Here is where you can find a working example of how to implement your own feature matcher. There are some code changes required depending on the version of OpenCV you are using but I am sure you can find the solutions ( I did ;) ).
A good example:
Some suggestions:
Try getting as clean an image as possible for the image you are using to match with the others (your first image in my case). Hopefully, this would require you to do less processing.
Try using unsharp mask before finding Keypoints.
My results are from using ORB. You can also try with other detectors/descriptors like SURF, SIFT and FAST.
Finally, your approach of template matching should work in cases where there is a change only in the scaling and not the perspective.
Hope this helps! Write a comment if you have any additional questions and/or when you have a good image set ready (rubs palms). Cheers!
Edit 1: This is the code that I used for the Feature Detection and Matching in Opencv 3.4.3 and Python 3.4
def unsharp_mask(im):
# This is used to sharpen images
gaussian_3 = cv2.GaussianBlur(im, (3, 3), 3.0)
return cv2.addWeighted(im, 2.0, gaussian_3, -1.0, 0, im)
def screen_finder2(image, source, num=0):
def resize(im, new_width):
r = float(new_width) / im.shape[1]
dim = (new_width, int(im.shape[0] * r))
return cv2.resize(im, dim, interpolation=cv2.INTER_AREA)
width = 300
source = resize(source, new_width=width)
image = resize(image, new_width=width)
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2LUV)
image, u, v = cv2.split(hsv)
hsv = cv2.cvtColor(source, cv2.COLOR_BGR2LUV)
source, u, v = cv2.split(hsv)
MIN_MATCH_COUNT = 10
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(image, None)
kp2, des2 = orb.detectAndCompute(source, None)
flann = cv2.DescriptorMatcher_create(cv2.DescriptorMatcher_FLANNBASED)
# Without the below 2 lines, matching doesn't work
des1 = np.asarray(des1, dtype=np.float32)
des2 = np.asarray(des2, dtype=np.float32)
matches = flann.knnMatch(des1, des2, k=2)
# store all the good matches as per Lowe's ratio test
good = []
for m, n in matches:
if m.distance < 0.7 * n.distance:
good.append(m)
if len(good) >= MIN_MATCH_COUNT:
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1,
1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1,
1, 2)
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
matchesMask = mask.ravel().tolist()
h,w = image.shape
pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1,
1, 2)
dst = cv2.perspectiveTransform(pts, M)
source_bgr = cv2.cvtColor(source, cv2.COLOR_GRAY2BGR)
img2 = cv2.polylines(source_bgr, [np.int32(dst)], True, (0,0,255), 3,
cv2.LINE_AA)
cv2.imwrite("out"+str(num)+".jpg", img2)
else:
print("Not enough matches." + str(len(good)))
matchesMask = None
draw_params = dict(matchColor=(0, 255, 0), # draw matches in green color
singlePointColor=None,
matchesMask=matchesMask, # draw only inliers
flags=2)
img3 = cv2.drawMatches(image, kp1, source, kp2, good, None, **draw_params)
cv2.imwrite("ORB"+str(num)+".jpg", img3)
match_image = unsharp_mask(cv2.imread("source.jpg"))
image_1 = unsharp_mask(cv2.imread("Screen_1.jpg"))
screen_finder2(match_image, image_1, num=1)

Searching for objects from database on image

Let's suppose I have database with thousands of images with different forms and sizes (smaller than 100 x 100px) and it's guaranted that every of images shows only one object - symbol, logo, road sign, etc. I would like to be able to take any image ("my_image.jpg") from the Internet and answer the question "Do my_image contains any object (object can be resized, but without deformations) from my database?" - let's say with 95% reliability. To simplify my_images will have white background.
I was trying use imagehash (https://github.com/JohannesBuchner/imagehash), which would be very helpful, but to get rewarding results I think I have to calculate (almost) every possible hash of my_image - the reason is I don't know object size and location on my_image:
hash_list = []
MyImage = Image.open('my_image.jpg')
for x_start in range(image_width):
for y_start in range(image_height):
for x_end in range(x_start, image_width):
for y_end in range(y_start, image_height):
hash_list.append(imagehash.phash(MyImage.\
crop(x_start, y_start, x_end, y_end)))
...and then try to find similar hash in database, but when for example image_width = image_height = 500 this loops and searching will take ages. Of course I can optymalize it a little bit but it still looks like seppuku for bigger images:
MIN_WIDTH = 30
MIN_HEIGHT = 30
STEP = 2
hash_list = []
MyImage = Image.open('my_image.jpg')
for x_start in range(0, image_width - MIN_WIDTH, STEP):
for y_start in range(0, image_height - MIN_HEIGHT, STEP):
for x_end in range(x_start + MIN_WIDTH, image_width, STEP):
for y_end in range(y_start + MIN_HEIGHT, image_height, STEP):
hash_list.append(...)
I wonder if there is some nice way to define which parts of my_image are profitable to calculate hashes - for example cutting edges looks like bad idea. And maybe there is an easier solve? It will be great if the program could give the answer in max 20 minutes. I would be gratefull for any advice.
PS: sorry for my English :)
This looks like an image retrieval problem to me. However, in your case, you are more interested in a binary YES / NO answer which tells if the input image (my_image.jpg) is of an object which is present in your database.
The first thing which I can suggest is that you can resize all the images (including input) to a fixed size, say 100 x 100. But if an object in some image is very small or is present in a specific region of image (for e.g., top left) then resizing can make things worse. However, it was not clear from your question that how likely this is in you case.
About your second question for finding out the location of object, I think you were considering this because your input images are of large size, such as 500 x 500? If so, then resizing is better idea. However, if you asked this question because objects a localized to particular regions in images, then I think you can compute a gradient image which will help you to identify background regions as follows: since background has no variation (complete white) gradient values will be zero for pixels belonging to background regions.
Rather than calculating and using image hash, I suggest you to read about bag-of-visual-words (for e.g., here) based approaches for object categorization. Although your aim is not to categorize objects, but it will help you come up with a different approach to solve your problem.
After all I found solution that looks really nice for me and maybe it will be useful for someone else:
I'm using SIFT to detect "best candidates" from my_image:
def multiscale_template_matching(template, image):
results = []
for scale in np.linspace(0.2, 1.4, 121)[::-1]:
res = imutils.resize(image, width=int(image.shape[1] * scale))
r = image.shape[1] / float(res.shape[1])
if res.shape[0] < template.shape[0] or res.shape[1] < template.shape[1];
break
## bigger correlation <==> better matching
## template_mathing uses SIFT to return best correlation and coordinates
correlation, (x, y) = template_matching(template, res)
coordinates = (x * r, y * r)
results.appent((correlation, coordinates, r))
results.sort(key=itemgetter(0), reverse=True)
return results[:10]
Then for results I'm calculating hashes:
ACCEPTABLE = 10
def find_best(image, template, candidates):
template_hash = imagehash.phash(template)
best_result = 50 ## initial value must be greater than ACCEPTABLE
best_cand = None
for cand in candidates:
cand_hash = get_hash(...)
hash_diff = template_hash - cand_hash
if hash_diff < best_result:
best_result = hash_diff
best_cand = cand
if best_result <= ACCEPTABLE:
return best_cand, best_result
else:
return None, None
If result < ACCEPTABLE, I'm almost sure the answer is "GOT YOU!" :) This solve allows me to compare my_image with 1000 of objects in 7 minutes.

Blocproc in matlab with two output variables

I have the following problem. I have to compute dense SIFT interest points in a very high dimensional image (182MP). When I run the code in the full image Matlab always close suddently. So I decided to run the code in image patches.
the code
I tried to use blocproc in matlab to call the c++ function that performs the dense sift interest points detection this way:
fun = #(block_struct) denseSIFT(block_struct.data, options);
[dsift , infodsift] = blockproc(ndvi,[1000 1000],fun);
where dsift is the sift descriptors (vectors) and infodsift has the information of the interest points, such as the x and y coordinates.
the problem
The problem is the fact that blocproc just allow one output, but i want both outputs. The following error is given by matlab when i run the code.
Error using blockproc
Too many output arguments.
Is there a way for me doing this?
Would it be a problem for you to "hard code" a version of blockproc?
Assuming for a moment that you can divide your image into NxM smaller images, you could loop around as follows:
bigImage = someFunction();
sz = size(bigImage);
smallSize = sz ./ [N M];
dsift = cell(N,M);
infodsift = cell(N,M);
for ii = 1:N
for jj = 1:M
smallImage = bigImage((ii-1)*smallSize(1) + (1:smallSize(1)), (jj-1)*smallSize(2) + (1:smallSize(2));
[dsift{ii,jj} infodsift{ii,jj}] = denseSIFT(smallImage, options);
end
end
The results will then be in the two cell arrays. No real need to pre-allocate, but it's tidier if you do. If the individual matrices are the same size, you can convert into a single large matrix with
dsiftFull = cell2mat(dsift);
Almost magic. This won't work if your matrices are different sizes - but then, if they are, I'm not sure you would even want to put them all in a single one (unless you decide to horzcat them).
If you do decide you want a list of "all the colums as a giant matrix", then you can do
giantMatrix = [dsift{:}];
This will return a matrix with (in your example) 128 rows, and as many columns as there were "interest points" found. It's shorthand for
giantMatrix = [dsift{1,1} dsift{2,1} dsift{3,1} ... dsift{N,M}];

Resources