Problem Statement: I have generated a video from data to images of ANSYS Simulation of vortices formed due to flat plate plunging. The video contains vortices (in simpler terms blobs), which are ever evolving (dissociating and merging).
Sample video
Objective: The vortices are needed to be identified and labelled such that the consistency of the label is maintained. If a certain vortex has been given a label in the previous frame its label remains the same. If it dissociates the larger component (parent) should retain the same label whereas the smaller component gets a new label. If it merges then the label of larger of the two should be given to them.
Attempt: I have written a function which detects object boundary (contour detection) and then finds the center of the identiied contour. Further mapping the centroid to the one closest in the next frame provided the distance is less than certain threshold.
Attempted tracking video
Tracking algorithm:
import math
class EuclideanDistTracker:
def __init__(self):
# Store the center positions of the objects
self.center_points = {}
# Keep the count of the IDs
# each time a new object id detected, the count will increase by one
self.id_count = 0
def update(self, objects_rect):
# Objects boxes and ids
objects_bbs_ids = []
# Get center point of new object
for rect in objects_rect:
x, y, w, h = rect
cx = (x + x + w) // 2
cy = (y + y + h) // 2
# Find out if that object was detected already
same_object_detected = False
for id, pt in self.center_points.items():
dist = math.hypot(cx - pt[0], cy - pt[1])
if dist < 20: # Threshold
self.center_points[id] = (cx, cy)
print(self.center_points)
objects_bbs_ids.append([x, y, w, h, id])
same_object_detected = True
break
# New object is detected we assign the ID to that object
if same_object_detected is False:
self.center_points[self.id_count] = (cx, cy)
objects_bbs_ids.append([x, y, w, h, self.id_count])
self.id_count += 1
# Clean the dictionary by center points to remove IDS not used anymore
new_center_points = {}
for obj_bb_id in objects_bbs_ids:
_, _, _, _, object_id = obj_bb_id
center = self.center_points[object_id]
new_center_points[object_id] = center
# Update dictionary with IDs not used removed
self.center_points = new_center_points.copy()
return objects_bbs_ids
Implementing tracking algorithm to the sample video:
import cv2
import numpy as np
from tracker import *
# Create tracker object
tracker = EuclideanDistTracker()
cap = cv2.VideoCapture("Video Source")
count = 0
while True:
ret, frame = cap.read()
print("\n")
if(count != 0):
print("Frame Count: ", count)
frame = cv2.resize(frame, (0, 0), fx = 1.5, fy = 1.5)
height, width, channels = frame.shape
# 1. Object Detection
hsvFrame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
sensitivity = 20
white_lower = np.array([0,0,255-sensitivity])
white_upper = np.array([255,sensitivity,255])
white_mask = cv2.inRange(hsvFrame, white_lower, white_upper)
# Morphological Transform, Dilation
kernal = np.ones((3, 3), "uint8")
c = 1
contours_w, hierarchy_w = cv2.findContours(white_mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
detections = []
for contour_w in contours_w:
area = cv2.contourArea(contour_w)
if area > 200 and area < 100000:
cv2.drawContours(frame, [contour_w], -1, (0, 0, 0), 1)
x,y,w,h = cv2.boundingRect(contour_w)
detections.append([x, y, w, h])
# 2. Object Tracking
boxes_ids = tracker.update(detections)
for box_id in boxes_ids:
x, y, w, h, id = box_id
cv2.putText(frame, str(id), (x - 8, y + 8), cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 0), 2)
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 0), 1)
cv2.imshow("Frame", frame)
count += 1
key = cv2.waitKey(0)
if key == 27:
break
cap.release()
cv2.destroyAllWindows()
Problem: I was able to implement continuous labelling but the problem is that the objective of retaining label for the parent vortex is not maintained. (As seen from time (t) = 0s to 9s, the largest vortex is given a label = 3, whereas from t = 9s it is given label = 9. I want that to remain as label = 3 in Attempted tracking video). Any suggestions would be helpful and guide if I am on the right track or should I use some other standard tracking algorithm or any deep learning algorithm.
PS: Sorry for the excessive text but stackoverflow denied me putting just link of the codes.
Related
I am attempting to detect an image of a certain type on a page of degraded quality, that has rotational and translational variance. I need to "cropped" the detected image out of the page, so I will need the rotation and coords of the detected image. For example an image that has been photocopied on an A4 page.
I am using SIFT to detect objects the scanned page. These images can be rotated and translated but are not sheered or have any perspective distortion. I am using the classic (SIFT, SURF, ORB, etc) approach however it assumes perspective transform in order to create the 4 points of the bounding polygon. The issue here is since the key points dont line up perfectly (due to varying image qualities, the projection assumes spatial distortion and the polygon is rightfully distorted.
The approach I want to try is to "snap" the detected polygon points to the dimensions/area of the input image. This should allow me to determine the angle of rotation and translation of the image on the page.
Things I have tried are (And Failed):
Filter key point to remove outliers to minimise distortion.
Affine/Rotations/etc matrices, however they assume point from the samples are equidistant and dont do approximations.
ICP: Would probably work, but there is not enough samples and it seems to be more of an approach than a method. I am certain there is a better way.
def detect(img, frame, detector):
frame = frame.copy()
kp1, desc1 = detector.detectAndCompute(img, None)
kp2, desc2 = detector.detectAndCompute(frame, None)
index_params = dict(algorithm=0, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(desc1, desc2, k=2)
good_points = []
for m, n in matches:
if m.distance < 0.5 * n.distance:
good_points.append(m)
if(len(good_points) == 20):
break
# out_img=cv2.drawMatches(img, kp1, frame, kp2, good_points, flags=2, outImg=None)
# plt.figure(figsize = (6*4, 8*4))
# plt.imshow(out_img)
if len(good_points) > 10: # at least 6 matches are required
# Get the matching points
query_pts = np.float32([kp1[m.queryIdx].pt for m in good_points]).reshape(-1, 1, 2)
train_pts = np.float32([kp2[m.trainIdx].pt for m in good_points]).reshape(-1, 1, 2)
matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
matches_mask = mask.ravel().tolist()
h, w = img.shape
pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
dst = cv2.perspectiveTransform(pts, matrix)
overlayImage = cv2.polylines(frame, [np.int32(dst)], True, (0, 0, 0), 3)
plt.figure(figsize = (6*2, 8*2))
plt.imshow(overlayImage)
orb = cv2.SIFT_create()
for frame in frames:
detect(img, frame, orb)
This is an example of a page with the image we are trying to detect on it.
Blue line: rectangle with correct size
Red Line: determines polygon using perspective transform
I stumbled on a post that show you how to extract the minimum bounding box from a set of points. This works really well as it also discloses the rotation.
def detect_ICP(img, frame, detector):
frame = frame.copy()
kp1, desc1 = detector.detectAndCompute(img, None)
kp2, desc2 = detector.detectAndCompute(frame, None)
index_params = dict(algorithm=0, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(desc1, desc2, k=2)
matches = sorted(matches, key = lambda x:x[0].distance + 0.5 * x[1].distance)
good_points = []
for m, n in matches:
if m.distance < 0.5 * n.distance:
good_points.append(m)
out_img=cv2.drawMatches(img, kp1, frame, kp2, good_points, flags=2, outImg=None)
plt.figure(figsize = (6*4, 8*4))
plt.imshow(out_img)
if len(good_points) > 10: # at least 6 matches are required
# Get the matching points
query_pts = np.float32([kp1[m.queryIdx].pt for m in good_points]).reshape(-1, 1, 2)
train_pts = np.float32([kp2[m.trainIdx].pt for m in good_points]).reshape(-1, 1, 2)
matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
# matches_mask = mask.ravel().tolist()
h, w = img.shape
pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
dst = cv2.perspectiveTransform(pts, matrix)
# determine the minimum bounding box
minAreaRect = cv2.minAreaRect(dst) # This will have size and rotation information
rotatedBox = cv2.boxPoints(minAreaRect)
rotatedBox = np.float32(rotatedBox).reshape(-1, 1, 2)
overlayImage = cv2.polylines(frame, [np.int32(rotatedBox)], True, (0, 0, 0), 3)
plt.figure(figsize = (6*2, 8*2))
plt.imshow(overlayImage)
I have collected images of S9 phones, added labels with labellmg and trained for a few hours in google colab. I had minimal loss so I thought it is enough. I only selected the rectangles where the phone is displayed and nothing else. What I dont understand is, it draws a lot of rectangles on the phone. I only want 1 or 2 rectangles drawn on the phone itself. Did I do something wrong?
def detect_img(self, img):
blob = cv2.dnn.blobFromImage(img, 0.00392 ,(416,416), (0,0,0), True, crop=False)
input_img = self.net.setInput(blob)
output = self.net.forward(self.output)
height, width, channel = img.shape
boxes = []
trusts = []
class_ids = []
for out in output:
for detect in out:
total_scores = detect[5:]
class_id = np.argmax(total_scores)
trust_factor = total_scores[class_id]
if trust_factor > 0.5:
x_center = int(detect[0] * width)
y_center = int(detect[1] * height)
w = int(detect[2] * width)
h = int(detect[3] * height)
x = int(x_center - w / 2)
y = int(x_center - h / 2)
boxes.append([x,y,w,h])
trusts.append(float(trust_factor))
class_ids.append(class_id)
cv2.rectangle(img, (x_center,y_center), (x + w, y + h), (0,255,0), 2)
When I set the trust_factor to 0.8, a lot of the rectangles are gone but there are still rectangles outside the phone, while I only selected the phone itself in labellmg and not the background.
You can use function "non maximum suppression" that it removes rectangles which have less score. I put a code for NMS
def NMS(boxes, overlapThresh = 0.4):
# Return an empty list, if no boxes given
if len(boxes) == 0:
return []
x1 = boxes[:, 0] # x coordinate of the top-left corner
y1 = boxes[:, 1] # y coordinate of the top-left corner
x2 = boxes[:, 2] # x coordinate of the bottom-right corner
y2 = boxes[:, 3] # y coordinate of the bottom-right corner
# Compute the area of the bounding boxes and sort the bounding
# Boxes by the bottom-right y-coordinate of the bounding box
areas = (x2 - x1 + 1) * (y2 - y1 + 1) # We add 1, because the pixel at the start as well as at the end counts
# The indices of all boxes at start. We will redundant indices one by one.
indices = np.arange(len(x1))
for i,box in enumerate(boxes):
# Create temporary indices
temp_indices = indices[indices!=i]
# Find out the coordinates of the intersection box
xx1 = np.maximum(box[0], boxes[temp_indices,0])
yy1 = np.maximum(box[1], boxes[temp_indices,1])
xx2 = np.minimum(box[2], boxes[temp_indices,2])
yy2 = np.minimum(box[3], boxes[temp_indices,3])
# Find out the width and the height of the intersection box
w = np.maximum(0, xx2 - xx1 + 1)
h = np.maximum(0, yy2 - yy1 + 1)
# compute the ratio of overlap
overlap = (w * h) / areas[temp_indices]
# if the actual boungding box has an overlap bigger than treshold with any other box, remove it's index
if np.any(overlap) > treshold:
indices = indices[indices != i]
#return only the boxes at the remaining indices
return boxes[indices].astype(int)
I am new to opencv-python. I have found the lines in image through houghtransformP. The lines drawn from hough transform are discontinued and are giving multiple lines. I need to draw only one line for the edges and find the 'distance' between lines which are found.
The output image is shown below
"""
Created on Fri Nov 8 11:41:16 2019
#author: romanth.chowan
"""
import cv2
import numpy as np
import math
def getSlopeOfLine(line):
xDis = line[0][2] - line[0][0]
if (xDis == 0):
return None
return (line[0][3] - line[0][1]) / xDis
if __name__ == '__main__':
inputFileName_ =r"C:\Users\romanth.chowan\Desktop\opencv\stent spec\2prox.jpeg"
img = cv2.imread(inputFileName_)
img1=cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
edges = cv2.Laplacian(gray,cv2.CV_8UC1) # Laplacian Edge Detection
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 300, 10, 10)
print(len(lines))
parallelLines = []
for a in lines:
for b in lines:
if a is not b:
slopeA = getSlopeOfLine(b)
slopeB = getSlopeOfLine(b)
if slopeA is not None and slopeB is not None:
if 0 <= abs(slopeA - slopeB) <= 10:
parallelLines.append({'lineA': a, 'lineB': b})
for pairs in parallelLines:
lineA = pairs['lineA']
lineB = pairs['lineB']
leftx, boty, rightx, topy = lineA[0]
cv2.line(img, (leftx, boty), (rightx, topy), (0, 0, 255), 2)
left_x, bot_y, right_x, top_y = lineB[0]
cv2.line(img, (left_x, bot_y), (right_x, top_y), (0, 0, 255), 2)
cv2.imwrite('linesImg.jpg', img)
output image after drawing lines:
It's mostly geometric task, not specific to OpenCV.
For each line you have two points (x1,y1) and (x2,y2) which are already used in your getSlopeOfLine(line) method.
You can denote each line in form:
ax + by + c = 0
To do that use two known line's points:
(y1 - y2)x + (x2 - x1)y + (x1y2 - x2y1) = 0
Note than parallel lines have same a and b while different c.
And than measure distance between any two of them (distance between non-parallel lines is equal to zero since they have a crossing point) :
d = abs(c2 - c1) / sqrt(a*a + b*b)
In Euclidean geometry line may be denoted in several ways and one may suit specific task better than another.
Currently you evaluate line's slope, from formula above we can get:
y = (-b / a)x - c / b
same to (b has another meaning now)
y = kx + b
Or using two line's points:
y = (x1 - x2) / (y1 - y2) * x + (x1y2 - x2y1)
Where k is line's slope (tan(alpha)) and b is shift.
Now you just match parallel lines (one with close k). You can take in account line's shift to merge several parallel lines into one.
currently I'm working on a project where I try to find the corners of the rectangle's surface in a photo using OpenCV (Python, Java or C++)
I've selected desired surface by filtering color, then I've got mask and passed it to the cv2.findContours:
cnts, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnt = sorted(cnts, key = cv2.contourArea, reverse = True)[0]
peri = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.02*peri, True)
if len(approx) == 4:
cv2.drawContours(mask, [approx], -1, (255, 0, 0), 2)
This gives me an inaccurate result:
Using cv2.HoughLines I've managed to get 4 straight lines that accurately describe the surface. Their intersections are exactly what I need:
edged = cv2.Canny(mask, 10, 200)
hLines = cv2.HoughLines(edged, 2, np.pi/180, 200)
lines = []
for rho,theta in hLines[0]:
a = np.cos(theta)
b = np.sin(theta)
x0 = a*rho
y0 = b*rho
x1 = int(x0 + 1000*(-b))
y1 = int(y0 + 1000*(a))
x2 = int(x0 - 1000*(-b))
y2 = int(y0 - 1000*(a))
cv2.line(mask, (x1,y1), (x2,y2), (255, 0, 0), 2)
lines.append([[x1,y1],[x2,y2]])
The question is: is it possible to somehow tweak findContours?
Another solution would be to find coordinates of intersections. Any clues for this approach are welcome :)
Can anybody give me a hint how to solve this problem?
Finding intersection is not so trivial problem as it seems to be, but before the intersection points will be found, following problems should be considered:
The most important thing is to choose the right parameters for the HoughLines function, since it can return from 0 to an infinite numbers of lines (we need 4 parallel)
Since we do not know in what order these lines go, they need to be compared with each other
Because of the perspective, parallel lines are no longer parallel, so each line will have a point of intersection with the others. A simple solution would be to filter the coordinates located outside the photo. But it may happen that an undesirable intersection will be within the photo.
The coordinates should be sorted. Depending on the task, it could be done in different ways.
cv2.HoughLines will return an array with the values of rho and theta for each line.
Now the problem becomes a system of equations for all lines in pairs:
def intersections(edged):
# Height and width of a photo with a contour obtained by Canny
h, w = edged.shape
hl = cv2.HoughLines(edged,2,np.pi/180,190)[0]
# Number of lines. If n!=4, the parameters should be tuned
n = hl.shape[0]
# Matrix with the values of cos(theta) and sin(theta) for each line
T = np.zeros((n,2),dtype=np.float32)
# Vector with values of rho
R = np.zeros((n),dtype=np.float32)
T[:,0] = np.cos(hl[:,1])
T[:,1] = np.sin(hl[:,1])
R = hl[:,0]
# Number of combinations of all lines
c = n*(n-1)/2
# Matrix with the obtained intersections (x, y)
XY = np.zeros((c,2))
# Finding intersections between all lines
for i in range(n):
for j in range(i+1, n):
XY[i+j-1, :] = np.linalg.inv(T[[i,j],:]).dot(R[[i,j]])
# filtering out the coordinates outside the photo
XY = XY[(XY[:,0] > 0) & (XY[:,0] <= w) & (XY[:,1] > 0) & (XY[:,1] <= h)]
# XY = order_points(XY) # obtained points should be sorted
return XY
here is the result:
It is possible to:
select the longest contour
break it into segments and group them by gradient
Fit lines to the largest four groups
Find intersection points
But then, Hough transform does nearly the same thing. Is there any particular reason for not using it?
Intersection points of lines are very easy to calculate. A high-school coordinate geometry lesson can provide you with the algorithm.
I'm trying to get the number contours from an image.
Original image is in number_img:
After I've used the following code:
gray = cv2.cvtColor(number_img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (1, 1), 0)
ret, thresh = cv2.threshold(blur, 70, 255, cv2.THRESH_BINARY_INV)
img2, contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
area = cv2.contourArea(c)
[x, y, w, h] = cv2.boundingRect(c)
if (area > 50 and area < 1000):
[x, y, w, h] = cv2.boundingRect(c)
cv2.rectangle(number_img, (x, y), (x + w, y + h), (0, 0, 255), 2)
Since there are small boxes in between, I tried to limit with height:
if (area > 50 and area < 1000) and h > 50:
[x, y, w, h] = cv2.boundingRect(c)
cv2.rectangle(number_img, (x, y), (x + w, y + h), (0, 0, 255), 2)
What other ways should I do to get the best contours of number to do OCR?
Thanks.
Just tried in Matlab. Hopefully you can adapt the code to OpenCV and tweak some parameters. It is not clear the right most blob is a number or not.
img1 = imread('DSYEW.png');
% first we can convert the grayscale image you provided to a binary
% (logical) image. It is always the best option in image preprocessing.
% Here I used the threshold .28 based on your image. But you may change it
% for a general solution.
img = im2bw(img1,.28);
% Then we can use the Matlab 'regionprops' command to identify the
% individual blobs in binary image. 'regionprops' gives us an output, the
% Area of the each blob.
s = regionprops(imcomplement(img));
% Now as you did, we can filter out the bounding boxes with an area
% threshold. I used 350 originally. But it can be changed for a better
% output.
s([s.Area] < 350) = [];
% Now we draw each bounding box on the image.
figure; imshow(img);
for k = 1 : length(s)
bb = s(k).BoundingBox;
rectangle('Position', [bb(1),bb(2),bb(3),bb(4)],...
'EdgeColor','r','LineWidth',2 )
end
Output image:
Update 1:
Just changed the area parameter in the above code as follows. Unfortunately I don't have Python OpenCV in my Mac. But, it is all about tweaking the parameters in you code.
s([s.Area] < 373) = [];
Output image:
Update 2:
Numbers 3 and 4 in the above figure were detected as one digit. If you look at carefully you are see that 3 and 4 are connected with each other, and that is why above code detected it as a single digit. So I used the imdilate function to get rid of that. Next, in your code, even the white holes inside some digits were detected as digits. To eliminate that we can fill the holes using imfill in Matlab.
Updated code:
img1 = imread('TCXeuO9.png');
img = im2bw(img1,.28);
img = imcomplement(img);
img = imfill(img,'holes');
img = imcomplement(img);
se = strel('line',2,90);
img = imdilate(img, se);
s = regionprops(imcomplement(img));
s([s.Area] < 330) = [];
figure; imshow(img);
for k = 1 : length(s)
bb = s(k).BoundingBox;
rectangle('Position', [bb(1),bb(2),bb(3),bb(4)],...
'EdgeColor','r','LineWidth',2 )
end
Output image: