Pi live video color detection - opencv

I'm planning to create an ambilight effect behind my TV. I want to achieve this with a camera pointed at my TV; I think the easiest way is to use a simple IP camera. I need color detection to detect the colors on the screen and translate them to RGB values on the LED strip.
I have a Raspberry Pi as a hub in the middle of my house. I was thinking about using it like this:
The IP camera is pointed at my screen, the Pi processes the video, translates it to RGB values and sends them to an MQTT server, and a NodeMCU behind the TV receives the colors and drives the LEDs.
How can I detect colors on a live stream (at multiple points) on my Pi?

If you can create any background colour, the best approach is probably to compute the k-means clusters or the median to get the "most popular" colours. If the ambient light should differ from place to place, then using ROIs at the image edges you can check which colour is dominant in each area (by comparing the number of samples of each colour).
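A quick sketch of the k-means idea on one edge ROI (the file name, the ROI slice and the choice of K = 3 are illustrative assumptions, not part of the original answer):
import cv2
import numpy as np
img = cv2.imread('frame.png', cv2.IMREAD_COLOR)       # placeholder image
roi = img[:, :60].reshape(-1, 3).astype(np.float32)   # left edge, one sample per pixel
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 3                                                 # arbitrary number of clusters
_, labels, centers = cv2.kmeans(roi, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# The largest cluster's centre is "the most popular" colour of this region (BGR order)
dominant = centers[np.bincount(labels.flatten()).argmax()].astype(int)
print(dominant)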
If you only have a limited set of colours (e.g. only R, G and B), then you can simply check which channel has the highest intensity in the desired region.
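For that limited-colour case, a minimal sketch of the channel check (it reuses img and the imports from the sketch above; the ROI is again an arbitrary example, and remember that OpenCV stores images as BGR):
roi = img[0:60, 0:60]                                 # e.g. top-left 60x60 patch
means = roi.mean(axis=(0, 1))                         # [mean_B, mean_G, mean_R]
dominant_channel = ('blue', 'green', 'red')[int(np.argmax(means))]
print(dominant_channel)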
I wrote the code with an assumption that you can create any RGB ambient color.
As a test image I use this one:
The code is:
import cv2
import numpy as np
# Read an input image (in your case this will be an image from the camera)
img = cv2.imread('saul2.png', cv2.IMREAD_COLOR)
# The block_size defines how big the patches around an image are
# the more LEDs you have and the more segments you want, the lower block_size can be
block_size = 60
# Get dimensions of an image
height, width, chan = img.shape
# Calculate number of patches along height and width
h_steps = height // block_size
w_steps = width // block_size
# In one loop I calculate both: left and right ambient or top and bottom
ambient_patch1 = np.zeros((60, 60, 3))
ambient_patch2 = np.zeros((60, 60, 3))
# Create output image (just for visualization
# there will be an input image in the middle, 10px black border and ambient color)
output = cv2.copyMakeBorder(img, 70, 70, 70, 70, cv2.BORDER_CONSTANT, value = 0)
for i in range(h_steps):
    # Get left and right region of an image
    left_roi = img[i * 60 : (i + 1) * 60, 0 : 60]
    right_roi = img[i * 60 : (i + 1) * 60, -61 : -1]
    left_med = np.median(left_roi, (0, 1))   # This is an actual RGB color for given block (on the left)
    right_med = np.median(right_roi, (0, 1)) # and on the right
    # Create patch having an ambient color - this is just for visualization
    ambient_patch1[:, :] = left_med
    ambient_patch2[:, :] = right_med
    # Put it in the output image (the additional 70 is because the input image is in the middle, shifted by 70px)
    output[70 + i * 60 : 70 + (i + 1) * 60, 0 : 60] = ambient_patch1
    output[70 + i * 60 : 70 + (i + 1) * 60, -61 : -1] = ambient_patch2
for i in range(w_steps):
    # Get top and bottom region of an image
    top_roi = img[0 : 60, i * 60 : (i + 1) * 60]
    bottom_roi = img[-61 : -1, i * 60 : (i + 1) * 60]
    top_med = np.median(top_roi, (0, 1))       # This is an actual RGB color for given block (on top)
    bottom_med = np.median(bottom_roi, (0, 1)) # and bottom
    # Create patch having an ambient color - this is just for visualization
    ambient_patch1[:, :] = top_med
    ambient_patch2[:, :] = bottom_med
    # Put it in the output image (the additional 70 is because the input image is in the middle, shifted by 70px)
    output[0 : 60, 70 + i * 60 : 70 + (i + 1) * 60] = ambient_patch1
    output[-61 : -1, 70 + i * 60 : 70 + (i + 1) * 60] = ambient_patch2
# Save output image
cv2.imwrite('saul_output.png', output)
And this gives a result as follows:
I hope this helps!
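To adapt this to your live setup, the same per-ROI median can be computed frame by frame from the camera stream and published over MQTT. A rough sketch, assuming the camera exposes an RTSP URL and the paho-mqtt 1.x Client() constructor; the broker address, stream URL, topic and payload format are placeholders:
import cv2
import numpy as np
import paho.mqtt.client as mqtt
client = mqtt.Client()
client.connect('192.168.1.10')                        # hypothetical broker address
client.loop_start()                                   # run the network loop in the background
cap = cv2.VideoCapture('rtsp://camera/stream')        # placeholder stream URL
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Median colour of the left edge of the frame (one LED segment as an example)
    left = frame[:, :60]
    b, g, r = np.median(left, axis=(0, 1)).astype(int)
    client.publish('ambilight/left', f'{r},{g},{b}')  # payload format is an assumption
cap.release()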
EDIT:
And two more examples:

Related

Image Processing: Mapping a scanned image on a template image with many identical features

Problem description
We are trying to match a scanned image onto a template image:
Example of a scanned image:
Example of a template image:
The template image contains a collection of hearts varying in size and contour properties (closed, open left and open right). Each heart in the template is a region of interest for which we know the location, size, and contour type. Our goal is to match a scanned image onto the template so that we can extract these ROIs in the scanned image. In the scanned image, some of these hearts are crossed, and they will be presented to a classifier that decides whether they are crossed or not.
Our approach
Following a tutorial on PyImageSearch, we have attempted to use ORB to find matching keypoints (code included below). This should allow us to compute a perspective transform matrix that maps the scanned image on the template image.
We have tried some preprocessing steps such as thresholding and/or blurring the scanned image. We have also tried to increase the maximum number of features as much as possible.
The problem
The method fails to work for our image set. This can be seen in the following image:
It appears that a lot of keypoints are mapped to the wrong part of the template image, so the transform matrix is not calculated correctly.
Is ORB the right technique to use here, or are there parameters of the algorithm that could be fine-tuned to improve performance? It feels like we are missing out on something simple that should make it work, but we really don't know how to go forward with this approach :).
We are trying out an alternative technique where we cross-correlate the scan with individual heart shapes. This should give an image with peaks at the heart locations. By drawing a bounding box around these peaks we hope to map that bounding box onto the bounding box of the template (I can elaborate on this upon request).
Any suggestions are greatly appreciated!
import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
# Preprocessing parameters
THRESHOLD = True
BLUR = False
# ORB parameters
MAX_FEATURES = 4048
KEEP_PERCENT = .01
SHOW_DEBUG = True
# Convert both the input image and template to grayscale
scan_file = r'scan.jpg'
template_file = r'template.jpg'
scan = cv.imread(scan_file)
template = cv.imread(template_file)
scan_gray = cv.cvtColor(scan, cv.COLOR_BGR2GRAY)
template_gray = cv.cvtColor(template, cv.COLOR_BGR2GRAY)
if THRESHOLD:
    _, scan_gray = cv.threshold(scan_gray, 127, 255, cv.THRESH_BINARY)
    _, template_gray = cv.threshold(template_gray, 127, 255, cv.THRESH_BINARY)
if BLUR:
    scan_gray = cv.blur(scan_gray, (5, 5))
    template_gray = cv.blur(template_gray, (5, 5))
# Use ORB to detect keypoints and extract (binary) local invariant features
orb = cv.ORB_create(MAX_FEATURES)
(kps_template, desc_template) = orb.detectAndCompute(template_gray, None)
(kps_scan, desc_scan) = orb.detectAndCompute(scan_gray, None)
# Match the features
#method = cv.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING
#matcher = cv.DescriptorMatcher_create(method)
#matches = matcher.match(desc_scan, desc_template)
bf = cv.BFMatcher(cv.NORM_HAMMING)
matches = bf.match(desc_scan, desc_template)
# Sort the matches by their distances
matches = sorted(matches, key = lambda x : x.distance)
# Keep only the top matches
keep = int(len(matches) * KEEP_PERCENT)
matches = matches[:keep]
if SHOW_DEBUG:
    matched_visualization = cv.drawMatches(scan, kps_scan, template, kps_template, matches, None)
    plt.imshow(matched_visualization)
Based on the clarifications provided by @it_guy, I have attempted to find all the crossed hearts using just the scanned image. I would have to try the algorithm on more images to check whether this approach generalizes.
Binarize the scanned image.
gray_image = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray_image, 180, 255, cv2.THRESH_BINARY_INV)
Perform dilation to close small gaps in the outline of the hearts and in the curves representing crosses. Note: the structuring element np.ones((1, 2), np.uint8) can be changed by running the algorithm through multiple images and finding the most suitable structuring element.
closing_original = cv2.morphologyEx(original_binary, cv2.MORPH_DILATE, np.ones((1, 2), np.uint8))
Find all the contours in the image. The contours include all hearts and the triangle at the bottom. We eliminate other contours, like dots, by placing constraints on the height and width of the contours. Further, we also use contour hierarchies to eliminate the inner contours of crossed hearts.
contours_original, hierarchy_original = cv2.findContours(closing_original, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
We iterate through each of the filtered contours.
Contour with normal heart -
Contour with crossed heart -
Let us observe the difference between these two types of hearts. If we look at the white-to-black and black-to-white pixel transitions (from top to bottom) inside the normal heart, we see that for the majority of the image columns the number of such transitions is 4 (top border: 2 transitions, bottom border: 2 transitions).
white-to-black pixel - (255, 255, 0, 0, 0)
black-to-white pixel - (0, 0, 255, 255, 255)
But in the case of the crossed heart, the number of transitions for the majority of the columns is 6, since the crossing curve / line adds two more transitions inside the heart (black-to-white first, then white-to-black). Hence, among all image columns which have at least 4 such transitions, if more than 40% of the columns have 6 transitions, then the given contour represents a crossed heart. Result -
Code -
import cv2
import numpy as np
def convert_to_binary(rgb_image):
    gray_image = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(gray_image, 180, 255, cv2.THRESH_BINARY_INV)
    return gray_image, thresh
original = cv2.imread('original.jpg')
height, width = original.shape[:2]
original_gray, original_binary = convert_to_binary(original) # Get binary image
cv2.imwrite("binary.jpg", original_binary)
closing_original = cv2.morphologyEx(original_binary, cv2.MORPH_DILATE, np.ones((1,2), np.uint8)) # Close small gaps in the binary image
cv2.imwrite("closed.jpg", closing_original)
contours_original, hierarchy_original = cv2.findContours(closing_original, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE) # Get all the contours
bounding_rects_original = [cv2.boundingRect(c) for c in contours_original] # Get all contour bounding boxes
orig_boxes = list()
all_contour_image = original.copy()
for i, (x, y, w, h) in enumerate(bounding_rects_original):
    if h > height / 2 or w > width / 2: # Eliminate extremely large contours
        continue
    if h < w / 2 or w < h / 2: # Eliminate vertical / horizontal lines
        continue
    if w * h < 200: # Eliminate small area contours
        continue
    if hierarchy_original[0][i][3] != -1: # Eliminate contours created by heart crosses
        continue
    orig_boxes.append((x, y, w, h))
    cv2.rectangle(all_contour_image, (x, y), (x + w, y + h), (0, 255, 0), 3)
# cv2.imshow("warped", closing_original)
cv2.imwrite("all_contours.jpg", all_contour_image)
final_image = original.copy()
for x, y, w, h in orig_boxes:
    cropped_image = closing_original[y - 2 : y + h + 2, x : x + w] # Get the heart binary image
    col_pixel_diffs = np.abs(np.diff(cropped_image.T.astype(np.int16)) / 255) # Obtain all consecutive pixel differences in all the columns
    column_sums = np.sum(col_pixel_diffs, axis=1) # Get the sum of each column's transitions. This results in an array of size equal
    # to the number of columns, each element representing the number of black-white and white-black transitions.
    percent_crosses = np.sum(column_sums >= 6) / np.sum(column_sums >= 4) # Percentage of columns with 6 transitions among columns with 4 transitions
    if percent_crosses > 0.4: # Crossed heart criterion
        cv2.rectangle(final_image, (x, y), (x + w, y + h), (0, 255, 0), 3)
        cv2.imwrite("crossed_heart.jpg", cropped_image)
    else:
        cv2.imwrite("normal_heart.jpg", cropped_image)
cv2.imwrite("all_crossed_hearts.jpg", final_image)
This approach can be tested on more images to find its accuracy.

How to get pixel color values from a kivy window

I would like to get the color of individual pixels on the screen at the current time with the Python library Kivy.
A pseudo-code example:
Window.Getpixel((50, 50))
Output:
rgba: 1, 0, 0, 1
Any help with this would be great.
You can use export_as_image() to get a core Image of the root of your App. Then use read_pixel(x, y) to get the pixel data.
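A minimal sketch of that idea (untested here; whether read_pixel() can access the pixel buffer of the exported image may depend on the Kivy version, and if it raises you can fall back to the screenshot approach below):
from kivy.app import App
from kivy.uix.button import Button

class PixelApp(App):
    def build(self):
        btn = Button(text='read pixel')
        btn.bind(on_release=self.read_pixel)      # read after the widget has been drawn
        return btn

    def read_pixel(self, *args):
        core_img = self.root.export_as_image()    # core Image of the root widget
        rgba = core_img.read_pixel(50, 50)        # values in the 0..1 range
        print('rgba:', rgba)

PixelApp().run()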
Based on the documentation, kivy.core.window.WindowBase.screenshot is the only function I can find that performs screen capture (which is necessary to extract a pixel). It looks like it writes to a file, so you would then read from that file and take the pixel (messy). Ideally, you'd want to use some other library (other than Kivy) such as pyscreenshot (you can use pyscreenshot.grab(bbox=(x, y, x + 1, y + 1)).getpixel((0, 0))) or one of its backends (some of which have Python wrappers), or pyscreeze (you can use pyscreeze.screenshot(region=(x, y, 1, 1)).getpixel((0, 0))).
This works for me:
from kivy.core.window import Window
from kivy.core.image import Image
# keep_data=True is required so the pixel data can be read
def getPixel(x, y):
    im = Window.screenshot()
    m = Image.load(im, keep_data=True)
    color = m.read_pixel(x, y)
    return color[0] * 255, color[1] * 255, color[2] * 255

Approximately locate an image within another

I have this image of the world:
And this image of Europe:
What technique could I use to approximately locate the image of Europe within the world map?
Template matching is a technique for finding a smaller image within a larger one, but it requires the template to be the same size as the corresponding region in the larger image. In this example OpenCV is used, but it can also be done using scikit-image.
import cv2
from imageio import imread
img = imread("https://i.stack.imgur.com/OIsWn.png", pilmode="L")
template = imread("https://i.stack.imgur.com/fkNeW.png", pilmode="L")
# threshold images for equal colours
img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)[1]
template = cv2.threshold(template, 127, 255, cv2.THRESH_BINARY)[1]
aspect_ratio = template.shape[1] / template.shape[0]
# estimate the width and compute the height based on the aspect ratio
w = 380
h = int(w / aspect_ratio)
# resize the template to match the sub-image as best as possible
template = cv2.resize(template, (w, h))
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(img, top_left, bottom_right, 127, 3)
cv2.imwrite("europe_bounding_box.png", img)
Result:
Although this example uses a predetermined estimated width, it is also possible to test a range of possible widths and determine which value results in the best match.
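A rough sketch of that multi-scale idea; it assumes the img, template and aspect_ratio variables as defined above (before the fixed-width resize), and the width range and step are arbitrary choices. TM_CCOEFF_NORMED is used here so that scores are comparable across scales:
best = None
for w in range(300, 501, 20):                           # candidate template widths
    h = int(w / aspect_ratio)
    resized = cv2.resize(template, (w, h))
    result = cv2.matchTemplate(img, resized, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if best is None or max_val > best[0]:
        best = (max_val, max_loc, (w, h))               # keep the strongest match
max_val, top_left, (w, h) = best
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(img, top_left, bottom_right, 127, 3)
cv2.imwrite("europe_bounding_box_multiscale.png", img)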

How to use readNet (or readNetFromDarknet) instead of readNetFromCaffe?

I did object detection using OpenCV by loading a pre-trained MobileNet SSD model, from this post.
It reads a video and detects objects without any problem. But I would like to use readNet (or readNetFromDarknet) instead of readNetFromCaffe
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
because I have pre-trained weights and a cfg file of my own objects only in the Darknet framework. Therefore I simply changed readNetFromCaffe into readNet in the above post and got an error:
Traceback (most recent call last):
  File "people_counter.py", line 124, in <module>
    for i in np.arange(0, detections.shape[2]):
IndexError: tuple index out of range
Here detections is an output from
blob = cv2.dnn.blobFromImage(frame, 1.0/255.0, (416, 416), True, crop=False)
net.setInput(blob)
detections = net.forward()
Its shape is (1, 1, 100, 7) (when using readNetFromCaffe).
I was kinda expecting it wouldn't work just by changing the model. Then I decided to look for an object detector code where readNet was used and I found it here. I read through the code and found the same lines as follows:
blob = cv2.dnn.blobFromImage(image, scale, (416,416), (0,0,0), True, crop=False)
net.setInput(blob)
outs = net.forward(get_output_layers(net))
Here, outs is a list with shape (1, 845, 6). But in order for me to be able to use it right away (here), outs should have the same shape as detections. I've come up to this part and have no clue about how I should proceed.
If something isn't clear, I just need help to use readNet (or readNetFromDarknet) instead of readNetFromCaffe in this post.
If we look at the code closely we can see that everything depends on the outputs of detections (line 121), and we should tweak its outputs to match them with the outs of this (line 63). After spending almost a day, I came to a reasonable (not perfect) solution. Basically, it is all about the output blobs of readNetFromCaffe and readNetFromDarknet: they output blobs with shapes 1x1xNx7 and NxC, respectively. Here the Ns are the number of detections, but with different-sized vectors: N in 1x1xNx7 is the number of detections, where every detection is a vector of values
[batchId, classId, confidence, left, top, right, bottom], while N in NxC is the number of detected objects and C is the number of classes + 5, where the first 4 numbers are [center_x, center_y, width, height] and the fifth is the box confidence. After analyzing these, we may replace lines 124-130
for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > args["confidence"]:
        idx = int(detections[0, 0, i, 1])
        if CLASSES[idx] != "person":
            continue
        box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
        (startX, startY, endX, endY) = box.astype("int")
with equivalent lines
for i in np.arange(0, detections.shape[0]):
    scores = detections[i][5:]
    classId = np.argmax(scores)
    confidence = scores[classId]
    if confidence > args["confidence"]:
        idx = int(classId)
        if CLASSES[idx] != "person":
            continue
        center_x = int(detections[i][0] * 416)
        center_y = int(detections[i][1] * 416)
        width = int(detections[i][2] * 416)
        height = int(detections[i][3] * 416)
        left = int(center_x - width / 2)
        top = int(center_y - height / 2)
        right = width + left - 1
        bottom = height + top - 1
        box = [left, top, right, bottom]
        (startX, startY, endX, endY) = box
This way we can keep track of the "person" class using Darknet's cfg and weights files and count people up/down against the visualization line.
Again, there might be simpler ways of handling the detections from a Darknet weights file, but this works for this particular case.
A reference:
more about the blobs output by readNetFromCaffe and readNetFromDarknet
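For completeness, a hedged sketch of how the Darknet side is typically loaded and run, matching the NxC output format described above (the cfg/weights/image paths and the 0.5 threshold are placeholders):
import cv2
import numpy as np
net = cv2.dnn.readNetFromDarknet('yolo.cfg', 'yolo.weights')   # placeholder files
frame = cv2.imread('frame.jpg')                                # placeholder image
blob = cv2.dnn.blobFromImage(frame, 1.0 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
# One output blob per YOLO head; each row is [center_x, center_y, w, h, box_conf, class scores...]
outs = net.forward(net.getUnconnectedOutLayersNames())
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        if confidence > 0.5:
            print(class_id, confidence)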

How to find corners on an image using OpenCV

I'm trying to find the corners on an image. I don't need the contours, only the 4 corners. I will change the perspective using those 4 corners.
I'm using OpenCV, but I need to know the steps to find the corners and which functions to use.
My images will be like this (without the red points; I will paint the points afterwards):
EDITED:
After the suggested steps, I wrote this code. (Note: I'm not using pure OpenCV, I'm using JavaCV, but the logic is the same.)
// Load two images and allocate other structures (I´m using other image)
IplImage colored = cvLoadImage(
"res/scanteste.jpg",
CV_LOAD_IMAGE_UNCHANGED);
IplImage gray = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);
IplImage smooth = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);
//Step 1 - Convert from RGB to grayscale (cvCvtColor)
cvCvtColor(colored, gray, CV_RGB2GRAY);
//2 Smooth (cvSmooth)
cvSmooth( gray, smooth, CV_BLUR, 9, 9, 2, 2);
//3 - cvThreshold - What values?
cvThreshold(gray,gray, 155, 255, CV_THRESH_BINARY);
//4 - Detect edges (cvCanny) -What values?
int N = 7;
int aperature_size = N;
double lowThresh = 20;
double highThresh = 40;
cvCanny( gray, gray, lowThresh*N*N, highThresh*N*N, aperature_size );
//5 - Find contours (cvFindContours)
int total = 0;
CvSeq contour2 = new CvSeq(null);
CvMemStorage storage2 = cvCreateMemStorage(0);
CvMemStorage storageHull = cvCreateMemStorage(0);
total = cvFindContours(gray, storage2, contour2, Loader.sizeof(CvContour.class), CV_RETR_CCOMP, CV_CHAIN_APPROX_NONE);
if(total > 1){
    while (contour2 != null && !contour2.isNull()) {
        if (contour2.elem_size() > 0) {
            //6 - Approximate contours with linear features (cvApproxPoly)
            CvSeq points = cvApproxPoly(contour2, Loader.sizeof(CvContour.class), storage2, CV_POLY_APPROX_DP, cvContourPerimeter(contour2) * 0.005, 0);
            cvDrawContours(gray, points, CvScalar.BLUE, CvScalar.BLUE, -1, 1, CV_AA);
        }
        contour2 = contour2.h_next();
    }
}
So, I want to find the corners, but I don't know how to use corner functions like cvCornerHarris and others.
First, check out /samples/c/squares.c in your OpenCV distribution. This example provides a square detector, and it should be a pretty good start on how to detect corner-like features. Then, take a look at OpenCV's feature-oriented functions like cvCornerHarris() and cvGoodFeaturesToTrack().
The above methods can return many corner-like features - most will not be the "true corners" you are looking for. In my application, I had to detect squares that had been rotated or skewed (due to perspective). My detection pipeline consisted of:
Convert from RGB to grayscale (cvCvtColor)
Smooth (cvSmooth)
Threshold (cvThreshold)
Detect edges (cvCanny)
Find contours (cvFindContours)
Approximate contours with linear features (cvApproxPoly)
Find "rectangles", i.e. structures that: had polygonalized contours possessing 4 points, were of sufficient area, had adjacent edges at roughly 90 degrees, had a sufficient distance between "opposite" vertices, etc.
Step 7 was necessary because a slightly noisy image can yield many structures that appear rectangular after polygonalization. In my application, I also had to deal with square-like structures that appeared within, or overlapped the desired square. I found the contour's area property and center of gravity to be helpful in discerning the proper rectangle.
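A rough modern-Python equivalent of that pipeline (this is not the original code; the threshold values and the area criterion are illustrative assumptions, and the OpenCV 4 findContours return signature is assumed):
import cv2
import numpy as np
img = cv2.imread('scan.jpg')                                  # placeholder input
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                  # 1. grayscale
blur = cv2.GaussianBlur(gray, (9, 9), 0)                      # 2. smooth
_, thresh = cv2.threshold(blur, 155, 255, cv2.THRESH_BINARY)  # 3. threshold
edges = cv2.Canny(thresh, 20, 40)                             # 4. edges
contours, _ = cv2.findContours(edges, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)  # 5. contours
candidates = []
for c in contours:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)           # 6. polygonal approximation
    # 7. keep 4-point, sufficiently large, convex polygons as rectangle candidates
    if len(approx) == 4 and cv2.contourArea(approx) > 1000 and cv2.isContourConvex(approx):
        candidates.append(approx.reshape(4, 2))
print(len(candidates), 'rectangle candidate(s) found')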
At first glance, to a human eye there are 4 corners. But in computer vision, a corner is considered to be a point whose neighborhood shows a large gradient change in intensity. The neighborhood can be a 4-pixel neighborhood or an 8-pixel neighborhood.
The equation provided for finding the gradient of intensity considers a 4-pixel neighborhood; SEE DOCUMENTATION.
Here is my approach for the image in question. I have the code in python as well:
import os
import cv2
import numpy as np

path = r'C:\Users\selwyn77\Desktop\Stack\corner'
filename = 'env.jpg'
img = cv2.imread(os.path.join(path, filename))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    #--- convert to grayscale
It is a good idea to blur the image first, to suppress weak gradient changes and preserve the more intense ones. I opted for the bilateral filter, which, unlike the Gaussian filter, doesn't blur all the pixels in the neighborhood; it rather blurs pixels that have an intensity similar to that of the central pixel. In short, it preserves edges/corners of high gradient change but blurs regions with minimal gradient change.
bi = cv2.bilateralFilter(gray, 5, 75, 75)
cv2.imshow('bi',bi)
To a human it is not so much of a difference compared to the original image. But it does matter. Now finding possible corners:
dst = cv2.cornerHarris(bi, 2, 3, 0.04)
dst is an array (with the same 2D shape as the image) containing the corner response values (computed from the eigenvalues) given by the final equation mentioned HERE.
Now a threshold has to be applied to select those corners beyond a certain value. I will use the one in the documentation:
#--- create a black image to see where those corners occur ---
mask = np.zeros_like(gray)
#--- applying a threshold and turning those pixels above the threshold to white ---
mask[dst>0.01*dst.max()] = 255
cv2.imshow('mask', mask)
The white pixels are regions of possible corners. You can find many corners neighboring each other.
To draw the selected corners on the image:
img[dst > 0.01 * dst.max()] = [0, 0, 255] #--- [0, 0, 255] --> Red ---
cv2.imshow('dst', img)
(Red colored pixels are the corners, not so visible)
In order to get an array of all pixels with corners:
coordinates = np.argwhere(mask)
UPDATE
The variable coordinates is an array of arrays. Converting it to a list of lists:
coor_list = [l.tolist() for l in list(coordinates)]
Converting the above to list of tuples
coor_tuples = [tuple(l) for l in coor_list]
I have an easy and rather naive way to find the 4 corners. I simply calculated the distance of each corner to every other corner. I preserved those corners whose distance exceeded a certain threshold.
Here is the code:
import math

thresh = 50

def distance(pt1, pt2):
    (x1, y1), (x2, y2) = pt1, pt2
    dist = math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
    return dist

coor_tuples_copy = coor_tuples
i = 1
for pt1 in coor_tuples:
    print(' I :', i)
    for pt2 in coor_tuples[i::1]:
        print(pt1, pt2)
        print('Distance :', distance(pt1, pt2))
        if distance(pt1, pt2) < thresh:
            coor_tuples_copy.remove(pt2)
    i += 1
Prior to running the snippet above coor_tuples had all corner points:
[(4, 42),
(4, 43),
(5, 43),
(5, 44),
(6, 44),
(7, 219),
(133, 36),
(133, 37),
(133, 38),
(134, 37),
(135, 224),
(135, 225),
(136, 225),
(136, 226),
(137, 225),
(137, 226),
(137, 227),
(138, 226)]
After running the snippet I was left with 4 corners:
[(4, 42), (7, 219), (133, 36), (135, 224)]
UPDATE 2
Now all you have to do is just mark these 4 points on a copy of the original image.
img2 = img.copy()
for pt in coor_tuples:
    cv2.circle(img2, tuple(reversed(pt)), 3, (0, 0, 255), -1)
cv2.imshow('Image with 4 corners', img2)
Here's an implementation using cv2.goodFeaturesToTrack() to detect corners. The approach is
Convert image to grayscale
Perform canny edge detection
Detect corners
Optionally perform 4-point perspective transform to get top-down view of image
Using this starting image,
After converting to grayscale, we perform canny edge detection
Now that we have a decent binary image, we can use cv2.goodFeaturesToTrack()
corners = cv2.goodFeaturesToTrack(canny, 4, 0.5, 50)
For the parameters, we give it the canny image, set the maximum number of corners to 4 (maxCorners), use a minimum accepted quality of 0.5 (qualityLevel), and set the minimum possible Euclidean distance between the returned corners to 50 (minDistance). Here's the result
Now that we have identified the corners, we can perform a 4-point perspective transform to obtain a top-down view of the object. We first order the points clockwise then draw the result onto a mask.
Note: We could have just found contours on the Canny image instead of doing this step to create the mask, but pretend we only had the 4 corner points to work with
Next we find contours on this mask and filter using cv2.arcLength() and cv2.approxPolyDP(). The idea is that if the contour has 4 points, then it must be our object. Once we have this contour, we perform a perspective transform
Finally we rotate the image depending on the desired orientation. Here's the result
Code for only detecting corners
import cv2
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)
corners = cv2.goodFeaturesToTrack(canny,4,0.5,50)
for corner in corners:
    x, y = corner.ravel()
    cv2.circle(image, (int(x), int(y)), 5, (36, 255, 12), -1)
cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.waitKey()
Code for detecting corners and performing perspective transform
import cv2
import numpy as np
def rotate_image(image, angle):
    # Grab the dimensions of the image and then determine the center
    (h, w) = image.shape[:2]
    (cX, cY) = (w / 2, h / 2)
    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    # Compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    # Adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH))
def order_points_clockwise(pts):
    # sort the points based on their x-coordinates
    xSorted = pts[np.argsort(pts[:, 0]), :]
    # grab the left-most and right-most points from the sorted
    # x-coordinate points
    leftMost = xSorted[:2, :]
    rightMost = xSorted[2:, :]
    # now, sort the left-most coordinates according to their
    # y-coordinates so we can grab the top-left and bottom-left
    # points, respectively
    leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
    (tl, bl) = leftMost
    # now, sort the right-most coordinates according to their
    # y-coordinates so we can grab the top-right and bottom-right
    # points, respectively
    rightMost = rightMost[np.argsort(rightMost[:, 1]), :]
    (tr, br) = rightMost
    # return the coordinates in top-left, top-right,
    # bottom-right, and bottom-left order
    return np.array([tl, tr, br, bl], dtype="int32")
def perspective_transform(image, corners):
    def order_corner_points(corners):
        # Separate corners into individual points
        # Index 0 - top-right
        #       1 - top-left
        #       2 - bottom-left
        #       3 - bottom-right
        corners = [(corner[0][0], corner[0][1]) for corner in corners]
        top_r, top_l, bottom_l, bottom_r = corners[0], corners[1], corners[2], corners[3]
        return (top_l, top_r, bottom_r, bottom_l)

    # Order points in clockwise order
    ordered_corners = order_corner_points(corners)
    top_l, top_r, bottom_r, bottom_l = ordered_corners

    # Determine width of new image which is the max distance between
    # (bottom right and bottom left) or (top right and top left) x-coordinates
    width_A = np.sqrt(((bottom_r[0] - bottom_l[0]) ** 2) + ((bottom_r[1] - bottom_l[1]) ** 2))
    width_B = np.sqrt(((top_r[0] - top_l[0]) ** 2) + ((top_r[1] - top_l[1]) ** 2))
    width = max(int(width_A), int(width_B))

    # Determine height of new image which is the max distance between
    # (top right and bottom right) or (top left and bottom left) y-coordinates
    height_A = np.sqrt(((top_r[0] - bottom_r[0]) ** 2) + ((top_r[1] - bottom_r[1]) ** 2))
    height_B = np.sqrt(((top_l[0] - bottom_l[0]) ** 2) + ((top_l[1] - bottom_l[1]) ** 2))
    height = max(int(height_A), int(height_B))

    # Construct new points to obtain top-down view of image in
    # top_r, top_l, bottom_l, bottom_r order
    dimensions = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1],
                           [0, height - 1]], dtype="float32")

    # Convert to Numpy format
    ordered_corners = np.array(ordered_corners, dtype="float32")

    # Find perspective transform matrix
    matrix = cv2.getPerspectiveTransform(ordered_corners, dimensions)

    # Return the transformed image
    return cv2.warpPerspective(image, matrix, (width, height))
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)
corners = cv2.goodFeaturesToTrack(canny,4,0.5,50)
c_list = []
for corner in corners:
    x, y = corner.ravel()
    c_list.append([int(x), int(y)])
    cv2.circle(image, (int(x), int(y)), 5, (36, 255, 12), -1)
corner_points = np.array([c_list[0], c_list[1], c_list[2], c_list[3]])
ordered_corner_points = order_points_clockwise(corner_points)
mask = np.zeros(image.shape, dtype=np.uint8)
cv2.fillPoly(mask, [ordered_corner_points], (255,255,255))
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        transformed = perspective_transform(original, approx)
        result = rotate_image(transformed, -90)
cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.imshow('mask', mask)
cv2.imshow('transformed', transformed)
cv2.imshow('result', result)
cv2.waitKey()
Find contours with the RETR_EXTERNAL option (gray -> Gaussian filter -> Canny edge -> find contours).
Find the largest contour - this will be the edge of the rectangle.
Find the corners with a little calculation.
Mat m; // image file
findContours(m, contours_, hierachy_, RETR_EXTERNAL, CHAIN_APPROX_NONE);
auto it = max_element(contours_.begin(), contours_.end(),
                      [](const vector<Point> &a, const vector<Point> &b) {
                          return a.size() < b.size(); });
Point2f xy[4] = {{9000, 9000}, {0, 1000}, {1000, 0}, {0, 0}};
for (auto &[x, y] : *it) {
    if (x + y < xy[0].x + xy[0].y) xy[0] = {x, y};
    if (x - y > xy[1].x - xy[1].y) xy[1] = {x, y};
    if (y - x > xy[2].y - xy[2].x) xy[2] = {x, y};
    if (x + y > xy[3].x + xy[3].y) xy[3] = {x, y};
}
The array xy will then hold the four corners.
I was able to extract four corners this way.
Apply Hough lines to the Canny image - you will get a list of points.
Apply convex hull to this set of points.
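A hedged sketch of that suggestion in Python (the Hough thresholds and gap parameters are guesses, and it assumes canny is an edge image as in the snippets above):
import cv2
import numpy as np
lines = cv2.HoughLinesP(canny, 1, np.pi / 180, threshold=50,
                        minLineLength=30, maxLineGap=10)
# Collect both endpoints of every detected line segment (assumes lines were found)
points = np.array([[x, y] for line in lines
                   for x, y in (line[0][:2], line[0][2:])], dtype=np.int32)
hull = cv2.convexHull(points)                     # the hull vertices approximate the outline
approx = cv2.approxPolyDP(hull, 0.02 * cv2.arcLength(hull, True), True)
print(approx.reshape(-1, 2))                      # ideally the 4 corners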

Resources