I already do rectification with my own code. Now I am trying to make cv2.stereoRectify work.
Suppose I have the following code:
import numpy as np
import cv2
img1 = cv2.imread(IMG_LEFT) # Load image left
img2 = cv2.imread(IMG_RIGHT) # Load image right
A1 = np.array(A1) # Left camera matrix intrinsic
A2 = np.array(A2) # Right camera matrix intrinsic
RT1 = np.array(RT1) # Left camera extrinsic (3x4)
RT2 = np.array(RT2) # Right camera extrinsic (3x4)
# Original projection matrices
Po1 = A1.dot( RT1 )
Po2 = A2.dot( RT2 )
# Camera centers (world coord.)
C1 = -np.linalg.inv(Po1[:,:3]).dot(Po1[:,3])
C2 = -np.linalg.inv(Po2[:,:3]).dot(Po2[:,3])
# Transformations
T1to2 = C2 - C1 # Translation from first to second camera
R1to2 = RT2[:,:3].dot(np.linalg.inv(RT1[:,:3])) # Rotation from first to second camera (3x3)
Then, I would like to find the rectification transformations (3x3). Following the OpenCV documentation I am trying:
Rectify1, Rectify2, Pn1, Pn2, _, _, _ = cv2.stereoRectify(A1, np.zeros((1,5)), A2, np.zeros((1,5)), (img1.shape[1], img1.shape[0]), R1to2, T1to2, alpha=-1 )
mapL1, mapL2 = cv2.initUndistortRectifyMap(A1, np.zeros((1,5)), Rectify1, Pn1, (img1.shape[1], img1.shape[0]), cv2.CV_32FC1)
mapR1, mapR2 = cv2.initUndistortRectifyMap(A2, np.zeros((1,5)), Rectify2, Pn2, (img2.shape[1], img2.shape[0]), cv2.CV_32FC1)
img1_rect = cv2.remap(img1, mapL1, mapL2, cv2.INTER_LINEAR)
img2_rect = cv2.remap(img2, mapR1, mapR2, cv2.INTER_LINEAR)
However, the resulting images are badly distorted and clearly not rectified. What am I doing wrong?
I suspect the problem lies in the rotations/translations, but I cannot figure it out.
Also, isn't OpenCV a bit overcomplicated here? This should be a simple operation.
Many thanks.
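One thing worth checking in the code above: the OpenCV documentation describes the R and T arguments of cv2.stereoRectify as the rotation and translation from the first camera's coordinate system to the second's. Assuming RT1 and RT2 map world points into each camera frame (x_cam = R X + t), that translation is t2 - R t1, which equals R2 (C1 - C2) and not C2 - C1. The sketch below is only an illustration under that assumption; the helper names rot_y and relative_pose and the toy extrinsics are made up for the example.

```python
import numpy as np

def rot_y(deg):
    """Rotation about the y axis (hypothetical helper for the toy data)."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def relative_pose(RT1, RT2):
    """Return (R, T) such that x2 = R @ x1 + T.

    Assumes each 3x4 extrinsic maps world to camera: x_cam = R_i @ X + t_i.
    """
    R1, t1 = RT1[:, :3], RT1[:, 3]
    R2, t2 = RT2[:, :3], RT2[:, 3]
    R = R2 @ R1.T          # R1 is orthonormal, so inv(R1) == R1.T
    T = t2 - R @ t1
    return R, T

# Toy extrinsics (assumed values, not from the question)
RT1 = np.hstack([rot_y(-5.0), np.array([[0.1], [0.2], [0.3]])])
RT2 = np.hstack([rot_y(10.0), np.array([[-0.5], [0.0], [0.1]])])
R, T = relative_pose(RT1, RT2)

# Check on one world point: both routes must give the same camera-2 coordinates
X_world = np.array([0.4, -0.3, 5.0])
x1 = RT1[:, :3] @ X_world + RT1[:, 3]
x2 = RT2[:, :3] @ X_world + RT2[:, 3]

# Same translation written with the camera centres
C1 = -RT1[:, :3].T @ RT1[:, 3]
C2 = -RT2[:, :3].T @ RT2[:, 3]
T_from_centres = RT2[:, :3] @ (C1 - C2)
```

If the extrinsics follow the opposite convention (camera to world), the formulas change accordingly, so it is worth verifying with a known point as done here.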
EDIT:
You may notice that I set the distortion parameters to zero. That is because I am using computer-generated stereo images that have no lens distortion.
Digging into the OpenCV documentation, I found the reason why stereoRectify() does not seem to work.
Most research papers refer to a homography that is applied to the image.
OpenCV, instead, computes the rotation in object space, as is (shyly) explained in the cv2.initUndistortRectifyMap() documentation.
So after calling:
R1, R2, Pn1, Pn2, _, _, _ = cv2.stereoRectify(A1, np.zeros((1,5)), A2, np.zeros((1,5)), (img1.shape[1], img1.shape[0]), R1to2, T1to2, alpha=-1 )
To get the rectification transformations I use:
Rectify1 = R1.dot(np.linalg.inv(A1))
Rectify2 = R2.dot(np.linalg.inv(A2))
Here R1 and R2 are the transformations returned by OpenCV, and A1 and A2 are the 3x3 camera matrices (intrinsics).
It seems to work just fine. Please comment if you have a better idea.
I hope this is useful to someone.
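For comparison, here is a small sketch of how an image-space homography relates to the object-space rotation that OpenCV returns. It assumes zero distortion, a 3x3 intrinsic matrix for the original camera, and takes the new intrinsics from the left 3x3 block of the projection matrix returned by stereoRectify. The function name rectifying_homography and the toy numbers are assumptions for illustration; note that this form also includes the new camera matrix, which the expression above leaves out.

```python
import numpy as np

def rectifying_homography(K_old, R_rect, P_new):
    """Homography taking original pixels to rectified pixels.

    Assumes no lens distortion. K_old is the original 3x3 intrinsic matrix,
    R_rect the 3x3 rectification rotation, P_new the 3x4 rectified projection.
    """
    K_new = np.asarray(P_new, dtype=float)[:, :3]
    return K_new @ np.asarray(R_rect, dtype=float) @ np.linalg.inv(K_old)

# Toy values (assumed, only to check the algebra)
K_old = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R_rect = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
P_new = np.hstack([K_old, np.zeros((3, 1))])

H = rectifying_homography(K_old, R_rect, P_new)

# A point in the original camera frame, projected both ways
X_cam = np.array([0.3, -0.2, 4.0])
p_old = K_old @ X_cam                       # homogeneous pixel, original image
p_new_direct = P_new[:, :3] @ (R_rect @ X_cam)
p_new_via_H = H @ p_old
```

With a homography like this, cv2.warpPerspective(img, H, size) could be used in place of the remap pair, again assuming there is no distortion to undo.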
Greetings,
I have been trying to extract some regions from a face, in this case the upper lip, using Dlib. After extracting the ROI (which looks perfect), I realized that there is some noise around it.
I can't figure out what I'm doing wrong or how to resolve this issue.
This is the Python code I used:
import cv2
import numpy as np
import dlib
import os
from scipy import ndimage, misc
import time
def extract_index_nparray(nparray):
    index = None
    for num in nparray[0]:
        index = num
        break
    return index
img = cv2.imread( 'input_facial_image.jpg')
img=cv2.resize(img,(512,512))
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
mask = np.zeros_like(img_gray)
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("/facial-landmarks-recognition/shape_predictor_68_face_landmarks.dat")
# Face 1
faces = detector(img_gray)
for face in faces:
    landmarks = predictor(img_gray, face)
    landmarks_points = []
    for n in [48,49,50,51,52,53,54,64,63,62,61,60]:
        x = landmarks.part(n).x
        y = landmarks.part(n).y
        landmarks_points.append((x, y))
    points = np.array(landmarks_points, np.int32)
    convexhull = cv2.convexHull(points)
    # cv2.polylines(img, [convexhull], True, (255, 0, 0), 3)
    cv2.fillConvexPoly(mask, convexhull, 255)
    face_image_1 = cv2.bitwise_or(img, img, mask=mask)
cv2.imwrite('extracted_lips.jpg', face_image_1 )
The extracted image looks like this:
[image: extracted upper lip]
In later steps of my work I noticed noise around the upper lip. On closer inspection, this is what I found:
[image: unclean_upperlip]
Is there any way to get rid of the noise while extracting the ROI, or any image processing technique to work around this issue?
Thanks in advance.
For anyone who faces the same issue: the fix is simple. Just change the output format to PNG. JPEG compression is lossy, and its artifacts are the noise seen here.
I hope this helps.
I have been working on some code where an image is given, as shown.
I have to place this knife onto another image. The condition is that I have to crop the knife along its outline, not as a rectangular box.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('2.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
img_blur = cv2.bilateralFilter(img, d = 7,
                               sigmaSpace = 75, sigmaColor =75)
img_gray = cv2.cvtColor(img_blur, cv2.COLOR_RGB2GRAY)
a = img_gray.max()
_, thresh = cv2.threshold(img_gray, a/2+60, a,cv2.THRESH_BINARY_INV)
plt.imshow(thresh, cmap = 'gray')
contours, hierarchy = cv2.findContours(
    image = thresh,
    mode = cv2.RETR_TREE,
    method = cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key = cv2.contourArea, reverse = True)
img_copy = img.copy()
final = cv2.drawContours(img_copy, contours, contourIdx = -1,
                         color = (255, 0, 0), thickness = 2)
plt.imshow(img_copy)
This is what I have tried but it doesn't seem to work well.
Input
Output
You can do this by starting from a bounding box and using the snake (active contour) algorithm with a balloon force added.
The snake algorithm is defined so that it minimizes three energies: continuity, curvature, and gradient. The first two (together called the internal energy) are minimized when the points on the curve are pulled closer together, i.e. when the curve contracts. Expanding the curve increases this energy, so the snake algorithm does not allow it.
But the original algorithm, proposed in 1987, has a few problems. One of them is that in flat areas (where the gradient is zero) the algorithm fails to converge and does nothing. Several modifications have been proposed to solve this problem. The one of interest here is the balloon force, proposed by L. D. Cohen in 1989.
Balloon force guides the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.
Another improvement is Morphological Snakes, which use morphological operators (such as dilation or erosion) over a binary array instead of solving PDEs over a floating-point array, which is the standard approach for active contours. This makes Morphological Snakes faster and numerically more stable than their traditional counterparts.
Scikit-image's implementation that combines these two improvements is morphological_geodesic_active_contour. It has a balloon parameter.
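To make the idea concrete, here is a heavily simplified NumPy-only sketch of just the balloon step with a negative balloon value: the binary level set is eroded, but only where the edge-stopping image is above a threshold, so the contour shrinks through flat areas and stops at strong edges. It is loosely modeled on the morphological approach and is not scikit-image's implementation (there is no smoothing or edge-attraction step); the names erode3x3 and shrink_with_balloon and the synthetic square are assumptions for illustration.

```python
import numpy as np

def erode3x3(u):
    """Binary erosion with a 3x3 square; pixels outside the image count as 0."""
    padded = np.pad(u, 1, mode="constant", constant_values=False)
    out = np.ones_like(u)
    h, w = u.shape
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def shrink_with_balloon(g, init, threshold=0.7, balloon=-1.0, num_iter=30):
    """Erode the level set only where the edge-stopping image g is 'flat'."""
    u = init.copy()
    flat = g > threshold / abs(balloon)   # where the balloon force may act
    for _ in range(num_iter):
        eroded = erode3x3(u)
        u[flat] = eroded[flat]
    return u

# Synthetic image: a bright 10x10 square on a dark background
image = np.zeros((40, 40))
image[15:25, 15:25] = 1.0
obj = image > 0.5

# Edge-stopping function: close to 0 on edges, 1 in flat regions
gy, gx = np.gradient(image)
g = 1.0 / np.sqrt(1.0 + 100.0 * np.hypot(gx, gy))

# Start from a bounding box and let it contract
init = np.zeros(image.shape, dtype=bool)
init[5:35, 5:35] = True
final = shrink_with_balloon(g, init)
print("pixels inside contour: start", int(init.sum()), "end", int(final.sum()))
```

The box contracts until it hugs the square, keeping a thin rim where the gradient is strong. A larger |balloon| lowers the effective threshold, which is why stronger balloon values can push the contour past weaker edges.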
Using your image:
import numpy as np
import matplotlib.pyplot as plt
from skimage.segmentation import morphological_geodesic_active_contour, inverse_gaussian_gradient
from skimage.color import rgb2gray
from skimage.util import img_as_float
from PIL import Image, ImageDraw
im = Image.open('knife.jpg')
im = np.array(im)
im = rgb2gray(im)
im = img_as_float(im)
plt.imshow(im, cmap='gray')
Now let us create a function which will help us to store iterations:
def store_evolution_in(lst):
    """Returns a callback function to store the evolution of the level sets in
    the given list.
    """
    def _store(x):
        lst.append(np.copy(x))
    return _store
This method needs the image to be preprocessed to highlight the contours. This can be done using the function inverse_gaussian_gradient, although the user might want to define their own version. The quality of the MorphGAC segmentation depends greatly on this preprocessing step.
gimage = inverse_gaussian_gradient(im)
Below we define our starting point - a bounding box.
init_ls = np.zeros(im.shape, dtype=np.int8)
init_ls[200:-400, 20:-30] = 1
A list with intermediate results for plotting the evolution:
evolution = []
callback = store_evolution_in(evolution)
The key call to morphological_geodesic_active_contour, with balloon contraction, is:
ls = morphological_geodesic_active_contour(gimage, 200, init_ls,
                                           smoothing=1, balloon=-0.75,
                                           threshold=0.7,
                                           iter_callback=callback)
Now let us plot the results:
fig, axes = plt.subplots(1, 2, figsize=(8, 8))
ax = axes.flatten()
ax[0].imshow(im, cmap="gray")
ax[0].set_axis_off()
ax[0].contour(ls, [0.5], colors='b')
ax[0].set_title("Morphological GAC segmentation", fontsize=12)
ax[1].imshow(ls, cmap="gray")
ax[1].set_axis_off()
contour = ax[1].contour(evolution[0], [0.5], colors='r')
contour.collections[0].set_label("Starting Contour")
contour = ax[1].contour(evolution[25], [0.5], colors='g')
contour.collections[0].set_label("Iteration 25")
contour = ax[1].contour(evolution[-1], [0.5], colors='b')
contour.collections[0].set_label("Last Iteration")
ax[1].legend(loc="upper right")
title = "Morphological GAC Curve evolution"
ax[1].set_title(title, fontsize=12)
plt.show()
With a stronger balloon force you can also isolate just the blade of the knife:
ls = morphological_geodesic_active_contour(gimage, 100, init_ls,
                                           smoothing=1, balloon=-1,
                                           threshold=0.7,
                                           iter_callback=callback)
Play with the smoothing, balloon, and threshold parameters to get the curve you want.
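Once a binary level set such as ls is available, placing the object onto another image along its outline comes down to copying only the masked pixels. Below is a minimal NumPy sketch of that step; the function name paste_with_mask and the tiny synthetic arrays are assumptions for illustration, and it assumes the pasted region fits entirely inside the background.

```python
import numpy as np

def paste_with_mask(foreground, mask, background, top_left=(0, 0)):
    """Copy foreground pixels where mask is non-zero onto a copy of background.

    foreground: (h, w, 3) array, mask: (h, w) array, background: (H, W, 3) array.
    Assumes the (h, w) region starting at top_left lies inside the background.
    """
    out = background.copy()
    y0, x0 = top_left
    h, w = mask.shape
    region = out[y0:y0 + h, x0:x0 + w]   # view into out, so writes land in out
    keep = mask.astype(bool)
    region[keep] = foreground[keep]
    return out

# Tiny synthetic example: a 4x4 "object" whose mask is its central 2x2 block
fg = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
bg = np.zeros((10, 10, 3), dtype=np.uint8)

composite = paste_with_mask(fg, mask, bg, top_left=(3, 3))
```

With the knife example, foreground would be the colour image and mask the final ls array, both cropped to the same region if desired.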
I have 2 cameras and want to calculate the disparity between them.
The translation between the cameras is mainly in the z-direction (i.e. "into" the image, along the viewing direction), with a small component in the x- and y-directions, for example (0.2, 0.2, 0.8).
When I now calculate the rectification parameters with the stereoRectify() method, the output images are just black. Using other translation vectors works just fine, but the results are wrong of course.
Why is it like that and how can I solve the problem?
Edit: These translation values result in both rectified images black. Other (wrong) translation vectors work just fine. Changing the value of alpha doesn't change much.
rotation_quat = Quaternion(0.999999913938509, 0.00029714546497339216, -0.00011465939948083866, 0.0002658585515330323)
rotation = rotation_quat.rotation_matrix.astype(np.float64)
translation = np.array([0.2,0.2,0.8]).astype(np.float64)
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(intrinsicMatrix1, distCoeffs1, intrinsicMatrix2, distCoeffs2, img1.shape[::-1], rotation, translation, alpha=1)
map11, map12 = cv2.initUndistortRectifyMap(intrinsicMatrix1, distCoeffs1, R1, P1, img1.shape[::-1], cv2.CV_32FC1)
map21, map22 = cv2.initUndistortRectifyMap(intrinsicMatrix2, distCoeffs2, R2, P2, img2.shape[::-1], cv2.CV_32FC1)
# rectify
img1_rect = cv2.remap(img1, map11, map12, cv2.INTER_LANCZOS4)
img2_rect = cv2.remap(img2, map21, map22, cv2.INTER_LANCZOS4)
cv2.imshow("img1_rect", img1_rect)
cv2.waitKey(0)
cv2.imshow("img2_rect", img2_rect)
cv2.waitKey(0)
How can I calculate distance from camera to a point on a ground plane from an image?
I have the intrinsic parameters of the camera and the position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in ground XYZ coordinates, and you want to find the values of X, Y, Z and lambda where the ray intersects the ground plane. But on the ground plane Zg = 0, so you have only 3 unknowns (Xg, Yg and lambda). Solve for them (recover lambda from the third equation, then substitute it into the first two), and you get the Xg and Yg of the solution to your problem. The distance you are after is then the length of Pg - Cg.
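The steps above can be sketched in a few lines of NumPy. The axis conventions here are assumptions, not something fixed by the derivation: ground frame with X right, Y forward, Z up; camera frame with x right, y down, z forward; pitch measured as a downward tilt of the optical axis; camera directly above the ground origin. The function names and the sample intrinsics are likewise made up for illustration.

```python
import numpy as np

def camera_pose_from_height_pitch(height, pitch_rad):
    """Build R, t with Pc = R @ Pg + t (assumed axis conventions, see text)."""
    s, c = np.sin(pitch_rad), np.cos(pitch_rad)
    # Rows are the camera x, y, z axes expressed in ground coordinates
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, -s, -c],
                  [0.0, c, -s]])
    Cg = np.array([0.0, 0.0, height])   # camera centre in ground coordinates
    t = -R @ Cg
    return R, t

def pixel_to_ground(u, v, K, R, t):
    """Intersect the back-projected ray of pixel (u, v) with the plane Zg = 0."""
    Cg = -R.T @ t
    ray_g = R.T @ (np.linalg.inv(K) @ np.array([u, v, 1.0]))  # Qg - Cg
    if ray_g[2] >= 0:
        # Assumes the camera is above the ground, so the ray must point down
        raise ValueError("ray does not hit the ground in front of the camera")
    lam = -Cg[2] / ray_g[2]             # from the third (Z) equation
    return Cg + lam * ray_g, Cg

# Example with assumed values: camera 1.5 m high, pitched 30 degrees down
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = camera_pose_from_height_pitch(1.5, np.deg2rad(30.0))
Pg, Cg = pixel_to_ground(320.0, 240.0, K, R, t)   # the principal point

slant_distance = np.linalg.norm(Pg - Cg)          # camera to ground point
ground_distance = np.linalg.norm(Pg[:2] - Cg[:2]) # measured along the ground
```

For the principal point the ray is the optical axis, so the slant distance should equal height / sin(pitch) and the ground distance height / tan(pitch), which makes a convenient sanity check. If the lens has noticeable distortion, the pixel would need to be undistorted first.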
I am trying to detect foreground motion using opencv2 by removing (mostly) static BG elements. The method I am using is based on taking the mean of a series of images, which represents the background, then calculating one standard deviation above and below that mean, and using that as a window to detect foreground motion.
This mechanism reportedly works well for moderately noisy environments like waving trees in the BG.
The desired output is a mask that can be used in a subsequent operation so as to minimise further processing. Specifically, I am going to use optical flow detection within that region.
cv2 has made this much easier, and the code is much simpler to read and understand. Thanks cv2 and numpy.
But I am having difficulty doing the correct FG detection.
Ideally I also want to erode/dilate the BG mean so as to eliminate 1-pixel noise.
The code is all together, so you have a number of frames at the start (BGsample) to gather the BG data before FG detection starts. The only dependencies are opencv2 (> 2.3.1) and numpy (which should be included in opencv > 2.3.1).
import cv2
import numpy as np
if __name__ == '__main__':
    cap = cv2.VideoCapture(0) # webcam
    cv2.namedWindow("input")
    cv2.namedWindow("sig2")
    cv2.namedWindow("detect")
    BGsample = 20 # number of frames to gather BG samples from at start of capture
    success, img = cap.read()
    width = int(cap.get(3))  # cap.get returns floats; np.zeros needs ints
    height = int(cap.get(4))
    # can also use img.shape[:2] # cut off extra channels
    if success:
        acc = np.zeros((height, width), np.float32) # 32 bit accumulator
        sqacc = np.zeros((height, width), np.float32) # 32 bit accumulator
        for i in range(20): a = cap.read() # dummy to warm up sensor
        # gather BG samples
        for i in range(BGsample):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            cv2.accumulate(frame, acc)
            cv2.accumulateSquare(frame, sqacc)
        #
        M = acc/float(BGsample)
        sqaccM = sqacc/float(BGsample)
        M2 = M*M
        sig2 = sqaccM-M2
        # have BG samples now
        # start FG detection
        key = -1
        while(key < 0):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            #Ideally we create a mask for future use that is B/W for FG objects
            # (using erode or dilate to remove noise)
            # this isn't quite right
            level = M+sig2-frame
            grey = cv2.morphologyEx(level, cv2.MORPH_DILATE,
                                    cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3)), iterations=2)
            cv2.imshow("input", frame)
            cv2.imshow("sig2", sig2/60)
            cv2.imshow("detect", grey/20)
            key = cv2.waitKey(1)
    cv2.destroyAllWindows()
I don't think you need to manually compute the mean and standard deviation; use cv2.meanStdDev instead. In the code below, I'm using your average background matrix computed from
M = acc/float(BGsample)
So, now we can compute the mean and standard deviation of the average background image, and finally inRange is used to pull out the range that you wanted (i.e., the mean +/- 1 standard deviation).
(mu, sigma) = cv2.meanStdDev(M)
fg = cv2.inRange(M, (mu[0] - sigma[0]), (mu[0] + sigma[0]))
# proceed with morphological clean-up here...
Hope that helps!
My best guess so far: using detectmin and detectmax to coerce the floating-point sigma into 8-bit grayscale for cv2.inRange to use.
It seems to work OK, but I was hoping for better: there are plenty of holes in valid FG data.
I suppose it would work better in RGB instead of grayscale.
I can't get noise reduction using dilate or erode to work.
Any improvements?
import cv2
import numpy as np
if __name__ == '__main__':
    cap = cv2.VideoCapture(1)
    cv2.namedWindow("input")
    #cv2.namedWindow("sig2")
    cv2.namedWindow("detect")
    BGsample = 20 # number of frames to gather BG samples from at start of capture
    success, img = cap.read()
    width = int(cap.get(3))  # cap.get returns floats; np.zeros needs ints
    height = int(cap.get(4))
    if success:
        acc = np.zeros((height, width), np.float32) # 32 bit accumulator
        sqacc = np.zeros((height, width), np.float32) # 32 bit accumulator
        for i in range(20): a = cap.read() # dummy to warm up sensor
        # gather BG samples
        for i in range(BGsample):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            cv2.accumulate(frame, acc)
            cv2.accumulateSquare(frame, sqacc)
        #
        M = acc/float(BGsample)
        sqaccM = sqacc/float(BGsample)
        M2 = M*M
        sig2 = sqaccM-M2
        # have BG samples now
        # calculate upper and lower bounds of detection window around mean.
        # coerce into 8bit image space for cv2.inRange compare
        detectmin = cv2.convertScaleAbs(M-sig2)
        detectmax = cv2.convertScaleAbs(M+sig2)
        # start FG detection
        key = -1
        while(key < 0):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            level = cv2.inRange(frame, detectmin, detectmax)
            cv2.imshow("input", frame)
            #cv2.imshow("sig2", M/200)
            cv2.imshow("detect", level)
            key = cv2.waitKey(1)
    cv2.destroyAllWindows()
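A few observations that may explain the holes, offered as a sketch and not a definitive fix. The quantity sig2 above is the per-pixel variance, so the standard deviation is its square root. cv2.inRange marks pixels that lie inside the window, i.e. the background, so a foreground mask is its inverse. Also, cv2.meanStdDev returns a single mean and standard deviation over the whole array (per channel), not a per-pixel model. The NumPy-only sketch below builds the per-pixel model on synthetic frames; the function names, the k multiplier, and the min_sigma floor are assumptions introduced for the example.

```python
import numpy as np

def fit_background(frames):
    """Per-pixel mean and standard deviation over a list of grayscale frames."""
    stack = np.asarray(frames, dtype=np.float32)
    return stack.mean(axis=0), stack.std(axis=0)

def foreground_mask(frame, mean, sigma, k=1.0, min_sigma=2.0):
    """255 where the frame deviates from the mean by more than k * sigma.

    min_sigma is an assumed floor so that very quiet pixels do not trigger
    on tiny changes.
    """
    sigma = np.maximum(sigma, min_sigma)
    diff = np.abs(frame.astype(np.float32) - mean)
    return (diff > k * sigma).astype(np.uint8) * 255

# Synthetic data standing in for the webcam: noisy flat background
rng = np.random.default_rng(0)

def noisy_background():
    return np.clip(100 + 3 * rng.standard_normal((48, 64)), 0, 255).astype(np.uint8)

background = [noisy_background() for _ in range(20)]
mean, sigma = fit_background(background)

# New frame with a bright "object" in it
frame = noisy_background()
frame[10:20, 10:20] = 200

mask = foreground_mask(frame, mean, sigma, k=3.0)

outside = np.ones(mask.shape, dtype=bool)
outside[10:20, 10:20] = False
false_positive_rate = float((mask[outside] == 255).mean())
```

With k=1 roughly a third of pure-noise pixels would be expected to fall outside the window, which may be part of why the result looks speckled; a larger k trades sensitivity for fewer false detections. Because the mask is already 8-bit, something like cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel) could then be tried for removing single-pixel noise.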