Landscape image to portrait with opencv

I have a method that does some processing on an IplImage, and it works as it should if the image is 640x480 pixels. But if it is 480x640 pixels, it doesn't... because the image needs to be rotated to become 640x480 first, and afterwards I need to rotate it back to 480x640, or translate the coordinates returned by cvHaarDetectObjects back to 480x640.
Can anybody tell me how I can do this?
thanks!!

Try transpose followed by flip. The flip is needed because transpose leaves a mirrored image compared to the result of a rotation. If the algorithm can work with the mirrored image directly, I would recommend simply flipping the coordinate values of the detection result, rather than flipping the input image.
(Disclaimer: I haven't tried transpose or flip on multi-channel images)
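For illustration, a minimal sketch of the transpose-then-flip idea using the C++ cv::Mat API (the question uses the older IplImage C API, where cvTranspose and cvFlip are the equivalents):
#include <opencv2/core/core.hpp>

// Rotate a portrait 480x640 image into a landscape 640x480 one.
// transpose swaps rows and columns but leaves the content mirrored,
// so a flip around the vertical axis (flipCode = 1) is needed to get a true 90-degree rotation.
cv::Mat rotate90Clockwise(const cv::Mat &src)
{
    cv::Mat dst;
    cv::transpose(src, dst);
    cv::flip(dst, dst, 1);
    return dst;
}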

Related

OpenCV, dlib landmarks rotation

I am new to OpenCV and dlib, and I am not sure if my design is correct. I want to write a C++ face detector for an Android phone which should detect faces at different phone orientations and rotation angles. Let's say the phone orientation is portrait or landscape. I am using OpenCV to rotate/edit the image and dlib to detect faces. The dlib shape predictor is initialized with shape_predictor_68_face_landmarks.dat, and it can only detect a face in the correct phone orientation (meaning that if I rotate the phone by 90 degrees it cannot detect the face).
To make detection possible I read the axes from the accelerometer and rotate the source image to the correct orientation before sending it to the dlib face detector, and then it detects fine. But the output coordinates in the dlib::full_object_detection shape of course match the rotated picture, not the original, so I have to convert (rotate) the landmarks back to the original image.
Is there any existing API in dlib or OpenCV that makes it possible to rotate the landmarks (dlib::full_object_detection) by a specified angle? It would be good if you could provide an example.
For iPhone apps, EXIF data in images captured using iPhone cameras can be used to rotate images first. But I can't guarantee this for Android phones.
In most practical situations, it is easier to rotate the image and perform face detection again when face detection in the original image does not return any results (or returns strange results like very small faces). I have seen this done in several Android apps, and have used it myself on a couple of projects.
As I understand it, you want to rotate the detected landmarks back to the coordinate system of the original image. If so, you can use getRotationMatrix2D and transform to rotate the list of points.
For example:
Your image was rotated 90 degrees to the right around the center point (the middle of the image), so now you need to rotate the landmark points back by -90 degrees around the same center point. The code is:
// the center of rotation (the middle point of the image)
cv::Point2f center(width / 2.0f, height / 2.0f);
// getRotationMatrix2D expects the angle in degrees
// (if your angle is in radians, convert with angle * 180.0 / M_PI);
// in your case it is -90 degrees
double theta_deg = -90.0;
// get the 2x3 affine matrix that rotates by theta_deg around center
cv::Mat rotateMatrix = cv::getRotationMatrix2D(center, theta_deg, 1.0);
// the vectors holding the landmark points
std::vector<cv::Point2f> inputLandmark;
std::vector<cv::Point2f> outputLandmark;
// apply the same rotation matrix to the points with cv::transform
cv::transform(inputLandmark, outputLandmark, rotateMatrix);
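To fill inputLandmark from the dlib detection, something along these lines should work (a sketch assuming the usual dlib accessors num_parts() and part(i) on dlib::full_object_detection, with shape being the detection result):
// copy the detected landmark points into the vector that cv::transform expects
for (unsigned long i = 0; i < shape.num_parts(); ++i)
    inputLandmark.push_back(cv::Point2f(shape.part(i).x(), shape.part(i).y()));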

Selecting ROI for high resolution images with Qt and Opencv

I am working on a project that involves selecting an ROI from a high resolution image (around 5187x3268). Right now I am using findContours in OpenCV to detect a round object (since Hough circles is rather slow for high resolution images). The problem is that, due to the large amount of texture on the object, findContours is sometimes erroneous.
What I am doing now is showing the user what findContours has detected in a Qt window and letting them decide whether the detection is correct. If it is, the user presses an Ok button; if not, a No, let me select button is pressed.
Whenever the user presses No, let me select, the application starts capturing mouse events and displays a rectangle using QRubberBand. I am using a QLabel to display the image; since my screen size is 1920x1080, I have to resize the image to a smaller resolution (let's say 1537x1280, so that it leaves some space for buttons).
I am using OpenCV's resize to downscale the image:
width = myImageDisplayer.width()
height = myImageDisplayer.height()
# cv2.resize expects dsize as (width, height) and the interpolation flag as a keyword argument
resizedImage = cv2.resize(myImage, (width, height), interpolation=cv2.INTER_LINEAR)
I am using ratios to calculate the size reduction like this:
# images are numpy arrays, so height is shape[0] and width is shape[1]
xReduction = originalImage.shape[1] / resizedImage.shape[1]
yReduction = originalImage.shape[0] / resizedImage.shape[0]
and multiplying the event.pos() coordinates by the ratios to get the corresponding coordinates in the original image:
xrealCoordinates = event.pos().x() * xReduction
yrealCoordinates = event.pos().y() * yReduction
Since the coordinates will be floats, I am rounding them off. The problem is that in rounding off the float values I am losing precision.
Precision is important since I need to recalculate the principal coordinates (obtained by calibrating the stereo setup) after selecting the ROI from the images.
How does OpenCV calculate the original image coordinates correctly after resizing?
I noticed this when I opened the same image using imshow: if I drag my mouse I can see the original image coordinates, even though the image has been resized to fit the screen.
If anybody can help me with this issue, I will be thankful.

How to blend 80x60 thermal and 640x480 RGB image?

How do I blend two images - thermal(80x60) and RGB(640x480) efficiently?
If I scale the thermal image up to 640x480 it doesn't scale evenly and doesn't have enough quality to do any processing on. Any ideas would be really helpful.
RGB image - http://postimg.org/image/66f9hnaj1/
Thermal image - http://postimg.org/image/6g1oxbm5n/
If you scale the resolution of the thermal image up by a factor of 8 and use Bilinear Interpolation you should get a smoother, less-blocky result.
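For example, a minimal OpenCV sketch of that 8x bilinear upscale (the file names are placeholders):
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    cv::Mat thermal = cv::imread("thermal.png");   // 80x60 input
    cv::Mat big;
    // scale up by a factor of 8 (80x60 -> 640x480) with bilinear interpolation
    cv::resize(thermal, big, cv::Size(), 8.0, 8.0, cv::INTER_LINEAR);
    cv::imwrite("thermal_640x480.png", big);
    return 0;
}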
When combining satellite images of different resolutions (I talk about satellite imagery because that is my speciality), you would normally use the highest resolution imagery as the Lightness, or L, channel to give you apparent resolution and detail in the shapes, because the human eye is good at detecting contrast, and then use the lower resolution imagery to fill in the Hue and Saturation, or a and b, channels to give you the colour graduations you are hoping to see.
So, in concrete terms, I would consider converting the RGB to Lab or HSL colourspace and retaining the L channel. Then take the thermal image, up-res it by 8 using bilinear interpolation, use the result as the a, b, H or S channel, and maybe fill in the remaining channel with the one from the RGB that has the most variance. Then convert the result back to RGB for a false-colour image. It is hard to tell without seeing the images or knowing what you are hoping to find in them. But in general terms, that would be my approach. HTH.
Note: Given that the a channel of Lab colourspace controls the red/green relationship, I would probably try putting the thermal data in that channel so it tends to show more red the "hotter" the thermal channel is.
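A rough OpenCV sketch of that channel swap, assuming the two images are already aligned and using placeholder file names:
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    cv::Mat rgb     = cv::imread("rgb.png");         // 640x480 colour image
    cv::Mat thermal = cv::imread("thermal.png", 0);  // 80x60 thermal, loaded as grayscale

    // up-res the thermal image to the RGB size with bilinear interpolation
    cv::Mat thermalBig;
    cv::resize(thermal, thermalBig, rgb.size(), 0, 0, cv::INTER_LINEAR);

    // convert the RGB image to Lab and split its channels
    cv::Mat lab;
    cv::cvtColor(rgb, lab, CV_BGR2Lab);
    std::vector<cv::Mat> channels;
    cv::split(lab, channels);

    // keep L from the RGB image, put the thermal data into the a channel
    channels[1] = thermalBig;

    // merge and convert back for a false-colour result
    cv::merge(channels, lab);
    cv::Mat falseColour;
    cv::cvtColor(lab, falseColour, CV_Lab2BGR);
    cv::imwrite("falsecolour.png", falseColour);
    return 0;
}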
Updated Answer
Ok, now I can see your images and you have a couple more problems... firstly the images are not aligned, or registered, with each other which is not going to help - try using a tripod ;-) Secondly, your RGB image is very poorly exposed so it is not really going to contribute that much detail - especially in the shadows - to the combined image.
So, firstly, I used ImageMagick at the commandline to up-size the thermal image like this:
convert thermal.png -resize 640x480 thermal.png
Then, I used Photoshop to do a crude alignment/registration. If you want to try this, the easiest way is to put the two images into separate layers of the same document and set the Blending mode of the upper layer to Difference. Then use the Move Tool (shortcut v) to move the upper image around till the screen goes black which means that the details are on top of each other and when subtracted they come to zero, i.e. black. Then crop so the images are aligned and turn off one layer and save, then turn that layer back on and the other layer off and save again.
Now, I used ImageMagick again to separate the two images into Lab layers:
convert bigthermalaligned.png -colorspace Lab -separate thermal.png
convert rgbaligned.png -colorspace Lab -separate rgb.png
which gives me
thermal-0.png => L channel
thermal-1.png => a channel
thermal-2.png => b channel
rgb-0.png => L channel
rgb-1.png => a channel
rgb-2.png => b channel
Now I can take the L channel of the RGB image and the a and b channels of the thermal image and put them together:
convert rgb-0.png thermal-1.png thermal-2.png -normalize -set colorspace Lab -combine result.png
And you get this monstrosity! Obviously you can play around with the channels and colourspaces and a tripod and proper exposures, but you should be able to see that some of the details of the RGB image - especially the curtains on the left, the lights, the camera on the cellphone and the label on the water bottle - have come through into the final image.
Assuming that the images were not captured using a single camera, you need to note that the two cameras may have different parameters. Also, if it's two cameras, they are probably not located in the same world position (offset).
In order to resolve this, you need to get the intrinsic calibration matrix of each of the cameras, and find the offset between them.
Then, you can find a transformation between a pixel in one camera and the other. Unfortunately, if you don't have any depth information about the scene, the most you can do with the calibration matrix is get a ray direction from the camera position to the world.
The easy approach would be to ignore the offset (assuming the scene is not too close to the camera), and just transform the pixel.
p2=K2*(K1^-1 * p1)
Using this you can construct a new image that is a composite of both.
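For instance, a minimal sketch of that per-pixel mapping in OpenCV, treating p1 as a homogeneous pixel coordinate (K1 and K2 are the two 3x3 intrinsic matrices, assumed here to be CV_64F):
// map a pixel from camera 1 into camera 2, ignoring the offset between the cameras
cv::Point2d mapPixel(const cv::Mat &K1, const cv::Mat &K2, const cv::Point2d &p1)
{
    // homogeneous pixel coordinate (u, v, 1)
    cv::Mat p = (cv::Mat_<double>(3, 1) << p1.x, p1.y, 1.0);
    // p2 = K2 * (K1^-1 * p1)
    cv::Mat p2 = K2 * (K1.inv() * p);
    // divide by the third component to get back to pixel coordinates
    return cv::Point2d(p2.at<double>(0) / p2.at<double>(2),
                       p2.at<double>(1) / p2.at<double>(2));
}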
The more difficult approach would be to reconstruct the 3D structure of the scene by finding features that you can match between both images, and then triangulate the point with both rays.

Stretch region of image through opencv or opengl in iOS

I am trying to make a double chin in the fattened image, as shown in my desired result image below.
I have morphed the normal face into a fat face by wrapping the image onto a mesh and deforming the mesh.
Original image
Wrapped image on mesh grid with vertex points displaced
Current result image
I have tried a lot of mesh point arrangements but could not get a result like the one shown in the first image.
Any ideas how to achieve this with OpenGL or OpenCV in iOS?
It's obvious from the first image that there is an added effect to produce the double or triple chin.
This actually looks like either a preset image blended into the original, or a scaled and stretched version of the original chin blended into the warped image.

OpenCV cvRemap Cropping Image

I am very new to OpenCV (2.1), so please keep that in mind.
I managed to calibrate the cheap web camera I am using (with a wide angle attachment), using the checkerboard calibration method to produce the intrinsic matrix and distortion coefficients.
I then have no trouble feeding these values back in and producing image maps, which I then apply to a video feed to correct the incoming images.
I run into an issue, however. I know that when it is warping/correcting the image, it creates several skewed sections and then formats the image to crop out any black areas. My question then is: can I view the complete warped image, including the regions that have black areas? Below is an example of the black regions with skewed sections I was trying to convey, in case my terminology was off:
An image better conveying the regions I am talking about can be found here! This image was discovered in this post.
Currently: The cvRemap() returns basically the yellow box in the image linked above, but I want to see the whole image as there is relevant data I am looking to get out of it.
What I've tried: Applying a scale conversion to the image map to fit the complete image (including stretched parts) into frame
CvMat *intrinsic = (CvMat*)cvLoad( "Intrinsics.xml" );
CvMat *distortion = (CvMat*)cvLoad( "Distortion.xml" );
cvInitUndistortMap( intrinsic, distortion, mapx, mapy );
cvConvertScale(mapx, mapx, 1.25, -shift_x); // Some sort of scale conversion
cvConvertScale(mapy, mapy, 1.25, -shift_y); // applied to the image map
cvRemap(distorted,undistorted,mapx,mapy);
The cvConvertScale, when I think I have aligned the x/y shift correctly (by guessing and checking), is somehow distorting the image map, making the correction useless. There might be some math involved here that I am not correctly following/understanding.
Does anyone have any other suggestions to solve this problem, or an idea of what I might be doing wrong? I've also tried writing my own code to fix the distortion, but let's just say OpenCV already knows how to do it well.
From memory, you need to use InitUndistortRectifyMap(cameraMatrix,distCoeffs,R,newCameraMatrix,map1,map2), of which InitUndistortMap is a simplified version.
cvInitUndistortMap( intrinsic, distort, map1, map2 )
is equivalent to:
cvInitUndistortRectifyMap( intrinsic, distort, identity matrix, intrinsic, map1, map2 )
The new parameters are R and newCameraMatrix. R specifies an additional transformation (e.g. rotation) to perform (just set it to the identity matrix).
The parameter of interest to you is newCameraMatrix. In InitUndistortMap this is the same as the original camera matrix, but you can use it to get that scaling effect you're talking about.
You get the new camera matrix with GetOptimalNewCameraMatrix(cameraMat, distCoeffs, imageSize, alpha,...). You basically feed in intrinsic, distort, your original image size, and a parameter alpha (along with containers to hold the result matrix, see documentation). The parameter alpha will achieve what you want.
I quote from the documentation:
The function computes the optimal new camera matrix based on the free
scaling parameter. By varying this parameter the user may retrieve
only sensible pixels alpha=0, keep all the original image pixels if
there is valuable information in the corners alpha=1, or get something
in between. When alpha>0, the undistortion result will likely have
some black pixels corresponding to “virtual” pixels outside of the
captured distorted image. The original camera matrix, distortion
coefficients, the computed new camera matrix and the newImageSize
should be passed to InitUndistortRectifyMap to produce the maps for
Remap.
So for the extreme example with all the black bits showing you want alpha=1.
In summary:
call cvGetOptimalNewCameraMatrix with alpha=1 to obtain newCameraMatrix.
use cvInitUndistortRectifyMap with R set to the identity matrix and newCameraMatrix set to the one you just calculated.
feed the new maps into cvRemap, as sketched below.
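A rough sketch of those three steps using the C++ API (the question uses the 2.1 C API; cv::getOptimalNewCameraMatrix, cv::initUndistortRectifyMap and cv::remap are the equivalents, and the calibration matrices are assumed to be loaded already):
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/calib3d/calib3d.hpp>

// Undistort while keeping the "stretched" border regions visible.
// intrinsic and distortion are the loaded calibration results; distorted is a camera frame.
cv::Mat undistortKeepAll(const cv::Mat &intrinsic, const cv::Mat &distortion,
                         const cv::Mat &distorted)
{
    cv::Size size = distorted.size();

    // alpha = 1 keeps every original pixel, so the black "virtual" regions remain visible
    cv::Mat newCameraMatrix =
        cv::getOptimalNewCameraMatrix(intrinsic, distortion, size, 1.0);

    // an empty Mat for R means the identity (no extra rotation); build the remap tables
    cv::Mat map1, map2;
    cv::initUndistortRectifyMap(intrinsic, distortion, cv::Mat(),
                                newCameraMatrix, size, CV_32FC1, map1, map2);

    cv::Mat undistorted;
    cv::remap(distorted, undistorted, map1, map2, cv::INTER_LINEAR);
    return undistorted;
}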
