I am currently working on a program which should take an LDR images and multiply certain pixel in the image, so that their pixel value would exceed the normal 0-255 (0-1) pixel value boundary. The program i have written can do so, but I am not able to write the image file, as the imwrite() in OpenCV clambs the values back in the range of 0-255 (0-1)
if they are bigger than 255.
Is there anybody there who knows how to write a floating point image with pixel values bigger than 255 (1)
My code looks like this
Mat ApplySunValue(Mat InputImg)
{
Mat Image1 = imread("/****/.jpg",CV_LOAD_IMAGE_COLOR);
Mat outPutImage;
Image1.convertTo(Image1, CV_32FC3);
for(int x = 0; x < InputImg.cols; x++){
for(int y = 0; y < InputImg.rows; y++){
float blue = Image1.at<Vec3f>(y,x)[0] /255.0f;
float green = Image1.at<Vec3f>(y,x)[1] /255.0f;
float red = Image1.at<Vec3f>(y,x)[2] /255.0f ;
Image1.at<Vec3f>(y,x)[0] = blue;
Image1.at<Vec3f>(y,x)[1] = green;
Image1.at<Vec3f>(y,x)[2] = red;
int pixelValue = InputImg.at<uchar>(y,x);
if(pixelValue > 254){
Image1.at<Vec3f>(y,x)[0] = blue * SunMultiplyer;
Image1.at<Vec3f>(y,x)[1] = green * SunMultiplyer;
Image1.at<Vec3f>(y,x)[2] = red * SunMultiplyer;
}
}
}
imwrite("/****/Nice.TIFF", Image1 * 255);
namedWindow("Hej",CV_WINDOW_AUTOSIZE);
imshow("hej", Image1);
return InputImg;
}
For storage purposes, the following is more memory efficient than the XML / YAML alternative (due to the use of a binary format):
// Save the image data in binary format
std::ofstream os(<filepath>,std::ios::out|std::ios::trunc|std::ios::binary);
os << (int)image.rows << " " << (int)image.cols << " " << (int)image.type() << " ";
os.write((char*)image.data,image.step.p[0]*image.rows);
os.close();
You can then load the image as follows:
// Load the image data from binary format
std::ifstream is(<filepath>,std::ios::in|std::ios::binary);
if(!is.is_open())
return false;
int rows,cols,type;
is >> rows; is.ignore(1);
is >> cols; is.ignore(1);
is >> type; is.ignore(1);
cv::Mat image;
image.create(rows,cols,type);
is.read((char*)image.data,image.step.p[0]*image.rows);
is.close();
For instance, without compression, a 1920x1200 floating-point three-channel image takes 26 MB when stored in binary format, whereas it takes 129 MB when stored in YML format. This size difference also has an impact on runtime since the number of accesses to the hard drive are very different.
Now, if what you want is to visualize your HDR image, you have no choice but to convert it to LDR. This is called "tone-mapping" (Wikipedia entry).
As far as I know, when opencv writes using imwrite, it writes in the format supported by the image container, and this by default is 255.
However, if you just want to save the data, you might consider writing the Mat object to an xml/yaml file.
//Writing
cv::FileStorage fs;
fs.open(filename, cv::FileStorage::WRITE);
fs<<"Nice"<<Image1;
//Reading
fs.open(filename, cv::FileStorage::READ);
fs["Nice"]>>Image1;
fs.release(); //Very Important
Related
I am working on a OCR project, and in the preprocessing, some RED stamps need to be removed, so that the text near the stamps could be detected. I try a lot of methods(like change the values of pixel, threshold in Red channel) but fail.
Any suggestions are highly appreciated.
Python, C++, Java or what? Since you didn't state the OpenCV implementation you are using, I'm giving my answer in C++.
An option is to use the HSV color space to filter out the range of red values that defines the seal. My approach is to use the CMYK color space to filter everything except the black (or dark) text. It should do a pretty good job on printed media, which is your case.
//read input image:
std::string imageName = "C://opencvImages//seal.png";
cv::Mat imageInput = cv::imread( imageName );
Now, perform the CMYK conversion. OpenCV does not support this operation out of the box, bear with me as I provide the helper function at the end of this post.
//CMYK conversion:
std::vector<cv::Mat> cmyk;
cmyk = rgb2cmyk( imageInput );
//This is the Black channel:
cv::Mat blackChannel = cmyk[3].clone();
This is the image of the black channel; it is nice how everything that is not black (or dark) practically disappears!
Now, optionally, enhance the result applying brightness and contrast adjustment. Just try to separate the text from the background a little bit better; we want some defined pixel distributions to get a nice binary image.
//Brightness and contrast adjustment:
float alpha = 2.0;
float beta = -50.0;
contrastBrightnessAdjustment( blackChannel, alpha, beta );
Again, OpenCV does not offer brightness and contrast adjustment out of the box; however, its implementation is very easy. Hold on a little bit, and let me show you the result of this operation:
Nice. Let's Otsu-threshold this bad boy to get a nice binary image containing the clean text:
cv::threshold( blackChannel, binaryImage ,0, 255, cv::THRESH_OTSU );
This is what you get:
Now, the RGB to CMYK conversion function. I'm using the following implementation. The function receives an RGB image and returns a vector containing each of the CMYK channels
std::vector<cv::Mat> rgb2cmyk( cv::Mat& inputImage ){
std::vector<cv::Mat> cmyk;
for (int i = 0; i < 4; i++) {
cmyk.push_back( cv::Mat( inputImage.size(), CV_8UC1 ) );
}
std::vector<cv::Mat> inputRGB;
cv::split( inputImage, inputRGB );
for (int i = 0; i < inputImage.rows; i++)
{
for (int j = 0; j < inputImage.cols; j++)
{
float r = (int)inputRGB[2].at<uchar>(i, j) / 255.;
float g = (int)inputRGB[1].at<uchar>(i, j) / 255.;
float b = (int)inputRGB[0].at<uchar>(i, j) / 255.;
float k = std::min(std::min(1-r, 1-g), 1-b);
cmyk[0].at<uchar>(i, j) = (1 - r - k) / (1 - k) * 255.;
cmyk[1].at<uchar>(i, j) = (1 - g - k) / (1 - k) * 255.;
cmyk[2].at<uchar>(i, j) = (1 - b - k) / (1 - k) * 255.;
cmyk[3].at<uchar>(i, j) = k * 255.;
}
}
return cmyk;
}
And the contrastBrightnessAdjustment function is this, implemented using pointer arithmetic. The function receives a grayscale image and applies the linear transformation via the alpha and beta parameters:
void contrastBrightnessAdjustment( cv::Mat inputImage, float alpha, int beta ){
cv::MatIterator_<cv::Vec3b> it, end;
for (it = inputImage.begin<cv::Vec3b>(), end = inputImage.end<cv::Vec3b>(); it != end; ++it) {
uchar &pixel = (*it)[0];
pixel = cv::saturate_cast<uchar>(alpha*pixel+beta);
}
}
I have two images that I am subtracting from one another quite simply:
Mat foo, a, b;
...//imread onto a and b or somesuch
foo = a - b;
Now, as I understand it, any pixel value that goes into the negatives (or over 255 for that matter) will be set to zero instead. If that is so, I'd like to know if there is any way to permit it to go under zero so that I may adjust the image later without information loss.
I'm working with greyscale images if that simplifies things.
This is how a simple convert => substract => convertAndScaleBack application would look like:
input:
and
int main()
{
cv::Mat input = cv::imread("../inputData/Lenna.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat input2 = cv::imread("../inputData/Lenna_edges.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat input1_16S;
cv::Mat input2_16S;
input.convertTo(input1_16S, CV_16SC1);
input2.convertTo(input2_16S, CV_16SC1);
// compute difference of 16 bit signed images
cv::Mat diffImage = input1_16S-input2_16S;
// now you have a 16S image that has some negative values
// find minimum and maximum values:
double min, max;
cv::minMaxLoc(diffImage, &min, &max);
std::cout << "min pixel value: " << min<< std::endl;
cv::Mat backConverted;
// scale the pixel values so that the smalles value is 0 and the largest one is 255
diffImage.convertTo(backConverted,CV_8UC1, 255.0/(max-min), -min);
cv::imshow("backConverted", backConverted);
cv::waitKey(0);
}
output:
I've noticed that using convertTo to convert a matrix from 32-bit to 16-bit "rounds" number to the upper boud. So, values bigger than 0x0000FFFF in the source matrix will be set as 0xFFFF in the destination matrix.
What I want for my application is instead to mask the values, setting in the destination just the 2 LSB of the values.
Here is an example:
Mat mat32;
Mat mat16;
mat32 = Mat(2,2,CV_32SC1);
for(int y = 0; y < 2; y++)
for(int x = 0; x < 2; x++)
mat32.at<unsigned int>(cv::Point(x,y)) = 0x0000FFFE + (y*2+x);
mat32.convertTo(mat16, CV_16UC1);
The matrixes have these values:
32 bits matrix:
0000FFFE 0000FFFF
00010000 00010001
16 bits matrix:
0000FFFE 0000FFFF
0000FFFF 0000FFFF
In the second row of 16-bit matrix I want to have
00000000 00000001
I can do this by scanning the source matrix value-by-value and masking the values, but the performances are low.
Is there an OpenCV function that does this?
Thanks to everyone!
MIX
This can be done, but this requires a somewhat dirty trick, so it is up to you to use this approach or not. So this is how it can be done:
For this example lets create 1000x1000 32-bit matrix and set all its values to 65541 (=256*256+5). So after the conversion we expect to have a matrix filled with fives.
Mat M1(1000, 1000, CV_32S, Scalar(65541));
And here is the trick:
Mat M2(1000, 1000, CV_16SC2, M1.data);
We created matrix M2 over the same memory buffer as M1, but M2 'think' that this is a buffer of 2-channel 16-bit image. Now the last thing to do is to copy the channel you need to the place you need. This can be done by split() or mixChannels() functions. For example:
Mat M3(1000, 1000, CV_16S);
int fromto[] = {0,0};
Mat inpu[] = {M2}, outpu[] = {M3};
mixChannels(inpu, 1, outpu, 1, fromto, 1);
cout << M3.at<short>(10,10) << endl;
Ye I know that the format of mixChannels looks weird and makes the code even less readable, but it works... If you prefer split() function:
vector<Mat> v;
split(M2,v);
cout << v[0].at<short>(10,10) << " " << v[1].at<short>(10,10) << endl;
There is no OpenCV function (that I know of) which does the conversion like you want, so either you code it yourself or like you said you go through a masking step first to remove the 16 high bits.
The mask can be applied using the bitwise_and in C++ or cvAndS in C. See here.
You could also have made your hand-written code more efficient. In general, you should avoid OpenCV pixel accessors in loops because they have bad performance. I don't have an OpenCV install at hand so this could be slighlty off -- the idea is to use the data field directly, and step which is the number of bytes per row:
for(int y = 0; y < mat32.height; ++) {
int* row = (int*)( (char*)mat32.data + y * mat32.step);
for(int x = 0; x < mat32.step/ 4)
row[x] &= 0xffff;
Then, once the mask is applied, all values fit in 16 bits, and convertTo will just truncate the 16 upper bits.
The other solution is to code the conversion by hand:
mat16.resize( mat32.size() );
for(int y = 0; y < mat32.height; ++) {
const int* row32 = (const int*)( (char*)mat32.data + y * mat32.step);
short* row16 = (short*) ( (char*)mat16.data + y * mat16.step);
for(int x = 0; x < mat32.step/ 4)
row16[x] = short(row32[x]);
I would like to detect this pattern
As you can see it's basically the letter C, inside another, with different orientations. My pattern can have multiple C's inside one another, the one I'm posting with 2 C's is just a sample. I would like to detect how many C's there are, and the orientation of each one. For now I've managed to detect the center of such pattern, basically I've managed to detect the center of the innermost C. Could you please provide me with any ideas about different algorithms I could use?
And here we go! A high level overview of this approach can be described as the sequential execution of the following steps:
Load the input image;
Convert it to grayscale;
Threshold it to generate a binary image;
Use the binary image to find contours;
Fill each area of contours with a different color (so we can extract each letter separately);
Create a mask for each letter found to isolate them in separate images;
Crop the images to the smallest possible size;
Figure out the center of the image;
Figure out the width of the letter's border to identify the exact center of the border;
Scan along the border (in a circular fashion) for discontinuity;
Figure out an approximate angle for the discontinuity, thus identifying the amount of rotation of the letter.
I don't want to get into too much detail since I'm sharing the source code, so feel free to test and change it in any way you like.
Let's start, Winter Is Coming:
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
cv::RNG rng(12345);
float PI = std::atan(1) * 4;
void isolate_object(const cv::Mat& input, cv::Mat& output)
{
if (input.channels() != 1)
{
std::cout << "isolate_object: !!! input must be grayscale" << std::endl;
return;
}
// Store the set of points in the image before assembling the bounding box
std::vector<cv::Point> points;
cv::Mat_<uchar>::const_iterator it = input.begin<uchar>();
cv::Mat_<uchar>::const_iterator end = input.end<uchar>();
for (; it != end; ++it)
{
if (*it) points.push_back(it.pos());
}
// Compute minimal bounding box
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
// Set Region of Interest to the area defined by the box
cv::Rect roi;
roi.x = box.center.x - (box.size.width / 2);
roi.y = box.center.y - (box.size.height / 2);
roi.width = box.size.width;
roi.height = box.size.height;
// Crop the original image to the defined ROI
output = input(roi);
}
For more details on the implementation of isolate_object() please check this thread. cv::RNG is used later on to fill each contour with a different color, and PI, well... you know PI.
int main(int argc, char* argv[])
{
// Load input (colored, 3-channel, BGR)
cv::Mat input = cv::imread("test.jpg");
if (input.empty())
{
std::cout << "!!! Failed imread() #1" << std::endl;
return -1;
}
// Convert colored image to grayscale
cv::Mat gray;
cv::cvtColor(input, gray, CV_BGR2GRAY);
// Execute a threshold operation to get a binary image from the grayscale
cv::Mat binary;
cv::threshold(gray, binary, 128, 255, cv::THRESH_BINARY);
The binary image looks exactly like the input because it only had 2 colors (B&W):
// Find the contours of the C's in the thresholded image
std::vector<std::vector<cv::Point> > contours;
cv::findContours(binary, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Fill the contours found with unique colors to isolate them later
cv::Mat colored_contours = input.clone();
std::vector<cv::Scalar> fill_colors;
for (size_t i = 0; i < contours.size(); i++)
{
std::vector<cv::Point> cnt = contours[i];
double area = cv::contourArea(cv::Mat(cnt));
//std::cout << "* Area: " << area << std::endl;
// Fill each C found with a different color.
// If the area is larger than 100k it's probably the white background, so we ignore it.
if (area > 10000 && area < 100000)
{
cv::Scalar color = cv::Scalar(rng.uniform(0, 255), rng.uniform(0,255), rng.uniform(0,255));
cv::drawContours(colored_contours, contours, i, color,
CV_FILLED, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
fill_colors.push_back(color);
//cv::imwrite("test_contours.jpg", colored_contours);
}
}
What colored_contours looks like:
// Create a mask for each C found to isolate them from each other
for (int i = 0; i < fill_colors.size(); i++)
{
// After inRange() single_color_mask stores a single C letter
cv::Mat single_color_mask = cv::Mat::zeros(input.size(), CV_8UC1);
cv::inRange(colored_contours, fill_colors[i], fill_colors[i], single_color_mask);
//cv::imwrite("test_mask.jpg", single_color_mask);
Since this for loop is executed twice, one for each color that was used to fill the contours, I want you to see all images that were generated by this stage. So the following images are the ones that were stored by single_color_mask (one for each iteration of the loop):
// Crop image to the area of the object
cv::Mat cropped;
isolate_object(single_color_mask, cropped);
//cv::imwrite("test_cropped.jpg", cropped);
cv::Mat orig_cropped = cropped.clone();
These are the ones that were stored by cropped (by the way, the smaller C looks fat because the image is rescaled by this page to have the same size of the larger C, don't worry):
// Figure out the center of the image
cv::Point obj_center(cropped.cols/2, cropped.rows/2);
//cv::circle(cropped, obj_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_cropped_center.jpg", cropped);
To make it clearer to understand for what obj_center is for, I painted a little gray circle for educational purposes on that location:
// Figure out the exact center location of the border
std::vector<cv::Point> border_points;
for (int y = 0; y < cropped.cols; y++)
{
if (cropped.at<uchar>(obj_center.x, y) != 0)
border_points.push_back(cv::Point(obj_center.x, y));
if (border_points.size() > 0 && cropped.at<uchar>(obj_center.x, y) == 0)
break;
}
if (border_points.size() == 0)
{
std::cout << "!!! Oops! No border detected." << std::endl;
return 0;
}
// Figure out the exact center location of the border
cv::Point border_center = border_points[border_points.size() / 2];
//cv::circle(cropped, border_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_border_center.jpg", cropped);
The procedure above scans a single vertical line from top/middle of the image to find the borders of the circle to be able to calculate it's width. Again, for education purposes I painted a small gray circle in the middle of the border. This is what cropped looks like:
// Scan the border of the circle for discontinuities
int radius = obj_center.y - border_center.y;
if (radius < 0)
radius *= -1;
std::vector<cv::Point> discontinuity_points;
std::vector<int> discontinuity_angles;
for (int angle = 0; angle <= 360; angle++)
{
int x = obj_center.x + (radius * cos((angle+90) * (PI / 180.f)));
int y = obj_center.y + (radius * sin((angle+90) * (PI / 180.f)));
if (cropped.at<uchar>(x, y) < 128)
{
discontinuity_points.push_back(cv::Point(y, x));
discontinuity_angles.push_back(angle);
//cv::circle(cropped, cv::Point(y, x), 1, cv::Scalar(128, 128, 128));
}
}
//std::cout << "Discontinuity size: " << discontinuity_points.size() << std::endl;
if (discontinuity_points.size() == 0 && discontinuity_angles.size() == 0)
{
std::cout << "!!! Oops! No discontinuity detected. It's a perfect circle, dang!" << std::endl;
return 0;
}
Great, so the piece of code above scans along the middle of the circle's border looking for discontinuity. I'm sharing a sample image to illustrate what I mean. Every gray dot on the image represents a pixel that is tested. When the pixel is black it means we found a discontinuity:
// Figure out the approximate angle of the discontinuity:
// the first angle found will suffice for this demo.
int approx_angle = discontinuity_angles[0];
std::cout << "#" << i << " letter C is rotated approximately at: " << approx_angle << " degrees" << std::endl;
// Figure out the central point of the discontinuity
cv::Point discontinuity_center;
for (int a = 0; a < discontinuity_points.size(); a++)
discontinuity_center += discontinuity_points[a];
discontinuity_center.x /= discontinuity_points.size();
discontinuity_center.y /= discontinuity_points.size();
cv::circle(orig_cropped, discontinuity_center, 2, cv::Scalar(128, 128, 128));
cv::imshow("Original crop", orig_cropped);
cv::waitKey(0);
}
return 0;
}
Very well... This last piece of code is responsible for figuring out the approximate angle of the discontinuity as well as indicate the central point of discontinuity. The following images are stored by orig_cropped. Once again I added a gray dot to show the exact positions detected as the center of the gaps:
When executed, this application prints the following information to the screen:
#0 letter C is rotated approximately at: 49 degrees
#1 letter C is rotated approximately at: 0 degrees
I hope it helps.
For start you could use Hough transformation. This algorithm is not very fast, but it's quite robust. Especially if you have such clear images.
The general approach would be:
1) preprocessing - suppress noise, convert to grayscale / binary
2) run edge detector
3) run Hough transform - IIRC it's `cv::HoughCircles` in OpenCV
4) do some postprocessing - remove surplus circles, decide which ones correspond to shape of letter C, and so on
My approach will give you 2 hough circles per letter C. One on inner boundary, one on outer letter C. If you want only one circle per letter you can use skeletonization algoritm. More info here http://homepages.inf.ed.ac.uk/rbf/HIPR2/skeleton.htm
Given that we have nested C structures and you know the centres of the Cs and would like to evaluate the orientations- one simply needs to observe the distribution of pixels along the radius of the concentric Cs in all directions.
This can be done by performing a simple morphological dilation operation from the centre. As we reach the right radius for the innermost C, we will reach a maximum number of pixels covered for the innermost C. The difference between the disc and the C will give us the location of the gap in the whole and one can perform an ultimate erosion to get the centroid of the gap in the C. The angle between the centre and this point is the orientation of the C. This step is iterated till all Cs are covered.
This can also be done quickly using the Distance function from the centre point of the Cs.
How to read a frame from YUV file in OpenCV?
I wrote a very simple python code to read YUV NV21 stream from binary file.
import cv2
import numpy as np
class VideoCaptureYUV:
def __init__(self, filename, size):
self.height, self.width = size
self.frame_len = self.width * self.height * 3 / 2
self.f = open(filename, 'rb')
self.shape = (int(self.height*1.5), self.width)
def read_raw(self):
try:
raw = self.f.read(self.frame_len)
yuv = np.frombuffer(raw, dtype=np.uint8)
yuv = yuv.reshape(self.shape)
except Exception as e:
print str(e)
return False, None
return True, yuv
def read(self):
ret, yuv = self.read_raw()
if not ret:
return ret, yuv
bgr = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_NV21)
return ret, bgr
if __name__ == "__main__":
#filename = "data/20171214180916RGB.yuv"
filename = "data/20171214180916IR.yuv"
size = (480, 640)
cap = VideoCaptureYUV(filename, size)
while 1:
ret, frame = cap.read()
if ret:
cv2.imshow("frame", frame)
cv2.waitKey(30)
else:
break
As mentioned, there are MANY types of YUV formats:
http://www.fourcc.org/yuv.php
To convert to RGB from a YUV format in OpenCV is very simple:
Create a one-dimensional OpenCV Mat of the appropriate size for that frame data
Create an empty Mat for the RGB data with the desired dimension AND with 3 channels
Finally use cvtColor to convert between the two Mats, using the correct conversion flag enum
Here is an example for a YUV buffer in YV12 format:
Mat mYUV(height + height/2, width, CV_8UC1, (void*) frameData);
Mat mRGB(height, width, CV_8UC3);
cvtColor(mYUV, mRGB, CV_YUV2RGB_YV12, 3);
The key trick is to define the dimensions of your RGB Mat before you convert.
UPDATE there's a newer version of the code here: https://github.com/chelyaev/opencv-yuv
I'm posting some code that will read a single YUV 4:2:0 planar image file. You can directly apply this to most YUV files (just keep reading from the same FILE object). The exception to this is when dealing with YUV files that have a header (typically, they have a *.y4m extension). If you want to deal with such files, you have two options:
Write your own function to consume the header data from the FILE object before using the code below
Strip the headers from *.y4m images (using ffmpeg or similar tool). This is the option I prefer since it's the simplest.
It also will not work for any other form of YUV format (non-planar, different chroma decimation). As #Stephane pointed out, there are many such formats (and most of them don't have any identifying headers), which is probably why OpenCV doesn't support them out of the box.
But working with them is fairly simple:
Start with an image and it's dimensions (this is required when reading a YUV file)
Read luma and chroma into 3 separate images
Upscale chroma images by a factor of 2 to compensation for chroma decimation. Note that there are actually several ways to compensate for chroma decimation. Upsampling is just the simplest
Combine into YUV image. If you want RGB, you can use cvCvtColor.
Finally, the code:
IplImage *
cvLoadImageYUV(FILE *fin, int w, int h)
{
assert(fin);
IplImage *py = cvCreateImage(cvSize(w, h), IPL_DEPTH_8U, 1);
IplImage *pu = cvCreateImage(cvSize(w/2,h/2), IPL_DEPTH_8U, 1);
IplImage *pv = cvCreateImage(cvSize(w/2,h/2), IPL_DEPTH_8U, 1);
IplImage *pu_big = cvCreateImage(cvSize(w, h), IPL_DEPTH_8U, 1);
IplImage *pv_big = cvCreateImage(cvSize(w, h), IPL_DEPTH_8U, 1);
IplImage *image = cvCreateImage(cvSize(w, h), IPL_DEPTH_8U, 3);
IplImage *result = NULL;
assert(py);
assert(pu);
assert(pv);
assert(pu_big);
assert(pv_big);
assert(image);
for (int i = 0; i < w*h; ++i)
{
int j = fgetc(fin);
if (j < 0)
goto cleanup;
py->imageData[i] = (unsigned char) j;
}
for (int i = 0; i < w*h/4; ++i)
{
int j = fgetc(fin);
if (j < 0)
goto cleanup;
pu->imageData[i] = (unsigned char) j;
}
for (int i = 0; i < w*h/4; ++i)
{
int j = fgetc(fin);
if (j < 0)
goto cleanup;
pv->imageData[i] = (unsigned char) j;
}
cvResize(pu, pu_big, CV_INTER_NN);
cvResize(pv, pv_big, CV_INTER_NN);
cvMerge(py, pu_big, pv_big, NULL, image);
result = image;
cleanup:
cvReleaseImage(&pu);
cvReleaseImage(&pv);
cvReleaseImage(&py);
cvReleaseImage(&pu_big);
cvReleaseImage(&pv_big);
if (result == NULL)
cvReleaseImage(&image);
return result;
}
I don't think it is possible to do, at least with the current version. Of course, it wouldn't be that difficult to do, but it is not such an interesting feature, as:
OpenCV usually works on webcam stream, which are in RGB format, or on coded files, which are directly decoded into RGB for display purposes ;
OpenCV is dedicated to Computer Vision, where YUV is a less common format than in the Coding community for example ;
there are a lot of different YUV formats, which would imply a lot of work to implement them.
Conversions are still possible though, using cvCvtColor(), which means that it is of some interest anyway.
I encountered the same problem. My solution is
1. read one yuv frame (such as I420) to a string object "yuv".
2. convert the yuv frame to BGR24 format. I use libyuv to do it. It is easy to write a python wrapper for libyuv functions. now you get another string object "bgr" with BGR24 format.
3. use numpy.fromstring to get image object from the "bgr" string object. you need to change the shape of the image object.
Below is a simple yuv viewer for your reference.
import cv2
# below is the extension wrapper for libyuv
import yuvtorgb
import numpy as np
f = open('i420_cif.yuv', 'rb')
w = 352
h = 288
size = 352*288*3/2
while True:
try:
yuv = f.read(size)
except:
break
if len(yuv) != size:
f.seek(0, 0)
continue
bgr = yuvtorgb.i420_to_bgr24(yuv, w, h)
img = np.fromstring(bgr, dtype=np.uint8)
img.shape = h,w,3
cv2.imshow('img', img)
if cv2.waitKey(50) & 0xFF == ord('q'):
break
cv2.destroyAllWindows()
For future reference: I have converted #xianyanlin's brilliant answer to Python 3. The below code works with videos taken from the Raspberry Pi camera and seems to output correct color and aspect ratio.
Warning: it uses the numpy format for specifying resolution of height * width, e.g. 1080 * 1920, 480 * 640.
class VideoCaptureYUV:
def __init__(self, filename, size):
self.height, self.width = size
self.frame_len = self.width * self.height * 3 // 2
self.f = open(filename, 'rb')
self.shape = (int(self.height*1.5), self.width)
def read_raw(self):
try:
raw = self.f.read(self.frame_len)
yuv = np.frombuffer(raw, dtype=np.uint8)
yuv = yuv.reshape(self.shape)
except Exception as e:
print(str(e))
return False, None
return True, yuv
def read(self):
ret, yuv = self.read_raw()
if not ret:
return ret, yuv
bgr = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420, 3)
return ret, bgr