I am using Delphi-OpenCV. The advanced functions, such as object detection, work for me, but I cannot manage a fairly simple thing: reading pixel values from a pIplImage or a pCvMat.
Since there is practically no documentation for Delphi-OpenCV, I experimented a lot, but the closest I got was nonsensical values, or errors from the OpenCV core DLL, when trying to fill a TCvScalar in a multitude of different ways.
I have an 8-bit, 1-channel JPG picture and need to pass in pixel coordinates and get the pixel's value, like this in C++:
Scalar intensity = img.at<uchar>(y, x);
Could someone please point me in the right direction?
You can use TocvImage class over your pIplImage image, for example:
var
image: pIplImage;
img: TocvImage;
px: TocvPixel;
begin
image:= cvLoadImage(c_str('d:\IMAG0132.jpg'),CV_LOAD_IMAGE_GRAYSCALE);
img := TocvImage.Create(image);
px := img.Pixel[100,100];
img.Free;
end;
Alternatively, see the TocvImage.GetPixel function as a reference for how to read pixel information directly from a pIplImage.
I need to pre-process an image to convert it to a high contrast dark-on-light background that is ideal to feed to OCR tools.
The pre-processing, which for starters I did in Gimp, simply involves running its Color->Invert operation, which gives me a result that works very well when fed into OCR tools.
This question though is how to replicate the same operation via OpenCV.
The following is the OpenCV code (via the Go wrapper for OpenCV) that I have managed so far:
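func preprocessImage(inputImage gocv.Mat) {
	// All-white image with the same size and type as the input.
	white := gocv.NewMatWithSizeFromScalar(
		gocv.Scalar{255.0, 255.0, 255.0, 255.0},
		inputImage.Rows(), inputImage.Cols(),
		inputImage.Type(), // Go requires this trailing comma before the closing paren
	)
	defer white.Close()
	targetMat := gocv.NewMat()
	defer targetMat.Close()
	// white - input inverts every channel.
	gocv.Subtract(white, inputImage, &targetMat)
	pngCompressionOptions := []int{gocv.IMWritePngCompression, 0}
	// IMWrite takes no encoder options; IMWriteWithParams does.
	gocv.IMWriteWithParams("result.png", targetMat, pngCompressionOptions)
}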
However, this does not seem to match the results I obtain from Gimp.
As a sample, here's the original image:
Here's the result of applying Color->Invert via Gimp:
Here's the result I get via the OpenCV code shown above:
As is evident, there seems to be quite some difference between the two results.
Gimp's documentation on what exactly Color->Invert does is a bit cryptic, at least to me. It mentions that "hues are replaced by their complementary colors", but I am unclear as to how to replicate that.
Just to be clear, I am not expecting working Golang code in the answers. I am just looking for some hints as to what OpenCV functions I should string together (in any language, I can port that to Go) in order to replicate Gimp's Color-Invert operation.
I am unfamiliar with Go, but it looks like your image got converted to greyscale on opening.
Check the flags/parameters you pass when loading it, and make sure the loaded image has three channels rather than a single one.
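To illustrate why the channel count matters, here is a minimal sketch of a plain per-channel inversion (255 minus each value) on 8-bit data. It uses pure Python lists so it runs without OpenCV; the function names are made up for this example, and whether the result matches Gimp exactly may also depend on the colour space Gimp works in. In OpenCV the same idea would be a bitwise NOT, or the white-minus-image subtraction from the question, applied to a 3-channel image.

```python
def invert_rgb(pixels):
    """Invert every channel of a row-major grid of 8-bit (r, g, b) tuples."""
    return [[tuple(255 - c for c in px) for px in row] for row in pixels]


def invert_gray(pixels):
    """Invert a row-major grid of 8-bit grey values."""
    return [[255 - v for v in row] for row in pixels]


# Pure red becomes cyan, its complementary colour.
colour_result = invert_rgb([[(255, 0, 0), (10, 20, 30)]])

# If the image was loaded as a single channel, the colour information
# is already gone before the inversion happens.
gray_result = invert_gray([[76, 20]])
```

With three channels, red turns into cyan as Gimp's "complementary colors" wording suggests; with one channel, the output can only ever be grey.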
I have no idea how to implement this matrix operation efficiently in OpenCV.
I have binary Mat nz(150,600) with 0 and 1 elements.
I have Mat mk(150,600) with double values.
I would like to implement the equivalent of this Matlab statement:
sk = mk(nz);
That command copies into sk only those elements of mk at the locations where nz is 1. The result, sk, should then be made into a row matrix.
How can I implement it in OpenCV efficiently for speed and memory?
You should take a look at Mat::copyTo and Mat::clone.
copyTo makes a copy, with an optional mask whose non-zero elements indicate which matrix elements need to be copied.
mk.copyTo(sk, nz);
And if you really want a row matrix, then call sk.reshape(), as member sansuiso already suggested. This method ...
creates alternative matrix header for the same data, with different number of channels and/or different number of rows.
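One thing worth keeping in mind: a masked copyTo leaves the destination the same size as the source (unselected elements stay zero in a freshly allocated destination), whereas Matlab's mk(nz) compacts the selected elements into a vector. The sketch below shows that compaction semantics on plain Python lists; select_by_mask is a hypothetical helper, not an OpenCV function. Matlab walks the matrix column by column, which is what the default ordering here mimics.

```python
def select_by_mask(values, mask, column_major=True):
    """Return the elements of `values` where `mask` is non-zero, as a flat list.

    column_major=True mimics Matlab's ordering for mk(nz);
    column_major=False gives the row-by-row order of a C/OpenCV scan.
    """
    rows, cols = len(values), len(values[0])
    if column_major:
        order = ((r, c) for c in range(cols) for r in range(rows))
    else:
        order = ((r, c) for r in range(rows) for c in range(cols))
    return [values[r][c] for r, c in order if mask[r][c]]


mk = [[1.5, 2.5, 3.5],
      [4.5, 5.5, 6.5]]
nz = [[1, 0, 1],
      [0, 1, 0]]

sk_matlab_order = select_by_mask(mk, nz)                   # [1.5, 5.5, 3.5]
sk_row_order = select_by_mask(mk, nz, column_major=False)  # [1.5, 3.5, 5.5]
```

If the element order matters for later processing, decide up front which of the two orders you need before porting the Matlab code.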
bkausbk gave the best answer. However, here is a second way around:
cv::Mat A;
cv::bitwise_and(mk, mk, A, nz); // nz must be an 8-bit mask; A keeps mk where nz is non-zero and is zero elsewhere
Once you have A, you can copy the elements where nz is non-zero into a std::vector. If you want your output to be a cv::Mat instance, then you have to allocate the memory first:
int S = cv::countNonZero(nz); // size of the final output matrix
Now, fast element access is a topic in itself. Google it. Or have a look at opencv/modules/core/src/stat.cpp, where countNonZero() is implemented, to get some ideas.
There are two steps involved in your task.
First, convert the input matrix to double:
cv::Mat binaryMat; // source matrix, filled somewhere
cv::Mat doubleMat; // target matrix (with doubles)
binaryMat.convertTo(doubleMat, CV_64F); // Perform the conversion
Then, reshape the result as a row matrix:
doubleMat = doubleMat.reshape(1, 1);
// Alternatively:
cv::Mat doubleRow = doubleMat.reshape(1, 1);
The Mat::reshape operation is efficient in the sense that the data is not copied; only the matrix header changes.
This function returns a new reference to the matrix (by creating a new header), so you should not forget to assign its result.
I'm using the SharpDX Toolkit, and I'm trying to create a Texture2D programmatically, so I can manually specify all the pixel values. And I'm not sure what pixel format to create it with.
SharpDX doesn't even document the toolkit's PixelFormat type (they have documentation for another PixelFormat class but it's for WIC, not the toolkit). I did find the DirectX enum it wraps, DXGI_FORMAT, but its documentation doesn't give any useful guidance on how I would choose a format.
I'm used to plain old 32-bit bitmap formats with 8 bits per color channel plus 8-bit alpha, which is plenty good enough for me. So I'm guessing the simplest choices will be R8G8B8A8 or B8G8R8A8. Does it matter which I choose? Will they both be fully supported on all hardware?
And even once I've chosen one of those, I then need to further specify whether it's SInt, SNorm, Typeless, UInt, UNorm, or UNormSRgb. I don't need the sRGB colorspace. I don't understand what Typeless is supposed to be for. UInt seems like the simplest -- just a plain old unsigned byte -- but it turns out it doesn't work; I don't get an error, but my texture won't draw anything to the screen. UNorm works, but there's nothing in the documentation that explains why UInt doesn't. So now I'm paranoid that UNorm might not work on some other video card.
Here's the code I've got, if anyone wants to see it. Download the SharpDX full package, open the SharpDXToolkitSamples project, go to the SpriteBatchAndFont.WinRTXaml project, open the SpriteBatchAndFontGame class, and add code where indicated:
// Add new field to the class:
private Texture2D _newTexture;
// Add at the end of the LoadContent method:
_newTexture = Texture2D.New(GraphicsDevice, 8, 8, PixelFormat.R8G8B8A8.UNorm);
var colorData = new Color[_newTexture.Width*_newTexture.Height];
_newTexture.GetData(colorData);
for (var i = 0; i < colorData.Length; ++i)
colorData[i] = (i%3 == 0) ? Color.Red : Color.Transparent;
_newTexture.SetData(colorData);
// Add inside the Draw method, just before the call to spriteBatch.End():
spriteBatch.Draw(_newTexture, new Vector2(0, 0), Color.White);
This draws a small rectangle with diagonal lines in the top left of the screen. It works on the laptop I'm testing it on, but I have no idea how to know whether that means it's going to work everywhere, nor do I have any idea whether it's going to be the most performant.
What pixel format should I use to make sure my app will work on all hardware, and to get the best performance?
The formats in the SharpDX Toolkit map to the underlying DirectX/DXGI formats, so you can, as usual with Microsoft products, get your information from MSDN:
DXGI_FORMAT enumeration (Windows)
32-bit textures are a common choice for most texture scenarios and perform well on older hardware. UNorm means, as already answered in the comments, "in the range of 0.0 .. 1.0" and is, again, a common way to access color data in textures.
If you look at the Hardware Support for Direct3D 10Level9 Formats (Windows) page, you will see that DXGI_FORMAT_R8G8B8A8_UNORM as well as DXGI_FORMAT_B8G8R8A8_UNORM are supported on DirectX 9 hardware. You will not run into compatibility problems with either of them.
Performance depends on how your device is initialized (RGBA/BGRA?) and on what hardware (i.e. supported DX feature level) and OS you are running your software on. You will have to run your own tests to find out (though for formats as common and similar as these, the difference should be a single-digit percentage at most).
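As a rough illustration of what UNorm means for these 8-bit formats: each stored byte 0..255 is read as a float in 0.0..1.0, and the only difference between R8G8B8A8 and B8G8R8A8 is the byte order in memory. The sketch below is plain Python with made-up helper names, not SharpDX code. As far as I understand, UInt formats hand the raw integers to the shader instead, which a sprite shader expecting normalised float colours would not cope with; treat that as a likely explanation rather than a confirmed one.

```python
def unorm8_to_float(byte_value):
    """Map a stored 8-bit UNorm value (0..255) to the float a shader reads (0.0..1.0)."""
    if not 0 <= byte_value <= 255:
        raise ValueError("an 8-bit channel must be in 0..255")
    return byte_value / 255.0


def float_to_unorm8(x):
    """Clamp a float to 0.0..1.0 and round it to the nearest 8-bit UNorm value."""
    clamped = min(max(x, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)


def pack_rgba(r, g, b, a):
    """Byte layout of one pixel in an R8G8B8A8 texture."""
    return bytes((r, g, b, a))


def pack_bgra(r, g, b, a):
    """Byte layout of the same pixel in a B8G8R8A8 texture."""
    return bytes((b, g, r, a))


opaque_red_rgba = pack_rgba(255, 0, 0, 255)  # b'\xff\x00\x00\xff'
opaque_red_bgra = pack_bgra(255, 0, 0, 255)  # b'\x00\x00\xff\xff'
```

So if you fill the texture from raw bytes rather than from Color values, the channel order is the one thing you have to get right when switching between the two formats.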
I am trying to get the resolution of a JPEG image without decoding the file. I got several samples from internet but none is working properly. It seems to be this way because many JPEG files are not standard, though any graphic application (Irfan, PSP, Firefox etc) can open them.
The header of a JPEG was supposed to be:
typedef struct _JFIFHeader
{
BYTE SOI[2]; /* 00h Start of Image Marker */
BYTE APP0[2]; /* 02h Application Use Marker */
BYTE Length[2]; /* 04h Length of APP0 Field */
BYTE Identifier[5]; /* 06h "JFIF" (zero terminated) Id String */
BYTE Version[2]; /* 0Bh JFIF Format Revision */
BYTE Units; /* 0Dh Units used for Resolution */
BYTE Xdensity[2]; /* 0Eh Horizontal Resolution */
BYTE Ydensity[2]; /* 10h Vertical Resolution */
BYTE XThumbnail; /* 12h Horizontal Pixel Count */
BYTE YThumbnail; /* 13h Vertical Pixel Count */
} JFIFHEAD;
However, when I looked into one of those non-standard files, the Xdensity and Ydensity fields were wrong. But again, all graphic applications can read this non-standard file.
Does anybody know of a piece of Delphi code that can actually read all JPEG files?
Delphi 7, Win 7 32 bit
I don't know about ALL JPEG files, but you will need to handle the two common file formats for JPEG. Since JPEG is a compression method and not a file format, the world at large has developed a few ways of storing JPEG image data in files. The two you are most likely to encounter are JFIF and EXIF. The above code covers JFIF, but doesn't handle EXIF. These two are largely incompatible but both are JPEG, so if you are using header information you'll need to detect which one you have and handle each accordingly, as they differ.
Resolution is one example: EXIF's fields are XResolution and YResolution, versus the X/Y density approach of JFIF.
I would:
1. Do some reading on the two formats (JFIF and EXIF). I find Wikipedia is a great place to start for this kind of reference (I have used it for some past projects), but SO most likely has some great info on this topic as well.
   JFIF: http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format
   EXIF: http://en.wikipedia.org/wiki/Exif
2. Write code to detect the format using the starting headers.
3. Handle each format independently.
4. Wrap the whole thing so you can just toss a JPEG at it and get the density. This will also give you a great spot to toss other helper code to deal with the "fun" world of JPEG handling.
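The detection step could be sketched roughly as follows. This is Python rather than Delphi, purely to show the byte-level logic; detect_jpeg_flavours is a hypothetical name, and the sketch only walks the APPn segments that directly follow the SOI marker, which is where JFIF (APP0, "JFIF" identifier) and EXIF (APP1, "Exif" identifier) headers normally sit.

```python
import struct


def detect_jpeg_flavours(data):
    """Return the set of header flavours ('JFIF', 'EXIF') found after the SOI marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    found = set()
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE0 and payload.startswith(b"JFIF\x00"):
            found.add("JFIF")
        elif marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.add("EXIF")
        elif not 0xE0 <= marker <= 0xEF:
            break  # simplification: stop at the first non-APPn segment
        pos += 2 + length
    return found
```

A file can carry both headers, which is why the sketch returns a set rather than a single answer; the per-format handlers from step 3 would then be picked based on what is in that set.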
Here is some code which could help you get the data you want:
function GetJpegSize(jpeg: TMemoryStream; out width, height, BitDepth: integer): boolean;
var n: integer;
    b: byte;
    w: Word;
begin
  result := false;
  n := jpeg.Size-8;
  jpeg.Position := 0;
  if n<=0 then
    exit;
  jpeg.Read(w,2);
  if w<>$D8FF then
    exit; // invalid format
  jpeg.Read(b,1);
  while (jpeg.Position<n) and (b=$FF) do begin
    jpeg.Read(b,1);
    case b of
      $C0..$C3: begin
        jpeg.Seek(3,soFromCurrent);
        jpeg.Read(w,2);
        height := swap(w);
        jpeg.Read(w,2);
        width := swap(w);
        jpeg.Read(b,1);
        BitDepth := b*8;
        Result := true; // JPEG format OK
        exit;
      end;
      $FF:
        jpeg.Read(b,1);
      $D0..$D9, $01: begin
        jpeg.Seek(1,soFromCurrent);
        jpeg.Read(b,1);
      end;
      else begin
        jpeg.Read(w,2);
        jpeg.Seek(swap(w)-2, soFromCurrent);
        jpeg.Read(b,1);
      end;
    end;
  end;
end;
The Units, Xdensity and Ydensity members of the JPEG file header specify the unit of measurement used to describe the physical dot density when a file is printed.
If Units is 1, Xdensity and Ydensity are dots per inch.
If Units is 2, Xdensity and Ydensity are dots per cm.
The point is that the dot resolution (the scaled printing resolution) stored in an image file simply does not matter on the screen. Thus, Windows programs will always show you 96 logical ppi on the screen for any file. Note that some applications, e.g. Adobe applications, prefer to use 72 logical ppi to display pictures on the screen.
Graphics applications such as ACDSee, Adobe Photoshop and CorelDRAW simply ignore the Units, Xdensity and Ydensity members when displaying JPG files on the screen, but they do take the values of those members into account when printing JPG files, if they exist. If a JPG file does not have Units, Xdensity and Ydensity members, graphics applications use their own default values (usually 150 dpi) to print it.
So, for the question about Delphi code that can read all JPEG file headers, the answer is simple: just read the JPG file header information; if the optional members do not exist in a file, either ignore them or tell end users that they are not specified in the file.
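A minimal sketch of that "read it if present, otherwise report it as unspecified" approach, again in Python only to show the byte layout (read_jfif_density and the UNITS table are names invented for this example). It looks for a JFIF APP0 segment among the leading APPn segments and returns None when there is none, for instance in an EXIF-only file. In the JFIF specification a Units value of 0 means the two values only give a pixel aspect ratio.

```python
import struct

# Meaning of the JFIF Units byte.
UNITS = {0: "aspect ratio only", 1: "dots per inch", 2: "dots per cm"}


def read_jfif_density(data):
    """Return (unit name, Xdensity, Ydensity) from a JFIF APP0 segment, or None."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE0 and payload[:5] == b"JFIF\x00" and len(payload) >= 12:
            # Payload layout: id (5 bytes), version (2), units (1), X (2), Y (2).
            units, xdensity, ydensity = struct.unpack(">BHH", payload[7:12])
            return UNITS.get(units, "unknown"), xdensity, ydensity
        if not 0xE0 <= marker <= 0xEF:
            break  # simplification: only the leading APPn segments are scanned
        pos += 2 + length
    return None  # optional members not present in this file
```

A caller would treat None as "not specified in the file" and fall back to whatever default suits the application.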
Further Reading on DPI and PPI confusions
Say No to 72 dpi
A few scanning tips
References on JPEG File Format Specification
JPEG File Interchange Format File Format Summary
JPEG File Interchange Format, v1.02
There is a TP/TPW/Delphi package, pasjp(e)g (written for Delphi 1-4, but it will probably work up to the Unicode versions without big modifications), that can read most of the older JPG types (but not, e.g., JPEG2000).
FPC also includes this package.
The original site of J. Nomssi has disappeared, but the package is still available, e.g. from
http://pascal.sources.ru/graph/pasjpg11.htm
I want to read a bitmap file (from the file system) using OCaml and store the pixels (the colors) inside an array which has the dimensions of the bitmap; each pixel will take one cell in the array.
I found the function Graphics.dump_image : image -> color array array,
but it doesn't read from a file.
CamlImages should do it. There is also a Debian package (libcamlimages-ocaml-dev), as well as an installation through GODI, if you use that to manage your OCaml packages.
As a useful example of reading and manipulating images in OCaml, I suggest looking over the code for a seam removal algorithm over at eigenclass.
You can also, as jonathan stated (though not very clearly), call C functions from OCaml, such as those of ImageMagick. You will have to do a lot of manipulation of the image data to bring the image into OCaml, though. Alternatively, you can write C for all the functions that manipulate the image and treat it as an abstract data type, but that seems to be the complete opposite of what you want: you would be writing most of the program in C, not OCaml.
I recently wanted to play around with camlimages (and had some trouble installing it: I had to modify two of the ml files to fix compilation errors, very simple ones though), so here is a quick program, black_and_white.ml, and how to compile it. This should get someone painlessly started with the package (especially with dynamic image generation):
let () =
let width = int_of_string Sys.argv.(1)
and length = int_of_string Sys.argv.(2)
and name = Sys.argv.(3)
and black = {Color.Rgb.r = 0; g=0; b=0; }
and white = {Color.Rgb.r = 255; g=255; b=255; } in
let image = Rgb24.make width length black in
for i = 0 to width-1 do
for j = 0 to (length/2) - 1 do
Rgb24.set image i j white;
done;
done;
Png.save name [] (Images.Rgb24 image)
And to compile,
ocamlopt.opt -I /usr/local/lib/ocaml/camlimages/ ci_core.cmxa graphics.cmxa ci_graphics.cmxa ci_png.cmxa black_and_white.ml -o black_and_white
And to run,
./black_and_white 20 20 test1.png
I don't know of an out-of-the-box way to do it. You could open the file with open_in (or open_in_bin, so that the bytes are not translated on Windows) and read it a byte at a time with input_char, suck in the header and the data, and build up the color array array that way for simple formats (e.g. BMPs), but for anything like JPGs or PNGs a roll-your-own solution would probably be more work than you want to get into.
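To give an idea of how much work the roll-your-own route is for the simplest case, here is a sketch of a decoder for uncompressed 24-bit BMPs. It is written in Python to keep it short, but the steps (check the "BM" signature, read the pixel data offset and the width/height from the header, skip the row padding, swap BGR to RGB) carry over directly to an OCaml version built on input_char. read_bmp24 is a made-up name, and palettes, compression and other bit depths are deliberately not handled.

```python
import struct


def read_bmp24(data):
    """Decode an uncompressed 24-bit BMP into a top-down grid of (r, g, b) tuples."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # Offset 10: where the pixel data starts. Offset 18: width, height,
    # planes, bits per pixel, compression (BITMAPINFOHEADER layout).
    (pixel_offset,) = struct.unpack_from("<I", data, 10)
    width, height, _planes, bpp, compression = struct.unpack_from("<iiHHI", data, 18)
    if bpp != 24 or compression != 0:
        raise ValueError("only uncompressed 24-bit BMPs are handled in this sketch")
    bottom_up = height > 0          # a positive height means rows are stored bottom-first
    height = abs(height)
    stride = (width * 3 + 3) & ~3   # each row is padded to a multiple of 4 bytes
    rows = []
    for y in range(height):
        start = pixel_offset + y * stride
        row = []
        for x in range(width):
            b, g, r = data[start + 3 * x:start + 3 * x + 3]  # stored as B, G, R
            row.append((r, g, b))
        rows.append(row)
    if bottom_up:
        rows.reverse()
    return rows
```

Given the contents of a file read in binary mode, the result is indexed as rows[y][x], which corresponds to the color array array shape the question asks for.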
You could also use one of the numerous SDL bindings for OCaml, specifically the SDL_image ones, which let you load all kinds of images easily and provide functions to access individual pixels and raw data as an array.
OCamlSDL is a popular one.
If you don't want to use CamlImages, raw RGB or PNM/PPM images (which have an easy-to-create header format followed by RGB values) are usually used. ImageMagick then allows you to view these formats or convert them into more usable formats.