Detecting how bright an image is - ios

For my app I use a user-selected image as the background, with text blended over it using kCGBlendModeOverlay. It's fine on certain images, but the text isn't legible on a bright image. I know Apple uses an algorithm in iOS 7 to change text colour based on the content below it, but my question is how I would go about implementing something similar. I've searched around but haven't found anything relating to this so far. Does anyone have an idea about where I could start?
Thanks

We use this grayscale calculation, which weighs the colors roughly the same way the human eye does.
The eye has different sensitivities to different colors, so N photons of green will appear brighter than N photons of blue.
lColorIndex := ( (r * 77 + g * 151 + b * 28) shr 8 );
if (lColorIndex < 130) then
ForegroundColor := clWhite
else
ForegroundColor := clBlack;
This is a per-pixel calculation, so you'd have to average over the area you're looking at (average R, G and B before the calculation, of course).
Background:
The brightness we perceive is more or less:
v = 59% of the green, 30% of the red and 11% of the blue channel
= (30 * r + 59 * g + 11 * b) div 100
which comes close to:
v = (77 * r + 151 * g + 28 * b) div 256
= (77 * r + 151 * g + 28 * b) shr 8
which is faster to compute, since the division by 256 becomes a bit shift.
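The approach above can be sketched in pure Python (the function names and the (r, g, b)-tuple input format are my own assumptions; the weights and the 130 threshold come from the snippet above): average the region's channels first, then apply the integer-weighted luma and pick the text colour.

```python
def perceived_brightness(r, g, b):
    """Integer luma approximation: (77*R + 151*G + 28*B) >> 8."""
    return (77 * r + 151 * g + 28 * b) >> 8

def text_color_for_region(pixels):
    """pixels: a list of (r, g, b) tuples covering the area behind
    the text. Average R, G and B first, then apply the luma formula."""
    n = len(pixels)
    avg_r = sum(p[0] for p in pixels) // n
    avg_g = sum(p[1] for p in pixels) // n
    avg_b = sum(p[2] for p in pixels) // n
    # Same threshold as the Pascal snippet: below 130 -> white text
    return "white" if perceived_brightness(avg_r, avg_g, avg_b) < 130 else "black"
```

For example, a region of pure white pixels yields "black" text, and a near-black region yields "white" text.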

Related

Preprocessing for OCR [duplicate]

I've been using tesseract to convert documents into text. The quality of the documents ranges wildly, and I'm looking for tips on what sort of image processing might improve the results. I've noticed that text that is highly pixelated - for example, that generated by fax machines - is especially difficult for tesseract to process - presumably all those jagged edges on the characters confound the shape-recognition algorithms.
What sort of image processing techniques would improve the accuracy? I've been using a Gaussian blur to smooth out the pixelated images and seen some small improvement, but I'm hoping there is a more specific technique that would yield better results - say, a filter tuned to black-and-white images that would smooth out irregular edges, followed by a filter that would increase the contrast to make the characters more distinct.
Any general tips for someone who is a novice at image processing?
fix DPI (if needed) - 300 DPI is the minimum
fix text size (e.g. 12 pt should be OK)
try to fix text lines (deskew and dewarp the text)
try to fix the illumination of the image (e.g. no dark parts)
binarize and de-noise the image
There is no universal command line that fits all cases (sometimes you need to blur and sharpen an image). But you can try TEXTCLEANER from Fred's ImageMagick Scripts.
If you are not a fan of the command line, you can try the open-source scantailor.sourceforge.net or the commercial bookrestorer.
I am by no means an OCR expert, but this week I needed to convert text out of a jpg.
I started with a color RGB 445x747 pixel jpg.
I immediately tried tesseract on this, and the program converted almost nothing.
I then went into GIMP and did the following:
image > mode > grayscale
image > scale image > 1191x2000 pixels
filters > enhance > unsharp mask with values of
radius = 6.8, amount = 2.69, threshold = 0
I then saved as a new jpg at 100% quality.
Tesseract was then able to extract all the text into a .txt file.
GIMP is your friend.
As a rule of thumb, I usually apply the following image pre-processing techniques using OpenCV library:
Rescaling the image (recommended if you're working with images that have a DPI of less than 300):
img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
Converting image to grayscale:
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Applying dilation and erosion to remove the noise (you may play with the kernel size depending on your data set):
kernel = np.ones((1, 1), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)
Applying blur, which can be done by using one of the following lines (each of which has its pros and cons; however, median blur and bilateral filtering usually perform better than Gaussian blur):
cv2.threshold(cv2.GaussianBlur(img, (5, 5), 0), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.threshold(cv2.bilateralFilter(img, 5, 75, 75), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.threshold(cv2.medianBlur(img, 3), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.adaptiveThreshold(cv2.GaussianBlur(img, (5, 5), 0), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
cv2.adaptiveThreshold(cv2.bilateralFilter(img, 9, 75, 75), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
cv2.adaptiveThreshold(cv2.medianBlur(img, 3), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
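As a rough sketch of the rescaling rule from the list above (the helper name and the choice to only upscale, never downscale, are my own assumptions; only the 300 DPI target comes from the answer):

```python
def upscale_factor(current_dpi, target_dpi=300):
    """Return the resize factor needed to bring an image up to the
    target DPI; images already at or above the target are left alone."""
    if current_dpi <= 0:
        raise ValueError("DPI must be positive")
    return max(1.0, target_dpi / current_dpi)

# With OpenCV this would plug into the resize call shown above, e.g.:
# f = upscale_factor(img_dpi)
# img = cv2.resize(img, None, fx=f, fy=f, interpolation=cv2.INTER_CUBIC)
```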
I've recently written a pretty simple guide to Tesseract, but it should enable you to write your first OCR script and clear up some hurdles I experienced when things were less clear than I would have liked in the documentation.
In case you'd like to check them out, here are the links:
Getting started with Tesseract - Part I: Introduction
Getting started with Tesseract - Part II: Image Pre-processing
Three points to improve the readability of the image:
Resize the image with variable height and width (multiply the image height and width by 0.5, 1 and 2).
Convert the image to grayscale (black and white).
Remove the noise pixels to make it clearer (filter the image).
Refer to the code below:
Resize
public Bitmap Resize(Bitmap bmp, int newWidth, int newHeight)
{
    Bitmap temp = (Bitmap)bmp;
    Bitmap bmap = new Bitmap(newWidth, newHeight, temp.PixelFormat);

    double nWidthFactor = (double)temp.Width / (double)newWidth;
    double nHeightFactor = (double)temp.Height / (double)newHeight;

    double fx, fy, nx, ny;
    int cx, cy, fr_x, fr_y;
    Color color1 = new Color();
    Color color2 = new Color();
    Color color3 = new Color();
    Color color4 = new Color();
    byte nRed, nGreen, nBlue;
    byte bp1, bp2;

    for (int x = 0; x < bmap.Width; ++x)
    {
        for (int y = 0; y < bmap.Height; ++y)
        {
            fr_x = (int)Math.Floor(x * nWidthFactor);
            fr_y = (int)Math.Floor(y * nHeightFactor);
            cx = fr_x + 1;
            if (cx >= temp.Width) cx = fr_x;
            cy = fr_y + 1;
            if (cy >= temp.Height) cy = fr_y;
            fx = x * nWidthFactor - fr_x;
            fy = y * nHeightFactor - fr_y;
            nx = 1.0 - fx;
            ny = 1.0 - fy;

            color1 = temp.GetPixel(fr_x, fr_y);
            color2 = temp.GetPixel(cx, fr_y);
            color3 = temp.GetPixel(fr_x, cy);
            color4 = temp.GetPixel(cx, cy);

            // Blue
            bp1 = (byte)(nx * color1.B + fx * color2.B);
            bp2 = (byte)(nx * color3.B + fx * color4.B);
            nBlue = (byte)(ny * (double)(bp1) + fy * (double)(bp2));

            // Green
            bp1 = (byte)(nx * color1.G + fx * color2.G);
            bp2 = (byte)(nx * color3.G + fx * color4.G);
            nGreen = (byte)(ny * (double)(bp1) + fy * (double)(bp2));

            // Red
            bp1 = (byte)(nx * color1.R + fx * color2.R);
            bp2 = (byte)(nx * color3.R + fx * color4.R);
            nRed = (byte)(ny * (double)(bp1) + fy * (double)(bp2));

            bmap.SetPixel(x, y, System.Drawing.Color.FromArgb(255, nRed, nGreen, nBlue));
        }
    }

    bmap = SetGrayscale(bmap);
    bmap = RemoveNoise(bmap);
    return bmap;
}
SetGrayscale
public Bitmap SetGrayscale(Bitmap img)
{
    Bitmap temp = (Bitmap)img;
    Bitmap bmap = (Bitmap)temp.Clone();
    Color c;
    for (int i = 0; i < bmap.Width; i++)
    {
        for (int j = 0; j < bmap.Height; j++)
        {
            c = bmap.GetPixel(i, j);
            byte gray = (byte)(.299 * c.R + .587 * c.G + .114 * c.B);
            bmap.SetPixel(i, j, Color.FromArgb(gray, gray, gray));
        }
    }
    return (Bitmap)bmap.Clone();
}
RemoveNoise
public Bitmap RemoveNoise(Bitmap bmap)
{
    for (var x = 0; x < bmap.Width; x++)
    {
        for (var y = 0; y < bmap.Height; y++)
        {
            var pixel = bmap.GetPixel(x, y);
            if (pixel.R < 162 && pixel.G < 162 && pixel.B < 162)
                bmap.SetPixel(x, y, Color.Black);
            else if (pixel.R > 162 && pixel.G > 162 && pixel.B > 162)
                bmap.SetPixel(x, y, Color.White);
        }
    }
    return bmap;
}
(Input and output example images omitted.)
This was a while ago, but it still might be useful.
My experience shows that resizing the image in memory before passing it to tesseract sometimes helps.
Try different modes of interpolation. The post https://stackoverflow.com/a/4756906/146003 helped me a lot.
What was EXTREMELY HELPFUL to me here was the source code of the Capture2Text project:
http://sourceforge.net/projects/capture2text/files/Capture2Text/.
BTW: Kudos to its author for sharing such a painstaking algorithm.
Pay special attention to the file Capture2Text\SourceCode\leptonica_util\leptonica_util.c - that's the essence of image preprocessing for this utility.
If you run the binaries, you can check the image transformation before/after the process in the Capture2Text\Output\ folder.
P.S. The mentioned solution uses Tesseract for OCR and Leptonica for preprocessing.
Java version for Sathyaraj's code above:
// Resize
public Bitmap resize(Bitmap img, int newWidth, int newHeight) {
    Bitmap bmap = img.copy(img.getConfig(), true);

    double nWidthFactor = (double) img.getWidth() / (double) newWidth;
    double nHeightFactor = (double) img.getHeight() / (double) newHeight;

    double fx, fy, nx, ny;
    int cx, cy, fr_x, fr_y;
    int color1;
    int color2;
    int color3;
    int color4;
    byte nRed, nGreen, nBlue;
    byte bp1, bp2;

    for (int x = 0; x < bmap.getWidth(); ++x) {
        for (int y = 0; y < bmap.getHeight(); ++y) {
            fr_x = (int) Math.floor(x * nWidthFactor);
            fr_y = (int) Math.floor(y * nHeightFactor);
            cx = fr_x + 1;
            if (cx >= img.getWidth())
                cx = fr_x;
            cy = fr_y + 1;
            if (cy >= img.getHeight())
                cy = fr_y;
            fx = x * nWidthFactor - fr_x;
            fy = y * nHeightFactor - fr_y;
            nx = 1.0 - fx;
            ny = 1.0 - fy;

            color1 = img.getPixel(fr_x, fr_y);
            color2 = img.getPixel(cx, fr_y);
            color3 = img.getPixel(fr_x, cy);
            color4 = img.getPixel(cx, cy);

            // Blue
            bp1 = (byte) (nx * Color.blue(color1) + fx * Color.blue(color2));
            bp2 = (byte) (nx * Color.blue(color3) + fx * Color.blue(color4));
            nBlue = (byte) (ny * (double) (bp1) + fy * (double) (bp2));

            // Green
            bp1 = (byte) (nx * Color.green(color1) + fx * Color.green(color2));
            bp2 = (byte) (nx * Color.green(color3) + fx * Color.green(color4));
            nGreen = (byte) (ny * (double) (bp1) + fy * (double) (bp2));

            // Red
            bp1 = (byte) (nx * Color.red(color1) + fx * Color.red(color2));
            bp2 = (byte) (nx * Color.red(color3) + fx * Color.red(color4));
            nRed = (byte) (ny * (double) (bp1) + fy * (double) (bp2));

            bmap.setPixel(x, y, Color.argb(255, nRed, nGreen, nBlue));
        }
    }

    bmap = setGrayscale(bmap);
    bmap = removeNoise(bmap);
    return bmap;
}

// SetGrayscale
private Bitmap setGrayscale(Bitmap img) {
    Bitmap bmap = img.copy(img.getConfig(), true);
    int c;
    for (int i = 0; i < bmap.getWidth(); i++) {
        for (int j = 0; j < bmap.getHeight(); j++) {
            c = bmap.getPixel(i, j);
            byte gray = (byte) (.299 * Color.red(c) + .587 * Color.green(c)
                    + .114 * Color.blue(c));
            bmap.setPixel(i, j, Color.argb(255, gray, gray, gray));
        }
    }
    return bmap;
}

// RemoveNoise
private Bitmap removeNoise(Bitmap bmap) {
    for (int x = 0; x < bmap.getWidth(); x++) {
        for (int y = 0; y < bmap.getHeight(); y++) {
            int pixel = bmap.getPixel(x, y);
            if (Color.red(pixel) < 162 && Color.green(pixel) < 162 && Color.blue(pixel) < 162) {
                bmap.setPixel(x, y, Color.BLACK);
            }
        }
    }
    for (int x = 0; x < bmap.getWidth(); x++) {
        for (int y = 0; y < bmap.getHeight(); y++) {
            int pixel = bmap.getPixel(x, y);
            if (Color.red(pixel) > 162 && Color.green(pixel) > 162 && Color.blue(pixel) > 162) {
                bmap.setPixel(x, y, Color.WHITE);
            }
        }
    }
    return bmap;
}
The Tesseract documentation contains some good details on how to improve the OCR quality via image processing steps.
To some degree, Tesseract automatically applies them. It is also possible to tell Tesseract to write an intermediate image for inspection, i.e. to check how well the internal image processing works (search for tessedit_write_images in the above reference).
More importantly, the new neural network system in Tesseract 4 yields much better OCR results - in general and especially for images with some noise. It is enabled with --oem 1, e.g. as in:
$ tesseract --oem 1 -l deu page.png result pdf
(this example selects the German language)
Thus, it makes sense to first test how far you get with the new Tesseract LSTM mode before applying custom image pre-processing steps.
Adaptive thresholding is important if the lighting is uneven across the image.
My preprocessing using GraphicsMagic is mentioned in this post:
https://groups.google.com/forum/#!topic/tesseract-ocr/jONGSChLRv4
GraphicsMagic also has the -lat feature for Linear time Adaptive Threshold which I will try soon.
Another method of thresholding using OpenCV is described here:
https://docs.opencv.org/4.x/d7/d4d/tutorial_py_thresholding.html
I did the following to get good results out of an image whose text is not very small:
Apply blur to the original image.
Apply adaptive threshold.
Apply a sharpening effect.
And if you're still not getting good results, scale the image to 150% or 200%.
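To illustrate the adaptive-threshold step, here is a toy, pure-Python version of the idea behind OpenCV's mean adaptive thresholding (the function and its parameters are my own sketch, not the OpenCV API): each pixel is binarized against the mean of its local neighbourhood rather than a single global cutoff.

```python
def adaptive_threshold(img, block=3, c=0):
    """img: 2D list of grayscale values 0-255. Each pixel is compared
    to the mean of its block x block neighbourhood minus c; pixels
    above that local mean become 255 (white), the rest 0 (black)."""
    h, w = len(img), len(img[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the neighbourhood at the image borders
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = 255 if img[y][x] > mean - c else 0
    return out
```

Because the threshold is local, a bright column next to a dark region still binarizes cleanly even when the overall illumination is uneven, which is exactly why this step helps scanned documents.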
Reading text from image documents using any OCR engine has many issues when it comes to good accuracy. There is no fixed solution for all cases, but here are a few things to consider to improve OCR results:
1) Presence of noise due to poor image quality / unwanted elements/blobs in the background region. This requires some pre-processing, such as noise removal, which can easily be done using a Gaussian filter or a normal median filter. These are also available in OpenCV.
2) Wrong orientation of the image: because of wrong orientation, the OCR engine fails to segment the lines and words in the image correctly, which gives the worst accuracy.
3) Presence of lines: while doing word or line segmentation, the OCR engine sometimes tries to merge words and lines together, thereby processing the wrong content and giving wrong results. There are other issues too, but these are the basic ones.
This OCR application post is an example case where some image pre-processing and post-processing of the OCR result can be applied to get better OCR accuracy.
Text recognition depends on a variety of factors to produce good-quality output. OCR output highly depends on the quality of the input image. This is why every OCR engine provides guidelines regarding the quality and size of the input image. These guidelines help the OCR engine produce accurate results.
I have written a detailed article on image processing in Python; kindly follow the link below for more explanation. I have also added the Python source code to implement the process.
Please write a comment if you have a suggestion or a better idea on this topic to improve it.
https://medium.com/cashify-engineering/improve-accuracy-of-ocr-using-image-preprocessing-8df29ec3a033
You can do noise reduction and then apply thresholding, but you can also play around with the configuration of the OCR by changing the --psm and --oem values.
try:
--psm 5
--oem 2
You can also look at the following link for further details:
here
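For example, those flags are usually passed through a wrapper such as pytesseract; the small helper below is my own sketch (only the --psm/--oem flags come from the answer above, and the pytesseract call is shown commented out since it needs a tesseract install):

```python
def tesseract_config(psm, oem):
    """Build the config string a wrapper would pass through to tesseract."""
    return "--psm {} --oem {}".format(psm, oem)

# Hypothetical usage (requires pytesseract and a tesseract binary):
# import pytesseract
# text = pytesseract.image_to_string(img, config=tesseract_config(5, 2))
```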
So far, I've played a lot with tesseract 3.x, 4.x and 5.0.0.
tesseract 4.x and 5.x seem to yield the exact same accuracy.
Sometimes I get better results with the legacy engine (using --oem 0) and sometimes better results with the LSTM engine (--oem 1).
Generally speaking, I get the best results on upscaled images with the LSTM engine. The latter is on par with my earlier engine (ABBYY CLI OCR 11 for Linux).
Of course, the traineddata needs to be downloaded from GitHub, since most Linux distros will only provide the fast versions.
The trained data that works for both the legacy and LSTM engines can be downloaded from https://github.com/tesseract-ocr/tessdata with a command like the following. Don't forget to download the OSD trained data too.
curl -L https://github.com/tesseract-ocr/tessdata/blob/main/eng.traineddata?raw=true -o /usr/share/tesseract/tessdata/eng.traineddata
curl -L https://github.com/tesseract-ocr/tessdata/blob/main/osd.traineddata?raw=true -o /usr/share/tesseract/tessdata/osd.traineddata
I've ended up using ImageMagick as my image preprocessor since it's convenient and can easily be run scripted. You can install it with yum install ImageMagick or apt install imagemagick depending on your distro flavor.
So here's my oneliner preprocessor that fits most of the stuff I feed to my OCR:
convert my_document.jpg -units PixelsPerInch -respect-parenthesis \( -compress LZW -resample 300 -bordercolor black -border 1 -trim +repage -fill white -draw "color 0,0 floodfill" -alpha off -shave 1x1 \) \( -bordercolor black -border 2 -fill white -draw "color 0,0 floodfill" -alpha off -shave 0x1 -deskew 40 +repage \) -antialias -sharpen 0x3 preprocessed_my_document.tiff
Basically we:
use TIFF format since tesseract likes it more than JPG (decompressor related, who knows)
use lossless LZW TIFF compression
Resample the image to 300dpi
Use some black magic to remove unwanted colors
Try to rotate the page if rotation can be detected
Antialias the image
Sharpen text
The resulting image can then be fed to tesseract with:
tesseract -l eng preprocessed_my_document.tiff - --oem 1 --psm 1
Btw, some years ago I wrote a 'poor man's OCR server' which checks for changed files in a given directory and launches OCR operations on all files not already OCRed. pmocr is compatible with tesseract 3.x-5.x and abbyyocr11.
See the pmocr project on github.

Pi live video color detection

I'm planning to create an ambilight effect behind my TV. I want to achieve this by using a camera pointed at my TV. I think the easiest way is using a simple IP camera. I need color detection to detect the colors on the screen and translate them to RGB values for the LED strip.
I have a Raspberry Pi as a hub in the middle of my house. I was thinking about using it like this:
An IP camera pointed at my screen; process the video on the Pi, translate it to RGB values, and send them to an MQTT server; behind my TV, receive the colors on my NodeMCU.
How can I detect colors on a live stream (at multiple points) on my Pi?
If you can create any background colour, the best approach might be calculating the k-means or median to get the "most popular" colours. If the ambient light can differ in different places, then using an ROI at the image edges you can check which colour is dominant in that area (by comparing the number of samples of different colours).
If you have only limited colours (e.g. only R, G and B) then you can simply check which channel has highest intensity in desired region.
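That channel check can be sketched in pure Python (the function name is my own; the region is assumed to be a list of (b, g, r) tuples, in the channel order OpenCV uses):

```python
def dominant_channel(pixels):
    """pixels: iterable of (b, g, r) tuples from the region of interest.
    Returns the name of the channel with the highest summed intensity."""
    sums = [0, 0, 0]
    for b, g, r in pixels:
        sums[0] += b
        sums[1] += g
        sums[2] += r
    names = ("blue", "green", "red")
    return names[sums.index(max(sums))]
```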
I wrote the code with an assumption that you can create any RGB ambient color.
As a test image I use this one:
The code is:
import cv2
import numpy as np

# Read an input image (in your case this will be an image from the camera)
img = cv2.imread('saul2.png', cv2.IMREAD_COLOR)

# block_size defines how big the patches around the image are;
# the more LEDs you have and the more segments you want, the lower block_size can be
block_size = 60

# Get the dimensions of the image
height, width, chan = img.shape

# Calculate the number of patches along the height and width (integer division)
h_steps = height // block_size
w_steps = width // block_size

# In one loop I calculate both: left and right ambient, or top and bottom
ambient_patch1 = np.zeros((block_size, block_size, 3))
ambient_patch2 = np.zeros((block_size, block_size, 3))

# Create the output image (just for visualization:
# the input image in the middle, a 10px black border and the ambient color)
output = cv2.copyMakeBorder(img, 70, 70, 70, 70, cv2.BORDER_CONSTANT, value = 0)

for i in range(h_steps):
    # Get the left and right region of the image
    left_roi = img[i * block_size : (i + 1) * block_size, 0 : block_size]
    right_roi = img[i * block_size : (i + 1) * block_size, -(block_size + 1) : -1]
    left_med = np.median(left_roi, (0, 1))   # This is the actual RGB color for a given block (on the left)
    right_med = np.median(right_roi, (0, 1)) # and on the right
    # Create a patch with the ambient color - this is just for visualization
    ambient_patch1[:, :] = left_med
    ambient_patch2[:, :] = right_med
    # Put it in the output image (the additional 70 is because the input image is in the middle, shifted by 70px)
    output[70 + i * block_size : 70 + (i + 1) * block_size, 0 : block_size] = ambient_patch1
    output[70 + i * block_size : 70 + (i + 1) * block_size, -(block_size + 1) : -1] = ambient_patch2

for i in range(w_steps):
    # Get the top and bottom region of the image
    top_roi = img[0 : block_size, i * block_size : (i + 1) * block_size]
    bottom_roi = img[-(block_size + 1) : -1, i * block_size : (i + 1) * block_size]
    top_med = np.median(top_roi, (0, 1))       # This is the actual RGB color for a given block (on top)
    bottom_med = np.median(bottom_roi, (0, 1)) # and bottom
    # Create a patch with the ambient color - this is just for visualization
    ambient_patch1[:, :] = top_med
    ambient_patch2[:, :] = bottom_med
    # Put it in the output image (the additional 70 is because the input image is in the middle, shifted by 70px)
    output[0 : block_size, 70 + i * block_size : 70 + (i + 1) * block_size] = ambient_patch1
    output[-(block_size + 1) : -1, 70 + i * block_size : 70 + (i + 1) * block_size] = ambient_patch2

# Save the output image
cv2.imwrite('saul_output.png', output)
And this gives a result as follows:
I hope this helps!
EDIT:
And two more examples:

Convolution operator yielding spectrum of colors

I have been trying to write my own convolution operator instead of using the built-in one that comes with Java. I applied the built-in convolution operator to this image
link
Using the built-in convolution operator with a Gaussian filter, I got this image:
link
Now I run the same image through my code:
public static int convolve(BufferedImage a, int x, int y) {
    int red = 0, green = 0, blue = 0;
    float[] matrix = {
        0.1710991401561097f, 0.2196956447338621f, 0.1710991401561097f,
        0.2196956447338621f, 0.28209479177387814f, 0.2196956447338621f,
        0.1710991401561097f, 0.2196956447338621f, 0.1710991401561097f,
    };
    for (int i = x; i < x + 3; i++) {
        for (int j = y; j < y + 3; j++) {
            int color = a.getRGB(i, j);
            red += Math.round(((color >> 16) & 0xff) * matrix[(i - x) * 3 + j - y]);
            green += Math.round(((color >> 8) & 0xff) * matrix[(i - x) * 3 + j - y]);
            blue += Math.round(((color >> 0) & 0xff) * matrix[(i - x) * 3 + j - y]);
        }
    }
    return (a.getRGB(x, y) & 0xFF000000) | (red << 16) | (green << 8) | (blue);
}
And this is the result I got:
link
Also, how do I optimize the code I wrote? The built-in convolution operator takes 1-2 seconds, while my code, even if it is not serving the exact purpose it is supposed to, takes 5-7 seconds!
I accidentally rotated my source image while uploading, so please ignore that.
First of all, you are needlessly (and wrongly) converting your result from float to int in each cycle of the loop. Your red, green and blue should be of type float and should be cast back to integer only after the convolution (when converted back to RGB):
float red = 0.0f, green = 0.0f, blue = 0.0f;
for (int i = x; i < x + 3; i++) {
    for (int j = y; j < y + 3; j++) {
        int color = a.getRGB(i, j);
        red += ((color >> 16) & 0xff) * matrix[(i - x) * 3 + j - y];
        green += ((color >> 8) & 0xff) * matrix[(i - x) * 3 + j - y];
        blue += ((color >> 0) & 0xff) * matrix[(i - x) * 3 + j - y];
    }
}
return (a.getRGB(x, y) & 0xFF000000) | (((int) red) << 16) | (((int) green) << 8) | ((int) blue);
The bleeding of colors in your result is caused because your coefficients in matrix are wrong:
0.1710991401561097f + 0.2196956447338621f + 0.1710991401561097f +
0.2196956447338621f + 0.28209479177387814f + 0.2196956447338621f +
0.1710991401561097f + 0.2196956447338621f + 0.1710991401561097f =
1.8452741
The sum of the coefficients in a blurring convolution matrix should be 1.0. When you apply this matrix to an image you may get colors that are over 255. When that happens the channels "bleed" into the next channel (blue to green, etc.).
A completely green image with this matrix would result in:
green = 255 * 1.8452741 ~= 471 = 0x01D7;
rgb = 0xFF01D700;
Which is a less intense green with a hint of red.
You can fix that by dividing the coefficients by 1.8452741, but you want to make sure that:
(int)(255.0f * (sum of coefficients)) = 255
If not you need to add a check which limits the size of channels to 255 and don't let them wrap around. E.g.:
if (red > 255.0f)
red = 255.0f;
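The normalization fix mentioned above is easy to verify in Python (the coefficients are the matrix from the question; the helper function is my own):

```python
def normalize_kernel(kernel):
    """Scale a convolution kernel so its coefficients sum to 1.0,
    which keeps the overall image brightness unchanged."""
    total = sum(kernel)
    return [k / total for k in kernel]

# The matrix from the question: its coefficients sum to ~1.8452741,
# which is what causes the channel overflow ("bleeding") described above.
matrix = [
    0.1710991401561097, 0.2196956447338621, 0.1710991401561097,
    0.2196956447338621, 0.28209479177387814, 0.2196956447338621,
    0.1710991401561097, 0.2196956447338621, 0.1710991401561097,
]
```

After normalize_kernel(matrix), the coefficients sum to 1.0 and a uniform input region maps to the same output value, so no channel can exceed 255.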
Regarding efficiency/optimization:
It could be that the difference in speed is explained by this needless casting and calling of Math.round, but a more likely candidate is the way you are accessing the image. I'm not familiar enough with BufferedImage and Raster to advise you on the most efficient way to access the underlying image buffer.

Color over grayscale image

I want to color a gray-scale image with only one color. So I have, for example, a pixel RGB(34,34,34) and I want to color it with RGB(200,100,50) to get a new RGB pixel. I need to do this for every pixel in the image.
The white pixels get the color RGB(200,100,50); darker pixels get a darker color than RGB(200,100,50).
So the result is a gray-scale image with black and the selected color instead of black and white.
I want to program this from scratch, without any built-in functions.
Similar to this: Image or this: Image
All you need to do is use the ratio of gray to white as a multiplier to your color. I think you'll find that this gives better results than a blend.
new_red = gray * target_red / 255
new_green = gray * target_green / 255
new_blue = gray * target_blue / 255
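The multiply approach in Python (a pure-Python sketch; the function name and tuple format are mine):

```python
def tint(gray, target=(200, 100, 50)):
    """Map a grayscale value 0-255 onto a single-colour ramp:
    black stays black, white becomes the target colour."""
    return tuple(gray * c // 255 for c in target)
```

For example, tint(255) gives the full target colour, tint(0) gives black, and intermediate grays scale each channel proportionally.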
From what you describe, I figure you are looking for a blending algorithm.
What you need is a blending percentage (bP).
new red = red1 * bP + red2 * (1 - bP)
new green = green1 * bP + green2 * (1 - bP)
new blue = blue1 * bP + blue2 * (1 - bP)
Your base color is RGB 34 34 34; the color to blend is RGB 200 100 50.
The blending percentage, for example, = 50% -> 0.5
Therefore:
New red = 34 * 0.5 + 200 * (1 - 0.5) = 117
New green = 34 * 0.5 + 100 * (1 - 0.5) = 67
New blue = 34 * 0.5 + 50 * (1 - 0.5) = 42
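The same arithmetic as a small Python function (my own sketch; it reproduces the worked numbers above):

```python
def blend(c1, c2, bp):
    """Blend two RGB tuples with blending percentage bp (0.0-1.0):
    bp = 1.0 keeps c1 entirely, bp = 0.0 keeps c2 entirely."""
    return tuple(round(a * bp + b * (1 - bp)) for a, b in zip(c1, c2))
```

With the values from the example, blend((34, 34, 34), (200, 100, 50), 0.5) gives (117, 67, 42).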

Why doesn't opencv report width and height of a IplImage* correctly?

I got the reference image of a video (.avi), so the width and height of the image must be the same as the width and height of the video, and they are.
(My video is a CvCapture* and my image is an IplImage*.)
The width is 1280 and the height is 960.
But when I told OpenCV that if the coordinate of a pixel is in a specific rectangle then do something, the whole width of the image was treated as the width of that rectangle.
const int Y1 = 430, Y2 = 730, X1 = 0, X2 = 1279;
for (int i = Y1; i <= Y2; i++)
    for (int j = X1; j <= X2; j++)
        CV_IMAGE_ELEM(frame_BGR, uchar, i, j) = 255;
But only about 1/5 of the width of the image is now white! Then I set X2 = 3000, and the whole width of the image became white; the silly thing is that when I set X2 = 10000 the code still didn't report a SEGMENTATION FAULT.
Why is the width not reported correctly?
I ran it on both Ubuntu (g++) and Windows 7 (Visual Studio 2010). I think my resolution is high. I know the video was taken with a Nokia 5800 cellphone. This is very important to me, so excuse me for being very specific!
If the image isn't single-channel, you are using CV_IMAGE_ELEM wrongly.
It has to be: pixel = CV_IMAGE_ELEM( frame_BGR, uchar, row_number, col_number * 3 + color_channel );
So for BGR:
uchar blue = CV_IMAGE_ELEM( frame_BGR, uchar, row_number, col_number * 3 + 0 );
uchar green = CV_IMAGE_ELEM( frame_BGR, uchar, row_number, col_number * 3 + 1 );
uchar red = CV_IMAGE_ELEM( frame_BGR, uchar, row_number, col_number * 3 + 2 );
Really, CV_IMAGE_ELEM isn't worth the effort; you might as well just use frame_BGR.ptr(row) to get a pointer to the start of the row and then increment the pointer to give you B, G, R along the row.
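The interleaved-BGR indexing above boils down to a simple formula, illustrated here in Python (names are mine; row padding / widthStep alignment is ignored for simplicity):

```python
def byte_offset(row, col, channel, width, channels=3):
    """Offset of one channel byte in a row-major interleaved image:
    pixels are stored as B,G,R,B,G,R,... so each column spans 3 bytes."""
    return (row * width + col) * channels + channel

# For a 1280-wide BGR image, writing byte columns 0..1279 of a row
# covers only the first 1280 // 3 = 426 pixels, not the full width,
# which is why the white stripe stopped well short of the right edge.
```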
