I'm trying to reproduce Photoshop's multiply blend mode in OpenCV. Equivalents to this would be what you find in GIMP, or when you use the CIMultiplyBlendMode in Apple's CoreImage framework.
Everything I read online suggests that multiply blending is accomplished simply by multiplying the channels of the two input images (i.e., Blend = AxB). And, this works, except for the case(s) where alpha is < 1.0.
You can test this very simply in GIMP/Photoshop/CoreImage by creating two layers/images, filling each with a different solid color, and then modifying the opacity of the first layer. (BTW, when you modify alpha, the operation is no longer commutative in GIMP for some reason.)
A simple example: if A = (0,0,0,0) and B = (0.4,0,0,1.0), and C = AxB, then I would expect C to be (0,0,0,0). This is simple multiplication. But this is not how this blend is implemented in practice. In practice, C = (0.4,0,0,1.0), or C = B.
The bottom line is this: I need to figure out the formula for the multiply blend mode (which is clearly more than AxB) and then implement it in OpenCV (which should be trivial once I have the formula).
Would appreciate any insights.
Also, for reference, here are some links which show multiply blend as being simply AxB:
How does photoshop blend two images together
Wikipedia - Blend Modes
Photoshop Blend Modes
Here is an OpenCV solution based on the source code of GIMP, specifically the function gimp_operation_multiply_mode_process_pixels.
Instead of looping over all pixels it could be vectorized, but I followed the steps of GIMP.
Input images must be of type CV_8UC3 or CV_8UC4.
It also supports an opacity value, which must be in [0, 255].
The original GIMP implementation also supports a mask. It can be trivially added to the code if needed.
This implementation is in fact not symmetrical, and it reproduces the strange behaviour you describe.
#include <opencv2/opencv.hpp>
#include <algorithm>
using namespace cv;

Mat blend_multiply(const Mat& level1, const Mat& level2, uchar opacity)
{
    CV_Assert(level1.size() == level2.size());
    CV_Assert(level1.type() == level2.type());
    CV_Assert(level1.channels() == level2.channels());

    // Get 4 channel float images
    Mat4f src1, src2;
    if (level1.channels() == 3)
    {
        Mat4b tmp1, tmp2;
        cvtColor(level1, tmp1, COLOR_BGR2BGRA);
        cvtColor(level2, tmp2, COLOR_BGR2BGRA);
        tmp1.convertTo(src1, CV_32F, 1. / 255.);
        tmp2.convertTo(src2, CV_32F, 1. / 255.);
    }
    else
    {
        level1.convertTo(src1, CV_32F, 1. / 255.);
        level2.convertTo(src2, CV_32F, 1. / 255.);
    }

    Mat4f dst(src1.rows, src1.cols, Vec4f(0., 0., 0., 0.));

    // Loop on every pixel
    float fopacity = opacity / 255.f;
    float comp_alpha, new_alpha;
    for (int r = 0; r < src1.rows; ++r)
    {
        for (int c = 0; c < src2.cols; ++c)
        {
            const Vec4f& v1 = src1(r, c);
            const Vec4f& v2 = src2(r, c);
            Vec4f& out = dst(r, c);

            comp_alpha = std::min(v1[3], v2[3]) * fopacity;
            new_alpha = v1[3] + (1.f - v1[3]) * comp_alpha;
            if ((comp_alpha > 0.) && (new_alpha > 0.))
            {
                float ratio = comp_alpha / new_alpha;
                out[0] = std::max(0.f, std::min(v1[0] * v2[0], 1.f)) * ratio + (v1[0] * (1.f - ratio));
                out[1] = std::max(0.f, std::min(v1[1] * v2[1], 1.f)) * ratio + (v1[1] * (1.f - ratio));
                out[2] = std::max(0.f, std::min(v1[2] * v2[2], 1.f)) * ratio + (v1[2] * (1.f - ratio));
            }
            else
            {
                // nothing to composite: keep the pixel of the first layer
                out[0] = v1[0];
                out[1] = v1[1];
                out[2] = v1[2];
            }
            out[3] = v1[3];
        }
    }

    Mat3b dst3b;
    Mat4b dst4b;
    dst.convertTo(dst4b, CV_8U, 255.);
    cvtColor(dst4b, dst3b, COLOR_BGRA2BGR);
    return dst3b;
}

int main()
{
    Mat3b layer1 = imread("path_to_image_1");
    Mat3b layer2 = imread("path_to_image_2");
    Mat blend = blend_multiply(layer1, layer2, 255);
    return 0;
}
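To check the example from the question without OpenCV, the per-pixel steps of the loop above can be sketched in plain C++. This is only a sketch under assumptions: the names Rgba and multiplyPixel are hypothetical, all components are floats in [0, 1], and "bottom" plays the role of level1 while "top" plays the role of level2.

```cpp
#include <algorithm>
#include <array>

// RGBA pixel with every component in [0, 1] (hypothetical helper type).
using Rgba = std::array<float, 4>;

// Multiply-blend one pixel of `top` onto `bottom`, mirroring the loop body above.
// `opacity` is in [0, 1].
Rgba multiplyPixel(const Rgba& bottom, const Rgba& top, float opacity)
{
    Rgba out = bottom;  // fallback: keep the bottom pixel unchanged
    const float compAlpha = std::min(bottom[3], top[3]) * opacity;
    const float newAlpha = bottom[3] + (1.f - bottom[3]) * compAlpha;
    if (compAlpha > 0.f && newAlpha > 0.f) {
        const float ratio = compAlpha / newAlpha;
        for (int i = 0; i < 3; ++i) {
            // clamp the product to [0, 1], then mix it with the bottom colour
            const float product = std::max(0.f, std::min(bottom[i] * top[i], 1.f));
            out[i] = product * ratio + bottom[i] * (1.f - ratio);
        }
    }
    return out;  // alpha stays that of the bottom pixel
}
```

With bottom = (0.4, 0, 0, 1.0) and a fully transparent top = (0, 0, 0, 0), compAlpha is 0, so the function returns the bottom pixel unchanged. That matches the C = B result described in the question.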
I managed to sort this out. Feel free to comment with any suggested improvements.
First, I found a clue as to how to implement the multiply function in this post:
multiply blending
And here's a quick OpenCV implementation in C++.
Mat MultiplyBlend(const Mat& cvSource, const Mat& cvBackground) {
    // assumption: cvSource and cvBackground are of type CV_8UC4
    // formula: (cvSource.rgb * cvBackground.rgb * cvSource.a) + (cvBackground.rgb * (1-cvSource.a))

    // replicate the source alpha channel into a 3-channel image
    Mat cvAlpha(cvSource.size(), CV_8UC3, Scalar::all(0));
    Mat input[] = { cvSource };
    int from_to[] = { 3,0, 3,1, 3,2 };
    mixChannels(input, 1, &cvAlpha, 1, from_to, 3);

    Mat cvBackgroundCopy;
    Mat cvSourceCopy;
    cvtColor(cvSource, cvSourceCopy, COLOR_RGBA2RGB);
    cvtColor(cvBackground, cvBackgroundCopy, COLOR_RGBA2RGB);

    // A = cvSource.rgb * cvBackground.rgb * cvSource.a
    Mat cvBlendResultLeft;
    multiply(cvSourceCopy, cvBackgroundCopy, cvBlendResultLeft, 1.0 / 255.0);
    multiply(cvBlendResultLeft, cvAlpha, cvBlendResultLeft, 1.0 / 255.0);

    // invert alpha
    bitwise_not(cvAlpha, cvAlpha);

    // B = cvBackground.rgb * (1-cvSource.a)
    Mat cvBlendResultRight;
    multiply(cvBackgroundCopy, cvAlpha, cvBlendResultRight, 1.0 / 255.0);
    // Mat is reference counted, so release() is used instead of delete
    cvBackgroundCopy.release();
    cvAlpha.release();

    // A + B
    Mat cvBlendResult;
    add(cvBlendResultLeft, cvBlendResultRight, cvBlendResult);
    cvBlendResultLeft.release();
    cvBlendResultRight.release();

    cvtColor(cvBlendResult, cvBlendResult, COLOR_RGB2RGBA);
    return cvBlendResult;
}
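The formula in the comment above can also be checked on a single channel. A minimal sketch, assuming values normalised to [0, 1] rather than 8-bit; the function name multiplyChannel is hypothetical:

```cpp
// One colour channel of the formula
//   (source * background * sourceAlpha) + (background * (1 - sourceAlpha))
// with every value in [0, 1].
float multiplyChannel(float source, float background, float sourceAlpha)
{
    return source * background * sourceAlpha + background * (1.f - sourceAlpha);
}
```

For a fully transparent source (sourceAlpha = 0) the result is just the background, and for a fully opaque source (sourceAlpha = 1) it reduces to the plain product source * background.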
Because of the coronavirus situation my studies have effectively become self-study, and as a newcomer to the Processing language I am not finding it easy to get into image processing, specifically convolution. I hope you can help me.
My lecturer, who is unfortunately almost never reachable, left me the following convolution code. The theory behind convolution is clear to me, but I have many gaps in understanding the code. Could someone add line comments so that I can follow the code more fluently?
The code is as follows:
color convolution (int x, int y, float[][] matrix, int matrix_size, PImage img){
  float rtotal = 0.0;
  float gtotal = 0.0;
  float btotal = 0.0;
  int offset = matrix_size / 2;
  for (int i = 0; i < matrix_size; i++){
    for (int j = 0; j < matrix_size; j++){
      int xloc = x+i-offset;
      int yloc = y+j-offset;
      int loc = xloc + img.width*yloc;
      rtotal += (red(img.pixels[loc]) * matrix[i][j]);
      gtotal += (green(img.pixels[loc]) * matrix[i][j]);
      btotal += (blue(img.pixels[loc]) * matrix[i][j]);
    }
  }
  rtotal = constrain(rtotal, 0, 255);
  gtotal = constrain(gtotal, 0, 255);
  btotal = constrain(btotal, 0, 255);
  return color(rtotal, gtotal, btotal);
}
I have to do a bit of guesswork since I'm not positive about all of the functions you're using and I'm not familiar with the Processing 3+ library, but here's my best shot at it.
color convolution (int x, int y, float[][] matrix, int matrix_size, PImage img){
  // Note: the 'matrix' parameter here will also frequently be referred to as
  // a 'window' or 'kernel' in research
  // I'm not certain what your PImage class is from, but I'll assume
  // you're using the Processing 3+ library and work off of that assumption

  // how much of each color we see within the kernel (matrix) space
  float rtotal = 0.0;
  float gtotal = 0.0;
  float btotal = 0.0;

  // this offset is to zero-center our kernel
  // the fact that we use matrix_size / 2 implicitly
  // means that matrix_size should be an odd number
  // so that we can have a middle pixel
  int offset = matrix_size / 2;

  // looping through the kernel. the fact that we use 'matrix_size'
  // as our end-condition for both dimensions means that our 'matrix' kernel
  // must always be a square
  for (int i = 0; i < matrix_size; i++){
    for (int j = 0; j < matrix_size; j++){
      // calculating the index conversion from 2D to the 1D format that PImage uses
      // refer to: https://processing.org/tutorials/pixels/
      // for a better understanding of PImage indexing (about 1/3 of the way down the page)
      // WARNING: by subtracting the offset it is possible to hit negative
      // x,y values here if you pick an x or y position less than matrix_size / 2.
      // the same index-out-of-bounds can occur on the high end.
      // When you convolve using a kernel of N x N size (N here would be matrix_size)
      // you can only convolve from [N / 2, Width - (N / 2)] for x and y
      int xloc = x+i-offset;
      int yloc = y+j-offset;

      // this is the final 1D PImage index that corresponds to [xloc, yloc] in our 2D image
      // really go back up and take a look at the link if this doesn't make sense, it's pretty good
      int loc = xloc + img.width*yloc;

      // I have to do some speculation again since I'm not certain what red(img.pixels[loc]) does
      // I'll assume it returns the red channel of the pixel
      // this section just adds up all of the pixel colors multiplied by the value in the kernel
      rtotal += (red(img.pixels[loc]) * matrix[i][j]);
      gtotal += (green(img.pixels[loc]) * matrix[i][j]);
      btotal += (blue(img.pixels[loc]) * matrix[i][j]);
    }
  }

  // the fact that no further division or averaging happens after the for-loops implies
  // that the kernel you feed in should have balanced values for your kernel size
  // for example, a kernel that's designed to average out the color over the 3 x 3 area
  // it covers (this would be like blurring the image) would be filled with 1/9
  // in general: the kernel you're using should have a sum of 1 for all of the numbers inside
  // this is just 'in general'; you can play around with not doing that, but you'll probably notice a
  // darkening effect when the sum is less than 1, and a brightening effect if it's greater than 1
  // for more info on kernels, read this: https://en.wikipedia.org/wiki/Kernel_(image_processing)

  // I don't have the code for this constrain function,
  // but it's almost certainly just your typical clamp (constrains the values to [0, 255])
  // Note: this means that your values saturate at 0 and 255
  // if you see a lot of black or white then that means your kernel
  // probably isn't balanced as mentioned above
  rtotal = constrain(rtotal, 0, 255);
  gtotal = constrain(gtotal, 0, 255);
  btotal = constrain(btotal, 0, 255);

  // Finished!
  return color(rtotal, gtotal, btotal);
}
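The out-of-bounds warning and the "kernel sums to 1" remark in the comments above can be tried out in a small standalone sketch. It is written in C++ rather than Processing, works on a single grayscale channel, and clamps coordinates to the image edge, which is only one of several possible border strategies. The names GrayImage and convolveAt are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Grayscale image stored row by row, similar to the 1D PImage.pixels layout.
struct GrayImage {
    int width;
    int height;
    std::vector<float> pixels;  // size == width * height, values in [0, 255]
};

// Convolve the pixel at (x, y) with a square kernel of odd size.
// Coordinates that fall outside the image are clamped to the nearest edge,
// so the 1D index can never go out of bounds.
int convolveAt(const GrayImage& img, int x, int y,
               const std::vector<std::vector<float>>& kernel)
{
    const int size = static_cast<int>(kernel.size());
    const int offset = size / 2;
    float total = 0.f;
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            const int xloc = std::min(std::max(x + i - offset, 0), img.width - 1);
            const int yloc = std::min(std::max(y + j - offset, 0), img.height - 1);
            total += img.pixels[xloc + img.width * yloc] * kernel[i][j];
        }
    }
    // same role as constrain(total, 0, 255) in the Processing code
    total = std::min(std::max(total, 0.f), 255.f);
    return static_cast<int>(std::lround(total));
}
```

A 3 x 3 kernel filled with 1/9 sums to 1, so on a uniformly coloured image it leaves every pixel at its original value, including pixels at the corners thanks to the clamping.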
I am working on an OCR project, and during preprocessing some red stamps need to be removed so that the text near the stamps can be detected. I have tried a lot of methods (such as changing pixel values and thresholding the red channel) but they all failed.
Any suggestions are highly appreciated.
Python, C++, Java or what? Since you didn't state the OpenCV implementation you are using, I'm giving my answer in C++.
An option is to use the HSV color space to filter out the range of red values that defines the seal. My approach is to use the CMYK color space to filter everything except the black (or dark) text. It should do a pretty good job on printed media, which is your case.
//read input image:
std::string imageName = "C://opencvImages//seal.png";
cv::Mat imageInput = cv::imread( imageName );
Now, perform the CMYK conversion. OpenCV does not support this operation out of the box, so bear with me; I provide the helper function at the end of this post.
//CMYK conversion:
std::vector<cv::Mat> cmyk;
cmyk = rgb2cmyk( imageInput );
//This is the Black channel:
cv::Mat blackChannel = cmyk[3].clone();
This is the image of the black channel; it is nice how everything that is not black (or dark) practically disappears!
Now, optionally, enhance the result by applying a brightness and contrast adjustment. The goal is just to separate the text from the background a little better; we want well-defined pixel distributions to get a nice binary image.
//Brightness and contrast adjustment:
float alpha = 2.0;
float beta = -50.0;
contrastBrightnessAdjustment( blackChannel, alpha, beta );
Again, OpenCV does not offer brightness and contrast adjustment out of the box; however, its implementation is very easy. Hold on a little bit, and let me show you the result of this operation:
Nice. Let's Otsu-threshold this bad boy to get a nice binary image containing the clean text:
cv::Mat binaryImage;
cv::threshold( blackChannel, binaryImage, 0, 255, cv::THRESH_OTSU );
This is what you get:
Now, the RGB to CMYK conversion function. I'm using the following implementation. The function receives an image as loaded by OpenCV (BGR channel order) and returns a vector containing each of the CMYK channels.
std::vector<cv::Mat> rgb2cmyk( cv::Mat& inputImage ){
    std::vector<cv::Mat> cmyk;
    for (int i = 0; i < 4; i++) {
        cmyk.push_back( cv::Mat( inputImage.size(), CV_8UC1 ) );
    }

    // OpenCV stores color images as BGR: index 2 is red, index 0 is blue
    std::vector<cv::Mat> inputRGB;
    cv::split( inputImage, inputRGB );

    for (int i = 0; i < inputImage.rows; i++) {
        for (int j = 0; j < inputImage.cols; j++) {
            float r = (int)inputRGB[2].at<uchar>(i, j) / 255.;
            float g = (int)inputRGB[1].at<uchar>(i, j) / 255.;
            float b = (int)inputRGB[0].at<uchar>(i, j) / 255.;
            float k = std::min(std::min(1-r, 1-g), 1-b);
            // pure black gives k == 1; use 1 as divisor there to avoid 0/0
            float d = (k < 1.f) ? (1 - k) : 1.f;
            cmyk[0].at<uchar>(i, j) = (1 - r - k) / d * 255.;
            cmyk[1].at<uchar>(i, j) = (1 - g - k) / d * 255.;
            cmyk[2].at<uchar>(i, j) = (1 - b - k) / d * 255.;
            cmyk[3].at<uchar>(i, j) = k * 255.;
        }
    }
    return cmyk;
}
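The same conversion can be tried on single pixels without OpenCV. The following is a sketch only: the name rgbToCmyk is hypothetical, it takes the channels in R, G, B order, it rounds instead of truncating, and it treats pure black as a special case so that nothing is divided by zero.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Convert one 8-bit RGB pixel to 8-bit {C, M, Y, K} values.
std::array<std::uint8_t, 4> rgbToCmyk(std::uint8_t red, std::uint8_t green, std::uint8_t blue)
{
    const float r = red / 255.f;
    const float g = green / 255.f;
    const float b = blue / 255.f;
    const float k = std::min(std::min(1.f - r, 1.f - g), 1.f - b);

    std::array<std::uint8_t, 4> cmyk{};  // all zeros
    if (k >= 1.f) {
        cmyk[3] = 255;  // pure black: only the K channel is set
        return cmyk;
    }
    const float scale = 255.f / (1.f - k);
    cmyk[0] = static_cast<std::uint8_t>(std::lround((1.f - r - k) * scale));
    cmyk[1] = static_cast<std::uint8_t>(std::lround((1.f - g - k) * scale));
    cmyk[2] = static_cast<std::uint8_t>(std::lround((1.f - b - k) * scale));
    cmyk[3] = static_cast<std::uint8_t>(std::lround(k * 255.f));
    return cmyk;
}
```

A saturated red pixel such as (255, 0, 0) ends up with K = 0, while a black pixel ends up with K = 255. This is why a red stamp fades out of the black channel while dark text stays bright.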
And this is the contrastBrightnessAdjustment function, implemented with a matrix iterator. The function receives a grayscale image and applies the linear transformation via the alpha and beta parameters:
void contrastBrightnessAdjustment( cv::Mat inputImage, float alpha, float beta ){
    // the black channel is single-channel (CV_8UC1), so iterate over uchar
    cv::MatIterator_<uchar> it, end;
    for (it = inputImage.begin<uchar>(), end = inputImage.end<uchar>(); it != end; ++it) {
        uchar &pixel = *it;
        pixel = cv::saturate_cast<uchar>(alpha*pixel+beta);
    }
}
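To see what the linear adjustment does to individual values, here is a small sketch without OpenCV. The name adjustPixel is hypothetical, and the clamping plus std::lround only approximates cv::saturate_cast, which may round exact halves differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Apply value * alpha + beta to one 8-bit pixel and saturate to [0, 255].
std::uint8_t adjustPixel(std::uint8_t pixel, float alpha, float beta)
{
    const float value = alpha * pixel + beta;
    const float clamped = std::min(std::max(value, 0.f), 255.f);
    return static_cast<std::uint8_t>(std::lround(clamped));
}
```

With the alpha = 2.0 and beta = -50.0 used above, a mid-gray value of 100 becomes 150, values of 153 or more saturate to 255, and values of 25 or less drop to 0. That stretches the histogram before thresholding.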
I have a basic question that I am struggling with here. When you look at the findmyicone sample code from WWDC 2010, you will see this:
static const uint8_t orangeColor[] = {255, 127, 0};
uint8_t referenceColor[3];
// Remove luminance
static inline void normalize( const uint8_t colorIn[], uint8_t colorOut[] ) {
    // Dot product
    int sum = 0;
    for (int i = 0; i < 3; i++)
        sum += colorIn[i] / 3;
    for (int j = 0; j < 3; j++)
        colorOut[j] = (float) ((colorIn[j] / (float) sum) * 255);
}
And then it is called:
normalize(orangeColor, referenceColor);
Running the debugger, it is converting BGRA: (Red 255, Green 127, Blue 0) to (Red 0, Green 255, Blue 0). I have looked on the web and SO to find details on luminance and dot product and there is really no information.
1- Can someone guide me on what this function is doing?
2- Can you guide me to some helpful topics/primer online as well?
Thanks again
What they're trying to do is track a particular color across variations in brightness, so they're normalizing for the luminance of the color. I do something similar in the fragment shader I use in a color tracking example based on a GPU Gems paper from Apple, as well as the ColorObjectTracking sample application in my GPUImage framework:
vec3 normalizeColor(vec3 color)
{
    return color / max(dot(color, vec3(1.0/3.0)), 0.3);
}

vec4 maskPixel(vec3 pixelColor, vec3 maskColor)
{
    float d;
    vec4 calculatedColor;

    // Compute distance between current pixel color and reference color
    d = distance(normalizeColor(pixelColor), normalizeColor(maskColor));

    // If color difference is larger than threshold, return black.
    calculatedColor = (d > threshold) ? vec4(0.0) : vec4(1.0);

    //Multiply color by texture
    return calculatedColor;
}
The above calculation takes the average of the three color components by multiplying each channel by 1/3 and then summing them (that's what the dot product does here). It then divides each color channel by this average to arrive at a normalized color.
The distance between this normalized color and the target one is calculated, and if it is within a certain threshold the pixel is marked as being of that color.
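As for the specific numbers seen in the debugger, the arithmetic of the normalize() function from the question can be replayed with a wider result type. This is a sketch, not the sample's code: normalizeWide is a hypothetical name, and it keeps the results as int so that values above 255 remain visible.

```cpp
#include <array>
#include <cstdint>

// Same arithmetic as normalize() in the question, but stored as int
// instead of being narrowed to uint8_t.
std::array<int, 3> normalizeWide(const std::array<std::uint8_t, 3>& colorIn)
{
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        sum += colorIn[i] / 3;  // integer division, as in the sample

    std::array<int, 3> out{};   // all zeros
    if (sum == 0)
        return out;             // black input: avoid dividing by zero
    for (int j = 0; j < 3; ++j)
        out[j] = static_cast<int>((colorIn[j] / static_cast<float>(sum)) * 255);
    return out;
}
```

For orange (255, 127, 0) the integer average is 85 + 42 + 0 = 127, so the scaled channels come out as roughly 512, 255 and 0. A value of 512 does not fit into a uint8_t. Converting an out-of-range float to uint8_t is formally undefined in C, but on many platforms it keeps only the low 8 bits, and 512 & 0xFF is 0. That would explain the (0, 255, 0) observed in the debugger.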
This is just one way of determining proximity of one color to another. Another way is to convert the RGB values into Y, Cr, and Cb (Y, U, and V) components and then take the distance between just the chrominance portions (Cr and Cb):
vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
vec4 textureColor2 = texture2D(inputImageTexture2, textureCoordinate2);
float maskY = 0.2989 * colorToReplace.r + 0.5866 * colorToReplace.g + 0.1145 * colorToReplace.b;
float maskCr = 0.7132 * (colorToReplace.r - maskY);
float maskCb = 0.5647 * (colorToReplace.b - maskY);
float Y = 0.2989 * textureColor.r + 0.5866 * textureColor.g + 0.1145 * textureColor.b;
float Cr = 0.7132 * (textureColor.r - Y);
float Cb = 0.5647 * (textureColor.b - Y);
float blendValue = 1.0 - smoothstep(thresholdSensitivity, thresholdSensitivity + smoothing, distance(vec2(Cr, Cb), vec2(maskCr, maskCb)));
This code is what I use in a chroma keying shader, and it's based on a similar calculation that Apple uses in one of their sample applications. Which one is best can depend on the particular situation you're facing.
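The chrominance comparison from the shader can be sketched on the CPU as well. The names Chroma, chromaOf and chromaDistance below are hypothetical; the coefficients are copied from the shader code above, and colour components are assumed to be in [0, 1].

```cpp
#include <cmath>

// Cr/Cb pair of a colour (the luminance Y is deliberately dropped).
struct Chroma {
    float cr;
    float cb;
};

Chroma chromaOf(float r, float g, float b)
{
    const float y = 0.2989f * r + 0.5866f * g + 0.1145f * b;
    return {0.7132f * (r - y), 0.5647f * (b - y)};
}

// Euclidean distance between the chrominance of two RGB colours.
float chromaDistance(float r1, float g1, float b1, float r2, float g2, float b2)
{
    const Chroma a = chromaOf(r1, g1, b1);
    const Chroma c = chromaOf(r2, g2, b2);
    return std::hypot(a.cr - c.cr, a.cb - c.cb);
}
```

Because the three luminance weights sum to 1, a dark gray and a light gray have almost the same chrominance and so a distance near 0, while orange and green stay far apart. That is the brightness independence both approaches aim for.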
I use OpenCV to undistort a set of points after camera calibration.
The code follows.
const int npoints = 2; // number of point specified

// Points initialization.
// Only 2 points in this example, in real code they are read from file.
float input_points[npoints][2] = {{0,0}, {2560, 1920}};

CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);

// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    for (int ci = 0; ci < 2; ++ci) {
        *(src_ptr + pi * 2 + ci) = input_points[pi][ci];
    }
}

cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
After the code above runs, dst contains the following numbers:
-8.82689655e-001 -7.05507338e-001 4.16228324e-001 3.04863811e-001
which are too small in comparison with the numbers in src.
At the same time, if I undistort the image via the call:
cvUndistort2( srcImage, dstImage, &camera1, &dist_coeffs1 );
I receive a good undistorted image, which means that pixel coordinates are not modified as drastically as the separate points are.
How can I obtain the same undistortion for specific points as for images?
The points should be "unnormalized" using the camera matrix.
More specifically, the following transformation should be added after the call to cvUndistortPoints:
double fx = CV_MAT_ELEM(camera1, double, 0, 0);
double fy = CV_MAT_ELEM(camera1, double, 1, 1);
double cx = CV_MAT_ELEM(camera1, double, 0, 2);
double cy = CV_MAT_ELEM(camera1, double, 1, 2);

float * dst_ptr = (float*)dst->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    float& px = *(dst_ptr + pi * 2);
    float& py = *(dst_ptr + pi * 2 + 1);
    // perform transformation.
    // In fact this is equivalent to multiplication by the camera matrix
    px = px * fx + cx;
    py = py * fy + cy;
}
More info on camera matrix at OpenCV 'Camera Calibration and 3D Reconstruction'
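The "unnormalizing" step itself is independent of OpenCV and can be sketched as a small function. The types Point2 and Intrinsics and the function toPixelCoordinates are hypothetical, and the intrinsic values in the usage note are made up for illustration, not taken from the question.

```cpp
// A 2D point, either in normalized image coordinates or in pixels.
struct Point2 {
    double x;
    double y;
};

// Focal lengths and principal point taken from a 3x3 camera matrix.
struct Intrinsics {
    double fx;
    double fy;
    double cx;
    double cy;
};

// Map a normalized point (as returned by undistortPoints without a
// new projection matrix) back to pixel coordinates.
Point2 toPixelCoordinates(const Point2& normalized, const Intrinsics& k)
{
    return {normalized.x * k.fx + k.cx, normalized.y * k.fy + k.cy};
}
```

For example, with assumed intrinsics fx = 1000, fy = 800, cx = 1280, cy = 960, the normalized point (0, 0) maps to the principal point (1280, 960), and (0.5, -0.25) maps to (1780, 760).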
The following C++ function call should work as well:
std::vector<cv::Point2f> inputDistortedPoints = ...
std::vector<cv::Point2f> outputUndistortedPoints;
cv::Mat cameraMatrix = ...
cv::Mat distCoeffs = ...
cv::undistortPoints(inputDistortedPoints, outputUndistortedPoints, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
It may be your matrix size :)
OpenCV expects a vector of points: a column or a row matrix with two channels. But because your input matrix has only 2 points, and the number of channels is also 2, it cannot figure out whether the input is a row or a column.
So, fill a longer input mat with bogus values, and keep only the first two results:
const int npoints = 4; // number of point specified

// Points initialization.
// Only 2 points in this example, in real code they are read from file.
float input_points[npoints][4] = {{0,0}, {2560, 1920}}; // the rest will be set to 0

CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);

// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    for (int ci = 0; ci < 2; ++ci) {
        *(src_ptr + pi * 2 + ci) = input_points[pi][ci];
    }
}

cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
Although the OpenCV documentation says undistortPoints accepts only 2-channel input, it actually accepts:
a 1-column, 2-channel, multi-row mat, or
a 2-column, multi-row, 1-channel mat (this case is not documented), or
a multi-column, 1-row, 2-channel mat
(as seen in undistort.cpp, line 390)
But a bug inside (or a lack of available info) makes it wrongly confuse the second case with the third one when the number of columns is 2. So, your data is treated as a 2-column, 2-row, 1-channel mat.
I also ran into this problem, and it took me some time to research and finally understand it.
As the formula above shows, in OpenCV the distortion is applied before the camera matrix, so the processing order is:
image_distorted -> camera_matrix -> un-distort function -> camera_matrix -> back to image_undistorted.
So you need a small fix: pass camera1 again as the last argument.
Mat eye3 = Mat::eye(3, 3, CV_64F);
CvMat eye3_c = eye3; // the C API expects a CvMat*, not a Mat*
cvUndistortPoints(src, dst, &camera1, &distCoeffs1, &eye3_c, &camera1);
Otherwise, if the last two parameters are empty, the points are projected to normalized image coordinates.
See the code: opencv-3.4.0-src\modules\imgproc\src\undistort.cpp :297
I have a piece of code for rotating and translating image:
Point2f pt(0, in.rows);
double angle = atan(trans.c / trans.b) * 180 / M_PI;
Mat r = getRotationMatrix2D(pt, -angle, 1.0);
warpAffine(in, out, r, in.size(), interpolation); /* rotation */
Mat t = (Mat_<double>(2, 3) << 1, 0, trans.a, 0, 1, -trans.d);
warpAffine(out, out, t, in.size(), interpolation); /* translation */
The problem is that I'm doing this in two steps. So if I have an angle of 90 degrees, for example, the first "out" variable will be empty because all the data is out of bounds. Is there a way to do it in one pass, in order to avoid losing my data and getting a black image?
I think that the best thing would be to combine r and t into one matrix, but I'm a little lost.
Best regards,
Here is an example on how to combine 2 homographies by simple multiplication and how to extract an affine transformation from a 3x3 homography.
#include <opencv2/opencv.hpp>
#include <cmath>

int main(int argc, char* argv[])
{
    cv::Mat input = cv::imread("C:/StackOverflow/Input/Lenna.png");

    // create two 3x3 identity homography matrices
    cv::Mat homography1 = cv::Mat::eye(3, 3, CV_64FC1);
    cv::Mat homography2 = cv::Mat::eye(3, 3, CV_64FC1);

    double alpha1 = -13; // degrees
    double t1_x = -86; // pixel
    double t1_y = -86; // pixel

    double alpha2 = 21; // degrees
    double t2_x = 86; // pixel
    double t2_y = 86; // pixel

    // hope there is no error in the signs:
    // compose homography1
    homography1.at<double>(0, 0) = cos(CV_PI*alpha1 / 180);
    homography1.at<double>(0, 1) = -sin(CV_PI*alpha1 / 180);
    homography1.at<double>(1, 0) = sin(CV_PI*alpha1 / 180);
    homography1.at<double>(1, 1) = cos(CV_PI*alpha1 / 180);
    homography1.at<double>(0, 2) = t1_x;
    homography1.at<double>(1, 2) = t1_y;

    // compose homography2
    homography2.at<double>(0, 0) = cos(CV_PI*alpha2 / 180);
    homography2.at<double>(0, 1) = -sin(CV_PI*alpha2 / 180);
    homography2.at<double>(1, 0) = sin(CV_PI*alpha2 / 180);
    homography2.at<double>(1, 1) = cos(CV_PI*alpha2 / 180);
    homography2.at<double>(0, 2) = t2_x;
    homography2.at<double>(1, 2) = t2_y;

    // the top two rows of each 3x3 matrix form the 2x3 affine matrix
    cv::Mat affine1 = homography1(cv::Rect(0, 0, 3, 2));
    cv::Mat affine2 = homography2(cv::Rect(0, 0, 3, 2));

    cv::Mat dst1;
    cv::Mat dst2;
    cv::warpAffine(input, dst1, affine1, input.size());
    cv::warpAffine(input, dst2, affine2, input.size());

    // note: as a matrix product acting on column vectors,
    // homography1*homography2 applies homography2 first and homography1 second
    cv::Mat combined_homog = homography1*homography2;
    cv::Mat combined_affine = combined_homog(cv::Rect(0, 0, 3, 2));

    cv::Mat dst_combined;
    cv::warpAffine(input, dst_combined, combined_affine, input.size());

    cv::imshow("input", input);
    cv::imshow("dst1", dst1);
    cv::imshow("dst2", dst2);
    cv::imshow("combined", dst_combined);
    cv::waitKey(0); // keep the windows open until a key is pressed
    return 0;
}
In this example, an image is first rotated and translated to the left, later to the right. If the two transformations are performed one after the other, significant image areas get lost. If they are instead combined by homography multiplication, it is as if the full operation were done in a single step, without losing image parts in the intermediate step.
if image was first transformed with H1, later with H2:
if the image is transformed with the combination of H1*H2 directly:
One typical application of this homography combination is to first translate the image center to the origin, then rotate, then translate back to original position. This has the effect as if the image was rotated around its center of gravity.
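The translate, rotate, translate-back idea and the effect of the multiplication order can be sketched with plain 3x3 matrices, without OpenCV. All names here (Mat3, compose, translation, rotation, rotationAbout, apply) are hypothetical. The sketch assumes column vectors, so in compose(a, b) the matrix b acts on a point first, and it uses the same sign convention for the rotation entries as the example above.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<double, 9>;  // row-major 3x3 matrix

// Matrix product a * b: applied to a column vector, b acts first, then a.
Mat3 compose(const Mat3& a, const Mat3& b)
{
    Mat3 m{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                m[r * 3 + c] += a[r * 3 + k] * b[k * 3 + c];
    return m;
}

Mat3 translation(double tx, double ty)
{
    return {1.0, 0.0, tx,
            0.0, 1.0, ty,
            0.0, 0.0, 1.0};
}

Mat3 rotation(double degrees)
{
    const double rad = degrees * std::acos(-1.0) / 180.0;
    return {std::cos(rad), -std::sin(rad), 0.0,
            std::sin(rad),  std::cos(rad), 0.0,
            0.0,            0.0,           1.0};
}

// Move (cx, cy) to the origin, rotate, then move back.
Mat3 rotationAbout(double cx, double cy, double degrees)
{
    return compose(translation(cx, cy),
                   compose(rotation(degrees), translation(-cx, -cy)));
}

// Apply an affine 3x3 matrix to the point (x, y).
std::array<double, 2> apply(const Mat3& m, double x, double y)
{
    return {m[0] * x + m[1] * y + m[2],
            m[3] * x + m[4] * y + m[5]};
}
```

The top two rows of such a matrix are what would be handed to cv::warpAffine. Swapping the arguments of compose generally gives a different transform, so it is worth checking on one or two sample points which of the two transforms is meant to act first.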