Accurate Image resizing - image-processing

I need to resize an image using bilinear interpolation and create an image pyramid. I will detect corners at the different levels of the pyramid and scale the pixel co-ordinates so that they are relative to the dimensions of the largest image.
If a corner of an object is detected as a corner/keypoint/feature at every level, then after scaling the corresponding pixel co-ordinates from the different levels onto the largest image, I would ideally like them to have the same value. So when resizing the images, I am trying to be as accurate as possible.
Let's assume I am resizing an image L_n_minus_1 to create a smaller image L_n. My scale factor is "ratio" (ratio>1).
*I cannot use any library.
I can resize using the pseudocode below (which is what I generally find when I search online for resizing algorithms.)
int offset = 0;
for (int i = 0; i < height_of_L_n; i++){
for (int j = 0; j < width_of_L_n; j++){
//********* This part will differ in the later version I provided below
//
int xSrcInt = (int)(ratio * j);
float xDiff = ratio * j - xSrcInt;
int ySrcInt = (int)(ratio * i);
float yDiff = ratio * i - ySrcInt;
// The above code will differ in the later version I provided below
index = (ySrcInt * width_of_L_n_minus_1 + xSrcInt);
//Get the 4 pixel values to interpolate
a = L_n_minus_1[index];
b = L_n_minus_1[index + 1];
c = L_n_minus_1[index + width_of_L_n_minus_1];
d = L_n_minus_1[index + width_of_L_n_minus_1 + 1];
//Calculate the co-efficients for interpolation
float c0 = (1 - xDiff)*(1 - yDiff);
float c1 = (xDiff)*(1 - yDiff);
float c2 = (yDiff)*(1 - xDiff);
float c3 = (xDiff*yDiff);
//half is added for rounding the pixel intensity.
int intensity = (a*c0) + (b*c1) + (c*c2) + (d*c3) + 0.5;
if (intensity > 255)
intensity = 255;
L_n[offset++] = intensity;
}
}
Or I could use this modified piece of code below :
int offset = 0;
for (int i = 0; i < height_of_L_n; i++){
for (int j = 0; j < width_of_L_n; j++){
// Here the code differs from the first piece of code
// Assume pixel centers start from (0.5,0.5). The top left pixel has co-ordinate (0.5,0.5)
// 0.5 is added to go to the co-ordinates where top left pixel has co-ordinate (0.5,0.5)
// 0.5 is subtracted to go to the generally used co-ordinates where top left pixel has co-ordinate (0,0)
// or in other words map the new co-ordinates to array indices
int xSrcInt = int((ratio * (j + 0.5)) - 0.5);
float xDiff = (ratio * (j + 0.5)) - 0.5 - xSrcInt;
int ySrcInt = int((ratio * (i + 0.5)) - 0.5);
float yDiff = (ratio * (i + 0.5)) - 0.5 - ySrcInt;
// Difference with previous code ends here
index = (ySrcInt * width_of_L_n_minus_1 + xSrcInt);
//Get the 4 pixel values to interpolate
a = L_n_minus_1[index];
b = L_n_minus_1[index + 1];
c = L_n_minus_1[index + width_of_L_n_minus_1];
d = L_n_minus_1[index + width_of_L_n_minus_1 + 1];
//Calculate the co-efficients for interpolation
float c0 = (1 - xDiff)*(1 - yDiff);
float c1 = (xDiff)*(1 - yDiff);
float c2 = (yDiff)*(1 - xDiff);
float c3 = (xDiff*yDiff);
//half is added for rounding the pixel intensity.
int intensity = (a*c0) + (b*c1) + (c*c2) + (d*c3) + 0.5;
if (intensity > 255)
intensity = 255;
L_n[offset++] = intensity;
}
}
The second piece of code was developed assuming pixel centers having co-ordinates like (0.5, 0.5) as they have in textures.
This way the top left pixel will have co-ordinate (0.5, 0.5).
Let us assume :
a 2 by 2 Destination Image is being resized from a 4 by 4 Source Image.
In the first piece of code, the first pixel is assumed to have co-ordinates (0,0). With a ratio of 2, for example, the first destination pixel gives:
xSrcInt = (int)(0*2); // 0
ySrcInt = (int)(0*2); // 0
xDiff = (0*2) - 0; // 0
yDiff = (0*2) - 0; // 0
Thus effectively I will just be copying the first pixel value from the source, as c0 will be 1 and c1,c2 and c3 will be 0.
But in the second piece of code I will get
xSrcInt = (int)((0.5*2) - 0.5); // 0;
ySrcInt = (int)((0.5*2) - 0.5); // 0;
xDiff = ((0.5*2) - 0.5) - 0; // 0.5;
yDiff = ((0.5*2) - 0.5) - 0; // 0.5;
In this case c0,c1,c2 and c3 will all be equal to 0.25. Thus I will be using the 4 pixels at the top left.
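The difference between the two mappings can be checked numerically. The sketch below is a Python transcription of only the co-ordinate part of the two loops above. The function names are made up for illustration, and `keypoint_to_larger_level` assumes that the same pixel-center convention is used both for resizing and for scaling keypoints back to the larger image.

```python
def src_coord_corner(j, ratio):
    """First version: destination index j maps to source position ratio * j."""
    pos = ratio * j
    base = int(pos)
    return base, pos - base


def src_coord_center(j, ratio):
    """Second version: pixel centers sit at index + 0.5 in both images."""
    pos = ratio * (j + 0.5) - 0.5
    base = int(pos)  # truncation is safe here because pos >= 0 when ratio >= 1
    return base, pos - base


def keypoint_to_larger_level(x, ratio):
    """Map a keypoint co-ordinate from the smaller image into the larger one
    using the pixel-center convention (an assumption, see the text above)."""
    return ratio * (x + 0.5) - 0.5


if __name__ == "__main__":
    # 4 -> 2 downscale, ratio 2: compare both mappings for each destination column
    for j in range(2):
        print(j, src_coord_corner(j, 2.0), src_coord_center(j, 2.0))
```

With ratio 2 the center-based version always lands halfway between two source pixels, so every destination pixel averages a full 2 by 2 block, while the corner-based version copies every second source pixel.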
Please let me know what you think, and whether there is any bug in my second piece of code. As far as visual results go, they look fine.
I do seem to notice better alignment of keypoints with the second piece of code, but maybe that's because I am judging with prejudice :-).
Thanks in advance.

Related

opencv: how to clusterize by angle using kmeans()

The question is how to cluster pairs of some units by their angle. The problem is that kmeans operates on Euclidean distance and does not know about the periodic nature of angles. To make it work, one needs to translate each angle into Euclidean space such that the following holds:
1. close angles are close values in Euclidean space;
2. far angles are far in Euclidean space.
This means that 90 and -90 are distant values, 180 and -180 are the same, and 170 and -170 are close (angles go from left, up and over to the right: 0 to +180, and from left, down and over to the right: 0 to -180).
I tried various sin() functions, but they all have issues with points 1 and 2. The most promising one is sin(x * 0.5f), but it also has the problem that 180 and -180 are distant values in Euclidean space.
The solution I found is to translate angles to points on a circle and feed those into kmeans. That way kmeans compares distances between points, and this works perfectly.
One important thing to mention: the kmeans eps in the termination criterion is expressed in the units of the samples that you feed to kmeans. In our example the most distant points are 200 units apart (2 * radius), so an eps of 1.0f is totally fine. If you use cv::normalize(samples, samples, 0.0f, 1.0f) on your samples before calling kmeans(), adjust eps accordingly; something like eps=0.01f works better there.
Enjoy! Hope this helps someone.
static cv::Point2f angleToPointOnCircle(float angle, float radius, cv::Point2f origin /* center */)
{
float x = radius * cosf(angle * M_PI / 180.0f) + origin.x;
float y = radius * sinf(angle * M_PI / 180.0f) + origin.y;
return cv::Point2f(x, y);
}
static std::vector<std::pair<size_t, int> > biggestKmeansGroup(const std::vector<int> &labels, int count)
{
std::vector<std::pair<size_t, int> > indices;
std::map<int, size_t> l2cm;
for (int i = 0; i < labels.size(); ++i)
l2cm[labels[i]]++;
std::vector<std::pair<size_t, int> > c2lm;
for (std::map<int, size_t>::iterator it = l2cm.begin(); it != l2cm.end(); it++)
c2lm.push_back(std::make_pair(it->second, it->first)); // count, group
std::sort(c2lm.begin(), c2lm.end(), cmp_pair_first_reverse);
for (int i = 0; i < c2lm.size() && count-- > 0; i++)
indices.push_back(c2lm[i]);
return indices;
}
static void sortByAngle(std::vector<boost::shared_ptr<Pair> > &group,
std::vector<boost::shared_ptr<Pair> > &result)
{
std::vector<int> labels;
cv::Mat samples;
/* Radius is not so important here. */
for (int i = 0; i < group.size(); i++)
samples.push_back(angleToPointOnCircle(group[i]->angle, 100, cv::Point2f(0, 0)));
/* 90 degrees per group. May be less if you need it. */
static int PAIR_MAX_FINE_GROUPS = 4;
int groupNr = std::max(std::min((int)group.size(), PAIR_MAX_FINE_GROUPS), 1);
assert(group.size() >= groupNr);
cv::kmeans(samples.reshape(1, (int)group.size()), groupNr, labels,
cvTermCriteria(CV_TERMCRIT_EPS/* | CV_TERMCRIT_ITER*/, 30, 1.0f),
100, cv::KMEANS_RANDOM_CENTERS);
std::vector<std::pair<size_t, int> > biggest = biggestKmeansGroup(labels, groupNr);
for (int g = 0; g < biggest.size(); g++) {
for (int i = 0; i < group.size(); i++) {
if (labels[i] == biggest[g].second)
result.push_back(group[i]);
}
}
}
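
To see why mapping angles onto a circle satisfies both requirements, here is a small Python sketch of the same idea as angleToPointOnCircle, without OpenCV. The helper names `angle_to_point` and `chord` are hypothetical; the Euclidean distance between two mapped points is the chord length, which is what kmeans would compare.

```python
import math


def angle_to_point(angle_deg, radius=1.0):
    """Map an angle in degrees to a point on a circle centered at the origin."""
    a = math.radians(angle_deg)
    return radius * math.cos(a), radius * math.sin(a)


def chord(angle_a, angle_b, radius=1.0):
    """Euclidean distance between the circle points of two angles."""
    ax, ay = angle_to_point(angle_a, radius)
    bx, by = angle_to_point(angle_b, radius)
    return math.hypot(ax - bx, ay - by)


if __name__ == "__main__":
    for pair in [(180, -180), (170, -170), (90, -90)]:
        print(pair, round(chord(*pair), 6))
```

The pair (180, -180) maps to the same point, (170, -170) ends up close together, and (90, -90) is a full diameter apart, which is the ordering the question asks for.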

Get Rho and Theta from Hough-Transform opencvsharp?

I have a Hough transform implemented using OpenCvSharp (OpenCV), and I get the lines detected in my image in a console application/Windows Forms application:
lines = edgeImg.HoughLines2(storage, HoughLinesMethod.Probabilistic, 1, Math.PI / 180, 60, 100, 100);
for (int i = 0; i < lines.Total; i++)
{
CvLineSegmentPoint segP= lines.GetSeqElem<CvLineSegmentPoint>(i).Value;
double angle = Math.Atan2((segP.P2.Y) - (segP.P1.Y), (segP.P2.X) - (segP.P1.X)) * 180 / Math.PI;
if (Math.Abs(angle) <= 60)
continue;
if (segP.P1.Y > segP.P2.Y + 20 || segP.P1.Y < segP.P2.Y - 20)
src.Line(segP.P1, segP.P2, CvColor.blue, 2, LineType.AntiAlias, 0);
}
I have tried different methods for visualizing the rho-theta space. Since HoughLines2 does the whole transformation internally, I have tried to recover these values from the x,y co-ordinates in reverse:
double angle = Math.Atan2(dy, dx) * 180 / Math.PI;
double theta = 90 - angle;
var thetaRad = theta*Math.PI/180;
double rho = (x1 * Math.Cos(thetaRad) + y1 * Math.Sin(thetaRad));
My first question is whether I need to compute two rho/theta values, one from x1,y1 and one from x2,y2, or whether computing a single rho/theta gives the right intersect.
Second, how can I visualize them in the right format? (What I currently see in my output image is some random white dots at the top left corner.)
Third, is it reasonable to recover the rho,theta values this way, or would you suggest performing the Hough transform myself to reduce the complexity? (I used the OpenCvSharp function for better performance.)
Thanks!
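On the first question, a line in normal form has a single (rho, theta) pair, so both endpoints of a segment should give the same rho once theta is the angle of the line's normal. The Python sketch below illustrates this with a hypothetical helper, `segment_to_rho_theta`. It takes theta as the segment direction plus 90 degrees, which is not the same as the `90 - angle` used in the snippet above, so treat it as something to check against your own data and not as what HoughLines2 does internally.

```python
import math


def segment_to_rho_theta(x1, y1, x2, y2):
    """Return (rho, theta) of the infinite line through two points.

    theta is the angle of the line's normal in radians,
    rho = x * cos(theta) + y * sin(theta) for every point (x, y) on the line.
    """
    direction = math.atan2(y2 - y1, x2 - x1)
    theta = direction + math.pi / 2.0
    rho = x1 * math.cos(theta) + y1 * math.sin(theta)
    if rho < 0:  # normalise so that rho is non-negative
        rho = -rho
        theta += math.pi
    return rho, theta % (2.0 * math.pi)


if __name__ == "__main__":
    # the same segment given in both endpoint orders
    print(segment_to_rho_theta(0, 10, 10, 0))
    print(segment_to_rho_theta(10, 0, 0, 10))
```

Both calls describe the line x + y = 10, so they should agree on rho (about 7.07) and theta (45 degrees) regardless of which endpoint comes first.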

Get random number from screen except rectangle

As my title says, I want to get a random origin (x and y co-ordinates). On my iPad screen in landscape I have one rectangle, and I want a random origin that lies outside this rectangle. So obviously I want a random number between a min and a max for the X-axis, and the same for the Y-axis.
I tried with following answers but not helpful for me.
Generate Random Numbers Between Two Numbers in Objective-C
Generate a random float between 0 and 1
Generate random number in range in iOS?
For more clear see below image
In above image I just want to find random number (for origin) of GREEN screen. How can I achieve it ?
Edited
I had tried.
int randNum = rand() % ([max intValue] - [min intValue]) + [min intValue];
Same for both X-Axis & y-Axis.
If the blue exclusion rectangle is not "too large" compared to the green screen rectangle
then the easiest solution is to
create a random point inside the green rectangle,
check if the point lies inside the blue rectangle, and
repeat the process if necessary.
That would look like:
CGRect greenRect = ...;
CGRect blueRect = ...;
CGPoint p;
do {
p = CGPointMake(greenRect.origin.x + arc4random_uniform(greenRect.size.width),
greenRect.origin.y + arc4random_uniform(greenRect.size.height));
} while (CGRectContainsPoint(blueRect, p));
If I remember correctly, the expected number of iterations is G/(G - B), where G is
the area of the green rectangle and B is the area of the blue rectangle.
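The same rejection loop can be sketched in Python for readers without an iOS toolchain. Rectangles are written as hypothetical `(x, y, width, height)` tuples with integer sizes, and the blue rectangle is assumed to be smaller than the green one so that the loop terminates.

```python
import random


def random_point_excluding(green, blue, rng=random):
    """Pick integer points in green until one falls outside blue."""
    gx, gy, gw, gh = green
    bx, by, bw, bh = blue
    while True:
        x = gx + rng.randrange(gw)
        y = gy + rng.randrange(gh)
        inside_blue = bx <= x < bx + bw and by <= y < by + bh
        if not inside_blue:
            return x, y


if __name__ == "__main__":
    print(random_point_excluding((0, 0, 1024, 768), (300, 200, 400, 300)))
```

Each attempt succeeds with probability (G - B)/G, which is where the expected G/(G - B) iterations come from.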
What if you first pick x within the green rectangle like this:
int randomX = arc4random_uniform((uint32_t)greenRectangle.frame.size.width);
int randomY; // we'll do y later
Then check whether this x is outside the horizontal range of the blue rectangle:
if(randomX < blueRectangle.frame.origin.x || randomX > (blueRectangle.frame.origin.x + blueRectangle.frame.size.width))
{
//in this case we are outside the rectangle with the x component
//so can randomly generate any y like this:
randomY = arc4random_uniform((uint32_t)greenRectangle.frame.size.height);
}
//And if randomX is within the blue rectangle's x range then we can use the space either above or below it:
else
{
//randomly decide if you are going to use the range above the blue rectangle or below it
BOOL shouldPickTopRange = arc4random_uniform(2);
if(shouldPickTopRange)
{
//in this case y can be any point before the start of blue rectangle
randomY = arc4random_uniform((uint32_t)blueRectangle.frame.origin.y);
}
else
{
//in this case y can be any point after the blue rectangle
int minY = blueRectangle.frame.origin.y + blueRectangle.frame.size.height;
int maxY = greenRectangle.frame.size.height;
randomY = arc4random_uniform((uint32_t)(maxY - minY + 1)) + minY;
}
}
Then your random point would be:
CGPoint randomPoint = CGPointMake(randomX, randomY);
The only thing missing above is to check if your blue rectangle sits at y = 0 or at the very bottom of green rectangle.
[Apologies I did this with OS X, translation is straightforward]
A non-iterative solution:
- (NSPoint) randomPointIn:(NSRect)greenRect excluding:(NSRect)blueRect
{
// random point on green x-axis
int x = arc4random_uniform(NSWidth(greenRect)) + NSMinX(greenRect);
if (x < NSMinX(blueRect) || x > NSMaxX(blueRect))
{
// to the left or right of the blue, full height available
int y = arc4random_uniform(NSHeight(greenRect)) + NSMinY(greenRect);
return NSMakePoint(x, y);
}
else
{
// within the x-range of the blue, avoid it
int y = arc4random_uniform(NSHeight(greenRect) - NSHeight(blueRect)) + NSMinY(greenRect);
if (y >= NSMinY(blueRect))
{
// not below the blue, step over it
y += NSHeight(blueRect);
}
return NSMakePoint(x, y);
}
}
This picks a random x-coord in the range of green. If that point is outside the range of blue it picks a random y-coord in the range of green; otherwise it reduces the y range by the height of blue, produces a random point, and then increases it if required to avoid blue.
There are other solutions based on picking a uniform random point in the available area (green - blue) and then adjusting, but the complexity isn't worth it I think (I haven't done the stats).
Addendum
OK, folk seem concerned about uniformity, so here is the algorithm mentioned in my last paragraph. We're picking a "point" with integer coords, so the number of points to pick from is the green area minus the blue area. Pick a point number randomly in this range. Now place it into one of the rectangles below, left, right or above the blue:
// convenience
int RectArea(NSRect r) { return (int)NSWidth(r) * (int)NSHeight(r); }
- (NSPoint) randomPointIn:(NSRect)greenRect excluding:(NSRect)blueRect
{
// note we are using "points" with integer coords so the
// bottom left point is 0,0 and the top right (width-1, height-1)
// you can adjust this to suit
// the number of points to pick from is the diff of the areas
int availableArea = RectArea(greenRect) - RectArea(blueRect);
int pointNumber = arc4random_uniform(availableArea);
// now "just" locate pointNumber into the available space
// we consider four rectangles, one each full width above and below the blue
// and one each to the left and right of the blue
int belowArea = NSWidth(greenRect) * (NSMinY(blueRect) - NSMinY(greenRect));
if (pointNumber < belowArea)
{
return NSMakePoint(pointNumber % (int)NSWidth(greenRect) + NSMinX(greenRect),
pointNumber / (int)NSWidth(greenRect) + NSMinY(greenRect));
}
// not below - consider to left
pointNumber -= belowArea;
int leftWidth = NSMinX(blueRect) - NSMinX(greenRect);
int leftArea = NSHeight(blueRect) * leftWidth;
if (pointNumber < leftArea)
{
return NSMakePoint(pointNumber % leftWidth + NSMinX(greenRect),
pointNumber / leftWidth + NSMinY(blueRect));
}
// not left - consider to right
pointNumber -= leftArea;
int rightWidth = NSMaxX(greenRect) - NSMaxX(blueRect);
int rightArea = NSHeight(blueRect) * rightWidth;
if (pointNumber < rightArea)
{
return NSMakePoint(pointNumber % rightWidth + NSMaxX(blueRect),
pointNumber / rightWidth + NSMinY(blueRect));
}
// it must be above
pointNumber -= rightArea;
return NSMakePoint(pointNumber % (int)NSWidth(greenRect) + NSMinX(greenRect),
pointNumber / (int)NSWidth(greenRect) + NSMaxY(blueRect));
}
This is uniform, but whether it is worth it you'll have to decide.
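A compact way to convince yourself that the strip decomposition is uniform is to check that every point number maps to a distinct valid point. The Python sketch below follows the same below/left/right/above order. Rectangles are hypothetical `(x, y, width, height)` integer tuples, and blue is assumed to lie fully inside green.

```python
import random


def point_for_index(n, green, blue):
    """Map point number n (0 <= n < green area - blue area) to a point outside blue."""
    gx, gy, gw, gh = green
    bx, by, bw, bh = blue
    strips = [
        (gx, gy, gw, by - gy),                   # full-width strip below blue
        (gx, by, bx - gx, bh),                   # left of blue
        (bx + bw, by, gx + gw - (bx + bw), bh),  # right of blue
        (gx, by + bh, gw, gy + gh - (by + bh)),  # full-width strip above blue
    ]
    for sx, sy, sw, sh in strips:
        area = sw * sh
        if n < area:
            return sx + n % sw, sy + n // sw
        n -= area
    raise ValueError("point number out of range")


def random_point_uniform(green, blue, rng=random):
    """Pick one of the available integer points with equal probability."""
    available = green[2] * green[3] - blue[2] * blue[3]
    return point_for_index(rng.randrange(available), green, blue)


if __name__ == "__main__":
    print(random_point_uniform((0, 0, 6, 5), (2, 1, 3, 2)))
```

Because the mapping from point numbers to points is one-to-one, a uniform point number gives a uniform point, and empty strips (blue touching an edge of green) are skipped since their area is zero.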
Okay. This was bothering me, so I did the work. It's a lot of source code, but computationally lightweight and probabilistically correct (haven't tested).
With all due respect to @MartinR, I think this is superior insofar as it doesn't loop (consider the case where the contained rect covers a very large portion of the outer rect). And with all due respect to @CRD, it's a pain, but not impossible to get the desired probabilities. Here goes:
// Find a random position in rect, excluding a contained rect called exclude
//
// It looks terrible, but it's just a lot of bookkeeping.
// Divide rect into 8 regions, like a tic-tac-toe board, excluding the center square
// Reading left to right, top to bottom, call these: A,B,C,D, (no E, it's the center) F,G,H,I
// The random point must be in one of these regions, choose by throwing a random dart, using
// cumulative probabilities to choose. The likelihood that the dart will be in regions A-I is
// the ratio of each's area to the total (less the center)
// With a target rect, correctly selected, we can easily pick a random point within it.
+ (CGPoint)pointInRect:(CGRect)rect excluding:(CGRect)exclude {
// find important points in the grid
CGFloat xLeft = CGRectGetMinX(rect);
CGFloat xCenter = CGRectGetMinX(exclude);
CGFloat xRight = CGRectGetMaxX(exclude);
CGFloat widthLeft = exclude.origin.x-CGRectGetMinX(rect);
CGFloat widthCenter = exclude.size.width;
CGFloat widthRight = CGRectGetMaxX(rect)-CGRectGetMaxX(exclude);
CGFloat yTop = CGRectGetMinY(rect);
CGFloat yCenter = exclude.origin.y;
CGFloat yBottom = CGRectGetMaxY(exclude);
CGFloat heightTop = exclude.origin.y-CGRectGetMinY(rect);
CGFloat heightCenter = exclude.size.height;
CGFloat heightBottom = CGRectGetMaxY(rect)-CGRectGetMaxY(exclude);
// compute the eight regions
CGFloat areaA = widthLeft * heightTop;
CGFloat areaB = widthCenter * heightTop;
CGFloat areaC = widthRight * heightTop;
CGFloat areaD = widthLeft * heightCenter;
CGFloat areaF = widthRight * heightCenter;
CGFloat areaG = widthLeft * heightBottom;
CGFloat areaH = widthCenter * heightBottom;
CGFloat areaI = widthRight * heightBottom;
CGFloat areaSum = areaA+areaB+areaC+areaD+areaF+areaG+areaH+areaI;
// compute the normalized probabilities
CGFloat pA = areaA/areaSum;
CGFloat pB = areaB/areaSum;
CGFloat pC = areaC/areaSum;
CGFloat pD = areaD/areaSum;
CGFloat pF = areaF/areaSum;
CGFloat pG = areaG/areaSum;
CGFloat pH = areaH/areaSum;
// compute cumulative probabilities
CGFloat cumB = pA+pB;
CGFloat cumC = cumB+pC;
CGFloat cumD = cumC+pD;
CGFloat cumF = cumD+pF;
CGFloat cumG = cumF+pG;
CGFloat cumH = cumG+pH;
// now pick which region we're in, using cumulative probabilities
// whew, maybe we should just use MartinR's loop. No No, we've come too far!
CGFloat dart = uniformRandomUpTo(1.0);
CGRect targetRect;
// top row
if (dart < pA) {
targetRect = CGRectMake(xLeft, yTop, widthLeft, heightTop);
} else if (dart >= pA && dart < cumB) {
targetRect = CGRectMake(xCenter, yTop, widthCenter, heightTop);
} else if (dart >= cumB && dart < cumC) {
targetRect = CGRectMake(xRight, yTop, widthRight, heightTop);
}
// middle row
else if (dart >= cumC && dart < cumD) {
targetRect = CGRectMake(xLeft, yCenter, widthLeft, heightCenter);
} else if (dart >= cumD && dart < cumF) {
targetRect = CGRectMake(xRight, yCenter, widthRight, heightCenter);
}
// bottom row
else if (dart >= cumF && dart < cumG) {
targetRect = CGRectMake(xLeft, yBottom, widthLeft, heightBottom);
} else if (dart >= cumG && dart < cumH) {
targetRect = CGRectMake(xCenter, yBottom, widthCenter, heightBottom);
} else {
targetRect = CGRectMake(xRight, yBottom, widthRight, heightBottom);
}
// yay. pick a point in the target rect
CGFloat x = uniformRandomUpTo(targetRect.size.width) + CGRectGetMinX(targetRect);
CGFloat y = uniformRandomUpTo(targetRect.size.height)+ CGRectGetMinY(targetRect);
return CGPointMake(x, y);
}
float uniformRandomUpTo(float max) {
return max * arc4random_uniform(RAND_MAX) / RAND_MAX;
}
Try this code, Worked for me.
-(CGPoint)randomPointInRect:(CGRect)r
{
CGPoint p = r.origin;
p.x += arc4random_uniform((u_int32_t) CGRectGetWidth(r));
p.y += arc4random_uniform((u_int32_t) CGRectGetHeight(r));
return p;
}
I don't like piling onto answers. However, the provided solutions do not work, so I feel obliged to chime in.
Martin's is fine, and simple... which may be all you need. It does have one major problem though... finding the answer when the inner rectangle dominates the containing rectangle could take quite a long time. If it fits your domain, then always choose the simplest solution that works.
jancakes solution is not uniform, and contains a fair amount of bias.
The second solution provided by dang just plain does not work... because arc4_random takes and returns uint32_t and not a floating point value. Thus, all generated numbers should fall into the first box.
You can address that by using drand48(), but it's not a great number generator, and has bias of its own. Furthermore, if you look at the distribution generated by that method, it has heavy bias that favors the box just to the left of the "inner box."
You can easily test the generation... toss a couple of UIViews in a controller, add a button handler that plots 100000 "random" points and you can see the bias clearly.
So, I hacked up something that is not elegant, but does provide a uniform distribution of random numbers in the larger rectangle that are not in the contained rectangle.
You can surely optimize the code and make it a bit easier to read...
Caveat: Will not work if you have more than 4,294,967,296 total points. There are multiple solutions to this, but this should get you moving in the right direction.
- (CGPoint)randomPointInRect:(CGRect)rect
excludingRect:(CGRect)excludeRect
{
excludeRect = CGRectIntersection(rect, excludeRect);
if (CGRectEqualToRect(excludeRect, CGRectNull)) {
return CGPointZero;
}
CGPoint result;
uint32_t rectWidth = rect.size.width;
uint32_t rectHeight = rect.size.height;
uint32_t rectTotal = rectHeight * rectWidth;
uint32_t excludeWidth = excludeRect.size.width;
uint32_t excludeHeight = excludeRect.size.height;
uint32_t excludeTotal = excludeHeight * excludeWidth;
if (rectTotal == 0) {
return CGPointZero;
}
if (excludeTotal == 0) {
uint32_t r = arc4random_uniform(rectHeight * rectWidth);
result.x = r % rectWidth;
result.y = r /rectWidth;
return result;
}
uint32_t numValidPoints = rectTotal - excludeTotal;
uint32_t r = arc4random_uniform(numValidPoints);
uint32_t numPointsAboveOrBelowExcludedRect =
(rectHeight * excludeWidth) - excludeTotal;
if (r < numPointsAboveOrBelowExcludedRect) {
result.x = (r % excludeWidth) + excludeRect.origin.x;
result.y = r / excludeWidth;
if (result.y >= excludeRect.origin.y) {
result.y += excludeHeight;
}
} else {
r -= numPointsAboveOrBelowExcludedRect;
uint32_t numPointsLeftOfExcludeRect =
rectHeight * excludeRect.origin.x;
if (r < numPointsLeftOfExcludeRect) {
uint32_t rowWidth = excludeRect.origin.x;
result.x = r % rowWidth;
result.y = r / rowWidth;
} else {
r -= numPointsLeftOfExcludeRect;
CGFloat startX =
excludeRect.origin.x + excludeRect.size.width;
uint32_t rowWidth = rectWidth - startX;
result.x = (r % rowWidth) + startX;
result.y = r / rowWidth;
}
}
return result;
}

standard deviation of a UIImage/CGImage

I need to calculate the standard deviation on an image I have inside a UIImage object.
I know already how to access all pixels of an image, one at a time, so somehow I can do it.
I'm wondering if there is a function somewhere in the framework that performs this in a better and more efficient way... I can't find one, so maybe it doesn't exist.
Does anyone know how to do this?
bye
To further expand on my comment above: I would definitely look into using the Accelerate framework, especially depending on the size of your image. Even if your image is only a few hundred pixels by a few hundred, you will have a ton of data to process, and Accelerate along with vDSP will make all of that math a lot faster, since it uses the CPU's vector (SIMD) units. I will look into this a little more and possibly post some code in a few minutes.
UPDATE
I will post some code to do standard deviation in a single dimension using vDSP, but this could definitely be extended to 2-D
#include <Accelerate/Accelerate.h>
float imageR[] = {0.1f, 0.2f, 0.3f, 0.4f /* , ... */}; // vector of values
int numValues = 100; // number of values in imageR
float mean = 0; // place holder for mean
vDSP_meanv(imageR, 1, &mean, numValues); // find the mean of the vector
mean = -1 * mean; // negate the mean so that adding it is actually a subtraction
float *subMeanVec = (float*)calloc(numValues, sizeof(float)); // placeholder vector
vDSP_vsadd(imageR, 1, &mean, subMeanVec, 1, numValues); // subtract mean from vector
// free(imageR) only if imageR was heap-allocated; the array above is not
float *squared = (float*)calloc(numValues, sizeof(float)); // placeholder for squared vector
vDSP_vsq(subMeanVec, 1, squared, 1, numValues); // square vector element by element
free(subMeanVec); // free some memory
float sum = 0; // place holder for sum
vDSP_sve(squared, 1, &sum, numValues); // sum entire vector
free(squared); // free squared vector
float stdDev = sqrt(sum / numValues); // calculated (population) std deviation
Please explain your query so that I can come up with a specific reply.
If I understand you correctly, you want to calculate the standard deviation of the RGB values of the pixels or of their HSV colour. You can write your own standard deviation method for circular quantities in the case of HSV and RGB.
We can do this by wrapping the values.
For example: the naive average of [358, 2] degrees is (358+2)/2=180 degrees.
But this is not correct, because the mean should be 0 degrees.
So we wrap 358 into -2.
Now the answer is 0.
So you have to apply wrapping, and then you can calculate the standard deviation from above link.
UPDATE:
Convert RGB to HSV
// r,g,b values are from 0 to 1 // h = [0,360], s = [0,1], v = [0,1]
// if s == 0, then h = -1 (undefined)
void RGBtoHSV( float r, float g, float b, float *h, float *s, float *v )
{
float min, max, delta;
min = MIN( r, MIN(g, b ));
max = MAX( r, MAX(g, b ));
*v = max;
delta = max - min;
if( max != 0 )
*s = delta / max;
else {
// r = g = b = 0
*s = 0;
*h = -1;
return;
}
if( delta == 0 ) {
// r = g = b != 0 (grey): s is 0, so the hue is undefined
*h = -1;
return;
}
if( r == max )
*h = ( g - b ) / delta;
else if( g == max )
*h=2+(b-r)/delta;
else
*h=4+(r-g)/delta;
*h *= 60;
if( *h < 0 )
*h += 360;
}
and then calculate standard deviation for hue value by this:
double calcStddev(ArrayList<Double> angles){
double sin = 0;
double cos = 0;
for(int i = 0; i < angles.size(); i++){
sin += Math.sin(angles.get(i) * (Math.PI/180.0));
cos += Math.cos(angles.get(i) * (Math.PI/180.0));
}
sin /= angles.size();
cos /= angles.size();
double stddev = Math.sqrt(-Math.log(sin*sin+cos*cos));
return stddev;
}
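
As a quick check of the wrapping example, the sketch below computes a circular mean and standard deviation in Python. It uses the common definition sqrt(-2 ln R), where R is the mean resultant length, which is the same quantity as the Java snippet above since sin*sin + cos*cos there equals R squared. The function name and the choice to return degrees are assumptions made for illustration.

```python
import math


def circular_mean_std(angles_deg):
    """Return (mean, std) in degrees for a list of angles in degrees."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean = math.degrees(math.atan2(s, c)) % 360.0
    r = min(1.0, math.hypot(s, c))  # clamp against rounding slightly above 1
    if r == 0.0:
        return mean, math.inf       # angles cancel out: spread is undefined/maximal
    std = math.sqrt(-2.0 * math.log(r))
    return mean, math.degrees(std)


if __name__ == "__main__":
    print(circular_mean_std([358.0, 2.0]))
```

For [358, 2] the mean comes out at 0 degrees (possibly printed as a value just below 360 because of rounding) and the standard deviation is about 2 degrees, where the naive arithmetic mean would give 180.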

FFT Convolution - Really low PSNR

I'm convolving an image (512*512) with an FFT filter (kernel size = 10), and the result looks good.
But when I compare it with an image that I convolved the normal way, the result is horrible.
The PSNR is about 35.
67,187 of 262,144 pixel values differ by 1 or more (peak at ~8), with a maximum pixel value of 255.
My question is: is this normal when convolving in frequency space, or might there be a problem with my convolution/transform functions? The strange thing is that I should get better results when using double as the data type, but the result stays COMPLETELY the same.
When I transform an image into frequency space, DON'T convolve it, and then transform it back, it's fine: the PSNR is about 140 when using float.
Also, since the pixel differences are only 1-10, I think I can rule out scaling errors.
EDIT: More Details for bored interested people
I use the open source kissFFT library. With real 2dimensional input (kiss_fftndr.h)
My Image Datatype is PixelMatrix. Simply a matrix with alpha, red, green and blue values from 0.0 to 1.0 float
My kernel is also a PixelMatrix.
Here some snippets from the Convolution function
Used datatypes:
#define kiss_fft_scalar float
typedef struct {
kiss_fft_scalar r;
kiss_fft_scalar i;
} kiss_fft_cpx;
Configuration of the FFT:
//parameters to kiss_fftndr_alloc:
//1st param = array with the size of the 2 dimensions (in my case dim={width, height})
//2nd param = count of the dimensions (in my case 2)
//3rd param = 0 or 1 (forward or inverse FFT)
//4th and 5th params are not relevant
kiss_fftndr_cfg stf = kiss_fftndr_alloc(dim, 2, 0, 0, 0);
kiss_fftndr_cfg sti = kiss_fftndr_alloc(dim, 2, 1, 0, 0);
Padding and transforming the kernel:
I make a new array:
kiss_fft_scalar kernel[width*height];
I fill it with 0 in a loop.
Then I fill the middle of this array with the kernel I want to use.
So if I would use a 2*2 kernel with values 1/4, 1/4, 1/4 and 1/4 it would look like
0 0 0 0
0 1/4 1/4 0
0 1/4 1/4 0
0 0 0 0
The zeros are padded until they reach the size of the image.
Then I swap the quadrants of the image diagonally. It looks like:
1/4 0 0 1/4
0 0 0 0
0 0 0 0
1/4 0 0 1/4
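The padding and quadrant swap can be expressed as a single circular shift that moves the kernel's origin to index (0, 0). The pure-Python sketch below does that with a hypothetical helper, `pad_and_center_kernel`. It assumes the kernel origin is at (height // 2, width // 2); for even-sized kernels there is no true center, so this is only a convention, and a different choice shifts the filtered image by one pixel, which is the kind of offset worth checking when comparing against a spatial-domain result.

```python
def pad_and_center_kernel(kernel, height, width):
    """Zero-pad kernel to height x width with its origin wrapped to (0, 0)."""
    kh, kw = len(kernel), len(kernel[0])
    cy, cx = kh // 2, kw // 2  # assumed origin; a convention for even sizes
    padded = [[0.0] * width for _ in range(height)]
    for ky in range(kh):
        for kx in range(kw):
            # negative offsets wrap around to the far edge of the array
            padded[(ky - cy) % height][(kx - cx) % width] = kernel[ky][kx]
    return padded


if __name__ == "__main__":
    box = [[0.25, 0.25], [0.25, 0.25]]
    for row in pad_and_center_kernel(box, 4, 4):
        print(row)
```

For the 2*2 box kernel in a 4*4 array this reproduces the layout shown above, with the four 1/4 entries in the corners.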
Now I transform it: kiss_fftndr(stf, kernel, outkernel);
outkernel is declared as
kiss_fft_cpx *outkernel = new kiss_fft_cpx[width*height];
Getting the colors into arrays:
kiss_fft_scalar *red = new kiss_fft_scalar[width*height];
kiss_fft_scalar *green = new kiss_fft_scalar[width*height];
kiss_fft_scalar *blue = new kiss_fft_scalar[width*height];
for(int i=0; i<height; i++) {
for(int j=0; j<width; j++) {
red[i*width+j] = input.get(j,i).getRed(); //input is the input image pixel matrix
green[i*width+j] = input.get(j,i).getGreen();
blue[i*width+j] = input.get(j,i).getBlue();
}
}
Then I transform the arrays:
kiss_fftndr(stf, red, outred);
kiss_fftndr(stf, green, outgreen);
kiss_fftndr(stf, blue, outblue); //the out-arrays are type kiss_fft_cpx*
The convolution:
What we have now:
3 transformed color arrays from type kiss_fft_cpx*
1 transformed kernel array from type kiss_fft_cpx*
They are both complex arrays
Now comes the convolution:
for(int m=0; m<til; m++) {
for(int n=0; n<til; n++) {
kiss_fft_scalar real = outcolor[m*til+n].r; //I do that for all 3 arrys in my code!
kiss_fft_scalar imag = outcolor[m*til+n].i; //so I have realred, realgreen, realblue
kiss_fft_scalar realMask = outkernel[m*til+n].r; // and imagred, imaggreen, etc.
kiss_fft_scalar imagMask = outkernel[m*til+n].i;
outcolor[m*til+n].r = real * realMask - imag * imagMask; //Same thing here in my code i
outcolor[m*til+n].i = real * imagMask + imag * realMask; //do it with all 3 colors
}
}
Now I transform them back:
kiss_fftndri(sti, outred, red);
kiss_fftndri(sti, outgreen, green);
kiss_fftndri(sti, outblue, blue);
and I create a new Pixel Matrix with the values from the color-arrays
PixelMatrix output;
for(int i=0; i<height; i++) {
for(int j=0; j<width; j++) {
Pixel p = new Pixel();
p.setRed( red[i*width+j] / (width*height) ); //I divide by (width*height) because of the scaling happening in the FFT;
p.setGreen( green[i*width+j] / (width*height) );
p.setBlue( blue[i*width+j] / (width*height) );
output.set(j , i , p);
}
}
Notes:
I already take care in advance that the image has a size with a power of 2 (256*256), (512*512) etc.
Examples:
kernelsize: 10
Input:
Output:
Output from normal convolution:
my console says :
142519 out of 262144 Pixels have a difference of 1 or more (maxRGB = 255)
PSNR: 32.006027221679688
MSE: 44.116752624511719
though for my eyes they look the same °.°
Maybe one person is bored and goes through the code. It's not urgent, but it's a kind of problem I just want to know what the hell I did wrong ^^
Last, but not least, my PSNR function, though I don't really think that's the problem :D
void calculateThePSNR(const PixelMatrix first, const PixelMatrix second, float* avgpsnr, float* avgmse) {
int height = first.getHeight();
int width = first.getWidth();
BMP firstOutput;
BMP secondOutput;
firstOutput.SetSize(width, height);
secondOutput.SetSize(width, height);
double rsum=0.0, gsum=0.0, bsum=0.0;
int count = 0;
int total = 0;
for(int i=0; i<height; i++) {
for(int j=0; j<width; j++) {
Pixel pixOne = first.get(j,i);
Pixel pixTwo = second.get(j,i);
double redOne = pixOne.getRed()*255;
double greenOne = pixOne.getGreen()*255;
double blueOne = pixOne.getBlue()*255;
double redTwo = pixTwo.getRed()*255;
double greenTwo = pixTwo.getGreen()*255;
double blueTwo = pixTwo.getBlue()*255;
firstOutput(j,i)->Red = redOne;
firstOutput(j,i)->Green = greenOne;
firstOutput(j,i)->Blue = blueOne;
secondOutput(j,i)->Red = redTwo;
secondOutput(j,i)->Green = greenTwo;
secondOutput(j,i)->Blue = blueTwo;
if((redOne-redTwo) > 1.0 || (redOne-redTwo) < -1.0) {
count++;
}
total++;
rsum += (redOne - redTwo) * (redOne - redTwo);
gsum += (greenOne - greenTwo) * (greenOne - greenTwo);
bsum += (blueOne - blueTwo) * (blueOne - blueTwo);
}
}
fprintf(stderr, "%d out of %d Pixels have a difference of 1 or more (maxRGB = 255)", count, total);
double rmse = rsum/(height*width);
double gmse = gsum/(height*width);
double bmse = bsum/(height*width);
double rpsnr = 20 * log10(255/sqrt(rmse));
double gpsnr = 20 * log10(255/sqrt(gmse));
double bpsnr = 20 * log10(255/sqrt(bmse));
firstOutput.WriteToFile("test.bmp");
secondOutput.WriteToFile("test2.bmp");
system("display test.bmp");
system("display test2.bmp");
*avgmse = (rmse + gmse + bmse)/3;
*avgpsnr = (rpsnr + gpsnr + bpsnr)/3;
}
Phonon had the right idea. Your images are shifted. If you shift your image by (1,1), then the MSE will be approximately zero (provided that you mask or crop the images accordingly). I confirmed this using the code (Python + OpenCV) below.
import cv
import sys
import math
def main():
fname1, fname2 = sys.argv[1:]
im1 = cv.LoadImage(fname1)
im2 = cv.LoadImage(fname2)
tmp = cv.CreateImage(cv.GetSize(im1), cv.IPL_DEPTH_8U, im1.nChannels)
cv.AbsDiff(im1, im2, tmp)
cv.Mul(tmp, tmp, tmp)
mse = cv.Avg(tmp)
print 'MSE:', mse
psnr = [ 10*math.log(255**2/m, 10) for m in mse[:-1] ]
print 'PSNR:', psnr
if __name__ == '__main__':
main()
Output:
MSE: (0.027584912741602553, 0.026742391458366047, 0.028147870144492403, 0.0)
PSNR: [63.724087463606452, 63.858801190963192, 63.636348220531396]
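For reference, the PSNR figures above follow directly from the MSE values. A minimal standalone Python version of that conversion, assuming 8-bit data with a peak value of 255, looks like this (the function name `psnr` is just for illustration):

```python
import math


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    print(psnr(0.027584912741602553))
```

This is algebraically the same as the 20 * log10(255 / sqrt(mse)) form used in the question's calculateThePSNR function.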
My advice is to try the following code:
A=double(inputS(1:10:length(inputS))); %segmentation
A(:)=-A(:);
%process the image or signal with the fast Fourier transform and the inverse FFT
fresult=fft(inputS);
fresult(1:round(length(inputS)*2/fs))=0;
fresult(end-round(length(fresult)*2/fs):end)=0;
Y=real(ifft(fresult));
This code helps you obtain an image of the same size and is good for removing the DC component; then you can do the convolution.

Resources