Resize organized point cloud

I have an organized point cloud (1280 * 720) captured from a 3D camera. I wonder whether there's a method to resize (cut down) this point cloud to a smaller size (e.g. 128 * 72) while keeping the cloud organized.
(I don't think this is the same as downsampling. "Resize" here means something like zooming an image.)
I am using Point Cloud Library 1.8.0 but am stuck on this.
Any advice is welcome, thanks in advance!

Rooscannon's answer is in principle correct, but has some bugs in it. The correct uniform subsampling of an organized point cloud is as follows:
// Downsampling or keypoint extraction
int scale = 3;
PointCloud<PointXYZRGB>::Ptr keypoints (new PointCloud<PointXYZRGB>);
keypoints->width = cloud->width / scale;
keypoints->height = cloud->height / scale;
keypoints->points.resize(keypoints->width * keypoints->height);
for( size_t i = 0, ii = 0; i < keypoints->height; ii += scale, i++){
    for( size_t j = 0, jj = 0; j < keypoints->width; jj += scale, j++){
        keypoints->at(j, i) = cloud->at(jj, ii); // at(column, row)
    }
}
The loop conditions, the indexing, and the initialization of the subsampled point cloud differ from the original answer; without these changes the subsampled point cloud would no longer be organized.

Just take one point out of every `times` points by which you want to reduce your cloud;
something like this should work:
for (pcl::PointCloud<pcl::PointXYZ>::const_iterator it = src->begin(); it < src->end(); it += times)
{
    dest.points.push_back(*it);
}
The only problem is that the cloud might contain some NaN values. To correct that, just set is_dense to false on dest and call removeNaNFromPointCloud on it.
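For reference, a minimal sketch of that cleanup step, assuming dest is the pcl::PointCloud<pcl::PointXYZ> from the loop above:

#include <pcl/filters/filter.h>

// Mark the cloud as possibly containing NaNs, then strip them.
// Note: removeNaNFromPointCloud leaves the output unorganized (height = 1),
// which the next answer elaborates on.
std::vector<int> indices;
dest.is_dense = false;
pcl::removeNaNFromPointCloud(dest, dest, indices);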
Hope this can help you!

Can't comment, but removing NaNs from your point cloud by default makes it unorganized. Quite likely the NaNs are there as dummy points for locations where your instrument was not able to observe a point, just to keep the matrix dimensions correct. Removing them breaks the matrix structure, and you'll end up with a different number of points than your 1280 * 720 matrix would expect.
If you wish to down-sample an organized point cloud, say by a factor of 2, you could try something like
int scale = 2;
pcl::PointCloud<pcl::your_point_type> down_sampled_cloud;
down_sampled_cloud.width = original_cloud.width / scale;
down_sampled_cloud.height = original_cloud.height / scale;
for( int ii = 0; ii < original_cloud.height; ii += scale){
    for( int jj = 0; jj < original_cloud.width; jj += scale ){
        down_sampled_cloud.push_back(original_cloud.at(ii,jj));
    }
}
Change scale to whatever you wish.
This method just down-samples the original point cloud; it will not interpolate points between existing points. Scaling by a non-integer factor is trickier and might yield unwanted results if the surface is not continuous.
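If you do want image-style resizing to an arbitrary target size, a minimal nearest-neighbor sketch along the lines of the accepted answer might look like this (the 128 * 72 target and the PointXYZRGB type are assumptions taken from the question; no interpolation between points is attempted):

pcl::PointCloud<pcl::PointXYZRGB>::Ptr resized (new pcl::PointCloud<pcl::PointXYZRGB>);
resized->width = 128;
resized->height = 72;
resized->points.resize(resized->width * resized->height);
for (size_t i = 0; i < resized->height; i++){
    for (size_t j = 0; j < resized->width; j++){
        // Map each target cell back to its nearest source cell.
        size_t src_i = i * cloud->height / resized->height;
        size_t src_j = j * cloud->width / resized->width;
        resized->at(j, i) = cloud->at(src_j, src_i); // at(column, row)
    }
}

This keeps the cloud organized because width, height, and the points vector stay consistent with each other.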

Related

How can I align the frequency bins with the Fourier transform magnitude?

I am attempting to implement a fast Fourier transform with an associated complex magnitude function on the STM32F411RE Nucleo development board. My goal is to separate a combined signal containing multiple sinusoidal elements into its separate frequency components, with correct amplitudes.
My issue is that I cannot correctly line up the frequency bins with the outcomes of the complex magnitude function. I am also starting to question the validity of these outcomes as such.
I have tried a number of FFT implementations with the magnitude fix posted by other people, most notably the examples listed on Stack Overflow by SleuthEye and the blog by LB9MG.
AFAIK I have a similar approach, but somehow their approaches yield the desired results and mine does not. Below is my code, which I have altered to work via the implementation that SleuthEye has created.
int main(void)
{
    fftLen = 32; // can be 32, 64, 128, 256, 512, 1024, 2048, 4096
    half_fftLen = fftLen / 2;
    volatile float32_t sampleFreq = 50 * fftLen; // Fs = bin size * FFT length, desired bin size = 50 Hz
    arm_rfft_fast_instance_f32 inst;
    arm_status status;
    status = arm_rfft_fast_init_f32(&inst, fftLen);
    float32_t signalCombined[fftLen] = {0};
    float32_t fftCombined[fftLen] = {0};
    float32_t fftMagnitude[fftLen] = {0};
    volatile float32_t fftFreq[fftLen] = {0};
    float32_t maxAmp;
    uint32_t maxAmpInd;
    while (1)
    {
        for (int i = 0; i < fftLen; i++)
        {
            signalCombined[i] = 40 * arm_sin_f32(450 * i); // 450 frequency at 40 amplitude
        }
        arm_rfft_fast_f32(&inst, signalCombined, fftCombined, 0); // perhaps switch to complex transform to allow for negative frequencies?
        arm_cmplx_mag_f32(fftCombined, fftMagnitude, half_fftLen);
        fftMagnitude[0] = fftCombined[0];
        fftMagnitude[half_fftLen] = fftCombined[1];
        arm_max_f32(fftMagnitude, half_fftLen, &maxAmp, &maxAmpInd); // We need the 3 max values
        for (int k = 0; k < fftLen; k++)
        {
            fftFreq[k] = (k * sampleFreq) / fftLen;
        }
    }
}
Shown below are the results I get out of the code listed above: whilst I do get a magnitude out of the algorithm (at the correct index 12), it does not correspond to the frequency or the amplitude of the input array signalCombined[].
Does anyone have an idea why this is happening? Like so many of my errors it is probably something really trivial, but I cannot figure out for the life of me why.
EDIT: thanks to SleuthEye's help, finding the frequencies is now possible, as the initial approach for generating the sin() signal was done incorrectly.
Some new issues popped up, as the FFT only appears to yield the correct frequencies for 32 samples, despite the bin size scaling accordingly to accommodate the adjusted sample size.
I am also unable to implement the amplitude-fixing algorithm: as per SleuthEye's link with the example code 2*(1/N)*abs(X(k))^2, I made my own implementation 2 * powf(fabs(fftMagnitude[j]), 2) / fftLen as shown in the code below, but it does not yield results that are even close to correct.
while (1)
{
    for (int i = 0; i < fftLen; i++)
    {
        signalCombined[i] = 400 * arm_sin_f32(2 * PI * 450 * i / sampleFreq); // Sin Alpha, 400 amp at 450 Hz
        // 700 * arm_sin_f32(2 * PI * 33000 * i / sampleFreq) + // Sin Bravo, 700 amp at 33 kHz
        // 300 * arm_sin_f32(2 * PI * 50000 * i / sampleFreq); // Sin Charlie, 300 amp at 50 kHz
    }
    arm_rfft_fast_f32(&inst, signalCombined, fftCombined, 0); // calculate the Fourier transform of the time-domain signal
    arm_cmplx_mag_f32(fftCombined, fftMagnitude, half_fftLen); // calculate the magnitude of the Fourier transform
    fftMagnitude[0] = fftCombined[0];
    fftMagnitude[half_fftLen] = fftCombined[1];
    for (int j = 0; j < sizeof(fftMagnitude); j++)
    {
        fftMagnitude[j] = 2 * powf(fabs(fftMagnitude[j]), 2) / fftLen; // algorithm to fix the amplitude of each unique frequency
    }
    arm_max_f32(fftMagnitude, half_fftLen, &maxAmp, &maxAmpInd); // We need the 3 max values
    for (int k = 0; k < fftLen; k++)
    {
        fftFreq[k] = (k * sampleFreq) / fftLen;
    }
}
Your tone generation does not take into account the sampling frequency of 1600Hz, so you are effectively generating a tone at a frequency of 450*1600/(2*PI) ~ 114591Hz which gets aliased to ~608Hz. That 608Hz frequency roughly corresponds to a frequency index around 12 when using an FFT size of 32.
The generation of a 450Hz tone at a 1600Hz sampling frequency should be done as follows:
for (int i = 0; i < fftLen; i++)
{
    signalCombined[i] = 40 * arm_sin_f32(2 * PI * 450 * i / sampleFreq);
}
As far as matching the amplitude, keep in mind that there is a scaling factor of approximately 0.5*fftLen between the time domain and the frequency domain (see this other post of mine).
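For illustration, a minimal sketch of undoing that scaling, assuming fftMagnitude holds the half-spectrum magnitudes produced by arm_cmplx_mag_f32 and maxAmpInd is the peak index returned by arm_max_f32 in the question's code:

// For a pure tone away from DC and Nyquist, the time-domain amplitude is
// approximately 2 * |X(k)| / fftLen, i.e. the 0.5*fftLen factor undone.
float32_t amplitude = 2.0f * fftMagnitude[maxAmpInd] / (float32_t)fftLen;

With the bin-centered 450 Hz tone above (50 Hz bins, so bin 9), this should come out near the generated amplitude of 40 rather than the raw magnitude value.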

How to know if a photo is black or too dark?

I have a UIImagePickerController where the user takes a photo. My problem is how to know, before uploading the photo to the server, whether the user is sending a dark photo. I mean one that is totally or nearly black.
While researching I found this:
const UInt8 *pixels = CFDataGetBytePtr(imageData);
UInt8 blackThreshold = 10; // or some value close to 0
int bytesPerPixel = 4;
for(int x = 0; x < width1; x++) {
    for(int y = 0; y < height1; y++) {
        int pixelStartIndex = (x + (y * width1)) * bytesPerPixel;
        UInt8 alphaVal = pixels[pixelStartIndex]; // can probably ignore this value
        UInt8 redVal = pixels[pixelStartIndex + 1];
        UInt8 greenVal = pixels[pixelStartIndex + 2];
        UInt8 blueVal = pixels[pixelStartIndex + 3];
        if(redVal < blackThreshold && blueVal < blackThreshold && greenVal < blackThreshold) {
            // This pixel is close to black...do something with it
        }
    }
}
However, I don't know how to apply the algorithm.
Yep, that's a fairly simple way of doing it. You could, for example, iterate through and see what percentage of the pixels are pure black (i.e. clipped shadows) or nearly black. Or you could average the pixel colors throughout the whole image and see if the average falls below a certain threshold. There are lots of approaches, and these two might be a tad simplistic, but I'm not sure this calls for anything particularly sophisticated. What threshold you want to use is up to you.
Also, while it has little practical impact, if I were going to be picky about the algorithm, I might only perform the "brightness" logic if alphaVal was over a certain threshold as well, as the color information is meaningless at transparent portions of the image. Having said that, real photos rarely have any transparency, so this may be a non-issue.
FYI, here is Apple's code for retrieving the pixel buffer. It's an oldie, but a goodie. (If I recall correctly, the only hassle is that the kCGImageAlphaPremultipliedFirst reference in CreateARGBBitmapContext must be cast with (CGBitmapInfo).)
By the way, if you're trying to determine the luminance of a particular pixel, one common algorithm is:
luminance = 0.2126 * red + 0.7152 * green + 0.0722 * blue
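Putting those pieces together, here is a minimal sketch of the averaging approach over the buffer from the question (width1, height1, and reusing blackThreshold as a mean-luminance cutoff are assumptions; the ARGB byte order follows the snippet above):

double totalLuminance = 0.0;
for (int y = 0; y < height1; y++) {
    for (int x = 0; x < width1; x++) {
        int idx = (x + y * width1) * bytesPerPixel;
        // Rec. 709 luma weights applied to the R, G, B bytes
        totalLuminance += 0.2126 * pixels[idx + 1]
                        + 0.7152 * pixels[idx + 2]
                        + 0.0722 * pixels[idx + 3];
    }
}
double meanLuminance = totalLuminance / (width1 * height1);
bool isTooDark = (meanLuminance < blackThreshold); // the cutoff is your call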

OpenCV, Haar cascade classifier: scaling feature or computing image pyramid?

I read the paper by Viola and Jones.
They state clearly in the paper that their algorithm is faster than others because the calculation of an image pyramid is avoided by scaling the feature rectangles.
But I googled around for a long time, only to find that OpenCV implements the image pyramid method instead of scaling the feature rectangles, and an integral image is computed for every sub-image in the pyramid. This is done for every frame if the algorithm is used to process video instead of still images.
What's the rationale for this choice? I don't quite get it.
All I can understand is the complete opposite: for video applications, scaling the features only needs to be done once, and the scaled features can be reused across all frames; only the integral image of the whole image needs to be computed.
Am I correct on this?
Viola and Jones also reported a 15 fps frame rate on a Pentium 3 computer, but I hardly see anybody achieving that performance with the OpenCV implementation on a modern computer. That's strange, isn't it?
Any input will be helpful. Thank you.
I have tried to verify this by looking into the code. This is based on version 2.4.10. The short answer is: both. OpenCV can scale the image down according to the scale factor at which detection is performed, and it can also rescale the features for different window sizes according to the scale factor. Justification is below:
1. Looking at the older function cvHaarDetectObjectsForROC from the objdetect module (haar.cpp). Notable arguments are CvSize minSize, CvSize maxSize, const CvArr* _img, double scaleFactor, and int minNeighbors.
CvSeq*
cvHaarDetectObjectsForROC( const CvArr* _img,
                           CvHaarClassifierCascade* cascade, CvMemStorage* storage,
                           std::vector<int>& rejectLevels, std::vector<double>& levelWeights,
                           double scaleFactor, int minNeighbors, int flags,
                           CvSize minSize, CvSize maxSize, bool outputRejectLevels )
{
    CvMat stub, *img = (CvMat*)_img;
    ... // skip a bit ahead to this part
    if( flags & CV_HAAR_SCALE_IMAGE )
    {
        CvSize winSize0 = cascade->orig_window_size; // this would be the trained size of 24x24 pixels mentioned in the paper
        for( factor = 1; ; factor *= scaleFactor )
        {
            // detection window for the current scale
            CvSize winSize = { cvRound(winSize0.width*factor), cvRound(winSize0.height*factor) };
            // resized image size
            CvSize sz = { cvRound( img->cols/factor ), cvRound( img->rows/factor ) };
            // take every possible scale factor as long as the resulting window doesn't exceed the maximum size given and is bigger than the minimum one
            if( winSize.width > maxSize.width || winSize.height > maxSize.height )
                break;
            if( winSize.width < minSize.width || winSize.height < minSize.height )
                continue;
            img1 = cvMat( sz.height, sz.width, CV_8UC1, imgSmall->data.ptr );
            ... // skip sum, squared sum, and tilted sum (a.k.a. integral image) array initialization
            cvResize( img, &img1, CV_INTER_LINEAR ); // scale down the image here
            cvIntegral( &img1, &sum1, &sqsum1, _tilted ); // compute the integral representation for the scaled-down version
            ... // skip some lines
            cvSetImagesForHaarClassifierCascade( cascade, &sum1, &sqsum1, _tilted, 1. ); // -> sets the structures and also rescales the features according to the last parameter, which is the scale factor.
            // Notice it is 1.0 because the image was scaled down this time.
            <call detection function with notable arguments: cascade, ... factor, cv::Mat(&sum1), cv::Mat(&sqsum1) ...>
            // the above call is a parallel for that evaluates a window at a certain position in the image with the cascade classifier
            // note the class name HaarDetectObjects_ScaleImage_Invoker in the actual code, skipped here.
        } // end for
    } // if
    else
    {
        int n_factors = 0; // total number of factors
        cvIntegral( img, sum, sqsum, tilted ); // -> makes a single integral image for the given image (the original one passed to cvHaarDetectObjects)
        // the loop below computes the total number of scale factors at which detection is performed.
        for( n_factors = 0, factor = 1;
             factor*cascade->orig_window_size.width < img->cols - 10 &&
             factor*cascade->orig_window_size.height < img->rows - 10;
             n_factors++, factor *= scaleFactor );
        ... // skip some lines
        for( ; n_factors-- > 0; factor *= scaleFactor )
        {
            CvSize winSize = { cvRound( cascade->orig_window_size.width * factor ), cvRound( cascade->orig_window_size.height * factor ) };
            ... // skip check for minSize and maxSize here
            cvSetImagesForHaarClassifierCascade( cascade, sum, sqsum, tilted, factor ); // -> notice here the scale factor is given so that the trained Haar features can be rescaled.
            <parallel for detect call given a startX, endX and startY, endY, window size and cascade> // note the class name HaarDetectObjects_ScaleCascade_Invoker used in the actual code, skipped here
        }
    } // end of else
    ... // skip rest
} // end of cvHaarDetectObjectsForROC function
In the new C++ API, the CascadeClassifier class, if it loads the new .xml cascade format produced by the traincascade.exe application, will scale the image according to the scale factor (for Haar features it scales upward, from what I know). The detectMultiScale method of the class delegates to the detectSingleScale method at some point in the code:
if( !detectSingleScale( scaledImage, stripCount, processingRectSize, stripSize, yStep, factor,
                        candidates, rejectLevels, levelWeights, outputRejectLevels ) )
    break; // from cascadedetect.cpp in the detectMultiScale method
A possible reason I can think of: in order to have a unified design in C++, this is the only method that achieves transparency behind a single interface for different types of features.
I have left the trail of thought in place; in case I have misunderstood or omitted something, another user can correct me by verifying this trail.
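For completeness, a minimal usage sketch of the C++ API discussed above (the cascade file name and image path are placeholders, and scaleFactor = 1.1 is just the parameter the walkthrough traces, not a recommendation):

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>

int main()
{
    cv::CascadeClassifier cascade;
    if( !cascade.load("haarcascade_frontalface_alt.xml") ) // hypothetical path
        return 1;
    cv::Mat gray = cv::imread("input.jpg", 0); // load as grayscale
    std::vector<cv::Rect> objects;
    // scaleFactor = 1.1, minNeighbors = 3, no flags, 30x30 minimum window
    cascade.detectMultiScale( gray, objects, 1.1, 3, 0, cv::Size(30, 30) );
    return 0;
}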

GPUImage - How to specify filter size for GPUImageMedianFilter and GPUImageGaussianBlurFilter

Hi GPUImage community and Brad,
I would like to specify the filter size (radius) of the GPUImageMedianFilter and the GPUImageGaussianBlurFilter.
Does that require specifying GPU commands, or can it be done through the GPUImage wrapper? If so, how can I do that?
Thanks
This is probably not the place to ask a specific question about this framework, but I can answer you on this.
The GPUImageMedianFilter is a hardcoded 3x3 median filter based on the article "A Fast, Small-Radius GPU Median Filter" by Morgan McGuire in the ShaderX6 book. More on this can be found here, including larger-radius versions of this. Despite being the fastest implementation of this that I have found, it is still incredibly slow to run on all but the fastest iOS devices, so increasing the sampling area will only slow this down further.
The GPUImageGaussianBlurFilter does a 9-hit simple Gaussian blur in two separated passes. The blurSize property allows you to expand or contract the sampling area slightly, but if you go beyond a multiplier of 1.5, you'll start seeing fringe artifacts due to too few samples being used to blur over a large area. I'm working on a couple of ways of expanding the blur area in a performant manner, but that is the limitation of this particular filter.
Here's how to calculate the median within the pixel-neighborhood radius of your choosing:
kernel vec4 medianUnsharpKernel(sampler u) {
    vec4 pixel = unpremultiply(sample(u, samplerCoord(u)));
    vec2 xy = destCoord();
    int radius = 3;
    int bounds = (radius - 1) / 2;
    vec4 sum = vec4(0.0);
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            sum += unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
        }
    }
    vec4 mean = vec4(sum / vec4(pow(float(radius), 2.0)));
    float mean_avg = float(mean);
    float comp_avg = 0.0;
    vec4 comp = vec4(0.0);
    vec4 median = mean;
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            comp = unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
            comp_avg = float(comp);
            median = (comp_avg < mean_avg) ? max(median, comp) : median;
        }
    }
    return premultiply(vec4(vec3(abs(pixel.rgb - median.rgb)), 1.0));
}
A brief description of the steps
1. Calculate the mean of the values of the pixels surrounding the source pixel in a 3x3 neighborhood;
2. Find the maximum pixel value of all pixels in the same neighborhood that are less than the mean.
3. [OPTIONAL] Subtract the median pixel value from the source pixel value for edge detection.
If you're using the median value for edge detection, there are a couple of ways to modify the above code for better results, namely hybrid median filtering and truncated median filtering (a substitute for, and better approximation of, 'mode' filtering). If you're interested, please ask.

Average blurring mask produces different results

I am multiplying each pixel by the average blurring mask (1/9), but the result is totally different from what I expect.
PImage toAverageBlur(PImage a)
{
    PImage aBlur = new PImage(a.width, a.height);
    aBlur.loadPixels();
    for(int i = 0; i < a.width; i++)
    {
        for(int j = 0; j < a.height; j++)
        {
            int pixelPosition = i*a.width + j;
            int aPixel = ((a.pixels[pixelPosition] / 9));
            aBlur.pixels[pixelPosition] = color(aPixel);
        }
    }
    aBlur.updatePixels();
    return aBlur;
}
Currently you are not applying an average filter; you are only scaling the image by a factor of 1/9, which makes it darker. Your terminology is good: you are trying to apply a 3x3 moving average (or neighbourhood average), also known as a boxcar filter.
For each pixel (i,j), you need to take the sum of (i-1,j-1), (i-1,j), (i-1,j+1), (i,j-1), (i,j), (i,j+1), (i+1,j-1), (i+1,j), (i+1,j+1), then divide by 9 (for a 3x3 average). For this to work, you should not consider the pixels on the image edge, which do not have 9 neighbours (so you start at pixel (1,1), for example). The output image will be one pixel smaller on each side. Alternatively, you can mirror values out to add an extra line to your input image, which will make the output image the same size as the original. A sketch of this is shown below.
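For illustration, a minimal C++ sketch of that 3x3 boxcar over a grayscale buffer (the buffer layout and function name are assumptions, not the Processing API from the question; edge pixels are simply left unchanged):

#include <vector>
#include <cstdint>

std::vector<uint8_t> boxBlur3x3(const std::vector<uint8_t>& src, int width, int height)
{
    std::vector<uint8_t> dst(src); // edge pixels keep their original values
    for (int y = 1; y < height - 1; y++)
    {
        for (int x = 1; x < width - 1; x++)
        {
            int sum = 0;
            // sum the 3x3 neighbourhood centred on (x, y)
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += src[(y + dy) * width + (x + dx)];
            dst[y * width + x] = (uint8_t)(sum / 9); // divide by 9 for the average
        }
    }
    return dst;
}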
There are more efficient ways of doing this, for example using FFT-based convolution; such methods are faster because they avoid explicitly looping over the neighbourhood for every pixel.
