I am trying to forward Fast Fourier Transform an image and then backward Fast Fourier Transform it. I am using the library from http://www.fftw.org/. The thing is that I have stored the RGB values in a one-dimensional array, interleaved in that order (R, G, B, R, G, B, ...). The way I think would work is to allocate a separate array for each of the colors and do a separate FFT for each array, like this:
fftw_plan_dft_2d(imageWidth, imageHeight, pixelColor, fft_result, FFTW_FORWARD, FFTW_ESTIMATE)
I don't know much about the FFT, but to me this doesn't seem like an ideal way to do it. Can someone tell me whether there is a better way to FFT all the pixel colors of an image with the library from fftw.org?
Thanks.
Not sure what an FFT of a colour image would mean.
Presumably you either want to look at the structure in each colour separately, or, more commonly, just make a greyscale (intensity) image and work on that.
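If you do want to transform the colour information itself, one 2-D transform per channel is a reasonable way to do it. For real-valued image data the real-to-complex interface (fftw_plan_dft_r2c_2d) is usually preferable to fftw_plan_dft_2d. A minimal sketch, assuming the interleaved RGB buffer is unsigned char and imageWidth/imageHeight are as in the question:

    #include <fftw3.h>

    // Forward-transform one colour channel of an interleaved RGB image.
    // channel: 0 = R, 1 = G, 2 = B. Caller frees the result with fftw_free().
    fftw_complex* transformChannel(const unsigned char* rgb,
                                   int imageWidth, int imageHeight, int channel)
    {
        const int n = imageWidth * imageHeight;

        // De-interleave the requested channel into a real-valued buffer.
        double* in = (double*) fftw_malloc(sizeof(double) * n);
        for (int i = 0; i < n; ++i)
            in[i] = (double) rgb[3 * i + channel];

        // Real-to-complex output only needs height * (width/2 + 1) bins.
        fftw_complex* out = (fftw_complex*)
            fftw_malloc(sizeof(fftw_complex) * imageHeight * (imageWidth / 2 + 1));

        // FFTW takes row-major dimensions: n0 = rows, n1 = columns.
        fftw_plan plan = fftw_plan_dft_r2c_2d(imageHeight, imageWidth,
                                              in, out, FFTW_ESTIMATE);
        fftw_execute(plan);

        fftw_destroy_plan(plan);
        fftw_free(in);
        return out;
    }

The backward direction works the same way with fftw_plan_dft_c2r_2d; note that FFTW's inverse transform is unnormalised, so divide the result by imageWidth * imageHeight to recover the original pixel values.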
My goal is to calculate the blurriness of an image in this way: first convert the image to grayscale, then convolve it with a Laplacian kernel, and finally take the variance of the result.
I currently chose GPUImage for this job because it has a grayscale filter and a convolution filter, and it's easy to implement and use. I chose it over options like CoreImage and Accelerate because they don't explicitly have a grayscale filter (actually I would prefer Accelerate, since it seems to be the fastest of the three, but I don't know how to make it happen), and over OpenCV because it seems a pain to get C++ code working with Swift, and it's the slowest of all the frameworks above.
But now I've found that I can't see a way to calculate the variance with GPUImage.
Does anyone know how to do it?
And considering my end goal, what framework is the best choice for me?
If anyone knows how to achieve this task using Accelerate, that would be awesome too!
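For the variance step specifically, Accelerate's vDSP routines handle it nicely once the convolved grayscale pixels are available as a flat float buffer (getting that buffer out of GPUImage, e.g. by reading back the filter output, is a separate step). A minimal sketch, assuming such a buffer and Apple's Accelerate framework:

    #include <Accelerate/Accelerate.h>   // Apple platforms only

    // Variance of a flat float buffer via vDSP: var(x) = E[x^2] - E[x]^2.
    float bufferVariance(const float* pixels, vDSP_Length count)
    {
        float mean = 0.0f, meanOfSquares = 0.0f;
        vDSP_meanv(pixels, 1, &mean, count);           // E[x]
        vDSP_measqv(pixels, 1, &meanOfSquares, count); // E[x^2]
        return meanOfSquares - mean * mean;
    }

A low variance of the Laplacian response means few strong edges, i.e. a blurrier image, so thresholding this value gives the blurriness decision.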
I start by creating an initial mask of an object in an image. Using this mask, a histogram is created, which is then used to process subsequent images.
I use the calcBackProject function to find pixels in the image that belong to the histogram. The problem I am having is that too much of the image is being accepted, because certain other objects are similar in colour to the initial object. Is there any alternative to calcBackProject? In my application I can't afford to pick up objects that do not belong. All of this assumes that I have a perfect initial mask.
There are many ways to track an object, and it can be very difficult. Within OpenCV you may want to try the meanshift/camshift trackers to see if these do any better (see the sketch after the links below). If not, then you may have to stray out of the OpenCV world and try a tracking-learning-detection framework.
Meanshift/Camshift/etc in OpenCV
http://docs.opencv.org/modules/video/doc/video.html
http://docs.opencv.org/trunk/doc/py_tutorials/py_video/py_meanshift/py_meanshift.html
Tracking-Learning-Detection in C++:
STRUCK: http://www.samhare.net/research/struck (uses OpenCV)
Tracking-Learning-Detection in Matlab:
Predator: http://personal.ee.surrey.ac.uk/Personal/Z.Kalal/tld.html
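As a rough illustration of the meanshift/camshift route (the names frameBGR, hueHist and window below are placeholders, not from the question): build the hue histogram from your initial mask once, then back-project each new frame and let CamShift move and resize the search window, rather than treating the raw back-projection as a segmentation by itself.

    #include <opencv2/opencv.hpp>

    // One tracking step: back-project the stored hue histogram and
    // let CamShift relocate the search window.
    cv::RotatedRect trackStep(const cv::Mat& frameBGR,
                              const cv::Mat& hueHist,   // built once from the initial mask
                              cv::Rect& window)         // updated in place
    {
        cv::Mat hsv, backproj;
        cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);

        const int channels[] = {0};                     // hue channel
        float hueRange[] = {0, 180};
        const float* ranges[] = {hueRange};
        cv::calcBackProject(&hsv, 1, channels, hueHist, backproj, ranges);

        // CamShift shifts and resizes the window towards the density peak.
        return cv::CamShift(backproj, window,
                            cv::TermCriteria(cv::TermCriteria::EPS | cv::TermCriteria::COUNT,
                                             10, 1));
    }

Because the window only moves locally from frame to frame, similarly coloured objects elsewhere in the image are far less likely to be picked up than with a global back-projection alone.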
I'm looking for a way to convert raster images to vector data using OpenCV. I found the function cv::findContours(), which seems a bit primitive (or, more probably, I did not understand it fully):
It seems to work on b/w images only (no greyscale and no coloured images) and does not seem to accept any filtering/error-suppression parameters that could be helpful with noisy images, to avoid very short vector lines or to avoid uneven polylines where one single, straight line would be the better result.
So my question: is there an OpenCV way to vectorise coloured raster images, where the colour information is assigned to the resulting polylines afterwards? And how can I apply noise reduction and error suppression to such an algorithm?
Thanks!
If you want to vectorise the raster image by colour, then I recommend clustering the image into a small set of colours (i.e. quantising it), and after that extracting the contours of each colour and converting them to the format you need. There are no ready-made vectorising methods in OpenCV.
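A rough sketch of that pipeline (the parameter values are arbitrary placeholders, not recommendations): quantise with k-means, take the mask of each colour cluster, extract its contours, and simplify them with approxPolyDP, which is also where you get some control over noise and over collapsing nearly-straight jagged runs into single lines.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Quantise an image to K colours, then vectorise each colour region.
    void vectoriseByColour(const cv::Mat& bgr, int K, double simplifyEps,
                           std::vector<std::vector<cv::Point> >& polylines,
                           std::vector<cv::Vec3f>& polylineColours)
    {
        // k-means wants one row per pixel, float type.
        cv::Mat samples = bgr.reshape(1, bgr.rows * bgr.cols);
        samples.convertTo(samples, CV_32F);

        cv::Mat labels, centers;
        cv::kmeans(samples, K, labels,
                   cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 1.0),
                   3, cv::KMEANS_PP_CENTERS, centers);

        for (int k = 0; k < K; ++k) {
            // Binary mask of all pixels assigned to cluster k.
            cv::Mat mask = (labels == k);
            mask = mask.reshape(1, bgr.rows);

            std::vector<std::vector<cv::Point> > contours;
            // Note: older OpenCV versions modify the input of findContours;
            // that is fine here because the mask is no longer needed.
            cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

            cv::Vec3f colour(centers.at<float>(k, 0),
                             centers.at<float>(k, 1),
                             centers.at<float>(k, 2));
            for (size_t i = 0; i < contours.size(); ++i) {
                if (cv::contourArea(contours[i]) < 10.0) continue;      // crude noise suppression
                std::vector<cv::Point> poly;
                cv::approxPolyDP(contours[i], poly, simplifyEps, true); // straighten jagged runs
                polylines.push_back(poly);
                polylineColours.push_back(colour);                      // colour of this region
            }
        }
    }

Increasing simplifyEps removes more of the short, uneven segments; a small blur or a morphological open on each mask before findContours also helps with noisy input.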
I have an 8-bit grayscale image and I apply a transformation (a modified census transform) to it. After the transformation I need to represent each pixel of the image with 9 bits. I store my 9-bit data in a uint16, and when I want to display the image I use two different methods. I'm not sure which one is the right way to do it, or whether there are better approaches.
1- Take the most significant 8 bits of the 9-bit value and represent the image as 8-bit.
2- Divide each pixel value by 2 and represent the image as 8-bit.
Either way there is a loss of information. Could anyone suggest a better way to do this?
Thank you
Why don't you just normalize them, normalizedValue = currentValue / maximumValue, and then display the normalized image?
The number of intensity levels you can display depends very much on your hardware; even if you somehow manage to represent extra grey levels, you won't be able to differentiate among them. Also note that the two methods you proposed are essentially the same: keeping the most significant 8 of 9 bits is a right shift by one, which is exactly an integer division by 2.
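A minimal sketch of the normalisation suggested above, in plain C++. Scaling against the theoretical maximum of 511 is essentially the same as the two methods in the question; the practical benefit comes from scaling against the actual maximum in the image, which stretches the usable contrast when the data does not span the full 9-bit range:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Map 9-bit census values onto 0..255 for display, normalising against
    // the largest value actually present in the image. Squeezing 9 bits into
    // 8 always loses some information; this just makes the best use of the
    // available 256 display levels.
    std::vector<uint8_t> to8Bit(const std::vector<uint16_t>& pixels9)
    {
        if (pixels9.empty()) return std::vector<uint8_t>();

        uint16_t maxVal = *std::max_element(pixels9.begin(), pixels9.end());
        if (maxVal == 0) maxVal = 1;                       // avoid division by zero

        std::vector<uint8_t> out(pixels9.size());
        for (size_t i = 0; i < pixels9.size(); ++i)
            out[i] = static_cast<uint8_t>(pixels9[i] * 255u / maxVal);
        return out;
    }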
We as humans can recognize these two images as the same image:
For a computer it would be easy to recognize these two images if they were the same size, so we have to add a preprocessing step before recognition, such as scaling; but if we look closely at the scaling process, we can see that it is not an efficient way.
Now, could you help me find some way to convert images into objects that do not depend on size or pixel location, to be used as input for a recognition method?
Thanks in advance.
I have several ideas:
1. Let the image have several colour thresholds. This way you get large areas of the same colour. The shapes of those areas can be traced with curves, which are maths. If you do this for both the larger and the smaller image, you can see whether the curves match.
2. Try to define key spots in the area. I don't know for sure how this works, but you can look up face detection algorithms. In such an algorithm there is a mathematical description of how a face should look. If you define enough objects in such an algorithm, you can define multiple objects in the images and see whether the objects match at the same spots. (A keypoint-based sketch follows after this list.)
3. You could see whether the Predator algorithm can accept images of multiple sizes. If so, your problem is solved.
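For the "key spots" idea above, one concrete starting point (a suggestion, not part of the original answer) is local feature matching: detectors such as SIFT or ORB find keypoints and descriptors that can be matched between two images regardless of their absolute size. A minimal sketch with OpenCV 3 or later, assuming img1 and img2 are already loaded as grayscale cv::Mat images:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Count mutually-best ORB keypoint matches between two images.
    // ORB detects keypoints over an image pyramid, so the comparison is
    // largely insensitive to the two images having different sizes.
    int countMatches(const cv::Mat& img1, const cv::Mat& img2)
    {
        cv::Ptr<cv::ORB> orb = cv::ORB::create();

        std::vector<cv::KeyPoint> kp1, kp2;
        cv::Mat desc1, desc2;
        orb->detectAndCompute(img1, cv::noArray(), kp1, desc1);
        orb->detectAndCompute(img2, cv::noArray(), kp2, desc2);

        // Hamming distance suits ORB's binary descriptors; crossCheck keeps
        // only matches that are best in both directions, rejecting noise.
        cv::BFMatcher matcher(cv::NORM_HAMMING, true);
        std::vector<cv::DMatch> matches;
        matcher.match(desc1, desc2, matches);
        return static_cast<int>(matches.size());
    }

A high match count (relative to the number of detected keypoints) suggests the two images show the same object, whatever their sizes.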
It looks like you assume that the human brain recognizes images in a computationally efficient way, which is not really true: the algorithm is so complicated that we have not yet figured it out, and a large part of your brain is dedicated to processing visual data.
When it comes to software, there are some scale- (or affine-) invariant algorithms. One such algorithm is the LeNet-5 neural network.