GPUImage: synchronization of 2 branch chains - ios

I am working on an image stitching app, which takes input from camera, estimates image transformation and warped input image by the estimated transformation. As shown in the following figure, image from camera are input into 2 branch of chains. However, the processing of image warping is dependent on the result of transformation estimation. My question is how I can make branch 1 wait for the results of branch 2?

If you make your image warping filter a subclass of GPUImageTwoInputFilter, this synchronization is taken care of for you.
Target the GPUImageVideoCamera instance to your feature matching / transformation estimation filter and the image warping filter, then target your feature matching / transformation estimation filter to the image warping filter. This will cause your video input to come in via the first input image and the results of your feature matching and transformation estimation filter to be in the second image. GPUImageTwoInputFilter subclasses only process and output a frame once input frames have been provided to both their inputs.
This should give you the synchronization you want, and be pretty straightforward to set up.

I think, you can try to use, something like dispatch_semaphore_t
Look here.

Related

Detecting a pattern of dark/bright bands in an image

I'm trying to detect a pattern like this in some images
The actual image looks something like this
It could be scaled and/or rotated. Is there a way to do that efficiently without resorting to neural nets or some learning algorithm? Can some detection be done based on the value gradient for example (dark-bright-dark-bright-dark)?
input image is MxN (in your example M<N ):
take mean RGB image
mean Y to get 1xN vector
derive
abs
threshold
calculate the distance between peaks.
search for a location where the ratio between the distances is as expected (from what i see in your example ~ 1:7:1)
if a place found, validate the colors in the middle of the distance (from your example should be white-black-white)
You might be able to use Gabor Filters at varying orientations, and do standard threshold to identify objects.
If you know the frequency of the pattern you could try using a bandpass filter to isolate objects at that frequency. If it is a very strong frequency, you might be able to identify it in the image's Fourier transform.
Without much other knowledge about what you are looking for in your image, it will be very difficult to identify a specific repeating pattern.

Image Registration by Manual marking of corresponding points using OpenCV

I have a processed binary image of dimension 300x300. This processed image contains few object(person or vehicle).
I also have another RGB image of the same scene of dimensiion 640x480. It is taken from a different position
note : both cameras are not the same
I can detect objects to some extent in the first image using background subtraction. I want to detect corresponding objects in the 2nd image. I went through opencv functions
getAffineTransform
getPerspectiveTransform
findHomography
estimateRigidTransform
All these functions require corresponding points(coordinates) in two images
In the 1st binary image, I have only the information that an object is present,it does not have features exactly similar to second image(RGB).
I thought conventional feature matching to determine corresponding control points which could be used to estimate the transformation parameters is not feasible because I think I cannot determine and match features from binary and RGB image(am I right??).
If I am wrong, what features could I take, how should I proceed with Feature matching, find corresponding points, estimate the transformation parameters.
The solution which I tried more of Manual marking to estimate transformation parameters(please correct me if I am wrong)
Note : There is no movement of both cameras.
Manually marked rectangles around objects in processed image(binary)
Noted down the coordinates of the rectangles
Manually marked rectangles around objects in 2nd RGB image
Noted down the coordinates of the rectangles
Repeated above steps for different samples of 1st binary and 2nd RGB images
Now that I have some 20 corresponding points, I used them in the function as :
findHomography(src_pts, dst_pts, 0) ;
So once I detect an object in 1st image,
I drew a bounding box around it,
Transform the coordinates of the vertices using the above found transformation,
finally draw a box in 2nd RGB image with transformed coordinates as vertices.
But this doesnt mark the box in 2nd RGB image exactly over the person/object. Instead it is drawn somewhere else. Though I take several sample images of binary and RGB and use several corresponding points to estimate the transformation parameters, it seems that they are not accurate enough..
What are the meaning of CV_RANSAC and CV_LMEDS option, ransacReprojecThreshold and how to use them?
Is my approach good...what should I modify/do to make the registration accurate?
Any alternative approach to be used?
I'm fairly new to OpenCV myself, but my suggestions would be:
Seeing as you have the objects identified in the first image, I shouldn't think it would be hard to get keypoints and extract features? (or maybe you have this already?)
Identify features in the 2nd image
Match the features using OpenCV FlannBasedMatcher or similar
Highlight matching features in 2nd image or whatever you want to do.
I'd hope that because all your features in the first image should be positives (you know they are the features you want), then it'll be relatively straight forward to get accurate matches.
Like I said, I'm new to this so the ideas may need some elaboration.
It might be a little late to answer this and the asker might not see this, but if the 1st image is originally a grayscale then this could be done:
1.) 2nd image ----> grayscale ------> gray2ndimg
2.) Point to Point correspondences b/w gray1stimg and gray2ndimg by matching features.

How to remove distortion due to motion, from an image

I am trying to track motion of a toy car. I have recorded few videos and now trying to calculate rotation.
My problem is extracting features from object surface is quit challenging due to motion blur. Below image shows a cropped image from a video frame. The distortion happen in horizontal lines. The distortion seen in this image happens when object is moving. When the object is not moving there is no distortion.
Image shows distorted image of the car when its moving forward in a diagonal path cross the image frame.
I tried a wiener filter, based on median and variance but it didn't do much improvement. It only gave me a smoothed image as if Gaussian blur was applied on it.
What type of enhancements should I do to get a better image?
video - 720 x 576 frames - 25fps
from the picture provided it looks like you need to de-interlace the video rather than just trying to filter what's there; i remember doing this by just taking every other scan line and then doing a resize to put it back in perspective.
i found a pretty cool site that talks about deinterlacing in case you'd like to see if you might have other possibilities:
http://www.100fps.com/
(oh, and i have not inspected the image very closely so it's possible that there is some other interlacing scheme going on than just every other line; in which case my first answer wouldn't work properly. and it does imply that you will lose some resolution but that's just the nature of interlaced video...)
Given that your camera outputs interlaced video, you are better off using one field of the video. Either only use the even lines of the image or only the odd lines. The image will be squashed but you won't be mixing two images together.
Yep, that image needs to be de-interlaced. Correcting "distortion" due to linear movement is a different thing, you need to do a linear directional filtering depending on the speed of the vehicle, the distance to the camera and the obturation speed.
You have to first calculate the impulse response for a given set of conditions (those above, which represent the deviation or the distance between the same point taken at the beggining of the capture and the end of it), and then apply the inverse filtering. You may need to use some filtering or image processing toolkit, if using Matlab it's going to be easy.
Did you try:
deconvblind
Follow the example on deconvblind mathworks. It might work well on your example image.
Another example - Image Restoration
The following algorithm is a very simple de-interlaceing method:
cv::Mat input = cv::imread("img.jpg");
cv::Mat tmp(input.rows/2, input.cols*2, input.type(), input.data);
tmp = tmp.colRange(0, input.cols);
cv::Mat output;
cv::resize(tmp, output, Size(), 1, 2);

How to match texture similarity in images?

What are the ways in which to quantify the texture of a portion of an image? I'm trying to detect areas that are similar in texture in an image, sort of a measure of "how closely similar are they?"
So the question is what information about the image (edge, pixel value, gradient etc.) can be taken as containing its texture information.
Please note that this is not based on template matching.
Wikipedia didn't give much details on actually implementing any of the texture analyses.
Do you want to find two distinct areas in the image that looks the same (same texture) or match a texture in one image to another?
The second is harder due to different radiometry.
Here is a basic scheme of how to measure similarity of areas.
You write a function which as input gets an area in the image and calculates scalar value. Like average brightness. This scalar is called a feature
You write more such functions to obtain about 8 - 30 features. which form together a vector which encodes information about the area in the image
Calculate such vector to both areas that you want to compare
Define similarity function which takes two vectors and output how much they are alike.
You need to focus on steps 2 and 4.
Step 2.: Use the following features: std() of brightness, some kind of corner detector, entropy filter, histogram of edges orientation, histogram of FFT frequencies (x and y directions). Use color information if available.
Step 4. You can use cosine simmilarity, min-max or weighted cosine.
After you implement about 4-6 such features and a similarity function start to run tests. Look at the results and try to understand why or where it doesnt work. Then add a specific feature to cover that topic.
For example if you see that texture with big blobs is regarded as simmilar to texture with tiny blobs then add morphological filter calculated densitiy of objects with size > 20sq pixels.
Iterate the process of identifying problem-design specific feature about 5 times and you will start to get very good results.
I'd suggest to use wavelet analysis. Wavelets are localized in both time and frequency and give a better signal representation using multiresolution analysis than FT does.
Thre is a paper explaining a wavelete approach for texture description. There is also a comparison method.
You might need to slightly modify an algorithm to process images of arbitrary shape.
An interesting approach for this, is to use the Local Binary Patterns.
Here is an basic example and some explanations : http://hanzratech.in/2015/05/30/local-binary-patterns.html
See that method as one of the many different ways to get features from your pictures. It corresponds to the 2nd step of DanielHsH's method.

Why can fourier transform be used for image recognization while being sensitive to noises?

As we know Fourier Transform is sensitive to noises(like salt and peppers),
how can it still be used for image recognization?
Is there a FT expert here?
Update to actually answer the question you asked... :) Pre-process the image with a non-linear filter to suppress the salt & pepper noise. Median filter maybe?
Basic lesson on FFTs on matched filters follows...
The classic way of detecting a smaller image within a larger image is the matched filter. Essentially, this involves doing a cross correlation of the larger image with the smaller image (the thing you're trying to recognize).
For every position in the larger image
Overlay the smaller image on the larger image
Multiply all corresponding pixels
Sum the results
Put that sum in this position in the filtered image
The matched filter is optimal where the only noise in the larger image is white noise.
This IS computationally slow, but it can be decomposed into FFT (fast Fourier transform) operations, which are much more efficient. There are much more sophisticated approaches to image matching that tolerate other types of noise much better than the matched filter does. But few are as efficient as the matched filter implemented using FFTs.
Google "matched filter", "cross correlation" and "convolution filter" for more.
For example, here's one brief explanation that also points out the drawbacks of this very oldschool image matching approach: http://www.dspguide.com/ch24/6.htm
Not sure exactly what you're asking. If you are asking about how FFT can be used for image recognition, here are some thoughts.
FFT can be used to perform image "classification". It can't be used to recognize different faces or objects, but it can be used to classify the type of image. FFT calculates the spacial frequency content of the image. So for example, natural scene, face, city scene, etc. will have different FFTs. Therefore you can classify image or even within image (e.g. aerial photo to classify terrain).
Also, FFT is used in pre-processing for image recognition. It can be used for OCR (optical character recognition) to rotate the scanned image into correct orientation. FFT of typed text has a strong orientation. Same thing for parts inspection in industrial automation.
I don't think you'll find many methods in use that rely on Fourier Transforms for image recognition.
In the case of salt and pepper noise, it can be considered high frequency noise, and thus you could low pass filter your FFT before making a comparison with the target image.
I would imagine that it would work, but that different images that are somewhat similar (like both are photographs taken outside) would register as being the same image.

Resources