How to determine the movement using only two frames - image-processing

I'm learning about moving object detection using a sequence of frames.
This is an example of two frames. I need to select the moved object in the right frame.
I can subtract one frame from the other. In the selected area the result would be non-zero => there was movement in that area. But if you look at the right frame, you can see that part of the background gets selected as well.
Can I somehow separate the car from the background?
I guess the method where we collect the background pixels and then subtract the image from the background is useless with only two frames, right?

You are right that the method does not work very well with only two frames. The method you describe works best when you have one image with only background, which you can then use to compare with new images to look for movement.
It is possible to calculate the movement of the object with only two frames, but then you probably need more advanced methods, such as optical flow or image registration algorithms.
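As an illustration, here is a minimal OpenCV (C++) sketch of the optical-flow idea; it is not from the answer, and the file names and the 1.0 magnitude threshold are assumptions. It computes dense flow between the two frames and keeps only pixels that actually moved, which suppresses the background that plain subtraction picks up.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    // The two frames of the sequence (file names are placeholders).
    cv::Mat prev = cv::imread("frame1.png", cv::IMREAD_GRAYSCALE);
    cv::Mat next = cv::imread("frame2.png", cv::IMREAD_GRAYSCALE);

    // Dense optical flow: one 2D motion vector per pixel.
    cv::Mat flow;
    cv::calcOpticalFlowFarneback(prev, next, flow, 0.5, 3, 15, 3, 5, 1.2, 0);

    // Keep only pixels whose motion magnitude exceeds a threshold,
    // so the (almost) static background is discarded.
    cv::Mat channels[2], magnitude, angle, mask;
    cv::split(flow, channels);
    cv::cartToPolar(channels[0], channels[1], magnitude, angle);
    cv::threshold(magnitude, mask, 1.0, 255, cv::THRESH_BINARY);
    mask.convertTo(mask, CV_8U);

    // Clean the mask up and report the bounding box of each moving region.
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN, cv::Mat::ones(5, 5, CV_8U));
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    for (size_t i = 0; i < contours.size(); ++i) {
        std::cout << "moving region: " << cv::boundingRect(contours[i]) << std::endl;
    }
    return 0;
}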

Related

Interact with complex figure in iOS

I need to be able to interact with a representation of a cylinder that has many different parts in it. When the user taps one of the small rectangles, I need to display a popover related to that specific piece (form).
The next image demonstrates a realistic 3D approach. But, I repeat, I need to solve the problem; the 3D is NOT required (it would be really cool though). A representation that fulfills the functional needs will suffice.
The info about the parts needed to make the drawing comes from an API (size, position, etc.).
I don't really need it to be realistic. The simplest approximation would be to show the cylinder in a 2D representation, like a rectangle made out of interactable small rectangles.
So, as I mentioned, I think there are (as I see it) two opposite approaches: Realistic or Simplified.
Is there a way to achieve a nice solution in the middle? What libraries, components, or frameworks should I look into?
My research has led me to SceneKit, but I still don't know if I will be able to interact with it. Interaction is a very important part, as I need to display a popover when the user taps any small rectangle on the cylinder.
Thanks
You don't need any special frameworks to achieve an interaction like this. This effect can be achieved with standard UIKit and UIView and a little trigonometry. You can actually draw exactly your example image using 2D math and drawing. My answer is not an exact formula, but it involves thinking about how the shapes are defined and breaking the problem down into manageable steps.
A cylinder can be defined by two offset circles representing the end pieces, connected at their radii. I will use an orthographic projection, meaning the cylinder doesn't appear smaller as the depth extends into the background (but you could adapt this to perspective if needed). You could draw this with Core Graphics in a UIView's drawRect:.
A square slice represents an angle piece of the circle, offset by an amount smaller than the length of the cylinder, but in the same direction, as in the following diagram (sorry for imprecise drawing).
This square slice you are interested in is the area outlined in solid red, outside the radius of the first circle, and inside the radius of the imaginary second circle (which is just offset from the first circle by whatever length you want the slice).
To draw this area you simply need to draw a path of the outline of each arc and connect the endpoints.
To check if a touch is inside one of these square slices:
Check if the angle of the touch point, measured from the circle's center, lies between the slice's start and end angles.
Check if the touch point is outside the radius of the inside circle.
Check if the touch point is inside the radius of the outside circle. (Note what this means if the circles are more than a radius apart.)
To find a point to display the popover you could average the end points on the slice or find the middle angle between the two edges and offset by half the distance.
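A minimal Swift sketch of those checks, assuming the two end circles share a radius and are offset from each other as in the diagram (the type and property names are mine, not from the question):

import UIKit

struct CylinderSlice {
    let nearCenter: CGPoint   // center of the first (near) end circle
    let farCenter: CGPoint    // center of the offset, imaginary second circle
    let radius: CGFloat       // both circles share this radius
    let startAngle: CGFloat   // start of the slice's arc, in radians
    let endAngle: CGFloat     // end of the slice's arc, in radians

    func contains(_ point: CGPoint) -> Bool {
        // 1. The touch must be outside the near circle...
        guard distance(from: nearCenter, to: point) >= radius else { return false }
        // 2. ...and inside the offset circle...
        guard distance(from: farCenter, to: point) <= radius else { return false }
        // 3. ...and its angle around the near circle must fall inside the slice's arc.
        var angle = CGFloat(atan2(Double(point.y - nearCenter.y), Double(point.x - nearCenter.x)))
        if angle < 0 { angle += 2 * .pi }
        return angle >= startAngle && angle <= endAngle
    }

    // A point to anchor the popover: the middle angle, pushed out by the radius
    // from a point halfway between the two circle centers.
    var popoverAnchor: CGPoint {
        let midAngle = Double(startAngle + endAngle) / 2
        return CGPoint(x: (nearCenter.x + farCenter.x) / 2 + radius * CGFloat(cos(midAngle)),
                       y: (nearCenter.y + farCenter.y) / 2 + radius * CGFloat(sin(midAngle)))
    }

    private func distance(from a: CGPoint, to b: CGPoint) -> CGFloat {
        let dx = b.x - a.x, dy = b.y - a.y
        return (dx * dx + dy * dy).squareRoot()
    }
}

In touchesBegan: you would loop over your slices, call contains(_:) with the touch location, and present the popover from popoverAnchor of the first match.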
Theoretically, doing this in SceneKit with either SpriteKit or UIKit popovers is ideal.
However, SceneKit (and SpriteKit) seem to be in a state of flux in which nobody from Apple is communicating with users about the raft of issues folks are currently having with both. Going from a relatively stable and performant SpriteKit in iOS 8.4 to a lot of lost performance in iOS 9 seems to be a common experience. SceneKit simply doesn't seem finished, and the documentation and community are both nearly non-existent as a result.
That being said... the theory is this:
Material IDs are what's used in traditional 3D apps to define areas of an object that have different materials. Somehow these Material IDs are called "elements" in SceneKit. I haven't been able to find much more about this.
It should be possible to detect the "element" that's underneath a touch on an object and respond accordingly. You should even be able to change the state/nature of the material on that element to indicate it's the currently selected one.
When wanting a smooth, well rounded cylinder as per your example, start with a cylinder that's made of only enough segments to describe/define the material IDs you need for your "rectangular" sections to be touched.
Later you can add a smoothing operation to the cylinder to make it round, and all the extra smoothing geometry in each quadrant of unique material ID should be responsive, regardless of how you add this extra detail to smooth the presentation of the cylinder.
Idea for the "Simplified" version:
If this representation is okay, you can use a UICollectionView.
Each cell can have a defined size thanks to collectionView:layout:sizeForItemAtIndexPath:.
Each cell of the collection could then be a small rectangle representing a touchable part of the cylinder, and you can use collectionView:didSelectItemAtIndexPath: to get the touch.
This will help you to display the popover at the right place:
CGRect rect = [collectionView layoutAttributesForItemAtIndexPath:indexPath].frame;
Finally, you can choose the appropriate popover (if the app has to work on iPhone) here:
https://www.cocoacontrols.com/search?q=popover
Not perfect, but I think this is efficient!
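For instance, a minimal Swift sketch of this collection-view approach (the Part model, the cell identifier, and the sizes are placeholders, not from the question):

import UIKit

final class CylinderPartsViewController: UICollectionViewController,
                                         UICollectionViewDelegateFlowLayout {
    struct Part { let name: String; let size: CGSize }
    var parts: [Part] = []   // filled from the API (size, position, etc.)

    override func collectionView(_ collectionView: UICollectionView,
                                 numberOfItemsInSection section: Int) -> Int {
        return parts.count
    }

    override func collectionView(_ collectionView: UICollectionView,
                                 cellForItemAt indexPath: IndexPath) -> UICollectionViewCell {
        // Assumes a cell class was registered with this reuse identifier elsewhere.
        return collectionView.dequeueReusableCell(withReuseIdentifier: "PartCell", for: indexPath)
    }

    // Size each cell from the API data; this is the Swift spelling of
    // collectionView:layout:sizeForItemAtIndexPath:.
    func collectionView(_ collectionView: UICollectionView,
                        layout collectionViewLayout: UICollectionViewLayout,
                        sizeForItemAt indexPath: IndexPath) -> CGSize {
        return parts[indexPath.item].size
    }

    // Present the popover anchored on the tapped cell's frame.
    override func collectionView(_ collectionView: UICollectionView,
                                 didSelectItemAt indexPath: IndexPath) {
        guard let attributes = collectionView.layoutAttributesForItem(at: indexPath) else { return }
        let detail = UIViewController()   // placeholder for the real detail view
        detail.modalPresentationStyle = .popover
        detail.popoverPresentationController?.sourceView = collectionView
        detail.popoverPresentationController?.sourceRect = attributes.frame
        present(detail, animated: true)
    }
}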
Yes, SceneKit.
When the user performs a touch event, you know the 2D coordinate on screen, so the only decision to make is whether or not to show a popover; a 3D model doesn't even need to exist.
First, we can logically split the requirement into two pieces: determining which segment was touched, and showing the right "color" on each segment.
If I understand you correctly, the point of the 3D model is to determine which piece of data to show. In that case, SCNView's hit-test method will do most of the work for you. Perform a hit test, take the hit node and the hit's local 3D coordinate on that node, and you can then calculate which segment was hit by the touch and make the decision.
Now the only question left is how to draw the surface of the cylinder, right? There are various ways to do this: for example, programmatically paint each image you need and attach it to the cylinder's material, or have your image files on disk and use them as the material for the cylinder ...
I think the problem would be basically solved.
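A rough Swift sketch of that hit-test decision, assuming the node is an SCNCylinder whose long axis is Y and that it is split into equal bands along its height (the segment count, height, and tap-handler wiring are assumptions):

import UIKit
import SceneKit

class CylinderSceneViewController: UIViewController {
    @IBOutlet var scnView: SCNView!
    let segmentCount = 8
    let cylinderHeight: CGFloat = 10   // must match the SCNCylinder's height

    override func viewDidLoad() {
        super.viewDidLoad()
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        scnView.addGestureRecognizer(tap)
    }

    @objc func handleTap(_ gesture: UITapGestureRecognizer) {
        let point = gesture.location(in: scnView)
        guard let hit = scnView.hitTest(point, options: nil).first else { return }

        // localCoordinates are in the hit node's own space, so for an SCNCylinder
        // the Y value runs from -height/2 to +height/2 along its axis.
        let localY = CGFloat(hit.localCoordinates.y)
        let normalized = (localY + cylinderHeight / 2) / cylinderHeight   // 0...1
        let segment = max(0, min(segmentCount - 1, Int(normalized * CGFloat(segmentCount))))

        showPopover(forSegment: segment, from: point)
    }

    func showPopover(forSegment segment: Int, from point: CGPoint) {
        let detail = UIViewController()   // placeholder for the real detail view
        detail.modalPresentationStyle = .popover
        detail.popoverPresentationController?.sourceView = scnView
        detail.popoverPresentationController?.sourceRect = CGRect(origin: point, size: .zero)
        present(detail, animated: true)
    }
}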

Definitively apply CGAffineTransform* to a UIView

I'm having a problem with a scale transformation I have to apply to UIViews in Swift (but it's the same in Objective-C too).
I'm applying a CGAffineTransformMakeScale() to multiple views during a gesture recognizer's callback.
It's like a loop through a deck of cards. I remove the one on top, the X others behind it scale up, and a new one is added at the back.
The first iteration works as expected. But when I try to swipe the new front one, all the cards reset to their initial frame size because I'm trying to apply a new transform, which seems to cancel the previous one and reset the view to its initial state.
How can I definitively apply/commit the first transform change so that I'm able to apply a new one after it, based on the UIView's resulting new size?
I tried a UIView.commitAnimations() but no change.
EDIT :
Here's a simple example to understand what I try to do :
Imagine I have an initial UIView of 100x100
I have a shrink factor of 0.95, which means the next views behind it will be 95x95, then 90.25, then roughly 85.74, etc.
If I remove the top one (100x100), I want to scale up the others, so the 95x95 will become 100x100, etc
This is done by applying the inverse of the shrink factor, here 1.052631...
First time I apply the inverse factor, all views are correctly resized.
My problem is when I trigger, with a swipe on the new front UIView, another resize of all views (so, for example, the 90.25x90.25 view, which became 95x95, should now scale to 100x100).
At this moment, the same CGAffineTransformMakeScale() is applied to all views, which all instantly reset to their original frame size (so the now-95x95 view resets to 90.25x90.25, and the transformation is then applied to that old size).
As suggested here and elsewhere, using UIView.commitAnimations() at the end of each transformation doesn't change anything, and using CGAffineTransformConcat() keeps compounding the scaling on itself over and over, so of course the views become insanely big...
I hope I made myself more clear, that's not easy to explain, don't hesitate to ask if something is wrong here.
After a lot of reading and consulting colleagues who know iOS programming better than I do, here's my conclusion:
Applying a CGAffineTransformMakeScale() only modifies a view visually, not its underlying properties, and since it's difficult (and costly) to modify a view's bounds and/or frame afterward, I should avoid trying to apply a transform, update the bounds, apply another transform, and so on.
Applying the same CGAffineTransformMakeScale() again only resets the effect; it does not build on the previous one.
Applying a CGAffineTransformScale() with the same values on top of the previous CGAffineTransformMakeScale() (or using CGAffineTransformConcat()) has unpredictable effects, and it would be very difficult to calculate precisely the new values to apply each time to get the effect I want.
The best way for me to go is to apply a single CGAffineTransformMakeScale() whose scale values I keep updating over the view's whole life.
It implies now for me to rework all my implementation logic in reverse, but that's the easiest way to do this right.
Thanks all for your tips.
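To illustrate that conclusion, here is a small Swift sketch (the deck structure and animation duration are mine, not from the question): each swipe recomputes an absolute scale per card and sets it as the view's single transform, instead of concatenating onto or recreating the previous one.

import UIKit

final class CardDeck {
    private let shrinkFactor = 0.95
    private var cards: [UIView] = []   // ordered front (index 0) to back

    func removeTopCard() {
        guard !cards.isEmpty else { return }
        cards.removeFirst().removeFromSuperview()

        // Recompute the absolute scale for every remaining card and assign it
        // as the one and only transform. Nothing is concatenated, so repeated
        // swipes neither compound the scale nor reset the views.
        for (index, card) in cards.enumerated() {
            let scale = CGFloat(pow(shrinkFactor, Double(index)))
            UIView.animate(withDuration: 0.25) {
                card.transform = CGAffineTransform(scaleX: scale, y: scale)
            }
        }
    }
}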

OpenCV background subtraction: How to precompute background model?

I am working on a tracking algorithm and one of the earliest steps it does is background subtraction. The algorithm gets a series of frames that represent the video with a moving object and static background. The object is in every frame.
In my first version of this process I computed a median image from all the frames and got a very good approximation of the background scene. Then I subtracted the resulting image from every frame in the video sequence to get the foreground (moving objects).
The above method worked well, but then I tried to replace it by using OpenCV's background subtractors MOG and MOG2.
What I do not understand is how these two classes perform the "precomputation of the background model". As far as I understood from dozens of tutorials and documentation pages, these subtractors update the background model every time I use the apply() method and return a foreground mask.
But this means that the first result of the apply() method will be a blank mask, and the later images will have a ghost of the object's initial position in them (see the example below):
What am I missing? I googled a lot and seem to be the only one with this problem... Is there a way to run background precomputation that I am not aware of?
EDIT: I found a "trick" to do it: before using OpenCV's MOG or MOG2, I first compute a median background image and then use it in the first apply() call. The following apply() calls produce the foreground mask without the initial-position ghost.
But still, is this how it should be done or is there a better way?
If your moving objects are present right from the start, all updating background estimators will initially place them in the background. A solution is to initialize your MOG on all frames and then run MOG again with this initialization (as with your median estimate). Depending on the number of frames, you might want to adjust the update parameter of MOG (learningRate) to make sure it's fully initialized (if you have 100 frames it probably needs to be at least 0.01):
void BackgroundSubtractorMOG::operator()(InputArray image, OutputArray fgmask, double learningRate=0)
If your moving objects are not present right from the start, make sure that MOG is fully initialized when they appear by setting a high enough value for the update parameter learningRate.
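Sketch of that two-pass initialization using the non-legacy MOG2 interface (the learning-rate values and the assumption that all frames fit in memory are mine):

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Mat> subtractBackground(const std::vector<cv::Mat>& frames) {
    cv::Ptr<cv::BackgroundSubtractorMOG2> mog2 = cv::createBackgroundSubtractorMOG2();
    cv::Mat fgMask;

    // Pass 1: initialize the model over the whole sequence. A higher learning
    // rate (e.g. 0.01 for ~100 frames) makes the model converge in time.
    for (size_t i = 0; i < frames.size(); ++i) {
        mog2->apply(frames[i], fgMask, 0.01);
    }

    // Pass 2: learningRate = 0 freezes the model, so every frame is compared
    // against the fully initialized background and no ghost is learned.
    std::vector<cv::Mat> masks;
    for (size_t i = 0; i < frames.size(); ++i) {
        mog2->apply(frames[i], fgMask, 0.0);
        masks.push_back(fgMask.clone());
    }
    return masks;
}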

Time-delay effect GPUImage

I'm trying to achieve the "Ghost" effect from http://webcamtoy.com/ using GPUImage.
My understanding is that it would be a two-input filter, with a given time delay between the two frames used. I'd then just add the two frames with 0.5 alpha each.
I've seen how to use the current and previous frames with GPUImage using GPUImageBuffer (example of that in the GPUImageLowPassFilter) but I'm not sure how to set up a time delay between the two frames I want to use.
Any ideas or pointers? I was thinking of creating a custom filter and overriding newFrameReadyAtTime:atIndex: to delay the propagation downstream for the first x frames (where x is the delay in terms of number of frames). Maybe a clean way to do this would be to subclass GPUImageBuffer to automatically stack x frames before piping them out into a 2-input filter.
Thanks!
I think you're on the right track with keeping old frames. For the color effects, you're looking at something like extracting the color channels and using them as inputs to combine in a blend filter. The key is that the inputs' values have to add up to the natural color values in the non-changing portions of the video.
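I'm not sure of the exact GPUImage plumbing, so here is the frame-delay idea sketched with Core Image instead (a ring buffer of the last N frames, with the oldest blended into the current frame at 50% each); swapping the blend step for a GPUImage two-input blend filter should be mechanical:

import CoreImage

final class GhostEffect {
    private let delayInFrames: Int
    private var buffer: [CIImage] = []          // the last N frames, oldest first
    private let blend = CIFilter(name: "CIDissolveTransition")!

    init(delayInFrames: Int) {
        self.delayInFrames = delayInFrames
    }

    func process(_ current: CIImage) -> CIImage {
        buffer.append(current)
        // Pass frames through untouched until enough history has accumulated.
        guard buffer.count > delayInFrames else { return current }
        let delayed = buffer.removeFirst()

        // A dissolve at time 0.5 averages the two frames (0.5 alpha each).
        blend.setValue(delayed, forKey: kCIInputImageKey)
        blend.setValue(current, forKey: kCIInputTargetImageKey)
        blend.setValue(0.5, forKey: kCIInputTimeKey)
        return blend.outputImage ?? current
    }
}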

handling finger detection on small objects

The application I am working on requires a 4px-tall bar with a full-screen-size width. I need to be able to select this 4px bar and move it around. I also cannot change the size of this bar; it has to be 4px in height.
This wouldn't be that big of an issue if I weren't using OpenGL to create the object. OpenGL obviously does not have its own selection features, so I need to program my own.
Initially, after research, I built a color selector to identify the object. How my color selector works: whatever x and y my finger touch returns from touchesBegan: is the pixel I grab from a screenshot of the OpenGL view. The issue with this is that finger location is not precise at all. If I use the mouse it works perfectly...
I decided to maybe loop through a buffer zone around the selected x and y, but unfortunately the screenshot of the OpenGL view is antialiased when it's stored in memory, and the buffer returns several shades of my object's color. I could possibly do a comparative color lookup to see if it's in the range of colors, but that seems overly complicated given how much I have already had to do. Plus, cycling through the buffer zone isn't quick.
I have also thought about just remembering the location of my line on the screen, and if my finger is close to that location, knowing that that's the one I want to select and move around.
In the future this application could have up to 4 lines just like this, so I want something more robust than just knowing where the line is located in memory.
What better way is there out there of handling selection of small objects?
How about maintaining an array of frames for the four objects, but expanding the heights to something more manageable (8px or bigger)? Then, a touch within the larger region could be compared against the array (using CGRectContainsPoint). If you get a hit, then "snap to" the center point of the smaller (4px) rectangle before beginning the drag.
I do something like this by maintaining a list of "drop targets" for drag & drop, where it snaps to the drop target when it gets pretty close. Don't know if I'm conveying the idea very well, but it ought to work.
If the four 4px rectangles are going to be contiguous or very close together, you'll have to be able to make the selected one stand out or the user won't be able to tell which they're dragging -- but you could do that by making it bigger (maybe 6-8 px) then bringing it to the front so it overlays its adjacent neighbors.
More of an idea than an answer I guess.
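A quick Swift sketch of that idea, with illustrative names: each 4 pt bar is tracked as a CGRect and the hit test uses a taller, easier-to-touch rect before snapping to the real bar.

import UIKit

struct DraggableBar {
    var frame: CGRect               // the real, 4 pt tall bar
    var hitFrame: CGRect {          // an expanded region that's easy to touch
        return frame.insetBy(dx: 0, dy: -18)   // roughly a 40 pt tall touch target
    }
}

// Returns the index of the touched bar and a point snapped to its center line,
// or nil if the touch missed all of the expanded hit areas.
func bar(at point: CGPoint, in bars: [DraggableBar]) -> (index: Int, snapped: CGPoint)? {
    for (index, bar) in bars.enumerated() where bar.hitFrame.contains(point) {
        return (index, CGPoint(x: point.x, y: bar.frame.midY))
    }
    return nil
}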
John,
I would suggest a different approach. As you've discovered, touches in iOS are very imprecise. Apple usually suggests that the "hit box" for your controls be at least 40x40 points. I've gone as small as 30x30 points, but that starts to get hard.
What I would suggest you do is to factor your code so the app knows where the line is, and keeps track of it as a logical object. Then in your touch handler, interpret touches based on a large "buffer area" around the things you want the user to be able to move. If you just have a single horizontal bar, this should work great. Where you'll get into trouble is if you have multiple, thin horizontal bars that are close together. In that case you might need to rethink your app design and find another way to solve the problem.
As for the implementation details, you might add a pan gesture recognizer to your OpenGL view, and have it notify the OpenGL view of touch and drag actions. Then your OpenGL view can use knowledge of where your draggable objects are to decide how to interpret the touches.
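Roughly, that wiring could look like this in Swift (the bar model, buffer size, and redraw hook are assumptions, not part of the question):

import UIKit

final class GLOverlayController: NSObject {
    var bars: [CGRect] = []          // logical positions of the 4 pt bars
    var selectedIndex: Int?
    weak var glView: UIView?         // the OpenGL-backed view

    func attach(to view: UIView) {
        glView = view
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan(_:)))
        view.addGestureRecognizer(pan)
    }

    @objc func handlePan(_ gesture: UIPanGestureRecognizer) {
        guard let view = glView else { return }
        let location = gesture.location(in: view)

        switch gesture.state {
        case .began:
            // Pick the bar whose (generous) buffer area contains the touch.
            selectedIndex = bars.firstIndex { $0.insetBy(dx: 0, dy: -20).contains(location) }
        case .changed:
            if let index = selectedIndex {
                bars[index].origin.y = location.y - bars[index].height / 2
                // Ask the GL renderer to redraw with the updated bar position,
                // however your drawing code exposes that.
            }
        default:
            selectedIndex = nil
        }
    }
}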
