Determine the corners of a sheet of paper with iOS 5 AV Foundation and Core Image in real time - ios

I am currently building a camera app prototype that should recognize sheets of paper lying on a table. The catch is that it should do the recognition in real time, so I capture the camera's video stream, which in iOS 5 can easily be done with AV Foundation. I looked at here and here
They are doing some basic object recognition there.
I have found that using the OpenCV library in this real-time environment does not perform well enough.
So what I need is an algorithm to determine the edges of an image without OpenCV.
Does anyone have some sample code snippets that lay out how to do this, or can you point me in the right direction?
Any help would be appreciated.

You're not going to be able to do this with the current Core Image implementation in iOS, because corner detection requires some operations that Core Image doesn't yet support there. However, I've been developing an open source framework called GPUImage that does have the required capabilities.
For finding the corners of an object, you can use a GPU-accelerated implementation of the Harris corner detection algorithm that I just got working. You might need to tweak the thresholds, sensitivities, and input image size to work for your particular application, but it's able to return corners for pieces of paper that it finds in a scene:
It also finds other corners in that scene, so you may need to use a binary threshold operation or some later processing to identify which corners belong to a rectangular piece of paper and which to other objects.
I describe the process by which this works over at Signal Processing, if you're interested, but to use this in your application you just need to grab the latest version of GPUImage from GitHub and make the GPUImageHarrisCornerDetectionFilter the target for a GPUImageVideoCamera instance. You then just have to add a callback to handle the corner array that's returned to you from this filter.
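In Swift, that wiring might look roughly like this (treat it as a sketch: the initializer and property names can differ slightly between GPUImage versions, and the threshold/sensitivity values are only starting points):

    import UIKit
    import AVFoundation
    import GPUImage   // Brad Larson's GPUImage, bridged from Objective-C

    final class PaperCornerViewController: UIViewController {
        // Keep strong references so the camera and filter stay alive.
        private var videoCamera: GPUImageVideoCamera!
        private let cornerFilter = GPUImageHarrisCornerDetectionFilter()

        override func viewDidLoad() {
            super.viewDidLoad()

            videoCamera = GPUImageVideoCamera(sessionPreset: AVCaptureSession.Preset.vga640x480.rawValue,
                                              cameraPosition: .back)
            videoCamera.outputImageOrientation = .portrait

            // Starting values only; tune threshold and sensitivity for your scene.
            cornerFilter.threshold = 0.20
            cornerFilter.sensitivity = 5.0

            cornerFilter.cornersDetectedBlock = { cornerArray, cornersDetected, frameTime in
                guard let corners = cornerArray else { return }
                // cornerArray holds normalized (x, y) pairs, cornersDetected of them.
                for i in 0..<Int(cornersDetected) {
                    let x = corners[i * 2]
                    let y = corners[i * 2 + 1]
                    // Decide here which corners belong to the sheet of paper.
                    print("corner \(i): (\(x), \(y))")
                }
            }

            videoCamera.addTarget(cornerFilter)
            videoCamera.startCameraCapture()
        }
    }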
On an iPhone 4, the corner detection process itself runs at ~15-20 FPS on 640x480 video, but my current CPU-bound corner tabulation routine slows it down to ~10 FPS. I'm working on replacing that with a GPU-based routine which should be much faster. An iPhone 4S currently handles everything at 20-25 FPS, but again I should be able to significantly improve the speed there. Hopefully, that would qualify as being close enough to realtime for your application.

I use Brad's GPUImage library to do that; the result is not perfect, but it's good enough.
Among the detected Harris corners, my idea is to select (see the sketch after this list):
The most in the upper left for the top-left corner of the sheet
The most in the upper right for the top-right corner of the sheet
etc.
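A rough Swift sketch of that selection (a hypothetical helper, assuming image coordinates with the origin at the top-left and y pointing down):

    import CoreGraphics

    /// Picks the four extreme corners of a sheet from Harris corner candidates.
    func sheetCorners(from candidates: [CGPoint]) -> (topLeft: CGPoint, topRight: CGPoint,
                                                      bottomLeft: CGPoint, bottomRight: CGPoint)? {
        guard !candidates.isEmpty else { return nil }
        // Smallest x + y is "most upper-left", largest x - y is "most upper-right", and so on.
        let topLeft     = candidates.min { ($0.x + $0.y) < ($1.x + $1.y) }!
        let topRight    = candidates.max { ($0.x - $0.y) < ($1.x - $1.y) }!
        let bottomLeft  = candidates.min { ($0.x - $0.y) < ($1.x - $1.y) }!
        let bottomRight = candidates.max { ($0.x + $0.y) < ($1.x + $1.y) }!
        return (topLeft, topRight, bottomLeft, bottomRight)
    }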
#Mirco - Have you found a better solution?
#Brad - In your screenshot, what parameters do you use for the Harris filter to detect just 5 corners? I get a lot more than that...

Related

Augmented Reality – Lighting Real-World objects with Virtual light

Is it possible to import a virtual lamp object into the AR scene, that projects a light cone, which illuminates the surrounding space in the room and the real objects in it, e.g. a table, floor, walls?
For ARKit, I found this SO post.
For ARCore, there is an example of a relighting technique, and this source code.
It has also been suggested to me that post-processing can be used to brighten the whole scene.
However, these examples are from a while ago, and perhaps there is a newer or more straightforward solution to this problem?
At the low level, RealityKit is only responsible for rendering virtual objects and overlaying them on top of the camera frame.
If you want to illuminate the real scene, you need to post-process the camera frame.
Here are some tutorials on how to do post-processing:
Tutorial1⃣️
Tutorial2⃣️
If all you need is an effect like this, then all you need to do is add a CGImage-based post-processing effect for the virtual objects (the lights).
More specifically, add a bloom filter to the rendered image (you can also approximate a bloom filter with a Gaussian blur).
In this way, the code revolves entirely around UIImage and CGImage, so it's pretty simple 😎
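As a rough sketch of that idea, RealityKit 2's post-process render callback can hand each frame to a Core Image bloom filter (the class name and the intensity/radius values below are just placeholders to tune):

    import RealityKit
    import CoreImage
    import CoreImage.CIFilterBuiltins

    final class BloomPostProcess {
        private let ciContext = CIContext()

        func attach(to arView: ARView) {
            arView.renderCallbacks.postProcess = { [weak self] context in
                self?.applyBloom(context)
            }
        }

        private func applyBloom(_ context: ARView.PostProcessContext) {
            guard let input = CIImage(mtlTexture: context.sourceColorTexture) else { return }

            let bloom = CIFilter.bloom()
            bloom.inputImage = input
            bloom.intensity = 0.6   // arbitrary starting values
            bloom.radius = 8.0

            guard let output = bloom.outputImage else { return }

            // Render the filtered image into RealityKit's output texture.
            let destination = CIRenderDestination(mtlTexture: context.targetColorTexture,
                                                  commandBuffer: context.commandBuffer)
            destination.isFlipped = false
            _ = try? ciContext.startTask(toRender: output, to: destination)
        }
    }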
If you want to be more realistic, consider using the depth map provided by LiDAR to calculate which areas can be illuminated, for finer-grained brightness.
Or, if you're a true explorer, you can use Metal to create a real-world digital-twin point cloud in real time and simulate how the light is occluded.
There's nothing new in 2021 about relighting techniques based on 3D compositing principles. At the moment, when you're working with RealityKit or SceneKit, you have to implement the relighting functionality yourself with the help of two additional render passes on top of the RGB pass (which is always needed): a Normals pass and a Point Position pass. Both AOVs must be 32-bit.
However, in the near future, when Apple engineers finally implement texture capturing in Scene Reconstruction, any inexperienced AR developer will be able to apply a relighting procedure.
Watch this Vimeo video to find out how relighting can be achieved in The Foundry's NUKE.
A crucial point when implementing the relighting effect is the presence of a LiDAR scanner (or an iToF sensor if you're using ARCore). In other words, today's relighting solution for iOS is Metal + RealityKit.

OpenCV - background removal and object detection

I need to detect where objects (mostly people) are in relation to a wall. I can have a fixed-position camera in the ceiling, so I thought I could capture an image of the space with nothing in it, then use the difference between that and the current camera image to get an image containing just the objects. Then I think I can do blob detection to get their positions (I only need x).
Does this seem sound? I'm not very accomplished with OpenCV, so I'm looking for some advice.
That would be one way of going about it, but it's not very robust: the video feed won't produce perfectly consistent images, so the background will never be cleanly subtracted out, and people walking through the scene will occlude light and could also partially match parts of your background.
This process of removing the background from a video is simply dubbed "background subtraction" and there are built-in OpenCV methods for it.
OpenCV has tutorials on their site showing the basics, for both python and C++.
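For reference, here is a minimal Swift sketch of the naive fixed-reference differencing described in the question, operating on grayscale Float buffers via Accelerate. OpenCV's built-in subtractors (such as MOG2) instead adapt the background model over time, which is what makes them more robust:

    import Accelerate

    /// Naive background subtraction: absolute difference between a reference
    /// "empty room" frame and the current frame, both grayscale Float buffers
    /// of the same size. Purely illustrative; real-world feeds need an adaptive
    /// background model (e.g. OpenCV's BackgroundSubtractorMOG2).
    func foregroundMask(background: [Float], frame: [Float], threshold: Float) -> [Float] {
        precondition(background.count == frame.count, "buffers must match in size")
        // |frame - background| per pixel.
        let diff = vDSP.absolute(vDSP.subtract(frame, background))
        // Binary mask: 1 where the difference exceeds the threshold, else 0.
        return diff.map { $0 > threshold ? 1 : 0 }
    }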

Face image manipulation iOS

I would like some advice on how to approach this problem. I am making an app where users will be retrieving photos of faces from a camera roll or camera capture (assuming they are always portrait) and I want to make it appear as though the face images are talking (ex. moving pixels around the mouth up and down) using any known image manipulation techniques. The resultant animation of the photo will appear on a separate view. I have started learning OpenGL and researched Open CV, Core Image, GPUImage and other frameworks. I have been given a small timeframe and generally, my experience with graphics processing is limited. I would appreciate it if anybody were to instruct me on what to do using any of the frameworks or libraries I have mentioned above.
Since all you need is some animation of the image, I don't think it's a good idea to move the pixels around as you described. It's very complicated, and the result of moving pixels around might look bad.
A much simpler approach is to use a GIF image. All you need to do is create the talking animation as a GIF and then use it in your app.
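As a tiny sketch of that approach, assuming the talking animation has already been rendered out as a sequence of frames with hypothetical asset names, you could drive a UIImageView like this:

    import UIKit

    /// Plays a pre-rendered "talking" animation as a simple frame sequence.
    /// Assumes frames exported as face_talk_0.png ... face_talk_7.png
    /// (hypothetical asset names) bundled with the app; an animated GIF can be
    /// decoded into the same kind of frame array.
    func startTalkingAnimation(in imageView: UIImageView) {
        let frames = (0..<8).compactMap { UIImage(named: "face_talk_\($0)") }
        imageView.animationImages = frames
        imageView.animationDuration = 0.8   // seconds per loop
        imageView.animationRepeatCount = 0  // 0 means loop indefinitely
        imageView.startAnimating()
    }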
Please refer to the following question.

iOS Camera Color Recognition in Real Time: Tracking a Ball

I have been looking around for a bit and know that people are able to track faces with Core Image and OpenGL. However, I am not sure where to start the process of tracking a colored ball with the iOS camera.
Once I have a lead on tracking the ball, I hope to create something to detect when the ball changes direction.
Sorry I don't have source code, but I am unsure where to even start.
The key point is image preprocessing and filtering. You can use the camera APIs to get the video stream, take a snapshot picture from it, and then:
Apply a Gaussian blur (spatial smoothing).
Apply a luminance average threshold filter to get a black-and-white image.
Run some morphological preprocessing (opening and closing operators) to remove small noise.
Run an edge-detection algorithm (for example, a Prewitt operator). After these steps only the edges remain, and your ball should appear as a circle (assuming ideal recording conditions).
Use a Hough transform to find the center of the ball.
You should record the ball's position so that in the next frame you only need to process the small part of the picture around it.
Other keyword could be: blob detection
A fast library for image processing on the GPU (with OpenGL) is Brad Larson's GPUImage library: https://github.com/BradLarson/GPUImage
It implements all the needed filters (except the Hough transform).
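Wiring those steps together with GPUImage might look roughly like this in Swift (treat it as a sketch; the blur radius and threshold values are only starting points, and you still have to run the Hough transform yourself on the CPU):

    import GPUImage
    import AVFoundation

    // Builds the preprocessing chain described above. Keep a strong reference
    // to the returned camera (and display the final filter in a GPUImageView)
    // or the pipeline will be deallocated.
    func makeBallPreprocessingPipeline() -> GPUImageVideoCamera? {
        guard let camera = GPUImageVideoCamera(sessionPreset: AVCaptureSession.Preset.vga640x480.rawValue,
                                               cameraPosition: .back) else { return nil }

        let blur = GPUImageGaussianBlurFilter()            // spatial smoothing
        blur.blurRadiusInPixels = 2.0

        let threshold = GPUImageLuminanceThresholdFilter() // black-and-white image
        threshold.threshold = 0.5                          // tune for your lighting

        let opening = GPUImageOpeningFilter()              // removes small noise blobs
        let edges = GPUImagePrewittEdgeDetectionFilter()   // leaves only the edges

        camera.addTarget(blur)
        blur.addTarget(threshold)
        threshold.addTarget(opening)
        opening.addTarget(edges)
        // Read back the edge image each frame and run your own Hough transform /
        // centroid search on the CPU to locate the ball.

        camera.startCameraCapture()
        return camera
    }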
The tracking process can be defined as follows:
Having the initial coordinates and dimensions of an object with given visual characteristics (image features)
In the next video frame, find the same visual characteristics near the coordinates from the last frame.
Near means considering basic transformations relative to the last frame:
translation in each direction;
scale;
rotation;
The variation of these transformations is closely tied to the frame rate: the higher the frame rate, the nearer the object will be to its previous position in the next frame.
Marvin Framework provides plug-ins and examples to perform this task. It's not compatible with iOS yet; however, it is open source, and I think you could port the source code fairly easily.
This video demonstrates some tracking features, starting at 1:10.
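To make the "near the last position" idea above concrete, here is a tiny hypothetical Swift helper that derives a search window from the previous detection and the frame rate:

    import CoreGraphics

    /// Region to search in the next frame, centered on the object's last known
    /// bounding box and padded by a margin that shrinks as the frame rate rises
    /// (the object moves less between consecutive frames). The names and the
    /// margin formula are illustrative assumptions, not part of any framework.
    func searchWindow(lastBox: CGRect, frameRate: Double, baseMargin: CGFloat = 80) -> CGRect {
        let margin = baseMargin * CGFloat(30.0 / max(frameRate, 1.0))
        return lastBox.insetBy(dx: -margin, dy: -margin)
    }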

How to detect movement of object on iPhone's camera screen? [duplicate]

I saw that someone has made an app that tracks your feet using the camera, so that you can kick a virtual football on your iPhone screen.
How could you do something like this? Does anyone know of any code examples or other information about using the iPhone camera for detecting objects and tracking them?
I just gave a talk at SecondConf where I demonstrated the use of the iPhone's camera to track a colored object using OpenGL ES 2.0 shaders. The post accompanying that talk, including my slides and sample code for all demos can be found here.
The sample application I wrote, whose code can be downloaded from here, is based on an example produced by Apple for demonstrating Core Image at WWDC 2007. That example is described in Chapter 27 of the GPU Gems 3 book.
The basic idea is that you can use custom GLSL shaders to process images from the iPhone camera in realtime, determining which pixels match a target color within a given threshold. Those pixels then have their normalized X,Y coordinates embedded in their red and green color components, while all other pixels are marked as black. The color of the whole frame is then averaged to obtain the centroid of the colored object, which you can track as it moves across the view of the camera.
While this doesn't address the case of tracking a more complex object like a foot, it should be possible to write shaders like this that could pick out such a moving object.
As an update to the above, in the two years since I wrote this I've now developed an open source framework that encapsulates OpenGL ES 2.0 shader processing of images and video. One of the recent additions to that is a GPUImageMotionDetector class that processes a scene and detects any kind of motion within it. It will give you back the centroid and intensity of the overall motion it detects as part of a simple callback block. Using this framework to do this should be a lot easier than rolling your own solution.
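For reference, hooking that class up looks roughly like this in Swift (a sketch; the lowPassFilterStrength and intensity threshold values here are arbitrary starting points):

    import GPUImage
    import AVFoundation

    // Keep strong references to both objects, e.g. as properties on your view controller.
    let videoCamera = GPUImageVideoCamera(sessionPreset: AVCaptureSession.Preset.vga640x480.rawValue,
                                          cameraPosition: .back)
    let motionDetector = GPUImageMotionDetector()
    motionDetector.lowPassFilterStrength = 0.5   // how quickly the background model adapts

    motionDetector.motionDetectionBlock = { centroid, intensity, frameTime in
        // centroid is in normalized (0...1) coordinates; intensity grows with the amount of motion.
        if intensity > 0.01 {
            print("motion around (\(centroid.x), \(centroid.y)), intensity \(intensity)")
        }
    }

    videoCamera?.addTarget(motionDetector)
    videoCamera?.startCameraCapture()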
