How to detect the colors or texture of a specific shape from a physical image in ARKit / RealityKit?

The end user will have a coloring book with different things like airplanes, cars, and animals. They can use multiple colors to color them. I have to detect each of these physical images, extract the colors or texture from them, and apply them to the corresponding model. Please point me in the right direction.

Related

ARKit – White paper sheet detection

I would like to build a very simple AR app that can detect a white sheet of A4 paper in its surroundings. I thought it would be enough to use Apple's image recognition sample project together with a white sample image in the aspect ratio of an A4 sheet, but the ARSession fails with:
One or more reference images have insufficient texture: white_a4,
NSLocalizedRecoverySuggestion=One or more images lack sufficient
texture and contrast for accurate detection. Image detection works
best when an image contains multiple high-contrast regions distributed
across its extent.
Is there a simple way, to detect sheets of paper using ARKit? Thanks!
I think even ARKit 3.0 isn't ready to detect an abstract white sheet at the moment.
If you have a white sheet with some markers at its corners, or some text on it, or even a white sheet placed inside a known environment (a kind of detection based on the surroundings, not on the sheet itself), then it makes some sense.
But a plain white sheet has no distinct marks on it, so ARKit has no understanding of what it is, what its color is (outdoors it has a cold tint, for instance, while indoors it has a warm tint), what its contrast is (contrast is an important property in image detection), or how it's oriented (this mainly depends on your point of view).
The whole point of image detection is that ARKit detects an image, not its absence.
So, for successful detection you'll need to give ARKit not just the sheet but its surroundings as well.
Also, you can look at Apple's recommendations when working with image detection technique:
Enter the physical size of the image in Xcode as accurately as possible. ARKit relies on this information to determine the distance of the image from the camera. Entering an incorrect physical size will result in an ARImageAnchor that’s the wrong distance from the camera.
When you add reference images to your asset catalog in Xcode, pay attention to the quality estimation warnings Xcode provides. Images with high contrast work best for image detection.
Use only images on flat surfaces for detection. If an image to be detected is on a nonplanar surface, like a label on a wine bottle, ARKit might not detect it at all, or might create an image anchor at the wrong location.
Consider how your image appears under different lighting conditions. If an image is printed on glossy paper or displayed on a device screen, reflections on those surfaces can interfere with detection.
I must add that you need a unique texture pattern, not a repetitive one.
What you could do is run a simple ARWorldTrackingConfiguration where you periodically analyze the camera image for rectangles using the Vision framework.
This post (https://medium.com/s23nyc-tech/using-machine-learning-and-coreml-to-control-arkit-24241c894e3b) describes how to use ARKit in combination with CoreML.
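The periodic Vision pass described above can be sketched as follows. This is a minimal illustration, not a complete solution: the function name, aspect-ratio threshold, and observation limit are assumptions, and you would call it from your `ARSessionDelegate` at a throttled rate rather than on every frame.

```swift
import ARKit
import Vision

// Analyze the current camera frame for rectangles (e.g. a sheet of paper).
// Call this from session(_:didUpdate:) at a throttled rate, not every frame.
func detectRectangles(in frame: ARFrame) {
    let request = VNDetectRectanglesRequest { request, _ in
        guard let rects = request.results as? [VNRectangleObservation] else { return }
        for rect in rects {
            // boundingBox is in normalized image coordinates; hit-test it
            // against the AR scene to place an anchor where the paper lies.
            print("rectangle at \(rect.boundingBox), confidence \(rect.confidence)")
        }
    }
    request.minimumAspectRatio = 0.6   // illustrative; A4 portrait is ~0.71
    request.maximumObservations = 4

    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                        orientation: .right)
    try? handler.perform([request])
}
```

From the resulting `VNRectangleObservation` corners you can raycast into the scene and anchor content on the detected sheet.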

Extract face features from ARSCNFaceGeometry

I've been trying, without success, to extract face features (for instance the mouth) from ARSCNFaceGeometry in order to change their color or add a different material.
I understand I need to create an SCNGeometry, for which I have the SCNGeometrySource, but I haven't been able to create the SCNGeometryElement.
I have tried creating it from the ARFaceAnchor in update(from faceGeometry: ARFaceGeometry), but so far without success.
I would really appreciate some help.
ARSCNFaceGeometry is a single mesh. If you want different areas of it to be different colors, your best bet is to apply a texture map (which you do in SceneKit by providing images for material property contents).
There’s no semantic information associated with the vertices in the mesh — that is, there’s nothing that says “this point is the tip of the nose, these points are the edge of the upper lip, etc”. But the mesh is topologically stable, so if you create a texture image that adds a bit of color around the lips or a lightning bolt over the eye or whatever, it’ll stay there as the face moves around.
If you need help getting started on painting a texture, there are a couple of things you could try:
Create a dummy texture first
Make a square image and fill it with a double gradient, such that the red and blue component for each pixel is based on the x and y coordinate of that pixel. Or some other distinctive pattern. Apply that texture to the model, and see how it looks — the landmarks in the texture will guide you where to paint.
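Such a double-gradient texture can also be generated in code. A sketch using Core Graphics; the function name, size, and channel mapping are arbitrary choices:

```swift
import CoreGraphics
import Foundation

// Build a size x size RGBA image where red encodes x and blue encodes y.
// Applied as a material's diffuse contents, the gradients reveal how the
// mesh's UV coordinates map onto the texture.
func makeDebugTexture(size: Int = 256) -> CGImage? {
    var pixels = [UInt8](repeating: 0, count: size * size * 4)
    for y in 0..<size {
        for x in 0..<size {
            let i = (y * size + x) * 4
            pixels[i]     = UInt8(255 * x / (size - 1))  // R follows x
            pixels[i + 2] = UInt8(255 * y / (size - 1))  // B follows y
            pixels[i + 3] = 255                          // opaque
        }
    }
    let ctx = CGContext(data: &pixels, width: size, height: size,
                        bitsPerComponent: 8, bytesPerRow: size * 4,
                        space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    return ctx?.makeImage()
}
```

Assigning the result to `faceNode.geometry?.firstMaterial?.diffuse.contents` shows the mapping on the live face mesh.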
Export the model
Create a dummy ARSCNFaceGeometry using the init(blendShapes:) initializer and an empty blendShapes dictionary (you don’t need an active ARFaceTracking session for this, but you do need an iPhone X). Use SceneKit’s scene export APIs (or Model I/O) to write that model out to a 3D file of some sort (.scn, which you can process further on the Mac, or something like .obj).
Import that file into your favorite 3D modeling tool (Blender, Maya, etc) and use that tool to paint a texture. Then use that texture in your app with real faces.
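The export step above can be sketched like this, assuming a TrueDepth-capable device (the function name and the choice of `.scn` output are illustrative):

```swift
import ARKit
import Metal
import SceneKit

// Export a neutral face mesh for texture painting in an external tool.
// No ARSession is needed, but ARFaceGeometry requires a TrueDepth device.
func exportNeutralFace(to url: URL) {
    guard let neutral = ARFaceGeometry(blendShapes: [:]),   // all-zero blend shapes
          let device = MTLCreateSystemDefaultDevice(),
          let faceGeometry = ARSCNFaceGeometry(device: device) else { return }
    faceGeometry.update(from: neutral)

    let scene = SCNScene()
    scene.rootNode.addChildNode(SCNNode(geometry: faceGeometry))
    // .scn can be processed further on the Mac; .obj/.dae open in Blender, Maya, etc.
    _ = scene.write(to: url, options: nil, delegate: nil, progressHandler: nil)
}
```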
Actually, the above is sort of an oversimplification, even though it’s the simple answer for common cases. ARSCNFaceGeometry can actually contain up to four submeshes if you create it with the init(device:fillMesh:) initializer. But even then, those parts aren’t semantically labeled areas of the face — they’re the holes in the regular face model, flat fill-ins for the places where eyes and mouth show through.

What method or set of methods should be used to detect similar objects on a picture/video?

At the moment I use Caffe to recognize the objects on the picture.
There can be several hundreds of objects and many of them can be very similar (like a bolt with square head and a bolt with 6-sided head).
In addition to the photo of the object, it is possible to take several photos of the objects from different angles or shoot a video.
Is it possible to increase the accuracy of recognition, in addition to comparing the results of the photos of the same scene from different angles?
Updated:
Learned model setup
I created a 3D model for every object. The synthetic training images were rendered in Blender with varying parameters: camera location, light positions, light intensity, 3 textures (white, gray, dark gray) for the ground and 4 textures for the objects (light green, green, dark green, almost-black green).
Because the objects can have different colors, I decided to use grayscale images, to save time and reduce the size of the input data rather than render the same set of images for every possible color.
The models were rendered at a resolution of 256x256 pixels, 5000 renders per model. I use OpenCV to segment the image and detect the objects' locations; that is why green materials in different tones were used. The renders were segmented, and the background-subtracted images were resized to 256x256 and converted to grayscale.
I used the "AlexNet" network to learn the model. It always reaches 100% accuracy on the validation set (10-20% of the training images) after 3-4 epochs.
Usually there are many objects in a real photo. It is segmented using the same algorithm that was applied to the rendered images, and recognition is performed on every segmented subimage.
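A toy sketch of the per-pixel part of that preprocessing. The green-dominance threshold is an illustrative assumption (the original pipeline uses OpenCV), and the grayscale conversion uses standard Rec. 601 luma weights:

```swift
// Pixels whose green channel clearly dominates are treated as object
// (the green-toned materials); everything else as ground. Object pixels
// are then converted to grayscale.
struct RGB { let r, g, b: UInt8 }

func isObjectPixel(_ p: RGB) -> Bool {
    // Threshold of 16 is an arbitrary illustrative margin.
    Int(p.g) > Int(p.r) + 16 && Int(p.g) > Int(p.b) + 16
}

func grayValue(_ p: RGB) -> UInt8 {
    // Rec. 601 luma weights: 0.299 R + 0.587 G + 0.114 B.
    UInt8((299 * Int(p.r) + 587 * Int(p.g) + 114 * Int(p.b)) / 1000)
}
```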

Best way to draw a cube with solid-coloured faces

I'm completely new to DirectX (11) so this question will be extremely basic. Sorry about that.
I'd like to draw a cube on screen that has solid-coloured faces. All of the examples I've seen use 8 vertices, with a colour defined at each vertex (red, green, blue). The colour is then interpolated across each triangle during rasterization, giving a spectrum of colours. This looks nice, but isn't what I'm trying to achieve: I'd just like a cube with six solid-coloured faces.
Two ideas come to mind:
use 24 vertices, each referenced only a single time, i.e. no sharing. This way I can define three different colours at each 3D position, one for each adjacent face.
use a texture for each face that 'stretches' to give the face the correct colour. I'm not very familiar with textures right now, so I'm not all that sure about this idea.
What's the typical/canonical way to achieve this effect? I'm sure this 'problem' has been solved many, many times before.
For your particular problem, vertex coloring might be the easiest and best solution. But the more complex your models become, the more complicated it is to create a proper vertex coloring, because you don't always want your imagination limited by the underlying geometry.
In general, 3D objects are colored with one or more textures. For this you create a UV mapping (wiki), which unwraps your three-dimensional surface onto a 2D plane, the texture. You can then freely paint colors onto your object at any resolution you want, which gives you the most freedom to make the model look the way you want.
Of course, each application has its own characteristics, so some projects would choose another approach, but I think this is the most popular way to colorize models.
Option 1 is the way to go if:
You want zero color bleed between faces
You want zero texture bleed between faces
You later want to use the color as a lighting scheme ala Minecraft
Caveats:
Uses more memory, as more vertices are stored (there are some techniques around this depending on how large your object is and its spatial resolution, e.g. using 1 byte for x/y/z instead of a float)
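The 24-vertex layout itself is API-agnostic. Here is a sketch of the same idea using SceneKit's custom-geometry API rather than D3D11 (face colours, the function name, and the double-sided material are illustrative choices; in D3D11 you would fill an unshared 24-entry vertex buffer and a 36-entry index buffer the same way):

```swift
import SceneKit
import simd

// A cube with six solid-coloured faces: 24 unshared vertices (4 per face),
// each repeating its face's colour, so no interpolation crosses a face.
func makeSolidFaceCube(side: Float = 1) -> SCNGeometry {
    let h = side / 2
    // Six faces as (outward normal, RGB colour); colours are arbitrary.
    let faceData: [(n: simd_float3, color: simd_float3)] = [
        (simd_float3( 1, 0, 0), simd_float3(1, 0, 0)),
        (simd_float3(-1, 0, 0), simd_float3(0, 1, 0)),
        (simd_float3(0,  1, 0), simd_float3(0, 0, 1)),
        (simd_float3(0, -1, 0), simd_float3(1, 1, 0)),
        (simd_float3(0, 0,  1), simd_float3(1, 0, 1)),
        (simd_float3(0, 0, -1), simd_float3(0, 1, 1)),
    ]
    var positions: [SCNVector3] = []
    var colors: [simd_float3] = []
    var indices: [Int32] = []
    for (n, color) in faceData {
        // Tangent axes via a cyclic permutation of the axis-aligned normal.
        let u = simd_float3(n.y, n.z, n.x) * h
        let v = simd_float3(n.z, n.x, n.y) * h
        let c = n * h
        let base = Int32(positions.count)
        for corner in [c + u + v, c + u - v, c - u - v, c - u + v] {
            positions.append(SCNVector3(corner.x, corner.y, corner.z))
            colors.append(color)       // every corner repeats the face colour
        }
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    }
    let colorData = colors.withUnsafeBufferPointer { Data(buffer: $0) }
    let colorSource = SCNGeometrySource(
        data: colorData, semantic: .color, vectorCount: colors.count,
        usesFloatComponents: true, componentsPerVector: 3, bytesPerComponent: 4,
        dataOffset: 0, dataStride: MemoryLayout<simd_float3>.stride)
    let geometry = SCNGeometry(
        sources: [SCNGeometrySource(vertices: positions), colorSource],
        elements: [SCNGeometryElement(indices: indices, primitiveType: .triangles)])
    let material = SCNMaterial()
    material.isDoubleSided = true   // winding varies per face in this quick sketch
    geometry.materials = [material]
    return geometry
}
```

Attach it with `SCNNode(geometry: makeSolidFaceCube())`; each face stays one flat colour because no vertex is shared across faces.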

Blending colors on iOS

I want to crop a rectangular eye region from one face and paste it onto another face, so that in the resulting image the skin color around the eye blends nicely with the face it is pasted onto. I am able to crop and paste, but I'm having problems with blending: currently the boundaries of the pasted rectangle are very visible. I want to reduce this effect so that the eyes blend nicely with the face and the resulting image won't look fake.
My suggestion is to do the blending in code. First, you need to create two bitmap contexts so you have the bits of the face and the bits of the new eye.
In the overlap area only, determine the outermost "skin" region by evaluating the colors of the two areas, and create a mapping of the areas in both that are "skin". Work from the outermost areas toward the center.
For color evaluation, convert colors to HSV (or HCL) and look at hue and saturation.
You will need to figure out some criterion for deciding what is skin and what is eye.
Once you have defined the outer area (the one that is NOT eye, but skin), you blend. The blend uses more of the original based on its distance from the center of the eye (or its distance to the ellipse defining the eye). Initially, at the outer edge, the color will be, say, 5% new and 95% original.
As you get closer to the eye, you use more of the eye overlay's color.
This should produce a really nice image. The biggest problem, of course, will be finding a good algorithm for separating eye from skin.
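The distance-based weighting can be sketched as follows. The function names, the feather width, and the 5% floor at the outer edge are illustrative parameters, not a fixed recipe:

```swift
// Weight of the pasted eye patch: 1.0 at the eye ellipse, falling to 0.05
// at the patch border, so the seam fades into the original skin.
func eyeWeight(distanceFromEyeEdge d: Float, featherWidth: Float) -> Float {
    // d = 0 on the eye ellipse, d = featherWidth at the patch border.
    let t = min(max(d / featherWidth, 0), 1)   // clamp to [0, 1]
    return (1 - t) * 0.95 + 0.05               // 1.0 -> 0.05 across the feather
}

// Per-channel blend of the pasted pixel over the original using that weight.
func blend(_ pasted: Float, _ original: Float, weight w: Float) -> Float {
    pasted * w + original * (1 - w)
}
```

Applied per pixel in the overlap region, this gives the gradual falloff described above instead of a hard rectangular edge.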
