YOLO multiple annotations on same object - image-processing

Is it okay to put multiple annotations of the same class on the same object?
Will it break training, have no effect, or improve accuracy?
[Image: OneBirdMultipleAnnotations]
I've put multiple bounding boxes of the same class on a single object.

This probably won't do anything, because overlapping boxes are filtered out using Intersection over Union (IoU). You can read more about how the metrics are calculated here.
If you're using a modern YOLO like v5 or v7, you might want to try labeling with polygons instead. That does what it looks like you're trying to do (give the model more information about where the object of interest is relative to the background), and it's easy to do with the Smart Polygon model if you're already using Roboflow.
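For reference, IoU is just intersection area over union area; here is a minimal sketch (boxes given as (x1, y1, x2, y2) pixel corners; the threshold used to discard near-duplicate boxes depends on the framework):

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two near-duplicate boxes on the same bird overlap almost completely:
print(iou((10, 10, 110, 110), (12, 12, 112, 112)))  # ~0.92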

Related

What are the pre-processing steps required before object annotation in deep learning using Keras?

I am new to deep learning. What pre-processing steps should I apply before annotating my object?
I have a dataset of images of size 640×360. I want to continuously detect this object in a video against any background.
Should I crop the object and then annotate it? Or should I use the entire image and annotate the specific object within it?
Which image should be used for annotation?
You have mentioned two ways you can go.
Should I crop the object and then annotate it?
Yes, this is an easy option. After cropping, you can use a sliding-window technique to search test images, but bear in mind that this process is quite expensive (a rough sketch of the idea is below).
You should take a look at the R-CNN paper; its methodology is quite similar.
Paper link: https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf?spm=5176.100239.blogcont55892.8.pm8zm1&file=Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf
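A rough sketch of what a sliding-window search could look like (pure NumPy; the window size and stride are arbitrary placeholders, and `classifier` stands in for whatever model you train on the crops):

```python
import numpy as np

def sliding_windows(image, win_h=64, win_w=64, stride=32):
    """Yield (x, y, window) crops scanned across the image."""
    h, w = image.shape[:2]
    for y in range(0, h - win_h + 1, stride):
        for x in range(0, w - win_w + 1, stride):
            yield x, y, image[y:y + win_h, x:x + win_w]

image = np.zeros((360, 640, 3), dtype=np.uint8)  # dummy 640x360 frame
for x, y, window in sliding_windows(image):
    # score = classifier.predict(window)  # model trained on the cropped object
    pass
```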
Should I use the entire image and annotate the specific object?
This is the more recent way to go, where you let the model learn for itself what the subject is and where it is located in the given image. Usually you will have an image plus annotations like [x coordinate, y coordinate, height, width] for each object, and the model is trained to produce outputs as close to those as possible (see the sketch below).
For multiple objects you will need an ROI-pooling layer. This is explained at https://deepsense.ai/region-of-interest-pooling-explained/
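To make the second option concrete, a full-image annotation is usually stored as one row of normalised box coordinates per object; a small sketch assuming a YOLO-style (class, x-center, y-center, width, height) layout, using your 640×360 frame size as an example:

```python
def to_yolo_row(class_id, x1, y1, x2, y2, img_w=640, img_h=360):
    """Convert a pixel box (x1, y1, x2, y2) on the full image to a
    normalised YOLO-style annotation row."""
    x_center = (x1 + x2) / 2.0 / img_w
    y_center = (y1 + y2) / 2.0 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# One object of class 0 in a 640x360 frame:
print(to_yolo_row(0, 100, 50, 300, 200))
```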

How can I transform an image to match a circular model in OpenCV

I'm trying to make a program that can take an image of a dartboard and read the score. So far I can get the position of each dart by comparing it to a model image, as you can see here:
However, this only works if the input image is practically the same. In this other case the board is in a slightly different perspective, so I was thinking maybe I can transform the image to match the model image and then run the process shown above.
So my question is: how can I transform this last image to match the shape and perspective of the model dartboard with OpenCV?
The dartboard is essentially planar, so you can model the desired transformation with a homography. You can then perform simple feature extraction and matching (like here) or, if speed is not as important, use an intensity-based parametric alignment algorithm (more accurate); a sketch of the feature-based route follows below.
However, as already mentioned in the comments, it will not be so simple afterwards. The dart flights will (depending on the distortion) most likely cover an area of the board which does not coincide with the actual score. Actually, even with a frontal view it is difficult to tell.
I assume you will have to find the point at which each dart sticks into the board. Furthermore, I think this will be easier with a view from a certain angle. Maybe you can fit line segments just in the area where you detected a difference beforehand.
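A minimal OpenCV sketch of the feature-based route (ORB keypoints, cross-checked brute-force matching, RANSAC homography; the file names are placeholders):

```python
import cv2
import numpy as np

model = cv2.imread("model_board.jpg", cv2.IMREAD_GRAYSCALE)   # reference board
query = cv2.imread("query_board.jpg", cv2.IMREAD_GRAYSCALE)   # new perspective

orb = cv2.ORB_create(2000)
kp_m, des_m = orb.detectAndCompute(model, None)
kp_q, des_q = orb.detectAndCompute(query, None)

# Match descriptors and keep the strongest matches
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_q, des_m), key=lambda m: m.distance)[:200]

src = np.float32([kp_q[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_m[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# Robustly estimate the homography and warp the query into the model frame
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
aligned = cv2.warpPerspective(query, H, (model.shape[1], model.shape[0]))
cv2.imwrite("aligned_board.jpg", aligned)
```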
I don't think comparing an image against a model captured with a different board and from a different angle is a good idea. There will be lots of small differences even after matching them perfectly geometrically: shading, lighting, color differences, etc.
I would just capture an image every time the game begins (the reference), extract features (straight lines seem good enough), and then after the game capture another image, subtract the reference, and do blob analysis to find the darts.
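A rough sketch of that reference-subtraction idea in OpenCV (the threshold, kernel size, and minimum blob area are guesses you would have to tune):

```python
import cv2

reference = cv2.imread("board_before.jpg", cv2.IMREAD_GRAYSCALE)  # start of game
current = cv2.imread("board_after.jpg", cv2.IMREAD_GRAYSCALE)     # after throws

# Difference image: darts (and their shadows) show up as changed regions
diff = cv2.absdiff(current, reference)
_, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)

# Clean up speckle noise before blob analysis
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# Connected components = candidate darts; filter tiny blobs by area
num, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
darts = [centroids[i] for i in range(1, num) if stats[i, cv2.CC_STAT_AREA] > 100]
print(f"Found {len(darts)} candidate darts at {darts}")
```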

How to create a hole in the box in SceneKit?

I'm using SceneKit to create a 3D room for a Swift iOS app.
I'm using multiple boxes and placing them together to create the different walls of the room. I also want to add doors and windows to the room, for which I need to cut holes into the walls. This seems like a very common scenario, yet I couldn't find any relevant answers out there.
I know there are multiple ways of doing it -
Simplest: don't cut the box at all; place another box with a door or wall texture.
But I do want to keep a light source outside of the room and have it flow into the room through these doors and windows.
Create multiple boxes for a single wall and put them together to make the geometry.
Maybe my last resort.
Create custom geometry.
Feels too complicated, since it requires me to draw each triangle myself. Not sure?
But what I was actually expecting -
Subtract geometries from geometries?
A library that already handles these complexities?
Any pointers would be very helpful.
Thanks.
SceneKit offers some awesome potential, but it's not a substitute for a 3D modeling program. If you want something much beyond assembling primitives and extruding in a plane, you should think about constructing your model in a dedicated 3D package and exporting it into SceneKit as a .dae file. You might take a look at Blender: it's free and readily available on the net. I suspect it can easily do what you want, and the learning curve will be compensated by the higher-level functions of a graphics program versus coding.
I think @bpedit described the best approach.
A weak second choice would be to use SCNShape to build your geometry. That still leaves you the problem of constructing a Bézier path that matches your wall layout/topology. It might be a helpful hack in the short term, to save you from the immediate learning curve of modeling software, but I predict you'll still eventually move to a tool like Blender, SketchUp, Cheetah 3D, or Maya.

Confused about what exactly an image feature is

I read about image features on Wikipedia and I am still confused about what exactly they are.
The term is explained in a way that doesn't resolve my confusion. Is it:
1. They represent a class (an edge is one feature and a boundary is another), or
2. They represent an instance of a class (every edge detected is a feature)?
Suppose I detect all the corners of an object and put them in an array, say A.
Did I get only one feature, or did I get len(A) features?
Each feature is an individual "interesting" point or area in the image, with "interesting" depending on what algorithm is used to find the features. In your example you'd have len(A) features, each of the corners being one.
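For example, with OpenCV's Shi-Tomasi corner detector each returned corner is one feature (the file path and parameter values below are just placeholders):

```python
import cv2

img = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Each detected corner is one individual feature ("interesting" point)
corners = cv2.goodFeaturesToTrack(img, maxCorners=100,
                                  qualityLevel=0.01, minDistance=10)

n_features = 0 if corners is None else len(corners)
print(f"Number of features: {n_features}")  # this is len(A) in your notation
```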

How to perform Watershed segmentation and Blob analysis on a single image?

I'm planning to write a program using OpenCV to count the number of objects in an image similar to the one below.
The method I'm planning to employ is to threshold the image using the histogram and then use blob detection to count the blobs that are identified. This works fine as long as the pellet-like objects do not touch each other. (Overlapping is out of scope, though.) I've looked into using watershed segmentation to separate objects that are touching each other.
What I'm not clear on is how to apply these two techniques to an image that may or may not have touching pellets. Provided that there is at least one instance of pellets touching each other in an image, should I perform both techniques? If so, in what order? Or should I perform watershed only, since there's going to be touching somewhere and blob detection alone would give an erroneous count due to merged blobs? Thanks in advance.
You say "Provided that there is at least one instance of overlapping in an image" but also "Overlapping is out of scope though".
If the Watershed algorithm handles images with overlapping pellets, Blob detection will probably not provide any advantage (since it will merge overlapping objects).
If you really wanted to combine the approaches, you could run both of them in their own pipelines, and use a probabilistic model to combine the two. But it's best to start simple and see what sort of results you get first.
Here's an example using Matlab which performs cell segmentation using Watershed:
http://blogs.mathworks.com/steve/2006/06/02/cell-segmentation/
If you need to avoid counting objects which are only partially in view, you can use a Voronoi diagram and remove objects that connect with the edges:
http://pythonvision.org/basic-tutorial
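For reference, here is a hedged OpenCV sketch of the usual watershed pipeline for touching pellets (Otsu threshold, distance transform, markers, watershed, then count the regions; the 0.5 factor is a value to tune, the file path is a placeholder, and you may need THRESH_BINARY_INV if the pellets are darker than the background):

```python
import cv2
import numpy as np

img = cv2.imread("pellets.jpg")                       # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold: Otsu picks the level from the histogram
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Sure foreground: peaks of the distance transform (pellet centres)
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
sure_fg = np.uint8(sure_fg)

# Sure background and the unknown border region between touching pellets
sure_bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label the markers and let watershed carve the boundaries
num_markers, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0
markers = cv2.watershed(img, markers)

# Labels > 1 are individual pellets (1 is background, -1 marks boundaries)
count = len(np.unique(markers)) - 2
print(f"Estimated number of pellets: {count}")
```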
