My implementation of MSER is not detecting the text area correctly; what should I do? - opencv

I am a beginner in OpenCV, and I am trying to extract numbers from a dataset of images and use the extracted numbers as a dataset for a neural network. For this, I'm using MSER's bounding boxes and then cropping the image to the size of each bounding box, but MSER is not detecting the text area correctly. Please help me do this more precisely. Here is my code:
import cv2

mser = cv2.MSER_create(_delta=1)
msers, bbs = mser.detectRegions(gray)
Here bbs is the list of bounding boxes; not even one of them lands on the text area.
Image of the ground truth, where the bounding box should be:
Bounding box by mser:
Another example of the bounding box by mser:

If you want to detect text in an image, I use Tesseract to accomplish this. You just point it at a tessdata file for the language you are using, and it should detect the text in the image and output it as a string. However, if you want to crop the original image further before detecting text, you could use blob detection. In blob detection, the image is passed through a variety of different thresholds; wherever a region stays consistent across them, a blob is created there. You could use blob detection in this situation, then create your bounding rectangles from those blobs.

Related

How to remove bounding boxes from an image using OpenCV

The link to the image is here. I have an image and I want to remove all the bounding boxes from it using OpenCV; the bounding boxes have different color codes. How can we do that?
I have tried many OpenCV snippets but I am not able to get the result, so I would like the full code.
Thanks

Shape detection after object detection to get a precise bounding box

I'm working on a project where I want to get the height of an object in an image.
I can identify the object and get a bounding box using object detection, but the box is not precise (a little bigger than the object). Can I crop the image with the bounding box coordinates and then apply some mechanism like edge detection to get the exact bounding box or height?
Since semantic segmentation is heavier, I don't want to go with that; also, my trained models are in YOLO.
Edit: Added a picture with the object detected. The object is a stop sign; as you can see, the bounding box is a little bigger with respect to height.

How do I perform data augmentation in object localization

Performing data augmentation for a classification task is easy, as most transforms do not change the ground truth label of the image.
However in the case of object localization:
The position of the bounding box is relative to the crop that has been taken.
There can be the case that the bounding box is only partially inside the crop window; do we perform some sort of clipping in this case?
There will also be cases where the object's bounding box is not included in the crop at all; do we discard these examples during training?
I am unable to understand how such cases are handled in object localization. Most papers suggest the use of multi-scale training but don't address these issues.
Some augmentation methods alter the bounding box, and some don't. In the case of color augmentations, the pixel distribution changes but the coordinates of the bounding box do not. In the case of geometric augmentations such as cropping or scaling, both the pixel distribution and the coordinates of the bounding box are affected. Those changes should be written back to the annotation files so the algorithm can read them.
Custom scripts are a common way to solve this problem. However, my repository contains a library that can help you: https://github.com/lozuwa/impy . With this library you can perform the operations described above.
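The clipping and discarding cases from the question can be handled with a small helper like the one below. This is an illustrative sketch (not part of impy); the `min_visible` cutoff is an assumed hyperparameter that decides when a partially visible object is kept.

```python
# Boxes and windows are (x1, y1, x2, y2) in pixels.
def crop_box(box, window, min_visible=0.25):
    """Adjust `box` for a crop `window`: clip partial overlaps,
    return None when the object should be discarded."""
    bx1, by1, bx2, by2 = box
    wx1, wy1, wx2, wy2 = window
    # Intersection of the box with the crop window (this is the clipping step).
    ix1, iy1 = max(bx1, wx1), max(by1, wy1)
    ix2, iy2 = min(bx2, wx2), min(by2, wy2)
    if ix1 >= ix2 or iy1 >= iy2:
        return None                      # box entirely outside: discard example
    inter = (ix2 - ix1) * (iy2 - iy1)
    area = (bx2 - bx1) * (by2 - by1)
    if inter / area < min_visible:
        return None                      # too little of the object is visible
    # Shift into the crop's coordinate frame.
    return (ix1 - wx1, iy1 - wy1, ix2 - wx1, iy2 - wy1)
```

For example, a box half inside the window is clipped and re-expressed in crop coordinates, while a box fully outside yields `None` so the example (or at least that annotation) can be dropped.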

Objects detection in an image

Need some guidance here: I'm trying to identify the different objects in an image and get their bounding boxes.
The image is always clean, with a transparent background and well-separated objects.
For example, in the above image there are 3 objects. Any idea or tool would be helpful.
Since the objects are on such a background, simple connected components labeling will give you a first basic answer. However, it will be more complicated to find out which objects are overlapping.
Do you have any information about the objects to detect?
You can use template matching to find the flowers and the top-right object (assuming here that they are similar), given the image of the flower (as a template) and the whole picture.
There is an example of template matching here (where reference.png is the original image and template.png is the object you want to detect, such as the flower):
Here is an image of the flower (renamed to template.png):
Running the template matching code with the whole image as reference.png, we can find the flowers (highlighted in green rectangles):
Although that code does not draw bounding boxes, you can use boundingRect() to draw a minimum bounding rectangle (given a single contour).
The outline may be something like:
Set a ROI (Region of Interest) inside each of the green boxes.
Find the contours of pink objects.
Use boundingRect on the contours found, and draw the minimum rectangle around the flower.

Use OpenCV to detect text blocks to send to Tesseract iOS

How can I use OpenCV to detect all the text in an image? I want to be able to detect "blocks" of text individually and then pass the recognized blocks into Tesseract. Here is an example: if I were to scan this, I would want to scan the paragraphs separately, not go from left to right, which is what Tesseract does.
Image of the example
That would be my first test:
Threshold the image to get a black and white image, with the text in black
Erode it until each paragraph converts into a big blob. It may have lots of holes; it doesn't matter.
Find the contours and their bounding boxes.
If some paragraphs merge, erode less, or dilate a little bit after the erode.