Any ideas to train a segmentation model using slanting images? - machine-learning

I am having some problems training segmentation models.
I have some UAV images (in TIFF format), which I cut into tiles like this:
And I made some masks in ENVI like this (still TIFF format):
When I convert them to JPEG format, they look like this:
Enough introduction; my problems are:
If possible, I want to use the TIFF images to train the model (e.g. U-Net, FPN, ...). Is there any existing library that supports TIFF as input?
As you can see, my images are slanted and parts of them are transparent. Is it valid to use training images like this?
I would be more than thankful if you could point me to similar examples. Thanks!
I cannot use cv2.resize to change the input size of the TIFF mask, and I have some problems loading the image data. In any case my initial results are awful, so I need some examples and a tutorial.
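One way to approach the TIFF-loading part of the question is the `tifffile` package, which reads multi-band, 16-bit GeoTIFF tiles directly into NumPy arrays; the transparent corners of a slanted tile can then be turned into a validity mask that most training frameworks accept as an ignore mask or sample weight. This is only a sketch under those assumptions (the synthetic tile below stands in for a real UAV tile, and the file name is hypothetical):

```python
# Sketch: loading a TIFF tile for segmentation training and masking out
# the transparent (nodata) corners of a slanted UAV tile.
# Assumes the `tifffile` package is installed; "tile.tif" is a stand-in name.
import numpy as np
import tifffile

# Build a small synthetic "slanted" tile for demonstration:
# valid pixels in the centre, zero-filled (transparent) borders.
tile = np.zeros((64, 64, 3), dtype=np.uint16)
tile[16:48, 16:48] = 1000
tifffile.imwrite("tile.tif", tile)

img = tifffile.imread("tile.tif").astype(np.float32)

# Pixels where all bands are zero are treated as nodata; a mask like this
# can be passed to the loss as an ignore mask so the empty corners do not
# contribute to training.
valid = img.sum(axis=-1) > 0

# Normalise using only valid pixels so the empty corners don't skew stats.
img[valid] /= img[valid].max()
```

The same idea works with `rasterio`, which additionally exposes the GeoTIFF nodata value instead of guessing it from all-zero pixels.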

Related

Template matching for colored image input

I have a working code for template matching. But it only works if the input image is converted into grayscale. Is it possible to do template matching considering the template color as well that needs to be found in the given image?
inputImg = cv2.imread("location")
template = cv2.imread("location")
Yes, you can do it, but why?
The idea of converting to grayscale is to apply the selected edge-detection algorithm to the features of the input image.
Since you are working with features, the chance of finding the template in the original image is higher. As a result, converting to grayscale has two advantages: accuracy and lower computational cost.
The matchTemplate method also works for RGB images; you then need to find the image characteristics in three separate channels. Yet you cannot be sure your features are robust, since most edge-detection algorithms are designed for grayscale images.

How to segment ROI using SIFT/SURF

SIFT is used for feature extraction. Most of the tutorials I have seen only show the features detected using SIFT. I need to identify an ROI using SIFT. The images look like this but in worse condition (taken from different angles, some are blurry, with more text and numbers in other places too).
I need to extract this and then perform digit recognition:
What are the ways to segment this part? I was going for SIFT/SURF but couldn't find any tutorial to segment out the ROI. If there are any other suggestions then please provide the link.
Edit: The images I have are grayscale.
Edit 1: This is just an example image I got from Google; my dataset only has grayscale images, not coloured ones.

OCR on antialiased text

I have to OCR a table from a PDF document. I wrote a simple Python + OpenCV script to get the individual cells. After that a new problem arose: the text is antialiased and not of good quality.
The recognition rate of Tesseract is very low. I've tried to preprocess the images with adaptive thresholding, but the results weren't much better.
I've tried the trial version of ABBYY FineReader and it indeed gives fine output, but I don't want to use non-free software.
I wonder if some preprocessing would solve the issue, or whether it is necessary to write and train another OCR system.
If you look closely at your antialiased text samples, you'll notice that the edges contain a lot of red and blue:
This suggests that the antialiasing is taking place inside your computer, which has used subpixel rendering to optimise the results for your LCD monitor.
If so, it should be quite easy to extract the text at a higher resolution. For example, you can use ImageMagick to extract images from PDF files at 300 dpi by using a command line like the following:
convert -density 300 source.pdf output.png
You could even try loading the PDF in your favourite viewer and copying the text directly to the clipboard.
Addendum:
I tried converting your sample text back into its original pixels and applying the scaling technique mentioned in the comments. Here are the results:
Original image:
After scaling 300% and applying simple threshold:
After smart scaling and thresholding:
As you can see, some of the letters are still a bit malformed, but I think there's a better chance of reading this with Tesseract.

Need advice on training Tesseract OCR (text with conversion/compression artifacts)

I need to do OCR on images that have gone through a digital to analog (interlaced video) to digital conversion, then jpeg compressed (resulting in compression artifacts). I have not been able to locate the exact fonts used, but we'll be looking at a mix of sans serif - e.g., Arial, Calibri, and Tiresias might work well as a training set. There is no way to get around the jpeg compression. These are text-only, white-on-black images at standard def resolution (720x480 deinterlaced).
An example is located here, resized at 1000%:
I've found a preprocessing pipeline that works fairly well for Tesseract:
Resize to 400-600%
Blur
Threshold (binarization)
Erode (get thinner stroke width)
One problem is that letters like 't' and 'f' end up with a diamond shape at the crossbar. Still, this process works well, but isn't quite perfect. So I'd like to train Tesseract. My question:
How should I create the training set?
Should I try to emulate the analog-to-digital-to-analog by adding a small amount of noise, then compress with jpeg? Should I do preprocessing on my training set, similar to what I listed above? If I train with noisy jpeg compressed images to match my captured images, is it best to skip preprocessing on the captured images?
Additionally, any hints on getting rid of the conversion/compression artifacts without sacrificing the text would be appreciated.

Convert raster images to vector graphics using OpenCV?

I'm looking for a way to convert raster images to vector data using OpenCV. I found the function cv::findContours(), which seems a bit primitive (or, more likely, I did not understand it fully):
It seems to work on binary images only (no grayscale and no coloured images) and does not seem to accept any filtering/error-suppression parameters that could be helpful with noisy images, e.g. to avoid very short vector lines or to avoid uneven polylines where one single straight line would be the better result.
So my question: is there an OpenCV way to vectorise coloured raster images such that the colour information is assigned to the resulting polylines afterwards? And how can I apply noise reduction and error suppression to such an algorithm?
Thanks!
If you want to vectorise a raster image by colour, then I recommend you cluster the image into a few colour groups (or quantise it), then extract the contours of each colour group and convert them to the format you need. There are no ready-made vectorising methods in OpenCV.
