How can I convert bounding box pixels of an image to white, and the background to black?

I have a set of images similar to this one:
And for each image, I have a text file with bounding box regions expressed in normalized pixel values, YOLOv5 format (a text document with rows of type: class, x_center, y_center, width, height). Here's an example:
3 0.1661542727623449 0.6696164480452673 0.2951388888888889 0.300925925925926
3 0.41214353459362196 0.851908114711934 0.2719907407407405 0.2961837705761321
I'd like to obtain a new dataset of masked images, where the bounding box area from the original image gets converted into white pixels, and the rest gets converted into black pixels. This would be an example of the output image:
I'm sure there is a way to do this in PIL (Pillow) in Python, but I just can't seem to find a way.
Would somebody be able to help?
Kindest thanks!

Here's the answer:
import os
import numpy as np
from PIL import Image
# labPath and filename are assumed to point at one YOLOv5 label file;
# 1152 is the side length in pixels of the (square) images.
with open(os.path.join(labPath, filename), 'r') as label:
    lines = label.read().split('\n')
square = np.zeros((1152, 1152), dtype=np.uint8)
for line in lines:
    if line != '':
        line = line.split()  # line: class, x, y, w, h
        left = int((float(line[1]) - 0.5 * float(line[3])) * 1152)
        bot = int((float(line[2]) + 0.5 * float(line[4])) * 1152)
        top = int(bot - float(line[4]) * 1152)
        right = int(left + float(line[3]) * 1152)
        square[top:bot, left:right] = 255  # paint the box white
square_img = Image.fromarray(square)
square_img = square_img.convert("L")
Let me know if you have any questions!
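If the images are not all 1152 x 1152, the same idea can be wrapped in a function that takes the image size as a parameter. This is only a sketch: the name yolo_boxes_to_mask and its arguments are made up for illustration, and it assumes each label row has exactly five whitespace-separated fields.

```python
import numpy as np

def yolo_boxes_to_mask(label_text, width, height):
    """Return a uint8 mask: 255 inside every YOLO box, 0 elsewhere."""
    mask = np.zeros((height, width), dtype=np.uint8)
    for row in label_text.splitlines():
        parts = row.split()
        if len(parts) != 5:
            continue  # skip blank or malformed rows
        _, xc, yc, w, h = (float(p) for p in parts)
        # Convert normalized centre/size to pixel edges, clamped to the image
        left = max(0, int((xc - w / 2) * width))
        right = min(width, int((xc + w / 2) * width))
        top = max(0, int((yc - h / 2) * height))
        bottom = min(height, int((yc + h / 2) * height))
        mask[top:bottom, left:right] = 255
    return mask
```

With Pillow, Image.open(path).size gives (width, height) for each image, and Image.fromarray(mask).save(...) writes the mask to disk.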


How to filter bigger font sizes of a text?

I've been writing code to read text using OpenCV and Tesseract on a Raspberry Pi. It works well, but I would like to keep only the title of the text, that is, distinguish the smallest characters from the biggest and extract only the biggest ones.
Is there any way to achieve this differentiation?
Here is the initial code:
import cv2
import pytesseract
cap = cv2.VideoCapture(0)
cap.set(3, 640)  # frame width
cap.set(4, 480)  # frame height
while True:
    success, img = cap.read()
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imshow("Video", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.imwrite("NewPicture.jpg", img)
        break
text = pytesseract.image_to_string(img, config='--oem 3 --psm 11')
print(text)
Example image
A quick search of the pytesseract documentation shows that it has:
from PIL import Image
# Get verbose data including boxes, confidences, line and page numbers
print(pytesseract.image_to_data(Image.open('test.png')))
You can get quite a bit of data from this API and then filter on the size of the bounding boxes.
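A minimal sketch of that filtering step, assuming the data was requested as a dictionary with pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT), which gives parallel lists under keys such as "text", "height" and "conf". The helper name keep_tall_words and the 0.8 ratio are made up for illustration and would need tuning.

```python
def keep_tall_words(data, ratio=0.8):
    """Return the words whose box height is at least `ratio` times the tallest word box."""
    words = [
        (text.strip(), height)
        for text, height, conf in zip(data["text"], data["height"], data["conf"])
        # Skip empty entries and non-word rows (reported with confidence -1)
        if text.strip() and float(conf) >= 0
    ]
    if not words:
        return []
    tallest = max(height for _, height in words)
    return [text for text, height in words if height >= ratio * tallest]
```

Box height is only a proxy for font size: a word without ascenders or descenders has a shorter box than its neighbours, so grouping by line before comparing heights may work better.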

Why does reading an image with cv2 behave differently from PIL?

As shown in the above image, when I read it with Pillow:
import numpy as np
from PIL import Image
label = Image.open('example.png')
print(np.unique(np.array(label)))
The numbers are within the range [0, 34], which is correct. However, when I read it with cv2:
import cv2
label = cv2.imread('example.png')
print(np.unique(label))
The numbers are within [0, 255], which is not correct in my application. How can I align the behavior of cv2 and PIL?
Also, when I checked the MATLAB example code that parses this image, it is written like this:
[labels, color_mappings] = imread('example.png')
It seems that the PNG file has two data fields: one with values ranging from 0 to 34, and another with the color pixels. How can I parse it with cv2?
I think Dan has the right answer, but if you want to do some "quick and dirty" testing, you could use the following code to:
1. convert your palette image into a single-channel greyscale PGM image of the palette indices that OpenCV can read without any extra libraries, and a separate palette file that you can apply back afterwards
2. load back a PGM file of the indices that OpenCV may have altered, and reapply the saved palette
#!/usr/bin/env python3
import numpy as np
from PIL import Image
# Open palette image and remove pointless alpha channel
im = Image.open('image.png').convert('P')
# Extract palette and save as CSV
np.array(im.getpalette()).tofile('palette.csv',sep=',')
# Save palette indices as single channel PGM image that OpenCV can read
na = np.array(im)
Image.fromarray(na).save('indices.pgm')
So that will have split image.png into indices.pgm that OpenCV can read as a single channel image and palette.csv that we can reload later.
And here is the second part, where we rebuild the image from indices.pgm and palette.csv
import numpy as np
from PIL import Image
# First load indices
im = Image.open('indices.pgm')
# Now load palette
palette = np.fromfile('palette.csv',sep=',').astype(np.uint8)
# Put palette back into image (as a plain list of ints)
im.putpalette(palette.tolist())
# Save
im.save('result.png')
Remember not to use any interpolation other than nearest neighbour (cv2.INTER_NEAREST) in OpenCV, else you will introduce new colours not present in the original image.
Keywords: Python, PNG, image processing, palette, palette indices, palette index
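To illustrate why the two libraries disagree: a palette PNG stores one index per pixel plus a colour table. Pillow hands back the indices for a mode "P" image, whereas cv2.imread, as far as I know, applies the palette and returns BGR colours. The sketch below uses a made-up three-entry palette and plain NumPy to show the lookup in both directions; the helper colours_to_indices is hypothetical and assumes every pixel matches a palette entry exactly.

```python
import numpy as np

# Hypothetical palette (one RGB row per index) and a tiny 2x3 index image
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 0, 255]], dtype=np.uint8)
indices = np.array([[0, 1, 2], [2, 1, 0]], dtype=np.uint8)

# Applying the palette: each index is replaced by its colour
rgb = palette[indices]   # shape (2, 3, 3)
bgr = rgb[..., ::-1]     # channel order OpenCV would use

def colours_to_indices(rgb_image, palette):
    """Map each RGB pixel back to the first palette entry it matches exactly."""
    matches = (rgb_image[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    return matches.argmax(axis=-1).astype(np.uint8)

recovered = colours_to_indices(rgb, palette)
```

If two palette entries share the same colour, the reverse lookup cannot tell them apart, which is one more reason to read the indices directly with Pillow.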

Detect handwritten characters in boxes from a filled form using Fourier transforms

I am trying to extract handwritten characters from boxes. The scanning of the forms is not consistent, so the width and height of the boxes are not constant either.
Here is a part of the form.
My current approach:
1. Extract horizontal lines
2. Extract vertical lines
3. Combine both the above images
4. Find contours (using OpenCV)
This approach gives me most of the boxes. But, when the box is filled with characters like "L" or "I", the vertical line in the character is also getting extracted as a part of vertical lines extraction. Hence the contours also get messed up.
Since the boxes are arranged periodically, is there a way to extract the boxes using Fast Fourier transforms?
I recently came up with a Python package that deals with this exact problem.
I called it BoxDetect. After installing it with:
pip install boxdetect
usage may look somewhat like this (you need to adjust the parameters for different forms):
from boxdetect import config
config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2
from boxdetect.pipelines import get_boxes
image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
You might want to check the thread below for more info:
How to detect all boxes for inputting letters in forms for a particular field?
The Fourier transform is the last thing I would think of.
I'd rather try with a Hough line detector to get long lines or as you did, with edge detection, but I would reconstruct the grids explicitly, finding their pitch and the exact locations of the rows/columns, hence every individual cell.
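As a rough illustration of "finding the pitch", suppose the line detector has already produced the x coordinates of the vertical lines, with a few spurious ones caused by strokes such as "L" or "I". The sketch below estimates the spacing from the median gap and regenerates evenly spaced lines. The function name regularize_grid is made up, and the sketch assumes the first and last detected positions are genuine grid lines.

```python
from statistics import median

def regularize_grid(positions):
    """Estimate the grid pitch from noisy line positions and rebuild an even grid."""
    xs = sorted(positions)
    if len(xs) < 3:
        raise ValueError("need at least three line positions")
    # The median gap is a rough pitch that tolerates a few spurious lines
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    rough_pitch = median(gaps)
    # Refine the pitch so that a whole number of cells spans first to last line
    span = xs[-1] - xs[0]
    n_cells = max(1, round(span / rough_pitch))
    pitch = span / n_cells
    grid = [xs[0] + i * pitch for i in range(n_cells + 1)]
    return pitch, grid

# One spurious line at x=62 among lines spaced roughly 40 px apart
pitch, grid = regularize_grid([10, 50, 62, 90, 131, 170])
```

Running the same procedure on the horizontal lines gives the row positions, and each pair of neighbouring grid lines then delimits one cell regardless of what is written inside it.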
You can try selecting the handwritten characters by color.
Example:
import cv2
import numpy as np
img=cv2.imread('YdUqv .jpg')
#convert to hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
#color definition
color_lower = np.array([105,80,60])
color_upper = np.array([140,255,255])
# select color objects
mask = cv2.inRange(hsv, color_lower, color_upper)
cv2.imwrite('hand.png', mask)
Result:

Parameter to isolate frames with colored lines

I'm writing code that should detect frames in a video that have colored lines. I'm new to OpenCV and would like to know whether I should evaluate saturation, entropy, RGB intensity, etc. The lines, as shown in the pictures, come in every color and density. When black and white, but they are all the same color inside a given frame. Any advice?
Regular frame:
Example 1:
Example 2:
You can use something like this to get the mean Saturation and see that it is lower for your greyscale image and higher for your colour ones:
#!/usr/bin/env python3
import cv2
# Open image as 3-channel BGR (an alpha channel would break the BGR2HSV conversion)
im = cv2.imread('a.png', cv2.IMREAD_COLOR)
# Convert to HSV
hsv=cv2.cvtColor(im,cv2.COLOR_BGR2HSV)
# Get mean Saturation - I use index "1" because Hue is index "0" and Value is index "2"
meanSat = hsv[...,1].mean()
Results
first image (greyish): meanSat = 78
second image (blueish): meanSat = 162
third image (reddish): meanSat = 151
If it is time-critical, I guess you could just calculate for a small extracted patch since the red/blue lines are all over the image anyway.
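A sketch of how this could become a per-frame test, including the optional patch. It recomputes saturation with NumPy in the same 0 to 255 scale that OpenCV uses for 8-bit images, so the block has no OpenCV dependency. The function names and the threshold of 120 are assumptions for illustration; the threshold was simply picked between the greyish (78) and coloured (151, 162) results above and would need tuning on real footage.

```python
import numpy as np

def mean_saturation(bgr):
    """Mean HSV saturation (0..255) of an 8-bit BGR image."""
    bgr = bgr.astype(np.float32)
    mx = bgr.max(axis=-1)
    mn = bgr.min(axis=-1)
    sat = np.zeros_like(mx)
    nonzero = mx > 0  # black pixels keep saturation 0
    sat[nonzero] = 255.0 * (mx[nonzero] - mn[nonzero]) / mx[nonzero]
    return float(sat.mean())

def has_coloured_lines(bgr, threshold=120.0, patch=None):
    """Flag a frame as coloured if its mean saturation exceeds the threshold.

    patch, if given, is (top, bottom, left, right) and limits the test to that region.
    """
    if patch is not None:
        top, bottom, left, right = patch
        bgr = bgr[top:bottom, left:right]
    return mean_saturation(bgr) > threshold
```

In the video loop, each frame returned by cap.read() can be passed straight to has_coloured_lines.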

Overlay smaller image in a larger image in OpenCV

I would like to replace a part of an image with my own image in OpenCV.
I used cvGetPerspectiveTransform() to get a warp matrix, together with cvAnd() and cvOr(), but could not get it to work.
This is the code that currently displays the image and a white polygon where the replacement image should go. I would like to replace the white polygon with a picture of any dimension, scaled to fit the region pointed to.
The code is in JavaCV, but I could convert C code to Java if that is what gets posted.
grabber.start();
while (isDisp() && (image = grabber.grab()) != null) {
    if (dst_corners != null) { // corners of the image to be replaced
        CvPoint points = new CvPoint((byte) 0, dst_corners, 0, dst_corners.length);
        cvFillConvexPoly(image, points, 4, CvScalar.WHITE, 1, 0); // white polygon covering the replacement image
    }
    correspondFrame.showImage(image);
}
Any pointers to this will be very helpful.
Update:
I used warpmatrix with this code and I get a black spot for the overlay image
cvSetImageROI(image, cvRect(x1,y1, overlay.width(), overlay.height()));
CvPoint2D32f p = new CvPoint2D32f(4);
CvPoint2D32f q = new CvPoint2D32f(4);
q.position(0).x(0);
q.position(0).y(0);
q.position(1).x((float) overlay.width());
q.position(1).y(0);
q.position(2).x((float) overlay.width());
q.position(2).y((float) overlay.height());
q.position(3).x(0);
q.position(3).y((float) overlay.height());
p.position(0).x((int)Math.round(dst_corners[0]));
p.position(0).y((int)Math.round(dst_corners[1]));
p.position(1).x((int)Math.round(dst_corners[2]));
p.position(1).y((int)Math.round(dst_corners[3]));
p.position(3).x((int)Math.round(dst_corners[4]));
p.position(3).y((int)Math.round(dst_corners[5]));
p.position(2).x((int)Math.round(dst_corners[6]));
p.position(2).y((int)Math.round(dst_corners[7]));
cvGetPerspectiveTransform(q, p, warp_matrix);
cvWarpPerspective(overlay, image, warp_matrix);
I get a black spot for the overlay image and even though the original image is a polygon with 4 vertices the overlay image is set as a rectangle. I believe this is because of the ROI. Could anyone please tell me how to fit the image as is and also why I am getting a black spot instead of the overlay image.
I think cvWarpPerspective is what you are looking for.
So instead of doing
CvPoint points = new CvPoint((byte) 0,dst_corners,0,dst_corners.length);
cvFillConvexPoly(image,points, 4, CvScalar.WHITE, 1, 0);//white polygon covering the replacement image
Try
cvWarpPerspective(yourimage, image, M, image.size(), INTER_CUBIC, BORDER_TRANSPARENT);
Where M is the matrix you get from cvGetPerspectiveTransform.
One way to do it is to scale the pic to the white polygon size and then copy it to the grabbed image after setting its Region of Interest (ROI).
Your code should look like this:
resize(pic, resizedImage, resizedImage.size(), 0, 0, interpolation); //resizedImage should have the points size
cvSetImageROI(image, cvRect(the points coordinates));
cvCopy(resizedImage,image);
cvResetImageROI(image);
I hope that helps.
Best regards,
Daniel
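For comparison, the same resize-then-copy-into-ROI idea looks like this with NumPy arrays, which is how OpenCV images are represented in Python. This is only a sketch for an axis-aligned rectangle: the helper names are made up, and the nearest-neighbour resize stands in for cv2.resize so that the block stays self-contained. A region with four arbitrary corners still needs the perspective warp described above.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of an H x W x C image."""
    rows = np.arange(new_h) * img.shape[0] // new_h
    cols = np.arange(new_w) * img.shape[1] // new_w
    return img[rows[:, None], cols]

def paste(frame, overlay, x, y, w, h):
    """Return a copy of frame with overlay scaled into the rectangle at (x, y) of size w x h."""
    out = frame.copy()
    # Slicing plays the role of cvSetImageROI + cvCopy
    out[y:y + h, x:x + w] = resize_nearest(overlay, h, w)
    return out

frame = np.zeros((10, 10, 3), dtype=np.uint8)
overlay = np.full((2, 2, 3), 255, dtype=np.uint8)
result = paste(frame, overlay, x=3, y=4, w=4, h=2)
```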
