As shown in the image above, when I read it with Pillow:
import numpy as np
from PIL import Image

label = Image.open('example.png')
print(np.unique(np.array(label)))
The numbers are within the range [0, 34], which is correct. However, when I read it with cv2:
import cv2
label = cv2.imread('example.png')
print(np.unique(label))
The numbers are within [0, 255], which is not correct for my application. How can I align the behavior of cv2 with that of PIL?
Also, when I checked the MATLAB example code that parses this image, it is written like this:
[labels, color_mappings] = imread('example.png')
It seems that the PNG file has two data fields: one holds the label values ranging from 0 to 34, and the other holds the color pixels. How can I parse it with cv2?
I think Dan has the right answer, but if you want to do some "quick and dirty" testing, you could use the following code to:
convert your palette image into a single channel greyscale PGM image of the palette indices that OpenCV can read without any extra libraries, and a separate palette file that you can apply back afterwards
load back a PGM file of the indices that OpenCV may have altered, and reapply the saved palette
#!/usr/bin/env python3
import numpy as np
from PIL import Image
# Open the image and make sure it is in palette (P) mode
im = Image.open('image.png').convert('P')
# Extract the palette and save it as CSV
np.array(im.getpalette()).tofile('palette.csv', sep=',')
# Save the palette indices as a single-channel PGM image that OpenCV can read
na = np.array(im)
Image.fromarray(na).save('indices.pgm')
So that will have split image.png into indices.pgm, which OpenCV can read as a single-channel image, and palette.csv, which we can reload later.
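As a quick sanity check (a minimal sketch, reusing the filenames above), OpenCV should now report values in the [0, 34] range when it reads the index image:

import cv2
import numpy as np

indices = cv2.imread('indices.pgm', cv2.IMREAD_GRAYSCALE)
print(np.unique(indices))   # palette indices, e.g. 0..34, not expanded colours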
And here is the second part, where we rebuild the image from indices.pgm and palette.csv:
# First load indices
im = Image.open('indices.pgm')
# Now load palette
palette = np.fromfile('palette.csv',sep=',').astype(np.uint8)
# Put palette back into image
im.putpalette(palette)
# Save
im.save('result.png')
Remember not to use any interpolation other than nearest-neighbour (cv2.INTER_NEAREST) in OpenCV, else you will introduce new indices/colours not present in the original image.
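For example (just a sketch, the 512x512 target size is made up), resizing the index image without introducing new indices would look like:

import cv2

indices = cv2.imread('indices.pgm', cv2.IMREAD_GRAYSCALE)
# Nearest-neighbour keeps every output pixel equal to some existing palette index
resized = cv2.resize(indices, (512, 512), interpolation=cv2.INTER_NEAREST)
cv2.imwrite('indices_resized.pgm', resized)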
Keywords: Python, PNG, image processing, palette, palette indices, palette index
I have a set of images similar to this one:
And for each image, I have a text file with bounding box regions expressed in normalized pixel values, YOLOv5 format (a text document with rows of type: class, x_center, y_center, width, height). Here's an example:
3 0.1661542727623449 0.6696164480452673 0.2951388888888889 0.300925925925926
3 0.41214353459362196 0.851908114711934 0.2719907407407405 0.2961837705761321
I'd like to obtain a new dataset of masked images, where the bounding box area from the original image gets converted into white pixels, and the rest gets converted into black pixels. This would be an example of the output image:
I'm sure there is a way to do this in PIL (Pillow) in Python, but I just can't seem to find a way.
Would somebody be able to help?
Kindest thanks!
So, here's the answer:
import os
import numpy as np
from PIL import Image

# labPath and filename refer to the label file being processed; the images here are 1152x1152
with open(os.path.join(labPath, filename), 'r') as label:
    lines = label.read().split('\n')

square = np.zeros((1152, 1152), dtype=np.uint8)
for line in lines:
    if line != '':
        line = line.split()  # line: class, x_center, y_center, w, h (normalized)
        left = int((float(line[1]) - 0.5 * float(line[3])) * 1152)
        bot = int((float(line[2]) + 0.5 * float(line[4])) * 1152)
        top = int(bot - float(line[4]) * 1152)
        right = int(left + float(line[3]) * 1152)
        square[top:bot, left:right] = 255

square_img = Image.fromarray(square).convert("L")
square_img.save('mask.png')  # example output name; adapt per label file
Let me know if you have any questions!
I've been writing code to read text using OpenCV and Tesseract on a Raspberry Pi. It is working well, but I would like to filter out only the title of the text, that is, differentiate the smallest characters from the biggest and extract only the biggest ones.
Is there any way to achieve this differentiation?
Here is the initial code:
import cv2
import pytesseract

cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)

while True:
    success, img = cap.read()
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imshow("Video", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.imwrite("NewPicture.jpg", img)
        break

text = pytesseract.image_to_string(img, config='--oem 3 --psm 11')
print(text)
Example image
A quick search of the pytesseract documentation shows that it has:
# Get verbose data including boxes, confidences, line and page numbers
print(pytesseract.image_to_data(Image.open('test.png')))
You may get quite a bit of data using this API and then filter on the size of the bounding boxes.
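A rough sketch of that filtering (the 40-pixel height threshold is just a guess to tune for your titles):

import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread('NewPicture.jpg')
data = pytesseract.image_to_data(img, config='--oem 3 --psm 11', output_type=Output.DICT)

# Keep only words whose bounding box is tall enough to be title text
title_words = [data['text'][i] for i in range(len(data['text']))
               if data['text'][i].strip() and int(data['height'][i]) > 40]
print(' '.join(title_words))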
I have searched everywhere on Google and the forums, but I couldn't find what I wanted. I hope someone can help me here...
I generated a 2D map from OctoMap using map_server's map_saver from a PCD file. It generated two files, a PGM and a YAML file.
However, the generated PGM file does not have grid lines on it.
My question is: is it possible to show grid lines on the image generated by map_saver? Or is there any other way to generate an image with grid lines from a 2D map?
You may try this; change the map dimensions, and if you have a different orientation, rotate the map accordingly.
import pylab as plt
import numpy as np

# Load the image
img = np.array(plt.imread("g_map.pgm"))

# assume map dimensions of 20m x 20m
map_dim_x = 20
map_dim_y = 20

# relationship between pixels and map units (pixels per metre)
dx, dy = int(img.shape[0] / map_dim_x), int(img.shape[1] / map_dim_y)

grid_color = 0
img[:, ::dy] = grid_color   # vertical grid lines
img[::dx, :] = grid_color   # horizontal grid lines

plt.imshow(img)
plt.show()
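If you also want to keep the gridded map as a file (the output name is just an example), you could add:

plt.imsave("g_map_with_grid.png", img, cmap="gray")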
I am trying to extract handwritten characters from boxes. The scanning of the forms is not consistent, so the width and height of the boxes are also not constants.
Here is a part of the form.
My current approach:
1. Extract horizontal lines
2. Extract vertical lines
3. Combine both the above images
4. Find contours ( used opencv)
This approach gives me most of the boxes. But when a box is filled with characters like "L" or "I", the vertical stroke of the character also gets extracted as part of the vertical-line extraction, so the contours get messed up.
Since the boxes are arranged periodically, is there a way to extract the boxes using Fast Fourier transforms?
I recently came up with a python package that deals with this exact problem.
I called it BoxDetect and after installing it through:
pip install boxdetect
It may look somewhat like this (you need to adjust the parameters for different forms):
from boxdetect import config
config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2
from boxdetect.pipelines import get_boxes
image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
You might want to check below thread for more info:
How to detect all boxes for inputting letters in forms for a particular field?
The Fourier transform is the last thing I would think of.
I'd rather try a Hough line detector to get the long lines, or, as you did, edge detection; but I would reconstruct the grid explicitly, finding its pitch and the exact locations of the rows/columns, and hence every individual cell.
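To illustrate the Hough idea, a minimal sketch (the filename, Canny thresholds, minLineLength and so on are assumptions you would tune to your scans):

import cv2
import numpy as np

img = cv2.imread('form.png', cv2.IMREAD_GRAYSCALE)  # 'form.png' is a placeholder for your scan
edges = cv2.Canny(img, 50, 150)

# Probabilistic Hough transform: keep only long lines, so the short vertical
# strokes inside characters like "L" or "I" are ignored
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=200, maxLineGap=5)

grid = np.zeros_like(img)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(grid, (int(x1), int(y1)), (int(x2), int(y2)), 255, 1)
# 'grid' now contains only the long grid lines; find contours on it as before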
You can try to select the handwritten characters by color.
example:
import cv2
import numpy as np

img = cv2.imread('YdUqv .jpg')

# convert to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# color definition (a blue hue range on OpenCV's 0-179 scale)
color_lower = np.array([105, 80, 60])
color_upper = np.array([140, 255, 255])

# select the colored objects
mask = cv2.inRange(hsv, color_lower, color_upper)
cv2.imwrite('hand.png', mask)
Result:
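If you want the characters themselves rather than just the binary mask, you could apply the mask back to the original image (a small follow-up sketch):

# Keep only the pixels selected by the colour mask
result = cv2.bitwise_and(img, img, mask=mask)
cv2.imwrite('hand_color.png', result)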
I'm writing code that should detect frames in a video that have colored lines. I'm new to OpenCV and would like to know if I should evaluate saturation, entropy, RGB intensity, etc. The lines, as shown in the pictures, come in every color and density; some are black and white, but they are all the same color within a given frame. Any advice?
Regular frame:
Example 1:
Example 2:
You can use something like this to get the mean Saturation and see that it is lower for your greyscale image and higher for your colour ones:
#!/usr/bin/env python3
import cv2
# Open image
im = cv2.imread('a.png', cv2.IMREAD_UNCHANGED)
# Convert to HSV
hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
# Get mean Saturation - I use index "1" because Hue is index "0" and Value is index "2"
meanSat = hsv[..., 1].mean()
Results
first image (greyish): meanSat = 78
second image (blueish): meanSat = 162
third image (reddish): meanSat = 151
If it is time-critical, I guess you could just calculate it for a small extracted patch, since the red/blue lines are all over the image anyway.
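A minimal way to turn that into a per-frame decision (the threshold of 100 and the filename are just assumptions to calibrate on your own footage):

import cv2

cap = cv2.VideoCapture('video.mp4')
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    if hsv[..., 1].mean() > 100:
        print(f'frame {frame_idx}: coloured lines detected')
    frame_idx += 1
cap.release()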