Changing number of channels in image - image-processing

I am trying to reduce an image to a single channel. For example, I have an image that I need to resize to 98×98, which I have done with this code:
from PIL import Image
from skimage.transform import resize
import cv2
image = Image.open('//imagepath')
new_image =image.resize((98, 98))
This gives me a shape of (98, 98, 3), but I need (98, 98, 1). I tried this code:
new_image = cv2.cvtColor(new_image, cv2.COLOR_BGR2GRAY)
but I get an error. How can I solve this?
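One possible way to get the (98, 98, 1) shape, sketched under the assumption that the error arises because `cv2.cvtColor` expects a NumPy array while `new_image` is still a PIL `Image`: stay in Pillow for the grayscale conversion and add the channel axis with NumPy. The helper name `to_single_channel` and the synthetic test image are made up for illustration.

```python
import numpy as np
from PIL import Image


def to_single_channel(image, size=(98, 98)):
    """Resize a PIL image and return a (height, width, 1) uint8 array."""
    resized = image.resize(size)
    gray = resized.convert("L")       # "L" = 8-bit single-channel grayscale
    arr = np.asarray(gray)            # shape (height, width)
    return arr[:, :, np.newaxis]      # shape (height, width, 1)


# Stand-in for Image.open(...): a synthetic RGB image.
demo = Image.new("RGB", (200, 150), color=(10, 120, 240))
out = to_single_channel(demo)
print(out.shape)  # (98, 98, 1)
```

If you would rather keep the OpenCV call, convert first with `np.array(new_image)` and use `cv2.COLOR_RGB2GRAY`, since Pillow stores channels in RGB order.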

Related

How can I convert bounding box pixels of an image to white, and the background to black?

I have a set of images similar to this one:
And for each image, I have a text file with bounding box regions expressed in normalized pixel values, YOLOv5 format (a text document with rows of type: class, x_center, y_center, width, height). Here's an example:
3 0.1661542727623449 0.6696164480452673 0.2951388888888889 0.300925925925926
3 0.41214353459362196 0.851908114711934 0.2719907407407405 0.2961837705761321
I'd like to obtain a new dataset of masked images, where the bounding box area from the original image gets converted into white pixels, and the rest gets converted into black pixels. This would be an example of the output image:
I'm sure there is a way to do this in PIL (Pillow) in Python, but I just can't seem to find a way.
Would somebody be able to help?
Kindest thanks!
So here's the answer:
import os
import numpy as np
from PIL import Image
# labPath and filename must point to the YOLO label file of the image
with open(os.path.join(labPath, filename), 'r') as label:
    lines = label.read().split('\n')
# black canvas; this assumes 1152x1152 images
square = np.zeros((1152, 1152))
for line in lines:
    if line != '':
        line = line.split()  # line: class, x, y, w, h
        left = int((float(line[1]) - 0.5 * float(line[3])) * 1152)
        bot = int((float(line[2]) + 0.5 * float(line[4])) * 1152)
        top = int(bot - float(line[4]) * 1152)
        right = int(left + float(line[3]) * 1152)
        square[top:bot, left:right] = 255  # paint the box white
square_img = Image.fromarray(square)
square_img = square_img.convert("L")
Let me know if you have any questions!
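Since the question asked for a Pillow way, here is a rough alternative sketch that draws the boxes with `ImageDraw` and takes the image size as a parameter instead of hard-coding 1152. The function name `yolo_boxes_to_mask` is hypothetical, and the two label rows are the ones quoted in the question.

```python
from PIL import Image, ImageDraw


def yolo_boxes_to_mask(label_text, width, height):
    """Return a black "L" image with every YOLO box filled in white."""
    mask = Image.new("L", (width, height), 0)
    draw = ImageDraw.Draw(mask)
    for row in label_text.splitlines():
        parts = row.split()
        if len(parts) != 5:
            continue  # skip blank or malformed rows
        _, xc, yc, w, h = (float(p) for p in parts)
        # Convert normalized centre/size to integer pixel corners.
        left = int((xc - w / 2) * width)
        top = int((yc - h / 2) * height)
        right = int((xc + w / 2) * width)
        bottom = int((yc + h / 2) * height)
        draw.rectangle([left, top, right, bottom], fill=255)
    return mask


labels = """3 0.1661542727623449 0.6696164480452673 0.2951388888888889 0.300925925925926
3 0.41214353459362196 0.851908114711934 0.2719907407407405 0.2961837705761321"""
mask = yolo_boxes_to_mask(labels, 1152, 1152)
```

Calling `mask.save(...)` with a path of your choice writes the result. Note that `ImageDraw.rectangle` includes the right and bottom edge pixels, so boxes may be one pixel larger than with the NumPy slicing above.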

OpenCV - Circle Detection Too Sensitive Even with Blur

Hi, just posting this on behalf of my 10yo son. He's working on a Python/OpenCV/GUI application and having some issues. Hoping someone might be able to point him in the right direction. Information as per below (maybe he needs to take a different approach?)
My project currently runs without errors, but the code isn't doing exactly what I want. I cannot tell whether the blur is too strong, whether the blur is making the circle detection more sensitive, or whether it is something else. My code is below.
I am trying to make the circle detection less sensitive by using a blur, but I cannot tell what it is doing because there is no error.
What I want it to do is:
blur the image
ensure the circle detection is not too sensitive (not too many circles)
show the unblurred image with the circles detected on the blurred image drawn on it
For example, I should be able to detect moon craters.
import tkinter as tk
from tkinter import filedialog
from PIL import ImageTk, Image
import numpy as np
import cv2
root = tk.Tk()
root.title("Circle detecter")
root.geometry("1100x600")
root.iconbitmap('C:/Users/brett/')
def open():
    global my_image
    filename = filedialog.askopenfilename(initialdir="images", title="Select A File", filetypes=(("jpg files", "*.jpg"),("all files", "*.*")))
    my_label.config(text=filename)
    my_image = Image.open(filename)
    tkimg = ImageTk.PhotoImage(my_image)
    my_image_label.config(image=tkimg)
    my_image_label.image = tkimg # save a reference of the image
def find_circles():
    # convert PIL image to OpenCV image
    circles_image = np.array(my_image.convert('RGB'))
    gray_img = cv2.cvtColor(circles_image, cv2.COLOR_BGR2GRAY)
    img = cv2.medianBlur(gray_img, 5)
    circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, 1, 20,
                               param1=20, param2=60, minRadius=20, maxRadius=200)
    if circles is not None:
        circles = np.uint16(np.around(circles))
        for i in circles[0]:
            # draw the outer circle
            cv2.circle(circles_image, (i[0],i[1]), i[2], (0,255,0), 2)
            # draw the center of the circle
            cv2.circle(circles_image, (i[0],i[1]), 2, (0,0,255), 3)
        # convert OpenCV image back to PIL image
        image = Image.fromarray(circles_image)
        # update shown image
        my_image_label.image.paste(image)
tk.Button(root, text="Load Image", command=open).pack()
tk.Button(root, text="Find circles", command=find_circles).pack()
# for the filename of selected image
my_label = tk.Label(root)
my_label.pack()
# for showing the selected image
my_image_label = tk.Label(root)
my_image_label.pack()
root.mainloop()

How to filter bigger font sizes of a text?

I've been writing code to read text using OpenCV and Tesseract on a Raspberry Pi. It works well, but I would like to extract only the title of the text, that is, distinguish the smallest characters from the biggest and keep only the biggest ones.
Is there any way to achieve this differentiation?
Here is the initial code:
import cv2
import pytesseract
cap = cv2.VideoCapture(0)
cap.set(3,640)
cap.set(4,480)
while True:
    success, img = cap.read()
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imshow("Video",img)
    if cv2.waitKey(1) & 0xFF ==ord('q'):
        cv2.imwrite("NewPicture.jpg",img)
        break
text = pytesseract.image_to_string(img, config='--oem 3 --psm 11')
print(text)
Example image
A quick search of the pytesseract documentation shows that it has:
# Get verbose data including boxes, confidences, line and page numbers
print(pytesseract.image_to_data(Image.open('test.png')))
This API returns quite a bit of data, which you can then filter by the size of the bounding boxes.
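As a rough sketch of that filtering step: the function below keeps only the words whose box height is close to the tallest word found. It assumes the data was requested as a dictionary, i.e. `pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)`, which has parallel `text` and `height` lists. The name `largest_words`, the `ratio` cut-off and the sample values are all made up for illustration.

```python
def largest_words(data, ratio=0.8):
    """Keep words whose box height is at least `ratio` times the tallest word.

    `data` is a dict shaped like the result of
    pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT).
    """
    # Entries with empty text are page/block/line boxes, not words: skip them.
    words = [
        (text.strip(), height)
        for text, height in zip(data["text"], data["height"])
        if text.strip()
    ]
    if not words:
        return []
    tallest = max(height for _, height in words)
    return [text for text, height in words if height >= ratio * tallest]


# Hand-written sample standing in for real OCR output (values are made up).
sample = {
    "text": ["", "BIG", "TITLE", "small", "body", "text"],
    "height": [480, 60, 58, 18, 17, 18],
}
print(largest_words(sample))  # ['BIG', 'TITLE']
```

Word box height also depends on which letters a word contains (ascenders and descenders), so the threshold probably needs some tuning on real pages.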

why does reading image with cv2 has different behavior from PIL?

As shown in the image above, when I read it with Pillow:
import numpy as np
from PIL import Image
label = Image.open('example.png')
print(np.unique(np.array(label)))
The values are within the range [0, 34], which is correct. However, when I read it with cv2:
import cv2
label = cv2.imread('example.png')
print(np.unique(label))
The values are within [0, 255], which is not correct for my application. How can I make cv2 behave the same as PIL?
Also, when I checked the MATLAB example code that parses this image, it is written like this:
[labels, color_mappings] = imread('example.png')
It seems that the PNG file has two data fields: one with values ranging from 0 to 34, and another with the color pixels. How can I parse it with cv2?
I think Dan has the right answer, but if you want to do some "quick and dirty" testing, you could use the following code to:
convert your palette image into a single channel greyscale PGM image of the palette indices that OpenCV can read without any extra libraries, and a separate palette file that you can apply back afterwards
load back a PGM file of the indices that OpenCV may have altered, and reapply the saved palette
#!/usr/bin/env python3
import numpy as np
from PIL import Image
# Open palette image and remove pointless alpha channel
im = Image.open('image.png').convert('P')
# Extract palette and save as CSV
np.array(im.getpalette()).tofile('palette.csv',sep=',')
# Save palette indices as single channel PGM image that OpenCV can read
na = np.array(im)
Image.fromarray(na).save('indices.pgm')
So that will have split image.png into indices.pgm that OpenCV can read as a single channel image and palette.csv that we can reload later.
And here is the second part, where we rebuild the image from indices.pgm and palette.csv
import numpy as np
from PIL import Image
# First load indices
im = Image.open('indices.pgm')
# Now load palette
palette = np.fromfile('palette.csv',sep=',').astype(np.uint8)
# Put palette back into image
im.putpalette(palette)
# Save
im.save('result.png')
Remember not to use any interpolation other than nearest-neighbour (cv2.INTER_NEAREST) in OpenCV, or else you will introduce new colours not present in the original image.
Keywords: Python, PNG, image processing, palette, palette indices, palette index
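If a round trip through PGM is more than you need, another option is to let Pillow read the palette indices and hand the resulting NumPy array to OpenCV. The sketch below assumes the file is a palette ("P" mode) PNG, which is why `cv2.imread` returns looked-up colours in [0, 255] rather than indices. The tiny 2x2 file it builds is only a stand-in for example.png, and its palette values are arbitrary.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Build a tiny palette ("P" mode) PNG as a stand-in for example.png.
demo = Image.new("P", (2, 2))
demo.putdata([0, 1, 2, 34])
demo.putpalette([(i * 7) % 256 for i in range(768)])  # arbitrary colours
path = os.path.join(tempfile.mkdtemp(), "example.png")
demo.save(path)

label = Image.open(path)
index_array = np.array(label)                   # palette indices, shape (2, 2)
colour_array = np.array(label.convert("RGB"))   # looked-up colours, shape (2, 2, 3)
print(label.mode)                               # P
print(np.unique(index_array).tolist())          # [0, 1, 2, 34]
```

`index_array` is a plain `uint8` array, so it can be passed to OpenCV functions directly; `label.getpalette()` gives the colour table that MATLAB returns as its second output.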

Show multiple images in same window with Python OpenCV?

I want to display the original image on the left and the grayscale image on the right. In my code below, I create the grayscale image and a window, but I couldn't put the grayscale image on the right side. How can I do this?
import cv
import time
from PIL import Image
import sys
filePath = raw_input("file path: ")
filename = filePath
img = cv.LoadImage(filename)
imgGrayScale = cv.LoadImage(filename, cv.CV_LOAD_IMAGE_GRAYSCALE) # create grayscale image
imgW = img.width
imgH = img.height
cv.NamedWindow("title", cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage("title", img )
cv.ResizeWindow("title", imgW * 2, imgH)
cv.WaitKey()
First concatenate the images either horizontally (across columns) or vertically (across rows) and then display it as a single image.
import numpy as np
import cv2
from skimage.data import astronaut
# astronaut() returns RGB, while cv2.imshow expects BGR
img = cv2.cvtColor(astronaut(), cv2.COLOR_RGB2BGR)
numpy_horizontal_concat = np.concatenate((img, img), axis=1)
cv2.imshow('Numpy Horizontal Concat', numpy_horizontal_concat)
cv2.waitKey(0)  # keep the window open until a key is pressed
As far as I know, one window, one image. So create a new image with imgW*2 and copy the contents of the grayscale image at the region starting from (originalimage.width,0). The ROI capabilities may be helpful to you.
