I'm new to OpenCV and looking for direction on the best approach to reading a traditional thermometer using computer vision. Any guidance, general approach, or sample code? Thanks for any consideration on this very broad question.
More specifically, how do you narrow your contours down to the area of interest, for instance so that there are bounding boxes around just the numbers in the attached image?
import cv2
import numpy as np

img = cv2.imread('thermometer.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
keys = [i for i in range(48, 58)]
# cnts = sorted(contours, key=cv2.contourArea, reverse=True)[:10]
for cnt in contours:
    # if cv2.contourArea(cnt) > 50:
    x, y, w, h = cv2.boundingRect(cnt)
    roi = img[y:y + h, x:x + w]
    roismall = cv2.resize(roi, (10, 10))
    key = cv2.waitKey(0)

Yeah, that is pretty general. I don't know what computer vision is, but I'm guessing it's some software that looks at said thermometer.
So first of all, think in terms of what this software can understand. I'm going to guess that it can pick up on a color change, so you should be able to tell where the color goes from red to white (or whatever color the thermometer is where it is not red). The program may or may not be smart enough to read the numbers indicating the temperature along the thermometer (I'm assuming this is a vertical thermometer). If the numbers are written on glass or on a curved surface, the software probably won't be able to read them. However, if they are black digits on a flat white background, you may be in luck. Can you then find the number closest to where the red transitions to white? If not, you may need to calibrate ahead of time which temperature corresponds to which height. In that case you will essentially be ignoring the written numbers and hardcoding them into your program.
Good luck!

Assuming it's a static image, you can calculate a scale of x pixels = y degrees (approximately, at least). You can detect the high point of the mercury with simple colour detection: convert the image to HSV, filter with inRange to leave just the red, then find the smallest y value and check it against your scale.
Assuming a constant linear scale, where each x pixels along the y-axis equal y degrees, will be easier than trying to detect the digits and read them. Though, to answer your question: as your digits are constant, I'd recommend cropping around the known positions of the numbers and template matching. But that's still pointless, as the numbers won't change, so you can just hardcode the positions!
Or, if it's a real-world scenario, either use a digital thermometer and detect the LCD digits with template matching, or connect a thermometer to the computer.


computer vision - Counting small circles in an image

The image below has many circles. Click and zoom in to see the circles.
What I want is to count the circles using any free language, such as Python.
Is there a function or an idea for doing it?
Edit: I came up with a better solution, partially inspired by this answer below. I thought of this method originally (as noted in the OP comments) but I decided against it. The original image was just not good enough quality for it. However I improved that method and it works brilliantly for the better quality image. The original approach is first, and then the new approach at the bottom.
First approach
So here's a general approach that seems to work well, but definitely just gives estimates. This assumes that circles are roughly the same size.
First, the image is mostly blue---so it seems reasonable to just do the analysis on the blue channel. Thresholding the blue channel, in this case, using Otsu thresholding (which determines an optimal threshold value without input) seems to work very well. This isn't too much of a surprise since the distribution of color values is pretty much binary. Check the mask that results from it!
Then, do a connected component analysis on the mask to get the area of each component (component = white blob in the mask). The statistics returned from connectedComponentsWithStats() give (among other things) the area, which is exactly what we need. Then we can simply count the circles by estimating how many circles fit in a given component based on its area. Also note that I'm taking the statistics for every label except the first one: this is the background label 0, and not any of the white blobs.
Now, how large in area is a single circle? It would be best to let the data tell us. So you could compute a histogram of all the areas, and since there are more single circles than anything else, there will be a high concentration around 250-270 pixels or so for the area. Or you could just take an average of all the areas between something like 50 and 350 which should also get you in a similar ballpark.
Really, in this histogram you can see the demarcations between single circles, double circles, triple, and so on quite easily. Only the larger components will give pretty rough estimates. In fact, the area doesn't seem to scale exactly linearly: blobs of two circles are slightly larger than two single circles, blobs of three are larger still than three single circles, and so on. This makes it a little difficult to estimate nicely, but rounding should still keep us close. If you want, you could include a small multiplication parameter that increases as the area increases to account for that, but that would be hard to quantify without going through the histogram, so I didn't worry about it.
A single circle area divided by the average single circle area should be close to 1, and the area of a 5-circle group divided by the average circle area should be close to 5. This also means that small insignificant components, that are 1 or 10 or even 100 pixels in area, will not count towards the total: with an average circle area of roughly 260 pixels, 100/avg_circle_size < 1/2, so those round down to a count of 0. Thus I should just be able to take all the component areas, divide them by the average circle size, round, and get a decent estimate by summing them all up.
import cv2
import numpy as np

img = cv2.imread('circles.png')
# With THRESH_OTSU the threshold is computed automatically, so the value passed here is ignored
mask = cv2.threshold(img[:, :, 0], 255, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Index 2 of the result is the per-label statistics array
stats = cv2.connectedComponentsWithStats(mask, 8)[2]
label_area = stats[1:, cv2.CC_STAT_AREA]  # skip label 0, the background

min_area, max_area = 50, 350  # min/max for a single circle
singular_mask = (min_area < label_area) & (label_area <= max_area)
circle_area = np.mean(label_area[singular_mask])

# Each component contributes round(area / single-circle area) circles
n_circles = int(np.sum(np.round(label_area / circle_area)))
print('Total circles:', n_circles)
This code is simple and effective for rough counts.
However, there are definitely some assumptions here about the groups of circles compared to a normal circle size, and there are issues where circles that are at the boundaries will not be counted correctly (these aren't well defined---a two circle blob that is half cut off will look more like one circle---no clear way to count or not count these with this method). Further I just used automatic thresholding via Otsu here; you could get (probably better) results with more careful color filtering. Additionally in the mask generated by Otsu, some circles that are masked have a few pixels removed from their center. Morphology could add these pixels back in, which would give you a (slightly larger) more accurate area for the single circle components. Either way, I just wanted to give the general idea towards how you could easily estimate this with minimal code.
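The histogram idea mentioned earlier (letting the data pick the single-circle area instead of hand-choosing the 50 to 350 range) could look something like this. It is a sketch only: the function name `single_circle_area`, the bin width and the sample areas are assumptions made for illustration, not values from the actual image.

```python
import numpy as np

def single_circle_area(areas, bin_width=20, max_area=1000):
    """Estimate the single-circle area as the mean of the most populated histogram bin."""
    areas = np.asarray(areas, dtype=float)
    bins = np.arange(0, max_area + bin_width, bin_width)
    counts, edges = np.histogram(areas, bins=bins)
    peak = np.argmax(counts)  # singles are assumed to be the most common component
    in_peak = (areas >= edges[peak]) & (areas < edges[peak + 1])
    return areas[in_peak].mean()

# Made-up component areas: mostly singles, a few doubles and a triple, plus noise specks.
areas = [255, 260, 262, 265, 270, 258, 530, 545, 810, 12, 40]
unit = single_circle_area(areas)
total = int(np.sum(np.round(np.array(areas) / unit)))
print(unit, total)
```

The bin width matters: too narrow and the peak splits across neighbouring bins, too wide and doubles start leaking into the estimate.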
New approach
Before, the goal was to count circles. This new approach instead counts the centers of the circles. The general idea is you threshold and then flood fill from a background pixel to fill in the background (flood fill works like the paint bucket tool in photo editing apps), that way you only see the centers, as shown in this answer below.
However, this relies on global thresholding, which isn't robust to local lighting changes. This means that since some centers are brighter/darker than others, you won't always get good results with a single threshold.
Here I've created an animation to show looping through different threshold values; watch as some centers appear and disappear at different times, meaning you get different counts depending on the threshold you choose (this is just a small patch of the image, it happens everywhere):
Notice that the first blob to appear in the top left actually disappears as the threshold increases. However, if we actually OR each frame together, then each detected pixel persists:
But now every single speck appears, so we should clean up the mask each frame so that we remove single pixels as they come (otherwise they may build up and be hard to remove later). Simple morphological opening with a small kernel will remove them:
Applied over the whole image, this method works incredibly well and finds almost every single cell. There are only three false positives (detected blob that's not a center) and two misses I can spot, and the code is very simple. The final thing to do after the mask has been created is simply count the components, minus one for the background. The only user input required here is a single point to flood fill from that is in the background (seed_pt in the code).
img = cv2.imread('circles.png', 0)
seed_pt = (25, 25)
fill_color = 0
mask = np.zeros_like(img)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
for th in range(60, 120):
    prev_mask = mask.copy()
    mask = cv2.threshold(img, th, 255, cv2.THRESH_BINARY)[1]
    mask = cv2.floodFill(mask, None, seed_pt, fill_color)[1]
    mask = cv2.bitwise_or(mask, prev_mask)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
n_centers = cv2.connectedComponents(mask)[0] - 1
print('There are %d cells in the image.' % n_centers)
There are 874 cells in the image.
One possible solution would be to read the image using OpenCV, convert it to grayscale, then use Canny edge detection and perform contour finding. This will return a list of contours. It would look something like:
import cv2

image = cv2.imread('path-to-your-image')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# tweak the parameters of the GaussianBlur for best performance
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
# again, try different values here
edged = cv2.Canny(blurred, 20, 140)
# [-2] selects the contour list in both OpenCV 3.x (three return values) and 4.x (two)
contours = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
If all your images look like this, consider thresholding them, not necessarily with an automatic threshold-seeking algorithm like Otsu, but with a simple fixed threshold value. Before thresholding you have to convert your color input to grayscale, or take one of the color channels. Then, based on a few experiments with channels and threshold values, pick a threshold that leaves the circles with holes in the binary result. For your PNG image I found that a value of 81 (gray intensity ranges from 0 to 255) works well on the grayscale version of your input to produce such a binary image with the holes in place.
Then simply count those holes.
Holes can be found by seed-filling the white area connected to the image border. As a result you will have white hole connected components on a black background, so simply count them.
You can find more details here, and use Leptonica primitives to do the thresholding, hole counting and so on.

OMR: evaluate filled circle

I'm implementing an OMR system for test papers, but I've run into problems when determining which circles are filled. I've succeeded in getting these grayscale regions of interest.
The problems are:
- Binary thresholding (adaptive and fixed) followed by counting non-zero pixels gives a lot of errors because of the letters inside the circles and the varying brightness of photos taken with mobile cameras.
- I also tried the technique described in this survey, which uses the average grayscale value of a circle to mark it as filled or not, but the brightness of an image is not uniform because of the different light sources present when people take photos with their cameras, and I got a lot of wrong results.
- People also don't follow rules such as filling the whole circle, so the algorithm needs to be robust in such cases too.
Sample images
I already have about 10 GB of samples, so maybe machine learning or other statistical methods will be useful.
Does anybody know other methods to classify a circle as filled?
Since this is not a straightforward problem, it needs a lot of tweaking to make it robust. But I would like to suggest a good starting point. You can play with it and try to make it work.
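import numpy as np
import cv2

image_ori = cv2.imread("circle_opt.png")
lower_bound = np.array([0, 0, 0])
upper_bound = np.array([255, 255, 195])
image = image_ori
# Keep the pixels whose BGR values fall inside the bounds
mask = cv2.inRange(image_ori, lower_bound, upper_bound)
masked_red = cv2.bitwise_and(image, image, mask=mask)
kernel = np.ones((3, 3), np.uint8)
# Despite its name this is a morphological opening, and it is not used below
closing = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# The approximation flag was cut off in the post; CHAIN_APPROX_SIMPLE is assumed.
# [-2] selects the contour list in both OpenCV 3.x and 4.x.
contours = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]
# Sort the contours left to right by bounding-box x
contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[0])
print(len(contours))
for c in contours:
    (x, y), r = cv2.minEnclosingCircle(c)
    center = (int(x), int(y))
    r = int(r)
    if 10 <= r <= 15:
        # Reconstructed call: outline each bubble-sized contour in green
        cv2.circle(image, center, r, (0, 255, 0), 2)
# cv2.imwrite('omr_processed.png', image_ori)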
The result I got from my code on the image you shared was this
You can apply thresholds to these green circled patches and then count non-zeros to get if the circle is marked or not. You can play with lower and upper_bound to try to make the solution robust.
Hope this helps! Good luck on your problem solving :)

Detect triangles, ellipses and rectangles from an image

I am trying to detect the regions of traffic signs. Using OpenCV, my approach is as follows:
The color image:
Using the TanTriggs Preprocessing get rid of the illumination variances:
Equalize histogram:
And binarize (Cv2.Threshold(blobs, blobs, 127, 255, ThresholdTypes.BinaryInv)):
Iterate each blob using ConnectedComponents and get the mean color value using the blob as mask. If it is a red color then it may be a red sign.
Then get contours of this blob using FindContours.
Simplify the contours using ApproxPolyDP and check the points of each contour:
If 3 points then triangle shape is acceptable --> candidate for triangle sign
If 4 points then shape is acceptable --> candidate
If more than 4 points, BBox dimensions are acceptable and most of the points are on the ellipse fitted (FitEllipse) --> candidate
This approach works for the separated blobs in the binary image, like the circular 100km sign in my example. However if there is a connection to the outside objects, like the triangle left bottom part in the binary image, it fails.
Because, the mean value of this blob is far from red!
Using Erosion helps in some cases, however makes it worse in many of the other images.
Using different threshold values for the binarization also works for some, but fails on many; like the erosion.
Using HoughCircle is just very slow and I couldn't manage to get good results playing with the parameters.
I have tried using matchShapes but couldn't get good results.
Can anybody show me another way to achieve what I want (with a reasonable computational time)?
Any information, or code in any language, is welcome.
Using a circularity measure (C = P^2 / (4πA)) or the approach I have described above, triangle and ellipse shapes can be found when they are separated. However, when the contour is like this, for example:
I could not find a robust way to extract the triangle piece. If I could, I would check the mean color and decide whether it's a red sign candidate.
Sorry, I don't have the kudos to comment, but can't you use the red colour?
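import cv2
import numpy as np
import common  # the answerer's own helper module, not shown in the post
myshow = common.myshow
img = cv2.imread("ms0QB.png")
grey = np.zeros(img.shape[:2], np.uint8)
# The conversion flag was garbled in the post; COLOR_BGR2HSV is assumed because imread returns BGR
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Red hue wraps around 0 in OpenCV's 0-179 range, so both ends are accepted
mask = np.logical_or(hsv[:, :, 0] > 160, hsv[:, :, 0] < 10)
grey[mask] = 255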

Build blocks and isolate characters OpenCV

I have been searching for an answer for a while to this question but cannot find anything useful.
I am trying to read the machine readable zone (MRZ) with a camera. I need to extract the characters one by one from the machine readable zone and feed them to OCR. I tried thresholding the image, finding contours and extracting the characters one by one, but on a live camera feed findContours misses some characters and the results are not what I expected.
Since the machine readable zone has a known size and form, is there a proper method to build a block for each character and extract them?
rect = []
blur = cv2.medianBlur(roi_gray, 3)  # roi_gray is the horizontally aligned MRZ zone
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
# [-2:] keeps the unpacking valid in both OpenCV 3.x and 4.x
contours, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:90]
minH = 20
minW = 20
for ctr in contours:
    if cv2.contourArea(ctr) < 1000:
        xyc, wh, a = cv2.minAreaRect(ctr)
        w, h = wh
        x, y = xyc
        if h >= minH or w >= minW:
            # The body of this branch was cut off in the post; appending the contour is an assumption
            rect.append(ctr)
rect contains the collected contours, but the problem is that after thresholding a character such as N is sometimes split into two contours, or is not found by findContours at all, so the letter is missing from the final output.
I have found a video in which the author seems to build blocks for each character, but unfortunately the author does not provide any additional information about the method or code. Video link
To me, the ID text area of interest has a known aspect ratio; maybe the "block" means that text area. Checking the aspect ratio (plus or minus some error) may make it possible to discard other text areas. In OpenCV 3 there is a detector for text.
Moreover, I suppose the detected area is tracked; at least it seems so in the video.
IMHO that app is doing a blur, then a binarization, then an erode-dilate to detect text lines. So, after a warp correction (or maybe even a little perspective correction), a vertical projection lets you detect the character width, so you can isolate each character and feed it to the OCR.
Following the comment, I am adding info about the character area. I would do an opening operation to fill the white spaces inside the letters, or to link the contours. Then, by simply summing the pixel values vertically, you get a vertical projection, which has minima between the characters. Using those minima you can get a character width by averaging the distances between them.
What you can also do is avoid recomputing this width from scratch on each frame, and instead use a width that does not vary too much over consecutive frames. You can achieve this by averaging the widths over the last 5 frames (using a queue).
Try it and come back with some results; that way we will be able to help you more.
There is an OpenCV forum too, where you'll probably find more information.
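A rough sketch of the vertical projection and the 5-frame averaging described above, using only NumPy. Several things here are assumptions for illustration: the names `split_columns` and `WidthSmoother`, the convention that text pixels are non-zero on a zero background (invert the thresholded image first if it is the other way round), and the simplification that gaps between characters are completely empty columns rather than mere minima.

```python
from collections import deque

import numpy as np

def split_columns(binary):
    """Return (start, end) column ranges of consecutive columns that contain text."""
    profile = (binary > 0).sum(axis=0)  # vertical projection: text pixels per column
    has_ink = np.concatenate(([0], (profile > 0).astype(np.int8), [0]))
    change = np.diff(has_ink)
    starts = np.where(change == 1)[0]   # empty column followed by a text column
    ends = np.where(change == -1)[0]    # text column followed by an empty column
    return list(zip(starts.tolist(), ends.tolist()))

class WidthSmoother:
    """Average the estimated character pitch over the last n frames."""
    def __init__(self, n=5):
        self.history = deque(maxlen=n)

    def update(self, spans):
        starts = [s for s, _ in spans]
        if len(starts) > 1:
            self.history.append(float(np.mean(np.diff(starts))))
        return sum(self.history) / len(self.history) if self.history else None

# Synthetic text line: three 6-pixel-wide "characters" spaced 10 pixels apart.
line = np.zeros((10, 30), np.uint8)
for x0 in (2, 12, 22):
    line[2:8, x0:x0 + 6] = 255

spans = split_columns(line)
pitch = WidthSmoother().update(spans)
print(spans, pitch)
```

Because MRZ fonts are monospaced, the smoothed pitch can also be used to cut fixed-width blocks when a character breaks into two runs or touches its neighbour.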

Preprocessing image for Tesseract OCR with OpenCV

I'm trying to develop an app that uses Tesseract to recognize text from documents photographed with a phone's camera. I'm using OpenCV to preprocess the image for better recognition, applying a Gaussian blur and a threshold for binarization, but the result is pretty bad.
Here is the image I'm using for tests:
And here is the preprocessed image:
What other filters can I use to make the image more readable for Tesseract?
I described some tips for preparing images for Tesseract here:
Using tesseract to recognize license plates
In your example, there are several things going on...
You need to get the text to be black and the rest of the image white (not the reverse). That's what character recognition is tuned on. Grayscale is ok, as long as the background is mostly full white and the text mostly full black; the edges of the text may be gray (antialiased) and that may help recognition (but not necessarily - you'll have to experiment)
One of the issues you're seeing is that in some parts of the image, the text is really "thin" (and gaps in the letters show up after thresholding), while in other parts it is really "thick" (and letters start merging). Tesseract won't like that :) It happens because the input image is not evenly lit, so a single threshold doesn't work everywhere. The solution is to do "locally adaptive thresholding", where a different threshold is calculated for each neighborhood of the image. There are many ways of doing that, but check out for example:
Adaptive gaussian thresholding in OpenCV with cv2.adaptiveThreshold(...,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,...)
Local Otsu's method
Local adaptive histogram equalization
Another problem you have is that the lines aren't straight. In my experience Tesseract can handle a very limited degree of non-straight lines (a few percent of perspective distortion, tilt or skew), but it doesn't really work with wavy lines. If you can, make sure that the source images have straight lines :) Unfortunately, there is no simple off-the-shelf answer for this; you'd have to look into the research literature and implement one of the state of the art algorithms yourself (and open-source it if possible - there is a real need for an open source solution to this). A Google Scholar search for "curved line OCR extraction" will get you started, for example:
Text line Segmentation of Curved Document Images
Lastly: I think you would do much better to work with the python ecosystem (ndimage, skimage) than with OpenCV in C++. OpenCV python wrappers are ok for simple stuff, but for what you're trying to do they won't do the job, you will need to grab many pieces that aren't in OpenCV (of course you can mix and match). Implementing something like curved line detection in C++ will take an order of magnitude longer than in python (* this is true even if you don't know python).
Good luck!
Scanning at 300 dpi (dots per inch) is not officially a standard for OCR (optical character recognition), but it is considered the gold standard.
Converting the image to greyscale generally improves accuracy when reading text.
I have written a module, Image Text Reader, that reads text from an image and preprocesses the image first to get the best result from OCR.
import tempfile

import cv2
import numpy as np
from PIL import Image

# These two constants are used below but were not shown in the post; the values are assumptions.
IMAGE_SIZE = 1800
BINARY_THREHOLD = 180


def process_image_for_ocr(file_path):
    # TODO : Implement using opencv
    temp_filename = set_image_dpi(file_path)
    im_new = remove_noise_and_smooth(temp_filename)
    return im_new


def set_image_dpi(file_path):
    im = Image.open(file_path)
    length_x, width_y = im.size
    factor = max(1, int(IMAGE_SIZE / length_x))
    size = factor * length_x, factor * width_y
    # size = (1800, 1800)
    im_resized = im.resize(size, Image.LANCZOS)  # LANCZOS is the current name for ANTIALIAS
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.jpg')
    temp_filename = temp_file.name
    temp_file.close()
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename


def image_smoothening(img):
    ret1, th1 = cv2.threshold(img, BINARY_THREHOLD, 255, cv2.THRESH_BINARY)
    ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    blur = cv2.GaussianBlur(th2, (1, 1), 0)
    ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return th3


def remove_noise_and_smooth(file_name):
    img = cv2.imread(file_name, 0)
    # The last argument (the constant C) was cut off in the post; 3 is an assumption
    filtered = cv2.adaptiveThreshold(img.astype(np.uint8), 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY, 41, 3)
    kernel = np.ones((1, 1), np.uint8)
    opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
    img = image_smoothening(img)
    or_image = cv2.bitwise_or(img, closing)
    return or_image
Note: this should be a comment on Alex I's answer, but it's too long, so I am posting it as an answer.
From "An Overview of the Tesseract OCR Engine" by Ray Smith, Google Inc.:
"Processing follows a traditional step-by-step pipeline, but some of the stages were unusual in their day, and possibly remain so even now. The first step is a connected component analysis in which outlines of the components are stored. This was a computationally expensive design decision at the time, but had a significant advantage: by inspection of the nesting of outlines, and the number of child and grandchild outlines, it is simple to detect inverse text and recognize it as easily as black-on-white text. Tesseract was probably the first OCR engine able to handle white-on-black text so trivially."
So it seems it's not needed to have black text on white background, and should work the opposite too.
You can play around with the configuration of the OCR by changing the --psm and --oem values. In your case specifically, I would suggest using
--psm 3
--oem 2
You can also look at the following link for further details.
I guess you have used a generic (global) approach to binarization, which is why the whole image is not binarized uniformly. You can use adaptive thresholding for binarization instead. You can also apply skew correction, perspective correction and noise removal for better results.
Refer to this Medium article to learn about the above-mentioned techniques, along with code samples.
For wavy text like yours there is this fantastic Python code on GitHub, which transforms the text to straight lines (this is the most updated version of MZucker's original post; the mechanics are explained here).
