Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
My task is to pinpoint where the plate number is in an image. The image does not contain only the plate; it may contain the whole car or anything else. I applied a Gaussian blur, then converted to grayscale, then increased the contrast, then applied a Laplacian of Gaussian to detect the edges.
Now I am at a loss as to how to detect where the plate number is in the image. I am not going to read the license number, just make the system know where it is.
Can you direct me to a study on this, or to an algorithm that can be used to do it?
Thank you!
I think a more robust way to tackle this is to train a detector, provided you have enough training images of license plates in different scenarios. One thing you can try is the Haar cascade classifier in the OpenCV library, which does multiscale detection of learned patterns.
You could try edge detection or some form of Hough transforms.
For example, do edge detection and then look for rectangles (or if the images aren't straight on, parallelograms) in the image. If you know that the plates will all be the same shape and size ratios, you can use that to speed up your search.
EDIT:
Found this for you.
Using a feature recognition algorithm such as SIFT would be a good starting point. Do you need real-time recognition or not? I recommend narrowing the search space first, for example by filtering out regions of the image (is your environment controlled or not?). There is an article about recognising license plates using SIFT here (I just skimmed it, but it looks reasonable).
License plates (number plates) of vehicles have two striking properties:
A specified color pattern (black letters on a white, yellow, or gray background)
A known aspect ratio
These properties can be used to extract only the license plate. First, threshold the image using adaptive thresholding. Then find the contours in the image whose aspect ratio is close to the standard value. This method should work in most cases. You can also try erosion followed by dilation on the thresholded image to remove noise.
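The aspect-ratio filter described above can be sketched in plain Python. This is only an illustration under stated assumptions: the input is already a thresholded binary grid, blobs are found with a simple flood fill rather than a real contour tracer, and the names `component_boxes`, `plate_candidates`, `target_ratio` and `tolerance` are made up for this example. In OpenCV the boxes would more likely come from `cv2.findContours` and `cv2.boundingRect`.

```python
from collections import deque

def component_boxes(binary):
    """Return bounding boxes (x, y, w, h) of 4-connected foreground blobs."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Flood-fill one blob and track its extent.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((c0, r0, c1 - c0 + 1, r1 - r0 + 1))
    return boxes

def plate_candidates(boxes, target_ratio, tolerance):
    """Keep boxes whose width/height ratio is close to the expected one."""
    return [b for b in boxes if abs(b[2] / b[3] - target_ratio) <= tolerance]

# Toy binary image: one wide blob (8x2) and one square blob (3x3).
image = [[0] * 12 for _ in range(8)]
for r in range(1, 3):
    for c in range(2, 10):
        image[r][c] = 1
for r in range(4, 7):
    for c in range(1, 4):
        image[r][c] = 1

boxes = component_boxes(image)
# target_ratio=4.0 is an arbitrary value for the toy data, not a real plate standard.
candidates = plate_candidates(boxes, target_ratio=4.0, tolerance=0.5)
print(candidates)
```

The ratio and tolerance would have to be set from the plate format actually in use, and perspective distortion widens the range that needs to be accepted.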
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
The community reviewed whether to reopen this question 6 months ago and left it closed:
Original close reason(s) were not resolved
I'm trying to implement this paper right now:
Automatic Skin and Hair Masking Using Convolutional Neural Networks
I've gotten the FCN and CRF part working, and I found the code to generate the alpha mask once I have the trimap.
I'm stuck on the part between (c) and (d), though.
How do I generate a trimap given the binary mask? The paper says:
We apply morphological operators on the binary segmentation mask for hair and skin, obtaining a trimap that indicates foreground (hair/skin), background and unknown pixels. In order to deal with segmentation inaccuracies, and to best capture the appearance variance of both foreground and background, we first erode the binary mask with a small kernel, then extract the skeleton pixels as part of foreground constrain pixels. We also erode the binary mask with a larger kernel to get more foreground constrain pixels. The final foreground constrain pixels is the union of the two parts. If we only keep the second part then some thin hair regions will be gone after erosion with a large kernel. If a pixel is outside the dilated mask then we take it as background constrain pixel. All other pixels as marked as unknown, see figure 2 (d).
OpenCV supports morphological operations.
Please see this tutorial explaining how to use erode and dilate functions.
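As a rough illustration of the procedure quoted from the paper, here is a minimal pure-Python sketch that works on sets of pixel coordinates. Several things are assumptions rather than details from the paper: square structuring elements, the kernel radii (`small`, `large`, `grow`), the 0/128/255 label values, and the use of a Lantuejoul-style morphological skeleton for the "skeleton pixels" step. With OpenCV, `cv2.erode` and `cv2.dilate` would replace the hand-written helpers.

```python
def square_offsets(radius):
    span = range(-radius, radius + 1)
    return [(dr, dc) for dr in span for dc in span]

def erode(pixels, radius):
    """Keep a pixel only if its whole square neighbourhood is foreground."""
    offs = square_offsets(radius)
    return {(r, c) for (r, c) in pixels
            if all((r + dr, c + dc) in pixels for dr, dc in offs)}

def dilate(pixels, radius):
    """Grow the foreground by the square neighbourhood."""
    offs = square_offsets(radius)
    return {(r + dr, c + dc) for (r, c) in pixels for dr, dc in offs}

def skeleton(pixels):
    """Morphological skeleton: union of what each opening removes."""
    skel, current = set(), set(pixels)
    while current:
        shrunk = erode(current, 1)
        skel |= current - dilate(shrunk, 1)
        current = shrunk
    return skel

def make_trimap(mask, height, width, small=1, large=3, grow=2):
    """mask is a set of (row, col) foreground pixels; returns a 2D list."""
    foreground = skeleton(erode(mask, small)) | erode(mask, large)
    not_background = dilate(mask, grow)
    trimap = [[0] * width for _ in range(height)]       # 0 = background
    for r in range(height):
        for c in range(width):
            if (r, c) in foreground:
                trimap[r][c] = 255                      # sure foreground
            elif (r, c) in not_background:
                trimap[r][c] = 128                      # unknown
    return trimap

# Toy mask: a 9x9 block with a thin 3-pixel-wide strand attached.
mask = {(r, c) for r in range(5, 14) for c in range(5, 14)}
mask |= {(r, c) for r in range(8, 11) for c in range(14, 19)}
trimap = make_trimap(mask, 20, 20)
```

In this toy example the strand disappears under the large erosion, but its centre line survives through the skeleton of the lightly eroded mask, which is the reason the quoted passage gives for taking the union of the two parts.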
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
First of all, this theory confuses me. Could someone explain it in a few words?
Also, does the word "scale" in a computer vision context mean the various sizes of objects,
the various units of measurement of objects (i.e. meter, cm, etc.), or, as I think, the various degrees of smoothing/blurring applied to the same image of interest?
Second, a multi-scale representation of an image is made with a smoothing/blur operator; the one I know is the Gaussian blur operator. Why is the same image smoothed several times? What is the point of making several smoothed images with different levels of detail/resolution, but not different sizes, of the same scene (i.e. one smoothing operator on the image of interest at size 256x256 and another at 512x512)?
I'm asking in the context of feature extraction and description.
I would be thankful if someone could clarify the subject for me. Sorry for my language!
"Scale" here alludes to both the size of the image and the size of the objects themselves, at least for current feature detection algorithms. The reason you construct a scale space is that it lets you focus on features of a particular size depending on which scale you are looking at. At a smaller image scale (a more heavily downsampled or blurred image), only the coarser structures remain to concentrate on. Similarly, at a larger image scale, we can concentrate on finer features.
You do all of this on the same image because this is a common pre-processing step for feature detection. The whole point of feature detection is to be able to detect features over multiple scales of the image. You only output those features that are reliable over all of the different scales. This is actually the basis of the Scale-Invariant Feature Transform (SIFT) where one of the objectives is to be able to detect keypoints robustly that can be found over multiple scales of the image.
What you do to create multiple scales is decompose an image by repeatedly subsampling the image and blurring the image with a Gaussian filter at each subsampled result. This is what is known as a scale space. A typical example of what a scale space looks like is shown here:
The reason you choose a Gaussian filter is fundamental to the way the scale space works. At each scale, you can think of the image produced as a more "simplified" version of the one from the previous scale. Typical blurring filters introduce new spurious structures that don't correspond to simplifications of the finer scales. I won't go into the details, but there is a whole body of scale space theory which concludes that construction using the Gaussian blur is the most fundamental way to do this, because new structures are not created when going from a fine scale to any coarser scale. You can check out the Wikipedia article on scale space I linked to above for more details.
Now, traditionally a scale space is created by convolving your image with a Gaussian filter of various standard deviations, and that Wikipedia article has a nice pictorial representation of that. However, when you look at more recent feature detection algorithms like SURF or SIFT, they use a combination of blurring using different standard deviations as well as subsampling the image, which is what I talked about at the beginning of this post.
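To make the idea concrete, here is a small sketch of a Gaussian scale space built on a 1D signal (think of one scanline of an image; for a 2D image the same kernel would be applied along rows and then columns). The function names, the kernel truncation at three standard deviations, and the border handling are assumptions made for this illustration and are not taken from SIFT or SURF.

```python
import math

def gaussian_kernel(sigma):
    """Sampled, normalised Gaussian truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)   # replicate border samples
            acc += w * signal[j]
        out.append(acc)
    return out

def scale_space(signal, sigma_step, levels):
    """Each level blurs the previous one, so the blur accumulates."""
    stack = [list(signal)]
    for _ in range(levels):
        stack.append(blur(stack[-1], sigma_step))
    return stack

def total_variation(signal):
    """Sum of absolute differences between neighbours: a crude detail measure."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))

def downsample(signal):
    """Keep every second sample, as done between pyramid octaves."""
    return signal[::2]

# A step edge (coarse structure) plus a fine alternating ripple (fine detail).
signal = [(10.0 if i >= 16 else 0.0) + (1.0 if i % 2 else -1.0)
          for i in range(32)]
stack = scale_space(signal, sigma_step=1.0, levels=4)
detail = [total_variation(level) for level in stack]
next_octave = downsample(stack[-1])
print([round(d, 2) for d in detail])
```

The printed detail measure never increases from one level to the next: the ripple fades quickly while the step edge survives, which is the "simplification without new structure" behaviour the answer describes. Subsampling a blurred level gives the smaller image used at the next octave of a pyramid.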
Either way, check out that Wikipedia article for more details. It covers this material in more depth than I have here.
Good luck!
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'd love to write a program that will take a scanned invoice (original is A4 paper, scanned as JPEG file (wrapped in a PDF), ~4000 pixels wide) and look for logotypes. If a logotype is found, the invoice file (PDF) will be tagged with those tags associated with the logotypes found in the invoice.
I expect 20 or so logotypes to look for, and about 2500 invoices (so yes, a pain to do manually).
I'm leaning towards OpenCV, since I know it is used behind the scenes by Sikuli. I would only look for logos in certain areas, i.e. logo A should only be looked for in the top left corner of every invoice, logo B in the top right, etc. I assume that converting the JPEG to high-contrast monochrome would help too?
"20 or so logotypes" is a good number for using keypoints (corners, blobs, etc.) and their descriptors (SIFT, SURF, FREAK, etc.) in a nearest-neighbor search. The steps are:
1. Train
create a training set of logos (take them from your documents)
compute a set of keypoints and their descriptors for every logo
2. Find
apply histogram equalization and noise filtering to the picture
find keypoints and their descriptors
find the best matching descriptors (nearest neighbors) in your training set
find a homography for the positions of the matching keypoints, to be sure it is a complete logo and not just one accidental point
All these steps are implemented in OpenCV, but you will need some time to play with the parameters to get the best solution. In any case, your logos have a very low level of distortion, so you should get a high rate of true positives and a low rate of false positives.
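The nearest-neighbor matching step can be sketched without OpenCV, assuming the descriptors have already been computed and are plain lists of floats. The toy 3-element descriptors, the `max_distance` cut-off, and the simple voting scheme are assumptions for this illustration; real SIFT descriptors have 128 dimensions, and the homography check from the list above is not shown.

```python
import math

def nearest_distance(descriptor, candidates):
    """Euclidean distance from one descriptor to its closest candidate."""
    return min(math.dist(descriptor, c) for c in candidates)

def classify_logo(query_descriptors, training_set, max_distance):
    """Let every query descriptor vote for the logo holding its nearest neighbour."""
    votes = {name: 0 for name in training_set}
    for d in query_descriptors:
        best_name, best_dist = None, max_distance
        for name, descriptors in training_set.items():
            dist = nearest_distance(d, descriptors)
            if dist < best_dist:
                best_name, best_dist = name, dist
        if best_name is not None:        # ignore descriptors with no close match
            votes[best_name] += 1
    winner = max(votes, key=votes.get)
    return (winner if votes[winner] > 0 else None), votes

# Hypothetical training descriptors for two logos.
training_set = {
    "logo_a": [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
    "logo_b": [[5.0, 5.0, 5.0], [6.0, 5.0, 5.0]],
}
# Descriptors from an invoice region: two resemble logo_a, one matches nothing.
query = [[0.1, 0.0, 0.9], [0.0, 0.9, 0.1], [9.0, 9.0, 9.0]]

winner, votes = classify_logo(query, training_set, max_distance=1.0)
print(winner, votes)
```

A vote count alone can be fooled by a few accidental matches, which is why the answer adds the homography step as a geometric consistency check.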
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am currently studying a computer vision module in college.
I would like to get a theoretical understanding of what contours are in computer vision and what they are used for.
A contour is simply the boundary of an object in an image. Various representations of contours (e.g. chain code, Fourier descriptors, shape context) are used to recognize or categorize objects.
This assumes that you have a way to segment out an object and find its boundary, which itself is not a trivial problem. One particular class of algorithms for finding boundaries is called active contours or snakes. Is this what you are asking about?
You can go through the official OpenCV documentation here, which describes a contour as a simple curve joining all the continuous points that have the same color or intensity.
I used contours in hand gesture recognition, where I used the area bounded by each contour as a basis for removing noise and detecting only the hand in the image.
A contour is a boundary around something that has well-defined edges, which means that the machine is able to calculate a difference in gradient (a significant difference in the magnitude of pixel values), check whether the same difference continues and forms a recognisable shape, and draw a boundary around it. OpenCV can do this for a lot of shapes, which are shown in the link below.
Just imagine how you do it with your eyes. You're in a room and you create a boundary in your mind when you see a frame, a monitor, or a ball. Contours in OpenCV work in exactly the same way. As @Dima said, various algorithms are used for this purpose.
If you need examples and how contours are represented in opencv, here's a link.
Hope this helps.
OpenCV for Python provides contours and several edge detection features for identifying attributes of objects. A contour can be explained simply as a curve joining all the continuous points (along the boundary) that have the same colour or intensity.
Use of a binary image in contour detection:
Contours are a useful tool for shape analysis and for object detection and recognition. Contour finding takes a binary image (in other words, an image whose pixels have only 2 possible values), so before finding contours, apply a threshold or Canny edge detection.
Steps for finding the contours:
1) Convert to grayscale
2) Convert to a binary image
3) Find contours
Drawing contours:
To draw the contours, the cv2.drawContours function is used. It can also be used to draw any shape, provided you have its boundary points.
Properties of contours:
1) To find the area
2) To find the perimeter
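The threshold-then-find-boundary steps above can be illustrated on a tiny grayscale grid in plain Python. This is a simplification and not what OpenCV does internally: it only marks foreground pixels that touch the background instead of tracing an ordered curve, and the helper names are invented for the example. In OpenCV the corresponding calls would be `cv2.cvtColor`, `cv2.threshold`, `cv2.findContours`, `cv2.contourArea` and `cv2.arcLength`.

```python
def threshold(gray, level):
    """Step 2: turn a grayscale grid into a binary one."""
    return [[1 if value > level else 0 for value in row] for row in gray]

def boundary_pixels(binary):
    """Step 3 (simplified): foreground pixels with a background 4-neighbour."""
    rows, cols = len(binary), len(binary[0])

    def is_foreground(r, c):
        return 0 <= r < rows and 0 <= c < cols and binary[r][c] == 1

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [(r, c)
            for r in range(rows) for c in range(cols)
            if is_foreground(r, c)
            and not all(is_foreground(r + dr, c + dc) for dr, dc in neighbours)]

def pixel_area(binary):
    """Area as a plain count of foreground pixels."""
    return sum(sum(row) for row in binary)

# Step 1 is assumed done: a 5x5 grayscale grid with a bright 3x3 square.
gray = [[10] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        gray[r][c] = 200

binary = threshold(gray, 127)
boundary = boundary_pixels(binary)
print(len(boundary), pixel_area(binary))
```

For the 3x3 square every pixel except the centre lies on the boundary, so the boundary has 8 pixels while the area is 9.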
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to develop a system that recognizes various objects present in an image based on their primitive features, such as texture, shape, and color.
The first stage of this process is to extract the individual objects from the image and then process each one in turn.
However, the segmentation algorithms I've studied so far are not even close to a perfect, or so-called ideal, image segmentation algorithm.
Segmentation accuracy will decide how well the system responds to a given query.
Segmentation should be fast as well as accurate.
Can anyone suggest a segmentation algorithm, developed or implemented so far, that isn't too complicated to implement but is good enough to complete my project?
Any help is appreciated.
A very late answer, but might help someone searching for this in google, since this question popped up as the first result for "best segmentation algorithm".
Fully convolutional networks seem to do exactly the task you're asking for. Check the paper in arXiv, and an implementation in MatConvNet.
The following image illustrates a segmentation example from these CNNs (the paper I linked actually proposes 3 different architectures, FCN-8s being the best).
Unfortunately, the best algorithm type for facial recognition uses wavelet reconstruction. This is not easy, and almost all current algorithms in use are proprietary.
This is a late response, so maybe it's not useful to you, but one suggestion would be to use the watershed algorithm.
Beforehand, you can take a generic (black and white) drawing of a face and generate an FFT of the drawing; call it FFT_Face.
Now segment your image of a person's face using the watershed algorithm. Call the segmented image Water_Face.
Now find the center of mass of each contour/segment.
Generate an FFT of Water_Face and correlate it with FFT_Face. The brightest pixel in the resulting image should be the center of the face. Now you can compute the distances between this point and the centers of the segments found earlier. The first few distances should be enough to distinguish one person from another.
I'm sure there are several improvements to the process, but the general idea should get you there.
Doing a Google search turned up this paper: http://www.cse.iitb.ac.in/~sharat/papers/prim.pdf
It seems that getting it any better is a hard problem, so I think you might have to settle for what's there.
You can try the watershed segmentation algorithm.
You can also calculate the accuracy of the segmentation algorithm using qualitative measures.
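If a hand-labelled reference mask is available, accuracy can also be scored numerically. One widely used measure, which is not named in the answers above and is offered here only as a suggestion, is intersection over union (IoU). A minimal sketch, assuming both masks are binary grids of the same size:

```python
def intersection_over_union(predicted, truth):
    """IoU of two equally sized binary masks given as lists of rows."""
    intersection = union = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0

predicted = [[1, 1, 0],
             [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = intersection_over_union(predicted, truth)
print(score)
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they do not overlap at all.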