Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am working on a project to detect holes in roads. I use a laser to project a beam onto the road and a camera to take an image of the road. The image may look like this:
Now I want to process this image and decide whether the line is straight or not, and, if it is curved, how big the curve is.
I don't understand how to do this. I have searched a lot but can't find an appropriate result. Can anyone help me with this?
This is rather complicated and your question is very broad, but let's have a try:
1. First you have to identify the dots in the pixel image. There are several options for this, but I'd smooth the image with a blur filter and then find the reddest pixels (which are assumed to be the centers of the dots). Store these coordinates in an array of (x, y) pairs.
2. I'd use a spline interpolation between the dots. This way one can easily get the local derivative of a curve passing through each point.
3. If the first derivative is nearly constant along the curve, the dots are on a line. If you believe the dots belong to a single curve, the second derivative gives you the curvature (approximately, as long as the slope is small).
For 1. you may also rely on a library specialized in image processing (this is the image processing part of your challenge). One such library is OpenCV.
For 2. I'd use some math toolkit, either Octave or a math library for a native language.
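A minimal sketch of the straightness check described above, assuming the dot centers have already been extracted as (x, y) pairs with distinct x values. To stay dependency-free it swaps the spline for plain three-point finite differences; the function names and the tolerance are illustrative assumptions, not part of the answer.

```python
def second_derivatives(points):
    """Estimate d2y/dx2 at each interior point with three-point finite differences."""
    pts = sorted(points)  # order the dots from left to right
    result = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(pts, pts[1:], pts[2:]):
        left_slope = (y1 - y0) / (x1 - x0)
        right_slope = (y2 - y1) / (x2 - x1)
        # change of slope divided by the mean width of the two intervals
        result.append(2.0 * (right_slope - left_slope) / (x2 - x0))
    return result


def is_straight(points, tolerance=1e-3):
    """True if the slope changes by no more than `tolerance` (an assumed threshold)."""
    return all(abs(d) <= tolerance for d in second_derivatives(points))


line = [(x, 0.5 * x + 3.0) for x in range(10)]
bump = [(x, 0.02 * (x - 5) ** 2) for x in range(10)]
print(is_straight(line), is_straight(bump))
print(max(abs(d) for d in second_derivatives(bump)))
```

With real, noisy dot positions the tolerance would have to be tuned, and a smoothing spline would probably give steadier derivatives than raw differences.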
There are several different ways of measuring the straightness of a line. Since your question is rather vague, it's impossible to say what will work best for you.
But here's my suggestion:
Use linear regression to calculate the best-fit straight line through your points, then calculate the mean squared distance of the points from this line (straighter lines will give smaller results).
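A rough sketch of that suggestion in plain Python, assuming the points are (x, y) pairs that do not all share the same x. The function names are made up for illustration, and the distance is taken perpendicular to the fitted line.

```python
def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the points are not all at the same x
    return slope, mean_y - slope * mean_x


def mean_squared_distance(points):
    """Mean squared perpendicular distance of the points from the fitted line."""
    slope, intercept = fit_line(points)
    norm = slope ** 2 + 1.0
    return sum((slope * x - y + intercept) ** 2 / norm for x, y in points) / len(points)


straight = [(x, 2 * x + 1) for x in range(5)]
dented = [(0, 1), (1, 3), (2, 2), (3, 7), (4, 9)]  # one dot pushed down
print(mean_squared_distance(straight), mean_squared_distance(dented))
```

The result is in squared pixels, so any "straight enough" threshold depends on the image scale and has to be chosen for the setup at hand.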
You may want to read this paper; it is an interesting one for solving your problem.
As @urzeit suggested, you should first find the points as accurately as possible. There's really no way to give good advice on that without seeing real pictures, except maybe: try to make the task as easy as possible for yourself. For example, if you can set the camera to a very short shutter time (microseconds, if possible) and concentrate the laser energy in the same time, the "background" will contribute less energy to the image brightness, and the laser spots will simply be bright spots on a dark background.
Measuring the linearity should be straightforward, though: "Linearity" is just a different word for "linear correlation". So you can simply calculate the correlation between the X and Y values. As the pictures on the linked Wikipedia page show, a correlation of +1 or -1 means all points are on a line.
If you want the actual line, you can simply use Total Least Squares.
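Both ideas can be sketched without any library, assuming the points are (x, y) pairs. The helper names are hypothetical. The total least squares line is taken here as the principal axis of the 2x2 covariance matrix, which is one standard way to compute it for 2D points.

```python
import math


def _moments(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return mean_x, mean_y, sxx, syy, sxy


def correlation(points):
    """Pearson correlation of x and y; undefined for exactly horizontal or vertical lines."""
    _, _, sxx, syy, sxy = _moments(points)
    return sxy / math.sqrt(sxx * syy)


def total_least_squares(points):
    """Return (centroid, unit direction, mean squared perpendicular distance)."""
    mean_x, mean_y, sxx, syy, sxy = _moments(points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # principal axis of the covariance
    direction = (math.cos(angle), math.sin(angle))
    # the smaller eigenvalue of the covariance matrix is the residual variance
    residual = 0.5 * (sxx + syy - math.hypot(sxx - syy, 2.0 * sxy))
    return (mean_x, mean_y), direction, residual


sloped = [(x, 2 * x + 1) for x in range(6)]
vertical = [(3, y) for y in range(6)]
print(correlation(sloped))
print(total_least_squares(sloped))
print(total_least_squares(vertical))
```

Unlike the correlation, the total least squares fit still works when the laser line is vertical or horizontal in the image, and its residual can serve as the straightness measure.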
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am developing a diagramming app for Apple Pencil using Qt-5.8/PyQt5 and am trying to get the pencil strokes as smooth as in some of the other apps I am seeing, namely Notability and PDF Expert. I patched Qt-5.8 to provide fast access to the floating-point coalesced and predicted UITouch data provided by Apple, and my app code is fast and responsive, but the lines are still jittery (see screenshots):
Notability and PDF Expert both produce lines that maintain their sharpness at various levels of zoom, which suggests to me that they may be vectorized.
Anyone have a suggestion for smoothing out my painting? I am already painting at retina resolution and using the same 250Hz Apple Pencil data as they are. Is there a mathematical technique for smoothing a series of points, or some other trick out there?
Before you implement a smoothing/optimization filter on the input, make sure you're calling the appropriate API to get the best data available.
If you request data from touch.location(in: view) the samples will be discretized (rounded) to the pixel-grid.
If you request data from touch.preciseLocation(in: view) the samples will not be rounded. They will include fractional spacing between pixels, which is critical to the task at hand.
Note-taking apps tend to store and paint the drawings as vectors, which is why they are smooth. This also enables several nice features, such as selecting and moving text around or changing its color and style. Vectors are also very efficient to store and can be zoomed in or out without loss of resolution, unlike raster painting.
Some applications even use a two-step process: an initial smoothing takes place while a specific glyph is being drawn, and another pass takes place after you lift the pen and the glyph is considered finished.
Your code, on the other hand, looks very raster-y. There are a number of ways to simplify the input points, ranging from very simple to incredibly complex.
In your case what you could try is rather simple, and should work fine for the kind of usage you are aiming at.
You need to keep processing each stroke / glyph as the pen moves, and instead of adding every intermediate position to the stroke control points, you only add points that deviate from the current angle / direction above a certain threshold. It is conceptually a lot like the Ramer–Douglas–Peucker algorithm, but you don't apply it on pre-existing data points, but rather while the points are created, which is more efficient and better for user experience.
Your first data point is created when you put the pen down on the screen. Then you start moving the pen. You now have a second point, so you add that, but you also calculate the angle of the line which the two points form, that is, the direction the pen is going. Then, as you move the pen further, you get a third point, which you check against the second point. If the angle difference is not above the threshold, then instead of adding the third point you move the second point to that position, effectively eliminating a redundant point. So you only end up creating points which deviate enough to form the rough shape of the line, and you skip all the tiny variations which create your jittery lines.
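The incremental filtering described above could look roughly like this. The class name, the 10 degree default and the handling of repeated samples are assumptions made for the sketch, not something the answer specifies.

```python
import math


class StrokeSimplifier:
    """Keeps only the points where the pen direction changes noticeably."""

    def __init__(self, threshold_degrees=10.0):
        self.threshold = math.radians(threshold_degrees)  # assumed default
        self.points = []

    def add(self, x, y):
        pts = self.points
        if len(pts) < 2:
            if not pts or pts[-1] != (x, y):
                pts.append((x, y))
            return
        (ax, ay), (bx, by) = pts[-2], pts[-1]
        if (x, y) == (bx, by):
            return  # ignore repeated samples, they carry no direction
        current = math.atan2(by - ay, bx - ax)   # direction of the last segment
        incoming = math.atan2(y - by, x - bx)    # direction towards the new sample
        diff = incoming - current
        turn = abs(math.atan2(math.sin(diff), math.cos(diff)))  # wrapped to [0, pi]
        if turn <= self.threshold:
            pts[-1] = (x, y)      # same direction: extend the last point
        else:
            pts.append((x, y))    # real corner: start a new segment


corner = StrokeSimplifier()
for sample in [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]:
    corner.add(*sample)
print(corner.points)
```

Because the reference direction is updated every time the last point moves, a very slow, gentle curve can slip under the threshold for a long time; comparing against the direction at which the segment started would be one way to limit that.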
This is only the first step; it will leave you with a simplified but faceted line. If you draw it directly, it will not look like a smooth curve, but like a series of line segments. The second step is point interpolation; regular old cubic interpolation will probably do just fine. You then get each actual position by interpolating between each set of 3 points, and draw the brush stroke at every brush spacing threshold. When interpolating the position, you also interpolate the brush pressure between the two points defining the currently drawn segment, which you must store along with each curve-defining point. The pressure interpolation itself can be as simple as linear.
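As an illustration of the second step, here is a sketch that uses a Catmull-Rom spline for the positions and linear interpolation for the pressure. Catmull-Rom is only one possible cubic scheme (it uses four neighbouring control points per segment) and is my choice for the example. The fixed number of steps per segment stands in for a real brush spacing, and the names are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the Catmull-Rom segment between p1 and p2 for t in [0, 1]."""
    def axis(a, b, c, d):
        return 0.5 * (2 * b + (c - a) * t
                      + (2 * a - 5 * b + 4 * c - d) * t ** 2
                      + (3 * b - a - 3 * c + d) * t ** 3)
    return tuple(axis(*coords) for coords in zip(p0, p1, p2, p3))


def stroke_samples(controls, steps=8):
    """Yield (x, y, pressure) dabs; `controls` holds (x, y, pressure) tuples."""
    xy = [(x, y) for x, y, _ in controls]
    padded = [xy[0]] + xy + [xy[-1]]  # repeat the end points as outer neighbours
    for i in range(len(controls) - 1):
        p_start, p_end = controls[i][2], controls[i + 1][2]
        for s in range(steps):
            t = s / steps
            x, y = catmull_rom(*padded[i:i + 4], t)
            yield x, y, p_start + (p_end - p_start) * t  # linear pressure
    yield controls[-1]


controls = [(0, 0, 0.2), (10, 0, 0.6), (20, 10, 1.0)]
dabs = list(stroke_samples(controls, steps=4))
for dab in dabs:
    print(dab)
```

The curve passes through every stored control point, so the simplified stroke keeps its shape while the facets between the points are rounded off.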
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I was given this question on a job interview and think I really messed up. I was wondering how others would go about it so I could learn from this experience.
You have one image from a surveillance video located at an airport which includes a line of people waiting for check-in. You have to assess if the line is big/crowded and therefore additional clerks are necessary. You can assume anything that may help your answer. What would you do?
I told them I would try to
segment the area containing people from the rest by edge detection
use assumptions on body contour such as relative height/width to denoise unwanted edges
use color knowledge; but then they asked how to do that and I didn't know
You failed to mention one of the things that makes it easy to identify people standing in a queue — the fact that they aren't going anywhere (at least, not very quickly). I'd do it something like this (Warning: contains lousy Blender graphics):
You said I could assume anything, so I'll assume that the airport's floor is a nice uniform green colour. Let's take a snapshot of the queue every 10 seconds:
We can use a colour range filter to identify the areas of floor that are empty in each image:
Then, by calculating the per-pixel maximum across these images, we can eliminate people who are just milling around and are not part of the queue. Calculating the queue length from the resulting image should be very easy:
There are several ways of improving on this. For example, green might not be a good choice of colour in Dublin airport on St Patrick's day. Chequered tiles would be a little more difficult to segregate from foreground objects, but the results would be more reliable. Using an infrared camera to detect heat patterns is another alternative.
But the general approach should be fairly robust. There's absolutely no need to try and identify the outlines of individual people — this is really very difficult when people are standing close together.
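The colour filter and the per-pixel maximum described in this answer can be sketched on tiny images stored as nested lists of RGB tuples. The colour range in `looks_green` and all function names are assumptions for the example; a real implementation would use an image library and a calibrated range.

```python
def floor_mask(frame, is_floor):
    """1 where the pixel looks like empty floor, else 0."""
    return [[1 if is_floor(pixel) else 0 for pixel in row] for row in frame]


def stationary_mask(frames, is_floor):
    """1 where the floor was hidden in every snapshot, i.e. someone stood still."""
    masks = [floor_mask(frame, is_floor) for frame in frames]
    height, width = len(masks[0]), len(masks[0][0])
    # per-pixel maximum: 1 if the floor was visible in at least one snapshot
    seen_floor = [[max(m[r][c] for m in masks) for c in range(width)]
                  for r in range(height)]
    return [[1 - v for v in row] for row in seen_floor]


def looks_green(pixel):
    # Hypothetical colour range for the assumed green floor (RGB, 0-255).
    r, g, b = pixel
    return g > 150 and r < 100 and b < 100


G, P = (0, 200, 0), (120, 80, 60)  # floor pixel, person pixel
frames = [[[P, P, G, P]],
          [[P, P, G, G]],
          [[P, G, G, G]]]
queue = stationary_mask(frames, looks_green)
print(queue, sum(map(sum, queue)))
```

Only the first pixel is covered in all three snapshots, so only it counts towards the queue; the pixels where someone walked past are cleared by the maximum.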
I would just use a person detector, for example OpenCV's HOG people detection:
http://docs.opencv.org/modules/gpu/doc/object_detection.html
or latent svm with the person model:
http://docs.opencv.org/modules/objdetect/doc/latent_svm.html
I would count the number of people in the queue...
I would estimate the color of the empty floor, and go to a normalized color space (like { R/(R+G+B), G/(R+G+B) } ). Also do this for the image you want to check, and compare these two.
My assumption: where the difference is larger than a threshold T it is due to a person.
When this happens over too large an area, the queue is crowded and you need more clerks for check-in.
This processing will be far more robust than trying to recognize and count individual persons, and it will work with quite low resolution, i.e. a low number of pixels per person.
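A small sketch of this comparison, assuming the empty floor can be summarised by a single reference colour and that pixels are RGB tuples. The threshold value, the distance measure and the function names are illustrative choices.

```python
def chromaticity(pixel):
    """Normalized colour (r, g) = (R / (R+G+B), G / (R+G+B))."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (1 / 3, 1 / 3)  # treat pure black as neutral grey (assumption)
    return (r / total, g / total)


def occupied_fraction(image, floor_pixel, threshold=0.1):
    """Fraction of pixels whose chromaticity differs from the floor by more than T."""
    floor_r, floor_g = chromaticity(floor_pixel)
    flagged = total = 0
    for row in image:
        for pixel in row:
            r, g = chromaticity(pixel)
            if abs(r - floor_r) + abs(g - floor_g) > threshold:
                flagged += 1
            total += 1
    return flagged / total


floor = (60, 180, 60)
shadow = (30, 90, 30)      # same floor, half the brightness
person = (200, 50, 50)
print(occupied_fraction([[floor, shadow, person, floor]], floor))
```

The shadowed floor pixel has the same normalized colour as the lit one, which is the point of leaving brightness out of the comparison.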
I need to stitch images without overlaps.
The task will be more clear from the example:
Source:
Target:
Basically, I need a method that determines how well two images join to each other.
UPDATE
Using a random forest from the OpenCV library gives 80% successful responses. The trained forest shows how well two parts of the puzzle fit each other.
Assuming you don't want the software to have a 5-year-old's encyclopedic knowledge of Disney characters, is your match then based on the points at which lines meet?
Just store a list of the coordinates where a line hits the edge of a square, and then compare each pair of squares, minimising the difference in hit positions.
PS: Assuming the squares don't rotate, just store, for each side of the square, a list of distances along that side.
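A possible sketch of that comparison, assuming each edge of a square has already been reduced to a list of hit positions normalized to the range 0 to 1. The penalty for unmatched lines and the function names are assumptions.

```python
def edge_cost(hits_a, hits_b, missing_penalty=1.0):
    """Mismatch between two edges given the positions where lines cross them."""
    a, b = sorted(hits_a), sorted(hits_b)
    paired = sum(abs(p - q) for p, q in zip(a, b))
    # lines that have no counterpart on the other edge are penalised
    return paired + missing_penalty * abs(len(a) - len(b))


def best_neighbour(right_edge, candidates):
    """Index of the candidate whose left edge fits `right_edge` best."""
    costs = [edge_cost(right_edge, left_edge) for left_edge in candidates]
    return min(range(len(costs)), key=costs.__getitem__)


right_edge = [0.2, 0.7]
left_edges = [[0.5], [0.21, 0.69], [0.1, 0.4, 0.9]]
print(best_neighbour(right_edge, left_edges))
```

Pairing the sorted positions one by one is a simplification; with many lines per edge, a proper assignment between hits would be more robust.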
You may consider dilating the edges in each fragment, which could probably make up for the edges lost along the red lines. Then stitch the fragments starting from this point.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am interested in recognizing letters on a Boggle board, probably using OpenCV. The letters are all the same font but could be rotated, so using a standard text recognition library is a bit of a problem. Additionally, the M and W have underscores to differentiate them, and the Q is actually a Qu.
I am fairly confident I can isolate the separate letters in the image; I am just wondering how to do the recognition part.
It depends on how fast you need to be.
If you can isolate the square of the letter and rotate it so that the sides of the square containing the letter are horizontal and vertical then I would suggest you:
convert the images to black/white (with the letter one colour and the rest of the die the other)
make a dataset of reference images of all letters in all four possible orientations (i.e. upright and rotated 90, 180 and 270 degrees)
use a template matching function such as cvMatchTemplate to find the best matching image from your dataset for each new image.
This will take a bit of time, so optimisations are possible, but I think it will get you a reasonable result.
If getting them in a proper orientation is difficult you could also generate rotated versions of your new input on the fly and match those to your reference dataset.
If the letters have different scale then I can think of two options:
If orientation is not an issue (i.e. your Boggle block detection can also put the block in the proper orientation) then you can use the bounding box of the area that has the letter colour as a rough indicator of the scale of the incoming picture, and scale that to be the same size as the bounding box on your reference images (this might be different for each reference image)
If orientation is an issue then just add scaling as a parameter of your search space. So you search all rotations (0-360 degrees) and all reasonable sizes (you should probably be able to guess a reasonable range from the images you have).
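The rotate-and-match idea can be illustrated on tiny binary images stored as nested lists. The pixel agreement count below is a stand-in for a real template matching function, and the 3x3 "letters" and function names are made up for the example.

```python
def rotate90(grid):
    """Rotate a square binary image 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def overlap(a, b):
    """Number of pixels on which two equally sized binary images agree."""
    return sum(p == q for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def recognise(image, references):
    """Return (letter, quarter_turns) of the best match over four rotations."""
    best = None
    candidate = image
    for turns in range(4):
        for letter, template in references.items():
            score = overlap(candidate, template)
            if best is None or score > best[0]:
                best = (score, letter, turns)
        candidate = rotate90(candidate)  # try the next orientation of the input
    return best[1], best[2]


references = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
print(recognise(rotate90(references["L"]), references))
```

Here the input is rotated on the fly and compared with upright references only; storing all four orientations of every reference, as in the list above, trades memory for fewer rotations at match time.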
You can use a simple OCR engine like Tesseract. It is simple to use and is quite fast. You'll have to do the 4 rotations though (as mentioned in @jilles de wit's answer).
I made an iOS-app that does just this, based on OpenCV. It's called SnapSolve. I wrote a blog about how the detection works.
Basically, I overlay all 26x4 possible letters + rotations on each shape, and see which letter overlaps most. A little tweak to this is to smooth the overlay image, to get rid of artefacts where letters almost overlap but not quite.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to develop a system, which recognizes various objects present in an image based on their primitive features like texture, shape & color.
The first stage of this process is to extract out individual objects from an image and later on doing image processing on each one by one.
However, the segmentation algorithms I've studied so far are not even close to a perfect, so-called ideal image segmentation algorithm.
Segmentation accuracy will decide how much better the system responds to given query.
Segmentation should be fast as well as accurate.
Can anyone suggest a segmentation algorithm developed or implemented so far which won't be too complicated to implement but will be good enough to complete my project?
Any help is appreciated.
A very late answer, but it might help someone searching for this on Google, since this question popped up as the first result for "best segmentation algorithm".
Fully convolutional networks seem to do exactly the task you're asking for. Check the paper in arXiv, and an implementation in MatConvNet.
The following image illustrates a segmentation example from these CNNs (the paper I linked actually proposes 3 different architectures, FCN-8s being the best).
Unfortunately, the best algorithm type for facial recognition uses wavelet reconstruction. This is not easy, and almost all current algorithms in use are proprietary.
This is a late response, so maybe it's not useful to you, but one suggestion would be to use the watershed algorithm.
Beforehand, take a generic (black-and-white) drawing of a face and generate an FFT of it; call it *FFT_Face*.
Now segment your image of a person's face using the watershed algorithm. Call the segmented image *Water_Face*.
Now find the center of mass of each contour/segment.
Generate an FFT of *Water_Face* and correlate it with *FFT_Face*. The brightest pixel in the resulting image should be the center of the face. Now you can compute the distances between this point and the centers of the segments found earlier. The first few distances should be enough to distinguish one person from another.
I'm sure there are several improvements to the process, but the general idea should get you there.
Doing a Google search turned up this paper: http://www.cse.iitb.ac.in/~sharat/papers/prim.pdf
It seems that getting it any better is a hard problem, so I think you might have to settle for what's there.
You can try the watershed segmentation algorithm.
You can also assess the accuracy of the segmentation algorithm using qualitative measures.