I came across another thread:
In a digital photo, how can I detect if a mountain is obscured by clouds?
about analysing images
but I couldn't work out how to go from that to what I would like to do, which seems to be somewhat similar.
Basically, I want to take an image of the sky (only 640 x 480) and measure how "blue" it is - or how grey/cloudy. I have plenty of comparison images I could use and am not sure whether to try and use convolution or just some type of histogram measurement.
Ideally, I'd like to come up with a percentage figure which approximates the "blueness" of the image.
Any thoughts/ideas or example commands/scripts would be wonderful.
Thanks for reading.
Andrew
I'm not an image processing specialist, but I would imagine that if you convert the image to the HSV color space, all of the pixels representing an unobstructed part of the sky will have a high saturation and a hue close to blue. I would ignore the value channel because brightness changes with time of day. Just set your hue and saturation thresholds appropriately, count the pixels, and see how many cloud vs. sky pixels you get.
Not sure how well it will work but it's an idea.
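To make the HSV idea concrete, here is a minimal Python sketch using only the standard library. The hue window and the saturation cut-off are my own guesses, not values from this thread, and would need tuning against the comparison images; the pixel list is assumed to have been read from the photo by whatever image library you use.

```python
import colorsys

def blueness_percent(pixels, hue_range=(0.50, 0.72), min_saturation=0.25):
    """Return the percentage of pixels that look like clear blue sky.

    pixels: iterable of (r, g, b) tuples with 0-255 components.
    hue_range and min_saturation are assumed values, to be tuned on real photos.
    """
    total = 0
    blue = 0
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1]; the value channel is ignored
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += 1
        if hue_range[0] <= h <= hue_range[1] and s >= min_saturation:
            blue += 1
    return 100.0 * blue / total if total else 0.0

# Three saturated blue pixels and one pale grey "cloud" pixel
sky = [(70, 130, 220)] * 3 + [(200, 200, 205)]
print(blueness_percent(sky))  # 75.0
```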
I had the same question last month after adding a camera module to my Raspberry Pi-based weather station.
I tested this ImageMagick command:
convert file.jpg -colors 8 -format "%c" histogram:info:
to get the 8 most represented colors of this sky part in RGB format.
I figured out that you can derive a percentage from those three numbers (R, G and B); a grey (cloudy) image comes out at roughly 33% 33% 33%. I didn't do it, but with an awk command or some Perl you should be able to.
But here (in Congo), a blue sky is not so far from a grey sky, so this is not a perfect method for computing a cloudiness factor. Instead, it might not be a bad idea to compute the variance to see whether you have clouds or a uniformly colored sky. Or maybe compute both and see what happens. If you do that several times an hour, you can track how it evolves.
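Both measures mentioned above are easy to compute once the pixels are in hand. This is only a sketch in Python (standard library only); the function names are made up for illustration, and it works on a plain list of (r, g, b) tuples rather than on ImageMagick's histogram output.

```python
from statistics import pvariance

def channel_shares(pixels):
    """Percentage contributed by R, G and B to the sum of all channel values."""
    sums = [0, 0, 0]
    for px in pixels:
        for i in range(3):
            sums[i] += px[i]
    total = sum(sums)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(100.0 * s / total for s in sums)

def brightness_variance(pixels):
    """Population variance of per-pixel mean brightness (0 for a uniform sky)."""
    return pvariance([sum(px) / 3.0 for px in pixels])

grey_sky = [(120, 120, 120)] * 4
patchy_sky = [(60, 120, 180), (230, 230, 235)] * 2
print(channel_shares(grey_sky), brightness_variance(grey_sky))
print(channel_shares(patchy_sky), brightness_variance(patchy_sky))
```

A uniform grey sky gives equal shares and zero variance; a sky with bright clouds on blue pushes the variance up even when the shares stay close to one third each.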
After reading EternityForest's answer, I figured out that you can use
-colorspace HSB
or
-colorspace HSL
in my previous command. If you put 1 instead of 8, you will get the most represented value.
There are some discussions about this on the ImageMagick forum.
I will stay connected to this thread because I'm really interested in solving this cloudiness question too!
Good luck,
Greg.
Related
I have a large number of grayscale images that show bright "fibers" on a darker background. I am trying to quantify the "amount" of fibers. Since they overlap almost everywhere it will be impossible to count the number of fibers, so instead I want to resort to simply calculating how large the area fraction of the white fibers is compared to the full image (e.g. this one is 55% white, another one with fewer fibers is only 43% white, etc). In other words, I want to quantify the density of the fibers in the image.
Example pictures:
High density: https://dl.dropboxusercontent.com/u/14309718/f1.jpg
Lower density: https://dl.dropboxusercontent.com/u/14309718/f2.jpg
I figured a simple (adaptive) threshold filter would do the job nicely by just converting the image to purely black/white and then counting the fraction of white pixels. However, my answer seems to depend almost completely and only on the threshold value that I choose. I did some quick experiments by taking a large number of different thresholds and found that in all pictures the fraction of white pixels is almost exactly a linear function of the threshold value. In other words - I can get any answer I want between roughly 10% and 90% depending on the threshold I choose.
This is obviously not a good approach because my results are extremely biased by how I choose the threshold and are therefore completely useless. Furthermore I have about 100 of these images and I'm not looking forward to trying to choose the "correct" threshold for all of them manually.
How can I improve this method?
As the images are complex and the outlines of the fibers are fuzzy, there is little hope of getting an "exact" measurement.
What matters then is to achieve repeatability, i.e. ensure that the same fiber density is always assigned the same measurement, even in varying lighting conditions if possible, and different densities are assigned different measurements.
This rules out human intervention in adjusting a threshold.
My best advice is to rely on Otsu thresholding, which is very good at finding meaningful background and foreground intensities and is fairly illumination-independent.
Enhancing the contrast before Otsu should be avoided because binarization commutes with contrast enhancement (so that there is no real benefit), but contrast enhancement can degrade the image by saturating at places.
Just echoing @YvesDaoust's thoughts really - and providing some concrete examples...
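For readers who want to see what Otsu's method actually computes, here is a minimal pure-Python sketch. It assumes the image has already been flattened into a list of 8-bit grey values; `white_fraction` is a helper name I made up for the area-fraction measurement asked about in the question.

```python
def otsu_threshold(gray_values):
    """Return the grey level t (0-255) that maximises the between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w_bg = 0    # number of pixels in the background class
    sum_bg = 0  # sum of grey levels in the background class
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue  # one class is empty, so the variance is undefined
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Proportional to the between-class variance (constant factor dropped)
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t

def white_fraction(gray_values, t):
    """Fraction of pixels strictly brighter than the threshold t."""
    values = list(gray_values)
    return sum(1 for v in values if v > t) / len(values)

pixels = [10] * 60 + [200] * 40   # toy bimodal "image"
t = otsu_threshold(pixels)
print(t, white_fraction(pixels, t))
```

Because the threshold is derived from the histogram alone, the same image always yields the same measurement, which is the repeatability argued for above.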
You can generate histograms of your images using ImageMagick which is installed on most Linux distros and is available for OSX and Windows. I am just doing this at the command-line but it is powerful and easy to run some tests and see how Yves' suggestion works for you.
# Make histograms for both images
convert 1.jpg histogram:h1.png
convert 2.jpg histogram:h2.png
Yes, they are fairly bimodal - so Otsu thresholding should find a threshold that maximises the between-class variance. Use the script otsuthresh from Fred Weinhaus' website here
./otsuthresh 1.jpg 1.gif
Thresholding Image At 44.7059%
./otsuthresh 2.jpg 2.gif
Thresholding Image At 42.7451%
Count percentage of white pixels in each image:
convert 1.gif -format "%[fx:int(mean*100)]" info:
50
convert 2.gif -format "%[fx:int(mean*100)]" info:
48
Not that brilliant a distinction! Mmmm... I tried adding in a median filter to reduce the noise, but that didn't help. Do you have your images available as PNG to avoid the nasty artefacts?
Is there a way to convert a gray-scale image to a colour image?
Here are a few JPG examples
Photo 1
Photo 2
Photo 3
ImageMagick is powerful but doesn't seem capable of converting to a colourful version.
You cannot accurately re-create the information that was lost by aggregating the three channels together when converting to greyscale. You can, however, generate a false colour and you can also make some assumptions that may be fairly reasonable for some/many pictures.
Generally, when doing these types of things, you create a LUT (Look-up Table) and use the grey value at each pixel location to look-up a replacement colour. So, if we make a LUT that goes from Red through Green through Blue, the dark tones in your image will be mapped to Red, the midtones to Green and the highlights to Blue. Let's try it:
convert -size 1x1! xc:red xc:lime xc:blue +append -resize 255x1! rainbowCLUT.png
If we now apply that to your image:
convert yourImage.jpg rainbowCLUT.png -clut result.png
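The look-up step itself is simple enough to sketch in a few lines of Python. This is not what ImageMagick does internally (its -resize uses a proper resampling filter); it is a plain linear interpolation between colour stops, with function names invented for the illustration.

```python
def gradient_lut(stops, size=256):
    """Build a LUT of `size` RGB tuples by linear interpolation between colour stops."""
    lut = []
    segments = len(stops) - 1
    for i in range(size):
        pos = i / (size - 1) * segments   # position measured in segments
        k = min(int(pos), segments - 1)   # index of the left-hand stop
        frac = pos - k
        a, b = stops[k], stops[k + 1]
        lut.append(tuple(round(a[c] + (b[c] - a[c]) * frac) for c in range(3)))
    return lut

def apply_lut(gray_values, lut):
    """Replace each grey value (0-255) with the colour stored at that index."""
    return [lut[v] for v in gray_values]

rainbow = gradient_lut([(255, 0, 0), (0, 255, 0), (0, 0, 255)])
print(apply_lut([0, 255], rainbow))  # [(255, 0, 0), (0, 0, 255)]
```

Dark greys land near the red end, mid-greys near green and highlights near blue, matching the description above.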
Ok, we have now got colour but that is not very realistic. So, to do it better, we need to start making some assumptions. One assumption might be that there is something pretty black in the picture (i.e. a tie, a jacket, some dark hair, or a deep shadow somewhere), another assumption might be that there is probably a white highlight somewhere in the image (i.e. a white background, the whites of an eye) and finally we assume that there is some flesh tone somewhere in the middle. So, let's make a CLUT that looks like that, i.e. it goes from solid black through a flesh tone in the middle to a white highlight:
convert -size 128x1! gradient:black-"rgb(210,160,140)" gradient:"rgb(210,160,140)"-white +append clut.png
(I have put a 1 pixel wide red border around it just so you can see it on StackOverflow's white background)
Now we can apply it to your images:
convert yourImage.jpg -normalize clut.png -clut result.png
Note how I also used -normalize to try and make the image fit my assumption that there was a solid black tone in the image and a solid white highlight.
This technique is only an attempt at re-creating something that is no longer in the image so it will not always work. Of course, if you know extra information about your subjects, the lighting and so on, you could build that into the LUT.
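As a rough Python illustration of the two steps (stretch the greys to the full range, then map through black, flesh tone, white): the stretch below is a plain min/max stretch, whereas ImageMagick's -normalize, as I understand it, also clips a small fraction of pixels at each end. The function names are hypothetical.

```python
FLESH = (210, 160, 140)  # the flesh tone used in the CLUT above

def stretch(gray_values):
    """Linearly stretch grey values so the darkest becomes 0 and the brightest 255."""
    lo, hi = min(gray_values), max(gray_values)
    if hi == lo:
        return list(gray_values)  # flat image: nothing to stretch
    return [round((v - lo) * 255 / (hi - lo)) for v in gray_values]

def flesh_clut(v):
    """Map a grey value 0-255 to black -> flesh tone -> white."""
    if v < 128:
        f = v / 127.0                     # 0 at black, 1 at the flesh tone
        return tuple(round(c * f) for c in FLESH)
    f = (v - 128) / 127.0                 # 0 at the flesh tone, 1 at white
    return tuple(round(c + (255 - c) * f) for c in FLESH)

greys = stretch([50, 100, 250])
print(greys, [flesh_clut(v) for v in greys])
```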
Unfortunately this is not possible. Grayscale images do not contain sufficient information to create a color image. In instances where you see B&W/Grayscale images converted to color, this has been done manually in an application such as Photoshop.
You can use ImageMagick to apply a filter but you cannot re-introduce color.
http://www.imagemagick.org/Usage/color_mods/#level-colors
To convert an image to grayscale you just take the average of the R, G and B values of each pixel and set R, G and B to that value. Therefore it is pretty much impossible to convert it back to color. The key words are "pretty much": I'm sure someone will eventually invent some complex algorithm that looks at the surrounding pixels and their averages and infers roughly what color belongs in that area. But for now I don't think it's possible to do such a thing, unfortunately. Sorry.
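A tiny Python example shows why the averaging described above cannot be undone: very different colours collapse onto the same grey. (Many tools use a weighted luminance rather than a plain average, but the loss of information is the same.)

```python
def to_gray(rgb):
    """Plain average of the three channels, as described above."""
    r, g, b = rgb
    return (r + g + b) // 3

# Pure red, pure green and pure blue all end up as the same grey value
print([to_gray(c) for c in [(255, 0, 0), (0, 255, 0), (0, 0, 255)]])  # [85, 85, 85]
```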
Take a look at siggraph2016_colorization.
I did not try it, but it seems interesting.
They present a novel technique to automatically colorize grayscale images that combines both global priors and local image features.
Colorization Architecture:
Their model consists of four main components: a low-level features network, a mid-level features network, a global features network, and a colorization network. The components are all tightly coupled and trained in an end-to-end fashion. The output of their model is the chrominance of the image, which is fused with the luminance to form the output image.
Here is a sample:
I have this image here:
http://imgur.com/QFSimZX
When looking at it, a human can see that it says PINE (N) on the top line and PI on the second line. The problem I have is that when I use tesseract-ocr to read the text, the output is pretty bad. I have a lot of images like this and need to automate this process, so doing it manually is not ideal. I have used ImageMagick to get it into its current state, but would like to know if there is any way to make this image more readable, possibly by connecting the close areas of black. I know almost nothing about image manipulation so I don't know where to begin searching. If anyone knows a method for making this more readable I would greatly appreciate it.
This is a pretty tricky problem, and the solution that works best will depend sensitively on characteristics of the image - what scale is the type? how degraded is the image? The boundary between details that you want to keep and degradation that you want to fix is something that only the human operator can decide, so there is no automated one-size-fits-all solution for this problem, and you should expect to do some experimentation.
The basic technique is that you want to adjust the value of each pixel in the image to be similar to the pixels that surround it. Put in those terms, you might realise that this is just a blur operation. After you blur the image though, you are left with letters with fuzzy edges, so to get crisp letters again, that's a threshold operation - you set a threshold level of gray, and everything lighter than that shade of gray becomes white and everything darker than the threshold becomes black. The blur plus threshold combination gives you a wide range of effects that you can use to make text more (or less) legible. For the example image given, I had pretty good results with a blur radius of 5 and a threshold level of 70%.
convert QFSimZX.jpg -blur 5 -threshold 70% output.png
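To show why blur plus threshold joins up broken strokes, here is a small pure-Python sketch. It uses a box (mean) blur as a stand-in for ImageMagick's Gaussian blur, so the numbers will not match the command above; the image is assumed to be a list of rows of 0-255 grey values.

```python
def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window, truncated at the image edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out

def threshold(img, percent):
    """Pixels lighter than `percent` of full scale become white, the rest black."""
    cut = 255 * percent / 100.0
    return [[255 if v > cut else 0 for v in row] for row in img]

# A black stroke with a one-pixel white gap in the middle
stroke = [[0, 0, 255, 0, 0]]
print(threshold(box_blur(stroke, 1), 70))  # [[0, 0, 0, 0, 0]]
```

The isolated white pixel is averaged down below the threshold, so the gap in the stroke closes; a larger radius bridges larger gaps at the cost of merging nearby letters.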
You can get more sophisticated than this if needed, by implementing a custom blur function with the -fx operator. Fx is powerful but somewhat complex, and you can read about it here: http://www.imagemagick.org/script/fx.php . I tried a quick fx expression that filled in a pixel based first on its above and below neighbors, then on its left and right neighbors. This technique really allows you to fine tune which pixels are considered in computing the blur:
convert QFSimZX.jpg -monochrome \
-fx 'p[0,-1]+p[0,1] >= 2 ? 1 : 0' \
-fx 'p[-1,0]+p[1,0] >= 2 ? 1 : 0' \
output.png
This may be a stupid question, but after reading and searching a lot about image processing, every example I see works on gray scale images.
I understand that gray scale images use just one channel, which normally needs only 8 bits per pixel, etc. But why use gray scale when we have a color image? What are the advantages of gray scale? I could imagine it is because there are fewer bits to process, but is that still necessary with today's faster computers?
I am not sure if I have explained my doubt clearly; I hope someone can answer.
Thank you very much
As explained by John Zhang:
luminance is by far more important in distinguishing visual features
John also gives an excellent suggestion to illustrate this property: take a given image and separate the luminance plane from the chrominance planes.
To do so you can use ImageMagick's -separate operator, which extracts the current contents of each channel as a gray-scale image:
convert myimage.gif -colorspace YCbCr -separate sep_YCbCr_%d.gif
Here's what it gives on a sample image (top-left: original color image, top-right: luminance plane, bottom row: chrominance planes):
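Independently of the sample image, the separation can be sketched in Python. The coefficients below are the JPEG/JFIF (BT.601, full range) ones; I am assuming rather than asserting that ImageMagick's YCbCr conversion matches them exactly.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) using the JFIF coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def separate(pixels):
    """Split a list of RGB pixels into three planes: luminance and two chrominance."""
    planes = ([], [], [])
    for r, g, b in pixels:
        for plane, value in zip(planes, rgb_to_ycbcr(r, g, b)):
            plane.append(value)
    return planes

y_plane, cb_plane, cr_plane = separate([(100, 100, 100), (255, 0, 0)])
print(y_plane, cb_plane, cr_plane)
```

For a neutral grey pixel both chrominance values sit at the midpoint (128), so all of the information is in the luminance plane.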
To elaborate a bit on deltheil's answer:
Signal to noise. For many applications of image processing, color information doesn't help us identify important edges or other features. There are exceptions. If there is an edge (a step change in pixel value) in hue that is hard to detect in a grayscale image, or if we need to identify objects of known hue (orange fruit in front of green leaves), then color information could be useful. If we don't need color, then we can consider it noise. At first it's a bit counterintuitive to "think" in grayscale, but you get used to it.
Complexity of the code. If you want to find edges based on luminance AND chrominance, you've got more work ahead of you. That additional work (and additional debugging, additional pain in supporting the software, etc.) is hard to justify if the additional color information isn't helpful for applications of interest.
For learning image processing, it's better to understand grayscale processing first and understand how it applies to multichannel processing rather than starting with full color imaging and missing all the important insights that can (and should) be learned from single channel processing.
Difficulty of visualization. In grayscale images, the watershed algorithm is fairly easy to conceptualize because we can think of the two spatial dimensions and one brightness dimension as a 3D image with hills, valleys, catchment basins, ridges, etc. "Peak brightness" is just a mountain peak in our 3D visualization of the grayscale image. There are a number of algorithms for which an intuitive "physical" interpretation helps us think through a problem. In RGB, HSI, Lab, and other color spaces this sort of visualization is much harder since there are additional dimensions that the standard human brain can't visualize easily. Sure, we can think of "peak redness," but what does that mountain peak look like in an (x,y,h,s,i) space? Ouch. One workaround is to think of each color variable as an intensity image, but that leads us right back to grayscale image processing.
Color is complex. Humans perceive color and identify color with deceptive ease. If you get into the business of attempting to distinguish colors from one another, then you'll either want to (a) follow tradition and control the lighting, camera color calibration, and other factors to ensure the best results, or (b) settle down for a career-long journey into a topic that gets deeper the more you look at it, or (c) wish you could be back working with grayscale because at least then the problems seem solvable.
Speed. With modern computers, and with parallel programming, it's possible to perform simple pixel-by-pixel processing of a megapixel image in milliseconds. Facial recognition, OCR, content-aware resizing, mean shift segmentation, and other tasks can take much longer than that. Whatever processing time is required to manipulate the image or squeeze some useful data from it, most customers/users want it to go faster. If we make the hand-wavy assumption that processing a three-channel color image takes three times as long as processing a grayscale image--or maybe four times as long, since we may create a separate luminance channel--then that's not a big deal if we're processing video images on the fly and each frame can be processed in less than 1/30th or 1/25th of a second. But if we're analyzing thousands of images from a database, it's great if we can save ourselves processing time by resizing images, analyzing only portions of images, and/or eliminating color channels we don't need. Cutting processing time by a factor of three to four can mean the difference between running an 8-hour overnight test that ends before you get back to work, and having your computer's processors pegged for 24 hours straight.
Of all these, I'll emphasize the first two: make the image simpler, and reduce the amount of code you have to write.
I disagree with the implication that gray scale images are always better than color images; it depends on the technique and the overall goal of the processing. For example, if you wanted to count the bananas in an image of a fruit bowl, then it's much easier to segment when you have a colored image!
Many images have to be in grayscale because of the measuring device used to obtain them. Think of an electron microscope. It's measuring the strength of an electron beam at various points in space. An AFM is measuring the amount of resonance vibrations at various points topologically on a sample. In both cases, these tools are returning a single value, an intensity, so they are implicitly creating a gray-scale image.
For image processing techniques based on brightness, they often can be applied sufficiently to the overall brightness (grayscale); however, there are many many instances where having a colored image is an advantage.
Binary might be too simple and could fail to represent the character of the picture.
Color might be too much and affect the processing speed.
Thus grayscale is chosen, as it sits between the two extremes.
Before starting image processing, whether on gray scale or color images, it is better to focus on the application at hand. If we choose one of them at random, it will cause accuracy problems in our results. For example, if I want to process an image of a waste bin, I would choose gray scale rather than color, because in the bin image I only want to detect the shape of the bin using optimized edge detection. I don't care about the color of the image, but I do want to see the rectangular shape of the bin correctly.
I am trying to teach my camera to be a scanner: I take pictures of printed text and then convert them to bitmaps (and then to djvu and OCR'ed). I need to compute a threshold for which pixels should be white and which black, but I'm stymied by uneven illumination. For example if the pixels in the center are dark enough, I'm likely to wind up with a bunch of black pixels in the corners.
What I would like to do, under relatively simple assumptions, is compensate for uneven illumination before thresholding. More precisely:
Assume one or two light sources, maybe one with gradual change in light intensity across the surface (ambient light) and another with an inverse square (direct light).
Assume that the white parts of the paper all have the same reflectivity/albedo/whatever.
Find some algorithm to estimate degree of illumination at each pixel, and from that recover the reflectivity of each pixel.
From a pixel's reflectivity, classify it white or black
I have no idea how to write an algorithm to do this. I don't want to fall back on least-squares fitting since I'd somehow like to ignore the dark pixels when estimating illumination. I also don't know if the algorithm will work.
All helpful advice will be upvoted!
EDIT: I've definitely considered chopping the image into pieces that are large enough so they still look like "text on a white background" but small enough so that illumination of a single piece is more or less even. I think if I then interpolate the thresholds so that there's no discontinuity across sub-image boundaries, I will probably get something halfway decent. This is a good suggestion, and I will have to give it a try, but it still leaves me with the problem of where to draw the line between white and black. More thoughts?
EDIT: Here are some screen dumps from GIMP showing different histograms and the "best" threshold value (chosen by hand) for each histogram. In two of the three a single threshold for the whole image is good enough. In the third, however, the upper left corner really needs a different threshold:
I'm not sure whether you still need a solution after all this time, but in case you do: a few years ago my team and I photographed about 250,000 pages with a camera and converted them to (almost black and white) grey scale images, which we then DjVued (and also made PDFs of).
(See The catalogue and complete collection of photographic facsimiles of the 1144 paper transcripts of the French Institute of Pondicherry.)
We also ran into the problem of uneven illumination. We came up with a simple unsophisticated solution which worked very well in practice. This solution should also work to create black and white images rather than grey scale (as I'll describe).
The camera and lighting setup
a) We taped an empty picture frame to the top of a table to keep our pages in the exact same position.
b) We put a camera on a tripod, also on top of the table, above and pointing down at the taped picture frame. On a bar about a foot wide, attached to the external flash holder on top of the camera, we mounted two "modelling lights". These can be purchased at any good camera shop. They are designed to provide even illumination. The camera was shaded from the lights by putting a small cardboard box around each modelling light. We photographed in greyscale, which we then further processed. (Our pages were old browned paper with blue ink writing, so your case should be simpler.)
Processing of the images
We used the free software package IrfanView.
This software has a batch mode which can simultaneously do color correction, change the bit depth and crop the images. We would take the photograph of a page and then in interactive mode adjust the brightness, contrast and gamma settings till it was close to black and white. (We used greyscale but by setting the bit depth to 2 you will get black and white when you batch process all the pages.)
After determining the best color correction we then interactively cropped a single image and noted the cropping settings. We then set all these settings in the batch mode window and processed the pages for one book.
Creating DjVu images.
We used the free DjVu Solo 3.1 to create the DjVu images. This has several modes to create the DjVu images. The mode which creates black and white images didn't work well for us for photographs, but the "photo" mode did.
We didn't OCR (since the images were handwritten Sanskrit), but as long as the letters are evenly illuminated I think your OCR software should ignore big black areas such as the gap between the pages of a two-page spread. You can always get rid of the black between the pages or at the edges by cropping the pages twice, once for the left-hand pages and once for the right-hand pages. IrfanView will let you number your pages cleverly so that you can then remerge them in the correct order, i.e. rename your pages something like page-xxxA for left-hand pages and page-xxxB for right-hand pages, and the pages will then sort correctly by name.
If you still need a solution I hope some of the above is useful to you.
I would recommend calibrating the camera, assuming that your lighting setup is fixed (that is, the lights do not move between pictures) and your camera is grayscale (not color).
Take a picture of a white sheet of paper which covers the whole workable area of your "scanner". Store this picture; it tells you what white paper looks like at each pixel. Now, when you take a picture of a document to scan, you can reload your "white reference picture" and even out the illumination before performing a threshold.
Let's call the white reference REF, the picture DOC, the evenly illuminated picture EVEN, and the maximum value of a pixel MAX (for 8-bit imaging, it is 255). For each pixel:
EVEN = DOC * (MAX/REF)
notes:
Beware of the parentheses: most image processing libraries use the image pixel type for computations on pixel values, and a plain multiplication will overflow your pixel. If necessary, write the loop yourself and use a 32-bit integer for intermediate computations.
The white reference image can be smoothed before being used in the process. Any smoothing or blurring filter will do, and don't hesitate to apply it aggressively.
The MAX value in the formula above represents the target pixel value in the resulting image. Using the maximum pixel value targets a bright white, but you can adjust this value to target a light gray instead.
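The formula and the notes above translate into a short Python sketch. Python integers do not overflow, so the 32-bit intermediate mentioned above is not an issue here; clamping the result and guarding against a zero reference pixel are my own additions, not part of the original formula.

```python
def even_illumination(doc, ref, max_value=255):
    """Apply EVEN = DOC * MAX / REF per pixel, clamped to [0, max_value].

    doc and ref are lists of rows of integer grey values with the same shape.
    """
    out = []
    for doc_row, ref_row in zip(doc, ref):
        row = []
        for d, r in zip(doc_row, ref_row):
            r = max(r, 1)  # assumption: treat a pure-black reference pixel as 1
            # Multiply first, then divide, so no precision is lost to truncation
            row.append(min(max_value, d * max_value // r))
        out.append(row)
    return out

# The same 50%-grey ink photographed under bright, dim and very dim light
doc = [[100, 50], [200, 20]]
ref = [[200, 100], [200, 40]]
print(even_illumination(doc, ref))  # [[127, 127], [255, 127]]
```

After correction, pixels with the same reflectivity end up with the same value regardless of how strongly they were lit, so a single global threshold becomes workable.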
Well. Usually the image processing I do is highly time sensitive, so a complex algorithm like the one you're seeking wouldn't work. But . . . have you considered chopping the image up into smaller pieces, and re-scaling each sub-image? That should make the 'dark' pixels stand out fairly well even in an image of variable lighting conditions (I am assuming here that you are talking about a standard mostly-white page with dark text.)
It's a cheat, but a lot easier than the 'right' way you're suggesting.
This might be horrendously slow, but what I'd recommend is to break the scanned surface into quarters/16ths and re-color them so that the average grayscale level is similar across the page. (Might break if you have pages with large margins though)
I assume that you are taking images of (relatively) small black letters on a white background.
One approach could be to "remove" the small black objects, while keeping the illumination variations of the background. This gives an estimate of how the image is illuminated, which can be used for normalizing the original image. It is often enough to subtract the illumination estimate from the original image and then do a threshold based segmentation.
This approach is based on gray scale morphological filters, and could be implemented in matlab like below:
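img = imread('filename.png');                    % assumed to be a grayscale uint8 image
illumination = imclose(img, strel('disk', 10));  % closing removes the small dark text, keeping the background
imgCorrected = illumination - img;               % this order avoids uint8 saturation at 0; text becomes bright
thresholdValue = graythresh(imgCorrected);       % Otsu level, normalised to [0, 1]
bw = im2bw(imgCorrected, thresholdValue);        % true where there is text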
For an example with real images take a look at this guide from mathworks. For further reading about the use of morphological image analysis this book by Pierre Soille can be recommended.
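For those without MATLAB, the same idea can be sketched in pure Python. This version uses a square window instead of a disk and truncates the window at the image border, both simplifications of mine; the function names are hypothetical.

```python
def _window_filter(img, radius, pick):
    """Apply `pick` (max or min) over a square window around every pixel."""
    h, w = len(img), len(img[0])
    return [[pick(img[j][i]
                  for j in range(max(0, y - radius), min(h, y + radius + 1))
                  for i in range(max(0, x - radius), min(w, x + radius + 1)))
             for x in range(w)]
            for y in range(h)]

def closing(img, radius):
    """Grey-scale closing: dilation (local max) followed by erosion (local min)."""
    return _window_filter(_window_filter(img, radius, max), radius, min)

def remove_background(img, radius):
    """Background estimate minus image: dark text becomes bright on a dark field."""
    background = closing(img, radius)
    return [[b - v for b, v in zip(bg_row, row)]
            for bg_row, row in zip(background, img)]

# One row of paper that gets darker to the right, with a single dark "ink" pixel
row = [[200, 190, 180, 40, 160, 150, 140]]
print(remove_background(row, 1))
```

The illumination gradient is almost entirely removed while the ink pixel stands out strongly, so a single threshold (for example Otsu's) can then be applied to the result.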
Two algorithms come to my mind:
High-pass to alleviate the low-frequency illumination gradient
Local threshold with an appropriate radius
Adaptive thresholding is the keyword. Quote from a 2003 article by R.
Fisher, S. Perkins, A. Walker, and E. Wolfart: “This more sophisticated version
of thresholding can accommodate changing lighting conditions in the image, e.g.
those occurring as a result of a strong illumination gradient or shadows.”
ImageMagick's -lat option can do it, for example:
convert -lat 50x50-2000 input.jpg output.jpg
input.jpg
output.jpg
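The principle behind a local adaptive threshold can be sketched in pure Python: each pixel is compared with the mean of its neighbourhood plus an offset, rather than with one global value. The window radius and offset below are illustrative and on a 0-255 scale, so they do not correspond to the 50x50-2000 used in the command above.

```python
def local_adaptive_threshold(img, radius, offset):
    """White if a pixel is lighter than its local mean plus `offset`, else black."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            row.append(255 if img[y][x] > local_mean + offset else 0)
        out.append(row)
    return out

# Paper that fades from 200 to 120, with ink pixels at positions 2 and 7
page = [[200, 200, 60, 200, 180, 160, 140, 30, 140, 120]]
print(local_adaptive_threshold(page, 1, -15))
```

The negative offset keeps plain paper white even though it is exactly at its own local mean, while ink stays black in both the bright and the dim part of the page.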
You could try using an edge detection filter, then a floodfill algorithm, to distinguish the background from the foreground. Interpolate the floodfilled region to determine the local illumination; you may also be able to modify the floodfill algorithm to use the local background value to jump across lines and fill boxes and so forth.
You could also try a Threshold Hysteresis with a rate of change control. Here is the link to the normal Threshold Hysteresis. Set the first threshold to a typical white value. Set the second threshold to less than the lowest white value in the corners.
The difference is that you want to check the difference between pixels for all values in between the first and second threshold. Ideally if the difference is positive, then act normally. But if it is negative, you only want to threshold if the difference is small.
This will be able to compensate for lighting variations, but will ignore the large changes between the background and the text.
Why don't you use simple opening and closing operations?
Try this and just look at the results:
src - source image
src - open(src)
close(src) - src
and look at the close(src) - src result.
Using different window sizes, you will get the background of the image.
I think this helps.