A guide to convert_imageset.cpp

I am relatively new to machine learning/python/ubuntu.
I have a set of images in .jpg format where half contain a feature I want Caffe to learn and half don't. I'm having trouble finding a way to convert them to the required lmdb format.
I have the necessary text input files.
My question is: can anyone provide a step-by-step guide on how to use convert_imageset.cpp in the Ubuntu terminal?
Thanks

A quick guide to Caffe's convert_imageset
Build
The first thing you must do is build Caffe and its tools (convert_imageset is one of these tools).
After installing Caffe and running make, make sure you ran make tools as well.
Verify that a binary file convert_imageset is created in $CAFFE_ROOT/build/tools.
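With the classic Makefile build this typically amounts to the following (a sketch; a CMake build places the binary under build/tools as well):
cd $CAFFE_ROOT
make          # build libcaffe and the main binaries
make tools    # build convert_imageset and the other command-line tools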
Prepare your data
Images: put all images in a folder (I'll call it here /path/to/jpegs/).
Labels: create a text file (e.g., /path/to/labels/train.txt) with a line per input image. For example:
img_0000.jpeg 1
img_0001.jpeg 0
img_0002.jpeg 0
In this example the first image is labeled 1 while the other two are labeled 0.
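If, as in the question, the positives and negatives start out in separate folders, a small shell loop can generate this file for you (a sketch; the positive/ and negative/ subfolder names are assumptions, and the paths written to train.txt must be relative to the image root folder passed to convert_imageset later):
cd /path/to/jpegs
for f in positive/*.jpeg; do echo "$f 1"; done >  /path/to/labels/train.txt
for f in negative/*.jpeg; do echo "$f 0"; done >> /path/to/labels/train.txt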
Convert the dataset
Run the binary in shell
~$ GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
--resize_height=200 --resize_width=200 --shuffle \
/path/to/jpegs/ \
/path/to/labels/train.txt \
/path/to/lmdb/train_lmdb
Command line explained:
Setting GLOG_logtostderr=1 before calling convert_imageset tells the logging mechanism to redirect log messages to stderr.
--resize_height and --resize_width resize all input images to the same size, 200x200.
--shuffle randomly changes the order of images and does not preserve the order given in the /path/to/labels/train.txt file.
Then follow the path to the images folder, the labels text file and the output name. Note that the output directory should not exist prior to calling convert_imageset, otherwise you'll get a scary error message.
Other flags that might be useful:
--backend - allows you to choose between an LMDB dataset and LevelDB.
--gray - convert all images to grayscale.
--encoded and --encode_type - keep image data in encoded (jpg/png) compressed form in the database.
--help - shows some help; see all relevant flags under Flags from tools/convert_imageset.cpp.
You can check out $CAFFE_ROOT/examples/imagenet/create_imagenet.sh for an example of how to use convert_imageset.
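For instance, a grayscale LevelDB variant of the command above would look like this (a sketch; the paths are the same placeholders as before):
GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
--resize_height=200 --resize_width=200 --shuffle \
--gray --backend=leveldb \
/path/to/jpegs/ \
/path/to/labels/train.txt \
/path/to/leveldb/train_leveldb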

Related

Tesseract - Preprocessing that Doesn't Affect Final Image

I'm using the latest version of Tesseract (5.0), and I'm trying to determine whether or not I can insert some preprocessing steps that will -not- affect the form of the final image.
For example, I might start out with an image such as this.
There are different levels of shadow/brightness, so I might use adaptive Gaussian thresholding to avoid shadows during binarization.
I will now run this through Tesseract, with the hope of creating an OCR'd PDF in the end. However, I want the image that the end user (and I) see to be the full-color, original image, with the text from the transformed image underlaid.
Is there a way to manage this? Or am I completely missing the point here?
I was provided an answer on another forum, and wanted to share it here.
Instead of using the built-in PDF option in Tesseract, I used the hOCR setting. My pipeline went:
1. Preprocess the image (thresholding, etc.).
2. Run Tesseract with the following command: tesseract example1.jpg example1 -l eng hocr
3. Use the hocr-pdf module from Ocropus to merge the hOCR'd material with the ORIGINAL IMAGE, no preprocessing.
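Put together, the whole pipeline might look something like this (a sketch: the thresholding command is a placeholder for your own preprocessing, and the hocr-pdf call follows the hocr-tools convention of pairing same-named .jpg and .hocr files in a directory):
# 1. Preprocess a copy for OCR only (placeholder thresholding step).
convert original.jpg -threshold 50% preprocessed.jpg
# 2. OCR the preprocessed copy, producing example1.hocr.
tesseract preprocessed.jpg example1 -l eng hocr
# 3. Pair the ORIGINAL image with the hOCR output and merge into a PDF.
mkdir -p merged
cp original.jpg merged/example1.jpg
cp example1.hocr merged/
hocr-pdf merged > example1.pdf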

Tesseract does not recognize complete image whereas correctly recognizes part of it?

I have to parse some lab reports, and I am using Tesseract to extract data from them. I have encountered an issue: Tesseract does not correctly recognize the text if I pass the entire page's image, but if I pass a small subsection of the page (from "Test Report" covering the entire table till *****), it is able to read all the text correctly.
In the former case (when I pass the entire image) it produces random output of English words which do not make sense. Part of the text is as follows:
Command I ran: tesseract -l eng report.png out
Refierence No : assurcAN, 98941-EU
5:er Nu (SKU) , 95942, 95943
Labelled age gwup “aw
Quamny 20 pweces
Fackagmg pmwosd Yes
Vendor
Manmamurer
But when I pass the subsection, I get accurate results.
What might be the issue here? How do I fix it?
See the sample report image attached to the original question.

Opencv traincascade cannot fill temp stage

So, I have 20 positive samples and 500 negative samples. I created the .vec file using the createsamples utility. Now, when I try to train the classifier using the traincascade.exe utility, I run into the "cannot fill temp stage" error.
I have looked into many solutions given to people who have faced similar issues, but none of them worked.
Things I tried:
1. Increasing the negative sample size.
2. Checking the paths to the negative (background) images stored in the Negative.txt file.
3. Varying different parameters.
Here is some information regarding the paths. My working directory has the following files:
1. traincascade.exe
2. Positive image folder
3. NegativeImageFolder
4. The .vec file
5. Negative.txt (a file that lists the paths to the images in the negative image folder)
My Negative.txt file has the absolute file path for the images in the negative image folder. I also tried changing the file path to the following format:
NegativeImageFolder\Image1.pgm
but that didn't work! I tried both forward slashes and backslashes too!
I have run out of ways to change the file path or make any modification to make this work!
First of all: are numStages 1 and maxDepth 1 intentional?
Looking at OpenCV's source code (cascadeclassifier.cpp, imagestorage.cpp), the error is thrown when, in the function
bool CvCascadeClassifier::updateTrainingSet( double& acceptanceRatio )
the required number of negative samples, negCount=500, cannot be filled.
Up to that point everything was OK with the positive samples (the line about the positive count printed on screen is proof of this).
Digging deeper into the source code, negCount cannot be filled when imgReader.getNeg( img ) returns false; this means it cannot provide any image, which in turn happens when the list of source negatives is empty.
So you have to concentrate all your efforts in the direction of providing the algorithm with the correct list of negative images.
So check two things: that Negative.txt is actually read and all the paths in it are valid, and that every image in the list can be read correctly.
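If you have a Unix-like shell available (e.g. Git Bash on Windows), a quick way to check the second point is a loop like this (a sketch, assuming Negative.txt has one path per line):
# Print every entry in Negative.txt that is not a readable file.
while IFS= read -r img; do
  [ -r "$img" ] || echo "unreadable: $img"
done < Negative.txt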
Is the file name “Negative.txt” or “Negatives.txt”?
Anyway, with so few positive and negative samples you won't train anything that actually works; it is only useful for understanding how the training process works.
Well, I was able to resolve the issue and train the classifier successfully. However, I am not 100% sure how the change I made helped.
This is what I did:
I was generating the Negative.txt file using Excel. I would enter the file path of one image and increment the image filename (since my images were named image1, image2, image3...). So the format, as mentioned earlier, would be:
C:\OpenCV-3.0.0\opencv\build\x64\vc12\bin\Negative\Image1.pgm
And finally I would save the file as a Unicode text document. However, saving it as a Unicode text document gave me the error stated in the question. I saved it as a Text (tab delimited) file instead, and it worked.
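An alternative that sidesteps editor encoding issues entirely is to generate the list from a shell (a sketch; the folder name is the one from the question, and on Windows dir /b /s *.pgm > Negative.txt run inside the folder achieves much the same):
# Write one absolute, plain-ASCII path per line to Negative.txt.
find "$(pwd)/NegativeImageFolder" -type f -name '*.pgm' > Negative.txt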

"Separate image files" and "Image stack" in MicroManager plugin - easy way to convert between the two?

Apologies for tagging this just ImageJ - it's a problem regarding MicroManager, a microscopy plugin for it and I thought this would be best.
I'd recently taken images for an important experiment using MicroManager (a recent version, though I cannot recall the exact number). The IT services at my institution have recently been having some networking problems, and my saved preferences for the software had been erased. I'd got halfway through my experiment when I realised that I'd saved my images as separate image files (three greyscale TIFFs plus metadata text files) instead of OME-TIFF image stacks.
All of my ImageJ macros for image processing rely on having a multiple channel image stack, so this is a bit of a problem. Is there any easy way in MicroManager (or ImageJ) to bulk convert these single channel greyscale images into the OME-TIFF image stack after the images have already been taken?
Cheers.
You can start with a macro like this one:
// Convert your images to a stack
run("Images to Stack", "name=Stack title=[] use");
// The stack will default the images to time points. Convert to channels
run("Stack to Hyperstack...", "order=xyczt(default) channels=3 slices=1 frames=1 display=Color");
// Export as OME-TIFF
run("Bio-Formats Exporter");
This is designed to reconstruct one dataset at a time (open 3 images, run the macro and export the OME-TIFF).
If you don't want any dialogs to show you can pass an output directory to the Bio-Formats exporter:
run("Bio-Formats Exporter", "save=/path/to/image.ome.tif export compression=Uncompressed");
For the output file name, you can get the original image name in the macro with getTitle().
There is also a template example on iterating over all the files in a directory, if you want to completely automate the macro. However, this may take some tweaking, since you want to operate on your images three at a time.
Hope that helps!

Is there any way (command-line tools) to calculate the MD5 hash for .NEF (also .CR2, .TIFF) regardless of any metadata?

Is there any way (command-line tools) to calculate the MD5 hash for .NEF (also .CR2, .TIFF) files regardless of any metadata, e.g. EXIF, IPTC, XMP and so on?
The MD5 hash should stay the same even after we update any metadata inside the image file.
I searched for a while, the closest solution is:
exiftool test.nef -all= -o - -m | md5
but exiftool -all= still keeps a set of EXIF tags in the output file, and the MD5 hash can still change if I update the remaining tags.
ImageMagick has a method for doing exactly this. It is installed on most Linux distros and is available for OSX (ideally via homebrew) and also Windows. It provides a format escape for the image signature, which includes only the pixel data and not the metadata. You use it like this:
identify -format %# _DSC2007.NEF
feb37d5e9cd16879ee361e7987be7cf018a70dd466d938772dd29bdbb9d16610
I know it does what you want: the calculated checksum does not change when you modify the metadata on PNG files, for example, and it calculates the checksum correctly for CR2 and NEF files. However, I am not in the habit of modifying RAW files such as yours, and I have not tested that it does the right thing in that case, though I would be startled if it didn't! So please test before use.
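One way to run that test yourself (a sketch; writing the Artist tag is just an arbitrary metadata edit):
# Compare the pixel-data signature before and after a metadata change.
before=$(identify -format %# test.nef)
exiftool -Artist="signature test" test.nef   # edits metadata; keeps a test.nef_original backup
after=$(identify -format %# test.nef)
[ "$before" = "$after" ] && echo "signature unchanged" || echo "signature CHANGED"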
The reason that there is still some EXIF data left is that the image data of a NEF file (and of similar TIFF-based file types) is located within that EXIF block. Remove it and you have removed the image data. See ExifTool FAQ 7, which has an example shortcut tag that may help you out.
I assume your intention is to verify the actual image data has not been tampered with.
An alternate approach to stripping the meta-data can be to convert the image to a format that has no metadata.
ImageMagick is a well-known open source toolkit (Apache 2 license) for image manipulation and conversion. It provides libraries with various language bindings, as well as command-line tools for various operating systems.
You could try:
convert test.nef bmp:- | md5
This converts test.nef to bmp on stdout and pipes it to md5.
AFAIR bmp has no support for metadata and I'm not sure if ImageMagick even preserves metadata across conversions.
This will only work with single-image files (i.e. not multi-image TIFFs or GIF animations). There is also a slight possibility that two different versions of an image could produce the same conversion because of color-space conversions, but such differences would not be visible.
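If you want to apply this across a folder of raw files, a loop along these lines would do it (a sketch; on Linux substitute md5sum for md5):
# Print a pixel-data-only MD5 digest for every NEF in the current directory.
for f in *.NEF; do
  printf '%s  %s\n' "$(convert "$f" bmp:- | md5)" "$f"
done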
