Create LMDB for Imageset - machine-learning

I'm quite new to the Caffe framework. I'm trying to create an LMDB database for a face image dataset. I downloaded the face image set from here. The dataset comes with a CSV file containing face and person bounding-box coordinates for each image, and I downloaded the images themselves. How do I create an LMDB database for these images?
I saw this command:
~$ GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
--resize_height=200 --resize_width=200 --shuffle \
/path/to/jpegs/ \
/path/to/labels/train.txt \
/path/to/lmdb/train_lmdb
But I don't know how to create this /labels/train.txt file, or what it means. I saw that each line has a space followed by a number, and I don't know what that number means.
The only labels I want are person and face. So how do I create train.txt, and from there, how do I create the LMDB? Please help me find the right answer.

I just went through this one and got some results.

Related

Tesseract - Preprocessing that Doesn't Affect Final Image

I'm using the latest version of Tesseract (5.0), and I'm trying to determine whether or not I can insert some preprocessing steps that will -not- affect the form of the final image.
For example, I might start out with an image such as this.
There are different levels of shadow/brightness, so I might use adaptive Gaussian thresholding to avoid shadows during binarization.
I will now run this through Tesseract, with the hope of creating an OCR'd PDF in the end. However, I want the image that the end user (and I) see to be the full-color original image, with the text from the transformed image underlaid.
Is there a way to manage this? Or am I completely missing the point here?
I was provided an answer on another forum, and wanted to share it here.
Instead of using the built-in PDF option in Tesseract, I used the hOCR setting. My pipeline went:
Preprocess image (thresholding, etc)
Run tesseract with the following command: tesseract example1.jpg example1 -l eng hocr
Use the hocr-pdf module from Ocropus to merge the hOCR output with the ORIGINAL IMAGE, no preprocessing.
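For reference, here is a minimal Python sketch of steps 1 and 2, assuming OpenCV and the tesseract CLI are installed; the file names are illustrative:
import subprocess
import cv2

# Step 1: adaptive Gaussian thresholding to suppress uneven shadows.
img = cv2.imread("example1.jpg", cv2.IMREAD_GRAYSCALE)
binarized = cv2.adaptiveThreshold(
    img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,
    31, 10)  # blockSize=31, C=10; tune per image set
cv2.imwrite("example1_bin.jpg", binarized)

# Step 2: produce hOCR output (example1_bin.hocr) from the binarized image.
subprocess.run(
    ["tesseract", "example1_bin.jpg", "example1_bin", "-l", "eng", "hocr"],
    check=True)

# Step 3 (not shown): merge example1_bin.hocr with the ORIGINAL example1.jpg
# using Ocropus' hocr-pdf, so the visible page stays untouched.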

How to use a retrained "tensorflow for poets" graph on iOS?

With "tensorflow for poets", I retrained the inceptionv3 graph. Now I want to use tfcoreml converter to convert the graph to an iOS coreML model.
But tf_coreml_converter.py stops with "NotImplementedError: Unsupported Ops of type: PlaceholderWithDefault".
I already tried "optimize_for_inference" and "strip_unused", but I can't get rid of this unsupported op "PlaceholderWithDefault".
Any idea what steps are needed after training in tensorflow-for-poets, to convert a "tensorflow-for-poets" graph (inceptionv3) to an iOS coreML model?
I succeeded in removing the PlaceholderWithDefault op from the retrained TensorFlow for Poets graph with these steps:
Optimize the graph for inference:
python -m tensorflow.python.tools.optimize_for_inference \
--input retrained_graph.pb \
--output graph_optimized.pb \
--input_names=Mul \
--output_names=final_result
Remove PlaceholderWithDefault op with transform_graph tool:
bazel build tensorflow/tools/graph_transforms:transform_graph
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=graph_optimized.pb \
--out_graph=graph_optimized_stripped.pb \
--inputs='Mul' \
--outputs='final_result' \
--transforms='remove_nodes(op=PlaceholderWithDefault)'
Afterwards I could convert it to Core ML. But as Matthijs already pointed out, the latest version of tfcoreml from GitHub does it automatically.
Whoever created this graph used tf.placeholder_with_default() to define the placeholder (a placeholder in TF is used for the inputs to the neural network). Since tf-coreml does not support the PlaceholderWithDefault op, you cannot use this graph.
Possible solutions:
Define the placeholders using tf.placeholder() instead. The problem is that you'll need to retrain the graph from scratch since Tensorflow for Poets uses a pretrained graph and you can no longer use that.
Hack the graph to replace the PlaceholderWithDefault op with Placeholder (see the sketch after this list).
Hack tf-coreml to use a Placeholder op whenever it encounters a PlaceholderWithDefault op. This is probably the quickest solution.
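For option 2, here is a rough, untested sketch of the graph hack, assuming a frozen graph and the TF 1.x API; the file names are illustrative:
import tensorflow as tf

# Load the frozen GraphDef.
graph_def = tf.GraphDef()
with tf.gfile.GFile("graph_optimized.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Rewrite every PlaceholderWithDefault node as a plain Placeholder.
for node in graph_def.node:
    if node.op == "PlaceholderWithDefault":
        node.op = "Placeholder"
        del node.input[:]  # Placeholder takes no inputs
        for attr in list(node.attr):
            if attr not in ("dtype", "shape"):  # keep only Placeholder's attrs
                del node.attr[attr]

with tf.gfile.GFile("graph_fixed.pb", "wb") as f:
    f.write(graph_def.SerializeToString())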
Update: From the code, it looks like a recent update to tf-coreml now simply skips the PlaceholderWithDefault layer. It should no longer give an error message. So if you use the latest version of tf-coreml (not using pip but by checking out the master branch of the GitHub repo) then you should no longer get this error.
import tfcoreml as tf_converter

tf_converter.convert(tf_model_path='/Users/username/path/tf_files/retrained_graph.pb',
                     mlmodel_path='MyModel.mlmodel',
                     output_feature_names=['final_result:0'],
                     input_name_shape_dict={'input:0': [1, 224, 224, 3]},
                     image_input_names=['input:0'],
                     class_labels='/Users/username/path/tf_files/retrained_labels.txt',
                     image_scale=2/255.0,
                     red_bias=-1,
                     green_bias=-1,
                     blue_bias=-1)
Using tfcoreml, I found success with the settings above.

Is there video change detection software available?

We have a video file from a security camera. There is a reflective object that reflects some image data, but the reflection is not clear. If we look very carefully at that reflective object, we can understand what is going on outside of the camera's view. Do we have a chance to subtract a default scene screenshot image from every frame of the rest of the video file? That would give us a clearer video of the reflected objects' movements.
Edit
(Two pictures illustrating what I need were attached here.)
They call this Video-Based Change Detection.
This dirty shell code got things done:
#!/bin/bash
#
# READ: http://www.imagemagick.org/Usage/compare/#difference
#mkdir orig-images diff-images
fps=6
## create png files
#ffmpeg -i orig.avi -r $fps -f image2 orig-images/image-%07d.png
cd orig-images
# get first image as default scene
for i in image-*.png; do
  default_image=$i
  break
done
# or set default scene manually
default_image="image-0003631.png"
rm ../diff-images/*
for i in image-*.png; do
  echo "processing: $i"
  #compare $default_image $i -compose src "../diff-images/diff-$i"
  convert $i $default_image -compose difference -composite \
    -evaluate Pow 2 -separate -evaluate-sequence Add -evaluate Pow 0.5 \
    "../diff-images/diff-$i"
done
cd ..
cd diff-images
## create movie from png files
rm ../out.mov
ffmpeg -r $fps -start_number 3529 -i diff-image-%07d.png ../out.mov
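If you prefer to stay in Python, here is a minimal OpenCV sketch of the same idea (file names assumed), subtracting a fixed default-scene frame from every frame:
import cv2

cap = cv2.VideoCapture("orig.avi")
ok, default_scene = cap.read()  # use the first frame as the default scene
fps = cap.get(cv2.CAP_PROP_FPS)
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
out = cv2.VideoWriter("out.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(cv2.absdiff(frame, default_scene))  # per-pixel difference

cap.release()
out.release()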
I can suggest PySceneDetect, a nice and actively updated Python tool.
From the website, for the lazy ones:
PySceneDetect is a command-line application and a Python library for detecting scene changes in videos, and automatically splitting the video into separate clips. Not only is it free and open-source software (FOSS), but there are several detection methods available (see Features), from simple threshold-based fade in/out detection, to advanced content aware fast-cut detection of each shot.
PySceneDetect can be used on its own as a stand-alone executable, with other applications as part of a video processing pipeline, or integrated directly into other programs/scripts via the Python API. PySceneDetect is written in Python, and requires the OpenCV and Numpy software libraries.
Examples and Use Cases
Here are some of the things people are using PySceneDetect for:
splitting home videos or other source footage into individual scenes
automated detection and removal of commercials from PVR-saved video sources
processing and splitting surveillance camera footage
statistical analysis of videos to find suitable "loops" for looping GIFs/cinemagraphs
academic analysis of film and video (e.g. finding mean shot length)
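For completeness, here is a minimal sketch of the Python API, assuming PySceneDetect 0.6 or later (pip install scenedetect[opencv]); the video path is illustrative:
from scenedetect import detect, ContentDetector

# Content-aware detection of fast cuts; returns (start, end) timecode pairs.
scene_list = detect("orig.avi", ContentDetector())
for i, (start, end) in enumerate(scene_list):
    print(f"Scene {i}: {start.get_timecode()} - {end.get_timecode()}")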

Caffe mean file creation without database

I run Caffe using an image_data_layer and don't want to create an LMDB or LevelDB for the data, but the compute_image_mean tool only works with LMDB/LevelDB databases.
Is there a simple solution for creating a mean file from a list of files (the same format that image_data_layer is using)?
You may notice that recent models (e.g., googlenet) do not use a mean file the same size as the input image, but rather a 3-vector representing a mean value per image channel. These values are quite "immune" to the specific dataset used (as long as it is large enough and contains "natural images").
So, as long as you are working with natural images, you may use the same values as, e.g., GoogLeNet: B=104, G=117, R=123.
The simplest solution is to create an LMDB or LevelDB database of the image set.
The complicated solution is to write a tool similar to compute_image_mean, which takes image inputs, does the transformations, and finds the mean!
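For the complicated solution, here is a minimal Python sketch, assuming an image_data_layer-style list file (lines of "<image> <label>") and OpenCV; the paths are illustrative. Note it averages per-image means, which is a close approximation when image sizes vary:
import cv2
import numpy as np

total = np.zeros(3, dtype=np.float64)
count = 0
with open("/path/to/labels/train.txt") as f:
    for line in f:
        name = line.split()[0]
        img = cv2.imread("/path/to/jpegs/" + name)  # BGR order, as Caffe expects
        total += img.reshape(-1, 3).mean(axis=0)
        count += 1

mean_bgr = total / count
print("B, G, R channel means:", mean_bgr)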

A guide to convert_imageset.cpp

I am relatively new to machine learning/python/ubuntu.
I have a set of images in .jpg format, where half contain a feature I want Caffe to learn and half don't. I'm having trouble finding a way to convert them to the required LMDB format.
I have the necessary text input files.
My question is can anyone provide a step by step guide on how to use convert_imageset.cpp in the ubuntu terminal?
Thanks
A quick guide to Caffe's convert_imageset
Build
The first thing you must do is build Caffe and Caffe's tools (convert_imageset is one of these tools).
After installing Caffe and running make, make sure you run make tools as well.
Verify that a binary file convert_imageset is created in $CAFFE_ROOT/build/tools.
Prepare your data
Images: put all images in a folder (I'll call it here /path/to/jpegs/).
Labels: create a text file (e.g., /path/to/labels/train.txt) with a line per input image. For example:
img_0000.jpeg 1
img_0001.jpeg 0
img_0002.jpeg 0
In this example the first image is labeled 1 while the other two are labeled 0.
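If you need to generate such a file, here is a minimal Python sketch, assuming a hypothetical layout where positive images live in /path/to/jpegs/positives/ and negatives in /path/to/jpegs/negatives/ (convert_imageset joins each listed name to the root folder you pass it):
import os

root = "/path/to/jpegs"
with open("/path/to/labels/train.txt", "w") as f:
    for label, folder in [(1, "positives"), (0, "negatives")]:
        for name in sorted(os.listdir(os.path.join(root, folder))):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                f.write("%s/%s %d\n" % (folder, name, label))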
Convert the dataset
Run the binary in shell
~$ GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
--resize_height=200 --resize_width=200 --shuffle \
/path/to/jpegs/ \
/path/to/labels/train.txt \
/path/to/lmdb/train_lmdb
Command line explained:
Setting the GLOG_logtostderr flag to 1 before calling convert_imageset tells the logging mechanism to redirect log messages to stderr.
--resize_height and --resize_width resize all input images to the same size, 200x200.
--shuffle randomly changes the order of images, so it does not preserve the order in the /path/to/labels/train.txt file.
The remaining arguments are the path to the images folder, the labels text file, and the output name. Note that the output name should not exist prior to calling convert_imageset, otherwise you'll get a scary error message.
Other flags that might be useful:
--backend - allows you to choose between an LMDB or a LevelDB dataset.
--gray - convert all images to gray scale.
--encoded and --encoded_type - keep image data in encoded (jpg/png) compressed form in the database.
--help - shows some help, see all relevant flags under Flags from tools/convert_imageset.cpp
You can check out $CAFFE_ROOT/examples/imagenet/convert_imagenet.sh for an example of how to use convert_imageset.
