Hadoop: splitting an image into tiles for a segmentation process

I want to use Hadoop to run segmentation on large-scale images (Pléiades images in TIFF format).
The idea is to split each image into tiles and distribute them to the nodes.
The map task will be the segmentation process. I developed the segmentation algorithm using the OTB library, written in C++.
I created an OTB application which can be launched from a Java program by giving it the path of a tile, and it returns the segmented tile.
I really don't know how to implement the split mechanism. I gather I have to customize the FileInputFormat and RecordReader classes, but with what types of input keys and input values? In the end, each map task just needs the path of the tile it should segment...
Does anyone have a suggestion?
Best Regards,

The solution strategy may depend on the number of images and the variety of their sizes.
If you have a lot of images (many more than the number of free mapper slots in your cluster), you may consider processing each whole image inside a mapper, e.g. with StreamInputFormat, and running your segmentation algorithm there. Otherwise you may need to implement your own InputFormat that will form the correct InputSplits.
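For the first case, since the segmentation already runs as a standalone OTB application, one low-effort route is Hadoop Streaming: pre-cut the image into tiles, write one tile path per line into the job input, and let each mapper shell out to the binary. A minimal sketch, where the otb_segment command line is a placeholder for however your OTB application is actually invoked:

    #!/usr/bin/env python
    # Hadoop Streaming mapper sketch: each input line is the path of one tile.
    # "otb_segment" is hypothetical; substitute your OTB application's CLI.
    import subprocess
    import sys

    for line in sys.stdin:
        tile_path = line.strip()
        if not tile_path:
            continue
        out_path = tile_path.replace(".tif", "_seg.tif")
        result = subprocess.run(["otb_segment", "-in", tile_path, "-out", out_path])
        status = "ok" if result.returncode == 0 else "failed"
        # Emit key<TAB>value so Hadoop can collect per-tile results.
        print("%s\t%s\t%s" % (tile_path, out_path, status))

Pairing this with NLineInputFormat, so that each mapper receives a handful of paths rather than a byte range of the input file, gives the distribution behavior described above without writing a custom RecordReader.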

Related

Processing RGB images

I am using Drake to implement a visual servoing scheme. For that I need to process color images and extract features.
I am using the ManipulationStation class, and I have already published the color images of the rgbd_sensor objects to LCM channels (in C++). Now I want to process the images during the simulation.
I know that it would be best to process the images internally (without using the ImageWriter), and for that I cannot use the image_array_t on LCM channels or ImageRgbaU8; I have to convert the images to an Eigen or OpenCV type.
I will then use the feature extraction functions in those libraries, and the extracted features will be used in a control law.
Do you have any examples of how to write a system or code in C++ that could convert my Drake images to OpenCV or Eigen types? What is the best way to do it?
Thanks in advance for the help!
Arnaud
There's no current example converting a drake::Image to Eigen or OpenCV, but it should be pretty straightforward.
What you should do is create a system akin to ImageWriter. It takes an image as an input and does whatever processing you need on it. You connect its input port to the RgbdSensor's output port (again, like ImageWriter). The only difference is that instead of taking the image and converting it to an LCM data type, you convert it to your OpenCV or Eigen type and apply your logic to that.
If you look, ImageWriter declares a periodic publish event. Change that to (probably) a discrete update event (still periodic, I'd imagine) and then change the ImageWriter::WriteImage callback into your "image-to-OpenCV" callback.
Finally, you could also declare an output port that would make your converted image available to other systems.
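For illustration, here is a minimal sketch of that pattern using Drake's Python bindings (the C++ structure is analogous); the image size, period, and use of a publish rather than a discrete update event are assumptions, and image.data is treated as a (height, width, 4) uint8 array:

    import cv2
    import numpy as np
    from pydrake.common.value import AbstractValue
    from pydrake.systems.framework import LeafSystem
    from pydrake.systems.sensors import ImageRgba8U

    class ImageToOpenCv(LeafSystem):
        """Receives an RGBA image and hands it to OpenCV, like ImageWriter."""
        def __init__(self, width=848, height=480, period_sec=0.1):
            super().__init__()
            self._image_port = self.DeclareAbstractInputPort(
                "color_image", AbstractValue.Make(ImageRgba8U(width, height)))
            # A periodic event drives the processing, as ImageWriter does.
            self.DeclarePeriodicPublishEvent(
                period_sec=period_sec, offset_sec=0.0, publish=self._process)

        def _process(self, context):
            image = self._image_port.Eval(context)
            # image.data exposes the pixels as a numpy array; convert to BGR
            # so the usual OpenCV feature-extraction functions apply.
            bgr = cv2.cvtColor(np.asarray(image.data), cv2.COLOR_RGBA2BGR)
            # ... run feature extraction on `bgr` and feed your control law ...

You would then Connect the RgbdSensor's color image output port to this system's input port when wiring up the diagram.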

How to prepare training data for image segmentation

I am using bounding-box marking tools like BBox and YOLO marker for object localization. I wanted to know whether there are any equivalent marking tools for image segmentation tasks, and how people in academia and research prepare datasets for these segmentation tasks. The recent Kaggle competition severstal-steel-defect-detection has pixel-level segmentation information. Which tool was used to prepare that data?
Generally speaking it is a pretty complex but common task, so you'll likely be able to find several tools. Supervise.ly is a good example; look through the demo to understand the actual complexity.
Another way is to use OpenCV to get some specific results. We did that, but the results were pretty rough. Another problem is performance (we work with 4K video, for a couple of reasons).
Long story short, we decided to implement a custom tool to get the required results (and to do that fast enough).
Just to summarize: if you want to build a training set for segmentation, you have the following options:
1. Use available services (pretty much all of them will require additional manual work).
2. Use OpenCV to deal with a specially prepared input.
3. Develop a custom solution to deal with a properly prepared input, providing full control and accurate results.
The third option seems to be the most flexible. Our examples are custom multi-color segmentation results. You might get the impression that a custom implementation is way more complex, but as it turned out, if you properly implement a straightforward algorithm you might be surprised by the result. We were interested in accurate, pixel-perfect results.
I have created a simple script to generate colored masks from annotations, to be used for semantic segmentation.
You will need to use the VIA (VGG Image Annotator) tool, which lets you mark a region as a polygon. Once a polygon is created for a class/attribute, you can give it an attribute name and save the annotation as a CSV file; the x,y coordinates of the polygon are what gets saved in the CSV.
The program and the steps to use it are at: https://github.com/pateldigant/SemanticAnnotator
If you have any doubts regarding the use of this script, you can comment below.
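For reference, the core of such a script is small. A sketch assuming VIA's CSV export format, where each row stores the polygon as JSON in the region_shape_attributes column and the class name in region_attributes (the color table is a placeholder for your own classes):

    import csv
    import json
    import cv2
    import numpy as np

    # Map class names to mask colors (BGR); adjust to your own attributes.
    COLORS = {"road": (0, 0, 255), "building": (0, 255, 0)}

    def mask_from_via_csv(csv_path, width, height, class_key="name"):
        mask = np.zeros((height, width, 3), dtype=np.uint8)
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                shape = json.loads(row["region_shape_attributes"])
                if shape.get("name") != "polygon":
                    continue
                attrs = json.loads(row["region_attributes"])
                pts = np.array(list(zip(shape["all_points_x"],
                                        shape["all_points_y"])), dtype=np.int32)
                color = COLORS.get(attrs.get(class_key), (255, 255, 255))
                # Rasterize the polygon into the colored mask.
                cv2.fillPoly(mask, [pts], color)
        return mask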

How to recognize or match two images?

I have one image stored in my bundle or in the application.
Now I want to scan images with the camera and compare them with my locally stored image. When the image is matched I want to play a video, and if the user moves the camera away from that particular image I want to stop the video.
For that I have tried the Wikitude SDK for iOS, but it is not working properly; it keeps crashing because of memory issues or other reasons.
Other things that came to mind are Core ML and ARKit, but Core ML detects an image's properties like name, type, colors, etc., and I want to match the image itself. ARKit does not support all devices and iOS versions, and I don't know whether image matching as I require is even possible with it.
If anybody has any idea how to achieve this, please share; every bit of help will be appreciated. Thanks :)
The easiest way is ARKit's image detection. You know the limitations of the devices it supports, but the capability it gives is broad and it is really easy to implement. Here is an example.
Next is Core ML, which is the hardest way. You need to understand machine learning, even if only briefly; then comes the tough part, training with your dataset. The biggest drawback is that you have a single image, so I would discard this method.
Finally, the middle-ground solution is to use OpenCV. It might be hard, but it suits your need. You can use one of the various feature matching methods to find your image in the camera feed (example here), and you can use Objective-C++ to write the C++ code for iOS.
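The sketch below shows one such method (ORB features, a ratio test, and a homography) using OpenCV's Python API; the same calls exist in the C++ API for use from Objective-C++, and the thresholds are assumptions to tune:

    import cv2
    import numpy as np

    orb = cv2.ORB_create(nfeatures=1000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

    def find_reference(reference_gray, frame_gray, min_matches=15):
        """Returns a homography if the reference appears in the frame, else None."""
        kp1, des1 = orb.detectAndCompute(reference_gray, None)
        kp2, des2 = orb.detectAndCompute(frame_gray, None)
        if des1 is None or des2 is None:
            return None
        # Lowe's ratio test keeps only distinctive matches.
        pairs = matcher.knnMatch(des1, des2, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
        if len(good) < min_matches:
            return None  # reference not in view -> stop the video
        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        homography, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return homography  # projects the reference corners into the frame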
Your task is image similarity, and you can solve it simply, with more reliable results, using machine learning. Since your task involves camera scanning, the better option is Core ML. You can refer to this link by Apple on Image Similarity, and you can improve your results by training with your own datasets. If you need any more clarification, comment.
Another approach is to use a so-called "siamese network", which really means that you run a model such as Inception-v3 or MobileNet on both images and compare their outputs.
However, these models usually give a classification output, i.e. "this is a cat". If you remove that classification layer from the model, it instead outputs a bunch of numbers that describe what sort of things are in the image, in a very abstract sense.
If these numbers for two images are very similar -- if the "distance" between them is very small -- then the two images are very similar too.
So you can take an existing Core ML model, remove the classification layer, run it twice (once on each image) to get two sets of numbers, and then compute the distance between these numbers. If this distance is lower than some threshold, the images are similar enough.
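The distance step at the end is simple. A sketch assuming you have already extracted two feature vectors from the truncated model (the threshold is a placeholder you would calibrate on pairs you know should match):

    import numpy as np

    def cosine_distance(a, b):
        # Normalize, then 1 - cosine similarity: 0 means identical direction.
        a = a / np.linalg.norm(a)
        b = b / np.linalg.norm(b)
        return 1.0 - float(np.dot(a, b))

    def images_match(features_a, features_b, threshold=0.2):
        # threshold is data-dependent; tune it on known matching pairs.
        return cosine_distance(features_a, features_b) < threshold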

What is the fastest way to use the TensorFlow Object Detection API as a flat-image recognizer?

I am new to computer vision, but I am trying to code an Android/iOS app which does the following:
Get the live camera preview and try to detect one flat image (a logo or painting) in it, in real time. Draw a rect around the logo if it is found; if there is no match, don't draw the rectangle.
I found the TensorFlow Object Detection API to be a good starting point, and support was just announced for importing TensorFlow models into Core ML.
I followed a lot of tutorials to train my own object detector. The training data is the key. I found a pretty good library for generating augmented images, and I have created hundreds of variations of my source image (rotation, skew, etc.).
But it failed! The dataset is probably good for image classification (with my image in full screen) but not in context (the room).
I think transfer learning is the key. In my case, I used the ssd_mobilenet_v1_coco model as a base and tried to fake the context of my augmented images with the Random Erasing Data Augmentation technique, without success.
What are my available solutions? Am I tackling the problem the right way? I need to make the model training as fast as possible.
Might I have to use some datasets for indoor/outdoor image classification and paste my image randomly onto them? How important are the perspectives?
Thank you!
I have created hundreds of variations of my source image (rotation, skew, etc.). But it failed!
So does that mean your model did not converge, or that its final performance was bad? If your model did not converge, then add more data. Hundreds of samples is very few, so use more images, make more samples, and make your samples as dispersed as possible.
I think transfer learning is the key. In my case, I used the ssd_mobilenet_v1_coco model as a base and tried to fake the context of my augmented images with the Random Erasing Data Augmentation technique, without success.
You mean fine-tuning. Did you reduce the labels to two (your image and background) and do fine-tuning? If you didn't, then that is surely why it failed. Oh man, you should at least show me your model definition.
What are my available solutions? Am I tackling the problem the right way? I need to make the model training as fast as possible.
To make training converge faster, just add more GPUs and train on multiple GPUs. If you don't have the money, rent a GPU cluster on Azure; believe me, it is not that expensive.
Hope that helps.
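For reference, the Random Erasing augmentation mentioned above is simple to reproduce. A minimal NumPy sketch, with arbitrary default patch-size fractions:

    import numpy as np

    def random_erase(img, min_frac=0.05, max_frac=0.25, rng=np.random):
        """Blanks a random rectangle so the detector cannot rely on a clean logo."""
        h, w = img.shape[:2]
        erase_h = int(h * rng.uniform(min_frac, max_frac))
        erase_w = int(w * rng.uniform(min_frac, max_frac))
        y = rng.randint(0, h - erase_h)
        x = rng.randint(0, w - erase_w)
        out = img.copy()
        # Fill the patch with noise; a constant gray value also works.
        out[y:y + erase_h, x:x + erase_w] = rng.randint(
            0, 256, (erase_h, erase_w) + img.shape[2:])
        return out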

Real-time CUDA image processing advice

I am trying to implement an algorithm for a system in which the camera captures 1000 fps. I need to get the value of each pixel in all the images and do different calculations on the evolution of pixel[i][j] across N images, for all the pixels in the images. I have the data as an (unsigned char *ptr) and want to transfer it to the GPU and start implementing the algorithm, but I am not sure what the best option for real-time processing would be.
My system:
CPU: Intel Xeon X5660 2.8 GHz (2 processors)
GPU: NVIDIA Quadro 5000
I have the following questions:
1. Do I need to add any image processing library on top of CUDA? If yes, what do you suggest?
2. Can I create a matrix holding the values of pixel[i,j] across images [1:n], for each pixel position? For example, for 1000 images of size 200x200 I would end up with 40,000 matrices, each containing 1000 values for one pixel. Does CUDA give me options like OpenCV's matrices or vectors?
1 - Do I need to add any image processing library on top of CUDA?
Apples and oranges: each has a different purpose. An image processing library like OpenCV offers a lot more than simple accelerated matrix computations. Maybe you don't need OpenCV to do the processing in this project, as you seem to want to use CUDA directly, but you could still keep OpenCV around to make it easier to load and write different image formats from disk.
2 - Does CUDA give me options like OpenCV's matrices?
Absolutely. Some time ago I wrote a simple (educational) application that used OpenCV to load an image from disk and CUDA to convert it to its grayscale version. The project is named cuda-grayscale. I haven't tested it with CUDA 4.x, but the code shows the basics of combining OpenCV and CUDA.
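For what it's worth, recent OpenCV builds compiled with CUDA support expose that same combination directly through the cv2.cuda module; a rough sketch (this assumes a CUDA-enabled OpenCV build, which the default pip packages are not):

    import cv2

    # Load on the CPU with OpenCV, convert on the GPU, download the result.
    bgr = cv2.imread("input.png")
    gpu_bgr = cv2.cuda_GpuMat()
    gpu_bgr.upload(bgr)
    gpu_gray = cv2.cuda.cvtColor(gpu_bgr, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("output.png", gpu_gray.download())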
It sounds like you will have 40,000 independent calculations, where each calculation works only within one (temporal) pixel. If so, this should be a good task for the GPU: your 352-core Fermi GPU should be able to beat your 12 hyper-threaded Xeon cores.
Is the algorithm you plan to run a common operation? It sounds like it might not be, in which case you will likely have to write your own kernels.
Yes, you can have arrays of elements of any type in CUDA.
This "streaming-oriented" approach is good for a GPU implementation in that it maximizes the number of calculations relative to transfers over the PCIe bus. However, it might also introduce difficulties: if you want to process the 1000 values for a given pixel in a specific order (oldest to newest, for instance), you will probably want to avoid continuously shifting all the frames in memory to make room for the newest one. The best approach may be to overwrite the oldest frame with the newest frame each time a new frame arrives. That slightly complicates your addressing of the pixel values, but you end up with a "stack of frames" that is fairly well ordered, with a single discontinuity between old and new frames somewhere within it.
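A rough sketch of that layout, written with Numba's CUDA bindings rather than raw CUDA C for brevity: the frames live on the GPU as a (n_frames, height, width) ring buffer, each thread owns one pixel position, and the wrap-around index recovers oldest-to-newest order. The reduction (a mean here) is a stand-in for your own per-pixel calculation:

    import numpy as np
    from numba import cuda

    @cuda.jit
    def temporal_mean(frames, newest, out):
        # One thread per (i, j) pixel position.
        i, j = cuda.grid(2)
        n, h, w = frames.shape
        if i < h and j < w:
            acc = 0.0
            # Walk the ring buffer oldest-to-newest despite the wrap-around.
            for k in range(n):
                acc += frames[(newest + 1 + k) % n, i, j]
            out[i, j] = acc / n

    n, h, w = 1000, 200, 200
    frames_gpu = cuda.to_device(np.zeros((n, h, w), dtype=np.uint8))
    out_gpu = cuda.device_array((h, w), dtype=np.float32)
    threads = (16, 16)
    blocks = ((h + threads[0] - 1) // threads[0],
              (w + threads[1] - 1) // threads[1])
    temporal_mean[blocks, threads](frames_gpu, 0, out_gpu)
    # Each new frame then overwrites frames_gpu[newest] in place.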
Do I need to add any image processing library on top of CUDA? If yes, what do you suggest?
Disclosure: my company develops and markets CUVILib.
There are very few options when it comes to GPU-accelerated imaging libraries that also offer general-purpose functionality. CUVILib is one of those options, and it offers the following, which is very well suited to your specific needs:
- a CuviImage object which holds your image data as a 2D matrix;
- the ability to write your own GPU function and use CuviImage as a 2D GPU matrix.
CUVILib already provides a rich set of imaging functionality (color operations, image statistics, feature detection, motion estimation, FFT, image transforms, etc.), so chances are you will find your desired functionality there.
As for the question of whether GPUs are suited to your application: yes! Imaging is one of those domains that are ideal for parallel computation.
Links:
CUVILib: http://www.cuvilib.com
TunaCode: http://www.tunacode.com
