How to properly extract orientation info of an image using Tesseract 3.04?

I have a Tesseract 3.04 static build and am trying to extract orientation info using the code provided in the official samples:
const char* inputfile = "/usr/src/tesseract/testing/eurotext.tif";
tesseract::Orientation orientation;
tesseract::WritingDirection direction;
tesseract::TextlineOrder order;
float deskew_angle;
PIX *image = pixRead(inputfile);
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
api->Init("/usr/src/tesseract/", "eng");
api->SetPageSegMode(tesseract::PSM_AUTO_OSD);
api->SetImage(image);
api->Recognize(0);
tesseract::PageIterator* it = api->AnalyseLayout();
it->Orientation(&orientation, &direction, &order, &deskew_angle);
printf("Orientation: %d;\nWritingDirection: %d\nTextlineOrder: %d\n" \
"Deskew angle: %.4f\n",
orientation, direction, order, deskew_angle);
My application crashes during this extraction, on the following line:
it->Orientation(&orientation, &direction, &order, &deskew_angle);
What is going wrong in this code?
Thanks!
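One thing worth checking here: AnalyseLayout() can return a null iterator (for instance on an empty page or when layout analysis fails), and calling Orientation() through that null pointer crashes exactly like this. A minimal guard, as a sketch on top of the code above:
tesseract::PageIterator* it = api->AnalyseLayout();
if (it != NULL) {
    it->Orientation(&orientation, &direction, &order, &deskew_angle);
    printf("Orientation: %d;\nWritingDirection: %d\nTextlineOrder: %d\n"
           "Deskew angle: %.4f\n",
           orientation, direction, order, deskew_angle);
    delete it;
} else {
    printf("AnalyseLayout() returned no iterator - check that Init() succeeded and the tessdata path is correct.\n");
}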

If you are only interested in orientation info you can use Leptonica instead; IMO it should be faster. See the example below.
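A minimal Leptonica-only sketch (assuming the pixOrientDetect / makeOrientDecision API from Leptonica's flipdetect module; the binarization threshold is a placeholder):
#include <allheaders.h>
#include <stdio.h>
int main()
{
    PIX *pixs = pixRead("/usr/src/tesseract/testing/eurotext.tif");
    PIX *pix1 = pixConvertTo1(pixs, 130);   /* orientation detection expects a 1 bpp image */
    l_float32 upconf, leftconf;
    l_int32 orient;
    pixOrientDetect(pix1, &upconf, &leftconf, 0, 0);
    makeOrientDecision(upconf, leftconf, 0.0, 0.0, &orient, 0);
    /* orient is one of L_TEXT_ORIENT_UNKNOWN / UP / LEFT / RIGHT / DOWN */
    printf("upconf: %f, leftconf: %f, orient: %d\n", upconf, leftconf, orient);
    pixDestroy(&pix1);
    pixDestroy(&pixs);
    return 0;
}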

I found the orientation in Tesseract with the code below; it returns 0, 1 (=90), 2 (=180) or 3 (=270) for the detected orientation.
#include <osdetect.h>
#include <baseapi.h>
#include <allheaders.h>

using namespace tesseract;

int main()
{
    TessBaseAPI *tessBaseAPI = new TessBaseAPI();
    tessBaseAPI->Init("/tessdataPath/", "eng");
    Pix *image = pixRead(imagePath);        // imagePath: path to your input image
    tessBaseAPI->SetImage(image);
    // orientation detection
    OSResults os_results;
    tessBaseAPI->DetectOS(&os_results);
    int orientationType = os_results.best_result.orientation_id;
    pixDestroy(&image);
    return 0;
}

Related

How to convert k4a_image_t to opencv matrix? (Azure Kinect Sensor SDK)

I started playing around with Azure Kinect Sensor SDK. I went through the official how-to guides. I can capture images as raw buffers from the sensor, but I would like to turn them into opencv matrices.
First, you need to tell the Azure Kinect sensor to capture the color image in BGRA32 format (instead of JPEG or another compressed format). The depth image is captured in a 16-bit, 1-channel format.
You do this by setting up the config:
k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
config.camera_fps = K4A_FRAMES_PER_SECOND_30;
config.color_format = K4A_IMAGE_FORMAT_COLOR_BGRA32; // <==== For Color image
config.color_resolution = K4A_COLOR_RESOLUTION_2160P;
config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED; // <==== For Depth image
Then, once configured, you can capture a color image the following way, and then using the raw buffer create an opencv matrix from the color image:
k4a_image_t colorImage = k4a_capture_get_color_image(capture); // get image metadata
if (colorImage != NULL)
{
// you can check the format with this function
k4a_image_format_t format = k4a_image_get_format(colorImage); // K4A_IMAGE_FORMAT_COLOR_BGRA32
// get raw buffer
uint8_t* buffer = k4a_image_get_buffer(colorImage);
// convert the raw buffer to cv::Mat
int rows = k4a_image_get_height_pixels(colorImage);
int cols = k4a_image_get_width_pixels(colorImage);
cv::Mat colorMat(rows , cols, CV_8UC4, (void*)buffer, cv::Mat::AUTO_STEP);
// ...
k4a_image_release(colorImage);
}
Similarly, for the depth image, you can convert the raw depth data to opencv like this (note, the matrix type has changed!):
k4a_image_t depthImage = k4a_capture_get_depth_image(capture); // get image metadata
if (depthImage != NULL)
{
// you can check the format with this function
k4a_image_format_t format = k4a_image_get_format(depthImage); // K4A_IMAGE_FORMAT_DEPTH16
// get raw buffer
uint8_t* buffer = k4a_image_get_buffer(depthImage);
// convert the raw buffer to cv::Mat
int rows = k4a_image_get_height_pixels(depthImage);
int cols = k4a_image_get_width_pixels(depthImage);
cv::Mat depthMat(rows, cols, CV_16U, (void*)buffer, cv::Mat::AUTO_STEP);
// ...
k4a_image_release(depthImage);
}
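If you also want to display or save that depth map as a regular 8-bit image, one option (a sketch; the 4000 mm range is an assumption, pick a divisor matching your depth mode) is to rescale the 16-bit values while depthMat is still valid, i.e. inside the if block above, before releasing the image:
cv::Mat depth8u;
depthMat.convertTo(depth8u, CV_8U, 255.0 / 4000.0); // map 0-4000 mm to 0-255
cv::imshow("depth (scaled)", depth8u);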
NOTE: this OpenCV Mat constructor does not copy or allocate new memory for the data; it only initializes the matrix header to point to the specified buffer!
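So if the cv::Mat needs to outlive the k4a_image_t (for example because you call k4a_image_release right afterwards), make a deep copy first; a small sketch using colorMat and colorImage from the snippet above:
cv::Mat colorMatOwned = colorMat.clone(); // allocates new memory and copies the pixels
k4a_image_release(colorImage);            // safe now: colorMatOwned no longer points into the k4a buffer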
Sources:
Azure Kinect Sensor SDK - k4a_image_format_t
Azure Kinect Sensor SDK - k4a_image_t
Opencv Mat
#pragma comment(lib, "k4a.lib")
#include <k4a/k4a.h>
#include <opencv4/opencv2/opencv.hpp>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
using namespace std;
using namespace cv;
int main()
{
// getting the capture
k4a_capture_t capture;
const int32_t TIMEOUT_IN_MS = 1000;
uint32_t count = k4a_device_get_installed_count();
if (count == 0)
{
printf("No k4a devices attached!\n");
return 1;
}
// Open the first plugged in Kinect device
k4a_device_t device = NULL;
if (K4A_FAILED(k4a_device_open(K4A_DEVICE_DEFAULT, &device)))
{
printf("Failed to open k4a device!\n");
return 1;
}
// Get the size of the serial number
size_t serial_size = 0;
k4a_device_get_serialnum(device, NULL, &serial_size);
// Allocate memory for the serial, then acquire it
char *serial = (char*)(malloc(serial_size));
k4a_device_get_serialnum(device, serial, &serial_size);
printf("Opened device: %s\n", serial);
free(serial);
// Configure a stream of 1080p BGRA color data at 30 frames per second
k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
config.camera_fps = K4A_FRAMES_PER_SECOND_30;
config.color_format = K4A_IMAGE_FORMAT_COLOR_BGRA32;
config.color_resolution = K4A_COLOR_RESOLUTION_1080P;
config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED;
config.synchronized_images_only = true;
// Start the camera with the given configuration
k4a_device_start_cameras(device, &config);
int frame_count = 0;
k4a_device_get_capture(device, &capture, TIMEOUT_IN_MS);
k4a_image_t color_image = k4a_capture_get_color_image(capture);
uint8_t* color_buffer = k4a_image_get_buffer(color_image);
int rows = k4a_image_get_height_pixels(color_image);
int cols = k4a_image_get_width_pixels(color_image);
cv::Mat color(rows, cols, CV_8UC4, (void*)color_buffer, cv::Mat::AUTO_STEP);
//cv::imshow("random", color);
k4a_image_t depth_image = k4a_capture_get_depth_image(capture);
uint8_t* depth_buffer = k4a_image_get_buffer(depth_image);
int depth_rows = k4a_image_get_height_pixels(depth_image);
int depth_cols = k4a_image_get_width_pixels(depth_image);
cv::Mat depth(depth_rows, depth_cols, CV_16U, (void*)depth_buffer, cv::Mat::AUTO_STEP); // depth is 16-bit, single channel (see above)
//cv::imshow("depth image",depth);
//cv::waitKey(0);
cv::imwrite("depth.jpg",depth);
cv::imwrite("color.jpg",color);
k4a_device_stop_cameras(device);
k4a_device_close(device);
return 0;
}
For complete build with cmake: https://github.com/ShitalAdhikari/Azure_kinect

How to read and write a DICOM file using the VTK or ITK libraries?

I used vtkDICOMImageReader to read the DICOM file and vtkImageThreshold to threshold a CT image. Now I want to write the result back to my hard disk before further processing.
I tried the vtkImageWriter class to write it back, but the output does not open in 3D Slicer. I would be very grateful if anyone could suggest a methodology for writing DICOM files.
I have included my code here: I threshold a DICOM image and view it, and then I would like to save the thresholded image as a DICOM file, but I could not manage to do that. Please help me.
Thanks in advance.
#include <itkImageToVTKImageFilter.h>
#include <vtkSmartPointer.h>
#include <vtkImageData.h>
#include <vtkImageThreshold.h>
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>
#include <vtkInteractorStyleImage.h>
#include <vtkRenderer.h>
#include <vtkImageMapper3D.h>
#include <vtkImageActor.h>
#include <vtkImageCast.h>
#include <vtkNIFTIImageWriter.h>
#include <vtkImageMandelbrotSource.h>
#include <vtkImageViewer2.h>
#include <vtkDICOMImageReader.h>
int main(int argc, char* argv[])
{
std::string folder = argv[1];
vtkSmartPointer<vtkDICOMImageReader> reader =
vtkSmartPointer<vtkDICOMImageReader>::New();
reader->SetFileName(folder.c_str());
reader->Update();
vtkSmartPointer<vtkImageViewer2> imageViewer =
vtkSmartPointer<vtkImageViewer2>::New();
imageViewer->SetInputConnection(reader->GetOutputPort());
// threshold the images
vtkSmartPointer<vtkImageThreshold> imageThreshold =
vtkSmartPointer<vtkImageThreshold>::New();
imageThreshold->SetInputConnection(reader->GetOutputPort());
// unsigned char lower = 127;
int upper = 511; // 511 does not fit in an unsigned char
imageThreshold->ThresholdByLower(upper);
imageThreshold->ReplaceInOn();
imageThreshold->SetInValue(0);
imageThreshold->ReplaceOutOn();
imageThreshold->SetOutValue(511);
imageThreshold->Update();
// Create actors
vtkSmartPointer<vtkImageActor> inputActor =
vtkSmartPointer<vtkImageActor>::New();
inputActor->GetMapper()->SetInputConnection(
reader->GetOutputPort());
vtkSmartPointer<vtkImageActor> thresholdedActor =
vtkSmartPointer<vtkImageActor>::New();
thresholdedActor->GetMapper()->SetInputConnection(
imageThreshold->GetOutputPort());
// There will be one render window
vtkSmartPointer<vtkRenderWindow> renderWindow =
vtkSmartPointer<vtkRenderWindow>::New();
renderWindow->SetSize(600, 300);
// And one interactor
vtkSmartPointer<vtkRenderWindowInteractor> interactor =
vtkSmartPointer<vtkRenderWindowInteractor>::New();
interactor->SetRenderWindow(renderWindow);
// Define viewport ranges
// (xmin, ymin, xmax, ymax)
double leftViewport[4] = {0.0, 0.0, 0.5, 1.0};
double rightViewport[4] = {0.5, 0.0, 1.0, 1.0};
// Setup both renderers
vtkSmartPointer<vtkRenderer> leftRenderer =
vtkSmartPointer<vtkRenderer>::New();
renderWindow->AddRenderer(leftRenderer);
leftRenderer->SetViewport(leftViewport);
leftRenderer->SetBackground(.6, .5, .4);
vtkSmartPointer<vtkRenderer> rightRenderer =
vtkSmartPointer<vtkRenderer>::New();
renderWindow->AddRenderer(rightRenderer);
rightRenderer->SetViewport(rightViewport);
rightRenderer->SetBackground(.4, .5, .6);
leftRenderer->AddActor(inputActor);
rightRenderer->AddActor(thresholdedActor);
leftRenderer->ResetCamera();
rightRenderer->ResetCamera();
renderWindow->Render();
interactor->Start();
vtkSmartPointer<vtkNIFTIImageWriter> writer =
vtkSmartPointer<vtkNIFTIImageWriter>::New();
writer->SetInputConnection(reader->GetOutputPort());
writer->SetFileName("output");
writer->Write();
// writing the thresholded image to the hard drive.
//this is the part i am not able to code. Please can somebody help me please?
return EXIT_SUCCESS;
}
First, do you want to save out the thresholded image or just the image as read in? If you want the thresholded one, replace reader with imageThreshold in
writer->SetInputConnection(reader->GetOutputPort())
From looking at the VTK tests of NIFTI readers and writers the following options may be required...
writer->SetNIFTIHeader(reader->GetNIFTIHeader())
writer->SetQFac(reader->GetQFac());
writer->SetTimeDimension(reader->GetTimeDimension());
writer->SetTimeSpacing(reader->GetTimeSpacing());
writer->SetRescaleSlope(reader->GetRescaleSlope());
writer->SetRescaleIntercept(reader->GetRescaleIntercept());
writer->SetQFormMatrix(reader->GetQFormMatrix());
I would test by adding these options and then see what you get; a combined sketch of the writing step follows.
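Putting the two suggestions together, a minimal sketch of the writing step for the thresholded image (the .nii file name is an assumption; the header-copying calls above only apply if the input actually comes from a vtkNIFTIImageReader rather than vtkDICOMImageReader):
vtkSmartPointer<vtkNIFTIImageWriter> writer =
    vtkSmartPointer<vtkNIFTIImageWriter>::New();
writer->SetInputConnection(imageThreshold->GetOutputPort()); // thresholded image, not the raw reader output
writer->SetFileName("output.nii");                           // give the file a NIFTI extension
writer->Write();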
I think VTK is a bit awkward for processing images; I only use it to display them. I wrote a piece of code for you that reads, thresholds and writes a DICOM image using only ITK. I think it fits the bill.
#include "itkBinaryThresholdImageFilter.h"
#include "itkImageFileReader.h"
#include "itkGDCMImageIO.h"
#include "itkImageFileWriter.h"
#include "itkImage.h"
int main () {
typedef unsigned char InputPixelType ; //Pixel Type of Input image
typedef unsigned char OutputPixelType; //Pixel Type of Output image
const unsigned int InputDimension = 2; //Dimension of image
typedef itk::Image < InputPixelType, InputDimension > InputImageType; //Type definition of Input Image
typedef itk::Image < InputPixelType, InputDimension > OutputImageType;//Type definition of Output Image
typedef itk::ImageFileReader< InputImageType > ReaderType;//Type definition of Reader (a single 2D file, so ImageFileReader rather than ImageSeriesReader)
typedef itk::BinaryThresholdImageFilter<InputImageType, OutputImageType > FilterType; // Type definition of Filter
typedef itk::ImageFileWriter<OutputImageType> ImageWriterType; //Definition of Writer of Output image
typedef itk::GDCMImageIO ImageIOType; //Type definition of Image IO for Dicom images
//Starts Reading Process
ReaderType::Pointer reader = ReaderType::New(); //Creates reader
ImageIOType::Pointer gdcmImageIO_input = ImageIOType::New(); //Creates ImageIO object for input image
reader->SetImageIO( gdcmImageIO_input ); // Sets image IO of reader
reader->SetFileName( "example_image.dcm" ); // Sets the filename on the reader
//Exceptional handling
try
{
reader->UpdateLargestPossibleRegion();
}
catch (itk::ExceptionObject & e)
{
std::cerr << "exception in file reader " << std::endl;
std::cerr << e << std::endl;
return EXIT_FAILURE;
}
// Start filtering process
FilterType::Pointer filter = FilterType::New(); //Creates filter
filter->SetInput( reader->GetOutput() );
filter->SetOutsideValue( 0); // Set pixel value which are out of lower and upper threshold value
filter->SetInsideValue( 255 );// Set pixel value which are within lower and upper threshold value
filter->SetLowerThreshold( 25 ); // Lower threshold value 25
filter->SetUpperThreshold( 150 );// Upper threshold value 150
filter->Update();
//Starts Writing Process
ImageWriterType::Pointer imageWriter = ImageWriterType::New(); // Creates writer
ImageIOType::Pointer gdcmImageIO_output = ImageIOType::New(); // Creates Image IO object for output image
imageWriter->SetImageIO( gdcmImageIO_output ); // Set image IO as dicom
imageWriter->SetInput(filter->GetOutput() ); // Connects output of filter with to input of writer
imageWriter->SetFileName( "example_image_thresholded.dcm" ); // Sets output file name
//Exceptional handling
try
{
imageWriter->Update();
}
catch ( itk::ExceptionObject &exception )
{
std::cerr << "Exception in file writer ! " << std::endl;
std::cerr << exception << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}

Disparity map code in OpenCV C++

I have calibrated and stereo-rectified images in MATLAB using Caltech's toolbox (http://www.vision.caltech.edu/bouguetj/calib_doc/). I tried the disparity in MATLAB and it is not returning good results, so now I would like to try it in OpenCV. I could not find any OpenCV sample code for disparity on their website, so this is the code I have found so far (it comes from http://www.jayrambhia.com/blog/disparity-maps/):
#include "opencv2/core/core.hpp"
#include "opencv2/calib3d/calib3d.hpp"
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include "opencv2/contrib/contrib.hpp"
#include <stdio.h>
#include <string.h>
using namespace cv;
using namespace std;
int main(int argc, char* argv[])
{
Mat img1, img2, g1, g2;
Mat disp, disp8;
//char* method = argv[3];
char* method = "SGBM";
//img1 = imread(argv[1]);
//img2 = imread(argv[2]);
img1 = imread("leftImage.jpg");
img2 = imread("rightImage.jpg");
cvtColor(img1, g1, CV_BGR2GRAY);
cvtColor(img2, g2, CV_BGR2GRAY);
if (!(strcmp(method, "BM")))
{
StereoBM sbm;
sbm.state->SADWindowSize = 9;
sbm.state->numberOfDisparities = 112;
sbm.state->preFilterSize = 5;
sbm.state->preFilterCap = 61;
sbm.state->minDisparity = -39;
sbm.state->textureThreshold = 507;
sbm.state->uniquenessRatio = 0;
sbm.state->speckleWindowSize = 0;
sbm.state->speckleRange = 8;
sbm.state->disp12MaxDiff = 1;
sbm(g1, g2, disp);
}
else if (!(strcmp(method, "SGBM")))
{
StereoSGBM sbm;
sbm.SADWindowSize = 3;
sbm.numberOfDisparities = 144;
sbm.preFilterCap = 63;
sbm.minDisparity = -39;
sbm.uniquenessRatio = 10;
sbm.speckleWindowSize = 100;
sbm.speckleRange = 32;
sbm.disp12MaxDiff = 1;
sbm.fullDP = false;
sbm.P1 = 216;
sbm.P2 = 864;
sbm(g1, g2, disp);
}
normalize(disp, disp8, 0, 255, CV_MINMAX, CV_8U);
imshow("left", img1);
imshow("right", img2);
imshow("disp", disp8);
waitKey(0);
return(0);
}
and this is the error I get:
Unhandled exception at at 0x000007FEFD4D940D in OPEN_CV_TEST.exe: Microsoft C++ exception: cv::Exception at memory location 0x0000000000149260.
I am new to C++ and there is no description of the procedure needed to run the code, so I just put the left and right images in the \x64\Debug folder of my project and ran the code in MS Visual Studio 2012 on Windows 7 64-bit. I had created the project before and run a sample test, and it worked, so now I am just copying the above code into the main C++ source file. I assume there should not be any library or header files missing.
Also please note that I do not need to rectify the images, and there is no need for stereo matching right now either.
Any help is greatly appreciated.
I figured it out! It was the "imread" function in OpenCV that was causing problems; I used "cvLoadImage" instead. I also put the images in the project folder right next to the CPP files, and in the Debug folders as well. It is working fine now. Apparently the "imread" function is a known problem in OpenCV!
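For reference, a likely trigger for that cv::Exception is imread() returning an empty Mat when the file path is wrong, after which cvtColor throws on the empty input. A small guard, as a sketch with the same file names as above:
img1 = imread("leftImage.jpg");
img2 = imread("rightImage.jpg");
if (img1.empty() || img2.empty())
{
    printf("Could not load the input images - check the paths/working directory.\n");
    return -1;
}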

How to save (cvWrite or imwrite) an image in OpenCV 2.4.3?

I am trying to save an OpenCV image to the hard drive.
Here is what I tried:
public void SaveImage (Mat mat) {
Mat mIntermediateMat = new Mat();
Imgproc.cvtColor(mRgba, mIntermediateMat, Imgproc.COLOR_RGBA2BGR, 3);
File path =
Environment.getExternalStoragePublicDirectory(
Environment.DIRECTORY_PICTURES);
String filename = "barry.png";
File file = new File(path, filename);
Boolean bool = null;
filename = file.toString();
bool = Highgui.imwrite(filename, mIntermediateMat);
if (bool == true)
Log.d(TAG, "SUCCESS writing image to external storage");
else
Log.d(TAG, "Fail writing image to external storage");
}
}
Can any one show how to save that image with OpenCV 2.4.3?
Your question is a bit confusing: it concerns OpenCV on the desktop, but your code is for Android, and you ask about IplImage while your posted code uses Mat. Assuming you're on the desktop using C++, you can do something along the lines of:
cv::Mat image;
std::string image_path;
//load/generate your image and set your output file path/name
//...
//write your Mat to disk as an image
cv::imwrite(image_path, image);
...Or for a more complete example:
void SaveImage(cv::Mat mat)
{
cv::Mat img;
cv::cvtColor(...); //not sure where the variables in your example come from
std::string store_path("..."); //put your output path here
bool write_success = cv::imwrite(store_path, img);
//do your logging...
}
The image format is chosen based on the supplied filename, e.g. if your store_path string was "output_image.png", then imwrite would save it as a PNG image. You can see the list of valid extensions in the OpenCV docs.
One caveat to be aware of when writing images to disk with OpenCV is that the scaling will differ depending on the Mat type; that is, for floats the images are expected to be within the range [0, 1], while for say, unsigned chars they'll be from [0, 256).
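For instance, if you have a CV_32F image with values in [0, 1] (a sketch; the variable names are placeholders), rescale it to 8-bit before writing:
cv::Mat float_img;   // CV_32F, values assumed to be in [0, 1]
// ... fill float_img ...
cv::Mat out8u;
float_img.convertTo(out8u, CV_8U, 255.0); // scale [0, 1] floats to [0, 255] bytes
cv::imwrite("output_image.png", out8u);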
For IplImages, I'd advise just switching to use Mat, as the old C-interface is deprecated. You can convert an IplImage to a Mat via cvarrToMat then use the Mat, e.g.
IplImage* oldC0 = cvCreateImage(cvSize(320,240),16,1);
Mat newC = cvarrToMat(oldC0);
//now can use cv::imwrite with newC
alternately, you can convert an IplImage to a Mat just with
Mat newC(oldC0); //where newC is a Mat and oldC0 is your IplImage
Also I just noticed this tutorial at the OpenCV website, which gives you a walk-though on loading and saving images in a (desktop) environment.

detector->detect(img, keypoint); error

I want to implement bag of words in OpenCV. After detector->detect(img, keypoint); detects the keypoints, the following error appears when I try to clean the keypoints with keypoint.clear(); or when the function returns:
"Unhandled exception at 0x011f45bb in BOW.exe: 0xC0000005: Access violation reading location 0x42ebe098."
The detected keypoints also have bizarre coordinates, like cv::Point_ pt{ x=-1.5883997e+038, y=-1.5883997e+038 }.
Part of the code
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("FlannBased");
Ptr<DescriptorExtractor> extractor = new SurfDescriptorExtractor();
Ptr<FeatureDetector> detector = new SurfFeatureDetector(2000);
void extractTrainingVocabulary() {
IplImage *img;
int i,j;
CvSeq *imageKeypoints = 0;
for(j=1;j<=60;j++)
for(i=1;i<=60;i++){
sprintf( ch,"%d%s%d%s",j," (",i,").jpg");
const char* imageName = ch;
Mat img = imread(ch);
vector<KeyPoint> keypoint;
detector->detect(img, keypoint);
Mat features;
extractor->compute(img, keypoint, features);
bowTrainer.add(features);
keypoint.clear();//problem
}
return;
}
I noticed something about your code: in extractTrainingVocabulary() you declare IplImage *img; and inside the loop you declare another variable with the same name (but a different type): Mat img = imread(ch);.
Even though that might not be the problem, it's certainly not good practice. I would fix that immediately and update the code in your question; a cleaned-up sketch of the loop is below.
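A cleaned-up sketch of that loop, with the shadowed IplImage removed and the missing filename buffer declared locally (the 60x60 naming scheme is kept from the original; note that crashes on keypoint.clear() with garbage coordinates are, in practice, often caused by mixing debug and release OpenCV binaries, which is worth checking separately):
void extractTrainingVocabulary() {
    char ch[128];                        // filename buffer, missing from the posted snippet
    for (int j = 1; j <= 60; j++) {
        for (int i = 1; i <= 60; i++) {
            sprintf(ch, "%d%s%d%s", j, " (", i, ").jpg");
            Mat img = imread(ch);        // no shadowed IplImage* any more
            if (img.empty())
                continue;                // skip missing files instead of failing later
            vector<KeyPoint> keypoint;
            detector->detect(img, keypoint);
            Mat features;
            extractor->compute(img, keypoint, features);
            bowTrainer.add(features);
        }                                // keypoint goes out of scope here; no explicit clear() needed
    }
}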
