Read Haar cascade in distributed mode in Hadoop - OpenCV

I am using the OpenCV library for image detection in Hadoop, using the Spark framework.
I am able to run the Spark program in local mode, where the Haar file is present in the local file system.
But I get a null pointer error when reading the Haar file in distributed mode, although I have copied the Haar file to all the cluster nodes and provided its absolute path in the code:
String fileloc = "/home/centos/haarcascade_frontalface_alt.xml";
CascadeClassifier faceDetector = new CascadeClassifier(fileloc);
Error:
Caused by: java.lang.NullPointerException
at javax.xml.bind.DatatypeConverterImpl.guessLength(DatatypeConverterImpl.java:658)
at javax.xml.bind.DatatypeConverterImpl._parseBase64Binary(DatatypeConverterImpl.java:696)
at javax.xml.bind.DatatypeConverterImpl.parseBase64Binary(DatatypeConverterImpl.java:438)
at javax.xml.bind.DatatypeConverter.parseBase64Binary(DatatypeConverter.java:342)
at com.lb.customlogic.impl.CustomLogicImpl.process(CustomLogicImpl.java:82)
... 20 more
I have tried the prefixes file://, file:/ and file:///, but those are not working for me.
Do I need to add anything extra to the prefix to get the file read during execution of the program?
Since OpenCV has no support for Hadoop, I think I can't provide an HDFS shared-location path for the Haar file.

This issue was resolved after adding the --files parameter to spark-submit.
The Haar file is then distributed to all the nodes, and we only need to provide the file name in the source code:
String fileloc = "haarcascade_frontalface_alt.xml";
CascadeClassifier faceDetector = new CascadeClassifier(fileloc);
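For reference, a sketch of what the corresponding spark-submit invocation could look like. The class name, master, deploy mode and jar name below are placeholders; the relevant part is only the --files flag, which ships the cascade file to each executor's working directory so the code can open it by bare file name:

```shell
# Placeholder job details; only --files is the point of this sketch.
spark-submit \
  --class com.example.FaceDetectJob \
  --master yarn \
  --deploy-mode cluster \
  --files /home/centos/haarcascade_frontalface_alt.xml \
  face-detect.jar
```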

Related

pytorch torchvision.datasets.ImageFolder FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints

I tried to load training data with PyTorch's torchvision.datasets.ImageFolder in Colab:
transform = transforms.Compose([transforms.Resize(400),
                                transforms.ToTensor()])
dataset_path = 'ss/'
dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)
I encountered the following error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-27-7abcc1f434b1> in <module>()
2 transforms.ToTensor()])
3 dataset_path = 'ss/'
----> 4 dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
5 dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)
3 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
100 if extensions is not None:
101 msg += f"Supported extensions are: {', '.join(extensions)}"
--> 102 raise FileNotFoundError(msg)
103
104 return instances
FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp
My dataset folder contains a subfolder with many training images in PNG format, yet ImageFolder can't access them.
I encountered the same problem when I was using IPython-notebook-like tools.
First, check whether there are any hidden files under your dataset_path. Use ls -a if you are in a Linux environment.
In my case, I found a hidden directory called .ipynb_checkpoints located parallel to the image class subfolders. I think that entry confuses the PyTorch dataset. I made sure it was not useful, so I simply deleted it, and then the dataset worked fine.
Alternatively, if you would like to keep it, you can filter such hidden entries out rather than deleting them.
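The ls -a check can also be done from Python with the standard library. A small sketch (point it at whatever your dataset_path is):

```python
import os

def hidden_entries(path):
    """Return the hidden files/dirs (the ones `ls -a` reveals) directly under path."""
    return sorted(e.name for e in os.scandir(path) if e.name.startswith("."))

# e.g. hidden_entries("ss/") might reveal [".ipynb_checkpoints"]
```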
The files in the image folder need to be placed in the subfolders for each class, like this:
root/dog/xxx.png
root/dog/xxy.png
root/dog/[...]/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/[...]/asd932_.png
https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.ImageFolder
Are your files in the ss dir organized in this way?
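That layout can be checked quickly from Python with the standard library. A sketch (extension list trimmed for brevity): a class reported with zero images, or a stray .ipynb_checkpoints entry, is what produces the FileNotFoundError above.

```python
from pathlib import Path

def class_image_counts(root, exts=(".png", ".jpg", ".jpeg")):
    """Count images per class subfolder, mirroring the layout ImageFolder expects."""
    counts = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # rglob also covers nested subfolders like root/dog/[...]/xxz.png
        counts[class_dir.name] = sum(
            1 for f in class_dir.rglob("*") if f.suffix.lower() in exts
        )
    return counts

# class_image_counts("ss/")  ->  e.g. {"cat": 120, "dog": 98}
```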
1- The files in the image folder need to be placed in subfolders for each class (as Sergii Dymchenko said).
2- Use the absolute path when using Google Colab.
The solution for Google Colaboratory:
When you create a directory, Colaboratory additionally creates .ipynb_checkpoints inside it.
To solve the problem, it is enough to remove it from the folder containing the directories with images (i.e. from the train folder). You need to run:
!rm -R test/train/.ipynb_checkpoints
!ls test/train/ -a  # to make sure that the deletion has occurred
where test/train/ is my path to the dataset folders
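The same cleanup can be scripted with Python's standard library instead of shell commands. A sketch (the train-folder path is the asker's example; adjust as needed):

```python
import os
import shutil

def remove_checkpoints(train_dir):
    """Delete the hidden .ipynb_checkpoints directory Colab drops into a folder."""
    target = os.path.join(train_dir, ".ipynb_checkpoints")
    if os.path.isdir(target):
        shutil.rmtree(target)
    # os.listdir includes dotfiles, so this doubles as the `ls -a` check
    return sorted(os.listdir(train_dir))

# remove_checkpoints("test/train")
```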

How to use ROS urdf package in Drake?

I'm new to Drake, and I want to use my robot model, which is a ROS URDF package converted with the SolidWorks plugin. I've tried to use:
from pydrake.all import AddMultibodyPlantSceneGraph, DiagramBuilder, Parser

builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, 0.0)
parser = Parser(plant)
pkg_map = parser.package_map()
pkg_map.AddPackageXml("myxmlfilepath")
robot_file = "my_urdf_file_path"
robot = parser.AddModelFromFile(robot_file, "robot")
But in the last line I got an error: RuntimeError: The WaveFront obj file has no faces. I tried using ROS to display the model and it works well. My URDF file uses some .stl mesh files for visual and collision geometry. I have seen some comments here, so I guess Drake may not be able to handle .stl by default so far. What should I do, other than manually converting the .stl mesh files to .obj? Thanks in advance.
As you noted in your comment above, once you converted your STL file to an OBJ file, you were able to load your URDF file.
This is because Drake does not yet support loading STL (or DAE) meshes at runtime; they must be converted (either manually as you had done, or via the build system). For more info, please see:
https://github.com/RobotLocomotion/drake/issues/2941
In that issue, I've referenced your post as a motivator towards adding additional mesh support.
Thanks!
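For a one-off conversion without extra tooling, the OBJ format is simple enough that an ASCII STL can be translated with a short script. This is only an illustrative sketch (ASCII STL only; binary STL and normals are not handled), not anything Drake provides:

```python
def stl_ascii_to_obj(stl_text):
    """Convert an ASCII STL string to a Wavefront OBJ string (triangles only)."""
    vertices = []   # (x, y, z) tuples in order of first appearance
    index_of = {}   # vertex -> 1-based OBJ index (OBJ deduplicates vertices)
    faces = []      # 3-tuples of OBJ vertex indices
    current = []
    for line in stl_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "vertex":
            v = tuple(float(c) for c in parts[1:4])
            if v not in index_of:
                vertices.append(v)
                index_of[v] = len(vertices)
            current.append(index_of[v])
            if len(current) == 3:       # every STL facet is a triangle
                faces.append(tuple(current))
                current = []
    out = [f"v {x} {y} {z}" for x, y, z in vertices]
    out += [f"f {a} {b} {c}" for a, b, c in faces]
    return "\n".join(out) + "\n"
```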

<lang>.traineddata not found ( even when it's in the correct folder)

I'm using Tesseract to detect Spanish text in some screenshots of a game. I had some issues with spa.traineddata, so I started to train my own data file called spa1.traineddata and used the two files together to make text detection more accurate. Yesterday I ran some tests and it seemed to work well, but spa1.traineddata needed more training, so I decided to continue today. I added some new images to train spa1.traineddata, and when I wanted to test it, it threw the following error:
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Program Files\\Tesseract-OCR/tessdata/-l spa.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'-l spa\' Error opening data file /home/debian/src/github/tesseract-ocr/tesseract/bin/ndebug/x86_64-w64-mingw32-5.0.0-alpha.20200223/usr/x86_64-w64-mingw32/share/tessdata/spa1 --psm
6.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'spa1 --psm 6\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
and these are the lines of code that I use for testing:
custom_config = r'-l spa+spa1 --psm 6'
pytesseract.image_to_string(Image.open('imagenes/obv.png'), lang=custom_config)
I searched and found that this error appears when the <lang>.traineddata files are not in the tessdata folder, but mine are in the folder (that's why I was able to work with them yesterday).
I attach a screenshot of the tessdata folder; the last two files are the traineddata files.
tessdata folder
Also, in case it is useful: I'm using VS Code, Python 3.7 and Tesseract 4.
I hope you can help me (sorry for my bad English).
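One detail worth noting in the error text: Tesseract reports Failed loading language '-l spa', i.e. the entire custom_config string was passed as the language name. pytesseract's image_to_string takes the language list and the engine flags as two separate parameters, lang and config. A hedged sketch of that split (the OCR call itself is commented out because it needs a local Tesseract install):

```python
# The traceback shows Tesseract looking for a file literally named
# "-l spa.traineddata": the whole config string went into lang=.
# Languages and engine flags are separate pytesseract parameters.

lang = "spa+spa1"   # language names only, joined with "+", no "-l" prefix
config = "--psm 6"  # engine flags belong in config=

# import pytesseract
# from PIL import Image
# text = pytesseract.image_to_string(Image.open("imagenes/obv.png"),
#                                    lang=lang, config=config)
```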

Problem SamplingRateCalculatorList (00000283DDC3C0E0) : All classes are empty ! OTB + QGis

I use OTB (Orfeo ToolBox) in QGIS for classification. When I use the TrainImagesClassifier tool in a batch process, I have a problem with some images: instead of returning a model as an xml/txt file, it returns several files with these extensions: .xml_rates_1, .xml_samples_1.dbf, .xml_samples_1.prj, .xml_samples_1.shp, .xml_samples_1.shx, .xml_stats_1 (I get the same files with txt instead of xml if I use the txt file format as output).
During the execution of the algorithm, I get only one warning message:
(WARNING): file ..\Modules\Learning\Sampling\src\otbSamplingRateCalculatorList.cxx, line 99, SamplingRateCalculatorList (00000283DDC3C0E0): All classes are empty !
And after that :
(FATAL) TrainImagesClassifier: No samples found in the inputs!
The problem is that I then want to use ImageClassifier, which takes the model from TrainImagesClassifier as input, and I don't have it.
Thanks for your help

Implementing bag of word algorithm from opencv sample codes

I am trying to get bagofwords_classification.cpp from the OpenCV 2.4.5 sample code working. What changes are required in this .cpp file for the code to work properly? I am new to OpenCV and still working through the sample code.
How and where do I add the feature detector, descriptor extractor, and descriptor matcher in that .cpp code?
Whenever I run any sample it never displays results; it just prints information about what the .cpp file is going to do. In matching_to_many_images.cpp, for example, the images are even saved to file, but still no results are shown.
To show an image, you can use cvShowImage("Title", image) or imshow(), depending on whether the image is an IplImage or a Mat.
The code example is not broken; the program uses command-line arguments, so to start it you need to pass certain arguments.
From the code
[feature detector]
Feature detector name (e.g. SURF, FAST...) - see createFeatureDetector() function.
[descriptor extractor]
Descriptor extractor name (e.g. SURF, SIFT) - see createDescriptorExtractor() function.
[descriptor matcher]
Descriptor matcher name (e.g. BruteForce) - see createDescriptorMatcher() function.
Then, from those arguments, it calls:
Ptr<FeatureDetector> featureDetector = createFeatureDetector( ddmParams.detectorType );
Ptr<DescriptorExtractor> descExtractor = createDescriptorExtractor( ddmParams.descriptorType );
