Difference between Mat, MCvMat, Image, MIplImage, ScalarArray, Matrix in EmguCV4 - opencv

What is the difference between these types of image storage in EmguCV4? I'm really confused, and unfortunately there is no good training resource for EmguCV4, not even on the EmguCV4 site.
Why should there be so many types for storing an image? When and where should we use which type?
Mat, MCvMat
Image, MIplImage
ScalarArray
Matrix

Related

What's the dataset type in the TensorFlow object-detection API?

I am trying to do my own object detection using my own dataset. I started my first machine learning program from the Google TensorFlow object detection API; the link is here: eager_few_shot_od_training_tf2_colab.ipynb
In the colab tutorial, the author uses JavaScript to label the images, with a result like this:
gt_boxes = [
np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)
]
When I run my own program, I use labelImg instead of the JavaScript tool, but the dataset is not compatible.
Now I have two questions. First, what is the dataset format in the colab tutorial: COCO, YOLO, VOC, or something else? Second, how do I convert between labelImg data and the colab tutorial's format? My goal is to label data with labelImg and then substitute it into the colab tutorial.
The "data type" are just ratio values based on the height and width of the image. So the coordinates are just ratio values for where to start and end the bounding box. Since each image is going to be preprocessed, that is, it's dimensions are changed when fed into the model (batch,height,width,channel) the bounding box coordinates must have the correct ratio as the image might change dimensions from it's original size.
Like for the example, the model expects images to be 640x640. So if you provide an image of 800x600 it has to be resized. Now if the model gave back the coordinates [100,100,150,150] for an 640x640, clearly that would not be the same for 800x600 images.
However, to get this data format you should use the PascalVOC format when exporting from labelImg; a conversion sketch is shown below.
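To illustrate the conversion (this is not part of the tutorial; the file name annotation.xml and the [ymin, xmin, ymax, xmax] ordering used by the TF object detection API are assumptions), a minimal Python sketch that reads a labelImg PascalVOC XML file and produces normalized boxes could look like this:
import xml.etree.ElementTree as ET
import numpy as np

def voc_to_normalized_boxes(xml_path):
    # Convert labelImg PascalVOC pixel boxes to [ymin, xmin, ymax, xmax] ratios.
    root = ET.parse(xml_path).getroot()
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        xmin = float(bb.find("xmin").text)
        ymin = float(bb.find("ymin").text)
        xmax = float(bb.find("xmax").text)
        ymax = float(bb.find("ymax").text)
        # Dividing by the image size turns pixel coordinates into ratios in [0, 1].
        boxes.append([ymin / height, xmin / width, ymax / height, xmax / width])
    return np.array(boxes, dtype=np.float32)

gt_boxes = [voc_to_normalized_boxes("annotation.xml")]  # one XML file per image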
The typical way to do this is to create TFRecord files and decode them in your training script in order to create datasets. However, you are free to build your TensorFlow dataset with whatever method you like in order to train your model.
Hope this answered your questions.

Z-Value, Z-Near and Z-Far Within Kernel Function

While loading a texture within a kernel function in Metal, is it possible to find the default z-value (if it exists at all) of the texture being sampled, and the z-near and z-far values (likewise, if these values exist at all when the kernel is used instead of the normal pipeline using shaders) of the space in which the texture resides?
What I am trying to understand is:
When sampling a texture within a kernel function, is it possible for us to change (or set) the z-value of the texture before writing it? I have not been able to find this information, or the z-near and z-far values (is it even possible to define these values manually when using a kernel function?), in the documentation.
Thanks.

Improve image quality

I need to improve image quality, from low quality to high HD quality. I am using the OpenCV libraries. I experimented a lot with GaussianBlur(), Laplacian(), transformation functions, filter functions, etc., but all I managed to do was convert the image to HD resolution while keeping the same quality. Is it possible to do this? Do I need to implement my own algorithm, or is there an established way to do it? I will really appreciate any kind of help. Thanks in advance.
I used this link for my reference. It has other interesting filters that you can play with.
If you are using C++:
void cv::detailEnhance(InputArray src, OutputArray dst, float sigma_s = 10, float sigma_r = 0.15f)
If you are using python:
dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
The parameter sigma_s determines how big the neighbourhood of pixels used for filtering is.
The parameter sigma_r determines how the different colours within the neighbourhood of pixels are averaged with each other; its range is 0 to 1. A smaller value means similar colours are averaged out while distinctly different colours remain as they are.
Since you are looking for sharpness in the image, I would suggest you keep the filtering neighbourhood as small as possible.
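As a quick way to try this end to end, here is a small Python sketch (the input file name and the two sigma_r values are just placeholders) that writes out the two sharpened variants compared below:
import cv2

src = cv2.imread("input.jpg")  # placeholder input image

# Lower sigma_r: only very similar colours are smoothed, so fine detail is preserved.
low_r = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.05)

# Higher sigma_r: a wider range of colours gets averaged, giving a stronger effect.
high_r = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.5)

cv2.imwrite("sharpened_low_sigma_r.jpg", low_r)
cv2.imwrite("sharpened_high_sigma_r.jpg", high_r)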
Here is the result I obtained for a sample image:
1. Original image:
2. Sharpened image for lower sigma_r value:
3. Sharpened image for higher sigma_r value:
Check the above mentioned link for more information.
How about applying Super Resolution in OpenCV? A reference article with more details can be found here: https://learnopencv.com/super-resolution-in-opencv/
So basically you will need to have the Python dependency opencv-contrib-python installed, together with a working version of opencv-python.
There are different techniques for super resolution in OpenCV that you can choose from, including EDSR, ESPCN, FSRCNN, and LapSRN. Code examples in both Python and C++ are included in the tutorial article as well for easy reference.
A correction is needed:
dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
Using a kernel here will give an error.
+1 to kris stern's answer.
If you are looking for a practical implementation of super resolution using a pretrained model in OpenCV, have a look at the notebook below, and also the video describing the details.
https://github.com/pankajr141/experiments/blob/master/Reasoning/ComputerVision/super_resolution_enhancing_image_quality_using_pretrained_models.ipynb
https://www.youtube.com/watch?v=JrWIYWO4bac&list=UUplf_LWNn0a9ubnKCZ-95YQ&index=4
Below is a sample code using OpenCV:
import cv2  # requires opencv-contrib-python for the dnn_superres module

# setting up the model initialization
model_pretrained = cv2.dnn_superres.DnnSuperResImpl_create()

# filemodel_filepath points to the pretrained model file (e.g. a .pb file),
# modelname is the algorithm name (e.g. "edsr") and scale is its upscaling factor
model_pretrained.readModel(filemodel_filepath)
model_pretrained.setModel(modelname, scale)

# prediction or upscaling of the small input image
img_upscaled = model_pretrained.upsample(img_small)

Different matching results for opencv's descriptor_extractor_matcher when loading data from file

I am using the following code in the descriptor_extractor_matcher.cpp sample to compute the descriptors of img1 (Mat descriptors01), write them to disk, and load them back (Mat descriptors1). (The same steps are used for the keypoints, but the code is much the same ...)
Ptr<DescriptorExtractor> descriptorExtractor = DescriptorExtractor::create( argv[2] );
...
Mat descriptors01;
descriptorExtractor->compute( img1, keypoints1, descriptors01 ); // compute descriptors
FileStorage storage("test.yml", FileStorage::WRITE); //save it to disc
storage << "blub" << descriptors01;
storage.release();
Mat descriptors1;
FileStorage storage1("test.yml", FileStorage::READ); // load it again
storage1["blub"] >> descriptors1;
storage1.release();
The keypoints & descriptors for image 2 are computed and used without saving and loading.
I am using only the loaded data (keypoints & descriptors) for image 1 for the matching, so for the descriptors: descriptors1.
Now here is the thing: if I compare the cases
A) Using the code above for computing, storing and loading;
B) Using only loaded data (without computing and store it again)
for the matching, I get different results, as you can see in the pictures for the keypoints as well as for the matching descriptors. I would have expected no differences... What am I missing here? Must I compare two images, and can I not compare an image to a stored set of keypoints and its descriptors?
Of course I'm using the same values for [detectorType] [descriptorType] [matcherType] [matcherFilterType] [image1] [image2] [ransacReprojThreshold], by the way ;)
Thanks a lot!
UPDATE:
It seems the issue depends on the descriptor. Working with loaded descriptors works for SIFT and SURF, but not for ORB and others. Images: results with different descriptors for cases A and B:
Try repeating A or B individually and see if the results come out the same. I suspect they won't, and I say that because: #1 your object of interest has poor texture, which results in poor descriptors; #2 the viewpoint change between the two images is huge, which leads to the problem of non-repeatability even for the best of descriptors like SIFT.
Now comes the part of how to solve this repeatability issue: #1 use some threshold on the norm of the descriptor so that only very strong features are used for matching; #2 use the epipolar constraint along with RANSAC to filter out wrong matches (see the sketch after the images below). I am attaching two images to show how much the filter affects the correspondences.
Using SURF to find correspondences between the two images (the two images in a red-cyan colormap).
After filtering the matches using RANSAC with the epipolar constraint.
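For reference, here is a minimal Python sketch of the RANSAC + epipolar-constraint filtering idea (the detector choice, the file names img1.png and img2.png, and the threshold values are illustrative, not taken from the original sample):
import cv2
import numpy as np

img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute descriptors (SIFT here just for illustration).
detector = cv2.SIFT_create()
kp1, des1 = detector.detectAndCompute(img1, None)
kp2, des2 = detector.detectAndCompute(img2, None)

# Brute-force matching with cross-check gives an initial set of matches.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = matcher.match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Estimate the fundamental matrix with RANSAC; the mask marks the matches
# that are consistent with the epipolar geometry.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
inliers = [m for m, keep in zip(matches, mask.ravel()) if keep]
print(len(matches), "raw matches,", len(inliers), "after epipolar/RANSAC filtering")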
Feel free to comment and discuss further over this issue. :-)

OpenCV CalcPca input data

I am trying to implement a face recognition training function with OpenCV, using "eigenfaces". I have the sample data, but I can't find any info on the CalcPCA function arguments. All I know is that it takes a data matrix, a reference to the average eigenface matrix, a reference to the eigenvector matrix, and a reference to the eigenvalue matrix.
My question is, how should I pass the data from several test image matrices into the first argument of CalcPCA so I can get the average eigenface and the eigenvectors?
This seems to be a good example: http://tech.groups.yahoo.com/group/OpenCV/message/47627
You can do it this way:
You have, for example, 10 Mats, where each Mat represents an image.
Now you can create a new Mat and put the previous 10 Mats into it.
At this point, use Mat::push_back(...) to insert the 10 Mats.
Hope this is helpful for you.
Marco
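As a rough illustration of the idea described above (a Python sketch using the modern cv2.PCACompute interface instead of the legacy CalcPCA call; the file names and number of components are placeholders), the data matrix is built by flattening each image into one row:
import cv2
import numpy as np

# Placeholder list of face images, all assumed to have the same size.
image_paths = ["face_0.png", "face_1.png", "face_2.png"]

# Flatten each grayscale image into a single row of the data matrix.
rows = []
for path in image_paths:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    rows.append(img.reshape(1, -1).astype(np.float32))
data = np.vstack(rows)  # shape: (num_images, width * height)

# PCA over the rows yields the average face and the eigenfaces (eigenvectors).
mean, eigenvectors = cv2.PCACompute(data, mean=None, maxComponents=5)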
