What are the model configuration parameters in tensorflow object detection API? - machine-learning

In the config file there are some parameters which I don't understand properly. I will mention them here.
first_stage_features_stride - Is this the ratio of input/output?
height_stride (in the first_stage_anchor_generator) - What is this?

first_stage_features_stride describes the output stride, i.e. the ratio between the resolution of the input image and the resolution of the feature map that the first stage runs on. So yes, you have to understand the difference between input and output strides: a stride of 16 means the feature map is 16 times smaller than the input in each spatial dimension.
height_stride is the same idea: it describes how the anchors behave in the real image w.r.t. the final feature map, i.e. how many input-image pixels you move down for each step of one cell across the feature map.
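For concreteness, here is a typical excerpt from one of the sample Faster R-CNN pipeline configs shipped with the API (the exact values below are illustrative, though 16 is the common default):
feature_extractor {
  type: 'faster_rcnn_resnet101'
  first_stage_features_stride: 16
}
first_stage_anchor_generator {
  grid_anchor_generator {
    scales: [0.25, 0.5, 1.0, 2.0]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}
With height_stride: 16 and width_stride: 16, a set of anchors is generated every 16 pixels of the input image, which lines up one anchor set with each cell of the stride-16 feature map.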

Related

How do we provide the labels to our training set in transfer learning?

I am trying to use a pre-trained model, specifically MobileNet, for training on logos. If we build our CNN model from scratch, we provide the labels ourselves. I don't know how to provide the labels in transfer learning. Or does the ImageDataGenerator automatically provide the labels when we use the flow_from_directory function? A small part of the code is shown below. Please elaborate!
training_set = train_datagen.flow_from_directory('Datasets/Train',
                                                 target_size=(224, 224),
                                                 batch_size=32,
                                                 class_mode='categorical')
r = model.fit_generator(training_set,
                        validation_data=test_set,
                        epochs=5,
                        steps_per_epoch=len(training_set),
                        validation_steps=len(test_set))
I believe that the labels are inferred from the directory schema, so if your main directory looks like this for each of the train and test sets:
main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
You should be fine.
For more examples, refer to this.
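If you want to verify the inferred mapping yourself, you can inspect the generator's class_indices attribute after calling flow_from_directory (a small sketch; the directory path is the one from the question):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory('Datasets/Train',
                                                 target_size=(224, 224),
                                                 batch_size=32,
                                                 class_mode='categorical')
# one class per subfolder, e.g. {'class_a': 0, 'class_b': 1}
print(training_set.class_indices)
With class_mode='categorical', each batch yields (images, one-hot labels), so no manual labelling is needed.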

How can I predict the next image from a series of images?

I am planning to predict the next image from an image sequence. I have searched the internet (Google/YouTube) for tutorials and for similar work, but I couldn't find any.
I want to know whether it is possible to find the pattern and predict the next image, and whether there are some tutorials for that.
You can use a CNN:
The input is then not 3 * w * h but (3 * number of images) * w * h - you can simply concatenate the frames along the depth (channel) axis.
The output is an image instead of a class, so there is no flattening in between... or a reshape has to be added.
Have a look at
Fully Convolutional Networks for Semantic Segmentation and Image-to-Image translation.
If you haven't seen it already: Keras is pretty handy.
You might also be interested in the concept of Optical Flow
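As a minimal sketch of the CNN idea above (the frame count, image size, and layer widths are assumed values for illustration, not a recommendation):
from tensorflow.keras import layers, models

num_frames, h, w = 4, 64, 64  # assumed: predict frame 5 from frames 1-4

inputs = layers.Input(shape=(h, w, 3 * num_frames))  # frames stacked on channels
x = layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
x = layers.Conv2D(32, 3, padding='same', activation='relu')(x)
outputs = layers.Conv2D(3, 3, padding='same', activation='sigmoid')(x)  # next RGB frame
model = models.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
model.summary()
Because every layer is convolutional, the output keeps the spatial size of the input and no flattening or reshape is needed.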

Is my general understanding of finding weights correct?

I started a course in Deep Learning. I'm trying to make an example in order to explain to myself how the weights are found mathematically.
If what I wrote below is nonsense I'll be glad to hear an explanation. Thanks.
So, for a given image we compute WX + b. We get some vector Y, and then we compare it to a desired label vector L according to some distance D(S(Y), L). I'm assuming that we calculate D with "Cosine Similarity". For simplicity, S(Y) == Y. So what we're trying to do is to find W and b so that D(Y, L) will be one.
Let's say we have an image X of the letter "a" and two labels ("a", "b"). Then L = (1, 0). We want to calculate W and b which give us a vector Y that matches L exactly. We convert X to a vector (x_1, ..., x_9). Since we have 2 labels and the size of X is 9, W is a 2x9 matrix and b is a vector of length 2. So we get Y = WX + b. This gives us the following system of equations:
y_1 = w_{1,1} x_1 + ... + w_{1,9} x_9 + b_1
y_2 = w_{2,1} x_1 + ... + w_{2,9} x_9 + b_2
So now we need to solve this system for the w_{i,j} and b_i such that Y = L.
If what I wrote above is not nonsense, I don't quite understand where finding the minimum is applied.
In deep learning, finding the minimum means minimizing the cross-entropy function. The cross-entropy represents the "loss" of the network. We therefore try, by changing the weights and biases of the network, to produce an output which minimizes the cross-entropy loss; that is, we minimize D(S, L). This is where the minimum comes in: instead of solving your system of equations exactly (which is generally impossible once there are many training images), we search iteratively, e.g. with gradient descent, for the W and b at which the loss is smallest.
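As a minimal worked sketch of one such gradient-descent step (plain numpy, random data, shapes matching the 9-pixel / 2-label example above):
import numpy as np

X = np.random.rand(9)          # flattened image, 9 pixels
L = np.array([1.0, 0.0])       # one-hot label for "a"
W = np.random.randn(2, 9)      # 2 labels x 9 pixels
b = np.zeros(2)
lr = 0.1                       # learning rate

Y = W @ X + b
S = np.exp(Y) / np.exp(Y).sum()   # softmax S(Y)
loss = -np.sum(L * np.log(S))     # cross-entropy D(S, L)

grad_Y = S - L                    # gradient of the loss w.r.t. Y (softmax + cross-entropy)
W -= lr * np.outer(grad_Y, X)     # gradient w.r.t. W is grad_Y * X^T
b -= lr * grad_Y
Repeating this step over many images drives the loss toward its minimum, rather than solving the equations in closed form.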

Improve image quality

I need to improve image quality, from low quality to high (HD) quality. I am using the OpenCV libraries. I experimented a lot with GaussianBlur(), Laplacian(), transformation functions, filter functions etc., but all I could achieve was to convert the image to HD resolution while keeping the same quality. Is it possible to do this? Do I need to implement my own algorithm, or is there a way it's already done? I will really appreciate any kind of help. Thanks in advance.
I used this link for my reference. It has other interesting filters that you can play with.
If you are using C++:
void detailEnhance(InputArray src, OutputArray dst, float sigma_s = 10, float sigma_r = 0.15f)
If you are using python:
dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
The parameter sigma_s determines how big the neighbourhood of pixels must be to perform the filtering.
The parameter sigma_r determines how the different colours within the neighbourhood of pixels will be averaged with each other. Its range is from 0 to 1. A smaller value means similar colors will be averaged out while distinctly different colors remain as they are.
Since you are looking for sharpness in the image, I would suggest you keep the filtering neighbourhood (sigma_s) as small as possible.
Here is the result I obtained for a sample image:
1. Original image:
2. Sharpened image for lower sigma_r value:
3. Sharpened image for higher sigma_r value:
Check the above mentioned link for more information.
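Putting it together, a minimal runnable sketch in Python ("input.jpg" is a placeholder path):
import cv2

src = cv2.imread("input.jpg")
mild = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)    # similar colors averaged out
strong = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.85)  # distinct colors largely preserved
cv2.imwrite("enhanced_low_sigma_r.jpg", mild)
cv2.imwrite("enhanced_high_sigma_r.jpg", strong)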
How about applying Super Resolution in OpenCV? A reference article with more details can be found here: https://learnopencv.com/super-resolution-in-opencv/
So basically you will need to have the Python dependency opencv-contrib-python installed, together with a working version of opencv-python.
There are different techniques for the Super Resolution in OpenCV you can choose from, including EDSR, ESPCN, FSRCNN, and LapSRN. Code examples in both Python and C++ have been included in the tutorial article as well for easy reference.
A correction is needed:
dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
Passing a kernel argument here will give an error.
+1 to Kris Stern's answer!
If you are looking for a practical implementation of super resolution using a pretrained model in OpenCV, have a look at the notebook below, and also the video describing the details.
https://github.com/pankajr141/experiments/blob/master/Reasoning/ComputerVision/super_resolution_enhancing_image_quality_using_pretrained_models.ipynb
https://www.youtube.com/watch?v=JrWIYWO4bac&list=UUplf_LWNn0a9ubnKCZ-95YQ&index=4
Below is sample code using OpenCV:
import cv2

# initialize the super-resolution model
model_pretrained = cv2.dnn_superres.DnnSuperResImpl_create()

# load the pretrained weights, then tell the model which algorithm
# and scale they correspond to, e.g. ("edsr", 4)
model_pretrained.readModel(filemodel_filepath)
model_pretrained.setModel(modelname, scale)

# prediction, i.e. upscaling
img_upscaled = model_pretrained.upsample(img_small)

OpenCV Principal Component Analysis terminology - what actually is a 'sample'?

I'm working with Principal Component Analysis (PCA) in openCV. The constructor inputs for the case I'm interested in are:
PCA(InputArray data, InputArray mean, int flags, double retainedVariance);
Regarding the InputArray 'data' the documents state the appropriate flags should be:
CV_PCA_DATA_AS_ROW indicates that the input samples are stored as
matrix rows.
CV_PCA_DATA_AS_COL indicates that the input samples are
stored as matrix columns.
My question pertains to the use of the term 'samples' in that I'm not sure what a sample is in this context.
For example, let's say I have 4 sets of data, and for the sake of illustration let's label them A-D. Now each set A through D has 8 elements. They are then set up in the Mat variable I'll use as InputArray as follows, one set per column (8 rows by 4 columns):
A1 B1 C1 D1
A2 B2 C2 D2
...
A8 B8 C8 D8
The question is, which is it:
My sets are samples?
My data elements are samples?
Another way of asking: do I have 4 samples (CV_PCA_DATA_AS_COL), or do I have 4 sets of 8 samples (CV_PCA_DATA_AS_ROW)?
As a guess, I'd choose CV_PCA_DATA_AS_COL (i.e. I have 4 samples) - but that's just where my head is at... Until I learn the correct terminology it seems the word 'sample' could apply to either reasoning.
Ugh...
So the answer was found by reversing the logic behind the documentation for the PCA::project step...
Mat PCA::project(InputArray vec)
vec – input vector(s); must have the same dimensionality and the same
layout as the input data used at PCA phase, that is, if
CV_PCA_DATA_AS_ROW are specified, then vec.cols==data.cols (vector
dimensionality)
i.e. a 'sample' is equivalent to a 'set', and the elements of a set are its 'dimensions'.
(and my guess was correct :)
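For reference, here is the same distinction in the Python bindings (a small sketch with random data; as far as I can tell, cv2.PCACompute assumes one sample per row, i.e. the DATA_AS_ROW layout):
import numpy as np
import cv2

data = np.random.rand(4, 8).astype(np.float32)  # 4 samples ("sets"), each 8-dimensional
mean, eigenvectors = cv2.PCACompute(data, mean=None, maxComponents=2)
print(mean.shape)          # (1, 8): the mean sample
print(eigenvectors.shape)  # (2, 8): top-2 principal axes in the 8-D element space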
