White noise test from spectral analysis - time-series

What is the white noise test from spectral analysis? In my project I am asked to evaluate whether the residual series behaves like a white noise series, using the Ljung-Box test and the white-noise test from spectral analysis. I fitted a model with auto.arima in R to the differenced data (differenced so that it is stationary), ran the Ljung-Box test on the residuals in R, and got a p-value of 0.9783, so I concluded that the residuals behave like white noise. How do I conduct the white noise test from spectral analysis? If somebody can show me how to do it in R, that would be very helpful.
Is there a test I can use for this in R?
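One common reading of "white noise test from spectral analysis" is Bartlett's cumulative periodogram test: for white noise the periodogram is roughly flat, so the normalized cumulative periodogram should stay close to a straight diagonal line. The sketch below is a NumPy illustration under that assumption, not necessarily the test the project has in mind. The function name `cumulative_periodogram_test` is made up for this example, and the bound 1.36 / sqrt(m) is the usual large-sample 5% Kolmogorov-Smirnov approximation. In R, the `cpgram` function in the stats package plots the cumulative periodogram with a confidence band.

```python
import numpy as np

def cumulative_periodogram_test(x):
    """Return (statistic, approximate 5% bound) for a white-noise check.

    Hypothetical helper: compares the normalized cumulative periodogram
    with the diagonal expected under white noise.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    m = (n - 1) // 2                      # frequencies strictly between 0 and Nyquist
    power = np.abs(np.fft.rfft(x)[1:m + 1]) ** 2
    cum = np.cumsum(power) / power.sum()  # normalized cumulative periodogram
    diagonal = np.arange(1, m + 1) / m    # expected shape under white noise
    stat = np.max(np.abs(cum - diagonal))
    bound = 1.36 / np.sqrt(m)             # approximate large-sample 5% KS bound
    return stat, bound

rng = np.random.default_rng(0)
white = rng.standard_normal(512)          # stand-in for model residuals

# A strongly autocorrelated AR(1) series for contrast.
ar1 = np.zeros(512)
for t in range(1, 512):
    ar1[t] = 0.9 * ar1[t - 1] + white[t]

for name, series in (("white", white), ("ar1", ar1)):
    stat, bound = cumulative_periodogram_test(series)
    print(name, "statistic:", round(stat, 3), "bound:", round(bound, 3))
```

A statistic above the bound suggests the series is not white noise; a statistic below it means the test found no evidence against whiteness.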

Related

Semantic Segmentation: How to evaluate the noise influence of the effectivity and robustness of the medical image segmentation?

I have done some pre-processing, including N4 bias correction, noise removal and scaling, on 3D medical MRIs, and I was asked one question:
How to evaluate the influence of noise on the effectiveness and robustness of the medical image segmentation? When the image structure is affected by various kinds of noise, the extracted features deteriorate. This effect should be addressed in the context of the method's effectiveness at different noise intensities.
How do I evaluate the effect of noise, and how do I justify the noise removal method used in the scientific manuscript?
I don't know if this will be helpful, but I once did something similar in a class on nuclear magnetic resonance.
In that case we used the Shepp-Logan phantom with the FFT, then added noise to the picture (random numbers with a Gaussian distribution).
When you transform the image back to the phantom, you can see the effects of the noise and sometimes artifacts (mostly due to the FFT algorithm and the window function chosen).
What I did was compare the mean intensity of the image before and after, and then look at the edges of the phantom (the skull) to see how sharp the transition from white to black, and vice versa, remains.
This can easily be tested with MATLAB code and the phantom. Once you have the accuracy you need, you can apply the algorithm you chose to real images.
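The experiment described above could be sketched roughly as follows in NumPy (rather than MATLAB). This is only an illustration under stated assumptions: a simple bright ring stands in for the Shepp-Logan phantom, the noise is assumed to be added in the Fourier domain, and the helper names `ring_phantom`, `add_noise_via_fft` and `edge_contrast` are invented for the example.

```python
import numpy as np

def ring_phantom(size=128):
    """Simplified stand-in for the Shepp-Logan phantom: bright 'skull' ring."""
    axis = np.linspace(-1.0, 1.0, size)
    xx, yy = np.meshgrid(axis, axis)
    r = np.hypot(xx, yy)
    img = np.zeros((size, size))
    img[r < 0.8] = 1.0   # skull
    img[r < 0.7] = 0.3   # interior
    return img, r

def add_noise_via_fft(img, noise_std, rng):
    """Add complex Gaussian noise in the Fourier domain and transform back."""
    k = np.fft.fft2(img)
    # Scale so each image-domain noise component has std of about noise_std.
    scale = noise_std * np.sqrt(img.size)
    noise = scale * (rng.standard_normal(k.shape)
                     + 1j * rng.standard_normal(k.shape))
    return np.abs(np.fft.ifft2(k + noise))

def edge_contrast(img, r):
    """Mean intensity inside the skull ring minus mean just outside it."""
    return img[(r > 0.72) & (r < 0.78)].mean() - img[(r > 0.82) & (r < 0.95)].mean()

rng = np.random.default_rng(0)
clean, r = ring_phantom()
results = {}
for noise_std in (0.0, 0.05, 0.3):
    noisy = add_noise_via_fft(clean, noise_std, rng)
    results[noise_std] = (noisy.mean(), edge_contrast(noisy, r))
    print(noise_std, "mean:", results[noise_std][0],
          "edge contrast:", results[noise_std][1])
```

The same two numbers (mean intensity and edge contrast) can then be tracked before and after a denoising step to see how much of the loss it recovers at each noise level.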

Normalizing lighting conditions for image recognition

I am using OpenCV to process a video to identify frames containing objects using a rudimentary algorithm (background subtraction → brightness threshold → dilation → contour detection).
The video is shot over the course of a day, so lighting conditions change gradually, and I expect that a brightness and/or contrast normalization step applied first would improve my results.
Answers to this question suggest using convertScaleAbs, or contrast optimization with histogram clipping, or histogram equalization.
Of these techniques or others, what would be the best preprocessing step to normalize the frames?
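To make one of the named options concrete, here is a minimal NumPy sketch of histogram equalization on 8-bit grayscale frames. It is an illustration of the technique only, not a recommendation of the best preprocessing step. The helper `equalize_hist` and the synthetic "dark" and "bright" frames are assumptions for the example; OpenCV offers `cv2.equalizeHist` for 8-bit single-channel images.

```python
import numpy as np

def equalize_hist(gray):
    """Histogram-equalize an 8-bit grayscale image (hypothetical helper)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]            # CDF value at the darkest level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                       # constant image: nothing to stretch
        return gray.copy()
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[gray]                     # apply the lookup table per pixel

rng = np.random.default_rng(0)
scene = rng.uniform(0, 255, size=(64, 64))          # synthetic scene content
dark = (0.3 * scene + 10).astype(np.uint8)          # same scene, dim lighting
bright = (0.6 * scene + 80).astype(np.uint8)        # same scene, strong lighting

dark_eq, bright_eq = equalize_hist(dark), equalize_hist(bright)
print("mean before:", dark.mean(), bright.mean())
print("mean after: ", dark_eq.mean(), bright_eq.mean())
```

Because equalization depends only on the rank order of intensities, two frames that differ by a monotone brightness change end up with similar histograms, which is what a fixed brightness threshold later in the pipeline needs.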

How is a linear autoencoder equal to PCA?

I would like a mathematical proof of this. Does anyone know a paper that covers it, or can someone work out the math?
https://pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/
PCA is restricted to a linear map, while autoencoders can have nonlinear encoders/decoders.
A single-layer autoencoder with a linear transfer function is nearly equivalent to PCA, where "nearly" means that the W found by the AE and by PCA won't be the same, but the subspaces spanned by the respective W's will be.
Here is a detailed mathematical explanation:
https://arxiv.org/abs/1804.10253
This paper also shows that using a linear autoencoder, it is possible not only to compute the subspace spanned by the PCA vectors, but it is actually possible to compute the principal components themselves.
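The subspace claim can be checked numerically. The sketch below is an illustration, not the construction from either linked source: it fits a linear autoencoder to synthetic data and compares the projector onto the decoder's row space with the projector onto the top-k PCA directions. The data, the dimensions, and the use of alternating least squares (instead of backpropagation) are all assumptions chosen to keep the example short and deterministic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 5, 2

# Synthetic, centered data with well-separated variances (demo assumption).
scales = np.array([3.0, 2.0, 0.5, 0.3, 0.1])
rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
X = (rng.standard_normal((n, d)) * scales) @ rotation.T
X -= X.mean(axis=0)

# PCA: top-k right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_pca = Vt[:k].T                       # shape (d, k)
P_pca = V_pca @ V_pca.T                # projector onto the PCA subspace

# Linear autoencoder X_hat = X @ W_enc @ W_dec, minimizing squared
# reconstruction error by alternating least squares.
W_dec = rng.standard_normal((k, d))
for _ in range(100):
    W_enc = np.linalg.pinv(W_dec)                   # best encoder for this decoder
    Z = X @ W_enc                                   # codes, shape (n, k)
    W_dec = np.linalg.lstsq(Z, X, rcond=None)[0]    # best decoder for these codes

P_ae = np.linalg.pinv(W_dec) @ W_dec   # projector onto the decoder's row space

print("max projector difference:", np.abs(P_ae - P_pca).max())
print("decoder rows match PCA vectors up to sign:",
      np.allclose(np.abs(W_dec), np.abs(V_pca.T)))
```

Comparing projectors rather than the weight matrices themselves is the point: any invertible k-by-k mixing of the codes gives the same reconstruction, so the weights are only determined up to that mixing.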

How many learning curves should I plot for a multi-class logistic regression classifier?

If we have K classes, do I have to plot K learning curves?
Because it seems impossible to me to calculate the train/validation error against all K theta vectors at once.
To clarify, the learning curve is a plot of the training & cross validation/test set error/cost vs training set size. This plot should allow you to see if increasing the training set size improves performance. More generally, the learning curve allows you to identify whether your algorithm suffers from a bias (under fitting) or variance (over fitting) problem.
It depends. Learning curves do not concern themselves with the number of classes. Like you said, it is a plot of training set and test set error, where that error is a numerical value. This is all learning curves are.
That error can be anything you want: accuracy, precision, recall, F1 score etc. (even MAE, MSE and others for regression).
However, the error you choose should be one that applies to your specific problem, and that choice indirectly affects how you should use learning curves.
Accuracy is well defined for any number of classes, so if you use this, a single plot should suffice.
Precision and recall, however, are defined only for binary problems. You can somewhat generalize them (see here for example) by considering the binary problem with classes x and not x for each class x. In that case, you will probably want to plot learning curves for each class. This will also help you identify problems relating to certain classes better.
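As a small illustration of the difference, the sketch below computes one accuracy value for a three-class problem and, separately, one-vs-rest precision and recall for each class. The labels are made up and the helper names are hypothetical. Evaluating these on the training and validation sets at each training-set size would give one learning curve for accuracy and K curves per one-vs-rest metric.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Single number regardless of how many classes there are."""
    return float(np.mean(y_true == y_pred))

def per_class_precision_recall(y_true, y_pred, n_classes):
    """One-vs-rest precision and recall: class c versus 'not c'."""
    out = {}
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        predicted = np.sum(y_pred == c)   # everything labeled c by the model
        actual = np.sum(y_true == c)      # everything that really is c
        precision = tp / predicted if predicted else float("nan")
        recall = tp / actual if actual else float("nan")
        out[c] = (float(precision), float(recall))
    return out

# Made-up labels for illustration.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

print("accuracy:", accuracy(y_true, y_pred))
for c, (p, r) in per_class_precision_recall(y_true, y_pred, 3).items():
    print("class", c, "precision:", p, "recall:", r)
```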
If you want to read more about performance metrics, I like this paper a lot.

ROC curves cross validation

How to generate a ROC curve for a cross validation?
For a single test I think I should threshold the classification scores of SVM to generate the ROC curve.
But I am unclear about how to generate it for a cross validation?
After a complete round of cross validation, all observations have been classified once (although by different models) and have been given an estimated probability of belonging to the class of interest, or a similar statistic. These probabilities can be used to generate a ROC curve in exactly the same way as probabilities obtained on an external test set. Just calculate the classwise error rates as you vary the classification threshold from 0 to 1 and you are all set.
However, you would typically like to perform more than one round of cross validation, as the performance varies depending on how the folds are divided. It is not obvious to me how to calculate the mean ROC curve of all rounds. I suggest plotting them all and calculating the mean AUC.
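The procedure could be sketched like this in NumPy. It is a toy illustration under several assumptions: the data are synthetic, a simple mean-difference scorer stands in for the SVM, and all function names are made up for the example. Each round pools the out-of-fold scores into one ROC curve, and the AUC is then averaged over rounds.

```python
import numpy as np

def fit_mean_difference(X, y):
    """Stand-in scorer (not an SVM): sigmoid of the projection onto the
    difference of class means, giving a score between 0 and 1."""
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    w = mu1 - mu0
    b = -w @ (mu0 + mu1) / 2
    return lambda Z: 1.0 / (1.0 + np.exp(-(Z @ w + b)))

def out_of_fold_scores(X, y, k, rng):
    """One round of k-fold CV: every observation is scored exactly once."""
    scores = np.empty(len(y))
    for test_idx in np.array_split(rng.permutation(len(y)), k):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        scores[test_idx] = fit_mean_difference(X[train], y[train])(X[test_idx])
    return scores

def roc_curve(y, scores):
    """Sweep the threshold across the sorted scores, highest first."""
    hits = y[np.argsort(-scores)]
    tpr = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - hits) / (1 - hits).sum()))
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

rng = np.random.default_rng(0)
n_per_class = 100
X = np.vstack([rng.standard_normal((n_per_class, 2)),
               rng.standard_normal((n_per_class, 2)) + [1.5, 0.0]])
y = np.repeat([0, 1], n_per_class)

fpr, tpr = roc_curve(y, out_of_fold_scores(X, y, 5, rng))   # one pooled curve
aucs = [auc(*roc_curve(y, out_of_fold_scores(X, y, 5, rng))) for _ in range(10)]
mean_auc = float(np.mean(aucs))
print("AUC per round:", np.round(aucs, 3), "mean:", round(mean_auc, 3))
```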
As follow-up to Backlin:
The variation in the results for different runs of k-fold or leave-n-out cross validation shows the instability of the models. This is valuable information.
Of course you can pool the results and just generate one ROC.
But you can also plot the set of curves (see e.g. the R package ROCR), or calculate e.g. the median and IQR at different thresholds and construct a band depicting these variations.
Here's an example: the shaded areas are the interquartile ranges observed over 125 iterations of 8-fold cross validation. The thin black areas contain half of the observed specificity-sensitivity pairs for one particular threshold, with the median marked by x (ignore the + marks).
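A band of this kind could be computed along the following lines. This NumPy sketch is not the code behind the figure: it uses synthetic data, a simple mean-difference scorer as a stand-in classifier, and 25 iterations instead of 125 to keep it quick. Function and variable names are made up for the example.

```python
import numpy as np

def out_of_fold_scores(X, y, k, rng):
    """One iteration of k-fold CV with a stand-in mean-difference scorer."""
    scores = np.empty(len(y))
    for test_idx in np.array_split(rng.permutation(len(y)), k):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        mu0 = X[train & (y == 0)].mean(axis=0)
        mu1 = X[train & (y == 1)].mean(axis=0)
        w = mu1 - mu0
        b = -w @ (mu0 + mu1) / 2
        scores[test_idx] = 1.0 / (1.0 + np.exp(-(X[test_idx] @ w + b)))
    return scores

def sensitivity_specificity(y, scores, thresholds):
    """Sensitivity and specificity at each threshold (arrays of length T)."""
    predicted_pos = scores[None, :] >= thresholds[:, None]   # shape (T, n)
    pos, neg = (y == 1), (y == 0)
    sens = (predicted_pos & pos).sum(axis=1) / pos.sum()
    spec = (~predicted_pos & neg).sum(axis=1) / neg.sum()
    return sens, spec

rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((100, 2)),
               rng.standard_normal((100, 2)) + [1.5, 0.0]])
y = np.repeat([0, 1], 100)
thresholds = np.linspace(0.05, 0.95, 19)

sens_runs, spec_runs = [], []
for _ in range(25):                      # 25 iterations of 8-fold CV
    sens, spec = sensitivity_specificity(
        y, out_of_fold_scores(X, y, 8, rng), thresholds)
    sens_runs.append(sens)
    spec_runs.append(spec)

# Quartiles across iterations, separately for each threshold.
sens_q25, sens_med, sens_q75 = np.percentile(np.array(sens_runs), [25, 50, 75], axis=0)
spec_q25, spec_med, spec_q75 = np.percentile(np.array(spec_runs), [25, 50, 75], axis=0)

for t, lo, med, hi in zip(thresholds[::6], sens_q25[::6], sens_med[::6], sens_q75[::6]):
    print("threshold", round(t, 2), "sensitivity IQR:", lo, "to", hi, "median:", med)
```

Plotting the median curve with the region between the quartile curves shaded gives the band; a wide band at some threshold indicates that the models are unstable there.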
