Meaning of `envelope` in the context of time series

While reading the TIMESAT software manual
http://web.nateko.lu.se/timesat/docs/TIMESAT33_SoftwareManual.pdf (page 50, section 9.4 TSM_GUI),
I came across this sentence:
"The fits are affected by a number of options for detecting spikes, adapting to the upper envelope..."
What is the meaning of "envelope" here?

In physics and engineering, the envelope of an oscillating signal is a
smooth curve outlining its extremes. The envelope thus generalizes
the concept of a constant amplitude. The
envelope function may be a function of time, space, angle, or indeed
of any variable.
[Image: time series without envelope]
[Image: time series with upper and lower envelope]
Note: Images and more information can also be found here
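To make the definition concrete, here is a minimal sketch of one simple way to approximate an upper envelope: take the local maxima of the series and connect them with straight lines. This is only an illustration under that assumption, not the method TIMESAT uses, and the function name `upperEnvelope` is made up for this example. Linear interpolation between maxima is crude; a smoother interpolant is often preferred in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Approximate upper envelope: local maxima (plus both endpoints) joined by
// straight lines. Illustrative only; not TIMESAT's algorithm.
std::vector<double> upperEnvelope(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const std::size_t n = x.size();

    // Knots are the first sample, every local maximum, and the last sample.
    std::vector<std::size_t> knots{0};
    for (std::size_t i = 1; i + 1 < n; ++i) {
        if (x[i] >= x[i - 1] && x[i] > x[i + 1]) knots.push_back(i);
    }
    knots.push_back(n - 1);

    // Linear interpolation between consecutive knots.
    std::vector<double> env(n);
    for (std::size_t k = 0; k + 1 < knots.size(); ++k) {
        const std::size_t a = knots[k], b = knots[k + 1];
        for (std::size_t i = a; i <= b; ++i) {
            const double t = static_cast<double>(i - a) / static_cast<double>(b - a);
            env[i] = x[a] + t * (x[b] - x[a]);
        }
    }
    return env;
}
```

A lower envelope can be obtained the same way from the local minima, for example by negating the series, calling the function, and negating the result.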

Related

Shortest way to find the focused point in an auto focus algorithm

I'm writing an algorithm for auto focus. For that I'm using a stepper motor which has 3318 steps for focus.
To find the focus, I take statistics from every camera frame and compute a single numeric value, the focus value (fv). The motor step with the highest fv is where the image is sharpest.
Right now I traverse all the steps to find the maximum fv. It works, but it takes too long: about 15 seconds.
Is there an algorithm I can use to reduce the number of steps and the time needed to find the focus point?
If you assume there is:
A single global maximum of the sharpness score;
No other local maxima
then your focus function is unimodal and should be relatively smooth.
In this case, you can do a search that is faster than linear.
Basically, start somewhere and keep moving uphill toward a higher score (equivalently, roll downhill on the negated score).
You can use, e.g., the golden-section search, or, by calculating the local change (derivatives), Newton's method (rolling downhill) or the conjugate gradient method (jumping downhill).
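As a rough illustration of the golden-section idea, here is a minimal sketch that maximizes a unimodal function on an interval. The helper `fakeFocusValue` is a hypothetical stand-in for "move the motor to a step, grab a frame, compute fv"; in a real system the position would be rounded to an integer motor step before measuring.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Golden-section search for the maximum of a unimodal function on [lo, hi].
// Each loop iteration costs exactly one new function evaluation.
double goldenSectionMax(const std::function<double(double)>& f,
                        double lo, double hi, double tol = 1.0) {
    assert(lo < hi && tol > 0.0);
    const double invPhi = (std::sqrt(5.0) - 1.0) / 2.0;  // about 0.618
    double c = hi - invPhi * (hi - lo);
    double d = lo + invPhi * (hi - lo);
    double fc = f(c), fd = f(d);
    while (hi - lo > tol) {
        if (fc > fd) {   // the maximum lies in [lo, d]
            hi = d;
            d = c;  fd = fc;                 // reuse the old inner point
            c = hi - invPhi * (hi - lo);
            fc = f(c);
        } else {         // the maximum lies in [c, hi]
            lo = c;
            c = d;  fc = fd;                 // reuse the old inner point
            d = lo + invPhi * (hi - lo);
            fd = f(d);
        }
    }
    return (lo + hi) / 2.0;
}

// Hypothetical focus curve with its peak at step 1234 (assumption for the demo).
double fakeFocusValue(double step) {
    const double best = 1234.0;
    return 1000.0 - (step - best) * (step - best) / 1000.0;
}
```

Because the bracket shrinks by a factor of about 0.618 per evaluation, a range of 3318 steps should narrow down to a single step in roughly 20 measurements, provided the unimodality assumption holds.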
First, find out what exactly your bottleneck is:
time to take a frame
time to move the stepper motor to a specific position
time to process the frame and get a focus function value
Then learn something about the functional dependence of your focus function on the focus position in general (for your samples).
Is it smooth or bumpy (noisy)?
Is it wide (very flat maximum) or is it narrow (very steep, small maximum)?
Is it approximately quadratic?
Most probably there is not much noise, the maximum is rather wide and approximately quadratic.
Then Newton's method or the Levenberg-Marquardt fitting algorithm would converge in a few iterations.
However, like the golden-section search mentioned in the answer by Adi Shavit, they only find local optima.
When noise is a problem, I recommend a robust zoom-in approach:
Measure 10 frames over the whole range (332 steps apart)
Smooth the resulting 10 values slightly if there is noise present
Take the position of the best frame
Measure 20 frames over a range of [-330,330] steps around this best frame with a step size of 33 steps per frame
Smooth the resulting 20 values slightly if there is noise present
Take the position of the best frame
Measure 10 frames over a range of [-15, 15] steps around this best frame with a step size of 3 steps per frame
Smooth the resulting 10 values slightly if there is noise present
Take the best frame and measure one frame above and below
Take the best frame, it's the focus position
This needs 10+20+10+2 = 42 frames and may therefore give a roughly 80-fold speedup over taking 3318 frames (about 0.2 s instead of 15 s), provided that capturing the frames, not moving the stepper motor, is the bottleneck.
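The zoom-in procedure above can be sketched as follows. This is a simplified version under stated assumptions: the optional smoothing passes are left out, both interval endpoints are sampled, and the last pass re-measures the centre frame, so the frame count comes out a little higher than in the description. `syntheticFocus` is a hypothetical noise-free focus curve used only for demonstration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Sample f at lo, lo+stride, ... (<= hi) and return the best position.
// `evals` counts how many frames had to be measured.
int bestInRange(const std::function<double(int)>& f,
                int lo, int hi, int stride, int& evals) {
    assert(stride > 0 && lo <= hi);
    int best = lo;
    double bestVal = f(lo);
    ++evals;
    for (int s = lo + stride; s <= hi; s += stride) {
        const double v = f(s);
        ++evals;
        if (v > bestVal) { bestVal = v; best = s; }
    }
    return best;
}

// Coarse-to-fine search with the strides from the text: 332, 33, 3, 1.
int coarseToFineFocus(const std::function<double(int)>& f, int maxStep, int& evals) {
    evals = 0;
    int best = bestInRange(f, 0, maxStep, 332, evals);
    best = bestInRange(f, std::max(0, best - 330), std::min(maxStep, best + 330), 33, evals);
    best = bestInRange(f, std::max(0, best - 15), std::min(maxStep, best + 15), 3, evals);
    best = bestInRange(f, std::max(0, best - 1), std::min(maxStep, best + 1), 1, evals);
    return best;
}

// Hypothetical focus curve with its peak at step 1234 (assumption for the demo).
double syntheticFocus(int step) {
    const double d = static_cast<double>(step - 1234);
    return 1000.0 - d * d / 1000.0;
}
```

With noisy data, each pass would smooth its samples before picking the best one, as the list suggests.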

How to get a pixel movement when using "optical flow"

I'm doing 2D image processing and I have a quick question:
Does optical flow itself detect the movement of a given pixel, or does it only work with that information (i.e., do you need an additional method to obtain it)?
From what I've seen, I assume you need to provide the movement in x and y yourself. On the other hand, optical flow relies on the assumption of constant pixel intensity from one image to the next, which I guess would be unnecessary if you already had the movement information.
Does anyone have a hint? All the tutorials, literature, and lectures I've seen skip this important step.
Optical flow calculates this movement for you. You specify the pixel coordinates in the first frame, plus some parameters such as the size of the search region, and it calculates the movement. The problem is that the result is not always correct, and in some cases it cannot be computed at all, namely when the pixel is not really distinguishable from its surroundings.
In OpenCV, the function goodFeaturesToTrack usually precedes optical flow, as it detects pixels that have higher likelihood of being processed correctly. Even then, you still need to do some extra processing to verify that the movement was correct.

Unstable homography estimation using ORB

I am developing a feature tracking application and so far, after trying almost all the feature detectors/descriptors, I've got the most satisfactory overall results with ORB.
Both my feature detector and descriptor are ORB.
I select a specific area of my source image for detecting features (by masking) and then match it against features detected in subsequent frames.
Then I filter my matches by performing a ratio test on the 'matches' obtained from the following code:
std::vector<std::vector<DMatch>> matches1;
m_matcher.knnMatch( m_descriptorsSrcScene, m_descriptorsCurScene, matches1,2 );
I also tried the two-way ratio test (filtering matches from source to current scene and vice versa, then keeping the common matches), but it didn't help much, so I went ahead with the one-way ratio test.
I also add a minimum-distance check to my ratio test, which appears to give better results:
if (distanceRatio < m_fThreshRatio && bestMatch.distance < 5*min_dist)
{
refinedMatches.push_back(bestMatch);
}
Finally, I estimate the homography:
Mat H = findHomography(points1,points2);
I've tried using the RANSAC method to estimate inliers and then using those to recalculate my homography, but that gives more instability and takes more time.
Then I draw a rectangle around the region to be tracked. I get the plane coordinates with:
perspectiveTransform( obj_corners, scene_corners, H);
where 'obj_corners' are the coordinates of my masked (or unmasked) region.
The rectangle I draw using 'scene_corners' seems to vibrate. Increasing the number of features has reduced this quite a bit, but I can't increase them too much because of the time constraint.
How can I improve the stability?
Any suggestions would be appreciated.
Thanks.
If it is the vibrations that are really bothersome to you then you could try taking the moving average of the homography matrices over time:
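cv::Mat homoG = cv::findHomography(obj, scene, CV_RANSAC);
if (!homoG.empty()) {                  // estimation can fail and return an empty matrix
    if (homography.empty()) {
        homoG.copyTo(homography);      // first frame: initialise the running average
    }
    cv::accumulateWeighted(homoG, homography, 0.1);
}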
Make the 'homography' variable global (or otherwise persistent across frames), and call this every time you get a new frame.
accumulateWeighted computes homography = (1 - alpha) * homography + alpha * homoG, i.e. an exponential moving average, and alpha is roughly the reciprocal of the averaging period.
So 0.1 corresponds to averaging over roughly the last 10 frames, 0.2 over roughly the last 5, and so on.
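To show what the weighted accumulation does numerically, here is a small OpenCV-free sketch of the same update rule applied to a 3x3 matrix stored as a flat array. The names `Mat3` and `accumulateWeighted3x3` are invented for this illustration; only the formula mirrors `cv::accumulateWeighted`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat3 = std::array<double, 9>;  // row-major 3x3 matrix (illustrative type)

// Same update rule as cv::accumulateWeighted:
//   acc = (1 - alpha) * acc + alpha * sample
void accumulateWeighted3x3(const Mat3& sample, Mat3& acc, double alpha) {
    assert(alpha > 0.0 && alpha <= 1.0);
    for (std::size_t i = 0; i < acc.size(); ++i) {
        acc[i] = (1.0 - alpha) * acc[i] + alpha * sample[i];
    }
}
```

Note that this is an exponential average, not a true sliding window: after a sudden change, 10 frames with alpha = 0.1 move the accumulator only about 65% of the way to the new value (1 - 0.9^10). A smaller alpha gives a steadier rectangle but more lag behind real motion.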
A suggestion that comes to mind from experience with feature detection/matching is that sometimes you just have to accept the matched feature points will not work perfectly. Even subtle changes in the scene you are looking at can cause somewhat annoying problems, for example changes in light or unwanted objects coming into view.
It appears to me, from what you say, that you have decently working feature matching in place, so you may want to work on a way of keeping the region of interest constant. If you know the typical speed or other movement patterns of the object you are tracking between frames, or any constraints on the position of your camera, that may help you avoid unnecessarily recalculating the region of interest, which causes the vibrations. It may also help in creating a more efficient search algorithm, allowing you to increase the number of feature points you can detect and use.
Another (small) hack you can use is to avoid redrawing the region window if the previous window was of similar size and position.

Block based Motion Estimation in Video Compression

As we know, almost all video encoders use some form of temporal coding. They use block-based (rectangular area) motion estimation to find the best match for a block of pixels of the current frame in reference/previous frames. This gives the motion vector. This is fine if the motion is translational (i.e. the block moves left/right or up/down). But what if the object rotates? If a rectangular object rotates, motion estimation would not be as accurate and hence would not yield the smallest residual (original minus prediction).
So what methods does a video encoder adopt to deal with such rotational movements?
Does it handle such a situation by coding the block as an intra block (coded as is, without reference to any previous frame) within the P-frame,
or
are there other tricks to deal with it while still coding it as a P macroblock?
As far as I know, video encoders don't have any special case for rotational movements. First, detection of rotational motion itself would consume a lot of time. Also, motion estimation is done at the macroblock level and therefore, there might be quite a few macroblocks in the frame that are not moving in a rotational manner, unless the whole frame itself is somehow rotating.
One "trick" that I can suggest is the following:
Calculate the PSNR between the predicted frame (P-frame) and the actual frame. If the PSNR is too low, it makes more sense to encode the frame as an intra-coded frame (I-frame). Note that this cannot be done for live transmissions because it would be time consuming. But it can be done when encoding time is not a factor. In that case you could simply use a full search.
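A minimal sketch of that PSNR check for 8-bit samples might look like this. The 30 dB threshold and the function name `shouldCodeAsIntra` are assumptions chosen for illustration; a real encoder would tune the threshold and would normally decide based on rate-distortion cost instead of PSNR alone.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// PSNR in dB between two 8-bit images (or blocks) of equal size.
double psnr(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size() && !a.empty());
    double sse = 0.0;  // sum of squared errors
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sse += d * d;
    }
    if (sse == 0.0) return std::numeric_limits<double>::infinity();  // identical
    const double mse = sse / static_cast<double>(a.size());
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}

// Hypothetical decision rule: fall back to intra coding when the prediction
// is poor. The default threshold is an arbitrary example value.
bool shouldCodeAsIntra(const std::vector<std::uint8_t>& predicted,
                       const std::vector<std::uint8_t>& actual,
                       double thresholdDb = 30.0) {
    return psnr(predicted, actual) < thresholdDb;
}
```

The same function can be applied per macroblock instead of per frame, which matches the block-level decision asked about in the question.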
The point of motion estimation is that it is a computationally cheap way of reducing the size of 'typical' videos.
If you were to use motion based coding on something like a video of a waterfall it would fail to reduce the size.
A similar concept applies to JPEG photos. The JPEG compression only works because it takes advantage of the particular sensitivity of the human eye.
Ultimately, data is data and you cannot, in general, losslessly reduce the amount of it. The best you can do is to make some guesses about the source and destination and then try to recreate something that will be indistinguishable to the viewer, but which uses less data. That is why motion estimation WORKS. 99.99 percent of movies that humans watch have humans in them, moving around like humans do... left and right... up and down. And by WORKS, I mean it can be done quickly enough to make it worthwhile for the millions of hours of footage produced every year.
This, of course has something to do with Shannon entropy http://en.wikipedia.org/wiki/Entropy_(information_theory) , but that article makes my brain start to seep out through my eye sockets a bit...
The first issue is computational complexity, which increases dramatically with every rotation angle that is added to the search. Suppose motion estimation takes x seconds. Adding, say, a 90-degree clockwise rotation costs another x seconds, since the encoder has to search the same reference-frame window again with the rotated block. Adding a 90-degree counter-clockwise rotation costs yet another x seconds, and so on. The main problem is that motion estimation is typically already the block that consumes the major part of the encoding time.
The second issue is the complexity of the motion compensation unit. If blocks may be rotated during estimation or prediction, the same transformation must be applied when generating the compensated frame, in both the encoder and the decoder. Worst of all, this adds considerable complexity on the decoder side as well.
The third issue is the prediction unit and its support for variable block sizes. The standard always defines motion vectors for fixed block sizes. If rotated blocks are proposed, the allowed rotation directions also need to be standardized for the decoder, which affects the motion compensation unit, the entropy encoder/decoder, etc.
The fourth issue is motion vector coding. Rotational motion vectors need extra bits per motion vector. So put the two on a balance: "adding extra bits for each MV" versus "improving compression efficiency using rotational MVs". If the scale is balanced, or if the extra bits weigh more, there is no point in using rotational MVs.
The fifth issue concerns the structure of the encoder itself. The encoder is analogous to an adaptive differential pulse-code modulator or any similar predictive coder: the video signal is always encoded differentially. When a signal is coded differentially, the time difference between the previous and the current sample is very small (here 1/frame-rate), so individual blocks appear to move almost purely translationally. Rotational MVs would therefore only come into play with multiple reference frames, when the reference frame is temporally far away from the current one (much more than one frame interval, or even more than the GOP size). Even in that case, rotational MVs could give only a very slight improvement in PSNR and would increase motion estimation time dramatically.
Another thing is about the subjective and statistical study of the Motion direction.
Despite all this, there were some proposals in JCT-VC for implementing this, but they were not approved for the current HEVC standard. Maybe these issues will be worked out in the future.

Algorithm for detecting peaks from recorded, noisy data. Graphs inside

So I've recorded some data from an Android GPS, and I'm trying to find the peaks of these graphs, but I haven't been able to find anything specific, perhaps because I'm not too sure what I'm looking for. I have found some MATLAB functions, but I can't find the actual algorithms behind them. I need to do this in Java, but I should be able to translate code from other languages.
As you can see, there are lots of 'mini-peaks', but I just want the main ones.
Your solution depends on what you want to do with the data. If you want to do very serious things then you should most likely use (Fast) Fourier Transforms, and extract both the phase and frequency output from it. But that's very computationally intensive and takes a long while to program. If you just want to do something simple that doesn't require a lot of computational resources, then here's a suggestion:
For that exact problem I implemented the algorithm below a few hours ago. I invented it myself, so I do not know whether it already has a name, but it works well on very noisy data.
You need to determine the average peak-to-peak distance and call that PtP. Do that measurement any way you like. Judging from the graph, in your case it appears to be about 35. In my code I have another algorithm I invented to do that automatically.
Then choose a random starting index on the graph. Poll every new data point from then on and wait until the graph has either risen or fallen from the starting level by about 70% of PtP. If it was a fall, that's a tock. If it was a rise, that's a tick. Store that level as the last tick or tock height. Produce a 'tick' or 'tock' event at this index.
Continue forward in the data. After a tick, if the data continues to rise, store the new level as the 'height-of-tick' but do not produce a new tick event. After a tock, if the data continues to fall, store the new level as the 'depth-of-tock' but do not produce a new tock event.
If the last event was a tock, wait for a tick (a rise of about 70% of PtP above the depth-of-tock); if the last event was a tick, wait for a tock (a fall of about 70% of PtP below the height-of-tick).
Each time you detect a tick, that should be a peak! Good luck.
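Here is one possible reading of the tick/tock procedure as code. It is written in C++ to match the other snippets on this page and should translate to Java almost line by line. Two choices are assumptions of this sketch and are not stated in the description: it starts at the first sample instead of a random index, and it reports the index of the highest sample of each tick phase (the stored 'height-of-tick') once the following tock confirms it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hysteresis-style peak detector following the tick/tock description.
// ptp is the typical peak-to-peak height; fraction is the 70% from the text.
std::vector<std::size_t> tickTockPeaks(const std::vector<double>& data,
                                       double ptp, double fraction = 0.7) {
    assert(ptp > 0.0 && fraction > 0.0);
    std::vector<std::size_t> peaks;
    if (data.empty()) return peaks;

    const double thr = fraction * ptp;
    enum class State { Unknown, Tick, Tock };
    State state = State::Unknown;
    double level = data[0];    // start level, then height-of-tick / depth-of-tock
    std::size_t levelIdx = 0;  // index at which `level` was recorded

    for (std::size_t i = 1; i < data.size(); ++i) {
        const double v = data[i];
        if (state == State::Unknown) {
            if (v >= level + thr)      { state = State::Tick; level = v; levelIdx = i; }
            else if (v <= level - thr) { state = State::Tock; level = v; levelIdx = i; }
        } else if (state == State::Tick) {
            if (v > level) {                 // still rising: update height-of-tick
                level = v; levelIdx = i;
            } else if (v <= level - thr) {   // fell far enough: tock event
                peaks.push_back(levelIdx);   // the tick phase's maximum is a peak
                state = State::Tock; level = v; levelIdx = i;
            }
        } else {  // State::Tock
            if (v < level) {                 // still falling: update depth-of-tock
                level = v; levelIdx = i;
            } else if (v >= level + thr) {   // rose far enough: tick event
                state = State::Tick; level = v; levelIdx = i;
            }
        }
    }
    if (state == State::Tick) peaks.push_back(levelIdx);  // trailing, unconfirmed peak
    return peaks;
}
```

Small wiggles never move the signal by 70% of PtP, so they cannot trigger an event, which is why the 'mini-peaks' are ignored.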
I think what you want to do is run this through some sort of low-pass filter. Depending on exactly what you want to get out of this dataset, a simple "box car" filter might be sufficient: at each point, take the average of the N samples centered on that point and use it as the filtered value. The larger N is, the more aggressively smoothed the filtered data will be.
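A box car filter of this kind can be sketched in a few lines. The handling of the edges (shrinking the window where fewer than N samples are available) is an assumption of this example; other conventions such as padding are equally common.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Centered moving average ("box car") with an odd window length n.
// Near the edges the window shrinks to the samples that exist.
std::vector<double> boxcar(const std::vector<double>& x, std::size_t n) {
    assert(n % 2 == 1);
    const std::size_t half = n / 2;
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::size_t lo = (i >= half) ? i - half : 0;
        const std::size_t hi = std::min(x.size() - 1, i + half);
        double sum = 0.0;
        for (std::size_t j = lo; j <= hi; ++j) sum += x[j];
        y[i] = sum / static_cast<double>(hi - lo + 1);
    }
    return y;
}
```

After filtering, the main peaks can be found as simple local maxima of the smoothed series, since the 'mini-peaks' are largely averaged away when N is comparable to their width.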
I guess you have lots of points... Calculate their mean value, subtract it from every point's value, and then take the most extreme value (positive or negative) from each run of consecutive points that have the same sign. I hope that's clear...
With particularly nasty and noisy data I usually use smoothing. The easiest example of smoothing is a moving average. You can then find the peaks of that moving average, go back to your original data, and take the peak closest to each one you found on the moving average.
I've done some looking into peak detection and I can tell you that if your data doesn't behave, it could mess up your algorithm. Off the top of my head, you could try this: pick a threshold, e.g. threshold = 250. Wherever the data is above the threshold, find the maximum of that stretch. This assumes that your data has a mean of about 230. Not sure how fancy you want to get. Hope that helps.

Resources