I am working with Google TensorBoard, and I'm feeling confused about the meaning of the histogram plot. I read the tutorial, but it still seems unclear to me. I would really appreciate it if anyone could help me figure out what each axis of the TensorBoard histogram plot means.
Sample histogram from TensorBoard
I came across this question earlier, while also seeking information on how to interpret the histogram plots in TensorBoard. For me, the answer came from experiments of plotting known distributions.
So, the conventional normal distribution with mean = 0 and sigma = 1 can be produced in TensorFlow with the following code:
import tensorflow as tf

cwd = "test_logs"

# Two weight tensors drawn from normal distributions with different spreads
W1 = tf.Variable(tf.random_normal([200, 10], stddev=1.0))
W2 = tf.Variable(tf.random_normal([200, 10], stddev=0.13))

w1_hist = tf.summary.histogram("weights-stdev_1.0", W1)
w2_hist = tf.summary.histogram("weights-stdev_0.13", W2)

summary_op = tf.summary.merge_all()

init = tf.global_variables_initializer()
sess = tf.Session()
writer = tf.summary.FileWriter(cwd, sess.graph)

sess.run(init)
for i in range(2):
    writer.add_summary(sess.run(summary_op), i)

writer.flush()
writer.close()
sess.close()
Here is what the result looks like:
Resulting histogram plots for the two distributions (stddev 1.0 and 0.13)
The horizontal axis represents time steps.
The plot is a contour plot and has contour lines at the vertical axis values of -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, and 1.5.
Since the plot represents a normal distribution with mean = 0 and sigma = 1 (and remember that sigma means standard deviation), the contour line at 0 represents the mean value of the samples.
The area between the contour lines at -0.5 and +0.5 represents the area under a normal distribution curve captured within +/- 0.5 standard deviations from the mean, i.e. about 38.3% of the samples.
The area between the contour lines at -1.0 and +1.0 represents the area under a normal distribution curve captured within +/- 1.0 standard deviations from the mean, i.e. about 68.3% of the samples.
The area between the contour lines at -1.5 and +1.5 represents the area under a normal distribution curve captured within +/- 1.5 standard deviations from the mean, i.e. about 86.6% of the samples.
The palest region extends a little beyond +/- 4.0 standard deviations from the mean, and only about 60 per 1,000,000 samples will be outside of this range.
While Wikipedia has a very thorough explanation, you can get the most relevant nuggets here.
Actual histogram plots will show several things. The plot regions will grow and shrink in vertical width as the variation of the monitored values increases or decreases. The plots may also shift up or down as the mean of the monitored values increases or decreases.
(You may have noted that the code actually produces a second histogram with a standard deviation of 0.13. I did this to clear up any confusion between the plot contour lines and the vertical axis tick marks.)
@marc_alain, you're a star for making such a simple script for TB; those are hard to find.
To add to what he said: the histograms show the 1, 2, and 3 sigma bands of the distribution of weights, which cover roughly 68%, 95%, and 99.7% of the values. So if your model has 784 weights, the histogram shows how the values of those weights change with training.
These histograms are probably not that interesting for shallow models, but you could imagine that with deep networks, weights in the higher layers might take a while to grow because of the logistic function being saturated. Of course I'm just mindlessly parroting this paper by Glorot and Bengio, in which they study the weight distributions through training and show how the logistic function is saturated in the higher layers for quite a while.
When plotting histograms, we put the bin limits on the x-axis and the count on the y-axis. However, the whole point of this histogram view is to show how a tensor changes over time. Hence, as you may have already guessed, the depth axis (z-axis), containing the numbers 100 and 300, shows the epoch numbers.
The default histogram mode is Offset mode. Here the histogram for each epoch is offset along the z-axis by a certain amount (so that all epochs fit in the graph). This is like seeing all the histograms placed one after the other, viewed from one corner of the ceiling of the room (from the midpoint of the front ceiling edge, to be precise).
In the Overlay mode, the z-axis is collapsed and the histograms become transparent, so you can move and hover over them to highlight the one corresponding to a particular epoch. This is more like the front view of the Offset mode, showing only the outlines of the histograms.
As explained in the documentation here:
tf.summary.histogram
takes an arbitrarily sized and shaped Tensor, and compresses it into a
histogram data structure consisting of many bins with widths and
counts. For example, let's say we want to organize the numbers [0.5,
1.1, 1.3, 2.2, 2.9, 2.99] into bins. We could make three bins:
a bin containing everything from 0 to 1 (it would contain one element, 0.5),
a bin containing everything from 1-2 (it would contain two elements, 1.1 and 1.3),
a bin containing everything from 2-3 (it would contain three elements: 2.2, 2.9 and 2.99).
TensorFlow uses a similar approach to create bins, but unlike in our
example, it doesn't create integer bins. For large, sparse datasets,
that might result in many thousands of bins. Instead, the bins are
exponentially distributed, with many bins close to 0 and comparatively
few bins for very large numbers. However, visualizing
exponentially-distributed bins is tricky; if height is used to encode
count, then wider bins take more space, even if they have the same
number of elements. Conversely, encoding count in the area makes
height comparisons impossible. Instead, the histograms resample the
data into uniform bins. This can lead to unfortunate artifacts in
some cases.
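As a quick sanity check of the three-bin example in the quote, here is a minimal sketch assuming NumPy is available (TensorFlow itself uses exponentially distributed bins, as described above, so this only illustrates the idea):

import numpy as np

values = np.array([0.5, 1.1, 1.3, 2.2, 2.9, 2.99])

# Three integer bins: [0, 1), [1, 2), [2, 3]
counts, edges = np.histogram(values, bins=[0, 1, 2, 3])

print(counts)  # [1 2 3] -> one, two, and three elements per bin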
Please read the documentation further to get a full picture of the plots displayed in the Histograms tab.
Roufan,
The histogram plot allows you to plot variables from your graph.
w1 = tf.Variable(tf.zeros([1]), name="a", trainable=True)
tf.summary.histogram("firstLayerWeight", w1)
For the example above the vertical axis would have the units of my w1 variable. The horizontal axis would have units of the step which I think is captured here:
summary_str = sess.run(summary_op, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
It may be useful to see this on how to make summaries for TensorBoard.
Don
Each line on the chart represents a percentile in the distribution over the data: for example, the bottom line shows how the minimum value has changed over time, and the line in the middle shows how the median has changed. Reading from top to bottom, the lines have the following meaning: [maximum, 93%, 84%, 69%, 50%, 31%, 16%, 7%, minimum]
These percentiles can also be viewed as standard deviation boundaries on a normal distribution: [maximum, μ+1.5σ, μ+σ, μ+0.5σ, μ, μ-0.5σ, μ-σ, μ-1.5σ, minimum] so that the colored regions, read from inside to outside, have widths [σ, 2σ, 3σ] respectively.
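As a quick check of that correspondence, here is a sketch assuming SciPy is available: evaluating the standard normal CDF at those sigma offsets reproduces the percentile labels above.

from scipy.stats import norm

# Percentile reached at mu + k*sigma for a normal distribution
for k in (1.5, 1.0, 0.5, 0.0, -0.5, -1.0, -1.5):
    print(f"mu {k:+.1f} sigma -> percentile {norm.cdf(k) * 100:.0f}")

# Prints (rounded): 93, 84, 69, 50, 31, 16, 7 -- matching the lines listed above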
I want to determine image sharpness by the amount of high frequencies within the image. As far as I understand, the dft() function from OpenCV returns two matrices, holding the real and imaginary parts of the result.
This is where I am stuck. How can I determine the amount of high frequencies from this data?
I am thankful for every hint/link which could provide me with a better understanding.
Greetings
Make FT
Calculate magnitude of result
Now you have a 2D matrix. Consider the upper-left quadrant (the others are mirrors for a real source).
Here the Magn[0][0] entry corresponds to zero frequency, and the Magn[(n-1)/2][(n-1)/2] entry corresponds to the highest frequency.
The upper-left part of this submatrix contains the low-frequency samples, so you can calculate the sum of the values in this part and in the remaining part, and compare the two sums. For example (pseudocode):
cvIntegral(Magn, Rect(0..n/4, 0..n/4)) compare with
cvIntegral(Magn, Rect(0..n/2, 0..n/2)) - cvIntegral(Magn, Rect(0..n/4, 0..n/4))
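Here is a rough Python sketch of that comparison with OpenCV (cv2) and NumPy; the file name is only a placeholder, and the n/4 split is the same arbitrary low/high boundary as in the pseudocode above:

import cv2
import numpy as np

def sharpness_ratio(gray):
    # DFT of the grayscale image, then the magnitude of the complex result
    dft = cv2.dft(np.float32(gray), flags=cv2.DFT_COMPLEX_OUTPUT)
    magnitude = cv2.magnitude(dft[:, :, 0], dft[:, :, 1])

    # The upper-left quadrant holds the non-mirrored half for a real input
    h, w = magnitude.shape
    quad = magnitude[:h // 2, :w // 2]

    # Low-frequency block near the origin vs. the rest of the quadrant
    low = quad[:h // 4, :w // 4].sum()
    high = quad.sum() - low

    # A larger ratio means relatively more high-frequency content, i.e. a sharper image
    return high / low

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
print(sharpness_ratio(img))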
I was going through this book to understand wavelets. It's a beautifully written, not very technical document.
web.iitd.ac.in/~sumeet/WaveletTutorial.pdf
But in its very first chapter it describes the figure below with this explanation:
The frequency is measured in cycles/second, or with a more common
name, in "Hertz". For example the electric power we use in our daily
life in the US is 60 Hz (50 Hz elsewhere in the world). This means
that if you try to plot the electric current, it will be a sine wave
passing through the same point 50 times in 1 second. Now, look at the
following figures. The first one is a sine wave at 3 Hz, the second
one at 10 Hz, and the third one at 50 Hz. Compare them
But I am unable to understand what the X and Y axis values represent. The X values range between [-1, 1], so I am assuming they represent the value of the signal, while the Y axis represents the time in milliseconds (1000 ms = 1 sec). But then the document goes on to state the representation of the same signal in the frequency-amplitude domain:
So how do we measure frequency, or how do we find the frequency
content of a signal? The answer is FOURIER TRANSFORM (FT). If the FT
of a signal in time domain is taken, the frequency-amplitude
representation of that signal is obtained. In other words, we now have
a plot with one axis being the frequency and the other being the
amplitude. This plot tells us how much of each frequency exists in our
signal.
But I am not able to understand what the X and Y axis values in the upper graph represent. Shouldn't it be frequency (X axis) and amplitude (Y axis)? And if that is correct, why does the Y axis have values like 0, 200, and 400 instead of being in the range [-1, 1], or rather [0, 1]?
For the time domain signals the X axis is time and the Y axis is amplitude.
For the frequency domain equivalents the X axis is frequency and the Y axis is magnitude.
Note that when using most FFTs there is a scaling factor of N, where N is the number of points, so the magnitude values in the frequency domain plots are much greater than the amplitude of the original time domain signal.
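A small NumPy sketch of that scaling, assuming a unit-amplitude 50 Hz sine sampled with N = 1000 points over one second:

import numpy as np

N = 1000                         # number of samples over 1 second
t = np.arange(N) / N             # time axis, 1 ms steps
x = np.sin(2 * np.pi * 50 * t)   # 50 Hz sine with amplitude 1

spectrum = np.abs(np.fft.fft(x))
print(spectrum[50])              # ~500, i.e. N/2, not the time-domain amplitude of 1
print(spectrum[50] / (N / 2))    # ~1.0 after rescaling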
As Paul R wrote above, in the first image the horizontal X-axis represents time with the units ms.
The time interval has the length 1000ms.
The vertical Y-axis represents the amplitude of the signal. However, in the diagram the unit is not volts; instead the signal is normalized to an amplitude of 1.
If you perform a Fourier Transformation on that time signal, you will get a frequency spectrum.
If you use a DFT (Discrete Fourier Transform) or an FFT (Fast Fourier Transform), the result depends on the implementation of the algorithm.
a) If the algorithm delivers a normalized result, the amplitude of your frequency line is 0.5 (if the amplitude of your input signal is 1).
b) If the algorithm delivers a non-normalized result, the amplitude of your frequency line is half the number of DFT/FFT input values.
Your frequency line has the value of 500, which means the algorithm does not use normalization and the number of input samples was 1000.
Now, what is represented by the horizontal X-axes in the frequency domain?
In the time domain, the length of your time input interval is T = 1000ms = 1s.
Therefore the distance between the frequency lines in the frequency domain is df = 1/T = 1/1s = 1Hz.
As we know from the amplitude in the frequency domain, the input signal in time domain had 1000 samples. This means the sampling time was dt = T/1000 = 1s/1000 = 1ms.
Therefore the total frequency interval F = (fmin, ..., fmax) in frequency domain is 1/dt = 1/1ms = 1kHz.
However, the range does NOT start at fmin = 0Hz and end at 1kHz, as one could assume from inspecting the upper diagram in the second image. The spectrum calculated by a DFT/FFT contains a positive and a negative frequency range. This means you get the frequency range (-500Hz, -499Hz, -498Hz, ..., -1Hz, 0Hz, 1Hz, 2Hz, ..., 498Hz, 499Hz). The value 500Hz does not exist!
However, for the user's convenience the spectrum is not output in this order, but it is shifted by 500Hz (F/2). This means the spectrum starts with the DC value:
0Hz, 1Hz, 2Hz, ..., 498Hz, 499Hz, -500Hz, -499Hz, -498Hz, ..., -2Hz, -1Hz.
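As a sketch, NumPy reproduces exactly this ordering (and the centred one) for 1000 samples taken with a 1 ms sampling time:

import numpy as np

N, dt = 1000, 1e-3                # 1000 samples, 1 ms sampling time
freqs = np.fft.fftfreq(N, d=dt)   # frequencies in DFT output order, DC first
print(freqs[:3], freqs[-3:])      # [0. 1. 2.] ... [-3. -2. -1.] (Hz)

shifted = np.fft.fftshift(freqs)  # centred order for plotting
print(shifted[0], shifted[-1])    # -500.0 ... 499.0 Hz; +500 Hz does not exist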
Because the spectrum of a real input function is Hermitian (Y(f) == Y(-f)*), the positive band carries the complete information, so you can cut off the negative sideband.
The upper diagram in the second image shows two peaks. The first peak appears at f = 50Hz and the second peak is shown at f=950Hz. However, this is not correct. The labels of the horizontal axes are wrong. The second peak appears at f = -50Hz.
In the lower diagram the frequency range ends at 500Hz (499Hz would be correct). The range of the negative frequencies is cut off.
I'm a newbie in cepstrum analysis. So here's the question.
I have a signal of length 4096 and a sample rate of 8000 Hz. I take the FFT and get an array of length 4096*2 (position 2*i holds the cosine coefficient, position 2*i+1 the sine coefficient). The frequency step is sampleRate/signalLength == 8000/4096. So, I can calculate the frequency at position i as i*sampleRate/signalLength.
Then I apply the cepstrum transformation. I can't understand how to find the quefrency step, or how to find the frequency for a given quefrency.
The bin number of an FFT result is inversely proportional to the length of the period of a sinusoidal component in the time domain. The bin number of a quefrency result is likewise inversely proportional to the distance between partials in a series of overtones in the frequency domain (this distance is often the same as the root or fundamental pitch). Thus the quefrency bin number is proportional to the period, or repeat lag (autocorrelation peak), of a harmonically rich periodic signal in the time domain.
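To make that mapping concrete, here is a NumPy sketch for the numbers in the question (4096 samples at 8000 Hz); the test signal is only illustrative. The quefrency axis has the same step as the time axis, 1/sampleRate seconds per bin, so a cepstral peak at bin q corresponds to a repetition lag of q/sampleRate seconds, i.e. a fundamental of sampleRate/q Hz.

import numpy as np

fs = 8000                          # sample rate, Hz
N = 4096                           # signal length in samples

# Illustrative test signal: a 200 Hz fundamental plus a few harmonics
t = np.arange(N) / fs
x = sum(np.sin(2 * np.pi * 200 * k * t) for k in range(1, 6))

# Real cepstrum: inverse FFT of the log magnitude spectrum
cepstrum = np.fft.ifft(np.log(np.abs(np.fft.fft(x)) + 1e-12)).real

# The quefrency step equals the sampling period: 1/fs seconds per bin
quefrency = np.arange(N) / fs

# For a 200 Hz fundamental the cepstral peak is expected near bin fs/200 = 40,
# i.e. at a quefrency of 5 ms
q = 40
print(quefrency[q], "s ->", fs / q, "Hz")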
I'm validating an image segmentation algorithm applied to 2D images. The algorithm generates a contour segment, i.e. a set of connected pixels that form a free curve in 2D space. The idea is to compare this set of pixels with a ground truth, in my case another contour segment manually traced by an expert. An image showing what would be a segmentation result and the corresponding manual (ground-truth) segmentation is shown below:
I'm trying to think of an adequate comparison metric to validate the segmentation results. Ideally the best metric would be the point-to-point Euclidean distance between corresponding pairs of pixels on each segment; however (as seen in the previous figure), the segments don't have the same length (i.e. they differ in total number of pixels), so pixel-to-pixel comparisons have to be discarded.
Can you suggest an adequate metric for validating my algorithm? Thanks for any suggestion!
For each pixel in the ground truth, take the distance to the nearest pixel in the segmentation result. Then take the sum of that for all ground truth pixels as the total error.
That's basically recall weighted by distance. If you start with the pixels in the result, it would resemble precision instead.
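A sketch of that error measure with SciPy, assuming both contours are given as N x 2 arrays of (row, col) pixel coordinates (the function and variable names are only illustrative):

import numpy as np
from scipy.spatial import cKDTree

def contour_error(ground_truth, result):
    # For each ground-truth pixel, distance to the nearest result pixel, summed
    tree = cKDTree(result)
    distances, _ = tree.query(ground_truth)
    return distances.sum()

# Toy contours as (row, col) pixel coordinates
gt = np.array([[0, 0], [0, 1], [0, 2], [0, 3]])
seg = np.array([[1, 0], [1, 1], [1, 2]])

print(contour_error(gt, seg))   # recall-like error; swap the arguments for a precision-like one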
If the curves are closed, you can compute the area between the curves. If you can tell which pixels belong to a segment, that is as easy as computing the XOR of the two pixel sets.
Here is an example that I've created using Matlab:
You could divide each line into n segments of equal length, then compute the Euclidean distance between each segment and its pair on the other line.
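A rough NumPy sketch of that idea, assuming each curve is given as an ordered array of (x, y) points: resample both curves at n positions equally spaced along their arc length, then average the distances between matched points.

import numpy as np

def resample(curve, n):
    # Resample an ordered (x, y) polyline at n points equally spaced by arc length
    seg = np.diff(curve, axis=0)
    arclen = np.concatenate([[0], np.cumsum(np.hypot(seg[:, 0], seg[:, 1]))])
    targets = np.linspace(0, arclen[-1], n)
    x = np.interp(targets, arclen, curve[:, 0])
    y = np.interp(targets, arclen, curve[:, 1])
    return np.column_stack([x, y])

def mean_pointwise_distance(curve_a, curve_b, n=100):
    a, b = resample(curve_a, n), resample(curve_b, n)
    return np.mean(np.hypot(*(a - b).T))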