I'm encountering problems skipping ahead to a specific frame of a melspec feature set found here. The aim of extracting features from the feature set is to analyse the difference in beats per second (BPS) so that I can match up the BPS of two tracks, in order to mix between them or warp the timing of one track to synchronise the two pieces of music. The feature set specifies the following:
Pre-extracted in the "feature" directory are space-delimited floating-point ASCII matrices:
beat_synchronus: one beat-synchronus vector per line
non-beat-synchronus: 512-sample hop frames # 22050Hz sample rate, one vector per line
I'm not quite sure how to interpret this - is the melspec beat-synchronous or non-beat-synchronous, and how does that work with regard to delimiting frames?
I've got as far as working out the frame duration thanks to this answer, but I don't know how to apply that knowledge to the task of navigating to a specific timecode or frame. The closest I've got is dividing the offset by the frame duration to work out how many frames need to be skipped to reach the offset (1 second into the track, for example, gives 2583 frames). However, the file is not demarcated into lines and, as far as I can tell, is just a continuous list of entries. This leads to the question of what the size of a given frame is (if that's the right terminology): do 2583 entries per second need to be skipped to reach the right entry, or does each frame have a specific number of entries, so that I need to skip 2583 frames of size x? And what is size x (512?)?
I've been able to open the melspec file, but there are no delimiters between entries; it is instead a continuous list of values.
The code I have so far (below) works out the duration of a frame, and therefore the number of frames to skip for a given offset into the track. However, it does not tell me the size of a given frame or how to access it in the melspec file.
spectrogram is the file path for a given feature set, and offset is the time in seconds from the start of the track.
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame.
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second).
    SHIFT_FRAMES = offset / FRAME_TIME
    # readlines of file so that offset is applied.
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line)
        feature_set.close()
This gives a list of 10 lines of results, which do not seem to be naturally delimited by line.
The sample file you are referring to is a matrix of 128 x 7392 values.
To better understand the format of this file, you may look at the extractFeatures.py script used to extract the features. You may notice that the melspec feature is described as "non-beat-synchronus" and computed using librosa.feature.melspectrogram, using mostly default arguments and producing an output S of n_mel rows by t columns.
To figure out the value of n_mel you need to look at librosa.filters.mel, which indicates a default value of 128. The number of frames t on the other hand is computed internally in librosa.util.frame as 1 + int((len(y) - frame_length) / hop_length), where the frame_length uses the default value 2048 and the hop_length uses the default value 512.
To summarize, the 128 rows correspond to the 128 MEL-frequency bins, and the 7392 columns correspond to time frames.
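Before indexing into the file, a quick sanity check may help (a minimal sketch; the file path is hypothetical and it assumes numpy's default whitespace parsing of the space-delimited matrix):
import numpy as np

data = np.loadtxt("features/melspec.txt")  # hypothetical path to the melspec matrix
print(data.shape)                          # expected (128, 7392): mel bins x time frames
print(data.shape[1] * 512 / 22050.0)       # ~171.6 s of audio covered by the hops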
You could thus use the following to extract the column of interest:
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame.
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second).
    SHIFT_FRAMES = offset / FRAME_TIME
    # readlines of file so that offset is applied.
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line.split(" ")[indices])
Using numpy you could also read the entire spectrogram and address a specific column:
import numpy as np

def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame.
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second).
    SHIFT_FRAMES = offset / FRAME_TIME
    data = np.loadtxt(spectrogram)
    column = int(SHIFT_FRAMES)
    print(data[:, column])
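Usage would then look something like this (the file path here is hypothetical); what gets printed is a 128-element column, i.e. the mel spectrum at the frame index corresponding to the requested offset:
skipToFrame("features/melspec.txt", 1.0)  # hypothetical path, 1 second into the track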
Going back to the fact that the feature extraction was done using librosa, you may also consider using librosa.core.time_to_frames instead of manually computing the frame number:
import librosa

def skipToFrame(spectrogram, offset):
    SHIFT_FRAMES = librosa.core.time_to_frames(offset, sr=22050, hop_length=512, n_fft=2048)
    ...
On a final note, you should be aware that each of these time frames uses 2048 samples, but they overlap such that each successive frame advances by 512 samples relative to the previous frame. So the frames cover the following time intervals:
frame # | start (s) | end (s)
================================
1 | 0.000 | 0.093
2 | 0.023 | 0.116
3 | 0.046 | 0.139
...
41 | 0.929 | 1.022
42 | 0.952 | 1.045
...
7392 | 171.619 | 171.712
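If it helps to see this programmatically, here is a minimal sketch (assuming the framing convention described above, i.e. 1-based frame k starts at sample (k-1)*hop_length and spans n_fft samples) that reproduces the intervals in the table:
SAMPLE_RATE = 22050
HOP_LENGTH = 512
N_FFT = 2048

def frame_interval(k):
    # start/end time in seconds of 1-based frame k under the assumed convention
    start = (k - 1) * HOP_LENGTH / SAMPLE_RATE
    return start, start + N_FFT / SAMPLE_RATE

print(frame_interval(1))     # ~(0.000, 0.093)
print(frame_interval(42))    # ~(0.952, 1.045)
print(frame_interval(7392))  # ~(171.619, 171.712)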
Is there any way to find the Jacobian of a frame with respect to the joints of a given model (as opposed to the whole plant), or alternatively to determine which columns of the full plant Jacobian correspond to a given model’s joints? I’ve found MultibodyPlant.CalcJacobian*, but I’m not sure if those are the right methods.
I also tried mapping the JointIndex of each joint in the model to a column of MultibodyPlant.CalcJacobian*, but the results didn't make sense -- the joint indices are sequential (all of one model followed by all of the other), but the Jacobian columns look interleaved (a column corresponding to one model followed by one corresponding to the other).
Assuming you are computing with respect to velocities, you'll want to use Joint.velocity_start() and Joint.num_velocities() to create a mask or set of indices. If you are in Python, then you can use NumPy's array slicing to select the desired columns of your Jacobian.
(If you compute w.r.t. position, then make sure you use Joint.position_start() and Joint.num_positions().)
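For completeness, a sketch of the position-space analog (mirroring the velocity-mask helper shown further below; this is not code from the notebook):
import numpy as np

def get_position_mask(plant, joints):
    # Same idea as get_velocity_mask below, but over the generalized positions q.
    mask = np.zeros(plant.num_positions(), dtype=bool)
    for joint in joints:
        start = joint.position_start()
        mask[start:start + joint.num_positions()] = True
    return mask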
Example notebook:
https://nbviewer.jupyter.org/github/EricCousineau-TRI/repro/blob/eb7f11d/drake_stuff/notebooks/multibody_plant_jacobian_subset.ipynb
(TODO: Point to a more official source.)
Main code to pay attention to:
import numpy as np

def get_velocity_mask(plant, joints):
    """
    Generates a mask according to the supplied set of ``joints``.
    The binary mask is unable to preserve ordering for joint indices, thus
    ``joints`` is required to be a ``set`` (for simplicity).
    """
    assert isinstance(joints, set)
    mask = np.zeros(plant.num_velocities(), dtype=bool)
    for joint in joints:
        start = joint.velocity_start()
        end = start + joint.num_velocities()
        mask[start:end] = True
    return mask

def get_velocity_indices(plant, joints):
    """
    Generates a list of indices according to the supplied list of ``joints``.
    The indices are generated according to the order of ``joints``, thus
    ``joints`` is required to be a list (for simplicity).
    """
    indices = []
    for joint in joints:
        start = joint.velocity_start()
        end = start + joint.num_velocities()
        for i in range(start, end):
            indices.append(i)
    return indices
...
# print(Jv1_WG1) # Prints 7 dof from a 14 dof plant
[[0.000 -0.707 0.354 0.707 0.612 -0.750 0.256]
[0.000 0.707 0.354 -0.707 0.612 0.250 0.963]
[1.000 -0.000 0.866 -0.000 0.500 0.612 -0.079]
[-0.471 0.394 -0.211 -0.137 -0.043 -0.049 0.000]
[0.414 0.394 0.162 -0.137 0.014 0.008 0.000]
[0.000 -0.626 0.020 0.416 0.035 -0.064 0.000]]
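As a hedged usage sketch (the names Jv_full and model_joints are hypothetical, not from the notebook), NumPy slicing then pulls out the columns belonging to your model's joints:
# Jv_full: full-plant Jacobian, e.g. from CalcJacobianSpatialVelocity,
# with one column per generalized velocity of the plant.
mask = get_velocity_mask(plant, set(model_joints))
Jv_model = Jv_full[:, mask]

# Or, if a particular joint ordering matters:
indices = get_velocity_indices(plant, list(model_joints))
Jv_model = Jv_full[:, indices]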
Given an input signal x (e.g. a voltage, sampled a thousand times per second, a couple of minutes long), I'd like to calculate e.g.
/ this is not q
y[3] = -3*x[0] - x[1] + x[2] + 3*x[3]
y[4] = -3*x[1] - x[2] + x[3] + 3*x[4]
. . .
I'm aiming for variable window length and weight coefficients. How can I do it in q? I'm aware of mavg, signal processing in q, and the moving sum q idiom.
In the DSP world this is called applying a filter kernel by convolution. The weight coefficients define the kernel, which makes a high- or low-pass filter. The example above calculates the slope of the last four points, fitting a straight line via the least-squares method.
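For intuition only, here is a numpy sketch (not q) of that last claim: for four equally spaced points the least-squares slope is exactly the dot product with -3 -1 1 3 divided by 10, so the kernel in the question is 10 times the slope of the fitted line:
import numpy as np

x = np.array([2.0, 4.0, 3.0, 7.0])               # any four consecutive samples
slope_fit = np.polyfit(np.arange(4), x, 1)[0]    # least-squares slope
slope_kernel = np.dot([-3, -1, 1, 3], x) / 10.0  # same value via the kernel
print(np.isclose(slope_fit, slope_kernel))       # True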
Something like this would work for parameterisable coefficients:
q)x:10+sums -1+1000?2f
q)f:{sum x*til[count x]xprev\:y}
q)f[3 1 -1 -3] x
0n 0n 0n -2.385585 1.423811 2.771659 2.065391 -0.951051 -1.323334 -0.8614857 ..
Specific cases can be made a bit faster (running 0 xprev is not the best thing)
q)g:{prev[deltas x]+3*x-3 xprev x}
q)g[x]~f[3 1 -1 -3]x
1b
q)\t:100000 f[3 1 -1 -3] x
4612
q)\t:100000 g x
1791
There's a kx white paper on signal processing in q if this area interests you: https://code.kx.com/q/wp/signal-processing/
This may be a bit old, but I thought I'd weigh in. There is a paper I wrote last year on signal processing that may be of some value. Working purely within KDB, and depending on the signal sizes you are using, you will see much better performance with an FFT-based convolution between the kernel/window and the signal.
However, I've only written up a simple radix-2 FFT, although in my github repo I do have the untested work for a more flexible Bluestein algorithm which will allow for more variable signal length. https://github.com/callumjbiggs/q-signals/blob/master/signal.q
If you wish to go down the path of performing a full manual convolution by a moving sum, then the best method would be to break it up into blocks equal to the kernel/window size (which was based on some work Arthur W did many years ago)
q)vec:10000?100.0
q)weights:30?1.0
q)wsize:count weights
q)(weights$(((wsize-1)#0.0),vec)til[wsize]+) each til count vec
32.5931 75.54583 100.4159 124.0514 105.3138 117.532 179.2236 200.5387 232.168.
If your input list is not big then you could use the technique mentioned here:
https://code.kx.com/q/cookbook/programming-idioms/#how-do-i-apply-a-function-to-a-sequence-sliding-window
That uses the 'scan' adverb. That process creates multiple lists, which might be inefficient for big lists.
Another solution using scan is:
q)f:{sum y*next\[z;x]} / x-input list, y-weights, z-window size-1
q)f[x;-3 -1 1 3;3]
This function also creates multiple lists, so again it might not be very efficient for big lists.
Another option is to use indices to fetch the target items from the input list and perform the calculation. This operates only on the input list.
q) f:{[l;w;i]sum w*l i+til 4} / w- weight, l- input list, i-current index
q) f[x;-3 -1 1 3]#'til count x
This is a very basic function. You can add more variables to it as per your requirements.
I have a Python xarray dataset with time, x, y as its dimensions and value1 as its variable. I'm trying to compute the annual mean of value1 for each x,y coordinate pair.
I've run into this function while reading the docs:
ds.groupby('time.year').mean()
This seems to compute a single annual mean for all x,y coordinate pairs in value1 at each given time slice
rather than the annual means of individual x,y coordinate pairs at each given time slice.
While the code snippet above produces the wrong output, I'm very interested in its oversimplified form. I would really like to figure out the "xarray trick" to taking an annual mean for each x,y coordinate pair rather than hacking it together myself.
Can someone point me in the right direction? Should I temporarily turn this into a pandas object?
To avoid the default of averaging over all dimensions, you simply need to supply the dimension you want to average over explicitly:
ds.groupby('time.year').mean('time')
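A minimal sketch with a made-up dataset (the dimension sizes here are arbitrary) showing that the x and y dimensions survive the reduction:
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", periods=730, freq="D")
ds = xr.Dataset(
    {"value1": (("time", "x", "y"), np.random.rand(730, 4, 5))},
    coords={"time": time},
)
annual = ds.groupby("time.year").mean("time")
print(annual.value1.dims)  # ('year', 'x', 'y') -- one mean per x,y pair per year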
Note that calling ds.groupby('time.year').mean('time') will be incorrect if you are working with monthly rather than daily data: taking a plain mean places equal weight on months of different lengths, e.g. February and July, which is wrong.
Instead, use the function below from NCAR:
import numpy as np
import xarray as xr

def weighted_temporal_mean(ds, var):
    """
    weight by days in each month
    """
    # Determine the month length
    month_length = ds.time.dt.days_in_month
    # Calculate the weights
    wgts = month_length.groupby("time.year") / month_length.groupby("time.year").sum()
    # Make sure the weights in each year add up to 1
    np.testing.assert_allclose(wgts.groupby("time.year").sum(xr.ALL_DIMS), 1.0)
    # Subset our dataset for our variable
    obs = ds[var]
    # Setup our masking for nan values
    cond = obs.isnull()
    ones = xr.where(cond, 0.0, 1.0)
    # Calculate the numerator
    obs_sum = (obs * wgts).resample(time="AS").sum(dim="time")
    # Calculate the denominator
    ones_out = (ones * wgts).resample(time="AS").sum(dim="time")
    # Return the weighted average
    return obs_sum / ones_out
average_weighted_temp = weighted_temporal_mean(ds_first_five_years, 'TEMP')
Referring to the Julia documentation:
hist(v[, n]) → e, counts
Compute the histogram of v, optionally using approximately n bins. The return values are a range e, which correspond to the edges of the bins, and counts containing the number of elements of v in each bin. Note: Julia does not ignore NaN values in the computation.
I chose a sample range of data:
testdata=0:1:10;
then use the hist function to calculate histograms for 1 to 5 bins:
hist(testdata,1) # => (-10.0:10.0:10.0,[1,10])
hist(testdata,2) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,3) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,4) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,5) # => (-2.0:2.0:10.0,[1,2,2,2,2,2])
As you see, when I ask for 1 bin it calculates 2 bins, and when I ask for 2 bins it calculates 3.
Why does this happen?
As the person who wrote the underlying function: the aim is to get bin widths that are "nice" in terms of a base-10 counting system (i.e. 10^k, 2×10^k, 5×10^k). If you want more control you can also specify the exact bin edges.
The key word in the doc is approximate. You can check what hist is actually doing for yourself in Julia's base module here.
When you do hist(testdata,3), you're actually calling
hist(v::AbstractVector, n::Integer) = hist(v,histrange(v,n))
That is, in a first step the n argument is converted into a FloatRange by the histrange function, the code of which can be found here. As you can see, the calculation of these steps is not entirely straightforward, so you should play around with this function a bit to figure out how it is constructing the range that forms the basis of the histogram.
I want to find the standard deviation:
Minimum = 5
Mean = 24
Maximum = 84
Overall score = 90
I just want to find out my grade by using the standard deviation
Thanks,
A standard deviation cannot in general be computed from just the min, max, and mean. This can be demonstrated with two sets of scores that have the same min, max, and mean but different (population) standard deviations:
1 2 4 5 : min=1 max=5 mean=3 stdev≈1.5811
1 3 3 5 : min=1 max=5 mean=3 stdev≈1.4142
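A quick check with Python's statistics module (illustration only; pstdev is the population standard deviation used above):
import statistics

a = [1, 2, 4, 5]
b = [1, 3, 3, 5]
print(min(a), max(a), statistics.mean(a), round(statistics.pstdev(a), 4))  # 1 5 3 1.5811
print(min(b), max(b), statistics.mean(b), round(statistics.pstdev(b), 4))  # 1 5 3 1.4142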
Also, what does an 'overall score' of 90 mean if the maximum is 84?
I actually did a quick-and-dirty calculation of the type M Rad mentions. It involves assuming that the distribution is Gaussian or "normal". This does not apply to your situation but might help others asking the same question. (You can tell your distribution is not normal because the distances from mean to max and from mean to min are not close to equal.) Even if it were normal, you would need something you don't mention: the number of samples (the number of tests taken in your case).
Those readers who DO have a normal population can use the table below to get a rough estimate by dividing the difference between the measured minimum and the calculated mean by the expected value for their sample size. On average, it will be off by the given number of standard deviations. (I have no idea whether it is biased; change the code below and calculate the error without the abs to get a guess.)
Num Samples Expected distance Expected error
10 1.55 0.25
20 1.88 0.20
30 2.05 0.18
40 2.16 0.17
50 2.26 0.15
60 2.33 0.15
70 2.38 0.14
80 2.43 0.14
90 2.47 0.13
100 2.52 0.13
This experiment shows that the "rule of thumb" of dividing the range by 4 to get the standard deviation is in general incorrect -- even for normal populations. In my experiment it only holds for sample sizes between 20 and 40 (and then loosely). This rule may have been what the OP was thinking about.
You can modify the following Python code to generate the table for different values (change max_sample_size), get more accuracy (change num_simulations), or remove the limitation to multiples of 10 (change the parameters to range in the final for loop).
#!/usr/bin/env python3
import random

# Return the distance of the minimum of samples from its mean
#
# Samples must have at least one entry
def min_dist_from_estd_mean(samples):
    total = 0
    sample_min = samples[0]
    for sample in samples:
        total += sample
        sample_min = min(sample, sample_min)
    estd_mean = total / len(samples)
    return estd_mean - sample_min  # Pos bec min cannot be greater than mean

num_simulations = 4095
max_sample_size = 100

# Calculate expected distances
sum_of_dists = [0] * (max_sample_size + 1)  # +1 so can index by sample size
for iternum in range(num_simulations):
    samples = [random.normalvariate(0, 1)]
    while len(samples) <= max_sample_size:
        sum_of_dists[len(samples)] += min_dist_from_estd_mean(samples)
        samples.append(random.normalvariate(0, 1))
expected_dist = [total / num_simulations for total in sum_of_dists]

# Calculate average error using that distance
sum_of_errors = [0] * len(sum_of_dists)
for iternum in range(num_simulations):
    samples = [random.normalvariate(0, 1)]
    while len(samples) <= max_sample_size:
        ave_dist = expected_dist[len(samples)]
        if ave_dist > 0:
            sum_of_errors[len(samples)] += \
                abs(1 - (min_dist_from_estd_mean(samples) / ave_dist))
        samples.append(random.normalvariate(0, 1))
expected_error = [total / num_simulations for total in sum_of_errors]

cols = " {0:>15}{1:>20}{2:>20}"
print(cols.format("Num Samples", "Expected distance", "Expected error"))
cols = " {0:>15}{1:>20.2f}{2:>20.2f}"
for idx in range(10, len(expected_dist), 10):
    print(cols.format(idx, expected_dist[idx], expected_error[idx]))
You can obtain an estimate of the geometric mean, sometimes called the geometric mean of the extremes or GME, using the Min and the Max by calculating $GME = \sqrt{Min \times Max}$. The SD can then be calculated using your arithmetic mean (AM) and the GME as:
$$SD = \frac{AM}{GME} \sqrt{AM^2 - GME^2}$$
This approach works well for log-normal distributions, or as long as the GME, GM or Median is smaller than the AM.
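Applied to the numbers in the question purely as an illustration of this formula (Min = 5, Max = 84, AM = 24): $GME = \sqrt{5 \times 84} \approx 20.5$, so $SD \approx \frac{24}{20.5}\sqrt{24^2 - 20.5^2} \approx 14.6$.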
In principle you can make an estimate of the standard deviation from the mean/min/max and the number of elements in the sample. The min and max of a sample are, if you assume normality, random variables whose statistics follow from the mean/stddev/number of samples. So given the latter, one can compute (after slogging through the math or running a bunch of Monte Carlo scripts) a confidence interval for the former (e.g. that it is 80% probable the stddev is between 20 and 40, or something like that).
That said, it probably isn't worth doing except in extreme situations.