Pick a random frame from a range of keyframes - actionscript

I'm trying to build a quiz game with 10 levels. Each level has 6 frames. For example, level 1 is frames 1-6, level 2 is frames 7-12, level 3 is frames 13-18, and so on. When the user moves from one level to another (e.g. from level 2 to level 3), a frame should be selected at random from frames 13-18.
How do I code it so that this random selection is executed?

I'm starting off this answer with two assumptions:
you are working with ActionScript 3 (for data-typing purposes in my code examples)
every level has the same number of frames available
First, let's create a variable that holds the number of frames available in each level. In your case that's 6, but this can always be changed.
var levelFrames:int = 6;
Now create a function that will give the random frame based on a given level.
function getLevelFrame(level:int):int {
    var baseFrame:int = (level - 1) * levelFrames;
    var randomFrame:int = Math.floor(Math.random() * levelFrames) + 1;
    return baseFrame + randomFrame;
}
(level - 1) * levelFrames gives the base frame of the level
Math.floor(Math.random() * levelFrames) + 1 gives a number between 1 and 6 to add to the base frame (Math.random() returns a value in [0, 1), so flooring and adding 1 keeps the result in range)
And just use it like this:
trace(getLevelFrame(1)); // this outputs a number between 1 and 6
trace(getLevelFrame(2)); // this outputs a number between 7 and 12
trace(getLevelFrame(3)); // this outputs a number between 13 and 18
etc...

Related

How to skip to a specific frame in a given spectrogram file

I'm encountering problems skipping ahead to a specific frame of a melspec feature set found here. The aim of getting features from the feature set is to analyse the difference in beats per second (BPS) so that I can match up the BPS of two tracks in order to mix between them, or warp the timing of one track to synchronise the two pieces of music. The feature set specifies the following:
Pre-extracted in the "feature" directory are space-delimited floating-point ASCII matrices:
beat_synchronus: one beat-synchronus vector per line
non-beat-synchronus: 512-sample hop frames # 22050Hz sample rate, one vector per line
I'm not quite sure how to interpret this - is the melspec beat-synchronous or non-beat-synchronous, and how does that affect how frames are delimited?
I've got as far as working out the frame duration thanks to this answer, but I don't know how to apply that knowledge to the task of navigating to a specific timecode or frame. The closest I've got is dividing the offset by the frame duration to work out how many frames need to be skipped to reach the offset (1 second into the track, for example, gives 2583 frames). However, the file is not demarcated into lines and, as far as I can tell, is just a continuous list of entries. This leads to the question of what the size of a given frame is (if that's the right terminology): is it the case that 2583 entries need to be skipped to get to the right entry, or does each frame have a specific number of entries, so that I need to skip 2583 frames of size x? And what is size x (512?)?
I've been able to open the melspec file, but there are no delimiters between entries; it is instead a continuous list of entries.
The code I have so far works out the duration of a frame, and therefore the number of frames to skip for a given offset. However, it does not tell me the size of a given frame or how to access it in the melspec file.
spectrogram is the file path for a given feature set; offset is the time in seconds from the start of the track.
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    # read the lines of the file so that the offset can be applied
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line)
This gives a list of 10 lines of results, which do not seem to be naturally delimited by line.
The sample file you are referring to is a matrix of 128 x 7392 values.
To better understand the format of this file, you may look at the extractFeatures.py script used to extract the features. You may notice that the melspec feature is described as "non-beat-synchronus" and computed using librosa.feature.melspectrogram, using mostly default arguments and producing an output S of n_mel rows by t columns.
To figure out the value of n_mel you need to look at librosa.filters.mel, which indicates a default value of 128. The number of frames t on the other hand is computed internally in librosa.util.frame as 1 + int((len(y) - frame_length) / hop_length), where the frame_length uses the default value 2048 and the hop_length uses the default value 512.
To summarize, the 128 rows correspond to the 128 MEL-frequency bins, and the 7392 columns correspond to time frames.
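As a quick sanity check (a minimal sketch; the sample rate, hop length and frame length are the librosa defaults quoted above, and the signal length is inferred from the 7392 columns rather than read from the audio), the number of columns is consistent with a track of roughly 171.7 seconds:
SAMPLE_RATE = 22050
HOP_LENGTH = 512
FRAME_LENGTH = 2048
N_FRAMES = 7392  # number of columns in the melspec matrix

# invert n_frames = 1 + (len(y) - frame_length) // hop_length
approx_samples = (N_FRAMES - 1) * HOP_LENGTH + FRAME_LENGTH
print(approx_samples / SAMPLE_RATE)  # ~171.7 seconds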
You could thus use the following to extract the column of interest:
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    # read the lines of the file so that the offset can be applied
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line.split(" ")[indices])
Using numpy you could also read the entire spectrogram and address a specific column:
import numpy as np

def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    data = np.loadtxt(spectrogram)
    column = int(SHIFT_FRAMES)
    print(data[:, column])
Going back on the fact that the feature extraction was done using librosa, you may also consider using librosa.core.time_to_frames instead of manually computing the frame number:
import librosa

def skipToFrame(spectrogram, offset):
    SHIFT_FRAMES = librosa.core.time_to_frames(offset, sr=22050, hop_length=512, n_fft=2048)
    ...
On a final note, you should be aware that each of these time frames uses 2048 samples, but they overlap such that each successive frame advances 512 samples relative to the previous frame. So the frames cover the following time intervals (a short sketch reproducing these intervals follows the table):
frame # | start (s) | end (s)
================================
1 | 0.000 | 0.093
2 | 0.023 | 0.116
3 | 0.046 | 0.139
...
41 | 0.929 | 1.022
42 | 0.952 | 1.045
...
7392 | 171.619 | 171.712
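The table above can be reproduced with a short sketch (again assuming sr=22050, hop_length=512 and n_fft=2048; frame numbers are 1-based as in the table):
def frame_interval(frame_number, sr=22050, hop_length=512, frame_length=2048):
    # the frame starts (frame_number - 1) hops in and spans frame_length samples
    start = (frame_number - 1) * hop_length / sr
    end = ((frame_number - 1) * hop_length + frame_length) / sr
    return start, end

print(frame_interval(1))     # (0.000, ~0.093)
print(frame_interval(42))    # (~0.952, ~1.045)
print(frame_interval(7392))  # (~171.619, ~171.712)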

Padding time-series subsequences for LSTM-RNN training

I have a dataset of time series that I use as input to an LSTM-RNN for action anticipation. Each time series covers 5 seconds at 30 fps (i.e. 150 data points), and the data represents the position/movement of facial features.
I sample additional sub-sequences of smaller length from my dataset in order to add redundancy in the dataset and reduce overfitting. In this case I know the starting and ending frame of the sub-sequences.
In order to train the model in batches, all time series need to have the same length, and according to many papers in the literature padding should not affect the performance of the network.
Example:
Original sequence:
1 2 3 4 5 6 7 8 9 10
Subsequences:
4 5 6 7
8 9 10
2 3 4 5 6
Considering that my network is trying to anticipate an action (meaning that as soon as P(action) > threshold while going from t = 0 to t = tmax, it will predict that action), will it matter where the padding goes?
Option 1: zeros substitute the original values (each subsequence keeps its original position in time)
0 0 0 4 5 6 7 0 0 0
0 0 0 0 0 0 0 8 9 10
0 2 3 4 5 6 0 0 0 0
Option 2: all zeros at the end
4 5 6 7 0 0 0 0 0 0
8 9 10 0 0 0 0 0 0 0
2 3 4 5 0 0 0 0 0 0
Moreover, some of the time series are missing a number of frames, but it is not known which ones they are - meaning that if we only have 60 frames, we don't know whether they are taken from 0 to 2 seconds, from 1 to 3s, etc. These need to be padded before the subsequences are even taken. What is the best practice for padding in this case?
Thank you in advance.
The most powerful attribute of LSTMs, and RNNs in general, is that their parameters are shared along the time frames (parameters recur over time steps). But this parameter sharing relies on the assumption that the same parameters can be used for different time steps, i.e. that the relationship between the previous time step and the next one does not depend on t, as explained here on page 388, 2nd paragraph.
In short, padding zeros at the end theoretically should not change the accuracy of the model. I say theoretically because at each time step the LSTM's decision depends on its cell state, among other factors, and this cell state is a kind of short summary of the past frames. As far as I understood, those past frames may be missing in your case. I think what you have here is a little trade-off.
I would rather pad zeros at the end because it doesn't completely conflict with the underlying assumption of RNNs and it's more convenient to implement and keep track of.
On the implementation side, I know TensorFlow can compute the loss correctly once you give it the sequences and the actual length of each sample (e.g. for 4 5 6 7 0 0 0 0 0 0 you also need to give it the actual length, which is 4 here), assuming you're implementing option 2. I don't know whether there is an implementation for option 1, though.
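To illustrate option 2 in a framework-agnostic way (a minimal sketch using numpy; the fixed length of 10 and the subsequences are the toy values from the question), you would pad each subsequence at the end and keep its true length alongside it:
import numpy as np

# Pad each subsequence with zeros at the end to a fixed length and record the
# actual lengths, which the RNN/loss can then use to ignore the padded steps.
subsequences = [[4, 5, 6, 7], [8, 9, 10], [2, 3, 4, 5, 6]]
max_len = 10

lengths = [len(s) for s in subsequences]        # [4, 3, 5]
padded = np.zeros((len(subsequences), max_len))
for i, seq in enumerate(subsequences):
    padded[i, :len(seq)] = seq

print(padded)   # rows: 4 5 6 7 0 0 0 0 0 0 / 8 9 10 0 ... / 2 3 4 5 6 0 ...
print(lengths)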
Better to pad zeroes at the beginning, as the paper Effects of padding on LSTMs and CNNs suggests:
Though the post-padding model peaked its efficiency at 6 epochs and started to overfit after that, its accuracy is way less than pre-padding.
Check Table 1, where the accuracy of pre-padding (padding zeroes at the beginning) is around 80%, but for post-padding (padding zeroes at the end) it is only around 50%.
In case you have sequences of variable length, pytorch provides a utility function torch.nn.utils.rnn.pack_padded_sequence. The general workflow with this function is
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

embedding = nn.Embedding(4, 5)
rnn = nn.GRU(5, 5)
sequences = torch.tensor([[1, 2, 0], [3, 0, 0], [2, 1, 3]])
lens = [2, 1, 3]  # indicating the actual length of each sequence
embeddings = embedding(sequences)
packed_seq = pack_padded_sequence(embeddings, lens, batch_first=True, enforce_sorted=False)
e, hn = rnn(packed_seq)
One can collect the RNN output for each token with
e, lens_unpacked = pad_packed_sequence(e, batch_first=True)
Using this function is better than padding by yourself, because torch will limit the RNN to inspecting only the actual sequence and stop before the padded tokens.
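After unpacking, e has shape (batch, longest sequence in the batch, hidden size) because of batch_first=True, and lens_unpacked holds the original lengths, so you can mask out the positions beyond each sequence's actual length when computing the loss.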

GBM handling factor variables, worried about too many factors

I am working on a basketball model that predicts how well an NBA player will play in their next game, based on how well they have performed in all previous games of the season. There are roughly 10 players per NBA team, and each of 30 teams has played about 25 games this season, so my dataframe has about 10*30*25 = 7,500 observations at this point. I run my model each day, predicting how well players will play in the next day - therefore, for tomorrow I will make roughly 10*30 = 300 predictions.
My question is this: currently I have about 50 columns / features / x-variables that I am using for prediction, all of which are numeric (average number of points scored, average number of rebounds, etc.). However, I think it may help my model to know which player each row corresponds to. That is, I want to pass a 51st column, a factor variable containing the players' names. I read online that GBM can deal with factor variables, as it will "dummify" them internally, but I am worried that "dummifying" 300 different players will not perform well. Will passing a factor variable with all of the player names backfire and ultimately hurt my model due to the large number of dummy variables it will create internally, or is this okay?
my_df
PLAYER FG FGA X3P X3PA FT FTA
1042 Andre Drummond 6 16 0 0 6 10
17747 Marcus Morris 6 19 1 4 5 6
14861 Kentavious Caldwell-Pope 7 14 4 7 3 3
7976 Ersan Ilyasova 6 12 3 6 1 2
22401 Reggie Jackson 4 10 2 4 5 5
24475 Stanley Johnson 3 10 1 3 0 0
24649 Steve Blake 1 6 1 5 0 0
12489 Jodie Meeks 1 4 0 0 0 0
1955 Aron Baynes 3 5 0 0 0 0
21500 Paul Millsap 7 15 2 6 3 4
I have used factor variables with a large number of levels in gbm, and the biggest problem you will face is that your computation time will increase significantly (which may not be a problem in your case, as the dataset you are using is small). Also, when you plot variable importance,
gbm_model <- train(A0 ~ .,
                   data = training,
                   method = "gbm",
                   distribution = "bernoulli",
                   metric = "ROC",
                   maximise = TRUE,
                   tuneGrid = grid,
                   train.fraction = 0.6,
                   trControl = ctrl)
ggplot(varImp(gbm_model, scale = TRUE))
each factor level shows up separately, which can make it pretty confusing to assess importance.
Apart from this, you mention that you have 7,500 observations, 50 features and 300 different players. If you add player name as a variable, that would mean approximately 25 observations per player, which is a pretty small sample to work with and may mean that your model won't generalize well. So my personal suggestion would be to abstain from doing so.
However, I see the point of why you would want to do so and would suggest that you try clustering the players (using player-specific criteria or maybe even some features you already have) and then use the cluster a player belongs to as a variable.
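A minimal sketch of that clustering idea, here in Python with scikit-learn purely for illustration (the same approach works in R with kmeans(); the file name, the choice of columns and n_clusters=8 are assumptions to be tuned):
import pandas as pd
from sklearn.cluster import KMeans

# Cluster players on their per-player average stats, then use the cluster label
# as a single low-cardinality factor instead of ~300 player names.
df = pd.read_csv("player_games.csv")  # hypothetical game-level data with the columns shown above
stats = ["FG", "FGA", "X3P", "X3PA", "FT", "FTA"]
player_means = df.groupby("PLAYER")[stats].mean()

kmeans = KMeans(n_clusters=8, random_state=0, n_init=10)
player_means["cluster"] = kmeans.fit_predict(player_means)

# Join the cluster label back onto the game-level rows as the extra feature.
df = df.merge(player_means[["cluster"]], left_on="PLAYER", right_index=True)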
Hope this helps! :)
I have the same problem with the gbm function; for instance, I added a random factor with 100 levels and it showed up as the most influential variable.

Generate a Random Number between 0.0001 and 0.002 in Objective C (iOS)?

Does anyone know how I could generate a random number in a range in iOS? I am currently working on a synthesizer in iOS (using SpriteKit and AudioKit) and I am trying to modify the loudness of the synth by changing its variability whenever a slider is moved.
This is what my code looks like:
[Synth setAmplitude: 0.5 + (slider.currentValue * loudnessVar)];
where 0.5 is the default amplitude value and loudnessVar is a random number.
Since the slider returns values from -170 to 170, I would need a relatively small number in order to keep the result between 0 and 1.
Is anyone able to help with this?
The way to generate a random integer in a range is:
NSInteger random = min + arc4random_uniform(max - min + 1);
So, for example, you can generate an integer between 1 and 20 and divide it by 10000.0, which gives a value between 0.0001 and 0.002.

vowpalwabbit strange features count

I have found that during training of my model, vw shows a very big feature count (much bigger than my actual number of features) in its log.
I have tried to reproduce it using some small example:
simple.test:
-1 | 1 2 3
1 | 3 4 5
then "vw simple.test" command says that it have used 8 features. +one feature is constant but what are the other ? And in my real exmaple difference between my features and features used in wv is abot x10 more.
....
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = t
num sources = 1
average since example example current current current
loss last counter weight label predict features
finished run
number of examples = 2
weighted example sum = 2
weighted label sum = 3
average loss = 1.9179
best constant = 1.5
total feature number = 8 !!!!
total feature number displays the sum of feature counts over all observed examples. So it's 2*(3+1 constant) = 8 in your case. The number of features in the current example is displayed in the current features column. Note that only every 2^Nth example is printed on screen by default. In general, observations can have unequal numbers of features.
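If you want to see exactly which features vw builds for each example, you can run it in audit mode (the -a / --audit flag), which prints every feature of every example, including the added constant; that makes the per-example counts easy to verify.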

Resources