How to convert a low-pass filter to a band-pass filter - signal-processing

I have a low-pass filter described by the following impulse response:
h[n] = (w_c/Pi) * sinc( n * w_c / Pi ), where w_c is the cutoff frequency
I have to convert this low-pass filter to a band-pass filter.

Your h[n] transforms into a rect in the frequency domain. To make it band-pass you need to move its centre frequency higher.
To do this, multiply h[n] by exp(j*w_offset*n), where w_offset is the amount to shift. If w_offset is positive, then you shift towards higher frequencies.
Multiplication in the time domain is convolution in the frequency domain. Since exp(j*w_offset*n) transforms into an impulse centred on w_offset, the multiplication shifts H(w) by w_offset.
See Discrete Time Fourier Transform for more details.
Note: the frequency response of such a filter is not symmetric about 0, which means its coefficients are complex. To make it symmetric, you need to add h[n] multiplied by exp(-j*w_offset*n):
h_bandpass[n] = h[n](exp(j*w_offset*n)+exp(-j*w_offset*n))
Since cos(w*n) = (exp(j*w*n)+exp(-j*w*n))/2 we get:
h_bandpass[n] = 2*h[n]cos(w_offset*n)
This filter then has purely real values. The factor of 2 only scales the passband gain, so it can be dropped if the overall gain does not matter.
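The cosine modulation described above can be sketched as follows. The values of w_c, w_offset and n_taps are example assumptions, not taken from the question, and the frequency response is evaluated directly from the DTFT sum rather than with a library routine. The factor 2 comes from the cosine identity and keeps the passband gain close to 1.

```octave
% Example parameters (assumptions for illustration only)
w_c      = pi/8;    % low-pass cutoff, rad/sample
w_offset = pi/2;    % centre of the desired band, rad/sample
n_taps   = 101;     % odd length so the filter is centred on n = 0

n    = (0:n_taps-1) - (n_taps-1)/2;          % symmetric sample index
h_lp = (w_c/pi) * sinc(n * w_c / pi);        % truncated ideal low-pass
h_bp = 2 * h_lp .* cos(w_offset * n);        % real band-pass coefficients

% Magnitude response on [0, pi], computed from the DTFT definition
w    = linspace(0, pi, 513);
H_bp = abs(h_bp * exp(-1i * n.' * w));

gain_dc     = H_bp(1);      % response at w = 0 (stopband)
gain_centre = H_bp(257);    % response at w = pi/2 (centre of the passband)
```

Because the sinc is simply truncated, the passband shows some ripple; applying a window to h_lp before modulating would reduce it.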

The short answer is that you multiply by a complex exponential in the time domain. Multiplying by a complex exponential in the time domain shifts the signal in the frequency domain.
Matlab code:
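w_c = pi/8; % example low-pass cutoff in rad/sample (not given in the question)
w_offset = pi/2; % example centre frequency of the band in rad/sample
n_taps = 100;
n = 1:n_taps;
h = ( w_c / pi ) * sinc( ( n - n_taps / 2) * w_c / pi ) .* ...
exp( 1i * w_offset * ( n - n_taps / 2) );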
p.s. I happened to have just implemented this exact functionality for school a couple of weeks ago.
Here is code for creating your own band pass filter using the windowing method:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Function: Create bandpass filter using windowing method
% Purpose: Simple method for creating filter taps ( useful when more elaborate
% filter design libraries are not available )
%
% #author Trevor B. Smith, 24MAR2009
%
% #param n_taps How many taps are in your output filter
% #param omega_p1 The lower cutoff frequency for your passband filter
% #param omega_p2 The upper cutoff frequency for your passband filter
% #return h_bpf_hammingWindow The filter coefficients for your passband filter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function h_bpf_hammingWindow = BPF_hammingWindow(n_taps,omega_p1,omega_p2)
% Error checking
if( ( omega_p2 == omega_p1 ) || ( omega_p2 < omega_p1 ) || ( n_taps < 10 ) )
str = 'ERROR - h_bpf_hammingWindow(): Incorrect input parameters'
h_bpf_hammingWindow = -1;
return;
end
% Compute constants from function parameters
length = n_taps - 1; % How many units of T ( i.e. how many units of T, sampling period, in the continuous time. )
passbandLength = omega_p2 - omega_p1;
passbandCenter = ( omega_p2 + omega_p1 ) / 2;
omega_c = passbandLength / 2; % LPF omega_c is half the size of the BPF passband
isHalfSample = 0;
if( mod(length,2) == 1 )
isHalfSample = 1/2;
end
% Compute hamming window
window_hamming = hamming(n_taps);
% Compute time domain samples
n = transpose(-ceil(length/2):floor(length/2));
h1 = ( omega_c / pi ) * sinc( (1/pi) * omega_c * ( n + isHalfSample ) ) .* exp( 1i * passbandCenter * ( n + isHalfSample ) ); % omega_c/pi scaling gives roughly unity passband gain
% Window the time domain samples
h2 = h1 .* window_hamming;
if 1
figure; stem(h2); figure; freqz(h2);
end
% Return filter coefficients
h_bpf_hammingWindow = h2;
end % function BPF_hammingWindow()
Example on how to use this function:
h_bpf_hammingWindow = BPF_hammingWindow( 36, pi/4, 3*pi/4 );
freqz(h_bpf_hammingWindow); % View the frequency domain

Let f[n] be the signal you get from a low-pass filter with w_c at the lower bound of the desired band. Subtracting f[n] from the original signal leaves only the frequencies above this lower bound. Use that difference as the input to a second low-pass filter whose w_c is at the upper bound of the band; its output is the band-passed signal.
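A minimal sketch of this subtract-then-filter idea, written in terms of impulse responses. The band edges, the filter length and the hand-written Hamming window are assumptions for illustration. "Original minus low-passed" corresponds to a unit impulse minus the low-pass coefficients, with the impulse placed at the filter centre so that the delays line up.

```octave
% Example parameters (assumptions for illustration only)
w_lo   = pi/4;      % lower band edge, rad/sample
w_hi   = 3*pi/4;    % upper band edge, rad/sample
n_taps = 101;

n   = (0:n_taps-1) - (n_taps-1)/2;
win = 0.54 - 0.46 * cos(2*pi*(0:n_taps-1)/(n_taps-1));   % Hamming window
lp  = @(w_c) (w_c/pi) * sinc(n * w_c / pi) .* win;       % windowed-sinc low-pass

delta = double(n == 0);          % unit impulse aligned with the filter centre
h_hp  = delta - lp(w_lo);        % original minus low-passed = high-pass
h_bp  = conv(h_hp, lp(w_hi));    % second low-pass at the upper bound

% Magnitude response of the cascade at a few frequencies
m    = (0:numel(h_bp)-1) - (numel(h_bp)-1)/2;
gain = @(w) abs(h_bp * exp(-1i * m.' * w));
g    = gain([0, pi/2, pi]);      % below the band, mid-band, above the band
```

Filtering a signal x with the result is then conv(x, h_bp), which is equivalent to running the two stages one after the other.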

Related

Gradient Descent produces incorrect Thetas in octave

I'm trying out a prediction algorithm using polynomial regression of the form h(x) = theta0 + theta1 * x1 + theta2 * x2, where x2=x1^2
I'm calculating the thetas with two methods, to compare the results: Normal Equation vs. Gradient Descent. Then I plot the regression line for both methods for scores from 65 to 100, to see how it fits with my data.
When calculating thetas using Normal Equation, all seems to be working as expected. In the graph below, "x" is the actual scores, and "o" is the predicted scores.
However when calculating thetas using Gradient Descent, the resulting regression line does not fit my data. It looks like this:
While minimizing my Cost Function, I'm plotting the Gradient Descent iterations over J, to confirm that values converge. This seems to be working correctly:
Here's my code:
function [theta_normalEq, theta_gradientDesc] = a1_LinearRegression()
clear();
% suppose you want to fit a model of the form h(x) = theta0 + theta1 * x1 + theta2 * x2
% where x1 is the midterm score and x2 is (midterm score)^2
midTerm = [89; 72; 94; 69]; % mid-term Exam scores
midTerm2 = midTerm .^2; % same as above but each element squared ('.' means element-wise; without the dot, ^2 alone would mean matrix multiplication)
X = [midTerm midTerm2]; % concatenate the two vectors into a single matrix
y = [96; 74; 87; 78]; %final Exam scores
% Method A:
% calculate theta (bias for each independent variable) using Normal Equation
% This works in some cases only (see comments in corresponding function below)
theta_normalEq = normalEquation(X, y);
% Method B:
% Use Gradient Descent
theta_gradientDesc = gradientDescent(X, y, 1.3, 60);
% plot regression line to see visually how it fits with our data
plotRegressionLine(midTerm, y, X, theta_gradientDesc);
% clear unneeded variables for a tidy output window
clear ('midTerm', 'midTerm2');
endfunction
% plots a regression line to see visually how it fits with our data
function plotRegressionLine(midTerm, y, X, theta)
% Our X matrix is n-long, but our theta is n+1 (remember we are modeling h(x) = theta0 + theta1 * x1 + theta2 * x2)
% Therefore we will introduce an X0 and set it to x0 = 1 for all values of i, so that we can do matrix operations with theta and X.
% This makes the two vectors 'theta' and x(i) match each other element-wise (that is, have the same number of elements: n+1).
X0 = ones(rows(X),1);
X = [X0 X]; % concatenation; X had 2 columns, now it has 3. The very first column now consists of 'ones'
clear ('X0'); % just clears the variable
% with our thetas calculated, we can now plug them in our original model to make predictions
% model form: h(x) = theta0 + theta1 * x1 + theta2 * x2
% vectorized version: h(x) = X * theta
y_predicted = X * theta;
% let's also calculate the points for all possible scores, to draw a regression line
scoreMin = 65;
scoreMax = 100;
step = 0.1;
scores = (scoreMin: step: scoreMax)';
scoresX = [ones(rows(scores),1) scores scores.^2];
scoresY_predicted = scoresX * theta;
% plot
figure 2;
clf;
hold on;
plot(midTerm, y, "x"); % draws our actual data points
plot(midTerm, y_predicted, "or"); % draws our predicted data points
plot(scores, scoresY_predicted, "r"); % draws our calculated regression line
hold off;
endfunction
% Performs gradient descent to learn theta. Updates theta by taking num_iters
%
% X = matrix of independent variables (e.g., size of house, number of bedrooms, number of bathrooms, etc)
% y = vector of dependent variables (e.g., cost of house)
% alpha = the rate of learning
% number of iterations to try finding the optimum theta
%
% Start by trying out a random alpha, like 0.1 or 1.
% If alpha is too small, it will take too long to minimize J and see values converging (too many iterations)
% If alpha is too large, we will overshoot the function minimum and values will start increasing again
% Ideally we want as large an alpha to get enough resolution to discover the function minimum with as few iterations as possible, without overshooting the minimum
%
% We also want a number of iterations that is enough, but not too many. Depending on our problem and data, this can be from 30 to 300 to 3000 to 3 million, or more.
% In practice, we plot J against number of iterations as we go along the loop, to discover experimentally the optimal values for 'alpha' and 'num_iters'
% The graph we are looking for looks like a hockey stick of reducing values that flattens horizontally. When J no longer reduces (the flat horizontal part), we have converged.
%
function theta = gradientDescent(X, y, alpha, num_iters)
% NORMALIZE FEATURES
% We can speed up gradient descent by having each of our input values in roughly the same range
% This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.
% The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same
% zscore() normalizes each feature (each column) independently, which is what we want: (value - mean of values for that column) / standard deviation of that column
X = zscore(X);
y = zscore(y);
% Our X matrix is n-long, but our theta is n+1 (remember we are modeling h(x) = theta0 + theta1 * x1 + theta2 * x2)
% Therefore we will introduce an X0 and set it to x0 = 1 for all values of i, so that we can do matrix operations with theta and X.
% This makes the two vectors 'theta' and x(i) match each other element-wise (that is, have the same number of elements: n+1).
X0 = ones(rows(X),1);
X = [X0 X]; % concatenation; X had 2 columns, now it has 3. The very first column now consists of 'ones'
clear ('X0'); % just clears the variable
% number of training examples
m = length(y);
% save the cost J in every iteration in order to plot J vs. num_iters and check for convergence
J_history = zeros(num_iters, 1);
% We start with a random set of thetas.
% Gradient Descent improves them at each iteration until values converge
% NOTE: do not use randomMatrix() to initialize. Rather, hard code random values so that they are identical at each run attempt,
% to help us experiment with different sets of 'alpha' & 'num_iters' until we discover their optimal values.
%theta = randomMatrix(columns(X), 1, 0, 1);
theta = [0;0;0];
for iter = 1:num_iters
h = X * theta;
stderr = h - y;
theta = theta - (alpha/m) * X' * stderr;
J_history(iter) = computeCost(X, y, theta);
endfor
% plot J vs. num_iters and check for convergence
xAxis = 1:1:num_iters; % create vector from 1 to num_iters with step 1
figure 1;
clf;
plot(xAxis, J_history);
endfunction
% These two functions give identical results, but maybe one runs faster than another
function J = computeCost(X, y, theta)
m = length(y); % number of training examples
J = 1/(2*m) * sum( ( X*theta - y) .^ 2);
endfunction
%
function J = computeCostVectorized(X, y, theta)
m = length(y); % number of training examples
J = 1/(2*m) * (X*theta - y)' * (X*theta - y);
endfunction
% alternative way of finding the optimum theta without iteration and without having to try different alphas (rate of learning)
% however this method can be slow in situations with a lot of features + large training set combos
% There is no need to do feature scaling with the normal equation!!!
%
% WARNING:
% X'* X may be noninvertible. The common causes are:
% > Redundant features, where two features are very closely related (i.e. they are linearly dependent)
% > Too many features (e.g. m ≤ n). In this case, delete some features or use "regularization" (to be explained in a later lesson)
%
% Solutions to the above problems include deleting a feature that is linearly dependent with another or deleting one or more features when there are too many features
function theta = normalEquation(X, y)
% Our X matrix is n-long, but our theta is n+1 (remember we are modeling h(x) = theta0 + theta1 * x1 + theta2 * x2)
% Therefore we will introduce an X0 and set it to x0 = 1 for all values of i, so that we can do matrix operations with theta and X.
% This makes the two vectors 'theta' and x(i) match each other element-wise (that is, have the same number of elements: n+1).
X0 = ones(rows(X),1);
X = [X0 X]; % concatenation; X had 2 columns, now it has 3. The very first column now consists of 'ones'
clear ('X0'); % just clears the variable
Xt = X';
theta = pinv(Xt * X) * Xt * y;
endfunction
% returns a random matrix of the specified size
% if you don't care to specify mean and variance, just use 0 and 1 respectively (or just call 'randn(rows, columns)' directly)
function retVal = randomMatrix(rows, columns, mean, variance)
retVal = mean + sqrt(variance)*(randn(rows,columns));
endfunction
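One thing that may be worth checking in the code above: gradientDescent() z-scores both X and y before fitting, while plotRegressionLine() multiplies the raw scores by the returned theta. The sketch below assumes that this mismatch matters and shows how predictions could be mapped back to the original scale. To stay short it obtains the normalized-space theta with a least-squares solve as a stand-in for a converged gradient descent; mu_X, sd_X, mu_y, sd_y and predict are names introduced here for illustration.

```octave
midTerm = [89; 72; 94; 69];
X = [midTerm, midTerm.^2];
y = [96; 74; 87; 78];

% Keep the normalization statistics instead of discarding them
mu_X = mean(X);  sd_X = std(X);
mu_y = mean(y);  sd_y = std(y);

Xn = [ones(size(X,1),1), (X - mu_X) ./ sd_X];   % same scaling as zscore(X)
yn = (y - mu_y) / sd_y;                         % same scaling as zscore(y)

% Stand-in for converged gradient descent on the normalized data
theta_n = Xn \ yn;

% Predict for raw scores: normalize the inputs the same way, then undo the y scaling
predict = @(s) ([ones(numel(s),1), ([s, s.^2] - mu_X) ./ sd_X] * theta_n) * sd_y + mu_y;

y_from_normalized = predict(midTerm);

% Reference: least squares directly on the raw features
theta_raw  = [ones(size(X,1),1), X] \ y;
y_from_raw = [ones(size(X,1),1), X] * theta_raw;
```

With this mapping, predictions from the normalized model agree with the ones from the raw least-squares fit, so the two regression lines can be compared on the same axes.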

Which is the correct implementation of regularization in octave?

I'm currently taking Andrew Ng's machine learning course, and I try to implement things as I learn them so that I don't forget them. I just finished regularization (chapter 7). I know that theta 0 is updated normally, separately from the other parameters; however, I am not sure which of these is the correct implementation.
Implementation 1: in my gradient function, after computing the regularization vector, change theta 0 part to 0 so when it is added to the total, it is as if theta 0 was never regularized.
Implementation 2: store theta in a temp variable: _theta, update it with a reg_step of 0 (so it's as if there's no regularization), store the new theta 0 in a temp variable: t1, then update the original theta value with my desired reg_step and replace theta 0 with t1 (value from non-regularized update).
Below is my code for the first implementation. It's not meant to be advanced; I'm just practicing.
I'm using Octave, which is 1-indexed, so theta(1) is theta 0.
function ret = gradient(X,Y,theta,reg_step),
H = theta' * X;
dif = H-Y;
mul = dif .* X;
total = sum(mul,2);
m=(size(Y)(1,1));
regular = (reg_step/m)*theta;
regular(1)=0;
ret = (total/m)+regular,
endfunction
Thanks in advance.
A slight tweak to the first implementation worked for me.
First, calculate regularization for every theta. Then go on to perform gradient step and later you can change the first entry of the matrix containing gradients manually to ignore regularization for theta_0.
% Calculate regularization
regularization = (reg_step / m) * theta;
% Gradient Step
gradients = (1 / m) * (X' * (predictions - y)) + regularization;
% Ignore regularization in theta_0
gradients(1) = (1 / m) * (X(:, 1)' * (predictions - y));
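A self-contained sketch of the same idea, using a mask instead of overwriting the first entry afterwards. The data, theta and lambda values are made up for illustration; lambda plays the role of reg_step from the question.

```octave
% Toy data (assumptions for illustration only)
X      = [ones(4,1), [1; 2; 3; 4]];   % first column is the bias feature
y      = [2; 4; 6; 8];
theta  = [0.5; 1.5];
lambda = 1;                           % regularization strength (reg_step)
m      = numel(y);

err        = X * theta - y;           % prediction errors
grad_unreg = (X' * err) / m;          % gradient without regularization

% Regularize every parameter except theta(1), i.e. theta_0
mask = [0; ones(numel(theta) - 1, 1)];
grad = grad_unreg + (lambda / m) * (theta .* mask);
```

Here grad(1) equals the unregularized gradient for theta_0, and every other entry gains the extra (lambda/m)*theta term.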

Vectorize getting every nth element (but nth element is variable)

How can I vectorize getting every nth element if the nth element is variable?
I know about:
A = randi( 10, 10, 2 );
B = A(2:2:end, :); % make another matrix (B) that contains every 2nd element
But my nth variable changes.
Here's a working FOR loop code based on the golden angle:
1) It converts the (golden angle) in degrees wanted into cell bit location for array.
2) Shift array by that given amount.
3) Places the 1st shifted cell wanted into a new array.
signal_used_L1 = [1:9](:).';
total_samples = numel( signal_used_L1 );
for hh = 1 : length( signal_used_L1 )
% PHI
deg_to_shift = 137.5077 * hh;
% convert degrees wanted into cell bits
shift_sig_in_bits_L1 = total_samples * deg_to_shift / 360;
% shift signal by given amount of cell bits
shift_sig_L1 = circshift( signal_used_L1(:).' , ...
[0, round(shift_sig_in_bits_L1)] );
% create array with shifted cell bits
sig_bit_built(1, hh) = shift_sig_L1(1, 1);
end
PS: I'm using Octave 4.2.2
Not sure what you're trying to do exactly, but I'd vectorise your code as follows:
signal_used_L1 = [1:9](:).';
total_samples = numel( signal_used_L1 );
% PHI
deg_to_shift = 137.5077 * [1:length( signal_used_L1 )];
% convert degrees wanted into cell bits
shift_sig_in_bits_L1 = total_samples * deg_to_shift / 360;
% obtain "wrap-around" indices given above cell bits
indices = mod( -round( shift_sig_in_bits_L1 ), total_samples ) + 1;
% create array with shifted cell bits
signal_used_L1( indices )
Incidentally, I think you meant to do circshift with a negative shift though (i.e. rotate the array left, so that the element "n" places to the right of the start ends up first). In which case the vectorised code above would be mod( round... rather than mod( -round...
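As a quick check, the loop from the question and the index-based version can be run side by side. This is a sketch with shortened variable names (signal, shifts, picked_loop, picked_vec are introduced here), using the same 1:9 signal and golden-angle constant.

```octave
signal = 1:9;
N      = numel(signal);

% Shift (in samples) for each step, as in the question
shifts = round(N * 137.5077 * (1:N) / 360);

% Loop version: rotate right by shifts(hh), then take the first element
picked_loop = zeros(1, N);
for hh = 1:N
  shifted = circshift(signal, [0, shifts(hh)]);
  picked_loop(hh) = shifted(1);
end

% Vectorised version: compute the wrap-around index directly
picked_vec = signal(mod(-shifts, N) + 1);
```

Both vectors hold the same elements, because rotating right by k places moves the element at index mod(-k, N) + 1 to the front.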

Generating a Histogram by Harmonic Number

I am trying to create a program in GNU Octave to draw a histogram showing the fundamental and harmonics of a modified sinewave (the output from an SCR dimmer, which consists of a sinewave which is at zero until part way through the wave).
I've been able to generate the waveform and perform FFT to get a set of Frequency vs Amplitude points, however I am not sure how to convert this data into bins suitable for generating a histogram.
Sample code and an image of what I'm after below - thanks for the help!
clear();
vrms = 120;
freq = 60;
nCycles = 2;
level = 25;
vpeak = sqrt(2) * vrms;
sampleinterval = 0.00001;
num_harmonics = 10
disp("Start");
% Draw the waveform
x = 0 : sampleinterval : nCycles * 1 / freq; % time in sampleinterval increments
dimmed_wave = [];
undimmed_wave = [];
for i = 1 : columns(x)
rad_value = x(i) * 2 * pi * freq;
off_time = mod(rad_value, pi);
on_time = pi*(100-level)/100;
if (off_time < on_time)
dimmed_wave = [dimmed_wave, 0]; % in the dimmed period, value is zero
else
dimmed_wave = [dimmed_wave, sin(rad_value)]; % when not dimmed, value = sine
endif
undimmed_wave = [undimmed_wave, sin(rad_value)];
endfor
y = dimmed_wave * vpeak; % calculate instantaneous voltage
undimmed = undimmed_wave * vpeak;
subplot(2,1,1)
plot(x*1000, y, '-', x*1000, undimmed, '--');
xlabel ("Time (ms)");
ylabel ("Voltage");
% Fourier Transform to determine harmonics
subplot(2,1,2)
N = length(dimmed_wave); % number of points
fft_vals = abs(fftshift(fft(dimmed_wave))); % perform fft
frequency = [ -(ceil((N-1)/2):-1:1) ,0 ,(1:floor((N-1)/2)) ] * 1 / (N *sampleinterval);
plot(frequency, fft_vals);
axis([0,400]);
xlabel ("Frequency");
ylabel ("Amplitude");
You know your base frequency (fundamental tone), let's call it F. 2*F is the second harmonic, 3*F the third, etc. You want to set histogram bin edges halfway between these: 1.5*F, 2.5*F, etc.
You have two periods in your input signal, therefore your (integer) base frequency is k=2 (the value at fft_vals(k+1), the first peak in your plot). The second harmonic is at k=4, the third one at k=6, etc.
So you would set your bin edges at k = 1:2:end.
In general, this would be k = nCycles/2:nCycles:end.
You can compute your bar graph according to our computed bin edges as follows:
fft_vals = abs(fft(dimmed_wave));
nHarmonics = 9;
edges = nCycles/2 + (0:nHarmonics)*nCycles;
H = cumsum(fft_vals);
H = diff(H(edges));
bar(1:nHarmonics,H);
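To see what these bins contain, here is a sketch on a synthetic signal with a known spectrum. The test signal (a fundamental plus a half-amplitude third harmonic, sampled over exactly nCycles periods) is an assumption for illustration and stands in for dimmed_wave.

```octave
N          = 1000;   % number of samples
nCycles    = 2;      % periods of the fundamental inside the record
nHarmonics = 5;

t   = (0:N-1) / N;   % one record, normalised to [0, 1)
sig = sin(2*pi*nCycles*t) + 0.5 * sin(2*pi*3*nCycles*t);

fft_vals = abs(fft(sig));

% Same binning as above: edges halfway between the harmonics
edges = nCycles/2 + (0:nHarmonics) * nCycles;
H = cumsum(fft_vals);
H = diff(H(edges));   % one value per harmonic
```

For this signal the first bin holds the fundamental (about N/2), the third bin holds the third harmonic (about N/4), and the remaining bins are close to zero. bar(1:nHarmonics, H) then draws the chart.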

gradient descent seems to fail

I implemented a gradient descent algorithm to minimize a cost function in order to obtain a hypothesis for determining whether an image has good quality. I did that in Octave. The idea is loosely based on the algorithm from the machine learning class by Andrew Ng.
I have 880 values in "y" ranging from 0.5 to ~12, and 880 values in "X" ranging from 50 to 300 that should predict the image's quality.
Sadly the algorithm seems to fail: after some iterations the value for theta is so small that theta0 and theta1 become "NaN". And my linear regression curve has strange values...
here is the code for the gradient descent algorithm:
(theta = zeros(2, 1);, alpha= 0.01, iterations=1500)
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
tmp_j1=0;
for i=1:m,
tmp_j1 = tmp_j1+ ((theta (1,1) + theta (2,1)*X(i,2)) - y(i));
end
tmp_j2=0;
for i=1:m,
tmp_j2 = tmp_j2+ (((theta (1,1) + theta (2,1)*X(i,2)) - y(i)) *X(i,2));
end
tmp1= theta(1,1) - (alpha * ((1/m) * tmp_j1))
tmp2= theta(2,1) - (alpha * ((1/m) * tmp_j2))
theta(1,1)=tmp1
theta(2,1)=tmp2
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
end
end
And here is the computation for the costfunction:
function J = computeCost(X, y, theta) %
m = length(y); % number of training examples
J = 0;
tmp=0;
for i=1:m,
tmp = tmp+ (theta (1,1) + theta (2,1)*X(i,2) - y(i))^2; %differenzberechnung
end
J= (1/(2*m)) * tmp
end
If you are wondering how the seemingly complex for loop can be vectorized and crammed into a single one-line expression, then please read on. The vectorized form is:
theta = theta - (alpha/m) * (X' * (X * theta - y))
Given below is a detailed explanation for how we arrive at this vectorized expression using gradient descent algorithm:
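This is the gradient descent algorithm to fine tune the value of θ (repeat until convergence, updating every θj simultaneously):
θj := θj - (alpha/m) * sum over i = 1..m of ( h(xi) - yi ) * xij, where xij is the jth feature of the ith training example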
Assume that the following values of X, y and θ are given:
m = number of training examples
n = number of features + 1
Here
m = 5 (training examples)
n = 4 (features+1)
X = m x n matrix
y = m x 1 vector matrix
θ = n x 1 vector matrix
xi is the ith training example
xj is the jth feature in a given training example
Further,
h(x) = ([X] * [θ]) (m x 1 matrix of predicted values for our training set)
h(x)-y = ([X] * [θ] - [y]) (m x 1 matrix of Errors in our predictions)
The whole objective of machine learning is to minimize errors in predictions. Based on the above, our Errors matrix is an m x 1 vector matrix as follows:
[E] = [X] * [θ] - [y] = [e1; e2; ...; em]
To calculate the new value of θj, we have to get a summation of all errors (m rows) multiplied by the jth feature value of the training set X. That is, take all the values in E, individually multiply them with the jth feature of the corresponding training example, and add them all together. This will help us in getting the new (and hopefully better) value of θj. Repeat this process for all j, i.e. for each of the n features. In matrix form, this can be written as:
θj = θj - (alpha/m) * (e1*x1j + e2*x2j + ... + em*xmj), for j = 1..n
This can be simplified as:
θ = θ - (alpha/m) * ([E]' x [X])'
[E]' x [X] will give us a row vector matrix, since E' is a 1 x m matrix and X is an m x n matrix. But we are interested in getting a column matrix, hence we transpose the resultant matrix.
More succinctly, it can be written as:
θ = θ - (alpha/m) * (([X] * [θ] - [y])' x [X])'
Since (A * B)' = (B' * A'), and A'' = A, we can also write the above as
θ = θ - (alpha/m) * [X]' x ([X] * [θ] - [y])
This is the original expression we started out with:
theta = theta - (alpha/m) * (X' * (X * theta - y))
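The equivalence between the element-wise sums and the vectorized expression can be checked numerically. The sketch below uses made-up data with m = 5 and n = 4, matching the sizes assumed in the explanation; theta_loop and theta_vec are names introduced here.

```octave
% Toy data (assumptions for illustration only)
M     = magic(5);
X     = [ones(5,1), M(:,1:3)];    % m x n, first column of ones
y     = (1:5)';
theta = [0.1; 0.2; 0.3; 0.4];
alpha = 0.01;
[m, n] = size(X);

% Loop form: for each theta_j, sum error * feature over all examples
theta_loop = theta;
for j = 1:n
  s = 0;
  for i = 1:m
    s = s + (X(i,:) * theta - y(i)) * X(i,j);
  end
  theta_loop(j) = theta(j) - (alpha/m) * s;   % uses the old theta throughout
end

% Vectorized form
theta_vec = theta - (alpha/m) * (X' * (X * theta - y));
```

Both forms read only the old theta while computing the new one, which is what makes the update simultaneous.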
I vectorized the theta update.
Maybe it helps somebody:
theta = theta - (alpha/m * (X * theta-y)' * X)';
I think that your computeCost function is wrong.
I attended NG's class last year and I have the following implementation (vectorized):
m = length(y);
J = 0;
predictions = X * theta;
sqrErrors = (predictions-y).^2;
J = 1/(2*m) * sum(sqrErrors);
The rest of the implementation seems fine to me, although you could also vectorize them.
theta_1 = theta(1) - alpha * (1/m) * sum((X*theta-y).*X(:,1));
theta_2 = theta(2) - alpha * (1/m) * sum((X*theta-y).*X(:,2));
Afterwards you are setting the temporary thetas (here called theta_1 and theta_2) correctly back to the "real" theta.
Generally it is better to vectorize than to use loops; the code is less annoying to read and to debug.
If you are OK with using a least-squares cost function, then you could try using the normal equation instead of gradient descent. It's much simpler -- only one line -- and computationally faster.
Here is the normal equation:
http://mathworld.wolfram.com/NormalEquation.html
And in octave form:
theta = (pinv(X' * X )) * X' * y
Here is a tutorial that explains how to use the normal equation: http://www.lauradhamilton.com/tutorial-linear-regression-with-octave
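A tiny worked example of the one-liner, on made-up data that lies exactly on the line y = 1 + 2*x, so the expected parameters are known in advance:

```octave
% Toy data (assumption for illustration only): y = 1 + 2*x
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
X = [ones(numel(x), 1), x];   % add the column of ones for theta0

theta = pinv(X' * X) * X' * y;
```

theta comes out as approximately [1; 2]. No learning rate or iteration count has to be chosen.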
While not as scalable as a vectorized version, a loop-based computation of gradient descent should generate the same results. In the example above, the most probable cause of gradient descent failing to compute the correct theta is the value of alpha.
With a verified set of cost and gradient descent functions and a set of data similar with the one described in the question, theta ends up with NaN values just after a few iterations if alpha = 0.01. However, when set as alpha = 0.000001, the gradient descent works as expected, even after 100 iterations.
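The effect of alpha can be reproduced with a small sketch. The data here is synthetic (an assumption, chosen only to match the ranges described in the question: 880 x values from 50 to 300 and y values from 0.5 to 12), and the vectorized update is used for brevity.

```octave
% Synthetic data in the ranges described in the question (assumption)
x = linspace(50, 300, 880)';
y = 0.5 + 11.5 * (x - 50) / 250;
X = [ones(numel(x), 1), x];
m = numel(y);

alphas      = [0.01, 0.000001];   % the two learning rates being compared
num_iters   = 1500;
final_theta = zeros(2, numel(alphas));
final_cost  = zeros(1, numel(alphas));

for a = 1:numel(alphas)
  theta = zeros(2, 1);
  for iter = 1:num_iters
    theta = theta - (alphas(a) / m) * (X' * (X * theta - y));
  end
  final_theta(:, a) = theta;
  final_cost(a) = sum((X * theta - y) .^ 2) / (2 * m);
end

initial_cost = sum(y .^ 2) / (2 * m);   % cost at theta = [0; 0]
```

With alpha = 0.01 the parameters overflow and end up non-finite, while alpha = 0.000001 stays finite and lowers the cost. Scaling the feature (for example with zscore) is another way to make a larger alpha usable.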
Using only vectors, here is a compact implementation of linear regression with gradient descent in Mathematica:
Theta = {0, 0}
alpha = 0.0001;
iteration = 1500;
Jhist = Table[0, {i, iteration}];
Table[
Theta = Theta -
alpha * Dot[Transpose[X], (Dot[X, Theta] - Y)]/m;
Jhist[[k]] =
Total[ (Dot[X, Theta] - Y[[All]])^2]/(2*m); Theta, {k, iteration}]
Note: this assumes that X is an n x 2 matrix whose first column, X[[All, 1]], contains only 1s, and that m is the number of training examples (rows of X).
This should work (the error is computed once, so that both thetas are updated simultaneously):
err = X*theta - y;
theta(1,1) = theta(1,1) - (alpha*(1/m))*(err'* X(:,1) );
theta(2,1) = theta(2,1) - (alpha*(1/m))*(err'* X(:,2) );
It's cleaner this way, and also vectorized:
predictions = X * theta;
errorsVector = predictions - y;
theta = theta - (alpha/m) * (X' * errorsVector);
If you remember the first PDF file on gradient descent from the machine learning course, you would pay attention to the learning rate. Here is the note from that PDF:
Implementation Note: If your learning rate is too large, J(theta) can diverge
and "blow up", resulting in values which are too large for computer
calculations. In these situations, Octave/MATLAB will tend to return
NaNs. NaN stands for "not a number" and is often caused by undefined
operations that involve -infinity and +infinity.
