Silhouette coefficient calculation for clustering - machine-learning

I am trying to cluster without using a library function, and I want to validate the clustering using the Silhouette Coefficient.
valuek = list()
silhouettelist = list()
label = list()
for k in range(2, 23, 2):
    c_list, c_info = bisectingKMeans(Xsvd, k, 10)
    for v in c_info[:, 0] + 1:
        label.append(int(v.A[0][0]))
    valuek.append(k)
    silhouettelist.append(metrics.silhouette_score(X_principal, label))
The input data matrix has shape (8580, 126356). Dimensionality reduction was done using SVD with 200 components. When I try to use the above-mentioned code to calculate the silhouette coefficient, I get a ValueError. What should be done to overcome this error?
Found input variables with inconsistent numbers of samples: [8580, 17160]
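The mismatch between 8580 and 17160 suggests the label list is never reset, so after the second value of k it holds two full sets of cluster labels while the data still has 8580 rows. Below is a minimal sketch of the loop with the list re-initialized on every iteration, assuming bisectingKMeans, Xsvd and metrics are as defined above, and scoring against the same matrix that was clustered:

# Sketch (not the original code): rebuild the label list for every k so its
# length always matches the number of samples being scored.
valuek = []
silhouettelist = []
for k in range(2, 23, 2):
    c_list, c_info = bisectingKMeans(Xsvd, k, 10)
    label = [int(v.A[0][0]) for v in c_info[:, 0] + 1]   # fresh list each iteration
    valuek.append(k)
    silhouettelist.append(metrics.silhouette_score(Xsvd, label))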

Related

setting state covariance matrix in statsmodels.tsa.UnobservedComponents

I am trying to impose smoothness on the state covariance matrix while using frequency-domain seasonal components. I initialize my model as follows, with a local level component and a particular frequency and harmonics specified.
model = sm.tsa.UnobservedComponents(df, level='llevel',
                                    freq_seasonal=[{'period': 130.51, 'harmonics': 2}],
                                    stochastic_freq_seasonal=[True])
res = model.fit()
>>>
sigma2.irregular                  0.730561
sigma2.level                      0.187833
sigma2.freq_seasonal_130.51(2)    0.003718
This generates the parameter values noted above. Now, since I am using 2 harmonics, there are in fact 4 error variances, and I want to set them as follows:
model.ssm.state_cov[1,1,0] = 17.65
model.ssm.state_cov[2,2,0] = 0.3102
model.ssm.state_cov[3,3,0] = 17.65
model.ssm.state_cov[4,4,0] = 0.3102
And then get 'smooth' and 'filter' objects and see how they do. I know I can set the parameters under res.params, but these 4 do not appear in the parameter list. Is there a way to do it in this library?
The implementation in statsmodels assumes a single common error variance parameter across all of the seasonal harmonic error terms, as in Harvey (1989, "Forecasting, Structural Time Series Models and the Kalman Filter"), section 2.3.4.
As a result, it's not particularly easy to set those parameters as you have suggested and then estimate the remaining parameters.
However, it is possible. For this specific case, you can set the variance parameter to 1 and then put the square roots of the variances you actually want on the diagonal of the selection matrix, as follows:
model = sm.tsa.UnobservedComponents(df, level='llevel',
                                    freq_seasonal=[{'period': 130.51, 'harmonics': 2}],
                                    stochastic_freq_seasonal=[True])
model['selection', 1:, 1:] = np.diag([17.65, 0.3102, 17.65, 0.3102])**0.5

with model.fix_params({'sigma2.freq_seasonal_130.51(2)': 1}):
    res = model.fit()
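This works because, in the statsmodels state-space form, the state disturbance enters as R η_t with η_t ~ N(0, Q), so the effective state-noise covariance is R Q Rᵀ; with the seasonal entries of Q fixed at 1, the squared diagonal of the selection matrix R carries the variances you actually wanted. A small sanity check along those lines (a sketch reusing the objects above and assuming time-invariant matrices):

import numpy as np

# Sketch: the implied state-noise covariance R Q R' should show the requested
# variances on its seasonal diagonal (the level variance sits in position 0).
R = np.squeeze(model['selection'])    # selection matrix R, with the sqrt variances on its diagonal
Q = np.squeeze(model['state_cov'])    # state covariance Q, seasonal entries fixed at 1
print(np.diag(R @ Q @ R.T))           # expect [sigma2.level, 17.65, 0.3102, 17.65, 0.3102]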

MLJ: selecting rows and columns for training in evaluate

I want to implement a kernel ridge regression that also works within MLJ. Moreover, I want to have the option to use either feature vectors or a predefined kernel matrix as in Python sklearn.
When I run this code
using MLJ, MLJBase, MLJModelInterface
using LinearAlgebra

const MMI = MLJModelInterface

MMI.@mlj_model mutable struct KRRModel <: MLJModelInterface.Deterministic
    mu::Float64 = 1::(_ > 0)
    kernel::String = "linear"
end

function MMI.fit(m::KRRModel, verbosity::Int, K, y)
    K = MLJBase.matrix(K)
    fitresult = inv(K + m.mu*I) * y
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end

N = 10
K = randn(N, N)
K = K*K
a = randn(N)
y = K*a + 0.2*randn(N)

m = KRRModel()
kregressor = machine(m, K, y)
cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)
evaluate!(kregressor, resampling=cv, measure=rms, verbosity=1)
the evaluate! function evaluates the machine on different subsets of rows of K. Due to the Representer Theorem, a kernel ridge regression has a number of nonzero coefficients equal to the number of samples. Hence, a reduced size matrix K[train_rows,train_rows] can be used instead of K[train_rows,:].
To denote that I'm using a kernel matrix, I'd set m.kernel = "". How do I make evaluate! select the columns as well as the rows to form a smaller matrix when m.kernel == ""?
This is my first time using MLJ and I'd like to make as few modifications as possible.
Quoting the answer I got on the Julia Discourse from @ablaom:
The intended use of evaluate! is to estimate the generalisation error
associated with some supervised learning model, by subsampling
observations, as in cross-validation, a common use-case. I’m afraid
there is no natural way for evaluate! to do feature subsampling.
https://alan-turing-institute.github.io/MLJ.jl/dev/evaluating_model_performance/
FYI: There is a version of kernel regression implementing the MLJ
model interface, namely kernel partial least squares regression from
the package lalvim/PartialLeastSquaresRegressor.jl (an implementation
of a partial least squares regressor).
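For comparison, the behaviour the question asks for (slicing both rows and columns of a precomputed kernel during cross-validation, as sklearn does for its pairwise estimators) can be written out by hand. The following is a rough Python sketch of that manual loop, using made-up data and the same ridge formula as KRRModel, just to make the indexing explicit:

# Sketch: manual K-fold CV with a precomputed kernel, slicing rows AND columns.
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
N = 10
X = rng.standard_normal((N, 3))
K = X @ X.T                                      # a linear kernel matrix (N x N)
y = K @ rng.standard_normal(N) + 0.2 * rng.standard_normal(N)
mu = 1.0                                         # ridge parameter, as in KRRModel

for train, test in KFold(n_splits=5).split(K):
    K_tt = K[np.ix_(train, train)]               # kernel between training points
    alpha = np.linalg.solve(K_tt + mu * np.eye(len(train)), y[train])
    y_pred = K[np.ix_(test, train)] @ alpha      # kernel between test and train points
    print(np.sqrt(np.mean((y_pred - y[test]) ** 2)))   # per-fold RMS error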

How to Decompose and Visualise Slope Component in Tensorflow Probability

I'm running tensorflow 2.1 and tensorflow_probability 0.9. I have fit a structural time series model with a seasonal component. I am using code from the TensorFlow Probability structural time series example:
Tensorflow Github.
In the example there is a great plot where the decomposition is visualised:
# Get the distributions over component outputs from the posterior marginals on
# training data, and from the forecast model.
component_dists = sts.decompose_by_component(
    demand_model,
    observed_time_series=demand_training_data,
    parameter_samples=q_samples_demand_)

forecast_component_dists = sts.decompose_forecast_by_component(
    demand_model,
    forecast_dist=demand_forecast_dist,
    parameter_samples=q_samples_demand_)

demand_component_means_, demand_component_stddevs_ = (
    {k.name: c.mean() for k, c in component_dists.items()},
    {k.name: c.stddev() for k, c in component_dists.items()})

(
    demand_forecast_component_means_,
    demand_forecast_component_stddevs_
) = (
    {k.name: c.mean() for k, c in forecast_component_dists.items()},
    {k.name: c.stddev() for k, c in forecast_component_dists.items()}
)
When using a trend component, is it possible to decompose and visualise both:
trend/_level_scale & trend/_slope_scale
I have tried many permutations to extract the nested element of the trend component with no luck.
Thanks for your time in advance.
We didn't write a separate STS interface for this, but you can access the posterior on latent states (in this case, both the level and slope) by directly querying the underlying state-space model for its marginal means and covariances:
ssm = model.make_state_space_model(
    num_timesteps=num_timesteps,
    param_vals=parameter_samples)

posterior_means, posterior_covs = (
    ssm.posterior_marginals(observed_time_series))
You should also be able to draw samples from the joint posterior by running ssm.posterior_sample(observed_time_series, num_samples).
It looks like there's currently a glitch when drawing posterior samples from a model with no batch shape (Could not find valid device for node. Node:{{node Reshape}}): while we fix that, it should work to add an artificial batch dimension as a workaround:
ssm.posterior_sample(observed_time_series[tf.newaxis, ...], num_samples).
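If the trend is a LocalLinearTrend component, its latent state has two dimensions, level and slope, so both can be read straight out of the marginals returned above. Here is a rough sketch of the extraction; the indices 0 and 1 assume the trend's states occupy the first two positions of the latent state vector, and the averaging assumes parameter_samples added a leading sample dimension, so treat both as assumptions to verify for your model:

import numpy as np
import matplotlib.pyplot as plt

# Sketch: extract and plot the level and slope latent states of a
# LocalLinearTrend component from the smoothed marginals computed above.
means = np.asarray(posterior_means)        # [..., num_timesteps, latent_size]
covs = np.asarray(posterior_covs)          # [..., num_timesteps, latent_size, latent_size]

level_mean, slope_mean = means[..., 0], means[..., 1]
level_std, slope_std = np.sqrt(covs[..., 0, 0]), np.sqrt(covs[..., 1, 1])

# If parameter_samples added a leading sample dimension, average over it (crude).
if level_mean.ndim > 1:
    level_mean, slope_mean = level_mean.mean(axis=0), slope_mean.mean(axis=0)
    level_std, slope_std = level_std.mean(axis=0), slope_std.mean(axis=0)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
t = np.arange(len(level_mean))
ax1.plot(t, level_mean)
ax1.fill_between(t, level_mean - 2*level_std, level_mean + 2*level_std, alpha=0.3)
ax1.set_title('trend: level')
ax2.plot(t, slope_mean)
ax2.fill_between(t, slope_mean - 2*slope_std, slope_mean + 2*slope_std, alpha=0.3)
ax2.set_title('trend: slope')
plt.show()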

Math Behind Linear Regression

I am trying to understand the math behind linear regression, and I have verified on multiple sites that linear regression works under the OLS method with y = mx + c to get the best-fit line.
So, in order to calculate the intercept and slope, we use the formula below (if I am not wrong):
m = sum of[ (x - mean(x)) * (y - mean(y)) ] / sum of[ (x - mean(x))^2 ]
c = mean(y) - m * mean(x)
So with this we get the m and c values to substitute into the above equation, obtain predicted y values, and predict for newer x values.
But my doubt is: when is gradient descent used? I understand it is also used only for calculating the coefficients, in such a way that it reduces the cost function by finding a local minimum.
Please help me with this.
Do these two have separate functions in Python/R?
Or does linear regression work on gradient descent by default (if so, when is the above formula used for calculating the m and c values)?
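To make the comparison concrete, here is a small sketch (made-up data) that computes m and c once with the closed-form OLS formulas above and once with a plain gradient-descent loop on the mean-squared-error cost; both should arrive at essentially the same line. Libraries differ in which route they take: for example, sklearn's LinearRegression uses a direct least-squares solver, while SGDRegressor uses gradient descent.

# Sketch: closed-form OLS vs. gradient descent on the same 1-D data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=100)

# Closed-form OLS (the formulas from the question)
m_ols = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c_ols = y.mean() - m_ols * x.mean()

# Gradient descent on the cost J(m, c) = mean((m*x + c - y)^2)
m_gd, c_gd, lr = 0.0, 0.0, 0.01
for _ in range(20000):
    err = m_gd * x + c_gd - y
    m_gd -= lr * 2 * np.mean(err * x)   # dJ/dm
    c_gd -= lr * 2 * np.mean(err)       # dJ/dc

print(m_ols, c_ols)   # roughly 3 and 5
print(m_gd, c_gd)     # should be close to the OLS values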

Translating a TensorFlow LSTM into synapticjs

I'm working on implementing an interface between a TensorFlow basic LSTM that's already been trained and a javascript version that can be run in the browser. The problem is that in all of the literature that I've read LSTMs are modeled as mini-networks (using only connections, nodes and gates) and TensorFlow seems to have a lot more going on.
The two questions that I have are:
Can the TensorFlow model be easily translated into a more conventional neural network structure?
Is there a practical way to map the trainable variables that TensorFlow gives you to this structure?
I can get the 'trainable variables' out of TensorFlow; the issue is that they appear to have only one bias value per LSTM node, whereas most of the models I've seen would include several biases for the memory cell, the inputs and the output.
Internally, the LSTMCell class stores the LSTM weights as one big matrix instead of 8 smaller ones for efficiency purposes. It is quite easy to divide it horizontally and vertically to get the more conventional representation. However, it might be easier and more efficient if your library does a similar optimization.
Here is the relevant piece of code of the BasicLSTMCell:
concat = linear([inputs, h], 4 * self._num_units, True)
# i = input_gate, j = new_input, f = forget_gate, o = output_gate
i, j, f, o = array_ops.split(1, 4, concat)
The linear function does the matrix multiplication to transform the concatenated input and the previous h state into 4 matrices of [batch_size, self._num_units] shape. The linear transformation uses a single matrix and bias variables that you're referring to in the question. The result is then split into different gates used by the LSTM transformation.
If you'd like to explicitly get the transformations for each gate, you can split that matrix and bias into 4 blocks. It is also quite easy to implement it from scratch using 4 or 8 linear transformations.
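As a concrete illustration, here is a numpy sketch of that splitting. It assumes a TF1-style BasicLSTMCell whose combined kernel has shape [input_size + num_units, 4 * num_units] with gate order i, j, f, o (the order shown in the snippet above); verify those conventions for your exact TensorFlow version before mapping the blocks onto synapticjs.

# Sketch: split a BasicLSTMCell's combined kernel/bias into per-gate,
# per-source blocks (input weights, recurrent weights, bias) for each of the
# four gates: i = input gate, j = candidate cell, f = forget gate, o = output gate.
import numpy as np

def split_lstm_weights(kernel, bias, input_size, num_units):
    gates = {}
    for idx, name in enumerate(['i', 'j', 'f', 'o']):
        cols = slice(idx * num_units, (idx + 1) * num_units)
        gates[name] = {
            'W_x': kernel[:input_size, cols],   # weights on the input x_t
            'W_h': kernel[input_size:, cols],   # weights on the previous state h_{t-1}
            'b':   bias[cols],
        }
    return gates

# Example with random values standing in for the trained variables.
input_size, num_units = 3, 5
kernel = np.random.randn(input_size + num_units, 4 * num_units)
bias = np.random.randn(4 * num_units)
gates = split_lstm_weights(kernel, bias, input_size, num_units)
print(gates['f']['W_x'].shape, gates['f']['W_h'].shape, gates['f']['b'].shape)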
