I am trying to run an SVD job in Mahout. I have created a document x term matrix A of size 372053 x 21338 (M = 372053 documents, N = 21338 unique words), so A is of size M x N. I ran the SVD using Mahout and got the cleaned eigenvectors (I gave the expected rank, R, as 200). I now have an eigenvector matrix of size R x N.
Stating the SVD equation
A = U * S * V' (V' being transpose of V)
I need to convert the matrix A to the new space to get the compressed vectors of the documents (I am trying to implement LSI).
What is the output I get from the Mahout SVD, in terms of the equation above? I read on the mailing list that the eigenvalues can be obtained from the NamedVectors in the generated eigenvector matrix.
Please guide me on how to proceed from here to generate the document-term matrix A in the new space (of size M x R).
Any help is highly appreciated :)
A good starting point for LSI with Stochastic SVD on Mahout can be found here.
The good part is that the paper also describes the folding-in process and is explicit about the output format in terms of the SVD equation.
The work is integrated in the latest version, 0.8, and can be used with the SSVDCli job or through the Mahout CLI with mahout ssvd <options>.
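In terms of A = U * S * V', and assuming the rows of the R x N eigenvector matrix from the question are the right singular vectors (that is, the first R rows of V'), the documents in the reduced space are A * V, which equals U * S and has size M x R. The following is a minimal NumPy sketch of that projection on a tiny random matrix. It uses numpy.linalg.svd as a stand-in for the Mahout output; it is not Mahout's API, and all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, R = 6, 4, 2            # documents, terms, target rank (toy sizes)
A = rng.random((M, N))       # stand-in for the document x term matrix

# Thin SVD: A = U * diag(s) * Vt, where Vt plays the role of V'.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the top R right singular vectors (R x N), analogous to the
# R x N eigenvector matrix described in the question (assumption).
Vt_r = Vt[:R]

# Project the documents into the R-dimensional space: (M x N) @ (N x R).
A_reduced = A @ Vt_r.T

# The same result expressed as U * S, restricted to the first R components.
US = U[:, :R] * s[:R]
```

A new document row d (1 x N) would be folded in the same way, as d @ Vt_r.T.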
I'm running tensorflow 2.1 and tensorflow_probability 0.9. I have fit a Structural Time Series Model with a seasonal component. I am using code from the Tensorflow Probability Structural Time Series Probability example:
Tensorflow Github.
In the example there is a great plot where the decomposition is visualised:
# Get the distributions over component outputs from the posterior marginals on
# training data, and from the forecast model.
component_dists = sts.decompose_by_component(
    demand_model,
    observed_time_series=demand_training_data,
    parameter_samples=q_samples_demand_)
forecast_component_dists = sts.decompose_forecast_by_component(
    demand_model,
    forecast_dist=demand_forecast_dist,
    parameter_samples=q_samples_demand_)
demand_component_means_, demand_component_stddevs_ = (
    {k.name: c.mean() for k, c in component_dists.items()},
    {k.name: c.stddev() for k, c in component_dists.items()})
(
    demand_forecast_component_means_,
    demand_forecast_component_stddevs_
) = (
    {k.name: c.mean() for k, c in forecast_component_dists.items()},
    {k.name: c.stddev() for k, c in forecast_component_dists.items()}
)
When using a trend component, is it possible to decompose and visualise both:
trend/_level_scale & trend/_slope_scale
I have tried many permutations to extract the nested element of the trend component with no luck.
Thanks for your time in advance.
We didn't write a separate STS interface for this, but you can access the posterior on latent states (in this case, both the level and slope) by directly querying the underlying state-space model for its marginal means and covariances:
ssm = model.make_state_space_model(
    num_timesteps=num_timesteps,
    param_vals=parameter_samples)
posterior_means, posterior_covs = (
    ssm.posterior_marginals(observed_time_series))
You should also be able to draw samples from the joint posterior by running ssm.posterior_sample(observed_time_series, num_samples).
It looks like there's currently a glitch when drawing posterior samples from a model with no batch shape (Could not find valid device for node. Node:{{node Reshape}}): while we fix that, it should work to add an artificial batch dimension as a workaround:
ssm.posterior_sample(observed_time_series[tf.newaxis, ...], num_samples).
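Once the marginal means and covariances are available, the level and slope can be read off by indexing the last (latent) dimension. The sketch below uses plain NumPy arrays as stand-ins for posterior_means and posterior_covs (for example after calling .numpy() on them). It assumes the trend is a local linear trend whose latent state is ordered [level, slope] and that it starts at index offset in the model's concatenated latent state; the helper name level_and_slope and the toy values are hypothetical.

```python
import numpy as np

def level_and_slope(posterior_means, posterior_covs, offset=0):
    """Split marginal means/stddevs into level and slope parts.

    Assumes latent dimension `offset` is the level and `offset + 1`
    is the slope (an assumption about the model's latent layout).
    """
    level_idx, slope_idx = offset, offset + 1
    # Marginal variances are the diagonal of each covariance matrix.
    variances = np.diagonal(posterior_covs, axis1=-2, axis2=-1)
    return {
        "level_mean": posterior_means[..., level_idx],
        "slope_mean": posterior_means[..., slope_idx],
        "level_stddev": np.sqrt(variances[..., level_idx]),
        "slope_stddev": np.sqrt(variances[..., slope_idx]),
    }

# Toy stand-ins shaped [num_samples, num_timesteps, latent_size(, latent_size)].
num_samples, num_timesteps, latent_size = 3, 5, 2
means = np.zeros((num_samples, num_timesteps, latent_size))
means[..., 0] = 10.0   # level
means[..., 1] = 0.5    # slope
covs = np.broadcast_to(
    np.diag([4.0, 0.25]),
    (num_samples, num_timesteps, latent_size, latent_size))

parts = level_and_slope(means, covs)
```

Each entry of parts has shape [num_samples, num_timesteps], so it can be averaged over the parameter samples and plotted like the component means in the example.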
I'm working on implementing an interface between a TensorFlow basic LSTM that's already been trained and a javascript version that can be run in the browser. The problem is that in all of the literature that I've read LSTMs are modeled as mini-networks (using only connections, nodes and gates) and TensorFlow seems to have a lot more going on.
The two questions that I have are:
Can the TensorFlow model be easily translated into a more conventional neural network structure?
Is there a practical way to map the trainable variables that TensorFlow gives you to this structure?
I can get the 'trainable variables' out of TensorFlow, the issue is that they appear to only have one value for bias per LSTM node, where most of the models I've seen would include several biases for the memory cell, the inputs and the output.
Internally, the LSTMCell class stores the LSTM weights as one big matrix instead of 8 smaller ones for efficiency. It is quite easy to divide it horizontally and vertically to get the more conventional representation. However, it might be easier and more efficient if your library applies a similar optimization.
Here is the relevant piece of code of the BasicLSTMCell:
concat = linear([inputs, h], 4 * self._num_units, True)
# i = input_gate, j = new_input, f = forget_gate, o = output_gate
i, j, f, o = array_ops.split(1, 4, concat)
The linear function does the matrix multiplication to transform the concatenated input and the previous h state into 4 matrices of [batch_size, self._num_units] shape. The linear transformation uses a single matrix and bias variables that you're referring to in the question. The result is then split into different gates used by the LSTM transformation.
If you'd like to explicitly get the transformations for each gate, you can split that matrix and bias into 4 blocks. It is also quite easy to implement it from scratch using 4 or 8 linear transformations.
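The split described above can be sketched in NumPy. This is a minimal illustration, not TensorFlow code: it assumes the fused matrix has shape [input_size + num_units, 4 * num_units] with rows ordered as [inputs, h] and columns ordered i, j, f, o, as in the snippet above. The variable names and sizes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, num_units, batch = 3, 2, 4

# Stand-ins for the single trainable matrix and bias of the cell.
W = rng.standard_normal((input_size + num_units, 4 * num_units))
b = rng.standard_normal(4 * num_units)
x = rng.standard_normal((batch, input_size))   # inputs
h = rng.standard_normal((batch, num_units))    # previous hidden state

# Fused form: one matrix multiplication, then split into the four gates.
concat = np.concatenate([x, h], axis=1) @ W + b
i, j, f, o = np.split(concat, 4, axis=1)

# Conventional form: split columns into 4 gate blocks, then split each
# block's rows into input weights and recurrent weights (8 matrices total).
gates = {}
for name, W_gate, b_gate in zip("ijfo", np.split(W, 4, axis=1), np.split(b, 4)):
    W_x = W_gate[:input_size]    # [input_size, num_units]
    W_h = W_gate[input_size:]    # [num_units, num_units]
    gates[name] = x @ W_x + h @ W_h + b_gate
```

Each gate therefore has its own slice of the bias vector, which is why only one bias variable shows up among the trainable variables.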
I'm trying to reproduce the results of the paper "Using Very Deep Autoencoders for Content-Based Image Retrieval".
I have some working code thanks to the Theano framework, but I don't really know what is meant by the first step in their algorithm:
For each data-vector, v, in a mini-batch, stochastically pick a binary state vector, h, for the hidden units:
p(h_j = 1) = sigma(b_j + sum_i (v_i * w_ij))
where b_j is the bias, w_ij is a weight, and sigma(x) = (1 + exp(-x))^-1.
I understand all parts of the equation. The only problem is how do I stochastically pick a binary state vector, given I know the probability of each element?
My idea is that for each element I generate a random number, and if the number is higher than the probability, I will choose 1, otherwise 0. Is that correct?
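Given the probabilities, a common way to sample is to draw one uniform random number in [0, 1) per unit and set the unit to 1 when the draw is below its probability, so that a unit with probability 0.9 is on about 90% of the time. The NumPy sketch below illustrates this, assuming the probabilities are p_j = sigma(b_j + sum_i v_i * w_ij), the standard RBM conditional. The function name sample_hidden and the toy weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_hidden(v, W, b, rng):
    """Sample binary hidden states for a mini-batch of visible vectors.

    v: [batch, num_visible], W: [num_visible, num_hidden], b: [num_hidden].
    """
    p = sigmoid(b + v @ W)                      # P(h_j = 1) for every unit
    # A unit is 1 when the uniform draw falls BELOW its probability.
    h = (rng.random(p.shape) < p).astype(np.float64)
    return h, p

rng = np.random.default_rng(0)
v = np.array([[1.0, 0.0, 1.0]])
W = np.zeros((3, 2))
b = np.array([100.0, -100.0])   # extreme biases: p is ~1 and ~0
h, p = sample_hidden(v, W, b, rng)
```

Choosing 1 when the draw is higher than the probability would instead turn the unit on with probability 1 - p.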
I just want to clarify something about PCA in OpenCV. Suppose, I have two rows of data (A, B).
A 3 8 7
B 2 4 5
If I wanted to create a PCA model in OpenCV, what must I do to the data? Do I have to subtract the means (e.g. subtract the mean of A from its data points) or does the PCA function do this?
Someone said that OpenCV PCA expects the data to be normalised (between 0 and 1). If so, how do I normalise?
Hope someone can clarify this for me as PCA in OpenCV is very badly documented on the Net.
Cheers...
The data for PCA in OpenCV does not need to be normalized. But if you already have the mean (from some previous calculations), you can pass it to the PCACompute() function to speed it up.
OpenCV refman:
PCACompute(data[, mean[, eigenvectors[, maxComponents ]]]) → mean, eigenvectors
Parameters
data – Input samples stored as the matrix rows or as the matrix columns.
mean – Optional mean value. If the matrix is empty ( noArray() ), the mean is computed from the data.
There is a good article on data normalization on Wikipedia.
For complete documentation, check out the opencv.pdf file that should be in the doc/ folder of your installation. In some versions it is named opencv2refman.pdf.
Also try to find the book "Learning OpenCV" by Gary Bradski; it explains this very well.
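To make the mean handling concrete, here is a small NumPy sketch of what mean-centered PCA does with the two rows from the question. It treats A and B as two samples stored as rows. This illustrates the centering step that the reference describes for an empty mean argument; it is not OpenCV's implementation and does not call OpenCV.

```python
import numpy as np

# Rows A and B from the question, one sample per row.
data = np.array([[3.0, 8.0, 7.0],
                 [2.0, 4.0, 5.0]])

# Per-column mean over the samples, computed from the data itself.
mean = data.mean(axis=0, keepdims=True)

# Subtract the mean before extracting the principal directions.
centered = data - mean

# Rows of `eigenvectors` are the principal directions of the centered data.
_, singular_values, eigenvectors = np.linalg.svd(centered, full_matrices=False)

# Coordinates of each sample in the principal-component basis.
projected = centered @ eigenvectors.T
```

No rescaling to the range 0 to 1 takes place anywhere in this procedure; only the mean is removed.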
I am trying to implement a people detection system based on SVM and HOG using OpenCV 2.3, but I got stuck.
I came this far:
I can compute HOG values from an image database and then I calculate with LIBSVM the SVM vectors, so I get e.g. 1419 SVM vectors with 3780 values each.
OpenCV just wants one feature vector in the method hog.setSVMDetector(). Therefore I have to calculate one feature vector from my 1419 SVM vectors, that LIBSVM has calculated.
I found one hint, how to calculate this single feature vector: link
“The detecting feature vector at component i (where i is in the range e.g. 0-3779) is built out of the sum of the support vectors at i * the alpha value of that support vector, e.g.
det[i] = sum_j (sv_j[i] * alpha[j]) , where j is the number of the support vector, i
is the number of the components of the support vector.”
According to this, my routine works this way:
I take the first element of my first SVM vector, multiply it with the alpha value and add it with the first element of the second SVM vector that has been multiplied with alpha value, …
But after summing up all 1419 elements I get quite high values:
16.0657, -0.351117, 2.73681, 17.5677, -8.10134,
11.0206, -13.4837, -2.84614, 16.796, 15.0564,
8.19778, -0.7101, 5.25691, -9.53694, 23.9357,
If you compare them to the default vector in the OpenCV sample peopledetect.cpp (and hog.cpp in the OpenCV source)
0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
0.11547081f, -0.04268804f, 0.04635834f, -0.05468199f, 0.08232084f,
0.10424068f, -0.02294518f, 0.01108519f, 0.01378693f, 0.11193510f,
0.01268418f, 0.08528346f, -0.06309239f, 0.13054633f, 0.08100729f,
-0.05209739f, -0.04315529f, 0.09341384f, 0.11035026f, -0.07596218f,
-0.05517511f, -0.04465296f, 0.02947334f, 0.04555536f,
you see that the default vector values lie between –1 and +1, but my values far exceed that range.
I think my single feature vector routine needs some adjustment. Any ideas?
Regards,
Christoph
The aggregated vector's values do look high.
I used the loadSVMfromModelFile() located in http://lnx.mangaitalia.net/trainer/main.cpp
I had to remove svinstr.sync(); from the code because it caused parts of the lines to be lost, which gave wrong results.
I don't know much about the rest of the file; I only used this function.
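For reference, the weighted sum quoted in the question, det[i] = sum_j (sv_j[i] * alpha[j]), can be written as a single vector-matrix product. The NumPy sketch below uses small random stand-ins for the support vectors and assumes alpha holds the signed coefficients as stored in a LIBSVM model file (the sv_coef values). It is only valid for a linear kernel and does not handle the bias term.

```python
import numpy as np

rng = np.random.default_rng(0)
num_sv, dim = 5, 8                            # toy sizes (1419 x 3780 in the question)
support_vectors = rng.random((num_sv, dim))   # one support vector per row
alpha = rng.standard_normal(num_sv)           # signed coefficients (assumption)

# det[i] = sum_j sv_j[i] * alpha[j], for all i at once.
det = alpha @ support_vectors

# The same computation as an explicit loop over support vectors.
det_loop = np.zeros(dim)
for j in range(num_sv):
    det_loop += alpha[j] * support_vectors[j]

# For a linear kernel, det reproduces the kernel expansion on any sample x.
x = rng.random(dim)
kernel_score = sum(alpha[j] * (support_vectors[j] @ x) for j in range(num_sv))
single_vector_score = det @ x
```

If the coefficients lose their sign, or the model was trained with a non-linear kernel, the collapsed vector will not match the classifier's scores.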