Sorry for asking as a newcomer to Torch, but I promise I have searched extensively through the documentation and the Internet.
I have two main requirements:
the first is to get the weight delta after training on one or more batches;
the second is to set new weights on the model.
In other words, I want to update the weights with my own method (using an external library).
Is that possible in Torch?
Torch seems to have an abstract Module class [1], but its interface doesn't fit all my needs.
[1] https://github.com/torch/nn/blob/master/doc/module.md#nn.Module
In the end, I found the answer with help from several colleagues.
Understanding getParameters() [1] correctly is the key to solving the problem. getParameters() returns the flattened parameters (weights) and gradParameters (the gradients with respect to those weights, from which the weight delta is derived). What's more, it changes the underlying storage so that every weight becomes a view into one flat tensor, which is why, as documented, it should be called only once on a given network.
This means the value returned by getParameters() is exactly what we want: changes made to it are reflected in the original model, so modifying it updates the weights.
So we can not only read the flattened weights from the parameters returned by getParameters(), but also set the weights simply with parameters:copy(). Other torch.Tensor methods can be used to modify the weights as well.
[1] https://github.com/torch/nn/blob/master/doc/module.md#flatparameters-flatgradparameters-getparameters
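To make the shared-memory idea concrete, here is a minimal sketch in plain Python rather than Torch/Lua. It is only an analogy: flat_params, layer1_weights, layer2_weights and the helper functions are hypothetical names, and the "training step" is a stand-in. It shows how a snapshot taken before training yields the weight delta, and how writing into the flat buffer changes what the per-layer views see.

```python
from array import array

# One contiguous buffer holds every weight, playing the role of the
# flattened tensor returned by getParameters().
flat_params = array("d", [0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
flat_view = memoryview(flat_params)

# Hypothetical layers: each one is a view into part of the same buffer,
# so it shares memory with flat_params instead of holding a copy.
layer1_weights = flat_view[0:4]
layer2_weights = flat_view[4:6]


def snapshot(params):
    """Return an independent copy of the current weights."""
    return list(params)


def weight_delta(before, params):
    """Element-wise change of the weights since the snapshot was taken."""
    return [after - old for old, after in zip(before, params)]


def set_weights(view, new_values):
    """Overwrite all weights in place (the analogue of parameters:copy())."""
    view[:] = array("d", new_values)


before = snapshot(flat_params)

# Stand-in for a training step: any in-place change to the flat buffer.
for i in range(len(flat_params)):
    flat_params[i] -= 0.01

delta = weight_delta(before, flat_params)

# Stand-in for an external update rule: write new weights back in place.
set_weights(flat_view, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# The layer views see the new values because they share the same memory.
print(layer1_weights.tolist())
print(layer2_weights.tolist())
```

In Torch itself the same pattern would presumably be a clone of the flat parameters before training, a subtraction afterwards, and parameters:copy() to write the new weights back.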
Related
How is train_on_batch() different from fit()? In which cases should we use train_on_batch()?
For this question, it's a simple answer from the primary author:
With fit_generator, you can use a generator for the validation data as
well. In general, I would recommend using fit_generator, but using
train_on_batch works fine too. These methods only exist for the sake of
convenience in different use cases, there is no "correct" method.
train_on_batch allows you to expressly update weights based on a collection of samples you provide, without regard to any fixed batch size. You would use this in cases when that is what you want: to train on an explicit collection of samples. You could use that approach to maintain your own iteration over multiple batches of a traditional training set but allowing fit or fit_generator to iterate batches for you is likely simpler.
One case when it might be nice to use train_on_batch is for updating a pre-trained model on a single new batch of samples. Suppose you've already trained and deployed a model, and sometime later you've received a new set of training samples previously never used. You could use train_on_batch to directly update the existing model only on those samples. Other methods can do this too, but it is rather explicit to use train_on_batch for this case.
Apart from special cases like this (either where you have some pedagogical reason to maintain your own cursor across different training batches, or else for some type of semi-online training update on a special batch), it is probably better to just always use fit (for data that fits in memory) or fit_generator (for streaming batches of data as a generator).
train_on_batch() gives you greater control of the state of the LSTM, for example, when using a stateful LSTM and controlling calls to model.reset_states() is needed. You may have multi-series data and need to reset the state after each series, which you can do with train_on_batch(), but if you used .fit() then the network would be trained on all the series of data without resetting the state. There's no right or wrong, it depends on what data you're using, and how you want the network to behave.
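As a rough sketch of the multi-series case, the loop below resets the state after each series. It assumes model is a compiled Keras model with a stateful LSTM that exposes train_on_batch() and reset_states() as described above; train_stateful and series_list are hypothetical names, and each series is assumed to be an iterable of (x_batch, y_batch) pairs in time order.

```python
def train_stateful(model, series_list, epochs=1):
    """Train on each series in order, resetting state between series.

    `model` is assumed to be a compiled Keras model with a stateful LSTM.
    `series_list` is a hypothetical list of series; each series is an
    iterable of (x_batch, y_batch) pairs in time order.
    """
    losses = []
    for _ in range(epochs):
        for series in series_list:
            for x_batch, y_batch in series:
                # State carries over between consecutive batches of a series.
                losses.append(model.train_on_batch(x_batch, y_batch))
            # Clear the LSTM state before the next, unrelated series starts.
            model.reset_states()
    return losses
```

The only point of the sketch is where the reset happens: once per series, not once per batch and not once per epoch.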
train_on_batch can also give a performance increase over fit and fit_generator if you're using large datasets whose data (such as high-rank NumPy arrays) isn't easy to serialize to TFRecords.
In this case you can save the arrays as NumPy files and, when the whole set won't fit in memory, load smaller subsets of them (traina.npy, trainb.npy, etc.) one at a time. You can then use tf.data.Dataset.from_tensor_slices on the loaded subset, call train_on_batch on it, load the next subset, call train_on_batch again, and so on. By the end you have trained on your entire set and can control exactly how much, and which part, of your dataset trains your model. You can then define your own epochs, batch sizes, etc. with simple loops and functions that pull from your dataset.
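The "simple loops and functions" could look something like the sketch below. It is written against a duck-typed model that only needs a train_on_batch(x, y) method; iter_batches, train_in_chunks and load_chunk are hypothetical names, and load_chunk stands for whatever loads one subset (for example a wrapper around numpy.load for traina.npy, trainb.npy, and so on).

```python
def iter_batches(samples, labels, batch_size):
    """Yield consecutive (samples, labels) slices of at most batch_size items."""
    for start in range(0, len(samples), batch_size):
        end = start + batch_size
        yield samples[start:end], labels[start:end]


def train_in_chunks(model, load_chunk, chunk_names, epochs, batch_size):
    """Train over several on-disk chunks, holding one chunk in memory at a time.

    `load_chunk(name)` is a hypothetical loader returning (samples, labels)
    for one chunk.
    """
    losses = []
    for _ in range(epochs):
        for name in chunk_names:
            samples, labels = load_chunk(name)  # only this chunk is in memory
            for x_batch, y_batch in iter_batches(samples, labels, batch_size):
                losses.append(model.train_on_batch(x_batch, y_batch))
    return losses
```

Here an epoch is defined by hand as one pass over all chunk names, which is exactly the kind of control the answer describes.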
Indeed, nbro's answer helps. Just to add a few more scenarios: let's say you are training a seq-to-seq model or a large network with one or more encoders. We can create custom training loops using train_on_batch and use part of our data to validate the encoder directly, without using callbacks. Writing callbacks for a complex validation process can be difficult. There are several such cases where we may wish to train on individual batches.
Regards,
Karthick
From Keras - Model training APIs:
fit: Trains the model for a fixed number of epochs (iterations on a dataset).
train_on_batch: Runs a single gradient update on a single batch of data.
We can use it in a GAN, where we update the discriminator and the generator using one batch of our training data set at a time. I saw that Jason Brownlee used train_on_batch in one of his tutorials (How to Develop a 1D Generative Adversarial Network From Scratch in Keras).
Tip for quick search: press Control+F and type the term you want to find (train_on_batch, for example) in the search box.
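For the GAN case, one training step might be sketched as below. This is not taken from the tutorial; it assumes the common setup in which discriminator is a compiled Keras model and gan is a combined model (generator followed by a frozen discriminator). gan_training_step and the three make_* batch factories are hypothetical helpers that each return an (inputs, targets) pair.

```python
def gan_training_step(discriminator, gan,
                      make_real_batch, make_fake_batch, make_latent_batch):
    """Run one alternating discriminator/generator update.

    All three make_* arguments are hypothetical callables returning
    (inputs, targets) for one batch.
    """
    # 1) Update the discriminator on real samples, then on generated ones.
    x_real, y_real = make_real_batch()
    d_loss_real = discriminator.train_on_batch(x_real, y_real)
    x_fake, y_fake = make_fake_batch()
    d_loss_fake = discriminator.train_on_batch(x_fake, y_fake)

    # 2) Update the generator through the combined model, using latent
    #    points whose targets are labelled as "real".
    x_latent, y_as_real = make_latent_batch()
    g_loss = gan.train_on_batch(x_latent, y_as_real)

    return d_loss_real, d_loss_fake, g_loss
```

Because each call is a single gradient update on a single batch, the two networks can be interleaved in whatever order and ratio the training scheme needs.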
I'm trying to perform a complicated function approximation in TensorFlow with several layers. The function is going to be trained on a lot of generated data, so I want to generate the data at runtime, simply because of the sheer quantity needed. I decided to try using an Estimator with a ModelFnOps, but I'm at the point where I'm writing the training loop and I can't find any documentation on using something like eval(feed_dict=my_feed_dict), as shown here. The only thing I've found so far is calling fit() on the Estimator, but that requires passing in the entire data set (unless I've misunderstood the purpose of that function). Is there any way to feed in single examples or batches within a loop to train an Estimator?
You can feed in your data via an input function. This input function is a first-class function that gets passed to the estimator (or the eval/train/predict methods).
You can also make use of the dataset API to create data feeders and iterators and return the feeder operations in your input functions.
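The data-generating side of such an input function can be sketched without TensorFlow at all. Below, target_function is a made-up stand-in for the function being approximated and generated_batches is a hypothetical helper; in TensorFlow a generator like this could then be wrapped with tf.data.Dataset.from_generator inside the input function, which is not shown here.

```python
import math
import random


def target_function(x):
    """Stand-in for the function being approximated (an assumption)."""
    return math.sin(3.0 * x)


def generated_batches(batch_size, num_batches, seed=0):
    """Yield (features, labels) batches that are created on demand.

    Nothing is stored up front, so the total amount of data is limited
    only by num_batches, not by memory.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        features = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]
        labels = [target_function(x) for x in features]
        yield features, labels


for features, labels in generated_batches(batch_size=4, num_batches=2):
    print(len(features), len(labels))
```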
I'm using Weka for some classification experiments. I was trying some of the filters Weka provides that can be applied to extracted attributes, and I found that applying ClusterMembership to the attributes gives relatively higher accuracy than other methods. I'm not quite sure what this filter does, since it removes all the attributes and keeps only something like pCluster_0_0, pCluster_1_0, pCluster_2_0 and the class attribute. So I'm not sure whether the results I'm getting are valid and whether it will work for new, unseen instances. From the Weka documentation:
A filter that uses a density-based clusterer to generate cluster membership values; filtered instances are composed of these values plus the class attribute (if set in the input data). If a (nominal) class attribute is set, the clusterer is run separately for each class. The class attribute (if set) and any user-specified attributes are ignored during the clustering operation.
I do appreciate any help to understand this.
It basically does what the documentation you read describes! It uses a clustering algorithm to get the cluster membership of each input instance (i.e. the cluster the instance belongs to) and outputs these memberships as the new instances. A word of caution: the clustering algorithm used must be a density-based clusterer, so DBSCAN or expectation maximisation, for instance.
As for whether your result is valid, you will need to run a test set against the clusterer or do a percentage-split evaluation. You could be overfitting your data!
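To illustrate what "cluster membership values" means, here is a toy sketch in Python. It is not Weka's implementation: it assumes a single numeric attribute and two Gaussian clusters whose (prior, mean, standard deviation) are fixed by hand, whereas Weka fits the clusterer to the data. gaussian_pdf and cluster_membership are hypothetical helper names; the resulting list plays the role of the pCluster_* attributes.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def cluster_membership(x, clusters):
    """Turn one attribute value into one membership value per cluster.

    `clusters` is a list of (prior, mean, std) tuples. The result is the
    normalized probability of each cluster given x.
    """
    joint = [prior * gaussian_pdf(x, mean, std) for prior, mean, std in clusters]
    total = sum(joint)
    return [j / total for j in joint]


# Hand-picked clusters, purely for illustration.
clusters = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]
for value in (0.2, 2.5, 4.9):
    print(value, [round(p, 3) for p in cluster_membership(value, clusters)])
```

The original attribute is discarded and only these membership values (plus the class attribute) are kept, which matches what the question observed after applying the filter.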
I have built a model with a neural network (backpropagation), and now I want to classify an instance.
What I've done:
normalized each feature with regular normalization,
so the values of each feature range from 0 to 1.
The problem is how to classify a new instance that has a new value (or several new values) in one or more features, for example a value outside the range seen when the model was built, using the existing model.
Does anyone have a solution for this situation, or some references I can use to resolve this issue?
Thanks
Actually, I discussed this with my stochastics lecturer on campus, and his idea is to resolve the problem by fitting a distribution to the errors I got while building the model. The new instance could then be matched against that distribution (Gaussian, Gaussian mixture, or empirical) to see its likelihood. The problem with this idea is that we still have to obtain the error for that instance before we can look up its likelihood in the distribution, which means we still have to classify the instance with the existing model/function, the same one used to build the error distribution.
I also discussed it with a friend, whose idea is to use an FFT in place of the usual normalization function, so that the result is not confined to a certain range. The drawback is that the error may increase because of the error introduced by the FFT.
As a short-term solution, perhaps what you could do is set the value of the attribute to 0 or 1 (the limits of the range of the original dataset), depending on whether the new value falls below or above that range.
A longer-term solution would be to include such cases in future training of the neural network. Such values may cause the values of other instances to be skewed to the left or right so some attention may be required for the preprocessing of the training data.
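The short-term suggestion amounts to clipping. A minimal sketch, assuming the original normalization was min-max scaling with the minimum and maximum taken from the training data (fit_min_max and normalize are hypothetical helper names):

```python
def fit_min_max(training_values):
    """Remember the range of one feature as seen in the training data."""
    return min(training_values), max(training_values)


def normalize(value, lo, hi):
    """Min-max scale a value, clipping anything outside the training range."""
    if hi == lo:
        return 0.0  # constant feature: no spread to scale by
    scaled = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


lo, hi = fit_min_max([10.0, 20.0, 30.0])
print(normalize(20.0, lo, hi))  # inside the training range
print(normalize(45.0, lo, hi))  # above the range, clipped to 1.0
print(normalize(-5.0, lo, hi))  # below the range, clipped to 0.0
```

Clipping keeps the network's inputs inside the range it was trained on, at the cost of treating all out-of-range values as identical to the nearest limit.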
Hope this Helps!
I've been working with Weka for a couple of months now.
Currently, I'm working on my machine learning course here at Ostfold University College.
I need a better way to construct a decision tree based on separate training and test sets.
Any good ideas would be a great relief.
Thanks in advance.
-Neo
You might be asking for something more specific, but in general:
You build the decision tree with the training set, and you evaluate the performance of that tree using the test set. In other words, on the test data, you call a function usually named something like classify, passing in the newly-built tree and a data point (within your test set) you wish to classify.
This function returns the leaf (terminal) node from your tree to which that data point belongs--and assuming that the contents of that leaf is homogeneous (populated with data from a single class, not a mixture) then you have in essence assigned a class label to that data point. When you compare that class label assigned by the tree to the data point's actual class label, and repeat for all instances in your test set, you have a metric to evaluate the performance of your tree.
A rule of thumb: shuffle your data, then assign 90% to the training set and the other 10% to a test set.
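The shuffle, split and evaluate steps above can be sketched as follows. This is plain Python, not Weka; train_test_split and accuracy are hypothetical helpers, each instance is assumed to be a (features, label) pair, and classify stands for whatever function applies the already-built tree to one data point.

```python
import random


def train_test_split(instances, test_fraction=0.1, seed=0):
    """Shuffle a copy of the data and split it into (train, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(classify, test_set):
    """Fraction of test instances whose predicted label matches the true one.

    `classify(features)` is a hypothetical function wrapping the trained tree.
    """
    correct = sum(1 for features, label in test_set if classify(features) == label)
    return correct / len(test_set)
```

The tree is built only from the first list; the second list is touched only by accuracy, so the score reflects instances the tree has not seen.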
Actually, I was looking for something like this: http://weka.wikispaces.com/Saving+and+loading+models
to save a model, load it and use it in the training set.
This is exactly what I was searching for. I hope it is useful for anyone with a similar problem.
cheers
-Neo182