I have implemented a Bidirectional LSTM which predicts a certain profile by using a windowing input in tensorflow. Conceptually it makes sense to me during the training phase.
Once the model is trained I use it to predict future and evaluate against a test set. However, I am unsure what happens to the backward LSTM cells that were originally trained with information passed from future to past. Does the network first do a forward pass in time and then a backward pass to give an output? Could someone please elaborate how does the network work in testing phase.
Network architecture
Thanks in advance.
I have tried to find some resources online but couldn't find a convincing answer
Related
I have a large set of Training data which consists of various texts. They should be the input for my neural network. I have no output, or I don't know what to put as output.
Anyway, after the learning phase I want the neural network to create new texts based on the training data.
I read about this like „I made a bot watch 1000 hours of xy and asked it to write a new xy“.
Now my question is, what kind of machine learning is this? I am not looking for instructions on how to write it, but just a hint on how to find some keywords or tutorials. My Google searches so far were useless.
Your problem can usually be solved by an Encoder-Decoder architecture. This architecture would learn a set of latent vectors from your input, then try to output in whatever form you want. This architecture can be built with RNN, LSTM or CNN. Nowadays, attention-based models like transformers are more common among the big names. If you want to do text generation, you can start by reading about Generative Adversarial Networks (GANs).
I am a beginner in the neuronal network field and I want to understand a certain statement. A friend said that a neuronal network gets slower after you fit a lot of data in.
Right now, I just did the coursera ML course from androw ng. There, I implemented backpropagation. I thought it just adaptes the model related to the expected output by using different types of calculations. Nevertheless, it was not like the history was used to adapt the model. Just the current state of the neurons were checked and their weight were adapted backwards in combination with regularisation.
Is my assumption correct or am I wrong? Are there some libraries that use history data that could result in a slowly adapting model after a certain amount of training?
I want to use a simple neuronal network for reinforcement learning and I want to get an idea if I need to reset my model if the target environment changes for some reason. Otherwise my model would be slower and slower in adaption after time.
Thanks for any links and explanations in advanced!
As you have said, neural networks adapt by modifying their weights during the backpropagation step. Modifying these weights will not be slower as the training goes on since the number of steps to modify these weights will always remain the same. The amount of steps needed to run an example through your model will also remain the same, therefore not slowing down your network according to the amount of examples you fed it during training.
However, you can decide to change your learning rate during your training (generally decreasing it as epochs go on). According to the way the learning rate of your model evolves, the weights will be modified in a different manner, generally resulting in a smaller difference each epoch.
So this question may seem a little stupid but I couldn't wrap my head around it.
What is the purpose of test data? Is it only to calculate accuracy of the classifier? I'm using Naive Bayes for sentiment analysis of tweets. Once I train my classifier using training data, I use test data just to calculate accuracy of the classifier. How can I use the test data to improve classifier's performance?
In doing general supervised machine learning, the test data set plays a critical role in determining how well your model is performing. You typically will build a model with say 90% of your input data, leaving 10% aside for testing. You then check the accuracy of that model by seeing how well it does against the 10% training set. The performance of the model against the test data is meaningful because the model has never "seen" this data. If the model be statistically valid, then it should perform well on both the training and test data sets. This general procedure is called cross validation and you can read more about it here.
You don't -- like you surmise, the test data is used for testing, and mustn't be used for anything else, lest you skew your accuracy measurements. This is an important cornerstone of any machine learning -- you only fool yourself if you use your test data for training.
If you are considering desperate measures like that, the proper way forward is usually to re-examine your problem space and the solution you have. Does it adequately model the problem you are trying to solve? If not, can you devise a better model which captures the essence of the problem?
Machine learning is not a silver bullet. It will not solve your problem for you. Too many failed experiments prove over and over again, "garbage in -- garbage out".
I am trying to solve some classification problem. It seems many classical approaches follow a similar paradigm. That is, train a model with some training set and than use it to predict the class labels for new instances.
I am wondering if it is possible to introduce some feedback mechanism into the paradigm. In control theory, introducing a feedback loop is an effective way to improve system performance.
Currently a straight forward approach on my mind is, first we start with a initial set of instances and train a model with them. Then each time the model makes a wrong prediction, we add the wrong instance into the training set. This is different from blindly enlarge the training set because it is more targeting. This can be seen as some kind of negative feedback in the language of control theory.
Is there any research going on with the feedback approach? Could anyone shed some light?
There are two areas of research that spring to mind.
The first is Reinforcement Learning. This is an online learning paradigm that allows you to get feedback and update your policy (in this instance, your classifier) as you observe the results.
The second is active learning, where the classifier gets to select examples from a pool of unclassified examples to get labelled. The key is to have the classifier choose the examples for labelling which best improve its accuracy by choosing difficult examples under the current classifier hypothesis.
I have used such feedback for every machine-learning project I worked on. It allows to train on less data (thus training is faster) than by selecting data randomly. The model accuracy is also improved faster than by using randomly selected training data. I'm working on image processing (computer vision) data so one other type of selection I'm doing is to add clustered false (wrong) data instead of adding every single false data. This is because I assume I will always have some fails, so my definition for positive data is when it is clustered in the same area of the image.
I saw this paper some time ago, which seems to be what you are looking for.
They are basically modeling classification problems as Markov decision processes and solving using the ACLA algorithm. The paper is much more detailed than what I could write here, but ultimately they are getting results that outperform the multilayer perceptron, so this looks like a pretty efficient method.
Hey I have a task to perform, which is basically to somehow retrieve powerpoint presentations or pdf documents pertaining to a certain field. Let's say I want to retrieve ppt and pdf lecture notes pertaining to bioinformatics field. I would like to know if this task can be achieved by adapting the approach of using neural bots trained by a neural network? Just wanted to confirm that this approach is not completely wrong before I proceeded further with my implementation.
And in case someone is wondering why a neural network or any learning algorithm at all is required in this case well here is my plan (which might be wrong or there might be an easier way to achieve this so please feel free to correct me):
I generate neural bots trained by a neural network (not sure how this training happens yet, I am assuming by supervised learning using a sample training set of certain ppt and pdf files) and then these bots retrieve pages that are similar to what they learnt through their training.
So is the above approach a correct way to go about completing this task?
Neural nets are complicated. It seems like you have a generic document classification problem. The simplest place to start is using some kind of naive bayes model with bag of word features. The next step I'd take from there is to use a linear SVM or logistic regression on the same feature set. If you still don't have the performance you want after you tried simpler things, maybe then go on to try using neural nets.
Just like you wouldn't say, I want to do write an email server, I'll start by writing an operating system, I'd tend to be wary of using neural nets before simpler things have failed.