I want to predict a time series using a method like a Recurrent Neural Network (RNN), but I also want to use other input features. As far as I know, an RNN predicts the future based only on the history, whereas I want additional input features besides the historical data. I want to do something like a regressor chain using an RNN. I would very much appreciate any hints or examples of how to do this.
I have a large set of training data consisting of various texts, which should be the input for my neural network. I have no output, or rather I don't know what to use as output.
Anyway, after the learning phase I want the neural network to create new texts based on the training data.
I have read about things like "I made a bot watch 1000 hours of xy and asked it to write a new xy".
Now my question is, what kind of machine learning is this? I am not looking for instructions on how to write it, but just a hint on how to find some keywords or tutorials. My Google searches so far were useless.
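Your problem can usually be solved with an encoder-decoder architecture. Such a model learns a set of latent vectors from your input and then decodes them into whatever output form you want. It can be built with RNNs, LSTMs or CNNs. Nowadays, attention-based models like transformers are more common among the big names. For text generation specifically, useful keywords to search for are "language model" and "text generation": the training target is simply the next word or character of the text itself, so no separate labels are needed. Generative Adversarial Networks (GANs) are another family of generative models you may come across, though they are more commonly applied to images than to text.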
After searching questions on SO and reddit, I can't figure out how to train a multiple input, multiple output classifier on a ML Text Classifier. I can train a single input, single output text classifier, but that doesn't fit my use case.
Any help would be appreciated. I understand that there's no code to post, and that this is sort of a "show me how" question, but this information seems not readily available via searching and elsewhere, and would be beneficial to the community.
The classifier objects provided by Core ML (and Create ML) are for very specific use cases. If you try to do anything more advanced than that, you'll have to create a custom model, such as your own neural network.
As far as I understand, neural networks aren't good at classifying 'unknowns', i.e. items that do not belong to a learned class. But how do face detection/recognition approaches usually determine that no face is detected/recognised in a region? Is the predicted probability somehow thresholded?
Summary
It is true that neural networks are inherently not good at classifying 'unknowns', because they tend to overfit to the data they have been trained on if the underlying structure of the network is complex enough. However, there are multiple ways to reduce the effects of overfitting. One such technique is dropout; another is batch normalization. Despite these techniques, the best way to reduce the effects of overfitting is to use more data.
For the facial recognition example you have given, the trained models have commonly 'seen' a huge amount of data. This means that there are very few 'unknowns', and even when there are, the neural network has learned how to tell whether facial features are present or not. This is because certain neural network structures are really good at detecting whether a pattern of features is present in the input data. If these features are found, the input is classified as a face; otherwise it is not.
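On the question of thresholding: one common heuristic is to reject an input when the model's highest predicted probability falls below a cutoff. The following is a minimal sketch under the assumption that the model outputs one raw score (logit) per known class. The function name `classify_or_reject` and the cutoff of 0.8 are hypothetical and not taken from any particular library.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_or_reject(logits, labels, threshold=0.8):
    """Return the most likely label, or None if the confidence is below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= threshold else None

labels = ["alice", "bob", "carol"]
confident = classify_or_reject([5.0, 0.0, 0.0], labels)  # one class clearly dominates
uncertain = classify_or_reject([1.0, 1.0, 1.0], labels)  # all classes equally likely
```

Note that softmax confidence can be high even for inputs unlike anything in the training data, so a threshold like this is only a heuristic and the cutoff would need tuning on held-out data.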
I have a dataset of the order of MxN. I want to perform a binary classification on this dataset using neural networks. I was looking into Recurrent Neural Networks. Although LSTMs can be used for autoencoders, I am not sure if they can be used for classification (I am trying to do a binary classification). I am very new to neural networks and deep learning models, and I am not really sure if there is a way of achieving binary classification with neural networks. I tried a Bernoulli RBM on my dataset, but I am not sure how to use this model to perform classification. I also found out about Pipeline(). Again, I am not sure how to achieve my goal.
Any help would be greatly appreciated.
Ok, something doesn't stack up. If you have unlabelled data and you want to group it into classes, you should take a look at clustering methods such as K-Means (http://scikit-learn.org/stable/modules/clustering.html#k-means).
Regarding classification with LSTMs: you run your input through the RNN layers, take the last output, and feed it into one or more fully connected (or convolutional) layers that take care of classification as you know it.
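The "take the last output and feed it into a fully connected layer" idea can be sketched as follows. This is a forward pass only, written in plain Python, and it assumes a simple tanh (Elman) recurrent cell instead of an LSTM to keep it short. All names, sizes and the random untrained weights are illustrative; in practice you would use a framework's LSTM layer and train the weights with backpropagation.

```python
import math
import random

def rnn_classify(sequence, W_xh, W_hh, b_h, w_out, b_out):
    """Simple recurrent layer followed by a sigmoid unit on the last hidden state."""
    hidden_size = len(b_h)
    h = [0.0] * hidden_size
    for x in sequence:  # x is the feature vector at one time step
        h = [
            math.tanh(
                sum(W_xh[j][i] * x[i] for i in range(len(x)))
                + sum(W_hh[j][k] * h[k] for k in range(hidden_size))
                + b_h[j]
            )
            for j in range(hidden_size)
        ]
    # Fully connected layer applied to the last hidden state only.
    logit = sum(w_out[j] * h[j] for j in range(hidden_size)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # probability of the positive class

# Illustrative random (untrained) weights: 3 input features, 4 hidden units.
rng = random.Random(0)
n_in, n_hidden = 3, 4
W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
b_h = [0.0] * n_hidden
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

sequence = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, 0.5]]  # 3 time steps
p = rnn_classify(sequence, W_xh, W_hh, b_h, w_out, b_out)
label = int(p >= 0.5)  # binary decision
```

Only the final hidden state reaches the output unit, so the recurrent layer has to summarise the whole sequence in that one vector.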
I am a newbie in machine learning and also in neural networks. Currently I'm taking a course at coursera.org about neural networks, but I don't understand everything. I have a little problem with my thesis. I should use a neural network, but I don't know how to choose the right neural network architecture for my problem.
I have a lot of data from web portals (typically online editions of newspapers and magazines). There is information about the articles, for example the title, the text and the release date of each article. There are also large amounts of sequence data that capture the behavior of users.
My goal is to predict the popularity of an article (number of readers or clicks on article by unique user). I want to make vectors from this data and feed my neural network with these vectors.
I have two questions:
1. How do I create the right vector?
2. Which neural network architecture is best suited for this problem?
Those are very broad questions. You'll need to identify smaller issues if you want more exact answers.
How do I create the right vector?
For text data, you usually use the vector space model. Best results are often obtained using tf-idf weighting.
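To make the tf-idf weighting concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the basic definition (raw term count multiplied by log(N / document frequency)). The function name `tfidf_vectors` and the sample documents are made up for illustration. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using raw term frequency times log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # missing terms count as 0
        vectors.append([tf[term] * idf[term] for term in vocabulary])
    return vocabulary, vectors

docs = [
    "election results announced",
    "football results today",
    "election debate today",
]
vocab, vectors = tfidf_vectors(docs)
```

Each article becomes a vector with one entry per vocabulary term; terms that appear in few documents get a higher weight than terms that appear in many.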
Which neural network architecture is suitable for this problem?
This is very hard to say. I would start with a network with k input neurons, where k is the size of your vectors after applying tf-idf. You might also want to do some sort of feature selection to reduce the number of features; a good feature selection method is the chi-squared test.
A standard network layout then uses a single hidden layer whose number of neurons is the average of the number of input neurons and the number of output neurons. It looks like you only need a single output neuron that outputs how popular the article is going to be (this can be a linear neuron or a sigmoid neuron).
For the neurons in your hidden layer, you can also experiment with linear and sigmoid neurons.
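The layout described above can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: the helper names `hidden_size` and `forward` are hypothetical, the hidden layer uses sigmoid neurons, the output neuron is linear, and the weights would normally be learned by training.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_size(n_inputs, n_outputs):
    """Rule of thumb from the answer: average of input and output neuron counts."""
    return max(1, (n_inputs + n_outputs) // 2)

def forward(x, W1, b1, w2, b2):
    """One sigmoid hidden layer followed by a single linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

n_hidden = hidden_size(1000, 1)  # e.g. 1000 tf-idf features, 1 popularity output

# Tiny example: 2 inputs, 2 hidden neurons with zero weights, 1 output.
y = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [1.0, 1.0], 0.0)
```

A linear output neuron suits an unbounded target such as a click count, while a sigmoid output suits a popularity score scaled to the range 0 to 1.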
There are many other things you can try as well: weight decay, the momentum technique, networks with multiple layers, recurrent networks and so on. It's impossible to say what would work best for your given problem without a lot of experimentation.