What is a "sequential recommender system"?

I have to give a presentation on the paper “Markov decision process in recommender system” by Shaini et al., but the authors never give a simple definition of a sequential recommender system. Basically, the idea is that you make recommendations based on the ordered sequence of the user’s past behaviour. How can I phrase this so that it sounds academic and scientific?

Please describe what you mean by academic and scientific. Your own words describe a sequential recommender clearly, and being clear benefits both science and academia.
I think you're looking for a formal definition of a sequential recommender system. I would suggest reading about Markov Chains at https://en.wikipedia.org/wiki/Markov_chain
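To make the Markov chain idea concrete: a sequential recommender estimates the probability of the next item given the ordered history. Here is a minimal sketch, assuming a first-order Markov chain (only the last item matters) and some made-up toy sessions. The function names are hypothetical, and this is not the method from the paper, just an illustration of the definition.

```python
from collections import Counter, defaultdict

def fit_transitions(sessions):
    """Count how often item b directly follows item a across all sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for prev_item, next_item in zip(session, session[1:]):
            counts[prev_item][next_item] += 1
    return counts

def recommend(counts, history, k=2):
    """Rank candidate next items by the estimated P(next | last item)."""
    if not history or history[-1] not in counts:
        return []  # nothing observed after this item
    following = counts[history[-1]]
    total = sum(following.values())
    return [(item, n / total) for item, n in following.most_common(k)]

# Hypothetical toy data: each list is one user's ordered interaction history.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case", "screen_protector"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]
model = fit_transitions(sessions)
# The order of the history matters: only the most recent item is used here.
print(recommend(model, ["laptop", "phone"]))
```

A non-sequential recommender would treat the history as an unordered set, so ["laptop", "phone"] and ["phone", "laptop"] would get the same recommendations. Here they do not.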

Related

Handling new features in classification models

I’m taking my first steps in ML, specifically with classifiers for text sentiment analysis. My approach is the usual split: 80% of the dataset for training and 20% for testing. Once I have a trained model, what is the best way to proceed in a production environment when new features appear (new words in texts that were not present in the initial dataset)?
In a classification task, every feature must be seen at training time; new features cannot be added at prediction time. For your problem you can use stemming or lemmatizing, so that unseen word forms map onto known ones, or something like LDA or Word2Vec models that were trained on a large number of documents.
This chapter could be useful: https://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html
The problem that you are describing is generally known as "out of vocabulary" (OOV) words that appear in the test set but not in the training set. A traditional approach is to represent each OOV word with a special token, such as "UNKNOWN", and actually have those in the training data. This approach is discussed more fully in Section 4.3 of "Speech and Language Processing" by Jurafsky and Martin.
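A minimal sketch of the special-token idea, assuming whitespace tokenization and a frequency cutoff (`min_count` is an arbitrary choice, and the token name `<UNK>` is just a convention). Rare training words are replaced by the token, so the model actually sees it during training:

```python
from collections import Counter

UNK = "<UNK>"  # placeholder token; the exact name is an arbitrary choice

def build_vocab(train_texts, min_count=2):
    """Keep words seen at least min_count times; rarer words become UNK."""
    counts = Counter(word for text in train_texts for word in text.lower().split())
    vocab = {word for word, n in counts.items() if n >= min_count}
    vocab.add(UNK)
    return vocab

def encode(text, vocab):
    """Replace every word outside the vocabulary with the UNK token."""
    return [word if word in vocab else UNK for word in text.lower().split()]

# Hypothetical toy training set.
train_texts = ["good movie", "good plot", "bad movie", "bad acting"]
vocab = build_vocab(train_texts)

print(encode("good plot", vocab))        # training text: "plot" is rare, so it maps to UNK
print(encode("good soundtrack", vocab))  # production text: unseen word maps to UNK
```

The same `encode` function is applied at training and at prediction time, so the feature space never changes after training.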
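A more modern approach is to use word embeddings such as Word2Vec, which uses a shallow neural network to learn a dense vector for each word. If the vectors were trained on a large external corpus, many words that are missing from your own training set will still have a representation.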

Can the TextRank Algorithm be categorized as unsupervised machine learning?

TextRank is an approach to Automatic Text Summarization. Many categorize it as an "unsupervised" approach. I wish to know if this translates into TextRank being categorized as an Unsupervised Machine Learning technique.
TextRank is not directly related to machine learning: machine learning involves the creation of a data model to predict future observations based on previous observations. This involves tuning model parameters to fit observed data.
On the other hand, TextRank is a graph-based ranking algorithm: it finds the summary parts based on the structure of a single document and does not use observations to learn anything. Since it's not machine learning, it can't be unsupervised machine learning, either.
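To illustrate why nothing is learned, here is a rough sketch of the sentence-ranking idea: build a graph whose edges are word-overlap similarities between sentences of one document, then run a weighted PageRank on it. This is a simplified illustration and not the authors' reference implementation; the whitespace tokenization, the damping value and the fixed iteration count are assumptions.

```python
import math

def similarity(a, b):
    """Word overlap between two token lists, normalised by log sentence lengths."""
    if not a or not b:
        return 0.0
    shared = len(set(a) & set(b))
    norm = math.log(len(a)) + math.log(len(b))
    return shared / norm if norm > 0 else 0.0

def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    tokens = [s.lower().split() for s in sentences]
    n = len(tokens)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]  # total edge weight leaving each node
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

# Hypothetical toy document, already split into sentences.
sentences = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock prices fell sharply today",
]
scores = textrank(sentences)
print(scores)  # the third sentence shares no words with the others and scores lowest
```

Note that there is no training set, no labels and no fitted parameters: the scores depend only on the structure of the one document being summarized.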
The original authors of TextRank, Mihalcea and Tarau, described their work as unsupervised in a sense:
In particular, we proposed and evaluated two innovative unsupervised approaches for keyword and sentence extraction.
However, that differs from unsupervised learning, i.e. finding hidden structure within unlabeled data.
Also, TextRank is not a machine learning algorithm; in other words, it does not generalize from data by "minimizing a loss function together with a regularization term or side constraints" (per Stephen Boyd, et al.). Linguists might note some similarities, though that's outside the scope of this question.
Even so, some confusion might come from the fact that TextRank and related approaches get used to develop feature vectors to present to machine learning algorithms.

Automating the rumour identification process

Currently, we check user discussions on social media for certain keywords. If the keywords are detected, we flag the discussion as a possible rumour.
Approaches to automate the process:
Keyword based: check the conversation for 1-2 gram keywords. If a keyword is present, mark it as a suspected conversation.
Classifier based: train a classifier on some prelabeled suspected conversations. Whichever conversations are classified with >50% probability are marked as suspected.
For the 2nd approach I am thinking of a naive Bayes classifier, and identifying the result with precision, recall, F-measure values using scikit-learn.
Is there any better approach to this? Or some model which can be a combination of both approaches?
There's no reason that the two approaches would be mutually exclusive. If you are going to be identifying keywords anyway, then you could easily extract a feature for machine-learning. And if you are doing machine-learning, you might as well include features that capture what you know about the keywords you have identified.
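As a rough sketch of what combining the two could look like, the keyword check can simply become extra features next to the usual word features. The keyword list and feature names below are hypothetical placeholders, and whitespace tokenization is assumed.

```python
# Hypothetical 1-2 gram rumour keywords; replace with your own list.
KEYWORDS = {"hoax", "fake", "not confirmed"}

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text, keywords=KEYWORDS):
    """Bag-of-words features plus features derived from the keyword list."""
    tokens = text.lower().split()
    features = {f"word={t}": 1 for t in tokens}
    hits = [g for g in tokens + ngrams(tokens, 2) if g in keywords]
    features["keyword_hits"] = len(hits)        # how many keywords matched
    features["has_keyword"] = int(bool(hits))   # the original keyword rule as a 0/1 feature
    return features

print(extract_features("This story is not confirmed"))
```

Feature dictionaries like these can be turned into a matrix for a classifier, for example with scikit-learn's `DictVectorizer`, so the classifier decides how much weight the keyword rule deserves.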
Is there a reason that you have chosen a Naive Bayes model? You may want to try a number of models to compare their performance. Your statement about 'identifying the result with precision, recall, F-measure' makes it seem like you don't understand how you make predictions with a machine-learning model. Those three metrics are the result of comparing a model's predictions with 'gold-standard' labels on a number of texts. I would recommend reading through an introduction to machine-learning. If you have already decided that you want to use scikit-learn, then perhaps you could work through their tutorial here. Another python library worth looking into is nltk, which has a free companion book here.
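To make the role of the metrics concrete, here is a small sketch that computes them by comparing predictions with gold-standard labels. The labels are made up for illustration; 1 stands for "suspected rumour".

```python
def precision_recall_f1(gold, predicted, positive=1):
    """Compare predicted labels with gold labels for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical gold labels and model predictions for six conversations.
gold      = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(precision_recall_f1(gold, predicted))
```

The classifier produces `predicted`; the metrics only tell you afterwards how well those predictions match the labels.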
If Python is not your preferred language, then there are lots of other options, too. For example, Weka is a well-known tool written in Java. It has a very user-friendly graphical interface for the basic functions, but it is not difficult to use from the command line as well.
Good luck!

In Q-learning with function approximation, is it possible to avoid hand-crafting features?

I have little background knowledge of Machine Learning, so please forgive me if my question seems silly.
Based on what I've read, the best model-free reinforcement learning algorithm to this date is Q-Learning, where each (state, action) pair in the agent's world is given a q-value, and at each state the action with the highest q-value is chosen. The q-value is then updated as follows:
Q(s,a) ← (1-α) Q(s,a) + α (R(s,a,s') + γ max_a' Q(s',a')), where α is the learning rate and γ is the discount factor.
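For reference, a minimal sketch of the standard tabular update, where `gamma` is the usual discount factor applied to the best next-state value. The state and action names and the parameter values are made up for illustration.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: blend the old estimate with the new target."""
    best_next = max(Q[(next_state, a)] for a in actions)  # max over a' of Q(s', a')
    target = reward + gamma * best_next
    Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * target
    return Q[(state, action)]

# Hypothetical two-action world; unseen (state, action) pairs default to 0.0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "right")] = 2.0

new_value = q_update(Q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
print(new_value)  # roughly 0.5 * 0 + 0.5 * (1 + 0.9 * 2)
```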
Apparently, for problems with high dimensionality, the number of states becomes astronomically large, making q-value table storage infeasible.
So the practical implementation of Q-Learning requires approximating q-values by generalizing over states, i.e. by using features. For example, if the agent was Pacman then the features would be:
Distance to closest dot
Distance to closest ghost
Is Pacman in a tunnel?
And then, instead of q-values for every single state, you would only need a value (a weight) for every single feature.
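A rough sketch of what this looks like with a linear approximation, where the q-value is a weighted sum of the features and only the weights are learned. The feature names and numbers are hypothetical, and `gamma` is the usual discount factor.

```python
def q_value(weights, features):
    """Approximate Q(s, a) as the weighted sum of the feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def approx_q_update(weights, features, reward, next_feature_sets, alpha=0.1, gamma=0.9):
    """Shift each weight in proportion to its feature value and the TD error.

    features: feature dict for the (state, action) pair just taken.
    next_feature_sets: one feature dict per action available in the next state.
    """
    best_next = max((q_value(weights, f) for f in next_feature_sets), default=0.0)
    td_error = reward + gamma * best_next - q_value(weights, features)
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) + alpha * td_error * value
    return td_error

# Hypothetical Pacman-style features, scaled to small numbers.
weights = {}
features = {"dist_to_dot": 0.5, "dist_to_ghost": 0.2, "in_tunnel": 1.0}
next_feature_sets = [
    {"dist_to_dot": 0.4, "dist_to_ghost": 0.3, "in_tunnel": 1.0},
    {"dist_to_dot": 0.6, "dist_to_ghost": 0.1, "in_tunnel": 0.0},
]
td = approx_q_update(weights, features, reward=10.0, next_feature_sets=next_feature_sets)
print(td, weights)
```

The table over all states is replaced by three numbers here, which is why this scales. The quality of the result then depends entirely on how informative the features are.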
So my question is:
Is it possible for a reinforcement learning agent to create or generate additional features?
Some research I've done:
This post mentions A Geramifard's iFDD method
http://www.icml-2011.org/papers/473_icmlpaper.pdf
http://people.csail.mit.edu/agf/Files/13RLDM-GQ-iFDD+.pdf
which is a way of "discovering feature dependencies", but I'm not sure if that is feature generation, as the paper assumes that you start off with a set of binary features.
Another paper that seemed apropos is Playing Atari with Deep Reinforcement Learning, which "extracts high level features using a range of neural network architectures".
I've read over the paper but still need to flesh out/fully understand their algorithm. Is this what I'm looking for?
Thanks
It seems like you already answered your own question :)
Feature generation is not part of the Q-learning (or SARSA) algorithm itself. In a step called preprocessing you can, however, use a wide array of algorithms (some of which you mentioned) to generate or extract features from your data. Combining different machine learning algorithms results in hybrid architectures, a term you might look into when researching what works best for your problem.
Here is an example of using features with SARSA (which is very similar to Q-learning).
Whether the papers you cited are helpful for your scenario, you'll have to decide for yourself. As always with machine learning, your approach is highly problem-dependent. If you're in robotics and it's hard to define discrete states manually, a neural network might be helpful. If you can think of heuristics by yourself (like in the Pacman example) then you probably won't need it.

Research papers classification on the basis of title of the research paper

Dear all, I am working on a project in which I have to categorize research papers into their appropriate fields using the titles of the papers. For example, if the phrase "computer network" occurs somewhere in the title, then the paper should be tagged as related to the concept "computer network". I have 3 million titles of research papers, so I want to know how I should start. I have tried to use tf-idf but could not get useful results. Does someone know of a library to do this task easily? Kindly suggest one. I shall be thankful.
If you don't know the categories in advance, then it's not classification but clustering. Basically, you need to do the following:
Select algorithm.
Select and extract features.
Apply algorithm to features.
Quite simple. You only need to choose the combination of algorithm and features that fits your case best.
When talking about clustering, there are several popular choices. K-means is one of the most widely used and has an enormous number of implementations, even in libraries not specialized in ML. Another popular choice is the Expectation-Maximization (EM) algorithm. Both of them, however, require an initial guess about the number of clusters. If you can't predict the number of clusters even approximately, other algorithms - such as hierarchical clustering or DBSCAN - may work better for you (see discussion here).
As for features, words themselves normally work fine for clustering by topic. Just tokenize your text, then normalize and vectorize the words (see this if you don't know what it all means).
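To show the three steps end to end, here is a minimal pure-Python sketch: tokenize and normalize the titles, turn them into length-normalized word-count vectors, and run a tiny k-means. The titles are made up, the initialization (first k titles as centroids) is a simplification to keep the result reproducible, and for 3 million titles you would want a library with sparse vectors instead.

```python
import math
import re
from collections import Counter

def tokenize(title):
    """Lowercase the title and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", title.lower())

def vectorize(titles):
    """Bag-of-words count vectors, scaled to unit length."""
    vocab = sorted({w for t in titles for w in tokenize(t)})
    vectors = []
    for t in titles:
        counts = Counter(tokenize(t))
        vec = [counts[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors serve as initial centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical titles for illustration.
titles = [
    "Routing in computer networks",
    "Deep learning for image recognition",
    "Congestion control in computer networks",
    "Image recognition with deep convolutional models",
]
vocab, vectors = vectorize(titles)
labels = kmeans(vectors, k=2)
print(labels)
```

Real implementations usually pick the initial centroids randomly or with k-means++ and rerun several times, since the result depends on the starting point.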
Some useful links:
Clustering text documents using k-means
NLTK clustering package
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Note: all links in this answer are about Python, since it has really powerful and convenient tools for this kind of task, but if you prefer another language, you will most probably be able to find similar libraries for it too.
For Python, I would recommend NLTK (Natural Language Toolkit), as it has some great tools for converting your raw documents into features you can feed to a machine learning algorithm. For starting out, you can maybe try a simple word frequency model (bag of words) and later on move to more complex feature extraction methods (string kernels). You can start by using SVMs (Support Vector Machines) to classify the data using LibSVM (a widely used SVM package).
Since you do not know the number of categories in advance, you could use a tool called OntoGen. The tool basically takes a set of texts, does some text mining, and tries to discover the clusters of documents. It is a semi-supervised tool, so you must guide the process a little, but it does wonders. The final product of the process is an ontology of topics.
I encourage you to give it a try.
