Specific Machine Learning Query about Estimating Training Values and Adjusting Weights - machine-learning

Hey, I am really new to the field of machine learning and recently started reading the book Machine Learning by Tom Mitchell. I am stuck on a particular section in the first chapter where he talks about estimating training values and adjusting the weights. An explanation of the concept of estimating training values would be great, but I understand that it is not easy to explain all this, so I would be really obliged if someone could point me towards a resource (a lecture video, simple lecture slides, or a text snippet) that discusses estimating training values and the like.
Again, I am sorry I cannot provide more information about the question I am asking. The book sections are 1.2.4.1 and 1.2.4.2 in "Machine Learning" by Tom Mitchell, in case anyone has read this book and has had the same trouble understanding the concepts described in these sections.
Thanks in advance.

Ah. Classic textbook. My copy is a bit out of date but it looks like my section 1.2.4 deals with the same topics as yours.
First off, this is an introductory chapter that tries to be general and non-intimidating, but as a result it is also very abstract and a bit vague. At this point I wouldn't worry too much that you didn't understand the concepts; it is more likely that you're overthinking it. Later chapters will flesh out the things that seem unclear now.
Value in this context should be understood as a measure of the quality or performance of a certain state or instance, not as "values" as in numbers in general. Using his checkers example, a state with a high value is a board situation that is good/advantageous for the computer player.
The main idea here is that if you can assign a value to every possible state that can be encountered, and there is a set of rules that defines which states can be reached from the current state by which actions, then you can make an informed decision about which action to take.
But assigning values to states is only a trivial task for the end states of the game. The value attained at an end state is often called the reward. The goal is of course to maximize the reward. Estimating training values refers to the process of assigning guessed values to intermediate states based on the results you obtained later on in a game.
So, while playing many, many training games you keep a trace of which states you encounter, and if you find that some state X leads to state Y, you can change your estimated value of X a bit, based on the current estimate for X and the current estimate for Y. This is what estimating the training values is all about. Through repeated training the model gains experience and the estimates should converge to reliable values. It will start to avoid moves that lead to defeat, and favor moves that lead to victory. There are many different ways of doing such updates, and many different ways to represent the game state, but that is what the rest of the book is about.
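To make this concrete, here is a minimal sketch of that idea, loosely following Mitchell's checkers example: a linear evaluation function whose weights are adjusted with the LMS rule, using the learner's own estimate of the successor state as the training value. The board representation and the features are hypothetical stand-ins.

    import numpy as np

    ETA = 0.01             # learning rate
    weights = np.zeros(7)  # w0..w6 of the linear evaluation function

    def features(board):
        """Hypothetical feature vector x0..x6 (x0 = 1 pairs with the bias w0).
        Assumes `board` is already summarized as 6 numeric board features."""
        return np.array([1.0, *board])

    def v_hat(board):
        """Current learned evaluation: V_hat(b) = w . x(b)."""
        return weights @ features(board)

    def lms_update(board, successor_board):
        """One step: V_train(b) <- V_hat(Successor(b)), then
        w_i <- w_i + ETA * (V_train(b) - V_hat(b)) * x_i."""
        v_train = v_hat(successor_board)   # estimated training value for b
        error = v_train - v_hat(board)
        weights[:] += ETA * error * features(board)  # in-place global update

After each training game you would walk the recorded trace of states and call lms_update on every (state, successor) pair.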
I hope this helps!

Related

Dyna-Q with planning vs. n-step Q-learning

I'm reading Reinforcement Learning by Sutton and Barto, and for an example of Dyna-Q, they use a maze problem. The example shows that with n=50 steps of planning, the algorithm reaches the optimal path in only 3 episodes.
Is this an improvement over 50-step Q-learning? It seems like you are really just running a bunch of 50-step Q-learning algorithms in each episode, so saying it finds the optimal path in 3 episodes is misleading.
Also, I guess the big question is, I thought Dyna-Q was useful when you don't have a model of the environment, but in this example don't we have a model of the environment? Why use all of the memory to save all our previous moves if we already have a model? I'm having trouble understanding why this is a good example for Dyna-Q.
In theory, we don't have the model. We have it in practice just for simulation, but in real life we don't.
Dyna-Q basically approximates your model using samples. Instead of learning the transition and reward functions, you "query" your data: what happened in the past when I did action a in state s? If everything is deterministic, this is equivalent to knowing the exact model.
Think of it also like this: in classic Q-learning you know only your current (s, a), so you update Q(s,a) only when you visit it. In Dyna-Q, you update all Q(s,a) every time you query them from the memory. You don't have to revisit them. This speeds things up tremendously.
Also, the very common "replay memory" basically reinvented Dyna-Q, even though nobody acknowledges it.
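For reference, here is a minimal tabular Dyna-Q sketch showing both update paths: the direct update from real experience, and the planning updates replayed from memory. The action set and the surrounding environment loop are assumptions, not from the book.

    import random
    from collections import defaultdict

    ALPHA, GAMMA, N_PLANNING = 0.1, 0.95, 50

    Q = defaultdict(float)  # Q[(state, action)]
    model = {}              # model[(state, action)] = (reward, next_state)

    def dyna_q_step(s, a, r, s2, actions):
        # (a) direct Q-learning update from the real transition
        best_next = max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        # (b) remember what happened (a sample-based, deterministic model)
        model[(s, a)] = (r, s2)
        # (c) planning: replay N remembered transitions and update their
        #     Q-values without revisiting them in the real environment
        for _ in range(N_PLANNING):
            (ps, pa), (pr, ps2) = random.choice(list(model.items()))
            best = max(Q[(ps2, b)] for b in actions)
            Q[(ps, pa)] += ALPHA * (pr + GAMMA * best - Q[(ps, pa)])

You would call dyna_q_step once per real environment step; the 50 planning updates are where the "query the memory" speed-up comes from.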

Detecting broken sensors with machine learning

I'm new to machine learning.
I've got a huge database of sensor data from weather stations. Those sensors can be broken or report odd values. Broken sensors influence the calculations that are done with that data.
The goal is to use machine learning to detect whether new sensor values are odd, and to mark the sensor as broken if so. As I said, I'm new to ML. Can somebody push me in the right direction or give feedback on my approach?
The data has a datetime and a value. The sensor values are pushed every hour.
I appreciate any kind of help!
Since the question is pretty general in nature, I will provide some basic thoughts. Maybe you are already slightly familiar with them.
Set up a dataset that contains both broken sensors and good sensors, labelled as such. That label is the dependent variable, Y. Alongside it you have some variables that might predict Y; let's call them X.
You train a model to learn the relationship between X and Y.
You then predict, for X values where you do not know the outcome, what Y will be.
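In code, that workflow might look like the following scikit-learn sketch. The file name, column names, and the choice of a random forest are my assumptions, not part of the question:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical labelled dataset: one row per hourly reading
    df = pd.read_csv("sensor_readings.csv")
    df["hour"] = pd.to_datetime(df["datetime"]).dt.hour

    X = df[["value", "hour"]]   # predictors (X)
    y = df["broken"]            # dependent variable (Y): 1 = broken, 0 = good

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    model = RandomForestClassifier().fit(X_train, y_train)
    print(model.score(X_test, y_test))   # accuracy on held-out data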
Some useful insight into the basics is here:
https://www.youtube.com/watch?v=elojMnjn4kk&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A
Good Luck!
You could use Isolation Forest to detect abnormal readings.
Twitter has developed an algorithm called ESD (Extreme Studentized Deviate) that is also useful.
https://github.com/twitter/AnomalyDetection/
However, a good EDA (exploratory data analysis) is needed to define the types of abnormality found in the readings due to faulty sensors, such as:
1) A step change, where the value suddenly increases (or decreases) and remains at the new level
2) A gradual drift in the value compared to other sensors, followed by a sudden very large increase
3) Intermittent spikes in the data
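If you want to try the Isolation Forest suggestion, a minimal scikit-learn sketch looks like this; the contamination rate (the expected fraction of anomalies) is a guess you would tune after the EDA:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Hourly readings from one sensor; 55.0 is an intermittent spike
    values = np.array([[20.1], [20.3], [19.8], [55.0], [20.0]])

    clf = IsolationForest(contamination=0.05, random_state=0).fit(values)
    labels = clf.predict(values)   # -1 = anomaly (possibly broken), 1 = normal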

Direct or indirect types of training experience

I have a question:
in machine learning we define two types of training experience:
direct and indirect.
I searched a lot for the difference between them but I could not find it. Is anyone familiar with these?
Thank you in advance
In his book "Machine Learning" (1st ed.), Tom Mitchell explains this as follows (see section 1.2.1, page 5):
For example, in learning to play checkers, the system might learn from direct training examples consisting of individual checkers board states and the correct move for each. Alternatively, it might have available only indirect information consisting of the move sequences and final outcomes of various games played. In this later case, information about the correctness of specific moves early in the game must be inferred indirectly from the fact that the game was eventually won or lost.
He further states:
Here [using indirect feedback] the learner faces an additional problem of credit assignment, or determining the degree to which each move in the sequence deserves credit or blame for the final outcome. Credit assignment can be a particularly difficult problem because the game can be lost even when early moves are optimal, if these are followed later by poor moves. Hence, learning from direct training feedback is typically easier than learning from indirect feedback.
Considering the example of playing chess:
Direct experience: learning the rules of playing chess, i.e., the different moves of the different pieces. Example: the elephant moves only straight, the soldier moves only one step at a time, etc.
Indirect experience:
learning from previous games. If a particular move in a game led to a win, that move is assigned some reward or credit; if a particular move led to losing the game, a penalty is assigned. This experience from previous games is then used to decide on a move in a particular situation, so that the game can be won.

How to calculate distance when we have a sparse dataset in K-nearest neighbour

I am implementing the K-nearest-neighbour algorithm for very sparse data. I want to calculate the distance between a test instance and each sample in the training set, but I am confused,
because most of the features in the training samples don't exist in the test instance, or vice versa (missing features).
How can I compute the distance in this situation?
To make sure I'm understanding the problem correctly: each sample forms a very sparsely filled vector. The missing data is different between samples, so it's hard to use any Euclidean or other distance metric to gauge similarity of samples.
If that is the scenario, I have seen this problem show up before in machine learning - in the Netflix prize contest, but not specifically applied to KNN. The scenario there was quite similar: each user profile had ratings for some movies, but almost no user had seen all 17,000 movies. The average user profile was quite sparse.
Different folks had different ways of solving the problem, but the way I remember it, they plugged in dummy values for the missing entries, usually the mean of that particular value across all samples that had data. Then they used Euclidean distance, etc. as normal. You can probably still find discussions surrounding this missing-value problem on those forums. It was a particularly common problem for those trying to implement singular value decomposition, which became quite popular and so was discussed quite a bit, if I remember right.
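As a small sketch of that mean-imputation idea, with NaN standing in for a missing feature (the numbers are made up; nothing here is specific to the Netflix data):

    import numpy as np

    X = np.array([[5.0, np.nan, 3.0],
                  [4.0, 2.0, np.nan],
                  [np.nan, 1.0, 4.0]])

    # Fill each missing entry with the mean of the observed values in its column
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), col_means, X)

    # Now ordinary Euclidean distance works as usual
    test = np.array([4.5, np.nan, 3.5])
    test_filled = np.where(np.isnan(test), col_means, test)
    dists = np.sqrt(((X_filled - test_filled) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:2]   # indices of the 2 nearest neighbours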
You may wish to start here:
http://www.netflixprize.com//community/viewtopic.php?id=1283
You're going to have to dig for a bit. Simon Funk had a slightly different approach to this, but it was more specific to SVD. You can find it here: http://www.netflixprize.com//community/viewtopic.php?id=1283
He calls them blank spaces if you want to skip to the relevant sections.
Good luck!
If you work in a very high-dimensional space, it is better to reduce the dimensionality using SVD, LDA, pLSA or similar on all available data, and then train the algorithm on the data transformed that way. Some of those algorithms are scalable, so you can find implementations in the Mahout project. That said, I prefer more general features over such transformations, because they make debugging and feature selection easier. For that purpose, combine features, use stemmers, and think in more general terms.
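For illustration, here is a hedged sketch of such an SVD-based reduction using scikit-learn's TruncatedSVD instead of Mahout; the matrix size and the 100-component target are arbitrary choices:

    from scipy.sparse import random as sparse_random
    from sklearn.decomposition import TruncatedSVD

    # Toy sparse data: 1000 samples, 5000 features, 0.1% non-zero
    X = sparse_random(1000, 5000, density=0.001, random_state=0)

    svd = TruncatedSVD(n_components=100, random_state=0)
    X_reduced = svd.fit_transform(X)   # dense 1000 x 100; train KNN on this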

Using Artificial Intelligence (AI) to predict Stock Prices

Given a set of data very similar to the Motley Fool CAPS system, where individual users enter BUY and SELL recommendations on various equities: what I would like to do is show each recommendation and somehow rate it (1-5) as to whether it was a good predictor (i.e., correlation coefficient = 1) of the future stock price (or EPS, or whatever), a horrible predictor (i.e., correlation coefficient = -1), or somewhere in between.
Each recommendation is tagged to a particular user, so that it can be tracked over time. I can also track market direction (bullish/bearish) based on something like the S&P 500 price. The components I think would make sense in the model are:
user
direction (long/short)
market direction
sector of stock
The thought is that some users are better in bull markets than bear (and vice versa), and some are better at shorts than longs- and then a combination the above. I can automatically tag the market direction and sector (based off the market at the time and the equity being recommended).
The thought is that I could present a series of screens that would allow me to rank each individual recommendation by displaying the available data: absolute, market, and sector outperformance over a specific time period. I would follow a detailed checklist for ranking the stocks so that the ranking is as objective as possible. My assumption is that a single user is right no more than 57% of the time, but who knows.
I could load the system and say "Lets rank the recommendation as a predictor of stock value 90 days forward"; and that would represent a very explicit set of rankings.
NOW here is the crux: I want to create some sort of machine learning algorithm that can identify patterns over time, so that as recommendations stream into the application we maintain a ranking for that stock (i.e., similar to a correlation coefficient) reflecting the likelihood that the recommendation (together with the past series of recommendations) will affect the price.
Now here is the super crux. I have never taken an AI class or read an AI book, never mind anything specific to machine learning. So I am looking for guidance: a sample or description of a similar system I could adapt, places to look for info, or any general help. Or even a push in the right direction to get started...
My hope is to implement this in F#, impress my friends with a new skill set by implementing machine learning in F#, and end up with something (application/source) I can include in a tech portfolio or blog.
Thank you for any advice in advance.
I have an MBA, and teach data mining at a top grad school.
The term project this year was to predict stock price movements automatically from news reports. One team had 70% accuracy, on a reasonably small sample, which ain't bad.
Regarding your question, a lot of companies have made a lot of money on pair trading (find a pair of assets that normally correlate, and buy/sell the pair when they diverge). See the writings of Ed Thorp, author of Beat the Dealer. He's accessible and kinda funny, if not curmudgeonly. He ran a good hedge fund for a long time.
There is probably some room in using data mining to predict companies that will default (be unable to make debt payments) and shorting† them, and use the proceeds to buy shares in companies less likely to default. Look into survival analysis. Search Google Scholar for "predict distress" etc in finance journals.
Also, predicting companies that will lose value after an IPO (and shorting them. edit: Facebook!). There are known biases, in academic literature, that can be exploited.
Also, look into capital structure arbitrage. This is when the value of the stocks in a company suggest one valuation, but the value of the bonds or options suggest another value. Buy the cheap asset, short the expensive one.
Techniques include survival analysis, sequence analysis (Hidden Markov Models, Conditional Random Fields, Sequential Association Rules), and classification/regression.
And for the love of God, please read Fooled By Randomness by Taleb.
† shorting a stock usually involves calling your broker (that you have a good relationship with) and borrowing some shares of a company. Then you sell them to some poor bastard. Wait a while, hopefully the price has gone down, you buy some more of the shares and give them back to your broker.
My Advice to You:
There are several Machine Learning/Artificial Intelligence (ML/AI) branches out there:
http://www-formal.stanford.edu/jmc/whatisai/node2.html
I have only tried genetic programming, but in the "learning from experience" branch you will find neural nets. GP/GA and neural nets seem to be the most commonly explored methodologies for the purpose of stock market predictions, but if you do some data mining on Predict Wall Street, you might be able to utilize a Naive Bayes classifier to do what you're interested in doing.
Spend some time learning about the various ML/AI techniques, get a small data set, and try to implement some of those algorithms. Each one will have its strengths and weaknesses, so I would recommend that you try to combine them using a Naive Bayes classifier (or something similar).
My Experience:
I'm working on the problem for my Masters Thesis so I'll pitch my results using Genetic Programming: www.twitter.com/darwins_finches
I started live trading with real money on 09/09/09... yes, it was a magical day! I post the GP's predictions before the market opens (i.e., the timestamps on Twitter) and I also place the orders before the market opens. The profit for this period has been around 25%; we've consistently beaten the Buy & Hold strategy and we're also outperforming the S&P 500 with stocks that are under-performing it.
Some Resources:
Here are some resources that you might want to look into:
Max Dama's blog: http://www.maxdama.com/search/label/Artificial%20Intelligence
My blog: http://mlai-lirik.blogspot.com/
AI Stock Market Forum: http://www.ai-stockmarketforum.com/
Weka is a data mining tool with a collection of ML/AI algorithms: http://www.cs.waikato.ac.nz/ml/weka/
The Chatter:
The general consensus amongst "financial people" is that Artificial Intelligence is a voodoo science: you can't make a computer predict stock prices, and you're sure to lose your money if you try. Nonetheless, the same people will tell you that just about the only way to make money on the stock market is to build and improve your own trading strategy and follow it closely.
The idea of AI algorithms is not to build Chip and let him trade for you, but to automate the process of creating strategies.
Fun Facts:
I understand monkeys can pick better than most experts, so why not an AI? Just make it random and call it an "advanced simian Mersenne twister AI" or something.
RE: monkeys can pick better than most experts: apparently rats are pretty good too!
Much more money is made by the sellers of "money-making" systems than by the users of those systems.
Instead of trying to predict the performance of companies over which you have no control, form a company yourself and fill some need by offering a product or service (yes, your product might be a stock-predicting program, but something a little less theoretical is probably a better idea). Work hard, and your company's own value will rise much quicker than any gambling you'd do on stocks. You'll also have plenty of opportunities to apply programming skills to the myriad of internal requirements your own company will have.
If you want to go down this long, dark, lonesome road of trying to pick stocks, you may want to look into data mining techniques using advanced data mining software such as SPSS, SAS, or one of the dozen others.
You'll probably want to use a combination of technical indicators and fundamental data. The data will more than likely be highly correlated, so a feature-reduction technique such as PCA will be needed to reduce the number of features.
Also keep in mind that your data will constantly have to be updated, trimmed, and shuffled around, because market conditions will constantly be changing.
I've done research on this for a grad-level class, and basically I was somewhat successful at picking whether a stock would go up or down the next day, but the number of stocks in my data set was fairly small (200) and it covered a very short time frame with consistent market conditions.
What I'm trying to say is that what you want to code has been done in very advanced ways in software that already exists. You should be able to input your data into one of these programs and, using regression, decision trees, or clustering, do what you want to do.
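As a rough illustration of the PCA feature-reduction step mentioned above (the feature matrix is random stand-in data, not real technical or fundamental indicators):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    X = np.random.rand(200, 30)   # 200 stocks x 30 highly correlated features

    # Standardize first, since PCA is sensitive to feature scales
    X_scaled = StandardScaler().fit_transform(X)

    pca = PCA(n_components=0.95)           # keep 95% of the variance
    X_reduced = pca.fit_transform(X_scaled)
    print(X_reduced.shape)                 # fewer, decorrelated features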
I have been thinking of this for a few months.
I am thinking about Random Matrix Theory/Wigner's distribution.
I am also thinking of Kohonen self-organizing maps.
These comments on speculation and past performance apply to you as well.
I recently completed my master's thesis on deep learning and stock price forecasting. Basically, the current approach seems to be LSTM and other deep learning models. There are also 10-12 technical indicators (TIs) based on moving averages that have been shown to be highly predictive of stock prices, especially indexes such as the S&P 500, NASDAQ, DJI, etc. In fact, there are libraries such as pandas_ta for computing various TIs.
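For a flavour of those indicators, here are two moving-average-based ones computed with plain pandas (pandas_ta wraps these and many more behind a ready-made API); the price series here is synthetic:

    import numpy as np
    import pandas as pd

    close = pd.Series(100 + np.cumsum(np.random.randn(250)))  # fake daily closes

    sma_20 = close.rolling(20).mean()                  # simple moving average
    ema_20 = close.ewm(span=20, adjust=False).mean()   # exponential moving average

    crossover = (ema_20 > sma_20).astype(int)   # toy feature: 1 when EMA is above SMA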
I represent a group of academics that are trying to predict stocks in a general form that can also be applied to anything, even the rating of content.
Our algorithm, which we describe as truth seeking, works as follows.
Basically, each participant has their own credence rating: the higher your credence or credibility, the more your vote counts. Credence is worked out by how close each vote is to the credence-weighted average; you get a better credence value the closer you get to the average vote, which has itself already been adjusted for credence.
For example, let's say that everyone is predicting what a stock's value will be in 30 days' time (a futures option). People who predict close to the average get better credence. The key here is that the individual doesn't know what the average is; only the system does. The system is tweaked further by weighting the guesses, so that the target spot that yields the best credence is defined mostly by votes that are already endowed with more credence. So the smartest people (those historically accurate) set the sweet spot that is then used to decide who gets more credence.
The system can also be improved to adjust over time. For example, when you find out the actual value, the people who guessed it can be rewarded with higher credence. In cases where you can't know the future outcome, you can still take into account whether the weighted credence changes in the future. People can be rewarded even more if they spotted the trend early. The point is that we don't even need to know the outcome: the fact that the weighted rating changed later is enough to reward people who bet early on the sweet spot.
Such a system can be used to rate anything from stock prices, currency exchange rates or even content itself.
One such implementation asks people to vote with two parameters. One is their actual vote and the other is an assurity percentage, which basically means how confident a particular participant is in their vote. In this way, a person with high credence does not need to risk downgrading their credence when they are not sure of their bet; the bet can still be incorporated, it just won't sway the sweet spot as much if a low assurity is used. In the same vein, if a guess lands directly on the sweet spot with a low assurity, the voter won't gain as much benefit as they would have with a high assurity.
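Since the scheme is described only in words, here is one loose sketch of what the credence update might look like; every name and formula below is my own guess at a concrete version, not the group's actual algorithm:

    import numpy as np

    def weighted_consensus(votes, credence, assurity):
        """The 'sweet spot': votes averaged with weight credence * assurity."""
        w = credence * assurity
        return np.sum(w * votes) / np.sum(w)

    def update_credence(votes, credence, assurity, lr=0.1):
        """Raise credence for votes near the consensus and lower it for
        distant ones, scaled by each participant's own assurity."""
        target = weighted_consensus(votes, credence, assurity)
        closeness = 1.0 / (1.0 + np.abs(votes - target))  # 1.0 = dead on target
        return credence + lr * assurity * (closeness - 0.5)

    votes = np.array([102.0, 98.0, 150.0])   # predicted price in 30 days' time
    credence = np.array([1.0, 1.0, 1.0])
    assurity = np.array([0.9, 0.8, 0.3])     # self-reported confidence
    credence = update_credence(votes, credence, assurity)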
