I am building a hybrid recommender system and then evaluating the model.
I want to make sure whether this is how we calculate the MAE for evaluation (see image),
where the predictions come from the model trained on the training dataset and the actual ratings come from the test set:
MAE = (1/N) * sum_{i=1}^{N} | predicted_rating_i - actual_rating_i |
thanks
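A minimal sketch of that MAE computation (hedged; the rating arrays below are purely illustrative placeholders, with predictions from a model fitted on the training set compared against held-out test ratings):

import numpy as np

# Hypothetical arrays: model predictions for the (user, item) pairs in the test set,
# and the corresponding actual ratings held out in the test set.
predicted_ratings = np.array([3.8, 4.2, 2.5, 4.9])  # from the model trained on the training set
actual_ratings = np.array([4.0, 5.0, 2.0, 4.5])     # ground-truth ratings from the test set

# MAE = mean of the absolute differences between prediction and truth.
mae = np.mean(np.abs(predicted_ratings - actual_ratings))
print(f"MAE = {mae:.3f}")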
I collected ~1500 labelled images and trained with YOLOv3, getting a training loss of ~10 and a validation loss of ~16. Obviously we can use real test data to evaluate the model's performance, but I am wondering whether there is a way to tell if this training loss of 10 is a "good" one, or whether it indicates that I need more training data to see if I can push it down to 5 or even less.
Ultimately my question is: for a well-known model with a pre-defined loss function, is there a "good" standard value for the training loss?
thanks.
You need to train your weights until the average loss becomes 0.0XXXXX. That is the minimal requirement for detecting objects with a matching anchor IoU.
Update: 28th Nov, 2018
While training an object detection model, the loss can vary quite a bit with a large dataset. What you really need to calculate is the mean Average Precision (mAP), which gives the accuracy criterion for the trained model:
./darknet detector map .data .cfg .weights
If your mAP is close to 1.0, i.e. 100%, the model is performing well.
Follow this link to learn more about mAP:
https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173
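As a rough illustration of the idea behind average precision (ignoring the IoU-matching step that object-detection mAP adds on top), scikit-learn's average_precision_score summarises a precision-recall curve from scored predictions; detection mAP then averages that per-class value over all classes. The numbers below are made up:

from sklearn.metrics import average_precision_score
import numpy as np

# Hypothetical per-detection results for a single class:
# 1 = the detection matched a ground-truth box, 0 = false positive.
y_true = np.array([1, 0, 1, 1, 0, 1, 0])
# Confidence scores the detector assigned to those detections.
y_score = np.array([0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.30])

ap = average_precision_score(y_true, y_score)  # area under the precision-recall curve
print(f"AP for this class: {ap:.2f}")
# Detection mAP would be the mean of such AP values over all object classes.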
Your validation loss is a good indicator of whether the training loss can be reduced further. I don't have a one-shot solution; you will have to tweak hyper-parameters, check on the validation set, and iterate. You can also get a good idea by looking at the loss curve: was it still decreasing when you stopped training, or had it flattened out? That tells you how the training has progressed so you can make changes accordingly. Good luck.
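A minimal sketch of that loss-curve inspection (assuming you have per-epoch training and validation losses logged somewhere; the values below are placeholders):

import matplotlib.pyplot as plt

# Hypothetical per-epoch losses pulled from a training log.
train_loss = [45, 30, 22, 16, 13, 11, 10.5, 10.2, 10.1, 10.0]
val_loss = [48, 34, 26, 21, 18, 17, 16.5, 16.2, 16.1, 16.0]

epochs = range(1, len(train_loss) + 1)
plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()

# If both curves are still falling, more training (or more data) may help;
# if training loss keeps falling while validation loss plateaus or rises,
# the model is likely starting to overfit.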
I am new to machine learning and am currently working on a classification problem. I am able to train the model and predict on test data sets. I want to know whether there is some way to get scores along with the predictions. By scores, I mean proximity scores attached to each prediction. For example, in the standard age-salary-buy problem (based on age and salary, will the customer buy the product or not), I want to know a score out of 100 that the customer will buy the product, in addition to the prediction of whether he will buy it or not.
Currently, I am using the LibSVM algorithm. Is there some algorithm that provides the above?
Thanks.
What you are looking for is the support of the decision. In other words, many classifiers base their decision about the class of x over the label set Y on:
cl(x) = arg max_{y \in Y} p(y|x)
where p(y|x) is the classifier's internal estimate of the probability that x has label y. Such classifiers include:
neural networks (with sigmoid output)
logistic regression
naive bayes
voting ensembles (such as RF)
...
These outputs can easily be converted to your 0-100 scale, since the probability is on a 0-1 scale.
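For example, with scikit-learn (a hedged sketch; any of the probabilistic classifiers listed above exposes predict_proba the same way), converting the 0-1 probability to a 0-100 score is just a multiplication:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy age / salary (in thousands) / buy data, purely illustrative.
X = np.array([[25, 30], [40, 80], [35, 50], [50, 120], [23, 20],
              [45, 90], [29, 35], [52, 110], [31, 40], [48, 95]])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])  # 1 = bought, 0 = did not buy

clf = LogisticRegression().fit(X, y)

new_customer = np.array([[30, 60]])
p_buy = clf.predict_proba(new_customer)[0, 1]  # probability of the "buy" class, in [0, 1]
print(f"prediction: {clf.predict(new_customer)[0]}, score: {p_buy * 100:.0f}/100")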
Others (such as SVM) use a measure proportional to the probability, but unbounded. Here you can get this value (often called the decision function), but you cannot convert it to a 0-100 score, as you do not have a "maximum" value. This is a big drawback, so some modifications have been proposed. In particular, for SVM there is Platt's scaling, which fits a logistic regression on top of the SVM so you get a probability estimate. In libSVM you can set -b to get probability estimates.
From the libSVM website:
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
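If you use scikit-learn's wrapper around libSVM instead of calling it directly, the equivalent switch is probability=True on SVC, which fits Platt scaling internally (a hedged sketch with the same toy data idea as above):

import numpy as np
from sklearn.svm import SVC

# Same toy age / salary (in thousands) / buy data as above.
X = np.array([[25, 30], [40, 80], [35, 50], [50, 120], [23, 20],
              [45, 90], [29, 35], [52, 110], [31, 40], [48, 95]])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# probability=True makes scikit-learn's libSVM wrapper fit Platt scaling,
# analogous to libSVM's "-b 1" option.
svm = SVC(kernel="linear", probability=True).fit(X, y)

new_customer = np.array([[30, 60]])
print("decision function:", svm.decision_function(new_customer))    # unbounded margin value
print("probability of buy:", svm.predict_proba(new_customer)[0, 1])  # bounded 0-1, Platt-scaled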
Do you have any reading recommendations on correcting forecast bias? For example, I use an ARIMA model to predict a time series. Is there a way, based on the backtesting results, to correct the bias of the forecast?
How to handle the ever-present bias / overfit struggle? Use a tactical methodology:
One principal approach is to systematically tune a Predictor (be it ARIMA or some other) via a two-step procedure.
You have to split the available DataSET into two parts so as to emulate a near "Future": "hide" the second part of the DataSET -- say about 20-30% of the observations -- from the process of [1] Training, and use it in a step [2] called CrossValidation of the predictions.
This methodology allows one to search both the StateSPACE of the Predictor engine's configurations and the data-related bias/overfit. Some use only the former part of the minimiser search (lowest error / highest utility function), some only the latter (like Leo Breiman's RandomForest modification of the ensemble-based method), and some use both.
Step 1: Train a pre-configured Predictor on aTrainingSubPartOfAvailableDataSET.
Step 2: Once such a configuration of the Predictor has been trained, cross-validate this configuration's ability to predict against aCrossValidationSubPartOfAvailableDataSET, not seen during training (Step 1), to observe the bias / overfit artefacts and move towards the lowest cross-validation error / best-generalising region of plausible configuration settings.
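A hedged sketch of that two-step idea applied to ARIMA forecast-bias correction (statsmodels is assumed; the order=(1, 1, 1) and the 80/20 split are arbitrary illustrative choices):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical series; in practice this would be your observed data.
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(0.5, 1.0, size=200))

# Step 1: hide the last ~20% of observations and train only on the rest.
split = int(len(series) * 0.8)
train, holdout = series[:split], series[split:]
fit = ARIMA(train, order=(1, 1, 1)).fit()

# Step 2: cross-validate against the hidden part and measure the forecast bias
# (mean error) on that unseen span.
forecast = fit.forecast(steps=len(holdout))
bias = np.mean(forecast - holdout)
print(f"mean forecast bias on the hold-out span: {bias:.3f}")

# A simple correction: subtract the estimated bias from future forecasts
# (in practice you would re-fit on the full series before forecasting ahead).
corrected_future = fit.forecast(steps=10) - bias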
I programmed my own classifier in Python and tested it on a text corpus using the F1 measure. Now I want to test it on other data mining tasks, so I have my classifier's output file for a given corpus and I want to measure its quality using Weka's different measures. How can I pass the output file to Weka and get the quality measures?
I think the correct procedure should be some sort of n-fold validation: divide your data set into training and test sets. Develop the model on the training set and calculate its sum of squared errors, SSE(train).
Then take the model, run the test data through it, and calculate SSE(test) using the predicted and actual response values. That will help you assess the accuracy and bias of your model.
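A minimal sketch of that procedure with a plain linear model (scikit-learn assumed; the data is random and only shows where SSE(train) and SSE(test) come from):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Toy regression data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LinearRegression().fit(X_train, y_train)

sse_train = np.sum((model.predict(X_train) - y_train) ** 2)  # SSE(train)
sse_test = np.sum((model.predict(X_test) - y_test) ** 2)     # SSE(test)
print(f"SSE(train) = {sse_train:.2f}, SSE(test) = {sse_test:.2f}")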
Have a look at Elements of Statistical Learning Using R.
I built a classifier with 13 features (no binary ones), normalized individually for each sample using the scikit-learn tool (Normalizer().transform).
When I make predictions it predicts all training samples as positive and all test samples as negative (irrespective of whether they are actually positive or negative).
What anomalies should I focus on: my classifier, the features, or the data?
Notes: 1) I normalize the test and training sets (individually for each sample) separately.
2) I tried cross-validation but the performance is the same.
3) I used both SVM linear and RBF kernels.
4) I also tried without normalizing, but got the same poor results.
5) I have the same number of positive and negative training examples (400 each), and 34 positive and 1000+ negative test samples.
If you're training on balanced data, the fact that it predicts all training samples as positive is probably enough to conclude that something has gone wrong.
Try building something very simple (e.g. a linear SVM with one or two features) and look at the model as well as a visualization of your training data; follow the scikit-learn example: http://scikit-learn.org/stable/auto_examples/svm/plot_iris.html
There's also a possibility that your input data has many large outliers impacting the transform process...
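In that case it may be worth comparing the per-sample Normalizer with a per-feature scaler fitted only on the training data, such as StandardScaler (or RobustScaler if outliers dominate). A hedged sketch; the arrays are placeholders for your real 13-feature matrices:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder arrays; substitute your real 13-feature training/test matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(800, 13))
y_train = np.array([0, 1] * 400)
X_test = rng.normal(size=(100, 13))

# Fit the scaler on the training data only, then apply the same transform to the test data.
scaler = StandardScaler().fit(X_train)  # swap in RobustScaler() if large outliers dominate
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

clf = SVC(kernel="linear").fit(X_train_scaled, y_train)
predictions = clf.predict(X_test_scaled)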
Try doing feature selection on the training data (separately from your test/validation data).
Feature selection on your whole dataset can easily lead to overfitting.
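One way to keep the selection inside the training folds is to wrap it in a Pipeline, so that cross-validation re-runs the feature selection on each training split rather than on the whole dataset (a hedged sketch with placeholder data):

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Placeholder data with 13 features; substitute your real training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(800, 13))
y = rng.integers(0, 2, size=800)

# The feature selection happens inside each cross-validation training fold,
# so the held-out fold never influences which features are kept.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=5)),  # k=5 is an arbitrary illustrative choice
    ("svm", SVC(kernel="linear")),
])

scores = cross_val_score(pipe, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())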