Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I've been training a convolutional network with 7 learnable layers for about 20 hours. What are some general ways to tell if the network has converged or still needs training?
Here's a histogram of the parameters of the first convolutional layer:
Here's a graph of the loss and accuracy of the training and test set:
Broadly, as long as both the training and test scores keep improving, you are on the right track and still moving toward a local/global minimum. When the curves diverge (training loss keeps falling while the test loss starts rising), or both flatten out, it's time to stop.
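The stopping rule above can be sketched as a simple patience check. This is a generic sketch, not tied to any framework, and the patience of 5 epochs is an arbitrary choice:

```python
def should_stop(val_losses, patience=5):
    """Patience-based early stopping: stop once the validation loss
    has not improved for `patience` consecutive epochs."""
    best = min(val_losses)
    epochs_since_best = len(val_losses) - 1 - val_losses.index(best)
    return epochs_since_best >= patience

# Validation loss plateaus, then rises -> stop.
print(should_stop([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65]))  # True
```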
BUT
If you use accuracy as your evaluation metric, the model can show anomalous behavior that accuracy hides: for example, the network may output the majority class for every input. Switching to another metric, such as F1 or log loss, will expose such problems during training.
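A minimal illustration of that failure mode, using made-up labels: a model that always predicts the majority class scores high accuracy but zero F1.

```python
import numpy as np

# Hypothetical imbalanced labels: 90 negatives, 10 positives.
y_true = np.array([0] * 90 + [1] * 10)
# A degenerate model that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = (y_true == y_pred).mean()  # looks great: 0.9

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(accuracy)  # 0.9
print(f1)        # 0.0 -- exposes the degenerate behavior
```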
Also, for imbalanced data you can use strategies to counter the negative effects of the imbalance, such as passing class weights to softmax_cross_entropy in TensorFlow.
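The class-weighting idea can be sketched in plain NumPy. This is an illustrative reimplementation of the concept, not TensorFlow's actual function, and the weight values are arbitrary:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross-entropy: the idea behind passing
    weights to a softmax cross-entropy loss. Errors on the
    upweighted (minority) class are penalized more heavily."""
    n = len(labels)
    picked = probs[np.arange(n), labels]  # probability of the true class
    w = class_weights[labels]             # per-sample weight from its class
    return -(w * np.log(picked)).sum() / w.sum()

probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
labels = np.array([0, 1, 1])
# Upweight class 1 (the minority class); 5.0 is an illustrative choice.
loss = weighted_cross_entropy(probs, labels, np.array([1.0, 5.0]))
```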
I have to train a reinforcement learning agent (represented by a neural network) whose environment has a dataset where outliers are present.
How should I normalize the data to the range [-1, 1] given those outliers?
I need to keep the outliers in the dataset because they're critical: despite lying outside the normal range, they can be genuinely significant in some circumstances.
So deleting those rows entirely is not an option.
Currently, I'm trying to normalize the dataset by using the IQR method.
I fear that with the outliers still present, the agent will take certain actions only when it encounters them.
I have already seen a trained agent take the same actions over and over while ignoring the rest.
What does your experience suggest?
After some tests, I took this road:
Applied Z-score normalization with the "Robust" option (median and IQR), which roughly centers the data with unit spread.
Calculated (min_range(feature) + max_range(feature)) / 2.
Divided all the feature data by the midpoint calculated in step 2.
The agent learned pretty well.
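If it helps, here is one way the steps above could look in code. This is a sketch under the assumption that "Robust" means median/IQR standardization; the exact behavior of the original tool isn't specified, and dividing by the range midpoint compresses values rather than strictly bounding them to [-1, 1].

```python
import numpy as np

def robust_scale_to_range(x):
    """Sketch of the recipe above:
    1. robust z-score (median / IQR instead of mean / std),
    2. divide by the midpoint of the resulting min/max range,
       compressing values while keeping outliers ordered."""
    q1, q3 = np.percentile(x, [25, 75])
    z = (x - np.median(x)) / (q3 - q1)   # robust standardization
    midpoint = (z.min() + z.max()) / 2   # step 2: range midpoint
    return z / midpoint if midpoint != 0 else z

scaled = robust_scale_to_range(np.array([1.0, 2.0, 3.0, 4.0, 100.0]))
```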
I am working on making predictions and decisions based on stocks and crypto data.
First I implemented a decision tree model and got a model accuracy of 0.5. After some research I found that a single decision tree is not enough, so I tried to improve on it with a random forest and AdaBoost.
I now have the three algorithms above trained on the same training and test data, and I get three different results.
The question is: can I make the three algorithms work together by combining them in some way, so each benefits from the others' results?
You can combine classifiers, yes. This is considered an ensemble. It's a bit weird to make an ensemble from a decision tree and a random forest, though. A random forest is an ensemble of decision trees. That's why it's called a forest.
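For completeness, the simplest way to combine the predictions of several classifiers is a majority vote. This is a hand-rolled sketch with made-up predictions; scikit-learn's VotingClassifier implements the same idea:

```python
import numpy as np

def majority_vote(predictions):
    """Combine the class predictions of several classifiers by
    majority vote, the simplest form of an ensemble.
    `predictions` is a (n_models, n_samples) integer array."""
    predictions = np.asarray(predictions)
    # For each sample, pick the most frequently predicted class.
    return np.array([np.bincount(col).argmax() for col in predictions.T])

# Hypothetical outputs of three models on five samples:
preds = [[1, 0, 1, 1, 0],   # decision tree
         [1, 1, 1, 0, 0],   # random forest
         [0, 0, 1, 1, 0]]   # AdaBoost
print(majority_vote(preds))  # [1 0 1 1 0]
```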
I only know linear regression and nothing else. Any shorthand explanations or tricks would do.
seed is just a number that initializes the random number generator, so that runs are reproducible. Just put your lucky number there.
n_estimators is a hyperparameter that determines how many trees/estimators are built within the ensemble model. The more you use, the more accurate the model tends to be, because of the additive nature of the gradient boosting algorithm. The downside is that a larger n_estimators means longer training time and, potentially, overfitting to your training data, although given the nature of the algorithm it may not.
Another thing to consider about n_estimators is that you can often reach a good score with relatively few estimators (e.g. 300 or 500); beyond that point, adding many more (e.g. 2000) contributes little besides a potential overfit.
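Both points can be seen with scikit-learn's GradientBoostingClassifier on toy data. The dataset and the particular values (100 estimators, seed 42) are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Toy data; sizes and seeds here are illustrative choices.
X, y = make_classification(n_samples=200, random_state=0)

# Same seed (random_state) -> identical models and predictions.
a = GradientBoostingClassifier(n_estimators=100, random_state=42).fit(X, y)
b = GradientBoostingClassifier(n_estimators=100, random_state=42).fit(X, y)
assert (a.predict(X) == b.predict(X)).all()

# staged_predict gives the prediction after each added tree,
# showing how the gains flatten as n_estimators grows.
for i, pred in enumerate(a.staged_predict(X), start=1):
    if i in (10, 50, 100):
        print(i, (pred == y).mean())
```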
Is it possible to use logistic regression to identify prime numbers?
I'm trying to build a system using supervised logistic regression with a predefined database of numbers and their classification (1 = prime, 0 = not prime). Using this data, I want the computer to classify other numbers that aren't in the database.
Is this possible, or am I trying to do something impossible?
Given the right network configuration and enough time, I don't know why it would be impossible.
It seems others have had success with different models and you might get a better idea from them:
Early success on prime number testing via artificial networks is presented in A Compositional Neural-network Solution to Prime-number Testing, László Egri, Thomas R. Shultz, 2006. The knowledge-based cascade-correlation (KBCC) network approach showed the most promise, although the practicality of this approach is eclipsed by other prime detection algorithms, which usually begin by checking the least significant bit, immediately halving the search, and then test candidates based on other theorems and heuristics up to floor(sqrt(x)). The work was continued in Knowledge Based Learning with KBCC, Shultz et al., 2006.
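For a sense of why plain logistic regression struggles here, below is a hypothetical experiment that encodes each number as its binary digits. The feature encoding is an assumption on my part; the question doesn't specify one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def is_prime(n):
    """Trial-division primality check, used only to build labels."""
    if n < 2:
        return 0
    return int(all(n % d for d in range(2, int(n**0.5) + 1)))

# Assumed encoding: each number represented by its 11 binary digits.
N = np.arange(2, 2000)
X = np.array([[(n >> b) & 1 for b in range(11)] for n in N])
y = np.array([is_prime(n) for n in N])

clf = LogisticRegression(max_iter=1000).fit(X, y)
acc = clf.score(X, y)
# A linear model mostly rediscovers the least-significant-bit rule
# (even => not prime) and otherwise leans on the class imbalance;
# it cannot capture primality itself.
```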
I am creating classification and regression models using Random Forest (DRF) and GBM in H2O.ai. I believe I don't need to normalize (or scale) the data, since it's unnecessary and possibly even harmful, as it might smooth out the nonlinear nature of the model. Could you please confirm whether my understanding is correct?
You don't need to do anything to your data when using H2O: all algorithms handle numeric/categorical/string columns automatically. Some methods do internal standardization automatically, but the tree methods don't, and they don't need to (a split at age > 5 or income < 100000 works fine either way). Whether scaling is "harmful" depends on what you're doing; usually it's a good idea to let the algorithm do the standardization, unless you know exactly what you are doing. One example is clustering, where distances depend on the scaling (or lack thereof) of the data.
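The scale-invariance of tree splits is easy to verify, for example with scikit-learn trees. This is an illustrative check on synthetic data, not H2O-specific:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Monotonic (affine) rescaling of each feature, e.g. standardization.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

a = DecisionTreeClassifier(random_state=0).fit(X, y)
b = DecisionTreeClassifier(random_state=0).fit(X_scaled, y)

# Tree splits are threshold comparisons, so a monotonic transform
# of each feature leaves the learned partition, and hence the
# predictions, unchanged.
assert (a.predict(X) == b.predict(X_scaled)).all()
```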