In a typical machine learning task, you are given a set of input parameters (features) and an output parameter (target). Based on a set of input/output pairs, you train a model and later use that model to predict the output given the input.
My problem is somewhat different: I am given a set of input and output parameters (that part is identical) that have been recorded during a manufacturing process. (Actually, the input parameters are input values to a machine that produces some piece of equipment.) I should suggest to the operators of the machine a set of input parameters that will most likely yield the best output parameters.
Q1: Is this type of problem also called machine learning?
Q2: If not, what are these types of problems called?
It can be classed as Machine Learning ... but it would be better classified as Neural Networks. However, these both come under the umbrella term of Artificial Intelligence.
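To make the suggestion step concrete: one common approach is to fit a surrogate regression model on the recorded (settings, outcome) pairs and then search over candidate settings for the best predicted outcome. A minimal sketch, assuming entirely made-up data and a made-up quality function (the model choice, value ranges, and seed are all illustrative, not from the question):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical recorded process data: 200 runs, 3 machine settings each.
X = rng.uniform(0, 1, size=(200, 3))
# Assume quality peaks when the first setting is near 0.7 (made up for illustration).
y = -(X[:, 0] - 0.7) ** 2 + 0.1 * X[:, 1] + rng.normal(0, 0.01, size=200)

# 1) Fit a surrogate model that predicts quality from settings.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# 2) Search candidate settings and suggest the one with the best predicted quality.
candidates = rng.uniform(0, 1, size=(5000, 3))
best = candidates[np.argmax(model.predict(candidates))]
print("suggested settings:", best)
```

In practice you would restrict the candidate search to settings the machine can actually realise, and validate any suggestion on the real process before trusting it.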
Related
Given a trained system, a network can be run backward with output values and partial inputs to find the value of a missing input value. Is there a name for this operation?
For example, take a trained XOR network with 2 input neurons (with values 1 and X) and an output-layer neuron (with value 1). If someone wanted to find the value of the second input neuron, they could feed the information backwards and calculate that it would be close to 0. What exactly is this operation called?
I think your issue is related to feature extraction and feature selection, which is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. This article is also related to your issue.
The Backwards Pass:
The goal of backpropagation is to update each of the weights in the network so that they bring the actual output closer to the target output, thereby minimising the error for each output neuron and for the network as a whole. This is the step you wanted to know about, I guess.
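The operation the question actually asks about is often referred to as network inversion: fix the trained weights and search over the unknown input until the network reproduces the observed output. A minimal sketch using hand-set weights for a classic 2-2-1 sigmoid XOR network (the weights are textbook values chosen for illustration, not learned here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xor_net(x1, x2):
    # Hand-set weights for a classic 2-2-1 sigmoid XOR network.
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # ~OR gate
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)   # ~NAND gate
    return sigmoid(20 * h1 + 20 * h2 - 30)  # ~AND gate -> XOR

# Known: x1 = 1 and the target output is 1. Search for the missing x2.
grid = np.linspace(0, 1, 1001)
errors = [(xor_net(1.0, x2) - 1.0) ** 2 for x2 in grid]
x2_recovered = grid[int(np.argmin(errors))]
print(x2_recovered)  # close to 0
```

With larger networks you would typically do this by gradient descent on the input rather than a grid search, and note that the recovered input need not be unique.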
I built a neural network with inputs that are a mixture of integers and booleans, but it did not converge. I have seen many examples on the internet, and every one of them has inputs in boolean form. So is it possible to build a neural network with a mixture of inputs, or with integer inputs?
Indeed, it is. What you probably need to do is normalize your inputs. This means you could divide each feature's value by the maximum value you expect to see in that position, so that everything lies in the range (-1, 1).
Some links to understand normalization of inputs:
Why do we have to normalize the input for an artificial neural network?
https://www.researchgate.net/post/How_can_I_normalize_input_and_output_data_in_training_neural_networks
Another, more recent way to keep values normalized inside the network is batch normalization.
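For reference, the two most common schemes discussed in the links above are min-max scaling and z-score standardization. A minimal sketch on made-up rows mixing integer and boolean features:

```python
import numpy as np

# Made-up rows: [small integer, boolean (0/1), larger-scale integer]
X = np.array([[3, 1, 250],
              [7, 0, 900],
              [5, 1, 400]], dtype=float)

# Min-max scaling: map each column into [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Z-score standardization: zero mean, unit variance per column.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_minmax)
print(X_std)
```

Either scheme puts the integer and boolean columns on comparable scales, which is what the network needs; booleans are simply left as 0/1 by min-max scaling.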
I was reading about neural networks and found this:
"Many-state nominal variables are more difficult to handle. ST Neural Networks has facilities to convert both two-state and many-state nominal variables for use in the neural network. Unfortunately, a nominal variable with a large number of states would require a prohibitive number of numeric variables for one-of-N encoding, driving up the network size and making training difficult. In such a case it is possible (although unsatisfactory) to model the nominal variable using a single numeric index; a better approach is to look for a different way to represent the information."
This is exactly what is happening when I am building my input layer. One-of-N encoding is making the model very complex to design. However, it is mentioned above that you can use a numeric index, though I am not sure what he/she means by that. What is a better way to represent the information? Can neural networks solve a problem with many-state nominal variables?
References:
http://www.uta.edu/faculty/sawasthi/Statistics/stneunet.html#gathering
Solving this task is very often crucial for modeling. Depending on the complexity of the nominal variable's distribution, it is often truly important to find a proper embedding between its values and R^n for some n.
One of the most successful examples of such an embedding is word2vec, where a mapping between words and vectors is learned. In other cases you should either use a ready-made solution if one exists, or prepare your own by representation learning (e.g., with autoencoders or RBMs).
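To make the contrast in the quoted passage concrete, here is a minimal sketch (with made-up category labels) of the single numeric index versus one-of-N encoding. A learned embedding would instead replace each state with a dense, lower-dimensional vector learned from data, as word2vec does for words:

```python
import numpy as np

# A many-state nominal variable (hypothetical part codes).
parts = ["A12", "B07", "C33", "A12"]
states = sorted(set(parts))  # ["A12", "B07", "C33"]

# Single numeric index: compact, but imposes a fake ordering on the states.
index = np.array([states.index(p) for p in parts])

# One-of-N (one-hot) encoding: one column per state, grows with the state count.
onehot = np.eye(len(states))[index]

print(index)   # [0 1 2 0]
print(onehot)  # shape (4, 3)
```

The index column is why the quoted text calls this option unsatisfactory: the network sees "A12" < "B07" < "C33" as if the codes were ordered quantities, which they are not.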
I have this 5-5-2 backpropagation neural network I'm training, and after reading this awesome article by LeCun I started to put in practice some of the ideas he suggests.
Currently I'm evaluating it with a 10-fold cross-validation algorithm I made myself, which goes basically like this:
for each epoch
    for each possible split (training, validation)
        train and validate
    end
    compute mean MSE between all k splits
end
My inputs and outputs are standardized (0 mean, variance 1) and I'm using a tanh activation function. All the network algorithms seem to work properly: I used the same implementation to approximate the sin function, and it does so pretty well.
Now, the question is as the title implies: should I standardize each train/validation set separately or do I simply need to standardize the whole dataset once?
Note that if I do the latter, the network doesn't produce meaningful predictions, but I prefer having a more "theoretical" answer than just looking at the outputs.
By the way, I implemented it in C, but I'm also comfortable with C++.
You will most likely be better off standardizing each training set individually. The purpose of cross-validation is to get a sense for how well your algorithm generalizes. When you apply your network to new inputs, the inputs will not be ones that were used to compute your standardization parameters. If you standardize the entire data set at once, you are ignoring the possibility that a new input will fall outside the range of values over which you standardized.
So unless you plan to re-standardize every time you process a new input (which I'm guessing is unlikely), you should only compute the standardization parameters for the training set of the partition being evaluated. Furthermore, you should compute those parameters only on the training set of the partition, not the validation set (i.e., each of the 10-fold partitions will use 90% of the data to calculate standardization parameters).
So you assume the inputs are normally distributed, and you subtract the mean and divide by the standard deviation to get N(0,1)-distributed inputs?
Yes, I agree with #bogatron that you should standardize each training set separately, but I would say more strongly that it is a "must" not to use the validation set's data too. The problem is not values outside the range seen in the training set; that is fine, since the transformation to a standard normal is defined for any value. You can't compute the mean / standard deviation over all the data because you can't in any way use the validation data in training, even if only through these statistics.
It should further be emphasized that you use the mean from the training set with the validation set, not the mean from the validation set. It has to be the same transformation of features that was used during training. It would not be valid to transform the validation set differently.
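A minimal sketch of this per-fold procedure, using made-up data and one hypothetical 90/10 split: the mean and standard deviation are computed on the training split only, and the same transformation is then applied to the validation split:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(5, 2, size=(100, 5))  # made-up dataset

# One hypothetical 90/10 split of a 10-fold partition.
train, val = X[:90], X[90:]

# Fit the standardization parameters on the TRAINING split only...
mu = train.mean(axis=0)
sigma = train.std(axis=0)

# ...and apply the SAME transformation to both splits.
train_std = (train - mu) / sigma
val_std = (val - mu) / sigma

# The training split is exactly standardized; the validation split is only
# approximately so, which is expected and correct.
print(train_std.mean(axis=0).round(6))
```

In a full cross-validation loop you would recompute `mu` and `sigma` inside each fold, since each fold has a different training split.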
I have 20 numeric input parameters (or more) and a single output parameter, and I have thousands of these records. I need to find the relation between the input parameters and the output parameter. Some of the input parameters might not relate to the output parameter, or possibly none of them do. I want some magic system that can statistically calculate the output parameter when I provide all the input parameters, and it would be even better if this system also provided a confidence score with the output.
What technique (in machine learning) do I need to use to solve this problem? I think it should be a neural network, a genetic algorithm, or something related, but I'm not sure. Beyond that, I need to know the limitations of the technique.
Thanks,
Your question seems to simply describe the regression problem, which can be solved by numerous algorithms and models, not just neural networks.
Support Vector Regression
Neural Networks
Linear regression (and its many modifications and generalizations), fitted for example with the OLS method
Nearest Neighbours Regression
Decision Tree Regression
many, many more!
Simply look for "regression methods", "regression models", etc. In particular, the sklearn library implements many of these methods.
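A minimal sketch of that interchangeability using made-up data, where only 2 of the 20 inputs actually affect the output. Every model below shares the same fit/predict interface; if you also want an uncertainty estimate with each prediction, look at models such as Gaussian process regression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 20))                        # 20 numeric inputs
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, size=500)  # only 2 matter

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for model in (LinearRegression(),
              KNeighborsRegressor(),
              DecisionTreeRegressor(random_state=0)):
    name = type(model).__name__
    scores[name] = model.fit(X_tr, y_tr).score(X_te, y_te)  # R^2 on held-out data
    print(name, round(scores[name], 3))
```

The held-out R^2 score is one simple way to check which model actually captures the input/output relation, and the irrelevant inputs are exactly what hurts the nearest-neighbour model here.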
I would recommend Genetic Programming (GP), a genetics-based machine learning approach in which the learnt model is a single mathematical expression/equation that best fits your data. Most GP packages out there come with a standard regression suite which you can run "as is" on your data, with minimal setup cost.