I'm currently working on the Cifar-10 tutorial of tensorflow. I'd like to change the evaluation such that I can see for each image what the prediction of my model was, and whether it was true/false. I struggle with the first part: if I print the predictions (sess.run([top_k_op])) I get true/false values which I assume are whether the prediction was correct or not. However, if I try to print the actual prediction (I tried so far to print the logits, and print the top_k_op tensor), I get some numbers or values, but nothing that looks like the labels. What do I have to change about my code to actually see the labels that my model predicted?
You want to evaluate the logits first. These are the unnormalized class scores that come out of your network (applying softmax to them gives a probability distribution over your classes). The index of the highest value gives you the most likely class for each image.
You can use tf.argmax to get that index and then use the index to look up the label name and print it:
print(labels[index])
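As a minimal sketch of the argmax step, the snippet below uses NumPy's np.argmax as a stand-in for tf.argmax (both return the index of the largest value along an axis). The logits and true labels are made-up values for two images, not output of the tutorial's model; the class names follow the standard CIFAR-10 ordering.

```python
import numpy as np

# CIFAR-10 class names in label order 0..9.
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Hypothetical logits for a batch of two images (one row per image).
logits = np.array([
    [0.1, 2.3, -1.0, 0.4, 0.0, -0.5, 0.2, 1.1, 0.3, -2.0],
    [-0.3, 0.0, 0.5, 3.1, 0.2, 2.9, -1.2, 0.1, 0.0, 0.4],
])
# Hypothetical ground-truth labels for the same two images.
true_labels = np.array([1, 5])

# Index of the highest score per row is the predicted class.
predicted = np.argmax(logits, axis=1)

for pred, true in zip(predicted, true_labels):
    print("predicted:", class_names[pred],
          "| actual:", class_names[true],
          "| correct:", bool(pred == true))
```

In the TensorFlow graph, the same idea would be to fetch tf.argmax(logits, 1) together with the labels in one sess.run call, so that predictions and ground truth come from the same batch.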
You can figure out an answer by looking here
In svhn.py, at line 116 the predicted label is printed: print (step, int(test_labels[0]))
I did it in a clear way by using:
classification = sess.run(top_k_predict_op)
print (step, int(test_labels[0]))
print("network predicted:", classification[0], "for real label:", test_labels)
Be sure that you are predicting on 24*24 images, in case you trained your model with the original version of the TensorFlow CIFAR-10 model.
I'm designing a multivariate time series model. I feed 5 features into an LSTM model and try to predict 1 variable (whose value depends on itself and on the other 4 features).
For that I'm doing the feature scaling as follows:
#Features Scaling
`from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range = (0,1))
training_set_scaled = sc.fit_transform(training_set)
print(training_set_scaled)`
Output:-
At the output of the model, I got the predicted value as:
However, when I tried to inverse transform it as:
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
I got the following error:
non-broadcastable output operand with shape (65,1) doesn't match the broadcast shape (65,5)
Please help. Thank you in advance :)
The problem is that you use sc to min-max-scale the five features. Therefore, sc can only be used to inverse transform an array with the same five columns (the scaled features you show as output), which would give you back the original feature values.
The label (model output) is independent of that. You can scale your dependent variable as well, but you do not have to, and if you do, use a separate scaler object rather than the same one.
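A minimal sketch of the separate-scaler idea, assuming the variable to predict is column 0 of training_set. The data is random, the names feature_scaler and target_scaler are illustrative, and predicted_scaled is only a stand-in for what the model's predict call would return.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 100 time steps, 5 features.
# Assumption: column 0 is the variable the model predicts.
rng = np.random.default_rng(0)
training_set = rng.uniform(10.0, 200.0, size=(100, 5))
target = training_set[:, [0]]  # keep 2-D shape (100, 1)

# One scaler for the 5 input features, a second one for the target only.
feature_scaler = MinMaxScaler(feature_range=(0, 1))
target_scaler = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = feature_scaler.fit_transform(training_set)
target_scaled = target_scaler.fit_transform(target)

# Stand-in for the model output: 65 scaled predictions of shape (65, 1).
predicted_scaled = rng.uniform(0.0, 1.0, size=(65, 1))

# The target scaler was fitted on one column, so shapes match here.
predicted_stock_price = target_scaler.inverse_transform(predicted_scaled)
print(predicted_stock_price.shape)
```

Calling feature_scaler.inverse_transform on the (65, 1) array instead would fail, because that scaler was fitted on five columns.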
So I have an output vector of dim=7 and 4 possible classes for each position, so my question is, is it possible to feed the keras model a vector of one hot vectors, where each position of the vector is a one hot vector? something like this [[1000],[1000],[0100],[0010],[0001],[0001],[0010]].
If this is not possible are there any alternatives?
If you want the output of your model to look exactly like that when using model = keras.models.Model(...), that is not possible: a hard one-hot output behaves like a step response ("[1000] => [0000]"), which has an infinite gradient at the step and a gradient of 0 everywhere else.
What people do instead is create a model that outputs a distribution over the classes, select the class with the highest probability as the predicted value, and optimize the model with a cross-entropy loss. For example, instead of the output [1,0,0,0] you will get something like [0.9,0.01,0.01,0.08]. You then pick the first class as the predicted value. This makes sure that your model has a gradient at every point.
So if you really want your model to have dim = 7 with 4 different values each, you can create an output of size 28 = 7*4 with sigmoid activation, then treat the first 4 values as position 1 (for example using np.argmax(output[0:4])), the next 4 as position 2, and so on.
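The decoding step described above can be sketched with NumPy alone. The target classes are taken from the example in the question; flat_output is a made-up vector that imitates 28 sigmoid activations and is not produced by a real Keras model.

```python
import numpy as np

positions, classes = 7, 4

# Class index per position, matching the one-hot example in the question:
# [1000],[1000],[0100],[0010],[0001],[0001],[0010]
target_classes = np.array([0, 0, 1, 2, 3, 3, 2])

# Build the (7, 4) one-hot matrix and flatten it to a 28-element target.
one_hot = np.eye(classes)[target_classes]
flat_target = one_hot.reshape(-1)

# Hypothetical network output: values near 0.9 where the target is 1,
# small values elsewhere (stand-in for 28 sigmoid activations).
rng = np.random.default_rng(0)
flat_output = flat_target * 0.9 + rng.uniform(0.0, 0.05, size=positions * classes)

# Reshape back to (7, 4) and take the argmax of each group of 4.
decoded = flat_output.reshape(positions, classes).argmax(axis=1)
print(decoded)
```

Reshaping to (7, 4) and calling argmax along axis 1 is equivalent to applying np.argmax to output[0:4], output[4:8], and so on, one slice at a time.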
I have a neural network with an input layer of 10 nodes, some hidden layers and an output layer with only 1 node. I put a pattern into the input layer, and after some processing the output neuron produces a value, which is a number from 1 to 10. After training, this model is able to produce the output for a given input pattern.
Now, my question is, if it is possible to calculate the inverse model: This means, that I provide a number from output side, (i.e. using output side as input) and then getting the random pattern from those 10 input neurons (i.e. using input as output side).
I want to do this because I will first train a network on basis of difficulty of pattern (input is the pattern and output is difficulty to understand the pattern). Then I want to feed the network with a number so it creates the random patterns on basis of difficulty.
I hope I understood your problem correctly, so I will summarize it in my own words: You have a given model, and want to determine the input which yields a given output.
Assuming this is correct, I know of at least one way to do this approximately. It is very easy to implement but might take a while to compute a value. There are probably better ways, but I am not sure (I needed this technique a few weeks ago for reinforcement learning and did not find anything better).
Let's assume that your trained model maps an input to an output. We now create a new model, the inverse model, which will later give you an input that yields a specific output. To construct it, we add one plain Dense layer whose output has the same dimension m as the original model's input, and connect this layer to the input of the trained model. Next, make all weights of the trained model non-trainable (this is very important!).
We are now set up to find an inverse value. Suppose you want to find an input corresponding to the output y (corresponding here means that it produces this output; it is not unique). Create a new input vector v which is the unity. Then create an input-output data pair (v, y). Now use any optimizer you wish to propagate this training pair through the network until the error converges to zero. Once this has happened, you can calculate the real input that gives the output y: if the weights of the new input layer are called w and the bias is b, the desired input u is u = w*1 + b (where 1 is the unity input v).
You might ask why this equation holds, so let me try to answer: your model will try to learn the weights of the new input layer so that the unity as input creates the given output. As only the newly added input layer is trainable, only these weights are changed. Therefore, each weight in this vector represents the corresponding component of the desired input vector. By using an optimizer and minimizing the l^2 distance between the wanted output and the output of the inverse model, we finally obtain a set of weights that gives a good approximation of the input vector.
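The procedure can be sketched in plain NumPy. This is only an illustration under several assumptions: the "trained" model is a small random tanh network standing in for the real one, the gradient is written out by hand instead of using a framework optimizer, and all names (W1, W2, unity, lr) are hypothetical. The target output is generated from a known input so that at least one solution exists.

```python
import numpy as np

rng = np.random.default_rng(0)
m, hidden = 10, 16

# Frozen stand-in for the already trained model (never updated below).
W1 = rng.normal(0.0, 0.5, size=(hidden, m))
b1 = rng.normal(0.0, 0.5, size=hidden)
W2 = rng.normal(0.0, 0.5, size=hidden)
b2 = 0.0

def model(x):
    """Forward pass: returns the scalar output and the hidden activations."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

# Desired output y, produced from a known input so that it is reachable.
y_target, _ = model(rng.uniform(-1.0, 1.0, size=m))

# Trainable input layer: weights w and bias b, fed with the constant 1.
w = np.zeros(m)
b = np.zeros(m)
unity = 1.0
lr = 0.01

losses = []
for _ in range(2000):
    u = w * unity + b                 # output of the new input layer
    y, h = model(u)
    err = y - y_target
    losses.append(err ** 2)           # squared (l^2) error
    # Gradient of the squared error with respect to u (chain rule).
    grad_u = 2.0 * err * (W1.T @ (W2 * (1.0 - h ** 2)))
    w -= lr * grad_u * unity          # only the new layer is updated
    b -= lr * grad_u

u = w * unity + b                     # recovered input: u = w*1 + b
final_loss = (model(u)[0] - y_target) ** 2
print("initial loss:", losses[0], "final loss:", final_loss)
```

The recovered u is generally not the input that generated y_target, since many inputs map to the same output; it is just one input for which the frozen model's output is close to the requested value.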
I have trained my own dataset of images (traffic light images 11x27) with LeNet, using caffe and DIGITS interface. I get 99% accuracy and when I give new images via DIGITS, it predicts the good label, so the network seems to work very well.
However, I struggle to predict the labels through Python/Matlab API for caffe. The last layer output (ip2) is a vector with 2 elements (I have 2 classes), which looks like [4.8060, -5.2608] for example (the first component is always positive, the second always negative and the absolute values range from 4 to 20). I know it from many tests in Python, Matlab and DIGITS.
My problem is:
Argmax can't work directly on this layer (it always gives 0)
If I use a softmax function, it will always give me [1, 0] (and that's actually the value of net.blobs['prob'] or out['prob'] in the python interface, no matter the class of my image)
So, how can I get the correct predicted label?
Thanks!
I use the predict function in OpenCV to classify my gestures.
svm.load("train.xml");
float ret = svm.predict(mat);//mat is my feature vector
I defined 5 labels (1.0,2.0,3.0,4.0,5.0), but the values of ret are actually (0.521220207,-0.247173533,-0.127723947······).
This confuses me. According to the official OpenCV documentation, the function should return a class label (classification) in my case.
Update: I still don't know why this result appears. However, I chose new features to train the model, and the return value of the predict function is now what I defined during the training phase (e.g. 1, 2, 3, etc.).
During the training of an SVM you assign a label to each class of training data.
When you classify a sample the returned result will match up with one of these labels telling you which class the sample is predicted to fall into.
There's some more documentation here which might help:
http://docs.opencv.org/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html
With Support Vector Machines (SVM) you have a training function and a prediction function. The training function trains on your data and saves the resulting information in an XML file (this simplifies prediction when you have a large amount of training data and need to run the prediction in another project).
Example: with 20 images per class, in your case 20*5=100 training images, each image is associated with the label of its class, and all this information is stored in train.xml.
The prediction function tells you which label to assign to your test image according to your training data (the whole work you did in the training process). Your prediction results might be good or bad; I think it all depends on your training data.
If you want, try to calculate the error rate of your classifier to see how often it gives good or bad results.
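A minimal sketch of such an error-rate calculation, written in Python for brevity (the question itself uses the C++ API). The label lists are made up; in practice predicted_labels would hold the values returned by svm.predict for a held-out test set, and error_rate is just an illustrative helper name.

```python
def error_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted label differs from the true one."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one sample")
    wrong = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    return wrong / len(true_labels)

# Hypothetical test set using the five labels from the question.
true_labels = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
predicted_labels = [1.0, 2.0, 3.0, 4.0, 4.0, 1.0, 3.0, 3.0, 4.0, 5.0]

rate = error_rate(true_labels, predicted_labels)
print("error rate:", rate)
```

The error rate is only meaningful if the test samples were not part of the training data.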