Why does tf.Session().run() not work here?

I am trying to figure out why Session.run() gives a "Session graph is empty" error here. Why does Session.run() not work? Somehow, I can just print(predictions) to retrieve the result.
Don't I need to use Session.run() to run the model and get the predictions?
def new_samples():
    # return np.array([[5.9,3,4.2,1.5],[6.9,3.1,5.4,2.1]], dtype=np.float32)
    return np.array(test_data_values, dtype=np.float32)
predictions = list(classifier.predict_classes(input_fn=new_samples))
default_session = tf.Session()
print(default_session.run(predictions))
Note: classifier = tf.contrib.learn.DNNClassifier

The DNNClassifier is an Estimator, an abstraction that handles things like sessions for you. When you call predict_classes, the Estimator builds its graph and runs its own session internally, so what comes back is already a list of plain values. The tf.Session() you create yourself is attached to the (empty) default graph, which is why run() complains that the session graph is empty; there is nothing left to run.
The motivation for the higher-level APIs (like Estimator or DNNClassifier) is precisely this: you don't have to worry about session management. It also gets trickier to manage sessions once you have multiple workers, and all of that is handled for you :)
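For illustration, a minimal sketch of the intended usage pattern (feature_columns and hidden_units here are placeholder choices, and test_data_values is assumed to be defined as in your question); note that no tf.Session() appears anywhere:
import numpy as np
import tensorflow as tf

feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3)
# classifier.fit(input_fn=...) would be called here to train the model.

def new_samples():
    return np.array(test_data_values, dtype=np.float32)

# predict_classes opens and closes its own session internally.
predictions = list(classifier.predict_classes(input_fn=new_samples))
print(predictions)  # already plain NumPy/Python values, nothing left to run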

Related

How to find the optimal learning rate, number of epochs & decay strategy in torch.optim.Adam?

I am working on a model trained on the MNIST dataset. I am using the torch.optim.Adam optimizer and have been experimenting with tuning the hyperparameters. After running a lot of tests, I have found a combination of hyperparameters that gives 90% accuracy. However, since I am new to this, I feel there might be a more efficient way to find the optimal values of the hyperparameters. The brute-force approach seems to come down to trial and error, and I was wondering if there is a particular strategy for finding these values.
Example of the code being used is:
if __name__ == '__main__':
    end = time.time()
    model_ft = Net().to(device)
    print(model_ft.network)
    criterion = nn.CrossEntropyLoss()
    optimizer_ft = optim.Adam(model_ft.parameters(), lr=1e-3)
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=9, gamma=0.5)
    history, accuracy = train_test(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                                   num_epochs=15)
Here I would like to find the optimal values of:
Learning Rate
Step Size
Gamma
Number of Epochs
Any help is much appreciated!
A similar question has already been answered in depth, it seems.
However, in short, you can use something called grid search. With grid search, you set the values you want to try for each hyperparameter, and then grid search tries every combination. This link shows how to do it with PyTorch.
The following Medium post goes more in-depth about other methods and packages to try, but I think you should start with a simple grid search.
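For instance, here is a minimal grid-search sketch that reuses the names from your question (Net, device and train_test are assumed to be defined exactly as in your code; the candidate values are just placeholders to adjust to your time budget):
import itertools

import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

# Candidate values to try for each hyperparameter.
param_grid = {
    "lr": [1e-2, 1e-3, 1e-4],
    "step_size": [5, 9],
    "gamma": [0.5, 0.1],
    "num_epochs": [10, 15],
}

best_acc, best_params = 0.0, None
for lr, step_size, gamma, num_epochs in itertools.product(*param_grid.values()):
    model_ft = Net().to(device)                      # Net and device as in the question
    criterion = nn.CrossEntropyLoss()
    optimizer_ft = optim.Adam(model_ft.parameters(), lr=lr)
    scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=step_size, gamma=gamma)
    _, accuracy = train_test(model_ft, criterion, optimizer_ft, scheduler,
                             num_epochs=num_epochs)  # train_test as in the question
    if accuracy > best_acc:
        best_acc, best_params = accuracy, (lr, step_size, gamma, num_epochs)

print("Best accuracy:", best_acc)
print("Best (lr, step_size, gamma, num_epochs):", best_params)
Grid search is exhaustive (3 × 2 × 2 × 2 = 24 full training runs here), so random search or a dedicated tuning library becomes preferable as the grid grows.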

How does the selection of initial k-means points work in BigQuery ML?

I'm using BigQuery for machine learning, more specifically the k-means method on an unlabeled dataset where I'm trying to find clusters.
I'd like to know whether someone has figured out how BigQuery ML initializes the centroids.
I already tried looking at the documentation, but either there is nothing about it or I couldn't find it.
CREATE MODEL `project.dataset.model_name`
OPTIONS(
  model_type = "kmeans",
  num_clusters = 3,
  distance_type = "euclidean",
  early_stop = TRUE,
  max_iterations = 20,
  standardize_features = TRUE)
AS
(SELECT * FROM `project.dataset.sample_date_to_train`)
The results differ a little each time I run it.
Does anyone have experience with this subject?
For anyone who is still looking for an answer: there has recently been an update to BigQuery ML on this topic. Two new parameters have been added to the CREATE MODEL statement, namely:
KMEANS_INIT_METHOD
KMEANS_INIT_COL
Basically, you can pick K observations (rows of the input table) to serve as the initial centroids for your k-means algorithm. You can find the relevant documentation at this link. Maybe it's not the most exciting solution to your problem, but it's still something you can work with if you need reproducibility.
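For example, a sketch of how the CREATE MODEL statement from the question could use these options, submitted through the BigQuery Python client; the column name is_init_centroid is hypothetical and stands for a BOOL column in your training table that is TRUE for exactly num_clusters rows (the ones you want as seeds):
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical is_init_centroid column: TRUE for exactly the 3 rows used as seeds.
sql = """
CREATE OR REPLACE MODEL `project.dataset.model_name`
OPTIONS(
  model_type = "kmeans",
  num_clusters = 3,
  distance_type = "euclidean",
  kmeans_init_method = "CUSTOM",
  kmeans_init_col = "is_init_centroid")
AS
SELECT * FROM `project.dataset.sample_date_to_train`
"""
client.query(sql).result()  # blocks until the training job finishes
With fixed seed rows the initialization is deterministic, which addresses the run-to-run variation you observed.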
If I had to guess, it probably uses similar logic to TensorFlow (BQML may well be using TF under the hood). Random partitioning seems to be the TensorFlow default, so that would be my guess.
The reason you are seeing different results each time you train the model is the random nature of the initial values assigned to the centroids. The k-means algorithm begins by randomly selecting a position for each of the k centroids you chose. If you review this documentation, it explains the exact process used by the k-means algorithm.
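To make the source of that run-to-run variation concrete, here is a minimal NumPy sketch of random (Forgy-style) seeding; it illustrates the general idea, not BigQuery ML's actual internals:
import numpy as np

def random_initial_centroids(X, k):
    # Pick k distinct rows of X at random as the starting centroids.
    idx = np.random.choice(len(X), size=k, replace=False)
    return X[idx]

X = np.random.RandomState(0).rand(100, 4)  # toy data: 100 points in 4 dimensions
print(random_initial_centroids(X, k=3))    # different on every call ...
print(random_initial_centroids(X, k=3))    # ... so the final clusters can differ too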

Find out the training error after fit()

I'm training a LinearSVC model and I want to get its training error. Is it possible to get it without evaluating it manually?
Thanks
sklearn uses liblinear for this task.
You can take a quick glance at the source here:
self.coef_, self.intercept_, self.n_iter_ = _fit_liblinear(
    X, y, self.C, self.fit_intercept, self.intercept_scaling,
    self.class_weight, self.penalty, self.dual, self.verbose,
    self.max_iter, self.tol, self.random_state, self.multi_class,
    self.loss, sample_weight=sample_weight)
which shows that only the coefficients, intercepts and number of iterations are picked up by sklearn's Python API. Whatever else is available in liblinear's output is not captured, so you can't read out the training error directly without changing the internal code.
There might be a possible hack: turn on verbose mode, redirect the output and parse the additional information printed there. But that assumes the information you are looking for is actually printed, and it's hacky, so I won't recommend it.
Just use the score method. It won't be too costly compared to fitting.
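For example, a minimal sketch on toy data (make_classification is just a stand-in for your own training set):
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Toy training data, purely for illustration.
X_train, y_train = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = LinearSVC(C=1.0, max_iter=5000)
clf.fit(X_train, y_train)

train_accuracy = clf.score(X_train, y_train)  # mean accuracy on the training set
train_error = 1.0 - train_accuracy            # misclassification rate
print("training error: %.4f" % train_error)
The extra cost is just one prediction pass over the training set, which is negligible next to the fit itself.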

How does batching interact with the loss function in TensorFlow?

I'm training a multi-objective neural net in TensorFlow with my own loss function, and I can't find documentation on how batching interacts with that functionality.
For example, I have a snippet of my loss function below, which takes the tensor/list of predictions and makes sure that their absolute values sum to no more than one:
def fitness(predictions, actual):
    absTensor = tf.abs(predictions)
    sumTensor = tf.reduce_sum(absTensor)
    oneTensor = tf.constant(1.0)
    isGTOne = tf.greater(sumTensor, oneTensor)
    def norm(): return predictions / sumTensor
    def unchanged(): return predictions
    predictions = tf.cond(isGTOne, norm, unchanged)
    # etc...
But when I pass in a batch of estimates, I suspect this loss function is normalising the whole set of inputs to sum to 1 at this point, rather than each individual set summing to 1. I.e.
[[.8,.8],[.8,.8]] -> [[.25,.25],[.25,.25]]
rather than the desired
[[.8,.8],[.8,.8]] -> [[.5,.5],[.5,.5]]
Can anybody confirm or put to rest my suspicions? If this is how my function currently works, how do I change that?
You must specify a reduction axis for reduction ops, otherwise all axes are reduced and you get a single scalar for the whole batch, which is exactly the behaviour you suspect. For a [batch_size, num_outputs] tensor you want one sum per example, i.e. a reduction over the last axis, keeping the reduced dimension so that the division broadcasts row-wise:
sumTensor = tf.reduce_sum(absTensor, 1, keep_dims=True)
After you make that change you will run into another problem: sumTensor is no longer a scalar, so it no longer makes sense as a condition for tf.cond (i.e. what would it mean to branch per entry of a batch?). What you really want is tf.select (renamed tf.where in later TensorFlow versions), since you don't want to branch the graph as a whole but choose per example; with a rank-1 condition it selects whole rows:
isGTOne = tf.greater(tf.reduce_sum(absTensor, 1), oneTensor)  # one boolean per example
norm = predictions / sumTensor                                # row-wise, thanks to keep_dims
predictions = tf.select(isGTOne, norm, predictions)
But, looking at this now, I wouldn't even bother conditionally normalizing the entries. Since you are operating at the granularity of a batch, I don't think you gain much from normalizing only some of its entries, especially since the division is not an expensive operation. You might as well just do:
def fitness(predictions, actual):
    absTensor = tf.abs(predictions)
    sumTensor = tf.reduce_sum(absTensor, 1, keep_dims=True)  # one sum per example
    predictions = predictions / sumTensor
    # etc...
Hope that helps!
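As a quick sanity check of the per-example behaviour, here is a minimal sketch using the batch from the question (API names as in TF 1.x, where keep_dims was the keyword and tf.select later became tf.where):
import numpy as np
import tensorflow as tf

predictions = tf.placeholder(tf.float32, shape=[None, 2])
normalized = predictions / tf.reduce_sum(tf.abs(predictions), 1, keep_dims=True)

with tf.Session() as sess:
    batch = np.array([[0.8, 0.8], [0.8, 0.8]], dtype=np.float32)
    print(sess.run(normalized, feed_dict={predictions: batch}))
    # -> [[0.5 0.5]
    #     [0.5 0.5]]  (each row sums to 1, as desired)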

How do you actually apply a trained model?

I've been slowly going through the TensorFlow tutorials, and I assume I will have to go through them again. I don't have a background in ML but am slowly working my way up.
Anyway, after reading through the RNN tutorial and running the training code, I am confused.
How does one actually apply the trained model so that it can be used to make language predictions?
I know this is a terribly noobish and simple question, but I believe it will be of use to others, as I have seen it asked and not answered in a satisfactory way.
In general, when you train a model, you first do a forward pass and then a backward pass. The forward pass makes a prediction based on your input data, and the backward pass adjusts your model based on how correct that prediction was.
So when you want to apply your model, you just do a forward pass with your new data as input.
In your particular example, using this code, you can see how it's done by looking at how they run the test set, starting at line 286.
# They instantiate the model with is_training=False
mtest = PTBModel(is_training=False, config=eval_config)
# Then they can do a forward pass
test_perplexity = run_epoch(session, mtest, test_data, tf.no_op())
print("Test Perplexity: %.3f" % test_perplexity)
And if you want the actual prediction rather than the perplexity, it is the state inside the run_epoch function:
cost, state, _ = session.run([m.cost, m.final_state, eval_op],
                             {m.input_data: x,
                              m.targets: y,
                              m.initial_state: state})
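If what you are after is the predicted next words themselves, one way is to also fetch the model's output in the same forward pass. The tutorial computes logits inside the model but does not expose them as an attribute by default, so m.logits below is a hypothetical addition you would have to make to PTBModel yourself:
# Hypothetical: requires adding e.g. `self.logits = logits` inside PTBModel.
probabilities = tf.nn.softmax(m.logits)
probs = session.run(probabilities, {m.input_data: x,
                                    m.initial_state: state})
predicted_word_ids = probs.argmax(axis=-1)  # most likely word id at each step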

Resources