How to make predictions with TFF? - tensorflow-federated

After completing the evaluation of the model, I would like to predict the label of a given image. My question is: how can I do this with TensorFlow Federated? In Keras we would do it like this:
import numpy as np

# new instance where we do not know the answer
Xnew = np.array([[0.89337759, 0.65864154]])
# make a prediction
ynew = model.predict_classes(Xnew)
# show the inputs and predicted outputs
print("X=%s, Predicted=%s" % (Xnew[0], ynew[0]))
Output:
X=[0.89337759 0.65864154], Predicted=[0]
Here is how state and model_fn were created:
def model_fn():
    keras_model = create_compiled_keras_model()
    return tff.learning.from_compiled_keras_model(keras_model, sample_batch)

iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0),
    client_weight_fn=None)
state = iterative_process.initialize()
I get this error:
list(self._name_to_index.keys())[:10]))
AttributeError: The tuple of length 2 does not have named field "assign_weights_to". Fields (up to first 10): ['trainable', 'non_trainable']
Thanks

(Requires TFF 0.16.0 or newer)
Since the code is building a tff.learning.Model from a tf.keras.Model, you may be able to use the assign_weights_to method on the tff.learning.ModelWeights object (the type of state.model).
This method is used in the Federated Learning for Text Generation tutorial.
This might look something like the following (the early portions are an example FL training loop; the inference part is near the bottom):
def create_keras_model() -> tf.keras.Model:
    ...

def model_fn():
    ...
    return tff.learning.from_keras_model(create_keras_model())

training_process = tff.learning.build_federated_averaging_process(model_fn, ...)
state = training_process.initialize()
for _ in range(NUM_ROUNDS):
    state, metrics = training_process.next(state, ...)

model_for_inference = create_keras_model()
state.model.assign_weights_to(model_for_inference)
Once the weights from state have been assigned back into the Keras model, the code can use the standard Keras APIs, such as tf.keras.Model.predict_on_batch:
predictions = model_for_inference.predict_on_batch(batch)
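If the model ends in a softmax layer (an assumption here, not stated in the answer), predict_on_batch returns a matrix of class probabilities, and labels can be recovered with an argmax. A minimal sketch with a made-up batch shape:

import numpy as np

# Hypothetical batch matching the model's input shape (e.g. 28x28 grayscale images).
batch = np.random.rand(4, 28, 28, 1).astype(np.float32)

probabilities = model_for_inference.predict_on_batch(batch)
predicted_labels = np.argmax(probabilities, axis=-1)  # highest-probability class per example
print(predicted_labels)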

Related

How to create a physics-informed neural network (PINN) using jax

I am trying to create a physics-informed neural network (PINN) in JAX. I want to differentiate the defined model (neural network) with respect to the input (x). If I call jax.grad(params), I get an error.
If I call jax.grad(model), I don't get an error, but I don't know whether I am actually differentiating the neural network with respect to x.
import jax
import jax.numpy as jnp
import optax
import flax.linen as fnn
from flax.training.train_state import TrainState

class MLP(fnn.Module):
    @fnn.compact
    def __call__(self, x):
        x = fnn.Dense(128)(x)
        x = fnn.relu(x)
        x = fnn.Dense(256)(x)
        x = fnn.relu(x)
        x = fnn.Dense(10)(x)
        return x

model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones([1]))['params']
tx = optax.adam(0.001)
state = TrainState.create(apply_fn=model.apply, params=params, tx=tx)
You can differentiate a model in JAX by (1) defining a function that you want to differentiate, (2) transforming it with jax.grad, jax.jacrev, jax.jacfwd, etc. as appropriate for your application, and (3) passing data to the transformed function.
It's not entirely clear from your question what operation you're hoping to differentiate, but here is an example of computing a forward-mode jacobian of the training state creation with respect to the params:
def f(params):
    return TrainState.create(apply_fn=model.apply, params=params, tx=tx)

result = jax.jacfwd(f)(params)
If that doesn't help, I'd suggest editing your question to make clear what operation you're interested in differentiating.
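For the PINN use case specifically, i.e. differentiating the network output with respect to the input x, a sketch under the question's setup might look like the following (the scalar wrapper u is an illustrative helper, not part of the question's code):

import jax
import jax.numpy as jnp

def u(x):
    # Wrap the network as a scalar function of the input x.
    return model.apply({'params': params}, x)[0]

x = jnp.ones([1])
du_dx = jax.jacfwd(u)(x)                # first derivative of u w.r.t. x
d2u_dx2 = jax.jacfwd(jax.jacfwd(u))(x)  # second derivative, as used in PINN residuals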

Overcoming compatibility issues with using iml from h2o models

I am unable to reproduce the only example I can find of using h2o with iml (https://www.r-bloggers.com/2018/08/iml-and-h2o-machine-learning-model-interpretability-and-feature-explanation/) as detailed here (Error when extracting variable importance with FeatureImp$new and H2O). Can anyone point to a workaround or other examples of using iml with h2o?
Reproducible example:
library(rsample)   # data splitting
library(ggplot2)   # allows extension of visualizations
library(dplyr)     # basic data transformation
library(h2o)       # machine learning modeling
library(iml)       # ML interpretation
library(modeldata) # attrition data

# initialize h2o session
h2o.no_progress()
h2o.init()

# classification data (attrition now lives in modeldata, not rsample)
data("attrition", package = "modeldata")
df <- attrition %>%
  mutate_if(is.ordered, factor, ordered = FALSE) %>%
  mutate(Attrition = recode(Attrition, "Yes" = "1", "No" = "0") %>%
           factor(levels = c("1", "0")))

# convert to h2o object
df.h2o <- as.h2o(df)

# create train, validation, and test splits
set.seed(123)
splits <- h2o.splitFrame(df.h2o, ratios = c(.7, .15),
                         destination_frames = c("train", "valid", "test"))
names(splits) <- c("train", "valid", "test")

# variable names for response & features
y <- "Attrition"
x <- setdiff(names(df), y)

# elastic net model
glm <- h2o.glm(
  x = x,
  y = y,
  training_frame = splits$train,
  validation_frame = splits$valid,
  family = "binomial",
  seed = 123
)

# 1. create a data frame with just the features
features <- as.data.frame(splits$valid) %>% select(-Attrition)

# 2. create a vector with the actual responses
response <- as.numeric(as.vector(splits$valid$Attrition))

# 3. create a custom predict function that returns the predicted values as a
#    vector (probability of attrition in this example)
pred <- function(model, newdata) {
  results <- as.data.frame(h2o.predict(model, as.h2o(newdata)))
  return(results[[3L]])
}

# create predictor object to pass to explainer functions
predictor.glm <- Predictor$new(
  model = glm,
  data = features,
  y = response,
  predict.fun = pred,
  class = "classification"
)

imp.glm <- FeatureImp$new(predictor.glm, loss = "mse")
Error obtained:
Error in `[.data.frame`(prediction, , self$class, drop = FALSE): undefined columns selected
traceback()
1. FeatureImp$new(predictor.glm, loss = "mse")
2. .subset2(public_bind_env, "initialize")(...)
3. private$run.prediction(private$sampler$X)
4. self$predictor$predict(data.frame(dataDesign))
5. prediction[, self$class, drop = FALSE]
6. `[.data.frame`(prediction, , self$class, drop = FALSE)
7. stop("undefined columns selected")
In the iml package documentation, the class argument is described as "The class column to be returned." When you set class = "classification", iml looks for a column called "classification", which is not found. At least on GitHub, it looks like the iml package has gone through a fair amount of development since that blog post, so some of its functionality is probably no longer backwards compatible.
After reading through the package documentation, I think you might want to try something like:
predictor.glm <- Predictor$new(
  model = glm,
  data = features,
  y = "Attrition",
  predict.function = pred,
  type = "prob"
)

# check ability to predict first
check <- predictor.glm$predict(features)
print(check)
Even better might be to leverage H2O's own extensive functionality around machine learning interpretability:
h2o.varimp(glm) gives the variable importance for each feature.
h2o.varimp_plot(glm, 10) renders a graphic showing the relative importance of each feature.
h2o.explain(glm, as.h2o(features)) is a wrapper for the explainability interface and will by default provide the confusion matrix (in this case) as well as variable importance and partial dependence plots for each feature.
For certain algorithms (e.g., tree-based methods), h2o.shap_explain_row_plot() and h2o.shap_summary_plot() will provide the SHAP contributions.
The h2o-3 docs might be useful for exploring these further.

Why does the difference in network architecture make a huge difference in name classification

I tried to build an RNN myself, following this tutorial: https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial. I built my own version with the following network architecture, which is different from the tutorial's (a stands for the input layer, h for hidden, o for output). Here's my code:
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, initial_hidden):
        super(RNN, self).__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, hidden_size, bias=False)
        self.linear3 = nn.Linear(hidden_size, output_size)
        self.prev_hidden = initial_hidden

    def forward(self, X):
        input = torch.add(self.linear1(X).view(1, -1),
                          self.linear2(self.prev_hidden.to(device)))
        hidden = nn.ReLU()(input)
        self.prev_hidden = hidden.detach()
        output = self.linear3(hidden)
        return output
This model plateaus at a loss of about 12000 over all samples and doesn't really drop any further. However, after switching to the model described in the tutorial, in which the hidden state and the input are concatenated and fed through the same linear layers, the loss drops to 4000 with the same hyperparameters. Here's the code:
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)
Why does the model architecture in the tutorial outperform my version by so much?
This line
self.prev_hidden = hidden.detach()
makes you never backpropagate through time in your RNN. That is a pretty non-standard way to train a recurrent network, and it definitely limits the model's ability to learn.
The other obvious difference is that your implementation does not output probabilities (it lacks a projection onto the simplex, e.g. a softmax); how much of an issue that is is hard to verify, since the full training code is missing.
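A minimal sketch of both fixes built on the question's own model (FixedRNN is a hypothetical name, untested against the tutorial's training loop): pass the hidden state explicitly instead of detaching it, and emit log-probabilities for use with nn.NLLLoss:

import torch
import torch.nn as nn

class FixedRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, hidden_size, bias=False)
        self.linear3 = nn.Linear(hidden_size, output_size)

    def forward(self, X, hidden):
        # No detach: gradients can flow back through earlier time steps.
        hidden = torch.relu(self.linear1(X).view(1, -1) + self.linear2(hidden))
        # Log-probabilities over classes; pair with nn.NLLLoss during training.
        output = torch.log_softmax(self.linear3(hidden), dim=1)
        return output, hidden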

TFF: have every client run a pretrain function instead of build_federated_averaging_process

I would like every client to train its model with a function pretrain that I wrote below:
def pretrain(model):
    resnet_output = model.output
    layer1 = tf.keras.layers.GlobalAveragePooling2D()(resnet_output)
    layer2 = tf.keras.layers.Dense(units=zdim*2, activation='relu')(layer1)
    model_output = tf.keras.layers.Dense(units=zdim)(layer2)
    model = tf.keras.Model(model.input, model_output)

    iterations_per_epoch = determine_iterations_per_epoch()
    total_iterations = iterations_per_epoch * num_epochs

    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)
    checkpoint = tf.train.Checkpoint(step=tf.Variable(1), optimizer=optimizer, net=model)
    manager = tf.train.CheckpointManager(checkpoint, pretrain_save_path, max_to_keep=10)

    current_epoch = tf.cast(tf.floor(optimizer.iterations / iterations_per_epoch), tf.int64)

    batch = client_data(0).batch(2)
    epoch_loss = []
    for (image1, image2) in batch:
        loss, gradients = train_step(model, image1, image2)
        epoch_loss.append(loss)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        # if tf.reduce_all(tf.equal(epoch, current_epoch + 1)):
        print("Loss after epoch {}: {}".format(current_epoch, sum(epoch_loss) / len(epoch_loss)))
        # print("Learning rate: {}".format(learning_rate(optimizer.iterations)))
        epoch_loss = []
        current_epoch += 1

        if current_epoch % 50 == 0:
            save_path = manager.save()
            print("Saved model for epoch {}: {}".format(current_epoch, save_path))

    save_path = manager.save()
    model.save("model.h5")
    model.save_weights("saved_weights.h5")
But as we know, TFF has a predefined function:
iterative_process = tff.learning.build_federated_averaging_process(...)
So how can I proceed? Thanks
There are a few ways that one could proceed along similar lines.
First, it is important to note that TFF is functional: one can use things like writing to / reading from files to manage state (since TF allows this), but that is not part of the interface TFF exposes to users. Manipulating state without passing it through function parameters and results should at best be considered an implementation detail; it's something that TFF does not encourage.
By slightly refactoring your code above, however, I think this kind of application can fit quite nicely in TFF's programming model. We will want to define something like:
@tff.tf_computation
@tf.function
def pretrain_client_model(model, client_dataset):
    # perhaps do the dataset processing you want...
    for batch in client_dataset:
        ...  # do model training
    return model.weights()  # or some tensor structure representing the trained model weights
Once your implementation looks something like this, you will be able to wire it into a custom iterative process. The canned function you mention (build_federated_averaging_process) really just constructs an instance of tff.templates.IterativeProcess; you are always free, however, to write your own instance of this class.
Several tutorials take you through this process, this one probably being the simplest. For a finished code example of a standalone iterative process implementation, see simple_fedavg.py.
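As a rough sketch of the wiring step (MODEL_TYPE, DATASET_TYPE, and the overall shape here are illustrative assumptions, not a confirmed TFF recipe), the per-client computation could be mapped across clients with tff.federated_map:

import tensorflow_federated as tff

# MODEL_TYPE and DATASET_TYPE are placeholders for the TFF types of the
# model weights structure and the client tf.data.Dataset, respectively.
@tff.federated_computation(
    tff.type_at_clients(MODEL_TYPE),
    tff.type_at_clients(DATASET_TYPE))
def pretrain_on_clients(models, datasets):
    # Run the TF pretraining logic independently on each client.
    return tff.federated_map(pretrain_client_model, (models, datasets))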

K-fold cross validation with different results every time

All my models are initialized with the below:
def initialize_clf_models(self):
    model = RandomForestClassifier(random_state=42)
    self.clf_models.append(model)
    model = ExtraTreesClassifier(random_state=42)
    self.clf_models.append(model)
    model = MLPClassifier(random_state=42)
    self.clf_models.append(model)
    model = LogisticRegression(random_state=42)
    self.clf_models.append(model)
    model = xgb.XGBClassifier(random_state=42)
    self.clf_models.append(model)
    model = lgb.LGBMClassifier(random_state=42)
    self.clf_models.append(model)
I then loop through the models and perform k-fold cross validation with:
def kfold_cross_validation(self):
    clf_models = self.get_models()
    models = []
    self.results = {}

    for model in clf_models:
        self.current_model_name = model.__class__.__name__
        cross_validate = cross_val_score(model, self.xtrain, self.ytrain, cv=4)
        self.mean_cross_validation_score = cross_validate.mean()
        print("Kfold cross validation for", self.current_model_name)
        self.results[self.current_model_name] = self.mean_cross_validation_score
        models.append(model)
Any time I run this cross validation, I get a different result, even though I have set a random state on the different models. I would like to know why I get different results in cross validation and what can be done about it.
This is because you did not set the random_state of the k-fold generator. By default, when you pass an int value for cv, as in
cross_validate = cross_val_score(model, self.xtrain, self.ytrain, cv=4)
cross_val_score calls (Stratified)KFold with a different random state on every run, so the splits differ, the fitted models differ, and the results differ.
The relevant part from the source file:
cv: int, cross-validation generator or an iterable, default=None
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 5-fold cross validation,
- int, to specify the number of folds in a `(Stratified)KFold`,
- :term:`CV splitter`,
- An iterable yielding (train, test) splits as arrays of indices.
For int/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
other cases, :class:`KFold` is used.
To remedy this, you can pass your own cross-validation generator with a controlled random state, as stated in the documentation above. For example:
from sklearn.model_selection import StratifiedKFold

# shuffle=True is required for random_state to take effect
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)
cross_validate = cross_val_score(model, self.xtrain, self.ytrain, cv=skf)
I found the solution to my question. Setting a global NumPy random seed with the line below solved the problem:
np.random.seed(22)
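For completeness, a minimal end-to-end sketch (with synthetic data, since the question's dataset isn't shown) that pins both sources of randomness, the estimator's and the splitter's:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for self.xtrain / self.ytrain.
X, y = make_classification(n_samples=200, random_state=0)

model = RandomForestClassifier(random_state=42)  # pins the estimator's randomness
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)  # pins the splits

scores = cross_val_score(model, X, y, cv=skf)
print(scores.mean())  # identical on every run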
