xgboost learning rate scheduler callback - image-processing

i'm using xgboost for image classification and whenever i want to use a LearningRateScheduler or LearningRateDecay callbacks i got some errores. i use the same functions that i use for LearningRateScheduler in keras.
def read_lr_from_file(lr_file,epoch):
with open ('LR.txt' , mode='r') as lr_file:
for line in lr_file:
step,lr = line.split(':')
lr = lr.strip()
if int(step) <= epoch and float(lr) > 0:
learning_rate = float(lr)
return learning_rate
def get_scheduler(lr_file):
def scheduler(epoch):
lr = read_lr_from_file(lr_file, epoch)
return lr
return scheduler
learning_rate = xgboost.callback.LearningRateScheduler(get_scheduler('LR.txt'))
trained_model = xgboost.train(params= params_1, dtrain= train_dataset , evals=[(val_dataset, 'eval')],num_boost_round = 1000,early_stopping_rounds=50,callbacks=[learning_rate],verbose_eval= False)
and it goes printing the message below on and on:
"[20:33:17] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.5.1/src/learner.cc:576:
Parameters: { "min_chiled_weight", "n_estimators", "rate_drop" } might not be used.
This could be a false alarm, with some parameters getting used by language bindings but
then being mistakenly passed down to XGBoost core, or some parameter actually being used
but getting flagged wrongly here. Please open an issue if you find any such cases.
[20:33:20] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.5.1/src/learner.cc:1115: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softmax' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
[20:33:20] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.5.1/src/learner.cc:576:
Parameters: { "min_chiled_weight", "n_estimators", "rate_drop" } might not be used.
This could be a false alarm, with some parameters getting used by language bindings but
then being mistakenly passed down to XGBoost core, or some parameter actually being used
but getting flagged wrongly here. Please open an issue if you find any such cases."
i would be grateful if you can help me to solve this.

It's not a reasonable callable object about callbacks, see the demo below:
def lr_decay(epoch):
lr = init_lr*0.999**epoch # *0.99 0.9 0.995 0.999
print(epoch,':',lr,)
return lr
callbacks = xgb.callback.LearningRateScheduler(reduce_lr)
bst = XGBClassifier()
eval_set = [(x_train, y_train), (x_test, y_test)]
bst.fit(x_train,y_train,eval_set=eval_set,callbacks=[callbacks])

Related

TFF : every client do a pretrain function instead of build_federated_averaging_process

I would like that every client train his model with a function pretrainthat I wrote below :
def pretrain(model):
resnet_output = model.output
layer1 = tf.keras.layers.GlobalAveragePooling2D()(resnet_output)
layer2 = tf.keras.layers.Dense(units=zdim*2, activation='relu')(layer1)
model_output = tf.keras.layers.Dense(units=zdim)(layer2)
model = tf.keras.Model(model.input, model_output)
iterations_per_epoch = determine_iterations_per_epoch()
total_iterations = iterations_per_epoch*num_epochs
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)
checkpoint = tf.train.Checkpoint(step=tf.Variable(1), optimizer=optimizer, net=model)
manager = tf.train.CheckpointManager(checkpoint, pretrain_save_path, max_to_keep=10)
current_epoch = tf.cast(tf.floor(optimizer.iterations/iterations_per_epoch), tf.int64)
batch = client_data(0)
batch = client_data(0).batch(2)
epoch_loss = []
for (image1, image2) in batch:
loss, gradients = train_step(model, image1, image2)
epoch_loss.append(loss)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
# if tf.reduce_all(tf.equal(epoch, current_epoch+1)):
print("Loss after epoch {}: {}".format(current_epoch, sum(epoch_loss)/len(epoch_loss)))
#print("Learning rate: {}".format(learning_rate(optimizer.iterations)))
epoch_loss = []
current_epoch += 1
if current_epoch % 50 == 0:
save_path = manager.save()
print("Saved model for epoch {}: {}".format(current_epoch, save_path))
save_path = manager.save()
model.save("model.h5")
model.save_weights("saved_weights.h5")
But as we know that TFF has a predefined function :
iterative_process = tff.learning.build_federated_averaging_process(...)
So please, how can I proceed ? Thanks
There are a few ways that one could proceed along similar lines.
First it is important to note that TFF is functional--one can use things like writing to / reading from files to manage state (as TF allows this), but it is not in the interface TFF exposes to users--while something involving writing to / reading from a file (IE, manipulating state without passing it through function parameters and results), this should at best be considered an implementation detail. It's something that TFF does not encourage.
By slightly refactoring your code above, however, I think this kind of application can fit quite nicely in TFF's programming model. We will want to define something like:
#tff.tf_computation
#tf.function
def pretrain_client_model(model, client_dataset):
# perhaps do dataset processing you want...
for batch in client_dataset:
# do model training
return model.weights() # or some tensor structure representing the trained model weights
Once your implementation looks something like this, you will be able to wire it in to a custom iterative process. The canned function you mention (build_federated_averaging_process) really just constructs an instance of tff.templates.IterativeProcess; you are always, however, free to write your own instance of this class.
Several tutorials take us through this process, this probably being the simplest. For a finished code example of a standalone iterative process implementation, see simple_fedavg.py.

How to make prediction with TFF?

My question is : How can I predict a label of such image with Tensorflow Federated ?
After completing the evaluation of the model, I would like to predict the label of a given image. Like in Keras we do this :
# new instance where we do not know the answer
Xnew = array([[0.89337759, 0.65864154]])
# make a prediction
ynew = model.predict_classes(Xnew)
# show the inputs and predicted outputs
print("X=%s, Predicted=%s" % (Xnew[0], ynew[0]))
Output:
X=[0.89337759 0.65864154], Predicted=[0]
here is how state and model_fn was created:
def model_fn():
keras_model = create_compiled_keras_model()
return tff.learning.from_compiled_keras_model(keras_model, sample_batch)
iterative_process = tff.learning.build_federated_averaging_process(model_fn, server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0),client_weight_fn=None)
state = iterative_process.initialize()
I find this error :
list(self._name_to_index.keys())[:10]))
AttributeError: The tuple of length 2 does not have named field "assign_weights_to". Fields (up to first 10): ['trainable', 'non_trainable']
Thanks
(Requires TFF 0.16.0 or newer)
Since the code is building a tff.learning.Model from a tf.keras.Model you may be able to use the assign_weights_to method on the tff.learning.ModelWeights object (the type of state.model).
This method is used in the Federated Learning for Text Generation tutorial.
This might look something like (near the bottom, the early portions are an example FL training loop):
def create_keras_model() -> tf.keras.Model:
...
def model_fn():
...
return tff.learning.from_keras_model(create_keras_model())
training_process = tff.learning. build_federated_averaging_process(model_fn, ...)
state = training_process.initialize()
for _ in range(NUM_ROUNDS):
state, metrics = training_process.next(state, ...)
model_for_inference = create_keras_model()
state.model.assign_weights_to(model_for_inference)
Once the weights from state have been assigned back into the Keras model, the code can use the standard Keras APIs, such as tf.keras.Model.predict_on_batch
predictions = model_for_inference.predict_on_batch(batch)

FedProx with TensorFlow Federated

Would anyone know how to implement the FedProx optimisation algorithm with TensorFlow Federated? The only implementation that seems to be available online was developed directly with TensorFlow. A TFF implementation would enable an easier comparison with experiments that utilise FedAvg which the framework supports.
This is the link to the FedProx repo: https://github.com/litian96/FedProx
Link to the paper: https://arxiv.org/abs/1812.06127
At this moment, FedProx implementation is not available. I agree it would be a valuable algorithm to have.
If you are interested in contributing FedProx, the best place to start would be simple_fedavg which is a minimal implementation of FedAvg meant as a starting point for extensions -- see the readme there for more details.
I think the major change would need to happen to the client_update method, where you would add the proximal term depending on model_weights and initial_weights to the loss computed in forward pass.
I provide below my implementation of FedProx in TFF. I am not 100% sure that this is the right implementation; I post this answer also for discussing on actual code example.
I tried to follow the suggestions in the Jacub Konecny's answer and comment.
Starting from the simple_fedavg (referring to the TFF Github repo), I just modified the client_update method, and specifically changing the input argument for calculating the gradient with the GradientTape, i.e. instaead of just passing in input the outputs.loss, the tape calculates the gradient considering the outputs.loss + proximal_term previosuly (and iteratively) calculated.
#tf.function
def client_update(model, dataset, server_message, client_optimizer):
"""Performans client local training of "model" on "dataset".Args:
model: A "tff.learning.Model".
dataset: A "tf.data.Dataset".
server_message: A "BroadcastMessage" from server.
client_optimizer: A "tf.keras.optimizers.Optimizer".
Returns:
A "ClientOutput".
"""
def difference_model_norm_2_square(global_model, local_model):
"""Calculates the squared l2 norm of a model difference (i.e.
local_model - global_model)
Args:
global_model: the model broadcast by the server
local_model: the current, in-training model
Returns: the squared norm
"""
model_difference = tf.nest.map_structure(lambda a, b: a - b,
local_model,
global_model)
squared_norm = tf.square(tf.linalg.global_norm(model_difference))
return squared_norm
model_weights = model.weights
initial_weights = server_message.model_weights
tf.nest.map_structure(lambda v, t: v.assign(t), model_weights,
initial_weights)
num_examples = tf.constant(0, dtype=tf.int32)
loss_sum = tf.constant(0, dtype=tf.float32)
# Explicit use `iter` for dataset is a trick that makes TFF more robust in
# GPU simulation and slightly more performant in the unconventional usage
# of large number of small datasets.
for batch in iter(dataset):
with tf.GradientTape() as tape:
outputs = model.forward_pass(batch)
# ------ FedProx ------
mu = tf.constant(0.2, dtype=tf.float32)
prox_term =(mu/2)*difference_model_norm_2_square(model_weights.trainable, initial_weights.trainable)
fedprox_loss = outputs.loss + prox_term
# Letting GradientTape dealing with the FedProx's loss
grads = tape.gradient(fedprox_loss, model_weights.trainable)
client_optimizer.apply_gradients(zip(grads, model_weights.trainable))
batch_size = tf.shape(batch['x'])[0]
num_examples += batch_size
loss_sum += outputs.loss * tf.cast(batch_size, tf.float32)
weights_delta = tf.nest.map_structure(lambda a, b: a - b,
model_weights.trainable,
initial_weights.trainable)
client_weight = tf.cast(num_examples, tf.float32)
return ClientOutput(weights_delta, client_weight, loss_sum / client_weight)

Considering one column more important than others

In case of 3 columns data, (In my test case) I can see that all the columns are valued as equal.
random_forest.feature_importances_
array([0.3131602 , 0.31915436, 0.36768544])
Is there any way to add waitage to one of the columns?
Update:
I guess xgboost can be used in this case.
I tried, but getting this error:
import xgboost as xgb
param = {}
num_round = 2
dtrain = xgb.DMatrix(X, y)
dtest = xgb.DMatrix(x_test_split)
dtrain_split = xgb.DMatrix(X_train, label=y_train)
dtest_split = xgb.DMatrix(X_test)
gbdt = xgb.train(param, dtrain_split, num_round)
y_predicted = gbdt.predict(dtest_split)
rmse_pred_vs_actual = xgb.rmse(y_predicted, y_test)
AttributeError: module 'xgboost' has no attribute 'rmse'
Error is by assuming xgb has method "rmse":
rmse_pred_vs_actual = xgb.rmse(y_predicted, y_test)
It is literally written: AttributeError: module 'xgboost' has no attribute 'rmse'
Use sklearn.metrics.mean_squared_error
By:
from sklearn.metrics import mean_squared_error
# Your code
rmse_pred_vs_actual = mean_squared_error(y_test, y_predicted)
It'll fix your error but it still doesn't control a feature importance.
Now, if you really want to change the importance of a feature, you need to be creative about how to make a change like this. There is no text book solution that I know of and no method in xgboost that I know of. You can follow the link Stev posted in a comment to your question and maybe get some ideas (including changing your ML algorithm).

How to get loss function history using tf.contrib.opt.ScipyOptimizerInterface

I need to get the loss history over time to plot it in graph.
Here is my skeleton of code:
optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, method='L-BFGS-B',
options={'maxiter': args.max_iterations, 'disp': print_iterations})
optimizer.minimize(sess, loss_callback=append_loss_history)
With append_loss_history definition:
def append_loss_history(**kwargs):
global step
if step % 50 == 0:
loss_history.append(loss.eval())
step += 1
When I see the verbose output of ScipyOptimizerInterface, the loss is actually decrease over time.
But when I print loss_history, the losses are nearly the same over time.
Refer to the doc:
"Variables subject to optimization are updated in-place AT THE END OF OPTIMIZATION"
https://www.tensorflow.org/api_docs/python/tf/contrib/opt/ScipyOptimizerInterface. Is that the reason for the being unchanged of the loss?
I think you have the problem down; the variables themselves are not modified until the end of the optimization (instead being fed to session.run calls), and evaluating a "back channel" Tensor gets the un-modified variables. Instead, use the fetches argument to optimizer.minimize to piggyback on the session.run calls which have the feeds specified:
import tensorflow as tf
def print_loss(loss_evaled, vector_evaled):
print(loss_evaled, vector_evaled)
vector = tf.Variable([7., 7.], 'vector')
loss = tf.reduce_sum(tf.square(vector))
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
loss, method='L-BFGS-B',
options={'maxiter': 100})
with tf.Session() as session:
tf.global_variables_initializer().run()
optimizer.minimize(session,
loss_callback=print_loss,
fetches=[loss, vector])
print(vector.eval())
(Modified from the example in the documentation). This prints Tensors with the updated values:
98.0 [ 7. 7.]
79.201 [ 6.29289341 6.29289341]
7.14396e-12 [ -1.88996808e-06 -1.88996808e-06]
[ -1.88996808e-06 -1.88996808e-06]

Resources