Get mapping from body state to plant state indices? - drake

Is there a way to know the mapping between the indices of a plant's state [q, v] and an individual object's [qi, vi]?
Example: if I have object-wise representations of position, velocities, or accelerations (e.g. q_obj1, q_obj2 etc.) and need to interact with a plant's state beyond what SetPositions/SetVelocities allows me to do---for instance computing M.dot(qdd), or feeding into MultibodyPositionToGeometryPose.
From the documentation it seemed like maybe the C++ method MakeStateSelectorMatrix or its inverse could be useful here, but how would one do this in python?
Edit: here's a more clear example
# object-wise positions, velocities, and accelerations
q1, qd1, qdd1, q2, qd2, qdd2 = ...
# set plant positions and velocities for each object
plant.SetPositionsAndVelocities(context, model1, np.concatenate([q1, qd1]))
plant.SetPositionsAndVelocities(context, model2, np.concatenate([q2, qd2]))
# get matrices for the plant manipulator equations
M = plant.CalcMassMatrixViaInverseDynamics(context)
Cv = plant.CalcBiasTerm(context)
# The following line is wrong, because M is in the plant state ordering
# but qdd is not!
print(M.dot(np.concatenate([qdd1, qdd2])) + Cv)
# here's what I would like, where the imaginary method ConvertToStateIndices
# maps from object indices to plant state indices,
# which SetPositions does under the hood, but I need exposed here
reordered_qdd = np.zeros(plant.num_positions, dtype=qdd.dtype)
reordered_qdd += plant.ConvertToStateIndices(model1, qdd1)
reordered_qdd += plant.ConvertToStateIndices(model2, qdd2)
print(M.dot(reordered_qdd) + Cv)
Edit 2: for future reference, here is a workaround---do the following before casting MultiBodyPlant to AutoDiffXd or Expression:
indices = np.arange(plant.num_positions())
indices_dict = {body: plant.GetPositionsFromArray(model, indices).astype(int)
for body, model in zip(bodies, models)}
then you can do:
reordered_qdd = np.zeros(plant.num_positions, dtype=qdd.dtype)
reordered_qdd[indices_dict[body1]] += qdd1
reordered_qdd[indices_dict[body2]] += qdd2
print(M.dot(reordered_qdd) + Cv)

Are MultibodyPlant::SetPositionsInArray or SetVelocitiesInArray up to the job you're looking for?
reordered_qdd = np.zeros(plant.num_positions, dtype=qdd.dtype)
plant.SetVelocitiesInArray(model1, qdd1, reordered_qdd)
plant.SetVelocitiesInArray(model2, qdd2, reordered_qdd)
print(M.dot(reordered_qdd) + Cv)
One could say a bit of an abuse of the API (since in this example the values are accelerations), but I think given that qd and qdd would have the same layout it should work fine.
That being said, I didn't actually test the code above, so apologies (and please let me know!) if that doesn't work out.

Related

How to Record Variables in Pytorch Without Breaking Gradient Computation?

I am trying to implement some policy gradient training, similar to this. However, I would like to manipulate the rewards(like discounted future sum and other differentiable operations) before doing the backward propagation.
Consider the manipulate function defined to calculate the reward to go:
def manipulate(reward_pool):
n = len(reward_pool)
R = np.zeros_like(reward_pool)
for i in reversed(range(n)):
R[i] = reward_pool[i] + (R[i+1] if i+1 < n else 0)
return T.as_tensor(R)
I tried to store the rewards in a list:
#pseudocode
reward_pool = [0 for i in range(batch_size)]
for k in batch_size:
act = net(state)
state, reward = env.step(act)
reward_pool[k] = reward
R = manipulate(reward_pool)
R.backward()
optimizer.step()
It seems like inplace operation breaks the gradient computation, the code gives me an error: one of the variables needed for gradient computation has been modified by an inplace operation.
I also tried to initialize an empty tensor first, and store it in the tensor, but inplace operation is still the issue - a view of a leaf Variable that requires grad is being used in an in-place operation.
I am kind of new to PyTorch. Does anyone know what the right way recording rewards is in this case?
The issue is due to assignment to existing object.
Simply initialize the empty pool(list) for each iteration and append to the pool when new reward is calculated, i.e.
reward_pool = []
for k in batch_size:
act = net(state)
state, reward = env.step(act)
reward_pool.append(reward)
R = manipulate(reward_pool)
R.backward()
optimizer.step()

how to apply custom encoders to multiple clients at once? how to use custom encoders in run_one_round?

So my goal is basically implementing global top-k subsampling. Gradient sparsification is quite simple and I have already done this building on stateful clients example, but now I would like to use encoders as you have recommended here at page 28. Additionally I would like to average only the non-zero gradients, so say we have 10 clients but only 4 have nonzero gradients at a given position for a communication round then I would like to divide the sum of these gradients to 4, not 10. I am hoping to achieve this by summing gradients at numerator and masks, 1s and 0s, at denominator. Also moving forward I will add randomness to gradient selection so it is imperative that I create those masks concurrently with gradient selection. The code I have right now is
import tensorflow as tf
from tensorflow_model_optimization.python.core.internal import tensor_encoding as te
#te.core.tf_style_adaptive_encoding_stage
class GrandienrSparsificationEncodingStage(te.core.AdaptiveEncodingStageInterface):
"""An example custom implementation of an `EncodingStageInterface`.
Note: This is likely not what one would want to use in practice. Rather, this
serves as an illustration of how a custom compression algorithm can be
provided to `tff`.
This encoding stage is expected to be run in an iterative manner, and
alternatively zeroes out values corresponding to odd and even indices. Given
the determinism of the non-zero indices selection, the encoded structure does
not need to be represented as a sparse vector, but only the non-zero values
are necessary. In the decode mehtod, the state (i.e., params derived from the
state) is used to reconstruct the corresponding indices.
Thus, this example encoding stage can realize representation saving of 2x.
"""
ENCODED_VALUES_KEY = 'stateful_topk_values'
INDICES_KEY = 'indices'
SHAPES_KEY = 'shapes'
ERROR_COMPENSATION_KEY = 'error_compensation'
def encode(self, x, encode_params):
shapes_list = [tf.shape(y) for y in x]
flattened = tf.nest.map_structure(lambda y: tf.reshape(y, [-1]), x)
gradients = tf.concat(flattened, axis=0)
error_compensation = encode_params[self.ERROR_COMPENSATION_KEY]
gradients_and_error_compensation = tf.math.add(gradients, error_compensation)
percentage = tf.constant(0.1, dtype=tf.float32)
k_float = tf.multiply(percentage, tf.cast(tf.size(gradients_and_error_compensation), tf.float32))
k_int = tf.cast(tf.math.round(k_float), dtype=tf.int32)
values, indices = tf.math.top_k(tf.math.abs(gradients_and_error_compensation), k = k_int, sorted = False)
indices = tf.expand_dims(indices, 1)
sparse_gradients_and_error_compensation = tf.scatter_nd(indices, values, tf.shape(gradients_and_error_compensation))
new_error_compensation = tf.math.subtract(gradients_and_error_compensation, sparse_gradients_and_error_compensation)
state_update_tensors = {self.ERROR_COMPENSATION_KEY: new_error_compensation}
encoded_x = {self.ENCODED_VALUES_KEY: values,
self.INDICES_KEY: indices,
self.SHAPES_KEY: shapes_list}
return encoded_x, state_update_tensors
def decode(self,
encoded_tensors,
decode_params,
num_summands=None,
shape=None):
del num_summands, decode_params, shape # Unused.
flat_shape = tf.math.reduce_sum([tf.math.reduce_prod(shape) for shape in encoded_tensors[self.SHAPES_KEY]])
sizes_list = [tf.math.reduce_prod(shape) for shape in encoded_tensors[self.SHAPES_KEY]]
scatter_tensor = tf.scatter_nd(
indices=encoded_tensors[self.INDICES_KEY],
updates=encoded_tensors[self.ENCODED_VALUES_KEY],
shape=[flat_shape])
nonzero_locations = tf.nest.map_structure(lambda x: tf.cast(tf.where(tf.math.greater(x, 0), 1, 0), tf.float32) , scatter_tensor)
reshaped_tensor = [tf.reshape(flat_tensor, shape=shape) for flat_tensor, shape in
zip(tf.split(scatter_tensor, sizes_list), encoded_tensors[self.SHAPES_KEY])]
reshaped_nonzero = [tf.reshape(flat_tensor, shape=shape) for flat_tensor, shape in
zip(tf.split(nonzero_locations, sizes_list), encoded_tensors[self.SHAPES_KEY])]
return reshaped_tensor, reshaped_nonzero
def initial_state(self):
return {self.ERROR_COMPENSATION_KEY: tf.constant(0, dtype=tf.float32)}
def update_state(self, state, state_update_tensors):
return {self.ERROR_COMPENSATION_KEY: state_update_tensors[self.ERROR_COMPENSATION_KEY]}
def get_params(self, state):
encode_params = {self.ERROR_COMPENSATION_KEY: state[self.ERROR_COMPENSATION_KEY]}
decode_params = {}
return encode_params, decode_params
#property
def name(self):
return 'gradient_sparsification_encoding_stage'
#property
def compressible_tensors_keys(self):
return False
#property
def commutes_with_sum(self):
return False
#property
def decode_needs_input_shape(self):
return False
#property
def state_update_aggregation_modes(self):
return {}
I have run some simple tests manually following the steps you outlined here at page 45. It works but I have some questions/problems.
When I use list of tensors of same shape (ex:2 2x25 tensors) as input,x, of encode it works without any issues but when I try to use list of tensors of different shapes (2x20 and 6x10) it gives and error saying
InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [2,20] != values1.shape = [6,10] [Op:Pack] name: packed
How can I resolve this issue? As i said I want to use global top-k so it is essential I encode entire trainable model weights at once. Take the cnn model used here, all the tensors have different shapes.
How can I do the averaging I described at the beginning? For example here you have done
mean_factory = tff.aggregators.MeanFactory(
tff.aggregators.EncodedSumFactory(mean_encoder_fn), # numerator
tff.aggregators.EncodedSumFactory(mean_encoder_fn), # denominator )
Is there a way to repeat this with one output of decode going to numerator and other going to denominator? How can I handle dividing 0 by 0? tensorflow has divide_no_nan function, can I use it somehow or do I need to add eps to each?
How is partition handled when I use encoders? Does each client get a unique encoder holding a unique state for it? As you have discussed here at page 6 client states are used in cross-silo settings yet what happens if client ordering changes?
Here you have recommended using stateful clients example. Can you explain this a bit further? I mean in the run_one_round where exactly encoders go and how are they used/combined with client update and aggregation?
I have some additional information such as sparsity I want to pass to encode. What is the suggested method for doing that?
Here are some answers, hope it helps:
If you want to treat all of the aggregated structure just as a single tensor, use concat_factory as the outermost aggregator. That will concatenate entire structure to a rank-1 Tensor at clients, and then unpack back to the original structure at the end. Example use: tff.aggregators.concat_factory(tff.aggregators.MeanFactory(...))
Note the encoding stage objects are meant to work with a single tensor, so what you describe with identical tensors probably works only accidentally.
There are two options.
a. Modify the client training code such that the weights being passed to the weighted aggregator are already what you want it to be (zero/one
mask). In the stateful clients example you link, that would be here. You will then get what you need by default (by summing the numerator).
b. Modify UnweightedMeanFactory to do exactly the variant of averaging you describe and use that. Start would be modifying this
(and 4.) I think that is what you would need to implement. The same way existing client states are initialized in the example here, you would need extend it to contain the aggregator states, and make sure those are sampled together with the clients, as done here. Then, to integrate the aggregators in the example you would need to replace this hard-coded tff.federated_mean. An example of such integration is in the implementation of tff.learning.build_federated_averaging_process, primarily here
I am not sure what the question is. Perhaps get the previous working (seems like a prerequisite to me), and then clarify and ask in a new post?

Finding the Jacobian of a frame with respect to the joints of a given model in Pydrake

Is there any way to find the Jacobian of a frame with respect to the joints of a given model (as opposed to the whole plant), or alternatively to determine which columns of the full plant Jacobian correspond to a given model’s joints? I’ve found MultibodyPlant.CalcJacobian*, but I’m not sure if those are the right methods.
I also tried mapping the JointIndex of each joint in the model to a column of MultibodyPlant.CalcJacobian*, but the results didn't make sense -- the joint indices are sequential (all of one model followed by all of the other), but the Jacobian columns look interleaved (a column corresponding to one model followed by one corresponding to the other).
Assuming you are computing with respect to velocities, you'll want to use Joint.velocity_start() and Joint.num_velocities() to create a mask or set of indices. If you are in Python, then you can use NumPy's array slicing to select the desired columns of your Jacobian.
(If you compute w.r.t. position, then make sure you use Joint.position_start() and Joint.num_positions().)
Example notebook:
https://nbviewer.jupyter.org/github/EricCousineau-TRI/repro/blob/eb7f11d/drake_stuff/notebooks/multibody_plant_jacobian_subset.ipynb
(TODO: Point to a more official source.)
Main code to pay attention to:
def get_velocity_mask(plant, joints):
"""
Generates a mask according to supplied set of ``joints``.
The binary mask is unable to preserve ordering for joint indices, thus
`joints` required to be a ``set`` (for simplicity).
"""
assert isinstance(joints, set)
mask = np.zeros(plant.num_velocities(), dtype=np.bool)
for joint in joints:
start = joint.velocity_start()
end = start + joint.num_velocities()
mask[start:end] = True
return mask
def get_velocity_indices(plant, joints):
"""
Generates a list of indices according to supplies list of ``joints``.
The indices are generated according to the order of ``joints``, thus
``joints`` is required to be a list (for simplicity).
"""
indices = []
for joint in joints:
start = joint.velocity_start()
end = start + joint.num_velocities()
for i in range(start, end):
indices.append(i)
return indices
...
# print(Jv1_WG1) # Prints 7 dof from a 14 dof plant
[[0.000 -0.707 0.354 0.707 0.612 -0.750 0.256]
[0.000 0.707 0.354 -0.707 0.612 0.250 0.963]
[1.000 -0.000 0.866 -0.000 0.500 0.612 -0.079]
[-0.471 0.394 -0.211 -0.137 -0.043 -0.049 0.000]
[0.414 0.394 0.162 -0.137 0.014 0.008 0.000]
[0.000 -0.626 0.020 0.416 0.035 -0.064 0.000]]

Nonlinear (non-polynomial) cost function with DirectCollocation in Drake

I am trying to formulate a trajectory optimization problem for a glider, where I want to maximize the average horisontal velocity. I have formulated the system as a drakesystem, and the state vector consists of the position and velocity.
Currently, I have something like the following:
dircol = DirectCollocation(
plant,
context,
num_time_samples=N,
minimum_timestep=min_dt,
maximum_timestep=max_dt,
)
... # other constraints etc
horisontal_pos = dircol.state()[0:2] # Only (x,y)
time = dircol.time()
dircol.AddFinalCost(-w.T.dot(horisontal_pos) / time)
where AddFinalCost() should replace all instances of state() and time() with the final values, as far as I understand from the documentation. min_dt is non-zero and w is a vector of linear weights.
However, I am getting the following error message
Expression (...) is not a polynomial. ParseCost does not support non-polynomial expression.
which makes me think that there is no way of adding the type of cost function that I am looking for. Is there anything that I am missing?
Thank you in advance!
When calling AddFinalCost(e) with e being a symbolic expression, we can only handle it when e is a polynomial function of the state (more precisely, either a quadratic function or a linear function). Hence the error you see complaining that the cost is not polynomial.
You could add the cost like this
def average_speed(v):
x = v[0]
time_steps = v[1:]
return v[0] / np.sum(time_steps)
h_vars = [dircol.timestep[i] for i in range(N-1)]
dircol.AddCost(average_speed, vars=[dircol.state(N-1)[0]] + h_vars)
which uses a function average_speed to evaluate the average speed. You could find example of doing this in https://github.com/RobotLocomotion/drake/blob/e5f3c3e5f7927ef675066d97d3afac55d3481305/bindings/pydrake/solvers/test/mathematicalprogram_test.py#L590
First, the cost function should be a scalar, but you a vector-valued horisontal_pos / time, which has two entries containing both position_x / dt and position_y / dt, namely a vector as the cost. You should instead provide a scalar valued cost.
Second, it is unclear to me why you divide time in the final cost. As far as I understand it, you want the final position to be close to the origin, so something like position_x² + position_y². The code can look like
dircol.AddFinalCost(horisontal_pos[0]**2 + horisontal_pos[1]**2)

TensorBoard - Plot training and validation losses on the same graph?

Is there a way to plot both the training losses and validation losses on the same graph?
It's easy to have two separate scalar summaries for each of them individually, but this puts them on separate graphs. If both are displayed in the same graph it's much easier to see the gap between them and whether or not they have begin to diverge due to overfitting.
Is there a built in way to do this? If not, a work around way? Thank you much!
The work-around I have been doing is to use two SummaryWriter with different log dir for training set and cross-validation set respectively. And you will see something like this:
Rather than displaying the two lines separately, you can instead plot the difference between validation and training losses as its own scalar summary to track the divergence.
This doesn't give as much information on a single plot (compared with adding two summaries), but it helps with being able to compare multiple runs (and not adding multiple summaries per run).
Just for anyone coming accross this via a search: The current best practice to achieve this goal is to just use the SummaryWriter.add_scalars method from torch.utils.tensorboard. From the docs:
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
r = 5
for i in range(100):
writer.add_scalars('run_14h', {'xsinx':i*np.sin(i/r),
'xcosx':i*np.cos(i/r),
'tanx': np.tan(i/r)}, i)
writer.close()
# This call adds three values to the same scalar plot with the tag
# 'run_14h' in TensorBoard's scalar section.
Expected result:
Many thanks to niko for the tip on Custom Scalars.
I was confused by the official custom_scalar_demo.py because there's so much going on, and I had to study it for quite a while before I figured out how it worked.
To show exactly what needs to be done to create a custom scalar graph for an existing model, I put together the following complete example:
# + <
# We need these to make a custom protocol buffer to display custom scalars.
# See https://developers.google.com/protocol-buffers/
from tensorboard.plugins.custom_scalar import layout_pb2
from tensorboard.summary.v1 import custom_scalar_pb
# >
import tensorflow as tf
from time import time
import re
# Initial values
(x0, y0) = (-1, 1)
# This is useful only when re-running code (e.g. Jupyter).
tf.reset_default_graph()
# Set up variables.
x = tf.Variable(x0, name="X", dtype=tf.float64)
y = tf.Variable(y0, name="Y", dtype=tf.float64)
# Define loss function and give it a name.
loss = tf.square(x - 3*y) + tf.square(x+y)
loss = tf.identity(loss, name='my_loss')
# Define the op for performing gradient descent.
minimize_step_op = tf.train.GradientDescentOptimizer(0.092).minimize(loss)
# List quantities to summarize in a dictionary
# with (key, value) = (name, Tensor).
to_summarize = dict(
X = x,
Y_plus_2 = y + 2,
)
# Build scalar summaries corresponding to to_summarize.
# This should be done in a separate name scope to avoid name collisions
# between summaries and their respective tensors. The name scope also
# gives a title to a group of scalars in TensorBoard.
with tf.name_scope('scalar_summaries'):
my_var_summary_op = tf.summary.merge(
[tf.summary.scalar(name, var)
for name, var in to_summarize.items()
]
)
# + <
# This constructs the layout for the custom scalar, and specifies
# which scalars to plot.
layout_summary = custom_scalar_pb(
layout_pb2.Layout(category=[
layout_pb2.Category(
title='Custom scalar summary group',
chart=[
layout_pb2.Chart(
title='Custom scalar summary chart',
multiline=layout_pb2.MultilineChartContent(
# regex to select only summaries which
# are in "scalar_summaries" name scope:
tag=[r'^scalar_summaries\/']
)
)
])
])
)
# >
# Create session.
with tf.Session() as sess:
# Initialize session.
sess.run(tf.global_variables_initializer())
# Create writer.
with tf.summary.FileWriter(f'./logs/session_{int(time())}') as writer:
# Write the session graph.
writer.add_graph(sess.graph) # (not necessary for scalars)
# + <
# Define the layout for creating custom scalars in terms
# of the scalars.
writer.add_summary(layout_summary)
# >
# Main iteration loop.
for i in range(50):
current_summary = sess.run(my_var_summary_op)
writer.add_summary(current_summary, global_step=i)
writer.flush()
sess.run(minimize_step_op)
The above consists of an "original model" augmented by three blocks of code indicated by
# + <
[code to add custom scalars goes here]
# >
My "original model" has these scalars:
and this graph:
My modified model has the same scalars and graph, together with the following custom scalar:
This custom scalar chart is simply a layout which combines the original two scalar charts.
Unfortunately the resulting graph is hard to read because both values have the same color. (They are distinguished only by marker.) This is however consistent with TensorBoard's convention of having one color per log.
Explanation
The idea is as follows. You have some group of variables which you want to plot inside a single chart. As a prerequisite, TensorBoard should be plotting each variable individually under the "SCALARS" heading. (This is accomplished by creating a scalar summary for each variable, and then writing those summaries to the log. Nothing new here.)
To plot multiple variables in the same chart, we tell TensorBoard which of these summaries to group together. The specified summaries are then combined into a single chart under the "CUSTOM SCALARS" heading. We accomplish this by writing a "Layout" once at the beginning of the log. Once TensorBoard receives the layout, it automatically produces a combined chart under "CUSTOM SCALARS" as the ordinary "SCALARS" are updated.
Assuming that your "original model" is already sending your variables (as scalar summaries) to TensorBoard, the only modification necessary is to inject the layout before your main iteration loop starts. Each custom scalar chart selects which summaries to plot by means of a regular expression. Thus for each group of variables to be plotted together, it can be useful to place the variables' respective summaries into a separate name scope. (That way your regex can simply select all summaries under that name scope.)
Important Note: The op which generates the summary of a variable is distinct from the variable itself. For example, if I have a variable ns1/my_var, I can create a summary ns2/summary_op_for_myvar. The custom scalars chart layout cares only about the summary op, not the name or scope of the original variable.
Here is an example, creating two tf.summary.FileWriters which share the same root directory. Creating a tf.summary.scalar shared by the two tf.summary.FileWriters. At every time step, get the summary and update each tf.summary.FileWriter.
import os
import tqdm
import tensorflow as tf
def tb_test():
sess = tf.Session()
x = tf.placeholder(dtype=tf.float32)
summary = tf.summary.scalar('Values', x)
merged = tf.summary.merge_all()
sess.run(tf.global_variables_initializer())
writer_1 = tf.summary.FileWriter(os.path.join('tb_summary', 'train'))
writer_2 = tf.summary.FileWriter(os.path.join('tb_summary', 'eval'))
for i in tqdm.tqdm(range(200)):
# train
summary_1 = sess.run(merged, feed_dict={x: i-10})
writer_1.add_summary(summary_1, i)
# eval
summary_2 = sess.run(merged, feed_dict={x: i+10})
writer_2.add_summary(summary_2, i)
writer_1.close()
writer_2.close()
if __name__ == '__main__':
tb_test()
Here is the result:
The orange line shows the result of the evaluation stage, and correspondingly, the blue line illustrates the data of the training stage.
Also, there is a very useful post by TF team to which you can refer.
For completeness, since tensorboard 1.5.0 this is now possible.
You can use the custom scalars plugin. For this, you need to first make tensorboard layout configuration and write it to the event file. From the tensorboard example:
import tensorflow as tf
from tensorboard import summary
from tensorboard.plugins.custom_scalar import layout_pb2
# The layout has to be specified and written only once, not at every step
layout_summary = summary.custom_scalar_pb(layout_pb2.Layout(
category=[
layout_pb2.Category(
title='losses',
chart=[
layout_pb2.Chart(
title='losses',
multiline=layout_pb2.MultilineChartContent(
tag=[r'loss.*'],
)),
layout_pb2.Chart(
title='baz',
margin=layout_pb2.MarginChartContent(
series=[
layout_pb2.MarginChartContent.Series(
value='loss/baz/scalar_summary',
lower='baz_lower/baz/scalar_summary',
upper='baz_upper/baz/scalar_summary'),
],
)),
]),
layout_pb2.Category(
title='trig functions',
chart=[
layout_pb2.Chart(
title='wave trig functions',
multiline=layout_pb2.MultilineChartContent(
tag=[r'trigFunctions/cosine', r'trigFunctions/sine'],
)),
# The range of tangent is different. Let's give it its own chart.
layout_pb2.Chart(
title='tan',
multiline=layout_pb2.MultilineChartContent(
tag=[r'trigFunctions/tangent'],
)),
],
# This category we care less about. Let's make it initially closed.
closed=True),
]))
writer = tf.summary.FileWriter(".")
writer.add_summary(layout_summary)
# ...
# Add any summary data you want to the file
# ...
writer.close()
A Category is group of Charts. Each Chart corresponds to a single plot which displays several scalars together. The Chart can plot simple scalars (MultilineChartContent) or filled areas (MarginChartContent, e.g. when you want to plot the deviation of some value). The tag member of MultilineChartContent must be a list of regex-es which match the tags of the scalars that you want to group in the Chart. For more details check the proto definitions of the objects in https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/custom_scalar/layout.proto. Note that if you have several FileWriters writing to the same directory, you need to write the layout in only one of the files. Writing it to a separate file also works.
To view the data in TensorBoard, you need to open the Custom Scalars tab. Here is an example image of what to expect https://user-images.githubusercontent.com/4221553/32865784-840edf52-ca19-11e7-88bc-1806b1243e0d.png
The solution in PyTorch 1.5 with the approach of two writers:
import os
from torch.utils.tensorboard import SummaryWriter
LOG_DIR = "experiment_dir"
train_writer = SummaryWriter(os.path.join(LOG_DIR, "train"))
val_writer = SummaryWriter(os.path.join(LOG_DIR, "val"))
# while in the training loop
for k, v in train_losses.items()
train_writer.add_scalar(k, v, global_step)
# in the validation loop
for k, v in val_losses.items()
val_writer.add_scalar(k, v, global_step)
# at the end
train_writer.close()
val_writer.close()
Keys in the train_losses dict have to match those in the val_losses to be grouped on the same graph.
Tensorboard is really nice tool but by its declarative nature can make it difficult to get it to do exactly what you want.
I recommend you checkout Losswise (https://losswise.com) for plotting and keeping track of loss functions as an alternative to Tensorboard. With Losswise you specify exactly what should be graphed together:
import losswise
losswise.set_api_key("project api key")
session = losswise.Session(tag='my_special_lstm', max_iter=10)
loss_graph = session.graph('loss', kind='min')
# train an iteration of your model...
loss_graph.append(x, {'train_loss': train_loss, 'validation_loss': validation_loss})
# keep training model...
session.done()
And then you get something that looks like:
Notice how the data is fed to a particular graph explicitly via the loss_graph.append call, the data for which then appears in your project's dashboard.
In addition, for the above example Losswise would automatically generate a table with columns for min(training_loss) and min(validation_loss) so you can easily compare summary statistics across your experiments. Very useful for comparing results across a large number of experiments.
Please let me contribute with some code sample in the answer given by #Lifu Huang. First download the loger.py from here and then:
from logger import Logger
def train_model(parameters...):
N_EPOCHS = 15
# Set the logger
train_logger = Logger('./summaries/train_logs')
test_logger = Logger('./summaries/test_logs')
for epoch in range(N_EPOCHS):
# Code to get train_loss and test_loss
# ============ TensorBoard logging ============#
# Log the scalar values
train_info = {
'loss': train_loss,
}
test_info = {
'loss': test_loss,
}
for tag, value in train_info.items():
train_logger.scalar_summary(tag, value, step=epoch)
for tag, value in test_info.items():
test_logger.scalar_summary(tag, value, step=epoch)
Finally you run tensorboard --logdir=summaries/ --port=6006and you get:

Resources