Open Cascade has glTF writer in their current development branch - RWGltf_CafWriter
I am trying to convert STP to glTF using it and got starting point from this question - Any Open source Libraries to Convert STEP files to glTF file format?
It looks doable, but I am new to Open Cascade technology and have few questions
While calculating triangulation for shapes using BRepMesh_IncrementalMesh, it needs line deflection and angle deflection, what are these and what should be its values?
RWGltf_CafWriter requires TDocStd_Document and TDF_LabelSequence, how do we get these from Shapes?
Thank You
While calculating triangulation for shapes using BRepMesh_IncrementalMesh,
it needs line deflection and angle deflection,
what are these and what should be its values?
Deflection parameters define the mesh quality. Within specific domain / algorithm, you should probably know in advance applicable deviation of your geometry (like no more than 1 mm). However, in context of visualization and arbitrary CAD model, linear deflection is usually defined relatively to the bounding box of the document.
RWGltf_CafWriter requires TDocStd_Document and TDF_LabelSequence, how do we get these from Shapes?
TDocStd_Document is an XDE document supported by various file format translators - including STEP and glTF. If at that point you have a single TopoDS_Shape from STEP file, then you probably used a simplified STEP translator STEPControl_Reader. To preserve the structure of original document, it is better using STEPCAFControl_Reader filling in an XDE document.
Within XDE document, shapes (and not only shapes) are stored as Labels, so that TDF_LabelSequence collection is used to pass through the information like a sequence of root shapes (model tree roots in the document), which are called Free Shapes:
// read / create / fill in the document
Handle(TDocStd_Document) theXdeDoc; // created in advance
STEPCAFControl_Reader aStepReader;
if (!aStepReader.ReadFile ("myStep.stp") != IFSelect_RetDone) { // parse error }
if (!aStepReader.Transfer (theXdeDoc)) { // translation error }
...
// collect document roots into temporary compound
Handle(XCAFDoc_ShapeTool) aShapeTool = XCAFDoc_DocumentTool::ShapeTool (myXdeDoc->Main());
TDF_LabelSequence aRootLabels;
aShapeTool->GetFreeShapes (aRootLabels);
TopoDS_Compound aCompound;
BRep_Builder aBuildTool;
aBuildTool.MakeCompound (aCompound);
for (TDF_LabelSequence::Iterator aRootIter (aRootLabels); aRootIter.More(); aRootIter.Next())
{
const TDF_Label& aRootLabel = aRootIter.Value();
TopoDS_Shape aRootShape;
if (XCAFDoc_ShapeTool::GetShape (aRootLabel, aRootShape))
{
aBuildTool.Add (aCompound, aRootShape);
}
}
// perform meshing
Handle(Prs3d_Drawer) aDrawer = new Prs3d_Drawer(); // holds visualization defaults
BRepMesh_IncrementalMesh anAlgo;
anAlgo.ChangeParameters().Deflection = Prs3d::GetDeflection (aCompound, aDrawer);
anAlgo.ChangeParameters().Angle = 20.0 * M_PI / 180.0; // 20 degrees
anAlgo.ChangeParameters().InParallel = true;
anAlgo.SetShape (aCompound);
anAlgo.Perform();
...
// write or export the document
TColStd_IndexedDataMapOfStringString aMetadata;
RWGltf_CafWriter aGltfWriter ("exported.glb", true);
// STEP reader translates into mm units by default
aGltfWriter.ChangeCoordinateSystemConverter().SetInputLengthUnit (0.001);
aGltfWriter.ChangeCoordinateSystemConverter().SetInputCoordinateSystem (RWMesh_CoordinateSystem_Zup);
if (!aGltfWriter.Perform (theXdeDoc, aMetadata, Handle(Message_ProgressIndicator)())) { // export error }
In Draw Harness the conversion may look like this (the source code of commands can be used as a helpful reference of working code using related OCCT algorithms):
pload XDE OCAF VISUALIZATION MODELING
# read STEP file into XDE document
ReadStep D myStep.stp
# display the document in 3D viewer (will also compute default triangulation)
vinit
XDisplay -dispMode 1 D
vfit
# export XDE document into glTF file
WriteGltf D myGltf.glb
Related
So my goal is basically implementing global top-k subsampling. Gradient sparsification is quite simple and I have already done this building on stateful clients example, but now I would like to use encoders as you have recommended here at page 28. Additionally I would like to average only the non-zero gradients, so say we have 10 clients but only 4 have nonzero gradients at a given position for a communication round then I would like to divide the sum of these gradients to 4, not 10. I am hoping to achieve this by summing gradients at numerator and masks, 1s and 0s, at denominator. Also moving forward I will add randomness to gradient selection so it is imperative that I create those masks concurrently with gradient selection. The code I have right now is
import tensorflow as tf
from tensorflow_model_optimization.python.core.internal import tensor_encoding as te
#te.core.tf_style_adaptive_encoding_stage
class GrandienrSparsificationEncodingStage(te.core.AdaptiveEncodingStageInterface):
"""An example custom implementation of an `EncodingStageInterface`.
Note: This is likely not what one would want to use in practice. Rather, this
serves as an illustration of how a custom compression algorithm can be
provided to `tff`.
This encoding stage is expected to be run in an iterative manner, and
alternatively zeroes out values corresponding to odd and even indices. Given
the determinism of the non-zero indices selection, the encoded structure does
not need to be represented as a sparse vector, but only the non-zero values
are necessary. In the decode mehtod, the state (i.e., params derived from the
state) is used to reconstruct the corresponding indices.
Thus, this example encoding stage can realize representation saving of 2x.
"""
ENCODED_VALUES_KEY = 'stateful_topk_values'
INDICES_KEY = 'indices'
SHAPES_KEY = 'shapes'
ERROR_COMPENSATION_KEY = 'error_compensation'
def encode(self, x, encode_params):
shapes_list = [tf.shape(y) for y in x]
flattened = tf.nest.map_structure(lambda y: tf.reshape(y, [-1]), x)
gradients = tf.concat(flattened, axis=0)
error_compensation = encode_params[self.ERROR_COMPENSATION_KEY]
gradients_and_error_compensation = tf.math.add(gradients, error_compensation)
percentage = tf.constant(0.1, dtype=tf.float32)
k_float = tf.multiply(percentage, tf.cast(tf.size(gradients_and_error_compensation), tf.float32))
k_int = tf.cast(tf.math.round(k_float), dtype=tf.int32)
values, indices = tf.math.top_k(tf.math.abs(gradients_and_error_compensation), k = k_int, sorted = False)
indices = tf.expand_dims(indices, 1)
sparse_gradients_and_error_compensation = tf.scatter_nd(indices, values, tf.shape(gradients_and_error_compensation))
new_error_compensation = tf.math.subtract(gradients_and_error_compensation, sparse_gradients_and_error_compensation)
state_update_tensors = {self.ERROR_COMPENSATION_KEY: new_error_compensation}
encoded_x = {self.ENCODED_VALUES_KEY: values,
self.INDICES_KEY: indices,
self.SHAPES_KEY: shapes_list}
return encoded_x, state_update_tensors
def decode(self,
encoded_tensors,
decode_params,
num_summands=None,
shape=None):
del num_summands, decode_params, shape # Unused.
flat_shape = tf.math.reduce_sum([tf.math.reduce_prod(shape) for shape in encoded_tensors[self.SHAPES_KEY]])
sizes_list = [tf.math.reduce_prod(shape) for shape in encoded_tensors[self.SHAPES_KEY]]
scatter_tensor = tf.scatter_nd(
indices=encoded_tensors[self.INDICES_KEY],
updates=encoded_tensors[self.ENCODED_VALUES_KEY],
shape=[flat_shape])
nonzero_locations = tf.nest.map_structure(lambda x: tf.cast(tf.where(tf.math.greater(x, 0), 1, 0), tf.float32) , scatter_tensor)
reshaped_tensor = [tf.reshape(flat_tensor, shape=shape) for flat_tensor, shape in
zip(tf.split(scatter_tensor, sizes_list), encoded_tensors[self.SHAPES_KEY])]
reshaped_nonzero = [tf.reshape(flat_tensor, shape=shape) for flat_tensor, shape in
zip(tf.split(nonzero_locations, sizes_list), encoded_tensors[self.SHAPES_KEY])]
return reshaped_tensor, reshaped_nonzero
def initial_state(self):
return {self.ERROR_COMPENSATION_KEY: tf.constant(0, dtype=tf.float32)}
def update_state(self, state, state_update_tensors):
return {self.ERROR_COMPENSATION_KEY: state_update_tensors[self.ERROR_COMPENSATION_KEY]}
def get_params(self, state):
encode_params = {self.ERROR_COMPENSATION_KEY: state[self.ERROR_COMPENSATION_KEY]}
decode_params = {}
return encode_params, decode_params
#property
def name(self):
return 'gradient_sparsification_encoding_stage'
#property
def compressible_tensors_keys(self):
return False
#property
def commutes_with_sum(self):
return False
#property
def decode_needs_input_shape(self):
return False
#property
def state_update_aggregation_modes(self):
return {}
I have run some simple tests manually following the steps you outlined here at page 45. It works but I have some questions/problems.
When I use list of tensors of same shape (ex:2 2x25 tensors) as input,x, of encode it works without any issues but when I try to use list of tensors of different shapes (2x20 and 6x10) it gives and error saying
InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [2,20] != values1.shape = [6,10] [Op:Pack] name: packed
How can I resolve this issue? As i said I want to use global top-k so it is essential I encode entire trainable model weights at once. Take the cnn model used here, all the tensors have different shapes.
How can I do the averaging I described at the beginning? For example here you have done
mean_factory = tff.aggregators.MeanFactory(
tff.aggregators.EncodedSumFactory(mean_encoder_fn), # numerator
tff.aggregators.EncodedSumFactory(mean_encoder_fn), # denominator )
Is there a way to repeat this with one output of decode going to numerator and other going to denominator? How can I handle dividing 0 by 0? tensorflow has divide_no_nan function, can I use it somehow or do I need to add eps to each?
How is partition handled when I use encoders? Does each client get a unique encoder holding a unique state for it? As you have discussed here at page 6 client states are used in cross-silo settings yet what happens if client ordering changes?
Here you have recommended using stateful clients example. Can you explain this a bit further? I mean in the run_one_round where exactly encoders go and how are they used/combined with client update and aggregation?
I have some additional information such as sparsity I want to pass to encode. What is the suggested method for doing that?
Here are some answers, hope it helps:
If you want to treat all of the aggregated structure just as a single tensor, use concat_factory as the outermost aggregator. That will concatenate entire structure to a rank-1 Tensor at clients, and then unpack back to the original structure at the end. Example use: tff.aggregators.concat_factory(tff.aggregators.MeanFactory(...))
Note the encoding stage objects are meant to work with a single tensor, so what you describe with identical tensors probably works only accidentally.
There are two options.
a. Modify the client training code such that the weights being passed to the weighted aggregator are already what you want it to be (zero/one
mask). In the stateful clients example you link, that would be here. You will then get what you need by default (by summing the numerator).
b. Modify UnweightedMeanFactory to do exactly the variant of averaging you describe and use that. Start would be modifying this
(and 4.) I think that is what you would need to implement. The same way existing client states are initialized in the example here, you would need extend it to contain the aggregator states, and make sure those are sampled together with the clients, as done here. Then, to integrate the aggregators in the example you would need to replace this hard-coded tff.federated_mean. An example of such integration is in the implementation of tff.learning.build_federated_averaging_process, primarily here
I am not sure what the question is. Perhaps get the previous working (seems like a prerequisite to me), and then clarify and ask in a new post?
I used images and annotation data from the open images dataset v6.
I was able to retrieve the images, but not the annotation information.
Can you please tell me what to do?
Current status
I ran the code on GoogleColaboratory, referring to the demonstration of fiftyone.
I was able to run it up to the following point
dataset = foz.load_zoo_dataset(
"open-images-v6",
split="validation",
label_types=["detections", "classifications"],
classes=["Bottle"],
max_samples=250,
seed=51,
shuffle=True,
dataset_name="open-images-sample-mix-data",
)
person_subset = foz.load_zoo_dataset(
"open-images-v6",
split="validation",
label_types=["detections", "classifications"],
classes=["Person"],
max_samples=250,
seed=51,
shuffle=True,
dataset_name="Person-subset",
)
can_subset = foz.load_zoo_dataset(
"open-images-v6",
split="validation",
label_types=["detections", "classifications"],
classes={'Tin can'},
max_samples=250,
seed=51,
shuffle=True,
dataset_name="Tin_can-subset",
)
box_subset = foz.load_zoo_dataset(
"open-images-v6",
split="validation",
label_types=["detections", "classifications"],
classes=["Box"],
max_samples=250,
seed=51,
shuffle=True,
dataset_name="Box-subset",
)
_ = dataset.merge_samples(person_subset)
_ = dataset.merge_samples(box_subset)
_ = dataset.merge_samples(can_subset)
However, from here I want to open the file detections.csv and type the coordinate information into the generated text, but the file is too huge to open.
Could you please tell me how to get the coordinate information?
The base Open Images annotation csv files are quite large. The best way to access the bounding box coordinates would be to just iterate of the FiftyOne dataset directly and access the coordinates from the FiftyOne Detection label objects.
bboxes = []
for sample in dataset:
for detection in sample.detections.detections:
bbox = detection.bounding_box
bboxes.append(bbox)
In this loop, you can also access other information to store in your text file like the sample ids and classification annotations. While this loop is the most flexible way to get other information that you want from the dataset, if you just want the bounding box coordinates, the most efficient way to get that information is to use dataset.values()
bboxes = dataset.values("detections.detections.bounding_box")
Either way, you would then write these lists of box coordinates to a text file programmatically.
I would get some data stream about 3d position(in fixed world coordinate system) of a human's 20 skeletons.
I want to use the skeletons data to drive a human model with fixed bone like the demo video.
In Kinect SDK v1.8,i could get each skeleton's local rotation by NUI_SKELETON_BONE_ORIENTATION.hierarchicalRotation.
I want to implement some function like that.But the Kinect's SDK isn't open source.
I've found that the function xnGetSkeletonJointOrientation could get skeleton's rotation like that in OpenNI.But i haven't found the implement function about that.I don't know where am i wrong.
Any idea is appreciated.Thanks!
EDIT
I have found a similar question.
Here is the code he used finally.
Point3d Controller::calRelativeToParent(int parentID,Point3d point,int frameID){
if(parentID == 0){
QUATERNION temp = calChangeAxis(-1,parentID,frameID);
return getVect(multiplyTwoQuats(multiplyTwoQuats(temp,getQuat(point)),getConj(temp)));
}else{
Point3d ref = calRelativeToParent(originalRelativePointMap[parentID].parentID,point,frameID);
QUATERNION temp = calChangeAxis(originalRelativePointMap[parentID].parentID,parentID,frameID);
return getVect(multiplyTwoQuats(multiplyTwoQuats(temp,getQuat(ref)),getConj(temp)));
}}
QUATERNION Controller::calChangeAxis(int parentID,int qtcId,int frameID){ //currentid = id of the position of the orientation to be changed
if(parentID == -1){
QUATERNION out = multiplyTwoQuats(quatOrigin.toChange,originalRelativePointMap[qtcId].orientation);
return out;
}
else{
//QUATERNION temp = calChangeAxis(originalRelativePointMap[parentID].parentID,qtcId,frameID);
//return multiplyTwoQuats(finalQuatMap[frameID][parentID].toChange,temp);
return multiplyTwoQuats(finalQuatMap[frameID][parentID].toChange,originalRelativePointMap[qtcId].orientation);
}}
But i still have some question about that.
What does the variables quatOrigin.toChange and originalRelativePointMap stand for?
And in my opinion,the parameter Point3d point of the function Controller::calRelativeToParent should be a vector with euler angle.In this way,how to call the Controller::calRelativeToParent API in the main program.Because we know the root's rotation only.
The skeleton class has a "Joints" member that contains all the 3d position data for each tracked joint on the skeleton. I would look at the joint position data directly to drive your model rather than angles. Take one point to be your base (head or otherwise) then generate vectors in tree form between pairs of connected skeletal points. Scale those vectors and apply them to your model.
How can I get both the 4096 dim feature layer and the 1000 dim class layer in caffe after one forward pass using C++?
I tried to look it up in extract_features.cpp but it uses some weird datum object, so I cannot really understand how it works.
So far I was simply cropping my prototxt files up to the layer that I wanted to extract and used
[...]
net->ForwardPrefilled();
Blob<float> *output_layer = net->output_blobs()[0];
const float *begin = output_layer->cpu_data();
const float *end = begin + output_layer->channels();
return vector<float>(begin, end);
but that does not work if I want to extract two specific layers (eg "prob" and "fc7") simultaneously.
Update
The simple work flow of extract_feature.cpp(suppose you have a shared_ptr<Net<float> > net object in c++):
perform net forward to process input: net->Forward().
In this step, there is a Data layer in the net to read the input images. So if in your own app/code you want read an image to cv::Mat image and feed it into net, you can write a code like:
// for data preprocess
shared_ptr<caffe::DataTransformer<float> > data_transformer;
caffe::TransformationParameter trans_para;
// set mean
trans_para.set_mean_file("/path/to/image_mean.binaryproto");
// set crop size, e.g.here is cropping 227x227
trans_para.set_crop_size(227);
// instantiate a DataTransformer using trans_para for image preprocess
data_transformer.reset(new caffe::DataTransformer<float>(trans_para, caffe::TEST));
const std::vector<caffe::Blob<float> *> net_input = net->input_blobs();
// maybe you need to resize image before this step
data_transformer->Transform(image, *net_input[0]);
net->Forward();
And the net.prototxt should have a Input layer as the first layer, e.g. this deploy.prototxt.
get the feature blobs according to their names:const boost::shared_ptr<Blob<Dtype> > feature_blob = net->blob_by_name(blob_names[i])
extract the feature data from the blob you get into a structure you want, e.g. an arry, a simple sample code can be:
count = feature_blob->channels() * feature_blob->height() *
feature_blob->width();
float* feature_array = new float[count];
const float* feature_blob_data = feature_blob->cpu_data() +
feature_blob->offset(n); // feature data generated from
// the nth input image within a batch
memcpy(feature_array, feature_blob_data, count * sizeof(float));
...// other operations
delete [] feature_array;
Note that the data stored from feature_blob_data is in row-major order.
The extract_feature.cpp's usage should be like this for your task:
path/to/extract_features your_pretrained_model.caffemodel \
net.prototxt 4096_dim_feature_blob_name,1000_dim_class_feature_blob_name \
saved_4096_dim_feature_database,saved_1000_dim_class_feature_database \
num_mini_batches(times for forward pass) lmdb(or leveldb) GPU(or CPU)
The net.prototxt should contain a data layer that can read the input image data.
And when running, it will first the read image data from the data layer within net.prototxt and perform num_mini_batches times of forward pass and extract the 2 two feature blob 4096_dim_feature_blob_name, 1000_dim_class_feature_blob_name's data into a structure typed of Datum and then serialize them to save in the database saved_4096_dim_feature_database, saved_1000_dim_class_feature_database which are typed of lmdb or leveldb.
When finished, you can read the saved feature data from saved_4096_dim_feature_database, saved_1000_dim_class_feature_database using a data layer in net.prototxt respectively.
BTW, datum is a structure that can store at most 4D data as well as the data's shape and label information etc. It is defined in caffe.proto, generated using google protobuf and is convenient for data interchange between caffe and database like LMDB and LEVELDB.
I have the following problem. I have to compute dense SIFT interest points in a very high dimensional image (182MP). When I run the code in the full image Matlab always close suddently. So I decided to run the code in image patches.
the code
I tried to use blocproc in matlab to call the c++ function that performs the dense sift interest points detection this way:
fun = #(block_struct) denseSIFT(block_struct.data, options);
[dsift , infodsift] = blockproc(ndvi,[1000 1000],fun);
where dsift is the sift descriptors (vectors) and infodsift has the information of the interest points, such as the x and y coordinates.
the problem
The problem is the fact that blocproc just allow one output, but i want both outputs. The following error is given by matlab when i run the code.
Error using blockproc
Too many output arguments.
Is there a way for me doing this?
Would it be a problem for you to "hard code" a version of blockproc?
Assuming for a moment that you can divide your image into NxM smaller images, you could loop around as follows:
bigImage = someFunction();
sz = size(bigImage);
smallSize = sz ./ [N M];
dsift = cell(N,M);
infodsift = cell(N,M);
for ii = 1:N
for jj = 1:M
smallImage = bigImage((ii-1)*smallSize(1) + (1:smallSize(1)), (jj-1)*smallSize(2) + (1:smallSize(2));
[dsift{ii,jj} infodsift{ii,jj}] = denseSIFT(smallImage, options);
end
end
The results will then be in the two cell arrays. No real need to pre-allocate, but it's tidier if you do. If the individual matrices are the same size, you can convert into a single large matrix with
dsiftFull = cell2mat(dsift);
Almost magic. This won't work if your matrices are different sizes - but then, if they are, I'm not sure you would even want to put them all in a single one (unless you decide to horzcat them).
If you do decide you want a list of "all the colums as a giant matrix", then you can do
giantMatrix = [dsift{:}];
This will return a matrix with (in your example) 128 rows, and as many columns as there were "interest points" found. It's shorthand for
giantMatrix = [dsift{1,1} dsift{2,1} dsift{3,1} ... dsift{N,M}];