I created an .arobject file from Apple's object scanning sample code.
Now I am wondering: is there any way to convert this .arobject file to a .usdz file?
No, in ARKit 5.0 and earlier you can't convert an .arobject file into the .usdz file format (or vice versa). That's because an .arobject file contains only the spatial feature information needed to recognize a scanned real-world object; it is not a displayable 3D reconstruction mesh of that object. In other words, .arobject stores a sparse point cloud, not a dense point cloud.
If you want to create a 3D model from a dense point cloud, you need a dedicated RealityKit API for that. Look at this post and this post for further details.
I am using Drake to implement a visual servoing scheme. For that I need to process color images and extract features.
I am using the ManipulationStation class, and I have already published the color images of the rgbd_sensor objects to LCM channels (in c++). Now I want to process the images during the simulation.
I know that it would be best to process the images internally (without using the ImageWriter), and for that I cannot work directly with the image_array_t type on the LCM channels or with ImageRgba8U; I have to convert the images to an Eigen or OpenCV type.
I will then use feature extraction functions in these libraries. These features will be used in a control law.
Do you have any examples of how to write a system, or code in C++, that converts Drake images to OpenCV or Eigen types? What is the best way to do it?
Thanks in advance for the help!
Arnaud
There's currently no example that converts a drake::Image to Eigen or OpenCV types, but it should be pretty straightforward.
What you should do is create a system akin to ImageWriter. It takes an image as an input and does whatever processing you need on the image. You connect its input port to the RgbdSensor's output port (again, like ImageWriter). The only difference is that instead of taking the image and converting it to an LCM data type, you convert it to your OpenCV or Eigen type and apply your logic to that.
If you look, ImageWriter declares a periodic publish event. Change that to (probably) a discrete update event (still periodic, I'd imagine) and then change the ImageWriter::WriteImage callback into your "image-to-OpenCV" callback.
Finally, you could also declare an output port that would make your converted image available to other systems.
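Here is a minimal sketch of such a system. To be clear, this is my own illustration, not code from Drake: the class name, port name, and update period are made up, it uses a periodic publish event rather than a discrete update for simplicity, and the DeclareAbstractInputPort / DeclarePeriodicPublishEvent signatures should be checked against your Drake version.

```cpp
// Sketch only: a LeafSystem that receives an RGBA image from an RgbdSensor
// and converts it to a cv::Mat on a fixed period. Names and the 0.1 s period
// are illustrative; verify the declaration signatures in your Drake version.
#include <opencv2/opencv.hpp>

#include "drake/common/value.h"
#include "drake/systems/framework/leaf_system.h"
#include "drake/systems/sensors/image.h"

using drake::systems::sensors::ImageRgba8U;

class ImageToOpenCv final : public drake::systems::LeafSystem<double> {
 public:
  ImageToOpenCv() {
    color_input_ = &DeclareAbstractInputPort(
        "color_image", drake::Value<ImageRgba8U>());
    // Process the latest image every 0.1 s of simulated time.
    DeclarePeriodicPublishEvent(0.1, 0.0, &ImageToOpenCv::ProcessImage);
  }

 private:
  drake::systems::EventStatus ProcessImage(
      const drake::systems::Context<double>& context) const {
    const auto& image = color_input_->Eval<ImageRgba8U>(context);
    // Wrap Drake's row-major, interleaved RGBA buffer in a cv::Mat header
    // (no copy), then convert to the BGR layout most OpenCV routines expect.
    cv::Mat rgba(image.height(), image.width(), CV_8UC4,
                 const_cast<uint8_t*>(image.at(0, 0)));
    cv::Mat bgr;
    cv::cvtColor(rgba, bgr, cv::COLOR_RGBA2BGR);
    // ... run your feature extraction / control law on `bgr` here ...
    return drake::systems::EventStatus::Succeeded();
  }

  const drake::systems::InputPort<double>* color_input_{};
};
```

You would then wire the RgbdSensor's (or the ManipulationStation's) color-image output port to this system's input in your DiagramBuilder, exactly as ImageWriter is wired up.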
I've retrained an ssd_mobilenet_v2 via the TensorFlow Object Detection API on my custom class. I've now got a frozen_inference_graph.pb file, which is ready to be embedded into my app.
The tutorials on TensorFlow's GitHub and website only show how to use it with the iOS built-in camera stream. Instead, I have an external camera for my iPhone, which streams to a UIView component. I want my network to detect objects in that view, but my research doesn't point to any obvious implementations or tutorials.
My question: does anyone know whether this is possible? If so, what's the best way to implement such a thing? TensorFlow Lite? TensorFlow Mobile? Core ML? Metal?
Thanks!
In that TensorFlow source code, the file CameraExampleViewController.mm has a method runCNNOnFrame that takes a CVPixelBuffer object as input (from the camera) and copies its contents into image_tensor_mapped.data(). It then runs the TF graph on that image_tensor object.
To use a different image source, such as the contents of a UIView, you need to first read the contents of that view into some kind of memory buffer (typically a CGImage) and then copy that memory buffer into image_tensor_mapped.data().
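Here is a rough sketch of that copy step, as it might look inside the same .mm file. This is not code from the TensorFlow example: it assumes you have already rendered the UIView into a CGImage (for example with UIGraphicsImageRenderer), and that the graph expects wanted_width x wanted_height RGB floats normalized with the example's input_mean / input_std constants; adjust those to your model.

```cpp
// Sketch only: scale a CGImage of the view's contents into an RGBA bitmap of
// the size the network expects, then copy it into the float input tensor the
// same way runCNNOnFrame fills image_tensor_mapped.data() from the camera.
#include <CoreGraphics/CoreGraphics.h>
#include <vector>

void CopyCGImageIntoTensor(CGImageRef view_image, float* tensor_data,
                           int wanted_width, int wanted_height,
                           float input_mean, float input_std) {
  std::vector<uint8_t> rgba(wanted_width * wanted_height * 4);
  CGColorSpaceRef color_space = CGColorSpaceCreateDeviceRGB();
  CGContextRef context = CGBitmapContextCreate(
      rgba.data(), wanted_width, wanted_height, 8, wanted_width * 4,
      color_space, (CGBitmapInfo)kCGImageAlphaPremultipliedLast);
  // Drawing into the fixed-size context rescales the view image for us.
  CGContextDrawImage(
      context, CGRectMake(0, 0, wanted_width, wanted_height), view_image);
  CGContextRelease(context);
  CGColorSpaceRelease(color_space);

  // Drop the alpha channel and normalize each RGB byte into the tensor.
  for (int y = 0; y < wanted_height; ++y) {
    for (int x = 0; x < wanted_width; ++x) {
      const uint8_t* pixel = &rgba[(y * wanted_width + x) * 4];
      float* out = tensor_data + (y * wanted_width + x) * 3;
      for (int c = 0; c < 3; ++c) {
        out[c] = (static_cast<float>(pixel[c]) - input_mean) / input_std;
      }
    }
  }
}
```

Here tensor_data would be the pointer you already get from image_tensor_mapped.data() in the example.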
It might be easier to convert the TF model to Core ML (if possible) and then use the Vision framework to run it, since Vision can take a CGImage directly as input. That saves you from having to convert the image into a tensor first.
I want to make a project in which we can build a 3D model of an object from a sequence of images. So I want to know:
How can I make a 3D model from a sequence of 2D images?
Is there any tutorial for it, either on a website or in PDF format?
I searched OpenCV's website but couldn't find a topic related to 3D models.
Here is the OpenCV SfM (structure-from-motion) module documentation: link
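As a starting point, here is a short sketch in the spirit of the module's scene reconstruction sample. Note that the sfm module ships with opencv_contrib and needs Ceres Solver, the intrinsics below are only a rough initial guess for your camera, and the function signature should be checked against your OpenCV version.

```cpp
// Sketch based on the sfm module's reconstruction API: estimate camera poses
// and a sparse 3D point cloud from a list of image files of the same object.
#include <opencv2/core.hpp>
#include <opencv2/sfm.hpp>
#include <vector>

int main() {
  std::vector<cv::String> image_paths = {"view0.jpg", "view1.jpg",
                                         "view2.jpg"};  // your image sequence

  // Rough initial guess of the camera intrinsics (focal length and principal
  // point, in pixels); refine these for your own camera.
  const double f = 1000.0, cx = 640.0, cy = 360.0;
  cv::Matx33d K(f, 0, cx,
                0, f, cy,
                0, 0, 1);

  std::vector<cv::Mat> Rs, ts, points3d;
  const bool is_projective = true;
  cv::sfm::reconstruct(image_paths, Rs, ts, K, points3d, is_projective);

  // points3d now holds the reconstructed sparse cloud; Rs and ts hold the
  // estimated rotation and translation of the camera for each view.
  return 0;
}
```

The result is a sparse point cloud; to get a dense, textured model you would typically feed the estimated camera poses into a multi-view stereo or meshing pipeline afterwards.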
So I am working on a project for school, and what we are trying to do is teach a neural network to distinguish buildings from non-buildings. The problem I am having right now is representing the data in a form that would be "readable" by the classifier function.
The training data is a bunch of pictures plus a .wkt file with the coordinates of the buildings in each picture. So far we have been able to rescale the polygons, but we got stuck there.
Can you give any hints or ideas on how to bring this all into an appropriate form?
Edit: I do not need the code written for me; a link to an article on a similar subject, or a book, is more what I am looking for.
You did not mention which framework you are using, but I will give an answer for Caffe.
Your problem is very close to detecting objects within an image. You have full images with object (building, in your case) bounding boxes.
The easiest way of doing this is through a Python data layer, which reads an image and a file with the stored coordinates for that image and feeds them into your network. A tutorial on how to use it can be found here: https://github.com/NVIDIA/DIGITS/tree/master/examples/python-layer
To speed up the process, you may want to store the image/coordinate pairs in a custom LMDB database.
Finally, a good working example with a complete Caffe implementation can be found in the Faster R-CNN library here: https://github.com/rbgirshick/caffe-fast-rcnn/
You should check roi_pooling_layer.cpp in their custom Caffe branch, and roi_data_layer for how the data is fed into the network.
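Whatever layer you end up using, the .wkt polygons eventually have to become bounding boxes expressed in the pixel space of the resized image the network sees. Here is a small illustrative sketch of that conversion (the struct names and scaling convention are mine, not from Caffe or Faster R-CNN):

```cpp
// Illustrative only: turn one building's polygon vertices (as parsed from the
// .wkt file, in original-image pixel coordinates) into an axis-aligned
// bounding box scaled to the resized image that is fed to the network.
#include <algorithm>
#include <limits>
#include <vector>

struct Point { double x, y; };
struct Box { double xmin, ymin, xmax, ymax; };

// scale_x = resized_width / original_width; scale_y likewise for height.
Box PolygonToScaledBox(const std::vector<Point>& polygon,
                       double scale_x, double scale_y) {
  Box box{std::numeric_limits<double>::max(),
          std::numeric_limits<double>::max(),
          std::numeric_limits<double>::lowest(),
          std::numeric_limits<double>::lowest()};
  for (const Point& p : polygon) {
    box.xmin = std::min(box.xmin, p.x * scale_x);
    box.ymin = std::min(box.ymin, p.y * scale_y);
    box.xmax = std::max(box.xmax, p.x * scale_x);
    box.ymax = std::max(box.ymax, p.y * scale_y);
  }
  return box;
}
```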
The 3D points are generated by my laser scanner. I want to save them in the ADF format so that Google Tango can use them.
The short answer is... you probably can't.
There is no public documentation of the ADF format, but in any case it uses more than the 3D points from the depth camera. If you watch the Google I/O videos, they show how Tango uses the wide-angle camera to obtain image features and recognize the environment. I guess using only 3D data would be too expensive and would not be able to exploit information from distant points.