I am using a Pioneer 3DX with the ROSARIA library, where the sonar publishes obstacle positions from which I can extract ranges to obstacles. I mounted an Intel RealSense D435i on the Pioneer and installed all the libraries required for the camera. Now the question is: what data should I extract from the camera so that the two sensors can be fused and complement each other?
I used the YOLOv3 algorithm on the RealSense camera to detect objects, and I obtain the bounding box (x_min, y_min, x_max, y_max) and depth information for each detected object.
So which information should I use, out of the many topics the camera publishes, to fuse with the sonar range data and improve obstacle detection for avoidance?
I am attaching the sonar range data and the camera's published topics below.
Sonar ranges from the 8 sonar rings:
Point cloud of the sonar:
Camera published topics:
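For context, one common pattern is to turn each YOLO detection into a (bearing, range) measurement from the bounding-box centre and the aligned depth, and then compare it with the sonar beam covering the same bearing. A minimal sketch of that idea follows; the intrinsics, beam angles, and fusion rule are illustrative assumptions, not actual ROSARIA or RealSense values:

```python
import numpy as np

# Illustrative values -- replace with your calibrated colour intrinsics and the
# actual beam directions of the Pioneer 3DX sonar ring.
FX, CX = 615.0, 424.0                                              # assumed focal length / principal point (px)
SONAR_ANGLES = np.deg2rad([-90, -50, -30, -10, 10, 30, 50, 90])    # assumed 8 front sonar bearings (rad)

def detection_to_bearing_range(x_min, x_max, depth_m):
    """Convert a YOLO bounding box plus aligned depth into a (bearing, range) pair."""
    u = 0.5 * (x_min + x_max)              # horizontal centre of the box (pixels)
    bearing = np.arctan2(u - CX, FX)       # angle from the optical axis (rad)
    return bearing, depth_m

def fuse_with_sonar(bearing, cam_range, sonar_ranges):
    """Pick the sonar beam closest in bearing and keep the more pessimistic range."""
    beam = int(np.argmin(np.abs(SONAR_ANGLES - bearing)))
    # Conservative fusion rule: trust whichever sensor reports the nearer obstacle.
    return min(cam_range, sonar_ranges[beam]), beam

# Example: a detection spanning pixels 300..420 with 1.8 m median depth,
# fused with one snapshot of the 8 sonar readings (metres).
bearing, r_cam = detection_to_bearing_range(300, 420, 1.8)
fused, beam = fuse_with_sonar(bearing, r_cam, [2.5, 2.3, 1.9, 1.7, 1.8, 2.2, 2.6, 3.0])
print(f"bearing {np.degrees(bearing):.1f} deg, fused range {fused:.2f} m from beam {beam}")
```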
I am doing my final-year project on lane tracking using a camera. The most challenging task now is how to measure the distance between the camera (actually the car that carries it) and the lane.
While the lane is easily recognized (Hough line transform), I have found no way to measure the distance to it.
There is a known way to measure the distance to an object in front of the camera based on the pixel width of the object, but it does not work here because the nearest point of the lane line lies in the camera's blind spot.
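For reference, the pixel-width method mentioned above is the similar-triangles relation distance = focal_length × real_width / pixel_width; a tiny sketch with made-up numbers:

```python
def distance_from_pixel_width(focal_px, real_width_m, pixel_width_px):
    """Similar-triangles range estimate: Z = f * W / w."""
    return focal_px * real_width_m / pixel_width_px

# e.g. a 1.8 m wide object spanning 120 px with a 700 px focal length -> 10.5 m
print(distance_from_pixel_width(700.0, 1.8, 120.0))
```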
What you want is to directly infer a depth map from a monocular camera.
You can refer to my answer here:
https://stackoverflow.com/a/64687551/11530294
Usually, we need photometric measurements from different positions in the world to form a geometric understanding of the scene (i.e. a depth map). From a single image it is not possible to measure the geometry directly, but it is possible to infer depth from prior understanding.
One way to make a single image work is to use a deep-learning-based method to infer depth directly. These approaches are usually implemented in Python, so if you are only familiar with Python, this is the approach you should go for. If the image is small enough, I think real-time performance is possible. There are many works of this kind using Caffe, TensorFlow, PyTorch, etc.; you can search GitHub for more options. The one I posted here is what I used recently.
reference:
Godard, Clément, et al. "Digging into Self-Supervised Monocular Depth Estimation." Proceedings of the IEEE International Conference on Computer Vision, 2019.
Source code: https://github.com/nianticlabs/monodepth2
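To give an idea of the shape of this approach, a learned-depth pipeline typically looks like the sketch below. load_pretrained_depth_model is a hypothetical stand-in for whatever network you pick (the monodepth2 repository linked above ships its own loading scripts):

```python
import numpy as np
import torch
import cv2

def load_pretrained_depth_model():
    """Hypothetical helper: swap in the loading code of your chosen repo
    (e.g. monodepth2's networks). This placeholder returns an untrained layer
    with the right input/output shape so the sketch runs end to end."""
    return torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)

model = load_pretrained_depth_model()
model.eval()

frame = cv2.imread("frame.jpg")                               # hypothetical input image
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
tensor = torch.from_numpy(rgb).permute(2, 0, 1).unsqueeze(0)  # 1 x 3 x H x W

with torch.no_grad():
    # Real single-image depth networks output a relative (up-to-scale) depth
    # or disparity map roughly the same size as the input.
    depth = model(tensor).squeeze().cpu().numpy()

print(depth.shape, depth.min(), depth.max())
```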
The other way is to run single-camera SLAM on a large-FOV video. This approach has various constraints, such as needing good features, a large FOV, slow motion, etc. You can find many works of this kind, such as DTAM, LSD-SLAM, DSO, etc. There are also a couple of packages from HKUST and ETH that do the mapping given the pose (e.g. if you have GPS/compass); some of the well-known names are REMODE+SVO, open_quadtree_mapping, etc.
One typical example of single-camera-based SLAM is LSD-SLAM, which runs in real time.
It is implemented in C++ on ROS, and as I remember it publishes the depth image. You can write a Python node that subscribes either to the depth directly or to the globally optimized point cloud and projects it into a depth map from any viewing angle.
reference: Engel, Jakob, Thomas Schöps, and Daniel Cremers. "LSD-SLAM: Large-Scale Direct Monocular SLAM." European Conference on Computer Vision, Springer, Cham, 2014.
source code: https://github.com/tum-vision/lsd_slam
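A minimal sketch of such a Python node is below; the topic name /lsd_slam/depth is an assumption, so check with rostopic list what your LSD-SLAM launch configuration actually publishes:

```python
import rospy
import numpy as np
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def on_depth(msg):
    # Convert the ROS image to a numpy array; the encoding depends on the publisher.
    depth = bridge.imgmsg_to_cv2(msg, desired_encoding="passthrough")
    rospy.loginfo("depth frame %s, median %.3f", depth.shape, float(np.nanmedian(depth)))

rospy.init_node("depth_listener")
# Topic name is an assumption -- check `rostopic list` for the real one.
rospy.Subscriber("/lsd_slam/depth", Image, on_depth)
rospy.spin()
```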
I am going to set up an RGB-D rig with a Panasonic LUMIX GH5 and an Azure Kinect, similar to Depthkit Cinema.
Depthkit does not provide raw depth data, only OBJ sequence files, and I need a depth buffer that is aligned with the RGB image.
So I started to write the software for it myself. (I have rigidly mounted my Panasonic LUMIX GH5 camera and Azure Kinect together with a SmallRig fixture.)
After getting the extrinsic parameters between the GH5 and the Azure Kinect RGB sensor with OpenCV's solvePnP function, how can I use them to align the GH5 colour image with the Azure Kinect depth image?
Or should I take another approach to accomplish this?
I can't find any ideas or resources on this issue.
In the Azure Kinect documentation, I found the k4a_transformation_depth_image_to_color_camera_custom function in the Azure Kinect SDK.
Is this method useful for my case?
And if so, how can I get the k4a_transformation_t value for its parameter?
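For reference, the geometric step being asked about amounts to back-projecting each depth pixel to 3D with the Kinect depth intrinsics, transforming it with the solvePnP-derived extrinsics, and projecting it with the GH5 intrinsics. A rough NumPy sketch of that idea follows; all intrinsics are placeholders, lens distortion is ignored, and the SDK's transformation functions do the same job with proper calibration handles:

```python
import numpy as np

def align_depth_to_color(depth_mm, K_depth, K_color, R, t, out_shape):
    """Reproject a depth image from the depth camera into the colour camera.

    depth_mm : HxW depth image in millimetres (Azure Kinect convention)
    K_depth  : 3x3 intrinsics of the Kinect depth camera (placeholder)
    K_color  : 3x3 intrinsics of the GH5 (placeholder)
    R, t     : rotation/translation from depth camera to GH5, e.g. from solvePnP
    out_shape: (height, width) of the GH5 image
    """
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_mm.astype(np.float64) / 1000.0          # metres
    valid = z > 0

    # Back-project depth pixels to 3D points in the depth-camera frame.
    x = (u - K_depth[0, 2]) * z / K_depth[0, 0]
    y = (v - K_depth[1, 2]) * z / K_depth[1, 1]
    pts = np.stack([x[valid], y[valid], z[valid]])     # 3 x N

    # Transform into the GH5 frame and project with its intrinsics.
    pts_c = R @ pts + t.reshape(3, 1)
    uc = K_color[0, 0] * pts_c[0] / pts_c[2] + K_color[0, 2]
    vc = K_color[1, 1] * pts_c[1] / pts_c[2] + K_color[1, 2]

    aligned = np.zeros(out_shape)
    inside = (uc >= 0) & (uc < out_shape[1]) & (vc >= 0) & (vc < out_shape[0]) & (pts_c[2] > 0)
    # Nearest-pixel splat; a real implementation should handle occlusion / z-buffering.
    aligned[vc[inside].astype(int), uc[inside].astype(int)] = pts_c[2][inside]
    return aligned
```

A real implementation would also undistort (or fold the distortion model into the projection) and handle occlusion before comparing the result against what k4a_transformation_depth_image_to_color_camera_custom produces.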
I am currently doing a project with a FLIR Blackfly camera (BFS-U3-120S4C to be exact), and have not been getting a low enough reprojection error in my calibration experiments.
I know that manufacturers often make the intrinsic parameters and distortion coefficients available. FLIR provides a Spinnaker Python API to operate their cameras and advertises them as machine vision cameras, so I strongly suspected this would be the case for them too.
However, I have not been successful in finding this through Google searches. Does anyone know whether FLIR has published the values of these parameters somewhere?
Thanks for the help!
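For context, the reprojection error mentioned above is the RMS value returned by a standard checkerboard calibration; a minimal OpenCV sketch (board size, square size, and image paths are placeholders):

```python
import glob
import cv2
import numpy as np

# Placeholder checkerboard geometry: inner corners and square size in metres.
PATTERN = (9, 6)
SQUARE = 0.025

objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):        # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1),
                                   (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# rms is the reprojection error in pixels; K and dist are the intrinsics sought above.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points,
                                                 gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Camera matrix:\n", K)
print("Distortion coefficients:", dist.ravel())
```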
The APIs shipped with the Microsoft Windows Kinect SDK are all about programming around voice, movement, and gesture recognition related to humans.
Are there any open-source or commercial APIs for tracking and recognizing dynamically moving objects, such as vehicles, for classification?
Is it feasible, and a better approach, to employ the Kinect for automated vehicle classification rather than traditional image processing approaches?
Even though image processing technologies have made remarkable advances, why is fully automated vehicle classification not used at most toll collection points?
Why are existing technologies (except the RFID approach) failing to classify vehicles (i.e. they are not yet 100% accurate), or are there other reasons apart from image processing?
You will need to use a regular image processing suite to track objects that are not supported by the Kinect API. A few being:
OpenCV
Emgu CV (OpenCV in .NET)
ImageMagick
There is no library that directly supports the depth capabilities of the Kinect, to my knowledge. As a result, using the Kinect over a regular camera would be of no benefit.
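For instance, a very basic moving-object detector for something like vehicles, built on OpenCV alone, might start from background subtraction; the video path, kernel size, and area threshold below are arbitrary placeholders:

```python
import cv2

# Basic moving-object detection with OpenCV background subtraction.
cap = cv2.VideoCapture("traffic.mp4")                # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    # OpenCV 4.x returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:                 # ignore small blobs
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("vehicles", frame)
    if cv2.waitKey(30) & 0xFF == 27:                 # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```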