For a while now I have been really interested in image processing, especially for VFX. After watching several movies such as Rise of the Planet of the Apes, I have been trying to replace an actor's face with a 3D model without requiring special feature points to be marked on the face.
As you can see in the second picture, I can already get the points on my face.
I would like to map these point positions onto a 3D model so that I can control the model with my facial expressions...
What can I do? 3ds Max and Maya do not seem to have a plug-in for this (or I simply could not find one); Unity might also be a good solution. I once tried OpenGL, and I could control the model's position, but controlling the model's facial expression is much more difficult...
Could anyone help me with some suggestions or some papers to read?
Thanks a lot
I'm using SceneKit to create a 3D Room for a Swift iOS app.
I'm using multiple boxes and placing them together to create the different walls of the room. I also want to add doors and windows to the room, for which I need to cut holes into the walls. This seems like a very common scenario, yet I couldn't find any relevant answers out there.
I know there are multiple ways of doing it -
Simplest being, don't cut the box. Place another box with door or wall texture.
But I do want to keep a light source outside of the room and have it flow into the room through these doors and windows.
Create multiple boxes for a single wall and put them together to make a geometry
My last resort maybe.
Create custom geometry.
Feels too complicated, since it requires me to define each triangle myself. Not sure.
But what I was actually expecting -
Subtract geometries from geometries?
Library that's already handling these complexities?
Any pointers would be very helpful.
Thanks.
SceneKit offers some awesome potential, but it's not a substitute for a 3D modeling program. If you want something much beyond assembling primitives and extruding in a plane, you should think about constructing your model in a dedicated 3D package and exporting it into SceneKit as a .dae file. You might take a look at Blender. It's free and readily available on the net. I suspect it can easily do what you want, and the learning curve will be compensated by the higher-level functions of a graphics program versus coding.
I think #bpedit described the best approach.
A weak second choice would be to use SCNShape to build your geometry. That still leaves you the problem of constructing a Bezier path that matches your wall layout/topology. That might be a helpful hack in the short term, to save you from an immediate learning curve in modeling software. But I predict you'll still eventually move to a tool like Blender, SketchUp, Cheetah 3D, or Maya.
I want to do something like this but in reverse-- so that the cameras are outside and pointing inward. Let's start with the abstract and get specific:
1) Are there any TOOLS that will do this for me? How close can I get using existing software?
2) Say the nearest tool is a graphics library like OpenCV. I've taken linear algebra and have an undergraduate degree in CS but without any special training in graphics. Where should I go from there?
3) If I really am undergoing a decade-long spiritual quest of a self-teaching+programming exercise to make this happen, are there any papers or other resources that you are aware of that might aid me?
I think the demo you linked uses a 360° camera (see the black circle on the bottom) and does not involve stitching in any way.
About your question, are you aware of this work? They don't do stitching either, just blending between different views.
If you use inward views, then the objects you will observe will probably be quite close to the cameras, while standard stitching assumes that objects are far away. Close 3D objects mean high distortion when you change the viewpoint (i.e. parallax & occlusions), which makes it difficult to interpolate between two views. Hence, if you want stitching, then your main problem is to correctly handle parallax effects & occlusions between the views.
In my opinion, the most promising approach would be to do live stereo matching (i.e. dense 3D reconstruction) between the two camera images closest to your current viewpoint, and then interpolate the estimated disparities to generate an expected image. However, it's not likely to run in real-time, as demonstrated in the demo you linked, and the result could be quite ugly...
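To make that a bit more concrete, here is a minimal Python/OpenCV sketch of dense stereo matching followed by a very naive disparity-based warp towards an intermediate viewpoint. It assumes the two neighbouring camera images are already rectified; the file names and matcher parameters are placeholders, and a real system would also have to handle occlusions properly:

    import cv2
    import numpy as np

    # Two neighbouring camera views, assumed rectified (placeholder file names).
    left = cv2.imread("view_left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("view_right.png", cv2.IMREAD_GRAYSCALE)

    # Semi-global block matching gives a dense disparity map.
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,      # must be divisible by 16
        blockSize=5,
        P1=8 * 5 * 5,
        P2=32 * 5 * 5,
        uniquenessRatio=10,
        speckleWindowSize=100,
        speckleRange=2,
    )
    disparity = matcher.compute(left, right).astype(np.float32) / 16.0

    # Naive view interpolation: warp the left image halfway towards the right
    # view by shifting each pixel by half its disparity (ignores occlusions).
    h, w = left.shape
    map_x, map_y = np.meshgrid(np.arange(w, dtype=np.float32),
                               np.arange(h, dtype=np.float32))
    map_x += 0.5 * disparity            # 0.5 -> halfway viewpoint
    interpolated = cv2.remap(left, map_x, map_y, cv2.INTER_LINEAR)
    cv2.imwrite("interpolated_view.png", interpolated)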
EDIT
You can also have a look at this paper, which uses a different but interesting approach; however, it may not be directly useful in your case since it requires the new viewpoint to be visible in the available images.
First off, I'd like to state that I'm very new to this field and apologize if the question is a little too repetitive. I've looked around but in vain. I'm working on reading Hartley and Zisserman's book but it's taking me a while.
My problem is that I've got 3 video sources of an area and I need to find the camera position at each frame of the video. I do not have any information about the cameras that took the videos (i.e. no intrinsics).
Looking for a solution, I came across SfM and tried existing software, namely Bundler and VisualSFM, and they both seem to have worked quite well. However, I've got a couple of questions about it.
1) Is SfM really required in my case? SfM produces a sparse reconstruction and the common points between images as outputs, but the camera positions are all I really need. Are there less complex or more suitable methods I could use instead?
2) From what I've read, I need to calibrate the camera and find its intrinsics and extrinsics. How can I do this without knowing either? I've looked at the 5-point problem and others, but most of them require you to know the intrinsic properties of the camera, which I don't have, and I cannot use a pattern such as a chessboard to calibrate, since the videos come from a source outside my control.
Thanks for your time!
Based on my experience, the short answer is:
1) You cannot reliably estimate the 3D pose of the cameras independently of the 3D structure of the scene. Moreover, since your cameras are moving independently, I think SfM is the right way to approach your problem.
2) You need to estimate the cameras' intrinsics in order to obtain useful (i.e. Euclidean) poses and scene reconstruction. If you cannot use the standard calibration procedure with a chessboard and the like, you can have a look at autocalibration techniques (see also chapter 19 in Hartley and Zisserman's book). This calibration is done independently for each camera and only requires several images taken at different positions, which seems appropriate in your case.
You can actually accomplish your task with a massive bundle adjustment procedure, up to a scaling parameter. But it is a very complicated thing, even if you aren't a novice. You don't need a 3D reconstruction, just an essential matrix, which can be obtained from 2D point correspondences and decomposed into rotation and translation; this does, however, require the intrinsic parameters. To get them you have to have at least three frames.
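For illustration, here is a small Python/OpenCV sketch of that essential-matrix route. It assumes you already have matched 2D points between two frames and at least a rough guess of the intrinsic matrix K (the numbers below are made up):

    import cv2
    import numpy as np

    # pts1, pts2: Nx2 arrays of matched pixel coordinates in two frames
    # (e.g. from feature matching); K is the 3x3 intrinsic matrix.
    def relative_pose(pts1, pts2, K):
        # Estimate the essential matrix with RANSAC to reject outliers.
        E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                          method=cv2.RANSAC,
                                          prob=0.999, threshold=1.0)
        # Decompose E into a rotation and a translation direction.
        # The translation is only known up to scale, as noted above.
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
        return R, t

    # Example with made-up intrinsics (focal length 1000 px, 1280x720 frames).
    K = np.array([[1000., 0., 640.],
                  [0., 1000., 360.],
                  [0., 0., 1.]])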
Finally, drop the Hartley and Zisserman book, it will drive you crazy. Read Simon Prince's "Computer Vision" instead.
I am looking into camera calibration techniques with OpenCV and saw the chessboard and circle-grid methods, but I want to calibrate the camera with something that already exists in the real world and doesn't have to be printed (printers are also not very accurate in what they print).
Is it possible to do calibration with complex shapes like the Coca Cola logo on the cans? Is it a problem that the surface is curved?
Thanks
Depending on what you want to achieve, this is not necessarily a bad idea, and you are not the first one to have it. There was a technology that used a CD, a strongly standardised object which at least used to exist in most households, for a simple camera calibration task. (There is little technical information to be found online about this, as the technology was proprietary. This is a business document where the use of the CD is mentioned. Algorithmically, however, it is not difficult if you know camera calibration.)
The question is whether the precision you get is sufficient for your application. Don't expect any miracles here. Generally you can use almost any object you like to learn something about a camera, as long as you can detect it reliably and you know its geometry. Almost certainly you will have to take several pictures of the object. Curved surfaces are no problem per se. I regularly used a cylinder (larger than a beverage can, though, with an easy-to-detect pattern) to calibrate a complete camera rig of 12 SLRs.
Don't expect to find out-of-the-box solutions and don't expect the implementation to be trivial. You will have to work your way through the math. I recommend the book by Hartley and Zisserman, Multiple View Geometry in Computer Vision. This paper describes an analysis-by-synthesis approach to calibration, which is the way to go here (it does not describe exactly what you want, but the approach should generalise to arbitrary objects as long as you can detect them).
I can understand your wish, but it's a bad idea.
The calibration algorithm works by comparing real-world points from the camera with a synthetic model (yes, you have to supply that, too!). So, while it's easy to calculate a 2D chessboard grid on the fly and use that, it will be very hard to do for your tin can, or any arbitrary household item you grab.
Just give in, and print a rectangular chessboard grid on a piece of paper
(OpenCV comes with a PDF for that already).
Don't use a real-life chessboard; a square one is ambiguous under 90° rotation.
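For reference, the standard printed-chessboard calibration in OpenCV is only a handful of lines. A minimal Python sketch, assuming a 9x6 inner-corner board; the folder name and square size are placeholders:

    import glob
    import cv2
    import numpy as np

    pattern = (9, 6)                      # inner corners of the printed board
    square = 25.0                         # square size in mm (any unit works)

    # 3D coordinates of the corners in the board's own coordinate system.
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    obj_points, img_points = [], []
    for path in glob.glob("calib_images/*.jpg"):      # placeholder folder
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
            obj_points.append(objp)
            img_points.append(corners)

    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)
    print("RMS reprojection error:", rms)
    print("Camera matrix:\n", K)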
Interesting idea.
What about displaying a checkerboard pattern (or something else) on an LCD screen and using that display as the calibration pattern? You would have to know the displayed size of the pattern, though.
Googling I found this paper:
CAMERA CALIBRATION BASED ON LIQUID CRYSTAL DISPLAY (LCD)
ZHAN Zongqian
http://www.isprs.org/proceedings/XXXVII/congress/3b_pdf/04.pdf
Comment: this doesn't answer the question about the Coca-Cola can, but it gives an idea for a solution to the underlying problem: camera calibration with a common object.
I have a very specific application in which I would like to try structure from motion to get a 3D representation. So far, all the software/code samples I have found for structure from motion work like this: a fixed object is photographed from all angles to create the 3D model. This is not my case.
In my case, the camera is moving in the middle of a corridor and looking forward. Sometimes the camera can look in other directions (left, right, up, down). The camera never goes back or looks back; it always moves forward. Since the corridor is small, almost everything is visible (no hidden spots). The corridor can sometimes be very long.
I have tried this software and it doesn't work in my particular case (but it's fantastic for normal use). Can anybody suggest a library/software/tool/paper that could target my specific needs? Or have you ever needed to implement something like that? Any help is welcome!
Thanks!
What kind of corridors are you talking about and what kind of precision are you aiming for?
A priori, I don't see why your corridor would not be a fixed object photographed from different angles. The quality of your reconstruction might suffer if you only look forward and you can't get many different views of the scene, but standard methods should still work. Are you sure that the programs you used aren't failing because of your picture quality, arrangement or other reasons?
If you have to do the reconstruction yourself, I would start by
1) Calibrating your camera
2) Undistorting your images
3) Matching feature points in subsequent image pairs
4) Extracting a 3D point cloud for each image pair
You can then orient the point clouds with respect to one another, for example via ICP between two subsequent clouds. More sophisticated methods might not yield much difference if you don't have any closed loops in your dataset (as your camera is only moving forward).
OpenCV and the Point Cloud Library should be everything you need for these steps. Visualization might be more of a hassle, but the pretty pictures are what you pay for in commercial software after all.
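As a rough illustration of steps 2 to 4, here is a minimal Python/OpenCV sketch for a single image pair. It assumes the camera matrix K and distortion coefficients dist are already known from step 1, and that img1 and img2 are consecutive grayscale frames; it is a sketch, not a robust pipeline:

    import cv2
    import numpy as np

    def pair_to_cloud(img1, img2, K, dist):
        # Step 2: undistort both frames.
        img1 = cv2.undistort(img1, K, dist)
        img2 = cv2.undistort(img2, K, dist)

        # Step 3: detect and match ORB features between the two frames.
        orb = cv2.ORB_create(2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

        # Step 4: relative pose + triangulation -> 3D points (up to scale).
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
        _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
        P2 = K @ np.hstack([R, t])
        pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
        return (pts4d[:3] / pts4d[3]).T          # Nx3 point cloud

Aligning the per-pair clouds (e.g. via ICP, as mentioned above) would then be handled by the Point Cloud Library or a similar tool.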
Edit (2017/8): I haven't worked on this in the meantime, but I feel like this answer is missing some pieces. If I had to answer it today, I would definitely suggest looking into the keyword monocular SLAM, which has recently seen a lot of activity, not least because of drones with cameras. Notably, LSD-SLAM is open source and may not be as vulnerable to feature-deprived views, as it operates directly on the intensity. There even seem to be approaches combining inertial/odometry sensors with the image matching algorithms.
Good luck!
FvD is right in the sense that your corridor is a static object. Your scenario is the same as moving around an object and taking images from multiple views. Your views are just not arranged to provide a 360-degree view of the object.
I see you mentioned in a previous comment that the data is coming from a video? In that case, the problem could very well be the camera calibration. Camera calibration tells the SfM algorithm about the internal parameters of the camera (focal length, principal point, lens distortion, etc.). In the absence of knowledge about these, the bundler in VSfM uses information from the EXIF data of the images. However, I don't think video stores any EXIF information (not 100% sure). As a result, I think the entire algorithm is running with bad focal length information and cannot solve for the orientation.
Can you extract a few frames from the video and see if there is any EXIF information?
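If it helps, a quick way to check is sketched below in Python. It assumes Pillow is installed and the file names are placeholders; note that frames written out from a video will usually carry no EXIF themselves, so you would also inspect a still image taken with the same camera. If no focal length tag shows up, you would likely need to provide the focal length yourself.

    import cv2
    from PIL import Image
    from PIL.ExifTags import TAGS

    # Extract a frame from the video for inspection / SfM input.
    cap = cv2.VideoCapture("input_video.mp4")
    ok, frame = cap.read()
    if ok:
        cv2.imwrite("frame_0000.png", frame)
    cap.release()

    # Dump EXIF tags of an image file; the focal length tag is what the
    # bundler would normally read to guess the calibration.
    exif = Image.open("still_from_same_camera.jpg").getexif()
    for tag_id, value in exif.items():
        print(TAGS.get(tag_id, tag_id), value)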