How do I standardize the orientation of several PLY models that are oriented differently from each other in MeshLab?

This is the only question I found that is somewhat similar to mine: Stackoverflow Question.
The biggest difference is that I want to find a way to orient multiple .ply point clouds that aren't oriented the same way after loading.
For example:
The first .ply object from one room looks, without applying any filter, like a top-down view.
The second .ply object from another room looks, again without applying any filter, like a view from underground with a 180° spin added.
How can I unify the orientation of these two .ply objects?
The reason behind my question is that I need to apply the same filters to both (or even more) .ply objects and get at least usable results out of them. But the "Compute normals for point clouds" filter, which is required before the "Surface Reconstruction: Screened Poisson" filter, asks for a view direction or view position, which is hard to provide when every .ply object is oriented differently.
Hopefully you understand what I'm asking and can help me. Of course, I'm happy to answer any further questions.
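One possible way to script this outside the MeshLab GUI is to pre-align each cloud with PCA and then orient the normals consistently, so Screened Poisson no longer needs a hand-picked view direction per cloud. A minimal sketch, assuming the Open3D Python library and hypothetical file names room1.ply / room2.ply:

```python
# Rough sketch (not MeshLab itself): PCA-align each .ply and orient its
# normals consistently before Screened Poisson reconstruction.
import numpy as np
import open3d as o3d

def pca_align(pcd):
    """Rotate a point cloud so its principal axes line up with X/Y/Z."""
    pts = np.asarray(pcd.points)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # Eigenvectors of the covariance matrix give the principal directions.
    _, eigvecs = np.linalg.eigh(np.cov(centered.T))
    rotation = eigvecs.T                     # world <- principal axes
    if np.linalg.det(rotation) < 0:          # keep it a proper rotation
        rotation[2, :] *= -1
    pcd.translate(-centroid)
    pcd.rotate(rotation, center=np.zeros(3))
    return pcd

for path in ["room1.ply", "room2.ply"]:      # hypothetical file names
    cloud = o3d.io.read_point_cloud(path)
    cloud = pca_align(cloud)
    # Estimate normals and orient them consistently, so Poisson does not
    # need a per-cloud view direction.
    cloud.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    cloud.orient_normals_consistent_tangent_plane(k=30)
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        cloud, depth=9)
    o3d.io.write_triangle_mesh(path.replace(".ply", "_mesh.ply"), mesh)
```

The search radius and Poisson depth are guesses and would need tuning to the scale of the scans.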

Related

How to restrict findTransformEcc to a partial affine transform with scale but without shear?

I built a stereoscopic camera mobile app which performs automatic alignment using findTransformEcc, and the app is working pretty well with it. I know I should probably be using stereoRectifyUncalibrated preceded by keypoints, descriptors, etc., but I get bad results from that despite many different approaches attempted, and I'm super frustrated. So instead, I'm sticking with findTransformEcc (at least for now). At the moment I'm using MotionType.Euclidean (restricted to translations and rotations), but I would like to change that.
So far, the app has worked by having the user take one picture and then move to the side to capture the next (the "cha-cha" method). But now I'm adding the ability to have two phones capture simultaneously. The problem is that the focal length and sensor size (angular field of view) may differ between the two cameras, so in order to align the two pictures I need to allow scaling/zooming. However, if I want to do that with findTransformEcc, I can apparently only step up from Euclidean to Affine; there seems to be nothing in between. That is, it seems I cannot allow scaling without also allowing shearing, and I don't want shearing.
As another way to explain this: I'd like to get the type of transform you can get from estimateRigidTransform(array, array, false) (a partial affine), but rather than using keypoints as that function does, I want to use findTransformEcc, because from my experimentation it just seems to be more reliable.
(https://github.com/KRA2008/crosscam/blob/develop/AutoAlignment/OpenCV.cs is the auto-alignment code if that helps at all)
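One hedged workaround, separate from the answer below: estimate a full affine with findTransformECC and then project its 2x2 linear part onto the nearest rotation-plus-uniform-scale, which drops the shear after the fact. A Python/OpenCV sketch, assuming two single-channel grayscale images:

```python
# Sketch: estimate a full affine with ECC, then strip the shear by
# projecting the linear part onto the closest rotation * uniform scale.
import cv2
import numpy as np

def ecc_partial_affine(template_gray, moving_gray):
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
    # inputMask=None, gaussFiltSize=5 keeps the call valid on OpenCV 4.x.
    _, warp = cv2.findTransformECC(template_gray, moving_gray, warp,
                                   cv2.MOTION_AFFINE, criteria, None, 5)
    A, t = warp[:, :2], warp[:, 2]
    # Polar decomposition via SVD: the closest rotation is U @ Vt and the
    # uniform scale is the mean singular value.
    U, S, Vt = np.linalg.svd(A)
    R = U @ Vt
    if np.linalg.det(R) < 0:             # guard against a reflection
        R = U @ np.diag([1.0, -1.0]) @ Vt
    scale = S.mean()
    return np.hstack([scale * R, t.reshape(2, 1)]).astype(np.float32)

# hypothetical usage:
# sim = ecc_partial_affine(left_gray, right_gray)
# aligned = cv2.warpAffine(right, sim, (left.shape[1], left.shape[0]))
```

This is not mathematically identical to constraining ECC itself to a similarity model, but in practice it removes the shear while keeping ECC's robustness.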
Take a look at this Fourier-Mellin transform based approach: https://github.com/Smorodov/LogPolarFFTTemplateMatcher
It will give you offset, scale and rotation parameters, nothing more.
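For reference, a rough Python/OpenCV sketch of the log-polar idea behind that repository (rotation and scale become translations in log-polar space; sign conventions, windowing, and the final translation step are glossed over here):

```python
# Fourier-Mellin sketch: phase-correlate log-polar FFT magnitudes to
# recover rotation and scale between two same-sized grayscale images.
import cv2
import numpy as np

def rotation_and_scale(img_a, img_b):
    h, w = img_a.shape
    # Magnitude spectra are translation-invariant, so work on those.
    mag_a = np.abs(np.fft.fftshift(np.fft.fft2(img_a))).astype(np.float32)
    mag_b = np.abs(np.fft.fftshift(np.fft.fft2(img_b))).astype(np.float32)
    center = (w / 2.0, h / 2.0)
    max_radius = min(center)
    flags = cv2.INTER_LINEAR | cv2.WARP_POLAR_LOG
    lp_a = cv2.warpPolar(mag_a, (w, h), center, max_radius, flags)
    lp_b = cv2.warpPolar(mag_b, (w, h), center, max_radius, flags)
    (shift_x, shift_y), _ = cv2.phaseCorrelate(lp_a, lp_b)
    rotation_deg = 360.0 * shift_y / h           # angular axis is vertical
    scale = np.exp(shift_x * np.log(max_radius) / w)
    return rotation_deg, scale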

Placing Virtual Object Behind the Real World Object

In ARKit for iOS, if you display a virtual item, it always appears in front of any real item. This means that even if I stand in front of the virtual item, I still see the virtual item. How can I fix this scenario?
The bottle should be visible, but it is being cut off.
You cannot achieve this with ARKit alone. It offers no off-the-shelf solution for occlusion, which is a hard problem.
Ideally you'd know the depth of each pixel projected onto the camera, and you would use that to determine which parts are in front and which are behind. I would not try anything with the feature points ARKit exposes, since 1) their positions are inaccurate and 2) there's no way to know which feature point in frame A corresponds to which feature point in frame B. The data is far too noisy to do anything good with.
You might be able to achieve something with third-party options that process the captured image and estimate depth, or at least distinct depth levels, in the scene, but I don't know of any good solution. There are SLAM techniques that yield a dense depth map, like DTAM (https://www.kudan.eu/kudan-news/different-types-visual-slam-systems/), but that would be redoing most of what ARKit is doing. There might be other approaches that I'm not aware of. Apps like Snap do this in their own way, so it is possible!
So basically your question is about mapping the coordinates of the virtual item onto the real-world coordinate system; in short, you want the virtual item to be blocked by the real item, and only become visible once you pass the real item.
If so, you need to know the physical relations between the objects in this environment, and then you need to know exactly where you are in order to decide whether the virtual item is blocked.
It's not an intuitive way to fix this, however, it's the only way I can think of.
Cheers.
What you are trying to achieve is not easy.
You need to detect the parts of the real world that "should be visible" using some kind of image processing, or perhaps the ARKit feature points, which carry depth information. Based on this, you have to add an "invisible virtual object" that cuts off the drawing of things behind it. This object represents your real object inside the virtual world, so that the background (camera feed) remains visible wherever the invisible virtual object is present.

how to generate Tetris piece from a given grid

At first I thought my question must have been asked before, but I didn't find what I was looking for.
One element of the iOS app I'm developing is breaking an 8x8 grid into Tetris pieces (every piece is made of 4 blocks). Two particular questions I have are:
what is the best way to represent a Tetris piece in Objective-C?
what algorithm to use to partition the grid into random Tetris pieces (and, later on, how to check whether two pieces fit together)?
Edit on 01/28
@livingtech, I think I implemented pretty much what you describe, except for the point about "having a hole". My code produces no holes in the simple case where a Tetris block is only two blocks (yes, two squares, connected either horizontally or vertically), but with 3-square Tetris blocks I get holes. I just tested it, and out of 1000 runs I would get one without a hole. So I definitely need some mechanism to check whether the next square would become a singleton.
I've been trying to do the same thing for my game, though I am a total beginner, and I'm using XNA and C#.
But the way I'm trying to go about it is a 4x6 grid array:
--y123456
X1-000000
X2-000000
X3-000000
X4-000000
Here,
0 signifies no block
1 defines a block
Algorithm
Start by taking the very first 0 in the array (the top-left corner) and randomly pick a 0 or a 1.
Randomly choose the coordinates based on x1/x2 and y1/y2, and decide 1 or 0.
If it is a 1, then decide the next coordinates based on where that 1 was put.
If there was a 1 on x2 y1, then decide whether a 1 should go on the next touching coordinate.
You just have to code in which coordinates touch and which don't, and the logic will roll through.
I have mine set up a bit differently, but this is the basic foundation of my random Tetris engine.
I also found that it really helps to have a whiteboard, draw the grid, and label it with your coordinates.
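A hedged Python sketch of that "grow into touching coordinates" idea (the original is XNA/C#; all names here are illustrative only):

```python
# Grow one random piece of `size` connected cells on a grid of 0s/1s by
# repeatedly stepping to a random free neighbouring cell.
import random

def grow_piece(grid, size=4):
    rows, cols = len(grid), len(grid[0])
    free = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c] == 0]
    if not free:
        return None
    piece = [random.choice(free)]
    while len(piece) < size:
        # Free cells that touch the piece built so far.
        frontier = set()
        for r, c in piece:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == 0 and (nr, nc) not in piece):
                    frontier.add((nr, nc))
        if not frontier:           # boxed in: the caller should retry
            return None
        piece.append(random.choice(list(frontier)))
    for r, c in piece:
        grid[r][c] = 1             # mark the new piece on the grid
    return piece
```

Note this greedy growth is exactly the approach that can leave the small holes the asker mentions in the edit above; avoiding them needs either a hole check or backtracking.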
Since your board is 8x8, I think you can use an int64 to represent the board. Each bit of the int64 represents whether a specific grid cell is filled or not.
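For example, in Python rather than Objective-C (the bit layout is just one possible convention):

```python
# 8x8 board packed into one 64-bit integer: bit (row*8 + col) == cell state.
def set_cell(board, row, col):
    return board | (1 << (row * 8 + col))

def is_filled(board, row, col):
    return (board >> (row * 8 + col)) & 1

board = 0
board = set_cell(board, 0, 3)   # fill row 0, column 3
```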
Implementing Tetris is a hobby of mine. First implemented it in Windows/C. Then in Perl/Tk! Last implementation I did in Obj-C/Cocoa (Mac). In all cases, the game logic is the same. Only the UI stuff changes. I treat every little box separately and have a two-dimensional array which contains the presence (and color) of every "set" box on the board. Standard board size I use is 10 boxes wide by 20 boxes high.
Separately I keep track of the "dropping" piece: its location and what kind of piece it is. Based on a timer, I try to make the piece drop. If any of the boxes the "dropping" piece would drop into is already set, then I stop dropping the piece and add the piece's boxes to the "set" part of the board. Then I create a new piece and start over.
It may not be the best way to implement it, but it makes sense in my head. From a pure OO perspective, each shape of a dropping piece could be a subclass of a generic shape class. Override functions that check whether the shape can drop, the offsets of the individual boxes in the shape, etc.
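A small Python sketch of that layout (the original implementations are C, Perl/Tk, and Objective-C; the value conventions here are my own assumptions):

```python
# board[y][x] holds None (empty) or a colour for every "set" box.
WIDTH, HEIGHT = 10, 20
board = [[None] * WIDTH for _ in range(HEIGHT)]

def can_drop(piece_cells):
    """piece_cells: absolute (x, y) boxes of the separately tracked piece."""
    for x, y in piece_cells:
        below = y + 1
        if below >= HEIGHT or board[below][x] is not None:
            return False      # would leave the board or land on a set box
    return True

def lock_piece(piece_cells, colour):
    for x, y in piece_cells:
        board[y][x] = colour  # the piece becomes part of the "set" boxes
```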
I don't think anybody has taken a stab at your question #2 yet here, so I'm going to outline what I would do.
Setup:
You'll need to represent your grid as an array of some kind. At the very least, you'll want some kind of boolean values, to denote whether each coordinate in the grid is "occupied".
You'll need to keep track of the pieces on your grid. This could be another array, this time holding references to the four coordinates for each piece.
You'll need a variable or variables to keep track of a coordinate in your grid where you'll start filling in pieces (I would probably start with a corner).
Set up a "pool" of all possible Tetris pieces and rotations. (You'll want to keep track of which ones you've already checked on every iteration outlined below.)
Iterate:
Get a random piece from your pool that will fit into your starting coordinate. (If you want to get fancy, you could be smart about which ones you choose, or you could just go totally random. As pieces fail to fit, mark them as checked so you don't keep re-checking them forever. If you get to a point where you've checked all the pieces, you have a partial solution that doesn't work; either back up an iteration or start over.)
Make sure the Tetris piece you selected didn't leave a "hole", i.e. an empty region of fewer than 4 squares. (I don't know your requirements for solving this problem, so I can't say whether you should focus on speed or ease of coding, but you may be able to skip this step if you want and "brute force" the solution.)
"Place" the piece, by writing it to your piece array and marking the coordinates filled.
Check for "finished" condition, in which all your spaces are filled.
Pick a new coordinate in your grid and repeat #1. (I would pick an empty one next to the previous coordinate.)
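A hedged Python sketch of this outline (it handles the hole problem by backtracking rather than by an explicit hole check; the piece definitions and structure are my own):

```python
# Backtracking sketch: tile an 8x8 grid with random tetrominoes,
# always filling the first empty cell so no holes are left behind.
import random

BASE_SHAPES = {                      # offsets (row, col) of one orientation
    "I": [(0, 0), (0, 1), (0, 2), (0, 3)],
    "O": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "T": [(0, 0), (0, 1), (0, 2), (1, 1)],
    "S": [(0, 1), (0, 2), (1, 0), (1, 1)],
    "Z": [(0, 0), (0, 1), (1, 1), (1, 2)],
    "L": [(0, 0), (1, 0), (2, 0), (2, 1)],
    "J": [(0, 1), (1, 1), (2, 1), (2, 0)],
}

def orientations(cells):
    """All distinct 90-degree rotations, normalised to start at (0, 0)."""
    result, current = [], cells
    for _ in range(4):
        current = [(c, -r) for r, c in current]          # rotate 90 degrees
        min_r = min(r for r, _ in current)
        min_c = min(c for _, c in current)
        norm = tuple(sorted((r - min_r, c - min_c) for r, c in current))
        if norm not in result:
            result.append(norm)
    return result

ALL_ORIENTATIONS = [o for s in BASE_SHAPES.values() for o in orientations(s)]

def tile(grid, pieces, size=8):
    empties = [(r, c) for r in range(size) for c in range(size)
               if grid[r][c] is None]
    if not empties:
        return True                                      # finished
    target_r, target_c = empties[0]                      # first empty cell
    candidates = list(ALL_ORIENTATIONS)
    random.shuffle(candidates)                           # random pieces
    for shape in candidates:
        for anchor_r, anchor_c in shape:                 # cell covering target
            cells = [(target_r + r - anchor_r, target_c + c - anchor_c)
                     for r, c in shape]
            if all(0 <= r < size and 0 <= c < size and grid[r][c] is None
                   for r, c in cells):
                piece_id = len(pieces)
                for r, c in cells:
                    grid[r][c] = piece_id                # place the piece
                pieces.append(cells)
                if tile(grid, pieces, size):
                    return True
                pieces.pop()                             # backtrack
                for r, c in cells:
                    grid[r][c] = None
    return False

grid = [[None] * 8 for _ in range(8)]
pieces = []
tile(grid, pieces)       # pieces now holds 16 random tetromino placements
```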
If this is still relevant: I wrote a test Tetris app in Objective-C a few months ago, https://github.com/SonnyBlack/Test-Demo-Tetris . I think my algorithm is not very good, but it works. =)

OpenCV tracking people from overhead view

I have a broad but interesting OpenCV question and I'm wondering where to start.
I am looking for any strategies or white papers that might help.
I need to get the position of people sitting at a conference table from a fixed overhead view. Ideally, I will assign a persistent ID to each person, and maintain a list of people with ID and coordinates. This problem could be easy in a specific case - for example, if designed for a single conference room table - but it gets harder in the general case, especially with people entering and leaving the scene.
My first question: is it a detection or a motion tracking problem? Or some combination of the two?
Well, it seems like both to me. I would think you would need to take a long-running average of the visible area, which becomes the background. Then, based on that background model, you can track the movement of other objects.
Assigning an ID may become difficult if objects merge together (at least as far as the camera is concerned) and then separate again, say, someone removing a hat, placing it down, and putting it back on.
But with all that in mind, it is possible, even if it presents a challenge. I once saw a similar project tracking people in a train station using a similar approach (it was in a lecture, so I can't provide a link, sorry).
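A hedged Python/OpenCV sketch of that background-averaging idea (the thresholds and blob-area cut-off are made-up values to tune; persistent IDs would then come from matching each frame's centroids to the previous frame's):

```python
# Running-average background model + blob centroids from an overhead feed.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                  # or a video file path
ok, frame = cap.read()
background = np.float32(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Slowly learn the background ("long average of the visible area").
    cv2.accumulateWeighted(gray, background, 0.01)
    diff = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    people = []
    for c in contours:
        if cv2.contourArea(c) < 1500:      # ignore small blobs; scene-dependent
            continue
        x, y, w, h = cv2.boundingRect(c)
        people.append((x + w // 2, y + h // 2))
    # people -> match to the previous frame's centroids to keep stable IDs.
```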

simple shape recognition

I want to achieve something that looks like the wizard's ability in the game Trine.
I want to create a game where the player uses the mouse to create certain objects, so I will need to compare the shape the player drew to a predefined shape of my own and check whether it's close.
I have no idea how to achieve this or where to look. I assume it has something to do with shape recognition, as in image processing and computer vision, but it should be much simpler and work in real time.
Does anyone have a clue how this can be done, or where I can look for something like that?
Is this what you're going for? http://www.youtube.com/watch?v=7Zh79q_xvZw
I would start by researching gesture recognition. I think that's the phrase you need to get good info. http://en.wikipedia.org/wiki/Gesture_recognition
Also, sketch recognition: http://en.wikipedia.org/wiki/Sketch_recognition
Have a look at this question. What you are looking for in particular is on-line handwriting recognition, meaning that you follow every move of the user from beginning to end.
Now, you might want to simplify it a whole lot, so one way is to define 9 areas, like a 3x3 grid. Then convert the user's movement into a list of how the user moved through these cells (use thresholds to make sure the pointer stayed in a cell for a while). You will end up with an array like this: 1-1, 1-2, 2-2, 2-3 (meaning the user went from the upper-left corner to the upper-middle, and so on).
This information is now fairly easy to match against a set of gestures. If it performs poorly, you can either go further and introduce a Hidden Markov Model, which will tolerate some mistakes in the gesture (while still matching the most likely one in your gesture set), or you could simply display the grid to the user, so that the user learns the gestures like number codes.
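A small Python sketch of that 3x3-grid idea (the dwell threshold and the example gesture templates are made-up values):

```python
# Quantise a mouse path into a sequence of 3x3 grid cells, then match it
# against stored gesture templates.
def path_to_cells(points, width, height, min_samples=5):
    cells, run, last = [], 0, None
    for x, y in points:
        cell = (min(2, int(3 * y / height)) + 1,     # "row-col", 1-based
                min(2, int(3 * x / width)) + 1)
        if cell == last:
            run += 1
        else:
            last, run = cell, 1
        # Only accept a cell once the pointer has stayed in it for a while.
        if run == min_samples and (not cells or cells[-1] != cell):
            cells.append(cell)
    return cells

GESTURES = {"top-edge swipe": [(1, 1), (1, 2), (1, 3)],
            "L shape":        [(1, 1), (2, 1), (3, 1), (3, 2)]}

def recognise(points, width, height):
    seq = path_to_cells(points, width, height)
    # Exact matching here; a Hidden Markov Model or an edit distance would
    # tolerate noisier gestures, as the answer suggests.
    for name, template in GESTURES.items():
        if seq == template:
            return name
    return None
```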
