3D Character/Model Creator - xna

I'm in a project to create a 3d game using XNA/C#, and the game will use a lot of 3d characters.
Looking at the current 3d games, in some they create near to hundreds of characters, what lead me to think that there are some good 3d character/model creator.
To narrow the sample, the game will have characters like the game "Grand Chase". There are some good (and easy) character model creator for to use in XNA development? Free is better, of course, but I will get payed versions too.
EDIT: Another question is about the movements of the characters. The movements like walk, jump, sit, etc are "created" by the "character creator tool" or by the game?

Another question is about the
movements of the characters. The
movements like walk, jump, sit, etc
are "created" by the "character
creator tool" or by the game?
Animation in various forms, key frame, skeletal and so forth are created in the 3D modelling software.
The game then plays these animations are certain points. For example, pressing jump will play the jump animation. Games often use a form of linear interpolation to blend different animations together to smooth them.
Consider a football game, you can animate the footballer running in eight different directions, but what if the player suddenly changes direction midflow? The modeller could not account for this, therefore the engine will "blur" the difference between the animations together to provide a smooth transition via linear interpolation or some other blending factor.
Software
As for software, free editors such as Blender will do. However I prefer Maya/Max. Often you can gain student editions of these, check their official websites. I got a free six month version via my university. While you legally cannot use the models in commerical games, for learning purposes it is fine. I believe they used to offer a Personal Learning Edition but this no longer exists as far as my searching has found.

Most 3D game objects are created in 3D software, such as Maya and Blender. But there are indeed applications that speed up the character modeling, such as Poser. If you quickly need a low poly mesh without big bucks and a lot of exporters, try MilkShape 3D. Its cheap and it's easy to work with. You can easily build meshes with joint animations, which you can edit later to fine tune your characters.
EDIT: Another question is about the
movements of the characters. The
movements like walk, jump, sit, etc
are "created" by the "character
creator tool" or by the game?
Poser 3D. It's not free, but it comes with a good library for starters. Also you might going to like DAZ 3D, also a commercial product. Personally I am not excited about most 3D modeling software that comes for free, exceptions are Blender and Anim8tor. If you are not that well tuned into modeling professionally, I would still recommend you to go for MilkShape 3D. It has an really easy learning curve and you can pop in and work out quickly just to test and work out your game (there is more inside a game than models). Eventually, you could fine tune all models in software you prefer later.

The Xsi Mod Tool will allow you to do character modelling and animation and is a (slightly) cut down version of the full Xsi tool.
It's free for non-commercial use and has close integration with XNA plus it has plugins that support the Unreal Engine and CryEngine etc
Available here

If you want, you could try using XBL Avatars; the bonus is that the players will actually get to use their avatar ingame, and AFAIK, you can procedurally generate characters and stuff through a code API.

I strongly recomment Blender. It's free, it has tons of robust features, and it's widely used by the XNA community, myself included.
It can be a bit time-consuming to learn how to use it, but once you master the basics, Blender feels like a pencil on paper. (Or, for those of us who suck at drawing, a really good artist that can read your mind :P )
There's also a script called MakeHuman that allows you to parametrically create human models, and I think it works pretty well, myself.

Related

Learning WebGL and three.js [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
Improve this question
I'm new and starting to learn about 3D computer graphics in web browsers. I'm interested in making 3D games in a browser. For anyone who has learned both WebGL and three.js...
Is knowledge of WebGL required to use three.js?
What are the advantages of using three.js vs. WebGL?
Since you have big ambitions, you have to invest the time to learn the fundamentals. It is not a matter of what you learn first -- you can learn them simultaneously if you want to. (That's what I did.)
This means that you need to understand:
WebGL concepts
Three.js
The underlying mathematical concepts
Three.js. Three.js does an excellent job of abstracting away many of the details of WebGL, so personally, I'd suggest using Three.js for your project. But remember, Three.js is in alpha, and it is changing frequently, so you have to be prepared for that. Most people learn Three.js by studying the examples. Avoid outdated books and tutorials, and avoid examples from the net that link to old versions of the library.
WebGL. If you use Three.js, you don't need to know how to program in WebGL, you just need to understand the WebGL concepts. That means, that you just need to be able to read someone else's WebGL code and understand what you read. That is a lot easier than being expected to write a WebGL program yourself from scratch. You can learn the WebGL concepts sufficiently well using any of the tutorials on the net, such as the beginner's tutorial at WebGLFundamentals.org and Learning WebGL.
Math. Again, you at least need to understand the concepts. Three good books are:
3D Math Primer for Graphics and Game Development by Fletcher Dunn and Ian Parberry
Essential Mathematics for Games and Interactive Applications: A Programmer’s Guide by James M. Van Verth and Lars M. Bishop
Mathematics for 3D Game Programming and Computer Graphics by Eric Lengyel
There is a very good online course - Interactive 3D Graphics at https://www.udacity.com/course/cs291 on THREE.js. This course includes assignments also to get hands-on experience.
It covers all the basic concepts of Three.js and Computer Graphics
My personal thoughts are the following:
If you have plenty time, you could learn both, but note that WebGL is much lower level than Three.js.
For a first 3d project, experts suggest using a library like Three.js in order to get used to the terms and the general 3d model.
Whichever direction you choose to go, I suggest you learn/polish up on your linear algebra skills. Then go ahead and learn or polish up your understanding about MVP dimensions (Model View Projection). Three.JS can abstract much of that away, but I think it's key that one understands those concepts well before getting serious about any 3D development.
I wrote an introductory article about MVP when I was first learning 3D programming with OpenGL. I realized that until I was able to explain what those transformation matrices are, and how they relate to the various dimensions/spaces, I really didn't know any 3D programming at all, though I could render objects to the screen.
Since your goal is to create games, I think you'll benefit much from learning some raw WebGL first, even if you end up using a framework like Three.js to help you write your code later.
"WebGL is a 2D API and not a 3D API"
http://webglfundamentals.org/webgl/lessons/webgl-2d-vs-3d-library.html
This article describes the fundamental differences between WebGL & 3d libraries like three.js.
Which made my choice between WebGL or Three.js a no-brainer.
I came from a Unity3D background as well as Papervision3D back in the day, so I had a good understanding of how to deal with 3D space. Three.js is the way to go for your initial jump into learning how to deal with WebGL projects. The api is very good, it's very powerful and if you're coming from another 3D technology, you'll be up and running with very little time.
I spent a lot of time with Threejs.org's examples - there's a ton of them and they're very good at getting you off and running in the right direction. The docs are decent enough, especially if you're comparing them to other webGL 3D api's out there.
You might also consider getting the free version of Unity3D and the free collada (was free when I got it) exporter from their app store (Window>App store). I found it easy enough to setup my scene in Unity and export it to Collada for use with Three.js.
Also, I posted this class that I use with Three.js called neo ( http://rockonflash.com/webGL/three/neo.js ). Just add it to your project, then call Neo.JackIntoThree() and it will add the methods/properties to Object3D for use in your project. Things like DrawAllAxis() are invaluable when debugging your scene etc.
Hands down though, Three.js is a great way to go - it's flexible enough to let you write your own shaders/objects etc, and powerful enough right out of the box to help you accomplish your goals.
I picked up three.js, but also jumped into GLSL and experimented a lot with three.js shaderMaterial. One way of going about it - three.js still abstracts much of the stuff for you, but it also gives you a very clean, low level access to all the rendering (projection, animation) capabilities.
This way, you can follow even something like this awesome open-gl tutorial. You don't have to set up the matrices, typed arrays, because three already sets it up for you, updating them when needed. The shader though, you can write from scratch - a simple color rendering would be two lines of GLSL. There is also a post processing plug-in for three.js that sets up all the buffers, full screen quads and stuff you need to do the effects, but the shader can be very simple to begin with.
Since programmable shaders are the essence of modern 3d graphics, i hope my answer is not missing the point :) Sooner or later, anyone who does this needs to at least understand what goes on under the hood, it's the nature of the beast. Also, understanding the 4th dimension in homogeneous space is probably important as well.
This book is good for WebGL.
I just learnt a little of both and I feel that understand the basics of webgl, I think an introduction on webgl is sufficient and then jump into three js. It will be pretty easy once you understand the underlying concepts of WebGL.
Useful links:
Best Intro I have read:
http://dev.opera.com/articles/view/an-introduction-to-webgl/
Comprehensive tutorials:
http://www.johannes-raida.de/tutorials.htm

Learn DirectX or Game engine?

I started learning DirectX 11 with C++. It's hard, but I think I'm doing good.
I discovered UDK (Unreal game engine), and read that many good games like Mass Effect 1-3 was made in that engine. I consider why I should learn DirectX, when many games are already made in game engines, and it's a lot easier.
What are the pros and cons by learning DirectX?
DirectX gives the capability to push pixels on screen. Anything beyond that - physics, map models, dev tools, file formats, music, AI, networking code - is still your problem. On the other hand, a game engine provides a comprehensive solution for most of the things you will need, but at a cost (technical constraints, learning curve, and often non-trivial amounts of $$).
It really depends on your goals and needs.
If you goal is to learn Graphics programming then you should choose DirectX 11 because it gives you access to low level graphics programming.
On the other hand if you want to jump right to gameplay programming UDK will allow you to step over low level graphics programming and get your hands on gameplay.
If you want to learn physics, audio, networking programming, you should take a look at Ogre3D (it's a graphics engine that as the name suggests handles the graphics, all you have to do is programming physics, gameplay).

How to train an artificial neural network to play Diablo 2 using visual input?

I'm currently trying to get an ANN to play a video game and and I was hoping to get some help from the wonderful community here.
I've settled on Diablo 2. Game play is thus in real-time and from an isometric viewpoint, with the player controlling a single avatar whom the camera is centered on.
To make things concrete, the task is to get your character x experience points without having its health drop to 0, where experience point are gained through killing monsters. Here is an example of the gameplay:
Now, since I want the net to operate based solely on the information it gets from the pixels on the screen, it must learn a very rich representation in order to play efficiently, since this would presumably require it to know (implicitly at least) how divide the game world up into objects and how to interact with them.
And all of this information must be taught to the net somehow. I can't for the life of me think of how to train this thing. My only idea is have a separate program visually extract something innately good/bad in the game (e.g. health, gold, experience) from the screen, and then use that stat in a reinforcement learning procedure. I think that will be part of the answer, but I don't think it'll be enough; there are just too many levels of abstraction from raw visual input to goal-oriented behavior for such limited feedback to train a net within my lifetime.
So, my question: what other ways can you think of to train a net to do at least some part of this task? preferably without making thousands of labeled examples.
Just for a little more direction: I'm looking for some other sources of reinforcement learning and/or any unsupervised methods for extracting useful information in this setting. Or a supervised algorithm if you can think of a way of getting labeled data out of a game world without having to manually label it.
UPDATE(04/27/12):
Strangely, I'm still working on this and seem to be making progress. The biggest secret to getting a ANN controller to work is to use the most advanced ANN architectures appropriate to the task. Hence I've been using a deep belief net composed of factored conditional restricted Boltzmann machines that I've trained in an unsupervised manner (on video of me playing the game) before fine tuning with temporal difference back-propagation (i.e. reinforcement learning with standard feed-forward ANNs).
Still looking for more valuable input though, especially on the problem of action selection in real-time and how to encode color images for ANN processing :-)
UPDATE(10/21/15):
Just remembered I asked this question back-in-the-day, and thought I should mention that this is no longer a crazy idea. Since my last update, DeepMind published their nature paper on getting neural networks to play Atari games from visual inputs. Indeed, the only thing preventing me from using their architecture to play, a limited subset, of Diablo 2 is the lack of access to the underlying game engine. Rendering to the screen and then redirecting it to the network is just far too slow to train in a reasonable amount of time. Thus we probably won't see this sort of bot playing Diablo 2 anytime soon, but only because it'll be playing something either open-source or with API access to the rendering target. (Quake perhaps?)
I can see that you are worried about how to train the ANN, but this project hides a complexity that you might not be aware of. Object/character recognition on computer games through image processing it's a highly challenging task (not say crazy for FPS and RPG games). I don't doubt of your skills and I'm also not saying it can't be done, but you can easily spend 10x more time working on recognizing stuff than implementing the ANN itself (assuming you already have experience with digital image processing techniques).
I think your idea is very interesting and also very ambitious. At this point you might want to reconsider it. I sense that this project is something you are planning for the university, so if the focus of the work is really ANN you should probably pick another game, something more simple.
I remember that someone else came looking for tips on a different but somehow similar project not too long ago. It's worth checking it out.
On the other hand, there might be better/easier approaches for identifying objects in-game if you're accepting suggestions. But first, let's call this project for what you want it to be: a smart-bot.
One method for implementing bots accesses the memory of the game client to find relevant information, such as the location of the character on the screen and it's health. Reading computer memory is trivial, but figuring out exactly where in memory to look for is not. Memory scanners like Cheat Engine can be very helpful for this.
Another method, which works under the game, involves manipulating rendering information. All objects of the game must be rendered to the screen. This means that the locations of all 3D objects will eventually be sent to the video card for processing. Be ready for some serious debugging.
In this answer I briefly described 2 methods to accomplish what you want through image processing. If you are interested in them you can find more about them on Exploiting Online Games (chapter 6), an excellent book on the subject.
UPDATE 2018-07-26: That's it! We are now approaching the point where this kind of game will be solvable! Using OpenAI and based on the game DotA 2, a team could make an AI that can beat semi-professional gamers in a 5v5 game. If you know DotA 2, you know this game is quite similar to Diablo-like games in terms of mechanics, but one could argue that it is even more complicated because of the team play.
As expected, this was achieved thanks to the latest advances in reinforcement learning with deep learning, and using open game frameworks like OpenAI which eases the development of an AI since you get a neat API and also because you can accelerate the game (the AI played the equivalent of 180 years of gameplay against itself everyday!).
On the 5th of August 2018 (in 10 days!), it is planned to pit this AI against top DotA 2 gamers. If this works out, expect a big revolution, maybe not as mediatized as the solving of the Go game, but it will nonetheless be a huge milestone for games AI!
UPDATE 2017-01: The field is moving very fast since AlphaGo's success, and there are new frameworks to facilitate the development of machine learning algorithms on games almost every months. Here is a list of the latest ones I've found:
OpenAI's Universe: a platform to play virtually any game using machine learning. The API is in Python, and it runs the games behind a VNC remote desktop environment, so it can capture the images of any game! You can probably use Universe to play Diablo II through a machine learning algorithm!
OpenAI's Gym: Similar to Universe but targeting reinforcement learning algorithms specifically (so it's kind of a generalization of the framework used by AlphaGo but to a lot more games). There is a course on Udemy covering the application of machine learning to games like breakout or Doom using OpenAI Gym.
TorchCraft: a bridge between Torch (machine learning framework) and StarCraft: Brood War.
pyGTA5: a project to build self-driving cars in GTA5 using only screen captures (with lots of videos online).
Very exciting times!
IMPORTANT UPDATE (2016-06): As noted by OP, this problem of training artificial networks to play games using only visual inputs is now being tackled by several serious institutions, with quite promising results, such as DeepMind Deep-Qlearning-Network (DQN).
And now, if you want to get to take on the next level challenge, you can use one of the various AI vision game development platforms such as ViZDoom, a highly optimized platform (7000 fps) to train networks to play Doom using only visual inputs:
ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning, in particular.
ViZDoom is based on ZDoom to provide the game mechanics.
And the results are quite amazing, see the videos on their webpage and the nice tutorial (in Python) here!
There is also a similar project for Quake 3 Arena, called Quagents, which also provides easy API access to underlying game data, but you can scrap it and just use screenshots and the API only to control your agent.
Why is such a platform useful if we only use screenshots? Even if you don't access underlying game data, such a platform provide:
high performance implementation of games (you can generate more data/plays/learning generations with less time so that your learning algorithms can converge faster!).
a simple and responsive API to control your agents (ie, if you try to use human inputs to control a game, some of your commands may be lost, so you'd also deal with unreliability of your outputs...).
easy setup of custom scenarios.
customizable rendering (can be useful to "simplify" the images you get to ease processing)
synchronized ("turn-by-turn") play (so you don't need your algorithm to work in realtime at first, that's a huge complexity reduction).
additional convenience features such as crossplatform compatibility, retrocompatibility (you don't risk your bot not working with the game anymore when there is a new game update), etc.
To summarize, the great thing about these platforms is that they alleviate much of the previous technical issues you had to deal with (how to manipulate game inputs, how to setup scenarios, etc.) so that you just have to deal with the learning algorithm itself.
So now, get to work and make us the best AI visual bot ever ;)
Old post describing the technical issues of developping an AI relying only on visual inputs:
Contrary to some of my colleagues above, I do not think this problem is intractable. But it surely is a hella hard one!
The first problem as pointed out above is that of the representation of the state of the game: you can't represent the full state with just a single image, you need to maintain some kind of memorization (health but also objects equipped and items available to use, quests and goals, etc.). To fetch such informations you have two ways: either by directly accessing the game data, which is the most reliable and easy; or either you can create an abstract representation of these informations by implementing some simple procedures (open inventory, take a screenshot, extract the data). Of course, extracting data from a screenshot will either have you to put in some supervised procedure (that you define completely) or unsupervised (via a machine learning algorithm, but then it'll scale up a lot the complexity...). For unsupervised machine learning, you will need to use a quite recent kind of algorithms called structural learning algorithms (which learn the structure of data rather than how to classify them or predict a value). One such algorithm is the Recursive Neural Network (not to confuse with Recurrent Neural Network) by Richard Socher: http://techtalks.tv/talks/54422/
Then, another problem is that even when you have fetched all the data you need, the game is only partially observable. Thus you need to inject an abstract model of the world and feed it with processed information from the game, for example the location of your avatar, but also the location of quest items, goals and enemies outside the screen. You may maybe look into Mixture Particle Filters by Vermaak 2003 for this.
Also, you need to have an autonomous agent, with goals dynamically generated. A well-known architecture you can try is BDI agent, but you will probably have to tweak it for this architecture to work in your practical case. As an alternative, there is also the Recursive Petri Net, which you can probably combine with all kinds of variations of the petri nets to achieve what you want since it is a very well studied and flexible framework, with great formalization and proofs procedures.
And at last, even if you do all the above, you will need to find a way to emulate the game in accelerated speed (using a video may be nice, but the problem is that your algorithm will only spectate without control, and being able to try for itself is very important for learning). Indeed, it is well-known that current state-of-the-art algorithm takes a lot more time to learn the same thing a human can learn (even more so with reinforcement learning), thus if can't speed up the process (ie, if you can't speed up the game time), your algorithm won't even converge in a single lifetime...
To conclude, what you want to achieve here is at the limit (and maybe a bit beyond) of current state-of-the-art algorithms. I think it may be possible, but even if it is, you are going to spend a hella lot of time, because this is not a theoretical problem but a practical problem you are approaching here, and thus you need to implement and combine a lot of different AI approaches in order to solve it.
Several decades of research with a whole team working on it would may not suffice, so if you are alone and working on it in part-time (as you probably have a job for a living) you may spend a whole lifetime without reaching anywhere near a working solution.
So my most important advice here would be that you lower down your expectations, and try to reduce the complexity of your problem by using all the information you can, and avoid as much as possible relying on screenshots (ie, try to hook directly into the game, look for DLL injection), and simplify some problems by implementing supervised procedures, do not let your algorithm learn everything (ie, drop image processing for now as much as possible and rely on internal game informations, later on if your algorithm works well, you can replace some parts of your AI program with image processing, thus gruadually attaining your full goal, for example if you can get something to work quite well, you can try to complexify your problem and replace supervised procedures and memory game data by unsupervised machine learning algorithms on screenshots).
Good luck, and if it works, make sure to publish an article, you can surely get renowned for solving such a hard practical problem!
The problem you are pursuing is intractable in the way you have defined it. It is usually a mistake to think that a neural network would "magically" learn a rich reprsentation of a problem. A good fact to keep in mind when deciding whether ANN is the right tool for a task is that it is an interpolation method. Think, whether you can frame your problem as finding an approximation of a function, where you have many points from this function and lots of time for designing the network and training it.
The problem you propose does not pass this test. Game control is not a function of the image on the screen. There is a lot of information the player has to keep in memory. For a simple example, it is often true that every time you enter a shop in a game, the screen looks the same. However, what you buy depends on the circumstances. No matter how complicated the network, if the screen pixels are its input, it would always perform the same action upon entering the store.
Besides, there is the problem of scale. The task you propose is simply too complicated to learn in any reasonable amount of time. You should see aigamedev.com for how game AI works. Artitificial Neural Networks have been used successfully in some games, but in very limited manner. Game AI is difficult and often expensive to develop. If there was a general approach of constructing functional neural networks, the industry would have most likely seized on it. I recommend that you begin with much, much simpler examples, like tic-tac-toe.
Seems like the heart of this project is exploring what is possible with an ANN, so I would suggest picking a game where you don't have to deal with image processing (which from other's answers on here, seems like a really difficult task in a real-time game). You could use the Starcraft API to build your bot, they give you access to all relevant game state.
http://code.google.com/p/bwapi/
As a first step you might look at the difference of consecutive frames. You have to distinguish between background and actual monster sprites. I guess the world may also contain animations. In order to find those I would have the character move around and collect everything that moves with the world into a big background image/animation.
You could detect and and identify enemies with correlation (using FFT). However if the animations repeat pixel-exact it will be faster to just look at a few pixel values. Your main task will be to write a robust system that will identify when a new object appears on the screen and will gradually all the frames of the sprite frame to a database. Probably you have to build models for weapon effects as well. Those can should be subtracted so that they don't clutter your opponent database.
Well assuming at any time you could generate a set of 'outcomes' (might involve probabilities) from a set of all possible 'moves', and that there is some notion of consistency in the game (eg you can play level X over and over again), you could start with N neural networks with random weights, and have each of them play the game in the following way:
1) For every possible 'move', generate a list of possible 'outcomes' (with associated probabilities)
2) For each outcome, use your neural network to determine an associated 'worth' (score) of the 'outcome' (eg a number between -1 and 1, 1 being the best possible outcome, -1 being the worst)
3) Choose the 'move' leading to the highest prob * score
4) If the move led to a 'win' or 'lose', stop, otherwise go back to step 1.
After a certain amount of time (or a 'win'/'lose'), evaluate how close the neural network was to the 'goal' (this will probably involve some domain knowledge). Then throw out the 50% (or some other percentage) of NNs that were farthest away from the goal, do crossover/mutation of the top 50%, and run the new set of NNs again. Continue running until a satisfactory NN comes out.
I think your best bet would be a complex architecture involving a few/may networks: i.e. one recognizing and responding to items, one for the shop, one for combat (maybe here you would need one for enemy recognition, one for attacks), etc.
Then try to think of the simplest possible Diablo II gameplay, probably a Barbarian. Then keep it simple at first, like Act I, first area only.
Then I guess valuable 'goals' would be disappearance of enemy objects, and diminution of health bar (scored inversely).
Once you have these separate, 'simpler' tasks taken care of, you can use a 'master' ANN to decide which sub-ANN to activate.
As for training, I see only three options: you could use the evolutionary method described above, but then you need to manually select the 'winners', unless you code a whole separate program for that. You could have the networks 'watch' someone play. Here they will learn to emulate a player or group of player's style. The network tries to predict the player's next action, gets reinforced for a correct guess, etc. If you actually get the ANN you want this could be done with video gameplay, no need for actual live gameplay. Finally you could let the network play the game, having enemy deaths, level ups, regained health, etc. as positive reinforcement and player deaths, lost health, etc. as negative reinforcement. But seeing how even a simple network requires thousands of concrete training steps to learn even simple tasks, you would need a lot of patience for this one.
All in all your project is very ambitious. But I for one think it could 'in theory be done', given enough time.
Hope it helps and good luck!

GUI version of OpenCV for feature-detection (SIFT etc.) prototyping before actual project development?

I had an idea for which I need to be able to recognize certain objects or models from a rendered three dimensional digital movie.
After limited research, I know now that what I need is called feature detection in the field of Computer Vision.
So, what I want to do is:
create a few screenshots of a certain character in the movie (eg. front/back/leftSide/rightSide)
play the movie
while playing the movie, continuously create new screenshots of the movie
for each screenshot, perform feature detection (SIFT?, with openCV?) to see if any of our character appearances are there (they must still be recognized if the character is further away and thus appears smaller, or if the character is eg. lying down).
give a notice whenever the character is found
This would be possible with OpenCV, right?
The "issue" is that I would have to learn c++ or python to develop this application. This is not a problem if my movie and screenshots are applicable for what I want to do.
So, I would like to first test my screenshots of the movie. Is there a GUI version of OpenCV that I can input my test data and then execute it's feature detection algorithms manually as a means of prototyping?
Any feedback is appreciated. Thanks.
There is no GUI of OpenCV able to do what you want. You will be able to use OpenCV for some aspects of your problem, but there is no ready-made solution waiting there for you.
While it's definitely possible to solve your problem, the learning curve for this problem is quite long. If you're a professional, then an alternative to learning about it yourself would be to hire an expert to do it for you. It would cost money, but save you time.
EDIT
As far as template matching goes, you wouldn't normally use it to solve such a problem because the thing you're looking for is changing appearance and shape. There aren't really any "dynamic parameters to set". The closest thing you could try is have a massive template collection that would try to cover the expected forms that your target may take. But it would hardly be an elegant solution. Plus it wouldn't scale.
Next, to your point about face recognition. This is kind of related, but most facial recognition applications deal with a controlled environment: lighting, distance, pose, angle, etc. Outside of that controlled environment face detection effectiveness drops significantly. If you're detecting objects in a movie, then your environment isn't really controlled.
You may want to first try a simpler problem of accurately detecting where the characters are, without determining who they are (video surveillance, essentially). While it may sound simple, you'll find that it's actually non-trivial for arbitrary scenes. The result of solving that problem may be useful in identifying the characters.
There is Find-Object by Mathieu Labbé. It was very helpful for me to start getting an understanding of the descriptors since you can change them while your video is running to see what happens.
This is probably too late, but might help someone else looking for a solution.
Well, using OpenCV you would of taking a frame of a video file and do any computations on it.
You can do several different methods of detecting a character on that image, but it's not so easy to have it as flexible so you can even get that person if it's lying on the floor for example, if you only entered reference images of that character standing.
Basically you could try extracting all important features from your set of reference pictures and have a (in your case supervised) learning algorithm that gets a good feature-vector of that character for classification.
You then need to write your code that plays the video and which takes a video frame let's say each 500ms (or other as you desire), gets a segmentation of the object you thing would be that character and compare it with the reference values you get from your learning algorithm. If there's a match, your code can yell "Yehaaawww!" or do other things...
But all this depends on how flexible you want this to be. You could also try a template match or cross-correlation which basically shifts the reference image(s) over the frame and checks how equal both parts are. But this unfortunately is very sensitive for rotation, deformations or other noise... so you wouldn't get that person if its i.e. laying down. And I doubt you can get all those calculations done in realtime...
Basically: Yes OpenCV is good to use for your image processing/computer vision tasks. But it offers a lot of methods and ways and you'd need to find a way that works for your images... it's not a trivial task though...
Hope that helps...
Have you tried looking at some of the work of the Oxford visual geometry group?
Their Video Google system describes to a large extent what you want, instance detection.
Their work into Naming People in TV shows is also pretty relevant. A face detection and facial feature pipeline is included that can be run from Matlab. Are you familiar with Matlab?
Have you tried computer vision frameworks like Cassandra? There you can exactly do that just by some mouse clicks.

Microsoft Robotics Studio, simple simulation

I am soon to start with Microsoft Robotics Studio.
My question is to all the gurus of MSRS, Can simple simulation (as obstacle avoidance and wall following) be done without any hardware ?
Does MSRS have 3-dimensional as well as 2-dimensional rendering? As of now I do not have any hardware and I am only interested in simulation, when I have the robot hardware I may try to interface it!
Sorry for a silly question, I am a MSRS noob, but have previous robotics h/w and s/w experience.
Other than MSRS and Player Project (Player/Stage/Gazebo) is there any other Software to simulate robots, effectively ?
MSRS tackles several key areas. One of them is simulation. The 3D engine is based on the AGeia Physics engine and can simulate not only your robot and its sensors, but a somewhat complex environment.
The demo I saw had a Pioneer with a SICK lidar running around a cluttered appartment living room, with tables, chairs and etc.
The idea is that your code doesn't even need to know if it's running on the simulator or the real robot.
Edit:
A few links as requested:
Start here: http://msdn.microsoft.com/en-us/library/dd939184.aspx
alt text http://i.msdn.microsoft.com/Dd939184.image001(en-us,MSDN.10).jpg
Then go here: http://msdn.microsoft.com/en-us/library/dd939190.aspx
alt text http://i.msdn.microsoft.com/Dd939190.image008(en-us,MSDN.10).jpg
Then take a look at some more samples: http://msdn.microsoft.com/en-us/library/cc998497.aspx
alt text http://i.msdn.microsoft.com/Cc998496.Sumo1(en-us,MSDN.10).jpg
simple answer is yes, MRDS simulator and player/stage have very similar capabilities. MRDS uses a video game quality physics engine under the hood, so you can do collisions, and some basic physics on your robots, but its not going to be the level of accuracy of a matlab simulation (on the flip side its realtime and easier to develop with though). You can do a lot in MRDS without any hardware.
MRDS uses some pretty advanced programming abstractions, so can be a bit intimidating at first, but do the tutorials, and the course that has been posted to codeplex "software engineering for robotics" and you will be fine. http://swrobotics.codeplex.com/

Resources