Game rendering in monochrome for self-driving AI test? - image-processing

As you might already know, a monochrome (B&W) image is much easier to process than an RGB image.
I'm working on a self-driving AI right now. It is still in the early development stage.
The game I chose is The Crew (2014) because of its realistic driving, realistic graphics, open-world environment, and online mode. It's also a relatively old game, so it should be easier to run at a higher resolution (if needed).
I'm planning to log in with a different account so I can bully the AI as another car on the road, which lets me test its accident-prevention capabilities.
I was planning to render the game in B&W for the AI, but I don't really know how to do that.
I think I can convert the output to B&W after it has been rendered in full colour, but can I render it in B&W natively?
If I can, will rendering in monochrome improve FPS over RGB rendering?
I don't really know much about rendering (especially in games) but I think it should improve performance.
The game will be running on my old GPU (GTX 670) while the AI uses a GTX 1080 Ti, and I don't want to get another card just for running the game.
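
For the fallback already mentioned above (converting the output to B&W after it has been rendered in full colour), a minimal sketch might look like the following, assuming captured frames arrive as RGB NumPy arrays; the screen-capture step itself is out of scope here. Note this only changes what the model receives, not what the game's GPU has to render.

    # Hypothetical sketch: collapse captured RGB frames to one grayscale channel
    # before feeding them to the driving model. Assumes frames are H x W x 3
    # uint8 arrays from whatever screen-capture step is used; that step is not shown.
    import cv2
    import numpy as np

    def to_grayscale(frame_rgb: np.ndarray) -> np.ndarray:
        """Return a single-channel (H x W) uint8 luminance image."""
        return cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)

    if __name__ == "__main__":
        dummy = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)
        print(to_grayscale(dummy).shape)  # (720, 1280) -- one channel instead of three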

Related

What is NeRF (Neural Radiance Fields) used for?

Recently I have been studying the paper NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (https://www.matthewtancik.com/nerf), and I am wondering: what is it used for? Will there be any applications of NeRF?
The results of this technique are very impressive, but what is it actually used for? I keep coming back to this question. The output is very realistic and the quality is excellent, but we don't want to watch the camera swing around all the time, right?
Personally, I think this technique has some limitations:
It cannot generate views that were never seen in the input images; it interpolates between existing views.
Long training and rendering times: according to the authors, it takes about 12 hours to train a scene and about 30 seconds to render one frame.
The view is static and not interactive.
I don't know if it is appropriate to compare NeRF with panoramas and 360° images/video. They are essentially different: only NeRF uses deep learning to generate new views, while the others basically capture scenes with a smartphone or camera plus some computer vision techniques. Still, the long training time makes NeRF less competitive in this application area. Am I correct?
Another use I can think of is product rendering; however, NeRF doesn't show an advantage compared with rendering in 3D software. Commercial advertising, for example, usually requires animation and special effects, which 3D software can certainly do better.
A potential use of NeRF might be 3D reconstruction, although that seems out of scope even if it is capable of it. Why would we need NeRF for 3D reconstruction? Why not use other reconstruction techniques? The unique feature of NeRF is its ability to create photo-realistic views; if we use NeRF only for 3D reconstruction, that feature becomes pointless.
Does anyone have new ideas? I would like to know.
Why do we need to use NeRF for 3D reconstruction?
The alternative would be multi-view stereo, which produces point clouds of finite resolution and is susceptible to illumination changes. If you then render such a point cloud without non-trivial post-processing, it will not look photorealistic.
I don't know if it is appropriate to compare NeRF with Panorama and 360° image/video,
Well, if you are dealing with a perfectly flat scene under simple lighting (i.e. ambient light and Lambertian objects), then you can use panorama techniques for new-view synthesis. In general, though, that won't produce the result you expect: you have to know the depth to interpolate correctly.
As for the practical limitations (it is slow and does not model deformations), NeRF should be considered a milestone: a proof of concept that representing a surface as a level set of an MLP-modelled function can result in sharp rendering. There is already good progress on addressing those limitations, and multiple works apply this idea to practical tasks.
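
To make the "sharp rendering" point a bit more concrete: the ingredient in the NeRF paper that lets a small MLP represent high-frequency detail is the positional encoding of input coordinates. A minimal NumPy sketch, purely illustrative (the MLP, volume rendering, and training loop are omitted):

    # Sketch of NeRF-style positional encoding: map 3-D points to sin/cos features
    # at exponentially spaced frequencies so an MLP can fit high-frequency detail.
    import numpy as np

    def positional_encoding(x: np.ndarray, num_freqs: int = 10) -> np.ndarray:
        """x: (..., 3) points -> (..., 3 * 2 * num_freqs) encoded features."""
        freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # frequencies 2^k * pi
        angles = x[..., None] * freqs                    # (..., 3, num_freqs)
        feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
        return feats.reshape(*x.shape[:-1], -1)

    pts = np.random.rand(4, 3)               # 4 sample points along camera rays
    print(positional_encoding(pts).shape)    # (4, 60)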

What is a synthetic image?

I don't understand what a synthetic image is in computer vision.
And what are the differences between an optical image and a synthetic image?
Here's an example of the question; it's a screenshot from a research paper:
A real image is obtained by an imaging device such as a camera, which converts the light from a scene into pixel values. Because the image formation process obeys the laws of physics, real images are rich, complex, and often noisy and textured. The real world contains a lot of information.
A synthetic image is obtained "out of the blue" by pure computation, i.e. by modelling the real world and simulating the laws of optics.
Two decades ago, you could spot a synthetic image at a glance, because it lacked realism and was produced with overly simple models (in part due to heavy computation costs). This is no longer true; nowadays the two tend to be indistinguishable.
Note that in scientific contexts, very simple synthetic images (say, a chessboard) are often used for experimental purposes, for instance to test an image filter.
For instance, the scene below was synthesized by armies of researchers with the goal of finding the most realistic lighting simulation; the room never existed.
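
To illustrate the simplest kind of synthetic image mentioned above (the chessboard), here is a minimal sketch that produces such an image by pure computation, with no camera or scene model involved; the sizes are arbitrary:

    # Minimal synthetic image: a chessboard generated purely by computation,
    # the kind of test pattern often used to exercise filters or calibration code.
    import numpy as np

    def chessboard(squares: int = 8, square_px: int = 32) -> np.ndarray:
        """Return a uint8 image of alternating black/white squares."""
        idx = np.arange(squares * square_px) // square_px   # square index per pixel
        pattern = (idx[:, None] + idx[None, :]) % 2          # 0/1 checker layout
        return (pattern * 255).astype(np.uint8)

    img = chessboard()
    print(img.shape, img.dtype)   # (256, 256) uint8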

Are there any fast-rendering game libraries that use "GPU mode" as a rendering method in Air for iOS?

I've been toying around with Starling 2D and found that my game runs faster without it. The reason is simply that the game uses lots of vector shapes, and "direct" mode is just too slow at rendering shapes. Using GPU mode, FPS went up from 20 to around 55.
There was a small trade-off, though: rendering static images such as BitmapData (textures) was faster with Starling 2D. I also didn't have to constantly worry about whether my graphics assets were being hardware accelerated.
So I'm looking for a game (graphics) library for AIR for iOS that works in GPU rendering mode and makes it easier to manage BitmapData caching.
Does anyone know of any?
There are four render modes that can be selected when you create an AIR application, using <renderMode>mode</renderMode> in the application.xml file (a descriptor sketch follows the list below).
<renderMode>cpu</renderMode> or <renderMode>auto</renderMode> (Using Auto defaults to CPU)
Everything is rendered in software
Easiest to build
Performance improvements can be made with bitmap caching
<renderMode>gpu</renderMode>
Vectors and cached surfaces are rendered through hardware
Text and vectors can use cacheAsBitmapMatrix
There are many limitations (read more here http://help.adobe.com/en_US/as3/mobile/WS901d38e593cd1bac-3d719af412b2b394529-8000.html)
<renderMode>direct</renderMode> (aka Stage3D)
GPU accelerated rendering engine
Faster than any other rendering technique
Many useful frameworks such as Starling and Away3D
Lets you create applications with close to native-level performance
Handle most recurring scripts in Event.ENTER_FRAME
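For reference, a rough sketch of where the render mode is declared in the application descriptor; the namespace version, identifiers, and other values below are placeholders, not taken from the question:

    <!-- Sketch of an AIR application descriptor; only the renderMode element
         matters here, the rest are placeholder values. -->
    <application xmlns="http://ns.adobe.com/air/application/3.2">
        <id>com.example.mygame</id>
        <filename>MyGame</filename>
        <versionNumber>1.0.0</versionNumber>
        <initialWindow>
            <content>MyGame.swf</content>
            <renderMode>gpu</renderMode> <!-- cpu | auto | gpu | direct -->
            <fullScreen>true</fullScreen>
        </initialWindow>
    </application>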
It's up to you which one you choose. Try to stay away from heavy features such as:
Bitmap effects (shadow, gloss, emboss…)
Masks (inherently vector-based)
Alpha channels
Blend modes (add, multiply…)
Embedded fonts (Especially Unicode)
Complex vector artwork
There are no 2D vector libraries to my knowledge. I recommend exporting your vectors to sprite sheets. I know it doesn't scale as easily but it's the supported way of developing games in AIR.
Dustin

Can an image segmentation algorithm like GrabCut run on the iPhone GPU?

I've been toying around with the GrabCut algorithm (as implemented in OpenCV) on the iPhone. The performance is horrid: it takes about 10-15 seconds to run, even on the simulator, for an image that's about 800x800. On my phone (an iPhone 4) it runs for several minutes, eventually runs out of memory, and crashes. I'm sure there's some optimization I could do if I wrote my own version of the algorithm in C, but I get the feeling that no amount of optimization will get it anywhere near usable. I've dug up some performance measurements in academic papers, and even they were seeing 30-second runtimes on multicore 1.8 GHz CPUs.
So my only hope is the GPU, which I know literally nothing about. I've done some basic research on OpenGL ES so far, but it is a pretty in-depth topic and I don't want to waste hours or days learning the basic concepts just so I can find out whether or not I'm on the right path.
So my question is twofold:
1) Can something like GrabCut be run on the GPU? If so, I'd love to have a starting point other than "learn OpenGL ES". Ideally I would like to know what concepts I need to pay particular attention to. Keep in mind that I have no experience with OpenGL and very little experience with image processing.
2) Even if this type of algorithm can be run on the GPU, what kind of performance improvement should I expect? Considering that the current runtime is about 30 seconds AT BEST on the CPU, it seems unlikely that the GPU will put a big enough dent in the runtime to make the algorithm useful.
EDIT: For the algorithm to be "useful", I think it would have to run in 10 seconds or less.
Thanks in advance.
It seems that GrabCut doesn't really benefit from image resolution: the quality of the result doesn't depend directly on the resolution of the input image. Performance, on the other hand, does depend on size; the smaller the image, the faster the algorithm performs the cutout. So try scaling the image down to 300x300, apply GrabCut, take out the mask, scale the mask back up to the original size, and apply it to the original image to obtain the result. Let me know if it works.
Luca
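
A rough OpenCV sketch of that downscale-then-upscale-mask idea, written in Python only for brevity (the same calls exist in the iOS build of OpenCV); the rectangle and working size below are placeholders:

    # Run GrabCut on a downscaled copy, then upscale the resulting mask and
    # apply it to the full-resolution image.
    import cv2
    import numpy as np

    def grabcut_scaled(img: np.ndarray, rect, work_size=(300, 300)) -> np.ndarray:
        h, w = img.shape[:2]
        small = cv2.resize(img, work_size, interpolation=cv2.INTER_AREA)

        # Scale the user rectangle (x, y, w, h) into the small image's coordinates.
        sx, sy = work_size[0] / w, work_size[1] / h
        small_rect = (int(rect[0] * sx), int(rect[1] * sy),
                      int(rect[2] * sx), int(rect[3] * sy))

        mask = np.zeros(small.shape[:2], np.uint8)
        bgd = np.zeros((1, 65), np.float64)
        fgd = np.zeros((1, 65), np.float64)
        cv2.grabCut(small, mask, small_rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)

        # Keep definite or probable foreground, upscaled back to the original size.
        fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0)
        fg = cv2.resize(fg.astype(np.uint8), (w, h), interpolation=cv2.INTER_NEAREST)
        return cv2.bitwise_and(img, img, mask=fg)

    # Usage sketch (placeholder file and rectangle):
    # cutout = grabcut_scaled(cv2.imread("photo.jpg"), (50, 50, 600, 700))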

When to use power of 2 textures in XNA?

I hear a lot that power-of-two textures are better for performance reasons, but I couldn't find enough solid information about whether it's a problem when using XNA. Most of my textures have arbitrary dimensions and I don't see much of a problem, but maybe the VS profiler just doesn't show it.
In general, power-of-two textures are better, but most graphics cards allow non-power-of-two textures with minimal loss of performance. However, if you use the XNA Reach profile, only power-of-two textures are allowed, and some low-end graphics cards only support the Reach profile.
XNA is really a layer built on top of DirectX, so any performance guidelines that apply there also apply to anything using XNA.
The VS profiler also won't really capture the graphics-specific work you are doing; that needs to be profiled separately with a tool that can see how the graphics card itself is doing. If the graphics card is struggling, it won't show up as high resource usage on your CPU, but rather as slow rendering.
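
For completeness, a tiny helper sketch for spotting non-power-of-two dimensions and computing the size to pad up to; it is written in Python purely for illustration, and the equivalent checks are one-liners in C#:

    # Helper sketch: detect non-power-of-two texture dimensions and compute the
    # next power of two to pad up to (e.g. when targeting the Reach profile).
    def is_power_of_two(n: int) -> bool:
        return n > 0 and (n & (n - 1)) == 0

    def next_power_of_two(n: int) -> int:
        p = 1
        while p < n:
            p <<= 1
        return p

    print(is_power_of_two(512), next_power_of_two(800))  # True 1024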
