Direct2D versus Direct3D for digital video rendering - directx

I need to render video from multiple IP cameras into several controls within the client application.
On top of the video, I should be able to add some OSD such as timestamp and camera name.
What I'm trying to do has nothing to do with 3D since we're talking about digital video with some text on it.
Which API is more suitable for this purpose? Direct3D or Direct2D?
Performance should also be a consideration here.

It used to be that Direct2D was a poor choice for Windows Phone (if you care about that platform) because it wasn't supported, but Windows Phone 8.1 has it now, so that is less of an issue.
My experience with D2D was that it offered fast, high quality 2D rendering, and I would say it is a good choice.
You might want to take a look at this article on Code Project. That looks appropriate for your purposes.
If you are certain you only need MS system support, then you're all set.
Another way to go would be a cross-platform library like NanoVG, which offers nice 2D rendering and would work on a Mac. Of course, you'd need to figure out how to do the video part on non-Windows systems.
Regarding D3D, you could certainly do it that way, but my guess would be it would make some things trickier to do. Don't forget you can combine the two as well...
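To give a feel for the per-frame work in Direct2D, here is a minimal sketch. It assumes you already have each decoded frame as a BGRA buffer and have created the render target and a DirectWrite text format elsewhere; the function and parameter names are placeholders, not anything from the question:

#include <windows.h>
#include <d2d1.h>
#include <d2d1helper.h>
#include <dwrite.h>
#include <wrl/client.h>
#include <string>

// Blit one decoded video frame, then draw OSD text (timestamp / camera name) on top.
void DrawFrameWithOsd(ID2D1HwndRenderTarget* rt,
                      IDWriteTextFormat* textFormat,
                      const BYTE* bgraPixels, UINT width, UINT height, UINT stride,
                      const std::wstring& osdText)
{
    // Wrap the decoded frame in a D2D bitmap (premultiplied BGRA is the usual layout).
    Microsoft::WRL::ComPtr<ID2D1Bitmap> frame;
    D2D1_BITMAP_PROPERTIES props = D2D1::BitmapProperties(
        D2D1::PixelFormat(DXGI_FORMAT_B8G8R8A8_UNORM, D2D1_ALPHA_MODE_PREMULTIPLIED));
    rt->CreateBitmap(D2D1::SizeU(width, height), bgraPixels, stride, &props, &frame);

    Microsoft::WRL::ComPtr<ID2D1SolidColorBrush> brush;
    rt->CreateSolidColorBrush(D2D1::ColorF(D2D1::ColorF::White), &brush);

    rt->BeginDraw();
    D2D1_SIZE_F size = rt->GetSize();
    rt->DrawBitmap(frame.Get(), D2D1::RectF(0, 0, size.width, size.height));

    // OSD text in the top-left corner; in real code, cache the brush and text format.
    rt->DrawText(osdText.c_str(), static_cast<UINT32>(osdText.size()), textFormat,
                 D2D1::RectF(8, 8, size.width - 8.0f, 40), brush.Get());

    // Check the HRESULT for D2DERR_RECREATE_TARGET in real code.
    rt->EndDraw();
}

In practice you would create the ID2D1Bitmap once per camera and update it each frame with ID2D1Bitmap::CopyFromMemory rather than recreating it, but the flow is the same.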

Related

Getting FPS and frame-time info from a GPU

I am a mathematician, not a programmer. I have a notion of the basics of programming and am a fairly advanced power user on both Linux and Windows.
I know some C and some Python, but not much more.
I would like to make an overlay so that when I start a game it can get info from AMD and NVIDIA GPUs such as frame time and FPS, because I am quite certain the benchmarks currently used to compare two GPUs are flawed: small instances and scenes that bump up the FPS momentarily (but are totally irrelevant in terms of user experience) result in a higher average FPS number and mislead the market, either unintentionally or intentionally. (For example, in one game - I can't remember the name, probably COD - there was a highly tessellated entity on the map that wasn't even visible to the player, which led AMD GPUs to seemingly underperform when roaming through that area, lowering their average FPS count.)
I have an idea of how to calculate GPU performance in theory, but I don't know how to harvest the data from the GPU. Could you refer me to API manuals or references that would help me make such an overlay?
I would like to study as little as possible (by that I mean I would like to learn only what I absolutely have to in order to get the job done; I don't intend to become a coder).
I thank you in advance.
This is generally what the Vulkan layer system is for: it allows you to intercept API commands and inject your own. But it is nontrivial to code yourself. Here are some pre-existing open-source options for you:
To get timing info and draw your custom overlay you can use (and modify) a tool like OCAT. It supports Direct3D 11, Direct3D 12, and Vulkan apps.
To just get the timing (and other interesting info) as CSV, you can use a command-line tool like PresentMon. It works with D3D, and I have been using it with Vulkan apps too; it seems to accept them fine.
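Once you have a CSV capture (for example from PresentMon with options along the lines of -process_name yourgame.exe -output_file frames.csv; check the tool's help output for the exact flags of your build), turning it into the numbers you care about is straightforward. Here is a small sketch that contrasts average FPS with the 99th-percentile frame time, which is a better indicator of the stutter you describe; it assumes the frame-time column is named MsBetweenPresents, as in the PresentMon 1.x CSV header:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: fpsstats <presentmon.csv>\n"; return 1; }
    std::ifstream in(argv[1]);
    std::string line;

    // Locate the frame-time column from the header row.
    std::getline(in, line);
    std::vector<std::string> header;
    std::stringstream hs(line);
    for (std::string cell; std::getline(hs, cell, ',');) header.push_back(cell);
    auto it = std::find(header.begin(), header.end(), "MsBetweenPresents");
    if (it == header.end()) { std::cerr << "column not found\n"; return 1; }
    size_t col = static_cast<size_t>(it - header.begin());

    // Collect every per-frame time in milliseconds.
    std::vector<double> frameMs;
    while (std::getline(in, line)) {
        std::stringstream ls(line);
        std::string cell;
        for (size_t i = 0; std::getline(ls, cell, ','); ++i)
            if (i == col) { frameMs.push_back(std::stod(cell)); break; }
    }
    if (frameMs.empty()) return 1;

    double totalMs = 0;
    for (double ms : frameMs) totalMs += ms;
    std::sort(frameMs.begin(), frameMs.end());
    double p99 = frameMs[static_cast<size_t>(0.99 * (frameMs.size() - 1))];

    std::cout << "average FPS: " << 1000.0 * frameMs.size() / totalMs << "\n";
    std::cout << "99th-percentile frame time: " << p99 << " ms (~"
              << 1000.0 / p99 << " FPS during the worst 1% of frames)\n";
    return 0;
}

A scene that briefly spikes to a very high FPS inflates the average but barely moves the 99th-percentile frame time, which is exactly the distortion you are describing.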

How to read a video file and split it into frames for Android

My goal is as follows: I have to read in a video that is stored on the SD card, process it frame by frame (doing image processing on each frame), and then store the result in a new file on the SD card.
At first I wanted to use OpenCV for Android, but I did not seem to be able to read the video, as described here.
I am guessing you already know that doing this on a mobile device, or any compute-limited device, is not ideal, simply because video manipulation is very compute intensive, which translates to slow execution and heavy battery usage on many devices. If you do have the option to do the processing on the server side, it is definitely worth considering.
Assuming that for your use case you need to do it on the mobile device, OpenCV on Android will now allow you to read in a video and access each frame - @StephenG mentions this in his answer to the question you refer to above.
In the past, functionality like this did not get ported to Android OpenCV, as the guidance was to use ffmpeg for frame grabbing on Android devices.
According to more recent documentation, however, this should be available for Android now using the VideoCapture class (note I have not used this myself...):
http://docs.opencv.org/java/2.4.11/org/opencv/highgui/VideoCapture.html
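If VideoCapture does work for your OpenCV/Android combination, the read-process-write loop has the same shape in the Java binding as in the native C++ API. Below is a minimal sketch using the C++ API; the file paths, codec and processing step are placeholders, and the constant names are the OpenCV 3.x spellings (2.4 uses the CV_-prefixed equivalents):

#include <opencv2/opencv.hpp>

int main() {
    // Open the source video; isOpened() tells you whether reading is possible at all.
    cv::VideoCapture cap("/sdcard/input.mp4");
    if (!cap.isOpened()) return 1;

    double fps = cap.get(cv::CAP_PROP_FPS);
    cv::Size size(static_cast<int>(cap.get(cv::CAP_PROP_FRAME_WIDTH)),
                  static_cast<int>(cap.get(cv::CAP_PROP_FRAME_HEIGHT)));
    cv::VideoWriter out("/sdcard/output.avi",
                        cv::VideoWriter::fourcc('M', 'J', 'P', 'G'), fps, size);

    cv::Mat frame, processed;
    while (cap.read(frame)) {                                 // false at end of stream
        cv::cvtColor(frame, processed, cv::COLOR_BGR2GRAY);   // your processing goes here
        cv::cvtColor(processed, processed, cv::COLOR_GRAY2BGR);
        out.write(processed);                                  // writer expects 3 channels
    }
    return 0;
}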
It is worth noting that the OpenCV Android examples are all currently based around Eclipse, and if you want to use Android Studio, getting things up and running initially can be quite tricky. The following worked for me recently, but as both Studio and OpenCV change over time you may find you have to do some forum hunting if it does not work for you:
https://stackoverflow.com/a/35135495/334402
Taking a different approach, you can use ffmpeg itself, in a wrapper in Android, for tasks like this.
The advantage of the wrapper approach is that you can use all the usual command line syntax and there is a lot of info on the web to help you get the right parameters.
The disadvantage is that ffmpeg was not really designed to be wrapped in this way, so you do sometimes see issues. Having said that, it is a common approach now, and as long as you choose a well-used wrapper library you should at least have a good community to discuss any issues you come across. I have used this approach in a hand-crafted way in the past, but if I were doing it again I would use one of the popular examples such as:
https://github.com/WritingMinds/ffmpeg-android-java
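To give a sense of the command-line syntax mentioned above, a typical frame-extraction command (file names are placeholders) is:

ffmpeg -i input.mp4 -vf fps=5 frame_%04d.png

which writes five frames per second of the source video out as numbered PNG files. The wrapper library essentially runs commands like this for you on the device.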

iOS CGImageRef Pixel Shader

I am working on an image processing app for the iOS, and one of the various stages of my application is a vector based image posterization/color detection.
Now, I've written code that can determine the posterized color per pixel, but going through each and every pixel in an image would, I imagine, be quite taxing for the processor of an iOS device. As such, I was wondering if it is possible to use the graphics processor instead.
I'd like to create a sort of "pixel shader" which uses OpenGL ES, or some other rendering technology, to process and posterize the image quickly. I have no idea where to start (I've written simple shaders for Unity3D, but never done the underlying programming for them).
Can anyone point me in the correct direction?
I'm going to come at this sideways and suggest you try out Brad Larson's GPUImage framework, which describes itself as "a BSD-licensed iOS library that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies". I haven't used it, and I assume you'll need to do some GL reading to add your own filtering, but it handles so much of the boilerplate and provides so many prepackaged filters that it's definitely worth looking into. It doesn't sound like you're otherwise particularly interested in OpenGL, so there's no real reason to dig into it any deeper than you have to.
I will add the sole consideration that under iOS 4 I found it was often faster to do work on the CPU (using GCD to distribute it among cores) than on the GPU wherever I needed to read the results back at the end for any sort of serial access. That's because OpenGL is generally designed so that you upload an image and it converts it into whatever format it wants; if you want to read it back, it converts it into the one format you expect to receive it in and copies it to where you want it. So what you save on the GPU you pay for again, because the GL driver has to shunt and rearrange memory. As of iOS 5, Apple have introduced a special mechanism that effectively gives you direct CPU access to OpenGL's texture store, so that's probably not a concern any more.
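Whichever route you take (a custom GPUImage filter or a GCD-distributed CPU loop), the posterize step itself is just per-channel quantization. Here is a minimal CPU-side sketch of that per-pixel computation, assuming a caller-chosen number of levels per channel (at least 2); a fragment shader version would compute exactly the same thing on the GPU:

#include <cstdint>
#include <cmath>

struct RGBA8 { uint8_t r, g, b, a; };

// Snap each channel to the nearest of 'levels' evenly spaced values (levels >= 2).
RGBA8 posterize(RGBA8 in, int levels)
{
    auto quant = [levels](uint8_t c) -> uint8_t {
        float v = c / 255.0f;                                    // normalize to 0..1
        v = std::floor(v * (levels - 1) + 0.5f) / (levels - 1);  // snap to nearest level
        return static_cast<uint8_t>(v * 255.0f + 0.5f);
    };
    return { quant(in.r), quant(in.g), quant(in.b), in.a };
}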

3D library recommendations for interactive spatial data visualisation?

Our software produces a lot of data that is georeferenced and recorded over time. We are considering ways to improve the visualisation, and showing the (processed) data in a 3D view, given it's georeferenced, seems a good idea.
I am looking for SO's recommendations for what 3D libraries are best to use as a base when building these kind of visualisations in a Delphi- / C++Builder-based Windows application. I'll add a bounty when I can.
The data
Is recorded over time (hours to days) and is GPS-tagged. So, we have a lot of data following a path over time.
Is spatial: it represents real 3D elements of the earth, such as the land, or 3D elements of objects around the earth.
Is high volume: we could have a point cloud, say, of hundreds of thousands to millions of points. Processed data may display as surfaces created from these point clouds.
From that, you can see that an interactive, spatially-based 3D visualisation seems a good approach. I'm envisaging something where you can easily and quickly navigate around in space, and data will load or be generated on the fly depending on what you're looking at. I would prefer we don't try to write our own 3D library from scratch - for something like this, there have to be good existing libraries we can work from.
So, I'm hoping for a library which supports:
good navigation (is the library based on Euler rotations only, for example? Can you 'pick' objects to rotate around or move with easily?);
modern GPUs (shader-only rendering is ok; being able to hook into the pipeline to write shaders that map values to colours and change dynamically would be great - think data values given a colour through a colour lookup table);
dynamic data / objects (data can be added as it's recorded; and if the data volume is too high, we should be able to page things in and out or recreate them, and only show a sensible subset so that whatever the user's viewport is looking at is there onscreen, but other data can be loaded/regenerated, preferably asynchronously, or at least quickly as the user navigates. Obviously data creation is dependent on us, but a library that has hooks for this kind of thing would be great.)
and technologically, works with Delphi / C++Builder and the VCL.
Libraries
There are two main libraries I've considered so far - I'm looking for knowledgeable opinions about these, or for other libraries I haven't considered.
1. FireMonkey
This is Embarcadero's new UI library, which is only available in XE2 and above. Our app is based on the VCL and we'd want to host this in a VCL window; that seems to be officially unsupported but unofficially works fine, or is available through third-parties.
The mix of UI framework and 3D framework with shaders etc sounds great. But I don't know how complex the library is, what support it has for data that's not a simple object like a cube or sphere, and how well-designed it is. That last link has major criticisms of the 3D side of the library - severe enough I am not sure it's worthwhile in its current state at the time of writing for a non-trivial 3D app.
Is it worth trying to write a new visualisation window in our VCL app using FireMonkey?
2. GLScene
GLScene is a well-known 3D OpenGL framework for Delphi. I have never used it myself so have no experience about how it works or is designed. However, I believe it integrates well into VCL windows and supports shaders and modern GPUs. I do not know how its scene graph or navigation work or how well dynamic data can be implemented.
Its feature list specifically mentions some things I'm interested in, such as easy rotation/movement, procedural objects (implying dynamic data is easy to implement), and helper functions for picking. It seems shaders are Cg only (not GLSL or another non-vendor-specific language.) It also supports "polymorphic image support for texturing (allows many formats as well as procedural textures), easily extendable" - that may just mean many image formats, or it may indicate something where the texture can be dynamically changed, such as for dynamic colour mapping.
Where to from here?
These are the only two major 3D libraries I know of for Delphi or C++Builder. Have I missed any? Are there pros and cons I'm not aware of? Do you have any experience using either of these for this kind of purpose, and what pitfalls should we be wary of or features should we know about and use?
We currently use Embarcadero RAD Studio 2010 and most of our software is written in C++. We have small amounts of Delphi and may consider upgrading IDEs, but we are most likely to wait until the 64-bit C++ compiler is released. For that reason, a library that works in RS2010 might be best.
Thanks for your input :) I'm after high-quality answers, so I'll add a bounty when I can!
I have used GLScene in my 3D geomapping software, and although I have not used it to the extent you're looking for, I can vouch that it seems the most appropriate for what you're trying to do.
GLScene supports terrain rendering and adding customizable objects to the scene. Objects can be interacted with and you can create complex 3D models of objects using the various building blocks of GLScene.
Unfortunately I cannot say how it will behave with millions of points, but I do know that it is quite optimized and performs well on minimal hardware. That said, the target PC I worked with required a dedicated graphics card capable of using OpenGL 2.1 extensions or higher (I found small issues with integrated graphics cards).
The other library I looked at was DXScene, which appears quite similar to GLScene albeit using DirectX instead of OpenGL. From memory this was a commercial product, whereas GLScene was licensed under the GPL. (EDIT - the page seems to be down at the moment: http://www.ksdev.com/index.html)
GLScene is still in active development and provides a fairly comprehensive library of functions, base objects and texturing etc. Things like rotation, translation, pitch, roll, turn, ray casting - to name a few - are all provided for you. Visibility culling is provided for each base object as well as viewing cameras, lighting and meshes. Base objects include cubes, spheres, pipes, tetrahedrons, cones, terrain, grids, 3d text, arrows to name a few.
Objects can be picked with the mouse and moved along 1, 2 or 3 axes. Helper functions are included to automatically calculate the top-most object under the mouse. Complex 3D shapes can be built by attaching base objects to other base objects in a hierarchical manner. So, for example, a car could be built using a rectangle as the base object and attaching four cylinders to it for the wheels; then you can manipulate the 'car' as a whole, since the four cylinders are attached to the base rectangle (see the sketch below).
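To illustrate that hierarchy idea, here is a rough C++Builder-style sketch. The class and method names (TGLCube, TGLCylinder, AddNewChild, Position->SetPoint, Turn) come from GLScene's documented object set, but I have not compiled this against a current release, so treat it as pseudocode and check the unit names against the GLScene version you install:

#include <GLScene.hpp>    // unit names vary between GLScene releases
#include <GLObjects.hpp>

void BuildCar(TGLScene* scene)
{
    // Parent object: the car body.
    TGLCube* body = static_cast<TGLCube*>(
        scene->Objects->AddNewChild(__classid(TGLCube)));
    body->CubeWidth = 4;
    body->CubeHeight = 1;
    body->CubeDepth = 2;

    // Four wheels attached as children of the body, positioned relative to it.
    const float wx[4] = {-1.5f, 1.5f, -1.5f, 1.5f};
    const float wz[4] = {-1.0f, -1.0f, 1.0f, 1.0f};
    for (int i = 0; i < 4; ++i) {
        TGLCylinder* wheel = static_cast<TGLCylinder*>(
            body->AddNewChild(__classid(TGLCylinder)));
        wheel->Position->SetPoint(wx[i], -0.5f, wz[i]);
    }

    // Manipulating the parent now manipulates the whole 'car', wheels included.
    body->Turn(15);
}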
The only downside I could bring to your attention is the sometimes limited help/support available to you. Yes, there is a reference manual and a number of demo applications to show you how to do things such as select objects and move them around, however the reference manual is not complete and there is potential to get 'stuck' on how to accomplish a certain task. Forum support is somewhat limited/sparse. If you have a sound knowledge of 3D basics and concepts I'm sure you could nut it out.
As for FireMonkey - I have had no experience with it, so I can't comment. I believe it is more targeted at mobile applications with lower hardware requirements, so you may have issues with larger data sets.
Here are some other links that you may consider - I have no experience with them:
http://www.truevision3d.com/
http://www.3impact.com/
Game Development in Delphi
The last one is targeted at game development - but may provide useful information.
Have you tried glData? http://gldata.sourceforge.net/
It is old (~2004, Delphi 7), and I have not personally used the library, but some of the output is amazing.
You can use GLScene or OpenGL; both are good for 3D rendering and very easy to use.
Since you are already using georeferenced data, maybe you should consider embedding Google Earth in your Delphi application like this? Then you can add data to it as points, paths, or objects.

Automated Webcam Application / Hardware Problems

I am starting to develop an automated webcam application. The goal is to automatically take pictures, do some image processing and then upload the results to a FTP site. All of these tasks seem simple.
However, I am having a hard time finding a decent camera. I don't want to use a simple webcam or HD webcam because the image quality of still frames isn't very good.
I'm also having a hard time finding an affordable digital camera supporting USB snapshot or control.
My second concern is the development itself. I'm not quite sure which programming language to use. I have experience with AS3, Processing, Java and some simple C++ and Open CV.
Do you have a clue?
Regarding the camera, there are pretty good webcams out there, some with HD quality. Look at the cameras from Logitech (I tested their API and it is quite good); an HD camera retails for $99, which is very cheap. If you are looking for something better I would go with Nikon, as they also have a pretty good API for C#/C++. You can get a basic SLR with a simple 28mm lens for $500. Don't use a PowerShot, as Canon stopped supporting their API. Whatever camera you decide to buy, make sure a proper API is available, is being maintained, and is free.
Regarding development, I would go with C#/Java as they are easier than C++. There are quite a lot of image processing libraries for C#/Java; just make sure that the camera comes with an API that fits your chosen language.
Good luck.
Generally (from experience) most USB cameras that show up as an imaging device in Windows can be used with JAI [Java Advanced Imaging]. Additionally [on the .NET/C++ side], the same cameras can be used through DirectShow as a capture device. Java/C# will make development easier, but expect to lose some performance [even with the best of optimizations]. You can also only go as fast as the camera and the data line running from the camera to the computer allow [USB 1.0 will seriously limit a decent framerate].
First, get the image into RAM:
If you are using CHDK, I suggest you get the image copied from camera memory to RAM using the scripting languages supported by CHDK - you can get help from the CHDK forum http://chdk.setepontos.com/index.php for this.
Or, if that's difficult, you can continuously copy the image to the hard disk and load it into RAM from there. (You need to take care of - i.e. delete - the massive number of images that accumulate on the hard disk in a short period of time!)
This sounds like a 'brute force' approach, but it will get your work going while you are researching the correct approach.
Then perform the image processing:
Once the image is in RAM, you can apply your image processing algorithms as usual, e.g. using the OpenCV library.
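As a minimal sketch of that 'brute force' pipeline with OpenCV in C++ (file paths and the actual processing are placeholders; the FTP upload would be a separate step):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Load the most recent image the camera (e.g. a CHDK script) dropped on disk.
    cv::Mat img = cv::imread("capture/latest.jpg");
    if (img.empty()) { std::cerr << "no image available yet\n"; return 1; }

    // Placeholder processing: downscale and stretch the contrast.
    cv::Mat processed;
    cv::resize(img, processed, cv::Size(), 0.5, 0.5);
    cv::normalize(processed, processed, 0, 255, cv::NORM_MINMAX);

    // Write the result where the FTP upload step can pick it up.
    cv::imwrite("upload/latest.jpg", processed);
    // Remember to delete the originals so they do not fill the disk (see above).
    return 0;
}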
hope this helps you
