Capture images from a webcam in Delphi

I am looking for a way to capture images from my webcam using DirectShow. Preferably I want to use HD resolutions if possible, and avoid CPU spikes of 60-100%.
Can someone point me in the right direction on how to do this?
I tried using DSPack, but this component makes my CPU spike to 90-100%.
If, however, someone here knows how to use DSPack with less CPU consumption, I would also be happy about that :)

I've used DSPack for a long time on cheap machines that are built into cars. They have slow ~700 MHz VIA processors (single core) and 256 MB of RAM.
The application captures 12 images per second from a camera on the roof of the car, and every time a new GPS position comes in (once per second), it adds the coordinates to the image and stores it as a .jpg on a hard disk.
When the application captures images and shows them on a form without creating .jpg files, it takes about 5% processor time (!).
If you get 90% CPU time with DSPack, it's probably because of extra processing that's being done with the images.
I've tried all sorts of libraries in my research to create this program, and DSPack was a clear winner on many fronts. I wouldn't give up on it too soon.
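To illustrate the point about extra processing, here is a minimal Python sketch (not DSPack code) of the pattern described above, as I read it: every incoming frame is only remembered, and the expensive work (captioning, JPEG encoding) runs once per GPS fix. The class name, the file naming scheme and the caption format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaggedImage:
    filename: str  # hypothetical naming scheme
    caption: str   # text to draw onto the image before JPEG encoding
    frame: bytes   # raw frame data, untouched here


class GpsTriggeredRecorder:
    """Remembers the newest frame; produces an image to save only on a GPS fix."""

    def __init__(self) -> None:
        self._latest_frame: Optional[bytes] = None
        self._saved = 0

    def on_frame(self, frame: bytes) -> None:
        # Cheap: called for every captured frame (about 12 times per second).
        self._latest_frame = frame

    def on_gps_fix(self, lat: float, lon: float) -> Optional[TaggedImage]:
        # Expensive work would hang off this call (about once per second).
        if self._latest_frame is None:
            return None  # no frame captured yet, nothing to store
        self._saved += 1
        return TaggedImage(
            filename=f"img_{self._saved:06d}.jpg",
            caption=f"{lat:.5f}, {lon:.5f}",
            frame=self._latest_frame,
        )
```

If your own code does per-frame work such as JPEG encoding or bitmap copies on every captured frame, moving it behind a slower trigger like this is one thing to try before blaming the capture component.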

I have a real-time video application that uses Mitov's VideoLibrary. It's a collection of objects that are well-designed, threaded, and take advantage of all the CPU cores available.
When I go to his library with some new need, I'm usually pleasantly surprised to see he anticipated it. Support has been very good also.
It's not cheap: $450, but for my needs, has been worth every penny. It's free for non-commercial use: http://www.mitov.com/html/videolab.html.
His CaptureBitMap demo captures successive frames to a bitmap. You simply drop several components on a form, and write six lines of code! The library has lots of hooks to go further than this simple example. (In Win 7, Delphi 2010, the demos are installed here: C:\Program Files (x86)\Embarcadero\RAD Studio\7.0\LabPacks\Demos\Delphi2010\VideoLab\CaptureBitmap. But, I know he supports as far back as Delphi 7.)
One thing that differentiates his library is that it makes use of the Intel IPP libraries: http://software.intel.com/en-us/articles/intel-ipp. When running on Intel chipsets, if you choose to ship the Intel DLLs, you get the best performance that Intel's engineers could squeeze out of their chips. If Mitov's library with IPP can't process your video fast enough, I'd be surprised if any video library can.
Mitov has some standing in the Delphi community: he was a speaker on multi-threading at CodeRage: http://www.embarcadero.com/coderage5/sessions (Thursday session.)
The above may sound like I'm a shill for his company. I don't have any relationship other than as a very pleased licensee. I'm just very happy (and relieved) that I found his tools and decided to use them.

Related

What takes up the majority of the storage space in video games?

I've been curious recently about why text/code seems to take up so little storage but video game applications are enormous in size. For example, a game like Warzone is over 100 GB.
Link to see how enormous the maps are: https://www.gamesatlas.com/cod-modern-warfare/guides/call-of-duty-warzone-map-all-cod-battle-royale-locations
I've done some research and think that it has something to do with the complex landscapes that are created in the videogames. Those don't seem to be lines of code that a developer has written but rather creating some sort of 3D environment for your game to run in.
What about something like Windows or other operating systems? Is the entire storage "weight" of the download code, or is data downloaded as well to make the applications run?
If the majority of it is code, how do those enormous organizations write so many lines of code to take up so much storage?
It just depends on the game.
For triple-A games, I would say most of it is binary data like textures, models, and media (video, cinematics, audio).
Then there is the way the game is packed, plus a lot of dependencies like the Visual C++ Redistributable, game engines, physics engines, libraries, etc. Many of those may still be packed with the game even if they are not used.
For some "indie" games like Minecraft, I wouldn't be surprised if code is what takes most of the space (or audio, I guess?). Note that the map can be larger than the game too.
What you can do is use a tool like WinDirStat to check what takes up the space, but it will not find dependencies that live outside the game folder.
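If you'd rather script the check than use a GUI tool, a rough Python sketch of the same idea is to walk the install folder and add up file sizes per extension. The function name is made up for this example, and like the GUI tool it only sees files under the folder you give it.

```python
import os
from collections import Counter


def size_by_extension(root: str) -> Counter:
    """Total file size in bytes per file extension under `root`."""
    totals: Counter = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext] += size
    return totals
```

Calling size_by_extension on the game folder and looking at most_common(10) on the result gives the ten heaviest extensions, which is usually enough to see whether textures, audio, video or executables dominate.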
For the codebase, I guess it's mainly automated through game engines.
Here is an example for Conan Exiles:
So it's mainly texture data (GraniteSDK); the game engine files are 115 MB and the executables are 100 MB (note that it has the BattlEye anti-cheat packed in, plus the server version of the game). Video is 500 MB.
Another example for Minecraft:
Which is (contrary to what I expected) mainly texture/sound data.
What about, let's say, Chrome?
Interpretation: I have no clue :D.
Last one:
Python itself is not very big. But all the dependencies, and their dependencies (the DLLs, etc.), are quite big in the end.

Getting FPS and frame-time info from a GPU

I am a mathematician, not a programmer. I have a grasp of the basics of programming and am a fairly advanced power user on both Linux and Windows.
I know some C and some Python, but not much.
I would like to make an overlay that, when I start a game, collects information such as frame time and FPS from AMD and NVIDIA GPUs. I am quite certain that the way benchmarks are currently used to compare two GPUs is flawed: brief moments and scenes that bump up the FPS (but are totally irrelevant to the user experience) result in a higher average FPS and mislead the market, intentionally or not. (For example, in one game, probably COD though I can't remember the name, there was a highly tessellated entity on the map that wasn't even visible to the player, which led AMD GPUs to seemingly underperform when roaming through that area, lowering their average FPS.)
I have an idea of how to calculate GPU performance in theory, but I don't know how to harvest the data from the GPU. Could you refer me to API manuals or references that would help me make such an overlay?
I would like to study as little as possible (by that I mean I would like to learn only what I absolutely have to in order to get the job done; I don't intend to become a coder).
Thank you in advance.
This is generally what the Vulkan layer system is for: it lets you intercept API commands and inject your own. But it is nontrivial to code yourself. Here are some pre-existing open-source options:
To get timing info and draw your own overlay, you can use (and modify) a tool like OCAT. It supports Direct3D 11, Direct3D 12, and Vulkan apps.
To just get the timing (and other interesting info) as CSV, you can use a command-line tool like PresentMon. It should work with D3D, and I have been using it with Vulkan apps too; it seems to accept them.
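Once you have per-frame times, the statistics side needs very little code. Below is a small Python sketch that takes a list of frame times in milliseconds and reports the average FPS next to the median and 99th-percentile frame time, which shows how a healthy-looking average can hide slow frames. How you load the list is up to you: I'm assuming you read the per-frame time column from the tool's CSV, so check the CSV header for its exact name. The function and key names are my own.

```python
import math
from typing import Dict, List


def frame_time_stats(frame_times_ms: List[float]) -> Dict[str, float]:
    """Summarise frame times (milliseconds) with average and percentile figures."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    n = len(ordered)

    def percentile(p: int) -> float:
        # Nearest-rank percentile: smallest value with at least p% of samples <= it.
        rank = max(1, math.ceil(p * n / 100))
        return ordered[rank - 1]

    return {
        "avg_fps": 1000.0 * n / sum(ordered),  # frames divided by total seconds
        "median_ms": percentile(50),
        "p99_ms": percentile(99),
        "p99_fps": 1000.0 / percentile(99),    # FPS equivalent of the slow frames
    }
```

For example, 98 frames at 10 ms plus 2 frames at 100 ms average out to roughly 85 FPS, while the 99th-percentile frame time is 100 ms (10 FPS), which is much closer to what the stutter feels like.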

Direct2D versus Direct3D for digital video rendering

I need to render video from multiple IP cameras into several controls within the client application.
On top of the video, I should be able to add some OSD such as timestamp and camera name.
What I'm trying to do has nothing to do with 3D since we're talking about digital video with some text on it.
Which API is more suitable for this purpose? Direct3D or Direct2D?
Performance should also be a consideration here.
It used to be that Direct2D was a poor choice for Windows Phone (if you care about that system) because it wasn't supported, but Windows Phone 8.1 has it now, so that is less of an issue.
My experience with D2D was that it offered fast, high quality 2D rendering, and I would say it is a good choice.
You might want to take a look at this article on Code Project. That looks appropriate for your purposes.
If you are certain you only need MS system support, then you're all set.
Another way to go would be a cross-platform system like nanovg, which offers nice 2D rendering and would work on a Mac. Of course, you'd need to figure out how to do the video part on non-Windows systems.
Regarding D3D, you could certainly do it that way, but my guess would be it would make some things trickier to do. Don't forget you can combine the two as well...

3D library recommendations for interactive spatial data visualisation?

Our software produces a lot of data that is georeferenced and recorded over time. We are considering ways to improve the visualisation, and showing the (processed) data in a 3D view, given it's georeferenced, seems a good idea.
I am looking for SO's recommendations for what 3D libraries are best to use as a base when building these kinds of visualisations in a Delphi- / C++Builder-based Windows application. I'll add a bounty when I can.
The data
Is recorded over time (hours to days) and is GPS-tagged. So, we have a lot of data following a path over time.
Is spatial: it represents real 3D elements of the earth, such as the land, or 3D elements of objects around the earth.
Is high volume: we could have a point cloud, say, of hundreds of thousands to millions of points. Processed data may display as surfaces created from these point clouds.
From that, you can see that an interactive, spatially-based 3D visualisation seems a good approach. I'm envisaging something where you can easily and quickly navigate around in space, and data will load or be generated on the fly depending on what you're looking at. I would prefer we don't try to write our own 3D library from scratch - for something like this, there have to be good existing libraries we can work from.
So, I'm hoping for a library which supports:
good navigation (is the library based on Euler rotations only, for example? Can you 'pick' objects to rotate around or move with easily?);
modern GPUs (shader-only rendering is ok; being able to hook into the pipeline to write shaders that map values to colours and change dynamically would be great - think data values given a colour through a colour lookup table);
dynamic data / objects (data can be added as it's recorded; and if the data volume is too high, we should be able to page things in and out or recreate them, and only show a sensible subset so that whatever the user's viewport is looking at is there onscreen, but other data can be loaded/regenerated, preferably asynchronously, or at least quickly as the user navigates. Obviously data creation is dependent on us, but a library that has hooks for this kind of thing would be great.)
and technologically, works with Delphi / C++Builder and the VCL.
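To make the colour lookup table idea concrete, here is a minimal CPU-side sketch in Python. It is only an illustration of the mapping: in a real renderer I would expect this to live in a shader sampling a 1D texture, and the function names and the two-stop blue-to-red ramp in the usage note are assumptions for the example.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]  # 8-bit RGB


def make_lut(stops: List[Color], size: int = 256) -> List[Color]:
    """Build a lookup table by linear interpolation between colour stops."""
    if len(stops) < 2 or size < 2:
        raise ValueError("need at least two stops and two table entries")
    lut: List[Color] = []
    for i in range(size):
        pos = i / (size - 1) * (len(stops) - 1)  # position along the stops
        lo = min(int(pos), len(stops) - 2)       # index of the left-hand stop
        t = pos - lo                             # blend factor within the segment
        a, b = stops[lo], stops[lo + 1]
        lut.append(tuple(round(a[k] + (b[k] - a[k]) * t) for k in range(3)))
    return lut


def value_to_color(value: float, vmin: float, vmax: float, lut: List[Color]) -> Color:
    """Map a data value in [vmin, vmax] to a colour, clamping out-of-range values."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    return lut[round(t * (len(lut) - 1))]
```

With make_lut([(0, 0, 255), (255, 0, 0)]), low values come out blue and high values red; changing vmin and vmax at run time is what I mean by the mapping changing dynamically.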
Libraries
There are two main libraries I've considered so far - I'm looking for knowledgeable opinions about these, or for other libraries I haven't considered.
1. FireMonkey
This is Embarcadero's new UI library, which is only available in XE2 and above. Our app is based on the VCL and we'd want to host this in a VCL window; that seems to be officially unsupported but unofficially works fine, or is available through third-parties.
The mix of UI framework and 3D framework with shaders etc sounds great. But I don't know how complex the library is, what support it has for data that's not a simple object like a cube or sphere, and how well-designed it is. That last link has major criticisms of the 3D side of the library - severe enough I am not sure it's worthwhile in its current state at the time of writing for a non-trivial 3D app.
Is it worth trying to write a new visualisation window in our VCL app using FireMonkey?
2. GLScene
GLScene is a well-known 3D OpenGL framework for Delphi. I have never used it myself so have no experience about how it works or is designed. However, I believe it integrates well into VCL windows and supports shaders and modern GPUs. I do not know how its scene graph or navigation work or how well dynamic data can be implemented.
Its feature list specifically mentions some things I'm interested in, such as easy rotation/movement, procedural objects (implying dynamic data is easy to implement), and helper functions for picking. It seems shaders are Cg only (not GLSL or another non-vendor-specific language.) It also supports "polymorphic image support for texturing (allows many formats as well as procedural textures), easily extendable" - that may just mean many image formats, or it may indicate something where the texture can be dynamically changed, such as for dynamic colour mapping.
Where to from here?
These are the only two major 3D libraries I know of for Delphi or C++Builder. Have I missed any? Are there pros and cons I'm not aware of? Do you have any experience using either of these for this kind of purpose, and what pitfalls should we be wary of or features should we know about and use?
We currently use Embarcadero RAD Studio 2010 and most of our software is written in C++. We have small amounts of Delphi and may consider upgrading IDEs, but we are most likely to wait until the 64-bit C++ compiler is released. For that reason, a library that works in RS2010 might be best.
Thanks for your input :) I'm after high-quality answers, so I'll add a bounty when I can!
I have used GLScene in my 3D geomapping software, and although it's not used to the extent you're looking for, I can vouch that it seems the most appropriate for what you're trying to do.
GLScene supports terrain rendering and adding customizable objects to the scene. Objects can be interacted with and you can create complex 3D models of objects using the various building blocks of GLScene.
Unfortunately I cannot state how it will work with millions of points, but I do know that it is quite optimized and performs great on minimal hardware - that being said - the target PC I found required a dedicated graphics card capable of using OpenGL 2.1 extensions or higher (I found small issues with integrated graphics cards).
The other library I looked at was DXScene, which appears quite similar to GLScene albeit using DirectX instead of OpenGL. From memory this was a commercial product, whereas GLScene was licensed under GPL. (EDIT - the page seems to be down at the moment: http://www.ksdev.com/index.html)
GLScene is still in active development and provides a fairly comprehensive library of functions, base objects and texturing etc. Things like rotation, translation, pitch, roll, turn, ray casting - to name a few - are all provided for you. Visibility culling is provided for each base object as well as viewing cameras, lighting and meshes. Base objects include cubes, spheres, pipes, tetrahedrons, cones, terrain, grids, 3d text, arrows to name a few.
Objects can be picked with the mouse and moved along 1, 2 or 3 axes. Helper functions are included to automatically calculate the top-most object under the mouse. Complex 3D shapes can be built by attaching base objects to other base objects in a hierarchical manner. So, for example, a car could be built using a rectangle as the base object and attaching four cylinders to it for the wheels. You can then manipulate the 'car' as a whole, since the four cylinders are attached to the base rectangle.
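The hierarchical idea is easier to see in a few lines of code. This is a generic Python sketch of parent/child objects, not GLScene's API: the class and method names are invented, and it handles translation only (a real scene graph also composes rotation and scale).

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


class Node:
    """A scene object whose position is stored relative to its parent."""

    def __init__(self, name: str, local: Vec3 = (0.0, 0.0, 0.0)) -> None:
        self.name = name
        self.local = local                      # offset from the parent
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    def world(self) -> Vec3:
        # World position = parent's world position + own local offset.
        if self.parent is None:
            return self.local
        px, py, pz = self.parent.world()
        x, y, z = self.local
        return (px + x, py + y, pz + z)
```

So if a 'car' node has a wheel attached at local (1.0, -0.5, 1.5) and you move the car to (10.0, 0.0, 0.0), the wheel's world position becomes (11.0, -0.5, 1.5) without the wheel itself being touched.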
The only downside I could bring to your attention is the sometimes limited help/support available to you. Yes, there is a reference manual and a number of demo applications to show you how to do things such as select objects and move them around, however the reference manual is not complete and there is potential to get 'stuck' on how to accomplish a certain task. Forum support is somewhat limited/sparse. If you have a sound knowledge of 3D basics and concepts I'm sure you could nut it out.
As for FireMonkey - I have had no experience with this so I can't comment. I believe this is more targeted at mobile applications with lower hardware requirements so you may have issues with larger data sets.
Here are some other links that you may consider - I have no experience with them:
http://www.truevision3d.com/
http://www.3impact.com/
Game Development in Delphi
The last one is targeted at game development - but may provide useful information.
Have you tried glData? http://gldata.sourceforge.net/
It is old (~2004, Delphi 7), and I have not personally used the library, but some of the output is amazing.
You can use GLScene or plain OpenGL; both are good for 3D rendering and very easy to use.
Since you are already using georeferenced data, maybe you should consider embedding GoogleEarth in your Delphi application like this? Then you can add data to it as points, paths, or objects.

Automated Webcam Application / Hardware Problems

I am starting to develop an automated webcam application. The goal is to automatically take pictures, do some image processing and then upload the results to a FTP site. All of these tasks seem simple.
However, I am having a hard time finding a decent camera. I don't want to use a simple webcam or HD webcam because the image quality of still frames isn't very good.
I'm also having a hard time finding an affordable digital camera supporting USB snapshot or control.
My second concern is the development itself. I'm not quite sure which programming language to use. I have experience with AS3, Processing, Java and some simple C++ and OpenCV.
Do you have a clue?
Regarding the camera, there are pretty good webcams to be found, some with HD quality. Look at Logitech's cameras (I tested their API and it is quite good); an HD camera retails for $99, which is very cheap. If you are looking for something better, I would go with Nikon, as they also have a pretty good API for C#/C++. You can get a basic SLR with a simple 28mm lens for $500. Don't use a PowerShot, as Canon stopped supporting its API. Whatever camera you decide to buy, make sure a proper API is available, maintained, and free.
Regarding development, I would go with C#/Java as they are easier than C++. There are quite a lot of image-processing libraries for C#/Java; just make sure that the camera comes with an API that fits your chosen language.
Good luck.
Generally (from experience), most USB cameras that show up as an imaging device in Windows can be used with JAI [Java Advanced Imaging]. Additionally [on the .NET/C++ side], the same cameras can be used through DirectShow as a capture device. Java/C# will make development easier, but expect to lose some performance [even with the best of optimizations]. Additionally, you can only go as fast as the camera and the data line running from the camera to the computer allow [USB 1.0 will seriously limit the frame rate].
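To put a rough number on the USB limit, here is a back-of-the-envelope calculation in Python. It assumes uncompressed frames and uses the nominal signalling rates (12 Mbit/s for USB 1.x full speed, 480 Mbit/s for USB 2.0 high speed); real throughput is lower because of protocol overhead, and most webcams compress their output, so treat the results as upper bounds.

```python
def max_uncompressed_fps(width: int, height: int, bits_per_pixel: int,
                         link_mbit_per_s: float) -> float:
    """Upper bound on frames per second for raw video over a given link rate."""
    bits_per_frame = width * height * bits_per_pixel
    return link_mbit_per_s * 1_000_000 / bits_per_frame


# 640x480 at 24 bits per pixel is 7,372,800 bits per frame.
usb1_fps = max_uncompressed_fps(640, 480, 24, 12)    # about 1.6 frames per second
usb2_fps = max_uncompressed_fps(640, 480, 24, 480)   # about 65 frames per second
```

In other words, even plain VGA video cannot reach 2 frames per second uncompressed over the old bus.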
First, get the image into RAM:
If you are using CHDK, I suggest copying the image from camera memory to RAM using one of the scripting languages CHDK supports. The CHDK forum can help with this: http://chdk.setepontos.com/index.php
Or, if that's difficult, you can continuously copy the images to the hard disk and load them into RAM from there. (You need to take care to delete the massive number of images that accumulate on the hard disk in a short period of time!)
This sounds like a 'brute force' approach, but it will get your work going while you are researching the correct approach.
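For the clean-up part of the disk-based approach, something along these lines would do. It is a Python sketch that keeps only the newest few image files in a folder and deletes the rest; the function name and the default .jpg suffix are assumptions for the example.

```python
import os
from typing import List


def prune_old_images(folder: str, keep: int, suffix: str = ".jpg") -> List[str]:
    """Delete all but the `keep` newest files ending in `suffix`; return deleted paths."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    paths = [
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(suffix)
    ]
    paths = [p for p in paths if os.path.isfile(p)]
    paths.sort(key=os.path.getmtime, reverse=True)  # newest first
    deleted = []
    for path in paths[keep:]:
        os.remove(path)
        deleted.append(path)
    return deleted
```

Call it after each capture (or on a timer) with a small keep value so the folder never grows beyond a handful of files.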
Then perform the image processing:
Once the image is in RAM, you can apply your image processing algorithms as usual, e.g. using the OpenCV library.
Hope this helps.
