What is the fastest way of loading and re-sizing an image? - delphi

I need to display thumbnails of images in a given directory. I use TFileStream to read the image file before loading the image into an image component. The bitmap is then resized to the thumbnail size, and assigned to a TImage component on a TScrollBox.
It seems to work ok, but slows down quite a lot with larger images.
Is there a faster way of loading (image) files from disk and resizing them?
Thanks, Pieter

Not really. What you can do is resize them in a background thread, and use a "place holder" image until the resizing is done. I would then save these resized images to some sort of cache file for later processing (windows does this, and calls the cache thumbs.db in the current directory).
You have several options on the thread architecture itself. A single thread that does all images, or a thread pool where a thread only knows how to process a single image. The AsyncCalls library is even another way and can keep things fairly simple.

I'll complement the answer by skamradt with an attempt to design this for being as fast as possible. For this you should
optimize I/O
use multiple threads to make use of multiple CPU cores, and to keep even a single CPU core working while you read (or write) files
The use of multiple threads implies that using VCL classes for the resizing isn't going to work, as the VCL isn't thread-safe, and all hacks around that don't scale well. efg's Computer Lab has links for image processing code.
It's important to not cause several concurrent I/O operations when using multiple threads. If you choose to write the thumbnail images back to files, then once you have started reading a file you should read it completely, and once you have started writing a file you should also write it completely. Interleaving both operations will kill your I/O, because you potentially cause a lot of seeking operations of the hard disc head.
For best results the reading (and writing) of files should also not happen in the main (GUI) thread of your application. That would suggest the following design:
Have one thread read files into TGraphic objects, and put these into a thread-safe list.
Have a thread pool wait on the list of files in original size, and have one thread process one TGraphic object, resize it into another TGraphic object, and add this to another thread-safe list.
Notify the GUI thread for each thumbnail image added to the list, so it can be displayed.
If thumbnails are to be written to file, do this in the reading thread as well (see above for an explanation).
Edit:
On re-reading your question I notice that you maybe only need to resize one image, in which case a single background thread is of course enough. I'll leave my answer in place anyway, maybe it will be of use to someone else some time. It's what I learned from one of my latest projects, where the final program could have needed a little more speed but was only using about 75% of the quad core machine at peak times. Decoupling I/O from processing would have made the difference.

I often use TJPEGImage with Scale:=jsEighth (in Delphi 7). This is really fast because the JPEG de-compression can skip a lot of the data to fill a bitmap of only an eighth of width and height.
Another option is to use the shell's method to extract a thumbnail, which is pretty fast as well

I'm in the vision business, and I simply upload the images to the GPU using OpenGL. (typically 20x 2048x2000x8bpp per second), a bmp per texture, and let the videocard scale (win32, Mike Lischke's opengl headers)
Upload of such an image costs 5-10ms depending on exact videocard (if not integrated and nvidia 7300 series or newer. Very recent integrated GPUs might be doable also). Scaling and displaying costs 300us. Which means customers can pan and zoom like crazy without touching the app. I draw an overlay (which used to be a tmetafile but is now an own format) on top of it.
My biggest picture is 4096x7000x8bpp which shows and scales in under 30ms. (GF 8600)
A limitation of this technology is max texture size. It can be resolved by fragmenting the picture into multiple textures, but I haven't bothered yet because I deliver the systems with the software.
(some typical sizes:
nv6x00 series: 2k*2k but uploading is just about break even compared to GDI
nv7x00 series: 4k*4k For me the baseline cards. GF7300's are like $20-40
nv8x00 series: 8k*8k
)
Note that this might not be for everybody. But if you are in the lucky situation to specify hardware limits, it might work. The main problem are laptops like Thinkpads, the GPUs of which are older than the avg laptop, which are in turn often a generation behind Desktops.
I chose OpenGL over DirectX because it is more static in time, and easier to find non-game related examples.

Try to look at the Graphics32 library : it's very good at drawing things and works great with Bitmaps. They are Thread - Safe with good example, and it's totally free.

Exploit windows capacity to create thumbnails. Remember that hidden Thumbs.db files in folders that contain images?
I have implemented something like this feature but in VB. My software is able to build thumbnails of 100 files (mixed size) in around 10 seconds.
I am not able to convert it to Delphi though.

Related

Why is saving a ID3D11Texture2D to file so painfully slow and what to do about it

I'm having some code (compiled into a standalone executable) to take a screenshot. It runs pretty fast (around 100ms), however, as soon as I run something graphic intensive (like a game), it takes up to 2 minutes, even though CPU and GPU are not fully exhausted. Specifically, I use IDXGIOutputDuplication to grab the screen, and then some code I have "borrowed" from here https://github.com/microsoft/DirectXTex/blob/main/ScreenGrab/ScreenGrab11.cpp to save the screen as a picture to file. Specifically the "encoding" part, i.e., use of the encoder seems to be problem. Any ideas about (1) where this massive performance impact is coming from and (2) how to improve it?
Full code here: https://github.com/plengauer/DXGIOutputDuplication

What does libvips VIPS_DISC_THRESHOLD default=100 mean?

Does it mean that it will take 100MB (Open via Disk)?
Or it mean that it will take 100MB (Open via Memory)?
That's the threshold at which libvips will flip from open-via-memory to open-via-disc.
For small images (100mb when decompressed in this case), libvips will decompress to memory then process from there. This is obviously not a good idea for large images, so for these libvips will decompress to a temporary disc file, then map that area of disc into virtual memory and use that as the pixel source.
tldr: set VIPS_DISC_THRESHOLD to a small number to prefer the use of disc, set it to a large number to prefer RAM.
There's a chapter in the libvips docs which goes into a lot more detail:
https://www.libvips.org/API/current/How-it-opens-files.md.html
To very quickly summarize:
libvips has at least four ways of opening images and tries hard to pick the best one for you automatically.
Sometimes it'll need a bit of help to hit the best path for your use case and you have three main ways of influencing this.
You can hint the access pattern you expect for this image with the access= parameter, you can set the threshold at which it'll flip between preferring memory and preferring disc, and you can say where you'd like disc temporaries to be held.

Dask looping overhead from libraries

When calling another libary to dask such as scikit image contrast stretch, I realise that dask is creating a result for each block, storing in either memory or spilling to disk seperately. Then it attempts to merge all the results. Thats fine if your on a cluster or on a single computer and the dataset for the array is small, everything is fairly controlled. The problems start to happen when you work with data sets that are much larger than your RAM or disk. Is there a way to mitigate this or use the zarr file format to save to updating values as you go along? May be thats too fanciful. Any other ideas bar buy more ram would be helpful.
edit
I was looking at the documentation on dask and the suggestions on chunk sizes for dask, is something like about 100MB. I ended up reducing significantly from this amount to 30-70MB depending on file size. I then ran a contrast stretch (not from a library but with numpy unfunc and I didnt have any issue! In fact i played with the way the compuation is done. Since I start with a uint8 3dim array, when multiplying by the ratio for contrast stretch I am inevitably increasing the array chunk to a float64 array. Which takes up significant memory and computation. So what I have been do is treating the da.array as np.asarray(float64) but only prior to the multiplication by a float number. Then returning to a uint8 to finish the computation. The stretch time has reduced to just under 5 mins for a 20GB file. So I think thats a positive step. Just means image processing without libraries, I will, have a look at rechunker though.
The image processing pipeline i am building is to inevitable be used for a merged dataset of about 250-300GB (definitely outside the limits of my laptop). I also dotn have time to get to grips with cloud or parralell processing in the cloud. Thats for a few months down the line. Right now its trying to get through this analysis.
Yes, you can do the kind of thing you are talking about. I encourage you to check out the rechunker project, which is specialied around changing the layout of the data in zarr storage, but shows the idea of how to save temporary intermediated for the purpose of mitigating memory and communication issues.

Should I programly put computation-heavy tasks on a separate thread on IOS to utilize multi-core?

I am making a real-time image processing app on IOS with my team. I am handling the custom computation kernel (mostly on CPU rather than GPU) and my teammates deals with the GUI. When I tested my kernel on a toy app, the core (ignoring any IO overhead ) runs steadily at 100ms per image. However, when put into the full-functioning one, it is slowed down to 500ms per image.
I have checked that the data is pretty much the same and I am only measuring time consumed within the kernel, on the same iphone6. There are hardly any other computation in the full-functioning app so I am not sure what is pulling behind. Though GPU-processing is definitely an alternative and I am working on it, I would like to know if there is any tricks to use for now.
Currently, there is no explicit multi-threading in the computation part, so my simple guess is: should I programly put the computation part on a separate thread so the second core can be utilized?
[Update]
It turns out that I made some mistakes in packing my code as library, as the copying over the source code works out nicely. I have not figured out my problem yet and am going to post it on a separate question.
GPU Acceleration
This massively depends on the tasks you're performing, the GPU is good a specific subset of tasks and simply utilising it can sometimes even slow things down. Check this out
A lot of image based tasks that are part of the Quartz framework e.t.c are GPU accelerated (like blurring). Also if you use a library like OpenCV you get GPU acceleration on certain tasks out the box.
Unless you're a real pro I would avoid using the GPU specifically and let the frameworks and libraries you use do that for you.
Concurrency
It will certainly help to put intensive tasks on a background thread. Just be aware of what it entails (i.e. you can't make any UIKit calls from a background thread.
The answer heavily depends on how you do the processing. Some methods in the SDK perform their job in a background thread, while others require the caller to create and use one.
In general, in the case of drawing, most methods require you to create one explicitly. This is important especially for the ones that perform their work on the CPU (e.g. using CoreGraphics to draw within a drawRect method). If you're using methods that use GPU for the processing, then creating threads won't be much of use since CPU won't be the cause of the bottleneck.
If you want to determine why your app slows down, use Instruments. (Time Profiler for CPU and Core Animation for drawing)

Hunting down EOutOfResources

Question:
Is there an easy way to get a list of types of resources that leak in a running application? IOW by connecting to an application ?
I know memproof can do it, but it slows down so much that the application won't even last a minute. Most taskmanager likes can show the number, but not the type.
It is not a problem that the check itself is catastrophic (halts the app process), since I can check with a taskmgr if I'm getting close (or at least I hope)
Any other insights on resource leak hunting (so not memory) is also welcomed.
Background:
I've an Delphi 7/2006/2009 app (compiles with all three) and after about a few week it starts acting funny. However only on one of the places it runs, on several other systems it runs till the power goes out.
I've tried to put in some debug code to narrow the problem down. and found out that the exception is EOutofResources on a save of a file. (the file save can happen thousands of times a day).
I have tried to reason out memory leaks (with fastmm), but since the dataflow is quite high (60MByte/s from gigabit industrial camera's), I can only rule out "creeping" memory leaks with fastmm, not quick flashes of memoryleaks that exhaust memory around the time it happens. If something goes wrong, the app fills memory in under half a minute.,
Main suspects are filehandles that are somehow left on some error and TMetafiles (which are streamed to these files). Minor suspects are VST, popupmenu and tframes
Updates:
Another possible tip: It ran fine for two years with D7, and now the problems are with Turbo Explorer (which I use for stable projects not converted to D2009 ).
Paul-Jan: Since it only happens once a week (and that can happen at night), information acquisition is slow. Which is why I ask this question, need to combine stuff for when I'm there thursday. In short: no I don't know 100% sure. I intend to bring the entire Systemtools collection to see if I can find something (because then it will be running for days). There is also a chance that I see open files. (maybe should try to find some mingw lsof and schedule it)
But the app sees very little GUI action (it is an machine vision inspection app), except screen refresh +/- 15/s which is tbitmap stretchdraw + tmetafile, but I get this error when saving to disk (TFileStream) handles are probably really exhausted. However in the same stream, TMetafile is also savetostreamed, something which later apps don't have anymore, and they can run from months.
------------------- UPDATE
I've searched and searched and searched, and managed to reproduce the problems in-vitro two or three times. The problems happened when memusage was +/- 256MB (the systems have 2GB), user objects 200, gdi objects 500, not one file more open than expected ).
This is not really exceptional. I do notice that I leak small amounts of handles, probably due to reparenting frames (something in the VCL seems to leak HPalette's), but I suspect the core cause is a different problem. I reuse TMetafile, and .clear it inbetween. I think clearing the metafile doesn't really (always?) resize the resource, eventually each metafile in the entire pool of tmetafile at maximum size, and with 20-40+ tmetafiles (which can be several 100ks each) this will hit the desktop heap limit.
That's theory, but I'll try to verify this by setting the desktop limit to 10MB at the customers, but it will be several weeks before I have confirmation if this changes anything. This theory also confirms why this machine is special (it's possible that this machine naturally has slightly larger metafiles on average). Occasionally freeing and recreating a tmetafile in the pool might also help.
Luckily all these problems (both tmetafile and reparenting) have already been designed out in newer generations of the apps.
Due to the special circumstances (and the fact that I have very limited test windows), this is going to be a while, but I decided to accept the desktop heap as an example for now (though the GDILeaks stuff was also somewhat useful).
Another thing that the audit revealed GDI-types usage in a thread (though only saving tmetafiles (that weren't used or connected otherwise) to streams.
------------- Update 2.
Increasing the desktop limit only seemed to minorly increase the time till the problem occurred.
Unfortunately, I won't be able to follow up on this further, since the machines were updated to a newer version of the framework that doesn't have the problem.
In summary I can only state what the three core modifications were going from the old to the new framework:
I no longer change screens by reparenting frames. I now work with forms that I hide and show. I changed this since I also had very rare crashes or exceptions (that could be clicked away) due to this. The crashes were all while operating the GUI though, not spontaneously like the main problem
The routine where the crash happened dealt with TMetafile. TMetafile has been designed out, and replace by a simpler own made format. (basically arrays with Opengl vertices)
Drawing no longer happened with tbitmap with a tmetafile overlay strechdrawn over it, but using OpenGL.
Of course it could be something else too, that got changed in the rewrite of the above parts, fixing some very nasty detail bug. It would have to be an extremely bad one, since I analysed the above system as much as I could.
Updated nov 2012 after some private mail discussion: In retrospect, the next step would have been adding a counter to the metafiles objects, and simply reinstantiate them every x * 1000 uses or so, and see if that changes anything. If you have similar problems, try to see if you can somewhat regularly destroy and reinitialize long living resources that are dynamically allocated.
There is a slim chance that the error is misleading. The VCL naively reports EOutOfResources if it is unable to obtain a DC for a window (see TWinControl.GetDeviceContext in Controls.pas).
I say "naively" because there are other reasons why GetDC() might return a NULL handle and the VCL should report the OS error, not assume an out of resources condition (there is a Windows version check required for this to be reliably possible, but the VCL could and should take of that too).
I had a situation where I was getting the EOutOfResources error as the result of a window handle becoming invalid. Once I'd discovered the true problem, finding the cause and fixing it was simple, but I wasted many, many hours trying to find a non-existent resource leak.
If possible I would examine the stack trace leading to this exception - if it is coming from TWinControl.GetDeviceContext then the problem may not be what you think (it's impossible to say what it might be of course, but eliminating the impossible is always the first step toward discovering the solution, no matter how improbable).
If they are GDI handle leaks you can have a look at MSDN Magazine January 2003 which uses the tool GDILeaks. Other tools are GDIObj or GDIView. Also see here.
Another source of EOutOfResources could be that the Desktop Heap is full. I've had that issue on busy terminal servers with large screens.
If there are lots of file handles you are leaking you could check out Process Explorer and have a look at the open file handles of your process and see any out of the ordinary. Or use WinDbg with the !htrace command.
I've run into this problem before. From what I've been able to tell, Delphi may throw an EOutOfResources any time the Windows API returns ERROR_NOT_ENOUGH_MEMORY, and (as the other answers here discuss) Windows may return ERROR_NOT_ENOUGH_MEMORY for a variety of conditions.
In my case, EOutOfResources was being caused by a TBitmap - in particular, TBitmap's call to CreateCompatibleBitmap, which it uses with its default PixelFormat of pfDevice. Apparently Windows may enforce fairly strict systemwide limits on the memory available for device-dependent bitmaps (see, e.g, this discussion), even if your system otherwise has plenty of memory and plenty of GDI resources. (These systemwide limits are apparently because Windows may allocate device-dependent bitmaps in the video card's memory.)
The solution is simply to use device-independent bitmaps (DIBs) instead (although these may not offer quite as good of a performance). To do this in Delphi, set TBitmap.PixelFormat to anything other than pfDevice. This KB article describes how to pick the optimal DIB format for a device, although I generally just use pf32Bit instead of trying to determine the optimal format for each of the monitors the application is displayed on.
Most of the times I saw EOutOfResources, it was some sort of handle leak.
Did you try something like MadExcept?
--jeroen
"I've tried to put in some debug code to narrow the problem down. and found out that the exception is EOutofResources on a save of a file. (the file save can happen thousands of times a day)."
I'm shooting in the dark here, but could it be that you're using the Windows API to (GetTempFileName) create a temp file and you're blowing out some file system indexes or forgetting to close a file handle?
Either way, I do agree that with your supposition about it being a file handle problem. That seems to be the most likely thing given your symptoms and diagnosis.
Also try to check handle count for the application with Process Explorer from SysInternals. Handle leaks can be very dangerous and they build slowly through time.
I am currently having this problem, in software that is clearly not leaking any handles in my own code, so if there are leaks they could be happening in a component's source code or the VCL sourcecode itself.
The handle count and GDI and user object counts are not increasing, nor is anything being created. Deltic's answer shows corner cases where the message is kind of a red-herring, and Allen suggests that even a file write can cause this error.
So far, The best strategy I have found for hunting them down is to use either JCL JCLDEBUG stack tracebacks, or the exception report save features in MadExcept to generate the context information to find out what is actually failing.
Secondly, AQTime contains many tools to help you, including a resource profiler that can keep the links between where the code that created the resources is, and how it was called, along with counts of the total numbers of handles. It can grab results MID RUN and so it is not limited to detecting unfreed resources after you exit. So, run AQTime, do a results capture in mid run, wait several hours, and capture again, and you should have two points in time to compare handle counts. Just in case it is the obvious thing. But as Deltics wisely points out, this exception class is raised in cases where it probably shouldn't have been.
I spent all of today chasing this issue down. I found plenty of helpful resources pointing me in the direction of GDI, with the fact that I'm using GDI+ to produce high-speed animations directly onto the main form via timer/invalidate/onpaint (animation performed in separate thread). I also have a panel in this form with some dynamically created controls for the user to make changes to the animation.
It was extremely random and spontaneous. It wouldn't break anywhere in my code, and when the error dialog appeared, the animation on the main form would continue to work. At one point, two of these errors popped up at the same time (as opposed to sequential).
I carefully observed my code and made sure I wasn't leaking any handles related to GDI. In fact, my entire application tends to keep less than 300 handles, according to Task Manager. Regardless, this error would randomly pop up. And it would always correspond with the simplest UI related action, such as just moving the mouse over a standard VCL control.
Solution
I believe I have solved it by changing the logic to performing the drawing within a custom control, rather than directly to the main form as I had been doing before. I think the fact that I was rapidly drawing on the same form canvas which shared other controls, somehow they interfered. Now that it has its own dedicated canvas to draw on, it seems to be perfectly fixed.
That is with about 1 hour of vigorous testing at least.
[Fingers crossed]

Resources