Processing large video files in Azure Function

Processing large video files in Azure Function - stream

I am trying to put a watermark in the video files (any size) through an Azure fucntion(C#). For the same, I am downloading the video into a Stream from an external source and then splitting it into frames and applying a watermark in each frame. And then merging it back. For the same, I am using FFMeg & OpenCV.
While splitting the file, I have to get the whole file and then process it frame by frame. For smaller files, it has no issues, for large files (like 2+ GB) it's going to impact the memory.
Any suggestion on the same, to do it in a better way?
With all other steps, I am using the durable function to process the same with 5-6 small Functions.

Related

Dask looping overhead from libraries

When calling another libary to dask such as scikit image contrast stretch, I realise that dask is creating a result for each block, storing in either memory or spilling to disk seperately. Then it attempts to merge all the results. Thats fine if your on a cluster or on a single computer and the dataset for the array is small, everything is fairly controlled. The problems start to happen when you work with data sets that are much larger than your RAM or disk. Is there a way to mitigate this or use the zarr file format to save to updating values as you go along? May be thats too fanciful. Any other ideas bar buy more ram would be helpful.
edit
I was looking at the documentation on dask and the suggestions on chunk sizes for dask, is something like about 100MB. I ended up reducing significantly from this amount to 30-70MB depending on file size. I then ran a contrast stretch (not from a library but with numpy unfunc and I didnt have any issue! In fact i played with the way the compuation is done. Since I start with a uint8 3dim array, when multiplying by the ratio for contrast stretch I am inevitably increasing the array chunk to a float64 array. Which takes up significant memory and computation. So what I have been do is treating the da.array as np.asarray(float64) but only prior to the multiplication by a float number. Then returning to a uint8 to finish the computation. The stretch time has reduced to just under 5 mins for a 20GB file. So I think thats a positive step. Just means image processing without libraries, I will, have a look at rechunker though.
The image processing pipeline i am building is to inevitable be used for a merged dataset of about 250-300GB (definitely outside the limits of my laptop). I also dotn have time to get to grips with cloud or parralell processing in the cloud. Thats for a few months down the line. Right now its trying to get through this analysis.

Yes, you can do the kind of thing you are talking about. I encourage you to check out the rechunker project, which is specialied around changing the layout of the data in zarr storage, but shows the idea of how to save temporary intermediated for the purpose of mitigating memory and communication issues.

What are the scenario that makes us compress data before we transfer it?

I am wondering the reason why we need to apply file compression before we upload files to server under some scenarios. For my understanding, as soon as the server received the compressed files, the compressed file need to be extracted to allow the server read the file content. It certainly consumes the computation power of the server if multiple Http POSTs are sent from many client side platforms.
Therefore, as far as I can think of the scenario of sending the compressed file is uploading the backup files, setting files, files that only servers as back up for the client side platforms. Please give me more scenarios for uploading compressed data.

I think the following article gives an perfect explanation to the question:http://www.dataexpedition.com/support/notes/tn0014.html
Here's the content:
Compression Pros & Cons
Simply put, compression is a process which trades CPU cycles for bytes. But the trade isn't always a good one. Sometimes you can spend a lot of valuable CPU cycles for little or no gain.
In the context of network data transport, "Should I compress?" is a common question. But the answer can get complicated, depending on several factors. The most important thing to remember is that compression can actually make your data move much slower, so it should not be used without some consideration.
When Compression Is Good
Compression algorithms try to identify large repeating patterns in a data set and replace them with smaller patterns. Ideally, this shrinks the size of the data set. For the purposes of network transport, having less data to move means it should take less time to move it.
Documents and files which consist mostly of plain text or machine executable code tend to compress well. Examples include word processing documents, HTML files, some .exe files, and some database files.
Combining many small files into a single archive prior to network transfer can often result in faster speeds than transferring each file individually. This may be true even if the individual files themselves are not compressible. Many archiving utilities have options to pack files into an archive without compression, such as the "-0" option for "zip". ExpeDat will combine the contents of a folder into a single data stream when you enable Streaming Folders.
When Compression Is Bad
Many data types are not compressible, because the repeating patterns have already been removed. This includes most images, videos, songs, any data that is already compressed, or any data that has been encrypted.
Trying to compress data that is not compressible wastes CPU time. When you are trying to move data at high speeds, that CPU time may be critical to feeding the network. So by taking away processing time with worthless compression, you can actually end up moving your data much more slowly than if you had compression turned off.
If you are using a compression utility only for the purposes of combining many small files, check for options that disable compression. For example, the "zip" command has a "-0" option which packages files into an archive without spending time trying to compress them.
Inline versus Offline
Many transport mechanisms allow you to apply compression algorithms to data as its being transferred. This is convenient because the compression and decompression occur seamlessly without the user having to perform extra steps. But it is also risky because any CPU time spent on compression is time NOT being spent on feeding data through the network. If the network is very fast, the CPU is very slow, or the compression algorithm is unable to scale, having inline compression turned on may cause your data to move more slowly than if you turn compression off. Inline compression can be slower than no compression even when the data is compressible!
If you are going to be transferring the same data set multiple times, it pays to compress it first using Zip or Tar-Gzip. Then you can transfer the compressed archive without taking CPU cycles away from the network processing. If you are planning to encrypt your data, make sure you compress it first, then encrypt second.
Hidden Compression
Devices in your network may be applying compression without you realizing it. This becomes evident if the "speed" of the network seems to change for different data types. If the network seems slow when you are transferring data that is already compressed, but fast when you are transferring uncompressed text files, then you can be pretty sure that something out there is making compression decisions for you.
Network compression devices can be helpful in that they take the compression burden away from the end-point CPUs. But they can also create very inconsistent results since they will not work for all destinations and data types. Network level compression can also run into the same CPU trade-offs discussed above, resulting in some files moving more slowly than they would if there was no compression.
If you are testing the speed of your network, try using data that is already compressed or encrypted to ensure consistent results.
Should I Turn On Inline Compression?
For compressed data, images, audio, video, or encrypted files: No.
For other types of data, test it both ways to see which is faster.
If the network is very fast (hundreds of megabits per second or faster), consider turning off inline compression and instead compress the data before you move it.

Is it possible to resume 7zip compression?

My application regularly upload large files. Regardless of their size, all files are compressed before uploaded to server.
Part of this project requirements is to resume nicely after crash/power failure, so right now compression is done this way:
large-file.bin sliced in N slices
Compress each slice & upload it
In case of crash, I pickup from the last slice.
To optimize upload speed, I'm currently looking into sending the whole file (uploads are resumed if failed) instead of sending slices one by one, so I'm looking into compressing the whole file instead of compressing each slice.
I'm currently using 7z.dll. I wonder if it's possible, in case of power failure, to tell 7z to resume compression.
I know I could always implement my own compression routine and implement such feature, but before going that road I wonder if it's possible to do that in 7z (which already have an excellent compression ratio)

As far as I know, no compression algorithm supports that. You will likely have to recompress the source file from the beginning every time, discarding any output bytes until you reach the desired resume position, and then you can send the remaining output bytes from that point on.

What is the fastest way for reading huge files in Delphi?

My program needs to read chunks from a huge binary file with random access. I have got a list of offsets and lengths which may have several thousand entries. The user selects an entry and the program seeks to the offset and reads length bytes.
The program internally uses a TMemoryStream to store and process the chunks read from the file. Reading the data is done via a TFileStream like this:
FileStream.Position := Offset;
MemoryStream.CopyFrom(FileStream, Size);
This works fine but unfortunately it becomes increasingly slower as the files get larger. The file size starts at a few megabytes but frequently reaches several tens of gigabytes. The chunks read are around 100 kbytes in size.
The file's content is only read by my program. It is the only program accessing the file at the time. Also the files are stored locally so this is not a network issue.
I am using Delphi 2007 on a Windows XP box.
What can I do to speed up this file access?
edit:
The file access is slow for large files, regardless of which part of the file is being read.
The program usually does not read the file sequentially. The order of the chunks is user driven and cannot be predicted.
It is always slower to read a chunk from a large file than to read an equally large chunk from a small file.
I am talking about the performance for reading a chunk from the file, not about the overall time it takes to process a whole file. The latter would obviously take longer for larger files, but that's not the issue here.
I need to apologize to everybody: After I implemented file access using a memory mapped file as suggested it turned out that it did not make much of a difference. But it also turned out after I added some more timing code that it is not the file access that slows down the program. The file access takes actually nearly constant time regardless of the file size. Some part of the user interface (which I have yet to identify) seems to have a performance problem with large amounts of data and somehow I failed to see the difference when I first timed the processes.
I am sorry for being sloppy in identifying the bottleneck.

If you open help topic for CreateFile() WinAPI function, you will find interesting flags there such as FILE_FLAG_NO_BUFFERING and FILE_FLAG_RANDOM_ACCESS . You can play with them to gain some performance.
Next, copying the file data, even 100Kb in size, is an extra step which slows down operations. It is a good idea to use CreateFileMapping and MapViewOfFile functions to get the ready for use pointer to the data. This way you avoid copying and also possibly get certain performance benefits (but you need to measure speed carefully).

Maybe you can take this approach:
Sort the entries on max fileposition and then to the following:
Take the entries that only need the first X MB of the file (till a certain fileposition)
Read X MB from the file into a buffer (TMemorystream
Now read the entries from the buffer (maybe multithreaded)
Repeat this for all the entries.
In short: cache a part of the file and read all entries that fit into it (multhithreaded), then cache the next part etc.
Maybe you can gain speed if you just take your original approach, but sort the entries on position.

The stock TMemoryStream in Delphi is slow due to the way it allocates memory. The NexusDB company has TnxMemoryStream which is much more efficient. There might be some free ones out there that work better.
The stock Delphi TFileStream is also not the most efficient component. Wayback in history Julian Bucknall published a component named BufferedFileStream in a magazine or somewhere that worked with file streams very efficiently.
Good luck.

What is the fastest way of loading and re-sizing an image?

I need to display thumbnails of images in a given directory. I use TFileStream to read the image file before loading the image into an image component. The bitmap is then resized to the thumbnail size, and assigned to a TImage component on a TScrollBox.
It seems to work ok, but slows down quite a lot with larger images.
Is there a faster way of loading (image) files from disk and resizing them?
Thanks, Pieter

Not really. What you can do is resize them in a background thread, and use a "place holder" image until the resizing is done. I would then save these resized images to some sort of cache file for later processing (windows does this, and calls the cache thumbs.db in the current directory).
You have several options on the thread architecture itself. A single thread that does all images, or a thread pool where a thread only knows how to process a single image. The AsyncCalls library is even another way and can keep things fairly simple.

I'll complement the answer by skamradt with an attempt to design this for being as fast as possible. For this you should
optimize I/O
use multiple threads to make use of multiple CPU cores, and to keep even a single CPU core working while you read (or write) files
The use of multiple threads implies that using VCL classes for the resizing isn't going to work, as the VCL isn't thread-safe, and all hacks around that don't scale well. efg's Computer Lab has links for image processing code.
It's important to not cause several concurrent I/O operations when using multiple threads. If you choose to write the thumbnail images back to files, then once you have started reading a file you should read it completely, and once you have started writing a file you should also write it completely. Interleaving both operations will kill your I/O, because you potentially cause a lot of seeking operations of the hard disc head.
For best results the reading (and writing) of files should also not happen in the main (GUI) thread of your application. That would suggest the following design:
Have one thread read files into TGraphic objects, and put these into a thread-safe list.
Have a thread pool wait on the list of files in original size, and have one thread process one TGraphic object, resize it into another TGraphic object, and add this to another thread-safe list.
Notify the GUI thread for each thumbnail image added to the list, so it can be displayed.
If thumbnails are to be written to file, do this in the reading thread as well (see above for an explanation).
Edit:
On re-reading your question I notice that you maybe only need to resize one image, in which case a single background thread is of course enough. I'll leave my answer in place anyway, maybe it will be of use to someone else some time. It's what I learned from one of my latest projects, where the final program could have needed a little more speed but was only using about 75% of the quad core machine at peak times. Decoupling I/O from processing would have made the difference.

I often use TJPEGImage with Scale:=jsEighth (in Delphi 7). This is really fast because the JPEG de-compression can skip a lot of the data to fill a bitmap of only an eighth of width and height.
Another option is to use the shell's method to extract a thumbnail, which is pretty fast as well

I'm in the vision business, and I simply upload the images to the GPU using OpenGL. (typically 20x 2048x2000x8bpp per second), a bmp per texture, and let the videocard scale (win32, Mike Lischke's opengl headers)
Upload of such an image costs 5-10ms depending on exact videocard (if not integrated and nvidia 7300 series or newer. Very recent integrated GPUs might be doable also). Scaling and displaying costs 300us. Which means customers can pan and zoom like crazy without touching the app. I draw an overlay (which used to be a tmetafile but is now an own format) on top of it.
My biggest picture is 4096x7000x8bpp which shows and scales in under 30ms. (GF 8600)
A limitation of this technology is max texture size. It can be resolved by fragmenting the picture into multiple textures, but I haven't bothered yet because I deliver the systems with the software.
(some typical sizes:
nv6x00 series: 2k*2k but uploading is just about break even compared to GDI
nv7x00 series: 4k*4k For me the baseline cards. GF7300's are like $20-40
nv8x00 series: 8k*8k
)
Note that this might not be for everybody. But if you are in the lucky situation to specify hardware limits, it might work. The main problem are laptops like Thinkpads, the GPUs of which are older than the avg laptop, which are in turn often a generation behind Desktops.
I chose OpenGL over DirectX because it is more static in time, and easier to find non-game related examples.

Try to look at the Graphics32 library : it's very good at drawing things and works great with Bitmaps. They are Thread - Safe with good example, and it's totally free.

Exploit windows capacity to create thumbnails. Remember that hidden Thumbs.db files in folders that contain images?
I have implemented something like this feature but in VB. My software is able to build thumbnails of 100 files (mixed size) in around 10 seconds.
I am not able to convert it to Delphi though.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart