Why do my downloaded files sometimes end up filled with zeros at the end? - delphi

I'm developing a download manager using Indy and Delphi XE (the application uses multithreading to open several connections to the server). Everything works fine, but sometimes the final downloaded file is broken, and when I check the downloaded temp files I see that 2 or 3 of them are filled with zeros at the end. (Each temp file is the result of one connection's download.)
The larger the file is, the more broken temp files I get as the result.
For example, in one of the temp files, which was 65,536,000 bytes, only the range 0-34,359,426 was valid; from 34,359,427 to 65,535,999 it was full of zeros. If I delete those zeros, the application automatically downloads the missing segments, and the result (assuming the problem does not happen again) is a healthy file.
I want to get rid of those zeros at the end of the temp files without losing download speed.
P.S. I'm using a TFileStream, passing it directly to TIdHTTP and downloading the files with the GET method.
Additional info: I handle the OnWork event, which assigns AWorkCount to a public Int64 variable. Each time a file finishes downloading, that size (the Int64 variable) is logged to a text file, and according to the log the file was downloaded completely (including those zero bytes).
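A minimal sketch of the handler described above (the class and field names are hypothetical; TWorkMode and wmRead come from Indy's IdComponent unit):

procedure TDownloadThread.HttpWork(ASender: TObject; AWorkMode: TWorkMode;
  AWorkCount: Int64);
begin
  // AWorkCount is the number of bytes read so far for this connection
  if AWorkMode = wmRead then
    FDownloadedBytes := AWorkCount; // logged to the text file when the download ends
end;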

Make sure the server actually supports byte ranges before you request a range to download. If the server does not support ranges, the requested range will be ignored and the entire file will be sent instead. If you are not already doing so, use TIdHTTP.Head() to test for range support before calling TIdHTTP.Get(). You need to do that anyway to detect whether the remote file has been altered since the last time you downloaded it; any decent download manager needs to handle things like that.
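A minimal sketch of that check with a stock Indy 10 TIdHTTP; the URL and the range value in the usage comment are just placeholders:

uses
  SysUtils, IdHTTP;

// Probe the server with HEAD before issuing a ranged GET.
function ServerSupportsRanges(AHttp: TIdHTTP; const AUrl: string): Boolean;
begin
  AHttp.Head(AUrl);
  // Servers that support ranges normally answer with "Accept-Ranges: bytes".
  Result := SameText(AHttp.Response.RawHeaders.Values['Accept-Ranges'], 'bytes');
end;

// Hypothetical usage:
//   if ServerSupportsRanges(IdHTTP1, Url) then
//     IdHTTP1.Request.CustomHeaders.Values['Range'] := 'bytes=0-1048575';
//   IdHTTP1.Get(Url, DestStream);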
Also keep in mind that if TIdHTTP knows up front how many bytes are being transferred, it pre-allocates the size of the destination TStream before downloading data into it. This speeds up the transfer and optimizes disk I/O when using a TFileStream. For that reason you should NOT use TFileStream to access the same file as the destination for multiple simultaneous downloads, even if they are writing to different areas of the file; the pre-allocations will likely trample over each other as each one tries to set the file to a different size. If you need to download a file in multiple pieces simultaneously, then either:
1) download each piece to a separate file and copy the pieces into the final file once you have all of the pieces that you need (a minimal sketch of this approach follows the list), or
2) use a custom TStream class, or Indy's TIdEventStream class, to manage the file I/O yourself, so you can ignore TIdHTTP's pre-allocation attempts and ensure that multiple file I/O operations do not overlap each other incorrectly.
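A minimal sketch of option 1, assuming the piece files already exist on disk and are passed in download order (the routine name and parameters are hypothetical):

uses
  Classes, SysUtils;

// Stitch the downloaded piece files together into the final file.
procedure MergePieces(const PieceFiles: array of string; const FinalFile: string);
var
  Dest, Src: TFileStream;
  I: Integer;
begin
  Dest := TFileStream.Create(FinalFile, fmCreate);
  try
    for I := Low(PieceFiles) to High(PieceFiles) do
    begin
      Src := TFileStream.Create(PieceFiles[I], fmOpenRead or fmShareDenyWrite);
      try
        Dest.CopyFrom(Src, 0); // Count = 0 copies the entire source stream
      finally
        Src.Free;
      end;
    end;
  finally
    Dest.Free;
  end;
end;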

Related

Calculate the checksum for a file before it has been downloaded?

Is it possible to calculate a file's checksum without possessing the file?
Background
I'm interested in creating some software that would be used to download external files. I must be careful, because the files can be altered by the file owners.
I would like to keep a list of checksum values inside the software, to allow the software to validate that the external file is what it claims to be.
I believe this is easily possible once the external file is stored locally (i.e. after it has been downloaded), but I would ideally like to calculate the checksum for the file before downloading. Is this possible?
Essentially I'm trying to get a file's checksum without actually possessing the file. I think that sounds impossible, but I'm new to checksums and may be missing obvious techniques.
I'm not an expert, but here's an idea.
Make your file owners' clients (or the server that receives the files) upload the checksums (and any other metadata you need) of the files separately (as a different file, or a database entry). Then your software can download the small checksum first and use it to verify the bigger file once that has been downloaded.
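A minimal sketch of the verification step, assuming the expected MD5 has already been fetched separately and that Indy's hash classes are available (the function name is hypothetical):

uses
  Classes, SysUtils, IdHashMessageDigest;

// Compare a downloaded file against the checksum that was published for it.
function FileMatchesChecksum(const AFileName, AExpectedMD5: string): Boolean;
var
  Hash: TIdHashMessageDigest5;
  FS: TFileStream;
begin
  Hash := TIdHashMessageDigest5.Create;
  try
    FS := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
    try
      Result := SameText(Hash.HashStreamAsHex(FS), AExpectedMD5);
    finally
      FS.Free;
    end;
  finally
    Hash.Free;
  end;
end;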

File operations conflicts

I'm writing a program which continuously looks for new files in a directory. After it extracts data from each file and does some processing on it, the files are moved to another directory containing all scanned files.
Imagine I'm copying a new file into the scanned directory while my program is running. Can a file which has not finished copying be processed (and then produce unforeseen results), or is it locked by the system?
Now imagine two instances of the program are running on two different computers, continuously scanning the same folder. What can happen if both instances try to move the same file?
Thank you for your help.
I have a project that does much the same thing. Another application is receiving data from a feed and writing files to a folder. My application is processing those files by opening them, acting on them in some way, writing them to another folder, then deleting them.
The strategy I used in the application that does the processing and deleting is to simply open them like this:
TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
If the file being opened is still being written by another process, the call above will fail, and the file can usually be opened successfully on a subsequent iteration.
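A minimal sketch of that pattern, wrapped in a helper so the scanning loop can skip locked files and retry them on the next pass (the function name is hypothetical):

uses
  Classes, SysUtils;

// Try to open a file for processing; returns False if the writer still has it open.
function TryOpenForProcessing(const AFileName: string; out AStream: TFileStream): Boolean;
begin
  try
    AStream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
    Result := True;
  except
    on EFOpenError do
    begin
      AStream := nil;
      Result := False; // still being written; try again on the next scan
    end;
  end;
end;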

"Fastest way to unzip many files on iOS" or "How else can I download many files quickly into my iOS app"

In my app I want the user to be able to download offline map content.
So I packed all my tiles into a zip file (using 0 compression).
The structure is like this: {z}/{x}/{y}.jpg
+0
+-0
+--0.jpg
+1
+-1
+--0.jpg
+2
+-2
+--1.jpg
So basically there are going to be many, many files for zoom levels 0-15 (about 120,000 tiles for my test region).
I am using https://github.com/mattconnolly/ZipArchive now, but I also tried https://github.com/soffes/ssziparchive before, and both are pretty slow. It takes about 5 (!) minutes on my iPhone 5S for the files to unzip.
Is there any way I can speed things up? What other options are there besides downloading the tiles in one big zip file?
Edit:
How can I download the contents of the whole folder to my iPhone quickly, without needing to unzip anything?
Any help is appreciated!
JPGs rarely compress at all with zip - they are by definition already compressed. What you should do is create your own binary file format, and put whatever metadata you need into it along with the images (which you should encode with a really low quality number, to get their size down).
When you download those files, you can open them, quickly read them into memory, and extract the data or images as needed.
This will be really fast and have virtually no overhead if your extra data is binary (not text).
PS: I just tripped on a PHP Plist class
If anyone is wondering how I ended up solving this:
For my use case (map tiles) I am now using MBTiles instead of zipped images. It's one big database file and super easy to read using FMDB. No unpacking whatsoever needed...
Even when I placed all the images in one binary file without any compression, the "extracting" still took forever!

Download Directory and Contents

Is it possible to persuade the Stream result to download an entire directory and its contents? And if so, how? I've no problem getting it to download individual files, but I need to download a series of files that must be in a specific directory structure.
I don't think so.
The Stream result lets you download ONE piece of content, with its MIME type, its name, etc.
That makes it impossible to serve a lot of files with different names and content types in a single response.
What you can do is:
Render the list of files in a JSP (as anchor tags, for example), each one targeting the Action that downloads that single file;
Call multiple Actions via scripting, opening a new page (target="_blank") for every file you have (dangerous, annoying, almost useless...);
Create a zip with Java on the server side, containing all your files and directories, then output the zip with the Stream result.
I think you may consider the third option.

Does Windows.CopyFile create a temporary local file while source and destination are network shares?

I have a D2007 application that uses Windows.CopyFile to copy MS Word and PowerPoint files from one network folder to another network folder. Our organization is migrating from Vista to Windows 7. One of my migrated users got an error message that displayed a partial local folder path (C:\Users\(username)\...\A100203.doc) during the copy. Does the CopyFile function cache a local copy of the document when it is copying from one network folder to another, or is it a direct write? I have never seen this error before, and the application has been running for years on Win95, Win98, Win2000, WinXP and Vista.
Windows.CopyFile does NOT cache the file on your hard drive... instead, it instructs Windows to handle the copying of the file itself (rather than you managing the streams in your own program). The output file buffer (destination) is opened, and the input buffer simply read and written. Essentially this means that the source file is spooled into system memory, then offloaded onto the destination... at no point is an additional cache file created (this would slow file copying down).
You need to provide more specific information about your error... such as either the text or an actual screenshot of the offending error message. This will allow people to provide more useful answers.
The user that launches the copy will require read access to the original and write access to the target, regardless of caching (if the user has read access to the file, then the file can be written to a local cache, so caching/no-caching is irrelevant).
Basic security prevents someone from copying files/directories between machines simply because the security attributes on the two machines happen to be compatible.
There's little else to say without the complete text of the error message.
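In the meantime, a minimal sketch of how the application itself could capture the underlying Windows error when the copy fails, so the exact message can be logged (the routine name and parameters are hypothetical):

uses
  Windows, SysUtils;

// Wrap Windows.CopyFile and surface the OS error text if it fails.
procedure CopyWithErrorReporting(const ASource, ADest: string);
begin
  if not Windows.CopyFile(PChar(ASource), PChar(ADest), False) then
    raise Exception.CreateFmt('CopyFile failed: %s',
      [SysErrorMessage(GetLastError)]);
end;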
