Delphi How to search in binary file faster? - delphi

I have a binary file (2.5 MB) and I want to find position of this sequence of bytes: CD 09 D9 F5. Then I want to write some data after this position and also overwrite old data (4 KB) with zeros.
Here is how I do it now but it is a bit slow.
ProcessFile(dataToWrite: string);
var
fileContent: string;
f: file of char;
c: char;
n, i, startIndex, endIndex: integer;
begin
AssignFile(f, 'file.bin');
reset(f);
n := FileSize(f);
while n > 0 do
begin
Read(f, c);
fileContent := fileContent + c;
dec(n);
end;
CloseFile(f);
startindex := Pos(Char($CD)+Char($09)+Char($D9)+Char($F5), fileContent) + 4;
endIndex := startIndex + 4088;
Seek(f, startIndex);
for i := 1 to length(dataToWrite) do
Write(f, dataToWrite[i]);
c := #0;
while (i < endIndex) do
begin
Write(f, c); inc(i);
end;
CloseFile(f);
end;

See this answer: Fast read/write from file in delphi
Some options are:
memory mapped files
TFileStream
blockread
To search the file buffer, see Best way to find position in the Stream where given byte sequence starts - one answer mentions the Boyer-Moore algorithm for fast detection of a byte sequence.

Your code to read the entire file into a string is very wasteful. Pascal I/O uses buffering so I don't think it's the byte by byte aspect particularly. Although one big read would be better. The main problem will be the string concatenation and the extreme heap allocation demand required to concatenate the string, one character at a time.
I'd do it like this:
function LoadFileIntoString(const FileName: string): string;
var
Stream: TFileStream;
begin
Stream := TFileStream.Create(FileName, fmOpenRead);
try
SetLength(Result, Stream.Size);//one single heap allocation
Stream.ReadBuffer(Pointer(Result)^, Length(Result));
finally
Stream.Free;
end;
end;
That alone should make a big difference. When it comes to writing the file, a similar use of strings will be much faster. I've not attempted to decipher the writing part of your code. Writing the new data, and the block of zeros again should be batched up to as few separate writes as possible.
If ever you find that you need to read or write very small blocks to a file, then I offer you my buffered file streams: Buffered files (for faster disk access).
The code could be optimised further to read only a portion of the file, and search until you find the target. You may be able to avoid reading the entire file that way. However, I suspect that these changes will make enough of a difference.

Related

TFileStream.Read not reading last part of file

I'm using TFileStream.Read in a loop to read a text file, but I find that the last part is not being read into the buffer - although the total number of bytes being read is equal to the filesize.
This is my code:
procedure TForm1.DoImport;
var
f: String;
fs: TFileStream;
r, c: Integer;
buf: TBytes;
const
bufsiz = 16384;
begin
SetLength(buf, bufsiz);
f := 'C:\Report\Claims\Claims.csv';
fs := TFileStream.Create(f, fmOpenRead);
try
c := 0;
repeat
r := fs.Read(buf, bufsiz);
Inc(c, r);
until (r <> bufsiz);
showmessage('Done. ' + IntToStr(c)); // <-- c equals to filesize !!
Memo1.Text := StringOf(buf); // <-- but the memo does not show the last chunk of the file
finally
fs.Free;
end;
end;
At the end, the TMemo does not contain the last chunk of the file, but the 2nd to last chunk. Is there something wrong with my code?
Thanks in advance!
The beginning of that buffer contains the last chunk of your file. But after that comes the content of the previous chunk, because you never cleared the buffer. So you think that your memo contains the previous chunk, but it is a mix of both.
You could use the copy function in order to just add a part of the buffer.
Memo1.Text := StringOf(Copy(buf, 0, r)); // r is the number of bytes to copy
A better way for reading a text file is using TStringList or TStringReader. These will take care of the file encoding (Ansi, UTF8, ...) I usually prefer the TStringList because I had too much trouble with some of the bugs in TStringReader.

Strings in Delphi: Pre-allocate memory to increase performance in simple cases?

I'm one of those so-called developers who got their way with Delphi without really understanding or even thinking about basics. In this case, I'm talking about strings.
While I do understand how pre-allocating memory can result in a significant speed gain. I don't understand how to use it in simple, real-world, cases (this is even more true with the TStringBuilder).
For example, let's say I have this code that recursively search a folder & add results to a hash list:
var
FilesList : TDictionary<String, Byte>; // Byte = (file = 0, folder = 1)
// ------------------------------------------------------------------------------ //
procedure AddFolder(const AFolderName : String);
var
FileName : String;
AHandle : THandle;
FindData : TWin32FindData;
begin
AHandle := FindFirstFile(PChar(AFolderName + '*'), FindData);
if (AHandle = INVALID_HANDLE_VALUE) then
Exit;
repeat
if (FindData.dwFileAttributes And FILE_ATTRIBUTE_DIRECTORY = 0) then
begin
{ Add a file. }
FileName := FindData.cFileName;
FilesList.Add(AFolderName + FileName, 0);
end
else if ((FindData.cFileName[0] <> '.') OR Not ((FindData.cFileName[1] = #0) OR (FindData.cFileName[1] = '.') And (FindData.cFileName[2] = #0))) then
begin
FileName := AFolderName + FindData.cFileName + '\';
FilesList.Add(FileName, 1);
AddFolder(FileName);
end;
until Not FindNextFile(AHandle, FindData);
Windows.FindClose(AHandle);
end;
I'm not sure if it's a good example, but in this case, it's not clear to me how pre-allocating memory to the variable FileName would help increase the execution speed, especially that I know nothing about its length. Assuming this is possible, how?
Or is the pre-allocation technique only useful when concatenating / building strings?
Notes about my question:
The question is primarily for XE2, but feel free to reference other delphi versions as I'm sure other developers will benefit from sharing the wisdom (that is, assuming mods won't delete it as chatty or subjective)
I'm more interested in simple everyday cases where one needs to make micro-optimization in very large loops / with huge amount of data by optimizing string memory pre-allocation.
Straight up string concatenation (for example) can be slow because the memory for the string is reallocated for each piece that is appended. Sometimes the new size can actually be accommodated in place but sometimes the data must be copied to a new location, the old buffer freed, and so on. This takes time.
In general, though, this should be of no concern to you unless you have verified with a performance profiler or explicit timing statements that you do in fact have a performance issue.
Of course you got away without understanding how Strings really work: Delphi is grate in that respect, it's string manipulation is highly effective and it's memory-manager is also highly-effective for small memory blocks. You can do ALOT with Delphi and not have a problem with String manipulation.
There are some classes of problems where you should take care, especially if the routines you're looking at are to be reused (library code).
For example, this should always raise a flag:
Result := '';
for i:=1 to N do
Result := Result + Something; // <- Recursively builds the string, one-char-at-a-time
Even THAT might fly with Delphi, if it's not often use or is used where time is not critical. None the less that kind of code should be optimized so the entire (likely) length of the string is pre-allocated, then trimmed in then end:
SetLength(Result, Whatever);
for i:=1 to N do
Result[i] := SomeChar;
SetLength(Result, INowKnowTheLength);
Now for an example where the TStringBuilder shines. If you have something like this:
var Msg: string;
begin
Msg := 'Mr ' + MrName + #13#10;
if SomeCondition then
Msg := Msg + 'We were unable to reach you'
else
Msg := Msg + 'We need to let you know';
Msg := Msg + #13#10
end;
i.e: code that builds one complex (and possibly large) bit of message, then you can easily optimize it using TStringBuilder:
var Msg: TStringBuilder;
begin
Msg := TStringBuilder.Create;
try
Msg.Append('Mr ');
Msg.Append(MrName);
Msg.Append(#13#10);
if SomeCondition then
Msg.Append('We were unable to reach you')
else
Msg.Append('We need to let you know');
Msg.Append(#13#10);
ShowMessage(Msg.ToString); // <- Gets the whole string
finally Msg.Free;
end;
end;
Any way, always balance the ease of writing, ease of maintenance with the true benefits of the performance. Don't exceed natural limits for the code your write: optimizing a string-generating routine to be faster then the HDD can write is wasted effort. Optimizing some GUI code to generate a message in 1ms (instead of 20ms) is also wasted effort - the user would never know your code was 20 times faster, it'd be just as instantaneous.
Basically it is these concatenations you are talking about:
AFolderName + '*'
AFolderName + FindData.cFileName
AFolderName + FindData.cFileName + '\'
The first one is done once, the loop executes either the second and third.
These methods in System.pas are used internally for the 3 lines:
procedure _UStrCat3(var Dest: UnicodeString; const Source1, Source2: UnicodeString);
procedure _UStrCat3(var Dest: UnicodeString; const Source1, Source2: UnicodeString);
procedure _UStrCatN(var Dest: UnicodeString; ArgCnt: Integer; const Strs: UnicodeString); varargs;
Since the 3 values are different, you cannot optimize it using only one expression.
All functions precalculate the final length and do the proper allocation work if needed.
Inside the loop, you might try to do the preallocation of the AFolderName + FindData.cFileName + '\' yourself, and snap out the AFolderName + FindData.cFileName portion, but then you require 2 allocations for the then case.
So I think your code cannot get optimized much further (i.e. you cannot get it to perform an order of magnitude better).

How can I use a large file in Delphi?

When I use a large file in memorystream or filestream I see an error which is "out of memory"
How can I solve this problem?
Example:
procedure button1.clıck(click);
var
mem:TMemoryStream;
str:string;
begin
mem:=Tmemorystream.create;
mem.loadfromfile('test.txt');----------> there test.txt size 1 gb..
compressstream(mem);
end;
Your implementation is very messy. I don't know exactly what CompressStream does, but if you want to deal with a large file as a stream, you can save memory by simply using a TFileStream instead of trying to read the whole thing into a TMemoryStream all at once.
Also, you're never freeing the TMemoryStream when you're done with it, which means that you're going to leak a whole lot of memory. (Unless CompressStream takes care of that, but that's not clear from the code and it's really not a good idea to write it that way.)
You can't fit the entire file into a single contiguous block of 32 bit address space. Hence the out of memory error.
Read the file in smaller pieces and process it piece by piece.
Answering the question in the title, you need to process the file piece by piece, byte by byte if that's needed: you definitively do not load the file all at once into memory! How you do that obviously depends on what you need to do with the file; But since we know you're trying to implement an Huffman encoder, I'll give you some specific tips.
An Huffman encoder is a stream encoder: Bytes go in and bits go out. Each unit of incoming data is replaced with it's corresponding bit pattern. The encoder doesn't need to see the whole file at once, because it is in fact only working on one byte each time.
Here's how you'd huffman-compress a file without loading it all into memory; Of course, the actual Huffman encoder is not shown, because the question is about working with big files, not about building the actual encoder. This piece of code includes buffered input and output and shows how you'd link an actual encoder procedure to it.
(beware, code written in browser; if it doesn't compile you're expected to fix it!)
type THuffmanBuffer = array[0..1023] of Byte; // Because I need to pass the array as parameter
procedure DoActualHuffmanEncoding(const EncodeByte:Byte; var BitBuffer: THuffmanBuffer; var AtBit: Integer);
begin
// This is where the actual Huffman encoding would happen. This procedure will
// copy the correct encoding for EncodeByte in BitBuffer starting at AtBit bit index
// The procedure is expected to advance the AtBit counter with the number of bits
// that were actually written (that's why AtBit is a var parameter).
end;
procedure HuffmanEncoder(const FileNameIn, FileNameOut: string);
var InFile, OutFile: TFileStream;
InBuffer, OutBuffer: THuffmanBuffer;
InBytesCount: Integer;
OutBitPos: Integer;
i: Integer;
begin
// First open the InFile
InFile := TFileStream.Create(FileNameIn, fmOpenRead or fmShareDenyWrite);
try
// Now prepare the OutFile
OutFile := TFileStream.Create(FileNameOut, fmCreate);
try
// Start the out bit counter
OutBitPos := 0;
// Read from the input file, one buffer at a time (for efficiency)
InBytesCount := InFile.Read(InBuffer, SizeOf(InBuffer));
while InBytesCount <> 0 do
begin
// Process the input buffer byte-by-byte
for i:=0 to InBytesCount-1 do
begin
DoActualHuffmanEncoding(InBuffer[i], OutBuffer, OutBitPos);
// The function writes bits to the outer buffer, not full bytes, and the
// encoding for a rare byte might be significantly longer then 1 byte.
// Whenever the output buffer approaches it's capacity we'll flush it
// out to the OutFile
if (OutBitPos > ((SizeOf(OutBuffer)-10)*8) then
begin
// Ok, we've got less then 10 bytes available in the OutBuffer, time to
// flush!
OutFile.Write(OutBuffer, OutBitPos div 8);
// We're now possibly left with one incomplete byte in the buffer.
// We'll copy that byte to the start of the buffer and continue.
OutBuffer[0] := OutBuffer[OutBitPos div 8];
OutBitPos := OutBitPos mod 8;
end;
end;
// Read next chunk
InBytesCount := InFile.Read(InBuffer, SizeOf(InBuffer));
end;
// Flush the remaining of the output buffer. This time we want to flush
// the final (potentially incomplete) byte as well, because we've got no
// more input, there'll be no more output.
OutFile.Write(OutBuffer, (OutBitPos + 7) div 8);
finally OutFile.Free;
end;
finally InFile.Free;
end;
end;
The Huffman encoder is not a difficult encoder to implement, but doing it both correctly and fast might be a challenge. I suggest you start with a correct encoder, once you've got both encoding and decoding working figure out how to do a fast encoder.
try something like http://www.explainth.at/en/delphi/mapstream.shtml

How can I quickly zero out the contents of a file?

I want to terminate file that user selected from my program. I wrote this sample code:
var
aFile: TFileStream;
Const
FileAddr: String = 'H:\Akon.mp3';
Buf: Byte = 0;
begin
if FileExists(FileAddr) then
begin
// Open given file in file stream & rewrite it
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
aFile.Seek(0, soFromBeginning);
while aFile.Position <> aFile.Size do
aFile.Write(Buf, 1);
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
As you can see, I overwrite given file with 0 (null) value. This code works correctly, but the speed is very low in large files. I would like do this process in multithreaded code, but I tried some test some code and can't do it. For example, I create 4 threads that do this work to speed up this process.
Is there any way to speed up this process?
I don't know if it could help you, but I think you could do better (than multithreading) writing to file a larger buffer.
For example you could initialize a buffer 16k wide and write directly to FileStream; you have only to check the last part of file, for which you write only a part of the full buffer.
Believe me, it will be really faster...
OK, I'll bite:
const
FileAddr: String = 'H:\Akon.mp3';
var
aFile: TFileStream;
Buf: array[0..1023] of Byte;
Remaining, NumBytes: Integer;
begin
if FileExists(FileAddr) then
begin
// Open given file in file stream & rewrite it
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
FillChar(Buf, SizeOf(Buf), 0);
Remaining := aFile.Size;
while Remaining > 0 do begin
NumBytes := SizeOf(Buf);
if NumBytes < Remaining then
NumBytes := Remaining;
aFile.WriteBuffer(Buf, NumBytes);
Dec(Remaining, NumBytes);
end;
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
Multiple threads won't help you here. Your constraint is disk access, primarily because you're writing only 1 byte at a time.
Declare Buf as an array of bytes, and initialise it with FillChar or ZeroMemory. Then change your while loop as follows:
while ((aFile.Position + SizeOf(Buf)) < aFile.Size) do
begin
aFile.Write(Buf, SizeOf(Buf));
end;
if (aFile.Position < aFile.Size) then
begin
aFile.Write(Buf, aFile.Size - aFile.Position);
end;
You should learn from the slowness of the above code that:
Writing one byte at a time is the slowest and worst way to do it, you incur a huge overhead, and reduce your overall performance.
Even writing 1k bytes (1024 bytes) at a time, would be a vast improvement, but writing a larger amount will of course be even faster, until you reach a point of diminishing returns, which I would guess is going to be somewhere between 200k and 500k write buffer size. The only way to find out when it stops mattering for your application is to test, test, and test.
Checking position against size so often is completely superfluous. If you read the size once, and write the correct number of bytes, using a local variable you will save yourself more overhead, and improve performance. ie, Inc(LPosition,BufSize) to increment LPosition:Integer logical variable, by the buffer size amount BufSize.
Not sure if this meets your requirments but it works and it's fast.
var
aFile: TFileStream;
const
FileAddr: String = 'H:\Akon.mp3';
begin
if FileExists(FileAddr) then
begin
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
afile.Size := 0;
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
So will something along these lines (declaring b outside the function will improve your performance in the loop, especially when dealing with a large file ). I assume that in the app filename would be a var:
const
b: byte=0;
procedure MyProc;
var
aFile: TFileStream;
Buf: array of byte;
len: integer;
FileAddr: String;
begin
FileAddr := 'C:\testURL.txt';
if FileExists(FileAddr) then
begin
aFile := TFileStream.Create(FileAddr, fmcreate);
try
len := afile.Size;
setlength(buf, len);
fillchar(buf[0], len, b);
aFile.Position := 0;
aFile.Write(buf, len);
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;

Delphi Pascal - Using SetFilePointerEx and GetFileSizeEx, Getting Physical Media exact size when reading as a file

I do not know how to use any API that is not in the RTL. I have been using SetFilePointer and GetFileSize to read a Physical Disk into a buffer and dump it to a file, something like this in a loop does the job for flash memory cards under 2GB:
SetFilePointer(PD,0,nil,FILE_BEGIN);
SetLength(Buffer,512);
ReadFile(PD,Buffer[0],512,BytesReturned,nil);
However GetFileSize has a limit at 2GB and so does SetFilePointer. I have absolutley no idea how to delcare an external API, I have looked at the RTL and googled for many examples and have found no correct answer.
I tried this
function GetFileSizeEx(hFile: THandle; lpFileSizeHigh: Pointer): DWORD;
external 'kernel32';
and as suggested this
function GetFileSizeEx(hFile: THandle; var FileSize: Int64): DWORD;
stdcall; external 'kernel32';
But the function returns a 0 even though I am using a valid disk handle which I have confirmed and dumped data from using the older API's.
I am using SetFilePointer to jump every 512 bytes and ReadFile to write into a buffer, in reverse I can use it to set when I am using WriteFile to write Initial Program Loader Code or something else to the disk. I need to be able to set the file pointer beyond 2gb well beyond.
Can someone help me make the external declarations and a call to both GetFileSizeEx and SetFilePointerEx that work so I can modify my older code to work with say 4 to 32gb flash cards.
I suggest that you take a look at this Primoz Gabrijelcic blog article and his GpHugeFile unit which should give you enough pointers to get the file size.
Edit 1 This looks rather a daft answer now in light of the edit to the question.
Edit 2 Now that this answer has been accepted, following a long threads of comments to jachguate's answer, I feel it incumbent to summarise what has been learnt.
GetFileSize and
SetFilePointer have no 2GB
limitation, they can be used on files
of essentially arbitrary size.
GetFileSizeEx and
SetFilePointerEx are much
easier to use because they work
directly with 64 bit quantities and
have far simpler error condition
signals.
The OP did not in fact need to
calculate the size of his disk. Since
the OP was reading the entire
contents of the disk the size was not
needed. All that was required was to
read the contents sequentially until
there was nothing left.
In fact
GetFileSize/GetFileSizeEx
do not support handles to devices
(e.g. a physical disk or volume) as
was requested by the OP. What's more,
SetFilePointer/SetFilePointerEx
cannot seek to the end of such device
handles.
In order to obtain the size of a
disk, volume or partition, one should
pass the the
IOCTL_DISK_GET_LENGTH_INFO
control code to
DeviceIoControl.
Finally, should you need to use GetFileSizeEx and SetFilePointerEx then they can be declared as follows:
function GetFileSizeEx(hFile: THandle; var lpFileSize: Int64): BOOL;
stdcall; external 'kernel32.dll';
function SetFilePointerEx(hFile: THandle; liDistanceToMove: Int64;
lpNewFilePointer: PInt64; dwMoveMethod: DWORD): BOOL;
stdcall; external 'kernel32.dll';
One easy way to obtain these API imports them is through the excellent JEDI API Library.
The GetFileSizeEx routine expects a pointer to a LARGE_INTEGER data type, and documentation says:
If your compiler has built-in support for 64-bit integers, use the QuadPart member to store the 64-bit integer
Lucky you, Delphi has built-in support for 64 bit integers, so use it:
var
DriveSize: LongWord;
begin
GetFilePointerSizeEx(PD, #DriveSize);
end;
SetFilePointerEx, on the other hand, expects parameters for liDistanceToMove, lpNewFilePointer, both 64 bit integers. My understanding is it wants signed integers, but you have the UInt64 data type for Unsingned 64 bit integers if I'm missunderstanding the documentation.
Alternative coding
Suicide, first of all your approach is wrong, and because of your wrong approach you ran into some hairy problems with the way Windows handles Disk drives opened as files. In pseudo code your approach seems to be:
Size = GetFileSize;
for i=0 to (Size / 512) do
begin
Seek(i * 512);
ReadBlock;
WriteBlockToFile;
end;
That's functionally correct, but there's a simpler way to do the same without actually getting the SizeOfDisk and without seeking. When reading something from a file (or a stream), the "pointer" is automatically moved with the ammount of data you just read, so you can skip the "seek". All the functions used to read data from a file return the amount of data that was actually read: you can use that to know when you reached the end of the file without knowing the size of the file to start with!
Here's an idea of how you can read an physical disk to a file, without knowing much about the disk device, using Delphi's TFileStream:
var DiskStream, DestinationStream:TFileStream;
Buff:array[0..512-1] of Byte;
BuffRead:Integer;
begin
// Open the disk for reading
DiskStream := TFileStream.Create('\\.\PhysicalDrive0', fmOpenRead);
try
// Create the file
DestinationStream := TFileStream.Create('D:\Something.IMG', fmCreate);
try
// Read & write in a loop; This is where all the work's done:
BuffRead := DiskStream.Read(Buff, SizeOf(Buff));
while BuffRead > 0 do
begin
DestinationStream.Write(Buff, BuffRead);
BuffRead := DiskStream.Read(Buff, SizeOf(Buff));
end;
finally DestinationStream.Free;
end;
finally DiskStream.Free;
end;
end;
You can obviously do something similar the other way around, reading from a file and writing to disk. Before writing that code I actually attempted doing it your way (getting the file size, etc), and immediately ran into problems! Apparently Windows doesn't know the exact size of the "file", not unless you read from it.
Problems with disks opened as files
For all my testing I used this simple code as the base:
var F: TFileStream;
begin
F := TFileStream.Create('\\.\PhysicalDrive0', fmOpenRead);
try
// Test code goes here...
finally F.Free;
end;
end;
The first (obvious) thing to try was:
ShowMessage(IntToStr(DiskStream.Size));
That fails. In the TFileStream implementation that depends on calling FileSeek, and FileSeek can't handle files larger then 2Gb. So I gave GetFileSize a try, using this code:
var RetSize, UpperWord:DWORD;
RetSize := GetFileSize(F.Handle, #UpperWord);
ShowMessage(IntToStr(UpperWord) + ' / ' + IntToStr(RetSize));
That also fails, even those it should be perfectly capable of returning file size as an 64 bit number! Next I tried using the SetFilePointer API, because that's also supposed to handle 64bit numbers. I thought I'd simply seek to the end of the file and look at the result, using this code:
var RetPos, UpperWord:DWORD;
UpperWord := 0;
RetPos := SetFilePos(F.Handle, 0, #UpperWord, FILE_END);
ShowMessage(IntToStr(UpperWord) + ' / ' + IntToStr(RetPos));
This code also fails! And now I'm thinking, why did the first code work? Apparently reading block-by-block works just fine and Windows knows exactly when to stop reading!! So I thought maybe there's a problem with the implementation of the 64 bit file handling routines, let's try seeking to end of the file in small increments; When we get an error seeking we know we reached the end we'll stop:
var PrevUpWord, PrevPos: DWORD;
UpWord, Pos: DWORD;
UpWord := 0;
Pos := SetFilePointer(F.Handle, 1024, #UpWord, FILE_CURRENT); // Advance the pointer 512 bytes from it's current position
while (UpWord <> PrevUpWord) or (Pos <> PrevPos) do
begin
PrevUpWord := UpWord;
PrevPos := Pos;
UpWord := 0;
Pos := SetFilePointer(F.Handle, 1024, #UpWord, FILE_CURRENT);
end;
When trying this code I had a surprise: It doesn't stop at the of the file, it just goes on and on, for ever. It never fails. To be perfectly honest I'm not sure it's supposed to ever fail... It's probably not supposed to fail. Anyway, doing a READ in that loop fails when we're past the end of file so we can use a VERY hacky mixed approach to handle this situation.
Ready-made routines that work around the problem
Here's the ready-made routine that gets the size of the physical disk opened as a file, even when GetFileSize fails, and SetFilePointer with FILE_END fails. Pass it an opened TFileStream and it will return the size as an Int64:
function Hacky_GetStreamSize(F: TFileStream): Int64;
var Step:DWORD;
StartPos: Int64;
StartPos_DWORD: packed array [0..1] of DWORD absolute StartPos;
KnownGoodPosition: Int64;
KGP_DWORD: packed array [0..1] of DWORD absolute KnownGoodPosition;
Dummy:DWORD;
Block:array[0..512-1] of Byte;
begin
// Get starting pointer position
StartPos := 0;
StartPos_DWORD[0] := SetFilePointer(F.Handle, 0, #StartPos_DWORD[1], FILE_CURRENT);
try
// Move file pointer to the first byte
SetFilePointer(F.Handle, 0, nil, FILE_BEGIN);
// Init
KnownGoodPosition := 0;
Step := 1024 * 1024 * 1024; // Initial step will be 1Gb
while Step > 512 do
begin
// Try to move
Dummy := 0;
SetFilePointer(F.Handle, Step, #Dummy, FILE_CURRENT);
// Test: Try to read!
if F.Read(Block, 512) = 512 then
begin
// Ok! Save the last known good position
KGP_DWORD[1] := 0;
KGP_DWORD[0] := SetFilePointer(F.Handle, 0, #KGP_DWORD[1], FILE_CURRENT);
end
else
begin
// Read failed! Move back to the last known good position and make Step smaller
SetFilePointer(F.Handle, KGP_DWORD[0], #KGP_DWORD[1], FILE_BEGIN);
Step := Step div 4; // it's optimal to devide by 4
end;
end;
// From here on we'll use 512 byte steps until we can't read any more
SetFilePointer(F.Handle, KGP_DWORD[0], #KGP_DWORD[1], FILE_BEGIN);
while F.Read(Block, 512) = 512 do
KnownGoodPosition := KnownGoodPosition + 512;
// Done!
Result := KnownGoodPosition;
finally
// Move file pointer back to starting position
SetFilePointer(F.Handle, StartPos_DWORD[0], #StartPos_DWORD[1], FILE_BEGIN);
end;
end;
To be complete, here are two routines that may be used to set and get the file pointer using Int64 for positioning:
function Hacky_SetStreamPos(F: TFileStream; Pos: Int64):Int64;
var aPos:Int64;
DWA:packed array[0..1] of DWORD absolute aPos;
const INVALID_SET_FILE_POINTER = $FFFFFFFF;
begin
aPos := Pos;
DWA[0] := SetFilePointer(F.Handle, DWA[0], #DWA[1], FILE_BEGIN);
if (DWA[0] = INVALID_SET_FILE_POINTER) and (GetLastError <> NO_ERROR) then
RaiseLastOSError;
Result := aPos;
end;
function Hacky_GetStreamPos(F: TFileStream): Int64;
var Pos:Int64;
DWA:packed array[0..1] of DWORD absolute Pos;
begin
Pos := 0;
DWA[0] := SetFilePointer(F.Handle, 0, #DWA[1], FILE_CURRENT);
Result := Pos;
end;
Last notes
The 3 routines I'm providing take as a parameter an TFileStream, because that's what I use for file reading and writing. They obviously only use TFileStream.Handle, so the parameter can simply be replaced with an file handle: the functionality would stay the same.
I know this thread is old, but...
One small suggestion - if you use the Windows DeviceIoControl(...) function you can get Drive Geometry and/or Partition Information, and use them to get the total size/length of the opened drive or partition. No more messing around with incrementally seeking to the end of the device.
Those IOCTLs can also be used to give you the correct volume sector size, and you could use that instead of defaulting to 512 everywhere.
Very very useful. But I got a problem for disks greater then 4 GB.
I solved replacing:
// Ok! Save the last known good position
KGP_DWORD[1] := 0;
KGP_DWORD[0] := SetFilePointer(F.Handle, 0, #KGP_DWORD[1], FILE_CURRENT);
with the following:
// Ok! Save the last known good position
KnownGoodPosition := KnownGoodPosition + Step;
Many thanks again...
And many thanks also to James R. Twine. I followed the advice of using IOCTL_DISK_GET_DRIVE_GEOMETRY_EX and got disk dimension with no problem and no strange workaround.
Here is the code:
TDISK_GEOMETRY = record
Cylinders : Int64; //LargeInteger
MediaType : DWORD; //MEDIA_TYPE
TracksPerCylinder: DWORD ;
SectorsPerTrack: DWORD ;
BytesPerSector : DWORD ;
end;
TDISK_GEOMETRY_EX = record
Geometry: TDISK_GEOMETRY ;
DiskSize: Int64; //LARGE_INTEGER ;
Data : array[1..1000] of byte; // unknown length
end;
function get_disk_size(handle: thandle): int64;
var
BytesReturned: DWORD;
DISK_GEOMETRY_EX : TDISK_GEOMETRY_EX;
begin
result := 0;
if DeviceIOControl(handle,IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
nil,0,#DISK_GEOMETRY_EX, sizeof(TDISK_GEOMETRY_EX),BytesReturned,nil)
then result := DISK_GEOMETRY_EX.DiskSize;
end;

Resources