Delphi: How to calculate the SHA hash of a large file

Hi, I need to generate a SHA hash of a 5 GB file.
Do you know of a non-string-based Delphi library that can do this?

You should use DCPcrypt v2, read your file in buffered chunks, and feed each buffer to the SHA hasher until you've read the complete 5 GB file.
If you want to know how to read a large file buffered, see my answer about a file copy using custom buffering.
So, in concept (not real Delphi code!):
function GetShaHash(const AFilename: String)
begin
  sha := TSHAHasher.Create;
  SetLength(Result, sha.Size);
  file := OpenFile(AFilename, GENERIC_READ);
  while not eof(file) do
  begin
    BytesRead := ReadFile(file, buffer[0], 0, 1024 * 1024);
    sha.Update(buffer[0], BytesRead);
  end;
  sha.Final(Result[0]);
  CloseFile(file);
end;

I would recommend Wolfgang Ehrhardt's CRC/Hash.
http://home.netsurf.de/wolfgang.ehrhardt/
It's fast and "can be compiled with most current Pascal (TP 5/5.5/6, BP 7, VP 2.1, FPC 1.0/2.0/2.2) and Delphi versions (tested with V1 up to V7/9/10)".
I've used it with D11/D12 too.

If I remember correctly, Indy comes with several stream-based hash methods.
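For example, a minimal sketch using Indy 10's IdHashSHA unit (TIdHashSHA1 here; Indy also declares TIdHashSHA256, but depending on the Indy version it may need OpenSSL available, and it is worth checking that your version hashes the stream in chunks rather than loading it whole before pointing it at a 5 GB file):
uses
  Classes, SysUtils, IdHashSHA;

function HashFileSHA1(const AFileName: string): string;
var
  Hasher: TIdHashSHA1;
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
  try
    Hasher := TIdHashSHA1.Create;
    try
      // Hash the stream contents and return the digest as a hex string.
      Result := Hasher.HashStreamAsHex(Stream);
    finally
      Hasher.Free;
    end;
  finally
    Stream.Free;
  end;
end;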

There is a Delphi interface for OpenSSL, isn't there?
That should give you better performance.

@Davy Landman,
thank you, your answer really helped me out. This is the code I ended up using:
function HashFileSHA256(const fileName: String): String;
var
  sha256: TDCP_sha256;                      // from DCPcrypt v2 (unit DCPsha256)
  buffer: array[0..(1024*1024)-1] of byte;  // 1 MB read buffer
  i, bytesRead: Integer;
  streamIn: TFileStream;
  hashBuf: array[0..31] of byte;            // 32-byte SHA-256 digest
begin
  // Initialization
  Result := '';
  streamIn := TFileStream.Create(fileName, fmOpenRead);
  sha256 := TDCP_sha256.Create(nil);
  for i := 0 to High(buffer) do
    buffer[i] := 0;
  for i := 0 to High(hashBuf) do
    hashBuf[i] := 0;
  bytesRead := -1;
  // Compute
  try
    sha256.Init;
    while bytesRead <> 0 do
    begin
      bytesRead := streamIn.Read(buffer[0], SizeOf(buffer));
      sha256.Update(buffer[0], bytesRead);
    end;
    sha256.Final(hashBuf);
    for i := 0 to High(hashBuf) do
      Result := Result + IntToHex(hashBuf[i], 2);
  finally
    streamIn.Free;
    sha256.Free;
  end;
  Result := LowerCase(Result);
end;
P.S.: I am a total beginner with Pascal, so this code most likely sucks. But I tested it on the MSYS2 installer and was able to verify the hash, so that's nice.

Related

Infinite uncompressing PDF stream decoded with FlateDecode using ZLib

I've searched many forums and blogs before posting the question.
I've found samples in Python and VB that use ZLib, but I can't get it to work in Delphi.
I have a stream from a PDF that is encoded with FlateDecode.
Here is the stream saved as a simple file named "compressed_stream.pdf" (in fact it's not a PDF - it's only the stream, but I just left the .pdf file extension):
https://files.fm/u/epka2hxz
Execution reaches System.Zlib.ZDecompressStream(streamIn, streamOut) and just sits there... no errors, no crashes, nothing - it hangs until I break the execution.
Any idea?
Here is my code:
var
  fs: TFileStream;
  streamIn, streamOut: TMemoryStream;
begin
  fs := TFileStream.Create(sDocumentFolder + 'compressed_stream.pdf', fmOpenRead);
  streamIn := TMemoryStream.Create();
  streamOut := TMemoryStream.Create();
  streamIn.CopyFrom(fs, 0);
  streamIn.Position := 0;
  System.Zlib.ZDecompressStream(streamIn, streamOut);
end;
Thanks to Dima I quickly found a sample for TZDecompressionStream:
https://forum.lazarus.freepascal.org/index.php?topic=33009.0
function ZDecompressString(aText: string): string;
var
  strInput,
  strOutput: TStringStream;
  Unzipper: TZDecompressionStream;
begin
  Result := '';
  strInput := TStringStream.Create(aText);
  strOutput := TStringStream.Create;
  try
    Unzipper := TZDecompressionStream.Create(strInput);
    try
      strOutput.CopyFrom(Unzipper, Unzipper.Size);
    finally
      Unzipper.Free;
    end;
    Result := strOutput.DataString;
  finally
    strInput.Free;
    strOutput.Free;
  end;
end;
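If you want to stay stream-based instead of going through strings (safer for raw, binary PDF stream data, especially with Unicode strings), here is a minimal sketch of the same idea, assuming a recent Delphi where the class lives in System.ZLib (older versions use the ZLib unit and TDecompressionStream); DecompressPdfStreamFile and both file names are made up for the example:
uses
  System.Classes, System.SysUtils, System.ZLib;

procedure DecompressPdfStreamFile(const AInFile, AOutFile: string);
var
  Source, Dest: TFileStream;
  Unzipper: TZDecompressionStream;
  Buffer: array[0..65535] of Byte;
  Count: Integer;
begin
  Source := TFileStream.Create(AInFile, fmOpenRead or fmShareDenyWrite);
  try
    Dest := TFileStream.Create(AOutFile, fmCreate);
    try
      Unzipper := TZDecompressionStream.Create(Source);
      try
        // Pull decompressed data out in chunks until the stream is exhausted.
        repeat
          Count := Unzipper.Read(Buffer, SizeOf(Buffer));
          if Count > 0 then
            Dest.WriteBuffer(Buffer, Count);
        until Count = 0;
      finally
        Unzipper.Free;
      end;
    finally
      Dest.Free;
    end;
  finally
    Source.Free;
  end;
end;
Called, for example, as DecompressPdfStreamFile(sDocumentFolder + 'compressed_stream.pdf', sDocumentFolder + 'decompressed.bin');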

How to read first and last 64kb of a video file in Delphi?

I want to use a subtitle API. It requires an MD5 hash of the first and last 64 KB of the video file. I know how to do the MD5 part; I just want to know how to get those 128 KB of data.
Here is the solution to the problem in Java which I am unable to implement in Delphi. How to read first and last 64kb of a video file in Java?
My Delphi code so far:
function TSubdbApi.GetHashFromFile(const AFilename: string): string;
var
  Md5: TIdHashMessageDigest5;
  Filestream: TFileStream;
  Buffer: TByteArray;
begin
  Md5 := TIdHashMessageDigest5.Create;
  Filestream := TFileStream.Create(AFilename, fmOpenRead, fmShareDenyWrite);
  try
    if Filestream.Size > 0 then begin
      Filestream.Read(Buffer, 1024 * 64);
      Filestream.Seek(64, soFromEnd);
      Filestream.Read(Buffer, 1024 * 64);
      Result := Md5.HashStreamAsHex(Filestream);
    end;
  finally
    Md5.Free;
    Filestream.Free;
  end;
end;
I am not getting the correct MD5 hash as stated by the official API (API URL here). I am using Delphi XE8.
The hash function used by that API is described as:
Our hash is composed by taking the first and the last 64kb of the
video file, putting all together and generating a md5 of the resulting
data (128kb).
I can see a few problems in your code. You are hashing the file stream, not your Buffer array - and in any case you overwrite that array with the second read from the file stream. You are also trying to seek only 64 bytes, and beyond the end of the stream (you need a negative value to seek back from the end of the stream). Try something like this instead:
type
  ESubDBException = class(Exception);

function TSubdbApi.GetHashFromFile(const AFileName: string): string;
const
  KiloByte = 1024;
  DataSize = 64 * KiloByte;
var
  Digest: TIdHashMessageDigest5;
  FileStream: TFileStream;
  HashStream: TMemoryStream;
begin
  FileStream := TFileStream.Create(AFileName, fmOpenRead, fmShareDenyWrite);
  try
    if FileStream.Size < DataSize then
      raise ESubDBException.Create('File is smaller than the minimum required for ' +
        'calculating API hash.');
    HashStream := TMemoryStream.Create;
    try
      HashStream.CopyFrom(FileStream, DataSize);
      FileStream.Seek(-DataSize, soEnd);
      HashStream.CopyFrom(FileStream, DataSize);
      Digest := TIdHashMessageDigest5.Create;
      try
        HashStream.Position := 0;
        Result := Digest.HashStreamAsHex(HashStream);
      finally
        Digest.Free;
      end;
    finally
      HashStream.Free;
    end;
  finally
    FileStream.Free;
  end;
end;
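For completeness, a call site might look like this (the TSubdbApi instance and the file path are placeholders, not from the question):
var
  Api: TSubdbApi;
  Hash: string;
begin
  Api := TSubdbApi.Create;   // construct your API wrapper however it is normally constructed
  try
    Hash := Api.GetHashFromFile('C:\Videos\sample.mkv');
    ShowMessage(Hash);       // hex MD5 of the first 64 KB + last 64 KB
  finally
    Api.Free;
  end;
end;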

Copy a file to clipboard in Delphi

I am trying to copy a file to the clipboard. All the examples on the Internet are the same. I am using the one from http://embarcadero.newsgroups.archived.at/public.delphi.nativeapi/200909/0909212186.html but it does not work.
I use RAD Studio XE and I pass the complete path. In debug mode, I get warnings like:
Debug Output:
Invalid address specified to RtlSizeHeap( 006E0000, 007196D8 )
Invalid address specified to RtlSizeHeap( 006E0000, 007196D8 )
I am not sure if my environment is relevant: Windows 8.1 64-bit, RAD Studio XE.
When I try to paste from the clipboard, nothing happens. Also, when I inspect the clipboard with a monitor tool, the tool shows an error.
The code is:
procedure TfrmDoc2.CopyFilesToClipboard(FileList: string);
var
  DropFiles: PDropFiles;
  hGlobal: THandle;
  iLen: Integer;
begin
  iLen := Length(FileList) + 2;
  FileList := FileList + #0#0;
  hGlobal := GlobalAlloc(GMEM_SHARE or GMEM_MOVEABLE or GMEM_ZEROINIT,
    SizeOf(TDropFiles) + iLen);
  if (hGlobal = 0) then raise Exception.Create('Could not allocate memory.');
  begin
    DropFiles := GlobalLock(hGlobal);
    DropFiles^.pFiles := SizeOf(TDropFiles);
    Move(FileList[1], (PChar(DropFiles) + SizeOf(TDropFiles))^, iLen);
    GlobalUnlock(hGlobal);
    Clipboard.SetAsHandle(CF_HDROP, hGlobal);
  end;
end;
UPDATE:
I am sorry, I feel stupid. I had used the code that did not work - the original code from the question somebody asked - in my project, rather than Remy's code, the correct solution given here on Stack Overflow. I thought I had used Remy's code in my project. So now, using Remy's code, everything works great. Sorry for the mistake.
The forum post you link to contains the code in your question and asks why it doesn't work. Not surprisingly the code doesn't work for you any more than it did for the asker.
The answer that Remy gives is that there is a mismatch between ANSI and Unicode. The code is for ANSI but the compiler is Unicode.
So click on Remy's reply and do what it says: http://embarcadero.newsgroups.archived.at/public.delphi.nativeapi/200909/0909212187.html
Essentially you need to adapt the code to account for characters being 2 bytes wide in Unicode Delphi, but I see no real purpose repeating Remy's code here.
However, I'd say that you can do better than this code. Its problem is that it mixes every aspect into one big function that does it all. What's more, the function is a method of a form in your GUI, which is really the wrong place for it. There are aspects of the code that you might be able to re-use, but not factored like that.
I'd start with a function that puts a known block of memory onto the clipboard.
procedure ClipboardError;
begin
  raise Exception.Create('Could not complete clipboard operation.');
  // substitute something more specific than Exception in your code
end;

procedure CheckClipboardHandle(Handle: HGLOBAL);
begin
  if Handle = 0 then begin
    ClipboardError;
  end;
end;

procedure CheckClipboardPtr(Ptr: Pointer);
begin
  if not Assigned(Ptr) then begin
    ClipboardError;
  end;
end;

procedure PutInClipboard(ClipboardFormat: UINT; Buffer: Pointer; Count: Integer);
var
  Handle: HGLOBAL;
  Ptr: Pointer;
begin
  Clipboard.Open;
  try
    Handle := GlobalAlloc(GMEM_MOVEABLE, Count);
    try
      CheckClipboardHandle(Handle);
      Ptr := GlobalLock(Handle);
      CheckClipboardPtr(Ptr);
      Move(Buffer^, Ptr^, Count);
      GlobalUnlock(Handle);
      Clipboard.SetAsHandle(ClipboardFormat, Handle);
    except
      GlobalFree(Handle);
      raise;
    end;
  finally
    Clipboard.Close;
  end;
end;
We're also going to need to be able to make double-null terminated lists of strings. Like this:
function DoubleNullTerminatedString(const Values: array of string): string;
var
  Value: string;
begin
  Result := '';
  for Value in Values do
    Result := Result + Value + #0;
  Result := Result + #0;
end;
Perhaps you might add an overload that accepts a TStrings instance; a possible sketch follows.
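A minimal sketch of such an overload (not part of the original answer; it assumes the array version above is also marked overload, and TStrings comes from the Classes unit):
function DoubleNullTerminatedString(Strings: TStrings): string; overload;
var
  Value: string;
begin
  // Same layout as the open-array version: each string is #0-terminated,
  // and the whole list ends with an extra #0.
  Result := '';
  for Value in Strings do
    Result := Result + Value + #0;
  Result := Result + #0;
end;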
Now that we have all this we can concentrate on making the structure needed for the CF_HDROP format.
procedure CopyFileNamesToClipboard(const FileNames: array of string);
var
  Size: Integer;
  FileList: string;
  DropFiles: PDropFiles;
begin
  FileList := DoubleNullTerminatedString(FileNames);
  Size := SizeOf(TDropFiles) + ByteLength(FileList);
  DropFiles := AllocMem(Size);
  try
    DropFiles.pFiles := SizeOf(TDropFiles);
    DropFiles.fWide := True;
    Move(Pointer(FileList)^, (PByte(DropFiles) + SizeOf(TDropFiles))^,
      ByteLength(FileList));
    PutInClipboard(CF_HDROP, DropFiles, Size);
  finally
    FreeMem(DropFiles);
  end;
end;
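Usage is then a one-liner, after which the files can be pasted in Explorer (the paths are just examples):
CopyFileNamesToClipboard(['C:\Temp\readme.txt', 'C:\Temp\picture.jpg']);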
Since you use Delphi XE, strings are Unicode, but you are not taking the size of a character into account when you allocate and move memory.
Change the line allocating memory to
hGlobal := GlobalAlloc(GMEM_SHARE or GMEM_MOVEABLE or GMEM_ZEROINIT,
  SizeOf(TDropFiles) + iLen * SizeOf(Char));
and the line copying memory, to
Move(FileList[1], (PByte(DropFiles) + SizeOf(TDropFiles))^, iLen * SizeOf(Char));
Note the inclusion of * SizeOf(Char) in both lines and the change of PChar to PByte on the second line.
Then also set the fWide member of DropFiles to True:
DropFiles^.fWide := True;
All of these changes are already in the code from Remy, referred to by David.

Copy part of a file into a stream

The overall goal is to use part of a file to compute a checksum, in order to find duplicate movie and MP3 files.
To do this I have to take a part of the file and generate its MD5, because the whole file can be up to 25 GB in some cases. If I find duplicates, I will then do a complete MD5 to avoid deleting the wrong file by mistake.
I don't have any problem generating the MD5 from a stream; that is done with Indy components.
So, for the first part, I have to copy the first 1 MB of the file, and I made this function,
but the memory stream is empty in all my checks!
function splitFile(FileName: string): TMemoryStream;
var
  fs: TFileStream;
  ms: TMemoryStream;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  ms := TMemoryStream.Create;
  fs.Position := 0;
  ms.CopyFrom(fs, 1048576);
  result := ms;
end;
How can I fix this? Or where is my problem?
Update 1 (dirty test):
This code raises a "stream read error"; Memo2 shows some text but Memo3 is empty!
function splitFile(FileName: string): TMemoryStream;
var
  fs: TFileStream;
  ms: TMemoryStream;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  ms := TMemoryStream.Create;
  fs.Position := 0;
  form1.Memo2.Lines.LoadFromStream(fs);
  ms.CopyFrom(fs, 1048576);
  ms.Position := 0;
  form1.Memo3.Lines.LoadFromStream(ms);
  result := ms;
end;
The complete code:
function splitFile(FileName: string): TMemoryStream;
var
  fs: TFileStream;
  ms: TMemoryStream;
  i, BytesToRead: integer;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  ms := TMemoryStream.Create;
  fs.Position := 0;
  BytesToRead := Min(fs.Size - fs.Position, 1024 * 1024);
  ms.CopyFrom(fs, BytesToRead);
  result := ms;
  // fs.Free;
  // ms.Free;
end;

function streamFile(FileName: string): TFileStream;
var
  fs: TFileStream;
  ms: TMemoryStream;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  result := fs;
end;

function GetFileMD5(const Stream: TStream): String; overload;
var
  MD5: TIdHashMessageDigest5;
begin
  MD5 := TIdHashMessageDigest5.Create;
  try
    Result := MD5.HashStreamAsHex(Stream);
  finally
    MD5.Free;
  end;
end;

function getMd5HashString(value: string): string;
var
  hashMessageDigest5: TIdHashMessageDigest5;
begin
  hashMessageDigest5 := nil;
  try
    hashMessageDigest5 := TIdHashMessageDigest5.Create;
    Result := IdGlobal.IndyLowerCase(hashMessageDigest5.HashStringAsHex(value));
  finally
    hashMessageDigest5.Free;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  Path, hash: String;
  SR: TSearchRec;
begin
  if od1.Execute then
  begin
    Path := ExtractFileDir(od1.FileName); // Get the path of the selected file
    DirList := TStringList.Create;
    try
      if FindFirst(Path + '\*.*', faArchive, SR) = 0 then
      begin
        repeat
          if (SR.Size > 10240) then
          begin
            hash := GetFileMD5(splitFile(Path + '\' + SR.Name));
          end
          else
          begin
            hash := GetFileMD5(streamFile(Path + '\' + SR.Name));
          end;
          memo1.Lines.Add(hash + ' | ' + SR.Name + ' | ' + inttostr(SR.Size));
          application.ProcessMessages;
        until FindNext(SR) <> 0;
        FindClose(SR);
      end;
    finally
      DirList.Free;
    end;
  end;
end;
output:
D41D8CD98F00B204E9800998ECF8427E | eslahat.docx | 13338
D41D8CD98F00B204E9800998ECF8427E | EXT-3000-Data-Sheet.pdf | 682242
D41D8CD98F00B204E9800998ECF8427E | faktor khate ekhtesasi firoozpoor.pdf | 50091
D41D8CD98F00B204E9800998ECF8427E | FileZilla_3.9.0.5_win32-setup.exe | 6057862
D41D8CD98F00B204E9800998ECF8427E | FileZilla_3.9.0.6_win32-setup.exe | 6126536
11210486C9E54E12DA9DF687792257EA | get_stats_of_all_members_of_mu(1).php | 6227
11210486C9E54E12DA9DF687792257EA | get_stats_of_all_members_of_mu.php | 6227
D41D8CD98F00B204E9800998ECF8427E | GOMAUDIOGLOBALSETUP.EXE | 6855616
D41D8CD98F00B204E9800998ECF8427E | harvester-master(1).zip | 54255
D41D8CD98F00B204E9800998ECF8427E | harvester-master.zip | 54180
Here is a procedure that I quickly wrote for you which allows you to read part of a file (a chunk) into a memory stream.
The reason I made this a procedure and not a function is so that the same memory stream can be reused for different chunks. This way you avoid all those memory allocations/deallocations and also reduce the chance of introducing a memory leak.
In order to do so you need to pass the memory stream to the procedure as a var parameter.
I also added two more parameters: one for specifying the chunk size (the amount of data that you want to read from the file) and one for the chunk number.
I also added some rudimentary safeguards to tell you when you try to read a chunk that is beyond the end of the file, plus the ability to automatically reduce the size of the last chunk, since not all file sizes are multiples of your chunk size (in your case not all files are exactly X megabytes in size, where X is any valid integer).
procedure readFileChunk(FileName: string; var MS: TMemoryStream; ChunkNo: Integer; ChunkSize: Int64);
var
  fs: TFileStream;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  if ChunkSize * (ChunkNo - 1) <= fs.Size then
  begin
    fs.Position := ChunkSize * (ChunkNo - 1);
    if fs.Position + ChunkSize <= fs.Size then
      ms.CopyFrom(fs, ChunkSize)
    else
      ms.CopyFrom(fs, fs.Size - fs.Position);
  end
  else
    MessageBox(Form2.WindowHandle, 'File does not have so many chunks', 'WARNING!', MB_OK);
  fs.Free;
end;
You use this procedure by calling:
readFileChunk(FileName, MemoryStream, ChunkNumber, ChunkSize);
Make sure you have already created the memory stream before calling this procedure.
Also, if you want to reuse the same memory stream multiple times, don't forget to set its position to 0 before calling this procedure; otherwise new data will be appended to the end of the stream, which keeps increasing the memory stream's size.
UPDATE:
After doing some trials I found out that the problem resides in your GetFileMD5 method.
I can't explain exactly why this happens, but when you pass a TMemoryStream to the TStream parameter, the parameter simply does not seem to accept it, so the MD5 hashing algorithm treats it as an empty handle.
When I changed the parameter type to TMemoryStream instead, the code worked, but then you could no longer pass a TFileStream to the GetFileMD5 method, so it broke the whole-file hash generation that worked before.
SOLUTION:
After doing some more digging I have GREAT news for you.
You don't even need to use TMemoryStreams. The HashStreamAsHex function can accept two optional parameters which allow you to define the starting point of your data and the size of the data block from which you want to generate the MD5 hash string. And this also works with TFileStream.
So in order to generate an MD5 hash string from just a small part of your file, call this:
MD5.HashStreamAsHex(Stream, StartPosition, DataSize);
StartPosition specifies the initial offset into the stream for the hashing operation. When StartPosition contains a positive non-zero value, the stream position is moved to the specified offset prior to calculating the hash value. When StartPosition contains the value -1, the current position of the stream is used as the initial offset into the specified stream.
DataSize indicates the number of bytes from the stream to include in the hashing operation. When DataSize contains a negative value (< 0), the bytes remaining from the current stream position are used for the hashing operation. Otherwise, the number of bytes in DataSize is used. If DataSize is larger than the size of the stream, the smaller of the two values is used for the operation.
In your case, to get the MD5 hash of the first megabyte you would call:
MD5.HashStreamAsHex(Stream, 0, 1024 * 1024);
I believe you can now modify the rest of your code to get this working as you want. If not, tell me where it stopped and I will help you.
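Putting that together, a sketch of a partial-hash helper for the code above might look like this (GetPartialFileMD5 is a made-up name; it uses the same Indy TIdHashMessageDigest5 as the existing GetFileMD5 and simply limits the hashed range):
uses
  Classes, Math, IdHashMessageDigest;

// Hash only the first AMaxBytes bytes of the stream (or the whole stream if
// it is smaller), so huge files never need to be copied into a memory stream.
function GetPartialFileMD5(Stream: TStream; AMaxBytes: Int64 = 1024 * 1024): string;
var
  MD5: TIdHashMessageDigest5;
begin
  MD5 := TIdHashMessageDigest5.Create;
  try
    Result := MD5.HashStreamAsHex(Stream, 0, Min(AMaxBytes, Stream.Size));
  finally
    MD5.Free;
  end;
end;
With that, the Button1Click loop can call GetPartialFileMD5 for every file and only fall back to the full GetFileMD5 when two partial hashes collide, as described at the top of the question.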
I'm assuming that your code does not raise an exception. If it did you surely would have mentioned that. I also assume that the file is large enough for your attempted read.
Your code does copy. If the call to CopyFrom does not raise an exception then the memory stream contains the first 1024000 bytes of the file.
However, after the call to CopyFrom, the memory stream's pointer is at the end of the stream so if you read from it you will not be able to read anything. Perhaps you need to move the stream pointer to the beginning:
ms.Position := 0;
And then read from the memory stream.
1MB = 1024*1024, FWIW.
Update
Probably my assumptions above were incorrect. It seems likely that your code raises an exception because you attempt to read beyond the end of the file.
What you really seem to want to do is to read as much of the first part of the file as possible. That's a two-liner.
BytesToRead := Min(Source.Size-Source.Position, 1024*1024);
Dest.CopyFrom(Source, BytesToRead);

Delphi: Alternative to using Reset/ReadLn for text file reading

I want to process a text file line by line. In the olden days I loaded the file into a StringList:
slFile := TStringList.Create();
slFile.LoadFromFile(filename);
for i := 0 to slFile.Count-1 do
begin
  oneLine := slFile.Strings[i];
  //process the line
end;
The problem with that is that once the file gets to be a few hundred megabytes, I have to allocate a huge chunk of memory, when really I only need enough memory to hold one line at a time. (Plus, you can't really indicate progress while the system is locked up loading the file in step 1.)
Then I tried using the native, and recommended, file I/O routines provided by Delphi:
var
  f: TextFile;
begin
  AssignFile(f, filename);
  Reset(f);
  while not Eof(f) do
  begin
    ReadLn(f, oneLine);
    //process the line
  end;
The problem with AssignFile is that there is no option to read the file without locking (i.e. fmShareDenyNone). The earlier StringList example doesn't support no-lock either, unless you change it to LoadFromStream:
slFile := TStringList.Create;
stream := TFileStream.Create(filename, fmOpenRead or fmShareDenyNone);
slFile.LoadFromStream(stream);
stream.Free;
for i := 0 to slFile.Count-1 do
begin
  oneLine := slFile.Strings[i];
  //process the line
end;
So now, even though I've gained no locks being held, I'm back to loading the entire file into memory.
Is there some alternative to Assign/ReadLn where I can read a file line by line without taking a sharing lock?
I'd rather not drop down to Win32 CreateFile/ReadFile and have to deal with allocating buffers and detecting CR, LF and CRLF.
I thought about memory-mapped files, but there's the difficulty when the entire file doesn't fit (map) into virtual memory, and you then have to map views (pieces) of the file at a time. It starts to get ugly.
I just want Reset with fmShareDenyNone!
With recent Delphi versions, you can use TStreamReader. Construct it with your file stream, and then call its ReadLine method (inherited from TTextReader).
An option for all Delphi versions is to use Peter Below's StreamIO unit, which gives you AssignStream. It works just like AssignFile, but for streams instead of file names. Once you've used that function to associate a stream with a TextFile variable, you can call ReadLn and the other I/O functions on it just like any other file.
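A minimal sketch of the TStreamReader route (it ships in the Classes unit from Delphi 2009 on; the encoding choice and procedure name are just examples):
uses
  Classes, SysUtils;

procedure ProcessTextFileByLine(const FileName: string);
var
  FS: TFileStream;
  Reader: TStreamReader;
  Line: string;
begin
  // fmShareDenyNone is the point of the exercise: other processes may keep the file open.
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    Reader := TStreamReader.Create(FS, TEncoding.Default);
    try
      while not Reader.EndOfStream do
      begin
        Line := Reader.ReadLine;
        // process Line; FS.Position vs FS.Size can drive a progress indicator
      end;
    finally
      Reader.Free;
    end;
  finally
    FS.Free;
  end;
end;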
You can use this sample code:
// Note: written for ANSI Delphi where SizeOf(Char) = 1; for Delphi 2009 and
// later change Char/string to AnsiChar/AnsiString.
TTextStream = class(TObject)
private
  FHost: TStream;
  FOffset, FSize: Integer;
  FBuffer: array[0..1023] of Char;
  FEOF: Boolean;
  function FillBuffer: Boolean;
protected
  property Host: TStream read FHost;
public
  constructor Create(AHost: TStream);
  destructor Destroy; override;
  function ReadLn: string; overload;
  function ReadLn(out Data: string): Boolean; overload;
  property EOF: Boolean read FEOF;
  property HostStream: TStream read FHost;
  property Offset: Integer read FOffset write FOffset;
end;

{ TTextStream }

constructor TTextStream.Create(AHost: TStream);
begin
  FHost := AHost;
  FillBuffer;
end;

destructor TTextStream.Destroy;
begin
  FHost.Free;
  inherited Destroy;
end;

function TTextStream.FillBuffer: Boolean;
begin
  FOffset := 0;
  FSize := FHost.Read(FBuffer, SizeOf(FBuffer));
  Result := FSize > 0;
  FEOF := not Result;
end;

function TTextStream.ReadLn(out Data: string): Boolean;
var
  Len, Start: Integer;
  EOLChar: Char;
begin
  Data := '';
  Result := False;
  repeat
    if FOffset >= FSize then
      if not FillBuffer then
        Exit; // no more data to read from stream -> exit
    Result := True;
    Start := FOffset;
    while (FOffset < FSize) and (not (FBuffer[FOffset] in [#13, #10])) do
      Inc(FOffset);
    Len := FOffset - Start;
    if Len > 0 then begin
      SetLength(Data, Length(Data) + Len);
      Move(FBuffer[Start], Data[Succ(Length(Data) - Len)], Len);
    end;
  until FOffset <> FSize; // EOL char found
  EOLChar := FBuffer[FOffset];
  Inc(FOffset);
  if (FOffset = FSize) then
    if not FillBuffer then
      Exit;
  if FBuffer[FOffset] in ([#13, #10] - [EOLChar]) then begin
    Inc(FOffset);
    if (FOffset = FSize) then
      FillBuffer;
  end;
end;

function TTextStream.ReadLn: string;
begin
  ReadLn(Result);
end;
Usage:
procedure ReadFileByLine(Filename: string);
var
  sLine: string;
  tsFile: TTextStream;
begin
  tsFile := TTextStream.Create(TFileStream.Create(Filename, fmOpenRead or fmShareDenyWrite));
  try
    while tsFile.ReadLn(sLine) do
    begin
      //sLine is your line
    end;
  finally
    tsFile.Free;
  end;
end;
If you need support for ansi and Unicode in older Delphis, you can use my GpTextFile or GpTextStream.
It seems the FileMode variable is not valid for text files, but my tests showed that reading from the file multiple times is no problem. You didn't mention it in your question, but if you are not going to write to the text file while it is being read, you should be good.
What I do is use a TFileStream but I buffer the input into fairly large blocks (e.g. a few megabytes each) and read and process one block at a time. That way I don't have to load the whole file at once.
It works quite quickly that way, even for large files.
I do have a progress indicator. As I load each block, I increment it by the fraction of the file that has additionally been loaded.
Reading one line at a time, without something to do your buffering, is simply too slow for large files.
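A minimal sketch of that block-buffered approach, under a couple of assumptions (single-byte text encoding, LF or CRLF line endings; the procedure name and the 4 MB block size are arbitrary choices, not the answerer's actual code):
uses
  Classes, SysUtils;

procedure ProcessFileByBlocks(const FileName: string);
const
  BlockSize = 4 * 1024 * 1024; // read a few megabytes at a time
var
  Stream: TFileStream;
  Block: TBytes;
  Chunk, Line, Carry: AnsiString; // Carry holds a partial line between blocks
  BytesRead, P, Start: Integer;
begin
  Carry := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    SetLength(Block, BlockSize);
    repeat
      BytesRead := Stream.Read(Block[0], BlockSize);
      if BytesRead = 0 then
        Break;
      SetString(Chunk, PAnsiChar(@Block[0]), BytesRead);
      Chunk := Carry + Chunk;
      Start := 1;
      for P := 1 to Length(Chunk) do
        if Chunk[P] = #10 then
        begin
          Line := Copy(Chunk, Start, P - Start);
          if (Line <> '') and (Line[Length(Line)] = #13) then
            SetLength(Line, Length(Line) - 1); // strip the CR of a CRLF pair
          // process Line here; Stream.Position vs Stream.Size gives progress
          Start := P + 1;
        end;
      Carry := Copy(Chunk, Start, MaxInt); // unfinished line, if any
    until False;
    if Carry <> '' then
    begin
      // process the final line that has no trailing line break
    end;
  finally
    Stream.Free;
  end;
end;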
I had the same problem a few years ago, especially the problem of locking the file. What I did was use the low-level ReadFile from the Windows API. I know the question is old (my answer comes two years later), but perhaps my contribution can help someone in the future.
const
  BUFF_SIZE = $8000;
var
  dwread: LongWord;
  hFile: THandle;
  datafile: array[0..BUFF_SIZE-1] of char;
  apos: Integer;
  myEOF: Boolean;

  hFile := CreateFile(PChar(filename), GENERIC_READ, FILE_SHARE_READ or FILE_SHARE_WRITE, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_READONLY, 0);
  SetFilePointer(hFile, 0, nil, FILE_BEGIN);
  myEOF := false;
  try
    ReadFile(hFile, datafile, BUFF_SIZE, dwread, nil);
    while (dwread > 0) and (not myEOF) do
    begin
      if dwread = BUFF_SIZE then
      begin
        apos := LastDelimiter(#10#13, datafile);
        if apos = BUFF_SIZE then inc(apos);
        SetFilePointer(hFile, apos - BUFF_SIZE, nil, FILE_CURRENT);
      end
      else myEOF := true;
      ReadFile(hFile, datafile, BUFF_SIZE, dwread, nil);
    end;
  finally
    CloseHandle(hFile);
  end;
For me the speed improvement appeared to be significant.
Why not simply read the lines of the file directly from the TFileStream itself one at a time ?
i.e. (in pseudocode):
readline:
  while NOT EOF and (readchar <> EOL) do
    appendchar to result

while NOT EOF do
begin
  s := readline
  process s
end;
One problem you may find with this is that iirc TFileStream is not buffered so performance over a large file is going to be sub-optimal. However, there are a number of solutions to the problem of non-buffered streams, including this one, that you may wish to investigate if this approach solves your initial problem.
