Fast read/write from file in delphi - delphi

I am loading a file into a array in binary form this seems to take a while
is there a better faster more efficent way to do this.
i am using a similar method for writing back to the file.
procedure openfile(fname:string);
var
myfile: file;
filesizevalue,i:integer;
begin
assignfile(myfile,fname);
filesizevalue:=GetFileSize(fname); //my method
SetLength(dataarray, filesizevalue);
i:=0;
Reset(myFile, 1);
while not Eof(myFile) do
begin
BlockRead(myfile,dataarray[i], 1);
i:=i+1;
end;
CloseFile(myfile);
end;

If your really want to read a binary file fast, let windows worry about buffering ;-) by using Memory Mapped Files. Using this you can simple map a file to a memory location an read like it's an array.
Your function would become:
procedure openfile(fname:string);
var
InputFile: TMappedFile;
begin
InputFile := TMappedFile.Create;
try
InputFile.MapFile(fname);
SetLength(dataarray, InputFile.Size);
Move(PByteArray(InputFile.Content)[0], Result[0], InputFile.Size);
finally
InputFile.Free;
end;
end;
But I would suggest not using the global variable dataarray, but either pass it as a var in the parameter, or use a function which returns the resulting array.
procedure ReadBytesFromFile(const AFileName : String; var ADestination : TByteArray);
var
InputFile : TMappedFile;
begin
InputFile := TMappedFile.Create;
try
InputFile.MapFile(AFileName);
SetLength(ADestination, InputFile.Size);
Move(PByteArray(InputFile.Content)[0], ADestination[0], InputFile.Size);
finally
InputFile.Free;
end;
end;
The TMappedFile is from my article Fast reading of files using Memory Mapping, this article also contains an example of how to use it for more "advanced" binary files.

You generally shouldn't read files byte for byte. Use BlockRead with a larger value (512 or 1024 often are best) and use its return value to find out how many bytes were read.
If the size isn't too large (and your use of SetLength seems to support this), you can also use one BlockRead call reading the complete file at once. So, modifying your approach, this would be:
AssignFile(myfile,fname);
filesizevalue := GetFileSize(fname);
Reset(myFile, 1);
SetLength(dataarray, filesizevalue);
BlockRead(myFile, dataarray[0], filesizevalue);
CloseFile(myfile);
Perhaps you could also change the procedure to a boolean function named OpenAndReadFile and return false if the file couldn't be opened or read.

It depends on the file format. If it consists of several identical records, you can decide to create a file of that record type.
For example:
type
TMyRecord = record
fieldA: integer;
..
end;
TMyFile = file of TMyRecord;
const
cBufLen = 100 * sizeof(TMyRecord);
var
file: TMyFile;
i : Integer;
begin
AssignFile(file, filename);
Reset(file);
i := 0;
try
while not Eof(file) do begin
BlockRead(file, dataarray[i], cBufLen);
Inc(i, cBufLen);
end;
finally
CloseFile(file);
end;
end;

If it's a long enough file that reading it this way takes a noticeable amount of time, I'd use a stream instead. The block read will be a lot faster, and there's no loops to worry about. Something like this:
procedure openfile(fname:string);
var
myfile: TFileStream;
filesizevalue:integer;
begin
filesizevalue:=GetFileSize(fname); //my method
SetLength(dataarray, filesizevalue);
myFile := TFileStream.Create(fname);
try
myFile.seek(0, soFromBeginning);
myFile.ReadBuffer(dataarray[0], filesizevalue);
finally
myFile.free;
end;
end;
It appears from your code that your record size is 1 byte long. If not, then change the read line to:
myFile.ReadBuffer(dataarray[0], filesizevalue * SIZE);
or something similar.

Look for a buffered TStream descendant. It will make your code a lot faster as the disk read is done fast, but you can loop through the buffer easily. There are various about, or you can write your own.

If you're feeling very bitheaded, you can bypass Win32 altogether and call the NT Native API function ZwOpenFile() which in my informal testing does shave a tiny bit off. Otherwise, I'd use Davy's Memory Mapped File solution above.

Related

Copying File Fails in when open in fmOpenReadWriteMode

i am working on a little byte patching program but i encountered an error.
copying the file before modifying fails with no error, (no copied output is seen) but the file patches successfully.
Here is the Patch Code
procedure DoMyPatch();
var
i: integer;
FileName: string;
input: TFileStream;
FileByteArray, ExtractedByteArray: array of Byte;
begin
FileName := 'Cute1.res';
try
input := TFileStream.Create(FileName, fmOpenReadWrite);
except
begin
ShowMessage('Error Opening file');
Exit;
end
end;
input.Position := 0;
SetLength(FileByteArray, input.size);
input.Read(FileByteArray[0], Length(FileByteArray));
for i := 0 to Length(FileByteArray) do
begin
SetLength(ExtractedByteArray, Length(OriginalByte));
ExtractedByteArray := Copy(FileByteArray, i, Length(OriginalByte));
// function that compares my array of bytes
if CompareByteArrays(ExtractedByteArray, OriginalByte) = True then
begin
// Begin Patching
CopyFile(PChar(FileName), PChar(ChangeFileExt(FileName, '.BAK')),
true); =======>>> fails at this point, no copied output is seen.
input.Seek(i, SoFromBeginning);
input.Write(BytetoWrite[0], Length(BytetoWrite)); =====>>> patches successfully
input.Free;
ShowMessage('Patch Success');
Exit;
end;
end;
if Assigned(input) then
begin
input.Free;
end;
ShowMessage('Patch Failed');
end;
sidenote : it copies fine if i close the filestream before attempting copy.
by the way, i have tested it on Delphi 7 and XE7.
Thanks
You cannot copy the file because you locked it exclusively when you opened it for the file stream, which is why CopyFile fails.
You should close the file before attempting to call CopyFile. Which would require you to reopen the file to patch it. Or perhaps open the file with a different sharing mode.
Some other comments:
The exception handling is badly implemented. Don't catch exceptions here. Let them float up to the high level.
Lifetime management is fluffed. You can easily leak as it stands. You need to learn about try/finally.
You overrun buffers. Valid indices for a dynamic array are 0 to Length(arr)-1 inclusive. Or use low() and high().
You don't check the value returned by CopyFile. Wrap it with a call to Win32Check.
The Copy function returns a new array. So you make a spurious call to SetLength. To copy the entire array use the one parameter overload of Copy.
Showing messages in this function is probably a mistake. Better to let the caller provide user feedback.
There are loads of other oddities in the code and I've run out of energy to point them all out. I think I got the main ones.

TFileStream read huge files piece by piece

Earlier today I opened a question here asking if my method to scan files in computer was correct. As solution, I received a few tips, and the one of the solutions I thought: "this need to be solved urgent!", was saying about memory overflow, once I was reading the files entirely in memory. So I started trying to find a way to read the files piece by piece, and I got something (wrong/bogus), that I need some help to figure out how to do this correctly.
The method is simple like this for now:
procedure ScanFile(FileName: string);
const
MAX_SIZE = 100*1024*1024;
var
i, aux, ReadLimit: integer;
MyFile: TFileStream;
Target: AnsiString;
PlainText: String;
Buff: array of byte;
TotalSize: Int64;
begin
if (POS('.exe', FileName) = 0) and (POS('.dll', FileName) = 0) and
(POS('.sys', FileName) = 0) then //yeah I know it's not the best way...
begin
try
MyFile:= TFileStream.Create(FileName, fmOpenRead);
except on E: EFOpenError do
MyFile:= NIL;
end;
if MyFile <> NIL then
try
TotalSize:= MyFile.Size;
while TotalSize > 0 do begin
ReadLimit:= Min(TotalSize, MAX_SIZE);
SetLength(Buff, ReadLimit);
MyFile.ReadBuffer(Buff[0], ReadLimit);
PlainText:= RemoveNulls(Buff); //this is to transform the array of bytes in string, I posted the code below too...
for i:= 1 to Length(PlainText) do
begin //Begin the search..
end;
dec(TotalSize, ReadLimit);
end;
finally
MyFile.Free;
end;
end;
Code for RemoveNulls is:
function RemoveNulls(const Buff: array of byte): String;
var
i: integer;
begin
for i:= 0 to Length(Buff) do
begin
if Buff[i] <> 0 then
Result:= Result + Chr(Ord(Buff[i]));
end;
end;
Ok, the problems I got with this code so far was:
1- each time the while is repeated, I get more memory consumed, when I was expecting to get only MAX 100MB as described in the MAX_SIZE variable, right?
2- I created a file with 2 occurrences of what should be filtered, and for some unknown reason I got about 10 repeated occurrences, looks like I'm scanning the file repeatedly.
I appreciate your help guys, and if someone have this kind of code already done, post here please, I don't pretend to re-create the wheel...
I'd say that RemoveNulls is your problem. Suppose that you just read 100MB into a string that you passed to RemoveNulls. You would then allocate a string of length 1. The reallocate to length 2. Then to length 3. Then to length 4. And so on, all the way to length 100*1024*1024.
That process will fragment your memory, as well as being appallingly slow. Heap allocation is to be avoided when performance matters. You've no need for it at all. Read a chunk of the file, and search directly in the buffer that you read.
There are various problems with your code that I can see:
Your file extension check is broken, as I described in your previous question.
You are not handling exceptions correctly, as I described in your previous question.
Your for loop in RemoveNulls has buffer overrun. Loop from low() to high().
It's not possible to comment on the search code since that's not present in the question.

Create and/or Write to a file

I feel like this should be easy, but google is totally failing me at the moment. I want to open a file, or create it if it doesn't exist, and write to it.
The following
AssignFile(logFile, 'Test.txt');
Append(logFile);
throws an error on the second line when the file doesn't exist yet, which I assume is expected. But I'm really failing at finding out how to a) test if the file exists and b) create it when needed.
FYI, working in Delphi XE.
You can use the FileExists function and then use Append if exist or Rewrite if not.
AssignFile(logFile, 'Test.txt');
if FileExists('test.txt') then
Append(logFile)
else
Rewrite(logFile);
//do your stuff
CloseFile(logFile);
Any solution that uses FileExists to choose how to open the file has a race condition. If the file's existence changes between the time you test it and the time you attempt to open the file, your program will fail. Delphi doesn't provide any way to solve that problem with its native file I/O routines.
If your Delphi version is new enough to offer it, you can use the TFile.Open with the fmOpenOrCreate open mode, which does exactly what you want; it returns a TFileStream.
Otherwise, you can use the Windows API function CreateFile to open your file instead. Set the dwCreationDisposition parameter to OPEN_ALWAYS, which tells it to create the file if it doesn't already exist.
You should be using TFileStream instead. Here's a sample that will create a file if it doesn't exist, or write to it if it does:
var
FS: TFileStream;
sOut: string;
i: Integer;
Flags: Word;
begin
Flags := fmOpenReadWrite;
if not FileExists('D:\Temp\Junkfile.txt') then
Flags := Flags or fmCreate;
FS := TFileStream.Create('D:\Temp\Junkfile.txt', Flags);
try
FS.Position := FS.Size; // Will be 0 if file created, end of text if not
sOut := 'This is test line %d'#13#10;
for i := 1 to 10 do
begin
sOut := Format(sOut, [i]);
FS.Write(sOut[1], Length(sOut) * SizeOf(Char));
end;
finally
FS.Free;
end;
end;
If you are just doing something simple, the IOUtils Unit is a lot easier. It has a lot of utilities for writing to files.
e.g.
procedure WriteAllText(const Path: string; const Contents: string);
overload; static;
Creates a new file, writes the specified string to the file, and then
closes the file. If the target file already exists, it is overwritten.
You can also use the load/save feature in a TStringList to solve your problem.
This might be a bad solution, because the whole file will be loaded into memory, modified in memory and then saved to back to disk. (As opposed to your solution where you just write directly to the file). It's obviously a bad solution for multiuser situations.
But this approach is OK for smaller files, and it is easy to work with and easy understand.
const
FileName = 'test.txt';
var
strList: TStringList;
begin
strList := TStringList.Create;
try
if FileExists(FileName) then
strList.LoadFromFile(FileName);
strList.Add('My new line');
strList.SaveToFile(FileName);
finally
strList.Free;
end;
end;

Is Valid IMAGE_DOS_SIGNATURE

I want to check a file has a valid IMAGE_DOS_SIGNATURE (MZ)
function isMZ(FileName : String) : boolean;
var
Signature: Word;
fexe: TFileStream;
begin
result:=false;
try
fexe := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
fexe.ReadBuffer(Signature, SizeOf(Signature));
if Signature = $5A4D { 'MZ' } then
result:=true;
finally
fexe.free;
end;
end;
I know I can use some code in Windows unit to check the IMAGE_DOS_SIGNATURE. The problem is I want the fastest way to check IMAGE_DOS_SIGNATURE (for a big file). I need your some suggestion about my code or maybe a new code?
Thanks
The size of the file doesn't matter because your code only reads the first two bytes.
Any overhead from allocating and using a TFileStream, which goes through SysUtils.FileRead before reaching Win32 ReadFile, ought to be all but invisible noise compared to the cost of seeking in the only situation where it should matter, where you're scanning through hundreds of executables.
There might possibly be some benefit in tweaking Windows' caching by using the raw WinAPI, but I would expect it to be very marginal.

How can I get this File Writing code to work with Unicode (Delphi)

I had some code before I moved to Unicode and Delphi 2009 that appended some text to a log file a line at a time:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var F, C1 : dword;
begin
if LogFileName <> '' then begin
F := CreateFileA(Pchar(LogFileName), GENERIC_READ or GENERIC_WRITE, 0, nil, OPEN_ALWAYS, 0, 0);
if F <> 0 then begin
SetFilePointer(F, 0, nil, FILE_END);
S := S + #13#10;
WriteFile(F, Pchar(S)^, Length(S), C1, nil);
CloseHandle(F);
end;
end;
end;
But CreateFileA and WriteFile are binary file handlers and are not appropriate for Unicode.
I need to get something to do the equivalent under Delphi 2009 and be able to handle Unicode.
The reason why I'm opening and writing and then closing the file for each line is simply so that other programs (such as WordPad) can open the file and read it while the log is being written.
I have been experimenting with TFileStream and TextWriter but there is very little documentation on them and few examples.
Specifically, I'm not sure if they're appropriate for this constant opening and closing of the file. Also I'm not sure if they can make the file available for reading while they have it opened for writing.
Does anyone know of a how I can do this in Delphi 2009 or later?
Conclusion:
Ryan's answer was the simplest and the one that led me to my solution. With his solution, you also have to write the BOM and convert the string to UTF8 (as in my comment to his answer) and then that worked just fine.
But then I went one step further and investigated TStreamWriter. That is the equivalent of the .NET function of the same name. It understands Unicode and provides very clean code.
My final code is:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var F: TStreamWriter;
begin
if LogFileName <> '' then begin
F := TStreamWriter.Create(LogFileName, true, TEncoding.UTF8);
try
F.WriteLine(S);
finally
F.Free;
end;
end;
Finally, the other aspect I discovered is if you are appending a lot of lines (e.g. 1000 or more), then the appending to the file takes longer and longer and it becomes quite inefficient.
So I ended up not recreating and freeing the LogFile each time. Instead I keep it open and then it is very fast. The only thing I can't seem to do is allow viewing of the file with notepad while it is being created.
For logging purposes why use Streams at all?
Why not use TextFiles? Here is a very simple example of one of my logging routines.
procedure LogToFile(Data:string);
var
wLogFile: TextFile;
begin
AssignFile(wLogFile, 'C:\MyTextFile.Log');
{$I-}
if FileExists('C:\MyTextFile.Log') then
Append(wLogFile)
else
ReWrite(wLogFile);
WriteLn(wLogfile, S);
CloseFile(wLogFile);
{$I+}
IOResult; //Used to clear any possible remaining I/O errors
end;
I actually have a fairly extensive logging unit that uses critical sections for thread safety, can optionally be used for internal logging via the OutputDebugString command as well as logging specified sections of code through the use of sectional identifiers.
If anyone is interested I'll gladly share the code unit here.
Char and string are Wide since D2009. Thus you should use CreateFile instead of CreateFileA!
If you werite the string you shoudl use Length( s ) * sizeof( Char ) as the byte length and not only Length( s ). because of the widechar issue. If you want to write ansi chars, you should define s as AnsiString or UTF8String and use sizeof( AnsiChar ) as a multiplier.
Why are you using the Windows API function instead of TFileStream defined in classes.pas?
Try this little function I whipped up just for you.
procedure AppendToLog(filename,line:String);
var
fs:TFileStream;
ansiline:AnsiString;
amode:Integer;
begin
if not FileExists(filename) then
amode := fmCreate
else
amode := fmOpenReadWrite;
fs := TFileStream.Create(filename,{mode}amode);
try
if (amode<>fmCreate) then
fs.Seek(fs.Size,0); {go to the end, append}
ansiline := AnsiString(line)+AnsiChar(#13)+AnsiChar(#10);
fs.WriteBuffer(PAnsiChar(ansiline)^,Length(ansiline));
finally
fs.Free;
end;
Also, try this UTF8 version:
procedure AppendToLogUTF8(filename, line: UnicodeString);
var
fs: TFileStream;
preamble:TBytes;
outpututf8: RawByteString;
amode: Integer;
begin
if not FileExists(filename) then
amode := fmCreate
else
amode := fmOpenReadWrite;
fs := TFileStream.Create(filename, { mode } amode, fmShareDenyWrite);
{ sharing mode allows read during our writes }
try
{internal Char (UTF16) codepoint, to UTF8 encoding conversion:}
outpututf8 := Utf8Encode(line); // this converts UnicodeString to WideString, sadly.
if (amode = fmCreate) then
begin
preamble := TEncoding.UTF8.GetPreamble;
fs.WriteBuffer( PAnsiChar(preamble)^, Length(preamble));
end
else
begin
fs.Seek(fs.Size, 0); { go to the end, append }
end;
outpututf8 := outpututf8 + AnsiChar(#13) + AnsiChar(#10);
fs.WriteBuffer(PAnsiChar(outpututf8)^, Length(outpututf8));
finally
fs.Free;
end;
end;
If you try to use text file or Object Pascal typed/untyped files in a multithreaded application you gonna have a bad time.
No kidding - the (Object) Pascal standard file I/O uses global variables to set file mode and sharing. If your application runs in more than one thread (or fiber if anyone still use them) using standard file operations could result in access violations and unpredictable behavior.
Since one of the main purposes of logging is debugging a multithreaded application, consider using other means of file I/O: Streams and Windows API.
(And yes, I know it is not really an answer to the original question, but I do not wish to log in - therefor I do not have the reputation score to comment on Ryan J. Mills's practically wrong answer.)

Resources