Is Valid IMAGE_DOS_SIGNATURE - delphi

I want to check whether a file has a valid IMAGE_DOS_SIGNATURE (MZ):
function isMZ(FileName : String) : boolean;
var
  Signature: Word;
  fexe: TFileStream;
begin
  result:=false;
  // Create the stream before the try block, so Free is never called on an unassigned variable
  fexe := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    fexe.ReadBuffer(Signature, SizeOf(Signature));
    if Signature = $5A4D { 'MZ' } then
      result:=true;
  finally
    fexe.free;
  end;
end;
I know I can use some code in the Windows unit to check the IMAGE_DOS_SIGNATURE. The problem is that I want the fastest way to check IMAGE_DOS_SIGNATURE (for a big file). I would like some suggestions about my code, or maybe new code?
Thanks

The size of the file doesn't matter because your code only reads the first two bytes.
Any overhead from allocating and using a TFileStream, which goes through SysUtils.FileRead before reaching Win32 ReadFile, ought to be all but invisible noise compared to the cost of seeking in the only situation where it should matter, where you're scanning through hundreds of executables.
There might possibly be some benefit in tweaking Windows' caching by using the raw WinAPI, but I would expect it to be very marginal.
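For comparison, here is a minimal sketch of the raw WinAPI variant mentioned above. It assumes Delphi XE2 or later (scoped unit names); the function name IsMZRaw is hypothetical. It is not expected to be measurably faster than the TFileStream version, it only shows what the direct calls look like.

```delphi
program CheckMZ;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

// Hypothetical helper: True when the first two bytes of the file are 'MZ'.
function IsMZRaw(const FileName: string): Boolean;
var
  H: THandle;
  Signature: Word;
  BytesRead: DWORD;
begin
  Result := False;
  // Same sharing as fmShareDenyNone in the original code
  H := CreateFile(PChar(FileName), GENERIC_READ,
    FILE_SHARE_READ or FILE_SHARE_WRITE, nil, OPEN_EXISTING,
    FILE_ATTRIBUTE_NORMAL, 0);
  if H = INVALID_HANDLE_VALUE then
    Exit; // could not open: treated as "not MZ" in this sketch
  try
    Signature := 0;
    // A file shorter than two bytes gives BytesRead < 2 instead of an exception
    if ReadFile(H, Signature, SizeOf(Signature), BytesRead, nil) and
       (BytesRead = SizeOf(Signature)) then
      Result := Signature = IMAGE_DOS_SIGNATURE; // $5A4D, declared in Winapi.Windows
  finally
    CloseHandle(H);
  end;
end;

begin
  // The running executable is itself a PE file, so it is a convenient input
  Writeln(IsMZRaw(ParamStr(0)));
end.
```

Unlike ReadBuffer, this version returns False for files that are too short or cannot be opened rather than raising an exception.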

Related

TFileStream read huge files piece by piece

Earlier today I opened a question here asking whether my method of scanning files on a computer was correct. I received a few tips, and the one I thought needed to be solved urgently was about memory overflow, since I was reading the files entirely into memory. So I started looking for a way to read the files piece by piece, and I came up with something (wrong/bogus) that I need some help to get right.
The method is simple like this for now:
procedure ScanFile(FileName: string);
const
MAX_SIZE = 100*1024*1024;
var
i, aux, ReadLimit: integer;
MyFile: TFileStream;
Target: AnsiString;
PlainText: String;
Buff: array of byte;
TotalSize: Int64;
begin
if (POS('.exe', FileName) = 0) and (POS('.dll', FileName) = 0) and
(POS('.sys', FileName) = 0) then //yeah I know it's not the best way...
begin
try
MyFile:= TFileStream.Create(FileName, fmOpenRead);
except on E: EFOpenError do
MyFile:= NIL;
end;
if MyFile <> NIL then
try
TotalSize:= MyFile.Size;
while TotalSize > 0 do begin
ReadLimit:= Min(TotalSize, MAX_SIZE);
SetLength(Buff, ReadLimit);
MyFile.ReadBuffer(Buff[0], ReadLimit);
PlainText:= RemoveNulls(Buff); //this is to transform the array of bytes in string, I posted the code below too...
for i:= 1 to Length(PlainText) do
begin //Begin the search..
end;
dec(TotalSize, ReadLimit);
end;
finally
MyFile.Free;
end;
end;
end;
Code for RemoveNulls is:
function RemoveNulls(const Buff: array of byte): String;
var
i: integer;
begin
for i:= 0 to Length(Buff) do
begin
if Buff[i] <> 0 then
Result:= Result + Chr(Ord(Buff[i]));
end;
end;
OK, the problems I have with this code so far are:
1- Each time the while loop repeats, more memory is consumed, when I was expecting to use at most 100MB, as set by the MAX_SIZE constant, right?
2- I created a file with 2 occurrences of what should be filtered, and for some unknown reason I got about 10 repeated occurrences; it looks like I'm scanning the file repeatedly.
I appreciate your help, and if someone already has this kind of code, please post it here; I don't intend to reinvent the wheel...
I'd say that RemoveNulls is your problem. Suppose that you just read 100MB into a buffer that you passed to RemoveNulls. You would then allocate a string of length 1. Then reallocate to length 2. Then to length 3. Then to length 4. And so on, all the way to length 100*1024*1024.
That process will fragment your memory, as well as being appallingly slow. Heap allocation is to be avoided when performance matters. You've no need for it at all. Read a chunk of the file, and search directly in the buffer that you read.
There are various problems with your code that I can see:
Your file extension check is broken, as I described in your previous question.
You are not handling exceptions correctly, as I described in your previous question.
Your for loop in RemoveNulls has buffer overrun. Loop from low() to high().
It's not possible to comment on the search code since that's not present in the question.
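As an illustration of "read a chunk and search directly in the buffer", here is a minimal sketch. It assumes Delphi XE2 or later; CountPattern and CHUNK_SIZE are hypothetical names, and the search is a plain byte comparison rather than whatever the question's search does. The last Length(Pattern) - 1 bytes of each chunk are carried over so that a match straddling two chunks is not missed.

```delphi
program ChunkScan;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Math;

const
  CHUNK_SIZE = 64 * 1024; // hypothetical chunk size; memory use stays near this value

// Counts (possibly overlapping) occurrences of Pattern, reading one chunk at a time.
function CountPattern(const FileName: string; const Pattern: TBytes): Integer;
var
  FS: TFileStream;
  Buf: TBytes;
  Keep, Got, Avail, i: Integer;
begin
  Result := 0;
  if Length(Pattern) = 0 then
    Exit;
  // One buffer, allocated once: room for a chunk plus the carried-over tail
  SetLength(Buf, CHUNK_SIZE + Length(Pattern) - 1);
  Keep := 0;
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      Got := FS.Read(Buf[Keep], CHUNK_SIZE); // Read returns the bytes actually read
      Avail := Keep + Got;
      // Search directly in the buffer; no intermediate string is built
      for i := 0 to Avail - Length(Pattern) do
        if CompareMem(@Buf[i], @Pattern[0], Length(Pattern)) then
          Inc(Result);
      // Carry the tail forward so a match split across chunks is still found
      Keep := Min(Length(Pattern) - 1, Avail);
      if Keep > 0 then
        Move(Buf[Avail - Keep], Buf[0], Keep);
    until Got = 0;
  finally
    FS.Free;
  end;
end;

begin
  // Example: count 'MZ' byte pairs in the running executable
  Writeln(CountPattern(ParamStr(0), TBytes.Create($4D, $5A)));
end.
```

Because the carried tail is shorter than the pattern, a match can never be counted twice.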

Create and/or Write to a file

I feel like this should be easy, but google is totally failing me at the moment. I want to open a file, or create it if it doesn't exist, and write to it.
The following
AssignFile(logFile, 'Test.txt');
Append(logFile);
throws an error on the second line when the file doesn't exist yet, which I assume is expected. But I'm really failing at finding out how to a) test if the file exists and b) create it when needed.
FYI, working in Delphi XE.
You can use the FileExists function and then use Append if the file exists or Rewrite if it does not.
AssignFile(logFile, 'Test.txt');
if FileExists('test.txt') then
Append(logFile)
else
Rewrite(logFile);
//do your stuff
CloseFile(logFile);
Any solution that uses FileExists to choose how to open the file has a race condition. If the file's existence changes between the time you test it and the time you attempt to open the file, your program will fail. Delphi doesn't provide any way to solve that problem with its native file I/O routines.
If your Delphi version is new enough to offer it, you can use the TFile.Open with the fmOpenOrCreate open mode, which does exactly what you want; it returns a TFileStream.
Otherwise, you can use the Windows API function CreateFile to open your file instead. Set the dwCreationDisposition parameter to OPEN_ALWAYS, which tells it to create the file if it doesn't already exist.
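A minimal sketch of the CreateFile route, assuming Delphi XE2 or later; AppendLine is a hypothetical helper and the text is written as raw UTF-8 bytes. OPEN_ALWAYS opens the file or creates it in a single call, so there is no separate existence test to race against.

```delphi
program AppendDemo;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

// Hypothetical helper: appends one line, creating the file if needed.
procedure AppendLine(const FileName: string; const Line: UTF8String);
var
  H: THandle;
  Data: UTF8String;
  Written: DWORD;
begin
  H := CreateFile(PChar(FileName), GENERIC_WRITE, FILE_SHARE_READ, nil,
    OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
  if H = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    SetFilePointer(H, 0, nil, FILE_END); // move to the end before writing
    Data := Line + #13#10;
    if not WriteFile(H, PAnsiChar(Data)^, Length(Data), Written, nil) then
      RaiseLastOSError;
  finally
    CloseHandle(H);
  end;
end;

begin
  AppendLine('Test.txt', 'first line');
  AppendLine('Test.txt', 'second line');
end.
```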
You should be using TFileStream instead. Here's a sample that will create a file if it doesn't exist, or write to it if it does:
var
FS: TFileStream;
sOut: string;
i: Integer;
Flags: Word;
begin
Flags := fmOpenReadWrite;
if not FileExists('D:\Temp\Junkfile.txt') then
Flags := Flags or fmCreate;
FS := TFileStream.Create('D:\Temp\Junkfile.txt', Flags);
try
FS.Position := FS.Size; // Will be 0 if file created, end of text if not
for i := 1 to 10 do
begin
// Format from a constant template; reusing sOut as the template would repeat line 1
sOut := Format('This is test line %d'#13#10, [i]);
FS.Write(sOut[1], Length(sOut) * SizeOf(Char));
end;
finally
FS.Free;
end;
end;
If you are just doing something simple, the IOUtils Unit is a lot easier. It has a lot of utilities for writing to files.
e.g.
procedure WriteAllText(const Path: string; const Contents: string);
overload; static;
Creates a new file, writes the specified string to the file, and then
closes the file. If the target file already exists, it is overwritten.
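Since WriteAllText overwrites an existing file, the same unit's TFile.AppendAllText is probably closer to what the question asks for: it appends to an existing file and creates the file otherwise. A minimal sketch, assuming Delphi XE2 or later for the scoped unit names:

```delphi
program AppendAllTextDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

begin
  // Creates Test.txt on the first call, appends on later calls
  TFile.AppendAllText('Test.txt', 'My new line' + sLineBreak, TEncoding.UTF8);
  TFile.AppendAllText('Test.txt', 'Another line' + sLineBreak, TEncoding.UTF8);
end.
```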
You can also use the load/save feature in a TStringList to solve your problem.
This might be a bad solution, because the whole file will be loaded into memory, modified in memory and then saved back to disk. (As opposed to your solution where you just write directly to the file). It's obviously a bad solution for multiuser situations.
But this approach is OK for smaller files, and it is easy to work with and easy to understand.
const
FileName = 'test.txt';
var
strList: TStringList;
begin
strList := TStringList.Create;
try
if FileExists(FileName) then
strList.LoadFromFile(FileName);
strList.Add('My new line');
strList.SaveToFile(FileName);
finally
strList.Free;
end;
end;

How can I get this File Writing code to work with Unicode (Delphi)

I had some code before I moved to Unicode and Delphi 2009 that appended some text to a log file a line at a time:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var F, C1 : dword;
begin
if LogFileName <> '' then begin
F := CreateFileA(Pchar(LogFileName), GENERIC_READ or GENERIC_WRITE, 0, nil, OPEN_ALWAYS, 0, 0);
if F <> 0 then begin
SetFilePointer(F, 0, nil, FILE_END);
S := S + #13#10;
WriteFile(F, Pchar(S)^, Length(S), C1, nil);
CloseHandle(F);
end;
end;
end;
But CreateFileA and WriteFile are binary file handlers and are not appropriate for Unicode.
I need to get something to do the equivalent under Delphi 2009 and be able to handle Unicode.
The reason why I'm opening and writing and then closing the file for each line is simply so that other programs (such as WordPad) can open the file and read it while the log is being written.
I have been experimenting with TFileStream and TextWriter but there is very little documentation on them and few examples.
Specifically, I'm not sure if they're appropriate for this constant opening and closing of the file. Also I'm not sure if they can make the file available for reading while they have it opened for writing.
Does anyone know how I can do this in Delphi 2009 or later?
Conclusion:
Ryan's answer was the simplest and the one that led me to my solution. With his solution, you also have to write the BOM and convert the string to UTF8 (as in my comment to his answer) and then that worked just fine.
But then I went one step further and investigated TStreamWriter. That is the equivalent of the .NET function of the same name. It understands Unicode and provides very clean code.
My final code is:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var F: TStreamWriter;
begin
if LogFileName <> '' then begin
F := TStreamWriter.Create(LogFileName, true, TEncoding.UTF8);
try
F.WriteLine(S);
finally
F.Free;
end;
end;
end;
Finally, the other aspect I discovered is if you are appending a lot of lines (e.g. 1000 or more), then the appending to the file takes longer and longer and it becomes quite inefficient.
So I ended up not recreating and freeing the LogFile each time. Instead I keep it open and then it is very fast. The only thing I can't seem to do is allow viewing of the file with notepad while it is being created.
For logging purposes why use Streams at all?
Why not use TextFiles? Here is a very simple example of one of my logging routines.
procedure LogToFile(Data:string);
var
wLogFile: TextFile;
begin
AssignFile(wLogFile, 'C:\MyTextFile.Log');
{$I-}
if FileExists('C:\MyTextFile.Log') then
Append(wLogFile)
else
ReWrite(wLogFile);
WriteLn(wLogFile, Data);
CloseFile(wLogFile);
{$I+}
IOResult; //Used to clear any possible remaining I/O errors
end;
I actually have a fairly extensive logging unit that uses critical sections for thread safety, can optionally be used for internal logging via the OutputDebugString command as well as logging specified sections of code through the use of sectional identifiers.
If anyone is interested I'll gladly share the code unit here.
Char and string are Wide since D2009. Thus you should use CreateFile instead of CreateFileA!
If you write the string you should use Length( s ) * sizeof( Char ) as the byte length and not only Length( s ), because of the widechar issue. If you want to write ansi chars, you should define s as AnsiString or UTF8String and use sizeof( AnsiChar ) as a multiplier.
Why are you using the Windows API function instead of TFileStream defined in classes.pas?
Try this little function I whipped up just for you.
procedure AppendToLog(filename,line:String);
var
fs:TFileStream;
ansiline:AnsiString;
amode:Integer;
begin
if not FileExists(filename) then
amode := fmCreate
else
amode := fmOpenReadWrite;
fs := TFileStream.Create(filename,{mode}amode);
try
if (amode<>fmCreate) then
fs.Seek(fs.Size,0); {go to the end, append}
ansiline := AnsiString(line)+AnsiChar(#13)+AnsiChar(#10);
fs.WriteBuffer(PAnsiChar(ansiline)^,Length(ansiline));
finally
fs.Free;
end;
end;
Also, try this UTF8 version:
procedure AppendToLogUTF8(filename, line: UnicodeString);
var
fs: TFileStream;
preamble:TBytes;
outpututf8: RawByteString;
amode: Integer;
begin
if not FileExists(filename) then
amode := fmCreate
else
amode := fmOpenReadWrite;
fs := TFileStream.Create(filename, { mode } amode or fmShareDenyWrite);
{ the share flag belongs in the mode (a third argument would be Rights); it allows reads during our writes }
try
{internal Char (UTF16) codepoint, to UTF8 encoding conversion:}
outpututf8 := Utf8Encode(line); // this converts UnicodeString to WideString, sadly.
if (amode = fmCreate) then
begin
preamble := TEncoding.UTF8.GetPreamble;
fs.WriteBuffer( PAnsiChar(preamble)^, Length(preamble));
end
else
begin
fs.Seek(fs.Size, 0); { go to the end, append }
end;
outpututf8 := outpututf8 + AnsiChar(#13) + AnsiChar(#10);
fs.WriteBuffer(PAnsiChar(outpututf8)^, Length(outpututf8));
finally
fs.Free;
end;
end;
If you try to use text file or Object Pascal typed/untyped files in a multithreaded application you gonna have a bad time.
No kidding - the (Object) Pascal standard file I/O uses global variables to set file mode and sharing. If your application runs in more than one thread (or fiber, if anyone still uses them), using standard file operations could result in access violations and unpredictable behavior.
Since one of the main purposes of logging is debugging a multithreaded application, consider using other means of file I/O: Streams and Windows API.
(And yes, I know it is not really an answer to the original question, but I do not wish to log in - therefore I do not have the reputation score to comment on Ryan J. Mills's practically wrong answer.)

Getting size of a file in Delphi 2010 or later?

Delphi 2010 has a nice set of new file access functions in IOUtils.pas (I especially like the UTC versions of the date-related functions). What I miss so far is something like
TFile.GetSize (const Path : String)
What is the Delphi 2010-way to get the size of a file? Do I have to go back and use FindFirst to access TSearchRec.FindData?
Thanks.
I'm not sure if there's a "Delphi 2010" way, but there is a Windows way that doesn't involve FindFirst and all that jazz.
I threw together this Delphi conversion of that routine (and in the process modified it to handle > 4GB size files, should you need that).
uses
WinApi.Windows;
function FileSize(const aFilename: String): Int64;
var
info: TWin32FileAttributeData;
begin
result := -1;
if NOT GetFileAttributesEx(PChar(aFileName), GetFileExInfoStandard, @info) then
EXIT;
// Widen the high DWORD to Int64 before shifting, otherwise the upper 32 bits are lost
result := Int64(info.nFileSizeLow) or (Int64(info.nFileSizeHigh) shl 32);
end;
You could actually just use GetFileSize() but this requires a file HANDLE, not just a file name, and similar to the GetCompressedFileSize() suggestion, this requires two variables to call. Both GetFileSize() and GetCompressedFileSize() overload their return value, so testing for success and ensuring a valid result is just that little bit more awkward.
GetFileSizeEx() avoids the nitty gritty of handling > 4GB file sizes and detecting valid results, but also requires a file HANDLE, rather than a name, and (as of Delphi 2009 at least, I haven't checked 2010) isn't declared for you in the VCL anywhere, you would have to provide your own import declaration.
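A minimal sketch of such an import declaration and its use, assuming Delphi XE2 or later; FileSizeByHandle is a hypothetical helper. Newer Windows units may already declare GetFileSizeEx, in which case the local declaration below simply takes precedence inside this program.

```delphi
program SizeExDemo;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

// Own import declaration for the kernel32 export
function GetFileSizeEx(hFile: THandle; var lpFileSize: Int64): BOOL; stdcall;
  external kernel32 name 'GetFileSizeEx';

// Hypothetical helper: returns the size in bytes, or -1 on failure.
function FileSizeByHandle(const FileName: string): Int64;
var
  H: THandle;
begin
  Result := -1;
  // Desired access 0 is enough to query the size
  H := CreateFile(PChar(FileName), 0, FILE_SHARE_READ or FILE_SHARE_WRITE,
    nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if H = INVALID_HANDLE_VALUE then
    Exit;
  try
    if not GetFileSizeEx(H, Result) then
      Result := -1;
  finally
    CloseHandle(H);
  end;
end;

begin
  Writeln(FileSizeByHandle(ParamStr(0)));
end.
```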
Using an Indy unit:
uses IdGlobalProtocols;
function FileSizeByName(const AFilename: TIdFileName): Int64;
You can also use DSiFileSize from DSiWin32. Works in "all" Delphis. Internally it calls CreateFile and GetFileSize.
function DSiFileSize(const fileName: string): int64;
var
fHandle: DWORD;
begin
fHandle := CreateFile(PChar(fileName), 0, 0, nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
if fHandle = INVALID_HANDLE_VALUE then
Result := -1
else try
Int64Rec(Result).Lo := GetFileSize(fHandle, @Int64Rec(Result).Hi);
finally CloseHandle(fHandle); end;
end; { DSiFileSize }
I'd like to mention a few pure Delphi ways. Though I think Deltics gave the most speed-efficient answer for the Windows platform, sometimes you want to rely only on the RTL and write portable code that would also work in Delphi for MacOS or in FreePascal/Virtual Pascal/whatever.
There is FileSize function left from Turbo Pascal days.
http://turbopascal.org/system-functions-filepos-and-filesize
http://docwiki.embarcadero.com/CodeExamples/XE2/en/SystemFileSize_(Delphi)
http://docwiki.embarcadero.com/Libraries/XE2/en/System.FileSize
The sample above lacks "read-only" mode setting. You would require that to open r/o file such as one on CD-ROM media or in folder with ACLs set to r/o. Before calling ReSet there should be zero assigned to FileMode global var.
http://docwiki.embarcadero.com/Libraries/XE2/en/System.FileMode
It would not work on files above 2GB size (maybe with negative to cardinal cast - up to 4GB) but is "out of the box" one.
There is one more approach, that you may be familiar if you ever did ASM programming for MS-DOS. You Seek file pointer to 1st byte, then to last byte, and check the difference.
I can't say exactly which Delphi version introduced those, but i think it was already in some ancient version like D5 or D7, though that is just common sense and i cannot check it.
That would take you an extra THandle variable and try-finally block to always close the handle after size was obtained.
Sample of getting length and such
http://docwiki.embarcadero.com/Libraries/XE2/en/System.SysUtils.FileOpen
http://docwiki.embarcadero.com/Libraries/XE2/en/System.SysUtils.FileSeek
Aside from 1st approach this is int64-capable.
It is also compatible with FreePascal, though with some limitations
http://www.freepascal.org/docs-html/rtl/sysutils/fileopen.html
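The FileOpen/FileSeek approach can be sketched as follows. This assumes Delphi XE2 or later, where FileOpen returns a THandle (older versions return an Integer); FileSizeBySeek is a hypothetical name. Seeking to offset 0 relative to the end returns the end position, which is the file size.

```delphi
program SeekSizeDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns the size in bytes, or -1 if the file cannot be opened.
function FileSizeBySeek(const FileName: string): Int64;
var
  H: THandle;
begin
  Result := -1;
  H := FileOpen(FileName, fmOpenRead or fmShareDenyNone);
  if H = THandle(-1) then
    Exit; // FileOpen signals failure with an invalid handle
  try
    // Origin 2 means "relative to the end"; the Int64 overload handles large files
    Result := FileSeek(H, Int64(0), 2);
  finally
    FileClose(H);
  end;
end;

begin
  Writeln(FileSizeBySeek(ParamStr(0)));
end.
```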
You can also create and use TFileStream-typed object - which was the primary, officially blessed avenue for file operations since Delphi 1.0
http://www.freepascal.org/docs-html/rtl/classes/tfilestream.create.html
http://www.freepascal.org/docs-html/rtl/classes/tstream.size.html
http://docwiki.embarcadero.com/Libraries/XE2/en/System.Classes.TFileStream.Create
http://docwiki.embarcadero.com/Libraries/XE2/en/System.Classes.TStream.Size
As a side note, this avenue is of course integrated with aforementioned IOUtils unit.
http://docwiki.embarcadero.com/Libraries/XE3/en/System.IOUtils.TFile.OpenRead
This is a short solution using FileSize that does the job:
function GetFileSize(p_sFilePath : string) : Int64;
var
oFile : file of Byte;
begin
Result := -1;
AssignFile(oFile, p_sFilePath);
Reset(oFile); // open before the try, so a failed Reset does not reach CloseFile
try
Result := FileSize(oFile);
finally
CloseFile(oFile);
end;
end;
From what I know, FileSize is available only from XE2.
uses
System.Classes, System.IOUtils;
function GetFileSize(const FileName : string) : Int64;
var
Reader: TFileStream;
begin
Reader := TFile.OpenRead(FileName);
try
result := Reader.Size;
finally
Reader.Free;
end;
end;

Fast read/write from file in delphi

I am loading a file into an array in binary form, and this seems to take a while.
Is there a better, faster, more efficient way to do this?
I am using a similar method for writing back to the file.
procedure openfile(fname:string);
var
myfile: file;
filesizevalue,i:integer;
begin
assignfile(myfile,fname);
filesizevalue:=GetFileSize(fname); //my method
SetLength(dataarray, filesizevalue);
i:=0;
Reset(myFile, 1);
while not Eof(myFile) do
begin
BlockRead(myfile,dataarray[i], 1);
i:=i+1;
end;
CloseFile(myfile);
end;
If you really want to read a binary file fast, let Windows worry about buffering ;-) by using Memory Mapped Files. Using this you can simply map a file to a memory location and read it like it's an array.
Your function would become:
procedure openfile(fname:string);
var
InputFile: TMappedFile;
begin
InputFile := TMappedFile.Create;
try
InputFile.MapFile(fname);
SetLength(dataarray, InputFile.Size);
Move(PByteArray(InputFile.Content)[0], dataarray[0], InputFile.Size);
finally
InputFile.Free;
end;
end;
But I would suggest not using the global variable dataarray, but either pass it as a var in the parameter, or use a function which returns the resulting array.
procedure ReadBytesFromFile(const AFileName : String; var ADestination : TBytes);
var
InputFile : TMappedFile;
begin
InputFile := TMappedFile.Create;
try
InputFile.MapFile(AFileName);
SetLength(ADestination, InputFile.Size);
Move(PByteArray(InputFile.Content)[0], ADestination[0], InputFile.Size);
finally
InputFile.Free;
end;
end;
The TMappedFile is from my article Fast reading of files using Memory Mapping, this article also contains an example of how to use it for more "advanced" binary files.
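For readers without the TMappedFile class, here is a minimal sketch of the same idea using the Windows API directly. It assumes Delphi XE2 or later and files smaller than 4GB; ReadAllMapped is a hypothetical name and is not the code from the article.

```delphi
program MapDemo;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

// Hypothetical helper: maps the file read-only and copies its bytes into an array.
function ReadAllMapped(const FileName: string): TBytes;
var
  hFile, hMap: THandle;
  View: Pointer;
  Size: DWORD;
begin
  Result := nil;
  hFile := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if hFile = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    Size := GetFileSize(hFile, nil); // low 32 bits only: sketch assumes < 4GB
    if Size = 0 then
      Exit; // an empty file cannot be mapped
    hMap := CreateFileMapping(hFile, nil, PAGE_READONLY, 0, 0, nil);
    if hMap = 0 then
      RaiseLastOSError;
    try
      View := MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
      if View = nil then
        RaiseLastOSError;
      try
        SetLength(Result, Size);
        Move(View^, Result[0], Size); // the view can be read like ordinary memory
      finally
        UnmapViewOfFile(View);
      end;
    finally
      CloseHandle(hMap);
    end;
  finally
    CloseHandle(hFile);
  end;
end;

begin
  Writeln(Length(ReadAllMapped(ParamStr(0))));
end.
```

If the data only needs to be scanned, the copy into the array can be skipped and the mapped view read in place.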
You generally shouldn't read files byte for byte. Use BlockRead with a larger value (512 or 1024 often are best) and use its return value to find out how many bytes were read.
If the size isn't too large (and your use of SetLength seems to support this), you can also use one BlockRead call reading the complete file at once. So, modifying your approach, this would be:
AssignFile(myfile,fname);
filesizevalue := GetFileSize(fname);
Reset(myFile, 1);
SetLength(dataarray, filesizevalue);
BlockRead(myFile, dataarray[0], filesizevalue);
CloseFile(myfile);
Perhaps you could also change the procedure to a boolean function named OpenAndReadFile and return false if the file couldn't be opened or read.
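A minimal sketch of that suggestion, assuming Delphi XE2 or later; the block size of 1024 and the out parameter are illustrative choices. It reads in blocks, uses the fourth BlockRead parameter to learn how many bytes were actually read, and returns False instead of raising when the file cannot be opened or fully read.

```delphi
program BlockReadDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

function OpenAndReadFile(const FileName: string; out Data: TBytes): Boolean;
const
  BLOCK = 1024; // illustrative block size
var
  F: file;
  Got, Total: Integer;
  SavedMode: Byte;
begin
  Result := False;
  Data := nil;
  SavedMode := FileMode;
  FileMode := fmOpenRead; // read-only, so write-protected files can be opened
  AssignFile(F, FileName);
  {$I-} Reset(F, 1); {$I+} // record size 1: counts are in bytes
  FileMode := SavedMode;
  if IOResult <> 0 then
    Exit;
  try
    SetLength(Data, FileSize(F));
    Total := 0;
    while Total < Length(Data) do
    begin
      // Got receives the number of bytes actually transferred
      BlockRead(F, Data[Total], Min(BLOCK, Length(Data) - Total), Got);
      if Got = 0 then
        Break;
      Inc(Total, Got);
    end;
    Result := Total = Length(Data);
  finally
    CloseFile(F);
  end;
end;

var
  Bytes: TBytes;
begin
  if OpenAndReadFile(ParamStr(0), Bytes) then
    Writeln(Length(Bytes))
  else
    Writeln('read failed');
end.
```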
It depends on the file format. If it consists of several identical records, you can decide to create a file of that record type.
For example:
type
TMyRecord = record
fieldA: integer;
..
end;
TMyFile = file of TMyRecord;
const
cBufLen = 100 * sizeof(TMyRecord);
var
f: TMyFile; // "file" is a reserved word and cannot be used as a variable name
i : Integer;
begin
AssignFile(f, filename);
Reset(f);
i := 0;
try
while not Eof(f) do begin
BlockRead(f, dataarray[i], cBufLen);
Inc(i, cBufLen);
end;
finally
CloseFile(f);
end;
end;
If it's a long enough file that reading it this way takes a noticeable amount of time, I'd use a stream instead. The block read will be a lot faster, and there's no loops to worry about. Something like this:
procedure openfile(fname:string);
var
myfile: TFileStream;
filesizevalue:integer;
begin
filesizevalue:=GetFileSize(fname); //my method
SetLength(dataarray, filesizevalue);
myFile := TFileStream.Create(fname, fmOpenRead or fmShareDenyWrite); // the constructor requires a mode
try
myFile.seek(0, soFromBeginning);
myFile.ReadBuffer(dataarray[0], filesizevalue);
finally
myFile.free;
end;
end;
It appears from your code that your record size is 1 byte long. If not, then change the read line to:
myFile.ReadBuffer(dataarray[0], filesizevalue * SIZE);
or something similar.
Look for a buffered TStream descendant. It will make your code a lot faster as the disk read is done fast, but you can loop through the buffer easily. There are various about, or you can write your own.
If you're feeling very bitheaded, you can bypass Win32 altogether and call the NT Native API function ZwOpenFile() which in my informal testing does shave a tiny bit off. Otherwise, I'd use Davy's Memory Mapped File solution above.

Resources