Comparing Bmp, JPEG, PNG, TIF files - delphi

I need a method which compares the content of two files; the files can be BMP, JPEG, PNG, or TIF.
I tried this:
procedure TForm1.Button1Click(Sender: TObject);
var
  f1, f2: TFileStream;
  Bytes1: TBytes;
  Bytes2: TBytes;
  i: Integer;
  s: Boolean;
begin
  f1 := TFileStream.Create('C:\Output\Layout 1.JPG', fmOpenRead);
  f2 := TFileStream.Create('C:\Data\Layout 1.JPG', fmOpenRead);
  if f1.Size <> f2.Size then
  begin
    ShowMessage('size');
    Exit;
  end;
  SetLength(Bytes1, f1.Size);
  f1.Read(Bytes1[0], f1.Size);
  SetLength(Bytes2, f2.Size);
  f2.Read(Bytes2[0], f2.Size);
  s := True;
  for i := 1 to Length(Bytes1) do
  begin
    if Bytes1[i] <> Bytes2[i] then
    begin
      s := False;
      Exit;
    end;
  end;
  if s then
    ShowMessage('same');
end;
but this is not working for me: my files have the same content, but their sizes differ by 2 bytes.
One of the files is the one that I give to the user; the other is a copy the user made by opening that same file and saving it. Why they differ by 2 bytes I have no idea, but there should be a way to compare the content of these files.

The code has one error. Dynamic arrays are zero based, so the loop should be:
for I := 0 to High(Bytes1) do
The code is also inefficient: rather than reading both files in full, you can read them in smaller blocks and compare each pair of blocks with CompareMem.
You say that the files have different sizes, but you expect them to compare equal. That makes no sense. Your code explicitly checks that the sizes match, as it should.
Opening a JPEG file and saving it again will modify the content, because JPEG is a lossy compression algorithm.
Your title suggests that you wish to compare BMP, PNG and TIF files as well, but the files in your code are in fact JPEG images.
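
For the block-wise CompareMem approach, here is a minimal sketch (the function name and the 64 KB buffer size are mine, not from the answer):

```delphi
uses
  System.Classes, System.SysUtils;

// Sketch: compare two streams block by block with CompareMem,
// so neither file has to be loaded entirely into memory.
function StreamsEqual(s1, s2: TStream): Boolean;
const
  BUFSIZE = 65536;
var
  buf1, buf2: array[0..BUFSIZE - 1] of Byte;
  n1, n2: Integer;
begin
  Result := False;
  if s1.Size <> s2.Size then
    Exit;                        // different sizes can never be equal
  s1.Position := 0;
  s2.Position := 0;
  repeat
    n1 := s1.Read(buf1, BUFSIZE);
    n2 := s2.Read(buf2, BUFSIZE);
    if (n1 <> n2) or not CompareMem(@buf1, @buf2, n1) then
      Exit;                      // first mismatching block ends the scan
  until n1 < BUFSIZE;            // a short read means end of stream
  Result := True;
end;
```

CompareMem compares a whole block in one call, which is far cheaper than a Pascal loop over individual bytes.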

If you are going to compare JPEGs you probably need to allow a tolerance, something like
const
  DELTA = 2;
if Abs(Bytes1[i] - Bytes2[i]) > DELTA then
(Subtracting two Byte values promotes the result to Integer, so Abs sees the signed difference.)

Related

Delphi - Check each line of a file against another file

I have to check each line of a file against another file.
If a line from the first file exists in the second file, I have to delete it.
Right now I'm using two list boxes and a "for i := ListBox1.Items.Count - 1 downto 0 do begin ..." loop.
My program works, but I have to run this check on huge files with over 1 million lines.
Is there a faster approach? I want to load the files into memory so it is extremely fast!
Thanks
You can use TStringList for this. The list for the second file should be sorted for faster searching (IndexOf on a sorted list uses a binary search). Try this:
var
  l1, l2: TStringList;
  i: Integer;
begin
  l1 := nil;
  l2 := nil;
  try
    l1 := TStringList.Create;
    l1.LoadFromFile('file1');
    l2 := TStringList.Create;
    l2.LoadFromFile('file2');
    l2.Sorted := True;
    for i := l1.Count - 1 downto 0 do
    begin
      if l2.IndexOf(l1[i]) <> -1 then
        l1.Delete(i);
    end;
    l1.SaveToFile('file1');
  finally
    FreeAndNil(l1);
    FreeAndNil(l2);
  end;
end;
A quick solution (but not the fastest one) is to use two TStringList lists instead of list boxes.
var
  a, b: TStringList;
  i: Integer;
begin
  a := TStringList.Create;
  b := TStringList.Create;
  try
    a.LoadFromFile('C:\1.txt');
    b.LoadFromFile('C:\2.txt');
    b.Sorted := True;
    for i := a.Count - 1 downto 0 do
    begin
      // Check if this line of file 'a' is present in file 'b'
      // and delete it if so
      if b.IndexOf(a[i]) > -1 then
        a.Delete(i);
    end;
    a.SaveToFile('C:\1.txt');
  finally
    b.Free;
    a.Free;
  end;
end;
Again, this is a slow and simple solution that loads the whole files into RAM. It will still be much faster than using a ListBox; sometimes simple is enough for a one-time problem.
A faster method would be to build an index (e.g. a binary tree) of both files on disk and use that index for the comparison. That way you would not need to hold the whole files in memory.
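
If the files do fit in memory, a hash lookup is usually faster still than a sorted list's binary search. A sketch using TDictionary from Generics.Collections (the procedure name and file-name parameters are illustrative):

```delphi
uses
  System.Classes, System.Generics.Collections;

// Sketch: O(1) average lookups via a hash set instead of a sorted list.
procedure RemoveCommonLines(const File1, File2: string);
var
  Lines: TStringList;
  Seen: TDictionary<string, Boolean>;
  i: Integer;
  s: string;
begin
  Lines := TStringList.Create;
  Seen := TDictionary<string, Boolean>.Create;
  try
    // Load the second file into a hash set.
    Lines.LoadFromFile(File2);
    for s in Lines do
      Seen.AddOrSetValue(s, True);
    // Walk the first file backwards, deleting any line found in the set.
    Lines.LoadFromFile(File1);
    for i := Lines.Count - 1 downto 0 do
      if Seen.ContainsKey(Lines[i]) then
        Lines.Delete(i);
    Lines.SaveToFile(File1);
  finally
    Seen.Free;
    Lines.Free;
  end;
end;
```

Iterating downwards keeps the indices of not-yet-visited lines stable while deleting.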

TFileStream.Read not reading last part of file

I'm using TFileStream.Read in a loop to read a text file, but I find that the last part is not being read into the buffer - although the total number of bytes being read is equal to the filesize.
This is my code:
procedure TForm1.DoImport;
var
  f: String;
  fs: TFileStream;
  r, c: Integer;
  buf: TBytes;
const
  bufsiz = 16384;
begin
  SetLength(buf, bufsiz);
  f := 'C:\Report\Claims\Claims.csv';
  fs := TFileStream.Create(f, fmOpenRead);
  try
    c := 0;
    repeat
      r := fs.Read(buf, bufsiz);
      Inc(c, r);
    until (r <> bufsiz);
    ShowMessage('Done. ' + IntToStr(c)); // <-- c equals the file size !!
    Memo1.Text := StringOf(buf); // <-- but the memo does not show the last chunk of the file
  finally
    fs.Free;
  end;
end;
At the end, the TMemo does not contain the last chunk of the file, but the 2nd to last chunk. Is there something wrong with my code?
Thanks in advance!
The beginning of that buffer contains the last chunk of your file, but after it comes left-over content from the previous chunk, because the final Read only overwrites the first r bytes and you never cleared the rest. So the memo is not showing the previous chunk; it is showing a mix of both.
You can use the Copy function to take only the valid part of the buffer:
Memo1.Text := StringOf(Copy(buf, 0, r)); // r is the number of bytes actually read
A better way to read a text file is TStringList or TStreamReader, which take care of the file encoding (ANSI, UTF-8, ...). I usually prefer TStringList because I have had too much trouble with bugs in TStreamReader.
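
For completeness, a sketch of the loop with the fix folded in. Since TFileStream knows its size up front, the whole file can be gathered with a single allocation (path and buffer size as in the question; this assumes it runs inside the form method so Memo1 is in scope):

```delphi
// Sketch of the corrected loop: only the r bytes each Read actually
// returns are kept, so the stale tail of the buffer is never shown.
var
  fs: TFileStream;
  buf, all: TBytes;
  r, total: Integer;
const
  bufsiz = 16384;
begin
  SetLength(buf, bufsiz);
  fs := TFileStream.Create('C:\Report\Claims\Claims.csv', fmOpenRead);
  try
    SetLength(all, fs.Size);          // size known up front: one allocation
    total := 0;
    repeat
      r := fs.Read(buf, bufsiz);
      if r > 0 then
        Move(buf[0], all[total], r);  // copy only the bytes actually read
      Inc(total, r);
    until r < bufsiz;
  finally
    fs.Free;
  end;
  Memo1.Text := StringOf(all);        // now contains the whole file
end;
```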

Delphi How to search in binary file faster?

I have a binary file (2.5 MB) and I want to find the position of this sequence of bytes: CD 09 D9 F5. Then I want to write some data after this position and also overwrite the old data (4 KB) with zeros.
Here is how I do it now but it is a bit slow.
procedure ProcessFile(dataToWrite: string);
var
  fileContent: string;
  f: file of Char;
  c: Char;
  n, i, startIndex, endIndex: Integer;
begin
  AssignFile(f, 'file.bin');
  Reset(f);
  n := FileSize(f);
  while n > 0 do
  begin
    Read(f, c);
    fileContent := fileContent + c;
    Dec(n);
  end;
  startIndex := Pos(Char($CD) + Char($09) + Char($D9) + Char($F5), fileContent) + 4;
  endIndex := startIndex + 4088;
  Seek(f, startIndex);
  for i := 1 to Length(dataToWrite) do
    Write(f, dataToWrite[i]);
  c := #0;
  while i < endIndex do
  begin
    Write(f, c);
    Inc(i);
  end;
  CloseFile(f);
end;
See this answer: Fast read/write from file in delphi
Some options are:
memory mapped files
TFileStream
blockread
To search the file buffer, see Best way to find position in the Stream where given byte sequence starts - one answer mentions the Boyer-Moore algorithm for fast detection of a byte sequence.
Your code that reads the entire file into a string is very wasteful. Pascal I/O uses buffering, so I don't think it's the byte-by-byte aspect particularly, although one big read would be better. The main problem is the string concatenation, and the extreme heap allocation demand of growing the string one character at a time.
I'd do it like this:
function LoadFileIntoString(const FileName: string): string;
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead);
  try
    SetLength(Result, Stream.Size); // one single heap allocation
    Stream.ReadBuffer(Pointer(Result)^, Length(Result));
  finally
    Stream.Free;
  end;
end;
That alone should make a big difference. When it comes to writing the file, a similar use of strings will be much faster. I've not attempted to decipher the writing part of your code, but the new data and the block of zeros should likewise be batched into as few separate writes as possible.
If ever you find that you need to read or write very small blocks to a file, then I offer you my buffered file streams: Buffered files (for faster disk access).
The code could be optimised further to read only a portion of the file, and search until you find the target. You may be able to avoid reading the entire file that way. However, I suspect that these changes will make enough of a difference.
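
Putting those pieces together, here is a sketch of the whole operation: load once, search in memory with Pos, and patch the file with a single write. The procedure name is illustrative; the 4088-byte zero region follows the question. AnsiString is used so that one character equals one byte even on Unicode Delphi:

```delphi
uses
  System.Classes, System.SysUtils;

// Sketch: locate the marker in memory, then patch the file in one write.
procedure PatchAfterMarker(const FileName: string; const DataToWrite: AnsiString);
const
  Marker: AnsiString = #$CD#$09#$D9#$F5;
  ZeroRegion = 4088; // bytes to fill after the marker, as in the question
var
  Content, Patch: AnsiString;
  MarkerPos: Integer;
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenReadWrite);
  try
    SetLength(Content, Stream.Size);                 // one heap allocation
    Stream.ReadBuffer(Pointer(Content)^, Length(Content));
    MarkerPos := Pos(Marker, Content);               // 1-based, 0 if absent
    if MarkerPos = 0 then
      Exit;
    // Build the replacement block once: new data followed by zeros.
    Patch := DataToWrite +
      StringOfChar(AnsiChar(#0), ZeroRegion - Length(DataToWrite));
    // Pos is 1-based; the patch starts right after the 4-byte marker.
    Stream.Position := MarkerPos - 1 + Length(Marker);
    Stream.WriteBuffer(Pointer(Patch)^, Length(Patch));
  finally
    Stream.Free;
  end;
end;
```

Two stream calls replace millions of one-character operations, which is where the original code lost its time.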

Listing Filenames from a Directory with sorting? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How to get the sort order in Delphi as in Windows Explorer?
I am trying to scan a directory, but I can't get it to sort by file name.
For example, say I have these filenames in a folder:
File1
File2
File3
File4
File5
File6
File7
File8
File9
File10
File11
File12
File13
File14
File15
File16
File17
File18
File19
File20
File21
File22
If I use something like this:
var
  SL: TStringList;
  SR: TSearchRec;
begin
  SL := TStringList.Create;
  try
    if FindFirst(Path + '*.*', faAnyFile and not faDirectory and not faHidden, SR) = 0 then
      repeat
        SL.Add(Path + SR.Name)
      until FindNext(SR) <> 0;
    FindClose(SR);
    // handle the filenames..
  finally
    SL.Free;
  end;
end;
The result will be:
File10
File11
File12
File13
File14
File15
File16
File17
File18
File19
File2
File20
File21
File22
File3
File4
File5
File6
File7
File8
File9
It should be sorted by filename (as in the first filename list above).
I bet this is really simple but I cannot see it; what do I need to do to sort this?
Thanks.
You're starting with the assumption that there's some sort of inherent "order" for file names. There isn't. You appear to want the file names to be sorted alphabetically, with numerical portions of names sorted numerically. I'm not sure what you want to happen with punctuation and other characters.
The file-enumeration functions don't define any order that names will be returned in. They're returned in whatever order the underlying file system decides to provide them. There are two steps to getting a sorted list of file names. You're already doing the first one:
Collect the file names in a list for post-processing.
Arrange the names in the order you want. If plain "asciibetical" isn't what you want, then you can write a custom sorting function and pass it to TStringList.CustomSort.
For example, if you want them to be in the same order you see file names in Windows Explorer as of Windows XP, you can use the StrCmpLogicalW API function. Call that from your comparison function, like this:
function LogicalCompare(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := StrCmpLogicalW(PWideChar(List[Index1]), PWideChar(List[Index2]));
end;

SL.CustomSort(LogicalCompare);
If you have something earlier than Delphi 2009, you'll need to do something about converting your strings to wide characters, at least for the duration of the sorting phase.
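
StrCmpLogicalW lives in shlwapi.dll and is not declared in older Delphi RTLs; if your version lacks it, a declaration along these lines makes it callable:

```delphi
// StrCmpLogicalW compares strings the way Explorer does:
// runs of digits are compared numerically, the rest case-insensitively.
function StrCmpLogicalW(psz1, psz2: PWideChar): Integer; stdcall;
  external 'shlwapi.dll' name 'StrCmpLogicalW';
```

It returns a negative, zero, or positive value, which is exactly the contract TStringList.CustomSort expects from a compare function.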
FindFirst() and FindNext() merely enumerate the files on the file system as-is. The files can be returned in any order. You have to sort the TStringList yourself afterwards, eg:
function SortFilesByName(List: TStringList; Index1, Index2: Integer): Integer;
var
FileName1, FileName2: String;
FileNumber1, FileNumber2: Integer;
begin
// assuming the files are all named "Path\File###.xxx",
// where "###" is the number to sort on...
  FileName1 := ChangeFileExt(ExtractFileName(List[Index1]), '');
  FileName2 := ChangeFileExt(ExtractFileName(List[Index2]), '');
  FileNumber1 := StrToInt(Copy(FileName1, 5, MaxInt));
  FileNumber2 := StrToInt(Copy(FileName2, 5, MaxInt));
  Result := (FileNumber1 - FileNumber2);
end;
var
  SL: TStringList;
  SR: TSearchRec;
begin
  SL := TStringList.Create;
  try
    if FindFirst(Path + '*.*', faAnyFile and (not faDirectory) and (not faHidden), SR) = 0 then
    try
      repeat
        SL.Add(Path + SR.Name)
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);
    end;
    SL.CustomSort(SortFilesByName);
    // handle the filenames..
  finally
    SL.Free;
  end;
end;
Any simple sort (such as the one Windows uses to return files, and the one Delphi uses by default) sorts alphabetically, character by character, so unless you pad your numbers with zeros, 1 comes before 2 but 11 also comes before 2 (in the same way that aa comes before b). You either need to pad your numbers with zeros, such as:
filename001
filename010
filename020
or use the answer provided by Remy Lebeau - TeamB above, which pulls out the number at the end of the filename and sorts by that.
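
If you control the file naming, zero-padding is a single Format call (the width of three digits is just an example):

```delphi
uses
  System.SysUtils;

var
  i: Integer;
begin
  for i := 1 to 22 do
    // '%.3d' zero-pads the integer to three digits: 1 -> '001', 22 -> '022'
    Writeln(Format('filename%.3d', [i]));
end.
```

Names padded this way sort correctly under any plain alphabetical sort, so no custom compare function is needed.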

How can I quickly zero out the contents of a file?

I want to destroy the contents of a file that the user selected from my program. I wrote this sample code:
var
  aFile: TFileStream;
const
  FileAddr: String = 'H:\Akon.mp3';
  Buf: Byte = 0;
begin
  if FileExists(FileAddr) then
  begin
    // Open given file in file stream & rewrite it
    aFile := TFileStream.Create(FileAddr, fmOpenReadWrite);
    try
      aFile.Seek(0, soFromBeginning);
      while aFile.Position <> aFile.Size do
        aFile.Write(Buf, 1);
    finally
      aFile.Free;
      ShowMessage('Finish');
    end;
  end;
end;
As you can see, I overwrite the given file with zero bytes. This code works correctly, but it is very slow on large files. I tried to speed the process up with multithreaded code, for example by creating 4 threads to share the work, but I couldn't make it work.
Is there any way to speed this process up?
I don't know if it helps you, but I think you could do better (than multithreading) by writing a larger buffer to the file.
For example, you could initialize a 16 KB buffer and write it directly to the FileStream; you only have to take care with the last part of the file, for which you write just part of the full buffer.
Believe me, it will be much faster...
OK, I'll bite:
const
  FileAddr: String = 'H:\Akon.mp3';
var
  aFile: TFileStream;
  Buf: array[0..1023] of Byte;
  Remaining, NumBytes: Integer;
begin
  if FileExists(FileAddr) then
  begin
    // Open given file in file stream & rewrite it
    aFile := TFileStream.Create(FileAddr, fmOpenReadWrite);
    try
      FillChar(Buf, SizeOf(Buf), 0);
      Remaining := aFile.Size;
      while Remaining > 0 do
      begin
        NumBytes := SizeOf(Buf);
        if NumBytes > Remaining then
          NumBytes := Remaining;
        aFile.WriteBuffer(Buf, NumBytes);
        Dec(Remaining, NumBytes);
      end;
    finally
      aFile.Free;
      ShowMessage('Finish');
    end;
  end;
end;
Multiple threads won't help you here. Your constraint is disk access, primarily because you're writing only 1 byte at a time.
Declare Buf as an array of bytes, and initialise it with FillChar or ZeroMemory. Then change your while loop as follows:
while (aFile.Position + SizeOf(Buf)) < aFile.Size do
begin
  aFile.Write(Buf, SizeOf(Buf));
end;
if aFile.Position < aFile.Size then
begin
  aFile.Write(Buf, aFile.Size - aFile.Position);
end;
You should learn from the slowness of the above code that:
Writing one byte at a time is the slowest and worst way to do it; you incur a huge overhead and reduce your overall performance.
Even writing 1 KB (1024 bytes) at a time would be a vast improvement. Writing a larger amount will of course be faster still, until you reach a point of diminishing returns, which I would guess is somewhere between a 200 KB and 500 KB write buffer. The only way to find out where it stops mattering for your application is to test, test, and test.
Checking Position against Size so often is completely superfluous. If you read the size once and track how much you have written in a local variable, e.g. Inc(LPosition, BufSize) on a local LPosition: Integer, you save yourself more overhead and improve performance.
Not sure if this meets your requirements (it truncates the file to zero length rather than overwriting the existing bytes), but it works and it's fast.
var
  aFile: TFileStream;
const
  FileAddr: String = 'H:\Akon.mp3';
begin
  if FileExists(FileAddr) then
  begin
    aFile := TFileStream.Create(FileAddr, fmOpenReadWrite);
    try
      aFile.Size := 0;
    finally
      aFile.Free;
      ShowMessage('Finish');
    end;
  end;
end;
Something along these lines works too; it builds a zero-filled buffer the size of the whole file and writes it in a single call. (Note that the file must be opened with fmOpenReadWrite; fmCreate would truncate it to zero bytes before its size could be read.) I assume that in the app the filename would be a variable:
procedure MyProc;
var
  aFile: TFileStream;
  Buf: array of Byte;
  len: Integer;
  FileAddr: String;
begin
  FileAddr := 'C:\testURL.txt';
  if FileExists(FileAddr) then
  begin
    aFile := TFileStream.Create(FileAddr, fmOpenReadWrite);
    try
      len := aFile.Size;
      SetLength(Buf, len);
      FillChar(Buf[0], len, 0);
      aFile.Position := 0;
      aFile.WriteBuffer(Buf[0], len);
    finally
      aFile.Free;
      ShowMessage('Finish');
    end;
  end;
end;
