I've been using ZLib functions to compress/uncompress streams in memory. In case when I try to uncompress invalid stream, it leaks memory. The following code would leak memory:
uses
Winapi.Windows, System.Classes, System.ZLib;
function DecompressStream(const AStream: TMemoryStream): Boolean;
var
ostream: TMemoryStream;
begin
ostream := TMemoryStream.Create;
try
AStream.Position := 0;
// ISSUE: Memory leak happening here
try
ZDecompressStream(AStream, ostream);
except
Exit(FALSE);
end;
AStream.Clear;
ostream.Position := 0;
AStream.CopyFrom(ostream, ostream.Size);
result := TRUE;
finally
ostream.Free;
end;
end;
var
s: TMemoryStream;
begin
ReportMemoryLeaksOnShutdown := TRUE;
s := TMemoryStream.Create;
try
DecompressStream(s);
finally
s.Free;
end;
end.
I try to decompress empty TMemoryStream here and at the end of execution it shows that memory leak happened. Testing on Delphi XE2.
Any ideas how to prevent this leak to happen, because in real world there would be a chance for my application to try to decompress invalid stream and leak the memory there.
QC: http://qc.embarcadero.com/wc/qcmain.aspx?d=120329 - claimed fixed starting with XE6
It's a bug in the Delphi RTL code. The implementation of ZDecompressStream raises exceptions and then fails to perform tidy up. Let's look at the code:
procedure ZDecompressStream(inStream, outStream: TStream);
const
bufferSize = 32768;
var
zstream: TZStreamRec;
zresult: Integer;
inBuffer: TBytes;
outBuffer: TBytes;
inSize: Integer;
outSize: Integer;
begin
SetLength(inBuffer, BufferSize);
SetLength(outBuffer, BufferSize);
FillChar(zstream, SizeOf(TZStreamRec), 0);
ZCompressCheck(InflateInit(zstream)); <--- performs heap allocation
inSize := inStream.Read(inBuffer, bufferSize);
while inSize > 0 do
begin
zstream.next_in := #inBuffer[0];
zstream.avail_in := inSize;
repeat
zstream.next_out := #outBuffer[0];
zstream.avail_out := bufferSize;
ZCompressCheck(inflate(zstream, Z_NO_FLUSH));
// outSize := zstream.next_out - outBuffer;
outSize := bufferSize - zstream.avail_out;
outStream.Write(outBuffer, outSize);
until (zstream.avail_in = 0) and (zstream.avail_out > 0);
inSize := inStream.Read(inBuffer, bufferSize);
end;
repeat
zstream.next_out := #outBuffer[0];
zstream.avail_out := bufferSize;
zresult := ZCompressCheck(inflate(zstream, Z_FINISH));
// outSize := zstream.next_out - outBuffer;
outSize := bufferSize - zstream.avail_out;
outStream.Write(outBuffer, outSize);
until (zresult = Z_STREAM_END) and (zstream.avail_out > 0);
ZCompressCheck(inflateEnd(zstream)); <--- tidy up, frees heap allocation
end;
I've taken this from my XE3, but I believe that it is essentially the same in all versions. I've highlighted the problem. The call to inflateInit allocates memory off the heap. It needs to be paired with a call to inflateEnd. Because ZCompressCheck raises exceptions in the face of errors, the call to inflateEnd never happens. And hence the code leaks.
The other calls to inflateInit and inflateEnd in that unit are correctly protected with try/finally. It just appears to be the use in this function that is erroneous.
My recommendation is that you replace the Zlib unit with a version that is implemented correctly.
Related
I'm creating a package in which a custom image list reads and writes its content inside a DFM file.
The code I wrote works globally well in all compilers between XE7 and 10.3 Rio. However I have a strange issue in XE2. With this particular compiler, I sometimes receives an empty stream content while the DFM is read, and sometimes a corrupted content.
My custom image list is built above a standard TImageList. I register my read callback this way:
procedure TMyImageList.DefineProperties(pFiler: TFiler);
function DoWritePictures: Boolean;
begin
if (Assigned(pFiler.Ancestor)) then
Result := not (pFiler.Ancestor is TMyImageList)
else
Result := Count > 0;
end;
begin
inherited DefineProperties(pFiler);
// register the properties that will load and save the pictures binary data in DFM files
pFiler.DefineBinaryProperty('Pictures', ReadPictures, WritePictures, DoWritePictures);
end;
Here is the ReadPictures function:
procedure TMyImageList.ReadPictures(pStream: TStream);
begin
LoadPictureListFromStream(m_pPictures, pStream);
end;
Here is the LoadPictureListFromStream function:
procedure TMyImageList.LoadPictureListFromStream(pList: IWPictureList; pStream: TStream);
var
{$if CompilerVersion <= 23}
pImgNameBytes: Pointer;
pData: Pointer;
{$else}
imgNameBytes: TBytes;
{$ifend}
count, i: Integer;
color: Cardinal;
imgClassName: string;
pMemStr: TMemoryStream;
size: Int64;
pItem: IWPictureItem;
pGraphicClass: TGraphicClass;
pGraphic: TGraphic;
begin
// read the list count
pStream.ReadBuffer(count, SizeOf(count));
// is list empty?
if (count <= 0) then
Exit;
pMemStr := TMemoryStream.Create;
// enable the code below to log the received stream content
{$ifdef _DEBUG}
size := pStream.Position;
pStream.Position := 0;
pMemStr.CopyFrom(pStream, pStream.Size);
pMemStr.Position := 0;
pMemStr.SaveToFile('__DfmStreamContent.bin');
pMemStr.Clear;
pStream.Position := size;
{$endif}
try
for i := 0 to count - 1 do
begin
pItem := IWPictureItem.Create;
try
// read the next size
pStream.ReadBuffer(size, SizeOf(size));
// read the image type from stream
if (size > 0) then
begin
{$if CompilerVersion <= 23}
pImgNameBytes := nil;
try
GetMem(pImgNameBytes, size + 1);
pStream.ReadBuffer(pImgNameBytes^, size);
pData := Pointer(NativeUInt(pImgNameBytes) + NativeUInt(size));
(PByte(pData))^ := 0;
imgClassName := UTF8ToString(pImgNameBytes);
finally
if (Assigned(pImgNameBytes)) then
FreeMem(pImgNameBytes);
end;
{$else}
SetLength(imgNameBytes, size);
pStream.Read(imgNameBytes, size);
imgClassName := TEncoding.UTF8.GetString(imgNameBytes);
{$ifend}
end;
// read the next size
pStream.ReadBuffer(size, SizeOf(size));
// read the image from stream
if (size > 0) then
begin
// read the image in a temporary memory stream
pMemStr.CopyFrom(pStream, size);
pMemStr.Position := 0;
// get the graphic class to create
if (imgClassName = 'TWSVGGraphic') then
pGraphicClass := TWSVGGraphic
else
begin
TWLogHelper.LogToCompiler('Internal error - unknown graphic class - '
+ imgClassName + ' - name - ' + Name);
pGraphicClass := nil;
end;
// found it?
if (Assigned(pGraphicClass)) then
begin
pGraphic := nil;
try
// create a matching graphic to receive the image data
pGraphic := pGraphicClass.Create;
pGraphic.LoadFromStream(pMemStr);
pItem.m_pPicture.Assign(pGraphic);
finally
pGraphic.Free;
end;
end;
pMemStr.Clear;
end;
// read the next size
pStream.ReadBuffer(size, SizeOf(size));
// read the color key from stream
if (size > 0) then
begin
Assert(size = SizeOf(color));
pStream.ReadBuffer(color, size);
// get the color key
pItem.m_ColorKey := TWColor.Create((color shr 16) and $FF,
(color shr 8) and $FF,
color and $FF,
(color shr 24) and $FF);
end;
// add item to list
pList.Add(pItem);
except
pItem.Free;
raise;
end;
end;
finally
pMemStr.Free;
end;
end;
Here is the WritePictures function:
procedure TMyImageList.WritePictures(pStream: TStream);
begin
SavePictureListToStream(m_pPictures, pStream);
end;
And finally, here is the SavePictureListToStream function:
procedure TMyImageList.SavePictureListToStream(pList: IWPictureList; pStream: TStream);
var
count, i: Integer;
color: Cardinal;
imgClassName: string;
imgNameBytes: TBytes;
pMemStr: TMemoryStream;
size: Int64;
begin
// write the list count
count := pList.Count;
pStream.WriteBuffer(count, SizeOf(count));
if (count = 0) then
Exit;
pMemStr := TMemoryStream.Create;
try
for i := 0 to count - 1 do
begin
// a picture should always be assigned in the list so this should never happen
if (not Assigned(pList[i].m_pPicture.Graphic)) then
begin
TWLogHelper.LogToCompiler('Internal error - picture list is corrupted - ' + Name);
// write empty size to prevent to corrupt the stream
size := 0;
pStream.WriteBuffer(size, SizeOf(size));
pStream.WriteBuffer(size, SizeOf(size));
end
else
begin
// save the image type in the stream
imgClassName := pList[i].m_pPicture.Graphic.ClassName;
imgNameBytes := TEncoding.UTF8.GetBytes(imgClassName);
size := Length(imgNameBytes);
pStream.WriteBuffer(size, SizeOf(size));
pStream.Write(imgNameBytes, size);
// save the image in the stream
pList[i].m_pPicture.Graphic.SaveToStream(pMemStr);
size := pMemStr.Size;
pStream.WriteBuffer(size, SizeOf(size));
pStream.CopyFrom(pMemStr, 0);
pMemStr.Clear;
end;
// build the key color to save
color := (pList[i].m_ColorKey.GetBlue +
(pList[i].m_ColorKey.GetGreen shl 8) +
(pList[i].m_ColorKey.GetRed shl 16) +
(pList[i].m_ColorKey.GetAlpha shl 24));
// save the key color in the stream
size := SizeOf(color);
pStream.WriteBuffer(size, SizeOf(size));
pStream.WriteBuffer(color, size);
end;
finally
pMemStr.Free;
end;
end;
When the issue occurs, the content get in imgClassName become incoherent, or sometimes the image count read on the LoadPictureListFromStream() function first line is equals to 0.
Writing the DFM stream content in a file, I also noticed that only the class name value is corrupted, other data seems OK.
This issue happens randomly, sometimes all works fine, especially if I start the app in runtime time without previously opening the form in design time, but it may also happen whereas I just open the form in design time, without changing nor saving nothing. On the other hand, this issue happen only with XE2. I never noticed it on any other compiler.
As I'm a c++ developer writing a Delphi code, and as I needed to adapt a part of the code to be able to compile it under XE2 (see the {$if CompilerVersion <= 23} statements), I probably doing something very bad with the memory, but I cannot figure out what exactly.
Can someone analyse this code and point me what is(are) my mistake(s)?
In your SavePictureListToStream() method, the statement
pStream.Write(imgNameBytes, size);
does not work the way you expect in XE2 and earlier. TStream did not gain support for reading/writing TBytes arrays until XE3. As such, the above statement ends up writing to the memory address where the imgNameBytes variable itself is declared on the stack, not to the address where the variable is pointing to, where the array is allocated on the heap.
For XE2 and earlier, you need to use this statement instead:
pStream.WriteBuffer(PByte(imgNameBytes)^, size);
What you have in your LoadPictureListFromStream() method is technically OK, but your UTF-8 handling is more complicated then it needs to be. TEncoding exists in XE2, as it was first introduced in D2009. But even in older versions, you can and should use a dynamic array instead of GetMem() to simplify your memory management and keep it consistent across multiple versions, eg:
{$if CompilerVersion < 18.5}
type
TBytes = array of byte;
{$ifend}
var
imgNameBytes: TBytes;
...
begin
...
// read the next size
pStream.ReadBuffer(size, SizeOf(size));
// read the image type from stream
if (size > 0) then
begin
SetLength(imgNameBytes, size{$if CompilerVersion < 20}+1{$ifend});
pStream.ReadBuffer(PByte(imgNameBytes)^, size);
{$if CompilerVersion < 20}
imgNameBytes[High(imgNameBytes)] := $0;
imgClassName := UTF8ToString(PAnsiChar(pImgNameBytes));
{$else}
imgClassName := TEncoding.UTF8.GetString(imgNameBytes);
{$ifend}
end;
...
end;
In Delphi 5, with FastMM active, the call to FreeMem in the following minimum-reproducible code triggers an Invalid Pointer Exception:
program Project1;
{$APPTYPE CONSOLE}
uses
FastMM4,
SysUtils,
Windows;
procedure Main;
var
token: THandle;
returnLength: Cardinal;
p: Pointer;
begin
OpenProcessToken(GetCurrentProcess, TOKEN_QUERY, {out}token);
//Get the size of the buffer required.
//It's normally going to be 38 bytes. We'll use 16KB to eliminate the possibility of buffer overrun
// Windows.GetTokenInformation(token, TokenUser, nil, 0, {var}returnLength);
p := GetMemory(16384); //GetMemory(returnLength);
Windows.GetTokenInformation(token, TokenUser, p, 1024, {var}returnLength);
FreeMem({var}p); //FreeMem is the documented way to free memory allocated with GetMemory.
// FreeMemory(p); //FreeMemory is the C++ compatible version of FreeMem.
end;
begin
Main;
end.
The call to FreeMme fails with an EInvalidPointerException:
FreeMem({var}p); //error
The error will stop happening if:
i stop using FastMM4
i stop calling GetTokenInformation
i call FreeMemory (rather than FreeMem)
I've reproduced the error on a fresh install of Delphi 5 on a freshly installed Windows 7 machine. FastMM4 v4.992.
The error does not happen in Delphi 7
The error does not happen in Delphi XE6
It's only:
Delphi 5
when using FastMM4
Workaround
If it is a bug in FastMM4, i can workaround it. Rather than calling:
GetMemory
FreeMem
I can manually allocate the buffer another way:
SetLength(buffer, cb)
SetLength(buffer, 0)
If it's not a bug in FastMM4, i'd like to fix the above code.
Using FreeMemory, rather than FreeMem, doesn't trigger the error
I was under the impression that FastMM takes over memory management, which is why i was surprised to discover:
FreeMem({var}p); failed
FreeMemory(p); works
Internally, FreeMem is implemented as a call to the memory manager. In this case the memory manager (FastMM) returns non-zero, causing the call to reInvalidPtr:
System.pas
procedure _FreeMem;
asm
TEST EAX,EAX
JE ##1
CALL MemoryManager.FreeMem
OR EAX,EAX
JNE ##2
##1: RET
##2: MOV AL,reInvalidPtr
JMP Error
end;
and the implementation of MemoryManager.FreeMem ends up being:
FastMM4.pas
function FastFreeMem(APointer: Pointer);
FreeMem takes a var pointer, FreeMemory takes a pointer
The implementation of FreeMemory is:
System.pas:
function FreeMemory(P: Pointer): Integer; cdecl;
begin
if P = nil then
Result := 0
else
Result := SysFreeMem(P);
end;
And SysFreeMem is implemented in:
GetMem.inc:
function SysFreeMem(p: Pointer): Integer;
// Deallocate memory block.
label
abort;
var
u, n : PUsed;
f : PFree;
prevSize, nextSize, size : Integer;
begin
heapErrorCode := cHeapOk;
if not initialized and not InitAllocator then begin
heapErrorCode := cCantInit;
result := cCantInit;
exit;
end;
try
if IsMultiThread then EnterCriticalSection(heapLock);
u := p;
u := PUsed(PChar(u) - sizeof(TUsed)); { inv: u = address of allocated block being freed }
size := u.sizeFlags;
{ inv: size = SET(block size) + [block flags] }
{ validate that the interpretation of this block as a used block is correct }
if (size and cThisUsedFlag) = 0 then begin
heapErrorCode := cBadUsedBlock;
goto abort;
end;
{ inv: the memory block addressed by 'u' and 'p' is an allocated block }
Dec(AllocMemCount);
Dec(AllocMemSize,size and not cFlags - sizeof(TUsed));
if (size and cPrevFreeFlag) <> 0 then begin
{ previous block is free, coalesce }
prevSize := PFree(PChar(u)-sizeof(TFree)).size;
if (prevSize < sizeof(TFree)) or ((prevSize and cFlags) <> 0) then begin
heapErrorCode := cBadPrevBlock;
goto abort;
end;
f := PFree(PChar(u) - prevSize);
if f^.size <> prevSize then begin
heapErrorCode := cBadPrevBlock;
goto abort;
end;
inc(size, prevSize);
u := PUsed(f);
DeleteFree(f);
end;
size := size and not cFlags;
{ inv: size = block size }
n := PUsed(PChar(u) + size);
{ inv: n = block following the block to free }
if PChar(n) = curAlloc then begin
{ inv: u = last block allocated }
dec(curAlloc, size);
inc(remBytes, size);
if remBytes > cDecommitMin then
FreeCurAlloc;
result := cHeapOk;
exit;
end;
if (n.sizeFlags and cThisUsedFlag) <> 0 then begin
{ inv: n is a used block }
if (n.sizeFlags and not cFlags) < sizeof(TUsed) then begin
heapErrorCode := cBadNextBlock;
goto abort;
end;
n.sizeFlags := n.sizeFlags or cPrevFreeFlag
end else begin
{ inv: block u & n are both free; coalesce }
f := PFree(n);
if (f.next = nil) or (f.prev = nil) or (f.size < sizeof(TFree)) then begin
heapErrorCode := cBadNextBlock;
goto abort;
end;
nextSize := f.size;
inc(size, nextSize);
DeleteFree(f);
{ inv: last block (which was free) is not on free list }
end;
InsertFree(u, size);
abort:
result := heapErrorCode;
finally
if IsMultiThread then LeaveCriticalSection(heapLock);
end;
end;
So it makes that sense that FreeMemory doesn't trigger the error; FreeMemory is not taken over by the memory manager.
I guess that is why FreeMemory is not the documented counterpart to GetMemory: ๐
FreeMem is not the documented way to free memory allocated with GetMemory - that's apparently an error in the old documentation that has since been corrected. From the documentation for System.GetMemory (emphasis added):
GetMemory allocates a memory block.
GetMemory allocates a block of the given Size on the heap, and returns the address of this memory. The bytes of the allocated buffer are not set to zero. To dispose of the buffer, use FreeMemory. If there is not enough memory available to allocate the block, an EOutOfMemory exception is raised.
If you allocate the memory with GetMem, use FreeMem. If the allocation is done with GetMemory, use FreeMemory.
I'm trying to download a file of 500 mb more when it's in this size gives out of memory error. I tried switching to 64 bit application and it worked. But I would need it to work in 32 bit application to download file.
var
Stream: TStream;
fileStream: TFileStream;
Buffer: PByte;
BytesRead, BufSize: Integer;
Size: int64;
begin
BufSize := 1024;
fileStream:= TFileStream.Create(GetCurrentDir()+'\DownloadFile.zip',
fmCreate);
GetMem(Buffer, BufSize);
Stream := getDownload(size);
if (Size <> 0) then
begin
repeat
BytesRead := Stream.Read(Pointer(Buffer)^, BufSize);
if (BytesRead > 0) then
begin
fileStream.WriteBuffer(Pointer(Buffer)^, BytesRead);
end;
Application.ProcessMessages
until (BytesRead < BufSize);
if (Size <> fileStream.Size) then
begin
exit;
end;
end;
finally
FreeMem(Buffer, BufSize);
fileStream.Destroy;
end;
end;
function TServiceMethods.getDownload(out Size: Int64): TStream;
begin
Result := TFileStream.Create(GetCurrentDir+'\DownloadFile.zip', fmOpenRead
or fmShareDenyNone);
Size := Result.Size;
Result.Position := 0;
end;
Don't use a memory stream here. That forces the entire file into a contiguous block of memory, which as you discovered exhausts memory in a 32 bit process.
Instead, write the downloaded data directly to file. You can remove the intermediate memory stream and write directly to a file stream.
Of course, all of this assumes that getDownload returns a stream that performs the download as you read it. If getDownload reads the entire file into a memory stream then it suffers from the exact same problem as the code in this question.
I am trying to find and replace text in a text file. I have been able to do this in the past with methods like:
procedure SmallFileFindAndReplace(FileName, Find, ReplaceWith: string);
begin
with TStringList.Create do
begin
LoadFromFile(FileName);
Text := StringReplace(Text, Find, ReplaceWith, [rfReplaceAll, rfIgnoreCase]);
SaveToFile(FileName);
Free;
end;
end;
The above works fine when a file is relatively small, however; when the the file size is something like 170 Mb the above code will cause the following error:
EOutOfMemory with message 'Out of memory'
I have tried the following with success, however it takes a long time to run:
procedure Tfrm_Main.button_MakeReplacementClick(Sender: TObject);
var
fs : TFileStream;
s : AnsiString;
//s : string;
begin
fs := TFileStream.Create(edit_SQLFile.Text, fmOpenread or fmShareDenyNone);
try
SetLength(S, fs.Size);
fs.ReadBuffer(S[1], fs.Size);
finally
fs.Free;
end;
s := StringReplace(s, edit_Find.Text, edit_Replace.Text, [rfReplaceAll, rfIgnoreCase]);
fs := TFileStream.Create(edit_SQLFile.Text, fmCreate);
try
fs.WriteBuffer(S[1], Length(S));
finally
fs.Free;
end;
end;
I am new to "Streams" and working with buffers.
Is there a better way to do this?
Thank You.
You have two mistakes in first code example and three - in second example:
Do not load whole large file in memory, especially in 32bit application. If file size more than ~1 Gb, you always get "Out of memory"
StringReplace slows with large strings, because of repeated memory reallocation
In second code you don`t use text encoding in file, so (for Windows) your code "think" that file has UCS2 encoding (two bytes per character). But what you get, if file encoding is Ansi (one byte per character) or UTF8 (variable size of char)?
Thus, for correct find&replace you must use file encoding and read/write parts of file, as LU RD said:
interface
uses
System.Classes,
System.SysUtils;
type
TFileSearchReplace = class(TObject)
private
FSourceFile: TFileStream;
FtmpFile: TFileStream;
FEncoding: TEncoding;
public
constructor Create(const AFileName: string);
destructor Destroy; override;
procedure Replace(const AFrom, ATo: string; ReplaceFlags: TReplaceFlags);
end;
implementation
uses
System.IOUtils,
System.StrUtils;
function Max(const A, B: Integer): Integer;
begin
if A > B then
Result := A
else
Result := B;
end;
{ TFileSearchReplace }
constructor TFileSearchReplace.Create(const AFileName: string);
begin
inherited Create;
FSourceFile := TFileStream.Create(AFileName, fmOpenReadWrite);
FtmpFile := TFileStream.Create(ChangeFileExt(AFileName, '.tmp'), fmCreate);
end;
destructor TFileSearchReplace.Destroy;
var
tmpFileName: string;
begin
if Assigned(FtmpFile) then
tmpFileName := FtmpFile.FileName;
FreeAndNil(FtmpFile);
FreeAndNil(FSourceFile);
TFile.Delete(tmpFileName);
inherited;
end;
procedure TFileSearchReplace.Replace(const AFrom, ATo: string;
ReplaceFlags: TReplaceFlags);
procedure CopyPreamble;
var
PreambleSize: Integer;
PreambleBuf: TBytes;
begin
// Copy Encoding preamble
SetLength(PreambleBuf, 100);
FSourceFile.Read(PreambleBuf, Length(PreambleBuf));
FSourceFile.Seek(0, soBeginning);
PreambleSize := TEncoding.GetBufferEncoding(PreambleBuf, FEncoding);
if PreambleSize <> 0 then
FtmpFile.CopyFrom(FSourceFile, PreambleSize);
end;
function GetLastIndex(const Str, SubStr: string): Integer;
var
i: Integer;
tmpSubStr, tmpStr: string;
begin
if not(rfIgnoreCase in ReplaceFlags) then
begin
i := Pos(SubStr, Str);
Result := i;
while i > 0 do
begin
i := PosEx(SubStr, Str, i + 1);
if i > 0 then
Result := i;
end;
if Result > 0 then
Inc(Result, Length(SubStr) - 1);
end
else
begin
tmpStr := UpperCase(Str);
tmpSubStr := UpperCase(SubStr);
i := Pos(tmpSubStr, tmpStr);
Result := i;
while i > 0 do
begin
i := PosEx(tmpSubStr, tmpStr, i + 1);
if i > 0 then
Result := i;
end;
if Result > 0 then
Inc(Result, Length(tmpSubStr) - 1);
end;
end;
var
SourceSize: int64;
procedure ParseBuffer(Buf: TBytes; var IsReplaced: Boolean);
var
i: Integer;
ReadedBufLen: Integer;
BufStr: string;
DestBytes: TBytes;
LastIndex: Integer;
begin
if IsReplaced and (not(rfReplaceAll in ReplaceFlags)) then
begin
FtmpFile.Write(Buf, Length(Buf));
Exit;
end;
// 1. Get chars from buffer
ReadedBufLen := 0;
for i := Length(Buf) downto 0 do
if FEncoding.GetCharCount(Buf, 0, i) <> 0 then
begin
ReadedBufLen := i;
Break;
end;
if ReadedBufLen = 0 then
raise EEncodingError.Create('Cant convert bytes to str');
FSourceFile.Seek(ReadedBufLen - Length(Buf), soCurrent);
BufStr := FEncoding.GetString(Buf, 0, ReadedBufLen);
if rfIgnoreCase in ReplaceFlags then
IsReplaced := ContainsText(BufStr, AFrom)
else
IsReplaced := ContainsStr(BufStr, AFrom);
if IsReplaced then
begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end
else
LastIndex := Length(BufStr);
SetLength(BufStr, LastIndex);
FSourceFile.Seek(FEncoding.GetByteCount(BufStr) - ReadedBufLen, soCurrent);
BufStr := StringReplace(BufStr, AFrom, ATo, ReplaceFlags);
DestBytes := FEncoding.GetBytes(BufStr);
FtmpFile.Write(DestBytes, Length(DestBytes));
end;
var
Buf: TBytes;
BufLen: Integer;
bReplaced: Boolean;
begin
FSourceFile.Seek(0, soBeginning);
FtmpFile.Size := 0;
CopyPreamble;
SourceSize := FSourceFile.Size;
BufLen := Max(FEncoding.GetByteCount(AFrom) * 5, 2048);
BufLen := Max(FEncoding.GetByteCount(ATo) * 5, BufLen);
SetLength(Buf, BufLen);
bReplaced := False;
while FSourceFile.Position < SourceSize do
begin
BufLen := FSourceFile.Read(Buf, Length(Buf));
SetLength(Buf, BufLen);
ParseBuffer(Buf, bReplaced);
end;
FSourceFile.Size := 0;
FSourceFile.CopyFrom(FtmpFile, 0);
end;
how to use:
procedure TForm2.btn1Click(Sender: TObject);
var
Replacer: TFileSearchReplace;
StartTime: TDateTime;
begin
StartTime:=Now;
Replacer:=TFileSearchReplace.Create('c:\Temp\123.txt');
try
Replacer.Replace('some ัะตะบัั', 'some', [rfReplaceAll, rfIgnoreCase]);
finally
Replacer.Free;
end;
Caption:=FormatDateTime('nn:ss.zzz', Now - StartTime);
end;
Your first try creates several copies of the file in memory:
it loads the whole file into memory (TStringList)
it creates a copy of this memory when accessing the .Text property
it creates yet another copy of this memory when passing that string to StringReplace (The copy is the result which is built in StringReplace.)
You could try to solve the out of memory problem by getting rid of one or more of these copies:
e.g. read the file into a simple string variable rather than a TStringList
or keep the string list but run the StringReplace on each line separately and write the result to the file line by line.
That would increase the maximum file size your code can handle, but you will still run out of memory for huge files. If you want to handle files of any size, your second approach is the way to go.
No - I don't think there's a faster way that the 2nd option (if you want a completely generic search'n'replace function for any file of any size). It may be possible to make a faster version if you code it specifically according to your requirements, but as a general-purpose search'n'replace function, I don't believe you can go faster...
For instance, are you sure you need case-insensitive replacement? I would expect that this would be a large part of the time spent in the replace function. Try (just for kicks) to remove that requirement and see if it doesn't speed up the execution quite a bit on large files (this depends on how the internal coding of the StringReplace function is made - if it has a specific optimization for case-sensitive searches)
I believe refinement of Kami's code is needed to account for the string not being found, but the start of a new instance of the string might occur at the end of the buffer. The else clause is different:
if IsReplaced then begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end else
LastIndex :=Length(BufStr) - Length(AFrom) + 1;
Correct fix is this one:
if IsReplaced then
begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end
else
if FSourceFile.Position < SourceSize then
LastIndex := Length(BufStr) - Length(AFrom) + 1
else
LastIndex := Length(BufStr);
I am using a TStringBuilder class ported from .Net to Delphi 7.
And here is my code snippet:
procedure TForm1.btn1Click(Sender: TObject);
const
FILE_NAME = 'PATH TO A TEXT FILE';
var
sBuilder: TStringBuilder;
I: Integer;
fil: TStringList;
sResult: string;
randInt: Integer;
begin
randomize;
sResult := '';
for I := 1 to 100 do
begin
fil := TStringList.Create;
try
fil.LoadFromFile(FILE_NAME);
randInt := Random(1024);
sBuilder := TStringBuilder.Create(randInt);
try
sBuilder.Append(fil.Text);
sResult := sBuilder.AsString;
finally
sBuilder.free;
end;
mmo1.Text := sResult;
finally
FreeAndNil(fil);
end;
end;
showmessage ('DOne');
end;
I am experiencing AV errors. I can alleviate the problem when I create memory with the size multiple by 1024, however sometimes it still occurs.
Am I doing something wrong?
Your code is fine. The TStringBuilder code you're using is faulty. Consider this method:
procedure TStringBuilder.Append(const AString : string);
var iLen : integer;
begin
iLen := length(AString);
if iLen + FIndex > FBuffMax then _ExpandBuffer;
move(AString[1],FBuffer[FIndex],iLen);
inc(FIndex,iLen);
end;
If the future length is too long for the current buffer size, the buffer is expanded. _ExpandBuffer doubles the size of the buffer, but once that's done, it never checks whether the new buffer size is sufficient. If the original buffer size is 1024, and the file you're loading is 3 KB, then doubling the buffer size to 2048 will still leave the buffer too small in Append, and you'll end up overwriting 1024 bytes beyond the end of the buffer.
Change the if to a while in Append.