Delphi- How to remove NON ANSI (NOT PRINTABLE) characters before saving? - delphi

Can somebody guide me to extend this procedure in a way so it removes all Non Printable characters or replaces with SPACE before it saves the stream to file ? String is read from Binary and could be maximum of 1 MB size.
My Procedure :
var
i : Word;
FileName : TFileName;
SizeofFiles,posi : Integer;
fs, sStream: TFileStream;
SplitFileName: String;
begin
ProgressBar1.Position := 0;
FileName:= lblFilePath.Caption;
SizeofFiles := StrToInt(edt2.Text) ;
posi := StrToInt(edt1.text) ;
fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
fs.Position := Posi ;
begin
SplitFileName := ChangeFileExt(FileName, '.'+ FormatFloat('000', i));
sStream := TFileStream.Create(SplitFileName, fmCreate or fmShareExclusive);
try
if fs.Size - fs.Position < SizeofFiles then
SizeofFiles := fs.Size - fs.Position;
sStream.CopyFrom(fs, SizeofFiles);
ProgressBar1.Position := Round((fs.Position / fs.Size) * 100);
finally
sStream.Free;
end;
end;
finally
fs.Free;
end;
end;

You won't be able to use TStream.CopyFrom() anymore. You would have to Read(Buffer)() from the source TStream into a local byte array, strip off whatever you don't want from that array, and then Write(Buffer)() the remaining bytes to the destination TStream.

Here is a simple demo that should do what you want:
const
SrcFileName : String = 'Test.txt';
DstFileName : String = 'TestResult.txt';
StartPosition : Int64 = 50;
procedure TForm1.Button1Click(Sender: TObject);
var
FS : TFileStream;
Buf : TBytes;
I : Integer;
begin
// Read the source file from starting position
FS := TFileStream.Create(SrcFileName, fmOpenRead or fmShareDenyWrite);
try
FS.Position := StartPosition;
SetLength(Buf, FS.Size - FS.Position);
FS.Read(Buf[0], Length(Buf));
finally
FreeAndNil(FS);
end;
// Replace all non printable character by a space
// Assume file content is ASCII characters
for I := 0 to Length(Buf) - 1 do begin
// You may want to make a more complex test for printable of not
if (Ord(Buf[I]) < Ord(' ')) or (Ord(Buf[I]) > 126) then
Buf[I] := Ord(' ');
end;
// Write destination file
FS := TFileStream.Create(DstFileName, fmCreate);
try
FS.Write(Buf[0], Length(Buf));
finally
FreeAndNil(FS);
end;
end;
This code assume the file is pure ASCII text and that every character whose ASCII code is below 32 (space) or above 126 is not printable. This may not be the case for European languages. You'll easily adapt the test to fit your needs.
The source file could also be Unicode (16 bits characters). You should use a buffer made of Unicode characters or 16 bit integers (Word). And adapt the test for printable.
Could also be UTF8...

Related

How to read last line in a text file using Delphi

I need to read the last line in some very large textfiles (to get the timestamp from the data). TStringlist would be a simple approach but it returns an out of memory error. I'm trying to use seek and blockread, but the characters in the buffer are all nonsense. Is this something to do with unicode?
Function TForm1.ReadLastLine2(FileName: String): String;
var
FileHandle: File;
s,line: string;
ok: 0..1;
Buf: array[1..8] of Char;
k: longword;
i,ReadCount: integer;
begin
AssignFile (FileHandle,FileName);
Reset (FileHandle); // or for binary files: Reset (FileHandle,1);
ok := 0;
k := FileSize (FileHandle);
Seek (FileHandle, k-1);
s := '';
while ok<>1 do begin
BlockRead (FileHandle, buf, SizeOf(Buf)-1, ReadCount); //BlockRead ( var FileHandle : File; var Buffer; RecordCount : Integer {; var RecordsRead : Integer} ) ;
if ord (buf[1]) <>13 then //Arg to integer
s := s + buf[1]
else
ok := ok + 1;
k := k-1;
seek (FileHandle,k);
end;
CloseFile (FileHandle);
// Reverse the order in the line read
setlength (line,length(s));
for i:=1 to length(s) do
line[length(s) - i+1 ] := s[i];
Result := Line;
end;
Based on www.delphipages.com/forum/showthread.php?t=102965
The testfile is a simple CSV I created in excel ( this is not the 100MB I ultimately need to read).
a,b,c,d,e,f,g,h,i,j,blank
A,B,C,D,E,F,G,H,I,J,blank
1,2,3,4,5,6,7,8,9,0,blank
Mary,had,a,little,lamb,His,fleece,was,white,as,snow
And,everywhere,that,Mary,went,The,lamb,was,sure,to,go
You really have to read the file in LARGE chunks from the tail to the head.
Since it is so large it does not fit the memory - then reading it line by line from start to end would be very slow. With ReadLn - twice slow.
You also has to be ready that the last line might end with EOL or may not.
Personally I would also account for three possible EOL sequences:
CR/LF aka #13#10=^M^J - DOS/Windows style
CR without LF - just #13=^M - Classic MacOS file
LF without CR - just #10=^J - UNIX style, including MacOS version 10
If you are sure your CSV files would only ever be generated by native Windows programs it would be safe to assume full CR/LF be used. But if there can be other Java programs, non-Windows platforms, mobile programs - I would be less sure. Of course pure CR without LF would be the least probable case of them all.
uses System.IOUtils, System.Math, System.Classes;
type FileChar = AnsiChar; FileString = AnsiString; // for non-Unicode files
// type FileChar = WideChar; FileString = UnicodeString;// for UTF16 and UCS-2 files
const FileCharSize = SizeOf(FileChar);
// somewhere later in the code add: Assert(FileCharSize = SizeOf(FileString[1]);
function ReadLastLine(const FileName: String): FileString; overload; forward;
const PageSize = 4*1024;
// the minimal read atom of most modern HDD and the memory allocation atom of Win32
// since the chances your file would have lines longer than 4Kb are very small - I would not increase it to several atoms.
function ReadLastLine(const Lines: TStringDynArray): FileString; overload;
var i: integer;
begin
Result := '';
i := High(Lines);
if i < Low(Lines) then exit; // empty array - empty file
Result := Lines[i];
if Result > '' then exit; // we got the line
Dec(i); // skip the empty ghost line, in case last line was CRLF-terminated
if i < Low(Lines) then exit; // that ghost was the only line in the empty file
Result := Lines[i];
end;
// scan for EOLs in not-yet-scanned part
function FindLastLine(buffer: TArray<FileChar>; const OldRead : Integer;
const LastChunk: Boolean; out Line: FileString): boolean;
var i, tailCRLF: integer; c: FileChar;
begin
Result := False;
if Length(Buffer) = 0 then exit;
i := High(Buffer);
tailCRLF := 0; // test for trailing CR/LF
if Buffer[i] = ^J then begin // LF - single, or after CR
Dec(i);
Inc(tailCRLF);
end;
if (i >= Low(Buffer)) and (Buffer[i] = ^M) then begin // CR, alone or before LF
Inc(tailCRLF);
end;
i := High(Buffer) - Max(OldRead, tailCRLF);
if i - Low(Buffer) < 0 then exit; // no new data to read - results would be like before
if OldRead > 0 then Inc(i); // the CR/LF pair could be sliced between new and previous buffer - so need to start a bit earlier
for i := i downto Low(Buffer) do begin
c := Buffer[i];
if (c=^J) or (c=^M) then begin // found EOL
SetString( Line, #Buffer[i+1], High(Buffer) - tailCRLF - i);
exit(True);
end;
end;
// we did not find non-terminating EOL in the buffer (except maybe trailing),
// now we should ask for more file content, if there is still left any
// or take the entire file (without trailing EOL if any)
if LastChunk then begin
SetString( Line, #Buffer[ Low(Buffer) ], Length(Buffer) - tailCRLF);
Result := true;
end;
end;
function ReadLastLine(const FileName: String): FileString; overload;
var Buffer, tmp: TArray<FileChar>;
// dynamic arrays - eases memory management and protect from stack corruption
FS: TFileStream; FSize, NewPos: Int64;
OldRead, NewLen : Integer; EndOfFile: boolean;
begin
Result := '';
FS := TFile.OpenRead(FileName);
try
FSize := FS.Size;
if FSize <= PageSize then begin // small file, we can be lazy!
FreeAndNil(FS); // free the handle and avoid double-free in finally
Result := ReadLastLine( TFile.ReadAllLines( FileName, TEncoding.ANSI ));
// or TEncoding.UTF16
// warning - TFIle is not share-aware, if the file is being written to by another app
exit;
end;
SetLength( Buffer, PageSize div FileCharSize);
OldRead := 0;
repeat
NewPos := FSize - Length(Buffer)*FileCharSize;
EndOfFile := NewPos <= 0;
if NewPos < 0 then NewPos := 0;
FS.Position := NewPos;
FS.ReadBuffer( Buffer[Low(Buffer)], (Length(Buffer) - OldRead)*FileCharSize);
if FindLastLine(Buffer, OldRead, EndOfFile, Result) then
exit; // done !
tmp := Buffer; Buffer := nil; // flip-flop: preparing to broaden our mouth
OldRead := Length(tmp); // need not to re-scan the tail again and again when expanding our scanning range
NewLen := Min( 2*Length(tmp), FSize div FileCharSize );
SetLength(Buffer, NewLen); // this may trigger EOutOfMemory...
Move( tmp[Low(tmp)], Buffer[High(Buffer)-OldRead+1], OldRead*FileCharSize);
tmp := nil; // free old buffer
until EndOfFile;
finally
FS.Free;
end;
end;
PS. Note one extra special case - if you would use Unicode chars (two-bytes ones) and would give odd-length file (3 bytes, 5 bytes, etc) - you would never be ble to scan the starting single byte (half-widechar). Maybe you should add the extra guard there, like Assert( 0 = FS.Size mod FileCharSize)
PPS. As a rule of thumb you better keep those functions out of the form class, - because WHY mixing them? In general you should separate concerns into small blocks. Reading file has nothing with user interaction - so should better be offloaded to an extra UNIT. Then you would be able to use functions from that unit in one form or 10 forms, in main thread or in multi-threaded application. Like LEGO parts - they give you flexibility by being small and separate.
PPPS. Another approach here would be using memory-mapped files. Google for MMF implementations for Delphi and articles about benefits and problems with MMF approach. Personally I think rewriting the code above to use MMF would greatly simplify it, removing several "special cases" and the troublesome and memory copying flip-flop. OTOH it would demand you to be very strict with pointers arithmetic.
https://en.wikipedia.org/wiki/Memory-mapped_file
https://msdn.microsoft.com/en-us/library/ms810613.aspx
http://torry.net/quicksearchd.php?String=memory+map&Title=No
Your char type is two byte, so that buffer is 16 byte. Then with blockread you read sizeof(buffer)-1 byte into it, and check the first 2 byte char if it is equal to #13.
The sizeof(buffer)-1 is dodgy (where does that -1 come from?), and the rest is valid, but only if your input file is utf16.
Also your read 8 (or 16) characters each time, but compare only one and then do a seek again. That is not very logical either.
If your encoding is not utf16, I suggest you change the type of a buffer element to ansichar and remove the -1
In response to kopiks suggestion, I figured out how to do it with TFilestream, it works ok with the simple test file, though there may be some further tweeks when I use it on a variety of csv files. Also, I don't make any claims that this is the most efficient method.
procedure TForm1.Button6Click(Sender: TObject);
Var
StreamSize, ApproxNumRows : Integer;
TempStr : String;
begin
if OpenDialog1.Execute then begin
TempStr := ReadLastLineOfTextFile(OpenDialog1.FileName,StreamSize, ApproxNumRows);
// TempStr := ReadFileStream('c:\temp\CSVTestFile.csv');
ShowMessage ('approximately '+ IntToStr(ApproxNumRows)+' Rows');
ListBox1.Items.Add(TempStr);
end;
end;
Function TForm1.ReadLastLineOfTextFile(const FileName: String; var StreamSize, ApproxNumRows : Integer): String;
const
MAXLINELENGTH = 256;
var
Stream: TFileStream;
BlockSize,CharCount : integer;
Hash13Found : Boolean;
Buffer : array [0..MAXLINELENGTH] of AnsiChar;
begin
Hash13Found := False;
Result :='';
Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
StreamSize := Stream.size;
if StreamSize < MAXLINELENGTH then
BlockSize := StreamSize
Else
BlockSize := MAXLINELENGTH;
// for CharCount := 0 to Length(Buffer)-1 do begin
// Buffer[CharCount] := #0; // zeroing the buffer can aid diagnostics
// end;
CharCount := 0;
Repeat
Stream.Seek(-(CharCount+3), 2); //+3 misses out the #0,#10,#13 at the end of the file
Stream.Read( Buffer[CharCount], 1);
Result := String(Buffer[CharCount]) + result;
if Buffer[CharCount] =#13 then
Hash13Found := True;
Inc(CharCount);
Until Hash13Found OR (CharCount = BlockSize);
ShowMessage(Result);
ApproxNumRows := Round(StreamSize / CharCount);
end;
Just thought of a new solution.
Again, there could be better ones, but this one is the best i thought of.
function GetLastLine(textFilePath: string): string;
var
list: tstringlist;
begin
list := tstringlist.Create;
try
list.LoadFromFile(textFilePath);
result := list[list.Count-1];
finally
list.free;
end;
end;

Delphi Change Hex Address on file

I wanna over delphi change hex adress 15 character,
I follow like this a way but I didnt get success,
BlockRead(F,arrChar,1); //read all to the buf
CloseFile(F); //close file
IMEI:=Form1.Edit1.Text; //get the number
Form1.Memo1.Lines.Add('new IMEI is'+IMEI); //output
for i:=524288 to 524288+15 do /
arrChar[i]:=IMEI[i-524287];
Do this with a file stream.
var
Stream: TFileStream;
....
Stream := TFileStream.Create(FileName, fmOpenWrite);
try
Stream.Position := $080000;
Stream.WriteBuffer(IMEI, SizeOf(IMEI));
finally
Stream.Free;
end;
I'm assuming that IMEI is an fixed length array of bytes of length 15 but your code attempts to write 16 bytes so it would appear that you are suffering from a degree of confusion.
In your code, your variable IMEI is a string. Which is not an array of bytes. Please don't make that classic mistake of regarding a string as an array of bytes.
You might declare an IMEI type like this:
type
TIMEI = array [0..14] of Byte;
Then you might write a function to populate such a variable from text:
function TextToIMEI(const Text: string): TIMEI;
var
ResultIndex, TextIndex: Integer;
C: Char;
begin
if Length(Text) <> Length(Result) then
raise SomeExceptionClass.Create(...);
TextIndex := low(Text);
for ResultIndex := low(Result) to high(Result) do
begin
C := Result[TextIndex];
if (C < '0') or (C > '9') then
raise SomeExceptionClass.Create(...);
Result[ResultIndex] := ord(C);
inc(TextIndex);
end;
end;
You might then combine this code with that above:
procedure WriteIMEItoFile(const FileName: string; FileOffset: Int64; const IMEI: TIMEI);
var
Stream: TFileStream;
begin
Stream := TFileStream.Create(FileName, fmOpenWrite);
try
Stream.Position := FileOffset;
Stream.WriteBuffer(IMEI, SizeOf(IMEI));
finally
Stream.Free;
end;
end;
Call it like this:
WriteIMEItoFile(FileName, $080000, TextToIMEI(Form1.Edit1.Text));
Although it looks a bit odd that you are explicitly using the Form1 global variable. If that code executes in a method of TForm1 then you should use the implicit Self variable.

Find and Replace Text in a Large TextFile (Delphi XE5)

I am trying to find and replace text in a text file. I have been able to do this in the past with methods like:
procedure SmallFileFindAndReplace(FileName, Find, ReplaceWith: string);
begin
with TStringList.Create do
begin
LoadFromFile(FileName);
Text := StringReplace(Text, Find, ReplaceWith, [rfReplaceAll, rfIgnoreCase]);
SaveToFile(FileName);
Free;
end;
end;
The above works fine when a file is relatively small, however; when the the file size is something like 170 Mb the above code will cause the following error:
EOutOfMemory with message 'Out of memory'
I have tried the following with success, however it takes a long time to run:
procedure Tfrm_Main.button_MakeReplacementClick(Sender: TObject);
var
fs : TFileStream;
s : AnsiString;
//s : string;
begin
fs := TFileStream.Create(edit_SQLFile.Text, fmOpenread or fmShareDenyNone);
try
SetLength(S, fs.Size);
fs.ReadBuffer(S[1], fs.Size);
finally
fs.Free;
end;
s := StringReplace(s, edit_Find.Text, edit_Replace.Text, [rfReplaceAll, rfIgnoreCase]);
fs := TFileStream.Create(edit_SQLFile.Text, fmCreate);
try
fs.WriteBuffer(S[1], Length(S));
finally
fs.Free;
end;
end;
I am new to "Streams" and working with buffers.
Is there a better way to do this?
Thank You.
You have two mistakes in first code example and three - in second example:
Do not load whole large file in memory, especially in 32bit application. If file size more than ~1 Gb, you always get "Out of memory"
StringReplace slows with large strings, because of repeated memory reallocation
In second code you don`t use text encoding in file, so (for Windows) your code "think" that file has UCS2 encoding (two bytes per character). But what you get, if file encoding is Ansi (one byte per character) or UTF8 (variable size of char)?
Thus, for correct find&replace you must use file encoding and read/write parts of file, as LU RD said:
interface
uses
System.Classes,
System.SysUtils;
type
TFileSearchReplace = class(TObject)
private
FSourceFile: TFileStream;
FtmpFile: TFileStream;
FEncoding: TEncoding;
public
constructor Create(const AFileName: string);
destructor Destroy; override;
procedure Replace(const AFrom, ATo: string; ReplaceFlags: TReplaceFlags);
end;
implementation
uses
System.IOUtils,
System.StrUtils;
function Max(const A, B: Integer): Integer;
begin
if A > B then
Result := A
else
Result := B;
end;
{ TFileSearchReplace }
constructor TFileSearchReplace.Create(const AFileName: string);
begin
inherited Create;
FSourceFile := TFileStream.Create(AFileName, fmOpenReadWrite);
FtmpFile := TFileStream.Create(ChangeFileExt(AFileName, '.tmp'), fmCreate);
end;
destructor TFileSearchReplace.Destroy;
var
tmpFileName: string;
begin
if Assigned(FtmpFile) then
tmpFileName := FtmpFile.FileName;
FreeAndNil(FtmpFile);
FreeAndNil(FSourceFile);
TFile.Delete(tmpFileName);
inherited;
end;
procedure TFileSearchReplace.Replace(const AFrom, ATo: string;
ReplaceFlags: TReplaceFlags);
procedure CopyPreamble;
var
PreambleSize: Integer;
PreambleBuf: TBytes;
begin
// Copy Encoding preamble
SetLength(PreambleBuf, 100);
FSourceFile.Read(PreambleBuf, Length(PreambleBuf));
FSourceFile.Seek(0, soBeginning);
PreambleSize := TEncoding.GetBufferEncoding(PreambleBuf, FEncoding);
if PreambleSize <> 0 then
FtmpFile.CopyFrom(FSourceFile, PreambleSize);
end;
function GetLastIndex(const Str, SubStr: string): Integer;
var
i: Integer;
tmpSubStr, tmpStr: string;
begin
if not(rfIgnoreCase in ReplaceFlags) then
begin
i := Pos(SubStr, Str);
Result := i;
while i > 0 do
begin
i := PosEx(SubStr, Str, i + 1);
if i > 0 then
Result := i;
end;
if Result > 0 then
Inc(Result, Length(SubStr) - 1);
end
else
begin
tmpStr := UpperCase(Str);
tmpSubStr := UpperCase(SubStr);
i := Pos(tmpSubStr, tmpStr);
Result := i;
while i > 0 do
begin
i := PosEx(tmpSubStr, tmpStr, i + 1);
if i > 0 then
Result := i;
end;
if Result > 0 then
Inc(Result, Length(tmpSubStr) - 1);
end;
end;
var
SourceSize: int64;
procedure ParseBuffer(Buf: TBytes; var IsReplaced: Boolean);
var
i: Integer;
ReadedBufLen: Integer;
BufStr: string;
DestBytes: TBytes;
LastIndex: Integer;
begin
if IsReplaced and (not(rfReplaceAll in ReplaceFlags)) then
begin
FtmpFile.Write(Buf, Length(Buf));
Exit;
end;
// 1. Get chars from buffer
ReadedBufLen := 0;
for i := Length(Buf) downto 0 do
if FEncoding.GetCharCount(Buf, 0, i) <> 0 then
begin
ReadedBufLen := i;
Break;
end;
if ReadedBufLen = 0 then
raise EEncodingError.Create('Cant convert bytes to str');
FSourceFile.Seek(ReadedBufLen - Length(Buf), soCurrent);
BufStr := FEncoding.GetString(Buf, 0, ReadedBufLen);
if rfIgnoreCase in ReplaceFlags then
IsReplaced := ContainsText(BufStr, AFrom)
else
IsReplaced := ContainsStr(BufStr, AFrom);
if IsReplaced then
begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end
else
LastIndex := Length(BufStr);
SetLength(BufStr, LastIndex);
FSourceFile.Seek(FEncoding.GetByteCount(BufStr) - ReadedBufLen, soCurrent);
BufStr := StringReplace(BufStr, AFrom, ATo, ReplaceFlags);
DestBytes := FEncoding.GetBytes(BufStr);
FtmpFile.Write(DestBytes, Length(DestBytes));
end;
var
Buf: TBytes;
BufLen: Integer;
bReplaced: Boolean;
begin
FSourceFile.Seek(0, soBeginning);
FtmpFile.Size := 0;
CopyPreamble;
SourceSize := FSourceFile.Size;
BufLen := Max(FEncoding.GetByteCount(AFrom) * 5, 2048);
BufLen := Max(FEncoding.GetByteCount(ATo) * 5, BufLen);
SetLength(Buf, BufLen);
bReplaced := False;
while FSourceFile.Position < SourceSize do
begin
BufLen := FSourceFile.Read(Buf, Length(Buf));
SetLength(Buf, BufLen);
ParseBuffer(Buf, bReplaced);
end;
FSourceFile.Size := 0;
FSourceFile.CopyFrom(FtmpFile, 0);
end;
how to use:
procedure TForm2.btn1Click(Sender: TObject);
var
Replacer: TFileSearchReplace;
StartTime: TDateTime;
begin
StartTime:=Now;
Replacer:=TFileSearchReplace.Create('c:\Temp\123.txt');
try
Replacer.Replace('some текст', 'some', [rfReplaceAll, rfIgnoreCase]);
finally
Replacer.Free;
end;
Caption:=FormatDateTime('nn:ss.zzz', Now - StartTime);
end;
Your first try creates several copies of the file in memory:
it loads the whole file into memory (TStringList)
it creates a copy of this memory when accessing the .Text property
it creates yet another copy of this memory when passing that string to StringReplace (The copy is the result which is built in StringReplace.)
You could try to solve the out of memory problem by getting rid of one or more of these copies:
e.g. read the file into a simple string variable rather than a TStringList
or keep the string list but run the StringReplace on each line separately and write the result to the file line by line.
That would increase the maximum file size your code can handle, but you will still run out of memory for huge files. If you want to handle files of any size, your second approach is the way to go.
No - I don't think there's a faster way that the 2nd option (if you want a completely generic search'n'replace function for any file of any size). It may be possible to make a faster version if you code it specifically according to your requirements, but as a general-purpose search'n'replace function, I don't believe you can go faster...
For instance, are you sure you need case-insensitive replacement? I would expect that this would be a large part of the time spent in the replace function. Try (just for kicks) to remove that requirement and see if it doesn't speed up the execution quite a bit on large files (this depends on how the internal coding of the StringReplace function is made - if it has a specific optimization for case-sensitive searches)
I believe refinement of Kami's code is needed to account for the string not being found, but the start of a new instance of the string might occur at the end of the buffer. The else clause is different:
if IsReplaced then begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end else
LastIndex :=Length(BufStr) - Length(AFrom) + 1;
Correct fix is this one:
if IsReplaced then
begin
LastIndex := GetLastIndex(BufStr, AFrom);
LastIndex := Max(LastIndex, Length(BufStr) - Length(AFrom) + 1);
end
else
if FSourceFile.Position < SourceSize then
LastIndex := Length(BufStr) - Length(AFrom) + 1
else
LastIndex := Length(BufStr);

Issues with AES Encryption using SynCrypto

Am trying to encrypt a file using SynCrypto.pas with AES 256, but it fails if I try to encrypt a file whose size is not a multiple of 16 bytes. The decrypted data contains junk.
Example:
Original string in txt file
we are testing the file
Encrypted String
[ù[„|wáî}f *!4ìÙw¬•ü¨s
Decrypted String
we are testing tÝp?J
Here is my Encryption Code
procedure TForm1.Button1Click(Sender: TObject);
var
A: TAES;
Key: TSHA256Digest;
s, B: TAESBlock;
ks: integer;
st: RawByteString;
InStream, OutStream: TFileStream;
SuperNo, TheSize, StreamSize: Int64;
begin
InStream := TFileStream.Create('test.txt', fmOpenRead);
OutStream := TFileStream.Create('out.txt', fmCreate);
InStream.Position := 0;
OutStream.Position := 0;
st := '1234essai';
ks := 256;
SHA256Weak(st, Key);
A.EncryptInit(Key, ks);
StreamSize := InStream.Size;
while InStream.Position < StreamSize do
begin
TheSize := StreamSize - InStream.Position;
if TheSize < 16 then
begin
SuperNo := StreamSize - InStream.Position;
end
else
begin
SuperNo := 16;
end;
InStream.ReadBuffer(s, SuperNo);
A.Encrypt(s, B);
OutStream.WriteBuffer(B, SuperNo);
end;
A.Done;
ShowMessage('Finish');
InStream.Free;
OutStream.Free;
end;
Here is my Decryption Code
procedure TForm1.Button2Click(Sender: TObject);
var
A: TAES;
Key: TSHA256Digest;
s, B: TAESBlock;
ks: integer;
st: RawByteString;
InStream, OutStream: TFileStream;
SuperNo, TheSize, StreamSize: Int64;
begin
InStream := TFileStream.Create('out.txt', fmOpenRead);
OutStream := TFileStream.Create('in.txt', fmCreate);
InStream.Position := 0;
OutStream.Position := 0;
st := '1234essai';
ks := 256;
SHA256Weak(st, Key);
A.DecryptInit(Key, ks);
StreamSize := InStream.Size;
while InStream.Position < StreamSize do
begin
TheSize := StreamSize - InStream.Position;
if TheSize < 16 then
begin
SuperNo := StreamSize - InStream.Position;
end
else
begin
SuperNo := 16;
end;
InStream.ReadBuffer(s, SuperNo);
A.Decrypt(s, B);
OutStream.WriteBuffer(B, SuperNo);
end;
A.Done;
ShowMessage('Finish');
InStream.Free;
OutStream.Free;
end;
Using Delphi XE7.
AES is a block cipher algorithm. It means that it works by blocks that are 16 bytes in size for AES. So you need to use padding if your data does not fit in 16 bytes blocks (which is the case for your text).
Instead of trying to re-invent the wheel, and introduce mistakes, you would rather rely on existing padding algorithm. And you should add a block chaining mode. Currently the default algorithm you are using is ECB, which is known to be weak, since it does not use a chaining mode nor an Initialization Vector (IV).
The SynCrypto unit contains a safe block chaining mode like CFB and PKCS7 padding, so you could write:
var inp,out: RawByteString;
begin
// encryption:
inp := StringFromFile('in.txt');
out := TAESCFB.SimpleEncrypt(inp,'privatekey',true,true);
FileFromString(out,'out.txt');
// or in a single line:
FileFromString(TAESCFB.SimpleEncrypt(StringFromFile('in.txt'),'privatekey',true,true),'out.txt');
// decryption
inp := StringFromFile('out.txt');
out := TAESCFB.SimpleEncrypt(inp,'privatekey',false,true);
FileFromString(out,'in.txt');
end;
The above code is safe and fast, will generate the binary key using SHA-256 over the supplied 'privatekey', and will use AES-NI hardware instructions if you CPU supports it. You can use another chaining mode, just by changing the TAESCFB class name to another type available.

TStringList load memorystream? <Delphi>

edited :
My file has several lines. I encrypt the file onto a new file. I want to store each line of decrypted file (=a stream) into StringList.
First, I have a file contain :
aa
bb
cc
I encrypt the file with this function :
procedure EnDecryptFile(pathin, pathout: string; Chave: Word) ;
var
InMS, OutMS: TMemoryStream;
cnt: Integer;
C: byte;
begin
InMS := TMemoryStream.Create;
OutMS := TMemoryStream.Create;
try
InMS.LoadFromFile(pathin) ;
InMS.Position := 0;
for cnt := 0 to InMS.Size - 1 do
begin
InMS.Read(C, 1) ;
C := (C xor not (ord(chave shr cnt))) ;
OutMS.Write(C, 1) ;
end;
OutMS.SaveToFile(pathout) ;
finally
InMS.Free;
OutMS.Free;
end;
end;
My purpose now is to store original value of each line into StringList. I don't want to store decrypted file into harddisk, so I use stream.
This is the function to decrypt the file into stream :
procedure DecryptFile(pathin: string; buff: TMemoryStream; Chave: Word);
var
InMS: TMemoryStream;
cnt: Integer;
C: byte;
begin
InMS := TMemoryStream.Create;
try
InMS.LoadFromFile(pathin);
InMS.Position := 0;
for cnt := 0 to InMS.Size - 1 do
begin
InMS.Read(C, 1);
C := (C xor not(ord(Chave shr cnt)));
buff.Write(C, 1);
end;
// buff.SaveToFile('c:\temp\dump.txt') ;
finally
InMS.free;
end;
end;
--
bbuffer := TMemoryStream.Create;
try
DecryptFile(path, bbuffer, 10); //
//ShowMessage(IntToStr(bbuffer.size)); // output : 1000
bbuffer.Position := 0;
SL := TStringList.Create;
try
SL.LoadFromStream(bbuffer);
for I := 0 to SL.Count - 1 do // SL.Count = 1
begin;
//add each line of orginal file into SL??
end;
finally
SL.free;
end;
finally
bbuffer.free;
end;
Load from stream takes a TStream so you can give it a TFileStream as well as an TMemoryStream. The code you posted should work without any problems. What exactly does not work?
You might have to use
bbuffer.Position := 0;
to reset the position to the start of the stream before loading it into the string list.
EDIT: You write single bytes to a stream and then try to load a string list from it. That won't work. The stream is just a collection of bytes. How should the string list know where one string ends and the next one starts? TStringList.SaveToStream writes separator bytes to the stream so that it can parse the string list back. So, you could do your encryption on the string list and then write the whole string list to the stream, then read the stringlist and do the decryption on the string list.

Resources