Can somebody show me how to extend this procedure so that it removes all non-printable characters, or replaces them with a space, before it saves the stream to a file? The data is read from a binary file and can be up to 1 MB in size.
My procedure:
var
i : Word;
FileName : TFileName;
SizeofFiles,posi : Integer;
fs, sStream: TFileStream;
SplitFileName: String;
begin
ProgressBar1.Position := 0;
FileName:= lblFilePath.Caption;
SizeofFiles := StrToInt(edt2.Text) ;
posi := StrToInt(edt1.text) ;
fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
fs.Position := Posi ;
begin
SplitFileName := ChangeFileExt(FileName, '.'+ FormatFloat('000', i));
sStream := TFileStream.Create(SplitFileName, fmCreate or fmShareExclusive);
try
if fs.Size - fs.Position < SizeofFiles then
SizeofFiles := fs.Size - fs.Position;
sStream.CopyFrom(fs, SizeofFiles);
ProgressBar1.Position := Round((fs.Position / fs.Size) * 100);
finally
sStream.Free;
end;
end;
finally
fs.Free;
end;
end;
You won't be able to use TStream.CopyFrom() anymore. You would have to Read() or ReadBuffer() from the source TStream into a local byte array, strip whatever you don't want from that array, and then Write() or WriteBuffer() the remaining bytes to the destination TStream.
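The read, filter, write loop described above could be sketched roughly as follows. CopyPrintable is a hypothetical helper name, the 4 KB buffer size is an arbitrary choice, and the "printable" test assumes plain ASCII (codes 32 to 126); adapt it if your data uses another encoding.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper, not part of the RTL: copies up to Count bytes from
// Src to Dst, replacing every byte outside printable ASCII with a space.
procedure CopyPrintable(Src, Dst: TStream; Count: Int64);
var
  Buf: array[0..4095] of Byte;
  N, I: Integer;
begin
  while Count > 0 do
  begin
    // Never ask for more than the buffer holds or than is still wanted
    if Count > SizeOf(Buf) then
      N := SizeOf(Buf)
    else
      N := Integer(Count);
    N := Src.Read(Buf, N);      // may return fewer bytes than requested
    if N <= 0 then
      Break;                    // source ended early
    for I := 0 to N - 1 do
      if (Buf[I] < 32) or (Buf[I] > 126) then
        Buf[I] := 32;           // assumption: ASCII-only notion of "printable"
    Dst.WriteBuffer(Buf, N);
    Dec(Count, N);
  end;
end;
```

In the procedure from the question, the call sStream.CopyFrom(fs, SizeofFiles) would then become CopyPrintable(fs, sStream, SizeofFiles).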
Here is a simple demo that should do what you want:
const
SrcFileName : String = 'Test.txt';
DstFileName : String = 'TestResult.txt';
StartPosition : Int64 = 50;
procedure TForm1.Button1Click(Sender: TObject);
var
FS : TFileStream;
Buf : TBytes;
I : Integer;
begin
// Read the source file from starting position
FS := TFileStream.Create(SrcFileName, fmOpenRead or fmShareDenyWrite);
try
FS.Position := StartPosition;
SetLength(Buf, FS.Size - FS.Position);
FS.Read(Buf[0], Length(Buf));
finally
FreeAndNil(FS);
end;
// Replace all non printable character by a space
// Assume file content is ASCII characters
for I := 0 to Length(Buf) - 1 do begin
// You may want to make a more complex test for printable or not
if (Ord(Buf[I]) < Ord(' ')) or (Ord(Buf[I]) > 126) then
Buf[I] := Ord(' ');
end;
// Write destination file
FS := TFileStream.Create(DstFileName, fmCreate);
try
FS.Write(Buf[0], Length(Buf));
finally
FreeAndNil(FS);
end;
end;
This code assumes the file is pure ASCII text and that every character whose ASCII code is below 32 (space) or above 126 is not printable. This may not be the case for European languages, where single-byte code pages use codes above 127 for accented letters. You can easily adapt the test to fit your needs.
The source file could also be Unicode (16-bit characters). In that case, use a buffer made of Unicode characters or 16-bit integers (Word) and adapt the printability test accordingly.
It could also be UTF-8...
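As a rough sketch of how the test might be adapted for those two cases: both routines below are hypothetical helpers (the names are illustrative), and like the demo they treat tab, CR and LF as non-printable. The UTF-16 version assumes the buffer already holds code units in native byte order. The UTF-8 version only touches single-byte control codes, because bytes of $80 and above are parts of multi-byte sequences.

```pascal
// Hypothetical helpers; the names are illustrative, not RTL routines.

// UTF-16: one array element per 16-bit code unit, in native byte order.
procedure BlankControlsUtf16(var Units: array of Word);
var
  I: Integer;
begin
  for I := Low(Units) to High(Units) do
    // C0 controls (< 32), DEL (127) and C1 controls ($80..$9F) become spaces.
    // Everything else, including surrogate halves and a BOM, is left untouched.
    if (Units[I] < 32) or (Units[I] = 127) or
       ((Units[I] >= $80) and (Units[I] <= $9F)) then
      Units[I] := 32;
end;

// UTF-8: bytes >= $80 belong to multi-byte sequences and must not be changed,
// so only the single-byte control codes are replaced.
procedure BlankControlsUtf8(var Bytes: array of Byte);
var
  I: Integer;
begin
  for I := Low(Bytes) to High(Bytes) do
    if (Bytes[I] < 32) or (Bytes[I] = 127) then
      Bytes[I] := 32;
end;
```

Both take open array parameters, so a dynamic array such as TBytes or an array of Word read from the stream can be passed directly.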
I need to read the last line in some very large text files (to get the timestamp from the data). TStringList would be a simple approach, but it raises an out-of-memory error. I'm trying to use Seek and BlockRead, but the characters in the buffer are all nonsense. Is this something to do with Unicode?
Function TForm1.ReadLastLine2(FileName: String): String;
var
FileHandle: File;
s,line: string;
ok: 0..1;
Buf: array[1..8] of Char;
k: longword;
i,ReadCount: integer;
begin
AssignFile (FileHandle,FileName);
Reset (FileHandle); // or for binary files: Reset (FileHandle,1);
ok := 0;
k := FileSize (FileHandle);
Seek (FileHandle, k-1);
s := '';
while ok<>1 do begin
BlockRead (FileHandle, buf, SizeOf(Buf)-1, ReadCount); //BlockRead ( var FileHandle : File; var Buffer; RecordCount : Integer {; var RecordsRead : Integer} ) ;
if ord (buf[1]) <>13 then //Arg to integer
s := s + buf[1]
else
ok := ok + 1;
k := k-1;
seek (FileHandle,k);
end;
CloseFile (FileHandle);
// Reverse the order in the line read
setlength (line,length(s));
for i:=1 to length(s) do
line[length(s) - i+1 ] := s[i];
Result := Line;
end;
Based on www.delphipages.com/forum/showthread.php?t=102965
The test file is a simple CSV I created in Excel (this is not the 100 MB file I ultimately need to read).
a,b,c,d,e,f,g,h,i,j,blank
A,B,C,D,E,F,G,H,I,J,blank
1,2,3,4,5,6,7,8,9,0,blank
Mary,had,a,little,lamb,His,fleece,was,white,as,snow
And,everywhere,that,Mary,went,The,lamb,was,sure,to,go
You really have to read the file in LARGE chunks, from the tail towards the head.
Since the file is too large to fit in memory, reading it line by line from start to end would be very slow. With ReadLn it would be twice as slow.
You also have to be prepared for the last line either ending with an EOL or not.
Personally I would also account for three possible EOL sequences:
CR/LF aka #13#10=^M^J - DOS/Windows style
CR without LF - just #13=^M - Classic MacOS file
LF without CR - just #10=^J - UNIX style, including MacOS version 10
If you are sure your CSV files will only ever be generated by native Windows programs, it is safe to assume that the full CR/LF is used. But if Java programs, non-Windows platforms, or mobile programs can be involved, I would be less sure. Of course, a bare CR without LF is the least probable case of them all.
uses System.IOUtils, System.Math, System.Classes;
type FileChar = AnsiChar; FileString = AnsiString; // for non-Unicode files
// type FileChar = WideChar; FileString = UnicodeString;// for UTF16 and UCS-2 files
const FileCharSize = SizeOf(FileChar);
// somewhere later in the code add: Assert(FileCharSize = SizeOf(FileString[1]));
function ReadLastLine(const FileName: String): FileString; overload; forward;
const PageSize = 4*1024;
// the minimal read atom of most modern HDD and the memory allocation atom of Win32
// since the chances your file would have lines longer than 4Kb are very small - I would not increase it to several atoms.
function ReadLastLine(const Lines: TStringDynArray): FileString; overload;
var i: integer;
begin
Result := '';
i := High(Lines);
if i < Low(Lines) then exit; // empty array - empty file
Result := Lines[i];
if Result > '' then exit; // we got the line
Dec(i); // skip the empty ghost line, in case last line was CRLF-terminated
if i < Low(Lines) then exit; // that ghost was the only line in the empty file
Result := Lines[i];
end;
// scan for EOLs in not-yet-scanned part
function FindLastLine(buffer: TArray<FileChar>; const OldRead : Integer;
const LastChunk: Boolean; out Line: FileString): boolean;
var i, tailCRLF: integer; c: FileChar;
begin
Result := False;
if Length(Buffer) = 0 then exit;
i := High(Buffer);
tailCRLF := 0; // test for trailing CR/LF
if Buffer[i] = ^J then begin // LF - single, or after CR
Dec(i);
Inc(tailCRLF);
end;
if (i >= Low(Buffer)) and (Buffer[i] = ^M) then begin // CR, alone or before LF
Inc(tailCRLF);
end;
i := High(Buffer) - Max(OldRead, tailCRLF);
if i - Low(Buffer) < 0 then exit; // no new data to read - results would be like before
if OldRead > 0 then Inc(i); // the CR/LF pair could be sliced between new and previous buffer - so need to start a bit earlier
for i := i downto Low(Buffer) do begin
c := Buffer[i];
if (c=^J) or (c=^M) then begin // found EOL
SetString( Line, @Buffer[i+1], High(Buffer) - tailCRLF - i);
exit(True);
end;
end;
// we did not find non-terminating EOL in the buffer (except maybe trailing),
// now we should ask for more file content, if there is still left any
// or take the entire file (without trailing EOL if any)
if LastChunk then begin
SetString( Line, @Buffer[ Low(Buffer) ], Length(Buffer) - tailCRLF);
Result := true;
end;
end;
function ReadLastLine(const FileName: String): FileString; overload;
var Buffer, tmp: TArray<FileChar>;
// dynamic arrays - eases memory management and protect from stack corruption
FS: TFileStream; FSize, NewPos: Int64;
OldRead, NewLen : Integer; EndOfFile: boolean;
begin
Result := '';
FS := TFile.OpenRead(FileName);
try
FSize := FS.Size;
if FSize <= PageSize then begin // small file, we can be lazy!
FreeAndNil(FS); // free the handle and avoid double-free in finally
Result := ReadLastLine( TFile.ReadAllLines( FileName, TEncoding.ANSI ));
// or TEncoding.UTF16
// warning - TFile is not share-aware, which matters if the file is being written to by another app
exit;
end;
SetLength( Buffer, PageSize div FileCharSize);
OldRead := 0;
repeat
NewPos := FSize - Length(Buffer)*FileCharSize;
EndOfFile := NewPos <= 0;
if NewPos < 0 then NewPos := 0;
FS.Position := NewPos;
FS.ReadBuffer( Buffer[Low(Buffer)], (Length(Buffer) - OldRead)*FileCharSize);
if FindLastLine(Buffer, OldRead, EndOfFile, Result) then
exit; // done !
tmp := Buffer; Buffer := nil; // flip-flop: preparing to broaden our mouth
OldRead := Length(tmp); // need not to re-scan the tail again and again when expanding our scanning range
NewLen := Min( 2*Length(tmp), FSize div FileCharSize );
SetLength(Buffer, NewLen); // this may trigger EOutOfMemory...
Move( tmp[Low(tmp)], Buffer[High(Buffer)-OldRead+1], OldRead*FileCharSize);
tmp := nil; // free old buffer
until EndOfFile;
finally
FS.Free;
end;
end;
PS. Note one extra special case: if you use Unicode chars (two-byte ones) and are given an odd-length file (3 bytes, 5 bytes, etc.), you will never be able to scan the leading single byte (half a WideChar). Maybe you should add an extra guard there, like Assert( 0 = FS.Size mod FileCharSize)
PPS. As a rule of thumb, you had better keep those functions out of the form class, because why mix them? In general you should separate concerns into small blocks. Reading a file has nothing to do with user interaction, so it is better offloaded to a separate unit. Then you can use the functions from that unit in one form or in 10 forms, in the main thread or in a multi-threaded application. Like LEGO parts, they give you flexibility by being small and separate.
PPPS. Another approach here would be to use memory-mapped files. Google for MMF implementations for Delphi and for articles about the benefits and problems of the MMF approach. Personally I think rewriting the code above to use MMF would greatly simplify it, removing several "special cases" and the troublesome memory-copying flip-flop. OTOH it would require you to be very strict with pointer arithmetic.
https://en.wikipedia.org/wiki/Memory-mapped_file
https://msdn.microsoft.com/en-us/library/ms810613.aspx
http://torry.net/quicksearchd.php?String=memory+map&Title=No
Your Char type is two bytes wide, so that buffer is 16 bytes. Then with BlockRead you read SizeOf(Buf)-1 bytes into it and check whether the first two-byte char is equal to #13.
The SizeOf(Buf)-1 is dodgy (where does that -1 come from?). The rest is valid, but only if your input file is UTF-16.
Also, you read 8 (or 16) characters each time, but compare only one and then seek again. That is not very logical either.
If your encoding is not UTF-16, I suggest you change the type of the buffer elements to AnsiChar and remove the -1.
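Putting that advice together, one possible sketch reads a single block from the end of the file into a byte buffer and scans it backwards for a line break. ReadLastLineTail and its TailSize parameter are hypothetical names. The sketch assumes a single-byte encoding (or UTF-8) and that the last line fits inside TailSize bytes; a longer line would come back truncated at the front.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: returns the last non-empty line found in the final
// TailSize bytes of the file, as raw single-byte characters.
function ReadLastLineTail(const FileName: string;
  TailSize: Integer = 4096): AnsiString;
var
  FS: TFileStream;
  Buf: TBytes;
  Count, LineStart, LineEnd: Integer;
begin
  Result := '';
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    if FS.Size < TailSize then
      Count := Integer(FS.Size)
    else
      Count := TailSize;
    if Count <= 0 then
      Exit;                              // empty file
    SetLength(Buf, Count);
    FS.Position := FS.Size - Count;      // one seek, one read
    FS.ReadBuffer(Buf[0], Count);
  finally
    FS.Free;
  end;
  // Skip any trailing CR/LF so a terminated last line is still found
  LineEnd := Count - 1;
  while (LineEnd >= 0) and (Buf[LineEnd] in [10, 13]) do
    Dec(LineEnd);
  if LineEnd < 0 then
    Exit;                                // the tail held only line breaks
  // Walk back to the byte just after the previous line break
  LineStart := LineEnd;
  while (LineStart > 0) and not (Buf[LineStart - 1] in [10, 13]) do
    Dec(LineStart);
  SetString(Result, PAnsiChar(@Buf[LineStart]), LineEnd - LineStart + 1);
end;
```

Because it accepts CR, LF, or both as a separator, the same scan works for Windows, Unix and classic Mac line endings.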
In response to kopik's suggestion, I figured out how to do it with TFileStream. It works OK with the simple test file, though there may be some further tweaks needed when I use it on a variety of CSV files. Also, I don't claim that this is the most efficient method.
procedure TForm1.Button6Click(Sender: TObject);
Var
StreamSize, ApproxNumRows : Integer;
TempStr : String;
begin
if OpenDialog1.Execute then begin
TempStr := ReadLastLineOfTextFile(OpenDialog1.FileName,StreamSize, ApproxNumRows);
// TempStr := ReadFileStream('c:\temp\CSVTestFile.csv');
ShowMessage ('approximately '+ IntToStr(ApproxNumRows)+' Rows');
ListBox1.Items.Add(TempStr);
end;
end;
Function TForm1.ReadLastLineOfTextFile(const FileName: String; var StreamSize, ApproxNumRows : Integer): String;
const
MAXLINELENGTH = 256;
var
Stream: TFileStream;
BlockSize,CharCount : integer;
Hash13Found : Boolean;
Buffer : array [0..MAXLINELENGTH] of AnsiChar;
begin
Hash13Found := False;
Result :='';
Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
StreamSize := Stream.size;
if StreamSize < MAXLINELENGTH then
BlockSize := StreamSize
Else
BlockSize := MAXLINELENGTH;
// for CharCount := 0 to Length(Buffer)-1 do begin
// Buffer[CharCount] := #0; // zeroing the buffer can aid diagnostics
// end;
CharCount := 0;
Repeat
Stream.Seek(-(CharCount+3), soFromEnd); //+3 misses out the #0,#10,#13 at the end of the file
Stream.Read( Buffer[CharCount], 1);
Result := String(Buffer[CharCount]) + result;
if Buffer[CharCount] =#13 then
Hash13Found := True;
Inc(CharCount);
Until Hash13Found OR (CharCount = BlockSize);
finally
Stream.Free; // always release the file handle
end;
ShowMessage(Result);
ApproxNumRows := Round(StreamSize / CharCount);
end;
Just thought of a new solution.
Again, there could be better ones, but this is the best one I could think of.
function GetLastLine(textFilePath: string): string;
var
list: tstringlist;
begin
list := tstringlist.Create;
try
list.LoadFromFile(textFilePath);
result := list[list.Count-1];
finally
list.free;
end;
end;
I want to use Delphi to overwrite 15 characters at a given hex address in a file.
I tried the following approach, but without success:
BlockRead(F,arrChar,1); //read all to the buf
CloseFile(F); //close file
IMEI:=Form1.Edit1.Text; //get the number
Form1.Memo1.Lines.Add('new IMEI is'+IMEI); //output
for i:=524288 to 524288+15 do /
arrChar[i]:=IMEI[i-524287];
Do this with a file stream.
var
Stream: TFileStream;
....
Stream := TFileStream.Create(FileName, fmOpenWrite);
try
Stream.Position := $080000;
Stream.WriteBuffer(IMEI, SizeOf(IMEI));
finally
Stream.Free;
end;
I'm assuming that IMEI is a fixed-length array of bytes of length 15, but your code attempts to write 16 bytes, so it would appear that you are suffering from a degree of confusion.
In your code, your variable IMEI is a string, which is not an array of bytes. Please don't make the classic mistake of regarding a string as an array of bytes.
You might declare an IMEI type like this:
type
TIMEI = array [0..14] of Byte;
Then you might write a function to populate such a variable from text:
function TextToIMEI(const Text: string): TIMEI;
var
ResultIndex, TextIndex: Integer;
C: Char;
begin
if Length(Text) <> Length(Result) then
raise SomeExceptionClass.Create(...);
TextIndex := low(Text);
for ResultIndex := low(Result) to high(Result) do
begin
C := Text[TextIndex]; // read from the input text, not from the result array
if (C < '0') or (C > '9') then
raise SomeExceptionClass.Create(...);
Result[ResultIndex] := ord(C);
inc(TextIndex);
end;
end;
You might then combine this code with that above:
procedure WriteIMEItoFile(const FileName: string; FileOffset: Int64; const IMEI: TIMEI);
var
Stream: TFileStream;
begin
Stream := TFileStream.Create(FileName, fmOpenWrite);
try
Stream.Position := FileOffset;
Stream.WriteBuffer(IMEI, SizeOf(IMEI));
finally
Stream.Free;
end;
end;
Call it like this:
WriteIMEItoFile(FileName, $080000, TextToIMEI(Form1.Edit1.Text));
Although it looks a bit odd that you are explicitly using the Form1 global variable. If that code executes in a method of TForm1 then you should use the implicit Self variable.
Possible Duplicate:
Convert Array of ShortInt to String, Delphi
I would like to turn a file into a string that contains the numeric values of the file's bytes.
I did it one way, but it was very slow, using a FOR loop over an array of ShortInt...
Example:
Take any file on my computer. My goal is to convert it into bytes, each in the range -128..127 (ShortInt), giving something like this:
A[0] = 120
A[1] = -35
A[2] = 40
So far so good, but I need the values concatenated into one string with a ',' between them, like this:
'120,-35,40'
I did it with a 'FOR' loop, but it was very slow. Is there another alternative?
This question is rather similar to your previous question, but with the added complexity of reading the array from a file.
I'd probably write it like this:
function ConvertFileToCommaDelimitedArray(const FileName: string): string;
var
i, BytesLeft, BytesRead: Integer;
Buffer: array [0..4096-1] of Shortint;
Stream: TFileStream;
sb: TStringBuilder;
begin
sb := TStringBuilder.Create;
try
Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
BytesLeft := Stream.Size;
while BytesLeft>0 do
begin
BytesRead := Min(SizeOf(Buffer), BytesLeft); // Min is declared in the Math unit
Stream.ReadBuffer(Buffer, BytesRead);
dec(BytesLeft, BytesRead);
for i := 0 to BytesRead-1 do
begin
sb.Append(IntToStr(Buffer[i]));
sb.Append(',');
end;
end;
finally
Stream.Free;
end;
if sb.Length>0 then
sb.Length := sb.Length-1;//remove trailing comma
Result := sb.ToString;
finally
sb.Free;
end;
end;
You can load a file into an array of one-byte signed integral types (also known as ShortInt) like this:
type
TShortIntArray = array of ShortInt;
function LoadFileAsShortInt(const name: TFileName): TShortIntArray;
var
f: TFileStream;
begin
f := TFileStream.Create(name, fmOpenRead or fmShareDenyWrite);
try
SetLength(Result, f.Size);
f.ReadBuffer(Result[0], f.Size);
finally
f.Free;
end;
end;
If you want the file's contents as a string, then you should skip the array and load the file directly into a string:
function FileAsString(const name: TFileName): AnsiString;
var
s: TStringStream;
begin
s := TStringStream.Create;
try
s.LoadFromFile(name);
Result := s.DataString;
finally
s.Free;
end;
end;
I am trying to remotely read a binary (REG_BINARY) registry value, but I get nothing but junk back. Any ideas what is wrong with this code? I'm using Delphi 2010:
function GetBinaryRegistryData(ARootKey: HKEY; AKey, AValue, sMachine: string; var sResult: string): boolean;
var
MyReg: TRegistry;
RegDataType: TRegDataType;
DataSize, Len: integer;
sBinData: string;
bResult: Boolean;
begin
bResult := False;
MyReg := TRegistry.Create(KEY_QUERY_VALUE);
try
MyReg.RootKey := ARootKey;
if MyReg.RegistryConnect('\\' + sMachine) then
begin
if MyReg.KeyExists(AKey) then
begin
if MyReg.OpenKeyReadOnly(AKey) then
begin
try
RegDataType := MyReg.GetDataType(AValue);
if RegDataType = rdBinary then
begin
DataSize := MyReg.GetDataSize(AValue);
if DataSize > 0 then
begin
SetLength(sBinData, DataSize);
Len := MyReg.ReadBinaryData(AValue, PChar(sBinData)^, DataSize);
if Len <> DataSize then
raise Exception.Create(SysErrorMessage(ERROR_CANTREAD))
else
begin
sResult := sBinData;
bResult := True;
end;
end;
end;
except
MyReg.CloseKey;
end;
MyReg.CloseKey;
end;
end;
end;
finally
MyReg.Free;
end;
Result := bResult;
end;
And I call it like this:
GetBinaryRegistryData(
HKEY_LOCAL_MACHINE,
'\SOFTWARE\Microsoft\Windows NT\CurrentVersion',
'DigitalProductId', '192.168.100.105',
sProductId
);
WriteLn(sProductId);
The result I receive from the WriteLn on the console is:
ñ ♥ ???????????6Z ????1 ???????☺ ???♦ ??3 ? ??? ?
??
Assuming that you are already connected remotely, try using the GetDataAsString function to read binary data from the registry.
sResult := MyReg.GetDataAsString(AValue);
You're using Delphi 2010, so all your characters are two bytes wide. When you set the length of your result string, you're allocating twice the amount of space you need. Then you call ReadBinaryData, and it fills half your buffer. There are two bytes of data in each character. Look at each byte separately, and you'll probably find that your data looks less garbage-like.
Don't use strings for storing arbitrary data. Use strings for storing text. To store arbitrary blobs of data, use TBytes, which is an array of bytes.
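A minimal sketch of that advice, assuming the key has already been opened on a connected TRegistry as in the question: ReadBinaryValue and BytesToHex are hypothetical helper names. The first reads the value into a TBytes of exactly the reported size, and the second formats the bytes as hex so they can be inspected one at a time.

```pascal
uses
  SysUtils, Registry;

// Hypothetical helper: reads a REG_BINARY value from an already opened key.
function ReadBinaryValue(Reg: TRegistry; const ValueName: string): TBytes;
var
  Size: Integer;
begin
  Result := nil;
  Size := Reg.GetDataSize(ValueName);   // -1 when the value cannot be queried
  if Size <= 0 then
    Exit;
  SetLength(Result, Size);              // one array element per byte of data
  Reg.ReadBinaryData(ValueName, Result[0], Size);
end;

// Hypothetical helper: two hex digits per byte, separated by spaces.
function BytesToHex(const Data: TBytes): string;
var
  I: Integer;
begin
  Result := '';
  for I := 0 to High(Data) do
  begin
    if I > 0 then
      Result := Result + ' ';
    Result := Result + IntToHex(Data[I], 2);
  end;
end;
```

In the code from the question, the block that fills sBinData could then be replaced by something like WriteLn(BytesToHex(ReadBinaryValue(MyReg, AValue))).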