I'm trying to read a PGM binary file (http://netpbm.sourceforge.net/doc/pgm.html) to fill a 0-based 2D matrix of integers (16-bit grayscale values).
The file may be 50 megs, so I'm trying to fill a buffer in one call.
I've never done anything with Streams before, and the Google results on Delphi streams go back 20 years and are a cluttered mess in which I couldn't find my way.
I've managed to lock up Delphi (first time in 15 years!) while running some code that uses pointers and buffers (and probably is based on my misunderstanding of an antiquated approach.)
Here's my pseudo code, doing it integer by integer. Is there a way to do the read and fill of the matrix with a single Stream call? (Assuming the file was created on the same machine, so byte-sex is the same.)
type
  TMatrix = array of array of Integer;
procedure ReadMatrix(const AFileName: String;
  const AStartingByte: Integer;
  const AMaxRow: Integer;
  const AMaxCol: Integer;
  var AMatrix: TMatrix);
begin
  SetLength(AMatrix, AMaxRow, AMaxCol);
  Open(AFileName);
  Seek(AStartingByte);
  for Row := 0 to AMaxRow - 1 do
    for Col := 0 to AMaxCol - 1 do
      AMatrix[Row, Col] := ReadWord
end;
And, no, this isn't a homework assignment! :-)
As already stated, you cannot read a 2D dynamic array in a single operation, because its memory is not contiguous: each row is a separately allocated 1D array. But every 1D subarray can be filled with one read.
I also changed the array element type to 16-bit. If you really need a matrix of Integer (which is 32-bit), then you have to read the 16-bit data and assign the elements to Integers one by one.
type
  TMatrix = array of array of Word;
procedure ReadMatrix(const AFileName: String;
  const AStartingByte: Integer;
  const AMaxRow: Integer;
  const AMaxCol: Integer;
  var AMatrix: TMatrix); // var, not const: SetLength must be able to modify it
var
  FS: TFileStream;
  Row: Integer;
begin
  SetLength(AMatrix, AMaxRow, AMaxCol);
  FS := TFileStream.Create(AFileName, fmOpenRead);
  try
    FS.Position := AStartingByte;
    // one read per row; ReadBuffer raises an exception on a short read
    for Row := 0 to AMaxRow - 1 do
      FS.ReadBuffer(AMatrix[Row, 0], SizeOf(Word) * AMaxCol);
  finally
    FS.Free;
  end;
end;
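If you do need Integer elements, the one-by-one widening mentioned above could be sketched roughly like this. The names ReadIntMatrix, TIntMatrix and RowBuf are made up for illustration, and the sketch still assumes that the byte order in the file matches the machine, as the question does. I have not run it against a real PGM file.

```pascal
uses
  Classes, SysUtils;

type
  TIntMatrix = array of array of Integer; // hypothetical name

// Reads ARowCount x AColCount 16-bit values starting at AStartingByte and
// widens each one to Integer. One stream read per row, via a reusable buffer.
procedure ReadIntMatrix(const AFileName: string; AStartingByte: Int64;
  ARowCount, AColCount: Integer; out AMatrix: TIntMatrix);
var
  FS: TFileStream;
  RowBuf: array of Word;
  Row, Col: Integer;
begin
  SetLength(AMatrix, ARowCount, AColCount);
  if (ARowCount <= 0) or (AColCount <= 0) then
    Exit;
  SetLength(RowBuf, AColCount);
  FS := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
  try
    FS.Position := AStartingByte;
    for Row := 0 to ARowCount - 1 do
    begin
      // raises an exception if the file is shorter than expected
      FS.ReadBuffer(RowBuf[0], AColCount * SizeOf(Word));
      for Col := 0 to AColCount - 1 do
        AMatrix[Row, Col] := RowBuf[Col]; // widen Word to Integer
    end;
  finally
    FS.Free;
  end;
end;
```

The extra copy is cheap compared with the disk read, and the row buffer is allocated only once.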
Given a buffer and its size in bytes, is there a way to convert this to TBytes without copying it?
Example:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
  Arr: TBytes;
  i: Integer;
begin
  // some clever code here to get contents of the buffer into the Array
  for i := 0 to Length(Arr)-1 do begin
    HandleByte(Arr[i]);
  end;
end;
I could of course copy the data:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
  Arr: TBytes;
  i: Integer;
begin
  // this works but is very inefficient
  SetLength(Arr, _BufSize);
  Move(PByte(_Buffer)^, Arr[0], _BufSize);
  //
  for i := 0 to Length(Arr)-1 do begin
    HandleByte(Arr[i]);
  end;
end;
But for a large buffer (about a hundred megabytes) this would mean I have double the memory requirement and also spend a lot of time unnecessarily copying data.
I am aware that I could simply use a PByte to process each byte in the buffer; I'm only interested in a solution that uses a TBytes instead.
I think it's not possible, but I have been wrong before.
No, this is not possible (without unreasonable hacks).
The problem is that TBytes = TArray<Byte> = array of Byte is a dynamic array and the heap object for a non-empty dynamic array has a header containing the array's reference count and length.
A function that accepts a TBytes parameter, when given a plain pointer to an array of bytes, might (rightfully) attempt to read the (non-existing) header, and then you are in serious trouble.
Also, dynamic arrays are managed types (as indicated by the reference count I mentioned), so you might have problems with that as well.
However, in your particular example code, you don't actually use the dynamic array nature of the data at all, so you can work directly with the buffer:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
  i: Integer;
begin
  for i := 0 to _BufSize - 1 do
    HandleByte(_Buffer[i]);
end;
Here's the deal: I'm developing a security system and I'm doing some bit scrambling using bitwise operations. Using 4 bits just to illustrate, suppose I have 1001 and I wish to shift left. This would leave me with 0010, since the left-most bit would be lost. What I want to do is shift left and right without losing any bits.
You might choose to use rotate rather than shift. That preserves all the bits. If you wish to use an intermediate value that is the result of a shift, perform both a rotate and a shift. Keep track of the value returned by the rotate, but use the value returned by the shift. This question provides various implementations of rotate operations: RolDWord Implementation in Delphi (both 32 and 64 bit)?
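To make the rotate idea concrete, here is a minimal sketch. RotateLeft32 and RotateLeft4 are names I made up for illustration (they are not RTL functions), and the 4-bit variant exists only to mirror the 1001 example from the question.

```pascal
// Rotate a 32-bit value left by Count bits; bits that fall off the top
// re-enter at the bottom, so nothing is lost.
function RotateLeft32(Value: Cardinal; Count: Integer): Cardinal;
begin
  Count := Count and 31; // rotating by 32 is the same as not rotating
  if Count = 0 then
    Result := Value
  else
    Result := Cardinal(Value shl Count) or (Value shr (32 - Count));
end;

// Same idea restricted to the low 4 bits of a byte.
function RotateLeft4(Value: Byte; Count: Integer): Byte;
begin
  Value := Value and $F;
  Count := Count and 3;
  Result := ((Value shl Count) or (Value shr (4 - Count))) and $F;
end;
```

With this, rotating 1001 left by one gives 0011 rather than 0010: the bit that a plain shift would drop comes back in on the right.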
Another option is to never modify the original value. Instead just keep track of the cumulative shift, and when a value is required, return it.
type
  TLosslessShifter = record
  private
    FData: Cardinal;
    FShift: Integer;
    function GetValue: Cardinal;
  public
    class function New(Data: Cardinal): TLosslessShifter; static;
    procedure Shift(ShiftIncrement: Integer);
    property Value: Cardinal read GetValue;
  end;
class function TLosslessShifter.New(Data: Cardinal): TLosslessShifter;
begin
  Result.FData := Data;
  Result.FShift := 0;
end;
procedure TLosslessShifter.Shift(ShiftIncrement: Integer);
begin
  inc(FShift, ShiftIncrement);
end;
function TLosslessShifter.GetValue: Cardinal;
begin
  // positive cumulative shift means right, negative means left
  if FShift > 0 then
    Result := FData shr FShift
  else
    Result := FData shl -FShift;
end;
Some example usage and output:
var
Shifter: TLosslessShifter;
....
Shifter := TLosslessShifter.New(8);
Shifter.Shift(-1);
Writeln(Shifter.Value);
Shifter.Shift(5);
Writeln(Shifter.Value);
Shifter.Shift(-4);
Writeln(Shifter.Value);
Output:
16
0
8
I'm using Delphi XE and the Matlab 2012B compiler on Windows 7.
I'm trying to write several wrapper functions so DLL files created with the Matlab 2012b Compiler can be more easily called from Delphi XE. I found that I should use the _proxy functions when using the MCR, which indeed allowed me to call several functions successfully. I can also pass strings to Matlab without problems by passing them as PAnsiChar.
I'm currently trying to create a StructArray with some field names.
As I've already successfully created numeric arrays and matrices, I'm pretty sure the first 2 parameters are OK. I expect the last one is causing the error, but I don't know how to solve this (yet). Looking at the Matlab help and example files I'm doing what should be done. Obviously I'm wrong...
I know that with Matlab r13 we had to pass the fieldnames as an array[0..n] of pAnsiChar instead of an array of pAnsiChar. I tried this here as well to no avail.
Can someone tell me if I have indeed made the correct function mapping to mxCreateStructArray(_730_proxy) and if I'm passing the parameters as expected?
type
  mxArray = pointer;
// mxArray *mxCreateStructArray(mwSize ndim, const mwSize *dims, int nfields, const char **fieldnames);
function MCRdll_CreateStructArray(aDimCount: integer; aDims: pointer; aFieldCount: integer; aFields: PPAnsiChar): mxArray; cdecl; external 'mclmcrrt8_0.dll' name 'mxCreateStructArray_730_proxy';
function MCR_CreateStructArray(aFieldNames: TArray<string>): mxArray;
var
  i: integer;
  lstDims: array of integer;
  lstNames: array of pAnsiChar;
begin
  SetLength(lstNames, Length(aFieldNames));
  for i := 0 to Length(aFieldNames) - 1 do
    lstNames[i] := ToPAnsiChar(aFieldNames[i]); //Creates a new PAnsiChar with the content of aFieldNames[i]
  SetLength(lstDims, 2);
  lstDims[0] := 1;
  lstDims[1] := Length(aFieldNames);
  //This call raises an "External Exception" from Matlab.
  Result := MCRdll_CreateStructArray(Length(lstDims), @lstDims, Length(lstNames), @lstNames);
end;
The MATLAB C API function is:
mxArray *mxCreateStructArray(mwSize ndim, const mwSize *dims,
int nfields, const char **fieldnames);
As I understand it, mwSize is by default the same as int. That translates to Integer in Delphi. The const char** parameter is the address of an array of const C strings. Translate that to Delphi and you have:
function MCRdll_CreateStructArray(ndim: Integer; dims: PInteger;
nFields: Integer; fieldnames: PPAnsiChar): mxArray; cdecl;
external 'mclmcrrt8_0.dll' name 'mxCreateStructArray_730_proxy';
Now, how to get the parameters. Well, assuming you want a vector, dims is an array of length 2, and ndim is that length. I'd declare that as a static array:
var
dims: array [0..1] of Integer;
As for the field names, those are variable length. So you need a dynamic array of PAnsiChar. That is:
var
fieldnames: array of PAnsiChar;
You also need to pass the vector length for your struct array to your function. That makes your function be something like this:
function MCR_CreateStructArray(len: Integer;
  const aFieldNames: array of AnsiString): mxArray;
var
  i: integer;
  dims: array [0..1] of Integer;
  fieldnames: array of PAnsiChar;
begin
  if Length(aFieldNames)=0 then
  begin
    Result := nil;
    exit;
  end;
  dims[0] := 1;
  dims[1] := len;
  SetLength(fieldnames, Length(aFieldNames));
  for i := 0 to high(fieldnames) do
    fieldnames[i] := PAnsiChar(aFieldNames[i]);
  // pass the address of the first element of each array
  Result := MCRdll_CreateStructArray(Length(dims), @dims[0],
    Length(fieldnames), @fieldnames[0]);
end;
An alternative to the final parameter is to pass PPAnsiChar(fieldnames). That works because a dynamic array variable is the address of the first element.
So, what was wrong with your version? The biggest mistake you made was to use untyped pointers for the two arrays that you pass to MCRdll_CreateStructArray. This means that the compiler cannot check that you got the indirection correct. And you did not.
First of all, in your code you pass @lstDims to the second parameter. Now lstDims is a dynamic array in your code. The implementation of that has lstDims being a pointer to the first element. So, informally, lstDims has type ^Integer. And therefore @lstDims has type ^^Integer. That's one level of indirection too far. And you made the exact same mistake in the final parameter.
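The extra level of indirection can be checked with a few assertions. This is only an illustrative sketch (DemoIndirection and Dims are made-up names), relying on the fact that a non-empty dynamic array variable holds a pointer to its first element.

```pascal
procedure DemoIndirection;
var
  Dims: array of Integer;
begin
  SetLength(Dims, 2);
  // The variable itself holds the address of element 0 ...
  Assert(Pointer(Dims) = @Dims[0]);
  // ... whereas @Dims is the address of the variable, one level further out.
  Assert(Pointer(@Dims) <> Pointer(Dims));
  // Dereferencing @Dims once gets you back to the element data.
  Assert(PPointer(@Dims)^ = @Dims[0]);
end;
```

So a C function expecting const int* should be given @Dims[0] (or Pointer(Dims)), never @Dims.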
One final point. I've changed the signature of the function to receive an array of AnsiString. That's the easy way for me to write the code because I don't need to worry about the UTF-16 to ANSI conversion, and can use a simple PAnsiChar cast. You'd probably benefit from this helper:
function ToAnsiStringArray(const arr: array of string): TArray<AnsiString>;
var
  i: Integer;
begin
  SetLength(Result, Length(arr));
  for i := 0 to high(Result) do
    Result[i] := AnsiString(arr[i]);
end;
I've not compiled any of this so there may be some imprecision. I trust you'll not be put off by that.
As the topic indicates above, I'm wondering if there's a good example of a clean and efficient way to handle pointers as passed in function parms when processing the data sequentially. What I have is something like:
function myfunc(inptr: pointer; inptrsize: longint): boolean;
var
  inproc: pointer;
  i: integer;
begin
  inproc := inptr;
  for i := 1 to inptrsize do
  begin
    // do stuff against byte data here.
    inc(longint(inproc), 1);
  end;
end;
The idea is that instead of finite pieces of data, I want it to be able to process whatever is pushed its way, no matter the size.
Now when it comes to processing the data, I've figured out a couple of ways to do it successfully.
1. Assign the parm pointers to identical temporary pointers, then use those to access each piece of data, incrementing them to move on. This method is quickest, but not very clean looking with all the pointer increments spread all over the code. (This is what I'm showing above.)
2. Assign the parm pointers to a pointer representing a big array value and then incrementally process that using standard table logic. Much cleaner, but about 500 ms slower than #1.
Is there another way to efficiently handle processing pointers in this way, or is there some method I'm missing that will both be clean and not time inefficient?
Your code here is basically fine. I would always choose to increment a pointer rather than cast to a fake array.
But you should not cast to an integer. That is semantically wrong and you'll pay the penalty anytime you compile on a platform that has pointer size different from your integer size. Always use a pointer to an element of the right size. In this case a pointer to byte.
function MyFunc(Data: PByte; Length: Integer): Boolean;
var
  i: Integer;
begin
  for i := 1 to Length do
  begin
    // do stuff against byte data here.
    inc(Data);
  end;
end;
Unless the compiler is having a really bad day, you won't find it easy to get better performing code than this. What's more, I think this style is actually rather clear and easy to understand. Most of the clarity gain comes in avoiding the need to cast. Always strive to remove casts from your code.
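As a concrete instance of the pattern, here is a small sketch that sums the bytes of a buffer. SumBytes is a made-up name; the loop shape is the same as above, with Data^ standing in for "do stuff".

```pascal
// Adds up all bytes in the buffer by walking a typed PByte through it.
function SumBytes(Data: PByte; Length: Integer): Cardinal;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length do
  begin
    Inc(Result, Data^); // read the current byte, no casts needed
    Inc(Data);          // advance by SizeOf(Byte)
  end;
end;
```

Because Data is a by-value parameter, incrementing it inside the function does not affect the caller's pointer.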
If you want to allow any pointer type to be passed then you can write it like this:
function MyFunc(P: Pointer; Length: Integer): Boolean;
var
  i: Integer;
  Data: PByte;
begin
  Data := P;
  for i := 1 to Length do
  begin
    // do stuff against byte data here.
    inc(Data);
  end;
end;
Or if you want to avoid pointers in the interface, then use an untyped const parameter.
function MyFunc(const Buffer; Length: Integer): Boolean;
var
  i: Integer;
  Data: PByte;
begin
  Data := PByte(@Buffer);
  for i := 1 to Length do
  begin
    // do stuff against byte data here.
    inc(Data);
  end;
end;
Use a var parameter if you need to modify the buffer.
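For example, a routine that modifies the buffer in place might look roughly like this. XorBuffer and its Key parameter are hypothetical, chosen only to have something visible to write back.

```pascal
// XORs every byte of an arbitrary buffer with Key, in place.
// The untyped var parameter accepts any variable without a cast at the call site.
procedure XorBuffer(var Buffer; Length: Integer; Key: Byte);
var
  i: Integer;
  Data: PByte;
begin
  Data := PByte(@Buffer);
  for i := 1 to Length do
  begin
    Data^ := Data^ xor Key;
    Inc(Data);
  end;
end;
```

A call would pass the first element rather than a pointer, e.g. XorBuffer(Arr[0], Length(Arr), $FF) for a non-empty array of bytes.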
I have a different opinion: for the sake of readability I would use an array. Pascal was not designed to access memory directly. Original Pascal did not even have pointer arithmetic.
This is how I would use an array:
function MyFunc(P: Pointer; Length: Integer): Boolean;
var
  ArrayPtr: PByteArray absolute P;
  I: Integer;
begin
  for I := 0 to Length-1 do
    // do stuff against ArrayPtr^[I]
end;
But if performance matters, I would write it like this
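function MyFunc(P: Pointer; Length: Integer): Boolean;
var
  Data, EndOfMemoryBlock: PByte;
begin
  Data := P;
  // NativeUInt is pointer-sized, so the arithmetic is safe on 32 and 64 bit
  EndOfMemoryBlock := PByte(NativeUInt(P) + NativeUInt(Length));
  while NativeUInt(Data) < NativeUInt(EndOfMemoryBlock) do
  begin
    // do stuff against byte data here.
    inc(Data);
  end;
end;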
I'm going to maintain and port to Delphi XE2 a bunch of very old Delphi code that is full of VarArrayCreate constructs to fake dynamic arrays having a lower bound that is not zero.
Drawbacks of using Variant types are:
- quite a bit slower than native arrays (the code does a lot of complex financial calculations, so speed is important)
- not type safe (especially when by accident a wrong var... constant is used, and the Variant system starts to do unwanted conversions or rounding)
Both could become moot if I could use dynamic arrays.
Good thing about variant arrays is that they can have non-zero lower bounds.
What I recollect is that dynamic arrays used to always start at a lower bound of zero.
Is this still true? In other words: Is it possible to have dynamic arrays start at a different bound than zero?
As an illustration a before/after example for a specific case (single dimensional, but the code is full of multi-dimensional arrays, and besides varDouble, the code also uses various other varXXX data types that TVarData allows to use):
function CalculateVector(aSV: TStrings): Variant;
var
  I: Integer;
begin
  Result := VarArrayCreate([1,aSV.Count-1],varDouble);
  for I := 1 to aSV.Count-1 do
    Result[I] := CalculateItem(aSV, I);
end;
The CalculateItem function returns Double. Bounds are from 1 to aSV.Count-1.
The current replacement is like this, wasting the zeroth element of Result in exchange for improved compile-time checking:
type
  TVector = array of Double;
function CalculateVector(aSV: TStrings): TVector;
var
  I: Integer;
begin
  SetLength(Result, aSV.Count); // lower bound is zero, we start at 1 so we ignore the zeroth element
  for I := 1 to aSV.Count-1 do
    Result[I] := CalculateItem(aSV, I);
end;
Dynamic arrays always have a lower bound of 0. So, low(A) equals 0 for all dynamic arrays. This is even true for empty dynamic arrays, i.e. nil.
From the documentation:
Dynamic arrays are always integer-indexed, always starting from 0.
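If you want to convince yourself, a few assertions will do. This is just an illustrative sketch; DemoDynArrayBounds is a made-up name.

```pascal
procedure DemoDynArrayBounds;
var
  A: array of Double;
begin
  // An empty (nil) dynamic array still reports a lower bound of 0.
  Assert(Low(A) = 0);
  Assert(High(A) = -1);
  SetLength(A, 5);
  // After sizing, valid indices are always 0..Length-1.
  Assert(Low(A) = 0);
  Assert(High(A) = 4);
end;
```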
Having answered your direct question already, I also offer you the beginnings of a generic class that you can use in your porting.
type
  TSpecifiedBoundsArray<T> = class
  private
    FValues: TArray<T>;
    FLow: Integer;
    function GetHigh: Integer;
    procedure SetHigh(Value: Integer);
    function GetLength: Integer;
    procedure SetLength(Value: Integer);
    function GetItem(Index: Integer): T;
    procedure SetItem(Index: Integer; const Value: T);
  public
    property Low: Integer read FLow write FLow;
    property High: Integer read GetHigh write SetHigh;
    property Length: Integer read GetLength write SetLength;
    property Items[Index: Integer]: T read GetItem write SetItem; default;
  end;
{ TSpecifiedBoundsArray<T> }
function TSpecifiedBoundsArray<T>.GetHigh: Integer;
begin
  Result := FLow+System.High(FValues);
end;
procedure TSpecifiedBoundsArray<T>.SetHigh(Value: Integer);
begin
  // qualify with System, otherwise the class's own SetLength method is found
  System.SetLength(FValues, 1+Value-FLow);
end;
function TSpecifiedBoundsArray<T>.GetLength: Integer;
begin
  Result := System.Length(FValues);
end;
procedure TSpecifiedBoundsArray<T>.SetLength(Value: Integer);
begin
  System.SetLength(FValues, Value);
end;
function TSpecifiedBoundsArray<T>.GetItem(Index: Integer): T;
begin
  Result := FValues[Index-FLow];
end;
procedure TSpecifiedBoundsArray<T>.SetItem(Index: Integer; const Value: T);
begin
  FValues[Index-FLow] := Value;
end;
I think it's pretty obvious how this works. I contemplated using a record but I consider that to be unworkable. That's down to the mix between value type semantics for FLow and reference type semantics for FValues. So, I think a class is best here.
It also behaves rather weirdly when you modify Low.
No doubt you'd want to extend this. You'd add a SetBounds, a copy to, a copy from and so on. But I think you may find it useful. It certainly shows how you can make an object that looks very much like an array with non-zero lower bound.