Why is Uint8List compatible with List<int> in Dart?

I am a Dart newbie.
Something strange I noticed while learning Dart is that Uint8List seems to be compatible with List<int>.
For example, the IOSink.read() method accepts data of type List<int> as an argument. But it also seems to accept data of type Uint8List as argument directly.
What kind of mechanism is this? It doesn't really convert every byte in the Uint8List to int, does it? That would be very wasteful in terms of efficiency and memory usage.

The Uint8List interface implements List<int>.
That means that it has an implementation of every member of List<int> with a signature that is compatible with List<int>.
It also means that Uint8List is a subtype of List<int> and a Uint8List instance can be used anywhere a List<int> instance is allowed or required.
Making Uint8List implement List<int> was easy, since a Uint8List is a list of (limited) integers, and because Dart only has one integer type, int, there is no problem distinguishing between a "byte" and an integer.
Any integer you read out of a Uint8List will be in the range 0..255.
Any integer you write into a Uint8List will be truncated to its lowest 8 bits before being stored. Storing the integer 257 into a Uint8List means actually storing the byte with value 1.
The read method will likely just use plain List methods for storing integers into the buffer. If that buffer happens to be a Uint8List, those integers are truncated and take up only a single byte. If not, it just stores integers (which happen to be in the range 0..255) into a List<int> as normal.
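The behaviour described above can be imitated in Python as a rough analogy. This is only a sketch, not Dart's implementation: Uint8Buffer and fill are hypothetical names, and the class supports integer indices only. It shows a byte-backed sequence that satisfies the general mutable-sequence interface and keeps only the lowest 8 bits of each stored value, next to a plain list that stores the integers unchanged.

```python
from collections.abc import MutableSequence


class Uint8Buffer(MutableSequence):
    """Hypothetical stand-in for a Uint8List: fixed length, one byte per element."""

    def __init__(self, length):
        self._data = bytearray(length)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]

    def __setitem__(self, index, value):
        # Keep only the lowest 8 bits, as the answer describes for Uint8List.
        self._data[index] = value & 0xFF

    def __delitem__(self, index):
        raise TypeError("fixed-length buffer")

    def insert(self, index, value):
        raise TypeError("fixed-length buffer")


def fill(buffer, values):
    """Works with any mutable sequence of ints; it never checks the concrete type."""
    for i, value in enumerate(values):
        buffer[i] = value


plain = [0, 0, 0]
packed = Uint8Buffer(3)
fill(plain, [1, 255, 257])
fill(packed, [1, 255, 257])
```

The caller of fill does not convert anything: the same element assignments run against both containers, and only the byte-backed one narrows the values on storage.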

Related

DirectX compute shader (HLSL): how to access individual characters in a string?

In a DirectCompute shader, given a function that takes a string argument, how can I access individual characters?
Example:
uint TestFunc(string S, uint I)
{
    return uint(S[I]);
}
The compiler complains about S[I]: "error X3121: array, matrix, vector, or indexable object type expected in index expression".
Any ideas?
From MS docs:
HLSL also supports a string type, which is an ASCII string. There are no operations or states that accept strings, but effects can query string parameters and annotations.
Strings exist in HLSL, but there’s very little you can do with them. Depending on your needs, you might want to pass the string to the shader as an array of instead of a string, or as a RWStructuredBuffer of bytes, then perform the conversion to/from ASCII.
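As a minimal sketch of the conversion step mentioned above, assuming it happens on the host side before the data is uploaded to the shader, the text can be turned into a list of ASCII codes and back. The example is in Python purely for illustration; string_to_codes and codes_to_string are hypothetical helper names, and how the codes are bound to a buffer depends on the application.

```python
def string_to_codes(text):
    """Encode text as a list of ASCII codes (0..127), one integer per character."""
    return list(text.encode("ascii"))


def codes_to_string(codes):
    """Rebuild the text from a list of ASCII codes read back from a buffer."""
    return bytes(codes).decode("ascii")


codes = string_to_codes("HLSL")
round_trip = codes_to_string(codes)
```

Inside the shader, the character at position I is then just element I of the integer buffer.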

BytesOf() and wchar_t arrays

I found that there is an API called System.UnicodeString.BytesOf that returns the bytes of a UnicodeString as a byte array.
However, I do not see the benefit of using this function.
Instead, we can use wchar_t arrays like:
wchar_t szBuf[100];
wcscpy(szBuf, str.c_str());
What is the advantage of the BytesOf function compared to using a wchar_t array?
BytesOf() converts a string to a byte array. In the case of the overloaded version that takes a UnicodeString as input, it converts the UnicodeString data to the OS's default Ansi charset before then copying the resulting data to the array (IOW, BytesOf(UnicodeString) is just a wrapper for TEncoding::Default->GetBytes(UnicodeString)).
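The difference between the two representations can be illustrated with a short Python sketch. It assumes Windows-1252 as a stand-in for the OS default Ansi charset (the real default depends on the system locale) and UTF-16 little-endian as the layout of a wchar_t array on Windows.

```python
text = "Gr\u00fc\u00dfe"  # "Gruesse" written with u-umlaut and sharp s, 5 characters

# Roughly what an Ansi conversion produces: one byte per character here.
ansi_bytes = text.encode("cp1252")

# Roughly what a wchar_t buffer holds on Windows: two bytes per character here.
wide_bytes = text.encode("utf-16-le")

# Characters outside the Ansi code page cannot be represented and are replaced.
lossy = "\u65e5\u672c".encode("cp1252", errors="replace")
```

So the byte array is not a raw copy of the wide characters: it is a re-encoded and possibly lossy form of the string.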

Can BitConverter be used to reliably extract multi-byte values from an IL byte stream (as returned by MethodBody.GetILAsByteArray)?

I am working on some code that parses IL byte arrays as returned by MethodBody.GetILAsByteArray.
Let's say I want to read a metadata token or a 32-bit integer constant from such an IL byte stream. At first I thought using BitConverter.ToInt32(byteArray, offset) would make this easy. However, I'm now worried that this won't work on big-endian machines.
As far as I know, IL always uses little-endian encoding for multi-byte values:
"All argument numbers are encoded least-significant-byte-at-smallest-address (a pattern commonly termed 'little-endian')." — The Common Language Infrastructure Annotated Standard, Partition III, ch. 1.2 (p. 482).
Since BitConverter's conversion methods honour the computer architecture's endianness (which can be discovered through BitConverter.IsLittleEndian), I conclude that BitConverter should not be used to extract multi-byte values from an IL byte stream, because this would give wrong results on big-endian machines.
Is this conclusion correct?
If yes: Is there any way to tell BitConverter which endianness to use for conversions, or is there any other class in the BCL that offers this functionality, or do I have to write my own conversion code?
If no: Where am I wrong? What is the proper way of extracting e.g. an Int32 operand value from an IL byte array?
When the bytes are little-endian, you should always do this before passing the array to BitConverter:
// Array is little. Are we on big?
if (!BitConverter.IsLittleEndian)
{
    // Then flip it (array holds only the bytes of the value being read)
    Array.Reverse(array);
}
int val = BitConverter.ToInt32(...);
However, you mention an IL stream. As far as I know, the bytecode layout is:
(OPCODE:(1|2):little) (VARIABLES:x:little)
So I would read a byte, check its opcode, then read the appropriate bytes and flip the array if necessary using the above code. Can I ask what you are doing?
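An alternative to flipping bytes is to decode the value with an explicitly stated byte order, which gives the same result on any host. The sketch below shows the idea in Python rather than C#, so it is an illustration of the technique and not of a BCL API. read_int32_le is a hypothetical helper, and the sample stream consists of two ldc.i4 instructions (opcode 0x20, each followed by a 4-byte operand).

```python
import struct

LDC_I4 = 0x20  # opcode of ldc.i4, followed by a 4-byte int32 operand


def read_int32_le(data, offset):
    """Decode a signed 32-bit little-endian integer, whatever the host byte order."""
    return int.from_bytes(data[offset:offset + 4], byteorder="little", signed=True)


# Two instructions: ldc.i4 42 and ldc.i4 -2
il = bytes([LDC_I4, 0x2A, 0x00, 0x00, 0x00, LDC_I4, 0xFE, 0xFF, 0xFF, 0xFF])

first = read_int32_le(il, 1)
second = read_int32_le(il, 6)

# The same decoding with an explicit "<" (little-endian) format code.
same_via_struct = struct.unpack_from("<i", il, 1)[0]
```

Because the byte order is part of the decoding call, no check of the machine's endianness is needed.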

Why use string[1] rather than string while using readbuffer

I have a record like this:
TEmf_SrectchDIBits = packed record
  rEMF_STRETCHDI_BITS: TEMRStretchDIBits;
  rBitmapInfo: TBitmapInfo;
  ImageSource: string;
end;
---
---
RecordData: TEmf_SrectchDIBits;
If I read data into it using a TStream like this, an exception occurs:
SetLength(RecordData.ImageSource, pRecordSize);
EMFStream.ReadBuffer(RecordData.ImageSource, pRecordSize);
But if I use the code below, it works normally:
SetLength(RecordData.ImageSource, pRecordSize);
EMFStream.ReadBuffer(RecordData.ImageSource[1], pRecordSize);
So what is the difference between using String and String[1]?
The difference is a detail related to the signature of the .ReadBuffer method.
The signature is:
procedure ReadBuffer(var Buffer; Count: Longint);
As you can see, the Buffer parameter does not have a type. In this case, you're saying that you want access to the underlying variable.
However, a string has two parts: a pointer (the content of the variable) and the string data (what the variable points to).
So, if ReadBuffer were given just the string variable, it would have only 4 bytes to store data into, namely the string variable itself. That would not work out well, since the string variable is supposed to hold a pointer, not arbitrary binary data. If ReadBuffer wrote more than 4 bytes, it would overwrite something else in memory, which is potentially disastrous.
By passing the [1] character to a var parameter, you're giving ReadBuffer access to the data that the string variable points to, which is what you want. You want to change the string content after all.
Also, make sure you've set up the length of the string variable to be big enough to hold whatever you're reading into it.
A final note, one that I cannot verify: in older Delphi versions, a string variable contained 1-byte characters. In newer versions, I think characters are two bytes each, due to Unicode, so that code might not work as expected in newer versions of Delphi. You would probably be better off using a byte array or heap memory instead.
String types are actually implemented as pointers to something we could call a "string descriptor block". Basically, you have a level of indirection.
That block contains some string control data (reference count, length, and in later versions character set info as well) at negative offsets, and the string characters at positive ones. A string variable is a pointer to the descriptor block (if you print SizeOf(stringvar) you get 4). When you work on strings, the compiler knows where to find the string data and how to handle it. But with an untyped parameter (var Buffer;) the compiler does not know that: it simply accesses the memory at "Buffer", which for a string variable is the pointer to the string block, not the actual string characters. Using string[1] passes the location of the first character.
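The indirection can be pictured with a loose Python analogy. This is not how Delphi lays out strings: it only contrasts the size of a pointer-sized slot with the size of the storage it would refer to. The names pointer_size, payload and characters are hypothetical.

```python
import ctypes
import io

# A pointer-sized slot: 4 bytes on a 32-bit target, 8 on a 64-bit one.
pointer_size = ctypes.sizeof(ctypes.c_void_p)

# The character storage lives elsewhere and is as large as the data to read.
payload = b"x" * 100
characters = bytearray(len(payload))

# Reading into the character storage (the analogue of passing ImageSource[1]).
stream = io.BytesIO(payload)
bytes_read = stream.readinto(characters)
```

A 100-byte read fits the character storage but not the pointer-sized slot, which is why the destination must be the first character rather than the variable itself.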

TArray<Byte> VS TBytes VS PByteArray

Those 3 types are very similar...
TArray<Byte> is the generic version of TBytes.
Both can be cast to PByteArray and used as a buffer for calls to the Windows API (with the same restrictions as casting string to PChar).
What I would like to know: Is this behavior "by design" or "By Implementation". Or more specifically, could it break in future release?
//Edit
As stated lower...
What I really want to know is: is it as safe to typecast TBytes (or TArray<Byte>) to PByteArray as it is to typecast String to PChar, as far as forward compatibility is concerned? (Or maybe AnsiString to PAnsiChar is a better example ^_^)
Simply put, an array of bytes is an array of bytes, and as long as the definitions of a byte and an array don't change, this won't change either. You're safe to use it that way, as long as you make sure to respect the array bounds, since casting it out of Delphi's array types nullifies your bounds checking.
EDIT: I think I see what you're asking a bit better now.
No, you shouldn't cast a dynamic array reference to a C-style array pointer. You can get away with it with strings because the compiler helps you out a little.
What you can do, though, is cast a pointer to element 0 of the dynamic array to a C-style array pointer. That will work, and won't change.
Two of those types are similar (identical in fact). The third is not.
TArray<Byte> is declared as "Array of Byte", as is TBytes. You missed a further very relevant type however, TByteArray (the type referenced by PByteArray).
Being a pointer to TByteArray, PByteArray is strictly speaking a pointer to a static byte array, not a dynamic array (which the other byte array types all are). It is typed in this way in order to allow reference to offsets from that base pointer using an integer index. And note that this indexing is limited to 2^15 elements (0..32767). For arbitrary byte offsets (> 32767) from some base pointer, a PByteArray is no good:
var
  b: Byte;
  ab: TArray<Byte>;
  pba: PByteArray;
begin
  SetLength(ab, 100000);
  pba := @ab[0]; // << No cast necessary: @ yields an untyped pointer by default
  b := pba[62767]; // << COMPILE ERROR!
end;
i.e. casting an Array of Byte or a TArray<Byte> to a PByteArray is potentially going to lead to problems where the array has > 32K elements (and the pointer is passed to some code which attempts to access all elements). Casting to an untyped pointer avoids this of course (as long as the "recipient" of the pointer then handles access to the memory referenced by the pointer appropriately).
BUT, none of this is likely to change in the future, it is merely a consequence of the implementation details that have long since applied in this area. The introduction of a syntactically sugared generic type declaration is a kipper rouge.
