As per the documentation(https://www.lua.org/manual/5.3/manual.html#lua_pushliteral),
which says that:
This macro is equivalent to lua_pushstring, but should be used only when s is a literal string.
But I can't understand the explanation aforementioned at all.
As far as I can see, there is no difference from the the macro definition for lua_pushliteral:
#define lua_pushliteral(L, s) lua_pushstring(L, "" s)
The documentation for lua_pushliteral in Lua 5.4 is the same as 5.3, except it adds "(Lua may optimize this case.)". So while it is currently the same as calling lua_pushstring, the Lua devs are giving themselves the option to optimize it in the future.
EDIT: As an example, the doc for lua_pushstring says:
Lua will make or reuse an internal copy of the given string, so the memory at s can be freed or reused immediately after the function returns.
But a C string literal is read-only, so it's impossible for the C code to free or reuse the memory. Also, Lua strings are immutable. It's basically useless to copy one immutable object to another immutable object when you could just refer to the same memory from both places. That means one possible way to optimize lua_pushliteral would be to just not make the copy that lua_pushstring does.
Related
A u32 takes 4 bytes of memory, a String takes 3 pointer-sized integers (for location, size, and reserved space) on the stack, plus some amount on the heap.
This to me implies that Rust doesn't know, when the code is executed, what type is stored at a particular location, because that knowledge would require more memory.
But at the same time, does it not need to know what type is stored at 0xfa3d2f10, in order to be able to interpret the bytes at that location? For example, to know that the next bytes form the spec of a String on the heap?
How does Rust store types at runtime?
It doesn't, generally.
Rust doesn't know, when the code is executed, what type is stored at a particular location
Correct.
does it not need to know what type is stored
No, the bytes in memory should be correct, and the rest of the code assumes as much. The offsets of fields in a struct are baked-in to the generated machine code.
When does Rust store something like type information?
When performing dynamic dispatch, a fat pointer is used. This is composed of a pointer to the data and a pointer to a vtable, a collection of functions that make up the interface in question. The vtable could be considered a representation of the type, but it doesn't have a lot of the information that you might think goes into "a type" (unless the trait requires it). Dynamic dispatch isn't super common in Rust as most people prefer static dispatch when it's possible, but both techniques have their benefits.
There's also concepts like TypeId, which can represent one specific type, but only of a subset of types. It also doesn't provide much capability besides "are these the same type or not".
Isn't this all terribly brittle?
Yes, it can be, which is one of the things that makes Rust so interesting.
In a language like C or C++, there's not much that safeguards the programmer from making dumb mistakes that go out and mess up those bytes floating around in memory. Making those mistakes is what leads to bugs due to memory safety. Instead of interpreting your password as a password, it's interpreted as your username and printed out to an attacker (oops!)
Rust provides safeguards against that in the form of a strong type system and tools like the borrow checker, but still all done at compile time. Unsafe Rust enables these dangerous tools with the tradeoff that the programmer is now expected to uphold all the guarantees themselves, much like if they were writing C or C++ again.
See also:
When does type binding happen in Rust?
How does Rust implement reflection?
How do I print the type of a variable in Rust?
How to introspect all available methods and members of a Rust type?
I'm trying to achieve a static cast like coercion that doesn't result in copying of any data.
A naive static cast does not work
let pkt = byte_buffer :> PktHeader
FS0193: Type constraint mismatch. The type byte[] is not compatible with type PktHeader The type 'byte[]' is not compatible with the type 'PktHeader' (FS0193) (program)
where the packet is initially held in a byte array because of the way System.Net.Sockets.Socket.Receive() is defined.
The low level packet struct is defined something like this
[<Struct; StructLayout(LayoutKind.Explicit)>]
type PktHeader =
[<FieldOffset(0)>] val mutable field1: uint16
[<FieldOffset(2)>] val mutable field2: uint16
[<FieldOffset(4)>] val mutable field3: uint32
.... many more fields follow ....
Efficiency is important in this real world scenario because wasteful copying of data could rule out F# as an implementation language.
How do you achieve zero copy efficiencies in this scenario?
EDIT on Nov 29
my question was predicated on the implicit belief that a C/C++/C# style unsafe static cast is a useful construct, as if this is self evident. However, on 2nd thought this kind of cast is not idiomatic in F# since it is inherently an imperative language technique fraught with peril. For this reason I've accepted the answer by V.B. where SBE/FlatBuffers data access is promulgated as best practice.
A pure F# approach for conversion
let convertByteArrayToStruct<'a when 'a : struct> (byteArr : byte[]) =
let handle = GCHandle.Alloc(byteArr, GCHandleType.Pinned)
let structure = Marshal.PtrToStructure (handle.AddrOfPinnedObject(), typeof<'a>)
handle.Free()
structure :?> 'a
This is a minimum example but I'd recommend introducing some checks on the length of the byte array because, as it's written there, it will produce undefined results if you give it a byte array which is too short. You could check against Marshall.SizeOf(typeof<'a>).
There is no pure F# solution to do a less safe conversion than this (and this is already an approach prone to runtime failure). Alternative options could include interop with C# to use unsafe and fixed to do the conversion.
Ultimately though, you are asking for a way to subvert the F# type system which is not really what the language is designed for. One of the principle advantages of F# is the power of the type system and it's ability to help you produce statically verifiable code.
F# and very low-level performance optimizations are not best friends, but then... some smart people do magic even with Java, which doesn't have value types and real generic collections for them.
1) I am a big fan of a flyweight pattern lately. If you architecture allows for it, you could wrap a byte array and access struct members via offsets. A C# example here. SBE/FlatBuffers even have tools to generate a wrapper automatically from a definition.
2) If you could stay within unsafe context in C# to do the work, pointer casting is very easy and efficient. However, that requires pinning the byte array and keeping its handle for later release, or staying within fixed keyword. If you have many small ones without a pool, you could have problems with GC.
3) The third option is abusing .NET type system and cast a byte array with IL like this (this could be coded in F#, if you insist :) ):
static T UnsafeCast(object value) {
ldarg.1 //load type object
ret //return type T
}
I tried this option and even have a snippet somewhere if you need, but this approach makes me uncomfortable because I do not understand its consequences to GC. We have two objects backed by the same memory, what would happen when one of them is GCed? I was going to ask a new question on SO about this detail, will post it soon.
The last approach could be good for arrays of structs, but for a single struct it will box it or copy it anyway. Since structs are on the stack and passed by value, you will probably get better results just by casting a pointer to byte[] in unsafe C# or using Marshal.PtrToStructure as in another answer here, and then copy by value. Copying is not the worst thing, especially on the stack, but allocation of new objects and GC is the enemy, so you need byte arrays pooled and this will add much more to the overall performance than you struct casting issue.
But if your struct is very big, option 1 could still be better.
A very basic question .. but really very important to understand the concepts..
in c++ or c languages, we usually don't use pointer variables to store values.. i.e. values are stored simply as is in:
int a=10;
but here in ios sdk, in objective c, most of the objects which we use are initialized by denoting a pointer with them as in:
NSArray *myArray=[NSArray array];
So,the question arises in my mind ,that, what are the benefit and need of using pointer-objects (thats what we call them here, if it is not correct, please, do tell)..
Also I just get confused sometimes with memory allocation fundamentals when using a pointer objects for allocation. Can I look for good explanations anywhere?
in c++ or c languages, we usually don't use pointer variables to store values
I would take that "or C" part out. C++ programmers do frown upon the use of raw pointers, but C programmers don't. C programmers love pointers and regard them as an inevitable silver bullet solution to all problems. (No, not really, but pointers are still very frequently used in C.)
but here in ios sdk, in objective c, most of the objects which we use are initialized by denoting a pointer with them
Oh, look closer:
most of the objects
Even closer:
objects
So you are talking about Objective-C objects, amirite? (Disregard the subtlety that the C standard essentially describes all values and variables as an "object".)
It's really just Objective-C objects that are always pointers in Objective-C. Since Objective-C is a strict superset of C, all of the C idioms and programming techniques still apply when writing iOS apps (or OS X apps, or any other Objective-C based program for that matter). It's pointless, superfluous, wasteful, and as such, it is even considered an error to write something like
int *i = malloc(sizeof(int));
for (*i = 0; *i < 10; ++*i)
just because we are in Objective-C land. Primitives (or more correctly "plain old datatypes" with C++ terminology) still follow the "don't use a pointer if not needed" rule.
what are the benefit and need of using pointer-objects
So, why they are necessary:
Objective-C is an object-oriented and dynamic language. These two, strongly related properties of the language make it possible for programmers to take advantage of technologies such as polymorphism, duck-typing and dynamic binding (yes, these are hyperlinks, click them).
The way these features are implemented make it necessary that all objects be represented by a pointer to them. Let's see an example.
A common task when writing a mobile application is retrieving some data from a server. Modern web-based APIs use the JSON data exchange format for serializing data. This is a simple textual format which can be parsed (for example, using the NSJSONSerialization class) into various types of data structures and their corresponding collection classes, such as an NSArray or an NSDictionary. This means that the JSON parser class/method/function has to return something generic, something that can represent both an array and a dictionary.
So now what? We can't return a non-pointer NSArray or NSDictionary struct (Objective-C objects are really just plain old C structs under the hoods on all platforms I know Objective-C works on), because they are of different size, they have different memory layouts, etc. The compiler couldn't make sense of the code. That's why we return a pointer to a generic Objective-C object, of type id.
The C standard mandates that pointers to structs (and as such, to objects) have the same representation and alignment requirements (C99 6.2.5.27), i. e. that a pointer to any struct can be cast to a pointer to any other struct safely. Thus, this approach is correct, and we can now return any object. Using runtime introspection, it is also possible to determine the exact type (class) of the object dynamically and then use it accordingly.
And why they are convenient or better (in some aspects) than non-pointers:
Using pointers, there is no need to pass around multiple copies of the same object. Creating a lot of copies (for example, each time an object is assigned to or passed to a function) can be slow and lead to performance problems - a moderately complex object, for example, a view or a view controller, can have dozens of instance variables, thus a single instance may measure literally hundreds of bytes. If a function call that takes an object type is called thousands or millions of times in a tight loop, then re-assigning and copying it is quite painful (for the CPU anyway), and it's much easier and more straightforward to just pass in a pointer to the object (which is smaller in size and hence faster to copy over). Furthermore, Objective-C, being a reference counted language, even kind of "discourages" excessive copying anyway. Retaining and releasing is preferred over explicit copying and deallocation.
Also I just get confused sometimes with memory allocation fundamentals when using a pointer objects for allocation
Then you are most probably confused enough even without pointers. Don't blame it on the pointers, it's rather a programmer error ;-)
So here's...
...the official documentation and memory management guide by Apple;
...the earliest related Stack Overflow question I could find;
...something you should read before trying to continue Objective-C programming #1; (i. e. learn C first)
...something you should read before trying to continue Objective-C programming #2;
...something you should read before trying to continue Objective-C programming #3;
...and an old Stack Overflow question regarding C memory management rules, techniques and idioms;
Have fun! :-)
Anything more complex than an int or a char or similar is usually passed as
pointers even in C. In C you could of course pass around a struct of data
from function to function but this is rarely seen.
Consider the following code:
struct some_struct {
int an_int;
char a_char[1234];
};
void function1(void)
{
struct some_struct s;
function2(s);
}
void function2(struct some_struct s)
{
//do something with some_struct s
}
The some_struct data s will be put on the stack for function1. When function2
is called the data will be copied and put on the stack for use in function2.
It requires the data to be on the stack twice as well as the data to be
copied. This is not very efficient. Also, note that changing the values
of the struct in function2 will not affect the struct in function1, they
are different data in memory.
Instead consider the following code:
struct some_struct {
int an_int;
char a_char[1234];
};
void function1(void)
{
struct some_struct *s = malloc(sizeof(struct some_struct));
function2(s);
free(s);
}
void function2(struct some_struct *s)
{
//do something with some_struct s
}
The some_struct data will be put on the heap instead of the stack. Only
a pointer to this data will be put on the stack for function1, copied in the
call to function2 another pointer put on the stack for function2. This is a
lot more efficient than the previous example. Also, note that any changes of
the data in the struct made by function2 will now affect the struct in
function1, they are the same data in memory.
This is basically the fundamentals on which higher level programming languages
such as Objective-C is built and the benefits from building these languages
like this.
The benefit and need of pointer is that it behaves like a mirror. It reflects what it points to. One main place where points could be very useful is to share data between functions or methods. The local variables are not guaranteed to keep their value each time a function returns, and that they’re visible only inside their own function. But you still may want to share data between functions or methods. You can use return, but that works only for a single value. You can also use global variables, but not to store your complete data, you soon have a mess. So we need some variable that can share data between functions or methods. There comes pointers to our remedy. You can just create the data and just pass around the memory address (the unique ID) pointing to that data. Using this pointer the data could be accessed and altered in any function or method. In terms of writing modular code, that’s the most important purpose of a pointer— to share data in many different places in a program.
The main difference between C and Objective-C in this regard is that arrays in Objective-C are commonly implemented as objects. (Arrays are always implemented as objects in Java, BTW, and C++ has several common classes resembling NSArray.)
Anyone who has considered the issue carefully understands that "bare" C-like arrays are problematic -- awkward to deal with and very frequently the source of errors and confusion. ("An array 'decays' to a pointer" -- what is that supposed to mean, anyway, other than to admit in a backhanded way "Yes, it's confusing"??)
Allocation in Objective-C is a bit confusing in large part because it's in transition. The old manual reference count scheme could be easily understood (if not so easily dealt with in implementations), but ARC, while simpler to deal with, is far harder to truly understand, and understanding both simultaneously is even harder. But both are easier to deal with than C, where "zombie pointers" are almost a given, due to the lack of reference counting. (C may seem simpler, but only because you don't do things as complex as those you'd do with Objective-C, due to the difficulty controlling it all.)
You use a pointer always when referring to something on the heap and sometimes, but usually not when referring to something on the stack.
Since Objective-C objects are always allocated on the heap (with the exception of Blocks, but that is orthogonal to this discussion), you always use pointers to Objective-C objects. Both the id and Class types are really pointers.
Where you don't use pointers are for certain primitive types and simple structures. NSPoint, NSRange, int, NSUInteger, etc... are all typically accessed via the stack and typically you do not use pointers.
In fortran you can declare an array with any suitable (integral) range, for example:
real* 8 array(-10:10)
I believe that fortran, when passing by reference, will always pass around array(1) as the reference, but I'm not sure.
I'm using fortran pointers, and I believe that fortran is pointing the "1st" element address, i.e. array(1), not array(-10). However I'm not sure.
How does Fortran deal with negative array indexing in memory? And is it implimentation defined?
Edit: To add a little more detail, I'm passing a malloc'd block from C to fortran by means of using a fortran pointer to point at the the address, which is done by calling a fortran routine from within C. I.e. C goes:
void * pointer = malloc(blockSize*sizeof(double));
fortranpoint_(pointer);
And the fortran point routine looks like:
real*8 :: target block(5, -6:6, 0:0)
real*8 :: pointer array(:,:,:)
entry fortranPoint(block)
array => block
return
The problem is that sometimes when it later tries to access say:
array(1, -6, 0)
I am not sure if this is accessing the address at the beginning of the block or somewhere before it. I now think this is implementation defined, but would like to know the details of each implementation.
Fortran array argument ABI depends on the compiler, and perhaps more crucially, on whether the called procedure has an explicit or implicit interface.
For an implicit interface, typically the address of the first element is passed [1]. In the callee, the procedure then adds an offset depending on how the array dummy argument is declared. E.g. if the array dummy argument is declared somearray(-10:10), then a reference to somearray(x) is calculated as
address_of_first_element_passed_in_to_the_procedure + x + 10
If the procedure has an explicit interface, typically an array descriptor structure is passed rather than the address of the first element. In this structure, the callee can find information on the bounds of each dimension and, of course, a pointer to the actual data, allowing it to calculate the correct offset, similarly to the case of an implicit interface.
[1] Note that this is the first element in memory, that is, the lowest index for each dimension. Not somearray(1) regardless of how the array was declared.
To answer your updated question, for C/Fortran interoperability, use the ISO_C_BINDING feature which is nowadays widely available. This provides a standardized way to pass information between C and Fortran.
If the dummy argument for a regular array in Fortran is declared A(:) (or with more dimensions), the SHAPE is passed, not the specific index range. So the procedure will default to one-indexing. You can override this with a declaration in the procedure of A(-10:), or A(StartIndex:), where StartIndex is another argument.
Fortran pointers do include the index range, but the passing mechanism will be compiler dependent. Code interfacing this to C is likely to be OS & compiler dependent. As already suggested, I'd use a regular array and the ISO C Binding. It is MUCH easier than the old ways of figuring out the compiler passing mechanisms and standard and portable. If you have a large existing Fortran code, you could write a "glue" Fortran procedure that maps between the regular Fortran variable declarations and the ISO C Binding names. While they types will have formally different names, in practice they will be the same if you select the correct ISO C types. The ISO C Binding has been available for many years now -- can you upgrade the compiler on the problem target platform? If not, I'd use a regular Fortran array and either use zero-indexing on the C-side, or explicitly pass as arguments the desired indices.
There are examples of ISO C Binding usage on other Stack Overflow questions.
The interface to a procedure is explicit if it is declared so that it is known to the compiler in the caller. The simplest way it to place the procedures in a module and "use" the module in the caller. Having explicit interfaces helps avoid bugs since the compiler can check consistency between arguments of the caller and callee. It is a little bit like C header files, only easier.
I am able to understand immutability with python (surprisingly simple too). Let's say I assign a number to
x = 42
print(id(x))
print(id(42))
On both counts, the value I get is
505494448
My question is, does python interpreter allot ids to all the numbers, alphabets, True/False in the memory before the environment loads? If it doesn't, how are the ids kept track of? Or am I looking at this in the wrong way? Can someone explain it please?
What you're seeing is an implementation detail (an internal optimization) calling interning. This is a technique (used by implementations of a number of languages including Java and Lua) which aliases names or variables to be references to single object instances where that's possible or feasible.
You should not depend on this behavior. It's not part of the language's formal specification and there are no guarantees that separate literal references to a string or integer will be interned nor that a given set of operations (string or numeric) yielding a given object will be interned against otherwise identical objects.
I've heard that the C Python implementation does include a set of the first hundred or so integers as statically instantiated immutable objects. I suspect that other very high level language run-time libraries are likely to include similar optimizations: the first hundred integers are used very frequently by most non-trivial fragments of code.
In terms of how such things are implemented ... for strings and larger integers it would make sense for Python to maintain these as dictionaries. Thus any expression yielding an integer (and perhaps even floats) and strings (at least sufficiently short strings) would be hashed, looked up in the appropriate (internal) object dictionary, added if necessary and then returned as references to the resulting object.
You can do your own similar interning of any sorts of custom object you like by wrapping the instantiation in your own calls to your own class static dictionary.