I would like to get the lua state from inside lua so I can pass it to an external program that cannot be hooked up using ffi or dll. I just need a pointer to it and the ability to share it (shared memory across program boundaries).
That or I can create a lua state in my program and then pass that, so I would simply need to set the lua state to it inside lua (and it would have to work with shared memory).
I've thought about sharing data using json but ideally I would like to directly access objects.
Lua is pretty good about avoiding heap allocation and global pointers to allocated memory. lua_newstate takes an allocator function as a parameter. The provided function will be used to allocate/deallocate all memory associated with the lua_State object. Including the pointer returned by lua_newstate.
So hypothetically, you could provide an allocator function that allocates/deallocates interprocess shared memory. And then, you can just pass the lua_State to some other process and access it.
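A minimal sketch of what such an allocator could look like, assuming a POSIX system. The names shm_arena, shm_arena_init and shm_alloc are made up for illustration; only the four-argument shape of shm_alloc is taken from Lua's lua_Alloc type. It is a bump allocator that never reuses memory, so treat it as a demonstration of the idea rather than something to ship.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical arena living in a shared mapping. */
typedef struct {
    unsigned char *base;   /* start of the shared mapping */
    size_t         size;   /* total bytes in the mapping */
    size_t         used;   /* bump offset of the next free byte */
} shm_arena;

/* Map an anonymous region that stays shared with children after fork().
   Returns 0 on success, -1 on failure. */
int shm_arena_init(shm_arena *a, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Same shape as lua_Alloc: nsize == 0 frees, otherwise (re)allocates. */
void *shm_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    shm_arena *a = ud;
    assert(a != NULL && a->used <= a->size);
    if (nsize == 0)
        return NULL;                 /* bump allocator: free is a no-op */
    if (ptr != NULL && nsize <= osize)
        return ptr;                  /* shrinking in place never fails */
    size_t aligned = (nsize + 15u) & ~(size_t)15u;
    if (aligned < nsize || aligned > a->size - a->used)
        return NULL;                 /* out of shared memory */
    void *block = a->base + a->used;
    a->used += aligned;
    if (ptr != NULL)
        memcpy(block, ptr, osize);   /* growing: keep the old contents */
    return block;
}
```

In Lua 5.1 through 5.4 this would be handed over as lua_newstate(shm_alloc, &arena). Since Lua stores ordinary pointers inside the memory it allocates, the other process would also have to see the region at the same address, which is the case for a child created by fork() after the mmap call.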
First, you clearly cannot do this "from inside lua"; that kind of low-level thing just ain't happening. You cannot access the lua_State object from within Lua. You must be in control of the lua_State creation process for that to be a possibility. So we're talking about C (equivalent) code here, not in-Lua code.
Now, you can expose a C function to Lua which returns a light userdata that just so happens to be the exact lua_State* in question. But Lua can't really do much with light userdata other than pass it to other C function APIs.
Second, while the Lua system provides a guarantee that it will only allocate memory through the allocator, the system does not provide a guarantee that what you're trying to do will work. It is entirely possible that the Lua implementation does use process global memory, so long as it does it in such a way that different threads can access that global memory without breaking threading guarantees.
Obviously, you can inspect the Lua implementation to see if it does anything of the kind. But my point is that the guarantees are that each independent lua_State will be thread-isolated from each other and that each lua_State will only allocate memory through the given allocator. There is no guarantee that Lua's implementation doesn't have some global storage that it uses for some purpose.
So simply sharing the memory allocated by the Lua state may not be enough.
Also, even if this works, the two processes cannot access the same lua_State object at the same time, just like two threads in the same process cannot access the lua_State at the same time.
The lua state is not designed to leave the program / thread it is executing in.
Doing a query on a running lua_State could result in a crash, because it is only notionally consistent when a lua call returns, or a C API function is called. During execution, some un-locked modifications could cause uninitialized memory access, or infinite loops due to lists being inconsistent.
In another question, I found out that the Assigned() function is identical to Pointer <> nil. It has always been my understanding that Assigned() was detecting these dangling pointers, but now I've learned it does not. Dangling pointers are those which may have been created at one point, but have since been freed and haven't been assigned to nil yet.
If Assigned() can't detect dangling pointers, then what can? I'd like to check my object to make sure it's really a valid created object before I try to work with it. I don't use FreeAndNil as many recommend, because I like to be direct. I just use SomeObject.Free.
Access Violations are my worst enemy - I do all I can to prevent their appearance.
If you have an object variable in scope and it may or may not be a valid reference, FreeAndNil is what you should be using. That or fixing your code so that your object references are more tightly managed so it's never a question.
Access Violations shouldn't be thought of as an enemy. They're bugs: they mean you made a mistake that needs to be fixed. (Or that there's a bug in some code you're relying on, but I find most often that I'm the one who screwed up, especially when dealing with the RTL, VCL, or Win32 API.)
It is sometimes possible to detect when the address a pointer points to resides in a memory block that is on the heap's list of freed memory blocks. However, this requires comparing the pointer to potentially every block in the heap's free list which could contain thousands of blocks. So, this is potentially a computationally intensive operation and something you would not want to do frequently except perhaps in a severe diagnostic mode.
This technique only works while the memory block that the pointer used to point to continues to sit in the heap free list. As new objects are allocated from the heap, it is likely that the freed memory block will be removed from the heap free list and put back into active play as the home of a new, different object. The original dangling pointer still points to the same address, but the object living at that address has changed. If the newly allocated object is of the same (or compatible) type as the original object now freed, there is practically no way to know that the pointer originated as a reference to the previous object. In fact, in this very special and rare situation, the dangling pointer will actually work perfectly well. The only observable problem might be if someone notices that the data has changed out from under the pointer unexpectedly.
Unless you are allocating and freeing the same object types over and over again in rapid succession, chances are slim that the new object allocated from that freed memory block will be the same type as the original. When the types of the original and the new object are different, you have a chance of figuring out that the content has changed out from under the pointer. However, to do that you need a way to know the type of the original object that the pointer referred to. In many situations in native compiled applications, the type of the pointer variable itself is not retained at runtime. A pointer is a pointer as far as the CPU is concerned - the hardware knows very little of data types. In a severe diagnostic mode it's conceivable that you could build a lookup table to associate every pointer variable with the type allocated and assigned to it, but this is an enormous task.
That's why Assigned() is not an assertion that the pointer is valid. It just tests that the pointer is not nil.
Why did Borland create the Assigned() function to begin with? To further hide pointerisms from novice and occasional programmers. Function calls are easier to read and understand than pointer operations.
The bottom line is that you should not be attempting to detect dangling pointers in code. If you are going to refer to pointers after they have been freed, set the pointer to nil when you free it. But the best approach is not to refer to pointers after they have been freed.
So, how do you avoid referring to pointers after they have been freed? There are a couple of common idioms that get you a long way.
Create objects in a constructor and destroy them in the destructor. Then you simply cannot refer to the pointer before creation or after destruction.
Use a local variable pointer that is created at the beginning of the function and destroyed as the last act of the function.
One thing I would strongly recommend is to avoid writing if Assigned() tests into your code unless it is expected behaviour that the pointer may not be created. Your code will become hard to read and you will also lose track of whether the pointer being nil is to be expected or is a bug.
Of course we all do make mistakes and leave dangling pointers. Using FreeAndNil is one cheap way to ensure that dangling pointer access is detected. A more effective method is to use FastMM in full debug mode. I cannot recommend this highly enough. If you are not using this wonderful tool, you should start doing so ASAP.
If you find yourself struggling with dangling pointers and you find it hard to work out why then you probably need to refactor the code to fit into one of the two idioms above.
You can draw a parallel with array indexing errors. My advice is not to check in code for validity of index. Instead use range checking and let the tools do the work and keep the code clean. The exception to this is where the input comes from outside your program, e.g. user input.
My parting shot: only ever write if Assigned if it is normal behaviour for the pointer to be nil.
Use a memory manager, such as FastMM, that provides debugging support, in particular to fill a block of freed memory with a given byte pattern. You can then dereference the pointer to see if it points at a memory block that starts with the byte pattern, or you can let the code run normally and raise an AV if it tries to access a freed memory block through a dangling pointer. The AV's reported memory address will usually be either exactly the same as, or close to, the byte pattern.
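The fill-on-free idea is not specific to Delphi, so here is a rough sketch of it in C. Everything in it is hypothetical: dbg_alloc, dbg_free and dbg_looks_freed are illustrative names, and the fill byte 0x80 is simply an assumed value, not a statement about what any particular memory manager uses. The sketch deliberately never hands the block back to the heap, so inspecting it afterwards stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FREED_BYTE 0x80u   /* assumption: any recognisable byte would do */

/* Header placed in front of each block so the free routine knows how
   many bytes to overwrite. The union keeps the payload aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} dbg_header;

/* Allocate n bytes with a size header in front. */
void *dbg_alloc(size_t n)
{
    dbg_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;
}

/* "Free" a block: fill it with the pattern and keep it quarantined
   (intentionally not passed to free()). */
void dbg_free(void *p)
{
    if (p == NULL)
        return;
    dbg_header *h = (dbg_header *)p - 1;
    memset(p, (int)FREED_BYTE, h->size);
}

/* Returns 1 if every byte of the block carries the freed pattern. */
int dbg_looks_freed(const void *p)
{
    assert(p != NULL);
    const dbg_header *h = (const dbg_header *)p - 1;
    const unsigned char *b = p;
    for (size_t i = 0; i < h->size; i++)
        if (b[i] != FREED_BYTE)
            return 0;
    return h->size > 0;
}
```

A live block can of course happen to contain the pattern too, so a check like this is a diagnostic hint and not proof that a pointer is dangling.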
Nothing can find a dangling (once valid but then not) pointer. It's your responsibility to either make sure it's set to nil when you free its content, or to limit the scope of the pointer variable to only be available within the scope it's valid. (The second is the better solution whenever possible.)
The core point is that the way objects are implemented in Delphi has some built-in design drawbacks:
there is no distinction between an object and a reference to an object. For "normal" variables, say a scalar (like int) or a record, these two use cases can be told apart easily: there's either a type Integer or TSomeRec, or a type like PInteger = ^Integer or PSomeRec = ^TSomeRec, which are different types. This may sound like a negligible technicality, but it isn't: a SomeRec: TSomeRec denotes "this scope is the original owner of that record and controls its lifecycle", while SomeRec: PSomeRec says "this scope uses a transient reference to some data, but has no control over the record's lifecycle". So, as dumb as it may sound, for objects there's virtually no one who explicitly has control over other objects' lifecycles. The result is - surprise - that the lifecycle state of objects may in certain situations be unclear.
an object reference is just a simple pointer. Basically, that's ok, but the problem is that there's sure a lot of code out there which treats object references as if they were a 32bit or 64bit integer number. So if e.g. Embarcadero wanted to change the implementation of an object reference (and make it not a simple pointer any more), they would break a lot of code.
But if Embarcadero wanted to eliminate dangling object pointers, they would have to redesign Delphi object references:
when an object is freed, all references to it must be freed, too. This is only possible by double-linking both, i.e. the object instance must carry a list with all of the references to it, that is, all memory addresses where such pointers are (on the lowest level). Upon destruction, that list is traversed, and all those pointers are set to nil
a slightly more comfortable solution would be for the "one" holding such a reference to register a callback to get informed when a referenced object is destroyed. In code: when I have a reference FSomeObject: TSomeObject I would want to be able to write in e.g. SetSomeObject: FSomeObject.OnDestruction := Self.HandleDestructionOfSomeObject. But then FSomeObject can't be a pointer; instead, it would have to be at least an (advanced) record type
Of course I can implement all that by myself, but that is tedious, and isn't it something that should be addressed by the language itself? They also managed to implement for x in ...
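To show what the first variant (the object carrying a list of every place that points to it) amounts to when done by hand, here is a small sketch in C rather than Delphi. All names (object, ref_assign, object_destroy, MAX_REFS) are invented for the illustration, and it leaves out the tedious part: unregistering a slot when it is reassigned or goes out of scope.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 8   /* arbitrary limit for the sketch */

typedef struct object {
    int             value;
    struct object **refs[MAX_REFS];  /* addresses of variables pointing here */
    size_t          nrefs;
} object;

/* Store obj in *slot and remember where the slot lives.
   Returns 0 on success, -1 if the reference list is full. */
int ref_assign(object **slot, object *obj)
{
    assert(slot != NULL && obj != NULL);
    if (obj->nrefs == MAX_REFS)
        return -1;
    obj->refs[obj->nrefs++] = slot;
    *slot = obj;
    return 0;
}

/* "Destroy" obj: walk the list and reset every registered slot to NULL,
   so no reference is left dangling. */
void object_destroy(object *obj)
{
    for (size_t i = 0; i < obj->nrefs; i++)
        *obj->refs[i] = NULL;
    obj->nrefs = 0;
}
```

Even in this tiny form the cost is visible: every assignment has to go through ref_assign, and every object pays for the list whether or not anyone ever needs it.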
I want to read the values stored in the Link Register or Frame Pointer from a linux kernel module and I am not sure the syntax to use. For context, I've compiled Android goldfish 3.4 kernel and am using insmod to load my module into the kernel.
My knowledge of this area is entirely hobbyist in nature, someone else might know something really stylish that obviates this dangerous and hackish method.
As a philosophical issue, the kernel doesn't tamper with user-mode operation as part of its normal duties. This means you are going to have to tamper with the direct operation of the kernel and potentially cause crashes, corruption and other problematic c-words.
There are two ways to go about doing this. You can go through the syscall entry/exit mechanism: switching a single running thread from running usermode code to running kernel code in the context of that thread while slyly replacing its stored registers before it goes back again. The second is the context switch mechanism itself, which switches in kernel mode from running in the context of one thread to another, again replacing relevant stored register material.
The operating theory behind all of this is that each user thread has both a user-mode stack and a kernel-mode stack. When a thread enters the kernel, the current value of the user-mode stack and instruction pointer are saved to the thread's kernel-mode stack, and the CPU switches to the kernel-mode stack. The remaining register values and flags are then also saved to the kernel stack.
At this stage, you can directly read and modify those values prior to the process being returned off the run queue. After this, when your thread returns from the kernel to user-mode, the register values and flags are popped from the kernel-mode stack, then the user-mode stack and instruction pointer values are restored from the modified values on the kernel-mode stack.
The scheduler has an internal mechanism that selects the process to run next, calling switch_to(). As the name implies, this function essentially just switches the kernel stacks - it saves the current value of the stack pointer into the TCB for the current thread (called struct task_struct in Linux), and loads a previously-saved stack pointer from the TCB for the next thread. You can use this to calculate the user-mode process in question (possibly requiring a cross-reference of existing kernel-mode process structures).
The way to look at the state of the current userspace process from kernel-side is current_pt_regs() (cf. task_pt_regs() for a specific task). This gets you a pointer to a struct pt_regs, which is the same thing you'd find in the mcontext_t in a signal handler (on ARM at least). The kernel even provides nice accessor macros to make the whole caboodle rather civilised - reading through existing uses in the source should give a good feel of how to do it, but for the sake of completeness here's a trivial example*:
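#include <linux/printk.h>   /* pr_info() */
#include <asm/ptrace.h>     /* struct pt_regs, current_pt_regs() */

void func(void)
{
    /* Saved user-mode registers of the task running on this CPU */
    struct pt_regs *regs = current_pt_regs();

    pr_info("User LR was %p\n", (void *)regs->ARM_lr);
}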
You'd have to know the ABI details of the userspace binary to know which, if any, register is being used as a frame pointer, but if there is one it's typically in r11 or r7.
*Code typed directly into browser late at night, usual disclaimers apply, etc.
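If what is wanted is the return address or frame of the code doing the asking, rather than the saved user-space registers, that is a different technique: GCC and Clang provide builtins for it, and on ARM the return address at function entry is what was in LR. A small sketch in plain C, with hypothetical wrapper names:

```c
#include <assert.h>
#include <stdio.h>

/* noinline so each wrapper has a call frame and return address of its own. */
__attribute__((noinline)) void *caller_return_address(void)
{
    /* Address this function will return to, i.e. a location in the caller. */
    return __builtin_return_address(0);
}

__attribute__((noinline)) void *own_frame_address(void)
{
    /* Frame address of this function; only meaningful while it is running,
       so treat the returned value as a number, not something to dereference. */
    return __builtin_frame_address(0);
}

/* Print both values for inspection. */
void report_frame_info(void)
{
    void *ra = caller_return_address();
    void *fa = own_frame_address();
    assert(ra != NULL);
    printf("return address %p, frame address %p\n", ra, fa);
}
```

Only level 0 is used here; asking these builtins for frames further up the call chain is less dependable and depends on how the code was compiled.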
I'm somewhat new to Delphi, and this question is just me being curious. (I also just tried using it by accident only to discover I'm not supposed to.)
If you look at the documentation for TObject.InitInstance it tells you not to use it unless you're overriding NewInstance. The method is also public. Why not make it protected if the user is never supposed to call it?
Since I was around when this whole Delphi thing got started back around mid-1992, there are likely several answers to this question. If you look at the original declaration for TObject in Delphi 1, there weren't any protected/private members on TObject. That was because very early on in the development of Delphi and in concert with the introduction of exceptions to the language, exceptions were allocated from a different heap than other objects. This was the genesis of the NewInstance/InitInstance/CleanupInstance/FreeInstance functions. Overriding these functions on your class types you can literally control where an object is allocated.
In recent years I've used this functionality to create a cache of object instances that are literally "recycled". By intercepting NewInstance and FreeInstance, I created a system where instances are not returned to the heap upon de-allocation, rather they are placed on a lock-free/low-lock linked list. This makes allocating/freeing instances of a particular type much faster and eliminates a lot of excursions into the memory manager.
By having InitInstance public (the opposite of which is CleanupInstance), this would allow those methods to be called from other utility functions. In the above case I mentioned, InitInstance could be called on an existing block of memory without having to be called only from NewInstance. Suppose NewInstance calls a general purpose function that manages the aforementioned cache. The "scope" of the class instance is lost so the only way to call InitInstance is if it were public.
One of these days, we'll likely ship the code that does what I described above... for now it's part of an internal "research" project.
Oh, as an aside and also a bit of a history lesson... Prior to the Delphi 1 release, the design of how Exception instances were allocated/freed was returned to using the same heap as all the other objects. Because of an overall collective misstep it was assumed that we needed to allocate all Exception object instances to "protect" the Out of memory case. We reasoned that if we try and raise an exception because the memory manager was "out of memory", how in the blazes would we allocate the exception instance!? We already know there is no memory at that point! So we decided that a separate heap was necessary for all exceptions... until either Chuck Jazdzewski or Anders Hejlsberg (I forget exactly which one), figured out a simple, rather clever solution... Just pre-allocate the out of memory exception on startup! We still needed to control whether or not the exception should ever actually be freed (Exception instances are automatically freed once handled), so the whole NewInstance/FreeInstance mechanism remained.
Well never say never. In the VCL too much stuff is private and not virtual as it is, so I kinda like the fact that this stuff is public.
It isn't really necessary for normal use, but in specific cases, you might use it to allocate objects in bulk. NewInstance reserves a bit of memory for the object and then calls InitInstance to initialize it. You could write a piece of code that allocates memory for a great number of objects in one go, and then calls InitInstance for different parts of that large block to initialize different blocks in it. Such an implementation could be the base for a flyweight pattern implementation.
Normally you wouldn't need such a thing at all, but it's nice that you can if you really want/need to.
How does it work?
The fun thing is: a constructor in Delphi is just some method. The Create method itself doesn't do anything special. If you look at it, it is just a method as any other. It's even empty in TObject!
You can even call it on an instance (call MyObject.Create instead of TMyObject.Create), and it won't return a new object at all. The key is in the constructor keyword. That tells the compiler, that before executing the TAnyClass.Create method, it should also construct an actual object instance.
That construction means basically calling NewInstance. NewInstance allocates a piece of memory for the data of the object. After that, it calls InitInstance to do some special initialization of that memory, starting with clearing it (filling with zeroes).
Allocating memory is a relatively expensive task. A memory manager (compiled into your application) needs to find a free piece of memory and assign it to your object. If it doesn't have enough memory available, it needs to make a request to Windows to give it some more. If you have thousands or even millions of objects to create, then this can be inefficient.
In those rare cases, you could decide to allocate the memory for all those objects in one go. In that case you won't call the constructor at all, because you don't want to call NewInstance (because it would allocate extra memory). Instead, you can call InitInstance yourself to initialize pieces of your big chunk of memory.
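The split between "get the memory" and "initialize the memory" is easier to see with the language machinery stripped away. The following is a C analogue, not Delphi, and every name in it (particle, particle_init_instance, particle_bulk_new) is hypothetical: one allocation for many objects, followed by a per-object init step that plays the role InitInstance plays above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int    id;
    double weight;
} particle;

/* Analogue of InitInstance: prepare an already-allocated block for use
   by zero-filling it. Getting the memory is the caller's job. */
particle *particle_init_instance(void *mem)
{
    assert(mem != NULL);
    memset(mem, 0, sizeof(particle));
    return mem;
}

/* One trip to the memory manager for count objects, each one then
   initialised in place. Returns NULL on failure. */
particle *particle_bulk_new(size_t count)
{
    if (count > SIZE_MAX / sizeof(particle))
        return NULL;                       /* size would overflow */
    particle *block = malloc(count * sizeof *block);
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        particle_init_instance(&block[i])->id = (int)i;
    return block;
}
```

The whole block is released with a single free(), which is the other half of the saving.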
Anyway, this is just a hypothesis about the reason. Maybe there isn't a reason at all. I've seen so many irrationally applied visibility levels in the VCL. Maybe they just didn't think about it at all. ;)
It gives developers a way to create objects without using NewInstance (memory from the stack or a memory pool).
Is there a way to set an entry point in DWScript?
For example, if I start a script execution, I'd like it to execute a procedure Main, rather than the code in the regular entry point (begin ... end.).
I know it's possible to execute functions from Delphi, but I'm not sure this would be quite the same.
Aside from writing your procedure Main(); and then having your regular script entry point consist of nothing but calling Main, which is probably not what you're thinking of, no, there's no way to do that in DWS.
For all its innovations in syntax, DWS is still Pascal, and it still works the way Pascal works. Asking for some sort of named Main routine would be a radical departure from Pascal style.
EDIT: To answer the clarification posted in comments:
If you want your script to spawn a new script thread, you'll have to handle it in external Delphi code. As of this writing, the DWS system doesn't have any concept of multithreading built in. If you wanted to do it, you'd do something like this:
Create an external routine called something like SpawnThread(EntryPoint: string). Its eval method (out in Native-Delphi-land) would spawn a new thread that loads the current script, then finds the routine with the specified name and executes it.
That's about the only way you could get it to work without language-level support. If you'd like a way to spawn threads from within DWS, try adding it as a feature request to the issue tracker.
Calling functions directly is explained in
https://code.google.com/p/dwscript/wiki/FirstSteps#Functions
If you want to execute a function in a different thread, you'll need some Delphi-side code to create a new thread, a new execution, and then call your functions. The main and threaded-execution will then be sandboxed from each other (so can't share global vars etc.).
If you need to share data between the threads, you could do that by exposing functions or external variables, which would call into Delphi code with the proper synchronizations and locks in place (what is "proper" will depend on what your code wants to do, like always in multithreading...).
Note that it is possible to pass objects, interfaces and dynamic arrays between script executions (provided they're executions of the same program), but just like with regular code, you'll have to use locks, critical sections or mutexes explicitly.
I'm allocating my pthread thread-specific data from a fixed-size global pool that's controlled by a mutex. (The code in question is not permitted to allocate memory dynamically; all the memory it's allowed to use is provided by the caller as a single buffer. pthreads might allocate memory, I couldn't say, but this doesn't mean that my code is allowed to.)
This is easy to handle when creating the data, because the function can check the result of pthread_getspecific: if it returns NULL, the global pool's mutex can be taken there and then, the pool entry acquired, and the value set using pthread_setspecific.
When the thread is destroyed, the destructor function (as per pthread_key_create) is called, but the pthreads manual is a bit vague about any restrictions that might be in place.
(I can't impose any requirements on the thread code, such as needing it to call a destructor manually before it exits. So, I could leave the data allocated, and maybe treat the pool as some kind of cache, reusing entries on an LRU basis once it becomes full -- and this is probably the approach I'd take on Windows when using the native API -- but it would be neatest to have the per-thread data correctly freed when each thread is destroyed.)
Can I just take the mutex in the destructor? There's no problem with thread destruction being delayed a bit, should some other thread have the mutex taken at that point. But is this guaranteed to work? My worry is that the thread may "no longer exist" at that point. I use quotes, because of course it certainly exists if it's still running code! -- but will it exist enough to permit a mutex to be acquired? Is this documented anywhere?
The pthread_key_create() rationale seems to justify doing whatever you want from a destructor, provided you keep signal handlers from calling pthread_exit():
There is no notion of a destructor-safe function. If an application does not call pthread_exit() from a signal handler, or if it blocks any signal whose handler may call pthread_exit() while calling async-unsafe functions, all functions may be safely called from destructors.
Do note, however, that this section is informative, not normative.
The thread's existence or non-existence will most likely not affect the mutex in the least, unless the mutex is error-checking. Even then, the kernel is still scheduling whatever thread your destructor is being run on, so there should definitely be enough thread to go around.
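For what it's worth, the arrangement discussed in this thread can be sketched as follows. This is only an illustration under assumptions: the pool size, the slot type and the function names (slot_get, slot_release, slots_in_use, worker) are all made up, and the pool is a static array standing in for the caller-provided buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SLOTS 4   /* arbitrary size for the sketch */

typedef struct {
    int in_use;
    int payload;
} slot;

static slot            pool[POOL_SLOTS];
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_key_t   pool_key;
static pthread_once_t  pool_once = PTHREAD_ONCE_INIT;

/* Key destructor. It runs on the exiting thread itself, so taking the
   pool mutex here is an ordinary lock operation. */
static void slot_release(void *p)
{
    slot *s = p;
    pthread_mutex_lock(&pool_lock);
    s->in_use = 0;
    pthread_mutex_unlock(&pool_lock);
}

static void make_key(void)
{
    int rc = pthread_key_create(&pool_key, slot_release);
    assert(rc == 0);
    (void)rc;
}

/* Return this thread's slot, claiming one from the pool on first use.
   Returns NULL if the pool is exhausted. */
slot *slot_get(void)
{
    pthread_once(&pool_once, make_key);
    slot *s = pthread_getspecific(pool_key);
    if (s != NULL)
        return s;
    pthread_mutex_lock(&pool_lock);
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            s = &pool[i];
            break;
        }
    }
    pthread_mutex_unlock(&pool_lock);
    if (s != NULL && pthread_setspecific(pool_key, s) != 0) {
        slot_release(s);       /* could not bind it, hand it back */
        s = NULL;
    }
    return s;
}

/* Number of slots currently claimed, for checking the behaviour. */
int slots_in_use(void)
{
    int n = 0;
    pthread_mutex_lock(&pool_lock);
    for (size_t i = 0; i < POOL_SLOTS; i++)
        n += pool[i].in_use;
    pthread_mutex_unlock(&pool_lock);
    return n;
}

/* Thread body that just claims a slot and exits. */
void *worker(void *arg)
{
    (void)arg;
    return slot_get();
}
```

After a pthread_join on a thread that ran worker, slots_in_use() should be back to zero, because the destructor has run by the time the thread has terminated.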