According to the Lua 5.1 manual, lua_xmove moves values between stacks of different threads belonging to the same Lua state. However, I accidentally used it to move values across different Lua states, and it seemed to work fine! Is there any other API to move values from one Lua state to another (in 5.1), or can lua_xmove be used?
Lua stores garbage-collection data in the global state. So if you move collectable objects (such as strings) across states, you can potentially confuse the garbage collector and create dangling references.
So, while it might look like it works, it could just as easily cause problems later on.
For reference, see this mailing list thread where developers discuss this exact issue.
Note that lua_xmove does check that the global states are the same:
api_check(from, G(from) == G(to));
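The safe alternative is to copy values between independent states by hand through the C API; only plain data (nil, booleans, numbers, strings) can cross a state boundary this way. A minimal sketch, assuming Lua 5.1 (the helper name copy_value is mine, not part of the Lua API):

#include <lua.h>
#include <lauxlib.h>

/* Copy the value at index idx of 'from' onto the top of 'to'.
   Only plain data is handled; tables would need a deep copy, and
   functions/userdata cannot be moved between states safely at all. */
static void copy_value(lua_State *from, int idx, lua_State *to)
{
    switch (lua_type(from, idx)) {
    case LUA_TNIL:
        lua_pushnil(to);
        break;
    case LUA_TBOOLEAN:
        lua_pushboolean(to, lua_toboolean(from, idx));
        break;
    case LUA_TNUMBER:
        lua_pushnumber(to, lua_tonumber(from, idx));
        break;
    case LUA_TSTRING: {
        size_t len;
        const char *s = lua_tolstring(from, idx, &len);
        lua_pushlstring(to, s, len);  /* 'to' interns its own copy */
        break;
    }
    default:
        luaL_error(from, "cannot copy a %s between states",
                   lua_typename(from, lua_type(from, idx)));
    }
}

Because lua_pushlstring makes a fresh copy in the target state, no GC object is ever shared between the two global states.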
I have a shared object, sw_core.so. I need to have multiple instances (with separate memory allocations) of this .so in the main program. From the main program, I will be invoking the display_context() function defined in sw_core.so. All calls to display_context() need to run in parallel. sw_core.so is thread safe (no memory dependency, to my knowledge).
To solve the above problem:
dlopen is used to load sw_core.so with RTLD_LAZY, in an attempt to get multiple instances of the .so.
pthreads are used to invoke display_context(), with the symbol obtained via dlsym().
The number of threads tried is 2; anything above 2 results in a segfault.
Even with 2 threads, the segfault occurs when pthread_join() is reached for the 2nd thread.
I tried Valgrind to check for memory leaks, but it does not show any serious leakage.
It is not clear what you are trying to achieve by loading the same library several times. The data segment of a shared library is created once per process, in a single copy, and initialized with whatever initial values the library specifies.
If your calls use data or state stored in the library, at best you would overwrite that state.
Note: I work at Synopsys. If you do as well, Priyan, you may want to contact me internally.
Next, the Valgrind perspective.
Memory leaks are not likely to be the issue. I would recommend that you first ensure that you have no memcheck issues; memcheck is single-threaded, so any problems it reports are not related to threading. After that you can use DRD or Helgrind to detect threading issues.
Lastly, I don't think you can open multiple different instances of a shared library. The dlopen man page says:
If the same shared object is loaded again with dlopen(), the same object handle is returned. The dynamic linker maintains reference counts for object handles, so a dynamically loaded shared object is not deallocated until dlclose() has been called on it as many times as dlopen() has succeeded on it. Any initialization returns (see below) are called just once. However, a subsequent dlopen() call that loads the same shared object with RTLD_NOW may force symbol resolution for a shared object earlier loaded with RTLD_LAZY.
dlopen probably does not work here because it causes false sharing of public symbols between the different copies of sw_core.so. To achieve proper isolation, use dlmopen:
void *h = dlmopen (LM_ID_NEWLM, "path/to/sw_core.so", RTLD_LAZY | RTLD_LOCAL);
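A minimal sketch of how the pieces could fit together (the symbol name display_context and the thread count come from the question; error handling is abbreviated, and dlmopen/LM_ID_NEWLM are glibc extensions, so _GNU_SOURCE is required):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 2

static void *run(void *arg)
{
    void (*display_context)(void) = (void (*)(void))arg;
    display_context();
    return NULL;
}

int main(void)
{
    pthread_t tid[NUM_THREADS];
    void *handle[NUM_THREADS];

    for (int i = 0; i < NUM_THREADS; i++) {
        /* LM_ID_NEWLM loads the library into a fresh namespace, so
           each instance gets its own copy of the data segment. */
        handle[i] = dlmopen(LM_ID_NEWLM, "path/to/sw_core.so",
                            RTLD_LAZY | RTLD_LOCAL);
        if (!handle[i]) { fprintf(stderr, "%s\n", dlerror()); exit(1); }

        void *sym = dlsym(handle[i], "display_context");
        if (!sym) { fprintf(stderr, "%s\n", dlerror()); exit(1); }

        pthread_create(&tid[i], NULL, run, sym);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        dlclose(handle[i]);
    return 0;
}

Build with something like gcc -o demo demo.c -ldl -lpthread.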
Thanks a lot for all your help! I could successfully load a single library multiple times, each instance in a separate namespace, with dlmopen(). The only issue I faced is that gcc did not support dlmopen(); I checked up to version 4.9 with no success. With g++, no issues were seen in loading.
All static, global, and static-global variables get their own memory for each instance of the library loaded.
Note: I am using my own library (as the argument to dlmopen) for this experiment, not any standard library.
Priyan
I'm working on a Python project, where I'm currently trying to speed things up in some horrible ways: I set up my Z3 solvers, then I fork the process, and have Z3 perform the solve in the child process and pass a pickle-able representation of the model back to the parent.
This works great, and represents the first stage of what I'm trying to do: the parent process is now no longer CPU-bound. The next step is to multi-thread the parent, so that we can solve multiple Z3 solvers in parallel.
I'm pretty sure I've mutexed away any concurrent accesses of Z3 in the setup phase, and only one thread should be touching Z3 at any one time. However, despite this, I'm getting random segfaults in libz3.so. It's important to note, at this point, that it's not always the same thread that touches Z3 -- the same object (not the solvers themselves, but the expressions) might be handled by different threads at different times.
My question is, is it possible to multi-thread Z3? There is a brief note here (http://research.microsoft.com/en-us/um/redmond/projects/z3/z3.html) saying "It is not safe to access Z3 objects from multiple threads.", which I guess would answer my question, but I'm holding out hope that it means to say that one shouldn't access Z3 from multiple threads simultaneously. Another resource (Again: Installing Z3 + Python on Windows) states, from Leonardo himself, that "Z3 uses thread local storage", which, I guess, would sink this whole undertaking, but a) that answer is from 2012, so maybe things have changed, and b) maybe it uses thread-local storage for some unrelated stuff?
Anyways, is multi-threading Z3 possible (from Python)? I'd hate to have to push the setup phase into the child processes...
Z3 does indeed use thread-local storage, but as far as I can see there is only one place left in the code where it does so (to track how much memory each thread is using, in memory_manager.cpp), and that should not be responsible for the symptoms you see.
Z3 should behave nicely in a multi-threaded setting if every thread strictly uses only its own context object (Z3_context, or in Python the class Context). This means that an object created through one Context cannot interact in any way with the other Contexts; if that is required, the objects first have to be translated from one Context to the other, e.g. in Python via functions like translate(...) in class ASTRef.
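For illustration, a minimal sketch of that rule using the Z3 C API (in the Python bindings, ASTRef.translate(...) plays the role of Z3_translate):

#include <z3.h>

int main(void)
{
    /* Each thread should create and use only its own context. */
    Z3_config cfg = Z3_mk_config();
    Z3_context ctx_a = Z3_mk_context(cfg);
    Z3_context ctx_b = Z3_mk_context(cfg);
    Z3_del_config(cfg);

    /* An expression built in ctx_a ... */
    Z3_ast x = Z3_mk_const(ctx_a,
                           Z3_mk_string_symbol(ctx_a, "x"),
                           Z3_mk_int_sort(ctx_a));

    /* ... must be translated before it may touch ctx_b. */
    Z3_ast x_in_b = Z3_translate(ctx_a, x, ctx_b);
    (void)x_in_b;

    Z3_del_context(ctx_a);
    Z3_del_context(ctx_b);
    return 0;
}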
That said, there surely are some bugs left to fix. My first suspect for random segfaults would be the garbage collector, because it might not interact nicely with Z3's reference counting (this has been an issue in other APIs). There is also a known bug that is triggered when many Context objects are created at the same time (on my todo list though...)
I just played around a bit with Lua and tried the Koneki Eclipse plugin, which is quite nice. The problem is that when I change a function I'm currently debugging, the changes do not take effect when I save them, so I'm forced to restart the application. It would be so nice if I could make changes in the debugger and have them take effect on the fly, as in Smalltalk, or to some extent as with hot code replacement in Java. Does anybody have a clue whether this is possible?
It is possible to some degree with some limitations. I've been developing an IDE/debugger that provides this functionality. It gives you access to a remote console to execute commands in the context/environment of your running application. The IDE also supports live coding, which reloads modified code as you make changes to it; see demos here.
The main limitation is that you can't modify a currently running function (at least without changes to Lua VM). This means that the effect of your changes to the currently running function will only be seen after you exit and re-enter that function. It works well for environments that call the same function repeatedly (for example a game engine calling draw), but may not work in your case.
Another challenge is dealing with upvalues (values that are created outside of your function and are referenced inside it). There are methods to "read" the current upvalues and re-attach them when the (new) function is created, but this requires some code analysis to find which functions will be recreated, to query them for their upvalues and current values, and then to create the new function with those upvalues assigned the proper values. My current implementation doesn't do this, which means you need to use global variables as a workaround.
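As a rough sketch of the upvalue part in Lua 5.1's C API (it assumes the replacement closure declares its upvalues in the same order as the old one, which real code would have to verify):

#include <lua.h>

/* Copy upvalues from the closure at stack index old_idx into the
   closure at new_idx. Indexes must be absolute. */
static void copy_upvalues(lua_State *L, int old_idx, int new_idx)
{
    for (int n = 1; ; n++) {
        /* Pushes the n-th upvalue of the old closure, or
           returns NULL when there are no more. */
        if (lua_getupvalue(L, old_idx, n) == NULL)
            break;
        /* Pops the pushed value into the new closure's upvalue. */
        if (lua_setupvalue(L, new_idx, n) == NULL) {
            lua_pop(L, 1);  /* new closure has fewer upvalues */
            break;
        }
    }
}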
There was also relevant discussion just the other day on the Lua mailing list.
Is it possible to store all changes to a set by keeping some kind of logical path of the changes as they occur, such that one may revert the changes by essentially "stepping back"? I assume that something would need to map the changes as they occur, and that the process of reverting them would thus ultimately be linear.
Apologies for any incoherence; this isn't applicable to any particular language. Rather, it's a problem of memory: can a set (e.g., some store of user input) of finite size that's changed continuously (at any given time, for any amount of time; there's no limit on how much it can be changed) be mapped procedurally, such that new (future) changes are treated as the consequence of prior changes, in a second, mirror store that can be used to revert the state of the set all the way back to its initial state?
You might want to look at some functional data structures. Functional languages, like Erlang, make it easy to roll back to an earlier state, since changes are always made to new data structures instead of mutating existing ones. While this feature can be used repeatedly at any level, Erlang programs typically use it abundantly at the top level of a "process", so that on any kind of failure the process aborts both its computation and all of its changes in their entirety, simply by throwing an exception. (In a non-functional language using mutable data structures, you'd still be able to throw an exception to abort, but restoring the originals would be your program's job, not the runtime's.) This is one reason Erlang has a solid reputation.
Some of this functional style of programming is usefully applied to non-functional languages, in particular, use of immutable data structures, such as immutable sets, lists, or trees.
Regarding immutable sets, for example, one might design a functionally-oriented data structure where modifications always generate a new set given some changes and an existing set (a change set consisting of additions and removals). You'd leave the old set hanging around for reference (by whomever); languages with automatic garbage collection reclaim the old ones when they're no longer being used (referenced).
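As a minimal sketch of this style in C (a persistent list standing in for a set; an insert shares the whole old structure instead of copying it, and every old version remains a valid snapshot):

#include <stdlib.h>

/* An immutable node: once created, it is never modified. */
typedef struct Node {
    int value;
    const struct Node *next;  /* shared with older versions */
} Node;

/* "Inserting" returns a new version; the old set is untouched. */
static const Node *insert(const Node *set, int value)
{
    Node *n = malloc(sizeof *n);
    n->value = value;
    n->next = set;  /* structural sharing */
    return n;
}

int main(void)
{
    const Node *v0 = NULL;           /* initial (empty) state */
    const Node *v1 = insert(v0, 1);  /* v0 is still usable */
    const Node *v2 = insert(v1, 2);  /* "reverting" = going back to v1 */
    (void)v2;
    return 0;
}

Without a garbage collector, freeing the old versions is your job, typically via reference counting, as discussed below.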
You can put an id or tag into your set data structure; this way you can do some introspection to see which version of the data structure someone is holding. You can also capture the id of the base from which each new version was generated; this gives you some history or lineage.
If desired, you can also capture a reference to the entire old data structure in the new one, or maintain a global list of all the sets as they are generated. If you do, however, you'll have to take on more responsibility for storage management, as an automatic collector will probably not find any unused (unreferenced) garbage to collect without some additional help.
Database designs do some of this in their transaction controllers. For the purposes of your question, you can think of a database as a glorified set. You might look into MVCC (Multi-Version Concurrency Control) as one example that is reasonably well written up in the literature. This technique keeps old snapshot versions of data structures around (temporarily), meaning that mutations always appear in new versions of the data. An old snapshot is maintained until no active transaction references it, and is then discarded.

When two concurrently running transactions both modify the database, they each get a new version based off the same current and latest data set. (The transaction controller knows exactly which version each transaction is based on, though the transaction's client doesn't see the version information.) Assuming both concurrent transactions choose to commit their changes, the versioning control in the transaction controller recognizes that the second committer is trying to commit a change set that is not a logical successor to the first (since, as we postulated above, both change sets were based on the same earlier version). If possible, the transaction controller will merge the changes as if the second committer had really been working off the newer version committed by the first. (There are varying definitions of when this is possible; MVCC says it is possible when there are no write conflicts, which is a less-than-perfect answer but fast and scalable.) If not possible, it will abort the second committer's transaction and inform them (they then have the opportunity, should they like, to retry their transaction starting from the newer base).

Under the covers, the various snapshot versions in flight across concurrent transactions will probably share the bulk of the data (with some transaction-specific change sets that are consulted first) in order to make the snapshots cheap. There is usually no API provided to access older versions, so in this domain the transaction controller knows that, as transactions retire, the original snapshot versions they were using can also be (reference counted and) retired.
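The commit-time version check at the heart of this can be sketched in a few lines of C (a toy single-value store; a real MVCC controller also tracks per-transaction snapshots and write sets, and would need a lock or compare-and-swap around the check):

#include <stdbool.h>

typedef struct {
    int value;
    int version;  /* incremented on every successful commit */
} Store;

/* Commit succeeds only if nobody else committed since this
   transaction read base_version; otherwise the caller retries
   from the newer version (or merges, when possible). */
static bool commit(Store *s, int base_version, int new_value)
{
    if (s->version != base_version)
        return false;  /* a concurrent committer won */
    s->value = new_value;
    s->version++;
    return true;
}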
Another area where this is done is append-only files. Logging is a way of recording changes, and some databases are based entirely on log-oriented designs.
BerkeleyDB has a nice log structure. Though used mostly for recovery, it contains all the history, so you can recreate the database from the log (up to the point where you purge the log, at which time you should also archive the database). Again, someone has to decide when to start a new log file and when to purge old log files, which you'd do to conserve space.
These database techniques can be applied in memory as well. (Nothing is free, though, of course ;)
Anyway, yes, there are fields where this is done.
Immutable data structures help preserve history, by simply keeping old copies; changes always go to new copies. (And efficiency techniques can make this not as bad as it sounds.)
Ids can help you understand lineage without necessarily holding onto all the old copies.
If you do want to hold onto all the old copies, you have to look at your domain design to understand when/how/if old data structures can get accessed, with an eye toward how to eventually reclaim them. You'll most likely have to get involved in defining how they get released, if ever, or how they get archived for posterity, though at the cost of slower access later.
Recently, I have encountered many difficulties developing with C++ and Lua. My situation is: for some reason, there can be thousands of Lua states in my C++ program, but these states should all be identical just after initialization. Of course, I can call luaL_openlibs() and luaL_loadfile() for each state, but that is pretty heavy (in fact, it takes a rather long time even just to initialize one state). So I am wondering about the following scheme: what about keeping a separate Lua state (the only state that has to be initialized) which is then cloned into the other Lua states? Is that possible?
When I started with Lua, like you I once wrote a program with thousands of states, had the same problem and thoughts, until I realized I was doing it totally wrong :)
Lua has coroutines and threads; you need to use these features to do what you need. They can be a bit tricky at first, but you should be able to understand them in a few days, and it'll be well worth your time.
Take a look at the following Lua API call; I think it is exactly what you need.
lua_State *lua_newthread (lua_State *L);
This creates a new thread, pushes it on the stack, and returns a pointer to a lua_State that represents this new thread. The new thread returned by this function shares with the original thread its global environment, but has an independent execution stack.
There is no explicit function to close or to destroy a thread. Threads are subject to garbage collection, like any Lua object.
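A short sketch of that pattern (Lua 5.1, error handling omitted): pay the initialization cost once, then create cheap threads that all share the initialized global environment:

#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>

int main(void)
{
    /* Expensive setup happens exactly once. */
    lua_State *L = luaL_newstate();
    luaL_openlibs(L);

    /* Each "clone" shares L's globals but has its own stack.
       The threads stay on L's stack, which keeps them alive. */
    lua_State *t1 = lua_newthread(L);
    lua_State *t2 = lua_newthread(L);

    luaL_dostring(t1, "x = 1");      /* writes a shared global */
    luaL_dostring(t2, "print(x)");   /* prints 1 */

    lua_close(L);  /* threads are collected along with the state */
    return 0;
}

Keep in mind these are Lua threads (coroutines), not OS threads, so they cannot run concurrently without external locking.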
Unfortunately, no.
You could try Pluto to serialize the whole state. It does work pretty well, but in most cases it costs roughly the same time as normal initialization.
I think it will be hard to do exactly what you're requesting, given that just copying the state would copy internal references as well as potential pointers to external data. One would need to reconstruct those internal references in order to avoid having multiple states pointing into the clone source.
You could serialize out the state after one starts up and then load that into subsequent states. If initialization is really expensive, this might be worth it.
I think the closest thing to doing what you want that would be relatively easy would be to put the states in different processes by initializing one state and then forking, however your operating system supports it:
http://en.wikipedia.org/wiki/Fork_(operating_system)
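A rough sketch of that approach on a POSIX system (the file name init.lua and the function run_worker() are placeholders for your own setup and work):

#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Do the expensive initialization exactly once. */
    lua_State *L = luaL_newstate();
    luaL_openlibs(L);
    luaL_dofile(L, "init.lua");          /* placeholder setup script */

    for (int i = 0; i < 4; i++) {
        if (fork() == 0) {
            /* Child: gets a copy-on-write clone of the whole
               process, including the initialized Lua state. */
            luaL_dostring(L, "run_worker()");  /* placeholder */
            _exit(0);
        }
    }
    while (wait(NULL) > 0)   /* parent waits for all children */
        ;
    lua_close(L);
    return 0;
}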
If you want something available from within Lua, you could try something like this:
How do you construct a read-write pipe with lua?