Create and Run Multiple Solver Instances in Parallel - drake

I'd like to run multiple solvers in multiple threads and eventually processes. I'm currently running a for-loop and creating threads like the following:
for (...) {
pthread_t pid;
Args args;
args.solver = solver???
pthread_create(&pid, NULL, &func, (void*)&args);
}
When defining solver, I've tried several options, though none have worked.
First, I tried calling const auto solver = drake::solvers::MakeSolver(solver_id);, then passing solver.get() into each thread's args. This successfully compiles and runs, but I get some obscure failure terminate called recursively in drake::solvers::SnoptSolver::DoSolve. I saw that MakeSolver seems to return a unique ptr around a single solver instance defined in kKnownSolvers, so possibly the threads are calling DoSolve on the same solver instance, causing this issue.
I then tried creating multiple instances of the solver. Calling StaticSolverInterface::Make<SnoptSolver>() didn't work since that is defined in an unnamed namespace and thus is only accessible in that file. Calling const auto solver = drake::solvers::MakeSolver(solver_id); and copying the SolverInterface pointed to by solver isn't possible because SolverInterface is not movable or copyable.
Is what I'm doing possible? If so, how can I achieve this?

Calling drake::solvers::MakeSolver(solver_id) multiple times and giving each one to a different thread should work fine. It returns distinct objects each time, nothing is shared.
Similarly, repeated calls to make_unique<SnoptSolver> or the like should also work, if you can hard-code which solver you'd like instead of going by the id.
The terminate error message is probably caused by an unhandled exception. Generally, when you make a thread you'll want to put a try { } catch (...) { } within the immediate entry point; you don't want exceptions to leave the thread.
Also I strongly suggest std::thread if you're in C++; the pthread API is old and stinky.

Related

Does the thread ever change once inside a reactor execution?

It's been hammered into my head that I shouldn't use ThreadLocal with Reactor. But I want to know if I can use ThreadLocal within a single execution of a reactor function.
Specifically, when inside a Spring Webflux Controller method, can the thread ever change if I don't invoke a reactor function?
Please let me know if this is correct
@GetMapping
public Mono<String> someControllerMethod() {
// Thread 1 executing
ThreadLocal<String> USER_ID = new ThreadLocal<>();
USER_ID.set("1");
Thread.sleep(...);
someMethod();
// Thread 1 executing
assertEquals(USER_ID.get(), "1"); // this will ALWAYS be true
return Mono.just("hello ")
// this is the only time a new thread executes and USER_ID is not set
.flatMap(s -> s + USER_ID.get());
}
void someMethod() {
// Thread 1 executing
assertEquals(USER_ID.get(), "1"); // this will ALWAYS be true
}
Is my understanding above correct?
Revised this section for clarity
In a reactor chain of many operators, each operator (e.g. map) could be run on a different thread, and even different "instances" of the same operator (e.g. map of url N) could be on different threads. But once we're in an instance of an operator, will it always be the same thread (i.e. is it safe to declare a ThreadLocal in an instance of a reactor operator)?
// main thread
Flux.fromIterable(urls)
.map(url -> {
// each of these instances runs on a different thread
// but is declaring ThreadLocal here safe to do?
ThreadLocal<String> URL = new ThreadLocal<>();
URL.set(url);
// Will URL always be set deep in the call stack?
someOtherMethod();
// Will URL always be set at the end?
URL.get();
})
.subscribeOn(Schedulers.boundedElastic())
.subscribe();
void someOtherMethod() {
URL.get(); // will this ALWAYS be set?
}
Basically, I'd like to know whether it's safe to use ThreadLocal objects like io.grpc.Context within a single instance of a Reactor operator execution.
It's been hammered into my head that I shouldn't use ThreadLocal with Reactor.
You mustn't use ThreadLocal in a reactive chain with reactor (which is the only sensible way to use that library.) In a reactive chain, the thread might change whenever you invoke an asynchronous operator - so a single reactive chain could have operations executing on many different threads throughout. In this case your ThreadLocal might work sometimes, but it's unreliable - introduce an async operator that switches the thread (say a web request that's executed on the netty worker pool), and you've then introduced a subtle and weird bug that's hard to track down (you're arbitrarily leaking information from one reactive chain to another unintentionally.) In short, it's incredibly bad practice to tie your reactive chains to a single thread - while it might seem to work initially, you're going to eventually run into a lot of problems if you do.
That being said, you don't really have a reactive chain in the above method - it's incredibly weird. If you're returning a Mono<String> to try to make the method reactive, then you need to be executing everything as part of a reactive chain. What you're actually doing is:
Using synchronous & blocking logic, a complete no-no as it ties up an event loop thread which isn't allowed;
Calling another method that's not part of a reactive chain;
Using a JUnit test method in a controller class;
Wrapping up a value to return in Mono.just();
Making one flatMap call at the end (which won't work as it's not even mapping to a publisher to flatten, you'd have to use map instead.)
...so while using your ThreadLocal is technically "safe" in this context, from a wider perspective the implementation makes no sense at all. You realistically have two options - either make the entire method non-blocking and reactive properly, not just wrapping blocking logic in a reactive publisher, or make the whole controller just return a standard object and forget the reactive element entirely.
Follow-up:
once we're in an instance of an operator, will it always be the same thread (i.e. is it safe to declare a ThreadLocal in an instance of a reactor operator)?
No, there are at least two cases I can think of where that wouldn't be safe:
Operators can be nested. Once you're "inside" a certain operator, there's no reason why other operators can't be used that would also switch thread.
Code in other threads can be explicitly started even if there's no operator.
I don't think you can wind up in cases where the thread changes under you other than those two, but I could well be missing something, and it's still a rather delicate scenario (someone could break it quite easily.) If you must use a ThreadLocal for some reason then I'd still be seriously considering whether you should be using reactor in this context.
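To make the second case concrete, here is a minimal plain-Java sketch (no Reactor involved; the class and method names are made up for illustration). A value set in a ThreadLocal is visible further down the call stack on the same thread, but a task handed to another thread does not see it:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalDemo {
    // Hypothetical holder; stands in for something like a per-request user id.
    static final ThreadLocal<String> USER_ID = new ThreadLocal<>();

    /** Reads USER_ID deeper in the call stack, still on the calling thread. */
    static String readOnSameThread() {
        return USER_ID.get();
    }

    /** Reads USER_ID from a task that runs on a separate pool thread. */
    static String readOnOtherThread() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture.supplyAsync(USER_ID::get, pool).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        USER_ID.set("1");
        System.out.println("same thread:  " + readOnSameThread());   // the value set above
        System.out.println("other thread: " + readOnOtherThread());  // null: nothing was set on that thread
        USER_ID.remove();
    }
}
```

A nested operator that switches scheduler has the same effect as the explicit hand-off here: the code after the switch runs on a thread where the ThreadLocal was never set.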

How to call Sinks.Many<T>.tryEmitNext from multiple threads?

I am wrapping my head around Flux Sinks and cannot understand the higher-level picture. When using Sinks.Many<T> tryEmitNext, the function tells me if there was contention and what I should do in case of failure (FailFast/Handler).
But is there a simple construct which allows me to safely emit elements from multiple threads? For example, instead of letting the user know that there was contention and that they should try again, it could add elements to a queue (mpmc, mpsc etc.) and only notify when the queue is full.
Now I can add a queue myself to alleviate the problem, but it seems a common use case. I guess I am missing a point here.
I hit the same issue, migrating from Processors which support safe emission from multiple threads. I use this custom EmitFailureHandler to do a busy loop as suggested by the EmitFailureHandler docs.
public static EmitFailureHandler retryOnNonSerializedElse(EmitFailureHandler fallback){
return (signalType, emitResult) -> {
if (emitResult == EmitResult.FAIL_NON_SERIALIZED) {
LockSupport.parkNanos(10);
return true;
} else
return fallback.onEmitFailure(signalType, emitResult);
};
}
There are various confusing aspects of the 3.4.0 implementation:
There is an implication that, unless the unsafe variant is used, the sink supports serialized emission, but all the serialized version actually does is fail fast on concurrent emission.
The sink provided by Flux.create does support thread-safe emission.
I hope there will be a solidly engineered alternative to this offered by the library at some point.
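For comparison, the queue idea from the question can be sketched with the standard library alone. This is not a Reactor API, only an illustration under assumptions: the BufferedEmitter class and its method names are hypothetical. Producers on any thread offer into a bounded queue and are told only when it is full, and a single consumer thread drains the queue (in a real setup the drain callback could be the place that calls tryEmitNext, so that only one thread ever touches the sink):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class BufferedEmitter<T> {
    private final BlockingQueue<T> queue;

    BufferedEmitter(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Safe to call from any thread; returns false only when the buffer is full. */
    boolean tryEmit(T item) {
        return queue.offer(item);
    }

    /** Meant for a single consumer thread; passes every buffered item to the sink. */
    int drainTo(Consumer<? super T> sink) {
        int drained = 0;
        T item;
        while ((item = queue.poll()) != null) {
            sink.accept(item);
            drained++;
        }
        return drained;
    }

    /** Four producers emit 250 items each; returns how many items were drained. */
    static int demo() throws InterruptedException {
        BufferedEmitter<Integer> emitter = new BufferedEmitter<>(1000);
        Thread[] producers = new Thread[4];
        for (int p = 0; p < producers.length; p++) {
            producers[p] = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    emitter.tryEmit(i);
                }
            });
            producers[p].start();
        }
        for (Thread producer : producers) {
            producer.join();
        }
        return emitter.drainTo(item -> { /* forward to the real sink here */ });
    }
}
```

The trade-off against the busy-loop handler above is an extra buffer and a dedicated draining thread in exchange for producers that never spin.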

D/Dlang: Lua interface, any way to force users to have no access to intermediate objects?

Status: Sort of solved. Switching Lua.Ref (close equivalent to LuaD LuaObject) to struct as suggested in answer has solved most issues related to freeing references, and I changed back to similar mechanism LuaD uses. More about this in the end.
In one of my projects, I am working with a Lua interface. I have mainly borrowed the ideas from LuaD. The mechanism in LuaD uses lua_ref & lua_unref to be able to move Lua table/function references in D space, but this causes serious problems because destructor calls and their order are not guaranteed. LuaD usually segfaults, at least at program exit.
Because it seems that LuaD is not maintained anymore, I decided to write my own interface for my purposes. My Lua interface class is here: https://github.com/mkoskim/games/blob/master/engine/util/lua.d
Usage examples can be found here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/luademo.d
And in case you need, the Lua script used by the example is here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/data/test.lua
The interface works like this:
Lua.opIndex pushes the global table and the index key to the stack, and returns a Top object. For example, lua["math"] pushes _G and "math" to the stack.
Further accesses go through Top object. Top.opIndex goes deeper in the table hierarchy. Other methods (call, get, set) are "final" methods, which perform an operation with the table and key at the top of the stack, and clean the stack afterwards.
Almost everything works fine, except that this mechanism has a nasty quirk/bug that I have no idea how to solve. If you don't call any of those "final" methods, Top will leave the table and key on the stack:
lua["math"]["abs"].call(-1); // Works. Final method (call) called.
lua["math"]["abs"]; // table ref & key left to stack :(
What I know for sure, is that playing with Top() destructor does not work, as it is not called immediately when object is not referenced anymore.
NOTE: If there is some sort of operator to be called when object is accessed as rvalue, I could replace call(), set() and get() methods with operator overloads.
Questions:
Is there any way to prevent users from writing such expressions (getting a Top object without calling any of the "final" methods)? I really don't want users to write e.g. luafunc = lua["math"]["abs"] and then later try to call it, because it won't work at all, at least not without going back to lua_ref & lua_unref and fighting the same issues that LuaD has.
Is there any kind of opAccess operator overloading, that is, overloading what happens when an object is used as an rvalue, so that the expression "a = b" becomes "a.opAssign(b.opAccess)"? opCast does not work, as it is called only for explicit casts.
Any other suggestions? I have a feeling that I am approaching the solution from the wrong direction. I feel that the problem resides in the realm of metaprogramming: I am trying to "scope" things at the expression level, which I feel is not well suited to classes and objects.
So far, I have tried to preserve the LuaD look'n'feel at interface user's side, but I think that if I could change the interface to something like following, I could get it working:
lua.call(["math", "abs"], 1); // call lua.math.abs(1)
lua.set(["table", "x", "y", "z"], 2); // lua table.x.y.z = 2
...
Syntactically that would ensure that reference to lua object fetched by indexing is finally used for something in the expression, and the stack would be cleaned.
UPDATE: As mentioned, changing Lua.Ref to a struct solved the problems related to dereferencing, and I am again using a reference mechanism similar to LuaD's. I personally feel that this mechanism suits the LuaD-style syntax I am using, too, and it can be quite a challenge to make the syntax work correctly with other mechanisms. I am still open to hearing ideas on how to make it work.
The system I sketched to replace references (to tackle the problem with objects holding references living longer than lua sandbox) would probably need different kind of interface, something similar I sketched above.
You also have an issue when people do
auto math_abs = lua["math"]["abs"];
math_abs.call(1);
math_abs.call(3);
This will double pop.
Make Top a struct that holds the stack index of what they are referencing. That way you can use its known scoping and destruction behavior to your advantage. Make sure you handle this(this) correctly as well.
Only pop in the destructor when the value is the actual top value. You can use a bitset in LuaInterface to track which stack positions are in use and put the values in it using lua_replace if you are worried about excessive stack use.

cannot traverse the nodes of an AST, while assigning each node an ID

This is more a simple personal attempt to understand what goes on inside Rascal. There must be better (if not already supported) solution.
Here's the code:
fileLoad = |home:///PHPAnalysis/systems/ApilTestScripts/simple1.php|;
fileAST=loadPHPFile(fileLoad,true,false);
//assign a simple id to each node
public map[value,int] assignID12(node N)
{
myID=();
visit(N)
{
case node M:
{
name=getName(M);
myID[name] =999;
}
}
return myID;
}
ids=assignID12(fileAST);
gives me
|stdin:///|(92,4,<1,92>,<1,96>): Expected str, but got value
loadPHPFile returns a node of type: list[Stmt], where each Stmt is one of the many types of statements that could occur in a program (PHP, in my case). Without going into why I'd do this, why doesn't the above code work? Especially frustrating because a very simple example is worked out in the online documentation. See: http://tutor.rascal-mpl.org/Recipes/Basic/Basic.html#/Recipes/Common/CountConstructors/CountConstructors.html
I started a new console, and it seems to work. Of course, I changed the return type from map[value,int] to map[str,int] as it was originally in the example.
The problem I was having was that I may have erroneously defined the function previously. While I quickly fixed an apparent problem, it kept giving me errors. I realized that in Rascal, once you've started a console and imported certain definitions, it seems to be impossible to overwrite those definitions. The interpreter keeps referring to the very first definition that you provided. This could just be the interpreter performing a type-check, and preventing unintentional and/or incompatible assignments further down the road. That makes sense for variables (in the typical program sense), but it doesn't seem like the best idea to enforce that on functions (or methods). I feel it becomes cumbersome, because a user typically has to go through some iterations before being satisfied with a function definition. Just my opinion though...
Most likely you already had the name ids in scope as having type map[str,int], which would be the direct source of the error. You can look in script https://github.com/cwi-swat/php-analysis/blob/master/src/lang/php/analysis/cfg/LabelState.rsc at the function labelScript to see how this is done in PHP AiR (so you don't need to write this code yourself). What this will give you is a script where all the expressions and statements have an assigned ID, as well as the label state, which just keeps track of some info used in this labeling operation (mainly the counter to generate a unique ID).
As for the earlier response, the best thing to do is to give your definitions in modules which you can import. If you do that, any changes to types, etc will be picked up (automatically if the module is already imported, since Rascal will reimport the module for you if it has changed, or when you next import the module). However, if you define something directly in the console, this won't happen. Think of the console as one large module that you keep adding to. Since we can have overloads of functions, if you define the function again you are really defining a new alternative to the function, but this may not work like you expect.

Retrieve object executed via ExecutorService.submit

I have an ExecutorService that runs several solvers in parallel. Each solver modifies several internal variables whose values must be returned.
Because of compatibility issues, it is not possible to encapsulate all the variables in a class to be returned via a callable object. Therefore, making the solvers either callable or runnable does not make any difference in my case, as I cannot retrieve all the variables I need.
I considered following two options:
Each solver accesses a synchronized class and writes its values there.
Access the objects (solvers) that have been submitted by the executor in order to get their variables via get methods.
I prefer the second option, but I don't find the way to gain access to the objects submitted.
Any suggestion (for any of the options)?
You didn't elaborate on the "compatibility issues", so I can only suggest a general solution for what you described.
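Since you use ExecutorService, I assume that you use ThreadPoolExecutor (or a subclass of it) as the implementation of that interface. If that's the case, I suggest overriding the ThreadPoolExecutor.afterExecute(Runnable r, Throwable t) method. It's called after each submitted Runnable has completed its execution. Its default implementation is empty. Note that r is your original object only for tasks passed to execute(); submit() wraps the task in a FutureTask, so r is then the wrapper and t is null even if the task threw.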
Your implementation should follow these steps:
Check if t != null. If so, process Throwable t which caused a solver to abort.
Check the type of r and if you recognize it, retrieve its results. Of course, it will be simpler if all your solvers have a common API.
Store results somewhere.
But look out - ThreadPoolExecutor.afterExecute() is called from the thread that ran the Runnable r, so the 3rd step will most likely need to be synchronized.
Putting it all together, your code can look like this:
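@Override
protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    if (t != null) {
        // handle t: the solver aborted with this exception
    } else if (r instanceof Solver) {
        // r is the original task only when it was passed to execute()
        Solver solver = (Solver) r;
        Results results = solver.getResults();
        // afterExecute runs on the worker thread, so guard the shared store
        synchronized (allSolutions) {
            allSolutions.addResults(results);
        }
    }
}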
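A self-contained sketch of the same approach follows, with assumptions stated up front: the Solver here is a made-up Runnable that squares its input, results are collected into a plain list instead of the Results/allSolutions types above, and tasks are handed over with execute() rather than submit() so that afterExecute receives the solver object itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CollectingExecutorDemo {
    /** Hypothetical solver: a Runnable that exposes its result through a getter. */
    static class Solver implements Runnable {
        private final int input;
        private volatile int result;
        Solver(int input) { this.input = input; }
        @Override public void run() { result = input * input; }
        int getResult() { return result; }
    }

    /** Thread pool that stores each solver's result as soon as the solver finishes. */
    static class CollectingExecutor extends ThreadPoolExecutor {
        private final List<Integer> allSolutions = new ArrayList<>();

        CollectingExecutor(int threads) {
            super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        }

        @Override
        protected void afterExecute(Runnable r, Throwable t) {
            super.afterExecute(r, t);
            if (t != null) {
                System.err.println("solver failed: " + t);
            } else if (r instanceof Solver) {
                // Runs on the worker thread, so access to the list is synchronized.
                synchronized (allSolutions) {
                    allSolutions.add(((Solver) r).getResult());
                }
            }
        }

        List<Integer> solutions() {
            synchronized (allSolutions) {
                return new ArrayList<>(allSolutions);
            }
        }
    }

    /** Runs one solver per input and returns the collected results in sorted order. */
    static List<Integer> solveAll(int... inputs) throws InterruptedException {
        CollectingExecutor pool = new CollectingExecutor(4);
        for (int input : inputs) {
            pool.execute(new Solver(input)); // execute(), not submit(): see the note above
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        List<Integer> results = pool.solutions();
        Collections.sort(results); // completion order is not deterministic
        return results;
    }
}
```

If submit() has to be used, the second option from the question is simpler: keep your own references to the solver objects, wait on the returned Futures, and then call the getters on the solvers directly.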
