Are Reactor operators like flatMap() that run on the same main thread any more efficient than regular blocking code? - project-reactor

I understand that a lot of Reactor operators, like flatMap(), run on the same thread on which the onNext method was invoked. What I am trying to understand is whether such a non-blocking method is any more efficient/performant than a regular blocking call in, say, a for loop. Sorry if it's a noob question, but I can't seem to grasp it. Is the power of reactive realized only when we use Schedulers that jump threads (e.g. Schedulers.parallel()) and so on?
e.g. if I had a function like the following
List<Integer> integers = Arrays.asList(5, 6, 7, 8, 9, 10);
Flux.fromIterable(integers)
    .flatMap(this::makeAReactiveMethodCall)
    .subscribe(r -> log.info(Thread.currentThread().getName()));
Logs look something like this - notice all the threads are the same "main" one.
01:25:40.641 INFO ReactiveTest - main
01:25:40.642 INFO ReactiveTest - main
01:25:40.642 INFO ReactiveTest - main
01:25:40.642 INFO ReactiveTest - main
01:25:40.642 INFO ReactiveTest - main
01:25:40.642 INFO ReactiveTest - main
All invocations happen on the same main thread. How is this code more efficient than making all the calls in a for loop with blocking semantics? Say, the following code:
integers.forEach(i -> this.makeAReactiveMethodCall(i).block());

Assuming each makeAReactiveMethodCall does some I/O and takes 1 second to complete, with the flatMap operator your calls will be made asynchronously. This means that the main thread will make all 6 calls without waiting for the I/O operations to complete (non-blocking); instead, it will process other work and will be notified when a call completes. In the case of WebClient and Project Reactor, this is achieved by using the Netty event loop to queue/dispatch/process events.
In the traditional, blocking way (e.g. RestTemplate), it would take 6 seconds to make all 6 calls synchronously. Of course, you could use the ExecutorService API to make them asynchronously, but in that case you would need 6 threads because the calls would be blocking. One of the advantages of the reactive model is that the number of threads is limited, so resources are not wasted on creating many threads.
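To make the timing difference concrete, here is a minimal, self-contained sketch (my own illustration, using a hypothetical makeAReactiveMethodCall that simulates 1 second of non-blocking I/O via Mono.delayElement): the flatMap version finishes in roughly 1 second, the blocking loop in roughly 6.

import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class TimingDemo {
    // Hypothetical stand-in for reactive I/O: completes after ~1 second
    // without occupying a thread while "waiting".
    static Mono<Integer> makeAReactiveMethodCall(int i) {
        return Mono.just(i).delayElement(Duration.ofSeconds(1));
    }

    public static void main(String[] args) {
        List<Integer> integers = Arrays.asList(5, 6, 7, 8, 9, 10);

        long start = System.nanoTime();
        // flatMap subscribes to all 6 Monos eagerly, so the delays overlap.
        Flux.fromIterable(integers)
            .flatMap(TimingDemo::makeAReactiveMethodCall)
            .blockLast(); // blocking here only to measure the demo
        System.out.printf("reactive: ~%d ms%n", (System.nanoTime() - start) / 1_000_000);

        start = System.nanoTime();
        // Each block() waits a full second before the next call starts.
        integers.forEach(i -> makeAReactiveMethodCall(i).block());
        System.out.printf("blocking: ~%d ms%n", (System.nanoTime() - start) / 1_000_000);
    }
}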

It's not, if makeAReactiveMethodCall() is doing CPU-bound work, or if it is not really reactive at all but a blocking call in disguise.
It's more efficient the moment makeAReactiveMethodCall introduces some latency, e.g. by performing I/O in a reactive manner.
There is also a tradeoff in composition and using a unified abstraction for your various processing steps that you might want to consider.
But if you're after pure throughput of CPU-bound code, then by all means use a good old loop (see the sketch below).
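To illustrate that last point with a sketch of my own (illustrative only): for pure CPU-bound work on one thread, the reactive chain adds operator plumbing and cannot beat the plain loop.

import reactor.core.publisher.Flux;

public class CpuBoundDemo {
    public static void main(String[] args) {
        // Plain loop: no operator overhead, best raw throughput for CPU-bound work.
        long start = System.nanoTime();
        long sum = 0;
        for (long i = 0; i < 1_000_000L; i++) {
            sum += i * i;
        }
        System.out.printf("loop:    sum=%d in %d ms%n", sum, (System.nanoTime() - start) / 1_000_000);

        // Same computation as a reactive chain: still single-threaded, but every
        // element passes through the operators, so it is at best as fast as the loop.
        start = System.nanoTime();
        long reactiveSum = Flux.range(0, 1_000_000)
            .reduce(0L, (acc, i) -> acc + (long) i * i)
            .block();
        System.out.printf("reactor: sum=%d in %d ms%n", reactiveSum, (System.nanoTime() - start) / 1_000_000);
    }
}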

Related

Does the thread ever change once inside a reactor execution?

It's been hammered into my head that I shouldn't use ThreadLocal with Reactor. But I want to know if I can use ThreadLocal within a single execution of a reactor function.
Specifically, when inside a Spring Webflux Controller method, can the thread ever change if I don't invoke a reactor function?
Please let me know if this is correct
@GetMapping
public Mono<String> someControllerMethod() {
    // Thread 1 executing
    ThreadLocal<String> USER_ID = new ThreadLocal<>();
    USER_ID.set("1");
    Thread.sleep(...);
    someMethod();
    // Thread 1 executing
    assertEquals(USER_ID.get(), "1"); // this will ALWAYS be true
    return Mono.just("hello ")
        // this is the only time a new thread executes and USER_ID is not set
        .flatMap(s -> s + USER_ID.get());
}

void someMethod() {
    // Thread 1 executing
    assertEquals(USER_ID.get(), "1"); // this will ALWAYS be true
}
Is my understanding above correct?
Revised this section for clarity
In a Reactor chain of many operators, each operator (e.g. map) could run on a different thread, and even different "instances" (e.g. the map of url N) of the same operator could run on different threads. But once we're in an instance of an operator, will it always be the same thread (i.e. is it safe to declare a ThreadLocal in an instance of a Reactor operator)?
// main thread
Flux.fromIterable(urls)
    .map(url -> {
        // each of these instances runs on a different thread
        // but is declaring ThreadLocal here safe to do?
        ThreadLocal<String> URL = new ThreadLocal<>();
        URL.set(url);
        // Will URL always be set deep in the call stack?
        someOtherMethod();
        // Will URL always be set at the end?
        return URL.get();
    })
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe();

void someOtherMethod() {
    URL.get(); // will this ALWAYS be set?
}
Basically, I'd like to know whether it's safe to use ThreadLocal objects like io.grpc.Context within a single instance of a Reactor operator execution.
It's been hammered into my head that I shouldn't use ThreadLocal with Reactor.
You mustn't use ThreadLocal in a reactive chain with Reactor (and a reactive chain is the only sensible way to use that library). In a reactive chain, the thread might change whenever you invoke an asynchronous operator, so a single reactive chain could have operations executing on many different threads throughout. In this case your ThreadLocal might work sometimes, but it's unreliable: introduce an async operator that switches the thread (say, a web request that's executed on the Netty worker pool), and you've introduced a subtle and weird bug that's hard to track down (you're arbitrarily leaking information from one reactive chain to another unintentionally). In short, it's incredibly bad practice to tie your reactive chains to a single thread; while it might seem to work initially, you're going to run into a lot of problems eventually if you do.
That being said, you don't really have a reactive chain in the above method - it's incredibly weird. If you're returning a Mono<String> to try to make the method reactive, then you need to be executing everything as part of a reactive chain. What you're actually doing is:
Using synchronous & blocking logic, a complete no-no as it ties up an event loop thread which isn't allowed;
Calling another method that's not part of a reactive chain;
Using a JUnit test method in a controller class;
Wrapping up a value to return in Mono.just();
Making one flatMap call at the end (which won't work as it's not even mapping to a publisher to flatten, you'd have to use map instead.)
...so while using your ThreadLocal is technically "safe" in this context, from a wider perspective the implementation makes no sense at all. You realistically have two options - either make the entire method non-blocking and reactive properly, not just wrapping blocking logic in a reactive publisher, or make the whole controller just return a standard object and forget the reactive element entirely.
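For illustration only, a minimal sketch (my own, not your actual logic; fetchUserId is a hypothetical non-blocking call) of what a properly reactive version of such a method could look like, with all work happening inside the chain and no ThreadLocal:

@GetMapping
public Mono<String> someControllerMethod() {
    // Everything is deferred into the chain; nothing blocks the event loop.
    return fetchUserId()                   // hypothetical non-blocking lookup
        .map(userId -> "hello " + userId); // map, not flatMap: plain value transform
}

// Assumed helper, e.g. backed by WebClient or R2DBC in a real application.
private Mono<String> fetchUserId() {
    return Mono.just("1"); // stand-in for real non-blocking I/O
}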
Follow-up:
once we're in an instance of an operator, will it always be the same thread (i.e. is it safe to declare a ThreadLocal in an instance of a Reactor operator)?
No, there are at least two cases I can think of where that wouldn't be safe:
Operators can be nested. Once you're "inside" a certain operator, there's no reason why other operators can't be used that would also switch thread.
Code in other threads can be explicitly started even if there's no operator.
I don't think you can wind up in cases where the thread changes under you other than those two, but I could well be missing something, and it's still a rather delicate scenario (someone could break it quite easily). If you must use a ThreadLocal for some reason, then I'd still seriously consider whether you should be using reactor in this context.
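To make the first case concrete, a small sketch of my own: a nested asynchronous operator inside flatMap resumes on a different thread, so anything stashed in a ThreadLocal "at the start" of the operator would no longer be visible afterwards.

import java.time.Duration;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class NestedOperatorDemo {
    public static void main(String[] args) {
        Flux.just("a", "b")
            .flatMap(v -> {
                // Captured "inside" the operator, before the nested async step.
                String enteredOn = Thread.currentThread().getName();
                return Mono.just(v)
                    .delayElement(Duration.ofMillis(10)) // nested operator switches threads
                    .doOnNext(x -> System.out.printf(
                        "value %s: entered on %s, continued on %s%n",
                        x, enteredOn, Thread.currentThread().getName()));
            })
            .blockLast();
    }
}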

How to handle blocking calls when using reactor in a JAX-RS-powered server?

To process HTTP requests, we have to make blocking calls (e.g. JDBC calls) as part of a Mono/Flux-based process. Our current plan looks something like this:
// I renamed getSomething to processJaxrsHttpRequest
CompletionStage<String> processJaxrsHttpRequest(String input) {
    return Mono.just(input)
        .map(in -> process(in))
        .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
        .flatMap(str -> asyncHttpCall(str))
        .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
        .toFuture();
}
where fixedScheduler is used concurrently across HTTP requests.
We were hoping to get some feedback on this strategy for handling blocking calls within a decent number of fluxes. Of course, we understand that if all our requests were flowing through these blocking calls then we might as well not use reactor (outside of the admittedly nice processing API).
Update: Thanks bsideup for this answer. However, I should have been a little more specific with my questions.
My overall question is how to effectively have a blocking call used across multiple fluxes, where these fluxes can be created/subscribed to in large numbers. We tried the suggested approach, but it resulted in an explosion of threads and quickly OOMed. So we are thinking of using a shared scheduler. So, here are my questions.
Is using a shared scheduler (fixedScheduler) what you would suggest in the situation I describe? If not, could you point me in another direction?
If using a shared scheduler is good, would this be a good implementation of it: Schedulers.newParallel("blocking-scheduler", maxNumThreads)?
Update 2: Just dug a bit into Schedulers#newParallel and realized that won't work, since it 'rejects' blocking calls.
Really appreciate any tips!
While subscribeOn is indeed one way of handling blocking calls, and your usage is okay, you can also use publishOn.
It moves processing to the provided Scheduler until another publishOn is specified:
CompletionStage<String> getSomething(String input) {
    return Mono.just(input)
        .map(in -> process(in)) // process must be non-blocking, or go after publishOn
        .publishOn(Schedulers.boundedElastic())
        .map(this::jdbcCall)
        .flatMap(str -> asyncHttpCall(str))
        .publishOn(Schedulers.boundedElastic())
        .map(this::jdbcCall)
        .toFuture();
}
As you can see, you can continue using async calls too. Just make sure you're not blocking non-blocking schedulers (in this example, I use publishOn again after flatMap because asyncHttpCall may complete on a non-blocking scheduler).
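As for the shared-scheduler updates in the question: a sketch, assuming a Reactor version that provides Schedulers.newBoundedElastic. Unlike newParallel, boundedElastic-style schedulers are intended for blocking work, and the thread cap prevents the thread explosion described above.

import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

// A shared, capped scheduler for blocking JDBC work, reused across requests.
int maxNumThreads = 16; // illustrative value; tune to your JDBC pool size
Scheduler fixedScheduler =
    Schedulers.newBoundedElastic(maxNumThreads, 10_000, "blocking-scheduler");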

Does let!/do! always run the async object in a new thread?

From the wikibook on F# there is a small section where it says:
What does let! do?
let! runs an async<'a> object on its own thread, then it immediately releases the current thread back to the threadpool. When let! returns, execution of the workflow will continue on the new thread, which may or may not be the same thread that the workflow started out on.
I have not found this claim stated anywhere else, in books or on the web.
Is this true for all let!/do! regardless of what the async object contains (e.g. Thread.Sleep()) and how it is started (e.g. Async.Start)?
Looking in the F# source code on github, I wasn't able to find the place where a call to bind executes on a new (TP) thread. Where in the code is the magic happening?
Which part of that statement do you find surprising? That parts of a single async can execute on different threadpool threads, or that a threadpool thread is necessarily being released and obtained on each bind?
If it's the latter, then I agree - it sounds wrong. Looking at the code, there are only a few places where a new work item is being queued on the threadpool (namely, the few Async module functions that use queueAsync internally), and Async.SwitchToNewThread spawns a non-threadpool thread and runs the continuation there. A bind alone doesn't seem to be enough to switch threads.
The spirit of the statement however seems to be about the former - no guarantees are made that parts of an async block will run on the same thread. The exact thread that you run on should be treated as an implementation detail, and when you yield control and await some result, you can be pretty sure that you'll land on a different thread at least some of the time.
No. An async operation might execute synchronously on the current thread, or it might wind up completing on a different thread. It depends entirely on how the async API in question is implemented.
See Do the new C# 5.0 'async' and 'await' keywords use multiple cores? for a decent explanation. The implementation details of F# and C# async are different, but the overall principles are the same.
The builder that implements the F# async computation expression is here.

Is it better for an API to dispatch itself to a queue and invoke a callback, or for the API caller to do the dispatching?

Examples:
Asynchronous method with its own dispatching:
// Library
func asyncAPI(callback: Result -> Void) {
    dispatch_async(self.queue) {
        ...
        callback(result)
    }
}

// Caller
asyncAPI() { result in
    ...
}
Synchronous method with exposed dispatch queue:
// Library
func syncAPI() -> Result {
    assert(isRunningOnCorrectQueue())
    ...
    return result
}

// Caller
dispatch_async(api.queue) {
    let result = api.syncAPI()
    ...
}
These two examples behave the same, but I am looking to learn whether one of them ends up complicating a larger codebase more than the other, especially when there is a lot of asynchrony.
I would argue against both of the patterns you propose.
For the first pattern (where the API manages its own backgrounding) I see little or no benefit to doing it this way, as opposed to leaving it to the caller. If you want to use a private, serial queue to protect data (or any other sort of critical section) internal to your API, that's fine, but that queue should be private, and it should specifically not target any public, non-global-concurrent queue (note: it should especially not target the main queue). Ideally, the primary implementation of your API would also take a second parameter, so callers can specify on which queue to invoke the callback. (People can work around the lack of such a parameter by passing a callback block that re-dispatches to their desired queue, but I think that's clunkier than having an extra, optional parameter.) This puts the API consumer in complete control of the concurrency, while preserving your freedom to use queues internally to protect state.
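(As a side note, the same idea ports outside GCD. The following is my own Java sketch of the "extra queue parameter" pattern, not anything from the APIs discussed here: the library keeps its internal executor private and lets the caller pass the Executor on which the callback should run.)

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

class Library {
    // Private executor protecting internal state; never exposed to callers.
    private final Executor internal = Executors.newSingleThreadExecutor();

    // The callbackExecutor parameter puts the consumer in control of
    // where the callback runs.
    public void asyncApi(Consumer<String> callback, Executor callbackExecutor) {
        CompletableFuture.supplyAsync(this::doWork, internal)
            .thenAcceptAsync(callback, callbackExecutor);
    }

    private String doWork() {
        return "result";
    }
}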
As to the second approach, it's my opinion that we all should avoid creating new synchronous, blocking API. When you provide a synchronous, blocking API and don't provide a callback-based version, that means that you have denied consumers of your API any opportunity to avoid blocking. When you only provide synchronous, blocking API, then if someone wants to call your API in the background, at least one thread (in addition to any additional threads that your API consumes behind the scenes) will be consumed from the finite number of threads available to each process. (In the worst case this can lead to starvation conditions that are effectively deadlocks.)
Another red flag with this second example is that it vends a queue; any time an API vends a queue, something is amiss. As mentioned, if you want to use a private serial queue to protect state or other critical sections internal to your API, go for it, but don't expose that queue to the outside world. If nothing else, it unnecessarily exposes details of your implementation. Looking at the system framework headers, I couldn't find a single case where a dispatch_queue_t was vended where it wasn't immediately obvious that the intent was for the API consumer to push in the queue, and not read it out.
It's also worth mentioning that these patterns are problematic regardless of whether your workload is CPU-bound or IO-bound. If it's CPU-bound, then not managing your own dispatch gives consumers of the API explicit control over how this CPU work is executed. If your workload is IO-bound, then you should use the OS- and libdispatch-provided asynchronous IO mechanisms (dispatch_io, dispatch_sources, kevent, etc) to avoid consuming a thread (or more than one) for the duration of your work.
Another answer here implied that forcing consumers to manage their own concurrency leads to "boilerplate" code. If you feel that the burden of API consumers potentially having to wrap calls to your API with dispatch_async is too great, then feel free to provide a convenience overload that dispatches to the default global concurrent queue, but please always leave the version that allows API consumers the ability to explicitly manage their own concurrency.
If, on the other hand, all this is internal to the implementation, and not part of the public API, then do whatever is most expedient, knowing that you can refactor the implementation behind the public API any time in the future.
As you said, the two generally accomplish the same thing, but the first is preferable in most scenarios. There are several benefits to using the first method.
The API is simpler: you simply call the method and provide code for the callback block.
Less boilerplate code: no typing dispatch_async every time you want to call it, as it is included in the method itself.
Less room for bugs/errors: by wrapping the asynchronous logic inside the method itself, you ensure that it is called on the right queue internally, without the caller having to worry about any of that.
Touching on the last point, you also have finer control over the queue itself. Let's say you are trying to perform certain tasks on a particular queue. It is much simpler to wrap the code in a GCD call on that queue a single time rather than having to remember to reuse that same queue every time you want to call the method.

Difference between the WaitFor function for TMutex in Delphi and the equivalent in the Win32 API

The Delphi documentation says that the WaitFor function for TMutex and other synchronization objects waits until the object's handle is signaled. But does this function also guarantee ownership of the object for the caller?
Yes: when WaitFor returns, the calling thread owns the mutex; the TMutex class is just a wrapper for the OS mutex object. See for yourself by inspecting SyncObjs.pas.
The same is not true for other synchronization objects, such as TCriticalSection. Any thread may call the Release method on such an object, not just the thread that called Acquire.
TMutex.Acquire is a wrapper around THandleObjects.WaitFor, which will call WaitForSingleObject OR CoWaitForMultipleHandles depending on the UseCOMWait constructor argument.
This may be very important if you use STA COM objects in your application (you may do so without knowing; dbGO/ADO is COM, for instance) and you don't want to deadlock.
It's still a dangerous idea to enter a long/infinite wait in the main thread, because the only method which correctly handles calls made via TThread.Synchronize is TThread.WaitFor, and you may stall (or deadlock) your worker threads if you use the SyncObjs objects or WinAPI wait functions.
In commercial projects, I use a custom wait method, built upon the ideas from both THandleObjects.WaitFor AND TThread.WaitFor with optional alertable waiting (good for asynchronous IO but irreplaceable for the possibility to abort long waits).
Edit: further clarification regarding COM/OLE:
The COM/OLE model (e.g. ADO) can use different threading models: STA (single-threaded) and MTA (multi- or free-threaded).
By definition, the main GUI thread is initialized as STA, which means the COM objects can use window messages for their asynchronous messaging (particularly when invoked from other threads, to synchronize safely). AFAIK, they may also use APC procedures.
There is a good reason for the CoWaitForMultipleHandles function to exist - see its use in SyncObjs.pas THandleObject.WaitFor - depending on the threading model, it can process internal COM messages while blocking on the wait handle.
