How to handle blocking calls when using reactor in a JAX-RS-powered server? - project-reactor

To process HTTP requests, we have to make blocking calls (e.g. JDBC calls) as part of a Mono/Flux-based process. Our current plan looks something like this:
// I renamed getSomething to processJaxrsHttpRequest
CompletionStage<String> processJaxrsHttpRequest(String input) {
    return Mono.just(input)
        .map(in -> process(in))
        .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
        .flatMap(str -> asyncHttpCall(str))
        .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
        .toFuture();
}
where fixedScheduler is used concurrently across HTTP requests.
We were hoping to get some feedback on this strategy for handling blocking calls within a decent number of fluxes. Of course, we understand that if all our requests were flowing through these blocking calls then we might as well not use reactor (outside of the admittedly nice processing API).
Update: Thanks bsideup for this answer. However, I should have been a little more specific with my questions.
My overall question is how to effectively use a blocking call across multiple fluxes, where these fluxes can be created and subscribed to in large numbers. We tried the suggested approach, but it results in an explosion of threads and quickly OOMs, so we are thinking of using a shared scheduler. Here are my questions:
Is using a shared scheduler (fixedScheduler) what you would suggest in the situation I describe? If not, will you point me in any directions?
If using a shared scheduler is good, would this be a good implementation of it: Schedulers.newParallel("blocking-scheduler", maxNumThreads)?
Update 2: Just dug a bit into Schedulers#newParallel and realized that it won't work, since it 'rejects' blocking calls.
Really appreciate any tips!

While subscribeOn is indeed one way of handling blocking calls and your usage is okay, you can as well use publishOn.
It moves downstream processing to the provided Scheduler until another publishOn is specified:
CompletionStage<String> getSomething(String input) {
    return Mono.just(input)
        .map(in -> process(in)) // process must be non-blocking, or go after publishOn
        .publishOn(Schedulers.boundedElastic())
        .map(str -> jdbcCall(str))
        .flatMap(str -> asyncHttpCall(str))
        .publishOn(Schedulers.boundedElastic())
        .map(str -> jdbcCall(str))
        .toFuture();
}
As you can see, you can continue using async calls too. Just make sure you're not blocking non-blocking schedulers (in this example, I use publishOn again after flatMap because asyncHttpCall may complete from a non-blocking scheduler).
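The underlying idea of a shared, bounded pool for blocking calls can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not Reactor code: jdbcCall and asyncHttpCall are hypothetical stand-ins, the pool size of 4 is arbitrary, and CompletableFuture takes the place of Mono. In Reactor, the same cap would come from whichever Scheduler is passed to subscribeOn or publishOn.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BlockingCallSketch {
    // Assumed cap on the number of threads that blocking calls may occupy.
    static final int MAX_BLOCKING_THREADS = 4;

    // One fixed-size pool shared by all requests, so the thread count
    // cannot grow with the number of in-flight requests.
    static final ExecutorService BLOCKING_POOL =
            Executors.newFixedThreadPool(MAX_BLOCKING_THREADS, r -> {
                Thread t = new Thread(r, "blocking-pool");
                t.setDaemon(true); // do not keep the JVM alive for this sketch
                return t;
            });

    // Hypothetical stand-in for a blocking JDBC call.
    static String jdbcCall(String in) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return in + ":db";
    }

    // Hypothetical stand-in for a non-blocking HTTP call.
    static CompletionStage<String> asyncHttpCall(String in) {
        return CompletableFuture.completedFuture(in + ":http");
    }

    static CompletionStage<String> processRequest(String input) {
        return CompletableFuture
                // first blocking call runs on the bounded pool
                .supplyAsync(() -> jdbcCall(input), BLOCKING_POOL)
                .thenCompose(BlockingCallSketch::asyncHttpCall)
                // hop back onto the bounded pool before blocking again
                .thenApplyAsync(BlockingCallSketch::jdbcCall, BLOCKING_POOL);
    }

    public static void main(String[] args) {
        System.out.println(processRequest("req").toCompletableFuture().join());
    }
}
```

When all pool threads are busy, further blocking calls wait in the executor's queue instead of spawning new threads, which is the trade-off a shared scheduler makes.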

Related

Is it safe to just subscribe to Publishers in the assembly phase and leave it at that?

I've asked in Gitter already but looks like it's not too active these days..
I was curious, is it safe to use such constructs in production code:
private Mono<Void> someHandler() {
someService.registerPlayer(internalPlayer)
.subscribe();
return Mono.empty();
}
and then just use this Publisher from WebFlux controller method, for example?
Do we have anything in the official documentation about it? All the examples I could find in the Reference seem to be examples of calling reactive code from blocking code and observing behavior.
I guess the Disposable returned here will be GC-ed quite soon, right?
And what would be the right way to achieve a similar result (emitting on the outer Publisher without having to wait until the inner Publisher completes, but making sure it completes nevertheless)?
If I understand you correctly, then it is basically a fire-and-forget scenario, but nothing prevents you from using additional operators on that chain to process the results.
If by "to be confident that the subscription will complete" you mean "do something if it is completed successfully" or "do something if it is completed with error" then:
If you want to start some processing in a separate thread and return immediately, and also be confident in completion of that separate processing, then you can use subscribeOn() and add operators to that chain to handle the success or failure of that processing, just as you do when you build any reactive chain.
To handle the results you can use, for example, map(), doOnNext(), flatMap(), switchIfEmpty(), etc., whatever fits your requirements.
To handle errors, you can use, for example, onErrorResume(), onErrorContinue(), onErrorMap(), onErrorReturn(), doOnError(), etc., whatever fits your requirements.
Here is an example of how it would look:
private Mono<Void> someHandler() {
    someService.registerPlayer(internalPlayer)
        // handle results
        .doOnNext(...)
        // handle errors
        .onErrorResume(...)
        // the whole subscription process will happen on the thread provided by the Scheduler you specified
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
    // return immediately
    return Mono.empty();
}
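For readers who want to try the fire-and-forget shape without a Reactor dependency, here is a rough JDK-only analogue. It is a sketch under assumptions: registerPlayer is a hypothetical stand-in for someService.registerPlayer, the onOutcome callback is invented for illustration (a real handler would more likely log or record a metric), and CompletableFuture.handle plays the role of doOnNext plus onErrorResume.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

final class FireAndForgetSketch {
    // Background pool standing in for the Scheduler given to subscribeOn.
    private static final ExecutorService BACKGROUND =
            Executors.newFixedThreadPool(2, r -> {
                Thread t = new Thread(r, "background");
                t.setDaemon(true);
                return t;
            });

    // Hypothetical stand-in for someService.registerPlayer(...).
    static String registerPlayer(String player) {
        if (player.isEmpty()) {
            throw new IllegalArgumentException("empty player");
        }
        return "registered " + player;
    }

    // Starts the work in the background and returns immediately.
    // Both the success and the failure path are handled inside the chain,
    // so nothing is silently dropped.
    static void someHandler(String player, Consumer<String> onOutcome) {
        CompletableFuture
                .supplyAsync(() -> registerPlayer(player), BACKGROUND)
                .handle((result, error) -> {
                    if (error == null) {
                        onOutcome.accept("ok: " + result);
                    } else {
                        // supplyAsync wraps failures in CompletionException
                        Throwable cause =
                                error instanceof CompletionException && error.getCause() != null
                                        ? error.getCause()
                                        : error;
                        onOutcome.accept("failed: " + cause.getMessage());
                    }
                    return null;
                });
    }
}
```

The caller never waits on the background work; the outcome is only observable through whatever the handling stage does with it.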

How to call Sinks.Many<T>.tryEmitNext from multiple threads?

I am wrapping my head around Flux Sinks and cannot understand the higher-level picture. When using Sinks.Many<T> tryEmitNext, the function tells me if there was contention and what should I do in case of failure, (FailFast/Handler).
But is there a simple construct which allows me to safely emit elements from multiple threads? For example, instead of letting the user know that there was contention and that I should try again, it could add elements to a queue (MPMC, MPSC, etc.) and only notify when the queue is full.
Now I can add a queue myself to alleviate the problem, but it seems a common use case. I guess I am missing a point here.
I hit the same issue, migrating from Processors which support safe emission from multiple threads. I use this custom EmitFailureHandler to do a busy loop as suggested by the EmitFailureHandler docs.
public static EmitFailureHandler retryOnNonSerializedElse(EmitFailureHandler fallback) {
    return (signalType, emitResult) -> {
        if (emitResult == EmitResult.FAIL_NON_SERIALIZED) {
            // back off briefly, then ask the sink to retry the emission
            LockSupport.parkNanos(10);
            return true;
        } else {
            return fallback.onEmitFailure(signalType, emitResult);
        }
    };
}
There are various confusing aspects about the 3.4.0 implementation:
There is an implication that unless the Unsafe variant is used, the sink supports serialized emission, but actually all the serialized version does is fail fast in case of concurrent emission.
The sink provided by Flux.create does support thread-safe emission.
I hope there will be a solidly engineered alternative to this offered by the library at some point.
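The queue-based alternative mentioned in the question can be sketched independently of Reactor with the common queue-drain pattern: producers enqueue, and whichever thread wins a work-in-progress counter drains the queue. The class name SerializingEmitter is hypothetical, the queue here is unbounded (an assumption; the question asks for a bounded one that reports when full), and a plain Consumer stands in for the downstream sink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class SerializingEmitter<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    // Counts pending drain requests; non-zero means some thread is draining.
    private final AtomicInteger wip = new AtomicInteger();
    // Not thread-safe itself; only ever invoked by one thread at a time.
    private final Consumer<T> downstream;

    SerializingEmitter(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    void emit(T value) {
        queue.offer(value);
        if (wip.getAndIncrement() != 0) {
            return; // another thread is draining and will deliver this value
        }
        int missed = 1;
        do {
            T next;
            while ((next = queue.poll()) != null) {
                downstream.accept(next);
            }
            // subtract the requests handled so far; loop if more arrived meanwhile
            missed = wip.addAndGet(-missed);
        } while (missed != 0);
    }

    // Demo: several threads emit into a plain ArrayList, which would be
    // unsafe without the serialization above. Returns the delivered count.
    static int demo(int threads, int perThread) {
        List<Integer> received = new ArrayList<>();
        SerializingEmitter<Integer> emitter =
                new SerializingEmitter<Integer>(value -> received.add(value));
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    emitter.emit(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return received.size();
    }
}
```

Unlike the busy-loop handler, no producer spins here: a losing thread hands its element to the current drainer and returns.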

Can many concurrent calls to the same overused function cause blocking in Lua?

Let's say you have a complex Lua application, and there is some base function that different parts of your code call repeatedly. It's a stateless function with little to no side effects, and fairly simple.
How does the virtual machine handle this? Does it queue up all the calls and let them run one by one, waiting for the function to return before calling it again? Or does it do some trickery to avoid this sort of situation? What if the function had some big side effects, like print()?
Lua is single threaded so every function call must return before the next one is called. If a function is blocked then so is the VM. The only way around that is coroutines or Lua lanes or C threads.

Is it better for an API to dispatch itself to a queue and invoke a callback, or for the API caller to do the dispatching?

Examples:
Asynchronous method with its own dispatching:
// Library
func asyncAPI(callback: Result -> Void) {
dispatch_async(self.queue) {
...
callback(result)
}
}
// Caller
asyncAPI() { result in
...
}
Synchronous method with exposed dispatch queue:
// Library
func syncAPI() -> Result {
assert(isRunningOnCorrectQueue())
...
return result
}
// Caller
dispatch_async(api.queue) {
let result = api.syncAPI()
...
}
These two examples behave the same, but I am looking to learn whether one of these ends up complicating a larger codebase more than the other, especially when there is a lot of asynchrony.
I would argue against both of the patterns you propose.
For the first pattern (where the API manages its own backgrounding) I see little or no benefit to doing it this way, as opposed to leaving it to the caller. If you want to use a private, serial queue to protect data (or any other sort of critical section) internal to your API, that's fine, but that queue should be private, and it should specifically not target any public, non-global-concurrent queue (Note: it should especially not target the main queue). Ideally, the primary implementation of your API would also take a second parameter, so callers can specify on which queue to invoke the callback. (People can work around the lack of such a parameter by passing a callback block that re-dispatches to their desired queue, but I think that's clunkier than having an extra, optional parameter.) This puts the API consumer in complete control of the concurrency, while preserving your freedom to use queues internally to protect state.
As to the second approach, it's my opinion that we all should avoid creating new synchronous, blocking API. When you provide a synchronous, blocking API and don't provide a callback-based version, that means that you have denied consumers of your API any opportunity to avoid blocking. When you only provide synchronous, blocking API, then if someone wants to call your API in the background, at least one thread (in addition to any additional threads that your API consumes behind the scenes) will be consumed from the finite number of threads available to each process. (In the worst case this can lead to starvation conditions that are effectively deadlocks.)
Another red flag with this second example is that it vends a queue. Any time an API vends a queue, something is amiss. As mentioned, if you want to use a private serial queue to protect state or other critical sections internal to your API, go for it, but don't expose that queue to the outside world. If nothing else, it unnecessarily exposes details of your implementation. In looking at the system framework headers, I couldn't find a single case where a dispatch_queue_t was vended where it wasn't immediately obvious that the intent was for the API consumer to push in the queue, and not read it out.
It's also worth mentioning that these patterns are problematic regardless of whether your workload is CPU-bound or IO-bound. If it's CPU-bound, then not managing your own dispatch gives consumers of the API explicit control over how this CPU work is executed. If your workload is IO-bound, then you should use the OS- and libdispatch-provided asynchronous IO mechanisms (dispatch_io, dispatch_sources, kevent, etc) to avoid consuming a thread (or more than one) for the duration of your work.
Another answer here implied that forcing consumers to manage their own concurrency leads to "boilerplate" code. If you feel that the burden of API consumers potentially having to wrap calls to your API with dispatch_async is too great, then feel free to provide a convenience overload that dispatches to the default global concurrent queue, but please always leave the version that allows API consumers the ability to explicitly manage their own concurrency.
If, on the other hand, all this is internal to the implementation, and not part of the public API, then do whatever is most expedient, knowing that you can refactor the implementation behind the public API any time in the future.
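The shape recommended in this answer (a private serial queue for internal state, plus a caller-supplied queue for the callback) can be sketched in Java, with executors standing in for dispatch queues. This is an analogue under assumptions, not GCD code: CounterApi and its increment method are hypothetical, and ForkJoinPool.commonPool() plays the role of the default global concurrent queue in the convenience overload.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;

final class CounterApi {
    // Private serial executor protecting internal state; never exposed.
    private final ExecutorService stateQueue =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "counter-state");
                t.setDaemon(true);
                return t;
            });

    private int count; // only read or written on stateQueue

    // Primary form: the caller decides where the callback runs.
    void increment(Executor callbackExecutor, Consumer<Integer> callback) {
        stateQueue.execute(() -> {
            int result = ++count;
            // hand the result over to the caller's executor
            callbackExecutor.execute(() -> callback.accept(result));
        });
    }

    // Convenience overload for callers who do not care where the callback runs.
    void increment(Consumer<Integer> callback) {
        increment(ForkJoinPool.commonPool(), callback);
    }
}
```

The caller keeps control over concurrency through the first overload, while the internal executor stays an implementation detail that can be replaced later.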
As you said, the two generally accomplish the same thing, but the first is preferable in most scenarios. There are several benefits to using the first method.
The API is simpler. You simply call the method and provide code for the callback block.
Less boilerplate code. There is no need to type dispatch_async every time you want to call it, as it is included in the method itself.
Less room for bugs/errors. By wrapping the asynchronous logic inside the method itself, you ensure that it is called on the right queue internally without the caller having to worry about any of that.
Touching on the last point, you also have finer control over the queue itself. Let's say you are trying to perform certain tasks on a particular queue. It is way simpler to simply wrap the code in a GCD call on that queue a single time rather than having to remember to reuse that same queue every time you want to call the method.

Erlang/OTP: Synchronous vs. Asynchronous messaging

One of the things that attracted me to Erlang in the first place is the Actor model; the idea that different processes run concurrently and interact via asynchronous messaging.
I'm just starting to get my teeth into OTP and in particular looking at gen_server. All the examples I've seen - and granted they are tutorial type examples - use handle_call() rather than handle_cast() to implement module behaviour.
I find that a little confusing. As far as I can tell, handle_call is a synchronous operation: the caller is blocked until the callee completes and returns. Which seems to run counter to the async message passing philosophy.
I'm about to start a new OTP application. This seems like a fundamental architectural decision so I want to be sure I understand before embarking.
My questions are:
In real practice do people tend to use handle_call rather than handle_cast?
If so, what's the scalability impact when multiple clients can call the same process/module?
Depends on your situation.
If you want to get a result, handle_call is really common. If you're not interested in the result of the call, use handle_cast. When handle_call is used, the caller will block, yes. This is okay most of the time. Let's take a look at an example.
If you have a web server that returns the contents of files to clients, you'll be able to handle multiple clients. Each client has to wait for the contents of the files to be read, so using handle_call in such a scenario would be perfectly fine (stupid example aside).
When you really need the behavior of sending a request, doing some other processing and then getting the reply later, typically two calls are used (for example, one cast and then one call to get the result) or normal message passing. But this is a fairly rare case.
Using handle_call will block the process for the duration of the call. This will lead to clients queuing up to get their replies and thus the whole thing will run in sequence.
If you want parallel code, you have to write parallel code. The only way to do that is to run multiple processes.
So, to summarize:
Using handle_call will block the caller and occupy the process called for the duration of the call.
If you want parallel activities to go on, you have to parallelize. The only way to do that is by starting more processes, and suddenly call vs cast is not such a big issue any more (in fact, it's more comfortable with call).
Adam's answer is great, but I have one point to add
Using handle_call will block the process for the duration of the call.
This is always true for the client who made the handle_call call. This took me a while to wrap my head around but this doesn't necessarily mean the gen_server also has to block when answering the handle_call.
In my case, I encountered this when I created a database handling gen_server and deliberately wrote a query that executed SELECT pg_sleep(10), which is PostgreSQL-speak for "sleep for 10 seconds", and was my way of testing for very expensive queries. My challenge: I don't want the database gen_server to sit there waiting for the database to finish!
My solution was to use gen_server:reply/2:
This function can be used by a gen_server to explicitly send a reply to a client that called call/2,3 or multi_call/2,3,4, when the reply cannot be defined in the return value of Module:handle_call/3.
In code:
-module(database_server).
-behaviour(gen_server).
-define(DB_TIMEOUT, 30000).
<snip>
get_very_expensive_document(DocumentId) ->
gen_server:call(?MODULE, {get_very_expensive_document, DocumentId}, ?DB_TIMEOUT).
<snip>
handle_call({get_very_expensive_document, DocumentId}, From, State) ->
%% Spawn a new process to perform the query. Give it From,
%% which identifies the caller (a {Pid, Tag} tuple).
proc_lib:spawn_link(?MODULE, query_get_very_expensive_document, [From, DocumentId]),
%% This gen_server process couldn't care less about the query
%% any more! It's up to the spawned process now.
{noreply, State};
<snip>
query_get_very_expensive_document(From, DocumentId) ->
%% Reference: http://www.erlang.org/doc/man/proc_lib.html#init_ack-1
proc_lib:init_ack(ok),
Result = query(pgsql_pool, "SELECT pg_sleep(10);", []),
gen_server:reply(From, {return_query, ok, Result}).
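For readers more at home on the JVM, the same deferred-reply idea can be approximated in Java. This is a loose analogue under assumptions, not OTP semantics: a single-threaded executor stands in for the gen_server process, a worker pool for the spawned process, a CompletableFuture for From, and slowQuery is a hypothetical stand-in for the database query.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class DatabaseServerSketch {
    // Single thread standing in for the server process: one message at a time.
    private final ExecutorService mailbox = daemonPool(1, "server");
    // Workers standing in for the spawned query processes.
    private final ExecutorService workers = daemonPool(4, "worker");

    private static ExecutorService daemonPool(int size, String name) {
        return Executors.newFixedThreadPool(size, r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        });
    }

    // Hypothetical slow query.
    private static String slowQuery(String documentId) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "doc-" + documentId;
    }

    // Caller side: blocks until a reply arrives or the timeout expires.
    String getVeryExpensiveDocument(String documentId, long timeoutMillis) {
        CompletableFuture<String> reply = new CompletableFuture<>(); // role of From
        mailbox.execute(() ->
                // The server only hands the work off and is free again at once,
                // comparable to returning {noreply, State}.
                workers.execute(() ->
                        // comparable to gen_server:reply/2 from the worker
                        reply.complete(slowQuery(documentId))));
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("call failed or timed out", e);
        }
    }
}
```

The caller still blocks for the duration of the query, but the mailbox thread does not, so other requests can be accepted in the meantime.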
IMO, in a concurrent world handle_call is generally a bad idea. Say we have process A (a gen_server) receiving some event (the user pressed a button) and then casting a message to process B (a gen_server), requesting heavy processing of this button press. Process B can spawn a sub-process C, which in turn casts a message back to A when ready (or to B, which then casts a message to A). During the processing time, both A and B are ready to accept new requests. When A receives the cast message from C (or B), it can, for example, display the result to the user. Of course, it is possible that the second button press will be processed before the first, so A should probably accumulate results in the proper order. Blocking A and B through handle_call will make this system single-threaded (though it will solve the ordering problem).
In fact, spawning C is similar to handle_call; the difference is that C is highly specialized, processes just "one message", and exits after that. B is supposed to have other functionality (e.g. limiting the number of workers, controlling timeouts); otherwise C could be spawned from A.
Edit: C is asynchronous as well, so spawning C is not similar to handle_call (B is not blocked).
There are two ways to go with this. One is to change to using an event management approach. The one I am using is to use cast as shown...
submit(ResourceId,Query) ->
%%
%% non blocking query submission
%%
Ref = make_ref(),
From = {self(),Ref},
gen_server:cast(ResourceId,{submit,From,Query}),
{ok,Ref}.
And the cast/submit code is...
handle_cast({submit,{Pid,Ref},Query},State) ->
    Result = process_query(Query,State),
    gen_server:cast(Pid,{query_result,Ref,Result}),
    %% handle_cast must return {noreply, NewState}
    {noreply,State};
The reference is used to track the query asynchronously.
