Is there a way to see the progress during a function call? - upload

I am using Thrift to make calls to the server, and I am trying to send an image. Is there a function that allows me to see the progress of the upload in Thrift?

Short answer: No.
Long answer: To send large amounts of data, such as an image, you probably already defined a service function like this (or something arbitrarily more complex):
service FooBar {
    bool SendImageData(1: binary data)
}
You now transfer the image through that function call. The call does not return until it either succeeds or fails. The Thrift infrastructure does not provide you with a way to get any progress data out of the box. But we do have some options:
We could write a special Thrift transport that wraps the original one and provides us with some counters.
On the API level, we could change the call to send (especially large) files in chunks instead of in one large block. This way we have multiple calls and know about the progress we are making (see the sketch below).
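For illustration, the chunked approach might look roughly like this on the client side. This is a sketch in Java, and it assumes the service has been extended with a hypothetical SendImageChunk(1: i64 offset, 2: binary data) method, which is not part of the original definition:
// Sketch only: assumes a hypothetical SendImageChunk method was added to the FooBar
// service, so the generated Java client exposes sendImageChunk(long, ByteBuffer).
import java.nio.ByteBuffer;

void uploadWithProgress(FooBar.Client client, byte[] image, int chunkSize) throws Exception {
    for (int offset = 0; offset < image.length; offset += chunkSize) {
        int len = Math.min(chunkSize, image.length - offset);
        // One RPC per chunk; every completed call tells us how far the upload has progressed.
        client.sendImageChunk(offset, ByteBuffer.wrap(image, offset, len));
        int percent = (int) (100L * (offset + len) / image.length);
        System.out.println("Upload progress: " + percent + "%");
    }
}
Each returning call is a natural progress checkpoint, and a failed chunk can be retried without resending the whole image.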

Related

How to handle blocking calls when using Reactor in a JAX-RS-powered server?

To process HTTP requests, we have to make blocking calls (e.g. JDBC calls) as part of a Mono/Flux-based process. Our current plan looks something like this:
// I renamed getSomething to processJaxrsHttpRequest
CompletionStage<String> processJaxrsHttpRequest(String input) {
    return Mono.just(input)
               .map(in -> process(in))
               .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
               .flatMap(str -> asyncHttpCall(str))
               .flatMap(str -> Mono.fromCallable(() -> jdbcCall(str)).subscribeOn(fixedScheduler))
               .toFuture();
}
where fixedScheduler is used concurrently across HTTP requests.
We were hoping to get some feedback on this strategy for handling blocking calls within a decent number of fluxes. Of course, we understand that if all our requests were flowing through these blocking calls then we might as well not use Reactor (outside of the admittedly nice processing API).
Update: Thanks bsideup for this answer. However, I should have been a little more specific with my questions.
My overall question is how to effectively have a blocking call used across multiple fluxes, where these fluxes can be created/subscribed to in large numbers. We tried the suggested approach, but it results in an explosion of threads and quickly OOMs. So we are thinking of using a shared scheduler. So, here are my questions.
Is using a shared scheduler (fixedScheduler) what you would suggest in the situation I describe? If not, can you point me in another direction?
If using a shared scheduler is good, would this be a good implementation of it: Schedulers.newParallel("blocking-scheduler", maxNumThreads)?
Update 2: Just dug a bit into Schedulers#newParallel and realized that won't work, since it 'rejects' blocking calls.
Really appreciate any tips!
While subscribeOn is indeed one way of handling blocking calls, and your usage is okay, you can just as well use publishOn.
It moves processing to the provided Scheduler, unless another publishOn is specified:
CompletionStage<String> getSomething(String input) {
    return Mono.just(input)
               .map(in -> process(in)) // process must be non-blocking, or go after publishOn
               .publishOn(Schedulers.boundedElastic())
               .map(this::jdbcCall)
               .flatMap(str -> asyncHttpCall(str))
               .publishOn(Schedulers.boundedElastic())
               .map(this::jdbcCall)
               .toFuture();
}
As you can see, you can continue using async calls too. Just make sure you're not blocking non-blocking schedulers (in that example, I use publishOn again after flatMap because asyncHttpCall may complete on a non-blocking scheduler).
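On the shared-scheduler question from the update: creating one capped scheduler at startup and reusing it for all blocking JDBC work is the usual way to avoid the thread explosion described. A minimal sketch, where the name and sizing are illustrative rather than prescriptive:
import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

class BlockingSchedulers {
    // One shared, capped scheduler for blocking JDBC calls, created once and reused
    // across requests. Schedulers.boundedElastic() is itself a shared, capped instance;
    // newBoundedElastic lets you pick your own cap and name.
    static final Scheduler JDBC =
            Schedulers.newBoundedElastic(32, 10_000, "jdbc-blocking");
}
You would then pass BlockingSchedulers.JDBC to publishOn/subscribeOn wherever the blocking calls happen, instead of creating a new scheduler per request.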

Questions about passing strings and other data from UI to LV2 plugin

I need to pass a string from the UI to the plugin. From the eg-sample, it appears that an LV2 atom should be written to an atom port.
If I understand it correctly:
First, allocate an LV2_Atom_Forge. May that object live on the stack, or does it have to survive after the UI event callback has returned?
Then call lv2_atom_forge_set_buffer. How do I know the required size of the buffer? The example sets it to 1024 bytes for no apparent reason. May the buffer be allocated on the stack, or does it have to survive after the UI event callback has returned?
The forge is just a utility for writing atoms. The buffer it writes to is provided by the code that uses it, so the lifetime of the forge itself is irrelevant. Allocating it on the stack is fine, though it may be more convenient to keep one around in your UI struct for use in various places.
You can estimate the space required by knowing the format of atoms as described in the documentation, or by simply implementing everything with a massive buffer at first and checking the size field of the top-level atom in your output. Keep in mind that this will change if you have variable-sized elements like strings in there. The data passed to the UI callback(s) is const and only valid during the call; it must be copied by the receiver if it needs to be available later.

Is it better for an API to dispatch itself to a queue and invoke a callback, or for the API caller to do the dispatching?

Examples:
Asynchronous method with its own dispatching:
// Library
func asyncAPI(callback: Result -> Void) {
    dispatch_async(self.queue) {
        ...
        callback(result)
    }
}
// Caller
asyncAPI() { result in
    ...
}
Synchronous method with exposed dispatch queue:
// Library
func syncAPI() -> Result {
    assert(isRunningOnCorrectQueue())
    ...
    return result
}
// Caller
dispatch_async(api.queue) {
    let result = api.syncAPI()
    ...
}
These two examples behave the same, but I am looking to learn whether one of them ends up complicating a larger codebase more than the other, especially when there is a lot of asynchrony.
I would argue against both of the patterns you propose.
For the first pattern (where the API manages its own backgrounding), I see little or no benefit to doing it this way, as opposed to leaving it to the caller. If you want to use a private, serial queue to protect data (or any other sort of critical section) internal to your API, that's fine, but that queue should be private, and it should specifically not target any public, non-global-concurrent queue (Note: it should especially not target the main queue). Ideally, the primary implementation of your API would also take a second parameter, so callers can specify on which queue to invoke the callback. (People can work around the lack of such a parameter by passing a callback block that re-dispatches to their desired queue, but I think that's clunkier than having an extra, optional parameter.) This puts the API consumer in complete control of the concurrency, while preserving your freedom to use queues internally to protect state.
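To make the "extra parameter" idea concrete: the question is posed in GCD/Swift terms, but the shape is language-agnostic. Here is a rough sketch using Java executors purely for illustration (all names are invented; this is not an existing API). The library keeps its internal queue private, and the caller decides where the callback is delivered:
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Sketch: the caller passes the executor on which the callback should be invoked.
final class AsyncApi {
    private final Executor internalQueue = Executors.newSingleThreadExecutor(); // private, protects internal state

    void fetch(String request, Executor callbackExecutor, Consumer<String> callback) {
        internalQueue.execute(() -> {
            String result = doWork(request);                          // internal concurrency stays hidden
            callbackExecutor.execute(() -> callback.accept(result));  // caller controls the callback context
        });
    }

    private String doWork(String request) { return request.toUpperCase(); }
}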
As to the second approach, it's my opinion that we all should avoid creating new synchronous, blocking API. When you provide a synchronous, blocking API and don't provide a callback-based version, that means that you have denied consumers of your API any opportunity to avoid blocking. When you only provide synchronous, blocking API, then if someone wants to call your API in the background, at least one thread (in addition to any additional threads that your API consumes behind the scenes) will be consumed from the finite number of threads available to each process. (In the worst case this can lead to starvation conditions that are effectively deadlocks.)
Another red flag with this second example is that it vends a queue; any time an API vends a queue, something is amiss. As mentioned, if you want to use a private serial queue to protect state or other critical sections internal to your API, go for it, but don't expose that queue to the outside world. If nothing else, it unnecessarily exposes details of your implementation. In looking at the system framework headers, I couldn't find a single case where a dispatch_queue_t was vended where it wasn't immediately obvious that the intent was for the API consumer to pass in the queue, and not read it out.
It's also worth mentioning that these patterns are problematic regardless of whether your workload is CPU-bound or IO-bound. If it's CPU-bound, then not managing your own dispatch gives consumers of the API explicit control over how this CPU work is executed. If your workload is IO-bound, then you should use the OS- and libdispatch-provided asynchronous IO mechanisms (dispatch_io, dispatch_sources, kevent, etc) to avoid consuming a thread (or more than one) for the duration of your work.
Another answer here implied that forcing consumers to manage their own concurrency leads to "boilerplate" code. If you feel that the burden of API consumers potentially having to wrap calls to your API with dispatch_async is too great, then feel free to provide a convenience overload that dispatches to the default global concurrent queue, but please always keep the version that allows API consumers to explicitly manage their own concurrency.
If, on the other hand, all this is internal to the implementation, and not part of the public API, then do whatever is most expedient, knowing that you can refactor the implementation behind the public API any time in the future.
As you said, the two generally accomplish the same thing, but the first is preferable in most scenarios. There are several benefits to using the first method.
The API is simpler. You simply call the method and provide code for the callback block.
Less boilerplate code: there is no typing dispatch_async every time you want to call it, as it is included in the method itself.
Less room for bugs/errors. By wrapping the asynchronous logic inside the method itself, you ensure that it is called on the right queue internally without the caller having to worry about any of that.
Touching on the last point, you also have finer control over the queue itself. Let's say you are trying to perform certain tasks on a particular queue. It is much simpler to wrap the code in a GCD call on that queue a single time than to remember to reuse that same queue every time you want to call the method.

Synchronization in ActionScript

Despite my best efforts, I am unable to produce the kind of synchronization effects I would like in ActionScript. The issue is that I need to make several external calls to get various pieces of outside information in response to a user request, and the way items will be laid out on the page is dependent on what each of these external calls returns. So, I don't care that all of these calls return asynchronously. However, is there any way to force some amount of synchronization in ActionScript, so that at least calling the method for doing the final layout and placement of items on the page is dependent on all of my calls finishing?
If I understand the question right, event listeners are probably your best bet. Most loader classes dispatch an Event.COMPLETE event when they finish doing everything behind the scenes. If you wrote those external calls, it would be easy to dispatch a complete event at the end.
So when you make all these external calls, have a function that listens for when those calls complete. This function would keep track of how many calls have been made, and when there are none left to run, continue building your layout.
Rough sketch to explain:
var numProcesses:int = 0;

slowThing.addEventListener(Event.COMPLETE, waitForSlowest);
numProcesses++;
slowThing.load();

quickThing.addEventListener(Event.COMPLETE, waitForSlowest);
numProcesses++;
quickThing.load();

function waitForSlowest(e:Event):void
{
    numProcesses--;
    if (numProcesses == 0)
        finalizeLayout();
}

Parsing variable length descriptors from a byte stream and acting on their type

I'm reading from a byte stream that contains a series of variable-length descriptors, which I'm representing as various structs/classes in my code. Each descriptor has a fixed-length header in common with all the other descriptors, which is used to identify its type.
Is there an appropriate model or pattern I can use to best parse and represent each descriptor, and then perform an appropriate action depending on its type?
I've written lots of these types of parser.
I recommend that you read the fixed-length header and then dispatch to the correct constructor for your structures using a simple switch-case, passing the fixed header and the stream to that constructor so that it can consume the variable part of the stream.
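For illustration, that dispatch might look roughly like this. It is a Java-flavored sketch; the header layout (one type byte plus a two-byte length) and the descriptor types are invented:
import java.io.DataInputStream;
import java.io.IOException;

// Sketch: read the fixed header, then let the matching constructor consume the
// variable-length body from the same stream.
abstract class Descriptor {
    static Descriptor read(DataInputStream in) throws IOException {
        int type   = in.readUnsignedByte();
        int length = in.readUnsignedShort();
        switch (type) {
            case 0x01: return new ImageDescriptor(length, in);
            case 0x02: return new AudioDescriptor(length, in);
            default:
                in.skipBytes(length);   // unknown descriptor: skip its payload
                return null;
        }
    }
}

class ImageDescriptor extends Descriptor {
    final byte[] pixels;
    ImageDescriptor(int length, DataInputStream in) throws IOException {
        pixels = new byte[length];
        in.readFully(pixels);
    }
}

class AudioDescriptor extends Descriptor {
    final byte[] samples;
    AudioDescriptor(int length, DataInputStream in) throws IOException {
        samples = new byte[length];
        in.readFully(samples);
    }
}
The switch stays in one place; each constructor only knows how to consume its own body.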
This is a common problem in file parsing. Commonly, you read the known part of the descriptor (which luckily is fixed-length in this case, but isn't always) and branch from there. Generally I use a strategy pattern here, since I generally expect the system to be broadly flexible - but a straight switch or factory may work as well.
The other question is: do you control and trust the downstream code? Meaning: the factory / strategy implementation? If you do, then you can just give them the stream and the number of bytes you expect them to consume (perhaps putting some debug assertions in place, to verify that they do read exactly the right amount).
If you can't trust the factory/strategy implementation (perhaps you allow the user-code to use custom deserializers), then I would construct a wrapper on top of the stream (example: SubStream from protobuf-net), that only allows the expected number of bytes to be consumed (reporting EOF afterwards), and doesn't allow seek/etc operations outside of this block. I would also have runtime checks (even in release builds) that enough data has been consumed - but in this case I would probably just read past any unread data - i.e. if we expected the downstream code to consume 20 bytes, but it only read 12, then skip the next 8 and read our next descriptor.
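A rough Java analogue of that kind of wrapper (the class name and details are illustrative; SubStream in protobuf-net is the C# original) could look like this:
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: exposes at most 'remaining' bytes of the underlying stream to untrusted
// deserializer code, and lets the caller skip whatever was left unread afterwards.
final class BoundedInputStream extends FilterInputStream {
    private long remaining;

    BoundedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override public int read() throws IOException {
        if (remaining <= 0) return -1;            // report EOF at the descriptor boundary
        int b = super.read();
        if (b >= 0) remaining--;
        return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (remaining <= 0) return -1;
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) remaining -= n;
        return n;
    }

    // After the deserializer returns, discard anything it failed to consume.
    void skipRemainder() throws IOException {
        while (remaining > 0) {
            long skipped = super.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else {
                if (super.read() < 0) break;      // underlying stream ended early
                remaining--;
            }
        }
    }
}
The untrusted deserializer sees a stream that simply ends at the descriptor boundary, and skipRemainder() implements the "read past any unread data" behaviour described above.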
To expand on that, one strategy design here might have something like:
interface ISerializer {
    object Deserialize(Stream source, int bytes);
    void Serialize(Stream destination, object value);
}
You might build a dictionary (or just a list if the number is small) of such serializers keyed by the expected markers, resolve your serializer, then invoke the Deserialize method. If you don't recognise the marker, then (one of):
skip the given number of bytes
throw an error
store the extra bytes in a buffer somewhere (allowing for round-trip of unexpected data)
As a side-note to the above - this approach (strategy) is useful if the system is determined at runtime, either via reflection or via a runtime DSL (etc). If the system is entirely predictable at compile-time (because it doesn't change, or because you are using code-generation), then a straight switch approach may be more appropriate - and you probably don't need any extra interfaces, since you can inject the appropriate code directly.
One key thing to remember, if you're reading from the stream and do not detect a valid header/message, throw away only the first byte before trying again. Many times I've seen a whole packet or message get thrown away instead, which can result in valid data being lost.
This sounds like it might be a job for the Factory Method or perhaps Abstract Factory. Based on the header you choose which factory method to call, and that returns an object of the relevant type.
Whether this is better than simply adding constructors to a switch statement depends on the complexity and the uniformity of the objects you're creating.
I would suggest:
fifo = Fifo.new
while (fd is readable) {
    read everything off the fd and stick it into fifo
    if (the front of the fifo has a valid header and
        the fifo is big enough for the payload) {
        dispatch constructor, remove bytes from fifo
    }
}
With this method:
you can do some error checking for bad payloads, and potentially throw bad data away
data is not waiting on the fd's read buffer (can be an issue for large payloads)
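Made concrete, the loop suggested above might look roughly like the following Java sketch (the header layout, one type byte plus a two-byte length, is invented for illustration):
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Sketch: drain whatever is available from the fd into a growable buffer, then peel
// complete descriptors (header plus full payload) off the front of that buffer.
final class DescriptorReader {
    private byte[] fifo = new byte[0];

    void pump(InputStream fd) throws IOException {
        byte[] chunk = new byte[4096];
        while (fd.available() > 0) {
            int n = fd.read(chunk);
            if (n <= 0) break;
            byte[] grown = Arrays.copyOf(fifo, fifo.length + n);
            System.arraycopy(chunk, 0, grown, fifo.length, n);
            fifo = grown;
        }
        parseComplete();
    }

    private void parseComplete() {
        while (fifo.length >= 3) {                                  // room for the fixed header?
            int type = fifo[0] & 0xFF;
            int len  = ((fifo[1] & 0xFF) << 8) | (fifo[2] & 0xFF);
            if (fifo.length < 3 + len) return;                      // payload not fully buffered yet
            dispatch(type, Arrays.copyOfRange(fifo, 3, 3 + len));
            fifo = Arrays.copyOfRange(fifo, 3 + len, fifo.length);  // remove consumed bytes
        }
    }

    private void dispatch(int type, byte[] payload) {
        // construct the matching descriptor for 'type' here (switch/factory as above)
    }
}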
If you'd like it to be nicely OO, you can use the visitor pattern in an object hierarchy. Here is how I've done it (for identifying packets captured off the network, which is pretty much the same thing you might need):
huge object hierarchy, with one parent class
each class has a static constructor that registers with its parent, so the parent knows about its direct children (this was C++; I think this step is not needed in languages with good reflection support)
each class had a static constructor method that got the remaining part of the byte stream and, based on that, decided whether it was its responsibility to handle that data or not
When a packet came in, I simply passed it to the static constructor method of the main parent class (called Packet), which in turn checked all of its children to see whether it was their responsibility to handle that packet, and this went on recursively until one class at the bottom of the hierarchy returned the instantiated object.
Each of the static "constructor" methods cut its own header from the bytestream and passed down only the payload to its children.
The upside of this approach is that you can add new types anywhere in the object hierarchy WITHOUT needing to see/change ANY other class. It worked remarkably well for packets; it went like this:
Packet
    EthernetPacket
        IPPacket
            UDPPacket, TCPPacket, ICMPPacket
            ...
I hope you can see the idea.
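A much-flattened sketch of the registration idea in Java: the original description uses C++ static constructors and a real class hierarchy, while here everything is collapsed into a single list purely to show the "add new types without touching other classes" property. All names are invented:
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch: each packet type registers a function that strips its own header and returns
// an instance (possibly after delegating to its own children), or null if the bytes
// are not its responsibility.
abstract class Packet {
    private static final List<Function<byte[], Packet>> parsers = new ArrayList<>();

    static void register(Function<byte[], Packet> parser) {
        parsers.add(parser);
    }

    static Packet parse(byte[] bytes) {
        for (Function<byte[], Packet> parser : parsers) {
            Packet p = parser.apply(bytes);
            if (p != null) return p;   // first type that claims the data wins
        }
        return null;                   // nobody claimed the data
    }
}
Adding a new packet type then only means writing the new class and registering its parser; nothing else in the hierarchy has to change.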

Resources