A "Temporal Pass" Module - erlang

I am trying to create a general module that collects data arriving at irregular intervals. Data comes in at the left end whenever new data is available, which may be something like 100 times a second.
On the right end I want to be able to "plug in" n listeners, each with its own regular interval. For the purpose of simplification, let's say all with an interval of once per second.
Every listener registers a callback function that may or may not be asynchronous.
My problem is that if the callback function is synchronous, my "temporal pass" may hang. What is the best way to approach this? Should I spawn a process whose sole purpose is to pass along the data, and let that process pay the price if the callback hangs?
             +-------------+                Data Out 1
 =======>    |Temporal Pass| ==========>
 Data In     +-------------+ \\             Data Out 2
                              \\            Data Out n

Spawn a new process for the message; otherwise the temporal pass process will wait until the synchronous calls are done. This is exactly the sort of problem the process model is meant to solve, and I do not see any other way to do it.
Spawning processes is not expensive, but it is not entirely free either. You may get a small performance boost by spawning new processes only for the synchronous calls. That will require some way of flagging each callback as either synchronous or asynchronous.
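The flagging idea can be sketched as follows. This is only an analogy in Python, using threads where Erlang would use processes; the class name TemporalPass, the is_blocking flag and the method names are all assumptions made for illustration, not part of the original question.

```python
import threading


class TemporalPass:
    """Keeps the latest sample and hands it to listeners without blocking."""

    def __init__(self):
        self._latest = None
        self._listeners = []  # (callback, is_blocking) pairs; the flag is hypothetical

    def register(self, callback, is_blocking):
        self._listeners.append((callback, is_blocking))

    def push(self, sample):
        # Called by the producer, possibly around 100 times a second.
        self._latest = sample

    def tick(self):
        """Deliver the latest sample to every listener; return the started workers."""
        started = []
        for callback, is_blocking in self._listeners:
            if is_blocking:
                # A blocking callback gets its own worker so it cannot stall the pass.
                worker = threading.Thread(target=callback, args=(self._latest,))
                worker.start()
                started.append(worker)
            else:
                # Callbacks flagged as asynchronous are trusted to return at once.
                callback(self._latest)
        return started
```

Calling tick() once per listener interval returns as soon as the workers are started, so a hanging synchronous callback only ties up its own worker.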


Asynchronous Xarray writing to Zarr

Hi all. I'm using a Dask Distributed cluster to write Zarr+Dask-backed Xarray Datasets inside a loop, and dataset.to_zarr is blocking. This can really slow things down when there are straggler chunks that block the continuation of the loop. Is there a way to do the .to_zarr asynchronously, so that the loop can continue with the next dataset write without being held up by a few straggler chunks?
With the distributed scheduler, you get async behaviour without any special effort. For example, if you are doing arr.to_zarr, then indeed you are going to wait for completion. However, you could do the following:
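from dask.distributed import Client
client = Client(...)
out = arr.to_zarr(..., compute=False)  # build the write lazily; nothing runs yet
fut = client.compute(out)  # submit to the cluster; returns immediately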
This will return a future, fut, whose status reflects the current state of the whole computation, and you can choose whether to wait on it or to continue submitting new work. You could also display it in a progress bar (in the notebook), which will update asynchronously whenever the kernel is not busy.
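Applied to the loop from the question, the same pattern might look like the sketch below. It assumes client is a dask.distributed Client and each item in datasets is a Dask-backed Xarray Dataset; the function name write_all and the datasets and paths arguments are hypothetical.

```python
def write_all(client, datasets, paths):
    """Submit every Zarr write without blocking, then wait for all of them."""
    futures = []
    for ds, path in zip(datasets, paths):
        delayed_write = ds.to_zarr(path, compute=False)  # lazy: nothing runs yet
        futures.append(client.compute(delayed_write))    # returns a future at once
    # Block only once, after every write has been submitted to the cluster.
    return [fut.result() for fut in futures]
```

Because the loop never waits inside an iteration, stragglers from one dataset overlap with the writes of the next ones instead of holding them up.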

Erlang/OTP How to notify parent process that child processes are idle and no messages in their mailbox

I would like to design a process hierarchy where there is a parent process P which acts like a gatekeeper and delegates the work (messages/events from its client processes) to its children processes C1, C2..Cn, which collaborate with each other and may send the result back to P. The children processes cannot talk to any process outside, only P.
The challenge is that although P may have multiple messages from its clients, it should accept only one message, delegate the work to C1..Cn, and accept the next message from its clients ONLY when both of the following hold:
all its children processes are done (or idle) and there are no more messages circulating between C1 to Cn;
P has finished accepting messages from C1..Cn, so that it can return the result to its clients.
Idle for me is when they are waiting with a receive (blocking) or simply exited.
C1 to Cn are finite state machines. Some or all of them may send messages back to P, or there may be no messages to send back to P. Even if no messages are sent back, P has to figure out that all of them are done with no messages between them.
If any of C1 to Cn has been pre-empted, it is considered busy (this may be obvious, but I thought I'd put it here for completeness) and P will not receive the next message.
Is there an OTP pattern or library which will do this for me (before I hack something)? I know that process_info can tell me whether the mailbox of a process is empty, and I could keep checking the children's mailboxes from P, but that would be unnecessary polling from P.
EDIT GENERAL: I am trying to implement a reactive variant of Flow Based Programming on the Erlang platform. This has the notion of 'hierarchical processes' or composites, which themselves may contain composite processes until we reach some boxes of actual code. I am going to do some research (looking at monitor, process_info, process_flag), but I wanted to respond to your excellent answers.
EDIT RECURSIVE PARENTS: Each of C1 to Cn can itself be a parent/composite process. If I just spawn processes and let them exit immediately, I'll have to create the chain of composites every time, as C1..Cn may themselves be composites (which spawn composites, and so on). Finally, when we reach a leaf box (which is not a composite node), they are supposed to be finite state machines, so I'm not sure about spawning them and making them exit quickly if they are FSMs.
EDIT TKOWAL: Since I am trying to create a generic parent/composite process, it does not know 'when' the task ends. All it does is relay the messages it receives from its children to its siblings, with the 'constraint' that it will not accept the next message from its client/siblings until its children are 'done'. The children C1..Cn may send not just one but many messages. I understand from your proposal that wait_for_task_finish will stop blocking the moment it gets the first message. But more messages may be emitted by P's children. P should wait for all messages. Also, having a task_end symbol will not work for the same reason (i.e. multiple messages are possible from the children).
Given how inexpensive it is to start up Erlang processes, your gatekeeper could start new children for each incoming task, and then wait for them all to exit normally once they complete their work.
But in general, it sounds like you're looking for a process pool. There are a few of these already available, such as poolboy and sidejob. Pools can be harder to get right than you think, so I advise using an existing proven pool implementation before attempting to write your own.
After the edits, this became an entirely different question, so I am posting a new answer.
If you are trying to write Flow Based Programming, then you are probably solving the wrong problem. FBP is effective because almost everything is asynchronous and you start processing the next request immediately after you have finished with the previous one.
So the answer is: don't wait for the children to finish:
In FBP, there is no time dependencies between the components. So if I
have a chunk of data, it should be able to flow from one end of the
diagram to the other regardless of how any other pieces of data are
being handled. In order to program an FBP system, you have to minimize
your dependencies.
When creating the parent and children, you know all the connections between blocks, so just configure the children to send processed data directly to the next block. For example: P1 has children C1 and C2. You send a message to P1, it delegates it to C1, the packet flows a couple of times between C1 and C2, and after that C1 or C2 sends it directly to P2.
Blocks should be stateless. Their output should not depend on previous requests, so even if C1 and C2 are processing data from two different requests to P1, it is OK. There could be situations where P1 gets data packet D1 and then D2, but outputs the answers in a different order, R2 and then R1. That is also OK. You can use an Erlang reference to tag messages and then check which response belongs to which request.
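The tagging idea can be illustrated with a short sketch. It is written in Python as an analogy only: uuid4 stands in for an Erlang reference, and the function names tag_requests and match_responses are assumptions made for this example.

```python
import uuid


def tag_requests(payloads):
    """Attach a unique tag to each request, much like an Erlang reference."""
    return {uuid.uuid4().hex: payload for payload in payloads}


def match_responses(pending, responses):
    """Pair each (tag, result) response with the request that caused it."""
    return {pending[tag]: result for tag, result in responses}
```

Since every response carries the tag of its request, the order in which R1 and R2 arrive does not matter to the caller.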
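I don't think there is a ready-made library for that, but it is really easy to hack together, unless I missed something. Your P process should look like this:
ready_for_next_task() ->
    receive {task, Task, CallerPid} ->
        %% delegate Task to the children here (omitted), then wait
        wait_for_task_finish(CallerPid)
    end.
wait_for_task_finish(CallerPid) ->
    receive {task_end, Response} ->
        CallerPid ! Response,
        ready_for_next_task()
    end.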
wait_for_task_finish/1 has only one receive clause, so P will not accept the next task until the current one is finished. If you are waiting for multiple responses from workers, you can simply add a second receive clause that matches a partial response and calls wait_for_task_finish/1 recursively.
It is always better to have an explicit indicator that processing has ended, because you have no guarantees about message delivery time. You could check that all processes are currently waiting for a message and conclude that they have finished, when in fact they have not started yet, or one of them has sent a message to another and you looked before it reached the recipient's mailbox.
If the processes C1..Cn each handle only part of the actual work and do not know about overall progress, then the gatekeeper P should know how many parts there are, receive all of them one by one, and then call ready_for_next_task/0.
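The counting approach can be sketched as below. This is a Python analogy using threads and a queue in place of Erlang processes and mailboxes; run_task, parts and work are hypothetical names, and the sketch assumes each worker reports exactly one result per part.

```python
import queue
import threading


def run_task(parts, work):
    """Fan parts out to workers and return only when every part has reported."""
    results = queue.Queue()

    def worker(part):
        # Each worker reports exactly one message for the part it was given.
        results.put(work(part))

    for part in parts:
        threading.Thread(target=worker, args=(part,)).start()

    # The gatekeeper knows how many parts exist, so it counts replies
    # instead of guessing from idle workers or empty mailboxes.
    return [results.get() for _ in parts]
```

The gatekeeper never inspects worker state; it is done exactly when the number of received results equals the number of parts it handed out.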

Questions about the nextTuple method in the Spout of Storm stream processing

I am developing some data analysis algorithms on top of Storm and have some questions about its internal design. I want to simulate sensor data being generated and processed in Storm, so I use a Spout to push sensor data into the succeeding bolts at a constant time interval by calling sleep in the Spout's nextTuple method. But the experiment results showed that the spout didn't push data at the specified rate. There was no bottleneck bolt in the system during the experiment.
Then I checked some material about the ack and nextTuple methods of Storm. My question now is whether nextTuple is called only when the previous tuples have been fully processed and acked in the ack method.
If this is true, does it mean that I cannot set a fixed time interval to emit data?
Thx a lot!
My experience has been that you should not expect Storm to make any real-time guarantees, including in your case the rate of tuple processing. You can certainly write a spout that only emits tuples on some time schedule, but Storm can't really guarantee that it will always call on the spout as often as you would like.
Note that nextTuple should be called whenever there is room available for more pending tuples in the topology. If the topology has free capacity, I would expect Storm to try to fill it up if it can with whatever it can get.
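A spout that emits on a time schedule, rather than sleeping, could follow the pattern sketched here. This is plain Python and not the Storm API: ScheduledEmitter, next_tuple and read_sensor are hypothetical names, and returning None stands for "emit nothing on this call".

```python
import time


class ScheduledEmitter:
    """Emits at most one item per interval, however often it is polled."""

    def __init__(self, interval_s, clock=time.monotonic):
        self._interval = interval_s
        self._clock = clock
        self._next_due = clock()

    def next_tuple(self, read_sensor):
        """Return a reading if one is due, otherwise None (emit nothing)."""
        now = self._clock()
        if now < self._next_due:
            return None
        # Schedule from the previous deadline so late polls do not add drift.
        self._next_due += self._interval
        return read_sensor()
```

This caps the emission rate but cannot raise it: if the framework polls less often than the interval, readings are still late, which matches the point above that no real-time guarantee is available.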
I had a similar use case, and the way I accomplished it was by using TICK_TUPLE:
Config tickConfig = new Config();
tickConfig.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 15);
builder.setBolt("storage_bolt", new S3Bolt(), 4).fieldsGrouping("shuffle_bolt", new Fields("hash")).addConfigurations(tickConfig);
Then in my storage_bolt (note that it's written in Python, but you will get the idea) I check whether the message is a tick tuple, and if it is, I execute my code:
def process(self, tup):
    if tup.stream == '__tick':
        # Your logic that needs to be executed every 15 seconds,
        # or whatever you specified in tickConfig.
        # NOTE: the maximum time is 600 s.
        pass
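One common use of the tick branch is to buffer ordinary tuples and flush them on each tick. The sketch below shows that shape in plain Python without any Storm bindings; Tup, BufferingBolt and the flush callback are hypothetical stand-ins, and only the '__tick' stream name comes from the snippet above.

```python
from collections import namedtuple

# Hypothetical stand-in for a Storm tuple: a stream name plus its values.
Tup = namedtuple("Tup", ["stream", "values"])


class BufferingBolt:
    """Collects ordinary tuples and hands them off whenever a tick arrives."""

    def __init__(self, flush):
        self._buffer = []
        self._flush = flush  # called with the buffered values on each tick

    def process(self, tup):
        if tup.stream == "__tick":
            if self._buffer:
                self._flush(self._buffer)
                self._buffer = []  # start a fresh batch for the next period
        else:
            self._buffer.append(tup.values)
```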

Is it possible to implement Pregel in Erlang without supersteps?

Let's say we implement Pregel with Erlang. Why do we actually need supersteps? Isn't it better to just send messages from one supervisor to processes that represent nodes? They could just apply the calculation function to themselves, send messages to each other and then send a 'done' message to the supervisor.
What is the whole purpose of supersteps in a concurrent Erlang implementation of Pregel?
The superstep concept in the Pregel model can be viewed as a kind of barrier for entities executing in parallel. At the end of each superstep, each worker flushes its state to the persistent store.
The algorithm is checkpointed at the end of each superstep so that in case of failure, when a new node has to take over the function of a failed peer, it has a point to start from. Pregel guarantees that since the data of the node was flushed to disk before the superstep started, the new node can reliably start from exactly that point.
In a way, it also signifies the "progress" of the algorithm. A Pregel algorithm/job can be given a "max number of supersteps", after which the algorithm should terminate.
What you described in your question (supervisors sending workers a calculation function and waiting for a "done") can definitely be implemented, although I don't think the supervisor currently packaged with OTP can do that out of the box. But I guess the concept of a superstep is simply a requirement of the Pregel model. If, on the other hand, you were implementing something like a parallel mapper (like what Joe implements in his book), you wouldn't need supersteps.
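The barrier and the superstep limit can be seen in a small single-process sketch. It uses the familiar example of propagating the maximum value through a graph; it is written in Python rather than Erlang, leaves out checkpointing, and the function name run_supersteps and its arguments are assumptions made for illustration.

```python
def run_supersteps(values, edges, max_supersteps=30):
    """Propagate the maximum value through a graph, one superstep at a time.

    values: dict of vertex -> initial value
    edges:  dict of vertex -> list of neighbouring vertices
    Returns (final values, number of supersteps executed).
    """
    values = dict(values)
    # Superstep 0: every vertex announces its value to its neighbours.
    inbox = {v: [] for v in values}
    for v, val in values.items():
        for n in edges.get(v, []):
            inbox[n].append(val)
    steps = 1
    while steps < max_supersteps and any(inbox.values()):
        outbox = {v: [] for v in values}
        for v, messages in inbox.items():
            if messages and max(messages) > values[v]:
                values[v] = max(messages)
                for n in edges.get(v, []):
                    outbox[n].append(values[v])
        # Barrier: messages sent in this superstep become visible only in the next.
        inbox = outbox
        steps += 1
    return values, steps
```

The swap of inbox and outbox is where a real implementation would checkpoint, and the run ends either when no messages are in flight or when max_supersteps is reached.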

MailboxProcessor usage guidelines?

I've just discovered the MailboxProcessor in F# and its usage as a "state machine" ... but I can't find much on the recommended usage of them.
For example... say I'm making a simple game with 100 on-screen enemies should I use a MailboxProcessor to change enemy position and health; giving me 200 active MailboxProcessor?
Is there any clever thread management going on under the hood? should I try and limit the amount of active MailboxProcessor I have or can I keep banging them out willy-nilly?
Thanks in advance,
A MailboxProcessor for enemy simulation might look like this:
MailboxProcessor.Start(fun inbox ->
    async {
        while true do
            let! message = inbox.Receive()
            // CPU-bound processing of message goes here
            ()
    })
It does not consume a thread while it waits for a message to arrive (the let! message = line). However, once a message arrives, it will consume a thread (on the thread pool). If you have 100 mailbox processors that all receive a message simultaneously, they will all attempt to wake up and consume a thread. Since message processing here is CPU-bound, hundreds of mailbox processors will all wake up and start spawning thread-pool threads. This does not perform well.
One situation mailbox processors excel in is where there are many concurrent clients all sending messages to one processor (imagine several parallel web crawlers all downloading pages and sinking results to a queue). The on-screen enemies case appears different: it is many entities responding to a single source of messages (player movement/time ticks).
Another example, where thousands of MailboxProcessors are a great solution, is an I/O-bound MailboxProcessor:
MailboxProcessor.Start(fun inbox ->
    async {
        while true do
            let! message = inbox.Receive()
            match message with
            | _ ->
                do! AsyncWrite("something")
                let! response = AsyncResponse()
                ()
    })
Here after receiving a message the agent very quickly yields a thread but still needs to maintain state across asynchronous operations. This will scale very very well in practice - you can run thousands and thousands of such agents: this is a great way to write a web server.
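The same I/O-bound shape can be tried out in Python with asyncio, as a rough analogy rather than an F# equivalent. Each agent owns a queue as its mailbox; the names agent and run_agents, the None shutdown signal and the short sleep standing in for an asynchronous write are all assumptions made for this sketch.

```python
import asyncio


async def agent(mailbox, results):
    """Process messages one at a time, yielding while it waits on I/O."""
    while True:
        message = await mailbox.get()   # no thread is held while waiting
        if message is None:             # hypothetical shutdown signal
            return
        await asyncio.sleep(0.01)       # stands in for an async write/response
        results.append(message)


async def run_agents(count):
    """Start `count` agents, send each one message, and wait for them all."""
    results = []
    mailboxes = [asyncio.Queue() for _ in range(count)]
    tasks = [asyncio.create_task(agent(box, results)) for box in mailboxes]
    for i, box in enumerate(mailboxes):
        box.put_nowait(i)
        box.put_nowait(None)
    await asyncio.gather(*tasks)
    return results
```

Running asyncio.run(run_agents(1000)) keeps a thousand agents alive on a single thread, because each one spends nearly all of its time suspended on its mailbox or on simulated I/O.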
You can bang them out willy-nilly. Try it! They use the ThreadPool. I have not tried this for a real-time GUI game app, but I would not be surprised if it is 'good enough'.
say I'm making a simple game with 100 on-screen enemies should I use a MailboxProcessor to change enemy position and health; giving me 200 active MailboxProcessor?
I don't see any reason to try to use MailboxProcessor for that. A serial loop is likely to be much simpler and faster.
Is there any clever thread management going on under the hood?
Yes, lots. But it is designed for asynchronous concurrent programming (particularly scalable IO) and your program isn't really doing that.
should I try and limit the amount of active MailboxProcessor I have or can I keep banging them out willy-nilly?
You can bang them out willy-nilly but they are far from optimized and performance is much worse than serial code.
