How to make sure fluentd BufferedOutput write(chunk) processes events exactly once

In a BufferedOutput plugin, if write(chunk) throws an exception or the fluentd process dies while it is processing the chunk, the docs say the chunk will stay in the queue. Does that mean the events/records processed before the crash will be processed again after fluentd restarts?
If so, write(chunk) has to be atomic for exactly-once processing. Is the method described in the filter_stream section suitable for that purpose, i.e. are the events in the MultiEventStream processed atomically?

write(chunk) may be retried if an error occurs in that method, so it should be written to be idempotent. I can't follow what you're doing, but each method is designed for a specific purpose:
filter_stream in Filter: select/reject events, or enrich/trim fields of records (once per event, not retried)
format in Output: format events into a string/binary that will be written into chunks (once per event, not retried)
write in Output: read data from a chunk and write/send it to the destination (at least once per chunk, retried on errors)
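As an illustration of the idempotency this answer calls for, here is a minimal Python sketch (not fluentd's actual Ruby API): the chunk's stable unique ID is passed along as an idempotency key, assuming a hypothetical destination that can deduplicate on such a key, so a retried write of the same chunk does not duplicate events.

```python
import hashlib

def write_chunk(chunk_id: bytes, records: list, destination) -> None:
    """Send one buffered chunk so that retries are harmless."""
    # Derive a stable idempotency key from the chunk ID; fluentd chunks
    # carry a unique_id, so a retry resends exactly the same key.
    key = hashlib.sha256(chunk_id).hexdigest()
    # One call per chunk: either the destination accepts the whole chunk
    # under this key, or the call fails and fluentd retries the chunk.
    # put_batch is a hypothetical API that dedupes on idempotency_key.
    destination.put_batch(idempotency_key=key, records=records)
```

With a destination like that, at-least-once delivery of chunks becomes effectively exactly-once for the events they contain.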

Related

Get all Durable Function instances over a time period

I have been trying to use the Durable Functions HTTP API Get Instances call to get a list of Completed/Failed/Terminated instances to delete over a given time period, batched in groups of 50: /runtime/webhooks/durabletask/instances?code=xxx&createdTimeFrom=2021-11-06T00:00:00.0Z&createdTimeTo=2021-11-07T00:00:00.0Z&top=50
As per the documentation, if the response contains the x-ms-continuation-token header, there are more results and I should make another call with the token added to the request headers, even if the body contains no results (the first few calls always seem to return no results, then I start getting results for a while before dropping back to none). My issue is that this never seems to end: there is always a continuation token, even after 20+ minutes and hundreds of calls for the same date range. This doesn't happen with the Durable Functions Monitor extension for VS Code.
What am I missing from the documentation that will tell me when to stop looking for more records if the x-ms-continuation-token header is always present?
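For reference, the paging loop the documentation describes can be sketched in Python as follows (the requests library, URL, and code value are placeholders): the documented stop condition is the absence of the x-ms-continuation-token header, which is exactly the condition the question reports never occurring.

```python
import requests

BASE = "https://myapp.azurewebsites.net/runtime/webhooks/durabletask/instances"
params = {
    "code": "xxx",
    "createdTimeFrom": "2021-11-06T00:00:00.0Z",
    "createdTimeTo": "2021-11-07T00:00:00.0Z",
    "top": 50,
}

headers = {}
while True:
    resp = requests.get(BASE, params=params, headers=headers)
    resp.raise_for_status()
    for instance in resp.json():   # a page may legitimately be empty
        print(instance.get("instanceId"))
    token = resp.headers.get("x-ms-continuation-token")
    if not token:                  # documented stop condition
        break
    headers["x-ms-continuation-token"] = token
```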

select from system$stream_has_data returns error - parameter must be a valid stream name... hmm?

I'm trying to see if there is data in a stream, and I provided the exact stream name as follows:
Select SYSTEM$STREAM_HAS_DATA('STRM_EXACT_STREAM_NAME_GIVEN');
But I get an error:
SQL compilation error: Invalid value ['STRM_EXACT_STREAM_NAME_GIVEN'] for function 'SYSTEM$STREAM_HAS_DATA', parameter 1: must be a valid stream name
1) Any idea why? How can this error be resolved?
2) Would it hurt to resume a set of tasks (alter task resume;) without knowing whether the corresponding stream has data in it? I believe that if there is (delta) data in the stream, the task will load it; if not, the task won't do anything.
3) Any idea how to modify/update a stream that shows up as 'STALE'? Or should loading fresh data into the table associated with the stream set the stream to 'NOT STALE', i.e. stale = false? And what if loading the associated table does not update the state of the stream? (That is what appears to be happening in my case.)
1) It doesn't look like you have a stream by that name. Try running SHOW STREAMS; to see which streams are active in the database/schema you are currently using.
2) If your task has a WHEN clause that validates against the SYSTEM$STREAM_HAS_DATA result, then resuming the task and letting it run on schedule only hits your global services layer (no warehouse credits), so there is no harm there.
3) STALE means that the stream's data wasn't consumed by a DML statement in a long time (I think it's 14 days by default, or the table's data retention period if that is longer). Loading more data into the stream's source table doesn't help that. Running a DML statement against the stream will, but since the stream is already stale, doing so may have bad consequences. Streams are meant for frequent DML, so not consuming a stream for longer than 14 days is very uncommon.
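To make the first point concrete, here is a sketch using the Snowflake Python connector (connection parameters and names are placeholders): list the streams visible in your current context first, then call SYSTEM$STREAM_HAS_DATA with the exact, optionally fully qualified, name.

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    database="MY_DB", schema="MY_SCHEMA",
)
cur = conn.cursor()

# Confirm the exact stream name and its database/schema first.
cur.execute("SHOW STREAMS")
for row in cur.fetchall():
    print(row[1])   # the second column of SHOW STREAMS is the name

# Then check it; the argument is a string containing the stream's name,
# fully qualified if it lives outside the current schema.
cur.execute("SELECT SYSTEM$STREAM_HAS_DATA('MY_DB.MY_SCHEMA.MY_STREAM')")
print(cur.fetchone()[0])   # True if the stream has delta rows
```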

XGrabPointer poll till next event or pipe

I was trying to write a mouse event listener. This was my approach; can you please tell me whether it will work before I start writing it? I'm writing it in ctypes, so if I ctype it all (a couple of days) and then find out it doesn't work, it's a waste of time.
My goal is, that I should be able to cancel the poll via a pipe. This was my approach:
1) In another thread, call XInitThreads
2) Open the X display with XOpenDisplay
3) Call XGrabPointer on the display
4) Get the connection's file descriptor with ConnectionNumber(display)
5) Connect to the pipe that was created on the main thread
6) Call pselect (with the timeout set to NULL, i.e. no timeout) on the pipe and on the fd from step 4
Is this the right approach?
Thanks
If you are using threads, you are already sharing variables between threads. It would be much simpler to use a global variable that is set when the poll must be aborted; in your watch thread, run a tight loop that checks that variable and uses a short timeout in pselect(). This may introduce a short delay, but if you keep the timeout short (say, 100 ms) it is hardly noticeable and still efficient.
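In Python terms (a sketch of the pattern, not the actual ctypes code), the suggestion looks like this: the watch thread polls the X connection's file descriptor with a short timeout and checks a shared flag on every iteration, so the main thread can abort the loop without a pipe.

```python
import select
import threading

stop = threading.Event()   # the shared "abort the poll" variable

def handle_x_event(fd: int) -> None:
    pass   # placeholder: read and dispatch the pending X event

def watch(x_conn_fd: int) -> None:
    while not stop.is_set():
        # The 100 ms timeout bounds the abort latency while staying cheap.
        readable, _, _ = select.select([x_conn_fd], [], [], 0.1)
        if readable:
            handle_x_event(x_conn_fd)

# main thread, when shutting down:
#   stop.set()             # the watch thread exits within ~100 ms
```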

Best way to handle backed out message in WMB

I have a backout queue for my queue manager.
I want to build a message flow that reads this queue; whenever a message arrives, the flow should wrap it in a specially formatted XML message and put it on the normal exception queue that receives handled exceptions.
But a message arriving on the backout queue can be in any format, and I have to build an XML message in which that message becomes a field.
So what are the best settings for my flow (regarding MQMD properties like CCSID, format, etc.), and which parser should I use (DFDL, BLOB, or MRM)?
Kindly advise.
Since you don't know what kind of message will arrive on the backout queue, you should not parse it with a specific parser (like XMLNSC etc.). The more generic the parameters you set on the MQInput node, the easier it will be further down the flow to determine what's inside the message.
So I would start with the default message domain (BLOB) and leave the other parameters untouched as well. Connect a logging node (e.g. a Trace node) to the Catch and Failure terminals. Connect the Out terminal to a Compute node containing ESQL that determines the error type and decides on further actions (e.g. route to label). Then in each label, decide which part of the message should be mapped into the final exception message and do the mapping.
If you need the MQMD properties of the message currently in the backout queue in your resulting message, just extract the values and put/concatenate them into the XML part of the resulting message. I don't think you should copy the MQMD (and other) headers to the resulting message as-is, because they might be the reason the original message ended up in the backout queue, and your resulting message would land there again. Construct the resulting message's headers from scratch.
If something goes wrong during these transformations, you will see the problem in the Trace output. Then adjust the error-handling logic to avoid mishandling in the future.
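The core wrapping step can be illustrated outside ESQL; here is a Python sketch (element names are made up for the example): the backed-out message is treated as an opaque BLOB, base64-encoded into a field of a fresh XML document, and only the MQMD values you actually need are copied over as plain elements rather than as headers.

```python
import base64
import xml.etree.ElementTree as ET

def wrap_backed_out(payload: bytes, msg_id: str, ccsid: int, fmt: str) -> bytes:
    root = ET.Element("ExceptionMessage")
    meta = ET.SubElement(root, "OriginalMQMD")
    ET.SubElement(meta, "MsgId").text = msg_id
    ET.SubElement(meta, "CodedCharSetId").text = str(ccsid)
    ET.SubElement(meta, "Format").text = fmt
    # base64 keeps arbitrary bytes safe inside XML, whatever the
    # original message format was.
    payload_el = ET.SubElement(root, "OriginalPayload")
    payload_el.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```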

Erlang message loops

How do message loops in Erlang work? Are they synchronous when it comes to processing messages?
As far as I understand, the loop starts by receiving a message, performs some work, and then enters the next iteration of the loop.
So that has to be synchronous, right?
If multiple clients send messages to the same message loop, are all those messages queued and processed one after another?
To process multiple messages in parallel, you would have to spawn multiple message loops in different processes, right?
Or did I misunderstand all of it?
Sending a message is asynchronous. Processing a message is synchronous - one message is receive'd at a time - because each process has its own (and only one) mailbox.
From the manual (Erlang concurrency):
Each process has its own input queue for messages it receives. New messages received are put at the end of the queue. When a process executes a receive, the first message in the queue is matched against the first pattern in the receive; if this matches, the message is removed from the queue and the actions corresponding to the pattern are executed.
However, if the first pattern does not match, the second pattern is tested; if this matches, the message is removed from the queue and the actions corresponding to the second pattern are executed. If the second pattern does not match, the third is tried, and so on until there are no more patterns to test. If there are no more patterns to test, the first message is kept in the queue and we try the second message instead. If this matches any pattern, the appropriate actions are executed and the second message is removed from the queue (keeping the first message and any other messages in the queue). If the second message does not match, we try the third message, and so on until we reach the end of the queue. If we reach the end of the queue, the process blocks (stops execution) and waits until a new message is received, and this procedure is repeated.
Of course, the Erlang implementation is "clever" and minimizes the number of times each message is tested against the patterns in each receive.
So you can create priorities with the receive patterns, but concurrency is achieved via multiple processes.
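The mailbox semantics can be mimicked in Python to make the point concrete: many senders enqueue asynchronously, but a single worker drains the queue one message at a time, so processing within one "process" is sequential; parallelism comes from running several worker/queue pairs.

```python
import queue
import threading

mailbox: "queue.Queue[str]" = queue.Queue()

def loop() -> None:
    while True:
        msg = mailbox.get()        # blocks, like a bare receive
        if msg == "stop":
            return
        print("handled:", msg)     # strictly one message at a time

worker = threading.Thread(target=loop)
worker.start()

for sender in range(3):            # sends are async: put() returns at once
    mailbox.put(f"hello from sender {sender}")
mailbox.put("stop")
worker.join()
```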
