Best practice to detect broken log sinks - Serilog

I am trying to replace the in-house logger solution of one of my customers. Pretty much everything is straightforward, but I need to implement one sink that sends the logs to a custom log window that I cannot change (for now). It communicates using named pipes. This pipe may be broken or busy, so the current solution actually blocks on every log call, which I want to improve.
The question is what the best practice is when using Serilog: what's the best way to tell Serilog the sink is currently broken, so that it doesn't slow down the system? Is throwing an exception enough?

Serilog itself doesn't know (or care) whether a sink is broken or not, so I'm not sure I understand your goal.
Writing to a Serilog logger is supposed to be a safe operation, by design, so any exceptions that happen in your sink will automatically be caught by Serilog to make sure the app doesn't crash. Serilog makes sure these exceptions are written to the SelfLog, which developers can use to troubleshoot sink issues. See an example here.
Therefore, if your goal is to have a way for a developer to see when the sink experienced problems, the recommendation is to write error messages to the SelfLog and throw your own exceptions from within your sink.
If you can detect from within your sink that the named pipe is not available, without blocking, then just write to SelfLog and return/short-circuit without trying to write to the pipe. It's really up to you to implement any kind of resilience policy inside your sink.
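To illustrate, here is a minimal sketch of that short-circuit idea. The pipe name "log-window", the one-connection-per-event approach, and the plain rendered-message payload are all assumptions made up for the example; a real sink would use whatever framing and connection handling the log window expects.

using System;
using System.IO.Pipes;
using System.Text;
using Serilog.Core;
using Serilog.Debugging;
using Serilog.Events;

class NamedPipeSink : ILogEventSink
{
    public void Emit(LogEvent logEvent)
    {
        try
        {
            using (var pipe = new NamedPipeClientStream(".", "log-window", PipeDirection.Out))
            {
                // Fail fast instead of blocking forever; throws TimeoutException if the pipe is busy or gone.
                pipe.Connect(100);
                var bytes = Encoding.UTF8.GetBytes(logEvent.RenderMessage() + Environment.NewLine);
                pipe.Write(bytes, 0, bytes.Length);
            }
        }
        catch (Exception ex)
        {
            // Never let a broken pipe crash or block the app; leave a trace for developers instead.
            SelfLog.WriteLine("NamedPipeSink: could not write event: {0}", ex);
        }
    }
}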
If your goal is to improve the blocking calls, you might want to consider making your sink asynchronous, with the messages sent on a separate thread, so the main thread of the app is never blocked.
Given that you're implementing your own custom sink, an easy way to do that is to turn your sink into a Periodic Batching sink and leverage the infrastructure it provides. Alternatively, you can use the Serilog.Sinks.Async wrapper sink.
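As a rough sketch of the Serilog.Sinks.Async route (reusing the hypothetical NamedPipeSink from the example above), the wrapper queues events and emits them on a background worker, so the logging call itself returns immediately:

using Serilog;

var log = new LoggerConfiguration()
    .WriteTo.Async(a => a.Sink(new NamedPipeSink())) // events are queued and written on a worker thread
    .CreateLogger();

log.Information("This call no longer waits on the named pipe");

The Periodic Batching route is similar, except your sink is handed batches of events on the worker, which also lets you coalesce pipe writes.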

Related

Error handling in data pipeline using Project Reactor

I'm writing a data pipeline using Reactor and Reactor Kafka, and I use Spring's Message<> to save the ReceiverOffset of the ReceiverRecord in the headers, to be able to call ReceiverOffset.acknowledge() when processing finishes. I also have the out-of-order commit feature enabled.
When processing an event fails, I want to be able to log the error, write to another topic that represents all the failure events, and commit to the source topic. I'm currently solving that by returning Either<Message<Error>, Message<MyPojo>> from each processing stage; that way the stream will not be stopped by exceptions, and I'm able to save the original event headers and eventually commit the failed messages at the bottom of the pipeline.
The problem is that each step of the pipeline gets Either<> as input and needs to filter out the previous errors and apply the logic only to the Either.right, which can be cumbersome, especially when working with buffers, where the operator gets List<Either<>> as input. I want to keep my business pipeline clean and receive only Message<MyPojo> as input, while still not missing errors that need to be handled.
I read that sending those error messages to another channel or stream is a solution for that.
Spring Integration uses that pattern for error handling, and I also read an article (link to article) that solves this problem in Akka Streams using divertTo().
I couldn't find documentation or code examples of how to implement that in Reactor.
Is there any way to use Spring Integration's error channel with Reactor, or are there other ideas for implementing this?
I'm not familiar with Reactor per se, but you can keep the stream linear. The trick, since Vavr's Either is right-biased, is to use flatMap, which takes a function from Message<MyPojo> to Either<Message<Error>, Message<MyPojo>>. If the Either coming in is a right (i.e. a Message<MyPojo>), the function gets invoked; otherwise the left just gets passed through.
// Apologies if the Java is atrocious... haven't written Java since pre-Java 8
Either<Message<Error>, Message<MyPojo>> result = incomingEither.flatMap(
    myPojoMessage -> processStage(myPojoMessage) // processStage stands in for your stage's logic, returning a new Either
);
Presumably at some point you want to do something (publish to a dead-letter topic, tickle metrics, whatever) with the Message<Error> case, so for that, orElseRun will come in handy.

How to clean up streams in a dartvm application?

Please note that I am asking about a strictly Dart-only application; this does not concern Flutter in any way. "dartvm" refers to the Dart virtual machine.
As far as I understand, Dart's idea of reactive state is implemented through streams. The responsibility for handling the lifetime of a stream object is given to the programmer; at runtime one can manipulate the stream as one sees fit, according to what works for the design, by adding to the stream, listening to it, or disposing of it.
My question is this: is it necessary to call the dispose() method of a stream before my application quits? If so, how do I go about accomplishing that? Hooking into the VM state isn't well documented, and using ProcessSignal listeners is not portable. If I don't, does the GC handle this case? What's the best practice here?
Dart streams do not have a dispose method. Therefore you don't need to call it.
But just to give a little more detail ...
Dart streams are many things. Or rather, streams are pretty simple; they're just a way to provide a connection between code which provides events and code which consumes events. After calling listen, the stream object is no longer part of the communication: events and pushback go directly between the event source (possibly a StreamController) and the consumer (a StreamSubscription).
Event providers are many things.
Some events are triggered just by code doing things. There is no need to clean up after those; they're just Dart objects like everything else, they will die with the program, and they can be garbage collected earlier if no live code refers to them.
Some events are triggered by I/O operations on the underlying operating system. Those will usually be cleaned up when the program ends, because they are allocated through the Dart runtime system, and it knows how to stop them again.
It's still a good idea to cancel the subscription as soon as you don't need any more events. That way, you won't keep a file open too long and prevent another part of the program from overwriting it.
Some code might allocate other resources, not managed by the runtime, and you should take extra care to say when that resource is no longer needed.
You'll have to figure that out on a case-by-case basis, by reading the documentation of the stream.
For resources allocated through dart:ffi, you can also use NativeFinalizer to register a dispose function for the resource.
Generally, you should always cancel the subscription if you don't need any more events from a stream. That's the one thing you can do. If nothing else, it allows garbage collection to collect things a little earlier.

grpc iOS stream, send only when GRXWriter.state is started?

I'm using grpc in iOS with bidirectional streams.
For the stream that I write to, I subclassed GRXWriter and I'm writing to it from a background thread.
I want to be as quick as possible. However, I see that GRXWriter's status switches between started and paused, and I sometimes get an exception when I write to it during the paused state. I found that before writing, I have to wait for GRXWriter.state to become started. Is this really a requirement? Is GRXWriter only allowed to write when its state is started? It switches very often between started and paused, and this feels like it may be slowing me down.
Another issue with this state check is that my code looks ugly. Is there a nicer way to use bidirectional streams? In C# gRPC, I just get a stream that I can write to freely.
Edit: I guess the reason I'm asking is this: in my thread that writes to GRXWriter, I have a while loop that keeps checking whether state is started and does nothing if it is not. Is there a better way to do this rather than polling the state?
The GRXWriter pauses because gRPC Core only accepts one pending write operation at a time. The next one has to wait until the first one completes. So the GRPCCall instance will block the writer, by modifying its state, until the previous write is completed.
As for the exception, I am not sure why you are getting it. GRXWriter is more like an abstract class, and it seems you did your own implementation by inheriting from it. If you really want to do so, it might be helpful to refer to GRXBufferedPipe, which is an internal implementation. In particular, if you want to avoid waiting in a loop before writing, triggering the next write from the setter of GRXWriter's state should be a good option.

Serilog File/RollingFile sink and buffering

It looks like the Serilog File/RollingFile sink flushes the stream after each logger call.
Isn't this a fundamental performance hit? For example, NLog has an AsyncWrapper for queuing log events and writing them in batches using a background thread.
What are my options if I want to minimize latency when using the file sink?
Rebuilding the code yourself is currently the only option for this.
I've added https://github.com/serilog/serilog/issues/650 in the hope that it will be included in the upcoming Serilog v2.
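(For readers on later Serilog versions: if the Serilog.Sinks.File and Serilog.Sinks.Async packages are available to you, something along these lines takes the flush off the logging call path. The buffered flag and the Async wrapper are assumptions about those newer packages, not something the answer above relied on.)

using Serilog;

var log = new LoggerConfiguration()
    // buffered: true lets the underlying stream coalesce writes instead of flushing per event
    .WriteTo.Async(a => a.File("log.txt", buffered: true))
    .CreateLogger();

log.Information("Written by a background worker, flushed lazily");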

Erlang/OTP framework's error_logger hangs under fairly high load

My application is basically a content-based router which routes MMS events.
The logger I am using is the one that comes with the OTP framework in SASL mode, error_logger.
The issue is:
I am using a client to generate MMS events with default values. This client (written in Java) can send a high load of events from multiple threads.
I am sending 100 events from 10 threads (each thread sending 10 MMS events) to my router written in Erlang/OTP.
The problem is that when such a high load is received by my router, my logger hangs, i.e. it stops updating my log file. But the router is still able to route the events.
The conclusions that I have come up with are:
A scheduling problem in Erlang when such a high load of events is received (a separate process for each event).
A very unlikely deadlock state.
It might be due to sending events in multiple threads rather than sending them sequentially. But I guess a router will be connected to multiple service provider boxes, so I thought of sending events in threads.
Can anybody help me demystify the problem?
You already have a good answer, but I'll add to the discussion.
The error_logger is by default using cached write operations to disk. So one possibility is that you don't really notice this while under low load, but under high load your writes get stuck in the cache for a while.
On a side note: there should be no problem with having multiple threads making calls to Erlang.
Another way of testing this is to add your own logger to error_logger and see what happens; for example, one that prints to the shell or something else that is "fast".
Which version of Erlang are you using? Prior to R14A (R13B4 maybe?), there was a performance penalty when you invoked a selective receive when the message queue contained a lot of messages. This behaviour meant that in a process that receives lots of messages (error_logger being the canonical example), if it was barely keeping up with the load then a small spike in load could cause the cost of processing to spike up and stay there as the new processing cost was higher than the process could bear. This problem has been solved in R14A.
Secondly, why are you sending a high volume of events/calls/logs to a text logger? Formatting strings for output to a human-readable log file is a lot more expensive than using a binary disk_log, for instance. Reducing the cost of logging will help, but reducing the volume of logs will help even more. Maybe investigate exactly why you need to log these things and see if you can't record them another (less expensive) way.
Problems with error_logger are often symptoms of some other overload problem. Try looking at the message queue sizes for all your processes when this problem occurs and see if something else is backed up too. The following Erlang shell code might help:
[ { P, element(2, process_info(P, message_queue_len)) }
|| P <- erlang:processes(), is_process_alive(P) ]
