Restricting the number of identical log messages via Fluentd - fluentd

Use case: Set the maximum number of messages (within a timeframe) to be sent to a target service.
Example: we collect logs from service X, which produces logs like these:
{"#timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}
{"#timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}
{"#timestamp":"2020-10-30T13:00:00.325Z","level":"WARN","message":"This is warn abc123"}
{"#timestamp":"2020-10-30T13:00:00.327Z","level":"WARN","message":"This is warn abc123"}
{"#timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}
As you can see, the same warning (abc123) was logged multiple times by the service within 12 ms.
I want to send only one of them.
So Fluentd should forward only these to the target service:
{"#timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}
{"#timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}
{"#timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}
Which timestamp is kept, or whether a counter is added, doesn't matter to me.
Is there a filter or plugin for this use case? Something where I can set a regex rule for the messages (to decide whether several messages should be considered equal) and a timeframe?

In Fluentd you can try the throttle plugin (https://github.com/rubrikinc/fluent-plugin-throttle) with the message field as the group_key (I'm not sure about performance in this case, though).
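For example, a minimal sketch of such a filter, assuming the records arrive under a tag matching service.x.** (the tag pattern, bucket period, and limit are illustrative and need tuning):
<filter service.x.**>
  @type throttle
  # each distinct message value forms its own rate-limiting group
  group_key message
  # allow at most 1 record per group per 1-second bucket
  group_bucket_period_s 1
  group_bucket_limit 1
</filter>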
In Fluent Bit you can use the built-in SQL stream processor and write a SELECT with WINDOW and GROUP BY clauses: https://docs.fluentbit.io/stream-processing/getting_started/fluent_bit_sql#select-statement.
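A sketch of such a query, assuming the records are registered as a stream named servicex (the stream name and window size are illustrative):
SELECT message, COUNT(*) FROM STREAM:servicex WINDOW TUMBLING (1 SECOND) GROUP BY message;
This emits one aggregated record per distinct message per window instead of every duplicate.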

Related

How to log to a file from the Erlang shell?

I keep forgetting that the Logging chapter of the Kernel User's Guide already has an answer.
Paraphrasing the section 2.9 Example: Add a handler to log info events to file in the Logging chapter of the Kernel User's Guide:
1. Set log level (default: notice)
Globally: logger:set_primary_config/2
For certain modules only: logger:set_module_level/2
Accepted log levels (from least severe to most):
debug, info, notice, warning, error, critical, alert, emergency
Note:
The default log level in the Erlang shell is notice, so if you leave it as is but set a lower level (such as debug or info) when adding a log handler in the next step, messages at those levels will never get through.
Example:
logger:set_primary_config(level, debug).
2. Configure and add log handler
Specify the handler configuration map, for example:
Config = #{config => #{file => "./sample.log"}, level => debug}.
And add the handler:
logger:add_handler(to_file_handler, logger_std_h, Config).
logger_std_h is the standard handler for Logger.
3. Filter out logs below a certain level on Erlang shell
With the configuration above, logs of all levels will be printed to the shell. To restore the notice default for the shell (but still save logs of every level to the file), use logger:set_handler_config/3.
logger:set_handler_config(default, level, notice).
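Putting the three steps together, a minimal shell session might look like this (the handler id, file name, and log calls are only illustrative):
logger:set_primary_config(level, debug).
Config = #{config => #{file => "./sample.log"}, level => debug}.
logger:add_handler(to_file_handler, logger_std_h, Config).
logger:set_handler_config(default, level, notice).
logger:debug("written only to ./sample.log").
logger:notice("written to the shell and to ./sample.log").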
Work in progress: log each process's events into its own logfile
This section documents my (partially successful) attempts; I will revisit and expand it when time permits. My use case was that the FreeSWITCH phone server would spawn an Erlang process to handle each call, so logging each of them to its own file made sense at the time.

Redis - monitoring maximum memory before inserts fail?

Related Q&As do not address the actual issue: how can a client (e.g. redis-py) detect that Redis is running out of memory, limited not by the machine but by the maxmemory configuration? Before inserts fail, which command should the program use to detect that Redis is about to be full?
My first guess is: run INFO and check whether used_memory_peak < the maxmemory setting. Is this correct?
(As an aside: for running out of actual machine memory, which setting should be used, given fragmentation? None of the returned INFO fields seem to help there.)
Or should I just attempt an insert and see if it fails (but that would be after the fact)?
Trial and error turned out to be good enough. I tested by running:
while true; do redis-cli lpush mm longstringhere; done
Once maxmemory - used_memory dropped below roughly 0.1 MB, inserts started failing with:
(error) OOM command not allowed when used memory > 'maxmemory'.
So I poll Redis via the redis-py client and, once the difference drops below a 1 MB threshold, raise an error. Make sure the extra memory used by your largest command is also below the threshold, otherwise you will still hit the limit on insert.
I also wanted to calculate the approximate percentage of used memory so I get notified much earlier, e.g. at 90% of maxmemory; the code below works for that.
Info dump:
# Memory
used_memory:3126272
used_memory_human:2.98M
used_memory_rss:5292032
used_memory_rss_human:5.05M
used_memory_peak:4914296
used_memory_peak_human:4.69M
used_memory_peak_perc:63.62%
used_memory_overhead:696654...
Furthermore, maxmemory is not a hard cap; usage can still creep slightly past it when you keep writing, e.g. by adding members to an existing set:
used_memory:3162584
used_memory_human:3.02M
Code to get the percentage (0-100), assuming r is a redis-py client and math has been imported:
rmem_info = r.info(section='memory')
redis_mem_percent = math.ceil(rmem_info['used_memory'] / rmem_info['maxmemory'] * 100)
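A rough sketch of the full polling approach with redis-py (the threshold, key name, and exception type are illustrative, and note that maxmemory is reported as 0 when no limit is configured):
import math
import redis

THRESHOLD_PERCENT = 90  # illustrative; leave headroom for your largest single write

class RedisAlmostFullError(Exception):
    pass

def guarded_lpush(r, key, value):
    mem = r.info(section='memory')
    if mem.get('maxmemory', 0) > 0:  # 0 means no maxmemory limit is set
        percent = math.ceil(mem['used_memory'] / mem['maxmemory'] * 100)
        if percent >= THRESHOLD_PERCENT:
            raise RedisAlmostFullError('Redis at %d%% of maxmemory' % percent)
    r.lpush(key, value)

guarded_lpush(redis.Redis(), 'mm', 'longstringhere')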

Dataflow pipeline is dropping events during processing when using outputWithTimestamp

I have a Cloud Dataflow pipeline in which I alter the original timestamp for the event in order to simulate real world scenarios of events arriving late. However, it appears I'm dropping some percentage of my events on each run of the pipeline. Inside my DoFn I use the following code to change the timestamp:
Instant newTimestamp = originalTimestamp.minus(Duration.standardMinutes(RANDOM.nextInt(15)));
c.outputWithTimestamp(KV.of(Integer.toString(RANDOM.nextInt(100)), element), newTimestamp);
The problem is most likely caused by your DoFn step outputting a timestamp that is earlier than the timestamp that was received by the processing step minus the allowed timestamp skew. The exception that would be thrown can be found here in the code:
https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunnerBase.java#L493
This behavior is documented with regard to using outputWithTimestamp here:
https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/transforms/DoFn.Context#outputWithTimestamp-OutputT-org.joda.time.Instant-
While you could override the getAllowedTimestampSkew function, it is also documented that this might cause unpredictable issues with the watermark calculations, so it should only be used without windowing/grouping.
https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/transforms/DoFn#getAllowedTimestampSkew--
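For illustration, a rough sketch (using the Dataflow Java SDK 1.x classes referenced above) of a DoFn that shifts timestamps back and widens the allowed skew to match; the class and variable names are made up, and the caveat above about windowing/grouping still applies:
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.values.KV;
import java.util.Random;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class ShiftTimestampFn extends DoFn<String, KV<String, String>> {
  private static final Random RANDOM = new Random();

  @Override
  public Duration getAllowedTimestampSkew() {
    // Permit output timestamps up to 15 minutes earlier than the element's timestamp.
    return Duration.standardMinutes(15);
  }

  @Override
  public void processElement(ProcessContext c) {
    // Shift the timestamp back by 0-14 minutes to simulate late-arriving events.
    Instant newTimestamp = c.timestamp().minus(Duration.standardMinutes(RANDOM.nextInt(15)));
    c.outputWithTimestamp(KV.of(Integer.toString(RANDOM.nextInt(100)), c.element()), newTimestamp);
  }
}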

Write to the system's standard error in Progress

I am writing a small program in Progress that needs to write an error message to the system's standard error. What ways, ideally simple ones, can I use to print to standard error?
I am using OpenEdge 11.3.
When on Windows (10.2B+) you can use .NET:
System.Console:Error:WriteLine ("This is an error message") .
together with
prowin32 2> stderr.out
Progress doesn't provide a way to write to stderr - the easiest way I can think of is to output-through an external program that takes stdin and echoes it to stderr.
You could look into LOG-MANAGER:WRITE-MESSAGE. It won't log to standard output or standard error, but to a client-specific log. This log should be monitored in any case (specifically if the client is an application server).
From the documentation:
For an interactive or batch client, the WRITE-MESSAGE( ) method writes the log entries to the log file specified by the LOGFILE-NAME attribute or the Client Logging (-clientlog) startup parameter. For WebSpeed agents and AppServer servers, the WRITE-MESSAGE() method writes the log entries to the server log file. For DataServers, the WRITE-MESSAGE() method writes the log entries to the log file specified by the DataServer Logging (-dslog) startup parameter.
LOG-MANAGER:WRITE-MESSAGE("Got here, x=" + STRING(x), "DEBUG1").
Will write this in the log:
[04/12/05#13:19:19.742-0500] P-003616 T-001984 1 4GL DEBUG1 Got here, x=5
There are quite a lot of options regarding the LOG-MANAGER system, what messages to display, where the file is placed, etc.
There is no easy way, but in Unixen you can always do something like this using OUTPUT THROUGH (untested):
output through "cat >&2" no-echo unbuffered.
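A slightly fuller sketch of that approach (Unix only, and equally untested):
OUTPUT THROUGH "cat >&2" NO-ECHO UNBUFFERED.
PUT UNFORMATTED "This is an error message" SKIP.
OUTPUT CLOSE.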
Alternatively -- and this is tested -- if you just want error messages from a batch-mode program to go to standard out then
output through "tee" ...
...definitely works.

Collecting log4j logs using Apache Flume

Hi guys, I've run into a problem. I use log4j and Apache Flume to collect logs. The architecture sends log4j output to a remote Flume agent, configured like this:
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=192.168.152.49
log4j.appender.flume.Port=44446
log4j.appender.flume.layout=org.apache.log4j.PatternLayout
while the Flume source is configured like this:
a1.sources.r1.type=avro
a1.sources.r1.bind=192.168.152.49
a1.sources.r1.port=44446
It works! But the problem is that when Flume is shut down, the application using log4j can't log anything. Can anybody tell me how to fix this problem?
It depends on how you want to handle Flume being down. With the regular Log4jAppender, you can enable unsafe mode which will log the error in the log4j LogLog, but otherwise fail silently. To do that you can set log4j.appender.flume.UnsafeMode = true. You can see an example here:
https://github.com/kite-sdk/kite-examples/blob/master/logging/src/main/resources/log4j.properties#L20
With unsafe enabled, any events you log while Flume is down will be lost.
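For example, the appender configuration from the question with unsafe mode enabled would look roughly like this:
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=192.168.152.49
log4j.appender.flume.Port=44446
log4j.appender.flume.UnsafeMode=true
log4j.appender.flume.layout=org.apache.log4j.PatternLayout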
If you want to be able to point to multiple Flume agents and have it balance the load between them as well as fail over if one of them goes down, you can use the LoadBalancingLog4jAppender instead. The docs here should help:
http://flume.apache.org/FlumeUserGuide.html#load-balancing-log4j-appender
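A sketch of such a load-balancing configuration (the second agent's address is illustrative):
log4j.appender.flume=org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender
log4j.appender.flume.Hosts=192.168.152.49:44446 192.168.152.50:44446
log4j.appender.flume.Selector=ROUND_ROBIN
log4j.appender.flume.UnsafeMode=true
log4j.appender.flume.layout=org.apache.log4j.PatternLayout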

Resources