How to disable crash dump for the gen_server - erlang

There is a gen_server that keeps some sensitive information in its state (a password and so on).
Lager is enabled,
so in case of a crash the gen_server's state is dumped to the crash log like:
yyyy-mm-dd hh:mm:ss =ERROR REPORT====
** Generic server XXX terminating
** Last message in was ...
** When Server state == {state, ...}
** Reason for termination ==
As a result, sensitive information is written to the log file.
Is there any way to prevent the state of the gen_server from being written to the log files/crash dumps?

You could implement the optional format_status callback function. That means that whenever the gen_server crashes, you get the chance to format the state data to your liking before it gets logged, for example by removing sensitive information.
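A minimal sketch of such a callback, assuming the state is a #state{} record with a password field (both names are placeholders for your own module):

%% Scrub sensitive fields before the state is reported.
%% Remember to export format_status/2 from the callback module.
format_status(terminate, [_PDict, State]) ->
    scrub(State);
format_status(normal, [_PDict, State]) ->
    [{data, [{"State", scrub(State)}]}].

scrub(#state{} = State) ->
    State#state{password = "<redacted>"}.

The terminate clause is the one used for the crash report; the normal clause covers what sys:get_status/1 shows.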

You can add this into your app.config:
{lager, [{error_logger_redirect, false}]}
to prevent lager from redirecting error logs. You should also try to catch the error that causes the gen_server to crash and handle it in some graceful way. Or you can keep passwords salted and just let it crash.
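If you go the graceful-handling route, a minimal sketch could look like this (do_sensitive_work/1 is a hypothetical helper that may throw or raise an error):

%% Catch the failure inside the callback and reply with an error tuple,
%% so the process never crashes and the state never reaches the crash log.
handle_call({login, Password}, _From, State) ->
    try do_sensitive_work(Password) of
        Result ->
            {reply, {ok, Result}, State}
    catch
        Class:Reason ->
            {reply, {error, {Class, Reason}}, State}
    end.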

Related

What causes dask job failure with CancelledError exception

I have been seeing the error message below for quite some time now but could not figure out what leads to the failure.
Error:
concurrent.futures._base.CancelledError: ('sort_index-f23b0553686b95f2d91d4a3fda85f229', 7)
After a restart of the dask cluster it runs successfully.
If running a dask-cloudprovider ECSCluster or FargateCluster the concurrent.futures._base.CancelledError can result from a long-running step in computation where there is no output (logging or otherwise) to the Client. In these cases, due to the lack of interaction with the client, the scheduler regards itself as "idle" and times out after the configured cloudprovider.ecs.scheduler_timeout period, which defaults to 5 minutes. The CancelledError error message is misleading, but if you look in the logs for the scheduler task itself it will record the idle timeout.
The solution is to set scheduler_timeout to a higher value, either via config or by passing directly to the ECSCluster/FargateCluster constructor.

Dataflow concurrency error with ValueState

The Beam 2.1 pipeline uses ValueState in a stateful DoFn. It runs fine with a single worker, but when scaling is enabled it fails with "Unable to read value from state" and the root exception below. Any ideas what could cause this?
Caused by: java.util.concurrent.ExecutionException: com.google.cloud.dataflow.worker.KeyTokenInvalidException: Unable to fetch data due to token mismatch for key ��
at com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:500)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:459)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:76)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
at com.google.cloud.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:309)
at com.google.cloud.dataflow.worker.WindmillStateInternals$WindmillValue.read(WindmillStateInternals.java:384)
... 16 more
Caused by: com.google.cloud.dataflow.worker.KeyTokenInvalidException: Unable to fetch data due to token mismatch for key ��
at com.google.cloud.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:469)
at com.google.cloud.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:411)
at com.google.cloud.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:306)
... 17 more
I believe that exception should just be rethrown. It is thrown by the state mechanism to indicate that additional work on that key should not be performed, and will be automatically retried by the Dataflow runner.
These typically indicate that the particular work item should be performed on a different worker, so proceeding on the current worker would not be helpful.
It may be possible that misusing state -- storing the state object from one key and attempting to use it on a different key -- could also lead to these errors. If that is the case, you may be able to see more diagnostic messages in either the worker or shuffler logs in Stackdriver logging.
If neither retrying nor looking at logging and how you use the state objects help, please provide a job ID demonstrating the problem.

The Server state which error_logger prints: is it the state when handling "Last message in was"?

Forgive my poor English.
What I mean is: when a gen_server crashes, error_logger prints "Last message in was" and "When Server state ==". Is the Server state value the state from before handling the last message?
In a gen_server, the state is stored by the generic part of the code, which is supposed to be robust. It is changed by callback functions which have to return the new state value as their result.
When the system reports an error, the state reported is the one that was passed as a parameter to the callback that is responsible for the crash.
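To illustrate, assume a handle_call clause that deliberately fails (the boom request is made up for this example):

%% The crash report shows State exactly as it was passed in here,
%% i.e. the state from before the last message was handled.
handle_call(boom, _From, State) ->
    1 = 2,                          %% deliberate badmatch -> the server crashes
    {reply, ok, State}.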

Why is Erlang's "Reason for termination" normal?

Hi all,
here is the relevant part of the log:
** Reason for termination == **
{normal,
{gen_server,call,
[<0.9723.458>,
{create_jtxn_mon,
{player,34125,0,"gulexi",
Why does it report an error log entry when the reason is normal?
Thanks for your help!
It seems like you made a call to a gen_server that exited with reason normal before it sent a response to the caller.
In general, if a gen_server exits with reason ServerExitReason during a call, gen_server:call will exit with the exit reason {ServerExitReason, {gen_server, call, [...]}}, even if ServerExitReason is normal. (See the source)
That is, the exit reason is not normal but {normal, ...}, and that's why you get a log message.
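For illustration, a caller that wants to treat a normally-stopping server as a non-fatal condition can match on that shape (Pid and Request are placeholders):

%% Catch the {normal, ...} exit that gen_server:call raises when the
%% server terminates normally before sending a reply.
safe_call(Pid, Request) ->
    try
        gen_server:call(Pid, Request)
    catch
        exit:{normal, {gen_server, call, _Args}} ->
            {error, server_stopped}
    end.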

Is it possible to get the stacktrace of an error in the error_logger handler?

I'm currently writing an error_logger handler and would like to get the stack trace of where the error happened (more precisely: at the place where error_logger:error* was called). But I cannot use the erlang:get_stacktrace() function, since I'm in a different process.
Does anyone know a way to get a stacktrace here?
Thanks
get_stacktrace() returns "stack back-trace of the last exception". Throw and catch an exception inside error_logger:error() and then you can get the stacktrace.
error() ->
    try throw(a) of
        _ -> a
    catch
        _:_ ->
            %% io:format/2 expects its arguments in a list
            io:format("track is ~p~n", [erlang:get_stacktrace()])
    end.
I have not fully debugged it, but I suppose that the error functions simply send a message (fire and forget) to the error logger process, so at the time your handler is called after the message has been received, the sender might be doing something completely different. The message sent might contain the backtrace, but I highly doubt it.
