In my application I have about 100 continuous Esper filter queries with events being sent in. At some point for an unknown reason some of statemens stop matching events and never match any further event without ever throwing an exception (nothing logged in log4j default logging setup). This is not reproducible in a small example, and I realize that it's difficult to pinpoint a problem like this, but I'm writing this in the hopes of this being a known and/or fixed issue.
I would suggest to review your application code to make sure it is still sending events and the listener/subscriber code to make sure it is still processing output events, i.e. exception handling and logging. Or perhaps an OOM occurs and logging doesn't happen and thus you may want to check heap memory use. Also look at console out see if the JVM has encountered an issue.
Related
I am working on a Delphi application using the TService functionalities.
It is a large project that was started by someone else.
The application uses several separate threads for processing, communicating with clients, database access, etc. The application’s main job is to poll regularly (every 2-300ms) certain devices and, a couple times a day, execute specific actions.
Now somewhere there is an unhandled exception for which I cannot seem to find the cause:
According to the debugger, the faulting method is System.Classes.StdWndProc.
This also seems to be confirmed by analyzing the crash dump file with WinDbg.
After numerous tests and debugging, I noticed that the crash happened on my dev computer every day at almost the same time.
I looked at the windows log and found this event:
This, coupled with the fact that the stack trace from Delphi indicates that a message with ID=26 (WM_WININICHANGE) was processed, made me believe that there might be something wrong with my usage of FormatDateTime() or DateTimeToStr() when regional settings are reloaded.
I checked every call and made sure to be using the thread-safe overload with a local instance of TFormatSettings.
However today the service crashed again.
A few points that I think are worth mentioning:
The application is also installed on a Windows 2008 server and has
been running OK for over a month.
On 2012r2 I tried forcing DEP off, but it didn’t change anything.
The service’s OnExecute() method is not implemented. I create a base thread in TService. ServiceStart() which then in turn creates the main data module and all the other threads.
The service is not marked as interactive and is executed with the Local System account.
All data modules are created with AOwner=nil.
With a special parameter, the application can be started a normal windowed application with a main form (which is created only in this case). The exception does not seem to happen when running in GUI mode.
Almost all threads have a message pump and use PostThreadMessage() to exchange information. There are no window handle allocations anywhere.
I have checked the whole project and there are no timers or message dialogs or other graphical components anywhere.
I have activated range as well as overflow checking and found no issues.
I am a loss for what to do here. I have checked and re-checked the code several times without finding anything that could explain the error.
Looking online I found several reports that seem to be pertinent to my situation, but none that actually explains what is going on:
https://answers.microsoft.com/en-us/windows/forum/windows_10-other_settings/windows-10-group-policy-application-hang/72016ea4-ba89-4770-b1de-6ddf14b0a51f
https://www.experts-exchange.com/questions/20720591/Prevent-regional-settings-from-changing-inside-a-TService-class.html
https://forums.embarcadero.com/thread.jspa?messageID=832265
Before taking everything apart I would like to know if anyone experienced anything similar or has any tips.
Thanks
Edit: looking at the call stack and the first method executed I am thinking of the TApplication instance that is used in TService.
LPARAM is always 648680.
LPARAM is a pointer to a string (you can cast it to PChar).
You could try to catch WM_SETTINGS_CHANGE and log anytime it's processing.
I think you could use TApplicationEvents component. The component has OnSettingsChange event. The event provides a setting area name (section name) and a flag that points to changed parameter.
Have a look at doc-wiki.
There is a full list of possible parameters.
I did not test it, but maybe you can use Abort procedure in OnSettingsChange event handler to stop the message distribution. Of course it's not a solution, but it may work.
I am running into out of memory issues after a certain amount of time when I run my quickfixj app. After a little investigation I found out that this was being caused by messages that quickfixj caches for re sending when a resend request is received.
So for testing I set this flag to N on a particular session. After that my memory problems completely disappeared. But I do not understand why quickfixj is keeping these message in memory when I have properly set this property : FileStorePath. These messages should be stored into a file but they are not. I do see some files present in the directory I set in FileStorePath but none of them seems to be storing messages, I can only see sequence number in them. Do I need to set other flags besides this in order to make this work?
I do not plan to use PersisMessages flag outside testing. I would prefer FileStoreMaxCachedMsgs flag with a reasonable figure. I also need to know what will happen if my app receives a resend request when I have set PersisMessages to N? Will quickfixj send gapfills instead or will it crash with some exception message?
Thanks
i ve found that quickfixj sends gap fills when it could not find messages. also the config flag FileStoreMaxCachedMsgs is used to tell quickfixj about how much messages it should keep in cache before pushing them down to files. so this flag, in my opinion, should be altered in order to get your app to work without running out of memory due to message caching.
hope it ll be helpful for somebody. Thanks
I've started using SmartInspect in my Delphi applications because my users were running into bugs/problems I couldn't reproduce on my machine. When I have a general idea of the problem I'll monitor the application in a few specific places to confirm what is or is not working.
When the bug doesn't have an obvious cause, I feel lost. I don't know where to start logging in order to narrow down the problem. Are there common techniques or best practices for using a logger?
SmartInspect seems to be quite powerful, but I don't know quite what to log or how to organise my logs so the data is meaningful and useful for catching bugs.
NOTE: I'm using SmartInspect but I assume the answers should be suitable for any logging package.
Here are some guidelines I tried to implement in my own OpenSource logging unit, but it's fairly generic, and as you state, it should be suitable for any logging package:
Make several levels (we use sets) of logging, to tune the logging information needed;
Log all exceptions, even the handled one with a try...except block - and add an exception classes list not worth logging (e.g. EConvertError) - e.g. our unit is able to log all exceptions via a global exception "hook" (no try..except to add in your code), and handle a list of exception classes to be ignored;
Log all "fatal" errors, like database connection issues, or wrong SQL syntax - should be done though "log all exceptions" previous item;
For such exceptions, log the stack trace to know about the calling context;
Be able to log all SQL statements, or database access;
Add a generic User Interface logging, to know which main functions of the software the User did trigger (e.g. for every toolbar button or menu items): it's very common that the user said 'I have this on my screen/report, but I didn't do anything'... and when you see the log, you will discover that the "anything" was done. ;)
Monitor the main methods of your application, and associated parameters;
Logging is an evolving feature: use those general rules above, then tune your logging from experiment, according to your debugging needs.
For UI-driven applications here are the main things I instrument first:
ActionManager or ActionList's events when an action executes (gives me a user clicked here then here then here list).
Unhandled Exceptions with tracebacks using JCL debug go right in my main log, whereas if I was using MadExcept or EurekaLog, exceptions have their own log.
Background thread starts, stops and significant history events
Warnings, errors, API function failures, file access failures, handled (caught) exceptions.
Current memory usage can be useful for long running processes to see if there are memory leaks (which could lead to an out of memory error some day).
We have an exception that pops up on our website that is getting reported to BugzScout many times a day. The functionality that produces the exception still does what it's intended to do, so we just want to stop FogBugz from piling up all of these occurrences until we have a chance to dig into the issue and prevent the exception.
That said, is there a way to set up a filter on the FogBugz side of things to ignore a list of exceptions that get reported? I know I can set up some logic in our app's BugzScout class so it stops sending those messages, but it would be nice to know if FogBugz does this already before I put the time into building that filter locally. We are using the hosted FogBugz On Demand version of the product if that makes a difference.
In the BugzScout case itself, you should be able to set "Stop Reporting" for the Scout Will setting. This way, only the occurrences will increment when the exception is reported. The case will not reopen or notify anyone.
It sounds a bit from your description that there are many different exceptions reporting to the same ScoutDescription. As much as possible, you should use version numbers and exception line numbers to make sure that exceptions are reported separately. I can elaborate on this if you want.
I have just started using an exception logger (EurekaLog) in my (Delphi) application. Now my application sends me lots of error messages via e-mail every day. Here's what I found out so far
lots of duplicate errors
multiple mails from the same PC
While this is highly valuable input to improve my application, I am slightly overwhelmed by the sheer amount of information I'm getting.
What are your best practices for handling mails from your application?
If you get to much information as it is currently the case you are not getting any information at all.
So I would say categorize your errors into groups, like WARNINGS, FATAL ERRORS, etc.
Then limit your emails to the most important messages (FATAL).
Apart from that review your logs on a regular basis (day, week ...).
What I've done with my exception logging, which uses madExcept as the core, but my own transport mechanism, is have them all go into a database. The core information is all extracted from each report and put into fields, and the whole report is stored as well. The stack trace is automatically analysed to remove the un-interesting functions, leaving a list of only my functions that have failed.
With this happening automatically, I can now "ignore" each individual message coming in, but see the bigger picture in a grid that shows me simply which functions are having the most problems. I can then focus on them, look for the causes, and fix them.
My display app is also able to filter out reports in builds before a certain number if I choose, so that I can tell it not to include "MyWidget.BadProc" before build 75 once I've fixed it.
This has helped me improve my app, and hit the problems that people found most problematic, without having to guess.
It would very much depend on what the errors are that are being sent back. The obvious one being if there are errors in your application, they need fixing and patches/updates sent to your clients.
If they are exceptions that you know can happen and do not required you to be notified you can add "Exception Filters" in the Eureaka Log options to specify how they should be handled (or ignored!).
Another option is to use EurekaLog Variables (where you can add the exception description etc) in the mail Subject line and then use your email client to filter based on this.
I did this using madExcept. It's really useful for tracking down problems we couldn't reproduce ourselves.
Which makes me ask why you are getting so many? Untrapped exceptions should be few and far between. Especially if the user sees an error dialog. I was responsible for several applications, each with hundreds of installs and I would rarely get e-mail notices.
If they are mostly from a very small number of PCs, I'd work with some of those users to find out what they're doing differently, or how their setup might be generating exceptions.
If they are from all over the place, it's probably a bug that got through your testing.
Either way, use the details to fix your code or, at the very least, anticipate known exceptions and trap them properly (no empty try..except).
Fixing the hot spot problems will cut way down on the number of e-mails you get, making the occasional notice much more manageable.
I think that you should throw out all duplicates. Leave only count of the reports. I.e. if you get, say 100 reports, but there are only 4 unique problems - leave only 4 reports, throw out other 96 reports, but use their count to sort reports by severity. For example, 6 reports for fourth problem, 10 reports for third, 20 for second and 60 for first. So, you should fix first problem with 60 reports and only then switch to second.
I believe that EurekaLog has BugID in its reports. Same problem has the same BugID. This will allow you to sort reports with duplicates. EurekaLog Viewer also can sort out duplicates.