What do you log in your desktop applications to improve stability? - delphi

I've started using SmartInspect in my Delphi applications because my users were running into bugs/problems I couldn't reproduce on my machine. When I have a general idea of the problem I'll monitor the application in a few specific places to confirm what is or is not working.
When the bug doesn't have an obvious cause, I feel lost. I don't know where to start logging in order to narrow down the problem. Are there common techniques or best practices for using a logger?
SmartInspect seems to be quite powerful, but I don't know quite what to log or how to organise my logs so the data is meaningful and useful for catching bugs.
NOTE: I'm using SmartInspect but I assume the answers should be suitable for any logging package.

Here are some guidelines I tried to implement in my own OpenSource logging unit, but it's fairly generic, and as you state, it should be suitable for any logging package:
Make several levels (we use sets) of logging, to tune the logging information needed;
Log all exceptions, even the handled one with a try...except block - and add an exception classes list not worth logging (e.g. EConvertError) - e.g. our unit is able to log all exceptions via a global exception "hook" (no try..except to add in your code), and handle a list of exception classes to be ignored;
Log all "fatal" errors, like database connection issues, or wrong SQL syntax - should be done though "log all exceptions" previous item;
For such exceptions, log the stack trace to know about the calling context;
Be able to log all SQL statements, or database access;
Add a generic User Interface logging, to know which main functions of the software the User did trigger (e.g. for every toolbar button or menu items): it's very common that the user said 'I have this on my screen/report, but I didn't do anything'... and when you see the log, you will discover that the "anything" was done. ;)
Monitor the main methods of your application, and associated parameters;
Logging is an evolving feature: use those general rules above, then tune your logging from experiment, according to your debugging needs.

For UI-driven applications here are the main things I instrument first:
ActionManager or ActionList's events when an action executes (gives me a user clicked here then here then here list).
Unhandled Exceptions with tracebacks using JCL debug go right in my main log, whereas if I was using MadExcept or EurekaLog, exceptions have their own log.
Background thread starts, stops and significant history events
Warnings, errors, API function failures, file access failures, handled (caught) exceptions.

Current memory usage can be useful for long running processes to see if there are memory leaks (which could lead to an out of memory error some day).

Related

Access Violation when pressing a button that uses a separate form for validation [duplicate]

What tips can you share to help locate and fix access violations when writing applications in Delphi?
I believe access violations are usually caused by trying to access something in memory that has not yet been created such as an Object etc?
I find it hard to identify what triggers the access violations and then where to make the required changes to try and stop/fix them.
A example is a personal project I am working on now. I am storing in TTreeView Node.Data property some data for each node. Nodes can be multiple selected and exported (the export iterates through each selected node and saves specific data to a text file - the information saved to the text file is what is stored in the nodes.data). Files can also be imported into the Treeview (saving the contents of the text files into the node.data).
The issue in that example is if I import files into the Treeview and then export them, it works perfect. However if I add a node at runtime and export them I get:
"Access Violation at address 00405772 in module 'Project1.exe'. Read of address 00000388."
My thoughts on that must be the way I am assigning the data to created nodes, maybe differently to the way I assign it when they are imported, but it all looks ok to me. The access violation only shows up when exporting, and this never happens with imported files.
I am NOT looking for a fix to the above example, but mainly advice/tips how to find and fix such type of errors. I don't often get access violations, but when I do they are really hard to track down and fix.
So advice and tips would be very useful.
It means your code is accessing some part of the memory it isn't allowed to. That usually means you have a pointer or object reference pointing to the wrong memory. Maybe because it is not initialized or is already released.
Use a debugger, like Delphi. It will tell you on what line of code the AV occurred. From there figure out your problem by looking at the callstack and local variables etc. Sometimes it also helps if you compile with Debug DCUs.
If you don't have a debugger because it only happens on a client side, you might want to use MadExcept or JclDebug to log the exception with callstack and have it send to you. It gives you less details but might point you in the right direction.
There are some tools that might be able to find these kind of problems earlier by checking more aggressively. The FastMM memory manager has such options.
EDIT
"Access Violation at address 00405772
in module 'Project1.exe'. Read of
address 00000388."
So your problem results in a AV at addresss 00405772 in module 'Project1.exe'. The Delphi debugger will bring you to the right line of code (or use Find Error).
It is trying to read memory at address 00000388. That is pretty close to 00000000 (nil), so that would probably mean accessing some pointer/reference to an array or dynamic array that is nil. If it was an array of bytes, it would be item 388. Or it could be a field of a rather large object or record with lots of fields. The object or record pointer/reference would be nil.
I find that the really hard-to-find access violations don't always occur while I'm running in a debugger. Worse yet, they happen to customers and not to me. The accepted answer mentions this, but I really think it should be given more detail: MadExcept provides a stack traceback which gives me valuable context information and helps me see where the code fails, or has unhandled exceptions (it's not just for access violations). It even provides a way for customers to email you the bug reports right from inside your program. That leads to more access violations found and fixed, reported by your beta testers, or your users.
Secondly, I have noticed that compiler hints and warnings are in fact detecting for you, some of the common problems. Clean up hints and warnings and you might find many access violations and other subtle problems. Forgetting to declare your destructors properly, for example, can lead to a compiler warning, but to serious problems at runtime.
Thirdly, I've found tools like Pascal Analyzer from Peganza, and the audits-and-metrics feature in some editions of Delphi, can help you find areas of your code that have problems. As a single concrete example, Pascal Analyzer has found places where I forgot to do something important, that lead to a crash or access violation.
Fourth, you can hardly beat the technique of having another developer critique your code. You might feel a bit sheepish afterwards, but you're going to learn something, hopefully, and get better at doing what you're doing. Chances are, there is more way than one to use a tree view, and more way than one to do the work you're doing, and a better architecture, and a clean way of doing things is going to result in more reliable code that doesn't break each time you touch it. THere is not a finite list of rules to produce clean code, it is rather, a lifetime effort, and a matter of degrees. You'd be surprised how innocent looking code can be a hotbed of potential crashes, access violations, race conditions, freezes and deadlocks.
I would like to mention one more tool, that I use when other tools fail to detect AV. It's SafeMM (newer version). Once it pointed me to the small 5 line procedure. And I had to look more than 10 minutes at it, in order to see the AV that happened there. Probably that day my programming skills wasn't at their maximum, but you know, bad thing tend to happen exactly at such days.
Just want to mention other debugging or "code guard" techniques that were not mentioned in previous answers:
"Local" tools:
* Use FastMM in DebugMode - have it write zeros each time it dealocates memory. This will make your program PAINFULLY slow but you have a HUGE chance to find errors like trying to access a freed object.
* Use FreeAndNil(Obj) instead of Obj.Free. Some, people were complaining about it as creating problems but without actually providing a clear example where this might happen. Additionally, Emarcadero recently added the recommendation to use FreeAndNil in their manual (finally!).
* ALWAYS compile the application in Release and Debug mode. Make sure the Project Options are correctly set for debug mode. The DEFAULT settings for Debug mode are NOT correct/complete - at last not in Delphi XE7 and Tokyo. Maybe one day they will set the correct options for Debug mode. So, enable things like:
"Stack frames"
"Map file generation (detailed)"
"Range checking",
"Symbol reference info"
"Debug information"
"Overflow checking"
"Assertions"
"Debug DCUs"
Deactivate the "Compiler optimizations"!
3rd Party Tools:
Use MadShi or EurekaLog (I would recommend MadShi over EurekaLog)
Use Microsoft's ApplicationVerfier

Recurrent exception in Delphi TService on W10 and Server 2012r2

I am working on a Delphi application using the TService functionalities.
It is a large project that was started by someone else.
The application uses several separate threads for processing, communicating with clients, database access, etc. The application’s main job is to poll regularly (every 2-300ms) certain devices and, a couple times a day, execute specific actions.
Now somewhere there is an unhandled exception for which I cannot seem to find the cause:
According to the debugger, the faulting method is System.Classes.StdWndProc.
This also seems to be confirmed by analyzing the crash dump file with WinDbg.
After numerous tests and debugging, I noticed that the crash happened on my dev computer every day at almost the same time.
I looked at the windows log and found this event:
This, coupled with the fact that the stack trace from Delphi indicates that a message with ID=26 (WM_WININICHANGE) was processed, made me believe that there might be something wrong with my usage of FormatDateTime() or DateTimeToStr() when regional settings are reloaded.
I checked every call and made sure to be using the thread-safe overload with a local instance of TFormatSettings.
However today the service crashed again.
A few points that I think are worth mentioning:
The application is also installed on a Windows 2008 server and has
been running OK for over a month.
On 2012r2 I tried forcing DEP off, but it didn’t change anything.
The service’s OnExecute() method is not implemented. I create a base thread in TService. ServiceStart() which then in turn creates the main data module and all the other threads.
The service is not marked as interactive and is executed with the Local System account.
All data modules are created with AOwner=nil.
With a special parameter, the application can be started a normal windowed application with a main form (which is created only in this case). The exception does not seem to happen when running in GUI mode.
Almost all threads have a message pump and use PostThreadMessage() to exchange information. There are no window handle allocations anywhere.
I have checked the whole project and there are no timers or message dialogs or other graphical components anywhere.
I have activated range as well as overflow checking and found no issues.
I am a loss for what to do here. I have checked and re-checked the code several times without finding anything that could explain the error.
Looking online I found several reports that seem to be pertinent to my situation, but none that actually explains what is going on:
https://answers.microsoft.com/en-us/windows/forum/windows_10-other_settings/windows-10-group-policy-application-hang/72016ea4-ba89-4770-b1de-6ddf14b0a51f
https://www.experts-exchange.com/questions/20720591/Prevent-regional-settings-from-changing-inside-a-TService-class.html
https://forums.embarcadero.com/thread.jspa?messageID=832265
Before taking everything apart I would like to know if anyone experienced anything similar or has any tips.
Thanks
Edit: looking at the call stack and the first method executed I am thinking of the TApplication instance that is used in TService.
LPARAM is always 648680.
LPARAM is a pointer to a string (you can cast it to PChar).
You could try to catch WM_SETTINGS_CHANGE and log anytime it's processing.
I think you could use TApplicationEvents component. The component has OnSettingsChange event. The event provides a setting area name (section name) and a flag that points to changed parameter.
Have a look at doc-wiki.
There is a full list of possible parameters.
I did not test it, but maybe you can use Abort procedure in OnSettingsChange event handler to stop the message distribution. Of course it's not a solution, but it may work.

Access violation, Delphi 2005 TADOQuery [duplicate]

What tips can you share to help locate and fix access violations when writing applications in Delphi?
I believe access violations are usually caused by trying to access something in memory that has not yet been created such as an Object etc?
I find it hard to identify what triggers the access violations and then where to make the required changes to try and stop/fix them.
A example is a personal project I am working on now. I am storing in TTreeView Node.Data property some data for each node. Nodes can be multiple selected and exported (the export iterates through each selected node and saves specific data to a text file - the information saved to the text file is what is stored in the nodes.data). Files can also be imported into the Treeview (saving the contents of the text files into the node.data).
The issue in that example is if I import files into the Treeview and then export them, it works perfect. However if I add a node at runtime and export them I get:
"Access Violation at address 00405772 in module 'Project1.exe'. Read of address 00000388."
My thoughts on that must be the way I am assigning the data to created nodes, maybe differently to the way I assign it when they are imported, but it all looks ok to me. The access violation only shows up when exporting, and this never happens with imported files.
I am NOT looking for a fix to the above example, but mainly advice/tips how to find and fix such type of errors. I don't often get access violations, but when I do they are really hard to track down and fix.
So advice and tips would be very useful.
It means your code is accessing some part of the memory it isn't allowed to. That usually means you have a pointer or object reference pointing to the wrong memory. Maybe because it is not initialized or is already released.
Use a debugger, like Delphi. It will tell you on what line of code the AV occurred. From there figure out your problem by looking at the callstack and local variables etc. Sometimes it also helps if you compile with Debug DCUs.
If you don't have a debugger because it only happens on a client side, you might want to use MadExcept or JclDebug to log the exception with callstack and have it send to you. It gives you less details but might point you in the right direction.
There are some tools that might be able to find these kind of problems earlier by checking more aggressively. The FastMM memory manager has such options.
EDIT
"Access Violation at address 00405772
in module 'Project1.exe'. Read of
address 00000388."
So your problem results in a AV at addresss 00405772 in module 'Project1.exe'. The Delphi debugger will bring you to the right line of code (or use Find Error).
It is trying to read memory at address 00000388. That is pretty close to 00000000 (nil), so that would probably mean accessing some pointer/reference to an array or dynamic array that is nil. If it was an array of bytes, it would be item 388. Or it could be a field of a rather large object or record with lots of fields. The object or record pointer/reference would be nil.
I find that the really hard-to-find access violations don't always occur while I'm running in a debugger. Worse yet, they happen to customers and not to me. The accepted answer mentions this, but I really think it should be given more detail: MadExcept provides a stack traceback which gives me valuable context information and helps me see where the code fails, or has unhandled exceptions (it's not just for access violations). It even provides a way for customers to email you the bug reports right from inside your program. That leads to more access violations found and fixed, reported by your beta testers, or your users.
Secondly, I have noticed that compiler hints and warnings are in fact detecting for you, some of the common problems. Clean up hints and warnings and you might find many access violations and other subtle problems. Forgetting to declare your destructors properly, for example, can lead to a compiler warning, but to serious problems at runtime.
Thirdly, I've found tools like Pascal Analyzer from Peganza, and the audits-and-metrics feature in some editions of Delphi, can help you find areas of your code that have problems. As a single concrete example, Pascal Analyzer has found places where I forgot to do something important, that lead to a crash or access violation.
Fourth, you can hardly beat the technique of having another developer critique your code. You might feel a bit sheepish afterwards, but you're going to learn something, hopefully, and get better at doing what you're doing. Chances are, there is more way than one to use a tree view, and more way than one to do the work you're doing, and a better architecture, and a clean way of doing things is going to result in more reliable code that doesn't break each time you touch it. THere is not a finite list of rules to produce clean code, it is rather, a lifetime effort, and a matter of degrees. You'd be surprised how innocent looking code can be a hotbed of potential crashes, access violations, race conditions, freezes and deadlocks.
I would like to mention one more tool, that I use when other tools fail to detect AV. It's SafeMM (newer version). Once it pointed me to the small 5 line procedure. And I had to look more than 10 minutes at it, in order to see the AV that happened there. Probably that day my programming skills wasn't at their maximum, but you know, bad thing tend to happen exactly at such days.
Just want to mention other debugging or "code guard" techniques that were not mentioned in previous answers:
"Local" tools:
* Use FastMM in DebugMode - have it write zeros each time it dealocates memory. This will make your program PAINFULLY slow but you have a HUGE chance to find errors like trying to access a freed object.
* Use FreeAndNil(Obj) instead of Obj.Free. Some, people were complaining about it as creating problems but without actually providing a clear example where this might happen. Additionally, Emarcadero recently added the recommendation to use FreeAndNil in their manual (finally!).
* ALWAYS compile the application in Release and Debug mode. Make sure the Project Options are correctly set for debug mode. The DEFAULT settings for Debug mode are NOT correct/complete - at last not in Delphi XE7 and Tokyo. Maybe one day they will set the correct options for Debug mode. So, enable things like:
"Stack frames"
"Map file generation (detailed)"
"Range checking",
"Symbol reference info"
"Debug information"
"Overflow checking"
"Assertions"
"Debug DCUs"
Deactivate the "Compiler optimizations"!
3rd Party Tools:
Use MadShi or EurekaLog (I would recommend MadShi over EurekaLog)
Use Microsoft's ApplicationVerfier

What information should I be logging in my web app?

I finishing up a web application and I'm trying to implement some logging. I've never seen any good examples of what to log. Is it just exceptions? Are there other things I should be logging? What type of information do you find useful for finding and fixing bugs.
Looking for some specific guidance and best practices.
Thanks
Follow up
If I'm logging exceptions what information specifically should I be logging? Should I be doing something more than _log.Error(ex.Message, ex); ?
Here is my logical breakdown of what can be logged within and application, why you might want to and how you might go about doing it. No matter what I would recommend using a logging framework such as log4net when implementing.
Exception Logging
When everything else has failed, this should not. It is a good idea to have a central means of capturing all unhanded exceptions. This shouldn't
be much harder then wrapping your entire application in a giant try/catch unless you are using more than on thread. The work doesn't end here
though because if you wait until the exception reaches you a lot of useful information would have gone out of scope. At the very least you should
try to collect specific pieces of the application state that are likely to help with debugging as the stack unwinds. Your application should always be prepared to produce this type of log output, especially in production. Make sure to take a look at ELMAH if you haven't already. I haven't tried it but I have heard great things
Application Logging
What I call application logs includes any log that captures information about what your application is doing on a conceptual level such as "Deleted Order" or "A User Signed On". This kind of information can be useful in analyzing trends, auditing the system, locking it down, testing, security and detecting bugs of coarse. It is probably a good idea to plan on leaving these logs on in production as well, perhaps at variable levels of granularity.
Trace Logging
Trace logging, to me, represents the most granular form of logging. At this level you focus less on what the application is doing and more on how it is doing it. This is one step above actually walking through the code line by line. It is probably most helpful in dealing with concurrency issues or anything for that matter which is hard to reproduce. You wouldn't want to always have this running, probably only turning it on when needed.
Lastly, as with so many other things that usually only get addressed at the very end, the best time to think about logging is at the beginning of a project so that the application can be designed with it in mind. Great question though!
Some things to log:
business actions, such as adding/deleting items. Talk to your app's business owner to come up with a list of things that are useful. These should make sense to the business, not to you (for example: when user submitted report, when user creates a new process, etc)
exceptions
exceptions
exceptions
Some things to NOT to log:
do not log information simply for tracking user usage. Use an analytics tool for that (which tracks the client in javascirpt, not in the client)
do not track passwords or hashes of passwords (huge security issue)
Maybe you should log page/resource accesses which are not yet defined in your application, but are requested by clients. That way, you may be able to find vulnerabilities.
It depends on the application and its audience. If you are managing sales or trading stocks, you probably should log more info than say a personal blog. When you need the log most is when an error is happening in your production environment, but can't reproduce it locally. Having log level and log hierarchy would help in such situations, because you can dynamically increase the log level. See log4j's documentation and log4net.
My few cents.
Besides using log severity and exceptions properly, consider structuring your log statements so that you could easily look though the log data in the future. For example - extracting meaningful info quickly, doing queries etc. There is no problem to generate an ocean of log data, the problem is to convert this data into information. So, structuring and defining it beforehand helps in later usage. If you use log4j, I would also suggest using mapped diagnostic context (MDC) - this helps a lot for tracking session contexts. Aside from trace and info, I would also use debug level where I usually keep temp. items. Those could be filtered out or disabled when not needed.
You probably shouldn't be thinking of this at this stage, rather, logging is helpful to consider at every stage of development to help diffuse potential bugs before they arise. Depending on your program, I would try to capture as much information as possible. Log everything. You can always stop logging certain components or processes if you don't reference that data enough. There is no such thing as too much information.
From my (limited) experience, if you don't want to make a specific error table for each possible error type, construct a generic database table that accepts general information as well as a string that you can populate with exception data, confirmation messages during successful yet important processes, etc. I've used a generic function with parameters for this.
You should also consider the ability to turn logging off if necessary.
Hope this helps.
I beleive when you log an exception you should also save current date and time, requested url, url refferer and user IP address.

Exception Logger: Best Practices

I have just started using an exception logger (EurekaLog) in my (Delphi) application. Now my application sends me lots of error messages via e-mail every day. Here's what I found out so far
lots of duplicate errors
multiple mails from the same PC
While this is highly valuable input to improve my application, I am slightly overwhelmed by the sheer amount of information I'm getting.
What are your best practices for handling mails from your application?
If you get to much information as it is currently the case you are not getting any information at all.
So I would say categorize your errors into groups, like WARNINGS, FATAL ERRORS, etc.
Then limit your emails to the most important messages (FATAL).
Apart from that review your logs on a regular basis (day, week ...).
What I've done with my exception logging, which uses madExcept as the core, but my own transport mechanism, is have them all go into a database. The core information is all extracted from each report and put into fields, and the whole report is stored as well. The stack trace is automatically analysed to remove the un-interesting functions, leaving a list of only my functions that have failed.
With this happening automatically, I can now "ignore" each individual message coming in, but see the bigger picture in a grid that shows me simply which functions are having the most problems. I can then focus on them, look for the causes, and fix them.
My display app is also able to filter out reports in builds before a certain number if I choose, so that I can tell it not to include "MyWidget.BadProc" before build 75 once I've fixed it.
This has helped me improve my app, and hit the problems that people found most problematic, without having to guess.
It would very much depend on what the errors are that are being sent back. The obvious one being if there are errors in your application, they need fixing and patches/updates sent to your clients.
If they are exceptions that you know can happen and do not required you to be notified you can add "Exception Filters" in the Eureaka Log options to specify how they should be handled (or ignored!).
Another option is to use EurekaLog Variables (where you can add the exception description etc) in the mail Subject line and then use your email client to filter based on this.
I did this using madExcept. It's really useful for tracking down problems we couldn't reproduce ourselves.
Which makes me ask why you are getting so many? Untrapped exceptions should be few and far between. Especially if the user sees an error dialog. I was responsible for several applications, each with hundreds of installs and I would rarely get e-mail notices.
If they are mostly from a very small number of PCs, I'd work with some of those users to find out what they're doing differently, or how their setup might be generating exceptions.
If they are from all over the place, it's probably a bug that got through your testing.
Either way, use the details to fix your code or, at the very least, anticipate known exceptions and trap them properly (no empty try..except).
Fixing the hot spot problems will cut way down on the number of e-mails you get, making the occasional notice much more manageable.
I think that you should throw out all duplicates. Leave only count of the reports. I.e. if you get, say 100 reports, but there are only 4 unique problems - leave only 4 reports, throw out other 96 reports, but use their count to sort reports by severity. For example, 6 reports for fourth problem, 10 reports for third, 20 for second and 60 for first. So, you should fix first problem with 60 reports and only then switch to second.
I believe that EurekaLog has BugID in its reports. Same problem has the same BugID. This will allow you to sort reports with duplicates. EurekaLog Viewer also can sort out duplicates.

Resources