Overlapped serial port and Blue Screen of Death - delphi

I created a class that handles serial port asynchronously. I use it to communicate with a modem. I have no idea why, but sometimes, when I close my application, I get the Blue Screen and my computer restarts. I logged my code step by step, but when the BSOD appeared, and my computer restarted, the file into which I was logging data contained only white spaces. Therefore I have no idea, what the reason of the BSOD could be.
I looked through my code carefully and I found several possible reasons of the problem (I was looking for all that could lead to accessing unallocated memory and causing AV exceptions).
When I rethought the idea of asynchronous operations, a few things came to my mind. Please verify whether these are right:
1) WaitCommEvent() takes a pointer to the overlapped structure. Therefore, if I call WaitCommEvent() inside a function and then leave the function, the overlapped structure cannot be a local variable, right? The event mask variable and event handle too, right?
2) ReadFile() and WriteFile() also take references or pointers to variables. Therefore all these variables have to be accessible until the overlapped read or write operations finish, right?
3) I call WaitCommEvent() only once and check for its result in a loop, in the mean time doing other things. Because I have no idea how to terminate asynchronous operations (is it possible?), when I destroy my class that keeps a handle to a serial port, I first close the handle, and then wait for the event in the overlapped structure that was used when calling the WaitCommEvent() function. I do this to be sure that the thread that waits asynchronously for a comm event does not access any fields of my class which is destroyed. Is it a good idea or is it stupid?
try
CloseHandle(FSerialPortHandle);
if Assigned(FWaitCommEvent) then
FWaitCommEvent.WaitFor(INFINITE);
finally
FSerialPortHandle := INVALID_HANDLE_VALUE;
FreeAndNil(FWaitCommEvent);
end;
Before I noticed all these, most of the variables mentioned in point one and two were local variables of the functions that called the three methods above. Could it be the reason of the BSOD or should I look for some other mistakes in my code?
When I corrected the code, the BSOD stopped occuring, but It might be a coincidence. How do you think?
Any ideas will be appreciated. Thanks in advance.
I read the CancelIo() function documentation and it states that this method cancells all I/O operations issued by the calling thread. Is it OK to wait for the FWaitCommEvent after calling CancelIo() if I know that WaitCommEvent() was issued by a different thread than the one that calls CancelIo()?
if Assigned(FWaitCommEvent) and CancelIo(FSerialPortHandle) then
begin
FWaitCommEvent.WaitFor(INFINITE);
FreeAndNil(FWaitCommEvent);
end;
I checked what happens in such case and the thread calling this piece of code didn't get deadlocked even though it did not issue WaitCommEvent(). I tested in on Windows 7 (if it matters). May I leave the code as is or is it dangerous? Maybe I misunderstood the documentation and this is the reason of my question. I apologize for asking so many questions, but I really need to be sure about that.
Thanks.

An application running as a standard user should never be able to cause a bug check (a.k.a. BSOD). (And an application running as an Administrator should have to go well out of its way to do so.) Either you ran into a driver bug or you have bad hardware.
By default, Windows is configured to save a minidump in %SystemRoot%\minidump whenever a bug check occurs. You may be able to determine more information about the crash by loading the minidump file in WinDbg, configuring WinDbg to use the Microsoft public symbol store, and running the !analyze -v command in WinDbg. At the very least, this should identify what driver is probably at fault (though I would guess it's your modem driver).

Yes, you do need to keep the TOverlapped structure available for the duration of the overlapped operation. You're going to call GetOverlappedResult at some point, and GetOverlappedResult says it should receive a pointer to a structure that was used when starting the overlapped operation. The event mask and handle can be stored in local variables if you want; you're going to have a copy of them in the TOverlapped structure anyway.
Yes, the buffers that ReadFile and WriteFile use must remain valid. They do not make their own local copies to use internally. The documentation for ReadFile even says so:
This buffer must remain valid for the duration of the read operation. The caller must not use this buffer until the read operation is completed.
If you weren't obeying that rule, then you were likely reading into unreserved stack space, which could easily cause all sorts of unexpected behavior.
To cancel an overlapped I/O operation, use CancelIo. It's essential that you not free the memory of your TOverlapped record until you're sure the associated operation has terminated. Likewise for the buffer you're reading or writing. CancelIo does not cancel the operation immediately, so your buffers might still be in use even after you call it.

Related

TIdHTTPServer and 100% CPU usage

I use TIdHTTPServer in Delphi 11 to run a simple web server on a VPS.
It works great, except from time to time my app will start to use 100% of the CPU and keep this way forever, and I can't identify what is causing this.
When this happens, the server is still active and replying to requests, but very slowly. The only way to fix this is to force close the application and open it again.
I don't have any code to show, as my code is just generic responses using the OnCommandGet event of the TIdHTTPServer. This event will handle GET parameters on the URL and return something in the AResponseInfo.ContentText.
I know this is difficult, but any ideas about what I should hunt for to fix this?
We use TIdHttpServer quite a lot and have no problems with it. We use it in Delphi 10.3-10.4.2, but it’s not the reason for the problem. Programs work a few months without restarting.
From our experience we can say that problem of such unexpected behavior can be (in order of probability):
Code is not threadsafe. Event OnCommandGet run not in a main thread, so all access to global object/resources/etc must be done thru some kind of synchronization mechanism (locks, TEvent, synchronize, mutex, semaphore or other). If code does not use synchronization – it can broke logic, throw exceptions or do some other unexpected actions (like high CPU usage).
Connections count go over the limit. TIdHttpServer has properties like ListenQueue and MaxConnections. Can be that you make more requests that the server can handle. In this case your new requests wait until they can be handled by your code and it can make some additional CPU usage. To diagnose this – try to increment some internal variable at the start of your event and decrement it at the end. Make some service request to return this variable and you will know if all work correctly. Other similar situation – connection does not close after using the inside event and stay in memory, then you can go over limits too. Try to workaround properties TIdHttpServer.KeepAlive := false and TIdHttpServer.ReuseSocket := rsFalse.
Memory leaks. Try to set variable ReportMemoryLeaksOnShutdown := true and start the application, make a few requests and close it. If you’ll see a message with leaks – then you do something wrong, try to handle these objects in the right way. In production these small leaks can take a lot of RAM and Windows will dump part of memory into a swap-file, so your new requests will take more time to be processed.
Without an example, we can't say more.

Is there a way to take action, thus execute code, when a iOS application crashes ? Is this possible?

Is there a way to take action, thus execute code, when an iOS application crashes? Specifically, I would like to save the core data storage. Is this possible? I would say that this is possible since, for example, Firebase has to send information online for making crashlytics work. How can this be achieved? Thanks
Yes, but it is very difficult, and "save core data storage" would be far too much (and very dangerous, to boot).
Most crashes result from a signal (often SIGSEGV, but also SIGABRT, SIGILL or others), and you can install a signal handler to run code in that case. However, that code must be very, very carefully written because you will be in a special execution state. There are a small number of C functions you are permitted to use (see the man page for sigaction for the list). Most notably, you can't allocate memory. Allocating memory in a signal catching function can deadlock the program in a tight spinlock (done that myself when I tried to write my own crash handler in my more naive days; it's really bad).
The way that crash handlers like Crashlytics do it is that they do as little as possible during the signal handler, mostly just writing the stack trace to storage (using pre-allocated buffers). When you restart, they see that there's an unhandled stack trace from a previous run, and then they do all the complicated stuff like uploading it to a server, or displaying UI, or whatever.
But even if you could write to Core Data in the middle of a signal handler, you would never want to do that. During a signal handler, the system is in an undefined state. Various invariants may not currently hold (such as whether the object graph is consistent). The fact that you're crashing this way indicates that something illegal has happened. The last thing you should do in that state is take data that is highly untrustworthy and overwrite the good data on disk.

Ending an Application within the Loaded procedure

my application uses two self-written components that perform actions during the "loaded" procedure.
The component created first is an single-instance control that should exit the application when another instance is found.
The component created secondly is my database access that establishes the database connection in the "loaded" procedure.
I try to end the application before anything else happens, if a first instance has been found.
Tryout 1:
In my instance control, i call "Application.Terminate" to stop my application. That does not work - my database connection still gets established. Seems that "Application.Terminate" uses a "PostMessage" call to itself, which won't get read until the message gets received (which will happen after all components are loaded...). So obviously, this solution does not work for me.
Tryout 2:
In my instance control, i call "Halt" to stop my application. This immediately stops my application, but seems to create a load of memory leaks - at least that's what delphi's "ReportMemoryLeaksOnShutdown" tells me. Afterwards, i get this annoying "This application has stopped working" windows dialogue. This solution does not work for me.
Question A:
Is there another way to end an Application?
Question B:
Does Delphi's memory leak report work correctly when using "Halt"? Are there really memory leaks after exiting the Application with "Halt"?
If Application.Terminate terminates too late, and you want to terminate post haste, then Halt seems to me to be the appropriate course of action. I don't particularly see why Halt should lead to This application has stopped working exceptions. That suggests that your program's unit finalization code is raising exceptions which are not handled. You could perhaps try to work out (using the debugger, say) why these exceptions are raised and protect against it. Or you could avoid unit finalization with a call to the even more brutal ExitProcess.
There will indeed be memory leaks when you call Halt. However, these can safely be ignored because the operating system reclaims all allocated memory when a process terminates. In other words, the memory is leaked in the sense that is not freed by the application, but the operating system frees it immediately in any case.
However, whether or not a brutal Halt is the correct course of action for you, I cannot say. That's for you to decide. FWIW, a call to Halt from inside a component's Loaded method sounds exceptionally dubious to me.
If all you want to do is ensure that just a single instance of your program runs, then you can make simple modifications to the .dpr file to achieve that. That subject has been covered over and over again here on Stack Overflow. You'll find good coverage and advice in Rob Kennedy's answer here: How can I tell if another instance of my program is already running?

AV after successful close of applications

I am getting this AV message about 3 to 5 seconds after the applications close as expected:
Exception EAccessViolation in module rtl160.bpl at 00073225. Access violation at address 500A3225 in module 'rtl160.bpl'. Read of address 00000004.
These (20) applications are very similar in that they are IBX business applications. About half of them did not cause the AV to occur.
These applications were ported from Delphi-xe and they worked flawlessly for a long time. No changes were made to the projects in the port. Both 32 and 64 bit builds gave the same results.
Is this a bug in some library's finalization section freeing a resource or something?
I am using Delphi-XE2 Update 3.
Would appreciate the help.
Try using madExcept / EurekaLog etc. - they give you detailed stack trace on AV. This is not always a panacea, but can point you to the problem.
Access Violations are by their nature already very troublesome beasts since they deal with invalid pointers in memory. One that occurs a while after an application shuts down is even worse because that's when your app is in "cleanup" mode. You're could be dealing with something that went wrong much earlier in the application, but is only exposing itself at shutdown.
General Tips:
Try to always undo things in the reverse order you did them. E.g.
Create A, Create B ... Destroy B, Destroy A
Connect to Database, Open Dataset ... Close Dataset, Disconnect from Database
Even making sure you've done all the above before shutting down can help tremendously.
Any threads that are still running while your application is running can cause problems.
Preferably ensure all your child threads are properly terminated before final shutdown.
Refer back to Closing datasets above. Depending on what you're doing, some database components will create their own threads.
If you're using COM, try ensure ComObj is high up in the initialization sequence (I.e. place it as high as possible in your DPR).
Delphi finalizes units in the reverse order that they were initialized.
And you don't want ComObj to finalize before other things that are dependent on ComObj have also done so.
If you're using interface references, make sure you resolve circular reference issues.
Some of these problems can be tricky to find, but you can do the following:
Setup a source-code "sandbox" environment (you're going to chuck all your changes as soon as you've found the problem).
Figure out the simplest set of steps required to guarantee the error. (Start app and immediately shutdown would be ideal.)
Then you're going to comment-out delete wipe out chunks of code between tests and basically follow a divide and conquer approach to:
rip out code
test
if the problem persists, repeat. Else roll-back and rip out a different chunk of code.
eventually your code base will be small enough to pinpoint likely problems which can be tackled with targeted testing.
I've had this kind of access violation problem on occasion with old Delphi or C++Builder projects. Today I had it with C++Builder. At the time of the crash, by looking in the Debug -> Call Stack window, I can see that it's happening inside a call to fflush, called by __exit_streams and _exit.
I'm not sure what is causing it, since it's so deep in the Borland library code, but it seems to come and go at random when the code changes. And it seems to be more common with multi-form applications.
This time the error went away when I just added a new button on the main form. A button which is just there, has no event handlers and does not do anything. I think that any random change to the code, classes, variables etc rearranges the memory layout when you relink the application, and that either triggers or untriggers the error.
For now, I just leave the new button on the form, set it to "not visible" so that there's no visible change. As it seems to work, it's good enough solution for me at this time.

How do you cleanly abort a Delphi program?

I've got a program that's having some trouble during shutdown, raising exceptions that I can't trace back to their source. It appears to be timing-related and non-deterministic. This is occurring after all shared resources have been released, and since it's shutdown, memory leaks are not an issue, so that makes me wonder if there's any way to just tell the program to terminate immediately and silently after releasing the shared resources, instead of continuing with the shutdown sequence and giving an exception message box.
Does anyone know how to do that?
After looking at the Delphi Run Time Library source code, and at the Microsoft documentation; I can corroborate Mason and Paul-Jan comments.
The hierarchy of shutdown is as follows
Application.Terminate()
performs some unidentified housekeeping of application
calls Halt()
Halt()
calls ExitProc if set
alerts the user in case of runtime error
get rid of PackageLoad call contexts that might be pending
finalize all units
clear all exception handlers
call ExitprocessProc if set
and finally, call ExitProcess() from 'kernel32.dll'
ExitProcess()
unloads all DLLs
uses TerminateProcess() to kill the process
ExitProcess(0) ?
Halt(0) used to be the good old fashioned way of telling the program to end with immediate effect. There's probably a more Delphi-friendly way of doing that now, but I'm 95% sure halt(0) still works. :-)
In case HeartWare's suggestion of using ExitProcess() fails, it might be you are using some DLL's that do not respond well to the DLL_PROCESS_DETACH. In that case, try using a TerminateProcess( getCurrentProcess, 0 );
Once you resort to such measures, one might wonder if the "cleanly" part of the topic title still holds up to scrutiny.
The last time I had to hunt a problem like this was the shutdown was a causing an event (resize? It's been a while.) to fire on the dying window causing an attempt to redraw something that needed stuff that had already been disposed of.

Resources