We've got an application written in Delphi that uses Delphi On Rails and acts as a server and communicates with clients using HTTP, JSON and websockets. We ran into some issues lately and it's hard to debug them and find the problem's source.
Using Wireshark for traffic analysis, we could see the following behaviour: There's a request from a client (HTTP GET for a file). Usually, we process that request and send an HTTP status code, the file (if not cached) etc. However, we have a reproducible problem where there's only the request from the client, a TCP SYN from the server, but after that, the server sends a RST packet and the TCP communication stops.
The strange thing is, we can reproduce the problem quite well (although the files where the RST packet disrupts the communication differ) and it mysteriously vanishes in one of the following cases:
In debug environment (Delphi IDE), disabling madExcept
In release environment, not patching the executable with madExceptPatch
Give focus to a different window than the main application window.
As we had some problems with Delphi On Rails and had to do minor modifications to it to avoid access violations and debug exceptions, I suspect DOR to be the culprit and some strange memory corruption or uncaught exception to be the bug, but it's still confusing, especially the fact that the problem vanishes if we change focus.
My main question is not how to solve this problem, but how to debug it and where to look for problems. The source of the TCP reset also puzzles me, as we don't run into the usual procedures that process requests in that case and it seems like either DOR or something else (the application, Winsock, OS) resets the connection by mistake.
For completeness, as it might be related, here are issues that I reported at the Delphi On Rails project and a forum thread where I asked the madExcept author about the problem: Issue #6, Issue #7, Issue #8, forum entry.
As a test, we checked out some older DOR sources from version control where no connection issues were known, and it works without showing any of the above problems.
So we decided to solve the problem the other way round: Rolling back the DOR specific source code (about 20 files) to the last stable version and "re-updating" it piece by piece until the error occurs again. If this happens, we can
Go back to the latest working version quickly
Hopefully be quite close to the original DOR sources so we can react to updates to the library.
Analyse the occurring error and report a detailed issue (and perhaps even a solution) to the DOR project.
EDIT: We could now update all but one file back to the old state without getting connection issues. The file that creates problems is dorSynchronizer.pas; more exactly, it's r179 of this file that caused the issues - threads were changed from Windows API to Delphi TThread there. We'll investigate this further and might add an issue to the DOR project in the next few days.
EDIT2: It turned out that DOR uses the deprecated procedures TThread.Suspend and TThread.Resume that can cause undefined behaviour. I reported an issue to the DOR project.
Update: It turned out that there was something with installing Delphi 10.4 CE that broke my app (thanks, DelphiCoder!); specifically, it was something in the Windows Registry that was broken. After using ProcessMonitor to ensure no Delphi 10.4 (aka 21.0) was being invoked, I ended up cleaning out the registry of all 10.4 references, rebuilding completely (not clear if this was needed or not), and lo and behold, it works again! I'm adding this update in case someone in a similar situation finds this question - remember to back up your registry first and be careful!
Original Post: I created several DLLs with Matlab Compiler 10 years ago, with C wrappers, to make them available with Delphi. Once I got them working, they always worked - until today! The code in the C wrapper initialization function in question is in the code box below; the "Could not initialize library" is printed to the console when I run my Delphi app.
mclmcrInitialize();
if (!mclInitializeApplication(NULL, 0)) {
    fprintf(stderr, "Could not initialize application\n");
}
if (!libMyDllInitialize()) {
    fprintf(stderr, "Could not initialize library\n");
}
The problem is that this has never happened before in the probably 10 years since we first wrote these! My machine has the correct version of the 32-bit 2021a MCR installed, as it has for several years; I've installed this on numerous machines from Windows XP up to Windows 10. The DLLs were last built 5 - 7 years ago; anyway, I don't have access to the Matlab compiler anymore. The only thing that has changed is my app, but not anywhere near where this DLL initialization code is called; also, when the problem first happened, my app was working, then didn't - without any changes. Finally, I went back a few days and rebuilt my app, and it still fails.
So I am really stuck, and need some advanced help in debugging DLL startup issues on Windows. I tried looking in the Windows Event Logger, but nothing appears to show up there. Logs to check? A setting in the Registry that somehow got hosed? Wrong phase of the moon? How does one debug loading/initializing a formerly working DLL when forced to treat it as a black box? Help!
How does one debug loading/initializing a formerly working DLL [...]?
I think there is no definitive answer to your question.
This is how we have gone about debugging the loading/initializing of DLLs and applications and may help you:
We regularly work with systems where we have no source code for the DLLs (and often we don't have any source code for the applications either). We experience DLL conflicts quite regularly. When testing why applications don't start as expected, we have found Sysinternals' Process Monitor by Mark Russinovich invaluable.
This will show you system level activity. You can filter for your process and then you will see all file, registry, thread and network activity (although thread and network are quite limited). If the DLL has dependencies then the system tries to find those and so you will be able to discover all dependent DLLs and COM interfaces (by seeing the registry lookups for that interface) that it's looking for. Process Monitor will show if the resource is not found or if access is denied.
Slightly more difficult to discover is if one of the dependencies exists but the export table has changed (so the functions have different signatures or export ordinals). There are ways to check that (by looking at the export and import tables) but generally (if you have access to a working environment) it's enough to check the filesize, timestamp (and the VERSIONINFO resource if there is one) between DLLs.
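The size and timestamp comparison mentioned above can be scripted when there are many DLLs to check. The following is only a minimal sketch in C, assuming you have a copy of the DLL from a working environment to compare against; the function name compare_file_metadata is made up for illustration, and it does not look at the VERSIONINFO resource at all.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (not from any library): compares the size and the
   modification time of two files, e.g. the same DLL taken from a working
   and from a failing machine.
   Returns 0 if both match, 1 if they differ, -1 if either cannot be read. */
int compare_file_metadata(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return -1;

    if (a.st_size != b.st_size) {
        printf("size differs: %lld vs %lld bytes\n",
               (long long)a.st_size, (long long)b.st_size);
        return 1;
    }
    if (a.st_mtime != b.st_mtime) {
        printf("timestamp differs: %lld vs %lld\n",
               (long long)a.st_mtime, (long long)b.st_mtime);
        return 1;
    }
    return 0;
}
```

A matching size and timestamp is only a hint that the files are the same, not proof; if in doubt, a byte-for-byte comparison or a look at the export table is still needed.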
I have Embarcadero RAD Studio XE8 on a Windows 10 machine. Everything was working perfectly until about 2 weeks ago. Every time I try to start Delphi XE8 I get an error: "Exception EOSError in module rtl220.bpl at 00050A4D. System Error. Code 111. The file name is too long." I tried reinstalling a couple of times and I even tried installing Appmethod, but I still get the same error.
What can be the problem?
I've had the same issue today. I've traced it to a GetAdaptersInfo() call, and it turned out that for some reason (VirtualBox is my main suspect) I've had over 50 network adapters registered on my system. Removing all of them fixed the issue.
There is not enough info in your question in order to tell what exactly is wrong. Try using Process Monitor in order to check what files Delphi is trying to access. It will also show the errors of these file operations.
As uri2x says, the problem is that RAD Studio has issues when many network adapters are registered: with more than 20 it will not work properly. You will have problems debugging and running projects, and you may find many cmd.exe processes running on your system.
Delete duplicate and unneeded network adapters registered under "Control Panel\Network and Internet\Network Connections".
That will fix your problem.
This may be of help. I had a similar problem, error code 5 Access is denied. This turned out to be related to a thread started to test an internet connection on an embedded panel (using BeginThread). If the user exits the form (which is testing the internet connection) immediately after displaying the form, the AV occurs.
On my development PC, the internet connection test was successful...and so fast I never saw the problem! After struggling for several hours, I finally tracked it down to this thread and reproduced it by disconnecting my network cable.
The solution was straightforward: When exiting the form (eg. in the FormDestroy event) ensure the thread is definitely not running before continuing.
We have a Delphi 7 application running on numerous client machines. Recently, some of the client machines started using Microsoft Security Essentials. It started identifying our executable as malware and promptly shut it down. The message displayed by MS Security essentials is:
"Security Essentials detected items on your PC that it doesn't recognize......"
Odd thing is it does not always occur at the same option in the application. You can do the very same operation on subsequent logins and sometimes it works and other times security essentials closes it down. This makes it extremely hard to narrow down to a specific cause in our application.
I tried running the application with elevated account privileges and was still able to get it to fail. I was unable to duplicate the issue when running a Delphi XE2 compile of the same application.
Any ideas about what to look for? We are really trying to avoid adding our application to the Security Essentials exclusion list. Our application has never been identified as a problem by other security programs (Norton, McAfee, etc.).
I once had a similar issue with an executable built using Delphi 7, though it had nothing to do with Delphi 7 itself. It just so happens that some part of the executable matches some virus signature, or the AV heuristic scan suspects that something is wrong with the executable. One thing you can try is to change some of the compiler settings such as the Debug options. Changing Debug Information or using debug DCUs might result in slightly different byte sequences in the final executable.
Users have been reporting problems/crashes/bugs that I can't reproduce on my machine. I'm finding these problems difficult to fix.
I've started using EurekaLog (fantastic!) and SmartInspect. Both these tools have helped greatly but I'm still finding it difficult to catch some problems.
I've just purchased Debugging by David Agans (and waiting for it to arrive).
Are there any other tools or techniques specific to Delphi that will help with catching these hard to find remote problems? The kinds of problems I'm finding difficult to track down are those that don't raise exceptions or have a clear cause. EurekaLog catches exceptions and SmartInspect is pretty good once I have a theory to check. But in some cases it is a seemingly random crash and there are several thousand lines of code that may be at fault. How to narrow down to the root cause?
MadExcept is what I use, and it is fabulous. I have also used EurekaLog and find the functionality almost exactly identical, except that I have more experience and time using MadExcept. It's free for non-commercial use, and reasonably priced for commercial use.
Update: MadExcept 4 is now out and even supports 64 bit Delphi XE2 apps, and has memory-leak checking too.
When nothing blows up, I rely on heavy use of trace logging. I have a TraceMessage(integer,string) function which I call throughout all my apps, and when someone has problems I get them to click a menu item that turns up the debug trace level to the most verbose level; it gives me a complete history of everything my application did, and this has helped me even more than madExcept, to solve problems at customer sites. Customers get a crash, and that crash report sent by madexcept contains a log file (created by my app) that is attached automatically. I believe you can do this equally well with madExcept and EurekaLog. If you need a logging system you could download Log4D, or you could write your own, it's pretty simple.
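To make the trace-level idea concrete, here is a rough sketch in C (the language of the wrapper snippet earlier on this page); a Delphi version would follow the same shape. All names (trace_set_level, trace_set_file, trace_message) are made up for illustration and are not the answerer's actual TraceMessage implementation or part of any library.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical names throughout; this is an illustration only. */
static int trace_level = 0;      /* 0 = silent; higher = more verbose */
static FILE *trace_file = NULL;  /* destination; NULL means stderr */

/* Called e.g. from a "verbose logging" menu item at the customer site. */
void trace_set_level(int level) { trace_level = level; }
void trace_set_file(FILE *f) { trace_file = f; }

/* Writes the message only if its level is within the current verbosity.
   Returns 1 if the message was written, 0 if it was filtered out. */
int trace_message(int level, const char *fmt, ...)
{
    va_list args;
    FILE *out = trace_file ? trace_file : stderr;

    if (level > trace_level)
        return 0;

    fprintf(out, "[trace %d] ", level);
    va_start(args, fmt);
    vfprintf(out, fmt, args);
    va_end(args);
    fputc('\n', out);
    fflush(out);  /* flush so the line is on disk if the app crashes next */
    return 1;
}
```

With the level left at 0 nothing is written, so the calls can stay in release builds; raising the level at runtime turns on the full history without needing a special build.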
For always-free, try JclDebug, which requires more work to set up, but which has worked fabulously for me, also.
For help with heap problems, learn more about fastMM (full version) debug options.
And don't forget that Delphi itself supports remote debugging. If you can reproduce a crash on machines in your office that don't have Delphi installed, use remote debugging across the office network instead of installing a complete RAD Studio on that other machine. You could also use remote debugging to connect to a client PC across the internet, but I have not tried that yet, so I can't say how well it works. I do know that since remote debugging doesn't support automatic deployment of the EXE file you built (you have to do that part yourself), remote debugging over the internet to a client PC is more work.
You might also find lots of your problems by fixing all your hints and warnings, and then going through with CodeHealer or Pascal Analyzer (PAL) from Peganza. These static analysis tools can help you find real code problems.
If performance and memory usage are your problems, get the full version of AQTime, and use it to profile and watch your system operate. It will help you fix your memory leaks, and understand your app's runtime behaviour and memory usage, not just leaks but bottlenecks for memory and CPU usage. Some of those bottlenecks can also be the source of some odd problems. I have even used AQTime to help me find deadlocks, since it can generate traces of execution, that can help me figure out what code is running, and locate deadlocks. Update: AQTime is not installable on machines other than your main dev machine, without violating the newly modified license terms for AQTime. These terms were never this restrictive in the good old days.
If you gave more exact idea of what your problems are, I'm sure other people could give you some more ideas that are specific, but all of the above are general techniques that have served me well.
One of the best ways is to use the Remote Debugger that comes with Delphi, so you can directly debug the application running on the remote machine. The remote debugger is somewhat buggy in some Delphi releases, and you need to follow the instructions carefully to get it working, but when needed it's a tool to consider. Also check whether updates are available for your version; they may come in a separate installer for deployment on "remote" systems. Otherwise, first install the remote debugger, then check whether the installed files have newer versions in your local installation, and copy them to the remote machine.
CodeSite has helped me a lot in these situations. Since XE it is bundled with Delphi.
Logging is the key, in this matter.
Take a look at our TSynLog class available in our Open Source SynCommons library.
It has the JCL Debug / MadExcept features, plus some additional ones (like customer-side profiling and logging):
logging with a set of levels;
fast, low execution overhead;
can load .map file symbols to be used in logging;
compression of .map into binary .mab (900 KB -> 70 KB);
inclusion of the .map/.mab into the .exe;
reading of an external .map to add unit names and line numbers to a log file without .map available information at execution;
exception logging (Delphi or low-level exceptions) with unit names and line numbers;
optional stack trace with units and line numbers;
methods or procedure recursive tracing, with Enter and auto-Leave using interfaces;
high resolution time stamps, for customer-side profiling of the application execution;
set / enumerates / TList / TPersistent / TObjectList / TContainer / dynamic array JSON serialization;
per-thread or global logging;
multiple log files on the same process;
integrated log archival (in zip or any other format);
Open Source, works from Delphi 5 up to XE.
I'm having an interesting problem implementing a global keyboard hook.
I wrote a dll which is used to set the hook and then an application (Delphi) which loads the dll and processes the results of the hook. This was done this afternoon on my PC at work and after some testing I figured it was working 100%.
I've just tested the same app and dll here at home and I'm not getting any errors, but the application does not appear to be getting any data either.
Both machines are WinXP, although my work machine is SP2 and this one is SP3.
Has there been some change in the Win32 API which would cause this to malfunction, or could the problem be related to some A/V / Spyware / MS Update that has been released recently?
I'm hoping somebody here will know of an obvious reason that this may happen before I spend hours debugging.
Thanks!
Actually some A/Vs don't like homemade hooks. I've got the same problem with my mouse hooker on some machines, and it doesn't depend on service pack version.
Yeah, I could. I haven't installed Delphi on this machine, but I think I might have to. I'm going for the low hanging fruit here. If there's an obvious answer, there's no need to go through all the trouble of debugging and hoping to find what might be the problem.
My first suspicion is that there's been a change in the API somewhere.
As I mentioned, this app works absolutely perfectly on my work machine.
Do you have a debugger on your home computer? Do you receive any messages via the hook at all?
Can it be that some other application is hooking, and doesn't pass the message on down the hook chain?
BTW: I love virtual machines for this kind of testing. Keep a clean XP install. Install SP2, and test your application. Roll back to the clean install again, and install SP3. Try your application again. This way you will know if it's SP3, since there is nothing else to mess things up. I like to keep a set of snapshots around with different configurations.
Which kind of hook are you using? I once used the WH_CBT type and encountered problems when certain other applications were running. One case I could trace back to Trillian, which also seems to do some kind of hooking (and maybe screws up).
Apart from that I am currently working on an application that uses the WH_KEYBOARD-hook and this works on SP2 and SP3 equally well. The MSDN also doesn't mention any service-pack related changes.
What you can do to trace the bug on your home machine:
make sure to check all result values of all system api calls (and use GetLastError in case of error)
provide some kind of debug output in case of error (e.g. as message box or to a text file)
optional: log some status messages so you know what's going on internally
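The first two points can be combined into one small helper. The following is just a sketch in C, assuming a text file is an acceptable place for the debug output; log_if_failed and the log file name in the usage note are made up, and the errno fallback exists only so the sketch also compiles outside Windows.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#define LAST_ERROR() ((unsigned long)GetLastError())
#else
#include <errno.h>
/* Assumption: non-Windows fallback, only to keep the sketch portable. */
#define LAST_ERROR() ((unsigned long)errno)
#endif

/* Hypothetical helper: if `ok` is zero, appends the name of the failed call
   and the last error code to a text log. Returns `ok` normalized to 0 or 1,
   so it can wrap a check inside an if statement. */
int log_if_failed(int ok, const char *call_name, const char *log_path)
{
    unsigned long code;
    FILE *log;

    if (ok)
        return 1;

    /* Read the error code first; later calls may overwrite it. */
    code = LAST_ERROR();

    log = fopen(log_path, "a");
    if (log != NULL) {
        fprintf(log, "%s failed, error code %lu\n", call_name, code);
        fclose(log);
    }
    return 0;
}
```

In the hook DLL this could be used as log_if_failed(hHook != NULL, "SetWindowsHookEx", "hook_debug.log") right after the SetWindowsHookEx call, so a silent failure on the home machine at least leaves an error code behind.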
One alternative is to use a low-level keyboard hook (WH_KEYBOARD_LL, just a different parameter to SetWindowsHookEx). The hook is processed in the message loop of the registering thread, and thus does not need to inject a DLL everywhere. And for some odd reason virus scanners and firewalls interfere much less with it; they often silently block DLL injection or normal keyboard hooks. It also removes the need to share the hHook across processes if you want it to work in older Windows versions.
And if you abuse a keyboard hook to implement global hotkeys (I have seen that a lot), use RegisterHotKey (http://msdn.microsoft.com/en-us/library/ms646309.aspx) instead.