How might I find out the source of long delays on resizing the main form? - delphi

I have a D2006 app that contains a page control and various grids, etc on the tabs. When I resize the main form (which ripples through and resizes just about everything on the form that is aligned to something), I experience long delays, like several seconds. The app freezes, the idle handler is not called and running threads appear to suspend also.
I have tried pausing execution in the IDE while this is happening in an attempt to break execution while it is in the troublesome code, but the IDE is not taking messages.
Obviously I'm not expecting anyone to point me at some errant piece of code, but I'm after debugging approaches that might help me. I have extensive execution timing code throughout the app, and the long delays don't show up in any of the data. For example, the execution time of the main form OnResize handler is minimal.

If you want to find out what's actually taking up your time, try a profiler. Sampling Profiler could answer your question pretty easily, especially if you're able to find the beginning and the end of the section of code that's causing trouble and insert OutputDebugString statements around it to narrow down the profiling.

OK. Problem solved. I noticed that the problem only occurred when I had command-line switches enabled to log some debug info. The debug info included some HTTP responses that were written to a debug log (a TMemo) on one of the tabs. When the HTTP response included a large block with no CR/LFs the TMemo wrapped it. Whenever I resized the main form, the TMemo resized and the control had to render the text again with the new word wrapping.
To demonstrate:
start a new Delphi project
drop a TMemo onto the form
align it to Client
compile and run
paste a large amount of text into the TMemo
resize the main form
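The paste step above can also be done in code. This is a minimal sketch, assuming an existing TMemo on a form; the unit name, the procedure name FillMemoForResizeTest, the sample text and the repeat count are all made up for illustration, and the amount of text needed to make the delay visible will vary by machine.

```delphi
unit MemoWrapDemo;

interface

uses
  StdCtrls, Controls, StrUtils;

// Hypothetical helper: call it once, e.g. from the form's OnCreate handler.
procedure FillMemoForResizeTest(Memo: TMemo);

implementation

procedure FillMemoForResizeTest(Memo: TMemo);
begin
  Memo.Align := alClient;          // the memo follows every form resize
  Memo.WordWrap := True;           // wrapping is recalculated when the width changes
  Memo.ScrollBars := ssVertical;   // no horizontal scroll bar, so wrapping stays active
  // One long block of text with no CR/LF pairs (size chosen arbitrarily).
  Memo.Lines.Text := DupeString('some text without line breaks ', 20000);
end;

end.
```

Setting WordWrap to False on the same memo and resizing again is a quick way to check whether re-wrapping is what costs the time.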
I won't award myself the answer, as I hadn't really provided enough info for anybody else to solve it.
BTW @Mason - would SamplingProfiler have picked this one up - given that the execution is inside the VCL, and not in my code?

A brute-force approach that may give results.... Put a debug message to OutputDebugString() from every re-size event, sending the name of the control as the string to be displayed. This may show you which ones are being called "a lot".
You may have a situation where controls are bumping each other, setting off cascading re-size events. Like 3 siblings in the back seat of a compact car, once they start jostling for position, it can take a while for them to "settle down".
Don't make me turn this car around....
The debug log (viewable in the IDE, or with an external ODS viewer), may show you which ones are causing the most trouble, if they appear multiple times for one "user-initiated re-size event".
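A minimal sketch of that brute-force logging, assuming you are willing to point the OnResize event of each interesting control at one shared handler. The unit name ResizeTrace and the class TResizeTracer are hypothetical names, not part of the VCL.

```delphi
unit ResizeTrace;

interface

uses
  Windows, SysUtils, Classes, Controls;

type
  // Hypothetical helper: create one instance and assign ControlResize
  // to the OnResize event of every control you want to watch.
  TResizeTracer = class
  public
    procedure ControlResize(Sender: TObject);
  end;

implementation

procedure TResizeTracer.ControlResize(Sender: TObject);
var
  Msg: string;
begin
  if Sender is TControl then
    Msg := Format('Resize %s (%s) %dx%d',
      [TControl(Sender).Name, Sender.ClassName,
       TControl(Sender).Width, TControl(Sender).Height])
  else
    Msg := 'Resize from a non-control sender';
  // Shows up in the IDE event log or in an external ODS viewer.
  OutputDebugString(PChar(Msg));
end;

end.
```

Usage would look like `Panel1.OnResize := Tracer.ControlResize;` for each control. A control name that appears many times for a single user-initiated resize is a candidate for the jostling described above.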

Run your application in AQTime's performance profiler (included with XE, but you can get a time-limited version from their website).
Do some fanatic resizing for a while, and then stop the application.
After that, you'll see exactly which function was called many times, and where most time was spent.

Related

Erased components in Delphi program

Updates
2016-02-18: Added process information
I have a Delphi program compiled using XE4. It is being used by a few hundred customers. A couple of weeks ago one of these customers reported that some areas of the executable were being erased (images below) randomly during the day. This client has 35 sites using this exe and the problem occurs on no more than 10 of these sites.
Investigation
1 - My first suspicion was an infinite loop. The exe keeps responding while the components are erased, nothing in the code has changed radically since the time this problem didn't happen, and the logs don't show any loop (this exe has logs everywhere).
2 - Misbehaving threads. I have a separate thread that syncs data between this exe and our server in the cloud. Again, logs don't show that the thread is running when the problem occurs and, again, nothing was changed here.
3 - Some other program (antivirus?) is affecting my exe. I couldn't investigate this hypothesis properly yet, but so far I couldn't find any installed program that caught my attention.
My question is: What could be causing this issue? How can I investigate further? I know this may be a wide question but this is all information I could gather and I can't imagine many more places to look at.
Images
1 - In the image below the red-stroked area should be a TToolBar
2 - In this second image there are three areas, from the top to the bottom the first one should be a TToolBar, the second one should be the title of the child form and the third one should be a TwwDBGrid
3 - The third example shows on the top the erased area where there should be a TEdit, just below it there's what should be a line on a TwwDBGrid and on the side we can see an erased scrollbar from the TwwDBGrid
4 - This last example shows 5 erased areas: The title of the application, the main TToolBar, The title of the Form, a TButton and two TwwDBGrid
5 - This is an interesting example because beyond the erased components there are 4 TSpeedButtons that are not erased but they are without the images they have originally (the first red stroked areas). The other 3 red stroked areas are, in order, 2 TEdits, a TwwDBGrid and a TButton
Process Information
I got a screenshot at the moment the problem occurs. scgolr is my software.
There is really not enough detailed information to give you a definite answer. However, I can answer with some directions on your question:
How can I investigate further?
Because of what you have stated:
The program is in use by a few hundred customers
One (only) customer experiences the problem
First occurrence of the problem was some weeks ago
the first thing to do, is get in contact with the customer, and get the information you say that you have asked for but not got. The questions that need to be answered are:
What has changed in the customer's environment at the time the problem started with respect to hardware, network, server, OS, other software running on the PCs?
Has anything changed in the way your customer is using your software?
What does the customer have to do to get rid of the problem, once it occurs? Close the program? Restart the PC? Or maybe just minimize - restore the erroneous window?
With the above I do not suggest that the fault is with this one customer and their equipment or their way of using the software. It may just be that the combination at the site which is different from all your other customers, triggers the problem to show up.
Some specific things to check in your software and at the site when the problem occurs, and if the problem goes away with a minimize - restore of the application (which would suggest an interrupted-painting problem):
Do you call Application.ProcessMessages at any time?
Does the background thread access the same data as the GUI? If yes, is the data protection properly in place (locking, synchronisation)?
Does the background thread access any GUI components without Synchronize?
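For the last check, this is a minimal sketch of what a GUI update routed through Synchronize might look like in XE4. The names TSyncThread, FStatusLabel and the status text are hypothetical; the real sync thread in the question is not shown, so this only illustrates the pattern.

```delphi
unit SyncThreadSketch;

interface

uses
  System.Classes, Vcl.StdCtrls;

type
  // Hypothetical worker thread; it never touches the label directly.
  TSyncThread = class(TThread)
  private
    FStatusLabel: TLabel;
    FStatusText: string;
    procedure ShowStatus;   // executed in the main thread via Synchronize
  protected
    procedure Execute; override;
  public
    constructor Create(AStatusLabel: TLabel);
  end;

implementation

constructor TSyncThread.Create(AStatusLabel: TLabel);
begin
  inherited Create(True);   // created suspended; the caller calls Start
  FStatusLabel := AStatusLabel;
end;

procedure TSyncThread.ShowStatus;
begin
  FStatusLabel.Caption := FStatusText;
end;

procedure TSyncThread.Execute;
begin
  // ... do the data sync here without touching any VCL control ...
  FStatusText := 'Sync finished';
  Synchronize(ShowStatus);  // hand the GUI update over to the main thread
end;

end.
```

Any place where the background thread reads or writes a control outside a call like this is worth a closer look.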
Finally I suggest that you visit the customer onsite. You get much better and faster answers in a direct discussion.
Edit after process information received.
There is nothing alarming concerning GDI or User objects. But it is alarming when you say in the comments that you call Application.ProcessMessages in many places, obviously to 'fix' a non-responsive UI. For example, what happens if the user double clicks a button, but does it slowly enough that Windows detects it as two separate clicks? First click may start your long lasting procedure within which you call A.P. The second click is read from the message queue which starts the same procedure. Now the second call to the procedure runs (with its own calls to A.P.) and eventually ends and execution returns to the first call. Depending on what you do in this procedure, you may well be messing up handles and device contexts etc. A strong recommendation said with a friendly intent: Get rid of those calls to A.P.
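To make the re-entrancy concrete, here is a minimal sketch of a click handler that calls Application.ProcessMessages, with a busy flag added as a stopgap. The class TLongTask, the field FBusy and the loop are hypothetical; the flag is an assumption of this sketch, not something the answer proposes, and the recommendation above is still to remove the calls altogether.

```delphi
unit ReentrancyGuardSketch;

interface

uses
  Vcl.Forms;

type
  // Hypothetical holder for a long-running handler; assign Run to a
  // button's OnClick event.
  TLongTask = class
  private
    FBusy: Boolean;
  public
    procedure Run(Sender: TObject);
  end;

implementation

procedure TLongTask.Run(Sender: TObject);
var
  I: Integer;
begin
  if FBusy then
    Exit;                // a second click arrived while the first is still running
  FBusy := True;
  try
    for I := 1 to 1000 do
    begin
      // ... one slice of the long-running work ...
      // Pending messages, including a second click, are dispatched here,
      // which re-enters Run before the first call has finished.
      Application.ProcessMessages;
    end;
  finally
    FBusy := False;
  end;
end;

end.
```

Without the FBusy check, the second click would run the whole procedure nested inside the first call, which is the situation described above.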
The problem is with the security plugin (Warsaw - Gas Tecnologia) of the bank's website that your client is accessing. Update it and it will be solved. The problem happens in Brazil.
As @SebastianZ and @AlekseyK pointed out, you may be experiencing exhaustion of some GDI resource (handles?).
If the system can be accessed, tools like Process Explorer or Process Hacker could give you some hints. The GDIView utility may help too.
I don't know if this applies to your case, but sometimes database data corruption can lead to strange effects in running programs (I remember 'Data Bombs' causing out-of-memory exceptions...).
So if something causes a GDI allocation loop, the graphics of your app could be affected in 'strange' ways.

Setting up a WH_CALLWNDPROC hook prevents WM_HELP propagation in deep hierarchies

When pressing the F1 key, the win32 API first sends the appropriate key message then sends a WM_HELP message to the control that has the focus.
As it does not process it, it gets sent up the parenting chain all the way to the form which reacts to the message.
In Delphi (XE7) this happens because of calls to CallWindowProc inside Vcl.Controls.TWinControl.DefaultHandler
While this works fine in pretty much all locations inside my applications, there is one place where WM_HELP never reaches the top form.
Trying to reproduce it, I came up with a test application that you may find here:
http://obones.free.fr/wm_help.zip
After having built the application and started it, place the focus inside the In SubLevel or Level 1 edits and press F1.
You will see that WM_HELP is caught by the form.
Now, if you do the same inside In SubLevel2 or Level 15 edits you will see that nothing is logged, the form never sees WM_HELP
Tracing in the VCL I found out that for those deep levels, the calls to CallWindowProc inside Vcl.Controls.TWinControl.DefaultHandler immediately return on one of the controls in the hierarchy, thus preventing the form from ever receiving the message.
However, I couldn't figure out why the Win32 API code thinks it should not propagate the message anymore, except for one thing: If I remove the WH_CALLWNDPROC hook, then everything is back to normal.
You can see the effect of disabling it if you uncheck the Use hook checkbox.
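For readers who do not have the test project at hand, this is a generic sketch of installing and removing a thread-local WH_CALLWNDPROC hook. The actual hook code from the test application is not shown in the question, so the unit name and the procedure names below are hypothetical.

```delphi
unit HelpHookSketch;

interface

uses
  Winapi.Windows, Winapi.Messages;

procedure InstallHook;
procedure RemoveHook;

implementation

var
  HookHandle: HHOOK = 0;

// Called before each window procedure in this thread receives a sent message.
function CallWndProcHook(nCode: Integer; wParam: WPARAM;
  lParam: LPARAM): LRESULT; stdcall;
begin
  // PCWPStruct(lParam)^ describes the message; it is not inspected here.
  Result := CallNextHookEx(HookHandle, nCode, wParam, lParam);
end;

procedure InstallHook;
begin
  if HookHandle = 0 then
    HookHandle := SetWindowsHookEx(WH_CALLWNDPROC, @CallWndProcHook, 0,
      GetCurrentThreadId);
end;

procedure RemoveHook;
begin
  if HookHandle <> 0 then
  begin
    UnhookWindowsHookEx(HookHandle);
    HookHandle := 0;
  end;
end;

end.
```

Even a pass-through hook like this one adds extra calls for every sent message, which may matter when messages are forwarded through many nested parents.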
Now, one will argue that I shouldn't have such deep hierarchies of components, and I agree. However, the structure in the center with two frames inside one another is directly inspired by what's in the application where I noticed the issue.
This means that it can be quite easy to trigger the problem without actually noticing it. Hopefully, in my case, I can remove a few panels and go back below the limit.
But did anyone encounter the situation before? If yes, were you able to solve it? Or is this a known behavior of the Win32 API?
This is caused by a "Windows kernel stack overflow" that happens if you send window messages recursively. On a 64 bit Windows the kernel stack overflow happens much faster than on a 32 bit Windows.
This bug also caused the VCL to not resize deeply nested controls correctly before it got fixed by changing the recursive AlignControls code to (my) iterative version (more about the stack overflow: http://news.jrsoftware.org/news/toolbar2000/msg07779.html)

Delphi 5 application partially loaded in task manager, takes forever to actually display

I have an application written in Delphi 5, which runs fine on most (windows) computers.
However, occasionally the program begins to load (you can see it in task manager, uses about 2.5-3 MB of memory), but then stalls for a number of minutes, sometimes hours.
If you leave it long enough, the formshow event will eventually occur and the application window will pop up, but it seems like some other application or windows setting is preventing it from initially using all the memory it needs to run (approx. 35-40 MB).
Also, on some of my client's workstations, if they have MS Outlook running, they can close it and my application will pop up. Does anyone know what is going on here, and/or how to fix it?
Since nobody has given a better answer I'll take a stab at how to solve this:
There's something in your initialization that is locking it up somehow. Without seeing your code I do not know what it is so I'll only address how to go about finding it:
You need to log what you accomplish during startup. If you have any kind of screen showing I find the window title useful for this but it sounds like you don't--that means you need to write the log to a file. Let it get lost, kill the task and see where it got.
Note that this means you need to cleanly write your data despite an abnormal program termination. How to go about this:
A) Append, write your line, close.
B) Write your line, then flush the file handle.
C) Initially write your file to consist of a large number of blanks--ensure this is larger than the actual log will be. Write your line. In case of abnormal termination it will retain the original larger file size.
I would write a timestamp on every log item so you can see if it's just processing something too slowly.
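A minimal sketch of option A with a timestamp on every line, using only routines that should be available in Delphi 5. The unit name StartupLog, the procedure LogLine and the file name are hypothetical, and a relative file name lands in the current directory, so a full path may be preferable in practice.

```delphi
unit StartupLog;

interface

procedure LogLine(const Msg: string);

implementation

uses
  SysUtils;

const
  LogFileName = 'startup.log';   // hypothetical file name

procedure LogLine(const Msg: string);
var
  F: TextFile;
begin
  AssignFile(F, LogFileName);
  if FileExists(LogFileName) then
    Append(F)                    // keep earlier lines
  else
    Rewrite(F);                  // first line: create the file
  try
    Writeln(F, FormatDateTime('yyyy-mm-dd hh:nn:ss', Now), ' ', Msg);
  finally
    CloseFile(F);                // closing after every line flushes it to the file
  end;
end;

end.
```

Calling `LogLine('before CreateForm')`, `LogLine('after CreateForm')` and so on through the startup code leaves the last completed step in the file when the task is killed.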
If examining the log shows you where the problem is, fine. If, as usually happens, it's not enough, put a bunch more logging between the last item that did get logged and the next one that didn't--I've been known to log every line when hunting a cryptic problem that only happened on someone else's system.
If finding the line isn't enough to pinpoint the problem also dump the value of relevant variables.
Finally, if such intense scrutiny makes the bug go away start looking for an uninitialized variable. (While a memory stomp is also an option I doubt it's the culprit here.)

Delphi OLE Automation freezing GUI

We are using some OLE automation in Delphi 7 to open a word document, then once loaded, save it, and load it into a database.
This is working fine, but part of the requirement is to have a progress bar whilst the OLE bit is taking place, and also a timeout if the OLE part takes too long.
Problem we are having is that the entire GUI freezes whilst the OLE is taking place. The progress bar does nothing, then shoots up right at the end.
Any ideas on how we could approach this?
I think this is going to be difficult to do cleanly. So far as I know, Word automation doesn't give you the opportunity to cancel long running events. It also doesn't notify you of progress.
Probably the best that you can do is first of all move the automation into a separate thread. Then throw up a marquee progress bar whilst the long running automation is in progress. At least that will let the user know that something is happening.
As far as cancelling goes, you can let the user cancel from your progress dialog and then have your program continue. You could kill the automation thread, but that would leave Word in a bad state. I'd just let it continue to completion, but then ignore the results. From the user's perspective this will meet your goals reasonably well, even if it's a little dirty behind the scenes.
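A minimal sketch of moving the automation into its own thread in Delphi 7. The class TWordSaveThread and its fields are hypothetical, and the late-bound calls (Documents.Open, SaveAs, Close, Quit) are assumed to match what the existing code already does with Word.

```delphi
unit WordThreadSketch;

interface

uses
  Classes, ActiveX, ComObj, Variants;

type
  // Hypothetical worker thread that opens a document and saves a copy.
  TWordSaveThread = class(TThread)
  private
    FSourceFile: string;
    FTargetFile: string;
  protected
    procedure Execute; override;
  public
    constructor Create(const ASourceFile, ATargetFile: string);
  end;

implementation

constructor TWordSaveThread.Create(const ASourceFile, ATargetFile: string);
begin
  inherited Create(True);        // created suspended; the caller calls Resume
  FSourceFile := ASourceFile;
  FTargetFile := ATargetFile;
end;

procedure TWordSaveThread.Execute;
var
  WordApp, Doc: OleVariant;
begin
  CoInitialize(nil);             // COM must be initialised in every thread that uses it
  try
    WordApp := CreateOleObject('Word.Application');
    try
      Doc := WordApp.Documents.Open(FSourceFile);
      Doc.SaveAs(FTargetFile);
      Doc.Close;
    finally
      WordApp.Quit;
      Doc := Unassigned;         // release the COM references before CoUninitialize
      WordApp := Unassigned;
    end;
  finally
    CoUninitialize;
  end;
end;

end.
```

The main thread stays free to animate the progress display and to run a timer for the timeout. Assigning the thread's OnTerminate event is one way to learn when the saved file is ready to be loaded into the database.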

How can I keep a large amount of OutputDebugString() calls from degrading my application in the Delphi 6 IDE?

This has happened to me on more than one occasion and has led to many lost hours chasing a ghost. As typical, when I am debugging some really difficult timing-related code I start adding tons of OutputDebugString() calls, so I can get a good picture of the sequence of related operations. The problem is, the Delphi 6 IDE seems to be able to only handle that situation for so long. I'll use a concrete example I just went through to avoid generalities (as much as possible).
I spent several days debugging my inter-thread semaphore locking code along with my DirectShow timestamp calculation code that was causing some deeply frustrating problems. After having eliminated every bug I could think of, I still was having a problem with Skype, which my application sends audio to.
After about 10 seconds the delay between my talking and hearing my voice come out of Skype on the second PC that I was using for testing, the far end of the call, started to grow. At around 20 - 30 seconds the delay started to grow exponentially and at that point triggered code I have that checks to see if a critical section was being held too long.
Fortunately it wasn't too late at night and having been through this before, I decided to stop relentlessly tracing and turned off the majority of the OutputDebugString(). Thankfully I had most of them wrapped in a conditional compiler define so it was easy to do. The instant I did this the problems went away, and it turned out my code was working fine.
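A minimal sketch of such a conditional wrapper. The unit name, the procedure name Trace and the symbol TRACE_ODS are hypothetical; the symbol would be defined in the project options or with a $DEFINE directive.

```delphi
unit TraceSketch;

interface

procedure Trace(const Msg: string);

implementation

uses
  Windows;

procedure Trace(const Msg: string);
begin
{$IFDEF TRACE_ODS}
  // Compiled in only when TRACE_ODS is defined.
  OutputDebugString(PChar(Msg));
{$ENDIF}
end;

end.
```

With the symbol undefined the call sites stay in the source but send nothing to the debugger, so all tracing can be switched off with one rebuild.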
So it looks like the Delphi 6 IDE starts to really bog down when the amount of OutputDebugString() traffic is above some threshold. Perhaps it's just the task of adding strings to the Event Log debugger pane, which holds all the OutputDebugString() reports. I don't know, but I have seen similar problems in my applications when a TMemo or similar control starts to contain too many strings.
What have those of you out there done to prevent this? Is there a way of clearing the Event Log via some method call or at least a way of limiting its size? Also, what techniques do you use via conditional defines, IDE plug-ins, or whatever, to cope with this situation?
A similar problem happened to me before with Delphi 2007. Disable event viewing in the IDE and instead use DebugView from Sysinternals.
I hardly ever use OutputDebugString. I find it hard to analyze the output in the IDE and it takes extra effort to keep several sets of multiple runs.
I really prefer a good logging component suite (CodeSite, SmartInspect) and usually log to various files. Standard files for example are "General", "Debug" (standard debug info that I want to collect from a client installation as well), "Configuration", "Services", "Clients". These are all set up to "overflow" to a set of numbered files, which allows you to keep the logs of several runs by simply allowing more numbered files. Comparing log info from different runs becomes a whole lot easier that way.
In the situation you describe I would add debug statements that log to a separate logfile. For example "Trace". The code to make "Trace" available is between conditional defines. That makes turning it on pretty simple.
To avoid leaving in these extra debug statements, I tend to make the changes to turn on the "Trace" log without checking it out from source control. That way, the compiler of the build server will throw out "identifier not defined" errors on any statements unintentionally left in. If I want to keep these extra statements I either change them to go to the "Debug" log, or put them between conditional defines.
The first thing I would do is make certain that the problem is what you think it is. It has been a long time since I've used Delphi, so I'm not sure about the IDE limitations, but I'm a bit skeptical that the event log will start bogging down exponentially over time with the same number of debug strings being written in a period of 20-30 seconds. It seems more likely that the number of debug strings being written is increasing over time for some reason, which could indicate a bug in your application control flow that is just not as obvious with the logging disabled.
To be sure I would try writing a simple application that just runs in a loop writing out debug strings in chunks of 100 or so, and start recording the time it takes for each chunk, and see if the time starts to increase as significantly over a 20-30 second timespan.
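A minimal sketch of such a test program as a console application. The chunk size follows the suggestion above, while the number of chunks and the message text are arbitrary choices.

```delphi
program OdsTiming;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  ChunkSize = 100;     // debug strings per timed chunk
  ChunkCount = 300;    // arbitrary; raise it to cover a longer run

var
  Chunk, I: Integer;
  StartTick: DWORD;
begin
  for Chunk := 1 to ChunkCount do
  begin
    StartTick := GetTickCount;
    for I := 1 to ChunkSize do
      OutputDebugString(PChar(Format('chunk %d line %d', [Chunk, I])));
    // Elapsed milliseconds for this chunk; GetTickCount is coarse but
    // good enough to spot a growing trend.
    Writeln(Format('chunk %d: %d ms',
      [Chunk, Integer(GetTickCount - StartTick)]));
  end;
end.
```

Run under the IDE debugger, a per-chunk time that keeps climbing would point at the event log. Flat times would point back at the application.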
If you do verify that this is the problem - or even if it's not - then I would recommend using some type of logging library instead. OutputDebugString really loses its effectiveness when you use it for massive log dumps like that. Even if you do find a way to reset or limit the output window, you'd be losing all of that logging data.
IDE Fix Pack has an optimisation to improve performance of OutputDebugString
The IDE’s Debug Log View also got an optimization. The debugger now
updates the Log View only when the IDE is idle. This allows the IDE to
stay responsive when hundreds of OutputDebugString messages or other
debug messages are written to the Debug Log View.
Note that this only runs on Delphi 2007 and above.

Resources