I have a TCP/IP DataSnap server running as a service [Session based LifeCycle] that continuously chews up memory and never comes back to the starting memory size even when there are no connections to it.
In order to eliminate my code as the culprit, I mocked up a basic TCP/IP DataSnap server running as a VCL application [Session based LifeCycle] that serves a Server Method class [TDSServerModule] containing only basic mathematical functions that use native data types [no objects to create or free].
When I connect to said DataSnap server with a very thin client I get the same results.
The Memory Usage continuously grows with each connection and sporadically grows when executing the server side methods from the client. Once the connections are closed the DataSnap Server never reduces its Memory Usage [even when left running without connections for 8 hrs].
Any suggestions as to why this occurs or more importantly how to curtail it?
I am using RAD Studio XE2 Update 4 HotFix 1.
Let me quote a "must read" article about DataSnap. It is about XE3, but I hope the code here works for XE2 as well.
Memory consumption
One of the issues that I observed was related to memory consumption. Why Datasnap server consumes so much memory if the method called does absolutely nothing?
Maybe I don’t know how to explain exactly but i will try. Basically the DataSnap creates a session for each HTTP connection that it receives. This session will be destroyed after 20 minutes, in other words, on the first 20 minutes of the test the memory consumption will only go up, after that it has the tendency of stabilize itself. I really have no idea why Datasnap does this. In a REST application I don’t see much sense in these sessions with a default configuration. Of course, sessions can be helpful, but i can’t understand why it’s a default configuration. Indeed, DataSnap doesn’t have a configuration for that. It appears like you just have to use this session control, without being able to choose otherwise (There is no documentation). The MORMot framework has a session control too but it’s configurable and doesn’t consumes so much memory.
Anyway, there is a way around this problem. Daniele Teti wrote an article on his blog, take a look. The solution that I will show here was placed by him on his blog. Thanks Daniele.
uses System.StrUtils, DataSnap.DSSession, Data.DBXPlatform;
function TServerMethods1.HelloWorld: String;
begin
  Result := 'Hello World';
  GetInvocationMetaData.CloseSession := True;
end;
After running this method the session will close and memory consumption will be lower. Of course still exists an overhead for creating and destroying this session.
So it seems the best course for you is to end every server method by explicitly closing the session as shown above, if that is possible in XE2. Then you'd better read those articles again and prepare for future scalability challenges.
http://www.danieleteti.it/2012/12/15/datasnap-concurrency-problems-and-update1/
http://robertocschneiders.wordpress.com/2013/01/09/datasnap-analysis-based-on-speed-stability-tests-part-2/
I added the method below and called it from the "TWebModule1::WebModuleBeforeDispatch" event. It eliminated the memory growth and actually allowed the idle REST service to return to its no-session memory footprint. DataSnap definitely needs to work on this issue.
// ---------------------------------------------------------------------------
/// <summary> Memory Restoration. DataSnap opens a session for each call
/// even when the service is set for invocation.
/// Sessions are building up consuming memory and seem not to be freed.
/// See: https://stackoverflow.com/questions/17748300/how-to-release-datasnap-memory-once-connections-are-closed
/// </summary>
/// <remarks> Iterates session in the session manager and closes then terminates
/// any session that has been idle for over 10 seconds.
/// </remarks>
/// <returns> void
/// </returns>
// ---------------------------------------------------------------------------
void TWebModule1::CloseIdleSessions()
{
    TDSSessionManager* sessMgr = TDSSessionManager::Instance;
    int sessCount = sessMgr->GetSessionCount();
    WriteLogEntry(LogEntryTypeDebug, "TWebModule1::CloseIdleSessions", "Session Count: " + IntToStr(sessCount));
    TStringList* sessKeys = new TStringList;
    sessMgr->GetOpenSessionKeys(sessKeys);
    WriteLogEntry(LogEntryTypeDebug, "TWebModule1::CloseIdleSessions", "Session Keys Count: " + IntToStr(sessKeys->Count));
    TDSSession* sess = NULL;
    for(int index = 0; index < sessKeys->Count; index++)
    {
        String sessKey = sessKeys->Strings[index];
        sess = sessMgr->Session[sessKey];
        unsigned elapsed = (int)sess->ElapsedSinceLastActvity();
        if(elapsed > 10000)
        {
            WriteLogEntry(LogEntryTypeDebug, "TWebModule1::CloseIdleSessions", "CloseSession TerminateSession Key: " + sessKey);
            sessMgr->CloseSession(sessKey);
            sessMgr->TerminateSession(sessKey);
        }
        sess = NULL;
    }
    delete sessKeys;
    sessMgr = NULL;
}
You should check the LifeCycle property of the TDSServerClass component in your server container. It determines how the session is handled, and it defaults to Session. Setting it to Invocation frees the session after each call (invocation). This of course means you have no state, but that would be OK in a typical REST server.
If memory consumption still keeps growing, put the following line in your .dpr file:
ReportMemoryLeaksOnShutdown := True;
Your application will then report any memory leaks when the DataSnap server shuts down.
I watched this video: https://channel9.msdn.com/Events/TechDays/Techdays-2012-the-Netherlands/2287. So I tried to use async/await in a controller. This is basically what I did:
public class HomeController : Controller
{
    private static WebClient _webClient = new WebClient();
    public async Task<ActionResult> IndexAsync()
    {
        var data = await _webClient.DownloadStringTaskAsync("http://stackoverflow.com/");
        return View("Index", (object)data);
    }
    public ActionResult Index()
    {
        var data = _webClient.DownloadString("http://stackoverflow.com/");
        return View("Index", (object)data);
    }
}
Then I used Apache Benchmark and did the two following tests :
ab -n 100 -c 100 http://localhost:53446/Home/index
and
ab -n 100 -c 100 http://localhost:53446/Home/indexasync
And I got exactly the same performance (I have 8 CPU cores). Why is that?
Async is not about performance; expecting it to be faster is categorically incorrect. In fact, an async request will often be less performant than a sync one, simply because async involves additional overhead.
The reason to use async is efficient resource management and scale. A typical web server process will have around 1000 threads. This is often called the "max requests", as one thread generally equals one request. If you have an 8 core CPU, you should ideally have a process per core (in IIS those are called "web workers"). So, theoretically, you'd have around 8000 threads total to work with.
That's quite a lot actually, though a modern web page consumes more requests than most people think. The page itself is one request, but that page will have images and external JS and CSS files, all of which generate a request, and will often utilize AJAX, for further requests. The point is that while 8000+ threads is still quite a lot to have in your pool, you could still very well run out if the server is under significant load.
Async merely gives you breathing room above that limit. When a thread enters a wait state, it can be returned to the pool to field other requests while the external operation completes. The alternative (sync) is that the thread just sits there idle. That's really all there is to it: async is entirely about tasking those otherwise idle threads with some other bit of work, which could mean the difference between requests queuing up and timing out and requests being handled, even if slowly.
Running a load test that exhausts the thread pool is difficult to do on a local box. It's a lot easier to pretend the thread pool is exhausted by artificially restricting it, as I do in my gist:
protected void Application_Start()
{
    int workerThreads, ioThreads;
    ThreadPool.GetMaxThreads(out workerThreads, out ioThreads);
    ThreadPool.SetMaxThreads(Environment.ProcessorCount, ioThreads);
    ...
}
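To see the same effect outside ASP.NET, here is a minimal Python sketch of the experiment. It is not the gist referred to above: the pool size, request count and delay are arbitrary assumptions, and time.sleep / asyncio.sleep merely stand in for the external call. With the pool restricted, the blocking handlers queue up behind each other, while the async handlers all wait at the same time.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2    # artificially small "thread pool" (assumed value)
REQUESTS = 8     # simulated concurrent requests (assumed value)
IO_DELAY = 0.05  # seconds each request waits on external I/O (assumed value)


def blocking_request() -> None:
    """Synchronous handler: the worker thread sits idle during the wait."""
    time.sleep(IO_DELAY)


async def async_request() -> None:
    """Asynchronous handler: the wait does not occupy a worker thread."""
    await asyncio.sleep(IO_DELAY)


def run_blocking() -> float:
    """Serve all requests through the restricted pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        futures = [pool.submit(blocking_request) for _ in range(REQUESTS)]
        for future in futures:
            future.result()
    return time.perf_counter() - start


async def _serve_all_async() -> None:
    await asyncio.gather(*(async_request() for _ in range(REQUESTS)))


def run_async() -> float:
    """Serve all requests on a single event loop; return elapsed seconds."""
    start = time.perf_counter()
    asyncio.run(_serve_all_async())
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"blocking: {run_blocking():.2f}s  async: {run_async():.2f}s")
```

Each individual request still takes IO_DELAY in both variants; only the total differs once the pool is the bottleneck, which is why an unrestricted local test shows no difference.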
There are a couple of reasons that stand out.
From Using Asynchronous Methods in ASP.NET MVC 4
the number of threads in the thread pool is limited (the default maximum for .NET 4.5 is 5,000). In large applications with high concurrency of long-running requests, all available threads might be busy. This condition is known as thread starvation.
So, running 100 requests at a time will not even begin to starve your threads.
Also, a simple GET request will run very quickly. A test that performs an action that takes multiple seconds or even minutes would bear more obvious performance gains.
We're currently preparing our Azure role (standard Web Role) for an expected massive load, and we need to know how much memory the current setup consumes. To accomplish this, we're using load tests while measuring the consumed memory with GC.GetTotalMemory.
The page http://technet.microsoft.com/en-us/cloud/gg663909.aspx lists the Compute Instance Guaranteed Memory for each instance size (for example, 0.768 GB for the Extra-Small Instance and 3.5 GB for the Medium Instance).
Are the values of GC.GetTotalMemory comparable to the values in that list? In other words, if GC.GetTotalMemory stays significantly below the listed limit, can we be sure that there won't be any sudden performance loss due to memory swapping?
If we hit the limit, is our assumption correct that there will be some memory swapping (writing memory content to the virtual harddisk), or will there be more severe implications like repeated App Pool recycling?
(the last question comes up because most shared hosters will recycle your App Pool if you hit some memory limit, but frankly we don't expect anything like this from Windows Azure)
This method only gives you the number of bytes currently allocated by your process on the managed heap. The 0.768 GB figure includes the memory available to the operating system, and there can be virtual memory as well.
Reference: System.GC.GetTotalMemory
To get the total system memory, add a reference to System.Management and use:
private static void DisplayTotalRam()
{
    string Query = "SELECT MaxCapacity FROM Win32_PhysicalMemoryArray";
    ManagementObjectSearcher searcher = new ManagementObjectSearcher(Query);
    foreach (ManagementObject WniPART in searcher.Get())
    {
        UInt32 SizeinKB = Convert.ToUInt32(WniPART.Properties["MaxCapacity"].Value);
        UInt32 SizeinMB = SizeinKB / 1024;
        UInt32 SizeinGB = SizeinMB / 1024;
        Console.WriteLine("Size in KB: {0}, Size in MB: {1}, Size in GB: {2}", SizeinKB, SizeinMB, SizeinGB);
    }
}
Source for code
To answer your last question, Windows Azure will stay out of the way, and paging will happen like on any Windows server.
Whether IIS recycles your app pool probably depends on your IIS settings, but those are under your control. (You can, for example, run appcmd in a startup task if you want to change a default.)
My Delphi XE application consists of a single EXE that uses a local server DLL created with RemObjects. A specific operation uses a lot of memory until it raises an exception saying there is not enough memory. To understand why and where this happens, I placed checkpoints throughout my code where I report memory usage. The problem is that I get very different numbers depending on the method used to obtain them:
If I use the method explained here which asks FastMM directly for both the Client EXE and Server DLL, here is what I get:
STEP 1: [client] = 36664572 - [server] = 3274976
STEP 2: [client] = 62641230 - [server] = 44430224
STEP 3: [client] = 66665630 - [server] = 44430224
Now if I use the method explained here which uses GetProcessMemoryInfo, I get far more memory usage:
STEP 1: [process] = 133722112
STEP 2: [process] = 1072115712
STEP 3: [process] = 1075818496
It looks like the second method is the right one given my memory problems, but how can the FastMM figure be so low? And what explains the difference?
GetProcessMemoryInfo also reports memory that is not managed by FastMM, such as memory allocated by the various non-Delphi DLLs you might call (for example the Windows API).
Also, FastMM can allocate more memory from Windows than your application actually uses, for internal structures, fragmentation and pooling.
And lastly, with GetProcessMemoryInfo you are measuring the working set size. That is the part of the application's memory that is currently in RAM rather than in the page file. It includes more than just data structures and is definitely not comparable to the total memory the application has allocated. PagefileUsage would be more comparable. Working set size is almost never what you are looking for. See here for a better explanation.
So they give different results because they measure different things.
Is there a way to hook into the WndProc of a dbx user session?
Background:
dbx DataSnap uses Indy components for TCP communication. In its simplest form, a DataSnap server is an Indy TCP server accepting connections. When a connection is established, Indy creates a thread for that connection which handles all requests for that connection.
Each of these user connections consumes resources. For a server with a couple hundred simultaneous connections, those resources can be expensive. Many of the resources could be pooled, but I don't want to acquire and release a resource every time it is needed.
Instead, I'd like to implement an idle timer. After a thread finishes with a resource, the timer would start. If the thread accesses the resource before the timer has elapsed, the resource would still be "assigned" to that thread. But if the timer elapses before the next access, the resource would be released back to the pool. The next time the thread needs the resource, another resource would be acquired from the pool.
I haven't found a way to do this. I've tried using SetTimer, but my timer callback never fires. I assume this is because Indy's WndProc for the thread isn't dispatching WM_TIMER. I have no control over the "execution loop" for this thread, so I can't easily check whether an event has been signaled. In fact, none of my code for this thread executes unless the thread is handling a user request, and I want code to execute outside of any user request.
Solutions to the original question or suggestions for alternative approaches would be equally appreciated.
We tried to implement something to share resources across user threads using TCP connections (no HTTP transport, so no SessionManager), but ran into all sorts of problems. In the end we abandoned using individual user threads (set LifeCycle := TDSLifeCycle.Server) and created our own FResourcePool and FUserList (both TThreadList) in ServerContainerUnit. It only took 1 day to implement, and it works very well.
Here's a simplified version of what we did:
TResource = class
  SomeResource: TSomeType;
  UserCount: Integer;
  LastSeen: TDateTime;
end;
When a user connects, we check FResourcePool for the TResource the user needs. If it exists, we increment the resource's UserCount. When the user is done, we decrement UserCount and set LastSeen. A TTimer fires every 60 seconds and frees any resource with UserCount = 0 whose LastSeen is more than 60 seconds old.
The FUserList is very similar. If a user hasn't been seen for several hours, we assume that their connection was severed (because our client app does an auto-disconnect if the user has been idle for 90 minutes) so we programmatically disconnect the user on the server-side, which also decrements their use of each resource. Of course, this means that we had to create a session variable ourselves (e.g., CreateGUID();) and pass that to the client when they first connect. The client passes the session id back to the server with each request so we know which FUserList record is theirs. Although this is a drawback to not using user threads, it is easily managed.
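The bookkeeping described above might look roughly like the following Python sketch. It shows the logic only, not our Delphi code: IdleResourcePool, acquire, release and sweep are hypothetical names, the lock stands in for TThreadList, creating a missing resource on first use is an assumption, and sweep is what the 60-second TTimer handler would call.

```python
import threading
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Resource:
    payload: Any            # the pooled object (SomeResource in the post)
    user_count: int = 0     # how many users currently hold it
    last_seen: float = 0.0  # clock value at the most recent release


class IdleResourcePool:
    """Shared resources that are freed once unused for idle_seconds."""

    def __init__(self, factory: Callable[[str], Any],
                 idle_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._factory = factory
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._resources: Dict[str, Resource] = {}

    def acquire(self, key: str) -> Any:
        """Return the resource for key, creating it on first use."""
        with self._lock:
            res = self._resources.get(key)
            if res is None:
                res = Resource(self._factory(key))
                self._resources[key] = res
            res.user_count += 1
            return res.payload

    def release(self, key: str) -> None:
        """Mark one user as done and remember when that happened."""
        with self._lock:
            res = self._resources[key]
            res.user_count -= 1
            res.last_seen = self._clock()

    def sweep(self) -> List[str]:
        """Drop every unused resource that has been idle too long."""
        now = self._clock()
        with self._lock:
            stale = [key for key, res in self._resources.items()
                     if res.user_count == 0
                     and now - res.last_seen > self._idle_seconds]
            for key in stale:
                del self._resources[key]
        return stale
```

Passing the clock in as a parameter is only there so the idle rule can be exercised without waiting a minute; a server would keep the default.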
James L may have nailed it. Since an Indy thread does not have a message loop, you have to rely on another mechanism, such as read-only thread-local properties (like UserCount and/or LastSeen in his example), and use the main thread of the server to run a TTimer that frees resources according to some rule.
EDIT: another idea is to create a common data structure (example below) that is updated each time a thread finishes its job.
WARNING: coded from memory only... It may not compile... ;-)
Example:
type
  TThreadStatus = (tsDoingMyJob, tsFinished);
  TThreadStatusInfo = class
  private
    fTStatus: TThreadStatus;
    fDTFinished: TDateTime;
    procedure SetThreadStatus(value: TThreadStatus);
  public
    property ThreadStatus: TThreadStatus read fTStatus write SetThreadStatus;
    property FinishedTime: TDateTime read fDTFinished;
    procedure FinishJob;
    procedure DoJob;
  end;
procedure TThreadStatusInfo.SetThreadStatus(value: TThreadStatus);
begin
  fTStatus := value;
  case fTStatus of
    tsDoingMyJob:
      fDTFinished := 0;   // still working, no finish time yet
    tsFinished:
      fDTFinished := Now; // remember when the job finished
  end;
end;
procedure TThreadStatusInfo.FinishJob;
begin
  ThreadStatus := tsFinished;
end;
procedure TThreadStatusInfo.DoJob;
begin
  ThreadStatus := tsDoingMyJob;
end;
Put these objects in a list (any list class you like), and make sure each thread is associated
with an index in that list. Remove items from the list only when you no longer need that
number of threads (shrinking the list). Add an item when you create a new thread
(for example, if you have 4 threads and need a 5th, you create a new item on the main thread).
Since each thread have an index on the list, you don't need to encapsulate this write (the
calls on T
on a TCriticalSection.
You can read this list without trouble, using a TTimer on the main thread to inspect
the status of each thread. Since you have each thread's finishing time,
you can calculate timeouts.
I'm a member of a team that uses Delphi 2007 for a large application, and we suspect heap corruption because sometimes there are strange bugs that have no other explanation.
I believe that the compiler's range checking option only covers arrays. I want a tool that raises an exception or writes a log entry when there is a write to a memory address that is not allocated by the application.
Regards
EDIT: The error is of type:
Error: Access violation at address 00404E78 in module 'BoatLogisticsAMCAttracsServer.exe'. Read of address FFFFFFDD
EDIT2: Thanks for all suggestions. Unfortunately I think that the problem goes deeper than that. We use a patched version of Bold for Delphi, as we own the source. Probably some errors were introduced in the Bold framework. Yes, we have a log with call stacks that are handled by JCL, and also trace messages. So a call stack with the exception can look like this:
20091210 16:02:29 (2356) [EXCEPTION] Raised EBold: Failed to derive ServerSession.mayDropSession: Boolean
OCL expression: not active and not idle and timeout and (ApplicationKernel.allinstances->first.CurrentSession <> self)
Error: Access violation at address 00404E78 in module 'BoatLogisticsAMCAttracsServer.exe'. Read of address FFFFFFDD. At Location BoldSystem.TBoldMember.CalculateDerivedMemberWithExpression (BoldSystem.pas:4016)
Inner Exception Raised EBold: Failed to derive ServerSession.mayDropSession: Boolean
OCL expression: not active and not idle and timeout and (ApplicationKernel.allinstances->first.CurrentSession <> self)
Error: Access violation at address 00404E78 in module 'BoatLogisticsAMCAttracsServer.exe'. Read of address FFFFFFDD. At Location BoldSystem.TBoldMember.CalculateDerivedMemberWithExpression (BoldSystem.pas:4016)
Inner Exception Call Stack:
[00] System.TObject.InheritsFrom (sys\system.pas:9237)
Call Stack:
[00] BoldSystem.TBoldMember.CalculateDerivedMemberWithExpression (BoldSystem.pas:4016)
[01] BoldSystem.TBoldMember.DeriveMember (BoldSystem.pas:3846)
[02] BoldSystem.TBoldMemberDeriver.DoDeriveAndSubscribe (BoldSystem.pas:7491)
[03] BoldDeriver.TBoldAbstractDeriver.DeriveAndSubscribe (BoldDeriver.pas:180)
[04] BoldDeriver.TBoldAbstractDeriver.SetDeriverState (BoldDeriver.pas:262)
[05] BoldDeriver.TBoldAbstractDeriver.Derive (BoldDeriver.pas:117)
[06] BoldDeriver.TBoldAbstractDeriver.EnsureCurrent (BoldDeriver.pas:196)
[07] BoldSystem.TBoldMember.EnsureContentsCurrent (BoldSystem.pas:4245)
[08] BoldSystem.TBoldAttribute.EnsureNotNull (BoldSystem.pas:4813)
[09] BoldAttributes.TBABoolean.GetAsBoolean (BoldAttributes.pas:3069)
[10] BusinessClasses.TLogonSession._GetMayDropSession (code\BusinessClasses.pas:31854)
[11] DMAttracsTimers.TAttracsTimerDataModule.RemoveDanglingLogonSessions (code\DMAttracsTimers.pas:237)
[12] DMAttracsTimers.TAttracsTimerDataModule.UpdateServerTimeOnTimerTrig (code\DMAttracsTimers.pas:482)
[13] DMAttracsTimers.TAttracsTimerDataModule.TimerKernelWork (code\DMAttracsTimers.pas:551)
[14] DMAttracsTimers.TAttracsTimerDataModule.AttracsTimerTimer (code\DMAttracsTimers.pas:600)
[15] ExtCtrls.TTimer.Timer (ExtCtrls.pas:2281)
[16] Classes.StdWndProc (common\Classes.pas:11583)
The inner exception part is the callstack at the moment an exception is reraised.
EDIT3: The theory right now is that the Virtual Method Table (VMT) is somehow broken. When this happens there is no indication of it. Only when a method is called is an exception raised (ALWAYS on address FFFFFFDD, -35 decimal), but then it is too late and you don't know the real cause of the error. Any hint on how to catch a bug like this is really appreciated!!! We have tried SafeMM, but the problem is that the memory consumption is too high even when the 3 GB flag is used. So now I am offering a bounty to the SO community :)
EDIT4: One hint is that according to the log there is often (or even always) another exception before this one. It can be, for example, an optimistic locking failure in the database. We have tried to raise exceptions by force, but in the test environment everything just works fine.
EDIT5: Story continues... I did a search on the logs for the last 30 days now. The result:
"Read of address FFFFFFDB" 0
"Read of address FFFFFFDC" 24
"Read of address FFFFFFDD" 270
"Read of address FFFFFFDE" 22
"Read of address FFFFFFDF" 7
"Read of address FFFFFFE0" 20
"Read of address FFFFFFE1" 0
So the current theory is that an enum (there are lots of them in Bold) overwrites a pointer. I got hits on 5 different addresses above. It could mean that the enum holds 5 values, of which the second is the most used. If there is an exception, a rollback should occur in the database and the Bold objects should be destroyed. Maybe not everything is destroyed, and an enum can still write to an address location. If this is true, maybe it is possible to search the code with a regular expression for an enum with 5 values?
EDIT6: To summarize: no, there is no solution to the problem yet. I realize that I may have misled you a bit with the call stack. Yes, there is a timer in it, but there are other call stacks without a timer. Sorry for that. But there are 2 common factors:
An exception with Read of address FFFFFFxx.
Top of callstack is System.TObject.InheritsFrom (sys\system.pas:9237)
This convinces me that VilleK best describes the problem.
I'm also convinced that the problem is somewhere in the Bold framework.
But the BIG question is, how can problems like this be solved?
It is not enough to have an Assert like VilleK suggests, as the damage has already happened and the call stack is gone at that moment. So, to describe my view of what may cause the error:
Somewhere a pointer is assigned a bad value 1, but it can also be 0, 2, 3 etc.
An object is assigned to that pointer.
There is a method call in the object's base class. This causes the method TObject.InheritsFrom to be called, and an exception appears on address FFFFFFDD.
Those 3 events can be close together in the code, but they may also be far apart in time. I think this is true for the last method call.
EDIT7: We work closely with the author of Bold, Jan Norden, and he recently found a bug in the OCL evaluator in the Bold framework. When this was fixed, these kinds of exceptions decreased a lot, but they still occasionally occur. Still, it is a big relief that this is almost solved.
You write that you want there to be an exception if
there is a write on a memory address that is not allocated by the application
but that happens anyway, both the hardware and the OS make sure of that.
If you mean you want to check for invalid memory writes in your application's allocated address range, then there is only so much you can do. You should use FastMM4, and use it with its most verbose and paranoid settings in debug mode of your application. This will catch a lot of invalid writes, accesses to already released memory and such, but it can't catch everything. Consider a dangling pointer that points to another writeable memory location (like the middle of a large string or array of float values) - writing to it will succeed, and it will trash other data, but there's no way for the memory manager to catch such access.
I don't have a solution but there are some clues about that particular error message.
System.TObject.InheritsFrom adds the (negative) constant vmtParent to the Self pointer (the class) to get the address of the pointer to the parent class.
In Delphi 2007 vmtParent is defined:
vmtParent = -36;
So the error address $FFFFFFDD (-35) suggests that the class pointer is 1 in this case, since 1 + (-36) = -35.
Here is a test case to reproduce it:
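procedure TForm1.FormCreate(Sender: TObject);
var
  I : integer;
  O : tobject;
begin
  I := 1;
  O := @I;                  // O now "points to an object" whose class pointer is 1
  O.InheritsFrom(TObject);  // access violation: Read of address FFFFFFDD
end;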
I've tried it in Delphi 2010 and get 'Read of address FFFFFFD1' because the vmtParent is different between Delphi versions.
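The arithmetic can be checked with a few lines of Python (used here only as a calculator, nothing Delphi-specific). This sketch assumes 32-bit wrap-around and takes vmtParent = -36 from the Delphi 2007 definition quoted above; fault_address and corrupt_pointer_value are names made up for the illustration. Applied to the addresses from the question's log search, it maps FFFFFFDC..FFFFFFE0 back to bogus class pointers 0..4.

```python
VMT_PARENT_D2007 = -36  # value quoted above for Delphi 2007
MASK_32 = 0xFFFFFFFF    # addresses wrap around at 32 bits


def fault_address(class_pointer: int, vmt_parent: int = VMT_PARENT_D2007) -> int:
    """Address that gets read when the parent class is looked up."""
    return (class_pointer + vmt_parent) & MASK_32


def corrupt_pointer_value(fault_addr: int, vmt_parent: int = VMT_PARENT_D2007) -> int:
    """Inverse: which bogus class pointer produces this fault address?"""
    return (fault_addr - vmt_parent) & MASK_32


if __name__ == "__main__":
    for addr in (0xFFFFFFDC, 0xFFFFFFDD, 0xFFFFFFDE, 0xFFFFFFDF, 0xFFFFFFE0):
        print(f"Read of address {addr:08X} -> class pointer {corrupt_pointer_value(addr)}")
```

By the same reasoning, the FFFFFFD1 seen in Delphi 2010 would correspond to an offset of -48 if the class pointer is again 1; that figure is inferred from the address, not looked up in the RTL source.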
The problem is that this happens deep inside the Bold framework so you may have trouble guarding against it in your application code.
You can try this on your objects that are used in the DMAttracsTimers-code (which I assume is your application code):
Assert(Integer(Obj.ClassType)<>1,'Corrupt vmt');
It sounds like you have memory corruption of object instance data.
The VMT itself isn't getting corrupted, FWIW: the VMT is (normally) stored in the executable and the pages that map to it are read-only. Rather, as VilleK says, it looks like the first field of the instance data in your case got overwritten with a 32-bit integer with value 1. This is easy enough to verify: check the instance data of the object whose method call failed, and verify that the first dword is 00000001.
If it is indeed the VMT pointer in the instance data that is being corrupted, here's how I'd find the code that corrupts it:
Make sure there is an automated way to reproduce the issue that doesn't require user input. The issue may be only reproducible on a single machine without reboots between reproductions owing to how Windows may choose to lay out memory.
Reproduce the issue and note the address of the instance data whose memory is corrupted.
Rerun and check the second reproduction: make sure that the address of the instance data that was corrupted in the second run is the same as the address from the first run.
Now, step into a third run, put a 4-byte data breakpoint on the section of memory indicated by the previous two runs. The point is to break on every modification to this memory. At least one break should be the TObject.InitInstance call which fills in the VMT pointer; there may be others related to instance construction, such as in the memory allocator; and in the worst case, the relevant instance data may have been recycled memory from previous instances. To cut down on the amount of stepping needed, make the data breakpoint log the call stack, but not actually break. By checking the call stacks after the virtual call fails, you should be able to find the bad write.
mghie is right of course (FastMM4 calls the option FullDebugMode).
Note that this usually works with barriers placed just before and after a heap allocation, which are checked regularly (on every heap manager access?).
This has two consequences:
the place where FastMM detects the error might differ from the spot where it happens
a totally random write (not an overflow of an existing allocation) might not be detected.
So here are some other things to think about:
enable runtime checks
review all your compiler's warnings.
Try to compile with a different Delphi version or FPC. Other compilers/RTLs/heap managers have different layouts, and that could make the error easier to catch.
If all that yields nothing, try to simplify the application until the problem goes away. Then investigate the most recently commented-out/ifdef'ed parts.
The first thing I would do is add MadExcept to your application and get a stack traceback that prints out the exact calling tree, which will give you some idea what is going on here. Instead of a random exception and a binary/hex memory address, you need to see a calling tree, with the values of all parameters and local variables from the stack.
If I suspect memory corruption in a structure that is key to my application, I will often write extra code to make tracking this bug possible.
For example, in-memory structures (class or record types) can be arranged to have a Magic1: Word at the beginning and a Magic2: Word at the end of each record. An integrity check function can then verify, for each record, that Magic1 and Magic2 have not changed from the values set in the constructor. The destructor would change Magic1 and Magic2 to other values such as $FFFF.
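As a rough illustration of the idea, here is a Python sketch that lays records out in a raw byte buffer so that a stray write can actually hit the guard words. The magic values, the record layout (Word, Integer, Word) and the function names are all made up for the example; in Delphi these would be fields of the class or record and the check would live in a method.

```python
import struct

# Layout: Magic1 (Word), payload (32-bit Integer), Magic2 (Word), no padding.
RECORD = struct.Struct("<HiH")
MAGIC1 = 0xBEEF  # hypothetical guard value at the start of a live record
MAGIC2 = 0xCAFE  # hypothetical guard value at the end of a live record
DEAD = 0xFFFF    # written to both guards by the "destructor"


def construct(buf: bytearray, offset: int, value: int) -> None:
    """'Constructor': store the payload between the two guard words."""
    RECORD.pack_into(buf, offset, MAGIC1, value, MAGIC2)


def destroy(buf: bytearray, offset: int) -> None:
    """'Destructor': poison the guards so later use can be recognised."""
    _, value, _ = RECORD.unpack_from(buf, offset)
    RECORD.pack_into(buf, offset, DEAD, value, DEAD)


def check(buf: bytearray, offset: int) -> str:
    """Integrity check: 'ok', 'destroyed' (use after free) or 'corrupt'."""
    magic1, _, magic2 = RECORD.unpack_from(buf, offset)
    if (magic1, magic2) == (MAGIC1, MAGIC2):
        return "ok"
    if (magic1, magic2) == (DEAD, DEAD):
        return "destroyed"
    return "corrupt"


if __name__ == "__main__":
    heap = bytearray(2 * RECORD.size)
    construct(heap, 0, 42)
    construct(heap, RECORD.size, 7)
    heap[6:10] = (1).to_bytes(4, "little")  # stray 32-bit write of the value 1
    print(check(heap, 0), check(heap, RECORD.size))
```

Running the check at strategic points (for example at the start of every public method) narrows down when the damage happened, even though it cannot say who caused it.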
I would also consider adding trace logging to my application. Trace logging in Delphi applications often starts with me declaring a TraceForm form with a TMemo on it, where the TraceForm.Trace(msg: String) function starts out as "Memo1.Lines.Add(msg)". As my application matures, the trace logging facilities are how I watch running applications for overall patterns in their behaviour and misbehaviour. Then, when a "random" crash or memory corruption with "no explanation" happens, I have a trace log to go back through and see what led to this particular case.
Sometimes it is not memory corruption but a simple basic error: I forgot to check whether X is assigned and then dereference it with X.DoSomething(...), which assumes X is assigned when it isn't.
I noticed that a timer is in the stack trace.
I have seen a lot of strange errors whose cause was a timer event firing after the form was freed.
The reason is that a timer event can be put on the message queue and not get processed before the other components are destroyed.
One way around that problem is to disable the timer as the first statement in the form's destructor. After disabling the timer, call Application.ProcessMessages, so that any pending timer events are processed before the components are destroyed.
Another way is to check in the timer event whether the form is being destroyed (csDestroying in ComponentState).
Can you post the sourcecode of this procedure?
BoldSystem.TBoldMember.CalculateDerivedMemberWithExpression
(BoldSystem.pas:4016)
So we can see what's happening on line 4016.
And also the CPU view of this function?
(just set a breakpoint on line 4016 of this procedure and run. And copy+paste the CPU view contents if you hit the breakpoint). So we can see which CPU instruction is at address 00404E78.
Could there be a problem with re-entrant code?
Try putting some guard code around the TTimer event handler code:
procedure TAttracsTimerDataModule.AttracsTimerTimer(ASender: TObject);
begin
  if FInTimer then
  begin
    // Let us know there is a problem or log it to a file, or something.
    // Even throw an exception
    OutputDebugString('Timer called re-entrantly!');
    Exit; //======>
  end;
  FInTimer := True;
  try
    // method contents
  finally
    FInTimer := False;
  end;
end;
I think there is another possibility: the timer is fired to check whether there are "Dangling Logon Sessions". Then a call is made on a TLogonSession object to check if it may be dropped (_GetMayDropSession), right? But what if the object has already been destroyed? Maybe due to thread safety issues, or just a .Free call instead of a FreeAndNil call (so the variable is still <> nil), etc. In the meantime, other objects are created, so the memory gets reused. If you try to access the variable some time later, you can/will get random errors...
An example:
procedure TForm11.Button1Click(Sender: TObject);
var
c: TComponent;
i: Integer;
p: pointer;
begin
//create
c := TComponent.Create(nil);
//get size and memory
i := c.InstanceSize;
p := Pointer(c);
//destroy component
c.Free;
//this call will succeed, object is gone, but memory still "valid"
c.InheritsFrom(TObject);
//overwrite memory
FillChar(p, i, 1);
//CRASH!
c.InheritsFrom(TObject);
end;
Access violation at address 004619D9 in module 'Project10.exe'. Read of address 01010101.
Isn't the problem that "_GetMayDropSession" is referencing a freed session variable?
I have seen this kind of error before, in TMS components, where objects were freed and then referenced in an OnChange handler etc. (it only gave errors in some situations and was very difficult/impossible to reproduce; it has been fixed by TMS now :-) ). I also got something similar with RemObjects sessions (due to a programming bug of my own).
I would try adding a dummy variable to the session class and checking its value:
public variable iMagicNumber: integer;
constructor create: iMagicNumber := 1234567;
destructor destroy: iMagicNumber := -1;
"other procedures": assert(iMagicNumber = 1234567)