Memory Profiling an Existing Process in VS2012

I am not able to profile memory allocation using the VS 2012 built-in profiler when connecting to an existing web application process.
When I start profiling and let it launch the process, it works fine, but if I try to attach to an existing process it reverts to CPU sampling instead of memory allocation profiling. There is no warning that this is going to happen.
Does anyone know of a reason why this would be the case?
Update
I'm willing to accept that this is a limitation of the profiler (although a warning that it is falling back to CPU sampling would be nice). There are ways around it:
Use a different profiler. I used this one and can recommend it.
Profile from the start and filter results.
I've certainly moved on.

This is almost certainly a limitation of the profiler. I'm sure other profilers can do this if you really need it.
The built-in VS 2012 profilers are great for basic needs, but for anything advanced I would go for something else.
Many of the more advanced profilers are not free, but often have a trial period. This is a good one in my opinion.
.NET Memory Profiler by SciTech Software

Related

Is the "4GB patch" of any use in real life?

And if so, how? I'm talking about this 4GB Patch.
On the face of it, it seems like a pretty nifty idea: on Windows, each 32-bit application normally only has access to 2GB of address space, but if you have 64-bit Windows, you can enable a little flag to allow a 32-bit application to access the full 4GB. The page gives some examples of applications that might benefit from it.
HOWEVER, most applications seem to assume that memory allocation is always successful. Some applications do check if allocations are successful, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computers. With this in mind, it would seem reasonable to assume that any such application would be programmed not to exceed 2GB memory usage under normal conditions, and those few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
So, have I made some incorrect assumptions? If not, how does this tool help in practice? I don't see how it could, yet I see quite a few people around the internet claiming it works (for some definition of works).
Your troublesome assumptions are these:
Some applications do check if allocations are successful, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computers.
There do exist applications that do better than "quit gracefully" on failure. Yes, functionality will be impacted (after all, there wasn't enough memory to continue with the requested operation), but many apps will at least be able to stay running - so, for example, you may not be able to add any more text to your enormous document, but you can at least save the document in its current state (or make it smaller, etc.)
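To make that first point concrete, here is a minimal sketch in Delphi (chosen only because it is the language of the code further down this page); the 1 GB figure and the messages are purely illustrative:

program SurviveOutOfMemory;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Buf: Pointer;
begin
  try
    GetMem(Buf, 1024 * 1024 * 1024);  // ask for 1 GB; raises EOutOfMemory if it cannot be satisfied
    try
      // ... the work that needed the extra memory would go here ...
    finally
      FreeMem(Buf);
    end;
  except
    on EOutOfMemory do
      // The operation is abandoned, but the process stays alive, so the
      // application can still offer to save the document in its current state.
      WriteLn('Not enough memory for that operation - existing data is untouched.');
  end;
end.

The failure is contained to the one operation that needed the memory; everything already in memory is still intact and can be saved.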
With this in mind, it would seem reasonable to assume that any such application would be programmed to not exceed 2GB memory usage under normal conditions, and those few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
The trouble with this assumption is that, in general, an application's memory usage is determined by what you do with it. So, as over the past years storage sizes have grown, and memory sizes have grown, the sizes of files that people want to operate on have also grown - so an application that worked fine when 1GB files were unheard of may struggle now that (for example) high definition video can be taken by many consumer cameras.
Putting that another way: applications that used to fit comfortably within 2GB of memory no longer do, because people want to do more with them now.
I think the following extract from the 4 GB Patch page you linked pretty much explains how and why it works.
Why things are this way on x64 is easy to explain. On x86 applications have 2GB of virtual memory out of 4GB (the other 2GB are reserved for the system). On x64 these two other GB can now be accessed by 32bit applications. In order to achieve this, a flag has to be set in the file's internal format. This is, of course, very easy for insiders who do it every day with the CFF Explorer. This tool was written because not everybody is an insider, and most probably a lot of people don't even know that this can be achieved. Even I wouldn't have written this tool if someone didn't explicitly ask me to.
And to expand on CFF,
The CFF Explorer was designed to make PE editing as easy as possible,
but without losing sight on the portable executable's internal
structure. This application includes a series of tools which might
help not only reverse engineers but also programmers. It offers a
multi-file environment and a switchable interface.
And to quote Larry Miller, a Microsoft MCSA, from a blog post about patching games with the tool:
Under 32 bit windows an application has access to 2GB of VIRTUAL
memory space. 64 bit Windows makes 4GB available to applications.
Without the change mentioned an application will only be able to
access 2GB.
This was not an arbitrary restriction. Most 32 bit applications simply
can not cope with a larger than 2GB address space. The switch
mentioned indicates to the system that it is able to cope. If this
switch is manually set most 32 bit applications will crash in 64 bit
environment.
In some cases the switch may be useful. But don't be surprised if it
crashes.
And finally to add from MSDN - Migrating 32-bit Managed Code to 64-bit,
There is also information in the PE that tells the Windows loader if
the assembly is targeted for a specific architecture. This additional
information ensures that assemblies targeted for a particular
architecture are not loaded in a different one. The C#, Visual Basic
.NET, and C++ Whidbey compilers let you set the appropriate flags in
the PE header. For example, C# and Visual Basic .NET have a /platform:{anycpu,
x86, Itanium, x64} compiler option.
Note: While it is technically possible to modify the flags in the PE header of an assembly after it has been compiled, Microsoft does not recommend doing this.
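On the build-time side of that note: if you compile the application yourself, you can set the same large-address-aware bit directly instead of patching the EXE afterwards. As a hedged illustration in Delphi (the only language with code elsewhere on this page), the $SETPEFLAGS directive does exactly this; the program here is made up:

program LargeAwareDemo;

{$APPTYPE CONSOLE}

uses
  Windows;  // declares the IMAGE_FILE_LARGE_ADDRESS_AWARE constant used below

{$SETPEFLAGS IMAGE_FILE_LARGE_ADDRESS_AWARE}  // same PE header bit the 4GB Patch sets after the fact

begin
  WriteLn('This 32-bit EXE is marked large-address-aware at build time.');
end.

For native C++ builds the equivalent is the /LARGEADDRESSAWARE linker option.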
Finally to answer your question - how does this tool help in practice?
Since you have malloc in your tags, I believe you are working with unmanaged memory. The patch does not change the size of pointers or of any data type; what it does is let allocations return addresses above the 2GB boundary, which breaks unmanaged code that treats a pointer as a signed integer or stashes extra data in its high bit.
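As a hedged illustration (again in Delphi, matching the code elsewhere on this page) of the kind of unmanaged code that breaks once addresses can exceed 2GB; the program and messages are made up for the example:

program SignedPointerBug;

{$APPTYPE CONSOLE}

var
  P: Pointer;
begin
  GetMem(P, 100);
  // Classic latent bug: treating a 32-bit address as a signed Integer.
  // Below the 2GB line this branch never runs; once the EXE is marked
  // large-address-aware, P can be >= $80000000 and the cast goes negative.
  if Integer(P) < 0 then
    WriteLn('Address above 2GB - code that assumed a signed pointer just misfired')
  else
    WriteLn('Address below 2GB - the latent bug stays hidden');
  FreeMem(P);
end.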
But for managed code, since memory is handled by the CLR in .NET, the flag can genuinely help and should not cause many problems unless you are dealing with any of the following:
Invoking platform APIs via p/invoke
Invoking COM objects
Making use of unsafe code
Using marshaling as a mechanism for sharing information
Using serialization as a way of persisting state
To summarize: as a programmer I would not use the tool on my own application; I would rather migrate it myself by changing build targets. That said, if I have an EXE that could do well with more RAM, such as a game, then this is worth a try.

Measure How much memory a program will need

Is it possible to know how much memory a program will need?
The usual method is to use some form of profiler. Many IDEs include their own; NetBeans, for example, has a particularly good profiler (in my opinion) for Java applications. This will show the memory consumption of your program as it's running, and is good for testing for things such as memory leaks as well as overall consumption.
If you've only got the binary, then you'll just have to use a basic tool such as task manager or pmap. This won't give you nearly as much detail though.
If you're using an IDE, it will probably have a built-in feature that shows the same information.
If you are running the binary directly, Task Manager is probably the best option.
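If you want the number from inside a program on Windows rather than from Task Manager, here is a small sketch in Delphi (the language used elsewhere on this page) built on the PSAPI call GetProcessMemoryInfo; it reports on the current process, but the same call works on any process handle you are allowed to open:

program ShowMemoryUse;

{$APPTYPE CONSOLE}

uses
  Windows, PsAPI, SysUtils;

var
  Counters: TProcessMemoryCounters;
begin
  FillChar(Counters, SizeOf(Counters), 0);
  Counters.cb := SizeOf(Counters);
  if GetProcessMemoryInfo(GetCurrentProcess, @Counters, SizeOf(Counters)) then
    WriteLn(Format('Working set: %d KB (peak %d KB)',
      [Counters.WorkingSetSize div 1024, Counters.PeakWorkingSetSize div 1024]))
  else
    WriteLn('GetProcessMemoryInfo failed, error ', GetLastError);
end.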

How best to debug Delphi using the IDE and/or FOSS?

I see the following means of debugging and wonder if there are others or which FOSS tools a small company can use (we don't do much Windows programming).
1 Debug in the IDE, by setting breakpoints, using watches, etc
2 Debug in the IDE, by using the Event Log
I got some good info from this page and tweaked it to add timestamps and indent/outdent on procedure call/return, so that I can see nested calls more quickly. Does anyone know of anything better?
3 Using a profiler
4 Any others?
Such as MadExcept, etc?
(I am currently using Delphi 7)
The Delphi integrated debugger is powerful enough, even in Delphi 7, to handle most debugging tasks. It can also debug an application remotely. Still, there are situations where you may need to track different kinds of issues:
To check for memory leaks, you can switch to a memory manager like FastMM4, which has good memory leak reporting (a sketch follows this list). Profilers like AQTime also include memory allocation profilers to identify these kinds of issues.
To investigate performance problems, you need a performance profiler. There are sampling profilers (less invasive, although they may be less precise) and standard profilers (AQTime again, not cheap but very good, and others).
To trace exceptions, especially in deployed applications, you may need tools like JCL/JVCL (free), MadExcept, EurekaLog or SmartInspect.
To obtain a log of what the application does, you can use OutputDebugString() and the IDE event viewer, or the standalone DebugView application (a minimal helper is also sketched below). There are also dedicated tools like SmartInspect.
You can also convert Delphi 7 .map files to .dbg files, use an external debugger such as WinDbg from the Windows SDK, and look at application calls in tools like Process Explorer.
Some debugging tools also offer features like code coverage checks (which code was actually executed and which never was), platform compliance (checking that API calls are supported by a given platform), resource usage and so on, which may be useful for larger projects.
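To make two of those points concrete (both are sketches; the unit names follow the standard FastMM4 download and the Win32 API, everything else is made up):

For FastMM4 under Delphi 7, the memory manager simply has to be the very first unit in the .dpr so it installs before anything else allocates:

program LeakHuntDemo;

{$APPTYPE CONSOLE}

uses
  FastMM4,   // must come first; leak reporting depends on the defines in FastMM4Options.inc
  SysUtils;

begin
  TObject.Create;  // deliberately never freed - FastMM4 reports it on shutdown
end.

And for the OutputDebugString route, a tiny hand-rolled helper that timestamps each line (visible in the IDE's Event Log, or in DebugView when running outside the IDE):

unit DebugLogU;

interface

procedure DebugLog(const Msg: string);

implementation

uses
  Windows, SysUtils;

procedure DebugLog(const Msg: string);
begin
  OutputDebugString(PChar(FormatDateTime('hh:nn:ss.zzz', Now) + '  ' + Msg));
end;

end.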
Delphi 7's IDE is pretty good to start with; only look at 3rd-party tools if you run into something you can't fix with what you've got:
Its error messages are informative and not excessively verbose.
The debugger is pretty good: you've got lots of options for inspecting variables, plus breakpoints, conditional breakpoints, data breakpoints, address breakpoints and module-load breakpoints. Its call-stack view is good, and it has some support for multi-threaded debugging.
Good step-by-step execution, step into, step over, run until return, etc.
3rd-party tools help when you need to diagnose a problem on the client's computer (you have no Delphi IDE there). If you can get the problem to manifest on your computer you can get away with the IDE alone; no need for any additions, free or paid for.
Profiler: that's not a debugging tool. You use a profiler when you need to find bottlenecks in your application, or when you need to do some speed optimization.
Logging 3rd-party frameworks: the good ones are not cheap, and you can do minimal logging without a tool (even ShowMessage works sometimes).
MadExcept and other tools that log exceptions: they usually require debugging information to be present in the EXE, and that's not a good idea because it makes the program slower AND easier to hack. Again, if you can get the exception on your machine, you don't need the logger.
I'm not saying 3rd party debugging aids are not useful: they are, but I'd wait until I can clearly see the benefit of any tool before I commit to it. And in my opinion there's no such thing as free software: Even the software you don't pay for requires time to learn how to use it and requires changes to your programs and workflow.
For the bigger jobs, there is AQTime.
A cheaper solution for selected code is to compile it with Free Pascal (with the "randomize local variables" option) and run it through Valgrind. I've validated most of my streaming code (which is full of backwards-compatibility constructs) that way.
Another interesting switch is -CR (verify object method calls). It basically turns every
TXXX(something).callsomething
into
if something is TXXX then
  TXXX(something).callsomething
else
  raise EInvalidCast.Create('Invalid class typecast');
Especially in code with complex class trees this can give some precious information.
Normal Pascal language checking (Range, I/O, Overflow, sTack aka -Criot) can be useful too, and is also available in Delphi.
Some range check errors (often loop boundaries) that can be detected statically will even result in compile-time errors in (beta) FPC 3.0.x+.
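As a tiny illustration of what those checks buy you, a sketch with range and overflow checking switched on in the source (you can equally enable them from Project Options, or with -Cr/-Co on the FPC command line); the off-by-one below becomes a clean range-check error instead of silent memory corruption:

program RangeCheckDemo;

{$APPTYPE CONSOLE}
{$R+,Q+}   // range and overflow checks on

var
  Data: array[0..9] of Integer;
  i: Integer;
begin
  for i := 0 to 10 do   // off-by-one: when i reaches 10 a range-check error is raised
    Data[i] := i * i;
end.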
You can try the "Process Stack Viewer" of my (open source) sampling profiler:
http://code.google.com/p/asmprofiler/wiki/ProcessStackViewer
(you need some debug info: a .map or .jdbg file)
You can watch the stack (also the raw stack, with "false positives" but useful when normal stack walking is not possible) of all threads, and do some simple sampling profiling.
Note: my (older) instrumenting profiler, which does exact profiling, is on the same site.
Not sure why you would want to upgrade to debug a problem. Yes, the newer IDEs provide more features to help you debug something, but taking into consideration your previous question on how to debug your program when it hangs, I'd sooner suggest a good logging solution like CodeSite or SmartInspect. They provide far more flexibility and features than any home-grown solution based around the event log, and do not require you to step through the code like the IDE does (which affects timings in multi-threaded problems).
Update
Sorry, I didn't realize that FOSS stands for Free and Open Source Software. CodeSite and SmartInspect are neither. For a free solution, you could have a look at the logging features within the Jedi family of tools.
RAD Studio XE includes light versions of CodeSite and AQTime, which together are compelling improvements.
You could do a lot with JCL Debug, MadExcept, and other profiling and logging tools, but CodeSite and AQTime are the two best for their respective tasks.

Datamining models in FORTRAN or C (or managed code)?

We are planning to develop a datamining package for windows. The program core / calculation engine will be developed in F# with GUI stuff / DB bindings etc done in C# and F#.
However, we have not yet decided on the model implementations. Since we need high performance, we probably can't use managed code here (any objections here?). The question is, is it reasonable to develop the models in FORTRAN or should we stick to C (or maybe C++). We are looking into using OpenCL at some point for suitable models - it feels funny having to go from managed code -> FORTRAN -> C -> OpenCL invocation for these situations.
Any recommendations?
F# compiles to the CLR, which has a just-in-time compiler. It's a dialect of ML, which is strongly typed, allowing all of the nice optimisations that go with that type of architecture; this means you will probably get reasonable performance from F#. For comparison, you could also try porting your code to OCaml (IIRC this compiles to native code) and see if that makes a material difference.
If it really is too slow, then see how far scaling hardware will get you. With the performance available through a modern PC or server it seems unlikely that you would need to go to anything exotic unless you are working with truly Brobdingnagian data sets. Users with smaller data sets may well be OK on an ordinary PC.
Workstations give you perhaps an order of magnitude more capacity than a standard desktop PC. A high-end workstation like an HP Z800 or XW9400 (similar kit is available from several other manufacturers) can take two 4- or 6-core CPU chips, tens of gigabytes of RAM (up to 192GB in some cases) and has various options for high-speed I/O like SAS disks, external disk arrays or SSDs. This type of hardware is expensive but may be cheaper than a large body of programmer time. Your existing desktop support infrastructure should be able to support this sort of kit. The most likely problem is compatibility issues running 32-bit software on a 64-bit O/S. In this case you have various options like VMs or KVM switches to work around the compatibility issues.
The next step up is a 4- or 8-socket server. Fairly ordinary Wintel servers go up to 8 sockets (32-48 cores) and perhaps 512GB of RAM, without having to move off the Wintel platform. This gives you a fairly wide range of options within your platform of choice before you have to go to anything exotic.
Finally, if you can't make it run quickly in F#, validate the F# prototype and build a C implementation using the F# prototype as a control. If that's still not fast enough you've got problems.
If your application can be structured in a way that suits the platform then you could look at a more exotic platform. Depending on what will work with your application, you might be able to host it on a cluster or a cloud provider, or build the core engine on a GPU, Cell processor or FPGA. However, in doing this you're taking on (quite substantial) additional costs and exotic dependencies that might cause support issues. You will probably also have to bring in a third-party consultant who knows how to program the platform.
After all that, the best advice is: suck it and see. If you're comfortable with F# you should be able to prototype your application fairly quickly. See how fast it runs and don't worry too much about performance until you have some clear indication that it really will be an issue. Remember, Knuth said that premature optimisation is the root of all evil about 97% of the time. Keep a weather eye out for issues and re-evaluate your strategy if you think performance really will cause trouble.
Edit: If you want to make a packaged application then you will probably be more performance-sensitive than otherwise. In this case performance will probably become an issue sooner than it would with a bespoke system. However, this doesn't affect the basic 'suck it and see' principle.
For example, at the risk of starting a game of buzzword bingo, if your application can be parallelized and made to work on a shared-nothing architecture you might see if one of the cloud server providers [ducks] could be induced to host it. An appropriate front-end could be built to run locally or through a browser. However, on this type of architecture the internet connection to the data source becomes a bottleneck. If you have large data sets then uploading these to the service provider becomes a problem. It may be quicker to process a large dataset locally than to upload it through an internet connection.
I would advise not to bother with optimizations yet. First try to get a working prototype, then find out where computation time is spent. You can probably move the biggest bottlenecks out into C or Fortran when and if needed -- then see how much difference it makes.
As they say, often 90% of the computation is spent in 10% of the code.

How does the memory footprint of some common web framework(s) compare?

Hypothetically, if I were to build the same app using a few popular/similar frameworks, say PHP(cakePHP|Zend), Django, and Rails, should the memory consumption of each be roughly the same?
Also, I'm sure many have evaluated or used each and would be interested in which you settled on and why?
Code with whatever framework you like best. Then pray your app is popular enough to cause memory problems. We should all be so lucky.
No, it will absolutely vary wildly from one framework to another.
That said, in most cases the memory footprint of the framework is not the determining factor in site performance nor in selection of a framework. It's usually more a matter of using the right tool for the job, since each framework has its own strengths and weaknesses.
It is hard to say definitively. I would say that PHP frameworks will mostly have a similar footprint, which is typically smaller than that of frameworks such as Rails and Django. But it depends on what you count as part of Rails, such as Mongrel (the Rails server proxy). Overall it also depends on your code; however, PHP will most of the time be easier on the server. (Without any language bias: I use both PHP and Rails.)
Just to give some perspective, let me report real-world memory consumption using the Smalltalk web framework AIDA/Web.
For running 40+ websites on a single Smalltalk image on a single server it currently consumes 330MB of memory.
The only one of those frameworks I have used is CakePHP. I found its footprint is not too bad; it is obviously a lot heavier than plain PHP without a framework, but that can be a good trade-off.
A good comparison of some of the most popular PHP frameworks can be found at http://www.avnetlabs.com/php/php-framework-comparison-benchmarks.
Memory is cheap these days. Go with what will make your development easiest (which is usually what your team knows best).
But... In my experience, Django isn't terribly memory hungry. I've run it on my shared host with less than 100 MB of RAM. But my experience is purely anecdotal. YMMV. If you go with Django, here are some tips to keep memory usage down.
EDIT: And don't go with Zope if memory footprint is important to you.

Resources