Are derived telemetries possible in Perfino, e.g. Heap % Used?

I'm currently evaluating Perfino 3.0. The sparklines on the VMs tab are great, but the JVMs I'm monitoring have widely differing max heap sizes. It would be more useful to have a Used Heap % telemetry shown.
I attempted to create a custom one by extracting MBean values for used heap (in bytes) and max heap (in bytes) but I don't see a way to divide one by the other to show a percentage. Is this possible?

If you need to calculate something, you have to add a static method to your code and annotate it with @Telemetry. For example:
@Telemetry(value = "Heap percentage", format = @TelemetryFormat(Unit.PERCENT))
However, your suggestion makes a lot of sense and we'll add a "Used Heap Percentage" VM telemetry in 3.0.1. Please contact support@ej-technologies.com to get a build of 3.0.1. It's already available on the demo server.

Related

How can I monitor peak memory usage for a Delphi application?

I just finished a major refactor in my Delphi application and wanted to compare peak memory usage between builds. Basically, I need proof that the latest refactor takes less RAM than the previous build. Since the application changed so much, it's hard to pinpoint an equivalent point in time at which to compare metrics. The best way to compare would be to know the highest memory consumption during the application's execution. For example, if my application needs 1 MB of RAM for the whole duration, but for 1 ms it needed 2 MB, I want to get 2 MB as the result.
I started using FastMM4, but I'm not sure if it can do what I need. It can be an external tool or something I embed in my application (à la FastMM4).
You can use Process Explorer.
Right-click on the top header, then use the Select Columns menu and check Peak Private Bytes from the Process Memory tab.
Process Explorer as recommended by dwrbudr is nice, but it lacks the granularity I needed, so I ended up using FastMM4 to get the memory usage during the whole flow of each build. I just logged the values and then compared the evolution manually.

SSAS Tabular Table Processing Memory issue

When I try to edit an SSAS Tabular project in the table properties section using Visual Studio 2015, I get an error like:
"The operation has been cancelled because there is not enough memory available for the application. If using a 32-bit version of the product, consider upgrading to the 64-bit version or increasing the amount of memory available on the machine."
This happens when the table has almost 5 million rows.
Is there any permanent solution for the issue?
These errors can be caused by an incorrect memory setting in SSAS.
According to the MSDN Memory Properties document (https://msdn.microsoft.com/en-us/library/ms174514.aspx):
Values between 1 and 100 represent percentages of Total Physical Memory or Virtual Address Space, whichever is less. Values over 100 represent memory limits in bytes.
If the admin thinks a value above 100 is in KB or MB, they may put in a setting that is far too low for SSAS to operate properly, producing this "not enough memory" error even though the server still has plenty of memory available.
The solution is to change the memory settings to give a proper value for the memory limits, either as a percentage of server memory or in bytes.
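To make that rule concrete, here is a small illustrative sketch (plain Java, not anything SSAS ships) of how such a memory-limit value is interpreted:

// Illustrative only: mirrors the documented rule that values in (0, 100]
// are percentages of total physical memory (or virtual address space,
// whichever is less) and values above 100 are absolute byte counts.
static long effectiveLimitBytes(double setting, long totalPhysicalBytes) {
    if (setting <= 100) {
        return (long) (totalPhysicalBytes * setting / 100.0); // percentage
    }
    return (long) setting; // absolute bytes
}

On a server with 16 GB of RAM, a setting of 65 yields roughly 10.4 GB, whereas 16000 (perhaps intended as 16 GB expressed in megabytes) is interpreted as 16000 bytes, which makes SSAS believe almost no memory is available.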

Castalia Memory Issue

My application layer protocol works fine, but when the number of nodes is large (more than 600) it exits without any error.
I traced the code and didn't find any problem. It seems to be a memory problem, since the number of nodes is large and they perform many operations.
Update:
In my application:
Each node broadcasts 2 msg/second during the whole simulation time.
The msgs contain a lot of information related to my application.
All the nodes are static.
Using BypassRouting, BypassMAC, and the CC2420 radio.
From my previous experiments, Castalia works with more than 600 nodes and even up to 2500, but only with a short simulation time, so it depends on the relation between the number of nodes, the simulation time, and the number of messages sent per second.
A single experiment runs successfully, but when running with, for example, 30 seeds (i.e. -r 30) and 110 nodes, it stops after experiment 13 when the simulation time is 1000s, and after experiment 22 when the simulation time is 600s.
How can I free memory from unnecessary things during simulation runs?
(Note: previously I increased the swap memory, which worked up to a certain limit.)
Thanks,
Without more information on your application and the simulation scenario it's hard to provide very specific suggestions. At the very least, you could provide your ini file and information about any custom modules you are using (your application module, for example). Are you using any mobile nodes? Which protocols are you using? What does your app module do? In general, Castalia should be able to handle 600 nodes. In the past, we have tested Castalia with thousands of (static) nodes.
You could use a memory profiler. An excellent tool (a suite of tools really) is valgrind. You can find memory leaks, and you can also memory profile your program. The heap profiler tool of valgrind is called 'massif':
Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.
Read the valgrind documentation for more info. This is the way you invoke the tool:
valgrind --tool=massif <executable> <arguments>
The executable in this case is CastaliaBin (not the Castalia python script, which is a higher-level execution tool). Massif writes its results to a file named massif.out.<pid>, which you can turn into a readable report with the ms_print tool.

Getting details on application RAM usage

According to process explorer / task manager my application has a private working set size of around 190MB even while not performing a specific task, which is way more than I would expect it to need. Using FastMM I have validated that none of this is an actual memory leak in a traditional sense.
I have also read the related discussion going on here, which suggests using FastMM's LogMemoryManagerStateToFile();. However, the output generated states "21299K Allocated, 49086K Overhead", which combined (70MB) is far less than what the task manager suggests.
Is there any way I can find out what causes the huge difference? Might 190MB even be an expected value for an application with ~15 forms? Also, is having 70% overhead "bad", and is there any way of reducing that number?
You can use VMMap from Sysinternals to get a complete overview of the virtual memory address space your process is using. This should allow you to work out the difference you are seeing between task manager and FastMM.
I doubt that FastMM reports, or even can report, sections like Mapped File, Shareable, and Page Table, while those sections do occupy private working set.
DDDebug can give you insights about memory allocation by objects in your app. You can monitor changes live.
Test the trial version or check out the introductory video on the website.

Windows to embedded port: data and code memory size

I am in the process of porting a Windows 7 library to an embedded platform. In order to do so, my employer has asked me for the amount of memory (and CPU, but let us concentrate on memory for now) that my system will need once ported, so he can size the board to my needs.
I had a look on the internet and there does not seem to be much information about this question, hence my questions:
1. in order to get a rough idea of the memory footprint of the code in flash memory (code only, without memory for data), I read on the Internet that I should sum the sizes of all the DLLs I use. It seems that compilers and platforms each give a different size for the code footprint, but overall the size of the code (without data) is often very close. Can you confirm?
2. in order to deal with the memory required by the data only (heap + stack but no code), I had a look at the task manager (and Process Explorer). It seems the overall amount of data I use is given by the 'peak working set'. I have a few questions about it though:
2.a. Does the 'working set' include the heap + stack memory or does it correspond to the heap only?
2.b. Does the 'working set' include the size for the code as well? (as I am on windows 7, the code is also stored in RAM and not in flash as on embedded systems), or does it only correspond to the data?
2.c. it seems the 'peak working set' reflects the maximum amount of physical memory that was actually in RAM from the time the program started, but it does not reflect the size the program could reach afterwards (if I happen to allocate memory at runtime - which would be bad ;) - the peak value would keep increasing). Do you confirm?
2.d. Hence, do you also confirm that if I do not allocate memory at runtime, the 'peak working set' should roughly be the maximum amount of RAM my embedded system will need, give or take some difference due to the differences in system technology?
Thanks,
Antoine.
Unless you are intending to run your application on Windows Embedded, looking at the code and data usage in Windows is not going to be much of an indicator of anything useful!
1) DLLs are libraries - not all the code within them will be utilised by your code. Most embedded systems are statically linked, and the linker will link only the modules that are actually referenced by your code. So taking the sum of the DLL dependencies is likely to lead to a gross overestimation of the memory requirement.
2) Windows memory management is profligate with memory - because it can be, and doing so generally improves the performance of typical desktop systems. For example, a thread stack in Windows is typically on the order of 2 MB - you may seldom use that much, but Windows gives it to you anyway because it can, and doing so errs on the side of safety. A thread stack in an embedded system will typically range from a few tens of bytes to a few tens of kilobytes - it depends on your application.
Windows Task Manager shows what Windows allocates to your process; that may not relate to what your process needs. Also, your application uses Windows services - all the memory used for kernel and device services will not show up as part of your process, but your embedded system may still need those.
If you do use your Windows prototype code to assess the embedded system's requirements, then your best place to start is getting the linker to generate a map file (with the GNU toolchain, for example, by passing -Wl,-Map=app.map at link time), which will give a detailed description of memory usage in terms of statically allocated data and code size.
Code size depends not only on the performance of the compiler, but also on the efficiency of the instruction set. Some architectures achieve higher code density than others. Windows application code size is never a good indicator of embedded code size because its execution environment is likely to be so different. For example, a pre-emptive multitasking RTOS kernel on a 32-bit ARM can be implemented in less than 10 KB of code, a file system in perhaps another 10 KB, a network stack in anything from 10 to 30 KB, and USB in another 10 KB. As you can see, this is a different world from desktop code.
Data memory usage is perhaps more easily determined; but you do that through analysis of your application rather than by observing what Windows does. There is the data your application instantiates directly, and then there is the data instantiated by the libraries and device drivers you might call - in Windows the latter is likely to be relatively large and out of your control. Typical embedded-system libraries for things such as network stacks, USB, file systems etc. are far smaller and far more deterministic in both performance and size.
Your better bet is to describe your application in terms of its general purpose, performance requirements, real-time constraints, and its hardware requirements (display, networking, I/O, mass storage etc.), and then look at comparable solutions or at the libraries you will need to implement your solution; most embedded systems are "bare board" and do not have the services you find in Windows unless you write them or use third-party solutions - Windows is seldom a comparable solution to an embedded system.
If it is just a library rather than an application, then build it for a likely target using a Windows-hosted GCC cross-compiler and see how big it ends up. You don't need hardware for that, nor do you need to spend any money.