Is there any means to do RAM memory testing for multi-core during another application using target memory domain - memory

I try to implement ram test such like this url(http://www.esacademy.com/en/library/technical-articles-and-documents/miscellaneous/software-based-memory-testing.html) in dual core microcontroller.
This ram testing shall be available in the middle of another process.
I think implementing this by using interrupt disable, but it is not appropriate.
As a precondition, My implementing ram test is supposed to do data backup to another domain before testing and to put these data back to initial address.
So, other driver can use same data as usual after RAM testing.
In this case I use interrupt disable, it's not available in dual core.
Because the both of cores access the same RAM domain and disabling interrupt
is not working another core's processing, there are only occurring data inconsistency.
Could you give me your idea?

Somewhat by definition if you are running code on that ram you are not testing that ram if you want to do a memory test you need to be off the ram under test.
But that depends on what your definition of test is. If it is a memory test to test the memory itself, cant be on it, you are not testing some of the memory so you are not testing the memory (looks like what your link is about, note links are bad in SO questions and answers, remote links are not assumed to remain active).
Cant test half then another half you are not testing the address bus completely.
If this is a performance test then ideally you want to be off of it and have the test run completely from cache. Multi core helps for a targeted test as you can push the interface a little bit harder, difficult to max it out with a general purpose processor though, multi-core or not.
Otherwise if you just want to exercise a fraction then allocate a fraction and test it, in whatever way you wish. Its not really a memory test though.
Sounds like from your requirements you are not really interested in a full memory test, so do as much as you can to make your boss happy.
Actual memory testing a system is very much specific to that system, how you approach it, how you solve it. You want that code(/stack) to not be on that ram, ideally the chip/system design includes a fast internal SRAM that you can use for board bring up and design verification, possibly manufacture test, but manufacture test should be testing the solder/board not all the bits in the ram, there are ways to do that too. If no internal sram then they had to design some way to bring the system up, or not, if you can run from flash and have the cache on, and can map that out of the way of the DRAM address space, then you can test the dram(/external ram) that way (no stack, just the CPU registers, basically assembly language).

Related

Does simulating memory-mapped I/O using VMX require instruction decoding?

I am wondering how a hypervisor using Intel's VMX / VT technology would simulate memory-mapped I/O (so that the guest could think it was performing memory mapped I/O againsta device).
I think the basic principle would be to set up the EPT page tables in such a way that the memory addresses in question would cause an EPT violation (i.e. VM exit) by setting them such that they cannot be read or written? However, the next question is how to process the VM exit. Such a VM-exit would fill out all the exit qualification reasons etc. including the guest-linear and guest-physical address etc. But what I am missing in these exit qualification fields is some field indicating - in case of a write instruction - the value that was attempted to be written and the size of the write. Likewise, for a read instruction it would be nice with some bit fields indicating the destination of the read, say a register or a memory location (in case of memory-to-memory string operations). This would make it very easy for the hypervisor to figure out what the guest was trying to do and then simulate the device behavior towards the guest.
But the trouble is, I can't find such fields among the exit qualifications. I can see an instruction pointer to where the faulting instruction is, so I could walk the page tables to read in the instruction and then decode it to understand the instruction, then simulate the I/O behavior. However, this requires the hypervisor to have a fairly complete picture of all x86 instructions, and be able to decode them. That seems to be quite a heavy burden on the hypervisor, and will also require it to stay current with later instruction additions. And the CPU should already have this information.
There's a chance that that I am missing these relevant fields because the documentation is quite extensive, but I have tried to search carefully but have not been able to find it. Maybe someone can point me in the right direction OR confirm that the hypervisor will need to contain an instruction decoder.
I believe most VMs decode the instruction. It's not actually that hard, and most VMs have software emulators to fallback on when the CPU VM extensions aren't available or up to the task. You don't need to handle every instruction, just those that can take memory operands, and you can probably ignore everything that isn't a 1, 2, or 4 byte memory operand since you're not likely to emulating device registers other than those sizes. (For memory mapped device buffers, like video memory, you don't want to be trapping every memory accesses because that's too slow, and so you'll have to take different approach.)
However, there is one way you can let the CPU do the work for you, but it's much slower then decoding the instruction itself and it's not entirely perfect. You can single step the instruction while temporarily mapping in a valid page of RAM. The VM exit will tell you the guest physical address access and whether it was a read or write. Unfortunately it doesn't reliably tell you whether it was read-modify-write instruction, those may just set the write flag, and with some device registers that can make a difference. It might be easier to copy the instruction (it can only be a most 15 bytes, but watch out for page boundaries) and execute it in the host, but that requires that you can map the page to same virtual address in the host as in the guest.
You could combine these techniques, decode the common instructions that are actually used to access memory mapped device registers, while using single stepping for the instructions you don't recognize.
Note that by choosing to write your own hypervisor you've put a heavy burden on yourself. Having to decode instructions in software is a pretty minor burden compared to the task of emulating an entire IBM PC compatible computer. The Intel virtualisation extensions aren't designed to make this easier, they're just designed to make it more efficient. It would be easier to write a pure software emulator that interpreted the instructions. Handling memory mapped I/O would be just a matter of dispatching the reads and writes to the correct function.
I don't know in details how VT-X works, but I think I see a flaw in your wishlist way it could work:
Remember that x86 is not a load/store machine. The load part of add [rdi], 2 doesn't have an architecturally-visible destination, so your proposed solution of telling the hypervisor where to find or put the data doesn't really work, unless there's some temporary location that isn't part of the guest's architectural state, used only for communication between the hypervisor and the VMX hardware.
To handle a read-modify-write instruction with a memory destination efficiently, the VM should do the whole thing with one VM exit. So you can't just provide separate load and store interfaces.
More importantly, handling atomic read-modify-writes is a special case. lock add [rdi], 2 can't just be done as a separate load and store.

Memory defragmentation software. How does it work? Does it work?

I was reading an article on memory fragmentation when I recalled that there are several examples of software that claim to defragment memory. I got curious, how does it work? Does it work at all?
EDIT:
xappymah gave a good argument against memory defragmentation in that a process might be very surprised to learn that its memory layout suddenly changed. But as I see it there's still the possibility of the OS providing some sort of API for global memory control. It does seem a bit unlikely however since it would give rise to the possibility of using it in malicious intent, if badly designed. Does anyone know if there is an OS out there that supports something of the sort?
The real memory defragmentation on a process level is possible only in managed environments such as, for example, Java VMs when you have some kind of an access to objects allocated in memory and can manage them.
But if we are talking about the unmanaged applications then there is no possibility to control their memory with third-party tools because every process (both the tool and the application) runs in its own address space and doesn't have access to another's one, at least without help from OS.
However even if you get access to another process's memory (by hacking your OS or else) and start modifying it I think the target application would be very "surprised".
Just imagine, you allocated a chunk of memory, got it's starting address and on the next second this chunk of memory is moved somewhere else because of "VeryCoolMemoryDefragmenter" :)
In my opinion memory it's a kind of Flash Drive, and this chip don't get fragmented because there aren't turning disks pins recording and playing information, in a random way, like a lie detector. This is the way that Hard Disk Fragmentation it's done. That's why SSD drives are so fast, effective, reliable and maintenance free. SSD it's a BIG piece of memory and it kind of look alike.

Should a process always consume the same amount of memory if executed in the same way?

Hi folks and thanks for your time in advance.
I'm currently extending our C# test framework to monitor the memory consumed by our application. The intention being that a bug is potentially raised if the memory consumption significantly jumps on a new build as resources are always tight.
I'm using System.Diagnostics.Process.GetProcessByName and then checking the PrivateMemorySize64 value.
During developing the new test, when using the same build of the application for consistency, I've seen it consume differing amounts of memory despite supposedly executing exactly the same code.
So my question is, if once an application has launched, fully loaded and in this case in it's idle state, hence in an identical state from run to run, can I expect the private bytes consumed to be identical from run to run?
I need to clarify that I can expect memory usage to be consistent as any degree of varience starts to reduce the effectiveness of the test as a degree of tolerance would need to be introduced, something I'd like to avoid.
So...
1) Should the memory usage be 100% consistent presuming the application is behaving consistenly? This was my expectation.
or
2) Is there is any degree of variance in the private byte usage returned by windows or in the memory it allocates when requested by an app?
Currently, if the answer is memory consumed should be consistent as I was expecteding, the issue lies in our app actually requesting a differing amount of memory.
Many thanks
H
Almost everything in .NET uses the runtime's garbage collector, and when exactly it runs and how much memory it frees depends on a lot of factors, many of which are out of your hands. For example, when another program needs a lot of memory, and you have a lot of collectable memory at hand, the GC might decide to free it now, whereas when your program is the only one running, the GC heuristics might decide it's more efficient to let collectable memory accumulate a bit longer. So, short answer: No, memory usage is not going to be 100% consistent.
OTOH, if you have really big differences between runs (say, a few megabytes on one run vs. half a gigabyte on another), you should get suspicious.
If the program is deterministic (like all embedded programs should be), then yes. In an OS environment you are very unlikely to get the same figures due to memory fragmentation and numerous other factors.
Update:
Just noted this a C# app, so no, but the numbers should be relatively close (+/- 10% or less).

Cachegrind under Xen

I have an application written in C++ that someone else has written in a way that's supposed to maximally take advantage of cpu caches. This application runs on a guest Ubuntu OS that is using paravirtualization. I ran cachegrind and received very low cache miss rates.
Since my OS is virtualized, can I be sure that these values are in fact correct in showing that the cpu cache is being well used for my application?
Cachegrind is a simulator. A real CPU may actually perform differently (e.g. your real CPU may have a different cache hierarchy to cachegrind, different sizes of cache, a different replacement policy and so forth). You would need to watch the real CPUs performance counters to know for sure how well your program was really performing on real hardware with respect to cache.

What will happen if a application is large enough to be loaded into the available RAM memory?

There is chance were a heavy weight application that needs to be launched in a low configuration system.. (Especially when the system has too less memory)
Also when we have already opened lot of application in the system & we keep on trying opening new new application what would happen?
I have only seen applications taking time to process or hangs up for sometime when I try operating with it in low config. system with low memory and old processors..
How it is able to accomodate many applications when the memory is low..? (like 128 MB or lesser..)
Does it involves any paging or something else..?
Can someone please let me know the theory behind this..!
"Heavyweight" is a very vague term. When the OS loads your program, the EXE is mapped in your address space, but only the code pages that run (or data pages that are referenced) are paged in as necessary.
You will likely get horrible performance if pages need to constantly be swapped as the program runs (aka many hard page faults), but it should work.
Since your commit charge is near the commit limit, and the commit limit will likely have no room to grow, you will also likely recieve many malloc()/VirtualAlloc(..., MEM_COMMIT)/HeapAlloc()/{Local|Global}Alloc() failures so you need to watch the return codes in your program.
Some keywords for search engines are: paging, swapping, virtual memory.
Wikipedia has an article called Paging (Redirected from Swap space).
There is often the use of virtual memory. Virtual memory pages are mapped to physical memory if they are used. If a physical page is needed and no page is available, another is written to disk. This is called swapping and that explains why crowded systems get slow and memory upgrades have positive effects on performance.

Resources