I have set up an x86 system in gem5. If I run an executable that allocates memory with malloc, will that memory be allocated from my host system, or from the virtual memory given to the simulated x86 system?
I think gem5 creates a virtual machine for whichever ISA (instruction set architecture) it simulates. So memory allocated by malloc comes from the memory region assigned to the simulated x86 system. It is real memory from the x86 system's point of view, but it is confined to that system. Suppose 1 GB is assigned to the x86 system: the allocation then lands somewhere within that 1 GB, not in the remaining memory used by the host OS or other users.
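To make this concrete, here is a minimal sketch of the kind of guest code in question. The helper name touch_buffer is hypothetical and not part of gem5. Assuming the answer above is right, when a binary containing this code runs under gem5, the pointer returned by malloc is an address inside the simulated system's memory, and the program itself cannot tell the difference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n bytes, write to every byte so the
 * memory is really touched, and return the sum of the bytes written.
 * Returns -1 if malloc fails. */
long touch_buffer(size_t n)
{
    /* malloc(0) is implementation-defined, so require n > 0. */
    assert(n > 0);

    unsigned char *buf = malloc(n);
    if (buf == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (unsigned char)(i & 0xFF);
        sum += buf[i];
    }

    free(buf);
    return sum;
}
```

Calling touch_buffer from main with a size close to the memory configured for the simulated system is one way to check where the limit actually lies.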
I have 3 questions about GPU memory:
Why does my application take a different amount of GPU memory on different machines (with different graphics cards)?
What happens when there is not enough memory on the GPU for my application? Can system RAM be used instead? Who is responsible for this memory management?
I saw a strange behavior of GPU memory:
My application starts with 2.5/4 GB of GPU memory in use. When running some function, GPU memory usage reaches the maximum (4 GB) and then immediately drops to illogical values (less than was allocated before this function).
How can this be explained?
Why does my application take a different amount of GPU memory on different machines (with different graphics cards)?
Because the GPUs are different. Code sizes, minimum runtime resource requirements, page sizes, etc. can differ between GPUs, driver versions, and toolkit versions.
What happens when there is not enough memory on the GPU for my application?
That depends entirely on your application and how it handles runtime errors. The CUDA runtime itself will simply return errors.
Can system RAM be used instead?
Possibly, if you have designed your application to use it. But automatically, no.
Who is responsible for this memory management?
You are.
I saw a strange behavior of GPU memory: My application starts with 2.5/4 GB of GPU memory in use. When running some function, GPU memory usage reaches the maximum (4 GB) and then immediately drops to illogical values (less than was allocated before this function). How can this be explained?
The runtime detected an irrecoverable error (such as a kernel trying to access invalid memory as the result of a prior memory allocation failure) and destroyed the CUDA context held by your application, which releases all resources on the GPU associated with your application.
Alright, so I have a question regarding the memory segments of a JVM.
I know every JVM may implement this a little differently, yet the overall concept should be the same across all JVMs.
A standard C / C++ program, which does not use a virtual machine, has four memory segments at runtime:
Code / Stack / Heap / Data
All of these memory segments are automatically allocated by the operating system at runtime.
However, when a JVM executes a compiled Java program, it has five memory segments at runtime:
Method area / Heap / Java stacks / PC registers / Native stacks
My question is this: who allocates and manages those memory segments?
The operating system is NOT aware of a Java program running; it sees only the JVM, running as a regular program on the computer. JIT compilation and Java stack usage require run-time memory allocation, and what I'm failing to understand is how a JVM divides its memory into those memory segments.
It is definitely not done by the operating system, and those memory segments (for example the Java stacks) must be contiguous in order to work. So if the JVM simply used a call such as malloc to obtain the maximum heap size and then divided that memory into segments, we would have no promise of contiguous memory. I would love it if someone could help me get this straight in my head, it's all mixed up...
When the JVM starts, it has hundreds if not thousands of memory regions. For example, there is a stack for every thread as well as a thread state region. There is a memory mapping for every shared library and jar. Note: 64-bit Java doesn't use segments the way a 16-bit application would.
who allocates and manages those memory segments?
All memory mappings/regions are allocated by the OS.
The operating system is NOT aware of a java program running and thinks it is a part of the JVM running as a regular program on the computer,
The JVM is running as a regular program, and its memory allocation uses the same mechanisms a normal program would. The only difference is that Java object allocation is managed by the JVM, and the object heap is the only region that works this way.
JIT compilation, Java stacks usage,
JIT compilation occurs in a normal OS thread and each Java stack is a normal thread stack.
these operations require run-time memory allocation,
They do, and for the most part the JVM uses malloc and free, and map and unmap (mmap and munmap on POSIX systems).
And what I'm failing to understand is how a JVM divides its memory into those memory segments
It doesn't. The heap is for Java objects only. The maximum heap size, for example, is NOT the maximum memory usage of the process, only the maximum amount of memory the objects you hold at once can occupy.
It is definitely not done by the Operating System, and those memory segments (for example the java stacks) must be contiguous in order to work
You are right that they need to be contiguous in virtual memory, but the OS takes care of this. On Linux at least, no segments are used, only one flat 32-bit or 64-bit address space.
so if the JVM program would simply use a command such as malloc in order to receive the maximum size of heap memory and divide that memory into segments,
The heap is divided either into generations or, in G1, into multiple memory regions, but this is for objects only.
we have no promise for contiguous memory
The garbage collectors either defragment memory by copying objects around, or take steps to reduce fragmentation, so that there is enough contiguous memory for any object you allocate.
would love it if someone could help me get this straight in my head, it's all mixed up...
In short, the JVM runs like any other program, except that when Java code runs, its objects are allocated in a managed region of memory. All other memory regions act just as they would in a C program, because the JVM is a C/C++ program.
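As an illustration of how an ordinary C program can obtain one contiguous range of virtual memory from the OS and only later make parts of it usable, here is a minimal POSIX sketch. It shows the general reserve-then-commit technique and is not taken from any JVM's source; the function name reserve_then_commit and the sizes used are assumptions made for the example.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* alternative spelling on some systems */
#endif

/* Hypothetical helper: reserve `reserve_len` bytes of contiguous virtual
 * address space, make the first `commit_len` bytes usable, write to them,
 * then release everything. Both lengths should be multiples of the page
 * size. Returns 0 on success, -1 on failure. */
int reserve_then_commit(size_t reserve_len, size_t commit_len)
{
    assert(commit_len > 0 && commit_len <= reserve_len);

    /* Reserve: PROT_NONE means the range cannot be accessed yet. */
    void *base = mmap(NULL, reserve_len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Commit: allow reads and writes on the first part of the range. */
    if (mprotect(base, commit_len, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, reserve_len);
        return -1;
    }

    /* The committed part is one contiguous block of virtual memory. */
    memset(base, 0xAB, commit_len);
    int ok = ((unsigned char *)base)[commit_len - 1] == 0xAB;

    munmap(base, reserve_len);
    return ok ? 0 : -1;
}
```

For example, reserve_then_commit(1 << 24, 1 << 16) reserves 16 MiB and commits the first 64 KiB. The contiguity here is in virtual addresses; which physical pages back the range is up to the OS.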
I have an application. When I repeat some action, the anonymous VM allocations keep increasing a lot, while the heap allocations increase only a little. Can someone help me? Thanks
Focus on the Live Bytes column for All Heap Allocations to see how much memory your application is using. You cannot control your application's Anonymous VM size.
Focus on the heap allocations because your app has more control over
heap allocations. Most of the memory allocations your app makes are
heap allocations.
The VM in anonymous VM stands for virtual memory.
When your app launches, the operating system reserves a block of
virtual memory for your application. This block is usually much larger
than the amount of memory your app needs. When your app allocates
memory, the operating system allocates the memory from the block it
reserved.
Remember the second sentence in the previous paragraph. The operating
system determines the size of the virtual memory block, not your app.
That’s why you should focus on the heap allocations instead of
anonymous VM. Your app has no control over the size of the anonymous
VM.
Source: http://meandmark.com/blog/2014/01/instruments-heap-allocations-and-anonymous-vm/
I have a few questions about the stack.
Is the stack in the CPU or in RAM?
Is the stack a place to run opcodes?
Is EIP in the CPU or in RAM?
The stack is always in RAM. A stack pointer, kept in a CPU register, points to the top of the stack, i.e. it holds the address of the location at the top of the stack.
The stack is found within the RAM and not within the CPU. A segment is dedicated for the stack as seen in the following diagram:
From Wiki:
The stack area contains the program stack, a LIFO structure, typically
located in the higher parts of memory.
Which CPU are you talking about?
Some might contain memory that is used for callstacks, some contain memory that can be used for callstacks but require the OS to implement the callstack management code, and others contain no writable memory at all. For example, the x86 architecture tends to have one or more code caches and data caches built into the CPU.
Some CPUs or OSes implement operations that make specific areas of memory non-executable. To prevent stack-based buffer overflows, for example, many OSes use hardware and/or software-based data execution prevention, which might prevent stack memory from being executed as code. Some don't; it's entirely possible that an x86 CPU data cache line might be used to store both the callstack and code to be executed in faster memory.
EIP sounds like a register of the IA-32 CPU architecture. If you're referring to IA-32, then yes, it lives in the CPU, though many OSes will save it to and restore it from RAM on context switches to implement multitasking.
In modern architectures, the stack is mapped in RAM.
Programming languages such as C, C++ and Pascal can allocate memory in RAM dynamically, which is called heap allocation, while other variables that live within functions are stack allocated.
This led processors and operating systems to treat the stack as mapped within a RAM segment. On processors with a Memory Management Unit, it can be anywhere in RAM. However, the Intel 8080 had a status bit indicating when it was reading from or writing to the stack, so the stack could have been implemented physically isolated from RAM. I do not know whether such a machine was ever built, but consider the problem it raises: which memory does a C pointer point to, heap or stack?
Had stack separation gained popularity, modern programming languages would need distinct stack pointers and heap pointers.
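The point about C pointers can be shown with a short sketch. Because stack and heap share one address space, the same function can read either through a plain pointer. The names sum_ints and stack_and_heap_agree are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sums n ints through a plain pointer. The function cannot tell, and does
 * not need to know, whether p points into the stack or the heap. */
long sum_ints(const int *p, size_t n)
{
    assert(p != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Hypothetical demo: build the same data once on the stack and once on
 * the heap, and sum both through the same function.
 * Returns 1 if both sums agree, 0 otherwise (or if malloc fails). */
int stack_and_heap_agree(void)
{
    int on_stack[4] = {1, 2, 3, 4};          /* lives in this stack frame */
    int *on_heap = malloc(sizeof on_stack);  /* lives in the heap */
    if (on_heap == NULL)
        return 0;

    for (size_t i = 0; i < 4; i++)
        on_heap[i] = on_stack[i];

    long a = sum_ints(on_stack, 4);
    long b = sum_ints(on_heap, 4);

    free(on_heap);
    return a == b && a == 10;
}
```

On a machine with a physically separate stack memory, sum_ints would need some way to know which memory its argument refers to, which is the difficulty described above.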
Is there any direct relation between the OpenCL memory architecture:
Local/Global/Constant/Private memory
and the physical GPU's memory and caches?
For example, take a GPU card that has 1 GB of memory, an L1 cache and an L2 cache. Are these related to local/global/... memory?
Or are local/constant/private memory allocated from global memory?
-Thanks
OpenCL doesn't really discuss caching of memory. Most modern graphics cards do have some sort of caching protocols for global memory, but these are not guaranteed in older cards. However here is an overview of the different memories.
Private Memory - This memory is kept as registers per work-item. GPUs have very large register files per compute unit. However, when a work-item needs more than the available registers, this memory can spill into slower off-chip memory (on NVIDIA hardware this spill area is confusingly called "local memory" in CUDA terms, which is not the same thing as OpenCL local memory). Private memory is what you get by default when you create variables.
Local Memory - Memory local to and shared by the workgroup. This memory typically sits on the compute unit itself and cannot be read or written by other workgroups. It typically has very low latency on GPU architectures (on CPU architectures, it is simply part of your system memory). It is typically used as a manual cache for global memory. Local memory is specified with the __local address space qualifier.
Constant Memory - Part of the global memory, but read only and can therefore be aggressively cached. __constant is used to define this type of memory.
Global Memory - This is the main memory of the GPU. __global is used to place memory into the global memory space.
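The "manual cache" use of local memory can be sketched in plain C. This is only an emulation of the access pattern on the host, not an OpenCL kernel: the array tile stands in for a __local buffer, global_mem for a __global buffer, and TILE and sum_with_tile are hypothetical names. In a real kernel each iteration of the inner loops would be a separate work-item, with a barrier(CLK_LOCAL_MEM_FENCE) between the two steps.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 64 /* hypothetical work-group size */

/* Plain-C emulation: `global_mem` stands in for a __global buffer and
 * `tile` for a __local array shared by one work-group. */
long sum_with_tile(const int *global_mem, size_t n)
{
    assert(global_mem != NULL || n == 0);

    int tile[TILE]; /* would be `__local int tile[TILE]` in a kernel */
    long total = 0;

    for (size_t start = 0; start < n; start += TILE) {
        size_t len = (n - start < TILE) ? n - start : TILE;

        /* Step 1: stage one chunk of global memory into the tile. */
        for (size_t i = 0; i < len; i++)
            tile[i] = global_mem[start + i];

        /* Step 2: do the work on the fast on-chip copy. */
        for (size_t i = 0; i < len; i++)
            total += tile[i];
    }
    return total;
}
```

The pattern pays off on a GPU when each staged element is read several times in step 2; a single pass like this one only shows the structure.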