It's surprisingly hard to find information about this: does DirectX 12 guarantee that fresh resource allocations will be completely zeroed before they're accessed by the compute/graphics pipelines?
I have 8 GB of VRAM (GPU) and 16 GB of system RAM. When allocating (creating) many large textures, say 4096x4096, I eventually run out of VRAM; from what I can see, the resources are then created in system RAM instead. Whenever I need to render with (or to) one of them, it seems the resource is transferred from system RAM to VRAM first. If I keep accessing many of these resources every frame (at 60 fps, etc.), the PC lags badly as it tries to transfer very large amounts of data back and forth. However, as long as the number of newly referenced resources (ones not recently used and therefore still in system RAM rather than VRAM) stays small each second, there should not be a performance issue. Is this understanding correct?
DirectX will allocate DEFAULT pool resources from video RAM and/or the PCIe aperture RAM which can both be accessed by the GPU directly. Often render targets must be in video RAM, and generally video RAM is faster memory--although it greatly depends on the exact architecture of the graphics card.
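For context, a default-usage texture like the ones described in the question would be created roughly like this in Direct3D 11. This is only a sketch; device is assumed to be a valid ID3D11Device* created elsewhere, and error handling is omitted:

// Describe one of the large 4096x4096 textures from the question.
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = 4096;
desc.Height = 4096;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.Usage = D3D11_USAGE_DEFAULT;            // placed in GPU-accessible memory (VRAM or aperture)
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;

ID3D11Texture2D* texture = nullptr;
device->CreateTexture2D(&desc, nullptr, &texture);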
What you are describing is the 'over-commit' scenario, where you have allocated more resources than actually fit in GPU-accessible memory. In this case, DirectX 11 makes a 'best effort', which generally involves changing virtual memory mappings to get the scene to render, but performance is obviously quite poor compared to the normal situation.
DirectX 12 leaves dealing with 'over-commit' up to the application, much like everything else in DirectX 12 where "runtime magic behavior" has generally been removed. See the documentation for details on this behavior, as well as the accompanying sample.
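For illustration only, here is a rough sketch of the kind of explicit residency management DirectX 12 expects the application to do; device and heap are assumed to be a valid ID3D12Device* and ID3D12Heap* created elsewhere, and error handling is omitted:

// Group the pageable objects (heaps, committed resources, etc.) we want to manage.
ID3D12Pageable* pageables[] = { heap };

// Allow the OS to page this memory out of GPU-accessible memory while it is idle.
device->Evict(1, pageables);

// ... later, before recording commands that reference resources placed in the heap ...
// MakeResident can fail under memory pressure, so a real application would check the HRESULT.
device->MakeResident(1, pageables);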
I get frequent memory warnings in my application but I don't know why.
Here is the snapshot of allocation instruments.
I know that we don't have any control over the virtual memory assigned to us, but I am trying to understand what that 26.50 MB number means for a developer.
1. What does a high VM mean? Does it lead to a jetsam? Is it cause for any other concern?
2. Is this value dependent on the device?
3. Does a low VM mean that your app is memory efficient?
4. Does a high VM lead to memory warnings in your app?
5. What causes this value to change?
6. What steps should a developer take when they see a high VM for their app (say 300 MB)?
7. Is the VM Tracker instrument related to this value?
Anonymous VM covers a lot of things, some of which are things you want to minimize and some that are generally less important. The short version of "anonymous VM" is that it's addresses you have mapped but not named. Heap allocations get "named" which lets you track them as objects. But there are lots (and lots) of non-objecty things that fall into the "anonymous VM" bucket.
Things allocated with malloc can wind up in this region. But also memory mapped files. Your executable is a memory mapped file, but since it's never dirty, parts of it can be swapped out. So "it's complicated." But in big, vague terms, yes, you do care about this section, but you may not care about all of it very much. Heap allocations tend to track your ObjC stuff. Anonymous VM often tracks things that you don't have a lot of direct control over (like CALayer backing storage).
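As a rough, non-iOS-specific illustration of the distinction (assuming a POSIX environment), an anonymous mapping ends up in the unnamed "anonymous VM" bucket, while a plain heap allocation is what the Allocations instrument tracks as named objects:

#include <sys/mman.h>
#include <cstdlib>

int main() {
    const size_t size = 4 * 1024 * 1024;

    // Mapped address space with no backing file: counted as anonymous VM.
    void* anon = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);

    // Ordinary heap allocation: tracked ("named") by the allocator and Instruments.
    void* heap = malloc(size);

    free(heap);
    munmap(anon, size);
    return 0;
}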
All that said, the Instruments output you provide doesn't look like any major problem. I suspect it's not indicative of a time you're pressuring memory. You'll need to get yourself into a memory warning situation and see what's going on then, and dig into the specifics of what is using memory.
For much more detail on this, you should watch WWDC 2013 session 704 "Building Efficient OS X Apps" which goes into depth on much of this. While iOS has a somewhat different memory system, and some OS X tools aren't available on iOS, many of the concepts still apply.
I have been using KineticJS to build an iOS app (UIWebView). I created a simple example app just to get an understanding of memory utilization. I created a single stage, added 100 layers to it, and added one line to each layer. The amount of memory allocated for the stage and layers was about 6 MB per layer, or 600 MB total. I then added code to remove each layer in a setInterval function and then called stage.reset() just to be sure. In profiling, the memory utilization did not decrease.
I reviewed my code to be sure I wasn't keeping references to the layers. In one test I also dereferenced the stage, but the allocated memory value did not change. Could this be a bug, or is there some other way to reclaim memory with KineticJS?
This is a 'garbage collection' problem in many browsers. Basically, just dereferencing won't free up the memory; you have to rely on the browser to recognize when to free it. I had the same problem with some Android browsers. Basically, I just installed the latest Firefox browser and it worked a lot better.
Sorry I couldn't be more help.
I have a sophisticated CUDA-based Linux application. It runs on an i7 machine with one NVIDIA GTX 560 Ti card (1 GB memory), using Ubuntu 12.04 (x86_64) and NVIDIA driver 295.41 + CUDA 4.2 Toolkit.
The application requires about 600-700 MB of global memory on the GPU, and it fails to run due to an "out of memory" error on calls to cudaMalloc().
After some debugging, I found that the first call to cudaSetDevice() at the very beginning of the application allocates about 580 MB of global memory at once, and the available memory for the rest of the application is only 433 MB.
The CUDA reference manual says that it initializes a "primary context" for the device and allocates various resources such as CUDA kernels (called "modules" in the driver API) and constant variables. The application has some __device__ __constant__ variables, but the total amount of them is just a few KB. There are about 20-30 kernels and device functions.
I have no idea why CUDA allocates such a large amount of GPU memory during initialization.
A separate minimal program that does only cudaSetDevice(0); cudaMemGetInfo(&a, &t); printf("%ld, %ld\n", a, t); shows about 980 MB of available memory. So the problem must lie in my application, but I could not figure out what causes such a large memory allocation, because the implementation details of cudaSetDevice() are completely proprietary.
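For reference, a self-contained version of that check might look like the following (a sketch using the CUDA runtime API; sizes are reported in bytes):

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // The first runtime call creates the primary context and its internal allocations.
    cudaSetDevice(0);

    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);

    // Memory left over after the primary context has been created.
    printf("free: %zu bytes, total: %zu bytes\n", freeBytes, totalBytes);
    return 0;
}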
Could I get some other ideas?
I presume that cudaSetDevice() is the first CUDA call you make in your application. As a CUDA developer you should know that the first CUDA call is very expensive, because CUDA first allocates its own components on the graphics card, which amounts to roughly 500 MB.
Try starting your program with another CUDA command, e.g. cudaMalloc(); you will see the same amount allocated by CUDA. You can also run deviceQuery from the CUDA Samples to see how much memory is in use.
It sounds like an issue; would you like to file a bug with NVIDIA? The steps are:
1. Open page http://developer.nvidia.com/cuda/join-cuda-registered-developer-program;
2. If not registered, please click "Join Now", otherwise click "Login Now";
3. Input e-mail and password to login;
4. On the left panel, there is a "Bug Report" item in the Home section; click it to file a bug;
5. Fill in the required items; other items are optional, but detailed information will help us target and fix the issue;
6. If necessary, an attachment should be uploaded;
7. For Linux systems, it is better to attach an nvidia-bug-report;
8. If an issue is related to a specific code pattern, sample code and instructions to compile it are desired for reproduction.
I had a similar problem where the first call to any cudaXXX() function caused the reported VmData (UNIX) to spike massively, sometimes to tens of GB. This is not a bug, and the reason is given here:
Why does the Cuda runtime reserve 80 GiB virtual memory upon initialization?
I'm in the process of learning something about OpenCL and am having what I hope is not a unique problem (I found nothing on Google, but...). When I call:
clGetPlatformIDs
from my host program, I see a sudden increase in the 'VIRT' memory usage reported by 'top', to about 45 GB. The values for resident and shared memory don't change noticeably, and I'm not completely sure what top is reporting here. However, if I repeatedly call a function that runs OpenCL commands, I see some fluctuation in the 'VIRT' memory usage until the OpenCL calls fail with CL_OUT_OF_HOST_MEMORY. I have 32 GB of memory, so this seems a bit absurd.
I see this in some code (C++) that performs maximum intensity projections on image stacks, but I see exactly the same behaviour in code I took from Erik Smistad's blog.
http://www.thebigblob.com/getting-started-with-opencl-and-gpu-computing/
Running that example through GDB, the first call to OpenCL functions has the same effect as in my code:
cl_platform_id platform_id = NULL;
cl_uint ret_num_platforms;
// Even this very first OpenCL API call is enough to trigger the jump in VIRT.
cl_int ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
VIRT memory jumps massively (again to about 45 GB).
Since I haven't seen anything like this anywhere, I suspect that there may be something funny about my setup:
openSUSE 12.1
GeForce GTX 560Ti 1024 MB
nvidia-computeG02-295.49-17.1.x86_64
but... the CUDA toolkit I downloaded from NVIDIA is the one for openSUSE 11.2, which may expect driver version 295.41 rather than the 295.49 installed with openSUSE.
I'm hoping someone here has seen a similar problem and has some idea as to what's going on, or where to look. I'd very much like to work this out, as apart from this issue everything is working pretty nicely.