What is ID3D12GraphicsCommandList::DiscardResource? - directx

What exactly should I expect to happen when using DiscardResource?
What's the difference between discard and destroying/deleting a resource.
When is a good time/use-case to discard a resource?
Unfortunately Microsoft doesn't seem to say much about it other than it "discards a resource".

TL;DR: Is a rarely used function that provides a driver hint related to handling clear compression structures. You are unlikely to use it except based on specific performance advice.
DiscardResource is the DirectX 12 version of the Direct3D 11.1 method. See Microsoft Docs
The primary use of these methods it to optimize the performance of tiled-based deferred rasterizer graphics parts by discarding the render target after present. This is a hint to the driver that the contents of the render target are no longer relevant to the operation of the program, so it can avoid some internal clearing operations on the next use.
For DirectX 11, this is in the DirectX 11 App template to use DiscardView because it makes use of DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL:
void DX::DeviceResources::Present()
{
// The first argument instructs DXGI to block until VSync, putting the application
// to sleep until the next VSync. This ensures we don't waste any cycles rendering
// frames that will never be displayed to the screen.
DXGI_PRESENT_PARAMETERS parameters = { 0 };
HRESULT hr = m_swapChain->Present1(1, 0, &parameters);
// Discard the contents of the render target.
// This is a valid operation only when the existing contents will be entirely
// overwritten. If dirty or scroll rects are used, this call should be removed.
m_d3dContext->DiscardView1(m_d3dRenderTargetView.Get(), nullptr, 0);
// Discard the contents of the depth stencil.
m_d3dContext->DiscardView1(m_d3dDepthStencilView.Get(), nullptr, 0);
// If the device was removed either by a disconnection or a driver upgrade, we
// must recreate all device resources.
if (hr == DXGI_ERROR_DEVICE_REMOVED || hr == DXGI_ERROR_DEVICE_RESET)
{
HandleDeviceLost();
}
else
{
DX::ThrowIfFailed(hr);
}
}
The DirectX 12 App template doesn't need those explicit calls because it uses DXGI_SWAP_EFFECT_FLIP_DISCARD.
If you are wondering why the DirectX 11 app doesn't just use DXGI_SWAP_EFFECT_FLIP_DISCARD, it probably should. The DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL swap effect was the only one supported by Windows 8.x for Windows Store apps, which is when DiscardView was introduced. For Windows 10 / DirectX 12 / UWP, it's probably better to always use DXGI_SWAP_EFFECT_FLIP_DISCARD unless you specifically don't want the backbuffer discarded.
It is also useful for multi-GPU SLI / Crossfire configurations since the clearing operation can require synchronization between the GPUs. See this GDC 2015 talk
There are also other scenario-specific usages. For example, if doing deferred rendering for the G-buffer where you know every single pixel will be overwritten, you can use DiscardResource instead of doing ClearRenderTargetView / ClearDepthStencilView.

Related

vkGetMemoryFdKHR is return the same fd?

In WIN32:
I'm sure that if the handle is the same, the memory may not be the same, and the same handle will be returned no matter how many times getMemoryWin32HandleKHR is executed.
This is consistent with vulkan's official explanation: Vulkan shares memory.
It doesn't seem to work properly in Linux.
In my program,
getMemoryWin32HandleKHR works normally and can return a different handle for each different memory.
The same memory returns the same handle.
But in getMemoryFdKHR, different memories return the same fd.
Or the same memory executes getMemoryFdKHR twice, it can return two different handles.
This causes me to fail the device memory allocation during subsequent imports.
I don't understand why this is?
Thanks!
#ifdef WIN32
texGl.handle = device.getMemoryWin32HandleKHR({ info.memory, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueWin32 });
#else
VkDeviceMemory memory=VkDeviceMemory(info.memory);
int file_descriptor=-1;
VkMemoryGetFdInfoKHR get_fd_info{
VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR, nullptr, memory,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT
};
VkResult result= vkGetMemoryFdKHR(device,&get_fd_info,&file_descriptor);
assert(result==VK_SUCCESS);
texGl.handle=file_descriptor;
// texGl.handle = device.getMemoryFdKHR({ info.memory, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueFd });
Win32 is nomal.
Linux is bad.
It will return VK_ERROR_OUT_OF_DEVICE_MEMORY.
#ifdef _WIN32
VkImportMemoryWin32HandleInfoKHR import_allocate_info{
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_KHR, nullptr,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT, sharedHandle, nullptr };
#elif __linux__
VkImportMemoryFdInfoKHR import_allocate_info{
VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR, nullptr,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
sharedHandle};
#endif
VkMemoryAllocateInfo allocate_info{
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, // sType
&import_allocate_info, // pNext
aligned_data_size_, // allocationSize
memory_index };
VkDeviceMemory device_memory=VK_NULL_HANDLE;
VkResult result = vkAllocateMemory(m_device, &allocate_info, nullptr, &device_memory);
NVVK_CHECK(result);
I think it has something to do with fd.
In my some test: if I try to get fd twice. use the next fd that vkAllocateMemory is work current......but I think is error .
The fd obtained in this way is different from the previous one.
Because each acquisition will be a different fd.
This makes it impossible for me to distinguish, and the following fd does vkAllocateMemory.
Still get an error.
So this test cannot be used.
I still think it should have the same process as win32. When the fd is obtained for the first time, vkAllocateMemory can be performed correctly.
thanks very much!
The Vulkan specifications for the Win32 handle and POSIX file descriptor interfaces explicitly state different things about their importing behavior.
For HANDLEs:
Importing memory object payloads from Windows handles does not transfer ownership of the handle to the Vulkan implementation. For handle types defined as NT handles, the application must release handle ownership using the CloseHandle system call when the handle is no longer needed.
For FDs:
Importing memory from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.
So HANDLE importation leaves the HANLDE in a valid state, still referencing the memory object. File descriptor importation claims ownership of the FD, leaving it in a place where you cannot use it.
What this means is that the FD may have been released by the internal implementation. If that is the case, later calls to create a new FD may use the same FD index as a previous call.
The safest way to use both of these APIs is to have the Win32 version emulate the functionality of the FD version. Don't try to do any kinds of comparisons of handles. If you need some kind of comparison logic, then you'll have to implement it yourself. When you import a HANDLE, close it immediately afterwards.

iOS Patch program instruction at runtime

How would one go about modifying individual assembly instructions in an application while it is running?
I have a Mobile Substrate tweak that I am writing for an existing application. In the tweak's constructor (MSInitialize), I need to be able to rewrite individual instruction(s) in the app's code. What I mean by this is that there may be multiple places in the application's address space that I wish to modify, but in each instance, only a single instruction needs to be modified. I have already disabled ASLR for the application and know the exact memory address of the instruction to be patched, and I have the hex bytes (as a char[], but this is uninportant and can be changed if necessary) of the new instruction. I just need to figure out how to perform the change.
I know that iOS uses Data Execution Prevention (DEP) to specify that executable memory pages cannot also be writeable and vice versa, but I know that it is possible to bypass this on a jailbroken device. I also know that the ARM processor used by iDevices has an instruction cache that needs to be updated to reflect the change. However, I do not even know where to begin to do this.
So, to answer the question that would surely otherwise be asked, I have not tried anything. This is not because I am lazy; rather, it is because I have absolutely no clue how this could be accomplished. Any help at all would be greatly appreciated.
Edit:
If it helps at all, my ultimate goal is to use this in a Mobile Substrate tweak that hooks an App Store application. Previously, in order to mod this application, one would have to first crack it to decrypt the app so the binary could be patched. I want to make it so people wouldn't have to crack the app, since that can lead to piracy which I am strongly against. I can't use Mobile Substrate normally because all of the work is done in C++, not Objective-C, and the application is stripped, leaving no symbols to use MSHookFunction on.
Completely forgot I asked this question, so I'll show what I ended up with now. The comments should explain how and why it works.
#include <stdio.h>
#include <stdbool.h>
#include <mach/mach.h>
#include <libkern/OSCacheControl.h>
#define kerncall(x) ({ \
kern_return_t _kr = (x); \
if(_kr != KERN_SUCCESS) \
fprintf(stderr, "%s failed with error code: 0x%x\n", #x, _kr); \
_kr; \
})
bool patch32(void* dst, uint32_t data) {
mach_port_t task;
vm_region_basic_info_data_t info;
mach_msg_type_number_t info_count = VM_REGION_BASIC_INFO_COUNT;
vm_region_flavor_t flavor = VM_REGION_BASIC_INFO;
vm_address_t region = (vm_address_t)dst;
vm_size_t region_size = 0;
/* Get region boundaries */
if(kerncall(vm_region(mach_task_self(), &region, &region_size, flavor, (vm_region_info_t)&info, (mach_msg_type_number_t*)&info_count, (mach_port_t*)&task))) return false;
/* Change memory protections to rw- */
if(kerncall(vm_protect(mach_task_self(), region, region_size, false, VM_PROT_READ | VM_PROT_WRITE | VM_PROT_COPY))) return false;
/* Actually perform the write */
*(uint32_t*)dst = data;
/* Flush CPU data cache to save write to RAM */
sys_dcache_flush(dst, sizeof(data));
/* Invalidate instruction cache to make the CPU read patched instructions from RAM */
sys_icache_invalidate(dst, sizeof(data));
/* Change memory protections back to r-x */
kerncall(vm_protect(mach_task_self(), region, region_size, false, VM_PROT_EXECUTE | VM_PROT_READ));
return true;
}
vm_protect to w^x, assuming you're jailbroken with a decent jailbreak (e.g. if mobilesubstrate works)
Writing to instruction memory from processor registers is, as others say above, a bit tricky. Especially with iPhones, since Apple tries to keep the processor details secret.
Permissions on memory access are the first problem. Executable memory is not normally writable. However, if this is overcome, then there is a little dance to go through to get data out of the processor registers and into the instruction pipeline. In general, there are synchronisation instructions, which force a specific order on the memory accesses before and after them, and cache commands, which force dirty write data out to memory and flush out clean and possibly stale read data. Both of these are highly dependent on the detailed implementation of the processor.
Arm Has nice manuals on the web that explain these in detail for specific processors. However, whether the processors inside iPhones do what the public Arm manuals say, I have no idea.
Here's a place to start understanding the Arm memory synchronisation model for one processor:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0092b/ch04s03s04.html
and that goes on to tell how to flush the instruction cache by a write to a control register. It certainly is possible to write self-modifying code for Arm processors because somewhere in that manual I found a statement that said that it is sometimes unavoidable and the has to be supported.
(I'm not claiming this is an answer. But it wouldn't fit in a comment.)

Adaptive image caching based on available RAM or iOS Memory warning level

I have an app which caches large images so that user do not wait for imageWithContentsOfFile. As a general rule I cache a previous and next image.
1) Can I make this caching adaptive based on the available memory in iPad ? If yes what should be the threshold ? Below is the function to calculate the available memory
-(void) report_memory {
struct task_basic_info info;
mach_msg_type_number_t size = sizeof(info);
kern_return_t kerr = task_info(mach_task_self(),
TASK_BASIC_INFO,
(task_info_t)&info,
&size);
if( kerr == KERN_SUCCESS ) {
Log(#"Memory in use (in bytes): %u", info.resident_size);
} else {
Log(#"Error with task_info(): %s", mach_error_string(kerr));
}
}
2) I know there is no way to (except the private/undocumented API) know the memory level warning, otherwise it could be a good factor to determine how many pages I can cache. But just to confirm can I use them in some way.
3) Right now I am thinking of caching 3 screens (which have 6 images) and in case my ViewController receive memory warning I unload all screens except the visible one and reset number of screens to cache to 2 (4 images). But I don't found it optimized because either I am caching less than what is possible or in some conditions even loading 4 leads to crash.
If you are looking to cache as much data as possible without causing the application to crash, you can always make use of the function didReceiveMemoryWarning to remove excess caches when needed. Within this function, you don't have to clear out everything. You could use it to selectively clear up enough room for the current situation and then continue caching until it fires off this warning again.
An alternative would be to start a cache routine until the warning fires. This would allow you to create a count of items that were able to be cached. Once this count is reached, you could have your caching iterations stay far enough below that count to avoid issues.
Not exactly an in-depth explanation, but these are some ideas for making use of the standard methods available that should still be able to achieve your goal.
So after waiting for so long I am answering my question if it may be helpful for someone.
You should avoid any undocumented API to get warning level and take action based on that. The only suggested action on memory warning irrespective of level is to release as much memory as you can.
You can use a heuristic based on report_memory function (see question) to determine how much we can cache. Although I have not performed any tests to calculate the threshold (it should be based on total RAM). I would love to see someone perform these tests and update.
The approach to reset the number of pages to cache on memory warning is working fine for my case.

Adobe Air 3 iOS app memory leak issue maybe related to getDefinitionByName class

I'm developing an application with adobe air 3 for ios and having low memory errors frequently.
After ios 5 update os started to kill my app after some low memory warnings.
But the thing is profiler says app uses 4 to 9 megs of memory.
There are a lot of bitmap copy operations around and sometimes instantiates new bitmaps from embedded bitmaps.
I highly optimized everything and look for leaks etc.
I watch profiler for memory status and seems like GC clears everything. everything looks perfect but app continues to get low memory errors and gets killed by os.
Is there anything wrong with this code below. Because my assumption is this ClassReference never gets off from memory even the profiles says memory is cleared.
I used clone method to pass value instead of pass by ref. so I guess GC can collect that local variable. I tried with and without clone nothing changes.
If the code below runs 10-15 times with different tile Id's app crashes but with same ID's it continues working.
Is there anyone who is familiar with this kind of thing?
tmp is bitmapData
if (isMoving)
{
tmp=getProxyImage(x,y); //low resolution tile image
}
else
{
strTmp="main_TILE"+getTileID(x,y);
var ClassReference:Class = getDefinitionByName(strTmp) as Class; //full resolution tile image //something wrong here
tmp=new ClassReference().bitmapData.clone(); //something wrong here
ClassReference=null;
}
return tmp.clone();
Thanks for reading. I hope some one has a solution for this.
You are creating three copies of your bitmapdata with this. They will likely get garbage collected eventually, but you probably run out of memory before that happens.
(Here I assume you have embedded your bitmapdata using the [Embed] tag)
tmp = new ClassReference()
// allocates no new memory, class reference already exists
var ClassReference:Class = getDefinitionByName(strTmp) as Class;
// creates a new BitmapAsset from the class reference including it's BitmapData.
// then you clone this bitmapdata, giving you two
tmp = new ClassReference().bitmapData.clone();
// not really necessary since ClassReference goes out of scope anyway, but no harm done
ClassReference=null;
// Makes a third copy of your second copy and returns it.
return tmp.clone();
I would recommend this (assuming you need unique bitmapDatas for each tile)
var ClassReference:Class = getDefinitionByName(strTmp) as Class;
return new ClassReference().bitmapData.clone();
If you don't need unique bitmapDatas, keep static properties with the bitmapDatas on some class and use the same ones all over. That will minimize memory usage.

How to dispose a Writeable Bitmap? (WPF)

Some time ago i posted a question related to a WriteableBitmap memory leak, and though I received wonderful tips related to the problem, I still think there is a serious bug / (mistake made by me) / (Confusion) / (some other thing) here.
So, here is my problem again:
Suppose we have a WPF application with an Image and a button. The image's source is a really big bitmap (3600 * 4800 px), when it's shown at runtime the applicaton consumes ~90 MB.
Now suppose i wish to instantiate a WriteableBitmap from the source of the image (the really big Image), when this happens the applications consumes ~220 MB.
Now comes the tricky part, when the modifications to the image (through the WriteableBitmap) end, and all the references to the WriteableBitmap (at least those that I'm aware of) are destroyed (at the end of a method or by setting them to null) the memory used by the writeableBitmap should be freed and the application consumption should return to ~90 MB. The problem is that sometimes it returns, sometimes it does not.
Here is a sample code:
// The Image's source whas set previous to this event
private void buttonTest_Click(object sender, RoutedEventArgs e)
{
if (image.Source != null)
{
WriteableBitmap bitmap = new WriteableBitmap((BitmapSource)image.Source);
bitmap.Lock();
bitmap.Unlock();
//image.Source = null;
bitmap = null;
}
}
As you can see the reference is local and the memory should be released at the end of the method (Or when the Garbage collector decides to do so). However, the app could consume ~224 MB until the end of the universe.
Any help would be great.
Is it necessary to render the Bitmap image at the same resolution and pixels? You could create the writeablebitmap at a much lower set of pixels and call the render method. Since the writeablebitmap carries a reference to the original uielements when calling render, in this case, you are going to have 3 chunks: 1) original uielement, 2) pixels in writeablebitmap, 3) reference to copied original.
I had a similar issue with the writeablebitmap in terms of memory leaks and I fixed it from checking out this link:
http://www.wintellect.com/CS/blogs/jprosise/archive/2009/12/17/silverlight-s-big-image-problem-and-what-you-can-do-about-it.aspx
If you create another writeablebitmap and copy the pixels over, then dispose of the first writeablebitmap you should see some memory release - at least I did in my scenario.

Resources