Copying OpenCL buffers directly - memory

Is it possible to assign the contents of a buffer to another buffer defined in OpenCL source code?
For example, consider the below code:
cl_mem buff;
cl_mem temp;
...
...
...
temp = buff;
Do I need to call clEnqueueBuffer() again?

Assigning temp = buff only copies the handle, so both names then refer to the same memory object. To duplicate the contents you would need to copy buff to temp using clEnqueueCopyBuffer between your NDRange calls. I don't recommend doing this if you can help it, though. There should be no reason why you can't use the same buffer for successive NDRange calls unless you need it for something else in the meantime.
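To make the handle-versus-contents distinction concrete, here is a minimal plain C analogy. It does not call OpenCL at all: fake_mem_handle, fake_create, fake_copy and fake_release are hypothetical stand-ins, chosen only because cl_mem is itself a pointer to an opaque struct. fake_copy plays the role that clEnqueueCopyBuffer plays in a real program.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an opaque handle type such as cl_mem. */
typedef struct fake_mem {
    size_t size;
    unsigned char *bytes;
} *fake_mem_handle;

/* Allocate a zero-filled buffer object; returns NULL on failure. */
fake_mem_handle fake_create(size_t size) {
    fake_mem_handle h = malloc(sizeof *h);
    if (h == NULL) return NULL;
    h->bytes = calloc(size, 1);
    if (h->bytes == NULL) { free(h); return NULL; }
    h->size = size;
    return h;
}

/* Explicit copy of the contents from src to dst; returns 0 on success. */
int fake_copy(fake_mem_handle src, fake_mem_handle dst, size_t size) {
    if (size > src->size || size > dst->size) return -1;
    memcpy(dst->bytes, src->bytes, size);
    return 0;
}

/* Free the storage and the object itself. */
void fake_release(fake_mem_handle h) {
    if (h != NULL) { free(h->bytes); free(h); }
}
```

With this sketch, alias = buff gives two names for one buffer, so a later write through buff is visible through alias, while a buffer filled by fake_copy keeps the old contents.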

Memory management. Byte copy of CF Object

I have come across an interesting question.
I have this piece of code (don't ask why I need to do something like this):
CFDataRef data = ... get data from somewhere
SSLContextRef sslContext;
CFDataGetBytes(data, CFRangeMake(0, sizeof(SSLContextRef)), (UInt8 *)&sslContext);
Now I don't know what to do with sslContext. As I understand it, I have made a byte copy of the SSLContextRef and I need to free that memory after I have used it.
So here is the question: how can I properly free the memory?
Again, as I understand it, I can't call CFRelease because I didn't increment the reference count when I got (copied) the object, and if I simply try free(sslContext) I get a crash.
I would appreciate it if someone could explain how this should work.
EDIT: Thanks to user gaige, who pointed out that in the question I copied only the SSLContextRef reference. As I understand it, if I do:
UInt8 *buffer = malloc(sizeof(SSLContext));
CFDataGetBytes(data, CFRangeMake(0, sizeof(SSLContext)), buffer);
then I can just call free(buffer); without any problem (provided that I didn't do any CFRetain/CFRelease logic). Please correct me if I am wrong.
In this case, you copied sizeof(SSLContextRef) bytes from the CFData object referenced by data. You didn't increase any reference counts, nor did you copy anything other than the pointer to the SSLContext structure (SSLContextRef is declared as a pointer to struct SSLContext).
The bytes you copied ended up in sslContext in your current stack frame, so they need no special cleanup from you; they go away when the function returns.
In short, you don't need to do anything because no data was copied to the heap.
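The same situation can be reproduced in plain C without CoreFoundation. This is only a sketch: struct context, context_ref and ref_from_blob are hypothetical names standing in for SSLContext, SSLContextRef and the CFDataGetBytes call.

```c
#include <string.h>

/* Hypothetical stand-in for an opaque struct such as SSLContext. */
struct context { int id; };
typedef struct context *context_ref;

/* Copy sizeof(context_ref) bytes out of a byte blob into a local reference,
 * the same shape as the CFDataGetBytes call in the question. */
context_ref ref_from_blob(const unsigned char *blob) {
    context_ref ref;                 /* automatic storage, nothing on the heap */
    memcpy(&ref, blob, sizeof ref);  /* copies the pointer value only */
    return ref;
}
```

The returned reference points at the original object, not at a duplicate, so there is nothing new to free. Calling free on it would try to release memory the caller never allocated.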

Difference between writing data in memory directly and using asm instruction

I am reading the Linux kernel source and am curious about the ways it writes data to memory.
In some drivers, they use the writel() function defined in asm/io.h, and the definition of that function uses the movnti instruction. I don't really understand what this instruction does, except that it is a kind of mov instruction.
Anyway, when writing data to memory, what's the difference between using writel() and writing to memory directly, e.g. **address = data;?
Here is the case:
static inline void __writel(__u32 val, volatile void __iomem *addr)
{
    volatile __u32 __iomem *target = addr;
    asm volatile("movnti %1,%0"
                 : "=m" (*target)
                 : "r" (val) : "memory");
}
and this is another case:
*(unsigned int*)(MappedAddr+pageOffset) = result;
writel looks like it's intended for memory-mapped I/O, and a few things support this. First, it uses a volatile pointer, which prevents the compiler from optimizing the access away or reordering it relative to other volatile accesses. Second, it uses a non-temporal instruction (I/O reads and writes shouldn't be cached). And of course the __iomem annotation points the same way.
If I understand this correctly, using the movnti instruction minimises the impact on the processor's data caches. Using *(unsigned int*)(MappedAddr+pageOffset) = result; instead leaves the compiler free to choose whichever move instruction it likes, and it's likely to choose one that causes the cache line to be pulled into the cache, which is probably not what you want if you're interacting with a memory-mapped device.
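The volatile half of that argument can be shown in portable C. This is a sketch only: store32_volatile and store32_plain are hypothetical helpers, not the kernel's writel(), and the sketch deliberately leaves out the movnti instruction, so it says nothing about cache behaviour.

```c
#include <stdint.h>

/* Hypothetical helper: a 32-bit store through a volatile-qualified pointer.
 * The compiler has to perform this access and may not drop it. */
static inline void store32_volatile(volatile uint32_t *addr, uint32_t val) {
    *addr = val;
}

/* Plain store: the compiler may merge it with a neighbouring store, or drop
 * it if it can prove the value is never read back. */
static inline void store32_plain(uint32_t *addr, uint32_t val) {
    *addr = val;
}
```

On ordinary RAM both functions leave the same value behind. The difference only matters when the address is a device register, where every individual write has an effect the compiler cannot see.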

bufferData - usage parameter differences

While reading the specification at Khronos, I found:
bufferData(ulong target, Object data, ulong usage)
'usage' parameter can be: STREAM_DRAW, STATIC_DRAW or DYNAMIC_DRAW
My question is, which one should I use?
What are the advantages, what are the differences?
Why would I choose something other than STATIC_DRAW?
Thanks.
For 'desktop' OpenGL, there is a good explanation here:
http://www.opengl.org/wiki/Buffer_Object
Basically, the usage parameter is a hint to OpenGL/WebGL about how you intend to use the buffer. The implementation can then optimize the buffer based on your hint.
The OpenGL ES docs say the following, which is not exactly the same as for OpenGL (remember that WebGL is derived from OpenGL ES):
STREAM
The data store contents will be modified once and used at most a few times.
STATIC
The data store contents will be modified once and used many times.
DYNAMIC
The data store contents will be modified repeatedly and used many times.
The nature of access must be:
DRAW
The data store contents are modified by the application, and used as the source for GL drawing and image specification commands.
The most common usage is STATIC_DRAW (for static geometry), but I recently created a small particle system where DYNAMIC_DRAW makes more sense (the particles are stored in a single buffer, and parts of the buffer are updated when particles are emitted).
http://jsfiddle.net/mortennobel/YHMQZ/
Code snippet:
function createVertexBufferObject(){
    particleBuffer = gl.createBuffer();
    gl.bindBuffer(gl.ARRAY_BUFFER, particleBuffer);
    var vertices = new Float32Array(vertexBufferSize * particleSize);
    gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.DYNAMIC_DRAW);
    bindAttributes();
}
function emitParticle(x, y, velocityX, velocityY){
    gl.bindBuffer(gl.ARRAY_BUFFER, particleBuffer);
    // ...
    gl.bufferSubData(gl.ARRAY_BUFFER, particleId * particleSize * sizeOfFloat, data);
    particleId = (particleId + 1) % vertexBufferSize;
}
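The offset arithmetic in emitParticle is what lets a single DYNAMIC_DRAW buffer be updated one particle at a time. Here is a small C sketch of just that arithmetic. The function names are hypothetical, and it assumes each particle is a fixed number of 32-bit floats, as the Float32Array in the snippet suggests.

```c
#include <stddef.h>

/* Byte offset of one particle slot, mirroring
 * particleId * particleSize * sizeOfFloat in the snippet. */
size_t slot_byte_offset(size_t particle_id, size_t floats_per_particle) {
    return particle_id * floats_per_particle * sizeof(float);
}

/* Index of the next slot to overwrite, wrapping to 0 after the last slot. */
size_t next_slot(size_t particle_id, size_t slot_count) {
    return (particle_id + 1) % slot_count;
}
```

Because the index wraps, the oldest particle is overwritten once the buffer is full, so the buffer never has to be reallocated.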

When is the data copied to GPU memory?

I have some well known steps in my program:
CreateBuffer
Create..View
CSSet..Views
Dispatch
At which step is the data copied to the GPU?
The reason they down-voted it is because it seems as if you didn't put any effort into a little Google search.
Answer: DirectX usually transfers data from system memory into video memory when the creation methods are called. An example of a creation method is ID3D11Device::CreateBuffer. This method takes a pointer to the initial data so that it can be copied from system RAM to video RAM. However, if the pointer passed in is null, it just reserves the space so that you can copy the data in later.
Example:
If you create a dynamic vertex buffer and don't pass the data in at creation time, you will have to use Map/Unmap to copy the data into video memory.
// Fill in a buffer description.
D3D11_BUFFER_DESC bufferDesc;
bufferDesc.Usage = D3D11_USAGE_DYNAMIC;
bufferDesc.ByteWidth = sizeof(Vertex_dynamic) * m_iHowManyVertices;
bufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bufferDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
bufferDesc.MiscFlags = 0;
bufferDesc.StructureByteStride = 0;
// Fill in the subresource data.
D3D11_SUBRESOURCE_DATA InitData;
InitData.pSysMem = &_vData[0];
InitData.SysMemPitch = 0;
InitData.SysMemSlicePitch = 0;
// Create the vertex buffer.
/* The data is copied during this call. */
m_pDxDevice->CreateBuffer(&bufferDesc, &InitData, &m_pDxVertexBuffer_PiecePos);
DirectX manages the memory for you and the data is copied to the GPU when it needs to be.

What is the correct way to clear sensitive data from memory in iOS?

I want to clear sensitive data from memory in my iOS app.
On Windows I used SecureZeroMemory. Now, on iOS, I use plain old memset, but I'm a little worried the compiler might optimize it away:
https://buildsecurityin.us-cert.gov/bsi/articles/knowledge/coding/771-BSI.html
code snippet:
NSData *someSensitiveData;
memset((void *)someSensitiveData.bytes, 0, someSensitiveData.length);
Paraphrasing 771-BSI (see the link in the question):
A way to avoid having the memset call optimized out by the compiler is to access the buffer again after the memset call in a way that would force the compiler not to optimize the location. This can be achieved by
*(volatile char*)buffer = *(volatile char*)buffer;
after the memset() call.
In fact, you could write a secure_memset() function
void* secure_memset(void *v, int c, size_t n) {
    volatile char *p = v;
    while (n--) *p++ = c;
    return v;
}
(Code taken from 771-BSI. Thanks to Daniel Trebbien for pointing out a possible defect in the previous code proposal.)
Why does volatile prevent optimization? See https://stackoverflow.com/a/3604588/220060
UPDATE Please also read Sensitive Data In Memory, because if you have an adversary on your iOS system, you are already more or less screwed even before he tries to read that memory. In summary, SecureZeroMemory() or secure_memset() do not really help.
The problem is that NSData is immutable and you do not control what happens to its storage. If you control the buffer yourself, you could use dataWithBytesNoCopy:length: so that NSData acts only as a wrapper. When finished, you could memset your buffer.
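For the case where you do own the buffer, the pattern is "use it, then wipe it through a volatile pointer". Below is a minimal C sketch of that pattern. wipe_bytes follows the same idea as secure_memset above, and use_and_wipe is a hypothetical caller that only sums the bytes to stand in for real work.

```c
#include <stddef.h>

/* Overwrite n bytes at v with c through a volatile pointer so the stores
 * are not optimized away. */
void *wipe_bytes(void *v, int c, size_t n) {
    volatile unsigned char *p = v;
    while (n--) *p++ = (unsigned char)c;
    return v;
}

/* Hypothetical usage: the caller owns the buffer, uses it, then wipes it. */
int use_and_wipe(unsigned char *secret, size_t len) {
    int checksum = 0;
    for (size_t i = 0; i < len; i++) {
        checksum += secret[i];       /* stand-in for real use of the secret */
    }
    wipe_bytes(secret, 0, len);      /* zero the caller's buffer in place */
    return checksum;
}
```

This only clears the one buffer you pass in. Copies made elsewhere (for example by an NSData that copied the bytes) are not touched.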
