I capture the Direct3D back buffer. When I download the pixels, the image frame is flipped along its vertical axis. Is it possible to "tell" D3D to flip the frame when copying the resource, or when creating the target ID3D11Texture2D?
Here is how I do it:
The texture into which I copy the frame buffer is created like this:
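D3D11_TEXTURE2D_DESC description =
{
    desc.BufferDesc.Width, desc.BufferDesc.Height, 1, 1,
    desc.BufferDesc.Format, // assumed: CopyResource needs a format compatible with the back buffer
    { 1, 0 }, // DXGI_SAMPLE_DESC
    D3D11_USAGE_STAGING, // transfer from GPU to CPU
    0, D3D11_CPU_ACCESS_READ, 0 // no bind flags; CPU read access is required for Map(D3D11_MAP_READ)
};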
D3D11_SUBRESOURCE_DATA data = { buffer, desc.BufferDesc.Width * PIXEL_SIZE, 0 };
device->CreateTexture2D(&description, &data, &pNewTexture);
Then on each frame I do:
pSwapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), reinterpret_cast< void** >(&pSurface));
pContext->CopyResource(pNewTexture, pSurface);
pContext->Map(pNewTexture, 0, D3D11_MAP_READ , 0, &resource);
//reading from resource.pData
PS: I don't have a control of the rendering pipeline. I hook an external app with this code.
Also, I don't want to touch the pixel buffer on the CPU (a reverse copy in a loop, etc.). Low copy latency is a high priority.
I also tried this:
D3D11_BOX box;
box.left = 0;
box.right = desc.BufferDesc.Width;
box.top = desc.BufferDesc.Height;
box.bottom = 0;
box.front = 0;
box.back = 1;
pContext->CopySubresourceRegion(pNewTexture, 0, 0, 0, 0, pSurface, 0, &box);
This leaves the frame empty.

Create a texture with D3D11_USAGE_DEFAULT, with CPUAccessFlags=0 and BindFlags=D3D11_BIND_SHADER_RESOURCE. CopyResource the swap chain's back buffer to it. Create another texture with D3D11_BIND_RENDER_TARGET. Set it as the render target, set a pixel shader, and draw a flipped quad using the first texture. Now you should be able to CopyResource the second texture to the staging texture that you use now. This should be faster than flipping the image data on the CPU. However, this solution takes more resources on the GPU and might be hard to set up in a hook.

All Direct3D mapped resources should be processed scanline-by-scanline, so just reverse the copy:
auto ptr = reinterpret_cast<const uint8_t*>(resource.pData)
    + (desc.BufferDesc.Height - 1) * resource.RowPitch;
for (unsigned int y = 0; y < desc.BufferDesc.Height; ++y)
{
    // do something with the data in ptr
    // which is desc.BufferDesc.Width * BytesPerPixel(desc.Format) bytes
    // i.e. DXGI_FORMAT_R8G8B8A8_UNORM would be desc.BufferDesc.Width * 4
    ptr -= resource.RowPitch;
}
For lots of examples of working with Direct3D resources, see DirectXTex.
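To make the reverse row walk concrete without any Direct3D dependency, here is a minimal JavaScript sketch of the same idea. It assumes the mapped data is available as a `Uint8Array` whose rows are `rowPitch` bytes apart (the pitch may be larger than `width * bytesPerPixel`); `flipRows` and its parameters are hypothetical names used only for illustration.

```javascript
// Copy a pitched image bottom-up into a tightly packed buffer.
// src: Uint8Array holding `height` scanlines, each `rowPitch` bytes apart.
function flipRows(src, width, height, bytesPerPixel, rowPitch) {
  const rowBytes = width * bytesPerPixel;    // payload bytes per scanline
  const dst = new Uint8Array(rowBytes * height);
  let srcOffset = (height - 1) * rowPitch;   // start at the last scanline
  for (let y = 0; y < height; ++y) {
    // Copy only the payload, skipping any padding at the end of the row.
    dst.set(src.subarray(srcOffset, srcOffset + rowBytes), y * rowBytes);
    srcOffset -= rowPitch;                   // step back one scanline
  }
  return dst;
}
```

The flip costs nothing extra if the rows have to be copied out of the mapped resource anyway: the same number of bytes is read, only the row order changes.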


What's the effect of geometry on the final texture output in WebGL?

Updated with more explanation around my confusion
(This is how a non-graphics developer imagines the rendering process!)
I specify a 2x2 square to be drawn by way of two triangles. I'm not going to talk about the triangles anymore; a square is a lot better. Let's say the square gets drawn in one piece.
I have not specified any units for my drawing. The only places in my code where I do something like that are the canvas size (set to 1x1 in my case) and the viewport (I always set this to the dimensions of my output texture).
Then I call draw().
What happens is this: that regardless of the size of my texture (being 1x1 or 10000x10000) all my texels are filled with data (color) that I returned from my frag shader. This is working each time perfectly.
So now I'm trying to explain this to myself:
The GPU is only concerned with coloring the pixels.
Pixel is the smallest unit that the GPU deals with (colors).
Depending on how many pixels my 2x2 square is mapped to, I should be running into one of the following 3 cases:
The number of pixels (to be colored) and my output texture dims match one to one: In this ideal case, for each pixel, there would be one value assigned to my output texture. Very clear to me.
The number of pixels is fewer than my output texture dims. In this case, I should expect some of the output texels to have the exact same value (the color of the pixel they fall under). For instance, if the GPU ends up drawing 16x16 pixels and my texture is 64x64, then I'll have blocks of 4x4 texels which get the same value. I have not observed such a case regardless of the size of my texture, which means there is never a case where we end up with fewer pixels (really hard to imagine -- let's keep going).
The number of pixels end up being more than the number of texels. In this case, the GPU should decide which value to assign to my texel. Would it average out the pixel colors? If the GPU is coloring 64x64 pixels and my output texture is 16x16 then I should expect that each texel gets an average color of the 4x4 pixels it contains. Anyway, in this case my texture should be completely filled with values I didn't intend specifically for them (like averaged out) however this has not been the case.
I didn't even talk about how many times my frag shader gets called because it didn't matter. The results would be deterministic anyway.
So considering that I have never run into 2nd and 3rd case where the values in my texels are not what I expected them the only conclusion I can come up with is that the whole assumption of the GPU trying to render pixels is actually wrong. When I assign an output texture to it (which is supposed to stretch over my 2x2 square all the time) then the GPU will happily oblige and for each texel will call my frag shader. Somewhere along the line the pixels get colored too.
But the above lunatic explanation also fails to answer why I end up with no values or incorrect values in my texels if I stretch my geometry to 1x1 or 4x4 instead of 2x2.
Hopefully the above fantastic narration of the GPU coloring process has given you clues as to where I'm getting this wrong.
Original Post:
We're using WebGL for general computation. As such we create a rectangle and draw 2 triangles in it. Ultimately what we want is the data inside the texture mapped to this geometry.
What I don't understand is if I change the rectangle from (-1,-1):(1,1) to say (-0.5,-0.5):(0.5,0.5) suddenly data is dropped from the texture bound to the framebuffer.
I'd appreciate it if someone could help me understand the correlations. The only places where the real dimensions of the output texture come into play are the calls to viewport() and readPixels().
Below are relevant pieces of code for you to see what I'm doing:
... // canvas is created with size: 1x1
... // context attributes passed to canvas.getContext()
contextAttributes = {
  alpha: false,
  depth: false,
  antialias: false,
  stencil: false,
  preserveDrawingBuffer: false,
  premultipliedAlpha: false,
  failIfMajorPerformanceCaveat: true
};
... // default geometry
createDefaultGeometry() {
  // Sets of x,y,z (for rectangle) and s,t coordinates (for texture)
  return new Float32Array([
    -1.0, 1.0, 0.0, 0.0, 1.0, // upper left
    -1.0, -1.0, 0.0, 0.0, 0.0, // lower left
    1.0, 1.0, 0.0, 1.0, 1.0, // upper right
    1.0, -1.0, 0.0, 1.0, 0.0 // lower right
  ]);
}
const geometry = this.createDefaultGeometry();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, geometry, gl.STATIC_DRAW);
... // binding to the vertex shader attribs
gl.vertexAttribPointer(positionHandle, 3, gl.FLOAT, false, 20, 0);
gl.vertexAttribPointer(textureCoordHandle, 2, gl.FLOAT, false, 20, 12);
... // setting up framebuffer; I set the viewport to output texture dimensions (I think this is absolutely needed but not sure)
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.bindFramebuffer(gl.FRAMEBUFFER, this.framebuffer);
gl.framebufferTexture2D(
  gl.FRAMEBUFFER, // The target is always a FRAMEBUFFER.
  gl.COLOR_ATTACHMENT0, // We are providing the color buffer.
  gl.TEXTURE_2D, // This is a 2D image texture.
  texture, // The texture.
  0); // 0, we aren't using MIPMAPs
gl.viewport(0, 0, width, height);
... // reading from output texture
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.readPixels(0, 0, width, height, gl.RED, gl.FLOAT, buffer); // format comes before type
new answer
I'm just saying the same thing yet again (3rd time?)
Copied from below
WebGL is destination based. That means it's going to iterate over the pixels of the line/point/triangle it's drawing and for each point call the fragment shader and ask "what value should I store here?"
It's destination based. It's going to draw each pixel exactly once. For that pixel it's going to ask "what color should I make this?"
destination based loop
for (let i = start; i < end; ++i) {
  fragmentShaderFunction(); // must set gl_FragColor
  destinationTextureOrCanvas[i] = gl_FragColor;
}
You can see in the loop above that there is no way to set an arbitrary destination, and no part of the destination is set twice. It just runs from start to end and, exactly once for each pixel in the destination between start and end, asks what color it should make that pixel.
How do you set start and end? Again, to make it simple, let's assume a 200x1 texture so we can ignore Y. It works like this:
vertexShaderFunction(); // must set gl_Position
const start = clipspaceToArrayspaceViaViewport(viewport, gl_Position.x);
vertexShaderFunction(); // must set gl_Position
const end = clipspaceToArrayspaceViaViewport(viewport, gl_Position.x);
for (let i = start; i < end; ++i) {
  fragmentShaderFunction(); // must set gl_FragColor
  texture[i] = gl_FragColor;
}
See below for clipspaceToArrayspaceViaViewport.
What is viewport? viewport is what you set when you called `gl.viewport(x, y, width, height)`.
So, set gl_Position.x to -1 and +1, viewport.x to 0 and viewport.width = 200 (the width of the texture) then start will be 0, end will be 200
set gl_Position.x to .25 and .75, viewport.x to 0 and viewport.width = 200 (the width of the texture). The start will be 125 and end will be 175
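The two cases above, and the shrunken or stretched quad from the question, can be checked with a small 1D sketch. It reuses the clip-space-to-viewport formula from this answer; `clipToPixel` and `coveredPixels` are hypothetical helper names, and clamping to the viewport is a simplification of what clipping does in real WebGL.

```javascript
// Map a clip-space coordinate (-1..+1) to a pixel index via the viewport.
function clipToPixel(viewport, clip) {
  return Math.floor((clip * 0.5 + 0.5) * viewport.width) + viewport.x;
}

// Count how many destination pixels get a fragment shader call when a span
// from clipStart to clipEnd is drawn. Anything outside the viewport is
// clipped away, which is modeled here by clamping.
function coveredPixels(viewport, clipStart, clipEnd) {
  const first = viewport.x;
  const last = viewport.x + viewport.width;
  const start = Math.max(first, clipToPixel(viewport, clipStart));
  const end = Math.min(last, clipToPixel(viewport, clipEnd));
  return Math.max(0, end - start);
}

const viewport = { x: 0, width: 200 };
console.log(coveredPixels(viewport, -1, 1));      // 200: every texel is written
console.log(coveredPixels(viewport, -0.5, 0.5));  // 100: outer texels are never touched
console.log(coveredPixels(viewport, -2, 2));      // 200: the excess is clipped
```

So a quad from -0.5 to +0.5 only produces fragments for the middle half of the texture, which is why data appears to be "dropped". An oversized quad still writes every texel, but the interpolated texture coordinates no longer line up with the texels.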
I honestly feel like this answer is leading you down the wrong path. It's not remotely this complicated. You don't have to understand any of this to use WebGL IMO.
The simple answer is
You set gl.viewport to the sub rectangle you want to affect in your destination (canvas or texture it doesn't matter)
You make a vertex shader that somehow sets gl_Position to clip space coordinates (they go from -1 to +1) across the texture
Those clip space coordinates get converted to viewport space. It's basic math to map one range to another range, but it's mostly not important. It seems intuitive that -1 will draw to the viewport.x pixel and +1 will draw to the viewport.x + viewport.width - 1 pixel. That's what "maps from clip space to the viewport settings" means.
It's most common for the viewport settings to be (x = 0, y = 0, width = width of destination texture or canvas, height = height of destination texture or canvas)
So that just leaves what you set gl_Position to. Those values are in clip space just like it explains in this article.
If you want, you can make it simple by converting from pixel space to clip space, just like it explains in this article:
zeroToOne = someValueInPixels / destinationDimensions;
zeroToTwo = zeroToOne * 2.0;
clipspace = zeroToTwo - 1.0;
gl_Position = clipspace;
If you continue the articles they'll also show adding a value (translation) and multiplying by a value (scale)
Using just those 2 things and a unit square (0 to 1) you can choose any rectangle on the screen. Want to affect 123 to 127? That's 5 units, so scale = 5, translation = 123. Then apply the math above to convert from pixels to clip space and you'll get the rectangle you want.
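Here is that scale-and-translate example worked through as a small sketch. The destination width of 200 is an assumption made only for this example, and `pixelToClip` and `unitToClip` are hypothetical names; in a real program this math would live in the vertex shader.

```javascript
// Convert a pixel coordinate to clip space for a destination of the given size.
function pixelToClip(pixel, destinationSize) {
  const zeroToOne = pixel / destinationSize;
  const zeroToTwo = zeroToOne * 2.0;
  return zeroToTwo - 1.0;
}

// Place a unit-square coordinate (0 or 1) using a scale and translation in
// pixels, then convert the result to clip space.
function unitToClip(unit, scale, translation, destinationSize) {
  return pixelToClip(unit * scale + translation, destinationSize);
}

const size = 200;                            // assumed destination width
const left = unitToClip(0, 5, 123, size);    // left edge of pixel 123
const right = unitToClip(1, 5, 123, size);   // right edge of pixel 127 (= 128)
console.log(left, right);                    // roughly 0.23 and 0.28
```

Pixels 123 through 127 span the edges 123 to 128, which is why a scale of 5 covers exactly five pixels.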
If you continue further through those articles you'll eventually get to the point where that math is done with matrices, but you can do that math however you want. It's like asking "how do I compute the value 3". Well, 1 + 1 + 1, or 3 + 0, or 9 / 3, or (100 - 50 + 20 * 2) / 30, or (7^2 - 19) / 10, or ????
I can't tell you how to set gl_Position. I can only tell you to make up whatever math you want and set it to *clip space*, and then give an example of converting from pixels to clip space (see above) as just one example of some possible math.
old answer
I get that this might not be clear, but I don't know how else to help. WebGL draws lines, points, or triangles to a 2D array. That 2D array is either the canvas, a texture (as a framebuffer attachment) or a renderbuffer (as a framebuffer attachment).
The size of the area is defined by the size of the canvas, texture, renderbuffer.
You write a vertex shader. When you call gl.drawArrays(primitiveType, offset, count) you're telling WebGL to call your vertex shader count times. Assuming primitiveType is gl.TRIANGLES then for every 3 vertices generated by your vertex shader WebGL will draw a triangle. You specify that triangle by setting gl_Position in clip space.
Assuming gl_Position.w is 1, clip space goes from -1 to +1 in X and Y across the destination canvas/texture/renderbuffer. (gl_Position.x and gl_Position.y are divided by gl_Position.w, which is not really important for your case.)
To convert back to actual pixels, your X and Y are converted based on the settings of gl.viewport. Let's just do X:
pixelX = ((clipspace.x / clipspace.w) * .5 + .5) * viewport.width + viewport.x
WebGL is destination based. That means it's going to iterate over the pixels of the line/point/triangle it's drawing and for each point call the fragment shader and ask "what value should I store here?"
Let's translate that to JavaScript in 1D. Let's assume you have an 1D array
const dst = new Array(100);
Let's make a function that takes a start and end and sets values between
function setRange(dst, start, end, value) {
  for (let i = start; i < end; ++i) {
    dst[i] = value;
  }
}
You can fill the entire 100 element array with 123
const dst = new Array(100);
setRange(dst, 0, 100, 123); // end is exclusive
To set the last half of the array to 456
const dst = new Array(100);
setRange(dst, 50, 100, 456); // end is exclusive
Let's change that to use clip space like coordinates
function setClipspaceRange(dst, clipStart, clipEnd, value) {
  const start = clipspaceToArrayspace(dst, clipStart);
  const end = clipspaceToArrayspace(dst, clipEnd);
  for (let i = start; i < end; ++i) {
    dst[i] = value;
  }
}
function clipspaceToArrayspace(array, clipspaceValue) {
  // convert clipspace value (-1 to +1) to (0 to 1)
  const zeroToOne = clipspaceValue * .5 + .5;
  // convert zeroToOne value to array space
  return Math.floor(zeroToOne * array.length);
}
This function now works just like the previous one except takes clip space values instead of array indices
// fill entire array with 123
const dst = new Array(100);
setClipspaceRange(dst, -1, +1, 123);
Set the last half of the array to 456
setClipspaceRange(dst, 0, +1, 456);
Now abstract one more time. Instead of using the array's length use a setting
// viewport looks like `{ x: number, width: number} `
function setClipspaceRangeViaViewport(dst, viewport, clipStart, clipEnd, value) {
  const start = clipspaceToArrayspaceViaViewport(viewport, clipStart);
  const end = clipspaceToArrayspaceViaViewport(viewport, clipEnd);
  for (let i = start; i < end; ++i) {
    dst[i] = value;
  }
}
function clipspaceToArrayspaceViaViewport(viewport, clipspaceValue) {
  // convert clipspace value (-1 to +1) to (0 to 1)
  const zeroToOne = clipspaceValue * .5 + .5;
  // convert zeroToOne value to array space
  return Math.floor(zeroToOne * viewport.width) + viewport.x;
}
Now to fill the entire array with 123
const dst = new Array(100);
const viewport = { x: 0, width: 100 };
setClipspaceRangeViaViewport(dst, viewport, -1, 1, 123);
To set the last half of the array to 456 there are now 2 ways. Way one is just like the previous, using 0 to +1
setClipspaceRangeViaViewport(dst, viewport, 0, 1, 456);
You can also set the viewport to start half way through the array
const halfViewport = { x: 50, width: 50 };
setClipspaceRangeViaViewport(dst, halfViewport, -1, +1, 456);
I don't know if that was helpful or not.
The only other thing to add is that, instead of passing a value, you pass a function that gets called every iteration to supply the value
function setClipspaceRangeViaViewport(dst, viewport, clipStart, clipEnd, fragmentShaderFunction) {
  const start = clipspaceToArrayspaceViaViewport(viewport, clipStart);
  const end = clipspaceToArrayspaceViaViewport(viewport, clipEnd);
  for (let i = start; i < end; ++i) {
    dst[i] = fragmentShaderFunction();
  }
}
Note this is the exact same thing that is said in this article and clarified somewhat in this article.

Empty WebGL context uses a lot of memory

For example, on my 940M video card, the canvas created with the following code takes 500 MB of video memory
var c = document.createElement('canvas');
var ctx = c.getContext('webgl');
c.width = c.height = 4096;
At the same time, an OpenGL context of the same size uses only 100 MB of video memory:
glutInit(&argc, argv);
int s = 4096;
glutInitWindowSize(s, s);
glutCreateWindow("Hello world :D");
Why does WebGL use so much memory? Is it possible to reduce the amount of memory used for a context of the same size?
As LJ pointed out, canvas is double buffered, antialiased, has alpha and a depth buffer by default. You made the canvas 4096 x 4096 so that's
16meg * 4 (RGBA) or 64meg for one buffer
You get that times at least 4
front buffer = 1
antialiased backbuffer = 2 to 16
depth buffer = 1
So that's 256meg to 1152meg depending on what the browser picks for antialiasing.
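The arithmetic above can be written out as a rough estimator. This is only a sketch under the same assumptions as the list: 4 bytes per pixel for every buffer, one front buffer, a back buffer that costs one buffer per antialiasing sample, and a single-sample depth buffer. `estimateCanvasMiB` is a hypothetical helper, and real browsers and drivers may allocate differently.

```javascript
// Estimate drawing-buffer memory in MiB for a square canvas.
// samples: antialiasing sample count chosen by the browser (1 = no antialiasing).
function estimateCanvasMiB(size, samples, { depth = true } = {}) {
  const bytesPerBuffer = size * size * 4;  // assumes 4 bytes per pixel
  const buffers = 1          // front buffer
    + samples                // (antialiased) back buffer
    + (depth ? 1 : 0);       // depth buffer
  return (bytesPerBuffer * buffers) / (1024 * 1024);
}

console.log(estimateCanvasMiB(4096, 2));                    // 256
console.log(estimateCanvasMiB(4096, 16));                   // 1152
console.log(estimateCanvasMiB(4096, 1, { depth: false }));  // 128
```

The last line corresponds to asking for no depth buffer and no antialiasing, which under these assumptions leaves just the front and back buffers.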
In answer to your question you can try to not ask for a depth buffer, alpha buffer and/or antialiasing
var c = document.createElement('canvas');
var ctx = c.getContext('webgl', { alpha: false, depth: false, antialias: false});
c.width = c.height = 4096;
Whether the browser actually doesn't allocate an alpha channel or does but just ignores it is up to the browser and driver. Whether it will actually not allocate a depth buffer is also up to the browser. Passing antialias: false should at least make the 2nd buffer 1x instead of 2x to 16x.

How to create VHS effect on iOS using GPUImage or another library

I am trying to make a VHS effect for an iOS app, just like in this video:
I want to achieve this with as few effects as possible, to keep the CPU load low.
Basically what I need is to crank up the color levels to create a "chromatic aberration", change the Sharpen parameters, and add some Gaussian blur and some noise.
I am using GPUImage. The Sharpen and Gaussian blur are easy to apply.
I am having two problems:
1) For the "chromatic aberration", the usual way is to duplicate the video three times, set red to 0 on one copy, blue to 0 on another, and green to 0 on the last one, then blend them together (just like in the tutorial). But doing this on an iPhone would be too CPU-intensive.
Any idea how to achieve the same effect without having to duplicate the video and blend it?
2) I also want to add some noise but do not know which GPUImage effect to use. Any idea on this one too?
Thanks a lot,
(I'm not an iOS developer but I hope this can help someone.)
I wrote a VHS filter on Windows, this is what I did:
Crop the video frame to 4:3 aspect ratio and lower the resolution to 360*270.
Lower color saturation, and apply a color matrix to reduce green color to 93% (so the video will look purple).
Apply a convolve matrix to sharpen the video frame directionally. This is the kernel I used:
0 -0.5 0 0
-0.5 2.9 0 -0.5
0 -0.5 0 0
Blend a real blank VHS footage to your video for the noise (search for "VHS overlay" on YouTube).
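To show what the directional sharpening kernel in step 3 does to pixel values, here is a minimal CPU sketch on a grayscale image. It is not how Win2D implements `ConvolveMatrixEffect`; the anchor position (row 1, column 1, the 2.9 weight), the edge clamping, and the function name `applyKernel` are assumptions made for this illustration, and the kernel is applied as a plain weighted sum without flipping.

```javascript
// The 4x3 kernel from step 3, row-major (3 rows of 4 weights).
const KERNEL = [
  [0, -0.5, 0, 0],
  [-0.5, 2.9, 0, -0.5],
  [0, -0.5, 0, 0],
];

// Apply the kernel to a grayscale image given as an array of rows.
// Assumptions: the kernel is anchored at (anchorRow, anchorCol) and samples
// outside the image are clamped to the nearest edge pixel.
function applyKernel(image, kernel, anchorRow = 1, anchorCol = 1) {
  const height = image.length;
  const width = image[0].length;
  const clamp = (v, max) => Math.min(Math.max(v, 0), max);
  return image.map((row, y) =>
    row.map((_, x) => {
      let sum = 0;
      for (let ky = 0; ky < kernel.length; ++ky) {
        for (let kx = 0; kx < kernel[ky].length; ++kx) {
          const sy = clamp(y + ky - anchorRow, height - 1);
          const sx = clamp(x + kx - anchorCol, width - 1);
          sum += image[sy][sx] * kernel[ky][kx];
        }
      }
      return sum;
    })
  );
}
```

The weights sum to 0.9, so flat areas come out slightly darker (a value of 100 becomes about 90), while the asymmetric horizontal weights exaggerate edges more on one side than the other.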
The CPU and GPU consumption is OK. I apply this filter to the real-time camera preview on my old Windows phone (with a Snapdragon 808), and it works fine.
Code (C#, using Win2D library for GPU acceleration, implementing Windows.Media.Effects.IBasicVideoEffect interface):
public void ProcessFrame(ProcessVideoFrameContext context) // This method is called each frame
{
    int outputWidth = 360; // Output resolution
    int outputHeight = 270;
    IDirect3DSurface inputSurface = context.InputFrame.Direct3DSurface;
    IDirect3DSurface outputSurface = context.OutputFrame.Direct3DSurface;
    using (CanvasBitmap inputFrame = CanvasBitmap.CreateFromDirect3D11Surface(canvasDevice, inputSurface)) // The video frame to be processed
    using (CanvasRenderTarget outputFrame = CanvasRenderTarget.CreateFromDirect3D11Surface(canvasDevice, outputSurface)) // The video frame after processing
    using (CanvasDrawingSession outputFrameDrawingSession = outputFrame.CreateDrawingSession())
    using (CanvasRenderTarget croppedFrame = new CanvasRenderTarget(canvasDevice, outputWidth, outputHeight, outputFrame.Dpi))
    using (CanvasDrawingSession croppedFrameDrawingSession = croppedFrame.CreateDrawingSession())
    // "overlayFrames" is a list of frames from the VHS overlay footage; here we just randomly pick one frame to blend.
    // Random.Next's upper bound is exclusive, so pass Count to make every frame eligible.
    using (CanvasBitmap overlay = Task.Run(async () => { return await CanvasBitmap.LoadAsync(canvasDevice, overlayFrames[new Random().Next(0, overlayFrames.Count)]); }).Result)
    {
        double inputWidth = inputFrame.Size.Width;
        double inputHeight = inputFrame.Size.Height;
        Rect rectangle;
        // Crop the inputFrame to 4:3 and scale it to 360*270, save it to "croppedFrame"
        if (3 * inputWidth > 4 * inputHeight)
        {
            double x = (inputWidth - inputHeight / 3 * 4) / 2;
            rectangle = new Rect(x, 0, inputWidth - 2 * x, inputHeight);
        }
        else
        {
            double y = (inputHeight - inputWidth / 4 * 3) / 2;
            rectangle = new Rect(0, y, inputWidth, inputHeight - 2 * y);
        }
        croppedFrameDrawingSession.DrawImage(inputFrame, new Rect(0, 0, outputWidth, outputHeight), rectangle, 1, CanvasImageInterpolation.HighQualityCubic);
        // Apply a bunch of effects (mentioned in steps 2, 3, 4) to "croppedFrame"
        BlendEffect vhsEffect = new BlendEffect
        {
            Background = new ConvolveMatrixEffect
            {
                Source = new ColorMatrixEffect
                {
                    Source = new SaturationEffect
                    {
                        Source = croppedFrame,
                        Saturation = 0.4f
                    },
                    ColorMatrix = new Matrix5x4
                    {
                        M11 = 1f,
                        M22 = 0.93f,
                        M33 = 1f,
                        M44 = 1f
                    }
                },
                KernelHeight = 3,
                KernelWidth = 4,
                KernelMatrix = new float[]
                {
                    0, -0.5f, 0, 0,
                    -0.5f, 2.9f, 0, -0.5f,
                    0, -0.5f, 0, 0,
                }
            },
            Foreground = overlay,
            Mode = BlendEffectMode.Screen
        };
        // And draw the result to "outputFrame"
        outputFrameDrawingSession.DrawImage(vhsEffect, rectangle, new Rect(0, 0, outputWidth, outputHeight));
    }
}

Cocos2d: pixel perfect collision for batched sprites

I found a pixel perfect collision algorithm developed by Daniel Vilchez and included in a project shared in this forum topic.
Below is the part of the algorithm I am interested in. I am trying to modify it because whenever I used CCRenderTexture, as in the original code, the app crashed.
I am thinking of alternative methods based on circle collision, but those are not pixel perfect, and they wouldn't work well when my bullet is a wave with this shape.
**I am wondering how I can get the algorithm working with sprites batched in a CCSpriteBatchNode. And if that is possible, does it strictly require the use of CCRenderTexture?**
To be precise, this question is partially related to this other question of mine, about creating an instance of CCRenderTexture that causes my app to crash. I posted two separate questions because here I am asking about the algorithm, while in the other one I only ask why CCRenderTexture causes my app to crash (without using Daniel's pixel perfect algorithm, just creating an instance of CCRenderTexture).
Adapted code (CCRenderTexture is missing here because it made my app crash, so I commented out the usage of _rt, the CCRenderTexture instance). The code does not work properly, so I guess I need CCRenderTexture, hence the question:
-(BOOL) isPixelPerfectCollisionBetweenSpriteA:(CCSprite*)spr1 spriteB:(CCSprite*) spr2
{
    BOOL isCollision = NO;
    CGRect intersection = CGRectIntersection([spr1 boundingBox], [spr2 boundingBox]);
    // Look for simple bounding box collision
    if (!CGRectIsEmpty(intersection))
    {
        // Get intersection info
        unsigned int x = intersection.origin.x;
        unsigned int y = intersection.origin.y;
        unsigned int w = intersection.size.width;
        unsigned int h = intersection.size.height;
        unsigned int numPixels = w * h;
        //NSLog(@"\nintersection = (%u,%u,%u,%u), area = %u",x,y,w,h,numPixels);
        // Draw into the RenderTexture
        //[_rt beginWithClear:0 g:0 b:0 a:0];
        // Render both sprites: first one in RED and second one in GREEN
        glColorMask(1, 0, 0, 1);
        [spr1 visit];
        glColorMask(0, 1, 0, 1);
        [spr2 visit];
        glColorMask(1, 1, 1, 1);
        // Get color values of intersection area
        ccColor4B *buffer = malloc( sizeof(ccColor4B) * numPixels );
        glReadPixels(x, y, w, h, GL_RGBA, GL_UNSIGNED_BYTE, buffer);
        //[_rt end];
        // Read buffer: a pixel with both red and green set is covered by both sprites
        unsigned int step = 1;
        for(unsigned int i=0; i<numPixels; i+=step)
        {
            ccColor4B color = buffer[i];
            if (color.r > 0 && color.g > 0)
            {
                isCollision = YES;
                break;
            }
        }
        // Free buffer memory
        free(buffer);
    }
    return isCollision;
}
EDIT: I found also KKPixelMaskSprite but it doesn't seem to work for high resolution sprites batched in CCSpriteBatchNodes (see comment here).

How do I draw thousands of squares with glkit, opengl es2?

I'm trying to draw up to 200,000 squares on the screen, or basically a lot of squares. I believe I'm just making way too many draw calls, and it's crippling the performance of the app. The squares only update when I press a button, so I don't necessarily have to update them every frame.
Here's the code I have now:
- (void)glkViewControllerUpdate:(GLKViewController *)controller
{
    //static float transY = 0.0f;
    //float y = sinf(transY)/2.0f;
    //transY += 0.175f;
    GLKMatrix4 modelview = GLKMatrix4MakeTranslation(0, 0, -5.f);
    effect.transform.modelviewMatrix = modelview;
    //GLfloat ratio = self.view.bounds.size.width/self.view.bounds.size.height;
    GLKMatrix4 projection = GLKMatrix4MakeOrtho(0, 768, 1024, 0, 0.1f, 20.0f);
    effect.transform.projectionMatrix = projection;
    _isOpenGLViewReady = YES;
}
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect
{
    if(_model.updateView && _isOpenGLViewReady)
    {
        [effect prepareToDraw];
        int pixelSize = _model.pixelSize;
        //NSLog(@"UPDATING: %d, %d", _model.rows, _model.columns);
        for(int i = 0; i < _model.rows; i++)
        {
            for(int ii = 0; ii < _model.columns; ii++)
            {
                ColorModel *color = [_model getColorAtRow:i andColumn:ii];
                CGRect rect = CGRectMake(ii * pixelSize, i*pixelSize, pixelSize, pixelSize);
                //[self drawRectWithRect:rect withColor:c];
                GLubyte squareColors[] = {,,, 255,,,, 255,,,, 255,,,, 255
                };
                // NSLog(@"Drawing color with red: %d",;
                int xVal = rect.origin.x;
                int yVal = rect.origin.y;
                int width = rect.size.width;
                int height = rect.size.height;
                GLfloat squareVertices[] = {
                    xVal, yVal, 1,
                    xVal + width, yVal, 1,
                    xVal, yVal + height, 1,
                    xVal + width, yVal + height, 1
                };
                glVertexAttribPointer(GLKVertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 0, squareVertices);
                glVertexAttribPointer(GLKVertexAttribColor, 4, GL_UNSIGNED_BYTE, GL_TRUE, 0, squareColors);
                // One draw call per square: this is the per-square overhead
                glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
            }
        }
        _model.updateView = YES;
    }
}
First, do you really need to draw 200,000 squares? Your viewport only has 786,000 pixels total. You might be able to reduce the number of drawn objects without significantly impacting the overall quality of your scene.
That said, if these are smaller squares, you could draw them as points with a pixel size large enough to cover your square's area. That would require setting gl_PointSize in your vertex shader to the appropriate pixel width. You could then generate your coordinates and send them all to be drawn at once as GL_POINTS. That should remove the overhead of the extra geometry of the triangles and the individual draw calls you are using here.
Even if you don't use points, it's still a good idea to calculate all of the triangle geometry you need first, then send all that in a single draw call. This will significantly reduce your OpenGL ES API call overhead.
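To illustrate the "calculate all the geometry first" step, here is a minimal sketch of building one interleaved vertex array for the whole grid. JavaScript is used only to show the data layout; `buildGridVertices` and the `getColor` callback are hypothetical names, and the layout (x, y, r, g, b, a per vertex, two triangles per square) is just one reasonable choice.

```javascript
// Build one vertex array for a rows x columns grid of squares.
// Each square becomes two triangles (6 vertices); each vertex is x, y, r, g, b, a.
function buildGridVertices(rows, columns, pixelSize, getColor) {
  const floatsPerVertex = 6;
  const data = new Float32Array(rows * columns * 6 * floatsPerVertex);
  let offset = 0;
  for (let row = 0; row < rows; ++row) {
    for (let col = 0; col < columns; ++col) {
      const x0 = col * pixelSize, y0 = row * pixelSize;
      const x1 = x0 + pixelSize, y1 = y0 + pixelSize;
      const [r, g, b, a] = getColor(row, col);
      // Two triangles covering the square.
      const corners = [[x0, y0], [x1, y0], [x0, y1], [x0, y1], [x1, y0], [x1, y1]];
      for (const [x, y] of corners) {
        data.set([x, y, r, g, b, a], offset);
        offset += floatsPerVertex;
      }
    }
  }
  return data;
}
```

The resulting array would be uploaded once and drawn with a single `glDrawArrays(GL_TRIANGLES, 0, rows * columns * 6)` call, replacing the one-call-per-square loop. It only needs rebuilding when the button press changes the colors.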
One other thing you could look into would be to use vertex buffer objects to store this geometry. If the geometry is static, you can avoid sending it on each drawn frame, or only update a part of it that has changed. Even if you just change out the data each frame, I believe using a VBO for dynamic geometry has performance advantages on the modern iOS devices.
Can you not try to optimize it somehow? I'm not terribly familiar with graphics, but I'd imagine that if you are drawing 200,000 squares, it's unlikely that all of them are actually visible. Could you add some sort of isVisible flag to your mySquare class that records whether the square you want to draw is actually visible? Then the obvious next step is to modify your draw function so that if the square isn't visible, you don't draw it.
Or are you asking for someone to improve the code you currently have? If your performance is as bad as you say, I don't think making small changes to the above code will solve your problem. You'll have to rethink how you're doing your drawing.
It looks like what your code is actually trying to do is take a _model.rows × _model.columns 2D image and draw it upscaled by _model.pixelSize. If -[ColorModel getColorAtRow:andColumn:] is retrieving 3 bytes at a time from an array of color values, then you may want to consider uploading that array of color values into an OpenGL texture as GL_RGB/GL_UNSIGNED_BYTE data and letting the GPU scale up all of your pixels at once.
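As a rough sketch of the texture approach: the colors are packed into one tightly packed RGB byte array, and the GPU then does the equivalent of the nearest-neighbor lookup below for every screen pixel. `packRGB`, `sampleNearest`, and the `getColor` callback are hypothetical names, and JavaScript is used only to show the data layout and the lookup.

```javascript
// Pack a rows x columns grid of colors into tightly packed RGB bytes,
// the layout used for GL_RGB / GL_UNSIGNED_BYTE texture data.
function packRGB(rows, columns, getColor) {
  const pixels = new Uint8Array(rows * columns * 3);
  for (let row = 0; row < rows; ++row) {
    for (let col = 0; col < columns; ++col) {
      const [r, g, b] = getColor(row, col);
      pixels.set([r, g, b], (row * columns + col) * 3);
    }
  }
  return pixels;
}

// What nearest-neighbor magnification amounts to: each screen pixel looks up
// the texel it falls inside, so every texel becomes a pixelSize-wide square.
function sampleNearest(pixels, columns, pixelSize, screenX, screenY) {
  const col = Math.floor(screenX / pixelSize);
  const row = Math.floor(screenY / pixelSize);
  const i = (row * columns + col) * 3;
  return [pixels[i], pixels[i + 1], pixels[i + 2]];
}
```

Two details to watch when uploading such data: rows of RGB bytes are only 4-byte aligned when `columns * 3` is a multiple of 4, so `GL_UNPACK_ALIGNMENT` usually needs to be set to 1, and the magnification filter should be `GL_NEAREST` to keep hard-edged squares.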
Alternatively, if scaling up the contents of your ColorModel is the only reason that you're using OpenGL ES and GLKit, you might be better off wrapping your color values in a CGImage and letting UIKit and Core Animation do the drawing for you. How often do the color values in the ColorModel get updated?
