render a small texture every frame and then scale it up? - ios

Using OpenGL on iOS, is it possible to update a small texture (by setting each pixel individually) and then scale it up to fill the screen (60 frames per second)?

You should be able to update the content of a texture using glTexImage2D.
Untested example:
GLubyte data[32 * 32 * 4]; // 32x32 pixels (power of two), 4 bytes per pixel (RGBA)
for (int i = 0; i < 32 * 32 * 4; i += 4) {
    // write a red pixel (RGBA)
    data[i] = 255;
    data[i+1] = 0;
    data[i+2] = 0;
    data[i+3] = 255;
}
glBindTexture(GL_TEXTURE_2D, my_texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 32, 32, 0, GL_RGBA, GL_UNSIGNED_BYTE, data);
// then simply render a quad with this texture.
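If you want to set pixels one at a time rather than fill the whole buffer in a loop, it may help to wrap the index arithmetic in a small helper. This is only a CPU-side sketch: `set_pixel_rgba`, `TEX_SIZE` and `BYTES_PER_PIXEL` are names made up for this example and are not part of OpenGL. It assumes a tightly packed, row-major RGBA buffer, which is what you would then pass to glTexImage2D.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch: a 32x32 RGBA texture. */
enum { TEX_SIZE = 32, BYTES_PER_PIXEL = 4 };

/* Write one RGBA pixel into a tightly packed, row-major pixel buffer. */
static void set_pixel_rgba(uint8_t *pixels, int width, int height,
                           int x, int y,
                           uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    assert(x >= 0 && x < width && y >= 0 && y < height);
    size_t offset = ((size_t)y * (size_t)width + (size_t)x) * BYTES_PER_PIXEL;
    pixels[offset + 0] = r;
    pixels[offset + 1] = g;
    pixels[offset + 2] = b;
    pixels[offset + 3] = a;
}
```

The buffer needs width * height * 4 bytes, so a 32x32 RGBA texture takes 4096 bytes, not 1024.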

In general the answer is yes, it is possible. But it might depend on what you need to draw.
Since you don't provide more details I will describe the general approach:
Bind a texture to a framebuffer (here is a good explanation with code on how to do that; see the "Example 6.10. Initialize() for Supersampling" code example).
Now draw what you need in the same way as you would on the screen (transformations, modelview matrix, etc.). If you need pixel accuracy (to modify each and every pixel), consider using an orthographic projection. Whether this is possible depends on what you need to draw. All of this drawing is performed on your texture, which achieves the "update the texture" part.
Bind the normal framebuffer that you use to draw on the screen. Draw a rectangle (possibly using an orthographic projection again) that uses the texture from the previous step. You can scale this rectangle to fill the screen.
Whether this approach can achieve 60 fps depends on your target device and the scene you need to render.
Hope that helps


Why is a texture coordinate of 1.0 getting beyond the edge of the texture?

I'm doing a color lookup using a texture to apply an effect to a picture. My lookup is a gradient map using the luminance of the fragment of the first texture, then looking that up on a second texture. The 2nd texture is 256x256 with gradients going horizontally and several different gradients top to bottom. So 32 horizontal stripes each 8 pixels tall. My lookup on the x is the luminance, on the y it's a gradient and I target the center of the stripe to avoid crossover.
My fragment shader looks like this:
lowp vec4 source = texture2D(u_textureSampler, v_fragmentTexCoord0);
float luminance = 1.0 - dot(source.rgb, W);
lowp vec2 texPos;
texPos.x = clamp(luminance, 0.0, 1.0);
// the y value selects which gradient to use by supplying a T value
// this would be more efficient in the vertex shader
texPos.y = clamp(u_value4, 0.0, 1.0);
lowp vec4 newColor1 = texture2D(u_textureSampler2, texPos);
It works well, but I was getting distortion in the whitest parts of the whites and the blackest parts of the blacks. Basically it looked like it grabbed that newColor from a completely different place on texture2, or possibly was just getting nothing for those fragments. I added the clamps in the shader to try to keep it from getting outside the edge of the lookup texture, but that didn't help. Am I not using clamp correctly?
Finally I considered that it might have something to do with my source texture or the way it's loaded. I ended up fixing it by adding:
So.. WHY?
It's a little annoying to have to clamp the textures because it means I have to write an exception in my code when I'm loading lookup tables..
If my texPos.x and .y are clamped to 0-1.. how is it pulling a sample beyond the edge?
Also.. do I have to use the above clamp call when creating the texture or can I call it when I'm about to use the texture?
This is the correct behavior of the texture sampler.
Let me explain. When you use textures with GL_LINEAR sampling, the GPU takes a weighted average of the texel nearest to the sample point and its neighbors (that's why you don't see pixelation as with GL_NEAREST mode; the pixels are blurred instead).
With GL_REPEAT mode, texture coordinates wrap from 0 to 1 and vice versa, so neighbors are fetched across the edge (i.e. at the extreme coordinates a texel is blended with texels from the opposite side of the texture). GL_CLAMP_TO_EDGE prevents this wrapping behavior, so pixels won't blend with pixels from the opposite side of the texture.
Hope my explanation is clear.
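To make the effect concrete, here is a small CPU model of linear filtering on a one-dimensional texture. It is only a sketch of the idea and not how any particular GPU is implemented; `WrapMode`, `wrap_index` and `sample_linear_1d` are names invented for this example. It assumes texel centers sit at (i + 0.5) / n, so a coordinate of exactly 0.0 or 1.0 lies halfway between an edge texel and its neighbor beyond the edge.

```c
#include <assert.h>

typedef enum { WRAP_REPEAT, WRAP_CLAMP_TO_EDGE } WrapMode;

/* Map a texel index that may fall outside [0, n) back into range. */
static int wrap_index(int i, int n, WrapMode mode)
{
    assert(n > 0);
    if (mode == WRAP_REPEAT) {
        int m = i % n;
        return m < 0 ? m + n : m;   /* wraps to the opposite side */
    }
    if (i < 0) return 0;            /* clamp: reuse the edge texel */
    if (i >= n) return n - 1;
    return i;
}

/* Linearly filtered lookup at normalized coordinate u in [0, 1]. */
static float sample_linear_1d(const float *texels, int n, float u, WrapMode mode)
{
    float x = u * (float)n - 0.5f;  /* position in texel-center space */
    int i0 = (int)x;
    if ((float)i0 > x) --i0;        /* floor for negative positions */
    float t = x - (float)i0;        /* blend weight of the right texel */
    float left = texels[wrap_index(i0, n, mode)];
    float right = texels[wrap_index(i0 + 1, n, mode)];
    return left * (1.0f - t) + right * t;
}
```

With a four-texel gradient {0.0, 0.25, 0.75, 1.0}, sampling at u = 1.0 in repeat mode averages the last texel with the first one, which is the "completely different place" the question describes, while clamp-to-edge returns the last texel unchanged. In OpenGL ES the wrap mode is per-texture state set with glTexParameteri (GL_TEXTURE_WRAP_S and GL_TEXTURE_WRAP_T), so it can be set at any time while that texture is bound, not only when the texture is created.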

OpenGL slows down when rendering nearby objects on top of others

I am writing an iOS app using OpenGL ES 2.0 to render a number of objects to the screen.
Currently, those objects are simple shapes (squares, spheres, and cylinders).
When none of the objects overlap each other, the program runs smoothly at 30 fps.
My problem arises when I add objects that appear behind the rest of my models (a background rectangle, for example). When I attempt to draw a background rectangle, I can only draw objects in front of it that take up less than half the screen. Any larger than that and the frame rate drops to between 15 and 20 fps.
As it stands, all of my models, including the background, are drawn with the following code:
- (void)drawSingleModel:(Model *)model
{
    //Create a model transform matrix.
    CC3GLMatrix *modelView = [CC3GLMatrix matrix];
    //Transform model view
    // ...
    //Pass matrix to shader.
    glUniformMatrix4fv(_modelViewUniform, 1, 0, modelView.glMatrix);
    //Bind the correct buffers to openGL.
    glBindBuffer(GL_ARRAY_BUFFER, [model vertexBuffer]);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, [model indexBuffer]);
    glVertexAttribPointer(_positionSlot, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), 0);
    glVertexAttribPointer(_colorSlot, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*) (sizeof(float) * 3));
    //Load vertex texture coordinate attributes into the texture buffer.
    glVertexAttribPointer(_texCoordSlot, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*) (sizeof(float) * 7));
    glBindTexture(GL_TEXTURE_2D, [model textureIndex]);
    glUniform1i(_textureUniform, 0);
    glDrawElements([model drawMode], [model numIndices], GL_UNSIGNED_SHORT, 0);
}
This code is called from my draw method, which is defined as follows:
- (void)draw
{
    //Perform OpenGL rendering here.
    _camera = [CC3GLMatrix matrix];
    //Camera orientation code.
    //Pass the camera matrix to the shader program.
    glUniformMatrix4fv(_projectionUniform, 1, 0, _camera.glMatrix);
    glViewport(0, 0, self.frame.size.width, self.frame.size.height);
    //Render the background.
    [self drawSingleModel:_background];
    //Render the objects.
    for(int x = 0; x < [_models count]; ++x)
        [self drawSingleModel:[_models objectAtIndex:x]];
    //Send the contents of the render buffer to the UI View.
    [_context presentRenderbuffer:GL_RENDERBUFFER];
}
I found that by changing the render order as follows:
for(int x = 0; x < [_models count]; ++x)
[self drawSingleModel:[_models objectAtIndex:x]];
[self drawSingleModel:_background];
my frame rate when rendering on top of the background is 30 fps.
Of course, the slowdown still occurs if any objects in _models must render in front of each other. Additionally, rendering in this order causes translucent and transparent objects to be drawn black.
I'm still somewhat new to OpenGL, so I don't quite know where my problem lies. My assumption is that there is a slowdown in performing depth testing, and I also realize I'm working on a mobile device. But I can't believe that iOS devices are simply too slow to do this. The program is only rendering 5 models, with around 180 triangles each.
Is there something I'm not seeing, or some sort of workaround for this?
Any suggestions or pointers would be greatly appreciated.
You're running into one of the peculiarities of mobile GPUs: those things (except the NVidia Tegra) don't do depth testing for hidden surface removal. Most mobile GPUs, including the one in the iPad, are tile-based rasterizers. The reason for this is to save memory bandwidth, because memory access is actually a power-intensive operation. In the power-constrained environment of a mobile device, reducing the required memory bandwidth gains significant battery lifetime.
Tile-based renderers split the viewport into a number of tiles. Incoming geometry is split across the tiles, and within each tile it is intersected with the geometry already in that tile. Most of the time a tile is covered by only a single primitive. If the incoming primitive happens to be in front of the geometry already present, it replaces it. If there's a cutting intersection, a new edge is added. Only when the number of edges reaches a certain threshold does that single tile switch to depth testing mode.
The prepared tiles are then rasterized only at synchronization points.
Now it's obvious why overlapping objects reduce rendering performance: the more primitives overlap, the more preprocessing has to be done to set up the tiles.
See "transparency sorting"/"alpha sorting".
I suspect the slowness you're seeing is largely due to "overdraw", i.e. framebuffer pixels being drawn more than once. This is worst when you draw the scene back-to-front, since the depth test always passes. While the iPhone 4/4S/5 may have a beefy GPU, last I checked the memory bandwidth was pretty terrible (I don't know how big the GPU cache is).
If you render front-to-back, the problem is that transparent pixels still write to the depth buffer, causing them to occlude polys behind them. You can reduce this slightly (but only slightly) using the alpha test.
The simple solution: Render opaque polys approximately front-to-back and then transparent polys back-to-front. This may mean making two passes through your scene, and ideally you want to sort the transparent polys which isn't that easy to do well.
I think it's also possible (in principle) to render everything front-to-back and perform alpha testing on the destination alpha, but I don't think OpenGL supports this.
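The "opaque front-to-back, then transparent back-to-front" ordering can be produced with a single sort over the draw list. The sketch below is one way to do it on the CPU before issuing draw calls; `DrawItem`, `view_depth` and `sort_for_drawing` are hypothetical names for this example, and it assumes you can compute one representative camera distance per object (larger means farther away), which is only an approximation for large or intersecting objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int   id;              /* hypothetical handle for the model to draw */
    float view_depth;      /* distance from the camera; larger = farther */
    int   is_transparent;  /* nonzero if the model needs blending */
} DrawItem;

/* Opaque items first (near to far), then transparent items (far to near). */
static int compare_draw_order(const void *a, const void *b)
{
    const DrawItem *x = (const DrawItem *)a;
    const DrawItem *y = (const DrawItem *)b;
    int xt = x->is_transparent != 0;
    int yt = y->is_transparent != 0;
    if (xt != yt)
        return xt - yt;                                  /* opaque before transparent */
    if (x->view_depth == y->view_depth)
        return 0;
    if (!xt)
        return x->view_depth < y->view_depth ? -1 : 1;   /* front-to-back */
    return x->view_depth > y->view_depth ? -1 : 1;       /* back-to-front */
}

static void sort_for_drawing(DrawItem *items, size_t count)
{
    assert(items != NULL || count == 0);
    qsort(items, count, sizeof items[0], compare_draw_order);
}
```

After sorting, you would draw the items in array order, enabling blending only once the transparent part of the list begins.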

How to avoid transparency overlap using OpenGL?

I am working on a handwriting application on iOS. I found the sample project "GLPaint" in the iOS documentation, which is implemented with OpenGL ES, and I made some modifications to it.
I track the touch points, calculate the curves between them, and draw particle images along each curve so that it looks like the path the finger passed over.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, brushData); // brushData is from CGImage, it is
// vertexBuffer is generated from the calculated points; it is just a sequence of points where the image needs to be drawn.
glVertexPointer(2, GL_FLOAT, 0, vertexBuffer);
glDrawArrays(GL_POINTS, 0, vertexCount);
glBindRenderbufferOES(GL_RENDERBUFFER_OES, viewRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER_OES];
What I got is a solid line, which looks quite good. But now I want to draw a semi-transparent highlight instead of a solid line, so I replaced the particle image with a 50% transparent one without changing the code.
Result of 50% transparency particle image
Something is wrong with the blending.
What I need
I draw three points using the semi-transparent particle image, and the intersection area should stay at 50% transparency.
What's the solution?
I'm maybe two years late answering this question, but I hope it helps somebody who comes here looking for a solution to this problem, as happened to me.
You are going to need to assign a different z value to each circle. It doesn't matter how big or small the difference is; they just must not be strictly equal.
First, you disable writing in the color buffer glColorMask(false,false,false,false) , and then draw the circles normally. The Z-buffer will be updated as desired, but no circles will be drawn yet.
Then, you enable writing in the color buffer (glColorMask(true,true,true,true) ) and set the depthFunc to LEQUAL ( glDepthFunc(GL_LEQUAL) ). Only the nearest circle pixels will pass the depth test (Setting it to LEQUAL instead of EQUAL deals with some rare but possible floating point approximation errors). Enabling blending and drawing them again will produce the image you wanted, with no transparency overlap.
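The two passes can be modeled on the CPU to see why the overlap disappears. The following is a one-dimensional sketch, not OpenGL code: `Stamp`, `draw_naive` and `draw_with_depth_prepass` are invented names, each stamp covers a run of pixels at its own z, and blending is assumed to be the usual "source alpha, one minus source alpha" kind with an ink value of 1.0 over a background of 0.0. In real OpenGL this also requires a depth buffer and depth testing to be enabled.

```c
#include <assert.h>
#include <float.h>

typedef struct {
    int   first_px;  /* first covered pixel (inclusive) */
    int   last_px;   /* last covered pixel (inclusive) */
    float z;         /* unique depth per stamp; smaller = nearer */
} Stamp;

static float blend_over(float dst, float src, float alpha)
{
    return src * alpha + dst * (1.0f - alpha);
}

/* Plain blending: overlapping stamps darken the shared pixels. */
static void draw_naive(float *color, int n, const Stamp *stamps, int count, float alpha)
{
    for (int s = 0; s < count; ++s) {
        assert(stamps[s].first_px >= 0);
        for (int px = stamps[s].first_px; px <= stamps[s].last_px && px < n; ++px)
            color[px] = blend_over(color[px], 1.0f, alpha);
    }
}

/* Pass 1 writes depth only; pass 2 blends where the stamp is the nearest one. */
static void draw_with_depth_prepass(float *color, float *depth, int n,
                                    const Stamp *stamps, int count, float alpha)
{
    for (int px = 0; px < n; ++px)
        depth[px] = FLT_MAX;                       /* cleared depth buffer */

    for (int s = 0; s < count; ++s) {              /* color writes masked off */
        assert(stamps[s].first_px >= 0);
        for (int px = stamps[s].first_px; px <= stamps[s].last_px && px < n; ++px)
            if (stamps[s].z < depth[px])
                depth[px] = stamps[s].z;
    }

    for (int s = 0; s < count; ++s)                /* color writes on, LEQUAL test */
        for (int px = stamps[s].first_px; px <= stamps[s].last_px && px < n; ++px)
            if (stamps[s].z <= depth[px])
                color[px] = blend_over(color[px], 1.0f, alpha);
}
```

With two overlapping stamps at 50% alpha, the naive version leaves the shared pixels at 0.75 coverage, while the prepass version blends each pixel exactly once and leaves them at 0.5.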
You have to change the blend function. You can play around with it:
Maybe (GL_ONE, GL_ONE). I forget exactly how to handle your case, but the solution is in that function.
Late reply but hopefully useful for others.
Another way to avoid that effect is to grab the color buffer before transparent circles are drawn (ie. do a GrabPass) and then read and blend manually with the opaque buffer in the fragment shader of your circles.

iOS GLSL. Is There A Way To Create An Image Histogram Using a GLSL Shader?

Elsewhere on StackOverflow a question was asked regarding a depthbuffer histogram - Create depth buffer histogram texture with GLSL.
I am writing an iOS image-processing app and am intrigued by this question but unclear on the answer provided. So, is it possible to create an image histogram using the GPU via GLSL?
Yes, there is, although it's a little more challenging on iOS than you'd think. This is a red histogram generated and plotted entirely on the GPU, running against a live video feed:
Tommy's suggestion in the question you link is a great starting point, as is this paper by Scheuermann and Hensley. What's suggested there is to use scattering to build up a histogram for color channels in the image. Scattering is a process where you pass in a grid of points to your vertex shader, and then have that shader read the color at that point. The value of the desired color channel at that point is then written out as the X coordinate (with 0 for the Y and Z coordinates). Your fragment shader then draws out a translucent, 1-pixel-wide point at that coordinate in your target.
That target is a 1-pixel-tall, 256-pixel-wide image, with each width position representing one color bin. By writing out a point with a low alpha channel (or low RGB values) and then using additive blending, you can accumulate a higher value for each bin based on the number of times that specific color value occurs in the image. These histogram pixels can then be read for later processing.
The major problem with doing this in shaders on iOS is that, despite reports to the contrary, Apple clearly states that texture reads in a vertex shader will not work on iOS. I tried this with all of my iOS 5.0 devices, and none of them were able to perform texture reads in a vertex shader (the screen just goes black, with no GL errors being thrown).
To work around this, I found that I could read the raw pixels of my input image (via glReadPixels() or the faster texture caches) and pass those bytes in as vertex data with a GL_UNSIGNED_BYTE type. The following code accomplishes this:
glReadPixels(0, 0, inputTextureSize.width, inputTextureSize.height, GL_RGBA, GL_UNSIGNED_BYTE, vertexSamplingCoordinates);
[self setFilterFBO];
[filterProgram use];
glClearColor(0.0, 0.0, 0.0, 1.0);
glBlendFunc(GL_ONE, GL_ONE);
glVertexAttribPointer(filterPositionAttribute, 4, GL_UNSIGNED_BYTE, 0, (_downsamplingFactor - 1) * 4, vertexSamplingCoordinates);
glDrawArrays(GL_POINTS, 0, inputTextureSize.width * inputTextureSize.height / (CGFloat)_downsamplingFactor);
In the above code, you'll notice that I employ a stride to only sample a fraction of the image pixels. This is because the lowest opacity or greyscale level you can write out is 1/256, meaning that each bin becomes maxed out once more than 255 pixels in that image have that color value. Therefore, I had to reduce the number of pixels processed in order to bring the range of the histogram within this limited window. I'm looking for a way to extend this dynamic range.
The shaders used to do this are as follows, starting with the vertex shader:
attribute vec4 position;
void main()
{
    gl_Position = vec4(-1.0 + (position.x * 0.0078125), 0.0, 0.0, 1.0);
    gl_PointSize = 1.0;
}
and finishing with the fragment shader:
uniform highp float scalingFactor;
void main()
{
    gl_FragColor = vec4(scalingFactor);
}
A working implementation of this can be found in my open source GPUImage framework. Grab and run the FilterShowcase example to see the histogram analysis and plotting for yourself.
There are some performance issues with this implementation, but it was the only way I could think of doing this on-GPU on iOS. I'm open to other suggestions.
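For reference, the scattering and additive accumulation described above can be mimicked on the CPU, which may help when checking the GPU output. This is a rough sketch rather than the GPUImage implementation: `bin_to_clip_x` and `accumulate_red_histogram` are names invented here, the `step` parameter is a simplified "every Nth pixel" stand-in for the vertex stride, and each point is assumed to add exactly one 8-bit level to its bin, saturating at 255 as an 8-bit render target would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clip-space X for an 8-bit channel value, mirroring the vertex shader:
   0.0078125 is 1/128, so 0 maps to -1.0 and 128 maps to 0.0. */
static float bin_to_clip_x(uint8_t value)
{
    return -1.0f + (float)value * 0.0078125f;
}

/* Count red values of every step-th RGBA pixel into 256 saturating 8-bit bins. */
static void accumulate_red_histogram(const uint8_t *rgba, size_t pixel_count,
                                     size_t step, uint8_t bins[256])
{
    assert(step > 0);
    for (size_t i = 0; i < 256; ++i)
        bins[i] = 0;
    for (size_t p = 0; p < pixel_count; p += step) {
        uint8_t red = rgba[p * 4];   /* R is the first byte of each RGBA pixel */
        if (bins[red] < 255)
            bins[red]++;             /* additive blending clamps at full intensity */
    }
}
```

Running it on 300 pixels that all share one red value shows the saturation problem: with a step of 1 the bin tops out at 255, while a step of 2 keeps it at 150 and within range.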
Yes, it is. It's clearly not the best approach, but it is the best one available on iOS, since OpenCL is not supported. You'll lose elegance, and your code will probably not be as straightforward, but almost all OpenCL features can be achieved with shaders.
If it helps, DirectX11 comes with a FFT example for compute shaders. See DX11 August SDK Release Notes.

YUV to RGBA on Apple A4, should I use shaders or NEON?

I'm writing a media player framework for Apple TV, using OpenGL ES and ffmpeg.
Conversion to RGBA is required for rendering with OpenGL ES, and software conversion using swscale is unbearably slow. Using information from the internet I came up with two ideas: using NEON (like here) or using fragment shaders with GL_LUMINANCE and GL_LUMINANCE_ALPHA textures.
As I know almost nothing about OpenGL, the second option still doesn't work :)
Can you give me any pointers how to proceed?
Thank you in advance.
It is most definitely worthwhile learning OpenGL ES2.0 shaders:
You can load-balance between the GPU and CPU (e.g. video decoding of subsequent frames while GPU renders the current frame).
Video frames need to go to the GPU in any case: using YCbCr saves you 25% bus bandwidth if your video has 4:2:0 sampled chrominance.
You get 4:2:0 chrominance up-sampling for free, with the GPU hardware interpolator. (Your shader should be configured to use the same vertex coordinates for both Y and C{b,r} textures, in effect stretching the chrominance texture out over the same area.)
On iOS5 pushing YCbCr textures to the GPU is fast (no data-copy or swizzling) with the texture cache (see the CVOpenGLESTextureCache* API functions). You will save 1-2 data-copies compared to NEON.
I am using these techniques to great effect in my super-fast iPhone camera app, SnappyCam.
You are on the right track for implementation: use a GL_LUMINANCE texture for Y and GL_LUMINANCE_ALPHA if your CbCr is interleaved. Otherwise use three GL_LUMINANCE textures if all of your YCbCr components are noninterleaved.
Creating two textures for 4:2:0 bi-planar YCbCr (where CbCr is interleaved) is straightforward:
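glBindTexture(GL_TEXTURE_2D, texture_y);
glTexImage2D(
    GL_TEXTURE_2D,
    0,                    // Mipmap level
    GL_LUMINANCE,         // Texture format (8bit)
    width,
    height,
    0,                    // No border
    GL_LUMINANCE,         // Source format (8bit)
    GL_UNSIGNED_BYTE,     // Source data format
    NULL);                // Allocate storage only; pixels are uploaded later
glBindTexture(GL_TEXTURE_2D, texture_cbcr);
glTexImage2D(
    GL_TEXTURE_2D,
    0,                    // Mipmap level
    GL_LUMINANCE_ALPHA,   // Texture format (16-bit)
    width / 2,
    height / 2,
    0,                    // No border
    GL_LUMINANCE_ALPHA,   // Source format (16-bits)
    GL_UNSIGNED_BYTE,     // Source data format
    NULL);                // Allocate storage only; pixels are uploaded later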
where you would then use glTexSubImage2D() or the iOS5 texture cache to update these textures.
I'd also recommend using a 2D varying that spans the texture coordinate space (x: [0,1], y: [0,1]) so that you avoid any dependent texture reads in your fragment shader. The end result is super-fast and doesn't load the GPU at all in my experience.
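The arithmetic the fragment shader has to perform per pixel is small. As a reference for checking shader output, here is a CPU sketch of the same conversion for a bi-planar 4:2:0 frame. Several things here are assumptions and not from the answer: the function names are invented, the coefficients are the common full-range BT.601 ones (your video may use video-range or BT.709 instead), and chrominance is fetched with nearest-neighbor lookup where the GPU would interpolate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t clamp_to_byte(float v)
{
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);   /* round to nearest */
}

/* Assumed full-range BT.601 conversion of one pixel. */
static void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr, uint8_t rgb[3])
{
    float yf = (float)y;
    float cbf = (float)cb - 128.0f;   /* chroma is stored with a 128 offset */
    float crf = (float)cr - 128.0f;
    rgb[0] = clamp_to_byte(yf + 1.402f * crf);
    rgb[1] = clamp_to_byte(yf - 0.344136f * cbf - 0.714136f * crf);
    rgb[2] = clamp_to_byte(yf + 1.772f * cbf);
}

/* Convert a bi-planar 4:2:0 frame (Y plane + interleaved CbCr plane) to RGBA. */
static void biplanar_420_to_rgba(const uint8_t *y_plane, const uint8_t *cbcr_plane,
                                 int width, int height, uint8_t *rgba)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            /* One CbCr pair is shared by each 2x2 block of luma samples. */
            size_t c = ((size_t)(row / 2) * (size_t)(width / 2) + (size_t)(col / 2)) * 2;
            size_t p = (size_t)row * (size_t)width + (size_t)col;
            ycbcr_to_rgb(y_plane[p], cbcr_plane[c], cbcr_plane[c + 1], &rgba[p * 4]);
            rgba[p * 4 + 3] = 255;   /* opaque alpha */
        }
    }
}
```

In the shader version, the Y value would come from the GL_LUMINANCE texture and the Cb/Cr pair from the luminance and alpha channels of the GL_LUMINANCE_ALPHA texture, with the same offsets and coefficients applied.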
Converting YUV to RGB using NEON is very slow. Use a shader to offload onto the GPU.
