I am new to Objective-C and OpenGL, so please be patient.
I'm building an app that is mainly based on a full-screen 2D pixelbuffer that is filled and animated using mathematical formulas (similar to fractals), mostly using sin, cos, atan etc.
I have already optimized sin and cos by using lookup tables, which gave quite an FPS boost. However, while the framerate is fine in the Simulator on a Mac mini (around 30 fps), I get a totally ridiculous 5 fps on an actual device (non-Retina iPad mini).
As I see no further ways to optimize the pixel loops, would it be possible to implement the effects using, say, an OpenGL shader, and then just draw a fullscreen quad with a texture on it?
As I said, the effects are really simple and just iterate over all pixels in a nested x/y loop, using basic math and trig functions. The way I blit to the screen is already as fast as it gets without using OpenGL, and gives like a million FPS if I leave out the actual math.
Thanks!
If you implement this as an OpenGL shader you will get a ridiculously massive increase in performance. The shader runs on the graphics chip, which is designed to be massively parallel and is optimized for exactly this kind of math.
You don't make a texture so much as define a shader for the surface. Your shader code would be invoked for every rendered pixel on that surface.
I would start by trying to see if you can hack a shader here: http://glsl.heroku.com/
Once you have something working, you can research how to get an OpenGL context working with your shader on iOS, and you shouldn't have to change the actual shader much to get it working.
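To give a feel for it, here is a minimal sketch of the kind of fragment shader you could prototype there. It assumes the sandbox's usual time and resolution uniforms, and the particular formula is just a stand-in for your own per-pixel math:

    #ifdef GL_ES
    precision mediump float;
    #endif

    uniform float time;
    uniform vec2 resolution;

    void main(void) {
        // Normalize pixel coordinates to roughly [-1, 1], keeping the aspect ratio.
        vec2 p = (2.0 * gl_FragCoord.xy - resolution) / resolution.y;

        // Plasma-style pattern built from the same kind of per-pixel trig
        // described in the question.
        float a = atan(p.y, p.x);
        float r = length(p);
        float v = sin(10.0 * r - 2.0 * time)
                + cos(6.0 * a + time)
                + sin(8.0 * (p.x + p.y) + time);

        // Map the result to a colour.
        vec3 col = 0.5 + 0.5 * vec3(sin(v), sin(v + 2.094), sin(v + 4.188));
        gl_FragColor = vec4(col, 1.0);
    }

The GPU evaluates this for every pixel in parallel, which is why the same math that chokes a per-pixel CPU loop can run at full framerate as a shader.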
I was wondering if it is worth it to use shaders to draw a 2D texture in XNA. I am asking because with OpenGL it is much faster if you use GLSL.
Everything on a modern GPU is drawn using a shader.
For the old immediate-style rendering (i.e. glBegin/glVertex), that will get converted to something approximating buffers and shaders somewhere in the driver. This is why using GLSL is "faster": because it's closer to the metal, you're not going through a conversion layer.
For a modern API, like XNA, everything is already built around "buffers and shaders".
In XNA, SpriteBatch provides its own shader. The source code for the shader is available here. The shader itself is very simple: The vertex shader is a single matrix multiplication to transform the vertices to the correct raster locations. The pixel shader simply samples from your sprite's texture.
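For illustration only, this is roughly what such a sprite shader amounts to when written as GLSL (the real SpriteBatch shader is HLSL, and the names below are made up):

    // Vertex shader: one matrix multiply to place the sprite's vertices.
    attribute vec4 a_position;
    attribute vec2 a_texCoord;
    attribute vec4 a_color;

    uniform mat4 u_transform; // the transform matrix SpriteBatch uses

    varying vec2 v_texCoord;
    varying vec4 v_color;

    void main() {
        gl_Position = u_transform * a_position;
        v_texCoord = a_texCoord;
        v_color = a_color;
    }

    // Pixel (fragment) shader: sample the sprite's texture and tint it.
    precision mediump float;
    uniform sampler2D u_texture;
    varying vec2 v_texCoord;
    varying vec4 v_color;

    void main() {
        gl_FragColor = texture2D(u_texture, v_texCoord) * v_color;
    }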
You can't really do much to make SpriteBatch's shader faster - there's almost nothing to it. There are some things you can do to make the buffering behaviour faster in specific circumstances (for example: if your sprites don't change between frames) - but this is kind of advanced. If you're experiencing performance issues with SpriteBatch, be sure you're using it properly in the first place. For what it does, SpriteBatch is already extremely well optimised.
For more info on optimisation, see this answer.
If you want to pass a custom shader into SpriteBatch (eg: for a special effect) use this overload of Begin and pass in an appropriate Effect.
I was playing a bit with the awesome GPUImage framework and was able to reproduce some "convex"-like effects with fragment shaders.
However, I'm wondering if it's possible to get some more complex plane curving in 3D using GPUImage or any other OpenGL render-to-texture approach.
The effect I'm trying to reach looks like this one - is there any chance I can get something like it using the depth buffer and a vertex shader, or do I just need to develop a more sophisticated fragment shader that emulates the Z coordinate?
This is what I get now using only a fragment shader and some periodic bulging.
Thanks
Another thought: maybe it's possible to prototype a curved surface in some 3D modeling app and somehow map the texture onto it?
I'm developing a 2D game on iOS, but I'm finding it difficult getting drawing to run fast (60 FPS on Retina display).
I first used UIKit for drawing, which is of course not suitable for a game; I couldn't even draw a couple of sprites without slowdown.
Then I moved on to OpenGL, because I read it's the closest I can get to the GPU (which I take to mean it's the fastest possible). I was using glDrawArrays(). When I ran it in the Simulator, the FPS dropped once I got over 200 triangles. People said that was because neither the Simulator nor the computer it runs on is optimized for iOS OpenGL. Then I tested it on a real device and, to my surprise, the performance difference was really small. It still couldn't run that few triangles smoothly - and I know other games on iOS use a lot more polygons, plus shaders, 3D graphics, etc.
When I ran it through Instruments to check OpenGL performance, it told me I could speed things up by using VBOs. So I rewrote my code to use VBOs, updating all vertices each frame. Performance increased very little, and I still can't get past 200 triangles at a consistent 60 FPS. And that is 2D drawing alone, without context changes/transformations. I also haven't written the game itself yet - there are no objects performing CPU-intensive tasks.
Everyone I ask says OpenGL gives top performance. What could I possibly be doing wrong? I'm assuming OpenGL can handle LOTS of polygons that are updated each frame - is that right? What approach do other games use to run fine, like Infinity Blade, which is 3D, or even Angry Birds, which has lots of constantly updating sprites? What is recommended when making a game?
OpenGL is definitely going to be your fastest option. Even on the oldest iOS devices you can run about 20,000 polygons at 30+ fps.
Sounds like you must be doing something wrong or extra. It is impossible to guess what that might be without seeing your source code.
Generally speaking though, you want to make sure you create your VBOs and do all your loading outside of your drawing loop.
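As a rough sketch of what that separation looks like in plain C against the OpenGL ES 2.0 API (the function names, the 2-component position layout and the usage hints are assumptions, not code from the question):

    #include <OpenGLES/ES2/gl.h>   // iOS header; adjust for your platform

    static GLuint vbo;

    // Called once at load time, outside the render loop.
    void loadGeometry(const GLfloat *vertices, GLsizeiptr sizeInBytes)
    {
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        // GL_DYNAMIC_DRAW because the vertices are updated each frame;
        // use GL_STATIC_DRAW if the data never changes.
        glBufferData(GL_ARRAY_BUFFER, sizeInBytes, vertices, GL_DYNAMIC_DRAW);
    }

    // Called every frame.
    void drawGeometry(GLint positionAttrib, GLsizei vertexCount,
                      const GLfloat *updatedVertices, GLsizeiptr sizeInBytes)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        // If the vertices really must change every frame, update the existing
        // buffer in place instead of recreating it.
        glBufferSubData(GL_ARRAY_BUFFER, 0, sizeInBytes, updatedVertices);

        glEnableVertexAttribArray(positionAttrib);
        glVertexAttribPointer(positionAttrib, 2, GL_FLOAT, GL_FALSE, 0, 0);
        glDrawArrays(GL_TRIANGLES, 0, vertexCount);
    }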
Is it possible to optimise OpenGL ES 2.0 drawing by using dirty rectangles?
In my case, I have a 2D app that needs to draw a background texture (full screen on iPad), followed by the contents of several VBOs on each frame. The problem is that these VBOs can potentially contain millions of vertices, taking anywhere up to a couple of seconds to draw everything to the display. However, only a small fraction of the display would actually be updated each frame.
Is this optimisation possible, and how (or perhaps more appropriately, where) would this be implemented? Would some kind of clipping plane need to be passed into the vertex shader?
If you set an area with glViewport, clipping is adjusted accordingly. However, this happens after the vertex shader stage, just before rasterization. Since the GL cannot know the result of your own vertex program, it cannot discard any vertices before running that program; only after that does it clip. How efficiently it does so depends on the actual GPU.
Thus, for full performance, you have to split your objects into smaller (e.g. rectangularly bounded) tiles and test them against the field of view yourself.
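A sketch of what that CPU-side test could look like in C, with an assumed tile structure (the names and layout are illustrative, not taken from the question's code):

    #include <stdbool.h>
    #include <OpenGLES/ES2/gl.h>

    typedef struct {
        float minX, minY, maxX, maxY;  // bounding rectangle of this tile
        GLuint vbo;                    // this tile's share of the geometry
        GLsizei vertexCount;
    } Tile;

    static bool rectOverlapsView(const Tile *t, float minX, float minY,
                                 float maxX, float maxY)
    {
        return t->minX < maxX && t->maxX > minX &&
               t->minY < maxY && t->maxY > minY;
    }

    // Draw only the tiles whose bounds intersect the visible/dirty rectangle,
    // skipping the bulk of the geometry entirely.
    void drawVisibleTiles(const Tile *tiles, int tileCount, GLint positionAttrib,
                          float viewMinX, float viewMinY,
                          float viewMaxX, float viewMaxY)
    {
        for (int i = 0; i < tileCount; ++i) {
            if (!rectOverlapsView(&tiles[i], viewMinX, viewMinY, viewMaxX, viewMaxY))
                continue;

            glBindBuffer(GL_ARRAY_BUFFER, tiles[i].vbo);
            glEnableVertexAttribArray(positionAttrib);
            glVertexAttribPointer(positionAttrib, 2, GL_FLOAT, GL_FALSE, 0, 0);
            glDrawArrays(GL_TRIANGLES, 0, tiles[i].vertexCount);
        }
    }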
As a learning experience, I'm writing an immediate-mode managed DirectX 9 application.
I'm manually calculating vertex normals across all triangles in a scene to allow smooth Gouraud shading.
This works as expected, but I'm guessing this is not the most efficient approach. Is it possible to get the GPU to do this for me?
You could in theory generate the vertex normals inside the vertex shader. That involves doing the computation every single time you render a mesh with that shader, though, so why not generate them in advance?
If you mean you want to generate them in advance of rendering, but use the GPU instead of the CPU, I would say that it's not worth the bother of speeding up something you are only going to do once. Besides, I'm not sure if DX9 has a way to get computed vertex information back from a shader (DX10 does).
All in all, the best thing to do in most cases is the traditional approach: compute the vertex normals in the program that saves the data files containing the meshes - do it as a pre-computation step. Usually you already have them if the mesh came from a 3D package like Max or Maya, because there is artistic information in the normals; unless you know the whole mesh is supposed to be perfectly smooth (or faceted), they're not computable in the general case.
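For reference, that pre-computation step is a short loop: accumulate each triangle's unnormalized face normal into its three vertices, then normalize. Here is a sketch in plain C (the Vec3 type and mesh layout are assumptions for illustration; the same loop translates directly to managed code):

    #include <math.h>

    typedef struct { float x, y, z; } Vec3;

    static Vec3 sub(Vec3 a, Vec3 b)   { return (Vec3){ a.x - b.x, a.y - b.y, a.z - b.z }; }
    static Vec3 cross(Vec3 a, Vec3 b) {
        return (Vec3){ a.y * b.z - a.z * b.y,
                       a.z * b.x - a.x * b.z,
                       a.x * b.y - a.y * b.x };
    }

    // positions: vertexCount entries; indices: 3 * triangleCount entries;
    // normals: output array of vertexCount entries.
    void computeSmoothNormals(const Vec3 *positions, int vertexCount,
                              const unsigned *indices, int triangleCount,
                              Vec3 *normals)
    {
        for (int v = 0; v < vertexCount; ++v)
            normals[v] = (Vec3){ 0.0f, 0.0f, 0.0f };

        for (int t = 0; t < triangleCount; ++t) {
            unsigned i0 = indices[3 * t], i1 = indices[3 * t + 1], i2 = indices[3 * t + 2];
            // Unnormalized face normal; its length weights the contribution by triangle area.
            Vec3 n = cross(sub(positions[i1], positions[i0]),
                           sub(positions[i2], positions[i0]));
            normals[i0].x += n.x; normals[i0].y += n.y; normals[i0].z += n.z;
            normals[i1].x += n.x; normals[i1].y += n.y; normals[i1].z += n.z;
            normals[i2].x += n.x; normals[i2].y += n.y; normals[i2].z += n.z;
        }

        for (int v = 0; v < vertexCount; ++v) {
            float len = sqrtf(normals[v].x * normals[v].x +
                              normals[v].y * normals[v].y +
                              normals[v].z * normals[v].z);
            if (len > 0.0f) {
                normals[v].x /= len; normals[v].y /= len; normals[v].z /= len;
            }
        }
    }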