I'm working with Brad Larson's GPUImage, which I've found to be really amazing. For flexibility and ease of writing, I've created a custom filter that has 3 uniform inputs. I based it on the GPUImageBrightnessFilter.h and .m. One uniform is simply a "parameter" float, one is a "center" CGPoint to feed in touch coordinates if I need them, and one will hopefully be a time float which is the amount of time that has elapsed since the shader was started. I'm using initWithFragmentShaderFromFile to use my own shader files.
My "parameter" and "center" uniforms work. It's the time uniform that doesn't work because I don't understand how to create a uniform that is constantly changing (as time does).
The goal of my time uniform seems to be similar to SpriteKit's u_time. It returns the number of seconds the current shader has been running, which you can feed into your custom shader to animate it. One very simple example use might be to cycle the hue of an image as time passes. I'm hoping to build upon that with more complexity as my knowledge of GLSL grows.
I'm not sure what code to post, as my misunderstanding could be in many places and it may be that I'm trying to do something that's not possible. This post about setting a start time and subtracting it from CACurrentMediaTime() seems like a good place to start:
Getting the time elapsed (Objective-c)
The problem is I'm not sure how to continually update a uniform to pass to my shader. Do I do it in my custom filter's setter for time? Don't custom setters only get executed once? Do I do it in my init override? Doesn't that only get executed once? Do I write a loop in my controller that continually calls the setter? That seems like an expensive and messy solution.
I'm not asking for code, just suggestions to steer me in the right direction or un-steer me from whatever wrong direction I've gone in my thinking.
Asking the question helped me get to a solution and learn about NSTimer. I incorporated a method into my customized GPUImageFilter subclass that creates a repeating NSTimer; the timer fires a selector that increments my 'time' variable and passes it into the uniform array.
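In case it helps anyone, here is a minimal sketch of that approach. The class name, the startAnimation method, the 60 Hz interval and the "time" uniform name are placeholders I've assumed; setFloat:forUniformName: is the existing GPUImageFilter method for pushing a float uniform to the shader.

#import <QuartzCore/QuartzCore.h>
#import "GPUImageFilter.h"

@interface MyTimedFilter : GPUImageFilter
- (void)startAnimation;
@end

@implementation MyTimedFilter
{
    NSTimer *_timer;
    CFTimeInterval _startTime;
}

- (void)startAnimation
{
    _startTime = CACurrentMediaTime();
    // Fire roughly 60 times per second and push the elapsed time into the shader.
    _timer = [NSTimer scheduledTimerWithTimeInterval:1.0 / 60.0
                                              target:self
                                            selector:@selector(updateTimeUniform)
                                            userInfo:nil
                                             repeats:YES];
}

- (void)updateTimeUniform
{
    GLfloat elapsed = (GLfloat)(CACurrentMediaTime() - _startTime);
    // GPUImageFilter's setFloat:forUniformName: queues the glUniform1f call
    // on the GPUImage OpenGL ES context for us.
    [self setFloat:elapsed forUniformName:@"time"];
}
@end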
Related
Using DirectX 11 & Effects 11, I'm trying to understand how to efficiently draw two objects with different shaders. First I set all the states and set up the constant buffers once and for all. While iterating through all of the first object's meshes, all of the previously set constant buffers stay available, which is fine, as you can see here.
And then I apply another pass (Pass.Apply() from Effects 11) to draw the second object. At this point, all my constant buffers are destroyed, as shown there.
So now I'm starting to wonder whether constant buffers can be set up once and for all at app startup and then used/shared at any time, across any shader, or whether they belong to the active shader only.
Thanks!
If I remember correctly, if you execute a different effect you will have to re-associate the constant buffer with the stage (this can also depend on the driver). The only time you get to reuse the same constant buffer binding is when you are not changing the shader state.
To be safe, assume a different pass basically binds a new set of shaders (if they differ). Best practice is to bind your resource (buffer) each time you apply a different effect pass.
I personally have moved away from Effects as it is deprecated. I've also found that explicitly understanding what I am binding to the pipeline has helped my understanding of how constant buffers are used.
The buffer shouldn't be destroyed; it should just be unbound on the second call - otherwise you have something more nefarious going on.
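As a rough sketch of that practice (not the asker's actual code; the helper function, slot 0, and the shared buffer are placeholders I've assumed), the shared constant buffer is simply re-bound right after each Apply():

#include <d3d11.h>
#include "d3dx11effect.h" // Effects 11

// After switching to a different pass, re-bind the shared constant buffer,
// since Apply() binds that pass's shaders and the previous bindings may be gone.
void DrawWithPass(ID3D11DeviceContext* context,
                  ID3DX11EffectPass* pass,
                  ID3D11Buffer* sharedConstantBuffer,
                  UINT indexCount)
{
    pass->Apply(0, context);
    context->VSSetConstantBuffers(0, 1, &sharedConstantBuffer);
    context->PSSetConstantBuffers(0, 1, &sharedConstantBuffer);
    context->DrawIndexed(indexCount, 0, 0);
}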
I'm trying to use the stencil buffer to render cross-sections of 3D models with WebGL. I am also using a library, three.js, which gives me a scene graph and various other abstractions.
Three.js exposes callbacks that are called before and after gl.drawElements, which I used to make the stencil calls. If I leave the render order to be managed by three, I end up in this situation:
Which looks pretty redundant:
gl.disable(gl.STENCIL_TEST) //disable followed immediately by enable
gl.enable(gl.STENCIL_TEST)
gl.stencilFunc(...) //same call for multiple draw calls
gl.stencilOp(...)
It looks like it would require some acrobatics to batch these with three.js, and I'm wondering if it's even worth it. I keep reading about the overhead of WebGL (draw?) calls, but I don't really understand their weight. It's pretty obvious what happens when a bunch of geometry is merged and drawn with a single call, but I don't understand what happens with other calls.
I'm not even entirely sure how to test it; would it be enough to toggle the stencil on/off multiple times between these draw calls until there is a frame drop?
I would like to enable the stencil only once before issuing multiple draw calls, and disable it after the final one. I would like to change the stencilOp and stencilFunc somewhere inside of that group of draw calls, but I'm not sure how much there is to be gained from this.
I'm not asking how to do this with three.js.
There are a few relatively straightforward ways of doing this with three. Geometries that need to be batched can be put in their own scene and stencil state can be set before and after rendering it. Another one would be to manually sort everything and make the stencil calls only before the first one.
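For illustration only, the first option might look roughly like this (a sketch; stencilScene, mainScene and the particular stencilFunc/stencilOp values are placeholders, and three's own state caching may interact with direct gl calls):

var gl = renderer.getContext();

renderer.autoClear = false;
renderer.clear();

gl.enable(gl.STENCIL_TEST);                 // set the stencil state once...
gl.stencilFunc(gl.ALWAYS, 1, 0xff);
gl.stencilOp(gl.KEEP, gl.KEEP, gl.REPLACE);
renderer.render(stencilScene, camera);      // ...for every draw call in this scene

gl.disable(gl.STENCIL_TEST);                // ...and turn it off once afterwards
renderer.render(mainScene, camera);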
My question is whether, and why, this queue of commands should look any different.
What is the difference between having 5 and 255 calls to gl.enable(gl.STENCIL_TEST) and gl.stencilOp()? Is it something that can be ignored or not?
edit
I've reduced the number of calls and achieved the desired effect when rendering opaque objects. However, transparency has now become a bit more involved. I am trying to understand what the difference between the 4k and 5k "calls" in Screenshot #2 means. Is it something that I should be concerned with at all, or only concerned with selectively?
I'm trying to use multiple GLSL fragment shaders with OpenGL ES on iOS 7 and upwards. The shaders themselves run fine after the first call to glDrawArrays. Nevertheless, the very first call to glDrawArrays after the shaders and their program have been compiled and linked takes ages to complete. Afterwards, some pipeline or other seems to have been set up and everything runs smoothly. Any ideas what the cause of this issue is and how to prevent it?
The most likely cause is that your shaders may not be fully compiled until you use them the first time. They might have been translated to some kind of intermediate form when you call glCompileShader(), which would be enough for the driver to provide a compile status and to act as if the shaders had been compiled. But the full compilation and optimization could well be deferred until the first draw call that uses the shader program.
A commonly used technique for games is to render a few frames without actually displaying them while some kind of intro screen is still shown to the user. This prevents the user from seeing stuttering that could otherwise result from all kinds of possible deferred initialization or data loading during the first few frames.
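A minimal sketch of that warm-up idea (assuming the program, vertex data and an offscreen framebuffer named offscreenFBO are already set up; none of these names come from the question):

glBindFramebuffer(GL_FRAMEBUFFER, offscreenFBO); // render somewhere that is never presented
glUseProgram(program);
for (int i = 0; i < 3; i++) {
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);  // the first draw triggers the deferred
}                                                // compilation/optimization work
glFinish();                                      // make sure the driver has actually finished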
You could also look into using binary shaders to reduce slowdowns from shader compilation. See glShaderBinary() in the ES 2.0 documentation.
What actually helped speed up the first draw call was the following (which is fine in my use case, since I'm rendering a video and no depth testing is needed):
glDisable(GL_DEPTH_TEST);
I've been testing this for about an hour and I don't understand what's going on.
In ImageJ, if I say:
i = 3.5;
print(round(i));
I get 4.
However, If I say:
print(65535/(27037-4777)*(26295-4777));
print(round(65535/(27037-4777)*(26295-4777)));
For some reason, I am getting:
63350.5
63350
Shouldn't it be rounding up to 63351?
Taking a look at your comments, the number generated by that calculation is actually 63350.499999..., so when you try to round it, the number gets rounded down and you get 63350. One thing I can suggest is to add a small constant; it may seem like a hack, but it will resolve situations like this. You want to make it small enough that it pushes the fractional part of your number over the 0.5 mark so it rounds up successfully, but doesn't interfere with how round works for other fractional parts.
The Java API has a function called Math.ulp that computes the spacing between a given floating-point number and the next representable value. However, because ImageJ doesn't expose this functionality, consider adding something small like 1e-5. This may seem like a foolish hack, but it will certainly avoid the situation you're experiencing now, and a constant this small should not affect how round works elsewhere.
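In ImageJ macro terms, the workaround looks something like this (with 1e-5 as the suggested small constant):

value = 65535/(27037-4777)*(26295-4777); // evaluates to roughly 63350.49999...
epsilon = 1e-5;
print(round(value + epsilon));           // now prints 63351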
tl;dr: Add a small constant to your number (like 1e-5) then round. This shouldn't affect how round works overall, and it'll simply push those numbers with a fractional component that are hovering at the 0.5 mark to be truly over 0.5.
Good luck!
I was just reading through the DirectX documentation and encountered something interesting on the page for IDirect3DDevice9::BeginScene:
To enable maximal parallelism between the CPU and the graphics accelerator, it is advantageous to call IDirect3DDevice9::EndScene as far ahead of calling present as possible.
I've been accustomed to writing my game loop to handle input and such, then draw. Do I have it backwards? Maybe the game loop should be more like this: (semi-pseudocode, obviously)
while(running) {
    d3ddev->Clear(...);
    d3ddev->BeginScene();
    // draw things
    d3ddev->EndScene();
    // handle input
    // do any other processing
    // play sounds, etc.
    d3ddev->Present(NULL, NULL, NULL, NULL);
}
According to that sentence of the documentation, this loop would "enable maximal parallelism".
Is this commonly done? Are there any downsides to ordering the game loop like this? I see no real problem with it after the first iteration... And I know the best way to measure the actual speed increase of something like this is to benchmark it, but has anyone else already tried this, and can you attest to any actual speed increase?
Since I always felt it was "awkward" to draw before simulating, I tended to push the draws until after the update, but also after the "present" call. E.g.
while True:
    Simulate()
    FlipBuffers()
    Render()
While on the first frame you're flipping nothing (and you need to set things up so that the first flip does indeed flip to a known state), this always struck me as a bit nicer than putting the Render() first, even though the order of operations is the same once you're under way.
The short answer is yes, this is how it's commonly done. Take a look at the following presentation on the game loop in God of War III on the PS3:
http://www.tilander.org/aurora/comp/gdc2009_Tilander_Filippov_SPU.pdf
If you're running a double-buffered game at 30 fps, the input lag will be 1 / 30 ~= 0.033 seconds, which is way too small to be detected by a human (for comparison, any reaction time under 0.1 seconds in the 100 metres is considered a false start).
It's worth noting that on nearly all PC hardware, BeginScene and EndScene do nothing. In fact, the driver buffers up all the draw commands, and when you call Present it may not even have begun drawing. Drivers commonly buffer up several frames' worth of draw commands to smooth out the frame rate; usually the driver schedules its work around the Present call.
This can cause input lag when the frame rate isn't particularly high.
I'd wager that if you did your rendering immediately before the Present you'd notice no difference compared to the loop you give above. Of course, on some odd bits of hardware this may cause issues, so in general you are best off looping as you suggest above.
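For comparison, that "render immediately before present" ordering would just be the same calls reordered (a sketch, not benchmarked):

while(running) {
    // handle input
    // do any other processing
    // play sounds, etc.
    d3ddev->Clear(...);
    d3ddev->BeginScene();
    // draw things
    d3ddev->EndScene();
    d3ddev->Present(NULL, NULL, NULL, NULL);
}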