Performance issue of GLImageProcessing re-implemented with OpenGL ES 2 shaders - iOS

I re-implemented Apple's GLImageProcessing with OpenGL ES 2 shaders. The effects look perfect, but the performance of the Sharpness filter is not as good: it runs at only 20 FPS.
The shader code is simple:
Pass 0 for horizontal blur.
Pass 1 for vertical blur.
Pass 2 to mix the blur texture with the original texture.
Basically, the texture mix in Pass 2 is the cause of the slowdown, since Pass 0 and Pass 1 are only run once and do not contribute to the poor performance.
How can I improve the performance?
Vertex shader:
attribute vec4 a_position;
attribute vec2 a_texCoord;
varying highp vec2 v_texCoord;
varying highp vec2 v_texCoord1;
varying highp vec2 v_texCoord2;
varying highp vec2 v_texCoord1_;
varying highp vec2 v_texCoord2_;
uniform mat4 u_modelViewProjectionMatrix;
uniform lowp int u_pass;
const highp float blurSizeH = 1.0 / 320.0;
const highp float blurSizeV = 1.0 / 480.0;
void main()
{
    v_texCoord = a_texCoord;
    if (u_pass == 0) {
        v_texCoord1 = a_texCoord + vec2(1.3846153846 * blurSizeH, 0.0);
        v_texCoord1_ = a_texCoord - vec2(1.3846153846 * blurSizeH, 0.0);
        v_texCoord2 = a_texCoord + vec2(3.2307692308 * blurSizeH, 0.0);
        v_texCoord2_ = a_texCoord - vec2(3.2307692308 * blurSizeH, 0.0);
    } else if (u_pass == 1) {
        v_texCoord1 = a_texCoord + vec2(0.0, 1.3846153846 * blurSizeV);
        v_texCoord1_ = a_texCoord - vec2(0.0, 1.3846153846 * blurSizeV);
        v_texCoord2 = a_texCoord + vec2(0.0, 3.2307692308 * blurSizeV);
        v_texCoord2_ = a_texCoord - vec2(0.0, 3.2307692308 * blurSizeV);
    }
    gl_Position = u_modelViewProjectionMatrix * a_position;
}
Fragment shader:
varying highp vec2 v_texCoord;
varying highp vec2 v_texCoord1;
varying highp vec2 v_texCoord2;
varying highp vec2 v_texCoord1_;
varying highp vec2 v_texCoord2_;
uniform lowp int u_pass;
uniform sampler2D u_texture;
uniform sampler2D u_degenTexture;
uniform mediump mat4 u_filterMat;
void main()
{
    if (u_pass == 0) {
        gl_FragColor = texture2D(u_texture, v_texCoord) * 0.2270270270;
        gl_FragColor += texture2D(u_texture, v_texCoord1) * 0.3162162162;
        gl_FragColor += texture2D(u_texture, v_texCoord1_) * 0.3162162162;
        gl_FragColor += texture2D(u_texture, v_texCoord2) * 0.0702702703;
        gl_FragColor += texture2D(u_texture, v_texCoord2_) * 0.0702702703;
    } else if (u_pass == 1) {
        gl_FragColor = texture2D(u_degenTexture, v_texCoord) * 0.2270270270;
        gl_FragColor += texture2D(u_degenTexture, v_texCoord1) * 0.3162162162;
        gl_FragColor += texture2D(u_degenTexture, v_texCoord1_) * 0.3162162162;
        gl_FragColor += texture2D(u_degenTexture, v_texCoord2) * 0.0702702703;
        gl_FragColor += texture2D(u_degenTexture, v_texCoord2_) * 0.0702702703;
    } else {
        gl_FragColor = u_filterMat * texture2D(u_texture, v_texCoord) + (mat4(1.0) - u_filterMat) * texture2D(u_degenTexture, v_texCoord);
    }
}

Before you continue with this, may I suggest you take a look at my open source GPUImage project? I have several hand-optimized sharpening effects in there, including the unsharp mask you're attempting here. I also make it reasonably easy to pull in image and video sources.
To your specific question, there are a couple of reasons why your shader is running slower than expected. The first is that you are using branching within your fragment shader. This kills performance on iOS devices, and should be avoided if at all possible. If you really need to have different conditions for different passes, split these apart into separate shader programs and swap the programs out as needed rather than using a uniform for control flow.
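For example, the final blend pass could become its own fragment shader with no branch at all, along the lines of this sketch (uniform names follow your code):
// Blend-pass fragment shader, compiled as a separate program so that
// no per-fragment branching on u_pass is needed.
varying highp vec2 v_texCoord;
uniform sampler2D u_texture;
uniform sampler2D u_degenTexture;
uniform mediump mat4 u_filterMat;
void main()
{
    gl_FragColor = u_filterMat * texture2D(u_texture, v_texCoord)
                 + (mat4(1.0) - u_filterMat) * texture2D(u_degenTexture, v_texCoord);
}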
I'm also not sure that writing to gl_FragColor repeatedly is the fastest thing you can do here. I'd use a lowp or mediump intermediate color variable, add your Gaussian components to that, and then write the final result to gl_FragColor when done.
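For the horizontal blur pass, that might look something like this (a sketch, with the weights taken from the shader above):
// Accumulate into a temporary variable, then write gl_FragColor once.
lowp vec4 blurColor = texture2D(u_texture, v_texCoord) * 0.2270270270;
blurColor += texture2D(u_texture, v_texCoord1) * 0.3162162162;
blurColor += texture2D(u_texture, v_texCoord1_) * 0.3162162162;
blurColor += texture2D(u_texture, v_texCoord2) * 0.0702702703;
blurColor += texture2D(u_texture, v_texCoord2_) * 0.0702702703;
gl_FragColor = blurColor;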
I do see that you've moved your sampling offset calculations to the vertex shader, and then passed those offsets into the fragment shader, which is a good thing that people usually miss. Once you implement the above tweaks (or give my framework a try to see how I handle this), you should get much better results from your filtering.

It turned out to be really simple. The cause of the poor performance was the matrix-vector multiplication:
varying highp vec2 v_texCoord;
uniform sampler2D u_texture;
uniform sampler2D u_degenTexture;
uniform mediump mat4 u_filterMat;
void main()
{
    gl_FragColor = u_filterMat * texture2D(u_texture, v_texCoord) + (mat4(1.0) - u_filterMat) * texture2D(u_degenTexture, v_texCoord);
}
I initially wrote the code with a matrix so that all my filters could share the same color-mixing code. Having learned the lesson, I went back to writing filter-specific code that uses scalar operations as much as possible:
varying highp vec2 v_texCoord;
uniform sampler2D u_texture;
uniform sampler2D u_degenTexture;
uniform lowp float u_filterValue;
void main()
{
    gl_FragColor = u_filterValue * texture2D(u_texture, v_texCoord) + (1.0 - u_filterValue) * texture2D(u_degenTexture, v_texCoord);
}
Now it runs at an awesome 60 FPS!
I never thought it would be such a simple issue, but it was.
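The same scalar blend can also be written with the built-in mix() function, which the compiler may map to fused multiply-add instructions:
// Equivalent blend using the built-in mix():
// mix(a, b, t) computes a * (1.0 - t) + b * t.
gl_FragColor = mix(texture2D(u_degenTexture, v_texCoord),
                   texture2D(u_texture, v_texCoord),
                   u_filterValue);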

Related

WebGL - How to make Specular Light not change size

I am trying to implement specular lighting (that's coming from the front), but the light keeps changing size in an unnatural way. How do I fix this?
I hardcoded viewerPos to test. I'm using a halfway-vector "shortcut" so I have to calculate fewer things, as explained here: https://webglfundamentals.org/webgl/lessons/webgl-3d-lighting-point.html
Video with my lighting implemented: https://streamable.com/j95bz7
// Vertex shader program
const vsSource = `
attribute vec4 aVertexPosition;
attribute vec3 aVertexNormal;
attribute vec2 aTextureCoord;
uniform mat4 uNormalMatrix;
uniform mat4 uModelViewMatrix;
uniform mat4 uProjectionMatrix;
uniform highp vec3 uViewPos;
varying highp vec2 vTextureCoord;
varying highp vec4 vNormal;
varying highp mat4 vModelViewMatrix;
varying highp vec3 vPos;
void main(void) {
    gl_Position = uProjectionMatrix * uModelViewMatrix * aVertexPosition;
    vTextureCoord = aTextureCoord; // Texture
    vModelViewMatrix = uModelViewMatrix;
    vPos = (uModelViewMatrix * aVertexPosition).xyz;
    vNormal = uNormalMatrix * vec4(aVertexNormal, 1.0);
}
`;
// Fragment shader program
const fsSource = `
varying highp vec2 vTextureCoord;
varying highp vec4 vNormal;
varying highp mat4 vModelViewMatrix;
varying highp vec3 vPos;
uniform sampler2D uSampler;
void main(void) {
    // Apply lighting effect
    highp vec4 texelColor = texture2D(uSampler, vTextureCoord);
    // Ambient light
    highp vec3 ambientLight = 0.3 * vec3(1.0, 1.0, 1.0);
    // Diffuse light
    highp vec3 directionalLightColor = vec3(1, 1, 1);
    highp vec3 directionalVector = vec3(0.0, 0.0, 1.0);
    highp float directional = max(dot(vNormal.xyz, normalize(directionalVector)), 0.0);
    // Specular light
    highp vec3 viewerPos = vec3(0, 0, -6); // NOTE: pass this in from the host code instead of hardcoding
    highp vec3 surfaceToLightDirection = (-1.0 * directionalVector);
    highp vec3 surfaceToViewDirection = (vPos - viewerPos);
    highp vec3 halfVector = normalize(surfaceToLightDirection + surfaceToViewDirection);
    highp float specular = max(dot(vNormal.xyz, halfVector), 0.0);
    highp vec3 vLighting = ambientLight; // + (directionalLightColor * directional);
    gl_FragColor = vec4(texelColor.rgb * vLighting + (specular * 0.5), texelColor.a);
}
`;
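For reference, the halfway-vector technique in the linked tutorial expects both direction vectors to be normalized and to point away from the surface; in the shader above, surfaceToViewDirection points the wrong way and is never normalized. A minimal sketch of the standard form, with an illustrative shininess exponent:
// Standard Blinn-Phong half vector: both directions normalized and
// pointing from the surface toward the light / the viewer.
// directionalVector is treated as the surface-to-light direction,
// matching the diffuse term above; the exponent 32.0 is illustrative.
highp vec3 n = normalize(vNormal.xyz);
highp vec3 surfaceToLight = normalize(directionalVector);
highp vec3 surfaceToView = normalize(viewerPos - vPos);
highp vec3 halfVec = normalize(surfaceToLight + surfaceToView);
highp float spec = pow(max(dot(n, halfVec), 0.0), 32.0);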

GLSL Shader Error "Constructor calls may not have precision"

I'm getting the following GLSL shader error:
ERROR: 0:1: '(' : syntax error: Constructor calls may not have precision
I'm seeing this error with Xcode 6 in an iOS 8 app based on the GLPaint demo (it works fine on iOS 7).
I also noticed they no longer use the "STRINGIFY" macro in version 1.13 of the GLPaint demo.
.vsh
static const char* BaseVS = STRINGIFY
(
attribute highp vec4 inVertex;
uniform highp mat4 MVP;
uniform highp float pointSize;
uniform highp vec4 vertexColor;
uniform highp float brushRotation;
varying highp vec4 color;
void main()
{
    gl_Position = MVP * inVertex;
    gl_PointSize = pointSize;
    color = vertexColor;
}
);
.fsh
static const char* BaseFS = STRINGIFY
(
uniform sampler2D texture;
uniform sampler2D normalMap;
uniform highp float brushRotation;
varying highp vec4 color;
varying highp vec3 normal;
varying highp vec3 lightDir;
varying highp vec3 eyeVec;
precision highp float;
void main (void)
{
    highp float vRotation = (brushRotation / 180.0) * 3.14;
    highp float mid = 0.5;
    highp vec2 rotated = vec2(cos(vRotation) * (gl_PointCoord.x - mid) + sin(vRotation) * (gl_PointCoord.y - mid) + mid,
                              cos(vRotation) * (gl_PointCoord.y - mid) - sin(vRotation) * (gl_PointCoord.x - mid) + mid);
    highp vec4 rotatedTexture = texture2D(texture, rotated);
    gl_FragColor = color * rotatedTexture;
}
);
The problem was in a function used for random number generation. I removed the "highp" before the vec2() constructor call. (Sigh) The offending function:
highp float rand(highp vec2 co)
{
    return fract(sin(dot(co.xy, highp vec2(12.9898, 78.233))) * 43758.5453);
}
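With the qualifier removed from the constructor call, the function compiles; precision qualifiers are allowed on declarations, but not on constructor calls:
highp float rand(highp vec2 co)
{
    // Precision qualifiers may appear on declarations (the parameter
    // and return type here), but not on constructor calls like vec2(...).
    return fract(sin(dot(co.xy, vec2(12.9898, 78.233))) * 43758.5453);
}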

Remove black and white effect from halftone filter

In Brad Larson's excellent GPUImage, there is a halftone filter which also turns the picture black and white. I just want the halftone effect without the black and white, and I was wondering if anyone can tell me what I can remove from the following code to fix this? I have been playing around with it, but I have virtually no experience in OpenGL and am not sure what to eliminate.
NSString *const kGPUImageHalftoneFragmentShaderString = SHADER_STRING
(
varying highp vec2 textureCoordinate;
uniform sampler2D inputImageTexture;
uniform highp float fractionalWidthOfPixel;
uniform highp float aspectRatio;
uniform highp float dotScaling;
const highp vec3 W = vec3(0.2125, 0.7154, 0.0721);
void main()
{
    highp vec2 sampleDivisor = vec2(fractionalWidthOfPixel, fractionalWidthOfPixel / aspectRatio);
    highp vec2 samplePos = textureCoordinate - mod(textureCoordinate, sampleDivisor) + 0.5 * sampleDivisor;
    highp vec2 textureCoordinateToUse = vec2(textureCoordinate.x, (textureCoordinate.y * aspectRatio + 0.5 - 0.5 * aspectRatio));
    highp vec2 adjustedSamplePos = vec2(samplePos.x, (samplePos.y * aspectRatio + 0.5 - 0.5 * aspectRatio));
    highp float distanceFromSamplePoint = distance(adjustedSamplePos, textureCoordinateToUse);
    lowp vec3 sampledColor = texture2D(inputImageTexture, samplePos).rgb;
    highp float dotScaling = 1.0 - dot(sampledColor, W);
    lowp float checkForPresenceWithinDot = 1.0 - step(distanceFromSamplePoint, (fractionalWidthOfPixel * 0.5) * dotScaling);
    gl_FragColor = vec4(vec3(checkForPresenceWithinDot), 1.0);
}
);
You can change the last line to
gl_FragColor = vec4(checkForPresenceWithinDot * sampledColor, 1.0);
This will make the effect have color instead of black and white only.

iPad Opengl ES program works fine on simulator but not device

For the device, all of my shaders load fine except one. For that shader program I get a "Fragment program failed to compile with current context state" error, followed by a similar error for the vertex shader, when I call glGetProgramInfoLog(...).
Vertex shader:
#version 100
uniform mat4 Projection;
uniform mat4 Modelview;
uniform mat4 Rotation;
uniform vec3 Translation;
uniform vec4 LightDirection;
uniform vec4 MaterialDiffuse;
uniform float MaterialShininess;
attribute vec3 position;
attribute vec3 normal;
varying vec4 color;
varying float specularCoefficient;
void main() {
    vec3 _normal = normalize(mat3(Modelview[0].xyz, Modelview[1].xyz, Modelview[2].xyz) * normal);
    // There is an easier way to do the above using a typecast, but it is apparently broken
    float NdotL = dot(-_normal, normalize(vec3(LightDirection)));
    if (NdotL < 0.0) {
        NdotL = 0.0;
    }
    color = NdotL * MaterialDiffuse;
    float NdotO = dot(-_normal, vec3(0.0, 0.0, -1.0));
    if (NdotO < 0.0) {
        NdotO = 0.0;
    }
    specularCoefficient = pow(NdotO, MaterialShininess);
    vec3 p = position + Translation;
    gl_Position = Projection * Modelview * vec4(p, 1.0);
}
Fragment shader:
#version 100
precision mediump float;
varying vec4 color;
varying float specularCoefficient;
uniform vec4 MaterialSpecular;
void main() {
    gl_FragColor = vec4((color + specularCoefficient * MaterialSpecular).rgb, 1.0);
}
I am not sure what is going on, especially since I have a similar program that is exactly as above with the addition of texture coordinates. Also, I checked the compile status of each shader when I linked the programs using glGetShaderiv(theShader, GL_COMPILE_STATUS, &result) and they all checked out fine. Any ideas?
Changing the line
gl_FragColor = vec4((color + specularCoefficient*MaterialSpecular).rgb, 1.0);
in the fragment shader to
gl_FragColor = vec4((1.0*color + specularCoefficient*MaterialSpecular).rgb, 1.0);
fixes the problem. I suspect it has something to do with the precision of the varying variable color, since reordering the line to
gl_FragColor = vec4((MaterialSpecular + specularCoefficient*color).rgb, 1.0);
works as well.
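If the precision mismatch really is the culprit, it may also be worth declaring the varyings with an explicit precision that matches in both shaders, instead of relying on the vertex shader's default highp versus the fragment shader's declared mediump. An untested sketch:
// In both the vertex and the fragment shader:
varying mediump vec4 color;
varying mediump float specularCoefficient;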

Motion Blur effect on UIImage on iOS

Is there a way to get a Motion Blur effect on a UIImage?
I tried GPUImage, Filtrr, and iOS Core Image, but all of these have a regular blur, not a motion blur.
I also tried UIImage-DSP, but its motion blur is almost invisible. I need something much stronger.
As I commented on the repository, I just added motion and zoom blurs to GPUImage. These are the GPUImageMotionBlurFilter and GPUImageZoomBlurFilter classes.
For the motion blur, I do a 9-hit Gaussian blur over a single direction. This is achieved using the following vertex and fragment shaders:
Vertex:
attribute vec4 position;
attribute vec4 inputTextureCoordinate;
uniform highp vec2 directionalTexelStep;
varying vec2 textureCoordinate;
varying vec2 oneStepBackTextureCoordinate;
varying vec2 twoStepsBackTextureCoordinate;
varying vec2 threeStepsBackTextureCoordinate;
varying vec2 fourStepsBackTextureCoordinate;
varying vec2 oneStepForwardTextureCoordinate;
varying vec2 twoStepsForwardTextureCoordinate;
varying vec2 threeStepsForwardTextureCoordinate;
varying vec2 fourStepsForwardTextureCoordinate;
void main()
{
    gl_Position = position;
    textureCoordinate = inputTextureCoordinate.xy;
    oneStepBackTextureCoordinate = inputTextureCoordinate.xy - directionalTexelStep;
    twoStepsBackTextureCoordinate = inputTextureCoordinate.xy - 2.0 * directionalTexelStep;
    threeStepsBackTextureCoordinate = inputTextureCoordinate.xy - 3.0 * directionalTexelStep;
    fourStepsBackTextureCoordinate = inputTextureCoordinate.xy - 4.0 * directionalTexelStep;
    oneStepForwardTextureCoordinate = inputTextureCoordinate.xy + directionalTexelStep;
    twoStepsForwardTextureCoordinate = inputTextureCoordinate.xy + 2.0 * directionalTexelStep;
    threeStepsForwardTextureCoordinate = inputTextureCoordinate.xy + 3.0 * directionalTexelStep;
    fourStepsForwardTextureCoordinate = inputTextureCoordinate.xy + 4.0 * directionalTexelStep;
}
Fragment:
precision highp float;
uniform sampler2D inputImageTexture;
varying vec2 textureCoordinate;
varying vec2 oneStepBackTextureCoordinate;
varying vec2 twoStepsBackTextureCoordinate;
varying vec2 threeStepsBackTextureCoordinate;
varying vec2 fourStepsBackTextureCoordinate;
varying vec2 oneStepForwardTextureCoordinate;
varying vec2 twoStepsForwardTextureCoordinate;
varying vec2 threeStepsForwardTextureCoordinate;
varying vec2 fourStepsForwardTextureCoordinate;
void main()
{
    lowp vec4 fragmentColor = texture2D(inputImageTexture, textureCoordinate) * 0.18;
    fragmentColor += texture2D(inputImageTexture, oneStepBackTextureCoordinate) * 0.15;
    fragmentColor += texture2D(inputImageTexture, twoStepsBackTextureCoordinate) * 0.12;
    fragmentColor += texture2D(inputImageTexture, threeStepsBackTextureCoordinate) * 0.09;
    fragmentColor += texture2D(inputImageTexture, fourStepsBackTextureCoordinate) * 0.05;
    fragmentColor += texture2D(inputImageTexture, oneStepForwardTextureCoordinate) * 0.15;
    fragmentColor += texture2D(inputImageTexture, twoStepsForwardTextureCoordinate) * 0.12;
    fragmentColor += texture2D(inputImageTexture, threeStepsForwardTextureCoordinate) * 0.09;
    fragmentColor += texture2D(inputImageTexture, fourStepsForwardTextureCoordinate) * 0.05;
    gl_FragColor = fragmentColor;
}
As an optimization, I calculate the step size between texture samples outside of the fragment shader by using the angle, blur size, and the image dimensions. This is then passed into the vertex shader, so that I can calculate the texture sampling positions there and interpolate across them in the fragment shader. This avoids dependent texture reads on iOS devices.
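The calculation boils down to turning the blur angle and size into a per-sample offset in texture coordinates. Roughly the following, written here in GLSL syntax even though it actually runs on the CPU; the exact scaling GPUImage uses may differ:
// Hypothetical sketch: one step along the blur direction, in texture
// coordinates. angleInRadians, blurSize, imageWidth, and imageHeight
// would be supplied by the host code.
vec2 directionalTexelStep = blurSize * vec2(cos(angleInRadians) / imageWidth,
                                            sin(angleInRadians) / imageHeight);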
The zoom blur is much slower, because I still do these calculations in the fragment shader. No doubt there's a way I can optimize this, but I haven't tried yet. The zoom blur uses a 9-hit Gaussian blur where the direction and per-sample offset distance vary as a function of the placement of the pixel vs. the center of the blur.
It uses the following fragment shader (and a standard passthrough vertex shader):
varying highp vec2 textureCoordinate;
uniform sampler2D inputImageTexture;
uniform highp vec2 blurCenter;
uniform highp float blurSize;
void main()
{
    // TODO: Do a more intelligent scaling based on resolution here
    highp vec2 samplingOffset = 1.0 / 100.0 * (blurCenter - textureCoordinate) * blurSize;
    lowp vec4 fragmentColor = texture2D(inputImageTexture, textureCoordinate) * 0.18;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate + samplingOffset) * 0.15;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate + (2.0 * samplingOffset)) * 0.12;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate + (3.0 * samplingOffset)) * 0.09;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate + (4.0 * samplingOffset)) * 0.05;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate - samplingOffset) * 0.15;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate - (2.0 * samplingOffset)) * 0.12;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate - (3.0 * samplingOffset)) * 0.09;
    fragmentColor += texture2D(inputImageTexture, textureCoordinate - (4.0 * samplingOffset)) * 0.05;
    gl_FragColor = fragmentColor;
}
Note that both of these blurs are hardcoded at 9 samples for performance reasons. This means that at larger blur sizes, you'll start to see artifacts from the limited samples here. For larger blurs, you'll need to run these filters multiple times or extend them to support more Gaussian samples. However, more samples will lead to much slower rendering times because of the limited texture sampling bandwidth on iOS devices.
Core Image has a motion blur filter.
It's called CIMotionBlur: http://developer.apple.com/library/mac/#documentation/GraphicsImaging/Reference/CoreImageFilterReference/Reference/reference.html#//apple_ref/doc/filter/ci/CIMotionBlur
