Centering all of the points in iOS OpenGL ES app

I have an OpenGL view that displays a set of 3D points with some basic shaders:
// Fragment Shader
static const char* PointFS = STRINGIFY
(
void main(void)
{
gl_FragColor = vec4(0.8, 0.8, 0.8, 1.0);
}
);
// Vertex Shader
static const char* PointVS = STRINGIFY
(
uniform mediump mat4 uProjectionMatrix;
attribute mediump vec4 position;
void main( void )
{
gl_Position = uProjectionMatrix * position;
gl_PointSize = 3.0;
}
);
And the MVP matrix is calculated as:
- (void)setMatrices
{
// ModelView Matrix
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Scale(modelViewMatrix, 2, 2, 2);
// Projection Matrix
const GLfloat aspectRatio = (GLfloat)(self.view.bounds.size.width) / (GLfloat)(self.view.bounds.size.height);
const GLfloat fieldView = GLKMathDegreesToRadians(90.0f);
const GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(fieldView, aspectRatio, 0.1f, 10.0f);
glUniformMatrix4fv(self.pointShader.uProjectionMatrix, 1, 0, GLKMatrix4Multiply(projectionMatrix, modelViewMatrix).m);
}
This works fine, but I have a set of 500 points and I see only a few.
How do I scale/translate the MVP matrix to display all of them (they are a dynamic set)? Ideally the "centroid" should be at the origin, and all of the points visible. It should be able to adapt to rotations of the view (gestures are the next step I want to implement).

Judging by how you present this, you may need quite a lot. I think the best approach is a "look at" matrix: the point you are looking at is (0,0,0) as you stated, the camera position should be (0,0,Z) and up is (0,1,0). The only unknown is then the Z component of the camera position.
If you start with Z at, for instance, -.1 and then iterate through all the points, a point p is visible when tan(fieldView*.5f) * (p.z-Z) >= p.y (the half-height of the view at a given distance is the distance times the tangent of half the field of view). So you can compute Z1 = p.z-(p.y/tan(fieldView*.5f)), and if Z1<Z then Z=Z1. This check covers only positive Y; you need the same for negative Y and for +-X. The equations are very similar, though when checking X you also have to take the screen ratio into account.
This procedure should give you the smallest field possible that shows all the points (with the given limitations, such as looking towards (0,0,0)), but it is far from the simplest. You also need to consider whether the equation still works if p.z<-Z.
Another, somewhat easier approach is to generate the smallest cube around the centre that holds all the points: iterate through the points and take the coordinate with the largest absolute value (any of X, Y or Z). Then use it with a frustum instead of a perspective call, so that all rect parameters (top, bottom, left and right) are +-largest. Then you need to compute the translation, which for a 90 degree field is Z = largest (because tan(45 degrees) = 1, the near plane is as far away as it is half-wide). Z is the zNear for the frustum, and you also translate the matrix by -(Z+largest). Again, one of the coordinate pairs in the frustum must be multiplied by the screen ratio.
In any case, watch what your zFar is; having it at only 10.0f might be too short in your case. As long as you do not need the depth buffer, you should not worry about that value being too large.
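As a rough sketch of the same goal, here is a rotation-independent variant. It is a swapped-in technique, not the per-point test described above: it fits a bounding sphere around the centroid and places the camera so the sphere touches the narrower side of the view. The helper name fitCameraToPoints and the [x, y, z] array layout are assumptions made for illustration, and the code is JavaScript rather than the question's Objective-C.

```javascript
// Hypothetical helper: fit a perspective camera to a point cloud.
// points: array of [x, y, z]; fovY: vertical field of view in radians;
// aspect: viewport width / height.
function fitCameraToPoints(points, fovY, aspect) {
  // Centroid of the set (this becomes the new origin).
  const c = [0, 0, 0];
  for (const p of points) {
    c[0] += p[0];
    c[1] += p[1];
    c[2] += p[2];
  }
  c[0] /= points.length;
  c[1] /= points.length;
  c[2] /= points.length;

  // Radius of the sphere around the centroid that holds every point.
  let radius = 1e-6; // small floor so a single point does not give distance 0
  for (const p of points) {
    radius = Math.max(radius, Math.hypot(p[0] - c[0], p[1] - c[1], p[2] - c[2]));
  }

  // Use the narrower of the vertical and horizontal half-angles.
  const halfV = fovY / 2;
  const halfH = Math.atan(Math.tan(halfV) * aspect);
  const half = Math.min(halfV, halfH);

  // The sphere is tangent to the frustum side planes at this distance.
  const distance = radius / Math.sin(half);
  return {
    centroid: c,
    radius,
    distance,
    zNear: distance - radius,
    zFar: distance + radius,
  };
}
```

With these values the model-view matrix would translate the points by the negated centroid and then by (0, 0, -distance). Because a sphere looks the same from every direction, rotating the set around its centroid keeps all points in view.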

Related

Shadow Mapping - Space Transformations are going bad

I am currently studying shadow mapping, and my biggest issue right now is the transformations between spaces. This is my current working theory/steps.
Pass 1:
Get depth of pixel from camera, store in depth buffer
Get depth of pixel from light, store in another buffer
Pass 2:
Use texture coordinate to sample camera's depth buffer at current pixel
Convert that depth to a view space position by multiplying the projection coordinate with invProj matrix. (also do a perspective divide).
Take that view position and multiply by invV (camera's inverse view) to get a world space position
Multiply world space position by light's viewProjection matrix.
Perspective divide that projection-space coordinate, and manipulate into [0..1] to sample from light depth buffer.
Get current depth from light and closest (sampled) depth, if current depth > closest depth, it's in shadow.
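The last three steps of Pass 2 can be sketched on the CPU as follows. This is only an illustration under assumed Direct3D conventions: row vectors as in HLSL mul(v, M), NDC depth already in [0, 1], and a texture v axis that points down. The function names are hypothetical and the bias is a placeholder value.

```javascript
// Row vector times 4x4 matrix (m is an array of 4 rows), like HLSL mul(v, m).
function mulVecMat(v, m) {
  const out = [0, 0, 0, 0];
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      out[col] += v[row] * m[row][col];
    }
  }
  return out;
}

// World position -> shadow map texture coordinate and depth as seen from the light.
function worldToShadowMap(worldPos, lightViewProj) {
  const clip = mulVecMat([worldPos[0], worldPos[1], worldPos[2], 1], lightViewProj);
  // Perspective divide.
  const ndc = [clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3]];
  return {
    u: ndc[0] * 0.5 + 0.5,  // [-1, 1] -> [0, 1]
    v: -ndc[1] * 0.5 + 0.5, // flipped: texture v grows downwards
    depth: ndc[2],          // assumed to be in [0, 1] already
  };
}

// The pixel is shadowed if it lies behind the closest depth stored for the light.
function isInShadow(currentDepth, closestDepth, bias) {
  return currentDepth - bias > closestDepth;
}
```

The comparison only makes sense when closestDepth was rendered from the light's own point of view, so the texture sampled at (u, v) has to be a depth map drawn with the light's view-projection matrix.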
Shader Code
Pass1:
PS_INPUT vs(VS_INPUT input) {
output.pos = mul(input.vPos, mvp);
output.cameraDepth = output.pos.zw;
..
float4 vPosInLight = mul(input.vPos, m);
vPosInLight = mul(vPosInLight, light.viewProj);
output.lightDepth = vPosInLight.zw;
}
PS_OUTPUT ps(PS_INPUT input){
float cameraDepth = input.cameraDepth.x / input.cameraDepth.y;
//Bundle cameraDepth in alpha channel of a normal map.
output.normal = float4(input.normal, cameraDepth);
//4 Lights in total -- although only 1 is active right now. Going to use r/g/b/a for each light depth.
output.lightDepths.r = input.lightDepth.x / input.lightDepth.y;
}
Pass 2 (Screen Quad):
float4 ps(PS_INPUT input) : SV_TARGET{
float4 pixelPosView = depthToViewSpace(input.texCoord);
..
float4 pixelPosWorld = mul(pixelPosView, invV);
float4 pixelPosLight = mul(pixelPosWorld, light.viewProj);
float shadow = shadowCalc(pixelPosLight);
//For testing / visualisation
return float4(shadow,shadow,shadow,1);
}
float4 depthToViewSpace(float2 xy) {
//Get pixel depth from camera by sampling current texcoord.
//Extract the alpha channel as this holds the depth value.
//Then, transform from [0..1] to [-1..1]
float z = (_normal.Sample(_sampler, xy).a) * 2 - 1;
float x = xy.x * 2 - 1;
float y = (1 - xy.y) * 2 - 1;
float4 vProjPos = float4(x, y, z, 1.0f);
float4 vPositionVS = mul(vProjPos, invP);
vPositionVS = float4(vPositionVS.xyz / vPositionVS.w,1);
return vPositionVS;
}
float shadowCalc(float4 pixelPosL) {
//Transform pixelPosLight from [-1..1] to [0..1]
float3 projCoords = (pixelPosL.xyz / pixelPosL.w) * 0.5 + 0.5;
float closestDepth = _lightDepths.Sample(_sampler, projCoords.xy).r;
float currentDepth = projCoords.z;
return currentDepth > closestDepth; //Supposed to have bias, but for now I just want shadows working haha
}
CPP Matrices
// (Position, LookAtPos, UpDir)
auto lightView = XMMatrixLookAtLH(XMLoadFloat4(&pos4), XMVectorSet(0,0,0,1), XMVectorSet(0,1,0,0));
// (FOV, AspectRatio (1000/680), NEAR, FAR)
auto lightProj = XMMatrixPerspectiveFovLH(1.57f , 1.47f, 0.01f, 10.0f);
XMStoreFloat4x4(&_cLightBuffer.light.viewProj, XMMatrixTranspose(XMMatrixMultiply(lightView, lightProj)));
Current Outputs
White signifies that a shadow should be projected there. Black indicates no shadow.
CameraPos (0, 2.5, -2)
CameraLookAt (0, 0, 0)
CameraFOV (1.57)
CameraNear (0.01)
CameraFar (10.0)
LightPos (0, 2.5, -2)
LightLookAt (0, 0, 0)
LightFOV (1.57)
LightNear (0.01)
LightFar (10.0)
If I change the CameraPosition to be (0, 2.5, 2), basically just flipped on the Z axis, this is the result.
Obviously a shadow shouldn't change its projection depending on where the observer is, so I think I'm making a mistake with the invV. But I really don't know for sure. I've debugged the light's projView matrix, and the values seem correct - going from CPU to GPU. It's also entirely possible I've misunderstood some theory along the way because this is quite a tricky technique for me.
Aha! Found my problem. It was a silly mistake, I was calculating the depth of pixels from each light, but storing them in a texture that was based on the view of the camera. The following image should explain my mistake better than I can with words.
For future reference, the solution I decided on was to scrap my idea of storing light depths in texture channels. Instead, I make a new pass for each light and bind a unique depth-stencil texture to render the geometry to. When I want to do light calculations, I bind each of the depth textures to a shader resource slot and go from there. Obviously this doesn't scale well with many lights, but for my student project, where I'm only required to have 2 shadow casters, it suffices.
_context->DrawIndexed(indexCount, 0, 0); //Draw to regular render target
_sunlight->use(1, _context); //Use sunlight shader (basically just runs a Vertex Shader & Null Pixel shader so depth can be written to depth map)
_sunlight->bindDSVSetNullRenderTarget(_context);
_context->DrawIndexed(indexCount, 0, 0); //Draw to sunlight depth target
bindDSVSetNullRenderTarget(ctx){
ID3D11RenderTargetView* nullrv = { nullptr };
ctx->OMSetRenderTargets(1, &nullrv, _sunlightDepthStencilView);
}
//The purpose of setting a null render target before doing the draw call is
//that a draw call with only a depth target bound is much faster.
//(At least I believe so, from my reading online)

using pointSize to trigger the fragment shader to draw pixels

I queried the point size range with gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE) and got [1, 1024]. Does this mean that if I use a single point to cover a texture (so that it triggers the fragment shader for every pixel spanned by the point size), then at best I cannot render images larger than 1024x1024 with this method?
I guess I have to bind 2 triangles (6 points) so that they cover all of clip space, and then gl.viewport(x, y, width, height); will map this entire area to the output texture (framebuffer object or canvas)?
Is there any other way (maybe something new in WebGL2) other than using an attribute in the fragment shader?
Correct, the largest size area you can render with a single point is whatever is returned by gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE)
The spec does not require any size larger than 1. The fact that your GPU/Driver/Browser returned 1024 does not mean that your users' machines will also return 1024.
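A minimal sketch of checking the limit at run time instead of assuming 1024. The helper names are hypothetical; gl is assumed to be any WebGL rendering context.

```javascript
// Largest point size this context supports, in pixels.
function maxPointSize(gl) {
  // Returns a two-element array-like value: [smallest, largest].
  const range = gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE);
  return range[1];
}

// True if one point could cover a width x height target on this machine.
function canCoverWithPoint(gl, width, height) {
  return Math.max(width, height) <= maxPointSize(gl);
}
```

If canCoverWithPoint returns false, the point-based approach cannot fill the target on that device, which is one more reason to fall back to two triangles.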
note: Answering based on your history of questions
The normal thing to do in WebGL for 99% of all cases is to submit vertices. Want to draw a quad, submit 4 vertices and 6 indices, or 6 vertices. Want to draw a triangle, submit 3 vertices. Want to draw a circle, submit the vertices for a circle. Want to draw a car, submit the vertices for a car or, more likely, submit the vertices for a wheel, draw 4 wheels with those vertices, submit the vertices for other parts of the car, draw each part of the car.
You multiply those vertices by some matrices to move, scale, rotate, and project them into 2D or 3D space. All your favorite games do this. The canvas 2D api does this via OpenGL ES internally. Chrome itself does this to render all the parts of this webpage. That's the norm. Anything else is an exception and will likely lead to limitations.
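To make the "vertices times matrix" idea concrete, here is a small CPU-side sketch. The names quadPositions, quadIndices, makeTransform and transformPoint are made up for illustration; in a real program the positions and indices would be uploaded to buffers and the multiply would happen in the vertex shader.

```javascript
// Unit quad: 4 vertices (x, y) and 6 indices forming 2 triangles.
const quadPositions = [
  -0.5, -0.5,
   0.5, -0.5,
  -0.5,  0.5,
   0.5,  0.5,
];
const quadIndices = [0, 1, 2, 2, 1, 3];

// Column-major 4x4 matrix (the layout WebGL uniforms use): scale, then translate.
function makeTransform(sx, sy, tx, ty) {
  return [
    sx, 0, 0, 0,
    0, sy, 0, 0,
    0, 0, 1, 0,
    tx, ty, 0, 1,
  ];
}

// CPU equivalent of `matrix * vec4(x, y, 0, 1)` in a vertex shader (x and y only).
function transformPoint(m, x, y) {
  return [
    m[0] * x + m[4] * y + m[12],
    m[1] * x + m[5] * y + m[13],
  ];
}

// Apply one matrix to every vertex of the quad.
function transformedQuad(m) {
  const out = [];
  for (let i = 0; i < quadPositions.length; i += 2) {
    out.push(transformPoint(m, quadPositions[i], quadPositions[i + 1]));
  }
  return out;
}
```

The same four vertices can be placed anywhere and at any size just by changing the matrix, which is the flexibility the answer is describing.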
For fun, in WebGL2, there are some other things you can do. They are not the normal thing to do and they are not recommended to actually solve real world problems. They can be fun though just for the challenge.
In WebGL2 there is a global variable in the vertex shader called gl_VertexID, which is the index of the vertex currently being processed. You can use that with clever math to generate vertices in the vertex shader with no other data.
Here's some code that draws a quad that covers the canvas
function main() {
const gl = document.querySelector('canvas').getContext('webgl2');
const vs = `#version 300 es
void main() {
int x = gl_VertexID % 2;
int y = (gl_VertexID / 2 + gl_VertexID / 3) % 2;
gl_Position = vec4(ivec2(x, y) * 2 - 1, 0, 1);
}
`;
const fs = `#version 300 es
precision mediump float;
out vec4 outColor;
void main() {
outColor = vec4(1, 0, 0, 1);
}
`;
// compile shaders, link program
const prg = twgl.createProgram(gl, [vs, fs]);
gl.useProgram(prg);
const count = 6;
gl.drawArrays(gl.TRIANGLES, 0, count);
}
main();
<canvas></canvas>
<script src="https://twgljs.org/dist/4.x/twgl.min.js"></script>
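To see why that index math covers the canvas, the same arithmetic can be replayed on the CPU. This is just a check of the shader's integer math in plain JavaScript; quadVertex is a name invented for the sketch.

```javascript
// Mirror of the vertex shader math: integer division becomes Math.floor.
function quadVertex(vertexId) {
  const x = vertexId % 2;
  const y = (Math.floor(vertexId / 2) + Math.floor(vertexId / 3)) % 2;
  // Map 0/1 to clip space -1/+1, like `ivec2(x, y) * 2 - 1`.
  return [x * 2 - 1, y * 2 - 1];
}

// Positions generated for gl_VertexID = 0..5.
const quad = Array.from({ length: 6 }, (_, id) => quadVertex(id));
console.log(JSON.stringify(quad));
// prints [[-1,-1],[1,-1],[-1,1],[1,-1],[-1,1],[1,1]]
```

The first three positions form the lower-left triangle and the last three the upper-right one, so together they span clip space from -1 to 1 on both axes.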
And one that draws a circle
function main() {
const gl = document.querySelector('canvas').getContext('webgl2');
const vs = `#version 300 es
#define PI radians(180.0)
void main() {
const int TRIANGLES_AROUND_CIRCLE = 100;
int triangleId = gl_VertexID / 3;
int pointId = gl_VertexID % 3;
int pointIdOffset = pointId % 2;
float angle = float((triangleId + pointIdOffset) * 2) * PI /
float(TRIANGLES_AROUND_CIRCLE);
float radius = 1. - step(1.5, float(pointId));
float x = sin(angle) * radius;
float y = cos(angle) * radius;
gl_Position = vec4(x, y, 0, 1);
}
`;
const fs = `#version 300 es
precision mediump float;
out vec4 outColor;
void main() {
outColor = vec4(1, 0, 0, 1);
}
`;
// compile shaders, link program
const prg = twgl.createProgram(gl, [vs, fs]);
gl.useProgram(prg);
const count = 300; // 100 triangles, 3 points each
gl.drawArrays(gl.TRIANGLES, 0, count);
}
main();
<canvas></canvas>
<script src="https://twgljs.org/dist/4.x/twgl.min.js"></script>
There is an entire website based on this idea. The site is based on the puzzle of making pretty pictures given only an id for each vertex. It's the vertex shader equivalent of shadertoy.com. On Shadertoy.com the puzzle is basically: given only gl_FragCoord as input to a fragment shader, write a function to draw something interesting.
Both sites are toys/puzzles. Doing things this way is not recommended for solving real issues like drawing a 3D world in a game, doing image processing, rendering the contents of a browser window, etc. They are cute puzzles about drawing something interesting given only minimal inputs.
Why is this technique not advised? The most obvious reason is that it's hard coded and inflexible, whereas the standard techniques are super flexible. For example, above, drawing a fullscreen quad required one shader and drawing a circle required a different shader, whereas standard vertex-buffer-based attributes multiplied by matrices can be used for any shape provided, 2D or 3D. Not just any shape: with a simple single matrix multiply in the shader those shapes can be translated, rotated, scaled, projected into 3D, their rotation centers and scale centers can be independently set, etc.
Note: you are free to do whatever you want. If you like these techniques then by all means use them. The reason I'm trying to steer you away from them is that, based on your previous questions, you're new to WebGL, and I feel like you'll end up making WebGL much harder for yourself if you use obscure and hard coded techniques like these instead of the traditional, more common, flexible techniques that experienced devs use to get real work done. But again, it's up to you, do whatever you want.

WebGL: Scaling affects normal matrix?

I'm playing around with WebGL and scripted a simple flat-shaded cube.
I have a shader which takes a projection matrix, a model-view matrix and a normal matrix, nothing fancy:
(...)
void main(void) {
gl_Position = uPMatrix * uMVMatrix * vec4(aVertexPosition, 1.0);
vTextureCoord = aTextureCoord;
vec3 transformedNormal = uNMatrix * aVertexNormal;
float directionalLightWeighting = max(dot(transformedNormal, uLightingDirection), 0.0);
vLightWeighting = uAmbientColor + uDirectionalColor * directionalLightWeighting;
}
Everything is fine, the flat shading looks good, but as soon as I resize the cube (noted as mat4.scale below), the shading does not affect the scene anymore. If I scale down the computed normal matrix by the reverse factor, it works again.
The code follows the following schema (drawing pseudo routine):
projection = mat4.ortho
// set up general camera view
view = mat4.lookAt
// set up cube position / scaling / rotation on view matrix
mat4.translate(view)
mat4.scale(view) // remove for nice shading ..
mat4.rotate(view)
// normalFromMat4 returns upper-left 3x3 inverse transpose
normal = mat4.normalFromMat4 ( view )
pass projection, view, normal to shader
gl.drawElements
I am using gl-matrix as math library.
Any ideas where my mistake lies?
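One likely cause, offered as an assumption rather than a confirmed diagnosis: the inverse transpose of a matrix that contains a uniform scale s scales normals by 1/s, so the transformed normal is no longer unit length and the dot product with the light direction shrinks or grows with it. The sketch below reproduces that effect without gl-matrix; inverseTranspose3, transform3 and normalize3 are hypothetical helpers working on row-major 3x3 arrays.

```javascript
// Inverse transpose of a row-major 3x3 matrix: cofactor matrix divided by the determinant.
function inverseTranspose3(m) {
  const [a, b, c, d, e, f, g, h, i] = m;
  const cof = [
    e * i - f * h, -(d * i - f * g), d * h - e * g,
    -(b * i - c * h), a * i - c * g, -(a * h - b * g),
    b * f - c * e, -(a * f - c * d), a * e - b * d,
  ];
  const det = a * cof[0] + b * cof[1] + c * cof[2];
  return cof.map((v) => v / det);
}

// Row-major 3x3 matrix times column vector.
function transform3(m, v) {
  return [
    m[0] * v[0] + m[1] * v[1] + m[2] * v[2],
    m[3] * v[0] + m[4] * v[1] + m[5] * v[2],
    m[6] * v[0] + m[7] * v[1] + m[8] * v[2],
  ];
}

// Scale a vector to unit length.
function normalize3(v) {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
}

const scale = 4;
const normalMatrix = inverseTranspose3([scale, 0, 0, 0, scale, 0, 0, 0, scale]);
const rawNormal = transform3(normalMatrix, [0, 0, 1]); // length 1 / scale = 0.25
const fixedNormal = normalize3(rawNormal);             // length 1 again
```

If this is what is happening, renormalizing in the vertex shader, for example normalize(uNMatrix * aVertexNormal), should restore the shading without removing mat4.scale.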

Varying Line Width with Open GL using GL_POINTS (iOS)

I'm making a drawing application using Swift (based on GLPaint) and OpenGL. Now I would like to improve the curve so that its width varies with stroke speed (e.g. thicker when drawing fast).
However, since my knowledge of OpenGL is quite limited, I need some guidance. What I want to do is vary the size of my texture/point for each CGPoint I calculate and add to the screen. Is it possible?
func addQuadBezier(var from:CGPoint, var ctrl:CGPoint, var to:CGPoint, startTime:CGFloat, endTime:CGFloat) {
scalePoints(from: from, ctrl: ctrl, to: to)
let pointCount = calculatePointsNeeded(from: from, to: to, min: 16.0, max: 256.0)
var vertexBuffer: [GLfloat] = [GLfloat](count: Int(pointCount), repeatedValue:0.0)
var t : CGFloat = startTime + 0.0002
for i in 0..<Int(pointCount) {
let p = calculatePoint(from:from, ctrl: ctrl, to: to)
vertexBuffer.insert(p.x.f, atIndex: i*2)
vertexBuffer.insert(p.y.f, atIndex: i*2+1)
t += (CGFloat(1)/CGFloat(pointCount))
}
glBufferData(GL_ARRAY_BUFFER.ui, Int(pointCount)*2*sizeof(GLfloat), vertexBuffer, GL_STATIC_DRAW.ui)
glDrawArrays(GL_POINTS.ui, 0, Int(pointCount).i)
}
func render()
{
context.presentRenderbuffer(GL_RENDERBUFFER.l)
}
where render() is called every 1/60 s.
shader
attribute vec4 inVertex;
uniform mat4 MVP;
uniform float pointSize;
uniform lowp vec4 vertexColor;
varying lowp vec4 color;
void main()
{
gl_Position = MVP * inVertex;
gl_PointSize = pointSize;
color = vertexColor;
}
Thanks in advance!
In your vertex shader, set gl_PointSize to the width you want. That measurement is in framebuffer pixels, so if the size of your framebuffer changes with the device's scale factor, you'll need to adjust your point size appropriately.
If you find a way to control the line width in the vertex shader, that would most likely be the best solution. Not only could the lines have different widths, but even a single line could have an increasing width (interpolated) between its points. I am not sure you will be able to achieve this on your platform, though.
So if you do find a way, you would add the point size to your buffer and use it with a new attribute in the vertex shader.
If not, you will need to use triangles to draw the line, which is generally better practice anyway. To define the vertices between points A and B, get the direction W = (B-A).normalized() and the normal N = (W.y, -W.x). With k = lineWidth/2.0, the 4 positions are t1 = A + N*k, t2 = A - N*k, t3 = B + N*k, t4 = B - N*k. This is what you add to your buffer and draw as a triangle strip or as triangles, depending on what you are looking for.
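The four corner positions from the formulas above can be computed like this. It is a sketch in JavaScript rather than the question's Swift, and segmentQuad is a name chosen for the example; points are assumed to be [x, y] arrays.

```javascript
// Corners of a quad of the given width around the segment from a to b.
// Returns [t1, t2, t3, t4], already in triangle strip order, or null for a zero-length segment.
function segmentQuad(a, b, lineWidth) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const len = Math.hypot(dx, dy);
  if (len === 0) return null; // no direction, so no normal

  const w = [dx / len, dy / len]; // W = (B - A).normalized()
  const n = [w[1], -w[0]];        // N = (W.y, -W.x)
  const k = lineWidth / 2;

  return [
    [a[0] + n[0] * k, a[1] + n[1] * k], // t1 = A + N*k
    [a[0] - n[0] * k, a[1] - n[1] * k], // t2 = A - N*k
    [b[0] + n[0] * k, b[1] + n[1] * k], // t3 = B + N*k
    [b[0] - n[0] * k, b[1] - n[1] * k], // t4 = B - N*k
  ];
}
```

To vary the width along a stroke, lineWidth can be chosen per segment from the stroke speed, or different k values can be used at A and B so the quad tapers between them.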

OpenCV: rotation/translation vector to OpenGL modelview matrix

I'm trying to use OpenCV to do some basic augmented reality. The way I'm going about it is using findChessboardCorners to get a set of points from a camera image. Then, I create a 3D quad along the z = 0 plane and use solvePnP to get a homography between the imaged points and the planar points. From that, I figure I should be able to set up a modelview matrix which will allow me to render a cube with the right pose on top of the image.
The documentation for solvePnP says that it outputs a rotation vector "that (together with [the translation vector] ) brings points from the model coordinate system to the camera coordinate system." I think that's the opposite of what I want; since my quad is on the plane z = 0, I want a modelview matrix which will transform that quad to the appropriate 3D plane.
I thought that by performing the opposite rotations and translations in the opposite order I could calculate the correct modelview matrix, but that seems not to work. While the rendered object (a cube) does move with the camera image and seems to be roughly correct translationally, the rotation just doesn't work at all; it rotates on multiple axes when it should only be rotating on one, and sometimes in the wrong direction. Here's what I'm doing so far:
std::vector<Point2f> corners;
bool found = findChessboardCorners(*_imageBuffer, cv::Size(5,4), corners,
CV_CALIB_CB_FILTER_QUADS |
CV_CALIB_CB_FAST_CHECK);
if(found)
{
drawChessboardCorners(*_imageBuffer, cv::Size(6, 5), corners, found);
std::vector<double> distortionCoefficients(5); // camera distortion
distortionCoefficients[0] = 0.070969;
distortionCoefficients[1] = 0.777647;
distortionCoefficients[2] = -0.009131;
distortionCoefficients[3] = -0.013867;
distortionCoefficients[4] = -5.141519;
// Since the image was resized, we need to scale the found corner points
float sw = _width / SMALL_WIDTH;
float sh = _height / SMALL_HEIGHT;
std::vector<Point2f> board_verts;
board_verts.push_back(Point2f(corners[0].x * sw, corners[0].y * sh));
board_verts.push_back(Point2f(corners[15].x * sw, corners[15].y * sh));
board_verts.push_back(Point2f(corners[19].x * sw, corners[19].y * sh));
board_verts.push_back(Point2f(corners[4].x * sw, corners[4].y * sh));
Mat boardMat(board_verts);
std::vector<Point3f> square_verts;
square_verts.push_back(Point3f(-1, 1, 0));
square_verts.push_back(Point3f(-1, -1, 0));
square_verts.push_back(Point3f(1, -1, 0));
square_verts.push_back(Point3f(1, 1, 0));
Mat squareMat(square_verts);
// Transform the camera's intrinsic parameters into an OpenGL camera matrix
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
// Camera parameters
double f_x = 786.42938232; // Focal length in x axis
double f_y = 786.42938232; // Focal length in y axis (usually the same?)
double c_x = 217.01358032; // Camera primary point x
double c_y = 311.25384521; // Camera primary point y
cv::Mat cameraMatrix(3,3,CV_32FC1);
cameraMatrix.at<float>(0,0) = f_x;
cameraMatrix.at<float>(0,1) = 0.0;
cameraMatrix.at<float>(0,2) = c_x;
cameraMatrix.at<float>(1,0) = 0.0;
cameraMatrix.at<float>(1,1) = f_y;
cameraMatrix.at<float>(1,2) = c_y;
cameraMatrix.at<float>(2,0) = 0.0;
cameraMatrix.at<float>(2,1) = 0.0;
cameraMatrix.at<float>(2,2) = 1.0;
Mat rvec(3, 1, CV_32F), tvec(3, 1, CV_32F);
solvePnP(squareMat, boardMat, cameraMatrix, distortionCoefficients,
rvec, tvec);
_rv[0] = rvec.at<double>(0, 0);
_rv[1] = rvec.at<double>(1, 0);
_rv[2] = rvec.at<double>(2, 0);
_tv[0] = tvec.at<double>(0, 0);
_tv[1] = tvec.at<double>(1, 0);
_tv[2] = tvec.at<double>(2, 0);
}
Then in the drawing code...
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, -tv[1], -tv[0], -tv[2]);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[0], 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[1], 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[2], 0.0f, 0.0f, 1.0f);
The vertices I'm rendering create a cube of unit length around the origin (i.e. from -0.5 to 0.5 along each edge). I know that OpenGL's transformation functions apply transformations in "reverse order," so the above should rotate the cube along the z, y, and then x axes, and then translate it. However, it seems like it's being translated first and then rotated, so perhaps Apple's GLKMatrix4 works differently?
This question seems very similar to mine, and in particular coder9's answer seems like it might be more or less what I'm looking for. However, I tried it and compared the results to my method, and the matrices I arrived at in both cases were the same. I feel like that answer is right, but that I'm missing some crucial detail.
You have to make sure the axes are facing the correct direction. In particular, the y and z axes face different directions in OpenGL and OpenCV (both are flipped, which keeps the x-y-z basis right-handed). You can find some information and code (with an iPad camera) in this blog post.
-- Edit --
Ah ok. Unfortunately, I used these resources to do it the other way round (opengl ---> opencv) to test some algorithms. My main issue was that the row order of the images was inverted between OpenGL and OpenCV (maybe this helps).
When simulating cameras, I came across the same projection matrices that can be found here and in the generalized projection matrix paper. This paper quoted in the comments of the blog post also shows some link between computer vision and OpenGL projections.
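One way to act on the axis advice, sketched under stated assumptions. The rvec returned by solvePnP is an axis-angle (Rodrigues) vector, not three per-axis angles, so it is first turned into a rotation matrix (in OpenCV, cv::Rodrigues does this). The resulting [R|t] already maps model coordinates to camera coordinates, so no inversion is applied here; only the y and z rows are negated to move from OpenCV's camera axes (y down, z forward) to OpenGL's (y up, z backward). The function names are hypothetical and the code is JavaScript for illustration only.

```javascript
// Axis-angle vector -> 3x3 rotation matrix (array of rows), Rodrigues' formula.
function rodrigues(rvec) {
  const theta = Math.hypot(rvec[0], rvec[1], rvec[2]);
  if (theta < 1e-12) return [[1, 0, 0], [0, 1, 0], [0, 0, 1]];
  const x = rvec[0] / theta;
  const y = rvec[1] / theta;
  const z = rvec[2] / theta;
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  const t = 1 - c;
  return [
    [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
    [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
    [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
  ];
}

// Build a column-major 4x4 modelview matrix from an OpenCV pose.
function modelViewFromPose(rvec, tvec) {
  const r = rodrigues(rvec);
  const flip = [1, -1, -1]; // negate the y and z rows of [R|t]
  const m = new Array(16).fill(0);
  for (let row = 0; row < 3; row++) {
    for (let col = 0; col < 3; col++) {
      m[col * 4 + row] = flip[row] * r[row][col];
    }
    m[12 + row] = flip[row] * tvec[row];
  }
  m[15] = 1;
  return m;
}
```

The 16 values are column-major, which is the layout OpenGL expects; in the question's GLKit code they could be converted to float and passed to GLKMatrix4MakeWithArray instead of chaining three GLKMatrix4Rotate calls.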
I'm not an iOS programmer, so this answer might be misleading!
If the problem is not in the order of applying the rotations and the translation, then I suggest using a simpler and more commonly used coordinate system.
The points in the corners vector have the origin (0,0) at the top left corner of the image and the y axis is towards the bottom of the image. Often from math we are used to think of the coordinate system with the origin at the center and y axis towards the top of the image. From the coordinates you're pushing into board_verts I'm guessing you're making the same mistake. If that's the case, it's easy to transform the positions of the corners by something like this:
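// width and height are assumed to be the image dimensions in pixels
for (size_t i = 0; i < corners.size(); i++) {
corners[i].x -= width / 2.0f; // move the origin to the image centre
corners[i].y = -corners[i].y + height / 2.0f; // flip y so it points up
}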
then you call solvePnP(). Debugging this is not that difficult, just print the positions of the four corners and the estimated R and T, and see if they make sense. Then you can proceed to the OpenGL step. Please let me know how it goes.
