Texture atlas to texture array via PIXEL_UNPACK_BUFFER - webgl

I have two questions:
First, is there any more direct, sane way to go from a texture atlas image to a texture array in WebGL than what I'm doing below? I haven't tried this, but doing it entirely in WebGL seems possible, though it's roughly four times the work and I'd still have to make two round trips to the GPU to do it.
Second, am I right that because buffer data for texImage3D() must come from a PIXEL_UNPACK_BUFFER, this data must come directly from the CPU side? That is, there is no way to copy from one block of GPU memory to a PIXEL_UNPACK_BUFFER without copying it to the CPU first. I'm pretty sure the answer to this is a hard "no, there isn't".
In case my questions themselves are stupid (and they may be), my ultimate goal here is simply to convert a texture atlas PNG to a texture array. From what I've tried, the fastest way to do this by far is via PIXEL_UNPACK_BUFFER, rather than extracting each sub-image and sending them in one at a time, which for large atlases is extremely slow.
This is basically how I'm currently getting my pixel data.
const imageToBinary = async (image: HTMLImageElement) => {
    const canvas = document.createElement('canvas');
    canvas.width = image.width;
    canvas.height = image.height;
    const context = canvas.getContext('2d');
    context.drawImage(image, 0, 0);
    const imageData = context.getImageData(0, 0, image.width, image.height);
    return imageData.data;
};
So, I'm creating an HTMLImageElement object, which contains the uncompressed pixel data I want, but has no methods to get at it directly. Then I'm creating a 2D context version containing the same pixel data a second time. Then I'm repopulating the GPU with the same pixel data a third time. Seems bonkers to me, but I don't see a way around it.

Related

iOS Metal – reading old values while writing into texture

I have a kernel function (compute shader) that reads nearby pixels of a pixel from a texture and based on the old nearby-pixel values updates the value of the current pixel (it's not a simple convolution).
I've tried creating a copy of the texture using BlitCommandEncoder and feeding the kernel function with 2 textures - one read-only and another write-only. Unfortunately, this approach is GPU-wise time consuming.
What is the most efficient (GPU- and memory-wise) way of reading old values from a texture while updating its content?
(Bit late but oh well)
There is no way you could make it work with only one texture, because the GPU is a highly parallel processor: the kernel you wrote for a single pixel gets called in parallel on all pixels, and you can't tell which one runs first.
So you definitely need 2 textures. The way you probably should do it is by using 2 textures where one is the "old" one and the other the "new" one. Between passes, you switch the role of the textures, now old is new and new is old. Here is some pseudoswift:
var currentText = MTLTexture()  // the "old" texture the kernel reads
var nextText = MTLTexture()     // the "new" texture the kernel writes
let semaphore = dispatch_semaphore_create(1)

func update() {
    // Wait until the previous pass has finished and the textures were swapped
    dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER)

    let commands = commandQueue.commandBuffer()
    let encoder = commands.computeCommandEncoder()
    encoder.setTexture(currentText, atIndex: 0)  // read-only input
    encoder.setTexture(nextText, atIndex: 1)     // write-only output
    encoder.dispatchThreadgroups(...)
    encoder.endEncoding()

    // When updating is done, swap the textures and signal that it's done updating
    commands.addCompletedHandler { _ in
        swap(&currentText, &nextText)
        dispatch_semaphore_signal(semaphore)
    }
    commands.commit()
}
I have written plenty of iOS Metal code that samples (or reads) from the same texture it is rendering into. I am using the render pipeline, setting my texture as the render target attachment, and also loading it as a source texture. It works just fine.
To be clear, a more efficient approach is to use the [[color(n)]] attribute on a fragment function argument, but that is only suitable if all you need is the value of the current fragment, not any other nearby positions. If you need to read from other positions in the render target, I would just bind the render target as a source texture for the fragment shader as well.
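For reference, here is a minimal Swift sketch of that setup (my own illustration, not the answerer's code): the same texture is set as the color attachment and also bound as a fragment-shader source. commandQueue, pipelineState, and texture are assumed to exist already, and the texture's usage must include .renderTarget and .shaderRead.

import Metal

// Rough sketch: one texture bound as both render target and fragment source.
func encodeFeedbackPass(commandQueue: MTLCommandQueue,
                        pipelineState: MTLRenderPipelineState,
                        texture: MTLTexture) {
    let passDescriptor = MTLRenderPassDescriptor()
    passDescriptor.colorAttachments[0].texture = texture        // render target
    passDescriptor.colorAttachments[0].loadAction = .load       // keep the old contents readable
    passDescriptor.colorAttachments[0].storeAction = .store

    guard let commandBuffer = commandQueue.makeCommandBuffer(),
          let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor) else { return }

    encoder.setRenderPipelineState(pipelineState)
    encoder.setFragmentTexture(texture, index: 0)                // same texture as a source
    encoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
    encoder.endEncoding()
    commandBuffer.commit()
}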

Render speed for individual pixels in loop

I'm working on drawing individual pixels to a UIView to create fractal images. My problem is my rendering speed. I am currently running this loop 260,000 times, but would like to render even more pixels. As it is, it takes about 5 seconds to run on my iPad Mini.
I was using a UIBezierPath before, but that was even a bit slower (about 7 seconds). I've been looking into NSBitmap stuff, but I'm not exactly sure whether that would speed it up or how I'd implement it in the first place.
I was also thinking about trying to store the pixels from my loop into an array, and then draw them all together after my loop. Again though, I am not quite sure what the best process would be to store and then retrieve pixels into and from an array.
Any help on speeding up this process would be great.
for (int i = 0; i < 260000; i++) {
    float RN = drand48();

    for (int i = 1; i < numBuckets; i++) {
        if (RN < bucket[i]) {
            col = i;
            CGContextSetFillColor(context, CGColorGetComponents([UIColor colorWithRed:(colorSelector[i][0]) green:(colorSelector[i][1]) blue:(colorSelector[i][2]) alpha:(1)].CGColor));
            break;
        }
    }

    xT = myTextFieldArray[1][1][col]*x1 + myTextFieldArray[1][2][col]*y1 + myTextFieldArray[1][5][col];
    yT = myTextFieldArray[1][3][col]*x1 + myTextFieldArray[1][4][col]*y1 + myTextFieldArray[1][6][col];
    x1 = xT;
    y1 = yT;

    if (i > 10000) {
        CGContextFillRect(context, CGRectMake(xOrigin+(xT-xMin)*sizeScalar, yOrigin-(yT-yMin)*sizeScalar, .5, .5));
    }
    else if (i < 10000) {
        if (x1 < xMin) {
            xMin = x1;
        }
        else if (x1 > xMax) {
            xMax = x1;
        }
        if (y1 < yMin) {
            yMin = y1;
        }
        else if (y1 > yMax) {
            yMax = y1;
        }
    }
    else if (i == 10000) {
        if (xMax - xMin > yMax - yMin) {
            sizeScalar = 960/(xMax - xMin);
            yOrigin = 1000-(1000-sizeScalar*(yMax-yMin))/2;
        }
        else {
            sizeScalar = 960/(yMax - yMin);
            xOrigin = (1000-sizeScalar*(xMax-xMin))/2;
        }
    }
}
Edit
I created a multidimensional array to store UIColors into, so I could use a bitmap to draw my image. It is significantly faster, but my colors are not working appropriately now.
Here is where I am storing my UIColors into the array:
int xPixel = xOrigin+(xT-xMin)*sizeScalar;
int yPixel = yOrigin-(yT-yMin)*sizeScalar;
pixelArray[1000-yPixel][xPixel] = customColors[col];
Here is my drawing stuff:
CGDataProviderRef provider = CGDataProviderCreateWithData(nil, pixelArray, 1000000, nil);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGImageRef image = CGImageCreate(1000,
                                 1000,
                                 8,
                                 32,
                                 4000,
                                 colorSpace,
                                 kCGBitmapByteOrder32Big | kCGImageAlphaNoneSkipLast,
                                 provider,
                                 nil,  // No decode
                                 NO,   // No interpolation
                                 kCGRenderingIntentDefault);  // Default rendering
CGContextDrawImage(context, self.bounds, image);
Not only are the colors not what they are supposed to be, but every time I render my image, the colors are completely different from the previous time. I have been testing different things with the colors, but I still have no idea why they are wrong, and I'm even more confused about how they keep changing.
Per-pixel drawing — with complicated calculations for each pixel, like fractal rendering — is one of the hardest things you can ask a computer to do. Each of the other answers here touches on one aspect of its difficulty, but that's not quite all. (Luckily, this kind of rendering is also something that modern hardware is optimized for, if you know what to ask it for. I'll get to that.)
Both @jcaron and @JustinMeiners note that vector drawing operations (even rect fill) in CoreGraphics take a penalty for CPU-based rasterization. Manipulating a buffer of bitmap data would be faster, but not a lot faster.
Getting that buffer onto the screen also takes time, especially if you're having to go through a process of creating bitmap image buffers and then drawing them in a CG context — that's doing a lot of sequential drawing work on the CPU and a lot of memory-bandwidth work to copy that buffer around. So #JustinMeiners is right that direct access to GPU texture memory would be a big help.
However, if you're still filling your buffer in CPU code, you're still hampered by two costs (at best, worse if you do it naively):
sequential work to render each pixel
memory transfer cost from texture memory to frame buffer when rendering
@JustinMeiners' answer is good for his use case: image sequences are pre-rendered, so he knows exactly what each pixel is going to be and he just has to schlep it into texture memory. But your use case requires a lot of per-pixel calculations.
Luckily, per-pixel calculations are what GPUs are designed for! Welcome to the world of pixel shaders. For each pixel on the screen, you can be running an independent calculation to determine the relationship of that point to your fractal set and thus what color to draw it in. The GPU can run that calculation in parallel for many pixels at once, and its output goes straight to the screen, so there's no memory overhead from dumping a bitmap into the framebuffer.
One easy way to work with pixel shaders on iOS is SpriteKit — it can handle most of the necessary OpenGL/Metal setup for you, so all you have to write is the per-pixel algorithm in GLSL (actually, a subset of GLSL that gets automatically translated to Metal shader language on Metal-supported devices). Here's a good tutorial on that, and here's another on OpenGL ES pixel shaders for iOS in general.
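To make that concrete, here is a rough Swift/SpriteKit sketch of my own (not from the tutorials): a full-screen sprite whose SKShader runs a trivial Mandelbrot-style iteration for every pixel. The node size and the 64-iteration escape test are arbitrary, and depending on the SpriteKit version you may need to give the node a placeholder texture for v_tex_coord to be defined.

import SpriteKit

// Per-pixel work expressed as a SpriteKit fragment shader (GLSL-style source).
let shaderSource = """
void main() {
    vec2 c = (v_tex_coord - 0.5) * 3.0;   // map the quad onto the complex plane
    vec2 z = vec2(0.0);
    float escape = 0.0;
    for (int i = 0; i < 64; i++) {
        z = vec2(z.x * z.x - z.y * z.y, 2.0 * z.x * z.y) + c;
        if (dot(z, z) > 4.0) { escape = float(i) / 64.0; break; }
    }
    gl_FragColor = vec4(escape, escape * 0.5, 1.0 - escape, 1.0);
}
"""

let fractalNode = SKSpriteNode(color: .black, size: CGSize(width: 1024, height: 1024))
fractalNode.shader = SKShader(source: shaderSource)   // evaluated on the GPU for every pixel
// add fractalNode to an SKScene (inside an SKView) to display it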
If you really want to change many different pixels individually, your best option is probably to allocate a chunk of memory (of size width * height * bytes per pixel), make the changes directly in memory, and then convert the whole thing into a bitmap at once with CGBitmapContextCreateWithData.
There may be even faster methods than this (see Justin's answer).
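As a rough Swift sketch of that buffer-then-bitmap idea (a slight variant that lets the CGContext own the buffer, which avoids lifetime issues; the function name, sizes, and the example pixel write are illustrative):

import UIKit

// Fill a raw RGBA buffer, then produce one CGImage to draw in a single call.
func renderFractalImage(width: Int, height: Int) -> CGImage? {
    guard let context = CGContext(data: nil,                      // let CG own the pixel buffer
                                  width: width, height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: width * 4,
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue),
          let pixels = context.data?.assumingMemoryBound(to: UInt8.self) else { return nil }

    // Set pixels directly: index (y * width + x) * 4 holds R, then G, B, A.
    pixels[(10 * width + 10) * 4]     = 255   // example: one red pixel at (10, 10)
    pixels[(10 * width + 10) * 4 + 3] = 255

    return context.makeImage()   // draw this once instead of 260,000 CGContextFillRect calls
}

// In drawRect: UIGraphicsGetCurrentContext()?.draw(renderFractalImage(width: 1000, height: 1000)!, in: bounds)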
If you want to maximize render speed I would recommend bitmap rendering. Vector rasterization is much slower and CGContext drawing isn't really intended for high performance realtime rendering.
I faced a similar technical challenge and found CVOpenGLESTextureCacheRef to be the fastest. The texture cache allows you to upload a bitmap directly into graphics memory for fast rendering. Rendering uses OpenGL, but because it's just a 2D fullscreen image, you really don't need to learn much about OpenGL to use it.
You can see an example I wrote of using the texture cache here: https://github.com/justinmeiners/image-sequence-streaming
My original question related to this is here:
How to directly update pixels - with CGImage and direct CGDataProvider
My project renders bitmaps from files so it is a little bit different but you could look at ISSequenceView.m for an example of how to use the texture cache and setup OpenGL for this kind of rendering.
Your rendering procedure would then look something like this:
1. Draw to buffer (raw bytes)
2. Lock texture cache
3. Copy buffer to texture cache
4. Unlock texture cache.
5. Draw fullscreen quad with texture
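A hedged Swift sketch of those steps, assuming an existing EAGLContext called glContext and a 1024x1024 BGRA buffer; names and sizes are illustrative, error checking is omitted, and these OpenGL ES/texture-cache APIs are deprecated on current iOS:

import OpenGLES
import CoreVideo

// Create the texture cache once, tied to the GL context
var textureCache: CVOpenGLESTextureCache?
CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, nil, glContext, nil, &textureCache)

// The pixel buffer must be IOSurface-backed for the cache to accept it
let attrs = [kCVPixelBufferIOSurfacePropertiesKey as String: [String: Any]()] as CFDictionary
var pixelBuffer: CVPixelBuffer?
CVPixelBufferCreate(kCFAllocatorDefault, 1024, 1024, kCVPixelFormatType_32BGRA, attrs, &pixelBuffer)

// Steps 1-3: lock the buffer, copy your raw bytes into it, unlock
CVPixelBufferLockBaseAddress(pixelBuffer!, [])
// memcpy(CVPixelBufferGetBaseAddress(pixelBuffer!), rawBytes, 1024 * 1024 * 4)
CVPixelBufferUnlockBaseAddress(pixelBuffer!, [])

// Step 4: wrap the pixel buffer as a GL texture backed by the same memory
var cvTexture: CVOpenGLESTexture?
CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textureCache!, pixelBuffer!, nil,
                                             GLenum(GL_TEXTURE_2D), GL_RGBA, 1024, 1024,
                                             GLenum(GL_BGRA), GLenum(GL_UNSIGNED_BYTE), 0, &cvTexture)

// Step 5: bind the texture and draw a fullscreen quad with it
glBindTexture(GLenum(GL_TEXTURE_2D), CVOpenGLESTextureGetName(cvTexture!))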

Good tutorial on using Quads for custom Text in OpenGL ES 2.0 on iOS

I'm new to OpenGL ES and am teaching myself how to program iOS games. I'm currently playing with a project that I would like to put a HUD over, with some custom text. I don't want to do this using a UILabel, and I currently have no idea how to use quads to cut up a PNG full of glyphs and stitch them together to display a string. I would like the end result to be passing a simple string to a command/method and having the output displayed using the textures/bitmap for the quads, say glPrint("Hello World");. Would anyone be able to guide me in the proper direction? There doesn't seem to be a single good tutorial on how to do this for OpenGL ES 2.0 (just OpenGL). I also want to avoid using 3rd-party APIs; I really need/want to understand how to tackle this.
When I was getting started with OpenGL ES for my current 2D project I used Ray's tutorial, which helped me get a handle on rendering textured 2D quads. In conjunction with his 3D OpenGL ES tutorial, you might be able to piece together what you want to do. Note that you probably wouldn't render every single quad separately like in the tutorial, as that is very inefficient. Instead, you would gather all of the vertices of the characters into two big arrays/vertex buffers and batch-render the characters (roughly as in the sketch below). The basic flow for rendering each frame would probably look like this:
1. Pass a normal perspective projection matrix for 3D rendering, get your vertex information for your 3D scene to your shaders somehow, and render the 3D scene. This part you've already done.
2. For the text, immediately after, pass in an orthographic projection matrix and bind your font texture (generally generated earlier with the GLKTextureLoader class) to the active texture unit.
3. Generate two big arrays of texture and geometric vertices for the characters (or update VBOs if the text has changed) and pass them in.
4. Batch-render all of the letters at once using either glDrawArrays or glDrawElements (which requires indices).
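A rough Swift sketch of the batching idea, under the assumption of a monospaced 16x16 ASCII glyph atlas; it interleaves positions and texture coordinates into one array rather than two, but the point is the same: one buffer and one draw call per string. The Vertex struct and function name are my own, not from the tutorials.

struct Vertex { var x, y: Float; var u, v: Float }

// Build six vertices (two triangles) per character, all in a single array.
func quads(for text: String, origin: (Float, Float), glyphSize: Float) -> [Vertex] {
    var vertices: [Vertex] = []
    for (i, scalar) in text.unicodeScalars.enumerated() {
        let ascii = Int(scalar.value)
        let cell: Float = 1.0 / 16.0                      // one glyph in the atlas
        let u0 = Float(ascii % 16) * cell
        let v0 = Float(ascii / 16) * cell
        let x0 = origin.0 + Float(i) * glyphSize
        let y0 = origin.1
        let (x1, y1, u1, v1) = (x0 + glyphSize, y0 + glyphSize, u0 + cell, v0 + cell)
        vertices += [Vertex(x: x0, y: y0, u: u0, v: v1),
                     Vertex(x: x1, y: y0, u: u1, v: v1),
                     Vertex(x: x0, y: y1, u: u0, v: v0),
                     Vertex(x: x1, y: y0, u: u1, v: v1),
                     Vertex(x: x1, y: y1, u: u1, v: v0),
                     Vertex(x: x0, y: y1, u: u0, v: v0)]
    }
    return vertices   // upload with glBufferData and draw with one glDrawArrays(GL_TRIANGLES, ...)
}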
Also, as I'm fairly new to OpenGL myself, some of this may be wrong or inefficient. I've yet to use OpenGL ES to render anything 3D, so I'm not sure what other state changes (enabling, disabling, etc.) besides a different projection matrix might be needed between rendering your 3D scene and the 2D scene (the text).
It seems that drawing text using only OpenGL is a relatively difficult and tedious task, so if you just want to render a HUD overlay displaying frame rates and other things you are much better off using UILabels and saving yourself the trouble, especially if your project is not very complex. This also prevents you from having to deal with wrapping, kerning, font sizes, colors, different languages and a load of other stuff that greatly complicates text rendering if you need anything more complex.
Rather than tracking the location of each letter, why not use Core Graphics to draw your entire string into a bitmap, then upload that as a texture? You'd just need to get the dimensions from your bitmap to know what size quad to draw for that text string.
Within my open source GPUImage framework, I have an input class called a GPUImageUIElement that does something similar. The relevant code from that input is as follows:
CGSize layerPixelSize = [self layerSizeInPixels];
GLubyte *imageData = (GLubyte *) calloc(1, (int)layerPixelSize.width * (int)layerPixelSize.height * 4);
CGColorSpaceRef genericRGBColorspace = CGColorSpaceCreateDeviceRGB();
CGContextRef imageContext = CGBitmapContextCreate(imageData, (int)layerPixelSize.width, (int)layerPixelSize.height, 8, (int)layerPixelSize.width * 4, genericRGBColorspace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGContextTranslateCTM(imageContext, 0.0f, layerPixelSize.height);
CGContextScaleCTM(imageContext, layer.contentsScale, -layer.contentsScale);
[layer renderInContext:imageContext];
CGContextRelease(imageContext);
CGColorSpaceRelease(genericRGBColorspace);
glBindTexture(GL_TEXTURE_2D, outputTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, (int)layerPixelSize.width, (int)layerPixelSize.height, 0, GL_BGRA, GL_UNSIGNED_BYTE, imageData);
free(imageData);
This code takes a CALayer (either directly or from the backing layer of a UIView) and renders its contents to a texture. I've already initialized the texture before this, so the code sets up a bitmap context, renders the layer into that context using -renderInContext:, and then uploads that bitmap to the texture for use in OpenGL ES.
The helper method -layerSizeInPixels just accounts for the current Retina scale factor as follows:
- (CGSize)layerSizeInPixels;
{
    CGSize pointSize = layer.bounds.size;
    return CGSizeMake(layer.contentsScale * pointSize.width, layer.contentsScale * pointSize.height);
}
If you used a UILabel for your view and had it autosize to fit its text, you could set the text on it, use the above to render and upload your texture, and then take the pixel size of the element to determine your quad size. However, it would probably be more efficient to just draw the text yourself using -drawAtPoint:withFont: or the like on an NSString.
Using Core Graphics to render your text makes it easy to manipulate the text as an NSString and use all of Core Graphics' typesetting capabilities instead of rolling your own.
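For completeness, a Swift sketch of that approach (my own illustration; the font, size, and color are arbitrary): render a string into an offscreen RGBA bitmap and return the raw bytes for a texture upload.

import UIKit

func textPixels(_ text: String, width: Int, height: Int) -> [UInt8] {
    var pixels = [UInt8](repeating: 0, count: width * height * 4)
    pixels.withUnsafeMutableBytes { buffer in
        guard let context = CGContext(data: buffer.baseAddress,
                                      width: width, height: height,
                                      bitsPerComponent: 8, bytesPerRow: width * 4,
                                      space: CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
        else { return }
        // Flip to UIKit's top-left origin, like the -renderInContext: snippet above
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        UIGraphicsPushContext(context)      // make UIKit string drawing target this bitmap
        (text as NSString).draw(at: .zero,
                                withAttributes: [.font: UIFont.systemFont(ofSize: 32),
                                                 .foregroundColor: UIColor.white])
        UIGraphicsPopContext()
    }
    return pixels   // width x height RGBA bytes, ready for glTexImage2D or similar
}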

directx texture dimensions

so I've discovered that my graphics card automatically resizes textures to powers of 2, which isn't usually a problem but I need to render only a portion of my texture and in doing so, must have the dimensions it has been resized to...
ex:
I load a picture that is 370x300 pixels into my texture and try to draw it with a specific source rectangle
RECT test;
test.left = 0;
test.top = 0;
test.right = 370;
test.bottom = 300;

lpSpriteHandler->Draw(
    lpTexture,
    &test,                       // srcRect
    NULL,                        // center
    NULL,                        // position
    D3DCOLOR_XRGB(255, 255, 255)
);
but since the texture has been automatically resized (in this case) to 512x512, I see only a portion of my original texture.
The question is,
is there a function or something I can call to find the dimensions of my texture?
(I've tried googling this but always get some weird crap about Objects and HSL or something)
You may get file information by using this call:
D3DXIMAGE_INFO info;
D3DXGetImageInfoFromFile(file_name, &info);
Though, even knowing the original size, the texture will still be resized on load, which will obviously affect its quality. Texture resizing is not a big deal when you apply the texture to a mesh (it will get resampled anyway), but for drawing sprites it can be a concern. To work around this I would suggest creating a surface, loading it via D3DXLoadSurfaceFromFile, and then copying it to a pow2-sized texture.
And slightly off topic: are you definitely sure about your card's capabilities? It may be that your card does in fact support arbitrary texture sizes, but you are using D3DXCreateTextureFromFile(), which by default enforces pow2 sizes. To avoid this, try the extended version of this routine:
LPDIRECT3DTEXTURE9 texture;
D3DXCreateTextureFromFileEx(
    device, file_name, D3DX_DEFAULT_NONPOW2, D3DX_DEFAULT_NONPOW2, D3DX_DEFAULT, 0,
    D3DFMT_UNKNOWN, D3DPOOL_MANAGED, D3DX_DEFAULT, D3DX_DEFAULT, 0, NULL, NULL,
    &texture);
If your hardware supports non-pow2 textures, your file will be loaded as-is. If the hardware can't handle it, the call will fail.

Fixing Local Skew in Leptonica 1.68

I'm having an interesting issue with Leptonica that I'm wondering if other SO members have seen.
I'm doing a deskew operation, and having severe artifacting issues, so much so that nobody would rightly accept the results, which degrade the image quality more than they benefit it.
Here's the relevant code that produces the deskew operation:
// Make a black and white version for deskew calculations
l_int32 thresh;
PIX * deskewbw = pixMaskedThreshOnBackgroundNorm(pix,NULL,10,15,25,10,2,2,0.1,&thresh);
NSLog(#"Used threshold of %d to normalize image for deskew",thresh);
// Find the local skew
PTA * ptas, *ptad;
pixGetLocalSkewTransform(deskewbw, 0, 0, 0, 0.0, 0.0, 0.0, &ptas, &ptad);
// Cleanup the first B/W version
pixDestroy(&deskewbw);
// Deskew the original image
PIX * deskewgray = pixProjectivePtaGray(pix, ptad, ptas, 128);
// Reduce the deskewed original image to B/W
pixbw = pixMaskedThreshOnBackgroundNorm(deskewgray, NULL, 10, 15, 25, 10, 2, 2, 0.1, &thresh);
Whether I use this or the pixDeskewLocal function (which does something similar), I get some VERY UGLY results with an interlaced line effect:
Just for comparison, here is the original (slightly skewed) image:
This happens whether the original is a black or white foreground, and is more severe in areas that are shifted more. I'm tempted at this point just to have iOS do the rendering for me to avoid Leptonica for this particular operation, but that increases the number of conversions in my workflow, which I'd rather avoid if possible.
Has anyone else encountered/overcome this issue before? Any pointers on why this happens/how to fix it?
You can use the function pixEndianByteSwap(pixbw); to fix this problem.
I thought about this from an image processing perspective, and realized that I had probably made a mistake in reading data into Leptonica, rather than Leptonica being the culprit here, and as it turns out, I was right.
The pixel spacing of the glitch was 4, and as it turns out, Leptonica reads data in words, processing from MSB/MSb to LSB/LSb, which leads to a conflict between the way CGContext writes data and the way Leptonica reads it. This isn't as big a problem if you read the data into Leptonica as RGBA, because once you flatten it to grayscale or B/W the error mostly vanishes (at the cost of some dynamic range), but since I was reading the data in as 8-bit grayscale, the error didn't vanish; it manifested as you see above.
On little-endian systems, the data needs to be divided into words and the byte order within each word reversed to form a sensible image from a CGContext; on big-endian systems, no change is necessary. I'd prefer finding a way to have CGContext do this for me, but for now, I'll fix it the hard way.
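To illustrate the fix, here is a sketch of the byte movement that Leptonica's pixEndianByteSwap performs for you, written in Swift purely to show the idea; the function and buffer names are illustrative.

import Foundation

// Reverse the byte order inside each 32-bit word of an 8-bpp buffer.
func swapWordBytes(_ pixels: inout [UInt8]) {
    precondition(pixels.count % 4 == 0, "buffer must be a whole number of 32-bit words")
    pixels.withUnsafeMutableBytes { raw in
        let words = raw.bindMemory(to: UInt32.self)
        for i in words.indices {
            words[i] = words[i].byteSwapped   // ABCD -> DCBA within each word
        }
    }
}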
