iPad missing OpenGL extension string GL_APPLE_texture_2D_limited_npot - ios

In my iOS game, I want to use the GL_APPLE_texture_2D_limited_npot extension when available to save memory (the game has NPOT textures, and in my current implementation I add padding to make them power of two).
I am testing on my iPad (first generation). Everything I have read so far says that all iOS devices that support OpenGL ES 2 (including the iPad) also support GL_APPLE_texture_2D_limited_npot (which is very good, since my game uses OpenGL ES 2). I have tested on my iPad, and it does support the feature (I removed the padding and the images work if I set wrap to GL_CLAMP_TO_EDGE), but the extension does not show up when I call glGetString(GL_EXTENSIONS). The code:
const char *extensions = (const char *)glGetString(GL_EXTENSIONS);
std::cout << extensions << "\n";
Results in:
GL_OES_depth_texture GL_OES_depth24 GL_OES_element_index_uint GL_OES_fbo_render_mipmap GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_texture_float GL_OES_texture_half_float GL_OES_vertex_array_object GL_EXT_blend_minmax GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_read_format_bgra GL_EXT_separate_shader_objects GL_EXT_shader_texture_lod GL_EXT_texture_filter_anisotropic GL_APPLE_framebuffer_multisample GL_APPLE_rgb_422 GL_APPLE_texture_format_BGRA8888 GL_APPLE_texture_max_level GL_IMG_read_format GL_IMG_texture_compression_pvrtc
Why does this extension not show up in glGetString(GL_EXTENSIONS)? What is the proper way to check for it? Do all OpenGL ES 2 iOS devices really support it?

OpenGL ES 2.0 supports non-power-of-two textures in the core specification. There is no need for an extension. Here is the spec: http://www.khronos.org/registry/gles/specs/2.0/es_full_spec_2.0.25.pdf (page 69):
If wt and ht are the specified image width and height, and if either wt or ht are
less than zero, then the error INVALID_VALUE is generated.
The maximum allowable width and height of a two-dimensional texture image
must be at least 2^(k - lod) for image arrays of level zero through k, where k is the
log base 2 of MAX_TEXTURE_SIZE and lod is the level-of-detail of the image array.
It may be zero for image arrays of any level-of-detail greater than k. The error
INVALID_VALUE is generated if the specified image is too large to be stored under
any conditions.
Not a word about a power-of-two restriction (that restriction is in the OpenGL ES 1.x standard).
And if you read the specification of the extension - http://www.khronos.org/registry/gles/extensions/APPLE/APPLE_texture_2D_limited_npot.txt - you'll notice that it is written against the OpenGL ES 1.1 spec.
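For completeness, if you ever do need to check for an extension at runtime (for example on an OpenGL ES 1.1 context, where this extension actually matters), the usual approach is a whole-token search of the extension string. A minimal C sketch, assuming a current GL context; the helper name has_gl_extension is mine:
#include <string.h>
#include <OpenGLES/ES2/gl.h>

/* Returns nonzero if `name` appears as a whole token in GL_EXTENSIONS.
 * A plain strstr() alone can be fooled by prefixes of longer extension names. */
static int has_gl_extension(const char *name)
{
    const char *all = (const char *)glGetString(GL_EXTENSIONS);
    const char *p = all;
    size_t len = strlen(name);

    while (p != NULL && (p = strstr(p, name)) != NULL) {
        if ((p == all || p[-1] == ' ') && (p[len] == ' ' || p[len] == '\0'))
            return 1;
        p += len;
    }
    return 0;
}
On an ES 2.0 context the check is only informative: as noted above, the limited NPOT behaviour (GL_CLAMP_TO_EDGE wrap, no mipmaps) is already core.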

Related

Why are printed memory addresses in Rust a mix of both 40-bit and 48-bit addresses?

I'm trying to understand the way Rust deals with memory, and I have a little program that prints some memory addresses:
fn main() {
    let a = &&&5;
    let x = 1;
    println!(" {:p}", &x);
    println!(" {:p} \n {:p} \n {:p} \n {:p}", &&&a, &&a, &a, a);
}
This prints the following (varies for different runs):
0x235d0ff61c
0x235d0ff710
0x235d0ff728
0x235d0ff610
0x7ff793f4c310
This is actually a mix of both 40-bit and 48-bit addresses. Why the mix? Also, can somebody please tell me why addresses 2, 3 and 4 do not fall in locations separated by 8 bytes (since std::mem::size_of_val(&a) gives 8)? I'm running Windows 10 on an AMD x64 processor (Phenom II X4) with 24 GB RAM.
All the addresses do have the same size; Rust is just not printing the leading zero digits.
The actual memory layout is an implementation detail of your OS, but the reason that a prints a location in a different memory area than all the other variables is that a actually lives in your loaded binary: it is a value that can already be calculated by the compiler. All the other variables are calculated at runtime and live on the stack.
See the compilation result on https://godbolt.org/z/kzSrDr:
.L__unnamed_4 contains the value 5; .L__unnamed_5, .L__unnamed_6 and .L__unnamed_1 are &5, &&5 and &&&5.
So .L__unnamed_1 is what is at 0x7ff793f4c310 on your system, while the 0x235d0ff??? addresses are on your stack, calculated in the red and blue areas of the code.
This is actually a mix of both 40-bit and 48-bit addresses. Why this mix?
It's not really a mix; Rust just doesn't display leading zeroes. It's really about where the OS maps the various components of the program (data, bss, heap and stack) in the address space.
Also, can somebody please tell me why the addresses (2, 3, 4) do not fall in locations separated by 8-bytes (since std::mem::size_of_val(&a) gives 8)?
Because println! is a macro that expands to a bunch of stuff in the stack frame, your values are not defined next to one another in the final code (https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5b812bf11e51461285f51f95dd79236b). Though even if they were, there would be no guarantee the compiler wouldn't e.g. be reusing now-dead memory to save on frame size.
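The leading-zero point is easy to see outside Rust as well. A small C sketch (my own illustration, not from the question) that prints a stack address and an address inside the loaded binary, once the usual way and once zero-padded to the full 64 bits:
#include <stdio.h>
#include <inttypes.h>

static const int in_binary = 5;   /* stored in the loaded image, like the 5 behind `a` */

int main(void)
{
    int on_stack = 1;             /* lives in the stack frame, like `x` */

    /* %p usually drops leading zeros; the second column pads to 16 hex digits. */
    printf("%p  ->  0x%016" PRIxPTR "\n", (void *)&on_stack,  (uintptr_t)&on_stack);
    printf("%p  ->  0x%016" PRIxPTR "\n", (void *)&in_binary, (uintptr_t)&in_binary);
    return 0;
}
Both columns name the same locations; only the printed width differs.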

Weird case with MemoryLayout using a struct protocol, different size reported

I'm working on a drawing engine using Metal. I am reworking from a previous version, so I am starting from scratch.
I was getting the error Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3).
After some debugging I placed the blame on my drawPrimitives routine, and I found the case quite interesting.
I will have a variety of brushes, all of which will work with specific vertex info.
So I said, why not have all the brushes respond to a protocol?
The protocol for the Vertices will be this:
protocol MetalVertice {}
And the Vertex info used by this specific brush will be:
struct PointVertex: MetalVertice {
    var pointId: UInt32
    let relativePosition: UInt32
}
The brush can be called either by giving it vertices previously created or by calling a function to create those vertices. Anyway, the real drawing happens in the vertex function:
var vertices: [PointVertex] = [PointVertex].init(repeating: PointVertex(pointId: 0,
                                                                        relativePosition: 0),
                                                 count: totalVertices)
for (verticeIdx, pointIndex) in pointsIndices.enumerated() {
    vertices[verticeIdx].pointId = UInt32(pointIndex)
}
for vertice in vertices {
    print("size: \(MemoryLayout.size(ofValue: vertice))")
}
self.renderVertices(vertices: vertices,
                    forStroke: stroke,
                    inDrawing: drawing,
                    commandEncoder: commandEncoder)
return vertices
}
func renderVertices(vertices: [MetalVertice], forStroke stroke: LFStroke, inDrawing drawing: LFDrawing, commandEncoder: MTLRenderCommandEncoder) {
    if vertices.count > 1 {
        print("vertices a escribir: \(vertices.count)")
        print("stride: \(MemoryLayout<PointVertex>.stride)")
        print("size of array \(MemoryLayout.size(ofValue: vertices))")
        for vertice in vertices {
            print("ispointvertex: \(vertice is PointVertex)")
            print("size: \(MemoryLayout.size(ofValue: vertice))")
        }
    }
    let vertexBuffer = LFDrawing.device.makeBuffer(bytes: vertices,
                                                   length: MemoryLayout<PointVertex>.stride * vertices.count,
                                                   options: [])
This was the issue: calling this specific code produces these results in the console:
size: 8
size: 8
vertices a escribir: 2
stride: 8
size of array 8
ispointvertex: true
size: 40
ispointvertex: true
size: 40
In the previous function the size of the vertices is 8 bytes, but for some reason, when they enter the next function they turn into 40 bytes, so the buffer is incorrectly constructed.
If I change the function signature to:
func renderVertices(vertices: [PointVertex], forStroke stroke: LFStroke, inDrawing drawing:LFDrawing, commandEncoder: MTLRenderCommandEncoder) {
the vertices are correctly reported as 8 bytes long and the draw routine works as intended.
Is there anything I'm missing? Is the MetalVertice protocol introducing some noise?
In order to fulfill the requirement that value types conforming to protocols be able to perform dynamic dispatch (and also in part to ensure that containers of protocol types are able to assume that all of their elements are of uniform size), Swift uses what are called existential containers to hold the data of protocol-conforming value types alongside metadata that points to the concrete implementations of each protocol. If you've heard the term protocol witness table, that's what's getting in your way here.
The particulars of this are beyond the scope of this answer, but you can check out this video and this post for more info.
The moral of the story is: don't assume that Swift will lay out your structs as written. Swift can reorder struct members and add padding or arbitrary metadata, and it gives you practically no control over this. Instead, declare the structs you need to use in your Metal code in a C or Objective-C file and import them via a bridging header. If you want to use protocols to make it easier to address your structs polymorphically, you need to be prepared to copy them member-wise into your regular old C structs, and prepared to pay the memory cost that that convenience entails.
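As a rough sketch of that last suggestion (file and type names are mine, not from the question): a plain C declaration of the vertex layout, shared with Swift through the bridging header and with the shaders through a common header:
/* PointVertexTypes.h - imported via the bridging header.
 * A plain C struct has a fixed 8-byte layout with no Swift
 * existential-container overhead. */
#include <stdint.h>

typedef struct
{
    uint32_t pointId;
    uint32_t relativePosition;
} PointVertexC;
On the Swift side you would copy each protocol-typed vertex member-wise into a [PointVertexC] and build the buffer from that array, so MemoryLayout<PointVertexC>.stride * vertices.count really matches what makeBuffer(bytes:length:options:) receives.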

Why is "no code allowed to be all ones" in libjpeg's Huffman decoding?

I'm trying to satisfy myself that METEOSAT images I'm getting from their FTP server are actually valid images. My doubt arises because all the tools I've used so far complain about "Bogus Huffman table definition" - yet when I simply comment out that error message, the image appears quite plausible (a greyscale segment of the Earth's disc).
From https://github.com/libjpeg-turbo/libjpeg-turbo/blob/jpeg-8d/jdhuff.c#L379:
while (huffsize[p]) {
  while (((int) huffsize[p]) == si) {
    huffcode[p++] = code;
    code++;
  }
  /* code is now 1 more than the last code used for codelength si; but
   * it must still fit in si bits, since no code is allowed to be all ones.
   */
  if (((INT32) code) >= (((INT32) 1) << si))
    ERREXIT(cinfo, JERR_BAD_HUFF_TABLE);
  code <<= 1;
  si++;
}
If I simply comment out the check, or add a check for huffsize[p] to be nonzero (as in the containing loop's controlling expression), then djpeg manages to convert the image to a BMP which I can view with few problems.
Why does the comment claim that all-ones codes are not allowed?
It claims that because they are not allowed. That doesn't mean that there can't be images out there that don't comply with the standard.
The reason they are not allowed is this (from the standard):
Making entropy-coded segments an integer number of bytes is performed
as follows: for Huffman coding, 1-bits are used, if necessary, to pad
the end of the compressed data to complete the final byte of a
segment.
If an all-ones code were allowed, then you could end up with an ambiguity in the last byte of compressed data, where the padding 1-bits could be mistaken for another coded symbol.
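To see it concretely, here is a toy C sketch (mine, not from libjpeg) that assigns canonical codes from a hypothetical counts-per-length table; the same condition the library checks fires exactly when the all-ones pattern gets handed out:
#include <stdio.h>

int main(void)
{
    /* Hypothetical table: one 1-bit code, one 2-bit code, two 3-bit codes.
     * A legal table would stop after one 3-bit code (110). */
    int counts[5] = { 0, 1, 1, 2, 0 };
    unsigned code = 0;

    for (int si = 1; si <= 4; si++) {
        for (int i = 0; i < counts[si]; i++) {
            printf("length %d: code ", si);
            for (int b = si - 1; b >= 0; b--)
                putchar(((code >> b) & 1) ? '1' : '0');
            putchar('\n');
            code++;
        }
        if (code >= (1u << si)) {   /* same check as JERR_BAD_HUFF_TABLE */
            printf("bogus: the length-%d code was all ones\n", si);
            break;
        }
        code <<= 1;
    }
    return 0;
}
The second 3-bit code comes out as 111, so a final byte padded with 1-bits could be read as one more symbol instead of as padding; that is the ambiguity the restriction avoids.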

iOS Simulator memory alignment

I am testing alignment and I noticed something strange with the iOS Simulator (Xcode 4.3.2 and Xcode 4.5).
On the iOS Simulator, structures are aligned to an 8-byte boundary even when __attribute__ ((aligned (4))) is used to force a 4-byte boundary. Note that the struct is padded with 0x00000001 at the end to reach the 8-byte boundary.
If the myStruct variable is defined at global scope, then the Simulator aligns it to a 4-byte boundary, so it may be something related to the stack.
The Simulator is i386, so it is 32-bit and should be aligning to a 4-byte boundary. So what would be the reason? Why is it aligning to a 64-bit boundary? Is it a feature or a bug?
(I know it is not necessary to struggle with the Simulator, but it may cause you to get stuck on subtle problems.)
typedef struct myStruct
{
    int a;
    int b;
} myStruct;
//} __attribute__ ((aligned (4))) myStruct;

- (void)alignmentTest
{
    // Offset 16*n (0x2fdfe2f0)
    int __attribute__ ((aligned (16))) force16ByteBoundary = 0x01020304;
    // Offset 16*n-4 (0x2fdfe2ec)
    int some4Byte = 0x09080706;
    // Offset 16*n-12 (0x2fdfe2e4)
    myStruct mys;
    mys.a = 0xa1b1c1d1;
    mys.b = 0xf2e28292;
    NSLog(@"&force16ByteBoundary: %p / &some4Byte: %p / &mys: %p",
          &force16ByteBoundary, &some4Byte, &mys);
}
(EDIT: Optimizations are off, -O0.)
Simulator(iOS 5.1) results;
(lldb) x `&mys` -fx
0xbfffda60: 0xa1b1c1d1 0xf2e28292 0x00000001 0x09080706
0xbfffda70: 0x01020304
&force16ByteBoundary: 0xbfffda70 / &some4Byte: 0xbfffda6c / &mys: 0xbfffda60
Device(iOS 5.1) results;
(lldb) x `&mys` -fx
0x2fdfe2e4: 0xa1b1c1d1 0xf2e28292 0x09080706 0x01020304
&force16ByteBoundary: 0x2fdfe2f0 / &some4Byte: 0x2fdfe2ec / &mys: 0x2fdfe2e4
(NEW FINDINGS)
- On the Simulator and the device:
  - Building for Release or Debug makes no difference to the alignments.
  - Local or global variables of long long and double types are aligned to an 8-byte boundary, although I expected them to be aligned to a 4-byte boundary.
  - There is no problem with global variables of struct type.
- On the Simulator:
  - Local variables of struct type are aligned to an 8-byte boundary even when the struct has only a single char member.
(EDIT)
I could only find the "Data Types and Data Alignment" documentation for iOS here.
(Also, they could be inferred from ILP32 alignments here.)
Typically, the alignment attributes only affect the relative alignment of items inside a struct. This allows backward-compatibility with code that wants to just bulk-copy data into a structure directly from the network or binary file.
The alignment attributes won't affect the alignment of local variables allocated on the stack. The ordering and alignment of items on the stack is not guaranteed and will generally be aligned optimally for each item for the device. So, if a 386-based device can fetch a 64-bit long-long from memory in a single operation by 8-byte aligning them, it will do so. Some processors actually lose dramatic amounts of performance if data is not fully aligned. Some processors can throw exceptions for attempting to read data that is not properly aligned.
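A small C sketch of that distinction (my own illustration, not from the question): the attribute is part of the type's layout, which you can inspect with sizeof and alignof, while how separate locals are placed relative to each other in a frame is the compiler's choice:
#include <stdio.h>
#include <stdalign.h>

/* Same members, different declared alignment of the type. */
typedef struct { int a; int b; } Plain;
typedef struct { int a; int b; } __attribute__ ((aligned (16))) Aligned16;

int main(void)
{
    /* The attribute changes the type's size and alignment requirement... */
    printf("Plain:     size %zu, align %zu\n", sizeof(Plain), alignof(Plain));
    printf("Aligned16: size %zu, align %zu\n", sizeof(Aligned16), alignof(Aligned16));

    /* ...but the ordering and padding of separate locals in the frame is
     * up to the compiler, so the gaps between their addresses prove little. */
    int x = 1;
    Plain p = { 2, 3 };
    int y = 4;
    printf("&x=%p &p=%p &y=%p\n", (void *)&x, (void *)&p, (void *)&y);
    return 0;
}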

Best technique for iPad 1 vs iPad 2 GPU determination?

The performance of the iPad 2 GPU is way better than the iPad 1's. I'd like to branch on this in my app and add some extra graphical subtlety when I know the GPU can handle it.
So I'd like to be able to detect the distinction between the iPad 1 and 2 (and later), ideally using something as close to capability detection as I can. There are plenty of unrelated things I could switch on (presence of a camera, etc.), but ideally I'd like to find something, maybe an OpenGL capability, that distinguishes the GPU more directly.
This Apple page doesn't list anything useful for iPad 1 vs 2, and this article talks about benchmarking and GPU arch differences but doesn't pinpoint anything that looks like I can query directly (e.g. number of texture units or whatever).
Anyone have any thoughts on how to do this, or am I missing something obvious? Thanks.
One distinction you can query for is maximum texture size. On iPad 2 and iPhone 4S, the maximum texture size is 4096 x 4096, where on all other iOS devices it's 2048 x 2048. It would seem to me to be a safe assumption that future, more powerful iOS devices would also have a maximum texture size at least this large.
To query for the maximum texture size, first create your OpenGL ES context, then set it as the current context and run the following query:
GLint maxTextureSize;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
On my iPhone 4, this returns 2048 in maxTextureSize, but on my iPad 2 and iPhone 4S this gives back the value of 4096.
You can also test for the presence of some new extensions that the iPad 2 supports, such as EXT_shadow_samplers (more are documented in "What's New in iOS: iOS 5.0"), but those tests will only work on iOS 5.0. Stragglers still on iOS 4.x won't have those capabilities registered.
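A quick sketch of that extension test in C (assuming a current context and <string.h>; a plain substring check is adequate for this particular name):
const char *ext = (const char *)glGetString(GL_EXTENSIONS);
int hasShadowSamplers = (ext != NULL && strstr(ext, "GL_EXT_shadow_samplers") != NULL);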
Today, with more GPUs available, here is what I came up with for my own needs.
enum GpuClass {
    kGpuA5 = 0,
    kGpuA6,
    kGpuA7,
    kGpuA8,
    kGpuUnknown,
};

- (enum GpuClass)reportGpuClass {
    NSString *glVersion = [NSString stringWithUTF8String:(char *)glGetString(GL_VERSION)];
    if ([glVersion containsString:@"Apple A5"] || [glVersion containsString:@"S5L8"]) {
        NSLog(@"Running on an A5 GPU");
        return kGpuA5;
    }
    if ([glVersion containsString:@"Apple A6"] || [glVersion containsString:@"IMGSGX5"]) {
        NSLog(@"Running on an A6 GPU");
        return kGpuA6;
    }
    if ([glVersion containsString:@"Apple A7"] || [glVersion containsString:@"G6430"]) {
        NSLog(@"Running on an A7 GPU");
        return kGpuA7;
    }
    if ([glVersion containsString:@"Apple A8"] || [glVersion containsString:@"GXA6850"]) {
        NSLog(@"Running on an A8 GPU");
        return kGpuA8;
    }
    return kGpuUnknown;
}
You may further differentiate between specific chips by specifying fuller version strings, e.g. IMGSGX543 instead of just IMGSGX5.
