Weird case with MemoryLayout using a struct protocol, different size reported - ios

I'm working on a drawing engine using Metal. I am reworking a previous version, so I'm starting from scratch.
I was getting the error: Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3).
After some debugging I placed the blame on my drawPrimitives routine, and I found the case quite interesting.
I will have a variety of brushes, and all of them will work with brush-specific vertex info.
So I thought, why not have all the brushes respond to a protocol?
The protocol for the vertices will be this:
protocol MetalVertice {}
And the Vertex info used by this specific brush will be:
struct PointVertex: MetalVertice {
    var pointId: UInt32
    let relativePosition: UInt32
}
The brush can be called either by giving it previously created vertices or by calling a function to create those vertices. Either way, the real drawing happens in the vertex function:
var vertices: [PointVertex] = [PointVertex](repeating: PointVertex(pointId: 0,
                                                                   relativePosition: 0),
                                            count: totalVertices)
for (verticeIdx, pointIndex) in pointsIndices.enumerated() {
    vertices[verticeIdx].pointId = UInt32(pointIndex)
}
for vertice in vertices {
    print("size: \(MemoryLayout.size(ofValue: vertice))")
}
self.renderVertices(vertices: vertices,
                    forStroke: stroke,
                    inDrawing: drawing,
                    commandEncoder: commandEncoder)
return vertices
}
func renderVertices(vertices: [MetalVertice], forStroke stroke: LFStroke, inDrawing drawing:LFDrawing, commandEncoder: MTLRenderCommandEncoder) {
    if vertices.count > 1 {
        print("vertices a escribir: \(vertices.count)")
        print("stride: \(MemoryLayout<PointVertex>.stride)")
        print("size of array \(MemoryLayout.size(ofValue: vertices))")
        for vertice in vertices {
            print("ispointvertex: \(vertice is PointVertex)")
            print("size: \(MemoryLayout.size(ofValue: vertice))")
        }
    }
    let vertexBuffer = LFDrawing.device.makeBuffer(bytes: vertices,
                                                   length: MemoryLayout<PointVertex>.stride * vertices.count,
                                                   options: [])
This was the issue: calling this specific code produces these results in the console:
size: 8
size: 8
vertices a escribir: 2
stride: 8
size of array 8
ispointvertex: true
size: 40
ispointvertex: true
size: 40
In the previous function the size of the vertices is 8 bytes, but for some reason, when they enter the next function, they turn into 40 bytes, so the buffer is incorrectly constructed.
If I change the function signature to:
func renderVertices(vertices: [PointVertex], forStroke stroke: LFStroke, inDrawing drawing:LFDrawing, commandEncoder: MTLRenderCommandEncoder) {
The vertices are correctly reported as 8 bytes long and the draw routine works as intended.
Am I missing anything? Is the MetalVertice protocol introducing some noise?

In order to fulfill the requirement that value types conforming to protocols be able to perform dynamic dispatch (and also in part to ensure that containers of protocol types are able to assume that all of their elements are of uniform size), Swift uses what are called existential containers to hold the data of protocol-conforming value types alongside metadata that points to the concrete implementations of each protocol. If you've heard the term protocol witness table, that's what's getting in your way here.
The particulars of this are beyond the scope of this answer, but you can check out this video and this post for more info.
The moral of the story is: don't assume that Swift will lay out your structs as written. Swift can reorder struct members and add padding or arbitrary metadata, and it gives you practically no control over this. Instead, declare the structs you need to use in your Metal code in a C or Objective-C file and import them via a bridging header. If you want to use protocols to make it easier to address your structs polymorphically, you need to be prepared to copy them member-wise into your regular old C structs, and prepared to pay the memory cost that that convenience entails.
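To see the size difference concretely, here is a minimal sketch (not the asker's project code; the 40-byte figure assumes a 64-bit device, and device stands in for an MTLDevice) that reproduces the 8-byte vs. 40-byte discrepancy and shows one way to unbox the existentials before building a buffer:
protocol MetalVertice {}

struct PointVertex: MetalVertice {
    var pointId: UInt32
    let relativePosition: UInt32
}

// The concrete struct is just two UInt32s: 8 bytes.
print(MemoryLayout<PointVertex>.stride)    // 8

// The protocol type is an existential container: a 3-word inline buffer plus
// pointers to the type metadata and the witness table, i.e. 40 bytes on 64-bit.
print(MemoryLayout<MetalVertice>.stride)   // 40

// If a function must accept [MetalVertice], unbox to the concrete type before
// copying bytes into a buffer; copying the array of existentials copies the
// 40-byte boxes instead of the 8-byte vertices.
let boxed: [MetalVertice] = [PointVertex(pointId: 1, relativePosition: 0)]
let unboxed = boxed.compactMap { $0 as? PointVertex }
// let vertexBuffer = device.makeBuffer(bytes: unboxed,
//                                      length: MemoryLayout<PointVertex>.stride * unboxed.count,
//                                      options: [])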

Related

How to find biggest variant in an enum in Rust?

I'm trying to improve the performance of a rust program, which requires me to reduce the size of some large enums. For example
enum EE {
    A,          // 0
    B(i32),     // 4
    C(i64),     // 8
    D(String),  // 24
    E {         // 16
        x: i64,
        y: i32,
    },
}

fn main() {
    println!("{}", std::mem::size_of::<EE>()); // 32
}
prints 32. But if I want to know the size of EE::A, I get a compile error
error[E0573]: expected type, found variant `EE::A`
--> src/main.rs:14:40
|
14 | println!("{}", std::mem::size_of::<EE::A>());
| ^^^^^
| |
| not a type
| help: try using the variant's enum: `crate::EE`
error: aborting due to previous error
error: could not compile `play_rust`.
Is there a way to find out which variant takes the most space?
No, there is no way to get the size of just one variant of an enum. The best you can do is get the size of what the variant contains, as if it were a standalone struct:
println!("sizeof EE::A: {}", std::mem::size_of::<()>()); // 0
println!("sizeof EE::B: {}", std::mem::size_of::<i32>()); // 4
println!("sizeof EE::C: {}", std::mem::size_of::<i64>()); // 8
println!("sizeof EE::D: {}", std::mem::size_of::<String>()); // 24
println!("sizeof EE::E: {}", std::mem::size_of::<(i64, i32)>()); // 16
Even this isn't especially useful because it includes padding bytes that may be used to store the tag; as you point out, the size of the enum can be reduced to 16 if D is shrunk to a single pointer, but you can't know that from looking at just the sizes. If y were instead defined as i64, the size of each variant would be the same, but the size of the enum would need to be 24. Alignment is another confounding factor that makes the size of an enum more complex than just "the size of the largest variant plus the tag".
Of course, this is all highly platform-dependent, and your code should not rely on any enum having a particular layout (unless you can guarantee it with a #[repr] annotation).
If you have a particular enum you're worried about, it's not difficult to get the size of each contained type. Clippy also has a lint for enums with extreme size differences between variants. However, I don't recommend using size alone to make manual optimizations to enum layouts, or boxing things that are only a few pointers in size -- indirection suppresses other kinds of optimizations the compiler may be able to do. If you prioritize minimal space usage you may accidentally make your code much slower in the process.

What is the most memory-efficient array of nullable vectors when most of the second dimension will be empty?

I have a large fixed-size array of variable-sized arrays of u32. Most of the second dimension arrays will be empty (i.e. the first array will be sparsely populated). I think Vec is the most suitable type for both dimensions (Vec<Vec<u32>>). Because my first array might be quite large, I want to find the most space-efficient way to represent this.
I see two options:
I could use a Vec<Option<Vec<u32>>>. I'm guessing that, as Option is a tagged union, this would result in each cell being sizeof(Vec<u32>) rounded up to the next word boundary for the tag.
I could directly use Vec::with_capacity(0) for all cells. Does an empty Vec allocate zero heap until it's used?
Which is the most space-efficient method?
Actually, both Vec<Vec<T>> and Vec<Option<Vec<T>>> have the same space efficiency.
A Vec contains a pointer that will never be null, so the compiler is smart enough to recognize that in the case of Option<Vec<T>>, it can represent None by putting 0 in the pointer field. What is the overhead of Rust's Option type? contains more information.
What about the backing storage the pointer points to? A Vec doesn't allocate (same link as the first) when you create it with Vec::new or Vec::with_capacity(0); in that case, it uses a special, non-null "empty pointer". Vec only allocates space on the heap when you push something or otherwise force it to allocate. Therefore, the space used both for the Vec itself and for its backing storage are the same.
Vec<Vec<T>> is a decent starting point. Each entry costs 3 pointers, even if it is empty, and for filled entries there can be additional per-allocation overhead. But depending on which trade-offs you're willing to make, there might be a better solution.
Vec<Box<[T]>> This reduces the size of an entry from 3 pointers to 2 pointers. The downside is that changing the number of elements in a box is both inconvenient (convert to and from Vec<T>) and more expensive (reallocation).
HashMap<usize, Vec<T>> This saves a lot of memory if the outer collection is sufficiently sparse. The downsides are higher access cost (hashing, scanning) and a higher per element memory overhead.
If the collection is only filled once and you never resize the inner collections, you could use a split data structure. This not only reduces the per-entry size to 1 pointer, it also eliminates the per-allocation overhead:
struct Nested<T> {
    data: Vec<T>,
    indices: Vec<usize>, // points after the last element of the i-th slice
}

impl<T> Nested<T> {
    fn get_range(&self, i: usize) -> std::ops::Range<usize> {
        assert!(i < self.indices.len());
        if i > 0 {
            self.indices[i - 1]..self.indices[i]
        } else {
            0..self.indices[i]
        }
    }

    pub fn get(&self, i: usize) -> &[T] {
        let range = self.get_range(i);
        &self.data[range]
    }

    pub fn get_mut(&mut self, i: usize) -> &mut [T] {
        let range = self.get_range(i);
        &mut self.data[range]
    }
}
For additional memory savings you can reduce the indices to u32 limiting you to 4 billion elements per collection.

What does `UnsafeMutablePointer.initialize()` actually do?

The following is based on my guess. Someone please point out the parts that I understand incorrectly.
Say I have a class called Class128Bits, an instance of which occupies 128 bits, and my program runs on a 64-bit computer.
First, I call let pointer = UnsafeMutablePointer<Class128Bits>.allocate(capacity: 2)
The memory layout should look like this:
000-063    64 bits   chaos
064-127    64 bits   chaos
128-255   128 bits   chaos
256-383   128 bits   chaos
If I call pointer.pointee = aClass128Bits, it crashes because the pointers in the first two slots have not been initialized yet; accessing what they point to leads to unpredictable results.
But if I call pointer.initialize(to: aClass128Bits, count: 2), the pointers could be initialized like this:
000-063   address of offset 128
064-127   address of offset 256
128-255   a copy of aClass128Bits
256-383   a copy of aClass128Bits
Then any accesses will be safe.
However this cannot explain why UnsafeMutablePointer<Int> does not crash.
Original
The case I am facing:
The pointer to Int works fine, but the one to String crashes.
I know that I need to initialize it like this:
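Presumably something like the following, matching the initialize example in the last answer below, where "42" appears once in initialize(to:) and again in the assignment to pointee:
let pointer = UnsafeMutablePointer<String>.allocate(capacity: 1)
pointer.initialize(to: "42")   // "42" the first time
pointer.pointee = "42"         // and "42" again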
But I can't see the reason why I need to pass "42" twice.
In C, I might do something similar, like this:
char *pointer = (char *)malloc(3 * sizeof(char));
memcpy(pointer, "42", 3);
free(pointer);
If allocate equals malloc, free equals deallocate, memcpy equals pointee{ set },
then what do initialize and deinitialize actually do?
And why does my code crash?
let pointer0 = UnsafeMutablePointer<String>.allocate(capacity: 1)
let pointer1 = UnsafeMutablePointer<Int>.allocate(capacity: 1)
Let's check the size of both:
MemoryLayout.size(ofValue: pointer0) // 8
MemoryLayout.size(ofValue: pointer1) // 8
Let's check the value of .pointee:
pointer0.pointee // CRASH!!!
while
pointer1.pointee // some random value
Why? The answer is as simple as it can be. We allocated 8 bytes, independently of the "associated" type. Now it is clear that 8 bytes of memory are not enough to store any String; the underlying storage must be referenced indirectly. But there are some 8 random bytes there ... Loading whatever is in memory at the address represented by those 8 random bytes as a String will most likely crash :-)
Why didn't it crash in the second case? An Int value is 8 bytes long, and any 8 bytes can be interpreted as an Int value.
Let's try it in a playground:
import Foundation
let pointer = UnsafeMutablePointer<CFString>.allocate(capacity: 1)
let us = Unmanaged<CFString>.passRetained("hello" as CFString)
pointer.initialize(to: us.takeRetainedValue())
print(pointer.pointee)
us.release()
// if this playground crashes, try to run it again and again ... :-)
print(pointer.pointee)
Look what it prints for me :-)
hello
(
"<__NSCFOutputStream: 0x7fb0bdebd120>"
)
There is no miracle behind it. pointer.pointee tries to interpret whatever is in memory at the address stored in our pointer as a value of its associated type. It never crashes for Int because any 8 contiguous bytes somewhere in memory can be interpreted as an Int.
Swift uses ARC, but creating an Unsafe[Mutable]Pointer doesn't allocate any memory for an instance of T, and destroying it doesn't deallocate any memory for it.
Typed memory must be initialized before use and deinitialized after use. This is done using the initialize and deinitialize methods, respectively. Deinitialization is only required for non-trivial types. That said, including deinitialization is a good way to future-proof your code in case you change to something non-trivial.
Why doesn't assignment to .pointee with Int value crash?
initialize stores the address of the value.
Assignment to pointee updates the value at the stored address.
Without initializing, it will most likely crash too; the probability is just lower, since only 8 bytes of memory at some random address are modified.
Trying this:
import Darwin

var k = Int16.max.toIntMax()
typealias MyTupple = (Int32, Int32, Int8, Int16, Int16)
var arr: [MyTupple] = []
repeat {
    let p = UnsafeMutablePointer<MyTupple>.allocate(capacity: 1)
    if k == 1 {
        print(MemoryLayout.size(ofValue: p), MemoryLayout.alignment(ofValue: p), MemoryLayout.stride(ofValue: p))
    }
    arr.append(p.pointee)
    k -= 1
    defer {
        p.deallocate(capacity: 1)
    }
} while k > 0

let s = arr.reduce([:]) { (r, v) -> [String: Int] in
    var r = r
    let c = r["\(v.0),\(v.1),\(v.2),\(v.3)"] ?? 0
    r["\(v.0),\(v.1),\(v.2),\(v.3)"] = c + 1
    return r
}
print(s)
I received
8 8 8
["0,0,-95,4104": 6472, "0,0,0,0": 26295]
Program ended with exit code: 0
It doesn't look very random, does it? That explains why a crash with a typed pointer to Int is very unlikely.
One reason you need initialize(), and maybe for now the only one, is ARC.
It's easier to see how ARC works by thinking about local-scope variables:
func test() {
    var refVar: RefType = initValue //<-(1)
    //...
    refVar = newValue //<-(2)
    //...
    //<-(3) just before exiting the local scope
}
For a usual assignment like the one at (2), Swift generates some code like this:
swift_retain(_newValue)
swift_release(_refVar)
_refVar = _newValue
(Assume _refVar and _newValue are unmanaged pseudo vars.)
Retain means incrementing the reference count by 1, and release means decrementing the reference count by 1.
But think about what happens with the initial value assignment at (1).
If the usual assignment code were generated, the code might crash at this line:
swift_release(_refVar)
because the newly allocated region for a var may be filled with garbage, so swift_release(_refVar) cannot be executed safely.
Filling the new region with zero (null) and having release safely ignore null could be one solution, but it's sort of redundant and not efficient.
So, Swift generates this sort of code for initial value assignment:
(For already-retained values, that is, values owned by you, if you know the ownership model:)
_refVar = _initValue
(For unretained values, meaning you have no ownership yet:)
swift_retain(_initValue)
_refVar = _initValue
This is initialize: it does not release the garbage data; it just assigns an initial value, retaining it if needed.
(The above explanation of "usual assignment" is a little simplified; Swift omits swift_retain(_newValue) when it is not needed.)
When exiting the local scope at (3), Swift just generates this sort of code:
swift_release(_refVar)
So, this is deinitialize.
Of course, you know retaining and releasing are not needed for primitive types like Int, so initialize and deinitialize may do nothing for such types.
And when you define a value type that includes some reference-type properties, Swift generates initialize and deinitialize procedures specialized for the type.
The local-scope example applies to regions allocated on the stack; the initialize() and deinitialize() of UnsafeMutablePointer do the same work for regions allocated on the heap.
And Swift is evolving so swiftly that you might find another reason for needing initialize() and deinitialize() in the future, so you'd better make it a habit to initialize() and deinitialize() all allocated UnsafeMutablePointers, whatever their Pointee type.
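To make that concrete, here is a minimal sketch of the full allocate / initialize / use / deinitialize / deallocate lifecycle for a value type that contains a reference, so deinitialize actually has work to do. The Tracker and Record types are made up for illustration, and the calls use the current API spellings (deinitialize(count:), deallocate()) rather than the older ones shown elsewhere in these answers:
final class Tracker {
    deinit { print("Tracker released") }
}

struct Record {
    var id: Int
    var tracker: Tracker   // a reference-type property, so Record is non-trivial
}

let p = UnsafeMutablePointer<Record>.allocate(capacity: 1)

// initialize: no bogus release of garbage, just store the value (retaining the Tracker).
p.initialize(to: Record(id: 1, tracker: Tracker()))

// The memory is now in a valid state, so ordinary reads and writes are safe.
p.pointee.id = 2
print(p.pointee.id)        // 2

// deinitialize: releases the Tracker ("Tracker released" is printed).
p.deinitialize(count: 1)

// deallocate: returns the raw memory.
p.deallocate()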
From the documentation it is possible to conclude that .initialize() is a method that:
Initializes memory starting at self with the elements of source.
And .deinitialize() is a method that:
De-initializes the count Pointees starting at self, returning their
memory to an uninitialized state.
We should understand that when we use UnsafeMutablePointer, we must manage memory on our own, and the methods described above help us do this.
So in your case, let's analyze the example you provided:
let pointer = UnsafeMutablePointer<String>.allocate(capacity: 1)
// allocate a memory space
pointer.initialize(to: "42")
// initialise memory
pointer.pointee // "42"
// reveals what is in the pointee location
pointer.pointee = "43"
// change the contents of the memory
pointer.deinitialize()
// return the pointer to an uninitialized state
pointer.deallocate(capacity: 1)
// deallocate the memory
So your code crashes because you do not initialize the memory before trying to use the value.
Previously, in Objective-C, when working with objects we always used [[MyClass alloc] init].
In this case:
alloc: allocates a part of memory to hold the object, and returns the pointer.
init: sets up the initial parameters of the object and returns it.
So basically .initialize() sets the value of the allocated memory. When you create an object with alloc alone, you only get a reference to an empty part of memory in the heap. When you call .initialize(), you set the value of that memory allocation in the heap.
Nice article about the pointers.

Read of memory allocation returns spurious results if, following read, free() is called - why does this happen? (embedded)

Programming on an stm32f4, some strange behaviour is observed:
Data is allocated using realloc, which is called every second or so, as such: ptr = realloc(ptr, sizeof)
Values are read into the data - it has been confirmed that: A) The indexing of the array is correct and B) Immediately following each read of values into memory the array holds the correct values.
Upon reading the array the code fails to produce proper output (outputs 0s the vast majority of the time) if free(ptr) is called in any code following the read. When free(ptr) is not called the code functions properly. It seems that the sequential nature of C breaks down in this instance?
Immediately following each read of values into memory the array holds the correct values regardless of any 'free' calls. Realloc is used because this interrupt is called repeatedly. The 'random pointer' has been set to NULL when initialised, before the pointer is realloced. This is an embedded program on a stm32f4.
Being inexperienced with embedded C I can only speculate, but I imagine the cause may be faulty optimisation?
Is this behaviour known? I am aware that it is best practice to avoid malloc etc., but due to the large variance in the amount of data potentially held in this application, the flexibility is required.
The code mallocs using pointers contained within a global struct. The following code is the offending material:
structContainingMemoryPointer storedData;
numberOfInts = 0;

// *********** Get data if interrupt conditions state to do so - contained within interrupt ***********
interrupt {
    if (SpecificInterrupt) {
        numberOfInts++;
        storedData.Arrayptr =
            realloc(storedData.Arrayptr,
                    sizeof(int) * storedData.numberOfInts * 2);
        // Store the value of actualTemp
        storedData.Arrayptr[storedData.numberOfInts - 1] = actualTemp;
        // Step through the temperature values array and send to USART
        for (arrayStep = 0; arrayStep < storedData.numberOfTempAllocations;
             arrayStep++) {
            // Convert to string and send
            sprintf(valueString, ":%d", storedData.temperature[arrayStep]);
            USART_puts(USART2, valueString);
        }
    }

    // *********** free memory *************
    free(storedDataStruct.Arrayptr);
    storedDataStruct.Arrayptr = NULL;
    // End of program, no return from this point to previous points.

Swift numerics and CGFloat (CGPoint, CGRect, etc.)

I'm finding Swift numerics particularly clumsy when, as so often happens in real life, I have to communicate with Cocoa Touch with regard to CGRect and CGPoint (e.g., because we're talking about something's frame or bounds).
CGFloat vs. Double
Consider the following innocent-looking code from a UIViewController subclass:
let scale = 2.0
let r = self.view.bounds
var r2 = CGRect()
r2.size.width = r.size.width * scale
This code fails to compile, with the usual mysterious error on the last line:
Could not find an overload for '*' that accepts the supplied arguments
This error, as I'm sure you know by now, indicates some kind of impedance mismatch between types. r.size.width arrives as a CGFloat, which will interchange automatically with a Swift Float but cannot interoperate with a Swift Double variable (which, by default, is what scale is).
The example is artificially brief, so there's an artificially simple solution, which is to cast scale to a Float from the get-go. But when many variables drawn from all over the place are involved in the calculation of a proposed CGRect's elements, there's a lot of casting to do.
Verbose Initializer
Another irritation is what happens when the time comes to create a new CGRect. Despite the documentation, there's no initializer with values but without labels. This fails to compile because we've got Doubles:
let d = 2.0
var r3 = CGRect(d, d, d, d)
But even if we cast d to a Float, we don't compile:
Missing argument labels 'x:y:width:height:' in call
So we end up falling back on CGRectMake, which is no improvement on Objective-C. And sometimes CGRectMake and CGSizeMake are no improvement. Consider this actual code from one of my apps:
let kSEP : Float = 2.0
let intercellSpacing = CGSizeMake(kSEP, kSEP);
In one of my projects, that works. In another, it mysteriously fails — the exact same code! — with this error:
'NSNumber' is not a subtype of 'CGFloat'
It's as if, sometimes, Swift tries to "cross the bridge" by casting a Float to an NSNumber, which of course is the wrong thing to do when what's on the other side of the bridge expects a CGFloat. I have not yet figured out what the difference is between the two projects that causes the error to appear in one but not the other (perhaps someone else has).
NOTE: I may have figured out that problem: it seems to depend on the Build Active Architecture Only build setting, which in turn suggests that it's a 64-bit issue. Which makes sense, since Float would not be a match for CGFloat on a 64-bit device. That means that the impedance mismatch problem is even worse than I thought.
Conclusion
I'm looking for practical words of wisdom on this topic. I'm thinking someone may have devised some CGRect and CGPoint extension that will make life a lot easier. (Or possibly someone has written a boatload of additional arithmetic operator function overloads, such that combining CGFloat with Int or Double "just works" — if that's possible.)
Explicitly typing scale as CGFloat, as you have discovered, is indeed the way to handle the typing issue in Swift. For reference, for others:
let scale: CGFloat = 2.0
let r = self.view.bounds
var r2 = CGRect()
r2.size.width = r.width * scale
Not sure how to answer your second question, you may want to post it separately with a different title.
Update:
Swift creator and lead developer Chris Lattner had this to say on this issue on the Apple Developer Forum on July 4th, 2014:
What is happening here is that CGFloat is a typealias for either Float
or Double depending on whether you're building for 32 or 64-bits.
This is exactly how Objective-C works, but is problematic in Swift
because Swift doesn't allow implicit conversions.
We're aware of
this problem and consider it to be serious: we are evaluating several
different solutions right now and will roll one out in a later beta.
As you notice, you can cope with this today by casting to Double.
This is inelegant but effective :-)
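For example, a sketch of that workaround (my example, not code from the quote): do the arithmetic in Double and convert once when assigning back into a CGFloat field:
import CoreGraphics

let scale = 2.0                                          // inferred as Double
let r = CGRect(x: 0, y: 0, width: 320, height: 480)      // stands in for self.view.bounds
var r2 = CGRect()
r2.size.width = CGFloat(Double(r.size.width) * scale)    // compute in Double, convert once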
Update In Xcode 6 Beta 5:
A CGFloat can be constructed from any Integer type (including the
sized integer types) and vice-versa. (17670817)
I wrote a library that handles operator overloading to allow interaction between Int, CGFloat and Double.
https://github.com/seivan/ScalarArithmetic
As of Beta 5, here's a list of things that you currently can't do with vanilla Swift.
https://github.com/seivan/ScalarArithmetic#sample
I suggest running the test suite with and without ScalarArithmetic just to see what's going on.
I created an extension for Double and Int that adds a computed CGFloatValue property to them.
extension Double {
    var CGFloatValue: CGFloat {
        get {
            return CGFloat(self)
        }
    }
}

extension Int {
    var CGFloatValue: CGFloat {
        get {
            return CGFloat(self)
        }
    }
}
You would access it by using let someCGFloat = someDoubleOrInt.CGFloatValue
Also, as for your CGRect initializer: you get the missing argument labels error because you have left off the labels. You need CGRect(x: d, y: d, width: d, height: d); you can't leave the labels out unless there is only one argument.
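Putting the extension and the labeled initializer together, usage might look like this (someDouble and someInt are placeholder values, and the CGFloatValue extensions above are assumed to be in scope):
import CoreGraphics

let someDouble = 2.0
let someInt = 10

let rect = CGRect(x: someDouble.CGFloatValue,
                  y: someDouble.CGFloatValue,
                  width: someInt.CGFloatValue,
                  height: someInt.CGFloatValue)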
