SceneKit Error: C3DMeshElementSetPrimitives invalid index buffer size - ios

I get the following error:
[SceneKit] Error: C3DMeshElementSetPrimitives invalid index buffer size
It is logged on every frame, so the console fills up with errors.
Do you know how to solve this issue?
Thanks

Check that the indices array pairs the vertices in the vertices array properly.
Suppose you wanted to draw a square using lines with 4 vertices like this:
v1 ------ v0
|         |
|         |
v2 ------ v3
The vertices array would be v0, v1, v2, v3.
The lines pairs would be: (v0,v1) , (v1,v2) , (v2,v3) , (v3,v0)
So indices array would be: 0,1,1,2,2,3,3,0
In code, this is an example that draws a node that contains a red circle using lines:
func circlePathNode() -> SCNNode {
    var vertices: [SCNVector3] = []
    let N = 200
    var indices: [Int16] = []
    // vertices
    for i in 1...N {
        let t = (Float(i-1) / Float(N)) * 2 * .pi
        let v = SCNVector3(cos(t), sin(t), 0)
        vertices.append(v)
    }
    // indices pair vertices
    for i in 1...N-1 {
        indices.append(Int16(i-1))
        indices.append(Int16(i))
    }
    // last:
    indices.append(Int16(N-1))
    indices.append(0)
    let verticesSource = SCNGeometrySource(vertices: vertices)
    let indexData = Data(bytes: indices, count: indices.count * MemoryLayout<Int16>.size) // 2 for 2 bytes each
    let element = SCNGeometryElement(data: indexData,
                                     primitiveType: .line,
                                     primitiveCount: N,
                                     bytesPerIndex: MemoryLayout<Int16>.size)
    let geometry = SCNGeometry(sources: [verticesSource], elements: [element])
    #if os(iOS)
    geometry.firstMaterial?.emission.contents = UIColor.red
    #else
    geometry.firstMaterial?.emission.contents = NSColor.red
    #endif
    return SCNNode(geometry: geometry)
}
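The pairing rule used above can also be isolated into a small helper, which makes it easy to check the index array on its own before handing it to SceneKit. This is only a sketch: closedLineLoopIndices is a hypothetical name, not a SceneKit API, and it assumes the vertices are stored in the order in which the loop visits them.

```swift
// Hypothetical helper (not part of SceneKit): builds the `.line` index pairs
// for a closed loop through `vertexCount` vertices stored in loop order.
func closedLineLoopIndices(vertexCount: Int) -> [Int16] {
    precondition(vertexCount >= 2 && vertexCount <= Int(Int16.max),
                 "vertex count must fit in Int16 indices")
    var indices: [Int16] = []
    indices.reserveCapacity(vertexCount * 2)
    for i in 0..<vertexCount {
        indices.append(Int16(i))                      // start of segment i
        indices.append(Int16((i + 1) % vertexCount))  // end; wraps back to vertex 0
    }
    return indices
}
```

For the square this produces 0,1,1,2,2,3,3,0, and the number of line primitives is always half the number of indices.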

I had the same error when I accidentally passed the index count as the primitive count. Make sure you pass the number of primitives (not the number of indices) to the primitiveCount parameter of the SCNGeometryElement initializer.
Primitive count can be calculated this way:
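func calculatePrimitiveCount(indexCount: Int, primitiveType: SCNGeometryPrimitiveType) -> Int {
    switch primitiveType {
    case .triangles: return indexCount / 3
    case .triangleStrip: return indexCount - 2
    case .line: return indexCount / 2
    case .point: return indexCount
    // Not sure: only holds if every polygon is a quad. Polygon element data
    // also starts with one vertex count per polygon, which is not counted here.
    case .polygon: return indexCount / 4
    @unknown default: return 0 // primitive type not known to this code
    }
}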
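The same relation can be checked in the other direction before creating the element. The sketch below assumes that the error is reported when the byte length of the index data does not match primitiveCount for the given primitive type. PrimitiveKind is a hypothetical stand-in for SCNGeometryPrimitiveType so that the snippet runs without SceneKit, and both function names are made up for illustration.

```swift
// Hypothetical stand-in for SCNGeometryPrimitiveType (polygon omitted on purpose).
enum PrimitiveKind {
    case triangles, triangleStrip, line, point
}

// Number of indices that `primitiveCount` primitives of the given kind need.
func expectedIndexCount(primitiveCount: Int, kind: PrimitiveKind) -> Int {
    switch kind {
    case .triangles: return primitiveCount * 3
    case .triangleStrip: return primitiveCount + 2
    case .line: return primitiveCount * 2
    case .point: return primitiveCount
    }
}

// True when the byte length of the index data matches what the element describes.
func indexBufferIsConsistent(dataLength: Int,
                             bytesPerIndex: Int,
                             primitiveCount: Int,
                             kind: PrimitiveKind) -> Bool {
    let expected = expectedIndexCount(primitiveCount: primitiveCount, kind: kind)
    return dataLength == expected * bytesPerIndex
}
```

For the circle example (400 Int16 indices, so 800 bytes) the check passes with primitiveCount 200 and fails with primitiveCount 400, which is the mistake described above.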

Related

Swift 2D Array Performance

I am working on creating an iOS version of an Android app I created. It involves a lot of two-dimensional array access and assignment, and it ran very quickly in Java. However, when I converted it to Swift, I noticed a very significant slowdown. After some research on two-dimensional Swift arrays, I thought the problem might be coming from the 2D arrays, so I decided to create and time a simple program to test 2D array performance. I compared the execution times of a 2D and a 1D array, and there was a significant difference. Below is the program I used to test performance:
import Foundation
var numberOfItems = 1000
var myArray1 = [[Double]](repeating: [Double](repeating: 0.0, count: numberOfItems), count: numberOfItems)
var myArray2 = [[Double]](repeating: [Double](repeating: 0.0, count: numberOfItems), count: numberOfItems)
var myArray3 = [Double](repeating: 0.0, count: numberOfItems * numberOfItems)
var myArray4 = [Double](repeating: 0.0, count: numberOfItems * numberOfItems)
// 2D array assignment
let start1 = CFAbsoluteTimeGetCurrent()
var x = 0.0
for i in 0..<numberOfItems {
    for j in 0..<numberOfItems {
        myArray1[i][j] = x
        x += 1
    }
}
let diff1 = CFAbsoluteTimeGetCurrent() - start1
print(diff1 * 1000)
// 2D array access and assignment
let start2 = CFAbsoluteTimeGetCurrent()
for i in 0..<numberOfItems {
    for j in 0..<numberOfItems {
        myArray2[i][j] = myArray1[i][j]
    }
}
let diff2 = CFAbsoluteTimeGetCurrent() - start2
print(diff2 * 1000)
// 1D array assignment
var y = 0.0
let start3 = CFAbsoluteTimeGetCurrent()
for i in 0..<(numberOfItems * numberOfItems) {
    myArray3[i] = y
    y += 1
}
let diff3 = CFAbsoluteTimeGetCurrent() - start3
print(diff3 * 1000)
// 1D array access and assignment
let start4 = CFAbsoluteTimeGetCurrent()
for i in 0..<(numberOfItems * numberOfItems) {
    myArray4[i] = myArray3[i]
}
let diff4 = CFAbsoluteTimeGetCurrent() - start4
print(diff4 * 1000)
I ran it on the command line using the -Ounchecked option. I got the following output (in ms, some variation but usually pretty close):
6.0759782791137695
24.2689847946167
2.4139881134033203
1.5819072723388672
Clearly there is a considerable performance difference between the 2D and 1D array implementations, especially when both accessing and assigning.
Is there a way to create a more efficient 2D array in Swift? Performance is important for me in this instance, so is it better to use a 1D array and do some math for indexing?
If you really want to stick to a 2D array then you can use unsafe buffer pointers for faster access. However, 1D arrays are still going to be more efficient. Give this a shot.
// 2D array assignment
myArray1.withUnsafeMutableBufferPointer { outer1 -> Void in
    for i in 0..<numberOfItems {
        outer1[i].withUnsafeMutableBufferPointer { inner1 -> Void in
            for j in 0..<numberOfItems {
                inner1[j] = x
                x += 1
            }
        }
    }
}
// 2D array access and assignment
myArray1.withUnsafeMutableBufferPointer { outer1 -> Void in
    myArray2.withUnsafeMutableBufferPointer { outer2 -> Void in
        for i in 0..<numberOfItems {
            outer1[i].withUnsafeMutableBufferPointer { inner1 -> Void in
                outer2[i].withUnsafeMutableBufferPointer { inner2 -> Void in
                    for j in 0..<numberOfItems {
                        inner2[j] = inner1[j]
                    }
                }
            }
        }
    }
}
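If you go the 1D route, the index math can be hidden behind a subscript so the call sites still read like 2D access. The following is a minimal sketch, not a tuned implementation: Matrix and its members are hypothetical names, and it assumes row-major storage.

```swift
// Hypothetical wrapper: a rows x columns grid stored in one flat, row-major array.
struct Matrix {
    let rows: Int
    let columns: Int
    private(set) var storage: [Double]

    init(rows: Int, columns: Int, repeating value: Double = 0.0) {
        self.rows = rows
        self.columns = columns
        self.storage = [Double](repeating: value, count: rows * columns)
    }

    // Element (row, column) lives at flat index row * columns + column.
    subscript(row: Int, column: Int) -> Double {
        get { return storage[row * columns + column] }
        set { storage[row * columns + column] = newValue }
    }
}
```

Usage then looks like var m = Matrix(rows: 1000, columns: 1000) followed by m[i, j] = x. Whether this matches the raw 1D timings above would have to be measured.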

iOS Image Metadata how to get decimal values as fractions string?

EDIT: Resolved, I answered the question below.
I am using the following to get metadata for PHAssets:
let data = NSData.init(contentsOf: url!)!
if let imageSource = CGImageSourceCreateWithData(data, nil) {
let metadata = CGImageSourceCopyPropertiesAtIndex(imageSource, 0, nil)! as NSDictionary
}
The metadata dictionary has all the values I am looking for. However a few fields like ShutterSpeedValue, ExposureTime which have fractions get printed as decimals:
ExposureTime = "0.05"
ShutterSpeedValue = "4.321956769055745"
When I look at this data on my Mac's preview app and exiftool, it shows:
ExposureTime = 1/20
ShutterSpeedValue = 1/20
How can I get the correct fraction string instead of the decimal string?
EDIT: I tried simply converting the decimal to a fraction string using this code from SO, but the result isn't correct:
func rationalApproximation(of x0 : Double, withPrecision eps : Double = 1.0E-6) -> String {
    var x = x0
    var a = x.rounded(.down)
    var (h1, k1, h, k) = (1, 0, Int(a), 1)
    while x - a > eps * Double(k) * Double(k) {
        x = 1.0/(x - a)
        a = x.rounded(.down)
        (h1, k1, h, k) = (h, k, h1 + Int(a) * h, k1 + Int(a) * k)
    }
    return "\(h)/\(k)"
}
As you notice, the decimal value of ShutterSpeedValue printed as 4.321956769055745 isn't even equal to 1/20.
Resolved.
As per
https://www.dpreview.com/forums/post/54376235
ShutterSpeedValue is defined as APEX value, where:
ShutterSpeed = -log2(ExposureTime)
So -log2(1/20) is 4.3219, just as what I observed.
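So to turn the ShutterSpeedValue back into a fraction string, I use the following. It rounds to the nearest integer instead of using ceil, because 2^4.321956769055745 is about 20.0004 and ceil would give 21; the Int conversion keeps the result from printing as 20.0:
"1/\(Int(pow(2, 4.321956769055745).rounded()))"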
I tested 3 different photos and 1/20, 1/15 and 1/1919 were all correctly calculated using your formula.
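Wrapped into a function, the APEX conversion above might look like the sketch below. exposureTimeString is a hypothetical name, and the handling of exposures of one second or longer (APEX values of zero or below) is my own assumption, not something taken from the linked post.

```swift
import Foundation

// Hypothetical helper: converts an APEX ShutterSpeedValue back into a
// human-readable exposure time, using ExposureTime = 2^(-ShutterSpeedValue).
func exposureTimeString(fromShutterSpeedValue apex: Double) -> String {
    if apex > 0 {
        // Faster than one second: show as 1/denominator, rounded to nearest.
        let denominator = pow(2.0, apex)
        return "1/\(Int(denominator.rounded()))"
    }
    // One second or longer (assumed format): show the duration in seconds.
    return "\(pow(2.0, -apex))"
}
```

With the value from the question, exposureTimeString(fromShutterSpeedValue: 4.321956769055745) gives "1/20".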

Simple swapping code not working in swift

I am trying to transpose a 2D array in Swift, but the swapping does not seem to happen.
The array is unchanged after the transpose. I am working with the following code:
var array_4x4 = [[Int]](count: 4, repeatedValue: [Int](count: 4, repeatedValue: 4))
for i in 0..<4
{
for j in 0..<4
{
let temp = Int(arc4random_uniform((100 + 1)) - 1) + 1
array_4x4[i][j] = temp
}
}
for i in 0..<4
{
for j in 0..<4 // code in this loop is not working
{
let temp = array_4x4[i][j]
array_4x4[i][j] = array_4x4[j][i]
array_4x4[j][i] = temp
}
}
Your nested loop runs over all possible array indices (i, j), which means that
each array element is swapped with the transposed element twice.
For example, when i=1 and j=2, the (1, 2) and the (2, 1) array elements are swapped.
Later, when i=2 and j=1, these elements are swapped back.
As a consequence, the matrix is identical to the original matrix in the end.
The solution is to iterate only over (i, j) pairs with i < j,
i.e. swap only the elements above the diagonal with their
counterpart below the diagonal:
for i in 0..<4 {
    for j in (i + 1)..<4 {
        let temp = array_4x4[i][j]
        array_4x4[i][j] = array_4x4[j][i]
        array_4x4[j][i] = temp
    }
}
Note that the Swift standard library already has a function to
exchange two values:
for i in 0..<4 {
    for j in (i + 1)..<4 {
        swap(&array_4x4[i][j], &array_4x4[j][i])
    }
}
And just for the sake of completeness:
an alternative solution would be to compute the transposed matrix as
a value, and assign it to the same (or a different) variable:
array_4x4 = (0..<4).map { i in (0..<4).map { j in array_4x4[j][i] } }
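For more recent Swift versions, here is a sketch of the upper-triangle approach as a reusable function. As far as I know, Swift 4 and later tend to reject swap(&a[i][j], &a[j][i]) as overlapping access to the same array, so this version goes through a temporary instead. transposeInPlace is a hypothetical name, and the function assumes a square matrix.

```swift
// Hypothetical helper: transposes a square matrix in place by swapping each
// element above the diagonal with its counterpart below the diagonal.
func transposeInPlace(_ matrix: inout [[Int]]) {
    let n = matrix.count
    precondition(matrix.allSatisfy { $0.count == n }, "matrix must be square")
    for i in 0..<n {
        for j in (i + 1)..<n {   // only pairs with i < j, so each pair is swapped once
            let temp = matrix[i][j]
            matrix[i][j] = matrix[j][i]
            matrix[j][i] = temp
        }
    }
}
```

Calling it twice restores the original matrix, which is what the full double loop in the question did within a single pass.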

Swift metal parallel sum calculation of array on iOS

Based on @Kametrixom's answer, I have made a test application for computing the sum of an array in parallel.
My test application looks like this:
import UIKit
import Metal
class ViewController: UIViewController {
// Data type, has to be the same as in the shader
typealias DataType = CInt
override func viewDidLoad() {
super.viewDidLoad()
let data = (0..<10000000).map{ _ in DataType(200) } // Our data: every element is 200 (not random)
var start, end : UInt64
var result:DataType = 0
start = mach_absolute_time()
data.withUnsafeBufferPointer { buffer in
for elem in buffer {
result += elem
}
}
end = mach_absolute_time()
print("CPU result: \(result), time: \(Double(end - start) / Double(NSEC_PER_SEC))")
result = 0
start = mach_absolute_time()
result = sumParallel4(data)
end = mach_absolute_time()
print("Metal result: \(result), time: \(Double(end - start) / Double(NSEC_PER_SEC))")
result = 0
start = mach_absolute_time()
result = sumParralel(data)
end = mach_absolute_time()
print("Metal result: \(result), time: \(Double(end - start) / Double(NSEC_PER_SEC))")
result = 0
start = mach_absolute_time()
result = sumParallel3(data)
end = mach_absolute_time()
print("Metal result: \(result), time: \(Double(end - start) / Double(NSEC_PER_SEC))")
}
func sumParralel(data : Array<DataType>) -> DataType {
let count = data.count
let elementsPerSum: Int = Int(sqrt(Double(count)))
let device = MTLCreateSystemDefaultDevice()!
let parsum = device.newDefaultLibrary()!.newFunctionWithName("parsum")!
let pipeline = try! device.newComputePipelineStateWithFunction(parsum)
var dataCount = CUnsignedInt(count)
var elementsPerSumC = CUnsignedInt(elementsPerSum)
let resultsCount = (count + elementsPerSum - 1) / elementsPerSum // Number of individual results = count / elementsPerSum (rounded up)
let dataBuffer = device.newBufferWithBytes(data, length: strideof(DataType) * count, options: []) // Our data in a buffer (copied)
let resultsBuffer = device.newBufferWithLength(strideof(DataType) * resultsCount, options: []) // A buffer for individual results (zero initialized)
let results = UnsafeBufferPointer<DataType>(start: UnsafePointer(resultsBuffer.contents()), count: resultsCount) // Our results in convenient form to compute the actual result later
let queue = device.newCommandQueue()
let cmds = queue.commandBuffer()
let encoder = cmds.computeCommandEncoder()
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(dataBuffer, offset: 0, atIndex: 0)
encoder.setBytes(&dataCount, length: sizeofValue(dataCount), atIndex: 1)
encoder.setBuffer(resultsBuffer, offset: 0, atIndex: 2)
encoder.setBytes(&elementsPerSumC, length: sizeofValue(elementsPerSumC), atIndex: 3)
// We have to calculate the sum `resultCount` times => amount of threadgroups is `resultsCount` / `threadExecutionWidth` (rounded up) because each threadgroup will process `threadExecutionWidth` threads
let threadgroupsPerGrid = MTLSize(width: (resultsCount + pipeline.threadExecutionWidth - 1) / pipeline.threadExecutionWidth, height: 1, depth: 1)
// Here we set that each threadgroup should process `threadExecutionWidth` threads, the only important thing for performance is that this number is a multiple of `threadExecutionWidth` (here 1 times)
let threadsPerThreadgroup = MTLSize(width: pipeline.threadExecutionWidth, height: 1, depth: 1)
encoder.dispatchThreadgroups(threadgroupsPerGrid, threadsPerThreadgroup: threadsPerThreadgroup)
encoder.endEncoding()
var result : DataType = 0
cmds.commit()
cmds.waitUntilCompleted()
for elem in results {
result += elem
}
return result
}
func sumParralel1(data : Array<DataType>) -> UnsafeBufferPointer<DataType> {
let count = data.count
let elementsPerSum: Int = Int(sqrt(Double(count)))
let device = MTLCreateSystemDefaultDevice()!
let parsum = device.newDefaultLibrary()!.newFunctionWithName("parsum")!
let pipeline = try! device.newComputePipelineStateWithFunction(parsum)
var dataCount = CUnsignedInt(count)
var elementsPerSumC = CUnsignedInt(elementsPerSum)
let resultsCount = (count + elementsPerSum - 1) / elementsPerSum // Number of individual results = count / elementsPerSum (rounded up)
let dataBuffer = device.newBufferWithBytes(data, length: strideof(DataType) * count, options: []) // Our data in a buffer (copied)
let resultsBuffer = device.newBufferWithLength(strideof(DataType) * resultsCount, options: []) // A buffer for individual results (zero initialized)
let results = UnsafeBufferPointer<DataType>(start: UnsafePointer(resultsBuffer.contents()), count: resultsCount) // Our results in convenient form to compute the actual result later
let queue = device.newCommandQueue()
let cmds = queue.commandBuffer()
let encoder = cmds.computeCommandEncoder()
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(dataBuffer, offset: 0, atIndex: 0)
encoder.setBytes(&dataCount, length: sizeofValue(dataCount), atIndex: 1)
encoder.setBuffer(resultsBuffer, offset: 0, atIndex: 2)
encoder.setBytes(&elementsPerSumC, length: sizeofValue(elementsPerSumC), atIndex: 3)
// We have to calculate the sum `resultCount` times => amount of threadgroups is `resultsCount` / `threadExecutionWidth` (rounded up) because each threadgroup will process `threadExecutionWidth` threads
let threadgroupsPerGrid = MTLSize(width: (resultsCount + pipeline.threadExecutionWidth - 1) / pipeline.threadExecutionWidth, height: 1, depth: 1)
// Here we set that each threadgroup should process `threadExecutionWidth` threads, the only important thing for performance is that this number is a multiple of `threadExecutionWidth` (here 1 times)
let threadsPerThreadgroup = MTLSize(width: pipeline.threadExecutionWidth, height: 1, depth: 1)
encoder.dispatchThreadgroups(threadgroupsPerGrid, threadsPerThreadgroup: threadsPerThreadgroup)
encoder.endEncoding()
cmds.commit()
cmds.waitUntilCompleted()
return results
}
func sumParallel3(data : Array<DataType>) -> DataType {
var results = sumParralel1(data)
repeat {
results = sumParralel1(Array(results))
} while results.count >= 100
var result : DataType = 0
for elem in results {
result += elem
}
return result
}
func sumParallel4(data : Array<DataType>) -> DataType {
let queue = NSOperationQueue()
queue.maxConcurrentOperationCount = 4
var a0 : DataType = 0
var a1 : DataType = 0
var a2 : DataType = 0
var a3 : DataType = 0
let op0 = NSBlockOperation( block : {
for i in 0..<(data.count/4) {
a0 = a0 + data[i]
}
})
let op1 = NSBlockOperation( block : {
for i in (data.count/4)..<(data.count/2) {
a1 = a1 + data[i]
}
})
let op2 = NSBlockOperation( block : {
for i in (data.count/2)..<(3 * data.count/4) {
a2 = a2 + data[i]
}
})
let op3 = NSBlockOperation( block : {
for i in (3 * data.count/4)..<(data.count) {
a3 = a3 + data[i]
}
})
queue.addOperation(op0)
queue.addOperation(op1)
queue.addOperation(op2)
queue.addOperation(op3)
queue.suspended = false
queue.waitUntilAllOperationsAreFinished()
let aaa: DataType = a0 + a1 + a2 + a3
return aaa
}
}
And I have a shader that looks like this:
kernel void parsum(const device DataType* data [[ buffer(0) ]],
const device uint& dataLength [[ buffer(1) ]],
device DataType* sums [[ buffer(2) ]],
const device uint& elementsPerSum [[ buffer(3) ]],
const uint tgPos [[ threadgroup_position_in_grid ]],
const uint tPerTg [[ threads_per_threadgroup ]],
const uint tPos [[ thread_position_in_threadgroup ]]) {
uint resultIndex = tgPos * tPerTg + tPos; // This is the index of the individual result, this var is unique to this thread
uint dataIndex = resultIndex * elementsPerSum; // Where the summation should begin
uint endIndex = dataIndex + elementsPerSum < dataLength ? dataIndex + elementsPerSum : dataLength; // The index where summation should end
for (; dataIndex < endIndex; dataIndex++)
sums[resultIndex] += data[dataIndex];
}
To my surprise, sumParallel4 is the fastest, which I did not expect. I also noticed that of sumParralel and sumParallel3, whichever one I call first is always slower, even if I swap the order of the calls.
Why is this? Why is sumParallel3 not a lot faster than sumParralel? Why is sumParallel4 the fastest, although it runs on the CPU?
How can I update my GPU function to use posix_memalign? I understand it should be faster because the memory would be shared between the GPU and the CPU, but I don't know which array should be allocated this way (data or result), or how to allocate data with posix_memalign when data is a parameter passed to the function.
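For reference, the work that each GPU thread does in the parsum kernel can be modelled on the CPU as below. This is only a sketch to make the indexing explicit: chunkedSums is a hypothetical name, and each iteration of the outer loop stands in for one thread.

```swift
// CPU model of the parsum kernel: result k is the sum of the k-th chunk of
// `elementsPerSum` consecutive elements (the last chunk may be shorter).
func chunkedSums(_ data: [Int32], elementsPerSum: Int) -> [Int32] {
    precondition(elementsPerSum > 0, "chunk size must be positive")
    // Number of partial results = count / elementsPerSum, rounded up.
    let resultsCount = (data.count + elementsPerSum - 1) / elementsPerSum
    var sums = [Int32](repeating: 0, count: resultsCount)
    for resultIndex in 0..<resultsCount {          // one "thread" per result
        let start = resultIndex * elementsPerSum
        let end = min(start + elementsPerSum, data.count)
        for dataIndex in start..<end {
            sums[resultIndex] += data[dataIndex]
        }
    }
    return sums
}
```

With elementsPerSum set to the square root of the element count, as in the code above, there are roughly as many partial results as there are elements per chunk, and the final pass over the partial results is done on the CPU.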
In running these tests on an iPhone 6, I saw the Metal version run between 3x slower and 2x faster than the naive CPU summation. With the modifications I describe below, it was consistently faster.
I found that a lot of the cost in running the Metal version could be attributed not merely to the allocation of the buffers, though that was significant, but also to the first-time creation of the device and compute pipeline state. These are actions you'd normally perform once at application initialization, so it's not entirely fair to include them in the timing.
It should also be noted that if you're running these tests through Xcode with the Metal validation layer and GPU frame capture enabled, that has a significant run-time cost and will skew the results in the CPU's favor.
With those caveats, here's how you might use posix_memalign to allocate memory that can be used to back a MTLBuffer. The trick is to ensure that the memory you request is in fact page-aligned (i.e. its address is a multiple of getpagesize()), which may entail rounding up the amount of memory beyond how much you actually need to store your data:
let dataCount = 1_000_000
let dataSize = dataCount * strideof(DataType)
let pageSize = Int(getpagesize())
let pageCount = (dataSize + (pageSize - 1)) / pageSize
var dataPointer: UnsafeMutablePointer<Void> = nil
posix_memalign(&dataPointer, pageSize, pageCount * pageSize)
let data = UnsafeMutableBufferPointer(start: UnsafeMutablePointer<DataType>(dataPointer),
                                      count: (pageCount * pageSize) / strideof(DataType))
for i in 0..<dataCount {
    data[i] = 200
}
This does require making data an UnsafeMutableBufferPointer<DataType>, rather than an [DataType], since Swift's Array allocates its own backing store. You'll also need to pass along the count of data items to operate on, since the count of the mutable buffer pointer has been rounded up to make the buffer page-aligned.
To actually create a MTLBuffer backed with this data, use the newBufferWithBytesNoCopy(_:length:options:deallocator:) API. It's crucial that, once again, the length you provide is a multiple of the page size; otherwise this method returns nil:
let roundedUpDataSize = strideof(DataType) * data.count
let dataBuffer = device.newBufferWithBytesNoCopy(data.baseAddress, length: roundedUpDataSize, options: [], deallocator: nil)
Here, we don't provide a deallocator, but you should free the memory when you're done using it, by passing the baseAddress of the buffer pointer to free().
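In more recent Swift versions the pointer types have changed, so the allocation step would look roughly like the sketch below. Both function names are hypothetical, and the Metal call is deliberately left out; the snippet covers only the page rounding and the aligned allocation.

```swift
import Foundation

// Rounds a byte count up to the next multiple of the page size.
func roundedUpToPageMultiple(_ byteCount: Int, pageSize: Int) -> Int {
    let pageCount = (byteCount + pageSize - 1) / pageSize
    return pageCount * pageSize
}

// Hypothetical helper: allocates page-aligned memory whose length is a whole
// number of pages. The caller must release the pointer with free().
func allocatePageAligned(byteCount: Int) -> (pointer: UnsafeMutableRawPointer, length: Int)? {
    let pageSize = Int(getpagesize())
    let length = roundedUpToPageMultiple(byteCount, pageSize: pageSize)
    var raw: UnsafeMutableRawPointer? = nil
    // posix_memalign returns 0 on success and stores the address in `raw`.
    guard posix_memalign(&raw, pageSize, length) == 0, let pointer = raw else {
        return nil
    }
    return (pointer, length)
}
```

The returned pointer and length are what you would pass to the no-copy buffer API; the number of elements to process still has to be tracked separately, since length is rounded up.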

Apparent indices limit

Using SceneKit in Swift, I am trying to build a custom 3D object (a terrain). To build the terrain I start from a plane that I divide into a number of horizontal and vertical sections. With a small number of sections everything is fine, but with a not-so-large number the app crashes in some deep OpenGL function with an EXC_BAD_ACCESS.
Here is a simplified version of the terrain (yes, it's just a plane) which doesn't exhibit the issue:
let width:Float = 12
let depth:Float = 12
let height:Float = 2
let nx = 6
let nz = 6
func build() -> SCNGeometry {
var vertices : [SCNVector3] = Array()
for i in 0..<(nx + 1) {
for j in 0..<(nz + 1) {
let x = (Float(i) / Float(nx)) * width - width/2
let z = (Float(j) / Float(nz)) * depth - depth/2
let y = Float(0)
vertices.append(SCNVector3(x:x, y:y, z:z))
}
}
var indices : [CInt] = []
for i in 0..<nx {
for j in 0..<nz {
indices.append(CInt(i + j * (nz+1)))
indices.append(CInt(i+1 + j * (nz+1)))
indices.append(CInt(i + (j+1)*(nz+1)))
indices.append(CInt(i+1 + j * (nz+1)))
indices.append(CInt(i+1 + (j+1)*(nz+1)))
indices.append(CInt(i + (j+1)*(nz+1)))
}
}
let data = NSData(bytes: vertices, length: sizeof(SCNVector3) * countElements(vertices))
let vertexSource = SCNGeometrySource(data: data, semantic: SCNGeometrySourceSemanticVertex, vectorCount: vertices.count, floatComponents: true, componentsPerVector: 3, bytesPerComponent: sizeof(Float), dataOffset: 0, dataStride: sizeof(SCNVector3))
let indexData = NSData(bytes: indices, length: sizeof(CInt) * countElements(indices))
let element = SCNGeometryElement(data: indexData, primitiveType: SCNGeometryPrimitiveType.Triangles, primitiveCount: indices.count, bytesPerIndex: sizeof(CInt))
return SCNGeometry(sources: [vertexSource], elements: [element])
}
Now change nx and nz to:
let nx = 8
let nz = 8
and the app crashes.
This seems very much linked with the number of indices but at ~300 I don't believe I should be hitting a limit.
Any suggestion, help or solution very much appreciated. Thanks.
The problem could be that you're passing primitiveCount: indices.count when creating the SCNGeometryElement rather than indices.count/3 (since there are three indices per triangle). I'm surprised there's no earlier bounds checking, but without that, you could certainly see a crash depending on the number of indices.
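To make the counts concrete, here is a sketch that builds the triangle indices for an nx by nz grid and checks them without SceneKit. gridTriangleIndices is a hypothetical name, and the vertex numbering (i * (nz + 1) + j) is an assumption chosen to match the order in which the question's loops append the vertices.

```swift
// Hypothetical helper: two triangles per grid cell, six indices per cell.
// Assumes vertex (i, j) was appended at position i * (nz + 1) + j.
func gridTriangleIndices(nx: Int, nz: Int) -> [Int32] {
    var indices: [Int32] = []
    indices.reserveCapacity(nx * nz * 6)
    let rowLength = nz + 1   // vertices appended per value of i
    for i in 0..<nx {
        for j in 0..<nz {
            let a = Int32(i * rowLength + j)
            let b = Int32((i + 1) * rowLength + j)
            let c = Int32(i * rowLength + j + 1)
            let d = Int32((i + 1) * rowLength + j + 1)
            indices += [a, b, c, b, d, c]
        }
    }
    return indices
}
```

For nx = nz = 8 this gives 384 indices, so primitiveCount should be 384 / 3 = 128 rather than 384, and no index exceeds the 81 vertices.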

Resources