I am trying to find the coordinates of the camera in the scene I have made, but I end up with coordinates that seem to be in a different system. I get values such as 0.0134094329550862, which is about 1 cm in the scene coordinate system, even though the camera has moved more than that.
I do this to get coordinates:
if let cameraCoordinates = self.sceneView.pointOfView?.worldPosition {
    self.POSx = Double(cameraCoordinates.x)
    self.POSy = Double(cameraCoordinates.y)
    self.POSz = Double(cameraCoordinates.z)
}
This is how I found the coordinates of the camera and its orientation.
func getUserVector() -> (SCNVector3, SCNVector3) {
    if let frame = self.sceneView.session.currentFrame {
        let mat = SCNMatrix4(frame.camera.transform)
        // camera's forward (look-at) direction in world space
        let dir = SCNVector3(-1 * mat.m31, -1 * mat.m32, -1 * mat.m33)
        // camera's position in world space (translation components m41-m43)
        let pos = SCNVector3(mat.m41, mat.m42, mat.m43)
        return (dir, pos)
    }
    return (SCNVector3(0, 0, -1), SCNVector3(0, 0, -0.2))
}
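For reference, a minimal sketch of reading the camera position straight from the current ARFrame; the translation column of camera.transform is the position in world coordinates, in meters (this assumes an ARSCNView called sceneView, as above):
// Sketch: camera world position from the current ARKit frame
if let frame = sceneView.session.currentFrame {
    let t = frame.camera.transform.columns.3   // translation column = world position
    let cameraPosition = SCNVector3(t.x, t.y, t.z)
    print("camera is at \(cameraPosition) (meters, world space)")
}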
I am trying to get the distance at a specific screen coordinate from a depthMap resized to the screen size, but it is not working.
I have tried to implement the following steps:
1. Convert the depthMap to a CIImage, then resize the image to the orientation and size of the screen using an affine transformation.
2. Convert the transformed image to a screen-sized CVPixelBuffer.
3. Get the distance in meters stored in the CVPixelBuffer from a one-dimensional array, using the index width * y + x for a coordinate (x, y).
I have implemented the above procedure, but I cannot get the appropriate index from the one-dimensional array. What should I do?
The code for the procedure is shown below.
let depthMap = depthData.depthMap
// convert the depthMap to CIImage
let image = CIImage(cvPixelBuffer: depthMap)
let imageSize = CGSize(width: CVPixelBufferGetWidth(depthMap), height: CVPixelBufferGetHeight(depthMap))
// 1) Normalize the captured image to 0.0-1.0 coordinates
let normalizeTransform = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)
// 2) "Flip the Y axis (for some mysterious reason this is only necessary in portrait mode)",
//    so apply the flip in portrait. Not only the Y axis but also the X axis needs to be flipped.
let interfaceOrientation = self.arView.window!.windowScene!.interfaceOrientation
let flipTransform = (interfaceOrientation.isPortrait) ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1) : .identity
// 3) Move to the screen's orientation and position on the captured image
let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: arView.bounds.size)
// 4) Convert from the 0.0-1.0 coordinate system to the screen coordinate system
let toViewPortTransform = CGAffineTransform(scaleX: arView.bounds.size.width, y: arView.bounds.size.height)
// 5) Apply transforms 1-4 and crop the transformed image to the screen size
let transformedImage = image
    .transformed(by: normalizeTransform
        .concatenating(flipTransform)
        .concatenating(displayTransform)
        .concatenating(toViewPortTransform))
    .cropped(to: arView.bounds)
// convert the transformed image to a screen-sized CVPixelBuffer
if let convertDepthMap = transformedImage.pixelBuffer(cgSize: arView.bounds.size) {
    previewImage.image = transformedImage.toUIImage()
    DispatchQueue.main.async {
        self.processDepthData(convertDepthMap)
    }
}
// The conversion to CVPixelBuffer is implemented in an extension
extension CIImage {
    func toUIImage() -> UIImage {
        UIImage(ciImage: self)
    }

    func pixelBuffer(cgSize size: CGSize) -> CVPixelBuffer? {
        var pixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        let width: Int = Int(size.width)
        let height: Int = Int(size.height)
        CVPixelBufferCreate(kCFAllocatorDefault,
                            width,
                            height,
                            kCVPixelFormatType_DepthFloat32,
                            attrs,
                            &pixelBuffer)
        // put bytes into pixelBuffer
        let context = CIContext()
        context.render(self, to: pixelBuffer!)
        return pixelBuffer
    }
}
private func processDepthData(_ depthMap: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    if let baseAddress = CVPixelBufferGetBaseAddress(depthMap) {
        let mutablePointer = baseAddress.bindMemory(to: Float32.self, capacity: width * height)
        let bufferPointer = UnsafeBufferPointer(start: mutablePointer, count: width * height)
        let depthArray = Array(bufferPointer)
        // index = width * y + x: trying to get the distance in meters for the coordinate (300, 100),
        // but this returns the distance for a different coordinate
        print(depthArray[width * 100 + 300])
    }
}
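Could row padding be the issue? This is only an assumption on my part: CVPixelBuffer rows can be padded, so bytesPerRow may be larger than width * 4, and a flat width * y + x index then lands on the wrong pixel. A stride-aware read would look roughly like this (untested sketch):
// Sketch: read the depth at (x, y) using the buffer's row stride instead of
// assuming tightly packed Float32 rows. Hypothetical helper.
func depthAt(x: Int, y: Int, in depthMap: CVPixelBuffer) -> Float32? {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(depthMap),
          x < CVPixelBufferGetWidth(depthMap),
          y < CVPixelBufferGetHeight(depthMap) else { return nil }
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)
    let rowStart = base.advanced(by: y * bytesPerRow)
    return rowStart.assumingMemoryBound(to: Float32.self)[x]
}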
I am drawing a 3D model from an .obj file as shown below. I need to find the bounding box for each submesh of that .obj file.
let assetURL = Bundle.main.url(forResource: "train", withExtension: "obj")!
let allocator = MTKMeshBufferAllocator(device: device)
let asset = MDLAsset(url: assetURL, vertexDescriptor: vertexDescriptor, bufferAllocator: allocator)
let mdlMesh = asset.object(at: 0) as! MDLMesh
mesh = try! MTKMesh(mesh: mdlMesh, device: device)
for submesh in mesh.submeshes {
    // I need to find the bounding box for each submesh
}
How can I achieve that on iOS?
Here's a function that takes an MTKMesh and produces an array of MDLAxisAlignedBoundingBoxes in model space:
enum BoundingBoxError: Error {
    case invalidIndexType(String)
}

func boundingBoxesForSubmeshes(of mtkMesh: MTKMesh, positionAttributeData: MDLVertexAttributeData) throws -> [MDLAxisAlignedBoundingBox] {
    struct VertexPosition {
        var x, y, z: Float
    }
    var boundingBoxes = [MDLAxisAlignedBoundingBox]()
    let positionsPtr = positionAttributeData.dataStart
    for submesh in mtkMesh.submeshes {
        // Reset the extremes per submesh so boxes don't accumulate across submeshes
        var minX = Float.greatestFiniteMagnitude
        var minY = Float.greatestFiniteMagnitude
        var minZ = Float.greatestFiniteMagnitude
        var maxX = -Float.greatestFiniteMagnitude
        var maxY = -Float.greatestFiniteMagnitude
        var maxZ = -Float.greatestFiniteMagnitude
        let indexBuffer = submesh.indexBuffer
        let mtlIndexBuffer = indexBuffer.buffer
        let submeshIndicesRaw = mtlIndexBuffer.contents().advanced(by: indexBuffer.offset)
        if submesh.indexType != .uint32 {
            throw BoundingBoxError.invalidIndexType("Expected 32-bit indices")
        }
        let submeshIndicesPtr = submeshIndicesRaw.bindMemory(to: UInt32.self,
                                                             capacity: submesh.indexCount)
        let submeshIndices = UnsafeMutableBufferPointer<UInt32>(start: submeshIndicesPtr,
                                                                count: submesh.indexCount)
        for index in submeshIndices {
            let positionPtr = positionsPtr.advanced(by: Int(index) * positionAttributeData.stride)
            let position = positionPtr.assumingMemoryBound(to: VertexPosition.self).pointee
            if position.x < minX { minX = position.x }
            if position.y < minY { minY = position.y }
            if position.z < minZ { minZ = position.z }
            if position.x > maxX { maxX = position.x }
            if position.y > maxY { maxY = position.y }
            if position.z > maxZ { maxZ = position.z }
        }
        let min = SIMD3<Float>(x: minX, y: minY, z: minZ)
        let max = SIMD3<Float>(x: maxX, y: maxY, z: maxZ)
        let box = MDLAxisAlignedBoundingBox(maxBounds: max, minBounds: min)
        boundingBoxes.append(box)
    }
    return boundingBoxes
}
Note that this function expects the submesh to have 32-bit indices, though it could be adapted to support 16-bit indices as well.
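For reference, a sketch of how that adaptation might look (untested; it just branches on indexType when reading each index, with raw pointing at indexBuffer.contents() plus the offset):
// Sketch: read index i as UInt32 regardless of whether the submesh stores
// 16-bit or 32-bit indices.
func vertexIndex(at i: Int, type: MTLIndexType, raw: UnsafeMutableRawPointer) -> UInt32 {
    switch type {
    case .uint16:
        return UInt32(raw.assumingMemoryBound(to: UInt16.self)[i])
    case .uint32:
        return raw.assumingMemoryBound(to: UInt32.self)[i]
    @unknown default:
        fatalError("Unsupported index type")
    }
}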
To drive it, you'll also need to supply an MDLVertexAttributeData object that points to the vertex data in the containing mesh. Here's how to do that:
let mesh = try! MTKMesh(mesh: mdlMesh, device: device)
let positionAttrData = mdlMesh.vertexAttributeData(forAttributeNamed: MDLVertexAttributePosition, as: .float3)!
let boundingBoxes = try? boundingBoxesForSubmeshes(of: mesh, positionAttributeData: positionAttrData)
I haven't tested this code thoroughly, but it seems to produce sane results on my limited set of test cases. Visualizing the boxes with a debug mesh should immediately show whether or not they're correct.
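If it helps, here's one way to inspect them (a sketch of mine, just printing the center and extent per box, which is also what you'd feed into a debug box mesh):
// Sketch: derive center and extent of each axis-aligned box in model space.
if let boxes = boundingBoxes {
    for (i, box) in boxes.enumerated() {
        let center = (box.minBounds + box.maxBounds) * 0.5
        let extent = box.maxBounds - box.minBounds
        print("submesh \(i): center \(center), extent \(extent)")
    }
}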
I'm trying to estimate the absolute depth (in meters) from an AVDepthData object based on this equation: depth = baseline x focal_length / (disparity + d_offset). I have all the parameters from cameraCalibrationData, but does this still apply to an image taken in Portrait mode with iPhone X since the two cameras are offset vertically? Also based on WWDC 2017 Session 507, the disparity map is relative, but the AVDepthData documentation states that the disparity values are in 1/m. So can I apply the equation on the values in the depth data as is or do I need to do some additional processing beforehand?
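For concreteness, here is the equation I'm referring to written as a small helper; whether the stored disparity values can be plugged in directly, given their units, is exactly what I'm unsure about:
// The question's equation, written out: baseline in meters, focal length in
// pixels, d_offset taken from the calibration data (an assumption on my part).
func depthFromDisparity(_ disparity: Float, baseline: Float, focalLength: Float, dOffset: Float) -> Float {
    return baseline * focalLength / (disparity + dOffset)
}
My code is below: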
var depthData: AVDepthData
do {
    depthData = try AVDepthData(fromDictionaryRepresentation: auxDataInfo)
} catch {
    return nil
}

// Working with disparity
if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
    depthData = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
}

CVPixelBufferLockBaseAddress(depthData.depthDataMap, CVPixelBufferLockFlags(rawValue: 0))

// Scale the intrinsic matrix to be in depth-image pixel space
guard var intrinsicMatrix = depthData.cameraCalibrationData?.intrinsicMatrix,
      let referenceDimensions = depthData.cameraCalibrationData?.intrinsicMatrixReferenceDimensions else { return nil }
let depthWidth = CVPixelBufferGetWidth(depthData.depthDataMap)
let depthHeight = CVPixelBufferGetHeight(depthData.depthDataMap)
let depthSize = CGSize(width: depthWidth, height: depthHeight)
let ratio: Float = Float(referenceDimensions.width) / Float(depthWidth)
intrinsicMatrix[0][0] /= ratio
intrinsicMatrix[1][1] /= ratio
intrinsicMatrix[2][0] /= ratio
intrinsicMatrix[2][1] /= ratio

// For converting disparity to depth
let baseline: Float = 1.45 / 100.0 // measured baseline in m

// Prepare for lens distortion correction
guard let lut = depthData.cameraCalibrationData?.lensDistortionLookupTable,
      let center = depthData.cameraCalibrationData?.lensDistortionCenter else { return nil }
let centerX: CGFloat = center.x / CGFloat(ratio)
let centerY: CGFloat = center.y / CGFloat(ratio)
let correctedCenter = CGPoint(x: centerX, y: centerY)

// Build point cloud
var pointCloud = [[Float]]()
for dataY in 0 ..< depthHeight {
    let rowData = CVPixelBufferGetBaseAddress(depthData.depthDataMap)! + dataY * CVPixelBufferGetBytesPerRow(depthData.depthDataMap)
    let data = UnsafeBufferPointer(start: rowData.assumingMemoryBound(to: Float32.self), count: depthWidth)
    for dataX in 0 ..< depthWidth {
        let dispZ = data[dataX]
        let pointZ = baseline * intrinsicMatrix[0][0] / dispZ
        let currPoint = CGPoint(x: dataX, y: dataY)
        let correctedPoint = lensDistortionPoint(for: currPoint, lookupTable: lut, distortionOpticalCenter: correctedCenter, imageSize: depthSize)
        let pointX = (Float(correctedPoint.x) - intrinsicMatrix[2][0]) * pointZ / intrinsicMatrix[0][0]
        let pointY = (Float(correctedPoint.y) - intrinsicMatrix[2][1]) * pointZ / intrinsicMatrix[1][1]
        pointCloud.append([pointX, pointY, pointZ])
    }
}
CVPixelBufferUnlockBaseAddress(depthData.depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
I'm creating an anchor and adding it to my ARSKView at a certain distance in front of the camera like this:
func displayToken(distance: Float) {
    print("token dropped at: \(distance)")
    guard let sceneView = self.view as? ARSKView else {
        return
    }
    // Create anchor using the camera's current position
    if let currentFrame = sceneView.session.currentFrame {
        // Create a transform with a translation of x meters in front of the camera
        var translation = matrix_identity_float4x4
        translation.columns.3.z = -distance
        let transform = simd_mul(currentFrame.camera.transform, translation)
        // Add a new anchor to the session
        let anchor = ARAnchor(transform: transform)
        sceneView.session.add(anchor: anchor)
    }
}
then the node gets created for the anchor like this:
func view(_ view: ARSKView, nodeFor anchor: ARAnchor) -> SKNode? {
    // Create and configure a node for the anchor added to the view's session.
    if let image = tokenImage {
        let texture = SKTexture(image: image)
        let tokenImageNode = SKSpriteNode(texture: texture)
        tokenImageNode.name = "token"
        return tokenImageNode
    } else {
        return nil
    }
}
This works fine and I see the image get added at the appropriate distance. However, what I'm trying to do is then calculate how far the anchor/node is in front of the camera as you move. The problem is that the calculation, fabs(cameraZ - anchor.transform.columns.3.z), seems to be off immediately. Please see my code below, which is in the update() method, for calculating the distance between the camera and the object:
override func update(_ currentTime: TimeInterval) {
    // Called before each frame is rendered
    guard let sceneView = self.view as? ARSKView else {
        return
    }
    if let currentFrame = sceneView.session.currentFrame {
        let cameraZ = currentFrame.camera.transform.columns.3.z
        for anchor in currentFrame.anchors {
            if let spriteNode = sceneView.node(for: anchor), spriteNode.name == "token", intersects(spriteNode) {
                // token is within the camera view
                //print("token is within camera view from update method")
                print("DISTANCE BETWEEN CAMERA AND TOKEN: \(fabs(cameraZ - anchor.transform.columns.3.z))")
                print(cameraZ)
                print(anchor.transform.columns.3.z)
            }
        }
    }
}
Any help in accurately getting the distance between the camera and the anchor is appreciated.
The last column of a 4x4 transform matrix is the translation vector (or position relative to a parent coordinate space), so you can get the distance in three dimensions between two transforms by subtracting those vectors and taking the length of the difference.
let anchorPosition = anchor.transform.columns.3
let cameraPosition = camera.transform.columns.3
// here’s a line connecting the two points, which might be useful for other things
let cameraToAnchor = cameraPosition - anchorPosition
// and here’s just the scalar distance
let distance = length(cameraToAnchor)
What you’re doing isn’t working right because you’re subtracting the z-coordinates of each vector. If the two points are different in x, y, and z, just subtracting z doesn’t get you distance.
This one is for SceneKit, but I'll leave it here anyway.
let end = node.presentation.worldPosition
let start = sceneView.pointOfView!.worldPosition
let dx = end.x - start.x
let dy = end.y - start.y
let dz = end.z - start.z
let distance = sqrtf(dx * dx + dy * dy + dz * dz)
With RealityKit there is a slightly different way to do this. If you're using the world tracking configuration, your AnchorEntity object conforms to HasAnchoring, which gives you a target. The target is an AnchoringComponent.Target enum, which has a case .world(let transform). You can compare your world transform to the camera's world transform like this:
if case let AnchoringComponent.Target.world(transform) = yourAnchorEntity.anchoring.target {
    let theDistance = distance(transform.columns.3, frame.camera.transform.columns.3)
}
This took me a bit to figure out, but others using RealityKit might benefit from it.
As mentioned above by @codeman, this is the right solution:
let distance = simd_distance(YOUR_NODE.simdTransform.columns.3, (sceneView.session.currentFrame?.camera.transform.columns.3)!)
For the 3D distance, you can check these utils:
class ARSceneUtils {
    /// Returns the distance between an anchor and the camera.
    class func distanceBetween(anchor: ARAnchor, AndCamera camera: ARCamera) -> CGFloat {
        let anchorPosition = SCNVector3Make(
            anchor.transform.columns.3.x,
            anchor.transform.columns.3.y,
            anchor.transform.columns.3.z
        )
        let cameraPosition = SCNVector3Make(
            camera.transform.columns.3.x,
            camera.transform.columns.3.y,
            camera.transform.columns.3.z
        )
        return CGFloat(self.calculateDistance(from: cameraPosition, to: anchorPosition))
    }

    /// Returns the distance between two vectors.
    class func calculateDistance(from: SCNVector3, to: SCNVector3) -> Float {
        let x = from.x - to.x
        let y = from.y - to.y
        let z = from.z - to.z
        return sqrtf((x * x) + (y * y) + (z * z))
    }
}
And now you can call:
guard let camera = session.currentFrame?.camera else { return }
let anchor = ... // your anchor
let distanceAchorAndCamera = ARSceneUtils.distanceBetween(anchor: anchor, AndCamera: camera)
I'm trying to follow some sample code (https://medium.com/#yatchoi/getting-started-with-arkit-real-life-waypoints-1707e3cb1da2), shown below, and I'm getting an unresolved identifier 'MatrixHelper' error:
let translationMatrix = MatrixHelper.translate(
    x: 0,
    y: 0,
    z: distance * -1
)
// Rotation matrix theta degrees
let rotationMatrix = MatrixHelper.rotateAboutY(
    degrees: bearing * -1
)
What library or package do I need to import to get MatrixHelper - some Math package? I googled the docs but couldn't find anything.
I think he just wrote a custom class to wrap the functions provided by GLKit.
He named that class MatrixHelper.
With MatrixHelper.rotateAboutY() he is calling something like:
GLKMatrix4 GLKMatrix4Rotate(GLKMatrix4 matrix, float radians, float x, float y, float z);
So the framework used in MatrixHelper is GLKit, and to be more precise GLKMatrix4:
https://developer.apple.com/documentation/glkit/glkmatrix4-pce
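For reference, a MatrixHelper along those lines might look roughly like this (a guess at the wrapper, not the article's actual code):
// Hypothetical reconstruction of MatrixHelper on top of GLKit.
class MatrixHelper {
    static func translate(x: Float, y: Float, z: Float) -> GLKMatrix4 {
        return GLKMatrix4MakeTranslation(x, y, z)
    }
    static func rotateAboutY(degrees: Float) -> GLKMatrix4 {
        return GLKMatrix4MakeYRotation(GLKMathDegreesToRadians(degrees))
    }
}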
You can avoid using MatrixHelper (and GLKit) and use simd instead; it is simple. The methods would look like this:
func getTranslationMatrix(tx: Float, ty: Float, tz: Float) -> simd_float4x4 {
    var translationMatrix = matrix_identity_float4x4
    translationMatrix.columns.3 = simd_float4(tx, ty, tz, 1)
    return translationMatrix
}

func getRotationYMatrix(angle: Float) -> simd_float4x4 {
    // Rotation about the Y axis (angle in radians), expressed as a 4x4
    // so it can be multiplied with the translation matrix below.
    let rows = [
        simd_float4( cos(angle), 0, sin(angle), 0),
        simd_float4(0, 1, 0, 0),
        simd_float4(-sin(angle), 0, cos(angle), 0),
        simd_float4(0, 0, 0, 1)
    ]
    return simd_float4x4(rows: rows)
}
Or, more specifically, just the Z translation matrix:
func getTranslationZMatrix(tz: Float) -> simd_float4x4 {
    var translationMatrix = matrix_identity_float4x4
    translationMatrix.columns.3.z = tz
    return translationMatrix
}
Then the code would look like:
func getTransformGiven(currentLocation: CLLocation) -> matrix_float4x4 {
    let bearing = bearingBetween(
        startLocation: currentLocation,
        endLocation: location
    )
    let distance: Float = 5
    let originTransform = matrix_identity_float4x4
    // Create a transform with a translation of 5 meters away
    // let translationMatrix = MatrixHelper.translate(
    //     x: 0,
    //     y: 0,
    //     z: distance * -1
    // )
    let translationMatrix = getTranslationZMatrix(tz: distance * -1)
    // Rotation matrix, theta degrees
    // let rotationMatrix = MatrixHelper.rotateAboutY(
    //     degrees: bearing * -1
    // )
    // getRotationYMatrix expects radians; bearingBetween is assumed to return radians here
    let rotationMatrix = getRotationYMatrix(angle: Float(-bearing))
    let transformMatrix = simd_mul(rotationMatrix, translationMatrix)
    return simd_mul(originTransform, transformMatrix)
}
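A possible way to use it (a sketch; it assumes the session runs with .gravityAndHeading world alignment so that the Y rotation derived from the compass bearing lines up with true north, and that currentLocation comes from your CLLocationManager):
// Sketch: drop an anchor at the transform computed for the target location.
let transform = getTransformGiven(currentLocation: currentLocation)
let anchor = ARAnchor(transform: transform)
sceneView.session.add(anchor: anchor)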