I'm building an iOS app that detects cars via Vision and then retrieves the distance to the detected car from a synchronized depthDataMap produced by the LiDAR sensor.
However, I'm having trouble finding the corresponding pixel in that depthDataMap. The CGRect of the VNRecognizedObjectObservation ranges from 0–300 (x) and 0–600 (y) after I project it into screen coordinates, but the depthDataMap is only 320 × 180, so I can't work out the matching pixel. Any idea how to solve this?
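My current thought is to skip the screen projection entirely and scale the normalized bounding box straight into depth-map space, since the observation's boundingBox is already normalized to [0, 1]. A rough, untested sketch of what I mean (depthMapWidth / depthMapHeight stand for the depth buffer's own dimensions, not the screen's):

// Untested idea: map the center of the normalized Vision rect directly
// into depth-map pixel coordinates instead of screen coordinates.
let bbox = objectObservation.boundingBox   // normalized [0, 1], origin bottom-left
let centerX = bbox.midX
let centerY = 1.0 - bbox.midY              // flip to the pixel buffer's top-left origin
let depthX = Int(centerX * CGFloat(depthMapWidth))
let depthY = Int(centerY * CGFloat(depthMapHeight))

I'm not sure this is correct though, especially since the width/height I read from the depth buffer (180 × 320, see below) look swapped relative to the 320 × 180 I expected, which makes me suspect an orientation mismatch.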
This is my function in my AVCaptureDataOutputSynchronizerDelegate:
func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                            didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
    // Retrieve the synchronized depth and sample buffer container objects.
    guard let syncedDepthData = synchronizedDataCollection.synchronizedData(for: depthDataOutput) as? AVCaptureSynchronizedDepthData,
          let syncedVideoData = synchronizedDataCollection.synchronizedData(for: videoOutput) as? AVCaptureSynchronizedSampleBufferData else { return }
    guard let pixelBuffer = syncedVideoData.sampleBuffer.imageBuffer else { return }

    // Run the Vision requests on the color frame.
    let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up, options: [:])
    do {
        try imageRequestHandler.perform(self.requests)
    } catch {
        print(error)
    }

    // Keep the matching depth map around for the detection callback.
    depthData = syncedDepthData.depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat16).depthDataMap
}
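For context, depthData is a stored property on my class holding the most recent map; the name is a bit misleading because after converting(toDepthDataType:) it's a CVPixelBuffer, not an AVDepthData. It's declared roughly like this:

// Stored property holding the latest synchronized depth map (DepthFloat16).
var depthData: CVPixelBuffer!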
And this is the function in which I try to find the middle of the detected object and read its depth pixel:
for observation in results where observation is VNRecognizedObjectObservation {
    guard let objectObservation = observation as? VNRecognizedObjectObservation else { continue }

    let topLabelObservation = objectObservation.labels[0]
    if topLabelObservation.identifier != "car" { return }

    // Project the normalized bounding box into screen coordinates.
    let objectBounds = VNImageRectForNormalizedRect(objectObservation.boundingBox, Int(screenRect.size.width), Int(screenRect.size.height))
    let transformBounds = CGRect(x: objectBounds.minX, y: screenRect.size.height - objectBounds.maxY, width: objectBounds.width, height: objectBounds.height)

    let depthMapWidth = CVPixelBufferGetWidthOfPlane(depthData, 0)   // always 180
    let depthMapHeight = CVPixelBufferGetHeightOfPlane(depthData, 0) // always 320

    // Center of the detection in screen coordinates (y flipped, since Vision's origin is bottom-left).
    let objMiddlePointX = Int(objectBounds.midX)
    let objMiddlePointY = Int(screenRect.size.height - objectBounds.midY)

    // This is where I'm stuck: these are screen coordinates,
    // which don't correspond to the depth map's resolution.
    let rowData = CVPixelBufferGetBaseAddress(depthData)?.assumingMemoryBound(to: Float16.self)
    let depthPoint = rowData?[objMiddlePointX * depthMapWidth + objMiddlePointY]

    let boxLayer = self.drawBoundingBox(transformBounds)
    detectionLayer.addSublayer(boxLayer)
}
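For completeness, once I have correct depth-map coordinates, this is how I intend to read a single value; a sketch assuming the Float16 format from above, with the base-address lock that I believe is required before touching the buffer's memory:

// Sketch: read one Float16 depth value at depth-map coordinates (x, y).
func depthValue(at x: Int, y: Int, in depthMap: CVPixelBuffer) -> Float? {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    // Rows can be padded, so step by bytesPerRow rather than by width * 2.
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)
    let row = base.advanced(by: y * bytesPerRow)
    return Float(row.assumingMemoryBound(to: Float16.self)[x])
}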
I'd be glad for any pointers in the right direction.