How to relate intrinsics and distortion from AVDepthData to the current video stream?

I am writing a small test app in the context of computer vision. The final application will require the camera calibration, so for now my test app creates a capture session, enables delivery of the camera intrinsic matrix, and logs it.

As an example, this code:

let calibrationPayload = CMGetAttachment(sampleBuffer,
                                         key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                         attachmentModeOut: nil)
if let data = calibrationPayload as? Data {
    let matrix: matrix_float3x3 = data.withUnsafeBytes { $0.load(as: matrix_float3x3.self) }
    print(matrix)
}

running on an iPhone 13 Pro back camera gives me:

simd_float3x3([[4220.394, 0.0, 0.0], [0.0, 4220.394, 0.0], [941.9231, 533.648, 1.0]])

// Corresponding matrix (which makes sense for a 1920x1080 camera):
// [4220.394,    0.0,   941.9231]
// [   0.0,   4220.394, 533.648]
// [   0.0,      0.0,      1.0]
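
For context, the capture setup is roughly as follows (a simplified sketch; error handling is elided, and the CaptureController class name and the wide-angle device choice are just placeholders for my actual test app):

import AVFoundation

final class CaptureController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()

    func configure() {
        session.sessionPreset = .hd1920x1080

        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back),
              let input = try? AVCaptureDeviceInput(device: device) else { return }
        session.addInput(input)

        let videoOutput = AVCaptureVideoDataOutput()
        session.addOutput(videoOutput)
        videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "video"))

        // Intrinsic matrix delivery is opt-in and is enabled per connection.
        if let connection = videoOutput.connection(with: .video),
           connection.isCameraIntrinsicMatrixDeliverySupported {
            connection.isCameraIntrinsicMatrixDeliveryEnabled = true
        }

        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // The intrinsics logging shown above runs here.
    }
}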

However, I'd now also like to get the distortion associated with that lens. To do so, I have changed my app to request a device of type .builtInDualCamera and enabled the depth data stream, as only AVDepthData buffers carry companion distortion data (in their cameraCalibrationData property).
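
Concretely, the changes amount to this (again a sketch, reusing the hypothetical CaptureController from above):

// CaptureController additionally adopts AVCaptureDepthDataOutputDelegate,
// and the session configuration changes as follows.
extension CaptureController: AVCaptureDepthDataOutputDelegate {
    func configureDepth() {
        // Request the dual camera instead of the single wide-angle module.
        guard let device = AVCaptureDevice.default(.builtInDualCamera, for: .video, position: .back),
              let input = try? AVCaptureDeviceInput(device: device) else { return }
        session.addInput(input)

        // Stream depth alongside video; AVDepthData carries cameraCalibrationData.
        let depthOutput = AVCaptureDepthDataOutput()
        session.addOutput(depthOutput)
        depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth"))
    }

    func depthDataOutput(_ output: AVCaptureDepthDataOutput, didOutput depthData: AVDepthData,
                         timestamp: CMTime, connection: AVCaptureConnection) {
        // The logging below runs here.
    }
}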

In the depth capture delegate callback I'm logging the distortion center, the lookup table, and the camera intrinsics:

guard let calibrationData = depthData.cameraCalibrationData else {
    return
}
print("Intrinsics = \(calibrationData.intrinsicMatrix)")

let distoCenter = calibrationData.lensDistortionCenter
print("Distortion center: \(distoCenter)")

// More code to log the LUT data too
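
The elided LUT logging is along these lines (lensDistortionLookupTable is a Data containing packed Float values):

if let lut = calibrationData.lensDistortionLookupTable {
    // The table holds radial magnification factors, sampled from the
    // distortion center (first entry) out to the farthest corner (last entry).
    let values = lut.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
    print("LUT: \(values.count) values, first = \(values.first ?? 0), last = \(values.last ?? 0)")
}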

However, in this case the intrinsics are wildly different and make no sense for a 1920x1080 camera (there seems to be a scale factor of about 2.2 between the two intrinsic matrices):

Intrinsics = simd_float3x3([[9284.896, 0.0, 0.0], [0.0, 9284.896, 0.0], [2072.8423, 1174.5812, 1.0]])

// Corresponding matrix:
// [9284.896,    0.0,   2072.8423]
// [   0.0,   9284.896, 1174.5812]
// [   0.0,      0.0,       1.0  ]

Distortion center: (2072.839599609375, 1174.5499267578125)
  1. Can someone please explain to me where this 2.2 ratio comes from?
  2. Is it possible to precompute it from some query on the capture session, or does it have to be estimated from the focal lengths in the two intrinsic matrices?
  3. Correctly applying the LUT to rectify the image requires computing distances from the distortion center; I assume this has to take the extra rescaling into account? (My current guess is sketched after the example code below.)

// Example code from AVCameraCalibrationData.h to rectify a point:
// Determine the maximum radius.
float delta_ocx_max = MAX( opticalCenter.x, imageSize.width  - opticalCenter.x );
float delta_ocy_max = MAX( opticalCenter.y, imageSize.height - opticalCenter.y );
float r_max = sqrtf( delta_ocx_max * delta_ocx_max + delta_ocy_max * delta_ocy_max );
 
// Determine the vector from the optical center to the given point.
float v_point_x = point.x - opticalCenter.x;
float v_point_y = point.y - opticalCenter.y;
 
// Determine the radius of the given point.
float r_point = sqrtf( v_point_x * v_point_x + v_point_y * v_point_y );
 
// Look up the relative radial magnification to apply in the provided lookup table
float magnification;
const float *lookupTableValues = lookupTable.bytes;
NSUInteger lookupTableCount = lookupTable.length / sizeof(float);
 
if ( r_point < r_max ) {
  ...
}
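
And the guess referenced in question 3, as an unverified sketch (I suspect intrinsicMatrixReferenceDimensions is what relates the two coordinate spaces, but that is exactly what I'm asking):

// Unverified: map the calibration-space distortion center into the
// 1920x1080 video pixel space before computing r_max / r_point as above.
let refDims = calibrationData.intrinsicMatrixReferenceDimensions // space the intrinsics/LUT are expressed in
let videoSize = CGSize(width: 1920, height: 1080)
let scale = videoSize.width / refDims.width                      // presumably the ~1/2.2 factor?

let center = calibrationData.lensDistortionCenter
let opticalCenter = CGPoint(x: center.x * scale, y: center.y * scale)
// Since the LUT is indexed by the normalized ratio r_point / r_max, I would
// expect the table itself not to need rescaling, but I'd like confirmation.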