I have camera calibration intrinsics and extrinsics (including rotations and translations, i.e. rvecs and tvecs) for a set of N camera poses relative to a fixed ChArUco target.
Additionally, for every camera pose, I have a set of 3D coordinates ("point cloud") defined in the standard OpenCV camera coordinate system.
After a lot of reading, I thought that I first need to compute each camera pose relative to the ChArUco board by constructing an inverted homogeneous transformation matrix, like so in Python:
import cv2
import numpy as np

# rvecs and tvecs below hold the Rodrigues vector and translation for ONE camera pose
# initialize 4x4 transform
inverted_transform_matrix = np.zeros(shape=(4,4))
inverted_transform_matrix[3,3] = 1.0
# convert Rodrigues vector into a rotation matrix, and then invert it
# (the inverse of a rotation matrix is its transpose)
rotation_matrix, _ = cv2.Rodrigues(rvecs)
inverted_rotation = rotation_matrix.transpose()
# add inverted rotation to transform
inverted_transform_matrix[:3,:3] = inverted_rotation
# compute inverted translation, e.g. see http://ksimek.github.io/2012/08/22/extrinsic/
# (matrix product via @, not the elementwise * of plain ndarrays)
inverted_translation_vector = -inverted_rotation @ np.asarray(tvecs, dtype=float).reshape(3,1)
inverted_transform_matrix[:3,3] = inverted_translation_vector.flatten()
# x_coords, y_coords, z_coords are defined in camera coordinate system
x_coords = np.asarray([1,2,3,4,5])
y_coords = np.asarray([2,4,6,8,10])
z_coords = np.asarray([3,6,9,12,15])
homogeneous_ones = np.ones(len(x_coords))
# stack into a 4xN array of homogeneous column vectors
homogeneous_points = np.vstack([x_coords, y_coords, z_coords, homogeneous_ones])
# perform the transformation
transformed_points = inverted_transform_matrix @ homogeneous_points
# extract x,y,z values (rows 0, 1, 2) as 1D arrays
x_coords = transformed_points[0,:]
y_coords = transformed_points[1,:]
z_coords = transformed_points[2,:]
The above code runs, but when I apply it to multiple point clouds from different camera perspectives, they do not line up as I would expect. So far I have confirmed that my inverse homogeneous transform is indeed the inverse of the homogeneous transform constructed directly from rvecs and tvecs; tried breaking it apart into translation first and then rotation, and vice versa; and seen a non-inverse transformation make everything almost line up from one perspective (but rotated weirdly from every other perspective).
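For reference, here is a minimal sketch of the kind of round-trip check described above. It assumes rvec/tvec map board coordinates into camera coordinates (p_cam = R * p_board + t), so the inverse is p_board = R^T * (p_cam - t). The names `rodrigues_to_matrix` and `camera_to_board` are hypothetical helpers, the former is a NumPy-only stand-in for `cv2.Rodrigues` so the snippet runs without OpenCV, and the pose and points are made-up values, not my calibration output.

```python
import numpy as np


def rodrigues_to_matrix(rvec):
    """NumPy stand-in for cv2.Rodrigues: axis-angle vector -> 3x3 rotation."""
    rvec = np.asarray(rvec, dtype=np.float64).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    kx, ky, kz = rvec / theta
    # Skew-symmetric cross-product matrix of the unit rotation axis
    K = np.array([[0.0, -kz, ky],
                  [kz, 0.0, -kx],
                  [-ky, kx, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def camera_to_board(points_cam, rvec, tvec):
    """Map an (N, 3) array of camera-frame points into the board frame.

    Assumes p_cam = R @ p_board + t for each point.
    """
    R = rodrigues_to_matrix(rvec)
    t = np.asarray(tvec, dtype=np.float64).reshape(1, 3)
    # Row-vector form of p_board = R.T @ (p_cam - t)
    return (np.asarray(points_cam, dtype=np.float64) - t) @ R


# Made-up pose and board-frame points, for illustration only
rvec = np.array([0.1, -0.2, 0.3])
tvec = np.array([0.5, -0.1, 2.0])
points_board = np.array([[0.0, 0.0, 0.0],
                         [0.1, 0.0, 0.0],
                         [0.0, 0.2, 0.05]])

R = rodrigues_to_matrix(rvec)
points_cam = points_board @ R.T + tvec          # forward: board -> camera
recovered = camera_to_board(points_cam, rvec, tvec)  # inverse: camera -> board
print(np.allclose(recovered, points_board))
```

If the same function is applied to each pose's point cloud with that pose's own rvec and tvec, all clouds should end up in the board frame, provided the clouds really use the same camera axes and units as the calibration.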
Any help appreciated!