I'm trying to compute the relative pose between two frames in a video from the KITTI raw dataset. The OXTS
data provides lat, lon, alt, roll, pitch, yaw
for each frame. How can I convert this data into a transformation matrix (rotation matrix and translation vector)?
This answer suggests that it is possible, but doesn't give a solution. Python is preferred, but if you have a solution in another language, that's fine too; I can translate it to Python.
Sample Data:
lat, lon, alt, roll, pitch, yaw = 49.015003823272, 8.4342971002335, 116.43032836914, 0.035752, 0.00903, -2.6087069803847
PS: I'm trying to pose-warp one frame to the other using projective geometry. For this, I need the pose, depth, and camera matrix. KITTI raw provides the camera matrix, and I plan to compute depth from the stereo images, so I'm left with computing the pose/transformation matrix between the two frames.
The camera pose is defined by extrinsic and intrinsic transformations. In the KITTI case the cameras only rotate and translate with respect to the world (the intrinsics of a camera are assumed constant, since it is just the car moving through space). In short, the extrinsics map a world point into the camera frame:

c = [R|t] w

where

c - homogeneous coordinates of a point w.r.t. the camera
[R|t] - camera extrinsic matrix
w - homogeneous coordinates of the point in world space
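A small numerical sketch of that mapping (numpy, illustrative values only; the rotation and translation here are made up, not KITTI data):

```python
import numpy as np

# Illustrative extrinsics: rotation about z by 90 degrees, translation (1, 2, 0).
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
t = np.array([[1.0], [2.0], [0.0]])

# Build the 3x4 extrinsic matrix [R|t].
Rt = np.hstack([R, t])

# A world point in homogeneous coordinates.
w = np.array([1.0, 0.0, 0.0, 1.0])

# Map the point into the camera frame: c = [R|t] w.
c = Rt @ w
print(c)  # -> approximately [1., 3., 0.]
```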
Starting from the data provided by KITTI, use the oxts utilities (e.g. in pykitti) to compute the 6-DoF transformation with respect to the Earth.
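What pykitti's `utils.pose_from_oxts_packet` does is essentially the following (a sketch based on the Mercator projection used by the KITTI devkit; the function name `pose_from_oxts` and the flat argument list are my own simplification of pykitti's packet-based API):

```python
import numpy as np

def rotx(a):
    """Elementary rotation about the x-axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def roty(a):
    """Elementary rotation about the y-axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rotz(a):
    """Elementary rotation about the z-axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def pose_from_oxts(lat, lon, alt, roll, pitch, yaw, scale):
    """R, t of the IMU w.r.t. the Earth, via the Mercator projection
    used in the KITTI devkit."""
    er = 6378137.0  # Earth radius in meters
    tx = scale * lon * np.pi * er / 180.0
    ty = scale * er * np.log(np.tan((90.0 + lat) * np.pi / 360.0))
    tz = alt
    t = np.array([tx, ty, tz])
    # Combined rotation: yaw, then pitch, then roll.
    R = rotz(yaw) @ roty(pitch) @ rotx(roll)
    return R, t
```

Here `scale` is `cos(lat * pi / 180)` computed from the first frame of the sequence, so the Mercator scale stays consistent across all frames.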
Then the function call, with the metadata read from the oxts file, will be:
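With the sample values from the question it would look something like this (a sketch; pykitti's `OxtsPacket` has more fields than shown here, and the `pykitti.utils` calls are commented out so the snippet runs without the library installed):

```python
import numpy as np
from collections import namedtuple

# Minimal stand-in for pykitti's OxtsPacket, with only the fields
# that the pose computation uses (the real namedtuple has many more).
OxtsPacket = namedtuple('OxtsPacket', 'lat lon alt roll pitch yaw')

packet = OxtsPacket(lat=49.015003823272, lon=8.4342971002335,
                    alt=116.43032836914, roll=0.035752,
                    pitch=0.00903, yaw=-2.6087069803847)

# Mercator scale from the latitude of the (first) frame.
scale = np.cos(packet.lat * np.pi / 180.0)

# With pykitti installed:
# from pykitti.utils import pose_from_oxts_packet, transform_from_rot_trans
# R, t = pose_from_oxts_packet(packet, scale)
# T_w_imu = transform_from_rot_trans(R, t)  # 4x4 pose of the IMU in the world
```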
Ref: pykitti repository: https://github.com/utiasSTARS/pykitti
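To get the relative pose the question asks for, compose the two per-frame poses (a sketch, assuming `T1` and `T2` are the 4x4 world-from-IMU poses of the two frames built as above):

```python
import numpy as np

def relative_pose(T1, T2):
    """Pose of frame 2 expressed in frame 1: T_12 = T1^-1 @ T2."""
    return np.linalg.inv(T1) @ T2

# Toy check: if frame 2 is frame 1 translated by 5 m along x,
# the relative pose should be a pure 5 m x-translation.
T1 = np.eye(4)
T2 = np.eye(4)
T2[0, 3] = 5.0
print(relative_pose(T1, T2)[0, 3])  # -> 5.0
```

Note that this gives the relative pose of the IMU; to get the relative pose of a camera you also need the IMU-to-camera calibration from the KITTI raw calibration files (`calib_imu_to_velo` and `calib_velo_to_cam`).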