I'm trying to compute the relative pose between two frames in a video from KITTI raw dataset. The oxts data provides lat, lon, alt, roll, pitch, yaw for each of the frames. How can I convert this data into a transformation matrix (rotation matrix and translation vector)?

This answer suggests that it is possible, but doesn't give a solution. Python is preferred, but if you have a solution in other languages, that's also fine. I can translate the code to python.

Sample Data:
lat, lon, alt, roll, pitch, yaw = 49.015003823272, 8.4342971002335, 116.43032836914, 0.035752, 0.00903, -2.6087069803847

PS: I'm trying to pose-warp one frame to the other using projective geometry. For this, I'll need pose, depth and camera matrix. KITTI raw provides camera matrix. I'm planning to compute depth from the stereo images. So, I'm left with computing pose/transformation matrix between them

1

There are 1 best solutions below

0
On

The camera position is defined by extrinsics and intrinsic transformations. In Kitti case the cameras can rotate and translate only respect to the world (intrinsics of a camera are supposed to be always the same since is just the car moving in the space). In short:

  • camera intrinsic matrix projects the points whose coordinates are given wrt the camera onto the image plane of the camera
  • camera extrinsic matrix is a change of basis matrix that converts the coordinates of a point from world coordinate system to camera coordinate system

Now consider:

c=[R|t]*w

where
c - Homogeneous coordinates of a point wrt the camera
[R|t] - Camera extrinsic matrix
w - Coordinates of a point in the world space

Starting from the data provided by kitti use the oxts package to compute the transofrmations of 6dof respect the earth.

def rotx(t):
    """Rotation about the x-axis."""
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[1,  0,  0],
                     [0,  c, -s],
                     [0,  s,  c]])


def roty(t):
    """Rotation about the y-axis."""
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[c,  0,  s],
                     [0,  1,  0],
                     [-s, 0,  c]])


def rotz(t):
    """Rotation about the z-axis."""
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[c, -s,  0],
                     [s,  c,  0],
                     [0,  0,  1]])

def transform_from_rot_trans(R, t):
    """Transforation matrix from rotation matrix and translation vector."""
    R = R.reshape(3, 3)
    t = t.reshape(3, 1)
    return np.vstack((np.hstack([R, t]), [0, 0, 0, 1]))

def pose_from_oxts_packet(metadata):

    """Helper method to compute a SE(3) pose matrix from an OXTS packet.
    Taken from https://github.com/utiasSTARS/pykitti
    """
    lat, lon, alt, roll, pitch, yaw = metadata
    scale = np.cos(lat * np.pi / 180.)

    er = 6378137.  # earth radius (approx.) in meters
    # Use a Mercator projection to get the translation vector
    ty = lat * np.pi * er / 180.

    tx = scale * lon * np.pi * er / 180.
    # ty = scale * er * \
    #     np.log(np.tan((90. + lat) * np.pi / 360.))
    tz = alt
    t = np.array([tx, ty, tz]).reshape(-1,1)

    # Use the Euler angles to get the rotation matrix
    Rx = rotx(roll)
    Ry = roty(pitch)
    Rz = rotz(yaw)
    R = Rz.dot(Ry.dot(Rx))
    return transform_from_rot_trans(R, t)

then the function call will be with the metadata read from the oxts file:

metadata = fromtxt("drive/oxts/data/0000000000.txt")
pose_matrix = pose_from_oxts_packet(metadata[:6])

Ref: Repo kitti dataset https://github.com/utiasSTARS/pykitti