OpenCV: What does it mean when the number of inliers returned by recoverPose() function is 0?

577 Views Asked by At

I've been working on a pose estimation project and one of the steps is finding the pose using the recoverPose function of OpenCV.

int cv::recoverPose(InputArray       E,
                    InputArray       points1,
                    InputArray       points2,
                    InputArray       cameraMatrix,
                    OutputArray      R,
                    OutputArray      t,
                    InputOutputArray mask = noArray() 
                   )

I have all the required info: essential matrix E, key points in image 1 points1, corresponding key points in image 2 points2, and the cameraMatrix. However, the one thing that still confuses me a lot is the int value (i.e. the number of inliers) returned by the function. As per the documentation:

Recover relative camera rotation and translation from an estimated essential matrix and the corresponding points in two images, using cheirality check. Returns the number of inliers which pass the check.

However, I don't completely understand that yet. I'm concerned with this because, at some point, the yaw angle (calculated using the output rotation matrix R) suddenly jumps by more than 150 degrees. For that particular frame, the number of inliers is 0. So, as per the documentation, no points passed the cheirality check. But still, what does it mean exactly? Can that be the reason for the sudden jump in yaw angle? If yes, what are my options to avoid that? As the process is iterative, that one sudden jump affects all the further poses!

1

There are 1 best solutions below

3
On

This function decomposes the Essential matrix E into R and t. However, you can get up to 4 solutions, i. e. pairs of R and t. Of these 4, only one is physically realizable, meaning that the other 3 project the 3D points behind one or both cameras.

The cheirality check is what you use to find that one physically realizable solution, and this is why you need to pass matching points into the function. It will use the matching 2D points to triangulate the corresponding 3D points using each of the 4 R and t pairs, and choose the one for which it gets the most 3D points in front of both cameras. This accounts for the possibility that some of the point matches can be wrong. The number of points that end up in front of both cameras is the number of inliers that the functions returns.

So, if the number of inliers is 0, then something went very wrong. Either your E is wrong, or the point matches are wrong, or both. In this case you simply cannot estimate the camera motion from those two images.

There are several things you can check.

  • After you call findEssentialMat you get the inliers from the RANSAC used to find E. Make sure that you are passing only those inlier points into recoverPose. You don't want to pass in all the points that you passed into findEssentialMat.
  • Before you pass E into recoverPose check if it is of rank 2. If it is not, then you can enforce the rank 2 constraint on E. You can take the SVD of E, set the smallest eigenvalue to 0, and then reconstitute E.
  • After you get R and t from recoverPose, you can check that R is indeed a rotation matrix with the determinate equal to 1. If the determinant is equal to -1, then R is a reflection, and things have gone wrong.