Reconstructing a non-planar polygon in 3D given a 2D projection and known polygon dimensions

I have a non-planar object made of 9 points with known dimensions in 3D, i.e. the lengths of all sides are known. Given a 2D projection of this shape, I want to reconstruct its 3D model. In other words, I want to retrieve the shape of the object in the real world, i.e. the angles between its sides in 3D. For example, given the dimensions of every part of a table and a 2D image of it, I'm trying to reconstruct its 3D model.

I've read about homographies, perspective transforms, Procrustes analysis and the fundamental/essential matrix so far, but haven't found a solution that applies here. I'm new to this, so I might have missed something. Any direction on this would be really helpful.

There is 1 best solution below

In your question, you mention that you want to achieve this using only a single view of the object. In that case, homographies or essential/fundamental matrices won't help you in their usual two-view formulation, because they relate at least two views of the scene. If you don't have any priors on the shape of the objects that you want to reconstruct, the key information that you'll be missing is (relative) depth, and in that case I think there are two possible solutions:

  • Leverage a learning algorithm. There is a rich literature on 6-DoF object pose estimation with deep networks; see this paper for example. You won't have to deal with depth directly if you use those, since these networks are trained end to end to estimate a pose in SE(3) (rotation and translation).

  • Add many more images and use a dense photometric SLAM/SfM pipeline, such as ElasticFusion. However, in that case you will need to segment the resulting model, since these pipelines reconstruct the entire environment, and that can be difficult depending on the scene.

However, as you mentioned in your comment, it is possible to reconstruct the model up to scale if you have very strong priors on its geometry. For a planar object (a cuboid is just an extension of that), you can use this simple algorithm. It is more or less what they do here; there are other methods, but I find them a bit messy equation-wise:

//Let A, B, C, D denote the corners of the 3D rectangle that we are after, such that
//AB is parallel to CD. Let a, b, c, d denote their respective
//projections in the image, i.e. a = KA (up to scale) where K is the calibration matrix, and so on.

1) Compute the common vanishing point of AB and CD. This is just the intersection
   of the lines ab and cd in the image plane. Let's call it v_1.
2) Do the same for the two other edges, i.e. bc and da. Let's call this
   vanishing point v_2.
3) Now you can compute the vanishing line l = crossproduct(v_1, v_2) (in
   homogeneous coordinates), i.e. the line going through both v_1 and v_2. This gives
   you the orientation of your plane: its normal is N = K^T l, up to scale.
4) All you need to find now are the corners of the rectangle. To do
   that, just consider any plane with normal N that doesn't go through
   the camera center. Now find the intersections of the rays K^{-1}a, K^{-1}b,
   K^{-1}c, K^{-1}d with that plane. The choice of plane only sets the overall
   scale, which a single known side length fixes.
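The steps above can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `reconstruct_rectangle` and the `known_ab_length` parameter are hypothetical, the corners are assumed to be given in order around the rectangle, K is assumed to be a standard pinhole calibration matrix with last row (0, 0, 1), and the plane normal is taken as N = K^T l, the usual relation between a plane's vanishing line and its normal in the camera frame.

```python
import numpy as np

def reconstruct_rectangle(a, b, c, d, K, known_ab_length=None):
    """Sketch: recover 3D corners A, B, C, D (camera frame) of a rectangle
    from its pixel corners a, b, c, d, given in order around the rectangle.

    Returns (corners, N): a 4x3 array of 3D points and the unit plane normal.
    Without known_ab_length the result is only defined up to scale.
    """
    # Lift the pixel corners to homogeneous coordinates (u, v, 1).
    ah, bh, ch, dh = (np.array([p[0], p[1], 1.0]) for p in (a, b, c, d))

    # Vanishing points: the line through two points is their cross product,
    # and the intersection of two lines is again a cross product.
    v1 = np.cross(np.cross(ah, bh), np.cross(ch, dh))  # ab meets cd
    v2 = np.cross(np.cross(bh, ch), np.cross(dh, ah))  # bc meets da

    # Vanishing line of the plane, then its normal N = K^T l (up to scale).
    l = np.cross(v1, v2)
    N = K.T @ l
    N = N / np.linalg.norm(N)

    # Back-project each pixel to a viewing ray r = K^{-1} p.
    K_inv = np.linalg.inv(K)
    rays = np.array([K_inv @ p for p in (ah, bh, ch, dh)])

    # Pick the sign of N so the plane N . X = 1 lies in front of the camera.
    if N @ rays[0] < 0:
        N = -N

    # Intersect each ray X = t * r with the plane N . X = 1, so t = 1 / (N . r).
    corners = rays / (rays @ N)[:, None]

    # Assumption: one known side length (here |AB|) fixes the global scale.
    if known_ab_length is not None:
        scale = known_ab_length / np.linalg.norm(corners[1] - corners[0])
        corners = corners * scale

    return corners, N
```

Calling `reconstruct_rectangle(a, b, c, d, K, known_ab_length=2.0)` with four pixel corners would return metric corner positions in the camera frame. Working in homogeneous coordinates means the fronto-parallel case, where the vanishing points lie at infinity, needs no special handling. With noisy corner detections the vanishing points can be poorly conditioned, so expect to refine the result.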

If you need a refresher on vanishing points and lines, I suggest you take a look at pages 213 and 216 of Hartley and Zisserman's book, Multiple View Geometry in Computer Vision.