I am trying to capture an image of an object with a monocular camera, in two steps:
- First, a normal calibration of the camera
- Then, a perspective calibration of the camera using known object coordinates in both the world system and the camera system
After these two steps I should be able to obtain the world coordinates of any detected object in the frame. This link explains what I'm trying to do in more detail, but unlike the link I am using OpenCV with Java rather than Python.
So far I have managed to do the normal calibration and obtained the camera intrinsic parameters and rotation/translation vectors. I used these parameters in the OpenCV function solvePnPRansac() to obtain the extrinsic matrix of the camera, which allowed me to build the projection matrix that converts points from world coordinates to image coordinates. Here are the obtained parameters:
From this step I have everything I need to conduct the two operations shown in the first link :
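Assuming the linked write-up uses the standard pinhole model, the two operations would read as follows. The symbols are my own notation, not copied from the link: $K$ is the intrinsic matrix (`optMat`), $R$ the rotation matrix obtained from `Rodrigues`, $t$ the translation vector (`tvecsPnP`), and $s$ the scale factor. All vectors are column vectors.

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left( R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + t \right)
$$

$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = R^{-1} \left( K^{-1} \, s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} - t \right)
$$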


Now this is where things get complicated. When I project the world-system coordinates, I obtain the same image-system coordinates as the object detection algorithm (± a few pixels). However, when I try the second operation, the results make no sense at all. First, here's the code I used to obtain the extrinsic parameters:
double[] cx = this.optMat.get(0, 2);
double[] cy = this.optMat.get(1, 2);
int realCenterX = 258;
int realCenterY = 250;
int realCenterZ = 453;
MatOfPoint3f worldPoints = new MatOfPoint3f();
MatOfPoint2f imagePoints = new MatOfPoint2f();
List<Point3> objPoints = new ArrayList<Point3>();
objPoints.add(new Point3(realCenterX,realCenterY,realCenterZ));
objPoints.add(new Point3(154,169,475));
objPoints.add(new Point3(244,169,470));
objPoints.add(new Point3(337,169,470));
objPoints.add(new Point3(154,240,469));
objPoints.add(new Point3(244,240,452));
objPoints.add(new Point3(337,240,462));
objPoints.add(new Point3(154,310,472));
objPoints.add(new Point3(244,310,460));
objPoints.add(new Point3(337,310,468));
worldPoints.fromList(objPoints);
List<Point> imgPoints = new ArrayList<Point>();
imgPoints.add(new Point(cx[0],cy[0]));
imgPoints.add(new Point(569,99));
imgPoints.add(new Point(421,100));
imgPoints.add(new Point(272,100));
imgPoints.add(new Point(571,212));
imgPoints.add(new Point(422,213));
imgPoints.add(new Point(273,214));
imgPoints.add(new Point(574,327));
imgPoints.add(new Point(423,328));
imgPoints.add(new Point(273,330));
imagePoints.fromList(imgPoints);
for(int i= 0;i<worldPoints.rows();i++) {
    for(int j=0;j<worldPoints.cols();j++) {
        double[] pointI = worldPoints.get(i, j);
        double wX = pointI[0]-realCenterX;
        double wY = pointI[1]-realCenterY;
        double wD = pointI[2];
        double D1 = Math.sqrt((wX*wX)+(wY+wY));
        double wZ = Math.sqrt((wD*wD)+(D1*D1));
        pointI[2] = wZ;
        worldPoints.put(i, j, pointI);
    }
}
Mat optMatInv = new Mat();
Core.invert(this.optMat, optMatInv);
Calib3d.solvePnPRansac(worldPoints, imagePoints, optMat, distCoeffs, rvecsPnP, tvecsPnP, true, 100, (float) 0.5, 0.99, new Mat(), Calib3d.SOLVEPNP_ITERATIVE);
Calib3d.Rodrigues(this.rvecsPnP, this.rodriguesVecs);
this.rodriguesVecs.copyTo(this.extrinsicMat);
List<Mat> concat = new ArrayList<Mat>();
concat.add(this.rodriguesVecs);
concat.add(this.tvecsPnP);
Core.hconcat(concat, this.extrinsicMat);
Core.gemm(this.optMat, this.extrinsicMat, 1, new Mat(), 0, this.projectionMat);
int nbOfElements = worldPoints.rows() * worldPoints.cols();
List<Double> sDescribe = new ArrayList<Double>();
For the first operation (from world system to image system):
for(int i= 0;i<nbOfElements;i++) {
    double[] pointArray = worldPoints.get(i,0);
    Mat pointI = new Mat(1,4,CvType.CV_64F);
    pointI.put(0, 0, pointArray[0]);
    pointI.put(0, 1, pointArray[1]);
    pointI.put(0, 2, pointArray[2]);
    pointI.put(0, 3, 1);
    Mat transPointI = new Mat(4,1,CvType.CV_64F);
    Core.transpose(pointI,transPointI);
    Mat sUV = new Mat(3,1,CvType.CV_64F);
    Core.gemm(projectionMat, transPointI, 1, new Mat(), 0, sUV);
    double[] sArray0 = sUV.get(2,0);
    double s = sArray0[0];
    Mat UV = new Mat();
    Core.multiply(sUV, new Scalar(1/s,1/s,1/s), UV);
    sDescribe.add(i, s);
}
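For reference, the forward projection can also be written without OpenCV `Mat` objects, which makes the column-vector convention explicit. This is only a minimal sketch: the class and method names (`PinholeProjection`, `mul`, `project`) are hypothetical, the matrices are plain `double` arrays rather than the `optMat`, `rodriguesVecs` and `tvecsPnP` fields above, and lens distortion is ignored.

```java
/** Plain-Java pinhole projection: s * [u, v, 1]^T = K * (R * X + t). */
final class PinholeProjection {

    /** Multiplies a 3x3 matrix by a 3-element column vector. */
    static double[] mul(double[][] m, double[] v) {
        double[] out = new double[3];
        for (int row = 0; row < 3; row++) {
            out[row] = m[row][0] * v[0] + m[row][1] * v[1] + m[row][2] * v[2];
        }
        return out;
    }

    /**
     * Projects a world point to pixel coordinates.
     * Returns {u, v, s}, where s is the scale factor (depth in the camera frame).
     */
    static double[] project(double[][] k, double[][] r, double[] t, double[] world) {
        double[] cam = mul(r, world);      // rotate into the camera frame
        for (int i = 0; i < 3; i++) {
            cam[i] += t[i];                // then translate
        }
        double[] suv = mul(k, cam);        // apply intrinsics: s * [u, v, 1]
        double s = suv[2];
        return new double[] { suv[0] / s, suv[1] / s, s };
    }
}
```

Calling `PinholeProjection.project(k, r, t, new double[] {x, y, z})` returns the pixel coordinates together with the scale factor `s`, which is the value the loop above stores in `sDescribe`.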
The first operation gives pretty good results; for example, for the point (154,169,475), the obtained result was:
And here is the code for the second operation (from image system to world system):
for(int i= 0;i<nbOfElements;i++) {
    double[] pointArray = imagePoints.get(i, 0);
    double sPoint = sDescribe.get(i);
    Mat pointI = new Mat(3,1,CvType.CV_64F);
    pointI.put(0, 0, pointArray[0]);
    pointI.put(1, 0, pointArray[1]);
    pointI.put(2, 0, 1);
    Mat transPointI = new Mat();
    Core.transpose(pointI, transPointI);
    Mat sUV = new Mat(3,1,CvType.CV_64F);
    Core.multiply(transPointI, new Scalar(sPoint,sPoint,sPoint), sUV);
    Mat invCameraMatrix = new Mat(3,3,CvType.CV_64F);
    Core.invert(this.optMat, invCameraMatrix);
    Mat tmp1 = new Mat();
    Core.gemm(sUV,invCameraMatrix, 1, new Mat(), 0, tmp1);
    Mat tVecsInv = new Mat();
    Core.transpose(this.tvecsPnP, tVecsInv);
    Mat tmp2 = new Mat();
    Core.subtract(tmp1, tVecsInv, tmp2);
    Mat XYZ = new Mat();
    Mat inverseRMat = new Mat();
    Core.invert(this.rodriguesVecs, inverseRMat);
    Core.gemm(tmp2, inverseRMat, 1, new Mat(), 0, XYZ);
}
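One thing that may be worth checking in this loop is the multiplication order. A 1x3 row vector multiplied by a matrix on its right gives the transpose of "matrix transposed times column vector", so `gemm(sUV, invCameraMatrix, ...)` with a row vector is not the same as applying the inverse camera matrix to a column vector. For comparison, here is a minimal column-vector sketch of the inverse operation in plain Java. The class and method names (`PinholeBackProjection`, `backProject`) are hypothetical, the camera matrix is assumed to have zero skew, `r` is assumed to be a proper rotation matrix (so its inverse is its transpose), and distortion is ignored.

```java
/** Plain-Java back-projection: X = R^T * (K^{-1} * s * [u, v, 1]^T - t). */
final class PinholeBackProjection {

    /**
     * Recovers the world point from pixel (u, v) and its known scale factor s.
     * Assumes k = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] (zero skew).
     */
    static double[] backProject(double[][] k, double[][] r, double[] t,
                                double u, double v, double s) {
        // K^{-1} * s * [u, v, 1]^T, written out for a zero-skew camera matrix
        double[] cam = {
            s * (u - k[0][2]) / k[0][0],
            s * (v - k[1][2]) / k[1][1],
            s
        };
        // Undo the translation: cam - t
        double[] d = { cam[0] - t[0], cam[1] - t[1], cam[2] - t[2] };
        // Undo the rotation with R^T (the inverse of a rotation matrix)
        double[] world = new double[3];
        for (int col = 0; col < 3; col++) {
            world[col] = r[0][col] * d[0] + r[1][col] * d[1] + r[2][col] * d[2];
        }
        return world;
    }
}
```

With a consistent `k`, `r`, `t` and the scale factor `s` produced by the forward projection of a point, this should return that point's world coordinates up to floating-point error, which gives a quick round-trip check that is independent of the OpenCV calls.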
For the same point, the second operation returned coordinates like:
I'm really lost as to where the problem might be coming from. I have reviewed the code many times, but the algorithm doesn't seem wrong. However, I am suspicious of the obtained extrinsic parameters, especially the Z value of tvecsPnP, which is too high considering that it should be close to its X value according to my camera/world setup, but I don't really know how to fix it. If anyone has any clue on how to overcome this, please let me know :) Thank you!