I recently started learning about supervised monocular depth estimation, using the NYU-V2 dataset. Designing a torch loader and pre-processing the data was easy, since the structure of that dataset is quite clear. The KITTI dataset, however, is very confusing. Is it possible to use KITTI for supervised monocular depth estimation? I found a torch loader for KITTI here: https://github.com/joseph-zhang/KITTI-TorchLoader. However, I don't understand how to use it for depth estimation with the KITTI dataset; the folder structure is quite different. My plan is to train a simple CNN using a supervised mono depth approach.
Is it possible to use the KITTI dataset for supervised monocular depth estimation?
Asked by PNF
There are 2 best solutions below

The repository states that the dense depth maps are completions of the LiDAR ray maps, projected onto and aligned with the raw KITTI dataset.
Andreas Geiger et al., Vision meets Robotics: The KITTI Dataset
Looking at the dev toolkit for KITTI, the get_depth function receives as an argument the id of the camera that the Velodyne points are projected onto. This function is called in the dataloader with cam=self.cam, which is set as an attribute of the Kittiloader instance.
In other words, you can choose the camera for which the Velodyne projection and depth completion are performed. By default, cam is set to 2, which means cam_2, the left camera view.
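To make the projection step concrete, here is a minimal sketch of how 3-D points can be turned into a sparse depth map for one camera. It is not the repository's get_depth implementation: the function name project_points_to_depth and the intrinsic matrix K below are hypothetical, and the points are assumed to already be expressed in the chosen camera's coordinate frame (with real KITTI data, the Velodyne points first have to be transformed using the calibration files).

```python
import numpy as np

def project_points_to_depth(points_cam, K, height, width):
    """Project 3-D points (assumed to be in the camera frame) to a sparse depth map.

    points_cam: (N, 3) array of x, y, z in metres, with z pointing forward.
    K: (3, 3) pinhole intrinsic matrix (hypothetical values in the demo below).
    Returns a (height, width) float32 array; 0 marks pixels with no measurement.
    """
    # Keep only points in front of the camera.
    pts = points_cam[points_cam[:, 2] > 0]
    # Pinhole projection: [u*z, v*z, z]^T = K @ [x, y, z]^T
    uvw = pts @ K.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    z = pts[:, 2].astype(np.float32)
    # Discard points that fall outside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]
    # If several points hit the same pixel, keep the nearest one.
    depth = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0
    return depth

# Tiny synthetic demo (made-up intrinsics and points, not KITTI calibration).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 25.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 10.0],   # lands on the principal point
                   [0.0, 0.0, 5.0],    # same pixel, but nearer
                   [1.0, 0.0, 10.0],   # shifted to the right
                   [0.0, 0.0, -1.0],   # behind the camera, dropped
                   [100.0, 0.0, 1.0]]) # outside the image, dropped
depth_map = project_points_to_depth(points, K, height=50, width=100)
```

Most pixels in the result stay at 0, which is why LiDAR-based ground truth is sparse unless a completion step fills it in.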
I think it is possible, since the KITTI dataset contains depth maps together with the corresponding raw LiDAR scans and RGB images (left image, right image, and depth map) (KITTI). I don't know exactly how the GitHub repo works, but the dataset and dataloader should be in a similar format. Looking at the repo files, though, I think you only need to install the library and then pass in the root_path of your dataset and the PyTorch image transformations.
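One practical difference from NYU-V2 when training a supervised model is that LiDAR-derived depth maps usually have many pixels without a measurement. A common way to handle this is to compute the loss only over valid pixels. The sketch below shows the idea with NumPy; the function name masked_l1 and the convention that 0 means "no measurement" are assumptions for illustration, not something taken from the linked repository.

```python
import numpy as np

def masked_l1(pred, target):
    """Mean absolute error over pixels where the target has a measurement.

    Assumes target == 0 marks pixels without ground-truth depth.
    """
    valid = target > 0
    if not valid.any():
        # No supervised pixels in this sample.
        return 0.0
    return float(np.abs(pred[valid] - target[valid]).mean())

# Toy example: only the two pixels with non-zero ground truth contribute.
pred = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
target = np.array([[0.0, 2.5],
                   [0.0, 3.0]])
loss = masked_l1(pred, target)
```

The same masking idea carries over to a PyTorch training loop by building a boolean mask from the ground-truth tensor before reducing the loss.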