How to run inference on multiple images with detectron2 and DefaultPredictor


I have trained a model and now I would like to use it to detect objects in many images. As far as I can tell, DefaultPredictor only accepts a single image per call. What can I do?

I am really new to this field. I tried using a for loop, but it doesn't work. Are there any other methods?

%cd /kaggle/working/detectron2
import glob
cfg.MODEL.WEIGHTS = os.path.join("/kaggle/working/detectron2/output", "model_final.pth") # path to the model we trained
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.0001 # set a testing threshold
pred = DefaultPredictor(cfg)
os.chdir("/kaggle/working/detectron2/images")
for img in glob.glob('.jpg'):
    inputs = cv2.imread(img)
    outputs = pred(inputs)
    print(outputs)

There are 2 best solutions below

On BEST ANSWER

OK, I solved it this way:

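%cd /kaggle/working/detectron2
import glob
import os

import cv2
from detectron2.engine import DefaultPredictor

# cfg is the same config object that was used for training
cfg.MODEL.WEIGHTS = os.path.join("/kaggle/working/detectron2/output", "model_final.pth")   # path to the model we trained
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.0001   # set a testing threshold
pred = DefaultPredictor(cfg)
# The *.jpg wildcard matches every JPEG file in the images directory
for img in glob.glob('/kaggle/working/detectron2/images/*.jpg'):
    inputs = cv2.imread(img)   # image loaded as a BGR array
    outputs = pred(inputs)
    print(outputs)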

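I removed os.chdir() and passed the full directory path with a *.jpg wildcard to glob.glob(). The original pattern '.jpg' has no wildcard, so it only matches a file named exactly ".jpg" and the loop body never ran.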
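To see the difference between the two patterns without loading a model, here is a minimal sketch using only the standard library. The helper name find_images and the placeholder file names are made up for illustration; they are not part of detectron2.

```python
import glob
import os
import tempfile


def find_images(directory, pattern="*.jpg"):
    """Return the sorted paths in `directory` that match `pattern`."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


# Try both patterns on a throwaway directory of empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.jpg", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    # No wildcard: only a file literally named ".jpg" would match.
    no_wildcard = find_images(tmp, ".jpg")
    # With a wildcard: every file whose name ends in ".jpg" matches.
    with_wildcard = find_images(tmp, "*.jpg")
    matched_names = [os.path.basename(path) for path in with_wildcard]

print(no_wildcard)    # []
print(matched_names)  # ['a.jpg', 'b.jpg']
```

Each path returned this way can then be passed to cv2.imread() and the predictor, as in the loop above.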

On

Quoting the source code documentation for DefaultPredictor:

If you'd like to do anything more complicated, please refer to its source code as examples to build and use the model manually.

I.e., to run inference on multiple images, copy and rename the DefaultPredictor class (GitHub link), then replace its __call__ method with:

def __call__(self, original_images):
    with torch.no_grad():  # https://github.com/sphinx-doc/sphinx/issues/4258
        # Apply pre-processing to images.
        inputs = []
        for original_image in original_images:
            if self.input_format == "RGB":
                # whether the model expects BGR inputs or RGB
                original_image = original_image[:, :, ::-1]

            height, width = original_image.shape[:2]
            image = self.aug.get_transform(original_image).apply_image(original_image)
            image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
            # Tensor.to() is not in-place, so keep the returned tensor.
            image = image.to(self.cfg.MODEL.DEVICE)
            inputs.append({"image": image, "height": height, "width": width})

        predictions = self.model(inputs)
        return predictions

Use it the same way as DefaultPredictor, but pass a list of image arrays instead of a single image:

predictor = MyPredictor(cfg)
predictions = predictor(your_images)
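Passing every image in a single call may use more memory than the device has, so it can help to feed the predictor in smaller chunks. The following is only a sketch under assumptions: predict_in_batches, its batch_size parameter, and fake_predictor are hypothetical names, and the helper assumes the predictor returns one result per input image in the same order.

```python
def predict_in_batches(predictor, images, batch_size=4):
    """Call `predictor` on consecutive slices of `images` and join the results.

    Assumes `predictor` takes a list of images and returns a list with one
    result per image, in the same order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        results.extend(predictor(batch))
    return results


# Stand-in for the batched predictor so the sketch runs without a model:
# it records each batch size and returns one dict per input item.
seen_batch_sizes = []


def fake_predictor(batch):
    seen_batch_sizes.append(len(batch))
    return [{"input": item} for item in batch]


dummy_images = list(range(10))  # placeholders for real image arrays
outputs = predict_in_batches(fake_predictor, dummy_images, batch_size=4)

print(seen_batch_sizes)  # [4, 4, 2]
print(len(outputs))      # 10
```

With a real model, fake_predictor would be replaced by the MyPredictor instance and dummy_images by the list of arrays read with cv2.imread().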