Detectron2 pre-trained model using layoutparser in Docker container Error: Checkpoint Not Found

The following is my Dockerfile:

FROM python:3.9
RUN apt-get clean && apt-get update
RUN pip install --upgrade pip

RUN pip install layoutparser 

RUN pip install "layoutparser[ocr]" 

RUN pip install pytesseract 

RUN pip install pdf2image 

RUN pip install torch 

RUN pip install torchvision

RUN apt-get install -y poppler-utils  #(pdf-image) 

RUN apt-get install -y tesseract-ocr 

RUN apt-get install -y git #(to install detectron2) 

RUN pip install "git+https://github.com/facebookresearch/detectron2.git"  #(detectron2 model) 

RUN apt-get update && apt-get install ffmpeg libsm6 libxext6  -y #(required to run packages)

WORKDIR /home/jovyan/work/layout_parser

VOLUME ["/home/jovyan/work/layout_parser"]

CMD ["python", "test_code.py"]

Python code (test_code.py):

import pdf2image
import layoutparser as lp
import pytesseract
import numpy as np
import cv2
import matplotlib.pyplot as plt


pdf_file= r"/home/jovyan/work/layout_parser/test_pdf.pdf"
image = np.asarray(pdf2image.convert_from_path(pdf_file)[0])

model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')


Running this fails with a "Checkpoint not found" error (the error was posted as a screenshot, which is not preserved here).

I have tried the following to resolve the issue without any success:

  • different versions of torch and torchvision
  • Python 3.7, 3.8 and 3.9 base images
  • different config_path values for the pre-built models
  • manually downloading the model listed in the config file (layout_parser_modelzoo)
  • using the manually downloaded model with both the .pth and .pkl extensions, which also produced an error (posted as a screenshot, not preserved here)

How could I use the pre-trained models?

There is 1 answer below

I got it to run with the following steps:

  1. Download the correct YAML config file for your model from the URLs listed here: https://github.com/Layout-Parser/layout-parser/blob/main/src/layoutparser/models/detectron2/catalog.py (in CONFIG_CATALOG, starting at line 45) and place it in your working directory.

  2. Load the Detectron2 model like this:

cv_model = lp.Detectron2LayoutModel(
    config_path='config.yml',  # In model catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model label_map
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # Optional
)

where the label map has to correspond to the config file with the same name.

  3. In the config file, the model checkpoint is given as a URL, for example WEIGHTS: https://www.dropbox.com/s/d9fc9tahfzyl6df/model_final.pth?dl=1. The weights will be downloaded from there when the script runs.
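
If you would rather not download the weights at runtime (for example when the container has no network access), the steps above can be adapted to use a local checkpoint. The following is only a minimal sketch using the standard library. The helper names (filename_from_url, download, point_config_at_weights) and the /models/... path are made up for illustration and are not part of layoutparser. It assumes the config is a plain YAML file with a single WEIGHTS: line, as in the config described above.

```python
import re
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, dropping any query string.

    For a Dropbox link ending in 'model_final.pth?dl=1' this gives
    'model_final.pth', so the saved file does not end in '?dl=1'.
    """
    return PurePosixPath(urlparse(url).path).name


def download(url: str, target_dir: str = ".") -> Path:
    """Download url into target_dir under a clean file name (needs network)."""
    target = Path(target_dir) / filename_from_url(url)
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def point_config_at_weights(config_text: str, weights_path: str) -> str:
    """Replace the value of the WEIGHTS entry in the YAML text.

    Works line by line with a regular expression, so no YAML parser is
    required. Raises ValueError if the config has no WEIGHTS entry.
    """
    new_text, count = re.subn(
        r"^([ \t]*WEIGHTS:[ \t]*).*$",
        lambda m: m.group(1) + str(weights_path),
        config_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no WEIGHTS entry found in config")
    return new_text


# Example without network access: rewrite a small config snippet in memory.
example_config = (
    "MODEL:\n"
    "  WEIGHTS: https://www.dropbox.com/s/d9fc9tahfzyl6df/model_final.pth?dl=1\n"
    "  MASK_ON: false\n"
)
# '/models/model_final.pth' is a hypothetical location inside the container.
patched_config = point_config_at_weights(example_config, "/models/model_final.pth")
print(patched_config)
```

You would then write the patched text back to config.yml and pass that file as config_path as in step 2. As far as I know, Detectron2LayoutModel also accepts a model_path argument that overrides the WEIGHTS entry, so passing the local checkpoint path there may be enough without editing the config at all.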