Save each question of a scanned MCQ sheet as a separate image using Python

26 Views Asked by mannar mani At 10 March 2024 at 04:59

I have a scanned image of multiple-choice questions (MCQs); the image is attached. Each question is followed by its choices. My goal is to crop each question and save it as a separate image.

Here is my code in Python:

import cv2
import numpy as np
import pytesseract
import os
import re

# Path to the input image
image_path = r'd:\Users\Desktop\Sample\input.jpg'

# Read the image
image = cv2.imread(image_path)

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Use pytesseract to perform OCR on the image
ocr_result = pytesseract.image_to_string(gray_image, config='--psm 6')

# Find all occurrences of question numbers
matches = list(re.finditer(r'\d+\)', ocr_result))

# Define the output directory
output_dir = r'd:\Users\Desktop\Sample'

# Iterate over the matches to split each question as a separate image
for i, match in enumerate(matches):
    question_number = match.group()  # Get the question number
    start_x = 0  # Start X coordinate
    start_y = match.start()  # Start Y coordinate
    if i < len(matches) - 1:
        end_y = matches[i + 1].start() - 4  # End Y coordinate (4 pixels above the next question)
    else:
        end_y = gray_image.shape[0]  # End Y coordinate (end of the image)
    end_x = gray_image.shape[1]  # End X coordinate (end of the image)
    
    # Crop the image
    question_image = gray_image[start_y:end_y, start_x:end_x]

    # Save the cropped image
    output_path = os.path.join(output_dir, f'question_{i + 1}.jpg')
    cv2.imwrite(output_path, question_image)
    print(f"Question {i + 1} saved at: {output_path}")

# Display the number of questions found
print(f"Number of questions found: {len(matches)}")

But for some reason it doesn't work: the images are not cropped properly. Any help is much appreciated.
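
One likely cause, offered as a sketch rather than a verified fix: `match.start()` returns a character offset into the OCR text, not a pixel row, so it cannot serve as a Y coordinate for slicing the image. `pytesseract.image_to_data` reports a pixel `top` value for every recognised word, and those values can be used to crop instead. The helper names `question_bands` and `split_questions` below are hypothetical. The sketch also assumes a single-column page where each question number such as `1)` is OCR'd as the start of its own token and the choices are not labelled with the same `digit)` pattern.

```python
import re

# Same "1)" pattern as in the question, anchored to the start of an OCR token.
QUESTION_RE = re.compile(r'^\d+\)')


def question_bands(tokens, image_height, margin=4):
    """Return one (top, bottom) pixel-row pair per detected question.

    tokens: iterable of (text, top_px) pairs, e.g. built from image_to_data.
    Assumes a single-column layout with one question number per row.
    """
    tops = sorted(top for text, top in tokens if QUESTION_RE.match(text.strip()))
    bands = []
    for i, top in enumerate(tops):
        start = max(top - margin, 0)
        # Stop a few pixels above the next question, or at the page bottom.
        end = tops[i + 1] - margin if i + 1 < len(tops) else image_height
        bands.append((start, end))
    return bands


def split_questions(image_path, output_dir, margin=4):
    """Crop each question out of the scan and save it as question_N.jpg."""
    # Imported here so question_bands() can be tried without OpenCV/Tesseract.
    import os
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # image_to_data returns per-word boxes; 'top' is a pixel row, unlike
    # match.start(), which is a position in the OCR string.
    data = pytesseract.image_to_data(
        gray, config='--psm 6', output_type=pytesseract.Output.DICT)
    tokens = zip(data['text'], data['top'])

    paths = []
    bands = question_bands(tokens, gray.shape[0], margin)
    for i, (top, bottom) in enumerate(bands, start=1):
        path = os.path.join(output_dir, f'question_{i}.jpg')
        cv2.imwrite(path, gray[top:bottom, :])
        paths.append(path)
    return paths
```

Calling `split_questions(image_path, output_dir)` with the paths from the question would write one file per detected question number. If the OCR misreads or merges a number, the corresponding band will be missing or too tall, so it may be worth printing the detected `top` values first.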
