I have a physical photo album in which each page has one or more photos glued onto it.
I took a picture of each page, so every picture contains several photos. I put all of these pictures into a single folder, and I would like to iterate over it with Python and extract every photo that was glued onto each page.
I have the following Python script, but its downside is that it finds far too many contours (including contours inside the photos themselves).
What is a good (alternative) method for getting the contrast right when the page's background is white?
import cv2

# Read the image ("image" is the file name of the current page picture in the folder)
img = cv2.imread("images/" + image)
# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Show gray image
cv2.imshow('Gray Image', gray)
cv2.waitKey(0)
# Blur to reduce noise before edge detection
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# Apply edge detection using the Canny edge detector
edged = cv2.Canny(blurred, 50, 150)
# Find the outer contours in the edge map
contours, _ = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Keep only contours large enough to be a photo
min_area = 50000
filtered_contours = [cnt for cnt in contours if cv2.contourArea(cnt) > min_area]
extracted_photos = []
for i, contour in enumerate(filtered_contours):
    x, y, w, h = cv2.boundingRect(contour)
    extracted_photos.append(img[y:y+h, x:x+w])
    # Uncomment the following line to save individual photos
    # cv2.imwrite(f'photo_{i}.jpg', img[y:y+h, x:x+w])
# Show the extracted photos
cv2.imshow('Original Image', img)
cv2.waitKey(0)
for i, photo in enumerate(extracted_photos):
    cv2.imshow(f'Photo {i}', photo)
cv2.waitKey(0)
cv2.destroyAllWindows()


