Background removal/masking in Python using edge detection and scikit-image


I'd like to create two images from a source RAW image, a Canon CR2 in this case. I've got the RAW conversion sorted and some of the processing. My final images need to be a PNG with an alpha mask and a 95% quality JPG with the alpha area instead filled with black. I've got a test image set here showing how far I've got with detecting the subject:

https://i.stack.imgur.com/zaAx4.jpg

So, as you can see, I want to isolate the subjects from the grey background. I also want to mask out any shadows cast on the grey background, as much as possible and ideally entirely. I'm using a Python 2 script I've written, so far mostly with scikit-image. I'd swap to another Python-compatible image-processing library if required. Also, I need to do all steps in memory so that I only save out once at the end of all processing, first the PNG and then the JPG. So no subprocess.Popen etc.

You'll see from the sample images that, I think at least, I've got some way towards a solution already. I used scikit-image and its Canny edge detector for the images in my examples.

What I need to do now is figure out how to fill the subject in the Canny images with white, so that I get a proper solid white mask. In most of my example images, with the Canny filter applied, edge detection for the subjects themselves looks good, usually with a major unbroken border. But I'm guessing some future images won't be so clean and there may be small breaks in the major border. I need to handle that case if it looks like it'll be an issue for later processing steps.
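One common way to get from a Canny edge image to a solid mask is to bridge small breaks with a morphological closing and then flood-fill the interior. A sketch under that assumption (the `sigma` and `bridge_radius` values are guesses to tune per image set):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import feature, morphology

def edges_to_mask(gray, sigma=2.0, bridge_radius=2):
    """Turn Canny edges into a solid subject mask.

    A small closing first bridges minor breaks in the edge border so
    binary_fill_holes can flood the interior.
    """
    edges = feature.canny(gray, sigma=sigma)
    closed = morphology.binary_closing(edges, morphology.disk(bridge_radius))
    return ndi.binary_fill_holes(closed)
```

Larger `bridge_radius` values tolerate bigger breaks, at the cost of rounding off fine detail along the subject outline.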

Also, I'm wondering if I should grow the image border by one pixel, set it to the same colour as my 0,0 pixel (i.e. the first, top-left background pixel), run my Canny filter, and then shrink the border by 1 px again. This should allow the bottom edge to be detected, and cover cases where subjects break the top or sides of the frame.
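That padding idea can be sketched with np.pad. Two tweaks on the one-pixel version: padding by a few pixels keeps the detected edge clear of the frame (Canny is unreliable right at the boundary), and filling holes before cropping means the border is still closed when the fill runs. All names and values here are illustrative:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import feature, morphology

def mask_with_padded_border(gray, sigma=2.0, pad=5):
    """Pad the frame with the top-left (assumed background) pixel value,
    detect edges, bridge and fill while the border is closed, then crop
    back to the original size."""
    pad_value = gray[0, 0]
    padded = np.pad(gray, pad, mode="constant", constant_values=pad_value)
    edges = feature.canny(padded, sigma=sigma)
    bridged = morphology.binary_closing(edges, morphology.disk(2))
    filled = ndi.binary_fill_holes(bridged)
    return filled[pad:-pad, pad:-pad]
```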

So really I'm just looking for advice on where to go next to get a nice solid mask. It needs to stay binary (i.e. everything outside the main subject must be fully masked to 0). This means that at some point, probably as a last step, I'll need to find isolated islands below a certain pixel count (e.g. 50 px or so) and add them to the mask.
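scikit-image has this island-removal step built in. A sketch using the 50-pixel figure from above as the cutoff:

```python
import numpy as np
from skimage import morphology

def clean_mask(mask, min_size=50):
    """Drop isolated foreground islands below min_size pixels, and fill
    equally small holes inside the subject, keeping the mask binary."""
    cleaned = morphology.remove_small_objects(mask, min_size=min_size)
    cleaned = morphology.remove_small_holes(cleaned, area_threshold=min_size)
    return cleaned
```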

Also, overall, the rule of thumb would be that it's better if a little bit of the subject gets masked rather than less of the background being masked (i.e. I want all or as much as possible of the background/shadow areas to be masked.)

I've tried a few things but am not quite getting there. I'm thinking something along the lines of find_contours in scikit-image might help, but I can't quite see from the scikit-image examples how I'd pick a detected contour and then turn it into a mask. I've spent quite a bit of time experimenting without success today, so I thought I'd ask here and see if anyone has better ideas.
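For the find_contours route, one possible recipe is: take the longest contour (assuming it traces the main subject) and rasterise it with skimage.draw.polygon. The "longest contour = subject" assumption and the `level` value are guesses that would need checking against real images:

```python
import numpy as np
from skimage import draw, measure

def largest_contour_mask(gray, level=0.5):
    """Fill the longest iso-valued contour into a binary mask.

    level should sit between the background and foreground intensities.
    """
    contours = measure.find_contours(gray, level)
    contour = max(contours, key=len)               # assume longest = subject
    mask = np.zeros(gray.shape, dtype=bool)
    rr, cc = draw.polygon(contour[:, 0], contour[:, 1], shape=gray.shape)
    mask[rr, cc] = True
    return mask
```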

This is an OpenCV based method that looks promising:

http://funcvis.org/blog/?p=44

I'd like to stick with scikit-image or some other interchangeable NumPy-based image library for Python if possible. However, if it's simply easier and faster with OpenCV or another library, then I'm open to ideas, as long as I can stick with Python.

Also worth bearing in mind: for my application I will always have an image of the background without a subject, so maybe I should pursue that route. The problem is that I don't think a simple difference approach deals well with shadows. It seems to me that some sort of edge detection is needed at some point for a superior masking approach.
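One way to make a clean-plate difference less shadow-sensitive is to compare in hue/saturation rather than raw intensity, since a cast shadow is roughly the same colour as the background, only darker. A sketch of that idea (the threshold is a placeholder, and hue wraparound is ignored for brevity):

```python
import numpy as np
from skimage import color

def diff_mask(image_rgb, background_rgb, thresh=0.15):
    """Background subtraction against an empty-background shot.

    Compares only the Hue and Saturation channels so that
    brightness-only changes (shadows) are damped.
    """
    img_hsv = color.rgb2hsv(image_rgb)
    bg_hsv = color.rgb2hsv(background_rgb)
    # max channel-wise difference over H and S, ignoring V
    d = np.abs(img_hsv[..., :2] - bg_hsv[..., :2]).max(axis=-1)
    return d > thresh
```

This would likely still need the edge-based or morphological clean-up described elsewhere in the thread; it is a complement to, not a replacement for, those steps.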

[Source images 1-3 and their Canny edge results 1-3]

There are 2 answers below.

First answer:
From limited experience, I'll offer some ideas to try.

The Canny edge detection results aren't distinguishing holes (in the Result 2 object) from solid-colored areas (in Result 3). Is that OK for your purposes? Would it fit your needs to do blob-detection on those edges and fill in the blob(s), thus eliminating the holes from Result 2?

Let's assume the part you want to mask off is the original grey background along with the darker grey shadows cast on it. Also assume that only grey areas above some minimum size qualify as "holes", rather than grey pixels or grey noise on the object. (Is there any way to distinguish parts of the object that look like the grey background?)

So consider this plan:

  1. Convert the image to HSV (or HSL) color space.
  2. Compute an 8-bit-per-pixel grey-scale "threshold mask" image where each pixel indicates whether the corresponding input pixel is likely-background or likely-foreground: if the input pixel's Saturation is below a threshold ts (grey or nearly grey) and its Value (or Lightness) is within a threshold range [tv1 .. tv2] (dark-shadow grey through background grey), it's likely-background, so make the output pixel 0 (black); otherwise it's likely-foreground, so make it 255 (white).
  3. Dilate the white pixels to fill in gaps, then Erode them back to restore the original size. This pair of operations is also known as the Closing morphology. [Beware that the sample picture on that page is a confusing example. It dilates then erodes the white pixels of a sample image that's difficult to not view as black-on-white strokes!]
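A minimal sketch of the three steps above in scikit-image; the thresholds ts, tv1, tv2 and the closing radius are placeholders to calibrate against the actual background:

```python
import numpy as np
from skimage import color, morphology

def background_mask(rgb, ts=0.15, tv1=0.2, tv2=0.8, radius=3):
    """Steps 1-3: HSV conversion, saturation/value threshold, closing.

    rgb: HxWx3 float image in [0, 1].
    Returns 255 for likely-foreground, 0 for likely-background.
    """
    hsv = color.rgb2hsv(rgb)                        # step 1
    s, v = hsv[..., 1], hsv[..., 2]
    likely_bg = (s < ts) & (v >= tv1) & (v <= tv2)  # step 2
    fg = ~likely_bg
    fg = morphology.binary_closing(fg, morphology.disk(radius))  # step 3
    return fg.astype(np.uint8) * 255
```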

The above assumes the original background is uniform gray, without the spots in your actual samples. You could refine this plan to account for the background variations by making the threshold parameters be a function of the original background color.

Steps 2 and 3 produce all-or-nothing alpha channels (masks). It might be better to use multiple gray levels in these steps (fuzzy logic), but it's not obvious how to do that.

Note: If you use JPEG 2000 for the final output image format, then a single file can contain the lossy-compressed image and its alpha channel. It can also maintain the full color depth from the original RAW file.

Second answer:

I'm going to take a shot at this.

If you want clean masks of the objects, try adaptive thresholding (a local thresholding scheme). I think it might be viable for you, especially because it might remove the effects of the shadows. Along with that, try Otsu's thresholding (an automatic but global thresholding scheme).

See which one gets better results and implement the one you want.

I suggest this because your problem is very similar to a classic thresholding problem (objects against the same background).
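Both schemes are one-liners in scikit-image, so comparing them is cheap. A sketch (the `block_size` is a guess; it must be odd and roughly the scale of the shadows to suppress):

```python
import numpy as np
from skimage import filters

def try_thresholds(gray, block_size=51):
    """Return (global, local) masks from Otsu and adaptive thresholding."""
    global_mask = gray > filters.threshold_otsu(gray)
    local_mask = gray > filters.threshold_local(gray, block_size)
    return global_mask, local_mask
```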

Of course, use morphological operations to clean your mask (as pointed out by another user, a closing would suffice for removing small speckle noise).
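For speckle in particular, an opening (erode then dilate) removes small white specks and a closing fills small black pinholes; chaining the two is a common clean-up pass. A sketch with an illustrative radius:

```python
import numpy as np
from skimage import morphology

def despeckle(mask, radius=2):
    """Opening removes small white specks; closing then fills small
    black pinholes, leaving large subject regions intact."""
    footprint = morphology.disk(radius)
    cleaned = morphology.binary_opening(mask, footprint)
    return morphology.binary_closing(cleaned, footprint)
```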