What's the reason behind "extracting 8x8 patches" in a Restricted Boltzmann Machine?


I came across this documentation in the pyLearn2 (machine learning library) example of an RBM. Could someone tell me why it is easier?

# First we want to pull out small patches of the images, since it's easier
# to train an RBM on these
pipeline.items.append(
    preprocessing.ExtractPatches(patch_shape=(8, 8), num_patches=150000)
)

For what it's worth, I'm not well-informed about RBMs, so please bear with me. For the complete code, please refer to this link.

1 Answer
Simply put, as with any algorithm, complexity grows with the size of the input. Dividing the problem into smaller sub-problems and then combining the results can be faster (the divide-and-conquer approach).

Now, with these types of machine learning algorithms there is an additional need for abstraction in the features. You neither want to feed in every single pixel on its own (having only local information), nor represent the whole image by a single number/symbol (having only global information). A number of approaches combine these two kinds of information into hierarchical representations (mostly referred to as Deep Learning).

If you bring together these two concepts, it should be clear(er) that processing small image patches first gives you a large amount of local information, which you can then combine to infer global information at a later stage. So "because it's easier" is not the full reasoning behind it; it also tends to make the model perform better and more accurately.
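To make that concrete, here is a minimal NumPy sketch of the patch-extraction step. The function name `extract_patches` and the `(n_images, height, width)` input layout are my own assumptions for illustration; pylearn2's `ExtractPatches` preprocessor does essentially this on its dataset objects. The point is that each 8x8 patch has only 64 visible units, so the RBM sees a much smaller input than a full image would give it.

```python
import numpy as np

def extract_patches(images, patch_shape=(8, 8), num_patches=150000, rng=None):
    """Pull random patches out of a stack of images.

    `images` is assumed to have shape (n_images, height, width).
    This is a plain NumPy sketch of the idea, not pylearn2's implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_images, height, width = images.shape
    ph, pw = patch_shape
    patches = np.empty((num_patches, ph, pw), dtype=images.dtype)
    for i in range(num_patches):
        img = rng.integers(n_images)          # pick a random image
        row = rng.integers(height - ph + 1)   # random top-left corner
        col = rng.integers(width - pw + 1)
        patches[i] = images[img, row:row + ph, col:col + pw]
    return patches

# Example: 1000 fake 32x32 grayscale images -> 150000 8x8 patches.
# An RBM trained on these patches has 64 visible units instead of the
# 1024 it would need for whole 32x32 images.
images = np.random.rand(1000, 32, 32).astype(np.float32)
patches = extract_patches(images, patch_shape=(8, 8), num_patches=150000)
print(patches.shape)  # (150000, 8, 8)
```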

I hope this answers your question without being too vague (a thorough answer would become too long). For a more detailed introduction to RBMs, have a look at, e.g., chapter 7 on this page.