I've researched this a lot and could not find a definitive answer: what kind of image color format is most commonly used as pHash input when generating the hash/fingerprint?
For example, I have a target image that I'm looking for within a source image. The target can have many colors and shades, but the shape is always the same (e.g. tulips). I have experimented with the image as is, converted to grayscale, and thresholded (pure black and white). I know most pHash libraries convert the input to grayscale before the hash is computed.
But before I move forward, is pre-processing the image color worthwhile? (Ignoring size and rotation, and assuming those are the same for both source and target.)
After more testing and research, I found it best to use the original color image. Most pHash implementations convert the input to grayscale regardless, and in my tests converting to grayscale first and then letting the library apply its own grayscale conversion actually produced worse results. The same goes for thresholding (pure black and white): there were more collisions and many more false positives.
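To make it clearer where the internal grayscale step sits, here is a minimal sketch of a DCT-based 64-bit perceptual hash in plain Java. It is not the code of any particular library; the class name `PHashSketch`, the 32x32 working size, the unnormalized DCT, and taking the median over all 64 coefficients are my own assumptions for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.Arrays;

public class PHashSketch {
    static final int SIZE = 32; // side of the downscaled grayscale image (assumption)
    static final int LOW = 8;   // side of the low-frequency block kept: 8 x 8 = 64 bits

    /** Computes a 64-bit DCT-based hash from a color or grayscale image. */
    static long hash(BufferedImage input) {
        // Step 1: downscale and convert to grayscale in one pass.
        // This is the "internal" grayscale step, so the caller can pass color.
        BufferedImage gray = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(input, 0, 0, SIZE, SIZE, null);
        g.dispose();

        // Step 2: compute only the lowest LOW x LOW coefficients of an
        // unnormalized 2D DCT-II; these describe the coarse structure.
        double[] coeffs = new double[LOW * LOW];
        for (int v = 0; v < LOW; v++) {
            for (int u = 0; u < LOW; u++) {
                double sum = 0;
                for (int y = 0; y < SIZE; y++) {
                    for (int x = 0; x < SIZE; x++) {
                        sum += gray.getRaster().getSample(x, y, 0)
                             * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * SIZE))
                             * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * SIZE));
                    }
                }
                coeffs[v * LOW + u] = sum;
            }
        }

        // Step 3: the median of the coefficients is the per-image threshold.
        double[] sorted = coeffs.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double median = (sorted[mid - 1] + sorted[mid]) / 2.0;

        // Step 4: one bit per coefficient, set when it lies above the median.
        long bits = 0L;
        for (int i = 0; i < coeffs.length; i++) {
            if (coeffs[i] > median) {
                bits |= 1L << i;
            }
        }
        return bits;
    }
}
```

Since step 1 already reduces the input to gray levels, any extra grayscale or threshold pass only changes what that step gets to work with.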
I used a 64-bit pHash and it worked very well. I also tried a wavelet hash, which handled color changes well but was not as good for overall matching.
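For reference, two 64-bit hashes are usually compared by their Hamming distance (the number of differing bits). A minimal sketch, assuming each hash is held in a `long`; the class name `HashDistance` and the `maxDistance` threshold are hypothetical and would need tuning on your own data:

```java
public class HashDistance {
    /** Number of bit positions in which the two 64-bit hashes differ. */
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    /** Distance scaled to the range 0.0 (identical) to 1.0 (all 64 bits differ). */
    static double normalizedDistance(long a, long b) {
        return hammingDistance(a, b) / 64.0;
    }

    /** Treats two hashes as a match when at most maxDistance bits differ. */
    static boolean isMatch(long a, long b, int maxDistance) {
        return hammingDistance(a, b) <= maxDistance;
    }
}
```

A lower threshold gives fewer false positives but misses more near-duplicates, so it is worth checking against a set of known matches and non-matches.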
What worked for me was a large data set that was fed into a BinaryTree. This way the lookups were fast and there were many examples to compare against. For Java I used: https://github.com/KilianB/JImageHash
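To illustrate the idea of a binary tree keyed on hash bits, here is a generic sketch. It is not JImageHash's own class; the name `HashTrie` and its methods are hypothetical. Each level of the tree corresponds to one bit of a 64-bit hash, and a search is allowed to take the "wrong" branch only as many times as the distance budget permits.

```java
import java.util.ArrayList;
import java.util.List;

public class HashTrie {
    private static final class Node {
        Node zero;            // child for a 0 bit
        Node one;             // child for a 1 bit
        List<String> labels;  // only set on leaves (after all 64 bits)
    }

    private final Node root = new Node();

    /** Stores a label under the given 64-bit hash. */
    void add(long hash, String label) {
        Node n = root;
        for (int bit = 63; bit >= 0; bit--) {
            if (((hash >>> bit) & 1L) == 1L) {
                if (n.one == null) n.one = new Node();
                n = n.one;
            } else {
                if (n.zero == null) n.zero = new Node();
                n = n.zero;
            }
        }
        if (n.labels == null) n.labels = new ArrayList<>();
        n.labels.add(label);
    }

    /** Returns all labels whose hash is within maxDistance bits of the query. */
    List<String> search(long hash, int maxDistance) {
        List<String> out = new ArrayList<>();
        search(root, hash, 63, maxDistance, out);
        return out;
    }

    private void search(Node n, long hash, int bit, int budget, List<String> out) {
        if (n == null) return;
        if (bit < 0) {
            if (n.labels != null) out.addAll(n.labels);
            return;
        }
        boolean isOne = ((hash >>> bit) & 1L) == 1L;
        Node same = isOne ? n.one : n.zero;
        Node other = isOne ? n.zero : n.one;
        // Following the matching branch costs nothing.
        search(same, hash, bit - 1, budget, out);
        // Following the other branch spends one unit of the distance budget.
        if (budget > 0) search(other, hash, bit - 1, budget - 1, out);
    }
}
```

With a small distance budget most branches are pruned early, which is why lookups stay fast even with many stored examples.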