I have a dataset of several thousand color palettes. For each image in an image dataset, I have its ten most common RGB values and the fraction of the image that each color covers:
{
"id": uuid,
"colors": [[r0,g0,b0], [r1,g1,b1], ...],
"weights": [0.2, 0.15, ...]
}
For a new query palette, what is the best way to find the most similar reference palette in the dataset? So far, a simple weighted L1 distance between palettes has given me the best results, but it is far from optimal. For example, if the query contains only blues and greens, some of its nearest palettes under the weighted L1 distance still feature yellows and reds.
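To make "weighted L1 distance" concrete, here is a minimal sketch of one possible formulation. It assumes the colors in each palette are sorted by weight and compared slot by slot, with each slot's channel difference scaled by the mean of the two weights. The names `weighted_l1` and `most_similar` are hypothetical, and this is not necessarily the exact variant described above.

```python
def weighted_l1(query, reference):
    """Weighted L1 distance between two palettes (illustrative definition).

    Each palette is a dict with "colors" (a list of 3-component colors) and
    "weights" (a list of fractions), matching the record format above.
    Assumption: colors are paired slot by slot after sorting by weight.
    """
    def by_weight(palette):
        pairs = zip(palette["weights"], palette["colors"])
        # Sort on the weight only, heaviest color first.
        return sorted(pairs, key=lambda wc: wc[0], reverse=True)

    total = 0.0
    for (wq, cq), (wr, cr) in zip(by_weight(query), by_weight(reference)):
        # L1 difference between the two colors in this slot.
        channel_diff = sum(abs(a - b) for a, b in zip(cq, cr))
        # Scale by the mean weight of the two colors being compared.
        total += 0.5 * (wq + wr) * channel_diff
    return total


def most_similar(query, dataset):
    """Return the reference palette with the smallest distance to the query."""
    return min(dataset, key=lambda ref: weighted_l1(query, ref))
```

Because the pairing here depends only on rank, a dominant blue in the query is compared against whatever color happens to be dominant in the reference, whatever its hue.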
I've also experimented with different color spaces and found CIELAB to perform best with the L1 approach, but the results are still imprecise.
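For reference, a sketch of the color-space conversion this implies, assuming the stored values are 8-bit sRGB and using the D65 white point. The function name `srgb_to_lab` is hypothetical. A library such as scikit-image provides an equivalent conversion, so this is only meant to show what is being computed.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (assumes a D65 white point)."""
    def to_linear(c):
        # Undo the sRGB gamma encoding.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Cube root with a linear segment near zero, as in the CIELAB definition.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white before applying f.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))
```

Each palette color would be passed through this conversion before the distance is computed, so that differences are measured between lightness and the two opponent-color axes rather than between raw RGB channels.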