color optimization problems in python
You wouldn't think that artists might have a legitimate use for an optimization library like cvxpy or sk learn. ๐ This document acts as an addendum and starting point to look into this small python library I created at the beginning of the year: ColorNameFinder. This library is usable. The concept section in the readme there gives a good background story, which I will rephrase here to give some context to this article.
A project later making use of some of the ideas here, in a completely different way (in pure javascript) is on this stackblitz project
I want to point out also, that I asked this as a stackoverflow question but did not find support. I think there is enough interesting things here to share it nonetheless.
Please skip ahead to details if you prefer to understand the story here rather than contemplate optimization strategies.
question up front
def shade_func(color, offset):
return tuple([int(c * (1 - offset)) for c in color])
def tint_func(color, offset):
return tuple([int(c + (255 - c) * offset) for c in color])
def tone_func(color, offset):
return tuple([int(c * (1 - offset) + 128 * offset) for c in color])
given an objective over a collection of colors that returns the least distance to a target color, how do I ensure that basinhopping
isn't better than minimization from each color in the set you want to approximate in scikit learn?
I was thinking that, for any one color, there will be up to 4 moments in a v-shaped curve, and so only one minimum. if the value with offset zero is itself a minimum, maybe it could be 5. Am I wrong? In any case each is a single optimum, so if we are only searching one color at a time, no reason to use basinhopping.
If we instead use basinhopping to scan all colors at once (we can scale the two different dimensions, in fact this is where the idea of a preprocessor function first came from), it scans them, but does not do such a compelling job of scanning all colors. Some colors it only tries once. I think it might completely skip some colors with large enough sets.
details
I was inspired by way artyclick shows colors and allows searching for them. If you look at an individual color, for example mauve you'll notice that it prominently displays the shades, tints, and tones of the color, rather like an artist might like. If you ask it for the name of a color, it will use a hidden unordered list of about a thousand color names, and some javascript to find the nearest colorname to the color you chose. In fact it will also show alternatives.
I noticed that quite often a shade, tint or tone of an alternative (or even the best match) was often a better match than the color it provided. For those who don't know about shade, tint and tone, there's a nice write up at Dunn Edward's Paints. It looks like shade and tint are the same but with signs reversed, if doing this on tuples representing colors. For tone it is different, a negative value would I think saturate the result.
I felt like there must be authoritative (or at least well sourced) colorname sources it could be using.
In terms of the results, since I want any color or its shade/tint/tone, I want a result like this:
{'color': '#aabbcc',
'offset': {'type': 'tint', 'value': 0.31060384614807254}}
So I can return the actual color name from the color, plus the type of color transform to get there and the amount of you have to go.
For distance of colors, there is a great algorithm that is meant to model human perception that I am using, called CIEDE 2000. Frankly, I'm just using a snippet I found that implements this, it could be wrong.
So now I want to take in two colors, compare their shade, tint, and tone to a target color, and return the one with the least distance. After I am done, I can reconstruct if it was a shade, tint or tone transform from the result just by running all three once and choosing the best fit. With that structure, I can iterate over every color, and that should do it. I use optimization because I don't want to hard code what offsets it should consider (though I am reconsidering this choice now!).
because I want to consider negatives for tone but not for shade/tint, my objective will have to transform that. I have to include two values to optimize, since the objection function will need to know what color to transform (or else the result will give me know way of knowing which color to use the offset with).
so my call should look something like the following:
result = min(minimize(objective, (i,0), bounds=[(i, i), (-1, 1)]) for i in range(len(colors)))
offset_type = resolve_offset_type(result)
with that in mind, I implemented this solution, over the past couple of days.
current solution
from scipy.optimize import minimize
import numpy as np
import math
def clamp(low, x, high):
return max(low, min(x, high))
def hex_to_rgb(hex_color):
hex_color = hex_color.lstrip('#')
return tuple(int(hex_color[i:i+2], 16) for i in (0, 2, 4))
def rgb_to_hex(rgb):
return '#{:02x}{:02x}{:02x}'.format(*rgb)
def rgb_to_lab(color):
# Convert RGB to XYZ color space
R = color[0] / 255.0
G = color[1] / 255.0
B = color[2] / 255.0
R = ((R + 0.055) / 1.055) ** 2.4 if R > 0.04045 else R / 12.92
G = ((G + 0.055) / 1.055) ** 2.4 if G > 0.04045 else G / 12.92
B = ((B + 0.055) / 1.055) ** 2.4 if B > 0.04045 else B / 12.92
X = R * 0.4124 + G * 0.3576 + B * 0.1805
Y = R * 0.2126 + G * 0.7152 + B * 0.0722
Z = R * 0.0193 + G * 0.1192 + B * 0.9505
return (X,Y,Z)
def shade_func(color, offset):
return tuple([int(c * (1 - offset)) for c in color])
def tint_func(color, offset):
return tuple([int(c + (255 - c) * offset) for c in color])
def tone_func(color, offset):
return tuple([int(c * (1 - offset) + 128 * offset) for c in color])
class ColorNameFinder:
def __init__(self, colors, distance=None):
if distance is None:
distance = ColorNameFinder.ciede2000
self.distance = distance
self.colors = [hex_to_rgb(color) for color in colors]
@classmethod
def euclidean(self, left, right):
return (left[0] - right[0]) ** 2 + (left[1] - right[1]) ** 2 + (left[2] - right[2]) ** 2
@classmethod
def ciede2000(self, color1, color2):
# Convert color to LAB color space
lab1 = rgb_to_lab(color1)
lab2 = rgb_to_lab(color2)
# Compute CIE 2000 color difference
C1 = math.sqrt(lab1[1] ** 2 + lab1[2] ** 2)
C2 = math.sqrt(lab2[1] ** 2 + lab2[2] ** 2)
a1 = math.atan2(lab1[2], lab1[1])
a2 = math.atan2(lab2[2], lab2[1])
dL = lab2[0] - lab1[0]
dC = C2 - C1
dA = a2 - a1
dH = 2 * math.sqrt(C1 * C2) * math.sin(dA / 2)
L = 1
C = 1
H = 1
LK = 1
LC = math.sqrt(math.pow(C1, 7) / (math.pow(C1, 7) + math.pow(25, 7)))
LH = math.sqrt(lab1[0] ** 2 + lab1[1] ** 2)
CB = math.sqrt(lab2[1] ** 2 + lab2[2] ** 2)
CH = math.sqrt(C2 ** 2 + dH ** 2)
SH = 1 + 0.015 * CH * LC
SL = 1 + 0.015 * LH * LC
SC = 1 + 0.015 * CB * LC
T = 0.0
if (a1 >= a2 and a1 - a2 <= math.pi) or (a2 >= a1 and a2 - a1 > math.pi):
T = 1
else:
T = 0
dE = math.sqrt((dL / L) ** 2 + (dC / C) ** 2 + (dH / H) ** 2 + T * (dC / SC) ** 2)
return dE
def __factory_objective(self, target, preprocessor=lambda x: x):
def fn(x):
print(x, preprocessor(x))
x = preprocessor(x)
color = self.colors[x[0]]
offset = x[1]
bound_offset = abs(offset)
offsets = [
shade_func(color, bound_offset),
tint_func(color, bound_offset),
tone_func(color, offset)]
least_error = min([(right, self.distance(target, right)) \
for right in offsets], key = lambda x: x[1])[1]
return least_error
return fn
def __resolve_offset_type(self, sample, target, offset):
bound_offset = abs(offset)
shade = shade_func(sample, bound_offset)
tint = tint_func(sample, bound_offset)
tone = tone_func(sample, offset)
lookup = {}
lookup[shade] = "shade"
lookup[tint] = "tint"
lookup[tone] = "tone"
offsets = [shade, tint, tone]
least_error = min([(right, self.distance(target, right)) for right in offsets], key = lambda x: x[1])[0]
return lookup[least_error]
def nearest_color(self, target):
target = hex_to_rgb(target)
preprocessor=lambda x: (int(x[0]), x[1])
result = min(\
[minimize( self.__factory_objective(target, preprocessor=preprocessor),
(i, 0),
bounds=[(i, i), (-1, 1)],
method='Powell') \
for i, color in enumerate(self.colors)], key=lambda x: x.fun)
color_index = int(result.x[0])
nearest_color = self.colors[color_index]
offset = preprocessor(result.x)[1]
offset_type = self.__resolve_offset_type(nearest_color, target, offset)
return {
"color": rgb_to_hex(nearest_color),
"offset": {
"type": offset_type,
"value": offset if offset_type == 'tone' else abs(offset)
}
}
let's demonstrate this with mauve. We'll define a target that is similar to a shade of mauve, include mauve in a list of colors, and ideally we'll get mauve back from our test.
colors = ['#E0B0FF', '#FF0000', '#000000', '#0000FF']
target = '#DFAEFE'
agent = ColorNameFinder(colors)
agent.nearest_color(target)
we do get mauve back:
{'color': '#e0b0ff',
'offset': {'type': 'shade', 'value': 0.0031060384614807254}}
the distance is 0.004991238317138219
agent.distance(hex_to_rgb(target), shade_func(hex_to_rgb(colors[0]), 0.0031060384614807254))
why use Powell's method?
in this arrangement, it is simply the best. No other method that uses bounds did a good job of scanning positives and negatives, and I had mixed results using the preprocessor to scale the values back to negative with bounds of (0,2).
I do notice that in the sample test, a range between about 0.003 and 0.0008 seems to produce the same distance, and that the values my approach considers includes a large number of these. is there a more efficient solution?
If I'm wrong, please let me know.
correctness of the color transformations
what is adding a negative amount of white? (in the case of a tint) I was thinking it is like adding a positive amount of black -- ie a shade, with signs reversed.
my implementation is not correct:
agent.distance(hex_to_rgb(target), shade_func(hex_to_rgb(colors[0]), 0.1)) - agent.distance(hex_to_rgb(target), tint_func(hex_to_rgb(colors[0]), -0.1))
produces 0.3239904390784106 instead of 0.
I'll probably be fixing that soon