I was profiling some C# code and wound up investigating the implementation of ToString for enum values.
The ToString method ultimately calls System.Type.GetEnumName. Here's the relevant source code, taken from http://referencesource.microsoft.com/#mscorlib/system/type.cs
public virtual string GetEnumName(object value)
{
if (value == null)
throw new ArgumentNullException("value");
if (!IsEnum)
throw new ArgumentException(Environment.GetResourceString("Arg_MustBeEnum"), "enumType");
Contract.EndContractBlock();
Type valueType = value.GetType();
if (!(valueType.IsEnum || Type.IsIntegerType(valueType)))
throw new ArgumentException(Environment.GetResourceString("Arg_MustBeEnumBaseTypeOrEnum"), "value");
Array values = GetEnumRawConstantValues();
int index = BinarySearch(values, value);
if (index >= 0)
{
string[] names = GetEnumNames();
return names[index];
}
return null;
}
// Returns the enum values as an object array.
private Array GetEnumRawConstantValues()
{
string[] names;
Array values;
GetEnumData(out names, out values);
return values;
}
// This will return enumValues and enumNames sorted by the values.
private void GetEnumData(out string[] enumNames, out Array enumValues)
{
Contract.Ensures(Contract.ValueAtReturn<String[]>(out enumNames) != null);
Contract.Ensures(Contract.ValueAtReturn<Array>(out enumValues) != null);
FieldInfo[] flds = GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static);
object[] values = new object[flds.Length];
string[] names = new string[flds.Length];
for (int i = 0; i < flds.Length; i++)
{
names[i] = flds[i].Name;
values[i] = flds[i].GetRawConstantValue();
}
// Insertion Sort these values in ascending order.
// We use this O(n^2) algorithm, but it turns out that most of the time the elements are already in sorted order and
// the common case performance will be faster than quick sorting this.
IComparer comparer = Comparer.Default;
for (int i = 1; i < values.Length; i++)
{
int j = i;
string tempStr = names[i];
object val = values[i];
bool exchanged = false;
// Since the elements are sorted we only need to do one comparision, we keep the check for j inside the loop.
while (comparer.Compare(values[j - 1], val) > 0)
{
names[j] = names[j - 1];
values[j] = values[j - 1];
j--;
exchanged = true;
if (j == 0)
break;
}
if (exchanged)
{
names[j] = tempStr;
values[j] = val;
}
}
enumNames = names;
enumValues = values;
}
// Convert everything to ulong then perform a binary search.
private static int BinarySearch(Array array, object value)
{
ulong[] ulArray = new ulong[array.Length];
for (int i = 0; i < array.Length; ++i)
ulArray[i] = Enum.ToUInt64(array.GetValue(i));
ulong ulValue = Enum.ToUInt64(value);
return Array.BinarySearch(ulArray, ulValue);
}
If I'm understanding this correctly, whenever an enum's name is requested:
- The entire set of enum values and names are inserted into arrays using insertion sort.
- The values are copied to an array of ulongs.
- This array is searched with binary search.
Am right to think that this is grossly inefficient? (especially for enums with hundreds or thousands of values)
It seems like some sort of dictionary/hashtable structure would be ideal. But even a simple O(n) linear scan of the values would have accomplished the same goal without sorting or binary searching or allocating superfluous arrays.
I was able to precompute a dictionary for one particular case where ToString was being called many times in a loop; and that yielded significant performance gains. But is there any built-in, alternative, more holistic technique that would also facilitate things like JSON serialization?