Constrain Generic Parameter to ICollection with Varying Element Types

Generic Collection Problem

This resembles a Tuple<?,?> problem. A utility class needs to invoke a collection of processors; processors expose a generic method, Process<TInput, TOutput>. The utility class should expose a method that accepts a collection of processors and the input to the first processor, then iterates through the processors, invoking something like y = Process(x), where x is the output of the previous iteration.

This can probably be achieved through boxing (e.g., changing my example below to use <object,object> as the type arguments), but ideally the solution would enforce a constraint that each processor's input type matches the previous processor's output type, which requires type discovery within each iteration.

Example:

public interface IProcessor
{
    TOutput Process<TInput, TOutput>(object firstInput);
}
public interface ISequencer<in TCollection, in TFirstInput, out TLastOutput> where TCollection : ICollection<IProcessor>
{
    public TLastOutput ProcessRange(TCollection collection, TFirstInput firstInput)
    {
        if (collection == null || collection.Count < 2)
        {
            throw new ArgumentOutOfRangeException(nameof(collection),
                @"A minimum of 2 collection members is required.");
        }
        // Goal: discover the type of "?" output at runtime:
        var x = collection.First().Process<TFirstInput,?>(firstInput);
        object y;
        // Skip the first and last processors; they are invoked separately.
        foreach (var processor in collection.Skip(1).Take(collection.Count - 2))
        {
            // Goal: replace <?,?> without boxing:
            y = processor.Process<?,?>(x);
            x = y;
        }
        // Goal: discover the type of "?" input at runtime:
        y = collection.Last().Process<?,TLastOutput>(x);
        // this cast would be unnecessary if one could discover the generic type
        // of each collection member at runtime:
        return (TLastOutput)y;
    }
}

I can't help but think there must be a cleaner way to do this than boxing. The ideal solution would discover the TInput and TOutput of IProcessor within each iteration. Perhaps Marc Gravell's answer from https://stackoverflow.com/a/16479852/141508 applies, but I can't figure out how the YourParent parameter / property in his example fits into the solution.

Servy On BEST ANSWER

So your first problem is that your processors, as per their interface, all claim that they can accept inputs of any type and provide outputs of any type, even though that's clearly not true for your actual processors. A generic method lets the caller supply the input and output types each time they call Process on any processor. Instead, you need the generic arguments to be on the interface, so that any particular implementation accepts one type of input and provides one type of output.

public interface IProcessor<in TInput, out TOutput>
{
    TOutput Process(TInput firstInput);
}
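To make the difference concrete, here is a minimal sketch of two implementations of that interface. The class names `IntToStringProcessor` and `StringLengthProcessor` are hypothetical, made up for illustration; the point is only that each implementation fixes its input and output types once, at the type level.

```csharp
using System;
using System.Globalization;

public interface IProcessor<in TInput, out TOutput>
{
    TOutput Process(TInput firstInput);
}

// Hypothetical processor: formats an int as text.
public sealed class IntToStringProcessor : IProcessor<int, string>
{
    public string Process(int firstInput) =>
        firstInput.ToString(CultureInfo.InvariantCulture);
}

// Hypothetical processor: measures the length of a string.
public sealed class StringLengthProcessor : IProcessor<string, int>
{
    public int Process(string firstInput) => firstInput.Length;
}

public static class Demo
{
    public static void Main()
    {
        IProcessor<int, string> toText = new IntToStringProcessor();
        IProcessor<string, int> length = new StringLengthProcessor();

        // The caller can no longer pick arbitrary type arguments per call;
        // the compiler knows toText only accepts int and only returns string.
        string text = toText.Process(12345);
        int digits = length.Process(text);
        Console.WriteLine($"{text} has {digits} characters");
    }
}
```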

Next, putting a number of processors, all designed to form a pipeline in which the output of one is used as the input of the next, into an ICollection that maintains static typing is inherently doomed. Such a collection would need a generic argument not only for the input and output types, but for every single intermediate type as well. Since a type's generic arguments are a fixed, statically defined list, you'd need a separate collection type for every possible number of items. That isn't really feasible unless you only ever have a very small number of items, and even then, working with such a type would be horrendous.
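A rough sketch of what such fixed-size types would look like may help show why this doesn't scale. `Pipeline2`, `Pipeline3`, and `FuncProcessor` are hypothetical names invented for this illustration, not part of any library.

```csharp
using System;
using System.Globalization;

public interface IProcessor<in TInput, out TOutput>
{
    TOutput Process(TInput firstInput);
}

// Hypothetical helper: wraps a delegate so the demo needs no hand-written processors.
public sealed class FuncProcessor<TInput, TOutput> : IProcessor<TInput, TOutput>
{
    private readonly Func<TInput, TOutput> func;
    public FuncProcessor(Func<TInput, TOutput> func) { this.func = func; }
    public TOutput Process(TInput firstInput) => func(firstInput);
}

// Hypothetical: a statically typed container for exactly two processors.
public sealed class Pipeline2<T0, T1, T2>
{
    private readonly IProcessor<T0, T1> first;
    private readonly IProcessor<T1, T2> second;

    public Pipeline2(IProcessor<T0, T1> first, IProcessor<T1, T2> second)
    {
        this.first = first;
        this.second = second;
    }

    public T2 Run(T0 input) => second.Process(first.Process(input));
}

// Hypothetical: three processors need a separate type with one more type parameter.
public sealed class Pipeline3<T0, T1, T2, T3>
{
    private readonly IProcessor<T0, T1> first;
    private readonly IProcessor<T1, T2> second;
    private readonly IProcessor<T2, T3> third;

    public Pipeline3(IProcessor<T0, T1> first, IProcessor<T1, T2> second,
        IProcessor<T2, T3> third)
    {
        this.first = first;
        this.second = second;
        this.third = third;
    }

    public T3 Run(T0 input) => third.Process(second.Process(first.Process(input)));
}

// Pipeline4, Pipeline5, ... would each have to be written out by hand.

public static class Demo
{
    public static void Main()
    {
        var pipeline = new Pipeline3<int, double, string, int>(
            new FuncProcessor<int, double>(i => i / 2.0),
            new FuncProcessor<double, string>(d => d.ToString("F1", CultureInfo.InvariantCulture)),
            new FuncProcessor<string, int>(s => s.Length));

        Console.WriteLine(pipeline.Run(5)); // 5 -> 2.5 -> "2.5" -> 3
    }
}
```

Every additional stage means another hand-written type with one more type parameter, which is the combinatorial burden described above.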

Fortunately, given what your final interface needs to do, there's no actual need to put them in a collection. Simply write a single method to link two processors together, such that a new processor is created that passes the input to the first processor, then the result of that to the second, and finally returns that result.

public static class Processor
{
    public static IProcessor<TInput, TOutput> Link<TInput, TIntermediate, TOutput>(
        this IProcessor<TInput, TIntermediate> first, IProcessor<TIntermediate, TOutput> second)
    {
        return new LinkedProcessor<TInput, TIntermediate, TOutput>(first, second);
    }
    private class LinkedProcessor<TInput, TIntermediate, TOutput> : IProcessor<TInput, TOutput>
    {
        private IProcessor<TInput, TIntermediate> first;
        private IProcessor<TIntermediate, TOutput> second;
        public LinkedProcessor(IProcessor<TInput, TIntermediate> first, IProcessor<TIntermediate, TOutput> second)
        {
            this.first = first;
            this.second = second;
        }
        public TOutput Process(TInput firstInput)
        {
            return second.Process(first.Process(firstInput));
        }
    }
}

By then linking your various processors, one after the other (since a linked processor can itself be linked as either the input or output of another processor in a chain), you can create a pipeline of entirely statically typed processors. You don't even need a separate type for the ISequencer at all; you can simply have an IProcessor whose Process method processes all of the linked values in its own implementation.

// Note: each Processor<TIn, TOut> is a stand-in for a specific IProcessor<TIn, TOut> doing actual work
IProcessor<int, DateTime> pipeline = new Processor<int, double>()
    .Link(new Processor<double, string>())
    .Link(new Processor<string, DateTime>());
DateTime result = pipeline.Process(5);

And all of this maintains static typing. If you try to link a processor whose input doesn't match the other processor's output, it won't compile.
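As a self-contained sketch of that compile-time check: the block below repeats the interface and a `Link` method of the same shape, but builds the linked processor from a hypothetical delegate wrapper, `FuncProcessor`, purely to keep the example short. The lambdas are made-up stand-ins for real work.

```csharp
using System;
using System.Globalization;

public interface IProcessor<in TInput, out TOutput>
{
    TOutput Process(TInput firstInput);
}

// Hypothetical helper (not part of the answer): wraps a delegate as a processor.
public sealed class FuncProcessor<TInput, TOutput> : IProcessor<TInput, TOutput>
{
    private readonly Func<TInput, TOutput> func;
    public FuncProcessor(Func<TInput, TOutput> func) { this.func = func; }
    public TOutput Process(TInput firstInput) => func(firstInput);
}

public static class Processor
{
    // Same signature as the Link method above; the body uses the delegate
    // wrapper instead of a dedicated LinkedProcessor class.
    public static IProcessor<TInput, TOutput> Link<TInput, TIntermediate, TOutput>(
        this IProcessor<TInput, TIntermediate> first,
        IProcessor<TIntermediate, TOutput> second)
    {
        return new FuncProcessor<TInput, TOutput>(
            input => second.Process(first.Process(input)));
    }
}

public static class Demo
{
    public static void Main()
    {
        var half = new FuncProcessor<int, double>(i => i / 2.0);
        var text = new FuncProcessor<double, string>(
            d => d.ToString("F1", CultureInfo.InvariantCulture));
        var length = new FuncProcessor<string, int>(s => s.Length);

        // int -> double -> string -> int; every intermediate type is checked.
        IProcessor<int, int> pipeline = half.Link(text).Link(length);
        Console.WriteLine(pipeline.Process(5)); // 5 -> 2.5 -> "2.5" -> 3

        // Uncommenting the next line fails to compile: half produces a double,
        // but length expects a string, so no TIntermediate satisfies both.
        // var broken = half.Link(length);
    }
}
```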

As a purely academic concern, it's worth noting that such a pipeline is conceptually a binary tree, which, from a computer science perspective, is a collection. However, you couldn't meaningfully put the processors in an ICollection<T>, since each "node" only knows the types of itself and, if it has any, its children. There is no one type that knows the types of every single one at the same time.