Thursday 18 June 2009

Making sense of Contravariance and Covariance in C# 2.0/4.0

Introduction

In simple words, covariance refers to the fact that you can use a type or one of its descendants when the specific type is expected. Say for the following class hierarchy: Doberman -> Dog -> Animal, where Animal is the root/super parent class. If you had a function

public static Dog ProcessDog(Dog myDog)
{
return new Dog();
}


In all versions of C#,

Dog returnDog = ProcessDog(new Dog()); // is valid
Dog returnDog = ProcessDog(new Doberman()); //is valid
Dog returnDog = ProcessDog(new Animal()); //is INVALID.

The above invalidity confirms the fact that all arguments in C# are covariant.

Similarly,

Dog returnDog = ProcessDog(new Dog()); // is valid
Object returnDog = ProcessDog(new Dog()); // is valid
Doberman returnDog = ProcessDog(new Dog()); // is INVALID
.


confirms the fact that all return values are contravariant. As in , you cannot use a child type when the parent type is expected in case of return values. Effectively, every type that goes IN is covariant and everything that comes OUT is contravariant.

Covariance and Contravariance in Generics / C# 2.0 / C# 4.0

In C# 2.0, generic interfaces by themselves did not allow for covariance/contravariance. Say if I had

IEnumerable dogs = new List();
IEnumerable animals;
animals = dogs;//is INVALID


I cannot do a animals = dogs; though it make sense literally - arent dogs animals?!. Though note that you could still assign an instance of dog to a animal variable. Coming to C# 4.0, the above statement animals = dogs; works perfectly fine! Why? How?

To understand this, assume a case where IEnumerable allowed setting the value of an item. ASSUME, if you could do

IEnumerable dogs = new List();
IEnumerable animals;
animals = dogs;
animals[0] = cats[0]; //assume cat type to be descendant of animal type


Accessing dogs[0] would now result in an invalid typecast because you are now trying to cast a cat object as a dog!. Thankfully, IEnumerable does not let you set a value to an item, which makes IEnumerable a perfect candidate for covariance. To make sure animals=dogs statement work in C# 4.0, we need to make sure that IEnumerable does not allow for setting/accepting the generic type as a function argument or directly (which is what we did when we did animals[0] = cats[0]).

In our case, what we need to enforce on IEnumerable is to make sure IEnumerable does not have any function which can accept a instance of T (or its descendant). If we did allow, we are introducing the problem of allowing animals[0] = cats[0] OR allowing for animals.AddItem(cat) (declared perhaps as void AddItem(T item)). If we did not allow this, we can ensure that animals=dogs is valid. (which is what C# 4.0 did)

The idea is this : some generic interfaces do not allow inserting items into their list (like IEnumerable) by default. This makes sure that you cannot actually set a specific item of the list with a casted value. For this kind of generics interfaces, C# 4.0 introduces the concept of in, out keyword.

If you had an interface definition of the below kind:

interface Itest
{
T GetVal();
}


the new 'out' keyword makes sure that the type T is covariant. This effectively means that the type T is forced to be a covariant and cannot be used as an IN variable. Say if you tried to add a new method 'GetNewVal' to the above interface as :

void GetNewVal(T someParam);

The compiler would return an error saying T is covariant as T can be used to return a value ONLY.
Similarly, if you had an interface Itest2 as:

interface Itest2
{
void getval(T another);
}


the 'in' keyword makes sure that T is contravariant such that T can be used as an inbound entity (usually as an argument) only. Effectively, if you try to add a new method 'GetYetAnotherVal' as :

T GetYetAnotherVal(void);

the compiler would return an error.

Within C# 4.0, IEnumerable is declared as :

public interface IEnumerable : IEnumerable
{
IEnumerator GetEnumerator();
}

Note the 'out' keyword indicating that no function within IEnumerable can now accept T as input making sure it is covariant to the type T. Outputs are fine (as in GetEnumerator).

References

1.) http://research.microsoft.com/en-us/um/people/akenn/generics/ECOOP06.pdf
Above paper also at http://research.microsoft.com/pubs/64042/ecoop06.pdf

2.)
http://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29