Wednesday, December 12, 2012

Generics Internals


In this article, I am going to give an overview of generics in Whidbey (the .NET 2.0 release), and then I will go deeper into how exactly they work. Before defining what generics are, let us take a small example and see where generics fit in; after that, the definition is easy to understand. In .NET 1.x, if I want to write a collection that simply stores data and lists it back, I will create a class like this:
public class List
{
      object[] items;
      int count;
      public void Add(object item) {...}
      public object this[int index] {...}
}
This class declaration has two problems. First, since we don't know the type of the objects we are going to work on, we declared the items array as object[], so it can store any data type. Second, the methods for adding and retrieving data take and return object only. Because of these two problems, we cannot do type checking at compile time, and performance also degrades. For example,
List intList = new List();
intList.Add(1);         // Argument is boxed
intList.Add(2);         // Argument is boxed
intList.Add("Three");   // Should be an error 
int i = (int)intList[0];      // Cast required 
In this declaration and instantiation, whenever we add an item to the collection, the int is converted to object, so boxing occurs. Similarly, when we retrieve a value from the collection, we need to cast it explicitly. The next problem is that we can even add a string to this collection, though we want it to hold only integers. Since the array is declared as object[] and the Add method accepts object, any data type can be added. There is no compile-time type check, so we get an error only at run time, when we try to cast the string to int while retrieving it. To overcome this, we could create a specialized collection for each data type, but then the same code has to be replicated in every class, which is a maintenance overhead.
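The run-time failure described above can be reproduced with the ArrayList class from System.Collections, which stores its elements as object just like the hand-written List. This is only a small sketch; the UntypedListDemo class and its Sum method are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections;

// Hypothetical demo class; only ArrayList comes from the framework.
public static class UntypedListDemo
{
    // Sums the elements, assuming (with no compile-time proof) that all are ints.
    public static int Sum(ArrayList values)
    {
        int total = 0;
        foreach (object value in values)
        {
            total += (int)value;   // Unboxing cast; fails if value is not a boxed int
        }
        return total;
    }

    public static void Main()
    {
        ArrayList values = new ArrayList();
        values.Add(1);         // Argument is boxed
        values.Add(2);         // Argument is boxed
        values.Add("Three");   // Compiles, because Add accepts object

        try
        {
            Console.WriteLine(Sum(values));
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("The mistake surfaces only at run time.");
        }
    }
}
```

The compiler accepts every line here; the wrong element is noticed only when the cast inside Sum reaches the string.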
Instead of writing one collection per data type, suppose we write a collection that receives the data type as a parameter to the class and works on that data type. For example,
public class List<T>
{
      T[] items;
      int count;
      public void Add(T item) {...}
      public T this[int index] {...}
}
Here the collection receives the data type it is going to work with as a parameter (T), and it declares an array of that type. Similarly, the methods for adding and retrieving use this data type only. Now look at this instantiation:
List<int> intList = new List<int>();  
intList.Add(1);         // No boxing
intList.Add(2);         // No boxing
intList.Add("Three");   // Compile-time error  
int i = intList[0];     // No cast required 
Since we state, when declaring the collection, that it is going to work on int, there is no boxing while adding, and no cast is needed while retrieving, because the compiler knows the collection holds only int values. Similarly, if we try to add a string to this collection, we get a compile-time error. With this approach we write one general class for all data types without compromising type safety, performance, or productivity. This is what .NET 2.0 calls generics.
Generics permit classes, structs, interfaces, delegates, and methods to be parameterized by the types of data they store and manipulate. Generics are useful because they provide stronger compile-time type checking, require fewer explicit conversions between data types, and reduce the need for boxing operations and run-time type checks.
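To make the elided {...} bodies concrete, here is one possible way to fill in such a generic list. It is a minimal sketch, not how the framework's own List<T> is written: the name SimpleList<T>, the starting capacity of 4 and the doubling growth policy are all assumptions made for this example.

```csharp
using System;

// SimpleList<T> is a hypothetical name, chosen to avoid confusion with
// System.Collections.Generic.List<T>.
public class SimpleList<T>
{
    private T[] items = new T[4];   // Starting capacity is an arbitrary choice
    private int count;

    public int Count
    {
        get { return count; }
    }

    public void Add(T item)
    {
        if (count == items.Length)
        {
            // Grow by doubling; this policy is an assumption of the sketch.
            Array.Resize(ref items, items.Length * 2);
        }
        items[count] = item;
        count++;
    }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
            {
                throw new ArgumentOutOfRangeException("index");
            }
            return items[index];
        }
    }
}

public static class SimpleListDemo
{
    public static void Main()
    {
        SimpleList<int> numbers = new SimpleList<int>();
        numbers.Add(1);                 // Stored in an int[], no boxing
        numbers.Add(2);
        int first = numbers[0];         // No cast required
        Console.WriteLine(first);

        SimpleList<string> names = new SimpleList<string>();
        names.Add("Three");             // Same class, different type argument
        Console.WriteLine(names[0]);
    }
}
```

The same source code serves both int and string, which is exactly the replication problem the specialized collections had.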

Constraints

The generic class we have seen so far only stores data, but most generic classes do more work than that. For example, suppose we need to call the CompareTo method on a value of the generic type inside the Add method. We would write the code like this:
public class List<T>
{
      public void Add(T item)
      {
            if (item.CompareTo(x) < 0) {...}           // Error, no CompareTo method
            ...
      }
}
When a generic type is compiled, the compiler does not know which type argument it will be used with. It cannot assume that a CompareTo method is available on T, so it reports a compile-time error. The only members that can be assumed to exist on the type parameter are those declared by type object, such as Equals, GetHashCode, and ToString. It is of course possible to cast the item parameter to a type that contains a CompareTo method. For example, the item parameter could be cast to IComparable:
public class List<T>
{
      public void Add(T item)
      {
            ...
            if (((IComparable)item).CompareTo(x) < 0) {...}
            ...
      }
}
 
While this solution works, it requires a dynamic type check at run time, which adds overhead. It furthermore defers error reporting to run time, throwing an InvalidCastException if an item doesn’t implement IComparable. To provide stronger compile-time checking and reduce type casts, generics allow an optional list of constraints to be supplied for each type parameter. A type parameter constraint specifies a requirement that a type must fulfill in order to be used as an argument for that type parameter.
public class List<T> where T : IComparable
{
      public void Add(T item)
      {
            if (item.CompareTo(x) < 0) {...}           // OK, T is constrained to IComparable
            ...
      }
}

You can optionally specify multiple constraints on the same type parameter. In such cases, the types passed in as type arguments must satisfy all the constraints. Because multiple inheritance is not allowed by the CLR, you can specify only one class type constraint for a type parameter. Since multiple interfaces can be implemented or inherited by the same type, you can specify multiple interface constraints on the same type parameter. You can specify only one new() constraint on a type parameter. Different kinds of constraints can be combined on the same type parameter; in that case the class constraint is listed first and new() last.
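As an illustration of combining constraints, the sketch below puts one class constraint, one interface constraint and a new() constraint on the same type parameter. Shape, Circle and LargestTracker<T> are hypothetical types invented for this example, and it uses the generic IComparable<T> interface rather than the non-generic IComparable from the snippet above.

```csharp
using System;

// Hypothetical base class used as the class constraint.
public class Shape
{
    public string Name = "unnamed";
}

// Hypothetical type that satisfies all three constraints below.
public class Circle : Shape, IComparable<Circle>
{
    public double Radius;

    public int CompareTo(Circle other)
    {
        return Radius.CompareTo(other.Radius);
    }
}

// One class constraint first, then the interface constraint, then new() last.
public class LargestTracker<T> where T : Shape, IComparable<T>, new()
{
    private T largest = new T();                  // Permitted by new()

    public void Offer(T candidate)
    {
        if (candidate.CompareTo(largest) > 0)     // Permitted by IComparable<T>
        {
            largest = candidate;
        }
    }

    public string LargestName
    {
        get { return largest.Name; }              // Permitted by Shape
    }
}

public static class ConstraintDemo
{
    public static void Main()
    {
        Circle small = new Circle();
        small.Name = "small";
        small.Radius = 1.0;

        Circle big = new Circle();
        big.Name = "big";
        big.Radius = 2.0;

        LargestTracker<Circle> tracker = new LargestTracker<Circle>();
        tracker.Offer(big);
        tracker.Offer(small);
        Console.WriteLine(tracker.LargestName);
    }
}
```

A type argument that misses any one of the three requirements, for example a Shape subclass without a public parameterless constructor, is rejected at compile time.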
When both a class constraint and an interface constraint are present on a type parameter, you can access the members of the class and of the interface directly from objects of the type parameter type. Likewise, when no class constraint is present but multiple interface constraints are, you can access the members of all the interfaces. If any member names are ambiguous because they are present in multiple interfaces, you can disambiguate them by casting the objects to the appropriate interface and then accessing the member.
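The sketch below exercises member access through constraints: a member of the constraint class, an interface member that exists in only one interface, and a member name shared by two interfaces that has to be disambiguated with a cast. With the C# compilers I am aware of, the first two compile without any cast. All the types here (IPrinter, IScanner, Device, Copier, Inspector<T>) are hypothetical and exist only for this example.

```csharp
using System;

public interface IPrinter
{
    string Describe();
    string Print();
}

public interface IScanner
{
    string Describe();   // Same name and signature as IPrinter.Describe
}

public class Device
{
    public string Model = "generic";
}

public class Copier : Device, IPrinter, IScanner
{
    public string Print() { return "printed"; }

    // Explicit implementations give each interface its own Describe.
    string IPrinter.Describe() { return "prints"; }
    string IScanner.Describe() { return "scans"; }
}

public class Inspector<T> where T : Device, IPrinter, IScanner
{
    public string Inspect(T device)
    {
        string model = device.Model;        // Member of the constraint class
        string printed = device.Print();    // Interface member found in only one interface

        // device.Describe() would not compile here, because the call is
        // ambiguous between the two interfaces. A cast selects one of them.
        string asPrinter = ((IPrinter)device).Describe();
        string asScanner = ((IScanner)device).Describe();

        return model + ", " + printed + ", " + asPrinter + ", " + asScanner;
    }
}

public static class MemberAccessDemo
{
    public static void Main()
    {
        Inspector<Copier> inspector = new Inspector<Copier>();
        Console.WriteLine(inspector.Inspect(new Copier()));
    }
}
```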

Generic Methods

In some cases a type parameter is not needed for an entire class, but only inside a particular method. Often, this occurs when creating a method that takes a generic type as a parameter. In that case you can write a generic method, which declares its own type parameter.
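A small sketch of what such methods can look like follows. The Util class and its Swap and Max methods are hypothetical helpers written for this example; the class itself is not generic, and each method introduces its own T.

```csharp
using System;

// Hypothetical helper class; note that Util itself has no type parameter.
public static class Util
{
    // T is declared by the method and is visible only inside it.
    public static void Swap<T>(ref T left, ref T right)
    {
        T temp = left;
        left = right;
        right = temp;
    }

    // A generic method can carry its own constraints as well.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

public static class GenericMethodDemo
{
    public static void Main()
    {
        int x = 1;
        int y = 2;
        Util.Swap<int>(ref x, ref y);    // Type argument given explicitly
        Console.WriteLine(x);            // Prints 2

        string first = "one";
        string second = "two";
        Util.Swap(ref first, ref second);   // Type argument inferred as string
        Console.WriteLine(first);           // Prints two

        Console.WriteLine(Util.Max(3, 7));  // Prints 7
    }
}
```

In most calls the compiler can infer the type argument from the arguments passed in, so the angle brackets are needed only when inference is not possible or when you want to be explicit.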

Conclusion

In this article, we have seen what generics are and where they can be used. In the next part, we are going to see how exactly generics work in the CLR and how they differ from C++ templates. Then I will show, with some performance data, how much performance improvement generics give us.
