Chapter 7
Arrays, Collections, and Generics
What's in this chapter?
Working with arrays
Iteration (looping)
Working with collections
Generics
Nullable types
Generic collections
Generic methods
Covariance and contravariance
In the beginning there were variables, and they were good. The idea that you map a location in memory to a value was a key to tracking a value. However, most people want to work on data as a set. Taking the concept of a variable holding a value, you've moved to the concept of a variable that could reference an array of values. Arrays improved what developers could build, but they weren't the end of the line.
Over time, certain patterns developed in how arrays were used. Instead of just collecting a set of values, many have looked to use arrays to temporarily store values that were awaiting processing, or to provide sorted collections. Each of these patterns started as a best practice for how to build and manipulate array data or to build custom structures that replicate arrays.
The computing world was very familiar with these concepts—for example, using a linked list to enable more flexibility regarding how data is sorted and retrieved. Patterns such as the stack (first in, last out) or queue (first in, first out) were in fact created as part of the original base Class Libraries. Referred to as collections, they provide a more robust and feature-rich way to manage sets of data than arrays can provide. These were common patterns prior to the introduction of .NET, and .NET provided an implementation for each of these collection types.
However, the common implementation of these collection classes relied on the Object base class. This caused two issues. The first, which is discussed in this chapter, is called boxing. Boxing wasn't a big deal on any given item in a collection, but it caused a slight performance hit; and as your collection grew, it had the potential to impact your application's performance. The second issue was that having collections based only on the type Object went against the best practice of having a strongly typed environment. As soon as you started loading items into a collection, you lost all type checking.
The solution to the issues with collections based on the Object type is called generics. Originally introduced as part of .NET 2.0, generics provide a way to create collection classes that are type-safe. The type of value that will be stored in the collection is defined as part of the collection definition. Thus .NET has taken the type-safe but limited capabilities of arrays and combined them with the more powerful collection classes that were object-based to provide a set of collection classes that are type-safe.
This chapter looks at these three related ways to create sets of information. Starting with a discussion of arrays and the looping statements that process them, it next introduces collections and then moves to the use of generics, followed by a walk-through of the syntax for defining your own generic templates. Note that the sample code in this chapter is based on the ProVB2012 project created in Chapter 1. Rather than step through the creation of this project again, this chapter makes reference to it. A copy of all of the code is also available as part of the download for this book.
Continuing to reference the arrays defined earlier, the declaration of arrMyIntArray2 actually defined an array that spans from arrMyIntArray2(0) to arrMyIntArray2(3). That's because when you declare an array by specifying the set of values, it still starts at 0. However, in this case you are not specifying the upper bound, but rather initializing the array with a set of values. If this set of values came from a database or other source, then the upper limit on the array might not be clear. To verify the upper bound of an array, a call can be made to the UBound function:
UBound(ArrMyIntArray2)
The preceding line of code retrieves the upper bound of the first dimension of the array and returns 3. However, as noted in the preceding section, you can specify an array with several different dimensions. Thus, this old-style method of retrieving the upper bound carries the potential for an error of omission. The better way to retrieve the upper bound is to use the GetUpperBound method on your array instance. With this call, you need to tell the array which dimension's upper-bound value you want, as shown here (also returning 3):
ArrMyIntArray2.GetUpperBound(0)
This is the preferred method of obtaining an array's upper bound, because it explicitly indicates which upper bound is wanted when using multidimensional arrays, and it follows a more object-oriented approach to working with your array.
The UBound function has a companion called LBound. The LBound function computes the lower bound for a given array. However, as all arrays and collections in Visual Basic are zero-based, it doesn't have much value anymore.
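The difference between the two calls is easiest to see on a multidimensional array. The following is a minimal sketch, not part of the ProVB2012 project; the module name, the function name, and the 5 x 3 array are assumptions made for illustration. Note that UBound accepts an optional rank that counts from 1, while GetUpperBound requires a dimension index that counts from 0.

```vb
Imports System
Imports Microsoft.VisualBasic

Module UpperBoundSketch
    ' Reports the upper bounds of both dimensions of a hypothetical 5 x 3 array,
    ' first through the UBound function and then through GetUpperBound.
    Public Function DescribeBounds() As String
        Dim grid(4, 2) As Integer

        ' UBound takes an optional 1-based rank; omitting it means the first dimension.
        Dim viaFunction As String = UBound(grid).ToString() & "," & UBound(grid, 2).ToString()

        ' GetUpperBound always requires a 0-based dimension index.
        Dim viaMethod As String = grid.GetUpperBound(0).ToString() & "," & grid.GetUpperBound(1).ToString()

        Return viaFunction & " / " & viaMethod
    End Function

    Sub Main()
        Console.WriteLine(DescribeBounds())   ' 4,2 / 4,2
    End Sub
End Module
```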
The following code considers the use of a declared but not instantiated array. Unlike an integer value, which has a default of 0, an array waits until a size is defined to allocate the memory it will use. The following example revisits the declaration of an array that has not yet been instantiated. If an attempt were made to assign a value to this array, it would trigger an exception.
Dim arrMyIntArray5() As Integer
' The commented statement below would compile but would cause a runtime exception.
'arrMyIntArray5(0) = 1
The solution to this is to use the ReDim keyword. Although ReDim was part of Visual Basic 6.0, it has changed slightly. The first change is that code must first Dim an instance of the variable; it is not acceptable to declare an array using the ReDim statement. The second change is that code cannot change the number of dimensions in an array. For example, an array with three dimensions cannot grow to an array of four dimensions, nor can it be reduced to only two dimensions.
To further extend the example code associated with arrays, consider the following, which manipulates some of the arrays previously declared:
Dim arrMyIntArray3(4, 2) As Integer
Dim arrMyIntArray4(,) As Integer = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}, {13, 14, 15}}
ReDim arrMyIntArray5(2)
ReDim arrMyIntArray3(5, 4)
ReDim Preserve arrMyIntArray4(UBound(arrMyIntArray4), 1)
The ReDim of arrMyIntArray5 instantiates the elements of the array so that values can be assigned to each element. The second statement redimensions the arrMyIntArray3 variable defined earlier. Note that it is changing the size of both the first dimension and the second dimension. While it is not possible to change the number of dimensions in an array, you can resize any of an array's dimensions. This capability is required, as declarations such as Dim arrMyIntArray6( , , ,) As Integer are legal.
By the way, while it is possible to repeatedly ReDim a variable, for performance reasons this action should ideally be done only rarely, and never within a loop. If you intend to loop through a set of entries and add entries to an array, try to determine the number of entries you'll need before entering the loop, or at a minimum ReDim the size of your array in chunks to improve performance.
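One way to follow the advice about resizing in chunks is sketched below. This is only an illustration under stated assumptions: the chunk size of 100, the count of 250 entries, and all names are hypothetical, and a real application would choose a chunk size that fits its data.

```vb
Imports System

Module ChunkedGrowthSketch
    ' Fills an array with 250 values, growing it 100 elements at a time
    ' instead of calling ReDim once per added entry.
    Public Function FillInChunks() As Integer
        Const ChunkSize As Integer = 100
        Dim items(ChunkSize - 1) As Integer
        Dim used As Integer = 0

        For value As Integer = 1 To 250
            ' Resize only when the current allocation is full.
            If used > items.GetUpperBound(0) Then
                ReDim Preserve items(items.Length + ChunkSize - 1)
            End If
            items(used) = value
            used += 1
        Next

        ' Trim the unused tail once, after the loop has finished.
        ReDim Preserve items(used - 1)
        Return items.Length
    End Function

    Sub Main()
        Console.WriteLine(FillInChunks())   ' 250
    End Sub
End Module
```

In this sketch ReDim runs three times in total (two growth steps and one trim) rather than 250 times.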
The last item in the code snippet in the preceding section illustrates an additional keyword associated with redimensioning. The Preserve keyword indicates that the data stored in the array prior to redimensioning should be transferred to the newly created array. If this keyword is not used, then the data stored in an array is lost. Additionally, in the preceding example, the ReDim statement actually reduces the second dimension of the array. Although this is a perfectly legal statement, this means that even though you have specified preserving the data, the data values 3, 6, 9, 12, and 15 that were assigned in the original definition of this array will be discarded. These are lost because they were assigned in the highest index of the second dimension. Because arrMyIntArray4(1,2) is no longer valid, the value that resided at this location (6) has been lost.
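The data loss described above can be checked with a small sketch. The names and the 2 x 3 array here are assumptions chosen for brevity. One related rule worth keeping in mind is that when Preserve is used, only the last dimension of the array can be resized, which is why the example in the text leaves the first dimension unchanged.

```vb
Imports System

Module PreserveSketch
    ' Shrinks and then regrows the last dimension of a small array
    ' to show that Preserve cannot bring back discarded values.
    Public Function ShrinkAndRegrow() As Integer
        Dim values(,) As Integer = {{1, 2, 3}, {4, 5, 6}}

        ' Shrink from three columns to two; the column holding 3 and 6 is dropped.
        ReDim Preserve values(1, 1)

        ' Grow back to three columns; the new column holds default values.
        ReDim Preserve values(1, 2)

        ' This position originally held 6.
        Return values(1, 2)
    End Function

    Sub Main()
        Console.WriteLine(ShrinkAndRegrow())   ' 0
    End Sub
End Module
```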
Arrays continue to be very powerful in Visual Basic, but the basic Array class is just that, basic. It provides a powerful framework, but it does not provide a lot of other features that would enable more robust logic to be built into the array. To achieve more advanced features, such as sorting and dynamic allocation, .NET provides the classes that make up the System.Collections namespace. These classes build on the array concept rather than inheriting from the Array class itself.
Class | Description |
ArrayList | Implements an array whose size increases automatically as elements are added. |
BitArray | Manages an array of Booleans that are stored as bit values. |
Hashtable | Implements a collection of values organized by key. Sorting is done based on a hash of the key. |
Queue | Implements a first in, first out collection. |
SortedList | Implements a collection of values with associated keys. The values are sorted by key and are accessible by key or index. |
Stack | Implements a last in, first out collection. |
Each of the objects listed focuses on storing a collection of objects. This means that in addition to the special capabilities each provides, it also provides one additional capability not available to objects created based on the Array class. Because every variable in .NET is based on the Object class, it is possible to have a collection that contains elements that are defined with different types. So a collection might contain an integer as its first item, a string as its second item, and a custom Person object as its third item. There is no guarantee of the type safety that is an implicit feature of an array.
Each of the preceding collection types stores an array of objects. All classes are of type Object, so a string could be stored in the same collection with an integer. It's possible within these collection classes for the actual objects being stored to be different types. Consider the following code example, which implements the ArrayList collection class within Form1.vb:
Private Sub SampleColl()
    Dim objMyArrList As New System.Collections.ArrayList()
    Dim objItem As Object
    Dim intLine As Integer = 1
    Dim strHello As String = "Hello"
    Dim objWorld As New System.Text.StringBuilder("World")
    ' Add an integer value to the array list.
    objMyArrList.Add(intLine)
    ' Add an instance of a string object.
    objMyArrList.Add(strHello)
    ' Add a single character cast as a character.
    objMyArrList.Add(" "c)
    ' Add an object that isn't a primitive type.
    objMyArrList.Add(objWorld)
    ' To balance the string, insert a break between the line
    ' and the string "Hello", by inserting a string constant.
    objMyArrList.Insert(1, ". ")
    For Each objItem In objMyArrList
        ' Output the values on a single line.
        TextBoxOutput.Text += objItem.ToString()
    Next
    TextBoxOutput.Text += vbCrLf
    For Each objItem In objMyArrList
        ' Output the types, one per line.
        TextBoxOutput.Text += objItem.GetType.ToString() & vbCrLf
    Next
End Sub
The collection classes, as this example shows, are versatile. The preceding code creates a new instance of an ArrayList, along with some related variables to support the demonstration. The code then shows four different types of variables being inserted into the same ArrayList. Next, the code inserts another value into the middle of the list. At no time has the size of the array been declared, nor has a redefinition of the array size been required. The output is shown in the accompanying figure.
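The flip side of this versatility is that code reading from the list cannot assume what it will find. The following sketch, with hypothetical names and sample values, shows the kind of type check that becomes necessary once a list holds mixed types.

```vb
Imports System
Imports System.Collections

Module MixedListSketch
    ' Adds up only the Integer items in a list that may also hold other types.
    Public Function SumIntegers(ByVal items As ArrayList) As Integer
        Dim total As Integer = 0
        For Each item As Object In items
            ' Nothing prevents a String or StringBuilder from being in the list,
            ' so the type has to be checked before converting.
            If TypeOf item Is Integer Then
                total += CInt(item)
            End If
        Next
        Return total
    End Function

    Sub Main()
        Dim items As New ArrayList()
        items.Add(1)
        items.Add("Hello")
        items.Add(41)
        Console.WriteLine(SumIntegers(items))   ' 42
    End Sub
End Module
```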
Visual Basic has additional classes available as part of the System.Collections.Specialized namespace. These classes tend to be oriented around a specific problem. For example, the ListDictionary class is designed to take advantage of the fact that although a hash table is very good at storing and retrieving a large number of items, it can be costly when it contains only a few items. Similarly, the StringCollection and StringDictionary classes are defined so that when working with strings, the time spent interpreting the type of object is reduced and overall performance is improved. Each class defined in this namespace represents a specialized implementation that has been optimized for handling special types of collections.
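As a brief illustration of one of these specialized classes, the sketch below uses StringCollection. The module, function, and sample names are assumptions. The point it tries to show is that the collection's indexer returns a String, so no conversion from Object is needed when reading an item.

```vb
Imports System
Imports System.Collections.Specialized

Module StringCollectionSketch
    ' Stores two names and reads them back without any conversion from Object.
    Public Function JoinNames() As String
        Dim names As New StringCollection()
        names.Add("Ann")
        names.Add("Bob")

        ' The indexer is typed as String.
        Dim first As String = names(0)
        Return first & "+" & names(1)
    End Function

    Sub Main()
        Console.WriteLine(JoinNames())   ' Ann+Bob
    End Sub
End Module
```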
The preceding examples have relied on the use of the For...Next statement, which has not yet been covered. Since you've now covered both arrays and collections, it's appropriate to introduce the primary commands for working with the elements contained in those variable types. Both the For loop and While loop share similar characteristics, and which should be used is often a matter of preference.
The For structure in Visual Basic is the primary way of managing loops. It actually has two different formats. A standard For Next statement enables you to set a loop control variable that can be incremented by the For statement and custom exit criteria from your loop. Alternatively, if you are working with a collection in which the array items are not indexed numerically, then it is possible to use a For Each loop to automatically loop through all of the items in that collection. The following code shows a typical For Next loop that cycles through each of the items in an array:
For i As Integer = 0 To 10 Step 2
    arrMyIntArray1(i) = i
Next
The preceding example sets the value of every other array element to its index, starting with the first item, because like all .NET collections, the collection starts at 0. As a result, items 0, 2, 4, 6, 8, and 10 are set, but items 1, 3, 5, 7, and 9 are not explicitly defined, because the loop doesn't address those values. In the case of integers, they'll default to a value of 0 because an integer is a value type; however, if this were an array of strings or other reference types, then these array nodes would actually be undefined, that is, Nothing.
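The difference between skipped value-type and reference-type elements can be confirmed with a short sketch. The array names and sizes below are hypothetical and are not taken from the chapter's project.

```vb
Imports System

Module DefaultValuesSketch
    ' Fills every other element of an Integer array and returns a skipped one.
    Public Function SkippedInteger() As Integer
        Dim numbers(10) As Integer
        For i As Integer = 0 To 10 Step 2
            numbers(i) = i
        Next
        Return numbers(1)
    End Function

    ' Does the same with a String array and reports whether a skipped element is Nothing.
    Public Function SkippedStringIsNothing() As Boolean
        Dim names(10) As String
        For i As Integer = 0 To 10 Step 2
            names(i) = i.ToString()
        Next
        Return names(1) Is Nothing
    End Function

    Sub Main()
        Console.WriteLine(SkippedInteger())           ' 0
        Console.WriteLine(SkippedStringIsNothing())   ' True
    End Sub
End Module
```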
The For Next loop is most commonly set up to traverse an array, collection, or similar construct (for example, a data set). The control variable i in the preceding example must be numeric. The value can be incremented from a starting value to an ending value, which are 0 and 10, respectively, in this example. Finally, it is possible to accept the default increment of 1; or, if desired, you can add a Step qualifier to your command and update the control value by a value other than 1. Note that setting the value of Step to 0 means that your loop will theoretically loop an infinite number of times. Best practices suggest your control value should be an integer greater than 0 and not a decimal or other floating-point number.
Visual Basic provides two additional commands that can be used within the For loop's block to control how the loop proceeds. The first is Exit For; and as you might expect, this statement causes the loop to end and not continue to the end of the processing. The other is Continue For, which tells the loop that you are finished executing code with the current control value and that it should increment the value and reenter the loop for its next iteration:
For i = 1 To 100 Step 2
    If arrMyIntArray1.Count <= i Then Exit For
    If i = 5 Then Continue For
    arrMyIntArray1(i) = i - 1
Next
Both the Exit For and Continue For keywords were used in the preceding example. Note how each uses a format of the If-Then structure that places the command on the same line as the If statement so that no End If statement is required. This loop exits if the control value equals or exceeds the number of elements defined for arrMyIntArray1.
Next, if the control variable i indicates you are looking at the sixth item in the array (index of five), then this row is to be ignored, but processing should continue within the loop. Keep in mind that even though the loop control variable starts at 1, the first element of the array is still at 0. The Continue statement indicates that the loop should return to the For statement and increment the associated control variable. Thus, the code does not process the next line for item six, where i equals 5.
The preceding examples cover the common case in which a loop processes a known collection. For that case, Visual Basic provides a command that encapsulates the management of the loop control variable. The For Each structure automates the counting process and enables you to quickly assign the current item from the collection so that you can act on it in your code. It is a common way to process all of the rows in a data set or most any other collection, and all of the loop control elements such as Continue For and Exit For are still available:
For Each item As Object In objMyArrList
    'Code A1
Next
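To show the loop control statements inside a For Each block, here is a minimal sketch. The function name and the rule it applies (skip zeros, stop at the first negative value) are assumptions made up for the example.

```vb
Imports System

Module ForEachControlSketch
    ' Sums values, skipping zeros and stopping at the first negative value.
    Public Function SumUntilNegative(ByVal values As Integer()) As Integer
        Dim total As Integer = 0
        For Each value As Integer In values
            If value = 0 Then Continue For   ' move on to the next item
            If value < 0 Then Exit For       ' leave the loop entirely
            total += value
        Next
        Return total
    End Function

    Sub Main()
        Dim sample As Integer() = New Integer() {3, 0, 4, -1, 10}
        Console.WriteLine(SumUntilNegative(sample))   ' 7
    End Sub
End Module
```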
In addition to the For loop, Visual Basic includes the While and Do loops, with two different versions of the Do loop. The first is the Do While loop. With a Do While loop, your code starts by checking for a condition; and as long as that condition is true, it executes the code contained in the Do loop. Optionally, instead of starting the loop by checking the While condition, the code can enter the loop and then check the condition at the end of the loop. The Do Until loop is similar to the Do While loop:
Do While blnTrue = True
    'Code A1
Loop
The Do Until loop differs from the Do While loop in that it repeats until its condition becomes true, rather than for as long as its condition remains true. In addition, by convention, the condition for a Do Until is placed after the code block, thus requiring the code in the Do block to execute once before the condition is checked. It bears repeating, however, that a Do Until block can place the Until condition with the Do statement or with the Loop statement. A Do While block can similarly have its condition at the end of the loop:
Do
    'Code A1
Loop Until (blnTrue = True)
In both cases, instead of basing the loop around an array of items or a fixed number of iterations, the loop is instead instructed to continue perpetually until a condition is met. A good use for these loops involves tasks that need to repeat for as long as your application is running. Similar to the For loop, there are Exit Do and Continue Do commands that end the loop or move to the next iteration, respectively. Note that parentheses are allowed but are not required for either the While or the Until conditional expression.
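The practical effect of where the condition is placed can be seen by counting how often the loop body runs. The sketch below uses hypothetical names and deliberately picks conditions that stop each loop as early as possible.

```vb
Imports System

Module DoLoopSketch
    ' Condition checked before the body: with a false condition the body never runs.
    Public Function PreTestRuns() As Integer
        Dim runs As Integer = 0
        Dim keepGoing As Boolean = False
        Do While keepGoing
            runs += 1
        Loop
        Return runs
    End Function

    ' Condition checked after the body: the body runs once even though
    ' the exit condition is already true.
    Public Function PostTestRuns() As Integer
        Dim runs As Integer = 0
        Dim done As Boolean = True
        Do
            runs += 1
        Loop Until done
        Return runs
    End Function

    Sub Main()
        Console.WriteLine(PreTestRuns())    ' 0
        Console.WriteLine(PostTestRuns())   ' 1
    End Sub
End Module
```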
The other format for creating a loop is to omit the Do statement and just create a While loop. The While loop works similarly to the Do loop, with the following differences. First, the While loop's endpoint is an End While statement instead of a Loop statement. Second, the condition must be at the start of the loop with the While statement, similar to the Do While. Finally, the While loop has an Exit While statement instead of Exit Do, although the behavior is the same. An example is shown here:
While blnTrue = True
    If blnFalse Then
        blnTrue = False
    End If
    If Not blnTrue Then Exit While
    System.Threading.Thread.Sleep(500)
    blnFalse = True
End While
The While loop has more in common with the For loop, and in those situations where someone is familiar with another language such as C++ or C#, it is more likely to be used than the older Do-Loop syntax that is more specific to Visual Basic.
Finally, before leaving the discussion of looping, note the potential use of endless loops. Seemingly endless, or infinite, loops play a role in application development, so it's worthwhile to illustrate how you might use one. For example, if you were writing an e-mail program, you might want to check the user's mailbox on the server every 20 seconds. You could create a Do While or Do Until loop that contains the code to open a network connection and check the server for any new mail messages to download. You would continue this process until either the application was closed or you were unable to connect to the server. When the application was asked to close, the loop's Exit statement would execute, thus terminating the loop. Similarly, if the code were unable to connect to the server, it might exit the current loop, alert the user, and probably start a loop that would look for network connectivity on a regular basis.
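The mail-checking scenario might be sketched roughly as follows. Everything here is a stand-in: TryCheckMail is a hypothetical function that simulates a server becoming unreachable on the third attempt, and the 10-millisecond pause replaces the 20-second wait so that the sketch finishes quickly.

```vb
Imports System

Module PollingSketch
    ' Hypothetical stand-in for a real mail check.
    ' Returns False once the simulated server can no longer be reached.
    Public Function TryCheckMail(ByVal attempt As Integer) As Boolean
        Return attempt < 3
    End Function

    ' Polls in a seemingly endless loop and returns the attempt on which it gave up.
    Public Function PollUntilFailure() As Integer
        Dim attempt As Integer = 0
        Do While True
            attempt += 1
            ' Leave the loop when the connection fails; a real program
            ' would also leave it when asked to shut down.
            If Not TryCheckMail(attempt) Then Exit Do
            System.Threading.Thread.Sleep(10)   ' 20 seconds in the scenario described above
        Loop
        Return attempt
    End Function

    Sub Main()
        Console.WriteLine(PollUntilFailure())   ' 3
    End Sub
End Module
```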
Normally, when a conversion (implicit or explicit) occurs, the original value is read from its current memory location, and then the new value is assigned. For example, to convert a Short to a Long, the system reads the two bytes of Short data and writes them to the appropriate bytes for the Long variable. However, under Visual Basic, if a value type needs to be managed as an object, then the system performs an intermediate step. This intermediate step involves taking the value on the stack and copying it as the referenced value of a new object, to the heap, a process referred to as boxing. In Chapter 3, in the section titled “Value and Reference Types,” a distinction was made regarding how certain types were stored. As noted then, Value types are stored on the stack, while reference values are stored on the heap. As noted earlier, the Object class is implemented as a reference type, so the system needs to convert value types into reference types for them to be objects. This doesn't cause any problems or require any special programming, because boxing isn't something you declare or directly control, but it does affect performance.
If you're copying the data for a single value type, this is not a significant cost, but if you're processing an array that contains thousands of values, the time spent moving between a value type and a temporary reference type can be significant. Thus, if when reviewing code you find a scenario where a value is boxed, it may not be of significant concern. It becomes something to address when that boxing occurs within a loop that is executed thousands or millions of times. When considering best practices, boxing is something to address when working with large collections and calls that are made repeatedly.
Fortunately, there are ways to limit the amount of boxing that occurs when using collections. One method that works well is to create a class based on the value type you need to work with. This might seem counterintuitive at first, because it costs more to create a class. The key is how often you reuse the data contained in the class. By repeatedly using the object to interact with other objects, you avoid creating a temporary boxed object.
Examples in two important areas will help illustrate boxing. The first involves the use of arrays. When an array is created, the portion of the class that tracks the element of the array is created as a reference object, but each element of the array is created directly. Thus, an array of integers consists of an Array object and a set of Integer value types. When you update one of the values with another integer value, no boxing is involved:
Dim arrInt(20) As Integer
Dim intMyValue As Integer = 1
arrInt(0) = 0
arrInt(1) = intMyValue
Neither of these assignments of an integer value into the integer array that was defined previously requires boxing. In each case, the array object identifies which element needs to be referenced, and the value is assigned directly to that element. The point here is that just because you have referenced an object doesn't mean you are going to box a value. The boxing occurs only when the values being assigned are being transitioned from value types to reference types:
Dim strBldr As New System.Text.StringBuilder()
Dim mySortedList As New System.Collections.SortedList()
Dim count As Integer
For count = 1 To 100
    strBldr.Append(count)
    mySortedList.Add(count, count)
Next
The preceding snippet illustrates two separate calls to object interfaces. One call requires boxing of the value count, while the other does not. Nothing in the code indicates which call is which, but the Append method of StringBuilder has been overloaded to include a version that accepts an integer, while the Add method of the SortedList collection expects two objects. Although the integer values can be recognized by the system as objects, doing so requires the runtime library to box these values so that they can be added to the sorted list.
When looking for boxing, the concern isn't that you are working with objects as part of an action, but that you are passing a value type to a parameter that expects an object, or you are taking an object and converting it to a value type. However, boxing does not occur when you call a method on a value type. There is no conversion to an object, so if you need to assign an integer to a string using the ToString method, there is no boxing of the integer value as part of the creation of the string. That said, you are explicitly creating a new string object, so the cost is similar.
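A small sketch can make the copy that boxing performs visible. The names below are hypothetical. The first function boxes an Integer by assigning it to an Object variable and then changes the original; the second stores integers in a generic List(Of Integer), which is typed to hold Integer values directly.

```vb
Imports System
Imports System.Collections.Generic

Module BoxingSketch
    ' Shows that the boxed value is a separate copy of the original Integer.
    Public Function BoxedCopyIsIndependent() As Integer
        Dim number As Integer = 42
        Dim boxed As Object = number   ' boxing: the value is copied into a new object
        number = 43                    ' changes only the original variable
        Return CInt(boxed)             ' unboxing: converts the object back to an Integer
    End Function

    ' Adds 1 through 100 to a typed list; Add accepts an Integer, not an Object.
    Public Function SumTypedList() As Integer
        Dim typed As New List(Of Integer)()
        For count As Integer = 1 To 100
            typed.Add(count)
        Next
        Dim total As Integer = 0
        For Each value As Integer In typed
            total += value
        Next
        Return total
    End Function

    Sub Main()
        Console.WriteLine(BoxedCopyIsIndependent())   ' 42
        Console.WriteLine(SumTypedList())             ' 5050
    End Sub
End Module
```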
This works because both the return type of the function and the type of the data variable are exactly the same. Not only are they both Generic.Dictionary derivatives, they have exactly the same types in the declaration.
The same is true for parameters:
Private Sub DoWork(ByVal values As Generic.Dictionary(Of Integer, String))
    ' do work here
End Sub
Again, the parameter type is defined not only by the generic type, but also by the specific type values used to initialize the generic template.
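Putting the return-type and parameter cases together, a sketch might look like the following. LoadValues and CountValues are hypothetical names invented for this illustration; the commented-out line indicates the kind of mismatch the compiler would reject.

```vb
Imports System
Imports System.Collections.Generic

Module DictionaryTypeSketch
    ' Returns a dictionary whose key and value types are part of the return type.
    Public Function LoadValues() As Dictionary(Of Integer, String)
        Dim result As New Dictionary(Of Integer, String)()
        result.Add(1, "one")
        result.Add(2, "two")
        Return result
    End Function

    ' Accepts only dictionaries declared with exactly these type arguments.
    Public Function CountValues(ByVal values As Dictionary(Of Integer, String)) As Integer
        Return values.Count
    End Function

    Sub Main()
        Dim data As Dictionary(Of Integer, String) = LoadValues()
        Console.WriteLine(CountValues(data))   ' 2

        ' The following would not compile, because the type arguments differ:
        ' Dim wrong As Dictionary(Of String, String) = LoadValues()
    End Sub
End Module
```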
It is possible to inherit from a generic type as you define a new class. For instance, the .NET BCL defines the System.ComponentModel.BindingList(Of T) generic type. This type is used to create collections that can support data binding. You can use this as a base class to create your own strongly typed, data-bindable collection. Add new classes named Customer and CustomerList with the following:
Public Class Customer
    Public Property Name() As String
End Class

Public Class CustomerList
    Inherits System.ComponentModel.BindingList(Of Customer)

    Private Sub CustomerList_AddingNew(ByVal sender As Object, ByVal e As System.ComponentModel.AddingNewEventArgs) Handles Me.AddingNew
        Dim cust As New Customer()
        cust.Name = "<new>"
        e.NewObject = cust
    End Sub
End Class
When you inherit from BindingList(Of T), you must provide a specific type—in this case, Customer. This means that your new CustomerList class extends and can customize BindingList(Of Customer). Here you are providing a default value for the Name property of any new Customer object added to the collection.
When you inherit from a generic type, you can employ all the normal concepts of inheritance, including overloading and overriding methods, extending the class by adding new methods, handling events, and so forth.
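As an illustration of overriding rather than event handling, the sketch below supplies the default item by overriding the protected OnAddingNew method of BindingList(Of T). This is an alternative to the Handles Me.AddingNew approach, not the chapter's own code, and the Contact and ContactList names are hypothetical so that they do not clash with the Customer classes. It runs as a console program, with AddNew called directly instead of through a grid.

```vb
Imports System
Imports System.ComponentModel

Public Class Contact
    Public Property Name As String
End Class

Public Class ContactList
    Inherits BindingList(Of Contact)

    ' Supplies a default item unless a subscribed handler has already provided one.
    Protected Overrides Sub OnAddingNew(ByVal e As AddingNewEventArgs)
        MyBase.OnAddingNew(e)   ' let any AddingNew handlers run first
        If e.NewObject Is Nothing Then
            e.NewObject = New Contact() With {.Name = "<new>"}
        End If
    End Sub
End Class

Module ContactListSketch
    ' Adds one item through AddNew and returns the name it was given.
    Public Function NameOfAddedItem() As String
        Dim list As New ContactList()
        Dim added As Contact = list.AddNew()
        Return added.Name
    End Function

    Sub Main()
        Console.WriteLine(NameOfAddedItem())   ' <new>
    End Sub
End Module
```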
To see this in action, add a new Button control named ButtonCustomer to Form1 and add a new form named FormCustomerGrid to the project. Add a DataGridView control to FormCustomerGrid and dock it by setting its Dock property to Fill so that it fills the parent container.
Next, double-click on the new button on Form1 to generate a click event handler for the button. In the code-behind ButtonCustomer_Click event handler, add the following:
FormCustomerGrid.ShowDialog()
Next, customize FormCustomerGrid by adding the following code behind the form:
Public Class FormCustomerGrid
    Dim list As New CustomerList()

    Private Sub FormCustomerGrid_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        DataGridView1.DataSource = list
    End Sub
End Class
This code creates an instance of CustomerList and data-binds the list as the DataSource for the DataGridView control. When you run the program and click the button to open FormCustomerGrid, notice that the grid contains a newly added Customer object. As you interact with the grid, new Customer objects are automatically added, with a default name of <new>. An example is shown in the accompanying figure.
All this functionality of adding new objects and setting the default Name value occurs because CustomerList inherits from BindingList(Of Customer).
A generic method is a single method that is called not only with conventional parameters, but also with type information that defines the method. Generic methods are far less common than generic types. Due to the extra syntax required to call a generic method, they are also less readable than a normal method.
A generic method may exist in any class or module; it does not need to be contained within a generic type. The primary benefit of a generic method is avoiding the use of CType or DirectCast to convert parameters or return values between different types.
It is important to realize that the type conversion still occurs. Generics merely provide an alternative mechanism to use instead of CType or DirectCast.
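The following sketch contrasts the two styles. FirstAsObject and FirstOf are hypothetical helper names. With the Object-based version the caller writes the conversion; with the generic version the caller states the type once and receives a value of that type.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic

Module FirstItemSketch
    ' Object-based version: the caller has to convert the result.
    Public Function FirstAsObject(ByVal items As IList) As Object
        Return items(0)
    End Function

    ' Generic version: the return type follows the type argument.
    Public Function FirstOf(Of T)(ByVal items As IList(Of T)) As T
        Return items(0)
    End Function

    ' Calls both versions on the same list and joins the results.
    Public Function Demo() As String
        Dim names As New List(Of String)()
        names.Add("Ann")
        names.Add("Bob")

        Dim viaObject As String = CType(FirstAsObject(names), String)
        Dim viaGeneric As String = FirstOf(Of String)(names)
        Return viaObject & "/" & viaGeneric
    End Function

    Sub Main()
        Console.WriteLine(Demo())   ' Ann/Ann
    End Sub
End Module
```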
Without generics, code often uses the Object type, such as:
Public Function AreEqual(ByVal a As Object, ByVal b As Object) As Boolean
    Return a.Equals(b)
End Function
The problem with Object-based code such as this is that a and b could be anything. There is no restriction here, so nothing ensures that they are even the same type. An alternative is to use generics, such as:
Public Function AreEqual(Of T)(ByVal a As T, ByVal b As T) As Boolean
    Return a.Equals(b)
End Function
Now a and b are forced to be the same type, and that type is specified when the method is invoked. In order to test these, create a new Sub method using the following:
Private Sub CheckEqual()
    Dim result As Boolean
    ' use normal method
    result = AreEqual(1, 2)
    result = AreEqual("one", "two")
    result = AreEqual(1, "two")
    ' use generic method
    result = AreEqual(Of Integer)(1, 2)
    result = AreEqual(Of String)("one", "two")
    'result = AreEqual(Of Integer)(1, "two")
End Sub
However, why not just declare the method as a Boolean? This code will probably cause some confusion. The first three method calls are invoking the normal AreEqual method. Notice that there is no problem asking the method to compare an Integer and a String.
The second set of calls looks very odd. At first glance, they look like nonsense to many people. This is because invoking a generic method means providing two sets of parameters to the method, rather than the normal one set of parameters.
The first set of parameters contains the type or types required to define the method. This is much like the list of types you must provide when declaring a variable using a generic class. In this case, you're specifying that the AreEqual method will be operating on parameters of type Integer.
The second set of parameters contains the conventional parameters that you'd normally supply to a method. What is special in this case is that the types of the parameters are being defined by the first set of parameters. In other words, in the first call, the type is specified to be Integer, so 1 and 2 are valid parameters. In the second call, the type is String, so “one” and “two” are valid. Notice that the third line is commented out. This is because 1 and “two” aren't the same type; with Option Strict On, the compiler will flag this as an error. With Option Strict Off, the runtime will attempt to convert the string at runtime and fail, so this code will not function correctly.
Earlier in this chapter, you used the Dictionary generic, which specifies multiple type parameters. To declare a class with multiple type parameters, you use syntax such as the following (code file: MyCoolType.vb):
Public Class MyCoolType(Of T, V)
    Private mValue As T
    Private mData As V

    Public Sub New(ByVal value As T, ByVal data As V)
        mValue = value
        mData = data
    End Sub
End Class
In addition, it is possible to use regular types in combination with type parameters, as follows:
Public Class MyCoolType(Of T, V)
    Private mValue As T
    Private mData As V
    Private mActual As Double

    Public Sub New(ByVal value As T, ByVal data As V, ByVal actual As Double)
        mValue = value
        mData = data
        mActual = actual
    End Sub
End Class
Other than the fact that variables or parameters of types T or V must be treated as type System.Object, you can write virtually any code you choose. The code in a generic class is really no different from the code you'd write in a normal class.
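To show a two-parameter generic class in use, here is a sketch built around a hypothetical LabeledValue(Of T, V) class. It is modeled loosely on MyCoolType but adds a Describe method (an assumption, not part of the chapter's code) so that there is something to observe. Inside the class, only members of System.Object, such as ToString, are called on the T and V values.

```vb
Imports System

Public Class LabeledValue(Of T, V)
    Private ReadOnly mValue As T
    Private ReadOnly mData As V

    Public Sub New(ByVal value As T, ByVal data As V)
        mValue = value
        mData = data
    End Sub

    ' Uses only ToString, which every type inherits from System.Object.
    Public Function Describe() As String
        Return mValue.ToString() & "=" & mData.ToString()
    End Function
End Class

Module LabeledValueSketch
    ' Supplies both type arguments when creating the instance.
    Public Function Demo() As String
        Dim item As New LabeledValue(Of Integer, String)(1, "one")
        Return item.Describe()
    End Function

    Sub Main()
        Console.WriteLine(Demo())   ' 1=one
    End Sub
End Module
```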
This includes all the object-oriented capabilities of classes, including inheritance, overloading, overriding, events, methods, properties, and so forth. However, there are some limitations on overloading. In particular, when overloading methods with a type parameter, the compiler does not know what that specific type might be at runtime. Thus, you can only overload methods in ways in which the type parameter (which could be any type) does not lead to ambiguity.
For instance, adding the following two methods to MyCoolType before the .NET Framework 3.5 would have resulted in a compiler error:
Public Sub DoWork(ByVal data As Integer)
    ' do work here
End Sub

Public Sub DoWork(ByVal data As V)
    ' do work here
End Sub
This is now legal. Once a specific MyCoolType is declared, the compiler knows the type supplied for V and can resolve each call to DoWork against that type. Earlier compilers rejected the declaration because they could not know whether V would turn out to be an Integer at runtime. If V were to end up defined as an Integer, then you'd have two identical method signatures in the same class.
Not only can you create basic generic class templates, you can also combine the concept with inheritance. This can be as basic as having a generic template inherit from an existing class:
Public Class MyControls(Of T)
    Inherits Control
End Class
In this case, the MyControls generic class inherits from the Windows Forms Control class, thus gaining all the behaviors and interface elements of a Control.
Alternately, a conventional class can inherit from a generic template. Suppose that you have a simple generic template:
Public Class GenericBase(Of T)
End Class
It is quite practical to inherit from this generic class as you create other classes:
Public Class Subclass
    Inherits GenericBase(Of Integer)
End Class
Notice how the Inherits statement not only references GenericBase, but also provides a specific type for the type parameter of the generic type. Anytime you use a generic type, you must provide values for the type parameters, and this is no exception. This means that your new Subclass actually inherits from a specific instance of GenericBase, where T is of type Integer.
Finally, you can also have generic classes inherit from other generic classes. For instance, you can create a generic class that inherits from the GenericBase class:
Public Class GenericSubclass(Of T)
    Inherits GenericBase(Of Integer)
End Class
As with the previous example, this new class inherits from an instance of GenericBase, where T is of type Integer.
Things can get far more interesting. It turns out that you can use type parameters to specify the types for other type parameters. For instance, you could alter GenericSubclass like this:
Public Class GenericSubclass(Of V)
    Inherits GenericBase(Of V)
End Class
Notice that you're specifying that the type parameter for GenericBase is V, which is the type provided by the caller when declaring a variable of type GenericSubclass. Therefore, if a caller declares an object as a GenericSubclass(Of String), then V is of type String. This means that GenericSubclass now inherits from an instance of GenericBase whose T parameter is also of type String. The point is that the type flows through from the subclass into the base class. If that is not complex enough, and you want a feel for how twisted this logic can become, consider the following class definition:
Public Class GenericSubclass(Of V)
    Inherits GenericBase(Of GenericSubclass(Of V))
End Class
In this case, the GenericSubclass is inheriting from GenericBase, where the T type in GenericBase is actually based on the declared instance of the GenericSubclass type. A caller can create such an instance with the simple declaration which follows:
Dim obj As GenericSubclass(Of Date)
In this case, the GenericSubclass type has a V of type Date. It also inherits from GenericBase, which has a T of type GenericSubclass(Of Date).
Such complex relationships are typically not useful; in fact, they are often counterproductive, making code difficult to follow and debug. The point is that it is important to recognize how types flow through generic templates, especially when inheritance is involved.
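The simpler pass-through case, in which V flows from GenericSubclass into GenericBase, can be checked at runtime. The following is only a sketch: the BaseTypeArgument function and the TypeFlowDemo module are hypothetical helpers added to make the type visible, and are not part of the classes shown earlier.

```vb
Option Strict On
Imports System

Public Class GenericBase(Of T)
    ' Hypothetical helper that reports the type supplied for T.
    Public Function BaseTypeArgument() As Type
        Return GetType(T)
    End Function
End Class

Public Class GenericSubclass(Of V)
    Inherits GenericBase(Of V)
End Class

Public Module TypeFlowDemo
    Public Sub Main()
        Dim obj As New GenericSubclass(Of String)()

        ' V is String, so the inherited T is String as well.
        Console.WriteLine(obj.BaseTypeArgument().Name)

        ' The object is therefore also a GenericBase(Of String).
        Console.WriteLine(TypeOf obj Is GenericBase(Of String))
    End Sub
End Module
```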
You can also define generic Structure types. The basic rules and concepts are the same as for defining generic classes, as shown here:
Public Structure MyCoolStructure(Of T)
    Public Value As T
End Structure
As with generic classes, the type parameter or parameters represent real types that are provided by the user of the Structure in actual code. Thus, anywhere you see a T in the structure, it will be replaced by a real type such as String or Integer.
Code can use the Structure in a manner similar to how a generic class is used:
Dim data As MyCoolStructure(Of Guid)
When the variable is declared, an instance of the Structure is created based on the type parameter provided. In this example, an instance of MyCoolStructure that holds Guid objects has been created.
Finally, you can define generic interface types. Generic interfaces are a bit different from generic classes or structures, because they are implemented by other types when they are used. You can create a generic interface using the same syntax used for classes and structures:
Public Interface ICoolInterface(Of T)
    Sub DoWork(ByVal data As T)
    Function GetAnswer() As T
End Interface
Then the interface can be used within another type. For instance, you might implement the interface in a class:
Public Class ARegularClass
    Implements ICoolInterface(Of String)

    Public Sub DoWork(ByVal data As String) _
        Implements ICoolInterface(Of String).DoWork
    End Sub

    Public Function GetAnswer() As String _
        Implements ICoolInterface(Of String).GetAnswer
    End Function
End Class
Notice that you provide a real type for the type parameter in the Implements statement and Implements clauses on each method. In each case, you are specifying a specific instance of the ICoolInterface interface—one that deals with the String data type.
As with classes and structures, an interface can be declared with multiple type parameters. Those type parameter values can be used in place of any normal type (such as String or Date) in any Sub, Function, Property, or Event declaration.
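To see how calling code might work against the interface rather than the implementing class, consider the following sketch. The StringWorker class, its mLast field, and the InterfaceDemo module are hypothetical names introduced for illustration; only ICoolInterface comes from the earlier listing.

```vb
Option Strict On
Imports System

Public Interface ICoolInterface(Of T)
    Sub DoWork(ByVal data As T)
    Function GetAnswer() As T
End Interface

' Hypothetical implementation that stores an uppercase copy of
' the last value it was given.
Public Class StringWorker
    Implements ICoolInterface(Of String)

    Private mLast As String = String.Empty

    Public Sub DoWork(ByVal data As String) _
        Implements ICoolInterface(Of String).DoWork
        mLast = data.ToUpperInvariant()
    End Sub

    Public Function GetAnswer() As String _
        Implements ICoolInterface(Of String).GetAnswer
        Return mLast
    End Function
End Class

Public Module InterfaceDemo
    Public Sub Main()
        ' The variable is typed as the interface instance for String.
        Dim worker As ICoolInterface(Of String) = New StringWorker()
        worker.DoWork("hello")
        Console.WriteLine(worker.GetAnswer())   ' prints HELLO
    End Sub
End Module
```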
You have already seen examples of methods declared using type parameters such as T or V. While these are examples of generic methods, they have been contained within a broader generic type such as a class, a structure, or an interface.
It is also possible to create generic methods within otherwise normal classes, structures, interfaces, or modules. In this case, the type parameter is not specified on the class, structure, or interface, but rather directly on the method itself.
For instance, you can declare a generic method to compare equality like this:
Public Module Comparisons
    Public Function AreEqual(Of T)(ByVal a As T, ByVal b As T) As Boolean
        Return a.Equals(b)
    End Function
End Module
In this case, the AreEqual method is contained within a module, though it could just as easily be contained in a class or a structure. Notice that the method accepts two sets of parameters. The first set of parameters is the type parameter—in this example, just T. The second set of parameters consists of the normal parameters that a method would accept. In this example, the normal parameters have their types defined by the type parameter, T.
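A short sketch of calling code may help here. It shows the type parameter being supplied explicitly and, alternatively, being inferred by the compiler from the arguments. The ComparisonsDemo module is a hypothetical name used only for this illustration.

```vb
Option Strict On
Imports System

Public Module Comparisons
    Public Function AreEqual(Of T)(ByVal a As T, ByVal b As T) As Boolean
        Return a.Equals(b)
    End Function
End Module

Public Module ComparisonsDemo
    Public Sub Main()
        ' Type parameter supplied explicitly.
        Console.WriteLine(AreEqual(Of Integer)(1, 1))     ' prints True

        ' Type parameter inferred as String from the arguments.
        Console.WriteLine(AreEqual("one", "two"))         ' prints False
    End Sub
End Module
```

Note that this simple implementation calls Equals on the first argument, so passing Nothing for a reference type in that position would fail at runtime.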
As with generic classes, it is important to remember that the type parameter is treated as a System.Object type as you write the code in your generic method. This severely restricts what you can do with parameters or variables declared using the type parameters. Specifically, you can perform assignments and call the various methods common to all System.Object variables.
In a moment you will look at constraints, which enable you to restrict the types that can be assigned to the type parameters and expand the operations that can be performed on parameters and variables of those types.
As with generic types, a generic method can accept multiple type parameters:
Public Class Comparisons
    Public Function AreEqual(Of T, R)(ByVal a As Integer, ByVal b As T) As R
        ' implement code here
    End Function
End Class
In this example, the method is contained within a class, rather than a module. Notice that it accepts two type parameters, T and R. The return type is set to type R, whereas the second parameter is of type T. Also, look at the first parameter, which is a conventional type. This illustrates how you can mix conventional types and generic type parameters in the method parameter list and return types, and by extension within the body of the method code.
At this point, you have learned how to create and use generic types and methods, but there have been serious limitations on what you can do when creating generic type or method templates thus far. This is because the compiler treats any type parameters as the type System.Object within your template code. The result is that you can assign the values and call the various methods common to all System.Object instances, but you can do nothing else. In many cases, this is too restrictive to be useful.
Constraints offer a solution and at the same time provide a control mechanism. Constraints enable you to specify rules about the types that can be used at runtime to replace a type parameter. Using constraints, you can ensure that a type parameter is a Class or a Structure, or that it implements a certain interface or inherits from a certain base class.
Not only do constraints enable you to restrict the types available for use, but they also give the Visual Basic compiler valuable information. For example, if the compiler knows that a type parameter must always implement a given interface, then the compiler will allow you to call the methods on that interface within your template code.
The most common kind of constraint is a type constraint. A type constraint restricts a type parameter to be a subclass of a specific class or to implement a specific interface. This idea can be used to enhance the SingleLinkedList to sort items as they are added. Create a copy of the class called ComparableLinkedList, changing the declaration of the class itself to add the IComparable constraint:
Public Class ComparableLinkedList(Of ValueType As IComparable)
With this change, ValueType is not only guaranteed to be equivalent to System.Object, it is also guaranteed to have all the methods defined on the IComparable interface.
This means that within the Add method you can make use of any methods in the IComparable interface (as well as those from System.Object). The result is that you can safely call the CompareTo method defined on the IComparable interface, because the compiler knows that any variable of type ValueType will implement IComparable. Update the original Add method with the following implementation (code file: ComparableLinkedList.vb):
Public Sub Add(ByVal value As ValueType)
    If mHead Is Nothing Then
        ' List was empty, just store the value.
        mHead = New Node(value, mHead)
    Else
        Dim current As Node = mHead
        Dim previous As Node = Nothing
        While current IsNot Nothing
            If current.Value.CompareTo(value) > 0 Then
                If previous Is Nothing Then
                    ' this was the head of the list
                    mHead = New Node(value, mHead)
                Else
                    ' insert the node between previous and current
                    previous.NextNode = New Node(value, current)
                End If
                Exit Sub
            End If
            previous = current
            current = current.NextNode
        End While
        ' you're at the end of the list, so add to end
        previous.NextNode = New Node(value, Nothing)
    End If
End Sub
Note the call to the CompareTo method:
If current.Value.CompareTo(value) > 0 Then
This is possible because of the IComparable constraint on ValueType. Run the project and test this modified code. The items should be displayed in sorted order, as shown in .
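The same IComparable constraint can be placed on a generic method instead of a class. The following sketch is independent of the linked list; the Selection module and its Largest function are hypothetical names chosen for illustration.

```vb
Option Strict On
Imports System

Public Module Selection
    ' T must implement IComparable, so CompareTo is available on a.
    Public Function Largest(Of T As IComparable)(ByVal a As T, ByVal b As T) As T
        If a.CompareTo(b) >= 0 Then
            Return a
        End If
        Return b
    End Function
End Module

Public Module SelectionDemo
    Public Sub Main()
        Console.WriteLine(Largest(3, 7))              ' prints 7
        Console.WriteLine(Largest("pear", "apple"))   ' prints pear
    End Sub
End Module
```

Without the constraint, the call to CompareTo would not compile, because the compiler could assume only the members of System.Object.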
Not only can you constrain a type parameter to implement an interface, but you can also constrain it to be a specific type (class) or subclass of that type. For example, you could implement a generic method that works on any Windows Forms control:
Public Shared Sub ChangeControl(Of C As Control)(ByVal control As C)
    control.Anchor = AnchorStyles.Top Or AnchorStyles.Left
End Sub
The type parameter, C, is constrained to be of type Control. This restricts calling code to only specify this parameter as Control or a subclass of Control, such as TextBox.
Then the parameter to the method is specified to be of type C, which means that this method will work against any Control or subclass of Control. Because of the constraint, the compiler now knows that the variable will always be some type of Control object, so it allows you to use any methods, properties, or events exposed by the Control class as you write your code.
Finally, it is possible to constrain a type parameter to be of a specific generic type:
Public Class ListClass(Of T, V As Generic.List(Of T))
End Class
The preceding code specifies that the V type must be a List(Of T), whatever type T might be. A caller can use your class like this:
Dim list As ListClass(Of Integer, Generic.List(Of Integer))
Earlier in the chapter, in the discussion of how inheritance and generics interact, you saw that things can get quite complex. The same is true when you constrain type parameters based on generic types.
Another form of constraint enables you to be more general. Rather than enforce the requirement for a specific interface or class, you can specify that a type parameter must be either a reference type or a value type.
To specify that the type parameter must be a reference type, you use the Class constraint:
Public Class ReferenceOnly(Of T As Class)
End Class
This ensures that the type specified for T must be a reference type, that is, the type of an object. Any attempt to use a value type, such as Integer or a Structure, results in a compiler error.
Likewise, you can specify that the type parameter must be a value type such as Integer or a Structure by using the Structure constraint:
Public Class ValueOnly(Of T As Structure)
End Class
In this case, the type specified for T must be a value type. Any attempt to use a reference type such as String, an interface, or a class results in a compiler error.
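The effect of the two constraints can be sketched as follows. The HoldsValueType functions and the ConstraintDemo module are hypothetical additions that simply report what kind of type was supplied; the declarations that would be rejected are left as comments.

```vb
Option Strict On
Imports System

Public Class ReferenceOnly(Of T As Class)
    ' Hypothetical helper: reports whether T is a value type.
    Public Function HoldsValueType() As Boolean
        Return GetType(T).IsValueType
    End Function
End Class

Public Class ValueOnly(Of T As Structure)
    ' Hypothetical helper: reports whether T is a value type.
    Public Function HoldsValueType() As Boolean
        Return GetType(T).IsValueType
    End Function
End Class

Public Module ConstraintDemo
    Public Sub Main()
        Dim refs As New ReferenceOnly(Of String)()
        Dim vals As New ValueOnly(Of Integer)()
        Dim dates As New ValueOnly(Of Date)()

        Console.WriteLine(refs.HoldsValueType())    ' prints False
        Console.WriteLine(vals.HoldsValueType())    ' prints True
        Console.WriteLine(dates.HoldsValueType())   ' prints True

        ' Neither of these declarations compiles:
        ' Dim bad1 As New ReferenceOnly(Of Integer)()
        ' Dim bad2 As New ValueOnly(Of String)()
    End Sub
End Module
```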
Sometimes you want to write generic code that creates instances of the type specified by a type parameter. In order to know that you can actually create instances of a type, you need to know that the type has a default public constructor. You can require this using the New constraint:
Public Class Factories(Of T As New)
    Public Function CreateT() As T
        Return New T
    End Function
End Class
The type parameter, T, is constrained so that it must have a public default constructor. Any attempt to specify a type for T that does not have such a constructor will result in a compile error.
Because you know that T will have a default constructor, you are able to create instances of the type, as shown in the CreateT method.
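A brief sketch of calling code follows. StringBuilder is used only as a convenient example of a class with a public default constructor, and the FactoriesDemo module is a hypothetical name.

```vb
Option Strict On
Imports System
Imports System.Text

Public Class Factories(Of T As New)
    Public Function CreateT() As T
        Return New T
    End Function
End Class

Public Module FactoriesDemo
    Public Sub Main()
        ' StringBuilder has a public default constructor.
        Dim builderFactory As New Factories(Of StringBuilder)()
        Dim sb As StringBuilder = builderFactory.CreateT()
        sb.Append("created")
        Console.WriteLine(sb.ToString())

        ' Value types also satisfy the New constraint.
        Dim intFactory As New Factories(Of Integer)()
        Console.WriteLine(intFactory.CreateT())   ' prints 0

        ' String has no public default constructor, so this
        ' declaration does not compile:
        ' Dim bad As New Factories(Of String)()
    End Sub
End Module
```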
In many cases, you will need to specify multiple constraints on the same type parameter. For instance, you might want to require that a type be a reference type and have a public default constructor.
Essentially, you are providing an array of constraints, so you use the same syntax you use to initialize elements of an array:
Public Class Factories(Of T As {New, Class})
    Public Function CreateT() As T
        Return New T
    End Function
End Class
The constraint list can include two or more constraints, enabling you to specify a great deal of information about the types allowed for this type parameter.
Within your generic template code, the compiler is aware of all the constraints applied to your type parameters, so it allows you to use any methods, properties, and events specified by any of the constraints applied to the type.
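As an illustration of code that depends on two constraints at once, the following sketch combines IComparable with New. The Extremes module and its MaxOrNew function are hypothetical: the function returns the largest item in a sequence, or a newly created T when the sequence is empty. It assumes the sequence contains no Nothing entries.

```vb
Option Strict On
Imports System
Imports System.Collections.Generic

Public Module Extremes
    ' IComparable allows CompareTo; New allows creating a fallback value.
    Public Function MaxOrNew(Of T As {IComparable, New})(
            ByVal items As IEnumerable(Of T)) As T
        Dim best As T = New T()
        Dim first As Boolean = True
        For Each item As T In items
            If first OrElse item.CompareTo(best) > 0 Then
                best = item
                first = False
            End If
        Next
        Return best
    End Function
End Module

Public Module ExtremesDemo
    Public Sub Main()
        Dim values As New List(Of Integer) From {3, 9, 4}
        Console.WriteLine(MaxOrNew(Of Integer)(values))                    ' prints 9
        Console.WriteLine(MaxOrNew(Of Integer)(New List(Of Integer)()))    ' prints 0
    End Sub
End Module
```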
One of the primary limitations of generics is that variables and parameters declared based on a type parameter are treated as type System.Object inside your generic template code. While constraints offer a partial solution, expanding the type of those variables based on the constraints, you are still very restricted in what you can do with the variables.
One key example is the use of common operators. There is no constraint you can apply that tells the compiler that a type supports the + or – operators. This means that you cannot write generic code like this:
Public Function Add(Of T)(ByVal val1 As T, ByVal val2 As T) As T
    Return val1 + val2
End Function
This will generate a compiler error because there is no way for the compiler to verify that variables of type T (whatever that is at runtime) support the + operator. Because there is no constraint that you can apply to T to ensure that the + operator will be valid, there is no direct way to use operators on variables of a generic type.
One alternative is to use Visual Basic's native support for late binding to overcome the limitations shown here. Recall that late binding incurs substantial performance penalties, because a lot of work is done dynamically at runtime, rather than by the compiler when you build your project. It is also important to remember the risks that attend late binding—specifically, the code can fail at runtime in ways that early-bound code cannot. Nonetheless, given those caveats, late binding can be used to solve your immediate problem.
To enable late binding, add Option Strict Off at the top of the code file containing your generic template, or turn Option Strict off for the whole project in the project's properties. Then you can rewrite the Add function as follows:
Public Function Add(Of T)(ByVal value1 As T, ByVal value2 As T) As T
    Return CObj(value1) + CObj(value2)
End Function
By forcing the value1 and value2 variables to be explicitly treated as type Object, you are telling the compiler that it should use late-binding semantics. Combined with the Option Strict Off setting, the compiler assumes that you know what you are doing and it allows the use of the + operator even though its validity can't be confirmed.
The compiled code uses dynamic late binding to invoke the + operator at runtime. If that operator does turn out to be valid for whatever type T is at runtime, then this code will work great. In contrast, if the operator is not valid, then a runtime exception will be thrown.
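Both outcomes can be seen in a small sketch. The LateBoundMath and LateBoundDemo modules are hypothetical names, and Guid is used only as an example of a type that does not define the + operator.

```vb
Option Strict Off
Imports System

Public Module LateBoundMath
    Public Function Add(Of T)(ByVal value1 As T, ByVal value2 As T) As T
        ' Late-bound: the + operator is resolved at runtime.
        Return CObj(value1) + CObj(value2)
    End Function
End Module

Public Module LateBoundDemo
    Public Sub Main()
        ' Integer supports +, so this call succeeds.
        Console.WriteLine(Add(Of Integer)(1, 2))   ' prints 3

        ' Guid does not support +, so this call fails at runtime,
        ' even though the code compiled without complaint.
        Try
            Add(Of Guid)(Guid.Empty, Guid.Empty)
        Catch ex As Exception
            Console.WriteLine("Add failed: " & ex.GetType().Name)
        End Try
    End Sub
End Module
```

The compiler raises no objection to the Guid call; the failure surfaces only when the line executes, which is the risk described above.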
As part of Visual Studio 2010, the concepts of covariance and contravariance were brought forward into generics. The basic ideas are related to polymorphism. In short, prior to Visual Studio 2010, if you attempted to take, for example, an instance of a generic that inherits from the base class BindingList and assign that instance to an instance of its base class, you would get an error. Covariance describes the ability to take a specialized class, or subclass, and make a polymorphic assignment to its parent or base class.
This topic can get complex, so before moving on to discuss contravariance, let's provide a very simple example of covariance in code. The following declares two classes, Parent and ChildClass, and shows covariance in action (code file: CoVariance.vb):
Public Class Parent(Of T)
End Class

Public Class ChildClass(Of T)
    Inherits Parent(Of T)
End Class

Public Class CoVariance
    Public Sub MainMethod()
        Dim cc As New ChildClass(Of String)
        Dim dad As Parent(Of String)
        'Show me the covariance
        dad = cc
    End Sub
End Class
You'll note that ChildClass inherits from Parent. The snippet continues with a method extracted from a calling application. It's called MainMethod and you see that the code creates an instance of ChildClass and declares an instance of Parent. Next it looks to assign the instance cc of ChildClass to the instance dad of type Parent. It is this assignment which illustrates an example of covariance. There are, of course, dozens of different specializations that you could consider, but this provides the basis for all of those examples.
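Note that if, instead of declaring dad as a Parent(Of String), the code had declared dad as a Parent(Of Integer), then the assignment of cc to dad would fail because dad would no longer be the correct Parent type. It is important to remember that the type assigned as part of the instantiation of a generic directly impacts the underlying class type of that generic's instance. Strictly speaking, the assignment in this example relies on ordinary inheritance between two classes constructed with the same type argument. The variance support added in Visual Studio 2010 applies to generic interfaces and delegates whose type parameters are marked Out (covariant) or In (contravariant); for example, an IEnumerable(Of String) can be assigned to a variable of type IEnumerable(Of Object).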
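Covariance across different type arguments can be sketched with IEnumerable(Of T), whose type parameter is declared with the Out modifier in .NET Framework 4 and later. The CovarianceDemo module and its CountItems function are hypothetical names used for illustration.

```vb
Option Strict On
Imports System
Imports System.Collections.Generic

Public Module CovarianceDemo
    ' Accepts any sequence of Object.
    Public Function CountItems(ByVal items As IEnumerable(Of Object)) As Integer
        Dim count As Integer = 0
        For Each item As Object In items
            count += 1
        Next
        Return count
    End Function

    Public Sub Main()
        Dim names As New List(Of String) From {"Ann", "Bob"}

        ' IEnumerable(Of Out T) is covariant, so a sequence of String
        ' can be used where a sequence of Object is expected.
        Dim objects As IEnumerable(Of Object) = names
        Console.WriteLine(CountItems(objects))   ' prints 2

        ' List(Of T) is a class and is not variant, so this
        ' assignment does not compile:
        ' Dim bad As List(Of Object) = names
    End Sub
End Module
```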
Contravariance refers to the ability to pass a derived type when a base type is called for. The reason these features are spoken of in a single topic is that they are both specializations of the variance concept. The difference is mainly an understanding that in the case of contravariance you are passing an instance of ChildClass when a Parent instance was expected. Unfortunately, contravariance could be called contraintuitive. You are going to create a base method, and .NET will support its use with derived classes. To illustrate this concept, the following code creates two new classes (they are not generic classes), and then uses these classes with generic delegates to illustrate contravariance (code file: ContraVariance.vb):
Public Class Base
End Class

Public Class Derived
    Inherits Base
End Class

Public Class ContraVariance
    Private baseMethod As Action(Of Base) =
        Sub(param As Base)
            'Do something.
        End Sub
    Private derivedMethod As Action(Of Derived) = baseMethod

    Public Sub MainMethod()
        ' Show the contra-syntax
        Dim d As Derived = New Derived()
        derivedMethod(d)
        baseMethod(d)
    End Sub
End Class
As shown in the preceding example, baseMethod expects an input parameter of type Base. With contravariance, that delegate can be assigned to derivedMethod, a variable of type Action(Of Derived), and then called with a parameter of type Derived. This is safe because the derived class will, by definition, support the same interface as the base class, just with additional capabilities that can be ignored. As a result, although at first glance it feels backward, you are in fact able to supply a generic that is defined using a base class where a generic that is defined using a derived class is expected.
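Contravariance also applies to generic interfaces whose type parameter is declared with the In modifier, such as IComparer(Of T) in .NET Framework 4 and later. In the following sketch, the Id property, the BaseIdComparer class, and the ContravarianceDemo module are hypothetical additions; Base and Derived are given an Id only so that there is something to compare.

```vb
Option Strict On
Imports System
Imports System.Collections.Generic

Public Class Base
    Public Property Id As Integer   ' hypothetical property for the demo
End Class

Public Class Derived
    Inherits Base
End Class

' Compares any two Base instances by Id.
Public Class BaseIdComparer
    Implements IComparer(Of Base)

    Public Function Compare(ByVal x As Base, ByVal y As Base) As Integer _
        Implements IComparer(Of Base).Compare
        Return x.Id.CompareTo(y.Id)
    End Function
End Class

Public Module ContravarianceDemo
    Public Sub Main()
        Dim items As New List(Of Derived) From {
            New Derived With {.Id = 3},
            New Derived With {.Id = 1}
        }

        ' IComparer(Of In T) is contravariant, so a comparer written
        ' for Base can be used where a comparer for Derived is expected.
        Dim comparer As IComparer(Of Derived) = New BaseIdComparer()
        items.Sort(comparer)

        Console.WriteLine(items(0).Id)   ' prints 1
    End Sub
End Module
```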
This chapter took a look at the classes and language elements that target sets. You started with a look at arrays and the support for arrays within Visual Basic. The chapter then looked at collection classes. By default, these classes operate on the type Object, and it is this capability to handle any or all objects within their implementation that makes these classes both powerful and limited.
Following a quick review of the iterative language structures normally associated with these classes, the chapter moved on to looking at generics. Generics enable you to create class, structure, interface, and method templates. These templates gain specific types based on how they are declared or called at runtime. Generics provide you with another code reuse mechanism, along with procedural and object-oriented concepts.
Generics also enable you to change code that uses parameters or variables of type Object (or other general types) to use specific data types. This often leads to much better performance and increases the readability of your code.
Next you'll move into working with XML from Visual Basic. XML processing and generation are among Visual Basic's strengths. As you'll see, while you may have an XML file with only a single entry, by its nature XML lends itself to creating collections of objects.