Visual Basic/Optimizing Visual Basic

From Wikibooks, open books for an open world
Jump to: navigation, search

Optimization is the art and science of making your program faster, smaller, simpler, less resource hungry, etc. Of course faster often conflicts with simpler and smaller so optimization is a balancing act.

This chapter is intended to contain a toolkit of techniques that you can use to speed up certain kinds of program. Each technique will be accompanied by code that demonstrates the improvement. Please note that the code is deliberately sub-optimal for two reasons: optimized code is almost always harder to understand and the purpose is that the reader should exercise the techniques in order to obtain greater understanding. You should always be able to rewrite the code presented to obtain greater improvements but this will sometimes be accompanied by a reduction in readability. In code that is speed critical choose speed over clarity, in code that is not speed critical you should always choose the most maintainable style; remember that another programmer coming to your code years later relies on you to make the code say what it means. When you optimize the code you will often need to be extra careful in adding comments so as to ensure that someone maintaining the code doesn't throw away your hard earned speed improvement because he doesn't understand why you use what seems like an overly complicated method.

If you change code to gain speed you should write tests that demonstrate that the new code really is faster than the old. Often this can be as simple as recording the time before and after a procedure and taking the difference. Do the test before you change the code and after and see if the improvement was worthwhile.

Remember that the code has to work first. Make it work, then make it work faster.

Code that slowly gets the right answer is almost invariably preferable to code that quickly gets the wrong answer. This is especially true if the answer is a small part of a larger answer so that errors are hard to spot.

Ensure that things work by writing tests, validation functions and assertions. When you change the code to make it faster the results of the tests, validation functions and assertions should not have changed.

Integers and Longs[edit]

In VB6 it is generally the case that a program in which all whole number variables are declared as Long will run faster than than the same program using Integer.

Use:

 Dim I as Long

not:

 Dim I as Integer

However, it is important to realize that the final arbiter of speed is measurement of performance under real life conditions; see the tests section below.

The only reason to use Integer is when passing arguments by reference to functions defined in components that you cannot alter. For instance most of the events raised by the VB6 GUI use Integer arguments instead of Long. This is also the case with some third party DLLs. Most Windows API functions require Long arguments.

Tests[edit]

Here is a simple test that illustrates that things are not always as they seem in the world of optimization. The test is a simple module that executes two subroutines. The two routines are identical except that in one all the whole number variables are declared as Long and in the other they are all declared as Integer. The routines simply increment a number repeatedly. When executed inside the VB6 IDE on my computer the results are (three consecutive runs in seconds):

 Long         11.607
 Integer       7.220
 Long         11.126
 Integer       7.211
 Long         11.006
 Integer       7.221


which appears to contradict my assertion that Longs are faster than Integers. However when compiled and executed outside the IDE the results are:

 Long         0.711
 Integer      0.721
 Long         0.721
 Integer      0.711
 Long         0.721
 Integer      0.721

Notice that when running compiled the times for Long and Integer are indistinguishable. This illustrates several important points about optimization and benchmarking.

Timing jitter 
timing an algorithm on a pre-emptive multitasking operating system is always an exercise in statistics because you cannot (not in VB6 at least) force the OS to not interrupt your code which means that timing using the simple technique of these tests includes time spent in other programs or at least in the System Idle loop.
Know what you are measuring 
the times measured in the IDE are at least ten times longer than those measured when the code is compiled to native code (optimize for fast code, integer bounds checking on in this case). The relative difference in timing for Long and Integer differs between compiled and interpreted code.

For more benchmarks see Donald Lessau's excellent site: VBSpeed.

Here is the code:

 Option Explicit
 
 Private Const ml_INNER_LOOPS As Long = 32000
 Private Const ml_OUTER_LOOPS As Long = 10000
 
 Private Const mi_INNER_LOOPS As Integer = 32000
 Private Const mi_OUTER_LOOPS As Integer = 10000
 
 
 Public Sub Main()
   
   Dim nTimeLong As Double
   Dim nTimeInteger As Double
   
   xTestLong nTimeLong
   xTestInteger nTimeInteger
   
   Debug.Print "Long", Format$(nTimeLong, "0.000")
   Debug.Print "Integer", Format$(nTimeInteger, "0.000")
   
   MsgBox "   Long: " & Format$(nTimeLong, "0.000") & vbCrLf _
          & "Integer: " & Format$(nTimeInteger, "0.000")
   
 End Sub
 
 
 Private Sub xTestInteger(ByRef rnTime As Double)
   
   Dim nStart As Double
   
   Dim iInner As Integer
   Dim iOuter As Integer
   Dim iNum As Integer
   
   nStart = Timer
   For iOuter = 1 To mi_OUTER_LOOPS
     iNum = 0
     For iInner = 1 To mi_INNER_LOOPS
       iNum = iNum + 1
     Next iInner
   Next iOuter
   
   rnTime = Timer - nStart
   
 End Sub
 
 
 Private Sub xTestLong(ByRef rnTime As Double)
   
   Dim nStart As Double
   
   Dim lInner As Long
   Dim lOuter As Long
   Dim lNum As Long
   
   nStart = Timer
   For lOuter = 1 To ml_OUTER_LOOPS
     lNum = 0
     For lInner = 1 To ml_INNER_LOOPS
       lNum = lNum + 1
     Next lInner
   Next lOuter
   
   rnTime = Timer - nStart
   
 End Sub

Strings[edit]

If you have a lot of code that does a lot of string concatenation you should consider using a string builder. A string builder is usually provided as a class but the principle is quite simple and doesn't need any object oriented wrapping.

The problem that a string builder solves is the time used in repeatedly allocating and deallocating memory. The problem arises because VB strings are implemented as pointers to memory locations and every time you concatenate two strings what actually happens is that you allocate a new block of memory for the resulting string. This happens even if the new string replaces the old as in the following statement:

 s = s & "Test"

The act of allocating and deallocating memory is quite expensive in terms of CPU cycles. A string builder works by maintaining a buffer that is longer than the actual string so that the text that is to be added to the end can simply be copied to the memory location. Of course the buffer must be long enough to accommodate the resulting string so we calculate the length of the result first and check to see if the buffer is long enough; if it is not we allocate a new longer string.

The code for a string builder can be very simple:

 Private Sub xAddToStringBuf(ByRef rsBuf As String, _
                             ByRef rlLength As Long, _
                             ByRef rsAdditional As String)
   
   If (rlLength + Len(rsAdditional)) > Len(rsBuf) Then
     rsBuf = rsBuf & Space$(rlLength + Len(rsAdditional))
   End If
   Mid$(rsBuf, rlLength + 1, Len(rsAdditional)) = rsAdditional
   rlLength = rlLength + Len(rsAdditional)
   
 End Sub

This subroutine takes the place of the string concatenation operator (&). Notice that there is an extra argument, rlLength. This is necessary because the length of the buffer is not the same as the length of the string.

We call it like this:

 dim lLen as long
 xAddToString s, lLen, "Test"

lLen is the length of the string. If you look at the table of execution times below you can see that for values of Count up to about 100 the simple method is faster but hat above that the time for simple concatenation increases roughly exponentially whereas that for the string builder increases roughly linearly (tested in the IDE, 1.8GHz CPU, 1GB ram). The actual times will be different on your machine but the general trend should be the same.

It is important to look critically at measurements like this and try to see how, or if, they apply to the application that you are writing. If you build long text strings in memory the string builder is a useful tool but if you only concatenate a few strings the native operator will be faster and simpler.

Tests[edit]

Concatenating the word Test repeatedly gives these results

Times in seconds

Count Simple Builder
10 0.000005 0.000009
100 0.000037 0.000043
1000 0.001840 0.000351
5000 0.045 0.002
10000 0.179 0.004
20000 0.708 0.008
30000 1.583 0.011
40000 2.862 0.016
50000 4.395 0.019
60000 6.321 0.023
70000 13.641 0.033
80000 27.687 0.035

|| Programming Visual Basic Optimization strings timing.png

ByRef versus ByVal[edit]

Many books on VB tell programmers to always use ByVal. They justify this by saying that it is both faster and safer.

You can see that ByVal is slightly faster by the results of the Tests:

Empty functions:

 xTestByRef     13.4190000000017 
 xTestByVal     13.137999999999 

With a simple arithmetic expression in the function:

 xTestAddByRef  15.7870000000039 
 xTestAddByVal  15.3669999999984 

You can also see for that the difference is slight. Be careful when interpreting coding benchmarks, even the simplest can be misleading and it is always wise to profile the real code if you think you have a bottle-neck.

The other claim is that ByVal is safer. This is usually explained by saying that a function in which all arguments are declared ByVal cannot alter the values of the variables in the caller because VB makes copies instead of using pointers. The programmer is then free to make use of the incoming arguments in any way he or she sees fit, in particular they can be assigned new values without disturbing the original values in the caller.

The safety provided by this is generally outweighed by the fact that the compiler is then unable to enforce type checking. In Visual Basic Classic variables are automatically converted between string and numeric types when necessary and possible. If arguments are declared ByVal then the caller can provide a String where a Long was expected. The compiler will silently compile the necessary instructions to convert the String to a Long and carry on. At run time you might, or might not, discover the problem. If the string contained a number then it will be converted and the code will 'work', if it doesn't then the code will fail at the point where the function is called.

 Function IntByVal(ByVal z as Double) as Long
   IntByVal = Int(z)
 End Function
 
 Function IntByRef(ByRef z as Double) as Long
   IntByRef = Int(z)
 End Function
 
 Dim s As String
 's = "0-471-30460-3"
 s = "0"
 Debug.Print IntByVal(s)
 Debug.Print IntByRef(s)
 

If you try to compile the little snippet of code above you will see that the compiler stops on the line:

 Debug.Print IntByRef(s)

it hightlights the s and shows a message box saying 'ByRef argument type mismatch'. If you now comment out that line and run again you will not get an error. Now uncomment this line:

 s = "0-471-30460-3"

and comment out this one:

 's = "0"

Run the program again. Now the program fails but only at run time with 'Runtime error 13: Type mismatch'.

The moral is:

  • use ByRef to cause the compiler to typecheck the arguments in function calls,
  • never assign to the arguments of a function unless they are output parameters.

Look at the Procedure Parameters section in the Coding Standards chapter for some suggestions about how to name the arguments so that it is always clear which are in and which are out parameters.

Tests[edit]

With Empty Functions[edit]

 Option Explicit
 
 Private Const mlLOOPS As Long = 100000000
 Private mnStart As Double
 Private mnFinish As Double
 
 Public Sub main()
   xTestByRef 1#, 2#
   Debug.Print "xTestByRef", mnFinish - mnStart
   xTestByVal 1#, 2#
   Debug.Print "xTestByVal", mnFinish - mnStart  
 End Sub
   
 Private Sub xTestByRef(ByRef a As Double, ByRef b As Double)
   Dim lLoop As Long
   Dim n As Double
   mnStart = Timer
   For lLoop = 1 To mlLOOPS
     n = xByRef(a, b)
   Next lLoop    
   mnFinish = Timer
 End Sub
 
 Private Sub xTestByVal(ByVal a As Double, ByVal b As Double)  
   Dim lLoop As Long
   Dim n As Double
   mnStart = Timer
   For lLoop = 1 To mlLOOPS
     n = xByVal(a, b)
   Next lLoop
   mnFinish = Timer
 End Sub
 
 Private Function xByRef(ByRef a As Double, ByRef b As Double) As Double
 End Function
 
 Private Function xByVal(ByVal a As Double, ByVal b As Double) As Double
 End Function

With Simple Arithmetic[edit]

 Attribute VB_Name = "modMain"
 Option Explicit
   
 Private Const mlLOOPS As Long = 100000000
 Private mnStart As Double
 Private mnFinish As Double
 
 Public Sub main()  
   xTestAddByRef 1#, 2#
   Debug.Print "xTestAddByRef", mnFinish - mnStart
   xTestAddByVal 1#, 2#
   Debug.Print "xTestAddByVal", mnFinish - mnStart
 End Sub
 
 Private Sub xTestAddByRef(ByRef a As Double, ByRef b As Double)
   Dim lLoop As Long
   Dim n As Double
   mnStart = Timer
   For lLoop = 1 To mlLOOPS
     n = xAddByRef(a, b)
   Next lLoop
   mnFinish = Timer
 End Sub
 
 Private Sub xTestAddByVal(ByVal a As Double, ByVal b As Double)  
   Dim lLoop As Long
   Dim n As Double
   mnStart = Timer
   For lLoop = 1 To mlLOOPS
     n = xAddByVal(a, b)
   Next lLoop
   mnFinish = Timer
 End Sub
 
 Private Function xAddByRef(ByRef a As Double, ByRef b As Double) As Double
   xAddByRef = a + b
 End Function
   
 Private Function xAddByVal(ByVal a As Double, ByVal b As Double) As Double
   xAddByVal = a + b
 End Function

Collections[edit]

Collections are very useful objects. They allow you to write simpler code than would otherwise be the case. For instance if you need to hold on to a list of numbers given to you by the user you probably won't know in advance how many will be supplied. This makes it difficult to use an array because you must either allocate an unnecessarily large array so that you are sure that no reasonable user would supply more or you must continually Redim the array as new numbers come in.

A Collection lets you avoid all that because it expands as necessary. However, this convenience comes at the cost of increased runtime under some circumstances. For many, perhaps most, programs this price is perfectly acceptable but some programs take so long to run that you must try to squeeze the last drop of performance out of every line.

This frequently occurs in programs doing scientific computations (don't tell me C would be a better language for such things because the same constraints and optimizations apply to C and all other languages as well).

One of the reasons a Collection is a convenient tool is because you can use a string as a key and retrieve items by key instead of by index. But every time you ask for an item in a collection the collection must first figure out whether you have supplied an Integer (or Byte or Long) or a String which of course takes a small but finite amount of time. If your application makes no use of the ability to look up items by String key you will get faster programs by using arrays instead because VB doesn't need to waste time checking to see if you supplied a String key instead of a whole number index.

If you want a Collection to simply hold a list of items without keys then you can emulate this behaviour with an array as follows (note that this code is not optimal, optimizing it is left as an exercise for the student):

 Public Sub Initialize(ByRef a() As Variant, _
                       ByRef Count As Long, _
                       ByRef Allocated As Long)
   Allocated = 10
   Redim a(1 to Allocated) 
   Count = 0
 End Sub
   
 Public Sub Add(ByRef NewItem as Variant, _
                ByRef a() As Variant, _
                ByRef Count As Long, _
                ByRef Allocated As Long)
   Count = Count + 1    
   If Allocated < Count then
     Allocated = Count
     Redim Preserve a(1 to Allocated)    
   End If
   a(Count) = NewValue
 End Sub

To use the above code you must declare an array of Variants and two Longs. Call the Initialize sub to start it all off. Now you can call the Add sub as often as you like and the array will be extended as necessary.

There are a number of things to note about this seemingly simply subroutine:

  • Once the array size exceeds 10 a new array is allocated each time an item is added (replacing the original),
  • The size of the array (recorded in the Allocated variable) is always the same as the Count when Count exceeds 10,
  • No provision is made to delete items,
  • the items are stored in the same order as they are added.
  • all arguments are declared Byref

It is unlikely that any programmer would want to include exactly this routine in production code. Some reasons why are:

  • NewItem is declared as Variant but the programmer usually knows the type,
  • The initial allocation is hard coded using a literal integer,
  • The name of the subroutine is too simple, it will probably clash with others in the same namespace.
  • No provision is made for removing items.
  • Three separate pieces of information must be kept together but they are not bound together.

The performance of the emulated Collection depends on the relative frequency of calls to Add, Item and Remove (see #Exercises). An optimized version must be optimized for the use to which it will be put. If no lookups by key are performed there is no need to provide a function for it and there is especially no need to provide data structures to make it efficient.

Exercises[edit]

  • Extend the code presented in this section to emulate the Collection class in more detail. Implement Item and Remove methods, see the VB help files for details of the exact method declarations but do not feel obliged to replicate them exactly.
  • Write a test program to exercise all the features and time them so that you can tell when to use a Collection and when to use your emulated version.
  • You should be able to think of at least two distinct implementations of Remove. Think about the consequences of different implementations.
  • Write an explicit description of the requirements that your version of the Collection class satisfies.
  • Compare the behaviour of the built in Collection class and the new class, note any differences and give examples of uses where the differences do and do not matter.
  • If you haven't already done so implement lookup by key.
  • Explain why emulating all of the features of the Collection class using exactly the same interface is unlikely to yield significant improvements in performance for at least one specific use case.

Dictionaries[edit]

In VB a dictionary is actually supplied by the Scripting Runtime component. It is rather like a Collection but also provides a means to retrieve the keys that were used. Also, unlike the Collection, the keys can be any datatype including objects. The principal differences between Collections and Dictionaries are as follows:

  • Dictionaries have Keys and Items properties which return variant arrays
  • Keys can be any datatype not just strings
  • Enumeration using For Each returns the Keys not the Items
  • There is a built in Exists method.

As with Collections the convenience of Dictonaries is sometimes outweighed by their run time expense. One reason Dictionaries are frequently used instead of Collections is simply because of the Exists method. This allows you to avoid overwriting existing data or to avoid attempts to access non-existent data. However if the number of data items is small then a simple linear search can be faster. A small number in this case might be as many as 50 items. You can write a simple class that can be used instead of a Dictionary as shown below; note that no attempt is made to replicate all the behaviour of the dictionary and that this is done intentionally.

 'cDict.cls
 Option Explicit
 
 Private maItems() As String
 Private mlAllocated As Long
 Private mlCount As Long
 
 Public Sub Add(Key As String, Item As String)
   
   mlCount = mlCount + 1
   If mlAllocated < mlCount Then
     mlAllocated = mlAllocated + mlCount
     ReDim Preserve maItems(1 To 2, 1 To mlAllocated)
   End If
   maItems(1, mlCount) = Key
   maItems(2, mlCount) = Item
   
 End Sub
 
 Public Property Get Count() As Long
   Count = mlCount
 End Property
 
 Public Function Exists(Key As String) As Boolean
   
   Exists = IndexOf(Key)
   
 End Function
 
 Public Function IndexOf(Key As String) As Long
   
   For IndexOf = 1 To mlCount
     If maItems(1, IndexOf) = Key Then
       Exit Function
     End If
   Next IndexOf
   IndexOf = 0
   
 End Function
 
 Public Property Let Item(Key As String, RHS As String)
   
   Dim lX As Long
   lX = IndexOf(Key)
   If lX Then
     maItems(lX, 2) = RHS
   Else
     Add Key, RHS
   End If
   
 End Property
 
 Public Property Get Item(Key As String) As String
   Item = maItems(IndexOf(Key), 2)
 End Property
 
 Public Sub Remove(Key As String)
   
   Dim lX As Long
   lX = IndexOf(Key)
   maItems(1, lX) = maItems(1, mlCount)
   maItems(2, lX) = maItems(2, mlCount)
   mlCount = mlCount - 1
 
 End Sub
 
 Public Sub RemoveAll()
   mlCount = 0
 End Sub
 
 Public Sub Class_Initialize()
   mlAllocated = 10
   ReDim maItems(1 To 2, 1 To mlAllocated)
 End Sub
 

A simple test routine can be used to demonstrate that this class is faster than the VB dictionary for certain tasks. For instance adding 32 items to the dictionary and then removing them again is faster using cDict but if you double the number of items the VB Dictionary is better. The moral is: choose the correct algorithm for the task at hand.

Here is the test routine:

 Option Explicit
 
 Public gsModuleName As String
 
 
 Private mnStart As Double
 Private mnFinish As Double
 
 Private Const mlCount As Long = 10000
 
 
 Public Sub main()
   
   Dim litems As Long
   litems = 1
   Do While litems < 100
     litems = litems * 2
     Debug.Print "items=", litems
     Dim lX As Long
     mnStart = Timer
     For lX = 1 To mlCount
       xTestDictionary litems
     Next lX
     mnFinish = Timer
     Debug.Print "xTestDictionary", "Time: "; Format$(mnFinish - mnStart, "0.000")
     
     mnStart = Timer
     For lX = 1 To mlCount
       xTestcDict litems
     Next lX
     mnFinish = Timer
     Debug.Print "xTestcDict     ", "Time: "; Format$(mnFinish - mnStart, "0.000")
   Loop
   
   
 End Sub
 
 
 
 Private Sub xTestDictionary(ByRef rlItems As Long)
   
   Dim d As Dictionary
   Set d = New Dictionary
   
   Dim lX As Long
   Dim c As Double
   For lX = 1 To rlItems
     d.Add Str$(lX), Str$(lX)
   Next lX
   For lX = 1 To rlItems
     d.Remove Str$(lX)
   Next lX
   
 End Sub
 
 
 Private Sub xTestcDict(ByRef rlItems As Long)
   
   Dim d As cDict
   Set d = New cDict
   
   Dim lX As Long
   Dim c As Double
   For lX = 1 To rlItems
     d.Add Str$(lX), Str$(lX)
   Next lX
   For lX = 1 To rlItems
     d.Remove Str$(lX)
   Next lX
   
 End Sub
 
 

And here are the results from my PC in the IDE (seconds):

 items=         2 
 xTestDictionary             Time: 1.602
 xTestcDict                  Time: 0.120
 items=         4 
 xTestDictionary             Time: 1.663
 xTestcDict                  Time: 0.200
 items=         8 
 xTestDictionary             Time: 1.792
 xTestcDict                  Time: 0.361
 items=         16 
 xTestDictionary             Time: 2.023
 xTestcDict                  Time: 0.741
 items=         32 
 xTestDictionary             Time: 2.504
 xTestcDict                  Time: 1.632
 items=         64 
 xTestDictionary             Time: 3.455
 xTestcDict                  Time: 4.046
 items=         128 
 xTestDictionary             Time: 5.387
 xTestcDict                  Time: 11.437

Exercises[edit]

  • Write a version of the cDict class that allows storage of other data types, using Variant arguments for instance,
  • Check the performance using a similar test routine.
  • Write a new test routine that performs other tests. Does the relative performance of the new classes and VB Dictionary change?
  • What happens if you try to retrieve or remove an item that does not exist? How does the behaviour compare with that of the Dictionary?

The Debug Object[edit]

The Debug object has two methods:

Print
prints its arguments to the immediate window,
Assert
pauses the program if its argument is false.

Both of these only have an effect when running in the IDE; at least that is the conventional wisdom. Unfortunately it is not quite true of Debug.Print. This method doesn't print anything when the program is running as a compiled executable but if the arguments are function calls they are still evaluated. If the function call is very time consuming you will find that the compiled version doesn't run as fast as expected.

There two things that can be done:

  • Remove all Debug.Print statements before shipping the product,
  • Only use variables or constants as arguments to Debug.Print

If your program is both very CPU intensive and under continual development the second might be preferable so that you don't have to keep adding and removing lines.

Debug.Assert doesn't suffer from this problem so it is perfectly safe to assert the truth of complicated and time consuming functions. The assertion arguments will only be evaluated when running in the IDE.

Exercises[edit]

  • Write a short program to demonstrate that Debug.Print evaluates functions even when compiled,
  • Modify it to show that Debug.Assert does not suffer from this problem,
  • Show that Debug.Print has zero execution time in the compiled version if the arguments are constants or variables.

Object Instantiation[edit]

Long words for simple concepts. This section deals with the runtime cost of creating objects. The jargon phrase for this is object instantiation which means creating an instance of a class. An instance bears the same relation to a class as a machine bears to plans for it. For any given class there can be as many instances as you like.

If an object takes a long time to construct and you create and destroy a lot of them during the running of the program you might save some time by not destroying them but putting them on a list of ready made objects for later use.

Imagine a program that simulates an animal ecology. There could be, say, two classes of animals: herbivores and carnivores.

If you want to see any results in your lifetime the simulation must obviously run faster than real life speed so a lot of herbivores in particular will be born, breed and be eaten. If each of these is represented by an object that is destroyed when the animal it represents is killed the system will be repeatedly allocating and deallocating memory which is a relatively expensive business in VB6. In such a program you know that new objects of the same kind are continually needed so you can avoid some of the memory allocation overhead by reusing dead objects instead of destoying them and creating new ones. There are many ways of doing this. Which one you choose depends on how many different classes of object you have. If there are very few then you can create separate code for each, this can then be very highly tuned. On the other hand if you have dozens of different classes (perhaps you have extended the simulation to include competing herbivores) you will quickly find that you have a maintenance problem.

The solution is to create a class that describes a general purpose object pool and an Interface that can be implemented by each of the classes.

Before I describe the class and the interface here is a summary of the requirements:

  • Only one pool class needed,
  • The classes being 'pooled' should need small modifications to only the initialization and terminaion code to fit the pooling concept,
  • The code that uses the pooled objects should not have to change except for calling the pool's GetObject function instead of using the New operator.

The technique hinges on the fact that VB6 has deterministic finalization. Another piece of jargon that simply means that VB6 destroys (finalizes) objects as soon as they become unused. Every time VB6 decides that an object is no longer in use it calls the Class_Terminate method of that object. What we can do is add a simple piece of code to each class that puts the terminating object on a list of available objects. VB will see that the object is now in use and not deallocate the memory that it used. Later on, instead of using the New operator to create new object we ask the object pool to give us one that had fallen out of use. Even if the object takes very little time to set up this will be quicker than using New because it avoid both the allocation and deallocation of memory.

Here is an example object pool class and a class that can use it:

 'ObjectPool.cls
 Private moPool as Collection
 Public oTemplate As Object
 
 Private sub Class_Initialize()
   set moPool = New Collection
 End Sub
 
 Public Function GetObject() as Object
   if moPool.Count then
     Set GetObject = moPool(1)
     moPool.Remove(1)
   Else
     Set GetObject = oTemplate.NewObject
   End If
 End Function
 
 Public Sub ReturnToPool(Byref oObject as Object)
   moPool.Add oObject
 End Sub

To use this class declare a public variable in a bas module and assign a new ObjectPool object to it:

 'modPool.bas
 Public goPool As ObjectPool
 
 Public Sub Initialize()
   Set goPool = New ObjectPool
   Set goPool.oTemplate = New Herbivore
 End Sub

Now modify your Herbivore class by adding a call to ReturnToPool to the Class_Terminate method and adding a NewObject method:

 Private Sub Class_Terminate()
   goPool.ReturnToPool Me
 End Sub
 
 Public Function NewObject() as Object
   Set NewObject = New Herbivore
 End Function

For some simple scenarios this might even work. However, there are several flaws, and at least one is a major problem. The problem is that the object you get doesn't necessarily look like a shiny new one. Now for some applications this doesn't matter because the client code will reinitialise everything anyway but many programs rely on the feature of the language that automatically sets newly allocated variables to zero. In order to satisfy the requirement for minimum changes in the client we should extend both ObjectPool and the objects that are pooled by adding a ReInitialize method to the Herbivore:

 Public Sub ReInitialize()
   ' Do whatever you need to do to make the object 
   ' look like new (reset some attributes to zero, 
   ' empty strings, etc).
 End Sub

Don't do any more work in ReInitialize than absolutely necessary. For instance if one of the attributes of the object is a dynamically allocated array it might not be necessary to clear it; it might be enough to set a flag or counter to indicate that no array elements are actually in use.

Now modify the GetObject method of ObjectPool:

 Public Function GetObject() as Object
   if moPool.Count then
     Set GetObject = moPool(1)
     GetObject.ReInitialize
     moPool.Remove(1)
   Else
     Set GetObject = oTemplate.NewObject
   End If
 End Function

Now everywhere that you use New Herbivore use goPool.GetObject instead. If the Herbivore object has references to other objects you might, or might not, want to release those references by setting them to Nothing in the Class_Terminate method. It depends on the behaviour of the objects and the rest of your program, generally you should put off doing expensive operations for as long as possible.

The use of an object pool can improve the performance of certain types of program without requiring the programmer to radically change the program design. However don't forget that you might be able to get similar or greater improvements by using a better algorithm; again timing tests are the key to knowledge in this area. Don't assume that you know where the bottleneck is, demonstrate where it is by profiling the program.

Exercises[edit]

  • Write a simple program that creates and destroys lots of objects and time it. Now modify it to use ObjectPool.
  • Define an interface for pooled objects and implement it. Does eliminating the use of As Object change the performance?
  • Notice that the moPool collection is used simply as a FIFO stack, that is no use is made of the ability to find items by key. Is there an alternative that would be faster?
  • Is the FIFO behaviour of the stack important, that is, is it a deliberate feature or simply irrelevant?

Tip #1 – Move as much code as possible outside of loops. Loops are always the most important thing to optimize in your code. Always try to move as much as you can out of a loop. That way code doesn't get repeated and saves you some cpu cycles. A simple example:


Code: for i = 1 to 50 x = b ' Stays the same with every loop, get it outside of the loop! k = j + i next iChange that to:


Code: x = b 'Moved outside the loop

for i = 1 to 50 k = j + 1 next i That may seem like a no-brainier but you would be surprised about how many programmers do that. The simple rule is, if it doesn't change with every loop iteration then move it outside of the loop as it doesn't need to be there. You only want to include code inside a loop that MUST be there. The more instructions you can clear out of a loop the faster we can run it.

Tip #2 – Loop Unrolling Loop unrolling can eliminate some compare and jump instructions. (Compare and jump instructions are used to create loops, you don't see them in visual basic, its behind the scenes stuff that you learn in ASM.) It also takes advantage of the ability of modern cpus that can fetch several instructions at a time. In a nutshell you get a good speed boast by unrolling loops.

But there is something we need to know about loop unrolling. The largest bottleneck on modern computers is memory. So the designers of CPU's like Intel and AMD addressed this problem by using a cache on their cpus. This is basically a memory location that is accessed much faster by the CPU then standard memory. You want your unrolled loop to fit in that cache, if it doesn't then it could slow down your code. So you may want to experiment with gettickcount when you unroll you're loop.

Example Loop:

Code: For i = 1 To 100

       b = somefun(b)

Next i unrolled Example:

Code: For i = 1 To 100 Step 2

       b = somefun(b)
       b = somefun(b)

Next i You can get up to a 25% gain in speed depending on what you are doing, you just have to experiment with this.

Tip #3 – Avoid dividing if possible. A divide instruction is one of the most if not the most expensive instruction you can perform on a CPU. It is faster to multiply then divide!


Code: B = 40 / 2

is slower then

Code: b = 40 * 0.5 You can also develop some interesting algorithms using subtraction to get your result that is much faster then using division. If your using division in a loop, it is a must to change it to speed up your code. (I was going to also recommend trying shifting the bits for division but I forgot some versions of visual basic doesn't include the shift operator).

Tip #4 – In a nested conditional branch such as select case and nested if statements, put the things that are most likely to be true first in the nest, with the least likely things last.

Tip #5 – Avoid use of variant variables. The variant variable is all nice when you are new to visual basic, but its a habit you need to break. A variant variable is converted into its proper data type at runtime which can be very expensive.

Tip #6 – Be careful when you declare variables. If you don't use as something with every variable you declare, it is a variant! For example:

Code: Dim a, b, c as string. A = A variant B = A variant C = A string I've seen some people use the notation:

Code: dim x x = blah That is a NO NO! It may work yes, but its going to cost you speed.

Tip #7 – Reduce common expressions. Sometimes you have two different variables that use part of the same calculation. Instead of doing the entire calculation for both variables, eliminate the redundant calculation.

Example:

Code: x = a * b + c y = a * b + d

is slower then

Code: t = a * b x = t + c y = t + d That is especially true if your using a redundant expensive calculation in a loop.

Tip # 7 – Use long or integer for calculations. A long is a 32 bit number and is more natural on 32 bit processors. Avoid other variables such as double, single, etc

Tip #8 – Use inline functions inside of loops. Instead of calling a function, stick the code in the loop. This will make you're program larger if you repeat it in enough loops and should only be done in critical places. The reason is due to the over head of calling a function. Before the program calls a function, it has to push some things onto the stack. At the very least it will push the instruction pointer (IE: Return Address). Memory access is slow so we want to avoid that in critical places.

Tip #9 Avoid using properties in loops. Properties are accessed a lot slower then variables, so use variables instead:


Code: for i = 1 to 50 text1.text = text1.text + b(i) next i

is slower then

Code: for i = 1 to 50 strbuffer = strbuffer + b(i) next i text1.text = strbuffer

Tip #10 – Load all the data you need from the disk. Instead of loading one file at a time, load all of them at once. This will avoid future delay for the user.

Tip #11 – Make good use of the timer control. You can do background processing while waiting on a user. Use this time to prefetch data, calculations that are need, etc.

Tip #12 – Minimize dot notation in your objects! Each dot you use in a object makes visual basic do a call.


Code: Myobject.one.two.three

is slower then

Code: Myobject.one

Tip #13 Allocate enough memory at once. When you create a dynamic array and you want to add elements that haven't been allocated yet, make sure you allocate enough for all of them instead of doing it one at a time. If you don't know how many you need, times what you have allocated by 2. Allocating memory is a expensive process.

Tip #14 Avoid built in functions in loops. If you have a algorithm that is looped that requires the len of your string. Make sure you cache the size of your string in a buffer and not call the function len() with each iteration of the loop:

Code: for i = 1 to 100 sz = len(string) 'Do processing next i

instead

Code: sz = len(string) for i = 1 to 100 'Do Processing with sz next i

Tip #15 Hide the control when your setting its properties. Every time you update the properties of your control, you make it repaint. So if your developing something that displays complex graphics, may be a good idea to reduce that from happening so much.

Previous: Effective Programming Contents Next: Examples