It's a String Thing
 
 
We All Do It
I've never actually checked, but I'd guess that about 90% of all VB programs use strings in some way. We store them, check them, build them up, tear them down, slice them, and dice them. Sometimes we use small ones, like If s = "A" Then (do something). Sometimes we build huge ones, like web pages or entire documents generated programmatically. Either way, VB makes things as easy for us as possible. If you've ever tried to work with strings or characters in other languages like C++, then you'll know what I mean. There's actually a good bit of work involved in managing strings internally, and VB/COM takes care of all that for us.

For our part, we need to understand a little about what's going on under the hood. When we do, we can achieve a great deal of efficiency and speed by just keeping a few short rules in mind. Let's look at a few of those rules and see what we can do to work within them.
Numbers, No Letters
Internally, strings are stored as arrays of Integer values, so if you treat them as such, your apps will be much more efficient. Consider the two Select Case statements below. Drop these two tests into last week's benchmarking application (available on my FTP site, ftp.earldamron.com, in the /pub/TimeTest folder). The idea (in case you missed it then) is to compare the amount of time required to perform the same logical work using two different methods. In these two snippets, I'll compare a simple Select Case statement using strings vs. one using integers obtained from the AscW function.
  ' first, the typical string way
  Select Case s
   Case "F"
   Case "G"
   Case "H"
  End Select

  ' next, comparing the values stored internally for each
  ' character value above
  Select Case AscW(s)
   Case 70 ' F
   Case 71 ' G
   Case 72 ' H
  End Select
On average with 5,000,000 iterations on a PIII 600, the Print statements give results like

String Comparisons: 3.487
AscW Comparisons: 0.109

About a 97% reduction in run time! For the string comparisons, the VB compiler generates a series of

If ... Then
ElseIf ... Then
ElseIf ... Then
Else

statements. For the AscW comparisons, the compiler generates a switch table. Why the AscW function? Thanks for asking. Your check is in the mail.
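One practical wrinkle when switching to the AscW form: AscW raises run-time error 5 if it is handed an empty string, and it looks only at the first character, so "F" and "Fred" land in the same Case. Here is a minimal sketch of a guarded version. The function name GradeName and the return values are made up for illustration.

```vb
' Hypothetical helper: branch on the first character's code.
Private Function GradeName(ByVal s As String) As String
    ' AscW raises run-time error 5 on an empty string, so guard first.
    If LenB(s) = 0 Then
        GradeName = "(empty)"
        Exit Function
    End If

    ' AscW examines only the first character of s.
    Select Case AscW(s)
        Case 70              ' F
            GradeName = "Fail"
        Case 71              ' G
            GradeName = "Good"
        Case 72              ' H
            GradeName = "High"
        Case Else
            GradeName = "(other)"
    End Select
End Function
```

If the original string test relied on an exact match (s = "F" and nothing longer), add a Len(s) = 1 check before the Select Case to keep the same behavior.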
No Conversion Necessary
Internally, VB stores strings as Unicode. To interact with the "outside world", it converts strings to ANSI. These string conversions are expensive and should be avoided whenever possible. One way is to use the wide versions of string functions. Consider the following two tests in the benchmarking application:
  ' the normal ASCII function
  m = Asc("F")

  ' the Unicode (i.e., wide) version of the ASCII function
  m = AscW("F")
The benchmarking app for 5,000,000 iterations reports results like

Asc(): .990
AscW(): .110

An almost 90% speed improvement. The Asc() function must convert the "F" from its internal Unicode form to ANSI before it can return the character code, while AscW() simply returns the value that is already stored. You can see the cost of the conversion in the performance numbers. There's also a ChrW() function that you should consider using over the Chr() function.
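If you don't have the benchmarking application handy, a rough stand-in for this test can be built with VB's Timer function. This is only a sketch, not the benchmarking app itself: the procedure name and the output format are my own, and the numbers you get will depend on your machine and on whether you run in the IDE or from a compiled EXE.

```vb
' Minimal timing sketch; not the benchmarking application from the article.
Private Sub TimeAscVersusAscW()
    Const ITERATIONS As Long = 5000000
    Dim i As Long
    Dim m As Integer
    Dim tStart As Single

    ' Timer returns the number of seconds since midnight as a Single.
    tStart = Timer
    For i = 1 To ITERATIONS
        m = Asc("F")         ' ANSI version
    Next i
    Debug.Print "Asc():  "; Format$(Timer - tStart, "0.000")

    tStart = Timer
    For i = 1 To ITERATIONS
        m = AscW("F")        ' wide (Unicode) version
    Next i
    Debug.Print "AscW(): "; Format$(Timer - tStart, "0.000")
End Sub
```

Because Timer resets at midnight, a run that straddles midnight will report a nonsense elapsed time.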
Variants Are Evil
Well, not really, but there are multiple versions of approx. 25 string functions. These versions can return either a String or a Variant. The String versions always end with a $ sign while the Variant versions do not. ALWAYS use the String versions unless you actually want the Variant. Consider
  ' the Variant version
  If UCase(s) = "FRED" Then
  End If

  ' the String version
  If UCase$(s) = "FRED" Then
  End If
The benchmarking app for 1,000,000 iterations reports results like

Variant: 1.608
String: 0.986

The code works either way, but adding the simple $ sign cuts the time by nearly 40%. Internally, the Variant result must be converted before it can be used in an assignment or comparison operation.
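UCase$ is just one of the family. The same habit applies to the other everyday string functions. A short sketch follows; the procedure name and the sample text are placeholders.

```vb
' Illustrative only: the $ forms return a String directly.
Private Sub DollarVersions()
    Dim s As String
    Dim sName As String

    s = "  Fred Flintstone  "

    sName = Trim$(s)                 ' rather than Trim(s)
    Debug.Print Left$(sName, 4)      ' rather than Left(sName, 4)
    Debug.Print Mid$(sName, 6)       ' rather than Mid(sName, 6)
    Debug.Print LCase$(sName)        ' rather than LCase(sName)

    If UCase$(Left$(sName, 4)) = "FRED" Then
        Debug.Print "Found Fred"
    End If
End Sub
```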
Easy, But At What Cost?
String concatenation is also a very common operation in VB apps. It's not something you would normally think about, but string concatenation is very expensive when strings get large. One alternative is to use smarter concatenation techniques. Here's a small demo that concatenates the lines in this file (with some of the benchmarking code thrown in). Assume I saved this file to "C:\Windows\Desktop\StringHandling.txt".
  s = vbNullString
  For lCounter = 1 To CLng(txtIterations.Text)
    l = FreeFile()

    Open "C:\Windows\Desktop\StringHandling.txt" For Input As #l

    Do Until EOF(l)
     Line Input #l, sLine

     s = s & sLine & vbCrLf
    Loop

    Close #l
  Next ' lCounter
For 25 iterations, this actually takes about 8.32 seconds to complete. As s grows larger and larger, the strings take longer to concatenate. VB has to do a string allocation for every & which means the line

s = s & sLine & vbCrLf

actually does two allocation and concatenation operations. As a simple performance improvement, you can make one of those allocations much smaller by directing VB to build the small string first and then append it to the large one, as in the following code:
  s = vbNullString
  For lCounter = 1 To CLng(txtIterations.Text)
      l = FreeFile()

      Open "C:\Windows\Desktop\StringHandling.txt" For Input As #l

      Do Until EOF(l)
       Line Input #l, sLine

       s = s & (sLine & vbCrLf)
      Loop

      Close #l
  Next ' lCounter
The () allow VB to do the smaller allocation first (sLine & vbCrLf) and then append this smaller string to the larger one. Just this small change drops the time from 8.32 seconds to 4.151 seconds, a savings of approx. 50%!
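The parentheses trick still allocates a new, larger string on every pass through the loop. A different technique, which is not part of the tests above and for which I have no timings, is to allocate a big buffer up front and copy each piece into it with the Mid$ statement. The sketch below assumes that approach. AppendToBuffer and ReadAllLines are hypothetical names, and the 1024-character minimum growth is an arbitrary choice.

```vb
' Hypothetical helper: copy sText into a preallocated buffer,
' growing the buffer only when it runs out of room.
Private Sub AppendToBuffer(ByRef sBuffer As String, ByRef lUsed As Long, _
                           ByRef sText As String)
    Dim lNeeded As Long
    Dim lGrow As Long

    lNeeded = lUsed + Len(sText)
    If lNeeded > Len(sBuffer) Then
        ' Grow by at least the current size so reallocations stay rare.
        lGrow = Len(sBuffer)
        If lGrow < Len(sText) Then lGrow = Len(sText)
        If lGrow < 1024 Then lGrow = 1024
        sBuffer = sBuffer & Space$(lGrow)
    End If

    ' The Mid$ statement overwrites characters in place.
    If Len(sText) > 0 Then
        Mid$(sBuffer, lUsed + 1, Len(sText)) = sText
        lUsed = lNeeded
    End If
End Sub

' Hypothetical usage: read a whole text file through the buffer.
Private Function ReadAllLines(ByVal sPath As String) As String
    Dim l As Integer
    Dim sLine As String
    Dim sBuffer As String
    Dim lUsed As Long

    l = FreeFile()
    Open sPath For Input As #l
    Do Until EOF(l)
        Line Input #l, sLine
        AppendToBuffer sBuffer, lUsed, sLine
        AppendToBuffer sBuffer, lUsed, vbCrLf
    Loop
    Close #l

    ' Trim the unused tail of the buffer.
    ReadAllLines = Left$(sBuffer, lUsed)
End Function
```

The trade-off is a little bookkeeping (the lUsed counter and the final Left$) in exchange for far fewer allocations of the large string.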
Wrap Up
This article has just touched on some of the simple ways you can optimize string handling. For some additional information, check out Francesco Balena's site, www.vb2themax.com. Francesco has long been one of the string masters and has some additional tips for maxing out string performance. Also, check out Matt Curland's book, Advanced Visual Basic 6. Matt includes a ton of fundamental string information, as well as some string code that will turn normal VB string handling performance on its ear. If you think the concatenation performance improvement above is good, check out some of Matt's code.

If you've ever worked with C or C++, you know the additional steps required to work with strings and how easy strings are to work with in VB. Well, it may be easy but it can be costly as you've seen by some of the performance numbers here. Rethinking how VB handles strings internally can vastly improve the performance of your heavily "stringed" applications.
 
This article has been republished with permission from EZ Programming Weekly. To subscribe, send an email to cdnelson9@hotmail.com
Copyright © 2001 by Earl Damron