| It's a String Thing |
| We All Do It |
I never have really looked,
but I would imagine that about 90% of all VB programs utilize
strings in some way. We store them, check them, build them up,
tear them down, slice them, and dice them. Sometimes, we use
small ones like If s = "A" Then (do something). Sometimes, we
build huge ones, like web pages or entire documents generated
programmatically. Either way, VB does its best to make
things as easy for us as possible. If you've ever
tried to work with strings or characters in other languages like C++, then
you'll know what I mean. There's actually a good bit of work to managing
strings internally, and VB/COM takes care of all that for us.
For our part, we need to understand a little about what's going on
under the hood. When we do, we can achieve a great deal of
efficiency and speed by just keeping a few short rules in
mind. Let's look at a few of those rules and see what we can
do to work within them.
| Numbers, No Letters |
Internally, strings are stored
as arrays of Integer values, so if you treat them as such, your
apps will be much more efficient. Consider the following
Select Case statement. Drop these two tests into last week's
benchmarking application (available on my FTP site,
ftp.earldamron.com, in the /pub/TimeTest folder). The idea (in
case you missed it then) is to compare the amount of time
required to perform the same logical work using two
different methods. In these two snippets, I'll compare a simple Select Case
statement using strings vs. integers obtained using the AscW function. |
' first, the typical string way
Select Case s
    Case "F"
    Case "G"
    Case "H"
End Select
' next, comparing the values stored internally for each
' character value above
Select Case AscW(s)
    Case 70 ' F
    Case 71 ' G
    Case 72 ' H
End Select |
On average with 5,000,000 iterations on a PIII 600, the Print statements
give results like
String Comparisons: 3.487
AscW Comparisons: 0.109
A speed improvement of almost 97%! For the string comparisons, the VB compiler
generates a series of
If ... Then ElseIf ... Then ElseIf ... Then Else
statements. For the AscW comparisons, the compiler generates a switch
table. Why the AscW function? Thanks for asking. Your check is in the
mail.
| No Conversion Necessary |
Internally, VB stores strings as Unicode. To interact with the "outside
world", it converts strings to ANSI. These string conversions are expensive
and should be avoided whenever possible. One way is to use the wide
versions of string functions. Consider the following two tests in the
benchmarking application: |
' the normal ASCII function
m = Asc("F")
' the Unicode (i.e., wide) version of the ASCII function
m = AscW("F") |
The benchmarking app for 5,000,000 iterations reports results like
Asc(): 0.990
AscW(): 0.110
An almost 90% speed improvement. Internally the "F" is Unicode, so the Asc()
function must convert it to ANSI before returning its character code. You can
see the cost of the conversion in
the performance numbers. There's a ChrW() function also that you should
consider using over the Chr() function.
| Variants Are Evil |
Well, not really, but there are multiple
versions of approx. 25 string functions. These versions can
return either a String or Variant type. The String versions
always end with a $ sign, while the Variant versions do not.
ALWAYS use the String versions unless you actually want the Variant.
Consider: |
' the Variant version
If UCase(s) = "FRED" Then
End If
' the String version
If UCase$(s) = "FRED" Then
End If |
The benchmarking app for 1,000,000 iterations reports results like
Variant: 1.608
String: 0.986
The code works either way, but adding the simple $ sign cuts the run time
by almost 39%. Internally, the Variant result must be converted before
using it in an assignment or comparison operation.
| Easy, But At What Cost? |
String concatenation is also a very common operation in VB apps. It's not
something you would normally think about, but string concatenation is very
expensive when strings get large. One alternative is to use smarter
concatenation techniques. Here's a small demo of concatenating the lines in
this file (with some of the benchmarking code thrown in). Assume I saved
this file to
"C:\Windows\Desktop\StringHandling.txt". |
s = vbNullString
For lCounter = 1 To CLng(txtIterations.Text)
    l = FreeFile()
    Open "C:\Windows\Desktop\StringHandling.txt" For Input As #l
    Do Until EOF(l)
        Line Input #l, sLine
        s = s & sLine & vbCrLf
    Loop
    Close #l
Next ' lCounter |
For 25 iterations, this actually takes about 8.32 seconds to complete. As s
grows larger and larger, the strings take longer to concatenate. VB has to
do a string allocation for every &, which means the line
s = s & sLine & vbCrLf
actually does two allocations and concatenations, and both of them copy
the ever-growing contents of s. As a simple performance improvement,
you can shrink one of the allocations by directing VB to do
the smaller allocation first, followed by the larger one, as in
the following code: |
s = vbNullString
For lCounter = 1 To CLng(txtIterations.Text)
    l = FreeFile()
    Open "C:\Windows\Desktop\StringHandling.txt" For Input As #l
    Do Until EOF(l)
        Line Input #l, sLine
        s = s & (sLine & vbCrLf)
    Loop
    Close #l
Next ' lCounter |
The () allow VB to do the smaller
allocation first (sLine & vbCrLf) and then append this
smaller string to the larger one. Just this small change drops
the time from 8.32 seconds to 4.151 seconds, a savings of
approx. 50%!
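If even the parenthesized version is too slow, one further technique is to stop reallocating s altogether: reserve a buffer up front and copy each piece into it with the Mid$ statement. This was not part of the tests above and has not been timed here, so treat the code below as a sketch only. The names m_sBuffer, m_lUsed, ResetBuffer, AppendText, and BufferText are made up for illustration.

```vb
' Hypothetical buffer helpers for a form or module; the names are
' illustrative and do not come from the benchmarking application.
Private m_sBuffer As String   ' preallocated storage
Private m_lUsed As Long       ' characters of m_sBuffer actually in use

Private Sub ResetBuffer()
    m_sBuffer = vbNullString
    m_lUsed = 0
End Sub

Private Sub AppendText(ByRef sText As String)
    Dim lTextLen As Long
    Dim lGrow As Long

    lTextLen = Len(sText)
    If lTextLen = 0 Then Exit Sub

    If m_lUsed + lTextLen > Len(m_sBuffer) Then
        ' Grow by at least the current size so that reallocations
        ' become rarer as the buffer gets bigger.
        lGrow = Len(m_sBuffer)
        If lGrow < lTextLen Then lGrow = lTextLen
        If lGrow < 1024 Then lGrow = 1024
        m_sBuffer = m_sBuffer & Space$(lGrow)
    End If

    ' The Mid$ statement overwrites characters in place; no new
    ' string is allocated here.
    Mid$(m_sBuffer, m_lUsed + 1, lTextLen) = sText
    m_lUsed = m_lUsed + lTextLen
End Sub

Private Function BufferText() As String
    ' Trim off the unused padding at the end of the buffer.
    BufferText = Left$(m_sBuffer, m_lUsed)
End Function
```

In the loop above, s = s & (sLine & vbCrLf) would become AppendText sLine followed by AppendText vbCrLf, with ResetBuffer before the For loop and s = BufferText() after it. How much this helps depends on the file and the machine, so time it before relying on it.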
| Wrap Up |
This article has just touched
on some of the simple ways you can optimize string handling.
For some additional information, check out Francesco Balena's
site, www.vb2themax.com. Francesco has long been one of the
string masters and has some additional tips for maxing out
string performance. Also, check out Matt Curland's book,
Advanced Visual Basic 6. Matt includes a ton of
fundamental string information, as well as some string code
that will turn normal VB string handling performance on its ear. If you
think the concatenation performance improvement above is good, check out
some of Matt's code.
If you've ever worked with C or C++, you know the additional steps required
to work with strings and how easy strings are to work with in VB. Well, it
may be easy but it can be costly as you've seen by some of the performance
numbers here. Rethinking how VB handles strings internally can vastly
improve the performance of your heavily "stringed" applications. |
|
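If you don't have the benchmarking application handy, you can still check numbers like the ones in this article on your own machine. Here is a minimal sketch that uses VB's built-in Timer function. It is not the benchmarking application itself; the procedure name TimeSelectCase and the test value "G" are assumptions made for the example.

```vb
' Hypothetical timing sketch; this is NOT the benchmarking application
' mentioned above, just one way to time two alternatives.
Private Sub TimeSelectCase()
    Const ITERATIONS As Long = 5000000
    Dim lCounter As Long
    Dim s As String
    Dim sngStart As Single

    s = "G"   ' must not be empty: AscW raises run-time error 5 on ""

    ' Test 1: string comparisons
    sngStart = Timer
    For lCounter = 1 To ITERATIONS
        Select Case s
            Case "F"
            Case "G"
            Case "H"
        End Select
    Next lCounter
    Debug.Print "String Comparisons: " & Format$(Timer - sngStart, "0.000")

    ' Test 2: character code comparisons
    sngStart = Timer
    For lCounter = 1 To ITERATIONS
        Select Case AscW(s)
            Case 70 ' F
            Case 71 ' G
            Case 72 ' H
        End Select
    Next lCounter
    Debug.Print "AscW Comparisons: " & Format$(Timer - sngStart, "0.000")
End Sub
```

Timer returns the number of seconds since midnight, so a run that crosses midnight gives a meaningless result, and its resolution is limited. Results can also differ between the IDE and a compiled EXE, so compare like with like.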