
Thread: Text File Read/Write & String Functions (with a BigString)

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Feb 2015
    Location
    Colorado USA
    Posts
    80

    Text File Read/Write & String Functions (with a BigString)

    Class Module clStrings

    This class module is another in a series of library modules that are designed to work with Visual Basic 6 (VB6) and all versions of Visual Basic for Applications (VBA). The code runs equally well in any of these environments which I will refer to as “VB” instead of VB6 or VBA or VB6/VBA. All routines including those dealing with Windows API calls or file reads/writes are Unicode all the time.

    This module has a large set of routines to work with “BigString” which is up to 1,000 times faster than VB’s string functions in some cases. In addition there is a complete set of text and binary file read/write functions that fill a large void in VB’s file handling capabilities, especially for text files.

    Overview

    This class module contains an enhanced version of the BigString routines (sometimes called StringBuilder) that greatly speed up VB’s string handling, especially when concatenating a large number of strings. This module also has a simple-to-use but comprehensive system for reading/writing text files and binary files (I have done up to 500 MB files in one read). An obvious question is why combine these in one module? When we read a text file it is much more efficient to read from disk in one read into a large memory buffer and then make individual lines of text out of it. A BigString is very convenient for this. Also, when we write strings to a text file it is more efficient to do the physical write in one call, which means the entire set of strings needs to be in one large data buffer, very much like a BigString. Thus it makes sense to me to combine them. I have had earlier versions of these as separate modules but I was almost always using them together so I combined and optimized them.

    The BigString set of routines enables you to greatly speed up string operations when there are many changes and concatenations and/or when string lengths exceed a few hundred characters. Normally, VB re-allocates the entire string any time an operation changes it. When you use this module, a large string is allocated once and the subsequent string operations occur within that large string. The difference can be speed increases of up to 1,000 times versus standard string handling using built-in VB functions. If you are dealing with 100 strings or fewer then it likely is a bit quicker to use VB’s built-in procedures, but the BigString concept excels when dealing with thousands of strings.
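    To make the "allocate once, then work inside the buffer" idea concrete, here is a minimal sketch in plain VB. It is not the clStrings code: the names (SketchAppend, SketchValue, SketchClear, SKETCH_CHUNK) are made up for illustration, and growing in whole chunks of 32,768 characters is my assumption about how such a buffer might grow. It only uses the built-in Mid$ statement to overwrite characters in place.

```vb
Option Explicit

' Hypothetical sketch of a preallocated string buffer (not the clStrings code).
Private Const SKETCH_CHUNK As Long = 32768   ' assumed growth step, in characters

Private mBuf As String   ' the preallocated buffer
Private mLen As Long     ' number of characters actually in use

' Append s to the buffer, growing the buffer only when it is full.
Public Sub SketchAppend(ByRef s As String)
    Dim need As Long
    Dim newCap As Long

    need = mLen + Len(s)
    If need > Len(mBuf) Then
        ' Round the new capacity up to a whole number of chunks.
        newCap = ((need \ SKETCH_CHUNK) + 1) * SKETCH_CHUNK
        mBuf = mBuf & Space$(newCap - Len(mBuf))
    End If

    ' The Mid$ statement overwrites characters in place (no reallocation).
    If Len(s) > 0 Then Mid$(mBuf, mLen + 1, Len(s)) = s
    mLen = need
End Sub

' Return only the part of the buffer that holds real data.
Public Function SketchValue() As String
    SketchValue = Left$(mBuf, mLen)
End Function

' Forget the contents but keep the allocated buffer for reuse.
Public Sub SketchClear()
    mLen = 0
End Sub

Public Sub SketchDemo()
    Dim i As Long
    SketchClear
    For i = 1 To 10000
        SketchAppend "Line " & CStr(i) & vbCrLf
    Next i
    Debug.Print Len(SketchValue())
End Sub
```

    Paste it into a standard module and run SketchDemo. The point of the design is that the expensive step (mBuf = mBuf & Space$(...)) happens once per chunk rather than once per appended string.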

    Most of what the file read/write functions do is via Windows API calls (all Unicode), so there is no inherent slowdown due to using Visual Basic and they should be about as fast as the same operations in any other language. As a VB programmer you are likely painfully aware that even though the language deals with Unicode strings, when text files are read or written with VB’s own statements all of the Unicode gets converted to ANSI, causing all sorts of problems. This module totally eliminates that problem. We can read or write text files in UTF-8 (today’s text standard, used almost exclusively for web pages since it efficiently handles all Unicode characters), UTF-16 (which by convention has a BOM), ANSI (Default, OEM, CurrentThread or whatever code page you want to use), and UTF-8 with a BOM, even though this is discouraged now. You can specify the file type or the routine can auto-detect which one it is.

    By the way, a BOM is a Byte Order Mark: 2 bytes at the start of a UTF-16 file (always present, by convention) or 3 bytes sometimes placed at the start of a UTF-8 file, which more or less announce what format the text is in. UTF-8 has become so widely used that putting a BOM on it is discouraged. If you wish to read more about the different types of text files that exist in the Windows world, see any of the following links: UTF-8, UTF-16, ANSI Code Pages (mid-article for Windows code pages) and here for a Microsoft discussion on the various Windows code pages you can use (if you really need to) when you convert to/from ANSI/Unicode.
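    For anyone curious what checking for a BOM looks like, here is a small sketch. It is not the auto-detect routine from clStrings (which presumably does more, since a file without a BOM still has to be classified); SketchDetectBom is a made-up name and the returned labels are my own. The byte values are the standard ones: EF BB BF for UTF-8, FF FE for UTF-16 little-endian and FE FF for UTF-16 big-endian.

```vb
Option Explicit

' Hypothetical sketch: report which BOM (if any) starts a byte buffer.
' Returns "UTF-8", "UTF-16LE", "UTF-16BE", or "" when no BOM is found.
' Assumes data() has been dimensioned. UTF-32 is not handled.
Public Function SketchDetectBom(ByRef data() As Byte) As String
    Dim b As Long   ' index of the first byte
    Dim n As Long   ' number of bytes available

    b = LBound(data)
    n = UBound(data) - b + 1

    ' UTF-8 BOM is 3 bytes: EF BB BF
    If n >= 3 Then
        If data(b) = &HEF And data(b + 1) = &HBB And data(b + 2) = &HBF Then
            SketchDetectBom = "UTF-8"
            Exit Function
        End If
    End If

    ' UTF-16 BOM is 2 bytes: FF FE (little-endian) or FE FF (big-endian)
    If n >= 2 Then
        If data(b) = &HFF And data(b + 1) = &HFE Then
            SketchDetectBom = "UTF-16LE"
        ElseIf data(b) = &HFE And data(b + 1) = &HFF Then
            SketchDetectBom = "UTF-16BE"
        End If
    End If
End Function

Public Sub SketchBomDemo()
    Dim bytes(0 To 3) As Byte
    bytes(0) = &HEF: bytes(1) = &HBB: bytes(2) = &HBF: bytes(3) = 65
    Debug.Print SketchDetectBom(bytes)   ' prints UTF-8
End Sub
```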

    I have recently incorporated the ability to read and write binary files. It doesn’t fit with the rest of the module being related to strings but it required very little code beyond that necessary for the text read/write routines so I incorporated the features in this module.

    This class module contains many string handling functions to address some VB shortcomings and to extend their capabilities. This module works in total Unicode and works with VB6 and all 32 and 64-bit flavors of VBA. All of the calls to clStrings require regular module mUCCore in order to function. This module contains many string functions on its own that you likely will find useful in addition to a whole host of routines I use every day including file operations, error handling, the operating system and so forth. Below is a list of string-related functions included in both the class module clStrings and module mUCCore.

    BigString – String operations are very slow in VB6/VBA because every little change requires the string(s) to be totally reallocated. For a few characters this isn’t bad but it gets very bad when dealing with long strings and files. There is a whole subsystem described later that works around all of this, providing an alternate system to append, insert, search, remove, etc. strings at very high speeds.

    Delim – Get or set the delimiter string. Must be 1 or 2 characters. Defaults to vbCrLf, which is standard for Windows text files.
    Append – Add a string onto the end of bigString. Optionally set the starting character in the string to append.
    AppendWDelim – Add a string followed by the delimiter (initially vbCrLf).
    Insert – Insert a string into the big string. Specify the character position to insert ahead of, the string, an optional start character in that string, and whether or not to put the delimiter sequence onto the end of the inserted string.
    InsertWDelim – Same as Insert above but with the current delimiter tacked on to the string to insert.
    Length – Return the current length of the string being built (same as normal “Len()”).
    Remove – Remove a specified number of characters from a place in the string.
    Split – Like normal Split but operates on our bigString. The delimiter is whatever has been set with Delim (default vbCrLf). Specify start/end character positions; Limit sets the number of split strings returned and Compare sets how text is searched.
    Find – Find a sub-string in the big string (equivalent to a normal string’s InStr). Specify the string to find, the character position to start looking at and the compare method.
    Capacity – Return the current maximum length of the string with the current “chunks” (it will grow automatically for more data).
    ChunkSize – Get or set the Unicode character chunk size. The default value is 32,768 characters (65,536 bytes).
    GetAString – Return part or all of the big string. Specify the start and stop character (defaults to the whole string in the BigString).
    bigString – Set the value of the big string as the starting point for building (to erase, set it to "").
    GrowWithGarbage – Lengthen our internal string by a specified number of characters. Useful for later dropping in data from an API call etc. (Advanced).
    AppendPtrData – Quicker append using a pointer to the string and the number of characters to append (Advanced).
    InsertPtrData – Insert a string using a pointer to the string (Advanced).
    HeapMinimize – Shrink the allocated memory for bigString down to a minimum (it can still grow after this).
    GetToIntChars – Copy part or all of the big string to an integer array.


    Below are string functions found in the standard module mUCCore.

    SubstStr – Substitute environment variables, drive labels (converted to drive letters) and the current time and date into a string.

    StringW – Unicode replacement for VB function String$ which only uses chars 1-255.

    iPad – Left and right-justify 2 strings over a given width. Good especially for tabular output to the Immediate Window since in both VBA and VB6 it is monospaced (all characters are the same width). I use it a lot for debugging.

    AllocString – Makes a string containing a certain number of characters. Faster than Space$ for strings longer than about 400 characters. Not of much use by itself but it does provide a nice buffer for strings returned from Windows API calls.

    For those of you who dive deeper into programming than normal VB6/VBA coding, the following are Public procedures that deal with strings and text using pointers (although as we all know, nobody using VB6/VBA knows anything about pointers…). These procedures are extremely useful especially when dealing with many Windows API calls. If you don’t use pointers and memory buffers then you can ignore the below procedures. They are used internally in many of my other procedures but almost all of them take in and return normal VB variables and do not require any knowledge of pointers.

    Ptr2VBStr – Makes a string in VBA and copies the data in memory to that string. You can specify the number of characters or have it find the end of the string (marked by a null character).

    Ptr2Str – Even faster than Ptr2VBStr using a different algorithm. The function determines the string length.

    lstrlenW – Find the length of a string in memory (characters followed by the null character).

    RTLMoveMemory – Not just a string function. Copy memory data from one location to another.



    File Functions

    ReadTextFile – Read a text file into a string array or BigString. File encoding can be UTF-8, UTF-16 or ANSI.

    WriteTextFile – Write a text file from a string array or BigString, encoded in UTF-8, UTF-16 or ANSI.

    ReadBinaryFile – Read a binary file into a byte array.

    WriteBinaryFile – Write binary (non-text) data from a byte buffer to a file.

    SetFilePtr – Set then return the read/write file pointer in the open file. In 32-bit code the position is held in a Currency data type and in 64-bit code the position is held in a 64-bit LongPtr. Both use the Windows API function SetFilePointerEx.

    CloseOpenHandle – Close the file handle for our read/write functions (if the file has been left open).
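    The Currency trick mentioned under SetFilePtr deserves a short illustration. The sketch below is not the code from clStrings; SketchSeek is a made-up name and the declares are my own. SetFilePointerEx takes a 64-bit position, and 32-bit VB has no 64-bit integer type, but Currency is a 64-bit integer scaled by 10,000, so it can carry the value as long as you divide by 10,000 going in and multiply by 10,000 coming out. In 64-bit VBA the sketch uses LongLong, which is the same size as LongPtr there.

```vb
Option Explicit

Private Const FILE_BEGIN As Long = 0   ' move relative to the start of the file

#If Win64 Then
    Private Declare PtrSafe Function SetFilePointerEx Lib "kernel32" ( _
        ByVal hFile As LongPtr, _
        ByVal liDistanceToMove As LongLong, _
        ByRef lpNewFilePointer As LongLong, _
        ByVal dwMoveMethod As Long) As Long
#Else
    Private Declare Function SetFilePointerEx Lib "kernel32" ( _
        ByVal hFile As Long, _
        ByVal liDistanceToMove As Currency, _
        ByRef lpNewFilePointer As Currency, _
        ByVal dwMoveMethod As Long) As Long
#End If

' Hypothetical helper: move to a byte offset from the start of the file.
' Returns the new position in bytes, or -1 if the API call fails.
#If Win64 Then
Public Function SketchSeek(ByVal hFile As LongPtr, ByVal offset As LongLong) As LongLong
    Dim newPos As LongLong
    If SetFilePointerEx(hFile, offset, newPos, FILE_BEGIN) <> 0 Then
        SketchSeek = newPos
    Else
        SketchSeek = -1
    End If
End Function
#Else
Public Function SketchSeek(ByVal hFile As Long, ByVal offset As Currency) As Currency
    Dim newPos As Currency
    ' Currency stores value * 10,000 as a 64-bit integer, so scale both ways.
    If SetFilePointerEx(hFile, offset / 10000@, newPos, FILE_BEGIN) <> 0 Then
        SketchSeek = newPos * 10000@
    Else
        SketchSeek = -1
    End If
End Function
#End If
```

    hFile is assumed to be a handle you already opened with CreateFile or similar; the sketch does not open or close anything.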


    Setup and Use

    The class module clStrings requires only that module mUCCore is included in the program. If you want to run the code in Excel or VB6 you need do nothing other than use the code. Just insert the class module clStrings and the standard module mUCCore into a new or existing VB6 or VBA project and you are ready to go.

    If you are using this module for Office programs other than Excel, you must set an appropriate conditional compilation constant for your VBA project. There is no built-in way to distinguish between the Office programs at compile time, so we set our own conditional compilation constants and the code checks them. If you plan to use this in some code for Word, go to Tools | VBAProject Properties (2nd one from bottom) and in the General tab sheet, enter the value "Word = 1" (without quotes; case doesn’t matter) to set the conditional compilation constant Word to 1. Do the same in VBA projects you want to run in Access (Access = 1), PowerPoint (PowerPoint = 1) and Outlook (Outlook = 1). Excel and VB6 do not need a compilation constant because we can distinguish between VB6 and all of the VBA versions, and we assume that you are using Excel unless told otherwise because most people who use VBA are using it in Excel. We can also automatically distinguish between 32 and 64-bit VBA code so you don’t need to do anything special for that.

    Host              Required Conditional Compilation Constant
    Visual Basic 6    N/A
    MS Excel          N/A
    MS Word           Word = 1
    MS Access         Access = 1
    MS PowerPoint     PowerPoint = 1
    MS Outlook        Outlook = 1

    The reason for the distinction in VBA hosts is that there are commands that exist in one host but not in another. For example, in VBA our code is held within individual documents and often we want to know what document is holding/running our code. In Excel this is ThisWorkbook.Path but in Word it is ActiveDocument.Path, in PowerPoint it is ActivePresentation.Path, in Access it is CurrentProject.Path and in Outlook there is no equivalent. If I have a line of code that uses ThisWorkbook.Path it will compile and run fine in Excel but it won’t even compile in any of those other hosts. In this particular case, I created a variable called AppPath that holds this path and I have code blocked out for each of the possible hosts so that they don’t “see” the statements that don’t exist in their version of VBA. It was a bit of a pain to set that up originally but once set up it works very well.
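    Roughly, the blocking-out described above looks like the sketch below. This is an illustration and not the actual code in mUCCore: SketchAppPath is a made-up name, and it covers the VBA hosts only (in VB6 the natural choice would be App.Path, but I am not reproducing how the VB6 case is detected). The constants Word, Access, PowerPoint and Outlook are the ones from the table; a constant you never defined evaluates as 0, so Excel falls through to the #Else branch.

```vb
Option Explicit

' Hypothetical sketch: return the folder of the document that holds this code.
' Only the branch for the active host is compiled, so the other hosts'
' object names never cause a compile error.
Public Function SketchAppPath() As String
#If Word Then
    SketchAppPath = ActiveDocument.Path
#ElseIf Access Then
    SketchAppPath = CurrentProject.Path
#ElseIf PowerPoint Then
    SketchAppPath = ActivePresentation.Path
#ElseIf Outlook Then
    SketchAppPath = vbNullString      ' Outlook has no equivalent
#Else
    SketchAppPath = ThisWorkbook.Path ' assumed to be Excel
#End If
End Function
```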


    Use

    As with all class modules, you must declare an object variable for the class before you can use it.

    Dim Strs As clStrings

    It doesn’t need to be named “Strs”. Then somewhere in your code you put the line

    Set Strs = New clStrings

    Initialization code sets the size of the big string for the string builder functions to 32,768 characters (it actually doesn’t use any memory until you assign a string to it) and it also calls UCCoreInit in the module mUCCore if it hasn’t already been called by another routine.


    When you are finished using the class module set it to Nothing.

    Set Strs = Nothing
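    Put together, a typical session might look like the sketch below. Treat the calls as assumptions: the member names come from the list earlier in this post, but the exact signatures are in the attached code, and I am assuming here that AppendWDelim takes the string as its first argument and that GetAString with no arguments returns the whole string. SketchUseStrs is just a made-up name for the demo.

```vb
Option Explicit

' Hypothetical usage sketch; requires clStrings and mUCCore in the project.
Public Sub SketchUseStrs()
    Dim Strs As clStrings
    Dim i As Long

    Set Strs = New clStrings

    For i = 1 To 1000
        ' Assumed: the string to add is the first argument.
        Strs.AppendWDelim "Row " & CStr(i)
    Next i

    Debug.Print Strs.Length                  ' characters built so far
    ' Assumed: no arguments means "return the whole string".
    Debug.Print Left$(Strs.GetAString, 20)

    Set Strs = Nothing
End Sub
```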

    VB6 Users – There are controls that have been developed on the VBForums website by Krool which are enhanced versions of those Microsoft supplied with Visual Basic. Here is a link to his Common Controls Replacement Project. These controls enable Unicode and many other things. I highly recommend them. If you use them you must start your program with Sub Main and not a form and there is a bit of initialization code required to use a newer version of one of the Windows DLLs. That code is here in UCCoreInit so I recommend starting your programs with Sub Main and making the first line of code in that sub be a call to UCCoreInit. If you aren’t a VB6 user or none of this makes sense to you just skip it.
    Attached Files

  2. #2
    Fanatic Member
    Join Date
    Sep 2012
    Posts
    993

    Re: Text File Read/Write & String Functions (with a BigString)

    I tested it on Win10 and XP, it was very fast and worked very well. Thank you for sharing.
