September 27, 2024

Debugging JScript and VBScript in Visual Studio with Windows® Script Host (cscript.exe)

(This is not VBA but posting here..)

I have managed to set up Local Debugging and not external attach type for a particular VBScript (.vbs) or JScript (.js) file in Visual Studio. 

Tested with Visual Studio Community 2019 and 2022.

 Configuring Visual Studio

The Visual Studio installation is minimal, no workloads are installed only these individual components:

  • JavaScript and TypeScript language support
  • TypeScript 4.3 SDK
  • Just-In-Time debugger

 



Create a new Visual Studio EXE project for cscript.exe


  • Start Visual Studio
  • click Continue without code
  • select File > Open > Project/Solution
  • navigate to and open cscript.exe in C:\Windows\SysWOW64 (32-bit worked for me)
  • click Ignore the elevated permission warning for this directory (we will change the working directory later)


The cscript file appears in the Visual Studio solution:



Change properties for the cscript exe project




For example I have a foo.vbs file in my D:\WSH\ directory.

  • Open cscript exe project Properties in Solution Explorer
  • Set the Working directory to the script location: D:\WSH
  • Set Arguments to: //d foo.vbs
  • Change Debugger Type to Script





Start debugging. Press F11 to step into…




Save the solution


March 13, 2024

Deep dive into VBA Array assignment

Last modified: Sun Mar 17 13:30:16 UTC+0100 2024 © Attila Tarpai

VBA Internals: Deep dive into VBA Array assignment
and about Dictionary.Keys

Arrays in VBA are value-types i.e. a full array copy is made on assignment. This is very different from f. ex. the C language.

Dim arr(5)    ' Fixed-size Array variable with elements
Dim varr()    ' Dynamic Array variable

varr = arr    ' array assignment

After the assignment varr and arr has nothing common. They are two completely separate arrays with their own separate copies of values.

Array assignment between variables can be an expensive operation. The complexity depends on the type of the Array. Assignment from functions does NOT make an array copy - with one exception.

By knowing the rules, avoiding full array copies may speed up VBA code. Especially when working with arrays of strings, functions returning such arrays, arrays within arrays etc. And I was curious how VBA works in detail.


Array assignment: lvalue and rvalue

Only Dynamic Array- and Variant variable can store a Pointer to SAFEARRAY (PSA) and be lvalue in array assignment:

  • Dynamic Array variable MUST match the exact array type of the rvalue.
  • Variant variable can be lvalue in all types of array assignments - as in all other assignments.

NB: the function (return) value of Functions of Array- and Variant type is treated the same way as variables.

The rvalue in Array-assignment can be Dynamic- or Fixed-size Array variables, Variant variables of array type; or functions of array type and functions of type Variant of array type. Phew.

  • Assignment from variable rvalue results in full array copy
  • Assignment from Function rvalue is only a simple pointer assignment, no array copy is made. With one exception!
|------------------------------------------------------------------------|
|                  lvalue               |                                |
|---------------------------------------|            rvalue              |
|      Variant        |  Dynamic Array  |                                |
|      variable       |    variable     |                                |
|------------------------------------------------------------------------|
|                     |                 |  Array variable                |
|   _______________   |    _________    |    _________                   |
|  |_xx20____PSA2__|  |   |___PSA2__|   |   |___PSA___|                  |
|                     |                 |                                |
|   Dim v             |    Dim varr()   |    Dim arr()   ' Dynamic       |
|   v = arr           |    varr = arr   |    Dim arr(3)  ' or Fixed-size |
|------------------------------------------------------------------------|
|                     |                 |  Variant variable              |
|   _______________   |    _________    |    _______________             |
|  |_xx20____PSA2__|  |   |___PSA2__|   |   |_xx20____PSA___|            |
|                     |                 |                                |
|   Dim v             |    Dim varr()   |    Dim va                      |
|   v = va            |    varr = va    |                                |
|------------------------------------------------------------------------|
|                     |                 |  Function returning array      |
|   _______________   |    _________    |    _________                   |
|  |_xx20____PSA___|  |   |___PSA___|   |   |___PSA___|                  |
|                     |                 |                                |
|   Dim v             |    Dim varr()   |    Function fa()               |
|   v = fa            |    varr = fa    |    As Variant()                |
|------------------------------------------------------------------------|
|                     |                 |  Function returning Variant    |
|   _______________   |    _________    |    _______________             |
|  |_xx20____PSA___|  |   |___PSA2__|   |   |_xx20____PSA___|            |
|                     |                 |                                |
|   Dim v             |    Dim varr()   |    Function fv()               |
|   v = fv            |    varr = fv    |    As Variant                  |
|------------------------------------------------------------------------|

 _______________
|_xx20____PSA___|  Is a Variant of array type
                   - vbArray flag (2000h) set
                   - stores a Pointer to SAFEARRAY (PSA)
 _________
|___PSA___|        Is a Dynamic array variable
                   or simply Pointer to SAFEARRAY (PSA)

PSA2               means array copy operation has been performed

The Array assignment process

Arrays of scalar assignment

The source SAFEARRAY (SA) has no special Feature Flag set:

  • allocation of new SA
  • allocation memory space for new elements
  • memcopy elements
Dim arr(5) As Long
Dim varr() As Long
varr = arr
                    ________
               arr |   SA   |    _____ _____ _____ _____ _____ _____
                   |        |-->|__0__|__1__|__2__|__3__|__4__|__5__|
                   |________|


                                      memcopy
                    ________      ---------------->
+-----------+      |   SA   |    _____ _____ _____ _____ _____ _____
|    PSA    | ---> |        |-->|__0__|__1__|__2__|__3__|__4__|__5__|
+-----------+      |________|
    varr            alloc new               alloc new

Arrays of String assignment

The source SAFEARRAY (SA) has FADF_BSTR Feature Flag set:

  • allocation of new SA
  • allocation memory space for new elements
  • loop through each element, allocate new BSTR and copy string content (deep copy)
Dim arr(5) As String
Dim varr() As String
varr = arr
                    ________
               arr |   SA   |    _______ _______ _______ _______ _______
                   |        |-->|_______|_______|_______|_______|_______|
                   |________|     |    ___________                |    _______
                                   -->|BSTR1______|    ...   ...   -->|BSTRn__|

                                    |       |       |       |       |
                                    |       |       |       |       |    deep
                                    |       |       |       |       |    copy
                    ________        V       V       V       V       V
+-----------+      |   SA   |    _______ _______ _______ _______ _______
|    PSA    | ---> |        |-->|_______|_______|_______|_______|_______|
+-----------+      |________|     |    ___________                |    _______
    varr            alloc new      -->|BSTR2______|    ...   ...   -->|BSTRm__|

Arrays of Variant assignment

In a Variant array each Variant element can hold any data type.

The source SAFEARRAY (SA) has FADF_VARIANT Feature Flag set:

  • allocation of new SA
  • allocation memory space for new elements

Loop through each element and decide the copy method:

  • Variant of scalar: memcopy
  • Variant of String: allocate new BSTR and copy string content
  • Variant of Array: call Array assignment recursively
Dim arr(5)
arr(0) = "hi"
arr(1) = 100
arr(2) = 60000
arr(3) = Array("s", 1, 2)
Dim varr()
varr = arr
                    ________
               arr |   SA   |    ___________  ___________  ___________  ___________
                   |        |-->|08_________||02_________||03_________||0C20_______|
                   |________|     |    ________                          |   ______
                                   -->|BSTR1___|                          ->|  SA1 |      _______ _______ _____
                                                                            |______| --> |_______|_______|_____
                                                                                           |    _______
                                                                                            -->|BSTRn__|
                                      |            |            |            |
                                      |deep        |simple      |simple      |recursive
                                      |copy        |copy        |copy        |deep copy
                    ________          V            V            V            V               |       |      |
+-----------+      |   SA   |    ___________  ___________  ___________  ___________          |deep   |      |
|    PSA    | ---> |        |-->|08_________||02_________||03_________||0C20_______|         |copy   |      |
+-----------+      |________|     |    ________                          |   ______          V       V      V
    varr            alloc new      -->|BSTR2___|                          ->|  SA2 |      _______ _______ _____
                                                                            |______| --> |_______|_______|_____
                                                                                           |    _______
                                                                                            -->|BSTRm__|


Array self-assignment test

Consider:

ReDim varr(5)
varr = varr

What will happen? Will the run-time evaluate rvalue to a PSA, free lvalue - which frees the same PSA, then copy a null-ptr? No. The run-time recognizes that both are the same and nothing happens.

This holds also in a Sub, in which we try to trick the run-time by calling it with (varr, varr). No change will happen:

Private Sub SelfArrayAssignmentTest(varr1(), varr2())
    varr1 = varr2
End Sub

Under the hood a COM function would do something like this (just guessing here):

void ArrayAssignment(SAFEARRAY **ppsa1, SAFEARRAY **ppsa2)
    if (*ppsa1==*ppsa2) return;
    ... process copy
End Sub

Note on array assignment from Dictionary.Keys() and Dictionary.Items()

The Variant function → Array assignment exception, when array copy operation is performed is something to consider when choosing the data type of the lvalue variable. Some build-in functions return Variant holding an array, f. ex. Scripting.Dictionary's Keys() function. It is declared as:

Function Keys()

Keys() returns Variant - holding an Array of Variant:

  • when receiving this Variant into a local Variant variable no array copy is made.
  • but receiving this Variant into a local Array of Variant variable will do perform full array copy.
Sub dictarr()
    Dim dict As Scripting.Dictionary
    Set dict = New Scripting.Dictionary

    Dim v               ' Variant
    Dim varr()          ' Array of Variant

    v = dict.Keys       ' no array copy
    varr = dict.Keys    ' full array copy

End Sub

Something to keep in mind - especially when the Dictionary stores thousands of Keys and Items.

Also, the array returned by Keys() is already a copy. Calling Keys() makes a fresh array copy from the internal keys each time of access; the returned PSA is different each time (and the keys cannot be changed in the Dictionary instance through this array either).

This has a possible consequence when f. ex. we try to retrieve the 5-th index key. The code would be:

dict.Keys(5)

Somehow compiles and runs fine but how does this work? Isn't Keys() a function without arguments?

The VBA syntax sometimes is really annoying:

  • calling functions without parameters the parenthesis is optional
  • both array-indexing and function call uses parenthesis ()

These are equivalent and we were just lucky that the compiler evaluated dict.Keys first as function call.. then used array-indexing for dict.Keys(5):

dict.Keys     <---->  dict.Keys()

dict.Keys(5)  <---->  dict.Keys()(5)

So dict.Keys(5) first makes the call and receives the full array of keys.. then indexing into the array. Each time. The most effective solution would be to retrieve and store the key array into a Variant(!) - then use indexing to retrieve (possible) multiple key values or iterate through. Or maybe some optimization is going on in the background I don't know.

March 12, 2024

VBA ParamArray and the built-in Array() function

Last modified: Tue Mar 12 22:02:10 UTC+0100 2024 © Attila Tarpai

ParamArray and the built-in Array() function

ParamArray is a VBA language construct to define a Sub/Function with variable number of arguments. ParamArray is an Optional argument, it must be declared last as Array of Variant with the ParamArray keyword. Optional|ByVal|ByRef is not valid.

ParamArray is essentially an Array built by the caller from zero or more arguments:

 arg1    arg2    arg3    arg4    arg5   ...     call "1st", 100                     <-- without optional parameters
  |       |       |       |       |      |      call "2nd", 200, coll, Array(1,2,3) <-- 2 optional parameters
  |       |       |_______|_______|______|      call "3rd", 300, 9,8,7,6,5          <-- 5 optional parameters
  |       |       |caller construct
  |       |       |
  V       V       V

param1  param2  ParamArray                      Sub (s As String, i As Integer, ParamArray p())
                  |_ param3
                  |_ param4
                  |_ param5
                  |_ ...

The Sub expects 3 pointers from the caller (all declared implicite ByRef). ParamArray is passed as any other Array, an address of a PSA - Pointer to SAFEARRAY structure. It is Dynamic Array but has some compile-time restrictions:

  • cannot be passed further as ByRef Variant
  • cannot ReDim/Erase

ParamArray's PSA is always valid and points to an allocated SAFEARRAY structure. This can be an empty array, which shows up as Variant(0 to -1) in Debug Windows (because cElements=0 of the SAFARRAYBOUND structure).

The Sub/Function accesses ParamArray's elements similar to elements of any other array: p(0), p(1) or For..Next/For Each.

ParamArray is ByRef

The specialty of ParamArray is how arguments are converted to Variants by the caller. Being parameters it follows the principle of passing ByRef, the default mechanism in VBA.

Consider a Sub with ParamArray:

Sub fParamArray(ParamArray a())
    a(0) = 100
    a(2) = "H"
End Sub

and another one calling fParamArray with 4 arguments:

Sub passParamArray()
    Dim L As Long, s As String
    L = 1
    s = "h"
    Debug.Print L, s
    fParamArray L, 254, s, "hi"
    Debug.Print L, s
End Sub

It prints the following showing L and s were passed ByRef and has changed the caller's local variables:

 1            h
 100          H

64-bit VBA ParamArray ByRef BUG

It concerns only one variable data type: LongLong. When we try to change a LongLong variable passed in ParamArray there is this error message:

Variable uses an Automation type not supported in Visual Basic (Error 458)

The test code:

Sub fParamArray(ParamArray a())
    a(0) = -1
End Sub

Sub passParamArray()
    Dim LL As LongLong
    fParamArray LL   ' ERROR
    fParamArray 10^  ' OK
End Sub

All other data types are writable, as above. Passing LongLong literal in ParamArray raises no error - but it's also meaningless and has no effect to change a temp Variant/LongLong variable made by the caller. My only guess is that in this particular case they forgot to extend to accept the VTYPE = 4014h (VT_BYREF|vbLongLong) Variant type for 64-bit VBA. VBA uses a subset of the VTYPE enum values defined by COM and there is always a run-time check.

The built-in Array() function

Member of VBA._HiddenModule and declared as:

Function Array(ParamArray ArgList() As Variant) As Variant

Array() takes its arguments, converts them to Variants and puts them into a Variant Array. Then returns the array in a Variant.

The array returned is dynamic and can be ReDim-ed. The Variant is always of array type even when Array() has been called without arguments.

What's really gonna bake your noodle is that Array() is declared with one ParamArray. So caller constructs the ParamArray as above, passes the address of the result array to Array(), which will assign this array to Variant and that is the return value. Although parameters from variables are received ByRef, the array-assigment-to-Variant operation will go through every element and performs assigment-to-Variant for each one using (possible deep) value copy. This is normal operation of array assignment in VBA when the FADF_VARIANT Feature Flag is set in the SAFEARRAY structure.

ParamArray vs Array()

So there is a considerable difference passing arguments in ParamArray vs using Array() due to how the mechanism builds up the Variant array and fill in with values.

Both takes its arguments and converts them to Variant implicitely but

ParamArray:

  • for variables other than Variant it creates a VT_BYREF-Variant with a pointer to the variable - equivalent to pass-as-Variant
  • for literals it uses assigment-to-Variant

Array():

  • uses assigment-to-Variant for every argument passed. Assignment-to-Variant is value copy and for String and Array datatypes, that are value types in VBA, it makes a deep copy (note BSTR2).
fParamArray L, 254, s, "hi"                         Array(L, 254, s, "hi")


ParamArray                                          Array()
  |   ______________       _____                      |   ______________
  |_ |0340_____ptr__| --> |___1_|                     |_ |0300_______1__|
  |   VT_BYREF             L                          |
  |   ______________                                  |   ______________
  |_ |0200_____254__|                                 |_ |0200_____254__|
  |                                                   |
  |   ______________       __________                 |   ______________
  |_ |0840_____ptr__| --> |__->BSTR1_|                |_ |0800__->BSTR2_|
  |   VT_BYREF             s                          |
  |   ______________                                  |   ______________
  |_ |0800__->BSTR__|                                 |_ |0800__->BSTR__|

         Variants                                            Variants

VT_BYREF Variant for variables                      All Assigment-to-Variant
Assigment-to-Variant only for literals

Memory dump of these SAFEARRAY structures

Lets construct a mimic Array() function first based on the ParamArray-Variant assignment assumption and then compare the outcome:

Function Array2(ParamArray ArgList() As Variant) As Variant
    Array2 = ArgList
End Function

These outputs were made using vbadump.bas (GitHub). Comparison of SAFEARRAY memory dumps between ParamArray, Array() and UDF Array2():

Passing 4 parameters:                              Passing empty:

--------------------------------------------       --------------------------------------------
fParamArray L, 254, s, "hi"                        fParamArray

SAFEARRAY STRUCTURE:                               SAFEARRAY STRUCTURE:
Offs          Name          Value Hex              Offs          Name          Value Hex
-4            VbVarType     00000000               -4            VbVarType     00000000
 0            cDims         0001                    0            cDims         0001
 2            fFeatures     0880                    2            fFeatures     0880
 4            cbElements    00000018                4            cbElements    00000018
 8            cLocks        00000000                8            cLocks        00000000
 16           pvData        000001CCC23B9DA0        16           pvData        000001CCC95651C0
 24           cElements     00000004                24           cElements     00000000
 28           lLbound       00000000                28           lLbound       00000000
Flags: FADF_VARIANT|FADF_HAVEVARTYPE               Flags: FADF_VARIANT|FADF_HAVEVARTYPE

--------------------------------------------       --------------------------------------------
Array(L, 254, s, "hi")                             Array()

SAFEARRAY STRUCTURE:                               SAFEARRAY STRUCTURE:
Offs          Name          Value Hex              Offs          Name          Value Hex
-4            VbVarType     0000000C               -4            VbVarType     00000000
 0            cDims         0001                    0            cDims         0001
 2            fFeatures     0880                    2            fFeatures     0880
 4            cbElements    00000018                4            cbElements    00000018
 8            cLocks        00000000                8            cLocks        00000000
 16           pvData        000001CCC23B9E80        16           pvData        0000000000000000
 24           cElements     00000004                24           cElements     00000000
 28           lLbound       00000000                28           lLbound       00000000
Flags: FADF_VARIANT|FADF_HAVEVARTYPE               Flags: FADF_VARIANT|FADF_HAVEVARTYPE
--------------------------------------------       --------------------------------------------

UDF Array2(L, 254, s, "hi")                        UDF Array2()

SAFEARRAY STRUCTURE:                               SAFEARRAY STRUCTURE:
Offs          Name          Value Hex              Offs          Name          Value Hex
-4            VbVarType     0000000C               -4            VbVarType     00000000
 0            cDims         0001                    0            cDims         0001
 2            fFeatures     0880                    2            fFeatures     0880
 4            cbElements    00000018                4            cbElements    00000018
 8            cLocks        00000000                8            cLocks        00000000
 16           pvData        000001CCC8F7E830        16           pvData        0000000000000000
 24           cElements     00000004                24           cElements     00000000
 28           lLbound       00000000                28           lLbound       00000000
Flags: FADF_VARIANT|FADF_HAVEVARTYPE               Flags: FADF_VARIANT|FADF_HAVEVARTYPE
--------------------------------------------       --------------------------------------------

Some interesting observations:

  • The arrays returned by Array() and our Array2() function are identical: so we guessed just about right. Both are Dynamic and can be ReDim-ed.
  • Calling Array() without arguments is legal in VBA and will return a special SA: the element number is zero - it is a copy of the empty ParamArray.
    This appears as Variant(0 to -1) in Debug windows.
  • FADF_HAVEVARTYPE is set for all these arrays but VbVarType at -4 is not updated for all.
  • in array assigment, when rvalue has zero VbVarType but proper Flags, the lvalue array will have proper VbVarType value set (Array2 = ArgList)
  • Empty ParamArray pvData is set and points to..? No idea where.

March 11, 2024

VBA Array Internals: Memory layout

Last modified: Mon Mar 11 22:08:49 UTC+0100 2024 © Attila Tarpai

VBA Array Internals I.
Memory layout, structures and ReDim Preserve

There are two types of array declaration in VBA:

  • Dynamic Array: Dim varr()
  • Fixed-size Array: Dim arr(3)

VBA Arrays are allocated SAFEARRAY structures based on underlying COM/Automation Interface:

Fixed-size Array variable in memory:

   +-------------+
   |  SAFEARRAY  |
   |   STRUCT    | pvData   _____ _____ _____ _____
   |             | ------> |_____|_____|_____|_____| Elements
   |             |
   +-------------+
     Dim arr(3)


Dynamic Array variables are pointers (can be null-ptr):

   +-------------+
   |   Pointer   | ----> SAFEARRAY STRUCT
   +-------------+
     Dim varr()

Declaration of a Fixed-size Array variable allocates a SAFEARRAY structure along with initialized Elements.

Declaration of a Dynamic Array variable allocates a pointer, initially null-pointer. It appears f. ex. as Variant() in the Debug Windows without bounds. Later ReDim will allocate the SAFEARRAY structure with initialized Elements and sets the variable.

PSA = Pointer to SAFEARRAY

VBA works with Pointer to SAFEARRAY (PSA) when dealing with Arrays:

  • Dynamic Array variable is a PSA
  • passing Array to Sub/Function is passing an address of a PSA (passing Array is ByRef)
  • returning array from Array type functions is a PSA
  • Variant of array type stores a PSA (VT_ARRAY flag set)

Passing Arrays to procedures

Passing arrays is always ByRef: caller passes the memory address of a Pointer to SAFEARRAY (PSA). This PSA can be itself a Dynamic Array variable or a temp PSA made by the caller for Fixed-size Arrays. This ensures that the called procedure will access the array in the same way. It also checks certain SAFEARRAY Flags to prevent ReDim f. ex. on Fixed-size Arrays at run-time. The PSA can also be a null-ptr.

Sub ([ByRef] a())

 _________                 _________
|_________| -------+--->  |___PSA __| ----> SAFEARRAY STRUCT
     a             |       _________
 address of        +--->  |__PSA=0__|
    PSA

Note that this mechanism allows the called procedure to work directly on the array and its elements; it can also re-assign the variable and even execute ReDim and Erase for Dynamic Array variables passed.

Functions returning array

Functions can only return Dynamic Arrays, which is a Pointer to SAFEARRAY (PSA). In VBA the function return value is a hidden local variable, as if it was declared as the very first local variable of the function. End|Exit Function makes this value available for the caller.

Private Function f() As String()
    [Dim f() As String]           <-- as if declared here (PSA)
    Dim I As Integer
    Dim L As Long
    ...
End Function

On entering the function, before ReDim or assignment, the return value is a null-ptr. Usage and syntax is the same as for other Dynamic Array variables.

  • After ReDim memory space for a SAFEARRAY structure and elements are allocated and the variable is set.
  • The function variable can be lvalue in array assignment
  • can be Erase-d and set back to null-ptr

Variant of array type

The Variant stores a Pointer to SAFEARRAY (PSA) and the VT_ARRAY flag is set:

Dim v
ReDim v(3)

0C20-000000000000-B0BB8A8F9D010000-0000000000000000 (64-bit VBA Variant)

+--+--+-----+-----+-----+-----------------------+-----------------------+
|0C 20|    reserved     |         PSA           |           0           |   64-bit VBA Variant
+--+--+-----+-----+-----+-----------------------+-----------------------+
                              |
                              |        +---------------+
                              +------> |   SAFEARRAY   |
                                       |   STRUCTURE   |
                                       +---------------+

PSA in Variant can also be null-ptr (after Erase statement). 2000h is the VT_ARRAY = vbArray flag.

The SAFEARRAY STRUCTURE

See in Oaidl.h

This VBA Type maps correctly on both 32/64-bit VBA:

Private Type SAFEARRAY
  cDims As Integer              ' number of dimensions: 1, 2, ..
  fFeatures As Integer          ' feature flags (fix/dynamic etc.)
  cbElements As Long            ' size of one array element
  cLocks As Long                ' array locked
  pvData As LongPtr             ' pointer to array data
End Type

On the very next memory address immediately follows one SAFEARRAYBOUND structure for each dimension:

Private Type SAFEARRAYBOUND
  cElements As Long
  lLbound As Long
End Type

SAFEARRAYBOUND stores the allowed index lower bound and the number of elements in a dimension. So the formula VBA uses in Debug Windows is (LBound To LBound + cElements - 1). That's why calling Array() without arguments - or passing no arguments to a ParamArray - shows f. ex. Variant(0 to -1). These are empty, but allocated arrays with cElements=0.

The SAFEARRAY structure compiles to a little larger structure on 64-bit VBA: pvData takes 12 bytes (4 byte padding to align 8 and an 8-byte pointer).

32/64-bit SAFEARRAY structure difference:

ReDim varr(3 To 6, 1 To 2) As Integer

0          4
+----------+              32-bit
|   PSA    | --------->   SAFEARRAY  02 00                    cDims
+----------+                         80 00                    fFeatures
Dim varr()                           02 00 00 00              cbElements
32-bit VBA                           00 00 00 00              cLocks
                                     50 EA A9 01              pvData (32-bit)
                                     02 00 00 00 01 00 00 00  rgsabound[0]
                                     04 00 00 00 03 00 00 00  rgsabound[1]

0               8
+---------------+         64-bit
|      PSA      | ----->  SAFEARRAY  02 00                    cDims
+---------------+                    80 00                    fFeatures
Dim varr()                           02 00 00 00              cbElements
64-bit VBA                           00 00 00 00              cLocks
                                     00 00 00 00              align(8) padding
                                     E0 2D 50 EA A9 01 00 00  pvData (64-bit)
                                     02 00 00 00 01 00 00 00  rgsabound[0]
                                     04 00 00 00 03 00 00 00  rgsabound[1]

Feature flags and VBA Array types

These are the flags defined for the fFeatures field of the SAFEARRAY structure:

typedef enum tagADVFEATUREFLAGS
{
   FADF_AUTO = 0x0001,
   FADF_STATIC = 0x0002,
   FADF_EMBEDDED = 0x0004,
   FADF_FIXEDSIZE = 0x0010,
   FADF_RECORD = 0x0020,
   FADF_HAVEIID = 0x0040,
   FADF_HAVEVARTYPE = 0x0080,
   FADF_BSTR = 0x0100,
   FADF_UNKNOWN = 0x0200,
   FADF_DISPATCH = 0x0400,
   FADF_VARIANT = 0x0800
} ADVFEATUREFLAGS;

The VBA run-time uses these flags:

  • FADF_HAVEVARTYPE flag is normally set for both Dynamic- and Fixed-size Arrays declared in VBA. That will also allocate an VbVarType Enum value at offset -4 of the SAFEARRAY structure
  • For arrays of Object VBA sets the FADF_HAVEIID flag and there is an allocated GUID value at offset -16
  • External UDT-s, f. ex. an array of Type mscorlib.Decimal will have proper IRecordInfo interface pointer at -4/8 of the SAFEARRAY and the FADF_RECORD flag is set (no other flags are set).
  • An array of User Type defined in the a Standard Module or Class is somewhat incomplete: a pointer is allocated before SAFEARRAY but its value is zero and no FADF_RECORD flag is set. These
   FADF_HAVEVARTYPE         FADF_HAVEIID             FADF_RECORD              NONE

   +-------------+          +-------------+          +-------------+          +-------------+
-4 |  VbVarType  |       -16|    GUID     |     -4/8 | IRecordInfo |     -4/8 |      0      |
   +-------------+          +-------------+          +-------------+          +-------------+
   |  SAFEARRAY  |          |  SAFEARRAY  |          |  SAFEARRAY  |          |  SAFEARRAY  |
   |   STRUCT    |          |   STRUCT    |          |   STRUCT    |          |   STRUCT    |
   |             |          |             |          |             |          |             |
   |             |          |             |          |             |          |             |
   +-------------+          +-------------+          +-------------+          +-------------+

    Dim a(5)                  Dim a(5)                Dim AT(2)                Dim myarr(2)
    Dim a(5) As String        As Object               As mscorlib.Decimal      As MyType
    Dim a(5) As Long

Arrays of String set FADF_BSTR, arrays of Variant set FADF_VARIANT. These flags are important for COM routines when it comes to array assigment and how to copy element values - see later. Arrays of all Object types set FADF_DISPATCH in VBA.

Fixed-size Arrays declared in VBA additionally set both FADF_FIXEDSIZE and FADF_STATIC flags.

Memory layout of array data and dimensions

Lets make a 2D array in VBA and fill it up with values:

Dim arr(3 To 6, 1 To 2) As Byte
    arr(3, 1) = &H31
    arr(3, 2) = &H32
    arr(4, 1) = &H41
    arr(4, 2) = &H42
    arr(5, 1) = &H51
    arr(5, 2) = &H52
    arr(6, 1) = &H61
    arr(6, 2) = &H62

This array contains 4 x 2 = 8 Byte elements.

Interestingly, the second dimension's SAFEARRAYBOUND is stored first:

rgsabound[0] = 02 00 00 00 01 00 00 00  <--- 2nd dim
rgsabound[1] = 04 00 00 00 03 00 00 00  <--- 1st dim

Dumping pvData..

000001BA1304BCA0: 31 41 51 61 32 42 52 62 | 9B F8 E4 20 BA 41 00 8E | 1AQa2BRb›Ã¸Ã¤ ºA.Ž
                 |__|__|__|__|
                    1st dim
                 |___________|___________|
                          2nd dim

Elements of the first dimension is always stored at continuous memory locations, tightly packed on the data's natural boundaries (this is also the cbElements value). The second dimension lays out storage for the 1st dimension cElements times of the 2nd dimension SAFEARRAYBOUND.

By increasing the number of dimensions the memory layout will follow this algorithm:

            ---------------------------> increasing memory adddresss
                                                                                            cElements

1st dim     |__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|       4

2nd dim     |___________|___________|___________|___________|___________|___________|       2

3rd dim     |_______________________|_______________________|_______________________|       3

4th dim     |_______________________________________________________________________|____   ..

so on..
                           Dim arr(0 To 3, 0 To 1, 0 To 2, ..)

Every higher dimension multiplies the whole previous memory block by cElements times of the corresponding SAFEARRAYBOUND.

Note that "rank" is declared opposite to what f. ex. C language people are used to. The storage will be the same.

VBA ARRAY                          C ARRAY

arr(rank1, rank2, ...)             arr[..][rank2][rank1]

higher dimensions --->             <--- higher dimensions

Another major difference is that a multi-dimensional VBA array is not an array of arrays, like in C. We can address only individual elements.

Understanding the memory layout of n dimensions above will hopefully help why only the last dimension's (upper) bound can be changed in VBA in ReDim Preserve.

ReDim Preserve

It is possible in VBA to resize an array by keeping its content using ReDim Preserve to less or more elements.

Executing ReDim Preserve first allocates memory space for n' elements, then:

  • if n' > n then EXPAND: memcopy n elements and initialize new elements
  • if n' < n then SHRINK: memcopy only n' elements (truncation)

The previous memory block is freed. Note that array data copy is simple memcopy. If n' = n no changes are made.

1D-array ReDim Preserve

Only the upper bound can be changed to shrink or expand the array:

ReDim varr(3 To 6)
 _____ _____ _____ _____
|__3__|__4__|__5__|__6__|


ReDim Preserve varr(3 To 5)
 _____ _____ _____
|__3__|__4__|__5__|  <--- shrink


ReDim Preserve varr(3 To 7)
 _____ _____ _____ _____ _____
|__3__|__4__|__5__|__6__|__0__| ---> expand

2D-array ReDim Preserve

Only the second dimension's upper bound can be changed:

ReDim varr(3 To 6, 1 To 3)         ReDim Preserve                     ReDim Preserve
                                   varr(3 To 6, 1 To 2)               varr(3 To 6, 1 To 4)

 _____ _____ _____ _____            _____ _____ _____ _____            _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  1       |__3__|__4__|__5__|__6__|  1       |__3__|__4__|__5__|__6__|  1
 _____ _____ _____ _____            _____ _____ _____ _____            _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  2       |__3__|__4__|__5__|__6__|  2       |__3__|__4__|__5__|__6__|  2
 _____ _____ _____ _____                                               _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  3                shrink                    |__3__|__4__|__5__|__6__|  3
                                                                       _____ _____ _____ _____
                                                                      |__0__|__0__|__0__|__0__|  4

                                                                                expand

3D- and higher array ReDim Preserve

Only the last dimension's upper bound can be changed: as everything before the last dimension will be added or removed x times.

ReDim varr(3 To 6, 1 To 2, 8 To 10)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  10


ReDim Preserve varr(3 To 6, 1 To 2, 8 To 9)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9     <--- shrink


ReDim Preserve varr(3 To 6, 1 To 2, 8 To 11)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  10
 _____ _____ _____ _____  _____ _____ _____ _____
|__0__|__0__|__0__|__0__||__0__|__0__|__0__|__0__|  11    <--- expand

March 10, 2024

VBA Array assignment BUG

Last modified: Sun Mar 10 23:40:49 UTC+0100 2024 © Attila Tarpai

The VBA Array assignment BUG

I bumped into something weird and erroneous when I was trying to find a way to return arrays from functions without array copy.

Sample code in this GitHub repo.

Abstract

UDF functions returning arrays in a VBA Standard Module after ReDim hold an address of an incomplete SAFEARRAY structure without Feature Flags.

This could be fatal if the Array returned is used later in another array assignment and the array is of String or Variant type. Excel may crash and exit unexpectedly.

Tested in VBA 7.1 with 32-bit Office 2016 and 64-bit Office 365.

Intro

A common way to work with functions returning Array is to Dim or ReDim a local array in the function, fill with values and use array-assignment to the function before exit:

Private Function return_arr_string() As String()
    Dim arr(2) As String
    arr(0) = "hi"
    return_arr_string = arr
End Function

But array-assignment can be expensive. Arrays are value types in VBA and array assignment copies the whole array content. F. ex. for string arrays this involves several string copies. The new array allocates new BSTR for every element and copies the strings one by one.. something we would like to avoid here.

Function returning array

In VBA the function return value is a hidden local variable, as if it was declared as the very first local variable of the function. Functions returning array is no exception: a hidden Pointer to SAFEARRAY (PSA) is allocated (functions can return only Dynamic Arrays). Initially, before ReDim or assignment, a null-ptr and the usage, syntax is the same as with other Dynamic Array variables.

Private Function return_arr_string() As String()
    [Dim return_arr_string() As String]           <- as if declared here 

When this function returns, its PSA is available for the caller. When the value is used as rvalue in a subsequent assignment, no array copy is made only a simple pointer assignment. VBA is smart enough to recognize that the return value variable goes out of scope, thus avoiding unnecessary array copies.

OK, so lets ReDim the function variable itself, being a dynamic array variable, what can go wrong:

Private Function return_arr_string() As String()
    ReDim return_arr_string(2)
    return_arr_string(0) = "hi"
End Function

OOPS, that does not compile. There is an unlucky VBA language construct: both array-indexing and function call uses parenthesis (), and the line above interpreted as function call.

OK, no problem, lets pass the array to another procedure to work on. Passing arrays is always ByRef in VBA (it has to be) with only a pointer passed:

Private Function return_arr_string() As String()
    ReDim return_arr_string(2)
    fill_arr_string return_arr_string
End Function

Private Sub fill_arr_string(arr() As String)
    arr(0) = "hi"
End Sub

Now this is nice, goal achieved, we can return a newly allocated and populated array without making array copies.

But wait.

The BUG

Here comes the error, bug, language feature or something weird.

Lets see the code first with String Arrays then analyze it:

Private Sub array_assignment_bug_string_array()
    Dim arr() As String, arr2() As String
    arr = return_arr_string
    arr2 = arr
    Debug.Print Hex(StrPtr(arr(0)))
    Debug.Print Hex(StrPtr(arr2(0)))
End Sub

Private Function return_arr_string() As String()
    ReDim return_arr_string(2)
    fill_arr_string return_arr_string
End Function

Private Sub fill_arr_string(arr() As String)
    arr(0) = "not me"
End Sub

The code is simple - the explanation is not. We re-use the above construct for a local array, but after that make an array assignment to another local dynamic array.

It prints the following:

StrPtr(arr(0)):             27CC53F1C48
StrPtr(arr2(0)):            27CC53F1C48

This must be wrong, arr(0) and arr2(0) are 2 different String variables in 2 different Arrays holding the same BSTR memory address! (Test was run in 64-bit VBA7.1)

This was not supposed to happen:

arr2                     =    arr
 |___ String --> BSTR1        |___ String --> BSTR1
 |___ String --> BSTR2        |___ String --> BSTR2

This was supposed to happen. Array assignment with proper deep-copy:

arr2                     =    arr
 |___ String --> BSTR3        |___ String --> BSTR1
 |___ String --> BSTR4        |___ String --> BSTR2

So now what? We have two pointers into the BSTR buffer with the same memory address. What if we change one of them? Will the other reflect the change? What if one of them gets de-allocated? Will the other one point to a de-allocated memory location? What if after some time strings come and go in the BSTR buffer, what will be at that memory address one of them still holding?

Lets go gentle first and change only one character with the Mid() statement, in arr(0). The Mid() statement is the only way to manipulate BSTR characters directly without de- and re-allocation in VBA. StrPtr() will not change:

Private Sub array_assignment_bug_string_array()
    Dim arr() As String, arr2() As String
    arr = return_arr_string
    arr2 = arr
    Mid(arr(0), 1, 1) = "!"
    Debug.Print "arr(0): ", arr(0)
    Debug.Print "arr2(0): ", arr2(0)
End Sub

It is really not good, arr2(0) has changed too, the output:

arr(0):       !ot me
arr2(0):      !ot me

Assigning arr(0) to another string, again, and again and then print them sometimes

  • shows no change (probably re-allocation never used the memory area again)
  • shows random string fragments
  • I also managed to crash Excel once completely in F8-step mode

Clearly a corrupted BSTR buffer.

Analysis

The following analysis was done with vbadump.bas (on GitHub). We dump these arrays at various points of execution calling dump_safearray <array>, 1 (1 is for printing array elements as well).

The test was run in 32-bit VBA7.1 (Office 2016) but the result is exactly the same on 64-bit.

In return_arr_string() after ReDim:

dump_safearray return_arr_string, 1

Addr of SAFEARRAY = 0C450A38
SAFEARRAY STRUCTURE:
Offs          Name          Value Hex
 0            cDims         0001
 2            fFeatures     0000
 4            cbElements    00000004
 8            cLocks        00000000
 12           pvData        09600330
 16           cElements     00000003
 20           lLbound       00000000
Flags: NONE

09600330: 0C4B71C4 String
09600334: 00000000 String
09600338: 00000000 String

In the caller after arr = return_arr_string assignment:

dump_safearray arr, 1

Addr of SAFEARRAY = 0C450A38
SAFEARRAY STRUCTURE:
Offs          Name          Value Hex
 0            cDims         0001
 2            fFeatures     0000
 4            cbElements    00000004
 8            cLocks        00000000
 12           pvData        09600330
 16           cElements     00000003
 20           lLbound       00000000
Flags: NONE

09600330: 0C4B71C4 String
09600334: 00000000 String
09600338: 00000000 String

The array assignment from function arr = return_arr_string operates as expected, it's a simple pointer assignment of PSA:
- Addr of SAFEARRAY is the same
- address of elements in pvData is the same, so are the elements

And now for arr2 after arr2 = arr array assignment:

dump_safearray arr2, 1

Addr of SAFEARRAY = 2067AC78
SAFEARRAY STRUCTURE:
Offs          Name          Value Hex
 0            cDims         0001
 2            fFeatures     0000
 4            cbElements    00000004
 8            cLocks        00000000
 12           pvData        096005E8
 16           cElements     00000003
 20           lLbound       00000000
Flags: NONE

096005E8: 0C4B71C4 String
096005EC: 00000000 String
096005F0: 00000000 String

StrPtr(arr(0)):             C4B71C4
StrPtr(arr2(0)):            C4B71C4

The array assignment arr2 = arr went almost well:

  • a new SA allocated for arr2: Addr of SAFEARRAY is different
  • a new block of data allocated for elements of arr2: pvData is different
  • but elements has the same BSTR pointer!

Obviously only a simple memcopy happened without BSTR copies. But why?

Because these SAFEARRAY structures have no Features Flags set. The safearray returned by return_arr_string() is incomplete, has no flags, NONE.

Here is a proper String Array (well, it takes a while to write this article so now I'm at my VBA64 computer again):

ReDim arr(2) As String
dump_safearray arr, 1

Addr of SAFEARRAY = 0000027CC4CC9750
SAFEARRAY STRUCTURE:
Offs          Name          Value Hex
-4            VbVarType     00000008
 0            cDims         0001
 2            fFeatures     0180
 4            cbElements    00000008
 8            cLocks        00000000
 16           pvData        0000027CC517AFA0
 24           cElements     00000003
 28           lLbound       00000000
Flags: FADF_BSTR|FADF_HAVEVARTYPE

0000027CC517AFA0: 0000000000000000 String
0000027CC517AFA8: 0000000000000000 String
0000027CC517AFB0: 0000000000000000 String

The Feature Flags are properly set:

  • FADF_HAVEVARTYPE: there is a type at -4 (as most of Arrays in VBA has it)
  • FADF_BSTR: array contains BSTR pointers

The latter flag is the missing flag: COM array copy functions has to know this to perform String-by-String deep copy of the elements if set. Otherwise its simply a memcopy of the memory pointed by pvData.

So the question is, is this normal or a serious error in VBA?

Compile Error: Invalid ReDim

One remark here I read somewhere: implicitly declared Variants cannot hold an array, i.e. cannot ReDim-ed to be an array and there might be a connection here. The following function fv() returns Variant. Variable v ReDim is successful, as it is explicitly declared, but for the function variable itself raises Invalid ReDim:

Function fv() As Variant
    Dim v As Variant
    ReDim v(5)       ' OK
    ReDim fv(2)      ' ERROR: Invalid ReDim
End Function

Well, our UDF function variable is sort of implicitely declared. Do all UDF functions has this feature? I do not know the internal operation of Win32/COM/Automation VBA uses.

Function returning Array of Variant

A reminder how Variant stores Strings and Arrays: it stores a pointer. Variant/String a BSTR pointer, Variant/Array() a Pointer to SAFEARRAY (PSA):

+--+--+-----+-----+-----+-----------------------+-----------------------+
|08 00|    reserved     |      pointer          |           0           |   64-bit VBA Variant
+--+--+-----+-----+-----+-----------------------+-----------------------+
                              |
                              |        +--+--+--+--+--+--+--+--+-
                              +------> |     |     |     |     |
                                       +--+--+--+--+--+--+--+--+-
                                          BSTR unicode 2-byte characters


+--+--+-----+-----+-----+-----------------------+-----------------------+
|0C 20|    reserved     |         PSA           |           0           |   64-bit VBA Variant
+--+--+-----+-----+-----+-----------------------+-----------------------+
                              |
                              |        +---------------+
                              +------> |   SAFEARRAY   |
                                       |   STRUCTURE   |
                                       +---------------+

The same bug happens when the Function returning Array is of type Variant. We do exactly the same as with the String array, but also put another Array in the second element.

The test code:

Private Sub fill_arr_variant(v())
    v(0) = "string"
    v(1) = Array("a", 5)
End Sub

Private Function return_arr_variant() As Variant()
    ReDim return_arr_variant(2)
    fill_arr_variant return_arr_variant
End Function

Private Sub array_assignment_bug_variant_array()
    Dim varr(), varr2()
    varr = return_arr_variant
    dump_safearray varr, 1
    varr2 = varr
    dump_safearray varr2, 1
End Sub

The results are the same and worse. Not even BSTR pointers are the same but the SAFEARRAY pointers in two different Variant/Array variable are also the same: two variables hold the same memory address of an Array. Erase one or set the Variant to something else will definitely crash the application due to corrupted heap! I could do it several times during tests.

varr2                                        =   varr
 |____ Variant/String --> BSTR1                   |____ Variant/String --> BSTR1
 |____ Variant/Array  --> SA1                     |____ Variant/Array  --> SA1
                          |___ Variant/String                              |___ Variant/String
                          |___ Variant/Integer                             |___ Variant/Integer

The remedy

What if we specify the type for ReDim?

Private Function return_arr_string() As String()
    ReDim return_arr_string(2) As String  ' still NO flags..
    fill_arr_string return_arr_string
End Function

Nope. I got the same result, a SA without Flags.

The only solution, which helped:

Avoid ReDim the array variable of Functions returning Array in the function itself

If we move ReDim into the worker Sub, it creates a proper SAFEARRAY structure. Which is weird.. how come, fill_arr_string() gets an address of a null-pointer, what is the difference for ReDim? There is something not clear going on.

Private Function return_arr_string() As String()
    fill_arr_string return_arr_string
End Function

Private Sub fill_arr_string(arr() As String)
    ReDim arr(2)       ' OK
    arr(0) = "not me"
End Sub

But now we are slowly loosing the whole point of a function returning Array.. our main procedure could just do this:

    Dim arr() As String, arr2() As String
    fill_arr_string arr
    arr2 = arr

It is also a solution: avoid Functions returning Array - when String or Variant and later the array would be assigned to another array.


March 04, 2024

The VBA "stack" and calling mechanisms

Last modified: Sun Mar 10 09:00:29 UTC+0100 2024 © Attila Tarpai

The VBA "call-stack" and calling mechanisms

Using VarPtr() and some memory dump in a VBA Standard Module reveals the following. VBA run-time allocation of local variables, passing parameters, temp-variables, function calls uses a common memory area, which looks very similar to stack-frame structures used by other compilers, like C.

This VBA "stack" is certainly not the CPU stack - only that the memory layout and the operation looks very similar:

  • caller "pushes" parameters onto stack, then execution transferred to callee (call)
  • callee "pulls" the stack pointer to allocate space for local variables
  • when function code execution ends, both callee and the caller "cleans up" and restores the stack.

There is a little difference between 64- and 32-bit VBA run-time when it comes to memory layout:

  • 64-bit run-time
    We find the parameter- and local block together in memory (similar to a traditional stack).
  • 32-bit run-time
    The two blocks are separated in memory. Caller builds up the complete parameter block locally and the function receives a pointer to the parameter block together with locals.

The final operation will be the same for both, only parameter access differs. Here "push" means copy a value into the parameter block by the caller; the CALL-line tries to denote where execution transfers to another function:

increasing
memory address
  ^
  |
  |     64-bit VBA "call-stack"                             32-bit VBA "call-stack"
  |
  |      |                 |                                    |           |
  |      |     local1      |                                    |  local1   |
  |      |     local2      |                                    |  local2   |
  |      |     local4      |  Caller                            |  local4   |  Caller
  |      |_________________|  local block                       |___________|  local block
  |      |      ...        |                                    |   ...     |
  |      |      ...        |                                    |   ...     |
  |      |      ...        |                                    |   push    |
  |      |      ...        |                                    |     |     |
  |      |      ...        |                                    |_____V_____|
  |      |      push       |                                    |  p3       |  callee
  |      |        |        |                                    |  p2       |  parameter block
  |      |________V________|                                --> |  p1       |
  |      |     p3          |  callee                       |    |___________|
  |      |     p2          |  parameter block              |    |   ...     |
  |      |     p1          |                               |        ...
  |    __|_________________|________________ CALL          |        ...
  |      |     local1      |  callee                       |    |___________|________________ CALL
  |      |     local2      |  local block                   --- |  [ptr]    |
  |      |_________________|                                    |  local1   |  callee
  |      |      ...        |                                    |  local2   |  local block
  |      |      ...        |                                    |___________|
  |      |      ...        |                                    |   ...     |
  |      |                 |                                    |           |
  |      |        |        |                                    |     |     |
  |               |                                                   |
  |               V                                                   V

                STACK                                               STACK

VBA Function: return value, locals and parameters memory layout

This was made with vbadump and in 64-bit VBA.

Follows a VBA function in a Standard Module. It has local variables, returns LongPtr and has 2 ByVal arguments (so we can look at):

Private Function func(ByRef p1 As LongPtr, ByRef p2 As LongPtr) As LongPtr
    Dim L As Long
    Dim I As Integer
    Dim p As LongPtr
    p = -1
    L = -1
    I = -1
    func = 9
    memdump VarPtr(p), 10  ' peek
End Function

The "stack" memory layout:

Lets call this function with (1, 2) and dump locals starting from the memory address of p:

 "stack"
  memory
  address
    ^
    |    00000234EBCB29B8: 0000000000000000  ..
    |    00000234EBCB29B0: 0200000000000000  p2
    |    00000234EBCB29A8: 0100000000000000  p1
    |    00000234EBCB29A0: F08EC7EB34020000
    |    00000234EBCB2998: 0900000000000000  func
    |    00000234EBCB2990: FFFFFFFF00000000  L
    |    00000234EBCB2988: FFFF000000000000  I
    |    00000234EBCB2980: FFFFFFFFFFFFFFFF  p        <--- VarPtr(p)
    |

In 64-bit VBA all variables are naturally 8-byte aligned. The structure indeed looks like a stack-frame of a C compiler: locals on lower addresses, parameters higher.

Function "return value"

The Function "return value" is a hidden local variable, as if it was declared as the very first local variable in the function:

Private Function func(ByRef p1 As LongPtr, ByRef p2 As LongPtr) As LongPtr
    [Dim func As LongPtr]
    Dim L As Long
    Dim I As Integer

It makes sense, the syntax is also the same: we use "func" as if it were a normal variable, like func=100, and on exiting the function the run-time takes care of it to return its value somehow to the caller.

With this method we can see the memory map for other, more "interesting" data types by inserting a Dim statement between L and I:

00000234EBCB29B0: 0200000000000000  p2
00000234EBCB29A8: 0100000000000000  p1
00000234EBCB29A0: F08EC7EB34020000
00000234EBCB2998: 0900000000000000  func
00000234EBCB2990: FFFFFFFF00000000  Long
00000234EBCB2988: 0000000000000000  String      <-- null-pointer
00000234EBCB2980: FFFF000000000000  Integer
00000234EBCB2978: FFFFFFFFFFFFFFFF  LongPtr

000001E8925DC050: 0200000000000000  p2
000001E8925DC048: 0100000000000000  p1
000001E8925DC040: 28D0689AE8010000
000001E8925DC038: 0900000000000000  func
000001E8925DC030: FFFFFFFF00000000  Long
000001E8925DC028: 0000000000000000  --
000001E8925DC020: 6400000000000000    |  64-bit Variant = 100 (vbInteger=2)
000001E8925DC018: 0200000000000000  --
000001E8925DC010: FFFF000000000000  Integer
000001E8925DC008: FFFFFFFFFFFFFFFF  LongPtr

00000234EBCB29B0: 0200000000000000  p2
00000234EBCB29A8: 0100000000000000  p1
00000234EBCB29A0: F08EC7EB34020000
00000234EBCB2998: 0900000000000000  func
00000234EBCB2990: FFFFFFFF00000000  Long
00000234EBCB2988: 0000000000000000  Object      <-- null-pointer
00000234EBCB2980: FFFF000000000000  Integer
00000234EBCB2978: FFFFFFFFFFFFFFFF  LongPtr

00000234EBCB29B0: 0200000000000000  p2
00000234EBCB29A8: 0100000000000000  p1
00000234EBCB29A0: F08EC7EB34020000
00000234EBCB2998: 0900000000000000  func
00000234EBCB2990: FFFFFFFF00000000  Long
00000234EBCB2988: 0000000000000000  varr()      <--- Dynamic Array, null-pointer
00000234EBCB2980: FFFF000000000000  Integer
00000234EBCB2978: FFFFFFFFFFFFFFFF  LongPtr

00000234EBCB29B0: 0200000000000000  p2
00000234EBCB29A8: 0100000000000000  p1
00000234EBCB29A0: F08EC7EB34020000
00000234EBCB2998: 0900000000000000  func
00000234EBCB2990: FFFFFFFF00000000  Long
00000234EBCB2988: 0400000000000000  --
00000234EBCB2980: 70586C8134020000    |
00000234EBCB2978: 0000000000000000    |  varr(3)  <--- Fixed-size array: 64-bit SAFEARRAY
00000234EBCB2970: 0100920818000000    |                                         STRUCTURE
00000234EBCB2968: 000000000C000000  --
00000234EBCB2960: FFFF000000000000  Integer
00000234EBCB2958: FFFFFFFFFFFFFFFF  LongPtr

00000234EBCB29B0: 0200000000000000  p2
00000234EBCB29A8: 0100000000000000  p1
00000234EBCB29A0: F08EC7EB34020000
00000234EBCB2998: 0900000000000000  func
00000234EBCB2990: FFFFFFFF00000000  Long
00000234EBCB2988: 3100320033003400  String * 4 = "1234" <--- 2x4=8 bytes of unicode fix-string
00000234EBCB2980: FFFF000000000000  Integer
00000234EBCB2978: FFFFFFFFFFFFFFFF  LongPtr

VBA calling mechanisms and memory map

Here we look at how parameters are passed in memory between Sub/Function-s inside a Standard VBA Module. What exactly ByRef/ByVal means for different VBA data types, and what happens when a Sub/Function expects a Variant argument: how the caller performs the implicite conversion.

This was done dumping the "stack" way above the current Sub/Function to reveal the caller's local area as well.

For visualizing the so called "call-stack" of the running VBA code:

  • each box is an allocated data structure or local variable as it appears in memory dump
  • the caller function's local variables, temp variables are on the top, the stack "grows", then execution transferred to another function: that is the "call-line" --------
  • on entering a function the "stack-pointer" is at the last parameter "pushed". The function allocates further stack space for it's local variables etc.

These are the main data type groups we look at each:

  • scalars (1-8 bytes): Integer, Long, LongLong, Byte, Single, Double, Boolean, Currency, Date
  • Variant - 16/24-byte structure 32/64-bit VBA
  • String - is a pointer
  • Object - is a pointer
  • Dynamic Array - is a pointer
  • Fixed-size Array - is a structure

How VBA stores its data types is essential here to understand first - see it in another chapter.

The caller has some different types of local varaiables, f. ex.:

Private Sub caller()
    Dim L As Long
    Dim varr(3)
    Dim I As Integer
    Dim v
    Dim s As String
    Dim coll As Collection
    Dim B As Byte
    Dim CY As Currency
    Dim p As LongPtr

    ' calls different sub and pass values
End Sub


Passing matchig data types ByRef/ByVal

As a general rule:

ByRef:

  • caller passes a pointer to variable or a result of an expression stored on stack.
  • callee expects a pointer of a specific type. Compile time check.
  • callee dereferences the pointer quietly each time of access.
  • valid for all data types

ByVal:

  • push makes a copy of the variable or expression result
  • valid for all data types except Array

Quick comparison:

Pass ByRef                                                  Pass ByVal

 _______                                                     _______
|   I   |                    Dim I As Integer               |   I   |           Dim I As Integer
|_______|          <---      <-- VarPtr(i)                  |_______|
                       |
    |  push            |                                        |  push
    V  address of      |                                        V
 ______________        |                                     _______
|   pointer    |  i    |                                    | copy  |  i        <-- VarPtr(i)
|______________|                                            |_______|
---------------------------- Sub (ByRef i As Integer)       -------------------------- Sub (ByVal i As Integer)

ByRef: VarPtr(i) gives the address of the variable of the caller - i.e. the value stored in the passed pointer.
ByVal: VarPtr(i) gives the address of the pushed copy.

When a ByRef parameter is passed to another Sub again ByRef, the pointer itself is pushed to Sub2() so that Sub2() can work again on the original variable of the caller. Note VarPtr(i2):

Pass ByRef to Sub2:

 _______
|   I   |                    Dim I As Integer
|_______|          <---      <-- VarPtr(i) points here
                       |     <-- VarPtr(i2) also points here
    |  push            |
    V  address of      |
 ______________        |
|   pointer    |  i    |
|______________|
---------------------------- Sub (ByRef i As Integer)
                       |
                       |
                       |
    |  push            |
    V                  |
 ______________        |
|   pointer    |  i2   |
|______________|
---------------------------- Sub2 (ByRef i2 As Integer)


Passing Arrays

There are two types of array declaration in VBA:

  • Dynamic Array: Dim varr()
  • Fixed-size Array: Dim arr(3)

A Dynamic Array variable is a pointer to a COM SAFEARRAY structure (PSA). A Fixed-size Array is-a COM SAFEARRAY structure. For Dim that means that a Dynamic Array variable allocates pointer-size (4/8 bytes), while Fixed-size Array allocates the whole SAFEARRAY structure in local memory.

Passing array is:

  • only ByRef
  • callee expects an address of a Pointer to SAFEARRAY (PSA)
  • requires temp PSA by caller for Fixed-size Arrays
  • Dynamic Array variable is-a PSA (the mechanism will be the same as with general ByRef)
  • VarPtr() is N/A for Arrays
Pass Fixed-size Array:
 ______________
|              |   Dim arr(3)
|  SAFEARRAY   |
|  STRUCT      |
|              |   <---
|______________|       |
                       |
       |  temp ptr     |
       V               |                                    Pass Dynamic Array:
 ______________        |                                    ______________
|  PSA ->      |       |                                   |  PSA ->      |           Dim varr()
|______________|        <---                               |______________|   <---
                            |                                                     |
       |  push              |                                    |  push          |
       V  address of        |                                    V  address of    |
 ______________             |                               ______________        |
|    pointer   |            |                              |    pointer   |       |
|______________|                                           |______________|
------------------------------- Sub ([ByRef] varr())       -------------------------- Sub ([ByRef] varr())


Passing String

Pass String ByVal:                                         Pass String ByRef:

 ______________                                             ______________
|  --> BSTR1   |          Dim S As String                  |  --> BSTR    |          Dim S As String
|______________|                                           |______________|  <---
                                                                                 |
       |  push                                                 |  push           |
       V                                                       V  address of     |
 ______________                                             ______________       |
|  --> BSTR1   |                                           |   pointer    |      |
|______________|                                           |______________|
------------------------- Sub (ByVal s As String)          ------------------------- Sub (ByRef s As String)
       |  deep copy
       V
 ______________
|  --> BSTR2   |        <-- VarPtr(s) points here
|______________|            hidden local var

Passing String ByRef is the same mechanism as general ByRef.

Passing String ByVal:

Caller makes exact copy of the String variable for push.
It is callee that performs the actual deep-copy, allocates new BSTR and a copy of the original string bytes itself.

The resulting String variable is stored locally as if it was declared here - and VarPtr(i) gives the address of this "hidden" variable:

Sub (ByVal s As String)
  [Dim s As String]
  Dim ...
  ..
End Sub

Passing Variant ByVal

  • Caller pushes an exact copy of the Variant structure and makes the call.
  • Callee always allocates a new, hidden Variant locally and copies the Variant parameter. This is because at run-time the exact data type stored in the Variant is not known: callee will perform simple- or deep-copy based on the received VbVarType of the Variant.

Passing Variant of scalars ByVal

Pass Variant/Integer ByVal: simple copy for scalars

 ____ _________ ______________
|02  |         |      100     |   Dim V
|____|_________|______________|   V = 100

       |  push
       V
 ____ _________ ______________
|02  |         |      100     |
|____|_________|______________|
----------------------------------------- Sub (ByVal v)
       |  copy
       V
 ____ _________ ______________
|02  |         |      100     |     <-- VarPtr(v) points here
|____|_________|______________|     hidden local var

02 denotes vbVarType = vbInteger.

Note. This is also true for expressions results, f. ex. sub_v_ByVal 100: first a new Variant holding the result is created on stack, then this value is copied and pushed.

Passing Variant/String and Variant/Array ByVal

Passing Strings and Arrays ByVal stored in Variants uses the same mechanism but with deep-copy: on entering the function a new BSTR or a new SAFEARRAY structure will be allocated and content of the data it's holding copied.


Pass Variant/String ByVal                                      Pass Variant/Array ByVal

 ____ _________ ______________                                  ____ _________ ______________
|08  |         |  --> BSTR1   |   Dim V                        |VT_ARRAY      |  --> SA1     |   Dim V
|____|_________|______________|   V = "s"                      |____|_________|______________|   ReDim V(3)

       |  push                                                        |  push
       V                                                              V
 ____ _________ ______________                                  ____ _________ ______________
|08  |         |  --> BSTR1   |                                |VT_ARRAY      |  --> SA1     |
|____|_________|______________|                                |____|_________|______________|
-------------------------------------- Sub (ByVal v)           -------------------------------------- Sub (ByVal v)
       |  deep copy                                                   |  deep copy
       V                                                              V
 ____ _________ ______________                                  ____ _________ ______________
|08  |         |  --> BSTR2   |  <-- VarPtr(v) points here     |VT_ARRAY      |  --> SA2     |  <-- VarPtr(v) points here
|____|_________|______________|  hidden local var              |____|_________|______________|  hidden local var


08 denotes vbVarType = vbString. The VT_ARRAY flag is a VARENUM enumeration declared in wtypes.h of the Automation API (COM) - same as vbArray.

Passing other data types and expressions As Variant

This is the case when a Sub/Function is declared with a Variant argument:

Sub|Function (ByRef|ByVal v [As Variant])

Anything, any variable or expression can be passed to such a function ByRef or ByVal. This is possible because Variant can store any other data type and the conversion is implicite in VBA.

Callee still expects Variant - it is the caller that makes the necessary conversions.

Passing other data types As ByRef Variant

  • Caller creates a special temp Variant with VT_BYREF flag set and places a pointer into the Variant.
    VbVarType - including the VT_ARRAY flag - is copied/set correctly for callee.
  • Callee recognises this VT_BYREF Variant and quietly dereferences the pointer. So this is a double dereference each time of access.

The mechanism is essentially the same for all dataypes:

  • scalars
  • String
  • Dynamic Array (PSA)
Pass scalar As ByRef Variant:

 _______
|  100  |                  Dim I As Integer
|_______|          <---
                       |
    |  temp            |
    V                  |
 ____ _________ ______________
|VT_BYREF      |   pointer    |           <-- VarPtr(v) points here(!)
|____|_________|______________|   <---
                                      |
       |  push address of             |
       V                              |
 ______________                       |
|    pointer   |  v                   |
|______________|
----------------------------------------- Sub (ByRef v)

Note that VarPtr(v) points to the temp VT_BYREF Variant. Seems like VBA simply returns the received ByRef pointer value itself for VarPtr(v).

Also, similar to passing a ByRef parameter to another Sub again ByRef, the pointer itself is pushed. Note VarPtr(v2) also points to the original temp VT_BYREF Variant of the caller:

Pass scalar As ByRef Variant to Sub2:

 _______
|  100  |                  Dim I As Integer
|_______|          <---
                       |
    |  temp            |
    V                  |
 ____ _________ ______________
|VT_BYREF      |   pointer    |           <-- VarPtr(v) points here
|____|_________|______________|   <---    <-- VarPtr(v2) also points here
                                      |
       |  push address of             |
       V                              |
 ______________                       |
|    pointer   |  v                   |
|______________|
----------------------------------------- Sub (ByRef v)
                                      |
                                      |
                                      |
       |  push                        |
       V                              |
 ______________                       |
|    pointer   |  v2                  |
|______________|
----------------------------------------- Sub2 (ByRef v2)

The mechanism is the same for String and Array (just for reference):

Pass String as ByRef Variant:                                 Pass Dynamic Array As ByRef Variant:

 ______________                                                ______________
|  --> BSTR    |             <---   Dim str as String         |  --> SA      |             <---   Dim varr()
|______________|                 |                            |______________|                 |
                                 |                                                             |
       |  temp                   |                                   |  temp                   |
       V                         |                                   V                         |
 ____ _________ ______________   |                             ____ _________ ______________   |
|VT_BYREF      |   pointer    |   <---                        |VT_BYREF      |   pointer    |   <---
|____|_________|______________|       |                       |____|_________|______________|       |
                                      |                                                             |
       |  push address of             |                              |  push address of             |
       V                              |                              V                              |
 ______________                       |                        ______________                       |
|    pointer   |  v                   |                       |    pointer   |  v                   |
|______________|                      |                       |______________|                      |
----------------------------------------- Sub (ByRef v)       ----------------------------------------- Sub (ByRef v)

As with passing arrays the function expects a pointer to PSA. For Fixed-size Arrays the caller additionally creates a temp PSA on stack. This is the memory layout:

Pass Fixed-size Array as ByRef Variant:

 ______________
|              |   Dim arr(3)
|  SAFEARRAY   |
|  STRUCT      |
|              |   <---
|______________|       |
                       |
       |  temp ptr     |
       V               |
 ______________        |
|  PSA ->      |             <---
|______________|                 |
                                 |
       |  temp Variant           |
       V                         |
 ____ _________ ______________   |
|VT_BYREF      |   pointer    |   <---
|____|_________|______________|       |
                                      |
       |  push address of             |
       V                              |
 ______________                       |
|    pointer   |  v                   |
|______________|
----------------------------------------- Sub (ByRef v)


Passing other data types As ByVal Variant

Caller allocates a temp Variant and copies the source data as-is. This "cloning" does not create any copy of the BSTR or SAFEARRAY.

Callee performs the same operation as with any ByVal Variant: allocates a new, hidden Variant locally and copies the Variant parameter, which can be simple- or deep-copy (String and Array).

Pass scalar As ByVal Variant:

 _______
|  100  |       Dim I As Integer
|_______|

    |  clone
    V
 ____ _________ ______________
|02  |         |     100      |
|____|_________|______________|

       |  push
       V
 ____ _________ ______________
|02  |         |     100      |
|____|_________|______________|
------------------------------------- Sub (ByVal v)
       |  copy
       V
 ____ _________ ______________
|02  |         |     100      |     <-- VarPtr(v)
|____|_________|______________|     local hidden copy

For String and Array involves deep copy (note BSTR2 and SA2):

Pass String As ByVal Variant:                              Pass Dynamic Array As ByVal Variant:

 ______________                                             ______________
|  --> BSTR1   |             Dim str as String             |  --> SA1     |                   Dim varr()
|______________|                                           |______________|

       |  clone                                                   |  clone
       V                                                          V
 ____ _________ ______________                              ____ _________ ______________
|08  |         |  --> BSTR1   |                            |VT_ARRAY      |  --> SA1     |
|____|_________|______________|                            |____|_________|______________|

       |  push                                                    |  push
       V                                                          V
 ____ _________ ______________                              ____ _________ ______________
|08  |         |  --> BSTR1   |                            |VT_ARRAY      |  --> SA1     |
|____|_________|______________|                            |____|_________|______________|
------------------------------------- Sub (ByVal v)        ------------------------------------- Sub (ByVal v)
       |  deep copy                                               |  deep copy
       V                                                          V
 ____ _________ ______________                              ____ _________ ______________
|08  |         |  --> BSTR2   |     <-- VarPtr(v)          |VT_ARRAY      |  --> SA2     |     <-- VarPtr(v)
|____|_________|______________|     local hidden copy      |____|_________|______________|     local hidden copy

When passing Fixed-size Array caller again creates a temp VT_ARRAY Variant holding the address of the SAFEARRAY structure:

Pass Fixed-size Array As ByVal Variant:

 ______________
|              |        Dim arr(3)
|  SAFEARRAY   |
|  STRUCT      |
|              |   <---
|______________|       |
                       |
       |  clone        |
       V               |
                       |
 ____ _________ ______________
|VT_ARRAY      |  --> SA1     |
|____|_________|______________|

       |  push
       V
 ____ _________ ______________
|VT_ARRAY      |  --> SA1     |
|____|_________|______________|
----------------------------------------- Sub (ByVal v)
       |  deep copy
       V
 ____ _________ ______________
|VT_ARRAY      |  --> SA2     |     <-- VarPtr(v) points here
|____|_________|______________|     hidden local var

Passing Object variables

Passing Objects in VBA ByRef follows the same mechanism as other pointer data types, like String.

Passing Objects ByVal is a different mechanism.

Object is the only true reference type in VBA and more than one pointer is allowed to hold the address of the same object. This is because of the IUnknown Interface and reference counting. There is no "deep copy" of an Object type i.e. we cannot make another full copy with items of a Collection f. ex. by assignment - opposed to Strings and Arrays.

This is also true when Variants are holding a reference to an Object:

Dim v1, v2
Set v1 = New JVAR
Set v2 = v1

v1                                                    v2
0900-000000000000-4090800716020000-0000000000000000   0900-000000000000-4090800716020000-0000000000000000
                   |                                                     |
                   +------------> same instance <------------------------+

09 = vbObject

There is no difference in the ByVal calling mechanism whether typed- or general Object reference is passed, or callee expects typed- or general Object:

Pass Object ByVal:

 ______________                                            ______________
|  --> ref     |      Dim coll As Collection              |  --> ref     |      Dim coll As Collection
|______________|                                          |______________|

       |  clone                                                  |  clone
       V                                                         V
 ______________                                            ______________
|  --> ref     |                                          |  --> ref     |
|______________|                                          |______________|

       |  push                                                   |  push
       V                                                         V
 ______________                                            ______________
|  --> ref     |                                          |  --> ref     |
|______________|                                          |______________|
------------------------ Sub (ByVal o As Object)          ------------------------ Sub (ByVal coll As Collection)
       |  copy                                                   |  copy
       V                                                         V
 ______________                                            ______________
|  --> ref     |     <-- VarPtr(o)                        |  --> ref     |     <-- VarPtr(coll)
|______________|     local hidden copy                    |______________|     local hidden copy

Why the clone-ing step happens is unclear. The only "benefit" of a ByVal Object is that the Sub/Function cannot change the object reference itself (it made a copy). It can add/remove items from the Collection but cannot deallocate the original object by setting the reference to Nothing, create a new one etc.