March 11, 2024

VBA Array Internals: Memory layout

Last modified: Mon Mar 11 22:08:49 UTC+0100 2024 © Attila Tarpai

VBA Array Internals I.
Memory layout, structures and ReDim Preserve

There are two types of array declaration in VBA:

  • Dynamic Array: Dim varr()
  • Fixed-size Array: Dim arr(3)

VBA Arrays are allocated SAFEARRAY structures based on underlying COM/Automation Interface:

Fixed-size Array variable in memory:

   +-------------+
   |  SAFEARRAY  |
   |   STRUCT    | pvData   _____ _____ _____ _____
   |             | ------> |_____|_____|_____|_____| Elements
   |             |
   +-------------+
     Dim arr(3)


Dynamic Array variables are pointers (can be null-ptr):

   +-------------+
   |   Pointer   | ----> SAFEARRAY STRUCT
   +-------------+
     Dim varr()

Declaration of a Fixed-size Array variable allocates a SAFEARRAY structure along with initialized Elements.

Declaration of a Dynamic Array variable allocates a pointer, initially null-pointer. It appears f. ex. as Variant() in the Debug Windows without bounds. Later ReDim will allocate the SAFEARRAY structure with initialized Elements and sets the variable.

PSA = Pointer to SAFEARRAY

VBA works with Pointer to SAFEARRAY (PSA) when dealing with Arrays:

  • Dynamic Array variable is a PSA
  • passing Array to Sub/Function is passing an address of a PSA (passing Array is ByRef)
  • returning array from Array type functions is a PSA
  • Variant of array type stores a PSA (VT_ARRAY flag set)

Passing Arrays to procedures

Passing arrays is always ByRef: caller passes the memory address of a Pointer to SAFEARRAY (PSA). This PSA can be itself a Dynamic Array variable or a temp PSA made by the caller for Fixed-size Arrays. This ensures that the called procedure will access the array in the same way. It also checks certain SAFEARRAY Flags to prevent ReDim f. ex. on Fixed-size Arrays at run-time. The PSA can also be a null-ptr.

Sub ([ByRef] a())

 _________                 _________
|_________| -------+--->  |___PSA __| ----> SAFEARRAY STRUCT
     a             |       _________
 address of        +--->  |__PSA=0__|
    PSA

Note that this mechanism allows the called procedure to work directly on the array and its elements; it can also re-assign the variable and even execute ReDim and Erase for Dynamic Array variables passed.

Functions returning array

Functions can only return Dynamic Arrays, which is a Pointer to SAFEARRAY (PSA). In VBA the function return value is a hidden local variable, as if it was declared as the very first local variable of the function. End|Exit Function makes this value available for the caller.

Private Function f() As String()
    [Dim f() As String]           <-- as if declared here (PSA)
    Dim I As Integer
    Dim L As Long
    ...
End Function

On entering the function, before ReDim or assignment, the return value is a null-ptr. Usage and syntax is the same as for other Dynamic Array variables.

  • After ReDim memory space for a SAFEARRAY structure and elements are allocated and the variable is set.
  • The function variable can be lvalue in array assignment
  • can be Erase-d and set back to null-ptr

Variant of array type

The Variant stores a Pointer to SAFEARRAY (PSA) and the VT_ARRAY flag is set:

Dim v
ReDim v(3)

0C20-000000000000-B0BB8A8F9D010000-0000000000000000 (64-bit VBA Variant)

+--+--+-----+-----+-----+-----------------------+-----------------------+
|0C 20|    reserved     |         PSA           |           0           |   64-bit VBA Variant
+--+--+-----+-----+-----+-----------------------+-----------------------+
                              |
                              |        +---------------+
                              +------> |   SAFEARRAY   |
                                       |   STRUCTURE   |
                                       +---------------+

PSA in Variant can also be null-ptr (after Erase statement). 2000h is the VT_ARRAY = vbArray flag.

The SAFEARRAY STRUCTURE

See in Oaidl.h

This VBA Type maps correctly on both 32/64-bit VBA:

Private Type SAFEARRAY
  cDims As Integer              ' number of dimensions: 1, 2, ..
  fFeatures As Integer          ' feature flags (fix/dynamic etc.)
  cbElements As Long            ' size of one array element
  cLocks As Long                ' array locked
  pvData As LongPtr             ' pointer to array data
End Type

On the very next memory address immediately follows one SAFEARRAYBOUND structure for each dimension:

Private Type SAFEARRAYBOUND
  cElements As Long
  lLbound As Long
End Type

SAFEARRAYBOUND stores the allowed index lower bound and the number of elements in a dimension. So the formula VBA uses in Debug Windows is (LBound To LBound + cElements - 1). That's why calling Array() without arguments - or passing no arguments to a ParamArray - shows f. ex. Variant(0 to -1). These are empty, but allocated arrays with cElements=0.

The SAFEARRAY structure compiles to a little larger structure on 64-bit VBA: pvData takes 12 bytes (4 byte padding to align 8 and an 8-byte pointer).

32/64-bit SAFEARRAY structure difference:

ReDim varr(3 To 6, 1 To 2) As Integer

0          4
+----------+              32-bit
|   PSA    | --------->   SAFEARRAY  02 00                    cDims
+----------+                         80 00                    fFeatures
Dim varr()                           02 00 00 00              cbElements
32-bit VBA                           00 00 00 00              cLocks
                                     50 EA A9 01              pvData (32-bit)
                                     02 00 00 00 01 00 00 00  rgsabound[0]
                                     04 00 00 00 03 00 00 00  rgsabound[1]

0               8
+---------------+         64-bit
|      PSA      | ----->  SAFEARRAY  02 00                    cDims
+---------------+                    80 00                    fFeatures
Dim varr()                           02 00 00 00              cbElements
64-bit VBA                           00 00 00 00              cLocks
                                     00 00 00 00              align(8) padding
                                     E0 2D 50 EA A9 01 00 00  pvData (64-bit)
                                     02 00 00 00 01 00 00 00  rgsabound[0]
                                     04 00 00 00 03 00 00 00  rgsabound[1]

Feature flags and VBA Array types

These are the flags defined for the fFeatures field of the SAFEARRAY structure:

typedef enum tagADVFEATUREFLAGS
{
   FADF_AUTO = 0x0001,
   FADF_STATIC = 0x0002,
   FADF_EMBEDDED = 0x0004,
   FADF_FIXEDSIZE = 0x0010,
   FADF_RECORD = 0x0020,
   FADF_HAVEIID = 0x0040,
   FADF_HAVEVARTYPE = 0x0080,
   FADF_BSTR = 0x0100,
   FADF_UNKNOWN = 0x0200,
   FADF_DISPATCH = 0x0400,
   FADF_VARIANT = 0x0800
} ADVFEATUREFLAGS;

The VBA run-time uses these flags:

  • FADF_HAVEVARTYPE flag is normally set for both Dynamic- and Fixed-size Arrays declared in VBA. That will also allocate an VbVarType Enum value at offset -4 of the SAFEARRAY structure
  • For arrays of Object VBA sets the FADF_HAVEIID flag and there is an allocated GUID value at offset -16
  • External UDT-s, f. ex. an array of Type mscorlib.Decimal will have proper IRecordInfo interface pointer at -4/8 of the SAFEARRAY and the FADF_RECORD flag is set (no other flags are set).
  • An array of User Type defined in the a Standard Module or Class is somewhat incomplete: a pointer is allocated before SAFEARRAY but its value is zero and no FADF_RECORD flag is set. These
   FADF_HAVEVARTYPE         FADF_HAVEIID             FADF_RECORD              NONE

   +-------------+          +-------------+          +-------------+          +-------------+
-4 |  VbVarType  |       -16|    GUID     |     -4/8 | IRecordInfo |     -4/8 |      0      |
   +-------------+          +-------------+          +-------------+          +-------------+
   |  SAFEARRAY  |          |  SAFEARRAY  |          |  SAFEARRAY  |          |  SAFEARRAY  |
   |   STRUCT    |          |   STRUCT    |          |   STRUCT    |          |   STRUCT    |
   |             |          |             |          |             |          |             |
   |             |          |             |          |             |          |             |
   +-------------+          +-------------+          +-------------+          +-------------+

    Dim a(5)                  Dim a(5)                Dim AT(2)                Dim myarr(2)
    Dim a(5) As String        As Object               As mscorlib.Decimal      As MyType
    Dim a(5) As Long

Arrays of String set FADF_BSTR, arrays of Variant set FADF_VARIANT. These flags are important for COM routines when it comes to array assigment and how to copy element values - see later. Arrays of all Object types set FADF_DISPATCH in VBA.

Fixed-size Arrays declared in VBA additionally set both FADF_FIXEDSIZE and FADF_STATIC flags.

Memory layout of array data and dimensions

Lets make a 2D array in VBA and fill it up with values:

Dim arr(3 To 6, 1 To 2) As Byte
    arr(3, 1) = &H31
    arr(3, 2) = &H32
    arr(4, 1) = &H41
    arr(4, 2) = &H42
    arr(5, 1) = &H51
    arr(5, 2) = &H52
    arr(6, 1) = &H61
    arr(6, 2) = &H62

This array contains 4 x 2 = 8 Byte elements.

Interestingly, the second dimension's SAFEARRAYBOUND is stored first:

rgsabound[0] = 02 00 00 00 01 00 00 00  <--- 2nd dim
rgsabound[1] = 04 00 00 00 03 00 00 00  <--- 1st dim

Dumping pvData..

000001BA1304BCA0: 31 41 51 61 32 42 52 62 | 9B F8 E4 20 BA 41 00 8E | 1AQa2BRb›øä ºA.Ž
                 |__|__|__|__|
                    1st dim
                 |___________|___________|
                          2nd dim

Elements of the first dimension is always stored at continuous memory locations, tightly packed on the data's natural boundaries (this is also the cbElements value). The second dimension lays out storage for the 1st dimension cElements times of the 2nd dimension SAFEARRAYBOUND.

By increasing the number of dimensions the memory layout will follow this algorithm:

            ---------------------------> increasing memory adddresss
                                                                                            cElements

1st dim     |__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|       4

2nd dim     |___________|___________|___________|___________|___________|___________|       2

3rd dim     |_______________________|_______________________|_______________________|       3

4th dim     |_______________________________________________________________________|____   ..

so on..
                           Dim arr(0 To 3, 0 To 1, 0 To 2, ..)

Every higher dimension multiplies the whole previous memory block by cElements times of the corresponding SAFEARRAYBOUND.

Note that "rank" is declared opposite to what f. ex. C language people are used to. The storage will be the same.

VBA ARRAY                          C ARRAY

arr(rank1, rank2, ...)             arr[..][rank2][rank1]

higher dimensions --->             <--- higher dimensions

Another major difference is that a multi-dimensional VBA array is not an array of arrays, like in C. We can address only individual elements.

Understanding the memory layout of n dimensions above will hopefully help why only the last dimension's (upper) bound can be changed in VBA in ReDim Preserve.

ReDim Preserve

It is possible in VBA to resize an array by keeping its content using ReDim Preserve to less or more elements.

Executing ReDim Preserve first allocates memory space for n' elements, then:

  • if n' > n then EXPAND: memcopy n elements and initialize new elements
  • if n' < n then SHRINK: memcopy only n' elements (truncation)

The previous memory block is freed. Note that array data copy is simple memcopy. If n' = n no changes are made.

1D-array ReDim Preserve

Only the upper bound can be changed to shrink or expand the array:

ReDim varr(3 To 6)
 _____ _____ _____ _____
|__3__|__4__|__5__|__6__|


ReDim Preserve varr(3 To 5)
 _____ _____ _____
|__3__|__4__|__5__|  <--- shrink


ReDim Preserve varr(3 To 7)
 _____ _____ _____ _____ _____
|__3__|__4__|__5__|__6__|__0__| ---> expand

2D-array ReDim Preserve

Only the second dimension's upper bound can be changed:

ReDim varr(3 To 6, 1 To 3)         ReDim Preserve                     ReDim Preserve
                                   varr(3 To 6, 1 To 2)               varr(3 To 6, 1 To 4)

 _____ _____ _____ _____            _____ _____ _____ _____            _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  1       |__3__|__4__|__5__|__6__|  1       |__3__|__4__|__5__|__6__|  1
 _____ _____ _____ _____            _____ _____ _____ _____            _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  2       |__3__|__4__|__5__|__6__|  2       |__3__|__4__|__5__|__6__|  2
 _____ _____ _____ _____                                               _____ _____ _____ _____
|__3__|__4__|__5__|__6__|  3                shrink                    |__3__|__4__|__5__|__6__|  3
                                                                       _____ _____ _____ _____
                                                                      |__0__|__0__|__0__|__0__|  4

                                                                                expand

3D- and higher array ReDim Preserve

Only the last dimension's upper bound can be changed: as everything before the last dimension will be added or removed x times.

ReDim varr(3 To 6, 1 To 2, 8 To 10)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  10


ReDim Preserve varr(3 To 6, 1 To 2, 8 To 9)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9     <--- shrink


ReDim Preserve varr(3 To 6, 1 To 2, 8 To 11)

            1                        2
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  8
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  9
 _____ _____ _____ _____  _____ _____ _____ _____
|__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__|  10
 _____ _____ _____ _____  _____ _____ _____ _____
|__0__|__0__|__0__|__0__||__0__|__0__|__0__|__0__|  11    <--- expand

No comments:

Post a Comment