VBA Array Internals I.
Memory layout, structures and ReDim Preserve
There are two types of array declaration in VBA:
- Dynamic Array: Dim varr()
- Fixed-size Array: Dim arr(3)
VBA Arrays are allocated SAFEARRAY structures based on underlying COM/Automation Interface:
Fixed-size Array variable in memory: +-------------+ | SAFEARRAY | | STRUCT | pvData _____ _____ _____ _____ | | ------> |_____|_____|_____|_____| Elements | | +-------------+ Dim arr(3) Dynamic Array variables are pointers (can be null-ptr): +-------------+ | Pointer | ----> SAFEARRAY STRUCT +-------------+ Dim varr()
Declaration of a Fixed-size Array variable allocates a SAFEARRAY structure along with initialized Elements.
Declaration of a Dynamic Array variable allocates a pointer, initially null-pointer. It appears f. ex. as Variant() in the Debug Windows without bounds. Later ReDim will allocate the SAFEARRAY structure with initialized Elements and sets the variable.
PSA = Pointer to SAFEARRAY
VBA works with Pointer to SAFEARRAY (PSA) when dealing with Arrays:
- Dynamic Array variable is a PSA
- passing Array to Sub/Function is passing an address of a PSA (passing Array is
ByRef
) - returning array from Array type functions is a PSA
- Variant of array type stores a PSA (
VT_ARRAY
flag set)
Passing Arrays to procedures
Passing arrays is always ByRef: caller passes the memory address of a Pointer to SAFEARRAY (PSA). This PSA can be itself a Dynamic Array variable or a temp PSA made by the caller for Fixed-size Arrays. This ensures that the called procedure will access the array in the same way. It also checks certain SAFEARRAY Flags to prevent ReDim f. ex. on Fixed-size Arrays at run-time. The PSA can also be a null-ptr.
Sub ([ByRef] a()) _________ _________ |_________| -------+---> |___PSA __| ----> SAFEARRAY STRUCT a | _________ address of +---> |__PSA=0__| PSA
Note that this mechanism allows the called procedure to work directly on the array and its elements; it can also re-assign the variable and even execute ReDim and Erase for Dynamic Array variables passed.
Functions returning array
Functions can only return Dynamic Arrays, which is a Pointer to SAFEARRAY (PSA). In VBA the function return value is a hidden local variable, as if it was declared as the very first local variable of the function. End|Exit Function
makes this value available for the caller.
Private Function f() As String() [Dim f() As String] <-- as if declared here (PSA) Dim I As Integer Dim L As Long ... End Function
On entering the function, before ReDim or assignment, the return value is a null-ptr. Usage and syntax is the same as for other Dynamic Array variables.
- After ReDim memory space for a SAFEARRAY structure and elements are allocated and the variable is set.
- The function variable can be lvalue in array assignment
- can be
Erase
-d and set back to null-ptr
Variant of array type
The Variant stores a Pointer to SAFEARRAY (PSA) and the VT_ARRAY
flag is set:
Dim v ReDim v(3) 0C20-000000000000-B0BB8A8F9D010000-0000000000000000 (64-bit VBA Variant) +--+--+-----+-----+-----+-----------------------+-----------------------+ |0C 20| reserved | PSA | 0 | 64-bit VBA Variant +--+--+-----+-----+-----+-----------------------+-----------------------+ | | +---------------+ +------> | SAFEARRAY | | STRUCTURE | +---------------+
PSA in Variant can also be null-ptr (after Erase
statement). 2000h is the VT_ARRAY = vbArray flag.
The SAFEARRAY STRUCTURE
See in Oaidl.h
This VBA Type maps correctly on both 32/64-bit VBA:
Private Type SAFEARRAY cDims As Integer ' number of dimensions: 1, 2, .. fFeatures As Integer ' feature flags (fix/dynamic etc.) cbElements As Long ' size of one array element cLocks As Long ' array locked pvData As LongPtr ' pointer to array data End Type
On the very next memory address immediately follows one SAFEARRAYBOUND structure for each dimension:
Private Type SAFEARRAYBOUND cElements As Long lLbound As Long End Type
SAFEARRAYBOUND stores the allowed index lower bound and the number of elements in a dimension. So the formula VBA uses in Debug Windows is (LBound To LBound + cElements - 1)
. That's why calling Array() without arguments - or passing no arguments to a ParamArray - shows f. ex. Variant(0 to -1). These are empty, but allocated arrays with cElements=0
.
The SAFEARRAY structure compiles to a little larger structure on 64-bit VBA: pvData takes 12 bytes (4 byte padding to align 8 and an 8-byte pointer).
32/64-bit SAFEARRAY structure difference:
ReDim varr(3 To 6, 1 To 2) As Integer 0 4 +----------+ 32-bit | PSA | ---------> SAFEARRAY 02 00 cDims +----------+ 80 00 fFeatures Dim varr() 02 00 00 00 cbElements 32-bit VBA 00 00 00 00 cLocks 50 EA A9 01 pvData (32-bit) 02 00 00 00 01 00 00 00 rgsabound[0] 04 00 00 00 03 00 00 00 rgsabound[1] 0 8 +---------------+ 64-bit | PSA | -----> SAFEARRAY 02 00 cDims +---------------+ 80 00 fFeatures Dim varr() 02 00 00 00 cbElements 64-bit VBA 00 00 00 00 cLocks 00 00 00 00 align(8) padding E0 2D 50 EA A9 01 00 00 pvData (64-bit) 02 00 00 00 01 00 00 00 rgsabound[0] 04 00 00 00 03 00 00 00 rgsabound[1]
Feature flags and VBA Array types
These are the flags defined for the fFeatures
field of the SAFEARRAY structure:
typedef enum tagADVFEATUREFLAGS { FADF_AUTO = 0x0001, FADF_STATIC = 0x0002, FADF_EMBEDDED = 0x0004, FADF_FIXEDSIZE = 0x0010, FADF_RECORD = 0x0020, FADF_HAVEIID = 0x0040, FADF_HAVEVARTYPE = 0x0080, FADF_BSTR = 0x0100, FADF_UNKNOWN = 0x0200, FADF_DISPATCH = 0x0400, FADF_VARIANT = 0x0800 } ADVFEATUREFLAGS;
The VBA run-time uses these flags:
-
FADF_HAVEVARTYPE
flag is normally set for both Dynamic- and Fixed-size Arrays declared in VBA. That will also allocate an VbVarType Enum value at offset -4 of the SAFEARRAY structure - For arrays of Object VBA sets the
FADF_HAVEIID
flag and there is an allocated GUID value at offset -16 - External UDT-s, f. ex. an array of Type
mscorlib.Decimal
will have proper IRecordInfo interface pointer at -4/8 of the SAFEARRAY and theFADF_RECORD
flag is set (no other flags are set). - An array of User Type defined in the a Standard Module or Class is somewhat incomplete: a pointer is allocated before SAFEARRAY but its value is zero and no
FADF_RECORD
flag is set. These
FADF_HAVEVARTYPE FADF_HAVEIID FADF_RECORD NONE +-------------+ +-------------+ +-------------+ +-------------+ -4 | VbVarType | -16| GUID | -4/8 | IRecordInfo | -4/8 | 0 | +-------------+ +-------------+ +-------------+ +-------------+ | SAFEARRAY | | SAFEARRAY | | SAFEARRAY | | SAFEARRAY | | STRUCT | | STRUCT | | STRUCT | | STRUCT | | | | | | | | | | | | | | | | | +-------------+ +-------------+ +-------------+ +-------------+ Dim a(5) Dim a(5) Dim AT(2) Dim myarr(2) Dim a(5) As String As Object As mscorlib.Decimal As MyType Dim a(5) As Long
Arrays of String set FADF_BSTR
, arrays of Variant set FADF_VARIANT
. These flags are important for COM routines when it comes to array assigment and how to copy element values - see later. Arrays of all Object types set FADF_DISPATCH
in VBA.
Fixed-size Arrays declared in VBA additionally set both FADF_FIXEDSIZE
and FADF_STATIC
flags.
Memory layout of array data and dimensions
Lets make a 2D array in VBA and fill it up with values:
Dim arr(3 To 6, 1 To 2) As Byte arr(3, 1) = &H31 arr(3, 2) = &H32 arr(4, 1) = &H41 arr(4, 2) = &H42 arr(5, 1) = &H51 arr(5, 2) = &H52 arr(6, 1) = &H61 arr(6, 2) = &H62
This array contains 4 x 2 = 8 Byte elements.
Interestingly, the second dimension's SAFEARRAYBOUND is stored first:
rgsabound[0] = 02 00 00 00 01 00 00 00 <--- 2nd dim rgsabound[1] = 04 00 00 00 03 00 00 00 <--- 1st dim
Dumping pvData..
000001BA1304BCA0: 31 41 51 61 32 42 52 62 | 9B F8 E4 20 BA 41 00 8E | 1AQa2BRb›øä ºA.Ž |__|__|__|__| 1st dim |___________|___________| 2nd dim
Elements of the first dimension is always stored at continuous memory locations, tightly packed on the data's natural boundaries (this is also the cbElements value). The second dimension lays out storage for the 1st dimension cElements times of the 2nd dimension SAFEARRAYBOUND.
By increasing the number of dimensions the memory layout will follow this algorithm:
---------------------------> increasing memory adddresss cElements 1st dim |__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__| 4 2nd dim |___________|___________|___________|___________|___________|___________| 2 3rd dim |_______________________|_______________________|_______________________| 3 4th dim |_______________________________________________________________________|____ .. so on.. Dim arr(0 To 3, 0 To 1, 0 To 2, ..)
Every higher dimension multiplies the whole previous memory block by cElements times of the corresponding SAFEARRAYBOUND.
Note that "rank" is declared opposite to what f. ex. C language people are used to. The storage will be the same.
VBA ARRAY C ARRAY arr(rank1, rank2, ...) arr[..][rank2][rank1] higher dimensions ---> <--- higher dimensions
Another major difference is that a multi-dimensional VBA array is not an array of arrays, like in C. We can address only individual elements.
Understanding the memory layout of n dimensions above will hopefully help why only the last dimension's (upper) bound can be changed in VBA in ReDim Preserve
.
ReDim Preserve
It is possible in VBA to resize an array by keeping its content using ReDim Preserve
to less or more elements.
Executing ReDim Preserve first allocates memory space for n' elements, then:
- if n' > n then EXPAND: memcopy n elements and initialize new elements
- if n' < n then SHRINK: memcopy only n' elements (truncation)
The previous memory block is freed. Note that array data copy is simple memcopy. If n' = n no changes are made.
1D-array ReDim Preserve
Only the upper bound can be changed to shrink or expand the array:
ReDim varr(3 To 6) _____ _____ _____ _____ |__3__|__4__|__5__|__6__| ReDim Preserve varr(3 To 5) _____ _____ _____ |__3__|__4__|__5__| <--- shrink ReDim Preserve varr(3 To 7) _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__|__0__| ---> expand
2D-array ReDim Preserve
Only the second dimension's upper bound can be changed:
ReDim varr(3 To 6, 1 To 3) ReDim Preserve ReDim Preserve varr(3 To 6, 1 To 2) varr(3 To 6, 1 To 4) _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__| 1 |__3__|__4__|__5__|__6__| 1 |__3__|__4__|__5__|__6__| 1 _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__| 2 |__3__|__4__|__5__|__6__| 2 |__3__|__4__|__5__|__6__| 2 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__| 3 shrink |__3__|__4__|__5__|__6__| 3 _____ _____ _____ _____ |__0__|__0__|__0__|__0__| 4 expand
3D- and higher array ReDim Preserve
Only the last dimension's upper bound can be changed: as everything before the last dimension will be added or removed x times.
ReDim varr(3 To 6, 1 To 2, 8 To 10) 1 2 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 8 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 9 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 10 ReDim Preserve varr(3 To 6, 1 To 2, 8 To 9) 1 2 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 8 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 9 <--- shrink ReDim Preserve varr(3 To 6, 1 To 2, 8 To 11) 1 2 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 8 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 9 _____ _____ _____ _____ _____ _____ _____ _____ |__3__|__4__|__5__|__6__||__3__|__4__|__5__|__6__| 10 _____ _____ _____ _____ _____ _____ _____ _____ |__0__|__0__|__0__|__0__||__0__|__0__|__0__|__0__| 11 <--- expand
No comments:
Post a Comment