VBA Internals: Deep dive into VBA Array assignment
and about Dictionary.Keys
Arrays in VBA are value-types i.e. a full array copy is made on assignment. This is very different from f. ex. the C language.
Dim arr(5) ' Fixed-size Array variable with elements Dim varr() ' Dynamic Array variable varr = arr ' array assignment
After the assignment varr
and arr
has nothing common. They are two completely separate arrays with their own separate copies of values.
Array assignment between variables can be an expensive operation. The complexity depends on the type of the Array. Assignment from functions does NOT make an array copy - with one exception.
By knowing the rules, avoiding full array copies may speed up VBA code. Especially when working with arrays of strings, functions returning such arrays, arrays within arrays etc. And I was curious how VBA works in detail.
Array assignment: lvalue and rvalue
Only Dynamic Array- and Variant variable can store a Pointer to SAFEARRAY (PSA) and be lvalue in array assignment:
- Dynamic Array variable MUST match the exact array type of the rvalue.
- Variant variable can be lvalue in all types of array assignments - as in all other assignments.
NB: the function (return) value of Functions of Array- and Variant type is treated the same way as variables.
The rvalue in Array-assignment can be Dynamic- or Fixed-size Array variables, Variant variables of array type; or functions of array type and functions of type Variant of array type. Phew.
- Assignment from variable rvalue results in full array copy
- Assignment from Function rvalue is only a simple pointer assignment, no array copy is made. With one exception!
|------------------------------------------------------------------------| | lvalue | | |---------------------------------------| rvalue | | Variant | Dynamic Array | | | variable | variable | | |------------------------------------------------------------------------| | | | Array variable | | _______________ | _________ | _________ | | |_xx20____PSA2__| | |___PSA2__| | |___PSA___| | | | | | | Dim v | Dim varr() | Dim arr() ' Dynamic | | v = arr | varr = arr | Dim arr(3) ' or Fixed-size | |------------------------------------------------------------------------| | | | Variant variable | | _______________ | _________ | _______________ | | |_xx20____PSA2__| | |___PSA2__| | |_xx20____PSA___| | | | | | | Dim v | Dim varr() | Dim va | | v = va | varr = va | | |------------------------------------------------------------------------| | | | Function returning array | | _______________ | _________ | _________ | | |_xx20____PSA___| | |___PSA___| | |___PSA___| | | | | | | Dim v | Dim varr() | Function fa() | | v = fa | varr = fa | As Variant() | |------------------------------------------------------------------------| | | | Function returning Variant | | _______________ | _________ | _______________ | | |_xx20____PSA___| | |___PSA2__| | |_xx20____PSA___| | | | | | | Dim v | Dim varr() | Function fv() | | v = fv | varr = fv | As Variant | |------------------------------------------------------------------------| _______________ |_xx20____PSA___| Is a Variant of array type - vbArray flag (2000h) set - stores a Pointer to SAFEARRAY (PSA) _________ |___PSA___| Is a Dynamic array variable or simply Pointer to SAFEARRAY (PSA) PSA2 means array copy operation has been performed
The Array assignment process
Arrays of scalar assignment
The source SAFEARRAY (SA) has no special Feature Flag set:
- allocation of new SA
- allocation memory space for new elements
- memcopy elements
Dim arr(5) As Long Dim varr() As Long varr = arr ________ arr | SA | _____ _____ _____ _____ _____ _____ | |-->|__0__|__1__|__2__|__3__|__4__|__5__| |________| memcopy ________ ----------------> +-----------+ | SA | _____ _____ _____ _____ _____ _____ | PSA | ---> | |-->|__0__|__1__|__2__|__3__|__4__|__5__| +-----------+ |________| varr alloc new alloc new
Arrays of String assignment
The source SAFEARRAY (SA) has FADF_BSTR
Feature Flag set:
- allocation of new SA
- allocation memory space for new elements
- loop through each element, allocate new BSTR and copy string content (deep copy)
Dim arr(5) As String Dim varr() As String varr = arr ________ arr | SA | _______ _______ _______ _______ _______ | |-->|_______|_______|_______|_______|_______| |________| | ___________ | _______ -->|BSTR1______| ... ... -->|BSTRn__| | | | | | | | | | | deep | | | | | copy ________ V V V V V +-----------+ | SA | _______ _______ _______ _______ _______ | PSA | ---> | |-->|_______|_______|_______|_______|_______| +-----------+ |________| | ___________ | _______ varr alloc new -->|BSTR2______| ... ... -->|BSTRm__|
Arrays of Variant assignment
In a Variant array each Variant element can hold any data type.
The source SAFEARRAY (SA) has FADF_VARIANT
Feature Flag set:
- allocation of new SA
- allocation memory space for new elements
Loop through each element and decide the copy method:
- Variant of scalar: memcopy
- Variant of String: allocate new BSTR and copy string content
- Variant of Array: call Array assignment recursively
Dim arr(5) arr(0) = "hi" arr(1) = 100 arr(2) = 60000 arr(3) = Array("s", 1, 2) Dim varr() varr = arr ________ arr | SA | ___________ ___________ ___________ ___________ | |-->|08_________||02_________||03_________||0C20_______| |________| | ________ | ______ -->|BSTR1___| ->| SA1 | _______ _______ _____ |______| --> |_______|_______|_____ | _______ -->|BSTRn__| | | | | |deep |simple |simple |recursive |copy |copy |copy |deep copy ________ V V V V | | | +-----------+ | SA | ___________ ___________ ___________ ___________ |deep | | | PSA | ---> | |-->|08_________||02_________||03_________||0C20_______| |copy | | +-----------+ |________| | ________ | ______ V V V varr alloc new -->|BSTR2___| ->| SA2 | _______ _______ _____ |______| --> |_______|_______|_____ | _______ -->|BSTRm__|
Array self-assignment test
Consider:
ReDim varr(5)
varr = varr
What will happen? Will the run-time evaluate rvalue to a PSA, free lvalue - which frees the same PSA, then copy a null-ptr? No. The run-time recognizes that both are the same and nothing happens.
This holds also in a Sub, in which we try to trick the run-time by calling it with (varr, varr). No change will happen:
Private Sub SelfArrayAssignmentTest(varr1(), varr2()) varr1 = varr2 End Sub
Under the hood a COM function would do something like this (just guessing here):
void ArrayAssignment(SAFEARRAY **ppsa1, SAFEARRAY **ppsa2) if (*ppsa1==*ppsa2) return; ... process copy End Sub
Note on array assignment from Dictionary.Keys()
and Dictionary.Items()
The Variant function → Array assignment exception, when array copy operation is performed is something to consider when choosing the data type of the lvalue variable. Some build-in functions return Variant holding an array, f. ex. Scripting.Dictionary's Keys()
function. It is declared as:
Function Keys()
Keys() returns Variant - holding an Array of Variant:
- when receiving this Variant into a local Variant variable no array copy is made.
- but receiving this Variant into a local Array of Variant variable will do perform full array copy.
Sub dictarr() Dim dict As Scripting.Dictionary Set dict = New Scripting.Dictionary Dim v ' Variant Dim varr() ' Array of Variant v = dict.Keys ' no array copy varr = dict.Keys ' full array copy End Sub
Something to keep in mind - especially when the Dictionary stores thousands of Keys and Items.
Also, the array returned by Keys()
is already a copy. Calling Keys()
makes a fresh array copy from the internal keys each time of access; the returned PSA is different each time (and the keys cannot be changed in the Dictionary instance through this array either).
This has a possible consequence when f. ex. we try to retrieve the 5-th index key. The code would be:
dict.Keys(5)
Somehow compiles and runs fine but how does this work? Isn't Keys()
a function without arguments?
The VBA syntax sometimes is really annoying:
- calling functions without parameters the parenthesis is optional
- both array-indexing and function call uses parenthesis ()
These are equivalent and we were just lucky that the compiler evaluated dict.Keys
first as function call.. then used array-indexing for dict.Keys(5)
:
dict.Keys <----> dict.Keys() dict.Keys(5) <----> dict.Keys()(5)
So dict.Keys(5)
first makes the call and receives the full array of keys.. then indexing into the array. Each time. The most effective solution would be to retrieve and store the key array into a Variant(!) - then use indexing to retrieve (possible) multiple key values or iterate through. Or maybe some optimization is going on in the background I don't know.