13. Memory management and pointers
13.1. Memory
Most of the computers we use today have memory to store data for processing. The data stored in memory needs to be stored and retrieved, for which we need an address. As an analog, you can think of a large library where books are kept in a shelf within their own cells. In our analog, the computer memory would be the shelf, the book will be data and the location of the cell in the shelf would be the memory address. A pointer in this case would be a variable that holds the memory address. Computer memory addresses are usually integers with its size (number of bits representing the address) depending on the computer architecture.
Here in the diagram the address of a cell containing the book we want is represented as row 4, column 3.
A computer memory is typically divided into many parts, of which the stack and heap are relevant for us.
13.1.1. Stack
When we are at a library, we usually fetch a book and then read it at a desk. In a way, the "read operation" happens at the desk. Similarly, when a thread is instantiated for execution, it is assigned a "working space" from the computer memory, known as stack. Within a thread, multiple functions may be getting called. When a function is instantiated for execution, it is assigned a block out of the stack called stack frame.
The stack frame will contain within it the function’s location, any values used within that function, inputs given to the function, etc. Once the function has completed its execution, the whole stack frame will be cleared. Once the stack frame is cleared, its memory would be reused for other functions getting instantiated. Stacks have a well defined lifecycle and the allocation and de-allocation of memory is automatically managed. This means that the programmer does not have to worry about allocating the memory and de-allocating the memory of stack-stored variables that he uses in code.
When a function calls another function, two stack frames gets created. Naturally, the outer function’s stack frame has a longer lifetime than the inner function’s stack frame (the inner functions completes execution and its stack frame is cleared, while the outer function is still running and its stack frame is still in use).
One requirement when allocating memory for a stack frame is that the compiler needs to know how much to allocate for the stack frame at the compile time itself. This is because stack frame is a pre-allocated memory, and it needs to allocate enough so that the function’s local variables can fit in. This means that we cannot place within a stack those data structures that do not know their size at compile time.
13.1.2. Heap
Since a stack cannot store data structures that do not know their size at compile time (e.g. a list that dynamically grows at runtime when data is added to it), we use a heap.
When data is stored in heap, the programmer usually keeps the address where it is stored in a variable (pointer) for future reference. Since a pointer size is well known already at compile time, we can store that pointer in the stack frame. Functions always work with stack frames, and sometimes we need to copy the data from heap into stack.
13.1.3. Memory management
Since memory is a finite resource, we need to manage it very carefully. Additionally, since programs rely on the integrity of the data it works with, care must be taken that the programs work with data that has a well defined lifecycle.
Allocation and de-allocation
If we allocate space in memory for our data and we forget to de-allocate after our use, memory would be "wasted". If it happens quite often during the runtime of the program, the program would reach a point where all the memory is exhausted and it has to abruptly quit. Therefore, we need to ensure to de-allocate memory after use.
Concurrent mutation
If we have a pointer to a memory location and if its data is mutated/modified by another function without us knowing, we would end up with "corrupted" data. This means that we need to ensure that the mutation of data is done only by one function at a time and it is clear at any point of time which function is doing the mutation. Many programming languages do not track such cases.
Lifetime and ownership
Management of memory becomes easier when we define within a program the lifetime (for how long it stays in memory) of a value and which part of the program owns the responsibility for its de-allocation and cleanup. In case a data structure has other data within it (e.g. list), then typically the container data structure is responsible for the de-allocation of data stored within it, and usually the container’s lifetime is the maximum lifetime of the contained data.
Most of the programming languages do not make lifetimes explicit and therefore programmer’s need to have a mental model of how long a piece of data lives. Problems occur when this mental model is wrong, resulting in crashes, or security issues. For example, if a pointer to data stored within an array is kept and if that pointer is not discarded when the array is de-allocated, then the pointer refers to an address that contains random data. This is known as a dangling pointer.
Since stack has a well defined lifecycle management, most of the memory management issues come up when we work with the heap.
Mojo provides excellent support for tracking both concurrent mutation and lifetime of values through built-in support for references.
13.2. Pointers in Mojo
In order to allocate memory from the heap, we use pointer operations like alloc
and free
. Mojo provides UnsafePointer
type for manually managed pointers. The type is explicitly named as unsafe, to indicate that Mojo would not be managing the lifetime of the value pointed by the UnsafePointer
. Programmers must take care to ensure that UnsafePointer
is carefully managed.
An example is shown in the following code listing.
from memory import UnsafePointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
fn __copyinit__(inout self, other: Self):
self.val = other.val
fn __moveinit__(inout self, owned other: Self):
self.val = other.val
def print_pointer(p: UnsafePointer[A]):
print(p)
def print_value(p: UnsafePointer[A]):
print(p[].val) # De-reference the pointer and then get the `val` field
Usage:
var x: UnsafePointer[A] = UnsafePointer[A].alloc(1) # Allocate space to store one instance of `A`
print_pointer(x) # Prints hexa decimal representation of the address within `x`
x.init_pointee_move(A(10)) # Initialize the location
print_value(x)
x.destroy_pointee() # Destroy the value before trying to free the memory
x.free() # Frees the memory pointed by `x`
Here we see that a memory storage for type A
is allocated on the heap using alloc
. You can think of alloc
allocating an array with length specified by the count
argument. The count
is the number of items of type A
planned to be stored in that memory storage. The alloc
method calculates the size of the type A
and then allocate the total number of bytes for it. The space is initialized with value using init_pointee_move
. Mojo stores the value in the allocated memory storage contiguously. In order to get to the initialized value, we use the de-reference operator []
. You can think of the de-reference operator as array subscript getting the first element of an array (actually p[0]
would also work, but is discouraged). Before freeing the memory used by a value, we must call destroy_pointee
, to ensure that all the related resources are also freed. Otherwise we would face a memory leak. Later we de-allocate the space using the free
method of UnsafePointer
.
On a quick look the above code seems to be simple and easy to reason, where the memory is allocated and freed. However, there is a potential memory leak. This would happen if the print_pointer
or print_value
throws an exception and the function exits early because of that. When that happens, the free
will not get executed and the allocated memory in the heap would not be returned to the heap. This would lead to a memory leak, or worse, a security issue because now the data is available to potentially malicious code.
In the following code listing we see that the memory is freed in all cases, including errors.
from memory import UnsafePointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
fn __copyinit__(inout self, other: Self):
self.val = other.val
fn __moveinit__(inout self, owned other: Self):
self.val = other.val
def print_pointer(p: UnsafePointer[A]):
print(p)
def print_value(p: UnsafePointer[A]):
print(p[].val) # De-reference the pointer and then get the `val` field
Usage:
var y: UnsafePointer[A] = UnsafePointer[A].alloc(1)
try:
y.init_pointee_move(A(10))
print_pointer(y)
finally:
x.destroy_pointee() # Destroy the value even in case of exception
y.free() # Free the memory even in case of exception
What if we already have an instantiated value and wanted to find its address? The UnsafePointer
provides address_of
method just for that.
from memory import UnsafePointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
fn __copyinit__(inout self, other: Self):
self.val = other.val
fn __moveinit__(inout self, owned other: Self):
self.val = other.val
def print_pointer(p: UnsafePointer[A]):
print(p)
def print_value(p: UnsafePointer[A]):
print(p[].val) # De-reference the pointer and then get the `val` field
Usage:
var a_instance = A(12)
var z: UnsafePointer[A] = UnsafePointer[A].address_of(a_instance)
print_pointer(z)
As mentioned earlier, care should be taken to understand the lifecycle of the value before passing its address. For example, returning the address of a value allocated on a stack frame to the callee stack frame would most likely result in a dangling pointer. Since Mojo destructs value on last use, it is also necessary to keep the value alive until the pointer’s purpose is completed, otherwise we have a pointer that points to an uninitialized memory since the value was already destructed.
Mojo also provides another option to initialize an allocated memory, using init_pointee_copy
. The difference is that the init_pointee_move
moves the value to the allocated space, while the init_pointee_copy
just copies the value, leaving the original value intact.
from memory import UnsafePointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
fn __copyinit__(inout self, other: Self):
self.val = other.val
fn __moveinit__(inout self, owned other: Self):
self.val = other.val
def print_pointer(p: UnsafePointer[A]):
print(p)
def print_value(p: UnsafePointer[A]):
print(p[].val) # De-reference the pointer and then get the `val` field
Usage:
var a: UnsafePointer[A] = UnsafePointer[A].alloc(1)
a.init_pointee_copy(A(11)) # Initialize the location
print_value(a)
It is also possible that we end up in a situation where two pointers are pointing to the same value. In this case, when a pointer is used to modify the value, the other pointer automatically refers to the modified value. In some case it may be intentional from the programmer’s side. However, in most cases, it results in defects that are hard to track and fix.
var another_a = A(100)
var i: UnsafePointer[A] = UnsafePointer[A].address_of(another_a)
var j: UnsafePointer[A] = UnsafePointer[A].address_of(another_a) # Pointer aliased
print(i[].val) # Prints 100
i[].val += 10
print(j[].val) # Prints 110
In the above example, two pointers were pointing to the same value. When one pointer was used to update the value, the other pointer picked up the updated value. This code is a simple one and it is easy to detect what is happening. However in large code bases especially in multi-threaded scenarios, such cases result in hard to find defects.
All bets are off when we use UnsafePointer
. Mojo does not track what goes on with its usage. This is by design. UnsafePointer
is useful for those cases where the application needs to work intimately with the hardware or the operating system. Therefore, use it with care.
If pointers are such a pain, then why do we use it?
13.2.1. ARC
As you have already seen, managing values on the heap requires a lot of care. Some programming languages make it easy to manage values allocated on the heap through the use of a garbage collector. A garbage collector keeps track of values on the heap and as soon as it determines that nothing is using it, it will automatically clear the value and its associated memory (the unused value is the "garbage"). There are different approaches to garbage collection. One of them is reference counting.
Mojo has a reference counted pointer named as Arc
. The Arc
keeps the count of its copies in use. If an Arc
pointer is created, and it is then passed to another function as a copy while sharing the same underlying value, the Arc
will increment its count. When all the Arc
pointers to that value is destructed, the count becomes zero and the underlying value will also be destructed and its memory cleared.
from memory import Arc
var arcp = Arc(A(200))
# Reference count is 1
var arcp2 = arcp # Reference count increased with this copy
# Count is now 2 as `arcp` is copied to new `Arc` called `arcp2`
print(arcp[].val) # Prints 200
arcp[].val += 10
# Count is now 1 as `arcp` is cleared as the above line was its last use
print(arcp2[].val) # Prints 210
# Value is cleared from memory
Arc
solves de-allocation issues we saw earlier by piggy backing on Mojo compiler’s eager destruction of values on the stack frame. However, it does not solve other issues like concurrent mutation.
13.3. Pointer
If Arc
keeps a runtime record of its copies so as to ensure that the value is destructed when it is no longer needed, Pointer
helps Mojo compiler to track it at compile-time.
The Pointer
type takes in an Origin
parameter, which is then tracked across assignments and moves. This tracking by the compiler ensures that the value pointed by Pointer
lives at least as long as the pointer being used, preventing dangling pointers.
The Pointer
type in Mojo is always derived from a preexisting value, unlike UnsafePointer
that can exist independent of the value (e.g. null pointers are impossible in Pointer
). The Mojo compiler also does not allow replacing values referred by the Pointer
with another value.
from memory.pointer import Pointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
def print_value(p: Pointer[A]):
print("print_value called for val:", p[].val)
print(p[].val) # De-reference the reference and then get the `val` field
Usage:
var a_instance = A(12)
var ref_a = Pointer.address_of(a_instance) # Construct a Pointer out of an existing value
print_value(ref_a) # Prints 12
Origin
gets transferred over to copies of the pointer.
from memory.pointer import Pointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
def print_value(p: Pointer[A]):
print("print_value called for val:", p[].val)
print(p[].val) # De-reference the reference and then get the `val` field
Usage:
var a_instance = A(12)
var ref_a = Pointer.address_of(a_instance) # Construct a Pointer out of an existing value
print_value(ref_a) # Prints 12
var ref_a2 = ref_a # The same origin is copied onto the new pointer too
ref_a2[].val = 13 # De-reference and assign value
print_value(ref_a2) # Prints 13
Mojo ensures that even though the pointer data types are the same (Pointer[A]
) they are not assignable to each other because Origin
is part of the pointer’s essence. Origin
is not exactly a type, but rather a dynamic property of the underlying value of the pointer.
from memory.pointer import Pointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
def print_value(p: Pointer[A]):
print("print_value called for val:", p[].val)
print(p[].val) # De-reference the reference and then get the `val` field
Usage:
var a_instance = A(12)
var ref_a = Pointer.address_of(a_instance) # Construct a Pointer out of an existing value
print_value(ref_a) # Prints 12
var another_instance = A(14)
var ref_another_a = Pointer.address_of(another_instance)
# ref_another_a = ref_a # Uncommenting this would result in compilation error.
print_value(ref_another_a)
Functions in mojo take arguments as read-only by default. This extends to origins also. A mutable origin will automatically become an immutable one when taken as a function argument.
from memory.pointer import Pointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
def print_value(p: Pointer[A]):
print("print_value called for val:", p[].val)
print(p[].val) # De-reference the reference and then get the `val` field
Usage:
var a_instance = A(12)
var ref_a = Pointer.address_of(a_instance) # Construct a Pointer out of an existing value
print_value(ref_a) # Prints 12
def change_value(p: Pointer[A]):
print(p[].val)
#p[] = 16 # Uncommenting this would result in compilation error.
...
change_value(ref_a)
ref_a[].val = 15 # Works
In the previous code listing the ref_a
has a mutable origin, but when it is passed in to change_value
, it is turned into an immutable origin. While we can print its value (i.e., read), we cannot change its value as it is immutable.
What if we really wanted to pass a mutable pointer to the function? The answer to this lies in MutableOrigin
. With MutableOrigin
we are indicating to the compiler to not convert the mutable origin carried by the pointer to an immutable one.
from memory.pointer import Pointer
struct A:
var val: Int
fn __init__(inout self, val: Int):
self.val = val
def print_value(p: Pointer[A]):
print("print_value called for val:", p[].val)
print(p[].val) # De-reference the reference and then get the `val` field
Usage:
var a_instance = A(12)
var ref_a = Pointer.address_of(a_instance) # Construct a Pointer out of an existing value
print_value(ref_a) # Prints 12
def change_value2[origin: MutableOrigin](p: Pointer[A, origin=origin]):
p[] = 16 # Now it works
...
change_value2(ref_a)
print(ref_a[].val) # Prints 16
Here we are taking a MutableOrigin
as parameter, which is then passed on to the pointer. The compiler would then preserve the original mutability of the pointer.