Mojo By Example: A Comprehensive Introduction to the Mojo Programming Language

12. Ownership and lifecycle

Before we jump into the lifecycle operations, let’s understand the concept of ownership of references.

12.1. Pass by value and pass by reference

We have two ways to pass something to a function or method. One is pass by value and the other is pass by reference.

A value is passed by value when the actual value of a variable is handed to the function, which copies it into the callee function's argument. The callee then has its own copy and the caller keeps another. If the callee changes its copy, the change is not reflected in the caller. In Mojo, data types that fit within the registers of the CPU are passed by value by default, so the callee gets a copy of the value. Likewise, when we assign one variable to another, the value of the variable is copied to the assignee.
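The copy made on assignment can be seen in a minimal sketch. The variable names are illustrative only, and the snippet assumes the same Mojo version as the rest of this chapter.

```mojo
fn main():
    var a = 10
    var b = a  # The value of a is copied into b
    b = 20     # Changing b leaves a untouched
    print(a, b)  # Prints 10 20
```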

The second way is to pass the location where the value is stored. In this case, both the caller and the callee refer to the exact same location of the value. We can say that the caller is passing a reference to the value to the callee. So if the callee changes the value, that change is immediately visible to the caller.

When we pass a value by reference to a function, that function can potentially change the value. If the caller does not expect its value to change but the callee changes it anyway, we end up with defects. This is a common source of defects in many programming languages that support pass by reference. So how can a function tell its caller whether it intends only to read the value or also to change it? Mojo's solution is to annotate the function arguments with a set of keywords that show the intent.

12.2. read

The read keyword indicates that the argument is used only to read the value and that the argument's value will not be changed. This is the default behavior for all Mojo function arguments, so the read keyword does not need to be written.

When an argument is marked as read, the Mojo compiler prevents any mutation of the argument's value. It also does not allow the argument to be rebound to another value, as that would discard (and destroy) the original value held by the argument. Since we are only borrowing the value, the caller does not expect it to be destroyed.

fn value_read(read val: Int):
    ...
fn value(val: Int): # This is also read
    ...
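As a small illustration of a read argument in use, the following sketch uses a hypothetical function named doubled. It assumes the same Mojo version as this chapter. The commented-out line shows the kind of mutation the compiler is expected to reject.

```mojo
# `doubled` is a hypothetical name used only for this illustration.
fn doubled(read val: Int) -> Int:
    # val += 1  # Uncommenting this should fail to compile: val is read-only
    return val * 2

fn main():
    var x = 21
    print(doubled(x))  # Prints 42
    print(x)           # Prints 21, the caller's value is unchanged
```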

In def functions, you may find that Mojo seemingly allows mutation of the arguments. However, behind the scenes, it is performing a copy-on-write. This means that the argument is copied transparently to the developer, and the original argument is left intact. This is done so that for the developer the def function feels similar to how it works in Python.

def cow(val: Int): 
    val = 20

Usage:

    var x = 10
    cow(x)
    print(x) # Prints 10

12.3. owned

The owned keyword indicates that the function takes ownership of the given argument. This means that we are free to mutate or destroy the passed value within that function.

When an argument is owned, the function can be sure that it is allowed to mutate the argument. Mojo may pass a copy of the value to the function in such cases. When the value is copied, the caller has its own copy and the callee function has its own copy.

fn value_owned(owned val: Int):
    ...
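The following sketch shows an owned argument being mutated while the caller's variable stays intact. The function name bump is hypothetical, and the snippet assumes the same Mojo version as this chapter. Because x is used again after the call, a copy is passed to the callee.

```mojo
# `bump` is a hypothetical name used only for this illustration.
fn bump(owned val: Int) -> Int:
    val += 1  # Allowed: the function owns val
    return val

fn main():
    var x = 10
    var y = bump(x)  # x is used later, so the callee works on its own copy
    print(y)  # Prints 11
    print(x)  # Prints 10
```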

12.4. mut

The mut keyword indicates that the function will potentially mutate the value within the passed reference. The difference from owned references is that while owned arguments may get copied as needed, the mut arguments are not copied.

fn value_mut(mut val: Int):
    ...
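For contrast with owned, here is a minimal sketch of a mut argument, using a hypothetical function named bump_in_place and assuming the same Mojo version as this chapter. No copy is made, so the change is visible to the caller.

```mojo
# `bump_in_place` is a hypothetical name used only for this illustration.
fn bump_in_place(mut val: Int):
    val += 1  # Mutates the caller's value directly

fn main():
    var x = 10
    bump_in_place(x)
    print(x)  # Prints 11
```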

12.5. out

The out keyword indicates that the function will initialize a value into the passed reference. An out argument is implicitly returned by the function, so the function must initialize it before returning. The caller of a function that takes an out argument allocates storage and passes it to the callee, and the callee then initializes the value in that storage. The callee does not need to return the value explicitly, as the value is already stored in the location passed by the caller. This is sometimes useful when we want to improve performance by avoiding a move or copy of the value on return.

fn value_out(out val: Int):
    val = 10

Usage:

    print(value_out()) # Prints 10

Note that at the usage site, we are not passing any argument to the function value_out. Instead, the compiler allocates storage for the value and passes that storage location to value_out, which initializes the value there. The value is then available to the caller at that location.

Let’s now look into the lifecycle methods. We start with one that we are already familiar with: the init method.

12.6. `__init__`

The init method is part of the lifecycle of a struct. The main purpose of init is to initialize all the member variables (a.k.a. fields) of the struct.

from memory import UnsafePointer
struct MyNumber:

    var value_ptr: UnsafePointer[Int]

    fn __init__(out self, value: Int):
        self.value_ptr = UnsafePointer[Int].alloc(1)
        self.value_ptr.init_pointee_move(value)

    fn value(self) -> Int:
        return self.value_ptr[]

    fn change_value(self, value: Int):
        self.value_ptr.init_pointee_move(value)

Usage:

    var num: MyNumber = MyNumber(42)
    print("num:", num.value())

In this example, we defined a struct with an init method. The first argument of the init method is always self, marked with the out modifier. Here self refers to the struct instance that is being created. The out modifier tells the compiler that self is to be initialized and returned by the method (i.e., we can set the field values held within self, and self is returned). Function arguments in Mojo are read-only by default, so without out we could not assign to the fields of the struct. Since one of the main responsibilities of init is to initialize the fields of the struct and return the new instance, we naturally need to mark self as out. In effect, the init method behaves like a static method on Self that returns an instance of Self.

In the example, we allocate memory from the heap to store an integer value using the static method call UnsafePointer[Int].alloc. We store a value into the pointer location using the method init_pointee_move on the UnsafePointer. We retrieve the stored value from the pointer using the dereference operator [].

12.7. `__del__`

The delete method del is also part of the lifecycle of a struct. Where the init method initializes variables or allocates resources for the struct, the delete method releases the resources held by it. For example, if the init method allocates memory from the heap, the delete method is used to free that memory. The del method is called just before the value is destroyed by the compiler. If we allocate resources in the init method and do not release them in the delete method, we end up with resource leaks such as memory leaks. So great care must be taken to allocate and free resources symmetrically in the init and delete methods.

from memory import UnsafePointer
struct MyNumber:

    var value_ptr: UnsafePointer[Int]

    fn __init__(out self, value: Int):
        self.value_ptr = UnsafePointer[Int].alloc(1)
        self.value_ptr.init_pointee_move(value)

    fn value(self) -> Int:
        return self.value_ptr[]

    fn change_value(self, value: Int):
        self.value_ptr.init_pointee_move(value)

    fn __del__(owned self):
        self.value_ptr.destroy_pointee()
        self.value_ptr.free()

Usage:

    var num: MyNumber = MyNumber(42)
    print("num:", num.value())

12.7.1. Eager destruction

Unlike many other languages, Mojo takes an eager destruction approach. This means that a value or object is destroyed as soon as its last use is over. This is in contrast with many systems languages, where values or objects are destroyed at the end of the enclosing scope. This approach allows Mojo to have much simpler lifecycle management, improving the overall ergonomics of the language.

Eager destruction might sometimes surprise us while debugging Mojo programs, as a variable may already have been destroyed if it is no longer referenced past the debug breakpoint. A common pattern to extend the lifetime of a value is to assign it to _.

    var a_value: MyNumber = MyNumber(1)
    print("Hello")
    ...
    print("World")
    _ = a_value # Keep the life of a_value until this point
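To observe eager destruction, one option is a struct whose delete method prints a message. The sketch below uses a hypothetical struct named Tracer and assumes the same Mojo version as this chapter. With eager destruction, one would expect the message for early to appear before "Hello", since early is never used after it is created, while late stays alive until its final use. The exact order is decided by the compiler, so no output is listed here.

```mojo
# `Tracer` is a hypothetical struct used only for this illustration.
struct Tracer:
    var id: Int

    fn __init__(out self, id: Int):
        self.id = id

    fn __del__(owned self):
        # Runs when the compiler destroys the value
        print("destroying tracer", self.id)

fn main():
    var early = Tracer(1)  # Never used again after this line
    var late = Tracer(2)
    print("Hello")
    print("World")
    # This read is the last use of late, so it must stay alive until here
    print("late is still alive:", late.id)
```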

12.8. `__copyinit__`

Mojo invokes copyinit wherever a value needs to be copied. For example, when one variable is assigned to another, copyinit may be called for the assignee. This method is quite similar to the init method in the sense that it initializes the struct. In contrast to init, copyinit gets an additional argument of the same type as the struct in which the method is declared (Mojo names this type Self). Within copyinit, you are expected to initialize the member fields with values copied from the "other" struct. In other languages, copyinit is known as a copy constructor.

The Mojo compiler tries to optimize away copies as much as possible, especially where the original value is not used later on.

from memory import UnsafePointer
struct MyNumber:

    var value_ptr: UnsafePointer[Int]

    fn __init__(out self, value: Int):
        self.value_ptr = UnsafePointer[Int].alloc(1)
        self.value_ptr.init_pointee_move(value)

    fn value(self) -> Int:
        return self.value_ptr[]

    fn change_value(self, value: Int):
        self.value_ptr.init_pointee_move(value)

    fn __copyinit__(out self, other: Self):
        self.value_ptr = UnsafePointer[Int].alloc(1)
        self.value_ptr.init_pointee_copy(other.value())

Usage:

    var num: MyNumber = MyNumber(42)
    print("num:", num.value())

    var other_num: MyNumber = num # Calling __copyinit__ on other_num
    print("other_num after copy:", other_num.value())
    other_num.change_value(84)
    print("other_num after change:", other_num.value())
    print("num after copy:", num.value())

In the previous code listing, within the copyinit, we are allocating new memory for holding the copy of the value from other. The other has type Self, which means the same type as the struct defining the copyinit - in this case MyNumber.

12.9. `__moveinit__`

Mojo invokes moveinit for cases where a value needs to be physically moved. This method is quite similar to the init method in the sense that it initializes the struct. In contrast to copyinit, moveinit has its second argument annotated with owned. The owned is required because the second argument's value will be destroyed once the move operation completes. In moveinit, we transfer the values from the other struct to the instance being initialized.

One thing to note is that moveinit may not always be called when a value is "moved" (even when using the transfer operator ^). You can think of a move as a transfer of ownership. Often the compiler can treat it as a simple transfer of ownership (logical move), but sometimes it needs to copy the value to a new location and clear the old location (physical move). A physical move happens, for example, when a value moves from a short-lived stack frame to a longer-lived one.

Move operations are particularly useful for cases where copy operations are expensive. For example, Mojo uses move semantics for String. This keeps string operations as efficient as possible.

To move a value, the caret ^ operator is used.

    fn __moveinit__(out self, owned other: Self):
        self.value_ptr = other.value_ptr
        other.value_ptr = UnsafePointer[Int]()

Usage:

    var num: MyNumber = MyNumber(42)
    print("num:", num.value())

    var other_num2: MyNumber = num^ # Moving
    print("other_num2 after move:", other_num2.value())
    other_num2.change_value(84)
    print("other_num2 after change:", other_num2.value())
    # Uncommenting below line results in compiler error as `num` is no longer initialized
    #print("num after copy:", num.value())

Because the Mojo compiler decides when to physically copy or move a value, we cannot make assumptions about when moveinit and copyinit are actually called.
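One way to see which lifecycle methods the compiler actually calls is to print from each of them. The sketch below uses a hypothetical struct named Traced and assumes the same Mojo version as this chapter. The trace lines it produces depend on the compiler's decisions, so none are listed here. Only the final line of output is certain.

```mojo
# `Traced` is a hypothetical struct used only for this illustration.
struct Traced:
    var id: Int

    fn __init__(out self, id: Int):
        self.id = id
        print("init", id)

    fn __copyinit__(out self, other: Self):
        self.id = other.id
        print("copyinit", self.id)

    fn __moveinit__(out self, owned other: Self):
        self.id = other.id
        print("moveinit", self.id)

    fn __del__(owned self):
        print("del", self.id)

fn main():
    var a = Traced(1)
    var b = a   # a is used again below, so a copy is requested here
    var c = b^  # Ownership of b is transferred to c
    print(a.id, c.id)  # Prints 1 1
```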

The different lifecycle operations are illustrated in the following diagram.