5. Types
Mojo provides quite a number of data types out of the box for our use. Some of those types are described below.
5.1. Bool
The simplest of all types, a Bool represents the values True or False. A Bool value stores exactly one bit, either 1 representing the value True or 0 representing the value False. True and False are built-in constants in Mojo, and are treated as keywords.
var bool_value: Bool = True
5.2. Int
Int is one of the most used data types in programming. It represents a mathematical integer; however, there are limits on how big a value it can store. The Int type in Mojo is a built-in type and its size depends on the CPU architecture your program is running on. On a 64-bit architecture, the Int type has a size of 64 bits, whereas on a 32-bit architecture it has a size of 32 bits. How big a number can fit in an integer type depends on whether that integer is "signed" or "unsigned". A signed integer can hold both negative and positive values. An unsigned integer does not allow any negative values. Int is a signed integer and therefore, on a 64-bit CPU architecture, it allows values in the range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, both inclusive.
var int_value: Int = 42
5.3. UInt
Similar to Int, the UInt type in Mojo is a built-in type and its size depends on the CPU architecture your program is running on. The main difference from Int is that UInt is unsigned, meaning it represents only non-negative integers (0 and the positive integers). Since it never holds negative values, the bit that a signed integer uses for the sign is free to represent values. This means that its maximum possible value is roughly twice that of the signed Int type. For example, a 64-bit unsigned integer has the range 0 through 18,446,744,073,709,551,615.
var uint_value: UInt = 84
5.4. IntLiteral
IntLiteral is the type of an integer value that you write directly in source code. It has infinite precision, but currently cannot be represented at runtime when the value is higher than the one supported by Int. Mojo allows the underscore character "_" to be used as a separator in int literals, which makes large numbers easier to read.
var int_lit: IntLiteral = 10_000
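As a minimal sketch of the separator rule, the snippet below declares the same number with and without underscores and compares the two. The variable names are illustrative, and the code is wrapped in a main function on the assumption that your Mojo version requires one as the entry point.

```mojo
def main():
    # The same value written without and with digit separators.
    var plain: Int = 1000000
    var grouped: Int = 1_000_000

    # Underscores only affect readability, so both variables are equal.
    print(plain == grouped)  # prints True
    print(grouped)           # prints 1000000
```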
In the code below, you can see a very large value being operated upon using floor division (we will cover floor division later when we cover operators). This is one of the benefits of using IntLiterals, as compile-time calculations can be done with very large precision. When you execute the code, it will print 10000.
print(9999999999999999999999999999999999999999999//999999999999999999999999999999999999999)
IntLiterals can be assigned to Int types. The reverse is not possible, because the value of an IntLiteral always comes from the Mojo source code, whereas the value of an Int may come from other sources such as files, the network, or source code. This holds true for all other literal types in Mojo.
5.5. String
String is also one of the most used data types in programming. It is a sequence of Unicode characters representing a given text. Unicode is a text encoding standard maintained by the Unicode Consortium and consists of more than a hundred thousand code points representing characters in almost all of the world’s writing systems. Since String abstracts over a sequence of Unicode characters, you would expect the length of a String to be the count of characters (grapheme clusters, to be precise). Be aware, though, that Mojo releases have reported the length of a String in bytes, so check the behavior of the version you are using.
However, to store or transport such a String we need to represent that String as a sequence of bytes. A popular character representation format is UTF-8, which uses one or more bytes per character depending on the Unicode code point (an integer value designated to represent the character).
When receiving or sending strings over files or network, always ensure that you know what encoding is being used. Quite often subtle defects occur because the programmer expected a different encoding than the one they received.
Strings in Mojo are immutable. Any modification of the String actually returns a new String.
var strg: String = "Hello World!"
5.6. StringLiteral
When you directly provide strings in source code within double quotes or single quotes, the value gets assigned the type StringLiteral.
Mojo allows embedding of one type of quote within a string of the other type of quote. For example, you can embed '' within "", and vice versa. However, make sure to use the same type of quotes for beginning and end of the string.
var strg_lit: StringLiteral = "Hello World!"
var strg_lit2: StringLiteral = 'Hello World!'
var strg_lit3: StringLiteral = 'Hello "World"!'
var strg_lit4: StringLiteral = "Hello 'World'!"
You can define multi line strings using three double quotes like """ or three single quotes like '''. Multi line strings will preserve the new line characters and white spaces.
var strg_lit_multi: StringLiteral = """
Hello World!
"""
var strg_lit_multi2: StringLiteral = '''
Hello World!
'''
var strg_lit_multi3: StringLiteral = '''
Hello """World"""!
'''
var strg_lit_multi4: StringLiteral = """
Hello '''World'''!
"""
StringLiterals can be assigned to String; this is why, when you declare a String variable, you are able to pass a string literal in source code to it.
5.7. FloatLiteral
FloatLiteral is the type that the Mojo compiler assigns to a value when you provide a numeric value with a decimal point in the source code. The FloatLiteral is "double precision", which is represented with 64 bits. The mantissa part of the value is represented by 52 bits and the exponent part of the value is represented by 11 bits. The one remaining bit is used for the sign.
var float_lit: FloatLiteral = 2.005
5.8. Float16
Float16 is a 16-bit floating point type, also known as "half precision". On some machines lower precision types can be much faster than higher precision types, so they are quite useful if high precision is not important in your domain.
var float_16: Float16 = 1.011
5.9. Float32
Float32 is a 32-bit floating point type, also known as "single precision". This type has a 23-bit mantissa and an 8-bit exponent, and the one remaining bit is used for the sign.
var float_32: Float32 = 3.25
5.10. Float64
Float64 is a 64-bit floating point type, also known as "double precision". The 64 bits are distributed as 52 bits for the mantissa, 11 bits for the exponent and the one remaining bit for the sign. This is the same precision that FloatLiteral has.
var float_64: Float64 = 5.6
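The difference in precision can be observed with a value such as 0.1, which has no exact binary representation. The following is a minimal sketch with illustrative variable names; no expected output is shown because the exact digits printed may vary between Mojo versions, but the Float32 value will generally deviate from 0.1 at an earlier digit than the Float64 value.

```mojo
def main():
    # The same literal stored at two different precisions.
    var single: Float32 = 0.1
    var double: Float64 = 0.1

    # 0.1 cannot be represented exactly in binary, so each variable holds
    # the nearest value that its mantissa width allows.
    print(single)
    print(double)
```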
5.11. Int8
Int8 is a signed integer represented with 8 bits. It has a range of values from -128 to 127. Integers represented with a low number of bits save space in memory and can also be used to enforce a supported range of values. Similar to floats, Int8 reserves one bit to represent a positive or negative sign.
var int_8: Int8 = -20
5.12. UInt8
Similar to Int8, UInt8 is represented by 8 bits, but it is an unsigned integer. Since it is unsigned, the range of UInt8 is from 0 to 255.
var uint_8: UInt8 = 20
5.13. Int16
Int16 is represented with 16 bits. It has a range of values from -32,768 to 32,767.
var int_16: Int16 = -29
5.14. UInt16
UInt16 is also represented with 16 bits. It has a range of values from 0 to 65,535.
var uint_16: UInt16 = 34
5.15. Int32
Int32 is represented with 32 bits. It has a range of values from -2,147,483,648 to 2,147,483,647.
var int_32: Int32 = -78
5.16. UInt32
UInt32 is represented with 32 bits. It has a range of values from 0 to 4,294,967,295.
var uint_32: UInt32 = 87
5.17. Int64
Int64 is represented with 64 bits. It has a range of values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
var int_64: Int64 = -65
5.18. UInt64
UInt64 is represented with 64 bits. It has a range of values from 0 to 18,446,744,073,709,551,615.
var uint_64: UInt64 = 77
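The ranges of the fixed-width integer types above can be checked by storing the boundary values. The following is a minimal sketch for the 8-bit types, with illustrative variable names; the same pattern applies to the wider types.

```mojo
def main():
    # Smallest and largest values that fit in a signed 8-bit integer.
    var lowest: Int8 = -128
    var highest: Int8 = 127

    # Largest value that fits in an unsigned 8-bit integer.
    var top: UInt8 = 255

    print(lowest, highest, top)  # prints -128 127 255
```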
5.19. BFloat16
BFloat16 is represented with 16 bits. It is known as brain floating point. Its main use is in machine learning to increase the performance of ML algorithms.
5.20. SIMD
SIMD stands for Single Instruction, Multiple Data. Processors that support SIMD allow parallel processing of multiple data points using exactly the same instruction. SIMD was initially implemented for supercomputers but, over time, came to be used in desktop computers as multimedia consumption on desktops increased. The main benefit of SIMD is in performing vector and matrix operations, where the same operation often needs to be applied to many elements of those data structures.
Mojo provides out of the box support for SIMD. Most of the base types mentioned above are built on top of Mojo’s SIMD type.
var simd1: SIMD[DType.int8, 4] = SIMD[DType.int8, 4](10)
var sc: Int8 = 3
print(simd1 * sc)
In the above code, a SIMD vector of 4 elements containing data of type Int8 is instantiated with the value 10 assigned to all the elements. When we then multiply it by the value 3, each element is multiplied by that scalar, resulting in [30, 30, 30, 30]. On supported hardware, a single instruction is applied over the 4 elements at the same time to yield the vector of resulting values.
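The same element-wise behavior applies when both operands are SIMD vectors. The following is a minimal sketch, assuming the SIMD constructor accepts one value per element; the variable names are illustrative.

```mojo
def main():
    # Two vectors of four 32-bit integers, each element given explicitly.
    var a = SIMD[DType.int32, 4](1, 2, 3, 4)
    var b = SIMD[DType.int32, 4](10, 20, 30, 40)

    # Arithmetic is applied to each pair of elements at the same position.
    print(a + b)  # prints [11, 22, 33, 44]
    print(a * b)  # prints [10, 40, 90, 160]
```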
5.21. DType
In the previous example you saw the instantiation of a SIMD instance by passing a data type DType.int8. DType in Mojo provides a list of data types that are supported within Mojo. One of the uses of DType is to pass data types as arguments to functions. DType also provides some operations that help in introspecting, at runtime, different attributes of the data type. DType is particularly useful in enabling compile-time optimization by creating specialized code for a particular type.
fn introspect(type: DType):
    print("Bit width:", type.bitwidth())
    print("Is signed:", type.is_signed())
introspect(DType.float16)
In the above example, we can write a generic function that takes any DType and prints its bit width and whether or not it is a signed type.
5.22. Type safety
Let’s try something. Execute the following code in Mojo.
def main():
    var int_value: Int = "42"
Executing the code listed above results in:
error: cannot implicitly convert 'StringLiteral' value to 'Int' in 'var' initializer
var int_value: Int = "42"
The reason is simple. Mojo is strongly typed. When you specify that a variable has type Int, then it expects either Int values or values that can be converted to Int. In this particular case, we tried to pass a string literal as an integer, and the Mojo compiler did not allow us to do that. If Mojo were not that strict, we could end up with defects where we assume a variable is of a particular type which in reality it is not. This is of particular concern in large code bases worked on by many people.
Now let’s look into the following.
def main():
    var string_val: String = str(42)
    print(string_val)
The code shown above compiles and runs successfully and prints 42. The reason is a bit less obvious. String provides an initializer that takes integer values as an input argument. When the Mojo compiler encounters incompatible types but finds such an initializer, it automatically initializes the value with the passed-in value. We will cover initializers later on.
5.23. object
As you have seen earlier, Mojo is quite strict about types. How about the situation when you do not yet know or do not care about the type of the variable, but still want to perform some computation? Mojo provides the object type for such cases.
fn add(a: object, b: object) raises -> object:
    return a + b
print(add(1, 2.5))
If you execute the above code, you would see the result 3.5 printed on screen. The reason why Mojo did not complain about the type incompatibility of arguments is that the object type has initializers for many built-in data types. Similar to the example mentioned above for String, Mojo calls the appropriate initializer in object corresponding to the type of the given value. If object does not have an initializer for a given type, then a value of that particular type cannot be assigned to variables of object type.
In the above case, object has initializers for both Int and FloatLiteral. Mojo then instantiates one object with Int and the other object with FloatLiteral as its underlying value.
In the case of def functions, when you omit type annotations on variable, argument, or return declarations, Mojo automatically assigns them the type object.
5.24. Tuple
Tuple in Mojo is an ordered sequence of values. A Tuple can have many elements of different types. Mojo uses () to represent Tuple literals in source code.
var t: Tuple[Int, Bool, Float64] = (1, False, 3.5)
The code listed above defines a tuple with elements 1, False and 3.5. You may have noticed that the code above defined some parameters within square brackets. We will come to it in a later chapter.
You can also get the length of the tuple by using Mojo’s built-in function len, as seen below.
print(len(t))
An empty tuple can be defined using just ().
var e: Tuple = ()
print(len(e))
Earlier we saw a tuple being declared with Tuple[Int, Bool, Float64]. We can also declare a tuple as (Int, Bool, Float64). Both declarations are effectively the same.
var altr: (Int, Bool, Float64) = (1, False, 3.5)
print(len(altr))
To get an element of a tuple, you can use the subscript operator [] and pass the index within the square brackets. Note that, like most other languages, Mojo uses zero-based indexing. The ability to use subscripts also applies to lists.
var access: (Int, Bool, Float64) = (1, False, 3.5)
print("First value", access[0])
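To make the zero-based indexing concrete, the sketch below reads every element of a three-element tuple in turn. It assumes, as the text above describes, that the subscript operator is available for tuples in your Mojo version; the variable name is illustrative.

```mojo
def main():
    var point: (Int, Bool, Float64) = (1, False, 3.5)

    # Valid indexes run from 0 up to len(point) - 1.
    print(point[0])    # first element, an Int
    print(point[1])    # second element, a Bool
    print(point[2])    # third element, a Float64
    print(len(point))  # prints 3
```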
In def style functions, you can unpack the values of a tuple into different individual variables. The individual variables will have the right data types according to the values that are assigned. The first variable on the left-hand side gets the first value of the tuple on the right-hand side, the second variable on the left-hand side gets the second value of the tuple on the right-hand side, and so on.
def multi_vars():
    a, b = (1, False)
    print("Variables a & b:", a, b)
multi_vars()
5.25. ListLiteral
Similar to Tuple, Mojo also provides support for ListLiteral. A ListLiteral can have many elements of different types. Mojo uses [] to represent list literals in source code.
var l: ListLiteral[Int, Float64] = [1, 3.5]
print(len(l))
The code shown above defines a list with elements 1 and 3.5. As with Tuple, some parameters are defined within square brackets, which we will cover later on.
An empty list can be defined using just [].
5.26. DictLiteral
To get an element of a dictionary, you can use the subscript operator []. Within the [] you can pass the key with which the dictionary is indexed.
5.27. Type inference
In cases where Mojo can infer types for variables, we can omit the type declaration of variables. For example, if a variable is initialized at the time of its declaration, then the Mojo compiler is able to infer the type of the variable.
In the following example, even though we do not declare the types of bool_value2 and int_value2, Mojo is able to infer the types as Bool and Int respectively.
var bool_value2 = True
var int_value2 = 1
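Inference works the same way for other kinds of initial values. The following is a minimal sketch with illustrative variable names; the comments state the kind of type you would expect Mojo to infer, and the exact inferred floating point type may depend on your Mojo version.

```mojo
def main():
    var count = 10       # inferred as Int
    var ratio = 2.5      # inferred as a floating point type
    var enabled = False  # inferred as Bool

    # The variables can be used like explicitly typed ones.
    print(count, ratio, enabled)
```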