A new introduction to Rust

2015-02-27

Lately, I’ve been giving a lot of thought to first impressions of Rust. On May 15, we’re going to have a lot of them. And you only get one chance at a first impression. So I’ve been wondering if our Intro and Basics are putting our best foot forward. At first I thought yes, but a few days ago, I had an idea, and it’s making me doubt it, maybe. So instead of re-writing all of our introductory material, I’m just going to write the first bit. A spike, if you will. And I’d like to hear what you think about it. This would take the same place as 2.4: Variable bindings in the existing structure, at the point where readers have installed Rust and gotten Hello World working.

$ rustc --version
rustc 1.0.0-dev (dcc6ce2c7 2015-02-22) (built 2015-02-22)

Hello, Ownership

Let’s learn more about Rust’s central concept: ownership. Along the way, we’ll learn more about its syntax, too. Here’s the program we’re going to talk about:

fn main() {
    let x = 5;
}

This small snippet is enough to start with. First up: let. A let statement introduces a variable binding. Bindings allow you to associate a name with some sort of value.

Why ‘variable binding’? Rust draws heavily from both systems languages and functional programming languages. The name “variable binding” is a great example of this. Many systems languages let you declare a variable. These variables are called by that name because they can change over time: they’re mutable. Many functional languages let you declare bindings. These bindings are called by that name because they bind a name to a value, and don’t change over time: they’re immutable.

Rust’s variable bindings are immutable by default, but can be made mutable with mut, allowing them to be re-bound to something else. In other words,

fn main() {
    let x = 5;
    x = 6; // error: re-assignment of immutable variable

    let mut y = 5;
    y = 6; // just fine
}

You won’t be typing mut that often.
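As a minimal sketch of the half of that program that works, here is a version that compiles, with the rejected assignment left in as a comment. The helper name bump is made up for this illustration and is not part of the original example.

```rust
// `bump` is a hypothetical helper, introduced only for this sketch.
fn bump() -> i32 {
    let mut y = 5; // `mut` makes the binding mutable
    y = y + 1;     // so assigning a new value is allowed
    y
}

fn main() {
    let x = 5;
    // x = 6; // uncommenting this line is a compile error: `x` is immutable
    println!("x = {}, y = {}", x, bump());
}
```

Uncommenting the assignment to x is a quick way to see the compiler’s error for yourself.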

In any case, there’s one way in which let bindings work just like variables in other languages, and it’s the key insight into ownership. As you know, a computer program is executed line by line. At least, until you hit a control flow structure. Let’s give our program line numbers, so we can talk about it.

// 1
fn main() {    // 2
               // 3
    let x = 5; // 4
               // 5
}              // 6
               // 7

Line one is before our program starts. The endless void. Not actually endless, though, as line two is where we start main. This is the first line that is actually executed, and kicks off our program. Great. Line three is blank, so nothing happens, just like line one. Line four is where the first actually interesting thing occurs: we introduce a new variable binding, x. We set x’s initial value to five, allocated on the stack. If you don’t know what that means, we’ll talk about it right after ownership. For now, x is five. No big deal. Line five is blank again. Line six has a closing curly brace, and so main, and thus, our program, is over. Line seven, the void returns.

This is basically the same thing as in many programming languages. But let’s point out an aspect you, as a programmer, probably take for granted. Scoping. If I asked you, “Is x valid on line one?” you would say “no.” “Three? Seven?” “Nope, nada. x is valid from line four, where it was declared, to line six, where it goes out of scope.” This illustrates the idea. There is a certain scope, a certain set of lines, where x is a valid identifier. That scope starts from where it was declared, and goes until the end of the block. We can look at this scope in two ways: for the first, imagine this source code printed on a piece of paper. You highlight lines four through six. In some sense, this is a distance: three lines of code, rather than three meters. But if we imagine the computer running this program, this scope represents a time: three statements of processor execution. The real number depends on the generated assembly, but at our level of abstraction, it’s three units.
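To make the scoping rule concrete, here is a small sketch with a nested block, so that one binding’s scope ends before the function does. The function name scoped_sum and the binding names are invented for this illustration.

```rust
// `scoped_sum` is a hypothetical name used only for this illustration.
fn scoped_sum() -> i32 {
    let outer = 1;       // `outer` is valid from here to the end of the function
    let total = {
        let inner = 2;   // `inner` is valid from here...
        outer + inner    // ...and this is the last place it can be used
    };                   // `inner` goes out of scope at this brace
    // let oops = inner; // would not compile: `inner` is no longer in scope
    total
}

fn main() {
    println!("{}", scoped_sum());
}
```

The commented-out line is the interesting one: the name inner simply doesn’t exist outside the lines where its scope is highlighted.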

In Rust, we have names for these concepts, which are implicit in other languages. The thing that introduces a new scope is called the ‘owner.’ It’s in charge of the data it’s bound to, so we say that it ‘owns’ that data. The length of a scope is called a ‘lifetime,’ taken from that idea of time passing as your program executes. But you can also think of it as a segment of lines of code.

So if other languages do this, why does this make Rust special? Sure, Rust gives these concepts specific names, but names by themselves aren’t a significant difference. The difference is that Rust takes this concept and cranks it up to 11. Where most programming languages only keep track of how long variables are in scope, Rust knows how to connect the scopes of variables that are pointing to the same thing, as well as how to know the scope of things that are more than just stack-allocated memory.
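As a rough preview of what connecting scopes buys you, here is a sketch using a reference (the & syntax is explained further on). The function name read_via_reference is made up for this illustration; the commented-out part shows the kind of program the compiler refuses.

```rust
// `read_via_reference` is a hypothetical name used only for this sketch.
fn read_via_reference() -> i32 {
    let x = 5;
    let r = &x; // `r` points at `x`, so `r` must not outlive `x`
    *r          // reading through `r` while `x` is still in scope is fine
}

fn main() {
    println!("{}", read_via_reference());

    // The compiler rejects the following, because `short` goes out of scope
    // while `dangling` still points at it:
    //
    // let dangling;
    // {
    //     let short = 5;
    //     dangling = &short; // error: `short` does not live long enough
    // }
    // println!("{}", dangling);
}
```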

Let’s talk more about this connecting of scopes. Here’s another Rust program:

fn main() {         // 1
    let x = 5;      // 2
                    // 3
    {               // 4
                    // 5
        let y = &x; // 6
                    // 7
    }               // 8
                    // 9
}                   // 10

In line four, we use an open curly brace to create a new scope. This scope, as in many languages, needs to be closed before the scope of main gets closed. It’s nested inside of main.

In this new scope, on line six, we declare a new binding, y, and we set it equal to &x. This reads as “a reference to x,” as & is the reference operator in Rust. References are Rust’s version of ‘pointers’ from other systems programming languages. Here’s a quick introduction, if you haven’t had to deal with pointers before.

By default, in Rust, data is allocated on “the stack.” Your program is given a chunk of memory, and it’s able to allocate information there. The stack starts at the bottom address of memory, and then grows upwards. For example, at the start of our program, the stack has nothing in it: