My first day checking out Rust

This weekend I decided I would learn a bit of Rust and try to understand the hype around it! As with most programming languages, I started out by writing a "Hello World!" program.

"Hello World!"

I found this example online:

fn main() {
    println!("Hello World!");
}

The fn keyword is used to indicate the declaration of a function, followed by the function name and its arguments in parentheses. The function body is enclosed in curly braces. What stood out to me initially was the ! after println. What is it?

The ! indicates a macro. A macro is a meta-programming tool for code generation. In C/C++ you may see a macro like the #define statement here:

#define HELLO_WORLD "Hello World!"
#include <cstdio>

int main() {
	printf("%s\n", HELLO_WORLD);
	return 0;
}

One of the big differences between macros in C/C++ and Rust is that C/C++ macros are expanded by the preprocessor as plain text substitution, before the compiler parses anything, whereas Rust macros are part of the language: they match on parsed syntax fragments and expand into the AST. One way to see this difference is to compare the output of the same program in C/C++ and in Rust:

C/C++

#include <cstdio>
#define CUBE(a) (a * a * a)

int main() {
	printf("Value: %d\n", CUBE(1+2));
	return 0;
}

Rust

macro_rules! cube {
    ($a:expr) => {
        $a*$a*$a
    };
}

fn main() {
    println!("{0}", cube!(1+2));
}

In the C/C++ version, CUBE(1+2) evaluates to 7, while the Rust version gives 27. The C preprocessor pastes the text in as-is, producing 1+2*1+2*1+2, which is 1+2+2+2 = 7 under the normal order of operations. In Rust, the argument matched by $a:expr is parsed as a single expression before it is substituted, so 1+2 stays grouped and the expansion behaves like (1+2)*(1+2)*(1+2) = 27.
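Here's a small sketch I put together to double-check that reasoning. The helper names (textual_expansion, evaluations_in_macro) are made up for illustration. It also shows one thing the grouping does not change: as far as I can tell, the macro still pastes the expression in three times, so an argument with side effects runs three times.

```rust
use std::cell::Cell;

macro_rules! cube {
    ($a:expr) => {
        $a * $a * $a
    };
}

// What a purely textual substitution of `1+2` would compute (the C result).
fn textual_expansion() -> i32 {
    1 + 2 * 1 + 2 * 1 + 2
}

// Counts how many times the macro evaluates its argument.
fn evaluations_in_macro() -> u32 {
    let calls = Cell::new(0);
    let next = || {
        calls.set(calls.get() + 1);
        3
    };
    let _ = cube!(next()); // expands to next() * next() * next()
    calls.get()
}

fn main() {
    println!("textual: {}", textual_expansion());
    println!("macro:   {}", cube!(1 + 2));
    println!("argument evaluated {} times", evaluations_in_macro());
}
```

So the expr fragment fixes the precedence surprise, but a plain function is still the safer choice when the argument is expensive or has side effects.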

This got me wondering: what is the utility of function-like macros when functions have great properties like type checking? One example of when you would want to use macros is for helpful debug messages like the following:

macro_rules! assert_impl {
    ($a: expr) => {
        match $a {
            true => "worked!",
            false => concat!("Failed!: ", stringify!($a)),
        }
    };
}

fn main() {
    println!("{}", assert_impl!(1 == 2));
}
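To convince myself that a function really can't do this, I sketched the same check both ways. The names check! and check_fn are hypothetical. The function only ever receives the already-computed bool, while the macro still has the source tokens and can turn them into a string with stringify!.

```rust
// Macro version: has access to the tokens of the expression.
macro_rules! check {
    ($cond:expr) => {
        if $cond {
            String::from("ok")
        } else {
            format!("failed: {}", stringify!($cond))
        }
    };
}

// Function version: only sees the value, so the message can't name the expression.
fn check_fn(cond: bool) -> String {
    if cond {
        String::from("ok")
    } else {
        format!("failed: {}", cond)
    }
}

fn main() {
    let x = 1;
    println!("{}", check!(x == 2)); // message includes the text of the expression
    println!("{}", check_fn(x == 2)); // message can only say "false"
}
```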

Another big benefit of macros is that they can get file name and line numbers at compile time instead of looking them up at runtime.

fn main() {
    println!("File {} Line # {}", file!(), line!());
    println!("File {} Line # {}", file!(), line!());
}
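One quick sanity check that works on the stable compiler: const items only accept values known at compile time, so if file!() and line!() can initialize them, they can't be runtime lookups. This is just a sketch, and HERE_FILE, HERE_LINE, and location are names I made up. The standard dbg! macro builds on the same idea and prints the file, line, and expression text to stderr.

```rust
// Hypothetical names, for illustration only.
const HERE_FILE: &str = file!(); // expands to a string literal
const HERE_LINE: u32 = line!(); // expands to a u32 literal

fn location() -> String {
    format!("{}:{}", HERE_FILE, HERE_LINE)
}

fn main() {
    println!("constants defined at {}", location());

    // dbg! reports where it was called and what expression it wrapped,
    // then hands the value back.
    let answer = dbg!(6 * 7);
    println!("answer: {}", answer);
}
```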

The values of file!() and line!() are filled in at compile time, whereas most other languages look this information up at runtime. I wanted to prove that to myself, so I searched for a way to get the macro-expanded version of a Rust file. This turned out to be pretty straightforward with rustc --pretty expanded -Z unstable-options <filename>.rs. I additionally needed to use the nightly Rust compiler since the stable compiler doesn't support -Z unstable-options. (On newer nightly toolchains the --pretty flag appears to be gone, and the equivalent I'd expect to work is rustc -Zunpretty=expanded <filename>.rs.)

rustup install nightly
rustup default nightly
rustc --pretty expanded -Z unstable-options <filename>.rs

The output of the macro-expanded code looked like this:

#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::rust_2015::*;
#[macro_use]
extern crate std;
fn main() {
    {
        ::std::io::_print(::core::fmt::Arguments::new_v1(&["File ",
                                                           " Line # ", "\n"],
                                                         &match (&"log.rs",
                                                                 &2u32) {
                                                              (arg0, arg1) =>
                                                              [::core::fmt::ArgumentV1::new(arg0,
                                                                                            ::core::fmt::Display::fmt),
                                                               ::core::fmt::ArgumentV1::new(arg1,
                                                                                            ::core::fmt::Display::fmt)],
                                                          }));
    };
    {
        ::std::io::_print(::core::fmt::Arguments::new_v1(&["File ",
                                                           " Line # ", "\n"],
                                                         &match (&"log.rs",
                                                                 &3u32) {
                                                              (arg0, arg1) =>
                                                              [::core::fmt::ArgumentV1::new(arg0,
                                                                                            ::core::fmt::Display::fmt),
                                                               ::core::fmt::ArgumentV1::new(arg1,
                                                                                            ::core::fmt::Display::fmt)],
                                                          }));
    };
}

Whoa, cool! As expected, the filename string and line number are already there! Great, now that I've satisfied my curiosity about the "Hello World!" program we wrote, I want to understand the hype around Rust.

The Hype

Zero-Cost Abstractions

The first thing that's mentioned about Rust is its zero-cost abstractions. This is a property C++ boasts about as well. According to Bjarne Stroustrup (the creator of C++), a zero-cost abstraction means:

What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.

Once again, I needed to prove this to myself. I started with a very basic optimization the Rust compiler could perform: evaluating a constant arithmetic expression at compile time.

fn main() {
    let milliseconds: u32 = 5 * 1000;
    println!("Milliseconds: {}", milliseconds);
}

Checking the assembly output of the program with rustc --emit asm -C opt-level=3 <filename>.rs, I found that the Rust compiler did indeed convert 5 * 1000 into 5000.

__ZN9zero_cost4main17h61b9355cc30261e7E:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        subq    $80, %rsp
        movl    $5000, -4(%rbp)
        leaq    -4(%rbp), %rax
        movq    %rax, -24(%rbp)
        movq    __ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u32$GT$3fmt17hcc17937ca0bcf1aeE@GOTPCREL(%rip), %rax
        movq    %rax, -16(%rbp)
        leaq    l___unnamed_2(%rip), %rax
...

Neat, but this is what I'd expect from any basic compiler! Let's try a more complicated example:

fn double(a: u32) -> u32 {
    a + a
}

fn main() {
    let milliseconds: u32 = 2000;
    println!("Milliseconds: {}", double(milliseconds) + double(milliseconds) + milliseconds);
}

__ZN17zero_cost_complex4main17h56ac4a9f80f155afE:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        subq    $80, %rsp
        movl    $10000, -4(%rbp)
        leaq    -4(%rbp), %rax
...

Whoa, the Rust compiler calculated the entire expression even though it was more complicated! You can see this in the movl $10000, -4(%rbp) instruction, which stores the value passed as the second argument to println!: the calls to double were inlined and folded into a single constant. I'm curious whether this would still be possible if the data lived on the heap and could be mutated by another thread. That brings me to another benefit of Rust that developers rave about: memory safety and detection of concurrency bugs at compile time.
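Circling back to the folded constant for a moment: the assembly shows what the optimizer happened to do at opt-level=3, which is an optimization rather than a guarantee. If I wanted the language itself to promise compile-time evaluation, a const fn feeding a const item is one way I'd sketch it. This is a different mechanism from the optimizer's constant folding, and the names MILLISECONDS and TOTAL_MS are just ones I picked for illustration.

```rust
// A `const fn` can be called in constant contexts as well as at runtime.
const fn double(a: u32) -> u32 {
    a + a
}

const MILLISECONDS: u32 = 2000;

// A const item must be computed by the compiler; if it could not be
// evaluated at compile time, the program would fail to build.
const TOTAL_MS: u32 = double(MILLISECONDS) + double(MILLISECONDS) + MILLISECONDS;

fn main() {
    println!("Milliseconds: {}", TOTAL_MS);
}
```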

Memory Safety

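Rust makes it very difficult to leak memory by accident (it's not strictly impossible: std::mem::forget and Rc reference cycles are both safe code that leaks). Challenge accepted! I'm going to try to cause a basic memory leak: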

struct Memory {
    value: u32,
}

impl Memory {
    pub fn new(i: u32)->Memory {
        println!("Allocated {}", i);
        Memory{value: i}
    }
}
impl Drop for Memory {
    fn drop(&mut self)  {
        println!("Free'ing memory {}", self.value);
    }
}
fn test() {
    let _m1 = Box::new(Memory::new(1));
}

fn main() {
    let _m0 = Box::new(Memory::new(0));
    println!("About to run test...");
    test();
    println!("Finished running test.");
}

Box::new allocates memory on the heap in Rust. We also implement the Drop trait for Memory; its drop method runs when a value goes out of scope, right before its memory is freed. If Rust really frees memory deterministically without being a garbage-collected language, _m1 should be freed before test() returns.

This piece of code simply prints out:

Allocated 0
About to run test...
Allocated 1
Free'ing memory 1
Finished running test.
Free'ing memory 0

Cool, as expected _m1 is freed before the test function returns! So Rust does free memory once nothing owns it anymore. I tested some more programs and found that even if a value is returned from a function, Rust frees it right away when the caller doesn't use it. As expected, if the returned value is kept and used, it is not freed until the caller is done with it. I've kind of proved that Rust does what it claims to do, but I'm not going to dive into the mechanism that makes it possible.
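For the curious, here's roughly what one of those extra experiments could look like. It's a reconstruction under my own assumptions, not the exact program I ran: the make and run helpers and the shared event log (an Rc<RefCell<Vec<String>>>) are made up so the order of events can be checked instead of just printed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log so allocation and drop order can be inspected afterwards.
type Log = Rc<RefCell<Vec<String>>>;

struct Memory {
    value: u32,
    log: Log,
}

impl Drop for Memory {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("free {}", self.value));
    }
}

// Returns a heap-allocated Memory to the caller.
fn make(value: u32, log: &Log) -> Box<Memory> {
    log.borrow_mut().push(format!("alloc {}", value));
    Box::new(Memory { value, log: Rc::clone(log) })
}

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // Returned but never bound: dropped at the end of this statement.
        make(1, &log);
        log.borrow_mut().push(String::from("after unused"));

        // Bound to a variable: lives until the end of the enclosing block.
        let kept = make(2, &log);
        log.borrow_mut().push(format!("using {}", kept.value));
    } // `kept` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

The unused box is freed before "after unused" is logged, while the kept one is only freed once its block ends.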

No Concurrency Bugs

Finally, Rust also claims to rule out a whole class of concurrency bugs, data races, by checking for them at compile time. I want to verify that this is the case. The mechanism that makes this possible is similar to the one used for memory safety. I won't talk about it here but it's pretty neat! Here is the program I started out with:

fn main() {
    let mut i= Box::new(0);
    std::thread::spawn(|| {
        *i = *i + 1;
    });
    println!("i: {}", i);
}

Pretty simple. In C++, something similar would compile and sometimes print 0 and sometimes print 1. Let's see what happens in Rust.

error[E0373]: closure may outlive the current function, but it borrows `i`, which is owned by the current function
 --> race.rs:3:24
  |
3 |     std::thread::spawn(|| {
  |                        ^^ may outlive borrowed value `i`
4 |         *i = *i + 1;
  |               - `i` is borrowed here
  |
note: function requires argument type to outlive `'static`
 --> race.rs:3:5
  |
3 | /     std::thread::spawn(|| {
4 | |         *i = *i + 1;
5 | |     });
  | |______^
help: to force the closure to take ownership of `i` (and any other referenced variables), use the `move` keyword
  |
3 |     std::thread::spawn(move || {
  |                        ^^^^^^^

error[E0502]: cannot borrow `i` as immutable because it is also borrowed as mutable
 --> race.rs:6:23
  |
3 |       std::thread::spawn(|| {
  |       -                  -- mutable borrow occurs here
  |  _____|
  | |
4 | |         *i = *i + 1;
  | |               - first borrow occurs due to use of `i` in closure
5 | |     });
  | |______- argument requires that `i` is borrowed for `'static`
6 |       println!("i: {}", i);
  |                         ^ immutable borrow occurs here

error: aborting due to 2 previous errors

What just happened!? There were two errors:

  1. Closure may outlive the current function, but it borrows i, which is owned by the current function
  2. Cannot borrow i as immutable because it is also borrowed as mutable

The first error refers to the fact that the variable i is "owned" by main, while the closure running on the other thread only borrows it. Memory is freed once its "owner" goes out of scope. In this case, it's possible that main returns and frees i while the spawned thread is still running, and the thread would then access freed memory (a use-after-free that could show up as a segfault). Rust catches that at compile time. That's pretty cool!

The second error refers to i being borrowed as immutable inside println! while it's also borrowed as mutable by the closure passed to the newly spawned thread. Rust doesn't allow any other references to a value while a mutable reference to it exists. This is how Rust prevents reading racy data.
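The same borrowing rule shows up without any threads at all. Here's a minimal single-threaded sketch (bump and borrow_demo are hypothetical names): shared borrows can coexist, but a mutable borrow is only accepted once the shared ones are no longer in use.

```rust
fn bump(counter: &mut i32) {
    *counter += 1;
}

fn borrow_demo() -> (i32, i32) {
    let mut i = 5;

    // Any number of shared (immutable) borrows can coexist.
    let a = &i;
    let b = &i;
    let sum = *a + *b; // last use of `a` and `b`, so their borrows end here

    // Now an exclusive (mutable) borrow is allowed.
    bump(&mut i);

    // Uncommenting the next line would keep `a` alive across the mutable
    // borrow above, and the compiler would reject the program with E0502.
    // println!("{}", a);

    (sum, i)
}

fn main() {
    let (sum, i) = borrow_demo();
    println!("sum: {}, i: {}", sum, i);
}
```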

We can fix the first error by moving ownership of i to the spawned thread with the move keyword.

fn main() {
    let mut i= Box::new(0);
    std::thread::spawn(move || {
        *i = *i + 1;
    });
    println!("i: {}", i);
}

Now we get a new error:

error[E0382]: borrow of moved value: `i`
 --> race.rs:6:23
  |
2 |     let mut i= Box::new(0);
  |         ----- move occurs because `i` has type `Box<i32>`, which does not implement the `Copy` trait
3 |     std::thread::spawn(move || {
  |                        ------- value moved into closure here
4 |         *i = *i + 1;
  |               - variable moved due to use in closure
5 |     });
6 |     println!("i: {}", i);
  |                       ^ value borrowed here after move

This error is similar to the second error we saw previously. Because we used the move keyword and transferred ownership to the spawned thread, the main thread can no longer access it! Now we need a way to transfer ownership of i back to the main thread. We can do that like this:

fn main() {
    let mut i= Box::new(0);
    let t = std::thread::spawn(move || {
        *i = *i + 1;
        i
    });
    i = t.join().unwrap();
    println!("i: {}", i);
}

Now we return i from the spawned thread's closure, and join() hands it back to the thread that joins it (wrapped in a Result, hence the unwrap()). In this case, that's the main thread. Once ownership is transferred back to the main thread, the main thread can read and modify the i variable again.
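Handing ownership back through join() isn't the only way out. Two other routes I'd expect to work, sketched here under my own assumptions: sharing the counter behind an Arc<Mutex<i32>>, or using std::thread::scope (available since Rust 1.63) so the spawned thread is allowed to borrow i because it is guaranteed to finish before the scope returns. The function names increment_shared and increment_scoped are made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Both threads hold a handle to the same counter; the Mutex serializes access.
fn increment_shared() -> i32 {
    let i = Arc::new(Mutex::new(0));
    let worker = {
        let i = Arc::clone(&i);
        thread::spawn(move || {
            *i.lock().unwrap() += 1;
        })
    };
    worker.join().unwrap();
    let value = *i.lock().unwrap();
    value
}

// Scoped threads are joined before `scope` returns, so borrowing `i` is allowed.
fn increment_scoped() -> i32 {
    let mut i = Box::new(0);
    thread::scope(|s| {
        s.spawn(|| {
            *i += 1;
        });
    });
    *i
}

fn main() {
    println!("shared: {}", increment_shared());
    println!("scoped: {}", increment_scoped());
}
```

The Arc<Mutex<...>> version fits when both threads need access at the same time; the scoped version is closer to the original program, since main keeps ownership throughout.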

Conclusion

What a fun day learning Rust! I'm excited to understand the ownership and borrowing mechanism that makes all this possible. If you want to learn more, here are some great resources:

https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html

https://www.youtube.com/watch?v=Dbytx0ivH7Q&t=8s

If you have any questions, feel free to reach out to me on Twitter (DMs open) or via email.