Rust

Arrays vs. Slices

  • Arrays have a length known at compile time (the length is part of the array’s type, e.g. [u16; 4])
  • A slice reference (&[T]) carries two words of metadata: a data pointer and the length of the slice
    • For this program: 400
    • Here’s the slice in memory: Screen Shot 2022-09-20 at 5.50.15 pm 1.png
    • Green box is the size of the slice (0x1000 == 4096)
    • Red box is the pointer to the underlying data (0x0000FFFFC19C5870)
    • That address contains the u16s in the array: Pasted image 20220920175347.png
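
A minimal sketch of the size difference, assuming u16 elements as in the notes above. The helper name `sizes` is made up for illustration. On a 64-bit target the slice reference should be 16 bytes (pointer + length), regardless of how many elements it covers.

```rust
use std::mem::size_of;

/// Returns (array size in bytes, slice-reference size in bytes) for u16 data.
fn sizes() -> (usize, usize) {
    (size_of::<[u16; 4]>(), size_of::<&[u16]>())
}

fn main() {
    let array: [u16; 4] = [1, 2, 3, 4]; // the length 4 is part of the type
    let slice: &[u16] = &array[1..3];   // the length is stored at runtime, next to the pointer

    let (array_bytes, slice_ref_bytes) = sizes();
    println!("array: {} bytes, slice reference: {} bytes", array_bytes, slice_ref_bytes);
    println!("slice len = {}, data pointer = {:p}", slice.len(), slice.as_ptr());
}
```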

Shadowing

Re-bind an immutable identifier without requiring the use of mut:

fn main() {
    let x = 5;
    let x = x + 1;
    let x = x * 2;

    println!("The value of x is: {}", x);
}
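
Shadowing creates a new binding each time, so the new binding can even have a different type, which mut does not allow. A small sketch (the function name `shadowed_len` is just for illustration):

```rust
/// Re-binds `spaces` from a string slice to its length.
fn shadowed_len() -> usize {
    let spaces = "   ";        // &str
    let spaces = spaces.len(); // usize: same name, new binding, new type
    spaces
}

fn main() {
    println!("{}", shadowed_len());
}
```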

Expressions

  • Rust has expressions and statements:
    • expressions evaluate to a value
    • statements (let bindings, or an expression terminated with ;) do not produce a value; a block that ends in a statement evaluates to (), the unit type, which plays the role of void
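
A short sketch of the distinction. The function names are hypothetical; the point is that blocks and if are expressions, and a trailing semicolon turns the result into ().

```rust
/// The last expression in a block (no trailing semicolon) is the block's value.
fn block_value() -> i32 {
    let y = {
        let x = 3;
        x + 1 // expression: the block evaluates to 4
    };
    y
}

/// `if` is an expression too, so it can be used directly as a return value.
fn describe(n: i32) -> &'static str {
    if n % 2 == 0 { "even" } else { "odd" }
}

fn main() {
    // A block that ends in a statement evaluates to the unit value `()`.
    let unit: () = { let _z = 1; };
    println!("{} {} {:?}", block_value(), describe(3), unit);
}
```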

Enums

  • Rust supports regular enums:

    enum IpAddr {
      V4,
      V6
    }
    
    IpAddr::V4
    
  • Each enum variant can also carry “associated data”, and the type of that data can differ from variant to variant:

    enum IpAddr {
      V4(String),
      V6(String)
    }
    
    IpAddr::V6(String::from("::1"))
    
    enum Message {
      Quit,
      Move { x: i32, y: i32},
      Write(String),
      ChangeColor(i32, i32, i32)
    }
    
    Message::Move { x: 5, y: 6 };
    Message::ChangeColor(255, 255, 0);
    
  • This is almost like each enum variant is a struct-like thing on its own, but it’s also a Message, and so can be passed to anything that expects a Message.
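
To show what "can be passed to anything that expects a Message" looks like in practice, here is a sketch that matches on each variant. The `describe` helper is an assumption for illustration, not something from the Rust book.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

/// Hypothetical helper: accepts any variant and destructures its data.
fn describe(msg: &Message) -> String {
    match msg {
        Message::Quit => String::from("quit"),
        Message::Move { x, y } => format!("move to ({}, {})", x, y),
        Message::Write(text) => format!("write '{}'", text),
        Message::ChangeColor(r, g, b) => format!("color #{:02x}{:02x}{:02x}", r, g, b),
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 5, y: 6 },
        Message::Write(String::from("hi")),
        Message::ChangeColor(255, 255, 0),
    ];
    for m in &messages {
        println!("{}", describe(m));
    }
}
```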

Lifetimes

  • Every reference in Rust has a lifetime. Many of these lifetimes are inferred implicitly, but Rust will complain if it can’t infer one (or if its inferred lifetime is wrong in some way).

  • Reference lifetimes aim to prevent dangling references:

    {
      let x;
    
      {
        let y = 15;
        x = &y;
      }
    
      // The value that x points to is out of scope here
      println!("{}", x);
    }
    
  • Rust fails to compile that snippet because it’s possible for x to reference bad memory once y goes out of scope.

  • Rust uses the borrow checker to validate reference lifetimes:

    • If a reference r refers to memory whose lifetime is shorter than its own, this can cause the scenario above.
    • This invariant is essentially what the borrow checker attempts to track.
  • Consider this function:

    fn longest(x: &str, y: &str) -> &str {
      if x.len() > y.len() { x } else { y }
    }
    
    • This code does not compile because the borrow checker can’t tell whether the returned reference borrows from x or from y, so it can’t relate the returned reference’s lifetime to the lifetimes of the arguments.
    • Whichever argument it borrows from, the returned reference must not outlive the owner of that memory.
  • We use lifetime annotations to tell the borrow checker that the arguments and the return value have the same lifetime:

    fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
      if x.len() > y.len() { x } else { y }
    }
    
    • The specific identifier can be any valid Rust identifier, but convention is to use 'a, then 'b, etc. Here, we’re saying that:
      • The function takes two string slices, both of which live at least as long as lifetime 'a.
      • The function returns a string slice that lives at least as long as lifetime 'a.
      • The returned slice has a lifetime equal to the smaller of the lifetimes of the references passed in.
    • When we pass concrete references in to longest, the concrete lifetime for 'a will be the part of x’s scope that overlaps with y’s scope (the smaller lifetime of the two, essentially).
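  • When a function returns a reference, the reference’s lifetime must be tied to the lifetime of one of the arguments (or be 'static); otherwise it would refer to a value created inside the function, which is dropped on return.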

  • structs containing references require lifetime parameters as well:

    struct Foo<'a> {
      text: &'a str
    }
    
    • I’m not sure why Rust doesn’t infer this lifetime information, it seems fairly consistent.
    • Although I’m sure there are edge cases I’m not thinking about.
  • Rust does infer lifetime information in some cases following these rules (in order):

    • Each parameter/argument that is a reference gets its own lifetime parameter.
    • If there is exactly one input lifetime parameter, all outputs get that lifetime.
    • If one of the input parameters is &self (or &mut self), that reference’s lifetime gets assigned to all outputs.
  • These rules are not (meant to be) foolproof, but the borrow checker catches these lapses:

    struct LifetimeTest {
      name: String
    }
    
    impl LifetimeTest {
      fn announce(&self, announcement: &str) -> &str {
        announcement
      }
    }
    
    • This fails because the returned slice is assigned &self’s lifetime (via rules 1 and 3 above), but actually needs announcement’s lifetime. The failure message is fairly comprehensive:
      error[E0623]: lifetime mismatch
        --> src/main.rs:81:9
         |
      80 |     fn announce(&self, announcement: &str) -> &str {
         |                                      ----     ----
         |                                      |
         |                                      this parameter and the return type are declared with different lifetimes...
      81 |         announcement
         |         ^^^^^^^^^^^^ ...but data from `announcement` is returned here
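
One way to fix the failing method, as a sketch: annotate announce explicitly so the return value borrows from announcement rather than from self. The `name` method is an extra I added to show the case where elision rule 3 does the right thing; `longest` is repeated with its lifetime parameter declared.

```rust
struct LifetimeTest {
    name: String,
}

impl LifetimeTest {
    // Explicit annotation: the result borrows from `announcement`, not `self`.
    fn announce<'a>(&self, announcement: &'a str) -> &'a str {
        announcement
    }

    // Elision rule 3 applies here: the result borrows from `self`.
    fn name(&self) -> &str {
        &self.name
    }
}

// Both arguments and the result share the lifetime `'a`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let t = LifetimeTest { name: String::from("test") };
    let text = String::from("hello");
    println!("{} says {}", t.name(), t.announce(&text));
    println!("{}", longest("apple", "fig"));
}
```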
      

Ownership, References, and Borrowing

https://doc.rust-lang.org/book/print.html#ownership-and-functions

  • Ownership rules

    • Each value has an owner (a variable).
    • There is only one owner at a given time.
    • The value is dropped (its destructor runs and its memory is freed; there is no garbage collector) when the owner goes out of scope.
  • Here, s2 is taking ownership of the String that s1 points to, so s1 can no longer be used.

    let s1 = String::from("hello");
    let s2 = s1;
    // Compile-time error if s1 is used here.
    
  • Here we’re only dealing with an integer, which lives entirely on the stack and implements Copy, so the assignment copies the value and both x and y remain usable:

    let x = 5; 
    let y = x;
    
  • Ownership is transferred during function calls too:

    fn main() {
      let s = String::from("hello");
    
      // s is no longer valid after this line; `takes_ownership` owns it now
      takes_ownership(s);
    }
    
  • Ownership can be transferred back to the caller via return value(s):

    fn main() {
      let s = String::from("hello");
    
      // s is no longer valid here, but s2 is.
      let s2 = takes_and_returns_ownership(s);
    }
    
  • This is too cumbersome to do every time you want to use a value without necessarily owning it. What if we want to let a function use a value but not take ownership?

  • Rust fixes this with references.

    fn main() {
      let s1 = String::from("hello");
    
      // calculate_length gets a reference to s1, and so doesn't own s1.
      let len = calculate_length(&s1);
      println!("The length of '{}' is {}.", s1, len);
    }
    
    fn calculate_length(s: &String) -> usize { 
      s.len()
    }
    

    /2020-04-29.16.13.36.png

  • Having references as function parameters is called borrowing.

  • A function that accepts a reference can’t modify the data it refers to, unless it’s a mutable reference:

    fn main() {
      let mut s = String::from("hello");
      change(&mut s);
    }
    
    fn change(some_string: &mut String) {
      some_string.push_str(", world");
    }
    
  • There are rules around mutable references, though, which allow Rust to prevent data races at compile time:

    • You can have only one mutable reference to a particular piece of data in a particular scope.
    • You cannot have a mutable reference while you have an immutable one (in a particular scope).
    • Effectively, this is “either one mutable reference or many immutable references are allowed in a given scope”.
  • A reference’s scope starts from where it is introduced and continues through the last time that reference is used, so scopes are not necessarily denoted by {}.

    fn main() {
      let mut s = String::from("hello");
    
      let r1 = &s;
      let r2 = &s;
      println!("{} and {}", r1, r2);
      // r1 and r2's scope _implicitly_ ends here
    
      // so this is fine
      let r3 = &mut s;
      println!("{}", r3);
    }
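
The snippets above call takes_ownership, takes_and_returns_ownership, and calculate_length without all of them being defined. Here is a sketch with assumed bodies (the return values and the ", world" suffix are my own choices) that puts move, return, and borrow side by side:

```rust
// Takes ownership: `s` is dropped when this function returns.
fn takes_ownership(s: String) -> usize {
    s.len()
}

// Takes ownership and hands it back through the return value.
fn takes_and_returns_ownership(mut s: String) -> String {
    s.push_str(", world");
    s
}

// Borrows immutably: the caller keeps ownership.
fn calculate_length(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");
    let s2 = takes_and_returns_ownership(s); // `s` is moved; only `s2` is usable now
    let len = calculate_length(&s2);         // borrow: `s2` is still usable afterwards
    println!("'{}' has length {}", s2, len);
    takes_ownership(s2);                     // `s2` is moved and dropped inside the call
}
```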
    

Slices

https://doc.rust-lang.org/book/ch04-03-slices.html

  • A string slice is a reference to a portion of a string. Type signature is &str.

  • Implemented as a pointer to a particular string index, coupled with a length.

  • If you have a string slice, you can’t obtain a mutable reference to the string, so the slice is guaranteed to be valid wrt. the underlying string.

  • For this code:

    let s = String::from("hello world");
    let hello = &s[0..5];
    let world = &s[6..11];
    

    The underlying reference looks like: /2020-04-30.18.46.55.png

  • Get a slice of the entire string with &s[..]. This gives the same result as passing &s where a &str is expected: &s is a &String, and deref coercion converts it automatically, so you can pass &s to a function that expects &str.

  • You can get slices of any type:

    let a = [1, 2, 3, 4, 5];
    
    // `slice` has the type &[i32]
    let slice = &a[1..3];
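
A sketch of a function that takes and returns string slices, along the lines of the first_word example in the linked chapter (this version uses find rather than iterating over bytes). It accepts both &s and &s[..]:

```rust
/// Returns the first space-delimited word of `s` as a slice into `s`.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(i) => &s[..i],
        None => s,
    }
}

fn main() {
    let s = String::from("hello world");
    // `&s` is a `&String`; deref coercion turns it into `&str` at the call site.
    println!("{}", first_word(&s));
    // `&s[..]` is already a `&str` covering the whole string.
    println!("{}", first_word(&s[..]));

    let a = [1, 2, 3, 4, 5];
    let slice: &[i32] = &a[1..3];
    println!("{:?}", slice);
}
```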
    

Smart Pointers

  • These are custom reference types that can store additional data.

  • Any custom type that implements Deref and Drop is a smart pointer.

  • Rust walks the tree of potential types that implement Deref to allow for automatic dereferencing/deref coercion to a given target type:

    fn hello(name: &str) {
      println!("Hello, {}!", name);
    }
    
    fn main() {
      // Box<T> implements `Deref` and returns a `&T` (in this case a `&String`)
      // ... which in turn implements `Deref` and returns a `&str`
      // ... which `hello` accepts
      let m = Box::new(String::from("Rust"));
      hello(&m);
    }
    
  • Not sure what Rust does if there are multiple paths to the target reference type, though.

  • The standard library provides several smart pointers; three common ones follow.

Box<T>

  • A container pointing to data on the heap

  • The box owns the data on the heap, and allows mutable/immutable borrows.

  • Different from a regular reference in that it allows boxing primitives.

  • And allows for indirection in recursive type definitions:

    // Fails because this definition recurses indefinitely
    // when Rust attempts to figure out how much memory 
    // a `List<T>` needs.
    enum List<T> {
      Cons(T, List<T>),
      Nil
    }
    
    // This works
    enum List<T> {
      Cons(T, Box<List<T>>),
      Nil
    }
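
A usage sketch for the boxed list, assuming i32 elements. The `sum` function is hypothetical; note that passing the &Box<List<i32>> tail to a function expecting &List<i32> relies on deref coercion.

```rust
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

use List::{Cons, Nil};

/// Sums an `i32` list by walking the boxed tail recursively.
fn sum(list: &List<i32>) -> i32 {
    match list {
        // `rest` is a `&Box<List<i32>>`; it coerces to `&List<i32>` for the call.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));

    // A primitive can be boxed too; `*` reads the heap value back.
    let boxed: Box<i32> = Box::new(5);
    println!("boxed + 1 = {}", *boxed + 1);
}
```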
    

Rc<T> (Reference Counted Pointer)

  • Use this if you want multiple owners for something (and having one owner + immutable references is untenable).

  • This is a reference counting container, (effectively) allowing multiple owners for a single allocation.

  • Uses its Drop implementation to deallocate when its reference count drops to 0.

  • Supports weak references via downgrade.

  • Use Rc::clone to obtain a new (owning) reference to the allocation.

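  • This pattern is vulnerable to memory leaks: a cycle of Rc references never reaches a count of 0 (weak references are the usual way to break cycles).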

  • An Rc<T> derefs to an immutable reference to the underlying data, but can provide a mutable reference (assuming it’s the only reference present) too via get_mut.

  • Arc<T> is a thread-safe version, and can be used with Mutex to allow multiple threads to own a piece of data:

    use std::sync::{Arc, Mutex};
    use std::thread;
    
    // We're going to share this empty vector across threads
    let data = Arc::new(Mutex::new(vec![]));
    
    // Start 16 threads, give each one shared ownership of the mutex
    // guarding the vector by cloning the `Arc`. Each thread
    // then calls `lock()`, which blocks until the mutex is free
    // and returns a guard that derefs to a mutable reference to
    // the vector. The mutex is unlocked when the guard `v` goes
    // out of scope.
    let threads: Vec<_> = (0..16).map(|i| {
      let data = Arc::clone(&data);
      thread::spawn(move || {
        let mut v = data.lock().unwrap();
        v.push(i);
      })
    }).collect();
    
    
    // Wait until all 16 threads are done.
    for t in threads {
        t.join().unwrap();
    }
    
    // Prints something like Mutex { data: [0, 1, 2, ...] }; the order of the values depends on thread scheduling
    println!("{:?}", data);
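
A single-threaded sketch of the Rc points above: Rc::clone adds an owner, Rc::downgrade creates a weak reference, and Rc::get_mut only succeeds while there is exactly one owner. The `counts` helper is made up for illustration.

```rust
use std::rc::{Rc, Weak};

/// Returns (strong count, weak count) after adding one owner and one weak reference.
fn counts() -> (usize, usize) {
    let a = Rc::new(String::from("shared"));
    let _b = Rc::clone(&a);                   // second owner
    let _w: Weak<String> = Rc::downgrade(&a); // weak: does not keep the value alive
    (Rc::strong_count(&a), Rc::weak_count(&a))
}

fn main() {
    let mut a = Rc::new(5);
    // Sole owner, so `get_mut` hands out a mutable reference.
    *Rc::get_mut(&mut a).unwrap() += 1;

    let b = Rc::clone(&a);
    // Two owners now, so `get_mut` returns `None`.
    assert!(Rc::get_mut(&mut a).is_none());

    let weak = Rc::downgrade(&a);
    drop(a);
    drop(b);
    // All owners are gone, so the weak reference can no longer be upgraded.
    assert!(weak.upgrade().is_none());

    println!("{:?}", counts());
}
```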
    

RefCell<T>

  • Use this if you want to mutate something (non-public scratch space, for example) behind an immutable reference.
  • An immutable-looking container around a value that can still be mutated (interior mutability); the value is stored inline in the RefCell, not necessarily on the heap.
  • Single owner, allows any number of borrows, but unlike regular references, the borrow rules are enforced at runtime, and Rust panics if there’s a violation.
  • Drops the contained value when the owner goes out of scope.
  • Doesn’t implement Deref (because this is a Ref Cell, not a reference); use borrow/borrow_mut instead.
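
A sketch of the scratch-space use case, assuming a made-up `Counter` type. It mutates through &self with borrow_mut, and uses try_borrow_mut to show the runtime check without panicking.

```rust
use std::cell::RefCell;

/// Hypothetical type: a counter that records calls through `&self`.
struct Counter {
    hits: RefCell<u32>,
}

impl Counter {
    fn hit(&self) -> u32 {
        // Mutates through an immutable reference; the borrow is checked at runtime.
        *self.hits.borrow_mut() += 1;
        *self.hits.borrow()
    }
}

fn main() {
    let c = Counter { hits: RefCell::new(0) };
    c.hit();
    println!("hits = {}", c.hit());

    let cell = RefCell::new(vec![1]);
    let reader = cell.borrow();
    // `borrow_mut` would panic while `reader` is alive;
    // `try_borrow_mut` reports the violation as an `Err` instead.
    assert!(cell.try_borrow_mut().is_err());
    drop(reader);
    cell.borrow_mut().push(2);
    println!("{:?}", cell.borrow());
}
```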

References

Convert an existing project to compile to WASM

  • Add to Cargo.toml:

    [lib]
    crate-type = ["cdylib", "rlib"]
    
  • Also add this dependency:

    wasm-bindgen = "0.2"
    
  • Annotate public structs/functions with #[wasm_bindgen]

  • Set up console_error_panic_hook to improve error messages:

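    # Cargo.toml, under [dependencies]
    console_error_panic_hook = "0.1.6"
    
    // Rust (library crate)
    #[wasm_bindgen]
    pub fn set_panic_hook() {
        // This cfg only holds if a Cargo feature named `console_error_panic_hook`
        // is enabled (e.g. the dependency is declared optional); otherwise the
        // next line is compiled out.
        #[cfg(feature = "console_error_panic_hook")]
        console_error_panic_hook::set_once();
    }
    
    // Call once during initialization
    set_panic_hook();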
    
  • Import these (not very well documented) chrono features if necessary:

    chrono = { version = "0.4.11", features = ["wasmbind", "js-sys"] }
    
  • And build as usual:

    $ wasm-pack build --target web
    