Rust
Arrays vs. Slices
- Arrays have a length known at compile time (this is part of an array’s type signature)
- Slice references are two words wide: a data pointer and the length of the slice
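A minimal sketch of the difference, using `std::mem::size_of` to make the two-word layout visible. The helper name `sum` is made up for illustration.

```rust
use std::mem::size_of;

/// Sums any slice of i32, regardless of where the data lives or how long it is.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    // The length 5 is part of the array's type: [i32; 5].
    let array: [i32; 5] = [1, 2, 3, 4, 5];

    // A slice borrows part of the array; its length is carried at runtime.
    let slice: &[i32] = &array[1..4];

    println!("array bytes: {}", size_of::<[i32; 5]>());
    println!("slice reference bytes: {}", size_of::<&[i32]>());
    println!("sum of slice: {}", sum(slice));
}
```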
Shadowing
Re-bind an immutable identifier without requiring the use of mut:
fn main() {
    let x = 5;
    let x = x + 1;
    let x = x * 2;
    println!("The value of x is: {}", x);
}
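Shadowing can also change the type of the binding, which `mut` cannot do. A small sketch; the function name `trimmed_len` is hypothetical.

```rust
/// Returns the number of characters in `input` after trimming whitespace.
fn trimmed_len(input: &str) -> usize {
    // This binding shadows the parameter and is still a &str.
    let input = input.trim();
    // This binding shadows it again with a different type (usize).
    let input = input.chars().count();
    input
}

fn main() {
    let spaces = "   ";
    // Same name, new type: &str becomes usize.
    let spaces = spaces.len();
    println!("{}", spaces);
    println!("{}", trimmed_len("  hi  "));
}
```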
Expressions
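- Rust has expressions and statements:
  - expressions evaluate to a value based on their operands
  - statements perform an action and produce no value; an expression terminated with ; is discarded, and a block that ends this way evaluates to (), the unit type, which behaves much like void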
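A short sketch of the distinction; the function names are invented for illustration.

```rust
/// `if` is an expression, so its value can be the function's return value.
fn classify(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

/// A block is an expression: its value is the final line without a semicolon.
fn block_value() -> i32 {
    let y = {
        let x = 3;
        x + 1
    };
    y
}

/// A block whose last line ends with `;` evaluates to the unit type `()`.
fn unit_value() {
    let unit: () = {
        let _ = 1 + 1;
    };
    unit
}

fn main() {
    println!("{}", classify(-5));
    println!("{}", block_value());
    println!("{:?}", unit_value());
}
```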
Enums
-
Rust supports regular enums:
enum IpAddr { V4, V6 }
IpAddr::V4
-
Each enum entry can also have “associated data”, whose type and shape can differ from variant to variant:
enum IpAddr { V4(String), V6(String) }
IpAddr::V6(String::from("::1"))

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}
Message::Move { x: 5, y: 6 };
Message::ChangeColor(255, 255, 0);
-
This is almost like each enum variant is a struct-like thing on its own, but it’s also a Message, and so can be passed to anything that expects a Message.
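To illustrate, one function can accept any Message and use match to pull out each variant's data. The function `describe` and its output strings are made up for this sketch.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

/// Accepts any variant, because every variant is a `Message`.
fn describe(message: &Message) -> String {
    match message {
        Message::Quit => String::from("quit"),
        Message::Move { x, y } => format!("move to ({}, {})", x, y),
        Message::Write(text) => format!("write {:?}", text),
        Message::ChangeColor(r, g, b) => format!("color #{:02x}{:02x}{:02x}", r, g, b),
    }
}

fn main() {
    let messages = vec![
        Message::Quit,
        Message::Move { x: 5, y: 6 },
        Message::Write(String::from("hi")),
        Message::ChangeColor(255, 255, 0),
    ];
    for message in &messages {
        println!("{}", describe(message));
    }
}
```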
Lifetimes
-
Every reference in Rust has a lifetime. Many of these lifetimes are inferred implicitly, but Rust will complain if it can’t infer one (or if its inferred lifetime is wrong in some way).
-
Reference lifetimes aim to prevent dangling references:
{
    let x;
    {
        let y = 15;
        x = &y;
    }
    // The value that x points to is out of scope here
    println!("{}", x);
}
-
Rust fails to compile that snippet because it’s possible for x to reference bad memory once y goes out of scope.
-
Rust uses the borrow checker to validate reference lifetimes:
- If a reference r refers to memory whose lifetime is shorter than its own, this can cause the scenario above.
- This invariant is essentially what the borrow checker attempts to track.
-
Consider this function:
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}
- This code does not compile because the borrow checker doesn’t know the lifetime of the returned reference relative to the references the function receives as arguments.
- If the returned reference points to the memory that x points to, it must not outlive the owner of that memory; ditto for y.
-
We use lifetime annotations to tell the borrow checker that the arguments and the return value have the same lifetime:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
- The specific identifier can be any valid Rust identifier, but convention is to use 'a, then 'b, etc. Here, we’re saying that:
  - The function takes two string slices, both of which live at least as long as lifetime 'a.
  - The function returns a string slice that lives at least as long as lifetime 'a.
  - The returned slice has a lifetime equal to the smaller of the lifetimes of the references passed in.
- When we pass concrete references in to longest, the concrete lifetime for 'a will be the part of x’s scope that overlaps with y’s scope (the smaller lifetime of the two, essentially).
-
When a function returns a reference, its lifetime parameter must match one of the arguments’ lifetime parameters (or be 'static); a reference to a value created inside the function would dangle.
-
structs containing references require lifetime parameters as well:
struct Foo<'a> { text: &'a str }
- I’m not sure why Rust doesn’t infer this lifetime information, it seems fairly consistent.
- Although I’m sure there are edge cases I’m not thinking about.
-
Rust does infer lifetime information in some cases following these rules (in order):
- Each parameter/argument that is a reference gets its own lifetime parameter.
- If there is exactly one input lifetime parameter, all outputs get that lifetime.
- If one of the input parameters is &self (or &mut self), that reference’s lifetime gets assigned to all outputs.
-
These rules are not (meant to be) foolproof, but the borrow checker catches these lapses:
struct LifetimeTest {
    name: String
}

impl LifetimeTest {
    fn announce(&self, announcement: &str) -> &str {
        announcement
    }
}
- This fails because the returned slice is assigned &self’s lifetime (via rules 1 and 3 above), but actually needs announcement’s lifetime. The failure message is fairly comprehensive:
error[E0623]: lifetime mismatch
  --> src/main.rs:81:9
   |
80 |     fn announce(&self, announcement: &str) -> &str {
   |                                      ----     ----
   |                                      |
   |                                      this parameter and the return type are declared with different lifetimes...
81 |         announcement
   |         ^^^^^^^^^^^^ ...but data from `announcement` is returned here
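One way to resolve the announce error is to annotate the method explicitly, so the result is tied to announcement instead of &self. The sketch below also exercises longest and Foo from these notes; the `name` accessor is an addition for illustration, not part of the original snippet.

```rust
/// Both inputs and the output share the lifetime 'a.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

/// A struct holding a reference needs a lifetime parameter.
struct Foo<'a> {
    text: &'a str,
}

struct LifetimeTest {
    name: String,
}

impl LifetimeTest {
    /// Elision would tie the result to &self; the explicit 'a ties it to `announcement`.
    fn announce<'a>(&self, announcement: &'a str) -> &'a str {
        announcement
    }

    /// Here elision is what we want: the result borrows from &self.
    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let outer = String::from("long string is long");
    {
        let inner = String::from("xyz");
        // 'a is the overlap of both borrows, so the result is used inside this block.
        let result = longest(outer.as_str(), inner.as_str());
        println!("longest: {}", result);
    }

    let foo = Foo { text: &outer };
    println!("foo.text: {}", foo.text);

    let kept;
    {
        let tester = LifetimeTest { name: String::from("tester") };
        println!("name: {}", tester.name());
        // The result borrows from `outer`, not from `tester`.
        kept = tester.announce(&outer);
    }
    // Still valid after `tester` is gone.
    println!("kept: {}", kept);
}
```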
Ownership, References, and Borrowing
https://doc.rust-lang.org/book/print.html#ownership-and-functions
-
Ownership rules
- Each value has an owner (a variable).
- There can be only one owner at a given time.
- The value is dropped (its memory is freed; there is no garbage collector) when the owner goes out of scope.
-
Here, s2 is taking ownership of the String that s1 points to, so s1 can no longer be used.
let s1 = String::from("hello");
let s2 = s1;
// Compile-time error if s1 is used here.
-
Here the value is an integer, which implements Copy and lives entirely on the stack, so it is copied rather than moved and both x and y remain usable:
let x = 5;
let y = x;
-
Ownership is transferred during function calls too:
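fn main() {
    let s = String::from("hello");
    takes_ownership(s);
    // s is no longer valid after the call above; `takes_ownership` owns it now
}

// The String is dropped when `some_string` goes out of scope.
fn takes_ownership(some_string: String) {
    println!("{}", some_string);
}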
-
Ownership can be transferred back to the caller via return value(s):
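fn main() {
    let s = String::from("hello");
    let s2 = takes_and_returns_ownership(s);
    // s is no longer valid here, but s2 is.
    println!("{}", s2);
}

// Takes ownership of the String and hands it straight back.
fn takes_and_returns_ownership(a_string: String) -> String {
    a_string
}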
-
This is too cumbersome to do every time you want to use a value without necessarily owning it. What if we want to let a function use a value but not take ownership?
-
Rust fixes this with references.
fn main() {
    let s1 = String::from("hello");
    // calculate_length gets a reference to s1, and so doesn't own s1.
    let len = calculate_length(&s1);
    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}
-
Having references as function parameters is called borrowing.
-
A function that accepts a reference can’t modify the data it refers to, unless it’s a mutable reference:
fn main() {
    let mut s = String::from("hello");
    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}
-
There are rules around mutable references, though, which allow Rust to avoid data races at compile time:
- You can have only one mutable reference to a particular piece of data in a particular scope.
- You cannot have a mutable reference while you have an immutable one (in a particular scope).
- Effectively, this is “either one mutable reference or many immutable references are allowed in a given scope”.
-
A reference’s scope starts from where it is introduced and continues through the last time that reference is used, so scopes are not necessarily denoted by {}.
fn main() {
    let mut s = String::from("hello");
    let r1 = &s;
    let r2 = &s;
    println!("{} and {}", r1, r2);
    // r1 and r2's scope _implicitly_ ends here
    // so this is fine
    let r3 = &mut s;
    println!("{}", r3);
}
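Putting the section together: a sketch that moves a value in and out of a function, clones to keep the original usable, and then borrows immutably and mutably in turn. All function names here are hypothetical.

```rust
/// Takes ownership of the String and hands it back along with its length.
fn takes_and_returns(s: String) -> (String, usize) {
    let len = s.len();
    (s, len)
}

/// Borrows immutably: the caller keeps ownership.
fn calculate_length(s: &str) -> usize {
    s.len()
}

/// Borrows mutably: the caller keeps ownership but allows modification.
fn append_world(s: &mut String) {
    s.push_str(", world");
}

fn main() {
    let s1 = String::from("hello");
    // clone() makes a deep copy, so s1 stays valid afterwards.
    let s2 = s1.clone();

    // s1 is moved into the function and moved back out via the return value.
    let (s1, len) = takes_and_returns(s1);
    println!("{} has length {}", s1, len);

    let mut s = s2;
    let r1 = &s;
    let r2 = &s;
    // The last use of r1 and r2 is on the next line, so their borrows end there.
    println!("{} and {}", r1, calculate_length(r2));

    // A mutable borrow is fine now that no immutable borrows are live.
    append_world(&mut s);
    println!("{}", s);
}
```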
Slices
https://doc.rust-lang.org/book/ch04-03-slices.html
-
A string slice is a reference to a portion of a string. Type signature is &str.
-
Implemented as a pointer to a particular string index, coupled with a length.
-
While a string slice is alive, you can’t obtain a mutable reference to the string, so the slice is guaranteed to be valid wrt. the underlying string.
-
For this code:
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
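The underlying representation looks like this: hello points at byte 0 of s with length 5, and world points at byte 6 with length 5 (figure: /2020-04-30.18.46.55.png).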
-
Get a slice of the entire string with &s[..]. &s has type &String rather than &str, but automatic deref coercion occurs, so you can pass &s to a function that expects &str.
-
You can get slices of any type:
let a = [1, 2, 3, 4, 5];
// `slice` has the type &[i32]
let slice = &a[1..3];
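A sketch of functions that take and return slices. Because the parameters are &str and &[i32], the same functions work on owned values, literals, and sub-slices. Both function names are made up.

```rust
/// Returns the text before the first space, as a slice into `s`.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(index) => &s[..index],
        None => s,
    }
}

/// Returns the slice without its first and last elements.
fn middle(values: &[i32]) -> &[i32] {
    if values.len() < 2 {
        return &[];
    }
    &values[1..values.len() - 1]
}

fn main() {
    let s = String::from("hello world");
    // &String coerces to &str at the call site.
    println!("{}", first_word(&s));
    // String literals are already &str.
    println!("{}", first_word("single"));
    // A sub-slice works too.
    println!("{}", first_word(&s[6..]));

    let a = [1, 2, 3, 4, 5];
    println!("{:?}", middle(&a));
}
```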
Smart Pointers
-
These are custom reference types that can store additional data.
-
Smart pointers are usually custom types that implement Deref and Drop.
-
Rust walks the chain of types that implement Deref to allow for automatic dereferencing/deref coercion to a given target type:
fn hello(name: &str) {
    println!("Hello, {}!", name);
}

fn main() {
    // Box<T> implements `Deref` and returns a `&T` (in this case a `&String`)
    // ... which in turn implements `Deref` and returns a `&str`
    // ... which `hello` accepts
    let m = Box::new(String::from("Rust"));
    hello(&m);
}
-
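There can’t be multiple paths to the target reference type: Deref has a single associated Target type, so each type derefs to exactly one other type and the compiler follows that one chain.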
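A minimal sketch of a custom smart pointer, assuming a made-up `Wrapper` type. It implements Deref (with its single Target) and Drop, and is then passed where a &str is expected.

```rust
use std::ops::Deref;

/// A hypothetical wrapper that owns a single value.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    // Exactly one Target per type, so the deref chain is unambiguous.
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Drop for Wrapper<T> {
    fn drop(&mut self) {
        println!("dropping Wrapper");
    }
}

fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let m = Wrapper(String::from("Rust"));
    // Deref coercion: &Wrapper<String> -> &String -> &str
    println!("{}", greeting(&m));
    // The same steps written out by hand.
    println!("{}", greeting(&(*m)[..]));
    // `m` is dropped at the end of main, which prints "dropping Wrapper".
}
```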
-
The standard library provides three commonly used smart pointer types.
Box<T>
-
A container pointing to data on the heap
-
The box owns the data on the heap, and allows mutable/immutable borrows.
-
Different from a regular reference in that it allows boxing primitives.
-
And allows for indirection in recursive type definitions:
// Fails because this definition recurses indefinitely
// when Rust attempts to figure out how much memory
// a `List<T>` needs.
enum List<T> { Cons(T, List<T>), Nil }

// This works
enum List<T> { Cons(T, Box<List<T>>), Nil }
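A sketch of the boxed list in use. The `sum` function is an addition for illustration; it walks the list by borrowing through each Box.

```rust
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

use List::{Cons, Nil};

/// Adds up every element by following the boxed tail pointers.
fn sum(list: &List<i32>) -> i32 {
    match list {
        // `rest` is a &Box<List<i32>>, which deref-coerces to &List<i32>.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum: {}", sum(&list));

    // A Box is a single pointer, which is why the recursive type has a known size.
    println!("box bytes: {}", std::mem::size_of::<Box<List<i32>>>());
}
```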
Rc<T> (Reference Counted Pointer)
-
Use this if you want multiple owners for something (and having one owner + immutable references is untenable).
-
This is a reference counting container, (effectively) allowing multiple owners for a single allocation.
-
Uses its Drop implementation to deallocate when its reference count drops to 0.
-
Supports weak references via downgrade.
-
Use Rc::clone to obtain a new (owning) reference to the allocation.
-
This pattern is vulnerable to memory leaks if Rc values form a reference cycle; weak references can break such cycles.
-
An Rc<T> derefs to an immutable reference to the underlying data, but can provide a mutable reference (assuming it’s the only reference present) too via get_mut.
-
Arc<T> is a thread-safe version, and can be used with Mutex to allow multiple threads to own a piece of data:
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // We're going to share this empty vector across threads
    let data = Arc::new(Mutex::new(vec![]));

    // Start 16 threads, give each one ownership of the mutex
    // guarding the vector by cloning the `Arc`. Each thread
    // then calls `lock()`, which admits only one caller at a
    // time and returns a guard that derefs to a mutable
    // reference to the vector. The mutex is unlocked when `v`
    // goes out of scope.
    let threads: Vec<_> = (0..16)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut v = data.lock().unwrap();
                v.push(i);
            })
        })
        .collect();

    // Wait until all 16 threads are done.
    for t in threads {
        t.join().unwrap();
    }

    // Prints the mutex and its vector, which holds 0 through 15.
    // The order of the elements depends on thread scheduling.
    println!("{:?}", data);
}
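The single-threaded Rc<T> behaviour described in this section can be sketched as follows. The function names are invented; the calls (`Rc::clone`, `Rc::strong_count`, `Rc::downgrade`, `Weak::upgrade`, `Rc::get_mut`) are from the standard library.

```rust
use std::rc::Rc;

/// Returns the strong count with two owners, the count after one owner is
/// dropped, and whether a weak reference still upgrades once all owners are gone.
fn rc_counts() -> (usize, usize, bool) {
    let first = Rc::new(String::from("shared"));
    let second = Rc::clone(&first);
    let with_two = Rc::strong_count(&first);

    // A weak reference does not keep the allocation alive.
    let weak = Rc::downgrade(&first);

    drop(second);
    let with_one = Rc::strong_count(&first);

    drop(first);
    let alive = weak.upgrade().is_some();

    (with_two, with_one, alive)
}

/// `get_mut` only hands out a mutable reference when the Rc is unique.
fn unique_mutation() -> (bool, i32) {
    let mut value = Rc::new(1);
    let clone = Rc::clone(&value);
    let blocked = Rc::get_mut(&mut value).is_none();

    drop(clone);
    if let Some(inner) = Rc::get_mut(&mut value) {
        *inner += 1;
    }
    (blocked, *value)
}

fn main() {
    println!("{:?}", rc_counts());
    println!("{:?}", unique_mutation());
}
```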
RefCell<T>
- Use this if you want to mutate something (non-public scratch space, for example) behind an immutable reference.
- A container with interior mutability: the value inside can be mutated even when the RefCell itself is not declared mut. The value is stored inline, not on the heap.
- Single owner. Allows either many immutable borrows or one mutable borrow at a time, but unlike regular references, the borrow rules are enforced at runtime, and Rust panics if there’s a violation.
- Drops the contained value when the owner goes out of scope.
- Doesn’t implement Deref (because this is a Ref Cell, not a reference); use borrow/borrow_mut instead.
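A sketch of the scratch-space use case, assuming a made-up `Counter` type. It also uses `try_borrow_mut`, the non-panicking variant of `borrow_mut`, to show the runtime check without crashing.

```rust
use std::cell::RefCell;

/// A hypothetical type that records calls in private scratch space.
struct Counter {
    calls: RefCell<Vec<String>>,
}

impl Counter {
    fn new() -> Self {
        Counter { calls: RefCell::new(Vec::new()) }
    }

    /// Takes &self, yet still mutates the log through the RefCell.
    fn record(&self, entry: &str) {
        self.calls.borrow_mut().push(entry.to_string());
    }

    fn count(&self) -> usize {
        self.calls.borrow().len()
    }
}

/// Returns true if a second mutable borrow is refused while the first is alive.
fn second_mutable_borrow_is_refused() -> bool {
    let cell = RefCell::new(0);
    let _first = cell.borrow_mut();
    // `borrow_mut` would panic here; `try_borrow_mut` returns an Err instead.
    let refused = cell.try_borrow_mut().is_err();
    refused
}

fn main() {
    let counter = Counter::new();
    counter.record("first");
    counter.record("second");
    println!("calls: {}", counter.count());
    println!("refused: {}", second_mutable_borrow_is_refused());
}
```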
References
- https://doc.rust-lang.org/1.5.0/book/choosing-your-guarantees.html
- https://doc.rust-lang.org/std/rc/struct.Rc.html
- https://doc.rust-lang.org/beta/std/cell/struct.RefCell.html
- https://doc.rust-lang.org/std/sync/struct.Arc.html
- https://doc.rust-lang.org/std/sync/struct.Mutex.html
Convert an existing project to compile to WASM
-
Add to Cargo.toml:
[lib]
crate-type = ["cdylib", "rlib"]
-
Also add this dependency:
wasm-bindgen = "0.2"
-
Annotate public structs/functions with #[wasm_bindgen]
-
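Set up console_error_panic_hook to improve error messages. Add the dependency:
console_error_panic_hook = "0.1.6"
Define the hook:
#[wasm_bindgen]
pub fn set_panic_hook() {
    // This cfg guard only compiles the call when the crate defines a
    // `console_error_panic_hook` feature (as wasm-pack-template does);
    // drop the attribute if the dependency is not optional.
    #[cfg(feature = "console_error_panic_hook")]
    console_error_panic_hook::set_once();
}
And call it once during startup:
set_panic_hook();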
-
Import these (not very well documented) chrono features if necessary:
chrono = { version = "0.4.11", features = ["wasmbind", "js-sys"] }
-
And build as usual:
$ wasm-pack build --target web