6.824 Lecture 2 - RPC and Threads

Intro

Why Go?
- stdlib RPC package
- Memory-safe, etc. etc.
Threads
- Each thread: one PC, one set of registers, one stack. Shared address space.
- I/O concurrency: a different thread can execute when one is waiting for IO
- Parallelism: 2+ compute threads actually running in parallel
- Convenience: scheduled and/or background jobs
Async / Event-Driven
- Single thread, sits in a loop and waits for input
- IO concurrency but no parallelism
- Overheads are smaller than threads
- Can be more performant when the number of jobs/threads would be large, like a server with a million clients
- Hybrid: start N threads with an event loop in each one
Two levels of scheduling: OS scheduling context-switches between OS threads, Go then schedules goroutines onto that thread
Threading issues
- Data races: fix with mutexes, channels
- Deadlock: in its simplest form: $T1$ is blocked waiting on $T2$, and $T2$ is blocked waiting on $T1$.
Can we consider individual instructions to be atomic in a threading context?
- No, some instructions are not
- A 32-bit store is likely atomic. If two threads write a 32-bit value to memory at the same time, you’ll end up with one or the other, but not a mixture.
- A one-byte store likely isn’t. If two threads write a one-byte value to memory at the same time, you might end up with a mixture based on which 8 bits are being modified. Depends on the processor, but this is likely implemented as: load 32 bits, modify 8 bits, store 32 bits.
- More complex instructions like increment likely aren’t unless explicitly specified.
Should locks be stored inside a data structure, like a ConcurrentHashMap?
- Reasonable choice, but has some issues
- Unnecessary overhead in a single-threaded context
- If you need to coordinate two of these data structures, locking needs to happen at a higher level, so the per-data structure locks are wasted. More importantly, this could cause deadlocks.
Coordination in Go
- Channels
- sync.Cond
- WaitGroup

Web Crawler

Recursively fetch pages and pages they link to
Avoid cycles because the graph is cyclic; find a tree-shaped subset of this graph
The actual fetch is embarassingly parallel
Three styles: serial/recursive (DFS), concurrent + mutex, concurrent + channels
Go maps are always pass-by-reference
Go closures are similar to JS; variables within the closure aren’t necessarily copied when the closure is defined.
Go has a race detector; invoke with -race
- (Possibly among other things, it) Checks for reads of shared data from a goroutine right after a write from a different goroutine without an intermediate lock/unlock
- Not guaranteed to catch all data races
- Runtime check, not static analysis
Concurrent + mutex version uses an unbounded number of goroutines