Each thread: one PC, one set of registers, one stack. Shared address space.
I/O concurrency: a different thread can execute when one is waiting for IO
Parallelism: 2+ compute threads actually running in parallel
Convenience: scheduled and/or background jobs
Async / Event-Driven
Single thread, sits in a loop and waits for input
IO concurrency but no parallelism
Overheads are smaller than threads
Can be more performant when the number of jobs/threads would be large, like a server with a million clients
Hybrid: start N threads with an event loop in each one
Two levels of scheduling: OS scheduling context-switches between OS threads, Go then schedules goroutines onto that thread
Threading issues
Data races: fix with mutexes, channels
Deadlock: in its simplest form: $T1$ is blocked waiting on $T2$, and $T2$ is blocked waiting on $T1$.
Can we consider individual instructions to be atomic in a threading context?
No, some instructions are not
A 32-bit store is likely atomic. If two threads write a 32-bit value to memory at the same time, you’ll end up with one or the other, but not a mixture.
A one-byte store likely isn’t. If two threads write a one-byte value to memory at the same time, you might end up with a mixture based on which 8 bits are being modified. Depends on the processor, but this is likely implemented as: load 32 bits, modify 8 bits, store 32 bits.
More complex instructions like increment likely aren’t unless explicitly specified.
Should locks be stored inside a data structure, like a ConcurrentHashMap?
Reasonable choice, but has some issues
Unnecessary overhead in a single-threaded context
If you need to coordinate two of these data structures, locking needs to happen at a higher level, so the per-data structure locks are wasted. More importantly, this could cause deadlocks.
Coordination in Go
Channels
sync.Cond
WaitGroup
Web Crawler
Recursively fetch pages and pages they link to
Avoid cycles because the graph is cyclic; find a tree-shaped subset of this graph
The actual fetch is embarassingly parallel
Three styles: serial/recursive (DFS), concurrent + mutex, concurrent + channels
Go maps are always pass-by-reference
Go closures are similar to JS; variables within the closure aren’t necessarily copied when the closure is defined.
Go has a race detector; invoke with -race
(Possibly among other things, it) Checks for reads of shared data from a goroutine right after a write from a different goroutine without an intermediate lock/unlock
Not guaranteed to catch all data races
Runtime check, not static analysis
Concurrent + mutex version uses an unbounded number of goroutines