Roblox Return to Service 2021
https://blog.roblox.com/2022/01/roblox-return-to-service-10-28-10-31-2021/
- Consul’s streaming feature has all writes go through a single Go channel, so writes block under load
- Moving from 64-core machines to 128-core machines made things worse - the larger machines were dual-socket, and NUMA increased contention on that channel
- BoltDB (which backs Consul's Raft log) uses a freelist to track freed pages - this grew to roughly 7MB and was written out in full on every write to the database, so even small writes became expensive
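
To get a feel for why the freelist point matters, here is a rough back-of-the-envelope model of write amplification when the whole freelist is rewritten on each commit. This is a toy sketch, not BoltDB code: `ToyStore`, `write_amplification`, the 900,000-entry freelist, and the 16 KiB payload are all hypothetical values chosen for illustration, and the 8 bytes per entry assumes page ids are stored as 64-bit integers.

```python
from dataclasses import dataclass

# Assumption: each free page id is stored as a 64-bit integer.
PAGE_ID_BYTES = 8


@dataclass
class ToyStore:
    """Toy model of a store that rewrites its full freelist on every commit."""

    free_page_ids: int      # number of entries currently in the freelist
    bytes_written: int = 0  # running total of bytes sent to disk

    def freelist_bytes(self) -> int:
        """Size of the serialized freelist in bytes."""
        return self.free_page_ids * PAGE_ID_BYTES

    def commit(self, payload_bytes: int) -> int:
        """Write the payload plus the entire freelist; return bytes written."""
        written = payload_bytes + self.freelist_bytes()
        self.bytes_written += written
        return written


def write_amplification(store: ToyStore, payload_bytes: int) -> float:
    """Ratio of bytes actually written to bytes the caller asked to write."""
    return store.commit(payload_bytes) / payload_bytes


if __name__ == "__main__":
    # Hypothetical numbers: a freelist of 900,000 entries is about 7.2 MB.
    store = ToyStore(free_page_ids=900_000)
    amp = write_amplification(store, payload_bytes=16 * 1024)
    print(f"freelist size: {store.freelist_bytes() / 1e6:.1f} MB")
    print(f"write amplification: {amp:.0f}x")
```

The takeaway from the model is that the cost per commit is dominated by the freelist size rather than the payload size, so the overhead grows with the number of freed pages rather than with the amount of new data.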