Consistent hashing and random trees

1. Introduction

  • Caching protocols for distributed networks that can be used to decrease or eliminate hotspots
  • Prior work
    • A common approach to overcome hotspots is to stick a cache in front. The cache itself can become a hotspot, though.
    • Shard the cache and use multicast to first check other caches before the origin. This can cause undue network load.
    • Use a tree of caches instead. Start at a leaf cache, and check that cache and all its siblings. Repeat for every level up to the root before trying the origin. The root node can become a hotspot.
    • Use progressively larger sets of “virtual cache sites” instead. A random hash function maps virtual sites to actual cache machines. Every request goes to one random virtual node in every site. Data from smaller sites is copied to larger sites as load increases. The smaller virtual sites can become hotspots because they’re invoked for every query.

2. Model

  • Three types of machines: browsers, servers, and caches
  • Every machine is only aware of $1/t$ of all caches (for some constant $t$)
  • Most messages between machines are either requests for pages or page payloads themselves
  • Latency (or $\delta(m_1, m_2)$) is the time taken to get a message from $m_1$ to $m_2$
  • Objectives: prevent swamping, minimize cache memory usage, and minimize latency from the browser's point of view (a small sketch of these pieces follows)
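
As a concrete (if simplistic) picture of the model, here is a minimal Python sketch. The class names, the `kind` field, and the symmetric latency table are illustrative assumptions of mine, not the paper's notation.

```python
# Toy model of the setup: browsers, servers, and caches exchange page requests
# and page payloads, and delta(m1, m2) is the time to deliver one message.
from dataclasses import dataclass
from typing import Dict, FrozenSet, Union

@dataclass(frozen=True)
class Machine:
    name: str
    kind: str          # "browser", "server", or "cache"

@dataclass(frozen=True)
class PageRequest:     # "please send me this page"
    requester: str
    page: str

@dataclass(frozen=True)
class PagePayload:     # "here is the page's content"
    page: str
    body: bytes

Message = Union[PageRequest, PagePayload]

# delta(m1, m2): latency of getting one message from m1 to m2 (assumed symmetric here).
Latencies = Dict[FrozenSet[str], float]

def delta(latencies: Latencies, m1: Machine, m2: Machine) -> float:
    return latencies[frozenset((m1.name, m2.name))]
```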

3. Random Trees

  • The tree of caches approach can lead to hotspots in the root node because the tree is static
  • Use a randomly-generated tree instead, different for every requested page
  • The basic approach is the same, look in every level of the tree starting at the lowest
  • The tree must be as balanced as possible (although the number of children isn’t restricted)
  • Every node of the tree is mapped to a cache machine except the root, which is mapped to the server holding the page
  • In this protocol, browsers are assumed to know about all caches and all servers; each browser constructs a request (sketched in code at the end of this section) that contains:
    • Identity of the requester
    • Name of the page
    • Ordered list of nodes to be queried (ending with the server holding the page)
    • Mapping from nodes to caches/servers
  • Browsers then send this request to the first node in the chain
  • Each node maintains a counter for the number of times each page was requested, and chooses to cache it when that count hits a threshold ($q$)
  • Once a page’s payload has been retrieved (either from a cache or the server), it is sent back through the same chain of connections that formed the request. The paper says this isn’t too slow because each machine can stream responses, but this still feels unnecessary, especially if some of these intermediaries choose not to cache the payload.
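
To make the request structure concrete, here is a minimal Python sketch of how a browser might assemble one. It assumes a balanced d-ary tree encoded by node indices (root = 0) and a hash of (page, node) as the node-to-cache mapping; the encoding, the depth/fanout values, and the function names are my own illustrative choices, not the paper's.

```python
import hashlib
import random

def node_to_cache(page: str, node: int, caches: list) -> str:
    """Deterministically map a (page, tree-node) pair to one of the known caches."""
    digest = hashlib.sha256(f"{page}:{node}".encode()).digest()
    return caches[int.from_bytes(digest[:8], "big") % len(caches)]

def build_request(requester: str, page: str, server: str,
                  caches: list, depth: int = 3, fanout: int = 8) -> dict:
    """Build the request: requester, page, ordered node list, node -> machine map."""
    # Pick a random leaf of a balanced `fanout`-ary tree of height `depth`
    # (node 0 is the root; children of node v are fanout*v + 1 ... fanout*v + fanout).
    leaf = 0
    for _ in range(depth):
        leaf = fanout * leaf + random.randint(1, fanout)
    # The path from that leaf up to the root is the ordered list of nodes to query.
    path, node = [], leaf
    while node != 0:
        path.append(node)
        node = (node - 1) // fanout
    path.append(0)  # the root comes last: it stands for the server holding the page
    machines = {n: node_to_cache(page, n, caches) for n in path if n != 0}
    machines[0] = server
    return {"requester": requester, "page": page,
            "nodes": path, "machines": machines}
```

The browser would then send this structure to the machine mapped to the first node in `nodes`; on a cache hit (or after reaching the server), the payload flows back along the same chain.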

4. Consistent Hashing

  • When the scope/range of a regular hash function changes, almost all data needs to be moved
  • Nomenclature
    • View: the set of caches that a client is aware of
    • Smoothness: the fraction of objects that must move when a machine is added or removed
    • Spread: number of caches that contain a given object
    • Load: number of objects in a given cache
  • Each client uses a consistent hash function to map an object to a cache in its view
  • We want to keep smoothness and spread low, and don’t want to let the load get too high
  • Use a ranged hash function (drawn from a family of such functions) with the following properties:
    • Balance: given a view $V$, with high probability, each bucket in the view has an $O(1/|V|)$ fraction of the items mapped to it
    • Monotonicity: when new buckets are added, items only move from old buckets to new buckets, not from old to old
    • Spread: across all views, the max number of different buckets a given item could be in
    • Load: for a given bucket, the number of distinct items that at least one view assigns to that bucket
  • Construction of a Ranged Hash Family
    • Say we have two random functions $r_B$ and $r_X$
      • $r_B$ maps buckets randomly to the range $(0.0, 1.0)$
      • $r_X$ maps items randomly to the range $(0.0, 1.0)$
    • The hash function $f_v(i)$ (which bucket from view $v$ do I put object $i$ in, basically) is defined as the bucket $b$ in the view $v$ that minimizes $|r_B(b) - r_X(i)|$
      • Essentially, find the bucket where the two random functions give you values as close together as possible
    • $r_B$ actually places each bucket at $k \log C$ points in the interval, where $C$ is an upper bound on the number of buckets and $k$ is a constant; this replication is what makes the balance property hold with high probability (see the sketch at the end of this section)
  • This ranged hash family satisfies the properties discussed above:
    • Monotonicity: when a new bucket is added, only those items which are now closer to this new bucket than all other buckets move to it. No other movements need take place.
    • Balance: with high probability, the bucket points divide the range $(0.0, 1.0)$ roughly evenly, so each bucket receives a fair share of the items; replicating each bucket $k \log C$ times evens this out further.
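
Here is a minimal sketch of that construction in Python. Hash-derived points stand in for the random functions $r_B$ and $r_X$; the cache names, $k = 3$, and the bucket ceiling $C = 1024$ are arbitrary choices of mine. The final assertion illustrates monotonicity: when a new cache is added, the only items that move are the ones that move to it.

```python
import hashlib
import math

def _point(label: str) -> float:
    """Pseudo-random point in [0, 1) derived from a label; stands in for r_B / r_X."""
    digest = hashlib.sha256(label.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def bucket_points(bucket: str, c_upper_bound: int, k: int = 3) -> list:
    """r_B: replicate each bucket k*log(C) times so the interval is covered evenly."""
    copies = max(1, k * math.ceil(math.log2(c_upper_bound)))
    return [_point(f"{bucket}#{j}") for j in range(copies)]

def assign(item: str, view: list, c_upper_bound: int = 1024) -> str:
    """f_V(i): the bucket in the view with a replica point minimizing |r_B(b) - r_X(i)|."""
    x = _point(item)
    best_bucket, best_dist = None, 2.0
    for b in view:
        for p in bucket_points(b, c_upper_bound):
            d = abs(p - x)
            if d < best_dist:
                best_bucket, best_dist = b, d
    return best_bucket

# Monotonicity in action: adding a cache only pulls items toward the new cache,
# because the existing buckets' points never move.
view = ["cache-a", "cache-b", "cache-c"]
before = {i: assign(f"page-{i}", view) for i in range(20)}
after = {i: assign(f"page-{i}", view + ["cache-d"]) for i in range(20)}
moved = {i for i in before if before[i] != after[i]}
assert all(after[i] == "cache-d" for i in moved)  # movers only go to the new bucket
```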