The majority of the Internet’s content is delivered by global caching networks, also known as Content Delivery Networks (CDNs). CDNs enhance performance by caching content on servers located close to users. This proximity enables fast content delivery but requires CDNs to operate servers in hundreds of networks around the world.
A major operational cost factor is the bandwidth cost between CDN caching servers and data centers storing the original copies of web content. Hence, CDNs aim to maximize the fraction of bytes served locally from the cache, which is also known as the byte hit ratio (BHR).
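To make the BHR metric concrete, the following is a minimal sketch that replays a request trace through a size-aware cache and reports the fraction of requested bytes served from the cache. The LRU eviction policy, the function name `byte_hit_ratio`, and the `(object_id, size_bytes)` trace format are assumptions chosen for illustration; they are not taken from the paper.

```python
from collections import OrderedDict


def byte_hit_ratio(requests, capacity_bytes):
    """Replay (object_id, size_bytes) requests through an LRU cache.

    Returns hit bytes divided by total requested bytes (the BHR).
    LRU is an illustrative assumption, not the policy from the source.
    """
    cache = OrderedDict()  # object_id -> size, least recently used first
    used = 0               # bytes currently stored in the cache
    hit_bytes = 0
    total_bytes = 0

    for obj_id, size in requests:
        total_bytes += size
        if obj_id in cache:
            # Hit: bytes are served locally; refresh recency.
            hit_bytes += size
            cache.move_to_end(obj_id)
            continue
        if size > capacity_bytes:
            # Object can never fit; fetch from origin without caching it.
            continue
        # Miss: evict least recently used objects until the new one fits.
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[obj_id] = size
        used += size

    return hit_bytes / total_bytes if total_bytes else 0.0


if __name__ == "__main__":
    trace = [("a", 100), ("b", 200), ("a", 100), ("c", 300), ("b", 200)]
    print(byte_hit_ratio(trace, capacity_bytes=400))
```

Because hits are weighted by object size, a cache that keeps a few large, popular objects can reach a higher BHR than one that maximizes the number of hit requests, which is why BHR tracks bandwidth cost more closely than a plain request hit ratio.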
To improve efficiency, CDNs seek to remove their dependence on manual parameter tuning. Fortunately, recent advances in reinforcement learning (RL) promise a general approach to systems that “manage resources on their own”.
Existing proposals for caching rely on “model-free” RL, where the system starts without any knowledge (or bias) about the task at hand. Such systems learn to make decisions from experience interacting with their environment, where good behavior is reinforced via a reward function. While model-free RL is very popular, recent discussions in the RL community highlight three key challenges:
- First, millions of learning samples are typically required, which leads to slow reaction times in dynamic environments.
- Second, overfitting to past samples happens frequently, which complicates dealing with unexpected situations and can lead to unintended behaviors.
- Third, debugging and maintenance are complicated by high sensitivity to hyperparameters and random seeds.
For Internet-facing systems, these challenges are a significant roadblock. For example, CDN servers face quickly changing conditions that include unexpected (or even adversarial) traffic patterns. CDN servers also need to be easily maintainable while serving requests at 40+ Gbit/s. While more sophisticated learning techniques, such as model-based RL, promise faster and more robust learning, they typically lead to significantly higher complexity and computational overhead.
For this reason, lightweight and robust machine learning techniques will be necessary. Read more in this paper by Daniel S. Berger (Carnegie Mellon University):