Reinforcement Learning for Datacenter Congestion Control
Abstract
We approach the task of network congestion control in datacenters using Reinforcement Learning (RL). Successful congestion control algorithms can dramatically improve latency and overall network throughput. Until today, no such learning-based algorithms have shown practical potential in this domain. Evidently, the most popular recent deployments rely on rule-based heuristics that are tested on a predetermined set of benchmarks. Consequently, these heuristics do not generalize well to newly-seen scenarios. Contrarily, we devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks. We overcome challenges such as partial observability, non-stationarity, and multi-objectiveness. We further propose a policy gradient algorithm that leverages the analytical structure of the reward function to approximate its derivative and improve stability. We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training. Our experiments, conducted on a realistic simulator that emulates communication networks' behavior, exhibit improved performance concurrently on the multiple considered metrics compared to the popular algorithms deployed today in real datacenters. Our algorithm is being productized to replace heuristics in some of the largest datacenters in the world.
Similar resources
Flow and Congestion Control for Datacenter Networks
The limits of power dissipation and Moore's law are leading toward increasing parallelism and a shift of focus from CPUs to interconnection networks. This trend is also reflected in the rise of blade-based datacenters, which cluster server and storage units packaged as blades, with several networks. We begin with the trends and requirements of datacenter interconnection networks. Next, we show ...
TIMELY: RTT-based congestion control for the datacenter – Public Review
The context is datacenter congestion control. Traditional TCP transport stacks fare poorly in this environment, which has led to considerable interest in recent years in developing specialized transports that aim to deliver high bandwidth utilization at extremely low, microsecond-level packet latency. This is important for demanding datacenter applications such as cloud storage and near-realtim...
Xavier: A Reinforcement-Learning Approach to TCP Congestion Control
Controlling congestion is a fundamental problem in computer networks. If the input load is greater than the output bandwidth at a particular switch, the bottleneck’s queue begins to fill up and we say that it is congested. In pathological scenarios and under certain protocols, the saturation of buffers, or bufferbloat [5], can lead to congestion collapse, a condition in which congestion reaches...
Centralized Congestion Control and Scheduling in a Datacenter
We consider the problem of designing a packet-level congestion control and scheduling policy for datacenter networks. Current datacenter networks primarily inherit the principles that went into the design of the Internet, where congestion control and scheduling are distributed. While a distributed architecture provides robustness, it suffers in terms of performance. Unlike the Internet, a data center is fu...
Reinforcement Learning for Control
Reinforcement learning (RL) offers a principled way to control nonlinear stochastic systems with partly or even fully unknown dynamics. Recent advances in areas such as deep learning and adaptive dynamic programming (ADP) have led to significant inroads in applications from robotics, automotive systems, smart grids, game playing, traffic control, etc. This open track provides a forum of interac...
Journal
Journal title: Performance Evaluation Review
Year: 2022
ISSN: 1557-9484, 0163-5999
DOI: https://doi.org/10.1145/3512798.3512815