Client-side video players employ bitrate adaptation algorithms to cater to the ever-growing QoE requirements of users. These ABR algorithms must balance multiple QoE factors, such as maximizing video bitrate and minimizing rebuffering times. Despite the abundance of recently proposed ABR algorithms, state-of-the-art schemes suffer from two practical challenges: (1) throughput prediction is difficult and inaccurate predictions can lead to degraded performance; (2) existing algorithms use fixed heuristics which have been fine-tuned according to strict assumptions about deployment environments, and such tuning precludes generalization across network conditions and QoE objectives.

To overcome these challenges, we develop Pensieve, a system that generates ABR algorithms entirely using Reinforcement Learning (RL). Pensieve uses RL to train a neural network model that selects bitrates for future video chunks based on observations collected by client video players. Unlike existing approaches, Pensieve does not rely upon pre-programmed models or assumptions about the environment. Instead, it learns to make ABR decisions solely through observations of the resulting performance of past decisions. As a result, Pensieve can automatically learn ABR algorithms that adapt to a wide range of environmental conditions and QoE metrics. We compare Pensieve to state-of-the-art ABR algorithms using trace-driven and real-world experiments spanning a wide variety of network conditions, QoE metrics, and video properties. In all considered scenarios, Pensieve outperforms the best state-of-the-art scheme, with improvements in average QoE of 13.1%–25.0%. Pensieve’s policies generalize well, outperforming existing schemes even on networks on which it was not trained.
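
To make the trade-off between bitrate and rebuffering concrete, the following is a minimal sketch of one possible per-session QoE score that an ABR algorithm might try to maximize. It is an illustration only: the function name `session_qoe`, the linear form, the smoothness term, and the penalty weights are assumptions made here and are not taken from the text above.

```python
def session_qoe(bitrates_mbps, rebuffer_secs,
                rebuffer_penalty=4.0, smooth_penalty=1.0):
    """Hypothetical linear QoE score for one playback session.

    bitrates_mbps: bitrate chosen for each video chunk (Mbps).
    rebuffer_secs: stall time incurred while fetching each chunk (seconds).
    rebuffer_penalty, smooth_penalty: illustrative weights (assumed values).
    """
    if len(bitrates_mbps) != len(rebuffer_secs):
        raise ValueError("need one rebuffer time per chunk")

    # Reward: higher bitrates mean higher perceived quality.
    quality = sum(bitrates_mbps)
    # Penalty: time the player spent stalled.
    stalls = rebuffer_penalty * sum(rebuffer_secs)
    # Penalty: abrupt bitrate changes between consecutive chunks.
    switches = smooth_penalty * sum(
        abs(nxt - cur) for cur, nxt in zip(bitrates_mbps, bitrates_mbps[1:])
    )
    return quality - stalls - switches


if __name__ == "__main__":
    # Three chunks, with a 0.5 s stall on the second chunk.
    print(session_qoe([1.0, 2.0, 2.0], [0.0, 0.5, 0.0]))
```

Under these assumed weights, picking a higher bitrate only pays off if it does not cause enough stalling or switching to cancel the gain, which is the kind of balance a learned policy would have to discover from observed outcomes.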