A learning-based method to optimize cache performance with delayed hit
(Original Chinese title: 延迟命中场景下基于学习方法的缓存性能优化)
Submission date: 2023-06-20
DOI:
Keywords: learning-based; non-stationary popularity; variable object size; delayed hit
Funding: National Natural Science Foundation of China General Program (62072302); NSFC Regional Science Fund Project (62262018); Open Project of the State Key Laboratory of Media Convergence and Communication (Communication University of China) (SKLMCC2021KF011)
Authors and affiliations:
沈志 (Zhi Shen), Shanghai Jiao Tong University
江博闻 (Bowen Jiang), Shanghai Jiao Tong University
江波 (Bo Jiang), Shanghai Jiao Tong University
林涛 (Tao Lin), Communication University of China
Hits: 374    Downloads: 378
Chinese abstract (translated to English):

Most traditional caching algorithms perform content replacement based on simple statistics and, in the vast majority of scenarios, show a large performance gap relative to the offline optimal algorithm. Current machine-learning-based cache policies generally assume that requested objects are of equal size, ignoring the fact that in real workloads object sizes differ and vary over a wide range. As transmission speed approaches its limit while network throughput keeps growing, the effect of delayed hits becomes more pronounced, and the hit rate under delayed hits is not equivalent to the hit rate in traditional algorithms. This paper jointly considers variable object sizes and delayed hits and proposes the ARC-learning+ algorithm. The main contributions are: (1) For the variable-object-size scenario, we analyze the impact of object size on cache performance and propose ARC-learning+ with a probabilistic admission policy. Experiments on real-world traces show that, with variable object sizes, the modified cache replacement strategy comes very close to the best object hit rate achieved by existing algorithms. (2) For the delayed-hit scenario, the ranking function adopted by ARC-learning+ jointly considers the latency incurred by a missed request and the popularity of the request. Experiments show that ARC-learning+ improves latency markedly over other existing caching algorithms when popularity changes rapidly, and remains very close to the best latency performance of existing algorithms under other popularity dynamics.
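To make the size-aware probabilistic admission idea concrete, the sketch below admits an object with a probability that decays exponentially in its size, so that a single very large object cannot easily displace many small, popular ones. The function name probabilistic_admit, the scale parameter size_scale, and the exponential form exp(-size/size_scale) are illustrative assumptions (a form used by AdaptSize-style admission); the abstract does not disclose the exact admission rule used by ARC-learning+.

```python
import math
import random

def probabilistic_admit(object_size_bytes: float, size_scale: float = 1 << 20) -> bool:
    """Size-aware probabilistic admission (a sketch, not the paper's rule).

    Admit an object with probability exp(-size / size_scale), so admission
    probability decays with object size and large objects rarely evict
    many small ones at once.
    """
    p_admit = math.exp(-object_size_bytes / size_scale)
    return random.random() < p_admit

# Example: a 64 KiB object is admitted almost always, a 16 MiB one rarely.
if __name__ == "__main__":
    for size in (64 * 1024, 1 << 20, 16 << 20):
        p = math.exp(-size / float(1 << 20))
        print(f"size={size:>10d} B  admit probability ~ {p:.3f}")
```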

English abstract:

Most traditional caching algorithms rely on simple statistics for content replacement and, in most scenarios, leave a significant performance gap relative to the offline optimal algorithm. Enabled by improvements in CPU performance, machine-learning-based cache strategies have been developed in recent years to close this gap. However, these studies typically assume that all requests are uniform in size, without taking into account the wide range of object sizes that occur in real-world workloads. Moreover, with network throughput continuing to increase as transmission speed approaches its limit, delayed hits become more critical, and the hit rate under delayed hits is no longer equivalent to that of traditional algorithms. In this paper, we consider the caching problem with variable object sizes and delayed hits and propose an algorithm named ARC-learning+. Our contributions are as follows. (1) For variable object sizes, we analyze the impact of object size on cache performance and design a probabilistic admission policy in ARC-learning+ that takes object size into account. Experimental results on real-world traces show that the modified cache replacement strategy comes very close to the best existing caching algorithm. (2) For the delayed-hit scenario, the ranking function adopted by ARC-learning+ jointly considers the latency incurred by a missed request and the popularity of the request. Experimental results show that ARC-learning+ outperforms other existing caching algorithms when popularity changes rapidly, and remains very close to the best latency performance of existing algorithms under other popularity dynamics.
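As a rough illustration of a ranking function that weighs both miss latency and popularity, the sketch below scores each cached object by its estimated request rate times its fetch latency per byte, and evicts the object with the lowest score, i.e. the one whose eviction is expected to add the least aggregate delay. The class CachedObject, its fields, and this particular score are assumptions made for illustration only; the exact ranking function of ARC-learning+ is not given in the abstract.

```python
from dataclasses import dataclass

@dataclass
class CachedObject:
    key: str
    size: int                # bytes
    est_request_rate: float  # requests per second (popularity estimate)
    fetch_latency: float     # seconds to fetch from origin on a miss

def ranking_score(obj: CachedObject) -> float:
    """Delayed-hit-aware ranking score (a sketch, not the paper's formula).

    Under delayed hits, every request that arrives while a miss is still
    being fetched also waits, so the cost of evicting an object grows with
    both its popularity and its fetch latency. Here the score is the
    expected delay saved per unit of cache space.
    """
    return obj.est_request_rate * obj.fetch_latency / max(obj.size, 1)

def pick_eviction_victim(cache: list[CachedObject]) -> CachedObject:
    # Evict the object whose removal is expected to add the least delay.
    return min(cache, key=ranking_score)

if __name__ == "__main__":
    cache = [
        CachedObject("a", size=1 << 20, est_request_rate=5.0, fetch_latency=0.20),
        CachedObject("b", size=4 << 20, est_request_rate=1.0, fetch_latency=0.05),
        CachedObject("c", size=256 << 10, est_request_rate=0.2, fetch_latency=0.50),
    ]
    print("evict:", pick_eviction_victim(cache).key)
```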
