Computer science
DRAM
Latency (audio)
Context switch
Parallel computing
Overhead (engineering)
Random access
Context (archaeology)
Latency
Embedded system
Operating system
Computer hardware
Memory controller
Semiconductor memory
Paleontology
Biology
Telecommunications
Authors
Tomoya Suzuki, Kazuhiro Hiwada, Hirotsugu Kajihara, Shintaro Sano, Shuou Nomura, Tatsuo Shiozawa
Identifier
DOI: 10.14778/3457390.3457397
Abstract
For applications in which small random accesses frequently occur on datasets that exceed DRAM capacity, placing the datasets on SSD can result in poor application performance. For the read-intensive case we focus on in this paper, low-latency flash memory with microsecond read latency is a promising solution. However, when many such devices are used to achieve high IOPS (input/output operations per second), the CPU processing involved in IO requests becomes an overhead. To tackle this problem, we propose a new access method combining two approaches: 1) optimizing the issuance and completion of IO requests to reduce the CPU overhead, and 2) utilizing many contexts with lightweight context switches implemented by stackless coroutines. These reduce the CPU overhead per request to less than 10 ns, enabling read access with DRAM-like overhead, while the access latency, which is longer than DRAM's, is hidden by the context switches. We apply the proposed method to graph algorithms such as BFS (breadth-first search), which involve many small random read accesses. In our evaluation, large graph data is placed on microsecond-latency flash memories on prototype boards and accessed with the proposed method. As a result, for synthetic and real-world graphs, the execution times of the graph algorithms are 88–141% of those when all the data are placed in DRAM.