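Computer science
Petabyte scale
Server
Distributed file system
Operating system
Distributed data store
File system
Distributed database
File server
Database
Distributed computing
Bandwidth (computing)
Computer network
Big data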
Authors
Konstantin V. Shvachko, Hairong Kuang, Sanjay Radia, Robert J. Chansler
Identifier
DOI:10.1109/msst.2010.5496972
Abstract
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. We describe the architecture of HDFS and report on experience using HDFS to manage 25 petabytes of enterprise data at Yahoo!.