Computer science
Operating system
Distributed file system
Petabyte scale
Big data
Byte
Java
Database
Computer cluster
Open source
File system
Software
Authors
Manish Kumar Gupta, Shrawan Kumar Pandey, Anish Gupta
Identifier
DOI:10.1109/iciem54221.2022.9853179
Abstract
This paper discusses HADOOP (High Availability Distributed Object Oriented Platform), an open-source framework for storing and processing very large amounts of data. HADOOP was originally written in Java. It follows a write-once, read-many model: a file can be read as many times as needed, but its contents are not changed after it is written (a streaming access pattern). A HADOOP cluster is built from heterogeneous computing devices running on commodity hardware and has two main components: HDFS (Hadoop Distributed File System) and MapReduce. HDFS is used for data storage and MapReduce for data processing. HDFS is suitable for storing terabytes to petabytes of data on a cluster of commodity hardware.
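The division of labour the abstract describes, with storage handled separately from a map step and a reduce step, can be illustrated with a small in-memory sketch. This is not Hadoop's API and is not taken from the paper: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a plain Python list stands in for a file that would be stored in HDFS. It only shows the shape of the MapReduce programming model on a word-count task.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over read-only input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input; in a real cluster these lines would be read from HDFS blocks.
    sample = ["hadoop stores data", "hadoop processes data"]
    print(word_count(sample))
```

The input is only ever read, never modified, which mirrors the write-once, read-many access pattern: each map call can run independently on its own portion of the data.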