Cloud computing
Computer science
Scale (ratio)
Distributed computing
Data science
Operating system
Geography
Cartography
Identifier
DOI:10.20944/preprints202411.0666.v1
Abstract
The volume of data generated by genomic studies of genetic diseases is growing exponentially, reflecting the rapid development of the field. Traditional on-premises computing resources are limited in compute speed, storage scalability and parallel processing power, which makes it difficult to keep pace with this growth. In this paper, we put forward a cloud-based multi-layer parallel computing framework designed to accelerate and optimise high-throughput analysis of genetic data. First, the elastic scaling of the cloud platform is used to distribute the genetic data held in the object storage system, so that the data can be read and processed dynamically. Next, containerisation and virtualisation technologies are combined to run massively parallel computation on a multi-node cloud cluster, with each node independently processing a different gene sub-data block. The experimental results show that the proposed framework processes data faster and with higher computational efficiency than traditional local computing methods.
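The block-level parallelism described in the abstract can be illustrated with a minimal local sketch. This is not the authors' framework: worker threads stand in for cloud nodes, an in-memory list stands in for the object store, and the names `split_into_blocks`, `count_gc` and `process_in_parallel`, as well as the GC-count analysis itself, are hypothetical placeholders chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Split a list of sequence records into sub-data blocks of at most block_size."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def count_gc(block):
    """Per-block work unit (placeholder analysis): count G and C bases in the block."""
    return sum(seq.count("G") + seq.count("C") for seq in block)


def process_in_parallel(records, block_size=2, workers=4):
    """Process each block independently, then combine the per-block results."""
    blocks = split_into_blocks(records, block_size)
    # Assumption: each worker thread stands in for one cloud node that
    # handles its own block without coordinating with the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_block = list(pool.map(count_gc, blocks))
    return sum(per_block)


# Toy input standing in for gene data read from object storage.
records = ["ATGC", "GGCC", "ATAT", "CGCG", "TTAA"]
total_gc = process_in_parallel(records)
```

Because each block is processed without reference to the others, the same split, map and combine shape would carry over to containers on separate nodes, with the per-block function replaced by the actual analysis step.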