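Data warehouse
Computer science
Big data
Ontology
Process (computing)
Schema (genetic algorithms)
Health care
Data transformation
Database
Data integration
Transformation (genetics)
Domain (mathematical analysis)
Data science
Data mining
Information retrieval
Operating system
Mathematical analysis
Philosophy
Mathematics
Epistemology
Economy
Economic growth
Biochemistry
Chemistry
Gene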
Source
Journal: International Journal of Data Mining, Modelling and Management
[Inderscience Enterprises Ltd.]
Date: 2023-01-01
Volume (Issue): 15 (3): 240-254
Identifier
DOI:10.1504/ijdmmm.2023.132974
Abstract
The healthcare domain generates huge amounts of data from different, heterogeneous clinical data sources using different devices, with the aim of managing hospital performance well. Because of the quantity and structural complexity of the data, we use a big healthcare data warehouse, first for storage and later for decision making. To achieve this goal, we propose a new process for handling this type of data. It starts by unifying the different data, then extracts it, loads it into the big healthcare data warehouse, and finally performs the necessary transformations. For the first step, an ontology is used, which is the best solution to the problem of data source heterogeneity. We also use Hadoop and its ecosystem, including Hive, MapReduce and HDFS, to accelerate processing through parallelism, exploiting the performance of ELT to ensure 'schema-on-read', where the data is stored before the transformation tasks are performed.
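The ordering described in the abstract (unify, extract, load, then transform at read time) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names, the `FIELD_ONTOLOGY` dictionary standing in for an ontology, and the in-memory list standing in for HDFS storage are all assumptions made for illustration, and no Hadoop or Hive calls are involved.

```python
import json

# Hypothetical stand-in for an ontology: source-specific field names are
# mapped to one shared concept name. A real ontology would be far richer.
FIELD_ONTOLOGY = {
    "pat_id": "patient_id",
    "patientId": "patient_id",
    "temp_c": "temperature_celsius",
    "bodyTemp": "temperature_celsius",
}


def unify(record):
    """Rename source-specific keys to the shared concept names."""
    return {FIELD_ONTOLOGY.get(key, key): value for key, value in record.items()}


def extract_and_load(sources, store):
    """Extract records and load them as raw JSON lines; no types are enforced yet."""
    for source in sources:
        for record in source:
            store.append(json.dumps(unify(record)))


def read_with_schema(store, schema):
    """Schema-on-read: select columns and cast types only when the data is read."""
    for line in store:
        raw = json.loads(line)
        yield {column: cast(raw[column]) for column, cast in schema.items()}


# Two made-up clinical sources with different field names and value types.
source_a = [{"pat_id": "17", "temp_c": "38.2"}]
source_b = [{"patientId": 42, "bodyTemp": 36.9}]

raw_store = []  # assumption: stands in for raw files in distributed storage
extract_and_load([source_a, source_b], raw_store)

schema = {"patient_id": int, "temperature_celsius": float}
rows = list(read_with_schema(raw_store, schema))
```

Because the raw records are stored before any typing or reshaping, a different `schema` can later be applied to the same stored data without reloading it, which is the property the abstract attributes to ELT.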