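Computer science
Data warehouse
Schema (genetic algorithms)
Data quality
Data science
Process (computing)
Data integration
Data mining
Database
Information retrieval
Programming language
Operations management
Economics
Metric (unit)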
Source
Journal: IEEE Data(base) Engineering Bulletin
Date: 2000-01-01
Volume/issue: 23: 3-13
Citations: 905
Abstract
We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL process. We also discuss current tool support for data cleaning.