Authors
Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
Source
Journal: Cornell University - arXiv
Date: 2021-01-01
Citations: 116
Identifier
DOI: 10.48550/arxiv.2106.11959
Abstract
The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.
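The ResNet-like baseline mentioned in the abstract applies residual blocks directly to the flat feature vector of a tabular row. A minimal numpy sketch of one such block is below; the exact layer order, normalization, and dropout placement follow the general pre-activation residual pattern and are assumptions here, not a verbatim reproduction of the paper's implementation (dropout is omitted, and a per-batch standardization stands in for BatchNorm):

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def resnet_block(x, w1, b1, w2, b2):
    # Pre-activation residual block for tabular inputs:
    #   x + Linear(ReLU(Linear(norm(x))))
    # norm(): per-batch standardization as a stand-in for BatchNorm.
    h = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-5)
    h = relu(linear(h, w1, b1))
    return x + linear(h, w2, b2)

d, d_hidden = 8, 16          # input width and hidden width (illustrative sizes)
x = rng.normal(size=(4, d))  # a mini-batch of 4 tabular rows
w1 = rng.normal(size=(d, d_hidden)); b1 = np.zeros(d_hidden)
w2 = rng.normal(size=(d_hidden, d)); b2 = np.zeros(d)

y = resnet_block(x, w1, b1, w2, b2)
print(y.shape)  # the residual connection keeps the feature dimension: (4, 8)
```

Because the skip connection adds the input back to the transformed features, the block preserves the row dimensionality, so blocks can be stacked to arbitrary depth before a final linear prediction head.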