
A comparative evaluation of aggregation methods for machine learning over vertically partitioned data

Data mining, Support vector machine, Feature selection, Big data, Set (abstract data type)
Authors
Bernardo Trevizan, Jorge Cristhian Chamby-Diaz, Ana L. C. Bazzan, Mariana Recamonde-Mendoza
Source
Journal: Expert Systems With Applications [Elsevier BV]
Volume: 152: 113406  Cited by: 3
Identifier
DOI: 10.1016/j.eswa.2020.113406
Abstract

Applications in which data are naturally generated in a distributed fashion are increasingly common, especially since the emergence of technologies such as the Internet of Things (IoT). In sensor networks, collaborative health or genomic projects, and credit risk analysis, among other domains, distinct features are collected from multiple sources, including social media and mobile applications, and, owing to privacy concerns or communication costs, may not be shared among sites. This scenario of vertical data partitioning poses challenges to traditional machine learning (ML) approaches, as classical algorithms are designed to learn from the complete set of features. A common strategy is to combine predictions from local models trained at each site into a global model, and several aggregation methods have been proposed for this purpose. In this work we address a gap in the related literature by performing a comparative evaluation of elementary and meta-learning-based aggregation methods, revealing their strengths and weaknesses on 46 datasets with varied characteristics. We show that no method outperforms its counterparts in all domains, which emphasizes the need for experimental comparison to ensure a good choice in the domain of interest. Moreover, our experiments provide the first insights into the relations between datasets’ properties and aggregators’ performance. We show that for low class imbalance and a good instance-to-feature ratio, almost all aggregation methods tend to perform well. The silhouette coefficient (reflecting class separability) and the class imbalance coefficient are the properties with the greatest influence on aggregators’ performance; we therefore recommend analyzing them in the first step of the methodological design. We found that arithmetic-based methods are not suitable for datasets with poor class separability and a large number of classes, whereas meta-learning approaches are less sensitive for datasets with a silhouette coefficient close to 0. Our analyses were summarized as classification and regression trees, which have the potential to serve as practical tools for future research. Taken together, our findings give rise to interesting applications in the domain of intelligent systems, especially regarding their potential to reduce the burden of vast experimental comparisons when training ML models with feature-partitioned data.
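
As a rough illustration of what the abstract calls elementary (arithmetic-based) aggregation, the sketch below combines class probabilities produced by local models trained at separate sites. It is not the paper's implementation: the function names `mean_aggregate` and `majority_vote` and the `site_outputs` values are hypothetical, chosen only to show that two simple aggregators can disagree on the same instance. A meta-learning aggregator would instead train a further model on the local outputs, which is not shown here.

```python
from collections import Counter

def mean_aggregate(local_probs):
    """Average per-class probabilities across sites and return the argmax class."""
    n_classes = len(local_probs[0])
    avg = [sum(p[c] for p in local_probs) / len(local_probs)
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])

def majority_vote(local_probs):
    """Let each site vote for its most probable class; return the most common vote."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in local_probs]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical outputs of three local models for one instance (two classes).
# Each site sees only its own feature subset, so the models may disagree.
site_outputs = [
    [0.45, 0.55],  # site A leans slightly toward class 1
    [0.40, 0.60],  # site B leans toward class 1
    [0.95, 0.05],  # site C is very confident in class 0
]

mean_pred = mean_aggregate(site_outputs)  # averaged probabilities favor class 0
vote_pred = majority_vote(site_outputs)   # two of three sites vote for class 1
print(mean_pred, vote_pred)
```

Here one confident site outweighs two weakly confident ones under averaging but not under voting, which is one concrete reason the choice of aggregator can depend on the dataset.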