Sparse Learning with Non-convex Penalty in Multi-classification

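Multinomial logistic regression, Oracle Corporation, Binary classification, Linear discriminant analysis, Feature selection, Penalty method, Multinomial distribution, Mathematics, Mathematical optimization, Variable (mathematics), Optimal discriminant analysis, Discriminant, Artificial intelligence, Computer science, Support vector machine, Quadratic equation, Function (biology), Machine learning, Statistics, Geometry, Mathematical analysis, Software engineering, Evolutionary biology, Biology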
Authors
Nan Li,Hao Helen Zhang
Source
Journal: Journal of Data Science [School of Statistics, Renmin University of China]
Volume/issue: : 56-74 Citations: 1
Identifier
DOI:10.6339/20-jds1000
Abstract

Multi-classification is commonly encountered in data science practice, and it has broad applications in many areas such as biology, medicine, and engineering. Variable selection in multiclass problems is much more challenging than in binary classification or regression problems. In addition to estimating multiple discriminant functions for separating different classes, we need to decide which variables are important for each individual discriminant function as well as for the whole set of functions. In this paper, we address the multi-classification variable selection problem by proposing a new form of penalty, supSCAD, which first groups all the coefficients of the same variable associated with all the discriminant functions altogether and then imposes the SCAD penalty on the supnorm of each group. We apply the new penalty to both soft and hard classification and develop two new procedures: the supSCAD multinomial logistic regression and the supSCAD multi-category support vector machine. Our theoretical results show that, with a proper choice of the tuning parameter, the supSCAD multinomial logistic regression can identify the underlying sparse model consistently and enjoys oracle properties even when the dimension of predictors goes to infinity. Based on the local linear and quadratic approximation to the non-concave SCAD and nonlinear multinomial log-likelihood function, we show that the new procedures can be implemented efficiently by solving a series of linear or quadratic programming problems. Performance of the new methods is illustrated by simulation studies and real data analysis of the Small Round Blue Cell Tumors and the Semeion Handwritten Digit data sets.
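
To make the grouping idea in the abstract concrete, the following is a minimal sketch of a supSCAD-style penalty, not the authors' implementation. It assumes the standard SCAD definition of Fan and Li with the conventional default a = 3.7, and it assumes the coefficients are stored in a matrix `B` with one row per variable and one column per discriminant function. The function names (`scad`, `scad_derivative`, `sup_scad`, `lla_weights`) are hypothetical. The last function shows the per-variable weights that a local linear approximation of SCAD would produce at a current estimate; how the paper combines these with linear or quadratic programming is not stated in the abstract.

```python
import numpy as np


def scad(t, lam, a=3.7):
    """Standard SCAD penalty evaluated at |t| (assumes lam > 0 and a > 2)."""
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t                                              # |t| <= lam
    quad = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))      # lam < |t| <= a*lam
    const = lam**2 * (a + 1) / 2                                  # |t| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))


def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |t|."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))


def sup_scad(B, lam, a=3.7):
    """Sum over variables of SCAD applied to the sup-norm of each row of B.

    Assumed layout: B[j, k] is the coefficient of variable j in
    discriminant function k.
    """
    sup_norms = np.max(np.abs(B), axis=1)  # one sup-norm per variable
    return float(np.sum(scad(sup_norms, lam, a)))


def lla_weights(B, lam, a=3.7):
    """Per-variable weights from a local linear approximation of SCAD at B."""
    sup_norms = np.max(np.abs(B), axis=1)
    return scad_derivative(sup_norms, lam, a)


# Three variables, three discriminant functions (illustrative values only).
B = np.array([
    [0.0, 0.0, 0.0],    # variable absent from every function
    [0.5, -0.2, 0.1],   # small coefficients
    [5.0, 1.0, -3.0],   # large coefficients
])
penalty = sup_scad(B, lam=1.0)
weights = lla_weights(B, lam=1.0)
```

In this sketch a variable whose sup-norm exceeds a times lam receives a weight of zero, so its large coefficients are no longer shrunk, while variables with small sup-norms keep the full weight lam and are pushed toward zero across all discriminant functions at once.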