Feature Screening for Interval-Valued Response with Application to Study Association between Posted Salary and Required Skills

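Keywords: Wage; Feature (linguistics); Interval (graph theory); Outlier; Estimator; Statistics; Computer science; Mathematics; Econometrics; Artificial intelligence; Combinatorics; Economics; Market economy; Linguistics; Philosophy
Authors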
Wei Zhong,Chen Qian,Wanjun Liu,Liping Zhu,Runze Li
Source
Journal: Journal of the American Statistical Association [Informa]
Volume (issue): 118 (542): 805-817 Cited by: 3
Identifier
DOI:10.1080/01621459.2022.2152342
Abstract

Quantifying the differences in returns to skills from online job advertisement data is important and has attracted great interest in both labor economics and statistics. In this article, we study the relationship between the posted salary and the job requirements in online labor markets. There are two challenges to deal with. First, the posted salary is always presented in an interval-valued form, for example, 5k–10k yuan per month. Simply taking the midpoint or the lower bound as a proxy for the salary may result in biased estimators. Second, the number of potential skill words generated as predictors from the job advertisements by word segmentation is very large, and many of them may not contribute to the salary. To this end, we propose a new feature screening method, Absolute Distribution Difference Sure Independence Screening (ADD-SIS), to select important skill words for the interval-valued response. The marginal utility for feature screening is based on the difference of estimated distribution functions obtained via nonparametric maximum likelihood estimation, which makes full use of the interval information. It is model-free and robust to outliers. Numerical simulations show that the new method, which uses the interval information, is more efficient at selecting important predictors than methods based only on single points of the intervals. In the real data application, we study the text data of job advertisements for data scientists and data analysts on a major Chinese online job posting website and explore the skill words that are important for the salary. We find that skill words such as optimization, long short-term memory (LSTM), convolutional neural networks (CNN), and collaborative filtering are positively correlated with the salary, while words such as Excel, Office, and data collection may contribute negatively to the salary. Supplementary materials for this article are available online.
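
The screening idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ADD-SIS statistic: the function names `npmle_cdf` and `screening_utility`, the simulated data, and the choice to average the absolute difference over a grid are all assumptions made for illustration. The distribution function is estimated with a Turnbull-style self-consistency (EM) iteration whose mass is restricted to the observed interval endpoints, which is a simplification of the full nonparametric maximum likelihood estimator.

```python
import numpy as np

def npmle_cdf(lower, upper, grid, n_iter=200):
    """Self-consistency (EM) estimate of a CDF from intervals [lower, upper].

    Simplifying assumption: probability mass may sit only on `grid`,
    and every interval contains at least one grid point.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # A[i, j] = 1 if grid point j lies inside the interval of observation i.
    inside = (grid[None, :] >= lower[:, None]) & (grid[None, :] <= upper[:, None])
    A = inside.astype(float)
    p = np.full(grid.size, 1.0 / grid.size)   # start from uniform mass
    for _ in range(n_iter):
        w = A * p                              # mass compatible with each interval
        w /= w.sum(axis=1, keepdims=True)      # conditional mass per observation
        p = w.mean(axis=0)                     # updated mass on the grid
    return np.cumsum(p)

def screening_utility(lower, upper, x, grid):
    """Hypothetical marginal utility for a binary predictor x: the mean
    absolute difference between the two estimated conditional CDFs."""
    x = np.asarray(x).astype(bool)
    F1 = npmle_cdf(lower[x], upper[x], grid)
    F0 = npmle_cdf(lower[~x], upper[~x], grid)
    return float(np.mean(np.abs(F1 - F0)))

# Simulated example (illustrative data, not from the paper): only the
# first of three binary "skill word" indicators shifts the latent salary.
rng = np.random.default_rng(0)
n = 400
X = rng.integers(0, 2, size=(n, 3))
salary = 10.0 + 4.0 * X[:, 0] + rng.normal(0.0, 2.0, size=n)
lower = salary - rng.uniform(0.5, 3.0, size=n)   # posted lower bound
upper = salary + rng.uniform(0.5, 3.0, size=n)   # posted upper bound
grid = np.unique(np.concatenate([lower, upper]))

utilities = [screening_utility(lower, upper, X[:, j], grid) for j in range(3)]
ranking = np.argsort(utilities)[::-1]            # predictors, most important first
```

In this sketch a predictor is ranked by how far apart the salary distributions of the two groups are estimated to be, using the whole interval rather than a midpoint or lower bound. A screening step would then keep the top-ranked predictors; how many to keep is left open here.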