Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data

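sampling bias; selection bias; statistics; selection (genetic algorithm); data collection; sample size determination; econometrics; selection; computer science; sampling (signal processing); population; sample (material); range (aeronautics); statistical power; inference; sampling distribution; mathematics; estimator; variance (accounting); filter (signal processing); chromatography; composite material; chemistry; materials science; computer vision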
Authors
Steven Phillips,Miroslav Dudík,Jane Elith,Catherine H. Graham,Anthony Lehmann,John R. Leathwick,Simon Ferrier
Source
Journal: Ecological Applications [Wiley]
Volume (issue): 19 (1): 181-197 Cited by: 1962
Identifier
DOI:10.1890/07-2153.1
Abstract

Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. 
We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
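
The two ways of building background data described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the grid-cell representation of sites, and the per-cell `effort` weights are all hypothetical, and pooling the target group's records into a set of distinct sites is an assumption made here for simplicity.

```python
import random


def target_group_background(occurrences_by_species):
    """Pool the sites of every species in the target group into one background set.

    Assumption: sites are hashable (x, y) cells and duplicates are collapsed.
    """
    sites = set()
    for species_sites in occurrences_by_species.values():
        sites.update(species_sites)
    return sorted(sites)


def biased_background(cells, effort, n, seed=0):
    """Draw n background cells with probability proportional to survey effort.

    Assumption: `effort` is a known, non-negative weight per cell. The abstract
    notes that such information is usually lacking in practice.
    """
    rng = random.Random(seed)
    return rng.choices(cells, weights=effort, k=n)


# Toy records for a hypothetical target group of three species.
records = {
    "species_a": [(0, 0), (0, 1)],
    "species_b": [(0, 1), (1, 0)],
    "species_c": [(0, 0)],
}
print(target_group_background(records))  # [(0, 0), (0, 1), (1, 0)]
```

The first function stands in for the case where survey effort is unknown and the pooled records of similarly observed species serve as a proxy for it; the second shows explicit biased sampling when an effort estimate is available.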