Interrater reliability
Rating scale
Intraclass correlation
Psychology
Reliability
Consistency
Ratio
Clinical psychology
Psychiatry
Psychometrics
Developmental psychology
Computer science
Quantum mechanics
Physics
Artificial intelligence
Power (physics)
Identifier
DOI:10.1176/ajp.150.6.947
Abstract
The effects of methods used to improve the interrater reliability of reviewers' ratings of manuscripts submitted to the Journal of the American Academy of Child and Adolescent Psychiatry were studied.

Reviewers' ratings of consecutive manuscripts submitted over approximately 1 year were first analyzed; 296 pairs of ratings were studied. Intraclass correlations and confidence intervals for the correlations were computed for the two main ratings by which reviewers quantified the quality of the article: a 1-10 overall quality rating and a recommendation for acceptance or rejection with four possibilities along that continuum. Modifications were then introduced, including a multi-item rating scale and two training manuals to accompany it. Over the next year, 272 more articles were rated, and reliabilities were computed for the new scale and for the scales previously used.

The intraclass correlation of the most reliable rating before the intervention was 0.27; the reliability of the new rating procedure was 0.43. The difference between these two was significant. The reliability for the new rating scale was in the fair to good range, and it became even better when the ratings of the two reviewers were averaged and the reliability stepped up by the Spearman-Brown formula. The new rating scale had excellent internal consistency and correlated highly with other quality ratings.

The data confirm that the reliability of ratings of scientific articles may be improved by increasing the number of rating scale points, eliciting ratings of separate, concrete items rather than a global judgment, using training manuals, and averaging the scores of multiple reviewers.
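The Spearman-Brown step-up mentioned in the abstract follows a standard formula: if a single rater has reliability r, the average of k raters has reliability kr / (1 + (k - 1)r). A minimal sketch of that arithmetic (the function name is illustrative; 0.43 is the single-rater intraclass correlation reported above, and the abstract does not state the stepped-up value, so the figure below is only what the formula yields for two averaged reviewers):

```python
def spearman_brown(r: float, k: int = 2) -> float:
    """Reliability of the mean of k raters, given single-rater
    reliability r, per the Spearman-Brown prophecy formula."""
    return k * r / (1 + (k - 1) * r)

# Single-rater ICC of 0.43, averaged over two reviewers:
stepped_up = spearman_brown(0.43, k=2)
print(round(stepped_up, 2))  # 2*0.43 / (1 + 0.43) ≈ 0.60
```

This shows why averaging two reviewers helps: a fair single-rater reliability of 0.43 rises to roughly 0.60 for the two-reviewer mean.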