The Hazards of Hazard Ratios

Hazard, Environmental Science, Chemistry, Organic Chemistry
Author
Miguel A. Hernán
Source
Journal: Epidemiology [Ovid Technologies (Wolters Kluwer)]
Volume (issue): 21 (1): 13-15 Citations: 1151
Identifier
DOI:10.1097/ede.0b013e3181c1ea43
Abstract

The hazard ratio (HR) is the main, and often the only, effect measure reported in many epidemiologic studies. For dichotomous, non–time-varying exposures, the HR is defined as the hazard in the exposed groups divided by the hazard in the unexposed groups. For all practical purposes, hazards can be thought of as incidence rates and thus the HR can be roughly interpreted as the incidence rate ratio. The HR is commonly and conveniently estimated via a Cox proportional hazards model, which can include potential confounders as covariates.
Unfortunately, the use of the HR for causal inference is not straightforward even in the absence of unmeasured confounding, measurement error, and model misspecification. Endowing a HR with a causal interpretation is risky for 2 key reasons: the HR may change over time, and the HR has a built-in selection bias. Here I review these 2 problems and some proposed solutions.
As an example, I will use the findings from a Women's Health Initiative (WHI) randomized experiment that compared the risk of coronary heart disease of women assigned to combined (estrogen plus progestin) hormone therapy with that of women assigned to placebo.[1] By using a randomized experiment as an example, the discussion can focus on the shortcomings of the HR, setting aside issues of confounding and other serious problems that arise in observational studies.
The Women's Health Initiative followed over 16,000 women for an average of 5.2 years before the study was halted due to safety concerns. The primary result from the trial was a HR. As stated in the abstract[1] and shown in Table 1 of the article, "Combined hormone therapy was associated with a hazard ratio of 1.24."[1] In addition, Table 2 provided the HRs during each year of follow-up: 1.81, 1.34, 1.27, 1.25, 1.45, and 0.70 for years 1, 2, 3, 4, 5, and 6+, respectively. Thus, the HR reported in the abstract and Table 1 can be viewed as some sort of weighted average of the period-specific HRs reported in Table 2.
This brings us to Problem 1: although the HR may change over time, some studies report only a single HR averaged over the duration of the study's follow-up. As a result, the conclusions from the study may critically depend on the duration of the follow-up. For example, the average HR in the WHI would have been 1.8 if the study had been halted after 1 year of follow-up, 1.7 after 2 years,[2] 1.2 after 5 years, and, who knows, perhaps 1.0 after 10 years. The 24% increase in the rate of coronary heart disease that many researchers and journalists regard as the effect of combined hormone therapy is the result of the arbitrary choice of an average follow-up period of 5.2 years. A trial with a shorter follow-up could have reported an 80% increase, whereas a longer trial might have found little or no increase at all.
The magnitude of the average HR depends on the length of follow-up because the average HR ignores the distribution of events during the follow-up. The average HR can take the value 1.0 if the hazard in the exposed is identical to the hazard in the unexposed during the entire follow-up, or if the hazard in the exposed is higher during, say, the first 5 years and lower afterward. Incidentally, the same problem arises whether the average HR is directly estimated in a cohort study, as discussed here, or estimated via the odds ratio of a properly designed case-control study with incidence density sampling.
One might then conclude that we should forget about the average HR and restrict our attention to the period-specific HRs, which seem to capture the potentially time-varying magnitude of the effect. This brings us to Problem 2: the period-specific HRs have a built-in selection bias. To describe the bias, consider that the (discrete-time) hazard during period t is defined as the risk of the outcome during period t among those who reached period t free of the outcome.
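The definitions used in this argument can be written compactly. The notation below is an assumption introduced for illustration and does not come from the trial report: $D_t$ indicates that the outcome has occurred by the end of period $t$ (with $D_0 = 0$ for everyone at baseline), and $A = 1$ or $A = 0$ denotes the exposed or unexposed group.

```latex
\begin{align*}
h_a(t) &= \Pr\left[D_t = 1 \mid D_{t-1} = 0,\ A = a\right], \\
\mathrm{HR}(t) &= \frac{h_1(t)}{h_0(t)}, \\
S_a(t) &= \Pr\left[D_t = 0 \mid A = a\right] = \prod_{k=1}^{t} \bigl(1 - h_a(k)\bigr).
\end{align*}
```

The conditioning on $D_{t-1} = 0$ in the first line is the restriction to those still free of the outcome; the survival $S_a(t)$ carries no such restriction, because it refers to everyone in group $a$ at baseline.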
In the Women's Health Initiative, the calculation of the HR during year t was restricted to women who did not develop coronary heart disease (the "survivors") between baseline and the beginning of year t. The HR after year 5 was 0.7, which means that the disease rate after year 5 was lower in the treatment arm (the hazard in the numerator of the HR) than in the placebo arm (the hazard in the denominator).
However, this apparently protective effect of hormone therapy after year 5 is hardly surprising if one bears in mind that women vary in their susceptibility to heart disease. A certain proportion of all women enrolled in the trial were particularly prone to develop heart disease if they were exposed to hormone therapy or other factors (for simplicity, let's refer to them as the "susceptible women"). The proportion of susceptible women in the trial was of course unknown but, because of randomization, it was expected to be the same in both the treatment and placebo arms at baseline. However, these susceptible women were preferentially excluded from the treatment arm as they developed heart disease over time, precisely because they were assigned to a therapy with harmful effects to which they were susceptible (all other factors to which they were susceptible were expected to be equally distributed between the 2 arms). With time, the proportion of susceptible women progressively increased in the placebo arm compared with the treatment arm. The bias due to the differential selection of less susceptible women over time, because of differential depletion of susceptibles, is the built-in selection bias of period-specific HRs. This bias may explain why the HR after year 5 is less than 1.0 even if hormone therapy has no truly preventive effect in any woman at any time.
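The depletion-of-susceptibles argument above can be checked with a small deterministic calculation. The sketch below is only an illustration under invented assumptions: the cohort has two types of women, the share of susceptible women and all per-period hazards are hypothetical numbers, and the function name `period_specific_hr` is made up here. No woman's hazard is lower under treatment than under placebo.

```python
def period_specific_hr(n_periods=12, frac_susceptible=0.2):
    """Per-period hazards, HRs and survival for a hypothetical two-type cohort."""
    # Per-period hazards by (type, arm). Invented values: treatment never
    # lowers anyone's hazard, and it harms only the susceptible women.
    hazard = {
        ("susceptible", "treatment"): 0.30,
        ("susceptible", "placebo"): 0.10,
        ("other", "treatment"): 0.01,
        ("other", "placebo"): 0.01,
    }
    # Fraction of each arm still free of the outcome, split by type.
    # Randomization makes the two arms identical at baseline.
    alive = {
        arm: {"susceptible": frac_susceptible, "other": 1.0 - frac_susceptible}
        for arm in ("treatment", "placebo")
    }
    results = []
    for t in range(1, n_periods + 1):
        row = {"period": t}
        for arm in ("treatment", "placebo"):
            at_risk = sum(alive[arm].values())
            events = sum(alive[arm][k] * hazard[(k, arm)] for k in alive[arm])
            # Hazard = risk during period t among those still at risk.
            row[f"hazard_{arm}"] = events / at_risk
            for k in alive[arm]:
                alive[arm][k] *= 1.0 - hazard[(k, arm)]
            # Survival = fraction of the whole arm free of the outcome.
            row[f"survival_{arm}"] = sum(alive[arm].values())
        row["hr"] = row["hazard_treatment"] / row["hazard_placebo"]
        results.append(row)
    return results


for row in period_specific_hr():
    print(row["period"], round(row["hr"], 2),
          round(row["survival_treatment"], 3), round(row["survival_placebo"], 3))
```

With these made-up inputs the period-specific HR starts above 2 and ends below 1, while survival in the treatment arm stays below survival in the placebo arm in every period. The crossing of the hazards comes entirely from the changing mix of women still at risk.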
This built-in selection bias of the HR has also been described using causal diagrams.[3,4] In short, the average HR may be uninformative because the period-specific HRs may vary over time, and the period-specific HRs may themselves vary over time because of built-in selection bias. These problems can be overcome by summarizing the study findings as appropriately adjusted survival curves, where the survival at time t is defined as the proportion of individuals who are free of disease through time t. Another alternative not discussed here is the comparison of the distribution of survival times between the exposed and the unexposed, which can be accomplished by using accelerated failure time models[5] rather than Cox models.
Because of the shortcomings of the HR, analyses of randomized experiments routinely include Kaplan-Meier survival curves or their complement, the cumulative risk curve (see Figure 2 of the Women's Health Initiative trial report[1]). In contrast (and despite multiple warnings in the epidemiologic literature[3–6]), analyses of observational follow-up studies are commonly summarized by HRs only.
A possible explanation for this practice in observational studies is the need to deal with confounding. The HRs presented in observational studies are not simply the hazard in the exposed divided by the hazard in the unexposed. Rather, these HRs are adjusted for measured confounders by using regression models, inverse probability weighting, or other methods. Unadjusted HRs would be of little use for causal inference from observational data, as would unadjusted survival curves. It is not unexpected that most epidemiologic articles include HRs only, because epidemiology students are traditionally taught to estimate adjusted HRs but not adjusted survival curves.[7] The next paragraph sketches a general procedure to obtain survival curves adjusted for baseline confounders.
First, fit a discrete-time hazards model (eg, a pooled logistic model with relatively short periods) that estimates, at each time and for each person, the conditional probability of remaining free of the outcome given exposure, baseline covariates, and time of follow-up. Allow for time-varying hazards by modeling the variable "time of follow-up," using a flexible functional form (eg, cubic splines), and for time-varying HRs by adding product terms between exposure and "time of follow-up." Second, for each subject, multiply the model's predicted values through time t to estimate the survival at t for subjects with their same combination of covariate values. One can then construct conditional (adjusted) survival curves under the conditions of exposure and no exposure for each observed combination of values of the baseline covariates (in randomized trials, the survival curves are unconditional or marginal, ie, averaged over all the individuals irrespective of their covariate values). Third, predict the survival at time t for each subject both under exposure and under no exposure, regardless of the subject's exposure status. Fourth, separately average the conditional survivals under exposure and under no exposure, over all subjects. This last step effectively standardizes the curves to the empirical distribution of the covariates in the study, and results in 2 marginal survival curves: one under exposure, another under no exposure.
The above procedure can be extended in a number of ways. In settings with time-varying exposures and confounders, the procedure can be combined with inverse probability weighting of the hazards model. This procedure has been used to present adjusted survival curves under continuous use ("always exposed") and no use of hormone therapy ("never exposed") in the analysis of both observational studies[8] and randomized experiments[9] in which time-varying exposures arise when considering adherence-adjusted analyses.
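The second, third, and fourth steps of the standardization procedure described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation. It assumes the first step has already been done: the coefficients in `COEF` stand in for a fitted pooled logistic model and are invented, the single covariate `age_over_60` is hypothetical, and time enters linearly where a spline would be more flexible.

```python
import math

# Hypothetical coefficients of an already-fitted pooled logistic model for the
# discrete-time hazard. They are invented here, not estimated from any data.
COEF = {
    "intercept": -5.0,
    "time": 0.02,              # linear term for time of follow-up
    "exposure": 0.6,
    "exposure_x_time": -0.05,  # product term: lets the HR vary over time
    "age_over_60": 0.8,        # one baseline covariate (0 or 1)
}


def hazard(t, exposed, age_over_60):
    """Predicted probability of the outcome in period t among those at risk."""
    lp = (COEF["intercept"] + COEF["time"] * t + COEF["exposure"] * exposed
          + COEF["exposure_x_time"] * exposed * t
          + COEF["age_over_60"] * age_over_60)
    return 1.0 / (1.0 + math.exp(-lp))


def conditional_survival(t_max, exposed, age_over_60):
    """Step 2: multiply the probabilities of remaining outcome-free through t."""
    curve, surv = [], 1.0
    for t in range(1, t_max + 1):
        surv *= 1.0 - hazard(t, exposed, age_over_60)
        curve.append(surv)
    return curve


def standardized_survival(covariates, t_max, exposed):
    """Steps 3 and 4: predict for every subject under one exposure, then average."""
    curves = [conditional_survival(t_max, exposed, x) for x in covariates]
    n = len(curves)
    return [sum(c[t] for c in curves) / n for t in range(t_max)]


# Hypothetical cohort: one baseline covariate value per subject.
cohort = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]
surv_exposed = standardized_survival(cohort, t_max=24, exposed=1)
surv_unexposed = standardized_survival(cohort, t_max=24, exposed=0)
print(round(surv_exposed[-1], 4), round(surv_unexposed[-1], 4))
```

Every subject contributes a predicted curve under both exposure levels regardless of the exposure actually received, which is what standardizes the two marginal curves to the same covariate distribution. In a real analysis the coefficients would come from a logistic regression fitted to person-period data, and confidence intervals would need a resampling step that repeats the whole procedure.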
In settings with continuous rather than dichotomous exposures, the procedure requires the choice of a finite number of levels of exposure to be compared ("always versus never exposed" will not do).[10] One may then construct as many survival curves as there are exposure levels of interest. For continuous and time-varying exposures one needs to be especially careful about dose-response assumptions. Sensitivity analyses can be used to evaluate the possibility of model extrapolation beyond the observed data. Confidence intervals for the survival curves can be obtained by bootstrapping.
So should we outlaw the use of HRs in epidemiologic studies? Of course not. A single average HR through t may be misleading, as explained above, but a single survival probability at t could be as misleading because both measures ignore the distribution of events between baseline and t. On the other hand, a series of average HRs for increasingly longer periods of follow-up is informative. For example, in the WHI the average HRs for 1, 2, and 5 years were approximately 1.8, 1.7, and 1.2, which indicates that hormone therapy increases the cumulative risk of heart disease in the early part of the follow-up but probably not much over longer periods. The same conclusion is drawn from the survival curves for the treatment and placebo groups, which converge after 8 years. In mortality studies with sufficiently long follow-up, the survival probabilities in both groups are guaranteed to reach the value 0, and the average HR is guaranteed to reach the value 1.
An advantage of the survival curves over a series of average HRs is that the survival curves provide information about the absolute risks. For example, in the Women's Health Initiative, the average HR of 1.8 during year 1 means that the one-year risk was about 0.49% in the treatment group and 0.28% in the placebo group rather than, say, 49% versus 28%.
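The contrast between relative and absolute measures in this example is simple arithmetic. The short sketch below uses the two pairs of one-year risks quoted above; the helper name `compare_risks` is made up for illustration, and the risk ratio is used as a stand-in for the HR, which is a reasonable approximation only when the outcome is rare.

```python
def compare_risks(risk_treatment, risk_placebo):
    """Return the risk ratio and the risk difference for two risks."""
    return risk_treatment / risk_placebo, risk_treatment - risk_placebo


# One-year risks quoted in the text (as proportions).
ratio_low, diff_low = compare_risks(0.0049, 0.0028)
# The counterfactual pair mentioned in the text for contrast.
ratio_high, diff_high = compare_risks(0.49, 0.28)

print(f"risk ratio: {ratio_low:.2f} vs {ratio_high:.2f}")
# prints "risk ratio: 1.75 vs 1.75"
print(f"extra cases per 10,000 women: {diff_low * 10_000:.0f} vs {diff_high * 10_000:.0f}")
# prints "extra cases per 10,000 women: 21 vs 2100"
```

The rounded risks give a ratio of 1.75, close to the reported 1.8, in both scenarios. Only the absolute difference separates an excess of about 21 cases per 10,000 women in the first year from an excess of 2,100.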
An advantage of the average HRs over the survival curves is the readiness with which confidence intervals can be computed in standard software.
What about period-specific HRs? Their built-in selection bias makes them difficult to interpret as a measure of time-varying effect. For example, in the Women's Health Initiative, the HR goes from greater than 1.0 to less than 1.0 after year 5; that is, the hazards of the treatment and the placebo groups cross at about year 5. However, this crossing of hazards is essentially meaningless from a practical standpoint. What really matters is that the survival is lower in the treatment group compared with the placebo group until at least year 8. Hazards may cross at some point during the follow-up because of depletion of susceptibles even if the survival curves never cross. Cumulative measures, such as a series of average HRs or survival curves, are needed to summarize the data in a meaningful way. On the other hand, period-specific HRs are useful as an intermediate step to estimate survival curves in the procedure described above.
In summary, survival curves are more informative than HRs and can be easily generated. It would not be a bad thing to see them more widely used in observational studies.