Trial sequential analysis: adding a new dimension to meta‐analysis

Authors
Akshay Shah, Andrew F Smith
Source
Journal: Anaesthesia [Wiley]
Volume (issue): 75 (1): 15-20 Citations: 96
Identifier
DOI:10.1111/anae.14705
Abstract

Systematic reviews and meta-analyses have long been considered to be at the top of the evidence-based medicine hierarchy and are frequently used to inform clinical practice and future research. The number of published systematic reviews continues to increase annually, with some estimates suggesting that 11 systematic reviews are published daily in the medical literature [1, 2]. Despite their growing numbers, a large proportion of systematic reviews and meta-analyses are unnecessary, misleading and poorly conducted and reported [3]. One key issue that has previously been discussed in Anaesthesia is the interpretation of systematic reviews and meta-analyses with sparse data, which often result in false positive (type-1 error) and false negative (type-2 error) findings [4]. Over the past decade, trial sequential analysis (TSA) has emerged as an attractive statistical method to address this issue. It combines conventional meta-analytical techniques with statistical monitoring boundaries that create thresholds for determining significance based on the impact of multiple testing and the amount of information already available in the meta-analysis. Increasing numbers of meta-analyses, including one in a recent issue of Anaesthesia by Grape et al. [5], are now incorporating TSA into their methods. In this article, we aim to provide the reader with a basic understanding of the principles underlying TSA and its interpretation.

The random variation and imprecision in the results of meta-analyses with sparse data (i.e. small numbers of trials or events) are more likely to lead to incorrect conclusions [6, 7]. Such meta-analyses are often updated regularly with data from further trials and are, therefore, subject to repeated significance testing.
This further increases the likelihood of a type-1 error, a phenomenon known as 'multiplicity due to repeated significance testing' [8], which is also evident in randomised controlled trials, where repeated testing of accumulating data increases the overall risk of a type-1 error [9]. Previous work has suggested that the actual risk of a type-1 error in meta-analyses may range from 10% to 30%, meaning that between 1 and 3 out of 10 interventions may be falsely reported as beneficial (or harmful) [7, 10].

A simple and useful way to start thinking about TSA is to draw an analogy with the methods and conduct of a randomised controlled trial. For a clinical trial, investigators derive a sample size calculation based on the following assumptions: (1) the event rate in the control group; (2) the anticipated effect size of the intervention; (3) the accepted risk of a type-1 error (typically < 5%); and (4) the desired statistical power (typically at least an 80% chance that an effect at least as large as the anticipated effect will be detected if it exists). Trial sequential analysis requires these same assumptions to derive a power calculation for a meta-analysis, which is often termed the 'required information size'. Much like a clinical trial, these assumptions should be pre-specified in the systematic review protocol.

Several methods, such as TSA, sequential meta-analysis using Whitehead's triangular test, Bayesian methods and the law of the iterated logarithm, have been proposed to mitigate the risk of misinterpreting random error in meta-analyses with sparse data, as they provide more conservative thresholds for declaring statistical significance [8, 11, 12].
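The sample-size analogy above can be made concrete with a minimal sketch. It uses the conventional two-group formula for a dichotomous outcome and is not taken from the TSA software; the function name `required_information_size`, the 30% control event rate and the 20% relative risk reduction are illustrative assumptions, and no adjustment for heterogeneity is made.

```python
from math import ceil
from statistics import NormalDist


def required_information_size(p_control, rrr, alpha=0.05, power=0.80):
    """Total participants (both groups) needed to detect a relative risk
    reduction `rrr` from a control event rate `p_control`.

    Hypothetical helper: conventional two-sided formula
    N = 4 * (z_{1-alpha/2} + z_{power})^2 * p*(1-p) / delta^2,
    where p is the mean of the two event rates and delta their difference.
    """
    p_experimental = p_control * (1 - rrr)
    p_mean = (p_control + p_experimental) / 2
    delta = p_control - p_experimental
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type-1 error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type-2 error
    n_total = 4 * (z_alpha + z_beta) ** 2 * p_mean * (1 - p_mean) / delta ** 2
    return ceil(n_total)


# Illustrative assumptions only: 30% control event rate, 20% RRR.
print(required_information_size(0.30, 0.20))
print(required_information_size(0.30, 0.20, power=0.90))
```

Raising the power or shrinking the anticipated effect both increase the result, which is why these assumptions need to be fixed in the protocol before the data are examined.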
Trial sequential analysis has previously been described as a hybrid position between frequentist and Bayesian approaches, with the sequential analysis arising from frequentist statistics, and the Bayesian component arising from a single a priori effect estimate of the intervention (although Bayesian analysis incorporates multiple prior distributions with different anticipated effect estimates, for example, sceptical, realistic and optimistic priors) [13, 14].

As discussed previously, the risk of misinterpreting random error increases when data are sparse. Using the same assumptions that define the required information size, TSA considers this risk and adjusts significance thresholds accordingly. The monitoring thresholds are built as a representation of the strength of the evidence and rest on an assumption that the amount of evidence will continue to accumulate until either a monitoring boundary (or significance threshold) is crossed, or the required information size is reached. If the accrued information size is less than the required information size, a stricter significance threshold is applied. Trial sequential analysis can display this threshold with wider confidence intervals, often reported in manuscripts as TSA-adjusted confidence intervals, which add more transparency to the uncertainty of the point estimate. Conversely, as the accrued information size approaches the required information size, thresholds become more relaxed and TSA-adjusted confidence intervals narrower (Fig. 1).

The cumulative Z value in TSA represents the summary test statistic of all included trials and a new Z value is calculated each time a new trial is added. The Z value is an estimation of the random error in the data and a greater absolute Z value (i.e. a lower p value) makes it less likely that the data are spurious or taken from a population where the null hypothesis is true.
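To make the relation between the cumulative Z value and the monitoring boundary concrete, here is a minimal sketch. It assumes a fixed-effect inverse-variance model and an O'Brien-Fleming-type alpha-spending function of the Lan-DeMets form; the text above does not name the spending function, so that choice is an assumption. The threshold `z / sqrt(t)` only approximates the shape of the boundary, because exact boundaries require numerical integration over the sequence of looks, which is omitted here. All function names and the example data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def cumulative_z(effects, std_errors):
    """Fixed-effect inverse-variance Z value after each added trial."""
    z_values = []
    sum_w = 0.0   # running sum of weights (1 / SE^2)
    sum_wy = 0.0  # running sum of weight * effect
    for effect, se in zip(effects, std_errors):
        weight = 1.0 / se ** 2
        sum_w += weight
        sum_wy += weight * effect
        # pooled estimate / its standard error = sum_wy / sqrt(sum_w)
        z_values.append(sum_wy / sqrt(sum_w))
    return z_values


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at an information fraction in (0, 1]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(info_fraction)))


def approx_threshold(info_fraction, alpha=0.05):
    """Rough |Z| threshold: stricter when little information has accrued."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(min(info_fraction, 1.0))


# Hypothetical mean differences and standard errors from three trials,
# with a hypothetical required information size of 600 participants.
z_path = cumulative_z([-0.9, -0.6, -0.7], [0.45, 0.40, 0.35])
accrued = [100, 250, 450]
for z, n in zip(z_path, accrued):
    t = n / 600
    print(f"t={t:.2f}  Z={z:.2f}  threshold=±{approx_threshold(t):.2f}")
```

At a quarter of the required information size the approximate threshold is about twice the conventional 1.96, which illustrates why an early "significant" pooled result may not cross the monitoring boundary.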
An estimation of heterogeneity (or diversity) is required to account for differences in trial populations, study designs and interventions, which is similar to adjustments for variations across centres in a multi-centre trial. Such heterogeneity reduces the precision of the results and increases the required information size. D² is the measure of diversity used in TSA and is a measure of between-trial variation, similar to the I² used in conventional meta-analysis, but may mathematically be a better alternative to I² when considering variation in any random-effects meta-analysis, particularly when data are sparse [15]. Assumptions regarding the chosen values of D², including any sensitivity analyses with varying D² values, should ideally also be pre-specified.

Trial sequential analysis can also be used to construct futility boundaries. These were originally developed for interim analyses in randomised trials, to allow early termination of trials if unexpectedly large differences arose between treatment groups, saving time and resources and minimising participants' exposure to the inferior treatment. Such analyses should be considered at the design stage, and incorporated into the study protocol and/or statistical analysis plan before any trial is started. Similarly, if a meta-analysis of an intervention has concluded that there is no evidence of an effect, we need to know whether this was due to a lack of statistical power, or because the intervention is truly unlikely to have any effect. Analogous to interim analyses for clinical trials, TSA requires a pre-specified minimum desired effect size to construct futility boundaries, which provide a threshold for detecting the lack of an anticipated effect that is large enough to be clinically meaningful. In other words, they indicate when the anticipated effect could be considered as being unobtainable.
Above this threshold, there is still a possibility that a statistically significant effect will be found, but below this threshold, it is extremely unlikely that an effect as large as the anticipated effect, given the constraints of power and statistical thresholds, will be found. In such instances conducting future trials is futile.

For readers wishing to learn more about the underlying principles and conduct of TSA, the software and manual can be freely downloaded from www.ctu.dk/tsa. Trial sequential analysis can be applied to analyses on dichotomous data and on the mean difference of continuous data but not on standardised mean differences. Figure 1 demonstrates the components of two-sided TSA.

Unfortunately, meta-analyses of anaesthetic interventions commonly have few data to draw on. Imberger et al., in a review of 50 randomly selected meta-analyses of anaesthetic interventions, observed that the median number of included trials was 8, the median (IQR [range]) number of participants was 964 (523–1736 [99–11 172]) and the median number of participants with the outcome of interest was 202 (96–443 [26–5762]). After applying TSA, only 6 out of 50 (12%) meta-analyses had sufficient power of greater than 80% and only 16 out of 50 (32%) preserved their risk of a type-1 error of less than 5% [16].

An example of the utility of TSA was highlighted in a recent meta-analysis, published in Anaesthesia, which aimed to evaluate the effect of paravertebral block on the prevalence of persistent postsurgical pain after breast surgery [17]. Two previous meta-analyses had demonstrated that paravertebral block reduced the odds of persistent postoperative pain after 6 months, but only included two [18] and four [19] studies. Heesen et al. updated these and included seven studies [17], but observed no statistically significant risk reduction for chronic postoperative pain using conventional meta-analysis.
Using TSA, they demonstrated that the available evidence was not sufficient to reach a conclusion. In order to detect a pre-specified relative risk reduction of 20% in chronic postoperative pain at 3 months, only 317/1734 (18%) of the required information size was reached. For clinical trialists, this approach is potentially useful as it can provide information on the required number of participants in future trials to 'plug the gap'. Similar examples have also been reported in other specialties [20].

Trial sequential analysis is a complex statistical tool that can be misused and has been criticised. Despite its possible benefits, its application in meta-analyses is not universal. Even within the Cochrane Collaboration, with its standardised methods, reviews on topics of clinical relevance to anaesthetists have not routinely applied TSA in the past, with some choosing to do so [21, 22] and others not [23, 24]. In fact, the most recent Cochrane Scientific Committee Expert Panel recommended against the routine use of sequential methods for updated meta-analyses [25]. The Panel argued that although systematic reviews are able to address the effect of an intervention on different outcomes and on different sub-groups, sequential methods such as TSA cannot accommodate multiple different thresholds for different outcomes. They are often based on a particular outcome that may not be of interest to all stakeholders. Others have argued along similar lines, commenting that TSA is likely to be performed on the primary outcome only, meaning that the risk of spurious findings will still persist for reported secondary outcomes [26]. The Panel also argued that although similarities have been drawn between TSA and the conduct of a clinical trial, especially with regard to futility boundaries which assist data monitoring committees, meta-analyses are retrospective and observational by nature.
The meta-analyst is, therefore, unable to control for the trials that have already been performed which are eligible for the meta-analysis. It is impossible to create a retrospective sequential programme that would maintain the pre-specified assumptions of a TSA.

Knowledge and transparency of the assumptions used when performing TSA are critical. However, the complexity of statistical methods creates a veneer of certainty for naïve analysts and their readers which can lead them to gloss over the many assumptions and judgements which must be made in the process of performing any meta-analysis. Variations in these assumptions can significantly affect the required information size. There may be disagreements about acceptable type-1 and type-2 error rates, what constitutes an anticipated effect size and whether it is of clinical relevance. Sceptical or conservative a priori effect estimates do not take into consideration the effect already obtained from accrued data and can lead to unrealistically large required information sizes. While controlling for type-1 errors, TSA may unintentionally increase the rate of type-2 errors, that is, falsely concluding that there is no effect when one exists.

In the meta-analysis by Grape et al., the authors performed TSA on their primary outcome of mean postoperative pain score at 2 h. They state that "trial sequential analysis indicated that firm evidence was reached and that dexmedetomidine was superior to remifentanil". The anticipated effect size appears to be a mean difference of −0.7 and variance of 1.17, which is the point estimate in their forest plot. The authors do not state whether or not this approach was defined a priori. The required information size to detect this difference was 657 participants, and the accrued information size in this meta-analysis (672 participants) has already surpassed this and crossed the sequential monitoring boundary for benefit.
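A rough sketch of how the required information size for a mean difference responds to the assumptions fed into it. It uses the conventional two-group formula, N = 4(z₁₋α/₂ + z₁₋β)²σ²/δ², inflated by 1/(1 − D²) for diversity. This is an assumed reconstruction, not the authors' or the TSA software's exact calculation; the function name is hypothetical, and the diversity of 0.89 is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist


def ris_mean_difference(delta, variance, diversity=0.0, alpha=0.05, power=0.80):
    """Diversity-adjusted required information size (total participants)
    for detecting a mean difference `delta` given an outcome `variance`.

    Hypothetical helper; `diversity` is D^2 expressed as a fraction in [0, 1).
    """
    if not 0.0 <= diversity < 1.0:
        raise ValueError("diversity must be in [0, 1)")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type-1 error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type-2 error
    unadjusted = 4 * (z_alpha + z_beta) ** 2 * variance / delta ** 2
    return ceil(unadjusted / (1 - diversity))      # inflate for heterogeneity


# Vary one assumption at a time (variance 1.17, assumed diversity 0.89).
for delta, power in [(-0.7, 0.80), (-0.5, 0.80), (-0.7, 0.90)]:
    n = ris_mean_difference(delta, variance=1.17, diversity=0.89, power=power)
    print(f"delta={delta}, power={power:.0%}: about {n} participants")
```

The outputs should be of the same order as the required information size quoted above, not identical, because the exact diversity and rounding used by the authors are not stated. A smaller anticipated effect or a higher power both push the required information size up.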
To illustrate the point of how information sizes can vary with different assumptions, Fig. 2 shows a TSA where the required information size was calculated with the same type-1 error rate (5%) and power (80%), an anticipated mean pain score reduction of −0.5 and a heterogeneity estimated with diversity of 89%. The graph now shows that the required information size using these assumptions would be 1289, which is almost double the number accrued so far. The mean (95%CI) reduction in pain score with TSA was 0.7 (1.44 to −0.03).

Given the possibly greater impact of meta-analyses on policy and practice, one could make a case that the thresholds for meta-analyses should be higher than those for clinical trials. If we repeat the primary TSA by Grape et al. but with 90% power instead, Fig. 3 shows a TSA where the required information size is now 885 participants. As the trial sequential monitoring boundary for statistical significance has been crossed, we can interpret the evidence as being conclusive. Perhaps the most important question is whether a mean reduction of 0.7 in pain score at two postoperative hours is an important outcome of relevance to patients.

In summary, TSA is becoming increasingly popular and provides more information around uncertainty and imprecision in meta-analyses with sparse data. We provide a brief checklist (Table 1) on what to consider when interpreting the results of a TSA, while also bearing in mind the concerns raised by the Cochrane Collaboration. We strongly encourage review authors to work with experienced methodologists when performing sequential methods, and editors to ensure that the assumptions underlying trial sequential analyses are transparent and clearly conveyed when reviewing and publishing manuscripts. It is a complex statistical tool, and, to quote the software engineer Grady Booch, "a fool with a tool is still a fool".
AS is a Trainee Fellow of Anaesthesia and is being supported by an NIHR Doctoral Research Fellowship (DRF-2017-10-094). AFS is an editor of Anaesthesia and Co-ordinating Editor of the Cochrane Anaesthesia Review Group.
