Summary
Clinical Practice Guideline for Diagnostic Testing for Adult Obstructive Sleep Apnea: An American Academy of Sleep Medicine Clinical Practice Guideline

Vishesh K. Kapur, MD, MPH (University of Washington, Seattle, WA); Dennis H. Auckley, MD (MetroHealth Medical Center and Case Western Reserve University, Cleveland, OH); Susmita Chowdhuri, MD (John D. Dingell VA Medical Center and Wayne State University, Detroit, MI); David C. Kuhlmann, MD (Bothwell Regional Health Center, Sedalia, MO); Reena Mehra, MD, MS (Cleveland Clinic, Cleveland, OH); Kannan Ramar, MBBS, MD (Mayo Clinic, Rochester, MN); Christopher G. Harrod, MS (American Academy of Sleep Medicine, Darien, IL)

Address correspondence to: Vishesh K. Kapur, MD, MPH, University of Washington, Harborview Medical Center, 325 Ninth Avenue, Box 359762, Seattle, WA 98104; (206) 744-5703; (206) 744-5657; E-mail Address: [email protected]

Published Online: March 15, 2017
https://doi.org/10.5664/jcsm.6506

ABSTRACT

Introduction: This guideline establishes clinical practice recommendations for the diagnosis of obstructive sleep apnea (OSA) in adults and is intended for use in conjunction with other American Academy of Sleep Medicine (AASM) guidelines on the evaluation and treatment of sleep-disordered breathing in adults.

Methods: The AASM commissioned a task force of experts in sleep medicine. A systematic review was conducted to identify studies, and the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) process was used to assess the evidence.
The task force developed recommendations and assigned strengths based on the quality of evidence, the balance of benefits and harms, patient values and preferences, and resource use. In addition, the task force adopted foundational recommendations from prior guidelines as “good practice statements” that establish the basis for appropriate and effective diagnosis of OSA. The AASM Board of Directors approved the final recommendations.

Recommendations: The following recommendations are intended as a guide for clinicians diagnosing OSA in adults. Under GRADE, a STRONG recommendation is one that clinicians should follow under most circumstances. A WEAK recommendation reflects a lower degree of certainty regarding the outcome and appropriateness of the patient-care strategy for all patients. The ultimate judgment regarding propriety of any specific care must be made by the clinician in light of the individual circumstances presented by the patient, available diagnostic tools, accessible treatment options, and resources.

Good Practice Statements: Diagnostic testing for OSA should be performed in conjunction with a comprehensive sleep evaluation and adequate follow-up. Polysomnography is the standard diagnostic test for the diagnosis of OSA in adult patients in whom there is a concern for OSA based on a comprehensive sleep evaluation.

Recommendations:

We recommend that clinical tools, questionnaires and prediction algorithms not be used to diagnose OSA in adults, in the absence of polysomnography or home sleep apnea testing. (STRONG)

We recommend that polysomnography, or home sleep apnea testing with a technically adequate device, be used for the diagnosis of OSA in uncomplicated adult patients presenting with signs and symptoms that indicate an increased risk of moderate to severe OSA. (STRONG)

We recommend that if a single home sleep apnea test is negative, inconclusive, or technically inadequate, polysomnography be performed for the diagnosis of OSA.
(STRONG)

We recommend that polysomnography, rather than home sleep apnea testing, be used for the diagnosis of OSA in patients with significant cardiorespiratory disease, potential respiratory muscle weakness due to neuromuscular condition, awake hypoventilation or suspicion of sleep related hypoventilation, chronic opioid medication use, history of stroke or severe insomnia. (STRONG)

We suggest that, if clinically appropriate, a split-night diagnostic protocol, rather than a full-night diagnostic protocol for polysomnography be used for the diagnosis of OSA. (WEAK)

We suggest that when the initial polysomnogram is negative and clinical suspicion for OSA remains, a second polysomnogram be considered for the diagnosis of OSA. (WEAK)

Citation: Kapur VK, Auckley DH, Chowdhuri S, Kuhlmann DC, Mehra R, Ramar K, Harrod CG. Clinical practice guideline for diagnostic testing for adult obstructive sleep apnea: an American Academy of Sleep Medicine clinical practice guideline. J Clin Sleep Med. 2017;13(3):479–504.

INTRODUCTION

The diagnosis of obstructive sleep apnea (OSA) was previously addressed in two American Academy of Sleep Medicine (AASM) guidelines, the “Practice Parameters for the Indications for Polysomnography and Related Procedures: An Update for 2005” and “Clinical Guidelines for the Use of Unattended Portable Monitors in the Diagnosis of Obstructive Sleep Apnea in Adult Patients (2007).”1,2 The AASM commissioned a task force (TF) of content experts to develop an updated clinical practice guideline (CPG) on this topic.
The objectives of this CPG are to combine and update information from prior guideline documents regarding the diagnosis of OSA, including the optimal circumstances under which attended in-laboratory polysomnography (hereafter referred to as “polysomnography” or “PSG”) or home sleep apnea testing (HSAT) should be performed.

BACKGROUND

The term sleep-disordered breathing (SDB) encompasses a range of disorders, with most falling into the categories of OSA, central sleep apnea (CSA) or sleep-related hypoventilation. This paper focuses on issues related to the diagnosis of OSA, a breathing disorder characterized by narrowing of the upper airway that impairs normal ventilation during sleep. Recent reviews on the evaluation and management of CSA and sleep-related hypoventilation have been published separately by the AASM.3–5

The prevalence of OSA varies significantly based on the population being studied and how OSA is defined (e.g., testing methodology, scoring criteria used, and apnea-hypopnea index [AHI] threshold). The prevalence of OSA has been estimated to be 14% of men and 5% of women in a population-based study that used an AHI cutoff of ≥ 5 events/h (hypopneas associated with 4% oxygen desaturations) combined with clinical symptoms to define OSA.6 OSA may impact a larger proportion of the population than indicated by these numbers, as the definition of AHI used in this study was restrictive and did not consider hypopneas that disrupt sleep without oxygen desaturation.
In addition, the estimate excludes individuals with an elevated AHI who do not have sleepiness but who may nevertheless be at risk for adverse consequences such as cardiovascular disease.7–10 In some populations, the prevalence of OSA is substantially higher than this estimate, for example, in patients being evaluated for bariatric surgery (estimated range of 70% to 80%)11 or in patients who have had a transient ischemic attack or stroke (estimated range of 60% to 70%).12 Other disease-specific populations found to have increased rates of OSA include, but are not limited to, patients with coronary artery disease, congestive heart failure, arrhythmias, refractory hypertension, type 2 diabetes, and polycystic ovarian disease.13,14

The consequences of untreated OSA are wide ranging and are postulated to result from the fragmented sleep, intermittent hypoxia and hypercapnia, intrathoracic pressure swings, and increased sympathetic nervous activity that accompany disordered breathing during sleep. Individuals with OSA often feel unrested, fatigued, and sleepy during the daytime. They may suffer from impairments in vigilance, concentration, cognitive function, social interactions and quality of life (QOL). These declines in daytime function can translate into higher rates of job-related and motor vehicle accidents.15 Patients with untreated OSA may be at increased risk of developing cardiovascular disease, including difficult-to-control blood pressure, coronary artery disease, congestive heart failure, arrhythmias and stroke.16 OSA is also associated with metabolic dysregulation, affecting glucose control and risk for diabetes.17 Undiagnosed and untreated OSA is a significant burden on the healthcare system, with increased healthcare utilization seen in those with untreated OSA,18 highlighting the importance of early and accurate diagnosis of this common disorder.

Recognizing and treating OSA is important for a number of reasons.
The treatment of OSA has been shown to improve QOL, lower the rates of motor vehicle accidents, and reduce the risk of the chronic health consequences of untreated OSA mentioned above.19 There are also data supporting a decrease in healthcare utilization and cost following the diagnosis and treatment of OSA.20 However, there are challenges and uncertainties in making the diagnosis and a number of questions remain unanswered.

Individuals with OSA can also have other sleep disorders that may be related to or unrelated to OSA. Co-morbid insomnia has been found to be a frequent problem in patients with OSA.21 It is also possible that undiagnosed OSA may be masquerading as another sleep disorder, such as REM Behavior Disorder.22 Therefore, when OSA is suspected, a comprehensive sleep evaluation is important to ensure appropriate diagnostic testing is performed to address OSA, as well as other comorbid sleep complaints.

The diagnosis of OSA involves measuring breathing during sleep. The evolution of measurement techniques and definitions of abnormalities justifies updating the guidelines regarding diagnostic testing, but also complicates the evaluation and summary of evidence gathered from older research studies that have included diagnostic tests with diverse sensor types and scored respiratory events using different definitions. The third edition of the International Classification of Sleep Disorders (ICSD-3) defines OSA as a PSG-determined obstructive respiratory disturbance index (RDI) ≥ 5 events/h associated with the typical symptoms of OSA (e.g., unrefreshing sleep, daytime sleepiness, fatigue or insomnia, awakening with a gasping or choking sensation, loud snoring, or witnessed apneas), or an obstructive RDI ≥ 15 events/h (even in the absence of symptoms).23 In addition to apneas and hypopneas that are included in the AHI, the RDI includes respiratory effort-related arousals (RERAs).
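The ICSD-3 thresholds quoted above can be read as a simple decision rule. The following is a minimal illustrative sketch, not a clinical tool: the function names are hypothetical, and it assumes the RDI is computed per hour of sleep as the sum of apneas, hypopneas, and RERAs.

```python
def respiratory_disturbance_index(apneas: int, hypopneas: int, reras: int,
                                  total_sleep_time_hours: float) -> float:
    """Events per hour of sleep; RDI = AHI events plus RERAs (assumed denominator: sleep time)."""
    if total_sleep_time_hours <= 0:
        raise ValueError("total_sleep_time_hours must be positive")
    return (apneas + hypopneas + reras) / total_sleep_time_hours


def meets_icsd3_osa_thresholds(obstructive_rdi: float, has_typical_symptoms: bool) -> bool:
    """Apply the two thresholds described in the text.

    - obstructive RDI >= 15 events/h, regardless of symptoms, or
    - obstructive RDI >= 5 events/h together with typical OSA symptoms.
    """
    if obstructive_rdi >= 15:
        return True
    return obstructive_rdi >= 5 and has_typical_symptoms


# Hypothetical study: 20 apneas, 30 hypopneas, 10 RERAs over 6 hours of sleep.
rdi = respiratory_disturbance_index(20, 30, 10, 6.0)
print(rdi, meets_icsd3_osa_thresholds(rdi, has_typical_symptoms=True))
```

The sketch deliberately ignores how individual events are scored, which is where much of the variability discussed in this section arises.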
The scoring of respiratory events is defined in The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications, Version 2.3 (AASM Scoring Manual).24 However, it should be noted that there is variability in the definition of a hypopnea event. The AASM Scoring Manual recommended definition requires that changes in flow be associated with a 3% oxygen desaturation or a cortical arousal, but allows an alternative definition that requires association with a 4% oxygen desaturation without consideration of cortical arousals. Depending on which definition is used, the AHI may be considerably different in a given individual.25–27 The discrepancy between these and other hypopnea definitions used in research studies introduces complexity in the evaluation of evidence regarding the diagnosis of OSA.

Due to the high prevalence of OSA, there is significant cost associated with evaluating all patients suspected of having OSA with PSG (currently considered the gold standard diagnostic test). Further, there also may be limited access to in-laboratory testing in some areas. HSAT, which has limitations, is an alternative method to diagnose OSA in adults, and may be less costly and more efficient in some populations. This guideline addresses some of these issues using an evidence-based approach.

There are potential disadvantages to using HSAT, relative to PSG, because of the differences in the physiologic parameters being collected and the availability of personnel to adjust sensors when needed. The sensor technology used by HSAT devices varies considerably by the number and type of sensors that are utilized. Traditionally, sleep studies have been categorized as Type I, Type II, Type III or Type IV. Unattended studies fall into categories Type II through Type IV. Type II studies use the same monitoring sensors as full PSGs (Type I) but are unattended, and thus can be performed outside of the sleep laboratory.
Type III studies use devices that measure limited cardiopulmonary parameters: two respiratory variables (e.g., effort to breathe, airflow), oxygen saturation, and a cardiac variable (e.g., heart rate or electrocardiogram). Type IV studies utilize devices that measure only 1 or 2 parameters, typically oxygen saturation and heart rate, or in some cases, just air flow. This classification of sleep study devices fails to consider new technologies, such as peripheral arterial tonometry (PAT), and thus an alternative classification scheme has been proposed: the SCOPER classification, which incorporates Sleep, Cardiovascular, Oximetry, Position, Effort and Respiratory parameters.28 The SCOPER system allows for the inclusion of technologies such as PAT. However, due to the complexity of the SCOPER classification, and lack of familiarity with it amongst practicing clinicians, the TF elected to refer to HSAT devices by the traditional Type II through Type IV classification system, and to identify specific devices with technology outside of this schema when appropriate. Regardless, as can be recognized by both classifications, HSAT devices, in comparison to attended studies, carry a higher risk of technical failures due to a lack of real-time monitoring, and have inherent limitations resulting from the inability of most devices to define sleep versus wake. Another potential disadvantage is that positive airway pressure (PAP) cannot be initiated during a HSAT, but can be initiated during a PSG if needed.

Measurement error is inevitable in HSAT, compared against PSG, as standard sleep staging channels are not typically monitored in HSAT (e.g., EEG, EOG and EMG monitoring are not typically performed), which results in use of the recording time rather than sleep time to define the denominator of the respiratory event index (REI; the term used to represent the frequency of apneas and hypopneas derived from HSAT).
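The effect of the denominator can be shown with a short calculation. This is a sketch with hypothetical numbers (90 events, 6 hours asleep, 8 hours recorded), chosen only to illustrate why dividing by recording time tends to give a lower index than dividing by sleep time.

```python
def events_per_hour(event_count: int, hours: float) -> float:
    """Generic index: number of scored events divided by a time denominator in hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return event_count / hours


# Hypothetical night: the same 90 apneas and hypopneas are detected by both tests.
events = 90
total_sleep_time_h = 6.0   # available from PSG sleep staging (EEG, EOG, EMG)
recording_time_h = 8.0     # all an HSAT without sleep staging can use

ahi = events_per_hour(events, total_sleep_time_h)   # PSG-style AHI
rei = events_per_hour(events, recording_time_h)     # HSAT-style REI

print(f"AHI = {ahi:.2f} events/h, REI = {rei:.2f} events/h")
```

Because recording time is at least as long as sleep time, the REI in this sketch can never exceed the AHI for the same event count; any time spent awake during the recording lowers it further.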
HSAT devices that use conventional sensors are unable to detect hypopneas only associated with cortical arousals, which are included in the recommended AHI scoring rule in the AASM Scoring Manual.24 Sensor dislodgement and poor quality signal during HSAT are additional contributors to the measurement error of the REI. All these factors can result in the underestimation of the “true” AHI, and may result in the need for repeated studies due to inadequate data for diagnosis.

As a diagnostic guideline, our systematic review and recommendations incorporate evidence regarding the accuracy of HSAT for diagnosing OSA. However, diagnosis occurs in the context of management of a patient within the healthcare system, and therefore, outcomes other than diagnostic accuracy are relevant in the evaluation of management strategies. These include the impact on clinical outcomes (e.g., sleepiness, QOL, morbidity, mortality, adherence to therapy) and efficiency of care (e.g., time to test, time to treatment, costs). Therefore, these outcomes are also considered in the formulation of the current guideline.

Prior AASM guidelines1,2 on the diagnosis of OSA included statements that the TF determined were no longer pertinent. Thus, these statements were not addressed in the current update. Moreover, prior guidelines included consensus statements that had not been specifically evaluated in clinical studies. Despite this limitation, two of these statements were adopted in the current guideline as foundational statements that underpin the provision of high quality care for the diagnosis of OSA (see good practice statements). The scope of this guideline did not include a comprehensive update of technical specification for diagnostic testing for OSA. Nevertheless, the TF considered whether currently recommended technology was used in the research studies that were evaluated.
In particular, the TF determined that the use of currently AASM recommended flow (nasal pressure transducer and thermistor) and effort sensors (respiratory inductance plethysmography) during PSG and HSAT increased the value of evidence derived from validation studies.24 As part of the data extraction process, validation studies were classified based on whether the currently recommended respiratory sensors were used for PSG or HSAT.

METHODS

Expert Task Force

The AASM commissioned a TF of board-certified sleep medicine physicians, with expertise in the diagnosis and management of adults with OSA, to develop this guideline. The TF was required to disclose all potential conflicts of interest (COI) according to the AASM's COI policy, both prior to being appointed to the TF, and throughout the research and writing of this paper. In accordance with the AASM's conflicts of interest policy, TF members with a Level 1 conflict were not allowed to participate. TF members with a Level 2 conflict were required to recuse themselves from any related discussion or writing responsibilities. All relevant conflicts of interest are listed in the Disclosures section.

PICO Questions

A PICO (Patient, Population or Problem, Intervention, Comparison, and Outcomes) question template was used to develop clinical questions to be addressed in this guideline. PICO questions were developed based on a review of the existing AASM practice parameters on indications for use of PSG and HSAT for the diagnosis of patients with OSA, and a review of systematic reviews, meta-analyses, and guidelines published since 2004. The AASM Board of Directors (BOD) approved the final list of PICO questions presented in Table 1 before the literature search was performed. The PICO questions identify the commonly used approaches and devices for the diagnosis of OSA.
Based on their expertise, the TF developed a list of patient-oriented clinically relevant outcomes that are indicative of whether a treatment should be recommended for clinical practice. A summary of the critical outcomes for each PICO is presented in Table 2. Lastly, clinical significance thresholds, used to determine if a change in an outcome was clinically significant, were defined for each outcome by TF clinical judgment, prior to statistical analysis. The clinical significance thresholds are presented by outcome in Table 3. It should be noted that there was insufficient evidence to directly address PICO question 1, as no studies were identified that compared the efficacy of clinical prediction algorithms to history and physical exam. However, the TF decided to compare the efficacy of clinical prediction algorithms to PSG and HSAT.

Table 1 PICO questions.

Table 2 “Critical” outcomes by PICO.

Table 3 Summary of clinical significance thresholds for clinical outcome measures.

Literature Searches, Evidence Review and Data Extraction

The TF performed a systematic review of the scientific literature to identify articles that addressed at least one of the nine PICO questions. Multiple literature searches were performed by AASM staff using the PubMed and Embase databases, throughout the guideline development process (see Figure 1). The search yielded articles with various study designs; however, the analysis was limited to randomized controlled trials (RCTs) and observational studies.
The articles that were cited in the 2007 AASM clinical practice guideline,2 2005 practice parameter,1 2003 review,29 and 1997 review30 were included for data analysis if they met the study inclusion criteria described below.

Figure 1: Evidence base flow diagram.

The literature searches in PubMed were conducted using a combination of MeSH terms and keywords as presented in the supplemental material. The PubMed database was searched from January 1, 2005 through July 26, 2012 for any relevant literature published since the last guideline. The PubMed search was expanded on September 26, 2012 to identify relevant articles published prior to January 1, 2005. Literature searches were also performed in Embase using a combination of terms and keywords as presented in the supplemental material. The Embase database was searched from January 1, 2005 through September 13, 2012. These searches yielded a total of 3,937 articles. There were 205 duplicates identified, resulting in a total of 3,732 articles from both databases.

A second round of literature searches was performed in PubMed and Embase to capture more recent literature. The PubMed database was searched from July 27, 2012 to December 23, 2013, and the Embase database was searched from September 13, 2012 to December 23, 2013. These searches yielded a total of 2,061 articles. There were 670 duplicates identified, resulting in 1,391 additional papers from both databases.

A final literature search was performed in PubMed to capture the latest literature. The PubMed database was searched from December 24, 2013 to June 29, 2016 and identified 2,129 articles.

Based on their expertise and familiarity with the literature, TF members submitted additional relevant literature and screened reference lists to identify articles of potential interest.
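The search counts can be reconciled with simple arithmetic. The sketch below uses only figures stated in this section, including the 140 task-force-submitted publications and the total of 7,392 abstracts reported in the passage that follows; the variable names are illustrative, and the assumption that the total is the plain sum of these four sources is ours rather than something the text states.

```python
# Round 1: PubMed + Embase searches, minus duplicates.
first_round_unique = 3937 - 205

# Round 2: updated PubMed + Embase searches, minus duplicates.
second_round_unique = 2061 - 670

# Final PubMed-only search (no duplicate count is reported for it).
final_round = 2129

# Additional publications submitted by task force members.
tf_submitted = 140

total_abstracts = first_round_unique + second_round_unique + final_round + tf_submitted

print(first_round_unique, second_round_unique, total_abstracts)
```

Under that assumption, the per-round figures and the reported total are mutually consistent.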
This served as a “spot check” for the literature searches to ensure that important articles were not overlooked and identified an additional 140 publications.

A total of 7,392 abstracts were assessed by two reviewers to determine whether they met inclusion criteria presented in the supplemental material. Articles were excluded per the criteria listed in the supplemental material and Figure 1. A total of 98 articles were included in the evidence base for recommendations. A total of 86 studies were included in meta-analysis and/or grading.

Meta-Analysis

Meta-analysis was performed on both diagnostic and clinical outcomes of interest for each PICO question, when possible. Outcomes data for diagnostic approaches were categorized as follows: clinical tools, questionnaires, and prediction algorithms; history and physical exam; HSAT; attended PSG; split-night attended PSG; two-night attended PSG; single-night HSAT; multiple-night HSAT; follow-up attended PSG; and follow-up HSAT. The types of HSAT devices identified in the literature search included type 2; type 3; 2–3 channel; single channel; oximetry; and PAT. A definition of these devices has been previously described.31 Adult patients were categorized as follows: suspected OSA; suspected OSA with comorbid conditions; diagnosed OSA; and scheduled for upper airway surgery.

For diagnostic outcomes, the pretest probability for OSA (i.e., the prevalence within the study population), sensitivity and specificity of the tested diagnostic approach, and number of patients for each study were used to derive two-by-two tables (i.e., the number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) diagnoses per 1,000 patients) in both high risk and low risk patients, for each OSA severity threshold (i.e., AHI ≥ 5, AHI ≥ 15, AHI ≥ 30).
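The derivation of a two-by-two table from prevalence, sensitivity, and specificity can be sketched as follows. This is a minimal illustration of the standard arithmetic, not the task force's software; the function name and the example inputs (prevalence 0.6, sensitivity 0.9, specificity 0.8) are hypothetical and are not taken from the guideline's tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Expected counts of each diagnostic outcome in a cohort."""
    true_positive: float
    false_negative: float
    false_positive: float
    true_negative: float


def two_by_two(prevalence: float, sensitivity: float, specificity: float,
               cohort_size: int = 1000) -> TwoByTwo:
    """Expected TP, FN, FP, TN for a cohort, given pretest probability and test accuracy."""
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    with_osa = prevalence * cohort_size        # patients who truly have OSA
    without_osa = cohort_size - with_osa       # patients who truly do not
    tp = sensitivity * with_osa                # correctly identified as OSA
    tn = specificity * without_osa             # correctly identified as no OSA
    return TwoByTwo(true_positive=tp,
                    false_negative=with_osa - tp,
                    false_positive=without_osa - tn,
                    true_negative=tn)


# Hypothetical high-risk cohort at a single AHI threshold.
table = two_by_two(prevalence=0.6, sensitivity=0.9, specificity=0.8)
print(table)
```

Repeating the call with a lower prevalence shows why the same test yields proportionally more false positives in a low-risk population, which is the reason the analysis is run separately for high-risk and low-risk patients.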
For analyses that included five or more studies, pooled estimates of sensitivity, specificity, and accuracy were calculated using hierarchical random effects modeling performed in STATA software (accuracy was derived by HSROC curves). When analyses included fewer than five studies, ranges of sensitivity, specificity and accuracy were used. Based on their clinical expertise and a review of available literature, the TF established estimates of OSA prevalence among “low risk” and “high risk” patients for each OSA severity threshold. The TF envisioned a sleep clinic cohort of middle-aged obese men with typical symptoms of OSA as an example of a high-risk patient population. In contrast, a sleep clinic cohort of younger non-obese women with possible OSA symptoms was used as prototype for a low risk patient population. Prevalence estimates for these populations are presented in Table 4.

Table 4 Summary of prevalence estimates for high risk and low risk adult sleep clinic patients with OSA by diagnostic cutoff.

The sensitivity and specificity of included studies were entered into Review Manager 5.3 software to generate forest plots for each analysis. The estimates of sensitivity and specificity (pooled or ranges), and OSA prevalence were entered into the GRADE (Grading of Recommendations Assessment, Development and Evaluation) Guideline Development Tool (GDT) to generate the two-by-two tables. The TF determined the downstream consequences of an accurate diagnosis versus an inaccurate diagnosis (see supplemental material, Table S1), and used the estimates to weigh the benefits of an accurate diagnosis versus the harms of an inaccurate diagnosis.
This information was used, in part, to assess whether a given diagnostic approach could be recommended when compared against PSG.

For clinical outcomes of interest, data on change scores were entered into the Review Manager 5.3 software to derive the mean difference and standard deviation between the experimental diagnostic approach and the gold standard or comparator. For studies that did not report change scores, data from posttreatment values taken from the last treatment time-point were used for meta-analysis. All meta-analyses of clinical outcomes were performed using the random effects model with results displayed as a forest plot. There was insufficient evidence to perform meta-analyses for PICOs 3 and 9, thus no recommendations are provided.

Interpretation of clinical significance for the clinical outcomes of interest was conducted by comparing the absolute effects to the clinical significance threshold previously determined by the TF for each clinical outcome of interest (see Table 3).

Strength of Recommendations

The assessment of evidence quality was performed according to the GRADE process.32 The TF assessed the following four components to determine the direction and strength of a recommendation: quality of evidence, balance of beneficial and harmful effects, patient values and preferences and resource use as described below.

Quality of evidence: based on an assessment of the overall risk of bias (randomization, blinding, allocation concealment, selective reporting, and author disclosures), imprecision (clinical significance thresholds), inconsistency (I² cutoff of 75%), indirectness (study population), and risk of publication bias (funding sources), the TF determined their overall confidence that the estimated effect found in the body of evidence was representative of the true treatment effect that patients would see.
For diagnostic accuracy studies, the QUADAS-2 tool was used in addition to the quality domains for the assessment of risk of bias in intervention studies. The quality of evidence was based on the outcomes that the TF deemed critical for decision-making.

Benefits versus harms: based on the meta-analysis (if applicable), analysis of any harms or side effects reported within the accepted literature, and the clinical expertise of the TF, the TF determined if the beneficial outcomes of the intervention outweighed any harmful side effects.

Patient values and preferences: based on the clinical expertise of the TF members and any data published on the topic relevant to patient preferences, the TF determined if patients would use the intervention based on the body of evidence, and if patient values and preferences would be generally consistent.

Resource use: based on the clinical expertise of the TF members and a “spot check” for relevant literature, the TF determined resource use to be important for determining whether to recommend the use of HSAT versus PSG, split-night versus full-night PSG and single-night versus multiple-night HSAT diagnostic protocols, and repeat testing. Resource use was not considered in-depth for clinical tools, questionnaires and prediction algorithms, diagnosis in adults with comorbid conditions, and repeat PSG.

Taking these major factors into consideration, each recommendation statement was assigned a strength (“STRONG” or “WEAK”). Additional information is provided in the form of “Remarks” immediately following the recommendation statements, when deemed necessary by the TF. Remarks are based on the evidence evaluated during the systematic review, and are intended to provide context for the recommendations and to guide clinicians in implementing the recommendations in daily practice.

Discussions accompany each recommendation to summarize the relevant evidence and explain the rationale leading to each recommendation.
These sections are an integral part of the GRADE system and offer transparency to the process.

Approval and Interpretation of Recommendations

A draft of the guideline was available for public comment for a two-week period on the AASM website. The TF took into consideration all the comments received and made revisions when appropriate. The revised guideline was submitted to the AASM BOD who approved these recommendations.

The recommendations in this guideline define principles of practice that should meet the needs of most patients in most situations. This guideline should not, however, be considered inclusive of all proper methods of care or exclusive of other methods of care reasonab