- Received November 28, 2010
- Revision received January 12, 2011
- Accepted January 19, 2011
- Published online May 1, 2011.
- Samuel Unzek, MD,
- Zoran B. Popovic, MD,
- Thomas H. Marwick, MBBS, PhD⁎
- Diastolic Guidelines Concordance Investigators
- ⁎Reprint requests and correspondence:
Dr. Thomas H. Marwick, Cardiovascular Medicine J1-5, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, Ohio 44195
Objectives We sought to determine the impact of recent recommendations on observer concordance in the interpretation of diastolic stage and the assessment of filling pressure.
Background Worsening stages of diastolic dysfunction are associated with worsening outcome. However, the echocardiographic classification of diastolic function is complex, and parameters may be discordant. The interobserver agreement of diastolic assessment is undefined.
Methods A complete diastolic evaluation (transmitral flow, left atrial volume, tissue Doppler, pulmonary venous flow, mitral flow propagation, and left ventricular images) was obtained in 20 patients and interpreted by 14 experts in 8 countries (280 case reads). Each investigator was asked to interpret diastolic class and left ventricular filling pressure. Brain natriuretic peptide level was measured on the same day as the echocardiogram to corroborate the filling pressures estimated by echocardiography. Concordance was assessed as kappa, and accuracy was compared with specific application of the recommendations by 2 investigators.
Results The sensitivity and specificity of readers for raised filling pressure, as defined by the reference read, were 66 ± 37% and 88 ± 26%, respectively. Complete agreement among all readers was obtained in 10 of 20 cases. Diagnosis of normal and categories of abnormal filling was correct in 71% to 95%, with the lowest values obtained for normal and pseudonormal filling. There was no difference between U.S. and international readers. Not all patients in each diastolic stage showed all of the changes that are typical of that stage, and variations appeared to be attributable to differences in weighting of conflicting observations. Overall, kappa values for filling pressure and diastolic class were 0.71 (range 0.60 to 0.80) and 0.68 (range 0.54 to 0.86), respectively.
Conclusions Correct results for estimation of filling pressure were obtained by a high proportion of readers. Classification of diastolic stages continues to be variable and might be addressed by provision of a uniform hierarchy of observations.
Heart failure with preserved ejection fraction (EF) is increasing in incidence (1), and new criteria have incorporated echocardiographic findings (2). Moreover, even in the absence of heart failure, diastolic dysfunction (DD) has been shown to have prognostic significance (3). Recently, the American Society of Echocardiography and European Association of Echocardiography (ASE/EAE) released recommendations for the classification of DD (4). These recommendations use multiple echocardiographic parameters in a 2-level decision tree.
Mitral in-flow, pulmonary venous flow, color M-mode flow propagation velocity, tissue Doppler annular velocities, and left atrial volume are the cornerstones of diastolic function evaluation (4). However, each measurement has fundamental limitations and exceptions to its accurate use. Moreover, although each of these parameters reflects a physiological aspect of diastole, they do not reflect the same process, and ambiguity in the recognition of DD may arise when these parameters are discordant. Thus, contrary to the usual principles of algorithm use (that the same result is obtained on each application), there is an intrinsic ambiguity that reflects a judgment of how discordant data are to be weighted. The impact of the ASE/EAE recommendations on observer concordance is undefined. The aim of this study was to evaluate the interobserver concordance of the recommendations on the interpretation of diastolic class and assessment of filling pressure.
Diastolic function was evaluated with echocardiography in 20 patients who were seen in the outpatient clinic at our institution. Patients had a variety of indications for echocardiographic evaluation, mainly involving evaluation of left ventricular (LV) function (Table 1). Patients with atrial fibrillation/flutter during the study, severe mitral annular calcification, mitral valve prosthesis/ring, moderate or severe mitral regurgitation, or heart rates above 100 beats/min, and patients who had undergone heart transplantation, were excluded from the study because 1 or more of the parameters can be invalidated in these clinical situations. Patients included in this study were required to have a complete echocardiogram and brain natriuretic peptide (BNP) level drawn on the same day. Studies were screened from July 2009 until 20 patients (chosen as the sample size based on a similar recent published report) satisfied all of the inclusion criteria; the last patient was enrolled in January 2010.
Echocardiograms were performed by experienced sonographers using standard commercially available equipment (iE33, Philips, Andover, Massachusetts; Acuson Sequoia, Siemens, Mountain View, California; Vivid 7, GE Medical, Milwaukee, Wisconsin). Complete studies were abbreviated to a standardized collection of images comprising 2-dimensional views from parasternal long-axis, LV short-axis, 4- and 2-chamber (including left atrial size using the biplane Simpson method), color Doppler of the mitral valve and M-mode flow propagation, spectral pulsed-wave Doppler of the pulmonary venous flow, mitral in-flow with and without Valsalva maneuver, and septal and lateral annular tissue Doppler velocities. Measurements were made on these images and conveyed to the reviewers (Table 2). Duplicate images were removed to minimize ambiguity. No incomplete studies or studies with unclear windows were used.
Brain natriuretic peptide
As part of the entry criteria into the study, patients were required to have had a clinically indicated BNP level (ordered by the clinician responsible for the patient) on the day of the echocardiographic exam. The blood specimens were processed at the Cleveland Clinic central laboratory using a chemiluminescence assay (Siemens Healthcare Diagnostics, Deerfield, Illinois). These levels were used as an external reference for corroboration of the filling pressures obtained by the imaging modality, and readers were blinded to these results.
Anonymized data were distributed electronically to 14 expert readers in multiple countries, including the United States, Japan, France, Belgium, Italy, Australia, and New Zealand. Age and sex were included, but other clinical data were omitted at the time of echocardiographic interpretation.
Interpreters completed a data entry sheet in which the stage of dysfunction (I, II, or III) or absence of DD was coded, as well as an estimate of filling pressures (high or low). The sheets were then returned via a secure Web site to the primary investigator (T.H.M.) for data collection and analysis. As a reference standard, the ASE/EAE guidelines were used jointly by 2 experienced readers (Z.B.P. and T.H.M.) to reinterpret the studies. When features were ambiguous, the DD grade was allocated on the basis of the predominant findings. Discrepancies were resolved by consensus.
We assessed the reliability of assessment of diastolic grade and filling pressure. Accuracy was assessed against 2 reference standards: specific application of the ASE/EAE guidelines by 2 investigators (diastolic class) and/or BNP >100 pg/ml (filling pressure) (6). Concordance of each reader with the reference was assessed as Cohen's kappa, and overall agreement among readers was assessed by Fleiss kappa, with values of 0 and 1 signifying no agreement beyond chance and perfect agreement, respectively. Additionally, agreement in ranking of DD was assessed by the Spearman correlation (rho).
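The two kappa statistics named above can be illustrated with a minimal sketch. This is not the study's analysis code; the function names and the `example_ratings` table are hypothetical, and the labels ("0" for no DD, "I" to "III" for the stages) are invented for illustration only.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same cases."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement from each rater's own marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds the labels every rater gave case i."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = set().union(*counts)
    # Mean per-case agreement: the fraction of rater pairs that agree.
    p_bar = sum(
        (sum(c * c for c in cnt.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_cases
    # Chance agreement from the pooled proportion of each category.
    p_e = sum(
        (sum(cnt[cat] for cnt in counts) / (n_cases * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: 4 cases, each graded by 3 raters.
example_ratings = [
    ["0", "0", "0"],
    ["I", "I", "II"],
    ["II", "II", "II"],
    ["III", "II", "III"],
]
first_rater = [row[0] for row in example_ratings]
third_rater = [row[2] for row in example_ratings]

print(round(fleiss_kappa(example_ratings), 3))
print(round(cohen_kappa(first_rater, third_rater), 3))
```

Both functions assume that every case was graded by the same number of raters and that chance agreement is below 1; a reader who uses a single label throughout, compared with an identical reader, would give a zero denominator.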
Clinical and echocardiographic characteristics
Demographic and clinical data are detailed in Table 1.
Agreement Between Raters
Accuracy in the estimation of LV filling pressure
Details of the echocardiographic measurements relevant to diastolic class assignment are shown in Table 3. A total of 12 raters completed the evaluation form, while 1 U.S. and 1 non-U.S. rater did not (they were excluded from the study). The agreement between raters for assessing diastolic class was moderately strong (Fleiss kappa = 0.62), with agreement slightly higher for non-U.S. raters (Fleiss kappa = 0.65) than for U.S. raters (Fleiss kappa = 0.62). The median Spearman rho between 2 individual raters for diastolic class was 0.70 (range 0.30 to 0.96). Agreement between raters for assessing presence of elevated filling pressures was very similar (Fleiss kappa = 0.61), with agreement again slightly higher for non-U.S. raters (Fleiss kappa = 0.61) than for U.S. raters (Fleiss kappa = 0.57). The sensitivity and specificity of readers for raised filling pressure defined by the reference read were 66 ± 37% and 88 ± 26%, respectively. Similarly, using BNP >100 pg/ml as the reference standard for raised filling pressure, the agreement among readers with BNP was 78%, with individual readers' Cohen's kappa values ranging from 0.5 to 0.7. The sensitivity and specificity of readers for raised filling pressure defined by BNP were 69 ± 33% and 93 ± 12%, respectively.
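As a sketch of how per-reader sensitivity and specificity can be summarized as mean ± SD against a BNP-defined reference, the example below uses the BNP >100 pg/ml cut-off stated in the text. The BNP values, the two readers, and all names in the code are hypothetical and are not study data.

```python
from statistics import mean, stdev

BNP_THRESHOLD_PG_ML = 100  # cut-off for raised filling pressure used in the text


def sensitivity_specificity(reads, reference):
    """Return (sensitivity, specificity) of one reader's high/low calls.

    Assumes the reference contains at least one positive and one negative case.
    """
    tp = sum(1 for r, ref in zip(reads, reference) if r and ref)
    fn = sum(1 for r, ref in zip(reads, reference) if not r and ref)
    tn = sum(1 for r, ref in zip(reads, reference) if not r and not ref)
    fp = sum(1 for r, ref in zip(reads, reference) if r and not ref)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical BNP levels (pg/ml) for 6 cases, converted to a reference call.
bnp_pg_ml = [45, 250, 80, 610, 120, 30]
reference = [value > BNP_THRESHOLD_PG_ML for value in bnp_pg_ml]

# Hypothetical readers: True means "raised filling pressure".
reader_a = [False, True, False, True, False, False]
reader_b = [True, True, False, True, True, False]

results = [sensitivity_specificity(r, reference) for r in (reader_a, reader_b)]
sensitivities = [sens for sens, _ in results]
specificities = [spec for _, spec in results]

print(f"sensitivity {mean(sensitivities):.0%} ± {stdev(sensitivities):.0%}")
print(f"specificity {mean(specificities):.0%} ± {stdev(specificities):.0%}")
```

With few cases per reader, a single discordant read moves a reader's sensitivity by a large step, which is one reason the spread across readers can be wide.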
Concordance in the assessment of diastolic class
The concordance between observers and the reference read is shown in Figure 1. Delayed relaxation was the abnormality with the best concordance with the reference read (92%), with lower levels of agreement for pseudonormal (58%) and restrictive (65%). The median Spearman rho between 2 individual readers for diastolic class was 0.70 (range 0.30 to 0.96). There were no differences between U.S. and international readers. Cohen's kappa values for each reader relative to the reference rating ranged from 0.4 to 0.8. Two examples of discordant reads are shown in Figure 2.
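The median Spearman rho between pairs of readers can be computed as sketched below. This is an illustration only: the grades (0 for no DD, 1 to 3 for stages I to III), the three readers, and the function names are hypothetical. Because diastolic grades produce many ties, the sketch assigns tied cases their average rank before correlating the ranks.

```python
from itertools import combinations
from statistics import mean, median


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the tie-corrected ranks of x and y.

    Assumes neither reader gave the same grade to every case.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def pairwise_rho_summary(grades_by_reader):
    """Median, minimum, and maximum rho over all pairs of readers."""
    rhos = [spearman_rho(a, b) for a, b in combinations(grades_by_reader, 2)]
    return median(rhos), min(rhos), max(rhos)


# Hypothetical grades from 3 readers for the same 5 cases.
reader_1 = [0, 1, 1, 2, 3]
reader_2 = [0, 1, 2, 2, 3]
reader_3 = [0, 1, 1, 2, 3]

med, low, high = pairwise_rho_summary([reader_1, reader_2, reader_3])
print(f"median rho {med:.2f} (range {low:.2f} to {high:.2f})")
```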
Details of the echocardiographic measurements relevant to diastolic class assignment are shown in Table 4. Not all patients in each diastolic stage showed all of the changes that are typical of that stage. Of the 2 patients with delayed relaxation, a peak transmitral velocity in early diastole (E) to peak transmitral velocity at atrial contraction (A) ratio <0.8 was identified in 1 and a deceleration time >200 ms in 1. Of the 6 with pseudonormal filling, all had an E:A of 0.8 to 1.5, 2 had a deceleration time of 160 to 200 ms, and 2 had an E/early diastole (e') of 9 to 12. The 6 patients with restrictive filling all showed E:A >2, but only 3 showed deceleration time <160 ms and 3 showed E/e' >13.
This is the first multicenter study to evaluate the interobserver agreement in the assessment of DD. The main findings were that diagnoses of normal and abnormal filling were concordant with the reference read in 71% to 95%, with the lowest values obtained for restrictive and pseudonormal filling. The sensitivity and specificity of the readers to evaluate elevated filling pressures were 66% and 88%, respectively, with no significant international variation among readers. Previous reports have described the variability between different criteria of DD (7). The results of this study build on these observations, showing that the application of the current algorithm, even by expert echocardiographers, still leaves sufficient ambiguity to lead to inconsistent final results.
Markers of DD
The ASE/EAE recommendations incorporate a variety of parameters that reflect different LV diastolic properties, each of which has potential limitations. Mitral in-flow parameters vary with age and are directly affected by preload, conduction abnormalities, and arrhythmias (8,9). In patients with coronary artery disease or hypertrophic cardiomyopathy, these parameters have shown poor correlation with hemodynamic data (10–12). Pulmonary venous flow indexes are sometimes technically difficult to acquire and are affected by arrhythmias and age and may be inaccurate in preserved EF and severe mitral valve disease. Caution should be used in the application of flow propagation velocities because this may be normal despite abnormal filling pressures in patients with normal LV volumes and EFs (13). Tissue Doppler velocities may be technically challenging, and filling pressures estimated by the E/e' ratio are inaccurate in normal individuals and patients with mitral valve disease (14,15).
In addition to the limitations of each measurement, the measurements may be inconsistent. Diastolic function is a complex process that encompasses multiple sequential processes that affect LV filling. Myocardial relaxation can be divided into 3 phases (16). Mechanical recoil results from the release of energy stored in the sarcomere and myocardial interstitium, represented by untwisting and suction. LV relaxation represents a phase in which the LV volume is increasing and LV filling pressures are decreasing. Passive filling takes place near the end of LV filling and reflects LV compliance. The cellular and molecular mechanisms that underlie these events are distinct and may not necessarily be linked. Because the echocardiographic measures of diastolic function are linked to different combinations of these processes, there is every possibility that the measures will be discordant.
Acceptable limits of agreement
The assessment of concordance among reviewers has been the subject of remarkably little clinical research. Studies of cardiac nuclear imaging showed only modest concordance among observers with uniform image display (kappa 0.57); concordance improved with the use of quantitative circumferential profile analysis (17) and improved markedly (kappa 0.66) with automated analysis. Similar improvements have been obtained with the application of harmonic imaging and standard interpretation rules during dobutamine echocardiography, although the improvement was only to a kappa of 0.55 (18). A recent quality control exercise by the British Society of Echocardiography showed that the classification on either side of the mode of a 6-point mitral regurgitation quantification was <10% (19). The interrater agreement of approximately 0.6 in this study, although comparable to these previous reports, could nonetheless be improved upon. Moreover, although the specificities were good, the reported sensitivity of <70% is suboptimal.
Several possible solutions warrant further consideration. New markers of diastolic function such as direct measurement of deformation (strain, strain rate) and LV untwisting have the attraction of measuring individual physiological processes. Their disadvantages are their complexity, the signal noise and angle dependency of the Doppler-based approaches, and the lower frame rate and image quality dependence of speckle tracking. Additionally, there are significant intervendor differences in strain measurement (20,21).
The other solutions are simpler and may be more feasible. A probabilistic approach, based on the clinical scenario, would seem prudent. For example, pseudonormal is more likely than normal filling in the diseased LV with apparently normal transmitral flow. The ASE/EAE recommendations also do not incorporate patient age as a variable, despite its well-recognized effect on diastolic function, with the result that a significant percentage of normal individuals older than 70 years could have early diastolic mitral annular velocities classified as abnormal by current guidelines (22). The development of a hierarchy of observations might improve concordance. For example, a patient with normal E/e', E wave deceleration of 260 ms, and E:A ratio of 1 could reasonably be diagnosed with normal or delayed relaxation. Guidelines are needed regarding the relative weighting of discrepant results.
The comparators in this study were a reference read and BNP measurements, both of which are imperfect. In particular, the correlation of BNP with elevated filling pressures may be limited (23), and the design did not include analysis of patients in the intermediate group (E/e' 8 to 15).
The study design sought identification and classification of DD. Further study of which parameters are used by the readers is needed to evaluate the nature of differences among readers. Although the sample size was small (n = 20), there were 280 case reads by the 12 readers.
The application of current ASE/EAE recommendations for diastolic assessment results in correct estimation of filling pressure in a high proportion of studies, but classification of diastolic stage continues to be variable. Audit and critical assessment should be important constituents of the guideline process.
For a list of the Diastolic Guidelines Concordance Investigators, please see the online version of this article.
All authors have reported that they have no relationships to disclose. Jeroen J. Bax, MD, PhD, acted as Guest Editor for this article.
- Abbreviations and Acronyms
- A = peak transmitral velocity at atrial contraction
- BNP = brain natriuretic peptide
- DD = diastolic dysfunction
- E = peak transmitral velocity in early diastole
- e' = early diastole
- EF = ejection fraction
- LV = left ventricular
- American College of Cardiology Foundation
- Paulus W.J., Tschöpe C., Sanderson J.E., et al.
- Biner S., Rafique A., Rafii F., et al.
- Petrie M.C., Hogg K., Caruana L., McMurray J.J.V.
- Appleton C.P.
- Schnittger I., Appleton C.P., Hatle L.K., Popp R.L.
- Yamamoto K., Nishimura R.A., Chaliki H.P., Appleton C.P., Holmes D.R. Jr., Redfield M.M.
- Nishimura R.A., Appleton C.P., Redfield M.M., Ilstrup D.M., Holmes D.R. Jr., Tajik A.J.
- Nagueh S.F., Lakkis N.M., Middleton K.J., Spencer W.H. III, Zoghbi W.A., Quinones M.A.
- Diwan A., McCulloch M., Lawrie G.M., Reardon M.J., Nagueh S.F.
- Rivas-Gotz C., Khoury D.S., Manolios M., Rao L., Kopelen H.A., Nagueh S.F.
- Notomi Y., Thomas J.D.
- Wackers F.J., Bodenheimer M., Fleiss J.L., Brown M.
- Hoffmann R., Marwick T.H., Poldermans D., et al.