Abstract
Background Some UK GPs are acquiring access to natriuretic peptide (NP) testing or echocardiography as diagnostic tests for heart failure. This study developed appropriateness ratings for the diagnostic application of these tests in routine general practice.
Aim To develop appropriateness ratings for the diagnostic application of NP testing or echocardiography for heart failure in general practice.
Design and setting An appropriateness ratings evaluation in UK general practice.
Method Four presenting symptoms (cough, bilateral ankle swelling, dyspnoea, fatigue), three levels of risk of cardiovascular disease (low, intermediate, high), and dichotomous categorisations of cardiovascular/chest examination and electrocardiogram result, were used to create 540 appropriateness scenarios for patients in whom NP testing or echocardiography might be considered. These were rated by a 10-person expert panel, consisting of GPs and GPs with specialist interests in cardiology, in a two-round RAND Appropriateness Method.
Results Onward referral for NP testing or echocardiography was rated as an appropriate next step in 217 (40.2%) of the 540 scenarios; in 194 (35.9%) it was rated inappropriate. The ratings also show where NP testing and echocardiography were ranked as equivalent next steps and where one test was seen as more appropriate than the other.
Conclusion NP testing should be the routine test for suspected heart failure where referral for diagnostic testing is considered appropriate. An abnormal electrocardiogram status makes referral to echocardiography an accompanying, or more appropriate, next step alongside NP testing, especially in the presence of dyspnoea. Abnormal NP testing should subsequently be followed up with referral for echocardiography.
INTRODUCTION
Diagnosing heart failure is difficult; it requires some objective evidence of cardiac dysfunction in the presence of symptoms and signs. Unfortunately, key indicators, such as dyspnoea, cough, or bilateral ankle swelling (BAS), are common and non-specific, with no single symptom or cluster of symptoms having sufficient predictive value for making a diagnosis.1–3 Moreover, interpreting such symptoms can be difficult in, for example, older patients or those who are obese.1,4
Reliance on symptoms and signs alone, which has been a standard way of labelling patients as having heart failure, can lead to both under- and over-diagnosis.5 As an example, many patients with peripheral oedema are, in the absence of testing, diagnosed as having heart failure when the oedema has another, unrelated aetiology.5 Another difficulty for GPs is distinguishing respiratory from cardiac causes of dyspnoea: symptoms of heart failure may be attributed to chronic obstructive pulmonary disease and, therefore, treated incorrectly.
In many parts of Europe, GPs are gaining enhanced direct access to investigations that improve the accuracy of diagnosing heart failure,6 namely natriuretic peptide (NP) testing and echocardiography; Box 1 outlines how these tests work. However, guidelines are not clear as to how these technologies should be applied in triaging patients with symptoms; as a result, their use is inconsistent across, and within, countries, including the UK. The ‘appropriateness’ of the use of NP testing and echocardiography is also largely opinion based, with hospital specialists being viewed as the ultimate arbiters of the appropriateness of testing decisions made by GPs.7
Box 1. Natriuretic peptide and echocardiography diagnostic tests for heart failure
Natriuretic peptide (NP) tests show how well a heart is working by measuring the amount of natriuretic peptide hormone in the blood. The plasma concentration is increased in patients with asymptomatic and symptomatic left ventricular dysfunction, where the heart has to work harder than usual over a long period of time, as in heart failure. There is no difference in the diagnostic accuracy of B-type NP (BNP) and NT-proBNP.1
Echocardiography (echocardiogram or cardiac echocardiogram) is an ultrasound scan of the heart that gives detailed and accurate pictures of the heart muscle, the heart chambers, and structures within the heart such as the valves.
Some specialists have expressed concern about the overdiagnosis of heart failure by GPs who rely simply on routine clinical assessment8,9 when it might be more appropriate to arrange NP testing or echocardiography. However, others believe that such enhanced access might simply lead to more indiscriminate use of NP testing, with a consequent rise in false positives.11 As one of the symptoms of heart failure is fatigue, it is also quite possible to imagine that the current (and growing) battery of tests being ordered by many GPs for patients who are ‘tired all the time’ might be extended to include such items as C-reactive protein and NP.12
However, the precise definition and derivation of the term ‘appropriateness’ in studies examining GP testing is rather elusive.13 Appropriateness is not just about overuse or misuse, but should also consider underuse. For example, in a recent study of access to coronary angiography, it was demonstrated that many patients deemed appropriate for angiography failed to receive it and, as a result, experienced worse outcomes.14
How this fits in
In many parts of Europe, including the UK, GPs are gaining enhanced access to natriuretic peptide testing and echocardiography to seek to improve the accuracy of diagnosing heart failure. Unfortunately, as current guidance is unclear about how these technologies should be applied in triaging patients with specific symptoms or symptom combinations, their use is inconsistent. This study uses the RAND Appropriateness Method to identify clusters of symptoms, patient characteristics, and electrocardiogram results that GPs perceive as appropriate or inappropriate for referral for either natriuretic peptide testing or echocardiography as part of the diagnostic approach for heart failure.
In the absence of definitive guidance, the purpose of this study was, therefore, to derive appropriateness ratings for the use of heart-failure diagnostic triage in routine primary care. The aim was to identify the clusters of symptoms, patient characteristics, and electrocardiogram (ECG) results that GPs perceive as appropriate and inappropriate for referral for either NP testing or echocardiography as a diagnostic approach for heart failure. In addition, the study sought to identify those combinations of symptoms, patient characteristics, and ECG results for which uncertainty exists regarding the diagnostic pathway.
METHODS
The RAND Corporation has developed a definition of appropriateness in relation to GP-initiated testing:
‘For an average group of patients presenting to an average GP, clinically appropriate testing means that the expected health benefit(s) exceed the expected negative consequences by a sufficiently wide margin that the test is worth doing’.15
In conjunction with the University of California, RAND has developed a validated method to generate appropriateness ratings; this involves combining a summary of scientific evidence with the collective judgement of experts.16 For diagnostic testing, this presents an opportunity to produce a series of ratings of clinical scenarios regarding the appropriateness of performing a particular test in relation to patient-specific symptoms, patient characteristics, medical history, and previous investigations.17
This study adhered to the RAND Appropriateness Method17,18 by, initially, reviewing and collating the existing evidence base on heart-failure diagnosis, which included two recently published systematic reviews and their constituent papers.1,2 From this review, together with discussions with cardiologists and GPs, a list of clinical scenarios was generated that covered the major symptom-initiated presentations within general practice for patients with possible heart failure (but not already known to have heart failure), in whom NP testing or echocardiography might be considered (Table 1).
Four common presenting symptoms (cough, BAS, dyspnoea, and fatigue/tiredness), three levels of perceived risk of cardiovascular disease (CVD) (low, intermediate, or high), and dichotomous categorisations of CVD/chest examination and ECG result were identified. The symptom-initiated presentations were then used to develop a list of 540 clinical scenarios (Table 1).
Each scenario represented a separate testing indication consisting of variable combinations of the presenting symptoms, CVD risk, chest examination, and ECG result. Although sex was not explicitly included as a separate variable, it was incorporated in the assessment of CVD risk. The decision to use three levels of CVD risk was based on considerations about the nature of primary care diagnostics and study feasibility. More detail on the characteristics of these variables and the associated instructions given to panellists is provided in Table 1.
Each scenario was then independently linked to one of the following three potential decisions that a GP might make:
no further investigations (Nil) — to carry out neither echocardiography nor an NP test;
echocardiography — to directly refer for echocardiography without carrying out an NP test; or
B-type NP (BNP)/NT-proBNP triage (NP) — to first carry out an NP test in order to rule out those individuals at low likelihood of having heart failure. The remaining patients would subsequently go on to be referred for echocardiography.
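The figure of 540 is consistent with one simple construction, sketched below in Python. This is an assumption rather than the study's documented procedure: it supposes that every non-empty combination of the four presenting symptoms was crossed with the three CVD risk levels, the two examination categories, the two ECG categories, and the three potential decisions. The names `symptom_combinations` and `build_scenarios` are hypothetical.

```python
from itertools import combinations, product

# Variable levels as named in the Methods; the crossing below is an assumption.
SYMPTOMS = ("cough", "bilateral ankle swelling", "dyspnoea", "fatigue")
CVD_RISK = ("low", "intermediate", "high")
CVD_EXAM = ("normal", "abnormal")
ECG = ("normal/no ECG", "abnormal")
DECISIONS = ("Nil", "echocardiography", "NP")


def symptom_combinations(symptoms):
    """Return every non-empty combination of presenting symptoms."""
    return [
        combo
        for size in range(1, len(symptoms) + 1)
        for combo in combinations(symptoms, size)
    ]


def build_scenarios():
    """Cross each clinical presentation with each potential decision."""
    presentations = product(
        symptom_combinations(SYMPTOMS), CVD_RISK, CVD_EXAM, ECG
    )
    return [
        (symptoms, risk, exam, ecg, decision)
        for (symptoms, risk, exam, ecg) in presentations
        for decision in DECISIONS
    ]


scenarios = build_scenarios()
print(len(symptom_combinations(SYMPTOMS)))  # 15 symptom combinations
print(len(scenarios))                       # 15 * 3 * 2 * 2 * 3 = 540
```

Under this assumed construction, the 15 symptom combinations yield 180 distinct clinical presentations, each rated against the three decisions.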
A rating scale was used to rate these potential decisions in relation to the clinical presentation on a 9-point integer scale, as follows:
scores 1–3: inappropriate next step (that is, no benefit, possible harms);
scores 4–6: uncertainty about next step (that is, when harms and benefits are judged as approximately equal, or when the best available evidence does not support a judgement either way);
scores 7–9: appropriate next step (that is, benefits were judged to outweigh harms).
As an example, a rating of 1 would mean that the decision would be an extremely inappropriate next step, whereas a rating of 9 would mean that it was extremely appropriate.
Panel
Panel composition is a crucial factor in ensuring the legitimacy of consensus techniques.21 As the aim of this study was to create specific appropriateness ratings for routine clinical general practice, the 10-person panel was identified and selected in discussion with the Primary Care Cardiovascular Society and the Royal College of General Practitioners. It consisted of practising GPs, five of whom were nationally recognised experts in this area. These five individuals had particular interests in cardiology in general, and heart failure more specifically. Two panel members also had a significant publication record of original research in relation to NP and heart failure. The panel was co-chaired by one researcher, who is expert in the RAND Appropriateness Method, and one who is expert in primary care diagnostics.
Round one was conducted by email. Panel members were sent the two systematic reviews, the literature summary, and the list of scenarios, together with a list of definitions for all the terms used. They were asked to independently consider and rate the appropriateness of each decision for every scenario on the 9-point integer scale, basing their ratings on the summarised evidence and their own clinical experience as GPs.
In round two, panellists met for a 1.5-day face-to-face meeting, in which they were given rating sheets that included feedback from the first round. This feedback included the frequency distribution of ratings of all panellists across the 9-point scale, together with the overall panel median rating, for each of the 540 scenarios. Each panellist was also presented with their own first-round rating sheet as a reminder of how they had initially rated each scenario. During the meeting, panellists then discussed each symptom-based grouping of scenarios in turn, focusing on areas of disagreement, before rating each scenario on their individual blinded rating sheets.
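The round-one feedback described above (a frequency distribution across the 9-point scale plus the panel median) could be produced along the following lines. This is a minimal sketch only; the function name `feedback_row` and the ratings shown are hypothetical and are not study data.

```python
from collections import Counter
from statistics import median


def feedback_row(ratings):
    """Summarise one scenario: counts for scores 1..9 and the panel median."""
    counts = Counter(ratings)
    distribution = [counts.get(score, 0) for score in range(1, 10)]
    return distribution, median(ratings)


# Hypothetical first-round ratings from a 10-person panel (not study data).
first_round = [2, 3, 3, 5, 6, 7, 7, 8, 8, 9]
distribution, panel_median = feedback_row(first_round)
print(distribution)   # [0, 1, 2, 0, 1, 1, 2, 2, 1]
print(panel_median)   # 6.5
```

With an even number of panellists the median can fall between two integers, as here, which matters when ratings are later grouped into the 1–3, 4–6, and 7–9 bands.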
Data entry and analysis
Within a consensus technique, there are two aspects to the rating process for every scenario: the overall panel median rating and the level of agreement within the panel.21 The RAND Appropriateness Method definition of appropriateness (an overall panel median of 7–9 without disagreement) was adhered to. Disagreement was defined to exist where ≥33% of panellists rated a scenario in the 1–3 range and ≥33% of panellists rated the same scenario in the 7–9 range of the 9-point scale. In consequence, a scenario with a median rating of 7–9 without disagreement was considered to be an appropriate next step, and one with a median rating of 1–3 without disagreement was considered to be an inappropriate next step.
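The classification rule can be expressed as a short Python sketch. Two points are assumptions rather than statements from the study: medians that fall between bands (for example, 3.5 or 6.5) are treated here as equivocal, and scenarios rated with disagreement are labelled equivocal regardless of their median. The function names and example ratings are hypothetical.

```python
from statistics import median


def has_disagreement(ratings, threshold=0.33):
    """True if >=33% of panellists rate 1-3 and >=33% rate 7-9."""
    n = len(ratings)
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    return low / n >= threshold and high / n >= threshold


def classify(ratings):
    """Label one scenario from its panel ratings."""
    if has_disagreement(ratings):
        return "equivocal"  # assumption: disagreement overrides the median
    panel_median = median(ratings)
    if panel_median >= 7:
        return "appropriate"
    if panel_median <= 3:
        return "inappropriate"
    return "equivocal"  # includes in-between medians such as 6.5 (assumption)


# Hypothetical ratings from a 10-person panel (not study data).
print(classify([7, 8, 8, 9, 7, 8, 9, 7, 6, 8]))  # appropriate
print(classify([1, 1, 2, 2, 3, 3, 2, 1, 4, 8]))  # inappropriate
print(classify([1, 2, 3, 3, 8, 8, 9, 7, 5, 5]))  # equivocal (disagreement)
```

With a 10-person panel, the ≥33% threshold means that at least four panellists must rate at each extreme for disagreement to exist; three at each end (30%) is not enough.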
RESULTS
Results are presented for the ratings from the second round only and are summarised in Tables 2 and 3. The appropriate next steps are presented as matrices in Appendix 1.
Table 2 shows a summary of the panel ratings for the 540 separate scenarios and indicates a high level of agreement within the panel, with only 42 scenarios (7.7%) rated with disagreement. Of the 540 scenarios, 217 (40.2%) were rated as appropriate next steps without disagreement and 194 (35.9%) scenarios were rated as inappropriate next steps without disagreement. In all other scenarios (n = 129) the appropriate next step was considered equivocal.
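As a quick arithmetic check, the three outcome counts reported above can be tallied against the 540 scenarios. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
TOTAL = 540
counts = {"appropriate": 217, "inappropriate": 194, "equivocal": 129}

# The three categories should account for every scenario.
assert sum(counts.values()) == TOTAL

# Percentage share of each category, to one decimal place.
shares = {label: round(100 * n / TOTAL, 1) for label, n in counts.items()}
print(shares)  # {'appropriate': 40.2, 'inappropriate': 35.9, 'equivocal': 23.9}
```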
Table 3 presents the ratings for all combinations of symptoms and tests, and shows, for each combination, which of the three next steps studied were rated as either appropriate or inappropriate; the overall panel median rating is given in parentheses. When the panel rated the cluster of symptoms and tests as an equivocal marker for referral for either NP testing or echocardiography, no next step is recorded.
Table 3 shows examples of scenarios in which the panel felt referral for either NP testing or echocardiography would be an inappropriate next step and that no investigation was an appropriate next step; for example, where there was a normal CVD examination, low CVD risk, and normal/no ECG status with either cough, fatigue, or BAS presenting as single symptoms. However, even with these findings, NP testing was never rated as inappropriate where dyspnoea presented in combination with at least two other symptoms.
There was only one example of NP testing being rated an appropriate next step and echocardiography as an inappropriate next step (normal CVD examination, low CVD risk, normal/no ECG status and presenting symptoms of cough, BAS, and dyspnoea). There were also examples of scenarios in which referral to NP testing was rated an appropriate next step but referral to echocardiography was considered equivocal (that is, normal CVD examination, low CVD risk, abnormal ECG status and presenting symptoms of dyspnoea and fatigue or cough or BAS [data not shown]). In general, the presence of dyspnoea was more likely to be associated with NP testing being an appropriate next step compared to no testing.
Table 3 also shows that there are no scenarios in which echocardiography was rated an appropriate next step but NP testing was not also considered appropriate. For many scenarios, echocardiography and NP testing were rated as equally appropriate next steps; however, in some scenarios, one test was considered more appropriate than the other (Table 3). In general, the presence of an abnormal CVD examination result, high CVD risk, and abnormal ECG status made referral to echocardiography a more appropriate next step than NP testing. In most scenarios representing normal or abnormal CVD examination results in combination with any risk of CVD and normal/no ECG status, NP testing was rated as more appropriate than echocardiography.
These ratings have been used to generate appropriateness matrices for single symptoms and combinations of symptoms (Appendix 1). Furthermore, in developing the appropriateness matrices, it also made clinical sense to combine the low and intermediate CVD risk categories. These matrices show where NP and echocardiography were rated the most appropriate next step, and where NP and echocardiography were rated as equally appropriate next steps.
DISCUSSION
Summary
This study has identified situations in which GP referral for NP testing or echocardiography would be considered appropriate, inappropriate, or equivocal on the basis of peer consensus by generalist and specialist GPs. Given the lack of definitive guidance on such diagnostic triaging, these data could form a basis for how ‘open-access’ diagnostic testing services should be commissioned for primary care. Although every patient who is eventually diagnosed with heart failure will likely have received an NP assay and echocardiography as part of their work-up, it has not been determined in which patient phenotypes the tests should be considered, or in what order.
Strengths and limitations
This study adhered to a validated systematic consensus method for developing appropriateness scenarios.17,22 Panel composition is the most important influence on panel ratings and, as advised, the study's panel members reflected the issue, evidence base, and likely users of the output of the ratings.21 The study panel, comprising generalists and generalists with a special interest in heart failure, considered the overall evidence base in these appropriateness ratings; however, they did not explicitly consider cost effectiveness, which is not part of the RAND Appropriateness Method.15
Comparison with existing literature
These findings have resonance with the conclusions of the recent systematic review of Mant et al, who proposed that patients with suspected heart failure should be offered BNP (or NT-proBNP) as a triage test prior to echocardiography, unless presenting with any one of a history of myocardial infarction, basal crepitations, or, in men, ankle oedema.1
Implications for practice and research
This study sought to generate appropriateness ratings and appropriateness matrices that are clinically precise, specific, and applicable to routine clinical practice in relation to an area in which GPs express a lack of confidence.23 The importance of these findings should also be considered in the context of the variable access that primary care currently has to NP, or even echocardiography, testing within the UK.
Although the RAND Appropriateness Method has been applied successfully in a number of specialist areas of diagnostic testing,24 the authors are not aware of any attempt to apply it to assessing the appropriateness of diagnostic testing requests by generalists. This study describes the development of appropriateness ratings for GPs in relation to NP and echocardiography diagnostic testing for possible heart failure. In cases where the study panel rated all ‘next steps’ as equivocal, it does not follow that NP testing or echocardiography should be ruled out, but that the GP should take other issues, such as comorbidities, patient preferences, and other clinical information, into account when making treatment decisions.
Appropriateness ratings that are developed using consensus techniques have high face validity. However, this is a minimum prerequisite for any quality measure and subsequent developmental work is required to provide empirical evidence and to test for acceptability, feasibility, reliability, sensitivity to change, and validity.21,25,26 Although the findings illuminate the decision-making process for diagnosing heart failure in routine general practice, they should now be the subject of prospective validation and empirical testing using an observational design.
These results indicate that NP testing should be used as part of the routine decision-making process for diagnosing suspected heart failure within primary care in all cases in which referral for any diagnostic testing is considered appropriate. Subsequently, all patients with abnormal NP testing should be followed up by referral for echocardiography. However, in general, the presence of an abnormal ECG status makes referral to echocardiography a more appropriate next step or an accompanying next step alongside NP testing, especially in the presence of dyspnoea.
Acknowledgments
We thank the panellists — Terry McCormack, Nigel Rowell, Ahmet Fuat, Peter Savill, Richard Falk, Nick Taylor, Chris Arden, Richard Hobbs, Raghu Raghunath, and Michael Norton — in addition to the following colleagues from the Department of Health: Sue Hill, Roger Boyle, and Ian Barnes.
Notes
Funding body
UK Department of Health (Pathology Programme).
Ethics committee
The GPRD Group has ethical approval from a multicentre research ethics committee for all purely observational research using GPRD data; namely, studies that do not include patient involvement, as here. No individual patients are identifiable through this research.
Provenance
Freely submitted; externally peer reviewed.
Competing interests
FD Richard Hobbs has received occasional sponsorship or speaking fees from a variety of pharmaceutical and diagnostic companies with interests in heart failure, including Astra-Zeneca, Pfizer and Roche Diagnostics. The other authors have declared no competing interests.
- Received August 19, 2010.
- Revision received September 15, 2010.
- Accepted September 30, 2010.
- © British Journal of General Practice, January 2011