Diagnostic accuracy of the FRAIL scale plus functional measures for frailty screening: a cross-sectional study

Background There is little knowledge of the diagnostic accuracy of screening programmes for frailty in primary care settings. Aim To assess a two-step strategy consisting of the administration of the FRAIL scale to those who are non-dependent and aged ≥75 years, followed by measurement of the Short Physical Performance Battery (SPPB) or gait speed in those who screen positive. Design & setting Cross-sectional and longitudinal cohort study. Analysis of primary care data from the FRAILTOOLS project in five European cities. Method All primary care patients consecutively attending were enrolled. They received the index tests, plus the Fried frailty phenotype (FP) and the frailty index to assess their frailty status. Mortality and worsening of dependency in basic and instrumental activities of daily living (BADL and IADL) over 1 year were ascertained. Results Prevalence of frailty based on FP was 14.9% in the 362 participants. A FRAIL scale score ≥1 had a sensitivity of 83.3% (95% confidence interval [CI] = 73.1 to 93.6) to detect frailty. A positive result and an SPPB score <11 had a sensitivity of 72.2% (95% CI = 59.9 to 84.6); when combined with a gait speed <1.1 m/s, the sensitivity was 80.0% (95% CI = 68.5 to 91.5). Two-thirds of those screened as positive were not frail. In the best scenario, sensitivities of this last combination to detect IADL and BADL worsening were 69.4% (95% CI = 59.4 to 79.4) and 63.6% (95% CI = 53.4 to 73.9), respectively. Conclusion Combining the FRAIL scale with other functional measures offers an acceptable screening approach for frailty. Accurate prediction of worsening dependency and death needs to be confirmed through the piloting of a frailty screening programme.


Introduction
Frailty in older people is a progressive age-related decline in physiological systems resulting in decreased reserves of intrinsic capacity, extreme vulnerability to stressors, and increased risk of adverse health outcomes. 1 Screening for this very common condition (35% of patients aged ≥70 years attending primary care in Europe) 2 allows early detection and intervention before consequences occur, such as disability, which is more difficult to reverse. There is evidence on the validity, reliability, and feasibility of several tools to perform the screening and on the efficacy of interventions to reverse frailty, mainly multicomponent exercise. 3 Several countries and regions have deployed screening programmes in primary care with different instruments that generate variable workloads for primary care teams. 3,4 The usual limitation of attention time in these practices, combined with possible limitations to face-to-face contact like those brought about by the COVID-19 pandemic, suggests the need for a screening instrument that can be administered quickly and over the phone. The FRAIL scale 5 meets these requirements and could be combined with performance tests, such as the SPPB 6 or the measurement of gait speed, to confirm positive results. These are tools recommended by the ADVANTAGE Joint Action. 3 To the authors' knowledge, there is no published evidence on the diagnostic accuracy of this strategy in screening for frailty, worsening of dependency, or death. The objective of this study was to evaluate the sensitivity, specificity, and predictive values of this strategy. Different cut-offs for the three instruments were explored.

Method
This article adheres to the STARD guidelines for reporting diagnostic accuracy studies. 7 The design and rationale of the FRAILTOOLS project have been previously published in more detail. 8 It was an observational, prospective, and longitudinal study planned to explore the diagnostic accuracy of several frailty instruments. It enrolled consecutively 1440 adults (aged ≥75 years) from primary care clinics, geriatric medicine services, and nursing homes from France (Toulouse), Italy (Rome), Poland (Cracow), Spain (Getafe), and the UK (Birmingham). Exclusion criteria were as follows: a Mini-Mental State Examination ≤20 points; a terminal illness (life expectancy ≤6 months); and a Barthel Index <90. Variables were collected at baseline in 2016 and at 6-, 12-, and 18-months' follow-up. This article is limited to the 381 primary care patients and their 1-year follow-up. The first five patients with no exclusion criteria attending primary care practices each morning were selected. These practices were mainly those that volunteered to participate among those that referred patients to the principal investigators' affiliation hospitals.
Information on age, sex, multimorbidity, 9 and several frailty instruments was collected at baseline. This article focuses on the following index tests and frailty measures.
The FP 10 constitutes one of the reference standards because of its general acceptance as a measure of frailty. 3 It consists of three self-reported components (exhaustion, physical activity, and weight loss), and two objective measures (grip strength and gait speed). Exhaustion was considered present when the responder answered at least 3-4 days during the past week to either of the following two questions from the Center for Epidemiological Studies Depression (CES-D) scale: 11 'I felt that anything I did was a big effort' and 'I felt that I could not keep on doing things'. The physical activity item was considered present when men reported usually walking <2.5 hours per week (equivalent to <383 kcal) and women <2 hours per week (<270 kcal). The weight loss item was considered positive if there was an unintentional loss of at least 4.5 kg in a year. Grip strength was measured as the best of three trials with a Jamar hydraulic dynamometer in the dominant hand. Gait speed was measured as the best of two trials at usual pace over a 4.5-metre distance from a standing position without using assistive devices. Both items were considered positive when the individual was in the worst quintile within strata of sex and body mass index for grip strength, and of sex and height for gait speed. 12 A patient was considered frail if ≥3 criteria were positive, even if the rest of the items were not measured. No imputation of missing items was performed.
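The classification rule in the preceding paragraph can be sketched as follows. This is a minimal illustration, not the FRAILTOOLS code: the function name and the three-valued return are hypothetical, and the handling of patients with fewer than three positive criteria and some unmeasured items (returned here as undetermined unless three positives are no longer reachable) is an assumption, since the text only specifies the frail case.

```python
from typing import Optional, Sequence


def fried_phenotype_frail(criteria: Sequence[Optional[bool]]) -> Optional[bool]:
    """Classify frailty from the five FP criteria.

    Each entry is True (criterion positive), False (negative),
    or None (not measured). Returns True (frail), False (not frail),
    or None (undetermined; an assumption of this sketch).
    """
    if len(criteria) != 5:
        raise ValueError("the FP has exactly five criteria")
    positives = sum(1 for c in criteria if c is True)
    missing = sum(1 for c in criteria if c is None)
    if positives >= 3:
        # Frail even if the remaining items were not measured.
        return True
    if positives + missing < 3:
        # Three positives cannot be reached whatever the missing items are.
        return False
    # No imputation of missing items: leave the status undetermined.
    return None
```

For example, a patient with three positive criteria and two unmeasured ones is classified as frail, whereas two positives, one unmeasured item, and two negatives remain undetermined under this sketch.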
The 35-item Frailty Index (FI-35) was the second reference standard because it belongs to the second recognised conceptualisation of frailty as an accumulation of deficits. 13 It was calculated as the proportion of a list of health deficits (that is, symptoms, signs, chronic diseases, disability, and laboratory abnormalities) the patient suffered from, 14 obtained from medical records, self-reported, or measured at the patient's evaluation. The cut-off used to identify frailty was set to ≥0.25. 15 According to the original protocol, up to 20% of the FI-35 items could be missing for the score to be calculated and the patient classified.
The FRAIL scale, one of the index tests, comprises five self-reported items: Fatigue, Resistance, Ambulation, Illness, and Loss of weight. 5 Although the recommended cut-off is ≥3, this article also explores lower ones. Any individual with any item missing was excluded from analyses.
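As an illustration of the FI-35 calculation, the sketch below computes the index as a proportion of deficits, applies the 20% missingness allowance, and classifies with the ≥0.25 cut-off. The function names are hypothetical, and two details are assumptions not stated in the text: each deficit is coded between 0 (absent) and 1 (present), and the denominator is the number of observed items.

```python
from typing import Optional, Sequence


def frailty_index(deficits: Sequence[Optional[float]],
                  max_missing: float = 0.20) -> Optional[float]:
    """Proportion of deficits present among the observed items.

    `deficits` holds one value per item: 0 (absent) to 1 (present),
    or None when the item is missing. Returns None when more than
    `max_missing` of the items are missing.
    """
    n_items = len(deficits)
    if n_items == 0:
        raise ValueError("at least one item is required")
    observed = [d for d in deficits if d is not None]
    if (n_items - len(observed)) / n_items > max_missing:
        return None  # too many missing items to compute the score
    # Assumption: the denominator is the number of observed items.
    return sum(observed) / len(observed)


def is_frail_by_index(score: Optional[float], cutoff: float = 0.25) -> bool:
    """Frail when the index could be computed and is at or above the cut-off."""
    return score is not None and score >= cutoff
```

With 35 items, up to 7 may be missing (20%); an eighth missing item makes the score uncomputable in this sketch.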
Two performance measures, the two other index tests, were administered: the SPPB, 6 which is a scale that ranges from 0-12 and combines three tests, gait speed, time to perform five chair stands, and balance assessment (in three positions: feet together, semi-tandem, and tandem); and gait speed (measured like the FP's item).
For assessing predictive validity, the following frailty adverse outcomes were employed: 1) IADL and BADL dependency worsening at follow-up, which were defined as a loss of one point in the Lawton and Brody 16 or a loss ≥5 points in the Barthel 17 indexes, respectively; 2) death, which was ascertained through phone calls to arrange follow-up visits at 6 and 12 months and, when no answer was obtained, hospital registries (plus the death registry of the Ministry of Health in Spain).
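The two dependency outcomes can be expressed as simple score differences. The sketch below is illustrative only: the function names are hypothetical, a "loss of one point" on the Lawton and Brody index is read as a loss of at least one point (an assumption), and a missing follow-up score yields an undetermined outcome.

```python
from typing import Optional


def iadl_worsened(lawton_baseline: int,
                  lawton_followup: Optional[int]) -> Optional[bool]:
    """IADL worsening: loss of at least 1 point on the Lawton and Brody index."""
    if lawton_followup is None:
        return None  # lost to follow-up: outcome unknown
    return lawton_baseline - lawton_followup >= 1


def badl_worsened(barthel_baseline: int,
                  barthel_followup: Optional[int]) -> Optional[bool]:
    """BADL worsening: loss of at least 5 points on the Barthel index."""
    if barthel_followup is None:
        return None  # lost to follow-up: outcome unknown
    return barthel_baseline - barthel_followup >= 5
```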
All members of the research healthcare team (nurses and geriatricians) of all countries received the same training on the administration of the scales. 8 All participants gave informed, written consent.

Statistical analysis
Variables were described with medians and interquartile ranges (IQRs) or absolute and relative frequencies. Sensitivity, specificity, and the percentage of positives who are not frail (false positives, understood as the complement of the positive predictive value, not of the specificity) were calculated for frailty defined by the FP and the FI-35, and for the three adverse outcomes. These indicators were obtained for different thresholds of the FRAIL scale, and for different scores of the SPPB and increments of 0.1 m/s of gait speed in individuals with a FRAIL score ≥1.
R software (version 4.0.2) was used for all analyses.
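The three indicators can be computed from a 2×2 table as in the sketch below. It is written in Python for illustration only (the study's analyses were run in R), the function names and the example counts are hypothetical, and the normal-approximation (Wald) interval is an assumption, since the text does not state how the confidence intervals were obtained.

```python
import math
from typing import Dict, Sequence, Tuple


def wald_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation CI for a proportion, clipped to [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def screening_accuracy(test_positive: Sequence[bool],
                       has_condition: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and share of positives without the condition."""
    if len(test_positive) != len(has_condition):
        raise ValueError("both sequences must have the same length")
    pairs = list(zip(test_positive, has_condition))
    tp = sum(1 for t, c in pairs if t and c)
    fp = sum(1 for t, c in pairs if t and not c)
    fn = sum(1 for t, c in pairs if not t and c)
    tn = sum(1 for t, c in pairs if not t and not c)
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("an indicator is undefined for this 2x2 table")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Complement of the positive predictive value, not of the specificity.
        "false_positive_share": fp / (tp + fp),
    }


# Hypothetical data: 10 with the condition (8 detected), 90 without (12 flagged).
test_positive = [True] * 8 + [False] * 2 + [True] * 12 + [False] * 78
has_condition = [True] * 10 + [False] * 90
result = screening_accuracy(test_positive, has_condition)
```

In this made-up example, sensitivity is 8/10 while the false-positive share is 12/20, which shows how a test can detect most cases and still refer many people who do not have the condition.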

Results
Out of the 381 primary care patients, 19 did not provide information on the index tests or frailty measures. Among these 19 there were more females than among participants, although the difference was not statistically significant (P = 0.407). There were no differences in age (P = 0.49), Charlson Comorbidity Index (P = 0.323), Lawton and Brody scale (P = 0.437), or Barthel index (P = 0.326). The final sample size for the diagnostic accuracy of frailty was 362. Characteristics of the sample for analyses are presented in Table 1. Median age was 79 years (IQR 77-82) and 58.8% were female. Three cities (Getafe, Toulouse, and Rome) contributed 79.2% of the sample. The median Charlson Comorbidity Index was high (median = 4; IQR 4-5). Dependency at baseline was infrequent, but around 17% became dependent after 1 year (loss to follow-up for this variable amounted to 24.0%). The prevalence of frailty was 14.9% (95% CI = 11.2 to 18.6) and 15.2% (95% CI = 11.5 to 18.9) according to the FP and the FI-35, respectively. Median SPPB score and gait speed were 10 and 1 m/s, respectively. Deaths were an extremely infrequent outcome, although missing information was the highest for this variable.
Table 2 presents the prevalence and diagnostic accuracy of the FRAIL scale for three cut-off points. The traditional cut-off point (≥3) had a very low sensitivity for detecting frailty according to the FP (37.0%, 95% CI = 23.7 to 50.3). Decreasing the cut-off by one point rendered a higher sensitivity of 66.7% (95% CI = 53.7 to 79.7). Scoring any item of the FRAIL scale had a sensitivity of 83.3% (95% CI = 73.1 to 93.6). Indicators were slightly worse for frailty operationalised as the FI-35. Of the sample, 42.3% (95% CI = 37.2 to 47.4) scored at least one item of the FRAIL scale and therefore would be offered screening with the performance tests.
Table 3 presents the prevalence and diagnostic accuracy of the different scores of the SPPB among those with a score in the FRAIL scale ≥1. Percentages refer to the whole sample, not only those with a FRAIL scale score ≥1. When the condition to be screened was defined as frailty according to the FI-35, the cut-off to obtain a similar sensitivity was <12. Using a cut-off of <11, 32.3% (95% CI = 27.5 to 37.2) of the total eligible population would be referred to a multidimensional evaluation.
Table 4 presents the same structure as Table 3 but refers to different gait speeds. Sample size was smaller (N = 342) because in some cases gait speed was ascertained for the SPPB scoring but was not recorded in m/s. A FRAIL score of ≥1 plus a gait speed <0.8 m/s showed a sensitivity for frailty of only 52.0% (95% CI = 37.7 to 66.3) for the FP and 34.7% (95% CI = 20.9 to 48.5) for the FI-35. Sensitivity reached 74.0% at a cut-off <1 m/s and rose further at the expense of a small increase in false positives.
A score of the FRAIL scale ≥1 had a sensitivity of 52.2% (95% CI = 37.2 to 67.2) and 46.8% (95% CI = 32.0 to 61.6) to detect a 1-year worsening of BADLs and IADLs, respectively. Combining it with the SPPB or gait speed, sensitivities would be even lower. A sensitivity analysis was performed by handling missing data in the worsening dependency variables by best-case imputation, where missing cases were considered to have worsened their dependency if their FRAIL scale score was >0 at baseline, and not to have worsened if their FRAIL scale score was 0. Under this assumption, FRAIL score ≥1 sensitivities increased to 76.1% (95% CI = 67.
Death could not be ascertained in 107 people of the cross-sectional sample, which left a sample size for the diagnostic accuracy of mortality of 255 individuals. Two deaths occurred during the follow-up, both with a FRAIL score of 1, SPPB scores of 6 and 9, and gait speeds of 0.79 m/s and 0.67 m/s, respectively. That means that the sensitivity for death of a FRAIL score ≥1 and SPPB <11 or any of the proposed cut-offs for gait speed was 100%. Nevertheless, 97.2% (95% CI = 93.3 to 100) of those considered positive under the former criteria did not die.
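The best-case imputation used in the sensitivity analysis can be sketched as follows. The function names and the six example records are hypothetical and are not study data; the sketch only illustrates why this imputation can only raise the sensitivity of a FRAIL score ≥1, since every imputed "worsened" case is, by construction, a screen-positive one.

```python
from typing import List, Optional, Tuple

# Each record: (baseline FRAIL score, worsened dependency at 1 year or None if missing)
Record = Tuple[int, Optional[bool]]


def best_case_impute(records: List[Record]) -> List[Tuple[int, bool]]:
    """Missing outcomes count as worsened if FRAIL > 0, as not worsened if FRAIL = 0."""
    return [
        (score, worsened if worsened is not None else score > 0)
        for score, worsened in records
    ]


def frail_sensitivity(records: List[Record], cutoff: int = 1) -> float:
    """Sensitivity of FRAIL >= cutoff among cases known to have worsened."""
    worsened_scores = [score for score, worsened in records if worsened is True]
    if not worsened_scores:
        raise ValueError("no worsened cases: sensitivity is undefined")
    return sum(1 for s in worsened_scores if s >= cutoff) / len(worsened_scores)


# Hypothetical records, two of them lost to follow-up.
records: List[Record] = [
    (0, True), (1, True), (2, None), (0, None), (0, False), (1, False),
]
complete_case = frail_sensitivity(records)                # ignores missing outcomes
best_case = frail_sensitivity(best_case_impute(records))  # after imputation
```

Here the complete-case sensitivity is 1/2 and the best-case value is 2/3, so the imputed figure should be read as an upper bound rather than an estimate.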

Discussion

Summary
This article shows that a strategy that screens all non-dependent adults aged ≥75 years in primary care with the FRAIL scale and follows up positive results with the SPPB or measurement of gait speed has a reasonable diagnostic accuracy for frailty detection when the following thresholds are applied: a positive answer to any of the items of the FRAIL scale (instead of the recommended cut-off of ≥3 items), plus an SPPB score <11 or gait speed <1.1 m/s (instead of the usual threshold of 0.8 m/s). The results suggest that this strategy may also predict those who worsen their dependency level within 1 year, although caution in the interpretation is warranted because of losses to follow-up.
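The two-step rule with the thresholds named above can be written as a small decision function. This is a sketch under stated assumptions: the function name, the result labels, and the choice to use the SPPB when both performance measures are supplied are illustrative and do not come from the study.

```python
from typing import Optional

FRAIL_CUTOFF = 1        # any positive FRAIL item
SPPB_CUTOFF = 11        # SPPB score < 11 is positive
GAIT_CUTOFF_MS = 1.1    # gait speed < 1.1 m/s is positive


def two_step_screen(frail_score: int,
                    sppb: Optional[int] = None,
                    gait_speed_ms: Optional[float] = None) -> str:
    """Step 1: FRAIL scale. Step 2: SPPB or gait speed in FRAIL positives."""
    if frail_score < FRAIL_CUTOFF:
        return "negative"
    if sppb is not None:
        # Assumption of this sketch: the SPPB is used when both are available.
        positive = sppb < SPPB_CUTOFF
    elif gait_speed_ms is not None:
        positive = gait_speed_ms < GAIT_CUTOFF_MS
    else:
        return "needs performance test"
    return "positive" if positive else "negative"
```

For instance, `two_step_screen(1, sppb=10)` returns `"positive"`, while `two_step_screen(2, gait_speed_ms=1.1)` returns `"negative"` because the gait speed threshold is strict.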

Strengths and limitations
The article has the strength of presenting the diagnostic accuracy results of a previously untested frailty screening strategy carried out in a multi-country sample of primary care patients. Its main limitation is loss to follow-up, which did not allow the authors to obtain conclusive results on prediction of dependency worsening because the results differed between the non-imputed analysis and the best-case imputation. It is known that individuals lost to follow-up were older and marginally frailer, which limits the generalisability of the results to all eligible users. The authors believe most of these individuals probably dropped out of the study because of tiredness and loss of motivation owing to the long administration time of the full frailty assessment in FRAILTOOLS, which comprised seven instruments. Another limitation is the short follow-up to detect deaths, but extending analyses to 18 months, as established in the FRAILTOOLS protocol, would have increased missingness.

Comparison with existing literature
Our results for the sensitivity of the FRAIL scale are comparable to those published by Ambagtsheer et al 18 from a study in primary care patients aged ≥75 years in southern Australia, where they reported a sensitivity of 30% (95% CI = 16.6 to 46.5) and 19.8% (95% CI = 12.9 to 28.5) for the FP and the frailty index, respectively, with a cut-off of ≥3. They are lower than those reported in eastern China, where sensitivity results for the FP in community dwellers aged ≥60 years were 52.2% for a score ≥3, 87% for a score ≥2, and 97.8% for a score ≥1. 19 Similar results to Dong et al 19 were reported by Thompson et al 20 for the FP in community dwellers from the northwest of Australia aged ≥65 years. In relation to prediction of disability worsening, Si et al 21 found that a FRAIL scale score ≥3 had a sensitivity at 1-year follow-up of 11.7% for BADL and 9.9% for IADL in community dwellers aged ≥60 years from a Chinese city. They did not offer results for lower cut-off points.
Although the SPPB has been used for screening in primary care, 22 no data have been reported on its ability to detect frailty or adverse outcomes at this level of care. Ambagtsheer et al 18 also studied the diagnostic accuracy of gait speed ≤0.8 m/s (at four-metre distance). Sensitivity against the FP and the frailty index was 70.0% (95% CI = 53.5 to 83.4) and 47.8% (95% CI = 38.2 to 57.4), respectively, and specificity 77.1% (95% CI = 70.5 to 82.9) and 84.6% (95% CI = 76.8 to 90.6), respectively. Our sensitivities are lower and specificities higher because we added the requirement of having a positive answer in any of the items of the FRAIL scale.

Implications for research and practice
The use of the FRAIL scale as a screening tool for frailty has many advantages for busy primary care clinics; for example, it has a short administration time (less than a minute and a half in most cases), 23 requires little training or instruction for the assessor, and can be delivered over the phone. The authors recommend using it in the context of the algorithm presented in Figure 1. Although the results were limited to individuals aged ≥75 years, existing recommendations have been adhered to 3,4 and all individuals aged ≥70 years have been considered. Any positive answer to the items of the FRAIL scale over the phone in a non-dependent patient should elicit an in-person consultation where either the SPPB or gait speed would be measured. The authors predict that less than half of the screened population will need to be referred for functional assessment, which would certainly reduce primary care teams' workload compared with the assessment of all individuals with functional measures (as recommended in other screening programmes). 24 Positive results in functional assessments should be confirmed through a comprehensive geriatric assessment (CGA) carried out at the primary or secondary level of care, something that will be required by around one-third of the eligible population. The CGA should encompass the prescription of a multicomponent exercise intervention in confirmed cases. 3 The cut-offs suggested here are those that the authors consider acceptable given the sensitivity, proportion of false positives, and workload that they would produce. These decisions should be tempered by the resources available to carry out functional measurements and CGAs.
To increase certainty on the ability of this strategy to predict death and dependency worsening, a pilot programme with usual primary care users followed for a longer period is warranted.
NHS England opted for detecting frailty in primary care following the accumulation of deficits paradigm using electronic medical records. 25 This would be equivalent in the present study to just administering the FI-35 to the whole sample. This is not a screening strategy, but a diagnostic one, because all cases according to this definition of frailty would be detected. Curiously enough, NHS England states nevertheless that 'confirmation of frailty in an individual should be undertaken using a validated tool such as: [the] Gait Speed Test'. 26 They also state, in another document: 25 '... a clinician from the primary care team should verify the frailty diagnosis by direct assessment using the Clinical Frailty Scale (CFS) or similar validated tool'. Both instruments are considered screening tests for frailty, not diagnostic tools. 3 In any case, the FP and index approaches have different purposes and are to be considered complementary in the evaluation of the older person. 27 One of their main differences is that the frailty index includes diseases, disability, and dependency items, while the FP was conceived as a measure of a condition that usually precedes disability, and because of that it is based on assessing performance-based tasks, which are different from disability. It has been shown that the algorithm is more sensitive to detect frailty according to the FP rather than the frailty index, and therefore more suitable to identify patients at risk of developing disability.

Funding
This study was supported by the European Commission Directorate General for Health and Consumer Affairs, Third Health Programme, Funding Health Initiatives (2014-2020); Consorcio Centro de Investigación Biomédica en Red, Instituto de Salud Carlos III, Ministerio de Ciencia e Innovación; and FEDER funds.

Ethical approval
This study was approved by the ethics committee of each participating centre. The comprehensive validation of frailty assessment tools in older adults in different clinical and social settings (FRAIL-TOOLS) project was registered at http://www.clinicaltrials.org (reference: NCT02637518; date of registration: 18 December 2015).

Provenance
Freely submitted; externally peer reviewed.