Bent, N. P., Wright, C., Rushton, A., & Batt, M. (2009). Selecting outcome measures in sports medicine: a guide for practitioners using the example of anterior cruciate ligament rehabilitation. British Journal of Sports Medicine, 43, 1006. doi:10.1136/bjsm.2009.057356.
This review of outcome measures aims to guide sports medicine practitioners in selecting and using outcome measures.
Outcome measures are commonly used in research and population settings to determine the effectiveness of treatments, but they can also be used to inform rehabilitation and return-to-sport decisions in athletes following injury. This review uses anterior cruciate ligament (ACL) injury and rehabilitation after reconstruction as a common clinical example. This is particularly pertinent given recent evidence that premature return to sport and quadriceps strength inequality are risk factors for the disappointingly high rate of graft failure and contralateral ACL injury.
Whilst many outcome measures have been validated for use in athletic populations, they are rarely incorporated into routine practice. Reasons for this might include uncertainty as to which measures to use; lack of familiarity; and time and resource constraints involved in their collection.
Questions to ask when selecting an outcome measure:
Is the measure appropriate?
Who and what is being measured, and why? Is the measure suitable for this injury, this patient and this activity level, and does it tell you what you want to know? Do you want to evaluate change over time (eg progress of rehab: is strength improving?) or make an absolute evaluation of function (eg has a target strength level been reached?)? Do you want to measure pain, function, quality of life, strength or a combination?
Is the measure acceptable to patients?
If it is a patient-reported outcome measure (PROM), it should not take too long to complete, and the questions should be unambiguous and easy to understand. It should be appropriate to the patient’s language and culture. If you are using a clinical measure, it should not be too uncomfortable or carry a risk of injury. For example, ACL-deficient “non-copers” may refuse to perform a single leg hop test, making the test unacceptable!
Is the measure feasible to use?
What are the financial, time and manpower resources required to use the measure?
A questionnaire (eg KOOS) administered with pen and paper, then scored manually using an algorithm, may take too much of a clinician’s time. Cost, however, is low. The same questionnaire in a digital format would remove the time barrier. Instrumented ACL testing requires expensive equipment and space in addition to a trained tester.
Is the measure meaningful?
Can the score be benchmarked against normative data, compared to the other leg or to pre-injury measures? What degree of change in a score is meaningful? For example, the minimal important difference (MID) for the IKDC was found to be 11.5 points, but not all PROMs have stated MIDs. Using the other leg as a benchmark in clinical measures may be unreliable – function in both legs may suffer as a result of prolonged inactivity.
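As a minimal sketch of how a change in score might be checked against a published MID, the Python below uses the 11.5-point IKDC figure quoted above. The function name and the example scores are hypothetical and for illustration only.

```python
IKDC_MID = 11.5  # minimal important difference quoted above for the IKDC


def is_meaningful_change(baseline, follow_up, mid=IKDC_MID):
    """Return True if the absolute change in score meets or exceeds the MID."""
    return abs(follow_up - baseline) >= mid


# Hypothetical scores, not taken from any patient or study
print(is_meaningful_change(55.0, 70.0))  # change of 15.0 points
print(is_meaningful_change(55.0, 62.0))  # change of 7.0 points
```

A change smaller than the MID may still be real, but it is less likely to matter to the patient.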
Are the results reproducible?
There are 3 types of “reliability”.
A measure should provide similar results on repeat testing (test-retest reliability), when testing is delivered at different times by the same practitioner (intra-rater reliability) or by different practitioners (inter-rater reliability).
For clinical measures, a learning effect between tests (eg increased familiarity with equipment) can result in systematic bias and reduce intra-rater reliability. Different practitioners may vary in their encouragement, resulting in different hop test measurements, an example of systematic bias reducing inter-rater reliability. The standard error of measurement (SEM) can be used to estimate a range of scores that is likely to contain the patient’s “true score”, known as the confidence interval.
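The SEM and confidence interval described above can be sketched in Python as follows. This assumes the commonly used formula SEM = SD × √(1 − reliability), with a reliability coefficient such as an ICC, and a 95% interval of ±1.96 × SEM; neither formula is stated in the review, and the function names and example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); reliability (eg an ICC) lies in 0..1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(observed, sem, z=1.96):
    """Range likely to contain the true score (z=1.96 gives roughly 95%)."""
    margin = z * sem
    return observed - margin, observed + margin


# Hypothetical values: between-patient SD of 10 points, reliability of 0.91
sem = standard_error_of_measurement(10.0, 0.91)
low, high = confidence_interval(70.0, sem)
print(f"SEM = {sem:.2f}; true score likely between {low:.2f} and {high:.2f}")
```

The more reliable the measure, the smaller the SEM and the narrower the range around the observed score.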
Does the measure assess what it is supposed to?
There are 4 types of “validity”.
Face validity – it should seem logical: you don’t ask about pet preferences to explore ACL function.
Content validity – it should cover all the important aspects of the condition being explored. A tool for measuring ACL injury symptoms would need to include pain, stability, swelling etc, whereas a tool for knee strength would not.
Criterion validity – can the scores be correlated with a “gold standard”? Unfortunately, there is no such gold standard in ACL measures.
Construct validity – this reflects the degree to which the scores correlate with other measures. To be useful, a measure should correlate positively with related measures (eg the Lysholm score and the KOOS ADL sub-scale), and not correlate strongly with unrelated measures (eg functional scores and age). It should also be able to discriminate between groups, eg ACL-injured knees with or without cartilage damage.
Can the measure detect change?
The measure should be responsive to real changes in a patient’s condition.
Are there ceiling and floor effects?
If too many patients (15-20%) achieve the best score for a domain, the measure is too easy (a ceiling effect). This is of particular concern when using a measure in sporting groups, where normal levels of function are very high: scores that are adequate for a population with OA do not indicate adequate function for an athlete. Likewise, if too many patients achieve the lowest score, the measure is too hard (a floor effect). The worst possible score on the Cincinnati system’s sports function sub-scale will be common in patients following ACL injury but before treatment.
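One way to check a set of scores for ceiling and floor effects is sketched below. It assumes a 15% threshold, the lower end of the 15-20% range mentioned above; the function names and the example scores are hypothetical.

```python
def ceiling_floor_proportions(scores, min_score, max_score):
    """Return (floor, ceiling): the fractions of patients at the worst and best score."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return floor, ceiling


def has_effect(proportion, threshold=0.15):
    """Flag an effect when the proportion reaches the assumed 15% threshold."""
    return proportion >= threshold


# Hypothetical sub-scale scores (0 = worst, 100 = best) for ten athletes
scores = [100, 100, 95, 80, 100, 60, 0, 90, 85, 100]
floor, ceiling = ceiling_floor_proportions(scores, 0, 100)
print(f"floor: {floor:.0%} (effect: {has_effect(floor)})")
print(f"ceiling: {ceiling:.0%} (effect: {has_effect(ceiling)})")
```

In this made-up sample, four of ten athletes score the maximum, which would suggest the measure is too easy for the group.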
Is the measure structured and scored correctly?
Components of measures consisting of one dimension, such as the IKDC form, can be summed together to give an overall score. Multidimensional measures will need some way to give suitable weighting to the various questions they seek to answer. The KOOS contains sub-scales which are designed to be interpreted separately from the overall score: pain, activities of daily living, symptoms, sport and recreation, and quality of life.
Has it been tested in the right types of patients?
A measure that is valid for use in patients with ACL injuries may not be useful in patients with OA, and measures that are reliable in recreational athletes may not be reliable in elite athletes.
There are many outcome measures available for use in patients with musculoskeletal conditions. In the research setting, it is easy to become familiar with the one or two measures used in the study design. In clinical practice, where patients have different injuries to different body regions and have different activity levels, it is much more complex to navigate outcome measures usefully.
MyScoreIt has selected commonly used PROMs which we consider useful in day-to-day practice, to make this process quick and easy.