Behavioral health professional and regulatory organizations are stepping up support for routine outcomes monitoring (ROM), also referred to by terms such as Outcomes Monitoring, Feedback-Informed Treatment, and Measurement-Based Care. They encourage clinicians to monitor patients’ responses to treatment and to identify both those at risk of a bad outcome and those who are progressing as expected. ROM is also promoted as a way to compare the effectiveness of clinicians. But a recent analysis suggests ROM systems have overstated their case and recommends caution by clinicians using them.
In their 2018 paper “Five Types of Clinical Difference to Monitor in Practice” (https://www.researchgate.net/publication/326507191), Lankass, Wampold, and Hoffart point out the following deficiencies of ROM systems:
- they conflate statistically significant change with clinically meaningful change;
- they suggest positive change is always good, negative change is always bad, and change is always the result of treatment;
- they don’t address the valid clinical concerns of the 25% of patients who begin treatment in the normal range;
- they compare a patient’s progress with average progress, suggesting that average responders are on track for a good outcome;
- they benchmark clinician performance based on the average change of their patients without considering the diversity and varying chronicity of the patient populations served.
This blog reviews each of these concerns with ROM and ends with suggestions for better use of progress monitoring data.
ROM emerged in the 1990s as systems of care sought to quantify the benefits of psychological services and to use data to optimize the value and effectiveness of treatment systems. (For a list of ROM systems, see the addendum below.) Dozens of research studies from the late 1990s and early 2000s showed that regular progress feedback reduced negative outcomes and improved outcomes for all patients. Some early studies suggested dramatic gains, in some cases a quadrupling of improvement rates.
More recent studies, conducted by independent investigators who did not profit from these systems, have failed to replicate these earlier findings. In 2016 the prestigious Cochrane Collaboration conducted an exhaustive meta-analytic review of ROM and found insufficient evidence to recommend its routine use (“Routine use of patient reported outcome measures for improving treatment of common mental health disorders in adults,” 2016). ROM was not shown to improve patient outcomes or to reduce length of treatment. Cochrane did find a small positive effect for patients identified as not on track, albeit only when clinicians had significant buy-in to the approach.
Other studies casting a shadow over ROM include one by Goldberg (2016) showing that access to regular progress feedback does not result in therapists improving their outcomes over time.
In their paper “Five Types of Clinical Difference to Monitor in Practice,” Wampold and colleagues state unequivocally that no current ROM system can detect whether, and to what extent, treatment is effective, and that measurable (statistically significant) client change does not always indicate that treatment is working.
ROM systems are built on problematic statistical methodologies. One pillar of ROM systems is Jacobson and Truax’s clinical significance methodology, including the Reliable Change Index (RCI). This is a statistical approach that indicates whether observed change is real in statistical terms, i.e., that it is not due to measurement error. But real change in statistical terms is not necessarily clinically significant change due to treatment.
–There are many clinical situations in which making a clinical difference does not equate to positive change on a graph. Sometimes treatment focuses on preventing a worsening of a client’s distress, e.g., preventing the recurrence of a major depression. In this instance, no change is a good outcome.
–Clients’ scores may improve and reach statistical significance (their level of distress may decline) while their underlying problems persist.
–Studies show that a percentage of patients (5.5%) get worse before they improve. Negative early change is not always a signal that treatment is not “on track.”
–Natural healing, rather than treatment, might be the reason for patient progress. In fact, some clinical interventions retard natural healing, for instance early post-trauma PTSD interventions (McNally, 2003). In such cases clinically significant change might be detected even though treatment actually slowed improvement. ROM cannot identify such ineffective treatments.
Thus, because ROM systems conflate statistically significant change with clinically significant change, they are prone to providing inaccurate or misleading decision-support recommendations. The point for clinicians to take on board is that they should not assume statistical significance equates to clinically meaningful improvement.
In addition, ROM systems cannot provide any meaningful feedback about the 25% of clients whose intake assessments are in the normal range, because clinical significance methodology requires scores in the clinical range, i.e., above a cut-off.
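The Jacobson-Truax machinery behind these two points can be sketched in a few lines. The scale statistics used below (pre-treatment standard deviation, test-retest reliability, normative means) are illustrative placeholders, not values from any actual ROM instrument, and the sketch assumes higher scores mean greater distress:

```python
import math

def reliable_change_index(pre, post, sd_pre, reliability):
    """Jacobson-Truax Reliable Change Index (RCI).
    |RCI| > 1.96 means the observed change is unlikely (p < .05)
    to be due to measurement error alone -- 'real' in statistical
    terms, which is not the same as clinically meaningful."""
    se = sd_pre * math.sqrt(1 - reliability)   # standard error of measurement
    s_diff = math.sqrt(2 * se ** 2)            # standard error of the difference score
    return (post - pre) / s_diff

def jt_cutoff(mean_clinical, sd_clinical, mean_normal, sd_normal):
    """Jacobson-Truax criterion 'c': the point between the clinical and
    normative distributions beyond which a score is closer to 'normal'."""
    return (sd_normal * mean_clinical + sd_clinical * mean_normal) / (sd_normal + sd_clinical)

def classify(pre, post, sd_pre, reliability, cutoff):
    """Classify one case (higher score = more distress).
    Note the scheme is undefined for patients who START below the
    cutoff, i.e. in the normal range -- the ~25% discussed above."""
    if pre < cutoff:
        return "intake already in normal range: no classification"
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    if rci < -1.96 and post < cutoff:
        return "recovered"
    if rci < -1.96:
        return "improved (still in clinical range)"
    if rci > 1.96:
        return "deteriorated"
    return "no reliable change"
```

For example, with an intake SD of 10 and reliability of .90, a 15-point drop yields an RCI of about −3.35 (reliable), while an 8-point drop yields about −1.79 (not reliable) even though the patient may feel meaningfully better; and a patient entering below the cutoff gets no classification at all.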
Another pillar of ROM is the use of expected treatment response (ETR) graphs that predict the likely progress of treatment over time. The ETR models ROM systems generate are proprietary; the systems themselves are commercial. Wampold et al. point out that these ETR curves (as currently constructed) do not represent the expected trajectories of cases that actually recover, but rather the average expected progress of all clinical cases with similar baseline scores.
In the OQ System, for instance, cases that are similar to the average patient are considered “On Track” when they are really only “On Track” for an average treatment response, but not for a satisfactory recovery, which might require far more improvement.
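The distinction can be made concrete with a toy sketch. The trajectory, cutoff, and tolerance band below are invented for illustration and are not drawn from the OQ System or any other product:

```python
# Hypothetical expected-treatment-response (ETR) curve: the *average*
# session-by-session score of past patients with the same intake score.
# All numbers here are invented for illustration only.
AVG_TRAJECTORY = [70, 66, 63, 61, 59, 58, 57]  # sessions 0..6
RECOVERY_CUTOFF = 56   # scores below this would count as recovered
TOLERANCE = 5          # band around the average that counts as "on track"

def on_track(session: int, score: float) -> bool:
    """'On track' in the sense most ETR models use it:
    within a tolerance band of the AVERAGE expected response."""
    return abs(score - AVG_TRAJECTORY[session]) <= TOLERANCE

# A patient who tracks the average exactly is flagged "on track" at every
# session -- yet the average trajectory never crosses the recovery cutoff.
patient = list(AVG_TRAJECTORY)
always_on_track = all(on_track(s, score) for s, score in enumerate(patient))
ever_recovers = patient[-1] < RECOVERY_CUTOFF
print(always_on_track, ever_recovers)  # True False
```

In other words, the system reports this patient as on track for an average response at every session, even though the average response itself never reaches recovery, which is exactly the conflation the authors criticize.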
ROM systems are often used to identify top-performing or expert therapists: therapists whose clients show larger changes are assumed to have greater expertise. Some clinicians even use their performance data to market their services. The problem, as we have noted, is that statistically significant client change does not always indicate clinically significant treatment. Moreover, some experienced therapists may receive a high number of referrals of difficult cases, while less effective therapists may see a high proportion of clients who recover naturally; this is not a distinction these systems can make accurately at this time.
The implication for clinicians is that they should exercise caution in using any ROM system.
For clinicians using ROM systems, our advice is as follows:
–The important data points in a ROM system are the patient’s score and the trajectory of change. The clinician should consider these two things together with all other available information and ask: is the patient benefiting from treatment, and is there something to do to enhance the benefit?
–Cases showing significant deterioration deserve close clinical scrutiny, given the evidence that deterioration is a ROM system’s most robust clinical signal.
–Clinicians should be skeptical of efforts to benchmark clinician performance.
Despite these caveats, ROM systems have the potential to improve the effectiveness of psychological treatments. We will discuss what CarePaths and others are doing to improve ROM systems in a subsequent blog post.
*ROM Systems in use in the USA. Each system is proprietary.
OQ System—Developed by Lambert and Burlingame
PCOMS—Developed by Miller and Duncan
ACORN—Developed by Brown and colleagues
BHM—Developed by Kopta and colleagues
STIC—Developed by Pinsof and colleagues
TOP—Developed by Kraus and colleagues
Each of these systems uses clinical cut-off scores with the exception of TOP. Each uses clinical significance with the exception of STIC and TOP. And each uses expected treatment response (ETR) except BHM, STIC and TOP.