This week, Clinical Informatics News is exploring those intangible, sometimes invisible, but altogether indispensable pieces of the health care system – the patients themselves. How can data-driven methods be brought to bear on patients’ lives inside and outside the clinic? And if we ask the right questions, can they tell us how to provide better care?
By Aaron Krol
October 18, 2013 | There are researchers and organizations pursuing some truly innovative measurement tools in the field of patient reporting; this week, Clinical Informatics News has highlighted the Patient Activation Measure and the Vulnerability Index as tools pushing the boundaries of what kind of data is relevant to clinical practices. But for certain symptoms and conditions, patient reporting has always been our principal source of information: pain, for instance, or mental health problems like depression. One might suppose that the health care system has the use of this essential patient-generated data down to a fine science. In fact, there is considerable variability in how patient-reported outcomes, or PROs, are collected, measured, and especially compared. Translating PROs into accurate performance measures that providers can use to improve quality remains a daunting task, with its own unique web of methodological knots and, until recently, few guides to untangling them.
The National Quality Forum (NQF), a nonprofit first created in 1999 in response to the President’s Advisory Commission on Consumer Protection and Quality in the Health Care Industry, has a mandate to evaluate and endorse performance measures for use across providers. Those endorsements reach into both the public and private sectors. When the NQF, as a national voluntary consensus standards-setting organization, issues an endorsement of a performance measure, relevant federal programs must use that measure or provide a rationale why they have chosen not to. Hospitals and clinics in the private sector also prefer to use NQF-endorsed measures where they exist. “It gives [performance measures] a gold standard,” Karen Adams, the NQF’s Vice President of National Priorities, told Clinical Informatics News. “When they see the NQF has endorsed it, they know that it has reached a certain bar.”
Adams is one of the internal leaders of a recent NQF project to create clear guidelines for the endorsement of PRO-based performance measures, or PRO-PMs. In early 2012, the NQF was part of a team called the National Priorities Partnership (NPP), which identified five areas of the health care system that most urgently needed better measurement tools; one of these areas was “Person-Centered Care and Outcomes,” including most prominently PROs. Identifying this as a top priority represents a significant shift in the traditional values of health services. “There are very important diseases, certainly,” says Adams. “Heart disease and cancer, asthma, et cetera. But [the NPP] really focused on things like care coordination… They picked things like patient-centered care and outcomes because, regardless of your condition, this matters to you. And ultimately, how we judge quality should be around achieving the outcomes that are meaningful to patients and their families.” PROs met the twin qualifications of major impact on a large, cross-cutting spectrum of health services, and an underdeveloped system for incorporating them into performance measurements.
To follow through on its commitment to the NPP priorities, the NQF in 2012 commissioned two white papers on the methodological challenges to developing PRO-PMs, and assembled an expert panel to review the findings and create a set of rigorous standards.
Everyone seems to agree that communicating with patients, and turning those interactions into actionable data, is very important, but the NQF’s Patient-Reported Outcomes project provides a glimpse into just how difficult it is in practice. There are four broad categories of outcomes that can only be measured by patient reporting – health-related quality of life, symptoms and symptom burdens, experience with care, and health-related behaviors – and all of them deal with subjective experiences and sensitive personal information that patients may be reluctant to share.
Problems, the NQF panel found, begin at the level of administering surveys. If a survey is delivered by an interviewer, patients may find some questions uncomfortable to answer honestly, while self-administration limits how complex and detailed a survey can be. Administration in the clinic slows the workflow and gives extra responsibilities to clinic staff, but administration at home is difficult and reduces the likelihood a survey will be completed. Patients also may not be perfectly comparable; each time a survey is translated into a different language, or administered by proxy because a patient is unable to communicate directly with clinicians, there is a chance of biased changes in the responses. And of course each patient will have a different conception of, for instance, where a given level of pain falls on a 1 to 10 scale. These are all potential sources of noise that do not arise in a standard clinical measurement like a blood pressure test.
Ideally these sources of variability would be compared against a reliable baseline and controlled for. Unfortunately, says Karen Pace, NQF’s Senior Director of Performance Measurement and co-leader with Adams of the Patient-Reported Outcomes project, there are “few good examples of health care providers systematically using these kinds of tools to get the information from patients. That presents a barrier to even having data” that could help validate individual PRO measurement tools by demonstrating cross-consistency.
PRO-PMs compound these issues. It may seem obvious that clinics that provide better care should register better scores when their PRO measurements are aggregated into PRO-PMs, but in fact it can be difficult to make combined PRO scores sensitive to quality of care. “There are numerous ways that you could construct [a PRO-PM],” says Pace. One method would be to ask, “‘What percentage of this organization’s patients improved on a particular outcome?’ Or it could be, ‘What percentage of their patients met a certain benchmark?’... Or you could have, ‘What’s the average score that was achieved by the patients served?’”
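The three constructions Pace describes can be sketched in a few lines of Python. This is purely illustrative, not an NQF specification: the pain scores and the benchmark value below are invented, and real measures involve far more careful handling of missing data and case mix.

```python
# Three ways to aggregate the same patient-level PRO scores (hypothetical
# pain ratings, lower is better) into a provider-level PRO-PM.

before = [8, 6, 7, 9, 5, 6]  # scores at the start of care
after = [5, 6, 4, 7, 5, 3]   # scores at the end of care

# 1. Percentage of patients who improved on the outcome
pct_improved = 100 * sum(a < b for a, b in zip(after, before)) / len(before)

# 2. Percentage of patients who met a benchmark (here, a score of 4 or less)
BENCHMARK = 4  # invented threshold for illustration
pct_at_benchmark = 100 * sum(a <= BENCHMARK for a in after) / len(after)

# 3. Average score achieved by the patients served
average_score = sum(after) / len(after)
```

Note that the three numbers can rank the same providers differently: a clinic whose patients all improve slightly may look strong on the first measure and weak on the second.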
Each of these standards has its own unique problems. Comparing patients’ initial PRO measurements to their scores when they leave care would seem to best capture the effects of intervention, but PROs are rarely sensitive to small changes one way or the other, since they have to control for small random variance. Floor and ceiling effects, the panel warned, may also threaten the validity of this type of PRO-PM: patients who start at the high end of a PRO measurement will only register in the PRO-PM if they deteriorate, and patients at the low end will only register if they improve. Measuring change over time is also highly dependent on understanding the minimum meaningful differences in PRO scores. If a hospital’s patients, on average, report fatigue scores of 7.3 entering the hospital and 6.8 on discharge, is that a sign of improvement or just noise?
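The fatigue example above comes down to comparing an observed change against a minimum meaningful difference. A minimal sketch, assuming a hypothetical threshold of 0.9 points (not a published value for any real instrument):

```python
# Deciding whether an average change in a PRO score is meaningful or noise
# by comparing it against an assumed minimum meaningful difference.
# All scores and the 0.9 threshold are invented for illustration.

entry_scores = [7.5, 7.1, 7.4, 7.2]  # hypothetical fatigue scores at admission
exit_scores = [6.9, 6.7, 7.0, 6.6]   # hypothetical scores at discharge

MIN_MEANINGFUL_DIFF = 0.9  # assumed threshold for a real change

avg_change = (sum(entry_scores) - sum(exit_scores)) / len(entry_scores)
is_meaningful = avg_change >= MIN_MEANINGFUL_DIFF
```

With these numbers the average improvement is 0.5 points, which falls under the assumed threshold, so a change-based PRO-PM would treat it as indistinguishable from noise.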
On the other hand, pegging PRO-PMs to patients’ achievement of benchmarks demands a great deal of faith in the validity of those benchmarks. A PRO-PM to capture the percentage of patients who move from depression to remission over the length of a clinical stay relies on having a PRO measurement with a distinct threshold for depression, and not a smooth gradient. And all PRO-PMs must take into account that certain patient populations are likely to score higher or lower on PRO measurements regardless of care, for instance because different providers serve areas with different income levels. If a PRO-PM cannot adjust for this relative risk, it may be useful for an individual provider to track their progress, but it does not help to compare quality of care between clinics.
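One common shape for the risk adjustment described above is an observed-to-expected comparison. The sketch below assumes each patient already has an expected score derived from population factors; the numbers are invented, and real adjustment models are far more involved.

```python
# Observed-to-expected risk adjustment, in miniature. Expected scores would
# come from a statistical model of patient characteristics (income, age,
# comorbidities); here they are simply made up for illustration.

observed = [6.0, 5.0, 7.0, 4.0]  # hypothetical observed PRO scores
expected = [5.5, 5.0, 6.0, 4.5]  # hypothetical expected scores given case mix

# A ratio above 1 means patients scored higher than their case mix predicts;
# comparing this ratio across providers is fairer than comparing raw averages.
oe_ratio = sum(observed) / sum(expected)
```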
These sources of variation in PRO-PMs’ construction mean that validating them is a time-consuming process, in which reviewers must drill deep into the methodology, from collection of patient-reported data to calculation of scores. Even when everything seems sound, it can be difficult to verify that the results reflect real quality of care. The NQF commission recommends a combination of “construct validity” – that is, checking that providers already known for their quality of care achieve higher scores – and “face validity,” especially from patient experts like advocacy groups or focus groups of highly engaged patients, to make sure the results do not noticeably contradict patient experience. The patient perspective was “the true north,” says Adams, and even when the NQF was narrowly focused on methodology, they included patient representatives like the AARP, the Patient Centered Outcomes Research Institute, and the National Partnership for Women and Families in their panels. “We wanted to put patients up front in regards to that decision-making,” Adams adds.
More Work to Be Done
Adams and Pace recognize that, despite the uncertainty surrounding PRO-PMs, steps to adopt them cannot be tentative or purely for research purposes. “The first thing that providers need to do is actually use these instruments in their practice to measure their patients’ care,” says Pace. “We don’t want the patient to just feel that they’re filling out forms… In order for this to be meaningful and valued by the patient, it has to directly relate to their care, the decisions about their care, and monitoring their progress.”
The NQF released its report on endorsing PRO-PMs in October of 2012, and they plan to follow it with a series of endorsements in 2014. The organization has also started a project to recommend specific PRO-PMs for development, and to spread awareness of qualified PRO-PMs to members of the health care industry, policymakers and the medical education system. That project just closed nominations for its committee on October 15, and will hold its first multi-stakeholder meeting of providers, insurers, policy experts, public health organizations and patient groups on October 22.
Although a coordinated endorsement of large numbers of PRO-PMs remains a year or more in the future, two sets of PRO-PMs have already cleared the hurdles for NQF approval. One uses the PHQ-9 tool for measuring patient-reported depression, to identify the percentage of depressed patients who enter remission within six months of intervention. This tool is well-vetted because the PHQ-9 has a clear and clinically-validated cut-off point for depression, and because it requires significant improvement to register remission; patients scoring in the “gray area” below the depression threshold but short of remission do not contribute to the PRO-PM’s positives. This makes it unambiguous that a clinic that raises its score on the PRO-PM has made more successful interventions in its patients’ conditions.
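The remission logic the article describes can be sketched as a simple pair of thresholds. The sketch below uses the commonly cited PHQ-9 cut points (a score of 10 or more for depression, under 5 for remission); the patient data is invented, and the exact specification of the endorsed measure may differ in its details.

```python
# A gray-area-aware remission measure in miniature, using commonly cited
# PHQ-9 cut points. Scores between the two thresholds at follow-up do not
# count toward the remission rate.

DEPRESSION_CUTOFF = 10  # commonly cited PHQ-9 threshold for depression
REMISSION_CUTOFF = 5    # commonly cited PHQ-9 threshold for remission

def in_remission(intake_score, followup_score):
    """True only for patients depressed at intake who later score below the
    remission threshold; gray-area follow-up scores (5-9) do not count."""
    return intake_score >= DEPRESSION_CUTOFF and followup_score < REMISSION_CUTOFF

# Hypothetical (intake, six-month follow-up) score pairs
patients = [(15, 3), (12, 7), (18, 4), (11, 10)]
remission_rate = 100 * sum(in_remission(i, f) for i, f in patients) / len(patients)
```

Because of the gap between the two cut-offs, a patient who drifts from a score of 12 to a score of 7 improves without registering as a remission, which is what makes gains on this PRO-PM hard to achieve by chance.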
The NQF has also endorsed PRO-PMs based on CAHPS, surveys of patient satisfaction with care developed by the Department of Health & Human Services. CAHPS tools exist for a wide variety of specific health services, including surgical care, pediatric care, home health care, and care of chronic conditions, among others. The PRO-PMs based on these surveys are very simple threshold measures, capturing only the percentage of patients who give “excellent” ratings on each part of the survey. This lets providers focus on bringing their quality of care to the highest standards, rather than making marginal improvements from “satisfactory” to “good” ratings, and helps them break down gaps in quality into specific aspects of care – for instance, the quality of information provided to help patients recover from surgery.
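This “top-box” style of scoring is the simplest PRO-PM construction of all. A one-line sketch, with invented rating labels and data:

```python
# Top-box scoring as described for the CAHPS-based PRO-PMs: only the top
# rating counts toward the measure. The labels and data are invented.

ratings = ["excellent", "good", "excellent", "satisfactory", "excellent"]
pct_excellent = 100 * ratings.count("excellent") / len(ratings)
```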
These are basic elements of care that deserve to be measured, but members of the NQF’s person-centered care committees know that the tools used to quantify them are anything but simple. It is important to remember, when discussing what our patients can tell us about their care, that even the most basic PROs still represent unexplored frontiers for our health care system. In the next few years, the NQF hopes to endorse PRO-PMs examining patients’ functional status, and the degree to which their symptoms affect their quality of life. “These kinds of outcomes are critically important to patients and their families,” says Adams, yet there are no clear, national standards for measuring them.
There are many obstacles to validating PRO-PMs, but at the end of the day, the biggest challenge to their adoption may be lack of awareness. It will take the voices of respected organizations like the NQF to give the best PRO-PMs a national profile, and help our health care system turn its attention to the outcomes that matter most to patients.
Read Part 1: Judy Hibbard and the Patient Activation Measure
Read Part 2: Eliza and the Vulnerability Index