Imagine sitting down for an appointment with a therapist – or any care provider. Perhaps it's the first time you've seen this provider, or the first time in a while. You'll likely need to fill out a form with a set of questions to ensure they know why you're there or how you're doing.
Now imagine that the symptom questionnaire is quite confusing, to the point where you need to ask someone to clarify what the questions are asking. Many people don't have to imagine – this confusion is common, according to a study led by a University of Arizona psychologist.
Questionnaires like these have been the standard since the 1990s. The Patient Health Questionnaire – or PHQ, as it's known – comes in various forms and has helped health providers of all sorts to triage symptoms and lay the foundation for treatment plans. Its use is mandated by the National Institutes of Health and other governmental agencies.
The paper, to be published in the journal JAMA Psychiatry, suggests that the questionnaire is confusing for most people who use it, meaning that psychologists and other care providers may create treatment plans based on poor data from patients who misunderstand how to answer, said Zachary Cohen, the paper's lead author.
Cohen, an assistant professor in the U of A Department of Psychology, in the College of Science, first became curious about potential problems with the PHQ and similar questionnaires 14 years ago during his clinical training. Almost every patient, Cohen recalled, would ask for some guidance on how to answer questions, and clinicians often can say only, "Oh, just do your best."
That's not a great solution – clinically or scientifically – said Cohen, who leads the Personalized Treatment Lab, where he works on scalable digital therapies and studies how to tailor mental health treatments to individual patients to improve outcomes.
"This is the questionnaire that everyone fills out, and it's such a common experience of being confused – it's potentially catastrophic," Cohen said. "Because everything we do in mental health research is dependent on, to a large extent, people's report of their mental health symptoms, so if you don't have good data on that, you're building a house of cards."
Frequent symptoms, or frequently bothered by symptoms?
The study zeroed in on a particularly problematic phrasing in the questionnaire's instructions, which asks patients to indicate the frequency with which they've been "bothered by" any one of a list of symptoms, including oversleeping, overeating, concentration difficulties, being fidgety or restless, and others. The answer options range from "nearly every day" to "not at all."
Cohen and his colleagues first asked roughly 850 participants to fill out a Patient Health Questionnaire.
Participants were then asked to consider a hypothetical: Imagine they had in fact overslept nearly every day for a week but were not bothered by the oversleeping, perhaps because they were on vacation. In that scenario, would they respond with "not at all" because they were never bothered, or "nearly every day" because they overslept nearly every day?
Then, they were asked whether their actual answers on the PHQ they had just completed reflected how often they had experienced the symptoms or how often they were bothered by the symptoms. Finally, they were asked how they would fill out the questionnaire in the future.
"If you're reading the instructions to the letter, you would actually expect a 'not at all' there," Cohen said, because the patient was never bothered by their sleeping in even though they did it every day.
But the results showed that the instructions were not interpreted consistently among participants – only 328 participants, or 38%, answered with the correct response "not at all." Further, only 146 participants, or 17%, indicated that they would answer based on "bothered by" if they filled out the PHQ in the future.
"If you're using smartwatches to do passive sensing of sleep, and everybody is sleeping too much, but half of the people are saying that they sleep too much every day, and half of the people are saying not at all because they're never bothered by it, then your passive sensing will look like noise when it's not really," he added. "Everybody really is sleeping too much, but you're comparing apples and oranges."
The results indicate that the test can't accurately tell providers what patients are experiencing, Cohen said.
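Cohen's smartwatch example can be illustrated with a small sketch in Python. It is not taken from the study: the sleep values, the day-to-score cutoffs in `frequency_score`, and the half-and-half split of respondents are all hypothetical, chosen only to show how mixing the two readings of "bothered by" could weaken the link between sensor data and questionnaire answers.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def frequency_score(days_overslept):
    """Map days overslept (out of 14) to a 0-3 answer.

    The cutoffs are an assumption made for this illustration only.
    0 stands for "not at all", 3 for "nearly every day".
    """
    if days_overslept <= 1:
        return 0
    if days_overslept <= 7:
        return 1
    if days_overslept <= 11:
        return 2
    return 3


# Hypothetical sensor readings: two groups of eight people with the same sleep.
group_days = [0, 2, 4, 6, 8, 10, 12, 14]
sensor_days = group_days + group_days

# Case 1: everyone answers based on how often they overslept.
all_frequency = [frequency_score(d) for d in sensor_days]

# Case 2: the second group reads the item as "bothered by" and, not being
# bothered, answers "not at all" (0) no matter how much they overslept.
mixed = [frequency_score(d) for d in group_days] + [0] * len(group_days)

r_all_frequency = pearson(sensor_days, all_frequency)
r_mixed = pearson(sensor_days, mixed)

print(f"everyone reports frequency: r = {r_all_frequency:.2f}")
print(f"mixed interpretations:      r = {r_mixed:.2f}")
```

With these made-up numbers, the sensor data track the questionnaire answers closely when everyone reports frequency, and far less closely once half the respondents answer "not at all" because they were not bothered, even though the underlying sleep is identical in both cases.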
"Most of the time when we use these questionnaires we want to know about symptoms of depression, so the 'bothered by' part really matters," he added. "Think about the recent explosion of people using GLP-1 weight loss drugs. For someone who is on Ozempic, experiencing reduced appetite probably shouldn't be counted as an indicator of depression – that's the main reason they're taking the drug."
Straightforward, comprehensive fix
The study is only the first step, Cohen said, toward fully understanding the issues behind these questionnaires. But given how widely the PHQ is used, he added, the impact could be far-reaching.
"I struggle to imagine that it's a good thing to have some people answering one way, and some people answering the exact opposite way for the same experience," he said. "There's just no way that that can be a good thing – and in this paper, we show that it's happening and provide preliminary evidence of how that can be a problem."
Simply altering the language in the questions related to symptom frequency and distress, Cohen said, is a logical starting place as the field continues to investigate the implications.
"If I want to know how frequently people are oversleeping, then just change the wording of the instructions and make it very clear that I'm just asking about the frequency of oversleeping," he said. "Alternatively, if I want to avoid mischaracterizing things like intentional appetite reduction as a symptom of depression, I could change the wording to better emphasize the 'bothered by' component. Obviously, we'd want to do the studies that would show that that does solve some of these problems, but I would imagine that that would be both a straightforward and decently comprehensive fix."
The paper's co-authors were Margarita Panayiotou with the University of Manchester in England; Josip Razum from the University of Iceland and the Ivo Pilar Institute of Social Sciences in Croatia; Gudrun Eisele with KU Leuven, a research university in Belgium; Shirley B. Wang from Yale University; and Eiko I. Fried with Leiden University in the Netherlands.