Facebook Posts Better at Predicting Diabetes, Mental Health Than Demographic Info

PHILADELPHIA – Language in Facebook posts may help identify conditions such as diabetes, anxiety, depression and psychosis in patients, according to a study from Penn Medicine and Stony Brook University researchers. It's believed that language in posts could serve as an indicator of disease and, with patient consent, could be monitored just like physical symptoms. The study was published in PLOS ONE.

"This work is early, but our hope is that the insights gleaned from these posts could be used to better inform patients and providers about their health," said lead author Raina Merchant, MD, MS, the director of Penn Medicine's Center for Digital Health and an associate professor of Emergency Medicine. "As social media posts are often about someone's lifestyle choices and experiences or how they're feeling, this information could provide additional information about disease management and exacerbation."

Using an automated data collection technique, the researchers analyzed the entire Facebook post history of nearly 1,000 patients who agreed to have their electronic medical record data linked to their profiles. The researchers then built three models and compared their predictive power: one that analyzed only the language of the Facebook posts, another that used demographics such as age and sex, and a third that combined the two datasets.

Looking into 21 different conditions, researchers found that all 21 were predictable from Facebook alone. In fact, 10 of the conditions were better predicted through the use of Facebook data instead of demographic information.

Some of the Facebook data found to be more predictive than demographic data seemed intuitive. For example, "drink" and "bottle" were shown to be more predictive of alcohol abuse. However, other associations were less obvious. For example, the people who most often used religious language like "God" or "pray" in their posts were 15 times more likely to have diabetes than those who used these terms the least. Additionally, words expressing hostility, like "dumb" and some expletives, served as indicators of drug abuse and psychoses.

"Our digital language captures powerful aspects of our lives that are likely quite different from what is captured through traditional medical data," said the study's senior author Andrew Schwartz, PhD, a visiting assistant professor at Penn in Computer and Information Science, and an assistant professor of Computer Science at Stony Brook University. "Many studies have now shown a link between language patterns and specific disease, such as language predictive of depression or language that gives insights into whether someone is living with cancer. However, by looking across many medical conditions, we get a view of how conditions relate to each other, which can enable new applications of AI for medicine."

Last year, many members of this research team were able to show that analysis of Facebook posts could predict a diagnosis of depression as much as three months earlier than a diagnosis in the clinic. This work builds on that study and shows that there may be potential for developing an opt-in system for patients that could analyze their social media posts and provide extra information for clinicians to refine care delivery. Merchant said that it's tough to predict how widespread such a system would be, but it "could be valuable" for patients who use social media frequently.

"For instance, if someone is trying to lose weight and needs help understanding their food choices and exercise regimens, having a healthcare provider review their social media record might give them more insight into their usual patterns in order to help improve them," Merchant said.

Later this year, Merchant will conduct a large trial in which patients will be asked to directly share social media content with their health care provider. This will provide a look into whether managing this data and applying it is feasible, as well as how many patients would actually agree to their accounts being used to supplement active care.

"One challenge with this is that there is so much data and we, as providers, aren't trained to interpret it ourselves — or make clinical decisions based on it," Merchant explained. "To address this, we will explore how to condense and summarize social media data."

The current study received funding from a Robert Wood Johnson Foundation Pioneer Award.

Other authors on this study include David A. Asch, Patrick Crutchley, Lyle H. Ungar, Sharath C. Guntuku, Johannes Eichstaedt, Shawndra Hill, Kevin Padrez, and Robert J. Smith.
