Research published in PNAS marries social media data with medical-outcomes data for the first time.
Stony Brook, NY, October 19, 2018 – In any given year, depression affects more than six percent of the adult population in the United States (some 16 million people), but fewer than half receive the treatment they need. What if an algorithm could scan social media and point to linguistic red flags of the disease before a formal medical diagnosis had been made? Research published in PNAS shows this is now more plausible than ever. By analyzing social media data shared by consenting users across the months leading up to a depression diagnosis, researchers from Stony Brook University and the University of Pennsylvania found their algorithm could accurately predict future depression. Indicators of the condition included mentions of hostility and loneliness, words like “tears” and “feelings,” and more frequent use of first-person pronouns like “I” and “me.”
“What people write in social media and online captures an aspect of life that’s very hard in medicine and research to access otherwise. It’s a dimension that’s relatively untapped compared to biophysical markers of disease,” says H. Andrew Schwartz, an Assistant Professor in the Department of Computer Science at Stony Brook University, the paper’s senior author, and a principal investigator of the World Well-Being Project. “Conditions like depression, anxiety, and PTSD, for example, you find more signals in the way people express themselves digitally.”
For six years, researchers in the World Well-Being Project (WWBP), based in Penn’s Positive Psychology Center and Stony Brook’s Human Language Analysis Lab, have been studying how the words people use reflect their inner feelings and contentedness. In 2014, Johannes Eichstaedt, WWBP founding research scientist and a postdoctoral fellow at Penn, started to wonder whether it was possible for social media to predict mental health outcomes, particularly for depression.
“Social media data contain markers akin to the genome. With surprisingly similar methods to those used in genomics, we can comb social media data to find these markers,” Eichstaedt explains. “Depression appears to be something quite detectable in this way; it really changes people’s use of social media in a way that something like skin disease or diabetes doesn’t.”
Eichstaedt and Schwartz teamed up with colleagues Robert J. Smith, Raina Merchant, David Asch, and Lyle Ungar from the Penn Medicine Center for Digital Health for this study. Rather than do what previous studies had done—recruit participants who self-reported they had depression—the researchers identified data from people consenting to share Facebook statuses and electronic medical record information, then analyzed the statuses using machine-learning techniques to distinguish those with a formal depression diagnosis.
“This is early work from our Social Mediome Registry from the Penn Medicine Center for Digital Health, which joins social media with data from health records,” Merchant says. “For this project, all individuals are consented, no data is collected from their network, the data is anonymized, and the strictest levels of privacy and security are adhered to.”
Nearly 1,200 people consented to provide both digital archives. Of these, just 114 had a diagnosis of depression in their medical records. The researchers matched every person with a depression diagnosis with five people without one to act as controls. That gives 684 people (114 cases plus 570 controls); one was excluded for having too few words in status updates, leaving a total sample of 683. The idea was to create as realistic a scenario as possible to train and test the researchers’ algorithm.
“This is a really hard problem,” Eichstaedt says. “If 683 people present to the hospital and 15 percent of them are depressed, would our algorithm be able to predict which ones? If the algorithm says no one was depressed, it would be 85 percent accurate.”
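Eichstaedt’s point about accuracy can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses a hypothetical cohort of 100 people with the 15 percent rate from the quote, not the study’s data, and the function names are made up for this example. It suggests why a model that always answers “not depressed” can look accurate while finding none of the people who are depressed.

```python
# Illustrative sketch only: a hypothetical cohort, not the study's data.

def always_negative(n):
    """Hypothetical baseline: predict 'not depressed' (0) for everyone."""
    return [0] * n

def accuracy(labels, preds):
    """Share of predictions that match the true labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)

def recall(labels, preds):
    """Share of truly depressed people (label 1) that the model flags."""
    positives = sum(labels)
    hits = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    return hits / positives

# Assumed cohort: 15 depressed (1) and 85 not depressed (0),
# mirroring the 15 percent figure in the quote.
labels = [1] * 15 + [0] * 85
preds = always_negative(len(labels))

baseline_accuracy = accuracy(labels, preds)
baseline_recall = recall(labels, preds)

print(f"accuracy: {baseline_accuracy:.2f}, recall: {baseline_recall:.2f}")
```

Accuracy alone rewards the do-nothing answer here, which is why a useful screening tool has to be judged on how many depressed people it actually identifies.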
To build the algorithm, Eichstaedt, Smith, and colleagues looked back at 524,292 Facebook updates from the years leading up to diagnosis for each individual with depression and from the same time span for each control. They determined the most frequently used words and phrases, then modeled 200 topics to suss out what they called “depression-associated language markers.” Finally, they compared how, and how often, depressed and control participants used such phrasing.
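The idea of comparing language use between two groups can be pictured with a toy example. The sketch below is not the researchers’ pipeline, which modeled 200 topics across hundreds of thousands of updates; it only measures how often first-person pronouns appear in two small sets of made-up status updates. The sample texts, the pronoun list, and the function name are all assumptions for illustration.

```python
import re
from collections import Counter

# Assumed pronoun list for illustration; contractions such as "I'm"
# are not counted by this simple tokenizer.
FIRST_PERSON = {"i", "me", "my", "myself", "mine"}

def first_person_rate(updates):
    """Fraction of all words in the updates that are first-person pronouns."""
    words = []
    for text in updates:
        # Lowercase, then keep runs of letters and apostrophes as words.
        words.extend(re.findall(r"[a-z']+", text.lower()))
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[w] for w in FIRST_PERSON) / len(words)

# Made-up status updates, not drawn from any real user or from the study.
group_a = ["I feel like nobody hears me", "my head hurts and I miss you"]
group_b = ["great game last night", "we went hiking with friends"]

rate_a = first_person_rate(group_a)
rate_b = first_person_rate(group_b)

print(f"group A: {rate_a:.3f}, group B: {rate_b:.3f}")
```

A real analysis would need far more text per person and a statistical test before reading anything into a difference between groups.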
They learned that these markers comprised emotional, cognitive, and interpersonal processes such as hostility and loneliness, sadness and rumination, and could predict future depression as early as three months before first documentation of the illness in a medical record.
“There’s a perception that using social media is not good for one’s mental health, but it may turn out to be an important tool for diagnosing, monitoring, and eventually treating it,” Schwartz says. “Here, we’ve shown that it can be used with clinical records, a step toward improving mental health with social media.”
Eichstaedt sees long-term potential in using these data as a form of unobtrusive screening. “The hope is that one day, these screening systems can be integrated into systems of care,” he says. “This tool raises yellow flags; eventually the hope is that you could directly funnel people it identifies into scalable treatment modalities.”
Despite some limitations to the study, including a distinctive urban sample, and limitations in the field itself—not every depression diagnosis in a medical record meets the gold standard that structured clinical interviews provide, for example—the findings offer a potential new way to uncover and get help for those suffering from depression.