Facebook posts alone can predict some 21 diseases and conditions, including diabetes, hypertension, anxiety and depression, a new study reveals.
The study, published in PLOS ONE by researchers at Stony Brook University and Penn Medicine, included 999 participants who consented to share their social media posts and medical records. It involved an analysis of approximately 20 million words. The researchers looked at language patterns (words, phrases, and clusters of related words) and their statistical association with 21 standard categories of medical record diagnoses.
“Our predictions from language captures diagnosis of diabetes about as well as predictions based on one’s body mass index,” said senior author H. Andrew Schwartz, assistant professor of Computer Science in the College of Engineering and Applied Sciences. “We can treat language pattern analogous to a genome and see similar diseases seem to have similar linguistic patterns.”
The method appears to be especially strong at predicting mental health conditions, such as anxiety, depression, and psychosis, in some patients. And for certain conditions, such as diabetes and mental health conditions, Facebook posts predict the diagnosis better than demographic information does.
“Our digital language captures powerful aspects of our lives that are likely quite different from what is captured through traditional medical data,” Schwartz said. “By looking across many medical conditions, we get a view of how conditions relate to each other, which can enable new applications for AI for medicine.”
The researchers used three models to compare predictive power. One analyzed only Facebook post language, another used demographics such as age and sex, and a third combined the two data sources. The researchers found that Facebook posts alone predicted all 21 conditions, and that for 10 of those conditions the posts were better predictors than demographic information.
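As a rough illustration of this kind of three-model comparison, the sketch below scores three hypothetical models on the same set of patients using the area under the ROC curve (AUC). The choice of AUC as the metric, the helper name `auc`, and every label and score shown are assumptions made for illustration; none of them are taken from the study.

```python
# Toy illustration only: the labels and scores below are invented, not study data.

def auc(labels, scores):
    """Return the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical diagnosis labels (1 = condition present in the medical record).
labels = [1, 1, 1, 0, 0, 0, 0, 1]

# Hypothetical risk scores produced by three hypothetical models.
model_scores = {
    "language_only":     [0.9, 0.7, 0.4, 0.3, 0.75, 0.2, 0.1, 0.8],
    "demographics_only": [0.6, 0.4, 0.5, 0.5, 0.6, 0.3, 0.2, 0.7],
    "combined":          [0.9, 0.7, 0.35, 0.3, 0.4, 0.2, 0.1, 0.9],
}

# Score every model against the same labels so the numbers are comparable.
results = {name: auc(labels, scores) for name, scores in model_scores.items()}
for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so evaluating each model against the same labels gives one way to say that one source of information predicts a condition "better" than another.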