Everything big data claims to know about you could be wrong
When it comes to understanding what makes people tick — and get sick — medical science has long assumed that the bigger the sample of human subjects, the better. But new research led by the University of California, Berkeley, suggests this big-data approach may be wildly off the mark.
That's largely because emotions, behavior and physiology vary markedly from one person to the next and one moment to the next. So averaging out data collected from a large group of human subjects at a given instant offers only a snapshot, and a fuzzy one at that, researchers said.
The findings, published this week in the journal Proceedings of the National Academy of Sciences, have implications for everything from mining social media data to customizing health therapies, and could change the way researchers and clinicians analyze, diagnose and treat mental and physical disorders.
"If you want to know what individuals feel or how they become sick, you have to conduct research on individuals, not on groups," said study lead author Aaron Fisher, an assistant professor of psychology at UC Berkeley. "Diseases, mental disorders, emotions, and behaviors are expressed within individual people, over time. A snapshot of many people at one moment in time can't capture these phenomena."
Moreover, continuing to rely on group data in the medical, social and behavioral sciences leads to misdiagnoses, wrong treatments and, more generally, scientific theory and experimentation that are not properly calibrated to the differences between individuals, Fisher said.
That said, a fix is within reach: "People shouldn't necessarily lose faith in medical or social science," he said. "Instead, they should see the potential to conduct scientific studies as a part of routine care. This is how we can truly personalize medicine."
Plus, he noted, "modern technologies allow us to collect many observations per person relatively easily, and modern computing makes the analysis of these data possible in ways that were not possible in the past."
Fisher and fellow researchers at Drexel University in Philadelphia and the University of Groningen in the Netherlands used statistical models to compare data collected on hundreds of people, including healthy individuals and those with disorders ranging from depression and anxiety to post-traumatic stress disorder and panic disorder.
In six separate studies, they analyzed data collected via online and smartphone self-report surveys, as well as electrocardiogram tests that measured heart rates. The results consistently showed that what's true for the group is not necessarily true for the individual.
For example, a group analysis of people with depression found that they worry a great deal. But when the same analysis was applied to each individual in that group, researchers discovered wide variations that ranged from zero worrying to agonizing well above the group average.
Moreover, in looking at the correlation between fear and avoidance – a common association in group research – they found that for many individuals, fear did not cause them to avoid certain activities, or vice versa.
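To see how a group-level correlation can mask what happens inside each person, consider a toy illustration in Python. The numbers below are invented for the example and are not the study's data, and the plain Pearson correlation used here is an assumption chosen for simplicity, not the statistical model the researchers used. Each hypothetical person reports fear and avoidance three times.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up repeated measurements (hypothetical people, arbitrary units).
# People differ in their overall level, and in how fear and avoidance
# move together from one moment to the next.
people = {
    "person_a": {"fear": [1, 2, 3], "avoidance": [3, 2, 1]},
    "person_b": {"fear": [4, 5, 6], "avoidance": [4, 5, 6]},
    "person_c": {"fear": [7, 8, 9], "avoidance": [9, 8, 7]},
}

# Group analysis: pool every observation from every person.
all_fear = [v for p in people.values() for v in p["fear"]]
all_avoidance = [v for p in people.values() for v in p["avoidance"]]
pooled_r = pearson(all_fear, all_avoidance)

# Individual analysis: one correlation per person, over that person's own time series.
within_r = {name: pearson(p["fear"], p["avoidance"]) for name, p in people.items()}

print(f"pooled correlation: {pooled_r:.2f}")
for name, r in within_r.items():
    print(f"{name}: {r:.2f}")
```

In this contrived data set the pooled correlation is strongly positive (about 0.87), yet for two of the three people fear and avoidance move in opposite directions over time. The group result comes mostly from differences between people, not from how either measure changes within any one of them.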
"Fisher's findings clearly imply that capturing a person's own processes as they fluctuate over time may get us far closer to individualized treatment," said UC Berkeley psychologist Stephen Hinshaw, an expert in psychopathology and faculty member of the department's clinical science program.
In addition to Fisher, co-authors of the study are John Medaglia at Drexel University and Bertus Jeronimus at the University of Groningen.