Thursday, February 6, 2014

Dethroning Causation

I'm reading a book called Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Mayer-Schönberger and Cukier. I just got to the chapter "Correlation," which discusses how causation is being dethroned by correlation. I remember my introduction to research and statistics 1 classes, where causation was trumpeted as the gold standard of research. The aim of traditional research has been to take data from a small group and make claims about a larger group.

Big data shakes things up because we can now get lots of data from the larger group itself. The book reviews an example about Google, which was able to predict the spread of the flu based on the search terms people were entering. The search terms didn't cause the spread, but the correlation was extremely valuable to health organizations and governments wanting to target interventions and to track or stop the spread.

Big data in education means we can look for meaningful relationships among all sorts of data being produced in electronic learning contexts. Much of the data we can get from the electronic world would be unreasonable for people to track in a traditional face-to-face classroom. Machines can capture, store, process, and analyze information much more efficiently than people can. We can look at anything from big groups, like all student data across the university, to a single user's data from one learning activity. And generalizing doesn't seem so important anymore. The machine can learn and revise its predicting and categorizing models as it gathers more data.

I have thought more and more about what value theory or traditional empirical research has anymore. If you just need to let the data speak to find meaningful relationships and patterns, what does the "why" matter as long as the "what" helps us get things done? From what I have learned from Dr. Gibbons's Explore, Explain, Design framework (Gibbons & Bunderson, 2005), design work, which focuses on achieving outcomes rather than explaining why outcomes were achieved, is becoming more valuable in the world of big data than the traditional explain work done by science. This very notion was argued by Chris Anderson in Wired: http://www.wired.com/science/discoveries/magazine/16-07/pb_theory

So, why do theory and explain research matter? As Mayer-Schönberger and Cukier argue, we use theories and models to build the algorithms that analyze data, and we use theories and models to make sense of the relationships discovered. "After all," they said, "Google used search terms as a proxy for the flu, not the length of people's hair." Theory and models increasingly matter because they enable us to make sense of all the data the digital world makes available to us.
