What is ‘Big data’ and does it tell us anything?


Consider this: almost every important story these days is accompanied by a chart, an infographic or a detailed econometric analysis that even the creators of the graphic don't fully understand. It is filled with 'Big data', the buzzword of the moment. So, what is 'Big data' and why are we so obsessed with it? As a friend pointed out recently, there isn't anything 'Big' about Big data; it just happens to be data, plain and simple. The 'Big' is a buzzword, a spectacle created to make it look attractive and sexy, given our obsession with things big and flashy.

If you are remotely engaged in the study of the social sciences, nonprofit management or related fields, chances are you have already been exposed to the myth that "data can solve all our problems". Data, Big data and 'scientific data' are all bandied about as if, just by having access to them, we could magically solve all our problems. Whether it is fixing our national education system(s), healthcare reform or any other pressing social policy issue, data is somehow presented as the Holy Grail. This is taught even in public policy schools, where one can see the dominance of the 'positivist' social sciences, those that rely heavily on quantitative analysis. But the question remains: how much of this is actually true? How can evaluation metrics and infographics alone help us understand complex human realities? Surely they are helpful and have their place. But the overarching emphasis we place on numeric, quantifiable data seems misplaced.

Consider education reform. A recent Op-Ed in The New York Times points to this debate using the very dichotomy I have brought up for discussion – the one between quantitative, data-driven analysis and the more 'touchy-feely' qualitative analysis. Stanley Fish, the author, writes of former Harvard University President Derek Bok and his views on two ways of looking at education reform: "The first is an evidence-based approach to education … rooted in the belief that one can best advance teaching and learning by measuring student progress and testing experimental efforts to increase it." The second "rests on a conviction that effective teaching is an art which one can improve over time through personal experience and intuition without any need for data-driven reforms imposed from above." Bok goes on to point out that, given this dichotomous framing, it is indeed hard to measure the 'right' metrics of success in the long term. The problem, he argues, is not just one of measurement but of what to measure. This is eerily similar to how Oscar Wilde defined a cynic: someone who knows the price of everything and the value of nothing. Our education system seems to have been reduced to this state of affairs, too – a cynical, quantified, ossified mechanism where at every step we ask, "What is the output we expect?", whether as individuals or as a society. The value of education just doesn't come up in discussions any more.

All of this is not to suggest that number crunching is pointless. Quite the opposite: numerical analysis and large quantitative studies have their place and can be extremely useful and insightful when large populations must be studied with limited resources. At the same time, we must acknowledge their limitation: such surveys and data points do not yield in-depth information or deep insights. Going back to the example of education reform, higher education is aggressively moving towards more quantitative analysis, and there are several indicators of this. Consider MOOCs, for instance. These are growing by the day, and their rationale seems to be built around providing access to information, while leaving the question of 'learning outcomes' for the future. No one seems to be worrying about these 'soft' issues for now. It is all about the number of courses taken or the number of students in a course. Outreach has trumped quality of teaching and learning.

A brief detour into the history of these 'paradigm wars,' as they were known in the 1960s and '70s, is helpful. Careers were made or destroyed depending on which 'camp' one belonged to. The dominance of 'scientific', data-driven research and analysis peaked at that point, and it was only with the emergence of alternative theorists, particularly feminist and postcolonial scholars, that this paradigm of quantitative analysis was challenged. As Denzin and Lincoln (2011) say, "Critical pedagogy, critical theorists, and feminist analyses fostered struggles to acquire power and cultural capital for the poor, non-whites, women and gays." They further elaborate that all research brings with it certain ontological and epistemological commitments: "All research is interpretive: guided by a set of beliefs and feelings about the world and how it should be understood and studied. Some beliefs may be taken for granted, invisible or only assumed, whereas others are highly problematic and controversial. Each paradigm makes particular demands on the researcher, including the questions that are asked and the interpretations that are brought to them."

What this implies is that even so-called 'objective' and 'rational' data-driven studies are riddled with biases: what kinds of questions are asked, who is interviewed, what survey instrument is used and, above all, what is studied and who is doing the studying and the analysis. One of the most influential approaches to these questions of knowledge, power and dominance is Michel Foucault's. His Discipline and Punish takes a genealogical look at the emergence of power, dominance and the rise of 'expertise'. This 'expertise' was, and still is, dominated by those who have access to 'knowledge'. Such specialization, whether in the medical profession, the legal system or any other system, leads to others being treated as 'subjects'. A similar logic is at play when we consider the public policy measures and governance mechanisms that governments use to 'guard', protect and oversee their citizens. The entire securitization discourse is a classic example of this phenomenon of 'expertise', where a handful of people decide the fate of millions, perhaps billions, of others, all in the name of 'national security'.

While I am not advocating crystal ball gazing as a strategy, the extremes to which we, as a society, go to gather and interpret certain kinds of data are hurting us rather than informing us. Political analysis can be done by means that are not necessarily statistical or numeric. War games and game-theory simulations with zero-sum outcomes program us to think like robots, and unfortunately much of the current discourse seems tuned towards turning us into some form of automatons, with limited capacity for judgment.

I believe our obsession with certain kinds of data has something to do with our need for control, and with the sense of certainty that data offers us. With statistics, surveys and some data, we can feel 'secure' that we are doing something to 'fix' the problem. Indeed, many agencies and organizations hide behind their 'progress reports' and quantitative analyses of how many houses they built, how many sick were cured, and other such numbers. Monitoring and evaluation systems tell us that the indicators for poverty are showing progress. The GDP measures are ticking up, even though the ground realities may be different, with inequality growing as GDP grows. It all looks good on paper, and the annual reports will get a good round of applause. Meanwhile, the complicating evidence – human narratives, phenomenological studies and the perspectives of the participants involved (often 'subjects' who end up as just a statistic) – is ignored. Robust data analysis should couple the quantitative and the qualitative to inform us of the 'truth', even if there are several versions of it.

Finally, it may be prudent to remember that data is not just numerical. Rich qualitative descriptions and historical and archival materials are also 'data' points, and equally valid. Perhaps the solution to many of our problems of evaluation, judging and planning lies in balancing this 'hard' data with some of the 'soft' stuff. Bringing the human element back into our data collection, analysis and usage may be the way for us to get back in touch with our own humanity.



