3. Challenges & opportunities with big data

The American Evaluation Association 2015 Conference salon on big data and evaluation sent a clear message: big data is an addition to the toolbox, leading to evolution in evaluation rather than revolution against traditional methods and approaches to scientific inquiry (more detail here). Pages 4-12 of this report, from the Aspen Institute’s roundtable on big data, provide detail on the debate about the role of theory in a world where speedy statistical algorithms can mine massive datasets for correlations. Again, the conclusion is a future with old plus new, with the ability to mine big data supporting the generation of new theories.

There is also potential for big data to be applied, using real-time monitoring and predictive modelling, to the earliest stages of evaluation, for example in evaluability assessment and developmental evaluation, to inform programme design and better identify when the time is right to start trying to capture impact. And there is no doubting the value of bigger datasets that can be cleaned and analysed faster.

But these opportunities are not without challenges.

Big data comes from different sources. Perhaps the most obvious in our daily lives is social media data, from Twitter, Facebook and the like. Then there is data generated by organisations in how they interact or transact with each other and with their clients or customers: think orders, invoices, stock-taking and other records. There is also administrative data collected by government (which I discuss in more detail here), and data from sensors and the internet of things.

The way data is generated leads to huge bias, and there is inequality in how data can be accessed. Both are important issues for social research and evaluation.

For example, there are massive inequalities in connectivity, so certain groups are likely to be underrepresented in digital data, and there is a lack of clarity about how people use social media. Some sources are much easier to get hold of and analyse than others: data from Twitter is much more accessible than data from Facebook. Certain administrative datasets require organisations or individuals to meet criteria to get approval for access, so not everyone can access the same information. In much the same way, a government department will have access to data, skills and resources for evaluating a programme that a small charity won’t.

Another concern is the applicability of approaches from big data science, which are focused on describing (what is happening now) or predicting (what could happen in the future), when social research is often concerned with causality and generalizability (trying to understand what happened in the past, why and in what circumstances it could happen again).

From a perspective of political science but with relevance to social science research more broadly, Justin Grimmer discusses how big data, while insufficient on its own, is able to improve causal inferences. He argues:

Massive datasets and social networking sites provide opportunities to design experiments on a scale that was previously impossible in the social sciences. Subtle experiments on a large number of people provide the opportunity to test social theories in ecologically valid settings. The massive scale of the experiments also provides the chance to move away from coarse treatments estimated at the population level to more granular treatments in more specific populations. The result will be a deeper understanding of the social world…There also are numerous opportunities to combine experimental design with machine-learning algorithms to learn how high-dimensional affect response.

It is recognised that these benefits depend on effective collaboration between data scientists and social scientists, who bring the experience of causal inference and the contextual expertise necessary for hearing the signal in the noise. For example, Cambridge University’s Undergraduate Quantitative Methods Centre teaches the social scientists of the future the advanced quantitative skills they will need to work with massive datasets.

Or, as put another way by Grimmer:

“For ‘big data’ to actually be revolutionary, we must recognise that we are all social scientists now – regardless of in which field our degree is.”
