What is big data? What does it mean for the work I do? What is the role for big data in evaluation? What skills and experience do I need to develop?
As you can probably tell from the questions, I am not a programmer, statistician or data scientist. I have quite a bit of experience analysing data – I know my p-value from my confidence interval – and I work with statisticians and economists. But data is growing (& growing) and how to work with it is changing. The purpose of this blog is to help me focus my thinking and share my learning.
If you are involved in commissioning, designing or delivering social research, evaluations or programmes I hope you find my learning journey and the resources I mention helpful. If you are already an expert in all things data, please get in touch to nudge me in the right direction or tell me where I am going wrong.
One of the first things I read was this article by Burrows and Savage on big data and the methodological challenges of empirical sociology. They highlight the importance of traditional research methods and new approaches from data science working with each other, to avoid polarisation in thinking and practice. They discuss how much of the study of social phenomena now takes place outside the traditional interview / survey methods of sociological researchers. It now encompasses, among others, journalists, TV documentary makers, economists, and the data collected and analysed by commercial organisations (a list to which I would add service designers). They also see a role for data, often produced by commercial organisations, that can “track, trace, record and sense” our real time interactions with the world in new ways.
This NCRM podcast by Mark Birkin expands on these ideas. He talks about a 4th paradigm in research, with big data providing a source for new hypotheses, and about practical issues such as whether the census will be supplemented or even replaced by new, real time sources of data.
They sparked my thinking about what I, or my clients, might be doing in the future.
I am going to be working with more and different types of data in the future. More volume, more dynamic, and more that is not formally collected or structured for the purpose of a study. All of which will raise new challenges for understanding context, validity, reliability, bias and ethics.
How clients identify questions to explore could increasingly be based on initial data analysis to highlight what is interesting and possible. This would complement, and be more efficient than, relying only on a small group of individuals within an organisation listing ‘what we want to know is…’.
Team roles may be redefined: storytellers rather than report writers, data scientists rather than analysts, and deep context experts, to identify the signal through increasing amounts of noisy data. Managers will need some new skills and understanding to manage them.