There is no single definition of big data, a term that has only been in common use since about 2011. More formal definitions tend to combine several ‘V’s with a need for specialist software to process these very large and complex datasets (rather than, for example, Microsoft Access or Excel).
- volume (ever-increasing storage requirements, from terabytes (1 terabyte contains 2,000 hours of CD-quality music) to petabytes and up)
- velocity (data generated, processed and changing at high speed)
- variety (data in different forms from different sources – images, recordings, online interactions, sensing technologies etc.)
- veracity (uncertainty in the data, due to varying noise and processing errors)
Or, more simply, big data is “data that is inconveniently large” (Dr Clair Alston, Queensland University of Technology): data that exceeds an organisation’s capacity to store and analyse it to inform its work. The first half of this short article provides an excellent overview of different definitions, including the reasons why there are no universal benchmarks for the ‘bigness’ of big data.
There is much talk about organisations needing to ‘do more with their data’. Perhaps media and business attention on big data has heightened our awareness about data in general. Many people involved in social policy or public service delivery work with quite a lot of data (for example, records of thousands or even hundreds of thousands of interactions with service users), but this is rarely big data as defined above. While their data might feel ‘inconveniently large’, it can often be analysed with less technical data science skill and fewer resources than truly big data requires.
I came across organisations (also here, here and here) helping charities and social organisations use data to improve decision making and to take action. Nesta has documented specific examples: the Citizens Advice Bureau, for instance, used techniques for working with big data – natural language processing, predictive modelling and combining datasets – to turn impenetrable text data into a dashboard that mines tens of thousands of CAB consultations and queries, helping to better understand and address emerging social issues in the UK.
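The Nesta write-up does not publish code, but the core idea of mining free text to surface emerging issues can be sketched with a toy example. Everything below (the consultation snippets, the doubling threshold, the function names) is invented for illustration, not taken from the CAB dashboard itself:

```python
from collections import Counter
import re

def term_counts(notes):
    """Count word occurrences across a list of free-text notes."""
    words = []
    for note in notes:
        words.extend(re.findall(r"[a-z']+", note.lower()))
    return Counter(words)

def emerging_terms(last_quarter, this_quarter, min_growth=2.0):
    """Flag terms that appear at least three times this quarter
    and have at least doubled in frequency since last quarter."""
    before = term_counts(last_quarter)
    after = term_counts(this_quarter)
    return sorted(
        t for t, n in after.items()
        if n >= min_growth * max(before.get(t, 0), 1) and n >= 3
    )

# Invented consultation snippets, standing in for real CAB query text.
q1 = ["query about housing benefit", "debt advice requested",
      "housing repair dispute"]
q2 = ["eviction notice received", "eviction threatened over arrears",
      "eviction and rent arrears", "debt advice requested"]
print(emerging_terms(q1, q2))  # ['eviction']
```

A real system would of course work at a far larger scale and use proper natural language processing (stemming, phrase detection, topic models), but the principle of comparing text frequencies across time periods to spot emerging issues is the same.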
These are examples of how data can be used to better understand social need, action and communities of interest. So potential applications for social research and evaluation might be to mine data to help focus research questions, identify populations to sample from, or, more specifically, as a way of identifying experts for a Delphi questionnaire.
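The sampling application above can also be sketched briefly. The records and field names below are hypothetical; the point is simply that once interaction data is in a structured form, drawing a stratified sample for a survey or Delphi panel is straightforward:

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to `per_stratum` records from each stratum
    defined by the value of `key`."""
    strata = defaultdict(list)
    for r in records:
        strata[r[key]].append(r)
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample

# Invented service-user records for illustration.
users = [{"id": i, "region": "north" if i % 2 else "south"}
         for i in range(100)]
picked = stratified_sample(users, key="region", per_stratum=5)
print(len(picked))  # 10: five from each region
```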
But it was hard to find many examples of big data being used for social research and evaluation. The global information-sharing site Better Evaluation has limited content on this, with the focus again being on understanding public perceptions of, and engagement with, an issue.
The next two posts cover some of the challenges and opportunities in using big data for social research and evaluation, along with the examples I did come across.