Big Data has become ubiquitous in modern society. It challenges state-of-the-art data acquisition, computation, and analysis methods. Much focus has been placed on the application of Big Data methods, and less on the theoretical underpinnings of the field.
The new availability of data (administrative records, mobile devices, sensors, and many private sources), as well as new processing and analytical techniques, has the potential to transform the practice of science. In the social science context, the new data can potentially offer policymakers information that is much more current, granular, and richer in environmental detail than data produced by statistical agencies from surveys. Yet with these new research opportunities come challenges in making use of data that are no longer generated and disseminated by statistical agencies, but must instead be harvested from many individual, public, and some private sources.
Nonetheless, important scholarly work has been done that uses Big Data in ways that are valuable to policymakers, in areas as varied as finance, labour, education, science, innovation, transportation, and development.
The workshop topics include, but are not limited to, the following:
1. Big Data comprises bits on one side and processing on the other. Considering the 5 Vs, we are confronted with:
- Volume: can present architectures scale to process the volume of data involved?
- Velocity: the rapid and asynchronous change of data across several sources makes consistency a major issue and replication essentially impossible.
- Variety: differences go deeper than independent values expressed in different metrics; they relate to differences in how those values are captured, in their stability, and in the credibility of the sources.
- Value: who is benefiting from the metadata, how are data sources rewarded, and who is accountable?
- Virtualization: this may be what is happening in the Internet of Things (IoT), where in principle there is neither Volume, nor Velocity, nor Variety, since much of the processing happens locally. The challenge for the future, however, is to consider all of the data spread across such devices and treat them as a "virtual Big Data".
- We would like to dedicate a deeper investigation to the implications of these Vs, with a small group of discussants bringing different perspectives to the floor, extending the discussion to the audience, and producing a list of actions that this initiative will need to take.
2. How can we better understand Big Data?
- What are the technologies for generating metadata (data analytics) and what are the technologies for rendering (visualization)?
- How can we trust the outcome (the psychology of perception)?
- How does the value shift from data to metadata, and who is benefiting?
3. Is it about Big or is it about Meta?
- If we look at the set of sets, i.e. correlations, then in both cases we have metadata generation; can this be related to Big Data issues?
- How can we generate meta-information from data (see the sketch after this list)?
- What are the changes needed to better exploit the data ecosystem?
- What are the hurdles to overcome?
- How about addressing emerging information, semantics, and behaviours?
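To make the question of generating meta-information from data more concrete, the following is a minimal sketch, not a proposed method of the workshop. It assumes a tabular dataset handled with pandas; the column names and values are hypothetical placeholders. It derives simple metadata about a dataset, including a correlation matrix over its numeric columns as one example of a "set of sets" describing relationships rather than raw values.

```python
# Minimal sketch: deriving meta-information (descriptive metadata and
# pairwise correlations) from a tabular dataset. All names and values
# below are hypothetical placeholders.
import pandas as pd

def derive_metadata(df: pd.DataFrame) -> dict:
    """Return simple metadata describing the dataset itself."""
    numeric = df.select_dtypes(include="number")
    return {
        "n_rows": len(df),
        "n_columns": df.shape[1],
        "column_types": df.dtypes.astype(str).to_dict(),
        "missing_per_column": df.isna().sum().to_dict(),
        # The correlation matrix relates every numeric column to every
        # other one: metadata about relationships, not the raw values.
        "correlations": numeric.corr().to_dict(),
    }

if __name__ == "__main__":
    # Hypothetical sensor readings standing in for any Big Data source.
    df = pd.DataFrame({
        "temperature": [21.0, 22.5, 23.1, 24.0],
        "humidity": [0.40, 0.42, 0.39, 0.37],
        "pressure": [1013, 1012, 1011, 1010],
    })
    print(derive_metadata(df))
```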