Further Reading and Resources:
- Laney, D. (2001) 3D Data Management: Controlling Data Volume, Velocity, and Variety. Meta Group.
- De Mauro, A., Greco, M. and Grimaldi, M. (2016) A formal definition of Big Data based on its essential features. Libr. Rev., 65, 122–135.
- Fang, H. (2015) Managing Data Lakes in Big Data Era: What’s a Data Lake and Why Has It Became Popular in Data Management Ecosystem.
- Marinescu, D.C. (2018) Cloud Computing: Theory and Practice.
- Haseeb, A. and Pattun, G. (2017) A review on NoSQL: applications and challenges.
- Sánchez, J.M. (2018) In-Memory Analytics. In Mehdi Khosrow-Pour, D.B.A. (ed.) Encyclopedia of Information Science and Technology (4th edn), pp. 1806–1813. IGI Global.
The government collects huge volumes of data (increasingly published as open data) and thus has major opportunities for so-called big data analytics. In general, big data offers the opportunity to examine large and varied data sets to uncover hidden patterns, unknown correlations, customer preferences, etc. Big data encompasses a mix of structured, semi-structured and unstructured data gathered formally through interactions with citizens, social media content, text from citizens’ emails and survey responses, phone call data and records, data captured by sensors connected to the Internet of Things, and so on. The notion of ‘big data’ is evolving; it is commonly characterized by the volume of data, the variety of data being generated by organizations and the velocity at which that data is being created and updated, together referred to as the 3Vs of big data. Alternative descriptions of big data add other features such as veracity, value, complexity and unstructuredness.
Big data encompasses a number of associated technologies:
Big data lakes: a ‘data lake’ is a storage repository that holds a vast amount of raw data in its native format until it is needed.
Cloud computing: the practice of using a network of remote servers hosted on the Internet to store, manage and process data, rather than a local server or a personal computer.
Unstructured data & NoSQL databases: unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner; a NoSQL database is a mechanism for storing and retrieving data that is modelled by means other than the tabular relations used in relational databases.
Hadoop: an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment.
In-memory analytics: the queries and data reside in the server’s random access memory (RAM), which increases speed and performance.
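To make the contrast between tabular and non-tabular data models more concrete, the following is a minimal sketch in Python. It is an illustration only, not a description of any particular government system: the table name, field names and sample records are all hypothetical. It uses the standard-library sqlite3 module with an in-memory database to stand in for the relational, in-memory case, and plain Python dictionaries to stand in for schema-less documents of the kind a NoSQL document store might hold.

```python
import sqlite3
from collections import Counter

# Relational model: every row must fit the columns declared up front.
# ":memory:" keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enquiries (id INTEGER PRIMARY KEY, channel TEXT, topic TEXT)"
)
conn.executemany(
    "INSERT INTO enquiries (channel, topic) VALUES (?, ?)",
    [("email", "housing"), ("phone", "transport"), ("email", "transport")],
)
rows_by_topic = dict(
    conn.execute("SELECT topic, COUNT(*) FROM enquiries GROUP BY topic").fetchall()
)
conn.close()

# Document-style model (hypothetical records): each record carries its own
# structure, and fields may be missing or nested.
documents = [
    {"channel": "email", "topic": "housing", "body": "Query about a repair"},
    {"channel": "sensor", "readings": [12.1, 12.4], "unit": "ug/m3"},
    {"channel": "survey", "topic": "transport", "answers": {"q1": "yes"}},
]


def count_by_topic(docs):
    """Count documents per topic, tolerating records that have no topic."""
    return Counter(doc.get("topic", "unknown") for doc in docs)


docs_by_topic = count_by_topic(documents)

print(rows_by_topic)
print(dict(docs_by_topic))
```

The relational query relies on a schema fixed in advance, whereas the document-style code has to decide at read time how to treat records that lack a given field. That trade-off is one reason the two approaches tend to be used for different kinds of data.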
Related Papers and Publications
Algorithmic Government - The Computer Journal