The continuous generation of Big Data in all aspects of life, through the constant use of smart devices, digital tools, and electronic networks/social networks, is characterising the social, environmental, and economic reality and the interaction between people and organisations.
The Internet of Things applied to daily life or business processes is transforming data production operations and is increasing the integration between smart systems.
In this context, Istat plays a key role in transforming this data deluge into official statistics, producing the so-called Trusted Smart Statistics. For the last 10 years, Istat has been committed to innovation through methodological developments and the use of new data sources, aiming for more efficient and timely production of official statistics that better meets user needs and reduces the statistical burden on households, individuals, and businesses.
These statistics are produced in compliance with the standards adopted at the national and international levels, with particular regard to the aspects relating to privacy protection and quality.
What are they and what are they for?
They are statistical products derived from smart systems and processed with the tools and methodologies of official statistics; they are therefore verifiable and transparent. Istat guarantees their validity and accuracy, in full respect of the privacy of people and other interested parties.
Smart statistics from Big Data enrich official statistics because they can provide fast answers on social, environmental, and economic developments, with a high level of granularity, including at the territorial level.
These statistics help citizens and businesses understand the value of their data, bringing them closer to official statistics, supporting their active, aware, and participatory role, and lightening their statistical burden. In this way, they reinforce a mutual and consolidated interaction between Istat and the citizens and companies that generate data.
How they change the production process
Citizens provide data relating to their daily activities to many types of operators and intermediaries, who, in turn, can share them with Istat for statistical purposes, providing due information to the interested parties.
Furthermore, all data held by private individuals, when used to compile official statistics, are subject to a guarantee of confidentiality: Istat cannot disclose any information that would reveal personal or company data, and all information is published in aggregate form.
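To make the idea of publishing only in aggregate form concrete, the following is a minimal sketch, not Istat's actual disclosure-control procedure. It counts individual records per group and suppresses any group that is too small to publish safely. The function name, the sample records, and the threshold `MIN_CELL_SIZE` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical minimum cell size; real disclosure-control rules are
# defined by the statistical office and are not stated in this text.
MIN_CELL_SIZE = 3


def aggregate_with_suppression(records, key, min_cell_size=MIN_CELL_SIZE):
    """Count records per group; replace counts below the threshold with None."""
    counts = Counter(record[key] for record in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Illustrative microdata: one row per person, never published as such.
records = [
    {"person_id": 1, "region": "Lazio"},
    {"person_id": 2, "region": "Lazio"},
    {"person_id": 3, "region": "Lazio"},
    {"person_id": 4, "region": "Molise"},
]

table = aggregate_with_suppression(records, "region")
print(table)
```

In this sketch the group with a single record is returned as `None` rather than as a count, so the published table cannot be traced back to one person.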
There are many types of sources from which it is possible to obtain data regarding, for example, population mobility and public transportation, changes in lifestyles, consumption, and health.
In the smart statistics production system, sources and processing procedures can be managed outside Istat, through agreed methodologies and algorithms, and reliable software.
Furthermore, the traditional processes for producing statistics, based on questionnaires and interactions between interviewers and respondents, are giving way to new integrated models of data acquisition and processing. In smart surveys, which are still in the experimental stage, respondents use smart devices (smartphones, tablets, activity-tracking devices) to provide data. In this way, the statistical burden on respondents can be greatly reduced.
Istat transforms Big Data into statistical information and disseminates it in an integrated way with other official statistics. The smart statistics production system must therefore integrate with Istat’s data acquisition and production infrastructure, i.e. the Integrated System of Registers and current surveys.
However, a revision of the statistical process and of the traditional paradigms for evaluating and documenting data quality is necessary to guarantee the quality and reliability of smart statistics, according to the principles of accountability and transparency of official statistics.
The protection of the privacy of citizens and businesses, and the confidentiality of information are fundamental principles for each National Statistical Institute of the European Statistical System (ESS) and Eurostat. The guarantee of confidentiality is contained in national and EU statistical legislation and is put into practice by publishing all information in aggregated form. National statistical institutes cannot disseminate information that would lead to the disclosure of information about a person or company.
- European Statistics Code of Practice – a self-regulatory tool based on sixteen principles concerning the institutional context, statistical processes, and products, which represents the common reference framework for the quality of the ESS.
National Statistical Institutes that produce official statistics under the European Statistics Code of Practice are uniquely positioned to provide EU citizens and public policy-makers with high-quality, independent, objective, transparent, and impartial insights that are free from political influence.
- General Data Protection Regulation – GDPR – Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 introduces the concepts of Privacy integrated into the technological design (Privacy by design) and Privacy by default, referring to technical and organisational measures to be put in place to provide all the guarantees.
- The Proposal for a Regulation of the European Parliament and of the Council concerning harmonised rules on fair access to data and their use (Data Act) of 23 February 2022 addresses the issue of the use of data held by private individuals in official statistics. This proposal has prompted a range of techniques aimed at guaranteeing privacy as early as the collection/pre-treatment (input) phase of the data to be used.
An investment shared internationally
Smart statistics represent an extension of the official statistics made available to citizens and policy-makers. They are produced using non-traditional sources and innovative methodologies, and can respond more promptly to emerging and differentiated information needs, capturing evolving phenomena in real time.
These statistics consolidate the results of scientific research and allow their immediate use in production processes. Therefore, for several years, Istat, in collaboration with the National Statistical Institutes of other European countries and with international organisations (Eurostat, UN, UNECE), has been actively engaged in the study, experimentation, and production of smart statistics from Big Data, in compliance with common guidelines and principles.
The Scheveningen Memorandum, in 2013, started the experiments on the new Big Data sources in the context of the European Statistical System, encouraging the study of methods for producing timely and reliable official statistics.
Subsequently, the Bucharest Memorandum, adopted by the European Statistical System Committee in 2018, introduced the concept of smart statistics created with Big Data and the so-called Trusted Smart Surveys.
Currently, data sharing between private data holders and National Statistical Institutes is allowed through private agreements on an individual basis. Recognising the richness of data and its potential value for official statistics, the ESS is focussing its efforts on developing a legal framework that allows National Statistical Institutes to access and use these data reliably.
The European data strategy aims to build a true single market for data and make Europe a global leader in the data sector. In this regard, the forthcoming Data Act is expected to be a key pillar and the second major initiative announced in the data strategy. The aim is to contribute to the creation of a cross-sector governance framework for data access and use by legislating on issues affecting the relationships between data economy actors, to incentivise the horizontal sharing of data across sectors.
The production system for smart statistics from Big Data provides for new forms of collaboration between internal and external actors of Istat (e.g. Universities, other institutions, and private partners) and the sharing of methodological knowledge and IT solutions.
This process is governed by a Steering Committee made up of Istat internal Directors, which is responsible for the Strategic Analysis process and for identifying the demand for smart statistics. An Interdepartmental Centre for Trusted Smart Statistics coordinates the technical, methodological, and production activities of smart statistics implemented in Istat organisational structures.
Smart statistics from Big Data find application in Istat in the production of both official statistics and experimental statistics.
Monthly consumer price indices
Since 2018, the introduction of scanner data (i.e. the data obtained from barcode readers on supermarket products) in the survey of consumer prices has paved the way for the use of new sources in Istat and has led to a revision of the survey strategy.
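As an illustration of how scanner data can feed a price index, the sketch below derives a monthly unit value (turnover divided by quantity sold) for each barcode and then combines the price changes with a Jevons index, i.e. the geometric mean of price relatives. This is a common textbook approach and is shown under stated assumptions; it is not a description of the method Istat actually applies. The field names, barcodes, months, and figures are hypothetical.

```python
import math
from collections import defaultdict


def unit_values(transactions):
    """Average price per (barcode, month): total turnover / total quantity."""
    turnover = defaultdict(float)
    quantity = defaultdict(float)
    for t in transactions:
        key = (t["barcode"], t["month"])
        turnover[key] += t["turnover"]
        quantity[key] += t["quantity"]
    return {key: turnover[key] / quantity[key] for key in turnover}


def jevons_index(prices, base_month, current_month):
    """Geometric mean of price relatives for products sold in both months (base = 100)."""
    in_base = {b for (b, m) in prices if m == base_month}
    in_current = {b for (b, m) in prices if m == current_month}
    matched = in_base & in_current
    if not matched:
        raise ValueError("no products are sold in both months")
    log_sum = sum(
        math.log(prices[(b, current_month)] / prices[(b, base_month)])
        for b in matched
    )
    return 100 * math.exp(log_sum / len(matched))


# Illustrative scanner records (hypothetical barcodes and amounts).
transactions = [
    {"barcode": "A", "month": "2018-01", "turnover": 100.0, "quantity": 50},
    {"barcode": "A", "month": "2018-02", "turnover": 110.0, "quantity": 50},
    {"barcode": "B", "month": "2018-01", "turnover": 300.0, "quantity": 100},
    {"barcode": "B", "month": "2018-02", "turnover": 330.0, "quantity": 100},
]

prices = unit_values(transactions)
index = jevons_index(prices, "2018-01", "2018-02")
print(round(index, 2))
```

Both hypothetical products rise in price by 10% between the two months, so the index comes out at about 110. Only products sold in both months enter the calculation, which is one simple way of handling items that appear or disappear.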
Experimental Statistics with Big Data
- The Social Mood on Economy Index – this is a measure of sentiment on the overall economic situation, based on data from Twitter. The historical daily mood series has been calculated since February 2016 and is updated quarterly. More recently, a specific focus dedicated to the analysis of the effect of the war in Ukraine on this index has been released.
- The use of Open Street Map for the calculation of indicators for road accidents on the Italian road network made it possible to refine the indicators of accidents with the use of information both on the length in metres of the carriageway in each direction of travel of each road arc and on the traffic points, providing a new reading at a territorial level.
- Estimates of how companies use their websites, obtained by applying web-scraping techniques to download texts from the website pages of companies with at least 10 employees, provide information on the functions and services offered on the websites.
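The road accident indicator in the list above relates accidents to the length of carriageway of each road arc. A minimal sketch of that kind of normalisation is shown below, assuming that accident counts and carriageway lengths in metres are already available per arc. The arc identifiers, the figures, and the choice of "accidents per kilometre" as the rate are hypothetical and are not taken from Istat's published methodology.

```python
def accidents_per_km(accident_counts, carriageway_metres):
    """Accidents per kilometre of carriageway for each road arc."""
    rates = {}
    for arc_id, metres in carriageway_metres.items():
        if metres <= 0:
            # Skip arcs without a usable length rather than divide by zero.
            continue
        rates[arc_id] = accident_counts.get(arc_id, 0) / (metres / 1000)
    return rates


# Illustrative inputs: lengths as they might be derived from a road graph,
# accident counts as they might come from the accident survey.
carriageway_metres = {"arc-001": 2500, "arc-002": 4000, "arc-003": 1000}
accident_counts = {"arc-001": 5, "arc-002": 2}

rates = accidents_per_km(accident_counts, carriageway_metres)
print(rates)
```

Expressing accidents relative to road length makes arcs of different sizes comparable, which a raw accident count does not.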
Trials in progress
There are many sources, methodologies, and data processing procedures on which trials are underway. They range from the use of mobile phone data to study mobility, aimed at classifying the movements of the population into commuters, occasional travellers, or tourists, to the study of information obtained from sites or portals that publish job searches, used to improve labour market indicators.
Other trials range from the use of satellite imagery to understand land use in agriculture, to vessel location information (AIS) from tracking sensors to enrich shipping statistics. Further trials range from the study of electronic payment transactions to improve forecasts and rapid estimates of macroeconomic indicators, to the use of information from social networks for the analysis of gender-based violence and hate speech.
What is datafication?
With the growing digitisation, we are witnessing the process of so-called datafication: every event or state, in the physical or virtual world, is transformed into data in real time. The data, in turn, are collected, exchanged, recorded, analysed, transformed and sold.
The availability of Big Data-type sources from which such data derive is part of those global changes that also have significant impacts on the production of official statistics.
What are smart objects in the context of the Internet of Things (IoT)?
The Internet of Things represents the evolution of the use of the Internet: objects (things) become recognisable and acquire intelligence because they can communicate data about themselves and access aggregate information from others. By “things” or “objects” we mean categories such as devices, equipment, plants and systems, materials and tangible products, works and goods, and machinery.
What are the critical issues for official statistics in the use of data held by private individuals?
Private companies are not obliged to provide Istat with the data in their possession and can decide to sell them to other organisations that provide paid insights. As a result, the supply of data cannot be guaranteed in the long term, which affects the ability to produce comparable data over time.
Comparability is an important aspect of official statistics, as it allows decision-makers to draw meaningful conclusions. Furthermore, the costs of accessing information and preparing data can often be high.