UnitMix
The contents related to UnitMix are shown in the following sections:
In statistical surveys, particularly economic and business surveys, some units commonly report values in a unit of measurement different from the expected one: for example, amounts in thousands of euros instead of euros, or in cents instead of euros. The UnitMix package for R provides tools to detect and correct these errors in multivariate numeric data using model-based clustering. The methodology relies on Gaussian Mixture Models (GMM) with user-defined translation vectors (patterns of potential errors), which identify clusters of records that differ in scale or unit of measurement.
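To see why translation vectors can describe unit errors, note that a multiplicative error becomes an additive shift after a log transform: a value reported in thousands is divided by 1000, which moves its logarithm by -log(1000). The base R sketch below illustrates this with made-up figures. The variable names, the factor of 1000 and the layout of the `patterns` matrix are assumptions chosen for illustration, not the package's input format.

```r
# Illustrative data: turnover and costs in euros (values are made up).
true_values <- matrix(c(250000, 180000,
                        1200000, 950000),
                      ncol = 2, byrow = TRUE,
                      dimnames = list(NULL, c("turnover", "costs")))

# Suppose the second respondent reports turnover in thousands of euros.
reported <- true_values
reported[2, "turnover"] <- true_values[2, "turnover"] / 1000

# On the log scale the error is a constant shift in the affected variable.
shift <- log(reported) - log(true_values)
print(round(shift, 3))

# Candidate error patterns written as translation vectors on the log scale
# (one row per pattern; the first row is the "no error" pattern).
patterns <- rbind(none     = c(0, 0),
                  turnover = c(-log(1000), 0),
                  costs    = c(0, -log(1000)),
                  both     = c(-log(1000), -log(1000)))
print(round(patterns, 3))
```

Each row of `patterns` corresponds to one cluster of the mixture: records affected by the same error pattern share the same shift from the error-free mean.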
The UnitMix package includes three main functions:
- assign.cluster: performs clustering of observations using the EM (Expectation-Maximization) algorithm, shifting the global mean vector according to user-specified error patterns;
- cluster.plot: dynamically generates pairwise scatterplots for all variables, log-transformed, colouring points according to their cluster membership as determined by posterior probabilities;
- refine.cluster: performs post-processing of assign.cluster results, evaluating the compatibility of each observation with its assigned cluster using the Mahalanobis distance on log-transformed data.
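The clustering idea behind the first function can be sketched in base R as an EM algorithm for a Gaussian mixture in which every component shares one mean vector and one covariance matrix, and component k is centred at the common mean plus a fixed translation. This is a minimal sketch under those assumptions, run on simulated data. It is not the package's code, and `em_fit` and `log_dens` are hypothetical helper names that do not reflect the signature of assign.cluster.

```r
set.seed(1)

# Simulated log-scale data: 2 variables, roughly 10% of records with the
# first variable reported in thousands (shift of -log(1000)).
n <- 300
p <- 2
mu_true <- c(12, 11)
sigma_true <- matrix(c(1, 0.6, 0.6, 1), 2)
y <- sweep(matrix(rnorm(n * p), n) %*% chol(sigma_true), 2, mu_true, "+")
patterns <- rbind(c(0, 0), c(-log(1000), 0))   # assumed error patterns
true_cluster <- sample(1:2, n, replace = TRUE, prob = c(0.9, 0.1))
y <- y + patterns[true_cluster, ]

# Log-density of N(center, sigma) for each row of x (base R only).
log_dens <- function(x, center, sigma) {
  d2 <- mahalanobis(x, center, sigma)
  -0.5 * (ncol(x) * log(2 * pi) + as.numeric(determinant(sigma)$modulus) + d2)
}

em_fit <- function(y, patterns, n_iter = 100) {
  n <- nrow(y)
  k <- nrow(patterns)
  mix <- rep(1 / k, k)          # mixing proportions
  mu <- apply(y, 2, median)     # robust start for the error-free mean
  sigma <- cov(y)
  for (iter in seq_len(n_iter)) {
    # E-step: posterior probability of each pattern for each record.
    log_post <- sapply(seq_len(k), function(j) {
      log(mix[j]) + log_dens(y, mu + patterns[j, ], sigma)
    })
    log_post <- log_post - apply(log_post, 1, max)   # numerical stability
    post <- exp(log_post)
    post <- post / rowSums(post)
    # M-step: update proportions, common mean and common covariance.
    mix <- colMeans(post)
    mu <- colMeans(y) - drop(crossprod(patterns, mix))
    sigma <- matrix(0, ncol(y), ncol(y))
    for (j in seq_len(k)) {
      r <- sweep(y, 2, mu + patterns[j, ], "-")
      sigma <- sigma + crossprod(r * post[, j], r)
    }
    sigma <- sigma / n
  }
  list(mix = mix, mu = mu, sigma = sigma, posterior = post)
}

fit <- em_fit(y, patterns)
assigned <- apply(fit$posterior, 1, which.max)
print(table(assigned, true_cluster))
```

Once a record is assigned to an error pattern, subtracting that pattern's translation on the log scale (equivalently, multiplying the raw value by the corresponding factor) restores the expected unit of measurement.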
The method assumes multivariate log-normality and equal covariance across clusters. The package imports the mvtnorm package for computing multivariate Gaussian distribution densities.
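Under these assumptions, the squared Mahalanobis distance of a log-transformed record from the centre of its cluster is approximately chi-square distributed with as many degrees of freedom as there are variables, which suggests a simple compatibility check. The sketch below uses base R's `mahalanobis` and `qchisq`. The parameter values, the 0.99 quantile and the decision rule are illustrative assumptions, not the rule actually implemented by refine.cluster.

```r
p <- 2
mu <- c(12, 11)                          # assumed error-free mean (log scale)
sigma <- matrix(c(1, 0.6, 0.6, 1), 2)    # assumed common covariance
shift <- c(-log(1000), 0)                # assumed error pattern for cluster 2

# Log-scale records with the cluster each one was assigned to.
y <- rbind(c(12.3, 11.1),   # close to the centre of cluster 1
           c(5.4, 11.2),    # close to the centre of cluster 2
           c(8.5, 11.0))    # roughly halfway between the two centres
cluster <- c(1, 2, 1)

# Centre of the assigned cluster for each record, then squared distances.
centers <- rbind(mu, mu + shift, deparse.level = 0)
residuals <- y - centers[cluster, ]
d2 <- mahalanobis(residuals, rep(0, p), sigma)

# Flag records whose distance exceeds the chosen chi-square quantile.
threshold <- qchisq(0.99, df = p)
compatible <- d2 <= threshold
print(data.frame(cluster, d2 = round(d2, 2), compatible))
```

Records flagged as incompatible fit none of the hypothesised patterns well and are natural candidates for manual review rather than automatic correction.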
Main references
Di Zio, M., Guarnera, U., & Luzi, O. (2005). “Editing systematic unity measure errors through mixture modelling”. Survey Methodology, 31(1), 53–63.
Status: validated
Author: Istat
Licence: GPL-3
GSBPM code: 5.4 Edit and impute
Programming language: R
Keywords: measurement-unit errors; Gaussian mixture; EM algorithm; Mahalanobis distance
Contact: Renato Magistro (renato.magistro@istat.it)
TECHNICAL REQUIREMENTS
The UnitMix package requires R version 4.0.0 or later and runs on any operating system (Windows, macOS, or Linux). It also requires the mvtnorm R package to be installed.
COPYRIGHT
Copyright 2025 Cristina Faricelli, Renato Magistro
Licensed under the GNU General Public License (GPL), version 3 or later. You may not use this work except in compliance with the License. You may obtain a copy of the License at the following address: http://www.gnu.org/licenses/.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
DISCLAIMER
Istat assumes no responsibility for results obtained by using the tool in a way that is inconsistent with the methodological guidance in the available documentation.
DOWNLOAD
Release date: 10/03/2026
UnitMix Version 0.0.1 – Precompiled package for Windows
UnitMix Version 0.0.1 – Package source for Windows and Unix-like systems
INSTALLATION
In R the package can be installed using the following instructions:
> install.packages(path_to_file, repos = NULL)
where path_to_file is the path to the downloaded .zip or .tar.gz file.
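As a slightly fuller sketch of the same step, the snippet below checks that the downloaded file exists and installs the mvtnorm dependency first, since dependencies are generally not resolved when installing from a local file with repos = NULL. The file name is a hypothetical placeholder and should be replaced with the actual path of the downloaded file.

```r
# Hypothetical file name; replace with the path of the file you downloaded.
path_to_file <- "UnitMix_0.0.1.tar.gz"

if (file.exists(path_to_file)) {
  # Install the required dependency from CRAN if it is not yet available.
  if (!requireNamespace("mvtnorm", quietly = TRUE)) {
    install.packages("mvtnorm")
  }
  # Install UnitMix from the local file and load it.
  install.packages(path_to_file, repos = NULL)
  library(UnitMix)
} else {
  message("File not found: ", path_to_file)
}
```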
TECHNICAL AND METHODOLOGICAL DOCUMENTATION
Reference manual – UnitMix v. 0.0.1
https://cran.r-project.org/web/packages/UnitMix/UnitMix.pdf
Di Zio, M., Guarnera, U., & Luzi, O. (2005). “Editing systematic unity measure errors through mixture modelling”. Survey Methodology, 31(1), 53–63. https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2005001/article/8087-eng.pdf
OTHER DOCUMENTATION