Methods and software of the statistical process

UnitMix

The contents related to UnitMix are shown in the following sections:

In statistical surveys, particularly economic and business ones, some units commonly report values in a unit of measurement other than the expected one, for example amounts in thousands of euros instead of euros, or in cents instead of whole units. The UnitMix package for the R environment provides tools to detect and correct these errors in multivariate numeric data using model-based clustering. The methodology relies on Gaussian Mixture Models (GMM) whose cluster means are offset by user-defined translation vectors (patterns of potential errors), so that each cluster groups records sharing the same scale or unit-of-measurement error.
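To see why translation vectors are a natural way to describe these errors, the following minimal base-R sketch simulates a few records and shows that a factor-of-1000 error becomes a constant additive shift on the log10 scale. The data, the choice of base 10, and all variable names are illustrative assumptions and are not taken from the package.

```r
# Simulated "true" amounts in euros (illustrative data only)
set.seed(1)
true_euros <- rlnorm(8, meanlog = 10, sdlog = 0.5)

# Assume the questionnaire asks for thousands of euros
expected <- true_euros / 1000

# Assume units 3 and 6 mistakenly report in euros (a x1000 error)
wrong <- c(3, 6)
reported <- expected
reported[wrong] <- expected[wrong] * 1000

# On the log10 scale the multiplicative error is an additive shift:
# 3 for the erroneous units, 0 for the others (up to floating-point rounding)
shift <- log10(reported) - log10(expected)
print(round(shift, 10))
```

With several variables, each error pattern corresponds to one vector of such shifts, with one component per variable.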

The UnitMix package includes three main functions:

  • assign.cluster: performs clustering of observations using the EM (Expectation-Maximization) algorithm, shifting the global mean vector according to user-specified error patterns;
  • cluster.plot: dynamically generates pairwise scatterplots for all variables, log-transformed, colouring points according to their cluster membership as determined by posterior probabilities;
  • refine.cluster: performs post-processing of assign.cluster results, evaluating the compatibility of each observation with its assigned cluster using the Mahalanobis distance on log-transformed data.
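The kind of compatibility check described for refine.cluster can be illustrated with the base-R function mahalanobis(). The sketch below does not call UnitMix and does not reproduce its decision rule: the simulated data, the chi-squared cutoff, and the 0.999 level are assumptions chosen only for illustration.

```r
set.seed(7)
n <- 200
p <- 2

# Simulated log-scale records, all assumed to be assigned to the same cluster
X <- matrix(rnorm(n * p, mean = 2, sd = 0.3), n, p)

# Plant one record that fits its cluster poorly
X[1, ] <- c(2, 4.5)

# Squared Mahalanobis distance of each record from the cluster centre
center <- colMeans(X)
S <- cov(X)
d2 <- mahalanobis(X, center, S)

# Under normality d2 is approximately chi-squared with p degrees of freedom;
# the 0.999 level is an assumed threshold, not the package default
cutoff <- qchisq(0.999, df = p)
suspect <- which(d2 > cutoff)
print(suspect)
```

Records flagged in this way are candidates for manual review rather than automatic correction.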

The method assumes multivariate log-normality and equal covariance across clusters. The package imports the mvtnorm package for computing multivariate Gaussian distribution densities.
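Under these assumptions, the EM iterations for a mixture with a shared mean vector, known translation vectors, and a common covariance matrix can be sketched in base R as follows. This is a simplified illustration and not the code of assign.cluster: the simulated data, the starting values, the helper log_dmvn, and the use of mahalanobis() instead of mvtnorm are all assumptions made to keep the example self-contained.

```r
set.seed(42)
n <- 300
p <- 2

# Simulated log10-scale data with a common covariance matrix
mu_true <- c(2, 3)
Sigma_true <- matrix(c(0.10, 0.05, 0.05, 0.10), 2, 2)
Z <- matrix(rnorm(n * p), n, p) %*% chol(Sigma_true)

# Translation vectors (one per row): no error, x1000 on variable 1, x1000 on both
shifts <- rbind(c(0, 0), c(3, 0), c(3, 3))
K <- nrow(shifts)
true_k <- sample(seq_len(K), n, replace = TRUE, prob = c(0.8, 0.1, 0.1))
Y <- sweep(Z, 2, mu_true, "+") + shifts[true_k, ]

# Hypothetical helper: log-density of a multivariate normal, base R only
log_dmvn <- function(Y, center, Sigma) {
  d2 <- mahalanobis(Y, center = center, cov = Sigma)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  -0.5 * (ncol(Y) * log(2 * pi) + logdet + d2)
}

# Starting values (assumed): equal weights, column medians, identity covariance
w <- rep(1 / K, K)
mu <- apply(Y, 2, median)
Sigma <- diag(p)

for (iter in 1:200) {
  # E-step: posterior probability of each error pattern for each record
  logd <- sapply(seq_len(K), function(k) {
    log(w[k]) + log_dmvn(Y, mu + shifts[k, ], Sigma)
  })
  logd <- logd - apply(logd, 1, max)  # stabilise before exponentiating
  tau <- exp(logd)
  tau <- tau / rowSums(tau)

  # M-step: weights, shared mean (after removing expected shifts), covariance
  w <- colMeans(tau)
  mu_old <- mu
  mu <- colMeans(Y - tau %*% shifts)
  Sigma <- matrix(0, p, p)
  for (k in seq_len(K)) {
    R <- sweep(Y, 2, mu + shifts[k, ], "-")
    Sigma <- Sigma + crossprod(R * tau[, k], R)
  }
  Sigma <- Sigma / n

  if (max(abs(mu - mu_old)) < 1e-8) break
}

# Assign each record to its most probable pattern and undo the shift
assigned <- max.col(tau, ties.method = "first")
corrected <- 10^(Y - shifts[assigned, ])
print(table(assigned, true_k))
```

Because every cluster mean is tied to the same vector mu, only the mixing weights, mu, and the common covariance are estimated, which keeps the model identifiable even when some error patterns are rare.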

 

Main references

Di Zio, M., Guarnera, U., & Luzi, O. (2005). “Editing systematic unity measure errors through mixture modelling”. Survey Methodology, 31(1), 53–63.

Status: validated

Author: Istat

Licence: GPL-3

GSBPM code: 5.4 Edit and impute

Programming language: R

Keywords: measurement-unit errors; Gaussian mixture; EM algorithm; Mahalanobis distance

Contact: name: Renato Magistro – email: renato.magistro@istat.it

TECHNICAL REQUIREMENTS

The UnitMix package requires R version 4.0.0 or later and runs on any operating system (Windows, macOS, or Linux). It also requires the R package mvtnorm to be installed.

 

COPYRIGHT

Copyright 2025 Cristina Faricelli, Renato Magistro

Licensed under the GNU General Public License (GPL), version 3 or later. You may not use this work except in compliance with the License. You may obtain a copy of the License at the following address: http://www.gnu.org/licenses/.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

 

DISCLAIMER

Istat assumes no responsibility for results arising from any use of the tool that is inconsistent with the methodological guidance in the available documentation.

 

DOWNLOAD

Release date: 10/03/2026

UnitMix Version 0.0.1 – Precompiled package for Windows

UnitMix Version 0.0.1 – Package source for Windows and Unix-like systems

 

INSTALLATION

In R the package can be installed using the following instructions:

> install.packages(path_to_file, repos = NULL)

where path_to_file indicates the path to the downloaded .zip or .tar.gz file.
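As a slightly fuller sketch, the script below checks the stated requirements before installing from a local file. The file name is a hypothetical placeholder and must be replaced with the name of the file actually downloaded; the checks themselves use only base R.

```r
# Hypothetical file name: replace with the path of the downloaded file
path_to_file <- "UnitMix_0.0.1.tar.gz"

# Check the documented requirements: R >= 4.0.0 and the mvtnorm package
r_ok <- getRversion() >= "4.0.0"
has_mvtnorm <- requireNamespace("mvtnorm", quietly = TRUE)

if (!r_ok) message("R 4.0.0 or later is required.")
if (!has_mvtnorm) message("Install mvtnorm first, e.g. install.packages(\"mvtnorm\").")

# Install from the local file only if everything is in place
if (r_ok && has_mvtnorm && file.exists(path_to_file)) {
  install.packages(path_to_file, repos = NULL)
} else if (!file.exists(path_to_file)) {
  message("File not found: ", path_to_file)
}
```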

 

TECHNICAL AND METHODOLOGICAL DOCUMENTATION

Reference manual – UnitMix v. 0.0.1

https://cran.r-project.org/web/packages/UnitMix/UnitMix.pdf

Di Zio, M., Guarnera, U., & Luzi, O. (2005). “Editing systematic unity measure errors through mixture modelling”. Survey Methodology, 31(1), 53–63. https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2005001/article/8087-eng.pdf

 

OTHER DOCUMENTATION

https://github.com/CristinaFaricelli/UnitMix
