CIRCE (Comprehensive Istat R Coding Environment)

CIRCE is an R-based software package for automatically coding textual variables according to official classifications.

It is a generalised system, independent of both the language and the classification used. CIRCE replaces ACTR v3, which Istat had used since 1998, because ACTR was no longer marketed or maintained by its producers and was not compatible with the software platforms used at Istat (Windows 7, Windows Server 2008).

To avoid any loss in the quality of results, the same matching algorithms as ACTR v3 have been implemented in CIRCE.

Being an R package, it is portable to different platforms with no need for compilation. This makes it possible to have a single package running on both Windows and Linux operating systems. It can be used in a Windows environment through a graphical user interface, and on the web through a call to a web service.

CIRCE belongs to the category of weighting algorithms and supports three types of coding procedure:

  • automated coding, for sets of records (batch coding);
  • interactive coding, to analyse the coding results of a single record (a GUI is provided to coders);
  • web coding, a web service for coding a single record. A web service dedicated to identifying the activity code (in Italian) is currently available through the page.

Regardless of the type of procedure, the coding phase is performed in two consecutive steps:

1) standardisation of texts, called parsing;
2) matching of parsed texts.

The parsing step is a sophisticated and fully customisable text standardisation phase. It currently provides 14 different functions, such as character mapping, deletion of trivial words, definition of synonyms, suffix removal, etc. Parsing aims to remove grammatical and syntactical differences so that two different descriptions with the same semantic content become identical.
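As an illustration of what such a standardisation step can look like, the following minimal Python sketch chains four of the functions named above (character mapping, trivial-word deletion, suffix removal, synonyms). It is not CIRCE's implementation or API: the function names, the rule sets and the order in which the rules are applied are all assumptions made for the example, and the real rules are configured by the user for each language and classification.

```python
# Illustrative sketch only: rule sets, names and rule order are hypothetical,
# not CIRCE's actual parsing configuration.

# Character mapping: fold accents and turn punctuation into spaces.
CHAR_MAP = str.maketrans({"é": "e", "è": "e", "à": "a", "-": " ", ",": " ", ".": " "})

# Trivial words carrying no classification content.
TRIVIAL_WORDS = {"of", "the", "and", "for"}

# Suffixes to strip, provided at least three characters remain.
SUFFIXES = ("s",)

# Synonyms mapped to one preferred term.
SYNONYMS = {"footwear": "shoe"}


def strip_suffix(word):
    """Remove the first matching suffix if the remaining stem is long enough."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def parse(text):
    """Standardise a free-text description into a normalised string."""
    words = text.lower().translate(CHAR_MAP).split()
    words = [w for w in words if w not in TRIVIAL_WORDS]
    words = [strip_suffix(w) for w in words]
    words = [SYNONYMS.get(w, w) for w in words]
    return " ".join(words)


if __name__ == "__main__":
    print(parse("Retail sale of shoes"))
    print(parse("retail sales of footwear"))
```

With these hypothetical rules, the two differently worded descriptions are reduced to the same parsed string, which is what allows the subsequent matching step to treat them as equal.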

The second step is the matching phase. The parsed response is compared with the parsed descriptions in the informative base. If this search returns a perfect match (a direct match), a unique code is assigned. Otherwise, the software uses an algorithm to find the best partial matches, providing an indirect match.
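The direct/indirect logic can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ACTR v3 or CIRCE algorithm: the informative base, the placeholder codes, the inverse-frequency word weights, the similarity formula, the threshold and the "none" outcome are all hypothetical.

```python
import math

# Hypothetical informative base: parsed description -> placeholder code.
BASE = {
    "retail sale shoe": "A1",
    "wholesale shoe": "A2",
    "retail sale book": "A3",
}


def word_weights(base):
    """Give rarer words a higher weight (an assumed inverse-frequency scheme)."""
    n = len(base)
    counts = {}
    for description in base:
        for word in set(description.split()):
            counts[word] = counts.get(word, 0) + 1
    return {word: math.log(n / c) + 1.0 for word, c in counts.items()}


def similarity(response_words, description_words, weights):
    """Weight of the shared words divided by the weight of all words involved."""
    union = response_words | description_words
    total = sum(weights.get(w, 1.0) for w in union)
    shared = sum(weights.get(w, 1.0) for w in response_words & description_words)
    return shared / total if total else 0.0


def assign_code(parsed_response, base, threshold=0.5):
    """Return ("direct" | "indirect" | "none", [(code, score), ...])."""
    # Direct match: the parsed response equals a parsed description.
    if parsed_response in base:
        return ("direct", [(base[parsed_response], 1.0)])

    # Indirect match: rank partial matches by weighted word overlap.
    weights = word_weights(base)
    response_words = set(parsed_response.split())
    candidates = [
        (code, similarity(response_words, set(description.split()), weights))
        for description, code in base.items()
    ]
    candidates = [c for c in candidates if c[1] >= threshold]
    candidates.sort(key=lambda c: c[1], reverse=True)
    return ("indirect", candidates) if candidates else ("none", [])


if __name__ == "__main__":
    print(assign_code("retail sale shoe", BASE))
    print(assign_code("retail shoe", BASE))
    print(assign_code("plumber", BASE))
```

Returning a ranked list of candidates rather than a single code reflects the interactive use case, in which a coder reviews the best partial matches for a single record.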

CIRCE is developed by Istat, which makes it easier to add or change functionality in the standardisation (parsing) and/or matching steps.
Please note: for the moment, both the CIRCE user guide (Manuale Utente.pdf) and the GUI are available only in Italian. English versions will be provided in the future.

Status: validated

Author: Istat

Licence: EUPL-1.1

GSBPM code: 5.2. Classify and code

Programming language: R, VB.NET

Language of the GUI: IT

Keywords: automated coding, weighting coding algorithms


name: Laura Capparucci


– R (version ≥ 3.1.1).

– Windows (version ≥ 7).

– Microsoft .NET Framework 4 (only for the graphical user interface).


Release date: 28/07/2016

CIRCE version 1.0


User manual – CIRCE v. 1.0


