The activities related to the design of the list and the sampling methodology refer to sub-process 2.4 “Design frame and sample” of GSBPM. More precisely:
- Construction of the selection frame of the target population, containing, for each unit of the population, all the identifying information needed for contact and any auxiliary variables used to define the sample design (stratification variables, identifying variables for each selection stage);
- Definition of the sampling design which, on the basis of the objectives specified in Phase 1 “Specify Needs” and of the operating and cost constraints, yields estimates that are as accurate as possible.
The characteristics of the sample list are essential for the correct definition of the sampling design.
The list should meet quality criteria in terms of timeliness, coverage and accuracy of the information contained in it. From a theoretical point of view, the selection list should ideally meet the following requirements:
- it is composed only of the units belonging to the population of interest at the reference time of the survey;
- it includes all units of the population only once;
- it contains the most updated data for identifying variables (name and address) and for any descriptive information (other relevant structural data) of the units.
Possible situations of departure from the ideal list are:
- under-coverage, which occurs when some elements of the population are not contained in the list and cannot, therefore, be included in the sample;
- over-coverage, when some elements of the list are non-existent and / or do not belong to the population of interest;
- duplication of some units, when some elements of the population are enumerated more than once in the list;
- clusters of units, when some elements of the list contain clusters of elements of the population.
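The first three departures listed above can be illustrated with a small check. This is a minimal sketch, assuming that a reference set of true population identifiers is available for comparison, which is rarely the case in practice; the function name `frame_diagnostics` and the sample identifiers are hypothetical. Clusters of units are not covered here, since detecting them requires knowing which population elements sit behind each list entry.

```python
from collections import Counter

def frame_diagnostics(frame_ids, population_ids):
    """Compare a selection list against an assumed-known target population.

    frame_ids: unit identifiers as they appear in the list (may repeat).
    population_ids: identifiers of the units that truly belong to the population.
    """
    population = set(population_ids)
    counts = Counter(frame_ids)
    listed = set(counts)
    return {
        # population units missing from the list
        "under_coverage": sorted(population - listed),
        # listed units that do not belong to the population
        "over_coverage": sorted(listed - population),
        # units enumerated more than once in the list
        "duplicates": sorted(u for u, n in counts.items() if n > 1),
    }

# Illustrative data only: "C" is missing, "X" is out of scope, "B" is listed twice.
report = frame_diagnostics(
    frame_ids=["A", "B", "B", "D", "X"],
    population_ids=["A", "B", "C", "D"],
)
print(report)
```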
The planning of the sample design consists firstly of the following activities:
- definition of the sampling scheme (multistage sampling, stratified sampling), performed on the basis of the cost of the chosen data collection technique and the information contained in the selection list. The choice of a multistage design generally derives from the need to concentrate the sample geographically in order to limit interviewing costs in surveys that use a direct mode of questionnaire administration (face-to-face interview). The choice of a stratified sample design has the purpose of improving the precision of the estimates and guaranteeing the planned sample size for the estimation domains. The division of the population units into strata is carried out using auxiliary variables contained in the list and related to the variables under investigation.
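As an illustration of a stratified scheme, the sketch below partitions a list into strata using an auxiliary variable and draws a simple random sample without replacement within each stratum. It is a minimal example under stated assumptions: the function `stratified_sample`, the `region` variable and the planned sizes are hypothetical, and equal-probability selection within strata is only one of the possible choices.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, sizes, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    frame: list of unit records taken from the selection list.
    stratum_of: function mapping a record to its stratum label.
    sizes: dict mapping stratum label -> planned sample size.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for label, units in strata.items():
        # never ask for more units than the stratum contains
        n_h = min(sizes.get(label, 0), len(units))
        sample[label] = rng.sample(units, n_h)
    return sample

# Illustrative frame: 60 units in "north", 40 in "south".
frame = [{"id": i, "region": "north" if i < 60 else "south"} for i in range(100)]
sample = stratified_sample(frame, lambda u: u["region"], {"north": 6, "south": 4})
```

Because each stratum is sampled separately, the planned size is obtained in every stratum by construction, which is what guarantees the sample size for domains that coincide with strata.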
On the basis of the adopted sampling scheme, the following steps may be undertaken:
- choice of stratification criteria (choice of variables, choice of the number of strata, definition of the criterion of strata construction);
- choice of the probabilistic method for the selection of the sample units (selection with equal probabilities, selection with unequal probabilities). For designs in two or more stages, the selection of the primary sampling units (units of the first stage) is generally done with probability proportional to a measure of size that is assumed to be correlated with the variables under investigation;
- determination of the sample size for the different stages of selection and allocation of the sample among the strata on the basis of the maximum admissible sampling error for the main estimates, in relation to the reference domains and subclasses of the population. Since surveys are usually designed to produce a variety of estimates for different domains of interest, it is necessary to use approaches that determine the optimal sample size globally, taking into account a large number of objectives and constraints simultaneously.
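Two of the steps above can be sketched in code. The example is illustrative and not the procedure prescribed by this text: `pps_systematic` selects primary sampling units with probability proportional to size using systematic selection, one common technique among several, and `neyman_allocation` distributes a fixed total sample among strata for a single target variable, proportionally to the stratum size times the stratum standard deviation. All names and figures are hypothetical, and the single-variable allocation does not address the multi-objective problem described in the last bullet.

```python
import random

def pps_systematic(sizes, n, seed=0):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit -> positive measure of size.
    Assumes every size is below total/n, so no unit can be selected twice.
    """
    rng = random.Random(seed)
    step = sum(sizes.values()) / n
    start = rng.random() * step  # random start in [0, step)
    points = iter(start + k * step for k in range(n))
    target = next(points)
    selected, cum = [], 0.0
    for unit, size in sizes.items():
        cum += size
        # the unit is selected if a selection point falls in its size interval
        while target is not None and target < cum:
            selected.append(unit)
            target = next(points, None)
    return selected

def neyman_allocation(n, N, S):
    """Allocate total sample size n among strata proportionally to N_h * S_h.

    N: dict stratum -> population size; S: dict stratum -> standard deviation.
    Rounding uses the largest-remainder rule so the allocation sums to n.
    Does not cap n_h at N_h; that check is left out of this sketch.
    """
    weights = {h: N[h] * S[h] for h in N}
    total = sum(weights.values())
    exact = {h: n * w / total for h, w in weights.items()}
    alloc = {h: int(x) for h, x in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

# Illustrative data only.
psu_sizes = {"m1": 120, "m2": 80, "m3": 300, "m4": 50, "m5": 200, "m6": 250}
psus = pps_systematic(psu_sizes, n=2, seed=1)

N = {"small": 800, "medium": 150, "large": 50}
S = {"small": 2.0, "medium": 10.0, "large": 40.0}
alloc = neyman_allocation(102, N, S)
```

With these figures the allocation gives 32, 30 and 40 units to the small, medium and large strata respectively: the most variable stratum receives the largest share even though it contains the fewest units.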