This page describes software and references of interest for survey analysts
other than those for design-based analysis of surveys, including software
for imputation of missing data, disclosure control, and survey administration.
Hierarchical models (also called multilevel models, Empirical Bayes models,
and various other names) embody another approach that can be applied to
analysis of survey data. The emphasis of this approach is on explicitly
modeling the structure of the various survey units, rather than using a
predetermined estimator and assessing its properties with the given
design and population. See a brief
introduction and some leads into this area.
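As a rough illustration of the modeling idea described above, the following sketch shrinks unit-level sample means toward a common mean under a simple two-level normal model. It is not taken from any package listed on this page. The function name `shrink_means`, the data, and the assumption that the sampling variances and the between-unit variance `tau2` are known are all hypothetical simplifications; real multilevel software estimates these quantities from the data.

```python
def shrink_means(unit_means, sampling_vars, tau2):
    """Shrink each unit mean toward a precision-weighted overall mean.

    Assumes y_i ~ N(theta_i, V_i) and theta_i ~ N(mu, tau2), with V_i and
    tau2 treated as known (a simplification made for illustration only).
    """
    if len(unit_means) != len(sampling_vars):
        raise ValueError("need one sampling variance per unit mean")
    if tau2 <= 0 or any(v <= 0 for v in sampling_vars):
        raise ValueError("variances must be positive")

    # Precision weights used to estimate the overall mean mu.
    precisions = [1.0 / (v + tau2) for v in sampling_vars]
    mu = sum(p * y for p, y in zip(precisions, unit_means)) / sum(precisions)

    shrunk = []
    for y, v in zip(unit_means, sampling_vars):
        b = v / (v + tau2)  # shrinkage factor: closer to 1 when y is noisier
        shrunk.append(b * mu + (1.0 - b) * y)
    return mu, shrunk


# Hypothetical unit means and sampling variances.
unit_means = [10.0, 20.0, 30.0]
sampling_vars = [4.0, 4.0, 12.0]
mu, shrunk = shrink_means(unit_means, sampling_vars, tau2=4.0)
print(mu, shrunk)
```

The unit with the largest sampling variance is pulled furthest toward the overall mean, which is the behavior that distinguishes this approach from applying a fixed estimator to each unit separately.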
Imputation of missing data: Most
analysis software does not accommodate missing data; default handling of
missing data (such as deleting cases with any missing variables, or
"complete-case analysis") can generate biased, inefficient,
and inconsistent results. "Imputation" refers to any strategy for
completing ("filling in") missing data on survey items so that standard
methods can then be applied to analyze the completed datasets. Special
methods such as multiple imputation can then be used to obtain valid
standard errors reflecting the uncertainties of imputation.
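The way multiple imputation carries imputation uncertainty into the standard error can be sketched with the usual combining rules from Rubin (1987): the pooled estimate is the average of the completed-data estimates, and the total variance adds the between-imputation variance (inflated by 1 + 1/m) to the average within-imputation variance. The function name `pool_rubin` and the numbers below are hypothetical, and the sketch covers only the pooling step, not the imputation itself.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates, each with a variance")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1.0 + 1.0 / m) * b  # total variance
    return q_bar, sqrt(t)


# Hypothetical results from m = 5 completed datasets.
estimates = [10.0, 12.0, 11.0, 9.0, 13.0]
variances = [4.0, 4.0, 4.0, 4.0, 4.0]
pooled_estimate, pooled_se = pool_rubin(estimates, variances)
print(pooled_estimate, pooled_se)
```

In this example the pooled standard error exceeds the standard error of 2.0 from any single completed dataset, because the estimates vary across imputations.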
Texts on imputation and analysis with missing data include the following.
There is also a large literature in journals and proceedings, explicating
and evaluating various methods.
- Little, R.J.A. and Rubin, D.B., Statistical Analysis with Missing
Data (Second Edition), New York: Wiley, 2002.
- Rubin, D.B., Multiple Imputation for Nonresponse in Surveys,
New York: Wiley, 1987.
- Schafer, J.L., Analysis of Incomplete Multivariate Data, London:
Chapman and Hall (now New York: CRC Press), 1997.
- Software review: Horton, N. J. and Lipsitz, S. R.,
"Multiple Imputation in Practice", The American Statistician, 55(3):244-254, 2001. Compares Solas, SAS, MICE, and S-Plus implementations of imputation.
- See also Joseph Schafer's
Multiple Imputation FAQ Page for introductory explanations and further
Software for imputation includes the following. See also the software
summary and links by Stef van Buuren.
- SAS PROC MI and PROC MIANALYZE: multiple imputation
under a multivariate normal model, and analysis of multiply imputed datasets
using other SAS procedures. (Part of SAS Version 8 and above, as
experimental procedures.) See an introduction
to these procedures on the SAS web site.
- Software for multiple imputation by Joseph Schafer, available
for free download at
http://www.stat.psu.edu/~jls/misoftwa.html, includes both standalone
packages for Windows computers and functions to be used from S-Plus
(incorporated into recent releases of S-Plus). Programs include NORM
(for multivariate normal data), CAT (for categorical data), MIX (for
mixed normal and categorical data), and PAN (for clustered data,
including longitudinal data).
- Solas software from Statistical Solutions, Inc.
Standalone Windows package implementing several imputation methods.
- IVEware, described elsewhere on this site,
includes procedures for multiple imputation.
- Amelia for Missing Data: available for free download as a
stand-alone program for Windows computers, or a package for Gauss.
- MICE is available as functions running under S-Plus, free
- The Hmisc library for S-Plus, by Frank Harrell, includes missing
data functions and is available
- See also an extensive
bibliography on consistency-editing of survey (or administrative)
data, contributed by William Winkler. Also see his bibliography on
Disclosure control refers to methods
for preventing release of data that should be kept confidential. See the
site maintained by the Committee on Privacy and Confidentiality of
the American Statistical Association for extensive information,
bibliography, and Web links, including
software links and references.
Survey administration: The Association for Survey Computing (UK)
maintains a Register of Software
for Statistical and Social Survey Analysis, including a variety of
software for conducting and analysing surveys. In particular, see the listing by
primary function to select programs for survey administration. Note that
not all of the analysis packages listed are equipped to deal with typical
survey features such as clustering and disproportionate sampling.
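To see why disproportionate sampling matters for analysis, the following sketch compares an unweighted mean with a mean weighted by inverse selection probabilities. The strata, population counts, and the function name `weighted_mean` are hypothetical. The sketch covers point estimation only; it does not address variance estimation under clustering, which requires design-based software.

```python
def weighted_mean(values, weights):
    """Mean of values, weighting each by its sampling weight."""
    if len(values) != len(weights):
        raise ValueError("need one weight per value")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical design: stratum A has 900 population units and stratum B has
# 100, but two units are sampled from each, so B is heavily oversampled.
values = [2.0, 4.0, 10.0, 12.0]          # first two from A, last two from B
weights = [450.0, 450.0, 50.0, 50.0]     # population count / sample count

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(unweighted, weighted)
```

The unweighted mean gives the oversampled stratum half of the influence even though it makes up a tenth of the hypothetical population, so the two results differ substantially.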
The following are some links for survey administration software. Only a
very few specific packages are listed, mainly those from government or
major academic survey centers; there are hundreds of packages for
web surveys alone and many more for other types of surveys.
- Software for Web surveys is listed at WebSM.org,
hosted by the University of Ljubljana. The section on software includes a
listing of around 300 products, a guide on how to pick the appropriate
software, and a link to a recent comparison table.
- The American Evaluation Association maintains a list of vendors of "software that aids in the development,
distribution, scanning, and analysis of surveys/questionnaires".
- CSPro (Census and Survey Processing System) is a free Windows-based
public domain product for the production of survey (and census) data,
including data entry, editing and tabulation. It is developed and
supported by the International Programs Center (IPC), part of the
Population Division of the U.S. Bureau of the Census, with collaborating
- EpiInfo is a
free Windows-based program for form and database construction,
data entry, and analysis with epidemiologic statistics, maps, and
graphs. It is sponsored by the U.S. Centers for Disease Control.
Translations are available in 13 languages. See also the description of
the variance estimation features of EpiInfo on
package for survey administration and management was developed at Statistics
world distribution) and is distributed in the USA by Westat (contact
- Berkeley CASES (Computer-Aided Survey Execution System) is a full-featured
survey administration and management package from the Computer-assisted Survey Methods Program at
University of California, Berkeley.
Please send comments and suggestions regarding this page to Alan Zaslavsky,