Survey Software: Estimands summary

Summary of survey software: Estimands and Statistical Analyses Accommodated

This is a summary of the information included under the heading "Types of estimands and statistical analyses that can be accommodated" for each of the software packages described on these pages. Select the appropriate title for more information on any package.

AM Software

AM Software was developed particularly for analysis of data from educational surveys (such as the National Assessment of Educational Progress). It includes a number of analyses under the rubric of Marginal Maximum Likelihood (MML), based on test theory (Item Response Theory) and used particularly to analyze data in which different subjects complete different subscales of a test. Procedures in this group include MML Regression, MML Means, MML Table (Ordinal), MML Table (Nominal), MML Composite Means, MML Composite Regression, NALS Table, and NAEP Table.

It also includes more standard analyses including Frequencies, Descriptives, Correlations, (Linear) Regression, Percentiles, Probit and Logit Regression.

Data manipulation capabilities include the ability to recode or calculate variables.

New features currently in a Beta-test version (available for download) include:

Graphics: bar charts, line charts, and the new Sectioned Density Plot designed to compare distributions.
New import/export facilities that enable you to easily import/export data to/from nearly 150 different data file formats, as well as over ODBC.
Sample-design consistent Wald tests of model fit for all regression models, including the point-and-click ability to test the significance of subsets of regressors.
A new Mantel-Haenszel stratified chi-square test of the type typically used to evaluate differential item functioning on tests. As with all AM procedures, this one provides significance tests that are appropriate for complex sample designs.

Bascula

Bascula computes adjustment weights using auxiliary variables. It incorporates various weighting techniques. If only categorical auxiliary variables are used, the simplest technique is complete poststratification. For incomplete poststratification, Bascula offers a choice between linear weighting (based on the general regression estimator) and multiplicative weighting (based on iterative proportional fitting). Linear weighting can also be applied if one or more of the auxiliary variables is a quantitative variable.
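The multiplicative weighting mentioned above (iterative proportional fitting, often called raking) can be sketched in a few lines. This is an illustration in Python, not Bascula's own implementation; the function name `rake`, the respondent records, and the population margins are all hypothetical.

```python
def rake(weights, categories, margins, max_iter=100, tol=1e-10):
    """Multiplicative weighting by iterative proportional fitting.

    weights    -- starting (design) weights, one per respondent
    categories -- one dict per respondent: {variable: category}
    margins    -- {variable: {category: known population total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            current = {cat: 0.0 for cat in targets}
            for wi, row in zip(w, categories):
                current[row[var]] += wi
            # Scale the weights so this variable's margin is matched exactly.
            for i, row in enumerate(categories):
                factor = targets[row[var]] / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break  # all margins already matched within tolerance
    return w


# Hypothetical respondents and population margins (illustration only).
respondents = [
    {"sex": "f", "age": "young"},
    {"sex": "f", "age": "old"},
    {"sex": "m", "age": "young"},
    {"sex": "m", "age": "old"},
    {"sex": "m", "age": "old"},
]
margins = {
    "sex": {"f": 520.0, "m": 480.0},
    "age": {"young": 400.0, "old": 600.0},
}
raked = rake([200.0] * 5, respondents, margins)
```

Each pass adjusts the weights to one margin at a time, so only the marginal totals of the auxiliary variables are needed, not the full cross-classification that complete poststratification requires.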

The program can calculate estimates of population totals, means, and ratios.

CENVAR

Totals, means, ratios, proportions for total population and domains; output includes estimated value of the parameter, standard error, coefficient of variation, 95% confidence interval, design effect (DEFF), and number of observations upon which the estimate is based.
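The output quantities listed above are related by simple formulas, sketched below in Python. This is not CENVAR code: the function name `summarize` is hypothetical, the confidence interval assumes a normal approximation (z = 1.96), and the design effect is taken as the ratio of the design-based variance to the variance expected under simple random sampling with the same number of observations.

```python
import math


def summarize(estimate, variance, srs_variance, z=1.96):
    """Derive standard error, CV, 95% CI and DEFF from variance estimates.

    estimate     -- point estimate of the parameter
    variance     -- design-based variance estimate of that estimate
    srs_variance -- variance under simple random sampling, same sample size
    """
    se = math.sqrt(variance)
    return {
        "estimate": estimate,
        "se": se,
        "cv": se / abs(estimate),  # coefficient of variation
        "ci95": (estimate - z * se, estimate + z * se),
        "deff": variance / srs_variance,  # design effect
    }


# Hypothetical inputs (illustration only).
result = summarize(estimate=50.0, variance=4.0, srs_variance=2.0)
```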

CLUSTERS

Computes sampling errors and derived statistics such as design effects and intra-cluster correlations for ratios and their differences over population subclasses.
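The link between a design effect and an intra-cluster correlation is commonly expressed through Kish's approximation deff = 1 + (b - 1) * roh, where b is the average number of sample elements per cluster. The Python sketch below solves that relation for roh; it is an assumption that this matches the formula CLUSTERS uses, and the function name is hypothetical.

```python
def intra_cluster_correlation(deff, avg_cluster_size):
    """Solve deff = 1 + (b - 1) * roh for roh (Kish's approximation)."""
    if avg_cluster_size <= 1:
        raise ValueError("average cluster size must exceed 1")
    return (deff - 1.0) / (avg_cluster_size - 1.0)


# Hypothetical values: design effect 2.5, 16 sample elements per cluster.
roh = intra_cluster_correlation(deff=2.5, avg_cluster_size=16)
```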

Epi Info

Means, proportions, odds ratios, risk ratios, risk differences.
Epi Info also includes a wide variety of other estimation modules, not necessarily designed for survey data estimation, and there is a related mapping program, EpiMap.

Generalized Estimation System

The focus of this software is on calibration estimation using generalized regression (GREG) estimator theory.

Main functions are: calculation of sample design weights, calculation of g-weights under a calibration approach, calculation of calibration estimates, and calculation of synthetic estimates.
Estimation of totals, averages, and ratios, for universe or domains.
Auxiliary variables are used for estimation through the Generalized Regression (GREG) approach. This framework permits a large family of estimators including the traditional separate, combined and post-stratified estimators.
Synthetic estimates can also be produced from auxiliary information for each domain of interest.
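As a rough illustration of the calibration step, the Python sketch below computes g-weights for the simplest GREG setup: an intercept plus one quantitative auxiliary variable, with a linear (chi-square) distance and no bounds on the weights. It is not code from the Generalized Estimation System; the function name, data, and known totals are hypothetical.

```python
def greg_weights(d, x, pop_size, pop_total_x):
    """Calibrate design weights d to a known population size and total of x.

    Returns (calibrated weights, g-weights), where each calibrated weight is
    d_k * g_k and g_k = 1 + lam0 + lam1 * x_k.
    """
    # Design-weighted sums forming M = sum d_k * z_k z_k', with z_k = (1, x_k).
    s0 = sum(d)
    s1 = sum(di * xi for di, xi in zip(d, x))
    s2 = sum(di * xi * xi for di, xi in zip(d, x))
    det = s0 * s2 - s1 * s1
    # Gap between the known totals and their design-weighted estimates.
    gap0 = pop_size - s0
    gap1 = pop_total_x - s1
    # lam = M^{-1} * gap, with the 2x2 inverse written out explicitly.
    lam0 = (s2 * gap0 - s1 * gap1) / det
    lam1 = (-s1 * gap0 + s0 * gap1) / det
    g = [1.0 + lam0 + lam1 * xi for xi in x]
    return [di * gi for di, gi in zip(d, g)], g


# Hypothetical sample of four units (illustration only).
d = [10.0, 10.0, 10.0, 10.0]  # sample design weights
x = [1.0, 2.0, 3.0, 4.0]      # auxiliary variable
y = [3.0, 5.0, 6.0, 9.0]      # study variable
w, g = greg_weights(d, x, pop_size=42.0, pop_total_x=110.0)
calibrated_total_y = sum(wi * yi for wi, yi in zip(w, y))
```

By construction the calibrated weights reproduce the known population size and the known total of x; the same weights are then applied to any study variable.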

IVEware

Descriptive statistics including means, proportions, subgroup differences, linear contrasts.
Multiple imputation for missing data.
A variety of SAS procedures can be run under IVEware, including CALIS, CATMOD, GENMOD, LIFEREG, MIXED, NLIN, PHREG, and PROBIT for linear, logistic, Poisson, survival, and polytomous regression models. These runs incorporate survey design-based variance estimation and/or multiple imputation analysis for missing data.

PC CARP

Constructs estimates and standard errors for totals, means, quantiles, ratios, difference of ratios and entries in two-way tables. Weighted regression equations can also be estimated.

Add-on modules calculate logistic regressions and estimation with poststratification.

R

mean, quantiles, variance, tables, ratios, totals
graphics: scatterplots, smoothers, boxplots, barcharts
generalised linear models (e.g. linear regression, logistic regression, Poisson models, etc.)
proportional hazards models
proportional odds and other cumulative link models
survival curves
post-stratification, raking, and calibration
tests of association in two-way tables
loglinear models for multiway tables

SAS

SAS/STAT Software provides the SURVEYSELECT procedure for sample selection and the SURVEYMEANS and SURVEYREG procedures for producing descriptive statistics and regression estimates, respectively. These three procedures are available in SAS versions 8 and higher. Beginning with SAS 9, SAS/STAT also includes the SURVEYFREQ procedure for computing crosstabulations and tests of association, and the SURVEYLOGISTIC procedure for performing logistic regression. The analysis procedures can accommodate complex survey designs that include stratification, clustering, and unequal weighting.

The SURVEYSELECT procedure provides a variety of methods for selecting probability-based random samples. The procedure can select a simple random sample, or samples with design features such as stratification, clustering or multistage sampling, or unequal probabilities of selection. It can accommodate very large sampling frames. It can draw a replicated sample, i.e. a sample composed of a set of replicates, each selected in the same way.
PROC SURVEYSELECT accepts the sampling frame as a SAS data set. Control language specifies the selection methods, the desired sample size or sampling rate, and other parameters. The output data set contains the selected units, with selection probabilities and sampling weights.
The SURVEYMEANS procedure estimates population totals, means, and ratios (SAS 8.2 and later), with estimates of their variances, confidence limits, and other descriptive statistics, under sample designs that may include stratification, clustering, and unequal weighting.
The SURVEYREG procedure estimates regression coefficients by generalized least squares, using elementwise regression, assuming that the regression coefficients are the same across strata and PSUs.
The SURVEYLOGISTIC procedure fits logistic regression models for discrete response survey data by maximum likelihood, incorporating the sample design into the analysis.
The SURVEYFREQ procedure produces one-way to n-way frequency and crosstabulation tables from sample survey data. These tables include estimates of population totals, population proportions, and corresponding standard errors. Confidence limits, coefficients of variation, and design effects are also available, as are tests of independence (Wald test, Rao-Scott likelihood ratio test, Rao-Scott chi-square test).
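To make concrete what a selection step produces (selected units together with their selection probabilities and sampling weights), here is a minimal Python sketch of stratified simple random sampling without replacement. It is not SAS code and does not reproduce PROC SURVEYSELECT; the function name, frame, and sample sizes are hypothetical.

```python
import random


def stratified_srs(frame, stratum_of, n_per_stratum, seed=0):
    """Select a simple random sample without replacement in each stratum.

    frame         -- list of sampling units
    stratum_of    -- function mapping a unit to its stratum label
    n_per_stratum -- {stratum label: desired sample size}
    """
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for label, units in strata.items():
        n_h = min(n_per_stratum[label], len(units))
        prob = n_h / len(units)  # equal selection probability within stratum
        for unit in rng.sample(units, n_h):
            sample.append({
                "unit": unit,
                "stratum": label,
                "selection_prob": prob,
                "weight": 1.0 / prob,  # sampling weight = inverse probability
            })
    return sample


# Hypothetical frame: 60 urban and 40 rural units.
frame = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
selected = stratified_srs(frame, lambda u: u[0], {"urban": 6, "rural": 8})
```

The sampling weights sum to the frame size within each stratum, which is what lets later analysis steps estimate population totals.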

SPSS

Complex Samples Plan module: Specifies design information for sample selection and/or analysis. (See "Designs" section for designs that are supported.) The file created by this module is used by all other modules.
Complex Samples Selection module: Chooses units according to a sample design specified by Complex Samples Plan.
Complex Samples Frequencies module: Cell counts and proportions with standard errors.
Complex Samples Descriptives module: Estimates sums, means, and ratios with standard errors and design effects, for whole population or subpopulations.
Complex Samples Crosstabs module: One- or two-way tabulations with standard errors, design effects, coefficients of variation, odds ratios and/or relative risks, and tests of independence, taking into account the complex survey design.
Complex Samples General Linear Model module: Linear regression models including analysis of variance and analysis of covariance models. Model parameters with design-corrected standard errors, t-tests and Wald F and chi-square tests, adjustments for multiple comparisons.
Complex Samples Logistic Regression module: Binary and multinomial logistic regression models, with similar options for linear predictor specification to CSGLM.

Note: above information is for SPSS 13.0; SPSS 12 supports a more restricted set of features.

Stata

There are currently about 50 Stata commands for various analyses of survey data, including the following analyses and others:

Estimation of means, totals, ratios, and proportions.
Linear regression, logistic regression, and probit; also, tobit, interval, censored, instrumental variables, multinomial logit, ordered logit and probit, and Poisson. Point estimates, associated standard errors, confidence intervals, and design effects for the full population or subpopulations are displayed. Auxiliary commands will display all this information for linear combinations (e.g., differences) of estimators, and conduct hypothesis tests.
Contingency tables with Rao-Scott corrections of chi-squared tests.

SUDAAN

SUDAAN includes the following statistical procedures:

MULTILOG: Fits multinomial logistic regression models to ordinal and nominal categorical data and computes hypothesis tests for model parameters. Estimates odds ratios and their 95% confidence intervals for each model parameter. Has GEE (Generalized Estimating Equation) modeling capabilities for efficient parameter estimation.
REGRESS: Fits linear regression models to continuous outcomes and performs hypothesis tests concerning the model parameters.
LOGISTIC: Fits logistic regression models to binary data and computes hypothesis tests for model parameters. Estimates odds ratios and their 95% confidence intervals for each model parameter.
SURVIVAL: Fits proportional hazards (Cox regression) models to failure time data. Estimates hazard ratios and their 95% confidence intervals for each model parameter.
CROSSTAB: Computes frequencies, percentage distributions, odds ratios, relative risks, and their standard errors (or confidence intervals) for user-specified cross-tabulations, as well as chi-square tests of independence and the Cochran-Mantel-Haenszel chi-square test for stratified two-way tables.
DESCRIPT: Computes estimates of means, totals, proportions, percentages, geometric means, quantiles, and their standard errors. Also computes standardized estimates and tests of single degree-of-freedom contrasts among levels of a categorical variable.
RATIO: Computes estimates and standard errors of generalized ratios of the form (Summation y) / (Summation x), where x and y are observed variables. Also computes standardized estimates and tests single-degree-of-freedom contrasts among levels of a categorical variable.
The EFFECT statement allows users to specify contrasts of regression coefficients and hypothesis tests using simple effect names.
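The generalized ratio described under RATIO, and a standard error for it, can be sketched with Taylor series linearization. The Python example below is not SUDAAN code: it assumes a single stratum with primary sampling units (PSUs) treated as sampled with replacement, and the function name and data are hypothetical.

```python
import math
from collections import defaultdict


def ratio_estimate(y, x, w, psu):
    """Weighted ratio (sum w*y) / (sum w*x) with a linearization SE.

    Assumes one stratum and with-replacement sampling of PSUs.
    """
    y_hat = sum(wi * yi for wi, yi in zip(w, y))
    x_hat = sum(wi * xi for wi, xi in zip(w, x))
    r = y_hat / x_hat
    # PSU totals of the linearized variable w * (y - r * x) / x_hat.
    totals = defaultdict(float)
    for yi, xi, wi, p in zip(y, x, w, psu):
        totals[p] += wi * (yi - r * xi) / x_hat
    n = len(totals)
    if n < 2:
        raise ValueError("at least two PSUs are needed")
    mean = sum(totals.values()) / n
    variance = n / (n - 1) * sum((t - mean) ** 2 for t in totals.values())
    return r, math.sqrt(variance)


# Hypothetical data: four observations in two PSUs (illustration only).
ratio, ratio_se = ratio_estimate(
    y=[3.0, 5.0, 6.0, 9.0],
    x=[1.0, 2.0, 3.0, 4.0],
    w=[10.0, 10.0, 10.0, 10.0],
    psu=["a", "a", "b", "b"],
)
```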

VPLX

VPLX calculates summary statistics (means, proportions, and totals for the entire sample or by subclasses) and their standard errors. It can be used to calculate a valid t-test. Arithmetical transformations of the data can be specified in the command language, which means that standard errors can be calculated for arbitrary sums, differences, products, and quotients.
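One way to obtain standard errors for an arbitrary function of weighted sums is replication: recompute the statistic on each replicate and measure the spread. The Python sketch below uses a delete-one-PSU jackknife in a single stratum as an assumed example of the approach; it is not VPLX command language, and the function names and data are hypothetical.

```python
import math


def jackknife_se(psu_data, statistic):
    """Delete-one-PSU jackknife for any statistic of weighted observations.

    psu_data  -- list of PSUs, each a list of (weight, record) pairs
    statistic -- function taking a flat list of (weight, record) pairs
    """
    n = len(psu_data)
    if n < 2:
        raise ValueError("at least two PSUs are needed")
    full = statistic([obs for psu in psu_data for obs in psu])
    replicates = []
    for drop in range(n):
        # Drop one PSU and inflate the remaining weights by n / (n - 1).
        kept = [
            (w * n / (n - 1), rec)
            for i, psu in enumerate(psu_data) if i != drop
            for (w, rec) in psu
        ]
        replicates.append(statistic(kept))
    variance = (n - 1) / n * sum((r - full) ** 2 for r in replicates)
    return full, math.sqrt(variance)


def weighted_quotient(obs):
    """Quotient of two weighted totals, as an example of a derived statistic."""
    num = sum(w * rec["y"] for w, rec in obs)
    den = sum(w * rec["x"] for w, rec in obs)
    return num / den


# Hypothetical data: three PSUs (illustration only).
psus = [
    [(1.0, {"y": 3.0, "x": 1.0}), (1.0, {"y": 5.0, "x": 2.0})],
    [(1.0, {"y": 6.0, "x": 3.0})],
    [(1.0, {"y": 9.0, "x": 4.0}), (1.0, {"y": 4.0, "x": 2.0})],
]
quotient, quotient_se = jackknife_se(psus, weighted_quotient)
```

Because the statistic is passed in as a function, the same routine covers sums, differences, products, and quotients without deriving a separate variance formula for each.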

WesVar

Estimates from tables (up to 8-way), including totals, means, percentages, test of independence, and user-specified functions of variables or estimates in cells of the table.
Estimates of medians and other quantiles.
Regression analysis: linear regression, logistic (dichotomous and polychotomous) regression, and ANOVA. Parameter estimates and tests of hypotheses.
