Summary of survey software:
Types of designs that can be accommodated.
- stratified designs;
- cluster sampling;
- unequal probabilities of selection (sampling weights);
- multiple stages of sampling, with stratification, clustering and finite population corrections at each stage;
- finite-population corrections can be calculated for simple
random sampling without replacement of sampling units within
- post-stratification and direct standardization;
- some designs with a single PSU per stratum
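As an illustration of how stratification, sampling weights, and a finite-population correction combine in an estimate, the following Python sketch computes a stratified total and its standard error. It is not Stata code and does not reproduce Stata's implementation. It assumes simple random sampling without replacement within each stratum, and the function name `stratified_total` and the example data are hypothetical.

```python
from statistics import variance


def stratified_total(samples, pop_sizes):
    """Estimate a population total and its standard error.

    samples   : dict mapping stratum label -> list of observed values
    pop_sizes : dict mapping stratum label -> stratum population size N_h
    Assumes simple random sampling without replacement within each stratum.
    """
    total = 0.0
    var = 0.0
    for h, y in samples.items():
        n_h = len(y)
        N_h = pop_sizes[h]
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least 2 sampled units")
        weight = N_h / n_h        # sampling weight = 1 / selection probability
        total += weight * sum(y)  # weighted stratum total
        fpc = 1.0 - n_h / N_h     # finite-population correction
        var += N_h ** 2 * fpc * variance(y) / n_h
    return total, var ** 0.5


# Hypothetical example: two strata with known population sizes.
samples = {"urban": [10, 12, 14], "rural": [20, 22]}
pop_sizes = {"urban": 30, "rural": 10}
est, se = stratified_total(samples, pop_sizes)
print(est, se)
```

The correction factor `1 - n_h / N_h` shrinks each stratum's variance contribution as the sampled fraction grows, so a stratum that is fully enumerated contributes no sampling variance.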
Types of estimands and statistical analyses that can be accommodated.
There are currently about 50 Stata commands for various analyses
of survey data, including the following analyses and others:
- Estimation of means, totals, ratios, and proportions.
- Linear regression, logistic regression, and probit; also, tobit,
interval, censored, instrumental variables, multinomial logit, ordered
logit and probit, and Poisson. Point estimates, associated standard
errors, confidence intervals, and design effects for the full population
or subpopulations are displayed. Auxiliary commands will display all
this information for linear combinations (e.g., differences) of
estimators, and conduct hypothesis tests.
- Contingency tables with Rao-Scott corrections of chi-squared tests.
Restrictions on number of variables or observations.
Maximum number of observations is 2,147,483,647, which in reality will
be limited by computer RAM (virtual memory can be used, but commands run
slower). Maximum number of variables is 32,767 with Stata/SE and 2047
with Intercooled Stata.
Primary methods used for variance estimation.
- Linearization estimator
- The jackknife (with either the set of weights specified by the
user, or through direct calculation)
- BRR (with either the set of weights specified by the user, or with
a Hadamard matrix specified by the user) with Fay corrections
- Bootstrap estimation can be performed with user-written add-ons
- The type of variance estimator can be specified as a design option
(and thus saved with data), or requested at time of estimation
- Warnings are issued for strata with a single PSU or no PSUs in
domain estimation. In the former case, standard errors can be
calculated by treating singleton PSUs as selected with
certainty or by scaling variances from other strata, or they
can be reported as missing
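To make the BRR option with a Hadamard matrix and Fay correction more concrete, here is a minimal Python sketch for a design with exactly two PSUs per stratum. It is an illustration under stated assumptions, not Stata's implementation: the function names (`sylvester_hadamard`, `brr_variance`, `weighted_total`) and the record layout are hypothetical, and the Hadamard matrix is built with Sylvester's construction, so its order is always a power of 2.

```python
def sylvester_hadamard(order):
    """Hadamard matrix of +1/-1 entries; order must be a power of 2."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of 2")
    H = [[1]]
    while len(H) < order:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def weighted_total(weights, values):
    return sum(w * y for w, y in zip(weights, values))


def brr_variance(records, n_strata, estimator=weighted_total, fay=0.0):
    """Balanced repeated replication with Fay's adjustment.

    records : list of (stratum, psu, weight, y), with stratum in
              0..n_strata-1 and psu in {0, 1} (two PSUs per stratum).
    fay     : Fay coefficient k, 0 <= k < 1; k = 0 gives standard BRR.
    """
    # Smallest power of 2 strictly greater than n_strata, so that the
    # all-ones column 0 can be skipped and each stratum gets its own column.
    order = 1
    while order <= n_strata:
        order *= 2
    H = sylvester_hadamard(order)

    values = [y for (_, _, _, y) in records]
    full = estimator([w for (_, _, w, _) in records], values)

    sq_dev = 0.0
    for row in H:
        rep_weights = []
        for (h, psu, w, _) in records:
            # +1 selects PSU 0 of stratum h, -1 selects PSU 1.
            selected = (row[h + 1] == 1) == (psu == 0)
            rep_weights.append(w * ((2.0 - fay) if selected else fay))
        sq_dev += (estimator(rep_weights, values) - full) ** 2
    return full, sq_dev / (len(H) * (1.0 - fay) ** 2)


# Hypothetical data: 3 strata, 2 PSUs each, one observation per PSU.
records = [
    (0, 0, 2.0, 5.0), (0, 1, 2.0, 7.0),
    (1, 0, 3.0, 4.0), (1, 1, 3.0, 6.0),
    (2, 0, 1.0, 10.0), (2, 1, 1.0, 13.0),
]
est, var_brr = brr_variance(records, n_strata=3)
_, var_fay = brr_variance(records, n_strata=3, fay=0.5)
print(est, var_brr, var_fay)
```

For a linear estimator such as a total, the balanced replicates reproduce the sum over strata of the squared difference between the two weighted PSU totals, with or without the Fay adjustment. The adjustment matters mainly for nonlinear estimators, where it keeps every PSU in each replicate instead of giving half of them zero weight.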
General description of the "feel" of the software.
Stata is a complete statistical software package with full statistical,
data management, and graphical capabilities. It can be run interactively
or in batch mode, and is fully programmable. The survey commands are
part of the standard software package. Initially, data can be read in
from ASCII files and a Stata-format data file created; or data in other
file formats can be translated to Stata format using a stand-alone
software package (Stat/Transfer or DBMS/Copy). Support can be obtained
from the vendor or through an active mailing list. A large repository of
user-contributed modules (over 1000) is available online at the Statistical
Software Components Archive. Sampling capabilities are limited to SRS,
although there are a number of user-contributed routines for PPS
sampling. The manuals, although bulky, are extremely informative and
can serve as introductory reading.
Platforms on which the software can be run.
- Windows (all current versions from 2000 to Vista; 32-bit and
64-bit Windows varieties for x86, x86-64, and Itanium);
- Power Macintosh (OS X 10.4 or later);
- Linux (any x86, x86-64, Intel Itanium, or compatible running
Linux; 32-bit and 64-bit varieties; tested and supported operating
systems include Red Hat 6.1 (or later), SuSE 9 (or later), Fedora
Core 1 (or later), RHEL 4);
- Alpha AXP running Digital Unix;
- HP-9000 with HP-UX;
- IBM RS/6000 running AIX (32- and 64-bit varieties);
- SGI running Irix 6.5;
- Sun Solaris on 64-bit SPARC or 64-bit x86-64
Software is distributed as a precompiled object program. Updates are
distributed over HTTP; minor updates include the script ("ado") and
help files and are usually issued every two to three weeks;
major updates to the executable file are usually made approximately
once in a quarter. The software checks the availability of updates on
Current version is Stata 10 (as of June 2007). A multiprocessor
parallel-processing version of Stata is available.
Availability, pricing and terms.
One-time purchase with perpetual license. Upgrade purchases are
optional. Generous academic discount. Volume discounts and student
discounts. Different "flavors" depending on the data set size
requirements, and parallel processing capabilities.
Example: University price for one single-user copy: from $540 for the
basic IC version and basic documentation set to $985 for extended SE
version and full documentation set.
4905 Lakeway Drive
College Station, TX 77845
This software is discussed in the review
article from The Survey Statistician.
Other relevant Stata survey links:
Thanks to Stas Kolenikov for major contributions to this page.