A team of statisticians and a medical sociologist are developing new statistical methods to estimate more precisely the influence of one individual on another in a network, while testing and controlling for selection effects such as homophily. The challenge with network data is that it may contain multiple types of information, including network topology, nodal covariates, tie characteristics, and temporal change. The central problem is accounting for the complex correlation structure that arises because each actor in the network may play the dual role of focal actor (rater or responder) and alter (target or stimulus) and thus may appear in the data multiple times. Furthermore, outcomes may be correlated geographically and, if subjects are followed longitudinally, over time. Methodology for longitudinal analysis of network data is extremely valuable because longitudinal data provide the best opportunity for obtaining causal-type inferences in the absence of randomization or other instrumental variables. The specific areas of research include: (1) Development of methodology for longitudinal analysis of individual outcomes of actors. The objective of the analysis is to determine the causal effect, if any, of an alter adopting a certain health-related behavior or experiencing a certain outcome (e.g., obesity, heart attack) on the focal actor adopting or experiencing a similar behavior or outcome. Because the correlations between characteristics of actors contain important information on how effects propagate across a population, such models offer the potential to further the scientific understanding of network effects. (2) Development of methods for longitudinal analysis of observations made on distinct groups of connected actors (e.g., dyads, triads).
For example, suppose that distinct dyads are defined by the marriage of two individuals; a property of the tie, such as the quality of the marriage (e.g., measured by strength, mutual affection, or time spent together per day), may in turn be related to the actors’ obesity, the occurrence of health shocks, or obesity-related genes in the partners. Although this resembles egocentric analysis, the dependent variable and possibly some of the independent predictors are defined on groups of connected actors rather than on individual actors. (3) Development of methods for modeling the transition of dyadic data across time as a function of attributes of the actors and of network characteristics (e.g., clustering, transitivity). Here, the dependent variable is defined for all potential dyads, whether or not a relationship exists. In most substantive analyses, the dependent variable is an indicator of whether a tie exists at a given time, in which case we model the transition of the dyad between connected and unconnected states. However, we are also developing methods for the case where the dependent variable is more general (e.g., a count such as the number of patients shared between any two physicians in a network, or some other continuous-valued measure).
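
To make the data layout in area (3) concrete, the following is a minimal sketch, not the project's actual methodology. It assumes an undirected, binary network observed at two waves and a single categorical nodal attribute. The function names `dyad_transition_table` and `transition_rates` are hypothetical, as is the toy data. The sketch builds one row per potential dyad, whether or not a tie exists, and reports the raw rates of tie formation and dissolution. A transition model of the kind described above would go further and regress the later tie state on such covariates, conditional on the earlier state.

```python
import numpy as np

def dyad_transition_table(adj_prev, adj_next, attr):
    """One row per unordered dyad (i < j) of an undirected two-wave network.

    adj_prev, adj_next: symmetric 0/1 adjacency matrices at waves t-1 and t.
    attr: one categorical attribute per actor (used for a homophily indicator).
    """
    adj_prev = np.asarray(adj_prev)
    adj_next = np.asarray(adj_next)
    attr = np.asarray(attr)
    n = adj_prev.shape[0]
    i, j = np.triu_indices(n, k=1)  # all potential dyads, tie or no tie
    # (A @ A)[i, j] counts actors tied to both i and j at the earlier wave,
    # a simple transitivity-type network covariate.
    common = (adj_prev @ adj_prev)[i, j]
    return {
        "i": i,
        "j": j,
        "tie_prev": adj_prev[i, j],
        "tie_next": adj_next[i, j],
        "same_attr": (attr[i] == attr[j]).astype(int),
        "common_neighbors": common,
    }

def transition_rates(table):
    """Empirical formation and dissolution rates between the two waves."""
    prev, nxt = table["tie_prev"], table["tie_next"]
    unconnected = prev == 0
    connected = prev == 1
    formation = nxt[unconnected].mean() if unconnected.any() else float("nan")
    dissolution = (1 - nxt[connected]).mean() if connected.any() else float("nan")
    return formation, dissolution

# Hypothetical toy data: four actors, two waves.
adj_prev = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 0]])
adj_next = np.array([[0, 1, 1, 0],
                     [1, 0, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, 0]])
attr = np.array([0, 0, 1, 1])

table = dyad_transition_table(adj_prev, adj_next, attr)
formation, dissolution = transition_rates(table)
```

In this toy example the dyad (0, 2) is unconnected at the first wave but shares one neighbor, actor 1, and forms a tie by the second wave. Covariates such as `common_neighbors` are meant to capture this kind of pattern. Because each actor appears in several rows, the rows are not independent, which is the correlation problem described at the start of this summary.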

