A Roadmap for Disclosure Avoidance in the Survey of Income and Program Participation (2024)

Chapter: Appendix D: Technical Details for Geography Variables

Suggested Citation: "Appendix D: Technical Details for Geography Variables." National Academies of Sciences, Engineering, and Medicine. 2024. A Roadmap for Disclosure Avoidance in the Survey of Income and Program Participation. Washington, DC: The National Academies Press. doi: 10.17226/27169.

Appendix D

Technical Details for Geography Variables

Hierarchical modeling is often conveniently thought of in three stages: a data model (e.g., a likelihood), a process model, and a parameter model. Using the notation [Y|X] to denote the conditional distribution of Y given X, one may consider the following distributions: [data|process, parameters], [process|parameters], and [parameters]. Then, using Bayes’ rule, the posterior distribution of the process and parameters given the data can be expressed as [process, parameters|data] ∝ [data|process, parameters] × [process|parameters] × [parameters].

The well-known Fay-Herriot (FH) model (Fay & Herriot, 1979) can be expressed as

$$
\begin{aligned}
Z &= Y + \varepsilon, \\
Y &= X\beta + \xi,
\end{aligned}
$$

where $\varepsilon \sim N(0, D)$, $D = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2)$, $\xi \sim N(0, \sigma_\xi^2)$, $Z$ are the direct estimators, and $X$ are known covariates with associated parameters $\beta_k$ ($k = 0, 1, \ldots, p$). In this context, $\sigma_i^2$ is the sampling error variance for area $i$ ($i = 1, \ldots, N$) and $p$ is the number of covariates ($\beta_0$ is the intercept). Thus, the model can be expressed hierarchically, with $[Z \mid Y, \beta, \sigma_\xi^2]$ the data model given the process and parameters, $[Y \mid \beta, \sigma_\xi^2]$ the distribution of the process given the parameters, and $[\beta, \sigma_\xi^2]$ the distribution of the parameters.
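The two-stage structure can be illustrated with a small simulation. The following is a minimal sketch, not the report's implementation: it assumes the parameters β and σ_ξ² are known, treats the elements of ξ as independent, and uses made-up values for the number of areas, covariates, and variances. Under those assumptions, normal-normal conjugacy gives the conditional distribution of each Y_i given Z_i in closed form as a precision-weighted compromise between the direct estimate and the regression prediction.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical setup: N areas, an intercept plus one covariate (p = 1).
N = 8
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta = np.array([1.0, 0.5])              # illustrative values, treated as known
sigma2_xi = 0.4                          # model variance, treated as known
sigma2 = rng.uniform(0.1, 2.0, size=N)   # known sampling variances (diagonal of D)

# Process model: Y = X beta + xi
Y = X @ beta + rng.normal(scale=np.sqrt(sigma2_xi), size=N)
# Data model: Z = Y + eps, with eps ~ N(0, D)
Z = Y + rng.normal(scale=np.sqrt(sigma2), size=N)

# Conditional on (beta, sigma2_xi), each area has
#   Y_i | Z_i ~ N(gamma_i * Z_i + (1 - gamma_i) * x_i' beta, gamma_i * sigma_i^2),
# where gamma_i = sigma2_xi / (sigma2_xi + sigma_i^2).
gamma = sigma2_xi / (sigma2_xi + sigma2)
post_mean = gamma * Z + (1.0 - gamma) * (X @ beta)
post_var = gamma * sigma2
```

Areas with large sampling variance receive a small weight gamma_i, so their estimates are pulled more strongly toward the regression prediction. In a full Bayesian analysis, β and σ_ξ² would be given priors and integrated over rather than fixed.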

Here is a description of the multivariate spatio-temporal mixed effects model (MSTM) (Bradley et al., 2015). For ease of exposition, this discussion presents the multivariate spatial case. Details surrounding the spatio-temporal case can be found in Bradley et al. (2015). Similar to the FH model, one has

$$
\begin{aligned}
Z &\sim \mathrm{MVN}(Y, D), \\
Y &= X\beta + S\eta + \xi.
\end{aligned}
$$

Here, MVN denotes the multivariate normal distribution, $D$ contains the known sampling error variances, $S$ is a matrix of spatial basis functions with associated coefficients given by the elements of $\eta$ (see Bradley et al., 2015, for a comprehensive discussion), and $\xi \sim \mathrm{MVN}(0, \sigma_\xi^2 I)$ represents an additional error term that captures fine-scale variation. Importantly, the FH model can be viewed as a special case of the MSTM.
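One way to see the special-case relationship is to simulate from the spatial model and then drop the basis-function term, which leaves the FH form Y = Xβ + ξ. The sketch below is illustrative only: the Gaussian radial basis functions, the one-dimensional area locations, and all numeric values are assumptions standing in for the basis functions discussed by Bradley et al. (2015), and `simulate` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)

n, r = 50, 5                          # n areas, r basis functions (hypothetical sizes)
locs = np.linspace(0.0, 1.0, n)       # stand-in area centroids on a line
knots = np.linspace(0.0, 1.0, r)

# Assumed Gaussian radial basis functions; column j is centered at knots[j].
S = np.exp(-0.5 * ((locs[:, None] - knots[None, :]) / 0.2) ** 2)

X = np.column_stack([np.ones(n), locs])
beta = np.array([2.0, -1.0])          # illustrative regression coefficients
eta = rng.normal(size=r)              # basis coefficients (illustrative draw)
sigma2_xi = 0.05                      # fine-scale variance
d = rng.uniform(0.05, 0.5, size=n)    # known sampling variances (diagonal of D)

def simulate(S_mat, eta_vec):
    """Draw (Y, Z) from Y = X beta + S eta + xi and Z ~ MVN(Y, D)."""
    xi = rng.normal(scale=np.sqrt(sigma2_xi), size=n)
    Y = X @ beta + S_mat @ eta_vec + xi
    Z = Y + rng.normal(scale=np.sqrt(d), size=n)
    return Y, Z

Y_mstm, Z_mstm = simulate(S, eta)

# With the basis-function term set to zero, the process model is Y = X beta + xi,
# which is the FH form with a diagonal D.
Y_fh, Z_fh = simulate(np.zeros((n, r)), eta)
```

The term Sη lets nearby areas share information through a small number of coefficients, while ξ absorbs variation that the basis functions do not capture.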

For the unit-level case, one first considers a model for an ignorable design (Battese et al., 1988). Specifically, consider the linear mixed-effects model

$$
y_{ij} = x_{ij}\beta + v_i + \varepsilon_{ij},
$$

where $y_{ij}$ is the response for unit $j$ in area $i$ ($i = 1, \ldots, m$), $x_{ij}$ are fixed covariates associated with unit $j$ in area $i$, $\beta$ is the vector of associated regression coefficients, and $v_i$ is the area-level random effect for area $i$; the $v_i$ are iid normal with mean zero and variance $\sigma_v^2$. Finally, the $\varepsilon_{ij}$ are iid normally distributed sampling-error random effects with mean zero and variance $\sigma^2$. Importantly, this model can be rewritten in the form of a hierarchical model.
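Written in the three-stage notation used earlier in this appendix, the unit-level model takes the following form. This is a sketch of one natural arrangement; the source does not specify the prior in the parameter model, so it is left generic here.

```latex
\begin{aligned}
\text{Data model:} \quad
  & y_{ij} \mid \beta, v_i, \sigma^2 \;\sim\; N\!\left(x_{ij}\beta + v_i,\; \sigma^2\right)
    \quad \text{independently for all } i, j, \\
\text{Process model:} \quad
  & v_i \mid \sigma_v^2 \;\overset{\text{iid}}{\sim}\; N\!\left(0,\; \sigma_v^2\right),
    \quad i = 1, \ldots, m, \\
\text{Parameter model:} \quad
  & [\beta, \sigma_v^2, \sigma^2].
\end{aligned}
```

Multiplying the three stages gives the posterior $[v, \beta, \sigma_v^2, \sigma^2 \mid y]$ up to a normalizing constant, where $v = (v_1, \ldots, v_m)$.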

In the case of an informative sample design, one path forward proceeds through the Bayesian pseudo-likelihood (PL; Savitsky & Toth, 2016). The PL is given by

$$
\mathrm{PL}(\theta) = \prod_{i} f(y_i \mid \theta)^{w_i},
$$

where unit $i$ ranges over the sample and $w_i$ denotes the sample weight for unit $i$, scaled to sum to the sample size. Combined with a suitable prior distribution on the model parameters $\theta$, this leads to a pseudo-posterior distribution.
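The weighting step can be made concrete on the log scale, where the exponent becomes a multiplier: log PL(θ) = Σ_i w_i log f(y_i | θ). The sketch below is a minimal illustration, not the method of Savitsky and Toth (2016) in full; the normal density with known standard deviation, the sample values, the raw weights, and the function names `scale_weights` and `log_pseudo_likelihood` are all assumptions made for the example.

```python
import numpy as np

def scale_weights(w):
    """Rescale sample weights so that they sum to the sample size n."""
    w = np.asarray(w, dtype=float)
    return w * (w.size / w.sum())

def log_pseudo_likelihood(theta, y, w, sigma=1.0):
    """Return sum_i w_i * log f(y_i | theta) for an assumed N(theta, sigma^2) density.

    The weights are rescaled internally, so raw or pre-scaled weights both work.
    """
    y = np.asarray(y, dtype=float)
    log_f = -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((y - theta) / sigma) ** 2
    return float(np.sum(scale_weights(w) * log_f))

# Hypothetical sample values and raw survey weights.
y = np.array([1.2, 0.7, 3.1, 2.4])
w_raw = np.array([100.0, 250.0, 50.0, 100.0])

w = scale_weights(w_raw)
# For this density, the pseudo-likelihood is maximized at the weighted mean.
theta_hat = np.sum(w * y) / np.sum(w)
```

Adding a log prior for θ to `log_pseudo_likelihood` gives the log pseudo-posterior up to an additive constant, which could then be passed to a sampler or optimizer. With equal weights, the scaled weights are all one and the expression reduces to the ordinary log-likelihood.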
