Nested g-computation procedure

What is the difference in health care cost when two different treatments are used?  This question is challenging because cumulative health care cost is often censored, either by death or by a lack of continuous enrollment.  Lin (2000) addressed this issue (see paper and my blog write-up).

The problem with this approach, however, is that it compares costs on an intent-to-treat basis and largely ignores any changes in treatment or confounders over time.

A recent paper by Spieker, Roy and Mitra (2018) tries to address this limitation using a nested g-computation procedure.  Implementing the procedure requires four steps:

1. Select and fit a model for the conditional mean of Yj given prior treatment, confounding, and cost history: E[Yj | Āj, L̄j, Ȳj−1]. Here Yj is the cost in period j, Aj is the treatment used in period j, Lj is a vector of covariates, and an overbar denotes the history of a variable through the indicated period. One could estimate this relationship using a standard OLS regression or a generalized linear model.
2. Select and fit a model for the distribution of the confounders given previous treatment, confounding, and cost history: p(Lj | Āj−1, L̄j−1, Ȳj−1). For binary confounders, one can estimate this relationship using a logistic regression, whereas for continuous confounders one can use an OLS regression or a generalized linear model.
3. Select and fit a model for the risk of death (log-linear regression) or the odds of death (logistic regression) given previous treatment, confounding, and cost history: p(Dj = 1 | Āj, L̄j, Ȳj−1).
4. To determine the overall impact of a treatment regime of interest assuming the patient stayed on it (e.g., Ā = ā), simulate data from the models fitted in Steps 1 to 3 many times to generate predicted cost values.  Because a patient who dies does not accumulate further costs, the simulation should stop adding costs after a simulated death. Averaging the predicted total costs across the simulated patients then gives an estimate of E[Yā].
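
To make Step 4 concrete, here is a minimal Python sketch of the simulation loop. It is not the authors' implementation. It assumes the models from Steps 1 to 3 have already been fitted, and it stands in for them with made-up coefficients (CONFOUNDER_MODEL, DEATH_MODEL, COST_MODEL and COST_SD are all hypothetical). It also simplifies the history dependence to the previous period only, uses a single binary confounder, starts every patient untreated and not sick, and assumes a patient who dies in a period incurs no cost in that period.

```python
import math
import random

# Hypothetical coefficients standing in for models already fitted in Steps 1 to 3.
# These numbers are invented for illustration and do not come from the paper.
CONFOUNDER_MODEL = {"intercept": -1.0, "prev_treat": -0.5, "prev_sick": 1.5}
DEATH_MODEL = {"intercept": -3.0, "treat": -0.4, "sick": 1.0}
COST_MODEL = {"intercept": 1000.0, "treat": 500.0, "sick": 800.0, "prev_cost": 0.2}
COST_SD = 200.0  # assumed residual standard deviation of the cost model


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_total_cost(regime, rng):
    """Simulate one patient's cumulative cost under a fixed treatment regime.

    regime is a sequence of 0/1 treatment indicators, one per period.
    """
    total = 0.0
    # Assumed baseline: untreated, not sick, no prior cost.
    prev_treat, prev_sick, prev_cost = 0, 0, 0.0
    for treat in regime:
        # Step 2 model: draw the binary confounder given the previous period.
        p_sick = expit(CONFOUNDER_MODEL["intercept"]
                       + CONFOUNDER_MODEL["prev_treat"] * prev_treat
                       + CONFOUNDER_MODEL["prev_sick"] * prev_sick)
        sick = 1 if rng.random() < p_sick else 0

        # Step 3 model: draw death; a dead patient accrues no further cost.
        p_death = expit(DEATH_MODEL["intercept"]
                        + DEATH_MODEL["treat"] * treat
                        + DEATH_MODEL["sick"] * sick)
        if rng.random() < p_death:
            break

        # Step 1 model: draw this period's cost, truncated at zero.
        mean_cost = (COST_MODEL["intercept"]
                     + COST_MODEL["treat"] * treat
                     + COST_MODEL["sick"] * sick
                     + COST_MODEL["prev_cost"] * prev_cost)
        cost = max(0.0, rng.gauss(mean_cost, COST_SD))

        total += cost
        prev_treat, prev_sick, prev_cost = treat, sick, cost
    return total


def expected_cost(regime, n_sims=2000, seed=0):
    """Monte Carlo estimate of the mean cumulative cost under the regime."""
    rng = random.Random(seed)
    return sum(simulate_total_cost(regime, rng) for _ in range(n_sims)) / n_sims


# Compare two static regimes over 12 periods: always treated vs. never treated.
always_treat = [1] * 12
never_treat = [0] * 12
difference = expected_cost(always_treat) - expected_cost(never_treat)
print(f"Estimated cost difference: {difference:.0f}")
```

In a real analysis the coefficients would come from the fitted regressions, the baseline covariates would be drawn from the observed data, and the number of simulated patients would be chosen large enough that Monte Carlo error is negligible.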

What assumptions are needed for the nested g-computation procedure to be valid?  There are four key assumptions:

(a) correct model specification, (b) the potential costs under the observed treatment regime are equal to the observed costs, (c) sequentially ignorable treatment assignment, that is, treatment at each time point is independent of the potential costs conditional on the observed variable history, and (d) conditionally independent censoring and death, where the conditioning is on the observed covariate history.

The authors also use simulated data to show the benefits of the nested g-formula approach compared with inverse probability weighted (IPW) regression models.

Source: