Wednesday, January 2, 2013

Delta Method in Logistic Regression

In the last post we looked at how to construct and evaluate a simple linear hypothesis test within the framework of logistic regression (or any generalized linear model fit via maximum likelihood). While the example was simple, the idea can be quite powerful for quickly testing hypotheses concerning regression coefficients.

For this method to work, however, you need to be able to write your (null) hypothesis as a linear combination of the model parameters or coefficients. In our simple example this was possible because we wished to test the null hypothesis $\beta_{2}-\beta_{3}=0$. What happens when we cannot construct the weights ($L$) to place on the coefficients with a vector (or a matrix, when interested in multiple simultaneous contrasts such as $H_{0}$: $\beta_{2}=\beta_{3}=0$)? We need to apply other methods to the estimation and inference task. An often used classical procedure is the Delta Method.

Delta Method

The main idea of the delta method is this: when you are not facing a simple linear combination of observations (random variables), create a linear approximation to the more complicated function using a first-order Taylor expansion and derive the variance of that approximation instead. This yields an approximate variance in large samples and thus a way to construct a confidence interval for these non-linear estimators.
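As a small illustration of the idea (a hypothetical univariate sketch with simulated data, separate from the regression example below): for a smooth function $G$ and the sample mean $\bar{X}$ with population mean $\mu$, the first-order approximation gives $Var(G(\bar{X})) \approx G'(\mu)^{2}\,Var(\bar{X})$.

```r
# Univariate delta-method sketch (simulated, hypothetical data):
# Var(G(xbar)) is approximately G'(mu)^2 * Var(xbar)
set.seed(42)
x <- rexp(5000, rate = 2)          # simulated sample; true mean mu = 0.5
xbar <- mean(x)
var_xbar <- var(x) / length(x)     # estimated variance of the sample mean

# For G(xbar) = log(xbar), the derivative is G'(mu) = 1/mu, so:
delta_var <- (1 / xbar)^2 * var_xbar
delta_se  <- sqrt(delta_var)       # approximate standard error of log(xbar)
```

The same recipe generalizes to vectors of coefficients, which is the case worked through below.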

There are many sources for the theory of the delta method, but the main result of interest here is that in large samples, under standard regularity conditions, a function $G(b)$ of the coefficients of our generalized linear model ($b_{0},b_{1},\ldots,b_{p}$) will be approximately distributed as $N\left(G(\beta),\ \frac{\partial G(\beta)}{\partial \beta'}VAR(\hat{\beta})\frac{\partial G(\beta)}{\partial \beta}\right)$, which we evaluate at the ML estimates:

Point Estimate: $G(\hat{\beta})$
Variance: $\frac{\partial G(\hat{\beta})}{\partial \hat{\beta}'}VAR(\hat{\beta})\frac{\partial G(\hat{\beta})}{\partial \hat{\beta}}$

For example, returning to the last post, let's say we are interested in the ratio quantity $\frac{\beta_{2}}{\beta_{3}}$.

We have the vector of maximum likelihood estimates:

$[-2.8211865  -0.5794947  -0.4589188  -0.3883266 ]$

and the estimated variance-covariance matrix:

             (Intercept)        Gifts1        Gifts2        Gifts3
(Intercept)  0.0002674919 -0.0002674919 -0.0002674919 -0.0002674919
Gifts1      -0.0002674919  0.0034667138  0.0002674919  0.0002674919
Gifts2      -0.0002674919  0.0002674919  0.0039732911  0.0002674919
Gifts3      -0.0002674919  0.0002674919  0.0002674919  0.0039964383


From here, we know the point estimate of this quantity is just the expression evaluated at the maximum likelihood estimates:

$\frac{ -0.4589188}{ -0.3883266}=1.181778$

Next, we need the gradient of the function:

$\frac{\partial G(\beta)}{\partial \beta} = \left[0,\ 0,\ \frac{1}{\beta_{3}},\ \frac{-\beta_{2}}{\beta_{3}^{2}}\right]^{t}$, which evaluated at the ML estimates is $\left[0,\ 0,\ \frac{1}{-0.3883266},\ \frac{-(-0.4589188)}{(-0.3883266)^{2}}\right]^{t} \approx [0,\ 0,\ -2.57515,\ 3.04328]^{t}$

Thus the estimated variance of $\frac{\hat{\beta}_{2}}{\hat{\beta}_{3}}$ is 0.05916905, and a 95% confidence interval can be constructed as $1.181778\pm1.96\sqrt{0.05916905}=(0.7050221,\ 1.65855)$
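This hand calculation can be reproduced in a few lines of R, using the printed estimates and covariance matrix (a sketch; the object names are mine):

```r
# ML estimates for (Intercept, Gifts1, Gifts2, Gifts3), as printed above
b <- c(-2.8211865, -0.5794947, -0.4589188, -0.3883266)

# Estimated variance-covariance matrix of the coefficients, as printed above
V <- matrix(c( 0.0002674919, -0.0002674919, -0.0002674919, -0.0002674919,
              -0.0002674919,  0.0034667138,  0.0002674919,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0039732911,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0002674919,  0.0039964383),
            nrow = 4, byrow = TRUE)

est  <- b[3] / b[4]                        # point estimate of beta2/beta3
grad <- c(0, 0, 1 / b[4], -b[3] / b[4]^2)  # gradient of G(beta) = beta2/beta3
v    <- as.numeric(t(grad) %*% V %*% grad) # delta-method variance
ci   <- est + c(-1, 1) * 1.96 * sqrt(v)    # 95% confidence interval
```

Note that `b[3]` and `b[4]` here are the coefficients called $\beta_{2}$ and $\beta_{3}$ in the text (R indexes the intercept as the first element).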

The car package in R provides a very nice shortcut to all of this, using symbolic differentiation to save you from calculating the derivative by hand. The deltaMethod function accepts a variety of model objects, or you can supply a mean vector and vcov matrix directly. Here is a replication of the simple analysis above:
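A sketch of such a call, supplying the printed coefficient vector and covariance matrix directly (the object names are assumptions on my part):

```r
library(car)  # provides deltaMethod()

# Named ML estimates and their covariance matrix, as printed above
b <- c(`(Intercept)` = -2.8211865, Gifts1 = -0.5794947,
       Gifts2 = -0.4589188, Gifts3 = -0.3883266)
V <- matrix(c( 0.0002674919, -0.0002674919, -0.0002674919, -0.0002674919,
              -0.0002674919,  0.0034667138,  0.0002674919,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0039732911,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0002674919,  0.0039964383),
            nrow = 4, byrow = TRUE, dimnames = list(names(b), names(b)))

# deltaMethod differentiates the expression symbolically with respect to
# the named parameters and reports the estimate, SE, and confidence limits
deltaMethod(b, "Gifts2 / Gifts3", vcov. = V)
```

If the fitted glm object itself is available (say, named `fit`), `deltaMethod(fit, "Gifts2 / Gifts3")` gives the same result without supplying the vector and matrix by hand.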



Another common use of this method is to compute confidence intervals for predicted values from generalized linear models on the response scale. For example, in logistic regression we may be interested in the effect of a predictor variable on the probability $p$ rather than on the logit (link) scale; Xu and Long show the derivation for several predicted probabilities.
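As a sketch of that idea (using a hypothetical predictor profile of my choosing, with the estimates above): on the response scale $\hat{p} = \text{logit}^{-1}(x'\hat{\beta})$, the chain rule gives gradient $\hat{p}(1-\hat{p})\,x$, so the delta-method variance is $(\hat{p}(1-\hat{p}))^{2}\, x'VAR(\hat{\beta})\,x$.

```r
# Delta-method CI for a predicted probability (hypothetical profile:
# Gifts1 = 1, Gifts2 = 0, Gifts3 = 0), reusing the printed estimates
b <- c(-2.8211865, -0.5794947, -0.4589188, -0.3883266)
V <- matrix(c( 0.0002674919, -0.0002674919, -0.0002674919, -0.0002674919,
              -0.0002674919,  0.0034667138,  0.0002674919,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0039732911,  0.0002674919,
              -0.0002674919,  0.0002674919,  0.0002674919,  0.0039964383),
            nrow = 4, byrow = TRUE)

x    <- c(1, 1, 0, 0)                  # intercept plus Gifts1 = 1
eta  <- sum(x * b)                     # linear predictor (logit scale)
p    <- plogis(eta)                    # predicted probability
grad <- p * (1 - p) * x                # chain rule: dp/dbeta = p(1-p) * x
se_p <- sqrt(as.numeric(t(grad) %*% V %*% grad))
ci_p <- p + c(-1, 1) * 1.96 * se_p     # Wald interval on the response scale
```

A caveat worth keeping in mind: this symmetric interval on the probability scale can fall outside $(0, 1)$ for probabilities near the boundary, whereas a Wald interval built on the logit scale and then back-transformed cannot.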

  • SAS NLMIXED uses the delta method for estimating the variance with its "estimate" statement - for example, when you are interested in the variance of the difference in estimated probabilities between two different predictor values (or vectors of values).


When you have time and computing power, a bootstrap analysis will typically provide better coverage. In a future post we will compare bootstrapping, simulation, and the delta method.

