We are testing the null hypothesis: $Ho: L'b=0$ where L is a vector or matrix of hypothesis coefficients (the contrast matrix), carefully chosen to represent a linear combination of coefficients from the model of interest and b is the vector of maximum likelihoods estimates (the coefficients of our model).
In this very simple example, we are interested in the difference between $\beta_{2}$ and $\beta_{3}$ in our logistic regression. Thus we are interested in the linear combination of parameters: $\beta_{2}-\beta_{3}=0$ (i.e. $\beta_{2}=\beta_{3}$).
The first type of test we describe is a Wald test.
- The intuition about this test comes from the test of a single model parameter: $\frac{b_{1}-\beta_{1}}{s\{\beta_{1}\}}$ where $s\{\beta_{1}\}$ is the standard error of $b_{1}$ which is distributed t with degrees of freedom equal to the number of observations in the data set minus the number of parameters estimated. In large samples, a t distribution is the same as the normal distribution and a squared random variable that follows a standard normal distribution is distributed as chi-square.
- Thus, from this heuristic description, the formulation of the Wald test makes sense and is given as $S=(L'b)'(L'\sum_{}L)^{-1}(L'b)$ where $\sum_{}$ is the variance-covariance matrix of the estimated coefficients. $S$ is asymptotically distributed as a chi-square with degrees of freedom equal to the number of restrictions in the null hypothesis. In this case, we are conducting a single degree of freedom test.
- With our model $log(\frac{p}{(1-p)})=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2)+\beta_{3}(Gifts3)$ the formulation of $L$ to construct $\beta_{2}-\beta_{3}=0$ is clearly $L=[0,0,1,-1]^{t}$ which comes from the contrast of interest $[\beta_{0}+\beta_{1}(0)+\beta_{2}(1)+\beta_{3}(0)] - [\beta_{0}+\beta_{1}(0)+\beta_{2}(0)+\beta_{3}(1)] $
- For a maximum likelihood estimator, the MLE of $\beta_{2}-\beta_{3}$ is this expression evaluated at the MLE of the coefficients: $b_{2}-b_{3}$.
- Further, this contrast has a variance of $(L'\sum_{}L)$ which comes from first principles: $VAR(b_{2}-b_{3})=VAR(b_{2})+VAR(b_{3}) -2COV(b_{2},b_{3})$.
- In R we can use the LinearHypothesis function in John Fox's excellent Car package. We fail to reject the null hypothesis with a test statistic of 0.6702656 and p-value 0.412959. We also compute the test statistically manually:
The contrast $\beta_{2}-\beta_{3}$ has a point estimate of -0.07059 with variance of 0.007434746. We can construct a wald type 95% confidence interval as $-0.07059\pm1.96*\sqrt(0.007434746)= (-0.2395909, 0.09841095 )$
Another type of test we describe is a Likelihood Ratio (LR) test. Asymptotically, the LR and Wald test will result in the same answer, and will certainly with this large of sample size.
- The basic idea of the LR test is to compare the log likelihood (LL) at the maximized value without constraint, to the maximized value with the constraint(s): $G^{2}=2*(LL full model)-2*(LL constrained model)$. This is illustrated below via brute force, with the constrained model (making $\beta_{2}=\beta_{3}$) fit as: $$log(\frac{p}{(1-p)})=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2)+\beta_{2}(Gifts3)$$
$$=log(\frac{p}{(1-p)})=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2+Gifts3)$$
- This constrained model is very simple to refit in R and does not require use of an offset variable: