We are testing the null hypothesis H_{0}: L'\beta=0, where L is a vector or matrix of hypothesis coefficients (the contrast matrix), carefully chosen to represent a linear combination of the coefficients from the model of interest, \beta is the vector of model parameters, and b is the vector of maximum likelihood estimates (the fitted coefficients of our model).
In this very simple example, we are interested in the difference between \beta_{2} and \beta_{3} in our logistic regression. Thus the null hypothesis of interest is the linear combination of parameters \beta_{2}-\beta_{3}=0 (i.e., \beta_{2}=\beta_{3}).
The first type of test we describe is a Wald test.
- The intuition for this test comes from the test of a single model parameter: \frac{b_{1}-\beta_{1}}{s\{b_{1}\}}, where s\{b_{1}\} is the standard error of b_{1}. This statistic is distributed t with degrees of freedom equal to the number of observations in the data set minus the number of parameters estimated. In large samples the t distribution converges to the standard normal, and a squared standard normal random variable is distributed chi-square with one degree of freedom.
- Thus, from this heuristic description, the formulation of the Wald test makes sense and is given as S=(L'b)'(L'\Sigma L)^{-1}(L'b), where \Sigma is the variance-covariance matrix of the estimated coefficients. S is asymptotically distributed chi-square with degrees of freedom equal to the number of restrictions in the null hypothesis. In this case, we are conducting a single degree of freedom test.
- With our model log(\frac{p}{(1-p)})=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2)+\beta_{3}(Gifts3), the L that constructs \beta_{2}-\beta_{3}=0 is L=[0,0,1,-1]^{T}, which comes from the contrast of interest [\beta_{0}+\beta_{1}(0)+\beta_{2}(1)+\beta_{3}(0)]-[\beta_{0}+\beta_{1}(0)+\beta_{2}(0)+\beta_{3}(1)]=\beta_{2}-\beta_{3}.
- By the invariance property of maximum likelihood, the MLE of \beta_{2}-\beta_{3} is this expression evaluated at the MLE of the coefficients: b_{2}-b_{3}.
- Further, this contrast has a variance of L'\Sigma L, which comes from first principles: VAR(b_{2}-b_{3})=VAR(b_{2})+VAR(b_{3})-2COV(b_{2},b_{3}); this identity is verified numerically after the code below.
- In R we can use the linearHypothesis function in John Fox's excellent car package. We fail to reject the null hypothesis, with a test statistic of 0.6702656 and a p-value of 0.412959. We also compute the test statistic manually:
library(car)
#B2-B3=0
cVec<-c(0,0,1,-1) #this is L
car::linearHypothesis(mod_glm, cVec, verbose=TRUE, test="Chisq")
#Hypothesis matrix:
#     [,1] [,2] [,3] [,4]
#[1,]    0    0    1   -1
#Right-hand-side vector:
#[1] 0
#Estimated linear function (hypothesis.matrix %*% coef - rhs)
#[1] -0.07059217
#Linear hypothesis test
#Hypothesis:
#Gifts2 - Gifts3 = 0
#Model 1: restricted model
#Model 2: Respond ~ Gifts1 + Gifts2 + Gifts3
#  Res.Df Df  Chisq Pr(>Chisq)
#1  95409
#2  95408  1 0.6703      0.413
#Manually: S=(L'b)'(L'(Sigma)L)^{-1}(L'b)
t(t(cVec)%*%mod_glm$coefficients) %*% solve(t(cVec)%*%vcov(mod_glm)%*%cVec) %*% (t(cVec)%*%mod_glm$coefficients)
#[1,] 0.6702656
#p-value
1-pchisq(0.6702656, 1)
#[1] 0.412959
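As promised above, we can verify the first-principles variance identity directly against L'\Sigma L. A quick sketch, reusing the mod_glm and cVec objects from above and assuming the usual coefficient names (Gifts2, Gifts3) on the rows and columns of vcov(mod_glm):

#Variance of the contrast two ways
V<-vcov(mod_glm) #variance-covariance matrix of the estimated coefficients
V["Gifts2","Gifts2"]+V["Gifts3","Gifts3"]-2*V["Gifts2","Gifts3"] #VAR(b2)+VAR(b3)-2COV(b2,b3)
t(cVec)%*%V%*%cVec #L'(Sigma)L; both lines should return the same value, 0.007434746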
The contrast \beta_{2}-\beta_{3} has a point estimate of -0.07059 with a variance of 0.007434746. We can construct a Wald-type 95% confidence interval as -0.07059\pm1.96\sqrt{0.007434746}=(-0.2395909, 0.09841095).
#linear predictor at Gifts2=1 (all other covariates 0)
-2.82119+(-0.57949*0)+(-0.45892*1)+(-0.38833*0) #-3.28011
#linear predictor at Gifts3=1 (all other covariates 0)
-2.82119+(-0.57949*0)+(-0.45892*0)+(-0.38833*1) #-3.20952
#Difference is -3.28011-(-3.20952)= -0.07059
#This difference has the following variance:
t(cVec)%*%vcov(mod_glm)%*%cVec
#0.007434746
#95% CI
-0.07059-1.96*sqrt(0.007434746) #-0.2395909
-0.07059+1.96*sqrt(0.007434746) #0.09841095
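As a cross-check, the point estimate, standard error, and interval can also be obtained in one call with car::deltaMethod. This is a sketch, assuming the coefficient names Gifts2 and Gifts3 shown in the linearHypothesis output above:

#delta-method estimate, SE, and confidence interval for B2-B3
car::deltaMethod(mod_glm, "Gifts2 - Gifts3")
#expect an estimate of -0.07059 with SE sqrt(0.007434746) ~= 0.0862,
#and an interval matching the one computed by hand above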
Another type of test we describe is the Likelihood Ratio (LR) test. Asymptotically, the LR and Wald tests give the same answer, and they certainly will with a sample size this large.
- The basic idea of the LR test is to compare the maximized log likelihood (LL) of the model without the constraint to the maximized log likelihood with the constraint(s) imposed: G^{2}=2(LL_{full})-2(LL_{constrained}), which is asymptotically chi-square with degrees of freedom equal to the number of constraints. This is illustrated below via brute force, with the constrained model (setting \beta_{2}=\beta_{3}) fit as: log(\frac{p}{(1-p)})=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2)+\beta_{2}(Gifts3)=\beta_{0}+\beta_{1}(Gifts1)+\beta_{2}(Gifts2+Gifts3)
- This constrained model is very simple to refit in R and does not require the use of an offset variable:
#LL from the unconstrained model
logLik(mod_glm)
#'log Lik.' -19061.75 (df=4)
#construct the combined variable Gifts2+Gifts3
data$Gifts2_3<-data$Gifts2+data$Gifts3
#fit the constrained model
mod_glm_new<-glm(Respond~Gifts1+Gifts2_3, data=data, family=binomial(link=logit))
logLik(mod_glm_new)
#'log Lik.' -19062.08 (df=3)
#LR test statistic
2*logLik(mod_glm)-2*logLik(mod_glm_new)
#[1] 0.6701161
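For completeness, here is the p-value for the LR statistic, along with the equivalent built-in test via anova() on the two nested models fit above (a sketch; the reported statistic should match 0.6701 up to rounding):

#p-value for the LR statistic on 1 df
1-pchisq(0.6701161, 1)
#approximately 0.413, in close agreement with the Wald test
#built-in LR test comparing the constrained and unconstrained models
anova(mod_glm_new, mod_glm, test="Chisq")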