Using the 2012 Smoking and Drug Use Amongst English Pupils Dataset (2012smokedrugs.dta), perform diagnostics on the second cigarette consumption multiple linear regression model. To remind you, the outcome variable was cigs7, but recoded to remove all 0s, and the predictor variables were free, schyear, and sex.
Functional Form Diagnostics:
(a) Create a plot to check for a functional form violation.
(b) Perform a Ramsey RESET test.
(c) If you violate the functional form assumption, try to find a solution.
Read in the 2012 Smoking and Drug Use Amongst English Pupils dataset and re-run the regression.
setwd("C:/QSSD/Exercises/Chapter 12 - Exercises")
getwd()
[1] "C:/QSSD/Exercises/Chapter 12 - Exercises"
library(foreign)
drugs <- read.dta("2012smokedrugs.dta", convert.factors=FALSE)
library(car)
Loading required package: carData
drugs$cigs7a <- recode(drugs$cigs7, "0=NA")
table(drugs$cigs7a)
   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
  54  28  22  28  12  16  16   4   7   8   7   6   5   8   1   5   6   5
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
   3   2   3   3   2   2   4   4   2   5   4   2   6   1   1   3   9   3
  37  38  39  40  41  42  43  45  46  47  49  50  51  52  53  54  55  56
   3   2   1   4   2   6   2   3   4   1   3   3   2   4   4   1   1   2
  60  62  63  66  67  69  70  71  73  74  75  76  77  79  80  83  84  85
   7   2   1   2   2   3  10   1   1   1   3   1   1   1   8   1   1   1
  86  89  90  92  95 100 104 105 110 140
   2   1   4   1   1   2   1   3   3   2
model.1 <- lm(cigs7a ~ free + schyear + sex, data=drugs)
summary(model.1)
Call:
lm(formula = cigs7a ~ free + schyear + sex, data = drugs)
Residuals:
    Min      1Q  Median      3Q     Max
 -33.31  -20.80  -12.47   15.99  122.04
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.549      7.452   0.879   0.3800
free           5.940      3.365   1.765   0.0783 .
schyear        3.803      1.625   2.340   0.0197 *
sex            2.811      2.855   0.984   0.3255
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 28.7 on 404 degrees of freedom
(7181 observations deleted due to missingness)
Multiple R-squared: 0.01964, Adjusted R-squared: 0.01236
F-statistic: 2.698 on 3 and 404 DF, p-value: 0.04555
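Note: a general difficulty with regression diagnostics for this model is that two of the predictors (free and sex) are nominal and one (schyear) is ordinal. The fitted values therefore take only a small number of distinct values, which makes residual plots harder to read.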
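To illustrate the note above, here is a minimal sketch using simulated data, not the 2012smokedrugs.dta file. The coding of the stand-in variables (two 0/1 indicators and a five-level school year running 7 to 11) is an assumption made only for this illustration. With predictors like these, the model can produce at most 2 x 5 x 2 = 20 distinct fitted values, no matter how many observations there are.

```r
# Simulated stand-in data (hypothetical coding; NOT the real dataset)
set.seed(1)
n <- 400
sim <- data.frame(
  free    = rbinom(n, 1, 0.2),                # binary (nominal)
  schyear = sample(7:11, n, replace = TRUE),  # ordinal, 5 levels (assumed)
  sex     = rbinom(n, 1, 0.5)                 # binary (nominal)
)
sim$y <- 5 + 6 * sim$free + 4 * sim$schyear + 3 * sim$sex + rnorm(n, sd = 25)

fit <- lm(y ~ free + schyear + sex, data = sim)

# Count the distinct fitted values (rounded to guard against
# floating-point noise); bounded by the number of predictor combinations
n_distinct_fitted <- length(unique(round(fitted(fit), 8)))
n_distinct_fitted
```

In a residuals-versus-fitted plot, such a model shows vertical stripes of points rather than a continuous cloud.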
a.
x11()
plot(y=model.1$residuals,x=model.1$fitted.values, xlab="Fitted Values", ylab="Residuals")
abline(h=0, col="red")

It is not clear from the plot whether the residuals have a local mean of 0 across the range of fitted values.
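One way to make the local mean easier to judge is to overlay a lowess smoother on the residual plot. The sketch below uses simulated data with a truly linear relationship, so the object names (fit, smooth) and the data are illustrative assumptions rather than part of the original solution. The same lines() call could be added after the plot of model.1.

```r
# Simulated data with a correctly specified linear model (illustration only)
set.seed(42)
n <- 400
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 4)
fit <- lm(y ~ x)

res <- resid(fit)
fv  <- fitted(fit)

# Lowess estimates the local mean of the residuals along the fitted values
smooth <- lowess(fv, res)

# When run non-interactively, draw to a null device so no file is created
if (!interactive()) pdf(NULL)
plot(x = fv, y = res, xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, col = "red")           # reference line at 0
lines(smooth, col = "blue", lwd = 2) # local mean of the residuals
```

If the functional form is adequate, the smoothed line should stay close to the red reference line. A systematic curve away from 0 would suggest a violation.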
b.
library(lmtest)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
resettest(model.1, power=2:3, type="fitted")
RESET test
data: model.1
RESET = 0.76745, df1 = 2, df2 = 402, p-value = 0.4649
The p-value (0.4649) is above .05, so we fail to reject the null hypothesis that the functional form is correctly specified. There is no evidence of a functional form violation.
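For intuition, the RESET test with power=2:3 and type="fitted" amounts to an F test of whether the squared and cubed fitted values add explanatory power to the original model. The sketch below reproduces that logic by hand on simulated data. The variable names (x1, x2, restricted, augmented) are hypothetical and the data are not from the exercise.

```r
# Simulated data (illustration only)
set.seed(7)
n  <- 300
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)
y  <- 1 + 2 * x1 + 1.5 * x2 + rnorm(n)

# Original (restricted) model
restricted <- lm(y ~ x1 + x2)
yhat <- fitted(restricted)

# Augmented model: add powers 2 and 3 of the fitted values
augmented <- lm(y ~ x1 + x2 + I(yhat^2) + I(yhat^3))

# F test comparing the two nested models
cmp <- anova(restricted, augmented)
reset_F <- cmp$F[2]
reset_p <- cmp$`Pr(>F)`[2]
df1 <- cmp$Df[2]      # number of added terms
df2 <- cmp$Res.Df[2]  # residual df of the augmented model
c(RESET = reset_F, df1 = df1, df2 = df2, p.value = reset_p)
```

The degrees of freedom follow the same pattern as in the output above: df1 equals the number of added powers (2), and df2 equals the residual degrees of freedom of the original model minus 2 (404 - 2 = 402 for model.1).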
c. Since the RESET test gives no evidence of an incorrect functional form, we do not need to attempt any corrections.