Using the 2012 Smoking and Drug Use Amongst English Pupils Dataset (2012smokedrugs.dta), perform diagnostics on the second cigarette consumption multiple linear regression model (model.1). As a reminder, the outcome variable was cigs7, recoded to remove all 0s (cigs7a), and the predictor variables were free, schyear, and sex.

Normality Diagnostics:
(a) Create a histogram of the residuals to test for non-normality.
(b) Create a Q-Q plot to test for non-normality.
(c) Perform a Shapiro-Wilk Normality test.
(d) If you find non-normality in the residuals, try to find a solution using the techniques from the chapter.
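The answers below assume that the objects drugs and model.1 already exist from the earlier exercise. A minimal sketch of how they might be recreated follows. The zero-to-NA recode, the coding of the predictors, and the simulated stand-in data frame are assumptions made here so the sketch runs without the .dta file; they are not taken from the original exercise.

```r
# Hypothetical stand-in data so the sketch runs on its own.
# With the real file, one option would be:
#   drugs <- haven::read_dta("2012smokedrugs.dta")
set.seed(1)
n <- 500
drugs <- data.frame(
  cigs7   = rpois(n, 3) * rbinom(n, 1, 0.6),   # many zeros, as in a consumption count
  free    = rbinom(n, 1, 0.2),                 # assumed 0/1 indicator
  schyear = sample(7:11, n, replace = TRUE),   # assumed school-year coding
  sex     = rbinom(n, 1, 0.5)                  # assumed 0/1 indicator
)

# Assumed recode: treat zeros as missing so that only smokers remain
drugs$cigs7a <- ifelse(drugs$cigs7 == 0, NA, drugs$cigs7)

# Refit the model; lm() drops rows with a missing outcome by default
model.1 <- lm(cigs7a ~ free + schyear + sex, data = drugs)
```

With the real data, the number of rows dropped by lm() should match the "observations deleted due to missingness" line in the model summary.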




a.

x11()

hist(model.1$residuals,xlab="Residuals",main="")





The residuals are clearly not normally distributed.

b.



x11()

qqnorm(model.1$residuals)

qqline(model.1$residuals,col="red")





Again, the residuals clearly depart from a normal distribution.

c.



shapiro.test(model.1$residuals)



Shapiro-Wilk normality test

data: model.1$residuals

W = 0.8451, p-value < 2.2e-16

Since the p-value is far below .05, we reject the null hypothesis that the residuals are normally distributed; the normality assumption is violated.

d.



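library(car)  # provides powerTransform()

# Shift the outcome by 1 before estimating the Box-Cox power
drugs$cigs7b <- drugs$cigs7a + 1

summary(powerTransform(drugs$cigs7b))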



bcPower Transformation to Normality
             Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
drugs$cigs7b    0.0063           0      -0.0827       0.0952

Likelihood ratio test that transformation parameter is equal to 0
 (log transformation)
                             LRT df    pval
LR test, lambda = (0) 0.01904637  1 0.89023

Likelihood ratio test that no transformation is needed
                           LRT df       pval
LR test, lambda = (1) 441.0524  1 < 2.22e-16

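The LR test of lambda = 1 (p < 2.22e-16) rejects the hypothesis that no transformation is needed, so we should transform the outcome variable. The estimated power is .0063. The test of lambda = 0 is not rejected (p = 0.89) and the rounded power is 0, so a log transformation would be a nearly equivalent choice; here we raise the outcome to .0063.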
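For reference, the bcPower label in the output above refers to the Box-Cox power family. A sketch of its standard definition, and of what a power as small as 0.0063 implies for the simple power used in the model, is given here; y stands for the positive outcome and lambda for the transformation parameter.

```latex
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
  \log y, & \lambda = 0
\end{cases}
\qquad\Longrightarrow\qquad
y^{\lambda} = 1 + \lambda\, y^{(\lambda)} \approx 1 + 0.0063 \log y
\quad \text{for } \lambda = 0.0063 .
```

Because the simple power is approximately 1 + 0.0063 log y, a model fitted to the outcome raised to .0063 would be expected to have an intercept close to 1 and very small slope coefficients.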

model.1a <- lm(I(cigs7a^.0063) ~ free + schyear + sex, data=drugs)

summary(model.1a)

Call:
lm(formula = I(cigs7a^0.0063) ~ free + schyear + sex, data = drugs)

Residuals:
       Min         1Q     Median         3Q        Max
-0.0179023 -0.0072318  0.0005963  0.0084505  0.0188354

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.0088510  0.0024222 416.494   <2e-16 ***
free        0.0016630  0.0010939   1.520   0.1292
schyear     0.0013119  0.0005281   2.484   0.0134 *
sex         0.0008289  0.0009281   0.893   0.3723
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.009328 on 404 degrees of freedom
  (7181 observations deleted due to missingness)
Multiple R-squared: 0.01931, Adjusted R-squared: 0.01203
F-statistic: 2.652 on 3 and 404 DF, p-value: 0.0484

As we discussed in the chapter, transforming the outcome variable in a non-intuitive way makes it difficult to interpret the coefficients. Therefore, we may be better off leaving the outcome variable in its original form.
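If a transformation is still wanted, the rounded power of 0 reported by powerTransform points to the natural log, which is easier to interpret than a power of .0063. The sketch below illustrates that alternative on simulated stand-in data. The data frame sim, the object names model.log, sw, and pct_change, and the data-generating values are all hypothetical, so the printed numbers say nothing about the real survey.

```r
# Simulated stand-in data (hypothetical): a right-skewed, strictly positive outcome
set.seed(42)
n <- 400
sim <- data.frame(
  free    = rbinom(n, 1, 0.2),
  schyear = sample(7:11, n, replace = TRUE),
  sex     = rbinom(n, 1, 0.5)
)
sim$cigs7a <- ceiling(exp(1 + 0.1 * (sim$schyear - 7) + rnorm(n)))

# Log-transform the outcome inside the formula
model.log <- lm(log(cigs7a) ~ free + schyear + sex, data = sim)

# Re-run the normality check on the new residuals
sw <- shapiro.test(residuals(model.log))
print(sw)

# Slopes on the log scale read as approximate percentage changes in the outcome
pct_change <- 100 * (exp(coef(model.log)[-1]) - 1)
print(round(pct_change, 1))
```

The same histogram and Q-Q plot commands used in parts (a) and (b) can be applied to residuals(model.log) to judge whether the transformation helped.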
