What are the four approaches to testing the validity of measures?
What will be an ideal response?
We can consider measurement validity the first concern in establishing the validity of research results, because without having measured what we think we measured, we really do not know what we are talking about.
Measurement validity exists when a measurement actually measures what we think it does.
We briefly discussed the difference between official police reports and survey data in Chapter 1. We noted that official reports underestimate the actual amount of offending because a great deal of offending behavior never comes to the attention of police (Mosher et al., 2002). There is also evidence that arrest data often reflect the political climate and police policies as much as they do criminal activity. For example, suppose we wanted to examine whether illicit drug use has increased or decreased since the United States’ “War on Drugs,” which heated up in the 1980s and is still being fought today. During this time, arrest rates for drug offenses soared, giving the illusion that drug use was increasing at an epidemic pace. However, self-report surveys that asked citizens directly about their drug use during this period found that use of most illicit drugs was actually declining or had stayed the same (Regoli & Hewitt, 1994). In your opinion, then, which measure of drug use was more valid: the arrest data in the Uniform Crime Reports (UCR) or the self-report surveys?
The extent to which measures indicate what they are intended to measure can be assessed with one or more of four basic approaches: face validation, content validation, criterion validation, and construct validation.
Whatever the approach to validation, no one measure will be valid for all times and places. For example, the validity of self-report measures of substance abuse varies with such factors as whether the respondents are sober or intoxicated at the time of the interview, whether the measure refers to recent or lifetime abuse, and whether the respondents see their responses as affecting their chances of receiving housing, treatment, or some other desired outcome (Babor, Stephens, & Marlatt, 1987). In addition, persons with severe mental illness are, in general, less likely to respond accurately (Corse, Hirschinger, & Zanis, 1995). These types of possibilities should always be considered when evaluating measurement validity.
Face Validity
Researchers apply the term face validity to the confidence gained from careful inspection of a measure to see if it is appropriate “on its face.” More precisely, we can say that a measure has face validity if it obviously pertains to the concept being measured more than to other concepts (Brewer & Hunter, 1989, p. 131). For example, if college students’ alcohol consumption is what we are trying to measure, asking for students’ favorite color seems unlikely on its face to tell us much about their drinking patterns. A measure with greater face validity would be a count of how many drinks they had consumed in the past week.
Face validity is the type of validity that exists when an inspection of the items used to measure a concept suggests that they are appropriate “on their face.”
Although every measure should be inspected in this way, face validation on its own is not the gold standard of measurement validity. The question “How much beer or wine did you have to drink last week?” may look valid on its face as a measure of the amount of drinking, but people who drink heavily tend to underreport the amount they drink. So the question would be an invalid measure in a study that includes heavy drinkers.
Content Validity
Content validity establishes that the measure covers the full range of the concept’s meaning. To determine that range of meaning, the researcher may solicit the opinions of experts and review literature that identifies the different aspects of the concept. An example of a measure that covers a wide range of meaning is the Michigan Alcoholism Screening Test (MAST). The MAST includes 24 questions representing the following subscales: recognition of alcohol problems by self and others; legal, social, and work problems; help seeking; marital and family difficulties; and liver pathology (Skinner & Sheu, 1982). Many experts familiar with the direct consequences of substance abuse agree that these dimensions capture the full range of possibilities. Thus, the MAST is believed to be valid from the standpoint of content validity.
Content validity is the type of validity that establishes that a measure covers the full range of the concept’s meaning.
Criterion Validity
Consider the following scenario: When people drink an alcoholic beverage, the alcohol is absorbed into their bloodstream and then gradually metabolized (broken down into other chemicals) in their liver (NIAAA, 1997). The alcohol that remains in their blood at any point, unmetabolized, impairs both thinking and behavior (NIAAA, 1994). As more alcohol is ingested, cognitive and behavioral consequences multiply. These biological processes can be identified with direct measures of alcohol concentration in the blood, urine, or breath. Questions about alcohol consumption, on the other hand, can be viewed as attempts to measure indirectly what biochemical tests measure directly.
Criterion validity is the type of validity that is established by comparing the scores obtained on the measure being validated to those obtained with a more direct or already validated measure of the same phenomenon (the criterion).
Criterion validity is established when the scores obtained on one measure agree closely with those obtained with a more direct or already validated measure of the same phenomenon (the criterion). A measure of blood-alcohol concentration or a urine test could serve as the criterion for validating a self-report measure of drinking, as long as the questions we ask about drinking refer to the same period. Observations of substance use by friends or relatives could also, in some circumstances, serve as a criterion for validating self-report substance use measures.
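One common way to summarize such a comparison is a correlation between the self-report scores and the criterion scores, sometimes called a validity coefficient. The sketch below is only an illustration under stated assumptions: the six respondents and all of their scores are hypothetical, and the helper name pearson_r is ours, not taken from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: drinks each respondent reports having had during a given
# period, and a breath-test reading (BAC, %) taken for the same period.
self_report_drinks = [0, 1, 2, 4, 6, 3]
breath_test_bac = [0.00, 0.02, 0.03, 0.07, 0.08, 0.05]

validity_coefficient = pearson_r(self_report_drinks, breath_test_bac)
print(f"criterion validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 would suggest that the self-report question tracks the criterion closely; a coefficient near 0 would suggest that it does not. Both measures must refer to the same period for the comparison to mean anything.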
An attempt at criterion validation is well worth the effort because it greatly increases confidence that the measure is actually measuring the concept of interest: agreement with a criterion is empirical evidence of validity rather than a judgment about how the measure looks. However, often no other variable might reasonably be considered a criterion for individual feelings or beliefs or other subjective states. Even with variables for which a reasonable criterion exists, the researcher may not be able to gain access to the criterion, as would be the case with a tax return or employer document as a criterion for self-reported income.
Construct Validity
Measurement validity also can be established by showing that a measure is related to a variety of other measures as specified in a theory. This validation approach, known as construct validity, is commonly used in social research when no clear criterion exists for validation purposes. For example, in one study of the validity of the Addiction Severity Index (ASI), McLellan et al. (1985) compared subject scores on the ASI to a number of indicators that they felt from prior research should be related to substance abuse: medical problems, employment problems, legal problems, family problems, and psychiatric problems. They could not use a criterion-validation approach because they did not have a more direct measure of abuse, such as laboratory test scores or observer reports. However, their extensive research on the subject had given them confidence that these sorts of other problems were all related to substance abuse, and thus their measures seemed to be valid from the standpoint of construct validity. Indeed, the researchers found that individuals with higher ASI ratings tended to have more problems in each of these areas, giving us more confidence in the ASI’s validity as a measure.
Construct validity is the type of validity that is established by showing that a measure is related to other measures as specified in a theory.
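The logic of the ASI comparison described above can be sketched as a set of correlations between the measure being validated and each indicator that theory says should be related to it. This is a minimal sketch, not the procedure McLellan et al. actually used: the eight subjects, their severity ratings, and their problem counts are all hypothetical, and pearson_r is an illustrative helper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical severity ratings for eight subjects on the measure under test.
severity = [2, 5, 3, 8, 6, 1, 7, 4]

# Hypothetical problem counts for the same subjects in each area that theory
# predicts should be related to substance abuse.
related_indicators = {
    "medical": [1, 3, 2, 5, 3, 0, 4, 2],
    "employment": [0, 2, 1, 4, 3, 1, 3, 2],
    "legal": [0, 1, 1, 3, 2, 0, 3, 1],
    "family": [1, 2, 2, 4, 3, 0, 4, 3],
    "psychiatric": [1, 3, 1, 4, 4, 0, 3, 2],
}

correlations = {
    area: pearson_r(severity, counts)
    for area, counts in related_indicators.items()
}

for area, r in correlations.items():
    print(f"{area:12s} r = {r:.2f}")

# The theory predicts a positive relationship in every area.
all_in_predicted_direction = all(r > 0 for r in correlations.values())
print("all correlations positive:", all_in_predicted_direction)
```

Support for construct validity comes from the whole pattern matching the theoretical predictions, not from any single correlation.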
The distinction between criterion and construct validation is not always clear. Opinions can differ about whether a particular indicator is indeed a criterion for the concept that is to be measured. For example, if you need to validate a question-based measure of sales ability for applicants to a sales position, few would object to using actual sales performance as a criterion. But what if you want to validate a question-based measure of the amount of social support that people receive from their friends? Should you just ask people about the social support they have received? Could friends’ reports of the amount of support they provided serve as a criterion? Even if you could observe people in the act of counseling or otherwise supporting their friends, can an observer be sure that the interaction is indeed supportive? There isn’t really a criterion here, just a combination of related concepts that could be used in a construct validation strategy.
What construct and criterion validation have in common is the comparison of scores on one measure to scores on other measures that are predicted to be related. It is not so important that researchers agree that a particular comparison measure is a criterion rather than a related construct. But it is very important to think critically about the quality of the comparison measure and whether it actually represents a different measure of the same phenomenon. For example, it is only a weak indication of measurement validity to find that scores on a new self-report measure of alcohol use are associated with scores on a previously used self-report measure of alcohol use.