What are the four basic methods of evaluating reliability?

What will be an ideal response?


1) Test-Retest Reliability
When researchers measure a phenomenon that does not change between two points in time, the degree to which the two measurements yield comparable, if not identical, values is the test-retest reliability of the measure. If you take a test of your math ability and then retake it two months later, the test is performing reliably if you receive a similar score both times, presuming that nothing happened during the two months to change your math ability. Of course, if events between the test and the retest have changed the variable being measured, then the difference between the test and retest scores should reflect that change.
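As a rough illustration of the idea, test-retest reliability is often summarized as the correlation between the two sets of scores. The sketch below uses invented scores and Python’s standard library (statistics.correlation requires Python 3.10+); it is a minimal example, not a prescribed procedure.

```python
# Minimal sketch: test-retest reliability as the Pearson correlation
# between the same subjects' scores at time 1 and time 2.
from statistics import correlation  # Python 3.10+

time1 = [72, 85, 64, 90, 78, 69]  # hypothetical math-test scores
time2 = [70, 88, 66, 87, 80, 71]  # same subjects, two months later

r = correlation(time1, time2)  # Pearson's r
print(f"test-retest reliability (Pearson r) = {r:.2f}")
```

A value near 1.0 indicates that subjects’ relative standing barely changed between the two administrations.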
2) Inter-Item Reliability (Internal Consistency)
When researchers use multiple items to measure a single concept, they are concerned with inter-item reliability (or internal consistency). For example, if we are to have confidence that a set of questions reliably measures an attitude, say, attitudes toward violence, then the answers to those questions should be highly associated with one another. The stronger the association among the individual items and the more items included, the higher the reliability of the index. Cronbach’s alpha is the statistic most commonly used to assess inter-item reliability; a short computational sketch follows the definitions below. Of course, inter-item reliability cannot be computed if only one question is used to measure a concept. For this reason, it is much better to use a multi-item index to measure an important concept (Viswanathan, 2005).
Test-retest reliability A measurement shows test-retest reliability when measures of a phenomenon at two points in time are highly correlated, provided the phenomenon has not changed, or change only as much as the phenomenon itself has changed.
Interitem reliability An approach that calculates reliability based on the correlation among multiple items used to measure a single concept.
Cronbach’s alpha A statistic that measures the reliability of items in an index or scale.
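Cronbach’s alpha can be computed directly from its standard formula: alpha = (k / (k − 1)) × (1 − sum of the item variances / variance of the total scores), where k is the number of items. The sketch below applies that formula to an invented response matrix (rows are respondents, columns are items on a 5-point scale) purely for illustration.

```python
# Minimal sketch of Cronbach's alpha for a k-item index, invented data.
from statistics import variance

responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

k = len(responses[0])
items = list(zip(*responses))  # regroup by item (column)
sum_item_vars = sum(variance(item) for item in items)
total_var = variance([sum(row) for row in responses])

alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")  # about 0.96 for these data
```

Higher values indicate greater internal consistency; a common rule of thumb treats 0.70 or above as acceptable.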
3) Alternate-Forms Reliability
Researchers are testing alternate-forms reliability when they compare subjects’ answers to slightly different versions of survey questions (Litwin, 1995). A researcher may reverse the order of the response choices in an index or modify the question wording in minor ways and then readminister that index to subjects. If the two sets of responses are not too different, alternate-forms reliability is established.
A related test of reliability is the split-halves reliability approach. A survey sample is divided in two by flipping a coin or using some other random assignment method, and each half of the sample is then administered one of the two forms of the questions. If the responses of the two halves are about the same, the measure’s reliability is established (see the sketch after the definitions below).
Alternate-forms reliability A procedure for testing the reliability of responses to survey questions in which subjects’ answers are compared after the subjects have been asked slightly different versions of the questions or when randomly selected halves of the sample have been administered slightly different versions of the questions.
Split-halves reliability Reliability achieved when responses to the same questions by two randomly selected halves of a sample are about the same.
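Here is a minimal sketch of the split-halves check described above, using invented answers: assume subjects have already been randomly assigned (for example, by coin flip), with half answering the original wording and half the reworded version, and compare the two halves’ average responses.

```python
# Minimal sketch of split-halves reliability, invented 5-point answers.
form_a_responses = [3, 4, 2, 5, 4, 3]  # half A: original wording
form_b_responses = [4, 3, 3, 5, 4, 2]  # half B: reworded version

mean_a = sum(form_a_responses) / len(form_a_responses)
mean_b = sum(form_b_responses) / len(form_b_responses)
print(f"form A mean = {mean_a:.2f}, form B mean = {mean_b:.2f}")
# Nearly identical means suggest the two forms behave the same way.
```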
4) Intra-Observer and Inter-Observer Reliability
When ratings by an observer, rather than ratings by the subjects themselves, are being assessed at two or more points in time, test-retest reliability is termed intra-observer or intra-rater reliability. Let’s say a researcher observes a grade school cafeteria for signs of bullying behavior every Friday for several weeks. If his observations capture the same degree of bullying each week (and the actual level of bullying has not changed), his observations are reliable. When researchers use more than one observer to rate the same persons, events, or places, inter-observer reliability is their goal. If observers are using the same instrument to rate the same thing, their ratings should be very similar. In this case, the researcher interested in cafeteria bullying would use more than one observer. If the measurement of bullying is similar across the observers, we can have much more confidence that the ratings reflect the actual degree of bullying behavior (a simple agreement check is sketched after the definitions below).
Intraobserver reliability (intrarater reliability) Consistency of ratings by an observer of an unchanging phenomenon at two or more points in time.
Interobserver reliability When similar measurements are obtained by different observers rating the same persons, events, or places.
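As a simple illustration of inter-observer reliability, the sketch below computes percent agreement between two observers’ invented codes for the same ten lunch periods ("B" = bullying observed, "N" = none). Percent agreement is the simplest index; chance-corrected statistics such as Cohen’s kappa are often preferred in practice.

```python
# Minimal sketch: percent agreement between two observers coding the
# same lunch periods ("B" = bullying, "N" = none). Invented data.
obs1 = ["B", "N", "B", "N", "N", "B", "N", "B", "N", "N"]
obs2 = ["B", "N", "B", "B", "N", "B", "N", "B", "N", "N"]

agreement = sum(a == b for a, b in zip(obs1, obs2)) / len(obs1)
print(f"percent agreement = {agreement:.0%}")  # 90% for these data
```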
