Effect of Chance Success Due To Guessing on Error of Measurement in Multiple-Choice Tests


Psychological Reports, 1965, 16, 1193-1196. © Southern Universities Press 1965

EFFECT OF CHANCE SUCCESS DUE TO GUESSING ON ERROR
OF MEASUREMENT IN MULTIPLE-CHOICE TESTS

DONALD W. ZIMMERMAN AND RICHARD H. WILLIAMS
East Carolina College

Summary. Chance success due to guessing is treated as a component of the
error variance of a multiple-choice test score. It is shown that for a test of given
item structure the minimum standard error of measurement can be estimated by
the formula $\sqrt{(N - X)/a}$, where N is the total number of items, X is the score,
and a is the number of alternative choices per item. The significance of
non-independence of true score and this component of error score on multiple-choice
tests is discussed.

The reliability of a test is limited by a number of factors which taken together
are said to constitute "error." For actual tests these factors and the relative
contribution of each are unknown.

One contribution to the error variance of a multiple-choice test score, how-
ever, is apparent from examination of the test. This is the chance error due to
the guessing inherent in this type of test. Suppose a test consists of N items
with a alternative choices for each item. If a person had no knowledge of the
subject matter but marked the answers to all items with the aid of a table of
random numbers, a score of (1/a)N correct answers could be expected. If the
number of alternatives per item is small (in most multiple-choice tests 4 or 5), a
substantial component of the total score may be accounted for by successful
guessing.

More important, however, is the variability of the component of the total
score attributable to successful guessing. It is error variance which limits the re-
liability of a test. If all persons obtained the same number correct by guessing,
there would be no problem. A constant would be added to the score. Different
persons, however, will receive different increments in score as a result of
more-or-less successful guessing. In fact, the number of correct guesses will presumably
follow a binomial distribution with mean np (where n is the number of items
guessed and p is the probability of a correct guess) and variance npq (where q
is 1 - p).
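As a minimal sketch of the binomial model just described, the following Python computes the mean np and variance npq of the number of correct guesses. The function name `guess_moments` is a hypothetical label, and blind guessing with p = 1/a on every guessed item is assumed.

```python
def guess_moments(n, a):
    """Mean and variance of the number of correct guesses on n items
    with a alternatives each, assuming blind guessing (p = 1/a)."""
    p = 1.0 / a   # probability of a correct guess
    q = 1.0 - p   # probability of an incorrect guess
    return n * p, n * p * q


# Hypothetical case: guessing blindly on all 100 items of a 4-alternative test
mean, variance = guess_moments(100, 4)
print(mean, variance)  # 25.0 18.75
```

The mean of 25 is the constant that would cause no difficulty by itself; the variance of 18.75 is the part that limits reliability.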

In scoring tests a "correction formula," in which a fraction of the number of
wrong answers is subtracted from the number right, is sometimes used (cf. Lord,
1963). The correction makes the scores of those persons who guess more comparable
to the scores of those who, for one reason or another, do not guess. It
should be noted, however, that the correction has no effect upon variability
introduced by guessing. For some scores, in other words, the formula will
undercorrect, for others it will overcorrect.
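The point that the correction leaves guessing variability in place can be illustrated with a small exact calculation. This is only a sketch under stated assumptions: a hypothetical examinee who knows exactly T items and guesses blindly (p = 1/a) on the remaining N - T; the function name `corrected_score_moments` is not from the paper.

```python
from math import comb


def corrected_score_moments(N, T, a):
    """Exact mean and variance of the corrected score for a person who
    knows T items and guesses blindly (p = 1/a) on the other N - T."""
    n = N - T
    p = 1.0 / a
    mean = 0.0
    second = 0.0
    for g in range(n + 1):
        # binomial probability of exactly g lucky guesses
        prob = comb(n, g) * p**g * (1 - p) ** (n - g)
        X = T + g                   # observed number right
        C = X - (N - X) / (a - 1)   # corrected score
        mean += prob * C
        second += prob * C * C
    return mean, second - mean**2


mean, var = corrected_score_moments(N=100, T=60, a=4)
print(round(mean, 4), round(var, 4))
```

Under these assumptions the corrected score averages to T, so the constant part of the guessing gain is removed, but its variance remains well above zero: individual corrected scores still fall above or below T.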

The item structure of a test (the number of items and the number of alternative
choices per item) can thus be considered as contributing a component of
error of measurement which is unavoidable. A simple formula can be derived
by which this value can be estimated for any particular test.

The following symbols will be used.

N = total number of items on the test

X = observed score

T = true score

a = number of alternative choices per item

n = number of items on which guesses are made

$S^2_{e\,\min}$ = minimum error variance (variance of distribution of number of items guessed
correctly)

$S_{e\,\min}$ = minimum standard error of measurement.

The error variance in which we are interested is the variance of the distribution
of number of items which are guessed correctly. This will be given by the
binomial formula, npq, where n is the number of items on which guesses are
made, p is the probability of a successful guess, and q is 1 - p. For a multiple-choice
test p will be 1/a, where a is the number of alternative choices per item,
and q will be 1 - (1/a).

In order to use the binomial formula the number of items on which guesses
are made must be estimated. First, the conventional "correction formula" for
guessing can be used to estimate the true score.

$$T = X - \frac{1}{a-1}(N - X) \qquad [1]$$

Or, as usually expressed, the true score is estimated by subtracting a fraction of
the number of items "wrong" from the number "right." This fraction is one
divided by one less than the number of alternatives per item (1/4 of number
wrong for a test with 5 alternatives, 1/3 for 4 alternatives, and so on).

The number of items on which guesses are made can then be found by sub-
tracting this result from the total number of items on the test.

$$n = N - \left\{ X - \frac{1}{a-1}(N - X) \right\} \qquad [2]$$
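Formulas [1] and [2] can be sketched in a few lines of Python. The function names are hypothetical, and the example figures (a score of 60 on a 100-item test with 5 alternatives) are chosen only for illustration.

```python
def estimated_true_score(X, N, a):
    """Formula [1]: number right minus 1/(a - 1) of the number wrong."""
    return X - (N - X) / (a - 1)


def estimated_items_guessed(X, N, a):
    """Formula [2]: items not accounted for by the estimated true score."""
    return N - estimated_true_score(X, N, a)


# Hypothetical case: 60 right out of 100 items, 5 alternatives per item
T = estimated_true_score(60, 100, 5)
n = estimated_items_guessed(60, 100, 5)
print(T, n)  # 50.0 50.0
```

Here one fourth of the 40 wrong answers is subtracted, giving an estimated true score of 50 and hence an estimated 50 items on which guesses were made.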

Finally, this result is substituted in the binomial formula to give the variance
of the distribution of number of items guessed correctly,

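$$S^2_{e\,\min} = \left[ N - \left\{ X - \frac{1}{a-1}(N - X) \right\} \right] \frac{1}{a} \left[ 1 - \frac{1}{a} \right] \qquad [3]$$

Simplifying, the following result is obtained:

$$S^2_{e\,\min} = \frac{N - X}{a} \qquad [4]$$

The square root gives the minimum standard error of measurement,

$$S_{e\,\min} = \sqrt{\frac{N - X}{a}} \qquad [5]$$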
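The simplification from [3] to [4] is stated without intermediate steps. One way to spell it out, using only the symbols already defined, is the following worked derivation:

```latex
\begin{align*}
n &= N - \Bigl\{ X - \tfrac{1}{a-1}(N - X) \Bigr\}
   = (N - X)\Bigl( 1 + \tfrac{1}{a-1} \Bigr)
   = \frac{a\,(N - X)}{a - 1}, \\[4pt]
npq &= \frac{a\,(N - X)}{a - 1} \cdot \frac{1}{a} \cdot \frac{a - 1}{a}
   = \frac{N - X}{a}.
\end{align*}
```

The factor a/(a - 1) introduced by the correction formula cancels against q = (a - 1)/a, which is why the final expression depends only on the number of items missed and the number of alternatives.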

The value obtained by the formula is a minimum in that, if all other sources
of error were eliminated, a standard error of this value would still be contributed
by the item structure of the test.

Derivation of the formula is based on assumptions which are only approximated
in an actual situation. Failure of these assumptions to hold precisely would
further increase the standard error of measurement. For example, a person taking
a test may eliminate some of the alternatives for a given item because he has
partial information. There is never a sharp distinction between "knowing the
answer" and "guessing" (cf. Horst, 1933). In an actual test, therefore, the
probability of a successful guess will be somewhat greater than 1/a.
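The effect of partial information on the guessing variance can be sketched as follows. The model is an assumption made here for illustration, not one given in the paper: the examinee eliminates k wrong alternatives on every guessed item, so that p = 1/(a - k), while the number of guessed items n is held fixed. The function name `guess_variance` is hypothetical.

```python
def guess_variance(n, a, k=0):
    """Variance npq of the number of correct guesses on n items when k of
    the a alternatives are eliminated before guessing (assumed model)."""
    remaining = a - k
    if remaining < 1:
        raise ValueError("at least one alternative must remain")
    p = 1.0 / remaining   # chance of a correct guess among what is left
    return n * p * (1.0 - p)


# Hypothetical case: 40 guessed items, 4 alternatives per item
for k in (0, 1, 2):
    print(k, round(guess_variance(40, 4, k), 2))
```

In this sketch the variance grows as p rises from 1/4 toward 1/2. Beyond p = 1/2 the product pq falls again, so the illustration applies only to the range shown.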

Also, as said previously, there are many other sources of error in addition to
guessing. Results obtained using the formula given above, therefore, must be
considered as lower limits. Actual standard errors will be greater than the cal-
culated values. The formula may prove useful, however, in giving a rough idea
of what can be expected from any particular type of item structure.

For example, consider a test of 100 items, with 4 alternatives per item, and a score
of 50. Use of the formula shows a minimum standard error of measurement of 3.5. Or,
as an extreme example, consider a "true-false" test of 10 items and a score of 5. This is a
special case of a multiple-choice test, where a is 2. Calculation shows a minimum standard
error of measurement of 1.6. In this case the standard error would be almost as large as
the standard deviation of the true scores which would be expected. Here the formula
confirms what one would suspect, that short "true-false" tests are quite unreliable.
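The two worked figures in this paragraph can be reproduced directly from the square-root formula. The function name `min_sem` is a hypothetical label for that formula.

```python
from math import sqrt


def min_sem(N, X, a):
    """Minimum standard error of measurement, sqrt((N - X) / a)."""
    return sqrt((N - X) / a)


print(round(min_sem(100, 50, 4), 1))  # 100 items, 4 alternatives, score 50 -> 3.5
print(round(min_sem(10, 5, 2), 1))    # 10 true-false items, score 5 -> 1.6
```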

An important feature of this minimum standard error of measurement is that it varies
with true score. The higher the true score, the lower its value will be. The standard error
of measurement as usually understood is a fixed value for a given test. The confidence in-
terval for true score which is established is the same width for any observed score. This
difference reflects the special characteristics of the class of error variance considered in the
present paper.
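How the minimum standard error changes with the observed score can be seen by evaluating the formula at several scores on one test. The test dimensions below (100 items, 4 alternatives) and the function name `min_sem_by_score` are illustrative assumptions.

```python
from math import sqrt


def min_sem_by_score(N, a, scores):
    """Pairs (score, minimum standard error) for each score in scores."""
    return [(X, sqrt((N - X) / a)) for X in scores]


# Hypothetical test: 100 items, 4 alternatives per item
for X, sem in min_sem_by_score(100, 4, (25, 50, 75, 100)):
    print(f"score {X:3d}: minimum standard error {sem:.2f}")
```

The value shrinks steadily as the score rises and reaches zero at a perfect score, where no items remain to be guessed.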

TABLE 1

MINIMUM STANDARD ERROR OF MEASUREMENT FOR A MULTIPLE-CHOICE TEST
WITH N ITEMS AND a ALTERNATIVE CHOICES PER ITEM

  N   a=2  a=3  a=4  a=5       N   a=2  a=3  a=4  a=5
 10   1.6  1.3  1.1  1.0      90   4.7  3.9  3.4  3.0
 20   2.2  1.8  1.6  1.4     100   5.0  4.1  3.5  3.2
 30   2.7  2.2  1.9  1.7     110   5.2  4.3  3.7  3.3
 40   3.2  2.6  2.2  2.0     120   5.5  4.5  3.9  3.5
 50   3.5  2.9  2.5  2.2     130   5.7  4.7  4.1  3.6
 60   3.9  3.2  2.7  2.4     150   6.1  5.0  4.3  3.9
 70   4.2  3.4  3.0  2.6     200   7.1  5.8  5.0  4.5
 80   4.5  3.7  3.2  2.8     250   8.0  6.5  5.6  5.0

Table 1 shows the values of the minimum standard error for selected values of N and
a. These include those which would most often occur in tests. In calculating these values
it has been assumed that the score being considered is 1/2 the total number of items.
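The entries of Table 1 can be recomputed with a short sketch, assuming a score equal to half the number of items and rounding to one decimal place. The function name `min_sem_half_score` is hypothetical.

```python
from math import sqrt


def min_sem_half_score(N, a):
    """Minimum standard error for a score of N/2, rounded to one decimal."""
    X = N / 2
    return round(sqrt((N - X) / a), 1)


test_lengths = (10, 20, 30, 40, 50, 60, 70, 80,
                90, 100, 110, 120, 130, 150, 200, 250)
for N in test_lengths:
    row = [min_sem_half_score(N, a) for a in (2, 3, 4, 5)]
    print(N, row)
```

As transcribed, two table entries (4.1 for N = 130, a = 4, and 8.0 for N = 250, a = 2) are 0.1 higher than this computation gives; whether that reflects rounding in the original or the transcription cannot be determined from the text.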

An implication of the above consideration concerns non-independence of true score
and error score in multiple-choice tests. In test theory it has been assumed often that error
score and true score are uncorrelated. For multiple-choice tests where there is chance suc-
cess due to guessing this assumption cannot be made. Those persons with low true scores
will guess on more items and thus receive relatively higher error scores. On the other
hand, those persons with high true scores will guess on fewer items and receive lower error
scores. Therefore, there will be a negative correlation between true score and error score.
As shown above, minimum standard error of measurement is a decreasing function of true
score.
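The negative correlation described here can be illustrated numerically. The following is a sketch under assumptions that are not in the paper: true scores T are taken to be uniformly distributed over 0..N (a hypothetical population), each examinee guesses blindly on the N - T items not known, and the error score is taken to be the raw number of correct guesses G. The function name `true_error_correlation` is hypothetical.

```python
from math import sqrt


def true_error_correlation(N, a):
    """Correlation between true score T and number of correct guesses G,
    assuming T uniform on 0..N and blind guessing on the other N - T items."""
    p = 1.0 / a
    scores = range(N + 1)
    w = 1.0 / (N + 1)   # equal weight for every true score
    mean_T = sum(w * T for T in scores)
    var_T = sum(w * (T - mean_T) ** 2 for T in scores)
    # Conditional on T: E[G] = (N - T) p and Var[G] = (N - T) p (1 - p)
    mean_G = sum(w * (N - T) * p for T in scores)
    within = sum(w * (N - T) * p * (1 - p) for T in scores)
    between = sum(w * ((N - T) * p - mean_G) ** 2 for T in scores)
    cov = sum(w * (T - mean_T) * ((N - T) * p - mean_G) for T in scores)
    return cov / sqrt(var_T * (within + between))


r = true_error_correlation(100, 4)
print(round(r, 3))
```

The sign is negative for any N and a under this model, since the expected number of correct guesses falls linearly with T. The magnitude depends heavily on the assumed spread of true scores and should not be read as an estimate for actual tests.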

The extent to which non-independence of true score and error score is a serious prob-
lem for test theory is not certain. Possibly the inaccuracy introduced by neglecting this re-
lationship is not large. A similar situation has been found to be true in the case of other
statistical problems where the fit of the theoretical model to the actual situation is imper-
fect (Box, 1953; Norton, 1953). In the case of multiple-choice tests, however, the fact of
non-independence is clear and its possible effect could be large.

REFERENCES

Box, G. E. P. Non-normality and tests on variances. Biometrika, 1953, 40, 318-335.

Horst, A. P. The difficulty of a multiple choice test item. J. educ. Psychol., 1933, 24,
229-232.

Lord, F. M. Formula scoring and validity. Educ. psychol. Measmt, 1963, 23, 663-672.

Norton, D. W. An empirical investigation of some effects of non-normality and
heterogeneity on the F-distribution. Unpublished Ph.D. thesis in Education, State Univer.
of Iowa. Reported in E. F. Lindquist, Design and analysis of experiments in
psychology and education. Boston: Houghton-Mifflin, 1953.

Accepted May 10, 1965.

