<?xml version="1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.tei-c.org/ns/1.0 http://digital.lib.ecu.edu/tei/xsd/tei_P5.xsd">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>
        </title>
        <author>
        </author>
        <respStmt>
          <resp>Text encoded by</resp>
          <name>Digital Collections</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <distributor>East Carolina University. J. Y. Joyner Library</distributor>
        <address>
          <addrLine>Digital Collections</addrLine>
          <addrLine>Joyner Library, East Carolina University</addrLine>
          <addrLine>East Fifth Street, Greenville NC 27858-4353 USA</addrLine>
        </address>
        <date>2012</date>
      </publicationStmt>
      <sourceDesc>
        <bibl>
        </bibl>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <samplingDecl>
        <p>All quotation marks retained as data.</p>
        <p>All end-of-line hyphens have been removed, and the trailing part of a word has been joined to the preceding line.</p>
        <p>All smart quotes have been converted into straight quotes.</p>
      </samplingDecl>
      <classDecl>
        <taxonomy xml:id="LCSH">
          <bibl>Library of Congress Subject Headings</bibl>
        </taxonomy>
      </classDecl>
    </encodingDesc>
    <profileDesc>
      <creation>
        <date>
        </date>
      </creation>
      <langUsage xml:lang="en-US">
        <language ident="en-US" usage="100">English</language>
      </langUsage>
      <textClass>
        <keywords scheme="#LCSH">
          <list>
            <item>
            </item>
          </list>
        </keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="other">
        <p rend="align(centerbold)">[This text is machine generated and may contain errors.]</p>
        <pb facs="00079308_0001" />
        <p>Psychological Reports, 1965, 16, 1193-1196. © Southern Universities Press 1965<lb /><lb />EFFECT OF CHANCE SUCCESS DUE TO GUESSING ON ERROR<lb />OF MEASUREMENT IN MULTIPLE-CHOICE TESTS<lb /><lb />DONALD W. ZIMMERMAN AND RICHARD H. WILLIAMS<lb />East Carolina College<lb /><lb />
Summary. Chance success due to guessing is treated as a component of the error variance of a multiple-choice test score. It is shown that for a test of given item structure the minimum standard error of measurement can be estimated by the formula <formula notation="TeX">\sqrt{(N - X)/a}</formula>, where <hi rend="italic">N</hi> is the total number of items, <hi rend="italic">X</hi> is the score, and <hi rend="italic">a</hi> is the number of alternative choices per item. The significance of non-independence of true score and this component of error score on multiple-choice tests is discussed.<lb /><lb />
The reliability of a test is limited by a number of factors which taken together are said to constitute "error." For actual tests these factors and the relative contribution of each are unknown.<lb /><lb />
One contribution to the error variance of a multiple-choice test score, however, is apparent from examination of the test. This is the chance error due to the guessing inherent in this type of test. Suppose a test consists of <hi rend="italic">N</hi> items with <hi rend="italic">a</hi> alternative choices for each item. If a person had no knowledge of the subject matter but marked the answers to all items with the aid of a table of random numbers, a score of (1/<hi rend="italic">a</hi>)<hi rend="italic">N</hi> correct answers could be expected. If the number of alternatives per item is small (in most multiple-choice tests 4 or 5), a substantial component of the total score may be accounted for by successful guessing.<lb /><lb />
More important, however, is the variability of the component of the total score attributable to successful guessing. It is error variance which limits the reliability of a test. 
If all persons obtained the same number correct by guessing, there would be no problem. A constant would be added to the score. Different persons, however, will receive different increments in score as a result of more-or-less successful guessing. In fact, the number of correct guesses will presumably follow a binomial distribution with mean <hi rend="italic">np</hi> (where <hi rend="italic">n</hi> is the number of items guessed and <hi rend="italic">p</hi> is the probability of a correct guess) and variance <hi rend="italic">npq</hi> (where <hi rend="italic">q</hi> is 1 - <hi rend="italic">p</hi>).<lb /><lb />
In scoring tests a "correction formula," in which a fraction of the number of wrong answers is subtracted from the number right, is sometimes used (cf. Lord, 1963). The correction makes the scores of those persons who guess more comparable to the scores of those who, for one reason or another, do not guess. It should be noted, however, that the correction has no effect upon variability introduced by guessing. For some scores, in other words, the formula will undercorrect, for others it will overcorrect.<lb /><lb />
The item structure of a test (the number of items and the number of alternative</p>
        <pb facs="00079308_0002" />
        <p>choices per item) can thus be considered as contributing a component of error of measurement which is unavoidable. A simple formula can be derived by which this value can be estimated for any particular test.<lb /><lb />
The following symbols will be used.<lb /><lb />
<hi rend="italic">N</hi> = total number of items on the test<lb />
<hi rend="italic">X</hi> = observed score<lb />
<hi rend="italic">T</hi> = true score<lb />
<hi rend="italic">a</hi> = number of alternative choices per item<lb />
<hi rend="italic">n</hi> = number of items on which guesses are made<lb />
<formula notation="TeX">S_{e\,\min}^{2}</formula> = minimum error variance (variance of distribution of number of items guessed correctly)<lb />
<formula notation="TeX">S_{e\,\min}</formula> = minimum standard error of measurement.<lb /><lb />
The error variance in which we are interested is the variance of the distribution of number of items which are guessed correctly. This will be given by the binomial formula, <hi rend="italic">npq</hi>, where <hi rend="italic">n</hi> is the number of items on which guesses are made, <hi rend="italic">p</hi> is the probability of a successful guess, and <hi rend="italic">q</hi> is 1 - <hi rend="italic">p</hi>. For a multiple-choice test <hi rend="italic">p</hi> will be 1/<hi rend="italic">a</hi>, where <hi rend="italic">a</hi> is the number of alternative choices per item, and <hi rend="italic">q</hi> will be 1 - (1/<hi rend="italic">a</hi>).<lb /><lb />
In order to use the binomial formula the number of items on which guesses are made must be estimated. First, the conventional "correction formula" for guessing can be used to estimate the true score.<lb /><lb />
<formula notation="TeX">T = X - \frac{1}{a-1}(N - X)</formula> [1]<lb /><lb />
Or, as usually expressed, the true score is estimated by subtracting a fraction of the number of items "wrong" from the number "right." This fraction is one divided by one less than the number of alternatives per item (1/4 of number wrong for a test with 5 alternatives, 1/3 for 4 alternatives, and so on).<lb /><lb />
The number of items on which guesses are made can then be found by subtracting this result from the total number of items on the test.<lb /><lb />
<formula notation="TeX">n = N - \left\{ X - \frac{1}{a-1}(N - X) \right\}</formula> [2]<lb /><lb />
Finally, this result is substituted in the binomial formula to give the variance of the distribution of number of items guessed correctly,<lb /><lb />
<formula notation="TeX">S_{e\,\min}^{2} = \left[ N - \left\{ X - \frac{1}{a-1}(N - X) \right\} \right] \frac{1}{a} \left[ 1 - \frac{1}{a} \right]</formula> [3]<lb /><lb />
Simplifying, the following result is obtained:<lb /><lb />
<formula notation="TeX">S_{e\,\min}^{2} = \frac{N - X}{a}</formula> [4]<lb /><lb />
The square root gives the minimum standard error of measurement,<lb /><lb />
<formula notation="TeX">S_{e\,\min} = \sqrt{\frac{N - X}{a}}</formula> [5]<lb /><lb />
The value obtained by the formula is a minimum in that, if all other sources of error were eliminated, a standard error of this value would still be contributed by the item structure of the test.<lb /><lb />
Derivation of the formula is based on assumptions which are only approximated in an actual situation. Failure of these assumptions to hold precisely would further increase the standard error of measurement. For example, a person taking a test may eliminate some of the alternatives for a given item because he has partial information. There is never a sharp distinction between "knowing the answer" and "guessing" (cf. Horst, 1933). In an actual test, therefore, the probability of a successful guess will be somewhat greater than 1/<hi rend="italic">a</hi>.<lb /><lb />
Also, as said previously, there are many other sources of error in addition to guessing. Results obtained using the formula given above, therefore, must be considered as lower limits. Actual standard errors will be greater than the calculated values. The formula may prove useful, however, in giving a rough idea of what can be expected from any particular type of item structure.<lb /><lb />
For example, consider a test of 100 items, with 4 alternatives per item, and a score of 50. Use of the formula shows a minimum standard error of measurement of 3.5. Or, as an extreme example, consider a "true-false" test of 10 items and a score of 5. 
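This is a special case of a multiple-choice test, where <hi rend="italic">a</hi> is 2. Calculation shows a minimum standard error of measurement of 1.6. In this case the standard error would be almost as large as the standard deviation of the true scores which would be expected. Here the formula confirms what one would suspect, that short "true-false" tests are quite unreliable.<lb /><lb />
An important feature of this minimum standard error of measurement is that it varies with true score. The higher the true score, the lower its value will be. The standard error of measurement as usually understood is a fixed value for a given test. The confidence interval for true score which is established is the same width for any observed score. This difference reflects the special characteristics of the class of error variance considered in the present paper.<lb /><lb />
<table rows="17" cols="5">
<head>TABLE 1. MINIMUM STANDARD ERROR OF MEASUREMENT FOR A MULTIPLE-CHOICE TEST WITH <hi rend="italic">N</hi> ITEMS AND <hi rend="italic">a</hi> ALTERNATIVE CHOICES PER ITEM</head>
<row role="label"><cell>N</cell><cell>a = 2</cell><cell>a = 3</cell><cell>a = 4</cell><cell>a = 5</cell></row>
<row><cell role="label">10</cell><cell>1.6</cell><cell>1.3</cell><cell><supplied reason="illegible">1.1</supplied></cell><cell>1.0</cell></row>
<row><cell role="label">20</cell><cell><supplied reason="illegible">2.2</supplied></cell><cell>1.8</cell><cell>1.6</cell><cell>1.4</cell></row>
<row><cell role="label">30</cell><cell><supplied reason="illegible">2.7</supplied></cell><cell><supplied reason="illegible">2.2</supplied></cell><cell>1.9</cell><cell><supplied reason="illegible">1.7</supplied></cell></row>
<row><cell role="label">40</cell><cell>3.2</cell><cell>2.6</cell><cell><supplied reason="illegible">2.2</supplied></cell><cell>2.0</cell></row>
<row><cell role="label">50</cell><cell><supplied reason="illegible">3.5</supplied></cell><cell>2.9</cell><cell>2.5</cell><cell>2.2</cell></row>
<row><cell role="label">60</cell><cell>3.9</cell><cell>3.2</cell><cell><supplied reason="illegible">2.7</supplied></cell><cell>2.4</cell></row>
<row><cell role="label">70</cell><cell>4.2</cell><cell><supplied reason="illegible">3.4</supplied></cell><cell>3.0</cell><cell>2.6</cell></row>
<row><cell role="label">80</cell><cell>4.5</cell><cell>3.7</cell><cell><supplied reason="illegible">3.2</supplied></cell><cell>2.8</cell></row>
<row><cell role="label">90</cell><cell>4.7</cell><cell>3.9</cell><cell>3.4</cell><cell>3.0</cell></row>
<row><cell role="label">100</cell><cell>5.0</cell><cell>4.1</cell><cell><supplied reason="illegible">3.5</supplied></cell><cell><supplied reason="illegible">3.2</supplied></cell></row>
<row><cell role="label">110</cell><cell><supplied reason="illegible">5.2</supplied></cell><cell>4.3</cell><cell><supplied reason="illegible">3.7</supplied></cell><cell>3.3</cell></row>
<row><cell role="label">120</cell><cell><supplied reason="illegible">5.5</supplied></cell><cell>4.5</cell><cell>3.9</cell><cell>3.5</cell></row>
<row><cell role="label">130</cell><cell><supplied reason="illegible">5.7</supplied></cell><cell>4.7</cell><cell>4.1</cell><cell>3.6</cell></row>
<row><cell role="label">150</cell><cell>6.1</cell><cell>5.0</cell><cell>4.3</cell><cell>3.9</cell></row>
<row><cell role="label">200</cell><cell><supplied reason="illegible">7.1</supplied></cell><cell>5.8</cell><cell>5.0</cell><cell>4.5</cell></row>
<row><cell role="label">250</cell><cell>8.0</cell><cell>6.5</cell><cell>5.6</cell><cell>5.0</cell></row>
</table>
<lb />
Table 1 shows the values of the minimum standard error for selected values of <hi rend="italic">N</hi> and <hi rend="italic">a</hi>. These include those which would most often occur in tests. In calculating these values it has been assumed that the score being considered is 1/2 the total number of items.<lb /><lb />
An implication of the above consideration concerns non-independence of true score and error score in multiple-choice tests. In test theory it has been assumed often that error score and true score are uncorrelated. 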
For multiple-choice tests where there is chance success due to guessing this assumption cannot be made. Those persons with low true scores will guess on more items and thus receive relatively higher error scores. On the other</p>
        <pb facs="00079308_0003" />
        <p>hand, those persons with high true scores will guess on fewer items and receive lower error scores. Therefore, there will be a negative correlation between true score and error score. As shown above, minimum standard error of measurement is a decreasing function of true score.<lb /><lb />
The extent to which non-independence of true score and error score is a serious problem for test theory is not certain. Possibly the inaccuracy introduced by neglecting this relationship is not large. A similar situation has been found to be true in the case of other statistical problems where the fit of the theoretical model to the actual situation is imperfect (Box, 1953; Norton, 1953). In the case of multiple-choice tests, however, the fact of non-independence is clear and its possible effect could be large.<lb /><lb />
REFERENCES<lb /><lb />
Box, G. E. P. Non-normality and tests on variances. Biometrika, 1953, 40, 318-335.<lb /><lb />
Horst, A. P. The difficulty of a multiple choice test item. J. educ. Psychol., 1933, 24, 229-232.<lb /><lb />
Lord, F. M. Formula scoring and validity. Educ. psychol. Measmt, 1963, 23, 663-672.<lb /><lb />
Norton, D. W. An empirical investigation of some effects of non-normality and heterogeneity on the F-distribution. Unpublished Ph.D. thesis in Education, State Univer. of Iowa. Reported in E. F. Lindquist, Design and analysis of experiments in psychology and education. Boston: Houghton-Mifflin, 1953.<lb /><lb />
Accepted May 10, 1965.</p>
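        <p>The calculation described in equations [1] to [5] of the article can be sketched in Python as follows. This is an illustrative sketch only and is not part of the original article: the function names are hypothetical, and the only inputs assumed are N (items), X (observed score), and a (alternatives per item) as defined in the text. The variance is built step by step through equations [1] to [3] so that it can be compared with the closed form of equation [4].</p>
        <eg><![CDATA[
```python
import math


def _validate(n_items, score, n_alternatives):
    # Assumption of this sketch: at least two alternatives, score within 0..N.
    if n_alternatives < 2:
        raise ValueError("a multiple-choice item needs at least 2 alternatives")
    if not 0 <= score <= n_items:
        raise ValueError("the observed score must lie between 0 and N")


def estimated_true_score(n_items, score, n_alternatives):
    """Equation [1]: T = X - (N - X) / (a - 1)."""
    _validate(n_items, score, n_alternatives)
    return score - (n_items - score) / (n_alternatives - 1)


def min_error_variance(n_items, score, n_alternatives):
    """Equations [2] and [3]: binomial variance n * p * q of correct guesses."""
    # Equation [2]: items guessed = N - T.
    guessed = n_items - estimated_true_score(n_items, score, n_alternatives)
    p = 1.0 / n_alternatives  # probability of a successful guess
    q = 1.0 - p
    return guessed * p * q


def min_standard_error(n_items, score, n_alternatives):
    """Equation [5]: closed form sqrt((N - X) / a)."""
    _validate(n_items, score, n_alternatives)
    return math.sqrt((n_items - score) / n_alternatives)


if __name__ == "__main__":
    # The two worked examples given in the article.
    for n_items, score, n_alternatives in [(100, 50, 4), (10, 5, 2)]:
        variance = min_error_variance(n_items, score, n_alternatives)
        closed_form = (n_items - score) / n_alternatives  # equation [4]
        # The step-by-step variance should agree with the simplified form.
        assert math.isclose(variance, closed_form)
        print(f"{min_standard_error(n_items, score, n_alternatives):.1f}")
```
]]></eg>
        <p>Run as a script, the sketch prints 3.5 for the 100-item, 4-alternative test with a score of 50, and 1.6 for the 10-item true-false test with a score of 5, which agrees with the values quoted in the article.</p>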
      </div>
    </body>
  </text>
</TEI>