<?xml version="1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.tei-c.org/ns/1.0 http://digital.lib.ecu.edu/tei/xsd/tei_P5.xsd">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>
        </title>
        <author>
        </author>
        <respStmt>
          <resp>Text encoded by</resp>
          <name>Digital Collections</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <distributor>East Carolina University. J. Y. Joyner Library</distributor>
        <address>
          <addrLine>Digital Collections</addrLine>
          <addrLine>Joyner Library, East Carolina University</addrLine>
          <addrLine>East Fifth Street, Greenville NC 27858-4353 USA</addrLine>
        </address>
        <date>2012</date>
      </publicationStmt>
      <sourceDesc>
        <bibl>
        </bibl>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <samplingDecl>
        <p>All quotation marks retained as data.</p>
        <p>All end-of-line hyphens have been removed, and the trailing part of a word has been joined to the preceding line.</p>
        <p>All smart quotes have been converted into straight quotes.</p>
      </samplingDecl>
      <classDecl>
        <taxonomy xml:id="LCSH">
          <bibl>Library of Congress Subject Headings</bibl>
        </taxonomy>
      </classDecl>
    </encodingDesc>
    <profileDesc>
      <creation>
        <date>
        </date>
      </creation>
      <langUsage xml:lang="en-US">
        <language ident="en-US" usage="100">English</language>
      </langUsage>
      <textClass>
        <keywords scheme="#LCSH">
          <list>
            <item>
            </item>
          </list>
        </keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="other">
        <p rend="align(centerbold)">[This text is machine generated and may contain errors.]</p>
        <pb facs="00079320_0001" />
        <p>THE JOURNAL OF EXPERIMENTAL EDUCATION<lb />(Volume 35, Number 4, Summer 1967)</p>
        <p>THE MAXIMUM RELIABILITY OF A MULTIPLE-CHOICE TEST AS A FUNCTION OF NUMBER OF ITEMS, NUMBER OF CHOICES, AND GROUP HETEROGENEITY</p>
        <p>GRAHAM J. BURKHEIMER<lb />DONALD W. ZIMMERMAN<lb />RICHARD H. WILLIAMS<lb />East Carolina College</p>
        <p>In previous papers (7, 8) it has been shown that chance success due to guessing introduces an unavoidable source of error into multiple-choice test scores. This particular class of error is negatively correlated with true scores. The usual equations for test reliability and other intercorrelations among components of test scores depend upon the assumption that the correlations between true scores and error scores, and between error scores on parallel forms of a test, are zero. In previous papers (6, 8, 9, 10) more general equations for these intercorrelation terms, which do not depend upon the above assumptions, have been presented.</p>
        <p>Because of the presence of chance success due to guessing, the reliability of a multiple-choice test has a maximum value. In other words, if all sources of error other than chance success due to guessing were eliminated, the reliability of a test would remain at some value less than unity because of the unavoidable error due to guessing. The computer simulation method described previously (8) gave reliabilities for several kinds of tests, under the assumption that only error due to guessing is present. The purpose of this paper is to determine these values using analytic methods. An equation for the maximum reliability of a multiple-choice test, which involves only number of items, number of choices, and mean and variance of true scores (group heterogeneity), is derived.</p>
        <p>Horst (2) derived equations indicating the maximum correlation between two different tests. Beginning with these, Roberts (5) derived equations for maximum reliability of a test. These results involve item difficulties and are based on assumptions concerning intercorrelations among items. The relation of number of alternative choices to test reliability has also been investigated by Carroll (1), Lord (3), and Plumlee (4). The present paper differs from these approaches to the problem in that it does not involve item difficulties, but considers only components of variance of test scores. It involves no assumptions about intercorrelations among items and holds for the case in which there is a negative correlation between true scores and error scores introduced by guessing. The result is relatively simple in form.</p>
        <p>VARIANCE OF ERROR SCORES AND OF OBSERVED SCORES</p>
        <p>When chance success due to guessing is the only source of error, the error scores for those true scores having a fixed value, T, will approach a binomial distribution as the number of cases considered increases without limit. Therefore, we can write</p>
        <p>[1] <formula notation="TeX">np = \frac{\sum_{i=1}^{K_T} E_{Ti}}{K_T} = \bar{E}_T ,</formula></p>
        <p>where <formula notation="TeX">\bar{E}_T</formula> is the mean of the error scores for the true scores having some fixed value, <formula notation="TeX">E_{Ti}</formula> is an error score for one of these true scores, and <formula notation="TeX">K_T</formula> is the number of true scores having that particular value.</p>
        <pb facs="00079320_0002" />
        <p>Here np indicates the mean of a binomial distribution. Since n = N - T and p = 1/a (7), we have</p>
        <p>[2] <formula notation="TeX">\sum_{i=1}^{K_T} E_{Ti} = \frac{K_T N - K_T T}{a} ,</formula></p>
        <p>where N is the total number of items and a is the number of choices per item. Summing, as the true score value varies from 0 to N, we can write</p>
        <p>[3] <formula notation="TeX">\sum_{T=0}^{N} \sum_{i=1}^{K_T} E_{Ti} = \sum_{T=0}^{N} \frac{K_T N - K_T T}{a} ,</formula> or</p>
        <p>[4] <formula notation="TeX">\sum E = \frac{KN - \sum T}{a} ,</formula></p>
        <p>where <formula notation="TeX">K = \sum_{T=0}^{N} K_T</formula> is the total number of cases. Equation [4] gives, in other words, the sum of the error scores in the entire distribution of test scores.</p>
        <p>The variance of error scores corresponding to the true scores for a fixed value, T, is</p>
        <p>[5] <formula notation="TeX">s_{E_T}^2 = \frac{\sum_{i=1}^{K_T} E_{Ti}^2}{K_T} - \left( \frac{\sum_{i=1}^{K_T} E_{Ti}}{K_T} \right)^2 .</formula></p>
        <p>As <formula notation="TeX">K_T</formula> increases without limit this variance is also given by the binomial formula, npq, where q = (1 - p), or (1 - 1/a). Therefore we can write</p>
        <p>[6] <formula notation="TeX">\frac{\sum_{i=1}^{K_T} E_{Ti}^2}{K_T} - \left( \frac{\sum_{i=1}^{K_T} E_{Ti}}{K_T} \right)^2 = \left( \frac{N - T}{a} \right) \left( \frac{a - 1}{a} \right) .</formula></p>
        <p>Solving [6] for <formula notation="TeX">\sum E_{Ti}^2</formula> gives</p>
        <p>[7] <formula notation="TeX">\sum_{i=1}^{K_T} E_{Ti}^2 = \frac{a - 1}{a^2} \left( K_T N - K_T T \right) + \frac{\left( \sum_{i=1}^{K_T} E_{Ti} \right)^2}{K_T} .</formula></p>
        <p>Substituting [2] in [7] gives</p>
        <p>[8] <formula notation="TeX">\sum_{i=1}^{K_T} E_{Ti}^2 = \frac{a - 1}{a^2} \left[ K_T N - K_T T \right] + \frac{1}{a^2} \left[ K_T N^2 - 2 K_T N T + K_T T^2 \right] .</formula></p>
        <p>Summing, as the true score value varies from 0 to N, leads to the following result:</p>
        <p>[9] <formula notation="TeX">\sum_{T=0}^{N} \sum_{i=1}^{K_T} E_{Ti}^2 = \frac{a - 1}{a^2} \sum_{T=0}^{N} \left[ K_T N - K_T T \right] + \frac{1}{a^2} \sum_{T=0}^{N} \left[ K_T N^2 - 2 K_T N T + K_T T^2 \right] ,</formula> or</p>
        <p>[10] <formula notation="TeX">\sum E^2 = \frac{a - 1}{a^2} \left( KN - \sum T \right) + \frac{1}{a^2} \left( KN^2 - 2N \sum T + \sum T^2 \right) .</formula></p>
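        <p>As an illustration only, the binomial assumption behind [2] and [6] can be checked by exact enumeration. The sketch below assumes an examinee who knows T items and guesses blindly, with success probability 1/a, on the remaining N - T items. The function name error_moments and the values N = 10, a = 5, T = 4 are hypothetical choices made for this example and do not come from the paper.</p>
        <p>
```python
from fractions import Fraction
from math import comb


def error_moments(n_items, n_choices, true_score):
    """Exact mean and variance of the guessing error for one true score.

    Assumption (from the text): the error score is binomial with
    n = N - T trials and success probability p = 1/a.
    """
    n = n_items - true_score        # items answered by guessing
    p = Fraction(1, n_choices)      # chance of a lucky guess
    q = 1 - p
    # Exact binomial probabilities for 0..n lucky guesses.
    pmf = [comb(n, e) * p**e * q ** (n - e) for e in range(n + 1)]
    mean = sum(e * w for e, w in enumerate(pmf))
    second_moment = sum(e * e * w for e, w in enumerate(pmf))
    return mean, second_moment - mean**2


# Hypothetical case: 10 items, 5 choices, true score 4.
mean, var = error_moments(n_items=10, n_choices=5, true_score=4)
print(mean, var)  # compare with (N - T)/a and (N - T)(a - 1)/a**2
```
        </p>
        <p>Dividing [2] by the number of cases at that true score gives the mean (N - T)/a, and the right-hand side of [6] gives the variance, so the two printed values can be compared directly with those expressions. Exact fractions are used so that the comparison does not depend on rounding.</p>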
        <pb facs="00079320_0003" />
        <p>Total error variance is given by</p>
        <p>[11] <formula notation="TeX">s_e^2 = \frac{\sum E^2}{K} - \frac{\left( \sum E \right)^2}{K^2} .</formula></p>
        <p>Substituting [4] and [10] in [11] and reducing gives</p>
        <p>[12] <formula notation="TeX">s_e^2 = \frac{1}{a^2} \left( \frac{\sum T^2}{K} - \frac{\left( \sum T \right)^2}{K^2} \right) + \frac{a - 1}{a^2} \left( N - \bar{T} \right) ,</formula> which can also be written as</p>
        <p>[13] <formula notation="TeX">s_e^2 = \frac{s_t^2}{a^2} + \frac{a - 1}{a^2} \left( N - \bar{T} \right) .</formula></p>
        <p>In a similar manner it can be shown that the variance of observed scores is given by the following equation:</p>
        <p>[14] <formula notation="TeX">s_o^2 = \frac{(a - 1)^2}{a^2} s_t^2 + \frac{a - 1}{a^2} \left( N - \bar{T} \right) .</formula></p>
        <p>Equations [13] and [14], then, give the variance of error scores and observed scores under the assumption that chance success due to guessing is the only source of error. These variances are expressed as a function of number of items, number of choices, and mean and variance of true scores.</p>
        <p>CORRELATION BETWEEN ERROR SCORES ON PARALLEL FORMS OF A TEST</p>
        <p>An expression will now be derived for the correlation between error scores on parallel forms of a test. This correlation can be written as follows:</p>
        <p>[15] <formula notation="TeX">r_{ee} = \frac{\frac{\sum E_1 E_2}{K} - \bar{E}^2}{s_e^2} ,</formula> or</p>
        <p>[16] <formula notation="TeX">r_{ee} = \frac{\sum E_1 E_2 - \frac{\left( \sum E \right)^2}{K}}{\sum E^2 - \frac{\left( \sum E \right)^2}{K}} .</formula></p>
        <p>Expressions for <formula notation="TeX">\sum E</formula> and <formula notation="TeX">\sum E^2</formula> are given in [4] and [10]. An expression is needed, therefore, for <formula notation="TeX">\sum E_1 E_2</formula> in order to determine <formula notation="TeX">r_{ee}</formula>. We begin by finding the sum of <formula notation="TeX">E_1 E_2</formula> values for a fixed <formula notation="TeX">E_1</formula> value and a fixed T value. In other words, we consider a joint distribution of error scores on parallel forms of a test for each T value. We can write</p>
        <p>[17] <formula notation="TeX">\sum_{j=1}^{K_{T_1}} E_1 E_{2j} = E_1 \sum_{j=1}^{K_{T_1}} E_{2j} ,</formula> since <formula notation="TeX">E_1</formula> is fixed. Using [2] gives</p>
        <p>[18] <formula notation="TeX">\sum_{j=1}^{K_{T_1}} E_1 E_{2j} = E_1 K_{T_1} \left( \frac{N - T}{a} \right) .</formula> Since for a fixed value of <formula notation="TeX">E_1</formula>, <formula notation="TeX">K_{T_1} = K_{E_1}</formula>, we have</p>
        <p>[19] <formula notation="TeX">\sum_{j=1}^{K_{E_1}} E_1 E_{2j} = E_1 K_{E_1} \left( \frac{N - T}{a} \right) .</formula></p>
        <p>Summing now over the <formula notation="TeX">E_1</formula> values gives the following:</p>
        <pb facs="00079320_0004" />
        <p>[20] <formula notation="TeX">\sum_{E_1=0}^{N-T} \sum_{j=1}^{K_{E_1}} E_1 E_{2j} = \frac{N - T}{a} \sum_{E_1=0}^{N-T} E_1 K_{E_1} ,</formula> which can also be written as</p>
        <p>[21] <formula notation="TeX">\sum_{E_1=0}^{N-T} \sum_{j=1}^{K_{E_1}} E_1 E_{2j} = \frac{N - T}{a} \sum_{i=1}^{K_T} E_{Ti} ,</formula> where <formula notation="TeX">K_T</formula> indicates the total number of cases for the fixed value of T.</p>
        <p>Again using equation [2] we have</p>
        <p>[22] <formula notation="TeX">\sum_{E_1=0}^{N-T} \sum_{j=1}^{K_{E_1}} E_1 E_{2j} = \left( \frac{N - T}{a} \right) \left( \frac{K_T N - K_T T}{a} \right) ,</formula> or</p>
        <p>[23] <formula notation="TeX">\sum_{E_1=0}^{N-T} \sum_{j=1}^{K_{E_1}} E_1 E_{2j} = \frac{K_T N^2 - 2 K_T N T + K_T T^2}{a^2} .</formula></p>
        <p>We now need only sum over the T values to obtain <formula notation="TeX">\sum E_1 E_2</formula> for the entire distribution. Doing this, we obtain</p>
        <p>[24] <formula notation="TeX">\sum_{T=0}^{N} \sum_{E_1=0}^{N-T} \sum_{j=1}^{K_{E_1}} E_1 E_{2j} = \sum_{T=0}^{N} \frac{K_T N^2 - 2 K_T N T + K_T T^2}{a^2} ,</formula> or</p>
        <p>[25] <formula notation="TeX">\sum E_1 E_2 = \frac{1}{a^2} \left( KN^2 - 2N \sum T + \sum T^2 \right) .</formula></p>
        <p>Substituting equations [4] and [25] in the numerator of [16] and simplifying gives</p>
        <p>[26] <formula notation="TeX">r_{ee} = \frac{\sum T^2 - \frac{\left( \sum T \right)^2}{K}}{a^2 \left( \sum E^2 - \frac{\left( \sum E \right)^2}{K} \right)} .</formula></p>
        <p>Dividing by K in both numerator and denominator leads to the following result:</p>
        <p>[27] <formula notation="TeX">r_{ee} = \frac{s_t^2}{a^2 s_e^2} .</formula></p>
        <p>Reliability is given by</p>
        <p>[28] <formula notation="TeX">r_{oo} = 1 - \frac{s_e^2}{s_o^2} \left( 1 - r_{ee} \right)</formula> (Reference 8). Substituting [27] in [28] we have</p>
        <p>[29] <formula notation="TeX">r_{oo} = \frac{a^2 s_o^2 - a^2 s_e^2 + s_t^2}{a^2 s_o^2} .</formula></p>
        <p>Subtracting [13] from [14] and multiplying by <formula notation="TeX">a^2</formula> gives</p>
        <p>[30] <formula notation="TeX">a^2 s_o^2 - a^2 s_e^2 = (a - 1)^2 s_t^2 - s_t^2 .</formula></p>
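        <p>A brute-force check of [13], [14], [27], and [28] may help in following the derivation. The sketch below is an illustration under stated assumptions, not the authors' method: it takes a small hypothetical population (6 items, 3 choices, true scores 1, 3, and 5 with relative frequencies 1/4, 1/2, 1/4), treats the errors on two parallel forms as independent binomial draws for each true score, and enumerates every combination exactly. The helper names binom_pmf and exact_moments are invented for this example.</p>
        <p>
```python
from fractions import Fraction
from itertools import product
from math import comb


def binom_pmf(n, p):
    """Exact binomial probabilities for 0..n successes."""
    return [comb(n, e) * p**e * (1 - p) ** (n - e) for e in range(n + 1)]


def exact_moments(true_dist, n_items, n_choices):
    """Enumerate (T, E1, E2) exactly for a toy population.

    true_dist maps each true score T to its relative frequency; E1 and E2
    are independent binomial guessing errors on two parallel forms.
    """
    p = Fraction(1, n_choices)
    cells = []  # (probability, T, E1, E2)
    for t, w in true_dist.items():
        pmf = list(enumerate(binom_pmf(n_items - t, p)))
        for (e1, w1), (e2, w2) in product(pmf, repeat=2):
            cells.append((w * w1 * w2, t, e1, e2))

    def mean(f):
        return sum(w * f(t, e1, e2) for w, t, e1, e2 in cells)

    def cov(f, g):
        return mean(lambda t, e1, e2: f(t, e1, e2) * g(t, e1, e2)) - mean(f) * mean(g)

    true = lambda t, e1, e2: t
    err1 = lambda t, e1, e2: e1
    err2 = lambda t, e1, e2: e2
    obs1 = lambda t, e1, e2: t + e1
    obs2 = lambda t, e1, e2: t + e2
    return {
        "mean_t": mean(true),
        "var_t": cov(true, true),
        "var_e": cov(err1, err1),
        "var_o": cov(obs1, obs1),
        "r_ee": cov(err1, err2) / cov(err1, err1),
        "r_oo": cov(obs1, obs2) / cov(obs1, obs1),
    }


N, a = 6, 3  # hypothetical test: 6 items, 3 choices each
dist = {1: Fraction(1, 4), 3: Fraction(1, 2), 5: Fraction(1, 4)}  # hypothetical
m = exact_moments(dist, N, a)
gap = N - m["mean_t"]  # N minus the mean true score

# Each assertion compares an enumerated quantity with one equation.
assert m["var_e"] == m["var_t"] / a**2 + Fraction(a - 1, a**2) * gap  # [13]
assert m["var_o"] == Fraction((a - 1) ** 2, a**2) * m["var_t"] + Fraction(a - 1, a**2) * gap  # [14]
assert m["r_ee"] == m["var_t"] / (a**2 * m["var_e"])  # [27]
assert m["r_oo"] == 1 - m["var_e"] / m["var_o"] * (1 - m["r_ee"])  # [28]
print(m["r_ee"], m["r_oo"])
```
        </p>
        <p>Because the arithmetic is carried out in exact fractions, the assertions hold or fail exactly, with no rounding tolerance. The population is deliberately tiny so that the full joint distribution can be listed; it says nothing about realistic test lengths.</p>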
        <pb facs="00079320_0005" />
        <p>Substituting [30] in [29] and simplifying, we have</p>
        <p>[31] <formula notation="TeX">r_{oo} = \frac{(a - 1)^2 s_t^2}{a^2 s_o^2} .</formula></p>
        <p>This expression gives maximum reliability in terms of number of choices, variance of true scores, and variance of observed scores. Substituting the value for <formula notation="TeX">s_o^2</formula> given by [14] leads to the following alternative result:</p>
        <p>[32] <formula notation="TeX">r_{oo} = \frac{(a - 1) s_t^2}{(a - 1) s_t^2 + N - \bar{T}} .</formula></p>
        <p>This equation, then, gives the maximum reliability of a multiple-choice test as a function of number of items, number of choices, variance of true scores, and mean of true scores. It indicates that maximum reliability depends on group heterogeneity as well as test length and number of choices.</p>
        <p>Since <formula notation="TeX">\bar{O} = \bar{T} + \bar{E}</formula> and, from [4], <formula notation="TeX">\bar{E} = \frac{N - \bar{T}}{a}</formula>, we can write</p>
        <p>[33] <formula notation="TeX">\bar{O} = \bar{T} + \frac{N - \bar{T}}{a} .</formula></p>
        <p>Solving [14] for <formula notation="TeX">s_t^2</formula>, substituting the result, together with [33], in [31], and simplifying gives another expression for maximum reliability:</p>
        <p>[34] <formula notation="TeX">r_{oo} = 1 - \frac{N - \bar{O}}{a s_o^2} .</formula></p>
        <p>ALTERNATIVE EQUATIONS FOR CORRELATION BETWEEN ERROR SCORES ON PARALLEL FORMS</p>
        <p>Substituting [13] in [27] and simplifying, we have</p>
        <p>[35] <formula notation="TeX">r_{ee} = \frac{s_t^2}{s_t^2 + (a - 1) \left( N - \bar{T} \right)} ,</formula></p>
        <p>which is similar in form to [32]. Equation [34] can be written in this form:</p>
        <p>[36] <formula notation="TeX">s_o^2 \left( 1 - r_{oo} \right) = \frac{N - \bar{O}}{a} .</formula></p>
        <p>Equation [28] can be written as follows:</p>
        <p>[37] <formula notation="TeX">s_o^2 \left( 1 - r_{oo} \right) = s_e^2 \left( 1 - r_{ee} \right) .</formula></p>
        <p>Substituting the right-hand side of [37] in [36] and simplifying, we have</p>
        <p>[38] <formula notation="TeX">r_{ee} = 1 - \frac{N - \bar{O}}{a s_e^2} .</formula></p>
        <p>COMPUTER CHECKS</p>
        <p>The equations presented above give the values of <formula notation="TeX">r_{oo}</formula> and <formula notation="TeX">r_{ee}</formula> which would be expected if chance success due to guessing were the only source of error in multiple-choice tests. The reliabilities of actual tests would be expected to be less than these values because of the presence of other sources of error. In addition, if reliability were determined from a finite number of ordered pairs of observed scores on parallel forms of a test, with only error due to guessing present, there would be sampling variability of the reliability coefficient. The binomial distribution of error scores assumed in derivation of the equations, in other words, would be only approximated for any finite number of true scores.</p>
        <p>As the number of ordered pairs of scores on parallel forms increases without limit, however, the reliability coefficient would be expected to come closer and closer to the values given by the equations. In a previous paper (8) a method of determining the reliability coefficient by computer simulation was described. It was shown that for fairly large numbers of scores (samples of 100, 400, 700, and 1000) the estimates given by the method were stable. For example, for ten samples of 400 scores, the reliability of a 100-item, two-choice test was indicated as .89, .88, .89, .87, .90, .88, .89, .89, .89, and .87.</p>
        <p>In Table 1 the reliabilities given by the computer simulation method are compared to the values given analytically by equations [31], [32], and [34] above. Also, the correlations between error scores on parallel forms given by the computer program are compared to the values given by equations [27], [35], and [38] above. In making these checks we begin with a distribution of true scores having a certain mean and a certain variance. The computer program then generates error scores which depend upon the magnitude of the true scores, as a model of guessing error, and these are added to the true scores to give observed scores. Repeating the procedure gives results comparable to observed scores on parallel forms of a test, when guessing is the only source of error. Finally, product-moment correlations between the two sets of observed scores give an indication of test reliability. Also, the correlation between the two sets of error scores is found, as well as the means and variances of all distributions.</p>
        <p>It can be seen from the table that the values given by the computer program correspond closely to the values predicted from the equations presented in this paper.</p>
        <pb facs="00079320_0006" />
        <table>
          <head>TABLE 1<lb />COMPARISON OF COMPUTER RESULTS WITH VALUES PREDICTED FROM EQUATIONS</head>
          <row role="label">
            <cell></cell>
            <cell>N=10, a=2</cell>
            <cell>N=10, a=5</cell>
            <cell>N=100, a=2</cell>
            <cell>N=100, a=5</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{oo}</formula>*</cell>
            <cell>.44</cell>
            <cell>.74</cell>
            <cell><gap reason="illegible" /></cell>
            <cell>.97</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{oo}</formula>**</cell>
            <cell>.46</cell>
            <cell>.77</cell>
            <cell>.89</cell>
            <cell>.96</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{oo}</formula>***</cell>
            <cell>.44</cell>
            <cell><unclear>.16</unclear></cell>
            <cell>.89</cell>
            <cell><gap reason="illegible" /></cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{oo}</formula>****</cell>
            <cell>.42</cell>
            <cell>.76</cell>
            <cell>.89</cell>
            <cell>.97</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{ee}</formula>†</cell>
            <cell>.46</cell>
            <cell>.17</cell>
            <cell>.89</cell>
            <cell>.65</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{ee}</formula>††</cell>
            <cell>.44</cell>
            <cell>.15</cell>
            <cell>.88</cell>
            <cell><gap reason="illegible" /></cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{ee}</formula>†††</cell>
            <cell>.44</cell>
            <cell><gap reason="illegible" /></cell>
            <cell>.89</cell>
            <cell>.66</cell>
          </row>
          <row>
            <cell role="label"><formula notation="TeX">r_{ee}</formula>††††</cell>
            <cell>.45</cell>
            <cell><unclear>223</unclear></cell>
            <cell>.89</cell>
            <cell>.65</cell>
          </row>
        </table>
        <p>* Value obtained from computer program<lb />** Value obtained by substituting computer data in equation [31]<lb />*** Value obtained by substituting computer data in equation [32]<lb />**** Value obtained by substituting computer data in equation [34]<lb />† Value obtained from computer program<lb />†† Value obtained by substituting computer data in equation [27]<lb />††† Value obtained by substituting computer data in equation [35]<lb />†††† Value obtained by substituting computer data in equation [38]</p>
        <p>REFERENCES</p>
        <p>1. Carroll, J. B., "The Effect of Difficulty and Chance Success on Correlations Between Items or Between Tests", Psychometrika, X (1945), pp. 1-22.</p>
        <p>2. Horst, P., "The Maximum Expected Correlation Between Two Multiple-Choice Tests", Psychometrika, XIX (1954), pp. 291-296.</p>
        <p>3. Lord, F. M., "Reliability of Multiple-Choice Tests as a Function of Number of Choices per Item", Journal of Educational Psychology, XXXV (1944), pp. 175-180.</p>
        <p>4. Plumlee, L. B., "The Effect of Difficulty and Chance Success on Item-Test Correlation and on Test Reliability", Psychometrika, XVII (1952), pp. 69-86.</p>
        <p>5. Roberts, A. O. H., "The Maximum Reliability of a Multiple-Choice Test", Psychologia Africana, IX (1962), pp. 286-293.</p>
        <p>6. Williams, R. H., and Zimmerman, D. W., "Some Conjectures Concerning the Index of Reliability and Related Quantities When True Scores and Error Scores on Mental Tests are Not Independent", The Journal of Experimental Education, XXXV, No. 2 (Winter 1966), pp. 76-79.</p>
        <p>7. Zimmerman, D. W., and Williams, R. H., "Effect of Chance Success Due to Guessing on Error of Measurement in Multiple-Choice Tests", Psychological Reports, XVI (1965), pp. 1193-1196.</p>
        <p>8. Zimmerman, D. W., and Williams, R. H., "Chance Success Due to Guessing and Non-independence of True Scores and Error Scores in Multiple-Choice Tests: Computer Trials with Prepared Distributions", Psychological Reports, XVII (1965), pp. 159-165.</p>
        <p>9. Zimmerman, D. W., and Williams, R. H., "Independence and Non-independence of True Scores and Error Scores in Mental Tests: Assumptions in the Definition of Parallel Forms", The Journal of Experimental Education, XXXV, No. 3 (Spring 1967), pp. 59-64.</p>
        <p>10. Zimmerman, D. W., Williams, R. H., and Rehm, H. H., "Test Reliability When Error Scores Consist of Independent and Non-independent Components", The Journal of Experimental Education, XXXV, No. 1 (Fall 1966), pp. <gap reason="illegible" />.</p>
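        <p>The procedure described under COMPUTER CHECKS can be sketched in a few lines. This is a minimal illustration and not the authors' program, whose details are not given here: the uniform true-score distribution between 30 and 70, the sample of 5000 cases, the seed, and all function names are assumptions chosen for the example. Error scores are drawn as the number of lucky guesses on the N - T items not known, two independent draws stand in for parallel forms, and the simulated correlations are set beside the values from [32] and [35].</p>
        <p>
```python
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / (sxx * syy) ** 0.5


def guessing_error(true_score, n_items, n_choices, rng):
    """Number of lucky guesses on the items the examinee does not know."""
    unknown = n_items - true_score
    return sum(rng.randrange(n_choices) == 0 for _ in range(unknown))


def simulate(true_scores, n_items, n_choices, rng):
    """Return simulated (r_oo, r_ee) for two parallel forms."""
    e1 = [guessing_error(t, n_items, n_choices, rng) for t in true_scores]
    e2 = [guessing_error(t, n_items, n_choices, rng) for t in true_scores]
    o1 = [t + e for t, e in zip(true_scores, e1)]
    o2 = [t + e for t, e in zip(true_scores, e2)]
    return pearson(o1, o2), pearson(e1, e2)


def predicted(true_scores, n_items, n_choices):
    """Return (r_oo, r_ee) from equations [32] and [35]."""
    k = len(true_scores)
    mean_t = sum(true_scores) / k
    var_t = sum((t - mean_t) ** 2 for t in true_scores) / k
    gap = n_items - mean_t
    r_oo = (n_choices - 1) * var_t / ((n_choices - 1) * var_t + gap)  # [32]
    r_ee = var_t / (var_t + (n_choices - 1) * gap)  # [35]
    return r_oo, r_ee


rng = random.Random(1967)  # arbitrary seed, for repeatability only
true_scores = [rng.randint(30, 70) for _ in range(5000)]  # hypothetical
sim_roo, sim_ree = simulate(true_scores, 100, 2, rng)
eq_roo, eq_ree = predicted(true_scores, 100, 2)
print("r_oo simulated", round(sim_roo, 2), "predicted", round(eq_roo, 2))
print("r_ee simulated", round(sim_ree, 2), "predicted", round(eq_ree, 2))
```
        </p>
        <p>For a = 2 equations [32] and [35] reduce to the same expression, so the two predicted values coincide. The simulated values differ from them only by sampling variability, which shrinks as the number of cases grows.</p>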
      </div>
    </body>
  </text>
</TEI>