The Effectiveness of Circular Equating as a Criterion for Evaluating Equating


ACT Research Report Series 98-6

Tianyou Wang, Bradley A. Hanson, Deborah J. Harris

October 1998

For additional copies write: ACT Research Report Series, PO Box 168, Iowa City, Iowa 52243-0168

© 1998 by ACT, Inc. All rights reserved.

Abstract

Equating a test form to itself through a chain of equatings, commonly referred to as circular equating, has been widely used as a criterion to evaluate the adequacy of equating. This paper uses both analytical and simulation methods to show that this criterion is in general invalid for this purpose. For the random groups design conducted in the same year, it is shown analytically that circular equating will always result in the identity function (i.e., the perfect result), even in the presence of random and systematic equating errors. For the random groups design conducted in different years, a heuristic argument is provided that circular equating will generally deviate from the identity function by some random sampling error. A simulation study for this design also showed that the expected values of the circular equating may deviate slightly from the identity function, but those deviations do not reflect the systematic error (bias) embedded in the equating. For the common-item nonequivalent groups design, a simulation study showed that circular equating again cannot reflect the systematic error in equating. More effective ways of assessing random and systematic equating errors are recommended.

Acknowledgments

The authors wish to thank Michael Kolen, Jill Crouse, and Ronald Cope for their helpful comments on earlier drafts of this report.
In test equating, there has been a lack of definitive and practically feasible criteria for evaluating the adequacy of equating. Harris and Crouse (1993) gave a thorough review and discussion of the criteria available in the literature. One of the criteria they reviewed is the circular equating paradigm. Circular equating involves equating a test form to itself through a chain of equatings. To illustrate with three test forms, X, Y, and Z: form X is equated to form Y, which is equated to form Z, which is equated back to form X. It is presumed that if the equating is without error, the circular equating should yield the identity function, and that if the equating functions in the chain contain much error, the circular equating will not yield the identity function and the result should reflect the error accumulated in the chain.

Based on this reasoning, the circular equating criterion has commonly been used to evaluate equating methods (e.g., Cope, 1987) or scale drift in IRT equating (e.g., Petersen, Cook, & Stocking, 1983). Brennan and Kolen (1987a, b) and Angoff (1987) discussed this criterion. Angoff (1987) held a more positive view of its usefulness. Brennan and Kolen (1987a, b), however, urged caution in using it. They pointed out that not equating at all will result in identity equating under this paradigm. They also demonstrated that equating methods with fewer parameters tend to achieve better results, and that starting the chain from a different form may affect the results.

Despite these concerns, this paradigm and some variations of it continue to be used as a criterion in both research and practice (e.g., Klein & Jarjoura, 1985; McKinley & Schaeffer, 1989; Gafni & Melamed, 1990; Harris, Welch, & Wang, 1994). The applications of the criterion have not produced clear results about its usefulness.
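The X, Y, Z chain described above can be illustrated with a minimal sketch. It assumes linear (mean and standard deviation) equating and assumes that each form's single sample feeds both links that involve it; the text here does not fix an equating method, and the sample sizes, score distributions, the 3-point distortion, and all function names are hypothetical choices made only for illustration. Under those assumptions the slopes and intercepts cancel around the circle, so the circular result is the identity even though one link has been deliberately distorted.

```python
import random
import statistics


def linear_equate(from_scores, to_scores):
    """Return a function mapping the 'from' scale onto the 'to' scale
    by matching sample means and standard deviations (linear equating)."""
    mu_f, sd_f = statistics.mean(from_scores), statistics.pstdev(from_scores)
    mu_t, sd_t = statistics.mean(to_scores), statistics.pstdev(to_scores)
    return lambda score: sd_t / sd_f * (score - mu_f) + mu_t


random.seed(0)
# Hypothetical random-groups samples for forms X, Y, and Z (illustrative only).
x = [random.gauss(20, 5) for _ in range(200)]
y = [random.gauss(22, 6) for _ in range(200)]
z = [random.gauss(19, 4) for _ in range(200)]

# Deliberately distort the Y sample to mimic a systematic error (assumption).
y_distorted = [s + 3.0 for s in y]

# Each form's sample is reused in both links that involve that form.
x_to_y = linear_equate(x, y_distorted)
y_to_z = linear_equate(y_distorted, z)
z_to_x = linear_equate(z, x)


def circular(score):
    """Equate X -> Y -> Z -> X."""
    return z_to_x(y_to_z(x_to_y(score)))


# Largest departure of the circular equating from the identity function.
max_dev = max(abs(circular(s) - s) for s in range(0, 41))

# Error actually present in the X -> Y link, relative to the undistorted sample.
link_error = x_to_y(20.0) - linear_equate(x, y)(20.0)

print(f"max |circular(x) - x| = {max_dev:.2e}")
print(f"error in the X -> Y link at x = 20: {link_error:.2f}")
```

In this sketch `max_dev` stays at floating-point rounding level while `link_error` equals the injected 3-point shift, so the circular check reports a perfect result for a chain that contains a known error.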
Some authors have expressed doubts about the validity of this criterion (e.g., Gafni & Melamed, 1990). There has not been a substantive study of the validity and usefulness of this widely used criterion. The objective of this paper is to address this need. More specifically, we focus on the types of equating error for which this criterion can or cannot provide an accurate measure when applied to different equating methods and equating designs.

Kolen and Brennan (1995, pp. 210-211) summarized two major types of equating error: random error and systematic error. There is one source of random error: equating is performed on samples randomly drawn from the population of examinees rather than on the population itself. There are two major sources of systematic error. One source is the equating method used, including violations of the assumptions associated with the equating method and the estimation bias related to that method. A second source is the collection of data in the equating study, including whether the samples are randomly drawn from the population that actual