Online calibration based on computerized adaptive testing: Design and method
ZHANG Xueqin, MAO Xiuzhen, LI Jia
Institute of Educational Science, Sichuan Normal University, Chengdu 610066, China
Received: 2020-04-19
Online: 2020-11-15
Published: 2020-09-23
Contact: MAO Xiuzhen, E-mail: maomao_wanli@163.com
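Abstract: Item replenishment is an important means of building and maintaining an item bank, and calibrating the parameters of new items is a key part of item replenishment. Online calibration design and online calibration methods, which address how new items are administered and how their parameters are estimated, respectively, are the core techniques for item replenishment in computerized adaptive testing (CAT). This review clarifies how online calibration designs and online calibration methods have developed, and describes and evaluates their characteristics, relationships, and performance. Future work should further study online calibration designs based on other information indices, explore online calibration methods that build on joint estimation and error correction, strengthen research on online calibration techniques for cognitive diagnostic CAT and multidimensional CAT, and carry out more empirical studies of item replenishment methods.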
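Figures/Tables: 2
Table 1. Existing online calibration designs in CAT
Classification criterion | Method | Feature |
---|---|---|
Item perspective: parameter information | D-optimal, sequential D-optimal | Adaptively selects examinees |
Item perspective: parameter information | D-TP, D-VR, ED, and D-c methods | Adaptively selects items |
Examinee perspective: ability and sample size | OIRPI and SI indices | Adaptively selects items |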
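To make the "adaptively selects examinees" idea in Table 1 concrete, the following is a minimal sketch of one sequential D-optimal step under the two-parameter logistic (2PL) model. It is an illustration under stated assumptions, not the procedure of any particular cited study: the function names (`info_2pl`, `det_sym`, `select_examinee`) are hypothetical, the new item's provisional parameters `a` and `b` are assumed to be available, and examinee abilities are assumed known.

```python
import math


def info_2pl(theta, a, b):
    """Fisher information about (a, b) from one response at ability theta.

    Returned as (I_aa, I_ab, I_bb), the entries of a symmetric 2x2 matrix,
    for the 2PL model P = 1 / (1 + exp(-a * (theta - b))).
    """
    d = theta - b
    p = 1.0 / (1.0 + math.exp(-a * d))
    w = p * (1.0 - p)
    return (w * d * d, -w * a * d, w * a * a)


def det_sym(m):
    """Determinant of a symmetric 2x2 matrix stored as (m11, m12, m22)."""
    return m[0] * m[2] - m[1] * m[1]


def select_examinee(candidates, accumulated, a, b):
    """Pick the candidate ability that maximizes det(accumulated + info).

    candidates:  abilities of examinees currently available (assumed known).
    accumulated: information already collected for the new item.
    a, b:        provisional parameter estimates of the new item.
    """
    def gain(theta):
        i = info_2pl(theta, a, b)
        return det_sym((accumulated[0] + i[0],
                        accumulated[1] + i[1],
                        accumulated[2] + i[2]))
    return max(candidates, key=gain)
```

In this sketch the determinant criterion favors examinees whose abilities complement the information already collected, so a second examinee at the same ability as the first adds nothing to the determinant.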
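Table 2. Online calibration methods for item parameters in CAT
Classification criterion | Method | Feature | Applicable setting |
---|---|---|---|
Conditional maximum likelihood estimation | Method A, Method B, FFMLE-A, and ECSE-A | Simple and easy to implement; requires large samples | Traditional CAT/MCAT |
Conditional maximum likelihood estimation | MLE-LBCI-A | Simple and easy to implement; requires large samples | Traditional CAT |
Conditional maximum likelihood estimation | CD-Method A, MLE | Simple and easy to implement; requires large samples | CD-CAT |
MMLE/EM algorithm | OEM, MEM | Computationally complex, time-consuming, and prone to non-convergence | Dichotomous and polytomous items in traditional CAT/MCAT |
MMLE/EM algorithm | CD-OEM, CD-MEM, MMLE | Computationally complex, time-consuming, and prone to non-convergence | CD-CAT |
Bayesian algorithms | Bayesian versions of Method A, OEM, and MEM | High precision; computationally complex and time-consuming | Traditional CAT/MCAT |
Joint maximum likelihood estimation | JEA, SIE, SimIE, SIE-R, JEA-R, SIE-R-BIC, JEA-R-BIC RMSEA-N | Jointly estimates the Q-matrix and item parameters | CD-CAT |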
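As an illustration of the conditional maximum likelihood row in Table 2, the sketch below shows the general idea behind Method A for a single new 2PL item: the ability estimates obtained from the operational CAT items are treated as fixed, error-free values, and only the new item's parameters are estimated. This is a simplified sketch, not the implementation used in any cited study. The function names (`sigmoid`, `calibrate_method_a`, `simulate_responses`), the starting values, and the choice of Newton's method in slope-intercept form are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def calibrate_method_a(thetas, responses, max_iter=100, tol=1e-10):
    """Conditional MLE of 2PL parameters (a, b) for one new item.

    thetas:    ability estimates from the operational CAT items, treated as
               fixed, error-free values (the core simplification of Method A).
    responses: 0/1 answers of the same examinees to the new item.
    The fit uses the slope-intercept form logit P = a * theta + c with
    Newton's method, then converts the intercept to difficulty b = -c / a.
    """
    a, c = 1.0, 0.0  # starting values (an arbitrary choice for this sketch)
    for _ in range(max_iter):
        g_a = g_c = h_aa = h_ac = h_cc = 0.0
        for theta, u in zip(thetas, responses):
            p = sigmoid(a * theta + c)
            w = p * (1.0 - p)
            g_a += (u - p) * theta      # gradient of the log-likelihood
            g_c += (u - p)
            h_aa += w * theta * theta   # information matrix entries
            h_ac += w * theta
            h_cc += w
        det = h_aa * h_cc - h_ac * h_ac
        if det <= 0.0:
            raise ValueError("information matrix is singular")
        # Newton step: solve the 2x2 system (information) * step = gradient
        step_a = (h_cc * g_a - h_ac * g_c) / det
        step_c = (h_aa * g_c - h_ac * g_a) / det
        a += step_a
        c += step_c
        if max(abs(step_a), abs(step_c)) < tol:
            break
    return a, -c / a


def simulate_responses(thetas, a, b, rng):
    """Draw 0/1 responses to a 2PL item (used only to try the sketch)."""
    return [1 if rng.random() < sigmoid(a * (t - b)) else 0 for t in thetas]
```

A quick way to try it is to draw abilities with `random.Random`, generate responses with `simulate_responses`, and compare the output of `calibrate_method_a` with the generating parameters. Because real CAT ability estimates carry measurement error that this approach ignores, recovery in practice is expected to be less accurate than in such an idealized check.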
Full-text PDF download:
http://journal.psych.ac.cn/xlkxjz/CN/article/downloadArticleFile.do?attachType=PDF&id=5230