Nonparametric methods for cognitive diagnosis to multiple-choice test items
GUO Lei1,2, ZHOU Wenjie1
1Faculty of Psychology, Southwest University, Chongqing 400715, China
2Southwest University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Chongqing 400715, China
Received: 2020-11-02
Publication date: 2021-09-25
Release date: 2021-07-22
Contact: GUO Lei, E-mail: happygl1229@swu.edu.cn
Funding: National Natural Science Foundation of China, Youth Program (31900793); Major Achievement Cultivation Project of the Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University (2019-06-023-BZPK01); Fundamental Research Funds for the Central Universities (SWU2109222)
Abstract
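Fully exploiting the diagnostic information in multiple-choice (MC) items has attracted considerable attention, and taking distractor information into account can improve diagnostic accuracy. Parametric models need large samples to yield reliable estimates. To address this limitation and to suit small-sample diagnostic testing at the classroom level, this study proposes nonparametric diagnostic methods for MC items. The simulation and empirical results show that: (1) when the item parameters in an MC test do not differ greatly,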
Figures/Tables 7
Table 1. Fraction subtraction example of option coding
| | Attribute | | |
---|---|---|---|---|
| | S1 | S2 | S3 |
A | | √ | | |
B | | √ | √ | |
C | | √ | √ | |
D | | √ | √ | √ |
Table 2. Q-matrix of the MC items with coded distractors
Attribute | Item | |||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | |
A1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 3 | 1 | 2 | 2 | 0 | 0 | 0 | 0 |
A2 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 0 | 2 | 2 | 1 | 0 | 0 | 0 | 2 | 2 | 2 | 0 |
A3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 2 | 2 | 0 | 2 | 2 | 0 | 2 |
A4 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 2 | 0 | 1 | 0 | 2 | 0 | 2 | 0 | 2 | 2 | 0 | 2 | 2 |
A5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 2 | 2 | 0 | 0 | 1 | 0 | 2 | 2 | 0 | 2 | 2 | 2 |
Table 3. Pattern-wise and attribute-wise correct classification rates of the two classes of diagnostic methods (true model: MC1)
Item quality | Number of items | Sample size | PCCR | | | | AACCR | | | |
---|---|---|---|---|---|---|---|---|---|---|
| | | | | MC1 | MC2 | | | MC1 | MC2 |
High | 10 | 30 | 0.784 | 0.710 | 0.763 | 0.703 | 0.918 | 0.884 | 0.906 | 0.896 |
High | 10 | 50 | 0.783 | 0.701 | 0.749 | 0.690 | 0.916 | 0.883 | 0.900 | 0.889 |
High | 10 | 100 | 0.789 | 0.703 | 0.757 | 0.704 | 0.922 | 0.888 | 0.902 | 0.896 |
High | 20 | 30 | 0.911 | 0.893 | 0.896 | 0.888 | 0.968 | 0.962 | 0.930 | 0.928 |
High | 20 | 50 | 0.911 | 0.895 | 0.879 | 0.863 | 0.976 | 0.962 | 0.918 | 0.970 |
High | 20 | 100 | 0.912 | 0.895 | 0.905 | 0.896 | 0.973 | 0.963 | 0.921 | 0.968 |
High | 30 | 30 | 0.957 | 0.947 | 0.979 | 0.964 | 0.987 | 0.984 | 0.992 | 0.991 |
High | 30 | 50 | 0.951 | 0.934 | 0.973 | 0.966 | 0.986 | 0.980 | 0.992 | 0.989 |
High | 30 | 100 | 0.954 | 0.940 | 0.976 | 0.970 | 0.986 | 0.982 | 0.993 | 0.983 |
Low | 10 | 30 | 0.575 | 0.495 | 0.498 | 0.450 | 0.843 | 0.798 | 0.814 | 0.799 |
Low | 10 | 50 | 0.588 | 0.501 | 0.505 | 0.428 | 0.843 | 0.801 | 0.820 | 0.788 |
Low | 10 | 100 | 0.590 | 0.501 | 0.518 | 0.420 | 0.849 | 0.806 | 0.828 | 0.784 |
Low | 20 | 30 | 0.802 | 0.768 | 0.742 | 0.655 | 0.933 | 0.919 | 0.917 | 0.888 |
Low | 20 | 50 | 0.798 | 0.762 | 0.742 | 0.651 | 0.935 | 0.921 | 0.919 | 0.889 |
Low | 20 | 100 | 0.793 | 0.760 | 0.752 | 0.671 | 0.930 | 0.917 | 0.922 | 0.892 |
Low | 30 | 30 | 0.865 | 0.849 | 0.820 | 0.757 | 0.964 | 0.959 | 0.952 | 0.935 |
Low | 30 | 50 | 0.868 | 0.845 | 0.837 | 0.777 | 0.965 | 0.957 | 0.957 | 0.940 |
Low | 30 | 100 | 0.874 | 0.853 | 0.848 | 0.801 | 0.967 | 0.959 | 0.961 | 0.947 |
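The two accuracy indices reported in these tables can be illustrated with a minimal sketch. It assumes the usual simulation-study definitions: PCCR is the share of examinees whose whole attribute pattern is recovered, and AACCR is the share of individual attribute classifications that are correct. The function names and the toy patterns are hypothetical and are not taken from the article.

```python
def pccr(true_patterns, est_patterns):
    """Share of examinees whose entire attribute pattern is classified correctly."""
    hits = sum(1 for t, e in zip(true_patterns, est_patterns) if tuple(t) == tuple(e))
    return hits / len(true_patterns)


def aaccr(true_patterns, est_patterns):
    """Share of single attribute classifications that are correct, pooled over examinees."""
    total = sum(len(t) for t in true_patterns)
    hits = sum(
        int(a == b)
        for t, e in zip(true_patterns, est_patterns)
        for a, b in zip(t, e)
    )
    return hits / total


# Toy data (hypothetical): 4 examinees, 3 attributes each.
true_patterns = [(1, 0, 1), (0, 0, 0), (1, 1, 1), (0, 1, 0)]
est_patterns = [(1, 0, 1), (0, 1, 0), (1, 1, 1), (0, 1, 1)]

print(pccr(true_patterns, est_patterns))   # 2 of 4 patterns match exactly
print(aaccr(true_patterns, est_patterns))  # 10 of 12 attribute entries match
```

Because a pattern counts as correct only when every attribute matches, PCCR can never exceed AACCR, which is the ordering seen in every row of the tables.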
Table 4. Pattern-wise and attribute-wise correct classification rates of the two classes of diagnostic methods (true model: MC2)
Item quality | Number of items | Sample size | PCCR | | | | AACCR | | | |
---|---|---|---|---|---|---|---|---|---|---|
| | | | | MC1 | MC2 | | | MC1 | MC2 |
High | 10 | 30 | 0.772 | 0.700 | 0.746 | 0.697 | 0.915 | 0.884 | 0.904 | 0.896 |
High | 10 | 50 | 0.781 | 0.700 | 0.747 | 0.701 | 0.917 | 0.880 | 0.900 | 0.893 |
High | 10 | 100 | 0.788 | 0.705 | 0.753 | 0.705 | 0.921 | 0.889 | 0.903 | 0.897 |
High | 20 | 30 | 0.907 | 0.888 | 0.887 | 0.888 | 0.966 | 0.961 | 0.935 | 0.967 |
High | 20 | 50 | 0.909 | 0.892 | 0.884 | 0.905 | 0.965 | 0.959 | 0.923 | 0.972 |
High | 20 | 100 | 0.911 | 0.896 | 0.886 | 0.916 | 0.967 | 0.961 | 0.923 | 0.971 |
High | 30 | 30 | 0.953 | 0.938 | 0.960 | 0.976 | 0.985 | 0.980 | 0.991 | 0.991 |
High | 30 | 50 | 0.949 | 0.938 | 0.966 | 0.973 | 0.985 | 0.981 | 0.989 | 0.992 |
High | 30 | 100 | 0.952 | 0.936 | 0.972 | 0.973 | 0.986 | 0.981 | 0.987 | 0.993 |
Low | 10 | 30 | 0.566 | 0.501 | 0.490 | 0.424 | 0.835 | 0.798 | 0.807 | 0.787 |
Low | 10 | 50 | 0.580 | 0.493 | 0.497 | 0.424 | 0.841 | 0.797 | 0.815 | 0.786 |
Low | 10 | 100 | 0.593 | 0.501 | 0.516 | 0.422 | 0.847 | 0.803 | 0.823 | 0.786 |
Low | 20 | 30 | 0.787 | 0.752 | 0.723 | 0.642 | 0.931 | 0.917 | 0.915 | 0.886 |
Low | 20 | 50 | 0.793 | 0.761 | 0.744 | 0.656 | 0.930 | 0.917 | 0.917 | 0.889 |
Low | 20 | 100 | 0.792 | 0.762 | 0.754 | 0.666 | 0.931 | 0.918 | 0.921 | 0.892 |
Low | 30 | 30 | 0.872 | 0.849 | 0.830 | 0.759 | 0.964 | 0.957 | 0.954 | 0.935 |
Low | 30 | 50 | 0.873 | 0.846 | 0.844 | 0.777 | 0.965 | 0.956 | 0.959 | 0.940 |
Low | 30 | 100 | 0.873 | 0.848 | 0.849 | 0.797 | 0.965 | 0.956 | 0.959 | 0.945 |
Table 5. Pattern-wise and attribute-wise correct classification rates of each method when item quality varies widely
True model | Number of items | Sample size | PCCR | | | | | AACCR | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | MC1 | MC2 | | | | MC1 | MC2 |
MC1 | 10 | 30 | 0.631 | 0.547 | 0.669 | 0.596 | 0.523 | 0.865 | 0.820 | 0.877 | 0.858 | 0.835 |
MC1 | 10 | 50 | 0.644 | 0.549 | 0.675 | 0.605 | 0.518 | 0.866 | 0.825 | 0.877 | 0.856 | 0.822 |
MC1 | 10 | 100 | 0.645 | 0.543 | 0.678 | 0.623 | 0.523 | 0.869 | 0.825 | 0.880 | 0.866 | 0.826 |
MC1 | 20 | 30 | 0.839 | 0.812 | 0.888 | 0.857 | 0.796 | 0.945 | 0.935 | 0.964 | 0.958 | 0.937 |
MC1 | 20 | 50 | 0.840 | 0.817 | 0.882 | 0.859 | 0.800 | 0.948 | 0.939 | 0.964 | 0.960 | 0.938 |
MC1 | 20 | 100 | 0.844 | 0.819 | 0.894 | 0.877 | 0.829 | 0.947 | 0.937 | 0.967 | 0.964 | 0.946 |
MC1 | 30 | 30 | 0.904 | 0.878 | 0.938 | 0.930 | 0.906 | 0.975 | 0.968 | 0.986 | 0.984 | 0.978 |
MC1 | 30 | 50 | 0.904 | 0.883 | 0.943 | 0.933 | 0.916 | 0.974 | 0.968 | 0.987 | 0.984 | 0.981 |
MC1 | 30 | 100 | 0.908 | 0.891 | 0.942 | 0.939 | 0.925 | 0.976 | 0.970 | 0.986 | 0.986 | 0.983 |
MC2 | 10 | 30 | 0.623 | 0.546 | 0.647 | 0.578 | 0.512 | 0.866 | 0.820 | 0.868 | 0.847 | 0.825 |
MC2 | 10 | 50 | 0.638 | 0.548 | 0.672 | 0.601 | 0.521 | 0.866 | 0.824 | 0.876 | 0.858 | 0.827 |
MC2 | 10 | 100 | 0.643 | 0.548 | 0.676 | 0.621 | 0.519 | 0.870 | 0.824 | 0.879 | 0.865 | 0.825 |
MC2 | 20 | 30 | 0.834 | 0.803 | 0.886 | 0.853 | 0.801 | 0.944 | 0.933 | 0.967 | 0.957 | 0.939 |
MC2 | 20 | 50 | 0.836 | 0.808 | 0.897 | 0.862 | 0.817 | 0.942 | 0.931 | 0.969 | 0.959 | 0.944 |
MC2 | 20 | 100 | 0.838 | 0.808 | 0.892 | 0.868 | 0.828 | 0.944 | 0.932 | 0.966 | 0.960 | 0.948 |
MC2 | 30 | 30 | 0.905 | 0.879 | 0.942 | 0.925 | 0.900 | 0.973 | 0.966 | 0.986 | 0.982 | 0.976 |
MC2 | 30 | 50 | 0.906 | 0.884 | 0.942 | 0.928 | 0.909 | 0.974 | 0.968 | 0.986 | 0.984 | 0.979 |
MC2 | 30 | 100 | 0.905 | 0.884 | 0.937 | 0.933 | 0.924 | 0.974 | 0.968 | 0.985 | 0.984 | 0.982 |
Table 6. Q-matrix, including distractor information, of the College English Advanced English Reading Test
| Item 1 | | | | | | Item 2 | | | | | | Item 3 | | | | | | Item 4 | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
B | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
D | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| Item 5 | | | | | | Item 6 | | | | | | Item 7 | | | | | | Item 8 | | | | | |
A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
B | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
D | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Item 9 | | | | | | Item 10 | | | | | | Item 11 | | | | | | Item 12 | | | | | |
A | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
B | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
D | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Item 13 | | | | | | Item 14 | | | | | | Item 15 | | | | | | | | | | | |
A | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | | | | | | |
B | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | | | | | | |
C | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | | | | | | |
D | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | | | | | | |
Table 7. Degree of classification agreement among the models
Index | Average attribute classification agreement (AAR) | | | | Pattern classification agreement 1 (PAR(K = 6)) | | | | Pattern classification agreement 2 (PAR(K ≥ 5)) | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | MC1 | MC2 | | | MC1 | MC2 | | | MC1 | MC2 |
| 1 | | | | 1 | | | | 1 | | | |
| 0.88 | 1 | | | 0.61 | 1 | | | 0.92 | 1 | | |
MC1 | 0.85 | 0.86 | 1 | | 0.55 | 0.59 | 1 | | 0.88 | 0.89 | 1 | |
MC2 | 0.84 | 0.85 | 0.92 | 1 | 0.51 | 0.57 | 0.71 | 1 | 0.87 | 0.88 | 0.94 | 1 |
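A small sketch of how agreement indices of this kind can be computed for two methods applied to the same examinees. It assumes, from the labels alone, that AAR is the share of attribute classifications on which the two methods agree, that PAR(K = 6) counts examinees whose six attributes all agree, and that PAR(K ≥ 5) counts examinees with at least five agreeing attributes. The function names and toy patterns are hypothetical.

```python
def agreement_counts(patterns_a, patterns_b):
    """Number of attributes classified identically by the two methods, per examinee."""
    return [sum(int(x == y) for x, y in zip(a, b)) for a, b in zip(patterns_a, patterns_b)]


def aar(patterns_a, patterns_b):
    """Average attribute-level agreement rate between two methods."""
    total = sum(len(a) for a in patterns_a)
    return sum(agreement_counts(patterns_a, patterns_b)) / total


def par(patterns_a, patterns_b, min_agree):
    """Share of examinees on whom at least `min_agree` attributes agree."""
    counts = agreement_counts(patterns_a, patterns_b)
    return sum(1 for c in counts if c >= min_agree) / len(counts)


# Toy classifications (hypothetical): 4 examinees, 6 attributes each.
method_a = [(1, 1, 0, 0, 1, 0), (0, 0, 0, 0, 0, 0), (1, 1, 1, 1, 1, 1), (1, 0, 1, 0, 1, 0)]
method_b = [(1, 1, 0, 0, 1, 0), (0, 0, 0, 0, 0, 1), (1, 1, 1, 0, 0, 1), (1, 0, 1, 0, 1, 0)]

print(aar(method_a, method_b))     # 21 of 24 attribute entries agree
print(par(method_a, method_b, 6))  # full-pattern agreement
print(par(method_a, method_b, 5))  # agreement on at least 5 of 6 attributes
```

Under these assumed definitions, relaxing the threshold from 6 to 5 can only raise the index, which matches the pattern in the table, where every PAR(K ≥ 5) entry is at least as large as the corresponding PAR(K = 6) entry.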
Full-text PDF download link:
http://journal.psych.ac.cn/xlxb/CN/article/downloadArticleFile.do?attachType=PDF&id=5044