DSpace at EWHA: Exploration of applicability of polytomous cognitive diagnosis model based on Item diagnosticity


Title: 문항 진단성에 근거한 다분 인지진단모형의 적용 가능성 탐색

Other Titles: Exploration of applicability of polytomous cognitive diagnosis model based on Item diagnosticity

Authors: 구슬기

Issue Date: 2020

Department/Major: Graduate School, Department of Education

Publisher: Graduate School of Ewha Womans University

Degree: Doctor

Advisors: 성태제

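Abstract: At a time when the need for learner-centered education that lets each student make the most of their competencies is growing, student assessment should collect as much varied and detailed information about learners as possible, so that it not only supports accurate judgments about student competence but also helps that competence develop. To this end, recent student assessments use constructed-response, short-essay, and essay items to evaluate higher-order abilities such as problem solving from multiple angles. Accordingly, the scope of cognitive diagnosis models (CDMs), which can extract cognitive information such as students' mastery of cognitive skills, also needs to be broadened. In line with these educational changes and demands, the field of educational measurement and evaluation is extending CDMs, which had focused on dichotomous item responses, so that they can also be applied to polytomous item responses.
When data are analyzed with a measurement model, securing the fit of the model to the collected data is a criterion that must be met and a step that must be taken in order to obtain valid and reliable results. In this sense, item diagnosticity, which refers to how well an item fits the attributes it measures, is an important psychometric requirement to consider when using a CDM. Previous studies have confirmed that item diagnosticity is an important condition affecting parameter estimation when CDMs are applied to dichotomous item responses, so it is necessary to verify whether the same holds for polytomous items. This study therefore examines the effect of item diagnosticity on model fit and on the accuracy of parameter estimation, in order to explore how model fit and estimation accuracy can be secured when a polytomous CDM is applied to constructed-response data. Specifically, to explore the applicability and usefulness of the sequential G-DINA (sequential generalized DINA; hereafter SG-DINA) model, a polytomous CDM, for data consisting of polytomous item responses, the model fit and parameter estimation accuracy of the SG-DINA model were compared with those of the G-DINA model, a dichotomous CDM. In particular, the study sought to identify how the level of item diagnosticity, which can be interpreted as an indicator of test quality, affects the fit and estimation accuracy of each model, and thereby to identify and present psychometric information for estimating parameters stably without losing the item information contained in polytomous data.
To this end, the number of items (20, 40), the number of examinees (400, 800), item diagnosticity (high, medium, low, random), the form of the step-level Q-matrix (restricted, unrestricted), and the dichotomization strategy for polytomous data (partial credit, full credit) were varied, and simulated data for a total of 48 conditions were generated with 100 replications each. To compare the polytomous and dichotomous CDMs on the simulated data, model fit (SRMSR), item parameter recovery (RMSE), and examinee parameter recovery (PCV) were computed. The simulation results were as follows. First, the number of items, the number of examinees, and item diagnosticity affected the model fit, item parameter recovery, and examinee parameter recovery of both the SG-DINA and G-DINA models. In general, model fit and estimation accuracy were higher in both models when there were more items and examinees and when item diagnosticity was higher. Second, the SG-DINA model showed better model fit and more accurate estimation than the G-DINA model. Within the SG-DINA model, the RS-GDINA model, which uses a restricted step-level Q-matrix, generally showed better model fit and estimation accuracy than the US-GDINA model. Third, model fit in each condition and the accuracy of item and examinee parameter estimates were generally higher when item diagnosticity was high.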
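The abstract names two strategies for dichotomizing polytomous scores before fitting the G-DINA model (partial credit, full credit) but does not define them. The following is a minimal Python sketch, assuming the common reading in which the partial-credit strategy scores any nonzero step score as correct and the full-credit strategy scores only the maximum step score as correct. The function name `dichotomize`, its arguments, and the toy data are hypothetical and are not taken from the thesis.

```python
from typing import List


def dichotomize(scores: List[List[int]], max_scores: List[int],
                strategy: str) -> List[List[int]]:
    """Recode polytomous item scores to 0/1 (examinees in rows, items in columns).

    ASSUMPTION (not confirmed by the abstract):
      "partial" -> any score above 0 counts as correct
      "full"    -> only the item's maximum score counts as correct
    """
    if strategy not in ("partial", "full"):
        raise ValueError("strategy must be 'partial' or 'full'")
    recoded = []
    for row in scores:
        if len(row) != len(max_scores):
            raise ValueError("each row needs exactly one score per item")
        if strategy == "partial":
            recoded.append([1 if s > 0 else 0 for s in row])
        else:
            recoded.append([1 if s == m else 0
                            for s, m in zip(row, max_scores)])
    return recoded


# Toy data: two examinees, three items, each item scored 0 to 2.
raw = [[0, 1, 2], [2, 0, 1]]
partial = dichotomize(raw, [2, 2, 2], "partial")
full = dichotomize(raw, [2, 2, 2], "full")
```

Under these assumptions, a score of 1 out of 2 is kept as correct by the partial-credit recoding and treated as incorrect by the full-credit recoding. Either way the distinction between step scores is collapsed, which illustrates the kind of information loss at issue when polytomous data are analyzed with a dichotomous model.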
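To check, on the basis of the simulation comparison, whether using a polytomous rather than a dichotomous CDM on real polytomous response data prevents the loss of information in the data and yields accurate and stable parameter estimates, students' responses to 10 TIMSS 2015 grade 4 mathematics items were analyzed. The analysis examined what results the SG-DINA model actually produces and what detailed information those results provide compared with the results of the dichotomous G-DINA model. The results were as follows. First, the attribute mastery proportions estimated by the SG-DINA and G-DINA models were generally similar, but differed relatively widely for the 'understanding' attribute. Second, comparing the correct-response probabilities by attribute mastery pattern provided by the two models, the SG-DINA model separately provides the probability of obtaining each step score according to mastery of the attributes targeted by the assessment framework, whereas the G-DINA model pools all steps and provides a single probability of a correct response by attribute mastery, and so could not provide step-level information. Third, when the diagnosticity of the TIMSS 2015 grade 4 mathematics items was analyzed with the two models, the SG-DINA model yielded item diagnosticity for each step without losing information built into the item design, whereas the G-DINA model computed diagnosticity at the item level by pooling the steps and so could not provide the information carried by individual item steps.
The implications of these results are as follows. First, to improve the fit and estimation accuracy of a polytomous CDM, a sufficient number of items and examinees should be secured and the level of item diagnosticity should be considered. Second, to prevent loss of information in the response data, it is psychometrically preferable to apply a polytomous CDM to polytomous item response data. However, when a dichotomous CDM is used to analyze polytomous item responses for research purposes, an appropriate dichotomization strategy should be chosen in light of the characteristics of the data, and item diagnosticity should also be used to check model fit for polytomous item response data. Third, the process of specifying the relationships between items and attributes in the Q-matrix should be carried out carefully over several stages and refined continually. Fourth, because the SG-DINA model can use a step-level Q-matrix, it can be used in actual educational settings for tasks such as validating scoring criteria and scoring by attribute.
Based on these results, the following suggestions are made for future research. First, simulation studies of the fit of polytomous CDMs under more varied conditions are needed. For reasons of similarity to real data and ease of interpretation, this study ran the simulation with three factors (number of items, number of examinees, item diagnosticity), but various other factors such as item content, examinee ability level, the type of CDM, and the number of attributes measured may affect model fit and parameter estimation, so their effects on model fit and estimation accuracy need to be explored. Second, comparative studies among polytomous CDMs are needed. This study selected the SG-DINA model, one type of polytomous CDM, and explored the suitability of polytomous CDMs for polytomous item response data by comparing it with the dichotomous G-DINA model; future studies should compare which of the various polytomous CDMs fits better and estimates parameters more stably depending on the form of the polytomous response data. In addition, to improve the applicability of the SG-DINA model, application studies with a variety of real data should accumulate, and ways should be sought to analyze the varied polytomous data generated in educational settings with this model and to use the results to improve student learning and teachers' instruction.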
Third, studies that condition item diagnosticity at more varied levels than those specified in this study and examine its effect on model fit and parameter estimation should accumulate, so that an absolute criterion for item diagnosticity can be proposed. Fourth, studies applying polytomous CDMs to psychological test data, which mainly use Likert scales, are needed. Not only achievement tests such as the one used in this study but also many psychological tests produce polytomous Likert-scale response data, so studies using polytomous CDMs to analyze students' cognitive skills need to accumulate.
Recently, the need for learner-centered education has increased so that individual students can maximize their capabilities. Student assessments should help develop competencies as well as provide accurate judgments about them by gathering as much diverse and detailed information about learners as possible. Accordingly, the scope of application of the cognitive diagnostic model (CDM), which can extract cognitive information such as students' mastery of cognitive skills, needs to be expanded. In line with these recent educational changes and demands, it is necessary to extend the cognitive diagnostic model, which has focused on students' responses to dichotomous items, to polytomous item responses. When data are analyzed with a measurement model, ensuring the model's fit to the collected data is a prerequisite and a necessary step for obtaining valid and reliable results. In this sense, item diagnosticity, the degree to which an item conforms to its attributes, is an important criterion to consider when using a cognitive diagnostic model. For polytomous CDMs, too, it is necessary to confirm that item diagnosticity is an important condition affecting parameter estimation. The purpose of this study is to investigate the effects of item diagnosticity on model fit and on the accuracy of parameter estimation.
In particular, the study determines the effect of the level of item diagnosticity on the fit of each model and on the accuracy of parameter estimation. It was confirmed that the diagnostic model should be selected. To obtain these results, the number of items (20/40), the number of examinees (400/800), item diagnosticity (high, medium, low, random), Q-matrix type (restricted Q-matrix / unrestricted Q-matrix), and recoding method (partial credit, full credit) were conditioned, and simulated data for a total of 48 conditions were generated 100 times. The polytomous and dichotomous cognitive diagnosis models were compared on the simulated data in terms of model fit (SRMSR), item parameter recovery (RMSE), and person parameter recovery (PCV). The results of the simulation study showed that, first, the number of items, the number of examinees, and item diagnosticity affected the fit of the SG-DINA and G-DINA models, item parameter recovery, and person parameter recovery. The larger the number of items and examinees and the higher the item diagnosticity, the better the model fit and the more accurate the parameter estimation. Second, the SG-DINA model showed better model fit and more accurate estimation than the G-DINA model. Third, when item diagnosticity was high, the model fit in each condition was better and the item and examinee parameter estimates were more accurate. In this study, I analyzed students' responses to 10 items from TIMSS 2015 grade 4 mathematics. The analysis examined what results the SG-DINA model actually produced and what detailed information those results provided compared with the results of the dichotomous G-DINA model.
As a result, first, the SG-DINA model yields item diagnosticity step by step without losing information built into the item design, whereas the G-DINA model computes item diagnosticity at the item level by pooling the steps and therefore did not provide the information carried by the individual steps. Second, when the attribute mastery proportions estimated by the SG-DINA and G-DINA models were calculated, the results of the two models differed for mastery of the 'understanding' attribute. Third, the SG-DINA model separately provides the probability of obtaining each step score according to mastery of the attributes to be measured in the assessment framework, whereas the G-DINA model pools all steps to provide the probability of a correct answer according to attribute mastery and therefore did not provide step-by-step information. The implications of these results are as follows. First, in order to improve model fit and the accuracy of parameter estimation, it is necessary to secure a sufficient number of items and examinees and to consider the level of item diagnosticity. Second, in order to prevent the loss of information in response data, it is desirable to apply a polytomous cognitive diagnostic model to polytomous item response data. When a dichotomous cognitive diagnosis model must nevertheless be used, an appropriate dichotomization strategy should be selected in consideration of the characteristics of the data. Third, the process of identifying the relationship between items and attributes in the Q-matrix needs to be carried out carefully in several stages and requires continuous refinement. Fourth, in order to apply the SG-DINA model, report its results to students, and interpret them properly, the rubric must be clearly specified from the test design stage. Based on the results of this study, the suggestions for future research are as follows.
First, simulation studies under more varied conditions are needed regarding the fit of polytomous cognitive diagnostic models. Second, comparative studies among polytomous cognitive diagnostic models are needed. In addition, in order to enhance the applicability of the SG-DINA model, studies applying it to various real data should be accumulated, and the various polytomous data generated in educational settings should be analyzed with this model and used to improve student learning and teacher instruction. Third, studies that vary item diagnosticity over more levels than those specified in this study and examine its effect on model fit and parameter estimation should be accumulated, so that an absolute criterion for item diagnosticity can be presented. Fourth, studies applying polytomous cognitive diagnostic models to psychological test data, which mainly use Likert scales, are needed.
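Two of the recovery measures named in the abstract can be made concrete with a short sketch. It assumes the usual definitions: RMSE as the root mean squared difference between generating and estimated item parameters, and PCV as the proportion of examinees whose whole attribute profile is classified correctly. SRMSR is not covered. The function names and toy values are illustrative only and do not come from the thesis.

```python
import math
from typing import Sequence, Tuple


def rmse(true_params: Sequence[float], est_params: Sequence[float]) -> float:
    """Root mean squared error between generating and estimated parameters."""
    if len(true_params) != len(est_params) or len(true_params) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - e) ** 2 for t, e in zip(true_params, est_params)]
    return math.sqrt(sum(squared) / len(squared))


def pcv(true_profiles: Sequence[Tuple[int, ...]],
        est_profiles: Sequence[Tuple[int, ...]]) -> float:
    """Share of examinees whose entire attribute vector is recovered.

    ASSUMPTION: PCV is read as pattern-wise agreement, so a profile
    counts as a hit only if every attribute matches.
    """
    if len(true_profiles) != len(est_profiles) or len(true_profiles) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, e in zip(true_profiles, est_profiles)
               if tuple(t) == tuple(e))
    return hits / len(true_profiles)


# Toy values, not results from the study.
item_rmse = rmse([0.2, 0.8], [0.3, 0.7])
profile_pcv = pcv([(1, 0), (1, 1), (0, 0), (0, 1)],
                  [(1, 0), (1, 0), (0, 0), (0, 1)])
```

With these definitions a lower RMSE and a higher PCV both indicate better recovery, so the two measures point in opposite directions when read as "accuracy".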