DSpace at EWHA: A Comparison of the Accuracy of Examinee Ability Estimation by Computerized Adaptive Testing and Paper-and-Pencil Testing

Title: A Comparison of the Accuracy of Examinee Ability Estimation by Computerized Adaptive Testing and Paper-and-Pencil Testing

Authors: 임현정

Issue Date: 1999

Department/Major: Graduate School, Department of Education

Publisher: Ewha Womans University Graduate School

Degree: Master

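Abstract: This study was conducted to verify the applicability of computerized adaptive testing to a practice test for the GRE verbal section. Its purpose is therefore to verify the equivalence of the examinee ability estimates obtained from a computerized adaptive test (CAT) and a paper-and-pencil (PAP) test, and to demonstrate the efficiency of the CAT by comparing the accuracy of ability estimation. The CAT began with an item of medium difficulty, selected as each next item the one whose difficulty was closest to the ability estimated by the Rasch model, and ended when the examinee had responded to all 30 items or when the 30-minute time limit had elapsed. To rule out the effect of administration order, participants were randomly assigned to two groups: 34 took the PAP test first and then the CAT, and the remaining 32 took the CAT before the PAP test. There was no significant difference between the examinee abilities estimated by the CAT and the PAP test, and the correlation between the two test scores was very high at .8557. At equal test length, the CAT had higher test information than the PAP test and a similar level of internal-consistency reliability. The CAT also had relatively similar levels of test information across the whole ability range, and examinees who responded to only 80% of the CAT items had an average level of estimation error. The PAP test required twice as much testing time as the CAT, but the time allotted per item was actually shorter. The following conclusions can be drawn from these results. First, the examinee ability estimates from the CAT and the PAP test are equivalent, so the results of the two tests are interchangeable. Second, administering a CAT makes it possible to estimate ability more accurately with fewer items than a PAP test. Third, the CAT is a fair testing method that can guarantee the same level of accuracy to examinees at all ability levels, so it can improve the accuracy of ability estimation especially for examinees of very high or very low ability. Fourth, in a CAT, ability can be estimated relatively accurately even when examinees respond to only about 80% of the items. In addition, the CAT requires less testing time and fewer items than the PAP test, which makes test administration more convenient. This study thus shows that a CAT can estimate the same trait as a PAP test more accurately, or produce the same measurement results at about half the length of the PAP test. The CAT is therefore shown to be a desirable testing method with many advantages for measuring human ability. For this reason, converting the institution- and school-level tests currently administered in Korea to CATs, or developing new CATs, would be highly meaningful work as part of the effort to measure latent human ability more accurately.
This study examines the feasibility of using computerized adaptive testing to administer a practice examination of GRE verbal ability. The purpose of this study is to compare a computerized adaptive test (CAT) with a paper-and-pencil (PAP) test in terms of the comparability of examinee ability estimates, and to verify the efficiency of the CAT by comparing the accuracy of ability estimation.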
In the CAT, the first item is selected at random from items of medium difficulty, and each subsequent item is the one whose difficulty is nearest to the provisional ability estimate obtained from the one-parameter logistic model. The CAT terminates when an examinee has responded to all 30 items or when the 30-minute time limit has elapsed. To avoid order effects, an equivalent-groups design is used, with examinees randomly assigned to one of two groups: the 34 subjects in Group A take the PAP test in the first testing session, followed by the CAT, and the 32 subjects in Group B take the CAT followed by the PAP test. First, there is no significant difference between the CAT and PAP ability estimates; the correlation between them is .8557. Second, when both tests have the same length, that is, the same number of items, the test information of the CAT is higher overall than that of the PAP test. Third, test information in the CAT is fairly high over the ability range from -2.0 to +2.0, whereas in the PAP test it peaks around the average ability level. Fourth, even when examinees respond to only 80% of the items, the measurement error of the ability estimate is close to the average level. In addition, the PAP test requires twice as much time as the CAT, but less time per item. The results of this study are summarized as follows: 1) the ability estimates from the two tests are equivalent, so their results are interchangeable; 2) ability can be estimated more accurately with fewer items in the CAT; 3) because the CAT provides fairly high test information for examinees at all ability levels, measurement precision is especially improved for both low- and high-ability examinees; 4) the CAT requires less testing time and fewer items, which makes the test more convenient to administer.
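The adaptive procedure described above can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation. The function names (`rasch_prob`, `estimate_theta`, `run_cat`), the damped maximum-likelihood update clamped to [-4, 4], and the use of the median-difficulty item as the starting item (instead of a random draw among medium-difficulty items) are all assumptions made for the example, and the 30-minute time limit is omitted.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, theta=0.0, n_iter=20):
    """Maximum-likelihood ability estimate by damped Newton-Raphson.

    responses: list of (difficulty, score) pairs, score in {0, 1}.
    Steps are capped at +/-1 and theta is clamped to [-4, 4] so that
    all-correct or all-wrong patterns do not diverge (an assumption).
    """
    for _ in range(n_iter):
        probs = [rasch_prob(theta, b) for b, _ in responses]
        grad = sum(u - p for (_, u), p in zip(responses, probs))
        info = sum(p * (1.0 - p) for p in probs)
        if info < 1e-9:
            break
        step = max(-1.0, min(1.0, grad / info))
        theta = max(-4.0, min(4.0, theta + step))
    return theta

def run_cat(bank, answer, max_items=30):
    """Administer up to max_items items from bank (a list of difficulties).

    answer: callable mapping an item difficulty to a score of 0 or 1.
    Returns the final ability estimate and the (difficulty, score) history.
    """
    remaining = sorted(bank)
    theta = 0.0
    administered = []
    # Start with the median-difficulty item (simplification, see lead-in).
    item = remaining.pop(len(remaining) // 2)
    while True:
        administered.append((item, answer(item)))
        theta = estimate_theta(administered, theta)
        if len(administered) >= max_items or not remaining:
            break
        # Next item: difficulty nearest to the provisional ability estimate.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
    return theta, administered

if __name__ == "__main__":
    # Hypothetical 61-item bank and a deterministic examinee who answers
    # correctly whenever the item difficulty is at most 1.0.
    bank = [i / 10 for i in range(-30, 31)]
    theta, history = run_cat(bank, lambda b: 1 if b <= 1.0 else 0)
    print(f"estimated ability: {theta:.2f} after {len(history)} items")
```

Because each new item is chosen near the current estimate, the administered difficulties concentrate around the examinee's ability, which is the mechanism behind the higher test information reported for the CAT.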
The results demonstrate that the CAT measures the same trait as the PAP test with equal or greater precision, or with a test only half as long as its PAP counterpart. This verifies that the CAT is a proper and efficient method for estimating human ability. Replacing many of the tests administered by institutions with CATs would allow latent traits to be estimated more accurately.