DSpace at EWHA: A Study on Relevance Weighted Retrieval and P-NORM Retrieval

Full metadata record

DC Field	Value	Language
dc.contributor.author	이효숙	-
dc.creator	이효숙	-
dc.date.accessioned	2016-08-26T12:08:15Z	-
dc.date.available	2016-08-26T12:08:15Z	-
dc.date.issued	1994	-
dc.identifier.other	OAK-000000071134	-
dc.identifier.uri	https://dspace.ewha.ac.kr/handle/2015.oak/190075	-
dc.identifier.uri	http://dcollection.ewha.ac.kr/jsp/common/DcLoOrgPer.jsp?sItemId=000000071134	-
dc.description.abstract	To examine the effectiveness of the Relevance Weighted Model and the P-Norm Model, the present study tested the following hypotheses experimentally. First, the Relevance Weighted Boolean Model and the P-Norm Model achieve higher precision than the Boolean Request Conversion Model, and their document output ranks correlate more positively with user judgements. Second, the Relevance Weighted Boolean Model achieves higher precision than the P-Norm Model, and its document output ranks correlate more positively with user judgements. Third, the Relevance Weighted Boolean Model using substantial relevance information retrospectively achieves higher precision than the same model using limited relevance information, and its document output ranks correlate more positively with user judgements. Fourth, the P-Norm Model using relevance weights achieves higher precision than the same model using inverse document frequency weights, and its document output ranks correlate more positively with user judgements. The experimental setting was as follows. The independent variable was the retrieval method, applied as the Relevance Weighted Boolean Model and as the P-Norm Model. The dependent variables, precision and ranking, were examined to evaluate the effectiveness of the models. The tests were carried out by searching INSPEC and BIST. The sixty-six search requests, in English and in Korean, were questions that researchers in the electrical and electronic engineering field asked to have answered. The researchers judged both the relevance of each retrieved article and the ranks of the output documents. The degree of relevance was classified as 'definitely relevant', 'probably relevant', or 'irrelevant'.
In conclusion, the findings of this study are as follows. 1. For hypothesis 1, the precision and the document output ranks of the Relevance Weighted Boolean Model and the P-Norm Model differ from those of the Boolean Request Conversion Model. However, the multiple range test does not show a significant difference between the retrieval models at the .05 level. 2. In support of hypothesis 2, there is a significant difference between the Relevance Weighted Boolean Model and the P-Norm Model for searches with the English requests. The probability is 0.0002. The ratio of rank correlation coefficients is 51.51 : 42.42. The Relevance Weighted Boolean Model is more effective than the P-Norm Model. 3. In support of hypothesis 3, there is a significant difference between the Relevance Weighted Boolean Model using substantial relevance information retrospectively and the same model using little relevance information for searches with the English requests. The probability is 0.0125 and the ratio of rank correlation coefficients is 51.51 : 45.45. The Relevance Weighted Boolean Model using substantial relevance information is more effective than the model using little relevance information. 4. In support of hypothesis 4, there is a significant difference between the P-Norm Model with relevance weights for the search terms and the same model with inverse document frequency weights when searches are done in English. The ratio of rank correlation coefficients is 42.42 : 30.30 and the probability is 0.0006. The P-Norm Model using relevance weights is more effective than the model using inverse document frequency weights.; This thesis aims to identify, through theoretical and experimental study, the best-performing retrieval model among the retrieval methods that have been studied separately as improvements to Boolean retrieval. The models studied are Boolean retrieval by request conversion, relevance weighted retrieval, and P-NORM retrieval; searches were run with each model and its retrieval performance was evaluated. In this study, retrieval performance means both the retrieval effectiveness of the system and its ability to reduce user effort by outputting retrieved documents in order of their similarity to the request; precision and retrieval rank were used as the performance measures. The retrieval experiments were carried out in the following experimental setting.
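The subject field of the experiment was electrical and electronic engineering, and searches were run against two databases: INSPEC, produced by the Institution of Electrical Engineers in the United Kingdom, and BIST, produced in Korea by the Korea Institute of Industry and Technology Information (산업기술정보원). The retrieval systems used were STAIRS for English documents and KIROS for Korean documents; for P-NORM retrieval, a system implemented for the experiment was run on the document collections downloaded through STAIRS and KIROS. The search requests were received from 33 users sampled from researchers in electrical and electronic engineering in Korea, and 66 search statements (33 in Korean, 33 in English) were formulated from them. The users who submitted the requests judged the relevance of each retrieved document as relevant, partially relevant, or irrelevant. They were also asked to rank the retrieved documents in the order in which they would want them retrieved. For the results of each retrieval model, the sample statistics were first tested for normality, and the significance of the precision differences was then analyzed with a priori and post hoc tests. Retrieval ranks were compared across the models by means of rank correlation tests. The results of the retrieval experiments are summarized as follows. 1. In the significance test on mean precision, Boolean retrieval by request conversion, relevance weighted retrieval, and P-NORM retrieval showed no difference at the 0.05 level for either the Korean or the English documents. However, in the comparison of rank correlation between the retrieval ranks and the users' judgements, relevance weighted retrieval was more effective than Boolean retrieval and P-NORM retrieval. 2. For English documents, relevance weighted retrieval achieved higher precision than P-NORM retrieval, and its retrieval ranks correlated more highly with the ranks assigned by the users. 3. In relevance weighted retrieval, comparing the use of complete relevance information with the use of a small amount of relevance information, the former achieved higher precision for English documents and its retrieval ranks correlated more highly with the ranks assigned by the users. 4. For English documents, P-NORM retrieval differed in precision and retrieval rank according to the weights assigned to the search terms: retrieval with relevance weights achieved higher precision than retrieval with inverse document frequency weights, and its retrieval ranks correlated more highly with the ranks assigned by the users.	-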
dc.description.tableofcontents	Abstract; Ⅰ. Introduction; 1.1. Problem Definition; 1.2. Purpose of the Study; 1.3. Hypotheses; 1.4. Method and Scope of the Study; 1.5. Previous Research; Ⅱ. Improving Boolean Retrieval; 2.1. Relevance Weighted Retrieval; 2.1.1. Search Terms and Weights; 2.1.2. Term Occurrence Characteristics; 2.1.3. Probability of Relevance; 2.2. P-NORM Retrieval; 2.2.1. Query and Document Vectors; 2.2.2. Similarity between Query and Document; 2.2.3. The P Parameter and the Extension of Logic; 2.2.4. Development of the Retrieval Model; Ⅲ. Experimental Setting; 3.1. Databases; 3.2. Retrieval Systems; 3.3. Users; 3.4. Requests; 3.5. Search Strategy; 3.6. Relevance Judgement; 3.7. Variables; Ⅳ. Retrieval Experiments; 4.1. Relevance Weighted Retrieval; 4.1.1. Relevance Judgement of Retrieved Documents; 4.1.2. Ranking of Retrieval Results; 4.2. P-NORM Retrieval; 4.2.1. Database Download; 4.2.2. Data Conversion; 4.2.3. System Design and Implementation; 4.2.4. Document Retrieval; Ⅴ. Analysis of the Experimental Results; 5.1. Testing Hypotheses 1 and 2; 5.2. Testing Hypotheses 3 and 4; 5.3. Summary of the Analysis; Ⅵ. Conclusion; References; Appendix; English Abstract	-
dc.format	application/pdf	-
dc.format.extent	6404461 bytes	-
dc.language	kor	-
dc.publisher	Graduate School of Ewha Womans University	-
dc.title	A Study on Relevance Weighted Retrieval and P-NORM Retrieval	-
dc.type	Doctoral Thesis	-
dc.title.subtitle	With a Focus on Improving Boolean Retrieval	-
dc.format.page	x, 174 p.	-
dc.identifier.thesisdegree	Doctor	-
dc.identifier.major	Graduate School, Department of Library Science	-
dc.date.awarded	1994. 2	-
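
Illustrative note on the P-Norm Model named in the abstract: the record itself does not reproduce the similarity formulas, so the sketch below uses the commonly cited P-norm definition for weighted OR and AND queries. It is an assumption that the thesis system follows this form; the function names `pnorm_or` and `pnorm_and` and the sample weights are hypothetical. Query term weights could be relevance weights or inverse document frequency weights, scaled to the range 0 to 1.

```python
import math


def pnorm_or(doc_weights, query_weights, p):
    """P-norm similarity of a document to an OR query.

    doc_weights[i]   -- weight of term i in the document, in [0, 1]
    query_weights[i] -- weight of term i in the query, in [0, 1]
    p                -- strictness parameter, p >= 1
    Assumes at least one query weight is non-zero.
    """
    num = math.fsum((a ** p) * (d ** p)
                    for a, d in zip(query_weights, doc_weights))
    den = math.fsum(a ** p for a in query_weights)
    return (num / den) ** (1.0 / p)


def pnorm_and(doc_weights, query_weights, p):
    """P-norm similarity of a document to an AND query (same conventions)."""
    num = math.fsum((a ** p) * ((1.0 - d) ** p)
                    for a, d in zip(query_weights, doc_weights))
    den = math.fsum(a ** p for a in query_weights)
    return 1.0 - (num / den) ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical document containing only the first of two query terms.
    doc = [1.0, 0.0]
    query = [1.0, 1.0]
    for p in (1, 2, 50):
        print(p, pnorm_or(doc, query, p), pnorm_and(doc, query, p))
```

With p = 1 the OR and AND scores coincide (a weighted average, as in a vector-style match); as p grows, OR moves toward the maximum term weight and AND toward the minimum, which approaches strict Boolean behaviour.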