DSpace at EWHA: Comparison of discrimination methods for the classification of acute leukemia using gene expression data

Full metadata record

DC Field	Value	Language
dc.contributor.author	유미나	-
dc.creator	유미나	-
dc.date.accessioned	2016-08-26T10:08:28Z	-
dc.date.available	2016-08-26T10:08:28Z	-
dc.date.issued	2003	-
dc.identifier.other	OAK-000000033957	-
dc.identifier.uri	https://dspace.ewha.ac.kr/handle/2015.oak/200984	-
dc.identifier.uri	http://dcollection.ewha.ac.kr/jsp/common/DcLoOrgPer.jsp?sItemId=000000033957	-
dc.description.abstract	Microarray experiments generate large datasets with expression values for thousands of genes but no more than a few dozen samples, and classification in such high-dimensional problems is difficult. However, the ability to distinguish successfully between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This paper compares the performance of discrimination methods for the classification of acute leukemia based on gene expression data. The methods include linear discriminant analysis, nearest neighbor classifiers, classification trees, and recent machine learning approaches. Gene selection and performance assessment methods are also considered.;A microarray measures the expression of thousands of genes simultaneously, and by statistically detecting expression patterns that differ by cancer type, such data can contribute to cancer classification. Classifying cancers can help identify their causes and can lead to more effective treatment through prescriptions tailored to the cancer type. Classification in general may no longer be a major problem in statistics. However, in microarray data, where the classification result bears on human life and thousands of variables are measured simultaneously while the number of observed samples is small, building an accurate classification rule is important and not easy. This thesis compares the performance of discriminant analysis methods using acute leukemia microarray data. The methods compared are linear discriminant analysis, nearest neighbor classifiers, classification trees, and recent machine learning techniques. Variable selection and performance comparison methods, which are important in discriminant analysis with microarray data, are also considered.	-
dc.description.tableofcontents	TABLE OF CONTENTS = i ABSTRACT = v I. INTRODUCTION = 1 II. METHODS = 4 2.1 Discrimination Methods = 4 a) Linear and quadratic discriminant analysis = 5 b) Logistic, Loglinear regression = 6 c) Nearest neighbor classifiers = 8 d) Classification tree = 9 e) Bagging = 10 f) AdaBoost = 11 g) LogitBoost = 12 2.2 Feature Selection = 14 2.3 Performance assessment = 17 a) Leave-One-Out Cross-Validation = 17 b) .632+ Leave-One-Out Bootstrap = 18 III. STUDY DESIGN = 21 3.1 Dataset = 21 3.2 Standardization = 21 3.3 Gene selection = 22 3.4 Performance Assessment = 23 IV. RESULTS = 24 4.1 LDA, Diagonal discriminant analysis = 25 4.2 Logistic and Loglinear regression = 26 4.3 Nearest neighbors = 27 4.4 Classification Tree and Aggregating predictor = 27 4.5 Gene selection = 28 V. DISCUSSION = 30 References = 40 Abstract in Korean = 42	-
dc.format	application/pdf	-
dc.format.extent	1767057 bytes	-
dc.language	eng	-
dc.publisher	The Graduate School, Ewha Womans University	-
dc.title	Comparison of discrimination methods for the classification of acute leukemia using gene expression data	-
dc.type	Master's Thesis	-
dc.format.page	v, 44 p.	-
dc.identifier.thesisdegree	Master	-
dc.identifier.major	Department of Statistics, Graduate School	-
dc.date.awarded	2004. 2	-