DSpace at EWHA: Discovery of Disease-Related Genes Using Literature Data Mining

Full metadata record

DC Field	Value	Language
dc.contributor.author	이현정	-
dc.creator	이현정	-
dc.date.accessioned	2016-08-26T10:08:29Z	-
dc.date.available	2016-08-26T10:08:29Z	-
dc.date.issued	2004	-
dc.identifier.other	OAK-000000034588	-
dc.identifier.uri	https://dspace.ewha.ac.kr/handle/2015.oak/201332	-
dc.identifier.uri	http://dcollection.ewha.ac.kr/jsp/common/DcLoOrgPer.jsp?sItemId=000000034588	-
dc.description.abstract	With the completion of the Human Genome Project, many diseases recorded in databases such as LocusLink and OMIM have already been mapped to regions of the human genome, yet their candidate genes remain unknown. To find these candidate genes, this work uses the controlled vocabularies annotated in MEDLINE, whose records are curated and classified by experts in each subject area and indexed with the dedicated search terms known as MeSH. It computes the degree of association between disease-related MeSH C terms and chemical terms, between chemical terms and protein function, and between pathological conditions and protein-function terms, and then uses these results to relate medical terms to protein function. As a result, a scoring system was developed that can determine, for a given disease in OMIM, which GO terms are strongly related and which RefSeq genes are likely to be involved. Using this system, a method was developed to pinpoint disease-related genes located at a specific chromosomal locus, and a website was built where the results for various diseases can be viewed.;Although many inherited diseases currently recorded in the relevant databases (LocusLink, OMIM) are already linked to a region of the human genome, many have no known associated gene. The public availability of the human genome draft sequence has fostered new strategies to map molecular functional features of gene products to complex phenotypic descriptions, such as those of genetically inherited diseases. Owing to recent progress in the systematic annotation of genes using controlled vocabularies, we have developed a scoring system for the possible functional relationships of human genes to genetically inherited diseases that have been mapped to chromosomal regions without assignment of a particular gene. To support and rationalize the manual association of known or inferred functional features of genes with phenotypic features, the first phase of the data-mining process combines information from MEDLINE and a protein sequence database to derive relationships between pathological conditions and terms describing protein function. We used a three-step procedure.
First, we computed the associations between pathological conditions and chemical terms using MEDLINE, a database of indexed journal citations and abstracts of the biomedical literature, which currently contains more than 12 million entries indexed with MeSH terms assigned by experts in each field. We consider the relationship between two terms strong if they occur together in many abstracts. Second, we calculated the relationships between chemical terms and terms describing protein function. We used the NCBI RefSeq database, which contains more than 15,000 genes whose function is annotated with terms from a controlled functional vocabulary. Third, we combined the associations of functional terms to chemical terms with the previously established associations of pathological conditions to chemical terms, to derive the aforementioned relations between pathological conditions and protein-function terms. As a result, a scoring system has been developed that finds the specific relationships between GO terms and an OMIM-based disease. The system was used to discover disease-related genes in a locus, and the results for various diseases are shown on the website (http://genome.ewha.ac.kr/∼hera).	-
dc.description.tableofcontents	Thesis Summary = ⅷ Ⅰ. Introduction = 1 Ⅰ-A. Background of Research on Disease-Gene Associations = 1 Ⅰ-B. Need for Research on Disease-Gene Associations = 2 Ⅰ-C. Significance of This Work = 3 Ⅰ-D. Data Analyzed in This Thesis = 4 Ⅰ-E. Approach Taken in This Thesis = 7 Ⅱ. Resources and Methods = 10 Ⅱ-A. Databases Used = 10 Ⅱ-B. Hardware and Software Used = 18 Ⅱ-C. Methods = 21 Ⅱ-C -1. Separating MeSH Headings and ChemicalList = 22 Ⅱ-C -2. Generating Pairs of MeSH C Terms and MeSH D Terms = 23 Ⅱ-C -3. Generating Pairs of MeSH D Terms and GO Terms = 23 Ⅱ-C -4. System Development Method = 24 Ⅱ-C -4-a. Development of R(C,D) and S(D,GO) = 26 Ⅱ-C -4-b. Development of T(C,GO) = 28 Ⅱ-C -4-c. Development of T_(2)^(w)(C,GO) = 29 Ⅱ-D. Homology Searches at Chromosomal Locations = 31 Ⅱ-F. Handling Large Databases = 32 Ⅲ-E. SCRIPTS. = 36 Ⅲ. Results = 54 Ⅲ-A. Homology Search Results = 54 Ⅲ-B. Implementation of the Search System = 63 Ⅳ. Conclusion and Discussion = 69 Ⅴ. References = 70 Abstracts = 72	-
dc.format	application/pdf	-
dc.format.extent	1521890 bytes	-
dc.language	kor	-
dc.publisher	The Graduate School, Ewha Womans University	-
dc.title	Discovery of Disease-Related Genes Using Literature Data Mining	-
dc.type	Master's Thesis	-
dc.format.page	viii, 73 p.	-
dc.identifier.thesisdegree	Master	-
dc.identifier.major	Graduate School, Division of Molecular Life Sciences	-
dc.date.awarded	2004. 8	-
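
The three-step procedure described in the abstract can be illustrated with a small sketch. This is not the thesis's own code: the record does not give the formulas for R(C,D), S(D,GO), or T(C,GO), so the sketch assumes a Jaccard-style co-occurrence score for each association and a max-min rule for combining them. The function names, the toy term names, and the idea that each gene record carries both chemical terms and function terms are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def cooccurrence_scores(records):
    """Score term pairs by how often they are annotated together.

    records: iterable of (left_terms, right_terms), one pair per annotated item
    (for example, one MEDLINE citation or one gene record).
    Assumed measure (not from the thesis): n_both / (n_left + n_right - n_both).
    """
    n_left = defaultdict(int)
    n_right = defaultdict(int)
    n_both = defaultdict(int)
    for left_terms, right_terms in records:
        left_set, right_set = set(left_terms), set(right_terms)
        for a in left_set:
            n_left[a] += 1
        for b in right_set:
            n_right[b] += 1
        for a, b in product(left_set, right_set):
            n_both[(a, b)] += 1
    return {
        (a, b): n / (n_left[a] + n_right[b] - n)
        for (a, b), n in n_both.items()
    }


def compose(disease_chem, chem_func):
    """Link diseases to function terms through shared chemical terms.

    Assumed rule (not from the thesis): for each disease/function pair, take the
    best chemical intermediary, scoring each path by its weaker link (max-min).
    """
    by_chem = defaultdict(list)
    for (chem, func), s in chem_func.items():
        by_chem[chem].append((func, s))
    result = defaultdict(float)
    for (disease, chem), r in disease_chem.items():
        for func, s in by_chem.get(chem, []):
            key = (disease, func)
            result[key] = max(result[key], min(r, s))
    return dict(result)


# Toy data with made-up term names.
# Step 1: disease terms and chemical terms annotated on the same citation.
abstracts = [
    ({"disease_A"}, {"chem_X", "chem_Y"}),
    ({"disease_A"}, {"chem_X"}),
    ({"disease_B"}, {"chem_Y"}),
]
# Step 2: chemical terms and function terms attached to the same gene record.
genes = [
    ({"chem_X"}, {"GO_kinase"}),
    ({"chem_Y"}, {"GO_transport"}),
    ({"chem_X"}, {"GO_kinase", "GO_transport"}),
]

disease_chem = cooccurrence_scores(abstracts)
chem_func = cooccurrence_scores(genes)
# Step 3: combine the two association tables.
disease_func = compose(disease_chem, chem_func)

for pair, score in sorted(disease_func.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Under these assumptions, ranking the function terms for a disease and then ranking the genes in a candidate locus by their annotated function terms would give a scoring system of the general kind the abstract describes.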