Organizing an in-class hackathon to correct PDF-to-text conversion errors of Genomics & Informatics 1.0

Authors
Kim S.; Kim R.; Nam H.-J.; Kim R.-G.; Ko E.; Kim H.-S.; Shin J.; Cho D.; Jin Y.; Bae S.; Jo Y.W.; Jeong S.A.; Kim Y.; Ahn S.; Jang B.; Seong J.; Lee Y.; Seo S.E.; Kim H.-J.; Kim H.; Sung H.-L.; Lho H.; Koo J.; Chu J.; Lim J.; Lee K.; Lim Y.; Kim M.; Hwang S.; Han S.; Yoo S.; Seo Y.; Shin Y.; Ko Y.-J.; Baek J.; Hyun H.; Choi H.; Oh J.-H.; Kim D.-Y.; Park H.-S.
Ewha Authors
Hyun-Seok Park (박현석)
Issue Date
2020
Journal Title
Genomics and Informatics
ISSN
2234-0742
Citation
Genomics and Informatics vol. 18, no. 3, pp. 1 - 8
Keywords
Biomedical text mining; Corpus; Text analytics
Publisher
Korea Genome Organization
Indexed
SCOPUS
Document Type
Article
Abstract
This paper describes a community effort to improve earlier versions of the full-text corpus of Genomics & Informatics by semi-automatically detecting and correcting PDF-to-text conversion errors and optical character recognition errors during the first hackathon of the Genomics & Informatics Annotation Hackathon (GIAH) event. Extracting text from multi-column biomedical documents such as Genomics & Informatics is notoriously difficult. The hackathon was piloted as part of a coding competition of the ELTEC College of Engineering at Ewha Womans University to enable researchers and students to create or annotate their own versions of the Genomics & Informatics corpus, to gain and create knowledge about corpus linguistics, and at the same time to acquire tangible and transferable skills. The projects proposed during the hackathon harness an internal database containing different versions of the corpus and annotations. © 2020, Korea Genome Organization.
DOI
10.5808/GI.2020.18.3.e33
Appears in Collections:
College of Artificial Intelligence > Department of Computer Science and Engineering > Journal papers