EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering

Authors
Lee, Soohyun; Seo, Chae Hwa; Alver, Burak Han; Lee, Sanghyuk; Park, Peter J.
Ewha Authors
Lee, Sanghyuk (이상혁)
Issue Date
2015
Journal Title
BMC BIOINFORMATICS
ISSN
1471-2105
Citation
BMC BIOINFORMATICS vol. 16
Keywords
Expression quantification; Isoforms; Multi-reads; Optimization; Suffix array
Publisher
BIOMED CENTRAL LTD
Indexed
SCIE; SCOPUS
Document Type
Article
Abstract
Background: RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost. Results: We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods. Conclusions: EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost.
DOI
10.1186/s12859-015-0704-z
Appears in Collections:
College of Natural Sciences > Life Sciences Major > Journal papers
Files in This Item:
001.pdf (6.11 MB)