Full metadata record

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | 윤명국 (Yoon, Myung Kuk) | - |
| dc.date.accessioned | 2024-08-30T16:31:12Z | - |
| dc.date.available | 2024-08-30T16:31:12Z | - |
| dc.date.issued | 2024 | - |
| dc.identifier.issn | 1943-0663 | - |
| dc.identifier.other | OAK-35582 | - |
| dc.identifier.uri | https://dspace.ewha.ac.kr/handle/2015.oak/269564 | - |
| dc.description.abstract | Recent GPUs provisioned with large register files (RFs) cannot fully utilize the bandwidth between the RFs and execution pipelines, as the current policy for allocating operand (OP) collectors defers the RF accesses until all the source OPs become ready. To tackle this issue, this letter introduces a new OP collector allocation mechanism called Triple-A. Triple-A comprises four key operations. First, Triple-A proactively allocates an OP collector (OC) to a warp instruction even if one of its source OPs is not yet ready, taking advantage of GPUs' in-order execution. Second, a computation result can be directly forwarded to an early allocated OC along with a data dependence, reducing OP loading time from the RFs. Third, Triple-A bypasses RF write operations if the forwarded data is not consumed by any other instruction. Finally, the early allocation is further enhanced with latency-aware optimization, alleviating the potential performance degradation caused by allocating OCs aggressively. Together, these techniques synergistically improve the register bank utilization, demonstrating a 14.1% improvement in performance and an 11.8% reduction in RF energy consumption compared to the state-of-the-art GPUs. © 2009-2012 IEEE. | - |
| dc.description.sponsorship | Institute of Electrical and Electronics Engineers Inc. | - |
| dc.language | English | - |
| dc.subject | Data forwarding | - |
| dc.subject | graphics processing units (GPUs) | - |
| dc.subject | operand collector (OC) | - |
| dc.subject | register files (RFs) | - |
| dc.title | Triple-A: Early Operand Collector Allocation for Maximizing GPU Register Bank Utilization | - |
| dc.type | Article | - |
| dc.relation.issue | 2 | - |
| dc.relation.volume | 16 | - |
| dc.relation.index | SCIE | - |
| dc.relation.index | SCOPUS | - |
| dc.relation.startpage | 206 | - |
| dc.relation.lastpage | 209 | - |
| dc.relation.journaltitle | IEEE Embedded Systems Letters | - |
| dc.identifier.doi | 10.1109/LES.2023.3307622 | - |
| dc.identifier.wosid | WOS:001236731600017 | - |
| dc.identifier.scopusid | 2-s2.0-85168744700 | - |
| dc.author.google | Jeong | - |
| dc.author.google | Ipoom | - |
| dc.author.google | Eunbi | - |
| dc.author.google | Kim | - |
| dc.author.google | Nam Sung | - |
| dc.author.google | Yoon | - |
| dc.author.google | Myung Kuk | - |
| dc.contributor.scopusid | 윤명국 (55646629400) | - |
| dc.date.modifydate | 20240830115056 | - |
Appears in Collections:
College of Artificial Intelligence > Department of Computer Engineering > Journal papers
Files in This Item:
There are no files associated with this item.