DSpace at EWHA: A Scalable RISC-V Vector Processor for Edge AI Devices

Title: A Scalable RISC-V Vector Processor for Edge AI Devices

Authors: 전유진

Issue Date: 2024

Department/Major: Graduate School, Department of Electronic and Electrical Engineering

Publisher: The Graduate School of Ewha Womans University

Degree: Master

Advisors: 김지훈

Abstract: Edge AI refers to running AI services on the edge device itself, without transmitting the data it acquires to central servers or the cloud. This reduces network dependency, improves real-time responsiveness, and strengthens personal data security, which has led to high demand in many fields. Deep learning models are the typical workload on edge devices, so hardware acceleration is needed to handle repetitive operations on large volumes of data efficiently. A vector processor applies the same operation to multiple data elements concurrently. It is used in fields with high data-level parallelism, such as graphics processing, scientific simulation, and deep learning. For deep learning in particular, a vector processor is typically faster than processing data one element at a time on the CPU core of a microcontroller. It can also support the operations of many different models more flexibly than an accelerator optimized for one specific deep learning model, which makes it an effective means of acceleration for edge AI. To this end, this thesis organizes operations that are common across neural networks into neural network library functions written with the RISC-V instruction set, and proposes a vector processor optimized for executing these functions. A deep learning application can call these library functions to accelerate its computation on the proposed vector processor. The number of lanes and the vector register length are parameterizable to suit user requirements, and the scalability of these configurations has been confirmed. In addition, function processing time was minimized by maximizing the overlap of multiple vector instructions, resulting in a speed-up of 1.31x to 2.91x.
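The two parameters named in the abstract, the number of lanes and the vector register length, can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the thesis's design: the function names (`strip_mine`, `vector_add`, `toy_cycles`), the 128-bit register length, the 32-bit element width, and the rule that one vector instruction takes `ceil(vl / lanes)` cycles are all hypothetical and chosen only for illustration. The model ignores memory latency, instruction overlap, and every other microarchitectural detail.

```python
import math

def strip_mine(n, vlmax):
    """Yield (start, vl) chunks covering n elements, at most vlmax per chunk."""
    start = 0
    while start < n:
        vl = min(vlmax, n - start)
        yield start, vl
        start += vl

def vector_add(a, b, vlen_bits=128, sew_bits=32):
    """Element-wise add, processed one vector register's worth at a time.

    vlen_bits (register length) and sew_bits (element width) are assumed values.
    """
    vlmax = vlen_bits // sew_bits  # elements that fit in one vector register
    out = [0] * len(a)
    for start, vl in strip_mine(len(a), vlmax):
        # Each loop iteration stands in for one vector add instruction.
        chunk_a = a[start:start + vl]
        chunk_b = b[start:start + vl]
        out[start:start + vl] = [x + y for x, y in zip(chunk_a, chunk_b)]
    return out

def toy_cycles(n, vlen_bits, sew_bits, lanes):
    """Toy cycle estimate: each lane handles one element per cycle (assumption)."""
    vlmax = vlen_bits // sew_bits
    return sum(math.ceil(vl / lanes) for _, vl in strip_mine(n, vlmax))

if __name__ == "__main__":
    print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
    for lanes in (1, 2, 4):
        print(lanes, "lane(s):", toy_cycles(16, 128, 32, lanes), "toy cycles")
```

In this toy model, doubling the lane count halves the estimated cycles for a full register, which is the kind of scaling a parameterizable design would be expected to show. The numbers it prints are properties of the model only and say nothing about the measured results reported in the thesis.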