Multi-Aspect Incremental Tensor Decomposition Based on Distributed In-Memory Big Data Systems

Authors
Yang H.-K.; Yong H.-S.
Ewha Authors
Hwan-Seung Yong (용환승)
SCOPUS Author ID
Hwan-Seung Yong (용환승)
Issue Date
2020
Journal Title
Journal of Data and Information Science
ISSN
2096-157X
Citation
Journal of Data and Information Science, vol. 5, no. 2, pp. 13-32
Keywords
Apache Spark; Big data; Incremental tensor decomposition; PARAFAC; Tensor decomposition
Publisher
Sciendo
Indexed
SCOPUS
Document Type
Article
Abstract
We propose InParTen2, a multi-aspect parallel factor analysis (PARAFAC) decomposition algorithm for three-dimensional tensors, based on the Apache Spark framework. The proposed method reduces re-decomposition cost and can handle large tensors. Because tensor addition can increase the size of a given tensor along all axes, the proposed method decomposes incoming tensors using existing decomposition results without generating sub-tensors. Additionally, InParTen2 avoids the calculation of Khatri-Rao products and minimizes shuffling on the Apache Spark platform. The performance of InParTen2 is evaluated by comparing its execution time and accuracy with those of existing distributed tensor decomposition methods on various datasets. The results confirm that InParTen2 can process large tensors and reduce the re-calculation cost of tensor decomposition. Consequently, the proposed method is faster than existing tensor decomposition algorithms and can significantly reduce re-decomposition cost. There are several Hadoop-based distributed tensor decomposition algorithms as well as MATLAB-based decomposition methods. However, the former require longer iteration times, so their execution time cannot be compared with that of Spark-based algorithms, whereas the latter run on a single machine, which limits their ability to handle large data. When tensors are added to a given tensor, the proposed algorithm decomposes them based on existing decomposition results instead of re-decomposing the entire tensor. The proposed method can handle large tensors and is fast within the limited-memory framework of Apache Spark. Moreover, InParTen2 can handle static as well as incremental tensor decomposition. © 2020 Hye-Kyung Yang et al., published by Sciendo.
DOI
10.2478/jdis-2020-0010
Appears in Collections:
College of Artificial Intelligence > Department of Computer Science and Engineering > Journal papers