DSpace at EWHA: Bimodal semantic fusion prototypical network for few-shot classification

Full metadata record

DC Field	Value	Language
dc.contributor.author	최선한	-
dc.date.accessioned	2024-05-10T16:30:41Z	-
dc.date.available	2024-05-10T16:30:41Z	-
dc.date.issued	2024	-
dc.identifier.issn	1566-2535	-
dc.identifier.other	OAK-35290	-
dc.identifier.uri	https://dspace.ewha.ac.kr/handle/2015.oak/268072	-
dc.description.abstract	Few-shot classification learns from a small number of image samples to recognize unseen images. Recent few-shot learning exploits auxiliary text information, such as class labels and names, to obtain more discriminative class prototypes. However, most existing approaches rarely consider using text information as a clue to highlight important feature regions and do not consider feature alignment between prototypes and targets, leading to prototype ambiguity owing to information gaps. To address this issue, a prototype generator module was developed to perform interactions between the text knowledge of the class name and visual feature maps in the spatial and channel dimensions. This module learns how to assign mixture weights to essential regions of each sample feature to obtain informative prototypes. In addition, a feature refinement module was proposed to embed text information into query images without knowing their labels. It generates attention from concatenated features between query and text features through pairwise distance loss. To improve the alignment between the prototype and relevant targets, a prototype calibration module was designed to preserve the important features of the prototype by considering the interrelationships between the prototype and query features. Extensive experiments were conducted on five few-shot classification benchmarks, and the results demonstrated the superiority of the proposed method over state-of-the-art methods in 1-shot and 5-shot settings. © 2024 Elsevier B.V.	-
dc.language	English	-
dc.publisher	Elsevier B.V.	-
dc.subject	Feature aggregation	-
dc.subject	Few-shot classification	-
dc.subject	Local and global attention	-
dc.subject	Multi-source information fusion	-
dc.title	Bimodal semantic fusion prototypical network for few-shot classification	-
dc.type	Article	-
dc.relation.volume	109	-
dc.relation.index	SCIE	-
dc.relation.index	SCOPUS	-
dc.relation.journaltitle	Information Fusion	-
dc.identifier.doi	10.1016/j.inffus.2024.102421	-
dc.identifier.scopusid	2-s2.0-85190576361	-
dc.author.google	Huang, Xilang	-
dc.author.google	Choi, Seon Han	-
dc.contributor.scopusid	최선한(57199723590)	-
dc.date.modifydate	20240510140050	-