Multimodal transformer augmented fusion for speech emotion recognition
Speech emotion recognition is challenging due to the subjectivity and ambiguity of emotion. In recent years, multimodal methods for speech emotion recognition have achieved promising results. However, due to the heterogeneity of data from different modalities, effectively integrating different modal information remains a difficulty and br