Authors
Gang Chen, Shuaiyong Xiao, Chenghong Zhang, Huimin Zhao
Abstract
Multimodal data are proliferating and increasingly fuel data-driven business decision making, as exemplified by short video attractiveness prediction (SVAP), multimodal review sentiment classification (MRSC), and multimodal data-based default risk prediction (DRP). However, when data of various modalities (e.g., text, graph, image, and video) are used jointly, they may interact with one another in ways that adversely affect prediction performance. To unravel and resolve these opaque conflicts in multimodal data, we formally conceptualize multimodal interactions and provide analytical insights for mitigating negative interactions at the feature, modality, and modality-wise instance levels. To better realize the predictive power of multimodal data, we propose a novel deep learning strategy named NIRMD (negative interaction-regularized multimodal deep learning), which encourages positive multimodal interactions and mitigates negative ones in a learnable nonlinear representation space. Empirical evaluation in three case studies, involving SVAP, MRSC, and DRP, respectively, shows that the prediction performance of state-of-the-art multimodal deep learning methods can be enhanced by incorporating NIRMD. Exploratory analyses (ablation, feature contribution, and case analyses) provide evidence of NIRMD’s effectiveness in mitigating negative multimodal interactions.
History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.
Funding: G. Chen was supported by the National Natural Science Foundation of China [Grants 72522010, 72301239, and 72394371]. S. Xiao was supported by the National Natural Science Foundation of China [Grants 72301194, 72495133, and 72472058]. C. Zhang was supported by the National Natural Science Foundation of China [Grants 72271059 and 72571071].
Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0794) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2024.0794). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.