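Computer science
Classification
Representation (politics)
Relation (database)
Feature learning
Artificial intelligence
Robustness (evolution)
Generative grammar
Process (computing)
Encoder
Machine learning
Data mining
Biochemistry
Chemistry
Politics
Political science
Gene
Law
Operating system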
Authors
Xin Wang, Hong Chen, Wenwu Zhu
Identifier
DOI: 10.1145/3581783.3613859
Abstract
Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in observable data, in representation form. Separating the underlying factors of variation into variables with semantic meaning helps in learning explainable representations of data, mirroring the way humans form a meaningful understanding when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this tutorial, we comprehensively present DRL from various aspects, including motivations, definitions, methodologies, evaluations, applications, and model designs for multimedia. We discuss works on DRL based on two well-recognized definitions, i.e., the Intuitive Definition and the Group Theory Definition. We further categorize the methodologies for DRL into five groups, i.e., Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches, and Other Approaches. We also analyze principles for designing different DRL models that may benefit different tasks in practical multimedia applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigation. We believe this tutorial may provide insights for promoting DRL research in the multimedia community.