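Kernel (algebra)
Computer science
Estimator
Artificial intelligence
Computer vision
Gaze
Generative model
Mathematics
Generative grammar
Pattern recognition (psychology)
Statistics
Combinatorics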
Authors
Xuanhong Chen,Muchun Chen,Yugang Chen,Yinxin Lin,Bilian Ke,Bingbing Ni
Identifier
DOI:10.1109/tip.2025.3529379
Abstract
Efficient and highly accurate lightweight gaze estimation methods have been receiving increasing research attention due to the emergence of mobile interactive platforms such as mobile devices and AR/VR. State-of-the-art deep-learning-based gaze estimation models suffer from either a heavy computational architecture, which is infeasible for mobile deployment, or limited generalization capability, which cannot handle the large diversity in eye texture or distinguish subtle/frequent pupil movement. To mitigate these challenges, we propose a novel lightweight network structure featuring a deformable approximate large kernel, which effectively extends the receptive field to handle complicated eye movement and highly varying eye/gaze region appearance within a very tight computational budget. Meanwhile, we embed the training of the gaze estimator into a control information extraction module, which serves as a gaze-parameter input that modularizes a large generative model (Stable Diffusion V1.5) to output gaze-specific eye images. In this way, the strong generalization capability of the large generative model can be implicitly distilled into our lightweight gaze model. Extensive comparisons with various state-of-the-art gaze estimation methods demonstrate the superiority of our proposed model and training scheme in terms of both accuracy and model complexity.
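The abstract does not spell out how the approximate large kernel is constructed. As a rough illustration of the general idea only, the sketch below assumes one common approximation (not confirmed to be the authors' design): a small depthwise convolution followed by a dilated depthwise convolution, both with stride 1. The layer sizes and the helper names `receptive_field` and `depthwise_weights_per_channel` are hypothetical, and the deformable component is not modeled.

```python
def receptive_field(layers):
    """Receptive field (in pixels) of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs, applied in order.
    Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation
    return rf


def depthwise_weights_per_channel(layers):
    """Weights per channel if every layer is a depthwise (per-channel) conv."""
    return sum(kernel * kernel for kernel, _ in layers)


# Assumed stack: 5x5 depthwise conv, then 7x7 depthwise conv with dilation 3.
stack = [(5, 1), (7, 3)]

rf = receptive_field(stack)                       # 1 + 4*1 + 6*3
small = depthwise_weights_per_channel(stack)      # 5*5 + 7*7
dense = depthwise_weights_per_channel([(rf, 1)])  # one dense rf x rf kernel

print(f"receptive field: {rf}x{rf}")
print(f"weights per channel: {small} (stacked) vs {dense} (single dense kernel)")
```

Under these assumptions the stacked pair covers the same spatial extent as a single dense kernel of that size while using far fewer weights per channel, which is the kind of trade-off a tight computational budget calls for.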