Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]; Date: 2018-03-15; Volume (issue): pages: 20 (10): 2814-2823; Citations: 55
Identifier
DOI:10.1109/tmm.2018.2815784
Abstract
Optical flow estimation plays an important role in many multimedia and computer vision tasks. Although recent work has made great progress in applying convolutional neural networks (CNNs) to optical flow estimation, it is still difficult for CNNs to generate optical flow with the desired effectiveness and efficiency. Compared to CNN-based methods, conventional variational methods typically optimize an energy function and produce optical flow with more precise details. Inspired by the effectiveness of both variational methods and deep CNNs, we propose a recurrent spatial pyramid (RecSPy) network for optical flow estimation. To deal with large displacements and to decrease the number of parameters, we formulate the spatial pyramid as a recurrent process and adopt a CNN to refine the optical flow at each spatial scale. Furthermore, to recover more precise details, we propose an energy function that encodes structure and constancy constraints to help refine the optical flow at each spatial scale. The combination of the proposed RecSPy network and the proposed energy-based refinement enables our system to estimate optical flow effectively and efficiently. Experimental results on benchmarks validate the effectiveness and efficiency of the proposed method.
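
To make the idea of a spatial pyramid formulated as a recurrent process more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes grayscale images whose sides are divisible by 2^(levels-1), 2x2 average pooling for downsampling, and nearest-neighbour upsampling. The names `downsample`, `upsample_flow`, `refine_step`, and `recurrent_pyramid_flow` are hypothetical. `refine_step` is a placeholder for the shared CNN, and the energy-based refinement is not modeled because the abstract does not give its form.

```python
import numpy as np


def downsample(img):
    """Halve resolution with 2x2 average pooling (assumes even height and width)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, *img.shape[2:]).mean(axis=(1, 3))


def upsample_flow(flow):
    """Double resolution by nearest-neighbour repetition.

    Displacements are multiplied by 2 because one pixel at the coarser
    scale spans two pixels at the finer scale.
    """
    return 2.0 * flow.repeat(2, axis=0).repeat(2, axis=1)


def refine_step(img1, img2, flow):
    """Hypothetical stand-in for the shared CNN: returns a residual flow update.

    This placeholder predicts no correction; a trained network would go here.
    """
    return np.zeros_like(flow)


def recurrent_pyramid_flow(img1, img2, levels=3, refine=refine_step):
    """Coarse-to-fine flow estimation that reuses one refinement function per scale."""
    # Build image pyramids; index 0 is full resolution, the last entry is coarsest.
    pyr1, pyr2 = [img1], [img2]
    for _ in range(levels - 1):
        pyr1.append(downsample(pyr1[-1]))
        pyr2.append(downsample(pyr2[-1]))

    # Start from zero flow at the coarsest scale.
    h, w = pyr1[-1].shape
    flow = np.zeros((h, w, 2))

    # The same `refine` is applied at every scale, which is what makes the
    # pyramid recurrent and keeps the parameter count independent of `levels`.
    for level in reversed(range(levels)):
        if level != levels - 1:
            flow = upsample_flow(flow)
        flow = flow + refine(pyr1[level], pyr2[level], flow)
    return flow
```

Because a single refinement function is reused across scales, adding pyramid levels extends the range of displacements that can be handled without adding parameters, which matches the motivation stated in the abstract.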