Based on the maximum a posteriori (MAP) estimation framework and recent advances in deep learning, this article reports on a novel deep MAP-based video denoising method named MAP-VDNet, which combines adaptive temporal fusion with a deep image prior.
Unlike image denoising, which is a mature research area, video denoising remains a challenging problem. A fundamental issue at the core of video denoising (VD) is how to remove noise efficiently by exploiting the temporal redundancy across video frames in a principled manner. The proposed MAP-based VD algorithm decouples motion estimation (frame alignment) from image restoration (denoising) in a computationally efficient way. To address misalignment, the article also presents a robust multi-frame fusion strategy in which a neural network predicts spatially varying fusion weights. To facilitate end-to-end optimization, the proposed iterative MAP-based VD algorithm is unfolded into a deep convolutional network named MAP-VDNet. Extensive experiments on three popular video datasets show that MAP-VDNet significantly outperforms current state-of-the-art VD techniques such as ViDeNN and FastDVDnet. The code is available at https://see.xidian.edu.cn/faculty/wsdong/Projects/MAP-VDNet.htm. (publisher abstract modified)
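To make the idea of spatially varying fusion weights concrete, the following is a minimal sketch and not the authors' implementation. In the article the weights are predicted by a neural network; here a hand-crafted softmax over per-pixel squared differences stands in for that network. The function name `fuse_frames` and the `temperature` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np

def fuse_frames(aligned, reference, temperature=0.1):
    """Fuse aligned neighbor frames with per-pixel softmax weights.

    aligned:   array of shape (T, H, W), frames already warped to the reference
    reference: array of shape (H, W), the noisy reference frame
    Assumption: a hand-crafted weighting replaces the learned weight predictor.
    """
    # Per-pixel squared difference to the reference; large where alignment failed.
    err = (aligned - reference[None]) ** 2
    # Softmax over the temporal axis; subtract the max for numerical stability.
    logits = -err / temperature
    logits -= logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted average over time, computed independently at each pixel.
    fused = (weights * aligned).sum(axis=0)
    return fused, weights

# Example: the third frame is badly misaligned and should be down-weighted.
rng = np.random.default_rng(0)
ref = rng.random((4, 4))
frames = np.stack([ref, ref + 0.01, ref + 1.0])
fused, w = fuse_frames(frames, ref)
```

Because the weights are computed per pixel, a frame that is well aligned in one region but misaligned in another can still contribute where it is reliable, which is the motivation for spatially varying rather than per-frame weights.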