Congratulations to Yuhang Song for passing his Qualifying Exam on January 10, 2019! Yuhang’s thesis proposal is titled “High-Quality Image Inpainting with Deep Generative Models”. His qualifying exam committee consisted of Jay Kuo (Chair), Antonio Ortega, Alexander Sawchuk, Panayiotis Georgiou, and Ulrich Neumann (Outside Member).

We invited Yuhang to talk about his thesis proposal:

Image inpainting is the task of reconstructing the missing region of an image with plausible content based on its surrounding context, a common topic in low-level computer vision. Recent developments in deep generative models enable efficient end-to-end frameworks for image synthesis and inpainting. However, existing methods are limited to filling small holes in low-resolution images, and they often generate unsatisfying results with easily detectable flaws. In this thesis proposal, we study two problems related to image inpainting: 1) fine-tuning the generated image textures; and 2) exploiting semantic segmentation information for higher-quality image inpainting.

To overcome the difficulty of directly learning the distribution of high-dimensional image data, we divide the task into two separate steps, inference and translation, and model each step with a deep neural network. We also use simple heuristics to guide the propagation of local textures from the boundary into the hole. We show that, with these techniques, inpainting reduces to learning two image-feature translation functions in much smaller spaces, which are therefore easier to train. We evaluate our method on several public datasets and show that it generates results of better visual quality than previous state-of-the-art methods.

The second research idea is motivated by the fact that existing methods based on generative models do not exploit segmentation information to constrain object shapes, which usually leads to blurry results along object boundaries. To tackle this problem, we propose to introduce semantic segmentation information, which disentangles inter-class differences from intra-class variations for image inpainting. This leads to much clearer recovered boundaries between semantically different regions and better textures within semantically consistent segments. Our model factorizes the image inpainting process into two steps, segmentation prediction (SP-Net) and segmentation guidance (SG-Net): it first predicts the segmentation labels in the missing area and then generates segmentation-guided inpainting results. Experiments on multiple public datasets show that our approach outperforms existing methods in inpainting quality, and the interactive segmentation guidance makes multi-modal inpainting predictions possible.
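The two-stage factorization can likewise be sketched as a pair of networks, where a predicted (or user-provided) segmentation map conditions the generation step. The SPNet and SGNet classes below are placeholder assumptions that only illustrate this data flow, not the actual SP-Net and SG-Net designs from the proposal.

```python
# Rough sketch of the segmentation-prediction / segmentation-guidance pipeline.
# Architectures are stand-ins; only the two-stage interface is the point here.
import torch
import torch.nn as nn

class SPNet(nn.Module):
    """Segmentation prediction: label map for the whole image, hole included."""
    def __init__(self, num_classes=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 3, padding=1),
        )

    def forward(self, masked_image, mask):
        logits = self.net(torch.cat([masked_image, mask], dim=1))
        return logits.softmax(dim=1)  # per-pixel class probabilities

class SGNet(nn.Module):
    """Segmentation-guided inpainting: fill the hole conditioned on the labels."""
    def __init__(self, num_classes=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1 + num_classes, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, masked_image, mask, seg):
        return self.net(torch.cat([masked_image, mask, seg], dim=1))

def spg_inpaint(image, mask, sp_net, sg_net, user_seg=None):
    """If user_seg is given, it overrides the predicted labels (interactive use)."""
    masked = image * (1 - mask)
    seg = user_seg if user_seg is not None else sp_net(masked, mask)
    completed = sg_net(masked, mask, seg)
    return image * (1 - mask) + completed * mask

if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64) * 2 - 1
    msk = torch.zeros(1, 1, 64, 64)
    msk[:, :, 16:48, 16:48] = 1.0
    out = spg_inpaint(img, msk, SPNet(), SGNet())
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

The optional user_seg argument is meant to mirror the interactive guidance mentioned above: editing the label map in the hole would yield different, semantically controlled completions.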