Tencent researchers propose GFP-GAN, which leverages the rich and diverse priors encapsulated in a pretrained face GAN for blind face restoration.

The goal of blind face restoration is to recover high-quality face images from low-quality counterparts that have suffered unknown degradation. Possible causes of degradation include noise, blur, low resolution, and compression artifacts. In this work, researchers from Tencent’s Applied Research Center propose GFP-GAN, a Generative Facial Prior GAN for real-world blind face restoration. As can be seen in Figure 1, the images restored by GFP-GAN are more realistic and sharper, with fewer artifacts.

Figure 2 shows the overall structure of GFP-GAN. Given a degraded input image, the goal of GFP-GAN is to generate a high-quality image that is as similar as possible to the non-degraded ground-truth image. At a high level, GFP-GAN consists of a degradation removal module and a pretrained face GAN such as StyleGAN2. These components are described in the following paragraphs.

Degradation removal module

In this paper, the authors use a U-Net architecture as the degradation removal module. This module is responsible for removing the degradation from the input image and extracting the “clean” features (named F_latent and F_spatial) that StyleGAN2 will use later. To provide intermediate supervision for degradation removal, the authors rely on a restoration loss. Specifically, GFP-GAN produces an image at each resolution scale of the U-Net decoder. Each of these images is then constrained to approximate the ground-truth image at the same resolution scale.
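The multi-scale supervision described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a plain L1 distance, 2x2 average pooling to build the ground-truth pyramid, and decoder outputs ordered from finest to coarsest. The function names `avg_pool2x` and `pyramid_l1_loss` are hypothetical.

```python
import numpy as np

def avg_pool2x(img):
    """Downsample an (H, W, C) image by a factor of 2 with 2x2 average pooling."""
    h, w, c = img.shape
    trimmed = img[: h - h % 2, : w - w % 2]  # drop an odd trailing row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def pyramid_l1_loss(decoder_outputs, ground_truth):
    """Sum of mean L1 distances between each decoder output and the ground
    truth downsampled to the same resolution (assumed loss form).

    decoder_outputs: list of (H, W, C) arrays, finest scale first, each
    scale a power-of-two fraction of the ground-truth size.
    """
    total = 0.0
    target = ground_truth
    for out in decoder_outputs:
        # Shrink the target until it matches the current output's resolution.
        while target.shape[0] > out.shape[0]:
            target = avg_pool2x(target)
        total += float(np.mean(np.abs(out - target)))
    return total

# Toy usage: three scales of an 8x8 image.
gt = np.ones((8, 8, 3))
outputs = [np.ones((8, 8, 3)), np.ones((4, 4, 3)), np.zeros((2, 2, 3))]
loss = pyramid_l1_loss(outputs, gt)
```

In this toy case only the coarsest output differs from its target, so only that scale contributes to the loss.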

Generative facial prior and latent code mapping

The intuition behind this work is that, because a pretrained face GAN captures the distribution of human faces, it can be used to improve the restoration of degraded images. A common approach is to map the input image to its closest latent code in the latent space of the pretrained GAN and then use the same GAN to generate the corresponding output. However, this approach usually requires time-consuming iterative optimization to reach satisfactory results. For this reason, the authors instead generate intermediate features F_GAN of the closest face, which are then modulated with features from the degradation removal module to increase fidelity to the input image. In particular, the latent features F_latent are first mapped to intermediate latent codes W through a multilayer perceptron (MLP). The latent codes W then pass through each convolutional layer of the pretrained GAN to generate GAN features at each resolution scale. These features are essential because they provide many of the facial details captured in the pretrained GAN weights. They also carry information about facial colors, which helps with color enhancement, including the colorization of old black-and-white photos.
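The mapping from the latent features to the latent codes W can be illustrated with a small sketch. Everything below is an assumption made for illustration: the two-layer MLP, the leaky ReLU activation, the toy tensor sizes, and the name `mlp_to_latent_codes` are not taken from the paper, and the weights are random rather than trained.

```python
import numpy as np

def mlp_to_latent_codes(f_latent, weights, biases, num_codes, code_dim):
    """Flatten the latent feature map and pass it through a small MLP,
    then reshape the result into one latent code per generator layer."""
    x = f_latent.reshape(-1)
    for i, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b
        if i < len(weights) - 1:
            x = np.where(x > 0, x, 0.2 * x)  # leaky ReLU on hidden layers only
    return x.reshape(num_codes, code_dim)

# Toy sizes (hypothetical): a 4x4x8 feature map mapped to 6 codes of 16 dims.
rng = np.random.default_rng(0)
f_latent = rng.standard_normal((4, 4, 8))
hidden, num_codes, code_dim = 32, 6, 16
weights = [
    rng.standard_normal((f_latent.size, hidden)) * 0.1,
    rng.standard_normal((hidden, num_codes * code_dim)) * 0.1,
]
biases = [np.zeros(hidden), np.zeros(num_codes * code_dim)]

w_codes = mlp_to_latent_codes(f_latent, weights, biases, num_codes, code_dim)
```

Each row of `w_codes` would then be fed to one layer of the pretrained generator, which is what makes a single forward pass sufficient in place of iterative optimization over the latent space.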

Channel-split spatial feature transform

Finally, to better preserve fidelity, the spatial features F_spatial produced by the degradation removal module are used to spatially modulate the GAN features F_GAN. In this way, spatial information from the input image can be retained. To achieve this, GFP-GAN relies on spatial feature transform (SFT) layers, which apply an affine transformation (scaling and shifting) to F_GAN. Moreover, given the trade-off between realism and fidelity, the authors propose channel-split spatial feature transform (CS-SFT) layers, which use F_spatial to spatially modulate some of the GAN feature channels to improve fidelity while leaving the remaining channels unchanged to maintain realism. CS-SFT layers are applied at each resolution scale before the final restored face is generated.
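The channel-split modulation can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the scale and shift maps (`alpha`, `beta`) are passed in directly, whereas in the method they would be predicted from the spatial features by learned layers, and the function name `cs_sft` is hypothetical.

```python
import numpy as np

def cs_sft(f_gan, alpha, beta):
    """Channel-split spatial feature transform (sketch).

    f_gan: (C, H, W) GAN features.
    alpha, beta: (C_mod, H, W) per-pixel scale and shift maps, C_mod <= C.
    The first C - C_mod channels pass through unchanged (realism); the
    remaining C_mod channels are scaled and shifted (fidelity).
    """
    split = f_gan.shape[0] - alpha.shape[0]
    identity_part = f_gan[:split]
    modulated_part = alpha * f_gan[split:] + beta
    return np.concatenate([identity_part, modulated_part], axis=0)

# Toy usage: 4 channels, of which the last 2 are modulated.
f_gan = np.ones((4, 2, 2))
alpha = 2.0 * np.ones((2, 2, 2))
beta = np.ones((2, 2, 2))
out = cs_sft(f_gan, alpha, beta)
```

How many channels to modulate is a design choice: modulating all of them reduces to a plain SFT layer, while modulating none leaves the GAN features untouched.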

Learning objectives

The loss function used during the learning process combines four losses.

The reconstruction loss requires the output to be as close as possible to the ground truth.

The adversarial loss is used to push GFP-GAN to generate images that look as natural as possible. To this end, a discriminator is trained to distinguish real images from those generated by GFP-GAN. Thanks to this loss, GFP-GAN learns to generate images that the discriminator cannot classify as real or fake.

The facial component loss aims to improve the level of detail in specific parts of the face: the left eye, the right eye, and the mouth. In particular, the authors use a local discriminator for each of these three regions, which must decide whether the restored patches are real or not. This pushes the patches to be closer to the corresponding natural facial components.

Finally, the authors use an identity preserving loss. A pretrained face recognition model (such as ArcFace) is used to capture the features that are most discriminative for identity. The identity preserving loss forces the restored image to be close to the ground truth in the ArcFace feature space.
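As a rough illustration of how the four terms above could be combined, the sketch below computes an identity distance between two embedding vectors and a weighted sum of the individual losses. The L1 form of the distance, the unit weights, and the names `identity_loss` and `total_loss` are placeholders chosen for this example, not values or definitions from the paper.

```python
def identity_loss(emb_restored, emb_ground_truth):
    """Mean absolute difference between two face-recognition embeddings
    (assumed distance; the embeddings would come from a model like ArcFace)."""
    diffs = [abs(a - b) for a, b in zip(emb_restored, emb_ground_truth)]
    return sum(diffs) / len(diffs)

def total_loss(rec, adv, comp, ident, weights=None):
    """Weighted sum of the reconstruction, adversarial (discriminator-based),
    facial component, and identity preserving terms. Weights are placeholders."""
    weights = weights or {"rec": 1.0, "adv": 1.0, "comp": 1.0, "id": 1.0}
    return (
        weights["rec"] * rec
        + weights["adv"] * adv
        + weights["comp"] * comp
        + weights["id"] * ident
    )

# Toy usage with made-up loss values and 2-dimensional embeddings.
ident = identity_loss([1.0, 2.0], [1.0, 4.0])
loss = total_loss(rec=0.5, adv=0.2, comp=0.3, ident=ident)
```

In practice the relative weights control the balance between fidelity to the ground truth and perceptual realism, so they are usually tuned rather than left at one.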

This article is written as a research summary by Marktechpost Staff based on the research paper 'Towards Real-World Blind Face Restoration with Generative Facial Prior'. All credit for this research goes to the researchers on this project. Check out the paper and GitHub link.

Luca is a Ph.D. Student at the Department of Computer Science at the University of Milan. His interests are machine learning, data analysis, internet of things, phone programming and indoor positioning. His research currently focuses on holistic computing, context awareness, explainable artificial intelligence, and the recognition of human activity in intelligent environments.