An N-Layered Network (NLN) for High-Resolution Image Inpainting with Multi Spatial Lexical-Texture Fusion (MS-LTF)
Abstract
Image inpainting reconstructs deteriorated regions of an image by synthesizing plausible, high-quality pixels. It underpins applications such as photo restoration, object removal, and artistic creation. This paper presents the Synergistic Ensemble Model (SEM), a hybrid algorithm for high-resolution image inpainting that reconstructs damaged regions of high-resolution images with high accuracy. In this work, the pre-trained DeepFill v2 (Generative Image Inpainting) model is fine-tuned on the given image datasets; this fine-tuning improves DeepFill v2's ability to identify accurate patterns in missing regions, which increases the performance of the proposed approach. During preprocessing, a mask-creation step generates masks that identify missing or corrupted areas of the input images. An integrated feature extraction technique, Multi Spatial Lexical-Texture Fusion (MS-LTF), extracts both global and local texture features. Building on this, the proposed SEM includes a new model, the N-Layered Network (NLN), which systematically reconstructs deteriorated input images with an accurate filling rate. The main advantage of the NLN is that it takes the features of the pre-trained DeepFill v2 and combines them with an N5-layer model, improving the reconstruction rate. Experiments were conducted on three benchmark datasets: CelebA-HQ, Paris StreetView, and Places2. The result analysis shows that the proposed approach improves the reconstruction rate in terms of peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), mean square error (MSE), and Fréchet inception distance (FID), while also enhancing subjective visual quality, without visible patch artifacts in the output image.
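To make the mask-creation preprocessing and the PSNR/MSE metrics mentioned above concrete, the toy sketch below builds a binary mask over a corrupted region, applies a naive mean fill, and measures the reconstruction with MSE and PSNR. This is an illustrative NumPy sketch, not the authors' pipeline: the functions `make_mask`, `mse`, and `psnr` and the zero-valued-pixel corruption convention are assumptions for demonstration only.

```python
import numpy as np

def make_mask(image, corrupt_value=0):
    """Binary mask: 1 where pixels are missing/corrupted, 0 elsewhere.
    Assumes corrupted pixels are marked with corrupt_value (illustrative)."""
    return (image == corrupt_value).astype(np.uint8)

def mse(a, b):
    """Mean square error between two images."""
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    err = mse(a, b)
    return float("inf") if err == 0 else 10.0 * np.log10(max_val ** 2 / err)

# Toy demo: damage an image, build its mask, fill the hole with the mean
# of the undamaged pixels (a trivial stand-in for a learned inpainter).
original = np.full((32, 32), 200, dtype=np.uint8)
damaged = original.copy()
damaged[8:16, 8:16] = 0                              # "missing" square region
mask = make_mask(damaged)                            # 1 inside the hole, 0 outside
filled = damaged.copy()
filled[mask == 1] = int(damaged[mask == 0].mean())   # naive mean fill
print(psnr(original, damaged) < psnr(original, filled))  # → True: filling improves PSNR
```

A real inpainter would replace the mean fill with the network's prediction inside the masked region; the metric functions stay the same, which is why PSNR and MSE serve as the paper's reconstruction-rate measures.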