Real-world Adaptation for Enhanced Photo-realistic and Semantic Style Transfer in Indoor Panoramas


Abstract

We present a novel geometry-aware, shading-independent, photo-realistic, and semantic style transfer method for indoor panoramic scenes, designed for practical, real-world use. Unlike previous methods that require separate inputs, our approach employs a multitask dense prediction architecture to infer multiple pixel-wise signals from a single panoramic image. This comprehensive approach automatically derives the essential signals (depth, semantic layers, shading, and reflectance) from a single 360-degree panoramic indoor photo, significantly enhancing usability in real-world scenarios. Our method extends the capabilities of semantic-aware generative adversarial architectures by introducing two innovative strategies that address the geometric characteristics of indoor scenes and improve overall performance. First, we incorporate robust geometry losses that exploit layout and depth inference during training to ensure shape consistency between the generated scenes and the ground truth. Second, we employ a hybrid end-to-end edge-driven scheme based on a convolutional neural network to perform Intrinsic Image Decomposition (IID), extracting the albedo and normalized shading signals, in the form of obscurance and highlights, from the original scenes. We perform style transfer on the albedo rather than on the full RGB images, effectively preventing shading-related bleeding issues. Additionally, we apply super-resolution to the resulting scenes to enhance image quality and capture fine details. We evaluated this extended model on both real-world and synthetic data. Experimental results demonstrate that our proposed enhanced architecture outperforms state-of-the-art style transfer models in terms of perceptual and accuracy metrics, achieving an 18.91% lower ArtFID (Art Fréchet Inception Distance), a 13.99% higher PSNR (Peak Signal-to-Noise Ratio), and an 8.99% higher SSIM (Structural Similarity).
The visual results show that our method is effective in producing realistic and visually pleasing indoor scenes for a variety of applications in the Architecture, Engineering, and Construction field.
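The core idea of stylizing the albedo rather than the full RGB image can be sketched as follows. This is an illustrative outline only, not the paper's implementation: `style_fn` is a hypothetical placeholder for any photo-realistic style-transfer model, and a simple multiplicative intrinsic model I = A · S is assumed, where in the paper the albedo and shading layers would come from the edge-driven IID network.

```python
import numpy as np

def stylize_albedo(rgb, shading, style_fn, eps=1e-6):
    """Sketch: stylize only the albedo layer, then recombine with the
    untouched shading so lighting is preserved and shading-related color
    bleeding is avoided. Assumes images in [0, 1] and I = A * S.
    `style_fn` is a stand-in for an arbitrary style-transfer model."""
    albedo = rgb / np.clip(shading, eps, None)      # recover A = I / S
    styled_albedo = style_fn(albedo)                # restyle reflectance only
    return np.clip(styled_albedo * shading, 0.0, 1.0)  # recompose I' = A' * S
```

With an identity `style_fn`, the decompose-recompose round trip returns the input image, which is a quick sanity check that the shading layer passes through unchanged.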
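The PSNR improvement quoted in the abstract follows the standard definition of the metric; for reference, a minimal implementation (assuming images normalized to [0, 1]):

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - test) ** 2)       # mean squared error
    if mse == 0:
        return float('inf')                # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 per pixel gives an MSE of 0.01 and hence a PSNR of 20 dB; higher values indicate closer agreement with the reference image.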

Article activity feed