SynSin: End-to-End View Synthesis From a Single Image

Wiles, Olivia and Gkioxari, Georgia and Szeliski, Richard and Johnson, Justin (2020) SynSin: End-to-End View Synthesis From a Single Image. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp. 7465-7475. ISBN 978-1-7281-7168-5.

Full text is not posted in this repository. Consult Related URLs below.

View synthesis allows for the generation of new views of a scene given one or more images. This is challenging; it requires comprehensively understanding the 3D scene from images. As a result, current methods typically use multiple images, train on ground-truth depth, or are limited to synthetic data. We propose a novel end-to-end model for this task using a single image at test time; it is trained on real images without any ground-truth 3D information. To this end, we introduce a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view. The projected features are decoded by our refinement network to inpaint missing regions and generate a realistic output image. The 3D component inside of our generative model allows for interpretable manipulation of the latent feature space at test time, e.g. we can animate trajectories from a single image. Additionally, we can generate high resolution images and generalise to other input resolutions. We outperform baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.
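
The geometric step the abstract describes (lifting per-pixel features to a 3D point cloud and projecting them into the target view) can be illustrated with a minimal sketch. This is not the differentiable renderer the paper introduces: it is a plain, non-differentiable nearest-point z-buffer splat, assuming a pinhole camera and a known per-pixel depth map. The function names (`unproject`, `transform`, `splat_to_target`) and the `intrinsics` tuple layout are hypothetical and chosen only for illustration.

```python
from typing import Any, List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject(u: int, v: int, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with the given depth to a 3D point (pinhole model)."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(p: Vec3, rot: Sequence[Sequence[float]],
              trans: Sequence[float]) -> Vec3:
    """Apply a rigid transform p' = rot @ p + trans (source to target camera)."""
    x, y, z = (sum(rot[i][j] * p[j] for j in range(3)) + trans[i]
               for i in range(3))
    return (x, y, z)


def splat_to_target(features: List[List[Any]],
                    depth: List[List[float]],
                    intrinsics: Tuple[float, float, float, float],
                    rot: Sequence[Sequence[float]],
                    trans: Sequence[float]) -> List[List[Optional[Any]]]:
    """Project per-pixel features into the target view with a hard z-buffer.

    Pixels that receive no point stay None; these are the holes that a
    learned refinement stage would have to inpaint.
    """
    fx, fy, cx, cy = intrinsics
    h, w = len(depth), len(depth[0])
    out: List[List[Optional[Any]]] = [[None] * w for _ in range(h)]
    zbuf = [[float("inf")] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            point = unproject(u, v, depth[v][u], fx, fy, cx, cy)
            x, y, z = transform(point, rot, trans)
            if z <= 0:
                continue  # point is behind the target camera
            tu = round(fx * x / z + cx)
            tv = round(fy * y / z + cy)
            # Keep only the closest point that lands on each target pixel.
            if 0 <= tu < w and 0 <= tv < h and z < zbuf[tv][tu]:
                zbuf[tv][tu] = z
                out[tv][tu] = features[v][u]
    return out


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    feats = [["a", "b", "c"]]
    depths = [[1.0, 1.0, 1.0]]
    # Shift the scene one unit along x relative to the camera.
    print(splat_to_target(feats, depths, (1.0, 1.0, 0.0, 0.0),
                          identity, (1.0, 0.0, 0.0)))
```

In this toy setting the shifted view leaves one pixel without any projected point, which mirrors the missing regions that the abstract says the refinement network inpaints. A hard z-buffer like this one passes no gradient to neighbouring points, which is one reason a differentiable renderer is needed for end-to-end training.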

Item Type: Book Section
Related URLs:
URL Type: Item; Description: Discussion Paper
ORCID: Johnson, Justin 0000-0002-1251-088X
Record Number: CaltechAUTHORS:20221215-789782000.20
Persistent URL:
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 118379
Deposited By: George Porter
Deposited On: 19 Dec 2022 20:53
Last Modified: 19 Dec 2022 22:48