Dagstuhl Seminar 25202
Generative Models for 3D Vision
(May 11 – May 16, 2025)
Organizers
- Bernhard Egger (Friedrich-Alexander-Universität Erlangen-Nürnberg, DE)
- Adam Kortylewski (Universität Freiburg, DE and MPI für Informatik - Saarbrücken, DE)
- William Smith (University of York, GB)
- Stefanie Wuhrer (INRIA - Grenoble, FR)
Contact
- Marsha Kleinbauer (for scientific matters)
- Simone Schilke (for administrative matters)
For this seminar, a total of 58 researchers were invited, and 25 of them attended. Participants came from both academia and industry and were at varying stages of their careers. Thirteen participants presented their work in talks of around 15 to 30 minutes each, and an abstract of each presentation is included in this report. We started the seminar with a short introduction of each participant: everyone was given one slide to introduce themselves and was asked to prepare a question, challenge or goal to discuss during the seminar.
In addition to traditional presentations, multiple slots were left for research discussions with the full group or with sub-groups of the participants. The first set of these slots was filled with topics that participants proposed before the start of the seminar. Five participants each led a research discussion of about one hour on a topic or problem they considered important. Some of these discussions were held with the full group, while others took place in sub-groups, with the resulting conclusions shared with the full group afterwards. Additionally, two 2-hour discussion slots were reserved for suggestions that came up during the seminar. These two long discussions addressed research questions that were identified during the seminar as important for the research community: metrics and capture, and hard problems that merit more study. All proposed topics led to lively discussions about various problems around generative models. Summaries of the results of these flexible sessions are contained in the Dagstuhl report. Beyond these organized discussions, there were numerous informal discussions during the Wednesday outing and free time slots; these are not summarized in this report.
Bernhard Egger, Adam Kortylewski, Laura Neschen, William Smith, and Stefanie Wuhrer
The rise of purely data-driven generative models, in particular generative adversarial networks, auto-regressive models, neural fields and diffusion models, has led to a step change in image synthesis quality. It is now possible to create photorealistic images with high-level semantic control and to address many desirable use cases such as 2D inpainting. Whilst prior models were object-specific (e.g. 3D Morphable Models of faces), we now have generative models for images and videos that can represent various object classes and generate a huge variety of objects and scenes, even in different styles. The drawback of purely data-driven approaches is that the control and explainability provided by 3D and physically-based parameters are lost. It is also difficult (and perhaps prohibitively inefficient) to learn 3D-consistent representations from 2D data alone, without prior models.
Very recently, the community has begun to explore how to combine these two philosophies. 3D computer vision tasks can benefit from the visual prior provided by generative image models: such models can learn powerful image priors with some notion of viewpoint consistency solely from 2D data and can then be used to synthesize training data for 3D vision models. Conversely, physically-based priors from 3D vision can guide generative image models, acting as a strong explicit inductive prior towards more data-efficient and accurate visual representations of the world. At the same time, modern generative models rely on huge training datasets and compute resources that, increasingly, are only available to large industrial research labs.
This Dagstuhl Seminar seeks to bring together communities of researchers from computer graphics, computer vision and machine learning in both industry and academia at this extremely timely moment in the progress of the field.
Bernhard Egger, Adam Kortylewski, William Smith, and Stefanie Wuhrer
Participants
- Andreea Ardelean (Universität Erlangen-Nürnberg, DE)
- Timotei Ardelean (Universität Erlangen-Nürnberg, DE)
- Thabo Beeler (Google - Zürich, CH)
- Timo Bolkart (Google Research - Zürich, CH)
- Neill Campbell (University of Bath, GB)
- Rishabh Dabral (MPI für Informatik - Saarbrücken, DE)
- Olaf Dünkel (MPI für Informatik - Saarbrücken, DE)
- Bernhard Egger (Friedrich-Alexander-Universität Erlangen-Nürnberg, DE)
- James Gardner (University of York, GB)
- Samara Ghrer (University of Grenoble, FR)
- Marilyn Keller (MPI für Intelligente Systeme - Tübingen, DE)
- Ron Kimmel (Technion - Haifa, IL)
- Tobias Kirschstein (TU München - Garching, DE)
- Adam Kortylewski (Universität Freiburg, DE and MPI für Informatik - Saarbrücken, DE)
- Jan Eric Lenssen (MPI für Informatik - Saarbrücken, DE)
- Ruoshi Liu (Columbia University - New York, US)
- Laura Neschen (INRIA Rhône-Alpes, FR)
- Or Patashnik (Tel Aviv University, IL)
- Ryan Po (Stanford University, US)
- Shunsuke Saito (Codec Avatars Lab - Pittsburgh, US)
- William Smith (University of York, GB)
- Christian Theobalt (MPI für Informatik - Saarbrücken, DE)
- Gül Varol (ENPC - Marne-la-Vallée, FR)
- Yaniv Wolf (Technion - Haifa, IL)
- Stefanie Wuhrer (INRIA - Grenoble, FR)
Classification
- Computer Vision and Pattern Recognition
- Graphics
- Machine Learning
Keywords
- Generative Models
- Implicit Representation
- Diffusion Models
- Neural Rendering
- Inverse Rendering

Creative Commons BY 4.0
