
Dagstuhl Seminar 22161

Recent Advancements in Tractable Probabilistic Inference

(Apr 18 – Apr 22, 2022)

ML models and systems that enable and support decision making in real-world scenarios need to reason robustly and effectively in the presence of uncertainty over the configurations of the world that can be observed. Probabilistic inference provides a principled framework to carry out this reasoning process, and enables probabilistic modeling: a collection of principles for designing models, and learning them from data, that are capable of dealing with uncertainty. The main purpose of these models, once learned or built, is to answer queries, posed by humans or other autonomous systems, concerning some aspects of the represented world and quantifying some form of uncertainty over it. That is, computing some quantity of interest of the probability distribution that generated the observed data: for instance, the mean or the modes of such a distribution, the marginal or conditional probabilities of events, the expected utilities of our policies, or the most likely assignments to variables (also known as MAP inference, cf. the Viterbi algorithm). Answering these queries reliably and efficiently is more important than ever: we need ML models and systems to perform inference based on well-calibrated uncertainty estimates throughout all reasoning steps, especially when informing and supporting humans in decision making processes in the real world.

For instance, consider an ML system learned from clinical data to support physicians and policy makers. Such a system would need to support arbitrary queries posed by physicians, that is, questions that are not known a priori. Moreover, these queries might involve complex probabilistic reasoning over possible states of the world, for instance maximizing some probabilities and marginalizing over unseen or unavailable (missing) attributes, as in "At what age is a patient with this X-ray but no previous health record most likely to show any symptom of COVID-19?", or counting and comparing sub-populations, as in "What is the probability of there being more cases with fever given a BMI of 25 in this county than in the neighboring one?". At the same time, the system should guarantee that the uncertainty in its answers, modeled as probabilities, is faithful to the real-world distribution, as uncalibrated estimates might greatly mislead the decision maker.

Recent successes in machine learning (ML), and particularly deep learning, have delivered very expressive probabilistic models and learning algorithms. These have proven able to induce increasingly rich models from ever larger datasets but, unfortunately, at a high cost: these models are intractable for all but the most trivial probabilistic reasoning tasks, and they have been shown to provide unreliable uncertainty estimates. In summary, their applicability to real-world scenarios, like the one just described, is very limited.

Nevertheless, all these required "ingredients" are within the grasp of several models, which we group together under the umbrella name of tractable probabilistic models, the core interest of this seminar. Tractability here guarantees answering queries efficiently and exactly. Tractable probabilistic models (TPMs) have a long history rooted in several research fields such as classical probabilistic graphical models (low-treewidth and latent variable models), automated reasoning via knowledge compilation (logical and arithmetic circuits) and statistics (mixture models, Kalman filters). While these classical TPMs are known to be limited in expressiveness, several recent advancements in deep tractable models (sum-product networks, probabilistic sentential decision diagrams, normalizing flows and neural autoregressive models) are inverting the trend and promising tractable probabilistic inference with little or no compromise when compared to the deep generative models discussed above. It is therefore increasingly important to have a seminar on these recent successes of TPMs, bringing the different communities of tractable probabilistic modeling to the same table to propel collaborations by defining the goals and the agenda for future research.

The seminar's discussions were organized around the following major topics:

  • Advanced probabilistic query classes
  • Deep tractable probabilistic modeling
  • Robust and verifiable probabilistic inference
  • Exploiting symmetries for probabilistic modelling and applications in science

Advanced probabilistic query classes

Probabilistic inference can be reduced to computing probabilistic queries, i.e., functions whose outputs are certain properties of a probability distribution (e.g., its mass, density, mean, mode, etc.) as encoded by a probabilistic model. Probabilistic queries can be grouped into classes when they compute the same distributional properties and hence require the same computational effort to be answered. Among the most commonly used query classes are complete evidence (EVI), marginals (MAR), conditionals (CON) and maximum a posteriori (MAP) inference. While these classes have been extensively investigated in theory and practice, they constitute a small portion of the probabilistic inference that might be required to support complex decision making in the real world.
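
To make the four query classes concrete, the following minimal sketch answers EVI, MAR, CON and MAP queries by brute-force enumeration over a tiny joint distribution. The distribution, the variable names A, B, C and the function names are hypothetical and chosen only for illustration; they do not come from the seminar. Enumeration like this takes time exponential in the number of variables, which is exactly the cost that tractable models aim to avoid.

```python
# Hypothetical joint distribution over three binary variables (A, B, C).
# The probabilities are illustrative only and sum to 1.
VARS = ("A", "B", "C")
JOINT = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}


def consistent(state, evidence):
    """True if a complete state agrees with a partial assignment."""
    return all(state[VARS.index(v)] == x for v, x in evidence.items())


def evi(assignment):
    """EVI: probability of one complete assignment to all variables."""
    return JOINT[tuple(assignment[v] for v in VARS)]


def mar(evidence):
    """MAR: probability of a partial assignment, summing out the rest."""
    return sum(p for state, p in JOINT.items() if consistent(state, evidence))


def con(query, evidence):
    """CON: p(query | evidence), computed as a ratio of two marginals."""
    return mar({**query, **evidence}) / mar(evidence)


def map_query(evidence):
    """MAP: most likely complete assignment that agrees with the evidence."""
    candidates = {s: p for s, p in JOINT.items() if consistent(s, evidence)}
    best = max(candidates, key=candidates.get)
    return dict(zip(VARS, best))


if __name__ == "__main__":
    print("EVI:", evi({"A": 1, "B": 0, "C": 1}))
    print("MAR:", mar({"A": 1}))
    print("CON:", con({"C": 1}, {"A": 1}))
    print("MAP:", map_query({"A": 0}))
```

Each function scans the whole table, so the sketch only illustrates what the queries mean, not how a tractable model would answer them.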

In fact, one might want to compute the probabilities of logical and arithmetic constraints or of structured objects such as rankings, compare the likelihoods and counts of groups of events, or compute the expected predictions of a discriminative model, such as a classifier or a regressor, w.r.t. some feature distribution. Tracing the exact boundaries of tractable probabilistic inference for these advanced probabilistic query classes, and devising probabilistic models that deliver efficient and reliable inference for them, is an open challenge.

Deep tractable probabilistic modeling

A probabilistic model falls under the umbrella name of tractable probabilistic models (TPMs) if it guarantees exact and polytime inference for certain query classes. As different model classes can be tractable representations for different query classes, a spectrum of tractable inference emerges. Typically, this creates a tension between the number of query classes a model class supports tractably and its expressive efficiency, i.e., the set of functions it can represent compactly.

Recent deep generative models such as generative adversarial networks (GANs) and regularized and variational autoencoders (VAEs) fall outside the TPM umbrella because they either have no explicit likelihood model or because computing even the simplest class of queries, EVI, is hard in general. In fact, despite their successes, their inference capabilities are severely limited and one has to resort to approximations. However, the approximate inference routines available so far (such as the evidence lower bound and its variants) do not provide sufficiently strong guarantees on the quality of the approximation for it to be safely deployed in real-world scenarios.

On the other hand, classical TPMs from the probabilistic graphical model community support larger classes of tractable queries comprising MAR, CON and MAP (to different extents based on the model class). Among these there are: i) low or bounded-treewidth probabilistic graphical models that exchange expressiveness for efficiency; ii) determinantal point processes which allow tractable inference for distributions over sets; iii) graphical models with high girth or weak potentials, that provide bounds on the performance of approximate inference methods; and iv) exchangeable probabilistic models that exploit symmetries to reduce inference complexity.

A different perspective on tractability is brought by models that compile inference routines into efficient computational graphs. Models such as arithmetic circuits, sum-product networks, cutset networks and probabilistic sentential decision diagrams have advanced the state of the art in inference performance by exploiting context-specific independence, determinism or latent variables. These TPMs, as well as many of the classical tractable PGMs listed above, can be cast under the unifying framework of probabilistic circuits (PCs), which abstracts from the different graphical formalisms of each model. PCs with certain structural properties support tractable MAR, CON and MAP, as well as some of the advanced query classes touched upon in the previous topic. Guy Van den Broeck gave a long talk on the first day of the seminar to set the stage for participants to view tractable probabilistic models through the lens of probabilistic circuits.
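
As an illustration of how a circuit with suitable structure answers MAR queries, the sketch below builds a very small probabilistic circuit by hand. The node classes, weights and variable names X and Y are hypothetical and not taken from any library or from the seminar. The circuit is smooth and decomposable (each product node has children over disjoint variables, and each sum node has children over the same variables), and under that assumption a marginal can be computed in a single bottom-up pass by letting the leaves of unobserved variables output 1.

```python
from dataclasses import dataclass
from itertools import product as cartesian


@dataclass
class Leaf:
    """Bernoulli leaf over one binary variable (hypothetical node type)."""
    var: str
    p_one: float  # probability that the variable equals 1

    def value(self, evidence):
        if self.var not in evidence:
            return 1.0  # variable is marginalized out: the leaf sums to 1
        return self.p_one if evidence[self.var] == 1 else 1.0 - self.p_one


@dataclass
class Product:
    """Product node; its children must cover disjoint sets of variables."""
    children: list

    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result


@dataclass
class Sum:
    """Sum node (mixture); its weights are non-negative and sum to 1."""
    weights: list
    children: list

    def value(self, evidence):
        return sum(w * c.value(evidence)
                   for w, c in zip(self.weights, self.children))


# A mixture of two fully factorized distributions over X and Y.
circuit = Sum(
    weights=[0.3, 0.7],
    children=[
        Product([Leaf("X", 0.9), Leaf("Y", 0.2)]),
        Product([Leaf("X", 0.1), Leaf("Y", 0.6)]),
    ],
)


def brute_force_marginal(node, evidence, variables=("X", "Y")):
    """Reference answer: enumerate all completions of the evidence."""
    free = [v for v in variables if v not in evidence]
    return sum(node.value({**evidence, **dict(zip(free, values))})
               for values in cartesian([0, 1], repeat=len(free)))


if __name__ == "__main__":
    print("single pass:", circuit.value({"X": 1}))
    print("enumeration:", brute_force_marginal(circuit, {"X": 1}))
```

The single pass visits each node once, so its cost is linear in the size of the circuit, whereas the enumeration grows exponentially with the number of unobserved variables.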

More recently, the field of neural density estimators has gained momentum in the tractable probabilistic modeling community. This is due to model classes such as normalizing flows and autoregressive models. Autoregressive models and flows retain the expressiveness of GANs and VAEs, by leveraging powerful neural representations for probability factors or invertible transformations, while overcoming their limitations and delivering tractable EVI queries. As such, they sit at the opposite end of the tractability spectrum from PCs: while the latter support more tractable query classes, the former are generally more expressive. On the first day of the seminar, Marcus Brubaker introduced these models to the seminar participants in a long talk. It is an interesting open challenge to combine TPMs from different regions of this spectrum to leverage the "best of different worlds", i.e., to increase a model class's expressive efficiency while retaining the largest possible set of supported tractable query classes. The first day ended with a lively open discussion on the differences between TPMs and neural generative models, and on the advantages and lessons each can offer the other.
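
The tractable EVI queries mentioned above follow from the change-of-variables formula. The sketch below is a deliberately tiny example, assuming a two-dimensional affine autoregressive flow with a standard normal base distribution; all parameter names and values (m1, s1, a, b, s2) are hypothetical, and real flows would use neural networks in place of the fixed affine maps. It evaluates the exact log-density of a complete observation in closed form and checks it against the equivalent autoregressive factorization p(x1) p(x2 | x1).

```python
import math


def std_normal_logpdf(z):
    """Log-density of a standard normal at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal with the given mean and standard deviation."""
    return std_normal_logpdf((x - mean) / std) - math.log(std)


def flow_log_density(x1, x2, m1=0.5, s1=2.0, a=1.5, b=-1.0, s2=0.5):
    """Exact EVI for a toy affine autoregressive flow (illustrative values).

    The inverse map sends data to base noise:
        z1 = (x1 - m1) / s1
        z2 = (x2 - (a * x1 + b)) / s2
    Its Jacobian is triangular, so the log-determinant is the sum of the
    logs of the diagonal entries 1/s1 and 1/s2.
    """
    z1 = (x1 - m1) / s1
    z2 = (x2 - (a * x1 + b)) / s2
    log_det = -math.log(s1) - math.log(s2)
    return std_normal_logpdf(z1) + std_normal_logpdf(z2) + log_det


def autoregressive_log_density(x1, x2, m1=0.5, s1=2.0, a=1.5, b=-1.0, s2=0.5):
    """Same density written as the chain rule p(x1) * p(x2 | x1)."""
    return (gaussian_logpdf(x1, m1, s1)
            + gaussian_logpdf(x2, a * x1 + b, s2))


if __name__ == "__main__":
    print(flow_log_density(1.0, 0.2))
    print(autoregressive_log_density(1.0, 0.2))
```

Both functions return the same value, which shows the EVI query is exact and cheap. For a general flow, by contrast, a MAR query that sums out some of the variables would require integrating this density, for which no comparable closed form is available in general.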

Robust and verifiable probabilistic inference

Along with exactness and efficiency, one generally requires inference routines to be robust to adversarial conditions (noise, malicious attacks, etc.) and to allow exactness and efficiency to be formally provable. This is crucial to deploy reliable probabilistic models in real-world scenarios (cf. the other topics). Recent advancements in learning tractable and intractable probabilistic models from data have raised the question of whether the learned models are just exploiting spurious correlations in input space, thus ultimately delivering an unfaithful image of the probability distribution they try to encode. This raises several issues, as in tasks like anomaly detection and model comparison, which rely on correctly calibrated probabilities, one can be badly misled by such unfaithful probabilistic models. Furthermore, one might want to verify a priori or ex post (e.g., in the presence of adversarial interventions) whether a probabilistic inference algorithm truly guarantees exact inference. Questions like this have only very recently been tackled in a formal verification setting, where proofs of the correctness of inference can be verified with fewer resources than it takes to execute inference.

Over the course of the seminar, through informal discussions and formal talks, the participants discussed the above-mentioned issues in tractable probabilistic inference through topics such as Bayesian deep learning, incorporating symmetries in probabilistic modelling using equivariance with applications in the sciences, explainable AI, etc.

Overall, the seminar produced numerous insights into how efficient, expressive, flexible, and robust tractable probabilistic models can be built. Specifically, the discussions and talks at the seminar spurred a renewed interest in the community to:

  • develop techniques and approaches that bring together key ideas from several different fields, including deep generative models, probabilistic circuits, knowledge compilation, and approximate inference.
  • create bridges between researchers in these different fields and identify ways in which enhanced interaction between the communities can continue.
  • generate a set of goals, research directions, and challenges for researchers in these fields to develop robust and principled probabilistic models.
  • provide a unified view of the current undertakings in these different fields towards probabilistic modelling and identify ways to incorporate ideas from several fields together.
  • develop a new systematic and unified set of development tools encompassing these different areas of probabilistic modelling.
Copyright Priyank Jaini, Kristian Kersting, Antonio Vergari, and Max Welling


AI and ML systems are increasingly being deployed in real-world scenarios (from healthcare, to finance, to policy making) to support human decision makers. As such, they are expected to reliably and flexibly support decisions by reasoning in the presence of uncertainty. Probabilistic inference provides a principled way to carry out this reasoning process over models that encode complex representations of the world as probability distributions. While we would like to have guarantees over the quality of the answers that these probabilistic models provide, we also expect them to be expressive enough to capture the intricate dependencies of the world they try to represent. Research on tractable probabilistic inference and modeling investigates precisely how a sensible trade-off between reliability and flexibility can be achieved in these challenging scenarios.

Traditionally, research on representations and learning for tractable inference has spanned very different fields, each one contributing its own perspective. These include automated reasoning, probabilistic modeling, statistical and Bayesian inference and deep learning. More recent trends include the emerging fields of tractable neural density estimators such as autoregressive models and normalizing flows; probabilistic circuits such as sum-product networks and probabilistic sentential decision diagrams; and approximate inference routines with guarantees on the quality of the approximation.

The main goal of this Dagstuhl Seminar is to provide a common forum for researchers working in these seemingly “disparate” areas to discuss recent advancements in reliable, efficient inference over expressive probabilistic models, as well as open problems such as:

i) how can we design and learn expressive probabilistic models that guarantee tractable inference? How can we trade off reliability and expressiveness in a principled way?

ii) how can probabilistic models robustly reason about the world and safely generalize over unknown states of the world?

iii) what challenges do practitioners of probabilistic modeling face in their applications and how can we democratize the use of reliable and efficient probabilistic inference?

iv) how can we effectively exploit the structure and symmetries in the world and in our models to efficiently perform inference or obtain reliable approximations?

We hope that the discussions around these topics can be turned into a vision document that not only summarizes the current state-of-the-art in these diverse fields, but also reconciles them to serve as an inspirational guide for a new generation of researchers that is just approaching the broader field of probabilistic AI and ML.

We thus aim to include participants from the many recent emerging fields of tractable neural density estimators such as autoregressive models and normalizing flows; deep tractable probabilistic circuits such as sum-product networks, probabilistic sentential decision diagrams and cutset networks; as well as approximate inference routines with guarantees on the quality of the approximation to offer diverse perspectives for the seminar.

Copyright Priyank Jaini, Kristian Kersting, Antonio Vergari, and Max Welling

  • Alessandro Antonucci (IDSIA - Manno, CH)
  • Michael Chertkov (University of Arizona - Tucson, US)
  • YooJung Choi (UCLA, US) [dblp]
  • Alvaro Correia (TU Eindhoven, NL)
  • Priyank Jaini (Google - Toronto, CA)
  • Kristian Kersting (TU Darmstadt, DE) [dblp]
  • Stefan Mengel (University of Artois/CNRS - Lens, FR) [dblp]
  • Eric Nalisnick (University of Amsterdam, NL)
  • Sriraam Natarajan (University of Texas - Dallas, US) [dblp]
  • Mathias Niepert (Universität Stuttgart, DE) [dblp]
  • Robert Peharz (TU Graz, AT)
  • Xiaoting Shao (TU Darmstadt, DE)
  • Guy Van den Broeck (UCLA, US) [dblp]
  • Antonio Vergari (University of Edinburgh, GB)
  • Andrew G. Wilson (New York University, US)
  • Marcus A. Brubaker (York University - Toronto, CA)
  • Cassio de Campos (TU Eindhoven, NL)
  • Nicola Di Mauro (University of Bari, IT)
  • Laurent Dinh (Montreal, CA)
  • Danilo Jimenez Rezende (Google DeepMind - London, GB) [dblp]
  • Mikko Koivisto (University of Helsinki, FI) [dblp]
  • Sara Magliacane (University of Amsterdam, NL)
  • Lilith Francesca Mattei (IDSIA - Lugano, CH)
  • Denis D. Mauá (University of Sao Paulo, BR)
  • Karthika Mohan (Oregon State University, US)
  • David Montalvan Hernandez (TU Eindhoven, NL)
  • Deepak Pathak (Carnegie Mellon University - Pittsburgh, US)
  • Tahrima Rahman (University of Texas - Dallas, US)
  • Jakub Tomczak (VU University Amsterdam, NL)
  • Aki Vehtari (Aalto University, FI)
  • Max Welling (University of Amsterdam, NL) [dblp]
  • Yaoliang Yu (University of Waterloo, CA)
  • Han Zhao (University of Illinois - Urbana-Champaign, US)

  • Artificial Intelligence
  • Machine Learning

  • Generative Models
  • Deep Learning
  • Probabilistic Models
  • Graphical Models
  • Tractable inference