Dagstuhl Seminar 22291: Machine Learning and Logical Reasoning: The New Frontier

Dagstuhl Seminar 22291

Machine Learning and Logical Reasoning: The New Frontier

( Jul 17 – Jul 22, 2022 )

(Click in the middle of the image to enlarge)

Permalink

Please use the following short url to reference this page: https://www.dagstuhl.de/22291

Organizers

Sébastien Bardin (CEA LIST, FR)
Vijay Ganesh (University of Waterloo, CA)
Somesh Jha (University of Wisconsin-Madison, US)
Joao Marques-Silva (CNRS - Toulouse, FR)

Contact

Michael Gerke (for scientific matters)
Jutka Gasiorowski (for administrative matters)

Publications

Sébastien Bardin, Somesh Jha, and Vijay Ganesh. Machine Learning and Logical Reasoning: The New Frontier (Dagstuhl Seminar 22291). In Dagstuhl Reports, Volume 12, Issue 7, 80-111, Schloss Dagstuhl - Leibniz-Zentrum für Informatik. February 3, 2023

Schedule

Schedule

Summary

Show Summary

This Dagstuhl seminar is meant to be the first in a series, bringing together researchers from the two main pillars of AI, namely, logical reasoning and machine learning (ML), with a sharp focus on solver-based testing, analysis, and verification (TAV) methods aimed at improving the reliability and security of ML-based systems, and conversely, the use of ML heuristics in improving the power of solvers/provers. A third, albeit smaller focus is neuro-symbolic reasoning (NSR), that aims to combine the power of ML to learn deep correlations with the ability of solvers to perform logical inference as applied to many domains (including but not limited to math and logic).

While many previous Dagstuhl seminars focus on sub-fields of this particular seminar (SAT, SMT, CP or machine learning), we focus here on the synergies and interplay between all them. Our goal is to deepen the understanding of the connections between learning and reasoning, and draw mutually beneficial research directions.

General context: Bringing ML and Logic Reasoning Closer

fields, namely, machine learning (ML) and logical reasoning, that have developed relatively independent of each other. Each of these sub-fields has had a deep and sustained impact on many topics in computer science and beyond, despite the limited interaction between them over the years. However, in recent years new problems and opportunities have come to fore that point towards combinations of ML and logical reasoning as the way forward [1]. In this seminar, we aim to explore combinations of ML and logical reasoning, under the following three specific themes:

Logic Reasoning for ML. Neural Networks (NN) today are ubiquitous and are being deployed as part of critical civilian and defense infrastructure, business processes, automotive software, and governmental decision-making systems. Unfortunately, despite their efficacy in solving many problems, NNs are brittle, unreliable, and pose significant security/privacy challenges [2]. The question of safety and security of NNs has therefore become a great concern to scientists, companies, and governments. In response to this problem, a nascent field of TAV methods for NNs is developing [3]. Key research directions in this context include logics aimed at symbolically representing NNs and their properties [4], novel solving methods [5], as well as solver-based TAV techniques specifically tailored for NNs [6]. A related set of questions focus on explainability and interpretability of NNs [7]. Finally, researchers are also exploring methods that combine logical reasoning within NN learning processes, with the aim of making them adversarially robust [8, 9]. The seminar aims to bring together leading researchers in these topics, enabling cross-fertilization of ideas at a critical juncture in the development of the field.

ML for Logic Reasoning. In recent years, there has been considerable effort aimed at developing ML-based heuristics for logic reasoning engines such as SAT, SMT, and CP solvers. The premise of this line of research is that logic solvers are a combination of methods that implement proof rules and ML-based heuristics aimed at optimally selecting, sequencing, and initializing such proof rules [10]. This has led to new efficient solving algorithms that can solve real-world formulas with millions of variables and clauses in them. One of the many questions that will be explored in the seminar is how can we further deepen and strengthen this relation between ML and reasoning methods. Yet another line of research being explored is that of replacing rule-based solvers with NN-based logic reasoning (e.g., NeuroSAT [11]). Finally, methods are being developed to combine rule-based methods with reinforcement learning to automatically prove mathematical conjectures [12]. The seminar aims to foster deeper interaction and collaboration among researchers who are pioneers in this intersection of ML-based methods and logical reasoning.

Neuro-symbolic Reasoning. The field of neuro-symbolic reasoning (NSR) aims to combine NNs with symbolic reasoning for the purposes of improving reasoning for many domains (including but not limited to pure math or logic). While at a high-level the field of NSR and logic solvers (with ML heuristics) may seem similar, they employ very different kinds of techniques and have differing goals [1]. For example, NSR researchers have developed methods for translating logical representations of knowledge into neural networks. Others have developed neuro-symbolic methods for concept-learning, and yet others have recently applied NSR to program synthesis. Can these concept-learning methods be adapted to the setting of logic solvers? Could it be that graph neural network (GNN) based representations of mathematical knowledge are easier to analyze? The seminar aims to bring these two disparate communities closer together, that otherwise rarely interact with each other. In a nutshell, the aim of the seminar is to foster cross-fertilization of ideas between the logic reasoning, TAV, and ML communities.

In-depth Description of Focus Areas

Logic Reasoning for ML. As stated above, the reliability, safety, and security of NNs is a critical challenge for society at large. An example of a specific problem in this context is that of adversarial input generation methods against NNs. Many methods have been proposed to address this question, from randomized defense mechanisms to adversarial training to symbolic analysis of NNs via solvers, such as Reluplex [5] that are specifically designed to reason about NNs with ReLU units. Another line of work proposes verification of Binarized Neural Networks (BNNs) via SAT solvers [6]. These initial forays into reasoning for ML bring to fore new challenges, especially having to do with scalability of solvers for NN analysis. Traditional solver methods that scale well for typical software systems, do not seem to scale well for NNs. For example, it is known that solvers, such as Reluplex, are capable of analyzing NNs with only a few thousand nodes. The pressing question of this area of research then is “How can we develop methods that enable solvers to scale to NNs with millions of nodes in them?”

A related question has to do with appropriate logics to represent NNs and their properties. Recent work by Soutedeh and Thakur suggests that NNs can be represented symbolically as piecewise linear functions, even though they may use non-linear activation functions such as ReLU [4]. This suggests that there may be efficient solving methods capable of analyzing very large NNs. Yet another question in this setting is how do logic-based methods aimed at testing and verifying NNs compare against hybrid methods that do not require translation of NNs into logic. What are the tradeoffs in this setting?

Another interesting direction where logic reasoning can play a role is in explainability and interpretability of ML models. While both these questions have been long studied in AI and are closely related, they take particular importance in the context of NNs. We say a ML model is explainable, if there is discernable causal relationship between its input and output. Explanations for the behavior of NN, when presented in symbolic form, can be analyzed and debugged using solvers. Researchers have also developed solver-based xAI methods that aim to provide explanations for behavior of NNs [7]. By contrast, interpretable models are ones that have mathematical guarantees regarding their approximation or generalization errors. Solvers can play a role in this context as well via methods for generating counterfactuals (or adversarial examples) [1].

Strong points:

Adversarial attacks and defense mechanisms

Neural network testing, analysis, and verification methods

Piecewise linear symbolic representation of NNs

Solvers for NNs

Logic-guided machine learning

Adversarial training

Logic-based explainability and interpretability of NNs

ML-based Heuristics for Logic Solvers. In recent years, ML-based methods have had a considerable impact on logic solvers. The key premise of this line of research is that logic solvers are a combination of proof systems and ML-based heuristics aimed at optimally selecting, sequencing, and initializing proof rules with the goal of constructing short proofs (if one exists) [10]. A dominant paradigm in this setting is modeling branching heuristics as RL methods to solve the multi-arm bandit (MAB) problem [10]. While this connection seems quite natural today and MAB-style methods have been shown to be empirically powerful, an important question remains as to why these heuristics are effective for industrial instances. A theoretical answer to this question can open up new connections between ML and logic solvers. Another direction of research that has been explored is solving SAT using NNs, a la NeuroSAT [11]. Finally, higher-order theorem provers have been developed recently at Google and elsewhere that combine RL with logic reasoning in order to automatically prove theorems from a variety of mathematical fields [13, 12]. The seminar will focus on these recent developments and the next steps in the research on combinations of ML-based methods with logic reasoning with the goal of achieving greater solver efficiency as well as expressive power.

Strong points:

ML-techniques for branching and restarts in SAT, SMT, and CP solvers

Supervised learning methods for splitting and initialization in solvers

NN-based methods for logical reasoning

RL for higher-order theorem proving

Neuro-symbolic Reasoning. Researchers in neuro-symbolic reasoning (NSR) have been independently developing algorithms that combine ML with symbolic reasoning methods with a slightly different focus than solver and theorem prover developers. NSR research has been focused on concept learning in a broader setting than math or logic, and the cross-fertilization of these ideas with logic-based methods can have deep impact both on NSR as well as solver research [1]. One of the key ideas we plan to explore in this context is that of concept learning, i.e., learning of relations or concepts represented in a logical language directly from data. One interesting direction to explore would be how we can incorporate these methods in logic solvers? Another direction is to explore the synergy between NSR and synthesis of programs from examples. The seminar will focus on bringing NSR and solver researchers closer together, given that they rarely interact in other settings.

Strong points:

Concept-learning, with possible applications in higher-order theorem provers

Neuro-symbolic methods for program synthesis

Concept learning for predicate abstraction

Goals of the Seminar

shaping and setting the research agenda for ML-based solvers, TAV methods aimed at NNs, and NSR for many years to come.

The seminar will highlight the current challenges with symbolic analysis of NNs, scalability issues with solvers tailored for NNs, state-of-the-art ML-based heuristics for solvers, adapting NSR ideas to the setting of solvers and vice-versa, as well as bring to fore competing TAV methods that don’t necessarily rely on symbolic representation of NNs.

Research questions. We highlight some of the main challenges at the intersection of ML and logic reasoning that will be addressed during the seminar from different research perspectives, and discuss how we seek to combine or adapt current techniques to attack them.

Symbolic representation of NNs: Recent work suggests that, while NNs are non-linear functions, they can be effectively modelled symbolically as piecewise linear functions. This is a significant advance since it dramatically simplifies the design of solvers for analyzing NNs. Some of the challenges that remain are algorithmic, i.e., how can NNs be efficiently converted into a symbolic representation.
Solvers for NNs: As of this writing, Reluplex and its successors seem to be among the best solvers for analyzing the symbolic representations of NNs. Unfortunately, these tools scale to NNs with at most a few thousand nodes. There is an urgent need for novel ideas for solving algorithms that enable us to scale to real-world NNs with millions of nodes. Can hybrid methods that combine ML techniques with solvers scale more effectively than pure logic methods?
Combining Constraints and NN Learning: Another set of questions we plan to address is how can we improve the process via which NNs learn using logical constraints. In other words, can the back propagation algorithm be modified to take constraint or domain-specific knowledge into account? Can NNs be combined with logic solvers in a CEGAR-style feedback loop for the purposes of adversarial training?
Next steps in ML-based Heuristics for Solvers: As stated earlier, modern solvers rely in significant ways on ML-based heuristics for their performance. We plan to focus on how we could strengthen this interaction further. For example, are there supervised learning methods for improving the performance of divide-and-conquer parallel SAT solvers. Can we develop ML-based methods for clause sharing in portfolio solvers? How about ML-based restarts and clause deletion policies?
Reinforcement learning (RL) and Theorem Provers: There has been some recent success in combining basic RL methods with reasoning methods in the context of higher-order theorem provers. How can this combination be strengthened further to prove math theorems in a completely automated fashion.
Comparison of NN Verification with Testing and Fuzzing Methods: Researchers have developed a variety of fuzzing methods aimed at NNs. These methods often scale better than verification techniques. On the other hand, unlike verification, testing techniques do not give any guarantees. What are the tradeoffs in this context of complete verification vs. scalability? Can we develop hybrid methods and light-weight verification techniques?
Concept Learning and Solvers: Can we lift the ideas of concept learning from NSR to the setting of solvers, especially in the context of higher-order and combinatorial mathematics?.

Synergies. We have also identified the following potential synergies between the ML, Solver, TAV, and NSR communities and expect strong interactions around these points:

ML researchers in general (and RL in particular) can help refine the ML-based methods used by solver developers;
Solver developers can propose constraint-based learning strategies for NNs (e.g., combining constraints with gradient-descent in the back propagation algorithm);
Researchers who work in the space of TAV for NN can benefit greatly by better understanding the realistic security and safety concerns of the ML community;
Solver developer can substantially benefit by better understanding concept learning from NSR researchers.

Expected results and impact on the research community. One of the core goals of the seminar is to bring together the many different research communities that work in the logic reasoning and ML fields, who unfortunately rarely talk to each other. We believe that the exchange of ideas between them – each with their own methods and perspectives – will help accelerate the future development of combinations of ML and logic reasoning. In terms of concrete outcomes, we believe the workshop is likely to lead to several collaboration projects, especially between members of different communities working on similar or related problems. Common benchmarks and regular meeting forums will also be discussed and we expect for some progress there as well.

References

Amel, Kay R. From shallow to deep interactions between knowledge representation, reasoning and machine learning. 13th International Conference Scala Uncertainity Mgmt (SUM 2019), Compiègne, LNCS. 2019.
Goodfellow, Ian J and Shlens, Jonathon and Szegedy, Christian. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572. 2014
Leofante, Francesco and Narodytska, Nina and Pulina, Luca and Tacchella, Armando. Automated verification of neural networks: Advances, challenges and perspectives. arXiv preprint arXiv:1805.09938. 2018
Sotoudeh, Matthew and Thakur, Aditya V. A symbolic neural network representation and its application to understanding, verifying, and patching networks. arXiv preprint arXiv:1908.06223. 2019
Katz, Guy and Barrett, Clark and Dill, David L and Julian, Kyle and Kochenderfer, Mykel J. Reluplex: An efficient SMT solver for verifying deep neural networks. In CAV 2017
Narodytska, Nina and Kasiviswanathan, Shiva and Ryzhyk, Leonid and Sagiv, Mooly and Walsh, Toby. Verifying properties of binarized deep neural networks. AAAI 2018
Samek, Wojciech and Wiegand, Thomas and Müller, Klaus-Robert. . Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296. 2017
Fischer, Marc and Balunovic, Mislav and Drachsler-Cohen, Dana and Gehr, Timon and Zhang, Ce and Vechev, Martin. Dl2: Training and querying neural networks with logic. ICML 2019
LGML: Logic Guided Machine Learning. Scott, Joseph and Panju, Maysum and Ganesh, Vijay. AAAI 2020.
Jia Hui Liang, Hari Govind V. K., Pascal Poupart, Krzysztof Czarnecki and Vijay Ganesh. An Empirical Study of Branching Heuristics Through the Lens of Global Learning Rate. In SAT 2017.
Selsam, Daniel and Lamm, Matthew and Bünz, Benedikt and Liang, Percy and de Moura, Leonardo and Dill, David L. Learning a SAT solver from single-bit supervision. arXiv preprint arXiv:1802.03685. 2018
Kaliszyk, Cezary and Urban, Josef and Michalewski, Henryk and Olšák, Miroslav. Reinforcement learning of theorem proving. In Advances in Neural Information Processing Systems. 2018
Bansal, Kshitij and Loos, Sarah and Rabe, Markus and Szegedy, Christian and Wilcox, Stewart. HOList: An environment for machine learning of higher order logic theorem proving. ICML 2019.

Creative Commons BY 4.0

Sébastien Bardin, Vijay Ganesh, and Somesh Jha

Motivation

Show Motivation

Machine learning (ML) and logical reasoning have been the two key pillars of AI since its inception, and yet, there has been little interaction between these two sub-fields over the years. At the same time, each of them has been very influential in their own way. ML has revolutionized many sub-fields of AI including image recognition, language translation, and game playing, to name just a few. Independently, the field of logical reasoning (e.g., SAT/SMT/CP/first-order solvers and knowledge representation) has been equally impactful in many contexts in software engineering, verification, security, AI, and mathematics. Despite this progress, there are new problems, as well as opportunities, on the horizon that seem solvable only via a combination of ML and logic.

One such problem that requires one to consider combinations of logic and ML is the question of reliability, robustness, and security of ML models. For example, in recent years, many adversarial attacks against ML models have been developed, demonstrating their extraordinary brittleness. How can we leverage logic-based methods to analyze such ML systems with the aim of ensuring their reliability and security? What kind of logical language do we use to specify properties of ML models? How can we ensure that ML models are explainable and interpretable?

In the reverse direction, ML methods have already been successfully applied to making solvers more efficient. In particular, solvers can be modeled as complex combinations of proof systems and ML optimization methods, wherein ML-based heuristics are used to optimally select and sequence proof rules. How can we further deepen this connection between solvers and ML? Can we develop tools that automatically construct proofs for higher math?

To answer these and related questions we are organizing this Dagstuhl Seminar, with the aim of bringing together the many world-leading scientists who are conducting pioneering research at the intersection of logical reasoning and ML, enabling development of novel solutions to problems deemed impossible otherwise.

Creative Commons BY 4.0

Sébastien Bardin, Vijay Ganesh, Somesh Jha, and Joao Marques-Silva

Participants

Show Participants

On-site

Rajeev Alur (University of Pennsylvania - Philadelphia, US) [dblp]
Sébastien Bardin (CEA LIST, FR) [dblp]
Chih-Hong Cheng (Fraunhofer IKS - München, DE) [dblp]
Jonathan Chung (University of Waterloo, CA)
Judith Clymo (University of Liverpool, GB)
Artur d'Avila Garcez (City - University of London, GB) [dblp]
Alhussein Fawzi (Google DeepMind - London, GB)
Marc Fischer (ETH Zürich, CH) [dblp]
Pascal Fontaine (University of Liège, BE) [dblp]
Matt Fredrikson (Carnegie Mellon University - Pittsburgh, US) [dblp]
Vijay Ganesh (University of Waterloo, CA) [dblp]
Sebastian Junges (Radboud University Nijmegen, NL) [dblp]
Chunxiao (Ian) Li (University of Waterloo, CA) [dblp]
Ravi Mangal (Carnegie Mellon University - Pittsburgh, US) [dblp]
Georg Martius (MPI für Intelligente Systeme - Tübingen, DE) [dblp]
Kuldeep S. Meel (National University of Singapore, SG) [dblp]
Grégoire Menguy (CEA LIST - Nano-INNOV, FR)
Matthew Mirman (ETH Zürich, CH)
Anselm Paulus (MPI für Intelligente Systeme - Tübingen, DE)
Markus N. Rabe (Google - Mountain View, US) [dblp]
Joseph Scott (University of Waterloo, CA) [dblp]
Xujie Si (McGill University - Montréal, CA) [dblp]
Armando Tacchella (University of Genova, IT) [dblp]
Hazem Torfah (University of California - Berkeley, US) [dblp]
Caterina Urban (INRIA - Paris, FR) [dblp]
Saranya Vijayakumar (Carnegie Mellon University - Pittsburgh, US)

Remote:

Somesh Jha (University of Wisconsin-Madison, US) [dblp]
Luis C. Lamb (Federal University of Rio Grande do Sul, BR) [dblp]
Vineel Nagisetty (Borealis AI - Toronto, CA)
Sanjit A. Seshia (University of California - Berkeley, US) [dblp]

Related Seminars

Dagstuhl Seminar 14442: Symbolic Execution and Constraint Solving (2014-10-27 - 2014-10-30) (Details)
Dagstuhl Seminar 19062: Bringing CP, SAT and SMT together: Next Challenges in Constraint Solving (2019-02-03 - 2019-02-06) (Details)

Classification

Logic in Computer Science
Machine Learning
Software Engineering

Keywords

SAT/SMT/CP solvers and theorem provers
Testing
Analysis
Verification
and Security of Machine Learning
Neuro-symbolic AI

Seminar 22291

Search the Dagstuhl Website

Schloss Dagstuhl Services

Seminars

Within this website:

External resources:

Publishing

Within this website:

External resources:

dblp

Within this website:

External resources:

Dagstuhl Seminar 22291

Machine Learning and Logical Reasoning: The New Frontier

( Jul 17 – Jul 22, 2022 )

Permalink

Organizers

Contact

Publications

Schedule

Summary

General context: Bringing ML and Logic Reasoning Closer

In-depth Description of Focus Areas

Goals of the Seminar

References

Motivation

Participants

Related Seminars

Classification

Keywords