Dagstuhl Seminar 25052
From Research to Certification with Data-Driven Medical Decision Support Systems
(Jan 26 – Jan 31, 2025)
Organizers
- Thomas Fuchs (HPI for Digital Health - New York, US)
- Raul Santos-Rodriguez (University of Bristol, GB)
- Kacper Sokol (ETH Zürich, CH)
- Julia E. Vogt (ETH Zürich, CH)
- Sven Wellmann (Universität Regensburg, DE)
Contact
- Andreas Dolzmann (for scientific matters)
- Susanne Bach-Bernhard (for administrative matters)
Shared Documents
- Dagstuhl Materials Page (Use personal credentials as created in DOOR to log in)
Schedule
Seminar Vision
Artificial intelligence has made tremendous strides across many spheres of life; however, deploying this technology in safety-critical domains remains challenging. This Dagstuhl Seminar focuses on clinical practice, where data-driven models can streamline the work of healthcare professionals and democratise access to personalised medicine – and thus have a lasting positive impact on society – but also where deploying such tools without adequate foresight and safeguards can be perilous. This duality – anticipated benefits that may come along with unintended consequences – requires new technologies to be thoroughly vetted before deployment, e.g., through clinical trials and medical certification processes, to avoid harmful fallout. However, fulfilling such regulatory requirements is a lengthy and complex process plagued with many challenges; hence, while prototype systems are becoming increasingly ubiquitous, they often remain indefinitely designated as research tools that can be used exclusively for research purposes. Their lacklustre adoption is compounded by pervasive reproducibility issues; a history of unsafe systems being deployed prematurely; scarce data that are inherently private, difficult to collect or share, and often riddled with numerous biases; and the prevalence of automation promises that never come to fruition. Such hurdles result in healthcare remaining one of the least digitised spheres of life.
A further contributing factor is that predictive systems are often misconstrued as autonomous rather than social and relational, which manifests in a counterproductive drive to match or exceed human-level performance in selected (narrowly or ill-defined) tasks, with the aim of fully automating and replacing humans. This goal has nonetheless repeatedly proven difficult to attain due to brittle predictions whose subpar fairness, interpretability and robustness, as well as ambiguous accountability, are concerning, especially given their potential for harm. By considering the broader organisational and societal context in which data-driven systems are operationalised, we should strive not only to automate and replace (when appropriate and desirable) but also to augment and support human reasoning and decision-making to help people flourish at work, e.g., through human–machine collaboration that preserves people's agency and keeps the attribution of responsibility with them. Such a perspective promises to offer an antidote to the widely reported apprehension of artificial intelligence and to expedite its adoption in safety-critical domains.
Seminar Topic
To address these challenges, our interdisciplinary seminar gathers a broad range of stakeholders – including clinicians, academics and researchers from industry – whose diverse expertise can contribute to charting a novel research agenda for the effective and responsible adoption of artificial intelligence in medicine, given the complex sociotechnical landscape outlined above. Our goal is to identify the best ways of operationalising medical data-driven systems so as to ensure their alignment with the needs and expectations of various stakeholders in healthcare, as well as their seamless integration into real-life clinical workflows, taking a human-centred perspective. Exploring these aspects of artificial intelligence is especially important given that achieving state-of-the-art performance on benchmark tasks often does not directly translate into clinical efficacy and acceptability. To support this objective, we additionally intend to scrutinise relevant evaluation procedures, medical device certification processes, and the practicality of clinical trials involving data-driven algorithms and of their clinical approval, in view of compliance with various laws, rules and regulations as well as societal norms and ethical standards. Throughout the seminar we envisage identifying challenges that can be addressed with current technologies, distilling areas that require further work, and highlighting promising research directions. Finally, the event aims to galvanise an interdisciplinary community dedicated to advancing the meeting's agenda after its conclusion.
Seminar Outcomes
The seminar focused on the challenges of translating medical artificial intelligence (AI) models from research settings to real-world clinical applications. It brought together academic and industry researchers, start-up representatives and practising clinicians to foster a multidisciplinary exchange of ideas.
One of the key highlights of the seminar was an invited keynote by Rich Caruana from Microsoft Research. His presentation on ante-hoc interpretable models emphasised the importance of intelligibility in machine learning for healthcare. This talk sparked significant discussions among the participants and served as a catalyst for many of the conversations that followed.
Throughout the seminar, the participants engaged in a variety of discussions and presentations. Clinicians were invited to share their experiences with data-driven decision support systems, focusing on both success stories and ongoing challenges; they were also encouraged to describe their hopes and vision for the future of such tools. These clinical pitches played a central role in shaping the seminar's core themes, which included research, translation, testing, deployment, monitoring, updating and maintenance of AI systems in healthcare. Additionally, researchers delivered short presentations on their work, providing insights into the state of the art as well as open research problems in clinical AI systems.
A dedicated session for start-ups offered valuable insights into the process of transforming research findings into real-life clinical tools. Among others, entrepreneurs shared their experiences with commercialisation and the regulatory hurdles they encountered. Many discussions revolved around the practical aspects of deploying AI in healthcare settings and the lessons learnt from these experiences.
The seminar also facilitated group work; two dedicated working groups were formed. The first group focused on frameworks for evaluation and (post-deployment) monitoring of clinical AI. The second group explored important criteria to consider when selecting clinical problems for which to develop AI tools; it additionally investigated human factors of medical AI systems and key approaches to improve the interaction between AI and doctors.
Overall, the seminar identified pressing challenges and opportunities in clinical AI research and deployment. Clinicians gained a deeper understanding of AI's capabilities and limitations, while researchers benefited from the exchange of strategies for overcoming integration and adoption barriers. The discussions and findings from the seminar are expected to facilitate smoother transitions from research to clinical AI prototypes, allowing such tools to be tested and deployed in hospitals.
By fostering interdisciplinary collaboration, the seminar laid the groundwork for future innovations in AI-driven clinical decision support systems. The insights shared and connections formed during the event will contribute to ongoing advancements in the field and help bridge the gap between AI research and practical healthcare applications.
Kacper Sokol, Raul Santos-Rodriguez, Julia E. Vogt, and Sven Wellmann
Participants
- Brett Beaulieu-Jones (University of Chicago, US) [dblp]
- Michael Brudno (University of Toronto, CA) [dblp]
- Evangelia Christodoulou (DKFZ - Heidelberg, DE)
- Jeff Clark (IngeniumAI - Bath, GB)
- James Fackler (Johns Hopkins Univ. - Baltimore, US) [dblp]
- Thomas Gärtner (Technische Universität Wien, AT) [dblp]
- Maia Jacobs (Northwestern University - Evanston, US) [dblp]
- Michael Kamp (Universitätsmedizin Essen, DE) [dblp]
- Gilbert Koch (Universitäts-Kinderspital beider Basel, CH)
- Yamuna Krishnamurthy (Phamily - New York, US)
- Fabian Laumer (Scanvio Medical AG, CH)
- Christoph Lippert (Hasso-Plattner-Institut, Universität Potsdam, DE) [dblp]
- Florian Markowetz (University of Cambridge, GB) [dblp]
- Randall Moorman (University of Virginia - Charlottesville, US) [dblp]
- Rajesh Ranganath (NYU Courant Institute of Mathematical Sciences, US) [dblp]
- Patricia Reis Wolfertstetter (KH Barmh. Brüder Klinik St. Hedwig - Regensburg, DE) [dblp]
- Raul Santos-Rodriguez (University of Bristol, GB) [dblp]
- Kacper Sokol (ETH Zürich, CH) [dblp]
- Wouter van Amsterdam (University Medical Center Utrecht, NL)
- Robin Van de Water (Hasso-Plattner-Institut, Universität Potsdam, DE)
- Julia E. Vogt (ETH Zürich, CH) [dblp]
- Sven Wellmann (Universität Regensburg, DE) [dblp]
Classification
- Artificial Intelligence
- Human-Computer Interaction
- Machine Learning
Keywords
- Explainable
- Robust
- Human-compatible
- Clinical Decision Support Tools
- Augmented Intelligence

Creative Commons BY 4.0
