January 4 – 8, 2021, Dagstuhl Seminar 21011

CANCELLED Media Forensics and the Challenge of Big Data

Due to the Covid-19 pandemic, this seminar was cancelled. A related Dagstuhl Seminar was scheduled for January 8 – 13, 2023 (Seminar 23021).


Irene Amerini (Sapienza University of Rome, IT)
Anderson de Rezende Rocha (State University – Campinas, BR)
Paul Rosin (Cardiff University, GB)
Xianfang Sun (Cardiff University, GB)

For support, please contact

Dagstuhl Service Team


With demanding and sophisticated crimes and terrorist threats becoming more common and pervasive, together with the advent and wide spread of fake news, it becomes paramount to design and develop objective, scientifically based criteria for identifying the characteristics of investigated materials associated with potential criminal activities. We need effective approaches to help us answer the four most important questions in forensics regarding an event: “who”, “in what circumstances”, “why”, and “how”.

In recent years, the rise of social media has resulted in a flood of media content. This poses a challenge, because far more data needs fact-checking, but it also makes it possible to leverage big-data techniques for forensic analysis. This Dagstuhl Seminar will discuss the main aspects related to big data when it comes to the design and development of forensics techniques: What is at stake? How to deal with spurious correlations? How to mitigate social and economic bias? How to come up with fair, accountable, and explainable forensics solutions? In addition, we aim to identify aspects of this research area that deserve more attention and concentrated effort. This seminar covers the following topics:

  • Prior work in media tampering detection has focused on three modes of tampering: retouching, cloning, and splicing. Does an examination of current practices show that new modes of tampering exist?
  • Some existing benchmarks are created by academics and students who are not experts at image manipulation, whereas media tampered with by professionals may be harder to detect than the examples in those benchmarks. Are the current benchmarks realistic and challenging? How should algorithms’ performance be computed?
  • Currently, much work in media forensics is carried out by researchers in fields separated according to the media (e.g., image forensics, audio forensics). What lessons can be learned from a cross-media approach?
  • What are the consequences of developing and applying deep learning in media forensics?
  • What are the different characteristics of media that have been tampered with using “traditional” methods as compared to forgeries generated using deep learning?
  • How to apply media forensics methods outside of academia?
  • How to explore context when analyzing a digital object? How to spot inconsistencies when analyzing a pool of objects rather than just a single one?
  • How to deal with the challenges of big data and the unprecedented amount of available information when designing new solutions?
  • How to develop fair, accountable, unbiased, and explainable solutions that respect legislation such as the General Data Protection Regulation (GDPR, Regulation (EU) 2016/679) and similar regulations?
  • What other methodological issues need to be considered? This will require brainstorming sessions by all participants at the seminar.

The huge amount of data now available has had at least a fourfold impact on media forensics:

  • Scaling up the application of media forensics to huge amounts of data is challenging;
  • Big data has enabled data-driven content generation (visual, textual, auditory), exacerbating the above;
  • There is a split between researchers using data-driven approaches to media forensics and those using traditional (handcrafted) solutions. How can this be bridged or resolved?
  • The design and development of fair, accountable, and explainable forensic solutions are paramount to help us understand the decision-making protocol and also to provide users with fair and explainable decisions.

With all the challenges above, how can we orchestrate the efforts of the research community so that we harness different tools to fight misinformation and the spread of fake content? All of these topics will be touched on during the seminar, raising awareness of them and paving the way for stronger digital forensics methods in the future.

The schedule for the seminar will be:

  • Day 1: Traditional methods
  • Day 2: Deep learning based methods
  • Day 3: Big data
  • Day 4: Benchmark and performance evaluation
  • Day 5: Applications and future directions

Most days will consist of 1 overview talk followed by 5-8 shorter regular talks. In addition, there will be several breakout group discussions and panel discussions.

Motivation text license
  Creative Commons BY 3.0 DE
  Irene Amerini, Anderson de Rezende Rocha, Paul Rosin, and Xianfang Sun


  • Artificial Intelligence
  • Computer Vision And Pattern Recognition
  • Multimedia


  • Image and video forensics
  • Digital forensics
  • Image and video forgery detection
  • Image and video authentication
  • Tampering detection


Each Dagstuhl Seminar and Dagstuhl Perspectives Workshop is documented in the series Dagstuhl Reports. The seminar organizers, in cooperation with the collector, prepare a report that includes contributions from the participants' talks together with a summary of the seminar.




Furthermore, a comprehensive peer-reviewed collection of research papers can be published in the series Dagstuhl Follow-Ups.

Dagstuhl's Impact

Please inform us when a publication results from your seminar. These publications are listed in the category Dagstuhl's Impact and are presented on a special shelf on the ground floor of the library.