Dagstuhl Seminar 21052: Privacy in Speech and Language Technology

Dagstuhl Seminar 21052

Privacy in Speech and Language Technology (Cancelled)

(Jan 31 – Feb 05, 2021)

Permalink

Please use the following short url to reference this page: https://www.dagstuhl.de/21052

Replacement

Dagstuhl Seminar 22342: Privacy in Speech and Language Technology (2022-08-21 - 2022-08-26)

Organizers

Simone Fischer-Hübner (Karlstad University, SE)
Dietrich Klakow (Universität des Saarlandes, DE)
Peggy Valcke (KU Leuven, BE)
Emmanuel Vincent (INRIA Nancy - Grand Est, FR)

Contact

Michael Gerke (for scientific matters)
Simone Schilke (for administrative matters)

Motivation

In the last few years, voice assistants have become the preferred means of interacting with smart devices and services. Chatbots and related technologies such as automated translation or typing prediction are also widely used. These technologies often rely on cloud-based machine learning systems trained on speech or text data collected from the users.

The recording, storage and processing of users' speech or text data poses severe privacy threats. This data contains a wealth of personal information about the user, e.g., personality, ethnicity and health state, that may be (mis)used for targeted processing or advertising. It also carries information about the user's identity, which an attacker could exploit to impersonate the user. News articles exposing these threats to the general public have made national headlines.

A new generation of privacy-preserving speech and language technologies is needed, one that ensures user privacy while still providing users with the same benefits and companies with the training data needed to develop these technologies. Recent regulations such as the European General Data Protection Regulation (GDPR), which promotes the principle of privacy-by-design, have further fueled interest. Yet efforts in this direction have suffered from a lack of collaboration across research communities. These efforts include the development of encryption tools such as homomorphic encryption and secure multiparty computation, machine learning frameworks such as federated or decentralized learning, and anonymization techniques targeting speech and language specifically. Privacy in speech and language technology has also recently attracted the interest of law researchers and data protection authorities.

To the best of our knowledge, this Dagstuhl Seminar will be the first event that aims to bring together academic researchers, industry representatives, and policy makers in the fields of speech processing, natural language processing, privacy-enhancing technologies (PETs), machine learning, and law and ethics, in order to devise cross-disciplinary solutions. The questions to be addressed include (but are not limited to) the following:

What are the threats to privacy arising from the recording, storage and processing of user-generated speech and language data? What is their probability of occurrence and their impact?
What are the related ethical and moral issues?
How shall those threats be translated into actionable, formal privacy models? Do existing general-purpose privacy models apply or are new domain-specific models needed?
Which existing PETs can be leveraged to address privacy requirements regarding raw speech and language data? How shall they be combined into holistic solutions?
How should secondary data, e.g., models trained on raw data, be treated?
Which new PETs are being developed? Can they benefit from cross-disciplinary collaboration?
What privacy goals can these PETs achieve? Which metrics shall be used to assess their success?
How shall these PETs be implemented in practice, so as to provide transparent information and management capabilities to the users? How can formal guarantees be made and explained?
What are the expected limitations of these PETs? What is the research roadmap to address them?
How will privacy laws affect these new developments? Conversely, how will they be impacted by these new developments?

The Dagstuhl Seminar will consist of a mix of plenary talks and subgroup discussions aiming to achieve a shared understanding of problems and solutions and to sketch a cross-disciplinary roadmap, which we hope to publish as a joint position paper. In addition, multiple breaks will give invitees the opportunity to socialize and start new cross-disciplinary collaborations.

D. Klakow and E. Vincent acknowledge support from the European Union's Horizon 2020 Research and Innovation Program within project COMPRISE "Cost-effective, multilingual, privacy-driven voice-enabled services" (www.compriseh2020.eu).

Creative Commons BY 3.0 DE

Simone Fischer-Hübner, Dietrich Klakow, Peggy Valcke, and Emmanuel Vincent

Participants

Simone Fischer-Hübner (Karlstad University, SE)
Dietrich Klakow (Universität des Saarlandes, DE)
Peggy Valcke (KU Leuven, BE)
Emmanuel Vincent (INRIA Nancy - Grand Est, FR)

Classification

Computation and Language
Computers and Society
Cryptography and Security

Keywords

Speech and language technology
Privacy
Data protection
Privacy-enhancing technologies
Law and policy
