January 22–27, 2017, Dagstuhl Seminar 17042

From Characters to Understanding Natural Language (C2NLU): Robust End-to-End Deep Learning for NLP


Phil Blunsom (University of Oxford, GB)
Kyunghyun Cho (New York University, US)
Chris Dyer (Carnegie Mellon University – Pittsburgh, US)
Hinrich Schütze (LMU München, DE)


Dagstuhl Report, Volume 7, Issue 1


Deep learning is currently one of the most active areas of research in machine learning and its applications, including natural language processing (NLP). One hallmark of deep learning is end-to-end learning: all parameters of a deep learning model are optimized directly for the learning objective, for example, accuracy on a binary classification task such as deciding whether the input image shows a cat. Crucially, the set of parameters that are optimized includes "first-layer" parameters that connect the raw input representation (e.g., pixels) to the first layer of internal representations of the network (e.g., edge detectors). In contrast, many other machine learning models employ hand-engineered features to take the role of these first-layer parameters.

Even though deep learning has had a number of successes in NLP, research on true end-to-end learning is just beginning to emerge. Most NLP deep learning models still start with a hand-engineered layer of representation at the level of tokens or words; that is, the input is broken up into units by manually designed tokenization rules. Such rules often fail to capture structure both within tokens (e.g., morphology) and across multiple tokens (e.g., multi-word expressions). Given the success of end-to-end learning in other domains, it is likely that it will also be widely used in NLP to alleviate these issues and lead to great advances.
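
As a rough illustration of the contrast between token-level and character-level input, the following Python sketch may help. The tokenizer rule, the function names and the example sentence are all hypothetical and chosen only for illustration; they do not come from the seminar.

```python
import os.path
import re


def rule_based_tokenize(text):
    """Hypothetical hand-written rule: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_level_input(text):
    """Character-level input: the raw character sequence, with no tokenization rule."""
    return list(text)


def build_vocab(tokens):
    """Assign an arbitrary integer id to each distinct token, as a word-level model would."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}


sentence = "She kicked the bucket."
tokens = rule_based_tokenize(sentence)
chars = char_level_input(sentence)

# Across tokens: the multi-word expression "kicked the bucket" is split into
# three separate units by the rule.
print(tokens)

# Within tokens: morphologically related forms receive unrelated ids at the word
# level, while their character sequences visibly share the stem.
forms = ["kick", "kicked", "kicking"]
vocab = build_vocab(forms)
shared_stem = os.path.commonprefix(forms)
print(vocab, shared_stem)
```

The sketch only shows what information each input representation exposes; whether a model actually learns to exploit the shared characters is a separate, empirical question.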

The seminar brought together researchers from deep learning, general machine learning, natural language processing and computational linguistics to develop a research agenda for the coming years. The goal was to combine recent advances in deep learning architectures and algorithms with extensive domain knowledge about language to make true end-to-end learning for NLP possible.

Our goals were to make progress on answering the following research questions.

  • C2NLU approaches so far fall short of word-level approaches, and hence of the state of the art, in cases where token structure can easily be exploited (e.g., in well-edited newspaper text). What are promising avenues for developing C2NLU to match the state of the art even in these cases of text with well-defined token structures?
  • Character-level models are computationally more expensive than word-level models because detecting syntactic and semantic relationships at the character level is more expensive (though potentially more robust) than at the word level. How can we address the resulting scalability challenges for character-level models?
  • Part of the mantra of deep learning is that domain expertise is no longer necessary. Is this really true or is knowledge about the fundamental properties of language necessary for C2NLU? Even if that expertise is not needed for feature engineering, is it needed to design model architectures, tasks and training regimes?
  • NLP tasks are diverse, ranging from part-of-speech tagging and sentiment analysis to question answering. For which of these problems is C2NLU a promising approach, and for which is it not?
  • More generally, what characteristics make an NLP problem amenable to tokenization-based approaches vs. C2NLU approaches?
  • What specifically can each of the two communities involved (natural language processing and deep learning) contribute to C2NLU?
  • Create an NLP/deep learning roadmap for research in C2NLU over the next 5–10 years.
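
The scalability question in the list above can be made concrete with a small Python sketch. It compares the length of the same text at the word level and the character level, and uses the number of position pairs as one crude proxy for cost. That proxy is an assumption: it fits a model that relates every pair of input positions, which the seminar description does not specify. The example text and function names are likewise hypothetical.

```python
def sequence_lengths(text):
    """Return (number of whitespace-separated words, number of characters)."""
    return len(text.split()), len(text)


def pairwise_interactions(n):
    """Number of unordered position pairs in a sequence of length n (assumed cost proxy)."""
    return n * (n - 1) // 2


text = "end-to-end learning for natural language processing"
n_words, n_chars = sequence_lengths(text)

# The character sequence is several times longer than the word sequence, so any
# cost that grows faster than linearly in sequence length grows much more.
print(n_words, pairwise_interactions(n_words))
print(n_chars, pairwise_interactions(n_chars))
```

For models whose cost is linear in sequence length, the overhead would instead be roughly the ratio of the two lengths.
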
Summary text license
  Creative Commons BY 3.0 Unported license
  Phil Blunsom, Kyunghyun Cho, Chris Dyer, and Hinrich Schütze


  • Artificial Intelligence / Robotics


  • Natural language processing
  • Computational linguistics
  • Deep learning
  • Robustness in learning
  • End-to-end learning
  • Machine learning

