October 17 – 22, 1999, Dagstuhl Seminar 99421
Efficient Language Processing with High-level Grammar Formalisms
H. Uszkoreit (Saarbrücken), J.-I. Tsujii (Tokyo)
The topic of this Dagstuhl Seminar is human language processing with sophisticated models of grammar. During the last decade many researchers have abandoned linguistically sophisticated models of grammar such as HPSG and LFG in favor of shallow processing techniques. The turn was caused by the sobering insight that, even after many years of system development, the existing methods for deep grammatical processing with powerful grammar formalisms still did not meet the performance criteria posed by applied research. Neither efficiency nor robustness proved sufficient for realistic applications.
On the other hand, progress in shallow processing has demonstrated that many useful applications of language technology can be achieved without accurate deep processing. Some of the shallow methods also exhibit great potential for the automatic acquisition of language models. Thus large parts of the discipline arrived at the conclusion that linguistic grammar models are not suited for the efficient and robust processing of human language on the computer.
However, not all researchers followed this move. Most of the groups that continued research on and with declarative grammar formalisms were driven by linguistic motivations. Others maintained their belief in the prospects of the grammar formalisms because they expected that developments in hardware and software technology, better performance models and progress in computational semantics would eventually overcome the existing problems.
At a small number of centers considerable efforts were invested in the search for better processing methods. New results from several areas of computer science were exploited. Methods from constraint-logic programming, compilation technology, probabilistic language processing and many other sources were investigated.
Some of the efforts concentrated on the combination of many small improvements; others focussed on the search for radically different processing models.
The diversity of the approaches investigated and the claims of noticeable progress in efficiency call for a new assessment of the state of the art. Some questions need to be asked:
- Have there been any real breakthroughs, can they be expected in the near future, or does all progress in the area consist of a sequence of numerous small steps?
- Are there any processing systems based on sophisticated grammar models that already meet the demands of realistic applications?
- Have the various attempts to combine statistical and linguistic methods started to bear fruit?
- Do we have methods for grammar acquisition that can replace or complement manual grammar engineering?
- Do the employed grammar models pose similar problems for efficient and robust processing or do they differ in interesting ways with respect to their potential for computational processing?
- Is their worst-case complexity directly related to the efficiency problems encountered in existing systems?
- Do we have a more sophisticated view today on the sources of real performance problems than we had ten years ago?
- Will the methods for both grammatical description and grammatical processing have to change drastically before deep processing can be the basis of useful applications?
- If so, which recent approaches and results from computer science, psycholinguistics or theoretical linguistics can be expected to feed into this development?
The seminar will bring together experienced researchers from many parts of the world. Several grammar models and the major approaches in performance modeling will be represented. The meeting will serve as a forum for presenting results, exchanging ideas and opinions, discussing new approaches, sharing experiences and assessing the state of the art. In the selection of presentations, priority will be given to reports of new results that are supported by performance measurements. Facilities for demonstrating systems will be provided. In addition, there will be a small number of topical talks summarizing relevant developments in broader research areas.