
Proposals

 

Title:

Automatic Text Summarization: Past, Present, and Future

Lecturer(s): Horacio Saggion (NLP Group, Computer Science Department, University of Sheffield)
Type: Introductory Course
Section: Language and Computation
Week: First
Time: 17.00-18.30 (Slot 4)
Webpage: http://www.dcs.shef.ac.uk/~saggion/
Room: EM 1.82


Description

The overwhelming quantity of available information and the need to
access the essential content of documents accurately to satisfy users'
demands have made automatic abstracting a major research area. Even
though some approaches to text summarization produce acceptable
summaries for specific tasks, it is generally agreed that the problem of
coherent selection and expression of information in text summarization
is far from being resolved. It therefore remains an active research
topic, as demonstrated by the TIDES (Translingual Information Detection,
Extraction and Summarization) program and the DUC (Document
Understanding Conferences) evaluations. In this course, I will give a
detailed account of methods and techniques used in automatic text
summarization.

Outline of the course:

*Basic concepts: summary typology and examples; human factors in the
production of summaries; summarization by professional abstractors. 

*Summarization by sentence extraction: theoretical framework; superficial
features: indicative phrases, term distribution, title, position, etc.;
machine learning for sentence extraction; advantages and disadvantages of
extraction systems.

*Summarization by abstraction: artificial intelligence methods;
summarization by information extraction; text generation; language models;
advantages and disadvantages of abstraction systems.

*Multidocument summarization: morphological methods; syntactic methods;
semantic methods; bibliography generation.

*Evaluation: intrinsic and extrinsic methods; advantages and
disadvantages; the SUMMAC evaluation; the Document Understanding
Conference (DUC) evaluation and the new 2005-2007 Roadmap on Text
Summarization.

*The future of text summarization: linguistic resources; new
techniques; automatic summarization and question answering (QA);
summarization and the Semantic Web; more on abstraction methods. Research
topics for Master's and doctoral students.




 

© ESSLLI 2005 Organising Committee 2004-12-01