Automatically learning cognitive status for multi-document summarization of newswire

Ani Nenkova, Advaith Siddharthan, Kathleen McKeown

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

12 Citations (Scopus)

Abstract

Machine summaries can be improved by using knowledge about the cognitive status of news article referents. In this paper, we present an approach to automatically acquiring distinctions in cognitive status using machine learning over the forms of referring expressions appearing in the input. We focus on modeling references to people, both because news stories often revolve around people and because existing natural language tools for named entity identification are reliable. We examine two specific distinctions: whether a person in the news can be assumed to be known to a target audience (hearer-old vs. hearer-new) and whether a person is a major character in the news story. We report on machine learning experiments that show that these distinctions can be learned with high accuracy, and validate our approach using human subjects.
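
For illustration only, and not taken from the paper: the approach the abstract describes amounts to supervised classification of people in the input, using surface features of the referring expressions that mention them. The feature set, labels, and choice of scikit-learn's DecisionTreeClassifier below are assumptions made for this sketch, not the authors' reported setup.

```python
# Illustrative sketch (not the authors' code): classify whether a person
# mentioned in the input news articles is hearer-old or hearer-new, based on
# simple surface features of the referring expressions used for that person.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-person feature vectors derived from the input documents:
# [total mentions,
#  1 if the first mention is a bare name with no description,
#  1 if any mention carries an appositive or descriptive modifier,
#  number of input documents that mention the person]
X_train = [
    [14, 1, 0, 4],  # frequently mentioned, introduced by bare name
    [2,  0, 1, 1],  # rarely mentioned, introduced with a description
    [9,  1, 0, 3],
    [1,  0, 1, 1],
]
y_train = ["hearer-old", "hearer-new", "hearer-old", "hearer-new"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Score the referring-expression profile of an unseen person.
print(clf.predict([[11, 1, 0, 5]]))  # e.g. ['hearer-old']
```

The same setup, with a different label set, would cover the second distinction the abstract mentions (major vs. minor character in the story).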
Original language: English
Title of host publication: Proceedings of the Conference on Human Language Technology / Empirical Methods in Natural Language Processing (HLT/EMNLP)
Place of publication: Vancouver, Canada
Publication status: Published - 2005
Event: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP) - Vancouver, Canada
Duration: 6 Oct 2005 - 8 Oct 2005

Conference

Conference: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP)
Country/Territory: Canada
City: Vancouver
Period: 6/10/05 - 8/10/05

Bibliographical note

Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP), October 6-8, 2005, Vancouver, B.C., Canada
