Automatically learning cognitive status for multi-document summarization of newswire

Ani Nenkova, Advaith Siddharthan, Kathleen McKeown

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

12 Citations (Scopus)

Abstract

Machine summaries can be improved by using knowledge about the cognitive status of news article referents. In this paper, we present an approach to automatically acquiring distinctions in cognitive status using machine learning over the forms of referring expressions appearing in the input. We focus on modeling references to people, both because news stories often revolve around people and because existing natural language tools for named entity identification are reliable. We examine two specific distinctions: whether a person in the news can be assumed to be known to a target audience (hearer-old vs. hearer-new) and whether a person is a major character in the news story. We report on machine learning experiments that show that these distinctions can be learned with high accuracy, and validate our approach using human subjects.
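
For illustration only, and not taken from the paper: the approach the abstract describes amounts to supervised classification of people in the input, using surface features of the referring expressions that mention them. The feature set, labels, and choice of scikit-learn's DecisionTreeClassifier below are assumptions made for this sketch, not the authors' reported setup.

```python
# Illustrative sketch (not the authors' code): classify whether a person
# mentioned in the input news articles is hearer-old or hearer-new, based on
# simple surface features of the referring expressions used for that person.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-person feature vectors derived from the input documents:
# [total mentions,
#  1 if the first mention is a bare name with no description,
#  1 if any mention carries an appositive or descriptive modifier,
#  number of input documents that mention the person]
X_train = [
    [14, 1, 0, 4],  # frequently mentioned, introduced by bare name
    [2,  0, 1, 1],  # rarely mentioned, introduced with a description
    [9,  1, 0, 3],
    [1,  0, 1, 1],
]
y_train = ["hearer-old", "hearer-new", "hearer-old", "hearer-new"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Score the referring-expression profile of an unseen person.
print(clf.predict([[11, 1, 0, 5]]))  # e.g. ['hearer-old']
```

The same setup, with a different label set, would cover the second distinction the abstract mentions (major vs. minor character in the story).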
Original language: English
Title of host publication: Proceedings of the Conference on Human Language Technology / Empirical Methods in Natural Language Processing (HLT/EMNLP)
Place of publication: Vancouver, Canada
Publication status: Published - 2005
Event: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP) - Vancouver, Canada
Duration: 6 Oct 2005 - 8 Oct 2005

Conference

Conference: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP)
Country/Territory: Canada
City: Vancouver
Period: 6/10/05 - 8/10/05

Bibliographical note

Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP), October 6-8, 2005, Vancouver, B.C., Canada
