Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors

Advaith Siddharthan; Kathleen McKeown

Computing Science

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

13 Citations (Scopus)

Abstract

In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.

Original language	English
Title of host publication	Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP)
Place of Publication	Vancouver, Canada
Publication status	Published - 2005
Event	Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP) - Vancouver, Canada; Duration: 6 Oct 2005 → 8 Oct 2005

Conference

Conference	Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP)
Country/Territory	Canada
City	Vancouver
Period	6/10/05 → 8/10/05

Bibliographical note

Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), October 6-8, 2005, Vancouver, B.C., Canada

Access to Document

http://userweb.cs.utexas.edu/~ml/HLT-EMNLP05/

Cite this

Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors. / Siddharthan, Advaith; McKeown, Kathleen.
Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP). Vancouver, Canada, 2005.

Siddharthan, A & McKeown, K 2005, Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors. in Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP). Vancouver, Canada, Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), Vancouver, Canada, 6/10/05. <http://userweb.cs.utexas.edu/~ml/HLT-EMNLP05/>

@inproceedings{d62c03ffbff44a24af21f7280b011ece,
title = "Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors",
abstract = "In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.",
author = "Advaith Siddharthan and Kathleen McKeown",
note = "Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), October 6-8, 2005, Vancouver, B.C., Canada; Conference date: 06-10-2005 Through 08-10-2005",
year = "2005",
language = "English",
booktitle = "Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP)",
}

TY - GEN

T1 - Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors

T2 - Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP)

AU - Siddharthan, Advaith

AU - McKeown, Kathleen

N1 - Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), October 6-8, 2005, Vancouver, B.C., Canada

PY - 2005

Y1 - 2005

AB - In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.

M3 - Published conference contribution

BT - Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP)

CY - Vancouver, Canada

Y2 - 6 October 2005 through 8 October 2005

ER -
