Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors

Advaith Siddharthan, Kathleen McKeown

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)

Abstract

In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.
Original language: English
Title of host publication: Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing (HLT-EMNLP)
Place of Publication: Vancouver, Canada
Publication status: Published - 2005
Event: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP) - Vancouver, Canada
Duration: 6 Oct 2005 to 8 Oct 2005

Conference

Conference: Human Language Technology Conference (HLT), Conference on Empirical Methods in Natural Language Processing (EMNLP)
Country: Canada
City: Vancouver
Period: 6/10/05 to 8/10/05

Fingerprint

Redundancy
Syntactics

Cite this

Siddharthan, A., & McKeown, K. (2005). Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors. In Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP) Vancouver, Canada.

@inproceedings{d62c03ffbff44a24af21f7280b011ece,
title = "Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors",
abstract = "In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.",
author = "Advaith Siddharthan and Kathleen McKeown",
note = "Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), October 6-8, 2005, Vancouver, B.C., Canada",
year = "2005",
language = "English",
booktitle = "Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP)",

}

TY - GEN

T1 - Improving Multilingual Summarization

T2 - Using Redundancy in the Input to Correct MT errors

AU - Siddharthan, Advaith

AU - McKeown, Kathleen

N1 - Human Language Technology Conference(HLT), Conference on Empirical Methods in Natural Language Processing(EMNLP), October 6-8, 2005, Vancouver, B.C., Canada

PY - 2005

Y1 - 2005

N2 - In this paper, we use the information redundancy in multilingual input to correct errors in machine translation and thus improve the quality of multilingual summaries. We consider the case of multidocument summarization, where the input documents are in Arabic, and the output summary is in English. Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents. Further, the use of multiple machine translation systems provides yet more redundancy, yielding different ways to realize that information in English. We demonstrate how errors in the machine translations of the input Arabic documents can be corrected by identifying and generating from such redundancy, focusing on noun phrases.

M3 - Conference contribution

BT - Conference on Human Language Technology Conference / Empirical Methods in Natural Language Processing(HLT-EMNLP)

CY - Vancouver, Canada

ER -