Transfer learning based cross-lingual knowledge extraction for Wikipedia

Zhigang Wang, Zhixing Li, Juanzi Li, Jie Tang, Jeff Z. Pan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

Wikipedia infoboxes are a valuable source of structured knowledge for global knowledge sharing. However, infobox information is very incomplete and imbalanced among the Wikipedias in different languages. It is a promising but challenging problem to utilize the rich structured knowledge from a source language Wikipedia to help complete the missing infoboxes for a target language. In this paper, we formulate the problem of cross-lingual knowledge extraction from multilingual Wikipedia sources, and present a novel framework, called WikiCiKE, to solve this problem. An instance-based transfer learning method is utilized to overcome the problems of topic drift and translation errors. Our experimental results demonstrate that WikiCiKE outperforms the monolingual knowledge extraction method and the translation-based method.

Original language: English
Title of host publication: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Publisher: Association for Computational Linguistics (ACL)
Pages: 641-650
Number of pages: 10
Volume: 1
ISBN (Print): 9781937284503
Publication status: Published - Aug 2013
Event: 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Sofia, Bulgaria
Duration: 4 Aug 2013 to 9 Aug 2013

Conference

Conference: 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013
Country: Bulgaria
City: Sofia
Period: 4/08/13 to 9/08/13

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Wang, Z., Li, Z., Li, J., Tang, J., & Pan, J. Z. (2013). Transfer learning based cross-lingual knowledge extraction for Wikipedia. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 641-650). Association for Computational Linguistics (ACL).