Bootstrapping Yahoo! Finance by Wikipedia for competitor mining

Tong Ruan*, Lijuan Xue, Haofen Wang, Jeff Z. Pan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Competitive intelligence, a key factor in enterprise risk management and decision support, depends on knowledge bases that contain a large amount of competitive information. A variety of finance websites have collected competitive information manually and can serve as such knowledge bases; Yahoo! Finance is one of the largest and most successful among them. However, these websites suffer from incompleteness, a lack of competitive domains, and delayed updates. Wikipedia, which is built from collective wisdom and contains plenty of useful information in various forms, can address these problems effectively and thus help build a more comprehensive knowledge base. In this paper, we propose a novel semi-supervised approach that identifies competitor information and competitive domains from Wikipedia based on a multi-strategy learning algorithm. More precisely, we use seeds of competition between companies and competition between products to distantly supervise the learning of text patterns in free text. Because competitive information can also be inferred from events, we design a learning-based method to identify event description sentences. The whole process is performed iteratively. Experimental results show the effectiveness of our approach. Moreover, the results extracted from Wikipedia supplement Yahoo! Finance with 14,000 competitor pairs and 8,000 competitive domains between rival companies.
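
The seed-driven, iterative pattern learning mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the general idea of bootstrapping text patterns from seed competitor pairs, and it leaves out the product-level seeds, event sentence classification, and multi-strategy learning the abstract describes. The function names, the `max_words` cutoff, and the company names and sentences are all hypothetical placeholders.

```python
import re

# Hypothetical toy corpus and company list (placeholders, not data from the paper).
SENTENCES = [
    "Acme competes with Globex in the widget market.",
    "Initech competes with Globex in enterprise software.",
    "Initech is a major rival of Globex.",
    "Globex is a major rival of Hooli.",
]
COMPANIES = ["Acme", "Globex", "Initech", "Hooli"]


def learn_patterns(sentences, pairs, max_words=6):
    """Distant supervision: any text found between two known competitors
    in one sentence is taken as a candidate competition pattern."""
    patterns = set()
    for sent in sentences:
        for a, b in pairs:
            for x, y in ((a, b), (b, a)):  # try both mention orders
                m = re.search(re.escape(x) + r"\s+(.+?)\s+" + re.escape(y), sent)
                if m and len(m.group(1).split()) <= max_words:
                    patterns.add(m.group(1).lower())
    return patterns


def apply_patterns(sentences, patterns, companies):
    """Find company pairs that are joined by any learned pattern."""
    found = set()
    for sent in sentences:
        for pat in patterns:
            for a in companies:
                for b in companies:
                    if a == b:
                        continue
                    rx = re.escape(a) + r"\s+" + re.escape(pat) + r"\s+" + re.escape(b)
                    if re.search(rx, sent, re.IGNORECASE):
                        found.add(tuple(sorted((a, b))))
    return found


def bootstrap(sentences, seed_pairs, companies, rounds=3):
    """Alternate pattern learning and pair extraction until nothing new appears."""
    pairs = {tuple(sorted(p)) for p in seed_pairs}
    for _ in range(rounds):
        patterns = learn_patterns(sentences, pairs)
        new_pairs = apply_patterns(sentences, patterns, companies) - pairs
        if not new_pairs:
            break
        pairs |= new_pairs
    return pairs


if __name__ == "__main__":
    print(sorted(bootstrap(SENTENCES, [("Acme", "Globex")], COMPANIES)))
```

Starting from the single seed pair (Acme, Globex), the first round learns the pattern "competes with" and finds (Globex, Initech); that new pair then yields "is a major rival of" in the second round, which in turn surfaces (Globex, Hooli). A real system would need pattern scoring to limit the semantic drift that this kind of unfiltered bootstrapping is prone to.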

Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 5th Joint International Conference, JIST 2015, Yichang, China, November 11-13, 2015, Revised Selected Papers
Editors: Guilin Qi, Kouji Kozaki, Jeff Z. Pan, Siwei Yu
Publisher: Springer-Verlag
Pages: 108-126
Number of pages: 19
ISBN (Print): 9783319316758
DOIs: https://doi.org/10.1007/978-3-319-31676-5_8
Publication status: Published - 2016
Event: 5th Joint International Conference on Semantic Technology, JIST 2015 - Yichang, China
Duration: 11 Nov 2015 to 13 Nov 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9544
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th Joint International Conference on Semantic Technology, JIST 2015
Country: China
City: Yichang
Period: 11/11/15 to 13/11/15

Fingerprint

Wikipedia
Bootstrapping
Finance
Mining
Websites
Knowledge Base
Industry
Competitive intelligence
Risk management
Learning algorithms
Seed
Incompleteness
Risk Management
Decision Support
Learning Process
Leverage
Updating
Learning Algorithm
Experimental Results

Keywords

  • Competitor mining
  • Distant supervision
  • Multi-strategy learning
  • Relation reasoning

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Ruan, T., Xue, L., Wang, H., & Pan, J. Z. (2016). Bootstrapping Yahoo! Finance by Wikipedia for competitor mining. In G. Qi, K. Kozaki, J. Z. Pan, & S. Yu (Eds.), Semantic Technology: 5th Joint International Conference, JIST 2015, Yichang, China, November 11-13, 2015, Revised Selected Papers (pp. 108-126). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9544). Springer-Verlag. https://doi.org/10.1007/978-3-319-31676-5_8

@inproceedings{bdd53f3a73624cbbb7212fef2e6be01c,
title = "Bootstrapping {Yahoo! Finance} by {Wikipedia} for competitor mining",
abstract = "Competitive intelligence, one of the key factors of enterprise risk management and decision support, depends on knowledge bases that contain a large amount of competitive information. A variety of finance websites have collected competitive information manually, which can be used as knowledge bases. Yahoo! Finance is one of the largest and most successful finance websites among them. However, they have problems of incompleteness, lack of competitive domain, and not-in-time updating. Wikipedia, which was built with collective wisdom and contains plenty of useful information in various forms, can solve the above-mentioned problems effectively, thus helping build a more comprehensive knowledge base. In this paper, we propose a novel semi-supervised approach to identify competitor information and competitive domain from Wikipedia based on a multi-strategy learning algorithm. More precisely, we leverage seeds of competition between companies and competition between products to distantly supervise the learning process to find text patterns in free texts. Considering that competitive information can be inferred from events, we design a learning-based method to determine event description sentences. The whole process is iteratively performed. The experimental results show the effectiveness of our approach. Moreover, the results extracted from Wikipedia supplement 14,000 competitor pairs and 8,000 competitive domains between rival companies to Yahoo! Finance.",
keywords = "Competitor mining, Distant supervision, Multi-strategy learning, Relation reasoning",
author = "Tong Ruan and Lijuan Xue and Haofen Wang and Pan, {Jeff Z.}",
year = "2016",
doi = "10.1007/978-3-319-31676-5_8",
language = "English",
isbn = "9783319316758",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "108--126",
editor = "Guilin Qi and Kouji Kozaki and Pan, {Jeff Z.} and Siwei Yu",
booktitle = "Semantic Technology",

}

TY - GEN

T1 - Bootstrapping Yahoo! Finance by Wikipedia for competitor mining

AU - Ruan, Tong

AU - Xue, Lijuan

AU - Wang, Haofen

AU - Pan, Jeff Z.

PY - 2016

Y1 - 2016

N2 - Competitive intelligence, one of the key factors of enterprise risk management and decision support, depends on knowledge bases that contain a large amount of competitive information. A variety of finance websites have collected competitive information manually, which can be used as knowledge bases. Yahoo! Finance is one of the largest and most successful finance websites among them. However, they have problems of incompleteness, lack of competitive domain, and not-in-time updating. Wikipedia, which was built with collective wisdom and contains plenty of useful information in various forms, can solve the above-mentioned problems effectively, thus helping build a more comprehensive knowledge base. In this paper, we propose a novel semi-supervised approach to identify competitor information and competitive domain from Wikipedia based on a multi-strategy learning algorithm. More precisely, we leverage seeds of competition between companies and competition between products to distantly supervise the learning process to find text patterns in free texts. Considering that competitive information can be inferred from events, we design a learning-based method to determine event description sentences. The whole process is iteratively performed. The experimental results show the effectiveness of our approach. Moreover, the results extracted from Wikipedia supplement 14,000 competitor pairs and 8,000 competitive domains between rival companies to Yahoo! Finance.

KW - Competitor mining

KW - Distant supervision

KW - Multi-strategy learning

KW - Relation reasoning

UR - http://www.scopus.com/inward/record.url?scp=84961589567&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-31676-5_8

DO - 10.1007/978-3-319-31676-5_8

M3 - Conference contribution

AN - SCOPUS:84961589567

SN - 9783319316758

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 108

EP - 126

BT - Semantic Technology

A2 - Qi, Guilin

A2 - Kozaki, Kouji

A2 - Pan, Jeff Z.

A2 - Yu, Siwei

PB - Springer-Verlag

ER -