Harnessing the crowds for automating the identification of Web APIs

Carlos Pedrinaci; Dong Liu; Chenghua Lin; John Domingue

Computing Science

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

3 Citations (Scopus)

Abstract

Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grow. Yet, a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time-consuming even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation crawler and search engine. This paper presents two main contributions. First, we have devised and exploited crowdsourcing techniques to generate a curated dataset of Web API documentation. Second, thanks to this dataset, we have devised an engine able to automatically detect documentation pages. Our preliminary experiments have shown that we obtain an accuracy of 80% and a precision increase of 15 points over a keyword-based heuristic we have used as a baseline.

Original language	English
Title of host publication	Papers from the 2012 AAAI Spring Symposium
Subtitle of host publication	Technical Report SS-12-04
Place of Publication	Palo Alto, California
Publisher	AAAI Press
Pages	58-63
Number of pages	6
ISBN (Print)	978-1-57735-553-3
Publication status	Published - 2012

Access to Document

http://www.aaai.org/ocs/index.php/SSS/SSS12/paper/viewFile/4324/4661

Cite this

@inproceedings{4774e71c10c143a186a2ba705a37016f,

title = "Harnessing the crowds for automating the identification of Web APIs",

abstract = "Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grow. Yet, a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time-consuming even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation crawler and search engine. This paper presents two main contributions. First, we have devised and exploited crowdsourcing techniques to generate a curated dataset of Web API documentation. Second, thanks to this dataset, we have devised an engine able to automatically detect documentation pages. Our preliminary experiments have shown that we obtain an accuracy of 80% and a precision increase of 15 points over a keyword-based heuristic we have used as a baseline.",

author = "Carlos Pedrinaci and Dong Liu and Chenghua Lin and John Domingue",

year = "2012",

language = "English",

isbn = "978-1-57735-553-3",

pages = "58--63",

booktitle = "Papers from the 2012 AAAI Spring Symposium",

publisher = "AAAI Press",

}

TY - GEN

T1 - Harnessing the crowds for automating the identification of Web APIs

AU - Pedrinaci, Carlos

AU - Liu, Dong

AU - Lin, Chenghua

AU - Domingue, John

PY - 2012

Y1 - 2012

N2 - Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grow. Yet, a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time-consuming even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation crawler and search engine. This paper presents two main contributions. First, we have devised and exploited crowdsourcing techniques to generate a curated dataset of Web API documentation. Second, thanks to this dataset, we have devised an engine able to automatically detect documentation pages. Our preliminary experiments have shown that we obtain an accuracy of 80% and a precision increase of 15 points over a keyword-based heuristic we have used as a baseline.

AB - Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grow. Yet, a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time-consuming even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation crawler and search engine. This paper presents two main contributions. First, we have devised and exploited crowdsourcing techniques to generate a curated dataset of Web API documentation. Second, thanks to this dataset, we have devised an engine able to automatically detect documentation pages. Our preliminary experiments have shown that we obtain an accuracy of 80% and a precision increase of 15 points over a keyword-based heuristic we have used as a baseline.

M3 - Published conference contribution

SN - 978-1-57735-553-3

SP - 58

EP - 63

BT - Papers from the 2012 AAAI Spring Symposium

PB - AAAI Press

CY - Palo Alto, California

ER -
