Text mining based on Self-Organizing Map method for Arabic-English documents

Abdulsamad Al-Marghilani*, Hussein Zedan, Aladdin Ayesh

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding (Published conference contribution)

    4 Citations (Scopus)

    Abstract

    Computer-based information retrieval is becoming increasingly sophisticated and is being exploited in more and more spheres of human activity. Many computer applications are developed as information distribution systems, of which the Internet is one of the best known and most widely used. With enormous quantities of data in different languages available on the net, it is essential that more efficient methods of language data extraction are developed. This paper therefore focuses on text mining of multilingual datasets. Arabic is a highly derivational and inflectional language, requiring proper morphological analysis for effective text mining, and yet no standard approach to word stemming has emerged. This work is an attempt towards the development of a tool useful in the analysis of Arabic-English texts, achieved through the multilingual text mining (MTM) of a combined Arabic-English corpus. The project is based on the Self-Organizing Map (SOM) and uses an Arabic-English text corpus as the test-bed. Issues related to Arabic-English text mining, stemming and clustering are discussed. To the authors' knowledge, there is no significant literature available regarding SOM techniques applied to Arabic-English text mining. A framework and the outcome of its implementation are presented.

    Original language: English
    Title of host publication: MAICS 2008 - Proceedings of the 19th Midwest Artificial Intelligence and Cognitive Science Conference
    Pages: 174-181
    Number of pages: 8
    Publication status: Published - 2008
    Event: 19th Midwest Artificial Intelligence and Cognitive Science Conference, MAICS 2008 - Cincinnati, OH, United States
    Duration: 12 Apr 2008 to 13 Apr 2008

    Conference

    Conference: 19th Midwest Artificial Intelligence and Cognitive Science Conference, MAICS 2008
    Country/Territory: United States
    City: Cincinnati, OH
    Period: 12/04/08 to 13/04/08
