Analysing fake news titles for 2016 Trump-Hillary campaign using contextual-based approaches in text analytics

Azwa Bin Abdul Aziz; Andrew Starkey

doi:10.14445/22315381/CATI1P219

Universiti Sultan Zainal Abidin

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Text analytics is the process of transforming unstructured text data into meaningful information that can be used for fact-based decision making. It is widely used for sentiment analysis, summarising text or searching for useful information from the web. Existing approaches such as machine learning or natural language processing techniques have been proven to obtain significant information from massive amounts of text data. However, these approaches can have issues of obtaining sufficiently accurate results during training or the limitation of linguistic resources for the understanding of slang or acronyms for example. Thus, we propose a new method called Contextual Analysis (CA) that accentuates the relationship of the words and sources that are used for analysis. This approach will create a self-learned knowledge tree of contextual information, based on where words appear in the underlying sources. CA provides an understanding of the degree of relationship between the context of words which is a new technique to understand textual data sources. To evaluate CA techniques, 2000 news items are used that contain fake and actual news during the 2016 Trump-Hillary campaign. The results are compared with other prominent Supervised Machine Learning (SML) techniques. CA matched the best classification performance and achieved the best performance of 0.81 accuracy for fake news prediction. Moreover, CA provides a Hierarchical Knowledge Tree (HKT) that helps to understand the context of words used in both fake and real news and is one of the important findings of this method. The experimental results demonstrate that CA has the potential to undertake classification tasks and at the same time reveal the contextual relationship and hierarchy of words which improves upon existing ML methods that treat each word as independent.

Original language	English
Article number	CATI1P219
Number of pages	9
Journal	International Journal of Advanced Research Trends in Engineering and Technology
Volume	CAT PART - 1 2020
Issue number	Editor's Issue
DOIs	https://doi.org/10.14445/22315381/CATI1P219
Publication status	Published - 23 Oct 2020

Bibliographical note

ACKNOWLEDGEMENT
This research work is supported by UniSZA Research Management, Innovation & Commercialization Centre (RMIC).

Keywords

Contextual Analytics
fake news
hierarchical knowledge tree
text analytics
supervised machine learning

Access to Document

10.14445/22315381/CATI1P219 (Licence: Other)

http://www.ijettjournal.org/special-issues/cat-part-1 (Licence: Other)

Cite this

Analysing fake news titles for 2016 Trump-Hillary campaign using contextual-based approaches in text analytics. / Bin Abdul Aziz, Azwa; Starkey, Andrew.
In: International Journal of Advanced Research Trends in Engineering and Technology, Vol. CAT PART - 1 2020, No. Editor's Issue, CATI1P219, 23.10.2020.

@article{a41043a14f4e4e1ba9fd58aca7bf69ec,

title = "Analysing fake news titles for 2016 Trump-Hillary campaign using contextual-based approaches in text analytics",

abstract = "Text analytics is the process of transforming unstructured text data into meaningful information that can be used for fact-based decision making. It is widely used for sentiment analysis, summarising text or searching for useful information from the web. Existing approaches such as machine learning or natural language processing techniques have been proven to obtain significant information from massive amounts of text data. However, these approaches can have issues of obtaining sufficiently accurate results during training or the limitation of linguistic resources for the understanding of slang or acronyms for example. Thus, we propose a new method called Contextual Analysis (CA) that accentuates the relationship of the words and sources that are used for analysis. This approach will create a self-learned knowledge tree of contextual information, based on where words appear in the underlying sources. CA provides an understanding of the degree of relationship between the context of words which is a new technique to understand textual data sources. To evaluate CA techniques, 2000 news items are used that contain fake and actual news during the 2016 Trump-Hillary campaign. The results are compared with other prominent Supervised Machine Learning (SML) techniques. CA matched the best classification performance and achieved the best performance of 0.81 accuracy for fake news prediction. Moreover, CA provides a Hierarchical Knowledge Tree (HKT) that helps to understand the context of words used in both fake and real news and is one of the important findings of this method. The experimental results demonstrate that CA has the potential to undertake classification tasks and at the same time reveal the contextual relationship and hierarchy of words which improves upon existing ML methods that treat each word as independent.",

keywords = "Contextual Analytics, fake news, hierarchical knowledge tree, text analytics, supervised machine learning",

author = "{Bin Abdul Aziz}, Azwa and Starkey, Andrew",

note = "ACKNOWLEDGEMENT This research work is supported by UniSZA Research Management, Innovation \& Commercialization Centre (RMIC).",

year = "2020",

month = oct,

day = "23",

doi = "10.14445/22315381/CATI1P219",

language = "English",

volume = "CAT PART - 1 2020",

journal = "International Journal of Advanced Research Trends in Engineering and Technology",

issn = "2394-3777",

number = "Editor's Issue",

}

TY - JOUR

T1 - Analysing fake news titles for 2016 Trump-Hillary campaign using contextual-based approaches in text analytics

AU - Bin Abdul Aziz, Azwa

AU - Starkey, Andrew

N1 - ACKNOWLEDGEMENT This research work is supported by UniSZA Research Management, Innovation & Commercialization Centre (RMIC).

PY - 2020/10/23

Y1 - 2020/10/23

N2 - Text analytics is the process of transforming unstructured text data into meaningful information that can be used for fact-based decision making. It is widely used for sentiment analysis, summarising text or searching for useful information from the web. Existing approaches such as machine learning or natural language processing techniques have been proven to obtain significant information from massive amounts of text data. However, these approaches can have issues of obtaining sufficiently accurate results during training or the limitation of linguistic resources for the understanding of slang or acronyms for example. Thus, we propose a new method called Contextual Analysis (CA) that accentuates the relationship of the words and sources that are used for analysis. This approach will create a self-learned knowledge tree of contextual information, based on where words appear in the underlying sources. CA provides an understanding of the degree of relationship between the context of words which is a new technique to understand textual data sources. To evaluate CA techniques, 2000 news items are used that contain fake and actual news during the 2016 Trump-Hillary campaign. The results are compared with other prominent Supervised Machine Learning (SML) techniques. CA matched the best classification performance and achieved the best performance of 0.81 accuracy for fake news prediction. Moreover, CA provides a Hierarchical Knowledge Tree (HKT) that helps to understand the context of words used in both fake and real news and is one of the important findings of this method. The experimental results demonstrate that CA has the potential to undertake classification tasks and at the same time reveal the contextual relationship and hierarchy of words which improves upon existing ML methods that treat each word as independent.

AB - Text analytics is the process of transforming unstructured text data into meaningful information that can be used for fact-based decision making. It is widely used for sentiment analysis, summarising text or searching for useful information from the web. Existing approaches such as machine learning or natural language processing techniques have been proven to obtain significant information from massive amounts of text data. However, these approaches can have issues of obtaining sufficiently accurate results during training or the limitation of linguistic resources for the understanding of slang or acronyms for example. Thus, we propose a new method called Contextual Analysis (CA) that accentuates the relationship of the words and sources that are used for analysis. This approach will create a self-learned knowledge tree of contextual information, based on where words appear in the underlying sources. CA provides an understanding of the degree of relationship between the context of words which is a new technique to understand textual data sources. To evaluate CA techniques, 2000 news items are used that contain fake and actual news during the 2016 Trump-Hillary campaign. The results are compared with other prominent Supervised Machine Learning (SML) techniques. CA matched the best classification performance and achieved the best performance of 0.81 accuracy for fake news prediction. Moreover, CA provides a Hierarchical Knowledge Tree (HKT) that helps to understand the context of words used in both fake and real news and is one of the important findings of this method. The experimental results demonstrate that CA has the potential to undertake classification tasks and at the same time reveal the contextual relationship and hierarchy of words which improves upon existing ML methods that treat each word as independent.

KW - Contextual Analytics

KW - fake news

KW - hierarchical knowledge tree

KW - text analytics

KW - supervised machine learning

U2 - 10.14445/22315381/CATI1P219

DO - 10.14445/22315381/CATI1P219

M3 - Article

SN - 2394-3777

VL - CAT PART - 1 2020

JO - International Journal of Advanced Research Trends in Engineering and Technology

JF - International Journal of Advanced Research Trends in Engineering and Technology

IS - Editor's Issue

M1 - CATI1P219

ER -
