On Handling Redundancy for Failure Log Analysis of Cluster Systems

Nentawe Gurumdimma, Arshad Jhumka, Maria Liakata, Thuan Chuah, James Browne

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

System event logs contain information that captures the sequence of events occurring in a system. They are often the primary source of information from large-scale distributed systems, such as cluster systems, and enable system administrators to detect system failures and determine their causes. Due to the complex interactions between the system's hardware and software components, system event logs are typically huge, comprising streams of interleaved log messages. However, only a small fraction of those log messages is relevant for analysis. We thus develop a novel, generic log compression or filtering (i.e., redundancy removal) technique to address this problem. We apply the technique to three different log files obtained from two different production systems and validate it through the application of an unsupervised failure detection approach. Our results are positive: (i) our technique achieves good compression, and (ii) log analysis yields better results with our filtering method than with the normal approach.

Keywords: Cluster Log Data; Unsupervised learning; Compression; Levenshtein distance; Filtering
Original language: English
Title of host publication: DEPEND 2015: The Eighth International Conference on Dependability
Publisher: University of Jos
ISBN (Print): 978-1-61208-429-9
Publication status: Published - 2015

Bibliographical note

The data which was analyzed in this paper was available through the SUPReMM project funded by NSF grant ACI-1023604, and has utilized and enhanced the NSF-funded system Ranger (OCI-0622780). We thank the PTDF Nigeria for partly funding this research.

Copyright (c) IARIA, 2015
