MACHINE LEARNING FOR LOG-BASED ANOMALY DETECTION AND SELF-HEALING
Keywords:
AI-powered anomaly detection, software logs, machine learning, deep learning, self-healing systems, log analysis, LSTM
Abstract
Modern software systems generate enormous volumes of log data to support monitoring and fault diagnosis, far exceeding the capacity of manual analysis. This paper proposes an architecture that detects anomalous patterns in software logs using machine learning and deep learning techniques, enabling proactive fault diagnosis and self-healing. Traditional rule-based approaches cannot handle the scale and complexity of contemporary log data, whereas deep learning models such as Long Short-Term Memory (LSTM) networks and Transformer architectures excel at capturing contextual dependencies within log sequences. To address challenges such as limited labeled data, this study leverages self-supervised and contrastive learning methods. In addition, reinforcement learning and rule-based automation are incorporated to correct faults dynamically, thereby minimizing system downtime. The proposed models are evaluated on benchmark log datasets using precision, recall, F1-score, and AUC-ROC. Experimental results show that Transformer-based models outperform conventional machine learning techniques, though at a higher computational cost. Notably, self-healing mechanisms reduce downtime by up to 68.2%, underscoring the potential of AI to improve system reliability and availability. However, challenges in model interpretability, computational efficiency, and real-time adaptability remain. This paper also reviews state-of-the-art AI-based log anomaly detection approaches and outlines future research directions, emphasizing lightweight architectures, explainable AI, and scalable deployment as key enablers for AI-powered anomaly detection and self-healing systems in safety-critical domains.
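As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes precision, recall, F1-score, and AUC-ROC for a binary anomaly detector. It is a sketch under stated assumptions, not the evaluation code used in this study: the labels, anomaly scores, and the 0.5 decision threshold are toy values invented for illustration, and the function names are hypothetical.

```python
from typing import Sequence


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[float, float, float]:
    """Precision, recall, and F1 for the anomaly class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal entries wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random anomaly outscores a random
    normal entry, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): 1 = anomalous log sequence, 0 = normal.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.10, 0.20, 0.15, 0.70, 0.90, 0.40, 0.30, 0.80]

THRESHOLD = 0.5  # hypothetical decision threshold on the anomaly score
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc_roc={auc:.3f}")
```

Precision, recall, and F1-score depend on the chosen threshold, whereas AUC-ROC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are usually reported together.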