Seminar • Systems and Networking — Logging Statement Analysis and Automation in Software Systems with Data Mining and Machine Learning Techniques

Friday, May 21, 2021 — 2:00 PM EDT

Please note: This Systems and Networking seminar will be given online.

Sina Gholamian, PhD candidate
Department of Electrical and Computer Engineering, University of Waterloo

Log files are widely used to record runtime information of software systems, such as the timestamp of an event, the unique ID of the source of the log, and part of the state of task execution. The rich information in logs enables system developers and operators to monitor the runtime behavior of their systems and to track down problems in production settings. With the ever-increasing scale and complexity of modern systems, the volume of logs is growing rapidly, e.g., at a rate of gigabytes of logs per minute. Therefore, the traditional way of log analysis, which largely relies on manual inspection (e.g., searching for error/warning keywords or using grep), has become inefficient, labor-intensive, and error-prone. To address this challenge, many recent efforts have tried to automate log analysis using data-mining techniques. However, the current logging process is mostly manual, and thus proper placement and content of logging statements remain challenges. To overcome these challenges, methods that aim to automate log placement and content prediction, i.e., ‘where and what to log,’ are of high interest.

Thus, in this research, we focus on predicting log statements, and for this purpose, we perform an experimental study on open-source Java projects. We introduce a log-aware code-clone detection method to predict the location and description of logging statements. Additionally, we incorporate deep learning methods from natural language processing (NLP) to further improve the prediction of log statement descriptions. We also analyze execution logs and extract natural language characteristics of logs to enable the application of natural language models for automated log file analysis. Finally, we propose an automated tool for analyzing log files and measure the information gain from logs for different log analysis tasks such as anomaly detection.


Bio: Sina Gholamian is a final-year Ph.D. student at the University of Waterloo, supervised by Prof. Paul Ward. He is interested in inventing and building automated approaches for the analysis of software systems using machine learning. His research is supported by an NSERC doctoral scholarship.


To join this Systems and Networking seminar on Zoom, please go to https://zoom.us/j/92268050403?pwd=bVZyS2Nmc2QwRGZOQzNSbzBCM3ROUT09.

Location 
Online seminar
200 University Avenue West
Waterloo, ON N2L 3G1
Canada