Automated extraction of behaviour model of applications
Highly replicated cloud applications are deployed only once they are deemed functional: they generally perform their task, and their failure rate is relatively low. Failures are rare, but they do occur and are very difficult to diagnose. We present a tool for failure diagnosis that learns the normal behavior of an application in terms of the statistical properties of the variables used throughout its execution, and then monitors the application for deviations from these properties. Our study reveals that many variables have distinctive statistical characteristics that amount to invariants of the program. Any significant deviation from these characteristics therefore indicates abnormal behavior of the application, which may be caused by a program error.
Such invariants cannot be obtained from static analysis of the application's code alone. For example, a person's name usually does not include a semicolon; however, an intruder may attempt an SQL injection (which will include a semicolon) through the 'name' field while entering their information, and will succeed if the application does not check for this case. This scenario can only be captured at runtime and may not have been tested by the application developer. The character range of the 'name' variable is one of its statistical properties; by learning this range from the execution of the application, it is possible to detect the abnormal input described above. Hence, monitoring the statistics of the values taken by an application's variables is an effective way to detect anomalies, which in turn helps to diagnose failures of the application.
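The 'name' example can be illustrated with a minimal Java sketch. It is not the tool's implementation: the class and method names (CharRangeInvariant, observe, violations) are hypothetical, values are passed in directly rather than read from heap snapshots, and the rule "accept any letter, but only those non-letter characters seen during learning" is an assumed simplification of a character-range property.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: learns which non-letter characters a string variable normally contains. */
final class CharRangeInvariant {
    // Code points of non-letter characters observed during the learning phase.
    private final Set<Integer> allowedNonLetters = new TreeSet<>();

    /** Learning phase: record every non-letter character of a normal value. */
    void observe(String value) {
        value.codePoints()
             .filter(cp -> !Character.isLetter(cp))
             .forEach(allowedNonLetters::add);
    }

    /** Monitoring phase: return the non-letter characters that were never seen while learning. */
    Set<Integer> violations(String value) {
        Set<Integer> unseen = new TreeSet<>();
        value.codePoints()
             .filter(cp -> !Character.isLetter(cp) && !allowedNonLetters.contains(cp))
             .forEach(unseen::add);
        return unseen;
    }

    public static void main(String[] args) {
        CharRangeInvariant nameField = new CharRangeInvariant();
        // Assumed sample of normal values for the 'name' variable.
        for (String name : List.of("Alice Smith", "Jean-Luc Picard", "Mary O'Brien")) {
            nameField.observe(name);
        }
        // An injection attempt introduces characters outside the learned range.
        String suspicious = "Robert'); DROP TABLE users;--";
        for (int cp : nameField.violations(suspicious)) {
            System.out.println("unseen character: " + new String(Character.toChars(cp)));
        }
    }
}
```

With the three sample names above, the space, hyphen, and apostrophe are learned as normal, so the injected string is flagged only for the closing parenthesis and the semicolon. Generalizing over letters while recording punctuation exactly keeps previously unseen but ordinary names from being reported as anomalies.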
We build a tool that collects frequent snapshots of the application's heap and builds a statistical model solely from this extensional knowledge of the application. The model characterizes the application's normal behavior. To the best of our knowledge, based on our survey of the literature, we are the first to design a monitoring tool that collects frequent snapshots in the form of Java heap dumps and uses them to determine the application's behavior model without relying on any prior specification of expected behavior or on code instrumentation.
Our approach allows a behavior model to be built automatically and efficiently from the monitoring data alone. We evaluate the correctness of our approach by applying it to an e-commerce application and an online bidding system, deriving different statistical properties of variables from the values they exhibit at runtime. Our experimental results demonstrate 96% accuracy in the generated statistical model. This high accuracy indicates that our tool can successfully determine the application's normal behavior without any prior knowledge of its expected behavior. Moreover, our tool correctly detected two anomalous conditions while monitoring the application with a small number of injected faults. In addition to detecting anomalies, our tool logs all the variables of the application that violate the learned model. The log file can help to diagnose any failure caused by these variables and gives our tool source-code granularity in fault localization.