—Real-time embedded systems are a significant class of applications, poised to grow even further as automated vehicles and the Internet of Things become a reality. An important problem for these systems is to detect anomalies during operation. Anomaly detection is a form of classification, which can be driven by data collected from the system at execution time. We propose inter-arrival curves as a novel analytic modelling technique for discrete event traces. Our approach relates to the existing technique of arrival curves and expands the technique to anomaly detection. Inter-arrival curves analyze the behaviour of events within a trace by providing upper and lower bounds to their inter-arrival occurrence. We exploit inter-arrival curves in a classification framework that detects deviations within these bounds for anomaly detection. Also, we show how inter-arrival curves act as good features to extract recurrent behaviour that these systems often exhibit. We demonstrate the feasibility and viability of the fully implemented approach with an industrial automotive case study (CAN traces) as well as a deployed aerospace case study (RTOS kernel traces).
In a simulator-defined MDP, the Markovian dynamics and rewards are provided in the form of a simulator from which samples can be drawn. This paper studies MDP planning algorithms that attempt to minimize the number of simulator calls before terminating and outputting a policy that is approximately optimal with high probability. The paper introduces two heuristics for efficient exploration and an improved confidence interval that enables earlier termination with probabilis- tic guarantees. We prove that the heuristics and the confidence interval are sound and produce with high probability an approximately optimal policy in polynomial time. Experiments on two benchmark problems and two instances of an invasive species management problem show that the improved confidence intervals and the new search heuristics yield reductions of between 8% and 47% in the number of simulator calls required to reach near-optimal policies.
Spatiotemporal planning involves making choices at multiple locations in space over some planning horizon to maximize utility and satisfy various constraints. In Forest Ecosystem Management, the problem is to choose actions for thousands of locations each year including harvesting, treating trees for fire or pests, or doing nothing. The utility models could place value on sale of lumber, ecosystem sustainability or employment levels and incorporate legal and logistical constraints on actions such as avoiding large contiguous areas of clearcutting. Simulators developed by forestry researchers provide detailed dynamics but are generally inaccesible black boxes. We model spatiotemporal planning as a factored Markov decision process and present a policy gradient planning algorithm to optimize a stochastic spatial policy using simulated dynamics. It is common in environmental and resource planning to have actions at different locations be spatially interelated; this makes representation and planning challenging. We define a global spatial policy in terms of interacting local policies defining distributions over actions at each location conditioned on actions at nearby locations. Markov chain Monte Carlo simulation is used to sample landscape policies and estimate their gradients. Evaluation is carried out on a forestry planning problem with 1,880 locations using a variety of value models and constraints. Index
Where a legacy of aggressive wildland fire suppression has left forests in need of fuel reduction, allowing wildland fire to burn may provide fuel treatment benefits, thereby reducing suppression costs from subsequent fires. The least-cost-plus-net-value-change model of wildland fire economics includes benefits of wildfire in a framework for evaluating suppression options. In this study, we estimated one component of that benefit – the expected present value of the reduction in suppression costs for subsequent fires arising from the fuel treatment effect of a current fire. To that end, we employed Monte Carlo methods to generate a set of scenarios for subsequent fire ignition and weather events, which are referred to as sample paths, for a study area in central Oregon. We simulated fire on the landscape over a 100-year time horizon using existing models of fire behaviour, vegetation and fuels development, and suppression effectiveness, and we estimated suppression costs using an existing suppression cost model. Our estimates suggest that the potential cost savings may be substantial. Further research is needed to estimate the full least-cost-plus-net-value-change model. This line of research will extend the set of tools available for developing wildfire management plans for forested landscapes.
We analyze the foundations of cyclic causal models for discrete variables, and compare structural equation models (SEMs) to an alternative semantics as the equilibrium (stationary) distribution of a Markov chain. We show under general conditions, discrete cyclic SEMs cannot have independent noise; even in the simplest case, cyclic structural equation models imply constraints on the noise. We give a formalization of an alternative Markov chain equilibrium semantics which requires not only the causal graph, but also a sample order. We show how the resulting equilibrium is a function of the sample ordering, both theoretically and empirically.
Often the most practical way to define a Markov Decision Process (MDP) is as a simulator that, given a state and an action, produces a resulting state and immediate reward sam- pled from the corresponding distributions. Simulators in nat- ural resource management can be very expensive to execute, so that the time required to solve such MDPs is dominated by the number of calls to the simulator. This paper presents an algorithm, DDV, that combines improved confidence inter- vals on the Q values (as in interval estimation) with a novel upper bound on the discounted state occupancy probabilities to intelligently choose state-action pairs to explore. We prove that this algorithm terminates with a policy whose value is within $ε$ of the optimal policy (with probability 1−$δ$) after making only polynomially-many calls to the simulator. Ex- periments on one benchmark MDP and on an MDP for inva- sive species management show very large reductions in the number of simulator calls required.
There are many pest species that invade valuable ecosystems and damage ecosystem services. An important management goal is to limit the spread of these species or even eradicate them. From the computational perspective, these problems can be formulated as Markov Decision Processes. However, solving them is difficult, because of their spatial nature. Viewed abstractly, the current state of the ecosystem can be formalized as a graph. Each node of the graph may be invaded or not, and at each time step, a “local” action must be selected at each node. Consequently, at the level of the whole MDP, each global action is a vector of local actions of length N, the number of nodes in the graph. The exponentially large action space raises major computational problems. During the state transitions, the invasive species can spread along the edges of the graph. We study a specific instance of this general problem based on the spread of Tamarisk in river networks. In this formulation, we work with the dual of the standard river network graph. Each node corresponds to a stretch of the river between two confluences (also called a “reach”). We divide each reach into H “slots”, where each slot can either be occupied by a single native tree, occupied by a single Tamarisk tree, or empty. In each time step, the plants (both native and invader) first must survive through winter (with some probability). Then, the surviving trees produce seeds that disperse through the river network according to a dispersal kernel that prefers downstream movement with high probability. Each dispersing seed lands at some slot in some node of the network. All seeds arriving at a slot compete to establish themselves in that slot (if the slot is empty). In our model, there are three local actions: simple eradication of invasive plants, eradication followed by planting of native plants, or doing nothing at all. Doing nothing costs nothing, while eradication is more expensive, and eradication plus planting is the most expensive. One local action must be chosen for each node (reach) at each time step, and it applies to all H slots at that node. The actions only succeed stochastically. The planning goal is to minimize the expected discounted sum of the cost of invasive damage and the cost of management actions. To solve this planning problem, we apply the standard value iteration algorithm. However, standard value iteration requires a fully-specified state transition matrix. In our problem, it is easy to write a simulator for drawing samples of the dispersal and competition processes, but it is computationally intractable to compute the exact transition probabilities. Hence, we estimate those probabilities by drawing a large number of samples from the simulator. Once we have an optimal policy on the approximated MDP, we explore its behaviour both directly and via simulated rollouts. The optimal policy can be visualized with one picture per state annotated by the types of plants in each reach and the optimal action that should be taken in each reach. We also run the optimal policy forward in time on many simulated trajectories in order to gather statistics about its behaviour such as: how long it takes to reach a steady state; average cost of a policy over time; or the frequency with which completely invaded or uninvaded states are reached. Comparisons of these statistics show that in many situations our optimal policies greatly outperform established rules of thumb from the literature. An example of a rule of thumb is to always prioritize treatment of upstream over downstream reaches. Another rule targets the most invaded reaches (those with the most slots occupied by Tamarisk) first. We have found that the optimality of these rules of thumb depends strongly on the dispersal and competition parameters. Some contributions this work makes to bioeconomics and ecology are showing evidence that the stochastic spread of the invasion process needs to be modelled and that the spatial characteristics of the system under invasion are relevant to optimal management. The computational challenges raised by this problem have led us into research on methods for planning that can provide bounds on the optimality of the policy while minimizing the number of calls to expensive simulators. We are also studying algorithms that can scale to large graphs with thousands of nodes and large out-degrees. Finally, we see potential in these problems for intelligently reducing the size of the state space and the complexity of the policy description by utilizing domain knowledge.
In spatiotemporal planning, agents choose actions at multiple locations in space over some planning horizon to maximize their utility and satisfy various constraints. In forestry planning, for example, the problem is to choose actions for thousands of locations in the forest each year. The actions at each location could include harvesting trees, treating trees against disease and pests, or doing nothing. A utility model could place value on sale of forest products, ecosystem sustainability or employment levels, and could incorporate legal and logistical constraints such as avoiding large contiguous areas of clearcutting and managing road access. Planning requires a model of the dynamics. Existing simulators developed by forestry researchers can provide detailed models of the dynamics of a forest over time, but these simulators are often not designed for use in automated planning. This thesis presents spatiotemoral planning in terms of factored Markov decision processes. A policy gradient planning algorithm optimizes a stochastic spatial policy using existing simulators for dynamics. When a planning problem includes spatial interaction between locations, deciding on an action to carry out at one location requires considering the actions performed at other locations. This spatial interdependence is common in forestry and other environmental planning problems and makes policy representation and planning challenging. We define a spatial policy in terms of local policies defined as distributions over actions at one location conditioned upon actions at other locations. A policy gradient planning algorithm using this spatial policy is presented which uses Markov Chain Monte Carlo simulation to sample the landscape policy, estimate its gradient and use this gradient to guide policy improvement. Evaluation is carried out on a forestry planning problem with 1880 locations using a variety of value models and constraints. The distribution over joint actions at all locations can be seen as the equilibrium of a cyclic causal model. This equilibrium semantics is compared to Structural Equation Models. We also define an algorithm for approximating the equilibrium distribution for cyclic causal networks which exploits graphical structure and analyse when the algorithm is exact.
We describe extensive modifications made over time to a first year computer science course at the University of British Columbia covering logic and digital circuits (among other topics). Smoothly integrating the hardware-based labs with the more theory-based lectures into a cohesive picture of computation has always been a challenge in this course. The seeming disconnect between implementation and abstraction has historically led to frustration and dissatisfaction among students. We describe changes to the lab curriculum, equipment logistics, the style of in-lab activities and evaluation. We have also made logistical changes to the management and ongoing training of teaching assistants, allowing us to better anchor our larger course story into the lab curriculum. These changes have greatly improved student and TA opinions of the lab experience, as well as the overall course.
We introduce a challenging real world planning problem where actions must be taken at each location in a spatial area at each point in time. We use forestry planning as the motivating application. In Large Scale Spatial-Temporal (LSST) planning problems, the state and action spaces are defined as the cross-products of many local state and action spaces spread over a large spatial area such as a city or forest. These problems possess state uncertainty, have complex utility functions involving spatial constraints and we generally must rely on simulations rather than an explicit transition model. We define LSST problems as reinforcement learning prob- lems and present a solution using policy gradients. We compare two different policy formulations: an explicit policy that identifies each location in space and the action to take there; and an abstract policy that defines the proportion of actions to take across all locations in space. We show that the abstract policy is more robust and achieves higher rewards with far fewer parameters than the elementary policy. This abstract policy is also a better fit to the properties that practitioners in LSST problem domains require for such methods to be widely useful
When using Bayesian networks, practitioners often express constraints among variables by conditioning a common child node to induce the desired distribution. For example, an ‘or' constraint can be easily expressed by a node modelling a logical ‘or' of its parents' values being conditioned to true. This has the desired effect that at least one parent must be true. However, conditioning also alters the distributions of further ancestors in the network. In this paper we argue that these side effects are undesirable when constraints are added during model design. We describe a method called shielding to remove these side effects while remaining within the directed language of Bayesian networks. This method is then compared to chain graphs which allow undirected and directed edges and which model equivalent distributions. Thus, in addition to solving this common modelling problem, shielded Bayesian networks provide a novel method for implementing chain graphs with existing Bayesian network tools.
In this unpublished literature review we will survey the history of Influence Diagrams from their origin in Decision Theory to modern AI uses. We will compare the various methods that have been used to find an optimal strategy for a given influence diagram. These methods have various advantages under different assumptions and there is a pro- gression to the present of richer, more general solutions. We also look at an abstract algorithm used as an informal tool and determine whether it is equivalent to other formal methods in the literature.