2017 technical reports

CS-2017-01
Title	Revisiting the CLBlood Model: Formulation Enhancements and Online Deployment
Authors	Spencer R. Van Leeuwen, Gladimir V. G. Baranoski and Bradley W. Kimmel
Abstract	CLBlood is a cell-based light interaction model for human blood previously developed by the Natural Phenomena Simulation Group (NPSG) at the University of Waterloo. In this report, we revisit several elements of its formulation and describe appropriate enhancements. Furthermore, we compare our current and previous results to demonstrate that the predictive capabilities of our model have been fully preserved in its revised version. We also showcase the online deployment of CLBlood.
Date	February 2017
Report	CS-2017-01 (PDF)

CS-2017-02
Title	Renormalization of NoSQL Database Schemas
Authors	Michael J. Mior and Kenneth Salem
Abstract	NoSQL applications often use denormalized databases in order to meet performance goals, but this introduces complications. In particular, application evolution may demand changes in the underlying database schema, which may in turn require further application revisions. The NoSQL DBMS itself can do little to aid in this process, as it has no understanding of application-level denormalization. In this paper, we describe a procedure for reconstructing a normalized conceptual schema from a denormalized NoSQL database. Exposing the conceptual schema provides application developers with information that can be used to guide application and database evolution. The procedure’s input includes functional and inclusion dependencies, which may be mined from the NoSQL database. We illustrate the effectiveness of this procedure using several application case studies.
Date	April 3, 2017
Report	CS-2017-02 (PDF)

CS-2017-03
Title	Database Managed CPU Performance Scaling for Improved Energy Efficiency
Authors	Mustafa Korkmaz, Martin Karsten, Kenneth Salem and Semih Salihoglu
Abstract	Dynamic voltage and frequency scaling (DVFS) is a technique for adjusting the speed and power consumption of processors, allowing performance to be traded for reduced power consumption. Since CPUs are typically the largest consumers of power in modern servers, DVFS can have a significant impact on overall server power consumption. Modern operating systems include DVFS governors, which interact with the processor to manage performance and power consumption according to some system-level policy. In this paper, we argue that for database servers, DVFS can be managed more effectively by the database management system. We present a power-aware database request scheduling algorithm called POLARIS. Unlike operating system governors, POLARIS is aware of database units of work and database performance targets, and can achieve a better power/performance tradeoff by exploiting this knowledge. We implemented POLARIS in SHORE-MT, and we show that it can improve both power consumption and performance relative to operating system baselines.
Date	June 2017
Report	CS-2017-03 (PDF)

CS-2017-04
Title	Symmetry Reduction and Compositional Verification of Timed Automata
Authors	Hoang Linh Nguyen and Richard Trefler
Abstract	Timed automata provide a model for studying the behavior of finite-state systems as they evolve over time. We describe a technique that incorporates automatic symmetry detection and symmetry reduction in the analysis of systems modeled by timed automata. Our prototype extends the real-time model checker PAT with symmetry reduction using state swaps to reduce time and memory consumption. Moreover, our approach detects structural symmetries arising from process templates of real-time systems, requiring no additional input from the user. The technique involves finding all variables of type process identifier and computing the largest subgroup of candidate symmetries that induce automorphisms. Our technique is fully automatic, and not restricted to fully symmetric systems. We then combine elements of compositional proof, abstraction and local symmetry to decide whether a safety property holds for every process instance in a parameterized family of real-time process networks. Analysis is performed on a small cut-off network; that is, a small instance whose compositional proof generalizes to the entire parametric family. Our results show that verification is decidable in time polynomial in the state space of the cut-off instance. We apply these ideas to analyze Fischer’s protocol and the CSMA/CD protocol.
Date	August 2017
Report	CS-2017-04 (PDF)

CS-2017-05
Title	Data-intensive Applications Using the WIDE Software Platform
Authors	Donald Cowan
Abstract	The WIDE (Web Informatics Development Environment) software platform has been under development for over 15 years by researchers in the University of Waterloo Computer Systems Group (UWCSG). The WIDE platform has been used to develop over 80 data-intensive web and mobile software systems in many different sectors including land development and planning, environmental modelling and monitoring, First Nations, volunteerism, social indicators, community information, health, arts and culture, and built heritage. The objective of each application is to manage and use community information assets for community benefit. The community can be either geographic or a community of interest or practice. The UWCSG software team has worked with many different partner organizations in producing the different services; these partners range from local NGOs, businesses, universities and research centres, to all levels and types of government including First Nations groups and the United Nations.
Date	October 11, 2017
Report	CS-2017-05 (PDF)

CS-2017-06
Title	The Socially Smart Community — Practical Uses of Community-level Socio-economic Indicators
Authors	Donald Cowan, Paulo Alencar, Kyle Young, Bryan Smale, Ryan Erb and Fred McGarry
Abstract	There is a large amount of discussion in the literature about smart cities where the focus of the discourse is on gathering and analyzing real-time data from smartphones or other sensors to support public services such as vehicular traffic flow and utility consumption or to infer human behaviour. There does not appear to be any discussion of ‘socially’ smart cities where the focus is on using citizens as ‘smart sensors.’ Here the citizens’ interactions with a community’s services are captured in a timely fashion to derive socio-economic indicators about characteristics of the population relevant to sectors such as education, food security, health, housing, community participation, community safety, income levels and government and to use those as a basis for monitoring community well-being or the effectiveness of government, social service and economic policies designed to produce community improvement. This paper provides the motivation for and description of a community-level socio-economic indicator system to support the socially smart community. The system will accept timely indicator base data from many different community sources and operate on that data using various software tools and maps. The data can be combined in various ways to show single indicators and relationships among indicators. In addition, multiple layers of data can be displayed on a map showing various geographic relationships. An initial version of this approach and related system to collect community-level social and economic data and display appropriate socio-economic indicators while protecting individual privacy is being deployed in a mixed urban-rural community in Southwestern Ontario, Canada. The web site can be found at myPerthHuron.ca.
Date	October 11, 2017
Report	CS-2017-06 (PDF)

CS-2017-07
Title	The United Nations and Open Data
Authors	Donald Cowan, Paulo Alencar, Doug Mulholland, Fred McGarry and Colin Mayfield
Abstract	The United Nations through the United Nations University International Network on Water, Environment and Health (UNU-INWEH) has embarked on a number of projects related to the environment and supported by freely available data akin to open data. These projects all use a common knowledge platform called Knowledge Integration and Management United Nations University (KIM-UNU) developed by the University of Waterloo Computer Systems Group (UWCSG) and the Centre for Community Mapping (COMAP). This paper describes the platform and its use in these specific projects on international water (IW:Science), sustainable land management (KM:Land) and safe water provisioning (HydroSanitas).
Date	October 11, 2017
Report	CS-2017-07

CS-2017-08
Title	Canopus: A Scalable and Massively Parallel Consensus Protocol
Authors	Sajjad Rizvi, Bernard Wong and Srinivasan Keshav
Abstract	Achieving consensus among a set of distributed entities (or participants) is a fundamental problem at the heart of many distributed systems. A critical problem with most consensus protocols is that they do not scale well. As the number of participants trying to achieve consensus increases, increasing network traffic can quickly overwhelm the network from topology-oblivious broadcasts, or a central coordinator for centralized consensus protocols. Thus, either achieving strong consensus is restricted to a handful of participants, or developers must resort to weaker models of consensus. We propose Canopus, a highly-parallel consensus protocol that is ‘plug-compatible’ with ZooKeeper, which exploits modern data center network topology, parallelism, and consensus semantics to achieve scalability with respect to the number of participants and throughput (i.e., the number of key-value reads/writes per second). In our prototype implementation, compared to EPaxos and ZooKeeper, Canopus increases throughput by more than 4x and 16x respectively for read-heavy workloads.
Date	December 2017
Report	CS-2017-08 (PDF)

CS-2017-09
Title	A Catalog of Software Development Rules for Agile Methods
Authors	Ulisses Telemaco, Renata Mesquita, Toacy Oliveira, Gláucia Melo and Paulo Alencar
Abstract	Background: Software Development is typically complex, unpredictable and dependent on knowledge workers. Software Processes are an attempt to handle development complexity by defining a set of elements such as workflows, roles, templates and rules. This paper focuses on software development rules, which are restrictions applied to software processes. Rules are often described in natural language and spread among many documents. Representing software development rules in natural language has at least two drawbacks: (a) Following rules becomes hard since it depends on the expertise of the workers and (b) Compliance checking, which is the act of verifying whether rules are properly followed, is often performed without the support of specialized systems. The lack of a computational solution to guide the team or to support compliance checking is a problem since these activities, when performed manually, are more error-prone, more costly and less scalable than automatic or semi-automatic approaches. Problem: Any automation initiative starts with the formal representation of compliance elements, and a structured catalog that presents software development rules prone to formal verification is still missing. Aim: This study aims at identifying a set of software development rules that are based on agile methods and may be susceptible to formal verification. Method: A literature review as well as semi-structured interviews with practitioners were conducted to identify a set of agile software development rules. Result: We identified a set of 10 rules for agile methods that were considered relevant according to research criteria. The rules are presented in a structured catalog.
Date	December 2017
Report	CS-2017-09 (PDF)