University of Waterloo

2009 technical reports

CS-2009-01
Title	WiFi Overcast: Enabling True Mobility for Realtime Applications in the Enterprise
Authors	Nabeel Ahmed, Usman Ismail
Abstract	Enterprises are increasingly deploying wireless local area connections (WLANs) to provide mobile access to users in corporate offices. However, existing enterprise WLANs are far from being truly mobile. In particular, they do not adequately support continuous mobility, where users access the network on-the-go. Furthermore, WLANs that do provide continuous mobility support require client modifications, making them hard to deploy in practice 20. In addition, with the growing interest in realtime applications such as voice and video, users are increasingly placing additional (QoS) demands on the network, which for inadequately designed WLANs, does not scale to large numbers of users 10. In this paper, we propose Overcast, a novel WLAN architecture that targets scenarios demanding continuous mobility and real-time support for 802.11 clients. Overcast does not require client modifications and supports all 802.11 standards. Though Overcast borrows some features from prior WLAN designs, it improves on them by incorporating a novel RF mapping framework (proposed in 3) for accurate online detection of RF interference. We describe the architecture of Overcast in detail and discuss our current efforts in realizing such a system on off-the-shelf commodity hardware. We also describe an example application of Overcast to highlight it’s usefulness in supporting realtime applications in continuously mobile user environments.
Date	January 27, 2009
Report	WiFi Overcast: Enabling True Mobility for Realtime Applications in the Enterprise (PDF)

CS-2009-02
Title	Capacity Provisioning a Valiant Load-Balanced Network
Authors	Andrew R. Curtis and Alejandro López-Ortiz
Abstract	Valiant load balancing (VLB), also called two-stage load balancing, is gaining popularity as a routing scheme that can serve arbitrary traffic matrices. To date, VLB network design is well understood on a logical full-mesh topology, where VLB is optimal even when nodes or links can fail. In this paper, we address the design and capacity provisioning of arbitrary VLB network topologies. First, we introduce an algorithm to determine if VLB can serve all traffic matrices when a fixed number of arbitrary links fail, and we show how to find a mincost expansion of the network—via link upgrades or installs or both—so that it is resilient to these failures. Additionally, we propose a method to design a new VLB network under the fixed-charge network design cost model. Finally, we prove that VLB is no longer optimal on unrestricted topologies, and can require more capacity than shortest path routing to serve all traffic matrices on some topologies. These results rely on a novel theorem that characterizes the capacity VLB requires of links crossing each cut, i.e., a partition, of the network’s nodes.
Date	January 20, 2009
Report	Capacity Provisioning a Valiant Load-Balanced Network (PDF)

CS-2009-03
Title	Fixed-Parameter Tractability and Improved Approximations for Segment Minimization
Authors	Therese Biedl, Stephane Durocher, Holger H. Hoos, Shuang Luan, Jared Saia, and Maxwell Young
Abstract	The segment minimization problem consists of finding the smallest set of integer matrices that sum to a given intensity matrix, such that each summand has only one non-zero value, and the non-zeroes in each row are consecutive. This has direct applications in intensity-modulated radiation therapy, an effective form of cancer treatment. We show here that for a single row, this problem is fixed-parameter tractable in the largest value of the intensity matrix. We use this to develop approximation algorithms for the full problem. One of these improves the approximation factor from the previous best of log 2 h + 1 to 3/2 · (log3 h + 1), where h is the largest entry in the intensity matrix; another improves the approximation factor from 2· (log D+1) to 24/13 · (log D+1), where D is the largest difference between consecutive elements of a row of the intensity matrix. Experimentation with these algorithms show that they outperform other approximation algorithms on 75% of the 172 test cases we considered, which include both real world and synthetic data.
Date	January 20, 2009
Report	Fixed-Parameter Tractability and Improved Approximations for Segment Minimization (PDF)

CS-2009-04
Title	A Heuristic for Fair Correlation-Aware Resource Placement
Authors	Raouf Boutaba, Martin Karsten, and Maxwell Young
Abstract	The configuration of network resources greatly impacts the communication overhead for data intensive tasks and constitutes a critical problem in the design and maintenance of networks. To address the issue of resource placement, we analyze and implement a heuristic for solving a known NP-complete graph optimization problem called MAXIMUM SIZE BOUNDED CAPACITY CUT. Experimental results for our heuristic demonstrate promising performance on both synthetic and real world data. Next our heuristic is used as a sub-routine to solve another known NP-complete problem called MIN-MAX MULTIWAY CUT whose traits we adapt to yield a resource placement scheme that exploits correlations between network resources. Our experimental results show that the resulting placement scheme achieves a significant savings in communication overhead.
Date	January 20, 2009
Report	A Heuristic for Fair Correlation-Aware Resource Placement (PDF)

CS-2009-05
Title	Big-Step Semantics
Authors	Shahram Esmaeilsabzali, Nancy A. Day, Joanne M. Atlee, Jianwei Niu
Abstract	With the popularity of model-driven methodologies, and the abundance of modelling languages, a major question for a requirements engineer is: which language is suitable for modelling a system under study? We address this question from a semantic point-of-view for big-step modelling languages (BSMLs). BSMLs are a popular class of behavioural modelling languages in which a model can respond to an environmental input by executing multiple, possibly concurrent, transitions. We deconstruct the semantics of a large class of BSMLs into high-level, orthogonal semantic aspects and discuss the relative advantages and disadvantages of the semantic options for each of these aspects to allow a requirements engineer to compare and choose the right BSML. We accompany our presentation with many modelling examples that illustrate the differences between a set of relevant semantic options.
Date	March 6, 2009
Report	Big-Step Semantics (PDF)

CS-2009-06
Title	Collaborative Preference Elicitation
Authors	Usman Ismail
Abstract	The inherent inaccuracies in user modelling and the use of non-deterministic algorithms means that Preference Elicitation systems are often unsatisfactory for users. We address this issue using a technique called collaborative elicitation. Leveraging the fact that humans rarely have unique preferences, we minimize explicit elicitation from users while attempting to improve user satisfaction. We categorize users into groups using collaborative techniques based on Von Ahn's "Games with a purpose." We then allow groups of users to collectively select default preferences for their peers.
Date	February 18, 2009
Report	Collaborative Preference Elicitation (PDF)

CS-2009-07
Title	Incremental Call Graph Construction for the Eclipse IDE
Authors	Usman Ismail
Abstract	A call graph is a powerful tool for program analysis and can be used to: help plan testing strategies, reduce program size and help programmers understand and debug large programs. We can use either dynamic or static call graph generation techniques. Dynamic call graphs tend to underestimate the number of target methods for a call site where as static call graphs tend to overestimate this this number. A theoretically ideal call graph is the union of the dynamic call graphs over all possible executions of the program. Dynamic call graphs are not safe and generating static call graphs is computationally expensive. We propose an incremental call graph generation approach which will compute graphs for fragments of the program as they are being developed. It will then recursively combine fragments until a graph for the whole program is generated. The graph will be as precise as corresponding traditional algorithms and will present, upon completion, a safe call graph.
Date	February 18, 2009
Report	Incremental Call Graph Construction for the Eclipse IDE (PDF)

CS-2009-08
Title	Towards Modularization and Composition in Distributed Event Based Systems
Authors	Rolando Blanco and Paulo Alencar
Abstract	A distributed and interface-based publish/subscribe system is proposed in this report. Components in the proposed system react with each other via events only, and these reactions are described in the component interfaces using a variation of Harel statecharts. By encapsulating component behaviour within the interfaces, the goal of the system is to allow the study of modularization and composition mechanisms in distributed event based applications. The presentation of the proposed system is complemented by a metamodel that describes the structural, control, and runtime aspects of distributed event systems.
Date	February 18, 2009
Report	Towards Modularization and Composition in Distributed Event Based Systems (PDF)

CS-2009-09
Title	Motivating a Distributed System of Commodity Machines1
Authors	Andrew Kane
Abstract	This report examines the price/performance benefit of using a large cluster of commodity machines rather than server level hardware for certain large scale software applications. A number of tools are presented which make it easier to produce software that runs across large clusters of commodity machines. These tools are the Chubby locking service, the Google file system, MapReduce and BigTable, all written by Google.
Date	February 19, 2009
Report	Motivating a Distributed System of Commodity Machines1 (PDF)

CS-2009-10
Title	Simulation of Distributed Search Engines: Comparing Term, Document and Hybrid Distribution1
Authors	Andrew Kane
Abstract	A method of simulating distributed search engines is presented. This method measures throughput (queries per second) and resource usage. Disk and network costs are modelled in a simple way, while CPU costs are ignored. Our simulation results show that term distribution can perform better than document distribution when using a small number of machines. This goes against the accepted view that document distribution is faster than term distribution. We introduced a hybrid distribution scheme that splits a set of machines and disks into groups and then performs term distribution within a group and document distribution between groups. Our simulation results show that this hybrid distribution scheme has higher throughput rates than document distribution, but can still scale to large numbers of machines. A special case of our hybrid distribution scheme groups together multiple disks on a machine to produce better throughput without increasing network traffic. Such a scheme could easily be deployed in a web search engine.
Date	February 19, 2009
Report	Simulation of Distributed Search Engines: Comparing Term, Document and Hybrid Distribution1 (PDF)

CS-2009-11
Title	Representation of Programming Constructs with the Kell-m Calculus
Authors	Rolando Blanco and Paulo Alencar
Abstract	Kell-m is a new asynchronous, higher-order process calculus with localities, developed for modelling and verifying distributed event-based systems and applications. Although simple, due to the low level nature of kell-m, considerable effort is required when modelling complex systems. In this report we illustrate how common programming constructs such as variables, procedures, modules and lists can be represented using kellm. These constructs facilitate the modelling of systems and applications using kell-m.
Date	March 9, 2009
Report	Representation of Programming Constructs with the Kell-m Calculus (PDF)

CS-2009-12
Title	Equivalence of Nested Queries with Mixed Semantics (extended version)
Authors	David DeHaan
Abstract	We consider the problem of deciding query equivalence for a conjunctive language in which queries output complex objects composed from a mixture of nested, unordered collection types. Using an encoding of nested objects as flat relations, we translate the problem to deciding the equivalence between encodings output by relational conjunctive queries. This encoding equivalence cleanly unifies and generalizes previous results for deciding equivalence of conjunctive queries evaluated under various processing semantics. As part of our characterization of encoding equivalence, we define a normal form for encoding queries and contend that this normal form offers new insight into the fundamental principles governing the behaviour of nested aggregation.
Date	March 13, 2009
Report	Equivalence of Nested Queries with Mixed Semantics (extended version) (PDF)

CS-2009-13
Title	Optimizing distributed XML queries through localization and pruning
Authors	Patrick Kling, M. Tamer Özsu and Khuzaima Daudjee
Abstract	Distributing data collections by fragmenting them is an effective way of improving the scalability of relational database systems. The unique characteristics of XML data present challenges that require different distribution techniques to achieve scalability. In this paper, we propose solutions to two of the problems encountered in distributed query processing and optimization on XML data, namely localization and pruning. Localization takes a fragmentation-unaware query plan and converts it to a distributed query plan that can be executed at the individual sites that hold XML data fragments in a distributed system. We then show how the resulting distributed query plan can be pruned so that only those sites are accessed that can contribute to the query result. We demonstrate that our techniques can be integrated into a real-life XML database system and that they significantly improve the performance of distributed query execution.
Date	April 6, 2009
Report	Optimizing distributed XML queries through localization and pruning (PDF)

CS-2009-14
Title	The Presentation Maestro: Direct Manipulation Through Gesture Alone
Authors	Adam Fourney, Michael Terry and Richard Mann
Abstract	Past research suggests a number of benefits to using hand-based interaction when interacting with electronic presentations. This paper introduces Maestro, a computer-vision based presentation system that uses hand gestures to allow fine-grained interaction with the contents of a projected slideshow. Maestro employs a single web camera and no other physical mediators. The contributions of this paper lie in robustly solving the gesture segmentation problem inherent in using only computer vision as input, and in a set of feedback mechanisms designed to scaffold use of the recognition system without interfering with the visual presentation of content. Specifically, Maestro employs bimanual cues, spatial and contentbased context, hand roles, and, in some case, dwell time to segment gestures. These strong segmentation cues are robust with respect to false positives and allow the actual gestures themselves to more directly match the action to be performed. They also result in a direct manipulation-style interface built on gestures alone. To assist with use, Maestro introduces a feedback comet, an augmented cursor that provides mnemonics for gestures and feedback to help govern gesture speeds to those conducive to gesture recognition. We present the design of the system and lessons learned from user evaluations
Date	March 18, 2009
Report	The Presentation Maestro: Direct Manipulation Through Gesture Alone (PDF)

CS-2009-15B
Title	Modeling and Querying Possible Repairs in Duplicate Detection
Authors	George Beskales, Mohamed A. Soliman, Ihab F. Ilyas and Shai Ben-David
Abstract	One of the most prominent data quality problems is the existence of duplicate records. Current duplicate elimination procedures usually produce one clean instance (repair) of the input data, by carefully choosing the parameters of the duplicate detection algorithms. Finding the right parameter settings can be hard, and in many cases, perfect settings do not exist. Furthermore, replacing the input dirty data with one possible clean instance may result in unrecoverable errors, for example, identification and merging of possible duplicate records in health care systems. In this paper, we treat duplicate detection procedures as data processing tasks with uncertain outcomes. We concentrate on a family of duplicate detection algorithms that are based on parameterized clustering. We propose a novel uncertainty model that compactly encodes the space of possible repairs. We show how to efficiently support relational queries under our model, and allowing new types of queries on the set of possible repairs. We give an experimental study illustrating the scalability and the efficiency of our techniques in different configurations.
Date	June 29, 2009
Report	Modeling and Querying Possible Repairs in Duplicate Detection (PDF)

CS-2009-16
Title	IPASS: Error Tolerant NMR Backbone Resonance Assignment by Linear Programming
Authors	Babak Alipanahi, Xin Gao, Emre Karakoc, Frank Balbach, Logan Donaldson and Ming Li
Abstract	The automation of the entire NMR protein structure determination process requires a superior error tolerant backbone resonance assignment method. Although a variety of assignment approaches have been developed, none works well on noisy automatically picked peaks. IPASS is proposed as a novel integer linear programming (ILP) based assignment method. In order to reduce size of the problem, IPASS employs probabilistic spin system typing based on chemical shifts and secondary structure predictions. Furthermore, IPASS extracts connectivity information from the inter-residue information and the 15N-edited NOESY peaks which are then used to fix reliable fragments. The experimental results demonstrate that IPASS significantly outperforms the previous assignment methods on the synthetic data sets. It achieves an average of 99% precision and 96% recall on the synthesized spin systems, and an average of 96% precision and 90% recall on the synthesized peak lists. When applied on automatically picked peaks from experimentally derived data sets, it achieves an average precision and recall of 78% and 67%, respectively. In contrast, the next best method, MARS, achieved an average precision and recall of 50% and 40%, respectively. Availability: IPASS is available upon request, and the web server for IPASS is under construction.
Date	June 9, 2009
Report	IPASS: Error Tolerant NMR Backbone Resonance Assignment by Linear Programming (PDF)

CS-2009-17
Title	Taking Advantage of the Interplay among Software Product Lines, Service-oriented Architectures and Multi-agent Systems
Authors	Ingrid Oliveira de Nunes, Carlos José Pereira de Lucena, Paulo Alencar and Donald D. Cowan
Abstract	Multi-agent Systems (MASs) are often being applied in a wide range of industrial applications, showing the effectiveness of the agent abstraction to develop open, highly interactive, autonomous and context-aware systems. MASs have been combined with Service-oriented Architectures (SOAs) in order to provide customization and flexibility in these systems. This combination calls for new approaches that support personalized user services through autonomous and pro-active components in dynamic environments. Existing approaches fail to provide reusable multi-agent service components as well as suitable representations and processes that support automated software generation based on common and variable features within a domain. In this paper we present a domain engineering process-oriented approach to build service-oriented user agents using the Software Product Line (SPL) engineering paradigm. The approach comprises activities and models to support the development of service-oriented customized agents that automate user tasks based on service orchestration involving multiple agents in open environments, and takes advantage of the synergy of SOA, MAS and SPL. The domain-based process involves extended domain analysis with goals and variability, domain design with the specification of agent services and plans, and domain implementation. Keywords: Software Product Lines, Multi-agent Systems, Service-oriented Architectures.
Date	May 4, 2009
Report	Taking Advantage of the Interplay among Software Product Lines, Service-oriented Architectures and Multi-agent Systems (PDF)

CS-2009-18
Title	QUICK: Queries Using Inferred Concepts from Keywords
Authors	Jeffrey Pound, Ihab F. Ilyas, and Grant Weddell
Abstract	We present QUICK, an entity-based text search engine that blends keyword search with structured query processing over rich knowledge bases (KB) with massive schemas. We introduce a new formalism for structured queries based on keywords that combines the flexibility of keyword search and the expressiveness of structures queries. We propose a solution to the resulting disambiguation problem caused by introducing keywords as primitives in a structured query language. We show how expressions in our proposed language can be rewritten using the vocabulary of a given KB. The rewritten query describes an arbitrary topic of interest for which corresponding documents are retrieved. We also introduce a method for indexing that allows efficient search over large KB schemas in order to find sets of relevant entities. We then cross-reference the entities with their occurrences in a corpus and rank the resulting document set to produce the top-k relevant documents. An extensive experimental study demonstrates the viability of our approach.
Date	May 6, 2009
Report	QUICK: Queries Using Inferred Concepts from Keywords (PDF)

CS-2009-19
Title	Textured Agreements: Re-envisioning Electronic Consent
Authors	Matthew Kay, Michael Terry
Abstract	Research indicates that less than 2% of the population reads license agreements during software installation [7]. To address this problem, we developed textured agreements, visually redesigned agreements that employ information layering, vignettes, sensationalism, and visual variety to accentuate information and highlight its personal relevance. Notably, textured agreements accomplish these goals without requiring modification of the underlying text. A between-subjects experimental study with 84 subjects indicates these agreements can significantly increase reading times. In our study, subjects spent approximately 30 seconds longer on consent screens than the in control condition, where subjects spent only seven seconds, on average. Furthermore, the study results indicate that the effects observed are not due to the novelty of the textured agreements’ visual appearance alone, but rather, particular features of the designs. These results provide convincing evidence of the potential for textured agreements to positively impact software consent processes.
Date	June 23, 2009
Report	Textured Agreements: Re-envisioning Electronic Consent (PDF)

CS-2009-20
Title	Drawing planar 3-trees with given face-areas
Authors	Therese Biedl and Lesvia Elena Ruiz Velazquez
Abstract	We study straight-line drawings of planar graphs such that each interior face has a prescribed area. It was known that such drawings exist for all planar graphs with maximum degree 3. We show here that such drawings exist for all planar partial 3-trees. Moreover, vertices have rational co-ordinates if the face-areas are rational, and we can bound the resolution. We also give some negative results for other graph classes.
Date	June 4, 2009
Report	Drawing planar 3-trees with given face-areas (PDF)

CS-2009-21
Title	Using A-patches to Tessellate Algebraic Curves and Surfaces
Authors	Stephen Mann
Abstract	This technical report expands on the A-patch tessellation work in the masters thesis of Curtis Luk [Luk08], improving on many of the techniques in his work and extending the A-patch constraints to allow a wider class of single sheeted Bernstein representations.
Date	June 11, 2009
Report	Using A-patches to Tessellate Algebraic Curves and Surfaces (PDF)

CS-2009-22
Title	A Survey of Incentive Mechanisms in Peer-to-Peer Systems
Authors	Muntasir Raihan Rahman
Abstract	The fundamental assumption that peer-to-peer (P2P) networks can thrive on voluntary contribution of altruistic peers can no longer be supported without considering the impact of rational behaviour on such decentralized systems. This paper attempts to shed light on the impact of rational free-riding behaviour of participating peers on the stability and existence of real-world peer-to-peer networks and the various attempts to cope with this problem. In particular, we focus on the economic principles that drive these problems, the various incentive mechanisms proposed to thwart these problems and analytical tools used to describe these rational manipulations in P2P systems.
Date	June 16, 2009
Report	A Survey of Incentive Mechanisms in Peer-to-Peer Systems (PDF)

CS-2009-23
Title	An Evaluation of Shape/Split Grammars for Architecture
Authors	Jingyuan Huang, Alex Pytel, Cherry Zhang, Stephen Mann, Elodie Fourquet, Marshall Hahn, Kate Kinnear, Michael Lam, and William Cowan
Abstract	Shape grammars have been used as a method for modelling buildings, both in architecture and in computer graphics. In this report, we survey some of the shape grammar papers, present some projects using shape grammars, and give our evaluation of shape grammars in computer graphics. In particular, we feel shape grammars are useful for generating a large variety of buildings, such as might be needed in computer games or animations, and that they could also be useful in design for exploring a large number of similar variations. However, we feel they are less useful for the design of new buildings or modelling of existing buildings.
Date	August 17, 2009
Report	An Evaluation of Shape/Split Grammars for Architecture (PDF)

CS-2009-24
Title	Approximation of Generalized Processor Sharing with Interleaved Stratified Timer Wheels - Extended Version
Authors	Martin Karsten
Abstract	This paper presents Interleaved Stratified Timer Wheels as a novel priority queue data structure for traffic shaping and scheduling in packet-switched networks. The data structure is used to construct an efficient packet approximation of General Processor Sharing (GPS). This scheduler is the first of its kind by combining all desirable properties without any residual catch. In contrast to previous work, the scheduler presented here has constant and near-optimal delay and fairness properties, and can be implemented with O(1) algorithmic complexity, and has a low absolute execution overhead. The paper presents the priority queue data structure and the basic scheduling algorithm, along with several versions with different cost-performance trade-offs. A generalized analytical model for rate-controlled rounded timestamp schedulers is developed and used to assess the scheduling properties of the different scheduler versions. Some illustrative simulation results are presented to reaffirm those properties. Index terms: communication systems, computer network performance, packet scheduling, data structures, algorithms
Date	September 17, 2009
Report	Approximation of Generalized Processor Sharing with Interleaved Stratified Timer Wheels - Extended Version (PDF)

CS-2009-25
Title	Effects of Target Size and Distance on Kinematic Endpoint Prediction
Authors	Jaime Ruiz and Edward Lank
Abstract	Because of the ubiquity of the WIMP paradigm, many researchers seek to design new pointing facilitation techniques for Fitts-style pointing tasks. However, many of these pointing facilitation techniques make one of two simplifying assumptions: either salient targets are sparsely placed on the display, or there exists some ability to identify the endpoint, the target, of a user's movement in real time. In this paper we extend previous work on kinematic endpoint prediction (KEP), a technique that uses models of user motion to predict endpoint in Fitts-style pointing tasks. We introduce a simplified algorithm to predict user endpoint. We present a technique to measure the numerical stability of endpoint predictions in real time. We show that the distance of motion has a significant effect on predictor accuracy. Finally, we develop an accurate understanding of the relationship between movement distance and predictor accuracy and show how we can use this understanding to infer accurate, real-time probability distributions on target sets within an interface. Together, these results allow KEP to be applied in new and novel ways to pointing facilitation techniques.
Date	September 25, 2009
Report	Effects of Target Size and Distance on Kinematic Endpoint Prediction (PDF)

CS-2009-26
Title	Perceptions and Practices of Usability in the Free/Open Source Software (FOSS) Community
Authors	Michael Terry, Matthew Kay, Ben Lafreniere
Abstract	This paper presents results from a study examining perceptions and practices of usability in the free/open source software (FOSS) community. 27 individuals associated with 11 different FOSS projects were interviewed to understand how they think about, act on, and are motivated to address usability issues. Our results indicate that FOSS project members possess rather sophisticated notions of software usability, which collectively mirror definitions commonly found in HCI textbooks. Our study also uncovered a wide range of practices that ultimately work to improve software usability. Importantly, these activities are typically based on close, direct relationships between developers and their core users, a group of users who closely follow the project and provide high quality, respected feedback. These relationships, along with positive feedback from other users, generate social rewards that ultimately serve as the primary motivations for attending to usability issues on a day-to-day basis in FOSS development. These findings suggest a need to reconceptualize HCI methods to better fit this culture of practice and its corresponding value system.
Date	October 15, 2009
Report	Perceptions and Practices of Usability in the Free/Open Source Software (FOSS) Community (PDF)

CS-2009-27
Title	Characterizing Large-Scale Use of a Direct Manipulation Application in the Wild
Authors	Ben Lafreniere, Andrea Bunt, John Whissell, Charlie Clarke, and Michael Terry
Abstract	Examining large-scale, long-term application use is critical to understanding the degree to which an application meets the needs of its user community. However, there has been limited published analysis of this type of data, none of which pertains to applications that support creating and modifying content using direct manipulation. In this paper, we present an analysis of two years of usage data from an instrumented version of the GNU Image Manipulation Program, including data from over 200 users. In the course of our analysis, we contribute to the body of knowledge on large-scale application use in three ways. First, we show that previous findings concerning the sparseness of command use and idiosyncrasy of users’ command vocabularies extend to a new domain and interaction style. Second, we demonstrate that direct manipulation applications require new analysis methods to determine command popularity. Finally, we describe the novel application of a clustering technique to characterize users’ higher-level tasks.
Date	October 15, 2009
Report	Characterizing Large-Scale Use of a Direct Manipulation Application in the Wild (PDF)

CS-2009-28
Title	Communicating Software Agreement Content Using Narrative Pictograms
Authors	Matthew Kay, Michael Terry
Abstract	This paper presents narrative pictograms, diagrams designed to convey the abstract concepts of a software agreement. Narrative pictograms arose out of a need to increase the chance that subjects of any culture or language could understand the purposes and intent of a consent agreement accompanying publicly available experimental software. Accordingly, the diagrams presented in this paper are designed to be used without the aid of explanatory text. We first present our iterative design process and initial formative evaluation of the diagrams. We then present example diagrams designed to describe the data collection policies of research software, and the more general composition rules used to create them. Finally, we demonstrate the diagrams’ ability to effectively communicate concepts by presenting results from an experimental study based on the ISO 9186-1 comprehension test for graphical symbols.
Date	October 15, 2009
Report	Communicating Software Agreement Content Using Narrative Pictograms (PDF)

CS-2009-29
Title	Exploring Usability and Learnability of Mode Inferencing in Pen/Tablet Interfaces
Authors	Matei Negulescu, Jaime Ruiz and Edward Lank
Abstract	The inferred mode protocol uses contextual reasoning and local mediators to eliminate the need to access specific modes to perform draw, select, move and delete operations in a sketch interface. In this paper, we describe an observational experiment to understand the learnability – whether the features are discovered independently – and the usability – user preference and frequency of use – of mode inferencing within a tablet-based sketch application. The experiment showed that those participants instructed in the interface features liked the fluid transitions between draw and lasso selection, but did not like click-select and delete inferencing. As well, interaction techniques were not self-revealing: Participants who were not instructed in interaction techniques took longer to learn about inferred mode features and were more negative about the interaction techniques. Together these results inform the future design of pen/tablet interfaces that seek to make use of computational intelligence in support of interaction.
Date	November 05, 2009
Report	Exploring Usability and Learnability of Mode Inferencing in Pen/Tablet Interfaces (PDF)

CS-2009-30
Title	EXPECT-K: Expanding Predictive Endpoint Cued Tablet Keyboard
Authors	Jaime Ruiz and Edward Lank
Abstract	Virtual keyboards are a common method of text entry for devices where a physical keyboard is not present. In this abstract we present EXPECT-K, a virtual keyboard that implements visual cues, endpoint prediction, and target expansion to increase text entry performance using a stylus on Tablet PC computers.
Date	October 7, 2009
Report	EXPECT-K: Expanding Predictive Endpoint Cued Tablet Keyboard (PDF)

CS-2009-31
Title	Speeding Pointing in Tiled Widgets: Understanding the Effects of Target Expansion and Misprediction
Authors	Jaime Ruiz and Edward Lank
Abstract	Target expansion is a pointing facilitation technique where the user's target, typically an interface widget, is dynamically enlarged to speed pointing in interfaces. However, with densely packed (tiled) arrangements of widgets, interfaces cannot expand all potential targets; they must, instead, predict the user's desired target. As a result, mispredictions will occur which may disrupt the pointing task. In this paper, we present a model describing the cost/benefit of expanding multiple targets using the probability distribution of a given predictor. Using our model, we demonstrate how the model can be used to infer the accuracy required by target prediction techniques. The results of this work are another step toward pointing facilitation techniques that allow users to outperform Fitts' Law in realistic pointing tasks.
Date	October 7, 2009
Report	Speeding Pointing in Tiled Widgets: Understanding the Effects of Target Expansion and Misprediction (PDF)

CS-2009-32
Title	Integer Programming Model for Automated Structure-based NMR Assignment
Authors	Richard Jang, Xin Gao, and Ming Li
Abstract	We introduce the “Automated Structure-based Assignment" problem: Given a reference 3D structure, a protein sequence, and its NMR spectra, automatically interpret the NMR spectra and do backbone resonance assignment. We then propose a solution to solve this problem. The core of the solution is a novel integer linear programming model, which is a general framework for many versions of the structure-based assignment problem. As a proof of concept, our system has generated an automatic assignment on a real protein TM1112 with 91% recall and 99% precision, starting from scratch. When we restrict ourselves to the special case where perfect peak lists are given, we are able to compare our results with existing results in the field. In particular, we reduced the assignment error of Xiong-Pandurangan-Bailey-Kellogg’s method by five folds on average, with over a thousand fold speed up. Our system also achieves 91% assignment accuracy on real experimental data for Ubiquitin. These results have direct practical implications. For example, in the protein design process, a protein is modified slightly and its structure is again measured by NMR experiments. Our method automates this process, saving time on tedious peak-picking and resonance assignment. As another example, when there is a homologous protein with known structure, our method increases the assignment accuracy and hence enables automated NMR structure determination
Date	October 14, 2009
Report	Integer Programming Model for Automated Structure-based NMR Assignment (PDF)

CS-2009-33
Title	Reconstructing hv-convex multi-coloured polyominoes
Authors	Adam Bains and Therese Biedl
Abstract	In this paper, we consider the problem of reconstructing polyominoes from information about the thickness in vertical and horizontal directions. We focus on the case where there are multiple disjoint polyominoes (of different colours) that are hv-convex, i.e., any intersection with a horizontal or vertical line is contiguous. We show that reconstruction of such polyominoes is polynomial if the number of colours is constant, but NP-hard for an unbounded number of colours.
Date	November 13, 2009
Report	Reconstructing hv-convex multi-coloured polyominoes (PDF)

CS-2009-34
Title	Mechanism design for Network Virtualization
Authors	Muntasir Raihan Rahman
Abstract	Recently network virtualization has been proposed as a promising approach to thwart the current ossifcation of the Internet by allowing multiple heterogeneous virtual networks (VN) to coexist on a shared infrastructure which itself is controlled by self-interested infrastructure providers. A major challenge in this respect is the VN embedding problem that deals with effcient mapping of virtual nodes and virtual links onto the substrate network resources. Most previous research on this problem has focused on designing heuristic and approximation algorithms for the VN embedding problem. However a common aspect of these previous results is that they assume that the different stake-holders in the network virtualization environment do not act in strategic ways.In this paper, we propose to utilize mechanism design to address this issue. Mechanism design is a branch of micro-economics that deals with protocols and algorithms for aligning the conflicting preferences of self interested agents with the global objective of a central designer. Specifically we show that the celebrated Vickrey Clarke Groves (VCG) mechanism can be used to find the optimal cost minimizing embedding of a virtual network on top of a substrate network, where different parts of the substrate network are owned by strategic agents. In the special case where each substrate link is owned by a separate agent, we show the exact formulation of the VCG payment function and develop simple algorithms for computing the payment functions based on shortest path algorithms. We also discuss extensions of the basic model in terms of more realistic economic and network models and also show how the VCG payment computation can be carried out in a distributed fashion.
Date	November 13, 2009
Report	Mechanism design for Network Virtualization (PDF)

CS-2009-35
Title	Design Principles for Robust Opportunistic Communication
Authors	S Keshav
Abstract	A mobile device can simultaneously increase its throughput and dramatically reduce energy and bandwidth usage costs by exploiting transient communication opportunities. We argue that this opportunistic communication mode will play a significant role in future mobile communication systems. We present some non-trivial applications that exploit opportunistic communication and their corresponding communication requirements. We outline the Opportunistic Communication Management Protocol, developed over the last four years at the University of Waterloo, that meets most of these requirements. We then focus on some design principles for robust opportunistic communication drawing from our experiences in developing and deploying several practical systems. We conclude with a sketch of some areas for future research.
Date	December 09, 2009
Report	Design Principles for Robust Opportunistic Communication (PDF)