Speaker: Verena Kantere, University of Ottawa
Abstract: Big Data analytics in science and industry are performed on a range of heterogeneous data stores, both traditional and modern, and on a diversity of query engines. Workflows are difficult to design and implement since they span a variety of systems. To reduce development time and processing costs, some automation is needed. In this talk we will present a new platform to manage analytics workflows.
The platform enables workflow design, execution, analysis and optimization with respect to time efficiency, over multiple execution engines. Such configurations are emerging as a common paradigm for combining analysis of unstructured data with analysis of structured data (e.g., NoSQL plus SQL). We focus on the usability of the platform by users with varying levels of expertise, the automation of the analysis and optimization of execution, as well as the effect of optimization on workflow execution. The platform also performs multi-workflow optimization and workflow recalibration. The talk will finish with some plans for future research on data management optimization on hybrid infrastructures, i.e. infrastructures that comprise multiple sites and heterogeneous parts, and that combine private clusters and public resources.
Bio: Verena Kantere is an Associate Professor at the School of Electrical Engineering and Computer Science (EECS) at the University of Ottawa (UOttawa). Previously, she was an Assistant Professor at the School of Electrical and Computer Engineering (ECE) of the National Technical University of Athens (NTUA) and a Maître d’Enseignement et de Recherche at the Centre Universitaire d’Informatique (CUI) of the University of Geneva (UniGe). She has been working towards the provision of data services in large-scale systems, such as cloud systems, focusing on the management of Big Data and the performance of Big Data analytics, by developing methods, algorithms and fully fledged systems. Before coming to UniGe she was a tenure-track junior assistant professor at the Department of Electrical Engineering and Information Technology at the Cyprus University of Technology (CUT). She received a Diploma and a Ph.D. from the National Technical University of Athens (NTUA) and an M.Sc. from the Department of Computer Science at the University of Toronto (UofT), where she also started her Ph.D. studies. After completing her Ph.D. she worked as a postdoctoral researcher at the École Polytechnique Fédérale de Lausanne (EPFL). During her graduate studies she developed methods, algorithms and fully fledged systems for data exchange and coordination in Peer-to-Peer (P2P) overlays with structured and unstructured data, focusing on problems of data heterogeneity, query processing and rewriting, multi-dimensionality and the management of continuous queries. She has also worked in the field of the Semantic Web, on problems of semantic similarity, annotation, clustering and integration.