Speaker: Michael Cormier, PhD Candidate
In this talk I will present recent advances in our vision-based web page segmentation method: an edge-based, Bayesian image segmentation algorithm that estimates the semantic structure of a web page from an image of the page. The image of the page is, after all, what was designed to convey the semantic structure to users; the source code is merely designed to produce the correct appearance. Assistive technology (such as screen readers for visually impaired users, or tools that declutter web pages for users with cognitive impairments) may therefore be less than ideal if it relies on the source-code structure to infer the semantic structure of the page. Moreover, guidelines for designing web pages to be more helpful to users with assistive needs are not always followed. This talk focuses on recent developments in the design of our algorithm and the statistical models used to infer the semantic structure, as well as on a series of experiments involving a prototype assistive interface. The assistive interface has been tested offline, against partial ground-truth segmentations, and online, with older adult users. In all, we offer an approach that leverages methods from computer vision to give users with assistive needs a better experience with web pages. We also contribute a methodology for validating vision-based models that is useful in the context of a study with participants.
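To give a flavour of the kind of technique the talk describes, the following is a minimal, hypothetical sketch of edge-based Bayesian reasoning on image data. It is not the speaker's actual algorithm: it works on a single 1-D row of pixel intensities, uses a simple intensity difference as the edge signal, and applies Bayes' rule with assumed Gaussian likelihoods (all parameters here are illustrative) to score each position as a segment boundary.

```python
# Hedged sketch, NOT the presented method: a toy edge-based Bayesian
# boundary detector on a 1-D row of pixel intensities.
import math

def gaussian(x, mu, sigma):
    """Gaussian probability density, used as an assumed likelihood model."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def boundary_posteriors(row, p_boundary=0.1):
    """P(boundary | edge strength) at each interior position, via Bayes' rule.
    Likelihood parameters below are illustrative assumptions, not fitted values."""
    posts = []
    for i in range(1, len(row)):
        g = abs(row[i] - row[i - 1])               # edge strength (intensity gradient)
        lik_b = gaussian(g, mu=80.0, sigma=40.0)   # assumed P(g | boundary): strong edges
        lik_n = gaussian(g, mu=0.0, sigma=15.0)    # assumed P(g | no boundary): weak edges
        evidence = lik_b * p_boundary + lik_n * (1 - p_boundary)
        posts.append(lik_b * p_boundary / evidence)
    return posts

# A row with two sharp intensity jumps (at positions 3 and 6).
row = [10, 12, 11, 200, 198, 202, 60, 61]
posts = boundary_posteriors(row)
boundaries = [i + 1 for i, p in enumerate(posts) if p > 0.5]
print(boundaries)  # → [3, 6]
```

The real method operates on full 2-D page images and infers a hierarchical semantic structure rather than a flat list of edges, but the same Bayesian pattern applies: combine a prior over boundaries with a likelihood model of observed edge evidence.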