The following is a condensed article by Bill Bean; see original at https://uwaterloo.ca/news/artificial-intelligence-needs-good-data-grow-future
Data, the currency of the future, has nearly boundless opportunities, but a common standard is needed for how it is created and used, said Anil Arora, Chief Statistician of Canada, in his keynote address at the Fall Industry Day on Monday, November 28, hosted by the Waterloo Artificial Intelligence Institute (Waterloo.AI) and Communitech.
Titled “Data — The Fuel for AI,” the day-long event at Waterloo’s Fed Hall brought together more than 350 business leaders, data scientists, academics and government officials from around the country, both in person and online.
“Since launching in 2018, Waterloo.AI’s multidisciplinary research teams have been collaborating with industry to develop intelligent systems. This hybrid event provides an opportunity to bring together our leading researchers and industry partners to discuss emerging trends and the future of Canada’s data network,” said Harold Godwin, managing director of Waterloo.AI.
Anil Arora told the audience that every company in the world will be buying and selling data in the future, but the quality of that data has to be assured.
He offered Statistics Canada as a trusted steward of that data, with a century of experience in data collection and use. Over that century the agency has evolved from simply surveying citizens to integrating data from sources, such as satellite imagery, that were not considered a generation ago. To gather and share that data, partnerships are crucial, he said.
Using the satellite imagery example, Anil Arora explained how data scientists can detect crop types from those images, then run models to predict crop yields or measure the water stress on plants. The pandemic accelerated the collection and use of data around the world, he said, making it critical to establish a common language for data sharing.
“Data is a team sport,” he said, and Statistics Canada continues to seek new partners in the private sector, governments and academia to establish those common standards.
Anil Arora emphasized that the role of Statistics Canada is to stimulate, but not to compete. He said that there would be messiness and some mistakes going forward, but declared that change agents can’t stay in their own lanes. Co-operation and collaboration are key to their success.
He cited the case of applying AI insights to the rising tide of opioid deaths. When timelines of victims were examined, it was found that a significant number of them were in the construction industry. A workplace injury might lead to an opioid addiction. Knowing this at the point of treatment could save a life. This kind of data use can demonstrate to Canadians the value proposition of working together.
Those issues, and others, were explored in the late-morning panel on “Data Trends, Opportunities and Challenges,” led by Waterloo.AI co-director and Cheriton Chair Jimmy Lin, Waterloo.AI co-director and ECE Professor Vijay Ganesh, Anil Arora, and Reem Al-Halimi, Chief Data Scientist at NAVBLUE, an Airbus company.
Professor Ganesh led the questions by asking how Canada ranks beside the U.S. and EU in data legislation. Arora noted that the government is revising legislation written decades ago, and said it is a tricky balance between staying relevant in an evolving context and not hindering industry’s ability to innovate. Building a consensus can be challenging: “It’s a bit of a messy art.”
Reem Al-Halimi suggested that the EU’s General Data Protection Regulation (GDPR) needs to be understood by all employees who work with data, and said companies need a process or team to respond to GDPR-related changes.
Asked how one ensures compliance, Professor Lin noted that there is little about data ethics in current university curricula, and that the data science community is still “in the deer in the headlights stage.” It was only a few years ago that data scientists came to understand that models could be biased by gender or race, for instance. There has to be training in data ethics for upcoming data scientists, he said, not just to track the things one would expect to go wrong, but also to monitor the aspects that one doesn’t expect to go wrong.
Professor Lin addressed a common concern that AI is replacing human control by noting that AI is not a substitute for human creativity; rather, it is an aid to it.
Anil Arora echoed that, noting that good-quality data can be used to understand the inequities in contemporary infrastructure. “The opportunities are endless.” Old problems can be looked at with the power of today, to reveal relationships not seen before. “The opportunities are truly exciting.”