Power in Data
Freshwater data helps us learn about our world. Here’s how Global Water Futures is improving the academic culture of research data management.
As we strive to better understand our world and its ecosystems, we collect and study data. Many environmental researchers collect primary data in the field using advanced technologies like sensor networks, machine learning, and AI, which give them more accurate and comprehensive information. That information can be used to create models for different situations to identify solutions and help make better decisions.
There is great power in data. At the same time, managing hundreds – even millions – of data points can be overwhelming. Ensuring the data is managed in a way that makes it useful to others, today and in the future, adds another, more complex layer to any research project.
Global Water Futures (GWF), one of the world’s largest university-led freshwater research programs, is committed to ensuring datasets collected by its research teams follow the FAIR data principles – that is, ensuring they are findable, accessible, interoperable, and reusable.
The FAIR principles are critical, says Jimmy Lin, Faculty Lead for the GWF data management and computer science core teams at the University of Waterloo. “By adhering to the FAIR data principles, researchers can enhance the overall quality, impact, and transparency of their research, while fostering open scientific collaboration.”
Bhaleka Persaud, Senior Research Data Management Specialist with Lin’s team, expands on this idea. “Most GWF studies help us understand how the climate is changing and impacting water and ecosystems. They inform decisions about how we manage and protect these resources. Following the FAIR principles means we can access, trust, and rely on that data.”
The GWF program is not only leaving behind a legacy of datasets, but also best practices, tools and resources in data management that researchers can continue to use for years to come – even after the program concludes, Persaud explains.
Determining a FAIR way forward
Committing to effective FAIR data management requires a framework, policies, infrastructure, and human resources. When GWF officially launched in 2017, however, the University of Waterloo (UW) did not have a research data management strategy.
Without a post-secondary policy to reference, GWF established a framework to accommodate the diversity of the program’s research themes and projects, while keeping a keen eye on the concurrent development of the Tri-Agency Research Data Management (RDM) Policy.
Today, this RDM policy provides guidance for managing data for post-secondary research that is funded by Canada’s federal research granting agencies. Approved in 2021, the policy required institutions to create their own research data management strategies. GWF’s dedication to FAIR principles, coupled with the implementation of its own RDM policy, positioned the University of Waterloo to be ahead of the curve.
“The experience of implementing GWF’s robust RDM policy helped inform the development of the University of Waterloo’s RDM institutional strategy, providing real-life situational knowledge,” says Kathy Szigeti, University of Waterloo Librarian and Data Management Specialist. “The GWF team had already ground-truthed its own framework with academic researchers and worked through many of the challenges throughout the RDM cycle for the water discipline.”
Opening access to data
Even with policies and systems in place, FAIR data is not a guarantee.
“The acquisition of data sometimes hinges on connections and a stroke of luck. However, even when you obtain the data, sometimes it is not readily reusable,” Persaud says. “Many of our researchers have to clean and format data before they can conduct any meaningful analysis. Unfortunately, it’s a time-consuming necessity.”
Thanks in part to its RDM policy, the GWF team has seen a shift toward more open and meticulously described research data, she says.
But making data FAIR requires more than a policy, Persaud adds. “We’ve learned that there needs to be a cultural shift toward valuing these principles, and that we need knowledgeable personnel to support making it happen.”
As FAIR principles become better supported and more widely accepted, change is happening. “I see acceptance increasing every day in our work with GWF researchers. When data is accessible, we can work together to transfer knowledge, technologies, and ideas not only in Canada but also across national borders,” Persaud says.
Unleashing the FAIR advantage
Ensuring FAIR data means thinking about how to collect, organize, and store data from the beginning of a research project. Since data management is still evolving in Canada, educating researchers and building awareness is a big part of the GWF team’s work.
To support its researchers, GWF’s data management team offered training through webinars, workshops, peer-reviewed articles, and personal support. The team also created guidance materials and new tools, including a data management template for water research.
“Researchers often know that it is important to have a data management plan, but they may not always know where to start. The template provides guidelines for researchers to consider as they develop their plans for water-related projects,” Persaud says.
Researchers are seeing how FAIR data can transform their work. “The GWF RDM policy enables us to be more transparent, efficient, and impactful in our research. It also led to the development of a specific data management plan for our research group which is now one of Canada’s national exemplars,” says Steph Slowinski, a research biogeochemist in the University of Waterloo’s Ecohydrology Research Group (ERG).
Shaping the future of data management
Persistently working toward FAIR data, troubleshooting challenges, and keeping ahead of the curve have earned GWF a reputation for excellence in the research data management community in Canada.
“We often point to GWF as a trailblazer. GWF is an enviable model, and one that has been and will continue to have a positive impact on data management, data publication, and long-term stewardship of Canadian research data across disciplines if it were to be implemented more widely,” says Erin Clary, Curation Coordinator with the Digital Research Alliance of Canada (the Alliance), a non-profit funded by the federal government to build and sustain a strong and vibrant research data infrastructure ecosystem in Canada.
GWF has been actively engaged in national-level conversations with groups like the Alliance, DataStream, the Water Institute, and university libraries to advance RDM for water science in Canada. The program also provides strategic direction and feedback as part of Alliance-led expert working groups focused on advancing FAIR data practices in Canada.
On providing consistent feedback to the Alliance and UW’s data management policy committee, Lin says we don’t need to reinvent the wheel.
“The science and data management practices are still evolving, but we have the systems and infrastructure we need to get us to start to foster change in RDM practices,” he says. “Our focus is to continue to make existing solutions stronger and better, and ensuring we have the human resources to make water science data more accessible in Canada.”