From a CANARIE News Release:
CANARIE today named the nine successful recipients of its Research Data Management (RDM) funding call, announced in May 2018. This new funding will allow research teams to develop software components and tools that help Canadian researchers adopt best practices in managing data resulting from scientific research.
Data management practices impact the entire research lifecycle, from project planning and execution, to backing up data as it is created and used, and finally to its long-term preservation after the investigation is complete. RDM best practices help ensure the protection of data during the research lifecycle and beyond, and help meet the increasingly stringent requirements of research ethics and reproducibility.
The RDM stakeholder community’s broad engagement in CANARIE’s January 2018 consultation identified the priorities of this funding call.
This funding is part of the Government of Canada’s $105 million investment supporting CANARIE through its 2015-2020 mandate.
Research Teams Awarded Funding
The following research projects will receive funding through this call. These projects contribute both to the priorities identified by the RDM stakeholder community (enriching [meta]data and discovery, federated repositories / interoperability, domain-specific repositories, data deposit and curation, preservation, persistent IDs / citability, data access and analytics, and data privacy and security) and to the FAIR Principles: Findability, Accessibility, Interoperability, and Reusability of research data.
- Canadian Health Omics Repository, Distributed (CanDIG CHORD) – Led by Dr. Guillaume Bourque, McGill University
CanDIG is a national project that allows collaborative analysis of human health genomics data distributed across the country, giving stewards of this data complete, auditable control over data access. The CHORD project will create a federated Canadian national data service for privacy-sensitive genomic and related health data. It will also broaden the Canadian health research community’s access to the technologies and services being built by CanDIG and its international partners in the Global Alliance for Genomics and Health.
- Dataverse for the Canadian Research Community – Led by Kate Davis, University of Toronto
Dataverse (DV) is an open-source research data repository platform, developed by Harvard University’s Institute for Quantitative Social Science with adopters and contributors from Canada, the US, and Europe. Dataverse was originally architected to serve the needs of social science researchers with small to medium-sized data files. This project will adapt its software architecture to address the needs of a broad range of researchers in Canada through improved scalability, support for large data files, curation workflows, and integration with Canadian storage and authentication providers.
- DuraCloud – Linking Data Repositories to Preservation Storage – Co-led by Corey Davis, Council of Prairie and Pacific Research Libraries; and Stephen Marks and Kate Davis, University of Toronto
Canadian researchers have access to many storage services suitable for the long-term preservation of digital content, including research data. The project will connect several Canadian preservation storage services via DuraCloud, software maintained by the DuraSpace Foundation. As a result, Canadian researchers will be able to seamlessly access different storage services through a single interface.
- FAIR Repository for Annotations, Corpora and Schemas (FRACS) – Led by André Lapointe, CRIM
Artificial intelligence-based applications require access to massive quantities of data. To enable Canada’s academic researchers to scale their AI-based projects such that they are competitive with private sector applications, large volumes of data must be coupled with detailed annotations. Annotated datasets allow models to be effectively trained and validated by machine learning algorithms.
The FRACS project will simplify the management of large-scale datasets by facilitating the creation, storage, search, manipulation and sharing of their annotations.
- Federated Geospatial Data Discovery for Canada – Co-led by Eugene Barsky, Evan Thornberry, and Paul Lesack, University of British Columbia Library
Traditionally, research data repositories have relied on text-based searching. However, there is increasing demand for geographic components in research, examples of which include migration paths, the distribution of agricultural yields, infrared satellite imagery, the distribution of artifacts in an archaeological site, and the flow routes of water. The goal of this project is to create an extensible, open-source software method to search and discover Canadian geospatial research data using an interface specifically designed for maps, enabling users to discover geospatial resources in a more spatially intuitive way.
- Making Identifiers Necessary to Track Evolving Data (MINTED) – Led by Reyna Jenkyns, Ocean Networks Canada (ONC), University of Victoria
ONC operates world-leading ocean observatories and dynamic data repository services. While the introduction of the FAIR Principles has made evident a growing recognition of the benefits of, and need for, data citations, existing platforms and tools can currently serve only static or infrequently updated datasets.
The MINTED project will integrate best practices for dynamic dataset citation, Digital Object Identifiers (DOIs), and researcher ORCIDs into ONC’s Oceans 2.0 digital infrastructure.
- Radiam: Management Software for Active Research Data – Led by Dr. Kevin Schneider, University of Saskatchewan
Research data, which may have value beyond the research for which it was collected, is often distributed across multiple storage devices, tools, and platforms. Simply knowing that a dataset exists, let alone finding it, presents a significant challenge. Radiam will provide a project-level metadata index of research data, regardless of where or how it is stored. Radiam will improve researchers’ ability to find and cite existing datasets by storing not only the location of the data but also the standard and custom metadata records associated with it.
- Managing the Research Data Lifecycle using Islandora – Co-led by Donald Moses and Rosemary Le Faive, University of Prince Edward Island (UPEI)
In collaboration with Simon Fraser University and the Islandora Foundation, UPEI will build research data management capacity and integrations using the latest version of Islandora, also known as CLAW. Islandora is an open-source software framework designed to help organizations collaboratively manage, discover, and share digital assets using a best-practices, standards-based approach. The project will develop integrations with identifier, metadata, authentication, storage, and dissemination systems, supporting the FAIR principles and the research data lifecycle.
- Research Portal for Secure Data Discovery, Access and Collaboration – Co-led by Dr. Elizabeth Theriault, Ontario Brain Institute; and Moyez Dharsee, Indoc Research
The Ontario Brain Institute (OBI) and Indoc Research have developed Brain-CODE, an extensible neuroinformatics platform designed to manage the collection, curation, analysis and sharing of different data types across several brain disorders.
To address the RDM needs of researchers studying disorders of the brain and other disease areas, this project will develop data portal software that will enable research teams to securely and seamlessly capture, query, and visualize patient data; collaborate and share datasets; and access support and training resources. The project will serve the needs of teams using Brain-CODE as well as those from collaborating institutions and the broader medical research community.
The projects funded through this call are on track to be completed before April 2020.