Released on April 1, 2021
Join guest host Dr. Emma Griffiths as she talks with Dr. Finn McGuire and Dr. William Hsiao about the SARS-CoV-2 genomics epidemiology efforts in Canada.
For more information, visit the CanCOGeN website.
The podcast discusses the challenges of, and frameworks for, harmonizing metadata in microbial bioinformatics, particularly in the context of the National Genomic Surveillance Database and the CanCOGeN initiative.
The metadata management plan for CanCOGeN consists of three tiers.
Synchronizing metadata is challenging because jurisdictions define non-identifiable information differently, and legal consultation was needed to clarify these definitions.
A long list of relevant metadata for SARS-CoV-2 sampling was established within the PHA4GE consortium (Public Health Alliance for Genomic Epidemiology) and adopted in CanCOGeN.
The metadata include the reason for sequencing and the conditions of sample collection, both of which are critical for epidemiological interpretation.
Effort goes into making the metadata publicly available under an open-access framework, so that others can contribute to improving the standards.
The podcast mentions the development of a tool called the DataHarmonizer, designed to standardize data collection, validate entries, and ensure compatibility with public repositories and national reporting.
The conversation highlights data-sharing challenges in Canada due to differing privacy laws among provinces, affecting the complete and timely sharing of genomic data with global repositories like GISAID.
Solutions for the future involve improving the social science aspect of data sharing to align technical infrastructure with public opinion and ethical considerations.
The speakers recommend centralized data curation and analysis, with enough flexibility for decentralized systems to function through interconnected data sharing and expertise.
They also emphasized quality control metrics and good communication as ways to overcome data-sharing hurdles in decentralized health systems.
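To make the validation step mentioned for the harmonization tool more concrete, here is a minimal Python sketch of checking one contextual-data record against required fields and a controlled vocabulary. It is not the tool's actual implementation or API. The field names (`sample_id`, `collection_date`, `purpose_of_sequencing`) and the vocabulary terms are hypothetical placeholders chosen for illustration; the real specification defines its own fields and picklists.

```python
from datetime import date

# Hypothetical required fields; the real standard defines its own list.
REQUIRED_FIELDS = ("sample_id", "collection_date", "purpose_of_sequencing")

# Hypothetical controlled vocabulary for the reason a sample was sequenced.
ALLOWED_PURPOSES = {
    "baseline surveillance",
    "cluster/outbreak investigation",
    "travel-associated surveillance",
}


def validate_record(record):
    """Return a list of human-readable problems found in one metadata record."""
    errors = []

    # 1. Every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            errors.append(f"missing required field: {field}")

    # 2. Collection dates must be ISO 8601 (YYYY-MM-DD) so labs agree on format.
    raw_date = str(record.get("collection_date", "")).strip()
    if raw_date:
        try:
            date.fromisoformat(raw_date)
        except ValueError:
            errors.append(f"collection_date is not an ISO date: {raw_date!r}")

    # 3. Free-text purposes are rejected in favour of the shared vocabulary.
    purpose = str(record.get("purpose_of_sequencing", "")).strip().lower()
    if purpose and purpose not in ALLOWED_PURPOSES:
        errors.append(f"purpose_of_sequencing not in vocabulary: {purpose!r}")

    return errors


if __name__ == "__main__":
    good = {
        "sample_id": "S1",
        "collection_date": "2021-04-01",
        "purpose_of_sequencing": "Baseline surveillance",
    }
    bad = {
        "sample_id": "",
        "collection_date": "01/04/2021",
        "purpose_of_sequencing": "curiosity",
    }
    print(validate_record(good))
    for problem in validate_record(bad):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a submitting lab see and correct every issue in a record at once, which suits a workflow where many sites contribute data.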