Released on March 31, 2022
Dr. Erin Young and Dr. Kelsey Florek recently joined us to discuss StaPH-B, a U.S. state public health bioinformatics group, and provided insights into the popular SARS-CoV-2 pipeline, Cecret.
Kelsey Florek explained that StaPH-B was created to facilitate collaboration between bioinformaticians in state public health laboratories. The group is particularly helpful for those who are new to sequencing and to interpreting the data it generates. It provides a communication and expertise network across laboratories and contributes to projects funded by the NIH, CDC, and other grant agencies.
Erin Young highlighted the diverse membership of StaPH-B, which offers excellent learning opportunities. With nearly 400 members and over 50 channels focused on bioinformatics, StaPH-B uses a Slack workspace to provide a valuable resource where bioinformaticians can ask questions and share ideas.
When asked about membership, Kelsey clarified that while StaPH-B was initially founded for state public health bioinformaticians, it is open to everyone. However, the content is focused on state public health activities. Key achievements discussed include the Slack workspace, collaborations on GitHub, Docker, and the development of collaborative workflows.
StaPH-B's training activities, including the StaPH-B Toolkit, training sessions, and videos, ensure that knowledge and expertise are shared effectively across the community.
The discussion moved to the Cecret pipeline, one of Erin’s bioinformatics pipelines for SARS-CoV-2. When the pipeline was developed during the pandemic, the original intention was to use the ARTIC Network's protocol for sequencing SARS-CoV-2 on the Nanopore sequencing platform. However, Erin required an Illumina-based pipeline, as sequencing SARS-CoV-2 on the MiSeq was preferable to the Nanopore platform. Cecret was developed with BWA as the default aligner and is intended for viral sequencing with a known, reliable reference.
Erin highlighted the Cecret pipeline tutorials and the monthly videos produced by StaPH-B, which outline various state laboratory projects, as useful resources for those entering the field.
An earlier conversation covered how COVID genome analysis workflows have evolved as the amount of data being analyzed has grown. Workflows such as Cecret, nf-core, Monroe, and the Nextflow ARTIC pipeline were mentioned for their distinct features and popularity.
Erin, the creator of Cecret, shared her initial apprehension about making her workflow public and her diligence in tracking changes in her repository to ensure scientific validity. Over time, the workflow has improved gradually and accumulated fewer bugs, following a consistent trajectory without dramatic shifts. The name "Cecret" was inspired by a meaningful hiking landmark in Northern Utah.
The speakers emphasized the necessity of managing and connecting the increasing amounts of COVID data to public health efforts.
In conclusion, StaPH-B and workflows like Cecret are playing significant roles in bioinformatics and COVID genome analysis. Collaborations and resources like StaPH-B are essential for sharing knowledge and expertise among laboratories, which is crucial for the success of projects funded by organizations such as the NIH and CDC.
Microbial Bioinformatics Insights
StaPH-B (State Public Health Bioinformatics Workgroup)
Collaboration and Resource Sharing
Workflow Development for SARS-CoV-2 Sequencing
Challenge of Workflow Adaptations
Community Feedback and Workflow Evolution
Shifts in Sequencing Practice
Training and Knowledge Dissemination
Innovative Tool Development and Usage
Challenges Faced