
MicroBinfie Podcast, 84 Bioinformatics in the noughties with Mark Pallen

Released on June 9, 2022


Mark Pallen describes the excitement surrounding the field of microbial bioinformatics at the turn of the millennium, as scientists began to obtain genomes from model organisms and dangerous pathogens for the first time. He recounts collaborating with his hero, David Relman, on the genome sequencing of the unusual slow-growing organism, Tropheryma whipplei, in a race against a French team.

In late 1999, Mark moved to Belfast and began collaborating with another Englishman based in Ireland, Tim Foster, who was working in Dublin. Pallen describes the exhilarating experience of using PSI-BLAST to identify new sortases and sortase substrates across various new genomes, likening it to the addictive nature of "crack cocaine." He references philosopher Alfred North Whitehead, quoting that the aim of every scientist is "to seek simplicity but distrust it." Pallen discovered that sortases in most organisms behaved quite differently from the simple model observed in Staphylococcus aureus. He made similar discoveries regarding WXG100 proteins and type VII secretion, which he found in several new contexts distinctly different from the original context of ESAT-6 as an antigen in Mycobacterium tuberculosis. Pallen notes that even twenty years later, we still don't fully comprehend the role of ESAT-6.

The focus of Pallen's research then shifted to Escherichia coli, where he described vestigial gene clusters responsible for non-functional type III secretion systems in this model organism. He realized that E. coli K-12 was not inherently special as a model organism but was merely another strain of E. coli. Many of the earliest genomes sequenced came from worn-out lab strains. To address this issue, Gordon Dougan at the Wellcome Trust Sanger Institute shifted the focus toward genome-sequencing freshly isolated, minimally passaged isolates. Alongside Brendan Wren, Pallen authored a review article for Nature, highlighting the importance of adopting an eco-evo perspective when interpreting bacterial genomes.

During this period, Scott Beatson joined Pallen's group. Mark succeeded in convincing Scott to study type III secretion in E. coli instead of Pseudomonas aeruginosa. This work led to the discovery of dozens of new type III secretion effectors, integrating bioinformatics and laboratory work, and culminating in a publication in PNAS.

For more details, visit the source.

Extra notes

  • The podcast features Professor Mark Pallen, who discusses significant contributions and developments in microbial bioinformatics during the first decade of the 21st century.

  • Genome Sequencing: There was a notable surge in microbial genome sequencing projects around the turn of the millennium, leading to new discoveries in well-studied organisms like E. coli.

  • Project Involvement: Professor Pallen was involved in genome sequencing projects for bacteria such as Corynebacterium diphtheriae and Tropheryma whipplei, the latter requiring innovative culture techniques because it could not be grown by standard laboratory methods at the time.

  • 16S rRNA Sequencing: The identification of Tropheryma whipplei as a distinct organism was initially based on 16S rRNA sequencing, highlighting the importance of this method in classifying microbial species.

  • Homology Searches: Pallen emphasized using homology searches to explore new genomes and to identify unknown genes and pathways in them. This method proved crucial in uncovering new sortase substrates in Staphylococcus aureus genomes, helped by iterative tools like PSI-BLAST.

  • Sortase Systems: Analysis revealed that sortase substrates were often clustered with their enzymes in gene clusters, contrary to the prior belief that they were scattered. This was significant for understanding the functional genomics of Staphylococcus and other bacteria.

  • Bioinformatics Tools: The iterative tool PSI-BLAST, which Pallen likened to "crack cocaine" for bioinformaticians, was used extensively to identify a wealth of new homologs, revealing pathways and functions not previously anticipated.

  • Comparative Genomics: The work on type III secretion systems in E. coli and other bacteria demonstrated the power of comparative genomics to reveal vestigial genes and to give insights into microbial evolution.

  • Reverse Vaccinology: Although not explicitly discussed, the techniques and insights covered in the episode, such as antigen discovery and functional genomics, are precursors to fields like reverse vaccinology, which emerged alongside early genomic studies to predict potential vaccine candidates.

  • Sequencing Model Strains: The initial focus on sequencing model strains such as E. coli K-12 raised the question of whether laboratory strains accurately represent wild-type organisms, and led to calls for sequencing diverse, minimally subcultured isolates from nature.

  • Eco-Evolutionary Context: An eco-evolutionary framework was recommended for interpreting genomic data, considering both the evolutionary ancestry of strains and their ecological roles.

  • Involvement of Eukaryotic Hosts: The discussion acknowledged the ecological interactions between bacteria and eukaryotic hosts, emphasizing the relevance of studying microbial genetics in the context of broader ecological relationships.

These insights underscore how the tools and methods of microbial bioinformatics evolved during this period and became foundational to the field's advancement.
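The iterative homology searching mentioned in the notes above can be illustrated with a toy sketch. This is not PSI-BLAST itself: it assumes ungapped, equal-length sequences, a uniform background, and a fixed score cut-off, whereas the real tool uses gapped alignments, sequence weighting, and E-value thresholds. The function names, the sequences, and the threshold are all hypothetical and chosen only to show the core idea, namely that hits from one round are folded into a position-specific profile that can then detect more distant relatives in the next round.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(seqs, pseudocount=0.1):
    """Build per-column log-odds scores against a uniform background.

    Assumes all sequences are pre-aligned and of equal length (toy setting).
    """
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Background frequency is 1/20, so odds = freq * 20.
            column[aa] = math.log2(freq * len(AMINO_ACIDS))
        profile.append(column)
    return profile


def score(profile, seq):
    """Sum the column scores for one ungapped sequence."""
    return sum(column[aa] for column, aa in zip(profile, seq))


def iterative_search(query, database, threshold, max_rounds=5):
    """Rebuild the profile from current hits and re-search until stable."""
    hits = {query}
    for _ in range(max_rounds):
        profile = build_profile(sorted(hits))
        new_hits = {query} | {s for s in database if score(profile, s) >= threshold}
        if new_hits == hits:
            break  # converged: no sequences gained or lost
        hits = new_hits
    return hits


if __name__ == "__main__":
    # Hypothetical sequences: a close relative, a more distant one, and noise.
    database = ["MKTLLVSG", "MRTLIVSG", "WWPPGGHH"]
    print(sorted(iterative_search("MKTLLVAG", database, threshold=15.0, max_rounds=1)))
    print(sorted(iterative_search("MKTLLVAG", database, threshold=15.0)))
```

In this toy data set, a single round finds only the close relative, while further rounds also recover the more distant sequence because the profile has broadened. The same property is what makes iterative searching prone to drifting onto unrelated sequences if a false hit is ever admitted, so real searches need their hits inspected at each round.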

Episode 84 transcript