Hello, and thank you for listening to the MicroBinfie Podcast. Here, we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There is so much information we all know from working in the field, but nobody really writes it down. There's no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil-Fareed Alikhan and Professor Andrew Page. Nabil is a Senior Bioinformatician at the Centre for Genomic Pathogen Surveillance, University of Oxford. And Andrew is the Director of Technical Innovation for Theiagen in Cambridge, UK. I am Dr. Lee Katz, and I am a Senior Bioinformatician at the Centers for Disease Control and Prevention in Atlanta in the United States.

Hello and welcome to the MicroBinfie Podcast. We have a special episode today where we are actually sitting in the same room as a few people from CDC. We have here today Christine Lee, Sinead Waters, and Megan Mickham from CDC. And Andrew's on the line. Nabil couldn't be here today, unfortunately. We thought we'd talk a little bit about outbreaks. And honestly, I've been waiting for them to be published so we don't get in trouble for talking about insider government information. So welcome, Christine. You're head of the LYVE Lab, and that also includes things like cholera. And Sinead is also on your team, the bioinformatician, the hero, sorry, hero to us, sorry, you're both heroes. But this is a bioinformatics podcast, so I have to bring that up. And I don't know, let's just get into it a little bit. So you work at the CDC. Just tell me about what the LYVE Lab is and kind of what your day-to-day is, Christine.

Yeah, LYVE stands for Listeria, Yersinia, Vibrio, and other Enterobacterales. My day-to-day, depending on what I'm doing, can be, yeah, it's a lot. But we're part of the National Reference Laboratories and the Enteric Diseases Laboratory Branch.
And I think the most important commitment we have is to our state and local public health partners here in the United States. Foodborne illness is a priority of our division and our branch, the Enteric Diseases Laboratory Branch. What we try to do is understand any kinds of emerging trends and patterns and help our state and local public health partners characterize these organisms, especially if they come across some unusual or weird organisms. Cholera is not endemic here in the United States, but we oftentimes get travel importation cases, and the causative agent of cholera is a special kind of Vibrio cholerae, which we could get into later. Vibrio cholerae is present in the United States domestically, though it's not the kind that we observe in global settings. And we can talk a little bit more about the global epidemiology of the cholera strains that in the past couple of years have caused a resurgence in areas where cholera had not been detected for many years.

Awesome. And then you have a bioinformatician in your lab, and that makes you pretty lucky, at least in my opinion.

I was intentional about that. Sinead Waters is here. She's really talented. I'll let her introduce herself.

Thank you, Christine. So, hi. Yes, I'm Sinead Waters. I am the bioinformatician for the LYVE Lab. My role, I guess, in the simplest words, is that I analyze the whole genome sequencing data of the organisms Listeria, Yersinia, Vibrio, and Enterobacterales. I am also a certified CLIA tester, so I get to actually run the wet lab protocols for whole genome sequencing, so I can sequence the DNA of bacteria.

That's awesome. I didn't know that you did CLIA, too. That's actually a big deal.

Yeah. Wow. Thanks to Christine, I get to do a lot of, you know, trying out different things in the lab, so, wow.

So, Sinead is an expert about everything whole genome sequencing, from soup to nuts, and about enterobacteria.
CLIA, for those of you who are not familiar, is a way that we can conduct patient-level reporting. Because it involves PII, anybody who is involved in CLIA testing follows a strict set of, how do you say, ethics and patient identity protocols to ensure that the data are reproducible and of the best quality, that we keep patient identifiers private, and that we conduct all of our testing with the utmost respect. Because these are clinical cases, these patients come to the hospital really sick, and we want to make sure that we do our very best to really understand the pathogen that was the causative agent of their illness.

All right. So, I've been really excited to get to the actual topic, so I'm going to jump into it. We wanted to talk about cholera today and the Haiti cholera outbreak. A little bit of history for myself is that I was brought into the LYVE Lab originally more than 10 years ago to look at the Haiti cholera outbreak. So this is sort of a joke question, but a leading question: did we already have a Haiti cholera outbreak?

Yeah. In 2010, there was an earthquake in Haiti that caused devastating environmental damage and, how do you say, wreaked havoc on the water infrastructure systems. In the aftermath of that came the emergence of a really deadly strain of Vibrio cholerae that affected thousands of people for many years. And Lee, I think you were boots on the ground in the lab doing the analysis for this, and I think this outbreak in 2010 was the first time whole genome sequencing was used to solve an outbreak. Am I understanding that correctly?

Um, there might have been some other smaller cases, but this was definitely a very early, high-profile case, for sure.

Can I ask, how did you link or do source attribution for that? Because I remember there was a big controversy over it.

There was a big controversy.
At the time, the opinion of CDC was that we weren't sure where the source was, but eventually over time, a few years later, WHO definitely had an opinion, and then CDC eventually had an opinion, and I won't go too much into that. You're going to get me in trouble. But for me, on the genomics and bioinformatics side, we did make a phylogeny. We did see some clustering between Nepalese genomes and Haitian genomes. But I'm going to bring it back to you guys. So, how did you do it in the latest outbreak? What is the latest outbreak, and how did you look at it?

So, there was another devastating natural disaster event in Haiti. At the end of 2021, it was a hurricane that blazed through the country and, again, disrupted the water infrastructure, and some of the sociopolitical dynamics complicated access to clean water for Haitians, which is really unfortunate to hear. And there was a reemergence of cholera. The country was actually getting ready to put in an application for cholera-free status through the Global Task Force on Cholera Control, which is supported by the WHO. That requires three years of no detected or confirmed clinical cases of cholera, and Haiti was on track to do that. But unfortunately, through natural disasters and some of these events in-country, cholera reemerged, and it was really devastating, I think, for the communities because, again, the water infrastructure was not stable, and a lot of people got really sick. With the support of our laboratory partners at the National Public Health Laboratory in Haiti, we were able to help identify some of the strains using whole genome sequencing and do an analysis based on some of the early cases from the outbreak. I see Andrew's hand up.

Yeah.
So, I was just going to ask, were there new introductions, and could you prove that by genomics? Or was it one emergence of one clone, or was it multiple reemergences of many, many different lineages within the phylogeny? So, what did it actually look like?

Yeah. So, Sinead actually did the bioinformatic analysis on this. We received a subset of samples from the outbreak. One of our colleagues here in the Enteric Diseases Laboratory Branch, Maryann Turnsek, has had a longstanding relationship with our Haitian partners. She was able to deploy to support the response as part of the laboratory pillar and to coordinate a subset of samples from multiple regions throughout the country during this outbreak. Sinead can speak a little bit about the genomics after the sequencing was conducted.

Yeah. So, we took the whole genome sequencing data, and we used a tool called Lyve-SET that Lee actually championed. What Lyve-SET does is look at the single nucleotide polymorphisms, or the genetic variation of single nucleotides, in the genomic data. Using that, we compared the variation between the new samples that we collected from the reemergence and the historical strains to see if, I guess, the DNA was conserved or if it had changed. If it had changed more, then we would say maybe this is a reintroduction, versus if it stayed relatively the same, it was probably the same strains. What we did find out was that, based on the SNP analysis, there were about 0 to 25 SNP differences between the newer data and the historical strains. So, that just tells us that the strains are relatively similar, and that the new strains are related to the historical strains.

And then, Sinead, what years did the data include? We had strains from the initial 2010 Haiti outbreak, and I think on your tree, there's another set, a subcluster, right? Yeah.
So, yeah, what were the data that you analyzed when you say 0 to 20 or so SNPs different? How was that over time?

Yes. So, the collection years of all the strains ranged from 2010, starting with that first outbreak, all the way up to 2022. Some of those years were 2010, 2011, 2012, 2013, 2014, 2015, 2016, and 2017. The 2022 strains were about 0 to 25 SNPs different from the 2010 strains. We usually use a threshold of 10 SNPs for being very closely related. Because the range was from 0 to 25, they were still related, but not identical or that similar. So, based on that, we know that these 2022 strains were related to the pre-existing strains.

Cholerae has obviously got two chromosomes, and how does that impact your bioinformatics analysis?

Yeah. So, when you have two chromosomes, it might affect things, but as long as you have a good reference genome, it doesn't quite change things when you're running SNP analysis. I would say that the place where we get in trouble is that there's the superintegron on the second chromosome, the smaller chromosome of cholera. It's about a megabase and has a lot of transposable elements. And the cool thing about Lyve-SET is that it will mask those places. It runs a, quote-unquote, phage-finding program, but really it's just finding anything that might look like a mobile element or something from a phage. So, those places are masked, and SNPs won't be called there, so it won't really break the evolutionary model that we put in there.

So, you guys did SNP analysis on this. That's awesome. Is there something that could have informed you more that you would have liked to run? Or is there something else you'd like to run, but you have time constraints and stuff, so maybe you just had to fit everything into one paper in a certain time range?

Yes, that's a good question.
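
As a rough illustration of the SNP comparison discussed here, the following is a minimal Python sketch. It is not Lyve-SET and not the analysis from the paper. It assumes the genomes are already aligned to one reference, that masked regions (for example phage-like or mobile elements) are given as half-open intervals, and that "very closely related" means at most 10 SNPs, an inclusive cutoff chosen here for illustration. The function names, isolate names, and sequences are all hypothetical toy data.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, masked=()):
    """Count differing positions, skipping masked intervals and non-ACGT bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = set()
    for start, end in masked:  # half-open intervals [start, end)
        skip.update(range(start, end))
    valid = set("ACGT")
    return sum(
        1
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if i not in skip and a in valid and b in valid and a != b
    )


def pairwise_distances(alignment, masked=()):
    """Return {(name_x, name_y): distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y], masked)
        for x, y in combinations(sorted(alignment), 2)
    }


def label(distance, threshold=10):
    """Apply the assumed 10-SNP cutoff (inclusive) mentioned in the episode."""
    if distance <= threshold:
        return "very closely related"
    return "related but more distant"


# Hypothetical toy alignment; real genomes are millions of bases long.
alignment = {
    "historical_2010": "ACGTACGTACGTACGT",
    "isolate_2016": "ACGTACGAACGTACGT",
    "isolate_2022": "ACGTTCGAACGTGGGT",
}
masked = [(12, 15)]  # pretend positions 12-14 sit in a phage-like region

for pair, dist in pairwise_distances(alignment, masked).items():
    print(pair, dist, label(dist))
```

With this toy data, masking lowers the distance between the 2010 and 2022 sequences from 4 to 2, which is the kind of effect masking mobile elements is meant to have: differences in regions that do not evolve in a clock-like way stop inflating the SNP count.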
So, the re-emergence suggests that this particular strain had been circulating either in the environment or through the population for more than a decade, right? And that's an interesting, how do you say, an interesting characteristic of Vibrio cholerae: how durable it is as a pathogen and how it can survive in the environment or in reservoirs, right? People could have been shedding over time and no detections would have been made. What I think Marco Salemi and Glenn Morris' group were able to show, in a paper that was released at around the same time, was evidence from a molecular clock analysis suggesting that the strain might have been circulating in the environment. They had a strain from environmental samples that was relatively closely related to the re-emergent strain. And I think that is a controversial topic. Doing environmental surveillance for cholera, I think there's a lot to unpack there, and I won't go into it too much. But environmental sampling with COVID, for example, has been shown to be pretty powerful, with wastewater surveillance following the precedent set by polio surveillance in wastewater as well.

And I think the challenge with Vibrio that's different from polio or SARS-CoV-2, for example, is that Vibrio live in the water, and separating out the natural Vibrio that live in the environment from those that cause true pandemics and epidemics in these huge outbreaks will take a little bit more finessing in terms of the bioinformatics and the metagenomic piece of it. But I think we're getting there. I said I wasn't going to give an opinion about that, but I think that understanding how these strains can persist and continue to subtly evolve over time is a compelling question. And I wish that we had environmental surveillance data to couple with the clinical results that we see. But again, to answer your question about whether there's anything I wish we could have done if we had more time, that would definitely be one of them.
That's awesome. So maybe onto the bioinformatics: did you guys do long-read sequencing on any of the strains, and did you use those as your reference genomes?

Not yet, but we use the type strain, the Haiti cholera strain that Lee had defined, as our robust reference, and we have long-read sequencing for that. But we use, I think, the short-read assemblies for that.

I just want to put in a note: I can't take the credit for that reference genome, by far. That was a huge, huge effort from us and multiple partners, especially Canada. So I can't take credit for that one, but I appreciate it.

Well, if we can thank Canada at every opportunity that we can get, I'll take it.

Yeah. Can I ask a question about drug resistance? Have you been tracking AMR, and is there any emergence of any more or less? You know, how is that working out? Because every environment is different.

Yep. And Sinead can answer the question about the AMR because she also did that analysis in this paper.

Yeah. So we included the AMR data in the paper that we published. I was actually assisted by Jess Chen. She is one of the more senior bioinformaticians here. We found that really only three strains had a unique AMR profile, in terms of which genes were present versus absent. So it was really three that were different from the majority of the samples. So based on the resistance, we know that it's still the same profile. And so, again, comparing that to the SNP analysis, we know that the strains are very similar.

I don't know much about cholera. So how does resistance come into cholera? Is it point mutations? Is it genes coming in on plasmids? You know, how does it actually work?

Yeah. The paper goes into more detail about the methods, so I'll just stay at a high level, but ResFinder was used, and the two particular genes that we interrogated were gyrA and parC.
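
The presence/absence comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the paper and not ResFinder output: the input is assumed to be a mapping from isolate name to the set of resistance genes detected, and the isolate names, gene names (`geneA` and so on), and helper functions are hypothetical.

```python
from collections import defaultdict


def group_by_profile(calls):
    """Map each distinct gene set to the isolates carrying exactly that set."""
    groups = defaultdict(list)
    for isolate, genes in calls.items():
        groups[frozenset(genes)].append(isolate)
    return dict(groups)


def split_majority(groups):
    """Return the most common profile and the isolates that deviate from it.

    If two profiles are tied, the first one encountered is treated as the majority.
    """
    majority = max(groups, key=lambda profile: len(groups[profile]))
    outliers = sorted(
        isolate
        for profile, members in groups.items()
        if profile != majority
        for isolate in members
    )
    return majority, outliers


# Hypothetical gene calls; placeholder names, not real results.
calls = {
    "isolate_01": {"geneA", "geneB", "geneC"},
    "isolate_02": {"geneA", "geneB", "geneC"},
    "isolate_03": {"geneA", "geneB", "geneC"},
    "isolate_04": {"geneA", "geneB"},
    "isolate_05": {"geneA", "geneB", "geneC", "geneD"},
}

groups = group_by_profile(calls)
majority, outliers = split_majority(groups)
print(sorted(majority), outliers)
```

Using `frozenset` as the dictionary key makes the comparison independent of the order in which genes are reported, so two isolates share a profile only if exactly the same genes are present.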
And I see Andrew nodding his head, so that means something to him. But in any case, the thing about Vibrio cholerae that I learned from the legend Cheryl Tarr, who was my predecessor and was really pivotal in integrating whole genome sequencing for enteric foodborne surveillance here at the CDC and in working with our Canadian partners, is that Vibrio cholerae doesn't change too much over time. It's pretty clonal. And so when we do see antimicrobial resistance emerging, we don't always attribute it to evolutionary changes in a specific strain. It's often introduced by a different strain that had acquired the resistance, possibly elsewhere.

Yeah. So that leads into maybe a final question, just because we're out of time. In 2010, the controversy was, did cholera come from the environment or was it introduced? And now we have this outbreak a decade later. Do you feel like you have that question answered for this particular outbreak?

Like I said, I think if we had additional data, additional clinical surveillance over time, right? Like Sinead mentioned, we included some genomes from 2016, but we don't have any more recent clinical cases reported. And so we don't have a really good benchmark to say that this had been circulating in the population. So that's an unknown. We don't have a control for that experiment, if you will. And then environmental sampling is not something that's routinely done, so we don't have the data for that. So I mean, if I want to be true to my scientific inclinations, I would say we don't have evidence that settles it either way. But we do have some information from the sequencing data that this is very closely related to the strains that were circulating from the outbreak in 2010 up through 2016.

So maybe that's also an answer to the previous question, about what you would do if you could do extra stuff: maybe just having routine surveillance.
So, the power of routine surveillance is really important, especially in countries, and if I can mention, you know, cholera has made a resurgence in many countries globally. The WHO has a nice little dashboard, but we've seen cases and outbreaks in areas where we thought cholera had been previously eliminated, and there's a lot to be said about water warming. Vibrio love to live in warm waters, and they're thriving. And, you know, there's a lot to be said about the global movement of people and things, which is unprecedented compared with even a decade ago. So I think there's a lot to be said about the changing world and the geopolitical things that we can't really comment on, but the reality that we live in is totally different than it was in 2010 in the way that we live our lives and travel and move, and the way that water is protected is also different when there are disruptions as well. So, right, what can we do? I think surveillance is important, especially in areas where cholera is about to be eliminated. That's where tools like whole genome sequencing can be very powerful and important for really getting at the answer of whether we are close to elimination or not.

All right, so you guys have been awesome. I think that we learned a lot about cholera, and you guys are doing a great job. I appreciate it.

Thank you so much for listening to our podcast. If you like this podcast, please subscribe and rate us on iTunes, Spotify, SoundCloud, or the platform of your choice. Follow us on Twitter at @microbinfie. This podcast was recorded by the Microbial Bioinformatics Group. The opinions expressed here are our own and do not necessarily reflect the views of CDC, Theiagen, or the Centre for Genomic Pathogen Surveillance.