Hello, and thank you for listening to the MicroBinfie podcast. Here, we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There's so much information we all know from working in the field, but nobody writes it down. There is no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil-Fareed Alikhan and Dr. Andrew Page. I am Dr. Lee Katz. Both Andrew and Nabil work at the Quadram Institute in Norwich, UK, where they work on microbes in food and the impact on human health. I work at the Centers for Disease Control and Prevention and am an adjunct member at the University of Georgia in the US. This episode, we're doing another software deep dive. This is where we interview the author of a bioinformatics software package. We talk about some of the obscure and interesting details of popular programs that do not make it into the paper. We have Robert Petit, who is talking to us today about Bactopia. Robert received his master's in bioinformatics from the Georgia Institute of Technology in Atlanta, Georgia, USA, and his PhD from Emory University, also in Atlanta. During his graduate studies, he worked mostly on Staphylococcus aureus genomics, but he was also involved in sequencing the first whale shark genome and developing a typing scheme for identifying Bacillus anthracis, the causative agent of anthrax, from metagenomic sequences. A major component of Robert's work was the development of Staphopia, a bioinformatics workflow specifically designed for the analysis of Staphylococcus aureus genomes. This work ultimately laid the groundwork for Bactopia, which we'll be discussing today. Currently, Robert is working with the Wyoming Public Health Laboratory in the US to help build their bioinformatics infrastructure to complement their sequencing efforts in response to SARS-CoV-2.
In the show notes, we'll have some links for the docs, the repo, and the publication. So Robert, hello. You and I have had some intersection in our history before. I was also at Georgia Tech at the same time as you, and I don't know if you knew this, but I was an undergrad at Emory, so we kind of flip-flopped. I was an undergrad over there, and then you went to graduate school over there. So let's just start it off. What is Bactopia? So first, before I get into Bactopia, a quick thank you to the 70, 80-plus tool developers whose tools Bactopia uses. Bactopia includes a bunch of tools, and basically, without these people developing these tools, there's no way Bactopia exists, so thank you very much. So what is Bactopia? Bactopia is an all-in-one workflow for the complete analysis of bacterial genomes. My punchline is: get through the analysis steps quickly so you can get to the fun part, the results, much quicker. So what I remember is that you guys started off with Staphopia, and maybe that'll be a good intro into what this is all about. Would you say so? Yeah, definitely. So Bactopia kind of goes back to 2010, when I was a master's student. When you get into the master's program, you've got to find a lab to work with, and I sent all the emails out and couldn't get a lab. And so at that point, I'm like, oh, crap, what do I do? So I even started looking into ecology programs, because my undergrad was in ecology, and I loved field ecology. But King Jordan at Georgia Tech suggested, hey, there's this guy, Tim Read, over at Emory University who's doing bacterial genomics stuff, and he said, you should send him an email. And I sent him an email, and the rest is history. And so now I'm in bacterial genomics, and I'm quite happy with it. Otherwise, I'd probably be in a field taking samples of animals, and I'd probably be happy with that, too. You know, that's kind of fun, too.
But so Tim had this grand idea: hey, you know, there are all these public Staph aureus genome samples, and when I say all these public Staph aureus genome samples, I'm talking, like, hundreds of genomes available on the Sequence Read Archive and the European Nucleotide Archive. And it's like, hey, we could download these and process them and include them in our own analyses, but we'd have to develop a workflow for it. And so that led to Staphopedia, which was the predecessor to Staphopia. And so we developed this PHP LAMP infrastructure, where you would submit your FASTQs to our website, and it would go on the back end and launch a bunch of shell scripts and spit out some results. That LAMP — that's Linux, Apache... MySQL and PHP. Yep. Yeah. So it was back in the heyday before all the static site generators and all that. I put together Staphopedia. I still have the source code somewhere on Dropbox. It's one of those historical things where I'm like, I gotta keep this. At that point, we were versioning with SVN, and I think it took about a year before we even started versioning. We got away from this whole web platform — submit these large FASTQ files to this random website that would process your genomes — just because at that point it would take a few hours to submit your one genome, and then hours to get your results. And so at that point there was a name change from Staphopedia to Staphopia, compliments to Tim Read on the name. And then we shifted from this Bash shell script type thing to an even bigger Perl script that was managing the analyses. And I think at that point we just had the basics: take in the FASTQs, do some quality metrics. Is it a good enough FASTQ? Will it pass? Is it actually a FASTQ? MLST, annotate the genomes, and then do some BLAST stuff.
And so eventually I segued from Perl and learned Python, so Staphopia followed and turned into another monolithic workflow in another language. So it turned into a complete Python script, but I thought I'd get fancy and do some Python modules and create a Python library. So all the steps were in different little files, and that worked great for a little while. Then eventually I rewrote it again, except this time using Ruffus, which was a workflow manager written in Python. And so I think at this point we're somewhere around 2015-ish. So within those five years, Staphopia was rewritten multiple times, only for Staphopia to get rewritten one more time. And the final rewrite was in Nextflow, probably 2016, '17-ish. So I mean, we're looking at six-plus years of development in Staphopia before it got to its final form. And bioinformatics, specifically bacterial genomics, in 2010 was very different from what it looked like in 2016. But Staphopia contained a lot of the philosophy of 2010, because that's kind of where it started. And so it didn't really have the nuts and bolts of 2016, where we were starting to get into containerization and all that, and some of the newer workflow programs. We put out Staphopia, and the first question I was always asked was: that's cool, but can I use it for my bug? And it's like, I think so. It should just be, you know, modify this, modify that. And most people were like, okay, so no, I can't use it for my bug. And so we were eventually asked, hey, we have some Haemophilus influenzae samples that we want to process — do you think you could send them through Staphopia? And so we're like, okay, we'll do it. We did some ad hoc changes — you know, take out these steps that are Staph-specific, or change this reference genome — so it was very manual. We were just manipulating it because we knew the backend.
Whereas if you were to hand it to somebody else, they would have been like, it's going to take me ages to figure out where all the steps are. And so Bactopia's first citation is actually disguised as Staphopia. So we had this very early ad hoc conversion of Staphopia to Bactopia, and it worked. But at that point it was kind of, let's take this to the next level and get this caught up with the times. Staphopia was still kind of 2010-ish, 2012-ish. So we had the opportunity to completely rewrite again — because job security, I guess, just keep rewriting. But no, it made sense, because we were rewriting along with the progression, the evolution of the community and the field. By the way, this is really awesome, this kind of history, because we don't think older than Git anymore. Like, you started with SVN and Dropbox. Oh, it reminds me — at one point in Staphopia, similar to how Torsten bundles the binaries with his tools, Staphopia had all the binaries you needed. And I'm sure I violated numerous licenses, but you know, we had to get these programs out to people. There was about a gigabyte tarball you could download that had all the reference data and all the binaries to help you run it on your system. And it was hosted on my Dropbox for a while. I never got a notice from Dropbox saying, hey, too many people are downloading — Staphopia has stayed pretty small. But yeah, it's one of those ad hoc things that you did back in 2012, 2013 to make it easier. Let's hope we're past the statute of limitations there. I mean, it was a different time.
People always tell me about how they used to get GenBank sent to them on floppy disks — all of GenBank — and you're just like, yeah, you're going to package RefSeq, you're going to package the database with the binaries with your software. And now, think of the size. Bactopia does allow you to download and stage datasets, but if you want to download a Kraken database, that's like 90 gigs. If you try and build that into Bactopia, that's 90 gigs for just that one step, and then there's everything else on top of it. So it's definitely changed, and the way you write the tools has to change. I mean, this is just a really fun view into history. I feel like maybe in bioinformatics our generation time is just so fast — maybe I'm a couple of generations down into it now. Because you're saying floppy disks and GenBank, which I have also heard of, but that was not my time. I guess I would be more the CD-ROM generation, because we had sequences from 454 delivered to us on CD-ROMs. I have a funny feeling my doctoral lab had TIGRFAMs on a CD, like that was sent out. I can't remember — maybe it was something someone made, or maybe that was something that TIGR actually sent out. TIGR had it available for download, but I don't know if they actually sent it out. We did download it in grad school. And so, another generation — I feel like people are making hand-woven pipelines, and I'm part of that, and using things like SVN and a whole bunch of other things. Was it CVS? CVS, that was the predecessor to SVN, wasn't it? Yeah. I feel like I used Mercurial at some point. Yeah. I used Mercurial, and it was a lot like Git, and I was using it for a while. And then everyone was just like, we're using Git.
So that's why I switched over. But actually, I feel like Mercurial was more advanced than SVN. So it's funny seeing the culmination of all this stuff, and it's like, why don't we have a good pipeliner right now? I'm so guilty of this — I've made so many pipelines, just hand-weaving them. And why aren't I using something like Nextflow or Bpipe or something? I think you've been giving a really good history of how you got to Bactopia, so I just wanted to say I appreciate that. I think for me, the whole history, that sort of rabbit hole of going through all these different pieces of software and rewriting it over and over again, reminds me of this famous quote about writing, which goes: every writer has only one story to tell, and he has to find a way of telling it until the meaning becomes clearer, until the story becomes at once more narrow and larger, more and more precise, more and more reverberating. You could print that text and stick it on the wall, because that's exactly what's going on here — this refinement, and in doing so, understanding more deeply the kind of problem that we face. But let's drop into that. You mentioned a little bit of what Staphopia was doing, but let's focus on Bactopia, which is the current iteration. What does it actually provide to the user? Like, the analysis — because I've used the software, and I like the software and its analytical outputs. It's running a fairly straightforward set of tools. So it's a culmination of about 10 years of those rewrites, to where I'm at the point where I could just fire this off, and it gives me all the results I'm most likely going to need if somebody says, hey, I have this genome, this bacterial genome.
Can you process it? And I can process it, and here are most of the results you're going to use. And so I think what Bactopia does is simplify that process of going from raw sequence to a slew of results that you may or may not need — but at least they're there when you do need them, and there's no, like, I need to rerun this specific analysis. And the target audience, I think, would be those who may be novices in bioinformatics. Sequencing is so easy to get now that it's, you know, I got these hundred sequences, now what do I do? You could throw them in the pipeline and get all the results, instead of starting from the beginning and trying to figure out how do I write a workflow, or which tools do I need to include, which analysis steps — can I even do this? Do I need to hire a bioinformatician to run this, or completely rewrite a workflow? You know, there are all kinds of nuances. And I just wanted to create something that, one, I'm going to use on a daily basis, but two, something that hopefully others can pick up and get to a result, and get out of the weeds of gritty bioinformatics workflow development, command line and all that. Now, I do encourage people to learn the command line and understand what Bactopia is doing, not just run it through, get some stuff, and do stuff with it. And I think that goes into the documentation for Bactopia, where I've tried to highlight what each step is doing and how it's doing it, and which tools are involved in each step. That way they can go back and say, all right, this step ran this program, and now I can take this program and go see what it's doing and learn more about it if I want to.
And so it's kind of like this hybrid: get your stuff faster, but also a potential training introduction into bioinformatics — or specifically, let's just make the assumption that anything I talk about is bacterial genomics. The whale shark was my introduction and my exit. I love being able to process thousands of samples on a small infrastructure and not require terabytes of memory. I mean, I can still eat up terabytes of memory doing some bacterial stuff, but just to assemble the whale shark took quite a bit, and it was just like, we need more resources. So for Bactopia, what would be a typical use case that someone has used the platform for, and what are the kinds of outputs that they would get — tangible outputs that would be useful for them as a user? A typical use case is: I have this bacterial genome — does it have resistance to certain antibiotics? So Bactopia would take in your raw FASTQ files, do quality control to filter out bad reads, create an assembly, and then from that assembly it can annotate the genome and make predictions about antimicrobial resistance. And then you may be interested in what multilocus sequence type it is. Is it clustered, or is it specific to a sequence type known to be associated with certain virulence? And so you would get those outputs as well. But that is getting into the fact that Bactopia has a general workflow, where you've provided no datasets — no species-specific or general datasets — and another workflow where, if those datasets are provided, they'll supplement the initial Bactopia run. And so those datasets can be Mash and Sourmash sketches of RefSeq and GenBank. So basically you can query your sequence against all of RefSeq using Mash, or against GenBank using Sourmash, where you can get an idea — it's a preliminary, quick way to say, I sequenced this, do these kind of align with that?
So if you sequenced Staph aureus and it comes up E. coli, then there's something for you to figure out. Then you can also include reference genomes to call variants against. So if you have your reference of interest, you can say, hey, I want to know what SNPs and indels are in there. Include all your genes, proteins, primers that you want to BLAST against. But I think in most cases, most people are going to run this bactopia datasets command, build a species-specific dataset, and then run their organism through with that. So does this do any phylogenetics? It can. This is getting into another subset of Bactopia. There's bactopia datasets, which tries to pull in public datasets to supplement your analysis. There's Bactopia itself, which is this isolate-based sequence analysis — something you would just run on all isolates or all samples. And then there are Bactopia tools, which are separate workflows for comparative genomics. And so those Bactopia tools are where you would run your phylogenetic analyses and all that. I chose to keep them separate mostly because that gives you checkpoints. If I want to run a thousand samples, I don't necessarily want to run all thousand samples and then build trees, just because of the time it takes to build those — versus run all thousand samples, then figure out which ones I want to include and not include. And so I can exclude samples based on low quality, low coverage. And Bactopia does that in the main workflow, where before it processes genomes it'll say, hey, this is the minimum requirement to go further in the pipeline. And that's just to prevent downstream failures — you tried to assemble this genome with 2x coverage and it just failed — so why not catch it at the beginning? And users can control how much coverage and all that.
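To make that two-step pattern concrete, a hypothetical invocation might look like the following sketch. The flag names (`--species`, `--fastqs`, `--datasets`, `--coverage`) follow v1-era usage and are assumptions, not a definitive interface — check `bactopia --help` and the docs for your version. The `bactopia` calls are guarded so the script is a no-op where Bactopia is not installed.

```shell
# Illustrative only: build species datasets, then run the main per-isolate
# workflow against them. Flag names are assumptions; verify locally.

# A tab-separated file of samples to process (sample name, R1, R2)
printf 'sample\tr1\tr2\n' > fastqs.txt
printf 'sample01\tsample01_R1.fastq.gz\tsample01_R2.fastq.gz\n' >> fastqs.txt

if command -v bactopia >/dev/null 2>&1; then
    # Step 1: pull species-specific datasets (MLST schemes, minmer
    # sketches, reference genomes, ...)
    bactopia datasets --species "Staphylococcus aureus" --outdir datasets/

    # Step 2: the main workflow -- QC, assembly, annotation, AMR
    # prediction -- with a minimum-coverage gate so hopeless samples
    # fail early rather than downstream
    bactopia --fastqs fastqs.txt \
             --datasets datasets/ --species "Staphylococcus aureus" \
             --coverage 100 --outdir results/
fi
```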
But yeah, I like to keep it all separate, and it goes into the resource requirements of running isolates one at a time versus running a complete phylogenetic analysis, where I'm going to have to ask for a much bigger machine to run a pan-genome and phylogenetic analysis of a thousand samples. Whereas I can process all thousand samples on a small desktop — it might take a while, but on a cluster it would go much faster. And again, it allows me to keep Bactopia somewhat species-agnostic and include Bactopia tools, which can become very species-specific. That would be, like, run Kleborate on your Klebsiella samples, or AgrVATE on your Staph aureus samples, stuff like that. And so I like the separation, and I don't always know what comparative genomics tools I'm going to want to run. I like being able to process all my genomes first, then say, all right, these are what I'm going to do. Yeah, I mean, I really like the workflow, and what I think Bactopia does — since it is this thing where you've been wrangling with this process of microbial genomics analysis that we all go through — these breakpoints are exactly where I would also have breakpoints myself if I was doing it by hand, or if I was doing it with a student. Like, if I had a doctoral student and I was saying, okay, here's the sequencing data, I want you to get to this point and then we'll have a meeting — you know, assuming they know how to run the software. It is very much: run MLST, do the assemblies, do some QC stuff on it, run Kraken, give me the species breakdown for the reads, whatever. And then stop, take stock of that, check if these are okay. And then move to the next stage, which is the species-specific stuff, and then check if that's okay.
And then start thinking about more computationally intensive analyses, like the trees. I mean, I think I'd want a quick and dirty tree up front, but not something heavy-handed, and that quick and dirty tree I would probably just throw away anyway. You would go back, and as you find samples that are no good, you would throw them out, and you'd have a better tree at the end. So this matches my workflow, which is why I was so excited to read the paper and have a look at the software as well — it's my brain externalized: what would I tell someone to do? So I think for anyone listening out there, if you're thinking, I'm not going to use this software because whatever — just as a concept, the workflow, the way it's laid out, is fantastic. It gives you a good way to think about all of the different tools, how all of these things interact, and what actually needs to be run before the other. You want to run your MLST and know which ST the strains are in before you launch this massive thousand-taxon tree. Because if one of those samples is not what you think it is, and a little bit too divergent, your tree is going to look like nonsense. You're not going to get any sites, because that outlier is going to have no core SNPs compared to everything else. That's going to mess your tree up. That's kind of what happened in our paper for Bactopia — we looked at all of Lactobacillus, the genus. And so I think it's worth mentioning that Bactopia is set up to pull public data from GenBank, RefSeq, SRA, or ENA, just because me and Tim have always had this "there's all this public data, why don't we use it" type thing.
So when we developed Bactopia, I purposely made SRA accessions one of the inputs you can provide, and it'll go download them. And now Nextflow includes that as a default channel — you can do that with Nextflow now — but this was before that. But going back to Lactobacillus, exactly to your point, Nabil: the genomes we processed were not all Lactobacillus. They were labeled Lactobacillus, but there was a yeast sample in there, there was some Streptococcus — I don't know, it was only a handful, but we caught it quickly. We built this quick and dirty tree, and there was this whole section where you could see, these are something else. But what we ended up doing is we built this tree, and then we wanted to focus on a specific Lactobacillus species. So we used FastANI to get the ANI of a reference genome against all the samples, and then we said, within this span of ANI values, we're going to pull those samples out and only include those samples in our actual pan-genome and core-genome tree. And so that was the type of step where you're like, we need to figure out if we want to include all 2,000 samples, or just these samples that match our criteria. Yeah. Is that FastANI calculation, and then the subsequent tree — a neighbor-joining tree, I suppose — a part of Bactopia? Yeah, that would be a Bactopia tool. So Bactopia outputs come in a structured format that we can programmatically access. Basically, I just put stuff in nice spots so that we can find it easily. And so Bactopia tools understand this structure, and then you can give them a whole folder of Bactopia results.
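The accession-driven entry point Robert describes might look roughly like this hypothetical sketch. The `--accessions` flag is an assumption based on v1-era usage, and the accession IDs are placeholders — substitute real SRA/ENA experiment accessions and check `bactopia --help` on your install. The call is guarded so the script is a no-op without Bactopia.

```shell
# Illustrative only: drive Bactopia from public accessions rather than
# local FASTQs. Accession IDs below are placeholders, not real samples.

# A plain-text file of SRA/ENA experiment accessions, one per line
cat > accessions.txt <<'EOF'
SRX0000001
SRX0000002
EOF

if command -v bactopia >/dev/null 2>&1; then
    # Bactopia downloads the FASTQs itself, then processes each accession
    # through the same per-isolate workflow as local reads
    bactopia --accessions accessions.txt --outdir results/
fi
```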
And then there are these text files that say, I want to include just these samples, or, as the alternate, I want to exclude these samples which failed to meet some criteria. That could be low quality, or, in our case, we only wanted to include this specific set of Lactobacillus, or we want to exclude all genomes that had low coverage or too many contigs — something that would really mess up the downstream analysis. Yeah. I think one other thing that people come to me with these days is the next question: how do you subsample? And I'm not sure if Bactopia has a solution for this — I'm not demanding it, I'm just saying this might be a future thing to think about. You run 67,000 Staph aureus genomes, and then you want to make a tree that's meaningful. You don't need 67,000 tips to say something — you need, like, a thousand-ish. How do you go from 67,000 to 1,000? That's honestly something I would love to know the answer to. So for Staph aureus, we created this non-redundant dataset — we call it the NRD set — which was basically, we picked a high-quality genome from each sequence type. And I think for Staph aureus it went from, at that time, 40,000 genomes to about 400-ish that each represented a unique sequence type. And then you could use that to subsample your genomes based on where they fell on the small tree. But I think ideally it would be something similar to BIGSI's approach, where you just put in some sequence and it gives you a bunch of public samples that meet your threshold of similarity. Because for me at least, I'll always want to use public data, just because, one, have we seen this before? Is there something in the past in these public databases that could help me in the current analysis? How do we do that?
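The include/exclude pattern described above could be sketched like this. The tool name (`roary`) and the `bactopia tools` subcommand syntax are illustrative of v1-era usage — the exact subcommands and flags vary by version, so treat this as a shape, not an interface, and check the Bactopia tools docs. The call is guarded so the script is a no-op without Bactopia.

```shell
# Illustrative only: run a comparative-genomics step over a folder of
# prior Bactopia results, skipping named samples. Names are placeholders.

# Samples to drop: failed QC, too many contigs, wrong species, etc.
cat > exclude.txt <<'EOF'
sample17
sample42
EOF

if command -v bactopia >/dev/null 2>&1; then
    # A pan-genome (and downstream tree) over everything in results/,
    # minus the samples listed in exclude.txt
    bactopia tools roary --bactopia results/ --exclude exclude.txt
fi
```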
I hope someone comes up with a super clever approach that says, all right, here are 200,000 public samples, and here are the 500 that are most similar or meet your threshold. I've seen a few attempts at this — it's kind of like fingerprinting with some degree of granularity, and you can pick which level of relatedness you want. I mean, the obvious one is you can do cgMLST, which is quite explicit, but then there are other, kind of rough, Mash-esque measurements that you can use. And a lot of it is still quite immature at the moment. I think people are just reaching the same point we are with this, where we're thinking, we now have too many public genomes to sift through, and we need an input that allows us to dig down to what we want. And I think there will be a solution eventually. I was just curious if there was something in Bactopia, or if you'd run into an easy, cheap way to do it. I mean, my cheap way is to use rMLST and just pick one genome per rMLST type, because rMLST is basically genotyping ribosomal proteins, so it works on any organism, and you can use it cookie-cutter on anything. That's the cheap trick we use in papers and things like that. So. Those would have been genomes you'd already processed, though? Well, we had the assemblies, and we had already done the rMLST typing from them. Yeah. So you do have to look at everything, have all the data somewhere, and then pick out of it. It's kind of the other way around, because Bactopia is saying, here's a workflow that works on your strains and looks outwards, and this would be like, no, I understand the population structure of the species, and now I'm going to drill down.
So I think just trying to fit that into Bactopia as a thing would be difficult, but it would make sense to have it call out to this magic thing that has this index, to ask, what other ones could I use? Yeah. See, I think you mentioned it earlier too, where it becomes species-specific population structures, and there's no way we could be experts in all population structures. This kind of segues into the curated Bactopia datasets we started. We have familiarity with Staph aureus, and so Tim has a set of reference genomes that he uses for his Staph aureus analyses. The potential is that there could be experts in their field, in their bacterial species, who contribute to these curated datasets, to say, hey, you're going to run this bacteria — here's a set of data that an expert in the field says you'll want to include. Because when I don't know what references might be important to a certain species, the expert will say, you'll want this reference, this is the one that we always use, and stuff like that. So yeah, it'd be interesting to see where that goes. For Staph aureus there's still a lot more to do, but we at least have a working proof of concept for it. So Robert, everyone has their own favorite analysis. If you were going to put out a call to the community — I'm sure you already have this — what's the mechanism for me to put in my favorite tool for my favorite bug? So on GitHub, create an issue. There's a button for feature requests — say, hey, I want this in Bactopia. And there are quite a few of those. I use Bactopia on a day-to-day basis, so it contains stuff that I'm currently analyzing, and I would love for it to turn into this thing that everybody else is using, where they say, hey, for this bug we use this, can you include that? And I'm happy to do that.
The only requirement is it has to be on Bioconda, just because I think that's a good starting point. The installation has already been figured out, Biocontainers has a Docker image, and the Galaxy group builds the Singularity images, so it's all already laid in place. If it's not on Bioconda, I'm more than happy to ask how difficult it would be and make an attempt to put it on Bioconda. And that has led to a few tools used in Bactopia being added to Bioconda. Now, when I first started Bactopia, Nextflow was on DSL1, and now there's DSL2, which allows you to create some really fun workflows — basically, it becomes way more modular. This is kind of a fun story. I'm sitting at my desk one day — if you submit a GitHub issue, I usually try to respond fairly quickly, just because Bactopia is like my fourth kid. I want to take care of it. And if you're creating a GitHub issue, most times you ran into a problem. And I have a correction: this predates three of your kids, right? No. One of your kids? Well, Staphopia does. Bactopia predates one. Yeah. So I guess, yeah, Bactopia would be the third kid. But if somebody's using your tool and they've run into an issue, in all likelihood you may have lost them, so I try to be pretty quick and as helpful as I can with the GitHub issues. There are some with, like, hundreds of comments, where I'm just trying to help you get up and running. So I'm sitting there — I have the whole Slack integration, so when a GitHub issue gets created I get a ping notification on Slack — and it's like, oh, that's an issue, let me check it out. And this one was actually a pull request. I'm like, oh, that's cool, someone submitted a pull request.
And there's this guy named Davi Marcon, and he was doing an internship with Abhinav Sharma. He submitted a pull request that just had the line that tells Nextflow to use DSL2. And I'm like, wait, what's happening? I've been wanting to do this. And then over the next few months I'm cyberstalking their fork of Bactopia, because I'm like, oh man, they're making pretty good progress. And the benefit of DSL2 is that now we can create workflows, subworkflows, and modules, and completely reuse code. So Staphopia has been eaten by Bactopia: there's a workflow where you run Bactopia and then you run the Staphopia-specific analyses that were included in Staphopia. And so now you can execute Staphopia in version 2, and this really allows me to include new workflows easily. I've benefited a lot in this DSL2 work from the nf-core group. I'm piggybacking off their work, because they do some really fun stuff with Nextflow. And so there's going to be this integration of the nf-core modules into Bactopia and all that. And for tools that I'm using one-off, we're trying to push them into the nf-core modules, so that other people who may not want to use Bactopia can still use the nf-core modules to rapidly pull together these one-off workflows and glue them together like I've done in Bactopia. So yeah, I'm super excited and super thankful for Davi and Abhinav's push on this DSL2 work. I think it's going to be pretty fun. Well, thanks for a great discussion. This was another one of our software deep dives. There are always some interesting facts about how these different tools came into being. Today, we were talking about Bactopia with developer Robert Petit. You can check out the software on GitHub and Conda. And that's all the time we have for this episode. See you next time.
Thank you so much for listening to us at home. If you like this podcast, please subscribe and rate us on iTunes, Spotify, SoundCloud, or the platform of your choice. Follow us on Twitter @microbinfie. And if you don't like this podcast, please don't do anything. This podcast was recorded by the Microbial Bioinformatics Group. The opinions expressed here are our own and do not necessarily reflect the views of CDC or the Quadram Institute.