Hello, and thank you for listening to the Micro Binfie Podcast. Here, we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There is so much information we all know from working in the field, but nobody really writes it down. There's no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil-Fareed Alikhan and Professor Andrew Page. Nabil is a Senior Bioinformatician at the Centre for Genomic Pathogen Surveillance, University of Oxford. And Andrew is the Director of Technical Innovation for Theiagen in Cambridge, UK. I am Dr. Lee Katz, and I am a Senior Bioinformatician at Centers for Disease Control and Prevention in Atlanta in the United States.
Hello, and welcome to the Microbial Bioinformatics Podcast. Nabil and I are your co-hosts today. Today, we're talking about web-based tree visualization tools. There are a lot of them out there, and we love them all. Yeah, and we're also going to be talking about the application of minimum spanning trees in our line of work. Someone was recently asking, why would anyone ever want to use one? So let's get started with trees on the web. So, Lee, what was the last tree software you used? Let's go from there. Yeah, I mean, usually I'm running Mashtree, and I make a tree. Not necessarily a phylogeny. And the easiest thing for me to do is to copy and paste from the command line, get the ASCII representation of the tree, and put it into icytree.org. I used to always use MEGA, which is local. It's not a web-based tool like what we want to talk about today. But now it's easier and easier to use these web applications. So I was introduced to icytree.org, and it's amazing. I like it. Yeah, it's not one I've used. I did have a quick look at it before we started today. And yeah, it's pretty slick. How close is it to, say, an iTOL type where you can annotate it and put all the plots on it and so on?
I believe that you can put in the basic Newick file format. It does have a lot of things you can search for. You can search for the names on the tree. You can add bootstraps. You can add all sorts of visualization, like where the tips are, if you're showing the tips. Oh, the manual does say it accepts Nexus annotation. So it should have everything on there. Yeah, so I can see that it does things like skyline plots from one of the options and so on. So that's something that I don't think I've seen out of the box, one that just gives you a skyline plot. Yeah, this was given as sort of an alternative for viewing BEAST analyses and some other advanced things. So I don't doubt it. I haven't run BEAST myself in a little while. And anyway, this is supposed to be something that accepts a lot of interesting things, especially molecular clock analysis. Oh, yeah. So that's one people probably haven't heard of. It's IcyTree, as in ice, water, ice, icy tree. Check it out. So tell me about the last time you made a tree. What did you use? Last tree I made. I think the last tree I made for real, I was actually testing out, I was tinkering with some new changes to Nextstrain and Auspice. That's probably the last one I looked at. So the tree viewers, everyone knows Nextstrain. You must have seen it, definitely during COVID, looking at any of those sorts of online and social media posts. People would be posting data sets presented in Nextstrain. It shows time phylogenies. That's its strongest point. And so it was very good for all the viral evolution stuff and all the COVID stuff. You see it in nextflu and so on. You'll see it used a lot out there in the wild. But one of the nice things is, and I didn't realize, but it's probably been out for some time, there's a version of it called Auspice, which allows you to just drop in your own data.
And so that's looking for, like, a drag and drop of a tree and a metadata file. And it'll draw that for you. So it's kind of similar to Microreact a bit. Is Auspice the one that's used on Nextstrain behind the scenes? Yeah, yeah. So that's basically the Nextstrain tree view, which everyone would have seen from COVID or viral evolution stuff or whatever, because it's so good at showing time phylogenies. So people would definitely have seen it. And now you can put in your own stuff much, much more easily, directly from the browser. You can load in your own data and play around with it. That's incredible. I like the visualization on Nextstrain where you can click the button and it animates through the whole thing. Yeah, so I think you can do that in Auspice if you've loaded in some date information. So it's auspice.us, but I think you can do, you know, is it nextstrain.org/sars-cov-2 or something? Oh yeah. You can see any of the public trees through the Nextstrain site. Okay. I went to avian-flu and I found an example tree on here. Yeah. One of the really fun things with it is you get tanglegrams now. I haven't seen that. So what's a tanglegram? So the tanglegram, it puts two trees next to each other, one on the left and then another facing it on the right. And it looks at the tip labels and sort of matches them up and draws lines between them in the middle. So you can see how much consistency there is between the two trees. So if your trees are basically exactly the same, there'll just be straight lines between the taxa. If they've been moved around, if the topology is different, you'll see lines crisscrossing all over the place. Love it.
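Editor's sketch: the "straight lines versus crisscrossing" idea can be made concrete by counting how many connector lines cross when the two trees' tips are listed top to bottom. The function name `count_crossings` and the tip labels are hypothetical, and this is not how Auspice draws its tanglegram. It only counts crossings for two fixed tip orders, whereas real tanglegram tools may also rotate internal nodes to reduce crossings.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count crossing connector lines between two top-to-bottom tip orders.

    Two connectors cross when a pair of tips appears in one order on the
    left tree and in the opposite order on the right tree.
    """
    shared = set(left_order) & set(right_order)
    right_pos = {tip: i for i, tip in enumerate(right_order)}
    # Right-hand positions of the shared tips, read top to bottom on the left.
    ranks = [right_pos[tip] for tip in left_order if tip in shared]
    # Each out-of-order pair (an inversion) is one crossing.
    return sum(1 for a, b in combinations(ranks, 2) if a > b)


if __name__ == "__main__":
    left = ["A", "B", "C", "D"]
    print(count_crossings(left, ["A", "B", "C", "D"]))  # identical order
    print(count_crossings(left, ["B", "A", "D", "C"]))  # two swapped pairs
```

A count of zero corresponds to the all-straight-lines case described above, and larger counts mean more disagreement in tip order between the two trees.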
It's a nice visual way to see if one tree is the same as the other and where the differences might be, but I was just playing around with that. So in this totally normal conversation where two people just talk about trees, that's how we do. I know that you went over to the Centre for Genomic Pathogen Surveillance recently. I know that you guys use Microreact. Do you want to take us through that? Yeah. So Microreact, it's used for a lot of different things and you'll find it in a number of different projects. Again, I think if you were looking at SARS-CoV-2 data from COG-UK, it would have been presented to you in a Microreact. That's probably one of the good examples, but it's also used for all sorts of data sets. One of the really nice things is you can load your metadata and a Newick file, your tree, into it, and then you can actually get that state of your visualization saved and you can get a permanent link that you can then share with all of your friends and put into manuscripts. You'll often see a lot of manuscripts with links to Microreact in them. I like Microreact a lot because it allows me to avoid firing up RStudio and using ggtree, because I can get more or less the same type of visualization without writing any R. And those types of figures are the ones where you have the tree and then around it, or next to it, you'll have blocks of color indicating different metadata states, so different colors for country, different colors for gene presence/absence, or whatever it is. You can actually get hold of the tree viewer library itself, called Phylocanvas, and you can take Phylocanvas and put it into your own software if you like. And I only found this out in researching this session today, that Phandango, if people remember that, actually uses Phylocanvas under the hood. I don't see too many people use it. I think people still use Phandango.
I don't use it much, but it was very popular for quite some time because you'd be able to show the tree really easily and show some blocks of metadata, so maybe serotype, maybe country, whatever it is. And then it would plug straight into viewing recombinant sites from things like Gubbins and ClonalFrame, and you'd also be able to show gene presence/absence as well from other software. So it's really good at looking at tree and pan-genome together, or recombination and tree together. So the tree viewer from Phandango is actually the same tree viewer as in Microreact, which is Phylocanvas, which is a fun tidbit people might not know. I didn't know about it until today. So if you load up Phylocanvas and you have your tree in there, you have whatever your results are, it looks like you put in some Roary results or, you said, serotyping, I think. If you have all that, can you package up that web page and send it to somebody over email? Is there an interactive page that can be sent? With Phandango? I don't think so. I think you just export that page out. I don't know anything about launching it yourself or anything like that, but within it, I don't think there ever was a way of saving it. Some tools that we'll talk about do allow you to do that. So Microreact will allow you to do that. iTOL, which we haven't talked about yet, definitely does. Yeah, I always want that. So I kind of got my eyes opened. This isn't the same thing as trees, but when Krona plots came out, you could email someone a Krona plot and have all the data loaded and you could interact with it for, like, minutes. It'd be nice to have that with a tree. I think you could do that with Phylocanvas on its own. But Krona is actually bundling up the HTML. Yeah, it's a big HTML. It's a big HTML with all the JavaScript and all the data and everything embedded in it.
And you're just passing that around. So yeah, speaking of sharing trees around, iTOL is really good at that. In terms of online tools, I think it's the first one I ever used. It's been around since 2006. It was first published back then, so it's really old and classic, and it's amazing. It's had such a long continuity and it's still really, really good. Yeah, I think it's the all-in-one. Maybe it's not as flashy as showing the molecular clock analysis on Nextstrain or anything like that. But it can get pretty flashy, and you can share a tree with somebody. The tree is loaded on their website and you can add extra tracks. So you can add the stuff that you were talking about with Phandango, add the extra tracks on there and everything. The downside to me is that there's a learning curve to this, to actually know how to add the data and everything. And that's a learning curve I haven't really climbed yet on my own, honestly. Yeah. Yeah. Likewise. I think I got as far as loading the tree and showing maybe one or two concentric rings of information, or color coding some of the taxa labels or something like that, or color coding the nodes. You can find some iTOL figures which are absolutely bonkers with the amount of detail, because you can show graphs, right? For each tip. So you've got this radial phylogeny and then outside of that, you've got a bar coming out of each tip and this radiation of bar values all the way around. It's pretty full on. And there's other stuff. There's ancillary software as well to use with iTOL, which I've never used. I think you have to pay for a subscription to use some of the extra software, but it's pretty complicated. Good on them if they have a business model. Oh yeah. There's a link on here for iTOL access modes and subscription.
So yeah, there's free access and there are other tiers on here. I've never quite understood the difference. Like, what am I missing out on compared with the standard subscription, if we paid for it? Well, there are a lot of bullet points here, but in bold, I can see right away, on the free access it says all tree annotation features are available, but the annotations cannot be saved for free. And standard subscription says trees and data uploaded through the batch mode during an active subscription remain accessible indefinitely, even when the subscription expires. Oh, that's interesting. So, you know, buy it for a month or two and get everything done and then just have it there forever, put it in your publications and carry on. And then you have one more tier. I guess this would be the higher tier: iTOL annotation editor subscription. And I'm skipping over some of the bullets here, but in bold it says, oh, unlimited access to the iTOL annotation editor. So there's, I guess, a higher-tier annotation editor that I didn't know about. That's interesting. Indeed. So I'll turn it back over to you because you helped develop one of the premier visualization tools yourself, even before you got to this job: GrapeTree. You want to talk about that? GrapeTree. Yeah. It was originally conceived to be an online minimum spanning tree viewer, but it can actually show a tree as well. I think we'll explain what the difference is a bit later, what they do, but GrapeTree is able to do both of those things. And these days I use it when I want to show really blobby, clustered data, hence the name GrapeTree. So you have an outbreak with a bunch of close clusters together, and then maybe you've got some long branches to something else. And that's where it really, really works well.
Otherwise, if your data is particularly complicated and busy, so you have a bunch of really closely related taxa and you want to collapse them into one node, GrapeTree has little options to help you collapse branches really, really quickly. There's a little threshold you can set that just does it for you. And that allows you to just have these big blocks, big cluster groups that you can just look at and color code and annotate and so on. So what's that word you used, blobby? Blobby, yeah. So in terms of phylogenetics, is blobby just the collapsing of very similar profiles? What does blobby mean? So yeah, blobby to me is very similar profiles, big outbreak cluster groups that you put together. Things that are one or two SNPs away, and effectively you just say, well, this is the same thing for what I'm doing. So usually in an outbreak analysis, we might even switch from a classic rooted tree to a minimum spanning tree. And I think it handles that blobby kind of minimum spanning tree also, right? Yeah, yeah, it does the minimum spanning tree. I mean, with GrapeTree, it's going to be unrooted. So a lot of the tools we've been talking about allow you to pick rooted or unrooted and allow you to specify the root and so on. GrapeTree doesn't. It's always going to be an unrooted bunch of nodes, but that's usually fine. I mean, there's nothing wrong with that. Yeah, yeah, yeah. It depends, because you might want to ask, what's the difference between these groups? What are these big major groups that I can see in my data, versus trying to actually establish what the evolutionary direction was, what came from what. Yeah, that's really good. And I think, just speaking from public health, it just feels good to have an option for a free online minimum spanning tree.
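Editor's sketch: for listeners who have not built one, a minimum spanning tree over allele profiles, plus the "collapse anything within a threshold" step, might look roughly like the following. It uses classic Kruskal's algorithm on Hamming distances. This is an illustration only and not GrapeTree's own algorithm. The sample names, the profiles, and the function names are all made up for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of loci at which two allele profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def minimum_spanning_tree(profiles):
    """Kruskal's algorithm; returns a list of (distance, sample_a, sample_b)."""
    parent = {name: name for name in profiles}
    edges = sorted(
        (hamming(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    )
    tree = []
    for dist, a, b in edges:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:  # edge joins two separate components
            parent[root_a] = root_b
            tree.append((dist, a, b))
    return tree


def collapse(profiles, tree, threshold):
    """Merge samples joined by tree edges no longer than the threshold."""
    parent = {name: name for name in profiles}
    for dist, a, b in tree:
        if dist <= threshold:
            parent[find(parent, a)] = find(parent, b)
    groups = {}
    for name in profiles:
        groups.setdefault(find(parent, name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical five-locus profiles: two tight clusters, far apart.
    profiles = {
        "s1": (1, 1, 1, 1, 1),
        "s2": (1, 1, 1, 1, 2),
        "s3": (1, 1, 1, 2, 2),
        "s4": (5, 6, 7, 8, 9),
        "s5": (5, 6, 7, 8, 1),
    }
    tree = minimum_spanning_tree(profiles)
    print(tree)
    print(collapse(profiles, tree, threshold=1))
```

The resulting edge list has no root and no direction, which matches the point made above: it shows which groups sit close together, not what evolved from what.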
So I'll say, at least for me, I appreciate that you and your lab made this. Yeah, it was Martin Sergeant, Zhemin Zhou and myself who were the main three developers, with our boss, Mark Achtman, back in Warwick, who wrote this up. Yeah. And there's at least one other minimum spanning tree software that we use sometimes, PhyloViz, and that was from João's lab before he went to industry. So maybe we can have them on another time and talk about that more in full. I think we're almost out of time. Let's go quickly to some of the other software that's out there. So one of the other ones that I really like is Taxonium from Theo Sanderson. I haven't found an excuse to use it very much, but it is absolutely brilliant for scaling to really, really large trees. And the website is fantastically responsive. It feels really good scrolling around and jumping around and zooming in and out. It has this great tactility to it, responsiveness to it. So if you've got a massive, massive tree, throw it into Taxonium and give that a go, that might help you out. How did he do that? Like, everything else, I guess, is JavaScript. Is this WebAssembly? No, I think it's, we'll have to ask, I don't know what it's implemented in. I think it's straight canvas. It is very nice and responsive. I like it a lot. I have no idea what he wrote it in. I'm not going to read this code base in two minutes. It's very shiny. I agree. I haven't had a lot of chance to use it. It was showing up to 1 million COVID genomes, which it talks about in the paper. So imagine it. Yeah, exactly. This is bonkers. Before COVID, we didn't even think this was possible or useful or anything. And now it's like, no, there's a thing online. For free, you can just do it. Oh God, yeah. With Mashtree, for example, people will say, I tried putting 5,000 genomes in it and it crashed, and I'm like, do you need 5,000 genomes? Can you put in fewer?
But actually, I think with Taxonium and with SARS-CoV-2, we're seeing that it is actually useful now. You're right. I need to be less cynical. I think a lot of the tools, like the Microreacts and the Nextstrains and so on, were also optimized during COVID because there was that requirement to try and show more and more data. And so it's not just Taxonium, a lot of the other tools have had optimizations because of COVID to deal with larger and larger data sets. Before, you didn't really have a reason to do it, other than for bragging rights. Anyway, let's wrap up. We're out of time. You want to take us out? Yeah, so that's it for this week on the Micro Binfie Podcast. Lee and I were talking about some of the tree visualization tools you can find online on the web. They are all free, easy to use. So give them a go, particularly ones like IcyTree, iTOL, Taxonium, Nextstrain, Auspice, GrapeTree, Microreact, PhyloViz. Give them a go. And we'll see you next time. Bye.
If you like this podcast, please subscribe and rate us on iTunes, Spotify, SoundCloud, or the platform of your choice. Follow us on Twitter at @microbinfie. And if you don't like this podcast, please don't do anything. This podcast was recorded by the Microbial Bioinformatics Group. The opinions expressed here are our own and do not necessarily reflect the views of CDC or the Quadram Institute.