Hello, and thank you for listening to the MicroBinfie podcast. Here we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There's so much information we all know from working in the field, but nobody writes it down. There is no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil-Fareed Alikhan and Dr. Andrew Page. I am Dr. Lee Katz. Both Andrew and Nabil work in the Quadram Institute in Norwich, UK, where they work on microbes in food and the impact on human health. I work at the Centers for Disease Control and Prevention and am an adjunct member at the University of Georgia in the U.S. Hello and welcome to the MicroBinfie podcast. Nabil and Andrew are your hosts today, and this is part three of our extended holiday special on bacterial taxonomy. Professors Iain Sutcliffe, Phil Hugenholtz, and Mark Pallen continue with us, and we're closing off talking about nomenclature and the recent renaming of phyla. So my opening question to anyone is, what is the difference between nomenclature and taxonomy? Let's give it to Mark. Taxonomy is the scientific classification: how do you decide what's in which group, how those groups fit within other groups, and so forth. Nomenclature is just how we stick a label on the things that we have discovered or that we've circumscribed. And traditionally, taxonomic nomenclature has relied on Latin as the kind of lingua franca that was the language of science at the time of Linnaeus. Linnaeus used Latin and Greek roots. So the names that we have used for bacteria and for archaea are based on Latin and Greek roots primarily. That's one of the rules in the code. These names have to be presented as Latin. They're a long way from classical Latin, they're what we call neo-Latin, but they're made up in a way that they look like Latin words. That's one of the rules.
In fact, if you look at the rules of nomenclature generally, otherwise they're very lax. And in fact, they go so far as to say that you can use completely arbitrary coinages. You can do whatever you like as long as you make them Latin. Now, the thing is, if you use classical Latin and Greek roots, you have to have a certain degree of understanding of how those languages work to do it correctly. And one of the big bugbears is that in Latin, adjectives agree with the noun in terms of gender. So if the noun is masculine, the adjective has to have a masculine form. So, for example, in Staphylococcus aureus, aureus has to agree with the ending coccus because they're both in the masculine. So we couldn't call it Staphylococcus aurea or aureum because that would be bad Latin. But the problem is that most people now are not learning Latin at school. They don't have any familiarity with it and they just don't appreciate these issues. And so there is a group of nomenclature experts who say, well, we know how these rules work and you must apply them. And when people want to name a new species, they have to go along and make sure that they comply with that. It turns out, of course, that even the taxonomists of yesteryear didn't know their Latin and Greek well. There are still some validly named species that break the rules, you know, that have a genus name that's in the neuter and then an adjective that's in the feminine form, and they haven't been corrected. And one can argue, is this important or not? In some ways, you can argue that if someone presented a paper written in English, but they used slang or they didn't bother with spelling conventions or grammatical conventions, that's a bad thing. We'd all say that was wrong. But if we're going to do that, why don't we say that it's wrong to use bad Latin, malformed Latin or whatever? The other argument is that these are just arbitrary labels.
And one of the things that the code makes clear is that they don't have to actually mean what they say, they don't have to mean anything. So we still call Haemophilus influenzae Haemophilus influenzae, even though we now recognize it has nothing to do with influenza. And the code makes clear that you can't just go and rename something because your knowledge increases about its phenotypic properties, and you can't broaden or narrow the way in which the name is applied because of that. We've done it ourselves. We've called things a chicken gut microbe. If it turns out that the chicken gut microbe actually occurs in pigs as well, we don't go back and rename it. We just keep the original name. And so when you look at the fundamentals, really, that's all there is to it. They have to be Latin and you can draw them from wherever you like. There is a whole incrustation of recommendations that say, well, when it's a Greek root, you put an O between it and the next root, when it's a Latin one, you put an I, and all this sort of stuff. But these are just recommendations and they can be ignored. And in many ways, they're just fussy and they are actually intimidating to people, off-putting. And I think it's time that we actually swept this away. I'll speak a bit in a moment about an even more radical plan to sweep it away. At the very basic level, we can just say, let's stop being quite so fussy about the way in which names are formulated and described. Well, actually now I want to hear your radical plan about sweeping it all away.
But no, I was just going to add that I'm speaking very personally, so not speaking as chair of the ICSP, but for myself, I'm pretty relaxed about this idea of Latin being what one might call approximate Latin rather than perfect Latin, because at the moment we have a small group of people who serve the bacteriological and archaeal communities very well by checking the formation of names. But that's a dwindling band of people by their own admission. And the reality is that in probably a decade's time, there'll be very few people that can recognise whether a name is malformed or not. And actually, I think by then we will just get on with it. If the name sounds vaguely like a Latin binomial, we'll accept it as a valid name. And actually, it won't happen quickly, but it's worth pointing out that the code itself, the code of nomenclature that the ICSP looks after, is a living document. It evolves over time. It evolves very, very slowly: parts of it go back over 150 years to de Candolle's Laws of 1867. We are now currently on the 2008 revision, but the ICSP itself presently has a public debate underway through Slack, which will end at the end of December and will lead to the publication of a 2022 revision of the code. And some of the things that change are minutiae that people who don't have a grammarian's understanding of Latin won't be able to follow. You know, it is a code that evolves over time. And I can certainly, casting my eye more than a decade into the future, see it evolving to a point where the requirements on absolutely perfect Latin perhaps get relaxed. Well, those requirements aren't even in the code. It just says the words have to be Latin. It doesn't say they have to be perfect Latin. And as I say, many of these so-called recommendations are incrustations upon the basic heart, the core of the code, which is a series of rules. We can talk about this in several ways.
One point that's worth making is that principle one of the code says aim at stability in names. And I propose that we talk about that in a short while. Before that, let me just talk about some ideas about how we can make names as we go forward. So working with Aharon Oren, when we were working on the chicken gut, I said to him, look, we want to name 600 species. How are we going to come up with 600 names quickly? And normally, as Iain's pointed out, it's one name, one paper. People have loads of time to think about that name and polish it off and whatever. But we need 600. So I said, well, what we can do is we can just use this combinatorial approach. So the first root will be the host or the context, like chicken or bird. The second one will be the gut or feces, you know, the sample that we're dealing with. And for the third one, we can just use a lot of generic words that mean microbial, like microbium, biome, soma, or plasma, that have no specific meaning. Obviously, we can't use things like coccus because that implies we know the morphology down the microscope. But there are many of these terms that can be used. And if you did that in a combinatorial way, using 10 roots at each of the first, second, and third positions, then out of just 30 roots you end up with a thousand names. And we applied that approach. And it was productive in that setting. I then went on with Aharon and we wrote a paper where we suggested a million new names using the same kind of approach. But it soon became clear that there were two problems with that. One is that even if you use this combinatorial approach to generate many, many names, it turns out that the number of new species that are out there in particular ecological contexts is much, much higher than we can make names for.
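The arithmetic of the combinatorial approach just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roots below are hypothetical examples, not the lists used for the chicken gut work, and no connecting vowels or Latin grammar are handled.

```python
from itertools import product

# Hypothetical roots, chosen only for illustration.
HOST_ROOTS = ["galli", "avi", "pulli"]             # host or context
SAMPLE_ROOTS = ["entero", "copro"]                 # gut or feces
GENERIC_ROOTS = ["microbium", "soma", "plasma"]    # generic, no specific meaning


def combine(first, second, third):
    """Join one root from each position into a capitalised name."""
    return ["".join(parts).capitalize() for parts in product(first, second, third)]


def name_count(roots_per_position, positions=3):
    """Number of names available with the same number of roots at each position."""
    return roots_per_position ** positions


names = combine(HOST_ROOTS, SAMPLE_ROOTS, GENERIC_ROOTS)
print(len(names))        # 3 * 2 * 3 combinations
print(names[0])
print(name_count(10))    # 10 roots at each of 3 positions
```

With 10 roots at each of three positions the count is 10 × 10 × 10 = 1,000, which is the "thousand names out of 30 roots" figure. The sketch also shows the length problem: every name is three roots long.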
So if we just wanted to make names for gut microbes, with our most creative thinking we could perhaps come up with a few hundred or a few thousand. But we know that if we look at all the vertebrate gut microbiota out there, there are going to be tens of thousands or hundreds of thousands of names needed. So that descriptive approach of trying to describe things and create a name didn't scale very well. And the other problem is that if you start taking all the different roots and concatenating them, you end up with some very, very long names. And one of the key points is that these names are just handles and we want them to be handy and easy to use. We don't want to have a name like electrio-intestinal microbial, you know, we don't want to have a long word like that. We want them to be short and punchy so that we can remember them. And so that was where we got to a year or so ago. But looking at GTDB, Phil set us a problem, in that, as it turns out, in GTDB they named all of these uncultured species, but they named them with placeholder names that were more like telephone numbers or postcodes. They were useful placeholders, but totally hard to remember or say, you know, SP00066918914. That's a good species, that one. We'd have genus, genus names by E2. And I said, no, what we've got to do is rename all of that stuff. Like we did with the chicken gut, where we did 600 Linnaean binomials, we've got to give properly formed Latin names to everything in GTDB. And from what I remember, about a third of the species in GTDB are named, and two thirds of them have yet to be named. So can I just play devil's advocate for one second? I mean, several people have said to me, and I've seen it on Twitter too, of course, that they kind of like the placeholder names, because then they know that's an uncultured species or uncultured lineage, as opposed to one with a Latin name. So what are your thoughts on that?
Well, one of the things that we haven't mentioned is that the current code has this problem with uncultured organisms, but it does have a get-out. It says, if it's uncultured, you can stick Candidatus in front of it. You still give it a perfectly formed Latin name and stick Candidatus in front. In GTDB, you decided not to do that. NCBI still does do that. There are arguments for and against it as to whether it's cumbersome and all that sort of stuff. But there is already a way of flagging things to say, well, this is Candidatus. But in a sense, where do you draw the line? Because some of those things that haven't been cultured have been very well characterized. People have reconstructed their metabolism, done studies on them. Mycobacterium leprae really shouldn't have a name, even though it's in the approved lists, because it's never been grown in axenic culture. There's another organism called Mycobacterium lepromatosis, which has a very similar pathology and ecology, and that doesn't have a valid name because it can't be grown. So, you know, this is the fundamental issue: why should we flag things just on this operational criterion of whether they're going to be able to grow it and stick it into culture collections? That seems to be a bit mistaken. I know I've argued that, well, we can muddle through with Candidatus, but fundamentally, I don't think that that is the way forward. The issue that I have with the use of the Candidatus status is that, for clarification, Candidatus names lack priority, which means that they can be effectively overwritten by anyone who says, well, those people called it Candidatus XY, but I want to name it AB. You know, without getting too detailed about the concept of priority, that is a fundamental flaw. In reality, that very, very seldom happens. There have been one or two cases among thousands of Candidatus names.
It hasn't happened very much in the past because we've been dealing with relatively small numbers of organisms coming into culture. But if we have a facility to name uncultivated organisms, then people might be naming at scale, as indeed you have done yourself. And then people might say, well, actually, I'm going to overwrite all those Candidatus names with my name. You know, if people start publishing papers that have 10,000 names in them, then we could have absolute nomenclatural chaos. But then in that situation, it is down to peer reviewers and editors to do their job and say, you can't name it, that's already been named. The idea that there are these kind of Olympian gods of nomenclature who decide whether those names have standing or not, that's irrelevant for most people. But to use the sort of argument you would have used, you know, if somebody proposed a whole load of names, they would challenge the editor and say, why can't I do that? These names don't have priority. My point is, the editor would say, the first principle of the code is aim at stability of names. You're just creating confusion. But unless names have priority, they don't achieve that stability, because somebody can always overwrite them. I'll just finish by saying on that point that there is a reason why rules of priority exist in all of the major codes of nomenclature, for plants, for animals, and for bacteria and archaea. And the fact, therefore, that Candidatus names lack priority is their Achilles heel. Well, I would argue that you're talking about the 2% that have been cultured up to now, this little parochial argument over the last century or so. In the millennia to come, when we discover the million other bacteria that haven't been named yet, nobody's going to care about priority because we'll be naming tens of thousands or hundreds of thousands at a time.
Nomenclature systems obtain their authority, and obtain their stability, to go back to principle one, from their rules. So yes, the SeqCode was mentioned there, and I'll explain this as quickly as I can through some historic context. I made a gag when we were talking earlier. It was a cheap gag, but it was a good one. The committee should really be better called the International Committee for the Systematic Culture of Prokaryotes and the code itself would be better off called the International Code for the Nomenclature of Prokaryotes. That's because rule 30 of the code places culture at the heart of the ability to validly name bacteria and archaea under the current system. And that of course creates this headache, which has been an elephant in the room for a long time now, that we cannot validly name uncultivated organisms. So some of the first people to really take this on, Ramon Rosselló-Móra and Kostas Konstantinidis, I hope I'm pronouncing Kostas's name right there, published a fairly provocative article in the ISME Journal about this, and at about the same time Barny Whitman published some very high-profile proposals to amend the code of nomenclature that would allow the use of genome sequences as type and would therefore bring uncultivated organisms under the aegis, under the umbrella, of the code of nomenclature. Personally I was in favour of that, but when it went to the vote of the ICSP it became very clear that the majority of the ICSP were not. A two-thirds majority of the voting members of the ICSP voted down the Whitman proposals, and that meant that the code stayed as it was. It meant that the valid naming of uncultivated taxa cannot be achieved under the ICNP. And so it became inevitable based on that, I think, that a separate code of nomenclature that would allow the valid naming of uncultivated taxa would be developed, and a group of people has been working on that.
In the interest of full disclosure, I am one of those people and so is Phil, so we have been working on the development of a parallel code of nomenclature called the SeqCode, which would allow the naming of uncultivated taxa and provide a naming and nomenclature framework. A manuscript describing the first draft of that code, or version one, should we say, of that code, is currently under peer review. So people interested in the SeqCode can find out a bit more about it through the ISME website. There is a link from within the ISME website. One of the reasons for that is, as I mentioned earlier, that the ICSP operates under the umbrella of the IUMS. Codes of nomenclature do evolve over time and they need structures to maintain them, so one of the visions that we have is that the SeqCode will be administered under the umbrella of the ISME society, and that is very much a work in progress. But we would like to think that we could use the SeqCode to, I think very rapidly, validly name large numbers of taxa from classifications of uncultivated organisms like the GTDB that we were discussing earlier. So that initiative is coming and hopefully coming sooner rather than later. The idea is to put the first draft of the code there for community feedback. Yeah. How can you publish the paper without having done that first? I think you could compare us actually to where we were in 1948 with cultivated organisms. So in 1948 the first draft of what became the bacteriological code was published, if my memory is correct, initially in the Journal of Bacteriology, and then there was a mirrored publication in what was then the Journal of General Microbiology, and that said, well, this is a workable code, and then the community got on with it.
But the world's changed since then, you know. We didn't have preprints in those days. Basically you sat on what you were doing until the last moment and then published it, and everyone agreed, oh, you were the expert and we can't query you. We live in the age of Twitter, we live in the age of democratisation, and SeqCode really hasn't shown itself in a great light by the fact that it's hidden itself away. You say there's a website, but there's not much on that website to tell you what's going on. What I hear down the grapevine is that they've arbitrarily drawn a QC boundary to say we're only going to allow you to name these things, so they've trampled all over the whole idea of freedom of taxonomic thought. So it's a code of taxonomy rather than nomenclature. Personally, I welcome SeqCode in principle, but in practice I'm a bit concerned about the way it's being rolled out. I think it'll evolve, I think it may evolve fairly rapidly. We need to put the structures in place to allow that. We are reaching out to the community. In February 2021 we had online workshops that were attended by many hundreds of delegates, and, frankly, having been to taxonomy sessions at physical meetings, you're lucky if you can get 50, so we did reach a reasonably good audience with those. And the very first draft of the code was made available as reading material to the delegates in those workshops. And we have had very useful feedback from the community. Some of it was supportive, and a few things have been taken forward in the draft.
I think the complexity of the task, and I take the point that you could say we are a self-appointed group of experts, but I think the complexity of the task took the effort, perhaps, of a small group to come up with a first working draft. And now what's happening is that we are taking feedback from the community on how to make it more functional and more acceptable to people. I was personally more in favour of setting general standards, but I understand the idea that if you start with stringent standards, you can relax them in part with input from the community. If you start with lax standards and later end up in a chaotic situation, that would do more harm than good. I think we'd have to look to see whether we can put it on the website, perhaps just the draft code on the website. We may want to do that while it's still under consideration. I say the code is in the supplementary material because it's quite long. It makes general sense to write about why we're going down this road, and then the code is presented in the supplementary material. That's all well and good, but why does the code need to be so big? Surely we should be making it easier and more accessible. One of the big flaws of the code at the moment is that when you read through the published version and look at the length of it, it's long and tedious. And adding layers of general provisions is not the way to go. The way to go is to, you know, come back and say, I'm going to have rules that work, rather than putting layers upon layers upon layers. OK, but we are taking feedback on that, I think.
OK, Mark, you've mentioned arbitrary names, but what do you mean by that? Well, if you look at the code, it says that you can use arbitrary names, and if you look back to Linnaeus at the connection between the name and the organism, there have been many examples of arbitrary names being used in taxonomy over decades or centuries. If you look at the problem of scale, we have 30-odd thousand, 40,000 species to be named in GTDB in the latest version, and Phil says another 17,000 are coming in the next version, if I remember correctly. So even if you spent one minute on each, thinking, I'm going to come up with a Latin name that means something for each of these species, it would take you years of work. So what we have to do is come up with a new approach, one that meets the requirement that we have Latin names that look like Latin names but are arbitrary. They are just handles that we can use, and that's what we want when we want to name the things that we can see. So what I've done, and you've helped me with the coding, I learned Python at 61 years of age to make this happen, is that we take the beginnings of Latin words, the first syllables of the words in a Latin dictionary, de-replicate those, and then add the range of Latin endings that can be used to form feminine names. And then we screen through 6 million entries in the databases to make sure that those names have never been used before. We screen them against all the names that are used in taxonomy, make sure they haven't been used before, and end up with the 60,000 names that we can apply to the species that need them.
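The stem-plus-ending pipeline described above can be sketched as follows. This is a minimal illustration, not the speaker's actual script. The word list, the endings, and the set of already-used names are all hypothetical, and a fixed-length prefix stands in crudely for "the first syllables" of each word.

```python
def candidate_names(words, endings, stem_len=5):
    """Take word beginnings, de-replicate them, and append each ending."""
    # Assumption: a fixed-length prefix approximates the first syllables.
    stems = sorted({w[:stem_len].lower() for w in words if len(w) >= stem_len})
    return [(stem + ending).capitalize() for stem in stems for ending in endings]


def screen(candidates, used_names):
    """Drop any candidate that already exists in the set of used names."""
    used = {name.lower() for name in used_names}
    return [c for c in candidates if c.lower() not in used]


# Hypothetical inputs, for illustration only.
latin_words = ["tabula", "tabularium", "fenestra", "lucerna"]
feminine_endings = ["ella", "ina"]
already_used = ["Lucerina", "TABULELLA"]   # stand-in for the real name databases

candidates = candidate_names(latin_words, feminine_endings)
available = screen(candidates, already_used)
print(candidates)
print(available)
```

Here "tabula" and "tabularium" collapse to a single stem at the de-replication step, and the screening step is a case-insensitive set lookup. At the scale mentioned in the conversation the same lookup would run against millions of existing names, so a set rather than a list matters for speed.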
We've done that, we've generated the names. We've also gone further: because the taxonomists like to see protologues, we have also created protologues. The document with all the protologues for the new names for bacteria, for species and other taxa, runs to 10,000 pages. And so I put that out to the community to say, look, these names are perfectly good, they look like Latin names. If you don't know Latin, you can't tell that they weren't derived from real Latin words, but they're good, they do the job, what do you think? And we had a Twitter poll on it. Plenty of people said, yes, this is better than placeholders, and plenty of people clearly didn't like it. And it's under consideration at the moment. To my mind, I don't see any other way that we can name the number of species that need naming, at the scale that's needed, and stay within the Linnaean framework. I'm broadly in support of that plan, although as a microbial ecologist I like things like halo-something, because then I know something about it. I accept the fact that the name doesn't have to reflect the physiology, but I'd say the next iteration could perhaps be improved by looking at the genome first. We have the technology. For example, if the genome contains the mreB gene, which encodes the protein that determines rod shape, then I know it's not a coccus. If it doesn't have mreB, it's more likely to be a coccus.
We could look for sulfate-reducing genes or other genes that could actually guide those names a little bit if we wanted to capture that, because we do have the blueprints, we do have an ability to rapidly screen them and pull out key genes that would help. And I totally take the point that a name just has to be a nice-looking handle that doesn't have to reflect the taxonomy, but I was a little concerned with the early iteration of this where we put words like, you know, cacoplasma, and then there were other names that were pretty similar. I did notice that, but I think now in this latest iteration of arbitrary names, you actually selected them to be as phonetically distinct from each other as possible. And by the way, for plasma, I would look for the absence of cell envelope genes to indicate that it maybe just has a cell membrane like a mycoplasma, because I would have thought plasma would be attached to that kind of organism. I'm broadly supportive of this. I absolutely agree that it's a very practical, pragmatic way of naming at scale. My concern is a little concern, which is this. Say I'm a PhD student who has been beavering away in a lab for two or three years. I've sequenced a genome. I've been working on analyzing the content of that genome. And I'm just getting around to writing my paper describing that genome and naming the organism as a Candidatus organism. And then this chap Pallen overwrites the name for that taxon in GTDB with his arbitrary name. You've had your thunder stolen. And how do we prevent that happening? There's a genome deposited in the public domain. At the time that genome's deposited in the public domain, the person will have assigned a name to it if they wanted to assign a name to it. The fact that it's got a placeholder in GTDB means that nobody has bothered to do that. So we're not overwriting anyone's ability to give names. They can give those names.
There's a perfectly well-formed path for doing that. If you submit a MAG to NCBI, you can give it a name. They won't accept the name until the paper's out, but that's a very narrow window. And the thing is, these are Candidatus names. So if someone says, I don't like that, I want to overwrite it, and the community, the people working on it, say, we don't like that name, we're going to give it a descriptive name, they can do that. We're not forcing anyone's arm here. We're just saying, when I do an analysis of chicken feces, or pig feces, or horse feces, and we run the GTDB toolkit over it, well over half of what we get out are just these placeholders. These are names where, you know, what on earth is this all about? And if I'm trying to talk to a collaborator about what we've found, it's just a mess trying to use that. It's far easier to have Latin names. And so we're not trying to say to anyone, this is it, we've colonized your area of the microbial world forever and laid down our flag. We just say, look, these are effectively placeholders because they're Candidatus. Since we're on the subject of phyla, I wanted to ask Phil about this recent renaming of phyla that people do seem to care about. So Phil, I think you're the one who's been in the process. Why are you asking me? Because you've been in the process and I appreciate your Twitter thread that explained the situation quite nicely. So I was wondering if you could recap on that and talk about some of the flak you've been taking. Well, I haven't personally been taking any flak, but I thought the ICSP and NCBI taxonomy were taking some flak. And basically this is around a very sensible proposal to include phylum in the code, in the prokaryotic code. So phylum is the thing I learned about in school. And you're saying it's not actually in the code.
It wasn't officially in the code. So that means there are no real rules governing it. So it's very important that it's in the code. There are some useful properties about being in the code. First of all, you have to have a nomenclatural type. That's important to provide a fixed point of reference for a group. And I draw your attention here to the fact that, because phylum wasn't officially in the code, you could define a phylum pretty much as anything without any nomenclatural type. And so you have this thing, and I'm as guilty as any, of naming phyla without actually saying, well, what is that connected to? The problem is if somebody makes another tree and in a previous paper you've said, oh, these three 16S sequences represent my phylum, you know, phyllobacteria, and then it splits up in another tree, I don't know where the name should carry through to. So that's why that's important. And it also has no priority if it's not formally recognized. And we've seen that multiple times, where the same group has been given multiple different phylum names. So it's actually a long overdue and important process for the ICSP, which voted on this last year. So everybody said yes, we'll take on the rank of phylum. And then there were some specific questions. So in the prokaryotic code, all of the other higher ranks have fixed suffixes. So you have -aceae for family, you have -ales for order, you have -ia for class. And so, to standardize, you want to have a standardized suffix for phylum, which was voted on as -ota. And then, in order to apply this, the other thing is to make the nomenclatural type a genus. That was the other vote, and it is the same as for family, for instance. And so family has a type genus.
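The "type genus plus fixed suffix" rule described above can be sketched in Python. The suffixes are the ones named in the conversation. The stems are supplied by hand as an assumption, because deriving a stem from a genus name requires Latin and Greek grammar that this sketch does not attempt. The function and table names are hypothetical, and a generated name is not automatically a validly published one.

```python
# Rank suffixes as stated in the discussion.
RANK_SUFFIX = {
    "family": "aceae",
    "order": "ales",
    "class": "ia",
    "phylum": "ota",
}

# Hand-supplied stems (assumption: no automatic stem derivation).
TYPE_GENUS_STEM = {
    "Bacillus": "Bacill",
    "Pseudomonas": "Pseudomonad",
    "Nitrospira": "Nitrospir",
}


def rank_name(type_genus, rank):
    """Form a higher-rank name from the type genus stem and the rank suffix."""
    return TYPE_GENUS_STEM[type_genus] + RANK_SUFFIX[rank]


for genus in TYPE_GENUS_STEM:
    print(genus, rank_name(genus, "family"), rank_name(genus, "phylum"))
```

Because the name is derived mechanically from the type genus, two trees that disagree about the membership of a group still agree on which name follows the type.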
This is fine for the majority of phyla, because if you look, these non-official phylum names are often built off an early genus that's described in that group, like Nitrospirota or Nitrospirae, depending on whether you use the new or old name, which is built from that genus name. But there are a couple of really important exceptions, and those are the Proteobacteria and the Firmicutes. And that's what's got everybody in a big pickle, because if you follow the new rules, then the Firmicutes become the Bacillota, after Bacillus, and the Proteobacteria become the Pseudomonadota, after Pseudomonas. And that's what caused all of the big Twitter furor. And there was a very funny comment from some wit, because I made a straw poll asking people what they would like. And it was the same as when people were up in arms about Pluto no longer being a planet, and they made a poll and everybody said, no, we want it to be a planet. And so that's what's happened. The majority of people want to keep Proteobacteria and Firmicutes. And there's another little interesting twist before I hand over to people that know more about nomenclature than me. And that came from the idea that in GTDB, we've re-circumscribed what Proteobacteria are. So now in GTDB, Proteobacteria are just the alpha, beta, gamma classes. And we've somewhat re-circumscribed Firmicutes as well. So some people are saying, actually, you would like a different name, so it's clear that it's a different taxonomic entity. So that's an interesting consideration as well. If you look at the definition of Firmicutes, it just says a phylum for gram-positive bacteria. And that was how it was described several decades ago. And what is slightly concerning is that when these new names of phyla were published, they just said, oh, it's going to be called Bacillota, but it's got the same description as Firmicutes.
And you think, is that really where we're at in the 21st century, that we define a phylum by saying, oh, it's the phylum for gram-positives? What should have happened is that the phyla should have had names and circumscriptions that were modern. And there's still an opportunity to do that. And there's even an opportunity to save the old names, or nearly save the old names. Suppose we just took one of your unnamed genera, maybe one of the split genera that's in GTDB, where you stick an A after the end of it to say that you don't recognise it as being part of the original genus. If we got one of those where we've got deposits in two different type culture collections, we could name that the type genus. And we could call it Firmicutes. And we could have Firmicutota, which would be near enough the same as Firmicutes. And we could save the name. But the trouble is that the people that do this stuff are not creative people. They're very much driven by rules. And they have to follow the rules. And they have to roll those rules out. And there's an argument for being consistent. But if you combine consistency with creativity, you can get around a lot of these problems. So I'm half-minded to go and do that, actually, just to publish a paper that says, here's a new genus called Firmicutes. And here's a new genus called Proteobacterium. And we'll name a new phylum after it. And we'll circumscribe the phylum using the techniques of GTDB, rather than waving our hands around and saying it's the phylum for gram-positives. Sometimes the nomenclature experts do get themselves into a situation where they say, we have to make this right, and we don't care whether the community cares about the changes. 20-odd years ago, Hans Trüper did this, where he said, oh, there's a load of bacteria that have been named where the species name is actually a noun. But it describes a thing. It doesn't say of the thing. It describes a thing.
So you're calling this bacterium a pineapple, rather than saying of the pineapple. And we must change that. And he went and put all these changes over things that were already established names. And people complained at the time. So it's a difficult issue. I mean, Ian's going to say, well, people will get used to it. And maybe they will. Let's see what Ian actually has to say. I was going to say that, but not straight away. The thing that I was going to say is about the elegance of the code, and it is a genuinely elegant document in the way it's constructed. The new rules, and this is where rules are important for nomenclature, are actually pretty clear now. Priority will follow from this, which is what will prevent you from proposing alternative names. The way the code is now written means that the phylum that contains the genus Bacillus must be called the Bacillota. And that is really quite straightforward, I think. Those historic definitions, like, well, these are gram-positive, are genuinely guff, because they're not all gram-positive for a start. Sorry, the Negativicutes. Yeah, but I'm sure you could find some gram-variable ones if you look at the original descriptions, you know. So that system of nomenclature is really quite clear. The phylum that will eventually be named that contains the Clostridia can be given a name, and one would hope it would be given a high-profile name, like Clostridiota. That's really quite straightforward, and people will get used to it. I'm in this interesting situation. I got involved in a spat with some veterinary scientists about our proposals to move Rhodococcus equi into the genus Prescottella as Prescottella equi, which wasn't universally popular. And yet I am long in the tooth enough to remember that many of the same people were very upset about the proposals to rename Corynebacterium equi as Rhodococcus equi.
And so they all got used to calling it Rhodococcus equi quickly enough. I actually think the younger generations that come through adopt the current classification and the current names pretty quickly. I'd be interested to look back and see if there was a big outcry when the purple bacteria were renamed to Proteobacteria. The only way in which this matters is when someone uses the GTDB toolkit or an equivalent from the NCBI and wants to name their stuff and come up with a taxonomy. And so the names that you apply, Phil, are the ones that people are going to care about. The idea that someone in authority has named it, that doesn't really have any impact. It's what actually happens. So if you go to the NCBI taxonomy, Firmicutes are still there. For Proteobacteria, there's no Pseudomonadota in the NCBI taxonomy at the moment. Even though they made that declaration, if you go to their taxonomy, they haven't changed it yet. And you haven't changed it in GTDB. And so it doesn't matter what the so-called experts say, and it would be interesting to see what the SeqCode wants to do with it as well. But yeah, it's an interesting question. And there's no right or wrong answer to this. Is it just like forgetting to change the date on the calendar when you get into January, you know, change the year and we'll catch up? Or is it something more fundamental, that actually we're all used to this and we want to stick with the old names? There's nothing to stop people using the old names as vulgar names anyway. We've had decades of calling the phyla by names that don't have any standing. So people can continue to use those names. Well, my only final comment would be the reminder that the arguments are about classification. So the ICSP oversees a code of nomenclature, but the rules only apply to nomenclature. And really what people are getting upset about is whether a classification matches their perception of the world or not.
And that's taxonomic opinion, which I think... There are two issues. There's a change of names because of a change of taxonomic opinion. And there's a change of names because of the rules of naming, or whatever. What's happened here is a change of names because the nomenclature experts want to change them. That's what I'm saying. Nobody has reclassified Firmicutes in a different way when they call it Bacillota. They've just ported across the decades-old description, gram-positives. And that, I think, is troubling. And this is where GTDB actually is consistent. It has an approach. And for the names in GTDB, perhaps we need to just roll out protologues, even for the named things in there, to say, this is the GTDB taxonomy. Here's a protologue to name this species according to the rules of GTDB, because this differs from what people said in the past. And certainly for the higher levels, it will be. I mean, none of those things have ever been defined before. As far as I'm concerned, the world of taxonomy began with GTDB and everything that went before was chaos. Alison Murray, who Phil and I have worked with on the SeqCode project, alerted me to this quote, which is apparently from Bill Bryson's A Short History of Nearly Everything. And he wrote that taxonomy is described sometimes as a science and sometimes as an art, but really it's a battleground. And Cowan, who I mentioned earlier, writing in the 1950s, summarised this, that the taxonomists do like a good scrap. You know, earlier on we quoted Darwin. Let's quote Newton now. I mean, Newton, when he came to the end of his life, he said, I do not know what I may appear to the world, but to myself, I seem to have been only like a boy playing on the seashore and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me. The arguments we've been having are about a few pebbles of clinical importance or a few shells representing cultured organisms.
What we should rejoice in is the fact that there is this great ocean of microbial truth, as you've called it, this sublime scale of the microbial world. Through the techniques that Phil and others have been developing, we now have a glimpse of that great ocean and we now have a way of charting it as we go forward. And we should be rejoicing in that instead of arguing over this angels on a pinhead stuff. You know, the great vision is there. Darwin's dream is real, and it's here, and now, in sequences. On that, I think we will close. That was a marathon effort. I want to thank our esteemed guests, Professors Ian, Phil, and Mark. This has been almost a crash course on bacterial taxonomy. I've been Nabil, with my co-host Andrew, and I wanted to thank you all for tuning into our holiday special of the MicroBinfeed podcast. We will have a lot of extended references for you to read in the show notes, see the description on your podcast platform, and we'll see you next time. Thank you so much for listening to us at home. If you like this podcast, please subscribe and rate us on iTunes, Spotify, SoundCloud, or the platform of your choice. Follow us on Twitter at MicroBinfeed. And if you don't like this podcast, please don't do anything. This podcast was recorded by the Microbial Bioinformatics Group. The opinions expressed here are our own and do not necessarily reflect the views of CDC or the Quadram Institute.