Video: Coffee & Viz: Visualizing the Microbial Ecology of Sourdough Starters

In 2016, the Public Science lab (led by Rob Dunn in Applied Ecology at NC State) launched the Sourdough Project, an effort to characterize the microbial diversity associated with sourdough starters across the globe. We partnered with researchers at the University of Colorado Boulder and Tufts University to collect and analyze 500 sourdough starters from 17 countries. In addition to discovering novel patterns in microbial membership and function across sourdough starters, we also discovered that bakers (the people who grow and tend sourdough starters) are uniquely attentive to their microbial gardens, and their collective observations, anecdotes, and questions greatly enriched our research. But how could we report our findings effectively to our stakeholders?

In this presentation, Erin McKenney will share the different goals and strategies associated with designing effective data visualizations for publication versus public education.

View the slides and linked content here (Google Slides).

Listen to and support Jude Casseday's album, RISE: Sonic Sketches from Sourdough Cultures

  • Transcript

    It is my enormous pleasure to introduce our speaker today, Dr. Erin McKenney. Dr. McKenney is an assistant professor in applied ecology here at NC State. Her research blends microbial ecology, nutrition, and comparative gut morphology. In addition to being an accomplished researcher and educator, she also engages the public through citizen science projects and by telling compelling narratives through data visualization, both of which you are about to hear and experience today.

    So thank you for joining us, Dr. McKenney. And I'll let you take it away.

    Awesome. Thank you so much, Hannah, for that incredible introduction, and for all of the thought that goes into the production of these Coffee and Viz talks. I think I gave the last in-person Coffee and Viz over in Hunt Library before we shut down for the pandemic. And it was-- amazing and tremendous are pale descriptors.

    It was awesome. I am thrilled to be back, albeit in a digital format. So thank you. Thank you for having me to tell another story.

    So as Hannah said, today I'm talking about sourdough starters, the Global Sourdough Project specifically. And I'm going to be sharing some different design decisions that went into visualizing our research results, differences between figures produced at different stages and for different purposes. So it's all context dependent.

    [laughter]

    And in case you missed our little immersive sourdough experience, Data vis? Meet Data vox, I have also been working for a few years with Jude Casseday, a local soundscape artist in Durham, who has sonified a lot of the data from the Global Sourdough Project that I'm presenting today. So if you're interested in listening one more time, there's a link to that album. And you'll notice all the text in my slides that is teal and underlined, or turquoise and underlined, indicates a hyperlink. And I'll be sharing those slides out through the libraries after this talk. So what's mine is yours.

    So a bit of context about this research project, I first want to just say I did not do this alone. I'm speaking on behalf of a widespread team, widespread geographically and in terms of our various expertise.

    We have got specialists in microbial culturing techniques. We have specialists in computational algorithms and the data analysis. We have specialists in citizen science, in communications with the public. We have specialists in the history and culture of food. So this is definitely a team effort. And a tremendous, huge thanks to Neil McCoy, who helped us really get a lot of our first visualizations off the ground.

    And then a bit of context about the data, the cycle that we employed in this project, it was a citizen science-based research project. That will become more apparent, like how critical the citizen science component was to actually making the project happen when you start to get an idea of the scale of the project.

    So it started with citizen science and outreach, as a way to engage and educate the general public. Folks want to know about their sourdough. How do I make the bread that I make? Is it all about me as a human, or how much of it is microbial? So we'll be digging into a few of those questions.

    And so anything that we're doing with the public tends to take an interactive format, even if it is housed and archived digitally online. And that's at the beginning stages, when we are actually receiving the surveys and then summarizing and returning some of the results at those early stages of research. Or at the end, after we've completed our research, if we ever feel like the research is done, you'll find-- I hope it will be apparent throughout that every question that we've been able to answer has unlocked and revealed-- it's like a hydra. You cut off one head, seven or 12 more heads grow on. Every question that we're able to answer reveals multiple unanswered questions that we now need to dig into, or someone should.

    So when we're reporting back, we're translating those results for public consumption. But then in the research process itself, we tend to be using more static, multipanel figures, more typical of the scientific publication process within academia, in different peer-reviewed journals. So look for those-- that dichotomy in data visualization.

    And as far as this talk's structure, I tried to structure it around a series of vignettes. So the figures progress from the preliminary, to the exploratory data, to our publication choices.

    The early figures, I presented at the International Bread Symposium at Johnson and Wales University over a couple of years, so 2017, 2018, and 2019, so pre-pandemic. And that was a great opportunity for me to test out the data-- the science communication, through these iterative, developing data communications or data visualization attempts. The final multipanel figures are all published in eLife, the peer-reviewed journal. And again, there's a link to that paper online.

    And I cannot ever give a talk about this without giving kudos, respect, tons of credit to the bakers and the citizen science participants who have truly informed our research through their lived experiences and expert questions. So they have offered so much perspective that has really enriched and dare I say leavened our research task. You can't do research without having puns. And sourdough is rich with puns.

    So, first, where did these samples even come from? In order to communicate that back to the public as a big thank you, we first sent out a survey: just tell us about your sourdough, tell us about yourselves, give us an idea of why you came to prepare sourdough starter and take care of it, what importance does it have in your life, what do you make with it, what are your maintenance practices, what kind of flour do you feed it? We got a thousand responses via survey. And then in fine text at the bottom, we said, oh, and also, PS, if you want to know more about the microbes in your sourdough starter, you can send us a sample. And of the 1,000 or so participants who filled out the survey, 500 people sent us samples of their sourdough starters.

    So if you would like to explore this interactive map, feel free to take a minute. Here's a link to it. And you can see how we reported back as a thank you for your samples.

    Here is the scale. Again, I'm returning to scale, 500 starters. It takes about two weeks to grow a starter from raw flour and water, to a bubbly microbial garden that you can use to bake with. And I have taken care of and maintained 24 starters at a time. I can tell you that takes-- with measuring.

    I was also taking data on height and pH of the starter. It took three hours a day. So scale that up to 500. This research would not have been possible without our citizen science participants.

    So for those of you who maybe haven't gone to the map, if you go to the website, then you can actually see that we do have a Western bias. We are missing samples from South America. We are missing samples from Africa. And we're also missing samples from most of Asia.

    But for each of these samples-- and you'd have to play with the zoom to get it right-- when you hover over the map, you can see here are all the participant IDs. If you click one, then we have a featured picture of-- these are the pancakes that this participant actually made with their starter. And then we have the most dominant yeast taxa and then some of the most dominant bacterial taxa. And we tried to include notes on what they actually do, their ecological roles and functions.

    So a participant can either hover over and look and see what are the geographic patterns, so they can do some exploration of their own. Or they can actually search for, whoa, which is my starter and actually enter that in-- so in that interactive map.

    But then when we went to publish, taking a static, multipanel figure approach, here it is. Here's figure 1, an intro visual primer on the entire premise of the project. Here's the background of-- and I'll take you through panel by panel.

    So in panel A, we have sourdough starters. How do you make one? You have your initial starter, that has flour, with microbes in the flour; water, with microbes in the water; and then environmental microbes, filtering as well.

    And it's a kind of a serial, S-E-R-I-A-L, not cereal, like the grains, a-ha. We have this serial backslopping, where you remove part of the community to feed what remains, so that it can continue to grow. Otherwise, your starter would become-- it would grow exponentially every day. It would double in size every day. So to keep it at a maintainable size, that's why we discard the sourdough.
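
    The arithmetic behind that discard step can be sketched in a few lines of Python. This is a minimal illustration, not a feeding protocol from the project: it assumes each feeding adds flour and water equal to the mass of starter kept (so the mass doubles per feeding), that half is discarded before each feeding, and that the starting mass is 100 g. The function name `starter_mass_over_time` is made up for this example.

```python
def starter_mass_over_time(initial_g, days, discard=True):
    """Return the starter mass (grams) at the end of each day.

    Assumption for illustration: each feeding adds flour and water equal to
    the mass of starter kept, so the mass doubles at every feeding.
    """
    masses = []
    mass = initial_g
    for _ in range(days):
        kept = mass / 2 if discard else mass  # backslopping: discard half before feeding
        mass = kept * 2                       # feed: add fresh flour + water equal to what was kept
        masses.append(mass)
    return masses


no_discard = starter_mass_over_time(100, 14, discard=False)
with_discard = starter_mass_over_time(100, 14, discard=True)
print(f"Day 14 without discard: {no_discard[-1] / 1000:.1f} kg")
print(f"Day 14 with discard:    {with_discard[-1]:.0f} g")
```

    Under these assumptions, two weeks of daily feeding without discarding would turn 100 g into more than a tonne and a half of starter, while backslopping holds it at 100 g.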

    And then after some number, 10 to 14 feedings, you get a starter that is mature, and active, and can actively leaven dough. So then we can mix it with a larger proportion or total amount of flour, and water, and some salt. And then after some amount of fermentation, we can bake bread-- so the process in that panel.

    And then here's the static version of that interactive map. We had 500 sourdough starters. Here's where they were around the world.

    And then a bit more about that-- what flours were they fed? And so from our total proportion of samples, this gives us an idea that across the world, most sourdough keepers are using unbleached all-purpose flour. But then there's bleached wheat. There's rye, whole grain. And there's whole wheat, whole grain.

    And then we had some other stats based on the human-driven characteristics and reporting from those surveys. Where are people keeping their starters? Mostly in the fridge, in between bakes. Otherwise, you have to feed an active starter two to three times a day. So we're managing our time, as well as our microbes.

    We have feeding frequency. It varies a lot, depending on how long you keep that starter in the fridge and how often you bake. And then the age of the starter-- most starters are young. But we do have some that are even 200 years old or more. So a credit to Matthew Booker, who, in an early discussion, referred to sourdough starters as the heirloom pets. They outlive you, right, just a beautiful phrase. So I have borrowed that wholeheartedly.

    And then panel G, our starter origin. Most of our participants grew their own starters from scratch. A few procured them-- and by a few, I mean almost a hundred. But a relative fraction of our participants procured their starters already made from a business.

    So then we want to know, which microbes are living in these starters? So these are bar charts. Keep in mind, on the left, for comparison, there's a thin, black line. That's the size of one sample's bar chart.

    And what we have here are 500 bar charts smushed together in a single slide. So it's very difficult to identify specific starters here. But we can get overarching patterns of global diversity. And again, this was a preliminary figure.

    So on the top, we see the yeast communities. And that entire rust-orange block is Saccharomyces cerevisiae, the same species that is commonly sold commercially by Fleischmann's or Red Star. If you buy it in a packet or in a little jar in the store, that's Saccharomyces cerevisiae, the same species.

    Now, there are many different strains of Saccharomyces cerevisiae, kind of like there are many different breeds of domestic dog. They're all Canis familiaris. But there are so many different breeds.

    So what we see from that top, smushed bar chart approach, a global view of sourdough starter yeast communities, is that Saccharomyces cerevisiae is globally prevalent. It occurs in sourdough starters across the entire world. And when it does occur, it tends to take over the sourdough starter.

    But you see, on the right, kind of that confetti happening, or it might feel a little matrixy. That's a sign of a tremendous amount of yeast diversity, that we hadn't previously known about or appreciated. So there are, in fact, 70 different species of yeast.

    And again, in the canine terms, that would be not just domestic dogs, not just wolves, but maned wolves, and timber wolves, and red wolves, and Arctic foxes, and Fennec foxes, and you name it-- many species of canines. So there's a diversity of yeasts associated with sourdough starters that we didn't formerly appreciate.

    And some of those yeasts, you can see that thin block of gray or that thin block of salmon, toward the right, of the top graph. Some of those yeasts also tend to take over their sourdough starters.

    On the bottom, we have a similar view of diversity, previously unappreciated diversity, of lactic acid bacteria. Those are the bacteria that produce those sour flavors, that make you salivate when you think about them, because our bodies recognize acid as a source of nutrition.

    So we see a lot more variability among starters. But we do still see some blocks of color. This block of what-- NC State, brick red? And then we see another kind of salmon block, that there are some bacteria that tend to occur in starters across the world, no matter where they're from. And of those, some of those bacterial types will take over the entire bacterial community.

    Another way of presenting these data is to look for patterns. Of those most prevalent types of bacteria and yeasts, do we tend to see patterns of co-occurrence? And that would be the track that we played from Jude Casseday's album. Of these most common or most relatively abundant types of yeasts and bacteria in starters across the world, do we tend to see patterns in specific types of bacteria and yeast occurring in the same starter?

    And that really came to light when we used this heat map approach. You can see those color bands. Not only is any single starter pretty well dominated by only a very few types of bacteria and yeast, but there are specific types of bacteria and yeast that get along better.

    And when we then ask the question of, is that a predictable pattern, we see yes. On the left side of this heat map, you see this family tree of all starters grouped by the similarity of their microbial communities. So we can see about 15 different types of sourdough starter communities based on the most prevalent types of bacteria or yeasts.
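
    The idea of a "family tree" of starters grouped by community similarity can be sketched as follows. This is a toy illustration, not the published analysis: it assumes Bray-Curtis dissimilarity and average-linkage agglomerative clustering (neither is named in the talk), and the four starters and their relative abundances are invented for the example.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical, 1 = nothing shared)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest groups until n_clusters remain."""
    names = list(profiles)
    # Pairwise dissimilarities between individual starters, computed once.
    dist = {frozenset(p): bray_curtis(profiles[p[0]], profiles[p[1]])
            for p in combinations(names, 2)}
    clusters = [[n] for n in names]

    def group_distance(g1, g2):
        # Average linkage: mean dissimilarity over all cross-group pairs.
        pairs = [dist[frozenset((x, y))] for x in g1 for y in g2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: group_distance(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical relative abundances for four made-up starters (columns = taxa).
starters = {
    "starter_A": [0.90, 0.05, 0.05, 0.00],
    "starter_B": [0.85, 0.10, 0.05, 0.00],
    "starter_C": [0.05, 0.10, 0.80, 0.05],
    "starter_D": [0.00, 0.05, 0.85, 0.10],
}
groups = average_linkage_clusters(starters, n_clusters=2)
print(groups)
```

    In this toy data set, starters A and B share one dominant taxon and C and D share another, so they fall into two groups. A real analysis of 500 samples would more likely rely on an established library routine than on this hand-rolled loop.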

    And then, Ria, you said, how do you think an expanded sample base from areas in Africa and Asia might impact the yeast communities? Would you anticipate more diversity or would it be appropriate to assume that the main strains continue to dominate? That is a great question.

    Let's see. So from other research, not sourdough related, folks have demonstrated that yeasts, as a type of fungi, tend to be very dependent on climate factors, temperature and humidity. So I would suspect that if we include more different climatic regions, we might discover novel yeasts. So we might be able to increase the yeast diversity associated with sourdough starters from 70 to a greater number of species. Oh, thanks, Ria.

    So, yeah, that would be my suspicion. And with that increased diversity of yeasts, we might also-- we might also find an increased diversity of bacteria that tend to co-occur with those yeasts. And also, appreciating and understanding that sourdough ferments across the continent of Africa may involve many different grains, not just rye or whole wheat, which tend to be more European derived. So, again, with those grains, we might see different microbes associated or that are favored by the different nutritional components in those grains. So thank you.

    So those were the preliminary figures for just the community composition. Here is our figure that we published, the standalone, static, multipanel, peer-reviewed figure that's published in the journal eLife. I always try and pause at this point, even if I'm narrating my pause, to let folks explore and just get an idea of, wow, this is dense; also, what am I trying to gain from this figure?

    So now, working through this figure, with panel A, we did keep those samples organized by community similarities. So the tree across the top of panel A of this figure is the same kind of similarity tree that we saw on the left side of the heat map. And we kept the original bar charts, with bacterial and yeast membership. So you'll see similar potential banding patterns.

    We put the bacteria on top this time and the yeast on bottom. And you'll see, with that different order by similarity, instead of by percent ranked abundance of the most dominant types of bacteria and yeast, we're not getting that whole block of rust-orange anymore. But we can still certainly recognize that the rust color, Saccharomyces cerevisiae, is highly prevalent and predominant across all the starters. And then we organized the legend, similar to the heat map, by those functional groups.

    And then panels B and C, we actually wanted to highlight the geographic story because, as we were mulling over the results on our own, we were also getting a ton of questions from participants, who were rightfully so very eager to learn about their sourdough starters. So they were asking questions, including, when will we see the results? And we said we have to tease through these results and make sure that it's a strong story, that is accurate-- so a bit of education about the length of time that the scientific process takes.

    But then in that publication, our decision to include these figures was informed by that geographic interest. And we were able to identify some microbial types that tended to cluster based on geography within the United States. And we used that US focus-- in part, because we had such a high density of samples, we actually were able to detect statistically significant geographic correlations.

    And then in panels D and E, again, informed by the bakers' interests, the practice and that investment of our stakeholders, based on whether the starter was purchased from a business or an individual, whether the starter is young or old. So we can see in panel D, a lot of these characteristics are linked back to that initial figure, where we summarized, here are kind of the summary stats from our data set. So we're able to link between the figures throughout the paper for a more cohesive narrative story.

    And this helps to answer a lot of questions that bakers have. If I have an older starter, what am I more likely to have growing in that starter? And, hopefully, this is edifying for folks whose starters were not sequenced and included in this portion of the project.

    And what we found is, overall humans don't matter as much as we think we do. In fact, for this entire variable data set, all of those human-controlled maintenance, and care, and feeding regimes, everything that we do that is-- human centrically, we think it matters so much. I'm the baker with all the control-- it actually only accounted for and explained 10% of the total variation across the entire data set, all of our human practices combined.

    This is one of my favorite parts about studying microbial communities, because it is constantly humbling. We think we do so much. But really, the microbes are underlying everything that we do. We've maybe been manipulated by microbes across history to do these things, in order to make bread.

    So if it's not about us as much as we might think, then what's causing these patterns? So we asked a couple other questions. And these are preliminary figures. We have, do fast-growing microbes outcompete the slowpokes? Is it kind of like when you-- since backslopping your starter every day is almost like mowing your lawn, are you selecting for the weedy species, like dandelions and grass, because you're mowing out the trees?

    So we actually had a side project investigation to map out the growth patterns of some of the most dominant microbes. And you see tremendous variation. It's not necessarily who grows fastest, though we know that ecologically the growth rate does make a difference in many communities.

    What we then did was have a time-lapse video. And here's a link in the Chat to the time-lapse video. And you should be able to watch it. Yes. OK.

    So you have this time-lapse video. And it's actually playing. And now here we can see some of these starters are actually growing up. We can start to see them late. But for an 11-second video, that's over, I think, a 24-hour period, the time-lapse video is compelling for talks. I get to show you 11 seconds of doughs rising in test tubes.

    But these videos are also a useful source of data in themselves. So the dough rise analysis was using a common garden approach. All these representative starters from each of these different families of distinct microbial communities were fed the same flour and water inputs. And then they were filmed. And we-- thankfully, there is software that tracks the pixels rising and falling. So we were able to fit logistic growth curves using data analysis.
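
    Fitting a logistic growth curve to rise data can be sketched like this. It is a simplified stand-in, not the team's method: the talk does not say how the curves were fitted, so this example assumes the plateau height K is already known and recovers the rate r and midpoint t0 with a logit transform and an ordinary least-squares line. The function names and the synthetic pixel heights are invented for illustration.

```python
import math


def logistic(t, K, r, t0):
    """Logistic curve: height approaches plateau K, with growth rate r and midpoint t0."""
    return K / (1 + math.exp(-r * (t - t0)))


def fit_logistic_linearized(times, heights, K):
    """Estimate r and t0 for a given plateau K.

    Because log(y / (K - y)) = r * (t - t0), a least-squares line through the
    logit-transformed heights has slope r and intercept -r * t0.
    """
    pts = [(t, math.log(y / (K - y))) for t, y in zip(times, heights) if 0 < y < K]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    slope = (sum((t - mean_t) * (z - mean_z) for t, z in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    intercept = mean_z - slope * mean_t
    return slope, -intercept / slope


# Made-up example: dough heights in pixels every two hours, generated from a known curve.
hours = list(range(0, 25, 2))
pixels = [logistic(t, K=300, r=0.4, t0=10) for t in hours]
r_hat, t0_hat = fit_logistic_linearized(hours, pixels, K=300)
print(f"r = {r_hat:.2f} per hour, midpoint = {t0_hat:.1f} h")
```

    With real, noisy measurements the plateau is usually not known in advance, so a nonlinear least-squares routine that estimates K, r, and t0 together (for example SciPy's `curve_fit`) would be the more typical choice.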

    So we went from a video that is fun to watch on loop, when it loads quickly, to pixelated data that then we can graph. So these are some preliminary figures visualizing-- telling the story to ourselves as researchers-- before we decide what's the best way to optimize this visual story for publication and for the public as we report back?

    So here you can see these different clusters represent the different, similar sourdough families. So they have statistically significant-- or statistically similar microbial communities. We had different starters grown in replicate. So that it looks like there's more than one mustard yellow or teal line because there are three lines.

    And, generally, we see mostly overlap between the same colors. But you can see there's gray and brown in the cluster 7. Those are two different representatives of that family.

    So we see a bit of variation, but overall similar behavioral patterns in terms of rise. So that shows us-- that demonstrates to us that not only is the microbial community fairly consistent or similar between these starters in each family type, but that the community composition-- who is there-- actually makes a pretty big difference for what the starter is able to do, how it behaves in terms of rise, and also in terms of aromatics.

    So the next big baker question, do specific microbes impart unique flavors? Does it matter who is growing in my starter? So to do this, we focused on four families, that I have highlighted here in blue, red, green, and yellow-- so these families 10, 11, 12, and 13.

    And you can see a fair bit of variation in the co-occurrence patterns, in the community composition from the heat map. We chose these families in part because they were statistically distinct. But also, we needed to choose families that had enough samples that were still viable from being essentially cryogenically frozen. It gets a little sci-fi when you're trying to save your sourdough starters long-term and then revive them.

    So they are the Lazarus starters, raised from the dead. They've been put into a thick glycerol, sugar goo to protect them from freezing. So when we thaw them out, we can revive them. And, again, feed them all the same flour, water inputs.

    When we feed them, in panel 1, we use these special little bars, that are sticky for smells. So we suspended a bar in the headspace of the jar, so that they would collect all of the compounds, all of the aromatic compounds, that the microbial communities in each starter are producing. So then we could load that little glass tube into, in panel 3, this giant, high-powered gas chromatography machine, that essentially pipes that gas sample down such a long, skinny copper tube that all the compounds separate, so that then they burn up in a flame one at a time. And when they burn, they leave the machine a different signal. So we can actually tell what is the chemical structure of each of the compounds.

    And then in panel 2, concurrently with that, feeding the same array of starters the same diet, then we had-- I call them sourdough sommeliers. They're professionally trained sensory analysts. And like tasting a fine wine, they are sniffing or huffing a, ooh, vanilla notes; oh, rose, floral; or sweaty feet. This whole variety of sensory experiences, coupled with that diversity of chemical compounds.

    So then, when they're measuring those chemical compounds, we have the sensory specialists then stick their nose in that little funnel, so that as each compound is being identified, chemically, structurally, they're able to smell a little bit of it. So they're doing a live recording in time, so that they can identify, this chemical structure has this human aesthetic experience-- so just amazing.

    That's not so much data viz. But I can't tell you about sourdough in this project without telling you about that incredible component. So thank you for winding down that path with me.

    And what we found from that analysis was not only in the first-- I think in the first three samples, that team identified over 120 chemical compounds; so an incredible diversity. Truly, different microbial communities are producing different components-- different chemical components that actually impact our experience of bread.

    But, also, we saw that those samples grouped by community starter type. So that means it does matter who is living in your community. Or when you read the last names in a phone book, you can infer something about the jobs that those folks in a community are doing. Different members in a community are performing different functions.

    So here is the figure that we published from those data. Again, you'll see, at the top, we have that x-axis. It's that same tree of organization of membership similarity as we had in the previous figure, the original bar chart. So we're maintaining kind of an organizational continuity. And we're telling different stories with it.

    The y-axis becomes a similarity tree of the aromatic compounds. So when we see-- when we organize these starters by who is living in them, we can also organize the outputs by what is being produced. So different microbes are performing different jobs.

    And then we have three layers of context. At the top, we have the percent acetic acid bacteria per sample, which again are kind of ecosystem engineers or keystone species. They're not always highly prevalent. But when they are abundant in a starter, they make a big difference in the overall community composition, and in the different functions that those microbes are performing, and in the dominant smell or the dough rise. So those acetic acid bacteria play a pretty important role in the final bread that you get.

    So once we want to-- having published those data, we simultaneously wanted to report back to the public. So we produced a series of the same figures, but high resolution. So here's a zoom-in of panel A from that second figure that I showed you.

    And what we did was create a very high resolution. That clearly didn't quite translate to this slide. All the numbers are clear, if you click on the link that is in the slide.

    We annotated this figure with every individual participant's starter ID number. So that every participant who contributed to this project can then see which one is mine, where am I? So they're able to actually peruse the figure in the context of their own starter, or their friend's starter if they want.

    And then, giving just an extra, extra layer, I've been explaining how we've interpreted these things or how we've organized the data. But here, I'm going to share a link to the story map that we also created. And by "we," I mean Lauren Nichols made the story map happen. I think she had the idea for it in the first place, certainly created it. I helped walk through, here's how I might interpret-- oh, here we have our time lapse still going.

    So I'm going to scroll through so you get the sourdough story map experience. So when you-- this is what you saw in that previous slide. So then we have-- as you scroll, we have an introduction, links to the published paper. We wanted to make sure that everyone who submitted a sample knows as much about their starter as we do, and links to the high-resolution images.

    So we're going to walk through them. So here's figure 2. And here's how we-- we broke it down into smaller increments, instead of panel by panel, as I did for a broad overview for you. We actually have interpretations.

    This is a stacked bar chart, looking at the bacteria across all 500 samples and some of the main takeaways. As we continue to scroll-- stacked bar chart below is similar or analogous. But it's for yeast and the other fungi. And then we'll zoom in.

    So, again, for scale, right. I showed you a tiny thin line when I showed you our first stacked bar charts in the preliminary results. Here we do this-- here's one actual sourdough starter bar chart. And expanding out. So that's two samples, so comparing just two samples first.

    And then here they are, all 500. So incrementally building that story. And I'm not going through every single thing. I just wanted to highlight the thought, the care, and the detail that goes into hopefully the reporting back. And it becomes a public outreach and education effort as well.

    So with that, I'll say thank you so much. And I want to make sure we have time to open it up for questions.

    All right, thank you so much Dr. McKenney. I've got a bunch of questions here from folks. We're just going to go in the order they were received in Chat. And I'll read them out to you.

    From David Auerbach, we have, from a flavor perspective, I vastly prefer the lactic beasts rather than the acetic beasts. Can you say something about the patterns with regards to that contrast?

    So let's see.

    I can put that text in Chat for you, as well.

    Yeah, thank you, thank you. Yeah, speaking of visualization, super helpful to refer to that.

    So I can say the lactic acid bacteria, lactic acid has a flavor profile that is more often described as dairy. So we think about lactic acid bacteria often being a component of fermented dairy products, like yogurt. Many people consider that lactic acid taste milder. So it often is a preferred flavor profile.

    And what I can tell you-- yeah, compared to that, acetic acid bacteria, again, when the acetic acid bacteria are present, they make a big difference, both in the overall microbial community structure, and in the rise rate, and in the overall flavor profile. That kind of vinegar zing, not only does it make you drool, but it also outcompetes any other more nuanced flavors.

    So it might depend. I'm sure there is as much variation in human preference for flavor as there is variation in microbial metabolism. But, yeah, if that hopefully helps shed any light.

    All right. And our next question, from Margaret Simon. I may be misremembering here, but didn't someone on the team trace the geographical trajectory of S. cerevisiae arriving in Western Europe?

    Yes. So I think Margaret, that you're remembering Caiti Heil's work. So Caiti is also-- she's a research professor in biology at NC State. And I believe that was-- I think you're referencing our ferment meetings every other Tuesday morning.

    So I believe that was a report back from the Heil Lab tracing the trajectory of Saccharomyces cerevisiae specifically. But that was a separate effort from this project.

    I love that you have a standing ferment meeting. That makes me happy. So the next question we have from Eric Manson, in the top chart-- and I believe this was referencing several slides back-- is the clustering just by yeast or by the combination of yeast and bacteria?

    Let's see.

    And Eric, feel free to either unmute or talk in Chat if you want to identify the exact chart. I neglected to write that down when I took your question.

    Awesome. Hi, Eric. Yes. The very top-- they're clustered by total community composition similarity, so not just by the bacteria, not just by the yeast. But all in all, yeah, taking everything together.

    All right. And then we have some questions from Brad Taylor. I'll put that in the Chat there. What are the units of growth rate on the charts?

    Ah, so let's see. You can see in the preliminary data, it was scaled. This is a part-- I'm going to step back and say, all of the growth rate experiments were done by Megan Biango-Daniels, on my left in this picture. So she would be the one to tell you what the units are.

    But in that preliminary figure, everything was just scaled. And then this is actually in pixels for this next figure. So again, we're using a package in R to actually extract the change in pixel height of every one of those different starters from the time-lapse video, so you know your growth in pixels.
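    [Illustration: a minimal sketch of measuring rise in pixels from time-lapse frames, as described in the answer above. The project used an R package that is not named in the talk; this is not that package. The function names, the brightness threshold, and the synthetic frames are all hypothetical, and the sketch assumes the starter appears brighter than the background.]

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (0-255), top row first.
Frame = Sequence[Sequence[int]]


def starter_height_px(frame: Frame, threshold: int = 128) -> int:
    """Return the starter height in pixels.

    Assumes the starter fills the jar from the bottom up and is brighter
    than the background, so the height is the distance from the topmost
    row containing a bright pixel to the bottom of the frame.
    """
    for row_index, row in enumerate(frame):
        if any(pixel >= threshold for pixel in row):
            return len(frame) - row_index
    return 0  # no starter pixels found


def growth_px(frames: Sequence[Frame], threshold: int = 128) -> List[int]:
    """Return the change in height (pixels) of each frame relative to the first."""
    heights = [starter_height_px(frame, threshold) for frame in frames]
    if not heights:
        return []
    baseline = heights[0]
    return [height - baseline for height in heights]


def make_frame(total_rows: int, filled_rows: int, width: int = 4) -> List[List[int]]:
    """Build a synthetic frame with `filled_rows` bright rows at the bottom."""
    empty = [[0] * width for _ in range(total_rows - filled_rows)]
    filled = [[255] * width for _ in range(filled_rows)]
    return empty + filled


# Three synthetic frames in which the starter rises from 3 to 5 to 8 rows.
frames = [make_frame(10, 3), make_frame(10, 5), make_frame(10, 8)]
print(growth_px(frames))  # [0, 2, 5]
```

    [Because the result is in pixels, it is only comparable across starters filmed at the same camera distance and resolution, which is one reason a figure might report scaled values instead.]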

    And then sort of a related question, I believe it was one of the next panels here. And I'll put that in Chat. For the top and bottom panels, can you do any sort of parentage or ancestor analysis? Such analyses are often used to relate the percentage of wild origin fish ancestry to the performance of hatchery origin fish.

    Oh, that's really interesting. I can tell you that we never had a discussion about identifying the most recent common ancestor microbial community makeup of sourdoughs. But now I want to. So thank you for that idea.

    At some point, we did have a map of the sourdough histories. So in that initial survey, we asked, where did this-- has this starter moved or traveled across its life? And there were several-- and, of course, in the broader settler history of the United States, in particular, there has been a lot of bringing of resources, including sourdough starters, from the Old World, and whether that is Europe, Russia-- from those countries of origin, when immigrants come to the country.

    So historically there are patterns of travel. I think it would be really interesting to think through what that ancestral microbial history might have been. We did not do that. Thank you so much for the idea.

    I think, off the top of my head, it would require-- maybe for some term-- long-term study or analysis of kind of the temporal stability of a sourdough starter. We were at the mercy of citizen science, in a way, with this initial project. So even thinking through how old were these starters-- I mean, I have been alive for 36 years. I could not have grown a 50-year-old starter by now.

    So thinking through-- and if we hadn't been studying this-- this was a single time point in a way, collecting these starters at the point when all the participants sent them to us. So in order to really graph or monitor the dynamic microbiome changes of a starter over its life, we would need to actually sample starters for a long time, just to even see could we infer anything.

    Yeah. Wow, so much to unpack there. Maybe a topic for a future ferment meeting.

    So then David had kind of a question/observation here, about the other side of the coin, of how little our manipulations matter, maybe the character of the starters and the robustness. And if you had any commentary on that?

    Well, so I think it's important to remember that what we do-- bakers only have these best practices because what we do has correlated to an end product. So we are selecting or providing conditions that select for specific microbes or specific functionalities. If I ferment my dough on the counter, then it will rise faster. It will get sour faster. And I am favoring the growth and the metabolism of lactic acid bacteria.

    If I do a cold proof in the fridge, then I'm favoring more complex flavor development. And I'm favoring the growth of yeasts. And I'm slowing down the lactic acid bacteria.

    So the things that we do as bakers do make a difference to the end product of the bread. I mean, that's how we're able to say, oh, my bread is this way or my bread is that way. Like, that's how bakers are able to sell bread and keep a consistent product.

    I think the humbling part is that who is doing the job, who actually, in terms of the microbes themselves, who lives and thrives in the starter. That's kind of not really up to us entirely. Does that make better sense?

    So I'll go to the next one here. From Eric, we've got a question about, is there any evidence of founder effects? In other words, for an array of starter samples, do the longer term community compositions vary as a function of which species were initially present in a starter?

    That's a great question. Let's see. And that founder effect or priority effect is a reference to ecological processes of succession. This idea that, yeah, does it matter who gets there first, in our trailblazer species, when we first begin that succession process of community building?

    We have not done that controlled experiment, in which we inoculate flour with specific microbes. I can tell you the closest I've gotten, when I was teaching at the North Carolina Governor's School, in 2017, I had my 66 students from across the State of North Carolina, all bring a flour type from their house. And they all grew-- cultured the bacteria and the yeast from their flours.

    And we did see some fairly robust patterns in yeast diversity versus bacterial diversity across different samples of similar flour types. So whole grain flours, whether it's rye, or whole wheat, or buckwheat, or spelt, tend to have more-- greater yeast diversity. All-purpose flour, it looks like you sneezed on your bacterial plate, like just the lactic acid bacteria thrive. They're very dense colonies. But there's very little to no yeast growth at all from an all-purpose flour.

    So speculating based off of those results, I could imagine that in growing a sourdough starter from scratch, using a different type of flour that maybe contains a different microbial seed bank, that not only in that case-- not only would the nutritional differences between the different types of flour based on the grain type, but also the different microbes that come in, in that flour to begin with, might make a big difference because we see those patterns of co-occurrence and because those microbes perform different functions that might favor different communities as they continue in the succession process.

    Great questions, you all. I'm going to save this Chat.

    [laughter]

    We have some more for you. The next question is, did you find any pathogenic associated microbes in these sourdough starters?

    Great question. So let's see. We found-- we identified several types of microbes, that if you google them, you will get-- it's a dangerous game, the Google game. And that's part of why we tried to offer in that interactive global sourdough map, the hover over information that, here's a little bit about these starters or about these most dominant types of microbes.

    In general, we did not find potential pathogens. And that's how I refer to them. In some studies, they may have been identified as a pathogen or associated with a disease state. However, there's so much context.

    I'll say the most dominating types or the most prevalent types of microbes in a mature sourdough starter tend to not be pathogenic or even associated with pathogenicity, in part because a mature sourdough starter has a lower pH because it has thriving and highly functional lactic acid and acetic acid-producing bacteria. And that acid production-- you may be familiar with a lot of acidic foods that are traditional. That lacto-fermentation process drops the pH to levels of acidity that most pathogens are not able to survive.

    So our sourdough protects us. This may be in part why, way before-- millennia before knowing that microbes existed, we were cultivating microbial cultures in kimchi, in sauerkraut, in kombucha, or in sourdough starter because people tend to get sick less often if they consume those foods, even though they are living, filled with microbes. So statistically, you're less likely to encounter a pathogenic species, in general, in a mature sourdough starter. Rambly.

    But it's fascinating. Thank you. Another question here for you, and I'll put it in Chat as well. Was there ever any pushback or tension around publishing very dense visualizations, that include each individual sample, as opposed to generalizations? Or was it always a plan of the project to maintain the ability to trace back an individual's contribution?

    So within the project, I won't say-- I won't say pushback. But there is a great deal of caution, that we need to be able to respect the privacy of every citizen science participant. So huge kudos to Lauren Nichols; to Lea Shell; to I'd say all of the members of Rob Dunn's lab, who had been involved with citizen science work before I joined the team because they were well versed in-- before we publish these data, we need to truncate GPS coordinates. We need to jitter the dots on any geographic map. We need to make it impossible to trace an individual.

    We anonymize all data. And again, when I say "we," Lauren Nichols was instrumental in curating these data for publication, in making sure that all of our data sets and figures respected the privacy of our participants.

    So once we took all of those preliminary cautions, then I think we were satisfying the requirements for publication. There are always a number of rounds of revision. Well, in my experience, there tend to be several rounds of revision for any single publication. But I think in our case, we were able to justify the importance that these data were not possible without the participation of hundreds of citizen scientists and baking experts.

    And so it is so important to include those individual nods, but also to highlight that variation. The summary statistics do not do the starters themselves justice.
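    [Illustration: a minimal sketch of the two privacy steps mentioned in the answer above, truncating coordinates and jittering map points. The talk does not say how many decimal places were kept or how large the jitter was, so the `decimals` and `max_offset_deg` defaults, the function names, and the example coordinates are all hypothetical.]

```python
import math
import random
from typing import Optional, Tuple


def truncate_coordinate(value: float, decimals: int = 1) -> float:
    """Drop digits beyond `decimals` places (truncates toward zero, no rounding)."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


def jitter_point(
    lat: float,
    lon: float,
    max_offset_deg: float = 0.05,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Shift a point by a uniform random offset of at most `max_offset_deg` per axis."""
    if rng is None:
        rng = random.Random()
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


def anonymize(
    lat: float,
    lon: float,
    decimals: int = 1,
    max_offset_deg: float = 0.05,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Truncate first, then jitter, so the plotted dot never reveals the true location."""
    coarse_lat = truncate_coordinate(lat, decimals)
    coarse_lon = truncate_coordinate(lon, decimals)
    return jitter_point(coarse_lat, coarse_lon, max_offset_deg, rng)


# Hypothetical participant location, not taken from the project data.
print(truncate_coordinate(35.7847), truncate_coordinate(-78.6821))  # 35.7 -78.6
print(anonymize(35.7847, -78.6821, rng=random.Random(0)))
```

    [Truncating before jittering means the random offset is applied to an already coarsened location, so averaging many jittered copies of a dot would recover only the truncated point, not the original address.]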

    All right. Well, thank you so, so much Dr. McKenney for this fascinating talk. Really, truly appreciate it. There's some more praise for you in the Chat from fellow scholars. I see Dr. Booker is here.

    I'm going to put some more information in the Chat for everyone about the rest of our Coffee and Viz talks. Please do consider filling out that feedback form, to be entered into the entry for Anisette coffee. In a normal life scenario, we'd all clap and say thank you. But again, thank you so much Dr. McKenney for a truly fantastic session. We really appreciate it.

    Absolute pleasure to be back. And thank you all for your questions. I have saved the Chat, not least so that I can follow up with individuals and say thanks for your thoughts.

This video is a recording of Coffee & Viz: Visualizing the Microbial Ecology of Sourdough Starters, February 18, 2022