A very nice paper in the ISME Journal came across my Google Scholar alerts this week: "Satellite remote sensing data can be used to model marine microbial metabolite turnover", by Larsen et al. The author list includes some heavy hitters in the field of microbial ecology, including Rob Knight and Jack Gilbert. The paper is significant for being, as far as I’m aware, the first study to quantitatively link microbial taxonomy, metabolic potential, and environmental parameters in a predictive manner. This is something of a holy grail for microbial ecologists. Microbial taxonomy and metabolic potential are difficult to measure, and can only be measured at discrete times and places in a study region, whereas some environmental parameters (chlorophyll, sea surface temperature, etc.) are easily and near-continuously measured by satellite across a broad study region. A robust correlation between the two would allow microbial community composition, and the resulting metabolome (the pool of material originating from the microbial community), to be predicted from some future set of environmental parameters. Here’s a quick summary of what they did.
Focusing on the Western English Channel, an ecologically very well characterized site, the authors developed spatially contiguous data for dissolved oxygen, phosphate, nitrate, ammonium, silicate, chlorophyll a, photosynthetically active radiation, particulate organic carbon in small, medium, large, and semi-labile classes, and bacterial abundance. Most of these parameters cannot be observed by satellite, but all of them are much easier to measure than microbial community structure or metabolic potential. It’s my understanding that the authors first correlated the parameters that could not be measured by satellite to those that could, and then used these models to construct the contiguous datasets. They then used a microbial assemblage prediction (MAP) model (see Larsen et al., 2012) to, well, predict the microbial assemblage. If I understand correctly this also works off correlations, via a neural network algorithm that identifies linkages between different taxa and environmental parameters (previously measured at the same time and place in the environment). The next step is to predict the metabolic potential (genetic contents) of the microbial community. The authors do this at the order level to sidestep some of the issues with genetic plasticity at finer taxonomic scales. By taking all the available genomes for each predicted order, they can estimate that order's genetic contents. This is pretty hand-wavy, as many functions are not shared between all members of an order, but it’s a good place to start.
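To make the order-level step a little more concrete, here is a minimal sketch of one way it could work. This is my own illustration, not the authors' code: the gene labels, the order names, the toy abundances, and the choice to weight each gene by the fraction of an order's genomes that carry it are all assumptions on my part.

```python
from collections import Counter

# Hypothetical toy data: each genome is reduced to a set of gene-family labels.
genomes_by_order = {
    "order_X": [{"geneA", "geneB"}, {"geneA", "geneC"}, {"geneA", "geneB"}, {"geneA"}],
    "order_Y": [{"geneB", "geneD"}, {"geneD"}],
}


def order_gene_frequencies(genomes):
    """Return the fraction of an order's genomes that carry each gene family."""
    counts = Counter(gene for genome in genomes for gene in genome)
    n_genomes = len(genomes)
    return {gene: count / n_genomes for gene, count in counts.items()}


def community_potential(order_abundance, genomes_by_order):
    """Sum gene frequencies across orders, weighted by predicted relative abundance."""
    potential = Counter()
    for order, abundance in order_abundance.items():
        frequencies = order_gene_frequencies(genomes_by_order[order])
        for gene, frequency in frequencies.items():
            potential[gene] += abundance * frequency
    return dict(potential)


# Hypothetical predicted assemblage (relative abundances summing to 1).
predicted_abundance = {"order_X": 0.75, "order_Y": 0.25}
potential = community_potential(predicted_abundance, genomes_by_order)
```

In this toy case geneC is carried by only one of the four order_X genomes, so it contributes very little to the community total. That is exactly the kind of within-order variation that makes the order-level estimate hand-wavy.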
The next step is where the rubber meets the road. The authors attempt to connect metabolic potential to the community metabolome. They do this by using the KEGG database to identify reaction products associated with the enzymes comprising the community metabolic potential. To validate this approach the authors focus on the one metabolite for which there is abundant data: CO2.
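As a rough illustration of the enzyme-to-metabolite idea, the sketch below sums enzyme weights over the products each enzyme is associated with. Everything in it is hypothetical: the lookup table is a hand-made stand-in for a KEGG-style enzyme-to-product mapping (it is not real KEGG data or the KEGG API), and the simple additive score is my assumption rather than the scoring used in the paper.

```python
# Hypothetical stand-in for a KEGG-style lookup: enzyme -> reaction products.
enzyme_products = {
    "enzyme_1": ["CO2", "metabolite_P"],
    "enzyme_2": ["metabolite_Q"],
    "enzyme_3": ["CO2"],
}


def predicted_metabolite_scores(enzyme_potential, enzyme_products):
    """Add each enzyme's community weight to every product it can generate."""
    scores = {}
    for enzyme, weight in enzyme_potential.items():
        # Enzymes missing from the lookup contribute nothing.
        for product in enzyme_products.get(enzyme, []):
            scores[product] = scores.get(product, 0.0) + weight
    return scores


# Hypothetical community metabolic potential (relative enzyme weights).
enzyme_potential = {"enzyme_1": 0.5, "enzyme_2": 0.25, "enzyme_3": 0.125, "enzyme_4": 1.0}
scores = predicted_metabolite_scores(enzyme_potential, enzyme_products)
```

The result is only a relative score per metabolite, not a concentration or a flux. Note that enzyme_4 has the largest weight but drops out entirely because it has no entry in the lookup, so the prediction is only as good as the annotation behind it.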
I’m pretty excited about what they’ve done; it’s a great start to elucidating how microbial community structure interacts with the environment and vice versa. There are, however, a number of caveats to this analysis. First, as touched on earlier, there is a huge issue with genomic plasticity in both prokaryotic and eukaryotic marine microbes. Genomes from very closely related taxa, even from the same species, can differ by 40% or more. That is a lot of metabolic function that is present in one cell but not the other. Another issue is phenotypic plasticity, which is the ability of a cell to run different pieces of “software” on its genetic “hardware”. By expressing different combinations of genes, for example, a cell can achieve a much higher level of phenotypic diversity than one might think by looking just at the contents of its genome. As we are all well aware, software is easily broken or lost. Thus even if genomic plasticity is at a minimum for a certain taxon, there is no guarantee that all its members will respond in a like manner to the same set of environmental conditions. Even clonal cells are often observed to drift apart phenotypically long before their genomes diverge. It will take a lot more phenotype-aware data collection (e.g. transcriptomics, proteomics, and metabolomics) to sort out the true impact of community structure on the environment.