Seeking postdoc in phytoplankton ecology

The Bowman Lab seeks a postdoctoral researcher for a two-year project to investigate patterns in phytoplankton community composition across molecular time series collected at the Ellen Browning Scripps Memorial Pier at Scripps Institution of Oceanography (SIO) and the California Polytechnic State University (Cal Poly, San Luis Obispo) Pier at Avila Beach. The postdoctoral researcher should have a background in biological oceanography or microbial ecology and a solid understanding of ecological statistics and the analysis of amplicon sequence data, be familiar with phytoplankton sampling techniques, and understand basic molecular lab procedures. Specific knowledge of eukaryotic harmful algal bloom-forming taxa is a plus. The position will be located at SIO but jointly mentored by Dr. Alexis Pasulka (Cal Poly) and Dr. Jeff Bowman (SIO). There will be specific opportunities to work with undergraduate students at Cal Poly and SIO. The position will be open until filled, and applicants should reach out jointly to Dr. Bowman (jsbowman at ucsd.edu) and Dr. Pasulka (apasulka at calpoly.edu) with a copy of their CV and a brief expression of interest.


Recent blog post by PhD student Beth Connors

Check out this recent blog post written by PhD student Beth Connors for International Women in Science Day!


New paper: Antarctic metagenomes reveal novel microbial diversity

I’ve fallen very far behind on my one-time goal of writing a blog post for each new lab publication. Seeking redemption via a paper out today by former Bowman Lab postdoc Avishek Dutta, now an assistant professor at UGA. His paper, Depth drives the distribution of microbial ecological functions in the coastal western Antarctic Peninsula, is the first study (we believe) to employ high throughput shotgun metagenomics to evaluate the ecological functions of bacteria and archaea across multiple depths and seasons in the Antarctic marine system. It builds on some wonderful work by Tom Delmont in the Amundsen Sea and Joseph Grzymski in the same region as this study.

Avishek selected 48 historic samples collected by our group in collaboration with the Palmer Long Term Ecological Research study for shotgun metagenomic analysis. Although this is not a particularly large sample set by today’s standards, using Avishek’s iMAGine pipeline it was nonetheless sufficient to yield 2,940 bins (collections of contigs that might be from similar genomes) and 137 dereplicated, high-quality bins that we considered metagenome-assembled genomes (MAGs). MAGs are not perfect constructs; they’re incomplete and likely composites of different, highly similar genomes. Nonetheless, standard genomic tools and analyses can be applied to them to try to infer likely metabolisms and other characteristics.

You really have to stare at the above figure for a while to start to wrap your head around the details, but the main points are: 1) There’s a lot of potential for carbohydrate degradation, as we might expect for bacteria and archaea that respond to phytoplankton blooms. 2) There’s a lot of dark carbon fixation potential, and this is often combined with different heterotrophic metabolisms to produce a prokaryotic mixotrophic functional type (i.e., an organism that can switch between autotrophic and heterotrophic metabolisms). This is one of those things that we assume must be fairly common without actually observing it. Right now there is a lot of discussion in various communities about the role of mixotrophic protists in marine food webs and the carbon cycle, but mixotrophic bacteria and archaea get surprisingly little attention.


New postdoctoral research opportunity!

Andrew Barton (https://adbarton.scrippsprofiles.ucsd.edu/) and Jeff Bowman (https://jsbowman.scrippsprofiles.ucsd.edu/) at Scripps Institution of Oceanography at the University of California San Diego are recruiting a postdoc to study interactions among marine microbes, inferred from regular genomic measurements and cell images taken at Scripps Pier (https://ecoobs.ucsd.edu/). Possible research areas include but are not limited to: quantifying the strength and direction of microbial interactions, identifying “keystone” microbial taxa, and assessing how microbial interactions shape ecosystem function. The ideal candidate will have a PhD in ecology, marine biology, or related disciplines, and proficiency in data science techniques, machine learning, novel statistical methods, and/or numerical modeling approaches for studying natural populations and communities. Please direct qualified candidates to contact Andrew Barton (adbarton@ucsd.edu) for more information. We anticipate filling the open position by Fall 2023. 


Alignment and phylogenetic inference with hmmalign and RAxML-ng

RAxML is one of the most popular programs around for phylogenetic inference via maximum likelihood. Similarly, hmmalign within HMMER 3 is a popular way to align amino acid sequences against HMMs from Pfam or created de novo. Combine the two and you have an excellent method for constructing phylogenetic trees. But gluing the two together isn’t exactly seamless and novice users might be deterred by a couple of unexpected hurdles. Recently, I helped a student develop a workflow which I’m posting here.

First, define some variables just to make the bash commands a bit cleaner. REF refers to the name of the Pfam hmm that we’re aligning against (Bac_rhodopsin.hmm in this case), while QUERY is the sequence file to be aligned (hop and bop gene products, plus a dinoflagellate rhodopsin as outgroup).

REF=Bac_rhodopsin
QUERY=uniprot_hop_bop_reviewed

Now, align and convert the alignment to fasta format (required by RAxML-ng).

hmmalign --amino -o $QUERY.sto $REF.hmm $QUERY.fasta
seqmagick convert $QUERY.sto $QUERY.align.fasta
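For readers curious about what the conversion step involves, here is a minimal Python sketch of a Stockholm-to-FASTA converter. It is not how seqmagick is implemented, and the function name stockholm_to_fasta is hypothetical. It assumes the simple layout of a Stockholm file: markup lines starting with #, a terminating // line, and sequence lines made of a name, whitespace, and aligned residues, possibly split over several blocks.

```python
from collections import OrderedDict


def stockholm_to_fasta(stockholm_text):
    """Convert Stockholm-formatted alignment text to aligned FASTA text."""
    seqs = OrderedDict()
    for line in stockholm_text.splitlines():
        line = line.strip()
        # Skip blank lines, markup lines (#=GF, #=GS, #=GR, #=GC) and the terminator.
        if not line or line.startswith('#') or line == '//':
            continue
        name, chunk = line.split(None, 1)
        # Interleaved files repeat each name once per block, so append each chunk.
        seqs[name] = seqs.get(name, '') + chunk.replace(' ', '')
    return ''.join('>{}\n{}\n'.format(name, seq) for name, seq in seqs.items())


# Toy two-block alignment, invented for illustration.
example = """# STOCKHOLM 1.0
seqA  MK-LV
seqB  MKALV
#=GC RF  xx.xx

seqA  TT
seqB  T-
//
"""
print(stockholm_to_fasta(example))
```

For real work seqmagick (or Biopython) remains the safer choice, since it handles edge cases this sketch ignores.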

Test which model is best for these data. Here we get LG+G4+F.

modeltest-ng -i $QUERY.align.fasta -d aa -p 8

Check your alignment!

raxml-ng --check --msa $QUERY.align.fasta --model LG+G4+F --prefix $QUERY

Oooh… I bet it failed. Exciting! In this case (using sequences from Uniprot) the long sequence descriptions are incompatible with RAxML-ng. Let’s do a little Python to clean that up.

from Bio import SeqIO

with open('uniprot_hop_bop_reviewed.align.clean.fasta', 'w') as clean_fasta:
    for record in SeqIO.parse('uniprot_hop_bop_reviewed.align.fasta', 'fasta'):
        record.description = ''
        SeqIO.write(record, clean_fasta, 'fasta')
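If the check still complains after the descriptions are stripped, the sequence IDs themselves may be the culprit. The sketch below flags IDs containing characters that carry meaning in the Newick tree format (parentheses, brackets, commas, colons, semicolons, quotes, whitespace). Exactly which characters RAxML-ng rejects is an assumption here, so treat the set as a starting point; the function names and example IDs are hypothetical.

```python
# Characters with special meaning in Newick; the exact set that
# RAxML-ng refuses is an assumption, adjust as needed.
NEWICK_SPECIAL = set("()[],:;'\" \t")


def unsafe_ids(fasta_text):
    """Return FASTA record IDs containing Newick-special characters."""
    flagged = []
    for line in fasta_text.splitlines():
        if line.startswith('>'):
            seq_id = line[1:].rstrip()
            if any(ch in NEWICK_SPECIAL for ch in seq_id):
                flagged.append(seq_id)
    return flagged


def sanitize(seq_id):
    """Replace each Newick-special character with an underscore."""
    return ''.join('_' if ch in NEWICK_SPECIAL else ch for ch in seq_id)


# Made-up records: the first ID is fine, the second is not.
fasta = ">sp|X00000|EXAMPLE\nMK-LV\n>bad:name (copy)\nMKALV\n"
print(unsafe_ids(fasta))
print(sanitize("bad:name (copy)"))
```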

Check again…

raxml-ng --check --msa $QUERY.align.clean.fasta --model LG+G4+F --prefix $QUERY

If everything is kosher go ahead and fire up your phylogenetic inference. Here I’ve limited bootstrapping to 100 trees. If you have the time/resources do more.

raxml-ng --all --msa $QUERY.align.clean.fasta --model LG+G4+F --prefix $QUERY --bs-trees 100

Superimpose the bootstrap support values on the best ML tree.

raxml-ng --support --tree $QUERY.raxml.bestTree --bs-trees $QUERY.raxml.bootstraps

And here’s our creation as rendered by Archaeopteryx. Some day I’ll create a tree that is visually appealing, but today is not that day. But you get the point.


New paper on using machine learning to predict biogeochemistry from microbial community structure

Congratulations to Avishek Dutta for his paper Machine Learning Predicts Biogeochemistry from Microbial Community Structure in a Complex Model System that was recently published in the journal Microbiology Spectrum. I’m really excited about this paper; the study it is based on inspired this perspective that I wrote for an mSystems early career special issue last year.

Summary of experimental design and analysis, from Dutta et al., 2022.

The figure above summarizes the experimental design and analysis. The experiment was designed to address the question of whether the microbial community contains sufficient information to predict a biogeochemical state in a dynamic system. The structure of a microbial community is highly sensitive to environmental change. Small changes in the chemical or physical environment will result in a shift in abundance of one or more taxa as mortality and growth rates respond. These shifts in structure are easily observed by amplicon sequencing of taxonomic marker genes. These relative abundance data can be combined with flow cytometry analysis of microbial abundance to yield absolute abundance data.
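The last step, combining relative abundance with cell counts, is simple arithmetic. A minimal sketch, assuming read counts per taxon from amplicon sequencing and a total cell concentration from flow cytometry (the function name and all numbers are invented for illustration; this does not correct for marker gene copy number, which differs among taxa):

```python
def absolute_abundance(read_counts, cells_per_ml):
    """Scale amplicon read counts to cells per mL for each taxon."""
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError('sample has no reads')
    # Relative abundance (reads / total_reads) times the total cell count.
    return {taxon: cells_per_ml * reads / total_reads
            for taxon, reads in read_counts.items()}


# Hypothetical sample: 1,000 reads and 2 million cells per mL.
sample = {'taxon_A': 600, 'taxon_B': 300, 'taxon_C': 100}
print(absolute_abundance(sample, 2.0e6))
```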

The trick of course is relating an observed shift in community structure to a specific biogeochemical state. Machine learning provides a number of ways to do this, but all require large training datasets. Fortunately, gene sequencing is pretty cheap these days and DNA extractions are much more high-throughput than they were just a few years ago. Because of this it’s possible to generate community structure data for hundreds of samples in relatively short order. In this study Avishek used over 700 samples from sediment bioreactors and the random forest algorithm to predict the concentration of hydrogen sulfide with a reasonably high degree of accuracy.

Like any statistical model, developing machine learning models takes careful attention to detail. Careful segregation of the data into training and validation sets and engineering of the features used for prediction yield the most honest models that can be best applied for future predictions. Avishek’s paper is an excellent template for developing a predictive machine learning model from microbial community structure data.
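To illustrate what careful segregation can look like, here is a small sketch of one common precaution: holding out whole groups of samples, so that closely related samples never end up on both sides of the split. The grouping by reactor is an assumption made for illustration, not a description of the splitting scheme used in the paper, and the function name is hypothetical.

```python
import random


def split_by_group(sample_groups, validation_fraction=0.25, seed=0):
    """Split sample IDs into training and validation sets by whole groups.

    sample_groups maps each sample ID to a group label (e.g. a reactor).
    """
    groups = sorted(set(sample_groups.values()))
    rng = random.Random(seed)   # fixed seed keeps the split reproducible
    rng.shuffle(groups)
    n_val = max(1, round(len(groups) * validation_fraction))
    val_groups = set(groups[:n_val])
    train = [s for s, g in sample_groups.items() if g not in val_groups]
    validation = [s for s, g in sample_groups.items() if g in val_groups]
    return train, validation


# Hypothetical design: 4 reactors sampled at 3 time points each.
samples = {'r{}_t{}'.format(r, t): 'reactor_{}'.format(r)
           for r in range(4) for t in range(3)}
train, validation = split_by_group(samples)
print(len(train), len(validation))
```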


Lab manager position open!

We’re on the hunt for a lab manager/senior lab technician to take on a variety of key tasks in the Bowman Lab. The position is being advertised at the Staff Research Associate II level and the ideal applicant will have an MS in a relevant field, or a BS and equivalent experience. We are looking for someone with complementary skills to the rest of the lab; the ideal applicant would have a background in environmental or analytical chemistry to complement our core expertise in microbiology. However, a background in the life sciences also works fine. The formal job posting is pasted below (note that it deviates slightly from what’s described here due to limitations of the UC San Diego HR system).

DESCRIPTION

Under supervision, independently perform a variety of standard laboratory and data analysis procedures (and some non-standard procedures) related to the function of coastal ocean environments. Coordinates and conducts instrument calibrations and data collection for long-term time-series of microbial community structure, microbial abundance, and dissolved gases. Responsible for the operation and maintenance of a membrane inlet mass spectrometer, flow cytometer, and in situ imaging flow cytometer (IFCB), DNA extraction, data entry, and light programming in Python and R. Travel to field stations as needed, which may involve driving University vehicles and operating small boats for diving and coastal field work. Scuba dive to clean and service underwater instrumentation. Coordinate and communicate with lab members about supplies, data and sampling techniques. Oversee and work-direct undergraduate research assistants. Process, analyze, and interpret results from data sets, evaluate quality of data, generate and update design and method documentation, and update web pages. Perform general office duties including but not limited to filing, photocopying, faxing and library searches for research articles. Manage laboratory space, computers, and equipment.

  • Must be able to lift 50 lbs.

QUALIFICATIONS

  • B.S. in Chemistry, Marine Science, Oceanography, or equivalent combination of education and experience with a strong background in data analysis and computer operations.
  • Demonstrated experience with diving and ability to acquire or maintain AAUS and SIO scientific diving certification.
  • Demonstrated knowledge of mathematics, scientific, and programming principles.
  • Demonstrated experience with R, Matlab, or Python programming languages for data analysis and visualization.
  • Demonstrated laboratory experience. Demonstrated knowledge and experience with laboratory techniques and instrumentation, specifically flow cytometry and DNA extractions. Demonstrated experience with laboratory safety procedures and calibration techniques.
  • Proven ability to work effectively on multiple tasks in parallel, with each requiring a different focus and level of detail and attention. Proven ability to prioritize tasks and solve problems.
  • Demonstrated data entry and data analysis experience. Demonstrated experience with spreadsheets and/or databases for data entry, archival and basic data analysis using standard software (e.g., MS Excel, MS Access, Matlab, or other statistical software packages).
  • Experience communicating and interacting with a variety of people from the public to governmental agencies, students and volunteers. Ability to effectively communicate instructions and interact using tact and diplomacy with diverse personalities including academic, staff, student and volunteer employees and institutions/organizations.
  • Proven ability and experience using PCs, email, internet, general office tools and software.
  • Tolerance of repetitive tasks such as data entry and checking, or extended periods in laboratory filtering samples or analyzing seawater samples via flow cytometry.
  • Demonstrated ability to find and follow written and oral procedures from standard laboratory resources.
  • Must be organized and a self-motivator with the ability to work efficiently while unsupervised.
  • Proven ability to document significant results of data analysis in technical notes. Good writing skills. Ability to integrate data products and methodologies from laboratory and field instrumentation into research results for publication purposes.
  • Proven ability to communicate with technical and scientific personnel. Ability to instruct and aid research associates and students on the use of software packages and data procedures/protocols.
  • Ability to travel for days to weeks for field work and work extended hours as needed.
  • Ability to drive University vehicles to field stations. Valid driver’s license.
  • Proven ability to work with others under demanding conditions, sometimes for extended periods of time.

SPECIAL CONDITIONS

  • Ability to work at sea. Must have demonstrated experience with SCUBA diving and ability to acquire and maintain AAUS and SIO scientific diving certification.
  • Must have valid driver’s license and ability to drive University vehicles to field stations.
  • Ability to travel for days to weeks for field work and work extended hours as needed.
  • This position is subject to a DMV check for driving record. Fluency in Spanish is preferred.

New paper on protein adaptations to high salinity and low temperature

Congratulations to Luke Piszkin (now a PhD student in the Biophysics Department at the University of Notre Dame) for the first paper in the lab to be first-authored by an undergraduate! Luke’s paper is titled Extremophile enzyme optimization for low temperature and high salinity are fundamentally incompatible and appears in the journal Extremophiles. In the paper Luke explores the molecular basis underlying the intriguing observation that there appear to be very few (no?) extreme halophiles that are also extreme psychrophiles, despite the fact that there are many environments on Earth that are both cold and salty.

Deep Lake Antarctica: cold and salty, but dominated by archaea with a surprisingly high optimal growth temperature. Image from http://www.lateralmag.com/articles/issue-7/the-cold-case-of-deep-lake with credits to Ricardo Cavicchioli.

One of these environments is Deep Lake, Antarctica, which supports a microbial community dominated by the mesophilic archaeon Halorubrum lacusprofundi (optimal growth temperature of 36 °C). That’s rather surprising given that your typical true psychrophile conks out at about 18 °C. Like all haloarchaea, what H. lacusprofundi can do is tolerate high levels of salt, up to 4.5 M NaCl or 262 g L⁻¹. That level of salt tolerance is not seen among the documented true psychrophiles. Why not?

In the manuscript we posit that it comes down to the different amino acid substitutions needed to adapt a protein to high salt or low temperature conditions. High salt proteins typically have low isoelectric points, derived from more acidic amino acids. The practical implication of this is that they have a more negatively charged surface that requires a high concentration of salt for stability. This is a requirement for the “salt-in” strategists that dominate the most saline environments (such as salt crystallizer ponds). These microbes are primarily archaea but include a few bacteria, and deal with the high salinity of their environment by accumulating high intracellular concentrations of the salt KCl. This maintains their osmotic balance while excluding more harmful salts, but requires proteins that are compatible with high concentrations of KCl. By contrast most halotolerant bacteria (including psychrophiles that inhabit moderate salinity environments) are “salt-out” strategists that accumulate organic solutes to maintain osmotic balance. These solutes impose no particular requirements on intracellular proteins.
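A quick way to get a feel for the "more acidic amino acids" point is to count charged residues in a sequence. The sketch below is only a rough proxy, not an isoelectric point calculation and not the method used in the paper: it counts aspartate and glutamate as negative and lysine and arginine as positive, and ignores histidine, which is mostly uncharged near neutral pH. The function name and the toy peptide are invented for illustration.

```python
ACIDIC = set('DE')   # aspartate, glutamate: negatively charged near pH 7
BASIC = set('KR')    # lysine, arginine: positively charged near pH 7


def charge_summary(protein_seq):
    """Summarize acidic and basic residue content of a protein sequence."""
    seq = protein_seq.upper()
    n_acidic = sum(aa in ACIDIC for aa in seq)
    n_basic = sum(aa in BASIC for aa in seq)
    return {
        'acidic_fraction': n_acidic / len(seq),
        'basic_fraction': n_basic / len(seq),
        # Negative values suggest a net negative charge and a low isoelectric point.
        'net_charge_proxy': n_basic - n_acidic,
    }


# Toy peptide, rich in D and E like a typical "salt-in" protein surface.
print(charge_summary('MDEEDLKAEDG'))
```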

The trick is that amino acid substitutions that lead to a lower isoelectric point also decrease the flexibility of the protein. Increased flexibility is the key protein adaptation to low temperature. Thus the fundamental incompatibility between optimization to low temperature and high salinity. To test this idea Luke dusted off a model, the Protein Evolution Parameter Calculator (PEPC), that I developed many years ago in the waning days of my PhD. After updating the code from Python 2 to Python 3 and making some other improvements, Luke devised an experiment to “evolve” core haloarchaea orthologous group (tucHOG) proteins from H. lacusprofundi and the related mesophile Halorubrum salinarum. By telling the model to select for increased flexibility or decreased isoelectric point he could identify how improvements in one parameter impacted the other. As expected, likely amino acid substitutions (based on position in the protein and the BLOSUM80 substitution matrix) that increased flexibility also strongly favored an increased isoelectric point.

From Piszkin and Bowman, 2022. The directed evolution of tucHOG proteins from H. lacusprofundi and H. salinarum. The proteins were forced to evolve toward increasing flexibility while monitoring the resulting change in isoelectric point.

Tutorial: altering an existing NPZ model

This summer I had the pleasure of teaching high school students as part of a Sally Ride Science Junior Academy. My class was called Polar Microbes, and we discussed adaptations to environments unique to the poles and the importance of microbes to the food webs of the Arctic and Antarctic. One of the things I most wanted to show students was how a simple ecological model could be changed to better fit the polar environment and explicitly include micro-organisms. I was so impressed by how quickly my students were able to understand and change the code underlying the model we used. I wanted to write a quick tutorial to extend that learning to anyone who is intimidated by ecological modeling and wants an easy place to start.

It is valuable to start out with a basic definition: a model is a simple representation of a complex phenomenon. Models are useful because they explicitly describe important mechanisms, which then can be tested against observations. This testing will ultimately demonstrate whether your concept of a natural phenomenon was valid or needs to be refined. With very little modeling experience myself, I started with an existing model from the excellent textbook “A Practical Guide to Ecological Modelling” by Karline Soetaert and Peter Herman from Springer. If you use R as a coding language, it is a great book for getting started with modeling, as it pairs many conceptual explanations with highly understandable code. All the examples from the book are in the R package ecolMod:

install.packages("ecolMod")

library(ecolMod)

demo("chap2")

Once you have the package loaded, you can click through the examples to see how to build a simple ecological model, where a forcing function causes flow between state variables. It is easier to understand with the below visual (Fig 2.1 of Soetaert and Herman).

In oceanography, a common real-world application of this conceptual type of model is the NPZD, which stands for Nutrient, Phytoplankton, Zooplankton and Detritus. It is important for us to understand the flow of carbon and nitrogen (among other elements!) through both the macroscopic (zooplankton) and microscopic (detritus that is re-mineralized by bacteria) food web. This is one of the simplest ways to mathematically model it.

Along with figures, the authors are kind enough to include the code for the model. In their code, the rate of change of each state variable N, P, Z or D (the boxes) is equal to the flows in minus the flows out. Based on the figure above, for instance, the change in PHYTO is f1 - f2. In turn, each flow is its own mathematical equation with parameters (constants that are experimentally determined). The equation provided for f1, for instance, is:

f1 = Nuptake <- maxUptake * PAR/(PAR+ksPAR) * din/(din+ksDIN)

This is because Nuptake is dependent on solar radiation (PAR) and the amount of nutrients that are available (din), as well as the parameters maxUptake, ksPAR and ksDIN which are set as equal to 1/day, 140 muEinst/m2/s and 0.5 mmolN/m3 respectively when we define our parameters later in the model. I encourage you to download the model code and follow how each of the state variable definitions, flows and parameters are connected. Even in a model as simple as this it gets complicated!
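To make the structure of that expression concrete, here is a Python translation of the line as quoted above (the book's code is in R; the function name n_uptake is my own). Each of the two fractions is a saturating term that rises from 0 toward 1, so uptake can never exceed maxUptake.

```python
def n_uptake(par, din, max_uptake=1.0, ks_par=140.0, ks_din=0.5):
    """Nitrogen uptake as quoted above: max rate times two saturating terms.

    Defaults follow the parameter values given in the text:
    max_uptake in 1/day, ks_par in muEinst/m2/s, ks_din in mmolN/m3.
    """
    light_limitation = par / (par + ks_par)      # 0.5 when par == ks_par
    nutrient_limitation = din / (din + ks_din)   # 0.5 when din == ks_din
    return max_uptake * light_limitation * nutrient_limitation


# With both light and nutrients at their half-saturation constants,
# each term is 0.5 and uptake is a quarter of the maximum.
print(n_uptake(par=140.0, din=0.5))
# Ten times more of both pushes uptake toward, but not past, the maximum.
print(n_uptake(par=1400.0, din=5.0))
```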

Even more exciting are the model solutions, which show a sensible story over two years. As you know from above, the forcing function for the model is PAR (solar radiation), which varies over the season (the sine wave in panel A of the following figure). As PAR increases in the spring, there is a modeled increase in Chlorophyll and Zooplankton (what oceanographers call a “spring bloom”!) and a decrease in DIN.

As I was teaching a class called Polar Microbes, I wanted to change some parts of the model to better reflect a polar environment. Since the model’s forcing function is the seasonal light cycle, I knew it was the first thing that needed to change. The tilt of Earth’s rotation axis ensures that the poles have a much more extreme seasonal light cycle, with time in both full darkness and full light.

When you change the model to reflect this planetary fact (just change the PAR function to have a steeper slope and a period of darkness), the output variables change drastically (the Polar Model is in blue below):
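As a rough sketch of the kind of change described, the Python functions below compare a gentle sinusoidal light cycle with a polar-style one that has a larger swing and is clipped at zero to create a dark season. The means, amplitudes and phase are illustrative assumptions, not the values used in the book or in the class.

```python
import math


def par_temperate(day, mean=270.0, amplitude=220.0):
    """Sinusoidal PAR that never reaches zero (illustrative values)."""
    return mean + amplitude * math.sin(2 * math.pi * day / 365 - math.pi / 2)


def par_polar(day, mean=150.0, amplitude=450.0):
    """Steeper seasonal swing, clipped at zero to give a period of darkness."""
    value = mean + amplitude * math.sin(2 * math.pi * day / 365 - math.pi / 2)
    return max(0.0, value)


# Day 0 is midwinter and day 182.5 is midsummer with this phase.
for day in (0, 91, 182.5, 274):
    print(day, round(par_temperate(day), 1), round(par_polar(day), 1))

# Count the days of the year with no light in the polar version.
dark_days = sum(1 for d in range(365) if par_polar(d) == 0.0)
print('dark days:', dark_days)
```

Swapping a function like par_polar in for the original forcing function is the only change needed to produce the polar run; everything downstream responds to it.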

Our class had long discussions about this model output. Is it sensible? What can you infer about the polar regions from this? How could it be improved? In our class, we ended up even adding another state variable, Bacteria, and altering the flows from it (viral lysis) to see what happens.

I encourage you to download the ecolMod package and see for yourself! If you are a high school student, consider joining us next summer at Sally Ride Science for my summer class on Polar Microbes as well.


New paper on detecting successful mitigation of sulfide production

Congrats to Avishek Dutta for his new paper “Detection of sulfate-reducing bacteria as an indicator for successful mitigation of sulfide production” currently available as an early view in Applied and Environmental Microbiology. This was intended to be the second of two papers on a complex experiment that we participated in with BP Biosciences, but the trials and tribulations of peer review led this to be the first. We’re pretty excited about it.

Here’s the quick background. When microbes run out of oxygen, the community turns to alternate electron acceptors through anaerobic respiration. One of these is sulfate, which anaerobic respiration reduces to hydrogen sulfide. In addition to smelling bad, hydrogen sulfide is pretty reactive: it forms a weak acid when dissolved in water and can be oxidized to sulfuric acid. For industrial processes this is a problem. Sulfide can destroy products, inhibit desired reactions, and corrode pipes and equipment. To make matters worse, sulfate-reducing bacteria (SRBs: those microbes that are capable of using sulfate as an alternate electron acceptor) can form tough biofilms that are hard to dislodge.

One way of dealing with undesired SRBs is to fight biology with biology and add a more favorable electron acceptor. Oxygen would of course work really well, but it typically isn’t feasible to implement oxygen injection on a really large scale. However, nitrate also works well. If nitrate is abundant, nitrate-reducing bacteria (NRBs) will outcompete SRBs for resources (e.g., labile carbon). Great! Now here’s the challenge… adding massive quantities of nitrate salts is expensive and likely has its own ecological and environmental consequences. So we’d like to do this judiciously, adding just enough nitrate to the system to offset sulfate reduction. But how to know when you’ve added enough? In a really big system (like an oil field) the sulfide production can be happening very far from any possible sampling site, so simply measuring the concentration of hydrogen sulfide doesn’t help much. But we can learn some useful things by monitoring the microbial community in the effluent.

Schematic of biofilm dispersal, leading to a recognizable signal in the effluent. From Dutta et al., 2021.

The figure above is a schematic of the formation and decay of the biofilm before, during, and after mitigation. In our study the biofilm was presumed to be sulfidogenic and the mitigation strategy was addition of nitrate salts, but the concept applies equally well to any biofilm and any mitigation strategy. The trick (and this is one of those things that seems painfully obvious after the fact but not before) is that you’re looking for the thing you’re mitigating to appear in the effluent. Although this might seem to suggest increased abundance in the system, it actually represents decay of the biofilm and loss from the system. To take this a step further we used paprica to predict genes in the effluent and then identified anomalies in the abundance of genes involved in sulfate reduction. These anomalies provide specific markers of successful mitigation and a means to a general strategy for monitoring the effectiveness of mitigation.
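To give a flavor of what anomaly detection can mean in practice, here is a deliberately simple sketch: score each post-baseline observation against the mean and standard deviation of a pre-mitigation baseline, and flag large positive departures. This is not the method used in the paper; the function name, the threshold, and the abundance values are all invented for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(abundances, n_baseline, threshold=3.0):
    """Return indices of post-baseline values far above the baseline.

    A value is flagged when it exceeds the baseline mean by more than
    `threshold` baseline standard deviations.
    """
    baseline = abundances[:n_baseline]
    mu, sd = mean(baseline), stdev(baseline)
    if sd == 0:
        raise ValueError('baseline has zero variance')
    return [i for i, x in enumerate(abundances)
            if i >= n_baseline and (x - mu) / sd > threshold]


# Hypothetical predicted abundance of a sulfate reduction gene in the effluent:
# six baseline samples, then a pulse as the biofilm sheds, then recovery.
series = [10, 12, 11, 9, 10, 11, 30, 28, 12, 10]
print(flag_anomalies(series, n_baseline=6))
```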

The detection of anomalies in the predicted abundance of relevant genes provides a way to detect the successful mitigation of SRBs (or any biofilm forming microbes). From Dutta et al., 2021.