A simple solution for continuous, real-time monitoring with the Seabird SUNAV2 over RS232

Over the last couple of years we’ve been trying to operate a Seabird SUNAV2 for continuous, real-time nitrate observations as part of the Scripps Ecological Observatory effort. At the start of the project this seemed like a simple thing… verify the factory calibration against a standard, stick it in the water, and stream some ASCII data for parsing and plotting, right? I should have known better. Although the SUNA can operate through Seabird’s UCI GUI (a vast improvement over the nightmarish landscape of configuration files and aughts-era GUIs required to run some Seabird devices) we were never successful in operating ours for more than 1-2 weeks via the GUI. While UCI did produce an interpretable *sbslog file that could be parsed for nitrate concentration and other data, inevitably either the software would freeze or the SUNAV2 would become unresponsive after a few days. Whenever the latter happened we would need to power cycle the instrument to bring it back online, and this eventually corrupted the memory (or so says Seabird), requiring a factory flash of the firmware and other maintenance. I think in two years of operation we acquired maybe 2 weeks of usable data, and the SUNA made two trips to Bellevue, WA. Once upon a time I lived near Bellevue and I wish I could go back as frequently.

We recently received the instrument back from Seabird and I was stumped on what to do with it. I knew it was capable of RS232 communication, which might free us from the UCI GUI and solve at least one of our problems, but I have limited experience with serial device communication. With few other options and only cryptic guidance from the SUNAV2 manual I grabbed a power supply and USB-serial converter and gave it a try.

PuTTY was the logical starting place as I use it for SSH connections on my Windows laptop. After the normal guessing of COM ports I made contact at the specified baud rate of 57600 and otherwise default connection parameters. It took me a few tries to realize that I needed to send some (any) character to the instrument to “wake it up” and induce a response and command prompt. Pro tip: you gotta be fast on that command prompt or else the instrument goes back to sleep. But you can’t prompt with a command!

PuTTY worked great for the initial connection but provides no solution for continuous, real-time monitoring. You can get PuTTY to create a log file (containing your data and anything else sent over the serial connection), but the file isn’t created until the session closes which makes it not very useful for real-time monitoring. PuTTY includes the plink utility that is supposed to provide an automation solution. While it can do some cool things I couldn’t crack it for this task. Because the instrument takes some time to respond to the initial “wake up” keystroke, a programmatic solution needs a delay, and I couldn’t find a way to implement one with plink.

Almost in desperation I turned to Python, which is probably where I should have started. A previous day’s frustration turned into a pleasant hour of tinkering over a cup of coffee, and voilà! Here’s the solution, may it help others. The major unsolved issue is that some encoding issue (?) prevents proper parsing of the data passed from the instrument. I tinkered with io.TextIOWrapper and various encoding options to no avail, and ultimately settled on Python string operations to set things right.
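
The approach can be sketched roughly as follows. This is a minimal illustration rather than the original script, and several details are assumptions: it relies on the pyserial package, the port name COM4, the log file name suna_log.txt, and the two-second wake-up delay are all hypothetical, and the string cleanup simply drops anything that is not printable ASCII. How the instrument itself is configured to sample is outside the scope of the sketch.

```python
import time


def clean_line(raw):
    """Decode raw bytes and keep only printable ASCII characters.

    Plain string operations stand in for a proper fix of the
    encoding problem (an assumption about what needs removing).
    """
    text = raw.decode('ascii', errors='ignore')
    return ''.join(ch for ch in text if ch.isprintable()).strip()


def wake_and_read(port, n_lines=10, wake_delay=2.0):
    """Send a wake-up character, wait, then read up to n_lines lines.

    `port` is any object with write() and readline() methods,
    e.g. an open serial.Serial instance.
    """
    port.write(b'\r')        # any character should wake the instrument
    time.sleep(wake_delay)   # give the instrument time to respond
    lines = []
    for _ in range(n_lines):
        raw = port.readline()    # returns b'' if the read times out
        line = clean_line(raw)
        if line:
            lines.append(line)
    return lines


def main(port_name='COM4', log_path='suna_log.txt'):
    """Poll the instrument forever, appending timestamped lines to a log."""
    import serial  # pyserial; imported here so the helpers work without it

    with serial.Serial(port_name, baudrate=57600, timeout=5) as port:
        while True:
            for line in wake_and_read(port):
                stamp = time.strftime('%Y-%m-%dT%H:%M:%S')
                # Reopen in append mode each time so the file is always
                # current and can be read by a plotting script.
                with open(log_path, 'a') as log:
                    log.write(stamp + '\t' + line + '\n')
```

Calling main() starts the loop. Because the helper functions only need an object with write() and readline(), they can be tried out against a stand-in object before connecting the real instrument.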

New postdoctoral position in pathogen ecology

The Bowman Lab has a new position open for a postdoc in viral pathogen ecology. The postdoc will join a dynamic team of experimentalists, ecological modelers, and physical oceanographers working to understand the distribution of and exposure risk to norovirus and other pathogens in coastal California. The work is motivated by the urgent need to better forecast risk associated with cross-border sewage transport in southern San Diego County. The postdoc will have primary responsibility for conducting experiments to determine the decay of norovirus under realistic environmental conditions and for analyzing a growing dataset of norovirus abundance obtained with ddPCR. There will be opportunities to be involved in both the field and modeling components of the project, depending on interest and professional development goals. Applicants should have a strong publication record and excellent writing skills, knowledge of experimental design, qPCR experience, and theoretical knowledge of pathogen ecology. Specific knowledge of norovirus and mammalian cell culture techniques is a plus. The position will remain open until filled. Please send an expression of interest and CV to Jeff at jsbowman at ucsd.edu.

Planet Lab imagery of the south San Diego County coastline on April 10, 2024. Offshore and nearshore plumes associated with cross-border sewage transport can be seen. Known sources include the Tijuana River (visible in the lower right) and Punta Bandera (not shown).

Seeking postdoc in phytoplankton ecology

The Bowman Lab seeks a postdoctoral researcher for a two-year project to investigate patterns in phytoplankton community composition across molecular time series collected at the Ellen Browning Scripps Memorial Pier at Scripps Institution of Oceanography (SIO) and the California Polytechnic State University (Cal Poly, San Luis Obispo) Pier at Avila Beach. The postdoctoral researcher should have a background in biological oceanography or microbial ecology, a solid understanding of ecological statistics and the analysis of amplicon sequence data, be familiar with phytoplankton sampling techniques, and understand basic molecular lab procedures. Specific knowledge of eukaryotic harmful algal bloom-forming taxa is a plus. The position will be located at SIO but jointly mentored by Dr. Alexis Pasulka (Cal Poly) and Dr. Jeff Bowman (SIO). There will be specific opportunities to work with undergraduate students at Cal Poly and SIO. The position will be open until filled and applicants should reach out jointly to Dr. Bowman (jsbowman at ucsd.edu) and Dr. Pasulka (apasulka at calpoly.edu) with a copy of their CV and a brief expression of interest.

Recent blog post by PhD student Beth Connors

Check out this recent blog post written by PhD student Beth Connors for International Women in Science Day!

New paper: Antarctic metagenomes reveal novel microbial diversity

I’ve fallen very far behind on my one-time goal of writing a blog post for each new lab publication. Seeking redemption via a paper out today by former Bowman Lab postdoc Avishek Dutta, now an assistant professor at UGA. His paper, Depth drives the distribution of microbial ecological functions in the coastal western Antarctic Peninsula, is the first study (we believe) to employ high throughput shotgun metagenomics to evaluate the ecological functions of bacteria and archaea across multiple depths and seasons in the Antarctic marine system. It builds on some wonderful work by Tom Delmont in the Amundsen Sea and Joseph Grzymski in the same region as this study.

Avishek selected 48 historic samples collected by our group in collaboration with the Palmer Long Term Ecological Research study for shotgun metagenomic analysis. Although this is not a particularly large sample set by today’s standards, using Avishek’s iMAGine pipeline it was nonetheless sufficient to yield 2,940 bins (collections of contigs that might be from similar genomes) and 137 dereplicated, high quality bins that we considered metagenome assembled genomes (MAGs). MAGs are not perfect constructs; they’re incomplete and likely composites of different, highly similar genomes. Nonetheless, standard genomic tools and analyses can be applied to them to try and infer likely metabolisms and other characteristics.

You really have to stare at the above figure for a while to start to wrap your head around the details, but the main points are: 1) There’s a lot of carbohydrate degradation potential, as we might expect for bacteria and archaea that respond to phytoplankton blooms. 2) There’s a lot of dark carbon fixation potential, and this is often combined with different heterotrophic metabolisms to produce a prokaryotic mixotrophic functional type (i.e. an organism that can switch between autotrophic and heterotrophic metabolisms). This is one of those things that we assume must be fairly common without actually observing it. Right now there is a lot of discussion in various communities about the role of mixotrophic protists in marine foodwebs and the carbon cycle, but mixotrophic bacteria and archaea get surprisingly little attention.

New postdoctoral research opportunity!

Andrew Barton (https://adbarton.scrippsprofiles.ucsd.edu/) and Jeff Bowman (https://jsbowman.scrippsprofiles.ucsd.edu/) at Scripps Institution of Oceanography at the University of California San Diego are recruiting a postdoc to study interactions among marine microbes, inferred from regular genomic measurements and cell images taken at Scripps Pier (https://ecoobs.ucsd.edu/). Possible research areas include but are not limited to: quantifying the strength and direction of microbial interactions, identifying “keystone” microbial taxa, and assessing how microbial interactions shape ecosystem function. The ideal candidate will have a PhD in ecology, marine biology, or related disciplines, and proficiency in data science techniques, machine learning, novel statistical methods, and/or numerical modeling approaches for studying natural populations and communities. Please direct qualified candidates to contact Andrew Barton (adbarton@ucsd.edu) for more information. We anticipate filling the open position by Fall 2023. 

Alignment and phylogenetic inference with hmmalign and RAxML-ng

RAxML is one of the most popular programs around for phylogenetic inference via maximum likelihood. Similarly, hmmalign within HMMER 3 is a popular way to align amino acid sequences against HMMs from Pfam or created de novo. Combine the two and you have an excellent method for constructing phylogenetic trees. But gluing the two together isn’t exactly seamless and novice users might be deterred by a couple of unexpected hurdles. Recently, I helped a student develop a workflow which I’m posting here.

First, define some variables just to make the bash commands a bit cleaner. REF refers to the name of the Pfam hmm that we’re aligning against (Bac_rhodopsin.hmm in this case), while QUERY is the sequence file to be aligned (hop and bop gene products, plus a dinoflagellate rhodopsin as outgroup).

REF=Bac_rhodopsin
QUERY=uniprot_hop_bop_reviewed

Now, align and convert the alignment to fasta format (required by RAxML-ng).

hmmalign --amino -o $QUERY.sto $REF.hmm $QUERY.fasta
seqmagick convert $QUERY.sto $QUERY.align.fasta

Test which model is best for these data. Here we get LG+G4+F.

modeltest-ng -i $QUERY.align.fasta -d aa -p 8

Check your alignment!

raxml-ng --check --msa $QUERY.align.fasta --model LG+G4+F --prefix $QUERY

Oooh… I bet it failed. Exciting! In this case (using sequences from Uniprot) the long sequence descriptions are incompatible with RAxML-ng. Let’s do a little Python to clean that up.

from Bio import SeqIO

with open('uniprot_hop_bop_reviewed.align.clean.fasta', 'w') as clean_fasta:
    for record in SeqIO.parse('uniprot_hop_bop_reviewed.align.fasta', 'fasta'):
        record.description = ''
        SeqIO.write(record, clean_fasta, 'fasta')
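
If Biopython isn’t available, roughly the same cleanup can be sketched with the standard library alone. This is an illustrative alternative, not part of the workflow above: the function name clean_fasta_headers is hypothetical, and it assumes the sequence ID is everything in the header up to the first whitespace. Unlike SeqIO.write, it leaves the sequence lines exactly as they were rather than re-wrapping them.

```python
def clean_fasta_headers(in_path, out_path):
    """Copy a FASTA file, keeping only the sequence ID in each header.

    A header such as (made-up example)
        >sp|X00000|EXAMPLE_SPECIES Some protein OS=Example
    becomes
        >sp|X00000|EXAMPLE_SPECIES
    Sequence lines are copied unchanged.
    """
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:
            if line.startswith('>'):
                # Keep the text up to the first whitespace character
                line = line.split()[0] + '\n'
            dst.write(line)
```

For the files used here the call would be clean_fasta_headers('uniprot_hop_bop_reviewed.align.fasta', 'uniprot_hop_bop_reviewed.align.clean.fasta').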

Check again…

raxml-ng --check --msa $QUERY.align.clean.fasta --model LG+G4+F --prefix $QUERY

If everything is kosher go ahead and fire up your phylogenetic inference. Here I’ve limited bootstrapping to 100 trees. If you have the time/resources do more.

raxml-ng --all --msa $QUERY.align.clean.fasta --model LG+G4+F --prefix $QUERY --bs-trees 100

Superimpose the bootstrap support values on the best ML tree.

raxml-ng --support --tree $QUERY.raxml.bestTree --bs-trees $QUERY.raxml.bootstraps

And here’s our creation as rendered by Archaeopteryx. Some day I’ll create a tree that is visually appealing, but today is not that day. But you get the point.

New paper on using machine learning to predict biogeochemistry from microbial community structure

Congratulations to Avishek Dutta for his paper Machine Learning Predicts Biogeochemistry from Microbial Community Structure in a Complex Model System that was recently published in the journal Microbiology Spectrum. I’m really excited about this paper; the study it is based on inspired this perspective that I wrote for an mSystems early career special issue last year.

Summary of experimental design and analysis, from Dutta et al., 2022.

The figure above summarizes the experimental design and analysis. The experiment was designed to address the question of whether the microbial community contains sufficient information to predict a biogeochemical state in a dynamic system. The structure of a microbial community is highly sensitive to environmental change. Small changes in the chemical or physical environment will result in a shift in abundance of one or more taxa as mortality and growth rates respond. These shifts in structure are easily observed by amplicon sequencing of taxonomic marker genes. These relative abundance data can be combined with flow cytometry analysis of microbial abundance to yield absolute abundance data.
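
The last step is simple arithmetic, sketched here under assumptions: each taxon’s share of the sequence reads is multiplied by the total cell concentration from flow cytometry. The function name, taxon labels, and numbers are hypothetical, and the sketch ignores complications such as differences in marker gene copy number among taxa.

```python
def absolute_abundance(read_counts, total_cells_per_ml):
    """Scale per-taxon read counts to cells per mL.

    read_counts: dict mapping taxon name -> number of reads
    total_cells_per_ml: total cell concentration from flow cytometry
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError('sample has no reads')
    # relative abundance (count / total_reads) times total cell count
    return {
        taxon: count * total_cells_per_ml / total_reads
        for taxon, count in read_counts.items()
    }


# Made-up example: 60 % of reads from taxon_a, 40 % from taxon_b,
# and 1e6 cells per mL counted by flow cytometry.
example = absolute_abundance({'taxon_a': 60, 'taxon_b': 40}, 1e6)
```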

The trick of course is relating an observed shift in community structure to a specific biogeochemical state. Machine learning provides a number of ways to do this, but all require large training datasets. Fortunately gene sequencing is pretty cheap these days and DNA extractions are much more high-throughput than they were just a few years ago. Because of this it’s possible to generate community structure data for hundreds of samples in relatively short order. In this study Avishek used over 700 samples from sediment bioreactors and the random forest algorithm to predict the concentration of hydrogen sulfide with a reasonably high degree of accuracy.

Like any statistical model, developing machine learning models takes careful attention to detail. Careful segregation of the data into training and validation sets and engineering of the features used for prediction yield the most honest models that can be best applied for future predictions. Avishek’s paper is an excellent template for developing a predictive machine learning model from microbial community structure data.

Lab manager position open!

We’re on the hunt for a lab manager/senior lab technician to take on a variety of key tasks in the Bowman Lab. The position is being advertised at the Staff Research Associate II level and the ideal applicant will have an MS in a relevant field, or a BS and equivalent experience. We are looking for someone with complementary skills to the rest of the lab; the ideal applicant would have a background in environmental or analytical chemistry to complement our core expertise in microbiology. However, a background in the life sciences also works fine. The formal job posting is pasted below (note that it deviates slightly from what’s described here due to limitations of the UC San Diego HR system).

DESCRIPTION

Under supervision, independently perform a variety of standard laboratory and data analysis procedures (and some non-standard procedures) related to the function of coastal ocean environments. Coordinates and conducts instrument calibrations and data collection for long-term time-series of microbial community structure, microbial abundance, and dissolved gases. Responsible for the operation and maintenance of a membrane inlet mass spectrometer, flow cytometer, and in situ imaging flow cytometer (IFCB), DNA extraction, data entry, and light programming in Python and R. Travel to field stations as needed, which may involve driving University vehicles and operating small boats for diving and coastal field work. Scuba dive to clean and service underwater instrumentation. Coordinate and communicate with lab members about supplies, data and sampling techniques. Oversee and work-direct undergraduate research assistants. Process, analyze, and interpret results from data sets, evaluate quality of data, generate and update design and method documentation, and update web pages. Perform general office duties including but not limited to filing, photocopying, faxing and library searches for research articles. Manage laboratory space, computers, and equipment.

  • Must be able to lift 50 lbs.

QUALIFICATIONS

  • B.S. in Chemistry, Marine Science, Oceanography, or equivalent combination of education and experience with a strong background in data analysis and computer operations.
  • Demonstrated experience with diving and ability to acquire or maintain AAUS and SIO scientific diving certification.
  • Demonstrated knowledge of mathematics, scientific, and programming principles.
  • Demonstrated experience with R, Matlab, or Python programming languages for data analysis and visualization.
  • Demonstrated laboratory experience. Demonstrated knowledge and experience with laboratory techniques and instrumentation, specifically flow cytometry and DNA extractions. Demonstrated experience with laboratory safety procedures and calibration techniques.
  • Proven ability to work effectively on multiple tasks in parallel, with each requiring a different focus and level of detail and attention. Proven ability to prioritize tasks and solve problems.
  • Demonstrated data entry and data analysis experience. Demonstrated experience with spreadsheets and/or databases for data entry, archival and basic data analysis using standard software (e.g., MS Excel, MS Access, Matlab, or other statistical software packages).
  • Experience communicating and interacting with a variety of people from the public to governmental agencies, students and volunteers. Ability to effectively communicate instructions and interact using tact and diplomacy with diverse personalities including academic, staff, student and volunteer employees and institutions/organizations.
  • Proven ability and experience using PCs, email, internet, general office tools and software.
  • Tolerance of repetitive tasks such as data entry and checking, or extended periods in laboratory filtering samples or analyzing seawater samples via flow cytometry.
  • Demonstrated ability to find and follow written and oral procedures from standard laboratory resources.
  • Must be organized and a self-motivator with the ability to work efficiently while unsupervised.
  • Proven ability to document significant results of data analysis in technical notes. Good writing skills. Ability to integrate data products and methodologies from laboratory and field instrumentation into research results for publication purposes.
  • Proven ability to communicate with technical and scientific personnel. Ability to instruct and aid research associates and students on the use of software packages and data procedures/protocols.
  • Ability to travel for days to weeks for field work and work extended hours as needed.
  • Ability to drive University vehicles to field stations. Valid driver’s license.
  • Proven ability to work with others under demanding conditions, sometimes for extended periods of time.

SPECIAL CONDITIONS

  • Ability to work at sea. Must have demonstrated experience with SCUBA diving and ability to acquire and maintain AAUS and SIO scientific diving certification.
  • Must have valid driver’s license and ability to drive University vehicles to field stations.
  • Ability to travel for days to weeks for field work and work extended hours as needed.
  • This position is subject to a DMV check for driving record. Fluency in Spanish is preferred.

New paper on protein adaptations to high salinity and low temperature

Congratulations to Luke Piszkin (now a PhD student in the Biophysics Department at the University of Notre Dame) for the first paper in the lab to be first-authored by an undergraduate! Luke’s paper is titled Extremophile enzyme optimization for low temperature and high salinity are fundamentally incompatible and appears in the journal Extremophiles. In the paper Luke explores the molecular basis underlying the intriguing observation that there appear to be very few (no?) extreme halophiles that are also extreme psychrophiles, despite the fact that there are many environments on Earth that are both cold and salty.

Deep Lake Antarctica: cold and salty, but dominated by archaea with a surprisingly high optimal growth temperature. Image from http://www.lateralmag.com/articles/issue-7/the-cold-case-of-deep-lake with credits to Ricardo Cavicchioli.

One of these environments is Deep Lake, Antarctica, which supports a microbial community dominated by the mesophilic archaeon Halorubrum lacusprofundi (optimal growth temperature of 36 °C). That’s rather surprising given that your typical true psychrophile conks out at about 18 °C. Like all haloarchaea, what H. lacusprofundi can do is tolerate high levels of salt, up to 4.5 M NaCl or about 263 g L⁻¹. That level of salt tolerance is not seen among the documented true psychrophiles. Why not?

In the manuscript we posit that it comes down to the different amino acid substitutions needed to adapt a protein to high salt or low temperature conditions. High salt proteins typically have low isoelectric points, derived from more acidic amino acids. The practical implication of this is that they have a more negatively charged surface that requires a high concentration of salt for stability. This is a requirement for the “salt-in” strategists that dominate the most saline environments (such as salt crystallizer ponds). These microbes are primarily archaea but include a few bacteria, and deal with the high salinity of their environment by accumulating high intracellular concentrations of the salt KCl. This maintains their osmotic balance while excluding more harmful salts, but requires proteins that are compatible with high concentrations of KCl. By contrast most halotolerant bacteria (including psychrophiles that inhabit moderate salinity environments) are “salt-out” strategists that accumulate organic solutes to maintain osmotic balance. These solutes impose no particular requirements on intracellular proteins.

The trick is that amino acid substitutions that lead to a lower isoelectric point also decrease the flexibility of the protein. Increased flexibility is the key protein adaptation to low temperature. Thus the fundamental incompatibility between optimization to low temperature and high salinity. To test this idea Luke dusted off a model, the Protein Evolution Parameter Calculator (PEPC), that I developed many years ago in the waning days of my PhD. After updating the code from Python 2 to Python 3 and making some other improvements, Luke devised an experiment to “evolve” core haloarchaea orthologous group (tucHOG) proteins from H. lacusprofundi and the related mesophile Halorubrum salinarum. By telling the model to select for increased flexibility or decreased isoelectric point he could identify how improvements in one parameter impacted the other. As expected, likely amino acid substitutions (based on position in the protein and the BLOSUM80 substitution matrix) that increased flexibility also strongly favored an increased isoelectric point.

From Piszkin and Bowman, 2022. The directed evolution of tucHOG proteins from H. lacusprofundi and H. salinarum. The proteins were forced to evolve toward increasing flexibility while monitoring the resulting change in isoelectric point.