Computers are an ever-present aspect of modern biochemistry, and many faculty focus on computation as a major component of their research. Molecular simulations provide valuable insights into the mechanisms of protein and RNA interactions and functions. The growing volume of available structural and gene expression data has made possible research on the management, integration, and analysis of these data. Our faculty perform molecular simulations and develop algorithms and data analysis pipelines to help understand biological molecules and their functions. Several faculty work on genomics and epigenomics to help understand how biological information is encoded in DNA and how this directs the regulation and expression of genes.
Computational biology and bioinformatics researchers
Research faculty are accepting new graduate students unless designated with (*).
The Hendrix lab uses a broad range of techniques to study gene regulation, including sequence analysis, machine learning, data science, and pattern recognition. We are using RNA-seq and metabolomic data in flies to uncover changes in diurnal gene expression after aging and blue light exposure. We are using genome assembly and annotation and comparative genomics to describe the evolution of the Cannabaceae family, with a focus on hop and hemp flavor and aroma compounds. We also use machine learning and data science to uncover patterns in RNA sequence and structure.
Juan M. Vanegas
The Vanegas laboratory combines techniques from molecular simulation, continuum mechanics, and quantum chemistry to understand how molecular structure modulates the activation of mechanosensitive proteins and determines the mechanical response of lipid membranes. Our group is highly interdisciplinary, working at the interface between biology, physics, chemistry, and engineering. The central focus of our research is to provide mechanistic insights into essential biological processes such as membrane fission and fusion, organelle and cellular shaping, touch and pain sensing, cardiovascular control and development, and osmotic regulation, among others. We work closely with other laboratories to integrate our modeling efforts with experimental results.
The Mortimer group uses an integrated ‘omics approach to explore interactions between Drosophila hosts and the parasitoid wasps that infect them. They make use of genomic and RNA sequencing, as well as high-throughput proteomics and metabolomics, to better characterize parasitoid venom repertoires and the alterations they provoke in their Drosophila hosts. They are currently studying the metabolic response to infection and the transcriptional correlates of inflammation, and, in collaboration with the Genomics Education Partnership (https://thegep.org/projects/wasps/), they are producing well-annotated parasitoid genomes to study the function and evolution of venom-encoding genes.
The focus of the work is the study of multivalent interactions between intrinsically disordered proteins (IDPs) and the hub protein LC8. To this end, we develop and maintain an algorithm and publicly available tool for predicting LC8-binding motifs. Additionally, we are developing methods to apply Bayesian statistical models to isothermal titration calorimetry (ITC) data to examine the binding energies of multivalent LC8-IDP complexes, with the aim of understanding how these complexes form and what role they play in the cell.
The Hsu laboratory is interested in studying the structural aspects of biomolecular recognition and interactions, especially in protein-nucleic acid complexes. These interactions account for many of the major cell functions such as the induction or repression of gene expression and the packaging of nucleic acids into other superstructures. The primary technique that the lab uses is nuclear magnetic resonance (NMR) spectroscopy, which is uniquely suited for studying biomolecular structures at atomic resolution. The lab studies both sequence-specific and nonspecific DNA-binding proteins and has been active in developing isotope-edited NMR strategies to obtain more accurate distance constraints for use in structure calculations and to investigate the intrinsic flexibility of protein and DNA backbones.
I was known as a computational linguist (parsing, translation, algorithms and theory), but in recent years I have become more interested in adapting my NLP algorithms to computational biology, thanks to the shared mathematical foundations between the two seemingly distant fields. In particular, I have been working on efficient (mostly linear-time) algorithms for RNA folding, mRNA design, homologous folding, and RNA design. When COVID-19 hit, this line of work became much more relevant, both because SARS-CoV-2 is among the longest RNA viruses known today (~30,000 nucleotides), which calls for linear-time algorithms, and because mRNA vaccines are the best way to prevent it but mRNA design is computationally challenging. My work has therefore had an impact on the fight against COVID-19 and has resulted in high-profile papers such as LinearTurboFold (PNAS 2021) and LinearDesign (Nature 2023).
I am most interested in the theoretical and algorithmic aspects of biology and language, and many of my NLP/bio papers draw unexpected connections from theoretical computer science. For example, my synchronous binarization algorithm (binarizing a synchronous context-free grammar in linear time) was inspired by the Graham scan for convex hulls, my LinearDesign algorithm (optimal mRNA design) uses the intersection between context-free and regular languages, and my k-best parsing algorithms are often featured in my Algorithms courses.
Dr. Karplus no longer leads a research group but is still involved in collaborative work with others who study proteins. One area of expertise is the mining of information in known protein structures, especially using ultrahigh-resolution protein structures to discover details and principles of structure that have not yet been recognized. Such insights can help improve protein structure prediction. Projects have included work on the non-planarity of the peptide bond, the dependence of covalent geometry on conformation, and the comparison of protein ensembles to locate key conformational differences.
Freitag's research has applied and optimized high-throughput methods useful for the investigation of genome-wide epigenetic regulation, including ChIP-seq, RNA-seq and Hi-C. While his current focus is on the control and function of histone H3 lysine 27 methylation (NSF MCB1818006), he also studies transcriptional gene networks of strains lacking chromatin factors (NIH R01GM132644). Another interest of the lab is the generation of complete, telomere-to-telomere genome sequences by PacBio SMRT and Oxford Nanopore DNA sequencing (JGI CSP504417).