Mathematical Modeling/Computational Biology

Project 4: Statistical Methods for the Analysis of Microbial Community Structure

Dr. Zaid Abdo - Project Director; Dr. Paul Joyce - Mentor; Dr. Jack Sullivan - Mentor

Harmful pathogens are not the sole cause of disease: disease may also result from disruption of the balance of the human-microbial ecosystem (the "human microbiome"). The goal of this project is to develop computational approaches that differentiate between normal and abnormal human microbiome communities. This project will: 1) develop model-based clustering methods to distinguish and characterize different microbial community groups, 2) develop classification methods to correctly and efficiently assign newly sampled microbial communities to pre-existing, well-characterized groups, and 3) test those methods using real and simulated data. Successful completion of these goals will result in software that will assist clinicians in identifying deviations from the normal human microbiome for research and diagnostic purposes. This project aligns with the Human Microbiome Project of the NIH Roadmap Initiative.

Ecological Genomics

Erica Bree Rosenblum

Amphibians around the world have been experiencing massive population losses and extinctions. Although these declines have been precipitated by a number of factors, the fungal pathogen Batrachochytrium dendrobatidis (Bd) is a devastating threat, infecting hundreds of amphibian species worldwide. Our work on disease-related declines in amphibians leverages whole-genome sequences to understand:

1) The genetics of host/pathogen interactions: The primary thrust of our frog/chytrid work focuses on understanding the genetic changes associated with fungal infection of frog hosts from a whole-genome perspective. From the host perspective, we use whole-genome expression assays to identify genes that are involved in frog response to Bd under different conditions. We leverage whole-genome data for the model frog species Xenopus tropicalis to make much of our lab-based immunogenetics work possible. From the pathogen perspective, we use both comparative and functional genomics to study genes that may be involved in Bd pathogenicity under different conditions.

2) The evolution of pathogenicity: We were involved in the initial whole-genome sequencing project for B. dendrobatidis and are now sequencing whole genomes of a number of additional Bd strains. The resulting sequence data will provide a wealth of information about the biology of this basal group of fungi and provide insight into the evolutionary origin and spread of this emerging pathogen.

3) Applying genomics to ecologically important questions: One of our major research aims is to integrate functional genomics and organismal biology by applying genomic data to questions in nature. We have an ongoing interest in developing large-scale genetic resources for non-model species and in building diverse collaborations to better tackle evolutionary questions in complex natural systems.

Small Deviations of Gaussian Processes

Frank Gao

Gaussian processes are among the most important classes of random processes; Brownian motion and fractional Brownian motion are two examples. Small deviation probabilities concern rare events in which a random process stays within a small designated region over a given period of time. The study of small deviation probabilities of Gaussian processes offers a better understanding of a number of limit theorems in probability.
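As a concrete illustration, the following is a standard textbook result rather than a finding of this project. For standard Brownian motion W on [0, 1], the probability of staying within a band of half-width ε around zero decays exponentially in 1/ε², with a known constant:

```latex
\lim_{\varepsilon \to 0^{+}} \varepsilon^{2}
  \log \mathbb{P}\!\left( \sup_{0 \le t \le 1} |W(t)| \le \varepsilon \right)
  = -\frac{\pi^{2}}{8}
```

Here the "small designated region" is the band of half-width ε and the "given period of time" is the unit interval. For many other Gaussian processes only the rate of decay is known, not the exact constant.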

Metric Entropy of Function Classes

Frank Gao

How many fire stations are needed so that every house in the city of Moscow is within one mile of the nearest fire station? This is an example of the concept of covering numbers. By studying covering numbers and their function, we are led to a better understanding of the minimum sample size needed to predict an unknown probability distribution. Determination of the exact covering number for a given class of functions is, however, typically difficult and unnecessary. What is important is the logarithm of the covering number, which is called metric entropy. Analytic tools are traditionally used to estimate metric entropy, but recently probabilistic methods have added new insights into this area, and this is a focus of my research.
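The fire-station question can be made concrete with a minimal sketch. The house locations below are random points in a hypothetical 5-by-5 mile square, and the greedy rule is an illustrative heuristic, not a method described above. It gives only an upper bound on the covering number, with stations restricted to house locations, and the logarithm of that bound is an upper bound on the corresponding metric entropy.

```python
import math
import random


def greedy_cover(points, radius):
    """Pick centers greedily until every point lies within `radius` of a center.

    Returns a valid cover, so its size is an upper bound on the covering
    number of `points` at scale `radius` (centers restricted to the points).
    """
    centers = []
    uncovered = list(points)
    while uncovered:
        center = uncovered[0]          # any uncovered point becomes a center
        centers.append(center)
        # keep only the points this new center fails to cover
        uncovered = [p for p in uncovered if math.dist(p, center) > radius]
    return centers


# Hypothetical data: 500 "houses" placed uniformly in a 5 x 5 mile square.
random.seed(0)
houses = [(random.uniform(0, 5), random.uniform(0, 5)) for _ in range(500)]

stations = greedy_cover(houses, radius=1.0)
upper_bound = len(stations)                   # bound on the covering number
metric_entropy_bound = math.log(upper_bound)  # bound on the metric entropy
```

Because each new center is chosen from points not yet covered, the chosen centers are also more than one mile apart from each other, which is why this kind of construction is often used to relate covering numbers to packing numbers.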

Phylogenetic Methods

Jack Sullivan

Phylogenetic analysis, the estimation of evolutionary trees, has become the cornerstone of evolutionary biology. In addition to their more traditional applications in evolutionary biology, molecular phylogenies (i.e., phylogenies that have been estimated from molecular data such as DNA sequences) are being applied to an ever-widening array of disciplines. These include biomedicine (e.g., tracing infection pathways for HIV and other pathogens), bioinformatics (e.g., genome evolution), and forensics (e.g., phylogenies estimated from HIV sequences have recently been allowed as evidence in a murder trial). Because of this, the development and testing of phylogenetic methods assumes a position of critical importance and extremely broad relevance. Furthermore, the influx of molecular sequence data and the adoption of an explicitly statistical approach to data analysis have made it necessary to refine methods of phylogenetic inference. You can see more detail by following this link.

Microbial Diversity

James Foster

We are developing techniques and tools with which to infer the makeup of a microbial community from the DNA present in a sample without having to sequence the DNA or culture the microbes. We are particularly interested in microbial communities in the human microbiome. We are beginning to perform wet-bench experiments in addition to our in silico and mathematical projects. Our goal is to understand why different ecosystems host the communities they do, and how those communities change in response to evolutionary dynamics. See our Microbial Community Analysis (MiCA) website.

Evolutionary Computation and Algorithmics

James Foster

We are applying evolutionary computation to address problems in bioinformatics. For example, we have developed and tested algorithms that use genetic programming and genetic algorithms for multiple (DNA or protein) sequence alignment, and for phylogenetic reconstruction. These algorithms are springboards for two directions of development: tools for practitioners, especially those with large volumes of data; and theoretical limits to algorithmic efficiency for these problems.

Simulation of Molecular Evolution

James Foster

We have developed sophisticated computer simulations. One system simulates the evolution of DNA sequences which undergo reverse transcription. This simulator maintains complete sequence information and can be configured to accommodate a wide range of assumptions about the mechanisms and parameters involved. Our intended model system was mammalian retrotransposons, but the system is not tied to any particular biological system. The objective is to understand the dynamics of evolving systems with multiple levels of selection.

Determining Mechanisms for Adaptive Protein Evolution

F. Marty Ytreberg

An experimental biologist can often determine specific amino acid mutations that occur due to adaptive evolution. However, it is not always possible to deduce the molecular-level mechanism behind these mutations. The central idea of this project is to understand adaptive protein evolution in bacteriophage by using computer simulations to estimate protein-protein binding affinities. The goal is to understand, and eventually predict, the effects of amino acid mutations that occur during evolution. This same general technique could be used to study, for example, protein evolution and drug resistance.

Adaptive Evolution of Viruses

Steve Krone

We employ a combination of mathematical/computational models and experimental evolution to investigate the role of spatial structure in the adaptive evolution of phages (viruses that infect bacteria) in constant and fluctuating environments. The mathematical models involve rather complex stochastic cellular automata (CA) that characterize the spatial dynamics of bacterial growth and phage infection down to the level of individual bacterial cells. These models are validated and calibrated using laboratory experiments; model predictions then help to explain empirical results and suggest new experiments. In addition to directing my own phage laboratory, I collaborate with the phage labs of Drs. Holly Wichman and Celeste Brown. (This work is funded by a grant from NIH.)
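To give a flavor of what a stochastic cellular automaton of this kind can look like, here is a deliberately simplified sketch. It is not the lab's model: the three site states, the four-neighbor torus, the synchronous update, and the parameter names `growth`, `infect`, and `lysis` are all assumptions made for illustration.

```python
import random

EMPTY, HOST, INFECTED = 0, 1, 2  # hypothetical site states


def neighbors(grid, r, c):
    """States of the four nearest neighbors on a periodic (torus) lattice."""
    n = len(grid)
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [grid[(r + dr) % n][(c + dc) % n] for dr, dc in offsets]


def step(grid, rng, growth=0.5, infect=0.8, lysis=0.3):
    """One synchronous update; all rates are illustrative assumptions."""
    n = len(grid)
    new = [row[:] for row in grid]
    for r in range(n):
        for c in range(n):
            nbrs = neighbors(grid, r, c)
            state = grid[r][c]
            if state == EMPTY:
                # an empty site is colonized by neighboring uninfected bacteria
                if rng.random() < growth * nbrs.count(HOST) / 4:
                    new[r][c] = HOST
            elif state == HOST:
                # a bacterium is infected by phage from neighboring infected cells
                if rng.random() < infect * nbrs.count(INFECTED) / 4:
                    new[r][c] = INFECTED
            elif rng.random() < lysis:
                # an infected cell lyses and leaves an empty site
                new[r][c] = EMPTY
    return new


def simulate(n=20, steps=30, seed=1):
    """Run the automaton and record (empty, host, infected) counts per step."""
    rng = random.Random(seed)
    grid = [[HOST if rng.random() < 0.3 else EMPTY for _ in range(n)]
            for _ in range(n)]
    grid[n // 2][n // 2] = INFECTED  # seed a single infection at the center
    history = []
    for _ in range(steps):
        grid = step(grid, rng)
        history.append([sum(row.count(s) for row in grid)
                        for s in (EMPTY, HOST, INFECTED)])
    return grid, history


final_grid, history = simulate()
```

Each site interacts only with its immediate neighbors, so infection spreads as a local front rather than mixing uniformly. That locality is the spatial structure whose role in adaptive evolution the project investigates.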

The Spread of Antibiotic Resistance Plasmids

Steve Krone

This project also combines stochastic CA mathematical models with laboratory experiments (in collaboration with Dr. Eva Top) to study one of the key mechanisms leading to the rapid spread of antibiotic resistance in bacteria: plasmids - relatively small circular strings of DNA that reside in many bacterial cells but are not part of the cell's chromosome. An amazing feature of plasmids is that, in addition to being passed from mother to daughter cell upon cell division, along with the cell's chromosomal DNA, these extra-chromosomal packages of genetic material can be exchanged between different cells - even between very different species of bacteria. Our work focuses on trying to understand the factors controlling the spread and evolution of plasmids, especially in spatially structured environments. (This work is funded by a grant from NIH.)

November 20, 2009


Friday - 12:30pm
TLC 044

John Bunge - Dept. of Statistical Science, Cornell University:  "Population diversity estimation: The state of the art, and the new software CatchAll"