Professor Steven Brenner
Steven E. Brenner
Computational Biology, Genomics, Protein Function Prediction
Professor
461 Koshland Hall
Berkeley, California 94720-3102
Phone 510.643.9131
Lab Phone 510.642.9614
Fax 510.666.2505

Ph.D.  Biological Sciences    University of Cambridge, MRC Laboratory of Molecular Biology, 1997
A.B.   Biochemical Sciences    Harvard University, 1992

The Brenner research lab has three key research interests involving computational and experimental genomics.

Gene regulation by alternative splicing and nonsense-mediated mRNA decay. Nonsense-mediated mRNA decay (NMD) is a cellular RNA surveillance system that recognizes transcripts with premature termination codons and degrades them. We discovered large numbers of natural alternative splice forms that appear to be targets for NMD, and we speculated that this might be a mode of gene regulation, which we termed RUST (regulated unproductive splicing and translation). All conserved members of the SR family of splice regulators have an unproductive alternative mRNA isoform targeted for NMD [1]. Strikingly, the splice pattern for each is conserved in mouse and always associated with an ultraconserved or highly conserved region of perfect identity between human and mouse. Remarkably, this seems to have evolved independently in every one of the genes, suggesting that this is a natural mode of regulation. We are using RNA-Seq to explore the pervasiveness of NMD in numerous species [2], and to understand its behavior. As part of a modENCODE consortium, we discovered the repertoire of targets for alternative splicing in the fly, as well as unexpected relationships between the development of fly and worm [3,4]. We are now detailing the regulators in the SR family and exploring the evolution of this gene-expression regulation mechanism.

Prediction of protein function using Bayesian phylogenomics. We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Unfortunately, these predictions have littered the databases with erroneous information, for a variety of reasons including the propagation of errors and the systematic flaws in BLAST and related methods. In collaboration with Michael Jordan's group, we have developed a statistical approach to predicting protein function that uses a protein family's phylogenetic tree as the natural structure for representing protein relationships. We overlay on this all known protein functions in the family. We then use a model of function evolution to infer the functions of all other proteins in the family. Even our initial implementations of this method, called SIFTER (statistical inference of function through evolutionary relationships), have performed better than other methods in widespread use [5]. We are presently making numerous improvements to the underlying SIFTER algorithm and enhancing its ability to work on a wide range of data and to incorporate more experimental association data. SIFTER was honored as a top-performing method in the Critical Assessment of Function Annotation [9]. We are collaborating with the ENIGMA project at LBNL to improve annotation on a large scale. In collaboration with Jack Kirsch, we are also experimentally validating the function predictions, with a focus on the Nudix family. We are also involved in maintaining SCOP (Structural Classification of Proteins), a key resource for understanding protein structure data. We therefore analyze structural genomics efforts and guide their future directions [8]. Using kernel methods and selected features, we are building systems to recognize ancient protein evolutionary relationships.

Personal genomics. We have a longstanding interest in personal genome interpretation, including developing a genome commons [6], understanding the basis of Mendelian diseases from sequenced genomes [10], and organizing the Critical Assessment of Genome Interpretation (CAGI) project.

Research

Computational Genomics
Gene regulation by alternative splicing & RNA surveillance.

Nonsense-mediated mRNA decay (NMD) is a cellular RNA surveillance system that recognizes transcripts with premature termination codons and degrades them. Several years ago, we discovered large numbers of natural alternative splice forms that appear to be targets for NMD, and we speculated that this might be a mode of gene regulation, which we termed RUST (regulated unproductive splicing and translation). This seems to be confirmed by our finding that all conserved members of the SR family of splice regulators have an unproductive alternative mRNA isoform targeted for NMD. Strikingly, the splice pattern for each is conserved in mouse and always associated with an ultraconserved or highly conserved region of ~100 or more nucleotides of perfect identity between human and mouse. Remarkably, this seems to have evolved independently in every one of the genes, suggesting that this is a natural mode of regulation. We are using microarray data to explore the pervasiveness of NMD in humans and in Drosophila, in collaboration with Don Rio. As part of a modENCODE consortium, we plan to discover the repertoire of cis-regulatory sites for alternative splicing in insects. Future directions include detailing the regulators in the SR family and exploring the evolution of this gene-expression regulation mechanism.
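To make the idea of a premature termination codon concrete, here is a minimal Python sketch of one commonly used heuristic for flagging candidate NMD targets in mammals, the "50-nucleotide rule": a stop codon lying more than about 50 nt upstream of the last exon-exon junction is treated as premature. The rule, the threshold, and the function names below are assumptions made for illustration; they are not taken from this page and are not necessarily the criteria used in the lab's analyses.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop_end(mrna, cds_start):
    """Return the mRNA coordinate just past the first in-frame stop codon.

    Coordinates are 0-based; returns None if no in-frame stop is found.
    """
    for i in range(cds_start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i + 3
    return None


def is_nmd_candidate(exon_lengths, stop_end, min_distance=50):
    """Apply the 50-nt heuristic (an illustrative assumption).

    exon_lengths: lengths of the exons in transcript order.
    stop_end: mRNA coordinate just past the stop codon.
    A transcript is flagged when the stop codon ends more than
    min_distance nt upstream of the last exon-exon junction.
    """
    if stop_end is None or len(exon_lengths) < 2:
        return False  # no stop codon, or no exon-exon junction at all
    last_junction = sum(exon_lengths[:-1])  # first nt of the last exon
    return last_junction - stop_end > min_distance


# Hypothetical three-exon transcript: the last junction is at position 350.
exons = [200, 150, 300]
premature = is_nmd_candidate(exons, stop_end=250)  # stop is 100 nt upstream
normal = is_nmd_candidate(exons, stop_end=500)     # stop is in the last exon
```

In this toy transcript, a stop codon ending at position 250 sits 100 nt upstream of the last junction and is flagged, while one in the final exon is not.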

References:

1. Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446:926-929. doi:10.1038/nature05676 [PDF 1.3M] [supplementary information .9M]

2. Hansen KD, Lareau LF, Blanchette M, Green RE, Meng Q, Rehwinkel J, Gallusser FL, Izaurralde E, Rio DC, Dudoit S, Brenner SE. 2009. Genome-wide identification of alternative splice forms down-regulated by nonsense-mediated mRNA decay in Drosophila. PLoS Genetics 5:e1000525. doi:10.1371/journal.pgen.1000525 [PDF .5M]

3. Celniker SE et al. 2009. The modENCODE (model organism ENCyclopedia Of DNA Elements) project. Nature 459:927-930. doi:10.1038/459927a [PDF 1M]

Prediction of protein function using Bayesian phylogenomics.
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Unfortunately, these predictions have littered the databases with erroneous information, for a variety of reasons including the propagation of errors and the systematic flaws in BLAST and related methods. In collaboration with Michael Jordan's group, we have developed a statistical approach to predicting protein function that uses a protein family's phylogenetic tree as the natural structure for representing protein relationships. We overlay on this all known protein functions in the family. We then use a model of function evolution to infer the functions of all other proteins in the family. Even our initial implementations of this method, called SIFTER (statistical inference of function through evolutionary relationships), have performed better than other methods in widespread use. We are presently making numerous improvements to the underlying SIFTER algorithm and enhancing its ability to work on a wide range of data. We are collaborating with the Joint Genome Institute and numerous protein databases to improve annotation on a large scale. In collaboration with Jack Kirsch, we are also experimentally validating the function predictions, with a focus on the Nudix family.
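The general idea of propagating known functions over a phylogenetic tree can be illustrated with a toy Python sketch. It is not SIFTER's actual model: it assumes a single binary function (present or absent), a symmetric two-state Markov model of function change along branches, a uniform prior at the root, and a hand-written four-protein tree. All names, rates, and branch lengths are hypothetical.

```python
import math


def transition(t, rate=1.0):
    """Two-state symmetric Markov model: P(child state | parent state)."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]


def partial_likelihood(node, evidence):
    """Felsenstein pruning: likelihood of the evidence below `node`
    for each possible state (0 = lacks function, 1 = has function)."""
    if isinstance(node, str):  # leaf, identified by protein name
        state = evidence.get(node)
        if state is None:      # unannotated protein: no constraint
            return [1.0, 1.0]
        return [1.0 if s == state else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, branch_length in node:  # internal node: list of children
        child_like = partial_likelihood(child, evidence)
        p = transition(branch_length)
        for parent in (0, 1):
            like[parent] *= sum(p[parent][c] * child_like[c] for c in (0, 1))
    return like


def posterior_has_function(tree, annotations, target):
    """Posterior probability that the unannotated leaf `target` has the function."""
    joint = []
    for s in (0, 1):
        evidence = dict(annotations)
        evidence[target] = s   # clamp the target leaf to state s
        root = partial_likelihood(tree, evidence)
        joint.append(0.5 * root[0] + 0.5 * root[1])  # uniform root prior
    return joint[1] / (joint[0] + joint[1])


# Hypothetical family: A and B are close relatives, as are C and D.
tree = [([("A", 0.1), ("B", 0.1)], 0.5), ([("C", 0.1), ("D", 0.1)], 0.5)]
annotations = {"A": 1, "C": 0}  # A is known to have the function, C to lack it
post_b = posterior_has_function(tree, annotations, "B")
post_d = posterior_has_function(tree, annotations, "D")
```

In this toy family, B inherits most of its belief from its close relative A, so its posterior is above 0.5, while D, next to the unannotated-negative C, falls below 0.5.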

Reference:

Engelhardt BE, Jordan MI, Muratore KE, Brenner SE. 2005. Protein molecular function prediction by Bayesian phylogenomics. PLoS Comput Biol 1:432-445. doi:10.1371/journal.pcbi.0010045 [PDF 1.4M]

Medical and environmental metagenomics; personal genomics.

The Sorcerer II Global Ocean Sampling project revealed millions of new putative protein sequences, arguably doubling the known repertoire of proteins. We collaborated with the Venter Institute in the analysis of these proteins, understanding how they differ from those previously seen, and discovering ancient relationships amongst them. We are developing a new binning method that will help assign individual sequence reads and contigs to clades, and we are collaborating with Jill Banfield to apply this to the acid mine drainage community. Our initial medical/metagenomics project is to understand the role of gut microbiota in Crohn's disease. Crohn's disease has long been known to be associated with microbial communities in the intestine, but the exact etiology has been unclear. By explicitly sampling these communities, we aim to better understand how they cause disease. In addition, by studying how gut flora change during the withdrawal of long-term antibiotics, we hope to gain insight into the action of these drugs on the intestinal microbiota. We also have a longstanding interest in personal genome interpretation and developing a genome commons.

References:

1. Yooseph S et al. 2007. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biology 5:e16. doi:10.1371/journal.pbio.0050016 [PDF 3M]

2. Brenner SE. 2007. Common sense for our genomes. Nature 449:783-784. doi:10.1038/449783a [PDF .2M]

Structural genomics and protein complexes.
Structural genomics ultimately aims to provide an experimental structure or a high-quality model for every protein. We are involved in maintaining the SCOP (Structural Classification of Proteins) and ASTRAL databases, which are key resources for accessing and understanding protein structure data. We therefore analyze structural genomics efforts and guide their future directions. Using kernel methods and selected features, we are building systems to recognize ancient protein evolutionary relationships. We are also involved in the Protein Complex Analysis Project, which uses mass spectrometry, electron microscopy, and electron tomography to understand protein complexes and their cellular distribution.

Reference:

Chandonia JM, Brenner SE. 2006. The impact of structural genomics: expectations and outcomes. Science 311:347-351. doi:10.1126/science.1121018 [PDF .2M] [supporting material 1.2M]

Recent Publications

Full list of publications

Selected Publications:
Radivojac P, et al. 2013. A large-scale evaluation of computational protein function prediction. Nature Methods 10:221-227. doi:10.1038/nmeth.2340 [PDF 720K] [supplementary information (3.0M)]
Mallot J, Kwan A, Church J, Gonzalez D, Lorey F, Tang LF, Sunderam U, Rana S, Srinivasan R, Brenner SE, Puck J. 2012. Newborn screening for SCID identifies patients with ataxia telangiectasia. Journal of Clinical Immunology 33:540-549. doi:10.1007/s10875-012-9846-1 [PDF 445K]
Soergel DA, Dey N, Knight R, Brenner SE. 2012. Selection of primers for optimal taxonomic classification of environmental 16S sequences. The ISME Journal 6:1440-1444. doi:10.1038/ismej.2011.208 [PDF 1.9M] [supplementary information 1(395K) 2(895K) fig. 1(174K) supplementary tables (595K)]
Engelhardt BE, Jordan MI, Srouji JR, Brenner SE. 2011. Genome-scale phylogenetic function annotation of large and diverse protein families. Genome Research 21:1969-1980. doi:10.1101/gr.104687.109 [PDF 1.2M]
Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin J, Yang L, et al. 2011. The developmental transcriptome of Drosophila melanogaster. Nature 471:473-479. doi:10.1038/nature09715 [PDF 1.9M] [supplementary information (3.8M) supplementary tables 1-12 (11.7M) 13-18 (23M) 19-26 (19M) 27-34 (18M)]
Brooks AN, Yang L, Duff MO, Hansen KD, Dudoit S, Brenner SE, Graveley BR. 2011. Conservation of an RNA regulatory map between Drosophila and mammals. Genome Research 21:193-202. doi:10.1101/gr.108662.110 [PDF .8M] [supplementary information (2.1MB) datasets 1 (4K) 2 (2K) 3 (471K) 4 (6K) 5 (4K) spa5.pl]
Hansen KD, Lareau LF, Blanchette M, Green RE, Meng Q, Rehwinkel J, Gallusser FL, Izaurralde E, Rio DC, Dudoit S, Brenner SE. 2009. Genome-wide identification of alternative splice forms down-regulated by nonsense-mediated mRNA decay in Drosophila. PLoS Genetics 5:e1000525. doi:10.1371/journal.pgen.1000525 [PDF .5M]
Brenner SE. 2007. Common sense for our genomes. Nature 449:783-784. doi:10.1038/449783a [PDF .2M]
Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446:926-929. doi:10.1038/nature05676 [PDF 1.3M] [supplementary information .9M]
Lareau LF, Brooks AN, Soergel DAW, Meng Q, Brenner SE. 2007. The coupling of alternative splicing and nonsense mediated mRNA decay. in Blencowe B & Graveley B, eds. Alternative splicing in the postgenomic era. Landes Biosciences. 191-212. http://www.landesbioscience.com/curie/chapter/3531 [PDF 2.4M]
Yooseph S, ... (13 authors) ..., Mashiyama ST, Joachimiak MP, van Belle C, Chandonia JM, Soergel DA, ... (6 authors) ..., Brenner SE, ... (6 authors) ..., Venter JC. 2007. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biology 5:e16. doi:10.1371/journal.pbio.0050016 [PDF 3M] [Cover PDF 19M]
Chandonia JM, Brenner SE. 2006. The impact of structural genomics: expectations and outcomes. Science 311:347-351. doi:10.1126/science.1121018 [PDF .2M] [supporting material 1.2M]
Engelhardt BE, Jordan MI, Muratore KE, Brenner SE. 2005. Protein molecular function prediction by Bayesian phylogenomics. PLoS Computational Biology 1:e45. doi:10.1371/journal.pcbi.0010045 [PDF 1.4M] [Cover PDF 5M] [Cover image 4M]
Crooks GE, Wolfe J, Brenner SE. 2004. Measurements of protein sequence-structure correlations. Proteins: Structure, Function, and Bioinformatics 57:804-810. doi:10.1002/prot.20262 [PDF .18M]
Lareau LF, Green RE, Bhatnagar RS, Brenner SE. 2004. The evolving roles of alternative splicing. Current Opinion in Structural Biology 14:273-282. doi:10.1016/j.sbi.2004.05.002 [PDF .18M]
Hillman RT, Green RE, Brenner SE. 2004. An unappreciated role for RNA surveillance. Genome Biology 5:R8.1-R8.16. doi:10.1186/gb-2004-5-2-r8 [PDF .36M]
Lewis BP, Green RE, Brenner SE. 2003. Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proceedings of the National Academy of Sciences of the United States of America 100:189-192. doi:10.1073/pnas.0136770100 [PDF .25M]
Green RE, Brenner SE. 2002. Bootstrapping and normalization for enhanced evaluations of pairwise sequence comparison. Proceedings of the IEEE 90:1834-1847. doi:10.1109/JPROC.2002.805303 [PDF 1.6M]
Brenner SE. 1999. Errors in genome annotation. Trends in Genetics 15:132-133. [PDF .16M]
Brenner SE, Hubbard T, Murzin A, Chothia C. 1995. Gene duplications in H. influenzae. Nature 378:140. [OCR PDF 1.43M]
Murzin AG, Brenner SE, Hubbard T, Chothia C. 1995. SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247:536-540. [PDF 3.4M]
 
 

Honors and Awards

Overton Prize - International Society for Computational Biology - 2010
Fellow - American Association for the Advancement of Science - 2008
Miller Research Professorship - Miller Institute for Basic Research in Science - 2007
Young Faculty/Cooperative Extension Specialist Award - College of Natural Resources - 2004
Searle Scholar - Searle Scholars Program - 2001