Bioinformatics and Statistics

Module Leaders
Maureen Pittman
Jean Costello

George Hartoularos
Jared Lumpe
Christa Caggiano
Douglas Myers-Turnbull

Science is the process of building knowledge by testing hypotheses, and statistics is the mathematical toolkit we use to test them. This makes statistics one of the most broadly applicable disciplines in science, and it is relevant even outside traditional science. While we can't give comprehensive coverage in this short time, we hope to expose you to the types of questions that can be asked with statistics.

Broadly, bioinformatics deals with the development and application of tools to make inferences about biology from data. Often such data include the genomes of one or many organisms. When dealing with such vast datasets, considerations of speed and statistical validity become very relevant.

Key Questions
How do you uncover factors--genetic, non-genetic, and combinations thereof--that underlie a biological trait?
How can the algorithmic and statistical concepts of bioinformatics be broadly applied to the study of biological systems?


Tuesday 9/5

9:00-9:10 Introduction and Agenda

9:10-10:00 Introduction to Statistics (Maureen)

10:00-10:30 Journal Club Meet Up

10:30 - 10:50 Journal Club 1

10:50 - 11:10 Journal Club 2

11:10-12:00 Blitz talks 1 and 2 (Matt and George)

12:00-1:00 Lunch

1:00-1:45 Machine learning (Jared)

1:50-2:10 Journal Club 3

2:10-2:30 Journal Club 4

2:30-2:45 Break

2:50-3:10 Journal Club 5

3:15-4:00 Blitz talks 3 and 4 (Douglas and Christa)

4:05-4:25 Journal Club 6

4:30-5:00 Break

Journal Club 1

Sequencing - Laura, Laurel, Bryan

Gantz, Valentino M., Nijole Jasinskiene, Olga Tatarenkova, Aniko Fazekas, Vanessa M. Macias, Ethan Bier, and Anthony A. James. "Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi." Proceedings of the National Academy of Sciences 112, no. 49 (2015): E6736-E6743.

Journal Club 2

Electronic Health Records - Stephanie, Maria, Elissa

Denny, Joshua C., Lisa Bastarache, Marylyn D. Ritchie, Robert J. Carroll, Raquel Zink, Jonathan D. Mosley, Julie R. Field et al. "Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data." Nature biotechnology 31, no. 12 (2013): 1102.

Journal Club 3

Statistical genetics - Garrett, Calla, Daniel

Storey, John D., and Robert Tibshirani. "Statistical significance for genomewide studies." Proceedings of the National Academy of Sciences 100, no. 16 (2003): 9440-9445.

Journal Club 4

Network analysis - Christina, Ajikarunia, Miriam

Goh, Kwang-Il, Michael E. Cusick, David Valle, Barton Childs, Marc Vidal, and Albert-László Barabási. "The human disease network." Proceedings of the National Academy of Sciences 104, no. 21 (2007): 8685-8690.

Journal Club 5

Computational immunology - Matthew, Hayarpi, Maurisa

Good, Zinaida, Jolanda Sarno, Astraea Jager, Nikolay Samusik, Nima Aghaeepour, Erin F. Simonds, Leah White et al. "Single-cell developmental classification of B cell precursor acute lymphoblastic leukemia at diagnosis reveals predictors of relapse." Nature medicine 24, no. 4 (2018): 474.

Journal Club 6

Microbiome - Nicholas, Elizabeth, Jack

Martínez-del Campo, Ana, Smaranda Bodea, Hilary A. Hamer, Jonathan A. Marks, Henry J. Haiser, Peter J. Turnbaugh, and Emily P. Balskus. "Characterization and detection of a widely distributed gene cluster that predicts anaerobic choline utilization by human gut bacteria." MBio 6, no. 2 (2015): e00042-15.

Blitz lecture 1 - Matt, single cell RNA-seq
Blitz lecture 2 - George, functional genomics
Blitz lecture 3 - Douglas, imaging
Blitz lecture 4 - Christa, population genetics

Labs at UCSF in Bioinformatics & Genomics
Nadav Ahituv -- Gene Enhancers and Genomic Regulatory Sequences in Disease and Biology
Sourav Bandyopadhyay -- Network-Based Analysis of Cancer Cell-Line Exposures for Therapeutic Insights
Sergio Baranzini -- Complex Trait Genomics and Methods Development for GWAS and Data Integration
Atul Butte -- Biomarker and Drug Discovery, Translational Informatics, Complex Trait Genomics
Joe DeRisi -- Genomics of Infectious Disease
Kathy Giacomini -- Pharmacogenomics
Hani Goodarzi -- Post-transcriptional (RNA) Regulation of Cancer Metastasis
Michael Keiser -- Deep learning for drug discovery and diagnostics
Hao Li -- Gene Regulatory-Based eQTL Modeling of Genes and Pathways in Complex Trait Genomics
Michael McManus -- Non-coding RNAs and Epigenomics
Katie Pollard -- Comparative Genomics, Human Microbiome and Metagenomics, Phylogenomics, Population Genetics
Neil Risch -- Population Genomics, Statistical Genetics, Genetic Epidemiology, Complex Trait Genomics
Mark Segal -- Genome 3D Structure, Biostatistics
Mark Seielstad -- Complex Trait Genomics, Population Genetics
Peter Turnbaugh -- Impact of the Human Microbiome on Pharmacology and Nutrition
Jeff Wall -- Evolutionary Genomics, Comparative Genomics, Genomic Recombination
John Witte -- Genetic Epidemiology, Complex Trait Genomics, Statistical Genetics, Cancer Genomics
Jimmie Ye -- Gene by Environment Interactions, Genomic Regulation of Molecular Phenotypes
Elad Ziv -- Application of Population Genetics to the Study of Complex Traits in Humans