Veridical Data Science for biomedical discovery: detecting epistatic interactions with epiTree

Series
Department of Statistics
Bin Yu, Chancellor's Professor, Departments of Statistics and Electrical Engineering and Computer Science, UC Berkeley, gives a seminar for the Department of Statistics.
'A.I. is like nuclear energy - both promising and dangerous' - Bill Gates, 2019.
Data Science is a pillar of A.I. and has driven most of the recent cutting-edge discoveries in biomedical research. In practice, Data Science has a life cycle (DSLC) that includes problem formulation, data collection, data cleaning, modeling, result interpretation and the drawing of conclusions. Human judgment calls are ubiquitous at every step of this process, e.g., in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgment calls are often responsible for the "dangers" of A.I. To mitigate these dangers as far as possible, we developed a framework based on three core principles: Predictability, Computability and Stability (PCS). Through a workflow and documentation (in R Markdown or Jupyter Notebook) that allows one to manage the whole DSLC, the PCS framework unifies, streamlines and expands on the best practices of machine learning and statistics, bringing us a step closer to veridical Data Science.
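As a rough illustration of the Stability principle only, the sketch below refits a simple model under repeated data perturbations and reports how much the fitted quantity moves. Everything in it is an assumption made for illustration and is not taken from the lecture: the toy data, the choice of a least-squares slope as the quantity of interest, the bootstrap as the perturbation, and the function names `fit_slope` and `slope_stability`.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Ordinary least-squares slope of y on x (hypothetical helper)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_stability(xs, ys, n_perturbations=200, seed=0):
    """Refit the slope on bootstrap resamples and return a 5%-95% range.

    The bootstrap stands in for one possible data perturbation; a fuller
    analysis would also vary cleaning choices and algorithms.
    """
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_perturbations):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(fit_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    slopes.sort()
    lower = slopes[int(0.05 * n_perturbations)]
    upper = slopes[int(0.95 * n_perturbations) - 1]
    return lower, upper


# Toy data (an assumption): y depends linearly on x with a true slope of 2.
data_rng = random.Random(1)
xs = [data_rng.uniform(0.0, 1.0) for _ in range(100)]
ys = [2.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]

lower, upper = slope_stability(xs, ys)
print(f"slope range across perturbations: [{lower:.3f}, {upper:.3f}]")
```

A narrow range suggests the conclusion does not hinge on the particular sample; a wide range would be a signal to revisit the judgment calls made earlier in the life cycle.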
In this lecture, we will illustrate the PCS framework through epiTree, a pipeline to discover epistatic interactions from genomics data. epiTree addresses issues of scaling of penetrance through decision trees, significance calling through PCS p-values, and combinatorial search over interactions through iterative random forests (which are a special case of PCS). Using UK Biobank data, we validate the epiTree pipeline through an application to the red-hair phenotype, where several genes are known to display epistatic interactions.
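To make the "scaling of penetrance" issue concrete, here is a minimal sketch showing that whether two loci appear to interact can depend on the scale on which penetrance is measured. The penetrance values are invented for illustration, and the function `interaction_contrast` is a hypothetical helper; this illustrates the problem only and is not the epiTree method.

```python
import math


def interaction_contrast(penetrance, transform=lambda v: v):
    """Two-locus interaction contrast on a chosen scale.

    penetrance maps (variant_at_locus_a, variant_at_locus_b) to the
    probability of the phenotype. A contrast of zero means the two loci
    act additively on the scale defined by `transform`.
    """
    return (
        transform(penetrance[(1, 1)])
        - transform(penetrance[(1, 0)])
        - transform(penetrance[(0, 1)])
        + transform(penetrance[(0, 0)])
    )


# Invented penetrance table: locus A doubles the risk, locus B triples it.
penetrance = {
    (0, 0): 0.01,
    (1, 0): 0.02,
    (0, 1): 0.03,
    (1, 1): 0.06,
}

additive_scale = interaction_contrast(penetrance)
log_scale = interaction_contrast(penetrance, math.log)

print(f"contrast on the probability scale: {additive_scale:.4f}")
print(f"contrast on the log scale: {log_scale:.4f}")
```

On the probability scale the contrast is nonzero, which looks like an interaction; on the log scale the same table is purely additive. Any interaction claim therefore carries an implicit choice of scale.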

Transcript Available

Episode Information

Series: Department of Statistics
People: Bin Yu
Keywords: statistics, ai, computing, artificial intelligence
Department: Department of Statistics
Date Added: 26/02/2021
Duration: 01:01:58


© 2011-2022 The University of Oxford