Abstract
Machine Learning (ML) systems are being used to make critical societal, scientific, and business decisions. To promote trust, transparency, and accountability in these systems, many advocate making them interpretable or explainable. In response, there has been dramatic growth in techniques that provide human-understandable interpretations of black-box models. Yet we ask: Can we trust these ML interpretations? How do we know if they are correct? Unlike for prediction tasks, it is difficult to directly test the veracity of ML interpretations. In this talk, we focus on interpreting predictive models to understand important features and important feature patterns. We first present motivating results from a large-scale empirical stability study illustrating that feature interpretations are generally unreliable and far less reliable than predictions. To address these issues, we present a new statistical inference framework for quantifying the uncertainty in feature importance and higher-order feature patterns. Our approach allows one to test whether a feature significantly contributes to any ML model’s predictive ability in a completely distribution-free manner, thus promoting trust in ML feature interpretations. We highlight our inference for interpretable ML approaches via real scientific case studies and a fun illustrative example.
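The talk's framework itself is not described in detail here, but the general idea of testing whether a feature contributes to a model's predictive ability can be sketched with a simple permutation-style check: shuffle one feature to break its association with the target and see whether the model's error increases. This is only a rough, illustrative sketch of that general idea, not the speaker's method; all names and the toy model below are assumptions for illustration.

```python
import numpy as np

def permutation_importance(model_predict, X, y, feature_idx, n_repeats=100, seed=0):
    """Measure how much the model's mean squared error rises when one
    feature's values are randomly shuffled, severing its link to y."""
    rng = np.random.default_rng(seed)
    baseline_mse = np.mean((model_predict(X) - y) ** 2)
    error_increases = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        X_perm[:, feature_idx] = rng.permutation(X_perm[:, feature_idx])
        error_increases.append(np.mean((model_predict(X_perm) - y) ** 2) - baseline_mse)
    error_increases = np.asarray(error_increases)
    # Fraction of shuffles where the error did NOT increase; small values
    # suggest the feature genuinely matters for prediction.
    frac_no_increase = np.mean(error_increases <= 0)
    return error_increases.mean(), frac_no_increase

# Toy example: y depends on feature 0 but not on feature 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = 3 * X[:, 0] + 0.1 * rng.normal(size=500)
predict = lambda X: 3 * X[:, 0]  # stand-in for a fitted model

imp0, frac0 = permutation_importance(predict, X, y, feature_idx=0)
imp1, frac1 = permutation_importance(predict, X, y, feature_idx=1)
# imp0 is large (feature 0 drives y); imp1 is essentially zero.
```

Shuffling the informative feature sharply degrades accuracy, while shuffling the irrelevant one leaves it unchanged, which is the intuition behind more rigorous, distribution-free feature significance tests.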
Biography
Dr. Genevera Allen is a professor of statistics at Columbia University and an investigator at the Zuckerman Institute. Before joining Columbia, she served as an assistant and associate professor of Electrical and Computer Engineering, Statistics, and Computer Science at Rice University and as an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and Baylor College of Medicine. She is also the Founding Director of the Rice Center for Transforming Data to Knowledge. Dr. Allen received her Ph.D. in statistics in 2010 from Stanford University under the mentorship of Prof. Robert Tibshirani.
Dr. Allen’s research develops new statistical machine learning tools to help people make reliable data-driven discoveries. She is known for her methods and theory work in the areas of unsupervised learning, interpretable machine learning, data integration, graphical models, and high-dimensional statistics. Dr. Allen is the recipient of several honors for both her research and educational efforts, including a National Science Foundation CAREER Award, Rice University’s Duncan Achievement Award for Outstanding Faculty, the Curriculum Innovation Award, and the School of Engineering’s Research and Teaching Excellence Award. In 2014, she was named to the “Forbes ‘30 under 30’: Science and Healthcare” list. She is also an elected member of the International Statistical Institute and an elected fellow of the American Statistical Association and of the Institute of Mathematical Statistics.