Baron Peters

Talk: Importance learning: When tails dominate the signal 

This talk considers machine learning in situations where the fringes of a population are critical to accurate predictions. It is motivated by amorphous heterogeneous catalysts, where (i) the disorder is quenched and its nature is unknown; (ii) each active site has a unique local environment and activity; and (iii) sites that contribute appreciably to the observed reactivity are rare, often less than ∼10% of the overall population. For these systems, standard machine learning approaches to predicting site-averaged kinetics have poor data efficiency because the training data acquisition procedures sample typical (and therefore kinetically inactive) sites. For some properties, straightforward averages even give biased estimates. We present a new algorithm that combines machine learning (kernel regression) with importance sampling (Metropolis-Hastings) to efficiently learn the distribution of activation energies and predict site-averaged activities for amorphous catalysts. The algorithm should be useful in many situations where atypical members of a population have an outsized influence on the quantity of interest.
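The combination of kernel regression and Metropolis-Hastings sampling described above can be illustrated with a small toy model. The sketch below is not the speaker's implementation. Everything in it is an assumption made for illustration: a one-dimensional site descriptor x drawn from a standard normal population, a hypothetical activation energy Ea(x) = 20 - 3x in units of kT, a Nadaraya-Watson kernel regressor, and all function names and parameter values. The kernel model steers a Metropolis-Hastings chain toward sites with low predicted activation energy, the "expensive" true energy is evaluated only at the sampled sites, and self-normalized importance weights correct for model error when estimating the rate-weighted mean activation energy. For this toy model the population mean of Ea is 20, while the exact rate-weighted mean is 11, because the sites that dominate the rate sit about three standard deviations into the tail.

```python
import numpy as np


def true_activation_energy(x):
    """Hypothetical stand-in for an expensive calculation (energies in units of kT)."""
    return 20.0 - 3.0 * x


def kernel_predict(x_query, x_train, e_train, bandwidth=0.3):
    """Nadaraya-Watson kernel regression with a Gaussian kernel."""
    d = (np.atleast_1d(x_query)[:, None] - x_train[None, :]) / bandwidth
    log_w = -0.5 * d**2
    log_w -= log_w.max(axis=1, keepdims=True)  # avoid underflow far from the data
    w = np.exp(log_w)
    return (w @ e_train) / w.sum(axis=1)


def metropolis_hastings(log_target, x0, n_steps, step, rng):
    """Random-walk Metropolis-Hastings sampler for a 1-D unnormalized log density."""
    x, lp = x0, log_target(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = x + step * rng.normal()
        lp_proposal = log_target(proposal)
        if np.log(rng.random()) < lp_proposal - lp:  # accept or reject
            x, lp = proposal, lp_proposal
        samples[i] = x
    return samples


def importance_learning(n_init=20, n_rounds=4, n_per_round=100, seed=0):
    """Return (naive mean Ea, rate-weighted mean Ea) for the toy model."""
    rng = np.random.default_rng(seed)

    # Initial training set: typical sites drawn from the population N(0, 1).
    x_train = rng.normal(size=n_init)
    e_train = true_activation_energy(x_train)
    naive_mean = e_train.mean()

    for _ in range(n_rounds):
        # Target density: population density times exp(-predicted Ea / kT).
        def log_target(x, xt=x_train, et=e_train):
            return -0.5 * x**2 - kernel_predict(x, xt, et)[0]

        start = x_train[np.argmin(e_train)]
        chain = metropolis_hastings(log_target, start, 500 + 20 * n_per_round, 1.0, rng)
        x_new = chain[500::20]  # drop burn-in, then thin the chain

        # Evaluate the "expensive" energy only at the importance-sampled sites.
        e_model = kernel_predict(x_new, x_train, e_train)
        e_new = true_activation_energy(x_new)

        # Self-normalized weights exp(E_model - E_true) correct for model error.
        log_w = e_model - e_new
        w = np.exp(log_w - log_w.max())
        weighted_mean = np.sum(w * e_new) / np.sum(w)

        # Retrain on the enlarged data set in the next round.
        x_train = np.concatenate([x_train, x_new])
        e_train = np.concatenate([e_train, e_new])

    return naive_mean, weighted_mean


if __name__ == "__main__":
    naive, weighted = importance_learning()
    print(f"naive mean Ea over typical sites: {naive:.2f} kT")
    print(f"rate-weighted mean Ea (importance sampled): {weighted:.2f} kT")
```

The importance weights keep the estimate consistent even while the kernel model is still inaccurate, so a poor early model costs statistical efficiency rather than correctness. In a realistic setting the descriptor would be high-dimensional and the energy evaluation would be an electronic-structure calculation, neither of which this sketch attempts to represent.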


Return to the TAU-UIUC Workshop Home Page
