Session: Machine Learning for Process and Product Chemistry
Large Open Datasets and Graph Neural Networks for Generalizable Models in Catalysis
Zachary Ulissi, Associate Professor, Carnegie Mellon University
Machine-learning-accelerated catalyst discovery has seen much progress in the last few years. Datasets of computational calculations have improved, models that connect surface structure with electronic structure or adsorption energies have become more sophisticated, and active learning exploration strategies are becoming routine in discovery efforts. However, several large challenges remain: to date, models have had trouble generalizing to new materials or reaction intermediates, and applying these methods requires significant training.
I will briefly introduce the Open Catalyst Project and the Open Catalyst 2020 dataset, a collaborative project to span surface composition, structure, and chemistry and enable a new generation of deep machine learning models for catalysis. I will then discuss initial results for state-of-the-art deep graph convolutional models and significant recent progress from others in the community, much of which is likely to improve models in related materials science areas. As an example application, I will show how these efforts are already helping to accelerate new catalyst simulations and transfer learning across relevant datasets and tasks.
Zachary Ulissi is an Associate Professor of Chemical Engineering at Carnegie Mellon University. He completed a BE in Chemical Engineering and a BS in Physics at the University of Delaware. He then did an M.A.St. in Applied Mathematics at the University of Cambridge and a PhD in Chemical Engineering at MIT. He did postdoctoral research with Jens Nørskov at Stanford and the SLAC National Accelerator Laboratory. He works on the development and application of high-throughput computational methods in catalysis, machine learning models to predict catalyst properties, and active learning methods to guide these systems. Applications include energy materials, CO2 utilization, fuel cell development, and additive manufacturing.
Data-driven design of concrete with amortized Gaussian processes and multi-objective optimization
Kristen Severson, Senior Researcher, Microsoft
Concrete is the most widely used building material in the world, with an estimated global annual production of 30 billion metric tons. Largely because of this scale, the concrete industry is estimated to produce approximately 8% of all global CO2 emissions; decreasing the carbon footprint of the concrete industry is therefore an important consideration for the global decarbonization effort.
In this talk, I will present a method to design concretes with decreased global warming potential using an optimization framework that relies on quality attributes estimated from an amortized Gaussian process model. Using industrial data, our approach proposed novel concrete formulations with 60% reductions in climate impact.
Kristen Severson is a senior researcher at Microsoft Research in New England. Her research focuses on using domain knowledge in the design and implementation of machine learning models with applications in healthcare and materials research. Before joining Microsoft, Kristen was a research staff member at IBM. Kristen received her BS from Carnegie Mellon University and her PhD from MIT.