(2ge) Multi-Fidelity Computer-Aided Molecular Design

Conference

AIChE Annual Meeting

Year

2023

Proceeding

2023 AIChE Annual Meeting

Group

Meet the Candidates Poster Sessions

Session

Meet the Faculty and Post-Doc Candidates Poster Session

Time

Sunday, November 5, 2023 - 1:00pm to 3:00pm

Authors

Greenman, K. P. - Presenter, Massachusetts Institute of Technology

Research Interests

My interests and expertise are in combining machine learning with physics-based calculations and high-throughput experiments for molecular design and discovery.

Prior Work

In my PhD work with Profs. Rafael Gómez-Bombarelli and William Green at MIT, I have worked on several projects, including:

(1) Developed multi-fidelity deep learning models to achieve state-of-the-art predictions of absorption wavelength of dye molecules in a variety of solvents, based on a combination of data from time-dependent density functional theory and experiments.

(2) Integrated models from (1) into a closed-loop active learning framework with high-throughput synthesis and characterization experiments, in collaboration with Profs. Klavs Jensen, Regina Barzilay, and Tommi Jaakkola. Used exploration to improve models in high-uncertainty regions of chemical space, and exploitation to discover novel dye molecules with desired properties.

(3) Developed a tool for extracting domain-specific molecular structures from US patents using keyword queries. Used these structures and high-throughput density functional theory calculations to train generative models to propose novel organic photodiodes.

(4) Served as a lead maintainer of the open-source Chemprop package for molecular property prediction, with over 1000 stars on GitHub.

I also completed a micro-internship at Microsoft Research New England, where I created a benchmark of uncertainty quantification strategies for protein engineering. I applied these uncertainty methods in active learning and Bayesian optimization to identify the effect of domain shift on these tasks.

Additionally, I completed an internship at Eli Lilly, where I studied multi-fidelity modeling of bioactivity, as well as impurity prediction for deep learning forward synthesis models.

Future Vision

Building on my previous work, my research group will initially address the following aims:

(1) Create molecular benchmark datasets for the study of multi-fidelity, multi-objective, batch active learning.

Molecular design is typically a multi-objective task, and training data for the machine learning models used for this task are often available at more than one level of fidelity with a cost-accuracy tradeoff (e.g., theoretical calculations and experiments). Furthermore, it is often advantageous to choose experiments with active learning rather than random search, because this reduces cost. When active learning is used to propose the new measurements that will be most informative to a model, these should ideally be proposed in batches to be compatible with high-throughput experimental setups. A large body of work exists on single-fidelity modeling for single objectives, and single-sample active learning is also routinely studied in chemical design. Recent work has begun to extend design to the multi-fidelity, multi-objective, batch case, but many challenges remain. My group will develop benchmark datasets and evaluate state-of-the-art approaches on these benchmarks to enable more efficient design and discovery of new molecules.
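As a rough illustration of batch acquisition, the sketch below ranks candidate molecules by the disagreement of an ensemble of models and returns the top few as one batch. It is a minimal baseline under stated assumptions, not the method used in the work described here: the function name `select_batch`, the use of ensemble variance as the uncertainty measure, and the example predictions are all hypothetical.

```python
import statistics


def select_batch(ensemble_predictions, batch_size):
    """Return indices of the `batch_size` candidates with the highest ensemble variance.

    ensemble_predictions[i] holds the predictions of each ensemble member
    for candidate i (an assumed data layout, chosen for illustration).
    """
    uncertainties = [statistics.pvariance(preds) for preds in ensemble_predictions]
    ranked = sorted(
        range(len(uncertainties)),
        key=lambda i: uncertainties[i],
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical absorption-wavelength predictions (nm) from a three-member ensemble.
candidate_predictions = [
    [500.0, 502.0, 498.0],  # members agree closely
    [610.0, 650.0, 570.0],  # members disagree strongly
    [455.0, 455.0, 456.0],  # members agree closely
    [700.0, 720.0, 690.0],  # members disagree moderately
]

batch = select_batch(candidate_predictions, batch_size=2)
print(batch)
```

Top-k selection by uncertainty ignores redundancy within a batch, so two nearly identical molecules can both be chosen. Handling that redundancy, together with multiple objectives and multiple fidelities, is part of what makes the batch case harder than the single-sample case.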

(2) Design new molecular optical probes for biomedical imaging applications.

Molecular optical probes can be used for biomedical imaging and guided surgery. Applications typically require the molecules to satisfy several design constraints, including peak absorption and emission wavelength, Stokes shift, quantum yield, solubility, and toxicity. My group will use the techniques studied in aim (1) to design novel optical probes with improved properties. In particular, we will focus on molecules that absorb or emit near-infrared light because biological tissue is more transparent in this range of the spectrum.
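The design constraints above can be read as a screening filter over candidate molecules. The sketch below shows one hypothetical way to encode a few of them. The `Probe` class, the threshold values, the 650-900 nm window, and the example candidates are illustrative assumptions and do not come from this work. The Stokes shift is taken here as the difference between the peak emission and peak absorption wavelengths.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    name: str
    absorption_nm: float  # peak absorption wavelength
    emission_nm: float    # peak emission wavelength
    quantum_yield: float  # fraction between 0 and 1

    @property
    def stokes_shift_nm(self):
        # Difference between peak emission and peak absorption wavelengths.
        return self.emission_nm - self.absorption_nm


def meets_criteria(probe, nir_window=(650.0, 900.0), min_stokes_nm=20.0, min_qy=0.1):
    """Check a probe against illustrative (assumed) design thresholds."""
    low, high = nir_window
    # Accept probes that absorb or emit inside the assumed near-infrared window.
    in_nir = (low <= probe.absorption_nm <= high) or (low <= probe.emission_nm <= high)
    return (
        in_nir
        and probe.stokes_shift_nm >= min_stokes_nm
        and probe.quantum_yield >= min_qy
    )


# Hypothetical candidates; the values are made up for illustration.
candidates = [
    Probe("probe_a", 780.0, 810.0, 0.12),
    Probe("probe_b", 490.0, 515.0, 0.90),  # outside the assumed window
    Probe("probe_c", 700.0, 710.0, 0.30),  # Stokes shift too small
]

selected = [p.name for p in candidates if meets_criteria(p)]
print(selected)
```

Solubility and toxicity are omitted because they do not reduce to a simple numeric threshold in the same way. In practice each property would come from a model prediction with its own uncertainty rather than from a known value.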

(3) Develop tools for robust natural language explanations of uncertainty quantification, acquisition, and design choices.

Explainability tools for machine learning allow for more interpretable predictions/decisions and can build trust among chemical domain experts. We will integrate state-of-the-art attribution methods with large language models and study how to produce robust natural-language explanations of machine learning models based on user questions. Our work will focus on explaining why a model is uncertain for a particular prediction or in a general region of chemical space, why a particular batch of molecules was acquired during active learning, and how a molecule can be modified to fit design criteria.

Teaching Interests

I would be glad to teach any core undergraduate or graduate chemical engineering course, and I would be particularly excited about teaching numerical methods, fluid mechanics, or kinetics and reactor design. In addition, my previous research and teaching experience has prepared me to teach electives in molecular modeling / computational chemistry or machine learning / data science.

I have participated in several pedagogical training opportunities and have earned certificates in subject design, lesson planning, and inclusive teaching from MIT's Teaching and Learning Lab. I have also put this training into practice as an instructional aide for undergraduate fluid mechanics, as a teaching assistant for graduate machine learning for molecular design, and as a mentor of five undergraduate students on research projects. I won first place in the MIT chemical engineering department's 2021 teach-off competition for a lesson I prepared and taught.

As an undergraduate student, I led the creation of the computational curriculum for a new course to reduce barriers to undergraduate research (in collaboration with faculty, graduate students, and other undergraduates). I published an article on this course in the Chemical Engineering Education journal. In the future, I plan to explore opportunities to incorporate artificial intelligence tools (including large language models such as ChatGPT) into the chemical engineering curriculum and to conduct pedagogical research on this topic.

Topics

Computational Molecular Engineering
