(218g) Classifying the Toxicity of Pesticides to Honey Bees Via Support Vector Machines with Random Walk Graph Kernels
AIChE Annual Meeting
2022
Topical Conference: Applications of Data Science to Molecules and Materials
Applications of Data Science in Molecular Sciences II
Monday, November 14, 2022 - 5:00pm to 5:15pm
Pesticides benefit agriculture by increasing crop yield, quality, and security. However, pesticides may inadvertently harm bees, which are agriculturally and ecologically vital as pollinators. The development of new pesticides, driven by pest resistance to incumbent pesticides and by demands to reduce their negative environmental impacts, necessitates assessments of pesticide toxicity to bees. We leverage a data set of 382 molecules, labeled from honey bee toxicity experiments, to train a classifier that predicts the toxicity of a new pesticide molecule to honey bees. Traditionally, the first step of a molecular machine learning task is to explicitly convert molecules into feature vector representations for input to the classifier. Instead, we (i) adopt the fixed-length random walk graph kernel to express the similarity between any two molecular graphs and (ii) use the kernel trick to train a support vector machine (SVM) to classify the bee toxicity of pesticides represented as molecular graphs. We assess the performance of the graph-kernel-SVM classifier under different walk lengths used to describe the molecular graphs. The optimal classifier, with walk length 5, achieves a mean (over 100 runs) accuracy, precision, and recall of 0.83, 0.71, and 0.72, respectively, on a test data set.
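To make the kernel idea concrete, the following is a minimal sketch of a fixed-length random walk graph kernel, not the authors' implementation. It assumes that a molecular graph is given as a list of atom labels plus a list of labeled bonds, that "walk length" counts edges, and that two walks match when their atom and bond label sequences agree. The kernel value is then the number of matching walk pairs, computed by counting walks in the direct product graph of the two molecules. The names `random_walk_kernel`, `gram_matrix`, and the toy heavy-atom graphs are hypothetical.

```python
from itertools import product


def neighbor_lists(n_atoms, bonds):
    """Return nbrs[i] = list of (neighbor index, bond label) for an undirected graph."""
    nbrs = [[] for _ in range(n_atoms)]
    for i, j, label in bonds:
        nbrs[i].append((j, label))
        nbrs[j].append((i, label))
    return nbrs


def random_walk_kernel(g1, g2, walk_length):
    """Count pairs of label-matching walks with `walk_length` edges in g1 and g2.

    Each graph is a tuple (atoms, bonds): atoms is a list of labels and
    bonds is a list of (i, j, bond_label). This representation is an
    assumption made for illustration.
    """
    atoms1, bonds1 = g1
    atoms2, bonds2 = g2
    nbrs1 = neighbor_lists(len(atoms1), bonds1)
    nbrs2 = neighbor_lists(len(atoms2), bonds2)

    # Vertices of the direct product graph: atom pairs with equal labels.
    # counts[(u, v)] = number of matching walk pairs ending at (u, v).
    counts = {
        (u, v): 1
        for u, v in product(range(len(atoms1)), range(len(atoms2)))
        if atoms1[u] == atoms2[v]
    }

    # Extend every matching walk pair by one edge, walk_length times.
    for _ in range(walk_length):
        extended = {}
        for (u, v), c in counts.items():
            for u2, b1 in nbrs1[u]:
                for v2, b2 in nbrs2[v]:
                    if b1 == b2 and atoms1[u2] == atoms2[v2]:
                        extended[(u2, v2)] = extended.get((u2, v2), 0) + c
        counts = extended

    return sum(counts.values())


def gram_matrix(graphs, walk_length):
    """Symmetric matrix of pairwise kernel values between all graphs."""
    n = len(graphs)
    K = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = random_walk_kernel(graphs[i], graphs[j], walk_length)
    return K


# Toy heavy-atom graphs (hydrogens omitted); bond label 1 = single bond.
ethanol = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
methanol = (["C", "O"], [(0, 1, 1)])

K = gram_matrix([ethanol, methanol], walk_length=1)
```

A Gram matrix of this kind is what the kernel trick consumes: the classifier never sees feature vectors, only pairwise similarities. One possible way to use it, which the abstract does not specify, is scikit-learn's `SVC(kernel="precomputed")`, fitted on the train-by-train kernel matrix and evaluated on the test-by-train kernel matrix.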