(242d) Sample Imbalance and Its Role in Understanding Drug Network Characteristics
AIChE Annual Meeting
2015
2015 AIChE Annual Meeting Proceedings
Computing and Systems Technology Division
Interactive Session: Applied Mathematics and Numerical Analysis
Monday, November 9, 2015 - 6:00pm to 8:00pm
Several recent studies have shown that the targets of dangerous or ineffective compounds often occupy distinct topological positions in protein-protein interaction networks. These distinctions suggest that machine learning algorithms could play a valuable role in evaluating the therapeutic value of candidate compounds. Here, we use compounds listed in the DrugBank database to analyze topological differences between the targets of clinically useful and unsafe compounds. We show that, although some mild distinctions are initially observed, they prove to be of minimal value for predicting clinical value. By applying permutation-based cross-validation, we find that the imbalance in the number of known targets between useful and unsafe compounds inhibits classifier training. We end by suggesting several methods for resolving target imbalance and by discussing how best to optimize classifier training for biological networks.
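As a rough illustration of the kind of permutation-based cross-validation mentioned above, the following minimal sketch compares the cross-validated score of a classifier against a null distribution obtained by shuffling the class labels. It is not the authors' procedure: the single topological feature (a stand-in for something like node degree), the nearest-centroid classifier, the balanced-accuracy score, the 200-versus-20 class sizes, and all function names are assumptions chosen only to make the example self-contained. The data are synthetic.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so the majority class cannot dominate the score."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, label in enumerate(y_true) if label == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return mean(recalls)


def stratified_folds(y, k, rng):
    """Split sample indices into k folds while preserving class proportions."""
    folds = [[] for _ in range(k)]
    for c in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == c]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cv_score(x, y, k, rng):
    """Cross-validated balanced accuracy of a one-feature nearest-centroid classifier."""
    y_pred = [None] * len(y)
    for test in stratified_folds(y, k, rng):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        classes = sorted(set(y[i] for i in train))
        centroids = {c: mean(x[i] for i in train if y[i] == c) for c in classes}
        for i in test:
            y_pred[i] = min(classes, key=lambda c: abs(x[i] - centroids[c]))
    return balanced_accuracy(y, y_pred)


def permutation_test(x, y, k=5, n_perm=200, seed=0):
    """Return the observed CV score and a permutation p-value for it."""
    rng = random.Random(seed)
    observed = cv_score(x, y, k, rng)
    null_scores = []
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)  # break any link between feature and label
        null_scores.append(cv_score(x, y_perm, k, rng))
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + sum(s >= observed for s in null_scores)) / (1 + n_perm)
    return observed, p_value


# Synthetic, imbalanced example (hypothetical numbers): 200 targets of
# "useful" compounds versus 20 targets of "unsafe" compounds, with only a
# mild shift in the single feature between the two classes.
data_rng = random.Random(42)
feature = [data_rng.gauss(0.0, 1.0) for _ in range(200)] + \
          [data_rng.gauss(0.3, 1.0) for _ in range(20)]
labels = [0] * 200 + [1] * 20

score, p = permutation_test(feature, labels)
print(f"balanced accuracy = {score:.3f}, permutation p-value = {p:.3f}")
```

Balanced accuracy is used here because plain accuracy on a 10:1 split can look high for a classifier that always predicts the majority class. Comparing the observed score to label-shuffled scores is one way to check whether an apparent distinction between classes carries any predictive signal.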