(764b) Data-Driven Network Reconstruction of Biological Systems: Comparison of Statistical and Optimization-Based Methods
AIChE Annual Meeting
2011 Annual Meeting
Computing and Systems Technology Division
Control In Medicine and Biology
Friday, October 21, 2011 - 8:50am to 9:10am
Data-driven network reconstruction of biological systems is an essential step towards extracting
information from large volumes of biological data. Several methods have recently been
developed to reconstruct biological networks. However, to the best of our
knowledge, no systematic and comprehensive study has compared these methods
across dataset properties, namely their ability to handle noisy data, different
types of noise, varying levels of correlation/collinearity, different data set
sizes, and incomplete data sets. In this study, we have compared
three popular methods, principal component regression (PCR) [1], linear matrix
inequalities (LMI) [2], and the Least Absolute Shrinkage and Selection Operator
(LASSO) [3], on both real/experimental and synthetic data sets. Each of
these methods represents a category of popular methods found in the
literature. PCR is based on dimensionality reduction. In
LASSO, the aim is to minimize the L2 norm of the residual vector while satisfying
a parsimony constraint on the parameters. In LMI, the goal is to minimize the L-infinity norm
of the residual vector, with the ability to simultaneously incorporate a priori
knowledge into the optimization problem. We have used three different metrics to
compare the performance of the methods: root-mean-squared-error (RMSE) in
prediction, average fractional error in the value of estimated coefficients,
and semi-binary evaluation metrics: accuracy, sensitivity, specificity, and the
geometric mean of sensitivity and specificity. This comparison enables us to
establish criteria for selection of an appropriate approach for network
reconstruction based on a priori properties of experimental data. For
example, while PCR is the fastest method, LASSO and LMI perform better in terms
of accuracy, sensitivity and specificity. Both PCR and LASSO are better than
LMI in terms of fractional error in the values of the computed parameters. These
trade-offs suggest that more than one aspect of each method needs to be taken into
account in selecting a methodology for network reconstruction.
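
As an illustration only (this is not the code used in the study), the following Python sketch shows how a LASSO fit and the three kinds of metrics described above might be computed on a small synthetic data set. Several details are assumptions made for the example: LASSO is written in its penalized form, (1/2n)*||y - Xb||_2^2 + lam*||b||_1, which is equivalent to the constrained form for a suitable lam; it is solved by plain coordinate descent; the average fractional error is taken as the mean of |estimate - true| / |true| over the truly nonzero coefficients; and an edge counts as "detected" when the magnitude of its coefficient exceeds a hypothetical tolerance tol. The function names, the value of lam, and the toy data (an orthogonal two-level design with true coefficients 2, 0, -1) are likewise chosen only for illustration.

```python
import itertools
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t (the update induced by an L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=50):
    """Coordinate descent for (1/2n)*||y - X b||_2^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b; b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes b[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


def rmse(y_true, y_pred):
    """Root-mean-squared error in prediction."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(y_true, y_pred)) / len(y_true))


def avg_fractional_error(b_true, b_est):
    """Mean |est - true| / |true| over truly nonzero coefficients (assumed definition)."""
    terms = [abs(e - t) / abs(t) for t, e in zip(b_true, b_est) if t != 0.0]
    return sum(terms) / len(terms)


def edge_metrics(b_true, b_est, tol=1e-6):
    """Accuracy, sensitivity, specificity and their geometric mean for edge detection."""
    tp = fp = tn = fn = 0
    for t, e in zip(b_true, b_est):
        actual, found = abs(t) > tol, abs(e) > tol
        if actual and found:
            tp += 1
        elif actual:
            fn += 1
        elif found:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / (tp + fp + tn + fn)
    return {"accuracy": acc, "sensitivity": sens,
            "specificity": spec, "g_mean": math.sqrt(sens * spec)}


# Toy data: 8 noiseless samples on an orthogonal two-level design.
X = [list(row) for row in itertools.product((-1.0, 1.0), repeat=3)]
b_true = [2.0, 0.0, -1.0]
y = [sum(x * b for x, b in zip(row, b_true)) for row in X]

b_est = lasso_cd(X, y, lam=0.1)
y_pred = [sum(x * b for x, b in zip(row, b_est)) for row in X]

print("estimated coefficients:", b_est)
print("RMSE:", rmse(y, y_pred))
print("average fractional error:", avg_fractional_error(b_true, b_est))
print("edge metrics:", edge_metrics(b_true, b_est))
```

In this noiseless, orthogonal toy case the sparsity pattern is recovered exactly, while the nonzero coefficients are shrunk toward zero by the penalty. Structure-based metrics and fractional error in the coefficients can therefore rank a method differently, which is one reason to report them separately.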
References