(764b) Data-Driven Network Reconstruction of Biological Systems: Comparison of Statistical and Optimization-Based Methods | AIChE

Authors 

Tartakovsky, D. M. - Presenter, University of California, San Diego
Subramaniam, S. - Presenter, University of California, San Diego


Data-driven network reconstruction of biological systems is an essential step towards extracting
information from large volumes of biological data. Several methods have been
developed recently to reconstruct biological networks. However, to the best of our
knowledge, no systematic and comprehensive study has compared these methods
across dataset properties: the level and type of noise, the degree of
correlation/collinearity, the size of the data set, and incompleteness of the
data. In this study, we have compared
three popular methods, namely principal component regression (PCR) [1], linear matrix
inequalities (LMI) [2], and the Least Absolute Shrinkage and Selection Operator
(LASSO) [3], on both real/experimental and synthetic data sets. Each of
these methods is representative of a category of popular methods found in the
literature. PCR is based on dimensionality reduction. In
LASSO, the aim is to minimize the L2 norm of the residual vector while satisfying
a parsimony (L1-norm) constraint on the parameters. In LMI, the goal is to minimize the L-infinity norm
of the residual vector, with the ability to simultaneously incorporate a priori
knowledge into the optimization problem. We have used three different metrics to
compare the performance of the methods: root-mean-squared-error (RMSE) in
prediction, average fractional error in the value of estimated coefficients,
and semi-binary evaluation metrics: accuracy, sensitivity, specificity, and the
geometric mean of sensitivity and specificity. This comparison enables us to
establish criteria for selection of an appropriate approach for network
reconstruction based on a priori properties of experimental data. For
example, while PCR is the fastest method, LASSO and LMI perform better in terms
of accuracy, sensitivity and specificity. Both PCR and LASSO are better than
LMI in terms of fractional error in the values of the computed parameters. These
trade-offs suggest that more than one aspect of each method needs to be taken into
account in selecting a methodology for network reconstruction.
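
As an illustration of the quantities compared above, the following is a minimal sketch, not the implementation used in this study. It shows PCR as an ordinary regression on the leading principal components, followed by the three kinds of metrics. The function names, the definition of fractional error as the mean of |estimated - true| / |true| over nonzero true coefficients, and the threshold `tol` used to decide whether a coefficient counts as a network connection are all assumptions made for this example.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PCs of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T
    scores = Xc @ V_k                      # projections onto the kept PCs
    # The scores are orthogonal, so least squares reduces to a division.
    gamma = (scores.T @ yc) / (s[:n_components] ** 2)
    coef = V_k @ gamma                     # map back to the original inputs
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def rmse(y_true, y_pred):
    """Root-mean-squared error in prediction."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fractional_error(coef_true, coef_est):
    """Average fractional error over nonzero true coefficients (assumed definition)."""
    mask = coef_true != 0
    return float(np.mean(np.abs(coef_est[mask] - coef_true[mask])
                         / np.abs(coef_true[mask])))


def edge_metrics(coef_true, coef_est, tol=1e-6):
    """Accuracy, sensitivity, specificity and their geometric mean.

    A coefficient is treated as a connection when |value| > tol
    (hypothetical rule chosen for this sketch).
    """
    true_edge = np.abs(coef_true) > tol
    pred_edge = np.abs(coef_est) > tol
    tp = int(np.sum(true_edge & pred_edge))
    tn = int(np.sum(~true_edge & ~pred_edge))
    fp = int(np.sum(~true_edge & pred_edge))
    fn = int(np.sum(true_edge & ~pred_edge))
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": float(np.sqrt(sensitivity * specificity)),
    }


# Synthetic usage: two true connections out of four candidate inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
coef_true = np.array([2.0, 0.0, -1.0, 0.0])
y = X @ coef_true + 0.5                    # noise-free for clarity
coef_est, intercept = pcr_fit(X, y, n_components=4)
print(rmse(y, X @ coef_est + intercept))
print(fractional_error(coef_true, coef_est))
print(edge_metrics(coef_true, coef_est))
```

When all components are kept, PCR coincides with ordinary least squares; keeping fewer components trades some bias for robustness to collinearity. PCR coefficients are generally not exactly zero, so the connection-level metrics depend on the chosen threshold.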

 

References

1. Pradervand, S., M.R. Maurya, and S. Subramaniam, Identification of signaling components required for the prediction of cytokine release in RAW 264.7 macrophages. Genome Biology, 2006. 7(2): p. R11.

2. Cosentino, C., et al., Linear matrix inequalities approach to reconstruction of biological networks. IET Systems Biology, 2007. 1(3): p. 164-173.

3. Tibshirani, R., Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B-Methodological, 1996. 58(1): p. 267-288.