Publication Details
Combination of Logistic Regression and Boosting to Predict Disease Outcome
combining of heterogeneous data, gene expression data, class prediction, boosting
An important current challenge in bioinformatics is the integration of diverse data types. Different kinds of bioinformatic data can provide complementary information, and combining the relevant data may lead to more accurate findings: for example, it can help in understanding complex diseases or yield more accurate hybrid diagnostic or prognostic signatures. We propose a prediction approach that combines logistic regression and boosting. Logistic regression is applied to the low-dimensional data, while boosting is applied to the high-dimensional data. The approach is then extended to incorporate more than two data sources. It is validated on simulated data sets and then applied to real bioinformatic data sets comprising clinical variables, gene expression data and SNP data. We show that this kind of data combination can increase predictive performance.
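
The abstract does not say how the two models are coupled. One plausible reading, sketched below purely as an illustration, is to fit logistic regression on the low-dimensional block and pass its linear predictor as a fixed offset to componentwise boosting on the high-dimensional block. The coupling by offset, the synthetic data, and all names (`fit_logistic`, `boost_with_offset`, `clinical`, `genes`) are assumptions made for this sketch and are not taken from the publication.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Gradient-descent logistic regression; returns [intercept, w1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def linear_predictor(w, X):
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]

def boost_with_offset(G, y, offset, rounds=50, nu=0.1):
    """Componentwise gradient boosting on the logistic loss.

    Starts from a fixed offset (assumed here to be the log-odds of the
    low-dimensional model) and updates one feature per round.
    """
    p = len(G[0])
    cols = [[row[j] for row in G] for j in range(p)]
    norms = [sum(v * v for v in col) for col in cols]
    F = list(offset)
    coef = [0.0] * p
    for _ in range(rounds):
        # Negative gradient of the logistic loss with respect to F.
        resid = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        best_j, best_gain, best_b = None, -1.0, 0.0
        for j, col in enumerate(cols):
            if norms[j] == 0.0:
                continue
            dot = sum(c * r for c, r in zip(col, resid))
            gain = dot * dot / norms[j]  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_b = j, gain, dot / norms[j]
        if best_j is None:
            break
        coef[best_j] += nu * best_b
        F = [fi + nu * best_b * c for fi, c in zip(F, cols[best_j])]
    return coef, F

def log_loss(y, F):
    # Mean negative log-likelihood, log(1 + exp(F)) - y * F, computed stably.
    total = 0.0
    for yi, fi in zip(y, F):
        total += max(fi, 0.0) + math.log1p(math.exp(-abs(fi))) - yi * fi
    return total / len(y)

# Synthetic stand-ins: 2 "clinical" variables, 40 "gene expression" variables.
rng = random.Random(0)
n = 120
clinical = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(n)]
genes = [[rng.gauss(0, 1) for _ in range(40)] for _ in range(n)]
y = [1 if rng.random() < sigmoid(1.5 * c[0] + 2.0 * g[3]) else 0
     for c, g in zip(clinical, genes)]

w = fit_logistic(clinical, y)              # step 1: low-dimensional model
offset = linear_predictor(w, clinical)     # its log-odds become the offset
coef, F = boost_with_offset(genes, y, offset)  # step 2: boosting on top

loss_clinical = log_loss(y, offset)
loss_combined = log_loss(y, F)
print("in-sample log-loss, clinical only:", round(loss_clinical, 4))
print("in-sample log-loss, combined:     ", round(loss_combined, 4))
```

The comparison printed here is in-sample only, so it shows that the boosting stage refines the clinical fit but says nothing about predictive performance; that would require held-out data or cross-validation. Under the same assumption, a third data source could be added by running another boosting stage with `F` as its offset.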