- Manual/Paper:
Culp M, Johnson K, Michailidis G (2006). "ada: An R Package for
Stochastic Boosting." Journal of Statistical Software, 17(2). [pdf][code]
Culp M, Johnson K, Michailidis G (2006). "On Regularized Stochastic Boosting." In revision.
- Updates/Package:
- 11-04-2007: Release of ada-2.0-2 [windows][UNIX]
- Update: This release incorporates several bug fixes found over the past year and improves the variable importance function.
- Updates in version 2:
The algorithm performs stochastic boosting with exponential and logistic loss, similar to stochastic gradient boosting (SGB)
(Friedman, 2002). Discrete, real, and gentle boost versions are provided.
- 9-10-2005: Fixed the problem with the fits in the
predict function.
- 7-13-2005: Release of ada-1.0.0 (no longer available) [windows][UNIX]
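The version 2 note above mentions stochastic boosting with exponential loss in its discrete form. As a rough illustration of that idea only, here is a minimal Python sketch. It is not the ada package's code. The function names (`stochastic_adaboost`, `fit_stump`, `stump_predict`, `predict`) and the parameter names `nu` and `bag_frac` are hypothetical and chosen for this sketch. It also assumes one-dimensional inputs, decision stumps as the weak learner, and a weighted error measured on the full training set, none of which is confirmed by this page.

```python
import math
import random


def stump_predict(x, threshold, polarity):
    """Decision stump on one feature: vote `polarity` above the threshold."""
    return polarity if x > threshold else -polarity


def fit_stump(xs, ys, ws):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    values = sorted(set(xs))
    # Candidate cuts: one below all points (constant stump) plus midpoints.
    thresholds = [values[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    best = None
    for thr in thresholds:
        for pol in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, ws)
                      if stump_predict(x, thr, pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best[1], best[2]


def stochastic_adaboost(xs, ys, n_iter=50, nu=0.1, bag_frac=0.5, seed=0):
    """Discrete AdaBoost (exponential loss) with subsampling and shrinkage.

    Labels must be -1 or +1. Returns a list of (alpha, threshold, polarity).
    """
    rng = random.Random(seed)
    n = len(xs)
    weights = [1.0 / n] * n
    m = min(n, max(1, int(bag_frac * n)))  # subsample size
    eps = 1e-10  # keeps the log below finite when the error is 0 or 1
    model = []
    for _ in range(n_iter):
        # Stochastic step: fit the weak learner on a random subsample.
        idx = rng.sample(range(n), m)
        thr, pol = fit_stump([xs[i] for i in idx],
                             [ys[i] for i in idx],
                             [weights[i] for i in idx])
        preds = [stump_predict(x, thr, pol) for x in xs]
        # Assumption: the weighted error is measured on the full training set.
        err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
        err = min(max(err, eps), 1.0 - eps)
        # Shrunken stage weight; negative if the stump is worse than chance,
        # which simply flips its vote.
        alpha = nu * 0.5 * math.log((1.0 - err) / err)
        # Exponential-loss reweighting: misclassified points gain weight.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
        model.append((alpha, thr, pol))
    return model


def predict(model, x):
    """Sign of the weighted vote over all stumps (ties go to +1)."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in model)
    return 1 if score >= 0 else -1
```

Calling `stochastic_adaboost(xs, ys)` on lists of numeric inputs and -1/+1 labels returns the fitted stumps, and `predict(model, x)` classifies a new point. Setting `bag_frac=1.0` turns off the subsampling and gives ordinary discrete AdaBoost with shrinkage.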
- Additional Boosting Resources:
Free R packages for advanced boosting, such as gbm and mboost
(Ridgeway, 2006; Hothorn and Bühlmann, 2006), efficiently build
regression trees, smoothing splines, and additive models. The
gbm package provides an internal regression tree engine, marginal
plots, and additional utilities for optimizing a wider range of loss
functions beyond classification. The mboost
package is a new advanced boosting tool that handles several base
learners and arbitrary loss functions. In our
experience, these packages are powerful tools for modeling a regression or count
outcome, and we recommend them as additional tools for advanced
boosting procedures.
Both packages are available from R's main web
site.
- References:
- Becker R, Chambers J, Wilks A (1988). The new S language: a programming environment for data analysis
and graphics. Wadsworth and Brooks/Cole Advanced Books & Software, Monterey, CA.
- Boonyanunta N, Zeephongsekul P (2003). "Improving the Predictive Power of AdaBoost: A Case Study in
Classifying Borrowers." In "Proceedings of the 16th International Conference on
Developments in Applied Artificial Intelligence", pp. 674--685. Springer
Verlag Inc.
- Breiman L (1996). "Bagging Predictors." Machine Learning, 24(2), 123--140.
- Breiman L (2001). "Random Forests." Machine Learning, 45(1), 5--32.
- Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees.
Chapman & Hall, New York.
- Cohen J (1960). "A Coefficient of Agreement for Nominal Scales."
Educational and Psychological Measurement, 20, 37--46.
- Dettling M (2004). "BagBoosting for Tumor Classification with Gene Expression
Data." Bioinformatics, 20(18), 3583--3593.
- Freund Y, Schapire R (1996). "Experiments with a New Boosting Algorithm."
In "International Conference on Machine Learning," pp.
148--156.
- Freund Y, Schapire R (1997). "A Decision-Theoretic Generalization of On-Line Learning and
an Application to Boosting." Journal of Computer and System Sciences, 55(1),
119--139.
- Friedman J (2001). "Greedy Function Approximation: A Gradient Boosting Machine."
The Annals of Statistics, 29(5), 1189--1232.
- Friedman J (2002). "Stochastic Gradient Boosting."
Computational Statistics & Data Analysis, 38(4),
367--378.
- Friedman J, Hastie T, Tibshirani R (2000).
"Additive Logistic Regression: A Statistical View of
Boosting." The Annals of Statistics, 28(2), 337--407.
- Hastie T, Tibshirani R, Friedman J (2001).
The Elements of Statistical Learning (Data Mining, Inference
and Prediction). Springer Verlag.
- Hothorn T, Bühlmann P (2006). mboost: Model-Based Boosting.
R package version 0.4-13.
- Huang K, Murphy R (2004).
"Boosting Accuracy of Automated Classification of
Fluorescence Microscope Images for Location Proteomics."
BMC Bioinformatics, 5, 78.
- Kawakita M, Minami M, Eguchi S, Lennert-Cody C (2005).
"An Introduction to the Predictive Technique AdaBoost with a
Comparison to Generalized Additive Models."
Fisheries Research, 76(6), 323--343.
- Lemmens A, Croux C (2005).
"Bagging and Boosting Classification Trees to Predict Churn."
Journal of Marketing Research, 43(2), 276--286.
- Liaw A, Wiener M (2002).
"Classification and Regression by randomForest."
R News, 2(3), 18--22.
http://CRAN.R-project.org/doc/Rnews/.
- R Development Core Team (2006).
"R: A Language and Environment for Statistical
Computing".
R Foundation for Statistical Computing, Vienna, Austria.
http://www.R-project.org.
- Ridgeway G (2006).
gbm: Generalized Boosted Regression Models.
R package version 1.5-7,
http://www.i-pensieri.com/gregr/gbm.shtml.
- Rosset S, Zhu J, Hastie T (2004).
"Boosting as a Regularized Path to a Maximum Margin
Classifier." Journal of Machine Learning Research, 5, 941--973.
- Schapire R (1990).
"The Strength of Weak Learnability."
Machine Learning, 5(2), 197--227.
- Segal M (2004).
"Machine Learning Benchmarks and Random Forest Regression."
Technical report, Center for Bioinformatics & Molecular
Biostatistics, University of California, San Francisco, CA.
http://repositories.cdlib.org/cbmb/bench_rf_regn.
- Sugata S, Abe Y (2001).
"Computer Simulation of Hydrodynamic Models for
Chemical/Pharmaco-Kinetics."
Journal of Chemical Software, 7(2).
- Therneau T, Atkinson B (2005).
rpart: Recursive Partitioning Software.
R package version 3.1-32.
- Ulintz P, Zhu J, Qin Z, Andrews P (2006).
"Improved Classification of Mass Spectrometry Database Search
Results Using Newer Machine Learning Approaches."
Molecular and Cellular Proteomics, 5(3), 497--509.
- Valiant L (1984).
"A Theory of the Learnable."
In "Proceedings of the 16th Annual ACM Symposium on Theory of
Computing," pp. 436--445. ACM Press, New York, NY.