
Statistical Learning from a Regression Perspective

Contributor(s): Berk, Richard
Resource type: Book (Online)
Language: English
Series: Springer Series in Statistics | SpringerLink Bücher
Publisher: New York, NY : Springer New York, 2008
Description: Online resource (XVIII, 360 p., digital)
ISBN:
  • 9780387775012
Subject(s): Additional physical formats: 9780387775005 | Print edition under the title: Statistical learning from a regression perspective. New York, NY [and elsewhere] : Springer, 2008. XVII, 358 p.
DDC classification:
  • 519.536 22
  • 519.2 23
  • 519.5/36
  • 510
MSC: *62G08 | 62-02 | 62P25 | 68T05 | 62-04
RVK: QH 234 | SK 840
LOC classification:
  • QA278.2
DOI: 10.1007/978-0-387-77501-2
Online resources:
Contents:
Preface; 1 Statistical Learning as a Regression Problem; 1.1 Getting Started; 1.2 Setting the Regression Context; 1.3 The Transition to Statistical Learning; 1.4 Some Initial Concepts and Definitions; 1.5 Some Common Themes; 1.6 Summary and Conclusions; 2 Regression Splines and Regression Smoothers; 2.1 Introduction; 2.2 Regression Splines; 2.3 Penalized Smoothing; 2.4 Smoothing Splines; 2.5 Locally Weighted Regression as a Smoother; 2.6 Smoothers for Multiple Predictors; 2.7 Smoothers with Categorical Variables; 2.8 Locally Adaptive Smoothers; 2.9 The Role of Statistical Inference; 2.10 Software Issues; 2.11 Summary and Conclusions; 3 Classification and Regression Trees (CART); 3.1 Introduction; 3.2 An Overview of Recursive Partitioning with CART; 3.3 Splitting a Node; 3.4 More on Classification; 3.5 Classification Errors and Costs; 3.6 Pruning; 3.7 Missing Data; 3.8 Statistical Inference with CART; 3.9 Classification Versus Forecasting; 3.10 Varying the Prior, Costs, and the Complexity Penalty; 3.11 An Example with Three Response Categories; 3.12 CART with Highly Skewed Response Distributions; 3.13 Some Cautions in Interpreting CART Results; 3.14 Regression Trees; 3.15 Software Issues; 3.16 Summary and Conclusions; 4 Bagging; 4.1 Introduction; 4.2 Overfitting and Cross-Validation; 4.3 Bagging as an Algorithm; 4.4 Some Thinking on Why Bagging Works; 4.5 Some Limitations of Bagging; 4.6 An Example; 4.7 Bagging a Quantitative Response Variable; 4.8 Software Considerations; 4.9 Summary and Conclusions; 5 Random Forests; 5.1 Introduction and Overview; 5.2 An Initial Illustration; 5.3 A Few Formalities; 5.4 Random Forests and Adaptive Nearest Neighbor Methods; 5.5 Taking Costs into Account in Random Forests; 5.6 Determining the Importance of the Predictors; 5.7 Response Functions; 5.8 The Proximity Matrix; 5.9 Quantitative Response Variables; 5.10 Tuning Parameters; 5.11 An Illustration Using a Binary Response Variable; 5.12 An Illustration Using a Quantitative Response Variable; 5.13 Software Considerations; 5.14 Summary and Conclusions; 6 Boosting; 6.1 Introduction; 6.2 Adaboost; 6.3 Why Does Adaboost Work So Well?; 6.4 Stochastic Gradient Boosting; 6.5 Some Problems and Some Possible Solutions; 6.6 Some Examples; 6.7 Software Considerations; 6.8 Summary and Conclusions; 7 Support Vector Machines; 7.1 A Simple Didactic Illustration; 7.2 Support Vector Machines in Pictures; 7.3 Support Vector Machines in Statistical Notation; 7.4 A Classification Example; 7.5 Software Considerations; 7.6 Summary and Conclusions; 8 Broader Implications and a Bit of Craft Lore; 8.1 Some Fundamental Limitations of Statistical Learning; 8.2 Some Assets of Statistical Learning; 8.3 Some Practical Suggestions; 8.4 Some Concluding Observations; References; Index
Summary: Statistical Learning as a Regression Problem -- Regression Splines and Regression Smoothers -- Classification and Regression Trees (CART) -- Bagging -- Random Forests -- Boosting -- Support Vector Machines -- Broader Implications and a Bit of Craft Lore.
Summary: Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to take asymmetric costs explicitly into account in the fitting process; in some situations, for example, false positives may be far less costly than false negatives. Another important theme is not to cede modeling decisions automatically to a fitting algorithm: in many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one's data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R.
Richard Berk is Distinguished Professor of Statistics Emeritus from the Department of Statistics at UCLA and currently a Professor at the University of Pennsylvania in the Department of Statistics and in the Department of Criminology. He is an elected fellow of the American Statistical Association and the American Association for the Advancement of Science and has served in a professional capacity with a number of organizations, such as the Committee on Applied and Theoretical Statistics of the National Research Council and the Board of Directors of the Social Science Research Council. His research has ranged across a variety of applications in the social and natural sciences.
PPN: 1647205549
Package identifier: ZDB-2-SEB | ZDB-2-SXMS | ZDB-2-SMA
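The asymmetric-cost theme of the summary can be made concrete with a minimal sketch of the cost-sensitive Bayes decision rule. This is an illustration only, not code from the book (the book's own analyses are in R); the cost values below are hypothetical. Given an estimated probability that a case is positive, the rule labels it positive whenever the expected cost of a false negative exceeds the expected cost of a false positive, which lowers the usual 0.5 threshold when false negatives are the more costly error.

```python
def classify_positive(p_positive, cost_fn, cost_fp):
    """Cost-sensitive Bayes decision rule.

    Label a case positive when the expected cost of a false
    negative (p * cost_fn) exceeds the expected cost of a
    false positive ((1 - p) * cost_fp). With equal costs this
    reduces to the familiar 0.5 probability threshold.
    """
    return p_positive * cost_fn > (1.0 - p_positive) * cost_fp

# Symmetric costs: a 30% positive probability stays below the
# 0.5 threshold, so the case is labeled negative.
print(classify_positive(0.3, cost_fn=1, cost_fp=1))   # False

# False negatives 10x as costly: the implied threshold drops
# to 1/11, so the same 30% case is now labeled positive.
print(classify_positive(0.3, cost_fn=10, cost_fp=1))  # True
```

The same logic underlies the book's treatment of costs in CART and random forests, where asymmetric costs can be folded into the prior or the voting rule rather than applied as a post-hoc threshold.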
No physical items for this record