Regression for predictive analytics : parametric and nonparametric regression. (2020)
- Record Type:
- Book
- Title:
- Regression for predictive analytics : parametric and nonparametric regression. (2020)
- Main Title:
- Regression for predictive analytics : parametric and nonparametric regression
- Further Information:
- Note: Ajit C. Tamhane, Edward C. Malthouse.
- Authors:
- Tamhane, Ajit C
Malthouse, Edward C
- Contents:
- Preface -- Acknowledgments
1 Introduction -- 1.1 Supervised Versus Unsupervised Learning -- 1.2 Parametric Versus Nonparametric Models -- 1.3 Types of Data -- 1.4 Overview of Parametric Predictive Analytics
2 Simple Linear Regression and Correlation -- 2.1 Fitting a Straight Line -- 2.1.1 Least Squares Method -- 2.1.2 Linearizing Transformations -- 2.1.3 Fitted Values and Residuals -- 2.1.4 Assessing Goodness of Fit -- 2.2 Statistical Inferences for Simple Linear Regression -- 2.2.1 Simple Linear Regression Model -- 2.2.2 Inferences on β0 and β1 -- 2.2.3 Analysis of Variance for Simple Linear Regression -- 2.2.4 Pure Error versus Model Error -- 2.2.5 Prediction of Future Observations -- 2.3 Correlation Analysis -- 2.3.1 Bivariate Normal Distribution -- 2.3.2 Inferences on Correlation Coefficient -- 2.4 Modern Extensions -- 2.5 Technical Notes -- 2.5.1 Derivation of the LS Estimators -- 2.5.2 Sums of Squares -- 2.5.3 Distribution of the LS Estimators -- 2.5.4 Prediction Interval -- Exercises
3 Multiple Linear Regression: Basics -- 3.1 Multiple Linear Regression Model -- 3.1.1 Model in Scalar Notation -- 3.1.2 Model in Matrix Notation -- 3.2 Fitting a Multiple Regression Model -- 3.2.1 Least Squares (LS) Method -- 3.2.2 Interpretation of Regression Coefficients -- 3.2.3 Fitted Values and Residuals -- 3.2.4 Measures of Goodness of Fit -- 3.2.5 Linearizing Transformations -- 3.3 Statistical Inferences for Multiple Regression -- 3.3.1 Analysis of Variance for Multiple Regression -- 3.3.2 Inferences on Regression Coefficients -- 3.3.3 Confidence Ellipsoid for the β Vector -- 3.3.4 Extra Sum of Squares Method -- 3.3.5 Prediction of Future Observations -- 3.4 Weighted and Generalized Least Squares -- 3.4.1 Weighted Least Squares -- 3.4.2 Generalized Least Squares -- 3.4.3 Statistical Inference on GLS Estimator -- 3.5 Partial Correlation Coefficients -- 3.6 Special Topics -- 3.6.1 Dummy Variables -- 3.6.2 Interactions -- 3.6.3 Standardized Regression -- 3.7 Modern Extensions -- 3.7.1 Regression Trees -- 3.7.2 Neural Nets -- 3.8 Technical Notes -- 3.8.1 Derivation of the LS Estimators -- 3.8.2 Distribution of the LS Estimators -- 3.8.3 Gauss-Markov Theorem -- 3.8.4 Properties of Fitted Values and Residuals -- 3.8.5 Geometric Interpretation of Least Squares -- 3.8.6 Confidence Ellipsoid for β -- 3.8.7 Population Partial Correlation Coefficient -- Exercises
4 Multiple Linear Regression: Model Diagnostics -- 4.1 Model Assumptions and Distribution of Residuals -- 4.2 Checking Normality -- 4.3 Checking Homoscedasticity -- 4.3.1 Variance Stabilizing Transformations -- 4.3.2 Box-Cox Transformation -- 4.4 Detecting Outliers -- 4.5 Checking Model Misspecification -- 4.6 Checking Independence -- 4.6.1 Runs Test -- 4.6.2 Durbin-Watson Test -- 4.7 Checking Influential Observations -- 4.7.1 Leverage -- 4.7.2 Cook’s Distance -- 4.8 Checking Multicollinearity -- 4.8.1 Multicollinearity: Causes and Consequences -- 4.8.2 Multicollinearity Diagnostics -- Exercises
5 Multiple Linear Regression: Shrinkage and Dimension Reduction Methods -- 5.1 Ridge Regression -- 5.2 Lasso Regression -- 5.3 Principal Components Analysis and Regression -- 5.3.1 Principal Components Analysis (PCA) -- 5.3.2 Principal Components Regression (PCR) -- 5.4 Partial Least Squares (PLS) -- 5.5 Technical Notes -- 5.5.1 Properties of Ridge Estimator -- 5.5.2 Derivation of Principal Components -- Exercises
6 Multiple Linear Regression: Variable Selection and Model Building -- 6.1 Best Subset Selection -- 6.1.1 Model Selection Criteria -- 6.2 Stepwise Regression -- 6.3 Model Building -- 6.4 Technical Notes -- 6.4.1 Derivation of the Cp Statistic -- Exercises
7 Logistic Regression and Classification -- 7.1 Simple Logistic Regression -- 7.1.1 Model -- 7.1.2 Parameter Estimation -- 7.1.3 Inferences on Parameters -- 7.2 Multiple Logistic Regression -- 7.2.1 Model and Inference -- 7.3 Likelihood Ratio (LR) Test -- 7.3.1 Deviance -- 7.3.2 Akaike Information Criterion (AIC) -- 7.3.3 Model Selection and Diagnostics -- 7.4 Binary Classification Using Logistic Regression -- 7.4.1 Measures of Correct Classification -- 7.4.2 Receiver Operating Characteristic (ROC) Curve -- 7.5 Polytomous Logistic Regression -- 7.5.1 Nominal Logistic Regression -- 7.5.2 Ordinal Logistic Regression -- 7.6 Modern Extensions -- 7.6.1 Classification Trees -- 7.6.2 Support Vector Machines -- 7.7 Technical Notes -- Exercises
8 Discriminant Analysis -- 8.1 Linear Discriminant Analysis (LDA) Based on Mahalanobis Distance -- 8.1.1 Mahalanobis Distance -- 8.1.2 Bayesian Classification -- 8.2 Fisher’s Linear Discriminant Function (LD) -- 8.2.1 Two Groups -- 8.2.2 Multiple Groups -- 8.3 Naive Bayes -- 8.4 Technical Notes -- 8.4.1 Calculation of Pooled Sample Covariance Matrix -- 8.4.2 Derivation of Fisher’s Linear Discriminant Functions -- 8.4.3 Bayes Rule -- Exercises
9 Generalized Linear Models -- 9.1 Exponential Family and Link Function -- 9.1.1 Exponential Family -- 9.1.2 Link Function -- 9.2 Estimation of Parameters of GLM -- 9.2.1 Maximum Likelihood Estimation -- 9.2.2 Iteratively Reweighted Least Squares (IRWLS) Algorithm -- 9.3 Deviance and AIC -- 9.4 Poisson Regression -- 9.4.1 Poisson Regression for Rates -- 9.5 Gamma Regression -- 9.6 Technical Notes -- 9.6.1 Mean and Variance of the Exponential Family of Distributions -- 9.6.2 MLE of β and Its Evaluation Using the IRWLS Algorithm -- Exercises
10 Survival Analysis -- 10.1 Hazard Rate and Survival Distribution -- 10.2 Kaplan-Meier Estimator -- 10.3 Log Rank Test -- 10.4 Cox’s Proportional Hazards Model -- 10.4.1 Estimation -- 10.4.2 Examples -- 10.4.3 Time-Dependent Covariates -- 10.5 Technical Notes -- 10.5.1 ML Estimation of the Cox Proportional Hazards Model -- Exercises
A Primer on Matrix Algebra and Multivariate Distributions -- A.1 Review of Matrix Algebra -- A.2 Review of Multivariate Distributions -- A.3 Multivariate Normal Distribution
B Primer on Maximum Likelihood Estimation -- B.1 Maximum Likelihood Estimation -- B.2 Large Sample Inference on MLEs -- B.3 Newton-Raphson and Fisher Scoring Algorithms -- B.4 Technical Notes
C Projects -- C.1 Project …
- Edition:
- 1st
- Publisher Details:
- Hoboken, New Jersey : John Wiley & Sons, Inc
- Publication Date:
- 2020
- Extent:
- 1 online resource
- Subjects:
- 006.312
Data mining
Predictive control -- Mathematical models
- Languages:
- English
- ISBNs:
- 9781118948903
- Related ISBNs:
- 9781118948910
- Notes:
- Note: Description based on CIP data; resource not viewed.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.562679
- Ingest File:
- 03_191.xml