Data mining for business analytics : concepts, techniques, and applications in Microsoft Office Excel with XLMiner (2016)
- Record Type:
- Book
- Title:
- Data mining for business analytics : concepts, techniques, and applications in Microsoft Office Excel with XLMiner (2016)
- Main Title:
- Data mining for business analytics : concepts, techniques, and applications in Microsoft Office Excel with XLMiner
- Further Information:
- Note: Galit Shmueli, Nitin R. Patel, Peter C. Bruce.
- Authors:
- Shmueli, Galit, 1971-
Patel, Nitin R (Nitin Ratilal)
Bruce, Peter C, 1953-
- Contents:
- Foreword xvii Preface to the Third Edition xix Preface to the First Edition xxii Acknowledgments xxiv PART I PRELIMINARIES CHAPTER 1 Introduction 3 1.1 What is Business Analytics? 3 1.2 What is Data Mining? 5 1.3 Data Mining and Related Terms 5 1.4 Big Data 6 1.5 Data Science 7 1.6 Why Are There So Many Different Methods?
8 1.7 Terminology and Notation 9 1.8 Road Maps to This Book 11 Order of Topics 12 CHAPTER 2 Overview of the Data Mining Process 14 2.1 Introduction 14 2.2 Core Ideas in Data Mining 15 2.3 The Steps in Data Mining 18 2.4 Preliminary Steps 20 2.5 Predictive Power and Overfitting 26 2.6 Building a Predictive Model with XLMiner 30 2.7 Using Excel for Data Mining 40 2.8 Automating Data Mining Solutions 40 Data Mining Software Tools (by Herb Edelstein) 42 Problems 45 PART II DATA EXPLORATION AND DIMENSION REDUCTION CHAPTER 3 Data Visualization 50 3.1 Uses of Data Visualization 50 3.2 Data Examples 52 Example 1: Boston Housing Data 52 Example 2: Ridership on Amtrak Trains 53 3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots 53 Distribution Plots 54 Heatmaps: Visualizing Correlations and Missing Values 57 3.4 Multi-Dimensional Visualization 58 Adding Variables 59 Manipulations 61 Reference: trend line and labels 64 Scaling up to Large Datasets 65 Multivariate Plot 66 Interactive Visualization 67 3.5 Specialized Visualizations 70 Visualizing Networked Data 70 Visualizing Hierarchical Data: Treemaps 72 Visualizing Geographical Data: Map Charts 73 3.6 Summary: Major Visualizations and Operations, by Data Mining Goal 75 Prediction 75 Classification 75 Time Series Forecasting 75 Unsupervised Learning 76 Problems 77 CHAPTER 4 Dimension Reduction 79 4.1 Introduction 79 4.2 Curse of Dimensionality 80 4.3 Practical Considerations 80 Example 1: House Prices in Boston 80 4.4 Data Summaries 81 4.5 Correlation Analysis 84 4.6 Reducing the Number of Categories in Categorical Variables 85 4.7 Converting A Categorical Variable to A Numerical Variable 86 4.8 Principal Components Analysis 86 Example 2: Breakfast Cereals 87 Principal Components 92 Normalizing the Data 93 Using Principal Components for Classification and Prediction 94 4.9 Dimension Reduction Using Regression Models 96 4.10 Dimension Reduction Using Classification and Regression Trees 96 Problems 97 PART III 
PERFORMANCE EVALUATION CHAPTER 5 Evaluating Predictive Performance 101 5.1 Introduction 101 5.2 Evaluating Predictive Performance 102 Benchmark: The Average 102 Prediction Accuracy Measures 103 5.3 Judging Classifier Performance 106 Benchmark: The Naive Rule 107 Class Separation 107 The Classification Matrix 107 Using the Validation Data 109 Accuracy Measures 109 Cutoff for Classification 110 Performance in Unequal Importance of Classes 114 Asymmetric Misclassification Costs 116 5.4 Judging Ranking Performance 119 5.5 Oversampling 123 Problems 129 PART IV PREDICTION AND CLASSIFICATION METHODS CHAPTER 6 Multiple Linear Regression 134 6.1 Introduction 134 6.2 Explanatory vs. Predictive Modeling 135 6.3 Estimating the Regression Equation and Prediction 136 Example: Predicting the Price of Used Toyota Corolla Cars 137 6.4 Variable Selection in Linear Regression 141 Reducing the Number of Predictors 141 How to Reduce the Number of Predictors 142 Problems 147 CHAPTER 7 k-Nearest Neighbors (kNN) 151 7.1 The k-NN Classifier (categorical outcome) 151 Determining Neighbors 151 Classification Rule 152 Example: Riding Mowers 152 Choosing k 154 Setting the Cutoff Value 154 7.2 k-NN for a Numerical Response 156 7.3 Advantages and Shortcomings of k-NN Algorithms 158 Problems 160 CHAPTER 8 The Naive Bayes Classifier 162 8.1 Introduction 162 Example 1: Predicting Fraudulent Financial Reporting 163 8.2 Applying the Full (Exact) Bayesian Classifier 164 8.3 Advantages and Shortcomings of the Naive Bayes Classifier 172 Advantages and Shortcomings of the naive Bayes Classifier 172 Problems 176 CHAPTER 9 Classification and Regression Trees 178 9.1 Introduction 178 9.2 Classification Trees 179 Example 1: Riding Mowers 180 9.3 Measures of Impurity 183 9.4 Evaluating the Performance of a Classification Tree 187 Example 2: Acceptance of Personal Loan 188 9.5 Avoiding Overfitting 192 Stopping Tree Growth: CHAID 192 Pruning the Tree 193 9.6 Classification Rules from Trees 198 9.7 
Classification Trees for More Than two Classes 198 9.8 Regression Trees 198 Prediction 199 Measuring Impurity 200 Evaluating Performance 200 9.9 Advantages and Weaknesses of a Tree 200 9.10 Improving Prediction: Multiple Trees 202 Problems 205 CHAPTER 10 Logistic Regression 209 10.1 Introduction 209 10.2 The Logistic Regression Model 211 Example: Acceptance of Personal Loan 212 Model with a Single Predictor 214 Estimating the Logistic Model from Data 215 Interpreting Results in Terms of Odds 218 10.3 Evaluating Classification Performance 219 Variable Selection 220 10.4 Example of Complete Analysis: Predicting Delayed Flights 222 Data Preprocessing 224 Model Fitting and Estimation 224 Model Interpretation 226 Model Performance 226 Variable Selection 227 10.5 Appendix: Logistic Regression for Profiling 231 Appendix A: Why Linear Regression Is Problematic for a Categorical Response 231 Appendix B: Evaluating Explanatory Power 233 Appendix C: Logistic Regression for More Than Two Classes 235 Problems 239 CHAPTER 11 Neural Nets 242 11.1 Introduction 242 11.2 Concept and Structure of a Neural Network 243 11.3 Fitting a Network to Data 243 Example 1: Tiny Dataset 244 Computing Output of Nodes 245 Preprocessing the Data 248 Training the Model 248 Example 2: Classifying Accident Severity 253 Avoiding overfitting 254 Using the Output for Prediction and Classification 258 11.4 Required User Input 258 11.5 Exploring the Relationship Between Predictors and Response 259 11.6 Advantages and Weaknesses of Neural Networks 261 Problems 262 CHAPTER 12 Discriminant Analysis 264 12.1 Introduction 264 Example 1: Riding Mowers 265 Example 2: Personal Loan Acceptance 265 12.2 Distance of an Observation from a Class 267 12.3 Fisher’s Linear Classification Functions 268 12.4 Classification Performance of Discriminant Analysis 272 12.5 Prior Probabilities 273 12.6 Unequal Misclassification Costs 274 12.7 Classifying More Than Two Classes 274 Example 3: Medical Dispatch to Accident Scenes 
274 …
- Edition:
- Third edition
- Publisher Details:
- Hoboken, New Jersey : John Wiley & Sons, Inc
- Publication Date:
- 2016
- Extent:
- 1 online resource
- Subjects:
- 005.54
Business -- Data processing
Data mining
- Languages:
- English
- ISBNs:
- 9781118729243
9781118729137
- Related ISBNs:
- 9781118729274
- Notes:
- Note: Description based on CIP data; item not viewed.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.57350
- Ingest File:
- 02_132.xml