Statistical and machine-learning data mining : techniques for better predictive modeling and analysis of big data. (2012)
- Record Type:
- Book
- Title:
- Statistical and machine-learning data mining : techniques for better predictive modeling and analysis of big data. (2012)
- Main Title:
- Statistical and machine-learning data mining : techniques for better predictive modeling and analysis of big data
- Further Information:
- Note: Bruce Ratner.
- Other Names:
- Ratner, Bruce
- Contents:
- Introduction; The Personal Computer and Statistics; Statistics and Data Analysis; EDA; The EDA Paradigm; EDA Weaknesses; Small and Big Data; Data Mining Paradigm; Statistics and Machine Learning; Statistical Data Mining; References; ; Two Basic Data Mining Methods for Variable Assessment ; Introduction; Correlation Coefficient; Scatterplots; Data Mining; Smoothed Scatterplot; General Association Test; Summary; References; ; CHAID-Based Data Mining for Paired-Variable Assessment ; Introduction; The Scatterplot; The Smooth Scatterplot; Primer on CHAID; CHAID-Based Data Mining for a Smoother Scatterplot; Summary; References; Appendix; ; The Importance of Straight Data: Simplicity and Desirability for Good Model-Building Practice ; Introduction; Straightness and Symmetry in Data; Data Mining Is a High Concept; The Correlation Coefficient; Scatterplot of (xx3, yy3); Data Mining the Relationship of (xx3, yy3); What Is the GP-Based Data Mining Doing to the Data?; Straightening a Handful of Variables and a Dozen of Two Baker’s Dozens of Variables; Summary; References; ; Symmetrizing Ranked Data: A Statistical Data Mining Method for Improving the Predictive Power of Data ; Introduction; Scales of Measurement; Stem-and-Leaf Display; Box-and-Whiskers Plot; Illustration of the Symmetrizing Ranked Data Method; Summary; References; ; Principal Component Analysis: A Statistical Data Mining Method for Many-Variable Assessment; Introduction; EDA Reexpression Paradigm; What Is the Big Deal?; PCA Basics; Exemplary Detailed Illustration; Algebraic Properties of PCA; Uncommon Illustration; PCA in the Construction of a Quasi-Interaction Variable; Summary; ; The Correlation Coefficient: Its Values Range between Plus/Minus 1, or Do They?
; Introduction; Basics of the Correlation Coefficient; Calculation of the Correlation Coefficient; Rematching; Calculation of the Adjusted Correlation Coefficient; Implication of Rematching; Summary; ; Logistic Regression: The Workhorse of Response Modeling ; Introduction; Logistic Regression Model; Case Study; Logits and Logit Plots; The Importance of Straight Data; Reexpressing for Straight; Straight Data for Case Study; Techniques When Bulging Rule Does Not Apply; Reexpressing MOS_OPEN; Assessing the Importance of Variables; Important Variables for Case Study; Relative Importance of the Variables; Best Subset of Variables for Case Study; Visual Indicators of Goodness of Model Predictions; Evaluating the Data Mining Work; Smoothing a Categorical Variable; Additional Data Mining Work for Case Study; Summary; ; Ordinary Regression: The Workhorse of Profit Modeling ; Introduction; Ordinary Regression Model; Mini Case Study; Important Variables for Mini Case Study; Best Subset of Variables for Case Study; Suppressor Variable AGE; Summary; References; ; Variable Selection Methods in Regression: Ignorable Problem, Notable Solution ; Introduction; Background; Frequently Used Variable Selection Methods; Weakness in the Stepwise; Enhanced Variable Selection Method; Exploratory Data Analysis; Summary; References; ; CHAID for Interpreting a Logistic Regression Model ; Introduction; Logistic Regression Model; Database Marketing Response Model Case Study; CHAID; Multivariable CHAID Trees; CHAID Market Segmentation; CHAID Tree Graphs; Summary; ; The Importance of the Regression Coefficient ; Introduction; The Ordinary Regression Model; Four Questions; Important Predictor Variables; P Values and Big Data; Returning to Question 1; Effect of Predictor Variable on Prediction; The Caveat; Returning to Question 2; Ranking Predictor Variables by Effect on Prediction; Returning to Question 3; Returning to Question 4; Summary; References; ; The Average Correlation: A Statistical Data 
Mining Measure for Assessment of Competing Predictive Models and the Importance of the Predictor Variables ; Introduction; Background; Illustration of the Difference between Reliability and Validity; Illustration of the Relationship between Reliability and Validity; The Average Correlation; Summary; Reference; ; CHAID for Specifying a Model with Interaction Variables ; Introduction; Interaction Variables; Strategy for Modeling with Interaction Variables; Strategy Based on the Notion of a Special Point; Example of a Response Model with an Interaction Variable; CHAID for Uncovering Relationships; Illustration of CHAID for Specifying a Model; An Exploratory Look; Database Implication; Summary; References; ; Market Segmentation Classification Modeling with Logistic Regression ; Introduction; Binary Logistic Regression; Polychotomous Logistic Regression Model; Model Building with PLR; Market Segmentation Classification Model; Summary; ; CHAID as a Method for Filling in Missing Values; Introduction; Introduction to the Problem of Missing Data; Missing Data Assumption; CHAID Imputation; Illustration; CHAID Most Likely Category Imputation for a Categorical Variable; Summary; References; ; Identifying Your Best Customers: Descriptive, Predictive, and Look-Alike Profiling ; Introduction; Some Definitions; Illustration of a Flawed Targeting Effort; Well-Defined Targeting Effort; Predictive Profiles; Continuous Trees; Look-Alike Profiling; Look-Alike Tree Characteristics; Summary; ; Assessment of Marketing Models; Introduction; Accuracy for Response Model; Accuracy for Profit Model; Decile Analysis and Cum Lift for Response Model; Decile Analysis and Cum Lift for Profit Model; Precision for Response Model; Precision for Profit Model; Separability for Response and Profit Models; Guidelines for Using Cum Lift, HL/SWMAD, and CV; Summary; ; Bootstrapping in Marketing: A New Approach for Validating Models ; Introduction; Traditional Model Validation; Illustration; Three Questions; 
The Bootstrap; How to Bootstrap; Bootstrap Decile Analysis Validation; Another Question; Bootstrap Assessment of Model Implementation Performance; Summary; References; ; Validating the Logistic Regression Model: Try Bootstrapping ; Introduction; Logistic Regression Model; The Bootstrap Validation Method; Summary; Reference; ; Visualization of Marketing Models: Data Mining to Uncover Innards of a Model ; Introduction; Brief History of the Graph; Star Graph Basics; Star Graphs for Single Variables; Star Graphs for Many Variables Considered Jointly; Profile Curves Method; Illustration; Summary; References; Appendix 1: SAS Code for Star Graphs for Each Demographic Variable about the Deciles; Appendix 2: SAS Code for Star Graphs for Each Decile about the Demographic Variables; Appendix 3: SAS Code for Profile Curves: All Deciles ; ; The Predictive Contribution Coefficient: A Measure of Predictive Importance ; Introduction; Background; Illustration of Decision Rule; Predictive Contribution Coefficient; Calculation of Predictive Contribution Coefficient; Extra Illustration of Predictive Contribution Coefficient; Summary; Reference; ; Regression Modeling Involves Art, Science, and Poetry, Too ; Introduction; Shakespearean Modelogue; Interpretation of the Shakespearean Modelogue; Summary; Reference; ; Genetic and Statistic Regression Models: A Comparison ; Introduction; Background; Objective; A Pithy Summary of the Development of Genetic Programming; The GenIQ Model: A Brief Review of Its Objective and Salient Features …
- Edition:
- 2nd ed
- Publisher Details:
- Place of publication not identified : CRC Press
- Publication Date:
- 2012
- Extent:
- 1 online resource, illustrations
- Subjects:
- 658.872
Database marketing -- Statistical methods
Data mining -- Statistical methods
- Languages:
- English
- ISBNs:
- 9781466551213
1466551216
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.145524
- Ingest File:
- 02_092.xml