Machine Learning with R. (2015)
- Record Type:
- Book
- Title:
- Machine Learning with R. (2015)
- Main Title:
- Machine Learning with R.
- Further Information:
- Note: Brett Lantz.
- Other Names:
- Lantz, Brett
- Contents:
- Cover -- Copyright -- Credits -- About the Author -- About the Reviewers -- www.PacktPub.com -- Table of Contents -- Preface -- Chapter 1: Introducing Machine Learning -- The origins of machine learning -- Uses and abuses of machine learning -- Machine learning successes -- The limits of machine learning -- Machine learning ethics -- How machines learn -- Data storage -- Abstraction -- Generalization -- Evaluation -- Machine learning in practice -- Types of input data -- Types of machine learning algorithms -- Matching input data to algorithms -- Machine learning with R -- Installing R packages -- Loading and unloading R packages -- Summary -- Chapter 2: Managing and Understanding Data -- R data structures -- Vectors -- Factors -- Lists -- Data frames -- Matrixes and arrays -- Managing data with R -- Saving, loading, and removing R data structures -- Importing and saving data from CSV files -- Exploring and understanding data -- Exploring the structure of data -- Exploring numeric variables -- Measuring the central tendency -- mean and median -- Measuring spread -- quartiles and the five-number summary -- Visualizing numeric variables -- boxplots -- Visualizing numeric variables -- histograms -- Understanding numeric data -- uniform and normal distributions -- Measuring spread -- variance and standard deviation -- Exploring categorical variables -- Measuring the central tendency -- the mode -- Exploring relationships between variables -- Visualizing relationships -- scatterplots -- Examining relationships -- two-way cross-tabulations -- Summary -- Chapter 3: Lazy Learning -- Classification Using Nearest Neighbors -- Understanding nearest neighbor classification -- The k-NN algorithm -- Measuring similarity with distance -- Choosing an appropriate k -- Preparing data for use with k-NN -- Why is the k-NN algorithm lazy?.
Example -- Diagnosing breast cancer with the k-NN algorithm -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Transformation -- normalizing numeric data -- Data preparation -- creating training and test datasets -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Transformation -- z-score standardization -- Testing alternative values of k -- Summary -- Chapter 4: Probabilistic Learning -- Classification Using Naive Bayes -- Understanding Naive Bayes -- Basic concepts of Bayesian methods -- Understanding probability -- Understanding joint probability -- Computing conditional probability with Bayes' theorem -- The Naive Bayes algorithm -- Classification with Naive Bayes -- The Laplace estimator -- Using numeric features with Naive Bayes -- Example -- filtering mobile phone spam with the Naive Bayes algorithm -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Data preparation -- cleaning and standardizing text data -- Data preparation -- splitting text documents into words -- Data preparation -- creating training and test datasets -- Visualizing text data -- word clouds -- Data preparation -- creating indicator features for frequent words -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Summary -- Chapter 5: Divide and Conquer -- Classification Using Decision Trees and Rules -- Understanding decision trees -- Divide and conquer -- The C5.0 decision tree algorithm -- Choosing the best split -- Pruning the decision tree -- Example -- identifying risky bank loans using C5.0 decision trees -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Data preparation -- creating random training and test datasets -- Step 3 -- training a model on the data. 
Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Boosting the accuracy of decision trees -- Making some mistakes costlier than others -- Understanding classification rules -- Separate and conquer -- The 1R algorithm -- The RIPPER algorithm -- Rules from decision trees -- What makes trees and rules greedy? -- Example -- identifying poisonous mushrooms with rule learners -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Summary -- Chapter 6: Forecasting Numeric Data -- Regression Methods -- Understanding regression -- Simple linear regression -- Ordinary least squares estimation -- Correlations -- Multiple linear regression -- Example -- predicting medical expenses using linear regression -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Exploring relationships among features -- the correlation matrix -- Visualizing relationships among features -- the scatterplot matrix -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Model specification -- adding non-linear relationships -- Transformation -- converting a numeric variable to a binary indicator -- Model specification -- adding interaction effects -- Putting it all together -- an improved regression model -- Understanding regression trees and model trees -- Adding regression to trees -- Example -- estimating the quality of wines with regression trees and model trees -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Step 3 -- training a model on the data -- Visualizing decision trees -- Step 4 -- evaluating model performance -- Measuring performance with the mean absolute error.
Step 5 -- improving model performance -- Summary -- Chapter 7: Black Box Methods -- Neural Networks and Support Vector Machines -- Understanding neural networks -- From biological to artificial neurons -- Activation functions -- Network topology -- The number of layers -- The direction of information travel -- The number of nodes in each layer -- Training neural networks with backpropagation -- Example -- Modeling the strength of concrete with ANNs -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Understanding Support Vector Machines -- Classification with hyperplanes -- The case of linearly separable data -- The case of nonlinearly separable data -- Using kernels for nonlinear spaces -- Example -- performing OCR with SVMs -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Chapter 8: Finding Patterns -- Market Basket Analysis Using Association Rules -- Understanding association rules -- The Apriori algorithm for association rule learning -- Measuring rule interest -- support and confidence -- Building a set of rules with the Apriori principle -- Example -- identifying frequently purchased groceries with association rules -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Data preparation -- creating a sparse matrix for transaction data -- Visualizing item support -- item frequency plots -- Visualizing the transaction data -- plotting the sparse matrix -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Sorting the set of association rules. 
Taking subsets of association rules -- Saving association rules to a file or data frame -- Summary -- Chapter 9: Finding Groups of Data -- Clustering with k-means -- Understanding clustering -- Clustering as a machine learning task -- The k-means clustering algorithm -- Using distance to assign and update clusters -- Choosing the appropriate number of clusters -- Example -- finding teen market segments using k-means clustering -- Step 1 -- collecting data -- Step 2 -- exploring and preparing the data -- Data preparation -- dummy coding missing values -- Data preparation -- imputing the missing values -- Step 3 -- training a model on the data -- Step 4 -- evaluating model performance -- Step 5 -- improving model performance -- Summary -- Chapter 10: Evaluating Model Performance -- Measuring performance for classification -- Working with classification prediction data in R -- A closer look at confusion matrices -- Using confusion matrices to measure performance -- Beyond accuracy -- other measures of performance -- The kappa statistic -- Sensitivity and specificity -- Precision and recall -- The F-measure -- Visualizing performance trade-offs -- ROC curves -- Estimating future performance -- The holdout method -- Cross-validation -- Bootstrap sampling -- Summary -- Chapter 11: Improving Model Performance -- Tuning stock models for better performance -- Using caret for automated parameter tuning -- Creating a simple tuned model -- Customizing the tuning process -- Improving model performance with meta-learning -- Understanding ensembles -- Bagging -- Boosting -- Random forests -- Training random forests -- Evaluating random forest performance -- Summary -- Chapter 12: Specialized Machine Learning Topics -- Working with proprietary files and databases -- Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files -- Querying data in SQL databases. …
- Edition:
- 2nd ed
- Publisher Details:
- Birmingham : Packt Publishing
- Publication Date:
- 2015
- Copyright Date:
- 2015
- Extent:
- 1 online resource (452 pages)
- Subjects:
- 005.8
Application software
Mathematical statistics -- Data processing
R (Computer program language)
Machine learning -- Statistical methods
MATHEMATICS -- Applied
MATHEMATICS -- Probability & Statistics -- General
Electronic books
- Languages:
- English
- ISBNs:
- 9781784394523
1784394521
- Related ISBNs:
- 9781784393908
- Notes:
- Note: Description based on publisher supplied metadata and other sources.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.435223
- Ingest File:
- 02_555.xml