Detailed Information

Machine learning in Python : essential techniques for predictive analysis (borrowed 12 times)

Material type
Monograph
Personal author
Bowles, Michael.
Title / Statement of responsibility
Machine learning in Python : essential techniques for predictive analysis / Michael Bowles.
Publication
Indianapolis, IN : Wiley, c2015.
Physical description
xxix, 326 p. : ill. ; 24 cm.
ISBN
9781118961742
Bibliography note
Includes bibliographical references and index.
Subject headings
Machine learning. Python (Computer program language).
000 00000cam u2200205 a 4500
001 000045848412
005 20151030093048
008 151029s2015 inua b 001 0 eng d
020 ▼a 9781118961742
035 ▼a (KERIS)BIB000013763368
040 ▼a 211046 ▼c 211046 ▼d 211009
082 0 4 ▼a 006.31 ▼2 23
084 ▼a 006.31 ▼2 DDCK
090 ▼a 006.31 ▼b B787m
100 1 ▼a Bowles, Michael.
245 1 0 ▼a Machine learning in Python : ▼b essential techniques for predictive analysis / ▼c Michael Bowles.
260 ▼a Indianapolis, IN : ▼b Wiley, ▼c c2015.
300 ▼a xxix, 326 p. : ▼b ill. ; ▼c 24 cm.
504 ▼a Includes bibliographical references and index.
650 0 ▼a Machine learning.
650 0 ▼a Python (Computer program language).
945 ▼a KLPA

Holdings Information

No. 1
Location: Science Library/Sci-Info (2nd-floor stacks)
Call number: 006.31 B787m
Accession number: 121234605
Status: On loan
Due date: 2021-11-04
Reservation: Available

Content Information

Table of Contents

Introduction xxiii

Chapter 1 The Two Essential Algorithms for Making Predictions 1

Why Are These Two Algorithms So Useful? 2

What Are Penalized Regression Methods? 7

What Are Ensemble Methods? 9

How to Decide Which Algorithm to Use 11

The Process Steps for Building a Predictive Model 13

Framing a Machine Learning Problem 15

Feature Extraction and Feature Engineering 17

Determining Performance of a Trained Model 18

Chapter Contents and Dependencies 18

Summary 20

Chapter 2 Understand the Problem by Understanding the Data 23

The Anatomy of a New Problem 24

Different Types of Attributes and Labels Drive Modeling Choices 26

Things to Notice about Your New Data Set 27

Classification Problems: Detecting Unexploded Mines Using Sonar 28

Physical Characteristics of the Rocks Versus Mines Data Set 29

Statistical Summaries of the Rocks versus Mines Data Set 32

Visualization of Outliers Using Quantile-Quantile Plot 35

Statistical Characterization of Categorical Attributes 37

How to Use Python Pandas to Summarize the Rocks Versus Mines Data Set 37

Visualizing Properties of the Rocks versus Mines Data Set 40

Visualizing with Parallel Coordinates Plots 40

Visualizing Interrelationships between Attributes and Labels 42

Visualizing Attribute and Label Correlations Using a Heat Map 49

Summarizing the Process for Understanding Rocks versus Mines Data Set 50

Real-Valued Predictions with Factor Variables: How Old Is Your Abalone? 50

Parallel Coordinates for Regression Problems—Visualize Variable Relationships for Abalone Problem 56

How to Use Correlation Heat Map for Regression—Visualize Pair-Wise Correlations for the Abalone Problem 60

Real-Valued Predictions Using Real-Valued Attributes: Calculate How Your Wine Tastes 62

Multiclass Classification Problem: What Type of Glass Is That? 68

Summary 73

Chapter 3 Predictive Model Building: Balancing Performance, Complexity, and Big Data 75

The Basic Problem: Understanding Function Approximation 76

Working with Training Data 76

Assessing Performance of Predictive Models 78

Factors Driving Algorithm Choices and Performance—Complexity and Data 79

Contrast Between a Simple Problem and a Complex Problem 80

Contrast Between a Simple Model and a Complex Model 82

Factors Driving Predictive Algorithm Performance 86

Choosing an Algorithm: Linear or Nonlinear? 87

Measuring the Performance of Predictive Models 88

Performance Measures for Different Types of Problems 88

Simulating Performance of Deployed Models 99

Achieving Harmony Between Model and Data 101

Choosing a Model to Balance Problem Complexity, Model Complexity, and Data Set Size 102

Using Forward Stepwise Regression to Control Overfitting 103

Evaluating and Understanding Your Predictive Model 108

Control Overfitting by Penalizing Regression Coefficients—Ridge Regression 110

Summary 119

Chapter 4 Penalized Linear Regression 121

Why Penalized Linear Regression Methods Are So Useful 122

Extremely Fast Coefficient Estimation 122

Variable Importance Information 122

Extremely Fast Evaluation When Deployed 123

Reliable Performance 123

Sparse Solutions 123

Problem May Require Linear Model 124

When to Use Ensemble Methods 124

Penalized Linear Regression: Regulating Linear Regression for Optimum Performance 124

Training Linear Models: Minimizing Errors and More 126

Adding a Coefficient Penalty to the OLS Formulation 127

Other Useful Coefficient Penalties—Manhattan and ElasticNet 128

Why Lasso Penalty Leads to Sparse Coefficient Vectors 129

ElasticNet Penalty Includes Both Lasso and Ridge 131

Solving the Penalized Linear Regression Problem 132

Understanding Least Angle Regression and Its Relationship to Forward Stepwise Regression 132

How LARS Generates Hundreds of Models of Varying Complexity 136

Choosing the Best Model from The Hundreds LARS Generates 139

Using Glmnet: Very Fast and Very General 144

Comparison of the Mechanics of Glmnet and LARS Algorithms 145

Initializing and Iterating the Glmnet Algorithm 146

Extensions to Linear Regression with Numeric Input 151

Solving Classification Problems with Penalized Regression 151

Working with Classification Problems Having More Than Two Outcomes 155

Understanding Basis Expansion: Using Linear Methods on Nonlinear Problems 156

Incorporating Non-Numeric Attributes into Linear Methods 158

Summary 163

Chapter 5 Building Predictive Models Using Penalized Linear Methods 165

Python Packages for Penalized Linear Regression 166

Multivariable Regression: Predicting Wine Taste 167

Building and Testing a Model to Predict Wine Taste 168

Training on the Whole Data Set before Deployment 172

Basis Expansion: Improving Performance by Creating New Variables from Old Ones 178

Binary Classification: Using Penalized Linear Regression to Detect Unexploded Mines 181

Build a Rocks versus Mines Classifier for Deployment 191

Multiclass Classification: Classifying Crime Scene Glass Samples 204

Summary 209

Chapter 6 Ensemble Methods 211

Binary Decision Trees 212

How a Binary Decision Tree Generates Predictions 213

How to Train a Binary Decision Tree 214

Tree Training Equals Split Point Selection 218

How Split Point Selection Affects Predictions 218

Algorithm for Selecting Split Points 219

Multivariable Tree Training—Which Attribute to Split? 219

Recursive Splitting for More Tree Depth 220

Overfitting Binary Trees 221

Measuring Overfit with Binary Trees 221

Balancing Binary Tree Complexity for Best Performance 222

Modifications for Classification and Categorical Features 225

Bootstrap Aggregation: “Bagging” 226

How Does the Bagging Algorithm Work? 226

Bagging Performance—Bias versus Variance 229

How Bagging Behaves on Multivariable Problem 231

Bagging Needs Tree Depth for Performance 235

Summary of Bagging 236

Gradient Boosting 236

Basic Principle of Gradient Boosting Algorithm 237

Parameter Settings for Gradient Boosting 239

How Gradient Boosting Iterates Toward a Predictive Model 240

Getting the Best Performance from Gradient Boosting 240

Gradient Boosting on a Multivariable Problem 244

Summary for Gradient Boosting 247

Random Forest 247

Random Forests: Bagging Plus Random Attribute Subsets 250

Random Forests Performance Drivers 251

Random Forests Summary 252

Summary 252

Chapter 7 Building Ensemble Models with Python 255

Solving Regression Problems with Python Ensemble Packages 255

Building a Random Forest Model to Predict Wine Taste 256

Constructing a Random Forest Regressor Object 256

Modeling Wine Taste with Random Forest Regressor 259

Visualizing the Performance of a Random Forests Regression Model 262

Using Gradient Boosting to Predict Wine Taste 263

Using the Class Constructor for Gradient Boosting Regressor 263

Using Gradient Boosting Regressor to Implement a Regression Model 267

Assessing the Performance of a Gradient Boosting Model 269

Coding Bagging to Predict Wine Taste 270

Incorporating Non-Numeric Attributes in Python Ensemble Models 275

Coding the Sex of Abalone for Input to Random Forest Regression in Python 275

Assessing Performance and the Importance of Coded Variables 278

Coding the Sex of Abalone for Gradient Boosting Regression in Python 278

Assessing Performance and the Importance of Coded Variables with Gradient Boosting 282

Solving Binary Classification Problems with Python Ensemble Methods 284

Detecting Unexploded Mines with Python Random Forest 285

Constructing a Random Forests Model to Detect Unexploded Mines 287

Determining the Performance of a Random Forests Classifier 291

Detecting Unexploded Mines with Python Gradient Boosting 291

Determining the Performance of a Gradient Boosting Classifier 298

Solving Multiclass Classification Problems with Python Ensemble Methods 302

Classifying Glass with Random Forests 302

Dealing with Class Imbalances 305

Classifying Glass Using Gradient Boosting 307

Assessing the Advantage of Using Random Forest Base Learners with Gradient Boosting 311

Comparing Algorithms 314

Summary 315

Index 319


Information provided by: Aladin
