Provost, Foster

Data science for business: what you need to know about data mining and data-analytic thinking - Provost, Foster - Sebastopol O'Reilly 2013 - xxi, 386 p.

Table of Contents:

Chapter 1 Introduction: Data-Analytic Thinking

1.The Ubiquity of Data Opportunities
2.Example: Hurricane Frances
3.Example: Predicting Customer Churn
4.Data Science, Engineering, and Data-Driven Decision Making
5.Data Processing and “Big Data”
6.From Big Data 1.0 to Big Data 2.0
7.Data and Data Science Capability as a Strategic Asset
8.Data-Analytic Thinking
9.This Book
10.Data Mining and Data Science, Revisited
11.Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist

Chapter 2 Business Problems and Data Science Solutions

13.From Business Problems to Data Mining Tasks
14.Supervised Versus Unsupervised Methods
15.Data Mining and Its Results
16.The Data Mining Process
17.Implications for Managing the Data Science Team
18.Other Analytics Techniques and Technologies

Chapter 3 Introduction to Predictive Modeling: From Correlation to Supervised Segmentation

20.Models, Induction, and Prediction
21.Supervised Segmentation
22.Visualizing Segmentations
23.Trees as Sets of Rules
24.Probability Estimation
25.Example: Addressing the Churn Problem with Tree Induction

Chapter 4 Fitting a Model to Data

27.Classification via Mathematical Functions
28.Regression via Mathematical Functions
29.Class Probability Estimation and Logistic “Regression”
30.Example: Logistic Regression versus Tree Induction
31.Nonlinear Functions, Support Vector Machines, and Neural Networks

Chapter 5 Overfitting and Its Avoidance

3.Overfitting Examined
4.Example: Overfitting Linear Functions
5.* Example: Why Is Overfitting Bad?
6.From Holdout Evaluation to Cross-Validation
7.The Churn Dataset Revisited
8.Learning Curves
9.Overfitting Avoidance and Complexity Control

Chapter 6 Similarity, Neighbors, and Clusters

11.Similarity and Distance
12.Nearest-Neighbor Reasoning
13.Some Important Technical Details Relating to Similarities and Neighbors
15.Stepping Back: Solving a Business Problem Versus Data Exploration

Chapter 7 Decision Analytic Thinking I: What Is a Good Model?

17.Evaluating Classifiers
18.Generalizing Beyond Classification
19.A Key Analytical Framework: Expected Value
20.Evaluation, Baseline Performance, and Implications for Investments in Data

Chapter 8 Visualizing Model Performance

22.Ranking Instead of Classifying
23.Profit Curves
24.ROC Graphs and Curves
25.The Area Under the ROC Curve (AUC)
26.Cumulative Response and Lift Curves
27.Example: Performance Analytics for Churn Modeling

Chapter 9 Evidence and Probabilities

29.Example: Targeting Online Consumers With Advertisements
30.Combining Evidence Probabilistically
31.Applying Bayes’ Rule to Data Science
32.A Model of Evidence “Lift”
33.Example: Evidence Lifts from Facebook “Likes”

Chapter 10 Representing and Mining Text

1.Why Text Is Important
2.Why Text Is Difficult
4.Example: Jazz Musicians
5.* The Relationship of IDF to Entropy
6.Beyond Bag of Words
7.Example: Mining News Stories to Predict Stock Price Movement

Chapter 11 Decision Analytic Thinking II: Toward Analytical Engineering

9.Targeting the Best Prospects for a Charity Mailing
10.Our Churn Example Revisited with Even More Sophistication

Chapter 12 Other Data Science Tasks and Techniques

11.Co-occurrences and Associations: Finding Items That Go Together
12.Profiling: Finding Typical Behavior
13.Link Prediction and Social Recommendation
14.Data Reduction, Latent Information, and Movie Recommendation
15.Bias, Variance, and Ensemble Methods
16.Data-Driven Causal Explanation and a Viral Marketing Example

Chapter 13 Data Science and Business Strategy

18.Thinking Data-Analytically, Redux
19.Achieving Competitive Advantage with Data Science
20.Sustaining Competitive Advantage with Data Science
21.Attracting and Nurturing Data Scientists and Their Teams
22.Examine Data Science Case Studies
23.Be Ready to Accept Creative Ideas from Any Source
24.Be Ready to Evaluate Proposals for Data Science Projects
25.A Firm’s Data Science Maturity

Chapter 14 Conclusion

26.The Fundamental Concepts of Data Science
27.What Data Can’t Do: Humans in the Loop, Revisited
28.Privacy, Ethics, and Mining Data About Individuals
29.Is There More to Data Science?
30.Final Example: From Crowd-Sourcing to Cloud-Sourcing
31.Final Words

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.
Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
1. Understand how data science fits in your organization—and how you can use it for competitive advantage
2. Treat data as a business asset that requires careful investment if you’re to gain real value
3. Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
4. Learn general concepts for actually extracting knowledge from data
5. Apply data science principles when interviewing data science job candidates



Data mining
Big data
Information science
Business - Data processing
Data Mining
Automatic Data Processing

005.74 / P7D2
