Data science for business: what you need to know about data mining and data-analytic thinking - Provost, Foster - Sebastopol O'Reilly 2013 - xxi, 386 p.

Table of Contents:

Chapter 1 Introduction: Data-Analytic Thinking

1.The Ubiquity of Data Opportunities

2.Example: Hurricane Frances

3.Example: Predicting Customer Churn

4.Data Science, Engineering, and Data-Driven Decision Making

5.Data Processing and “Big Data”

6.From Big Data 1.0 to Big Data 2.0

7.Data and Data Science Capability as a Strategic Asset

8.Data-Analytic Thinking

9.This Book

10.Data Mining and Data Science, Revisited

11.Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist

12.Summary

Chapter 2 Business Problems and Data Science Solutions

13.From Business Problems to Data Mining Tasks

14.Supervised Versus Unsupervised Methods

15.Data Mining and Its Results

16.The Data Mining Process

17.Implications for Managing the Data Science Team

18.Other Analytics Techniques and Technologies

19.Summary

Chapter 3 Introduction to Predictive Modeling: From Correlation to Supervised Segmentation

20.Models, Induction, and Prediction

21.Supervised Segmentation

22.Visualizing Segmentations

23.Trees as Sets of Rules

24.Probability Estimation

25.Example: Addressing the Churn Problem with Tree Induction

26.Summary

Chapter 4 Fitting a Model to Data

27.Classification via Mathematical Functions

28.Regression via Mathematical Functions

29.Class Probability Estimation and Logistic “Regression”

30.Example: Logistic Regression versus Tree Induction

31.Nonlinear Functions, Support Vector Machines, and Neural Networks

32.Summary

Chapter 5 Overfitting and Its Avoidance

1.Generalization

2.Overfitting

3.Overfitting Examined

4.Example: Overfitting Linear Functions

5.* Example: Why Is Overfitting Bad?

6.From Holdout Evaluation to Cross-Validation

7.The Churn Dataset Revisited

8.Learning Curves

9.Overfitting Avoidance and Complexity Control

10.Summary

Chapter 6 Similarity, Neighbors, and Clusters

11.Similarity and Distance

12.Nearest-Neighbor Reasoning

13.Some Important Technical Details Relating to Similarities and Neighbors

14.Clustering

15.Stepping Back: Solving a Business Problem Versus Data Exploration

16.Summary

Chapter 7 Decision Analytic Thinking I: What Is a Good Model?

17.Evaluating Classifiers

18.Generalizing Beyond Classification

19.A Key Analytical Framework: Expected Value

20.Evaluation, Baseline Performance, and Implications for Investments in Data

21.Summary

Chapter 8 Visualizing Model Performance

22.Ranking Instead of Classifying

23.Profit Curves

24.ROC Graphs and Curves

25.The Area Under the ROC Curve (AUC)

26.Cumulative Response and Lift Curves

27.Example: Performance Analytics for Churn Modeling

28.Summary

Chapter 9 Evidence and Probabilities

29.Example: Targeting Online Consumers With Advertisements

30.Combining Evidence Probabilistically

31.Applying Bayes’ Rule to Data Science

32.A Model of Evidence “Lift”

33.Example: Evidence Lifts from Facebook "Likes"

34.Summary

Chapter 10 Representing and Mining Text

1.Why Text Is Important

2.Why Text Is Difficult

3.Representation

4.Example: Jazz Musicians

5.* The Relationship of IDF to Entropy

6.Beyond Bag of Words

7.Example: Mining News Stories to Predict Stock Price Movement

8.Summary

Chapter 11 Decision Analytic Thinking II: Toward Analytical Engineering

9.Targeting the Best Prospects for a Charity Mailing

10.Our Churn Example Revisited with Even More Sophistication

Chapter 12 Other Data Science Tasks and Techniques

11.Co-occurrences and Associations: Finding Items That Go Together

12.Profiling: Finding Typical Behavior

13.Link Prediction and Social Recommendation

14.Data Reduction, Latent Information, and Movie Recommendation

15.Bias, Variance, and Ensemble Methods

16.Data-Driven Causal Explanation and a Viral Marketing Example

17.Summary

Chapter 13 Data Science and Business Strategy

18.Thinking Data-Analytically, Redux

19.Achieving Competitive Advantage with Data Science

20.Sustaining Competitive Advantage with Data Science

21.Attracting and Nurturing Data Scientists and Their Teams

22.Examine Data Science Case Studies

23.Be Ready to Accept Creative Ideas from Any Source

24.Be Ready to Evaluate Proposals for Data Science Projects

25.A Firm’s Data Science Maturity

Chapter 14 Conclusion

26.The Fundamental Concepts of Data Science

27.What Data Can’t Do: Humans in the Loop, Revisited

28.Privacy, Ethics, and Mining Data About Individuals

29.Is There More to Data Science?

30.Final Example: From Crowd-Sourcing to Cloud-Sourcing

31.Final Words

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.

Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.

1. Understand how data science fits in your organization—and how you can use it for competitive advantage

2. Treat data as a business asset that requires careful investment if you’re to gain real value

3. Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way

4. Learn general concepts for actually extracting knowledge from data

5. Apply data science principles when interviewing data science job candidates

(http://shop.oreilly.com/product/0636920028918.do)

9781449361327

Data mining

Big data

Information science

Business - Data processing

Data Mining

Commerce

Automatic Data Processing

005.74 / P7D2