06912 a2200265 4500
160708b2013 xxu||||| |||| 00| 0 eng d
9781449361327
005.74
P7D2
Provost, Foster
334566
Data science for business: what you need to know about data mining and data-analytic thinking
Provost, Foster
Sebastopol
O'Reilly
2013
xxi, 386 p.
Table of Contents:
Chapter 1 Introduction: Data-Analytic Thinking
1.The Ubiquity of Data Opportunities
2.Example: Hurricane Frances
3.Example: Predicting Customer Churn
4.Data Science, Engineering, and Data-Driven Decision Making
5.Data Processing and “Big Data”
6.From Big Data 1.0 to Big Data 2.0
7.Data and Data Science Capability as a Strategic Asset
8.Data-Analytic Thinking
9.This Book
10.Data Mining and Data Science, Revisited
11.Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
12.Summary
Chapter 2 Business Problems and Data Science Solutions
13.From Business Problems to Data Mining Tasks
14.Supervised Versus Unsupervised Methods
15.Data Mining and Its Results
16.The Data Mining Process
17.Implications for Managing the Data Science Team
18.Other Analytics Techniques and Technologies
19.Summary
Chapter 3 Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
20.Models, Induction, and Prediction
21.Supervised Segmentation
22.Visualizing Segmentations
23.Trees as Sets of Rules
24.Probability Estimation
25.Example: Addressing the Churn Problem with Tree Induction
26.Summary
Chapter 4 Fitting a Model to Data
27.Classification via Mathematical Functions
28.Regression via Mathematical Functions
29.Class Probability Estimation and Logistic “Regression”
30.Example: Logistic Regression versus Tree Induction
31.Nonlinear Functions, Support Vector Machines, and Neural Networks
32.Summary
Chapter 5 Overfitting and Its Avoidance
1.Generalization
2.Overfitting
3.Overfitting Examined
4.Example: Overfitting Linear Functions
5.* Example: Why Is Overfitting Bad?
6.From Holdout Evaluation to Cross-Validation
7.The Churn Dataset Revisited
8.Learning Curves
9.Overfitting Avoidance and Complexity Control
10.Summary
Chapter 6 Similarity, Neighbors, and Clusters
11.Similarity and Distance
12.Nearest-Neighbor Reasoning
13.Some Important Technical Details Relating to Similarities and Neighbors
14.Clustering
15.Stepping Back: Solving a Business Problem Versus Data Exploration
16.Summary
Chapter 7 Decision Analytic Thinking I: What Is a Good Model?
17.Evaluating Classifiers
18.Generalizing Beyond Classification
19.A Key Analytical Framework: Expected Value
20.Evaluation, Baseline Performance, and Implications for Investments in Data
21.Summary
Chapter 8 Visualizing Model Performance
22.Ranking Instead of Classifying
23.Profit Curves
24.ROC Graphs and Curves
25.The Area Under the ROC Curve (AUC)
26.Cumulative Response and Lift Curves
27.Example: Performance Analytics for Churn Modeling
28.Summary
Chapter 9 Evidence and Probabilities
29.Example: Targeting Online Consumers With Advertisements
30.Combining Evidence Probabilistically
31.Applying Bayes’ Rule to Data Science
32.A Model of Evidence “Lift”
33.Example: Evidence Lifts from Facebook "Likes"
34.Summary
Chapter 10 Representing and Mining Text
1.Why Text Is Important
2.Why Text Is Difficult
3.Representation
4.Example: Jazz Musicians
5.* The Relationship of IDF to Entropy
6.Beyond Bag of Words
7.Example: Mining News Stories to Predict Stock Price Movement
8.Summary
Chapter 11 Decision Analytic Thinking II: Toward Analytical Engineering
9.Targeting the Best Prospects for a Charity Mailing
10.Our Churn Example Revisited with Even More Sophistication
Chapter 12 Other Data Science Tasks and Techniques
11.Co-occurrences and Associations: Finding Items That Go Together
12.Profiling: Finding Typical Behavior
13.Link Prediction and Social Recommendation
14.Data Reduction, Latent Information, and Movie Recommendation
15.Bias, Variance, and Ensemble Methods
16.Data-Driven Causal Explanation and a Viral Marketing Example
17.Summary
Chapter 13 Data Science and Business Strategy
18.Thinking Data-Analytically, Redux
19.Achieving Competitive Advantage with Data Science
20.Sustaining Competitive Advantage with Data Science
21.Attracting and Nurturing Data Scientists and Their Teams
22.Examine Data Science Case Studies
23.Be Ready to Accept Creative Ideas from Any Source
24.Be Ready to Evaluate Proposals for Data Science Projects
25.A Firm’s Data Science Maturity
Chapter 14 Conclusion
26.The Fundamental Concepts of Data Science
27.What Data Can’t Do: Humans in the Loop, Revisited
28.Privacy, Ethics, and Mining Data About Individuals
29.Is There More to Data Science?
30.Final Example: From Crowd-Sourcing to Cloud-Sourcing
31.Final Words
Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.
Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
1. Understand how data science fits in your organization—and how you can use it for competitive advantage
2. Treat data as a business asset that requires careful investment if you’re to gain real value
3. Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
4. Learn general concepts for actually extracting knowledge from data
5. Apply data science principles when interviewing data science job candidates
(http://shop.oreilly.com/product/0636920028918.do)
Data mining
56428
Big data
334567
Information science
62
Business - Data processing
22464
Commerce
4607
Automatic Data Processing
334568
Fawcett, Tom
334569
ddc
BK
204084
204084
0
0
ddc
0
005_740000000000000_P7D2
0
NFIC
342475
VSL
VSL
GEN
2016-07-12
22
2112.03
4
11
005.74 P7D2
192476
2020-02-25
2017-11-16
2017-11-16
2829.00
BK