
Data mining algorithms: explained using R

By: Cichosz, Pawel.
Material type: Book
Publisher: United Kingdom: Wiley, 2015
Description: xxxi, 683 p.
ISBN: 9781118332580
Subject(s): Mathematics - Probability and statistics - General | Data mining | Computer algorithms | R - Computer program language
DDC classification: 006.312
Summary: Data Mining Algorithms is a practical, technically oriented guide to data mining algorithms that covers the most important algorithms for building classification, regression, and clustering models, as well as techniques used for attribute selection and transformation, model quality evaluation, and creating model ensembles. The author presents many of the important topics and methodologies widely used in data mining, whilst demonstrating the internal operation and usage of data mining algorithms using examples in R. (http://www.wiley.com/WileyCDA/WileyTitle/productCd-111833258X.html)
Item type: Books
Current location: Vikram Sarabhai Library (General Stacks)
Collection: Non-fiction
Call number: 006.312 C4D2
Status: Checked out (date due: 12/11/2019)
Barcode: 192621

Table of Contents:

Part I Preliminaries

1.Tasks
1.1.Introduction
1.1.1.Knowledge
1.1.2.Inference
1.2.Inductive learning tasks
1.2.1.Domain
1.2.2.Instances
1.2.3.Attributes
1.2.4.Target attribute
1.2.5.Input attributes
1.2.6.Training set
1.2.7.Model
1.2.8.Performance
1.2.9.Generalization
1.2.10.Overfitting
1.2.11.Algorithms
1.2.12.Inductive learning as search
1.3.Classification
1.3.1.Concept
1.3.2.Training set
1.3.3.Model
1.3.4.Performance
1.3.5.Generalization
1.3.6.Overfitting
1.3.7.Algorithms
1.4.Regression
1.4.1.Target function
1.4.2.Training set
1.4.3.Model
1.4.4.Performance
1.4.5.Generalization
1.4.6.Overfitting
1.4.7.Algorithms
1.5.Clustering
1.5.1.Motivation
1.5.2.Training set
1.5.3.Model
1.5.4.Crisp vs. soft clustering
1.5.5.Hierarchical clustering
1.5.6.Performance
1.5.7.Generalization
1.5.8.Algorithms
1.5.9.Descriptive vs. predictive clustering
1.6.Practical issues
1.6.1.Incomplete data
1.6.2.Noisy data
1.7.Conclusion
1.8.Further readings
References

2.Basic statistics
2.1.Introduction
2.2.Notational conventions
2.3.Basic statistics as modeling
2.4.Distribution description
2.4.1.Continuous attributes
2.4.2.Discrete attributes
2.4.3.Confidence intervals
2.4.4.m-Estimation
2.5.Relationship detection
2.5.1.Significance tests
2.5.2.Continuous attributes
2.5.3.Discrete attributes
2.5.4.Mixed attributes
2.5.5.Relationship detection caveats
2.6.Visualization
2.6.1.Boxplot
2.6.2.Histogram
2.6.3.Barplot
2.7.Conclusion
2.8.Further readings
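
To give a flavour of the distribution description, relationship detection, and visualization topics in this chapter, here is a minimal base-R sketch (illustrative only, not code from the book, using the built-in iris data rather than the book's datasets):

    ## Distribution description for a continuous attribute
    summary(iris$Sepal.Length)      # five-number summary and mean
    sd(iris$Sepal.Length)           # standard deviation

    ## Relationship detection between a continuous and a discrete attribute
    anova(lm(Sepal.Length ~ Species, data = iris))   # one-way ANOVA F-test

    ## Visualization
    boxplot(Sepal.Length ~ Species, data = iris)
    hist(iris$Sepal.Length, breaks = 12)
    barplot(table(iris$Species))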

Part II Classification

3.Decision trees
3.1.Introduction
3.2.Decision tree model
3.2.1.Nodes and branches
3.2.2.Leaves
3.2.3.Split types
3.3.Growing
3.3.1.Algorithm outline
3.3.2.Class distribution calculation
3.3.3.Class label assignment
3.3.4.Stop criteria
3.3.5.Split selection
3.3.6.Split application
3.3.7.Complete process
3.4.Pruning
3.4.1.Pruning operators
3.4.2.Pruning criterion
3.4.3.Pruning control strategy
3.4.4.Conversion to rule sets
3.5.Prediction
3.5.1.Class label prediction
3.5.2.Class probability prediction
3.6.Weighted instances
3.7.Missing value handling
3.7.1.Fractional instances
3.7.2.Surrogate splits
3.8.Conclusion
3.9.Further readings
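
The book presents its own implementation of growing, pruning, and prediction for decision trees; as a rough stand-in (an assumption, not the author's code), the CRAN rpart package covers the same workflow on the built-in iris data:

    library(rpart)

    ## Grow a classification tree
    tree <- rpart(Species ~ ., data = iris, method = "class",
                  control = rpart.control(minsplit = 10, cp = 0.001))

    ## Prune using the cross-validated complexity parameter table
    best.cp <- tree$cptable[which.min(tree$cptable[, "xerror"]), "CP"]
    pruned  <- prune(tree, cp = best.cp)

    ## Class label and class probability prediction
    predict(pruned, iris[1:5, ], type = "class")
    predict(pruned, iris[1:5, ], type = "prob")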

4.Naive Bayes classifier
4.1.Introduction
4.2.Bayes rule
4.3.Classification by Bayesian inference
4.3.1.Conditional class probability
4.3.2.Prior class probability
4.3.3.Independence assumption
4.3.4.Conditional attribute value probabilities
4.3.5.Model construction
4.3.6.Prediction
4.4.Practical issues
4.4.1.Zero and small probabilities
4.4.2.Linear classification
4.4.3.Continuous attributes
4.4.4.Missing attribute values
4.4.5.Reducing naivety
4.5.Conclusion
4.6.Further readings
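
A minimal naive Bayes sketch in the same spirit, using the CRAN e1071 package rather than the book's own code (the laplace argument addresses the zero-probability issue of Section 4.4.1):

    library(e1071)

    ## Naive Bayes with Laplace-style smoothing for zero counts
    nb <- naiveBayes(Species ~ ., data = iris, laplace = 1)

    predict(nb, iris[1:5, ])                  # predicted class labels
    predict(nb, iris[1:5, ], type = "raw")    # posterior class probabilities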

5.Linear classification
5.1.Introduction
5.2.Linear representation
5.2.1.Inner representation function
5.2.2.Outer representation function
5.2.3.Threshold representation
5.2.4.Logit representation
5.3.Parameter estimation
5.3.1.Delta rule
5.3.2.Gradient descent
5.3.3.Distance to decision boundary
5.3.4.Least squares
5.4.Discrete attributes
5.5.Conclusion
5.6.Further readings
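
For the logit representation discussed in this chapter, a hedged base-R sketch using logistic regression via glm on a two-class subset of iris (glm estimates the parameters by iteratively reweighted least squares rather than the delta rule or gradient descent covered in the chapter):

    ## Two-class problem: versicolor vs. virginica
    d <- droplevels(subset(iris, Species != "setosa"))
    logit <- glm(Species ~ Sepal.Length + Petal.Length, data = d, family = binomial)

    p <- predict(logit, d, type = "response")     # probability of the second class level
    pred <- ifelse(p > 0.5, levels(d$Species)[2], levels(d$Species)[1])
    mean(pred != d$Species)                       # training misclassification error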

6.Misclassification costs
6.1.Introduction
6.2.Cost representation
6.2.1.Cost matrix
6.2.2.Per-class cost vector
6.2.3.Instance-specific costs
6.3.Incorporating misclassification costs
6.3.1.Instance weighting
6.3.2.Instance resampling
6.3.3.Minimum-cost rule
6.3.4.Instance relabeling
6.4.Effects of cost incorporation
6.5.Experimental procedure
6.6.Conclusion
6.7.Further readings
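
As one way to incorporate a cost matrix in R (a sketch assuming rpart's loss-matrix mechanism, not the book's own approach; the orientation below follows rpart's documentation, with rows assumed to index the true class and columns the predicted class):

    library(rpart)

    ## Two-class task: automatic vs. manual transmission in mtcars
    d <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

    ## Misclassifying a true "manual" car as "automatic" costs 5, the reverse costs 1
    costs <- matrix(c(0, 1,
                      5, 0), nrow = 2, byrow = TRUE)
    tree <- rpart(am ~ mpg + hp + wt, data = d, method = "class",
                  parms = list(loss = costs))
    table(predicted = predict(tree, d, type = "class"), actual = d$am)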

7.Classification model evaluation
7.1.Introduction
7.1.1.Dataset performance
7.1.2.Training performance
7.1.3.True performance
7.2.Performance measures
7.2.1.Misclassification error
7.2.2.Weighted misclassification error
7.2.3.Mean misclassification cost
7.2.4.Confusion matrix
7.2.5.ROC analysis
7.2.6.Probabilistic performance measures
7.3.Evaluation procedures
7.3.1.Model evaluation vs. modeling procedure evaluation
7.3.2.Evaluation caveats
7.3.3.Hold-out
7.3.4.Cross-validation
7.3.5.Leave-one-out
7.3.6.Bootstrapping
7.3.7.Choosing the right procedure
7.3.8.Evaluation procedures for temporal data
7.4.Conclusion
7.5.Further readings
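
A minimal hold-out evaluation sketch (illustrative only; the rpart package and iris data are assumptions, not taken from the book):

    library(rpart)

    ## Hold-out: random 2/3 training, 1/3 test split
    set.seed(1)
    idx   <- sample(nrow(iris), size = round(2/3 * nrow(iris)))
    train <- iris[idx, ]
    test  <- iris[-idx, ]

    model <- rpart(Species ~ ., data = train, method = "class")
    pred  <- predict(model, test, type = "class")

    table(predicted = pred, actual = test$Species)   # confusion matrix
    mean(pred != test$Species)                       # misclassification error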

Part III Regression

8.Linear regression
8.1.Introduction
8.2.Linear representation
8.2.1.Parametric representation
8.2.2.Linear representation function
8.2.3.Nonlinear representation functions
8.3.Parameter estimation
8.3.1.Mean square error minimization
8.3.2.Delta rule
8.3.3.Gradient descent
8.3.4.Least squares
8.4.Discrete attributes
8.5.Advantages of linear models
8.6.Beyond linearity
8.6.1.Generalized linear representation
8.6.2.Enhanced representation
8.6.3.Polynomial regression
8.6.4.Piecewise-linear regression
8.7.Conclusion
8.8.Further readings
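
A base-R sketch of least-squares parameter estimation and one "beyond linearity" extension, using the built-in cars data (not the book's examples):

    ## Ordinary least squares and a polynomial extension
    lin  <- lm(dist ~ speed, data = cars)
    quad <- lm(dist ~ poly(speed, 2), data = cars)   # quadratic representation

    coef(lin)                                    # estimated parameters
    predict(lin, data.frame(speed = c(10, 20)))  # predictions for new instances
    anova(lin, quad)                             # does the extra flexibility help?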

9.Regression trees
9.1.Introduction
9.2.Regression tree model
9.2.1.Nodes and branches
9.2.2.Leaves
9.2.3.Split types
9.2.4.Piecewise-constant regression
9.3.Growing
9.3.1.Algorithm outline
9.3.2.Target function summary statistics
9.3.3.Target value assignment
9.3.4.Stop criteria
9.3.5.Split selection
9.3.6.Split application
9.3.7.Complete process
9.4.Pruning
9.4.1.Pruning operators
9.4.2.Pruning criterion
9.4.3.Pruning control strategy
9.5.Prediction
9.6.Weighted instances
9.7.Missing value handling
9.7.1.Fractional instances
9.7.2.Surrogate splits
9.8.Piecewise linear regression
9.8.1.Growing
9.8.2.Pruning
9.8.3.Prediction
9.9.Conclusion
9.10.Further readings
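
A piecewise-constant regression tree sketch with the CRAN rpart package (an illustrative stand-in for the chapter's own algorithms):

    library(rpart)

    ## Regression tree for fuel consumption in mtcars
    rt <- rpart(mpg ~ ., data = mtcars, method = "anova",
                control = rpart.control(minsplit = 5, cp = 0.01))
    printcp(rt)                 # complexity table used to guide pruning
    predict(rt, mtcars[1:3, ])  # mean target value of the matching leaf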

10.Regression model evaluation
10.1.Introduction
10.1.1.Dataset performance
10.1.2.Training performance
10.1.3.True performance
10.2.Performance measures
10.2.1.Residuals
10.2.2.Mean absolute error
10.2.3.Mean square error
10.2.4.Root mean square error
10.2.5.Relative absolute error
10.2.6.Coefficient of determination
10.2.7.Correlation
10.2.8.Weighted performance measures
10.2.9.Loss functions
10.3.Evaluation procedures
10.3.1.Hold-out
10.3.2.Cross-validation
10.3.3.Leave-one-out
10.3.4.Bootstrapping
10.3.5.Choosing the right procedure
10.4.Conclusion
10.5.Further readings
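
The residual-based measures of this chapter are straightforward to compute directly in base R; a minimal sketch (illustrative only):

    fit <- lm(dist ~ speed, data = cars)
    res <- residuals(fit)

    mae  <- mean(abs(res))          # mean absolute error
    mse  <- mean(res^2)             # mean square error
    rmse <- sqrt(mse)               # root mean square error
    r2   <- 1 - sum(res^2) / sum((cars$dist - mean(cars$dist))^2)  # coefficient of determination
    c(MAE = mae, MSE = mse, RMSE = rmse, R2 = r2)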

Part IV Clustering

11.(Dis)similarity measures
11.1.Introduction
11.2.Measuring dissimilarity and similarity
11.3.Difference-based dissimilarity
11.3.1.Euclidean distance
11.3.2.Minkowski distance
11.3.3.Manhattan distance
11.3.4.Canberra distance
11.3.5.Chebyshev distance
11.3.6.Hamming distance
11.3.7.Gower's coefficient
11.3.8.Attribute weighting
11.3.9.Attribute transformation
11.4.Correlation-based similarity
11.4.1.Discrete attributes
11.4.2.Pearson's correlation similarity
11.4.3.Spearman's correlation similarity
11.4.4.Cosine similarity
11.5.Missing attribute values
11.6.Conclusion
11.7.Further readings
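
Several of the chapter's (dis)similarity measures are available through base R's dist and cor functions; a minimal sketch on standardized iris attributes (an illustrative assumption, not the book's code):

    x <- scale(iris[, 1:4])    # attribute transformation before distance calculation

    d.euc <- dist(x, method = "euclidean")
    d.man <- dist(x, method = "manhattan")
    d.min <- dist(x, method = "minkowski", p = 3)

    ## Correlation-based and cosine similarity between two instances
    a <- x[1, ]; b <- x[2, ]
    cor(a, b, method = "pearson")
    cor(a, b, method = "spearman")
    sum(a * b) / sqrt(sum(a^2) * sum(b^2))   # cosine similarity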

12.k-Centers clustering
12.1.Introduction
12.1.1.Basic principle
12.1.2.(Dis)similarity measures
12.2.Algorithm scheme
12.2.1.Initialization
12.2.2.Stop criteria
12.2.3.Cluster formation
12.2.4.Implicit cluster modeling
12.2.5.Instantiations
12.3.k-Means
12.3.1.Center adjustment
12.3.2.Minimizing dissimilarity to centers
12.4.Beyond means
12.4.1.k-Medians
12.4.2.k-Medoids
12.5.Beyond (fixed) k
12.5.1.Multiple runs
12.5.2.Adaptive k-centers
12.6.Explicit cluster modeling
12.7.Conclusion
12.8.Further readings
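
A k-means sketch with base R's kmeans (multiple random restarts via nstart), plus k-medoids from the CRAN cluster package as one "beyond means" variant; both are illustrative stand-ins for the chapter's algorithms:

    x <- scale(iris[, 1:4])

    set.seed(1)
    km <- kmeans(x, centers = 3, nstart = 10)
    km$centers          # cluster centers
    table(km$cluster)   # cluster sizes
    km$tot.withinss     # total within-cluster dissimilarity being minimized

    library(cluster)
    pam(x, k = 3)$medoids   # k-medoids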

13.Hierarchical clustering
13.1.Introduction
13.1.1.Basic approaches
13.1.2.(Dis)similarity measures
13.2.Cluster hierarchies
13.2.1.Motivation
13.2.2.Model representation
13.3.Agglomerative clustering
13.3.1.Algorithm scheme
13.3.2.Cluster linkage
13.4.Divisive clustering
13.4.1.Algorithm scheme
13.4.2.Wrapping a flat clustering algorithm
13.4.3.Stop criteria
13.5.Hierarchical clustering visualization
13.6.Hierarchical clustering prediction
13.6.1.Cutting cluster hierarchies
13.6.2.Cluster membership assignment
13.7.Conclusion
13.8.Further readings
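
An agglomerative clustering sketch with base R's hclust, including dendrogram visualization and cutting the hierarchy into a flat clustering (illustrative only):

    x <- scale(iris[, 1:4])
    d <- dist(x)

    hc <- hclust(d, method = "average")   # average linkage
    plot(hc)                              # dendrogram

    memb <- cutree(hc, k = 3)             # cut into three clusters
    table(memb, iris$Species)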

14.Clustering model evaluation
14.1.Introduction
14.1.1.Dataset performance
14.1.2.Training performance
14.1.3.True performance
14.2.Per-cluster quality measures
14.2.1.Diameter
14.2.2.Separation
14.2.3.Isolation
14.2.4.Silhouette width
14.2.5.Davies-Bouldin Index
14.3.Overall quality measures
14.3.1.Dunn Index
14.3.2.Average Davies-Bouldin Index
14.3.3.C Index
14.3.4.Average silhouette width
14.3.5.Loglikelihood
14.4.External quality measures
14.4.1.Misclassification error
14.4.2.Rand Index
14.4.3.General relationship detection measures
14.5.Using quality measures
14.6.Conclusion
14.7.Further readings
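
A sketch of internal and external clustering quality measures, using the silhouette function from the CRAN cluster package (an assumption; the book computes its measures with its own code):

    library(cluster)

    x <- scale(iris[, 1:4])
    set.seed(1)
    km <- kmeans(x, centers = 3, nstart = 10)

    sil <- silhouette(km$cluster, dist(x))   # per-instance silhouette widths
    summary(sil)$avg.width                   # average silhouette width

    ## External evaluation against the known species labels
    table(cluster = km$cluster, species = iris$Species)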

Part V Getting Better Models

15.Model ensembles
15.1.Introduction
15.2.Model committees
15.3.Base models
15.3.1.Different training sets
15.3.2.Different algorithms
15.3.3.Different parameter setups
15.3.4.Algorithm randomization
15.3.5.Base model diversity
15.4.Model aggregation
15.4.1.Voting/Averaging
15.4.2.Probability averaging
15.4.3.Weighted voting/averaging
15.4.4.Using as attributes
15.5.Specific ensemble modeling algorithms
15.5.1.Bagging
15.5.2.Stacking
15.5.3.Boosting
15.5.4.Random forest
15.5.5.Random Naive Bayes
15.6.Quality of ensemble predictions
15.7.Conclusion
15.8.Further readings
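
A random forest sketch with the CRAN randomForest package, one of the specific ensemble algorithms listed above (illustrative only, not the book's implementation):

    library(randomForest)

    set.seed(1)
    rf <- randomForest(Species ~ ., data = iris, ntree = 500)
    rf                                        # out-of-bag error estimate and confusion matrix
    predict(rf, iris[1:5, ], type = "prob")   # averaged class probabilities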

16.Kernel methods
16.1.Introduction
16.2.Support vector machines
16.2.1.Classification margin
16.2.2.Maximum-margin hyperplane
16.2.3.Primal form
16.2.4.Dual form
16.2.5.Soft margin
16.3.Support vector regression
16.3.1.Regression tube
16.3.2.Primal form
16.3.3.Dual form
16.4.Kernel trick
16.5.Kernel functions
16.5.1.Linear kernel
16.5.2.Polynomial kernel
16.5.3.Radial kernel
16.5.4.Sigmoid kernel
16.6.Kernel prediction
16.7.Kernel-based algorithms
16.7.1.Kernel-based SVM
16.7.2.Kernel-based SVR
16.8.Conclusion
16.9.Further readings
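
A soft-margin SVM and support vector regression sketch using the CRAN e1071 package (an illustrative assumption, not the book's code):

    library(e1071)

    ## Soft-margin SVM with a radial (RBF) kernel
    sv <- svm(Species ~ ., data = iris, kernel = "radial", cost = 1, gamma = 0.5)
    table(predicted = predict(sv, iris), actual = iris$Species)

    ## Epsilon-insensitive support vector regression
    svr <- svm(mpg ~ ., data = mtcars, kernel = "radial", epsilon = 0.1)
    predict(svr, mtcars[1:3, ])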

17.Attribute transformation
17.1.Introduction
17.2.Attribute transformation task
17.2.1.Target task
17.2.2.Target attribute
17.2.3.Transformed attribute
17.2.4.Training set
17.2.5.Modeling transformations
17.2.6.Nonmodeling transformations
17.3.Simple transformations
17.3.1.Standardization
17.3.2.Normalization
17.3.3.Aggregation
17.3.4.Imputation
17.3.5.Binary encoding
17.4.Multiclass encoding
17.4.1.Encoding and decoding functions
17.4.2.1-of-k encoding
17.4.3.Error-correcting encoding
17.4.4.Effects of multiclass encoding
17.5.Conclusion
17.6.Further readings
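
A base-R sketch of standardization, normalization, and binary (dummy) encoding (illustrative only):

    ## Standardization and [0, 1] normalization of continuous attributes
    std  <- scale(iris[, 1:4])
    norm <- apply(iris[, 1:4], 2, function(a) (a - min(a)) / (max(a) - min(a)))

    ## Binary encoding of a discrete attribute
    head(model.matrix(~ Species - 1, data = iris))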

18.Discretization
18.1.Introduction
18.2.Discretization task
18.2.1.Motivation
18.2.2.Task definition
18.2.3.Discretization as modeling
18.2.4.Discretization quality
18.3.Unsupervised discretization
18.3.1.Equal-width intervals
18.3.2.Equal-frequency intervals
18.3.3.Nonmodeling discretization
18.4.Supervised discretization
18.4.1.Pure-class discretization
18.4.2.Bottom-up discretization
18.4.3.Top-down discretization
18.5.Effects of discretization
18.6.Conclusion
18.7.Further readings
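
The two unsupervised schemes of Section 18.3 can be sketched with base R's cut function (illustrative only):

    a <- iris$Sepal.Length

    ## Equal-width discretization into 4 intervals
    table(cut(a, breaks = 4))

    ## Equal-frequency discretization using quantile boundaries
    table(cut(a, breaks = quantile(a, probs = seq(0, 1, 0.25)),
              include.lowest = TRUE))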

19.Attribute selection
19.1.Introduction
19.2.Attribute selection task
19.2.1.Motivation
19.2.2.Task definition
19.2.3.Algorithms
19.3.Attribute subset search
19.3.1.Search task
19.3.2.Initial state
19.3.3.Search operators
19.3.4.State selection
19.3.5.Stop criteria
19.4.Attribute selection filters
19.4.1.Simple statistical filters
19.4.2.Correlation-based filters
19.4.3.Consistency-based filters
19.4.4.RELIEF
19.4.5.Random forest
19.4.6.Cutoff criteria
19.4.7.Filter-driven search
19.5.Attribute selection wrappers
19.5.1.Subset evaluation
19.5.2.Wrapper attribute selection
19.6.Effects of attribute selection
19.7.Conclusion
19.8.Further readings
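
A simple correlation-based filter sketch in base R, ranking inputs by absolute correlation with a numeric target (a rough stand-in for the chapter's filters, not the book's code):

    ## Rank the mtcars inputs by absolute correlation with mpg
    scores <- sapply(mtcars[, -1], function(a) abs(cor(a, mtcars$mpg)))
    sort(scores, decreasing = TRUE)

    ## Keep attributes above a fixed cutoff
    names(scores)[scores > 0.7]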

20.Case studies
20.1.Introduction
20.1.1.Datasets
20.1.2.Packages
20.1.3.Auxiliary functions
20.2.Census income
20.2.1.Data loading and preprocessing
20.2.2.Default model
20.2.3.Incorporating misclassification costs
20.2.4.Pruning
20.2.5.Attribute selection
20.2.6.Final models
20.3.Communities and crime
20.3.1.Data loading
20.3.2.Data quality
20.3.3.Regression trees
20.3.4.Linear models
20.3.5.Attribute selection
20.3.6.Piecewise-linear models
20.4.Cover type
20.4.1.Data loading and preprocessing
20.4.2.Class imbalance
20.4.3.Decision trees
20.4.4.Class rebalancing
20.4.5.Multiclass encoding
20.4.6.Final classification models
20.4.7.Clustering
20.5.Conclusion
20.6.Further readings

Closing
A.Notation
A.1.Attribute values
A.2.Data subsets
A.3.Probabilities
B.R packages
B.1.CRAN packages
B.2.DMR packages
B.3.Installing packages
C.Datasets.
