Cybersecurity analytics
Verma, Rakesh M.
creator
Marchette, David J.
Co-author
text
Boca Raton
CRC Press
2020
monographic
xv, 339 p.: ill. Includes bibliographical references and index
Cybersecurity Analytics is for the cybersecurity student and professional who wants to learn data science techniques critical for tackling cybersecurity challenges, and for the data science student and professional who wants to learn about cybersecurity adaptations. Trying to build a malware detector, a phishing email detector, or just interested in finding patterns in your datasets? This book can help you do it on your own. Numerous examples and dataset links are included so that the reader can "learn by doing." Anyone who has taken a basic college-level calculus course and has some knowledge of probability can easily understand most of the material.
The book includes chapters on unsupervised learning, semi-supervised learning, supervised learning, text mining, natural language processing, and more. It also includes background on security, statistics, and linear algebra. The website for the book contains a listing of datasets, updates, and other resources for serious practitioners.
https://www.routledge.com/Cybersecurity-Analytics/Verma-Marchette/p/book/9780367346010
Table of contents
1. Introduction
2. What is Data Analytics?
Data Ingestion
Data Processing and Cleaning
Visualization and Exploratory Analysis
Scatterplots
Pattern Recognition
Classification
Clustering
Feature Extraction
Feature Selection
Random Projections
Modeling
Model Specification
Model Selection and Fitting
Evaluation
Strengths and Limitations
The Curse of Dimensionality
3. Security: Basics and Security Analytics
Basics of Security
Know Thy Enemy – Attackers and Their Motivations
Security Goals
Mechanisms for Ensuring Security Goals
Confidentiality
Integrity
Availability
Authentication
Access Control
Accountability
Non-repudiation
Threats, Attacks and Impacts
Passwords
Malware
Spam, Phishing and its Variants
Intrusions
Internet Surfing
System Maintenance and Firewalls
Other Vulnerabilities
Protecting Against Attacks
Applications of Data Science to Security Challenges
Cybersecurity Datasets
Data Science Applications
Passwords
Malware
Intrusions
Spam/Phishing
Credit Card Fraud/Financial Fraud
Opinion Spam
Denial of Service
Security Analytics and Why Do We Need It
4. Statistics
Probability Density Estimation
Models
Poisson
Uniform
Normal
Parameter Estimation
The Bias-Variance Trade-Off
The Law of Large Numbers and the Central Limit Theorem
Confidence Intervals
Hypothesis Testing
Bayesian Statistics
Regression
Logistic Regression
Regularization
Principal Components
Multidimensional Scaling
Procrustes
Nonparametric Statistics
Time Series
5. Data Mining – Unsupervised Learning
Data Collection
Types of Data and Operations
Properties of Datasets
Data Exploration and Preprocessing
Data Exploration
Data Preprocessing/Wrangling
Data Representation
Association Rule Mining
Variations on the Apriori Algorithm
Clustering
Partitional Clustering
Choosing K
Variations on K-means Algorithm
Hierarchical Clustering
Other Clustering Algorithms
Measuring the Clustering Quality
Clustering Miscellany: Clusterability, Robustness, Incremental,
Manifold Discovery
Spectral Embedding
Anomaly Detection
Statistical Methods
Distance-based Outlier Detection
kNN based approach
Density-based Outlier Detection
Clustering-based Outlier Detection
One-class learning based Outliers
Security Applications and Adaptations
Data Mining for Intrusion Detection
Malware Detection
Stepping-stone Detection
Malware Clustering
Directed Anomaly Scoring for Spear Phishing Detection
Concluding Remarks and Further Reading
6. Machine Learning – Supervised Learning
Fundamentals of Supervised Learning
The Bayes Classifier
Naïve Bayes
Nearest Neighbors Classifiers
Linear Classifiers
Decision Trees and Random Forests
Random Forest
Support Vector Machines
Semi-Supervised Classification
Neural Networks and Deep Learning
Perceptron
Neural Networks
Deep Networks
Topological Data Analysis
Ensemble Learning
Majority
Adaboost
One-class Learning
Online Learning
Adversarial Machine Learning
Adversarial Examples
Adversarial Training
Adversarial Generation
Beyond Continuous Data
Evaluation of Machine Learning
Cost-sensitive Evaluation
New Metrics for Unbalanced Datasets
Security Applications and Adaptations
Intrusion Detection
Malware Detection
Spam and Phishing Detection
For Further Reading
7. Text Mining
Tokenization
Preprocessing
Bag-Of-Words
Vector Space Model
Weighting
Latent Semantic Indexing
Embedding
Topic Models: Latent Dirichlet Allocation
Sentiment Analysis
8. Natural Language Processing
Challenges of NLP
Basics of Language Study and NLP Techniques
Text Preprocessing
Feature Engineering on Text Data
Morphological, Word and Phrasal Features
Clausal and Sentence Level Features
Statistical Features
Corpus-based Analysis
Advanced NLP Tasks
Part of Speech Tagging
Word Sense Disambiguation
Language Modeling
Topic Modeling
Sequence to Sequence Tasks
Knowledge Bases and Frameworks
Natural Language Generation
Issues with Pipelining
Security Applications of NLP
Password Checking
Email Spam Detection
Phishing Email Detection
Malware Detection
Attack Generation
9. Big Data Techniques and Security
Key Terms
Ingesting the Data
Persistent Storage
Computing and Analyzing
Techniques for Handling Big Data
Visualizing
Streaming Data
Big Data Security
Implications of Big Data Characteristics on Security and Privacy
Mechanisms for Big Data Security Goals
A. Linear Algebra Basics
Vectors
Matrices
Eigenvectors and Eigenvalues
The Singular Value Decomposition
B. Graphs
Graph Invariants
The Laplacian
C. Probability
Probability
Conditional Probability and Bayes’ Rule
Base Rate Fallacy
Expected Values and Moments
Distribution Functions and Densities
Models
Bernoulli and Binomial
Multinomial
Uniform
Computer security
Security measures - Computer networks
Computers
005.8 V3C9
Chapman and Hall/CRC: data science
9780367346010
210401