Parallel computing for data science (Record no. 203387)

000 -LEADER
fixed length control field 05337 a2200229 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 160530b2016 xxu||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9781466587014
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.3
Item number M2P2
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Matloff, Norman
9 (RLIN) 321888
245 ## - TITLE STATEMENT
Title Parallel computing for data science
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Name of publisher, distributor, etc CRC Press
Date of publication, distribution, etc 2016
Place of publication, distribution, etc Boca Raton
300 ## - PHYSICAL DESCRIPTION
Extent xxiii, 324 p.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc Table of Contents:
1. Introduction to Parallel Processing in R
Recurring Theme: The Principle of Pretty Good Parallelism
A Note on Machines
Recurring Theme: Hedging One's Bets
Extended Example: Mutual Web Outlinks

2. "Why Is My Program So Slow?": Obstacles to Speed
Obstacles to Speed
Performance and Hardware Structures
Memory Basics
Network Basics
Latency and Bandwidth
Thread Scheduling
How Many Processes/Threads?
Example: Mutual Outlink Problem
"Big O" Notation
Data Serialization
"Embarrassingly Parallel" Applications

3. Principles of Parallel Loop Scheduling
General Notions of Loop Scheduling
Chunking in Snow
A Note on Code Complexity
Example: All Possible Regressions
The partools Package
Example: All Possible Regressions, Improved Version
Introducing Another Tool: multicore
Issues with Chunk Size
Example: Parallel Distance Computation
The foreach Package
Stride
Another Scheduling Approach: Random Task Permutation
Debugging snow and multicore Code

4. The Shared Memory Paradigm: A Gentle Introduction through R
So, What Is Actually Shared?
Clarity and Conciseness of Shared-Memory Programming
High-Level Introduction to Shared-Memory Programming: Rdsm Package
Example: Matrix Multiplication
Shared Memory Can Bring a Performance Advantage
Locks and Barriers
Example: Finding the Maximal Burst in a Time Series
Example: Transformation of an Adjacency Matrix
Example: k-Means Clustering

5. The Shared Memory Paradigm: C Level
OpenMP
Example: Finding the Maximal Burst in a Time Series
OpenMP Loop Scheduling Options
Example: Transforming an Adjacency Matrix
Example: Transforming an Adjacency Matrix, R-Callable Code
Speedup in C
Run Time vs. Development Time
Further Cache/Virtual Memory Issues
Reduction Operations in OpenMP
Debugging
Intel Thread Building Blocks (TBB)
Lockfree Synchronization

6. The Shared Memory Paradigm: GPUs
Overview
Another Note on Code Complexity
Goal of This Chapter
Introduction to NVIDIA GPUs and CUDA
Example: Mutual Inlinks Problem
Synchronization on GPUs
R and GPUs
The Intel Xeon Phi Chip

7. Thrust and Rth
Hedging One's Bets
Thrust Overview
Rth
Skipping the C++
Example: Finding Quantiles
Introduction to Rth

8. The Message Passing Paradigm
Message Passing Overview
The Cluster Model
Performance Issues
Rmpi
Example: Pipelined Method for Finding Primes
Memory Allocation Issues
Message-Passing Performance Subtleties

9. MapReduce Computation
Apache Hadoop
Other MapReduce Systems
R Interfaces to MapReduce Systems
An Alternative: "Snowdoop"

10. Parallel Sorting and Merging
The Elusive Goal of Optimality
Sorting Algorithms
Example: Bucket Sort in R
Example: Quicksort in OpenMP
Sorting in Rth
Some Timing Comparisons
Sorting on Distributed Data

11. Parallel Prefix Scan
General Formulation
Applications
General Strategies for Parallel Scan Computation
Implementations of Parallel Prefix Scan
Parallel cumsum() with OpenMP
Example: Moving Average

12. Parallel Matrix Operations
Tiled Matrices
Example: Snowdoop Approach to Matrix Operations
Parallel Matrix Multiplication
BLAS Libraries
Example: A Look at the Performance of OpenBLAS
Example: Graph Connectedness
Solving Systems of Linear Equations
Sparse Matrices

13. Inherently Statistical Approaches: Subset Methods
Chunk Averaging
Bag of Little Bootstraps
Subsetting Variables



520 ## - SUMMARY, ETC.
Summary, etc Parallel Computing for Data Science: With Examples in R, C++ and CUDA is one of the first parallel computing books to concentrate exclusively on parallel data structures, algorithms, software tools, and applications in data science. It includes examples not only from the classic "n observations, p variables" matrix format but also from time series, network graph models, and numerous other structures common in data science. The examples illustrate the range of issues encountered in parallel programming.

With the main focus on computation, the book shows how to compute on three types of platforms: multicore systems, clusters, and graphics processing units (GPUs). It also discusses software packages that span more than one type of hardware and can be used from more than one type of programming language. Readers will find that the foundation established in this book will generalize well to other languages, such as Python and Julia.

(https://www.crcpress.com/Parallel-Computing-for-Data-Science-With-Examples-in-R-C-and-CUDA/Matloff/p/book/9781466587014)
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element C++ - Computer program language
9 (RLIN) 56700
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element CUDA - Computer architecture
9 (RLIN) 333116
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data transmission systems
9 (RLIN) 73514
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Parallel processing - Electronic computers
9 (RLIN) 119553
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Electronic data processing
9 (RLIN) 1439
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element R - Computer program language
9 (RLIN) 55638
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme
Item type Books
Holdings
Withdrawn status
Lost status
Source of classification or shelving scheme
Damaged status
Not for loan
Collection code Non-fiction
Permanent location Vikram Sarabhai Library
Current location Vikram Sarabhai Library
Shelving location General Stacks
Date acquired 2016-05-30
Source of acquisition 5
Cost, normal purchase price 3308.71
Item location Slot 73 (0 Floor, West Wing)
Total Checkouts 3
Total Renewals 4
Full call number 005.3 M2P2
Barcode 192178
Date last seen 2019-02-05
Date last borrowed 2018-11-07
Cost, replacement price 4135.89
Koha item type Books
