TY - BOOK
AU - Arnold, Taylor
AU - Tilton, Lauren
TI - Humanities data in R: exploring networks, geospatial data, images, and text
SN - 9783319207018
U1 - 519.5
PY - 2015///
CY - Switzerland
PB - Springer
KW - Mathematical statistics
N1 - Table of Contents: 1.Set-Up 1.1.Introduction 1.2.Structure of This Book 1.3.Obtaining R 1.4.Supplemental Materials 1.5.Getting Help with R References 2.A Short Introduction to R 2.1.Introduction 2.2.Calculator and Objects 2.3.Numeric Vectors 2.4.Logical Vectors 2.5.Subsetting 2.6.Character Vectors 2.7.Matrices and Data Frames 2.8.Data I/O 2.9.Advanced Subsetting 3.EDA I: Continuous and Categorical Data 3.1.Introduction 3.2.Tables 3.3.Histogram 3.4.Quantiles 3.5.Binning 3.6.Control Flow 3.7.Combining Plots 3.8.Aggregation 3.9.Applying Functions 4.EDA II: Multivariate Analysis 4.1.Introduction 4.2.Scatter Plots 4.3.Text 4.4.Points 4.5.Line Plots 4.6.Scatter Plot Matrix 4.7.Correlation Matrix 5.EDA III: Advanced Graphics 5.1.Introduction 5.2.Output Formats 5.3.Color 5.4.Legends 5.5.Randomness 5.6.Additional Parameters 5.7.Alternative Methods 6.Networks 6.1.Introduction 6.2.A Basic Graph 6.3.Citation Networks 6.4.Graph Centrality 6.5.Graph Communities 6.6.Further Extensions 7.Geospatial Data 7.1.Introduction 7.2.From Scatter Plots to Maps 7.3.Map Projections and Input Formats 7.4.Enriching Tabular Data with Geospatial Data 7.5.Enriching Geospatial Data with Tabular Data 7.6.Further Extensions 8.Image Data 8.1.Introduction 8.2.Basic Image I/O 8.3.Day/Night Photographic Corpus 8.4.Principal Component Analysis 8.5.K-Means 8.6.Scatter Plot of Raster Graphics 8.7.Extensions 9.Natural Language Processing 9.1.Introduction 9.2.Tokenization and Sentence Splitting 9.3.Lemmatization and Part of Speech Tagging 9.4.Dependencies 9.5.Named Entity Recognition 9.6.Coreference 9.7.Case Study: Sherlock Holmes Main Characters 9.8.Other Languages 9.9.Conclusions and Extensions 10.Text Analysis 10.1.Introduction 10.2.Term Frequency: Inverse Document 
Frequency 10.3.Topic Models 10.4.Stylometric Analysis 10.5.Further Methods and Extensions 11.R Packages 11.1.Installing from Within R 11.2.rJava 11.3.coreNLP 11.4.sessionInfo 12.100 Basic Programming Exercises.
N2 - This pioneering book teaches readers to use R within four core analytical areas applicable to the Humanities: networks, text, geospatial data, and images. This book is also designed to be a bridge: between quantitative and qualitative methods, individual and collaborative work, and the humanities and social sciences. Humanities Data in R does not presuppose background programming experience. Early chapters take readers from R set-up to exploratory data analysis (continuous and categorical data, multivariate analysis, and advanced graphics with emphasis on aesthetics and facility). Following this, networks, geospatial data, image data, natural language processing and text analysis each have a dedicated chapter. Each chapter is grounded in examples to move readers beyond the intimidation of adding new tools to their research. Everything is hands-on: networks are explained using U.S. Supreme Court opinions, and low-level NLP methods are applied to short stories by Sir Arthur Conan Doyle. After working through these examples with the provided data, code and book website, readers are prepared to apply new methods to their own work. The open source R programming language, with its myriad packages and popularity within the sciences and social sciences, is particularly well-suited to working with humanities data. R packages are also highlighted in an appendix. This book uses an expanded conception of the forms data may take and the information it represents. The methodology will have wide application in classrooms and self-study for the humanities, but also for use in linguistics, anthropology, and political science. 
Outside the classroom, this intersection of humanities and computing is particularly relevant for research and new modes of dissemination across archives, museums and libraries. (http://www.springer.com/gp/book/9783319207018)
ER -