
XML and web technologies for data sciences with R

By: Nolan, Deborah.
Contributor(s): Lang, Duncan Temple.
Material type: Book
Series: Use R!
Publisher: New York: Springer, 2014
Description: xxiv, 663 p.
ISBN: 9781461478997
Subject(s): XML | Document markup language | Computer program language | Web services | Electronic data processing
DDC classification: 006.74

Summary: Web technologies are increasingly relevant to scientists working with data, for both accessing data and creating rich dynamic and interactive displays. The XML and JSON data formats are widely used in Web services, regular Web pages and JavaScript code, and visualization formats such as SVG and KML for Google Earth and Google Maps. In addition, scientists use HTTP and other network protocols to scrape data from Web pages, access REST and SOAP Web Services, and interact with NoSQL databases and text search applications. This book provides a practical hands-on introduction to these technologies, including high-level functions the authors have developed for data scientists. It describes strategies and approaches for extracting data from HTML, XML, and JSON formats and how to programmatically access data from the Web.

Along with these general skills, the authors illustrate several applications that are relevant to data scientists, such as reading and writing spreadsheet documents both locally and via Google Docs, creating interactive and dynamic visualizations, displaying spatial-temporal displays with Google Earth, and generating code from descriptions of data structures to read and write data. These topics demonstrate the rich possibilities and opportunities to do new things with these modern technologies. The book contains many examples and case-studies that readers can use directly and adapt to their own work. The authors have focused on the integration of these technologies with the R statistical computing environment. However, the ideas and skills presented here are more general, and statisticians who use other computing environments will also find them relevant to their work.

Deborah Nolan is Professor of Statistics at University of California, Berkeley. Duncan Temple Lang is Associate Professor of Statistics at University of California, Davis and has been a member of both the S and R development teams.

http://www.springer.com/in/book/9781461478997
Item type: Books
Current location: Vikram Sarabhai Library
Item location: Slot 108 (0 Floor, West Wing)
Collection: Non-fiction
Call number: 006.74 N6X6
Status: Available
Barcode: 193523

Table of contents:

Part I. Data Formats: XML and JSON
1. Getting Started with XML and JSON
2. An Introduction to XML
3. Parsing XML Content
4. XPath, XPointer, and XInclude
5. Strategies for Extracting Data from HTML and XML Content
6. Generating XML
7. JavaScript Object Notation

Part II. Web Technologies: Getting Data from the Web
8. HTTP Requests
9. Scraping Data from HTML Forms
10. REST-based Web Services
11. Simple Web Services and Remote Method Calls with XML-RPC
12. Accessing SOAP Web Services
13. Authentication for Web Services via OAuth

Part III. General XML Application Areas
14. Meta-Programming with XML Schema
15. Spreadsheets
16. Scalable Vector Graphics
17. Keyhole Markup Language
18. New Ways to Think about Documents

Bibliography
General Index
R Function and Parameter Index
R Package Index
R Class Index
Colophon

