Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale
In Stock
$19.11
You Save: 62%
mpn: black & white illustrations, ean: 9781491901632, isbn: 1491901632
4.2 out of 5 stars with 254 reviews
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters. Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark.
Clinical Text Mining: Secondary Use of Electronic Patient Records
In Stock
$46.61
You Save: 22%
ean: 9783319785028, isbn: 3319785028
4.9 out of 5 stars with 249 reviews
This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records. It is divided into twelve chapters. Chapters 1-4 discuss the history and background of the original paper-based patient records, their purpose, and how they are written and structured. These initial chapters do not require any technical or medical background knowledge. The remaining eight chapters are more technical in nature and
Text Mining with R: A Tidy Approach
In Stock
$23.51
You Save: 41%
ean: 9781491981658, isbn: 1491981652
4.3 out of 5 stars with 257 reviews
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate
Spark: The Definitive Guide: Big Data Processing Made Simple
In Stock
$41.15
You Save: 31%
mpn: 48032261, ean: 9781491912218, isbn: 1491912219
4.4 out of 5 stars with 34 reviews
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end