Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
- Learn fundamental components such as MapReduce, HDFS, and YARN
- Explore MapReduce in depth, including steps for developing applications with it
- Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
- Learn two data formats: Avro for data serialization and Parquet for nested data
- Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
- Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
- Learn the HBase distributed database and the ZooKeeper distributed configuration service
|Part Number:||black & white illustrations|
|MPN:||black & white illustrations|
|Item Weight:||2.78 pounds|
|Item Size:||1.6 x 9.2 x 9.2 inches|
|Package Weight:||2.35 pounds|
|Package Size:||7 x 1.5 x 1.5 inches|
Related Best Sellers
ean: 9783319785028, isbn: 3319785028,
This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records. It is divided into twelve chapters. Chapters 1-4 discuss the history and background of t...
By Brand: Microsoft Press
ean: 9780735658189, isbn: 0735658188,
Build agile and responsive Business Intelligence solutions. Analyze tabular data using the BI Semantic Model (BISM) in Microsoft® SQL Server® 2012 Analysis Services, and discover a simpler method for creating corporate-level BI solutions. Led by th...
By O'Reilly Media
ean: 9781491981658, isbn: 1491981652,
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a packag...
ean: 9783319997216, isbn: 9783319997216,
This book constitutes the refereed proceedings of the 13th International Conference on Computational Processing of the Portuguese Language, PROPOR 2018, held in Canela, RS, Brazil, in September 2018. The 42 full papers, 3 short papers and 4 other pa...
By McGraw-Hill Education
ean: 9781259585494, isbn: 9781259585494,
A fully updated, integrated self-study system for the Oracle Database SQL Exam. This thoroughly revised Oracle Press guide offers 100% coverage of all objectives on the latest version of the Oracle Database SQL Exam. Ideal both as a study guide and on-...
By CRC Press
mpn: 79 black & white illustrations, ean: 9781482234817, isbn: 1482234815,
Effectively Access, Transform, Manipulate, Visualize, and Reason about Data and Computation. Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving illustrates the details involved in solving real computational probl...
By O'Reilly Media
mpn: 48032261, ean: 9781491912218, isbn: 1491912219,
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei...
By Brand: Chapman and Hall/CRC
ean: 9781420085921, isbn: 9781420085921,
The Handbook of Natural Language Processing, Second Edition presents practical tools and techniques for implementing natural language processing in computer systems. Along with removing outdated material, this edition updates every chapter and expand...