Our editors have compiled this directory of the best data preparation books based on Amazon user reviews, rating, and ability to add business value.
There are loads of free resources available online (such as Solutions Review’s Data Analytics Software Buyer’s Guide, visual comparison matrix, and best practices section) and those are great, but sometimes it’s best to do things the old fashioned way. There are few resources that can match the in-depth, comprehensive detail of one of the best data preparation books.
The editors at Solutions Review have done much of the work for you, curating this directory of the best data preparation books on Amazon. Titles have been selected based on the total number and quality of reader user reviews and ability to add business value. Each of the books listed in this compilation meets a minimum criteria of 10 reviews and a 4-star-or-better ranking.
Below you will find a library of titles from recognized industry analysts, experienced practitioners, and subject matter experts spanning the depths of predictive analytics all the way to data science. This compilation includes publications for practitioners of all skill levels.
Note: Titles with recently published new editions will be included if the previous edition met our review and ranking criteria.
“If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.”
“Kafka Streams in Action teaches you to implement stream processing within the Kafka platform. In this easy-to-follow book, you’ll explore real-world examples to collect, transform, and aggregate data, work with multiple processors, and handle real-time events. You’ll even dive into streaming SQL with KSQL! Practical to the very end, it finishes with testing and operational aspects, such as monitoring and debugging. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.”
“Event Streams in Action teaches you techniques for aggregating, storing, and processing event streams using the unified log processing pattern. In this hands-on guide, you’ll discover important application designs like the lambda architecture, stream aggregation, and event reprocessing. You’ll also explore scaling, resiliency, advanced stream patterns, and much more! By the time you’re finished, you’ll be designing large-scale data-driven applications that are easier to build, deploy, and maintain. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.”
Building Data Streaming Applications with Apache Kafka: Design, develop and streamline applications using Apache Kafka, Storm, Heron and Spark
“This book is a comprehensive guide to designing and architecting enterprise-grade streaming applications using Apache Kafka and other big data tools. It includes best practices for building such applications, and tackles some common challenges such as how to use Kafka efficiently and handle high data volumes with ease. This book first takes you through understanding the type messaging system and then provides a thorough introduction to Apache Kafka and its internal details. The second part of the book takes you through designing streaming application using various frameworks and tools such as Apache Spark, Apache Storm, and more.”
“The book Kafka Streams – Real-time Stream Processing helps you understand the stream processing in general and apply that skill to Kafka streams programming. This book is focusing mainly on the new generation of the Kafka Streams library available in the Apache Kafka 2.x. The primary focus of this book is on Kafka Streams. However, the book also touches on the other Apache Kafka capabilities and concepts that are necessary to grasp the Kafka Streams programming.”
“More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you’ll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm. Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases.”
Upcoming Releases Worth Checking Out
“Kafka in Action is a practical, hands-on guide to building Kafka-based data pipelines. Filled with real-world use cases and scenarios, this book probes Kafka’s most common use cases, ranging from simple logging through managing streaming data systems for message routing, analytics, and more. In systems that handle big data, streaming data, or fast data, it’s important to get your data pipelines right. Apache Kafka is a wicked-fast distributed streaming platform that operates as more than just a persistent log or a flexible message queue. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.”
“This practical guide explores the world of real-time data systems through the lens of these popular technologies and explains important stream processing concepts against a backdrop of interesting business problems. Mitch Seymour, senior data systems engineer at Mailchimp, introduces you to both Kafka Streams and ksqlDB so that you can choose the best tool for each unique stream processing project. Non-Java developers will find the ksqlDB path to be an especially gentle introduction to stream processing.”
Latest posts by Timothy King (see all)
- The 13 Best Data Virtualization Tools and Software for 2020 - November 10, 2020
- The 6 Best Data Preparation Books on Our Reading List - November 6, 2020
- Boomi AtomSphere Gets Project LightSpeed Data Synchronization - October 30, 2020