Apache Flume: Distributed Log Collection for Hadoop

165.00

Free Delivery

Get it by 11 July

Order in 15h54m

Explore other bestsellers in
Database Design & Programming

Coupons

Extra 15% off

Payment discount

Sold by Apollo books

Free delivery on Lockers & Pickup Points

This item is eligible for free returns

Secure Payments

Product Overview

Specifications

Publisher	Packt Publishing
Author	Subas D'Souza
Language	English
Publication Date	4 July 2013
Number of Pages	108 pages
ISBN 10	1782167919
ISBN 13	9781782167914

Book Description

If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it is a complete step-by-step guide to making the service work for you.

Overview

Integrate Flume with your data sources
Transcode your data en route in Flume
Route and separate your data using regular expression matching
Configure failover paths and load balancing to remove single points of failure
Use Gzip compression for files written to HDFS

In Detail

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows, and it is robust and fault tolerant, with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers the problems with HDFS and streaming data/logs, and how Flume can resolve them. The book explains the generalized architecture of Flume, which includes moving data to and from databases and NoSQL-ish data stores, as well as optimizing performance. It also includes real-world scenarios of Flume implementation.

The book starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume, and gives you a heads-up on how to use channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on), the various implementations are covered in detail along with their configuration options, so you can customize Flume to your specific needs. Pointers on writing custom implementations are also given to help you learn and implement your own.
