Professional hadoop solutions / Boris Lublinsky; Kevin T. Smith; Alexey Yakubovich

By:

Lublinsky, Boris [aut]

Contributor(s):

Resource type: Book (online)
Language: English
Publisher: Somerset : John Wiley & Sons, Incorporated, [2013]
Edition: Online edition
Description: 1 online resource (505 pages) : illustrations (some color)
ISBN:

9781306451123
1306451124
1118611934
9781118612545
111861254X

Subjects:

Other physical forms: 9781118612545 | 9781118611937 | 1306451175 | Also published as: Professional Hadoop solutions. Print edition. Indianapolis, Ind. : Wrox/Wiley, 2013. XXIII, 477 pp.
DDC classification:

006
005.74

RVK: ST 230 | ST 200 | ST 271
LOC classification:

QA76.9
QA76.9.D5

Online resources:

Access within the KIT network

Contents:

Professional Hadoop® Solutions; Copyright; Credits; About the Authors; About the Technical Editors; Acknowledgments; Contents; Introduction; Who This Book Is For; What This Book Covers; How This Book Is Structured; What You Need to Use This Book; Conventions; Source Code; Errata; P2P.Wrox.Com; Chapter 1: Big Data and the Hadoop Ecosystem; Big Data Meets Hadoop; Hadoop: Meeting the Big Data Challenge; Data Science in the Business World; The Hadoop Ecosystem; Hadoop Core Components; Hadoop Distributions; Developing Enterprise Applications with Hadoop; Summary; Chapter 2: Storing Data in Hadoop;

HDFS; HDFS Architecture; Using HDFS Files; Hadoop-Specific File Types; HDFS Federation and High Availability; HBase; HBase Architecture; HBase Schema Design; Programming for HBase; New HBase Features; Combining HDFS and HBase for Effective Data Storage; Using Apache Avro; Managing Metadata with HCatalog; Choosing an Appropriate Hadoop Data Organization for Your Applications; Summary; Chapter 3: Processing Your Data with MapReduce; Getting to Know MapReduce; MapReduce Execution Pipeline; Runtime Coordination and Task Management in MapReduce; Your First MapReduce Application;

Building and Executing MapReduce Programs; Designing MapReduce Implementations; Using MapReduce as a Framework for Parallel Processing; Simple Data Processing with MapReduce; Building Joins with MapReduce; Building Iterative MapReduce Applications; To MapReduce or Not to MapReduce?; Common MapReduce Design Gotchas; Summary; Chapter 4: Customizing MapReduce Execution; Controlling MapReduce Execution with InputFormat; Implementing InputFormat for Compute-Intensive Applications; Implementing InputFormat to Control the Number of Maps; Implementing InputFormat for Multiple HBase Tables;

Reading Data Your Way with Custom RecordReaders; Implementing a Queue-Based RecordReader; Implementing RecordReader for XML Data; Organizing Output Data with Custom Output Formats; Implementing OutputFormat for Splitting MapReduce Job's Output into Multiple Directories; Writing Data Your Way with Custom RecordWriters; Implementing a RecordWriter to Produce Output tar Files; Optimizing Your MapReduce Execution with a Combiner; Controlling Reducer Execution with Partitioners; Implementing a Custom Partitioner for One-to-Many Joins; Using Non-Java Code with Hadoop; Pipes; Hadoop Streaming;

Using JNI; Summary; Chapter 5: Building Reliable MapReduce Apps; Unit Testing MapReduce Applications; Testing Mappers; Testing Reducers; Integration Testing; Local Application Testing with Eclipse; Using Logging for Hadoop Testing; Processing Applications Logs; Reporting Metrics with Job Counters; Defensive Programming in MapReduce; Summary; Chapter 6: Automating Data Processing with Oozie; Getting to Know Oozie; Oozie Workflow; Executing Asynchronous Activities in Oozie Workflow; Oozie Recovery Capabilities; Oozie Workflow Job Life Cycle; Oozie Coordinator; Oozie Bundle;

Oozie Parameterization with Expression Language

Summary: The go-to guidebook for deploying Big Data solutions with Hadoop. Today's enterprise architects need to understand how the Hadoop frameworks and APIs fit together, and how they can be integrated to deliver real-world solutions. This book is a practical, detailed guide to building and implementing those solutions, with code-level instruction in the popular Wrox tradition. It covers storing data with HDFS and HBase, processing data with MapReduce, and automating data processing with Oozie. Hadoop security, running Hadoop with Amazon Web Services, best practices, and automating Hadoop processes in real time are also covered in depth. With in-depth code examples in Java and XML and the latest on recent additions to the Hadoop ecosystem, this complete resource also covers the use of APIs, exposing their inner workings and allowing architects and developers to better leverage and customize them. The ultimate guide for developers, designers, and architects who need to build and deploy Hadoop applications. Covers storing and processing data with various technologies, automating data processing, Hadoop security, and delivering real-time solutions. Includes detailed, real-world examples and code-level guidelines. Explains when, why, and how to use these tools effectively. Written by a team of Hadoop experts in the programmer-to-programmer Wrox style. Professional Hadoop Solutions is the reference enterprise architects and developers need to maximize the power of Hadoop. Boris Lublinsky is principal architect at Nokia and an author of more than 70 publications, including Applied SOA: Service-Oriented Architecture and Design Strategies. Kevin T. Smith is Director of Technology Solutions for the AMS division of Novetta Solutions, where he builds highly secure, data-oriented solutions for customers. Alexey Yakubovich is a system architect at Hortonworks and a member of the Object Management Group SIG on SOA governance and model-driven architecture.
PPN: 80722295X
Package identifier: ZDB-26-MYL | ZDB-30-PAD | ZDB-30-PQE

Copies (0)

This title has no copies