Hadoop 2.0 YARN: Splitting Resource Management from MapReduce Computation

Rethinking cluster orchestration. We analyze ResourceManager, NodeManager, and multi-tenant big data engines.

SHIVAM ITCS
2 June 2013 · 10 min read

The Hadoop 1.0 Bottleneck

In Hadoop 1.0, the JobTracker daemon managed both cluster resource allocation and execution monitoring.

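  • Scaling limits: A single JobTracker tracked every task of every job, so it hit scalability walls at roughly 4,000 nodes and struggled as the number of concurrent tasks grew.
  • Compute lock-in: The cluster's compute resources could only be scheduled for MapReduce jobs, so engines like Spark or Storm could not share the same nodes under one scheduler, even when the data already lived in HDFS.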

Hadoop 2.0 introduces YARN (Yet Another Resource Negotiator), which resolves both problems by splitting cluster resource management from job execution.

YARN Principle: Decouple cluster resource allocation from computation monitoring to create a multi-tenant big data infrastructure.

The YARN Architecture

YARN replaces the single JobTracker with a global ResourceManager, a NodeManager on every node, and one ApplicationMaster per application:

  • ResourceManager (Global Master): Allocates compute resources (memory, CPU) across all applications in the cluster.
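  • NodeManager (Node Agent): Launches containers on its node and monitors their resource usage (CPU, memory), reporting back to the ResourceManager.
  • ApplicationMaster (App Master): A per-application manager that negotiates containers from the ResourceManager and works with NodeManagers to run and monitor tasks.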

Daemon            | Scope            | Core Responsibility
------------------|------------------|----------------------------------------------------
ResourceManager   | Global cluster   | Allocates compute containers to applications.
NodeManager       | Individual node  | Monitors CPU and RAM usage inside local containers.
ApplicationMaster | Individual job   | Requests resources and monitors job progress.
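
The division of labour between the three daemons can be sketched as a toy model. This is a minimal illustration under stated assumptions, not YARN's actual scheduling logic: the class and method names mirror the daemon names but are hypothetical, placement is plain first-fit, and the node sizes and job names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Container:
    """A slice of one node's memory and CPU, granted to one application."""
    container_id: int
    node: str
    app_id: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Per-node agent: hosts containers and tracks what is still free locally."""
    name: str
    memory_mb: int
    vcores: int
    containers: List[Container] = field(default_factory=list)

    def free_memory_mb(self) -> int:
        return self.memory_mb - sum(c.memory_mb for c in self.containers)

    def free_vcores(self) -> int:
        return self.vcores - sum(c.vcores for c in self.containers)

    def can_fit(self, memory_mb: int, vcores: int) -> bool:
        return self.free_memory_mb() >= memory_mb and self.free_vcores() >= vcores


class ResourceManager:
    """Global master: decides which node hosts each requested container."""

    def __init__(self, nodes: List[NodeManager]) -> None:
        self.nodes = nodes
        self._next_id = 1

    def allocate(self, app_id: str, memory_mb: int, vcores: int) -> Optional[Container]:
        # Assumption: first-fit placement, chosen only to keep the sketch short.
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                container = Container(self._next_id, node.name, app_id, memory_mb, vcores)
                self._next_id += 1
                node.containers.append(container)
                return container
        return None  # no node can host a container of this size


class ApplicationMaster:
    """Per-application manager: asks the ResourceManager for containers."""

    def __init__(self, app_id: str, rm: ResourceManager) -> None:
        self.app_id = app_id
        self.rm = rm
        self.granted: List[Container] = []

    def request(self, count: int, memory_mb: int, vcores: int) -> int:
        """Request `count` containers; return how many this app now holds."""
        for _ in range(count):
            container = self.rm.allocate(self.app_id, memory_mb, vcores)
            if container is None:
                break
            self.granted.append(container)
        return len(self.granted)


def demo() -> Dict[str, int]:
    """Two hypothetical applications share two hypothetical 8 GB nodes."""
    nodes = [NodeManager("node-1", 8192, 4), NodeManager("node-2", 8192, 4)]
    rm = ResourceManager(nodes)
    batch = ApplicationMaster("mapreduce-job", rm)
    stream = ApplicationMaster("streaming-job", rm)
    batch.request(count=3, memory_mb=2048, vcores=1)
    stream.request(count=2, memory_mb=4096, vcores=2)
    return {n.name: n.free_memory_mb() for n in nodes}


print(demo())  # prints {'node-1': 2048, 'node-2': 0}
```

The point of the sketch is that the ResourceManager only hands out containers; it has no knowledge of what a MapReduce or streaming task is. That knowledge lives in each ApplicationMaster, which is what lets different frameworks share one cluster.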

Running Multi-Tenant Frameworks

With YARN, a single Hadoop cluster can run diverse frameworks concurrently. Every node and client locates the shared ResourceManager through yarn-site.xml:

```xml
<!-- Conceptual yarn-site.xml cluster resource configuration -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>yarn-master.shivamitcs.in</value>
  </property>
</configuration>
```
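
Hadoop reads this file itself, but an operations script may want to inspect the same settings. The following is a small sketch, assuming the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children; the helper name `parse_hadoop_config` is hypothetical, and the XML is embedded as a string so the example stands alone.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Same content as the conceptual yarn-site.xml above, embedded for the example.
YARN_SITE = """\
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>yarn-master.shivamitcs.in</value>
  </property>
</configuration>
"""


def parse_hadoop_config(xml_text: str) -> Dict[str, str]:
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    settings: Dict[str, str] = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip malformed entries that lack a name or a value.
        if name is not None and value is not None:
            settings[name.strip()] = value.strip()
    return settings


settings = parse_hadoop_config(YARN_SITE)
print(settings["yarn.resourcemanager.hostname"])  # prints yarn-master.shivamitcs.in
```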

This multi-tenant architecture increases hardware utilization and allows organizations to run real-time analytics alongside standard batch transformations.
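
One way to picture how tenants share a cluster is to divide its memory by percentage, loosely in the spirit of scheduler queues. The sketch below is an illustration only: the function name, tenant names, percentages, and cluster size are all hypothetical, and a real scheduler also handles borrowing idle capacity, which is not modelled here.

```python
from typing import Dict


def split_capacity(total_memory_mb: int, shares_percent: Dict[str, int]) -> Dict[str, int]:
    """Divide cluster memory among tenants according to whole-percent shares."""
    total = sum(shares_percent.values())
    if total != 100:
        raise ValueError(f"shares must sum to 100, got {total}")
    # Integer arithmetic keeps the result in whole megabytes.
    return {
        tenant: total_memory_mb * pct // 100
        for tenant, pct in shares_percent.items()
    }


# Hypothetical tenants on a hypothetical 100 GB (102,400 MB) cluster.
guaranteed = split_capacity(102400, {"batch": 60, "realtime": 30, "adhoc": 10})
print(guaranteed)  # prints {'batch': 61440, 'realtime': 30720, 'adhoc': 10240}
```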

Vijay Paliwal
Founder, SHIVAM ITCS · 18+ years enterprise & AI engineering
MCA · Ex-HiveGPT USA · Ex-Social27 Seattle