UC BERKELEY
EECS technical reports


EECS-2010-6.pdf

Statistical Workloads for Energy Efficient MapReduce

Authors:
Chen, Yanpei
Ganapathi, Archana Sulochana
Fox, Armando
Katz, Randy H.
Patterson, David A.
Technical Report Identifier: EECS-2010-6
January 21, 2010

Abstract: Energy efficiency is a growing concern in modern datacenters. As Internet services increasingly rely on MapReduce workloads to fuel their flagship businesses, there is a growing need for better MapReduce energy efficiency evaluation mechanisms. We present a statistics-driven workload generation framework that distills summary statistics from production MapReduce traces and realistically reproduces representative workloads. These workloads help us evaluate design decisions with regard to scale, configuration, scheduling, and other issues. We use this framework to identify specific suggestions to improve MapReduce energy efficiency. Our key finding is that evaluations using trace-driven workloads reverse current design priorities in optimizing for data-intensive synthetic jobs.