
A Survey of Big Data Machine Learning Applications

Optimization in Cloud Data Centers and Networks


Sanaa Hamid Mohamed, Student Member, IEEE, Taisir E.H. El-Gorashi, Member, IEEE, and Jaafar
M.H. Elmirghani, Senior Member, IEEE

Abstract— This survey article reviews the challenges associated with deploying and optimizing big data applications
and machine learning algorithms in cloud data centers and networks. The MapReduce programming model and its
widely-used open-source platform, Hadoop, are enabling the development of a large number of cloud-based services
and big data applications. MapReduce and Hadoop thus enable innovative, efficient, and accelerated intensive
computations and analytics. These services usually utilize commodity clusters within geographically-distributed data
centers and provide cost-effective and elastic solutions. However, the increasing traffic between and within the data
centers that migrate, store, and process big data is becoming a bottleneck that calls for enhanced infrastructures
capable of reducing the congestion and power consumption. Moreover, enterprises with multiple tenants requesting
various big data services are challenged by the need to optimize leasing their resources at reduced running costs and
power consumption while avoiding under- or over-utilization. In this survey, we present a summary of the characteristics
of various big data programming models and applications and provide a review of cloud computing infrastructures,
and related technologies, such as virtualization and software-defined networking, that increasingly support big data
systems. Moreover, we provide a brief review of data center topologies, routing protocols, and traffic characteristics,
and emphasize the implications of big data on such cloud data centers and their supporting networks. Wide-ranging
efforts have been devoted to optimizing systems that handle big data in terms of various application performance metrics
and/or infrastructure energy efficiency. This survey aims to summarize some of these studies, which are classified
according to their focus into applications-level, networking-level, or data center-level optimizations. Finally, some
insights and future research directions are provided.
Index Terms— Big Data, MapReduce, Machine Learning, Data Streaming, Cloud Computing, Cloud Networking,
Software-Defined Networking (SDN), Virtual Machines (VM), Network Function Virtualization (NFV), Containers,
Data Center Networking (DCN), Energy Efficiency, Completion Time, Scheduling, Routing.

I INTRODUCTION
THE evolving paradigm of big data is essential for critical advancements in data processing models and the
underlying acquisition, transmission, and storage infrastructures [1]. Big data differs from traditional data in being
potentially unstructured, rapidly generated, continuously changing, and massively produced by a large number of
distributed users or devices. Typically, big data workloads are transferred into powerful data centers containing
sufficient storage and processing units for real-time or batch computations and analysis. A widely used
characterization for big data is the "5V" notion which describes big data through its unique attributes of Volume,
Velocity, Variety, Veracity, and Value [2]. In this characterization, the volume refers to the vast amount of data produced,
which is usually measured in Exabytes (i.e. 2^60 or 10^18 bytes) or Zettabytes (i.e. 2^70 or 10^21 bytes), while the
velocity reflects the high speed or rate of data generation and hence potentially the short-lived useful lifetime of
data. Variety indicates that big data can be composed of different types of data which can be categorized into
structured and unstructured. An example of structured data is bank transactions which can fit into relational
database systems, and an example of unstructured data is social media content that could be a mix of text,
photos, animated Graphics Interchange Format (GIF), audio files, and videos contained in the same element (e.g.
a tweet, or a post). The veracity measures the trustworthiness of the data as some generated portions could be
erroneous or inaccurate, while the value measures the ability of the user or owner of the data to extract useful
information from the data.

In 2020, the global data volume is predicted to be around 40,000 Exabytes, which represents a 300-fold
growth compared to the global data volume in 2005 [3]. The global data volume was estimated at about 640 Exabytes
in 2010 [4] and at about 2,700 Exabytes in 2015 [5]. This huge growth in data volumes is the result
of continuous developments in various applications that generate massive and rich content related to a wide range
of human activities. For example, online business transactions are expected to reach a rate of 450 Billion
transactions per day by 2020 [4]. Social media platforms such as Facebook, LinkedIn, and Twitter, which have between
300 Million and 2 Billion subscribers who access them through web browsers on personal computers (PCs) or through
applications installed on tablets and smart phones, are enriching the Internet with content in the range of several
Terabytes (2^40 bytes) per day [5]. Analyzing the thematic connections
between the subscribers, for example by grouping people with similar interests, is opening remarkable
opportunities for targeted marketing and e-commerce. Moreover, the subscribers' behaviours and preferences
tracked by their activities, clickstreams, requests, and collected web log files can be analyzed with big data mining
tools for profound psychological, economical, business-oriented, and product improvement studies [6], [7]. To
accelerate the delay-sensitive operations of web searching and indexing, distributed programming models for big
data such as MapReduce were developed [8]. MapReduce is a powerful, reliable, and cost-effective programming
model that performs parallel processing for large distributed datasets. These features have enabled the
development of different distributed programming big data solutions and cloud computing applications.

Fig. 1. Big data communication, networking, and processing infrastructure, and examples of big data
applications.

A wide range of applications are considered big data applications, from data-intensive scientific applications
that require extensive computations to applications that manipulate massive datasets, such as those in earth sciences,
astronomy, nanotechnology, genomics, and bioinformatics [9]. Typically, the computations, simulations, and
modelling in such applications are carried out in High Performance Computing (HPC) clusters with the aid of
distributed and grid computing. However, the growth of datasets beyond the capacities of these systems, in addition to the
desire to share datasets for scientific research collaborations in some disciplines, is encouraging the utilization of
big data applications in cloud computing infrastructures with commodity devices for scientific computations
despite the resultant performance and cost tradeoffs [10].

With the prevalence of mobile applications and services that have extensive computational and storage
demands exceeding the capabilities of the current smart phones, emerging technologies such as Mobile Cloud
Computing (MCC) were developed [11]. In MCC, the computational and storage demands of applications are
outsourced to remote (or close as in mobile edge computing (MEC)) powerful servers over the Internet. As a
result, on-demand rich services such as video streaming, interactive video, and online gaming can be effectively
delivered to the capacity and battery limited devices. Video content accounted for 51% of the total mobile data
traffic in 2012 [11], and is predicted to account for 78% of an expected total volume of 49 Exabytes by 2021 [12].
Due to these huge demands, in addition to the large sizes of video files, big video data platforms are facing
several challenges related to video streaming, storage, and replication management, while needing to meet strict
quality-of-experience (QoE) requirements [13].
In addition to mobile devices, the wide range of everyday physical objects that are increasingly interconnected
for automated operations has formed what is known as the Internet-of-Things (IoT). In IoT systems, the underlying
communication and networking infrastructure is typically integrated with big data computing systems for data
collection, analysis, and decision-making. Several technologies such as RFID, low power communication
technologies, Machine-to-Machine (M2M) communications, and wireless sensor networking (WSN) have been
suggested for improved IoT communications and networking infrastructure [14]. To process the big data generated
by IoT devices, different solutions such as cloud and fog computing were proposed [15]-[31]. Existing cloud
computing infrastructures could be utilized by aggregating and processing big data in powerful central data
centers. Alternatively, data could be processed at the edge where fog computing units, typically with limited
processing capacities compared to the cloud, are utilized [32]. Edge computing reduces both the traffic in core
networks and the latency by being closer to end devices. The connected devices could be sensors gathering
different real-time measurements, or actuators performing automated control operations in industrial, agricultural,
or smart building applications. IoT can support vehicle communication to realize smart transportation systems.
IoT can also support medical applications such as wearables and telecare applications for remote treatment,
diagnosis and monitoring [33]. With this variety in IoT devices, the number of Internet-connected things is
expected to exceed 50 Billion by 2020, and the services provided by IoT are expected to add $15 Trillion to the
global Gross Domestic Product (GDP) in the next 20 years [14]. Figure 1 provides generalized big data
communication, networking, and processing infrastructure and examples of applications that can utilize it.

Fig.2. Classification of big data applications optimization studies.

Achieving the full potential of big data requires a multidisciplinary collaboration between computer scientists,
engineers, data scientists, as well as statisticians and other stakeholders [4]. It also calls for huge investments and
developments by enterprises and other organizations to improve big data processing, management, and analytics
infrastructures to enhance decision making and services offerings. Moreover, there are urgent needs for integrating
new big data applications with existing Application Program Interfaces (API) such as Structured Query Language
(SQL) and the R language for statistical computing. More than $15 billion has already been invested in big data
systems by several leading Information Technology (IT) companies such as IBM, Oracle, Microsoft, SAP, and
HP [34]. One of the challenges of big data in enterprise and cloud infrastructures is the existence of various
workloads and tenants with different Service Level Agreement (SLA) requirements that need to be hosted on the
same set of clusters. An early solution to this challenge at the application level is to utilize a distributed file system
to control the access and sharing of data within the clusters [35]. On the infrastructure level, solutions such as
Virtual Machines (VMs) or Linux containers dedicated to each application or tenant were utilized to support the
isolation between their assigned resources [1], [34]. Big data systems are also challenged by security, privacy, and
governance-related concerns. Furthermore, as the computational demands of the growing data
volumes are exceeding the capabilities of existing commodity infrastructures, future enhanced and energy-efficient
processing and networking infrastructures for big data have to be investigated and optimized.

This survey paper aims to summarize a wide range of studies that use different state-of-the-art and emerging
networking and computing technologies to optimize and enhance big data applications and systems in terms of
various performance metrics such as completion time, data locality, load balancing, fairness, reliability, and
resource utilization, and/or their energy efficiency. Due to the popularity and wide applicability of the
MapReduce programming model, its related optimization studies will be the main focus in this survey. Moreover,
optimization studies for big data management and streaming, in addition to generic cloud computing applications
and bulk data transfer are also considered. The optimization studies in this survey are classified according to their
focus into applications-level, cloud networking-level, and data center-level studies as summarized in Figure 2.
The first category at the application level targets the studies that extend or optimize existing framework parameters
and mechanisms such as optimizing jobs and data placements, and scheduling, in addition to developing
benchmarks, traces, and simulators [36]-[119]. As the use of big data applications is evolving from clusters with
controlled environments into cloud environments with geo-distributed data centers, several additional challenges
are encountered. The second category at the networking level focuses on optimizing cloud networking
infrastructures for big data applications such as inter-data centers networking, virtual machine assignments, and
bulk data transfer optimization studies [120]-[203]. The increasing data volumes and processing demands are also
challenging the data centers that store and process big data. The third category at the data center level targets
optimizing the topologies, routing, and scheduling in data centers for big data applications, in addition to the
studies that utilize, demonstrate, and suggest scaled-up computing and networking infrastructures to replace
commodity hardware in the future [204]-[311]. For the performance evaluations in the aforementioned studies,
either realistic traces or deployments in experimental testbed clusters are utilized.

Although several big data related surveys and tutorials are available, to the best of our knowledge, none has
extensively addressed optimizing big data applications while considering the technological aspects of their hosting
cloud data centers and networking infrastructures. The tutorial in [1] and the survey in [312] considered mapping
the role of cloud computing, IoT, data centers, and applications to the acquisition, storage and processing of big
data. The authors in [313]-[316] extensively surveyed the advances in big data processing frameworks and
compared their components, usage, and performance. A review of benchmarking big data systems is provided in
[317]. The surveys in [318], [319] focused on optimizing jobs scheduling at the application level, while the survey
in [320] additionally tackled extensions, tuning, hardware acceleration, security, and energy efficiency for
MapReduce. The environmental impacts of big data and its use in greening applications and systems were
discussed in [321]. Security and privacy concerns of MapReduce in cloud environments were discussed in [322],
while the challenges and requirements of geo-distributed batch and streaming big data frameworks were outlined
in [323]. The surveys in [324]-[326] addressed the use of big data analytics to optimize wired and wireless
networks, while the survey in [327] overviewed big data mathematical representations for networking
optimizations. The scheduling of flows in data centers for big data is addressed in [328] and the impact of data
center frameworks on the scheduling and resource allocation is surveyed in [329] for three big data applications.

This survey paper is structured as follows: For the convenience of the reader, brief overviews of the state-of-
the-art and advances in big data programming models and frameworks, cloud computing and its related technologies,
and cloud data centers are provided before the corresponding optimization studies. Section II reviews the
characteristics of big data programming models and existing batch processing, stream processing, and storage management
applications while Section III summarizes the applications-focused optimization studies. Section IV discusses the
prominence of cloud computing for big data applications, and the implications of big data applications on cloud
networks. It also reviews some related technologies that support big data and cloud computing systems such as
machine and network virtualization, and Software-Defined Networking (SDN). Section V summarizes the cloud
networking-focused optimization studies. Section VI briefly reviews data center topologies, traffic characteristics
and routing protocols, while Section VII summarizes the data center-focused optimization studies. Finally, Section
VIII provides future research directions, and Section IX concludes the survey. Key Acronyms are provided below.

APPENDIX A LIST OF KEY ACRONYMS


AM Application Master.
ACID Atomicity, Consistency, Isolation, and Durability.
BASE Basically Available Soft-state Eventual consistency.
CAPEX Capital Expenditure.
CPU Central Processing Unit.
CSP Cloud Service Provider.
DAG Directed Acyclic Graph.
DCN Data Center Networking.
DFS Distributed File System.
DVFS Dynamic Voltage and Frequency Scaling.
EC2 Elastic Compute Cloud.
EON Elastic Optical Network.
ETSI European Telecom Standards Institute.
HDFS Hadoop Distributed File System.
HFS Hadoop Fair Scheduler.
HPC High Performance Computing.
I/O Input/Output.
ILP Integer Linear Programming.
IP Internet Protocol.
ISP Internet Service Provider.
JT Job Tracker.
JVM Java Virtual Machine.
MILP Mixed Integer Linear Programming.
NFV Network Function Virtualization.
NM Node Manager.
NoSQL Not only SQL.
OF OpenFlow.
O-OFDM Optical Orthogonal Frequency Division Multiplexing.
OPEX Operational Expenses.
OVS Open vSwitch.
P2P Peer-to-Peer.
PON Passive Optical Network.
QoE Quality of Experience.
QoS Quality of Service.
RAM Random Access Memory.
RDBMS Relational Data Base Management System.
RDD Resilient Distributed Datasets.
RM Resource Manager.
ROADM Reconfigurable Optical Add Drop Multiplexer.
SDN Software Defined Networking.
SDON Software Defined Optical Networks.
SLA Service Level Agreement.
SQL Structured Query Language.
TE Traffic Engineering.
ToR Top-of-Rack.
TT Task Tracker.
VM Virtual Machine.
VNE Virtual Network Embedding.
VNF Virtual Network Function.
WAN Wide Area Network.
WC WordCount.
WDM Wavelength Division Multiplexing.
WSS Wavelength Selective Switch.
YARN Yet Another Resource Negotiator.

II PROGRAMMING MODELS, PLATFORMS, AND APPLICATIONS FOR BIG DATA ANALYTICS:


This Section reviews some of the programming models developed to provide parallel computation for big data.
These programming models include MapReduce [8], Dryad [330], and Cloud Dataflow [331]. The input data in
these models can be generally categorized into bounded and unbounded data depending on the required
computational speed. In applications that have no strict processing speed requirements, the input data can be
aggregated, bounded, and then processed in batch mode. In applications that require instantaneous analysis of
continuous data flows (i.e. unbounded data), streaming mode is utilized. The MapReduce programming model,
with the support of open source platforms such as Apache Hadoop, has been widely used for efficient, scalable,
and reliable computations at reduced costs, especially for batch processing. The maturity of such platforms has
supported the development of several scalable commercial or open-source big data applications and services for
processing and data storage management. The rest of this Section is organized as follows: Subsection II-A
illustrates the aforementioned programming models, while Subsection II-B briefly describes the characteristics
and components of Apache Hadoop and its related applications. Subsection II-C focuses on big data storage
management applications, while Subsection II-D summarizes some of the in-memory big data applications.
Finally, Subsections II-E, II-F, and II-G elaborate on distributed graph processing, big data stream processing
applications, and the Lambda hybrid architecture, respectively.
A. Programming Models:
The MapReduce programming model was introduced by Google in 2003 as a cost-effective solution for
processing massive data sets. MapReduce utilizes distributed computations in commodity clusters that run in two
phases, map and reduce, which are adopted from the Lisp functional programming language [8]. The MapReduce
user defines the required functions of each phase by using a programming language such as C++, Java, or Python,
and then submits the code as a single MapReduce job to process the data. The user also defines a set of parameters
to configure the job. Each MapReduce job consists of a number of map and reduce tasks depending on the input
data size and the configurations, respectively. Each map task is assigned to process a unique portion of the input
data set, preferably available locally, and hence can run independently from other map tasks. The processing starts
by transforming the input data into the key-value schema and applying the map function to compute other
key-value pairs, also known as the intermediate results. These results are then shuffled to reduce tasks according to
their keys, where each reduce task is assigned to process intermediate results with a unique set of keys. Finally,
each reduce task generates the final outputs. The internal operational details of MapReduce such as assigning the
nodes within the cluster to map or reduce tasks, partitioning the input data, tasks scheduling, fault tolerance, and
inter-machine communications are typically performed by the run-time system and are hidden from the users. The
input and output data files are typically managed by a Distributed File System (DFS) that provides a unified view
of the files and their details, and allows various modifications such as replication, read, and write operations. An
example of DFSs is the Google File System (GFS) [332] which is a fault-tolerant, reliable, and scalable chunk-
based distributed file system designed to support MapReduce in Google's commodity servers. The typical chunk
size in GFS is 64 MB and each chunk is replicated on different nodes with a default replication factor of 3 to support fault-
tolerance.
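For illustration, the following minimal word-count sketch, written here in Python in the style of Hadoop Streaming, shows what the user-defined map and reduce functions described above might look like; the single-file layout and the command-line switch are illustrative assumptions rather than part of the MapReduce specification, and the shuffling and sorting between the two phases are performed by the framework, not by this script.

```python
#!/usr/bin/env python3
"""Word-count mapper and reducer in the Hadoop Streaming style (illustrative)."""
import sys

def run_mapper(lines):
    # Map phase: emit an intermediate (key, value) pair, here (word, 1),
    # for every word in the input split.
    for line in lines:
        for word in line.strip().split():
            print(f"{word}\t1")

def run_reducer(sorted_lines):
    # Reduce phase: the framework delivers intermediate pairs grouped and
    # sorted by key, so equal keys arrive consecutively and can be summed.
    current_word, count = None, 0
    for line in sorted_lines:
        word, value = line.rstrip("\n").split("\t")
        if word != current_word and current_word is not None:
            print(f"{current_word}\t{count}")
            count = 0
        current_word = word
        count += int(value)
    if current_word is not None:
        print(f"{current_word}\t{count}")

if __name__ == "__main__":
    # Invoked as "wordcount.py map" for map tasks or "wordcount.py reduce"
    # for reduce tasks (an illustrative convention).
    if len(sys.argv) > 1 and sys.argv[1] == "map":
        run_mapper(sys.stdin)
    else:
        run_reducer(sys.stdin)
```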
The components of a typical MapReduce cluster and the implementation details of the MapReduce
programming model are illustrated in Figure 3. One of the nodes in the cluster is set to be a Master, while the
others are set to be Workers and are assigned to either a map or a reduce task by the master. Besides task
assignments, the master is also responsible for monitoring the performance of the running tasks and for checking
the statuses of the nodes within the cluster. The master also manages the information about the location of the
running jobs, data replicas, and intermediate results. The detailed steps of implementing a MapReduce job are as
follows [8]:
1) The MapReduce code is copied to all cluster nodes. In this code, the user defines the map and reduce
functions and provides additional parameters such as input and output data types, names of the output
files, and the number of reduce workers.
2) The master assigns map and reduce tasks to available workers; typically, map workers outnumber
reduce workers and are assigned several map tasks each.
3) The input data, in the form of key-value pairs, is split into smaller partitions. The splits (S) and their
replicas (SR) are distributed in the map workers’ local disks as illustrated in Figure 3. The splits are then
processed concurrently in their assigned map workers according to their scheduling. Each map function
produces intermediate results (IR) consisting of the intermediate key-value pairs. These results are then
materialized (i.e. saved persistently) in the local disks of the map workers.
4) The intermediate results are divided into (R) parts to be processed by R reduce workers. Partitioning can
be done through hash functions (e.g. hash(key) mod R) to ensure that each key is assigned to only one
reduce worker. The locations of the hashed intermediate results and their file sizes are sent to the master
node.
Fig. 3. Google’s MapReduce cluster components and the programming model implementation.

5) A reduce task is composed of shuffle, sort, and reduce phases. The shuffle phase can start when 5% of
the map results are generated; however, the final reduce phase cannot start until all map tasks are
completed. For shuffling, each reduce worker obtains the locations of the intermediate pairs with the
keys assigned to it and fetches the corresponding results from the map workers’ local disks typically via
the HyperText Transfer Protocol (HTTP).
6) Each reduce worker then sorts its intermediate results by the keys. The sorting is performed in the
Random Access Memory (RAM) if the intermediate results can fit, otherwise, external sort-merge
algorithms are used. The sorting groups all the occurrences of the same key and forms the shuffled
intermediate results (SIR).
7) Each reduce worker applies the assigned user-defined reduce function on the shuffled data to generate
the final key-value pairs output (O), the final output files are then saved in the distributed file system.
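The hash partitioning mentioned in step 4 can be sketched as follows; the stable hash function and the choice of R are illustrative and are not Hadoop's internal partitioner.

```python
import hashlib

R = 4  # number of reduce workers (illustrative)

def partition(key: str, num_reducers: int = R) -> int:
    """Map an intermediate key to one of R reduce partitions.

    A stable hash is used (rather than Python's per-process salted hash())
    so that every map worker sends a given key to the same reduce partition.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_reducers

# All occurrences of the same key end up in the same partition on every worker.
assert partition("cloud") == partition("cloud")
print(partition("cloud"), partition("data"), partition("centers"))
```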
In MapReduce, fault-tolerance is achieved by re-executing the failed tasks. Failures can occur due to hardware
causes such as disk failures, running out of disk space or memory, and socket timeouts. Each map or reduce task can be in
one of three statuses which are idle, in-progress, and completed. If an in-progress map task fails, the master
changes its status to idle to allow it to be re-scheduled on an available map node containing a replica of the data.
If a map worker fails while having some completed map tasks, all of its map tasks must be re-scheduled as
the intermediate results, which are only saved on local disks, are no longer accessible. In this case, all reduce
workers must be re-scheduled to obtain the correct and complete set of intermediate results. If a reduce worker
fails, only its in-progress tasks are re-scheduled, as the results of completed reduce tasks are saved in the distributed
file system. To improve MapReduce performance, speculative execution can be activated, where backup tasks are
created to speed up the lagging in-progress tasks known as stragglers [8].
The MapReduce programming model is capable of solving several common programming problems such as
word count and sort, in addition to implementing complex graph processing, data mining, and machine learning
applications. However, the speed requirements for some computations might not be satisfied by MapReduce due
to several limitations [333]. Moreover, developing efficient MapReduce applications requires advanced
programming skills to fit the computations into the map and reduce pipeline, and deep knowledge of underlying
infrastructures to properly configure and optimize a wide range of parameters [334]-[337]. One of MapReduce
limitations is that transferring non-local input data to map workers and shuffling intermediate results to reduce
workers typically require intensive networking bandwidth and disk I/O operations. Early efforts to minimize the
effects of these bottlenecks included maximizing data locality, where the computations are carried out closer to the data
[8]. Another limitation is due to the fault-tolerance mechanism that requires materializing the entire output of
MapReduce jobs in the disks managed by the DFS before being accessible for further computations. Hence,
MapReduce is generally less suitable for interactive and iterative computations that require repetitive access to
results. An implementation variation known as MapReduce Online [338] supports shuffling by utilizing RAM
resources to pipeline intermediate results between map and reduce stages before the materialization.
Several other programming models were developed as variants to MapReduce [314]. One of these variants is
Dryad which is a high-performance general-purpose distributed programming model for parallel applications with
coarse-grain data [330]. In Dryad, a Directed Acyclic Graph (DAG) is used to describe the jobs by representing
the computations as the graph vertices and the data communication patterns as the graph edges. The Job Manager
within Dryad schedules the vertices, which contain sequential programs, to run concurrently on a set of machines
that are available at the run time. These machines can be different cores within the same multi-core PC or can be
thousands of machines within a large data center. Dryad provides fault-tolerance and efficient resource utilization
by allowing graph modification during the computations. Unlike MapReduce, which restricts the programmer to
provide a single input file and produces a single output file, Dryad allows for an arbitrary number of input and
output files.
As a successor to MapReduce, Google has recently introduced “Cloud Dataflow” which is a unified
programming model with enhanced processing capabilities for bounded and unbounded data. It provides a balance
between correctness, latency, and cost when processing massive, unbounded, and out-of-order data [331]. In
Cloud Dataflow, the data is represented as tuples containing the key, value, event-time, and the required time
window for the processing. This supports sophisticated user requirements such as event-time ordering of results
by utilizing data windowing that divides the data streams into finite chunks to be processed in groups. Cloud
Dataflow utilizes the features of both FlumeJava, which is a batch engine, and MillWheel, which is a streaming engine
[331]. The core primitives of Cloud Dataflow are ParDo, which is an element-wise generic parallel processing
function, and GroupByKey and GroupByKeyAndWindow, which aggregate data with the same key according to the
user requests.
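The Cloud Dataflow model is also available through the open-source Apache Beam SDKs, into which the Dataflow programming model was contributed. The sketch below is a minimal example using the Beam Python SDK and its local runner; the element values, event timestamps, and 60-second fixed windows are illustrative assumptions.

```python
import apache_beam as beam
from apache_beam.transforms import window

# Each tuple is (key, value, event-time in seconds); a real pipeline would
# read from an unbounded source instead of this small in-memory list.
events = [("clicks", 1, 10.0), ("clicks", 3, 70.0), ("views", 2, 15.0)]

with beam.Pipeline() as p:
    (p
     | beam.Create(events)
     # Attach the event time so that windowing reflects when events occurred.
     | beam.Map(lambda e: window.TimestampedValue((e[0], e[1]), e[2]))
     # Divide the stream into 60-second fixed windows (the windowing step).
     | beam.WindowInto(window.FixedWindows(60))
     # Aggregate values per key within each window, i.e. a grouped combine.
     | beam.CombinePerKey(sum)
     | beam.Map(print))
```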
B. Apache Hadoop Architecture and Related Software:
Hadoop, which is currently under the auspices of the Apache Software Foundation, is an open source software
framework written in Java for reliable and scalable distributed computing [339]. This framework was initiated by
Doug Cutting who utilized the MapReduce programming model for indexing web crawls and was open-sourced
by Yahoo in 2005. Besides Apache, several organizations have developed and customized Hadoop distributions
tailored for their infrastructures such as HortonWorks, Cloudera, Amazon Web Services (AWS), Pivotal, and
MAPR technologies. The Hadoop ecosystem allows other programs to run on the same infrastructure with
MapReduce which made it a natural choice for enterprise big data platforms [35].
The basic components of the first versions of Hadoop, Hadoop 1.x, are depicted in Figure 4. These versions
contain a layer for the Hadoop Distributed File System (HDFS), a layer for the MapReduce 1.0 engine which
resembles Google's MapReduce, and can have other applications on the top layer. The MapReduce 1.0 layer
follows the master-slave architecture. The master is a single node containing a Job Tracker (JT), while each slave
node contains a Task Tracker (TT). The JT handles jobs assignment and scheduling and maintains the data and
metadata of jobs, in addition to resources information. It also monitors the liveness of TTs and the availability of
their resources through periodic heartbeat messages, typically every 3 seconds. Each TT contains a predefined
set of slots. Once it accepts a map or a reduce task, it launches a Java Virtual Machine (JVM) in one of its slots
to perform the task, and periodically updates the JT with the task status [339].
The HDFS layer consists of a name node in the master and a data node in each slave node. The name
node stores the details of the data nodes and the addresses of the data blocks and their replicas. It also checks the
data nodes via heartbeat messages and manages load balancing. For reliability, a secondary name node is typically
assigned to save snapshots of the primary name node. As in GFS, the default block size in HDFS is 64 MB and three
replicas are maintained for each block for fault-tolerance, performance improvements, and load balancing. Besides
GFS and HDFS, several distributed file systems were developed such as Amazon's Simple Storage Service (S3),
Moose File System (MFS), Kosmos distributed file system (KFS), and Colossus [314], [340].
Default tasks scheduling mechanisms in Hadoop are First-In First-Out (FIFO), capacity scheduler, and Hadoop
Fair Scheduler (HFS). FIFO schedules the jobs according to their arrival time which leads to undesirable delays
in environments with a mix of long batch jobs and small interactive jobs [319]. The Capacity scheduler developed
at Yahoo reserves a pool containing minimum resource guarantees for each user, and hence suits systems with
multiple users [319]. FIFO scheduling is then used for the jobs of the same user. The Fair scheduler developed at
Facebook dynamically allocates the resources equally between jobs. It thus improves the response time of small
jobs [46].

Fig. 4. Framework components in Hadoop (a) Hadoop 1.x, and (b) Hadoop 2.x.

Hadoop 2.x, which is also depicted in Figure 4, introduced a resource management platform named YARN
(Yet Another Resource Negotiator) [341]. YARN decouples the resource management infrastructure from the
processing components and enables the coexistence of different processing frameworks beside MapReduce which
increases the flexibility in big data clusters. In YARN, the JT and TT are replaced with three components which
are the Resource Manager (RM), the Node Manager (NM), and the Application Master (AM). The RM is a per-
cluster global resource manager which runs as a daemon on a dedicated node. It contains a scheduler that
dynamically leases the available cluster resources in the form of containers (further explained in Subsection IV-
B4), which are considered as logical bundles (e.g. 2 GB RAM, 1 Central Processing Unit (CPU) core), among
competing MapReduce jobs and other applications according to their demands and scheduling priorities. A NM
is a per-server daemon that is responsible for monitoring the health of its physical node, tracking its containers
assignments, and managing the containers lifecycle (i.e. starting and killing). An AM is a per-application container
that manages the resource consumption and the job execution flow, and also handles fault-tolerance tasks. The
AM, which typically needs to harness resources from several nodes to finish its job, issues a resource request to
the RM indicating the required number of containers, the required resources per container, and the locality
preferences.
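To make the shape of such a request concrete, the following plain-Python sketch models the fields an AM typically communicates to the RM; it is purely illustrative and is not the YARN client API or its wire protocol.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContainerRequest:
    """Illustrative model of a resource request sent by an AM to the RM."""
    num_containers: int                       # containers of this shape needed
    memory_mb: int                            # RAM per container, e.g. 2048
    vcores: int                               # CPU cores per container
    preferred_nodes: List[str] = field(default_factory=list)  # data-local hosts
    preferred_racks: List[str] = field(default_factory=list)
    priority: int = 0                         # relative urgency (convention here)

# Example: a MapReduce AM asking for ten 2 GB / 1-core containers for its map
# tasks, preferably on the nodes holding the corresponding input splits.
request = ContainerRequest(num_containers=10, memory_mb=2048, vcores=1,
                           preferred_nodes=["node07", "node12"])
print(request)
```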
Figure 5 illustrates the differences between Hadoop 1.x, and Hadoop 2.x with YARN. A detailed study of
various releases of Hadoop is presented in [342]. The authors also provided a brief comparison of these releases
in terms of their energy efficiency and performance.

Fig. 5. Comparison of clusters with (a) Google’s MapReduce, (b) Hadoop 1.x, and (c) Hadoop 2.x with YARN.

YARN increases the resource allocation flexibility in MapReduce 2 as it utilizes flexible containers for
resource allocations and hence eliminates the use of fixed resource slots as in MapReduce 1. However, this
advantage comes at the expense of added system complexity and a slight increase in the power consumption
when compared to MapReduce 1 [342]. Another difference is that the intermediate results shuffling operations
in MapReduce 2 are performed via auxiliary services that preserve the output of a container’s results before killing
it. The communications between the AMs, NMs, and the RM are heartbeat-based. If a node fails, the NM sends
an indicating heartbeat to the RM which in turn informs the affected AMs. If an in-progress job fails, the NM
marks it as idle and re-executes it. If an AM fails, the RM restarts it and synchronizes its tasks. If an RM fails, an
old checkpoint is used and a secondary RM is activated [341].
A wide range of applications and programming frameworks can run natively in Hadoop as depicted in Figure
4. The difference in their implementation between the Hadoop versions is that in the 1.x versions, they are forced
to follow the MapReduce framework while in the 2.x versions they are no longer restricted to it. Examples of
these applications and frameworks are Pig [343], Tez [344], Hive [345], HBase [314], Storm [346], Giraph for
graph processing, and Mahout for machine learning [315], [340]. Due to the lack of built-in declarative languages
in Hadoop, Pig, Tez, and Hive were introduced to support querying and to replace ad-hoc user-written programs
which were hard to maintain and reuse. Pig is composed of an execution engine and a declarative scripting
language named Pig Latin that compiles SQL-like queries to an equivalent set of sequenced MapReduce jobs. Pig
hides the complexity of MapReduce and directly provides advanced operations such as filtering, joining, and
ordering [343]. Tez is a flexible input-processor-output runtime model that transforms the queries into an abstracted
DAG, where the vertices represent the parallel tasks and the edges represent the data movement between different
map and reduce stages [344]. It supports in-memory operations which makes it more suitable for interactive
processing than MapReduce. Hive [345] is a data warehouse software developed at Facebook. It contains a
declarative language, HiveQL, that automatically generates MapReduce jobs from SQL-like user queries.

C. Big Data Storage Management Applications:


As with the programming models, the increasing data volume is challenging legacy storage management
systems. Using relational centralized database systems with big data will typically lead to several inefficiencies.
This has encouraged the use of large-scale distributed cloud-based data storage management systems known as
“key-value stores” or, “Not only SQL (NoSQL)”. It is well known that no single computing system is capable of
providing effective processing to all types of workloads. To design a generic platform that suits several types of
requests and workloads, some trade-offs have to be considered. A set of measures for these trade-offs for data storage
management systems is defined and follows the Consistency, Availability, and Partition-tolerance (CAP) theorem
which states that any distributed system can only satisfy two of these three properties. Consistency reflects the
fact that all replicas of an entry must have the same value at all times, and that reading operations should return
the latest values of that entry. Availability implies that the requested operations should always be allowed and
performed promptly, while partition-tolerance indicates that the system can function if some parts of it are
disconnected [313]. Most of the traditional Relational Data Base Management Systems (RDBMS), which utilize
SQL for querying, are classified as “CA” which stands for consistency and availability. As partition-tolerance is
a key requirement in cloud distributed infrastructures, NoSQL systems are classified as either “CP” where the
availability is relaxed or “AP” where the consistency is relaxed and replaced by eventual, timeline, or session
consistency.
For transactions to be processed concurrently, RDBMS are required to provide the Atomicity, Consistency,
Isolation, and Durability (ACID) guarantees. Most of RDBMS rely on expensive shared-memory or shared-disks
hardware to ensure high performance [314]. On the other hand, NoSQL systems utilize commodity share-nothing
hardware while ensuring scalability, fault-tolerance, and cost-effectiveness, traded for consistency. Thus, the
ACID guarantees are typically relaxed or replaced by the Basically Available Soft-state Eventual consistency
(BASE) guarantees. While RDBMS are considered mature and well-established, NoSQL systems still lack
experienced programmers and most of these systems are still in pre-production phases [313].
Several big data storage management applications were developed for commercial internal production
workload operations such as BigTable [347], PNUTS [348], and DynamoDB [349]. BigTable is a column-
oriented storage system developed at Google for managing structured large-scale data sets at petabyte scale. It
only supports single row transactions and performs an atomic read-modify-write sequence that relies on a
distributed locking system called Chubby. PNUTS is a massive-scale database system designed at Yahoo to
support their web-based applications, while DynamoDB is a highly scalable and available distributed key-value
data store developed at Amazon to support their cloud-based applications [340]. Examples of open-sourced big
data management systems include HBase, HadoopDB, and Cassandra [314]. HBase is a key-value column-
oriented database management system, while HadoopDB is a hybrid system that combines the scalability of
MapReduce with the performance guarantees of parallel databases. In HadoopDB, the queries are expressed in
SQL but are executed in parallel through the MapReduce framework [313], [350]. Cassandra is a highly scalable
eventually consistent, distributed structured key-value store developed at Facebook [351]. Examples of other
efforts to integrate indexing capabilities with MapReduce are Hadoop++ [352] and the Hadoop
Aggressive Indexing Library (HAIL). Hadoop++ adds indexing and joining capabilities to Hadoop without
changing its framework, while HAIL creates different clustered indexes for each data replica and supports multi-
attribute querying [315].
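As an illustration of how an eventually consistent store exposes the consistency/availability trade-off to applications, the sketch below uses the DataStax cassandra-driver for Python to write at a relaxed consistency level and read at quorum; the contact point, keyspace, and table are assumed to exist and are illustrative.

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Assumes a reachable Cassandra node and an existing keyspace/table, e.g.:
#   CREATE TABLE demo.kv (key text PRIMARY KEY, value text);
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("demo")

# Relaxed write: acknowledged by a single replica (favours availability/latency).
write = SimpleStatement("INSERT INTO kv (key, value) VALUES (%s, %s)",
                        consistency_level=ConsistencyLevel.ONE)
session.execute(write, ("user42", "clicked-ad-17"))

# Stricter read: a quorum of replicas must respond (favours consistency).
read = SimpleStatement("SELECT value FROM kv WHERE key = %s",
                       consistency_level=ConsistencyLevel.QUORUM)
print(session.execute(read, ("user42",)).one().value)

cluster.shutdown()
```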
D. Distributed In-memory Processing:
Distributed in-memory processing systems are widely preferred for iterative, interactive, and real-time big
data applications due to the bottleneck of the relatively slow data materialization processes in existing disk-based
systems [316]. Running these applications in legacy MapReduce systems requires repetitive materialization as
map and reduce tasks have to be re-launched iteratively. This in turn wastes disk
Input/Output (I/O), CPU, and network bandwidth resources [315]. In-memory systems use fast memory units such
as Dynamic Random Access Memory (DRAM) or cache units to provide rapid access to data during the run-time
to avoid the slow disk I/O operations. However, as memory units are volatile, in-memory systems are required to
adopt advanced mechanisms to guarantee fault-tolerance and data durability. Examples of in-memory NoSQL
databases are RAMCloud [353] and HANA [354]. RAMCloud aggregates DRAM resources from thousands of
commodity servers to provide a large-scale database system with low latency. Distributed cache systems can also
enhance the performance of large-scale web applications. Examples of these systems are Redis, which is an in-
memory data structure store, and Memcached which is a light-weight in-memory key-value object caching system
with strict Least Recently Used (LRU) eviction mechanism [316].
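The cache-aside pattern enabled by such systems can be sketched with the redis-py client as follows; the key naming, the 5-minute expiry, and the placeholder load_from_database function are illustrative assumptions.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)   # assumes a local Redis instance

def load_from_database(user_id: str) -> dict:
    # Placeholder for a slow disk-based or RDBMS lookup (illustrative).
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    """Cache-aside read: serve from DRAM when possible, else fetch and cache."""
    cached = r.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)                        # cache hit, no disk I/O
    user = load_from_database(user_id)                   # cache miss
    r.setex(f"user:{user_id}", 300, json.dumps(user))    # cache for 5 minutes
    return user

print(get_user("42"))
```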
To support in-memory interactive and iterative data processing, Spark which is a scalable data analytics
platform written in Scala, was introduced [355]. In Spark, the data is stored as Resilient Distributed Datasets
(RDDs) [356], which are a general-purpose and fault-tolerant abstraction for data sharing in distributed computations.
RDDs are created in memory through coarse-grained deterministic transformations to datasets such as map,
flatmap, filter, join, and GroupByKey. In cases of insufficient RAM, Spark performs lazy materialization of
RDDs. RDDs are immutable and by applying further transformations, new RDDs are created. This provides fault-
tolerance without the need for replication as the information about the transformations that created each RDD can
be retrieved and reapplied to obtain the lost portions.
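A minimal PySpark sketch of these ideas is given below: transformations are recorded as lineage and evaluated lazily, cache() keeps an RDD in memory for reuse, and a lost partition can be recomputed from its lineage rather than from a replica; the local master and toy dataset are illustrative.

```python
from pyspark import SparkContext

sc = SparkContext("local[2]", "rdd-lineage-demo")    # small local cluster (illustrative)

# Coarse-grained transformations build new RDDs lazily and record the lineage.
lines = sc.parallelize(["big data", "cloud data centers", "big data networks"])
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda w: (w, 1))
               .reduceByKey(lambda a, b: a + b)
               .cache())                             # keep in memory for reuse

# Actions trigger execution; a lost partition would be rebuilt by replaying the
# transformations above instead of reading a replicated copy from disk.
print(counts.collect())
print(counts.filter(lambda kv: kv[1] > 1).collect()) # reuses the cached RDD

sc.stop()
```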
E. Distributed Graph Processing:
The partitioning and processing of graphs (i.e. a description of entities by vertices and their relationships by
connecting edges) is considered a key class of big data applications especially for social networks that contain
graphs with up to Billions of entities and edges [357]. Most big data graph processing applications utilize in-
memory systems due to the iterative nature of their algorithms. Pregel [358], developed at Google, targets the
processing of massive graphs on distributed clusters of commodity machines by using the Bulk Synchronous
Parallel (BSP)-based programming model. Giraph, which is the open-source implementation of Pregel, uses online
hashing or range-based partitioning to dispatch sub-graphs to workers [39]. Trinity is a distributed graph engine
that optimizes distributed memory usage and communication cost under the assumption that the whole graph is
partitioned across a distributed memory cloud [359]. Other examples of distributed graph processing applications are
GraphLab [360] which utilizes asynchronous distributed shared memory, and PowerGraph [361], which focuses
on efficient partitioning of graphs with power-law distributions.
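To illustrate the vertex-centric BSP style used by Pregel and Giraph, the plain-Python sketch below runs synchronous PageRank supersteps over a toy graph; it only mimics the compute-and-send-messages pattern conceptually and is not the Pregel or Giraph API.

```python
# Conceptual BSP (Pregel-style) PageRank: in every superstep each vertex reads
# the messages sent to it in the previous superstep, updates its value, and
# sends new messages along its outgoing edges; a barrier separates supersteps.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}    # adjacency lists (toy graph)
rank = {v: 1.0 / len(graph) for v in graph}
DAMPING, SUPERSTEPS = 0.85, 20

for _ in range(SUPERSTEPS):
    # "Send" phase: each vertex distributes its rank over its out-edges.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        share = rank[v] / len(neighbours)
        for n in neighbours:
            inbox[n].append(share)
    # Synchronisation barrier, then "compute" phase on the received messages.
    rank = {v: (1 - DAMPING) / len(graph) + DAMPING * sum(msgs)
            for v, msgs in inbox.items()}

print(rank)
```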
F. Big Data Streaming Applications:
Applications related to e-commerce and social networking typically receive virtually unbounded data, also
known as streams, which are asynchronously generated by a huge number of sources or users at uncontrollable rates.
Stream processing systems are required to provide low latency and efficient processing that copes with the arrival
rate of the events within the streams. If they cannot cope, the processing of some events is dropped, leading to
undesirable load shedding. Conceptually, a streaming system model is composed of continuously
arriving input data to almost static queries for processing, while the batch and RDBMS systems model is
composed of different queries applied to static data [362]. Stream processing can be performed in batch systems;
however, this leads to several inefficiencies. Firstly, the implementation is not straightforward as it requires
transforming the streams into partitions of batch data before processing. Secondly, considering relatively large
partitions increases the processing latency while considering small partitions increases the overheads of
segmenting and maintaining the inter-segment dependencies, in addition to the fault-tolerance overheads caused by
frequent materialization.
Main Memory MapReduce (M3R) [363], MillWheel [364], Storm [346], Yahoo S4 [365], and Spark streaming
[366] are examples of streams processing systems. M3R is introduced to provide high reliability for interactive
and continuous queries over streamed data of terabytes in small clusters [363]. However, it does not provide
adequate resilience guarantees as it caches the results entirely in memory. MillWheel is a scalable, fault-tolerant,
low-latency stream processing engine developed at Google [364] that allows time-based aggregations and
provides fine-grained check-pointing. Storm was developed at Twitter as a distributed and fault-tolerant platform
for real-time processing of streams [346]. It utilizes two primitives, spouts and bolts, to apply transformations to
data streams, also named tuples, in a reliable and distributed manner. The spouts define the stream sources, while
the bolts perform the required computations on the tuples, and emit the resultant modified tuples to other bolts.
The computations in Storm are described as graphs where each node is either a spout or a bolt, and the edges
are the tuple routes. Storm relies on different grouping methods to specify the distribution of the tuples between
the nodes. These include shuffle where streams are partitioned randomly, field where partitioning is performed
according to defined criteria, all where the streams are sent to all bolts, and global where all streams are copied
to a single bolt.
Yahoo Simple Scalable Streaming System (S4) is a general-purpose distributed stream processing engine that
provides low latency and fault-tolerance [365]. In S4, the computations are performed by identical Processing
Elements (PEs) that are logically hosted in Processing Nodes (PNs). The coordination between PEs is performed
by “ZooKeeper” which is an open-source cluster management application. The PEs either emit results to another
PE or publish them, while PNs are responsible for listening to events, executing the required operations,
dispatching events, and emitting output events. Spark streaming [366], which is extended from Spark, introduced
a stream programming model named discretized streams (D-Streams) to provide consistency, fault tolerance, and
efficient integration with batch systems. Two types of operations can be used: transformation operations
that produce new D-Streams, and output operations that save the resultant RDDs to HDFS. Spark Streaming allows
users to seamlessly combine streaming, batch, and interactive queries. It also contains stateful operators such as
windowing, incremental aggregation, and time-skewed joins.
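A short PySpark Streaming sketch of the D-Stream model is given below; the socket source, 2-second batch interval, checkpoint directory, and 30-second window with a 10-second slide are illustrative assumptions, and the windowed reduce shows the incremental aggregation operator mentioned above.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "dstream-demo")
ssc = StreamingContext(sc, batchDuration=2)      # 2-second micro-batches
ssc.checkpoint("/tmp/dstream-checkpoint")        # required for windowed state

# Unbounded text stream from a socket source (e.g. fed by `nc -lk 9999`).
lines = ssc.socketTextStream("localhost", 9999)
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda w: (w, 1))
               # Incremental sliding-window count: 30 s window, 10 s slide.
               .reduceByKeyAndWindow(lambda a, b: a + b,
                                     lambda a, b: a - b, 30, 10))
counts.pprint()                                  # output operation

ssc.start()
ssc.awaitTermination()
```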
G. The Lambda Architecture:
The Lambda architecture is a hybrid big data processing system that provides concurrent arbitrary processing
and querying of batch and real-time data in a single system [367]. As depicted in Figure 6, it
consists of three layers which are the batch layer, the speed layer, and the serving layer. The batch layer contains
a batch system such as Hadoop or Spark to support the querying and processing of archived or stored data, while
the speed layer contains a streaming system to provide low-latency processing to incoming input data in real-time.
The input data is fed to both the batch and speed layers in parallel to produce two sets of results that represent
the batch and real-time views of the queries, respectively. The serving layer's role is to combine both results and
produce the finalized results. The Lambda architecture is typically integrated with general messaging systems
such as Kafka to aggregate the queries of users. Kafka is a high performance, open-source messaging system
developed at LinkedIn for the purpose of aggregating high-throughput log files [368]. Although the Lambda
architecture introduces a high level of flexibility by combining the benefits of both real-time and batch systems, it lacks
simplicity due to the need to maintain two separate systems. Moreover, the Lambda architecture encounters some
limitations with event-oriented applications [367].
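The serving layer's merging of the batch and real-time views can be sketched conceptually as follows; the view contents and the page-view query are plain-Python placeholders rather than any particular Lambda implementation.

```python
# Conceptual serving-layer merge for a page-view counting query.
batch_view = {"/home": 10_400, "/pricing": 2_150}   # precomputed by the batch layer
speed_view = {"/home": 37, "/checkout": 5}          # incremental, from the speed layer

def query(page: str) -> int:
    """Merged answer = batch view (complete but stale) + speed view (fresh)."""
    return batch_view.get(page, 0) + speed_view.get(page, 0)

print(query("/home"))      # 10437: combines both layers
print(query("/checkout"))  # 5: seen only since the last batch run
```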

Fig. 6. Components of the Lambda architecture.

III APPLICATIONS-FOCUSED OPTIMIZATION STUDIES


This Section discusses some big data application optimization studies performed at the applications level.
Optimizations at this level include tuning application parameters rather than using the default settings, or
modifying the programming models themselves to obtain enhanced performance. The rest of this Section is
organized as follows: Subsection III-A considers studies focusing on optimizing data and/or jobs and tasks
placements. Subsection III-B summarizes studies that optimize jobs scheduling, while Subsection III-C describes
and evaluates studies focusing on reducing the completion time of jobs. Finally, Subsection III-D provides a
summary of big data benchmarking suites, traces, and simulators. Table I provides a summary of the studies
presented in the first three Sections while focusing on their objectives, targeted platform, optimization tools,
benchmarks used, and experimental setup and/or simulation environments.
A. Optimized Jobs/Data Placements:
The early work in [36] addressed the problem of optimizing data placement in homogeneous Hadoop clusters.
A dynamic algorithm that assigns data fragments according to nodes processing capacities was proposed to reduce
data movement between over-utilized and under-utilized servers. However, the effects of replications were not
considered. Typically, most of the nodes within a Hadoop cluster are kept active to provide high data availability
which is energy inefficient. In [37], the energy consumption of Hadoop clusters was reduced by dynamically
switching off unused nodes while ensuring that replicas of all data sets are contained in a covering subset which
is kept always active. Although the energy consumption was reduced, the jobs running time was increased.
Dynamic sizing and locality-aware scheduling in dedicated MapReduce clusters with three types of workloads
namely batch MapReduce jobs, web applications, and interactive MapReduce jobs, were addressed in [38]. A
balance between performance and energy saving was achieved by allocating the optimum number of servers and
delaying batch workloads while considering the data locality and delay constraints of web and interactive workloads.
A Markov Decision Process (MDP) model was considered to reflect the stochastic nature of job arrivals, and an
event-driven simulator was used to empirically validate the proposed algorithm's optimality. Energy savings
between 30% and 59% were achieved over the no-allocation strategy.
Graphs and iterative applications typically have highly dependent and skewed workloads as some portions of
the datasets are accessed for processing more often than others. The study in [39] proposed a Dependency Aware
Locality for MapReduce (DALM) algorithm to process highly skewed and dependent input data. The algorithm
was tested with Giraph and was found to reduce the cross-server traffic by up to 50%. Several studies addressed
optimizing reduce tasks placement to minimize networking traffic by following greedy approaches. These
approaches degrade the performance as they maximize the intermediate data locality for current reduce tasks
without considering the effects on the input data locality for map tasks in following MapReduce jobs [40]. The
theoretical and experimental study in [40] addressed optimizing reduce tasks data locality for sequential
MapReduce jobs and achieved up to 20% improvement in performance compared to the greedy approaches.
The efficiency of indexing, grouping, and join querying operations can be significantly affected by the
framework's data placement strategy. In [41], a data block allocation approach was proposed to reduce job execution
time and improve query performance in Hadoop clusters. The default random data placement policy of Hadoop
1.0.3 was modified by using k-means clustering techniques to co-allocate related data blocks in the same clusters
while considering the default replication factor. The results indicated a query performance improvement by up to
25%. The study in [42] addressed the poor resource utilization caused by misconfiguration in HBase. An algorithm
named HConfig for semi-automating resource allocation and bulk data loading was proposed, and an
improvement between 2% and 3.7% was achieved compared to default settings.
Several recent studies considered jobs or data placement optimizations in Hadoop 2.x with YARN. The impact
of data locality in YARN with delay scheduler on jobs completion time was addressed in [43]. To measure the
data locality achieved, a YARN Location Simulator (YLocSim) tool was developed, validated, and compared
with experimental results in a private cluster. Moreover, the effects of inherent resources allocation imbalance
which increase with data locality, and the redundant I/O operations due to same data requests by different
containers, were studied. Existing job schedulers ignore the partitioning skew caused by the uneven volumes of
shuffled intermediate data sent to reducers. In [44], a framework that controls the resources allocated to reduce
tasks, named Dynamic REsource Allocation technique for MapReduce with Partitioning Skew (DREAMS), was
developed while considering data replication. Experimental results showed an improvement of up to 2.29 times compared to
native YARN. Inappropriate configurations in YARN lead to degraded performance as the resource usage of tasks
typically varies during execution, while the resources initially assigned to them are fixed. In [45], JellyFish,
a self-tuning system based on YARN, was proposed to reduce the completion time and improve the
utilization by considering elastic containers, performing online reconfigurations, and rescheduling idle
resources. The study considered the most critical subset of YARN parameters and an average improvement of
65% was achieved compared to default YARN for repetitive jobs.
B. Jobs Scheduling:
Default scheduling mechanisms such as FIFO, capacity scheduler, and HFS, can lead to undesirable
performance especially with mixtures of small interactive and long batch jobs [318]. Several studies suggested
enhanced scheduling mechanisms to overcome the inefficiencies of default schedulers in terms of resource
utilization, jobs makespan (i.e. the completion time of last task), or energy efficiency. The conflict between
fairness in scheduling and data locality was addressed in [46]. To maximize data locality, some tasks assigned
according to fairness scheduling were intentionally delayed until a cluster containing the corresponding data is
available. An algorithm named delay scheduling was implemented on HFS and tested on commercial and private
clusters based on Hive workloads from Facebook. It was found that only short delays were required to achieve
almost 100% locality. The response times for small jobs were improved by about five times at the expense of
moderately slowing down larger jobs. Quincy, a slot-based scheduler for Dryad, was implemented as a minimum-
cost flow algorithm in [47] to achieve fairness and data locality for concurrent distributed jobs. The scheduling
problem was encoded as a flow network graph in which the edge costs represent the competing demands of locality
and fairness. Quincy provided better performance than queue-based schedulers and was capable of effectively
reducing the network traffic.
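The core rule of delay scheduling [46] is simple and can be sketched as follows (Python, with hypothetical task and slot structures; this is an illustrative sketch, not the HFS code): when the job at the head of the fairness order has no data-local task for the free slot, it is skipped for a bounded number of scheduling opportunities before a non-local task is finally launched.

def delay_schedule(jobs, free_node, max_skips=3):
    """Pick a task for free_node. jobs is a fairness-ordered list of dicts with
    'tasks' (pending tasks, each carrying preferred 'nodes') and a 'skips' counter.
    Returns (job, task) or None."""
    for job in jobs:
        local = [t for t in job['tasks'] if free_node in t['nodes']]
        if local:
            job['skips'] = 0
            job['tasks'].remove(local[0])
            return job, local[0]
        # No data-local task: wait (skip) a few times before giving up locality.
        if job['skips'] < max_skips:
            job['skips'] += 1
            continue
        if job['tasks']:
            job['skips'] = 0
            return job, job['tasks'].pop(0)
    return None

# Example: two jobs competing for a free slot on node 'n2'.
jobs = [{'tasks': [{'id': 'a1', 'nodes': {'n1'}}], 'skips': 0},
        {'tasks': [{'id': 'b1', 'nodes': {'n2'}}], 'skips': 0}]
print(delay_schedule(jobs, 'n2'))  # the first job is skipped once; the second runs locally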
In [48], a data replication-aware scheduling algorithm was proposed to reduce the network traffic and
speculative executions. The proposed map jobs scheduler was implemented in two waves. Firstly, the empty slots
in the cluster were filled based on the number of hosted maps and the replication schemes. Secondly, a run-time
scheduler was utilized to increase the locality and balance the intermediate data distribution for the shuffling
phase. A map task scheduler that balances data locality and load balancing was proposed in [49]. In this work, the
computing cluster is modelled as a time-slotted system where job arrivals are modelled as Bernoulli random
variables and the service time is modelled as a geometric random variable with different mean values based on
data locality. The scheduler utilized ‘Join the Shortest Queue’ (JSQ) and the maximum weight policies and was
proven to be throughput and delay optimal under heavy traffic conditions.
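A rough sketch of the queueing view taken in [49] is given below (Python, with hypothetical replica placement): each arriving map task joins the shortest queue among the servers that hold a replica of its data, which captures the 'Join the Shortest Queue' flavour of the policy; the weighting by locality-dependent service rates is omitted for brevity.

import random

random.seed(3)
servers = 8
queues = [0] * servers                       # outstanding tasks per server
replicas = {b: random.sample(range(servers), 3) for b in range(100)}  # 3 replicas per block

def dispatch(block):
    """JSQ restricted to the servers that hold a replica of the task's block."""
    candidates = replicas[block]
    target = min(candidates, key=lambda s: queues[s])
    queues[target] += 1
    return target

for _ in range(500):                          # a simple arrival stream
    dispatch(random.randrange(100))
    for s in range(servers):                  # each server serves one task per slot
        queues[s] = max(0, queues[s] - 1)

print("final queue lengths:", queues)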
In [50], a Resource-aware Adaptive Scheduling (RAS) mechanism was proposed to improve resource
utilization and achieve user-defined completion time goals. RAS utilizes offline job profiling information to
enable dynamic adjustments for the number of slots in each machine while meeting the different requirements of
the map and reduce phases. FLEX is a flexible allocation scheduling scheme introduced in [51] to optimize the
response time, makespan, and SLA compliance. FLEX ensures minimum slot guarantees for jobs, as in the fair
scheduler, and introduces maximum slot guarantees. Experimental results showed that FLEX outperformed the fair scheduler
by up to 30% in terms of the response time. The studies in [52], [53] scheduled MapReduce jobs according to
a priori known job size information to optimize the performance of Hadoop clusters. HFSP in [52] focused on
fairness in resource allocation and on reducing the response time among concurrent interactive and batch jobs by
using the concepts of an ageing function and virtual time. LsPS in [53] is a self-tuning two-tier scheduler; the first
tier schedules across multiple users and the second tier schedules the jobs of each user. It aimed to improve the
average response time by adaptively selecting between fair and FIFO schedulers. Prior job information was also
utilized in [54] for offline MapReduce scheduling while taking server assignments into consideration.
The authors in [55] reported that shuffling in MapReduce accounted for about a third of the overall jobs
completion time. A joint scheduling mechanism for the map, shuffle, and reduce phases that considered their
dependencies was implemented via linear programs and heuristics with precedence constraints. The study in [56]
focused on improving the CPU utilization by dynamically scheduling waiting map tasks to nodes with high I/O
waiting time. The execution time of I/O intensive jobs with the proposed scheduler was improved by 23%
compared to FIFO. To reduce the energy consumption of performing MapReduce jobs, the work in [57] proposed
an energy-aware scheduling mechanism while considering the Dynamic Voltage Frequency Scaling (DVFS)
settings for the cluster. DVFS was experimentally utilized in [58] to reduce the energy consumption of
computation-intensive workloads. In [59], two heuristics, namely EMRSA-I and EMRSA-II, were developed to
assign map and reduce tasks to slots with the goal of reducing the energy consumption while satisfying SLAs.
The energy consumption was reduced by about 40% compared to default schedulers at the expense of an increased
job makespan. In [60], a dynamic slot allocation framework, DynamicMR, was proposed to improve the
efficiency of Hadoop. Dynamic Hadoop Slot Allocation (DHSA) algorithm was utilized to relax the strict slot
allocation to either map or reduce tasks by enabling their reallocation to achieve a dynamic map to reduce ratio.
DynamicMR outperformed YARN by 2% - 9%, and considerably reduced the network contention caused by the
high number of over-utilized reduce tasks. Unlike scheduling at the task level, the work in [61] introduced PRISM,
which is a fine-grained resource-aware scheduler. PRISM, which divides tasks into phases, each with its own
resource usage profile, provided a 1.3-times reduction in the running time compared to default schedulers.
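To give a feel for the DVFS-based schemes in [57], [58], and [63], the sketch below (Python) picks, for each task, the lowest CPU frequency that still meets the task's deadline under a commonly assumed cubic relation between dynamic power and frequency; all numbers are illustrative, not taken from those studies.

# Illustrative DVFS selection: work is given in cycles, deadlines in seconds, and
# dynamic power is approximated as p = K * f**3 (a frequently used simplification).
FREQS_GHZ = [1.2, 1.6, 2.0, 2.4]
K = 10.0  # hypothetical power coefficient (W / GHz^3)

def pick_frequency(cycles_g, deadline_s):
    """Return (frequency, energy) for the slowest frequency meeting the deadline."""
    for f in FREQS_GHZ:                      # ascending order: prefer lower frequency
        runtime = cycles_g / f               # 10^9 cycles / GHz -> seconds
        if runtime <= deadline_s:
            return f, K * f**3 * runtime
    f = FREQS_GHZ[-1]                        # deadline cannot be met: run at maximum
    return f, K * f**3 * (cycles_g / f)

tasks = [(30.0, 20.0), (12.0, 6.0), (50.0, 22.0)]   # (10^9 cycles, deadline in s)
for cycles, deadline in tasks:
    f, e = pick_frequency(cycles, deadline)
    print(f"{cycles:.0f} Gcycles, deadline {deadline:.0f}s -> {f} GHz, {e:.0f} J")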
Several recent studies considered optimizing MapReduce scheduling under YARN. HaSTE in [62] utilized
the fine-grained CPU and RAM resources management capabilities of YARN to schedule map and reduce tasks
under dependency constraints without the need for prior knowledge of task execution times. HaSTE improved
the resource utilization and the makespan even for mixed workloads. The study in [63] suggested an SLA-aware
energy-efficient scheduling scheme based on DVFS for Hadoop with YARN. The scheme, which was applied to
the per-application AM, reduced the CPU frequency when tasks finished before their expected
completion time. In [64], a priority-based resource scheduler for a streaming system named Quasit was proposed.
The scheduler dynamically modifies the data paths of lower priority tuples to allow faster processing for higher
priority tuples for vehicular traffic streams.

C. Completion Time:
The reduction of jobs overall completion time was considered in several studies with the aim of improving the
SLA or reducing the power consumption in underlying clusters. Numerical evaluations were utilized in [65] to
test two power efficient resource allocation approaches in a pool of MapReduce clusters. The algorithms
developed aimed to reduce the end-to-end delay or the energy consumption while considering the availability as
an SLA metric. In an effort to provide predictable services to deadline-constrained jobs, the work in [66] suggested
a Resource and Deadline-aware Hadoop scheduler (RDS). RDS is based on online optimization and a self-learning
completion time estimator for future tasks, which makes it suitable for dynamic Hadoop clusters with a mixture of
energy sources and workloads. The work in [67] focused on reducing the completion time of small jobs, which
account for the majority of the jobs in production Hadoop clusters. The proposed scheduler, Fair4S, achieved an
improvement by a factor of 7 compared to the fair scheduler, where 80% of small jobs waited less than 4 seconds to
be served. In [68], a Bipartite Graph MapReduce scheduler (BGMRS) was proposed for deadline-constrained
MapReduce jobs in clusters with heterogeneous nodes and dynamic job execution time. By optimizing scheduling
and resources allocations, BGMRS reduced the deadline miss ratio by 79% and the completion time by 36%
compared to fair scheduler.
The authors in [69] developed an Automatic Resource Inference and Allocation (ARIA) framework based on
jobs profiling to reduce the completion time of MapReduce jobs in shared clusters. A Service Level Objective
(SLO) scheduler was developed to utilize the predicted completion times and determine the schedule of resources
allocation for tasks to meet soft deadlines. The study in [70] proposed a Dynamic Priority Multi-Queue Scheduler
(DPMQS) to reduce the completion time of map tasks in heterogeneous environments. DPMQS increased both
the data locality and the priority of map jobs that are near to completion. Optimizing the scheduling of mixed
MapReduce-like workloads was considered in [71] through offline and online algorithms that determine the order
of tasks that minimize the weighted sum of the completion time. The authors in [72] also emphasized the role of
optimizing jobs ordering and slots configurations in reducing the total completion time for offline jobs and
proposed algorithms that can improve non-optimized Hadoop by up to 80%. The work in [73] considered
optimizing four NoSQL databases (i.e. HBase, Cassandra, Hive, and HadoopDB) by reducing the Waiting Energy
Consumption (WEC) caused by idle nodes waiting for job assignments, I/O operations, or results from other
nodes. RoPE was proposed in [74] to reduce the response time of relational queries performed as cascaded
MapReduce jobs in SCOPE, which is a parallel processing engine used by Microsoft. A profiler for code and data
properties was used to improve future invocations of the same queries. RoPE achieved 2× improvements in the
response time for 95% of Bing's production jobs while using 1.5× less resources.
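The profiling-based completion time estimation used by ARIA-like schedulers [69] can be illustrated with the classical makespan bounds for n independent tasks on k slots, where the lower bound is n*avg/k and the upper bound is (n-1)*avg/k + max. The sketch below (Python) applies these bounds to the map and reduce stages of a hypothetical profiled job; ARIA's full model additionally accounts for shuffle overlap, which is omitted here.

def stage_bounds(num_tasks, num_slots, avg_duration, max_duration):
    """Classical makespan bounds for greedy assignment of independent tasks."""
    lower = num_tasks * avg_duration / num_slots
    upper = (num_tasks - 1) * avg_duration / num_slots + max_duration
    return lower, upper

# Hypothetical profile: 200 map tasks (avg 14s, max 30s) and 40 reduce tasks
# (avg 50s, max 90s), run on 40 map slots and 20 reduce slots.
map_lo, map_up = stage_bounds(200, 40, 14, 30)
red_lo, red_up = stage_bounds(40, 20, 50, 90)
print(f"estimated completion time: {map_lo + red_lo:.0f}s - {map_up + red_up:.0f}s")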
Job failures lead to a huge increase in completion time. To improve the performance of Hadoop under
failures, a modified MapReduce work flow with fine-grained fault-tolerance mechanism called BEneath the Task
Level (BeTL) was proposed in [75]. BeTL allows generating more files during the shuffling to create more
checkpoints. It improved the performance of Hadoop under no failures by 6.6% and under failures by up to 51%.
The work in [76] proposed four multi-queue size-based scheduling policies to reduce jobs slowdown variability
which is defined as the idle time to wait for resources or I/O operations. Several factors such as parameters
sensitivity, load unbalance, heavy-traffic, and fairness were considered. The work in [77] optimized the number
of reduce tasks, their configurations, and memory allocations based on profiling the intermediate results size. The
results indicated the complete elimination of job failures due to insufficient memory and a reduction in the completion
time by up to 88.79% compared to legacy memory allocation approaches. To improve the performance of
MapReduce in memory-constrained systems, Mammoth was proposed in [78] to provide global memory
management. In Mammoth, related map and reduce tasks were launched in a single Java Virtual Machine (JVM)
as threads that share the memory at run time. Mammoth actively pushes intermediate results to reducers unlike
Hadoop that passively pulls from disks. A rule-based heuristic was used to prioritize memory allocations and
revocations among map, shuffle, and reduce operations. Mammoth was found to be 5.19 times faster than Hadoop
1 and to outperform Spark for interactive and iterative jobs when the memory is insufficient [78]. An automatic
skew mitigation approach; SkewTune was proposed and optimized in [79]. SkewTune detects different types of
skew, and effectively re-partitions the unprocessed data of the stragglers to process them in idle nodes. The results
indicated a reduction by a factor of 4 in the completion time for workloads with skew and minimal overhead for
workloads without skew.
TABLE I
SUMMARY OF APPLICATIONS-FOCUSED OPTIMIZATIONS STUDIES
Ref | Objective | Application | Tools | Benchmarks/workloads | Experimental Setup/Simulation environment
[36]* | Optimize data placement in heterogeneous clusters | Hadoop 1.x | Dynamic algorithm | Grep, WordCount | 5 heterogeneous nodes with Intel CPUs (Core 2 Duo, Celeron, Pentium 3)
[37]* | Reduce energy consumption by scaling down unutilized clusters | Hadoop 0.20.0 | Modification to Hadoop and defining a covering subset | Webdata_sort, data_scan from Gridmix benchmark (16-128 GB) | 36 nodes (8 CPU cores, 32GB RAM, Gigabit NIC, two disks), 48-port HP ProCurve 2810-48G switch
[38]* | Balance energy consumption and performance | - | Locality-aware scheduler via Markov Decision Process | Geometric distribution for tasks arrival rate | Event-driven simulations
[39]* | Dependency-Aware Locality for MapReduce (DALM) | Hadoop 1.2.1, Giraph 1.0.0 | Modification to HDFS replication scheme and scheduling algorithm | 3.5 GB graph from Wikimedia database, 2.1 GB of public social networks data | 4 nodes (Intel i7 3.4 GHz Quad, 16GB DDR3 RAM, 1 TB disk, 1 Gbps NIC), NETGEAR 8-port switch
[40]* | Optimizing reduce task locality for sequential MapReduce jobs | Hadoop 1.x | Classical stochastic sequential assignment | Grep (10 GB), Sort (15 GB, 4.3 GB) | 7 slave nodes (4 cores 2.933 GHz CPU, 32KB cache, 6GB RAM, 72GB disk)
[41]* | Optimizing data placements for query operations | Hadoop 1.0.3 with Hive 0.10.0 | k-means clustering for data placement, HDFS extension to support customized data placement | 920 GB business ad-hoc queries from TPC-H | 10 nodes; 1 NameNode (6 2.6 GHz CPU cores, 16GB RAM, 1TB SATA), 9 DataNodes (Intel i5, 4GB RAM, 300GB disk), 1 Gbps Ethernet
[42]* | HConfig: Configuring data loading in HBase clusters | HBase 0.96.2, Hadoop 2.2.0 | Algorithms to semi-automate data loading and resource allocation | YCSB Benchmark | 13 nodes (1 manager, 3 coordinators, 9 workers) and 40 nodes (1 manager, 3 coordinators, 36 workers) with (AMD Opteron CPU, 8GB RAM, 2 SATA 1TB disks), 1 Gigabit Ethernet
[43]* | Impact of data locality in YARN on completion time and I/O operations | Hadoop 2.3.0, Sqoop, Pig, Hive | Modify Rumen and YARN Scheduler Load Simulator (SLS) to report data locality | 8GB of synthetic text for Grep, WordCount, 10GB TPC-H queries | One node (4 CPU cores, 16GB RAM), 16 nodes (2 CPU cores, 8GB RAM), 1 Gigabit Ethernet, YARN Location Simulator (YLocSim)
[44]* | DREAMS: dynamic reduce tasks resources allocation | YARN 2.4.0 | Additional feature in YARN | WC, Inverted Index, k-means, classification, DataJoin, Sort, Histo-movies (5GB, 27GB) | 21 Xen-based virtual machines (4 2GHz CPU cores, 8GB RAM, 80GB disk)
[45]* | JellyFish: Self-tuning system based on YARN | Hadoop 2.x | Parameters tuning, resources rescheduling via elastic containers | PUMA benchmark (TeraSort, WordCount, Grep, Inverted Index) | 4 nodes (dual Intel 6-core Xeon CPU, 15MB L3 cache, 7GB RAM, 320GB disk), Gigabit Ethernet
[46]° | Delay scheduling: balance fairness and data locality | Hadoop 0.20 | Algorithms implemented in HDFS | Facebook traces (text search, simple filtering selection, aggregation, join) | 100 nodes in Amazon EC2 (4 2 GHz cores, 4 disks, 15GB RAM, 1 Gbps links), 100 nodes (8 CPU cores, 4 disks), 1 Gbps Ethernet
[47]° | Quincy: Fair scheduling with locality and fairness | Dryad | Graph-based algorithms | Sort (40, 160, 320) GB, Join (11.8, 41.8) GB, PageRank (240) GB, WordCount (0.1-5) GB | 243 nodes in 8 racks (16GB RAM, 2 2.6 GHz dual core AMD), 48-port Gigabit Ethernet per rack
[48]° | Maestro: Replica-aware scheduling | Hadoop 0.19.0, Hadoop 0.21.0 | Heuristic | GridMix (Sort, WordCount) (2.5, 1, 12.5, 200) GB | 20 virtual nodes in a local cluster, 100 Grid5000 nodes (2 GHz dual core AMD, 2GB RAM, 80GB disk)
[49]° | Map task scheduling to balance data locality and load balancing | - | Join the Shortest Queue, Max Weight scheduler in heavy-traffic queues | Parameters from search jobs in databases | Simulations for a 400-machine cluster
[50]° | Resource-aware Adaptive Scheduling (RAS) | Hadoop 0.23 | Scheduler and job profiler | Gridmix Benchmark (Sort, Combine, Select) | 22 nodes (64-bit 2.8 GHz Intel Xeon, 2GB RAM), Gigabit Ethernet
[51]° | FLEX: flexible allocation scheduling scheme | Hadoop 0.20.0 | Standalone plug-in or add-on module in FAIR | 1.8 TB synthesized data, and GridMix2 | 26 nodes (3 GHz Intel Xeon), 13 4-core blades in one rack and 13 8-core blades in a second rack
[52]° | HFSP: size-based scheduling | Pig, Hadoop 1.x | Job size estimator and algorithm | PigMix (1, 10, 100 GB and 1TB) | 20 workers with TaskTracker (4 CPU cores, 8GB RAM)
[53]° | LsPS: self-tuning two-tier scheduler | Hadoop 1.x | Plug-in scheduler | WordCount, Grep, PiEstimator, Sort | EC2 m1.large (7.5GB RAM, 850GB disk), 11 nodes (1 master, 10 slaves), trace-driven simulations
[54]° | Joint scheduling of MapReduce in servers | Hadoop 1.2.0 | 3-approximation algorithm, heuristic | WordCount (43.7 GB Wikipedia document package) | 16 EC2 VM nodes (1 GHz CPU, 1.7GB memory, 160GB disk)
[55]° | Joint scheduling of processing and shuffle | - | Linear programming, heuristics | Synthesized workloads | Event-based simulations
[56]° | Dynamic slot scheduling for I/O intensive jobs | Hadoop 1.0.3 | Dynamic algorithm based on I/O and CPU statistics | Sort (3, 6, 9, 12, 15) GB | 2 masters (12 CPU cores 1.9 GHz AMD, 32GB RAM), 4, 8, 16 slaves (Intel Core i5 1.9 GHz, 8GB RAM)
[57]° | Energy-efficient scheduling | - | Polynomial time constant-factor approximation algorithm | Synthesized workloads | MATLAB simulations
[58]° | Energy efficiency for computation-intensive workloads | Hadoop 0.20.0 | DVFS control in source code and via external scheduler | Sort, CloudBurst, Matrix Multiplication | 8 nodes (AMD Opteron quad-core 2380 with DVFS support, 2 64 kB L1 caches), Gigabit Ethernet
[59]° | Energy-aware scheduling | Hadoop 0.19.1 | Scheduling algorithms EMRSA-I, EMRSA-II | TeraSort, Page Rank, k-means | 2 nodes (24GB RAM, 16 2.4 GHz CPU cores, 1TB disk), 2 nodes (16GB RAM, 16 2.4 GHz CPU cores, 1TB disk)
[60]° | DynamicMR: slot allocation in shared clusters | Hadoop 1.2.1 | Algorithms: PI-DHSA, PD-DHSA, SEPB, slot pre-scheduling | PUMA benchmark | 10 nodes (Intel X5675, 3.07 GHz, 24GB RAM, 56GB disk)
[61]° | PRISM: Fine-grained resource aware scheduling | Hadoop 0.20.2 | Phase-level scheduling algorithm | Gridmix 2, PUMA benchmark | 10 nodes; 1 master, 15 slaves (Quad-core Xeon E5606, 8GB RAM, 100GB disk), Gigabit Ethernet
[62]° | HaSTE: scheduling in YARN to reduce makespan | Hadoop YARN 2.2.0 | Dynamic programming algorithm | WordCount (14, 10.5) GB, TeraSort 15GB, WordMean 10.5GB, PiEstimate | 8 nodes (8 CPU cores, 8GB RAM)
[63]° | SLA-aware energy-efficient scheduling in Hadoop YARN | - | DVFS in the per-application master | Sort, Matrix Multiplications | CloudSim-based simulations for 30 servers with network speeds between 100-200 Mbps
[64]° | Priority-based resource scheduling for Distributed Stream Processing Systems | Quasit | Meta-scheduler | 40 million tuples of realistic 3-hour vehicular traffic traces | 5 nodes; 1 master, 4 workers (AMD Athlon64 3800+, 2GB RAM)
[65]† | Power-efficient resources allocation and mean end-to-end delay minimization | - | Algorithms | - | Arena simulator
[66]† | Resource and Deadline-aware scheduling in dynamic Hadoop clusters | Hadoop 1.2.1 | Receding horizon control algorithm, self-learning completion time estimator | PUMA benchmark (WordCount, TeraSort, Grep) | 21 virtual machines, 1 master and 20 slaves (1 CPU core, 2GB RAM)
[67]† | Fair4S: Modified Fair Scheduler | Hadoop 0.19 | Scheduler and modification to JobTracker | Synthesized workloads generated by Ankus | Discrete event-based MapReduce simulator
[68]† | Deadline-constrained MapReduce scheduling | Hadoop 1.2.1 | Bipartite graph modelling to perform the BGMRS scheduler | WordCount, Sort, Grep | 20 virtual machines in 4 physical machines (Quad core 3.3 GHz, 32 GB RAM, 1TB disk), Gigabit Ethernet; MATLAB simulations for 3500 nodes
[69]† | ARIA (Automatic Resource Inference and Allocation) | Hadoop 0.20.2 | Service Level Objective scheduler to meet soft deadlines | WordCount, Sort, Bayesian classification, TF-IDF from Mahout, WikiTrends, twitter workload | 66 nodes (4 AMD CPU cores, 8GB RAM, 2 160GB disks) in two racks, Gigabit Ethernet
[70]† | Scheduling for improved response time | Hadoop 1.2.4 | Dynamic priority multi-queue scheduler as a plug-in | Text search, Sort, WordCount, Page Rank | Cluster-1: 6 nodes (dual core CPU 3.2 GHz, 2GB RAM, 250GB disk), Cluster-2: 10 nodes (dual core CPU 3.2 GHz, 2GB RAM, 250GB disk), Gigabit Ethernet
[71]† | Scheduling for fast completion in MapReduce-like systems | - | 3-approximation algorithms: OFFA, online heuristic ONA | Synthesized workload | Simulations
[72]† | Dynamic job ordering and slot configuration | Hadoop 1.x | Greedy algorithms based on Johnson's rule for 2-stage flow shop | PUMA Benchmark (WordCount, Sort, Grep), synthesized Facebook traces | 20 EC2 Extra Large instances (4 CPU cores, 15GB RAM, 4 420GB disks)
[73]† | Reduction of idle energy consumption due to waiting in NoSQL | HDFS 0.20.2, HBase 0.90.3, Hive 0.71, Cassandra 1.0.3, HadoopDB 0.1.1 | Energy consumption model | Loading, Grep, Selection, Aggregation, Join | 12 nodes (Intel i5-2300, 8GB RAM, 1TB disk)
[74]† | RoPE: reduce response time of relational querying | SCOPE | Query optimizer that piggybacks job execution | 80 jobs from major business groups | Bing's production cluster (tens of thousands of 64-bit, multi-core, commodity servers)
[75]† | BEneath the Task Level (BeTL) fine-grained checkpoint strategy | Hadoop 2.2.0 | Algorithms to modify the MapReduce workflow | HiBench Benchmark (WordCount, Hive MapReduce queries) | 16 nodes in Windows Azure (1 CPU core, 1.75GB RAM, 60GB disk), Gigabit Ethernet
[76]† | Job slowdown variability reduction | Hadoop 1.0.0 | Four algorithms: FBQ, TAGS, SITA, COMP | SWIM traces, Grep, Sort, WordCount | 6 nodes (24 dual-quad core, 24GB RAM, 50TB storage), 1 Gigabit Ethernet and 20 Gbit/s InfiniBand, Mumak simulations
[77]† | Memory-aware configurations of reduce tasks | Hadoop 1.2.1 | Mnemonic mechanism to determine the number of reduce tasks | PUMA benchmarks (InvertedIndex, Ranked InvertedIndex, Self Join, Sequence Count, WordCount) | 1 node (4 CPU cores, 7GB RAM)
[78]† | SkewTune: mitigating skew in MapReduce applications | Hadoop 0.21.1 | Modification to job tracker and task tracker | Inverted Index, PageRank, CloudBurst | 20-node cluster (2 GHz quad-core CPU, 16GB RAM, 2 750GB disks)
[79]† | Mammoth: global memory management | Hadoop 1.0.1 | Memory management via the public pool | WordCount, Sort, WordCount with Combiner | 17 nodes (2 8-core 2.6 GHz Intel Xeon E5-2670, 32GB memory, 300GB SAS disk)
*Jobs/data placement, °Jobs Scheduling, †Completion time.

D. Benchmarking Suites, Production Traces, Modelling, Profiling Techniques, and Simulators for Big Data
Applications:
1) Benchmarking Suites:
Understanding the complex characteristics of big data workloads is an essential step toward optimizing the
configuration of the framework parameters used and identifying the sources of bottlenecks in the underlying
clusters. As with many legacy applications, MapReduce and other big data frameworks are supported by several
standard benchmarking suites such as [80]-[88]. These benchmarks have been widely utilized to evaluate the
performance of big data applications in different infrastructures either experimentally, analytically or via
simulations as summarized for the optimization studies in Tables I, III, V, and VI. Moreover, they can be used in
production environments for initial tuning and debugging purposes, in addition to stress-testing and bottlenecks
analysis before the actual run of intended commercial services.
The workloads contained in these benchmarks are typically described by their semantics and run on previously
collected or randomly generated datasets. Examples are text retrieval-based workloads (e.g. Word-Count (WC), Word-
Count with Combiner (WCC), and Sort) and web search-based workloads (e.g. Grep, Inverted Index, and PageRank).
WC and WCC count the occurrences of each word in large distributed documents. They differ in that the reduction
is performed entirely at the reduce stage in WC, while it is performed partially at the map stage with the aid of a
combiner in WCC. Sort generates alphabetically sorted output from input documents.
Grep finds the match of regular expressions (regex) in input files, while Inverted Index generates a word-to-
document indexing for a list of documents. PageRank is a link analysis algorithm that measures the popularity of
web pages based on their referral by other websites. In addition to the above examples, computations related to
graphs and to machine learning such as k-means clustering are also considered in benchmarking big data
applications. These workloads vary in being I/O, memory, or CPU intensive. For example, PageRank, Grep, and
Sort are I/O intensive, while WC, PageRank, and k-means are CPU intensive [8]. This should be considered in
optimization studies to correctly address bottlenecks and targeted resources. For a detailed review of
benchmarking different specialized big data systems, the reader is referred to the survey in [317], where workloads
generation techniques, input data generation, and assessment metrics are extensively reviewed. Here, we briefly
describe some examples of the widely used big data benchmarks and their workloads as summarized in Table II
and listed below:
1) GridMix [80]: GridMix is the standard benchmark included within the Hadoop distribution. GridMix
has three versions and provides a mix of workloads synthesized from traces generated by Rumen [369]
from Hadoop clusters.
2) HiBench1 [81]: HiBench is a comprehensive benchmark suite provided by Intel for big data. HiBench
provides a wide range of workloads to evaluate computations speed, systems throughput, and resources
utilization.
3) HcBench [82]: HcBench is a Hadoop benchmark provided by Intel that includes workloads with a mix
of CPU-, storage-, and network-intensive jobs with Gamma-distributed inter-job arrival times for realistic
cluster evaluations.
4) PUMA [83]: PUMA is a MapReduce benchmark developed at Purdue University that contains workloads
with different computational and networking demands.
5) Hive Benchmark [84]: The Hive benchmark contains 4 queries and targets comparing Hadoop with Pig.
6) PigMix [85]: PigMix is a benchmark that contains 12 queries types to test the latency and scalability
performance of Apache Pig and MapReduce.
7) BigBench [86]: BigBench is an industrial benchmark that provides queries over structured, semi-
structured, and unstructured data.
8) TPC-H [87]: The TPC-H benchmark, provided by the Transaction Processing Performance Council
(TPC), allows generating realistic datasets and performing several business oriented ad-hoc queries.
Thus, it can be used to evaluate NoSQL and RDBMS scalability, processing power, and throughput.
9) YCSB2 [88]: The Yahoo Cloud Serving Benchmark (YCSB) targets testing the inserting, reading,
updating and scanning operations in database-like systems. YCSB contains 20 records data sets and
provides a tool to create the workloads.
TABLE II
EXAMPLES OF BENCHMARKING SUITES FOR BIG DATA APPLICATIONS AND THEIR WORKLOADS.
Benchmark | Application | Workloads
GridMix [80] | Hadoop | Synthetic loadjob, synthetic sleepjob
HiBench [81] | Hadoop, SQL, Kafka, Spark Streaming | Micro benchmarks (Sort, WordCount, TeraSort, Sleep, enhanced DFSIO to test HDFS throughput), machine learning (Bayesian Classification, k-means, Logistic Regression, Alternating Least Squares, Gradient Boosting Trees, Linear Regression, Latent Dirichlet Allocation, Principal Components Analysis, Random Forest, Support Vector Machine, Singular Value Decomposition), SQL (Scan, Join, Aggregate), websearch benchmarks (PageRank, Nutch indexing), graph benchmark (NWeight), streaming benchmarks (Identity, Repartition, Stateful WC, Fixwindow)
HcBench [82] | Hadoop, Hive, Mahout | Telco-CDR (Call Data Records) interactive queries, Hive workloads (PageRank URLs, aggregates by source, average of PageRank), k-means clustering iterative jobs in machine learning, and TeraSort
PUMA [83] | Hadoop | Micro benchmarks (Grep, WordCount, TeraSort), term vector, inverted index, self-join, adjacency-list, k-means, classification, histogram, histogram ratings, sequence count, ranked inverted index
Hive [84] | Hive | Grep selection, ranking selection, user visits aggregation, user visits join
PigMix [85] | Pig | Explode, fr join, join, distinct agg, anti-join, large group by key, nested split, group all, order by 1 field, order by multiple fields, distinct + union, multi-store
BigBench [86] | RDBMS, NoSQL | Business queries (cross-selling, customer micro-segmentation, sentiment analysis, enhancing multi-channel customer experience, assortment and pricing optimization, performance transparency, return analysis, inventory management, price comparison)
TPC-H [87] | RDBMS, NoSQL | Ad-hoc queries for New Customer Web Service, Change Payment Method, Create Order, Shipping, Stock Management Process, Order Status, New Products Web Service Interaction, Product Detail, Change Item
YCSB [88] | Cassandra, HBase, Yahoo's PNUTS | Update heavy, read heavy, read only, read latest, short ranges
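As a concrete reference for the WC workload that recurs across these suites, a minimal word count in the Hadoop Streaming style is sketched below in Python; it is a self-contained illustration rather than any benchmark's actual implementation, and the WCC variant would simply reuse the reducer as a combiner on the map side.

import sys
from itertools import groupby

def mapper(lines):
    """Emit (word, 1) pairs for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum counts per word; pairs must be sorted by word (as Hadoop guarantees)."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    mapped = sorted(mapper(sys.stdin))      # the shuffle/sort stage, in one process
    for word, count in reducer(mapped):
        print(f"{word}\t{count}")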

2) Production Traces:


1 Available at: https://github.com/intel-hadoop/hibench
2 Available at: https://github.com/brianfrankcooper/YCSB
As the above-mentioned benchmarks are defined by their semantics, where the code functionalities are known
and the jobs are submitted by a single user to run on deterministic datasets, they might be incapable of fully
representing production environment workloads, where realistic mixtures of workloads with different data sizes
and inter-arrival times from multiple users coexist. Information about such realistic workloads can be provided in the
form of traces collected from previously submitted jobs in production clusters. Although sharing such information
for production environments is hindered by confidentiality, legal and business restrictions, several companies
have provided archived jobs traces while normalizing the resources usage. Examples of realistic evaluations based
on publicly available production traces were conducted by Yahoo [89], Facebook [90], [91], Cloudera and
Facebook [92], Google [93], [94], IBM [95], and in clusters with scientific [96], or business-critical workloads
[97]. These traces describe various job features such as number of tasks, data characteristics (i.e. input, shuffle
and output data sizes and their ratios), completion time, in addition to jobs inter arrival times and resources usage
without revealing information about the semantics or users' sensitive data. Then, synthesized workloads can be
generated by running dummy code that operates on randomly generated data and artificially produces shuffle and
output data sizes that match the trace. To accurately capture the characteristics in the trace while running in testing clusters
with a scale smaller than production clusters, effective sampling should be performed.
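A simple way to perform such sampling, assuming a trace represented as a list of per-job records with hypothetical field names, is sketched below in Python: jobs are sampled per input-size bin so the scaled-down workload roughly preserves the job mix of the original trace.

import random

def sample_trace(jobs, fraction, seed=0):
    """Stratified down-sampling of a job trace.

    jobs: list of dicts with 'input_gb', 'shuffle_gb', 'output_gb', 'interarrival_s'
    (illustrative field names). Jobs are binned by input size so small, medium and
    large jobs keep roughly their original proportions in the scaled-down workload.
    """
    rng = random.Random(seed)
    bins = {'small': [], 'medium': [], 'large': []}
    for job in jobs:
        size = job['input_gb']
        key = 'small' if size < 1 else 'medium' if size < 100 else 'large'
        bins[key].append(job)
    sampled = []
    for members in bins.values():
        k = max(1, int(len(members) * fraction)) if members else 0
        sampled.extend(rng.sample(members, k))
    return sampled

# Example with a synthetic 1000-job trace scaled down to 5%.
rng = random.Random(1)
trace = [{'input_gb': rng.lognormvariate(0, 3), 'shuffle_gb': 0.5, 'output_gb': 0.1,
          'interarrival_s': rng.expovariate(1 / 30)} for _ in range(1000)]
print(len(sample_trace(trace, 0.05)), "jobs in the scaled-down workload")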
Traces based on ten month log files for data intensive MapReduce jobs running in the M45 supercomputing
production Hadoop cluster for Yahoo were characterized in [89] according to utilization, job patterns, and sources
of failures. Facebook traces collected from a 600 nodes for 6 months and Yahoo traces were utilized in [90] to
provide comparisons and insights for MapReduce jobs in production clusters. Both traces were classified by k-
means clustering according to the number of jobs, input, shuffle, and output data sizes, map and reduce tasks, and
job durations into 10 and 8 bins for the Facebook and Yahoo traces, respectively. Both traces indicated input data
sizes ranging between kBytes and TBytes. The workloads are labeled as small jobs which constitute most of the
jobs, load jobs with only map tasks, in addition to expand, aggregate, and transformation jobs based on input,
shuffle, and output data sizes. A framework that properly samples the traces to synthesize representative and
scaled-down workloads for use in smaller clusters was proposed, and sleep requests in Hadoop were utilized to
emulate job inter-arrivals. Facebook traces from 3000 machines with a total data size of 12 TBytes for
MapReduce Workloads with Significant Interactive Analysis were utilized in [91] to evaluate the energy
efficiency of allocating more jobs to idle nodes in interactive job clusters. Six Cloudera traces from e-commerce,
telecommunications, media, and retail users’ workloads and Facebook traces collected over a year for 2 Million
jobs were analyzed in [92]. Several insights for production jobs such as the weekly time series, and the burstiness
of submissions were provided. These traces are available in a public repository which also contains a synthesizing
tool, SWIM3, which is integrated with Hadoop. Although the previously-mentioned traces contain rich information
about various job characteristics, the lack of per-job resource utilization information makes them only partially
representative for estimating workload resource demands.
Google workloads traces were characterized in [93] and [94] using k-means based on their duration and CPU
and memory resources usage per task to aid in capacity planning, forecasting demands growth, and for improving
tasks scheduling. Insights such as “most tasks are short”, “duration of long and short tasks follows a Bimodal
distribution”, and that “most of the resources are taken by few extensive jobs” were provided. The traces4 were
collected from 12k machines cluster in 2011 and include information about scheduling requests, taken actions,
tasks submission times and normalized resources usage, and machines availability [94]. However, disk and
networking resources were not covered. The traces collected from IBM-based private clouds for banking,
communication, e-business, production, and telecommunication industries in [95] further considered disk and file
system usage in addition to CPU and memory. The inter-dependencies between the resources were measured and
disk and memory resources were found to be negatively correlated indicating the potential benefit of co-locating
memory intensive and disk intensive tasks.
A user-centric study was conducted in [96] based on traces5 collected from three research clusters;
OPENCLOUD, M45, and WEB MINING. Workloads, configurations, in addition to resources usage and sharing
information were used to address the gaps between data scientists needs and systems design. Evaluations for
preferred applications and the penalties of using default parameters were provided. Two large-scale and long term

3 Available at: https://github.com/SWIMprojectucb/swim/wiki
4 Available at: https://github.com/google/cluster-data
5 Available at: http://www.pdl.cmu.edu/HLA/
traces6 collected from distributed data centers for business-critical workloads such as financial simulators were
utilized in [97] to provide basic statistics, correlations, and time-pattern analysis. Full characteristics for the
provisioned and actual usage of CPU, memory, disk I/O, and network I/O throughput resources were presented.
However, no information about the inter-arrival times was provided. The Ankus MapReduce workload synthesizer
was developed as part of the study in [67] based on e-commerce traces collected from Taobao, which operates a 2k-node
production Hadoop cluster. The job arrivals in the traces were found to follow a Poisson process.
3) Modelling and Profiling Techniques:
Statistical-based characterization and modelling were considered for big data workloads based on production
clusters traces as in [98] or benchmarks as in [99], and [101]. Also, different profiling and workloads modelling
studies such as in [102]-[105] were conducted with the aim of automating clusters configurations or estimating
different performance metrics such as the completion time based on the selected configurations and resources
availability. A statistical-driven workloads generator was developed in [98] to evaluate the energy efficiency of
MapReduce clusters under different scales, configurations, and scheduling policies. A framework to ease
publishing production traces anonymously based on inter-arrival times, jobs mixes, resources usage, idle time,
and data sizes was also proposed. The framework provide non-parametric statistics such as the averages and
standard deviations and 5 numbers percentile summaries (1st, 25th , 50 th, 75 th, and 99 th) for inter-arrival times and
data sizes. Statistical modelling for GridMix, Hive, and HiBench workloads based on principal component
analysis for 45 metrics and regression models was performed in [99] to provide performance estimations for
Hadoop clusters under different workloads and configurations. BigDataBench7 was utilized in [100] to examine
the performance of 11 representative big data workloads in modern superscale out-of-order processors.
Correlation analysis was performed to identify the key factors that affect the Cycles Per Instruction (CPI) count
for each workload. Keddah8 in [101] was proposed as a toolchain to profile Hadoop traffic based on end-host or
switch traffic measurements, and to empirically characterize and reproduce it. Flow-level traffic models can be derived
by Keddah for use with network simulators with varied settings that affect networking requirements such as
replication factor, cluster size, split size, and number of reducers. Results based on TeraSort, PageRank and k-
means workloads in dedicated and cloud-based clusters indicated high correlation with Keddah-based simulation
results.
To automate Hadoop cluster configuration, the authors in [102], [103] developed a cost-based optimizer
that utilizes an online profiler and a What-if Engine. The profiler uses the BTrace Java-based dynamic
instrumentation tool to collect job profiles at run time for the data flows (i.e. input data and shuffle volumes) and
to calculate the cost based on the program, input data, resources, and configuration parameters at tasks granularity.
The What-if Engine contains a cost-based optimizer that utilizes a task scheduler simulator and model-based
optimization to estimate the costs if different combinations of cost variables are used. This suits just-in-time
configuration of computations that run periodically in production clusters, where the profiler can trial the
execution on a small subset of the cluster to obtain the cost function, and the What-if Engine then obtains the best
configurations to be used for the rest of the workload. In [104], a Hadoop performance model is introduced. It
estimates completion time and resource usage based on Locally Weighted Linear Regression and Lagrange
Multipliers, respectively. The non-overlapped and overlapped phases of shuffling and the number of reduce waves
were carefully considered. Polynomial regression is used in [105] to estimate the CPU usage, in clock cycles,
of MapReduce jobs based on the number of map and reduce tasks.
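The regression-based estimation in [105] can be mimicked with a few lines of Python (NumPy assumed; the coefficients and training points below are synthetic): a second-order polynomial in the two task counts is fitted by least squares to observed CPU usage and then used to predict unseen configurations.

import numpy as np

# Synthetic training data: CPU usage (in 10^9 clock cycles) observed for jobs with
# different numbers of map and reduce tasks; a real profiler would measure these.
rng = np.random.default_rng(0)
maps = rng.integers(10, 400, size=40)
reds = rng.integers(1, 40, size=40)
cycles = 0.8 * maps + 5.0 * reds + 0.002 * maps * reds + rng.normal(0, 2, size=40)

def design(m, r):
    """Second-order polynomial features in the two task counts."""
    m, r = np.asarray(m, float), np.asarray(r, float)
    return np.column_stack([np.ones_like(m), m, r, m * m, r * r, m * r])

coeffs, *_ = np.linalg.lstsq(design(maps, reds), cycles, rcond=None)
predicted = design([250], [20]) @ coeffs
print(f"predicted CPU usage for 250 maps / 20 reduces: {predicted[0]:.1f} Gcycles")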
4) Simulators:
Tuning big data applications in clusters requires time-consuming and error-prone evaluations for a wide range of
configurations and parameters. Moreover, the availability of a dedicated cluster or an experimental setup is not
always guaranteed due to their high deployment costs. To ease tuning the parameters and to study the behaviour
of big data applications in different environments, several simulation tools were proposed [106]-[119], and [148].
These simulators differ in their engines, scalability, flexibility with parameters, level of details, and the support
for additional features such as multi disks and data skew. Mumak [106] is an Apache Hadoop simulator included
in its distribution to simulate the behaviour of large Hadoop clusters by replaying previously generated traces. A
built-in tool; Rumen [369] is included in Hadoop to generate these traces by extracting previous jobs information

6 Available at: http://gwa.ewi.tudelft.nl/datasets/Bitbrains
7 Available at: http://prof.ict.ac.cn/BigDataBench
8 Available at: https://github.com/deng113jie/keddah
from their log files. Rumen collects more than 40 properties of the tasks. In addition, it provides the topology
information to Mumak. However, Mumak simplifies the simulations by assuming that the reduce phase starts only
after the map phase finishes, thus it does not provide accurate modelling for shuffling and provides rough
completion time estimation. SimMR is proposed in [107] as a MapReduce simulator that focuses on modelling
different resource allocation and scheduling approaches. SimMR is capable of replaying real workloads traces as
well as synthetic traces based on the statistical properties of the workloads. It relies on a discrete event-based
simulator engine that accurately emulates Job Tracker decisions in Hadoop for map/reduce slot allocation, and a
pluggable scheduling policy engine that simulates decisions based on the available resources. SimMR was tested
on a 66-node cluster and was found to be more accurate and two orders of magnitude faster than Mumak. It
simplifies the node modelling by assuming several cores but only one disk. MRSim9 is a discrete event-based
MapReduce simulator that relies on SimJava, and GridSim to test workloads behaviour in terms of completion
time and utilization [108]. The user provides cluster topology information and jobs specifications such as the
number of map and reduce tasks, the data layout (i.e locations and replication factor), and algorithms description
(i.e. number of CPU instructions per record and average record size) for the simulations. MRSim focuses on
modelling multi-cores, single disk, network traffic, in addition to memory, buffers, merge, parallel copy, and sort
parameters. The results of MRSim were validated on a single rack cluster with four nodes.
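The essence of such trace-driven simulators can be conveyed with a compact sketch (Python, with made-up task durations): map and reduce task durations taken from a trace are replayed on a fixed number of slots using a simple event queue, giving a rough job completion time without running the framework, and mirroring the common simplification of starting reduces only after all maps finish.

import heapq

def replay(task_durations, num_slots):
    """Greedy replay of independent tasks on num_slots identical slots.
    Returns the simulated makespan of the task set."""
    slots = [0.0] * num_slots            # next free time of each slot
    heapq.heapify(slots)
    finish = 0.0
    for duration in task_durations:
        start = heapq.heappop(slots)     # earliest available slot
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish

# Trace-style input: per-task durations in seconds (e.g. as extracted by Rumen).
map_durations = [12, 15, 9, 20, 11, 14, 13, 18, 10, 16]
reduce_durations = [40, 35, 50]

# Simplification used by several simulators: reduces start after the last map.
makespan = replay(map_durations, num_slots=4) + replay(reduce_durations, num_slots=2)
print(f"simulated job completion time: {makespan:.0f}s")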
The authors in [109] proposed MR-cloudsim as a simulation tool for MapReduce in cloud computing data
centers. MR-cloudsim is based on the widely used open-source cloud systems event-driven simulator; CloudSim
[110]. In MR-cloudsim, several simplifications such as assigning one reduce per map, and allowing the reduce
phase to start only after the map phase finishes are assumed. To assist with MapReduce clusters design and testing,
MRPerf10 is proposed in [111] to provide fine-grained simulations while focusing on modelling the activities
inside the nodes, the disk and data parameters, and the inter and intra rack networks configurations. MRPerf is
based on Network Simulator-2 (ns-2)11 which is a packet-level simulator, and DiskSim12 which is an advanced
disk simulator. A MapReduce heuristic is used to model Hadoop behaviour and perform scheduling. In MRPerf,
the user provides the topology information, the data layout, and the job specifications, and obtains a detailed
phase-level trace that contains information about the completion time and the volume of transferred data.
However, MRPerf needs several minutes per evaluation and has the limitation of modelling only one replica and
a single disk per node. Also it does not enable speculative execution modeling and simplifies I/O and computations
processes by not overlapping them. MRemu13 in [112] is an emulation-based framework based on Mininet 2.0
for MapReduce performance evaluations in terms of completion time in different data center networks. The user
can emulate arbitrary network topologies and assign bandwidth, packet loss ratio, and latency parameters.
Moreover, network control based on SDN can be emulated. A layered simulation architecture; CSMethod is
proposed in [113] to map the interactions between different software and hardware entities at the cluster level
with big data applications. Detailed models for JVMs, the name nodes, data nodes, JT, TT, and scheduling were
developed. Furthermore, the diverse hardware choices such components, specifications, and topologies were
considered. Similar approaches were also used in [114], and [115] to simulate NoSQL and Hive applications,
respectively. As part of their study, the authors in [136] developed a network flow level discrete-event simulator
named PurSim to aid with simulating MapReduce executions in up to 200 virtual machines.
A few recent articles considered the modelling and simulation of YARN environments [116]-[119]. The YARN
Scheduler Load Simulator (SLS) is included in the Hadoop distribution to be used with Rumen to evaluate the
performance of YARN's different scheduling algorithms with different workloads [116]. SLS utilizes a single JVM
to exercise a real RM with thread-based simulators for NM and AM. However, SLS ignores simulating the
network effects as the NM and AM simulators interact with the RM only via heartbeat events. The authors in
[117] suggested extending SLS with an SDN-based network emulator, MaxiNet, and a data center traffic
generator, DCT2, to add realistic modelling of the network and traffic in YARN environments and to emulate the
interactions between job scheduling and flow scheduling. The Real-Time ABS language is utilized in [118] to develop
the ABS-YARN simulator, which focuses on prototyping YARN and modelling job executions. YARNsim in [119] is a

9 Available at: http://code.google.com/p/mrsim
10 Available at: https://github.com/guanying/mrperf
11 Available at: http://www.isi.edu/nsnam/ns
12 Available at: http://www.pdl.cmu.edu/DiskSim/
13 Available at: https://github.com/mvneves/mremu
parallel discrete-event simulator for YARN that provides simulations with comprehensive protocol-level accuracy
for task executions and data flow. Detailed modelling of networking, HDFS, data skew, and I/O read and write
latencies is provided. However, YARNsim simplifies scheduling policies and fault tolerance. A comprehensive
set of Hadoop benchmarks in addition to bioinformatics clustering applications were utilized for experimental
validation and an average error of 10% was achieved.

IV. CLOUD COMPUTING SERVICES, DEPLOYMENT MODELS, AND RELATED TECHNOLOGIES:


Cloud computing aims to enable seamless access for multiple users or tenants to a pool of computational, storage,
and networking resources that typically reside within and between several geographically distributed data centers.
Unlike traditional IT services that are limited by localized resources inaccessible by remote computing units, cloud
computing services allow dynamic outsourcing to software and/or hardware resources. Hence, they can provide
scalable and large-scale computational solutions while increasing resource utilization. Moreover, cloud
computing considerably reduces both the capital expenditures (CAPEX) and the operational expenses (OPEX) of
software and hardware, and increases resilience. Consequently, it continues to encourage wide deployments
of large-scale Internet-based services by various organizations and enterprises [370].
Although MapReduce and many other big data frameworks were originally designed for use in local clusters
under controlled environments, there is an increasing number of cloud computing-based realizations of big data
applications and services despite the incurred overheads and challenges. This Section provides a brief overview of
cloud computing concepts and discusses the challenges and implications of deploying big data applications in
cloud computing environments. Moreover, it reviews some related emerging technologies that support computing
and networking resource provisioning in cloud computing infrastructures and can hence improve the performance
of big data applications. These related technologies aim mainly to virtualize and softwarize networking and
computing systems at different levels to provide agile and flexible future-proof solutions. The rest of this Section
is organized as follows: Subsection IV-A reviews cloud computing services and deployment models while
Subsection IV-B reviews machine and network virtualization, Network Function Virtualization (NFV),
containers, and Software-Defined Networking (SDN) technologies. Subsection IV-C discusses the requirements
and the challenges of deploying big data applications in cloud computing infrastructures, while Subsection IV-D
illustrates the options for deploying big data application in geo-distributed clouds. Finally, Subsection IV-E
presents the implications of big data on cloud networking infrastructures.
A. Cloud Computing Services and Deployment Models:
The ability to share hardware, software, development platform, network, or application resources among
multiple users enables cloud computing systems to provide what can be described as anything-as-a-service (XaaS).
Cloud computing services can be categorized according to the outsourced resources and end-users’ privileges into
Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS) [371]. SaaS
provides on-demand Internet-based services and applications to end-users without providing the privilege of
controlling or accessing the hardware, network, operating system, or the development platforms resources.
Examples of SaaS are Salesforce, Google Apps, and Microsoft Office 365. In the PaaS model, end-users have
access privilege to the platform which enables them to develop, control, and upgrade their own cloud applications,
but not to the underlying hardware. Thus, more flexibility is provided without the need for owning, operating and
maintaining the hardware. Microsoft Azure, AWS Elastic Beanstalk, and Google App Engine are examples of
PaaS provided as a pay-as-you-go service. The IaaS model provides end-users with extra provisioning privileges
that allow them to fully control the hardware, the operating systems, and the applications development platforms
[372]. Examples of IaaS are Amazon Elastic Compute Cloud (EC2), Rackspace, and Google Compute Engine.
Based on their accessibility and ownership, cloud infrastructures have four deployment models which are
public, private, hybrid, and community clouds [313]. Providers of public clouds allow users to rent cloud resources
so they can accelerate the launching of their businesses, services, and applications. The advantages include the
elimination of CAPEX and OPEX costs, and the reduction of risks especially for start-up companies. The
disadvantage is however the lack of fine-grained control over data storage, network, or security settings. In the
private cloud model, the infrastructure is exclusive for use by the owners. Although the costs are high, the full
control of the infrastructure typically leads to guaranteed performance, reliability, and security metrics. In the
hybrid cloud model, a private cloud may offload or burst some of its workload to a public cloud due to exceeding
its capacity, for cost reduction, or to achieve other performance metrics [121], [142]. Hybrid clouds combine the
benefits of both private and public clouds but at the expense of additional control overhead. Finally,
community clouds are owned by several organizations to serve customers with shared and specific interests and
needs [313].

B. Cloud Computing and Big Data Related Technologies:


1) Machine Virtualization:
Machine virtualization abstracts the CPU, memory, network, and disk resources and provides isolated
interfaces to allow several Virtual Machines (VMs), also called instances, to coexist in the same physical machine
while having different Operating Systems (OS) [373]. Cloud computing relies heavily on machine virtualization
as it allows dynamic resources sharing among different workloads, applications, or tenants while isolating them
[374]. Moreover, VMs are widely considered in the billing structures of pay-as-you-go cloud services as they can
be leased with different prices based on pre-defined sets of resources and configurations. For example, IaaS is
typically provided in the form of on-demand VMs such as on EC2 [375] and Google Compute engine [376].
Examples of EC2 offerings are the general purpose t2 VMs with nano, micro, small, medium, large, xlarge, and
2xlarge options, in addition to compute, memory, or storage optimized offers [375]. Google Compute engine
offers standard machines (e.g. n1-standard-64), high memory, and high CPU machines [376]. Figure 7(a)
illustrates the components of a physical machine with VMs.

Fig. 7. Physical machines with (a) VMs and (b) containers.

Virtualization increases the utilization and energy efficiency of cloud infrastructures as with proper
management, VMs can be efficiently assigned and seamlessly migrated while running and can thus be
consolidated in fewer physical machines. Then, more machines can be switched to low-power or sleep modes.
Managing and monitoring VMs in cloud data centers has been realized through several management platforms
(i.e. hypervisors) such as Xen [377], KVM [378], and VMware [379]. Also, virtual infrastructure management
tools such as OpenNebula and VMware vSphere were introduced to support the management of virtualized
resources in heterogeneous environments such as hybrid clouds [374]. However, compared to "bare-metal"
implementations (i.e. native use of physical machines), virtualization can add performance overheads related to
the additional software layer, memory management requirements, and the virtualization of I/O and networking
interfaces [380], [381]. Moreover, additional networking overheads are encountered when migrating VMs between
several clouds for load balancing and power saving purposes through commercial or dedicated inter-data center
networks [382]. These migrations, if done concurrently by different service providers, can lead to increased
network congestion, exceeded delay limits, and hence service performance fluctuations and SLA violations
[383]. In the context of VMs, several studies considered optimizing their placement and resource allocation to
improve big data applications performance. Also, overcoming the overheads of using VMs with big data
applications is considered as detailed in Subsection V-B.
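The consolidation idea mentioned above can be illustrated with a first-fit-decreasing sketch (Python, with synthetic VM and host sizes): VMs are packed onto as few hosts as possible so the remaining machines can enter a low-power state. Production placement would additionally account for memory, network, and migration costs, which are omitted here.

def consolidate(vm_cpu_demands, host_cpu_capacity):
    """First-fit-decreasing bin packing of VM CPU demands onto identical hosts.
    Returns a list of hosts, each a list of the VM demands placed on it."""
    hosts = []
    for demand in sorted(vm_cpu_demands, reverse=True):
        for host in hosts:
            if sum(host) + demand <= host_cpu_capacity:
                host.append(demand)
                break
        else:
            hosts.append([demand])          # open (power on) a new host
    return hosts

vms = [8, 4, 4, 2, 2, 2, 1, 1, 6, 3, 5, 7]   # vCPUs requested by each VM
hosts = consolidate(vms, host_cpu_capacity=16)
print(f"{len(hosts)} active hosts:", hosts)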
2) Network Virtualization:
To complement the benefits of the mature VM technologies used in cloud computing systems, virtualizing
networking resources has also been considered [384], [385]. Network virtualization enables efficient use of cloud
networking resources by allowing multiple heterogeneous Virtual Networks (VNs), also known as slices,
composed of virtual links and nodes (i.e. switches and routers) to coexist on the same physical network (substrate
network). As with VMs, VNs can be created, updated, migrated, and deleted upon need and hence, allow
customizable and scalable on-demand allocations. The assignments and mapping of the VNs on the physical
network resources can be performed through Virtual Network Embedding (VNE) offline or online algorithms
[386]. VNE can target different goals such as maximizing revenue and resilience through effective link
remapping and virtual node migrations, in addition to energy efficiency by consolidating over-provisioned
resources [387], [388]. Figure 8 illustrates the concept of VNs and VNE, where three different VNs share the
physical network. Several challenges can be encountered with network virtualization due to the large scale and
the heterogeneous and autonomous nature of cloud networking infrastructures [384]. Also, different infrastructure
providers (InPs) can have conflicting goals and non-unified QoS measures for their network services.
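A greedy flavour of VNE node mapping can be sketched as follows (Python, with synthetic capacities and demands): virtual nodes are mapped, largest CPU demand first, onto distinct substrate nodes with the most residual capacity, after which the virtual links would be routed over substrate paths; link embedding and more elaborate objectives are omitted for brevity.

def embed_nodes(virtual_demands, substrate_capacity):
    """Greedy node embedding: map each virtual node (largest CPU demand first) onto
    the distinct substrate node with the most remaining capacity. Returns a mapping
    {virtual_node: substrate_node}, or None if the VN request must be rejected."""
    residual = dict(substrate_capacity)
    mapping = {}
    for vnode, demand in sorted(virtual_demands.items(), key=lambda kv: -kv[1]):
        candidates = {n: c for n, c in residual.items() if n not in mapping.values()}
        if not candidates:
            return None
        snode = max(candidates, key=candidates.get)
        if candidates[snode] < demand:
            return None                     # not enough room: reject the request
        residual[snode] -= demand
        mapping[vnode] = snode
    return mapping

substrate = {'s1': 10, 's2': 8, 's3': 6}    # residual CPU units per physical node
request = {'v1': 5, 'v2': 4, 'v3': 4}       # CPU demand of each virtual node
print(embed_nodes(request, substrate))       # e.g. {'v1': 's1', 'v2': 's2', 'v3': 's3'}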

Fig. 8. The concept of virtual networks and VNE.

3) Network Function Virtualization (NFV):


Network Function Virtualization (NFV) is the detachment of networking functions (NFs) from their proprietary
equipment (i.e. middleboxes and appliances) [389] - [393]. Various NFs such as routing, load balancing, firewalls,
multimedia caching, QoS monitoring, gateways, proxies, Deep Packet Inspection (DPI), Dynamic Host
Configuration Protocol (DHCP), and Network Address Translator (NAT), in addition to mobile networking and
telecom operators services such as BaseBand Units (BBUs) and Mobility Management Entities (MMEs) can then
be provided as software instances virtualized in data centers, servers, or high capacity edge networking devices
[394], [395]. The state-of-the-art implementations of such functions in proprietary hardware are characterized by relatively high CAPEX and OPEX, and by slow per-device upgrades and development cycles. NFV allows rapid development of each NF independently, which supports the release and upgrade of new and existing networking and telecom services in a cost-effective, scalable, and agile manner. Such virtualized NF instances can be created, replicated, terminated, and migrated as needed to elastically improve the handling of the evolving functions and the increasing big data traffic to be acquired, transferred, cached, or processed by those functions. Moreover, NFV can increase the energy efficiency, for example by consolidating and scaling the resource usage of BBUs in cloud environments and of virtualized MME pools in base stations [396]-[399].
A key effort to standardize NFV was initiated in 2012 by the European Telecommunications Standards Institute (ETSI), which was selected by seven leading telecom operators to provide the requirements, deployment recommendations, and unified terminologies for NFV. According to ETSI, the components of the NFV architecture, which delivers Virtual Network Functions (VNFs), are illustrated in Figure 9 and are listed below [393], [394]:
• The networking services which are delivered as VNFs.
• The NFV Infrastructure (NFVI) composed of the software and hardware to virtualize the computing and
networking infrastructure required to realize and host the VNFs.
• The Management and Orchestration (MANO) which manages the life-cycle of VNFs and their resources
usage.
NFV aided by cloud computing infrastructures is considered a key enabler for future 5G networks that target
innovation, agility, and programmability in networking services for multi-tenant usage with heterogeneous
requirements and demands [393], [400]. To achieve this, telecom operators are required to transform their infrastructure design and management, for example by integrating VNFs with Cloud Radio Access Networks (C-
RANs) in the front-haul network (e.g. virtualized BBUs that support multiple Remote Radio Heads (RRHs) and
MMEs) [401], [402], and with virtualized Evolved Packet Core networks (vEPC) in the back-haul network [390],
[391]. Moreover, and to provide isolated and unified end-to-end network slices in 5G, wireless and spectrum
resources in access [403] and wireless networks [404], [405] are also virtualized and integrated with cloud
infrastructures and VNFs.
Among the challenges with NFV deployments are the need for the coexistence of NFV-enabled and legacy equipment, and the sensitivity of VNFs to the underlying NFVI specifications [406]. Further challenges are encountered in the optimized allocation of NFVI resources to VNFs while keeping the cost and energy consumption low, and in the composition of end-to-end chained services where traffic has to pass through a certain ordered set of VNFs to deliver the required functionality [394].

Fig. 9. The concept of NFV and the ETSI-NFV architecture [393].
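The service chaining challenge can be illustrated with a minimal Python sketch that places an ordered chain of VNFs on a fixed traffic path whose nodes have limited capacity; the VNF demands, node capacities, and path are assumed values, and the sketch ignores the bandwidth and latency constraints handled by the approaches surveyed in [394].

    # Minimal sketch of placing an ordered VNF service chain on NFVI nodes.
    # Node capacities, VNF demands, and the path itself are illustrative assumptions.

    def place_chain(chain, path_capacities):
        """Assign each VNF in order to the first node on the path that can host it,
        never moving backwards so the chaining (ordered traversal) is preserved."""
        placement = []
        node_idx = 0
        residual = list(path_capacities)
        for vnf, demand in chain:
            while node_idx < len(residual) and residual[node_idx] < demand:
                node_idx += 1          # move forward along the traffic path
            if node_idx == len(residual):
                return None            # chain cannot be embedded on this path
            residual[node_idx] -= demand
            placement.append((vnf, node_idx))
        return placement

    if __name__ == "__main__":
        # e.g. firewall -> DPI -> NAT, with CPU demands in arbitrary units (assumed)
        chain = [("firewall", 4), ("DPI", 8), ("NAT", 2)]
        print(place_chain(chain, path_capacities=[6, 10, 4]))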

4) Container-based Virtualization:
Container-based virtualization is a recently introduced technology for cloud computing infrastructures, proposed to reduce the overheads of VMs, as containers can provide up to five times better performance [407]. While
hypervisors perform the isolation at the hardware level and require an independent guest OS for each VM,
containers perform the isolation at the OS level and thus, can be regarded as a lightweight alternative to VMs
[408]. In each physical machine, the containers share the OS kernel and provide isolation through having different
user-spaces by using Linux Kernel containment (LKC) user-space interfaces for example. Figure 7 compares the
components of physical machines when hosting VMs (Figure 7(a)) and containers (Figure 7(b)). In addition to sharing the OS, containers can also share the binary and library files of the applications running on them. With these features,
containers can be deployed, migrated, restarted, and terminated faster and can be deployed in larger numbers in a
single machine compared to VMs. However, and due to sharing the OS, containers can be less secure than VMs.
To improve their security, containers can be deployed inside VMs and share the Guest OS [409] at the cost of
reduced performance.
Some examples of Linux-based container engines are Linux-Vserver, OpenVZ and Linux containers (LXC).
The performance isolation, ease of resources management, and overheads of these systems for MapReduce
clusters have been addressed in [410] and near bare-metal performance was reported. Docker [407] is an open-source and widely-used container manager that extends LKC with the kernel and application APIs within the containers [411]. Docker has been used to launch the containers within YARN (e.g. in Hadoop 2.7.2) to provide better software environment assembly and consistency, and elasticity in assigning the resources to different components within YARN (e.g. map or reduce containers) [412]. Besides YARN, other cloud resource management platforms such as Mesos and Quasar have successfully adopted containers for resource management [160]. Similar to YARN, V-Hadoop was proposed in [413] to enable the use of Linux containers for Hadoop. V-
Hadoop scales the number of containers used according to the resource usage and availability in cloud
environments.
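As a simple illustration of container-based elasticity, the sketch below uses the Docker SDK for Python to start a small pool of resource-limited worker containers; the image name, resource limits, and worker count are placeholders and do not reflect the configurations used by YARN or V-Hadoop.

    # Minimal sketch of launching resource-limited worker containers with the
    # Docker SDK for Python (docker-py). The image name, resource limits, and
    # worker command are placeholders, not the settings used by YARN or V-Hadoop.

    import docker

    def launch_workers(image, count, cpus_per_worker="0-1", mem_per_worker="2g"):
        client = docker.from_env()
        workers = []
        for i in range(count):
            c = client.containers.run(
                image,
                name=f"worker-{i}",
                detach=True,                 # containers start in the background
                cpuset_cpus=cpus_per_worker, # pin each worker to a CPU set
                mem_limit=mem_per_worker,    # cap memory per container
            )
            workers.append(c)
        return workers

    if __name__ == "__main__":
        # Scaling down is as simple as stopping and removing containers, which is
        # part of what makes container-based elasticity faster than VM-based elasticity.
        for c in launch_workers("hadoop-worker:latest", count=3):
            print(c.name, c.status)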
5) Software-defined Networking (SDN):
Software-defined networking (SDN) is an evolving networking paradigm that separates the control plane, which
generates the rules for where and how data packets are forwarded, from the data plane, which handles the packets
received at the device according to agreed rules, in networking infrastructures [414]. Legacy networks as depicted
in Figure 10(a), contain networking devices with vendor-specific integrated data and control planes that are hard
to update and scale, while software-defined networks introduce a broader level of programmability and flexibility
in networking operations while providing a unified view for the entire network. SDN architectures can have
centralized or semi-centralized controllers distributed across the network [415] to monitor and operate networking
devices such as switches and routers while considering them as simple forwarding elements [416] as illustrated
in Figure 10(b).
Software-defined networks have three main layers, namely infrastructure, control, and application. The infrastructure layer, which is composed of SDN-enabled devices, interacts with the control layer through a
Southbound Interface (SBI), while the application layer connects to the control layer through a Northbound
Interface (NBI) [415], [417]. Several protocols for SBI such as OpenFlow were developed as an effort to
standardize and realize SDN architectures [418] - [420]. OpenFlow performs the switching at the flow granularity
(i.e. group of sequenced packets with common set of header fields) where each forwarding element contains a
flow table that receives rule updates dynamically from the SDN controllers. Examples of available centralized OpenFlow controllers are NOX, POX, Trema, Ryu, FloodLight, Beacon, and Maestro, while examples of distributed controllers are Onix, ONOS, and HyperFlow [420] - [422]. Software-based OpenFlow switches such as Open vSwitch (OVS) have also been introduced to enable SDN in virtualized environments and to ease the interactions between hypervisors, container engines, various application elements, and the software-based SDN controller while connecting several physical machines [423]. An additional tool that aids SDN control at a finer granularity is the P4 (Programming Protocol-independent Packet Processors) high-level language, which enables arbitrary modifications to packets in forwarding elements at line rate [424].
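The flow-granularity forwarding model can be illustrated with a short Python sketch of a simplified flow table: entries are matched in priority order, and a table miss is sent to the controller. The match fields and actions are simplified assumptions rather than the full OpenFlow match structure.

    # Conceptual sketch of flow-granularity forwarding in an OpenFlow-like switch.
    # The match fields, priorities, and actions are simplified illustrations and
    # not the full OpenFlow match structure.

    FLOW_TABLE = [
        # (priority, match dict, action) - installed dynamically by the controller
        (200, {"ip_dst": "10.0.0.2", "tcp_dst": 80}, "output:port2"),
        (100, {"ip_dst": "10.0.0.2"},                "output:port3"),
        (0,   {},                                    "send_to_controller"),  # table miss
    ]

    def forward(packet):
        """Return the action of the highest-priority entry matching the packet."""
        for priority, match, action in sorted(FLOW_TABLE, key=lambda e: -e[0]):
            if all(packet.get(k) == v for k, v in match.items()):
                return action
        return "drop"

    if __name__ == "__main__":
        print(forward({"ip_dst": "10.0.0.2", "tcp_dst": 80}))   # output:port2
        print(forward({"ip_dst": "10.0.0.9", "tcp_dst": 22}))   # send_to_controller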
The flexibility and agility benefits that result from adopting SDN in large-scale networks have been
experimentally validated in several testbeds [425] and in commercial Wide Area Networks (WANs) that
interconnect geo-distributed cloud data centers such as Google B4 WAN [426], and Microsoft Software-driven
WAN (SWAN) [427]. SDN in WAN supports various Traffic Engineering (TE) operations and improves their
congestion management, load balancing, and fault tolerance. Also, SDN relaxes the over-provisioning requirements of traditional WANs, which are typically 30-60% utilized in order to increase resilience and performance [421], [422].
The programmability of SDN enables fine-grained integration of adaptive routing and flow aggregation
protocols with additional resource allocation and energy-aware algorithms across different network layers [428].
This allows dynamic configuration and ease of implementation for networking applications with scalability
requirements and dynamic traffic. These concepts have found widespread use in cloud computing services and big data applications [429], [430], NFV [391]-[393], and wireless networks [405]. This is in addition to intra data
center networking [421], [431] as will be discussed in Subsection VI-E.
Fig. 10. Architectures of (a) Legacy, and (b) SDN-enabled networks.

6) Advances in Optical Networks:


Transport networks, as depicted in Figure 1, are composed of a core WAN, typically connected as a mesh to link cities in a country or across continents, Metropolitan Area Networks (MANs) connected in ring or star topologies, and access networks mostly based on Passive Optical Network (PON) technologies and topologies [432].
Core networks use fiber optic communication technologies due to their high capacity, reach, and energy
efficiency. High-end Internet Protocol (IP) routers in core networks are widely integrated with optical switches to
form IP over Wavelength-Division-Multiplexing (WDM) architectures. The IP layer aggregates the traffic from
metro and access networks and connects with the optical layer through short reach interfaces. The optical layer is
composed of optical switching devices, mainly Reconfigurable Optical Add Drop Multiplexers (ROADM)
realized by Wavelength Selective Switches (WSS), or Optical Cross Connect (OXC) switches to drop wavelengths
of terminating traffic, insert wavelengths of newly received traffic, and convert wavelengths, if necessary, to
groom transit traffic. In addition, transponders for modulation and demodulation and for Optical-Electrical-Optical (O/E/O) conversions are required [433]. Between the nodes, fiber links spanning up to thousands of kilometers are utilized. Due to physical impairments in fibers causing optical power losses and pulse dispersion, amplification is required, mainly with Erbium-Doped Fiber Amplifiers (EDFAs) at fixed distances. Reamplification, Reshaping, and Retiming (3R) regenerators can also be installed at distances depending on the line rate. Such networks have
sophisticated heterogeneous configurations and are typically configured to be static for long periods to reduce
labor risk and costs. However, the increasing aggregated traffic due to Internet-based services with big data
workloads makes legacy infrastructures incapable of serving future demands efficiently. This calls for
improvements at different layers [434]-[437].
Legacy Coarse or Dense WDM (CWDM or DWDM) systems, standardized by the International Telecommunication Union (ITU), utilized transponders based on On-Off Keying (i.e. intensity) modulation and direct detection with carriers in the Conventional band (C-band) (1530-1565 nm) with 50 GHz spacing between channels and with up to 96 channels [438]. Typically, Single-Mode Fiber (SMF) links with a Single Line Rate
(SLR) at 10 Gbps or 40 Gbps per fiber are utilized. To increase the capacity with existing fiber plants and to
improve the spectral efficiency, the use of Mixed Line Rates (MLR) and multiple modulation formats with more
than 1 bit per symbol such as duobinary and Differential Quadrature Phase Shift Keying (DQPSK) were proposed
[438]. The rates in MLR systems can be adjusted to transport traffic with short/long reach with high/low rates to
reduce regeneration requirements. With the advances in digital coherent detection enabled with high sampling
rate Digital-to-Analogue Converters (DACs), Digital Signal Processors (DSP), pre and post digital Chromatic
Dispersion (CD) compensation, and Forward Error Correction (FEC) [439]-[441], the use of coherent higher order
modulation formats such as QPSK, Quadrature Amplitude Modulation (QAM), in addition to Polarization
Multiplexing were also proposed to enhance the spectral efficiency. DSP can realize matched filters for detection and hence can enable transmission at the Nyquist limit for Inter Symbol Interference (ISI)-free transmission, and relaxes the 50 GHz guard band requirements between adjacent WDM channels, allowing more channels in the C-
band [440]. The use of the Long-band (L-band) (1565-1625 nm), and the Short-band (S-band) (1460–1530 nm)
was also proposed to accommodate more channels, however, costly and complex amplifiers such as Raman
amplifiers are required in addition to careful impairments and non-linearities compensation approaches [440],
[442]. Space Division Multiplexing (SDM)-based solutions that enable wavelength reuse in the fiber are also considered to increase link capacity. However, these solutions call for the replacement of SMFs with Multi Mode Fibers (MMFs) (i.e. with a large core diameter to accommodate several lightpaths propagating in different modes), or Multi Core Fibers (MCFs) (i.e. with several cores within the fiber) to spatially separate lightpaths that use the same carrier [442].
To further improve the spectral efficiency and allow dynamic bandwidth allocation, the concept of superchannels
in Elastic Optical Networks (EONs) is also introduced [443]-[446]. A superchannel is composed of bundled spatial
or spectral channels with variable bandwidths at the granularity of 12.5 GHz, as defined by the ITU-T
SG15/G.694.1 FlexGrid. These channels can be transmitted as a single entity with guard bands only between
superchannels. Such channels can be constructed with Nyquist WDM or with coherent Optical Orthogonal
Frequency Division Multiplexing (O-OFDM) which overlaps adjacent subcarriers [447]-[449]. Such flexibility in
bandwidth assignments in FlexGrids requires programmable and adaptive networking equipment such as
Bandwidth-Variable Transponders (BV-Ts), Bandwidth-Variable Wavelength Selective Switches (BV-WSSs),
and Contentionless, Directionless, and Colorless (CDC) Reconfigurable Optical Add Drop Multiplexers
(ROADMs) as detailed in [435], [436], and [442]. The modulation format, line rate, and bandwidth of each
channel can then be dynamically configured with SDN control [436] to provide programmable and agile Software-
Defined Optical Networks with fine-grained control at the optical components and IP layer levels [188], [450].
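A rough illustration of how flexible-grid spectrum assignment could be sized for a single elastic channel is sketched below in Python; the 12.5 GHz slot granularity follows the FlexGrid definition above, while the FEC overhead, pulse roll-off, and guard band values are assumptions for illustration only.

    # Illustrative estimate of FlexGrid spectrum usage for one elastic-optical
    # channel. The 12.5 GHz slot width follows the flexible grid described above;
    # the FEC overhead, roll-off factor, and guard band are assumed values.

    import math

    SLOT_GHZ = 12.5  # FlexGrid slot granularity

    def slots_needed(line_rate_gbps, bits_per_symbol, pol_mux=2,
                     fec_overhead=0.15, rolloff=0.1, guard_band_ghz=12.5):
        # symbol rate in Gbaud after FEC overhead and polarization multiplexing
        symbol_rate = line_rate_gbps * (1 + fec_overhead) / (bits_per_symbol * pol_mux)
        occupied_ghz = symbol_rate * (1 + rolloff) + guard_band_ghz
        return math.ceil(occupied_ghz / SLOT_GHZ)

    if __name__ == "__main__":
        # 400 Gbps with PM-16QAM (4 bits/symbol per polarization) vs PM-QPSK (2)
        print("400G PM-16QAM:", slots_needed(400, bits_per_symbol=4), "slots")
        print("400G PM-QPSK :", slots_needed(400, bits_per_symbol=2), "slots")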

C. Challenges and requirements for Big Data Applications deployments in Cloud Environments:
Although Hadoop and other big data frameworks were originally designed and provisioned to run in dedicated
clusters under controlled environments, several cloud-based services, enabled by leasing resources from public,
private or hybrid clouds, are offering big data computations to public users aiming to increase the profit and
utilization of cloud infrastructures [451]. Examples of such services are Amazon Elastic MapReduce (EMR)14,
Microsoft's Azure HDInsight15, VMWare Serengeti's project [452], Cloud MapReduce [120], and Resilin [453].
Cloud-based implementations are increasingly considered as cost-effective, powerful, and scalable delivery
models for big data analytics and can be preferred over cluster-based deployments especially for interactive
workloads [454]. However, to suit cloud computing environments, conventional big data applications are required
to adopt several changes in their frameworks [124], [148]. Also, to suit big data applications and to meet the SLAs
and QoS metrics for the heterogeneous demands of multi-users, cloud computing infrastructures are required to
provide on-demand elastic and resilient services with adequate processing, storage, memory, and networking
resources [126]. Here we identify and discuss some key challenges and requirements for big data applications in
cloud environments:
Framework modifications: Single-site cluster implementations typically couple data and compute nodes (e.g. data nodes and JVMs coexist in the same machines) to realize data locality. In virtualized cloud environments, and because elasticity is favored for computing while resilience is favored for storage, the two entities are typically decoupled or split [142]. For example, VMWare follows this split Hadoop architecture, and Amazon EMR utilizes S3 for storage and EC2 instances for computing. This requires an additional data loading step into
computing VMs before running jobs [148]. Such transfers are typically free of charge within the same
geographical zone to promote the use of both services, but are charged if carried between different zones. For
cloud-based RDBMSs, and as discussed in Section II-C, the ACID requirements are replaced by the BASE requirements, where either the availability or the consistency is relaxed to guarantee partition tolerance, which is a critical requirement in cloud environments [455]. Another challenge associated with the scale and heterogeneity of cloud implementations is that application debugging cannot be performed in smaller configurations as in single-site environments, and requires tracing at the actual scale [313].
Offer selection and pricing: The availability of a large number of competing Cloud Service Providers (CSPs), each offering VMs with different characteristics, can be confusing to users, who should define their own policies and requirements. To ease optimizing Hadoop-based applications for inexperienced users, some platforms such as

14 Available at: https://aws.amazon.com/emr/
15 Available at: https://azure.microsoft.com/en-gb/services/hdinsight/
Amazon EC216 provide a guiding script to aid in estimating computing requirements and tuning configurations. However, unified methods for estimation and comparison are not available. CSPs are also challenged with resource allocation while meeting revenue goals, as surveyed in [456], where different economic and pricing models at different networking layers are considered.
Resources allocation: Cloud-based deployments for big data applications require dynamic resource allocation
at run-time as newly generated data arrivals and volumes are unprecedented and production workloads might
change periodically or irregularly. Fixed resource allocations or pre-provisioning can lead to either under-provisioning, and hence QoS violations, or over-provisioning, which leads to increased costs [120]. Moreover, cloud-based computations might experience performance variations due to interference caused by shared resource usage. Such variations require careful resource allocation, especially for scientific workflows, which also require elasticity in resource assignments as the requirements of different stages vary [457]. Data intensive applications
and scientific workflows in clouds have data management challenges (i.e. transfers between storage and compute
VMs) and data transfer bottlenecks over WANs [458].
QoS and SLA guarantees: In cloud environments, guaranteeing and maintaining QoS metrics such as response
time, processing time, trust, security, failure-rate, and maintenance time to meet SLAs, which are the agreements
between CSPs and users about services requirements and violation penalties, is a challenging task due to several
factors as discussed in the survey in [459]. With multiple tenants sharing the infrastructure, QoS metrics for each
user should be predictable and independent of the co-existence with other users which requires efficient resource
allocation and performance monitoring. A QoS metric that directly impacts the revenue is the response time of
interactive services. It was reported in [460] and [461] that a latency of 100 ms in search results caused a 1% loss in the sales of Amazon, while a latency of 500 ms caused a 20% sales drop, and a speed-up of 5 seconds resulted in a 10% sales increase in Google. QoE in video services is also impacted by latency. It was measured in [462] that
with more than 2 seconds delay in content delivery, 60% of the users abandon the service. Different interactive
services depending on big data analytics are expected to have similar impacts on revenue and customer behaviors.
Resilience: Increasing the reliability, availability, data confidentiality, and ensuring the continuity of services
provided by cloud infrastructures and applications against cyber-attacks or systems failures are critical
requirements especially for sensitive services such as banking and healthcare applications. Resiliency in cloud
infrastructures is approached at different layers, including the computing and networking hardware layer, the middleware and virtualization layers, and the applications layer, by efficient replication, checkpointing, and cloud collaboration, as extensively surveyed in [463].
Energy consumption: A drawback of cloud-based applications is that the energy consumption at both network
and end-user devices can exceed the energy consumption of local deployments [464], and [465].
D. Options for Big Data Applications deployment in Geo-distributed Cloud Environments:
CSPs place their workloads and content in geo-distributed data centers to improve the quality of their services,
and rely on multiple Internet Service Providers (ISPs), or dedicated WANs [426], [427] to connect these data
centers. Advantages such as load balancing, increasing the capacity, availability and resilience against catastrophic
failures, and reducing the latency by being close to the users are attained [440]. Such environments attract
commercial and non-commercial big data analytics as they match the geo-distributed nature of data generation
and provide scales beyond single-cluster implementations. Thus, there is a recent interest in utilizing geo-
distributed data centers for big data applications despite the challenging requirements as surveyed in [323], and
suggested in [466]. Figure 11 illustrates different scenarios for deploying big data applications in geo-distributed
data centers and compares single-site clusters (case A) with various geo-distributed infrastructures with big data
including bulk data transfers (case B), cloud bursting (case C), and different implementations for geo-distributed
big data frameworks (case D), as outlined in [177]. Case B refers to legacy and big data bulk transfers between geo-distributed data centers, including data backups, content replication, and VM migrations, which can reach between 330 TB and 3.3 PB a month [164], requiring cost- and performance-optimized scheduling and routing. Case C is for hybrid frameworks, where a private cloud bursts some of its workloads to public clouds
for various goals such as reducing costs, or accelerating completion time. Such scenarios require interfacing with


16 Available at: https://wiki.apache.org/hadoop/AmazonEC2
public cloud frameworks in addition to profiling for performance estimations to ensure the gain from bursting
compared to localized computations.

Fig. 11. Different scenarios for deploying big data applications in geo-distributed cloud environments.

Geo-distributed big data frameworks in public cloud have three main deployment modes [177]. Case D-1
copies all input data into a single data center prior to processing. This case is cost-effective and less complex in
terms of framework management and computations but encounters delays due to WAN bottlenecks. Case D-2
distributes the map-like tasks of a single job across geo-distributed locations to locally process input data. Then, intermediate results are shuffled across the WAN to a single location where the final reduce-like tasks compute the final output. This arrangement suits workloads whose intermediate data sizes are much smaller than the input data. Although networking overheads can be reduced compared to case D-1, this case requires complex geo-aware control for the distributed framework components and experiences task performance fluctuations under heterogeneous cloud computing capabilities. The last case, D-3, creates a separate job in each location and
transmits the individual output results to a single location to launch a final job for results aggregation. This
realization relaxes the fine-grained control requirements of the distributed framework in D-2, but is considered
costly due to the large number of jobs. Also, it only suits associative workloads such as ‘averages’ or ‘counting’
where the computations can be performed progressively in stages.
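A back-of-envelope comparison of the WAN volumes moved by the three modes can clarify the trade-off; in the Python sketch below, the per-site input size and the intermediate and output reduction ratios are purely illustrative assumptions.

    # Back-of-envelope comparison of the WAN volume moved by the three geo-
    # distributed deployment modes (D-1, D-2, D-3) described above. The per-site
    # input size and the intermediate/output reduction ratios are assumptions.

    def wan_volume(sites, input_per_site_tb, shuffle_ratio, output_ratio):
        remote_sites = sites - 1  # one site hosts the final aggregation
        d1 = remote_sites * input_per_site_tb                   # ship raw input
        d2 = remote_sites * input_per_site_tb * shuffle_ratio   # ship intermediates
        d3 = remote_sites * input_per_site_tb * output_ratio    # ship per-site outputs
        return d1, d2, d3

    if __name__ == "__main__":
        # 5 sites, 10 TB of input each; intermediates are 20% of input and
        # per-site outputs 2% of input (illustrative values only).
        d1, d2, d3 = wan_volume(5, 10, shuffle_ratio=0.2, output_ratio=0.02)
        print(f"D-1: {d1} TB, D-2: {d2} TB, D-3: {d3} TB over the WAN")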
Although such options provide flexibility in implementing geo-distributed big data applications, some
unavoidable obstacles can be encountered. Due to privacy and governance regulation reasons, copying data
beyond their regions might be prohibited [467]. Also, due to HDFS federation (i.e. data nodes cannot register in other locations governed by a different organization), and the possibility of using different DFSs, additional code to unify data retrieval is required.
E. Big Data Implications on the Energy Consumption of Cloud Networking Infrastructures:
Driven by the economic, environmental, and social impacts of the increased CAPEX, OPEX, Global
Greenhouse Gas (GHG) emission, and carbon footprints as a result of the expanding demands for Internet-based
services, tremendous efforts have been devoted by industry and academia to reduce the power consumption and
increase the energy efficiency of transport networks [468]-[472]. These services empowered by fog, edge, and
cloud computing, and various big data frameworks, incur huge traffic loads on networking infrastructures and
computational loads on hosting data centers which in turn, increase the power consumption and carbon footprints
of these infrastructures [1], [473]-[475]. The energy efficiency of utilizing different technologies for wireless
access networks has been addressed in [476]-[481], while for wired PONs and hybrid access networks in [482]-
[488]. Core networks, which interconnect cloud data centers with metro and access networks containing IoT and user devices, transport huge amounts of aggregated traffic. Therefore, optimizing core networks plays an important role in improving the energy efficiency of the cloud networking infrastructures challenged by big data.
The reduction of energy consumption and carbon footprint in core networks, mainly IP over WDM networks,
have been widely considered in the literature by optimizing the design of their systems, devices, and/or routing
protocols [432], [489]-[506], utilizing renewable energy sources [507]-[513], and by optimizing the resources
assignment and contents placement in different Internet-based applications [387], [514]-[527].
The early positioning study in [489] to green the Internet addressed the impact of coordinated and uncoordinated sleeps (for line cards, crossbars, and main processors within switches) on switching protocols such as Open Shortest Path First (OSPF) and Internal Border Gateway Protocol (IBGP). Factors such as how,
when, and where to cause devices to sleep, and the overheads of redirecting the traffic and awakening the devices
were addressed. The study pointed out that energy savings are feasible but are challenging due to the modification
required in devices and protocols. In [490], several energy minimization approaches were proposed, such as Dynamic Voltage Scaling (DVS) and Dynamic Frequency Scaling (DFS) at the circuit level, and efficient routing based on equipment with efficient energy profiles at the network level. The consideration of switching off idle
nodes and rate adaptation have also been reported in [491]. The energy efficiency of the bypass and non-bypass
virtual topologies and traffic grooming schemes in IP over WDM have been assessed in [492] through Mixed
Integer Linear Programming (MILP) and heuristic methods. The non-bypass approach requires O/E/O conversion of lightpaths (i.e. traffic carried optically in fiber links and optical devices) in all intermediate nodes, so that the traffic is processed electronically in the IP layer and routed onto the following lightpaths. On the other hand, the bypass approach omits the need for O/E/O conversion in intermediate nodes, and hence reduces the number of IP router ports needed and achieves power consumption savings between 25% and 45% compared to the non-bypass approach. In [493], a joint optimization of the physical topology, energy consumption, and average propagation delay of core IP over WDM networks is considered under bypass or non-bypass virtual topologies for symmetric and asymmetric traffic profiles. An additional 10% saving was achieved compared to the work in [492].
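The router-port saving behind the bypass approach can be illustrated with a simple Python sketch that counts the ports needed per demand under both virtual topologies; the demands, hop counts, and per-port power are assumed values and not those used in the MILP models of [492], [493].

    # Illustrative comparison of IP router-port power for bypass vs non-bypass
    # lightpath routing. Demands, hop counts, and the per-port power value are
    # assumed figures, not those of the MILP models in the cited studies.

    ROUTER_PORT_W = 1000  # assumed power per router port (W)

    def router_port_power(demands, bypass):
        """demands: list of (ports_needed, hop_count) tuples, one per demand."""
        total_ports = 0
        for ports, hops in demands:
            if bypass:
                total_ports += ports          # ports only where the lightpath terminates
            else:
                total_ports += ports * hops   # O/E/O at every intermediate node
        return total_ports * ROUTER_PORT_W

    if __name__ == "__main__":
        demands = [(2, 3), (1, 4), (3, 2)]  # (ports needed, hops) per demand (assumed)
        for mode in (False, True):
            label = "bypass" if mode else "non-bypass"
            print(label, router_port_power(demands, bypass=mode) / 1e3, "kW")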
Traffic-focused optimizations for IP over WDM networks were also considered for example in [494]-[499].
Optimizing static and dynamic traffic scheduling and grooming were considered in [494]-[497] in normal and
post-disaster situations to reduce the energy consumption and the demand blocking ratio. Techniques such as
utilizing excess capacity, traffic filtering, protection path utilization, and services differentiation were examined.
To achieve lossless reduction for transmitted traffic, the use of Network Coding (NC) in non-bypass IP over WDM
was proposed in [498], [499]. Network-coded ports encode bidirectional traffic flows via XOR operations, and
hence reduce the number of router ports required compared to un-coded ports. The energy efficiency and resilience
trade-offs in IP over WDM networks were also addressed as in [500]-[503]. The impact on the energy
consumption due to restoration after link cuts and core node failures was addressed in [500]. An energy efficient
NC-based 1+1 protection scheme was proposed in [501], and [502] where the encoding of multiple flows sharing
protection paths in non-bypass IP over WDM networks was optimized. MILP models, heuristics, and closed-form expressions for the networking power consumption as a function of the hop count, network size, and demands indicated power savings of up to 37% compared to conventional 1+1 protection. The authors in [503] optimized
the traffic grooming and the assignment of router ports to protection or working links under different protection
schemes while considering the sleep mode for protection ports and cards. Up to 40% saving in the power
consumption was achieved.
Utilizing renewable energy resources such as solar and wind to reduce non-renewable energy usage in IP
over WDM networks with data centers was proposed in [507], and [508]. Factors such as renewable energy
average availability and their transmission losses, regular and inter data center traffic, and the network topology
were considered to optimize the locations of the data centers and an average reduction by 73% in non-renewable
energy usage was achieved. The work in [509] considered periodical reconfiguration to virtual topologies in IP
over WDM networks based on a “follow the sun, follow the wind” operational strategy. Renewable energy was
also considered in IP over WDM networks for cloud computing to green their traffic routing [510], content
distribution [511], services migration [512], and for VNE assignments [513].
The energy efficiency of Information-Centric Networks (ICNs) and Content Distribution Networks (CDNs) was extensively surveyed in [512] and [514], respectively. CDNs are scalable services that cache popular contents throughout ISP infrastructures, while ICNs support name-based routing to ease the access to contents. The
placement of data centers and their content in IP over WDM core nodes were addressed in [516] while considering
the energy consumption, propagation delay, and users upload and download traffic. An early effort to green the
Internet [517] suggested distributing Nano Data Centres (NaDa) next to home gateways to provide various caching
services. In [518], the energy efficiency of Video-on-Demand (VoD) services was examined by evaluating five
strategic caching locations in core, metro, and access networks. The work in [519]-[521] addressed the energy
efficiency of Internet Protocol Television (IPTV) services by optimizing video content caching in IP over WDM
networks while considering the size and power consumption of the caches and the popularity of the contents. To maximize the cache hit rate, the dynamics of TV viewing behaviors throughout the day were explored. Several optimized content replacement strategies were proposed, and up to 89% power consumption reduction was
achieved compared to networks with no caching. The energy efficiency of Peer-to-Peer (P2P) protocol-based
CDNs in IP over WDM networks was examined in [522] while considering the network topology, content
replications, and the behaviors of users. In [523], the energy efficiency and performance of various cloud
computing services over non-bypass IP over WDM networks under centralized and distributed computing modes
were considered. Energy-aware MILP models were developed to optimize the number, location, capacity and
contents of the clouds for three cloud services namely; content delivery, Storage-as-a-Service (SaaS), and virtual
machines (VM)-based applications. An energy efficient cloud content delivery heuristic (DEER-CD) and a real-
time VM placement heuristic (DEERVM) were developed to minimize the power consumption of these services.
The results showed that replicating popular contents and services in several clouds yielded 43% power saving
compared to centralized placements. The placement of VMs in IP over WDM networks for cloud computing was
optimized in [524] while considering their workloads, intra-VM traffic, number of users, and replicas distribution
and an energy saving of 23% was achieved compared to one location placements. The computing and networking
energy efficiency of cloud services realized with VMs and VNs in scenarios using a server, a data center, or
multiple geo-distributed data centers were considered in [387]. A Real-time heuristic for Energy Optimized Virtual Network Embedding (REOViNE) that considered the delay, client locations, load distribution, and efficient energy profiles for data centers was proposed, and up to 60% power savings were achieved compared to bandwidth-cost-optimized VNE. Moreover, the spectral and energy efficiencies of O-OFDM with adaptive
modulation formats and ALR power profile were examined.
To bridge the gap between traffic growth and networking energy efficiency in wired access, mobile, and
core networks, GreenTouch17, which is a leading Information and Communication Technology (ICT) research
consortium composed of fifty industrial and academic contributors, was formed in 2010 to provide architectures
and specifications targeting energy efficiency improvements by a factor of 1000 in 2020 compared to 2010. As
part of the GreenTouch recommendations, and to provide a road map to ISP operators for energy efficient design
for cloud networks, the work in [504], [506] proposed a combined consideration for IP over WDM design
approaches, and the cloud networking-focused approaches in [387], and [523]. The design approaches jointly
consider optical bypass, sleep modes for components, efficient protection, MLR, optimized topology and routing, in addition to improvements in hardware, where two scenarios, Business-As-Usual (BAU) and BAU with GreenTouch improvements, are examined. Evaluations on the AT&T core network with 7 realistic data center
locations and 2020 projected traffic, based on Cisco Visual Network Index (VNI) forecast and a population-based
gravity model, indicated energy efficiency improvements of 315x compared to 2010 core networks. Focusing on
big data and its applications, the work in [526], [527] addressed improving the energy efficiency of transport
networks while considering different “5V” characteristics of big data and suggested progressive processing in
intermediate nodes as the data traverse from source to central data centers. A tapered network that utilizes limited
processing capabilities in core nodes in addition to 2 optimally selected cloud data centers is proposed in [527]
and an energy consumption reduction of 76% is achieved compared to centralized processing. In [527], the effects of data volumes on the energy consumption are examined. The work in the above mentioned two papers is extended in [173], [174] and is further explained in Section V together with other energy efficiency and/or performance-related studies for big data applications in cloud networking environments.

V. CLOUD NETWORKING-FOCUSED OPTIMIZATION STUDIES


This Section addresses a number of network-level optimization studies that target big data applications deployed partially or fully in virtualized cloud environments. Unlike dedicated cluster implementations, cloud-based applications encounter additional challenges in meeting QoS and SLA requirements, such as the heterogeneity


17 Available at: www.greentouch.org
of infrastructures and the uncertainty of resource availability. Thus, a wide range of optimization objectives and utilizations of cloud technologies and infrastructures are proposed. This Section is organized as follows: Subsection V-A focuses on cloud resource management and optimization for applications, while Subsection V-
B addresses VMs and containers placement and resources allocation optimizations studies. Subsection V-C
discusses optimizations for big data bulk transfer between geo-distributed data centers and for inter data center
networking with big data traffic. Finally, Subsection V-D summarizes studies that optimize big data applications
while utilizing SDN and NFV. The studies presented in this Section are summarized in Tables III, and IV for
generic cloud-based, and specific big data frameworks, respectively.

A. Cloud Resource Management:


To avoid undesired under or over-provisioning in cloud-based MapReduce systems, the authors in [120]
provided a theoretical-based mechanism for resource scaling at run-time based on three sufficient conditions that
satisfy hard-deadline QoS metrics. A dynamic resource allocation policy was examined in [121] to meet soft deadlines of map tasks in hybrid clouds while minimizing the budget. The policy utilized local resources in
addition to public cloud resources to allow concurrent executions and hence reduce the completion time. In [122],
a Hyper-Heuristic Scheduling Algorithm (HHSA) was proposed to reduce the makespan of Hadoop map tasks in
virtualized cloud environments. HHSA was found to outperform FIFO and Fair schedulers for jobs with
heterogeneous map tasks. To jointly optimize resource allocation and scheduling for MapReduce jobs in clusters
and clouds while considering data locality and meeting earliest start time, execution time, and deadline SLAs, the
work in [123] proposed a MapReduce Constraint Programming-based Resource Management (MRCP-RM)
algorithm. An average reduction of 63% in missed deadlines is achieved compared to existing schedulers. Cloud RESource Provisioning (CRESP) was proposed in [124] to minimize the cost of running MapReduce workloads in public clouds. A reliable cost model that allocates MapReduce workloads while meeting deadlines is obtained by quantifying the relations between the completion time, the size of input data, and the required resources. However, a linear relation was assumed between data size and computation cost, which does not suit intensive computations. The energy efficiency of running a mix of MapReduce and video streaming applications in P2P
and community clouds was considered in [125]. Results based on simulations that estimate the processing and
networking power consumption indicated that community clouds are more efficient than P2P clouds for
MapReduce workloads.
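The kind of deadline-driven sizing performed by such mechanisms can be illustrated with a minimal Python sketch that splits a job deadline between the map and reduce phases and sizes each slot pool accordingly; the task counts, per-task durations, and the single-wave-per-phase model are simplifying assumptions and not the sufficient conditions derived in [120].

    # Minimal sketch of deadline-driven resource sizing for a MapReduce job, in the
    # spirit of the run-time scaling mechanisms discussed above. Per-task execution
    # times, task counts, and the sequential map-then-reduce model are assumptions.

    import math

    def slots_for_deadline(n_map, t_map, n_reduce, t_reduce, deadline):
        """Split the deadline between phases in proportion to their total work and
        size each slot pool so that the phase finishes within its share."""
        map_work = n_map * t_map
        reduce_work = n_reduce * t_reduce
        map_share = deadline * map_work / (map_work + reduce_work)
        reduce_share = deadline - map_share
        map_slots = math.ceil(map_work / map_share)
        reduce_slots = math.ceil(reduce_work / reduce_share)
        return map_slots, reduce_slots

    if __name__ == "__main__":
        # 2000 map tasks of 30 s, 200 reduce tasks of 60 s, 15-minute deadline
        print(slots_for_deadline(n_map=2000, t_map=30, n_reduce=200, t_reduce=60,
                                 deadline=900))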
Due to the varying prices, in addition to the heterogeneity in hardware configurations and provisioning
mechanisms among different cloud providers, choosing a service that ensures adequate and cost-efficient
resources for big data applications is considered a challenging task for users and application providers. In [126],
a real-time QoS-aware multi-criteria decision making technique was proposed to ease the selection of IaaS clouds
for applications with strict delay requirements such as interactive online games, and real-time mobile cloud
applications. Factors such as the cost, hardware features, location, in addition to WAN QoS were considered.
Typically, cloud providers adopt resource-centric interfaces, where the users are fully responsible for defining their requirements and selecting corresponding resources. As these selections are usually non-optimal in terms of performance and utilization, Bazaar was alternatively proposed in [127] as a job-centric interface. Users only provide high-level performance and cost goals, and then the provider allocates the corresponding computing and networking resources. Bazaar increased the accepted request rate by between 3% and 14%, arguing that several sets of resources lead to similar performance and hence provide more flexibility in resource assignment.
In [128], an accurate projection-based model was proposed to aid the selection of VMs and network bandwidth
resources in public clouds with homogeneous nodes. The model profiles MapReduce applications on small clusters and projects the predicted resources onto larger cluster requirements. The concurrency of in-memory
sort, external merge-sort, and disk read/write speeds were considered. To address the heterogeneity in cloud
systems caused by regular server replacements due to failures, the work in [129] proposed Application to
Datacenter Server Mapping (ADSM) as a QoS-aware recommendation system. Results with 10 different server
configurations indicated up to 2.5x efficiency improvement over a heterogeneity-oblivious mapping scheme with
minimal training overhead. The authors in [130] considered a strategy for resource placement to provide high
availability cloud applications. A multi-constrained MILP model was developed to minimize the total down time
of cloud infrastructures while considering the interdependencies and redundancies between applications, the mean
time to failure (MTTF), and mean time to recover (MTTR) of underlying infrastructures and applications
components. Up to 52% performance improvement was obtained compared to OpenStack Nova scheduler.
The studies in [131]-[134] focused on the management of bandwidth resources in cloud environments. The
authors in [131] proposed CloudMirror to guarantee bandwidth allocation for interactive and multi-tier cloud
applications. A network abstraction that utilized tenants’ knowledge of applications bandwidth requirements
based on their communication patterns was utilized. Then, a workload placement algorithm that balances locality
and availability and a runtime mechanism to enforce the bandwidth allocation were used. The results indicated
that CloudMirror can accept more requests while improving the availability from 20% to 70%. Falloc was
proposed in [132] to provide VM-level bandwidth guarantees in IaaS environments. First, a minimum base
bandwidth is guaranteed for each VM, then a proportional share policy through a Nash bargaining game approach
is utilized to allocate excess bandwidth. Compared to fixed bandwidth reservation, an improvement of 16% in job completion time was achieved. ElasticSwitch in [133] is a distributed solution that can be implemented in the hypervisor to provide minimum bandwidth guarantees and then dynamically allocate residual bandwidth to VMs in a work-conserving manner. Small testbed-based results showed successful guarantees even for challenging traffic patterns such as one-to-all patterns, and improved link utilization to between 75% and 99%. A practical distributed system to provide fine-grained bandwidth allocation among tenants, EyeQ, was examined in [134]. To provide minimum bandwidth guarantees and resolve contention at the edge, EyeQ maintains at each host a Sender EyeQ Module (SEM) that controls the transmission rate and a Receiver EyeQ Module (REM) that measures the
received rate. These modules can be implemented in the codebase of virtual switches in untrusted environments
or as a shim layer in non-virtualized trusted environments. For traffic with different transport protocols, EyeQ
maintained the 99.9th percentile of latency equivalent to that of dedicated links.
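The allocation pattern shared by these systems, a minimum guarantee per VM plus weighted sharing of residual link capacity, can be sketched in a few lines of Python; the guarantees, weights, and demands are hypothetical, and a single sharing round is shown rather than the iterative, fully work-conserving allocation the cited systems implement.

    # Minimal sketch of the allocation pattern shared by the systems above: each VM
    # gets its minimum bandwidth guarantee, and residual link capacity is divided
    # in proportion to per-VM weights. Guarantees, weights, and demands are assumed.

    def allocate(link_capacity, vms):
        """vms: dict name -> (min_guarantee_mbps, weight, demand_mbps)."""
        alloc = {name: min(g, d) for name, (g, w, d) in vms.items()}
        residual = link_capacity - sum(alloc.values())
        # share the residual among VMs that still want more, by weight
        # (a single round; a work-conserving allocator would iterate until the
        # link is saturated or all demands are met)
        wanting = {n: v for n, v in vms.items() if v[2] > alloc[n]}
        total_w = sum(w for _, w, _ in wanting.values())
        for name, (g, w, d) in wanting.items():
            extra = residual * w / total_w if total_w else 0
            alloc[name] = min(d, alloc[name] + extra)
        return alloc

    if __name__ == "__main__":
        vms = {"vm1": (200, 1, 800), "vm2": (200, 2, 300), "vm3": (100, 1, 100)}
        print(allocate(link_capacity=1000, vms=vms))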
The authors in [135] addressed optimizing NoSQL database deployments in hybrid cloud environments. A cost-aware resource provisioning algorithm that considers the existence of different SLAs among multiple
cloud tiers is proposed to meet queries QoS requirements while reducing their cost. In [136], the need for dynamic
provisioning of cloud CPU and RAM resources was addressed for real-time streaming applications such as Storm.
Jackson open queueing network theory was utilized for modelling arbitrary operations such as loops, splits, and joins. The fairness of assigning Spark-based jobs in geo-distributed environments was addressed in [137] through max-min fairness scheduling. For up to 5 concurrent sort jobs, and compared to the default Spark scheduler, job completion time improvements of up to 66% were achieved. The work in [138] outlined that the memoryless fair resource allocation strategies of YARN are not suitable for cloud computing environments as they violate desirable fairness properties. To address this, two fair resource allocation mechanisms: Long-Term Resource Fairness (LTRF)
for single-level and Hierarchical LTRF (H-LTRF) were proposed and implemented in YARN. The results
indicated improvements compared to existing schedulers in YARN, however, only RAM resources were
considered. While considering CPU, memory, storage, network and data resources management in multi-tenancy
Hadoop environment, the authors in [139] utilized a meta-data scheme, meta-data manager, and a multi-tenant
scheduler to introduce multi-tenancy features to YARN and HDFS. Kerberos authentication was also proposed to
enhance the security and access control of Hadoop.

B. Virtual Machines and Containers Assignments:


The efficiency of VMs for big data applications relies heavily on the resources allocation scheme used. To
ease sizing the virtualized resources for cloud MapReduce clusters and meeting the requirements of non-expert
users, the authors in [140] suggested Elastisizer which automates the selection of both VM sizes and jobs
configurations. Elastisizer utilizes Starfish detailed profiling [103], in addition to white-box and black-box
techniques to estimate the virtualized resource needs of workloads. Cura was proposed in [141] to provide cost-
effective provisioning for interactive MapReduce jobs by utilizing Starfish profiling, VM-aware scheduling, and
online VMs reconfigurations. A reduction of 80% in costs and 65% in job completion time were achieved but at
the cost of increasing the risks for cloud providers as they fully decide on resource allocation and satisfying users
QoS metrics. Cloud providers typically over-provision resources to ensure meeting SLAs, however, this leads to
poor utilization. The work in [142] utilized the over-provisioning of interactive transactional applications in hybrid
virtual and native clusters by concurrently running MapReduce batch jobs to increase the utilization. HybridMR,
a 2-phase hierarchical scheduler, was proposed to dynamically configure the native and virtual clusters. The first
phase categorizes the incoming jobs according to their expected virtualization overhead, then decides on placing
them in the physical or virtual machine. The second phase is composed of dynamic resource manager (DRM) at
run time, and an Interference Prevention System (IPS) to reduce the interference between tasks in co-hosted VMs
and tasks co-located in the same VM. The results indicated an improvement of 40% in completion time compared
with fully virtualized systems, and by 45% in utilization compared to native systems with minimum performance
overhead. The differences in the requirements of long and short duration MapReduce jobs were utilized in [143]
through fine-grained resources allocations and scheduling. A trade-off between scheduling speed and accuracy
was considered where accurate constrained programming was utilized to maximize the revenue for long jobs, and
two heuristics, first-fit and best-fit algorithms, were utilized to schedule small jobs. As a result, the number of accommodated jobs increased by 18%, leading to an improvement of 30% in revenue. However, simplifications such as restricting each job to a single machine, no preemption, and homogeneous VMs were assumed.
The elasticity of VM assignments was utilized in several studies to reduce the energy consumption of running
big data applications in cloud environments [144]-[147]. In [144] an online Energy-Aware Rolling-Horizon
(EARH) scheduling algorithm was proposed to dynamically adjust the scale of VMs to meet the real-time
requirements of aperiodic independent big data jobs while reducing the energy consumption. Cost and Energy-
Aware Scheduling (CEAS) was proposed in [145] to reduce the cost and energy consumption of workflows while
meeting their deadline through utilizing DVFS and 5 sub algorithms. The work in [146] considered spatial and
temporal placement algorithms for VMs that run repeated batch MapReduce jobs. It was found that spatio-
temporal placements outperform spatial-only placement in terms of both energy saving and job performance. The
work in [147] considered energy efficient VM placement while focusing on the characteristics of MapReduce
workloads. Two algorithms, TRP as a trade-off between VM duration and resource utilization, and VCS to enhance the utilization of active servers, were utilized to reduce the energy consumption. Savings of about 16% and 13% were obtained compared to Recipe Packing (RP) and Bin Packing (BP) heuristics, respectively.
Several studies addressed data locality improvements in virtualized environments [148]-[151]. To reduce the
impact of sudden increases in shuffling traffic on the performance of clouds, Purlieus was proposed in [148] as a
cloud-based MapReduce system. The system suggests using dedicated clouds with preloaded data and utilizes
separate storage and compute units, which is different from conventional MapReduce clouds, where the data must
be imported before the start of the processing. Purlieus improved the input and intermediate data locality by
coupling data and VM placements during both; map and reduce phases via heuristics that consider the type of
MapReduce jobs. The challenge of seamlessly loading data into VMs is addressed by loop-back mounting and
VM disk attachment techniques. The results indicated an average reduction in job execution time by 50% and in
traffic between VMs by up to 70%. However, it assumed knowledge about loads on datasets, job arrival rate and
mean execution time. The headroom CPU resources that cloud providers typically preserve to accommodate the
peaks in demands were utilized in [149] to maximize data locality of Hadoop tasks. These resources were
dynamically adjusted at run-time via hot-plugging. If unstarted tasks in a VM have the data locally, headroom
CPU resources are reserved to it while removing equivalent resources from other VMs to keep the price constant
for the users. For the allocation and deallocation of resources, two Dynamic Resource Reconfiguration (DRR) schedulers were proposed: synchronous DRR, which utilizes CPU headroom resources only when they are available, and queue-based DRR, which delays the scheduling of tasks with a locality-matched VM until headroom resources are available. An average improvement in the throughput by 15% was achieved. DRR differs from Purlieus in that it assumes no prior knowledge and schedules dynamically; however, the locality of reduce tasks was not considered.
To tackle resources usage interference between VMs in virtual MapReduce clusters that can cause
performance degradation, the authors in [150] proposed an interference and locality aware (ILA) scheduling
algorithm based on delay scheduling and task performance prediction models. Based on observations that the
highest interference is in the I/O resources, ILA was tested under different storage architectures. ILA achieved a speed-up of 1.5-6.5 times for individual jobs, and a 1.9 times improvement in the throughput compared with four state-of-the-art schedulers, interference-aware-only, and locality-aware-only scheduling. The authors in [151] proposed
CAM which is a cloud platform to support cloud MapReduce by determining the optimal placement of new VMs
requested by users while considering the physical location and the distribution of workloads and input data. CAM
exposes networking topology information to Hadoop and integrates IBM ISAAC protocol with GPFS-SNC for
the storage layer, which allows co-locating related data blocks to increase the locality of VM image disks. As a result, the cross-rack traffic related to the operating system and MapReduce was reduced by 3 times, and the execution duration was reduced by 8.6 times compared to state-of-the-art resource allocation schemes.
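Locality-aware schedulers of this kind commonly build on the delay scheduling idea, which the following Python sketch illustrates; the skip threshold and the task and node structures are simplified assumptions rather than the actual ILA or CAM implementations.

    # Minimal sketch of delay scheduling, the locality heuristic that schedulers
    # such as ILA build on. The skip threshold and the task/node structures are
    # simplified assumptions.

    MAX_SKIPS = 3  # scheduling opportunities a job may skip while waiting for locality

    def pick_task(free_node, pending_tasks, skip_count):
        """Prefer a task whose input data is on `free_node`; otherwise wait a little,
        and only after MAX_SKIPS missed opportunities accept a non-local task."""
        for task in pending_tasks:
            if free_node in task["data_nodes"]:
                return task, 0                 # local task found, reset skip count
        if skip_count >= MAX_SKIPS and pending_tasks:
            return pending_tasks[0], 0         # give up on locality for this task
        return None, skip_count + 1            # skip this offer, keep waiting

    if __name__ == "__main__":
        tasks = [{"id": 1, "data_nodes": {"n2"}}, {"id": 2, "data_nodes": {"n1", "n3"}}]
        task, skips = pick_task("n3", tasks, skip_count=0)
        print("scheduled task:", task["id"] if task else None, "skips:", skips)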
To increase the revenue and reliability for business critical workloads in virtualized environments, the authors
in [152] proposed Mnemos which is a self-expressive resources management and scheduling architecture. Mnemos
utilized existing system monitoring and VM management tools, in addition to a portfolio scheduler and Nebu
topology-aware scheduler to adapt the policies according to workload types. Mnemos was found to respond well
to workload changes, and to reduce costs and risks. Similarly, and to provide automated sizing for NoSQL clusters in real time based on workloads and user-defined policies, an open-source framework for use in IaaS platforms, TIRAMOLA, was suggested in [153]. TIRAMOLA integrates VM-based monitoring with Markov Decision Processes (MDPs) to decide on the addition and removal of VMs while monitoring the client and server sides.
The results here showed a successful VMs resizing that tracks the typical sinusoidal variations in commercial
workloads while using different load smoothing techniques to avoid VM assignments oscillations (i.e. being
continuously added and removed without being useful to workloads).
To enhance current Xen-based VM schedulers that do not fully support MapReduce workloads, a MapReduce
Group Scheduler (MRG) was proposed in [154]. MRG aims to improve the scheduling efficiency by aggregating
the I/O requests of VMs located in the same physical machine to reduce context switching and improve the fairness
by assigning weights to VMs. MRG was tested with speculative execution and under multiprocessor settings, and an improvement of 30% was achieved compared to the default Xen credit scheduler. In [155], a large segment scheme
and Xen I/O ring mechanism between front-end and back-end drivers were utilized to optimize block I/O
operations for Hadoop. As a result, CPU usage was reduced by a third during I/O, and the throughput was
improved by 10%.
A few recent studies addressed Hadoop improvements in Linux container-based systems [156]-[160].
FlexTuner was proposed in [156] to allow users and developers to jointly configure the network topology and
routing algorithms between Linux containers, and to analyze communication costs for applications. Compared
with Docker containers that are connected through a single virtual Ethernet bridge, FlexTuner used virtualized
Ethernet links to connect the containers and the hosts (i.e. each container has its own host name and IP address).
The authors in [157] utilized a modified k-nearest neighbour algorithm to select the best configurations for newly
submitted jobs based on information from similar past jobs in Hadoop clusters based on Docker containers. A 28% gain in performance was obtained compared to the default YARN configuration. An automated configuration for
Twister, which is a dedicated MapReduce implementation in container-driven clouds for iterative computations
workloads, was proposed in [158]. The traditional client/server communication model (e.g. RPC) was replaced
with a publish/subscribe model via NaradaBrokering open-source messaging infrastructure. Negligible difference
in jobs completion time was found between physical and container-based implementations. To improve the
performance of networking in Linux container-based Hadoop in cloud environments, the authors in [159]
proposed the use of IEEE 802.3ad that enables separate interfaces to instances and then aggregates the traffic at
the physical interface. Completion time results indicated a comparable performance with bare-metal
implementation and improvement by up to 33.73% compared to regular TCP. EHadoop in [160] tackled the impact
of saturated WAN resources on the completion time and costs of running MapReduce in elastic cloud clusters. A
network-aware scheduler the profile jobs and monitor network bandwidth availability is proposed to schedule the
jobs in optimum number of containers so that the costs are reduced and network contentions are avoided.
Compared to default YARN scheduler, 30% reduction in costs was achieved.
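The core of such a network-aware sizing decision can be sketched as follows; the profiling inputs, container price, and bandwidth figures are hypothetical illustrations rather than EHadoop's actual model, but the sketch captures the trade-off between adding containers and hitting the shared WAN bottleneck.

# Minimal sketch of a network-aware choice of container count.

def completion_time(n, cpu_seconds, shuffle_gb, wan_gbps):
    compute = cpu_seconds / n                    # perfectly divisible map/reduce work
    network = (shuffle_gb * 8.0) / wan_gbps      # the WAN transfer is shared, not divided
    return max(compute, network)

def best_container_count(cpu_seconds, shuffle_gb, wan_gbps,
                         price_per_container_hour=0.05, deadline_s=None, max_n=64):
    best = None
    for n in range(1, max_n + 1):
        t = completion_time(n, cpu_seconds, shuffle_gb, wan_gbps)
        if deadline_s is not None and t > deadline_s:
            continue                              # too slow: skip this cluster size
        cost = n * price_per_container_hour * (t / 3600.0)
        if best is None or cost < best[1]:
            best = (n, cost, t)
    return best                                   # (containers, cost, completion time)

print(best_container_count(cpu_seconds=7200, shuffle_gb=50, wan_gbps=1.0, deadline_s=1800))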

C. Bulk Transfers and Inter Data Center Networking (DCN):


Several studies considered optimizing bulk data transfers for backups, replication, and data migration between
geo-distributed data centers in terms of links utilization, cost, completion time, or energy efficiency [161]-[170].
The early work in [161] proposed NetStitcher to utilize the unused bandwidth between multiple data centers for
bulk transfers. NetStitcher used future bandwidth availability information (e.g. fixed free slot 3-6 AM), and
intermediate data centers between the source and destination in a store-and-forward fashion to overcome
availability mismatch between non-neighbouring nodes. Moreover, data bulks were split and routed over
multipath to overcome intermediate nodes storage constraints. Postcard in [162] further considered different file
types, multi-source-destination pairs, and proposed a time slotted model to consider the dynamics of storing and
forwarding. An online optimization algorithm was developed to utilize the variability in bandwidth prices to
reduce the routing costs. GlobeAny was proposed in [163] as an online profit-driven scheduling algorithm for inter
data center data transfer requested by multiple cloud applications. Request admission, store-and-forward routing,
and bandwidth allocation were considered jointly to maximize the cloud providers profit. Furthermore,
differentiated services can be provided by adjusting applications weights to satisfy different priority and latency
requirements.
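The store-and-forward idea behind these schemes can be illustrated with a small slotted sketch over a single relay; the per-slot spare capacities, storage limit, and transfer volume below are hypothetical and the greedy rule is a simplification of the optimizations in [161]-[163].

# Minimal sketch of slotted store-and-forward scheduling over one relay node.

def store_and_forward(volume_gb, src_to_relay, relay_to_dst, relay_storage_gb):
    """src_to_relay / relay_to_dst: per-slot spare capacity (GB per slot)."""
    at_relay, delivered, schedule = 0.0, 0.0, []
    for slot, (up, down) in enumerate(zip(src_to_relay, relay_to_dst)):
        send = min(up, volume_gb - delivered - at_relay,
                   relay_storage_gb - at_relay)          # push what the relay can hold
        at_relay += send
        fwd = min(down, at_relay)                        # drain the relay towards dst
        at_relay -= fwd
        delivered += fwd
        schedule.append((slot, round(send, 1), round(fwd, 1), round(delivered, 1)))
        if delivered >= volume_gb:
            break
    return schedule

# Free 3-6 AM style slots on the uplink, later spare capacity on the downlink.
print(store_and_forward(volume_gb=120,
                        src_to_relay=[40, 40, 40, 0, 0, 0],
                        relay_to_dst=[0, 20, 30, 40, 40, 40],
                        relay_storage_gb=120))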
CloudMPcast was suggested in [164] to optimize bulk transfers from a source to multiple geo-distributed destinations through overlay distribution trees that minimize costs without affecting transfer times, while considering full mesh connections and store-and-forward routing. The charging models used by public cloud service providers, which are characterized by flat costs depending on location and discounts for transfers exceeding a threshold in the range of TBytes, were utilized. Results indicated improved utilization and customer savings of 10% and 60% for Azure and EC2, respectively, compared to direct transfers. Similarly, the work in [165] optimized big data broadcasting over heterogeneous networks by constructing a pipeline broadcast tree. An algorithm was developed to select the optimum uplink rate and construct an optimal LockStep Broadcast Tree (LSBT). The authors in [166] and [167] focused on improving bulk transfers while considering the optical networking and routing layer. In
[166], joint optimization for the backup site and the data transmission paths was proposed to accelerate regular
backups. The results indicated that it is better to select the sites and the paths separately. The spectrum
management agility of elastic optical networks, realized by Bandwidth-Variable transponders and wavelength
selective switches, was utilized in [167] to handle the uncertainty and heterogeneity of big data traffic by
adaptively routing backups which were modeled as dynamic anycasts.
End-system solutions, besides networking solutions, were proposed in [168] to reduce the energy consumption of data transfers between geo-distributed data centers. Factors such as file sizes, and TCP pipelining, parallelism, and concurrency, were considered. Moreover, data transfer power models that consider CPU, memory, hard disk, and NIC resources were developed based on power meter measurements. To emulate geo-distributed transfers in testbeds, artificial delays obtained by setting the round-trip time (RTT) were considered. A similar approach was also utilized in [169] and [170]. To assist users in obtaining the best configurations, the work in [169] investigated the effects of distance on the delay and the traffic characteristics in geo-distributed data centers while considering different block sizes, locality, and replicas. The authors in [170] experimentally addressed the cost and speed efficiency of moving data into a single data center for MapReduce workloads. Two online algorithms, Online Lazy Migration (OLM) and Randomized Fixed Horizon Control (RFHC), were proposed, and the authors concluded that MapReduce computations are generally better processed in a single data center.
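A very rough version of such an end-system energy estimate is sketched below; the linear power-versus-throughput coefficients, TCP window size, and link capacity are hypothetical and are not the measured models of [168], but the sketch shows why concurrency and RTT both enter the energy calculation.

# Minimal sketch of an end-system energy estimate for a WAN transfer, assuming a
# hypothetical linear power-vs-throughput model.

def transfer_energy(volume_gb, rtt_ms, concurrency,
                    tcp_window_mb=16, link_cap_gbps=10.0,
                    p_idle_w=70.0, w_per_gbps=6.0):
    per_stream_gbps = (tcp_window_mb * 8 / 1000.0) / (rtt_ms / 1000.0)  # window/RTT limit
    throughput = min(concurrency * per_stream_gbps, link_cap_gbps)      # until the NIC saturates
    duration_s = (volume_gb * 8.0) / throughput
    power_w = p_idle_w + w_per_gbps * throughput                        # hypothetical linear fit
    return duration_s, duration_s * power_w                             # (seconds, Joules)

for c in (1, 4, 16):
    t, e = transfer_energy(volume_gb=100, rtt_ms=80, concurrency=c)
    print(f"concurrency={c:2d}: {t:7.1f} s, {e / 1000:7.1f} kJ")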
On the other hand, several other studies have utilized the fact that, for most production jobs, the intermediate data generated is much smaller than the input data, and proposed alternative solutions based on distributed computations [171]-[174]. To reduce the computational and networking costs of processing globally collected data, the authors in [171] proposed a parallel algorithm to jointly consider data-centric job placement and network coding-based routing to avoid duplicate transmissions of intermediate data. An improvement of 70% was achieved compared to simple data aggregation and regular routing. To decrease MapReduce inter data center traffic while providing predictable completion times, the authors in [172] also addressed joint optimization of input data routing and task placement. The bandwidth between data centers and the input-to-intermediate data ratio were determined by profiling, and the global allocation of map and reduce tasks was performed so that the maximum time required for input and intermediate data transmissions is minimized. A reduction of 55% was achieved compared to the centralized approach. The authors in [173] and [174] improved the energy efficiency of bypass IP over WDM transport networks while considering the different 5V attributes of big data. The use of distributed Processing Nodes (PNs) at the sources and/or intermediate core nodes was suggested to progressively process the data chunks before they reach the cloud data centers, and hence reduce network usage. In [173], a joint optimization of the processing in PNs or cloud data centers and the routing of data chunks with different volumes and processing requirements yielded up to 52% reduction in the power consumption. The capacities and power usage effectiveness (PUE) of the PNs and the availability of the required processing frameworks were also examined. The impact of the velocity of big data on the power consumption of the network was further considered in [174], where two processing modes were examined; expedited (i.e. real-time), which is delay optimized, and relaxed (i.e. batch), which is energy optimized. Power consumption savings ranging from 60% for the fully relaxed mode down to 15% for the fully expedited mode (due to its additional processing requirements) were achieved compared to single-location processing without progressive processing.
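The per-chunk decision at the heart of progressive processing can be illustrated with the following sketch; all energy-per-GB, per-hop, and PUE figures are hypothetical placeholders rather than the MILP parameters of [173] and [174], and the sketch simply compares processing a chunk at a PN against forwarding the raw chunk to the cloud.

# Minimal sketch of the "process at an intermediate PN or forward raw data" decision.

def chunk_energy(volume_gb, reduction_ratio, hops_to_cloud,
                 net_j_per_gb_hop=100.0, pn_j_per_gb=300.0, pn_pue=1.5,
                 dc_j_per_gb=250.0, dc_pue=1.2):
    output_gb = volume_gb * reduction_ratio
    # Process locally, then ship only the (smaller) intermediate output.
    at_pn = pn_pue * pn_j_per_gb * volume_gb + net_j_per_gb_hop * output_gb * hops_to_cloud
    # Ship the raw chunk, then process it in the (more efficient) cloud data center.
    at_dc = net_j_per_gb_hop * volume_gb * hops_to_cloud + dc_pue * dc_j_per_gb * volume_gb
    return ("process at PN", at_pn) if at_pn < at_dc else ("forward to cloud", at_dc)

# A chunk whose intermediate output is only 5% of the input favours the PN.
print(chunk_energy(volume_gb=100, reduction_ratio=0.05, hops_to_cloud=4))
print(chunk_energy(volume_gb=100, reduction_ratio=0.9, hops_to_cloud=1))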
Several studies addressed detailed optimizations of the configuration and scheduling of the MapReduce framework in geo-distributed environments, for example to reduce costs [175]-[177] or completion time [178], [179], and for hybrid clouds [180], [181], while a few addressed optimizing geo-distributed data streaming, querying, and graph processing applications [182]-[187]. An inter data center allocation framework was proposed in [175] to reduce the costs of running large jobs while considering the variabilities in electricity and bandwidth prices. The framework considered job indivisibility (i.e. the ability to run within a data center without the need for inter data center communications), data center processing capacities, and time constraints. A StochAstic power redUction schEme (SAVE) was proposed in [176] to schedule batch jobs arriving at a front-end server onto geo-distributed back-end clusters while trading off power costs and delay. Scheduling in SAVE was based on two time scales; the first allocates jobs and activates/deactivates servers, and the second, which is finer, manages service rates by controlling CPU power levels. To reduce the cost and execution time of sequential MapReduce jobs in geo-distributed environments, G-MR in [177] utilized Data Transformation Graphs (DTGs) to select an optimal execution sequence detailed by data movements and processing locations. G-MR used approximating functions to estimate the completion times of map and reduce tasks, and the sizes of intermediate and output data. Replicas were considered by constructing a DTG for each replica and comparing them. Moreover, the effects of choosing different providers (i.e. different inter data center bandwidths, and networking and compute costs) with data centers in the US east coast, US west coast, Europe and Asia were considered. The work in [178] addressed some geo-distributed Hadoop limitations through prediction-based job localization, map input data pre-fetching to balance the completion times of local and remote map tasks, and by modifying the HDFS data placement policy to be sub-cluster aware to reduce the output data writing time. An improvement of 48.6% was achieved in the completion time of reduce tasks. The authors in [179] addressed the placement of MapReduce jobs in geo-distributed clouds with limited WAN bandwidth. A Community Detection-based Scheduling (CDS) algorithm was utilized to reduce the WAN data transmission volume and hence the overall completion time while considering the dependencies between tasks, workloads, and data centers. Compared to hyper-graph partition-based scheduling and a greedy algorithm, the transmitted data volume was reduced by 40.7% and the completion time was reduced by 35.8%.
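A much-simplified stand-in for this dependency-aware grouping is sketched below; the task/block inputs are hypothetical, and instead of a full community-detection algorithm the sketch merely groups tasks that share input blocks (connected components) and places each group in the data center that already holds most of its input, which is the intuition behind reducing WAN transfers.

# Minimal sketch of dependency-aware task grouping and placement.

from collections import defaultdict

def group_and_place(task_inputs, block_location, block_size_gb):
    """task_inputs: task -> set of data blocks; block_location: block -> data center."""
    parent = {t: t for t in task_inputs}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Tasks touching a common block end up in the same group (union-find).
    block_to_tasks = defaultdict(list)
    for t, blocks in task_inputs.items():
        for b in blocks:
            block_to_tasks[b].append(t)
    for tasks in block_to_tasks.values():
        for t in tasks[1:]:
            parent[find(t)] = find(tasks[0])
    groups = defaultdict(list)
    for t in task_inputs:
        groups[find(t)].append(t)
    # Place each group where the largest share of its input already resides.
    placement = {}
    for members in groups.values():
        local_gb = defaultdict(float)
        for t in members:
            for b in task_inputs[t]:
                local_gb[block_location[b]] += block_size_gb[b]
        site = max(local_gb, key=local_gb.get)
        for t in members:
            placement[t] = site
    return placement

task_inputs = {"t1": {"b1", "b2"}, "t2": {"b2"}, "t3": {"b3"}}
block_location = {"b1": "eu", "b2": "us", "b3": "asia"}
block_size_gb = {"b1": 10, "b2": 60, "b3": 5}
print(group_and_place(task_inputs, block_location, block_size_gb))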
A performance model was proposed in [180] to estimate the speed-up in completion time for iterative MapReduce jobs when considering cloud bursting. The model focused on the weak links between on-premise and off-premise VMs and suggested asynchronous rack-aware replica placement and replica-aware scheduling to avoid additional data transmissions to off-premise VMs if a task is scheduled before its data is replicated there. Experimental results with up to 15 on-premise VMs and 3 off-premise VMs indicated a 1-10% error in the model-based estimation. Compared to ARIA [69], up to one order of magnitude higher accuracy was achieved. BStream was proposed in [181] as a cloud bursting framework that optimally couples batch processing in resource-limited internal clouds (IC) with stream processing in external clouds (EC). To account for the delays associated with data uploading to ECs, BStream matched the stream processing rate with the data upload rate by allocating enough EC resources. Storm was used for stream processing with the aid of Nimbus, Zookeeper, and LevelDB for coordination, and Kafka servers to temporarily store data in the EC. BStream assumed homogeneous resources and no data skew, and suggested an approach that performs partial reduction in the IC if the tasks are associative; alternatively, the approach runs the final reduce tasks only in the IC. Two checkpointing strategies were developed to optimize the reduce phase by overlapping the transmission of reduce results back to the IC with input data bursting and processing in the EC. However, users need to write two different codes (i.e. one for Hadoop and one for Storm) to express the same tasks.
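The rate-matching rationale behind this kind of bursting can be sketched as follows; the per-node processing rate, pricing-free capacity cap, and workload figures are hypothetical and do not come from [181].

# Minimal sketch of sizing external-cloud (EC) stream capacity to the upload rate.

import math

def ec_nodes_needed(upload_mbps, per_node_mbps=120.0, max_nodes=64):
    """Allocate just enough EC stream workers to keep up with the upload link."""
    n = math.ceil(upload_mbps / per_node_mbps)
    if n > max_nodes:
        raise ValueError("upload rate exceeds what the external cloud can absorb")
    return n

def burst_plan(total_gb, ic_rate_mbps, upload_mbps, deadline_s):
    """Decide how much data the internal cloud (IC) keeps and how much is burst."""
    ic_gb = min(total_gb, ic_rate_mbps * deadline_s / 8000.0)   # what the IC finishes in time
    ec_gb = total_gb - ic_gb
    return {"ic_gb": round(ic_gb, 1), "ec_gb": round(ec_gb, 1),
            "ec_nodes": ec_nodes_needed(upload_mbps) if ec_gb > 0 else 0}

print(burst_plan(total_gb=500, ic_rate_mbps=2000, upload_mbps=600, deadline_s=1200))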
SAGE was proposed in [182] as an efficient service-oriented geo-distributed data streaming framework. SAGE suggested the use of loosely coupled cloud queues to manage and process streams locally and then utilized independent global aggregator services for the final results. Compared to single-location processing, a performance improvement by a factor of 3.3 was achieved. Moreover, an economic model was developed to optimize the resource usage and costs of the decomposed services. However, the impact of limited WAN bandwidths on the performance was not addressed. To address this, the authors in [183] proposed JetStream, which utilizes structured Online Analytical Processing (OLAP) data cubes to aggregate data streams in edge nodes, and proposed to adaptively filter (i.e. degrade) the data to match the instantaneous capacity of the WAN at the expense of analysis quality. The work in [184] aimed to reduce the inter data center networking cost while meeting the SLAs of big data stream processing in geo-distributed data centers by jointly optimizing VM placements and flow balancing. Each task was replicated in multiple VMs and the distribution of data streams among them was optimized. The results revealed improvements compared to single-VM task allocation. The computational and communication costs of streaming workflows in geo-distributed data centers were considered in [185]. An Extended Streaming Workflow Graph (ESWG)-based allocation algorithm was constructed that accounts for semantic types (i.e. sequential, parallel, and choice for incoming and outgoing patterns) and the price diversity of computational and communication tasks. The algorithm assumes that each task is allocated a VM in each data center, and divides the graph into sub-graphs to simplify the allocation solutions. The results indicated improved performance and reduced costs compared to greedy, computational-cost-only, and communication-cost-only approaches. To optimize query processing latency and the usage of limited-bandwidth, variably-priced WANs in geo-distributed Spark-like systems, Iridium was proposed in [186]. An online heuristic was used that iteratively optimizes the placement of data prior to query arrival, based on its value per byte, as well as the placement of processing for complex DAGs of tasks. Iridium was found to speed up queries by 3-19 times and reduce WAN bandwidth usage by 15-64% compared to single-location processing and unmodified Spark. The work in [187] proposed G-Cut for efficient geo-distributed partitioning of static and dynamic graphs. Two optimization phases were suggested to account for the usage constraints and the heterogeneity of WANs. The first phase improves the greedy vertex-cut partitioning approach of PowerGraph to be geo-aware and hence reduces inter data center data transfers. The second phase adopts a refinement heuristic to consider the variability in WAN bottlenecks. Compared to unmodified PowerGraph, reductions of up to 58% in the completion time and 70% in WAN usage were achieved.
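A single greedy pass in the spirit of Iridium's "value per byte" placement is sketched below; the datasets, access rates, and uplink capacities are hypothetical and the heuristic is heavily simplified relative to [186].

# Minimal sketch of a greedy, WAN-aware data placement pass.

def greedy_placement(datasets, site_uplink_gbps, moves=2):
    """datasets: list of dicts with name, site, size_gb, accesses_per_hour."""
    placement = {d["name"]: d["site"] for d in datasets}
    # Rank candidates by value-per-byte, scaled by how constrained their current site is.
    ranked = sorted(datasets,
                    key=lambda d: (d["accesses_per_hour"] / d["size_gb"])
                                  / site_uplink_gbps[d["site"]],
                    reverse=True)
    best_site = max(site_uplink_gbps, key=site_uplink_gbps.get)
    for d in ranked[:moves]:
        if d["site"] != best_site:
            placement[d["name"]] = best_site   # move the data ahead of query arrival
    return placement

datasets = [{"name": "clicks", "site": "asia", "size_gb": 80,  "accesses_per_hour": 40},
            {"name": "logs",   "site": "eu",   "size_gb": 500, "accesses_per_hour": 10},
            {"name": "ads",    "site": "asia", "size_gb": 20,  "accesses_per_hour": 5}]
print(greedy_placement(datasets, {"us": 10.0, "eu": 2.0, "asia": 1.0}))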

D. SDN, EON, and NFV-based Optimization Studies:


Static circuit-switched transport networks cannot provide application-tailored bandwidth guarantees as they have low visibility of the transported traffic. The early work in [188] utilized OpenFlow as a unified control plane for the packet-switched and circuit-switched layers to enable application-aware routing that can meet advanced requirements such as low latency or low jitter. OpenFlow was shown to enable differentiated services and resilience for applications with different priorities. To provide latency and protection guarantees to applications, the Application Centric IP/Optical Networks Orchestration (ACINO) project was proposed in [189]. ACINO provides applications with the ability to specify reliability and latency requirements, and then grooms the traffic of applications with similar requirements into specific optical services. Moreover, an IP layer restoration mechanism was introduced to assist optical layer restoration. A Zero Touch Provisioning, Operation and Management (ZTP/ZTPOM) model was proposed in [190]. It utilized SDN and NFV to support Cloud Service Delivery Infrastructures (CSDI) by automating the provisioning of storage and networking resources of multiple cloud providers. However, these studies considered generalized large-scale web, voice, and video applications, and did not focus on big data applications. The authors in [191] focused on big data transmission and demonstrated the benefits of using Nyquist superchannels in software-defined optical networks for high speed transmission. Routing, Modulation Level, and Spectrum Allocation (RMLSA) optimizations were carried out to select paths and modulation levels that improve the transmission speed while ensuring flexible spectrum allocation and high spectral efficiency. Compared to traditional Routing and Spectrum Assignment (RSA), an improvement of 77% was achieved. However, only bandwidth requirements were considered.
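The two ingredients of RMLSA-style provisioning, distance-adaptive modulation selection and contiguous spectrum assignment, can be sketched as follows; the reach values, 12.5 GHz slot width, and first-fit policy are common textbook assumptions used here for illustration and are not taken from [191].

# Minimal sketch of distance-adaptive modulation plus first-fit spectrum assignment.

import math

MOD_FORMATS = [("64QAM", 6, 250), ("16QAM", 4, 1000),   # (name, bits/symbol, reach in km)
               ("QPSK", 2, 2500), ("BPSK", 1, 5000)]
SLOT_GHZ = 12.5
BAUD_GBD = 12.5            # symbol rate carried per 12.5 GHz slot (simplified)

def pick_modulation(path_km):
    for name, bps, reach in MOD_FORMATS:
        if path_km <= reach:
            return name, bps                      # most efficient format that still reaches
    raise ValueError("path exceeds the reach of all formats; regeneration needed")

def first_fit(spectrum, slots_needed):
    """spectrum: list of booleans (True = occupied). Returns the first free block index."""
    for i in range(len(spectrum) - slots_needed + 1):
        if not any(spectrum[i:i + slots_needed]):
            for j in range(i, i + slots_needed):
                spectrum[j] = True
            return i
    return None

def provision(demand_gbps, path_km, spectrum):
    name, bps = pick_modulation(path_km)
    slots = math.ceil(demand_gbps / (BAUD_GBD * bps))
    return {"modulation": name, "slots": slots, "start_slot": first_fit(spectrum, slots)}

spectrum = [False] * 64                            # one link, 64 frequency slots
print(provision(demand_gbps=400, path_km=800, spectrum=spectrum))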
To control geo-distributed shuffling traffic that typically saturates local and wide area network switches, the work in [192] suggested an OpenFlow (OF) over Ethernet enabled Hadoop implementation to dynamically control flow paths and to define OF-enabled queues with different priorities to accelerate jobs. Experiments with sort workloads demonstrated that OF-enabled Hadoop outperformed standard Hadoop. Palantir was proposed in [193] as an SDN service in Hadoop to ease obtaining adequate network proximity information, such as rack-awareness within the data center and data center-awareness in geo-distributed deployments, without revealing security-sensitive information. Two optimization algorithms were proposed; the first for data center-aware replica placement and task scheduling, and the second for proactive data prefetching to accelerate tasks with non-local data. To automate the provisioning of computing and networking resources in hybrid clouds for big data applications, VersaStack was proposed in [194] as an SDN-based full-stack model-driven orchestrator. VersaStack computes models for virtual cloud networks, manages VM placement, computes L2 paths, performs Single Root Input/Output Virtualization (SR-IOV) stitching, and manages Ceph, a networked block storage system.
Enhancing bulk data transfers has been considered with the aid of SDN and elastic optical inter data center networking in [195]-[199]. To overcome the limitations of current inter data center WANs with pre-allocated static links and capacities for big data transfers, the authors in [195] proposed the Open Transport Switch (OTS), which is a virtual light-weight switch that abstracts the control of transport network resources for hybrid clouds. OpenFlow was extended to manage packet-optical cross connects (XCON) through additional messaging capabilities. OTS allows applications to optimally compute end-to-end paths across multiple domains or to just submit bandwidth requirements to OTS. Two modes of operation were suggested: implicit and explicit. The former allows the SDN controller to communicate only with edge switches which are located between different transport domains (e.g. Ethernet, Multiprotocol Label Switching (MPLS), OTN), and the latter unifies the control of all the components across different domains. Malleable reservations for bulk data transfers in EONs were proposed in [196]. While accounting for flow-oriented reservations with fixed bandwidth allocations, the RSA of the bulk transfers was adjusted dynamically to increase the utilization of the EON and to reclaim 2-D spectrum fragments. The authors in [197] considered dynamic routing of different data chunks in bulk data transfers between geo-distributed data centers by utilizing a centralized SDN controller and distributed gateway servers in the data centers. Task admission control, routing, and the store-and-forward mechanism were considered to maximize the utilization of dedicated links between the data centers. To boost high-priority transfers when the networking resources are insufficient, the transmission of lower-priority chunks can be paused. These chunks are then temporarily stored in intermediate gateway servers and forwarded later at a carefully computed time. To further enhance the transfer performance, three dynamic algorithms were proposed: bandwidth reserving, dynamic adjustment, and future demand friendly. Owan was proposed in [198] as an approach that can help reduce the completion time of deadline-constrained business-oriented bulk transfers in WANs by utilizing a centralized SDN controller that controls ROADMs and packet switches. Network throughput was maximized by joint optimization of the network topology, routing, and transfer rates while assuming prior knowledge of demands. Topology reconfigurations were achieved by changing the connectivity between router and ROADM ports, for example to double the capacity of certain links. Simulated annealing was utilized to update the network gradually and hence ease the reconfigurations. Moreover, consistency was ensured while applying the updates to avoid packet drops. Compared to packet layer control only, the results indicated 4.45 times faster transfers and 1.36 times lower completion time. Flexgrid optical technologies were proposed with SDN control to meet SLAs and increase the revenue of inter data center transfers in [199]. The Application-Based Network Operations (ABNO) architecture, a centralized control plane architecture under standardization by the IETF, was utilized to manage application-oriented semantics that identify data volumes and completion time requirements. Heuristics for Routing and Scheduled Spectrum Allocation (RSSA) were proposed to adjust bandwidth-variable (BV) transponders and optical cross-connects (OXCs) to ensure sufficient resources and hence meet deadlines. If high-priority workloads with the nearest completion times lack resources, the resources assigned to the lowest-priority workloads can be dynamically reduced.
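The priority- and deadline-driven reallocation used by [197] and [199] can be illustrated with a small sketch; the link capacity, transfer list, and two-level priority scheme are hypothetical, and paused transfers are simply flagged as candidates for intermediate storage.

# Minimal sketch of pausing low-priority chunks in favour of near-deadline transfers.

def reallocate(link_gbps, transfers):
    """transfers: dicts with name, priority (lower number = higher priority),
    demand_gbps, and remaining slack in seconds; returns per-transfer allocations."""
    alloc, free = {}, link_gbps
    # Serve by priority first, then by least slack (closest deadline).
    for t in sorted(transfers, key=lambda t: (t["priority"], t["slack_s"])):
        give = min(t["demand_gbps"], free)
        alloc[t["name"]] = give
        free -= give
    paused = [n for n, g in alloc.items() if g == 0]
    return alloc, paused     # paused chunks can be stored at intermediate gateways

transfers = [{"name": "backup",  "priority": 2, "demand_gbps": 6, "slack_s": 7200},
             {"name": "vm-sync", "priority": 1, "demand_gbps": 4, "slack_s": 600},
             {"name": "logs",    "priority": 2, "demand_gbps": 3, "slack_s": 3600}]
print(reallocate(link_gbps=6, transfers=transfers))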
A few recent studies addressed utilizing NFV to improve the performance of big data applications [200]-[203]. An NFV framework for genome-centric networking was proposed in [200] to address the challenges of exchanging, between data centers, both genetic data with large file sizes (e.g. 3.2 GB) and the VM images that contain the processing software. To reduce the traffic, SDN and NFV-based caches were utilized to store parts of the files to be processed along the path. A signalling protocol with on-path and off-path message distribution was proposed to discover the cache resources. NFV enables service chaining by using sequences of VNFs to perform fundamental network operations, such as firewalling and deep packet inspection (DPI), on traffic. The work in [201] addressed optimizing VNF instance placement to minimize the communication costs between connected VNF instances while considering the volumes of big data traffic. An extended network service chain graph was proposed and a joint optimization of the VNFs and the flows between them was considered. The ETSI NFV specifications were followed in [202] and [203] to enhance cloud infrastructures with big data video workloads. The authors in [202] implemented an elastic CDN as a virtualized network function to deliver live TV and VoD services via the standard MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH). A centralized CDN manager was proposed to control the virtualized CDN functions, such as request admission and the contents and capacities of intermediate and leaf caches, while reducing networking costs and ensuring high QoE. NUBOMEDIA was proposed in [203] as an elastic PaaS with an integrated API stack that integrates Real-time Multimedia Communication (RTC) with multimedia big data processing, such as video content analysis (VCA) and augmented reality (AR), with the aid of NFV. The results indicated that NUBOMEDIA scaled well with increasing requests while maintaining the QoS for WebRTC workloads. A summary of the cloud networking-focused optimization studies for big data applications discussed above is given in Table III. Table IV, in contrast, focuses on generic cloud applications and provides a summary in a similar fashion to Table III.
TABLE III
SUMMARY OF CLOUD NETWORKING-FOCUSED OPTIMIZATION STUDIES FOR BIG DATA APPLICATIONS
Ref Objective Application Tools Benchmarks/workloads Experimental Setup/Simulation environment
[120]* Resources scaling to avoid MapReduce Theoretical HiBench (TeraSort Emulator for IaaS and
over and under provisioning analysis and WordCount) cloud-based MapReduce
[121]* Dynamic resources allocation in Hadoop 1.x Policy applied WordCount Aneka platform; 4 IBM X3200 M3 servers
hybrid clouds to meet soft deadlines In the master (4 CPU cores, 3 GB RAM) and a master (2
cores, 2 GB RAM) EC2 m1.small instances.
[122]* Hyper-Heuristic Scheduling Algorithm Hadoop 1.1.3 simulated annealing, genetic fMRI, protein annotation, CloudSim simulator (4 VMs in a data center),
(HHSA) for cloud computing systems algorithm, swarm, ant colony PSPLIB, Grep, TeraSort Single node (Intel i7, 64GB RAM) multi-node
cluster with 4 VMs (8 cores, 16GB RAM)
[123]* MapReduce jobs resource Hadoop 1.2.1 Matchmaking and scheduling Synthetic workloads, 11 nodes cluster in EC2 (Intel Xeon
allocation and scheduling algorithm (MRCP-RM) WordCount CPU, 3.75 GB RAM), simulations
[124]* Cloud RESource Provisioning (CRESP) Hadoop 1.0.3 Profiling and WordCount, Sort, 16 nodes cluster (4 Quad-core CPU, 16GB
to minimize cost or completion time machine learning TableJoin, PageRank RAM, 2 500GB disks), EC2 small instances
(1 CPU, 1.7GB RAM, 160GB disk)
[127]* Bazaar: job-centric interface for Hadoop 0.20.2 Profiling+ greedy heuristic Sort, WordCount, GridMix, TF- Prediction: 35 Emulab nodes (from
data analytics applications in clouds for resources selection IDF, LinkGraph Cloudera), evaluation on 26 Emulab nodes
[128]* Public cloud resources allocation to Hadoop 1.x Naive and linear Sort, Word Count, 33 nodes (4 cores, 4 GB RAM,
meet completion time requirements profiling GridMix 1TB disk for data), 1 Gbps Ethernet
[132]* Falloc: VM-level bandwidth Hadoop 1.0.0 Nash bargaining Random VM-VM traffic, 16 nodes cluster with 64 VMs, 1 Gbps
guarantees in IaaS environments game, rate control WordCount, Sort, join Trace-driven Mininet simulations
[135]* Cost model for NoSQL Elasticsearch Profiling + look- YCSB 6 nodes with mixture of VMs with
in public and private clouds 0.20.6 ahead optimization (2-4 CPU, 2.4-3.2 GHz, 3-8GB RAM)
[136]* Dynamic resource scheduling for Data Storm DRS Video logo detection, 6 nodes (Intel quad core CPU,
Stream Management Systems (DSMS) scheduler frequent pattern detection 8 GB RAM), LAN network
[137]* Fairness scheduler for Spark jobs Apache Linear program, Sort workloads for 12 EC2 instances in 6 geo-distributed
In geo-distributed data centers Spark heuristic multiple jobs data centers (2 CPU, 7.5GB RAM, 32GB SSD
[138]* Long Term Fair Resource Hadoop 2.2.0 LTYARN scheduler Facebook traces, Purdue suite, 10 nodes cluster (2 Intel X5675
Allocation in cloud computing In YARN Hive workloads CPUs, 24GB RAM) to emulate EC2
TPC-H, Spark ML t2.medium (2 CPU cores, 4 GB RAM)
[139]* Introduce multi-tenancy Hadoop 2.6.0 Modification to WordCount, Grep, 8 nodes cluster(2.4 GHz CPU, 3GB
features in Hadoop YARN and HDFS PiEstimator RAM, 200 GB HDD), 1 GB Ethernet
[140]° Elastisizer: automating virtualized Hadoop 1.x Profiling, white Link Graph, Join, TF-IDF, Various EC2 instances
cluster sizing for cloud MapReduce and black-box methods TeraSort, WordCount
[141]° Cura: cost-effective cloud Hadoop 1.x Profiling, VM-aware Facebook-like traces 20 nodes KVM-based cluster (16
provisioning for MapReduce scheduling, online generated by SWIM cores CPU, 16GB RAM) in two racks,
VM reconfiguring 1 Gbps Ethernet, Java based simulations
[142]° HybridMR: MapReduce scheduler in Hadoop 0.22.0 2-phase hierarchical Sort, WC, PiEst, DistGrep, 24 nodes (dual core 2.4 GHz CPU, 4GB
hybrid native/virtualized clusters scheduler, profiling twitter, K-means, TestDFSIO RAM), 48 VMs (1 CPU core, 1 GB RAM)
[146]° Energy-aware MapReduce in clouds Hadoop 1.x BinCardinality, BinDu- Sort, WordCount, 6 nodes (dual core CPU, 2GB RAM,
by spatio-temporal placement tradeoffs ration, RandomFF, recipe PiEstimator 250GB disk), 3 VM types, 1 Gbps Ethernet.
[147]° Energy-efficient VM placement in IaaS Hadoop 1.1.0 Tight Recipe Packing (TRP), Sort, TeraSort, 10 nodes Xen-based cluster (4 cores
cloud data centers running MapReduce Virtual Cluster Scaling (VCS) WordCount CPU, 6GB RAM, 600GB storage)
Gigabit Ethernet, Java-based simulations
[148]° Purlieus: optimizing data and VM Hadoop 1.x Heuristics based Grep, Permutation 2 racks, 20 nodes, 20 VMs/job (4 2 GHz
placements for MapReduce in clouds on k-club generator, Sort CPUs, 4GB RAM), KVM as hypervisor,
1 Gbps network, simulations (PurSim)
[149]° Locality-aware Dynamic Hadoop 0.20.2 Hot-plugging, synchronous Hive workloads (Grep, 100 nodes EC2 cluster (8 CPU, 7 GB RAM),
Resource Reconfiguration (DRR) and queue-based scheduler select aggregation, join) 6 nodes cluster (6 CPU, 16GB RAM),
With 5 VMs/node (2CPU, 2GB)
[150]° Interference and locality-aware Hadoop Meta task scheduler TeraSort, Grep RWrite, 12 physical machines in 2 racks forming
MapReduce tasks scheduling in 0.20.205 WCount, TeraGen 72-nodes Xen-based cluster (12 CPU cores,
virtual clusters 32GB, 500GB disk), 1 Gbps Ethernet
[151]° CAM: topology-aware Vanilla Scheduler based on Hive workloads 4 machines (2 quad CPU, 48GB RAM),
resources manager for Hadoop min-cost flow problem KVM hypervisor, 1 Gbps Ethernet, 23 VMs
MapReduce in clouds (2 CPU, 1GB RAM), simulations (PurSim)
[152]° Self-expressive management for business Hadoop and Mnemos: portfolio scheduler, Real-world business-critical Simulations
critical workloads in clouds Microsoft HPC Nedu: virtualization scheduler traces from 1300 VMs
[153]° TIRAMOLA: open-source Hadoop 1.0.1, Ganglia for monitoring, YCSB benchmark OpenStack cactus cluster with 20 clients
framework to automate VM HBase 0.92.0 Markov Decision Process VMs (2 CPU cores, 4GB RAM), and
resizing in NoSQL clusters 16 server VMS (4 CPU cores, 8GB RAM)
[154]° Enhanced Xen scheduler Hadoop 0.20.2 modification to Xen Word Count, Grep, sort, Machine type 1 (2.8 GHz 2 core CPU, 2MB
for MapReduce hypervisor, 2-level Hive benchmarks L2 cache, 3GB RAM) Machine type 2 (3
scheduling policy GHz 2 core CPU, 6MB L2 cache, 6GB RAM)
[155]° Enhancing Hadoop Hadoop 1.x Optimizing Xen TestDFSIO (Intel i7 CPU, 16GB RAM, SSD 128GB disk)
on Xen-based VMs I/O ring mechanism
[157]° Automatic configurations for Hadoop 2.x Lightweight custom k- HiBench workloads (k- 5 nodes (2 Quad-core 2.8 GHz CPU, 12 MB L2
Hadoop on Docker containers nearest neighbour heuristic Means, Bayes, PageRank) cache, 8GB RAM, 320GB disk), 1 Gbps Ethernet
[158]° Iterative MapReduce Twister Pub/sub broker network k-means clustering 2 Docker containers
on Docker containers via NaradaBrokering
[159]° Improving networking performance Hadoop 2.7.3 Link aggregation via TestDFSIO 2 ProLiant DL385 G6 servers (2 6-cores AMD, 32GB
in Container-based Hadoop IEEE 802.3ad standard RAM), IEEE 802.3ad-enabled Gigabit Ethernet
[160]° EHadoop: Network-aware scheduling YARN Online profiling, HiBench (sort, 30 nodes, 4 data and 20-25 compute (8 CPU
in elastic MapReduce clusters FIFO and fair schedulers WordCount, PageRank) cores, 8GB RAM), 800 Mbps network
[169]† Improving Hadoop in Hadoop 2.2.0 Statistical analysis of WordCount, 18 nodes cluster in 3 sub-clusters, 1 Gbps network, each
geo-distributed data centers traffic log data TeraSort sub-cluster contains 3 racks and 6 data nodes
[171]† Data-centric cloud architecture Hadoop 1.x parallel algorithm for job Aggregate, expand, Simulations for 12 nodes in EC2
for geo-distributed MapReduce scheduling and data routing transform, summary
[172]† MapReduce in geo-distributed data Hive Chance-constrained Hive traces Simulations for 12 data centers with network bandwidth
centers with predictable completion time optimization uniformly distributed between 100Mbps and 1Gbps
[173]† Balancing processing location and type, Hadoop 1.x, MILP, Input data chunks (10 -330)GB NSFNET, Cost239, and Italian networks with 14
and energy consumption in big data others heuristic with output sizes (0.01-330)GB processing nodes and 2 cloud data centers
networks
[174]† Balancing processing speed and energy Hadoop 1.x MILP Input data chunks (10 -220)GB NSFNET network with 14 processing nodes
consumption in big data networks with output sizes (0.2-220)GB and 2 cloud data centers
[177]† G-MR: MapReduce framework for Hadoop 1.x Data Transformation CensusData, WordCount, 10 large instances from 4 EC2 data centers
Geographically-distributed data centers. Graph (DTG) algorithm KNN, Yahoo, Google traces (4 CPU cores, 7.5GB RAM), 2 VICCI clusters
[178]† Hadoop system-level improvements Hadoop 1.x Sub-cluster scheduling and WordCount, Grep 20 nodes in 2 sub-clusters, cross access latency
in geo-distributed data centers reduce output placement of 200 ms, and subcluster latency of 1 ms
[180]† Performance model for iterative Hadoop 2.6.0 Profiling, rack-aware Iterative GREP, k-means, 8 nodes, 4 (4 CPU cores, 500GB HDD, 4GB RAM),
MapReduce with cloud bursting scheduler HiBench, PageRank 4 (2_8 CPU cores, 1 TB HDD, 64GB RAM), KVM
[181]† BStream: Cloud bursting Hadoop 0.23.0, Profiling, modification WordCount, MultiWord- Cluster: 21 nodes (2 2GHz CPU, 4GB
framework for MapReduce Storm 0.8.1 to YARN Count, InvertedIndex, RAM, 500GB disk), Cloud: 12 nodes
sort (Quad 3.06GHz CPU, 8GB RAM)
[187]† G-Cut: geo-distributed PowerGraph modification to PowerGraph PageRank, Shortest Single EC2 instances,
graph partitioning partitioning algorithm Source Path (SSSP), Subgraph Simulations on Grid5000
Isomorphism (SI) on 5 real-
world graphs
[192]‡ OpenFlow (OF)-enabled Matsu, modified JT, MalStone Benchmark, Open Cloud Consortium Project infra-
Hadoop over Ethernet Hadoop 1.x Open vSwitch (OVS) sort structure with 10 nodes, 3 data centers,
OF-enabled switches, 10Gbps network
[193]‡ Palantir: SDN service to obtain Hadoop 1.x Service module Facebook traces Testbed with 32 nodes in 4 data centers,
network proximity for Hadoop in Floodlight each with 2 racks, Pronto 3290 OpenFlow
core switch and OVS in Top-of-Rack (ToR) switches
*Cloud resources management, °VM and containers assignments, †Bulk transfers and inter DCN, ‡SDN and NFV-based.

TABLE IV
SUMMARY OF CLOUD NETWORKING-FOCUSED OPTIMIZATION STUDIES FOR GENERIC CLOUD APPLICATIONS
Ref Objective Tools Benchmarks/workloads Experimental Setup/Simulation environment
[125]* Energy efficiency in cloud data energy consumption MapReduce synthesized Simulations
centers with different granularities evaluation workloads, video streaming
[126]* Cloud recommendation system based Analytic Hierarchy Process Information about cloud Single machine as master, server from NeCTAR
on multicriteria QoS optimization (AHP) decision making providers (e.g. prices, locations) cloud, small EC2 instance, and C3.8xlarge EC2 instance
[129]* QoS and heterogeneity-aware Application Lightweight controller Single and multi-threaded std Homogeneous 40 nodes clusters with 10 configurations,
to Datacenter Server Mapping (ASDM) in cluster schedulers benchmarks, Microsoft workloads heterogeneous 40 nodes cluster with 10 machine types
[130]* Highly available cloud MILP model for availability- Randomly generated 50 nodes cluster with total
applications and services aware VM placements MTTF and MTTR (32 CPU cores, 30GB RAM)
[131]* CloudMirror: Applications-based network TAG modeling Empirical and synthesized Simulations for tree-based cluster with 2048 servers
abstraction and workloads placements with placement algorithm workloads
high availability
[133]* ElasticSwitch: Work-conserving minimum Guarantee Partitioning (GP), Shuffling traffic 100 nodes testbed (4 3GHz CPU, 8GB RAM)
bandwidth guarantees for cloud computing Rate Allocation (RA), OVS
[134]* EyeQ: Network performance Sender and receiver Shuffling traffic, 16 nodes cluster (Quad core CPU), 10 Gbps NIC
Isolation at the servers EyeQ modules Memcached traffic packet-level simulations
[143]° Multi-resource scheduling in Constrained programming, Google cluster Trace-driven simulations for
cloud data centers with VMs first-fit, best-fit heuristics Traces (1024 nodes, 3-tier tree topology)
[144]° Real-time scheduling for Rolling-horizon Google cloud Simulations via
tasks in virtualized clouds optimization Tracelogs CloudSim toolkit
[145]° Cost and energy aware scheduling VM selection/reuse, tasks Montage, LIGO, Simulations via
For deadline constraint tasks in clouds with merging/slaking heuristics SIPHT, CyberShake CloudSim toolkit
[156]° FlexTuner: Flexible container- Modified Mininet, MPICH2 version 1.3, Simulation for one VM
based tuning system iperf tools NAS Parallel Benchmarks (2GB RAM, 1 CPU core)
[161]† NetStitcher: system for inter Multipath and multi-hop, Video and content Equinix topology, 49 CDN (Quad Xeon
data center bulk transfers store-and-forward algorithms sharing traces CPU, 4GB RAM, 3TB disk), 1 Gbps links
[162]† Costs reduction for Convex optimizations, Uniform distribution Simulations for inter
inter-data center traffic time-slotted model for file sizes DCN with 20 data centers
[163]† Profit-driven traffic scheduling for inter Lyapunov Uniform distribution Simulations for 7 nodes in EC2
DCN with multi cloud applications optimizations for arrival rate with 20 different cloud applications
[164]† CloudMPcast: Bulk transfers in multi ILP and Heuristic based Data backup, Trace-driven simulations for 14
data centers with CSP pricing models on Steiner Tree Problem video distribution data centers from EC2 and Azure
[165]† Reduce completion time of big data LockStep Broadcast - Numerical evaluations
broadcasting in heterogeneous clouds Tree (LSBT) algorithm
[166]† Optimized regular data backup in ILP and Transfer of Simulations for US backbone
Geographically-distributed data centers heuristics 1.35 PBytes network topology with 6 data centers
[167]† Optimize migration and backups in Greedy anycast Poisson arrival demands, Simulations for
EON-based geo-distributed DCNs algorithms -ve exponential service time NSFNET network
[168]† Energy saving in end- Power modelling by 20 and 100GB data sets Servers (Intel Xeon, 6GB RAM, 500 GB disk), (AMD FX,
to-end data transfers linear regression with different file sizes 10GB RAM, 2TB disk), Yokogawa WT210 power meter
[170]† Reducing costs of migrating geo- An offline and 2 Meteorological 22 nodes to emulate 8 data centers, 8 gateways, and
dispersed big data to the cloud online algorithms data traces 6 user-side gateways, additional node for routing control
[175]† Reduction of electricity and bandwidth Distributed 22k jobs from 4 data centers with varying electricity
costs in inter DCNs with large jobs algorithms Google cluster traces and bandwidth costs throughout the day
[176]† Power cost reduction in distributed data Two time scales Facebook Simulations for 7 geo-distributed data centers
centers for delay-tolerant workloads scheduling algorithm traces
[179]† Tasks scheduling and WAN bandwidth Community Detection 2000 file with Simulations in China-VO
allocation for big data analytics based Scheduling (CDS) different sizes network with 5 data centers
[182]† SAGE: service-oriented architecture Azure Queues 5 streaming services Azure public cloud
for data streaming in public cloud with synthetic benchmark (North and West US and North and West Europe)
[183]† JetStream: execution engine for OLAP cubes, 140 Million HTTP requests (51GB) VICCI testbed to emulate Coral CDN
data streaming in WAN adaptive data filtering
[184]† Networking cost reduction for geo- MILP model, Multiple Streams with four Simulations for
distributed big data stream processing VM Placement algorithm different semantics NSFNET network
[185]† Reduction of streaming workflows MILP and 500 streaming workflows Simulations for
costs in geo-distributed data centers 2 heuristics each with 100 tasks 20 data centers
[186]† Iridium: low-latency geo- Linear program, Bing, Conviva, Facebook, TPC- EC2 instances in 8 worldwide regions,
distributed data analytics online heuristic DS queries, AMPLab Big-Data trace-driven simulations
[188]‡ Application-aware Aggregation and TE OpenFlow, Voice, video, 7 GE Quanta packet switched, 3 Ciena CoreDirector
in converged packet-circuit networks POX controller and web traffic hybrid switch, 6 PCs with random traffic generators
[189]‡ Application-Centric IP/optical SDN Bulk transfers, dynamic Software for controller and use
Network Orchestration (ACINO) orchestrator 5G services, security, CDN cases for ACINO infrastructure
[190]‡ ZeroTouch Provisioning (ZTP) Network automation Bioinformatics, GEANT network and use
for managing multiple clouds with SDN and NFV UHD video editing cases for ZTP/ZTPOM
[191]‡ Routing, Modulation level and Spec- MILP, Bulk data Emulation of NSFNET
trum Allocation for bulk transfers NOX controller transfers network with OF-enabled WSSs
[194]‡ VersaStack: full-stack model-driven Resources modelling and Reading/writing from/to AWS and Google VMs, OpenStack-based VMs
orchestrator for VMs in hybrid clouds orchestration workflow Ceph parallel storage with SR-IOV interfaces, 100G SDN network
[195]‡ Open Transport Switch OTS prototype Bulk data ESnet Long Island Metropolitan
(OTS) for Cloud bursting based on OpenFlow transfers Area Network (LIMAN) testbed
[196]‡ Malleable Reservation for efficient MILP, dynamic Bulk data Simulations on NSFNET
bulk data transfer in EONs programming transfers
[197]‡ SDN-based bulk data Dynamic algorithm, Bulk data 10 data centers (IBM BladeCenter HS23 cluster), 10 OF-
transfers orchestration OpenFlow, Beacon transfers enabled HP3500 switches, 10 server-based gateways
[198]‡ Owan: SDN-based traffic management Simulated Bulk data Prototype, testbed, simulations
system for bulk transfers over WAN annealing transfers for ISP and inter data center WAN
[199]‡ SDN-managed Bulk transfers ABNO, ILP Bulk data OMNeT++ simulations for Telefonica (TEL), British
in Flexgrid inter DCNs heuristics transfers Telecom (BT), and Deutsche Telekom (DT) networks
[200]‡ Genome-Centric cloud-based NetServ, algorithms Genome 3-server clusters with total (160 CPU cores, 498GB
networking and processing off-path signaling data RAM, and 37TB storage), emulating GEANT Network
[201]‡ VNF placement for MILP, Random DAGs 10 nodes network with random
big data processing heuristics for chained NFs communication cost values
[202]‡ NFV-based CDN network for OpenStack-based HD and full Leaf cache on SYNERGY testbed (1 DC with 3 servers
big data video distribution CDN manager HD videos each with Intel i7 CPU, 16GB RAM, 1TB HDD)
[203]‡ NUBOMEDIA: real-time multimedia PaaS-based WebRTC 3 media server on
communication and processing APIs data KVM-based instances
*Cloud resources management, °VM and containers assignments, †Bulk transfers and inter DCN, ‡SDN and NFV-based.

VI. DATA CENTER TOPOLOGIES, ROUTING PROTOCOLS, TRAFFIC CHARACTERISTICS, AND ENERGY
EFFICIENCY:

Data centers can be classically defined as large warehouses that host thousands of servers, switches, and storage devices to provide various data processing and retrieval services [528]. Intra Data Center Networking (DCN), defined by the topology (i.e. the connections between the servers and switches), the link capacities, the switching technologies utilized, and the routing protocols, is an important design aspect that impacts the performance, power consumption, scalability, resilience, and cost. Data centers have been successfully hosting legacy web applications but are challenged by the need to host an increasing number of big data and cloud-based applications with elastic requirements and multi-tenant, heterogeneous workloads. Such requirements are continuously challenging data center architectures to improve their scalability, agility, and energy efficiency while providing high performance and low latency. The rest of this Section is organized as follows: Subsection VI-A reviews electronic switching-based data centers, while Subsection VI-B reviews proposed and demonstrated hybrid electronic/optical and optical switching-based data centers. Subsection VI-C briefly describes HPC clusters and disaggregated data centers. Subsection VI-D presents traffic characteristics in cloud data centers, while Subsection VI-E reviews intra DCN routing protocols and scheduling mechanisms. Finally, Subsection VI-F addresses energy efficiency in data centers.
A. Electronic Switching Data Centers:
Extensive surveys on the categorization and characterization of different data center topologies and
infrastructures are presented in [529]-[535]. In what follows, we briefly review some state-of-the-art electronic switching DCN topologies while emphasizing their suitability for big data and cloud applications. Servers in data centers are typically organized in “racks” where each rack typically accommodates between 16 and 32 servers. A Top-of-Rack (ToR) switch (also known as access or edge switch) is used to provide direct connections between the rack's servers and indirect connections with other racks via higher layer/layers switches according to the DCN topology. Most legacy DCNs have a multi-rooted tree structure where the ToR layer is connected either to an upper core layer (two tiers) or to upper aggregation and core layers (three tiers) [528]. For various improvement purposes, alternative designs based on Clos networks, flattened connections with high-radix switches, unstructured connections, and wireless transceivers were also considered. These architectures can be classified as switch-centric as the servers are only connected to ToR switches and the routing functionalities are exclusive to the switches. Another class of DCNs, known as server-centric, utilizes servers (or sets of servers) with multi-port NICs and software-based routing to aid the process of traffic forwarding. A brief description of some electronic switching DCNs is provided below and some examples of small topologies are illustrated in Figure 12 showing the architecture in each case:
• Three-tier data centers [528]: Three-tier designs have access, aggregation, and core layers (Figure 12(a)).
Different subsets of ToR/access switches are connected to aggregation switches which connect to core
switches with higher capacity to ensure all-to-all racks connectivity. This increases the over-subscription
ratio as the bisection bandwidth between different layers varies due to link sharing. Supported by firewall,
load balancing, and security features in their expensive switches, three-tier data centers were well-suited for
legacy Internet-based services with dominant north-south traffic and were widely adopted in production data centers.
• k-ary Fat-tree [536]: Fat-tree was proposed to provide 1:1 oversubscription and multiple equal-cost paths
between servers in a cost-effective manner by utilizing commodity switches with the same number of ports
(k) at all layers. Fat-tree organizes sets of equal edge and aggregation switches in pods and connects each
pod as a complete bipartite graph. Each edge switch is connected to a fixed number of servers and each pod
is connected to all core switches forming a folded-Clos network (Figure 12(b)). The Fat-tree architecture is widely considered in industry and research [529], indicating its efficiency with various workloads; the sketch after Figure 12 illustrates how its size grows with the switch port count k. However, its wiring complexities increase massively with scaling.
• VL2 [537]: VL2 is a three-tier Clos-based topology with 1:1 oversubscription proposed to provide performance isolation, load balancing, resilience, and agility in workload placement by using a flat layer-2 addressing scheme with address resolution. VL2 suits virtualization and multi-tenancy; however, its wiring complexities are high.
• Flattened ButterFLY (FBFLY) [538]: FBFLY is a cost-efficient topology that flattens k-ary n-fly butterfly
networks into k-ary n-flat networks by merging the n switches in each row into a single high-radix switch.
FBFLYs improve the path diversity of butterfly networks, and achieve folded-Clos network performance
under load-balanced traffic with half the costs. However, with random adversarial traffic patterns, both load
balancing and routing become challenging, thus, FBFLY is not widely considered for big data applications.
• HyperX [539]: HyperX is a direct network of switches proposed as an extension to hypercube and flattened
butterfly networks. Further design flexibility is provided as several regular or general configurations are
possible. For load-balanced traffic, HyperX achieved the performance of folded Clos with fewer switches.
Like FBFLY, HyperX did not explicitly target improving big data applications.
• Spine-leaf (e.g. [540], [541]): Spine-leaf DCNs are folded Clos-based architectures that gained widespread
adoption by industry as they utilize commercially-available high-capacity and high-radix switches. Spine-
leaf allows flexibility in the number of spine, leaf, and servers per leaf and links capacities at all layers (e.g.
in Figure 12(c)). Hence, controllable oversubscription according to cost-performance trade-offs can be
attained. Their commercial usage indicates acceptable performance with big data and cloud applications.
However, wiring complexities are still high.
• BCube [542] and MDCube [543]: BCube is a generalized hypercube-based architecture that targets modular data centers with scales that fit in shipping containers. The scaling in BCube is recursive, where the first building block “BCube_0” is composed of n servers and an n-port commodity switch, and the k-th level (i.e. BCube_k) is composed of n BCube_(k-1) networks and n^k n-port switches. Figure 12(d) shows a BCube_1 with n=4. To support multipath routing and to provide low latency, high bisection bandwidth, and fault-tolerance, BCube utilizes servers equipped with multiple ports to connect with switches at different levels. BCube is hence suitable for several traffic patterns, such as one-to-one, one-to-many, one-to-all, and all-to-all, which arise in big data workloads. However, at large scales, lower-level to higher-level bottlenecks increase and the address space has to be overwritten. For larger scales, MDCube in [543] was proposed to interconnect BCube containers by high speed links in 1D or 2D connections.
• CamCube [544]: CamCube is a server-centric architecture that directly connects servers in a 3D torus topology where each server is connected to its 6 neighbouring servers. In CamCube, switches are not needed for intra-DCN routing, which reduces costs and energy consumption. A greedy key-based routing algorithm, symbiotic routing, is utilized at the servers, which enables application-specific routing and arbitrary in-network functions, such as caching and aggregation at each hop, that can improve big data analytics [214]. However, CamCube might not suit delay-sensitive workloads with a high number of servers due to routing complexities, longer paths, and high store-and-forward delays.
• DCell [545]: DCell_k is a recursively-scaled data center that utilizes a commodity switch per DCell_0 pod to connect its servers, and the remaining ports of the (k+1) server ports for direct connections with servers in other pods of the same level and in higher-level pods. Figure 12(e) shows a DCell_1 with 4 servers per pod. DCell provides high bandwidth, scalability, and fault-tolerance at low costs. In addition, under all-to-all, many-to-one, and one-to-many traffic patterns, DCell achieves balanced routing, which ensures high performance for big data applications. However, as it scales, longer paths between servers in different levels are required.
• FiConn [546]: FiConn is a server-centric DCN that utilizes switches and dual-port servers to recursively scale while maintaining a low diameter and high bandwidth at reduced cost and wiring complexity compared to BCube and DCell. In FiConn_0, one port in each server is connected to the switch, and at each level, half of the remaining ports in the pods are reserved for the connections with servers in the next level. For example, Figure 12(f) shows a FiConn_2 with 4 servers per FiConn_0. Real-time requirements are supported by employing a small diameter and hop-by-hop traffic-aware routing according to the network condition. This also improves the handling of the bursty traffic of big data applications.
• Unstructured data centers with random connections: With the aim of reducing the average path lengths and easing incremental expansions, unstructured DCNs based on random graphs such as Jellyfish [547], Scafida [548], and the Small-World Data Center (SWDC) [549] were proposed. Jellyfish [547] creates random connections between homogeneous or heterogeneous ToR switches and connects hosts to the remaining ports to support incremental scaling while achieving higher throughput due to low average path lengths. Scafida [548] is an asymmetric scale-free data center that scales incrementally under limits on the longest path length. In Scafida, two disjoint paths are assigned per switch pair to ensure high resilience. SWDC [549] includes its servers in the routing and connects them using small-world-inspired connection distributions. Unstructured DCNs, however, have routing and wiring complexities, and their performance with big data workloads is not widely addressed.
• Data centers with wireless 60 GHz radio transceivers: To improve the performance of tree-based DCNs
without additional wiring complexity, the use of wireless transceivers at servers or ToR switches was also
proposed [550], [551].

Fig. 12. Examples of electronic switching DCNs (a) Three-tier, (b) Fat-tree, (c) Spine-leaf, (d) BCube, (e)
DCell, and (f) FiConn.
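To make the relative scalability of the designs above concrete, the following minimal Python sketch computes their sizes from the standard closed-form expressions reported in the cited proposals; the parameter values in the example calls are arbitrary and chosen to match the small topologies of Figure 12.

# Minimal sketch of the closed-form scaling of three of the topologies above.

def fat_tree(k):
    """k-ary Fat-tree built from identical k-port switches."""
    return {"pods": k, "core": (k // 2) ** 2,
            "edge+agg per pod": k, "servers": (k ** 3) // 4}

def bcube(n, k):
    """BCube_k built from n-port switches and (k+1)-port servers."""
    return {"servers": n ** (k + 1), "switches": (k + 1) * n ** k}

def dcell(n, k):
    """DCell_k: t_k = t_(k-1) * (t_(k-1) + 1), with t_0 = n servers per DCell_0."""
    t = n
    for _ in range(k):
        t = t * (t + 1)
    return {"servers": t}

print(fat_tree(k=8))       # 16 core switches, 128 servers
print(bcube(n=4, k=1))     # 16 servers, 8 switches (as in Figure 12(d))
print(dcell(n=4, k=1))     # 20 servers (as in Figure 12(e))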
B. Hybrid Electronic/Optical and All Optical Switching Data Centers:
Optical switching technologies have been proposed for full or partial use in DCNs as solutions to overcome the bandwidth limitations of electronic switching, reduce costs, and improve performance and energy efficiency [552]-[557]. Such technologies eliminate the need for O/E/O conversion at intermediate hops and make the interconnections data-rate agnostic. Hybrid architectures add Optical Circuit Switching (OCS), typically realized with Micro-Electro-Mechanical System Switches (MEMSs) or free-space links, to enhance the capacity of an existing Electronic Packet Switching (EPS) network. To benefit from both technologies, bursty traffic (i.e. mice flows) is handled by the EPS while bulky traffic (i.e. elephant flows) is offloaded to the OCS. MEMS-based OCS requires a reconfiguration time on the scale of ms or µs before setting up paths between pairs of ToR switches, and because packet headers are not processed, external control is needed for the reconfigurations. Another shortcoming of MEMS switches is their limited port count. WDM technology can increase the capacity of ports without a huge increase in the power consumption [270], resolve wavelength contention, and reduce wiring complexities, at the cost of additional devices for multiplexing and de-multiplexing, and of fast-tuning lasers and tuneable transceivers at the ToRs or servers. In addition, most of the advances in optical networking discussed in Subsection IV-B6 have also been considered for DCNs, such as OFDM [558], PON technologies [559]-[573], and EONs [574]-[576].
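The offloading decision in such hybrid fabrics can be illustrated with a small sketch; the size threshold, demand estimates, and number of available circuits are hypothetical and the policy is a simplification of the scheduling used in systems such as c-Through and Helios.

# Minimal sketch of steering elephant flows to optical circuits and mice flows to EPS.

ELEPHANT_MB = 100          # hypothetical demand threshold for requesting a circuit

def steer(flows, circuits_available):
    """flows: dicts with src_rack, dst_rack and estimated_mb. Returns (ocs, eps) lists."""
    elephants = sorted((f for f in flows if f["estimated_mb"] >= ELEPHANT_MB),
                       key=lambda f: f["estimated_mb"], reverse=True)
    ocs = elephants[:circuits_available]       # MEMS ports are scarce: biggest flows first
    eps = [f for f in flows if f not in ocs]   # everything else stays on the packet network
    return ocs, eps

flows = [{"src_rack": 1, "dst_rack": 7, "estimated_mb": 900},
         {"src_rack": 2, "dst_rack": 5, "estimated_mb": 12},
         {"src_rack": 3, "dst_rack": 7, "estimated_mb": 450},
         {"src_rack": 4, "dst_rack": 6, "estimated_mb": 300}]
ocs, eps = steer(flows, circuits_available=2)
print([f["estimated_mb"] for f in ocs], [f["estimated_mb"] for f in eps])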
In hybrid and all-optical DCNs, both active and passive components were considered. The passive components, including fibers, waveguides, splitters, couplers, Arrayed Waveguide Gratings (AWGs), and Arrayed Waveguide Grating Routers (AWGRs), do not consume power but introduce insertion, crosstalk, and attenuation losses. Active components include Wavelength Selective Switches (WSSs), which can be configured to route different sets of wavelengths out of a total of M wavelengths at an input port to N different output ports (i.e. a 1×N switch), MEMSs, Semiconductor Optical Amplifiers (SOAs), which can provide switching times in the range of ns, Tuneable Wavelength Converters (TWCs), and Mach-Zehnder Interferometers (MZIs), which are external modulators based on controllable phase shifts in split optical signals. In addition to OCS, Optical Packet Switching (OPS) [577]-[580] was also considered, with or without intermediate electronic buffering. Examples of hybrid electrical/optical and all-optical switching DCNs are summarized below, and some are illustrated in Figure 13:
• c-Through [581]: In c-Through, electronic ToR switches are connected to a two-tier EPSs network and a
MEMS-based OCS as depicted in Figure 13(a). The EPS maintains persistent but low bandwidth connections
between all ToRs and handles mice flows, while the OCS must be configured to provide high bandwidth
links between pairs of ToRs at a time to handle elephant flows. As the MEMS used have ms switching time,
c-Through was only proven to improve the performance of workloads with slowly varying traffic.
• Helios [582]: In Helios, electronic ToR switches are connected to a single tier containing an arbitrary number
of EPSs and MEMS-based OCSs as in Figure 13(b). Helios performs WDM multiplexing in the OCS links
and hence requires WDM transceivers in the ToRs. Due to its complex control, Helios was demonstrated to
improve the performance of applications with second-scale traffic stability.
• Mordia [583]: Mordia is a 24-port OCS prototype based on a ring connection between ToRs, each with a 2D
MEMS-based WSS that provides 11.5 µs reconfiguration time at 65% of electronic switching efficiency.
Mordia can support unicast, multicast, and broadcast circuits, and enables both long and short flows
offloading which makes it suitable for big data workloads. However, it has limited scalability as each source-
destination pair needs a dedicated wavelength.
• Optical Switching Architecture (OSA) [584] / Proteus [585]: OSA and Proteus utilize a single MEMS-
based optical switching matrix to dynamically change the physical topology of electronic ToRs connections.
Each ToR is connected to the MEMS via an optical module that contains multiplexers/demultiplexers for
WDM, a WSS, circulators, and couplers as depicted in Figure 13(c). This flexible design allows multiple
connections per ToR to handle elephant flows and eliminates blocking for mice flows by enabling multi-hop
connections via relaying ToRs. OSA was examined with bulk transfers and mice flows and minimal
overheads were reported while achieving 60%-100% non-blocking bisection bandwidth.
• Data center Optical Switch (DOS) [586]: DOS utilizes an (N+1)-port AWGR to connect N ToR electronic
switches through OPS with the aid of optical label extractors as shown in Figure 13(d). Each ToR is
connected via a TWC to the AWGR to enable it to connect to one other ToR at a time. At the same time,
each ToR can receive from multiple ToRs simultaneously. The last ports on the AWGR are connected to an
electronic buffer to resolve contention between transmitting ToRs. DOS suits applications with bursty traffic
patterns; however, its disadvantages include the limited scalability of AWGRs and the power-hungry
buffering.
• Petabit [587]: Petabit is a bufferless and high-radix OPS architecture that utilizes three stages of AWGRs
(input, central, and output) in addition to TWCs, as depicted in Figure 13(e). At each stage, the wavelength
can be tuned to a different one according to the contention at the next stage. However, electronic buffering
at the ToRs and effective scheduling are required to achieve high throughput. Petabit can scale without
impacting latency and thus can guarantee high performance for applications even at large scales.
• Free-Space Optics (FSO)-based data centers: The use of FSO links steered via ceiling mirrors to interconnect
ToRs in DCNs was proposed in several studies such as FireFly [588], and Patch Panels [589].

Fig. 13. Examples of hybrid/all optical switching DCNs (a) c-Through, (b) Helios, (c) OSA/Proteus, (d)
DOS, and (e) Petabit.

C. HPC Clusters and Disaggregated Data Centers:


The DCNs summarized in the previous two Subsections target production and enterprise data centers with
general-purpose usage. Alternatively, HPC clusters target a specific set of compute-intensive applications and are
thus designed with specialized servers and accelerators such as Graphical Processing Units (GPUs), in addition to
dedicated high-speed interconnects to parallel storage systems such as InfiniBand and Myrinet, Torus
or mesh topologies, and photonic interconnects for chip-chip, board-board, blade-blade, and rack-rack links [590]-
[592]. According to the International Data Corporation (IDC), more than 67% of HPC facilities performed big
data analytics [78]. Another variation on conventional data centers is disaggregating the CPU, memory, IO, and
network resources at the rack, pod, or entire data center level to achieve utilization and power efficiency advantages
over legacy single-box servers [593]-[599]. Combinations of big data applications with uncorrelated resource
demands can be deployed in disaggregated data centers with higher utilization, energy efficiency, and performance
[593].
D. Characteristics of Traffic inside Data Centers:
Traffic characteristics within enterprise, cloud, social networking, and university campus data centers have
been reported and analysed in [600]-[604] to provide several insights about traffic patterns, volume variations,
and congestion, in addition to various flows statistics such as their duration, arrivals, and inter-arrival times. Intra
data center traffic is mainly composed of different mixes of data center applications traffic including retrieval
services and big data analytics, and provisioning operations such as data transfers, replication and backups. The
first three pioneering studies by Microsoft Research [600]-[602] pointed out that traffic monitoring tools used by ISPs
in WANs do not suit data centers environments as their traffic characteristics are not statistically similar. The
authors in [600] utilized low-overhead socket-level instrumentation at 1500 servers to collect application-aware
traffic information. This cluster contained diverse workloads including MapReduce style workloads that generated
10GB per server per day. Two traffic patterns were defined: Work-Seeks-Bandwidth, for engineered applications
with high locality, and Scatter-Gather, for applications that require servers to push or pull data from several others.
It was found that 90% of the traffic stays inside the racks and that 80% of the flows last less than 10s, while less
than 0.1% last more than 200s. The traffic patterns showed transient or stable spikes and the flows inter-arrivals
were periodic short term bursts spaced by an average of 15ms and a maximum of 10s. In [601], an empirical study
for traffic patterns in 19 tree-based two-tier and three-tier data centers for web-based services was carried out based
on coarse-grained measurements for links utilization and packets loss rate taken from Simple Network
Management Protocol (SNMP) logs at routers and switches every five minutes. Average link loads were found
to be highest at the core switches and the highest packet losses were found at the edge (i.e. ToR) switches. Additionally,
fine-grained packet-level statistics at five edge switches in a smaller cluster showed a clear ON-OFF intensity and
log-normal inter-arrivals during ON intervals. Reverse engineering methods to obtain fine-grained characteristics
from SNMP logs were suggested for data center traffic generators and simulators.
SNMP logs were also utilized in [602] to study the traffic empirically while considering broader data centers
usages and topologies. Those included 10 data centers with two-tier, three-tier, star-like, and Middle-of-Rack
switches-based (i.e. connecting servers in several racks) topologies for university campuses, private enterprises,
in addition to web-based services and big data analytics cloud data centers. The send/receive patterns of
applications including authentication services, HTTP and secure HTTP-based, and local-use applications were
examined at the flow level to measure their effects on link utilization, congestion, and packet drop rates. It was
found that 80% of the traffic stayed inside the racks in cloud data centers, while 40-90% left the racks in
universities and enterprises data centers. Fine-grained packet traces from selected switches in 4 DCNs indicated
that 80% of the flows are small (i.e. ≤ 10 kB), active flows were less than 10,000 per second per rack, and inter-
arrivals were less than 10 µs for 2-13% of the flows. The recent studies [603] and [604] presented the traffic
characteristics inside Facebook's 4-tier Clos-based data center that hosts hundreds of thousands of 10 Gbps
servers. In [603], network-wide monitoring tools and per-host packet-level traces were utilized to characterize the traffic
while focusing on its locality, stability, and predictability. The practice in this architecture recommends assigning
each machine to one role (e.g. cache, web server, Hadoop), localizing each type of workload in certain racks, and
varying the level of oversubscription according to the workload needs. Traffic was found to be neither fully rack-
local nor all-to-all, and without ON-OFF behaviour. Also, it was found that servers communicated with up to 100s
of servers concurrently, most of the flows were long-lived, and non-Hadoop packets were <200 Bytes. To capture
fine-grained network behaviours such as µbursts (i.e. high utilization events lasting <1 ms), high-resolution
measurements at the rack-level with granularity of tens/hundreds of µs were utilized in [604] for the same data
center and application sets in the previous study. The measurements were based on a developed counter collection
framework that polls packet counters and buffer utilization statistics every 25µs, and 50µs, respectively. It was
found that high utilization events were short-lived as more than 70% of bursts lasted at most for tens of µs, and
that load is very unbalanced as web and Hadoop racks had hot downlink ports while Cache had uplink hot ports.
90% of bursts lasted < 200µs for all application types, and < 50µs for Web racks, and the highest tail was recorded
for Hadoop racks at 0.5 ms. It was noticed that the packets included in µbursts are larger than those outside the bursts,
µbursts were caused by application behavioural changes, and that the arrival rate of µbursts was not Poisson with
40% of inter-arrivals being > 100 µs for Cache and Web racks. Regarding the impact of µbursts on shared buffers,
Hadoop racks had port buffers at > 50% utilization, while web and cache racks had a maximum of 71% and 64%
of their port buffers at high utilization. Latency and packet loss measurements between VMs in different public
clouds were performed and presented in [605] through a developed tool, PTPmesh, that aids cloud users in
monitoring network conditions. Results for one-way messaging delay between data centers in the same and
different clouds were shown to range between µs and ms values. Specific traffic measurements for big data
applications were presented in [279] and three traffic patterns were reported: single peak, repeated fixed-width peaks,
and peaks with varying heights and widths.
To forecast Hadoop traffic demands in cloud data centers, HadoopWatch which utilizes real-time file system
monitoring tools was proposed in [606]. HadoopWatch monitors the meta-data and log files of Hadoop to extract
accurate traffic information for importing, exporting, and shuffling data flows before entering the network by few
seconds. Based on coflows information, a greedy joint optimization for scheduling and routing flows improved
jobs completion time by about 14%. Several recent studies tackled improving the profiling, estimation, and
generation of data center traffic to aid in examining DCN TE, congestion control, and load balancing methods
[607]-[610]. The authors in [607] applied sketch-based streaming algorithms to profile data center traffic while
considering its skewness among different services. To model spatial and temporal characteristics of traffic in
large-scale DC systems, ECHO was developed in [610] as a scalable topology and workload-independent
modeling scheme that utilizes hierarchical Markov Chains at the granularities of groups of racks, racks and
servers. CREATE, a fine-grained Traffic Matrix (TM) estimation scheme proposed in [608], utilized the sparsity of traffic
between ToR switches to remove underutilized links. Then, the correlation coefficient between ToR switches in
the reduced topology is extracted via SNMP counters and service placement logs to estimate the TM. In [609],
random number generators for Poisson shot-noise processes were utilized to design a realistic flow-level TM generator
between hosts in tree-based DCNs, which considers flow arrival rates, the ratio of rack-local flows, and flow durations
and sizes, and can be used alongside packet-level traffic generators for DCN simulations.
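As a rough illustration of how such flow-level generators operate, the following Python sketch (our own simplification, with illustrative parameter values rather than measured ones) draws flow arrivals from a Poisson process, flow sizes from a heavy-tailed distribution, and marks a configurable fraction of flows as rack-local:
import random

def generate_flows(num_flows, arrival_rate, rack_local_ratio, num_racks):
    """Generate a simple flow-level trace: (arrival_time, src_rack, dst_rack, size_bytes)."""
    flows, t = [], 0.0
    for _ in range(num_flows):
        t += random.expovariate(arrival_rate)        # Poisson arrivals
        size = int(random.paretovariate(1.2) * 1e4)  # heavy-tailed sizes (illustrative parameters)
        src = random.randrange(num_racks)
        if random.random() < rack_local_ratio:
            dst = src                                 # rack-local flow
        else:
            dst = random.choice([r for r in range(num_racks) if r != src])
        flows.append((t, src, dst, size))
    return flows

trace = generate_flows(num_flows=1000, arrival_rate=100.0, rack_local_ratio=0.8, num_racks=10)
Such a trace can then drive packet-level simulators, with the distributions and ratios replaced by values fitted to the measurements discussed above.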
E. Intra Data Centers Routing Protocols and Traffic Scheduling Mechanisms:
Routing protocols, which define the rules for choosing the paths for flows or flowlets between source and
destination servers, were extensively surveyed in [611]-[616]. Routing in DCNs can be static or adaptive where
path assignments can change dynamically according to criteria measured by a feedback mechanism. Adaptive routing or
traffic scheduling can be centralized where a single controller is required to gather network-wide information and
to distribute routing and rate decisions to switches and servers, or distributed where the decisions are taken
independently by the switches or servers according to local decisions based on a partial view of the network.
Centralized mechanisms provide optimal decisions but have limited scalability while distributed mechanisms are
scalable but not always optimal.
Tree-based data centers such as three-tier designs typically utilize VLAN with Spanning Tree Protocol (STP),
which is a simple Layer 2 protocol that eliminates loops by disabling redundant links and forcing the traffic to
route through core switches. Spine-leaf DCNs typically use improved protocols such as Transparent
Interconnection of Lots of Links (TRILL) or Shortest Path Bridging (SPB) that enable the utilization of all
available links while ensuring loop-free routing. CONGA [617] was proposed as a distributed flowlet routing
mechanism for spine-leaf data centers, and achieves load balancing by utilizing leaf-to-leaf congestion feedback.
Improved tree-based DCNs such as Fat-tree and server-centric DCNs require designing their routing protocols
closely with their topological properties to fully exploit the topology. For example, Fat-tree requires specific
routing with two-level forwarding tables for servers with fixed pre-defined addresses [536]. To select paths
according to network bandwidth utilization in Fat-tree, DARD was proposed in [618] as a host-based distributed
adaptive routing protocol. For agility, VL2 uses two addresses for servers; a Location-specific Address (LA), and
an Application-specific Address (AA) [537]. For packets forwarding, VL2 employs Equal Cost Multi-Path
(ECMP), which is a static Layer 3 routing protocol that distributes flows to paths by hashing, and Valiant Load
Balancing (VLB), that randomly selects intermediate nodes between a source and a destination. BCube employs
a Source Routing protocol (BSR) [542], DCell adopts a distributed routing protocol (DFR) [545], and JellyFish
[547] uses a k-shortest paths algorithm. c-Through uses Edmonds' algorithm to obtain the MEMS
configurations from the traffic matrix, then the ToR switches' traffic is sent via VLAN-based routing into the OCS
or the EPS [581]. Helios has a complex control scheme of three modules; a Topology Manager (TM), a Circuit Switch
Manager (CSM), and a Pod Switch Manager (PSM) [582]. Mordia utilizes a Traffic Matrix Scheduling (TMS)
algorithm that obtains effective short-lived circuit schedules, based on predicted demands, that can be applied to
configure the MEMS and WSSs sequentially. OSA and Proteus solve a maximum-weight b-matching problem
to enable the connection of multiple ToR switches, configure the WSSs to match capacities, and then use shortest-
path-based routing [584].
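The ECMP mechanism referred to above (and revisited in several studies later in this survey) can be summarized with a short sketch. The following Python example is our own illustration of hash-based path selection; real switches use hardware hash functions over the packet header rather than this software version:
import hashlib

def ecmp_select_path(flow_5tuple, paths):
    """Pick one of the equal-cost paths by hashing the flow's 5-tuple.
    All packets of a flow hash to the same path, so two large flows can
    collide on one path while other paths stay idle."""
    key = "-".join(str(field) for field in flow_5tuple).encode()
    digest = int(hashlib.md5(key).hexdigest(), 16)
    return paths[digest % len(paths)]

paths = ["core-1", "core-2", "core-3", "core-4"]
flow = ("10.0.0.1", "10.0.1.2", 43512, 5001, "TCP")
print(ecmp_select_path(flow, paths))
Because the selection is oblivious to load, hash collisions between elephant flows are the main reason ECMP is repeatedly outperformed by adaptive schemes in the studies reviewed below.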
Using the Internet's Transmission Control Protocol (TCP) in data center environments has been shown to
be inefficient [613] due to the difference in the nature of their traffic, the higher sensitivity to incast, and the key
requirement in data center applications of minimizing Flow Completion Time (FCT). Thus, different transport
protocols were proposed for DCNs. DCTCP [619] provides similar or better throughput than TCP and guarantees
low Round Trip Time (RTT) by active control of queue lengths in the switches. MPTCP [620] splits flows into sub-
flows and balances the load across several paths via linked congestion control. However, it might perform
excessive splitting which requires extensive CPU and memory resources at end hosts. D2TCP [621] is a Deadline-
aware Data center TCP protocol that considers single paths for flows and performs load balancing. For FCT
reduction, D3 proposed in [622] uses flow deadline information to control the transmission rate. pFabric [623]
and PDQ [624] enable the prioritization of the flows closest to completion, and DeTail [625] splits flows and
performs adaptive load balancing based on queues occupancy to reduce the highest FCT. Alternatively, the
centralized schedulers; Orchestra [258], Varys [262], and Baraat [263], which are further elaborated in Subsection
VII-C, target reducing the completion time of coflows which are sets of flows with applications-related semantics
such as intermediate data shuffling in MapReduce.
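DCTCP's low queueing delay comes from scaling the congestion window by the fraction of ECN-marked packets rather than halving it on every congestion event. The following Python sketch is a simplified, per-RTT version of the sender-side rule described in [619] (slow start, timeouts, and many other details are omitted; g is the weighting factor, with 1/16 as an assumed default):
def dctcp_update(cwnd, alpha, marked, total, g=1.0 / 16):
    """One round-trip of DCTCP-style window adjustment.
    marked/total is the fraction of packets that carried ECN marks this RTT."""
    frac = marked / total if total else 0.0
    alpha = (1 - g) * alpha + g * frac           # smoothed estimate of congestion extent
    if marked:
        cwnd = max(1.0, cwnd * (1 - alpha / 2))  # gentle, proportional back-off
    else:
        cwnd += 1.0                              # standard additive increase
    return cwnd, alpha

cwnd, alpha = 100.0, 0.0
for marked in [0, 0, 20, 40, 0]:                 # marked packets observed in successive RTTs
    cwnd, alpha = dctcp_update(cwnd, alpha, marked, total=100)
    print(round(cwnd, 1), round(alpha, 3))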
Virtualization in data centers was surveyed in [626] with a focus on routing and resources management and
in [627] while focusing on the techniques for network isolation in multi-tenant data centers. SDN has been widely
considered for data centers as it can improve the load balancing, congestion detection and mitigation [421], [431],
[628]. To allow users to make bandwidth reservations in data centers for their VM-to-VM communications, a
centralized controller is used to determine the rate and path for each user's flow. SecondNet was proposed in [629]
and is such a controller. Hedera in [630] detects elephant flows and maximizes their throughput via a centralized
SDN-based controller. ElasticTree in [631] improves the energy efficiency of Fat-tree DCNs by dynamically
switching-off sets of links and switches while meeting demands and maintaining fault-tolerance.
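Centralized flow scheduling of this kind typically reduces to polling per-flow counters and re-pinning large flows to the least-loaded path. The Python sketch below is a generic illustration inspired by Hedera-style scheduling, not the actual controller logic of [630]; the 100 MB threshold and all names are illustrative assumptions:
ELEPHANT_BYTES = 100 * 1024 * 1024   # assumed elephant-flow threshold

def reroute_elephants(flow_counters, flow_paths, path_loads):
    """flow_counters: flow_id -> bytes sent; flow_paths: flow_id -> current path;
    path_loads: path -> aggregate load. Moves elephants to the least-loaded path."""
    decisions = {}
    for flow_id, sent in flow_counters.items():
        if sent < ELEPHANT_BYTES:
            continue                                  # mice flows stay on their ECMP-chosen paths
        current = flow_paths[flow_id]
        best = min(path_loads, key=path_loads.get)    # least-loaded candidate path
        if best != current:
            path_loads[current] -= sent
            path_loads[best] += sent
            decisions[flow_id] = best                 # would be installed as forwarding rules
    return decisions

counters = {"f1": 5e8, "f2": 1e6}
paths = {"f1": "core-1", "f2": "core-2"}
loads = {"core-1": 5e8, "core-2": 1e6, "core-3": 0.0}
print(reroute_elephants(counters, paths, loads))      # -> {'f1': 'core-3'}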
F. Energy Efficiency in Data Centers:
The energy consumption in data centers is attributed to servers and storage, networking devices, in addition to
cooling, powering, and lighting facilities, with percentages of 26%, 10%, 50%, 11%, and 3%, respectively, of the
total energy consumption [632]. As the energy consumption of servers is becoming more proportional to their load,
and hence their energy efficiency is improving faster, the share of networking is expected to increase [633]. In [632],
techniques for modeling the data centers energy consumption were comprehensively surveyed where they were
divided into hardware-centric and software-centric approaches. In [634], green metrics including the Power
Usage Effectiveness (PUE) (defined as the total facility power over the IT equipment power), and measurement
tools that can characterise emissions were surveyed to aid in sustaining distributed data centers.
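For concreteness, PUE can be computed directly from a facility's power breakdown. The short Python sketch below uses hypothetical power figures consistent with the indicative percentages above (IT equipment, i.e. servers, storage, and networking, at 36% of a 1000 kW facility); it is only a worked illustration of the definition:
def pue(it_power_kw, cooling_kw, power_distribution_kw, lighting_kw):
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    total = it_power_kw + cooling_kw + power_distribution_kw + lighting_kw
    return total / it_power_kw

# Hypothetical 1000 kW facility: IT = 360 kW, cooling = 500 kW, power delivery = 110 kW, lighting = 30 kW
print(round(pue(it_power_kw=360, cooling_kw=500, power_distribution_kw=110, lighting_kw=30), 2))  # ~2.78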
Several studies considered reducing the energy consumption and costs in data centers at different levels [635]-
[639]. At the hardware level, dynamically switching off idle components, designing hardware with inherently
more efficient components, DVFS, and utilizing optical networking elements were considered. For
example, to improve the energy proportionality of Ethernet switches, the Energy Efficient Ethernet (EEE) standard
[639] was developed. EEE enables three states for interfaces which are active, idle with no transmission, and low
power idle (i.e. deep sleep). Although EEE has gained industrial adoption, its activation is not advised due to
uncertainty with its impact on applications performance [283]. Placement of workloads and VMs into fewer
servers, and scheduling tasks to shave peak power usage were also proposed to balance the power consumption
and utilization in data centers.
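Workload consolidation of this kind is often approximated with simple bin-packing heuristics. The Python sketch below is our own generic illustration (first-fit decreasing on a single CPU dimension) of packing VMs onto the fewest servers so that idle machines can be switched off; it is not any specific scheme from the cited works:
def consolidate(vm_loads, server_capacity):
    """Place VM CPU demands onto as few servers as possible (first-fit decreasing),
    so that unused servers can be switched off or put into a low-power state."""
    servers = []                                  # residual capacity of each powered-on server
    placement = {}
    for vm, load in sorted(vm_loads.items(), key=lambda kv: -kv[1]):
        for i, residual in enumerate(servers):
            if load <= residual:
                servers[i] -= load
                placement[vm] = i
                break
        else:                                     # no existing server fits: power on a new one
            servers.append(server_capacity - load)
            placement[vm] = len(servers) - 1
    return placement, len(servers)

print(consolidate({"vm1": 0.6, "vm2": 0.5, "vm3": 0.3, "vm4": 0.2}, server_capacity=1.0))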

VII. DATA CENTERS-FOCUSED OPTIMIZATION STUDIES

This Section summarizes a number of big data applications optimization studies that consider the
characteristics of their hosting data centers including details such as improving their design and protocols or
analyzing the impact of their computing and networking parameters on the applications’ performance. Subsection
VII-A addresses the performance, scalability, flexibility, and energy consumption improvements and tradeoffs for
big data applications under various data centers topologies and design considerations. Subsection VII-B focuses
on the studies that improve intra data centers routing protocols to enhance the performance of big data applications
while improving the load balancing and utilization of the data centers. Subsection VII-C discusses flows, coflows,
and jobs scheduling optimization studies to achieve different applications and data centers performance goals.
Finally, Subsection VII-D addresses the studies that utilize advanced technologies to scale up big data
infrastructures and improve their performance. The studies presented in this Section are summarized in Tables V,
and VI.
A. Data Center Topology:
Evaluating the performance and energy efficiency of big data applications in different data centers topologies
was considered in [204]-[209]. The authors in [204] modeled Hadoop clusters with up to 4 ToR switches and a
core switch to measure the influence of the network on the performance. Several simplifications such as
homogeneous servers and uniform data distribution were applied and model-based and experimental evaluations
indicated that Hadoop scaled well enough under 9 different cluster configurations. The MRPerf simulator was
utilized in [205] to study the effect of the data center topology on the performance of Hadoop while considering
several parameters related to clusters (e.g. CPU, RAM, and disk resources), configurations (e.g. chunk size,
number of map and reduce slots), and framework (e.g. data placement and task scheduling). DCell was compared
to star and double-rack clusters with 72 nodes under the assumptions of 1 replica and no speculative execution
and was found to improve sorting by 99% compared to double-rack clusters. The authors in [206] extended
CloudSim simulator [110] as CloudSimExMapReduce to estimate the completion time of jobs in different data
center topologies with different workload distributions. Compared to a hypothetically optimal topology for
MapReduce with a dedicated link for each intermediate data shuffling flow, CamCube provided the best
performance. Different levels of intermediate data skew were also examined and worse performance was reported
for all the topologies. In [207], we examined the effects of the data center network topology on the performance
and energy efficiency of shuffling operations in MapReduce with sort workloads in different data centers with
electronic, hybrid and all-optical switching technologies and different rate/server values. The results indicated that
optical switching technologies achieved an average power consumption reduction by 54% compared to electronic
switching data centers with comparable performance. In [208], the Network Power Effectiveness (NPE) defined
as the ratio between the aggregate throughput and the power consumption was evaluated for six electronic
switching data center topologies under regular and energy-aware routing. The power consumption of the switches,
the servers' NIC ports and CPU cores used to process and forward packets in server-centric topologies were
considered. The results indicated that FBFLY achieved the highest NPE followed by the server-centric data
centers, and that NPE is slightly impacted by the topology size as the number of switches scales almost linearly
with the data center size for the topologies examined. Design choices such as link speeds, oversubscription ratio,
and buffer sizes in spine and leaf architectures with realistic web search queries with Poisson arrivals and heavy-
tail size distribution were examined by simulations in [209]. It was found that ECMP is efficient only at link
capacities higher than 10 Gbps, as lower capacities resulted in 40% degradation in the performance compared to an ideal
non-blocking switch. Higher oversubscription ratios degraded the performance only at loads of 60% and higher.
Examining spine and leaf switch queue sizes revealed that it is better to maintain consistent queue sizes across tiers and
that additional buffer capacity is more beneficial at leaf switches.
Flexible Fat-tree is proposed in [210] as an improvement and generalization of the Fat-tree topology in [536]
to achieve higher aggregate bandwidth and richer paths by allowing an uneven number of aggregation and access
switches in the pods. With more aggregation switches, shuffling results indicated about 50% improvement in the
completion time. As a cost-effective solution to improve oversubscribed production data centers, the concept of
flyways, in which additional on-demand wireless or wired links are provisioned between congested ToRs, was introduced in [211]
and further examined in [212]. Under sparse ToR-to-ToR bandwidth requirements, the results indicated that few
flyways allocated in the right ToR switches improved the performance by up to 50% bringing it closer to 1:1 DCNs
performance. The flyways can be 802.11g, 802.11n, or 60 GHz wireless links, or random wired connections for
a subset of the ToR switches via commodity switches. The wired connections, however, cannot help if the
congestion is between unconnected ToRs. A central controller is proposed to gather the demands and utilize MPLS
to forward the traffic over the oversubscribed link or one of the flyways. In [213], a spectrum efficient and failure
tolerant design for wireless data centers with 60 GHz transceivers was examined for data mining applications. A
spherical rack architecture based on bimodal degree distribution for the servers’ connections was proposed to
reduce the hop count and hence the transmission time compared to a traditional wireless data center with
cylindrical racks and same-degree Cayley graph connections. Challenges related to interference, path loss, and the
optimization of hub servers’ selection were addressed to improve the data transmission rate. Moreover, the
efficiency of executing MapReduce in failure-prone environments (due to software and hardware failures) was
simulated.
Several big data frameworks that tailor their computations to the data center topology or utilize their
properties were proposed as in [214]-[216]. Camdoop in [214] is a MapReduce-like system that runs on CamCube
and exploits its topology by aggregating the intermediate data along the path to reduce workers. A window-based
flow control protocol and independent disjoint spanning trees with the same root per reduce worker were used to
provide load balancing. Camdoop achieved improvements over switch-centric Hadoop and Dryad, and over
Camdoop with TCP and Camdoop without aggregation. In [215], network-awareness and utilization of existing
or attached networking hardware were proposed to improve the performance of query applications. In-network
processing in ToR switches with attached Network-as-a-Service (NaaS) boxes was examined to partially reduce
the data and hence reduce bandwidth usage and increase the queries throughput. For API transparency, a shim
layer is added to perform software-based custom routing for the traffic through the NaaS boxes. A RAM-based
key-value store in BCube [542], RAMCube, was proposed in [216] to address false failure detection in large data
centers caused by network congestion, entire rack blockage due to ToR switch failure, and the traffic congestion
during recovery. A symmetric multi-ring arrangement that restricts failure detection and recovery to one hop in
BCube is proposed to provide fast fault recovery. Experimental evaluation for the throughput under single switch
failure with 1 GbE NIC cards in the servers indicated that a maximum of 8.3 seconds is needed to fully transmit
data from a recovery to a backup server.
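The gain from Camdoop-style on-path aggregation described above comes from combining values for the same key at each hop toward the reduce worker, so the data volume shrinks as it converges. The following minimal Python sketch is our own illustration for an associative reduce function (word counting), not Camdoop's implementation:
from collections import Counter

def aggregate_along_tree(children_outputs):
    """Combine partial word counts at an intermediate node before forwarding them
    upstream, so only one (key, value) pair per key continues toward the root."""
    combined = Counter()
    for partial in children_outputs:   # partial results arriving from child nodes
        combined.update(partial)       # Counter.update adds counts for matching keys
    return combined

# Two children forward their partial counts; the parent forwards one merged set.
child_a = Counter({"cloud": 4, "data": 2})
child_b = Counter({"cloud": 1, "hadoop": 3})
print(aggregate_along_tree([child_a, child_b]))   # Counter({'cloud': 5, 'hadoop': 3, 'data': 2})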
The topologies of data centers were also considered in optimizing VM assignments for various applications
as in [217]-[220]. A Traffic-aware VM Placement Problem (TVMPP) to improve the scalability of data centers
was proposed in [217]. TVMPP follows a two-tier approximation algorithm that leverages knowledge of traffic
demands and the data center topology to co-allocate VMs with heavy traffic in nearby hosts. First, the hosts and
the VMs are clustered separately and a 1-to-1 mapping that minimizes the aggregated traffic cost is performed.
Then, each VM within each cluster is assigned to a single host. The gain as a result of TVMPP compared to
random placements for different traffic patterns was examined and the results indicated that multi-level
architectures such as BCube benefit more than tree-based architectures and that heterogeneous traffic leads to
more benefits. To tackle intra data center network performance variability in multi-tenant data centers with
network-unaware VM-based assignments, the work in [218] proposed Oktopus as an online network abstraction
and virtualization framework to offer minimum bandwidth guarantees. Oktopus formulates virtual or
oversubscribed virtual clusters to suit different types of cloud applications in terms of bandwidth requirements
between their VMs. These allocations are based on greedy heuristics that are exposed to the data center topology,
residual bandwidths in links, and current VMs allocation. The results showed that allocating VMs while
accounting for the oversubscription ratio improved the completion time and reduced tenant costs by up to 75%
while maintaining the revenue. In [219], a communication cost minimization-based heuristic, Traffic Amount
First (TAF), was proposed for VM-to-PM assignments under architectural and resource constraints and was
examined in three data center topologies. Inter-VM traffic was reduced by placing VMs with higher mutual traffic
in the same PM as much as possible. A topology-independent resource allocation algorithm, namely NetDEO,
was designed in [220] based on swarm optimization to gradually reallocate existing VMs and allocate newly
accepted VMs based on matching resources and availability. NetDEO maintains the performance during network
and server upgrades and accounts for the topology in the VM placements.
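The common thread in these placement schemes is to co-locate VM pairs with heavy mutual traffic on hosts that are close in the topology. The following Python sketch is a greedy simplification of that idea (not the actual TVMPP or TAF algorithms), pairing the heaviest-communicating VMs onto the same host first, subject to host capacity:
def traffic_aware_placement(vm_traffic, hosts, slots_per_host):
    """vm_traffic: (vm_a, vm_b) -> exchanged bytes. Greedily co-locate the
    chattiest VM pairs on the same host while capacity allows."""
    placement, used = {}, {h: 0 for h in hosts}
    for (a, b), _ in sorted(vm_traffic.items(), key=lambda kv: -kv[1]):
        for vm in (a, b):
            if vm in placement:
                continue
            # Prefer the host already holding the partner VM, if it has room
            partner_host = placement.get(b if vm == a else a)
            candidates = ([partner_host] if partner_host is not None else []) + list(hosts)
            for h in candidates:
                if used[h] < slots_per_host:
                    placement[vm] = h
                    used[h] += 1
                    break
    return placement

traffic = {("v1", "v2"): 9e9, ("v3", "v4"): 4e9, ("v1", "v3"): 1e6}
print(traffic_aware_placement(traffic, hosts=["h1", "h2"], slots_per_host=2))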
The performance of big data applications in SDN-controlled electronic and hybrid electronic/optical
switching data centers topologies was considered in [221]-[228]. To evaluate the impact of networking
configurations on the performance of big data applications in SDN-controlled multi-rack data centers before
deployments, a Flow Optimized Route Configuration Engine (FORCE) was proposed in [221]. FORCE emulates
building virtual topologies with OVS over the SDN-controlled physical network to enable optimizing the
network and enhancing the applications' performance at run-time, and improvements by up to 2.5 times were
achieved. To address big data applications and their need for frequent reconfigurations, the work in [222] examined
a ToR-level SDN-based topology modification in a hybrid data center with core MEMS switch and electrical
Ethernet-based switches at run-time. Different topology construction and routing mechanisms that jointly
optimize the performance and network utilization were proposed for single aggregation, shuffling, and partially
overlapped aggregation communication patterns. The authors accounted for reconfiguration delays by starting the
applications early and accounted for the consistency in routing tables updates. The work in [223] experimentally
examined the performance of MapReduce in two hybrid electronic/optical switching data centers namely c-
Through and Helios with SDN control. An “observe-analyze-act” control framework based on OpenFlow was
utilized for the configurations of the OCS and the packet networks. The authors addressed the hardware and
software challenges and emphasized the need for near real-time analysis of application requirements to
optimally obtain hybrid switch scheduling decisions. The work in [225] addressed the challenges associated with
handling long-lived elephant flows of background applications while running Hadoop jobs in Helios hybrid data
centers with SDN control. Although SDN control for electronic switches can provide alternative routes to reduce
congestion and allow prioritizing packets in the queues, such techniques largely increase the switches' CPU and
RAM requirements with elephant flows. Alternatively, [224] proposed detecting elephant flows and redirecting
them via the high bandwidth OCS, to improve the performance of Hadoop. To reduce the latency of multi-hop
connections between servers in 2D Torus data centers, the work in [225] proposed the use of SDN-controlled
MEMS to bypass electronic links and directly connect the servers. Results based on an emulation testbed and all-
to-all traffic pattern indicated that optical bypassing can reduce the end-to-end latency for 11 of the 15 hosts by
11%. To improve the efficiency of multicasting and incasting in workloads such as HDFS read, join, VM
provisioning, and in-cluster software updates, a hybrid architecture based on Optical Space Switches (OSS) was
proposed in [225] to establish point-to-point links on-demand to connect passive splitters and combiners. The
splitters transparently duplicate the data optically at the line rate and the combiners aggregate incast traffic under
orchestration system control with TDM. Compared with small scale electronic non-blocking switches, similar
performance was obtained indicating potential gains with the optical accelerators at larger scale, where non-
blocking performance is not attained by electronic switches. Unlike the above work which fully offloads multicast
traffic to the optical layer, HERO in [227] was proposed to integrate optical passive splitters and FSO modules
with electronic switching to handle multicast traffic. During the optical switches configuration, HERO multicasts
through the electronic switches, then migrates to optical multicasting. HERO exhibited a linear increase in the
completion time as flow sizes increased, and significantly outperformed the binomial and ring Message
Passing Interface (MPI) broadcasting algorithms over electronic switching only, for messages less than or equal
to 12 kBytes and greater than 12 kBytes in size, respectively.
A software-defined and controlled hybrid OPS and OCS data center is proposed and examined in [228] for
multi-tenant dynamic Virtual Data Center (VDC) provisioning. A topology manager was utilized to build the
VDCs with different provisioning granularities (i.e. macro and micro) in an Architecture-on-Demand (AoD) node
with OPS, and OCS modules in addition to ToRs and servers. For VMs placement, a network-aware heuristic that
jointly considers computing and networking resources and takes the wavelength continuity, optical device
heterogeneity, and the VM dynamics into account was considered. Improvements by 30.1% and 14.6% were
achieved by the hybrid topology with 8, and 18 wavelengths per ToR switch, respectively, compared to using
OCS only. The work in [229] proposed a 3D MEMS crossbar to connect server blades for scalable stream
processing systems. Software-based control, routing, and scheduling mechanisms were utilized to adapt the
topology to graph computational needs while accounting for MEMS reconfiguration and protocol stack delays.
To overcome the high blocking ratio and scalability limitations of single hop MEMS-based OCS, the authors in
[230] proposed a distributed multi-hop OCS that utilizes WDM and SDM technologies integrated with multi-
rooted tree-based data centers. A multi-wavelength optical switch based on Microring Resonators (MRs) was
designed to ensure fast switching. A modification to Fat-tree by replacing half of the electronic core, aggregation,
and access switches with the OCS was proposed and distributed control was utilized at each optical switch with
low bandwidth copper links to interconnect and combine control with EPS. Compared with Hedera and DARD,
much faster optical path setup (i.e. 126.144 µs) was achieved with much lower control messaging overhead.
B. Data Center Routing:
In [231], a “reduce tasks” placement problem was analyzed in multi-rack environments to decrease cross-
rack traffic based on two greedy approaches. Compared to original Hadoop with random reduce task placements,
up to 32% speedup in completion time among different workloads was achieved. A scalable DCN-aware load
balancing technique for key distribution and routing in the shuffling phase of MapReduce was proposed in [232]
while considering DCN bandwidth constraints and addressing data skewness. A centralized heuristic with two
subproblems, network flow and load balancing, was examined and compared to three state-of-the-art techniques:
the load balancing-based LPT, the fairness-based LEEN, and the default routing algorithm with hash-based key
distribution in MapReduce. For synthetic and realistic traffic, the network-aware load balancing algorithm
outperformed the others by 40% in terms of completion time and achieved a maximum load per reduce task comparable
to that of LPT. To improve shuffling and reduce its routing costs under varying data sizes and data reduction
ratios, a joint intermediate data partitioning and aggregation scheme was proposed in [233]. A decomposition-
based distributed online algorithm was proposed to dynamically adjust data partitioning by assigning keys with
larger data sizes to reduce tasks closer to map tasks while optimizing the placement and migration of aggregators
that merge the same key traffic from multiple map tasks before sending them to remote reduce tasks. For large
scale computations (i.e. 100 keys), and compared to random hash-based partitioning with no aggregation and with
random placement of 6 aggregators, the scheme resulted in 50% and 26% reductions in the completion time,
respectively. DCP was proposed in [234] as an efficient and distributed cache sharing protocol to reduce the intra
data center traffic in Fat-tree data centers by eliminating the need to retransmit redundant packets. It utilized a
packet cache in each switch for the eliminations and a Bloom filter header field to store and share cached packet
information among switches. Simulation results for 8-ary fat-tree data center showed that DCP eliminated the
retransmission by 40-60%. To effectively use the bandwidth in BCube data centers, the work in [235] proposed
and optimized two schemes for in-network aggregation at the servers and switches. The first for incast traffic was
modeled as a minimal incast aggregation tree problem, and the second for shuffling traffic was modeled as a minimal
shuffle aggregation subgraph problem. Efficient 2-approximation algorithms named IRS-based and SRS-based
were suggested for incast and shuffling traffic, respectively. Moreover, an effective forwarding scheme based on
in-switch and in-packet Bloom filters was utilized at a cost of 10 Bytes per packet to ease related flows
identification. The results for SRS-based revealed traffic reductions of 32.87% for a small BCube, and 53.33%
for a large-scale BCube with 262,144 servers.
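A simplified view of the network-aware key distribution discussed above ([232], [233]) is to assign each key to the reduce task with the smallest expected load, weighting keys by their intermediate data size rather than hashing them blindly. The Python sketch below is our own greedy, size-weighted illustration, similar in spirit to the LPT baseline mentioned above; the actual heuristic of [232] additionally accounts for DCN bandwidth constraints:
def assign_keys(key_sizes, num_reducers):
    """key_sizes: key -> bytes of intermediate data. Greedy longest-processing-time
    style assignment that balances shuffle volume across reduce tasks."""
    loads = [0] * num_reducers
    assignment = {}
    for key, size in sorted(key_sizes.items(), key=lambda kv: -kv[1]):
        target = loads.index(min(loads))      # reducer with the least assigned data so far
        assignment[key] = target
        loads[target] += size
    return assignment, loads

keys = {"k1": 900, "k2": 600, "k3": 500, "k4": 100}
print(assign_keys(keys, num_reducers=2))      # skewed keys are spread across the two reducers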
The optimization of data transfers throughput in scientific applications was addressed in [236] while
considering the impact of combining different levels of TCP flows pipelining, parallelism, and concurrency and
the heterogeneity in the files sizes. Recommendations were presented such as the use of pipelining only for file
sizes less than a threshold related to the bandwidth-delay product, and with different levels related to file size
ranges. To improve the throughput, routing scalability, and upgrade flexibility in electronic switching data centers
with random topologies, Space Shuffle (S2) was proposed in [237] as a greedy key-based multi-path routing
mechanism that operates in multiple ring spaces. Results based on fine-grained packet-level simulations that
considered the finite sizes of shared buffers and forwarding tables and the acknowledgment (ACK) packets
indicated improved scalability compared to Jellyfish, and higher throughput and shorter paths than SWDC and
Fat-tree. However, the overheads associated with packet reordering were not considered. An oblivious distributed
adaptive routing scheme in Clos-based data centers was proposed in [238] and was proven to converge to non-
blocking assignments with minimal out-of-order packet delivery via approximate Markov models and
simulations when the flows are sent at half the available bandwidth. While transmitting at full bandwidth with
strictly non-blocking and rearrangeably non-blocking routing resulted in exponential convergence time, the proposed approach
converged in less than 80 µs in a 1152-node cluster and showed weak dependency on the network size at the cost
of minimal delay due to retransmitting first packets of redirected flows. The work in [239] utilized stochastic
permutations to generate bursty traffic flows and statistically evaluated the expected throughput of several layer
2 single path and multi-path routing protocols in oversubscribed spine-leaf data centers. Those included
deterministic single path selections based on hashing source or destination, worst-case optimal single path, ECMP-
like flow-level multipathing, and a stateful Round Robin-based packet-level multipathing (packet spraying).
Simulation results indicated that the throughput of the ECMP-like multipath routing is less than the deterministic
single path due to flow collisions, as 40% of the flows experienced a 2-fold slowdown, and that packet spraying
outperformed all examined protocols. The authors in [240] proposed DCMPTCP to improve MPTCP through
three mechanisms; Fallback for rACk-local traffic (FACT) to reduce unnecessary sub-flows creation for rack-
local traffic, ADAptive packet scheduler (ADA) to estimate flow lengths and enhance their division, and Signal
sHARE control (SHARE) to enhance the congestion control for the short flows with many-to-one patterns.
Compared with two congestion control mechanisms typically used with MPTCP, LIA and XMP, a 40% reduction
in the transmission time of inter-rack flows was achieved.
The energy efficiency of routing big data applications traffic in data centers was considered in [241]-[243].
In [241], preemptive flow scheduling and energy efficient routing were combined to improve the utilization in
Fat-tree data centers by maximizing switch sharing while exclusively assigning each flow to the needed links during
its schedule. Compared to links bandwidth sharing with ECMP, and flow preemption with ECMP, additional
energy savings by 40% and 30% were achieved, respectively at the cost of increased average completion time. To
improve the energy efficiency of MapReduce-like systems, the work in [242] examined combining VM
assignments with TE. A heuristic, GEERA, was designed to first cluster the VMs via minimum k-cut, and then
assign them via local search, while accounting for the traffic patterns. Compared with other load-balancing
techniques (i.e. Randomized Shortest Path (RSP), and integral optimal and fractional solutions), an average
additional 20% energy saving was achieved, with total average savings of 60% in Fat-tree and 30% in BCube
data centers. A GreenDCN framework was proposed in [243] to green switch-centric data center
networks (with Fat-tree as the focus) by optimizing VM placements and TE while considering network features such
as the regularity and role of switches, and the applications traffic patterns. The energy-efficient VM assignment
algorithm (OptEEA) groups VMs with heavy traffic into super VMs, assigns jobs to different pods through
k-means clustering, and finally assigns super VMs to racks through minimum k-cut. Then, the energy-aware
routing algorithm (EER) utilizes the first fit decreasing algorithm and MPTCP to balance the traffic across a
minimized number of switches. Compared with greedy VM assignments and shortest path routing, GreenDCN
achieved 50% reduction in the energy consumption.
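Most of these energy-aware routing schemes share the same core step: pack flows onto as few links and switches as possible (e.g. via first fit decreasing) and put the rest to sleep, subject to capacity. The Python sketch below is our own illustration of that packing step under these assumptions, not the exact EER algorithm of [243] or the scheme of [241]:
def pack_flows(flow_demands, candidate_paths, link_capacity):
    """flow_demands: flow -> bandwidth; candidate_paths: flow -> list of paths (each a list
    of links). First-fit-decreasing packing; flows that fit nowhere are left unrouted here."""
    link_load = {}
    routing = {}
    for flow, demand in sorted(flow_demands.items(), key=lambda kv: -kv[1]):
        for path in candidate_paths[flow]:
            if all(link_load.get(l, 0) + demand <= link_capacity for l in path):
                for l in path:
                    link_load[l] = link_load.get(l, 0) + demand
                routing[flow] = path
                break
    active_links = set(link_load)                 # everything else is a candidate for sleep mode
    return routing, active_links

demands = {"f1": 0.4, "f2": 0.3, "f3": 0.2}
paths = {f: [["agg1", "core1"], ["agg2", "core2"]] for f in demands}
routing, active = pack_flows(demands, paths, link_capacity=1.0)
print(routing, active)   # all three flows fit on the first path; agg2/core2 can be put to sleep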
The optimization of routing traffic between VMs and in multi-tenant data centers was considered in [244]-
[246]. VirtualKnotter in [244] utilized a two-step heuristic to optimize VM placement in virtualized data centers
while accounting for the congestion due to core switch over-subscription and unbalanced workload placements.
While other TE schemes operate for fixed source and destination pairs, VirtualKnotter considered reallocating
them through optimizing VM migration to further reduce the congestion. Compared to a baseline clustering-based
VM placement, a reduction by 53% in congestion was achieved for production data center traffic. To allow
effective multiplexing of applications with different routing requirements, the authors in [245] proposed an online
Topology Switching (TS) abstraction that defines a different logical topology and routing mechanism per
application according to its goals (e.g. bisection bandwidth, isolation, and resilience). Results based on simulations
for an 8-ary Fat-tree indicated that the tasks achieved their goals with TS while a unified routing via ECMP with
shortest path and isolation based on VLAN failed to guarantee the different goals. The work in [246] addressed
the fairness of bandwidth allocation and link sharing between multiple applications in private cloud data centers.
A distributed algorithm based on dual-based decomposition was utilized to assign link bandwidths for flows based
on maximizing the social welfare across the applications while maintaining performance-centric fairness with
controlled relaxation. The authors assumed that the bottlenecks are in the access link of the PM and considered
workloads where half of the tasks communicate with the other half and there is no data skew. To evaluate the algorithm,
two different scenarios for applications allocation and communication requirements were used and the results
indicated the flexibility and the fast convergence of the proposed algorithm.
SDN-based solutions to optimize the routing of big data applications traffic in various data center topologies
were discussed in [247]-[257]. MILPFlow was developed in [247] as a routing optimization tool set that utilizes
MILP modeling based on an SDN topology description, which defines the characteristics of the data center, to deploy
data path rules in OpenFlow-based controllers. To improve the routing of shuffling traffic in Fat-tree data centers,
an application-aware SDN routing scheme was proposed in [248]. The scheme included a Fat-tree manager,
MapReduce manager, a link load monitor, and a routing component. The Fat-tree manager maintains information
about the properties of different connections to prioritize the assignment of flows with less paths flexibility.
Emulation results indicated a reduction in the shuffling time by 20% and 10% compared to Round Robin-based
ECMP under no skew, and with skew, respectively. Compared to Spanning Tree and Floodlight forwarding
module (shortest path-based routing), a reduction around 60% was achieved. To enhance the shuffling between
map and reduce VMs under background traffic in OpenFlow-controlled clusters, the work in [249] suggested
dynamic flow assignment to queues with different rates in OVS and LINC software switches. Results showed
that prioritizing Hadoop traffic and providing more bandwidth to straggler reduce tasks can reduce the completion
time by 42% compared to solely using a 50 Mbps queue. In [250], a dynamic algorithm for workload balancing
between different racks in a Hadoop cluster was proposed. The algorithm estimates the processing capabilities of
each rack and accordingly modifies the allocation of unfinished tasks to racks with least completion time and
higher computing capacity. The proposed algorithm was found to decrease the completion time of the slowest
jobs by 50%. A network Overlay Framework (NoF) was proposed in [251] to guarantee the bandwidth requirements
of Hadoop traffic at run time. NoF achieves this by defining network topologies, setting communication paths,
and prioritizing traffic. A fully virtualized spine-leaf cluster was utilized to examine the impact on job execution
time when redirecting Hadoop flows through the overlay network controlled by NoF and a reduction in the
completion time by 18-66% was achieved. An Application-Aware Network (AAN) platform with SDN-based
adaptive TE was examined in [252] to control the traffic in Hadoop clusters at a fine-grained level to achieve better
performance and resource allocation. An Address Resolution Protocol (ARP) resolver algorithm for flooding
avoidance was proposed instead of STP to update the routing tables. Compared with MapReduce in
oversubscribed links, SDN-based TE resulted in completion time improvements of 16-337× for different
workloads.
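The application-aware routing schemes above generally reduce to the same controller-side primitive: when a shuffle flow is announced, pick the least-loaded of its candidate paths and install the corresponding rules. The minimal Python sketch below is our own generic illustration of that primitive, not the controller code of any of the cited works:
def select_path_for_flow(candidate_paths, link_utilization, flow_rate):
    """Pick the candidate path whose most loaded link is least utilized
    (a min-max criterion), then account for the new flow on that path."""
    best = min(candidate_paths, key=lambda p: max(link_utilization[l] for l in p))
    for link in best:
        link_utilization[link] += flow_rate       # controller's view of the network
    return best                                   # rules for this path would be pushed via OpenFlow

utilization = {"agg1": 0.7, "agg2": 0.2, "core1": 0.4, "core2": 0.1}
paths = [["agg1", "core1"], ["agg2", "core2"]]
print(select_path_for_flow(paths, utilization, flow_rate=0.3))   # -> ['agg2', 'core2']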
To dynamically reduce the volume of shuffling traffic and green data exchange, the work in [253] utilized
spate coding-based middleboxes under SDN control in 2-tier data centers. The scheme uses a sampler to obtain
side information (i.e. traffic from map to reduce tasks residing in the same node) in addition to the coder/decoder
at the middleboxes. Then, OpenFlow is used to multicast the coded packets according to an optimized grouping
of multicast trees. Compared to native vanilla Hadoop, reductions in the exchanged volume by 43% and 59%
were achieved when allocating the middlebox in the aggregation and a ToR switch, respectively. Compared to
Camdoop-like in-network aggregation, the reduction percentages were 13%, and 38%. As an improvement of
MPTCP, the authors in [254] proposed a responsive centralized scheme that adds subflows dynamically and
selects the best route according to current traffic conditions in SDN-controlled Fat-tree data centers. A controller that
employs Hedera's demand estimation to perform subflow route calculation and path selection, and a monitor per
server to adjust the number of subflows dynamically were utilized. Compared with ECMP and Hedera under
shuffling with background traffic, the responsive scheme achieved 12% improvement in completion time while
utilizing a lower number of subflows. To overcome the scalability issues of centralized OpenFlow-based
controllers, the work in [255] proposed a distributed algorithm at each switch, LocalFlow, that exploits the
knowledge of its active flows and defines the forwarding rules at flowlets, individual flows and sub-flows
resolutions. LocalFlow targets data centers with symmetric topologies and can tolerate asymmetries caused by
links and node failures. It allows but reduces flow spatial splitting, while aggregating flows per destination, only
if the load imbalance exceeded a slack threshold. To avoid the pitfalls of packets reordering, the duplicate
acknowledgment (dup-ACK) duration at end-hosts is slightly increased. LocalFlow improved the throughput by
up to 171%, 19%, and 23% compared to ECMP, MPTCP, and Hedera, respectively. XPath was proposed in [256]
to allow different applications to explicitly route their flows without the overheads of establishing paths and
dynamically adding them to routing tables of commodity switches. A 2-step compression algorithm was utilized
to enable pre-installing a very large number of desired paths into commodity switches' IP tables in large-scale data
centers with limited table sizes. First, to reduce the number of unique IDs for paths, non-conflicting paths such as
converging and disjoint ones are clustered into path sets and each set is assigned a single ID. Second, the sets are
mapped to IP addresses based on Longest Prefix Matching (LPM) to reduce the number of IP tables entries. Then,
a logically centralized SDN-based controller can be utilized to dynamically update the IDs of the paths instead of
the routing tables, and to handle failures. For MapReduce shuffling traffic, XPath enabled choosing non-
conflicting parallel paths and achieved 3× completion time reduction compared to ECMP. The recent work in
[257] discussed the need for applying Machine Learning (ML) techniques to bridge the gap between large-scale
DCN topological information and various networking applications such as traffic prediction, monitoring, routing,
and workloads placement. A methodology named Topology2Vec was proposed to generate topology
representation results in the form of low dimensional vectors from the actual topologies while accounting for the
dynamics of links and nodes availability due to failures or congestion. To demonstrate the usability of
Topology2Vec, the problem of placing SDN controllers to reduce end-to-end delays was addressed. A summary
of the data center focused optimization studies is given in Table V.

TABLE V SUMMARY OF DATA CENTER-FOCUSED OPTIMIZATION STUDIES - I


Ref | Objective | Application/platform | Tools | Benchmarks/workloads | Experimental setup/simulation environment
[204]* | Hadoop performance model for multi-rack clusters | Hadoop 0.21.0 | Modelling of processing time | MapReduce program to read, encrypt, then sort | 160 servers in 5 racks (4 CPU cores, 8GB RAM, 2 250GB disks), Gigabit Ethernet Cisco Catalyst 3750G and 2960-S, Bonnie++, Netperf
[205]* | Simulation approach to evaluate data center topology effects on Hadoop | Hadoop 1.x | ns-2 network simulator, DiskSim, C++, Tcl, Python | TeraSort, Search, Index | MRPerf-based simulations for single-rack, double-rack, tree-based and DCell topologies with 2, 4, 8, 16 nodes, each with (2 Xeon Quad 2.5 GHz cores, 4 750GB SATA disks), 1 Gbps Ethernet
[206]* | Evaluating job finish time for MapReduce workloads in different data center topologies | Hadoop 1.x | CloudSimExMapReduce simulator, CloudDynTop | Scientific workloads | Simulations for hierarchical, fat-tree, CamCube, DCell, and a hypothetically optimal topology for MapReduce, 6-9 servers, 1 Gbps and 10 Gbps links
[207]* | Optimizing shuffling operations in electronic, hybrid, and optical data center topologies | Google's MapReduce | MILP | Sort | Spine-leaf, Fat-tree, BCube, DCell, c-Through, Helios, Proteus topologies with 16 nodes, 10 Gbps links
[208]* | Evaluation of data centers Network Power Effectiveness (NPE) | - | Regular and energy-aware routing algorithms | One-to-one and all-to-all TCP traffic flows | Simulations for Fat-tree, VL2, FBFLY, BCube, DCell, FiConn with similar network diameter, ∼8000 or ∼100 servers, and a mix of 1, 10 GbE links
[209]* | Data path performance evaluation in spine-leaf architectures | - | Network simulation cradle | Realistic workload with a mix of small and large jobs, and bursty traffic | OMNET++-based simulations for a spine-leaf data center (100 servers in 4 racks) with oversubscription ratios (1:1, 2.5:1, 5:1) and 10, 40, and 100 Gigabit Ethernet

[210]* | FFTree: Adjustment to pods in Fat-tree for higher bandwidth and flexibility | Google's MapReduce | Flexibility in pods design defined by edge offset, BulkSendApplication | WordCount | ns-3-based simulations for 16 DataNodes in Linux containers connected by TapBridge NetDevice
[211]*, [212]* | Flyways: wireless or wired additional links on-demand between congested ToR switches | MapReduce-like data mining | Central controller, MPLS label switching, optimization problem | Production MapReduce workloads | Simulation for 1500 servers in 75 racks (20 servers per rack) and additional Flyways with (0.1, 0.6, 1, and 2) Gbps capacity per Flyway

[213]* | Failure-tolerant and spectrum efficient wireless data center for big data applications | - | Bimodal degree distribution, space and time division multiple access scheme | 125 Mbytes input data, and 20 Mbytes intermediate data | Simulations for a spherical rack with 200 servers, 10 of them hub servers
[214]* | Camdoop: MapReduce-like system in CamCube data centers | Hadoop 0.20.2, Dryad, Camdoop | Key-based routing, in-network processing, spanning tree | WordCount, Sort | 27-server CamCube (Quad Core 2.27 GHz CPU, 12GB RAM, 32GB SSD), 1 Gbps quad-port NIC, 2 1 Gbps dual NICs, packet-level simulator for a 512-server CamCube
[215]* | In-network distributed query processing system to increase throughput | Apache Solr | NaaS box attached to ToR for in-network processing | Queries from 75 clients | Solr cluster (1 master, 6 workers), 1 Gbps Ethernet for servers, 10 Gbps for NaaS box
[216]* | RAMCube: BCube-oriented design for resilient key-value store | Key-value store | RPC through Ethernet, one-hop allocation to recovery server | Key-value workloads with set, get, delete operations | BCube(4,1) with 16 servers (2.27GHz Quad core CPU, 16GB RAM, 7200 RPM 1TB disk) and 8 8-port DLink GbE switches, ServerSwitch 1 Gbps NIC in each server
[217]* | Traffic-Aware VM Placement Problem (TVMPP) in data centers | - | Two-tier approximate algorithm | VM-to-VM global and partitioned traffic, production traces | Tree, VL2, Fat-tree, and BCube data centers with 1024 VMs
[218]* | Oktopus: Intra data center network virtualization for predictable performance | - | Greedy allocation algorithms, rate-limiting at end hosts | Symmetric and asymmetric VM-to-VM static and dynamic traffic | 25 nodes with 5 ToRs and a core switch (2.27GHz CPU, 4GB RAM, 1 Gbps), simulations for multi-tenant data center with 16000 servers in 4 racks and 4 VMs per server
[219]* | VM to PM mapping based on architectural and resources constraints | - | Traffic Amount First (TAF) heuristic | Uniformly random VM-to-VM traffic | Simulations for Fat-tree, VL2, and Tree-based data centers in large-scale (60 VMs) and small-scale (10 VMs) settings
[220]* NetDEO: VM placement in data centers Multi-tier Meta-heuristic based on Synthesized traces Simulations for data center networks with heterogeneous servers
data centers and efficient system upgrade web Simulated annealing and different topologies (non-homogeneous Tree, FatTree, BCube)
applications
[221]* Emulator for SDN-based data centers Hadoop 1.x OVS Simulated Hadoop shuffle Testbed with 1 primary server (2.4 GHz, 4GB RAM), 12 client
with reconfigurable topologies traffic generator workstations (dual-core CPU, 2GB RAM), 2 Pica8 Pronto 3290 SDN-
enabled Gigabit Ethernet switches)
[222]* SDN-approach in Application-Aware Hadoop 2.x Ryu, OVS, lightweight HiBench benchmark (sort, 1 master and 8 slaves in GENI testbed (2.67GHz CPU, 8GB RAM)
Networks (AAN) to improve Hadoop REST-API, ARP resolver word count, scan, join, 100 Mbps per link.
PageRank)
[223]* Application-aware run-time SDN-based - 2D Torus topology - Hybrid data center with OpenFlow-enabled ToR electrical switches
control for data centers with Hadoop configuration algorithms connected to an electrical core switch and MEMS-based core optical
switch
[224]* Experimental evaluation for MapReduce Hadoop 1.x Topology manager, TritonSort (900 GB) and 24 servers in 4 racks, 5 Gbps packet link and 10 Gbps optical link
in c-Through and Helios OpenFlow, circuit switch TCP long-lived traffic Monaco switches, Glimmerglass MEMS
manager
[225]* Measuring latency in Torus-based hybrid - Network Orchestrator Real-time network 2D Torus network testbed constructed with 4 MEMS sliced from a
optical/electrical switching data centers Module (NOM) traffic via Iperf 96×96 MEMS-based CrossFiber LiteSwitch and two Pica8
OpenFlow switches to emulate 16 NIC
[226]* Acceleration of incast and multicast traffic - Floodlight, Redis pub/sub Two sets of 50 multicast 8 nodes hybrid cluster testbed with optical splitters and
with on-demand passive optical modules messages, jobs (500MB-5GB) with combiners connected with controlled Optical Space Switch (OSS)
integer program to 4 and 7 groups
configure OSS
[227]* HERO: Hybrid accelerated delivery MPICH for SDN controller, greedy 50 multicast flows with Small free-space optics multicast testbed with passive optical 1:9
for multicast traffic Message optical links assignment uniform flows splitters for throughput and delay evaluation, flow-level simulations
Passing algorithm (100MB-1GB) and for 100 racks in spine and leaf hybrid topology
Interface groups (10-100)
(MPI)
[228]* Multi-tenant virtual optical data center Virtual Data OpenDayLight, network- 2 VDC, randomly FPGA optoelectronics 12×10GE ToRs, SOA-based OPS switch,
with network-aware VM placement Center aware VM placement generated 500 Polatis 192×192 MEMS- based switch as optical back-plane,
(VDC) based on variance VDCs with Poisson simulations for 20/40 servers per rack, 8 racks
composition requests arrival

[229]* Reconfigurable 3D MEMS based data center System S Scheduling Optimizer for Multiple jobs for 3 IBM Bladecenter-H chassis, 4 HS21 blade servers per chassis,
for optimized stream computing systems Distributed Applications streaming applications Calient 320×320-port 3D MEMS
(SODA)
[230]* Hybrid switching architecture for cloud - Multi wavelength optical 3 synthetic traffic patterns OPNET simulations for the proposed hybrid OCS and EPS switching
applications in Fat-tree data centers switch, distributed with mice and elephant in Fat-tree with 128 servers to evaluate the average end-to-end delay
control for scheduling flows and the network throughput.
[231]° Reduce tasks placements to Hadoop 1.x Linear-time greedy WordCount, Grep, Cluster with 4 racks; A, B , C, and D with 7, 5, 4, and 4 servers (Intel
reduce cross-rack traffic algorithm, binary search PageRank, k-means, Xeon E5504, E5520, E5620, and E5620 CPUs, 8GB RAM), 1 GbE
Frequency Pattern Match links
[232]° Data center network-aware load balancing to - Greedy Synthetic and Wikipedia Simulations for 12-ary Fat-tree with 1 Gbps links
optimize shuffling in MapReduce with skewed algorithm page-to-page link datasets
data
[233]° Joint intermediate data partitioning and - Profiling, decomposition- Dump files in Wikimedia, Simulations for three-tier data center with 20 nodes
aggregation to improve the shuffling phase of based distributed WordCount, random
MapReduce algorithm shuffling
[234]° DCP: Distributed cache protocol to reduce - In-memory store, Randomly generated Simulations for Fat-tree with k=16 (1024 servers, 64 core switches,
redundant packets transmission in Fat-tree bloom filter header packets with Zipf 128 aggregation, 128 edge)
distribution
[235]° In-network aggregation in BCube for Hadoop 1.x IRS-based incast, SRS- WordCount with 61 VMs in 6 nodes emulated BCube (2 8-cores CPU, 24GB RAM,
scalable and efficient data shuffling based shuffle, Bloom combiner 1TB disk), large-scale simulations for BCube(8,5) with
filter
[236]° Application-level TCP tuning for data transfers - Recursive Chunk Bulk data transfers high-speed networking testbeds and cloud networks
through pipelining, parallelism, and concurrency Division, Parallelism- (512KB-2GB) per file Emulab-based emulations, AWS EC2 instances
Concurrency Pipelining
[237]° Space Shuffle (S2): greedy routing on multiple - Greediest routing, Random permutation Fine-grained packet-level event-based simulations for
rings to improve throughput, scalability, and MILP traffic the proposed data center, Jellyfish, SWDC, and Fat-tree
flexibility
[238]° Non-blocking distributed oblivious adaptive - Approx. Markov chain Random permutation OMNet++-based simulations for InfiniBand network (three-level Clos
routing in Clos-based data centers for big data models to predict traffic of 245KB flows DCN with 24 input, 24 output, and 48 intermediate switches and 1152
applications convergence time nodes), 40 Gbps links
[239]° Evaluation for different routing protocols on - Stochastic permutations Bursty traffic, delay- Flow-level simulations for spine and leaf data center
the performance of spine and leaf data centers for traffic generation sensitive workloads with 8 spine switches, 16 leaf switches, and 128 end nodes
[240]° DCMPTCP: improved MPTCP for load - Fallback for rack-local Many-to-one traffic, data Ns3-based simulations for Spine and leaf data center with 8 spine, 8
balancing in data centers with rack local and traffic, adaptive packet mining and web search leaf switches, and 64 nodes per leaf switch, 10 and 40 Gbps links
inter racks traffic scheduler traffic
[241]° Greening data centers by Flow Preemption - Algorithm for the 10k flows with exponential Simulations for 24-ary Fat-tree
(FP) and Energy-Aware Routing (EAR) FP and EAR scheme distribution with mean of
64MB
[242]° Improving the energy efficiency of routing in - Approximate-algorithm Uniform random traffic Fat-tree (4-ary and 8-ary), BCube(2,2) and BCube(8,2)
data centers by joint VM assignments and TE (GEERA) and number of VMs

[243]° GreenDCN: scalable framework to green data - Time-slotted algorithms; Synthetic jobs with Simulations for 8-ary and 12-ary Fat-tree data centers, 2 VMs per
center network through optimized VM optEEA, EER normal distribution for server, identical switches with 300W max power and max processing
placement and TE no. of servers speed of 1 Tbps
[244]° VirtualKnotter: efficient online VM placement - Multiway θ-Kernighan- Synthetic and realistic Dynamic simulations
and migration to reduce congestion in data Lin and simulated traffic
centers annealing
[245]° Topology switching to allow multiple - Allocator, centralized Randomly generated Simulations for 16-ary Fat-tree with 1 Gbps links
applications to use different routing schemes topology server routing tasks
[246]° Performance-centric fairness in links bandwidth - Gradient projection- Two scenarios for Simulations for a private data center with homogeneous nodes
allocation in private clouds data centers based distributed applications traffic
algorithm
[247]° MILPFlow: a Tool set for modeling and data - MILP, Video streaming, Mininet-based emulation for 4-ary Fat-tree in VirtualBox-based
paths deployment in OpenFlow-based data OVS Iperf traffic VMs, 1 GbE NIC
centers
[248]° Application-aware SDN routing Hadoop 1.x Floodlight controller, WordCount EstiNet-based emulation for 20 OpenFlow switches in 4-ary Fat-tree
in data centers managers, links monitor, with 16 nodes
routing component
[249]° Hadoop acceleration in an Cloudera Floodlight, OVS and Sort (0.4 MB - 4GB), 10 VMs in 3 nodes in Cloudera connected by a physical switch
OpenFlow-based cluster distribution LINC switches Iperf for background
of Hadoop traffic
[250]° SDN-controlled dynamic workload balancing - Balancing algorithm WordCount Mumak-based simulations for a data center with three racks
to improve completion time of Hadoop jobs based on estimation and
prediction
[251]° SDN-based Network Overlay Framework (NoF) Hadoop Configuration engine, TeraGen, TeraSort, Virtualized testbed with spine-leaf virtual switches, VirtualBox for
to define networks based on applications 2.3.0 OVS, POX controller iperf for background VMs
requirements traffic
[252]° SDN-approach in Application-Aware Hadoop 1.x Ryu, OVS, lightweight HiBench benchmark (sort, 1 master and 8 slaves in GENI testbed (2.67GHz CPU, 8GB RAM),
Networks (AAN) to improve Hadoop REST-API, ARP resolver word count, scan, join, 100 Mbps per link
PageRank)
[253]° Dynamic control for data volumes through SDN Vanilla Sampler, spate coding- TeraSort, Grep Prototype: 2-tier data center with 8 nodes (12 cores CPU, 128GB
control and spade coding in cloud data centers Hadoop based middleboxes for RAM, 1TB disk), Testbed: 8 VMs controlled by Citrix XenServer and
coding/decoding connected with OVS
[254]° Responsive multipath TCP for optimized - Demand estimation and Random, permutation, NS3-based simulations for 8-ary Fat-tree with 1 Gbps links and
shuffling in SDN-based data centers route calculation and shuffling traffic customized SDN controller
algorithms
[255]° LocalFlow: local link balancing for scalable - Switch-local pcap packet traces, Packet-level network simulations (stand-alone and htsim-based) for
and optimal flow routing in data centers algorithm MapReduce-style flows 8-ary and 16-ary Fat-tree, VL2, oversubscribed variations
[256]° XPath: explicit flow-level path control in - 2 step compression Random TCP Testbed: 6-ary Fat-tree testbed with Pronto Broadcom 48-port
commodity switches in data centers algorithm connections, sequential Ethernet switches, Algorithm evaluation for BCube, Fat-tree, HyperX,
read/write, shuffling and VL2 with different scales
[257]° Topology2Vec: DCN representation learning - Biased random walk Real-world Internet Simulations
for networking applications sampling ML-based topologies from the
network partitioning Topology Zoo
* Data center topology, ° Data center routing.

C. Scheduling of Flows, Coflows, and Jobs in Data Centers:


Scheduling big data traffic at the flow level was addressed in [258]-[261]. Orchestra [258], a task-aware
centralized cluster manager, aimed to reduce the average completion time of various data transfer patterns for
batch, iterative, and interactive workloads. Within each transfer, a Transfer Controller (TC) was used for
monitoring and updating sources associated with destination sets. For broadcast, a BitTorrent-like protocol,
Cornet, was utilized and for shuffle, an algorithm named Weighted Shuffle Scheduling (WSS) was used. Orchestra
allows scheduling at the transfer level between applications stages where concurrent transfers belonging to the
same job can be optimized through an Inter-Transfer Controller (ITC) that can utilize FIFO, fair, and priority-
based scheduling. Broadcast was improved by 4.5 times compared to the status quo mechanism in native Hadoop, and
high-priority transfers were improved by 1.7 times. Seawall [259] is an edge-based scheduler that uses a hypervisor-based
mechanism to ensure fair sharing of DCN links between tenants. FlowComb was proposed in [260] as a centralized
and transparent network management framework to improve the utilization of clusters and reduce the processing
time of big data applications. FlowComb utilized software agents installed in nodes to detect transfer requests and
report to the centralized engine and OpenFlow to enforce forwarding rules and paths. An improvement of 35% in
completion time of sort workloads was achieved with 60% path enforcement. In [261], distributed flow scheduling
was proposed to achieve adaptive routing and load balancing through DiFS. For several traffic patterns, DiFS
achieved better aggregate throughput compared to ECMP, and comparable or higher throughput compared to
Hedera and DARD.
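As an illustration of the weighted-allocation idea behind WSS, the following sketch (a simplified, hypothetical Python example rather than Orchestra's implementation; the Flow and allocate_rates names are ours) gives each shuffle flow a share of its shared link proportional to the bytes it still has to move, so that flows sharing a link tend to finish together.

```python
# Minimal sketch of weight-proportional shuffle rate allocation in the spirit of
# Weighted Shuffle Scheduling (WSS): each flow on a shared link receives bandwidth
# proportional to the bytes it still has to transfer, so co-located flows tend to
# finish at the same time. Names are illustrative, not from Orchestra's code.

from dataclasses import dataclass

@dataclass
class Flow:
    flow_id: str
    link: str          # bottleneck link the flow traverses
    bytes_left: float

def allocate_rates(flows, link_capacity_bps):
    """Return a dict flow_id -> rate (bit/s) using per-link weighted shares."""
    rates, by_link = {}, {}
    for f in flows:                       # group flows by the link they share
        by_link.setdefault(f.link, []).append(f)
    for link, group in by_link.items():
        total_bytes = sum(f.bytes_left for f in group)
        for f in group:
            weight = f.bytes_left / total_bytes if total_bytes else 0.0
            rates[f.flow_id] = weight * link_capacity_bps[link]
    return rates

if __name__ == "__main__":
    flows = [Flow("m1->r1", "rack1-up", 4e9), Flow("m2->r1", "rack1-up", 1e9)]
    print(allocate_rates(flows, {"rack1-up": 10e9}))
    # The 4 GB flow gets 8 Gbps and the 1 GB flow 2 Gbps, so both finish together.
```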
Optimizing coflow scheduling, which is more application-aware than flow scheduling, to minimize the
completion time of workloads was the focus in [262]-[265]. Varys was proposed in [262] as a coordinated inter-
coflow scheduler in data centers targeting predictable performance and reduced completion time for big data
applications. A greedy coflow scheduler and a per-flow rate allocator were utilized to identify the slowest flow
in a coflow and adjust the rates of the accompanying flows to match it, so that networking resources can be freed for
other coflows. Trace-driven simulations indicated that Varys achieved 3.66×, 5.53×, and 5.65× improvements
compared to fair sharing, per-flow scheduling, and FIFO, respectively. Baraat in [263] suggested decentralized
task-aware scheduling for co-flows to reduce their tail completion times. FIFO with limited multiplexing (FIFO-
LM) is used to schedule the tasks while avoiding head-of-line blocking for small flows by changing the
multiplexing level when heavy tasks arrive. Compared to pFabric [622] and Orchestra, the completion time of
95% of MapReduce workloads was reduced by 43% and 93%, respectively. A decentralized coflow-aware
scheduling system (D-CAS) that dynamically and explicitly set the priorities of flows according to the maximum
load in the senders was proposed in [264] to minimize coflows completion time. D-CAS achieved a performance
close to Varys with up to 15% difference and outperformed Baraat by 1.4 and 4 times for homogeneous and
heterogeneous workloads, respectively. Rapier in [265] integrated routing and scheduling at the coflow level to
optimize the performance of big data applications in DCNs with commodity switches. Compared to Varys with
ECMP and to optimized routing alone, improvements of about 80% and 60% in coflow completion time were achieved, respectively.
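The greedy principle used by Varys can be illustrated with a short sketch (an illustrative simplification, not the published system; the function names are ours): coflows are ordered by their effective bottleneck, i.e., the time their most loaded port would need if it had the fabric to itself.

```python
# Illustrative smallest-effective-bottleneck-first ordering of coflows, in the
# spirit of Varys' greedy scheduler (simplified; not the published implementation).
# A coflow's effective bottleneck is the drain time of its most loaded port.

def effective_bottleneck(coflow, port_bandwidth_bps):
    """coflow: dict mapping port name -> bytes to send/receive on that port."""
    return max(bytes_ * 8 / port_bandwidth_bps for bytes_ in coflow.values())

def sebf_order(coflows, port_bandwidth_bps=10e9):
    """Return coflow ids ordered smallest effective bottleneck first."""
    return sorted(coflows,
                  key=lambda cid: effective_bottleneck(coflows[cid], port_bandwidth_bps))

if __name__ == "__main__":
    coflows = {
        "job-A-shuffle": {"rack1": 2e9, "rack2": 8e9},    # bottleneck: 6.4 s at 10 Gbps
        "job-B-broadcast": {"rack1": 1e9, "rack3": 1e9},  # bottleneck: 0.8 s
    }
    print(sebf_order(coflows))  # ['job-B-broadcast', 'job-A-shuffle']
```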
Scheduling traffic in DCNs with SDN environments was addressed in [266]-[269]. Online scheduling of
multicast flows in Fat-tree DCNs was considered in [266] to ensure bounds on congestion and improve the
throughput and load balancing. A centralized Bounded Congestion Multicast Scheduling (BCMS) algorithm was
developed for use with OpenFlow and improvements compared to VLB and random scheduling were reported.
Pythia in [267] focused on improving the network performance under skewed workloads. A run-time intermediate
data size prediction and centralized OpenFlow-based control were utilized and up to 46% reduction in completion
time was obtained compared to ECMP. An SDN OpenFlow network with Combined Input and Crosspoint Queued
(CICQ) switches was proposed to dynamically schedule packets of different big data applications in [268]. An
application-aware computing and networking resource allocation scheme based on neural network predictions
was developed to allocate adequate resources so that the SLA is minimally violated. For five different DCN
applications, the scheme achieved less than 4% SLA violation rate at the cost of 9.21% increase in the energy
consumption. The authors in [269] proposed and experimentally demonstrated a heuristic for Bandwidth-Aware
Scheduling with SDN (BASS) to minimize the job completion time in Hadoop clusters. The heuristic
prioritizes scheduling tasks locally, but considers remote assignment, together with link assignment for the
associated data transfers, if doing so reduces the total completion time. BASS was found to reduce the completion time by 10%.
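The local-versus-remote trade-off exploited by BASS can be sketched as follows (a hypothetical simplification with assumed example inputs, not the published heuristic): a task is placed on a remote slot only if its input transfer time over the available link plus the remote start time beats waiting for a local slot.

```python
# Simplified sketch of a bandwidth-aware placement decision in the spirit of BASS
# (not the published heuristic): schedule a task remotely only when the estimated
# input transfer over the assigned link plus the remote slot's availability beats
# waiting for a free local slot. All inputs below are assumed example values.

def choose_slot(input_bytes, local_slot_free_at_s, remote_slot_free_at_s,
                link_bandwidth_bps, task_runtime_s):
    """Return ('local' or 'remote', estimated completion time in seconds)."""
    local_finish = local_slot_free_at_s + task_runtime_s
    transfer_s = input_bytes * 8 / link_bandwidth_bps
    remote_finish = max(remote_slot_free_at_s, transfer_s) + task_runtime_s
    if local_finish <= remote_finish:
        return "local", local_finish
    return "remote", remote_finish

if __name__ == "__main__":
    # 1 GB input, local slot free in 30 s, remote slot free now, 1 Gbps link, 20 s task.
    print(choose_slot(1e9, 30.0, 0.0, 1e9, 20.0))  # ('remote', 28.0)
```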
The work reported in [270]-[273] focused on scheduling traffic in optical and hybrid DCNs. In [270], the
gaps between the performance of OCS interconnects and the requirements of latency-sensitive applications were addressed.
Two scheduling algorithms: centralized Static Circuit Flexible Topology (SCFT) and distributed Flexible Circuit,
Flexible Topology (FCFT) were proposed and up to 2.44× improvement over Mordia [283] was achieved.
Resource allocation in NEPHELE was addressed in [271] while accounting for the SDN controller delay and
random allocation of iterative MapReduce tasks. Compared to Mordia, NEPHELE uses multiple WDM rings
instead of one to enable efficient use of resources, and introduces application-aware and feedback-based
synchronous slotted scheduling algorithms. In [272], different end-to-end continuous-time scheduling algorithms
were designed for a hybrid packet/optical DCN proposed by the same authors, and were compared using random
traffic. Effective traffic scheduling for a proposed Packet-Switched Optical Network (PSON) DCN with space
switches and layers of AWGRs was examined in [273]. To treat traffic flows differently and optimally based on
their type, three machine-learning flow detection algorithms were used and compared in terms of accuracy
and classification speed. Scheduling algorithms that consider flow priorities and buffer occupancy were
proposed, and improvements in packet loss ratio and average delay compared to Round Robin were
reported.
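A minimal sketch of a priority- and occupancy-aware selection rule of the kind described above is given below; the linear weighting is an assumption for illustration and is not taken from [273].

```python
# Illustrative service-order rule that favours high-priority flows and nearly full
# buffers, loosely in the spirit of the priority/occupancy-aware schedulers above.
# The 0.7/0.3 weighting is an assumption, not a value from the cited work.

import heapq

def schedule_order(queues, priority_weight=0.7, occupancy_weight=0.3):
    """queues: list of dicts with 'name', 'priority' (0..1) and 'occupancy' (0..1).
    Returns queue names in decreasing urgency."""
    heap = []
    for q in queues:
        score = priority_weight * q["priority"] + occupancy_weight * q["occupancy"]
        heapq.heappush(heap, (-score, q["name"]))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

if __name__ == "__main__":
    queues = [
        {"name": "latency-sensitive mice", "priority": 0.9, "occupancy": 0.2},
        {"name": "shuffle elephants", "priority": 0.3, "occupancy": 0.95},
    ]
    print(schedule_order(queues))  # mice are served first despite lower occupancy
```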
The authors in [274] maximized the throughput of data retrieval for single and multiple applications in tree-
based DCNs while accounting for links bandwidths and allocation of data replicas. The proposed approximation
algorithm achieved near optimal performance and improved retrieval time compared to random scheduling. A
topology-aware heuristic for data placement that minimizes remote data access costs in geo-distributed
MapReduce clusters was proposed in [275] based on optimized replica balanced distribution tree. CoMan was
proposed in [276] to improve bandwidth utilization and reduce completion time of several big data frameworks
in Multiplexed Datacenters. Virtual Link Group (VLG) abstraction was utilized to define shared bandwidth
resources pool and an approximation algorithm was developed. Compared to ECMP with ElasticSwitch [133],
the bandwidth utilization and completion time were improved by 2.83× and 6.68×, respectively. The authors in
[277] proposed probabilistic map and reduce task scheduling algorithms that consider the data center topology
and the links statuses in computing the cost and latency of data transmission. Based on the calculations, the
algorithm incorporates randomness in the assignments where the tasks with the least transmission time get
scheduled on available slots with higher probability. Thus, the locality and completion time are balanced without
enforcement. Compared to a coupling scheduling method and to fair scheduling, reductions of 17% and 46%
were achieved, respectively. A Network-Aware Scheduler (NAS) was examined in [278] for improving the shuffling phase in
multi-rack clusters. NAS utilizes three algorithms: Map Task Scheduling (MTS) to balance cross-node traffic
caused by skewness, Congestion-Avoidance Reduce Task Scheduling (CA-RTS) to reduce cross-rack traffic, and
Congestion-Reduction RTS (CR-RTS) to prioritize light-shuffle traffic jobs. Compared to state-of-the-art
schedulers like fair and delay, improvements in the throughput by up to 62%, and reductions in the average
completion time by up to 44% and cross-rack traffic by up to 40% were reported. A fine-grained framework,
Time-Interleaved Virtual Cluster (TIVC), that accounts for the dynamics of the bandwidth requirements of big data
applications was proposed and tested in a system named PROTEUS in [279]. Based on the observation, via profiling
four workloads, that the traffic peaked only during 30-60% of the execution time, TIVC targeted reducing the
over-reservation of bandwidth and allowing overlapped scheduling of jobs. First, the user's application was profiled
and a cost model based on bandwidth cap and performance trade-offs was generated, then a dynamic
programming-based spatio-temporal jobs allocation algorithm was used to maximize the utilization and revenue.
Compared with Oktopus in [218] for mixed batch jobs, a reduction by 34.5% in the completion time was achieved.
For dynamically arriving jobs at 80% load, PROTEUS reduced the rejection ratio to 3.4% from the 9.5% reported for
Oktopus.
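The time-interleaving idea can be made concrete with a small admission check (a hypothetical sketch under assumed slot profiles, not the PROTEUS allocator): a new job's per-slot bandwidth profile is accepted on a link only if the summed demand of overlapping jobs never exceeds the link capacity in any slot.

```python
# Hypothetical sketch of the admission test behind time-interleaved bandwidth
# reservations: a job is described by a per-slot bandwidth profile (Gbps) and can
# be overlapped with admitted jobs only if the aggregate demand never exceeds the
# link capacity in any slot. This is not the PROTEUS algorithm itself.

def can_admit(admitted_profiles, new_profile, start_slot, capacity_gbps):
    """admitted_profiles: list of (start_slot, [demand per slot]) already placed."""
    for offset, demand in enumerate(new_profile):
        slot = start_slot + offset
        used = sum(profile[slot - s] for s, profile in admitted_profiles
                   if 0 <= slot - s < len(profile))
        if used + demand > capacity_gbps:
            return False
    return True

if __name__ == "__main__":
    admitted = [(0, [2, 8, 8, 2])]  # an admitted job peaking at 8 Gbps in slots 1-2
    print(can_admit(admitted, [6, 6, 2], start_slot=0, capacity_gbps=10))  # False (8+6 > 10)
    print(can_admit(admitted, [6, 6, 2], start_slot=3, capacity_gbps=10))  # True
```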
The authors in [280] explored the benefits of planning ahead with MapReduce and DAG-based production
workloads and developed Corral that jointly optimizes data placement and tasks scheduling at the rack-level.
Corral modified HDFS and AM to enable rack selection with data and tasks placement and hence, increase the
locality and reduce the overall cross-rack traffic. Compared to native YARN, with capacity scheduler, the
makespan was reduced by up to 33% while reducing cross-rack traffic by up to 90%. Simulation results for
integrating Corral with Varys [262] showed clear improvement over using capacity scheduler with Varys and over
using Corral with TCP indicating the importance of integrating network and data/tasks scheduling solutions. The
authors in [281] proposed Mercury as an extension of YARN that enables hybrid resource management in large
clusters by allowing the AM of each application to request guaranteed containers via centralized scheduling or queueable
containers via distributed scheduling according to its needs. Compared to default YARN, an average
improvement of 35% in throughput was obtained. The work in [282] optimized the allocation of container
resources in terms of reduced congestion and improved data locality by considering the networking resources and
the data locations. Compared to default placement in YARN, a 67% reduction in completion time for network-
intensive workloads was obtained.
The energy efficiency of data centers through workloads and traffic scheduling was considered in [283]-[286].
The work in [283] examined the performance-energy tradeoffs when using the Low Power Idle (LPI) link sleep
mode of the EEE standard [639] with MapReduce workloads. The timing parameters of entering and leaving the
LPI mode in 10GbE links were optimized while utilizing packet coalescing (i.e., delaying outgoing packets during
the quiet mode and aggregating them for transmission in the following active mode). Depending on the
oversubscription ratio and workloads, EEE achieved power savings between 5 and 8 times compared to legacy
Ethernet. A detailed study of the optimum packet coalescing setting under different TCP settings and with shallow
and deep buffers (i.e. 128kB and 10MB per port) was provided and an additional improvement by a factor of two
was achieved. Applications scheduling in fewer machines for energy saving in networking devices of large scale
data centers was considered in [284] through a Hierarchical Scheduling Algorithm (HSA). The workloads were
allocated at the nodes according to their subsets (i.e. connectivity with ToR switches), then according to the levels
of higher layer switches and the associated data transmission costs. HSA was tested with a Dynamic Max Node Sorting
Method and two classical bin packing problem solutions and the proposed method resulted in improved
performance and stability. Willow in [285] aimed to reduce the energy consumption of switches in Fat-tree through
SDN-based dynamic scheduling while accounting for the performance of network-limited elastic flows. The
number of activated switches as well as usage duration was considered in computing the routing path per flow.
Compared to ECMP and classical heuristics (i.e. simulated annealing and particle swarm optimization), up to 60%
savings were achieved. The authors in [286] proposed JouleMR as a green-aware and cost-effective tasks and jobs
scheduling framework for MapReduce workloads that maximized renewable energy usage while accounting for
brown energy dynamic pricing and renewable energy storage. To obtain a plan for the jobs/tasks scheduling,
performance-energy consumption models were utilized to guide a two-phase heuristic. The first phase optimizes
the start time of tasks/jobs to reduce the brown energy usage and the second phase optimizes the assigned
resources per task/job to further reduce it while satisfying soft deadlines. Compared to Hadoop with YARN, a
reduction by 35% in brown energy usage and by 21% in overall energy consumption was obtained.
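The packet coalescing mechanism studied in [283] can be sketched as a simple batching rule (a simplified model with assumed thresholds, not the EEE state machine): outgoing packets are held while the link is in the quiet mode and flushed as a burst once a count or timer threshold is reached, trading a small added latency for longer sleep intervals.

```python
# Simplified model of packet coalescing with an LPI-style sleep mode: packets that
# arrive while the link sleeps are buffered and flushed as a burst when a count or
# hold-time threshold is reached. The thresholds are assumptions for illustration,
# not values from the EEE standard or from [283].

def coalesce(arrival_times_s, max_batch=8, max_hold_s=12e-6):
    """Return a list of (flush_time_s, batch_of_arrival_times)."""
    batches, current = [], []
    for t in arrival_times_s:
        if current and (len(current) >= max_batch or t - current[0] >= max_hold_s):
            batches.append((current[0] + max_hold_s, current))  # wake link once per batch
            current = []
        current.append(t)
    if current:
        batches.append((current[0] + max_hold_s, current))
    return batches

if __name__ == "__main__":
    arrivals = [i * 2e-6 for i in range(20)]   # one packet every 2 microseconds
    for flush_time, batch in coalesce(arrivals):
        print(f"flush at {flush_time * 1e6:.0f} us with {len(batch)} packets")
```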
D. Performance Improvements based on Advances in Computing, Storage, and Networking Technologies:
Comparisons of scale-up and scale-out systems for Hadoop were discussed in [287]-[290]. In [287], the
authors showed that running Hadoop workloads with sub Tera-scale on a single scaled-up server can be more
efficient than scale-out clusters. For the scale-up system, several optimizations were suggested for the
storage, the number of concurrent tasks, heartbeats, the JVM heap size, and the shuffling operation. A hybrid Hadoop
cluster with scale-out and scale-up servers was proposed in [288]. Based on profiling the performance of
workloads on scale-out and scale-up clusters, a scheduler that assigns each job to the best choice was designed. A
study to measure the impact on HDFS of improved networking such as InfiniBand and 10 Gigabit Ethernet,
selected protocol, and SSD storage systems was presented in [289]. With HDD storage, enhanced networking
achieved up to 100% performance improvement, while with SSD, the improvement was by up to 219%. The
authors in [290] discussed the need to optimize the tasks scheduling and data placement for data-centric workloads
in compute-centric (i.e., HPC) clusters. A comprehensive evaluation was conducted to study the impact, on Spark
workloads with different compute and I/O requirements, of storing intermediate RDDs in RAM, on local SSDs, or on
Lustre clusters, which require write locks. An Enhanced Load Balancer scheduler was suggested and a Congestion-
Aware Task Dispatching to SSD mechanism was also proposed. Enhancements of 25% and 41.2% were achieved,
respectively. It was also found that increasing the chunk size from 32 MB to 128 MB reduced the job execution time
by 15.9% as scheduling overheads were reduced.
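The chunk-size observation can be checked with a back-of-the-envelope calculation; the per-task overhead figure below is an assumption used only to illustrate why fewer, larger chunks reduce the aggregate scheduling cost.

```python
# Back-of-the-envelope estimate of how the chunk size changes the number of tasks
# and the aggregate scheduling overhead for the same input. The 100 ms per-task
# overhead is an assumed figure for illustration, not a measured value from [290].

def scheduling_overhead(input_bytes, chunk_bytes, per_task_overhead_s=0.1):
    num_tasks = -(-input_bytes // chunk_bytes)          # ceiling division
    return num_tasks, num_tasks * per_task_overhead_s

if __name__ == "__main__":
    one_tb = 10**12
    for chunk_mb in (32, 128):
        tasks, overhead_s = scheduling_overhead(one_tb, chunk_mb * 2**20)
        print(f"{chunk_mb} MB chunks: {tasks} tasks, ~{overhead_s:.0f} s scheduling overhead")
```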
The performance of big data applications under SSD storage was addressed in [291]-[294]. The work in [291]
examined the performance improvements and economics (i.e. cost-per-performance) of partial and full use of
SSDs in MapReduce clusters. A comparison between SATA Hard Disk Drives (HDDs) and Peripheral Component
Interconnect express (PCIe) SSDs with the same aggregate I/O bandwidth showed that SSDs achieved up to 70%
better performance at 2.4× the cost of HDDs. When upgrading HDD systems with SSDs, proper configurations
with multiple HDFSs should be considered. The authors in [292] discussed the need to modify Hadoop
frameworks when using SSDs and proposed a direct map output collection, a pre-read model for HDFS, a reduce
scheduler to address skewness, and a modified data block placement policy to extend the lifetime of SSDs. The
first two methods achieved improvements of 30% and 18%, while the scheduler produced an improvement of 12%,
and the overall improvement was 21%. To accelerate I/O-intensive and memory-intensive MapReduce jobs,
mpCache, which dynamically caches input data from HDDs into SATA SSDs, was proposed and examined in [293]
and average speedups of 2× were gained. The aggregate performance of Non-Volatile Memory Express (NVMe)
SSDs with a high number of Docker containers for database applications was evaluated in [294]. With an optimized
choice of the instance count, up to 6× improvement in write throughput can be obtained compared to a single instance. With
multiple applications, the performance can degrade by 50%.
Optimizing data transfers from networked storage systems was addressed in [295]-[297]. Google traces were
used in [295] to examine the performance of Hadoop with Network Attached Storage (NAS). MRPerf [111] was
utilized to evaluate the slowdown factor of using a NAS, which requires rack-remote transfers, instead of
conventional Direct Attached Storage (DAS) for different numbers of racks and different workloads. The results
showed that CPU-intensive workloads had the least slowdown factor and that TeraSort workloads can benefit the
most from an assumed enhanced NAS. FlexDAS in [296] was proposed as a switch based on SAS expanders to
form a Disk Area Network (DAN) to provide flexibility in connecting and disconnecting computing nodes to
HDDs while maintaining the I/O performance of DAS. Results based on a prototype showed that FlexDAS can
detach and attach an HDD in 1.16 seconds and that, for three I/O-intensive applications, it achieved the same I/O
performance as DAS. The work in [297] addressed optimizing and balancing the I/O operations required for
terabit-scale data transfers from centralized Parallel File Systems (PFS) over WAN for competing scientific
applications. An end-to-end layout-aware optimization based on a scheduling algorithm for sources (i.e., disks)
and for sinks (i.e., requesting servers) was utilized to coordinate the use of network and storage and to reduce I/O
imbalances caused by congestion in some disks.
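The source/sink coordination in [297] can be illustrated by a simple least-loaded assignment (an illustrative simplification, not the published layout-aware algorithm): each chunk is read from the least-loaded storage target that holds a replica and delivered to the least-loaded sink server.

```python
# Illustrative least-loaded assignment of transfer chunks to source disks and sink
# servers, a simplified stand-in for layout-aware I/O scheduling: the aim is to
# avoid piling concurrent transfers onto one disk or one receiving server.

def assign_transfers(chunks, disk_load, sink_load):
    """chunks: list of (chunk_id, [candidate disks holding a replica]).
    Returns (chunk_id, chosen_disk, chosen_sink) tuples and updates the loads."""
    plan = []
    for chunk_id, candidates in chunks:
        disk = min(candidates, key=lambda d: disk_load[d])   # least-loaded replica
        sink = min(sink_load, key=sink_load.get)             # least-loaded receiver
        disk_load[disk] += 1
        sink_load[sink] += 1
        plan.append((chunk_id, disk, sink))
    return plan

if __name__ == "__main__":
    disks = {"ost1": 0, "ost2": 0, "ost3": 2}   # ost3 is already congested
    sinks = {"dtn1": 0, "dtn2": 1}
    chunks = [("c1", ["ost1", "ost3"]), ("c2", ["ost3", "ost2"]), ("c3", ["ost1", "ost2"])]
    for assignment in assign_transfers(chunks, disks, sinks):
        print(assignment)
```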
Remote Direct Memory Access (RDMA), which enables zero-copy transfers (i.e., accessing the RAM without the
intervention of the operating system), was utilized in [298], [299] to improve the performance of big data
applications. Fast Remote Memory (FaRM) was proposed in [298] as a system for key-value and graph stores that
regards RAMs of the machines in a cluster as a shared address space. FaRM utilizes one-sided RDMA lock-free
serializable reads, RDMA writes for fast message passing, and single machine transactions with optimized
locality. A kernel driver, PhyCo, was developed to enable addressing 2GB memory regions to overcome the issue
of limited entries in NIC page tables. Compared to TCP/IP, FaRM achieved rate improvements of 11× and 9× for
request sizes of 16 and 512 bytes, respectively. Hadoop-A in [299] reduced the overheads of materializing
intermediate data in MapReduce by proposing a virtual shuffling mechanism that utilizes RDMA in InfiniBand
and optimizes the use of memory buffers. Improvements of up to 27% compared to default shuffling and a 12%
reduction in power consumption were achieved.
The studies in [300]-[309] focused on utilizing improved CPUs, GPUs, and Field Programmable Gate Arrays
(FPGAs) to accelerate big data applications. An investigation of the inefficiencies of scale-out clusters for cloud
applications based on micro-architectural characterization of their workloads was conducted in [300]. It was found
that scale-out workloads exhibit high instruction-cache miss rates, low instruction- and memory-level parallelism,
and low core-to-core bandwidth requirements, and recommendations for specialized servers were
provided. The author in [301] proposed a system for exascale computation that includes low-power processors,
byte-addressable NAND flash as main memory, and Dynamic RAM (DRAM) as cache for the flash. A total of
2550 nodes in 51 boards were designed to have their main memories connected as a Hoffman-Singleton graph,
which provides 2.6 PB of capacity, 9.1 TB/s of bisection bandwidth, and a latency reduced to at most four hops. Mars in
[302] was proposed as a MapReduce runtime system that utilized NVIDIA GPUs, AMD GPUs, and multi-core
CPUs to accelerate the computations. To ease the programming of MapReduce on GPUs, the MarsCUDA
programming framework was introduced to automate the task partitioning, data allocation, and thread allocation
to key-value pairs. The results showed that Mars outperformed Phoenix, a CPU-only MapReduce system for multi-
core CPUs, by 22 times. The authors in [303] proposed coupling the use of CPUs and GPUs for
accelerating map tasks with the use of MPTCP to reduce shuffling time, thereby improving the performance and reliability of
Hadoop.
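The task-partitioning step that such runtimes automate can be illustrated in plain Python (a conceptual, host-side sketch only; real GPU MapReduce runtimes map chunks to GPU threads rather than thread-pool workers): the input is split into fixed-size chunks, each chunk is mapped to key-value pairs in parallel, and the pairs are then grouped and reduced by key.

```python
# Conceptual sketch of map-side partitioning in a MapReduce runtime: split the
# input into fixed-size chunks, emit key-value pairs per chunk in parallel, then
# group by key for the reduce stage. This host-side Python illustration stands in
# for the GPU thread mapping that systems such as Mars automate.

from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_chunk(chunk):
    """WordCount-style map: emit (word, 1) for every word in the chunk."""
    return [(word, 1) for line in chunk for word in line.split()]

def run_mapreduce(lines, chunk_size=2, workers=4):
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    grouped = defaultdict(int)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for pairs in pool.map(map_chunk, chunks):
            for key, value in pairs:          # shuffle and reduce (sum) per key
                grouped[key] += value
    return dict(grouped)

if __name__ == "__main__":
    print(run_mapreduce(["big data on gpus", "big clusters", "data centers", "gpus"]))
```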
FPMR was proposed in [304] as a framework to help developers boost MapReduce for machine learning and
data mining in FPGAs while masking the details of task scheduling and data management and communications
between FPGAs and CPU. A processor scheduler was implemented in FPGA to control the assignments of map
and reduce tasks, and a Common Data Path (CDP) was built to perform data transfers without redundancy to
several tasks. As an example, a compute-intensive ranking algorithm, RankBoost, was mapped onto FPMR. Compared
to a CPU-only implementation, speedups of 16.74× and 31.8× were achieved without and with CDP, respectively.
SODA in [305] introduced a framework for software defined accelerators based on FPGA. To target
heterogeneous architectures, SODA utilizes a component based programming model for the accelerators, and
Dataflow execution phases to parallelize the tasks with an out-of-order scheduling scheme. In a Constrained
Shortest Path Finding (CSPF) algorithm for SDN applications in a 128-node network, an acceleration of 43.75× was achieved.
To accelerate MapReduce with high energy efficiency in hosting data centers, the authors in [306] extended the
current High Level Synthesis (HLS) toolflow to include the MapReduce framework, which allows customizing
the accelerations and scaling to data center level. To allow easy transition from C/C++ to Register Transfer Level
(RTL) design, the tasks and data paths in MapReduce were decoupled, the accessible memory for each task was
isolated, and a dataflow computational approach was enforced. The throughput was improved by up to 4.3× while
achieving a two-order-of-magnitude improvement in energy efficiency compared to a multi-core processor. A
multilevel NoSQL cache, implemented as an FPGA-based in-NIC cache and a software-based in-kernel cache, was proposed
in [307] to accelerate NoSQL workloads. To efficiently process entries with different sizes, multiple heterogeneous
Processing Elements (PEs) (i.e., string PE, hash PE, list PE, and set PE) were utilized instead of pipelining. FPGA-
based NICs were also utilized in [308] to perform one-at-a-time processing for data streaming instead of the micro-
batch processing of Apache Spark Streaming, and filtering the data in the NIC was suggested to reduce the amount of processed data.
Experimental results demonstrated that WordCount throughput was improved by 22× and change point detection
latency was reduced by 94.12%. The authors in [309] examined the challenges associated with partitioning large
graphs for the Graphlet counting algorithm from bioinformatics and proposed a framework for reconfigurable
acceleration in FPGAs that achieved an order-of-magnitude improvement compared to processing on quad-core
CPUs, as well as better scalability.
The challenges for big data applications in disaggregated data centers were discussed in [310], [311]. In [310],
the network requirements for supporting big data applications in disaggregated data centers were discussed. The
key findings were that existing networking hardware that provides between 40 Gbps and 100 Gbps data rates is
sufficient for rack-scale and even data center-scale disaggregation with some workloads, if the network software
is improved and the protocols used are deadline-aware. In [311], the authors demonstrated the feasibility of
composable rack-scale disaggregation for heterogeneous big data workloads, and negligible latency impact was
recorded for some of the workloads, such as Memcached. Table VI provides the second part of the summary of
data center-focused optimization studies reported in this section.

TABLE VI SUMMARY OF DATA CENTER-FOCUSED OPTIMIZATION STUDIES - II


Ref Objective Application/ Tools Benchmarks/workloads Experimental Setup/Simulation environment
platform
[258]* Orchestra: Task-aware cluster management Spark BitTorrent-like protocol, Facebook traces EC2 instances, DETERlab cluster
to reduce the completion time of data transfers Weighted Shuffle
Scheduling
[259]* Seawall: Centralized network bandwidth - Shim layer, Strawman Web workloads Cosmos; 60 nodes in 3 racks (Xeon 2.27 GHz CPU,
allocation scheme in multitenant environments bandwidth allocator 4GB RAM), 1 Gbps NIC ports
[260]* FlowComb: Transparent network-management Hadoop 1.x Software agents, Sort 10GB 14 nodes
framework for high utilization and fast OpenFlow
processing
[261]* Distributed Flow Scheduling (DiFS) for - Imbalance Detection and Deterministic, random, Packet-level stand-alone simulator for Fat-tree
adaptive routing in hierarchical data centers Explicit Adaptation and shuffling flows (120 with 16 hosts and 1024 hosts
algorithm GB)
[262]* Varys: Centralized co-flow scheduling to reduce - Smallest-Effective- Facebook traces Extra large high-memory instances in a 100-machine EC2 cluster,
completion time of data-intensive applications Bottleneck First (SEBF) task-level trace-driven simulations
heuristic
[263]* Baraat: Decentralized task-aware scheduler for Memcached FIFO limited Bing, Cosmos, 20 nodes in 5 racks testbed (Quad core, 4GB RAM), 1 GbE
co-flows in DCNs to reduce tail completion time multiplexing, Smart and Facebook traces flow-based simulations, ns-2 large-scale simulations
Priority Class (SPC)
[264]* D-CAS: A decentralized coflow-aware - Online 2-approximation Facebook traces Python-based trace-driven simulations
scheduling in large-scale DCNs scheduling algorithm
[265]* Rapier: coflow-aware joint routing - Online heuristic, Many-to-many Spine-leaf DCN with 9 nodes (4 CPU cores,8GB RAM, 500GB disk)
and scheduling in DCNs OpenFlow communication patterns large-scale event-based flow-level simulations for 512 nodes Fat-tree
and VL2
[266]* Low-complexity online multicast - BCMS algorithm Unicast and mixed Event-driven flow simulations for 32-ary Fat-tree
flows scheduling in Fat-tree DCNs synthetic traffic
[267]* Pythia: real-time prediction of data Hadoop 1.1.2 Application-specific HiBench benchmarks: 10 nodes (12 CPU cores, 128 GB RAM, 1 SATA disk) in
communication volume to accelerate instrumentation, Sort and Nutch index two racks, OpenFlow-enabled ToRs
MapReduce OpenFlow
[268]* Application-aware Resources Allocation - Neural networks Different applications Cloudsim-based simulations for 20 servers with 80 VMs
scheme for SDN-based data centers traffic
[269]* BASS: Bandwidth-Aware Scheduling Hadoop 1.2.1 OVS, time-slotted WordCount, Sort 5-nodes cluster
with SDN in Hadoop clusters bandwidth
allocation algorithm
[270]* Scheduling in dynamic OCS-switching data - SCFT and FCFT NAS Parallel Benchmark OCSEMU emulator with 20 nodes (E5520 CPU,
centers for delay-sensitive applications scheduling algorithms (NPB) suite 24 GB RAM, 500 GB disk)
[271]* Slotted resource allocation in an - Online and incremental Iterative MapReduce OMNET++ 4.3.1-based packet-level simulations
SDN control-based hybrid DCN scheduling heuristics workloads

[272]* Scheduling algorithms in hybrid - MILP, heuristics 10k-20k random 1024 nodes in two-tier network
packet optical data centers requests (4 core MEMS switches, and 64 ToRs)
[273]* Scheduling in packet-switched optical DCNs - Mahout, C4.5, and Naïve Random traffic Simulations for 80 ToRs connected by two 40×40 AWGRs
Bayes Discretization and two 1 × 2 space switches
(NBD)
[274]* Max-throughput data transfer scheduling - MILP, heuristic 1-1, many-1 Bulk Simulations for three-tier and Fat-tree DCNs with 128 nodes and
to minimize data retrieval time transfers VL2 with 160 nodes.
[275]* Topology-aware data placement Hadoop 2.2.0 Heuristic WordCount, TeraSort, 18 nodes, TopoSim MapReduce simulator
in geo-distributed data centers scheduler k-means
[276]* CoMan: Global in-network management in - 3/2 approximation 55 flows from Emulation for a Fat-tree DCN with 10 switches and 8 servers
multiplexed data centres algorithm 10 applications (using Pica8 3297), trace-driven simulations
[277]* Network-aware MapReduce tasks placement Hadoop 1.2.1 Probabilistic tasks Wordcount, Grep from Palmetto HPC platform with 1978 slave nodes (16 cores CPU, 16GB
to reduce transmission costs in DCNs scheduling algorithm, BigDataBench, Terasort RAM, 3GB disk), 10 GbE
[278]* Network-aware Scheduler (NAS) Hadoop 2.x MTS, CA-RTS, and CR- Facebook traces 40 nodes in 8 racks cluster with 1 GbE for cross-rack links
for high performance shuffling RTS scheduling Trace-driven event-based simulations for 600 nodes in 30 racks with
algorithms 200 users

[279]* PROTEUS: system for Time-Interleaved Hive, Profiling, Sort, WordCount, Join, Profiling: 33 nodes (4 cores 2.4GHz CPU, 4GB RAM), Gigabit
Virtual Cluster (TIVC) abstraction for multi- Hadoop dynamic programming aggregation switch
tenant data centers 0.20.0 prototype: 18-nodes in three-tier data center with NetFPGA switches,
simulations
[280]* Corral: offline scheduling framework to reduce Hadoop 2.4 Offline planner, SWIM and Microsoft 210 nodes (32 cores) in 7 racks, 10GbE
cross-rack traffic for production workloads cluster scheduler traces, TPC-H large-scale simulations for 2000 nodes in 50 racks
[281]* Mercury: Fine-grained hybrid resources Hadoop 2.4.1 Daemon in each node, Gridmix and Microsoft’s 256 nodes cluster (2 8-core CPU, 128GB RAM, 10 3TB disks),
management and scheduling extensions to YARN production traces 10 Gbps in rack, 6 Gbps between racks
[282]* Optimizing containers allocation to reduce Hadoop 2.7, Simulated annealing, k-means, 8 nodes (8-core CPU, 32GB RAM, SATA disk) in a Fat-tree topology
congestion and improve data locality Apache Flink modification to AM connected components 1 GbE, OpenFlow 1.1
0.9
[283]* Examining the impact of the Energy Hadoop 1.x EEE in switches, TeraSort, Search, MRperf-based packet-level simulations
Efficient Ethernet on MapReduce workloads packets coalescing Index for two racks cluster with up to 80 nodes
[284]* Energy-aware hierarchical applications - k th max node sorting, Uniform / normal random C++ based simulations for up to 4096 nodes
scheduling in large DCNs dynamic max node applications demands with 32, 64, 128, and 256 ports switches
sorting algorithms
[285]* Willow: Energy-efficient SDN-based flows Hadoop 1.x Online greedy Locally generated 16 nodes (AMD CPU, 2GB RAM), 1 GbE, Simulations for Fat-tree
scheduling in data centers with network-limited approximation algorithm MapReduce traces and Fat-tree with disabled core switches, 1 GbE
workloads
[286]* JouleMR: Cost-effective and energy-aware Hadoop 2.6.0 Two-phase heuristic TeraSort, GridMix 10 nodes cluster (2 6-cores CPUs, 24GB RAM, 500GB disk), 10 GbE
MapReduce jobs and tasks scheduling Facebook traces simulations for 600 nodes in tree-structured DCN, with 1 GbE links
[287]* Evaluation of scale-up vs scale-out Vanilla Parameters optimization Log processing, select, 16 nodes (Quad-core CPU, 12GB RAM, 160GB HDD, 32GB SSD), 1
systems for Hadoop workloads Hadoop to Hadoop in scale-up aggregate, join, TeraSort, GbE, Dell PowerEdge910 (4 8-core CPU, 512GB RAM, 2 600GB
servers k-means, indexing HDD, 8 240GB SSD)
[288]° Hybrid scale-out and scale-up Hadoop 1.2.1 Automatic Hadoop WordCount, Grep, 12 nodes (2 4-core CPU, 16GB RAM, 193GB HDD), 10 Gbps
Hadoop cluster parameters optimization, Terasort, TestDFSIO, Myrinet, 2 nodes (4 6-core CPU, 505GB RAM, 91GB HDD), 10 Gbps
scheduler to select cluster Facebook traces Myrinet, OrangeFS
[289]° Measuring the impact of InfiniBand and Hadoop Testing different network Sort, random write, 10 nodes cluster (Quad-core CPU, 6GB RAM, 250GB HDD or (up to
10Gigabit on HDFS 0.20.0 interfaces and sequential write 4), 64GB SSD), InfiniBand DDR 16 Gbps links and 10GbE
technologies
[290]° Optimizing HPC clusters for dual compute- Spark 0.7.0 Enhanced Load Balancer, GroupBy, Grep, Logistic Hyperion cluster with 101 nodes ( 2 8-core CPUs, 64GB RAM,
centric and data-centric computations Congestion-Aware Task Regression (LR) 128GB SSD) in 2 racks, InfiniBand QDR 32 Gbps links
Dispatching
[291]° Cost-per-performance evaluation of MapReduce Hadoop 2.x Standalone evaluation Teragen, Terasort, (8 core CPU, 48GB RAM) 10 GbE, single rack, SSDs 1.3TB, HDDs
clusters with SSDs and HDDs with different storage and Teravalidate, WordCount, 2TB setups 6 HDDs, 11 HDDs, 1 SSD, (6 HDDs + 1 SSD)
application shuffle, HDFS
configurations
[292]° Improving Hadoop framework for Hadoop Modified map data Terasort, DFSIO 16 nodes (8 core CPU, 256GB RAM, 600GB HDD, 512GB SATA
better utilization of SSDs 2.6.0, handling, pre-read in SSD, 10GbE) 8 nodes (6 core CPU, 256GB RAM, 250GB HDD,
Spark 1.3 HDFS, reduce scheduler, 800GB NVMe SSD, 10GbE) Ceres-II 39 nodes (6 core CPU, 64GB
placement policy RAM, 960GB NVMe SSD, 40GbE)
[293]° mpCache: SSD-based acceleration for Hadoop 2.2.0 Modified HDFS, Grep, WC - Wikipedia 7 nodes (2 8-core CPUs, 20MB cache, 32GB RAM, 2TB SATA
MapReduce workloads in commodity clusters admission control classification - Netflix, HDD, 2 160GB SATA SSD)
policy, main cache PUMA
replacement scheme,
[294]° Evaluation of IO-intensive applications Cassandra Docker’s data volume Cassandra-stress, (32 core hyper-threaded CPU, 20480KB cache, 128GB RAM,
with Docker containers 3.0, TPC-C benchmark and 3 960GB NVMe SSD)
MySQL 5.7,
FIO 2.2
[295]° Examining the performance of different Hadoop - Modification to MRPerf Synthesized Google MRperf-based simulations for 2 and 4 racks clusters
schedulers and impact of NAS on Hadoop to implement Fair share traces for 9174 jobs
scheduler and Quincy
[296]° Improving the flexibility of direct attached Hadoop FlexDAS switch, SAS TeraSort for 1TB, YCSB 12 nodes (4 core CPU, 48GB RAM), 10GbE and 48 external HDDs
storage via a Disk Area Network (DAN) 2.0.5, expanders, Host Bus COSBench 0.3.0.0
Cassandra Adapters (HBAs)
Swift 1.10.0
[297]° Optimizing bulk data transfers from parallel Scientific Layout-aware source Offline Storage Tables Spider storage system at the Oak Ridge Leadership computing
file systems via layout-aware I/O operations applications algorithm, layout-aware (OSTs) with 20MB and Facility with 96 DataDirect Network (DDN) S2A9900 RAID
sink algorithm 256MB files controllers for 13400 1TB SATA HDDs
[298]° FaRM: Fast Remote Memory system using Key-value Circular buffer for YCSB 20 nodes (2 8-core CPUs, 128GB RAM, 240GB SSD), 40Gbps
RDMA to improve in-memory key-value store and graph RDMA messaging, RDMA over Converged Ethernet (RoCE)
stores PhyCo kernel driver,
ZooKeeper
[299]° Hadoop-A: Virtual shuffling for efficient Hadoop virtual shuffling through TeraSort, Grep, 21-nodes cluster (dual socket quad-core 2.13 GHz Intel Xeon, 8 GB
data movement in MapReduce 0.20.0 three-level segment WordCount, and RAM, 500 GB disk, 8 PCI-Express 2.0 bus), InfiniBand QDR switch,
near-demand merging, Hive workloads 48-port 10GigE
and merging sub-trees

[300]° Micro-architectural characterization of scale- - Intel VTune for Caching, NoSQL, PowerEdge M1000e (2 Intel 6-core CPUs, 32KB
out workloads characterization MapReduce, media L1 cache, 256KB L2 cache, 12MB L3 cache, 24GB RAM)
PARSEC, SPEC2006,
TPC-E
[301]° A 2 PetaFLOP, 3 Petabyte, 9 TB/s - Hoffman-Singleton - 2,550 nodes (64-bit Boston chip, 64GB DDR3 SDRAM, 1TB NAND
and 90 kW cabinet for exascale computations topology flash)

[302]° Mars: Accelerating MapReduce Hadoop 1.x CUDA, Hadoop String match, matrix (240-core NVIDIA GTX280 + 4-core CPU), (128-core NVIDIA
with GPUs streaming, GPU Prefix multiplication, MC, 8800GTX + 4-core CPU), (320-core ATI Radeon HD 3870 + 2-core
Sum routine, GPU Black-scholes, similarity CPU)
Bitonic Sort routine score, PCA
[303]° Using GPUs and MPTCP to Hadoop 1.x CUDA, Hadoop pipes, Terasort, WordCount and Node1 (Intel i7 920, 24GB RAM, 4 1TB HDD), node2 (Intel Quad
improve Hadoop performance MPTCP PiEstimate DataGen Q9400, NVIDIA GT530 GPU, 4GB RAM, 500GB HDD),
heterogeneous 5 nodes
[304]° FPMR: a MapReduce framework - On-chip processor RankBoost 1 node (Intel Pentium 4, 4GB RAM), Altera Stratix II EP2S180F1508
on FPGA to accelerate RankBoost scheduler, Common Data FPGA, Quartus II 8.1 and ModelSim 6.1-based simulations
Path (CDP)

[305]° SODA: software defined acceleration for big - Vivado high level Constrained Shortest Path Xilinx Zynq FPGA, ARM Cortex processor
data with heterogeneous reconfigurable synthesis tools, out-of- Finding (CSPF) for SDN
multicore FPGA resources order scheduling
algorithm
[306]° HLSMapReduceFlow: Synthesizable - High-Level Synthesis- WordCount, histogram, Virtex-7 FPGA
MapReduce Accelerator for FPGA-coupled Data based MapReduce Matrix multiplication,
Centers dataflow linear regression, PCA, k-
means
[307]° FPGA-based in-NIC and software-based NoSQL DRAM and multi USR, SYS, APP, NetFPGA-10G ( Xilinx Virtex-5 XC5VTX240T)
of NetFPGA, in-Kernel caches for NoSQL processing elements in ETC, VAR
NetFPGA, Netfilter
framework for kernel
cache
[308]° FPGA-based processing in NIC Spark 1.6.0, one-at-a-time WordCount, change NetFPGA-10G (Xilinx Virtex-5 XC5VTX240TFFG1759-2) as NIC in
for Spark streaming Scala 2.10.5 methodology operations point detection server node (Intel core i5 CPU, 8GB DRAM) and a client node
[309]° FPGA-acceleration for large - Graph Processing Graphlet counting Convey HC-1 server with Virtex-5 LX330s FPGA
graph processing Elements (GPEs), algorithm
memory interconnect
network, run-time
management unit
[310]° Network requirements for big data Hadoop, Page-level memory WordCount, sort, EC2 instances (m3.2xlarge, c3.4xlarge), with virtual private network
application in disaggregated data centers Spark, access, block-level collaborative (VPC), simulations and emulations
GraphLab, distributed data filtering, YCSB
Timely placement RDMA and
dataflow, integrated NICs
Memcached,
HERD,
SparkSQL
[311]° A composable architecture for rack-scale Memcached, Empirical approaches for 100k Memcached H3 RAID array with 12 SATA drives and single PCIe Gen3 × 4 port,
disaggregation for big data computing Giraph, resources provisioning, operations, PCIe switch with 16 Gen2 × 4 port, host bus adapter connected to
Cassandra PCIe switches TopKPagerank, 10k IBM 3650M4
Cassandra operations
* Scheduling in data centers, °Performance improvements based on advances in technologies.
VIII. SUMMARY OF OPTIMIZATION STUDIES AND FUTURE RESEARCH DIRECTIONS:
Tremendous efforts were devoted in the last decade to address various challenges in optimizing the
deployment of big data applications and their infrastructures. The ever increasing data volumes and the
heterogeneous requirements of available and proposed frameworks in dedicated and multi-tenant clusters are still
calling for further improvements of different aspects of these infrastructures. We categorized the optimization
studies of big data applications into three broad categories: the application level, the cloud networking level,
and the data center level. Application-level studies focused on the efforts to improve the configuration or
structure of the frameworks. These studies were further categorized into optimizing jobs and data placements,
jobs scheduling, and completion time, in addition to benchmarking, production traces, modeling, profiling and
simulators for big data applications. Studies at the cloud networking level addressed the optimization beyond
single cluster deployments, for example for transporting big data and/or for in-network processing of big data.
These studies were further categorized into cloud resource management, virtual assignments and container
assignments, bulk transfers and inter data center networking, and SDN, EON, and NFV optimization studies.
Finally, the data center-level studies focused on optimizing the design of the clusters to improve the performance
of the applications. These studies were further categorized into topology, routing, scheduling of flows, coflows,
and jobs, and advances in computing, storage, and networking technologies studies. In what follows, we highlight
key challenges for big data application deployments and provide some insights into research gaps and future
directions.
Big Data Volumes: There is a huge gap in most of the studies between the actual volumes of big data and
the tested traffic or workload volumes. The main reason is the relatively high cost of experimenting in large
clusters or renting IaaS. This calls for improving existing simulators or performance models to enable accurate
testing of systems at large scale. Data volumes are continuing to grow beyond the capabilities of existing systems
due to many bottlenecks in computing and networking. This will require continuing the investment in scale-up
systems to incorporate different technologies such as SDN and advanced optical networks for intra and inter data
center networking. Another key challenge with big data is related to the veracity and value of the data, which call
for cleansing techniques prior to processing to eliminate unnecessary computations.
Workload characteristics and their modeling: Big data workload characteristics and available frameworks will
keep changing and evolving. Most big data studies address only the MapReduce framework, while few
consider other variations such as key-value stores, streaming and graph processing applications, or a mixture of
applications. Also, several studies have utilized scaled and outdated production traces, where only high-level
statistics are available, or a subset of workloads from micro benchmarks for the evaluations, which might
not be very representative. Thus, there is a need for more comprehensive, enhanced, and updated benchmarking
suites and production traces to reflect more recent frameworks and workload characteristics.
Resource allocation: Different workloads can have uncorrelated CPU, I/O, memory, and networking resource
requirements and placing those together can improve the overall infrastructure utilization. However, isolating
resources such as cache, network, and I/O via recent management platforms at large scale is still challenging.
Improving some of the resources can change the optimum configurations for applications. For example, improving
the networking can reduce the need to maintain data locality, and hence, the stress on tasks scheduling is reduced.
Also, improving the CPU can change CPU-bound workloads into I/O or memory-bound workloads. Another
challenge is that there is still a gap between the users’ knowledge about their requirements and the resources they
lease which leads to non-optimal configurations and hence, waste of resources, delayed response, and higher
energy consumption. More tools to aid users in understanding their requirements or their workloads are required
for better resource utilization.
Performance in proposed clusters: Most of the research that enhances big data frameworks, reported in
Section III, was carried out in scale-out clusters while overlooking the physical layer characteristics and focusing on
the framework details. Alternatively, most of the studies in Sections V and VII have considered custom-built
simulators, SDN switch-based emulations, or small scale-up/out clusters while focusing on the hardware impact,
considering only a subset of configurations, and oversimplifying the frameworks characteristics. Although it
might sound more economical to scale out infrastructures as the computational requirements increase, this might
not be sufficient for applications with strict QoS requirements as scaling-out depends on high level of parallelism
which is constrained by the network between the nodes. Hence, improving the networking and scaling up the data
centers are required and are gaining research and industrial attention. In improving DCNs, many tradeoffs
should be considered, including scalability, agility, and end-to-end delays, in addition to the complexity of the
routing and scheduling mechanisms required to fully exploit the improved link bandwidths. This will mostly be
satisfied by application-centric DCN architectures that utilize SDN to dynamically vary the topologies at run-time
to match the requirements of deployed applications.
Cluster heterogeneity: The potential existence of hardware with different specifications, for example due to
replacements in very large clusters, can lead to completion time imbalance among tasks. This requires more
attention in big data studies and accurate profiling of the performance in all nodes to improve task scheduling to
reduce the imbalance.
Multi-tenancy environments: Multi-tenancy is a key requirement enabled by the virtualization of cloud data centers,
where the workloads of different users are multiplexed onto the same infrastructure to improve its utilization. In current
cloud services, static cluster configurations and manual adjustments are still the norm, but they are not optimal, and
dynamic job allocation and scheduling for multiple users is still lacking. In such environments, performance
isolation between users who share resources such as the network and I/O is not yet widely addressed. Also,
fairness between users and optimal pricing while maintaining acceptable QoS for all users remain challenging
research topics that require more comprehensive multi-objective studies.
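One established notion of multi-resource fairness that such studies could build on is Dominant Resource Fairness (DRF). The sketch below is a minimal progressive-filling implementation of the DRF idea for two tenants; the cluster capacities and per-task demands are illustrative assumptions, and a production scheduler would additionally handle task placement, preemption, and pricing.

# Minimal DRF-style sketch: repeatedly give one task's worth of resources to
# the tenant with the smallest dominant share. Capacities and per-task
# demands are illustrative.
CAPACITY = {"cpu": 90.0, "mem": 180.0}        # cluster totals (cores, GB)
DEMANDS = {                                    # per-task demand of each tenant
    "tenant_a": {"cpu": 1.0, "mem": 4.0},      # memory-heavy
    "tenant_b": {"cpu": 3.0, "mem": 1.0},      # CPU-heavy
}

def drf_allocate(capacity, demands):
    used = {r: 0.0 for r in capacity}
    tasks = {t: 0 for t in demands}
    while True:
        # dominant share = max over resources of (allocated / capacity)
        def dominant(t):
            return max(tasks[t] * demands[t][r] / capacity[r] for r in capacity)
        # pick the tenant with the smallest dominant share whose next task fits
        for t in sorted(demands, key=dominant):
            if all(used[r] + demands[t][r] <= capacity[r] for r in capacity):
                for r in capacity:
                    used[r] += demands[t][r]
                tasks[t] += 1
                break
        else:
            return tasks, used  # no tenant's next task fits any more

if __name__ == "__main__":
    print(drf_allocate(CAPACITY, DEMANDS))  # both tenants end with equal dominant shares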
Geo-distributed frameworks: The challenges faced in geo-distributed frameworks were summarized in
Subsection IV-C. These include the need to modify frameworks that were originally designed for single-cluster
deployments, and the need for new frameworks for offers and pricing, optimal resource allocation in
heterogeneous clusters, QoS guarantees, resilience, and energy consumption minimization. A key challenge with
workloads in geo-distributed frameworks is that not all workloads can be divided, as the whole data set may need
to reside in a single cluster; moreover, transporting some data sets from remote locations can incur high data-access
latency. Extending SDN control across transport networks and within data centers is a promising research direction
to jointly optimize path computation, provision distributed resources, and reduce job completion times, and also
to improve big data applications and frameworks in geo-distributed environments.
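As a simple illustration of the data-movement trade-off in geo-distributed settings, the toy Python sketch below picks the data center at which to aggregate intermediate results so that the total volume shipped over the WAN is minimized; the site names and data volumes are assumptions, and a realistic formulation would also account for inter-DC bandwidth, monetary cost, and latency.

# Minimal sketch: choose the aggregation site that minimizes the total WAN
# transfer volume (data already at the chosen site is not moved).
intermediate_gb = {"dc_eu": 400, "dc_us": 250, "dc_asia": 120}  # assumed volumes

def best_aggregation_site(volumes):
    cost = {site: sum(v for s, v in volumes.items() if s != site)
            for site in volumes}
    return min(cost, key=cost.get), cost

if __name__ == "__main__":
    site, cost = best_aggregation_site(intermediate_gb)
    print(site, cost)  # dc_eu minimizes the data moved over the WAN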
Power consumption: Current infrastructures exhibit a trade-off between energy efficiency and application
performance. Most cloud providers still favor over-provisioning to meet SLAs over reducing power
consumption. Reducing power consumption while maintaining performance is an area that should be
explored further when designing future systems. For example, it is attractive to consider more energy-efficient
components and more efficient geo-distributed frameworks that reduce the need for transporting big data. It is also
attractive to perform progressive computations in the network as the data transits it, exploiting agile
technologies such as SDN, VNE, and NFV, which can improve the energy efficiency of big data applications;
however, the impact of such strategies on application performance should be comprehensively assessed.
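The following back-of-the-envelope Python sketch illustrates this trade-off between transporting raw data to a central cloud and performing progressive in-network or fog processing; every parameter is an illustrative assumption rather than a measured value, and the point is only that the comparison hinges on the data-reduction ratio and the relative efficiency of the processing locations.

# Minimal sketch of the energy trade-off: ship all raw data to a central DC
# versus reducing it near the source and shipping only the smaller results.
# All figures below are illustrative assumptions, not measured values.
RAW_GB = 1000.0          # raw data generated at the edge
REDUCTION = 0.05         # fraction of data left after in-network processing
E_NET_J_PER_GB = 50.0    # assumed network energy per GB transported end-to-end
E_DC_J_PER_GB = 80.0     # assumed processing energy per GB in the central DC
E_FOG_J_PER_GB = 110.0   # assumed processing energy per GB at fog nodes

def central_processing(raw_gb):
    return raw_gb * E_NET_J_PER_GB + raw_gb * E_DC_J_PER_GB

def progressive_processing(raw_gb, reduction):
    reduced = raw_gb * reduction
    return raw_gb * E_FOG_J_PER_GB + reduced * (E_NET_J_PER_GB + E_DC_J_PER_GB)

if __name__ == "__main__":
    print("central    :", central_processing(RAW_GB), "J")
    print("progressive:", progressive_processing(RAW_GB, REDUCTION), "J")
    # With these assumed figures progressive processing wins only because the
    # transported volume shrinks by 20x; different parameters can reverse it.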
IX. CONCLUSIONS
Improving the performance and efficiency of big data frameworks and applications is an ongoing and critical
research area, as these frameworks are becoming the mainstream means of implementing various services, including
data analytics and machine learning at large scale, that continue to grow in importance. Support should also be
developed for other services deployed fully or partially in cloud and fog computing environments with ever-increasing
volumes of generated data. Big data interacts with systems at different levels, starting from the acquisition of data
from users and IoT devices through wireless and wired access networks, to transmission over WANs, and into
different types of data centers for storage and processing via different frameworks. This paper surveyed big data
applications and the technologies and network infrastructures needed to implement them. It identified a number of
key challenges and research gaps in optimizing big data applications and infrastructures, and comprehensively
summarized early and recent efforts towards improving the performance and/or energy efficiency of such applications
at different layers. For the convenience of readers with different backgrounds, brief descriptions of big data
applications and frameworks, cloud computing and related emerging technologies, and data centers are provided in
Sections II, IV, and VI, respectively. The optimization studies in Section III focus on the frameworks, those in
Section V on cloud networking, and those in Section VII on data centers, with comprehensive summaries given in
Tables I-VI. The survey paid attention to a range of existing and proposed technologies and focused on different
frameworks and applications including MapReduce, data streaming, and graph processing. It considered
different optimization metrics (e.g., completion time, fairness, cost, profit, and energy consumption), reported
studies that used different representative workloads, optimization tools, and mathematical techniques, and covered
simulation-based and experimental evaluations in clouds and prototypes. Finally, future research directions were
provided in Section VIII to aid researchers in identifying the limitations of current solutions and hence determining
the areas where future technologies should be developed to improve big data applications, their infrastructures,
and their performance.
ACKNOWLEDGEMENTS
Sanaa Hamid Mohamed would like to acknowledge Doctoral Training Award (DTA) funding from the UK
Engineering and Physical Sciences Research Council (EPSRC). This work was supported by the Engineering and
Physical Sciences Research Council through the INTERNET (EP/H040536/1), STAR (EP/K016873/1), and TOWS
(EP/S016570/1) projects. All data are provided in full in the results section of this paper.
REFERENCES
[1] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward Scalable Systems for Big Data Analytics: A Technology
Tutorial,” Access, IEEE, vol. 2, pp. 652–687, 2014.
[2] Y. Demchenko, C. de Laat, and P. Membrey, “Defining architecture components of the Big Data Ecosystem,”
in Collaboration Technologies and Systems (CTS), 2014 International Conference on, May 2014, pp. 104–112.
[3] X. Yi, F. Liu, J. Liu, and H. Jin, “Building a network highway for big data: architecture and challenges,”
Network, IEEE, vol. 28, no. 4, pp. 5–13, July 2014.
[4] H. Fang, Z. Zhang, C. J. Wang, M. Daneshmand, C. Wang, and H. Wang, “A survey of big data research,”
Network, IEEE, vol. 29, no. 5, pp. 6–9, September 2015.
[5] W. Tan, M. Blake, I. Saleh, and S. Dustdar, “Social-Network-Sourced Big Data Analytics,” Internet
Computing, IEEE, vol. 17, no. 5, pp. 62–69, Sept 2013.
[6] C. Fang, J. Liu, and Z. Lei, “Parallelized user clicks recognition from massive HTTP data based on dependency
graph model,” Communications, China, vol. 11, no. 12, pp. 13–25, Dec 2014.
[7] H. Hu, Y. Wen, Y. Gao, T.-S. Chua, and X. Li, “Toward an SDN-enabled big data platform for social TV
analytics,” Network, IEEE, vol. 29, no. 5, pp. 43–49, September 2015.
[8] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Commun. ACM,
vol. 51, no. 1, pp. 107–113, Jan. 2008.
[9] C. P. Chen and C.-Y. Zhang, “Data-intensive applications, challenges, techniques and technologies: A survey
on big data,” Information Sciences, vol. 275, pp. 314 – 347, 2014.
[10] J. Hack and M. Papka, “Big Data: Next-Generation Machines for Big Science,” Computing in Science
Engineering, vol. 17, no. 4, pp. 63–65, July 2015.
[11] Y. Xu and S. Mao, “A survey of mobile cloud computing for rich media applications,” Wireless
Communications, IEEE, vol. 20, no. 3, pp. 46–53, June 2013.
[12] “Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016-2021,” White Paper,
Cisco, March 2017.
[13] X. He, K. Wang, H. Huang, and B. Liu, “QoE-Driven Big Data Architecture for Smart City,” IEEE
Communications Magazine, vol. 56, no. 2, pp. 88–93, Feb 2018.
[14] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of Things: A Survey on
Enabling Technologies, Protocols, and Applications,” Communications Surveys Tutorials, IEEE, vol. 17, no. 4,
pp. 2347–2376, Fourthquarter 2015.
[15] E. Marín-Tordera, X. Masip-Bruin, J. G. Almiñana, A. Jukan, G. Ren, J. Zhu, and J. Farre, “What is a Fog
Node A Tutorial on Current Concepts towards a Common Definition,” CoRR, vol. abs/1611.09193, 2016.
[16] P. Mach and Z. Becvar, “Mobile Edge Computing: A Survey on Architecture and Computation Offloading,”
IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1628–1656, thirdquarter 2017.
[17] I. Stojmenovic, “Fog computing: A cloud to the ground support for smart things and machine-to-machine
networks,” in 2014 Australasian Telecommunication Networks and Applications Conference (ATNAC), Nov
2014, pp. 117–122.
[18] S. Wang, X. Wang, J. Huang, R. Bie, and X. Cheng, “Analyzing the potential of mobile opportunistic
networks for big data applications,” IEEE Network, vol. 29, no. 5, pp. 57–63, September 2015.
[19] Z. T. Al-Azez, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient IoT
virtualization framework with passive optical access networks,” in 2016 18th International Conference on
Transparent Optical Networks (ICTON), July 2016, pp. 1–4.
[20] Z. T. Al-Azez and A. Q. Lawey and T. E. H. El-Gorashi and J. M. H. Elmirghani, “Virtualization framework
for energy efficient IoT networks,” in 2015 IEEE 4th International Conference on Cloud Networking (CloudNet),
Oct 2015, pp. 74–77.
[21] A. A. Alahmadi, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Distributed processing in
vehicular cloud networks,” in 2017 8th International Conference on the Network of the Future (NOF), Nov 2017,
pp. 22–26.
[22] H. Q. Al-Shammari, A. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient service embedding
in IoT networks,” in 2018 27th Wireless and Optical Communication Conference (WOCC), April 2018, pp. 1–5.
[23] B. Yosuf, M. Musa, T. Elgorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy Efficient Service
Distribution in Internet of Things,” in 2018 20th International Conference on Transparent Optical Networks
(ICTON), July 2018, pp. 1–4.
[24] I. S. M. Isa, M. O. I. Musa, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy Efficiency
of Fog Computing Health Monitoring Applications,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–5.
[25] M. B. A. Halim, S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Fog-assisted caching
employing solar renewable energy for delivering video on demand service,” CoRR, vol. abs/1903.10250, 2019.
[26] F. S. Behbehani, M. O. I. Musa, T. Elgorashi, and J. M. H. Elmirghani, “Energy-efficient distributed
processing in vehicular cloud architecture,” CoRR, vol. abs/1903.12451, 2019.
[27] B. Yosuf, M. N. Musa, T. Elgorashi, and J. M. H. Elmirghani, “Impact of distributed processing on power
consumption for iot based surveillance applications,” 2019.
[28] I. S. M. Isa, M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient and Resilient
Infrastructure for Fog Computing Health Monitoring Applications,” arXiv e-prints, p. arXiv:1904.01732, Apr
2019.
[29] R. Ma, A. A. Alahmadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient Software Matching
in Vehicular Fog,” arXiv e-prints, p. arXiv:1904.02592, Apr 2019.
[30] Z. T. Al-Azez, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient IoT
Virtualization Framework with Peer to Peer Networking and Processing,” IEEE Access, pp. 1–1, 2019.
[31] M. S. Hadi, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Patient-Centric Cellular
Networks Optimization using Big Data Analytics,” IEEE Access, pp. 1–1, 2019.
[32] K. Dolui and S. K. Datta, “Comparison of edge computing implementations: Fog computing, cloudlet and
mobile edge computing,” in 2017 Global Internet of Things Summit (GIoTS), June 2017, pp. 1–6.
[33] J. Andreu-Perez, C. Poon, R. Merrifield, S. Wong, and G.-Z. Yang, “Big Data for Health,” Biomedical and
Health Informatics, IEEE Journal of, vol. 19, no. 4, pp. 1193–1208, July 2015.
[34] X. Xu, Q. Sheng, L.-J. Zhang, Y. Fan, and S. Dustdar, “From Big Data to Big Service,” Computer, vol. 48,
no. 7, pp. 80–83, July 2015.
[35] S. Mazumdar and S. Dhar, “Hadoop as Big Data Operating System– The Emerging Approach for Managing
Challenges of Enterprise Big Data Platform,” in Big Data Computing Service and Applications (BigDataService),
2015 IEEE First International Conference on, March 2015, pp. 499–505.
[36] J. Xie, S. Yin, X. Ruan, Z. Ding, Y. Tian, J. Majors, A. Manzanares, and X. Qin, “Improving MapReduce
performance through data placement in heterogeneous Hadoop clusters,” in Parallel Distributed Processing,
Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, April 2010, pp. 1–9.
[37] J. Leverich and C. Kozyrakis, “On the Energy (in)Efficiency of Hadoop Clusters,” SIGOPS Oper. Syst. Rev.,
vol. 44, no. 1, pp. 61–65, Mar. 2010.
[38] Y. Ying, R. Birke, C. Wang, L. Chen, and N. Gautam, “Optimizing Energy, Locality and Priority in a
MapReduce Cluster,” in Autonomic Computing (ICAC), 2015 IEEE International Conference on, July 2015, pp.
21–30.
[39] X. Ma, X. Fan, J. Liu, and D. Li, “Dependency-aware Data Locality for MapReduce,” Cloud Computing,
IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[40] J. Tan, S. Meng, X. Meng, and L. Zhang, “Improving ReduceTask data locality for sequential MapReduce
jobs,” in INFOCOM, 2013 Proceedings IEEE, April 2013, pp. 1627–1635.
[41] B. Arres, N. Kabachi, O. Boussaid, and F. Bentayeb, “Intentional Data Placement Optimization for
Distributed Data Warehouses,” in Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference
on, Oct 2015, pp. 80–86.
[42] X. Bao, L. Liu, N. Xiao, F. Liu, Q. Zhang, and T. Zhu, “HConfig: Resource adaptive fast bulk loading in
HBase,” in Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2014
International Conference on, Oct 2014, pp. 215–224.
[43] Y. Elshater, P. Martin, D. Rope, M. McRoberts, and C. Statchuk, “A Study of Data Locality in YARN,” in
2015 IEEE International Congress on Big Data, June 2015, pp. 174–181.
[44] Z. Liu, Q. Zhang, R. Ahmed, R. Boutaba, Y. Liu, and Z. Gong, “Dynamic Resource Allocation for
MapReduce with Partitioning Skew,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.
[45] X. Ding, Y. Liu, and D. Qian, “JellyFish: Online Performance Tuning with Adaptive Configuration and
Elastic Container in Hadoop Yarn,” in Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International
Conference on, Dec 2015, pp. 831–836.
[46] M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, and I. Stoica, “Delay Scheduling: A
Simple Technique for Achieving Locality and Fairness in Cluster Scheduling,” in Proceedings of the 5th European
Conference on Computer Systems, ser. EuroSys ’10. New York, NY, USA: ACM, 2010, pp. 265–278.
[47] M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg, “Quincy: Fair Scheduling for
Distributed Computing Clusters,” in Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems
Principles, ser. SOSP ’09. New York, NY, USA: ACM, 2009, pp. 261–276.
[48] S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu, and S. Wu, “Maestro: Replica-Aware Map Scheduling for
MapReduce,” in Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium
on, May 2012, pp. 435–442.
[49] W. Wang, K. Zhu, L. Ying, J. Tan, and L. Zhang, “MapTask Scheduling in MapReduce With Data Locality:
Throughput and Heavy-Traffic Optimality,” Networking, IEEE/ACM Transactions on, vol. PP, no. 99,
pp. 1–1, 2014.
[50] J. Polo, C. Castillo, D. Carrera, Y. Becerra, I. Whalley, M. Steinder, J. Torres, and E. Ayguadé, “Resource-
aware Adaptive Scheduling for Mapreduce Clusters,” in Proceedings of the 12th ACM/IFIP/USENIX
International Conference on Middleware, ser. Middleware’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp.
187–207.
[51] J. Wolf, D. Rajan, K. Hildrum, R. Khandekar, V. Kumar, S. Parekh, K.-L. Wu, and A. Balmin, “FLEX: A
Slot Allocation Scheduling Optimizer for MapReduce Workloads,” in Proceedings of the ACM/IFIP/USENIX
11th International Conference on Middleware, ser. Middleware ’10. Berlin, Heidelberg: Springer-Verlag, 2010,
pp. 1–20.
[52] M. Pastorelli, D. Carra, M. Dell’Amico, and P. Michiardi, “HFSP: Bringing Size-Based Scheduling To
Hadoop,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[53] Y. Yao, J. Tai, B. Sheng, and N. Mi, “LsPS: A Job Size-Based Scheduler for Efficient Task Assignments in
Hadoop,” Cloud Computing, IEEE Transactions on, vol. 3, no. 4, pp. 411–424, Oct 2015.
[54] Y. Yuan, D. Wang, and J. Liu, “Joint scheduling of MapReduce jobs with servers: Performance bounds and
experiments,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, April 2014, pp. 2175–
2183.
[55] F. Chen, M. Kodialam, and T. V. Lakshman, “Joint scheduling of processing and Shuffle phases in
MapReduce systems,” in INFOCOM, 2012 Proceedings IEEE, March 2012, pp. 1143–1151.
[56] S. Kurazumi, T. Tsumura, S. Saito, and H. Matsuo, “Dynamic Processing Slots Scheduling for I/O Intensive
Jobs of Hadoop MapReduce,” in Proceedings of the 2012 Third International Conference on Networking and
Computing, ser. ICNC ’12. Washington, DC, USA: IEEE Computer Society, 2012, pp. 288–292.
[57] E. Bampis, V. Chau, D. Letsios, G. Lucarelli, I. Milis, and G. Zois, “Energy Efficient Scheduling of
MapReduce Jobs,” in Euro-Par 2014 Parallel Processing, ser. Lecture Notes in Computer Science, F. Silva, I.
Dutra, and V. Santos Costa, Eds. Springer International Publishing, 2014, vol. 8632, pp. 198–209.
[58] T. Wirtz and R. Ge, “Improving MapReduce Energy Efficiency for Computation Intensive Workloads,” in
Proceedings of the 2011 International Green Computing Conference and Workshops, ser. IGCC ’11. Washington,
DC, USA: IEEE Computer Society, 2011, pp. 1–8.
[59] L. Mashayekhy, M. Nejad, D. Grosu, Q. Zhang, and W. Shi, “Energy- Aware Scheduling of MapReduce
Jobs for Big Data Applications,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 10, pp.
2720–2733, Oct 2015.
[60] S. Tang, B.-S. Lee, and B. He, “DynamicMR: A Dynamic Slot Allocation Optimization Framework for
MapReduce Clusters,” Cloud Computing, IEEE Transactions on, vol. 2, no. 3, pp. 333–347, July 2014.
[61] Q. Zhang, M. Zhani, Y. Yang, R. Boutaba, and B. Wong, “PRISM: Fine-Grained Resource-Aware
Scheduling for MapReduce,” Cloud Computing, IEEE Transactions on, vol. 3, no. 2, pp. 182–194, April 2015.
[62] Y. Yao, J. Wang, B. Sheng, J. Lin, and N. Mi, “HaSTE: Hadoop YARN Scheduling Based on Task-
Dependency and Resource-Demand,” in 2014 IEEE 7th International Conference on Cloud Computing, June
2014, pp. 184–191.
[63] P. Li, L. Ju, Z. Jia, and Z. Sun, “SLA-Aware Energy-Efficient Scheduling Scheme for Hadoop YARN,” in
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on
Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and
Systems (ICESS), 2015 IEEE 17th International Conference on, Aug 2015, pp. 623–628.
[64] P. Bellavista, A. Corradi, A. Reale, and N. Ticca, “Priority-Based Resource Scheduling in Distributed Stream
Processing Systems for Big Data Applications,” in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th
International Conference on, Dec 2014, pp. 363–370.
[65] K. Xiong and Y. He, “Power-efficient resource allocation in MapReduce clusters,” in Integrated Network
Management (IM 2013), 2013 IFIP/IEEE International Symposium on, May 2013, pp. 603–608.
[66] D. Cheng, J. Rao, C. Jiang, and X. Zhou, “Resource and Deadline- Aware Job Scheduling in Dynamic
Hadoop Clusters,” in Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International,
May 2015, pp. 956–965.
[67] Z. Ren, J. Wan, W. Shi, X. Xu, and M. Zhou, “Workload Analysis, Implications, and Optimization on a
Production Hadoop Cluster: A Case Study on Taobao,” IEEE Transactions on Services Computing,
vol. 7, no. 2, pp. 307–321, April 2014.
[68] C. Chen, J. Lin, and S. Kuo, “MapReduce Scheduling for Deadline- Constrained Jobs in Heterogeneous
Cloud Computing Systems,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[69] A. Verma, L. Cherkasova, and R. H. Campbell, “ARIA: Automatic Resource Inference and Allocation for
Mapreduce Environments,” in Proceedings of the 8th ACM International Conference on Autonomic Computing,
ser. ICAC ’11. New York, NY, USA: ACM, 2011, pp. 235–244.
[70] X. Dai and B. Bensaou, “Scheduling for response time in Hadoop MapReduce,” in 2016 IEEE International
Conference on Communications (ICC), May 2016, pp. 1–6.
[71] H. Chang, M. Kodialam, R. R. Kompella, T. V. Lakshman, M. Lee, and S. Mukherjee, “Scheduling in
mapreduce-like systems for fast completion time,” in INFOCOM, 2011 Proceedings IEEE, April 2011,
pp. 3074–3082.
[72] S. Tang, B. Lee, and B. He, “Dynamic Job Ordering and Slot Configurations for MapReduce Workloads,”
Services Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[73] T. Li, G. Yu, X. Liu, and J. Song, “Analyzing the Waiting Energy Consumption of NoSQL Databases,” in
Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on, Aug 2014,
pp. 277–282.
[74] S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou, “Re-optimizing Data-parallel
Computing,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation,
ser. NSDI’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 21–21.
[75] H. Wang, H. Chen, Z. Du, and F. Hu, “BeTL: MapReduce Checkpoint Tactics Beneath the Task Level,”
Services Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[76] B. Ghit and D. Epema, “Reducing Job Slowdown Variability for Data-Intensive Workloads,” in Modeling,
Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd
International Symposium on, Oct 2015, pp. 61–70.
[77] S. M. Nabavinejad, M. Goudarzi, and S. Mozaffari, “The Memory Challenge in Reduce Phase of MapReduce
Applications,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2016.
[78] X. Shi, M. Chen, L. He, X. Xie, L. Lu, H. Jin, Y. Chen, and S. Wu, “Mammoth: Gearing Hadoop Towards
Memory-Intensive MapReduce Applications,” Parallel and Distributed Systems, IEEE Transactions on,
vol. 26, no. 8, pp. 2300–2315, Aug 2015.
[79] Y. Kwon, M. Balazinska, B. Howe, and J. Rolia, “SkewTune: Mitigating Skew in Mapreduce Applications,”
in Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD ’12.
New York, NY, USA: ACM, 2012, pp. 25–36.
[80] Apache Hadoop: Gridmix. (Cited on 2018, Jan). [Online]. Available:
https://hadoop.apache.org/docs/current/hadoop-gridmix/GridMix.html
[81] S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang, “The HiBench benchmark suite: Characterization of the
MapReduce-based data analysis,” in Data Engineering Workshops (ICDEW), 2010 IEEE 26th International
Conference on, March 2010, pp. 41–51.
[82] V. A. Saletore, K. Krishnan, V. Viswanathan, and M. E. Tolentino, “HcBench: Methodology, development,
and characterization of a customer usage representative big data/Hadoop benchmark,” in 2013 IEEE International
Symposium on Workload Characterization (IISWC), Sept 2013, pp. 77–86.
[83] F. Ahmad, S. Lee, M. Thottethodi, and T. Vijaykumar, “Puma: Purdue mapreduce benchmarks suite,” 2012.
[84] Hive performance benchmarks. (Cited on 2018, Jan). [Online]. Available:
https://issues.apache.org/jira/browse/HIVE-396
[85] Apache Hadoop: Pigmix. (Cited on 2018, Jan). [Online]. Available:
https://cwiki.apache.org/confluence/display/PIG/PigMix
[86] A. Ghazal, T. Rabl, M. Hu, F. Raab, M. Poess, A. Crolotte, and H.-A. Jacobsen, “BigBench: Towards an
Industry Standard Benchmark for Big Data Analytics,” in Proceedings of the 2013 ACM SIGMOD International
Conference on Management of Data, ser. SIGMOD ’13. New York, NY, USA: ACM, 2013, pp. 1197–1208.
[87] Transaction Processing Performance Council (TPC-H). (Cited on 2018, Jan). [Online]. Available:
http://www.tpc.org/tpch/
[88] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking Cloud Serving Systems
with YCSB,” in Proceedings of the 1st ACM Symposium on Cloud Computing, ser. SoCC ’10. New York, NY,
USA: ACM, 2010, pp. 143–154.
[89] S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, “An Analysis of Traces from a Production MapReduce
Cluster,” in Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid
Computing, ser. CCGRID ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 94–103.
[90] Y. Chen, A. Ganapathi, R. Griffith, and R. Katz, “The Case for Evaluating MapReduce Performance Using
Workload Suites,” in Modeling, Analysis Simulation of Computer and Telecommunication Systems
(MASCOTS), 2011 IEEE 19th International Symposium on, July 2011, pp. 390–399.
[91] Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, “Energy Efficiency for Large-scale MapReduce Workloads
with Significant Interactive Analysis,” in Proceedings of the 7th ACM European Conference on
Computer Systems, ser. EuroSys ’12. New York, NY, USA: ACM, 2012, pp. 43–56.
[92] Y. Chen, S. Alspaugh, and R. H. Katz, “Design Insights for MapReduce from Diverse Production
Workloads,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2012-17, Jan 2012.
[93] A. K. Mishra, J. L. Hellerstein, W. Cirne, and C. R. Das, “Towards characterizing cloud backend workloads:
Insights from google compute clusters,” SIGMETRICS Perform. Eval. Rev., vol. 37, no. 4, pp. 34–41, Mar. 2010.
[94] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch, “Heterogeneity and Dynamicity of
Clouds at Scale: Google Trace Analysis,” in Proceedings of the Third ACM Symposium on Cloud Computing,
ser. SoCC ’12. New York, NY, USA: ACM, 2012, pp. 7:1–7:13.
[95] R. Birke, L. Y. Chen, and E. Smirni, “Multi-resource characterization and their (in)dependencies in
production datacenters,” in 2014 IEEE Network Operations and Management Symposium (NOMS), May 2014,
pp. 1–6.
[96] K. Ren, Y. Kwon, M. Balazinska, and B. Howe, “Hadoop’s Adolescence: An Analysis of Hadoop Usage in
Scientific Workloads,” Proc. VLDB Endow., vol. 6, no. 10, pp. 853–864, Aug. 2013.
[97] S. Shen, V. v. Beek, and A. Iosup, “Statistical Characterization of Business-Critical Workloads Hosted in
Cloud Datacenters,” in Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International
Symposium on, May 2015, pp. 465–474.
[98] Y. Chen, A. S. Ganapathi, A. Fox, R. H. Katz, and D. A. Patterson, “Statistical Workloads for Energy
Efficient MapReduce,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS- 2010-6,
Jan 2010.
[99] H. Yang, Z. Luan, W. Li, and D. Qian, “Mapreduce workload modelling with statistical approach,” Journal
of Grid Computing, vol. 10, no. 2, pp. 279–310, Jun 2012.
[100] Z. Jia, J. Zhan, L. Wang, C. Luo, W. Gao, Y. Jin, R. Han, and L. Zhang, “Understanding Big Data Analytics
Workloads on Modern Processors,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp.
1–1, 2016.
[101] J. Deng, G. Tyson, F. Cuadrado, and S. Uhlig, “Keddah: Capturing Hadoop Network Behaviour,” in 2017
IEEE 37th International Conference on Distributed Computing Systems (ICDCS), June 2017, pp. 2143–2150.
[102] H. Herodotou and S. Babu, “Profiling, What-if Analysis, and Cost-based Optimization of MapReduce
Programs.” PVLDB, vol. 4, no. 11, pp. 1111–1122, 2011.
[103] H. Herodotou, H. Lim, G. Luo, N. Borisov, L. Dong, F. B. Cetin, and S. Babu, “Starfish: A Self-tuning
System for Big Data Analytics,” in In CIDR, 2011, pp. 261–272.
[104] M. Khan, Y. Jin, M. Li, Y. Xiang, and C. Jiang, “Hadoop Performance Modeling for Job Estimation and
Resource Provisioning,” Parallel and Distributed Systems, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[105] N. B. Rizvandi, J. Taheri, R. Moraveji, and A. Y. Zomaya, “On Modelling and Prediction of Total CPU
Usage for Applications in Mapreduce Environments,” in Proceedings of the 12th International Conference on
Algorithms and Architectures for Parallel Processing - Volume Part I, ser. ICA3PP’12. Berlin, Heidelberg:
Springer-Verlag, 2012, pp. 414–427.
[106] Apache Hadoop MapReduce Mumak: Map-Reduce simulator. (Cited on 2016, Dec). [Online]. Available:
https://issues.apache.org/jira/browse/MAPREDUCE-728l
[107] A. Verma, L. Cherkasova, and R. H. Campbell, “Play It Again, SimMR!” in 2011 IEEE International
Conference on Cluster Computing, Sept 2011, pp. 253–261.
[108] S. Hammoud, M. Li, Y. Liu, N. K. Alham, and Z. Liu, “MRSim: A discrete event based MapReduce
simulator,” in 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery, vol. 6, Aug
2010, pp. 2993–2997.
[109] J. Jung and H. Kim, “MR-CloudSim: Designing and implementing MapReduce computing model on
CloudSim,” in 2012 International Conference on ICT Convergence (ICTC), Oct 2012, pp. 504–509.
[110] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya, “CloudSim: A Toolkit for
Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning
Algorithms,” Softw. Pract. Exper., vol. 41, no. 1, pp. 23–50, Jan. 2011.
[111] G. Wang, A. R. Butt, P. Pandey, and K. Gupta, “Using Realistic Simulation for Performance Analysis of
Mapreduce Setups,” in Proceedings of the 1st ACM Workshop on Large-Scale System and Application
Performance, ser. LSAP ’09. New York, NY, USA: ACM, 2009, pp. 19–26.
[112] M. V. Neves, C. A. F. D. Rose, and K. Katrinis, “MRemu: An Emulation-Based Framework for Datacenter
Network Experimentation Using Realistic MapReduce Traffic,” in Modeling, Analysis and Simulation of
Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd International Symposium on, Oct
2015, pp. 174–177.
[113] Z. Bian, K. Wang, Z. Wang, G. Munce, I. Cremer, W. Zhou, Q. Chen, and G. Xu, “Simulating Big Data
Clusters for System Planning, Evaluation, and Optimization,” in 2014 43rd International Conference on Parallel
Processing, Sept 2014, pp. 391–400.
[114] J. Liu, B. Bian, and S. S. Sury, “Planning Your SQL-on-Hadoop Deployment Using a Low-Cost Simulation-
Based Approach,” in 2016 28th International Symposium on Computer Architecture and High Performance
Computing (SBAC-PAD), Oct 2016, pp. 182–189.
[115] K. Wang, Z. Bian, Q. Chen, R. Wang, and G. Xu, “Simulating Hive Cluster for Deployment Planning,
Evaluation and Optimization,” in 2014 IEEE 6th International Conference on Cloud Computing Technology and
Science, Dec 2014, pp. 475–482.
[116] Yarn Scheduler Load Simulator (SLS). (Cited on 2018, Jan). [Online]. Available:
https://hadoop.apache.org/docs/r2.4.1/hadoop-sls/SchedulerLoadSimulator.html
[117] P. Wette, A. Schwabe, M. Splietker, and H. Karl, “Extending Hadoop’s Yarn Scheduler Load Simulator
with a highly realistic network & traffic model,” in Network Softwarization (NetSoft), 2015 1st IEEE
Conference on, April 2015, pp. 1–2.
[118] J.-C. Lin, I. C. Yu, E. B. Johnsen, and M.-C. Lee, “ABS-YARN: A Formal Framework for Modeling
Hadoop YARN Clusters,” in Proceedings of the 19th International Conference on Fundamental Approaches to
Software Engineering - Volume 9633. New York, NY, USA: Springer-Verlag New York, Inc., 2016, pp. 49–65.
[119] N. Liu, X. Yang, X. H. Sun, J. Jenkins, and R. Ross, “YARNsim: Simulating Hadoop YARN,” in Cluster,
Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on, May 2015, pp. 637–
646.
[120] X. Xu, M. Tang, and Y. C. Tian, “Theoretical Results of QoS-Guaranteed Resource Scaling for Cloud-based
MapReduce,” IEEE Transactions on Cloud Computing, vol. PP, no. 99, pp. 1–1, 2016.
[121] M. Mattess, R. N. Calheiros, and R. Buyya, “Scaling MapReduce Applications Across Hybrid Clouds to
Meet Soft Deadlines,” in 2013 IEEE 27th International Conference on Advanced Information Networking and
Applications (AINA), March 2013, pp. 629–636.
[122] C.-W. Tsai, W.-C. Huang, M.-H. Chiang, M.-C. Chiang, and C.-S. Yang, “A Hyper-Heuristic Scheduling
Algorithm for Cloud,” IEEE Transactions on Cloud Computing, vol. 2, no. 2, pp. 236–250, April 2014.
[123] N. Lim, S. Majumdar, and P. Ashwood-Smith, “MRCP-RM: a Technique for Resource Allocation and
Scheduling of MapReduce Jobs with Deadlines,” IEEE Transactions on Parallel and Distributed Systems, vol. PP,
no. 99, pp. 1–1, 2016.
[124] K. Chen, J. Powers, S. Guo, and F. Tian, “CRESP: Towards Optimal Resource Provisioning for MapReduce
Computing in Public Clouds,” Parallel and Distributed Systems, IEEE Transactions on, vol. 25, no. 6, pp. 1403–
1412, June 2014.
[125] L. Sharifi, L. Cerdà-Alabern, F. Freitag, and L. Veiga, “Energy efficient cloud service provisioning:
Keeping data center granularity in perspective,” Journal of Grid Computing, vol. 14, no. 2, pp. 299–325, 2016.
[126] M. Zhang, R. Ranjan, M. Menzel, S. Nepal, P. Strazdins, W. Jie, and L. Wang, “An Infrastructure Service
Recommendation System for Cloud Applications with Real-time QoS Requirement Constraints,” IEEE Systems
Journal, vol. PP, no. 99, pp. 1–11, 2015.
[127] V. Jalaparti, H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, “Bridging the Tenant-provider Gap in
Cloud Services,” in Proceedings of the Third ACM Symposium on Cloud Computing, ser. SoCC ’12. New York,
NY, USA: ACM, 2012, pp. 10:1–10:14.
[128] D. Xie, Y. C. Hu, and R. R. Kompella, “On the performance projectability of MapReduce,” in Cloud
Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on, Dec 2012, pp.
301–308.
[129] C. Delimitrou and C. Kozyrakis, “The Netflix Challenge: Datacenter Edition,” IEEE Computer Architecture
Letters, vol. 12, no. 1, pp. 29– 32, January 2013.
[130] M. Jammal, A. Kanso, and A. Shami, “High availability-aware optimization digest for applications
deployment in cloud,” in Communications (ICC), 2015 IEEE International Conference on, June 2015, pp. 6822–
6828.
[131] J. Lee, Y. Turner, M. Lee, L. Popa, S. Banerjee, J.-M. Kang, and P. Sharma, “Application-driven Bandwidth
Guarantees in Datacenters,” SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 467–478, Aug. 2014.
[132] J. Guo, F. Liu, J. C. S. Lui, and H. Jin, “Fair Network Bandwidth Allocation in IaaS Datacenters via a
Cooperative Game Approach,” IEEE/ACM Transactions on Networking, vol. 24, no. 2, pp. 873–886, April 2016.
[133] L. Popa, P. Yalagandula, S. Banerjee, J. C. Mogul, Y. Turner, and J. R. Santos, “Elasticswitch: Practical
work-conserving bandwidth guarantees for cloud computing,” SIGCOMM Comput. Commun. Rev., vol. 43, no.
4, pp. 351–362, Aug. 2013.
[134] V. Jeyakumar, M. Alizadeh, D. Mazières, B. Prabhakar, C. Kim, and A. Greenberg, “EyeQ: Practical
Network Performance Isolation at the Edge,” in Proceedings of the 10th USENIX Conference on Networked
Systems Design and Implementation, ser. nsdi’13. Berkeley, CA, USA: USENIX Association, 2013, pp. 297–
312.
[135] A. Antoniadis, Y. Gerbessiotis, M. Roussopoulos, and A. Delis, “Tossing NoSQL-Databases Out to Public
Clouds,” in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on, Dec 2014,
pp. 223–232.
[136] T. Z. J. Fu, J. Ding, R. T. B. Ma, M. Winslett, Y. Yang, and Z. Zhang, “DRS: Dynamic Resource Scheduling
for Real-Time Analytics over Fast Streams,” in Distributed Computing Systems (ICDCS), 2015 IEEE 35th
International Conference on, June 2015, pp. 411–420.
[137] L. Chen, S. Liu, B. Li, and B. Li, “Scheduling jobs across geodistributed datacenters with max-min fairness,”
in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, May 2017, pp. 1–9.
[138] S. Tang, B. S. Lee, and B. He, “Fair Resource Allocation for Data- Intensive Computing in the Cloud,”
IEEE Transactions on Services Computing, vol. PP, no. 99, pp. 1–1, 2016.
[139] H. Won, M. C. Nguyen, M. S. Gil, and Y. S. Moon, “Advanced resource management with access control
for multitenant Hadoop,” Journal of Communications and Networks, vol. 17, no. 6, pp. 592–601, Dec 2015.
[140] H. Herodotou, F. Dong, and S. Babu, “No One (Cluster) Size Fits All: Automatic Cluster Sizing for Data-
intensive Analytics,” in Proceedings of the 2Nd ACM Symposium on Cloud Computing, ser. SOCC ’11. New
York, NY, USA: ACM, 2011, pp. 18:1–18:14.
[141] B. Palanisamy, A. Singh, and L. Liu, “Cost-Effective Resource Provisioning for MapReduce in a Cloud,”
Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 5, pp. 1265–1279, May 2015.
[142] B. Sharma, T. Wood, and C. R. Das, “HybridMR: A Hierarchical MapReduce Scheduler for Hybrid Data
Centers,” in Proceedings of the 2013 IEEE 33rd International Conference on Distributed Computing Systems, ser.
ICDCS ’13. Washington, DC, USA: IEEE Computer Society, 2013, pp. 102–111.
[143] Y. Zhang, X. Fu, and K. K. Ramakrishnan, “Fine-grained multiresource scheduling in cloud datacenters,”
in Local Metropolitan Area Networks (LANMAN), 2014 IEEE 20th International Workshop on, May 2014, pp.
1–6.
[144] X. Zhu, L. Yang, H. Chen, J. Wang, S. Yin, and X. Liu, “Real- Time Tasks Oriented Energy-Aware
Scheduling in Virtualized Clouds,” Cloud Computing, IEEE Transactions on, vol. 2, no. 2, pp. 168–180, April
2014.
[145] Z. Li, J. Ge, H. Hu, W. Song, H. Hu, and B. Luo, “Cost and Energy Aware Scheduling Algorithm for
Scientific Workflows with Deadline Constraint in Clouds,” Services Computing, IEEE Transactions on, vol. PP,
no. 99, pp. 1–1, 2015.
[146] M. Cardosa, A. Singh, H. Pucha, and A. Chandra, “Exploiting Spatio- Temporal Tradeoffs for Energy-
Aware MapReduce in the Cloud,” Computers, IEEE Transactions on, vol. 61, no. 12, pp. 1737–1751, Dec 2012.
[147] F. Teng, D. Deng, L. Yu, and F. Magoulès, “An Energy-Efficient VM Placement in Cloud Datacenter,” in
High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and
Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), 2014 IEEE Intl Conf
on, Aug 2014, pp. 173–180.
[148] B. Palanisamy, A. Singh, L. Liu, and B. Jain, “Purlieus: Locality-aware Resource Allocation for MapReduce
in a Cloud,” in Proceedings of 2011 International Conference for High Performance Computing, Networking,
Storage and Analysis, ser. SC ’11. New York, NY, USA: ACM, 2011, pp. 58:1–58:11.
[149] J. Park, D. Lee, B. Kim, J. Huh, and S. Maeng, “Locality-aware Dynamic VM Reconfiguration on
MapReduce Clouds,” in Proceedings of the 21st International Symposium on High-Performance Parallel and
Distributed Computing, ser. HPDC ’12. New York, NY, USA: ACM, 2012, pp. 27–36.
[150] X. Bu, J. Rao, and C.-z. Xu, “Interference and Locality-aware Task Scheduling for MapReduce Applications
in Virtual Clusters,” in Proceedings of the 22Nd International Symposium on High-performance Parallel and
Distributed Computing, ser. HPDC ’13. New York, NY, USA: ACM, 2013, pp. 227–238.
[151] M. Li, D. Subhraveti, A. R. Butt, A. Khasymski, and P. Sarkar, “CAM: A Topology Aware Minimum Cost
Flow Based Resource Manager for MapReduce Applications in the Cloud,” in Proceedings of the 21st
International Symposium on High-Performance Parallel and Distributed Computing, ser. HPDC ’12. New York,
NY, USA: ACM, 2012, pp. 211–222.
[152] V. van Beek, J. Donkervliet, T. Hegeman, S. Hugtenburg, and A. Iosup, “Self-Expressive Management of
Business-Critical Workloads in Virtualized Datacenters,” Computer, vol. 48, no. 7, pp. 46–54, July 2015.
[153] D. Tsoumakos, I. Konstantinou, C. Boumpouka, S. Sioutas, and N. Koziris, “Automated, Elastic Resource
Provisioning for NoSQL Clusters Using TIRAMOLA,” in Cluster, Cloud and Grid Computing (CCGrid), 2013
13th IEEE/ACM International Symposium on, May 2013, pp. 34–41.
[154] H. Kang, Y. Chen, J. L. Wong, R. Sion, and J. Wu, “Enhancement of Xen’s Scheduler for MapReduce
Workloads,” in Proceedings of the 20th International Symposium on High Performance Distributed Computing,
ser. HPDC ’11. New York, NY, USA: ACM, 2011, pp. 251–262.
[155] B. M. Ko, J. Lee, and H. Jo, “Toward Enhancing Block I/O Performance for Virtualized Hadoop Cluster,”
in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on, Dec 2014, pp. 481–
482.
[156] Y. Yu, H. Zou, W. Tang, L. Liu, and F. Teng, “Flex Tuner: A Flexible Container-Based Tuning System for
Cloud Applications,” in Cloud Engineering (IC2E), 2015 IEEE International Conference on, March 2015, pp.
145–154.
[157] R. Zhang, M. Li, and D. Hildebrand, “Finding the Big Data Sweet Spot: Towards Automatically
Recommending Configurations for Hadoop Clusters on Docker Containers,” in Cloud Engineering (IC2E), 2015
IEEE International Conference on, March 2015, pp. 365–368.
[158] Y. Kang and R. Y. C. Kim, “Twister Platform for MapReduce Applications on a Docker Container,” in
2016 International Conference on Platform Technology and Service (PlatCon), Feb 2016, pp. 1–3.
[159] C. Rista, D. Griebler, C. A. F. Maron, and L. G. Fernandes, “Improving the Network Performance of a
Container-Based Cloud Environment for Hadoop Systems,” in 2017 International Conference on High
Performance Computing Simulation (HPCS), July 2017, pp. 619–626.
[160] L. Yazdanov, M. Gorbunov, and C. Fetzer, “EHadoop: Network I/O Aware Scheduler for Elastic
MapReduce Cluster,” in 2015 IEEE 8th International Conference on Cloud Computing, June 2015, pp. 821– 828.
[161] N. Laoutaris, M. Sirivianos, X. Yang, and P. Rodriguez, “Interdatacenter Bulk Transfers with Netstitcher,”
SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 74–85, Aug. 2011.
[162] Y. Feng, B. Li, and B. Li, “Postcard: Minimizing Costs on Inter- Datacenter Traffic with Store-and-
Forward,” in Distributed Computing Systems Workshops (ICDCSW), 2012 32nd International Conference on,
June 2012, pp. 43–50.
[163] P. Lu, K. Wu, Q. Sun, and Z. Zhu, “Toward online profit-driven scheduling of inter-DC data-transfers for
cloud applications,” in Communications (ICC), 2015 IEEE International Conference on, June 2015, pp. 5583–
5588.
[164] J. Garcia-Dorado and S. Rao, “Cost-aware Multi Data-Center Bulk Transfers in the Cloud from a Customer-
Side Perspective,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[165] C. Wu, C. Ku, J. Ho, and M. Chen, “A Novel Pipeline Approach for Efficient Big Data Broadcasting,”
Knowledge and Data Engineering, IEEE Transactions on, vol. 28, no. 1, pp. 17–28, Jan 2016.
[166] J. Yao, P. Lu, L. Gong, and Z. Zhu, “On Fast and Coordinated Data Backup in Geo-Distributed Optical
Inter-Datacenter Networks,” Journal of Lightwave Technology, vol. 33, no. 14, pp. 3005–3015, July 2015.
[167] P. Lu, L. Zhang, X. Liu, J. Yao, and Z. Zhu, “Highly efficient data migration and backup for big data
applications in elastic optical inter-data-center networks,” Network, IEEE, vol. 29, no. 5, pp. 36–42,
September 2015.
[168] I. Alan, E. Arslan, and T. Kosar, “Energy-Aware Data Transfer Tuning,” in Cluster, Cloud and Grid
Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on, May 2014, pp. 626–634.
[169] Y. Koshiba, W. Chen, Y. Yamada, T. Tanaka, and I. Paik, “Investigation of network traffic in geo-
distributed data centers,” in Awareness Science and Technology (iCAST), 2015 IEEE 7th International
Conference on, Sept 2015, pp. 174–179.
[170] L. Zhang, C. Wu, Z. Li, C. Guo, M. Chen, and F. Lau, “Moving Big Data to The Cloud: An Online Cost-
Minimizing Approach,” Selected Areas in Communications, IEEE Journal on, vol. 31, no. 12, pp. 2710– 2721,
December 2013.
[171] P. Li, S. Guo, S. Yu, and W. Zhuang, “Cross-cloud MapReduce for Big Data,” Cloud Computing, IEEE
Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[172] P. Li, S. Guo, T. Miyazaki, X. Liao, H. Jin, A. Y. Zomaya, and K. Wang, “Traffic-Aware Geo-Distributed
Big Data Analytics with Predictable Job Completion Time,” IEEE Transactions on Parallel and Distributed
Systems, vol. 28, no. 6, pp. 1785–1796, June 2017.
[173] A. M. Al-Salim, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient Big Data
Networks: Impact of Volume and Variety,” IEEE Transactions on Network and Service Management, vol. PP,
no. 99, pp. 1–1, 2017.
[174] A. M. Al-Salim, T. E. El-Gorashi, A. Q. Lawey, and J. M. Elmirghani, “Greening big data networks: velocity
impact,” IET Optoelectronics, November 2017.
[175] C. Joe-Wong, I. Kamitsos, and S. Ha, “Interdatacenter Job Routing and Scheduling With Variable Costs
and Deadlines,” IEEE Transactions on Smart Grid, vol. 6, no. 6, pp. 2669–2680, Nov 2015.
[176] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, “Power Cost Reduction in Distributed Data
Centers: A Two-Time-Scale Approach for Delay Tolerant Workloads,” Parallel and Distributed Systems, IEEE
Transactions on, vol. 25, no. 1, pp. 200–211, Jan 2014.
[177] C. Jayalath, J. Stephen, and P. Eugster, “From the Cloud to the Atmosphere: Running MapReduce across
Data Centers,” Computers, IEEE Transactions on, vol. 63, no. 1, pp. 74–87, Jan 2014.
[178] Q. Zhang, L. Liu, K. Lee, Y. Zhou, A. Singh, N. Mandagere, S. Gopisetty, and G. Alatorre, “Improving
Hadoop Service Provisioning in a Geographically Distributed Cloud,” in Cloud Computing (CLOUD), 2014 IEEE
7th International Conference on, June 2014, pp. 432–439.
[179] Y. Li, L. Zhao, C. Cui, and C. Yu, “Fast Big Data Analysis in Geo- Distributed Cloud,” in 2016 IEEE
International Conference on Cluster Computing (CLUSTER), Sept 2016, pp. 388–391.
[180] F. J. Clemente-Castelló, B. Nicolae, R. Mayo, and J. C. Fernández, “Performance Model of MapReduce
Iterative Applications for Hybrid Cloud Bursting,” IEEE Transactions on Parallel and Distributed Systems, vol.
29, no. 8, pp. 1794–1807, Aug 2018.
[181] S. Kailasam, P. Dhawalia, S. J. Balaji, G. Iyer, and J. Dharanipragada, “Extending MapReduce across
Clouds with BStream,” IEEE Transactions on Cloud Computing, vol. 2, no. 3, pp. 362–376, July 2014.
[182] R. Tudoran, G. Antoniu, and L. Bougé, “SAGE: Geo-Distributed Streaming Data Analysis in Clouds,” in
2013 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum, May 2013,
pp. 2278–2281.
[183] A. Rabkin, M. Arye, S. Sen, V. S. Pai, and M. J. Freedman, “Aggregation and Degradation in JetStream:
Streaming Analytics in the Wide Area,” in Proceedings of the 11th USENIX Conference on Networked Systems
Design and Implementation, ser. NSDI’14. Berkeley, CA, USA: USENIX Association, 2014, pp. 275–288.
[184] L. Gu, D. Zeng, S. Guo, Y. Xiang, and J. Hu, “A General Communication Cost Optimization Framework
for Big Data Stream Processing in Geo-Distributed Data Centers,” Computers, IEEE Transactions on, vol. 65, no.
1, pp. 19–29, Jan 2016.
[185] W. Chen, I. Paik, and Z. Li, “Cost-Aware Streaming Workflow Allocation on Geo-Distributed Data
Centers,” IEEE Transactions on Computers, vol. 66, no. 2, pp. 256–271, Feb 2017.
[186] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica, “Low Latency Geo-
distributed Data Analytics,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 421–434, Aug. 2015.
[187] A. C. Zhou, S. Ibrahim, and B. He, “On Achieving Efficient Data Transfer for Graph Processing in Geo-
Distributed Datacenters,” in 2017 IEEE 37th International Conference on Distributed Computing Systems
(ICDCS), June 2017, pp. 1397–1407.
[188] S. Das, Y. Yiakoumis, G. Parulkar, N. McKeown, P. Singh, D. Getachew, and P. D. Desai, “Application-
aware aggregation and traffic engineering in a converged packet-circuit network,” in Optical Fiber
Communication Conference and Exposition (OFC/NFOEC), 2011 and the National Fiber Optic Engineers
Conference, March 2011, pp. 1–3.
[189] V. Lopez, J. M. Gran, J. P. Fernandez-Palacios, D. Siracusa, F. Pederzolli, O. Gerstel, Y. Shikhmanter, J.
Mårtensson, P. Sköldström, T. Szyrkowiec, M. Chamania, A. Autenrieth, I. Tomkos, and D. Klonidis, “The role
of SDN in application centric IP and optical networks,” in 2016 European Conference on Networks and
Communications (EuCNC), June 2016, pp. 138–142.
[190] Y. Demchenko, P. Grosso, C. de Laat, S. Filiposka, and M. de Vos, “Zerotouch provisioning (ZTP) model
and infrastructure components for multi-provider cloud services provisioning,” CoRR, vol. abs/1611.02758, 2016.
[191] S. Wang, X. Zhang, W. Hou, X. Yang, and L. Guo, “SDNyquist platform for big data transmission,” in
2016 15th International Conference on Optical Communications and Networks (ICOCN), Sept 2016, pp. 1– 3.
[192] S. Narayan, S. Bailey, A. Daga, M. Greenway, R. Grossman, A. Heath, and R. Powell, “OpenFlow Enabled
Hadoop over Local and Wide Area Clusters,” in 2012 SC Companion: High Performance Computing, Networking
Storage and Analysis, Nov 2012, pp. 1625–1628.
[193] Z. Yu, M. Li, X. Yang, and X. Li, “Palantir: Reseizing Network Proximity in Large-Scale Distributed
Computing Frameworks Using SDN,” in Cloud Computing (CLOUD), 2014 IEEE 7th International Conference
on, June 2014, pp. 440–447.
[194] X. Yang and T. Lehman, “Model Driven Advanced Hybrid Cloud Services for Big Data: Paradigm and
Practice,” in 2016 Seventh International Workshop on Data-Intensive Computing in the Clouds (DataCloud), Nov
2016, pp. 32–36.
[195] A. Sadasivarao, S. Syed, P. Pan, C. Liou, I. Monga, C. Guok, and A. Lake, “Bursting Data between Data
Centers: Case for Transport SDN,” in High-Performance Interconnects (HOTI), 2013 IEEE 21st Annual
Symposium on, Aug 2013, pp. 87–90.
[196] W. Lu and Z. Zhu, “Malleable Reservation Based Bulk-Data Transfer to Recycle Spectrum Fragments in
Elastic Optical Networks,” Journal of Lightwave Technology, vol. 33, no. 10, pp. 2078–2086, May 2015.
[197] Y. Wu, Z. Zhang, C. Wu, C. Guo, Z. Li, and F. Lau, “Orchestrating Bulk Data Transfers across Geo-
Distributed Datacenters,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[198] X. Jin, Y. Li, D. Wei, S. Li, J. Gao, L. Xu, G. Li, W. Xu, and J. Rexford, “Optimizing bulk transfers with
software-defined optical wan,” in Proceedings of the 2016 ACM SIGCOMM Conference, ser. SIGCOMM ’16.
New York, NY, USA: ACM, 2016, pp. 87–100.
[199] A. Asensio and L. Velasco, “Managing transfer-based datacenter connections,” IEEE/OSA Journal of
Optical Communications and Networking, vol. 6, no. 7, pp. 660–669, July 2014.
[200] M. Femminella, G. Reali, and D. Valocchi, “Genome centric networking: A network function virtualization
solution for genomic applications,” in 2017 IEEE Conference on Network Softwarization (NetSoft), July 2017,
pp. 1–9.
[201] L. Gu, S. Tao, D. Zeng, and H. Jin, “Communication cost efficient virtualized network function placement
for big data processing,” in 2016 IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS), April 2016, pp. 604–609.
[202] L. Gifre, M. Ruiz, and L. Velasco, “Experimental assessment of Big Data-backed video distribution in the
telecom cloud,” in 2017 19th International Conference on Transparent Optical Networks (ICTON), July 2017, pp.
1–4.
[203] B. García, M. Gallego, L. López, G. A. Carella, and A. Cheambe, “NUBOMEDIA: An Elastic PaaS
Enabling the Convergence of Real-Time and Big Data Multimedia,” in 2016 IEEE International Conference on
Smart Cloud (SmartCloud), Nov 2016, pp. 45–56.
[204] J. Han, M. Ishii, and H. Makino, “A Hadoop performance model for multi-rack clusters,” in Computer
Science and Information Technology (CSIT), 2013 5th International Conference on, March 2013, pp. 265–274.
[205] G. Wang, A. R. Butt, P. Pandey, and K. Gupta, “A simulation approach to evaluating design decisions in
MapReduce setups,” in Modeling, Analysis Simulation of Computer and Telecommunication Systems, 2009.
MASCOTS ’09. IEEE International Symposium on, Sept 2009, pp. 1–11.
[206] Z. Kouba, O. Tomanek, and L. Kencl, “Evaluation of Datacenter Network Topology Influence on Hadoop
MapReduce Performance,” in 2016 5th IEEE International Conference on Cloud Networking (Cloudnet), Oct
2016, pp. 95–100.
[207] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On the energy efficiency of MapReduce
shuffling operations in data centers,” in 2017 19th International Conference on Transparent Optical Networks
(ICTON), July 2017, pp. 1–5.
[208] Y. Shang, D. Li, J. Zhu, and M. Xu, “On the Network Power Effectiveness of Data Center Architectures,”
Computers, IEEE Transactions on, vol. 64, no. 11, pp. 3237–3248, Nov 2015.
[209] M. Alizadeh and T. Edsall, “On the Data Path Performance of Leaf-Spine Datacenter Fabrics,” in High-
Performance Interconnects (HOTI), 2013 IEEE 21st Annual Symposium on, Aug 2013, pp. 71–74.
[210] J. Duan and Y. Yang, “FFTree: A flexible architecture for data center networks towards configurability and
cost efficiency,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), June 2017,
pp. 1–10.
[211] S. Kandula, J. Padhye, and V. Bahl, “Flyways To De-Congest Data Center Networks,” Tech. Rep., August
2009.
[212] D. Halperin, S. Kandula, J. Padhye, P. Bahl, and D. Wetherall, “Augmenting data center networks with
multi-gigabit wireless links,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 38–49, Aug. 2011.
[213] K. Suto, H. Nishiyama, N. Kato, T. Nakachi, T. Sakano, and A. Takahara, “A Failure-Tolerant and
Spectrum-Efficient Wireless Data Center Network Design for Improving Performance of Big Data Mining,” in
Vehicular Technology Conference (VTC Spring), 2015 IEEE 81st, May 2015, pp. 1–5.
[214] P. Costa, A. Donnelly, A. Rowstron, and G. O’Shea, “Camdoop: Exploiting In-network Aggregation for
Big Data Applications,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and
Implementation, ser. NSDI’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 3–3.
[215] L. Rupprecht, “Exploiting In-network Processing for Big Data Management,” in Proceedings of the 2013
SIGMOD/PODS Ph.D. Symposium, ser. SIGMOD’13 PhD Symposium. New York, NY, USA: ACM, 2013, pp.
1–6.
[216] Y. Zhang, C. Guo, R. Chu, G. Lu, Y. Xiong, and H. Wu, “RAMCube: Exploiting Network Proximity for
RAM-Based Key-Value Store,” in 4th USENIX Workshop on Hot Topics in Cloud Computing, Hot- Cloud’12,
Boston, MA, USA, June 12-13, 2012, 2012.
[217] X. Meng, V. Pappas, and L. Zhang, “Improving the Scalability of Data Center Networks with Traffic-aware
Virtual Machine Placement,” in Proceedings of the 29th Conference on Information Communications, ser.
INFOCOM’10. Piscataway, NJ, USA: IEEE Press, 2010, pp. 1154–1162.
[218] H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, “Towards predictable datacenter networks,”
SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 242–253, Aug. 2011.
[219] D. Zeng, S. Guo, H. Huang, S. Yu, and V. C. M. Leung, “Optimal VM Placement in Data Centres with
Architectural and Resource Constraints,” Int. J. Auton. Adapt. Commun. Syst., vol. 8, no. 4, pp. 392–406, Nov.
2015.
[220] Z. Wu, Y. Zhang, V. Singh, G. Jiang, and H. Wang, “Automating Cloud Network Optimization and
Evolution,” Selected Areas in Communications, IEEE Journal on, vol. 31, no. 12, pp. 2620–2631, December 2013.
[221] W. C. Moody, J. Anderson, K.-C. Wange, and A. Apon, “Reconfigurable Network Testbed for Evaluation
of Datacenter Topologies,” in Proceedings of the Sixth International Workshop on Data Intensive Distributed
Computing, ser. DIDC ’14. New York, NY, USA: ACM, 2014, pp. 11–20.
[222] G. Wang, T. E. Ng, and A. Shaikh, “Programming Your Network at Run-time for Big Data Applications,”
in Proceedings of the First Workshop on Hot Topics in Software Defined Networks, ser. HotSDN ’12. New York,
NY, USA: ACM, 2012, pp. 103–108.
[223] H. H. Bazzaz, M. Tewari, G. Wang, G. Porter, T. S. E. Ng, D. G. Andersen, M. Kaminsky, M. A. Kozuch,
and A. Vahdat, “Switching the Optical Divide: Fundamental Challenges for Hybrid Electrical/Optical Datacenter
Networks,” in Proceedings of the 2Nd ACM Symposium on Cloud Computing, ser. SOCC ’11. New York, NY,
USA: ACM, 2011, pp. 30:1–30:8.
[224] M. Channegowda, T. Vlachogiannis, R. Nejabati, and D. Simeonidou, “Optical flyways for handling
elephant flows to improve big data performance in SDN enabled Datacenters,” in 2016 Optical Fiber
Communications Conference and Exhibition (OFC), March 2016, pp. 1–3.
[225] Y. Yin, K. Kanonakis, and P. N. Ji, “Hybrid optical/electrical switching in directly connected datacenter
networks,” in Communications in China (ICCC), 2014 IEEE/CIC International Conference on, Oct 2014, pp. 102–
106.
[226] P. Samadi, V. Gupta, B. Birand, H. Wang, G. Zussman, and K. Bergman, “Accelerating Incast and Multicast
Traffic Delivery for Data-intensive Applications Using Physical Layer Optics,” SIGCOMM Comput. Commun.
Rev., vol. 44, no. 4, pp. 373–374, Aug. 2014.
[227] J. Bao, B. Zhao, D. Dong, and Z. Gong, “HERO: A Hybrid Electrical and Optical Multicast for Accelerating
High-Performance Data Center Applications,” in Proceedings of the SIGCOMM Posters and Demos, ser.
SIGCOMM Posters and Demos ’17. New York, NY, USA: ACM, 2017, pp. 17–18.
[228] S. Peng, B. Guo, C. Jackson, R. Nejabati, F. Agraz, S. Spadaro, G. Bernini, N. Ciulli, and D. Simeonidou,
“Multi-tenant softwaredefined hybrid optical switched data centre,” Lightwave Technology, Journal of, vol. 33,
no. 15, pp. 3224–3233, Aug 2015.
[229] L. Schares, X. J. Zhang, R. Wagle, D. Rajan, P. Selo, S. P. Chang, J. Giles, K. Hildrum, D. Kuchta, J. Wolf,
and E. Schenfeld, “A reconfigurable interconnect fabric with optical circuit switch and software optimizer for
stream computing systems,” in 2009 Conference on Optical Fiber Communication - incudes post deadline papers,
March 2009, pp. 1–3.
[230] X. Yu, H. Gu, K. Wang, and G. Wu, “Enhancing Performance of Cloud Computing Data Center Networks
by Hybrid Switching Architecture,” Lightwave Technology, Journal of, vol. 32, no. 10, pp. 1991–1998, May
2014.
[231] L. Y. Ho, J. J. Wu, and P. Liu, “Optimal Algorithms for Cross-Rack Communication Optimization in
MapReduce Framework,” in 2011 IEEE 4th International Conference on Cloud Computing, July 2011, pp. 420–
427.
[232] Y. Le, F. Wang, J. Liu, and F. Ergün, “On Datacenter-Network-Aware Load Balancing in MapReduce,” in
Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on, June 2015, pp. 485–492.
[233] H. Ke, P. Li, S. Guo, and M. Guo, “On Traffic-Aware Partition and Aggregation in MapReduce for Big
Data Applications,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 3, pp. 818–828, March
2016.
[234] Z. Jiang, Z. Ding, X. Gao, and G. Chen, “DCP: An efficient and distributed data center cache protocol with
Fat-Tree topology,” in Network Operations and Management Symposium (APNOMS), 2014 16th Asia-Pacific,
Sept 2014, pp. 1–4.
[235] D. Guo, J. Xie, X. Zhou, X. Zhu, W. Wei, and X. Luo, “Exploiting Efficient and Scalable Shuffle Transfers
in Future Data Center Networks,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 4, pp. 997–
1009, April 2015.
[236] E. Yildirim, E. Arslan, J. Kim, and T. Kosar, “Application-Level Optimization of Big Data Transfers
through Pipelining, Parallelism and Concurrency,” IEEE Transactions on Cloud Computing, vol. 4, no. 1,
pp. 63–75, Jan 2016.
[237] Y. Yu and C. Qian, “Space Shuffle: A Scalable, Flexible, and High-Performance Data Center Network,”
IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1, 2016.
[238] E. Zahavi, I. Keslassy, and A. Kolodny, “Distributed Adaptive Routing Convergence to Non-Blocking DCN
Routing Assignments,” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 1, pp. 88–101,
January 2014.
[239] N. Chrysos, M. Gusat, F. Neeser, C. Minkenberg, W. Denzel, and C. Basso, “High performance multipath
routing for datacenters,” in High Performance Switching and Routing (HPSR), 2014 IEEE 15th International
Conference on, July 2014, pp. 70–75.
[240] E. Dong, X. Fu, M. Xu, and Y. Yang, “DCMPTCP: Host-Based Load Balancing for Datacenters,” in 2018
IEEE 38th International Conference on Distributed Computing Systems (ICDCS), July 2018, pp. 622–633.
[241] Y. Shang, D. Li, and M. Xu, “Greening data center networks with flow preemption and energy-aware
routing,” in Local Metropolitan Area Networks (LANMAN), 2013 19th IEEE Workshop on, April 2013, pp. 1–
6.
[242] L. Wang, F. Zhang, and Z. Liu, “Improving the Network Energy Efficiency in MapReduce Systems,” in
Computer Communications and Networks (ICCCN), 2013 22nd International Conference on, July 2013, pp. 1–7.
[243] L. Wang, F. Zhang, J. Arjona Aroca, A. Vasilakos, K. Zheng, C. Hou, D. Li, and Z. Liu, “GreenDCN: A
General Framework for Achieving Energy Efficiency in Data Center Networks,” Selected Areas in
Communications, IEEE Journal on, vol. 32, no. 1, pp. 4–15, January 2014.
[244] X. Wen, K. Chen, Y. Chen, Y. Liu, Y. Xia, and C. Hu, “VirtualKnotter: Online Virtual Machine Shuffling
for Congestion Resolving in Virtualized Datacenter,” in Distributed Computing Systems (ICDCS), 2012 IEEE
32nd International Conference on, June 2012, pp. 12–21.
[245] K. C. Webb, A. C. Snoeren, and K. Yocum, “Topology Switching for Data Center Networks,” in
Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise
Networks and Services, ser. Hot-ICE’11. Berkeley, CA, USA: USENIX Association, 2011, pp. 14–14.
[246] L. Chen, Y. Feng, B. Li, and B. Li, “Towards performance-centric fairness in datacenter networks,” in
INFOCOM, 2014 Proceedings IEEE, April 2014, pp. 1599–1607.
[247] L. A. Rocha and F. L. Verdi, “MILPFlow: A toolset for integration of computational modelling and
deployment of data paths for SDN,” in Integrated Network Management (IM), 2015 IFIP/IEEE International
Symposium on, May 2015, pp. 750–753.
[248] L. W. Cheng and S. Y. Wang, “Application-Aware SDN Routing for Big Data Networking,” in 2015 IEEE Global Communications Conference (GLOBECOM), Dec 2015, pp. 1–6.
[249] S. Narayan, S. Bailey, and A. Daga, “Hadoop Acceleration in an OpenFlow-Based Cluster,” in High
Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:, Nov 2012, pp.
535–538.
[250] X. Hou, A. K. T. K, J. P. Thomas, and V. Varadharajan, “Dynamic Workload Balancing for Hadoop
MapReduce,” in Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on, Dec
2014, pp. 56–62.
[251] C. Trois, M. Martinello, L. C. E. de Bona, and M. D. Del Fabro, “From Software Defined Network to
Network Defined for Software,” in Proceedings of the 30th Annual ACM Symposium on Applied Computing,
ser. SAC ’15. New York, NY, USA: ACM, 2015, pp. 665–668.
[252] S. Zhao and D. Medhi, “Application-Aware Network Design for Hadoop MapReduce Optimization Using
Software-Defined Networking,” IEEE Transactions on Network and Service Management, vol. 14, no. 4, pp. 804–
816, Dec 2017.
[253] Z. Asad, M. Chaudhry, and D. Malone, “Greener Data Exchange in the Cloud: A Coding Based
Optimization for Big Data Processing,” Selected Areas in Communications, IEEE Journal on, vol. PP, no. 99,
pp. 1–1, 2016.
[254] J. Duan, Z. Wang, and C. Wu, “Responsive multipath TCP in SDN based datacenters,” in Communications
(ICC), 2015 IEEE International Conference on, June 2015, pp. 5296–5301.
[255] S. Sen, D. Shue, S. Ihm, and M. J. Freedman, “Scalable, Optimal Flow Routing in Datacenters via Local
Link Balancing,” in Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and
Technologies, ser. CoNEXT ’13. New York, NY, USA: ACM, 2013, pp. 151–162.
[256] S. Hu, K. Chen, H. Wu, W. Bai, C. Lan, H. Wang, H. Zhao, and C. Guo, “Explicit path control in commodity
data centers: Design and applications,” Networking, IEEE/ACM Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[257] Z. Xie, L. Hu, K. Zhao, F. Wang, and J. Pang, “Topology2Vec: Topology Representation Learning For Data
Center Networking,” IEEE Access, vol. 6, pp. 33 840–33 848, 2018.
[258] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, “Managing Data Transfers in Computer
Clusters with Orchestra,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 98–109, Aug. 2011.
[259] A. Shieh, S. Kandula, A. Greenberg, C. Kim, and B. Saha, “Sharing the Data Center Network,” in
Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’11.
Berkeley, CA, USA: USENIX Association, 2011, pp. 309–322.
[260] A. Das, C. Lumezanu, Y. Zhang, V. Singh, G. Jiang, and C. Yu, “Transparent and flexible network
management for big data processing in the cloud,” in Presented as part of the 5th USENIX Workshop on Hot
Topics in Cloud Computing. Berkeley, CA: USENIX, 2013.
[261] W. Cui and C. Qian, “DiFS: Distributed flow scheduling for adaptive routing in hierarchical data center
networks,” in 2014 ACM/IEEE Symposium on Architectures for Networking and Communications Systems
(ANCS), Oct 2014, pp. 53–64.
[262] M. Chowdhury, Y. Zhong, and I. Stoica, “Efficient Coflow Scheduling with Varys,” SIGCOMM Comput.
Commun. Rev., vol. 44, no. 4, pp. 443–454, Aug. 2014.
[263] F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, “Decentralized Task-aware Scheduling for Data
Center Networks,” SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 431–442, Aug. 2014.
[264] S. Luo, H. Yu, Y. Zhao, S. Wang, S. Yu, and L. Li, “Towards Practical and Near-optimal Coflow Scheduling
for Data Center Networks,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1,
2016.
[265] Y. Zhao, K. Chen, W. Bai, M. Yu, C. Tian, Y. Geng, Y. Zhang, D. Li, and S. Wang, “Rapier: Integrating
routing and scheduling for coflow-aware data center networks,” in 2015 IEEE Conference on Computer
Communications (INFOCOM), April 2015, pp. 424–432.
[266] Z. Guo, J. Duan, and Y. Yang, “On-Line Multicast Scheduling with Bounded Congestion in Fat-Tree Data
Center Networks,” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 1, pp. 102–115, January
2014.
[267] M. V. Neves, C. A. F. D. Rose, K. Katrinis, and H. Franke, “Pythia: Faster Big Data in Motion through
Predictive Software-Defined Network Optimization at Runtime,” in 2014 IEEE 28th International Parallel and
Distributed Processing Symposium, May 2014, pp. 82–90.
[268] W. Hong, K. Wang, and Y.-H. Hsu, “Application-Aware Resource Allocation for SDN-based Cloud
Datacenters,” in Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on, Dec 2013,
pp. 106–110.
[269] P. Qin, B. Dai, B. Huang, and G. Xu, “Bandwidth-Aware Scheduling With SDN in Hadoop: A New Trend
for Big Data,” Systems Journal, IEEE, vol. PP, no. 99, pp. 1–8, 2015.
[270] H. Rodrigues, R. Strong, A. Akyurek, and T. Rosing, “Dynamic optical switching for latency sensitive
applications,” in Architectures for Networking and Communications Systems (ANCS), 2015 ACM/IEEE
Symposium on, May 2015, pp. 75–86.
[271] K. Kontodimas, K. Christodoulopoulos, E. Zahavi, and E. Varvarigos, “Resource allocation in slotted
optical data center networks,” in 2018 International Conference on Optical Network Design and Modeling
(ONDM), May 2018, pp. 248–253.
[272] G. C. Sankaran and K. M. Sivalingam, “Design and Analysis of Scheduling Algorithms for Optically
Groomed Data Center Networks,” IEEE/ACM Transactions on Networking, vol. 25, no. 6, pp. 3282–3293,
Dec 2017.
[273] L. Wang, X. Wang, M. Tornatore, K. J. Kim, S. M. Kim, D. Kim, K. Han, and B. Mukherjee, “Scheduling
with machine-learning-based flow detection for packet-switched optical data center networks,” IEEE/OSA
Journal of Optical Communications and Networking, vol. 10, no. 4, pp. 365–375, April 2018.
[274] R. Xie and X. Jia, “Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud
Systems,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[275] I. Paik, W. Chen, and Z. Li, “Topology-Aware Optimal Data Placement Algorithm for Network Traffic
Optimization,” Computers, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[276] W. Li, D. Guo, A. X. Liu, K. Li, H. Qi, S. Guo, A. Munir, and X. Tao, “CoMan: Managing Bandwidth
Across Computing Frameworks in Multiplexed Datacenters,” IEEE Transactions on Parallel and Distributed
Systems, vol. 29, no. 5, pp. 1013–1029, May 2018.
[277] H. Shen, A. Sarker, L. Yu, and F. Deng, “Probabilistic Network-Aware Task Placement for MapReduce
Scheduling,” in 2016 IEEE International Conference on Cluster Computing (CLUSTER), Sept 2016, pp. 241–
250.
[278] Z. Li, H. Shen, and A. Sarker, “A Network-Aware Scheduler in Data-Parallel Clusters for High
Performance,” in 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
(CCGRID), May 2018, pp. 1–10.
[279] D. Xie, N. Ding, Y. C. Hu, and R. Kompella, “The Only Constant is Change: Incorporating Time-varying
Network Reservations in Data Centers,” in Proceedings of the ACM SIGCOMM 2012 Conference on
Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12.
New York, NY, USA: ACM, 2012, pp. 199–210.
[280] V. Jalaparti, P. Bodik, I. Menache, S. Rao, K. Makarychev, and M. Caesar, “Network-Aware Scheduling
for Data-Parallel Jobs: Plan When You Can,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 407–420,
Aug. 2015.
[281] K. Karanasos, S. Rao, C. Curino, C. Douglas, K. Chaliparambil, G. M. Fumarola, S. Heddaya, R.
Ramakrishnan, and S. Sakalanaga, “Mercury: Hybrid Centralized and Distributed Scheduling in Large
Shared Clusters,” in Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference, ser.
USENIX ATC ’15. Berkeley, CA, USA: USENIX Association, 2015, pp. 485–497.
[282] T. Renner, L. Thamsen, and O. Kao, “Network-aware resource management for scalable data analytics
frameworks,” in Big Data (Big Data), 2015 IEEE International Conference on, Oct 2015, pp. 2793–2800.
[283] R. F. e Silva and P. M. Carpenter, “Energy Efficient Ethernet on MapReduce Clusters: Packet Coalescing
To Improve 10GbE Links,” IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2731–2742, Oct 2017.
[284] G. Wen, J. Hong, C. Xu, P. Balaji, S. Feng, and P. Jiang, “Energy-aware hierarchical scheduling of
applications in large scale data centers,” in Cloud and Service Computing (CSC), 2011 International Conference
on, Dec 2011, pp. 158–165.
[285] D. Li, Y. Yu, W. He, K. Zheng, and B. He, “Willow: Saving Data Center Network Energy for Network-
Limited Flows,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 9, pp. 2610–2620, Sept
2015.
[286] Z. Niu, B. He, and F. Liu, “JouleMR: Towards Cost-Effective and Green-Aware Data Processing
Frameworks,” IEEE Transactions on Big Data, vol. 4, no. 2, pp. 258–272, June 2018.
[287] R. Appuswamy, C. Gkantsidis, D. Narayanan, O. Hodson, and A. Rowstron, “Scale-up vs Scale-out for
Hadoop: Time to rethink?” ACM Symposium on Cloud Computing, October 2013.
[288] Z. Li, H. Shen, W. Ligon, and J. Denton, “An Exploration of Designing a Hybrid Scale-Up/Out Hadoop
Architecture Based on Performance Measurements,” IEEE Transactions on Parallel and Distributed Systems,
vol. 28, no. 2, pp. 386–400, Feb 2017.
[289] S. Sur, H. Wang, J. Huang, X. Ouyang, and D. Panda, “Can High-Performance Interconnects Benefit
Hadoop Distributed File System?” 2010.
[290] Y. Wang, R. Goldstone, W. Yu, and T. Wang, “Characterization and Optimization of Memory-Resident
MapReduce on HPC Systems,” in Parallel and Distributed Processing Symposium, 2014 IEEE 28th International,
May 2014, pp. 799–808.
[291] K. Kambatla and Y. Chen, “The Truth About MapReduce Performance on SSDs,” in Proceedings of the
28th USENIX Conference on Large Installation System Administration, ser. LISA’14. Berkeley, CA, USA:
USENIX Association, 2014, pp. 109–117.
[292] J. Hong, L. Li, C. Han, B. Jin, Q. Yang, and Z. Yang, “Optimizing Hadoop Framework for Solid State
Drives,” in 2016 IEEE International Congress on Big Data (BigData Congress), June 2016, pp. 9–17.
[293] B. Wang, J. Jiang, Y. Wu, G. Yang, and K. Li, “Accelerating MapReduce on Commodity Clusters: An SSD-
Empowered Approach,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2016.
[294] J. Bhimani, J. Yang, Z. Yang, N. Mi, Q. Xu, M. Awasthi, R. Pandurangan, and V. Balakrishnan,
“Understanding performance of I/O intensive containerized applications for NVMe SSDs,” in 2016 IEEE 35th
International Performance Computing and Communications Conference (IPCCC), Dec 2016, pp. 1–8.
[295] G. Wang, A. R. Butt, H. Monti, and K. Gupta, “Towards Synthesizing Realistic Workload Traces for
Studying the Hadoop Ecosystem,” in Proceedings of the 2011 IEEE 19th Annual International Symposium on
Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, ser. MASCOTS ’11.
Washington, DC, USA: IEEE Computer Society, 2011, pp. 400–408.
[296] T. Ono, Y. Konishi, T. Tanimoto, N. Iwamatsu, T. Miyoshi, and J. Tanaka, “FlexDAS: A flexible direct
attached storage for I/O intensive applications,” in 2014 IEEE International Conference on Big Data (Big Data),
Oct 2014, pp. 147–152.
[297] Y. Kim, S. Atchley, G. R. Vallee, and G. M. Shipman, “Layout-aware I/O Scheduling for terabits data
movement,” in 2013 IEEE International Conference on Big Data, Oct 2013, pp. 44–51.
[298] A. Dragojević, D. Narayanan, M. Castro, and O. Hodson, “FaRM: Fast Remote Memory,” in 11th USENIX
Symposium on Networked Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX
Association, 2014, pp. 401–414.
[299] W. Yu, Y. Wang, X. Que, and C. Xu, “Virtual Shuffling for Efficient Data Movement in MapReduce,”
Computers, IEEE Transactions on, vol. 64, no. 2, pp. 556–568, Feb 2015.
[300] M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A.
Ailamaki, and B. Falsafi, “A Case for Specialized Processors for Scale-Out Workloads,” IEEE Micro, vol. 34, no.
3, pp. 31–42, May 2014.
[301] B. Jacob, “The 2 PetaFLOP, 3 Petabyte, 9 TB/s, 90 kW Cabinet: A System Architecture for Exascale and
Big Data,” IEEE Computer Architecture Letters, vol. PP, no. 99, pp. 1–1, 2015.
[302] W. Fang, B. He, Q. Luo, and N. K. Govindaraju, “Mars: Accelerating MapReduce with Graphics
Processors,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 4, pp. 608–620, April 2011.
[303] C. Wang, C. Yang, W. Liao, R. Chang, and T. Wei, “Coupling GPU and MPTCP to improve
Hadoop/MapReduce performance,” in 2016 2nd International Conference on Intelligent Green Building and
Smart Grid (IGBSG), June 2016, pp. 1–6.
[304] Y. Shan, B. Wang, J. Yan, Y. Wang, N. Xu, and H. Yang, “FPMR: MapReduce framework on FPGA,” in Proceedings of the 18th Annual ACM/SIGDA International Symposium on Field Programmable Gate Arrays. New York, NY, USA: ACM, January 2010, pp. 93–102.
[305] C. Wang, X. Li, and X. Zhou, “SODA: Software defined FPGA based accelerators for big data,” in 2015
Design, Automation Test in Europe Conference Exhibition (DATE), March 2015, pp. 884–887.
[306] D. Diamantopoulos and C. Kachris, “High-level synthesizable dataflow MapReduce accelerator for FPGA-
coupled data centers,” in Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS),
2015 International Conference on, July 2015, pp. 26–33.
[307] Y. Tokusashi and H. Matsutani, “Multilevel NoSQL Cache Combining In-NIC and In-Kernel Approaches,”
IEEE Micro, vol. 37, no. 5, pp. 44–51, September 2017.
[308] K. Nakamura, A. Hayashi, and H. Matsutani, “An FPGA-based low-latency network processing for spark
streaming,” in 2016 IEEE International Conference on Big Data (Big Data), Dec 2016, pp. 2410–2415.
[309] B. Betkaoui, D. B. Thomas, W. Luk, and N. Przulj, “A framework for FPGA acceleration of large graph
problems: Graphlet counting case study,” in 2011 International Conference on Field-Programmable Technology,
Dec 2011, pp. 1–8.
[310] P. X. Gao, A. Narayan, S. Karandikar, J. Carreira, S. Han, R. Agarwal, S. Ratnasamy, and S. Shenker,
“Network Requirements for Resource Disaggregation,” in 12th USENIX Symposium on Operating Systems
Design and Implementation (OSDI 16). Savannah, GA: USENIX Association, 2016, pp. 249–264.
[311] C.-S. Li, H. Franke, C. Parris, B. Abali, M. Kesavan, and V. Chang, “Composable architecture for rack scale
big data computing,” Future Generation Computer Systems, vol. 67, pp. 180 – 193, 2017.
[312] M. Chen, S. Mao, and Y. Liu, “Big Data: A Survey,” Mob. Netw. Appl., vol. 19, no. 2, pp. 171–209, Apr.
2014.
[313] S. Sakr, A. Liu, D. Batista, and M. Alomari, “A Survey of Large Scale Data Management Approaches in
Cloud Environments,” Communications Surveys Tutorials, IEEE, vol. 13, no. 3, pp. 311–336, Third 2011.
[314] L. Jiamin and F. Jun, “A Survey of MapReduce Based Parallel Processing Technologies,” Communications,
China, vol. 11, no. 14, pp. 146–155, Supplement 2014.
[315] Y. Zhang, T. Cao, S. Li, X. Tian, L. Yuan, H. Jia, and A. V. Vasilakos, “Parallel Processing Systems for
Big Data: A Survey,” Proceedings of the IEEE, vol. 104, no. 11, pp. 2114–2136, Nov 2016.
[316] H. Zhang, G. Chen, B. C. Ooi, K. L. Tan, and M. Zhang, “In-Memory Big Data Management and Processing:
A Survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 7, pp. 1920–1948, July 2015.
[317] R. Han, L. K. John, and J. Zhan, “Benchmarking Big Data Systems: A Review,” IEEE Transactions on
Services Computing, vol. PP, no. 99, pp. 1–1, 2017.
[318] G. Rumi, C. Colella, and D. Ardagna, “Optimization Techniques within the Hadoop Eco-system: A Survey,”
in Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2014 16th International Symposium
on, Sept 2014, pp. 437–444.
[319] B. T. Rao and L. S. S. Reddy, “Survey on Improved Scheduling in Hadoop MapReduce in Cloud
Environments,” CoRR, vol. abs/1207.0780, 2012.
[320] R. Li, H. Hu, H. Li, Y. Wu, and J. Yang, “Mapreduce parallel programming model: A state-of-the-art
survey,” International Journal of Parallel Programming, vol. 44, no. 4, pp. 832–866, Aug 2016.
[321] J. Wu, S. Guo, J. Li, and D. Zeng, “Big Data Meet Green Challenges: Big Data Toward Green Applications,”
IEEE Systems Journal, vol. 10, no. 3, pp. 888–900, Sept 2016.
[322] P. Derbeko, S. Dolev, E. Gudes, and S. Sharma, “Security and privacy aspects in mapreduce on clouds: A
survey,” Computer Science Review, vol. 20, pp. 1 – 28, 2016.
[323] S. Dolev, P. Florissi, E. Gudes, S. Sharma, and I. Singer, “A Survey on Geographically Distributed Big-
Data Processing using MapReduce,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2017.
[324] M. Hadi, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Big Data Analytics for Wireless and Wired Network
Design: A Survey,” Computer Networks, vol. 132, pp. 180–199, February 2018.
[325] J. Wang, Y. Wu, N. Yen, S. Guo, and Z. Cheng, “Big Data Analytics for Emergency Communication
Networks: A Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1758–1778, thirdquarter 2016.
[326] X. Cao, L. Liu, Y. Cheng, and X. Shen, “Towards Energy-Efficient Wireless Networking in the Big Data
Era: A Survey,” IEEE Communications Surveys Tutorials, vol. PP, no. 99, pp. 1–1, 2017.
[327] S. Yu, M. Liu, W. Dou, X. Liu, and S. Zhou, “Networking for Big Data: A Survey,” IEEE Communications
Surveys Tutorials, vol. 19, no. 1, pp. 531–549, Firstquarter 2017.
[328] S. Wang, J. Zhang, T. Huang, J. Liu, T. Pan, and Y. Liu, “A Survey of Coflow Scheduling Schemes for
Data Center Networks,” IEEE Communications Magazine, vol. 56, no. 6, pp. 179–185, June 2018.
[329] K. Wang, Q. Zhou, S. Guo, and J. Luo, “Cluster Frameworks for Efficient Scheduling and Resource
Allocation in Data Center Networks: A Survey,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[330] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed Data-parallel Programs from
Sequential Building Blocks,” SIGOPS Oper. Syst. Rev., vol. 41, no. 3, pp. 59–72, Mar. 2007.
[331] T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle, “The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing,” Proceedings of the VLDB
Endowment, vol. 8, pp. 1792–1803, 2015.
[332] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google File System,” SIGOPS Oper. Syst. Rev., vol. 37,
no. 5, pp. 29–43, Oct. 2003.
[333] V. Kalavri and V. Vlassov, “MapReduce: Limitations, Optimizations and Open Issues,” in 2013 12th IEEE
International Conference on Trust, Security and Privacy in Computing and Communications, July 2013, pp. 1031–
1038.
[334] S. Babu, “Towards Automatic Optimization of MapReduce Programs,” in Proceedings of the 1st ACM
Symposium on Cloud Computing, ser. SoCC ’10. New York, NY, USA: ACM, 2010, pp. 137–142.
[335] P. Lama and X. Zhou, “AROMA: Automated Resource Allocation and Configuration of Mapreduce
Environment in the Cloud,” in Proceedings of the 9th International Conference on Autonomic Computing, ser.
ICAC ’12. New York, NY, USA: ACM, 2012, pp. 63–72.
[336] A. Rabkin and R. Katz, “How Hadoop Clusters Break,” Software, IEEE, vol. 30, no. 4, pp. 88–94, July
2013.
[337] D. Cheng, J. Rao, Y. Guo, C. Jiang, and X. Zhou, “Improving Performance of Heterogeneous MapReduce
Clusters with Adaptive Task Tuning,” IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 3, pp.
774–786, March 2017.
[338] T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy, and R. Sears, “MapReduce Online,” in
Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’10.
Berkeley, CA, USA: USENIX Association, 2010, pp. 21–21.
[339] T. White, Hadoop: The Definitive Guide, 1st ed. O’Reilly Media, Inc., 2009.
[340] C. Ji, Y. Li, W. Qiu, U. Awada, and K. Li, “Big Data Processing in Cloud Computing Environments,” in
Pervasive Systems, Algorithms and Networks (ISPAN), 2012 12th International Symposium on, Dec 2012, pp.
17–23.
[341] V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah,
S. Seth, B. Saha, C. Curino, O. O’Malley, S. Radia, B. Reed, and E. Baldeschwieler, “Apache Hadoop YARN:
Yet Another Resource Negotiator,” in Proceedings of the 4th Annual Symposium on Cloud Computing, ser.
SOCC ’13. New York, NY, USA: ACM, 2013, pp. 5:1–5:16.
[342] I. Polato, D. Barbosa, A. Hindle, and F. Kon, “Hadoop branching: Architectural impacts on energy and
performance,” in Green Computing Conference and Sustainable Computing Conference (IGSC), 2015 Sixth
International, Dec 2015, pp. 1–4.
[343] C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, “Pig Latin: A Not-so-foreign Language for
Data Processing,” in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data,
ser. SIGMOD ’08. New York, NY, USA: ACM, 2008, pp. 1099–1110.
[344] B. Saha, H. Shah, S. Seth, G. Vijayaraghavan, A. Murthy, and C. Curino, “Apache Tez: A Unifying
Framework for Modeling and Building Data Processing Applications,” in Proceedings of the 2015 ACM
SIGMOD International Conference on Management of Data, ser. SIGMOD ’15. New York, NY, USA: ACM,
2015, pp. 1357–1369.
[345] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy, “Hive:
A Warehousing Solution over a Map-reduce Framework,” Proc. VLDB Endow., vol. 2, no. 2, pp. 1626–1629,
Aug. 2009.
[346] A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J. M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, J.
Donham, N. Bhagat, S. Mittal, and D. Ryaboy, “Storm@twitter,” in Proceedings of the 2014 ACM SIGMOD
International Conference on Management of Data, ser. SIGMOD ’14. New York, NY, USA: ACM, 2014, pp.
147–156.
[347] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R.
E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,” ACM Trans. Comput. Syst., vol. 26,
no. 2, pp. 4:1–4:26, Jun. 2008.
[348] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D.
Weaver, and R. Yerneni, “PNUTS: Yahoo!’s Hosted Data Serving Platform,” Proc. VLDB Endow., vol. 1, no. 2,
pp. 1277–1288, Aug. 2008.
[349] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P.
Vosshall, and W. Vogels, “Dynamo: Amazon’s Highly Available Key-value Store,” SIGOPS Oper. Syst. Rev.,
vol. 41, no. 6, pp. 205–220, Oct. 2007.
[350] A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin, “HadoopDB: An
Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads,” Proc. VLDB Endow.,
vol. 2, no. 1, pp. 922–933, Aug. 2009.
[351] A. Lakshman and P. Malik, “Cassandra: A Decentralized Structured Storage System,” SIGOPS Oper. Syst.
Rev., vol. 44, no. 2, pp. 35–40, Apr. 2010.
[352] J. Dittrich, J.-A. Quiané-Ruiz, A. Jindal, Y. Kargin, V. Setty, and J. Schad, “Hadoop++: Making a Yellow
Elephant Run Like a Cheetah (Without It Even Noticing),” Proc. VLDB Endow., vol. 3, no. 1-2, pp. 515–529,
Sep. 2010.
[353] J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G.
Parulkar, M. Rosenblum, S. M. Rumble, E. Stratmann, and R. Stutsman, “The Case for RAMClouds: Scalable
High-performance Storage Entirely in DRAM,” SIGOPS Oper. Syst. Rev., vol. 43, no. 4, pp. 92–105, Jan. 2010.
[354] F. Färber, S. K. Cha, J. Primsch, C. Bornhövd, S. Sigg, and W. Lehner, “SAP HANA Database: Data
Management for Modern Business Applications,” SIGMOD Rec., vol. 40, no. 4, pp. 45–51, Jan. 2012.
[355] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: Cluster Computing with
Working Sets,” in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, ser.
HotCloud’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 10–10.
[356] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica,
“Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing,” in Proceedings
of the 9th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’12. Berkeley, CA,
USA: USENIX Association, 2012, pp. 2–2.
[357] A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan, “One Trillion Edges: Graph
Processing at Facebook-scale,” Proc. VLDB Endow., vol. 8, no. 12, pp. 1804–1815, Aug. 2015.
[358] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: A
System for Large-scale Graph Processing,” in Proceedings of the 2010 ACM SIGMOD International Conference
on Management of Data, ser. SIGMOD ’10. New York, NY, USA: ACM, 2010, pp. 135–146.
[359] B. Shao, H. Wang, and Y. Li, “Trinity: A Distributed Graph Engine on a Memory Cloud,” in Proceedings
of SIGMOD 2013. ACM SIGMOD, June 2013.
[360] Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein, “Distributed GraphLab: A
Framework for Machine Learning and Data Mining in the Cloud,” Proc. VLDB Endow., vol. 5, no. 8, pp. 716–
727, Apr. 2012.
[361] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, “PowerGraph: Distributed Graph-Parallel
Computation on Natural Graphs,” in Presented as part of the 10th USENIX Symposium on Operating Systems
Design and Implementation (OSDI 12). Hollywood, CA: USENIX, 2012, pp. 17–30.
[362] D. Bernstein, “The Emerging Hadoop, Analytics, Stream Stack for Big Data,” Cloud Computing, IEEE,
vol. 1, no. 4, pp. 84–86, Nov 2014.
[363] A. M. Aly, A. Sallam, B. M. Gnanasekaran, L. V. Nguyen-Dinh, W. G. Aref, M. Ouzzani, and A. Ghafoor,
“M3: Stream Processing on Main-Memory MapReduce,” in 2012 IEEE 28th International Conference on Data
Engineering, April 2012, pp. 1253–1256.
[364] T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P.
Nordstrom, and S. Whittle, “MillWheel: Fault-Tolerant Stream Processing at Internet Scale,” in Very Large Data
Bases, 2013, pp. 734–746.
[365] L. Neumeyer, B. Robbins, A. Nair, and A. Kesari, “S4: Distributed Stream Computing Platform,” in
Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, ser. ICDMW ’10.
Washington, DC, USA: IEEE Computer Society, 2010, pp. 170–177.
[366] M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica, “Discretized Streams: An Efficient and Fault-tolerant
Model for Stream Processing on Large Clusters,” in Proceedings of the 4th USENIX Conference on Hot Topics
in Cloud Computing, ser. HotCloud’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 10–10.
[367] N. Marz and J. Warren, Big Data: Principles and Best Practices of Scalable Realtime Data Systems, 1st ed.
Greenwich, CT, USA: Manning Publications Co., 2015.
[368] J. Kreps, N. Narkhede, and J. Rao, “Kafka: A distributed messaging system for log processing,” in
Proceedings of 6th International Workshop on Networking Meets Databases (NetDB), Athens, Greece, 2011.
[369] Apache Hadoop Rumen. (Cited on 2016, Dec). [Online]. Available:
https://hadoop.apache.org/docs/stable/hadoop-rumen/Rumen.html
[370] M. Sadiku, S. Musa, and O. Momoh, “Cloud Computing: Opportunities and Challenges,” Potentials, IEEE,
vol. 33, no. 1, pp. 34–36, Jan 2014.
[371] B. Biocic, D. Tomic, and D. Ogrizovic, “Economics of the cloud computing,” in MIPRO, 2011 Proceedings
of the 34th International Convention, May 2011, pp. 1438–1442.
[372] N. da Fonseca and R. Boutaba, Cloud Architectures, Networks, Services, and Management. Wiley-IEEE
Press, 2015, p. 432.
[373] J. E. Smith and R. Nair, “The architecture of virtual machines,” Computer, vol. 38, no. 5, pp. 32–38, May
2005.
[374] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, “Virtual Infrastructure Management in Private
and Hybrid Clouds,” IEEE Internet Computing, vol. 13, no. 5, pp. 14–22, Sept 2009.
[375] Amazon EC2. (Cited on 2017, Dec). [Online]. Available: https://aws.amazon.com/ec2
[376] Google Compute Engine Pricing. (Cited on 2017, Dec). [Online]. Available:
https://cloud.google.com/compute/pricing
[377] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield,
“Xen and the Art of Virtualization,” SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 164–177, Oct. 2003.
[378] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, “kvm: the Linux Virtual Machine Monitor,” in
Proceedings of the Linux Symposium, vol. 1, Ottawa, Ontario, Canada, Jun. 2007, pp. 225–230.
[379] F. Guthrie, S. Lowe, and K. Coleman, VMware vSphere Design, 2nd ed. Alameda, CA, USA: SYBEX Inc.,
2013.
[380] T. Kooburat and M. Swift, “The Best of Both Worlds with On-demand Virtualization,” in Proceedings of
the 13th USENIX Conference on Hot Topics in Operating Systems, ser. HotOS’13. Berkeley, CA, USA: USENIX
Association, 2011, pp. 4–4.
[381] D. Venzano and P. Michiardi, “A Measurement Study of Data-Intensive Network Traffic Patterns in a
Private Cloud,” in 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, Dec 2013, pp.
476–481.
[382] F. Xu, F. Liu, H. Jin, and A. Vasilakos, “Managing Performance Overhead of Virtual Machines in Cloud
Computing: A Survey, State of the Art, and Future Directions,” Proceedings of the IEEE, vol. 102, no. 1, pp. 11–
31, Jan 2014.
[383] G. Wang and T. S. E. Ng, “The Impact of Virtualization on Network Performance of Amazon EC2 Data
Center,” in Proceedings of the 29th Conference on Information Communications, ser. INFOCOM’10. Piscataway,
NJ, USA: IEEE Press, 2010, pp. 1163–1171.
[384] Q. Duan, Y. Yan, and A. V. Vasilakos, “A Survey on Service-Oriented Network Virtualization Toward
Convergence of Networking and Cloud Computing,” IEEE Transactions on Network and Service Management,
vol. 9, no. 4, pp. 373–392, December 2012.
[385] R. Jain and S. Paul, “Network virtualization and software defined networking for cloud computing: a
survey,” Communications Magazine, IEEE, vol. 51, no. 11, pp. 24–31, November 2013.
[386] A. Fischer, J. F. Botero, M. T. Beck, H. de Meer, and X. Hesselbach, “Virtual Network Embedding: A
Survey,” IEEE Communications Surveys Tutorials, vol. 15, no. 4, pp. 1888–1906, Fourth 2013.
[387] L. Nonde, T. El-Gorashi, and J. Elmirghani, “Energy Efficient Virtual Network Embedding for Cloud
Networks,” Lightwave Technology, Journal of, vol. 33, no. 9, pp. 1828–1849, May 2015.
[388] L. Nonde, T. E. H. Elgorashi, and J. M. H. Elmirghani, “Cloud Virtual Network Embedding: Profit, Power
and Acceptance,” in 2015 IEEE Global Communications Conference (GLOBECOM), Dec 2015, pp. 1–6.
[389] R. Mijumbi, J. Serrat, J. Gorricho, N. Bouten, F. De Turck, and R. Boutaba, “Network Function
Virtualization: State-of-the-Art and Research Challenges,” Communications Surveys Tutorials, IEEE, vol. 18, no.
1, pp. 236–262, Firstquarter 2016.
[390] H. Hawilo, A. Shami, M. Mirahmadi, and R. Asal, “NFV: state of the art, challenges, and implementation
in next generation mobile networks (vEPC),” IEEE Network, vol. 28, no. 6, pp. 18–26, Nov 2014.
[391] V. Nguyen, A. Brunstrom, K. Grinnemo, and J. Taheri, “SDN/NFV-Based Mobile Packet Core Network
Architectures: A Survey,” IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1567–1602, thirdquarter
2017.
[392] D. A. Temesgene, J. Núñez-Martínez, and P. Dini, “Softwarization and Optimization for Sustainable Future
Mobile Networks: A Survey,” IEEE Access, vol. 5, pp. 25 421–25 436, 2017.
[393] I. Afolabi, T. Taleb, K. Samdanis, A. Ksentini, and H. Flinck, “Network Slicing & Softwarization: A Survey
on Principles, Enabling Technologies & Solutions,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[394] J. G. Herrera and J. F. Botero, “Resource Allocation in NFV: A Comprehensive Survey,” IEEE Transactions
on Network and Service Management, vol. 13, no. 3, pp. 518–532, Sept 2016.
[395] L. Peterson, A. Al-Shabibi, T. Anshutz, S. Baker, A. Bavier, S. Das, J. Hart, G. Palukar, and W. Snow,
“Central office re-architected as a data center,” IEEE Communications Magazine, vol. 54, no. 10, pp. 96–101,
October 2016.
[396] A. N. Al-Quzweeni, A. Q. Lawey, T. E. H. Elgorashi, and J. M. H. Elmirghani, “Optimized Energy Aware
5G Network Function Virtualization,” IEEE Access, pp. 1–1, 2019.
[397] A. Al-Quzweeni, A. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “A framework for energy efficient
NFV in 5G networks,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July
2016, pp. 1–4.
[398] A. Al-Quzweeni, T. E. H. El-Gorashi, L. Nonde, and J. M. H. Elmirghani, “Energy efficient network
function virtualization in 5G networks,” in 2015 17th International Conference on Transparent Optical Networks
(ICTON), July 2015, pp. 1–4.
[399] J. Zhang, Y. Ji, X. Xu, H. Li, Y. Zhao, and J. Zhang, “Energy efficient baseband unit aggregation in cloud
radio and optical access networks,” IEEE/OSA Journal of Optical Communications and Networking, vol. 8, no.
11, pp. 893–901, Nov 2016.
[400] M. Peng, Y. Li, Z. Zhao, and C. Wang, “System architecture and key technologies for 5G heterogeneous
cloud radio access networks,” IEEE Network, vol. 29, no. 2, pp. 6–14, March 2015.
[401] M. Peng, Y. Sun, X. Li, Z. Mao, and C. Wang, “Recent Advances in Cloud Radio Access Networks: System
Architectures, Key Techniques, and Open Issues,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp.
2282–2308, thirdquarter 2016.
[402] A. Checko, H. L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. S. Berger, and L. Dittmann, “Cloud
RAN for Mobile Networks - A Technology Overview,” IEEE Communications Surveys Tutorials, vol. 17, no. 1,
pp. 405–426, Firstquarter 2015.
[403] L. Velasco, L. M. Contreras, G. Ferraris, A. Stavdas, F. Cugini, M. Wiegand, and J. P. Fernandez-Palacios,
“A service-oriented hybrid access network and clouds architecture,” IEEE Communications Magazine, vol. 53,
no. 4, pp. 159–165, April 2015.
[404] M. Kalil, A. Al-Dweik, M. F. A. Sharkh, A. Shami, and A. Refaey, “A Framework for Joint Wireless
Network Virtualization and Cloud Radio Access Networks for Next Generation Wireless Networks,” IEEE
Access, vol. 5, pp. 20 814–20 827, 2017.
[405] M. Richart, J. Baliosian, J. Serrat, and J. L. Gorricho, “Resource Slicing in Virtual Wireless Networks: A
Survey,” IEEE Transactions on Network and Service Management, vol. 13, no. 3, pp. 462–476, Sept 2016.
[406] R. Bolla, R. Bruschi, F. Davoli, C. Lombardo, J. F. Pajo, and O. R. Sanchez, “The dark side of network
functions virtualization: A perspective on the technological sustainability,” in 2017 IEEE International
Conference on Communications (ICC), May 2017, pp. 1–7.
[407] D. Bernstein, “Containers and Cloud: From LXC to Docker to Kubernetes,” Cloud Computing, IEEE, vol.
1, no. 3, pp. 81–84, Sept 2014.
[408] C. Pahl and B. Lee, “Containers and Clusters for Edge Cloud Architectures – A Technology Review,” in
2015 3rd International Conference on Future Internet of Things and Cloud, Aug 2015, pp. 379–386.
[409] I. Mavridis and H. Karatza, “Performance and Overhead Study of Containers Running on Top of Virtual
Machines,” in 2017 IEEE 19th Conference on Business Informatics (CBI), vol. 02, July 2017, pp. 32–38.
[410] M. G. Xavier, M. V. Neves, and C. A. F. D. Rose, “A Performance Comparison of Container-Based
Virtualization Systems for MapReduce Clusters,” in 2014 22nd Euromicro International Conference on Parallel,
Distributed, and Network-Based Processing, Feb 2014, pp. 299–306.
[411] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual
machines and Linux containers,” in Performance Analysis of Systems and Software (ISPASS), 2015 IEEE
International Symposium on, March 2015, pp. 171–172.
[412] Docker Container Executor. (Cited on 2017, Mar). [Online]. Available:
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
[413] S. Radhakrishnan, B. J. Muscedere, and K. Daudjee, “V-Hadoop: Virtualized Hadoop using containers,” in
2016 IEEE 15th International Symposium on Network Computing and Applications (NCA), vol. 00, Oct. 2016,
pp. 237–241.
[414] D. Kreutz, F. M. V. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, “Software-
Defined Networking: A Comprehensive Survey,” Proceedings of the IEEE, vol. 103, no. 1, pp. 14–76, Jan 2015.
[415] F. Bannour, S. Souihi, and A. Mellouk, “Distributed SDN Control: Survey, Taxonomy, and Challenges,”
IEEE Communications Surveys Tutorials, vol. 20, no. 1, pp. 333–354, Firstquarter 2018.
[416] B. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka, and T. Turletti, “A Survey of Software-Defined
Networking: Past, Present, and Future of Programmable Networks,” Communications Surveys Tutorials, IEEE,
vol. 16, no. 3, pp. 1617–1634, Third 2014.
[417] W. Xia, Y. Wen, C. H. Foh, D. Niyato, and H. Xie, “A Survey on Software-Defined Networking,” IEEE
Communications Surveys Tutorials, vol. 17, no. 1, pp. 27–51, Firstquarter 2015.
[418] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner,
“OpenFlow: Enabling Innovation in Campus Networks,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp.
69–74, Mar. 2008.
[419] F. Hu, Q. Hao, and K. Bao, “A Survey on Software-Defined Network and OpenFlow: From Concept to
Implementation,” IEEE Communications Surveys Tutorials, vol. 16, no. 4, pp. 2181–2206, Fourthquarter 2014.
[420] A. Lara, A. Kolasani, and B. Ramamurthy, “Network Innovation using OpenFlow: A Survey,” IEEE
Communications Surveys Tutorials, vol. 16, no. 1, pp. 493–512, First 2014.
[421] A. Mendiola, J. Astorga, E. Jacob, and M. Higuero, “A Survey on the Contributions of Software-Defined
Networking to Traffic Engineering,” IEEE Communications Surveys Tutorials, vol. 19, no. 2, pp. 918–953,
Secondquarter 2017.
[422] O. Michel and E. Keller, “SDN in wide-area networks: A survey,” in 2017 Fourth International Conference
on Software Defined Systems (SDS), May 2017, pp. 37–42.
[423] B. Pfaff, J. Pettit, T. Koponen, E. J. Jackson, A. Zhou, J. Rajahalme, J. Gross, A. Wang, J. Stringer, P.
Shelar, K. Amidon, and M. Casado, “The Design and Implementation of Open vSwitch,” in Proceedings of the
12th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’15. Berkeley, CA,
USA: USENIX Association, 2015, pp. 117–130.
[424] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat,
G. Varghese, and D. Walker, “P4: Programming Protocol-independent Packet Processors,” SIGCOMM Comput.
Commun. Rev., vol. 44, no. 3, pp. 87–95, Jul. 2014.
[425] T. Huang, F. R. Yu, C. Zhang, J. Liu, J. Zhang, and Y. Liu, “A Survey on Large-Scale Software Defined
Networking (SDN) Testbeds: Approaches and Challenges,” IEEE Communications Surveys Tutorials, vol. 19,
no. 2, pp. 891–917, Secondquarter 2017.
[426] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu,
J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat, “B4: Experience with a Globally-deployed Software Defined Wan,”
SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 3–14, Aug. 2013.
[427] C.-Y. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill, M. Nanduri, and R. Wattenhofer, “Achieving High
Utilization with Software-driven WAN,” SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 15–26, Aug.
2013.
[428] J. Wang, Y. Yan, and L. Dittmann, “Design of energy efficient optical networks with software enabled
integrated control plane,” Networks, IET, vol. 4, no. 1, pp. 30–36, 2015.
[429] L. Cui, F. R. Yu, and Q. Yan, “When big data meets software-defined networking: SDN for big data and
big data for SDN,” IEEE Network, vol. 30, no. 1, pp. 58–65, January 2016.
[430] H. Huang, H. Yin, G. Min, H. Jiang, J. Zhang, and Y. Wu, “Data-Driven Information Plane in Software-
Defined Networking,” IEEE Communications Magazine, vol. 55, no. 6, pp. 218–224, 2017.
[431] T. Hafeez, N. Ahmed, B. Ahmed, and A. W. Malik, “Detection and Mitigation of Congestion in SDN
Enabled Data Center Networks: A Survey,” IEEE Access, vol. 6, pp. 1730–1740, 2018.
[432] Y. Zhang, P. Chowdhury, M. Tornatore, and B. Mukherjee, “Energy Efficiency in Telecom Optical
Networks,” Communications Surveys Tutorials, IEEE, vol. 12, no. 4, pp. 441–458, Fourth 2010.
[433] R. Ramaswami, K. N. Sivarajan, and G. H. Sasaki, Optical Networks, 3rd ed. Morgan Kaufmann, 2010.
[434] H. Yin, Y. Jiang, C. Lin, Y. Luo, and Y. Liu, “Big data: transforming the design philosophy of future
internet,” Network, IEEE, vol. 28, no. 4, pp. 14–19, July 2014.
[435] K.-I. Kitayama, A. Hiramatsu, M. Fukui, T. Tsuritani, N. Yamanaka, S. Okamoto, M. Jinno, and M. Koga,
“Photonic Network Vision 2020 - Toward Smart Photonic Cloud,” Lightwave Technology, Journal of, vol. 32,
no. 16, pp. 2760–2770, Aug 2014.
[436] A. S. Thyagaturu, A. Mercian, M. P. McGarry, M. Reisslein, and W. Kellerer, “Software Defined Optical
Networks (SDONs): A Comprehensive Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 4, pp.
2738–2786, Fourthquarter 2016.
[437] Y. Yin, L. Liu, R. Proietti, and S. J. B. Yoo, “Software Defined Elastic Optical Networks for Cloud
Computing,” IEEE Network, vol. 31, no. 1, pp. 4–10, January 2017.
[438] A. Nag, M. Tornatore, and B. Mukherjee, “Optical Network Design With Mixed Line Rates and Multiple
Modulation Formats,” Journal of Lightwave Technology, vol. 28, no. 4, pp. 466–475, Feb 2010.
[439] Y. Ji, J. Zhang, Y. Zhao, H. Li, Q. Yang, C. Ge, Q. Xiong, D. Xue, J. Yu, and S. Qiu, “All Optical Switching
Networks With Energy-Efficient Technologies From Components Level to Network Level,” IEEE Journal on
Selected Areas in Communications, vol. 32, no. 8, pp. 1600–1614, Aug 2014.
[440] X. Zhao, V. Vusirikala, B. Koley, V. Kamalov, and T. Hofmeister, “The prospect of inter-data-center optical
networks,” IEEE Communications Magazine, vol. 51, no. 9, pp. 32–38, September 2013.
[441] G. Tzimpragos, C. Kachris, I. B. Djordjevic, M. Cvijetic, D. Soudris, and I. Tomkos, “A Survey on FEC
Codes for 100 G and Beyond Optical Networks,” IEEE Communications Surveys Tutorials, vol. 18, no. 1, pp.
209–221, Firstquarter 2016.
[442] D. M. Marom, P. D. Colbourne, A. D’errico, N. K. Fontaine, Y. Ikuma, R. Proietti, L. Zong, J. M. Rivas-
Moscoso, and I. Tomkos, “Survey of photonic switching architectures and technologies in support of spatially and
spectrally flexible optical networking [invited],” IEEE/OSA Journal of Optical Communications and Networking,
vol. 9, no. 1, pp. 1–26, Jan 2017.
[443] X. Yu, M. Tornatore, M. Xia, J. Wang, J. Zhang, Y. Zhao, J. Zhang, and B. Mukherjee, “Migration from
fixed grid to flexible grid in optical networks,” IEEE Communications Magazine, vol. 53, no. 2, pp. 34–43, Feb
2015.
[444] M. Jinno, H. Takara, B. Kozicki, Y. Tsukishima, Y. Sone, and S. Matsuoka, “Spectrum-efficient and
scalable elastic optical path network: architecture, benefits, and enabling technologies,” IEEE Communications
Magazine, vol. 47, no. 11, pp. 66–73, November 2009.
[445] O. Gerstel, M. Jinno, A. Lord, and S. J. B. Yoo, “Elastic optical networking: a new dawn for the optical
layer?” IEEE Communications Magazine, vol. 50, no. 2, pp. s12–s20, February 2012.
[446] B. C. Chatterjee, N. Sarma, and E. Oki, “Routing and Spectrum Allocation in Elastic Optical Networks: A
Tutorial,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1776–1800, thirdquarter 2015.
[447] G. Zhang, M. D. Leenheer, A. Morea, and B. Mukherjee, “A Survey on OFDM-Based Elastic Core Optical
Networking,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 65–87, First 2013.
[448] A. Klekamp, U. Gebhard, and F. Ilchmann, “Energy and Cost Efficiency of Adaptive and Mixed-Line-Rate
IP Over DWDM Networks,” Journal of Lightwave Technology, vol. 30, no. 2, pp. 215–221, Jan 2012.
[449] T. E. El-Gorashi, X. Dong, and J. M. Elmirghani, “Green optical orthogonal frequency-division
multiplexing networks,” IET Optoelectronics, vol. 8, pp. 137–148(11), June 2014.
[450] H. Harai, H. Furukawa, K. Fujikawa, T. Miyazawa, and N. Wada, “Optical Packet and Circuit Integrated
Networks and Software Defined Networking Extension,” Journal of Lightwave Technology, vol. 32, no. 16, pp.
2751–2759, Aug 2014.
[451] IT center, Intel, “Big Data in the Cloud: Converging Technologies,” Solution Brief, 2015.
[452] Project Serengeti: There’s a Virtual Elephant in my Datacenter. (Cited on 2018, May). [Online]. Available:
https://octo.vmware.com/project-serengeti-theres-a-virtual-elephant-in-my-datacenter/
[453] A. Iordache, C. Morin, N. Parlavantzas, E. Feller, and P. Riteau, “Resilin: Elastic MapReduce over Multiple
Clouds,” in 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, May 2013,
pp. 261–268.
[454] D. Wang and J. Liu, “Optimizing big data processing performance in the public cloud: opportunities and
approaches,” Network, IEEE, vol. 29, no. 5, pp. 31–35, September 2015.
[455] D. Agrawal, S. Das, and A. El Abbadi, “Big Data and Cloud Computing: Current State and Future
Opportunities,” in Proceedings of the 14th International Conference on Extending Database Technology, ser.
EDBT/ICDT ’11. New York, NY, USA: ACM, 2011, pp. 530–533.
[456] N. C. Luong, P. Wang, D. Niyato, Y. Wen, and Z. Han, “Resource Management in Cloud Networking Using
Economic Analysis and Pricing Models: A Survey,” IEEE Communications Surveys Tutorials, vol. 19, no. 2, pp.
954–1001, Secondquarter 2017.
[457] Y. Zhao, X. Fei, I. Raicu, and S. Lu, “Opportunities and Challenges in Running Scientific Workflows on
the Cloud,” in 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge
Discovery, Oct 2011, pp. 455–462.
[458] E.-S. Jung and R. Kettimuthu, “Challenges and Opportunities for Data-Intensive Computing in the Cloud,”
Computer, vol. 47, no. 12, pp. 82–85, 2014.
[459] M. H. Ghahramani, M. Zhou, and C. T. Hon, “Toward cloud computing QoS architecture: analysis of cloud
systems and cloud services,” IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 1, pp. 6–18, Jan 2017.
[460] Latency is Everywhere and it Costs you Sales - How to Crush it. (Cited on 2017, Dec). [Online]. Available:
http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
[461] R. Kohavi, R. M. Henne, and D. Sommerfield, “Practical Guide to Controlled Experiments on the Web:
Listen to Your Customers Not to the Hippo,” in Proceedings of the 13th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, ser. KDD ’07. New York, NY, USA: ACM, 2007, pp. 959–967.
[462] S. S. Krishnan and R. K. Sitaraman, “Video Stream Quality Impacts Viewer Behavior: Inferring Causality
Using Quasi-experimental Designs,” in Proceedings of the 2012 Internet Measurement Conference, ser. IMC ’12.
New York, NY, USA: ACM, 2012, pp. 211–224.
[463] C. Colman-Meixner, C. Develder, M. Tornatore, and B. Mukherjee, “A Survey on Resiliency Techniques
in Cloud Computing Infrastructures and Applications,” IEEE Communications Surveys Tutorials, vol. 18, no. 3,
pp. 2244–2281, thirdquarter 2016.
[464] A. Vishwanath, F. Jalali, K. Hinton, T. Alpcan, R. W. A. Ayre, and R. S. Tucker, “Energy Consumption
Comparison of Interactive Cloud-Based and Local Applications,” IEEE Journal on Selected Areas in
Communications, vol. 33, no. 4, pp. 616–626, April 2015.
[465] J. Baliga, R. W. A. Ayre, K. Hinton, and R. S. Tucker, “Green Cloud Computing: Balancing Energy in
Processing, Storage, and Transport,” Proceedings of the IEEE, vol. 99, no. 1, pp. 149–167, Jan 2011.
[466] H. Zhang, Q. Zhang, Z. Zhou, X. Du, W. Yu, and M. Guizani, “Processing geo-dispersed big data in an
advanced mapreduce framework,” Network, IEEE, vol. 29, no. 5, pp. 24–30, September 2015.
[467] A. Vulimiri, C. Curino, P. B. Godfrey, T. Jungblut, J. Padhye, and G. Varghese, “Global Analytics in the
Face of Bandwidth and Regulatory Constraints,” in Proceedings of the 12th USENIX Conference on Networked
Systems Design and Implementation, ser. NSDI’15. Berkeley, CA, USA: USENIX Association, 2015, pp. 323–
336.
[468] F. Idzikowski, L. Chiaraviglio, A. Cianfrani, J. L. Vizcaíno, M. Polverini, and Y. Ye, “A Survey on Energy-
Aware Design and Operation of Core Networks,” IEEE Communications Surveys Tutorials, vol. 18, no. 2, pp.
1453–1499, Secondquarter 2016.
[469] W. V. Heddeghem, B. Lannoo, D. Colle, M. Pickavet, and P. Demeester, “A Quantitative Survey of the
Power Saving Potential in IP-Over-WDM Backbone Networks,” IEEE Communications Surveys Tutorials, vol.
18, no. 1, pp. 706–731, Firstquarter 2016.
[470] M. N. Dharmaweera, R. Parthiban, and Y. A. Şekercioğlu, “Toward a Power-Efficient Backbone Network:
The State of Research,” IEEE Communications Surveys Tutorials, vol. 17, no. 1, pp. 198–227, Firstquarter 2015.
[471] R. S. Tucker, “Green Optical Communications—Part I: Energy Limitations in Transport,” IEEE Journal of
Selected Topics in Quantum Electronics, vol. 17, no. 2, pp. 245–260, March 2011.
[472] R. S. Tucker, “Green Optical Communications—Part II: Energy Limitations in Networks,” IEEE Journal of
Selected Topics in Quantum Electronics, vol. 17, no. 2, pp. 261–274, March 2011.
[473] W. V. Heddeghem, M. D. Groote, W. Vereecken, D. Colle, M. Pickavet, and P. Demeester, “Energy-
efficiency in telecommunications networks: Link-by-link versus end-to-end grooming,” in 2010 14th Conference
on Optical Network Design and Modeling (ONDM), Feb 2010, pp. 1–6.
[474] A. Fehske, G. Fettweis, J. Malmodin, and G. Biczok, “The global footprint of mobile communications: The
ecological and economic perspective,” IEEE Communications Magazine, vol. 49, no. 8, pp. 55–62, August 2011.
[475] H. A. Alharbi, M. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Real-Time Emissions of Telecom
Core Networks,” in 2018 20th International Conference on Transparent Optical Networks (ICTON), July 2018,
pp. 1–4.
[476] D. Feng, C. Jiang, G. Lim, L. J. Cimini, G. Feng, and G. Y. Li, “A survey of energy-efficient wireless
communications,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 167–178, First 2013.
[477] L. Budzisz, F. Ganji, G. Rizzo, M. A. Marsan, M. Meo, Y. Zhang, G. Koutitas, L. Tassiulas, S. Lambert, B.
Lannoo, M. Pickavet, A. Conte, I. Haratcherev, and A. Wolisz, “Dynamic Resource Provisioning for Energy
Efficiency in Wireless Access Networks: A Survey and an Outlook,” IEEE Communications Surveys Tutorials,
vol. 16, no. 4, pp. 2259–2285, Fourthquarter 2014.
[478] M. Ismail, W. Zhuang, E. Serpedin, and K. Qaraqe, “A Survey on Green Mobile Networking: From The
Perspectives of Network Operators and Mobile Users,” IEEE Communications Surveys Tutorials, vol. 17, no. 3,
pp. 1535–1556, thirdquarter 2015.
[479] K. Gomez, R. Riggio, T. Rasheed, and F. Granelli, “Analysing the energy consumption behaviour of WiFi
networks,” in 2011 IEEE Online Conference on Green Communications, Sept 2011, pp. 98–104.
[480] S. Xiao, X. Zhou, D. Feng, Y. Yuan-Wu, G. Y. Li, and W. Guo, “Energy-Efficient Mobile Association in
Heterogeneous Networks With Device-to-Device Communications,” IEEE Transactions on Wireless
Communications, vol. 15, no. 8, pp. 5260–5271, Aug 2016.
[481] A. Abrol and R. K. Jha, “Power Optimization in 5G Networks: A Step Towards GrEEn Communication,”
IEEE Access, vol. 4, pp. 1355–1374, 2016.
[482] L. Valcarenghi, D. P. Van, P. G. Raponi, P. Castoldi, D. R. Campelo, S. Wong, S. Yen, L. G. Kazovsky,
and S. Yamashita, “Energy efficiency in passive optical networks: where, when, and how?” IEEE Network, vol.
26, no. 6, pp. 61–68, November 2012.
[483] J. Kani, S. Shimazu, N. Yoshimoto, and H. Hadama, “Energy-efficient optical access networks: issues and
technologies,” IEEE Communications Magazine, vol. 51, no. 2, pp. S22–S26, February 2013.
[484] P. Vetter, D. Suvakovic, H. Chow, P. Anthapadmanabhan, K. Kanonakis, K. Lee, F. Saliou, X. Yin, and B.
Lannoo, “Energy-efficiency improvements for optical access,” IEEE Communications Magazine, vol. 52, no. 4,
pp. 136–144, April 2014.
[485] B. Skubic, E. I. de Betou, T. Ayhan, and S. Dahlfort, “Energy-efficient next-generation optical access
networks,” IEEE Communications Magazine, vol. 50, no. 1, pp. 122–127, January 2012.
[486] E. Goma, M. Canini, A. Lopez, N. Laoutaris, D. Kostic, P. Rodriguez, R. Stanojevic, and P. Yague,
“Insomnia in the Access (or How to Curb Access Network Related Energy Consumption),” Proceedings of the
ACM SIGCOMM 2011 Conference on Applications, Technologies, Architectures, and Protocols for Computer
Communications, 2011.
[487] A. S. Gowda, A. R. Dhaini, L. G. Kazovsky, H. Yang, S. T. Abraha, and A. Ng’oma, “Towards Green
Optical/Wireless In-Building Networks: Radio-Over-Fiber,” Journal of Lightwave Technology, vol. 32, no. 20,
pp. 3545–3556, Oct 2014.
[488] B. Kantarci and H. T. Mouftah, “Energy efficiency in the extended-reach fiber-wireless access networks,”
IEEE Network, vol. 26, no. 2, pp. 28–35, March 2012.
[489] M. Gupta and S. Singh, “Greening of the Internet,” in Proceedings of the 2003 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications, ser. SIGCOMM ’03. New York, NY,
USA: ACM, 2003, pp. 19–26.
[490] J. C. C. Restrepo, C. G. Gruber, and C. M. Machuca, “Energy Profile Aware Routing,” in 2009 IEEE
International Conference on Communications Workshops, June 2009, pp. 1–5.
[491] S. Nedevschi, L. Popa, G. Iannaccone, S. Ratnasamy, and D. Wetherall, “Reducing Network Energy
Consumption via Sleeping and Rate-adaptation,” in Proceedings of the 5th USENIX Symposium on Networked
Systems Design and Implementation, ser. NSDI’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 323–
336.
[492] G. Shen and R. Tucker, “Energy-Minimized Design for IP Over WDM Networks,” IEEE/OSA Journal of
Optical Communications and Networking, vol. 1, no. 1, pp. 176–186, June 2009.
[493] X. Dong, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On the Energy Efficiency of Physical Topology
Design for IP Over WDM Networks,” Journal of Lightwave Technology, vol. 30, no. 12, pp. 1931–1942, June
2012.
[494] S. Zhang, D. Shen, and C. K. Chan, “Energy-Efficient Traffic Grooming in WDM Networks With Scheduled
Time Traffic,” Journal of Lightwave Technology, vol. 29, no. 17, pp. 2577–2584, Sept 2011.
[495] Z. H. Nasralla, T. E. H. El-Gorashi, M. O. I. Musa, and J. M. H. Elmirghani, “Energy-Efficient Traffic
Scheduling in IP over WDM Networks,” in 2015 9th International Conference on Next Generation Mobile
Applications, Services and Technologies, Sept 2015, pp. 161–164.
[496] Z. H. Nasralla and T. E. H. El-Gorashi and M. O. I. Musa and J. M. H. Elmirghani, “Routing post-disaster
traffic floods in optical core networks,” in 2016 International Conference on Optical Network Design and
Modeling (ONDM), May 2016, pp. 1–5.
[497] Z. H. Nasralla, M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Routing post-disaster traffic
floods heuristics,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July 2016,
pp. 1–4.
[498] M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Network coding for energy efficiency in
bypass IP/WDM networks,” in 2016 18th International Conference on Transparent Optical Networks (ICTON),
July 2016, pp. 1–3.
[499] M. O. I. Musa and T. E. H. El-Gorashi and J. M. H. Elmirghani, “Energy efficient core networks using
network coding,” in 2015 17th International Conference on Transparent Optical Networks (ICTON), July 2015,
pp. 1–4.
[500] T. E. H. El-Gorashi, X. Dong, A. Lawey, and J. M. H. Elmirghani, “Core network physical topology design
for energy efficiency and resilience,” in 2013 15th International Conference on Transparent Optical Networks
(ICTON), June 2013, pp. 1–7.
[501] M. Musa, T. Elgorashi, and J. Elmirghani, “Energy efficient survivable IP-over-WDM networks with
network coding,” IEEE/OSA Journal of Optical Communications and Networking, vol. 9, no. 3, pp. 207–217,
March 2017.
[502] M. Musa and T. Elgorashi and J. Elmirghani, “Bounds for energy efficient survivable IP over WDM
networks with network coding,” IEEE/OSA Journal of Optical Communications and Networking, vol. 10, no. 5,
pp. 471–481, May 2018.
[503] Y. Li, L. Zhu, S. K. Bose, and G. Shen, “Energy-Saving in IP Over WDM Networks by Putting Protection
Router Cards to Sleep,” Journal of Lightwave Technology, vol. 36, no. 14, pp. 3003–3017, July 2018.
[504] J. M. H. Elmirghani, L. Nonde, A. Q. Lawey, T. E. H. El-Gorashi, M. O. I. Musa, X. Dong, K. Hinton, and
T. Klein, “Energy efficiency measures for future core networks,” in 2017 Optical Fiber Communications
Conference and Exhibition (OFC), March 2017, pp. 1–3.
[505] J. M. H. Elmirghani, T. Klein, K. Hinton, L. Nonde, A. Q. Lawey, T. E. H. El-Gorashi, M. O. I. Musa, and
X. Dong, “GreenTouch GreenMeter core network energy-efficiency improvement measures and optimization,”
IEEE/OSA Journal of Optical Communications and Networking, vol. 10, no. 2, pp. A250–A269, Feb 2018.
[506] M. O. I. Musa, T. El-Gorashi, and J. M. H. Elmirghani, “Bounds on GreenTouch GreenMeter Network
Energy Efficiency,” Journal of Lightwave Technology, pp. 1–1, 2018.
[507] X. Dong, T. El-Gorashi, and J. Elmirghani, “IP Over WDM Networks Employing Renewable Energy
Sources,” Journal of Lightwave Technology, vol. 29, no. 1, pp. 3–14, Jan 2011.
[508] X. Dong, T. El-Gorashi, and J. M. H. Elmirghani, “Green IP over WDM Networks: Solar and Wind
Renewable Sources and Data Centres,” in 2011 IEEE Global Telecommunications Conference – GLOBECOM
2011, Dec 2011, pp. 1–6.
[509] G. Shen, Y. Lui, and S. K. Bose, ““Follow the Sun, Follow the Wind” Lightpath Virtual Topology
Reconfiguration in IP Over WDM Network,” Journal of Lightwave Technology, vol. 32, no. 11, pp. 2094–2105,
June 2014.
[510] M. Gattulli, M. Tornatore, R. Fiandra, and A. Pattavina, “Low-Emissions Routing for Cloud Computing in
IP-over-WDM Networks with Data Centers,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 1,
pp. 28–38, January 2014.
[511] A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Renewable energy in distributed energy
efficient content delivery clouds,” in 2015 IEEE International Conference on Communications (ICC), June 2015,
pp. 128–134.
[512] S. K. Dey and A. Adhya, “Delay-aware green service migration schemes for data center traffic,” IEEE/OSA
Journal of Optical Communications and Networking, vol. 8, no. 12, pp. 962–975, December 2016.
[513] L. Nonde, T. E. H. Elgorashi, and J. M. H. Elmirgahni, “Virtual Network Embedding Employing Renewable
Energy Sources,” in 2016 IEEE Global Communications Conference (GLOBECOM), Dec 2016, pp. 1–6.
[514] C. Ge, Z. Sun, and N. Wang, “A Survey of Power-Saving Techniques on Data Centers and Content Delivery
Networks,” IEEE Communications Surveys Tutorials, vol. 15, no. 3, pp. 1334–1354, Third 2013.
[515] C. Fang, F. R. Yu, T. Huang, J. Liu, and Y. Liu, “A Survey of Green Information-Centric Networking:
Research Issues and Challenges,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1455–1472,
thirdquarter 2015.
[516] X. Dong, T. El-Gorashi, and J. Elmirghani, “Green IP Over WDM Networks With Data Centers,” Journal
of Lightwave Technology, vol. 29, no. 12, pp. 1861–1880, June 2011.
[517] V. Valancius, N. Laoutaris, L. Massoulié, C. Diot, and P. Rodriguez, “Greening the Internet with Nano Data
Centers,” in Proceedings of the 5th International Conference on Emerging Networking Experiments and
Technologies, ser. CoNEXT ’09. New York, NY, USA: ACM, 2009, pp. 37–48.
[518] C. Jayasundara, A. Nirmalathas, E. Wong, and C. Chan, “Improving Energy Efficiency of Video on Demand
Services,” IEEE/OSA Journal of Optical Communications and Networking, vol. 3, no. 11, pp. 870–880,
November 2011.
[519] N. I. Osman, T. El-Gorashi, and J. M. H. Elmirghani, “Reduction of energy consumption of Video-on-
Demand services using cache size optimization,” in 2011 Eighth International Conference on Wireless and Optical
Communications Networks, May 2011, pp. 1–5.
[520] N. I. Osman and T. El-Gorashi and J. M. H. Elmirghani, “The impact of content popularity distribution on
energy efficient caching,” in 2013 15th International Conference on Transparent Optical Networks (ICTON), June
2013, pp. 1–6.
[521] N. I. Osman, T. El-Gorashi, L. Krug, and J. M. H. Elmirghani, “Energy-Efficient Future High-Definition
TV,” Journal of Lightwave Technology, vol. 32, no. 13, pp. 2364–2381, July 2014.
[522] A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “BitTorrent Content Distribution in Optical
Networks,” Journal of Lightwave Technology, vol. 32, no. 21, pp. 4209–4225, Nov 2014.
[523] A. Lawey, T. El-Gorashi, and J. Elmirghani, “Distributed Energy Efficient Clouds Over Core Networks,”
Journal of Lightwave Technology, vol. 32, no. 7, pp. 1261–1281, April 2014.
[524] H. A. Alharbi, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy efficient virtual
machines placement in IP over WDM networks,” in 2017 19th International Conference on Transparent Optical
Networks (ICTON), July 2017, pp. 1–4.
[525] U. Wajid, C. Cappiello, P. Plebani, B. Pernici, N. Mehandjiev, M. Vitali, M. Gienger, K. Kavoussanakis, D.
Margery, D. Perez, and P. Sampaio, “On Achieving Energy Efficiency and Reducing CO2 Footprint in Cloud
Computing,” IEEE Transactions on Cloud Computing, vol. PP, no. 99, pp. 1–1, 2015.
[526] A. Al-Salim, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Energy Efficient Tapered Data Networks for
Big Data processing in IP/WDM networks,” in 2015 17th International Conference on Transparent Optical
Networks (ICTON), July 2015, pp. 1–5.
[527] A. M. Al-Salim, H. M. M. Ali, A. Q. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Greening big data
networks: Volume impact,” in 2016 18th International Conference on Transparent Optical Networks (ICTON),
July 2016, pp. 1–6.
[528] L. A. Barroso and U. Hoelzle, The Datacenter As a Computer: An Introduction to the Design of Warehouse-
Scale Machines, 1st ed. Morgan and Claypool Publishers, 2009.
[529] T. Wang, Z. Su, Y. Xia, and M. Hamdi, “Rethinking the Data Center Networking: Architecture, Network
Protocols, and Resource Sharing,” IEEE Access, vol. 2, pp. 1481–1496, 2014.
[530] A. Hammadi and L. Mhamdi, “A survey on architectures and energy efficiency in data center networks,”
Computer Communications, vol. 40, pp. 1 – 21, 2014.
[531] W. Xia, P. Zhao, Y. Wen, and H. Xie, “A Survey on Data Center Networking (DCN): Infrastructure and
Operations,” IEEE Communications Surveys Tutorials, vol. PP, no. 99, pp. 1–1, 2016.
[532] T. Chen, X. Gao, and G. Chen, “The features, hardware, and architectures of data center networks: A
survey,” Journal of Parallel and Distributed Computing, vol. 96, pp. 45 – 74, 2016.
[533] Y. Liu, J. K. Muppala, M. Veeraraghavan, D. Lin, and M. Hamdi, “Data Center Networks,” in
SpringerBriefs in Computer Science, 2013.
[534] K. Bilal, S. U. R. Malik, O. Khalid, A. Hameed, E. Alvarez, V. Wijaysekara, R. Irfan, S. Shrestha, D.
Dwivedy, M. Ali, U. S. Khan, A. Abbas, N. Jalil, and S. U. Khan, “A taxonomy and survey on green data center
networks,” Future Generation Computer Systems, vol. 36, pp. 189 – 208, 2014.
[535] B. Wang, Z. Qi, R. Ma, H. Guan, and A. V. Vasilakos, “A Survey on Data Center Networking for Cloud
Computing,” Comput. Netw., vol. 91, no. C, pp. 528–547, Nov. 2015.
[536] M. Al-Fares, A. Loukissas, and A. Vahdat, “A Scalable, Commodity Data Center Network Architecture,”
SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 63–74, Aug. 2008.
[537] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta,
“VL2: A Scalable and Flexible Data Center Network,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp.
51–62, Aug. 2009.
[538] J. Kim, W. J. Dally, and D. Abts, “Flattened Butterfly: A Cost-efficient Topology for High-radix Networks,”
SIGARCH Comput. Archit. News, vol. 35, no. 2, pp. 126–137, Jun. 2007.
[539] J. H. Ahn, N. Binkert, A. Davis, M. McLaren, and R. S. Schreiber, “HyperX: topology, routing, and
packaging of efficient large-scale networks,” in Proceedings of the Conference on High Performance Computing
Networking, Storage and Analysis, Nov 2009, pp. 1–11.
[540] Pall Beck, Peter Clemens, Santiago Freitas, Jeff Gatz, Michele Girola, Jason Gmitter, Holger Mueller, Ray
O’Hanlon, Veerendra Para, Joe Robinson, Andy Sholomon, Jason Walker, and Jon Tate, IBM and Cisco: Together
for a World Class Data Center. IBM Redbooks, 2013.
[541] A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Felderman,
P. Germano, A. Kanagala, J. Provost, J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, and A. Vahdat,
“Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network,” in
SIGCOMM ’15, 2015.
[542] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, “BCube: A High Performance,
Server-centric Network Architecture for Modular Data Centers,” SIGCOMM Comput. Commun.
Rev., vol. 39, no. 4, pp. 63–74, Aug. 2009.
[543] H. Wu, G. Lu, D. Li, C. Guo, and Y. Zhang, “MDCube: A High Performance Network Structure for Modular
Data Center Interconnection,” in Proceedings of the 5th International Conference on Emerging Networking
Experiments and Technologies, ser. CoNEXT ’09. New York, NY, USA: ACM, 2009, pp. 25–36.
[544] H. Abu-Libdeh, P. Costa, A. Rowstron, G. O’Shea, and A. Donnelly, “Symbiotic Routing in Future Data
Centers,” SIGCOMM Comput. Commun. Rev., vol. 40, no. 4, pp. 51–62, Aug. 2010.
[545] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “DCell: A Scalable and Fault-Tolerant Network
Structure for Data Centers,” in SIGCOMM ’08. Association for Computing Machinery, Inc., August 2008.
[546] D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, and S. Lu, “FiConn: Using Backup Port for Server Interconnection
in Data Centers,” in IEEE INFOCOM 2009, April 2009, pp. 2276–2285.
[547] A. Singla, C. Hong, L. Popa, and P. B. Godfrey, “Jellyfish: Networking Data Centers Randomly,” CoRR,
vol. abs/1110.1687, 2011.
[548] L. Gyarmati and T. A. Trinh, “Scafida: A Scale-free Network Inspired Data Center Architecture,”
SIGCOMM Comput. Commun. Rev., vol. 40, no. 5, pp. 4–12, Oct. 2010.
[549] J.-Y. Shin, B. Wong, and E. G. Sirer, “Small-world Datacenters,” in Proceedings of the 2nd ACM
Symposium on Cloud Computing, ser. SOCC ’11. New York, NY, USA: ACM, 2011, pp. 2:1–2:13.
[550] E. Baccour, S. Foufou, R. Hamila, and M. Hamdi, “A survey of wireless data center networks,” in 2015
49th Annual Conference on Information Sciences and Systems (CISS), March 2015, pp. 1–6.
[551] A. S. Hamza, J. S. Deogun, and D. R. Alexander, “Wireless Communication in Data Centers: A Survey,”
IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1572–1595, thirdquarter 2016.
[552] C. Kachris and I. Tomkos, “A survey on optical interconnects for data centers,” IEEE Communications
Surveys Tutorials, vol. 14, no. 4, pp. 1021–1036, Fourth 2012.
[553] M. Chen, H. Jin, Y. Wen, and V. Leung, “Enabling technologies for future data center networking: a
primer,” IEEE Network, vol. 27, no. 4, pp. 8–15, July 2013.
[554] L. Schares, D. M. Kuchta, and A. F. Benner, “Optics in Future Data Center Networks,” in 2010 18th IEEE
Symposium on High Performance Interconnects, Aug 2010, pp. 104–108.
[555] H. Ballani, P. Costa, I. Haller, K. Jozwik, K. Shi, B. Thomsen, and H. Williams, “Bridging the Last Mile
for Optical Switching in Data Centers,” in 2018 Optical Fiber Communications Conference and Exposition (OFC),
March 2018, pp. 1–3.
[556] G. Papen, “Optical components for datacenters,” in 2017 Optical Fiber Communications Conference and
Exhibition (OFC), March 2017, pp. 1–53.
[557] L. Chen, E. Hall, L. Theogarajan, and J. Bowers, “Photonic Switching for Data Center Applications,” IEEE
Photonics Journal, vol. 3, no. 5, pp. 834–844, Oct 2011.
[558] P. N. Ji, D. Qian, K. Kanonakis, C. Kachris, and I. Tomkos, “Design and Evaluation of a Flexible-Bandwidth
OFDM-Based Intra-Data Center Interconnect,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 19,
no. 2, pp. 3700310–3700310, March 2013.
[559] C. Kachris and I. Tomkos, “Power consumption evaluation of hybrid WDM PON networks for data centers,”
in 2011 16th European Conference on Networks and Optical Communications, July 2011, pp. 118–121.
[560] Y. Cheng, M. Fiorani, R. Lin, L. Wosinska, and J. Chen, “POTORI: a passive optical top-of-rack
interconnect architecture for data centers,” IEEE/OSA Journal of Optical Communications and Networking, vol.
9, no. 5, pp. 401–411, May 2017.
[561] H. Liu, F. Lu, A. Forencich, R. Kapoor, M. Tewari, G. M. Voelker, G. Papen, A. C. Snoeren, and G. Porter,
“Circuit Switching Under the Radar with REACToR,” in 11th USENIX Symposium on Networked
Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX Association, 2014, pp. 1–15.
[562] J. Elmirghani, T. El-Gorashi, and A. Hammadi, “Passive optical-based data center networks,” 2016,
WO Patent App. PCT/GB2015/053,604. [Online]. Available:
http://google.com/patents/WO2016083812A1?cl=und
[563] A. Hammadi, T. El-Gorashi, and J. Elmirghani, “High performance AWGR PONs in data centre networks,”
in 2015 17th International Conference on Transparent Optical Networks (ICTON), July 2015,
pp. 1–5.
[564] R. Alani, A. Hammadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “PON data centre design with AWGR
and server based routing,” in 2017 19th International Conference on Transparent Optical Networks
(ICTON), July 2017, pp. 1–4.
[565] A. Hammadi, T. E. H. El-Gorashi, M. O. I. Musa, and J. M. H. Elmirghani, “Server-centric PON data center
architecture,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July 2016, pp.
1–4.
[566] A. Hammadi, M. Musa, T. E. H. El-Gorashi, and J. H. Elmirghani, “Resource provisioning for cloud PON
AWGR-based data center architecture,” in 2016 21st European Conference on Networks and Optical
Communications (NOC), June 2016, pp. 178–182.
[567] A. Hammadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy-efficient software-defined AWGR-
based PON data center network,” in 2016 18th International Conference on Transparent Optical Networks
(ICTON), July 2016, pp. 1–5.
[568] A. E. A. Eltraify, M. O. I. Musa, A. Al-Quzweeni, and J. M. H. Elmirghani, “Experimental Evaluation of
Passive Optical Network Based Data Centre Architecture,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–4.
[569] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficiency of Server-Centric PON
Data Center Architecture for Fog Computing,” in 2018 20th International Conference on Transparent Optical
Networks (ICTON), July 2018, pp. 1–4.
[570] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Impact of Link Failures on the Performance
of MapReduce in Data Center Networks,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–4.
[571] R. A. T. Alani, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Virtual Machines Embedding for Cloud PON
AWGR and Server Based Data Centres,” arXiv e-prints, p. arXiv:1904.03298, Apr 2019.
[572] A. E. A. Eltraify, M. O. I. Musa, A. Al-Quzweeni, and J. M. H. Elmirghani, “Experimental Evaluation of
Server Centric Passive Optical Network Based Data Centre Architecture,” arXiv e-prints, p. arXiv:1904.04580,
Apr 2019.
[573] A. E. A. Eltraify, M. O. I. Musa, and J. M. H. Elmirghani, “TDM/WDM over AWGR Based Passive Optical
Network Data Centre Architecture,” arXiv e-prints, p. arXiv:1904.04581, Apr 2019.
[574] H. Yang, J. Zhang, Y. Zhao, J. Han, Y. Lin, and Y. Lee, “SUDOI: software defined networking for
ubiquitous data center optical interconnection,” IEEE Communications Magazine, vol. 54, no. 2, pp. 86–95,
February 2016.
[575] M. Fiorani, S. Aleksic, M. Casoni, L. Wosinska, and J. Chen, “Energy-Efficient Elastic Optical Interconnect
Architecture for Data Centers,” IEEE Communications Letters, vol. 18, no. 9, pp. 1531–1534, Sep.
2014.
[576] Z. Cao, R. Proietti, M. Clements, and S. J. B. Yoo, “Experimental Demonstration of Flexible Bandwidth
Optical Data Center Core Network With All-to-All Interconnectivity,” Journal of Lightwave Technology, vol. 33,
no. 8, pp. 1578–1585, April 2015.
[577] S. J. B. Yoo, Y. Yin, and K. Wen, “Intra and inter datacenter networking: The role of optical packet
switching and flexible bandwidth optical networking,” in 2012 16th International Conference on Optical Network
Design and Modelling (ONDM), April 2012, pp. 1–6.
[578] H. J. S. Dorren, S. Di Lucente, J. Luo, O. Raz, and N. Calabretta, “Scaling photonic packet switches to a
large number of ports [invited],” IEEE/OSA Journal of Optical Communications and Networking, vol. 4, no. 9,
pp. A82–A89, Sep. 2012.
[579] F. Yan, W. Miao, O. Raz, and N. Calabretta, “Opsquare: A flat DCN architecture based on flow-controlled
optical packet switches,” IEEE/OSA Journal of Optical Communications and Networking, vol. 9, no. 4, pp. 291–
303, April 2017.
[580] X. Yu, H. Gu, K. Wang, and S. Ma, “Petascale: A Scalable Buffer-Less All-Optical Network for Cloud
Computing Data Center,” IEEE Access, vol. 7, pp. 42596–42608, 2019.
[581] G. Wang, D. G. Andersen, M. Kaminsky, K. Papagiannaki, T. E. Ng, M. Kozuch, and M. Ryan, “c-Through:
part-time optics in data centers,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, Aug. 2010.
[582] N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A.
Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” SIGCOMM Comput.
Commun. Rev., vol. 41, no. 4, Aug. 2010.
[583] G. Porter, R. Strong, N. Farrington, A. Forencich, P. Chen-Sun, T. Rosing, Y. Fainman, G. Papen, and A.
Vahdat, “Integrating Microsecond Circuit Switching into the Data Center,” SIGCOMM Comput. Commun.
Rev., vol. 43, no. 4, pp. 447–458, Aug. 2013.
[584] A. Singla, A. Singh, and Y. Chen, “OSA: An Optical Switching Architecture for Data Center Networks
with Unprecedented Flexibility,” in Presented as part of the 9th USENIX Symposium on Networked Systems
Design and Implementation (NSDI 12). San Jose, CA: USENIX, 2012, pp. 239–252.
[585] A. Singla, A. Singh, K. Ramachandran, L. Xu, and Y. Zhang, “Proteus: A Topology Malleable Data Center
Network,” in Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks, ser. Hotnets-IX.
New York, NY, USA: ACM, 2010, pp. 8:1–8:6.
[586] X. Ye, Y. Yin, S. Yoo, P. Mejia, R. Proietti, and V. Akella, “DOS – A scalable optical switch for
datacenters,” in 2010 ACM/IEEE Symposium on Architectures for Networking and Communications Systems
(ANCS), Oct 2010, pp. 1–12.
[587] K. Xia, Y. H. Kao, M. Yang, and H. J. Chao, “Petabit optical switch for data center networks,”
Polytechnic Institute of NYU, Tech. Rep., 2010.
[588] N. Hamedazimi, Z. Qazi, H. Gupta, V. Sekar, S. R. Das, J. P. Longtin, H. Shah, and A. Tanwer, “FireFly:
A Reconfigurable Wireless Data Center Fabric Using Free-space Optics,” SIGCOMM Comput. Commun.
Rev., vol. 44, no. 4, pp. 319–330, Aug. 2014.
[589] N. Hamedazimi, H. Gupta, V. Sekar, and S. R. Das, “Patch Panels in the Sky: A Case for Free-space Optics
in Data Centers,” in Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks, ser. HotNets-XII.
New York, NY, USA: ACM, 2013, pp. 23:1–23:7.
[590] A. Roozbeh, J. Soares, G. Q. Maguire, F. Wuhib, C. Padala, M. Mahloo, D. Turull, V. Yadhav, and D.
Kostić, “Software-Defined “Hardware” Infrastructures: A Survey on Enabling Technologies and Open Research
Directions,” IEEE Communications Surveys Tutorials, vol. 20, no. 3, pp. 2454–2485, thirdquarter 2018.
[591] S. Rumley, D. Nikolova, R. Hendry, Q. Li, D. Calhoun, and K. Bergman, “Silicon Photonics for Exascale
Systems,” Journal of Lightwave Technology, vol. 33, no. 3, pp. 547–562, Feb 2015.
[592] M. A. Taubenblatt, “Optical Interconnects for High-Performance Computing,” Journal of Lightwave
Technology, vol. 30, no. 4, pp. 448–457, Feb 2012.
[593] G. Zervas, H. Yuan, A. Saljoghei, Q. Chen, and V. Mishra, “Optically disaggregated data centers with
minimal remote memory latency: Technologies, architectures, and resource allocation [Invited],” IEEE/OSA
Journal of Optical Communications and Networking, vol. 10, no. 2, pp. A270–A285, Feb 2018.
[594] G. M. Saridis, Y. Yan, Y. Shu, S. Yan, M. Arslan, T. Bradley, N. V. Wheeler, N. H. L. Wong, F. Poletti,
M. N. Petrovich, D. J. Richardson, S. Poole, G. Zervas, and D. Simeonidou, “EVROS: All-optical programmable
disaggregated data centre interconnect utilizing hollow-core bandgap fibre,” in 2015 European Conference on
Optical Communication (ECOC), Sep. 2015, pp. 1–3.
[595] O. O. Ajibola, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Disaggregation for Improved Efficiency in
Fog Computing Era,” arXiv e-prints, p. arXiv:1904.01311, Apr 2019.
[596] O. O. Ajibola, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On Energy Efficiency of Networks for
Composable Datacentre Infrastructures,” in 2018 20th International Conference on Transparent Optical Networks
(ICTON), July 2018, pp. 1–5.
[597] H. M. M. Ali, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Future Energy Efficient Data
Centers with Disaggregated Servers,” Journal of Lightwave Technology, vol. PP, no. 99, pp. 1–1, 2017.
[598] H. Mohammad Ali, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Energy efficient disaggregated servers
for future data centers,” in 2015 20th European Conference on Networks and Optical Communications (NOC),
June 2015, pp. 1–6.
[599] H. M. M. Ali, A. M. Al-Salim, A. Q. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient
resource provisioning with VM migration heuristic for Disaggregated Server design,” in 2016 18th International
Conference on Transparent Optical Networks (ICTON), July 2016, pp. 1–5.
[600] S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken, “The Nature of Data Center Traffic:
Measurements & Analysis,” in Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement
Conference, ser. IMC ’09. New York, NY, USA: ACM, 2009, pp. 202–208.
[601] T. Benson, A. Anand, A. Akella, and M. Zhang, “Understanding Data Center Traffic Characteristics,”
SIGCOMM Comput. Commun. Rev., vol. 40, no. 1, pp. 92–99, Jan. 2010.
[602] T. Benson, A. Akella, and D. A. Maltz, “Network Traffic Characteristics of Data Centers in the Wild,” in
Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, ser. IMC ’10. New York, NY,
USA: ACM, 2010, pp. 267–280.
[603] A. Roy, H. Zeng, J. Bagga, G. Porter, and A. C. Snoeren, “Inside the Social Network’s (Datacenter)
Network,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 5, pp. 123–137, Aug. 2015.
[604] Q. Zhang, V. Liu, H. Zeng, and A. Krishnamurthy, “High-resolution Measurement of Data Center
Microbursts,” in Proceedings of the 2017 Internet Measurement Conference, ser. IMC ’17. New York, NY, USA:
ACM, 2017, pp. 78–85.
[605] D. A. Popescu and A. W. Moore, “A First Look at Data Center Network Condition Through The Eyes of
PTPmesh,” in 2018 Network Traffic Measurement and Analysis Conference (TMA), June 2018, pp. 1–8.
[606] Y. Peng, K. Chen, G. Wang, W. Bai, Y. Zhao, H. Wang, Y. Geng, Z. Ma, and L. Gu, “Towards
Comprehensive Traffic Forecasting in Cloud Computing: Design and Application,” IEEE/ACM Transactions on
Networking, vol. PP, no. 99, pp. 1–1, 2015.
[607] C. H. Liu, A. Kind, and A. V. Vasilakos, “Sketching the data center network traffic,” IEEE Network, vol.
27, no. 4, pp. 33–39, July 2013.
[608] Z. Hu, Y. Qiao, J. Luo, P. Sun, and Y. Wen, “CREATE: Correlation enhanced traffic matrix estimation in
Data Center Networks,” in 2014 IFIP Networking Conference, June 2014, pp. 1–9.
[609] Y. Han, J. Yoo, and J. W. Hong, “Poisson shot-noise process based flow-level traffic matrix generation for
data center networks,” in 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM),
May 2015, pp. 450–457.
[610] C. Delimitrou, S. Sankar, A. Kansal, and C. Kozyrakis, “ECHO: Recreating network traffic maps for
datacenters with tens of thousands of servers,” in 2012 IEEE International Symposium on Workload
Characterization (IISWC), Nov 2012, pp. 14–24.
[611] M. Noormohammadpour and C. S. Raghavendra, “Datacenter Traffic Control: Understanding Techniques
and Tradeoffs,” IEEE Communications Surveys Tutorials, vol. 20, no. 2, pp. 1492–1525, Secondquarter 2018.
[612] K. Chen, C. Hu, X. Zhang, K. Zheng, Y. Chen, and A. V. Vasilakos, “Survey on routing in data centers:
insights and future directions,” IEEE Network, vol. 25, no. 4, pp. 6–10, July 2011.
[613] R. Rojas-Cessa, Y. Kaymak, and Z. Dong, “Schemes for Fast Transmission of Flows in Data Center
Networks,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1391–1422, thirdquarter 2015.
[614] J. Qadir, A. Ali, K. A. Yau, A. Sathiaseelan, and J. Crowcroft, “Exploiting the Power of Multiplicity: A
Holistic Survey of Network-Layer Multipath,” IEEE Communications Surveys Tutorials, vol. 17, no. 4, pp. 2176–
2213, Fourthquarter 2015.
[615] J. Zhang, F. R. Yu, S. Wang, T. Huang, Z. Liu, and Y. Liu, “Load Balancing in Data Center Networks: A
Survey,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[616] Y. Zhang and N. Ansari, “On Architecture Design, Congestion Notification, TCP Incast and Power
Consumption in Data Centers,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 39–64, First 2013.
[617] M. Alizadeh, T. Edsall, S. Dharmapurikar, R. Vaidyanathan, K. Chu, A. Fingerhut, V. T. Lam, F. Matus,
R. Pan, N. Yadav, and G. Varghese, “CONGA: Distributed Congestion-aware Load Balancing for Datacenters,”
SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 503–514, Aug. 2014.
[618] X. Wu and X. Yang, “DARD: Distributed Adaptive Routing for Datacenter Networks,” in 2012 IEEE 32nd
International Conference on Distributed Computing Systems (ICDCS), June 2012, pp. 32–41.
[619] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan,
“Data center TCP (DCTCP),” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. –, Aug. 2010.
[620] C. Raiciu, S. Barre, C. Pluntke, A. Greenhalgh, D. Wischik, and M. Handley, “Improving Datacenter
Performance and Robustness with Multipath TCP,” in Proceedings of the ACM SIGCOMM 2011 Conference,
ser. SIGCOMM ’11. New York, NY, USA: ACM, 2011, pp. 266–277.
[621] B. Vamanan, J. Hasan, and T. Vijaykumar, “Deadline-aware Datacenter TCP (D2TCP),” in Proceedings of
the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for
Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: ACM, 2012, pp. 115–126.
[622] C. Wilson, H. Ballani, T. Karagiannis, and A. Rowtron, “Better never than late: Meeting deadlines in
datacenter networks,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 50–61, Aug. 2011.
[623] M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker, “pFabric: Minimal
Near-optimal Datacenter Transport,” SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 435–446, Aug.
2013.
[624] C.-Y. Hong, M. Caesar, and P. B. Godfrey, “Finishing Flows Quickly with Preemptive Scheduling,” in
Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and
Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: ACM, 2012, pp. 127–138.
[625] D. Zats, T. Das, P. Mohan, D. Borthakur, and R. Katz, “DeTail: Reducing the Flow Completion Time Tail
in Datacenter Networks,” in Proceedings of the ACM SIGCOMM 2012 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY,
USA: ACM, 2012, pp. 139–150.
[626] M. Bari, R. Boutaba, R. Esteves, L. Granville, M. Podlesny, M. Rabbani, Q. Zhang, and M. Zhani, “Data
Center Network Virtualization: A Survey,” IEEE Communications Surveys Tutorials, vol. 15, no. 2, pp. 909–928,
Second 2013.
[627] V. D. Piccolo, A. Amamou, K. Haddadou, and G. Pujolle, “A Survey of Network Isolation Solutions for
Multi-Tenant Data Centers,” IEEE Communications Surveys Tutorials, vol. 18, no. 4, pp. 2787–2821,
Fourthquarter 2016.
[628] S. Raghul, T. Subashri, and K. R. Vimal, “Literature survey on traffic-based server load balancing using
SDN and open flow,” in 2017 Fourth International Conference on Signal Processing, Communication and
Networking (ICSCN), March 2017, pp. 1–6.
[629] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, and Y. Zhang, “SecondNet: A Data Center
Network Virtualization Architecture with Bandwidth Guarantees,” in Proceedings of the 6th International
COnference, ser. Co-NEXT ’10. New York, NY, USA: ACM, 2010, pp. 15:1–15:12.
[630] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, “Hedera: Dynamic Flow
Scheduling for Data Center Networks,” in Proceedings of the 7th USENIX Conference on Networked Systems
Design and Implementation, ser. NSDI’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 19–19.
[631] B. Heller, S. Seetharaman, P. Mahadevan, Y. Yiakoumis, P. Sharma, S. Banerjee, and N. McKeown,
“ElasticTree: Saving Energy in Data Center Networks,” in Proceedings of the 7th USENIX Conference on
Networked Systems Design and Implementation, ser. NSDI’10. Berkeley, CA, USA: USENIX Association, 2010,
pp. 17–17.
[632] M. Dayarathna, Y. Wen, and R. Fan, “Data Center Energy Consumption Modeling: A Survey,” IEEE
Communications Surveys Tutorials, vol. 18, no. 1, pp. 732–794, Firstquarter 2016.
[633] D. Çavdar and F. Alagoz, “A survey of research on greening data centers,” in 2012 IEEE Global
Communications Conference (GLOBECOM), Dec 2012, pp. 3237–3242.
[634] A. C. Riekstin, B. B. Rodrigues, K. K. Nguyen, T. C. M. de Brito Carvalho, C. Meirosu, B. Stiller, and M.
Cheriet, “A Survey on Metrics and Measurement Tools for Sustainable Distributed Cloud Networks,” IEEE
Communications Surveys Tutorials, vol. 20, no. 2, pp. 1244–1270, Secondquarter 2018.
[635] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The Cost of a Cloud: Research Problems in Data
Center Networks,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 1, pp. 68–73, Dec. 2008.
[636] W. Zhang, Y. Wen, Y. W. Wong, K. C. Toh, and C. H. Chen, “Towards Joint Optimization Over ICT and
Cooling Systems in Data Centre: A Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1596–
1616, thirdquarter 2016.
[637] K. Bilal, S. U. R. Malik, S. U. Khan, and A. Y. Zomaya, “Trends and challenges in cloud datacenters,”
IEEE Cloud Computing, vol. 1, no. 1, pp. 10–20, May 2014.
[638] C. Kachris and I. Tomkos, “Power consumption evaluation of all-optical data center networks,” Cluster
Computing, vol. 16, no. 3, pp. 611–623, 2013.
[639] K. Christensen, P. Reviriego, B. Nordman, M. Bennett, M. Mostowfi, and J. Maestro, “IEEE 802.3az: the
road to energy efficient ethernet,” IEEE Communications Magazine, vol. 48, no. 11, pp. 50–56, November 2010.

Sanaa Hamid Mohamed


received the B.Sc. degree (honors) in electrical and electronic engineering from the University of Khartoum,
Sudan, in 2009 and the M.Sc. degree in electrical engineering from the American University of Sharjah (AUS),
United Arab Emirates, in 2013. She received a full-time graduate teaching assistantship, MSELE program, AUS
in 2011. She received a Doctoral Training Award (DTA) from the UK Engineering and Physical Sciences
Research Council (EPSRC) to fund her PhD studies at the University of Leeds in 2015. Currently, she is a PhD
student at the Institute of Communication and Power Networks, University of Leeds, UK. She has held research
and teaching positions at the University of Khartoum, AUS, Sudan University of Science and Technology, and
Khalifa University between 2009 and 2013. She has been an IEEE member since 2008 and a member of the IEEE
communication, photonics, computer, cloud computing, software defined networks, and sustainable ICT societies.
Her research interests include wireless communications, optical communications, optical networking, software-
defined networking, data center networking, big data analytics, and cloud and fog computing.

Taisir E. H. El-Gorashi
received the B.S. degree (first-class Hons.) in electrical and electronic engineering from the University of
Khartoum, Khartoum, Sudan, in 2004, the M.Sc. degree (with distinction) in photonic and communication systems
from the University of Wales, Swansea, UK, in 2005, and the PhD degree in optical networking from the
University of Leeds, Leeds, UK, in 2010. She is currently a Lecturer in optical networks in the School of Electrical
and Electronic Engineering, University of Leeds. Previously, she held a Postdoctoral Research post at the
University of Leeds (2010–2014), where she focused on the energy efficiency of optical networks, investigating
the use of renewable energy in core networks, green IP over WDM networks with datacenters, energy efficient
physical topology design, energy efficiency of content distribution networks, distributed cloud computing,
network virtualization and Big Data. In 2012, she was a BT Research Fellow, where she developed energy
efficient hybrid wireless-optical broadband access networks and explored the dynamics of TV viewing behavior
and program popularity. The energy efficiency techniques developed during her postdoctoral research contributed
3 out of the 8 carefully chosen core network energy efficiency improvement measures recommended by the
GreenTouch consortium for every operator network worldwide. Her work led to several invited talks at
GreenTouch, Bell Labs, Optical Network Design and Modelling conference, Optical Fiber Communications
conference, International Conference on Computer Communications, EU Future Internet Assembly, IEEE
Sustainable ICT Summit and IEEE 5G World Forum and collaboration with Nokia and Huawei.

Jaafar M. H. Elmirghani (M’92–SM’99)


is the Director of the Institute of Communication and Power Networks within the School of Electronic and
Electrical Engineering, University of Leeds, UK. He joined Leeds in 2007; prior to that (2000–2007), as Chair
in Optical Communications at the University of Wales Swansea, he founded, developed, and directed the Institute
of Advanced Telecommunications and the Technium Digital (TD), a technology incubator/spin-off hub. He has
provided outstanding leadership in a number of large research projects at the IAT and TD. He received the Ph.D.
in the synchronization of optical systems and optical receiver design from the University of Huddersfield UK in
1994 and the DSc in Communication Systems and Networks from University of Leeds, UK, in 2014. He has co-
authored Photonic Switching Technology: Systems and Networks, (Wiley) and has published over 500 papers.
He has research interests in optical systems and networks. Prof. Elmirghani is Fellow of the IET, Fellow of the
Institute of Physics and Senior Member of IEEE. He was Chairman of IEEE Comsoc Transmission Access and
Optical Systems technical committee and was Chairman of IEEE Comsoc Signal Processing and Communications
Electronics technical committee, and an editor of IEEE Communications Magazine. He was founding Chair of
the Advanced Signal Processing for Communication Symposium which started at IEEE GLOBECOM’99 and has
continued since at every ICC and GLOBECOM. Prof. Elmirghani was also founding Chair of the first IEEE
ICC/GLOBECOM optical symposium at GLOBECOM’00, the Future Photonic Network Technologies,
Architectures and Protocols Symposium. He chaired this Symposium, which continues to date under different
names. He was the founding chair of the first Green Track at ICC/GLOBECOM at GLOBECOM 2011, and is
Chair of the IEEE Sustainable ICT Initiative within the IEEE Technical Activities Board (TAB) Future Directions
Committee (FDC) and within the IEEE Communications Society, a pan IEEE Societies Initiative responsible for
Green and Sustainable ICT activities across IEEE, 2012-present. He is and has been on the technical program
committee of 38 IEEE ICC/GLOBECOM conferences between 1995 and 2019 including 18 times as Symposium
Chair. He received the IEEE Communications Society Hal Sobol award, the IEEE Comsoc Chapter Achievement
award for excellence in chapter activities (both in 2005), the University of Wales Swansea Outstanding Research
Achievement Award, 2006, the IEEE Communications Society Signal Processing and Communication Electronics
outstanding service award, 2009, a best paper award at IEEE ICC’2013, the IEEE Comsoc Transmission Access
and Optical Systems outstanding Service award 2015 in recognition of “Leadership and Contributions to the Area
of Green Communications”, received the GreenTouch 1000x award in 2015 for “pioneering research contributions
to the field of energy efficiency in telecommunications", the 2016 IET Optoelectronics Premium Award and
shared with 6 GreenTouch innovators the 2016 Edison Award in the “Collective Disruption” Category for their
work on the GreenMeter, an international competition; this is clear evidence of his seminal contributions to Green
Communications, which have a lasting impact on the environment and society. He is currently an editor
of: IET Optoelectronics, Journal of Optical Communications, IEEE Communications Surveys and Tutorials and
IEEE Journal on Selected Areas in Communications series on Green Communications and Networking. He was
Co-Chair of the GreenTouch Wired, Core and Access Networks Working Group, an adviser to the Commonwealth
Scholarship Commission, member of the Royal Society International Joint Projects Panel and member of the
Engineering and Physical Sciences Research Council (EPSRC) College. He was Principal Investigator (PI) of the
£6m EPSRC INTelligent Energy awaRe NETworks (INTERNET) Programme Grant, 2010-2016 and is currently
PI of the £6.6m EPSRC Terabit Bidirectional Multi-user Optical Wireless System (TOWS) for 6G LiFi
Programme Grant, 2019-2024. He has been awarded in excess of £30 million in grants to date from EPSRC, the
EU and industry and has held prestigious fellowships funded by the Royal Society and by BT. He was an IEEE
Comsoc Distinguished Lecturer 2013-2016.
