A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks
Abstract— This survey article reviews the challenges associated with deploying and optimizing big data applications
and machine learning algorithms in cloud data centers and networks. The MapReduce programming model and its
widely used open-source platform, Hadoop, are enabling the development of a large number of cloud-based services
and big data applications. MapReduce and Hadoop thus enable innovative, efficient, and accelerated intensive
computations and analytics. These services usually utilize commodity clusters within geographically-distributed data
centers and provide cost-effective and elastic solutions. However, the increasing traffic between and within the data
centers that migrate, store, and process big data, is becoming a bottleneck that calls for enhanced infrastructures
capable of reducing the congestion and power consumption. Moreover, enterprises with multiple tenants requesting
various big data services are challenged by the need to optimize leasing their resources at reduced running costs and
power consumption while avoiding under- or over-utilization. In this survey, we present a summary of the characteristics
of various big data programming models and applications and provide a review of cloud computing infrastructures,
and related technologies such as virtualization, and software-defined networking that increasingly support big data
systems. Moreover, we provide a brief review of data centers topologies, routing protocols, and traffic characteristics,
and emphasize the implications of big data on such cloud data centers and their supporting networks. Wide-ranging
efforts have been devoted to optimizing systems that handle big data in terms of various application performance metrics
and/or infrastructure energy efficiency. This survey aims to summarize some of these studies, which are classified
according to their focus into applications-level, networking-level, or data centers-level optimizations. Finally, some
insights and future research directions are provided.
Index Terms— Big Data, MapReduce, Machine Learning, Data Streaming, Cloud Computing, Cloud Networking,
Software-Defined Networking (SDN), Virtual Machines (VM), Network Function Virtualization (NFV), Containers,
Data Centers Networking (DCN), Energy Efficiency, Completion Time, Scheduling, Routing.
I INTRODUCTION
THE evolving paradigm of big data is essential for critical advancements in data processing models and the
underlying acquisition, transmission, and storage infrastructures [1]. Big data differs from traditional data in being
potentially unstructured, rapidly generated, continuously changing, and massively produced by a large number of
distributed users or devices. Typically, big data workloads are transferred into powerful data centers containing
sufficient storage and processing units for real-time or batch computations and analysis. A widely used
characterization for big data is the "5V" notation which describes big data through its unique attributes of Volume,
Velocity, Variety, Veracity, and Value [2]. In this notation, the volume refers to the vast amount of data produced
which is usually measured in Exabytes (i.e. 2^60 or 10^18 bytes) or Zettabytes (i.e. 2^70 or 10^21 bytes), while the
velocity reflects the high speed or rate of data generation and hence potentially the short lived useful lifetime of
data. Variety indicates that big data can be composed of different types of data which can be categorized into
structured and unstructured. An example of structured data is bank transactions which can fit into relational
database systems, and an example of the unstructured data is social media content that could be a mix of text,
photos, animated Graphics Interchange Format (GIF), audio files, and videos contained in the same element (e.g.
a tweet, or a post). The veracity measures the trustworthiness of the data as some generated portions could be
erroneous or inaccurate, while the value measures the ability of the user or owner of the data to extract useful
information from the data.
In 2020, the global data volume is predicted to be around 40,000 Exabytes, which represents a 300-fold growth
compared to the global data volume in 2005 [3]. The global data volume was estimated at about 640 Exabytes in
2010 [4] and at about 2,700 Exabytes in 2015 [5]. This huge growth in data volumes is the result
of continuous developments in various applications that generate massive and rich content related to a wide range
of human activities. For example, online business transactions are expected to reach a rate of 450 Billion
transactions per day by 2020 [4]. Social media platforms such as Facebook, LinkedIn, and Twitter, which have between 300
Million and 2 Billion subscribers who access them through web browsers in personal
computers (PCs) or through applications installed in tablets and smart phones, are enriching the
Internet with content in the range of several Terabytes (2^40 bytes) per day [5]. Analyzing the thematic connections
between the subscribers, for example by grouping people with similar interests, is opening remarkable
opportunities for targeted marketing and e-commerce. Moreover, the subscribers' behaviours and preferences,
tracked by their activities, clickstreams, requests, and collected web log files, can be analyzed with big data mining
tools for profound psychological, economic, business-oriented, and product improvement studies [6], [7]. To
accelerate the delay-sensitive operations of web searching and indexing, distributed programming models for big
data such as MapReduce were developed [8]. MapReduce is a powerful, reliable, and cost-effective programming
model that performs parallel processing for large distributed datasets. These features have enabled the
development of different distributed programming big data solutions and cloud computing applications.
Fig. 1. Big data communication, networking, and processing infrastructure, and examples of big data
applications.
A wide range of applications are considered big data applications, ranging from data-intensive scientific applications
that require extensive computations to applications that manipulate massive datasets, such as in earth sciences,
astronomy, nanotechnology, genomics, and bioinformatics [9]. Typically, the computations, simulations, and
modelling in such applications are carried out in High Performance Computing (HPC) clusters with the aid of
distributed and grid computing. However, the growth of datasets beyond these systems' capacities, in addition to the
desire to share datasets for scientific research collaborations in some disciplines are encouraging the utilization of
big data applications in cloud computing infrastructures with commodity devices for scientific computations
despite the resultant performance and cost tradeoffs [10].
With the prevalence of mobile applications and services that have extensive computational and storage
demands exceeding the capabilities of the current smart phones, emerging technologies such as Mobile Cloud
Computing (MCC) were developed [11]. In MCC, the computational and storage demands of applications are
outsourced to remote (or close as in mobile edge computing (MEC)) powerful servers over the Internet. As a
result, on-demand rich services such as video streaming, interactive video, and online gaming can be effectively
delivered to the capacity and battery limited devices. Video content accounted for 51% of the total mobile data
traffic in 2012 [11], and is predicted to account for 78% of an expected total volume of 49 Exabytes by 2021 [12].
Due to these huge demands, in addition to the large sizes of video files, big video data platforms are facing
several challenges related to video streaming, storage, and replication management, while needing to meet strict
quality-of-experience (QoE) requirements [13].
In addition to mobile devices, the wide range of everyday physical objects that are increasingly interconnected
for automated operations has formed what is known as the Internet-of-Things (IoT). In IoT systems, the underlying
communication and networking infrastructure is typically integrated with big data computing systems for data
collection, analysis, and decision-making. Several technologies such as RFID, low power communication
technologies, Machine-to-Machine (M2M) communications, and wireless sensor networking (WSN) have been
suggested for improved IoT communications and networking infrastructure [14]. To process the big data generated
by IoT devices, different solutions such as cloud and fog computing were proposed [15]-[31]. Existing cloud
computing infrastructures could be utilized by aggregating and processing big data in powerful central data
centers. Alternatively, data could be processed at the edge where fog computing units, typically with limited
processing capacities compared to the cloud, are utilized [32]. Edge computing reduces both the traffic in core
networks and the latency by being closer to end devices. The connected devices could be sensors gathering
different real-time measurements, or actuators performing automated control operations in industrial, agricultural,
or smart building applications. IoT can support vehicle communication to realize smart transportation systems.
IoT can also support medical applications such as wearables and telecare applications for remote treatment,
diagnosis and monitoring [33]. With this variety in IoT devices, the number of Internet-connected things is
expected to exceed 50 Billion by 2020, and the services provided by IoT are expected to add $15 Trillion to the
global Gross Domestic Product (GDP) in the next 20 years [14]. Figure 1 provides generalized big data
communication, networking, and processing infrastructure and examples of applications that can utilize it.
Achieving the full potential of big data requires a multidisciplinary collaboration between computer scientists,
engineers, data scientists, as well as statisticians and other stakeholders [4]. It also calls for huge investments and
developments by enterprises and other organizations to improve big data processing, management, and analytics
infrastructures to enhance decision making and services offerings. Moreover, there are urgent needs for integrating
new big data applications with existing Application Program Interfaces (API) such as Structured Query Language
(SQL), and R language for statistical computing. More than $15 billion have already been invested in big data
systems by several leading Information Technology (IT) companies such as IBM, Oracle, Microsoft, SAP, and
HP [34]. One of the challenges of big data in enterprise and cloud infrastructures is the existence of various
workloads and tenants with different Service Level Agreement (SLA) requirements that need to be hosted on the
same set of clusters. An early solution to this challenge at the application level is to utilize a distributed file system
to control the access and sharing of data within the clusters [35]. On the infrastructure level, solutions such as
Virtual Machines (VMs) or Linux containers dedicated to each application or tenant were utilized to support the
isolation between their assigned resources [1], [34]. Big data systems are also challenged by security, privacy, and
governance related concerns. Furthermore, as the computational demands of ever-growing data
volumes exceed the capabilities of existing commodity infrastructures, future enhanced and energy efficient
processing and networking infrastructures for big data have to be investigated and optimized.
This survey paper aims to summarize a wide range of studies that use different state-of-the-art and emerging
networking and computing technologies to optimize and enhance big data applications and systems in terms of
various performance metrics such as completion time, data locality, load balancing, fairness, reliability, and
resources utilization, and/or their energy efficiency. Due to the popularity and wide applicability of the
MapReduce programming model, its related optimization studies will be the main focus in this survey. Moreover,
optimization studies for big data management and streaming, in addition to generic cloud computing applications
and bulk data transfer are also considered. The optimization studies in this survey are classified according to their
focus into applications-level, cloud networking-level, and data center-level studies as summarized in Figure 2.
The first category at the application level targets the studies that extend or optimize existing framework parameters
and mechanisms such as optimizing jobs and data placements, and scheduling, in addition to developing
benchmarks, traces, and simulators [36]-[119]. As the use of big data applications is evolving from clusters with
controlled environments into cloud environments with geo-distributed data centers, several additional challenges
are encountered. The second category at the networking level focuses on optimizing cloud networking
infrastructures for big data applications such as inter-data centers networking, virtual machine assignments, and
bulk data transfer optimization studies [120]-[203]. The increasing data volumes and processing demands are also
challenging the data centers that store and process big data. The third category at the data center level targets
optimizing the topologies, routing, and scheduling in data centers for big data applications, in addition to the
studies that utilize, demonstrate, and suggest scaled-up computing and networking infrastructures to replace
commodity hardware in the future [204]-[311]. For the performance evaluations in the aforementioned studies,
either realistic traces, or deployments in experimental testbed clusters are utilized.
Although several big data related surveys and tutorials are available, to the best of our knowledge, none has
extensively addressed optimizing big data applications while considering the technological aspects of their hosting
cloud data centers and networking infrastructures. The tutorial in [1] and the survey in [312] considered mapping
the role of cloud computing, IoT, data centers, and applications to the acquisition, storage and processing of big
data. The authors in [313]-[316] extensively surveyed the advances in big data processing frameworks and
compared their components, usage, and performance. A review of benchmarking big data systems is provided in
[317]. The surveys in [318], [319] focused on optimizing jobs scheduling at the application level, while the survey
in [320] additionally tackled extensions, tuning, hardware acceleration, security, and energy efficiency for
MapReduce. The environmental impacts of big data and its use in greening applications and systems were
discussed in [321]. Security and privacy concerns of MapReduce in cloud environments were discussed in [322],
while the challenges and requirements of geo-distributed batch and streaming big data frameworks were outlined
in [323]. The surveys in [324]-[326] addressed the use of big data analytics to optimize wired and wireless
networks, while the survey in [327] overviewed big data mathematical representations for networking
optimizations. The scheduling of flows in data centers for big data is addressed in [328] and the impact of data
centers frameworks on the scheduling and resource allocation is surveyed in [329] for three big data applications.
This survey paper is structured as follows: For the convenience of the reader, brief overviews of the state-of-
the-art and advances in big data programming models and frameworks, cloud computing and its related technologies,
and cloud data centers are provided before the corresponding optimization studies. Section II reviews the
characteristics of big data programming models and existing batch, streaming processing and storage management
applications while Section III summarizes the applications-focused optimization studies. Section IV discusses the
prominence of cloud computing for big data applications, and the implications of big data applications on cloud
networks. It also reviews some related technologies that support big data and cloud computing systems such as
machine and network virtualization, and Software-Defined Networking (SDN). Section V summarizes the cloud
networking-focused optimization studies. Section VI briefly reviews data center topologies, traffic characteristics
and routing protocols, while Section VII summarizes the data center-focused optimization studies. Finally, Section
VIII provides future research directions, and Section IX concludes the survey. Key Acronyms are provided below.
5) A reduce task is composed of shuffle, sort, and reduce phases. The shuffle phase can start when 5% of
the map results are generated; however, the final reduce phase cannot start unless all map tasks are
completed. For shuffling, each reduce worker obtains the locations of the intermediate pairs with the
keys assigned to it and fetches the corresponding results from the map workers’ local disks typically via
the HyperText Transfer Protocol (HTTP).
6) Each reduce worker then sorts its intermediate results by the keys. The sorting is performed in the
Random Access Memory (RAM) if the intermediate results can fit, otherwise, external sort-merge
algorithms are used. The sorting groups all the occurrences of the same key and forms the shuffled
intermediate results (SIR).
7) Each reduce worker applies the assigned user-defined reduce function on the shuffled data to generate
the final key-value pairs output (O), the final output files are then saved in the distributed file system.
In MapReduce, fault-tolerance is achieved by re-executing the failed tasks. Failures can occur due to
causes such as disk failures, running out of disk space or memory, and socket timeouts. Each map or reduce task can be in
one of three statuses which are idle, in-progress, and completed. If an in-progress map task fails, the master
changes its status to idle to allow it to be re-scheduled on an available map node containing a replica of the data.
If a map worker fails while having some completed map tasks, all contained map tasks must be re-scheduled as
the intermediate results, which are only saved on local disks, are no longer accessible. In this case, all reduce
workers must be re-scheduled to obtain the correct and complete set of intermediate results. If a reduce worker
fails, only in-progress tasks are re-scheduled as the results of completed reduce jobs are saved in the distributed
file system. To improve MapReduce performance, speculative execution can be activated, where backup tasks are
created to speed up the lagging in-progress tasks, known as stragglers [8].
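Speculative execution and the number of task re-execution attempts are typically controlled through job configuration properties. The following minimal sketch, written against Hadoop 2.x-era property names, illustrates how such settings could be applied in Java; the class name and the chosen values are illustrative assumptions rather than recommended settings.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class FaultToleranceConfig {
  public static Job configure() throws Exception {
    Configuration conf = new Configuration();
    // Launch backup (speculative) tasks for straggling map and reduce tasks.
    conf.setBoolean("mapreduce.map.speculative", true);
    conf.setBoolean("mapreduce.reduce.speculative", true);
    // Maximum number of re-execution attempts before a task is declared failed.
    conf.setInt("mapreduce.map.maxattempts", 4);
    conf.setInt("mapreduce.reduce.maxattempts", 4);
    return Job.getInstance(conf, "fault-tolerant-job");
  }
}
```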
The MapReduce programming model is capable of solving several common programming problems such as
words count and sort in addition to implementing complex graph processing, data mining, and machine learning
applications. However, the speed requirements for some computations might not be satisfied by MapReduce due
to several limitations [333]. Moreover, developing efficient MapReduce applications requires advanced
programming skills to fit the computations into the map and reduce pipeline, and deep knowledge of underlying
infrastructures to properly configure and optimize a wide range of parameters [334]-[337]. One of MapReduce
limitations is that transferring non-local input data to map workers and shuffling intermediate results to reduce
workers typically require intensive networking bandwidth and disk I/O operations. Early efforts to minimize the
effects of these bottlenecks included maximizing data locality, where the computations are carried closer to data
[8]. Another limitation is due to the fault-tolerance mechanism that requires materializing the entire output of
MapReduce jobs in the disks managed by the DFS before being accessible for further computations. Hence,
MapReduce is generally less suitable for interactive and iterative computations that require repetitive access to
results. An implementation variation known as MapReduce Online [338], supports shuffling by utilizing RAM
resources to pipeline intermediate results between map and reduce stages before the materialization.
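To make the map, shuffle, and reduce pipeline described above concrete, the sketch below outlines the canonical word-count computation using Hadoop's Java MapReduce API. It is a minimal illustration only; the class names and input/output paths are assumptions, and a combiner is set to show partial map-side reduction, which shrinks the shuffled intermediate results.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Map: emit an intermediate (word, 1) pair for every word in the input split.
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  // Reduce: after shuffling and sorting by key, sum the counts of each word.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word-count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenMapper.class);
    // Using the reducer as a combiner performs partial reduction at the map side.
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));     // input data in the DFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));   // final output (O) in the DFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```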
Several other programming models were developed as variants to MapReduce [314]. One of these variants is
Dryad, which is a high-performance general-purpose distributed programming model for coarse-grain
data-parallel applications [330]. In Dryad, a Directed Acyclic Graph (DAG) is used to describe the jobs by representing
the computations as the graph vertices and the data communication patterns as the graph edges. The Job Manager
within Dryad, schedules the vertices, which contain sequential programs, to run concurrently in a set of machines
that are available at the run time. These machines can be different cores within the same multi-core PC or can be
thousands of machines within a large data center. Dryad provides fault-tolerance and efficient resource utilization
by allowing graph modification during the computations. Unlike MapReduce, which restricts the programmer to
provide a single input file and produces a single output file, Dryad allows for an arbitrary number of input and
output files.
As a successor to MapReduce, Google has recently introduced “Cloud Dataflow” which is a unified
programming model with enhanced processing capabilities for bounded and unbounded data. It provides a balance
between correctness, latency, and cost when processing massive, unbounded, and out-of-order data [331]. In
Cloud Dataflow, the data is represented as tuples containing the key, value, event-time, and the required time
window for the processing. This supports sophisticated user requirements such as event-time ordering of results
by utilizing data windowing that divides the data streams into finite chunks to be processed in groups. Cloud
Dataflow utilizes the features of both FlumeJava, which is a batch engine, and MillWheel, which is a streaming engine
[331]. The core primitives of Cloud Dataflow, are ParDo, which is an element-wise generic parallel processing
function, and GroupByKey and GroupByKeyAndWindow, which aggregate data with the same key according to the
user requests.
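The Dataflow model is exposed to developers through SDKs whose open-source descendant is Apache Beam. Purely as an illustration of the ParDo, GroupByKey, and windowing primitives mentioned above, a sketch against the Beam 2.x Java API (an assumption made here; this is not the original Cloud Dataflow SDK) could look as follows, grouping keyed events into one-minute event-time windows and summing each group.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TimestampedValue;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class WindowedSum {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Key-value tuples carrying explicit event times (normally read from an unbounded source).
    PCollection<KV<String, Long>> events = p.apply(Create.timestamped(
        TimestampedValue.of(KV.of("sensor-1", 3L), new Instant(0L)),
        TimestampedValue.of(KV.of("sensor-1", 4L), new Instant(30_000L)),
        TimestampedValue.of(KV.of("sensor-2", 7L), new Instant(65_000L))));

    events
        // Divide the (possibly unbounded) stream into 1-minute event-time windows.
        .apply(Window.<KV<String, Long>>into(FixedWindows.of(Duration.standardMinutes(1))))
        // Aggregate values with the same key within each window.
        .apply(GroupByKey.<String, Long>create())
        // ParDo: element-wise generic parallel processing, here summing each group.
        .apply(ParDo.of(new DoFn<KV<String, Iterable<Long>>, KV<String, Long>>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            long sum = 0L;
            for (long v : c.element().getValue()) {
              sum += v;
            }
            c.output(KV.of(c.element().getKey(), sum));
          }
        }));

    p.run().waitUntilFinish();
  }
}
```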
B. Apache Hadoop Architecture and Related Software:
Hadoop, which is currently under the auspices of the Apache Software Foundation, is an open source software
framework written in Java for reliable and scalable distributed computing [339]. This framework was initiated by
Doug Cutting who utilized the MapReduce programming model for indexing web crawls and was open-sourced
by Yahoo in 2005. Beside Apache, several organizations have developed and customized Hadoop distributions
tailored for their infrastructures such as HortonWorks, Cloudera, Amazon Web Services (AWS), Pivotal, and
MapR Technologies. The Hadoop ecosystem allows other programs to run on the same infrastructure with
MapReduce which made it a natural choice for enterprise big data platforms [35].
The basic components of the first versions of Hadoop (Hadoop 1.x) are depicted in Figure 4. These versions
contain a layer for the Hadoop Distributed File System (HDFS), a layer for the MapReduce 1.0 engine which
resembles Google's MapReduce, and can have other applications on the top layer. The MapReduce 1.0 layer
follows the master-slave architecture. The master is a single node containing a Job Tracker (JT), while each slave
node contains a Task Tracker (TT). The JT handles jobs assignment and scheduling and maintains the data and
metadata of jobs, in addition to resources information. It also monitors the liveness of TTs and the availability of
their resources through periodic heartbeat messages sent by the TTs, typically every 3 seconds. Each TT contains a predefined
set of slots. Once it accepts a map or a reduce task, it launches a Java Virtual Machine (JVM) in one of its slots
to perform the task, and periodically updates the JT with the task status [339].
The HDFS layer consists of a name node in the master and several data nodes in each slave node. The name
node stores the details of the data nodes and the addresses of the data blocks and their replicas. It also checks the
data nodes via heartbeat messages and manages load balancing. For reliability, a secondary name node is typically
assigned to save snapshots of the primary name node. As in GFS, the default block size in HDFS is 64 MB and three
replicas are maintained for each block for fault-tolerance, performance improvements, and load balancing. Beside
GFS and HDFS, several distributed file systems were developed such as Amazon's Simple Storage Service (S3),
Moose File System (MFS), Kosmos distributed file system (KFS), and Colossus [314], [340].
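As an illustration of how client applications interact with HDFS, the sketch below uses Hadoop's Java FileSystem API to write and then read a small file. The name node address and the block size and replication settings are illustrative assumptions (the property names shown are the Hadoop 2.x ones).

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode:9000"); // address of the (primary) name node
    conf.set("dfs.replication", "3");                 // three replicas per block (the default)
    conf.set("dfs.blocksize", "67108864");            // 64 MB blocks, as in early HDFS releases

    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/user/demo/sample.txt");

    // Writes go through the name node for metadata, while the data flows to the data nodes.
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
    }

    // Reads are served directly from the data nodes holding the block replicas.
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
      System.out.println(in.readLine());
    }
    fs.close();
  }
}
```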
The default task scheduling mechanisms in Hadoop are First-In First-Out (FIFO), the Capacity scheduler, and the Hadoop
Fair Scheduler (HFS). FIFO schedules the jobs according to their arrival time which leads to undesirable delays
in environments with a mix of long batch jobs and small interactive jobs [319]. The Capacity scheduler developed
at Yahoo reserves a pool containing minimum resources guarantees for each user, and hence suits systems with
multiple users [319]. FIFO scheduling is then used for the jobs of the same user. The Fair scheduler developed at
Facebook dynamically allocates the resources equally between jobs. It thus improves the response time of small
jobs [46].
Fig. 4. Framework components in Hadoop (a) Hadoop 1.x, and (b) Hadoop 2.x.
Hadoop 2.x, which is also depicted in Figure 4, introduced a resource management platform named YARN;
Yet Another Resource Negotiator [341]. YARN decouples the resource management infrastructure from the
processing components and enables the coexistence of different processing frameworks beside MapReduce which
increases the flexibility in big data clusters. In YARN, the JT and TT are replaced with three components which
are the Resource Manager (RM), the Node Manager (NM), and the Application Master (AM). The RM is a per-
cluster global resource manager which runs as a daemon on a dedicated node. It contains a scheduler that
dynamically leases the available cluster resources in the form of containers (further explained in Subsection IV-
B4), which are considered as logical bundles (e.g. 2 GB RAM, 1 Central Processing Unit (CPU) core), among
competing MapReduce jobs and other applications according to their demands and scheduling priorities. A NM
is a per-server daemon that is responsible for monitoring the health of its physical node, tracking its containers
assignments, and managing the containers lifecycle (i.e. starting and killing). An AM is a per-application container
that manages the resources consumption, the jobs execution flow, and also handles the fault-tolerance tasks. The
AM, which typically needs to harness resources from several nodes to finish its job, issues a resource request to
the RM indicating the required number of containers, the required resources per container, and the locality
preferences.
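To illustrate the resource-request interaction between an AM and the RM described above, the following sketch uses YARN's Java AMRMClient API to ask for containers with a given capability and locality preferences. It is a simplified illustration only; the node and rack names are hypothetical, and a real AM would also launch and track the allocated containers through the NMs.

```java
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SimpleAppMaster {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new YarnConfiguration());
    rmClient.start();

    // Register this AM with the per-cluster Resource Manager.
    rmClient.registerApplicationMaster("", 0, "");

    // Ask for two containers, each a logical bundle of 2 GB RAM and 1 CPU core,
    // preferably on node1/node2 or at least within rack1 (locality preferences).
    Resource capability = Resource.newInstance(2048, 1);
    Priority priority = Priority.newInstance(0);
    for (int i = 0; i < 2; i++) {
      rmClient.addContainerRequest(new ContainerRequest(
          capability, new String[] {"node1", "node2"}, new String[] {"/rack1"}, priority));
    }

    // Heartbeat to the RM; allocated containers arrive asynchronously in the responses.
    AllocateResponse response = rmClient.allocate(0.0f);
    for (Container c : response.getAllocatedContainers()) {
      System.out.println("Allocated " + c.getId() + " on " + c.getNodeId());
    }

    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    rmClient.stop();
  }
}
```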
Figure 5 illustrates the differences between Hadoop 1.x, and Hadoop 2.x with YARN. A detailed study of
various releases of Hadoop is presented in [342]. The authors also provided a brief comparison of these releases
in terms of their energy efficiency and performance.
YARN increases the resource allocation flexibility in MapReduce 2 as it utilizes flexible containers for
resources allocations and hence eliminates the use of fixed resources slots as in MapReduce 1. However, this
advantage comes at the expense of added system complexity and a slight increase in the power consumption
when compared to MapReduce 1 [342]. Another difference is that the intermediate results shuffling operations
in MapReduce 2 are performed via auxiliary services that preserve the output of a container’s results before killing
it. The communications between the AMs, NMs, and the RM are heartbeat-based. If a node fails, the NM sends
an indicating heartbeat to the RM which in turn informs the affected AMs. If an in-progress task fails, the NM
marks it as idle and re-executes it. If an AM fails, the RM restarts it and synchronizes its tasks. If the RM fails, an
old checkpoint is used and a secondary RM is activated [341].
A wide range of applications and programming frameworks can run natively in Hadoop as depicted in Figure
4. The difference in their implementation between the Hadoop versions is that in the 1.x versions, they are forced
to follow the MapReduce framework while in the 2.x versions they are no longer restricted to it. Examples of
these applications and frameworks are Pig [343], Tez [344], Hive [345], HBase [314], Storm [346], Giraph for
graph processing, and Mahout for machine learning [315], [340]. Due to the lack of built-in declarative languages
in Hadoop, Pig, Tez, and Hive were introduced to support querying and to replace ad-hoc users-written programs
which were hard to maintain and reuse. Pig is composed of an execution engine and a declarative scripting
language named Pig Latin that compiles SQL-like queries to an equivalent set of sequenced MapReduce jobs. Pig
hides the complexity of MapReduce and directly provides advanced operations such as filtering, joining, and
ordering [343]. Tez is a flexible input-processor output-runtime model that transforms the queries into abstracted
DAG where the vertices represent the parallel tasks, and the edges represent the data movement between different
map and reduce stages [344]. It supports in-memory operations which makes it more suitable for interactive
processing than MapReduce. Hive [345] is a data warehouse software developed at Facebook. It contains a
declarative language, HiveQL, that automatically generates MapReduce jobs from SQL-like user queries.
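As a simple illustration of how a declarative HiveQL query is submitted (and then compiled by Hive into MapReduce jobs behind the scenes), the sketch below uses the standard Hive JDBC driver for HiveServer2; the server address, credentials, table, and column names are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQuery {
  public static void main(String[] args) throws Exception {
    // Load the HiveServer2 JDBC driver (optional with JDBC 4 auto-loading).
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // HiveServer2 endpoint; host, port, and database are illustrative.
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://hiveserver:10000/default", "user", "");
         Statement stmt = conn.createStatement()) {
      // A SQL-like aggregation; Hive translates it into one or more MapReduce jobs.
      ResultSet rs = stmt.executeQuery(
          "SELECT page, COUNT(*) AS visits FROM weblogs GROUP BY page ORDER BY visits DESC LIMIT 10");
      while (rs.next()) {
        System.out.println(rs.getString("page") + "\t" + rs.getLong("visits"));
      }
    }
  }
}
```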
C. Completion Time:
The reduction of jobs overall completion time was considered in several studies with the aim of improving the
SLA or reducing the power consumption in underlying clusters. Numerical evaluations were utilized in [65] to
test two power efficient resource allocation approaches in a pool of MapReduce clusters. The algorithms
developed aimed to reduce the end-to-end delay or the energy consumption while considering the availability as
an SLA metric. In an effort to provide predictable services to deadline-constrained jobs, the work in [66] suggested
a Resource and Deadline-aware Hadoop scheduler (RDS). RDS is based on online optimization and a self-learning
completion time estimator for future tasks which makes it suitable for dynamic Hadoop clusters with mixture of
energy sources and workloads. The work in [67] focused on reducing the completion time of small jobs that
account for the majority of the jobs in production Hadoop clusters. The proposed scheduler, Fair4S, achieved an
improvement by a factor of 7 compared to the fair scheduler, where 80% of small jobs waited less than 4 seconds to
be served. In [68], a Bipartite Graph MapReduce scheduler (BGMRS) was proposed for deadline-constrained
MapReduce jobs in clusters with heterogeneous nodes and dynamic job execution time. By optimizing scheduling
and resources allocations, BGMRS reduced the deadline miss ratio by 79% and the completion time by 36%
compared to fair scheduler.
The authors in [69] developed an Automatic Resource Inference and Allocation (ARIA) framework based on
jobs profiling to reduce the completion time of MapReduce jobs in shared clusters. A Service Level Objective
(SLO) scheduler was developed to utilize the predicted completion times and determine the schedule of resources
allocation for tasks to meet soft deadlines. The study in [70] proposed a Dynamic Priority Multi-Queue Scheduler
(DPMQS) to reduce the completion time of map tasks in heterogeneous environments. DPMQS increased both
the data locality and the priority of map jobs that are near to completion. Optimizing the scheduling of mixed
MapReduce-like workloads was considered in [71] through offline and online algorithms that determine the order
of tasks that minimize the weighted sum of the completion time. The authors in [72] also emphasized the role of
optimizing jobs ordering and slots configurations in reducing the total completion time for offline jobs and
proposed algorithms that can improve non-optimized Hadoop by up to 80%. The work in [73] considered
optimizing four NoSQL databases (i.e. HBase, Cassandra, Hive, and HadoopDB) by reducing the Waiting Energy
Consumption (WEC) caused by idle nodes waiting for job assignments, I/O operations, or results from other
nodes. RoPE was proposed in [74] to reduce the response time of relational queries performed as cascaded
MapReduce jobs in SCOPE which is a parallel processing engine used by Microsoft. A profiler for code and data
properties was used to improve future invocations of the same queries. RoPE achieved 2× improvements in the
response time for 95% of Bing's production jobs while using 1.5× less resources.
Job failures lead to a huge increase in completion time. To improve the performance of Hadoop under
failures, a modified MapReduce work flow with fine-grained fault-tolerance mechanism called BEneath the Task
Level (BeTL) was proposed in [75]. BeTL allows generating more files during the shuffling to create more
checkpoints. It improved the performance of Hadoop under no failures by 6.6% and under failures by up to 51%.
The work in [76] proposed four multi-queue size-based scheduling policies to reduce jobs slowdown variability
which is defined as the idle time to wait for resources or I/O operations. Several factors such as parameters
sensitivity, load unbalance, heavy-traffic, and fairness were considered. The work in [77] optimized the number
of reduce tasks, their configurations, and memory allocations based on profiling the intermediate results size. The
results indicated the complete avoidance of job failures due to insufficient memory and a reduction in the completion
time by up to 88.79% compared to legacy memory allocation approaches. To improve the performance of
MapReduce in memory-constrained systems, Mammoth was proposed in [78] to provide global memory
management. In Mammoth, related map and reduce tasks were launched in a single Java Virtual Machine (JVM)
as threads that share the memory at run time. Mammoth actively pushes intermediate results to reducers unlike
Hadoop that passively pulls from disks. A rule-based heuristic was used to prioritize memory allocations and
revocations among map, shuffle, and reduce operations. Mammoth was found to be 5.19 times faster than Hadoop
1 and to outperform Spark for interactive and iterative jobs when the memory is insufficient [78]. An automatic
skew mitigation approach, SkewTune, was proposed and optimized in [79]. SkewTune detects different types of
skew, and effectively re-partitions the unprocessed data of the stragglers to process them in idle nodes. The results
indicated a reduction by a factor of 4 in the completion time for workloads with skew and minimal overhead for
workloads without skew.
TABLE I
SUMMARY OF APPLICATIONS-FOCUSED OPTIMIZATIONS STUDIES
Ref | Objective | Application | Tools | Benchmarks/workloads | Experimental Setup/Simulation environment
[36]* | Optimize data placement in heterogeneous clusters | Hadoop 1.x | Dynamic algorithm | Grep, WordCount | 5 heterogeneous nodes with Intel's (Core 2 Duo, Celeron, Pentium 3)
[37]* | Reduce energy consumption by scaling down unutilized clusters | Hadoop 0.20.0 | Modification to Hadoop and defining covering subset | Webdata_sort, data_scan from Gridmix benchmark (16-128 GB) | 36 nodes (8 CPU cores, 32GB RAM, Gigabit NIC, two disks), 48-port HP ProCurve 2810-48G switch
[38]* | Balance energy consumption and performance | - | Locality-aware scheduler via Markov Decision Process | Geometric distribution for tasks arrival rate | Event-driven simulations
[39]* | Dependency-Aware Locality for MapReduce (DALM) | Hadoop 1.2.1, Giraph 1.0.0 | Modification to HDFS replication scheme and scheduling algorithm | 3.5 GB graph from Wikimedia database, and 2.1 GB of public social networks data | 4 nodes (Intel i7 3.4 GHz Quad, 16GB DDR3 RAM, 1 TB disk, 1 Gbps NIC), NETGEAR 8-port switch
[40]* | Optimizing reduce task locality for sequential MapReduce jobs | Hadoop 1.x | Classical stochastic sequential assignment | Grep (10 GB), Sort (15 GB, 4.3 GB) | 7 slave nodes (4 cores 2.933 GHz CPU, 32KB cache, 6GB RAM, 72GB disk)
[41]* | Optimizing data placements for query operations | Hadoop 1.0.3, with Hive 0.10.0 | k-means clustering for data placement, HDFS extension to support customized data placement | 920 GB business ad-hoc queries from TPC-H | 10 nodes; 1 NameNode (6 2.6 GHz CPU cores, 16GB RAM, 1TB SATA), 9 DataNodes (Intel i5, 4GB RAM, 300GB disk), 1 Gbps Ethernet
[42]* | HConfig: Configuring data loading in HBase clusters | HBase 0.96.2, Hadoop 2.2.0 | Algorithms to semi-automate data loading and resource allocation | YCSB Benchmark | 13 nodes (1 manager, 3 coordinators, 9 workers), and 40 nodes (1 manager, 3 coordinators, 36 workers) with (AMD Opteron CPU, 8GB RAM, 2 SATA 1TB disks), 1 Gigabit Ethernet
[43]* | Impact of data locality in YARN on completion time and I/O operations | Hadoop 2.3.0, Sqoop, Pig, Hive | Modify Rumen and YARN Scheduler Load Simulator (SLS) to report data locality | 8GB of synthetic text for Grep, WordCount, 10GB TPC-H queries | One node (4 CPU cores, 16GB RAM), 16 nodes (2 CPU cores, 8GB RAM), 1 Gigabit Ethernet, YARN Location Simulator (YLocSim)
[44]* | DREAMS: dynamic reduce tasks resources allocation | YARN 2.4.0 | Additional feature in YARN | WC, Inverted Index, k-means, classification, DataJoin, Sort, Histo-movies (5GB, 27GB) | 21 Xen-based virtual machines (4 2GHz CPU cores, 8GB RAM, 80GB disk)
[45]* | JellyFish: Self-tuning system based on YARN | Hadoop 2.x | Parameters tuning, resources rescheduling via elastic containers | PUMA benchmark (TeraSort, WordCount, Grep, Inverted Index) | 4 nodes (dual Intel 6 cores Xeon CPU, 15MB L3 cache, 7GB RAM, 320GB disk), Gigabit Ethernet
[46]° | Delay scheduling: balance fairness and data locality | Hadoop 0.20 | Algorithms implemented in HDFS | Facebook traces (text search, simple filtering selection, aggregation, join) | 100 nodes in Amazon EC2 (4 2 GHz cores, 4 disks, 15GB RAM, 1 Gbps links), 100 nodes (8 CPU cores, 4 disks), 1 Gbps Ethernet
[47]° | Quincy: Fair scheduling with locality and fairness | Dryad | Graph-based algorithms | Sort (40, 160, 320) GB, Join (11.8, 41.8) GB, PageRank (240) GB, WordCount (0.1-5) GB | 243 nodes in 8 racks (16GB RAM, 2 2.6 GHz dual core AMD), 48-port Gigabit Ethernet per rack
[48]° | Maestro: Replica-aware scheduling | Hadoop 0.19.0, Hadoop 0.21.0 | Heuristic | GridMix (Sort, WordCount) (2.5, 1, 12.5, 200) GB | 20 virtual nodes in local cluster, 100 Grid5000 nodes (2 GHz dual core AMD, 2GB RAM, 80GB disk)
[49]° | Map task scheduling to balance data locality and load balancing | - | Join Shortest Queue, Max Weight Scheduler in heavy traffic queues | Parameters from search jobs in databases | Simulations for a 400 machines cluster
[50]° | Resource-aware Adaptive Scheduling (RAS) | Hadoop 0.23 | Scheduler and job profiler | Gridmix Benchmark (Sort, Combine, Select) | 22 nodes (64-bit 2.8 GHz Intel Xeon, 2GB RAM), Gigabit Ethernet
[51]° | FLEX: flexible allocation scheduling | Hadoop 0.20.0 | Standalone plug-in scheme or add-on module in FAIR | 1.8 TB synthesized data, and GridMix2 | 26 nodes (3 GHz Intel Xeon), 13 4-core blades in one rack and 13 8-core blades in second rack
[52]° | HFSP: size-based scheduling | Pig, Hadoop 1.x | Job size estimator and algorithm | PigMix (1, 10, 100 GB and 1TB) | 20 workers with TaskTracker (4 CPU cores, 8GB RAM)
[53]° | LsPS: self-tuning two-tiers scheduler | Hadoop 1.x | Plug-in scheduler | WordCount, Grep, PiEstimator, Sort | EC2 m1.large (7.5GB RAM, 850GB disk), 11 nodes (1 master, 10 slaves), trace-driven simulations
[54]° | Joint scheduling of MapReduce in servers | Hadoop 1.2.0 | 3-approximation algorithm, heuristic | WordCount (43.7 GB Wikipedia document package) | 16 nodes EC2 VM (1 GHz CPU, 1.7GB memory, 160GB disk)
[55]° | Joint scheduling of processing and shuffle | - | Linear programming, heuristics | Synthesized workloads | Event-based simulations
[56]° | Dynamic slot scheduling for I/O intensive jobs | Hadoop 1.0.3 | Dynamic algorithm based on I/O and CPU statistics | Sort (3, 6, 9, 12, 15) GB | 2 masters (12 CPU cores 1.9 GHz AMD, 32GB RAM), 4, 8, 16 slaves (Intel Core i5 1.9 GHz, 8GB RAM)
[57]° | Energy-efficient scheduling | - | Polynomial time constant-factor approximation algorithm | Synthesized workloads | MATLAB simulations
[58]° | Energy efficiency for computation intensive workloads | Hadoop 0.20.0 | DVFS control in source code and via external scheduler | Sort, CloudBurst, Matrix Multiplication | 8 nodes (AMD Opteron quad-core 2380 with DVFS support, 2 64 kB L1 caches), Gigabit Ethernet
[59]° | Energy-aware scheduling | Hadoop 0.19.1 | Scheduling algorithms EMRSA-I, EMRSA-II | TeraSort, PageRank, k-means | 2 nodes (24GB RAM, 16 2.4 GHz CPU cores, 1TB disk), 2 nodes (16GB RAM, 16 2.4 GHz CPU cores, 1TB disk)
[60]° | DynamicMR: slot allocation in shared clusters | Hadoop 1.2.1 | Algorithms: PI-DHSA, PD-DHSA, SEPB, slot pre-scheduling | PUMA benchmark | 10 nodes (Intel X5675, 3.07 GHz, 24GB RAM, 56GB disk)
[61]° | PRISM: Fine-grained resource aware scheduling | Hadoop 0.20.2 | Phase-level scheduling algorithm | Gridmix 2, PUMA benchmark | 10 nodes; 1 master, 15 slaves (Quad-core Xeon E5606, 8GB RAM, 100GB disk), Gigabit Ethernet
[62]° | HaSTE: scheduling in YARN to reduce makespan | Hadoop YARN 2.2.0 | Dynamic programming algorithm | WordCount (14, 10.5) GB, TeraSort 15GB, WordMean 10.5GB, PiEstimate | 8 nodes (8 CPU cores, 8GB RAM)
[63]° | SLA-aware energy efficient scheduling in Hadoop YARN | - | DVFS in the per-applications master | Sort, Matrix Multiplications | CloudSim-based simulations for 30 servers with network speeds between 100-200 Mbps
[64]° | Priority-based resource scheduling for Distributed Stream Processing Systems | Quasit | Meta-scheduler | 40 million tuples of realistic 3 hours vehicular traffic traces | 5 nodes; 1 master, 4 workers (AMD Athlon64 3800+, 2GB RAM)
[65]† | Power-efficient resources allocation and mean end-to-end delay minimization | - | Algorithms | - | Arena simulator
[66]† | Resource and Deadline-aware scheduling in dynamic Hadoop clusters | Hadoop 1.2.1 | Receding horizon control algorithm, self-learning completion time estimator | PUMA benchmark (WordCount, TeraSort, Grep) | 21 Virtual Machines, 1 master and 20 slaves (1 CPU core, 2GB RAM)
[67]† | Fair4S: Modified Fair Scheduler | Hadoop 0.19 | Scheduler and modification to JobTracker | Synthesized workloads generated by Ankus | Discrete event-based MapReduce simulator
[68]† | Deadline-Constrained MapReduce Scheduling | Hadoop 1.2.1 | Bipartite Graph Modelling to perform BGMRS scheduler | WordCount, Sort, Grep | 20 virtual machines in 4 physical machines (Quad core 3.3 GHz, 32 GB RAM, 1TB disk), Gigabit Ethernet. MATLAB simulations for 3500 nodes
[69]† | ARIA (Automatic Resource Inference and Allocation) | Hadoop 0.20.2 | Service Level Objective scheduler to meet soft deadlines | WordCount, Sort, Bayesian classification, TF-IDF from Mahout, WikiTrends, Twitter workload | 66 nodes (4 AMD CPU cores, 8GB RAM, 2 160GB disks) in two racks, Gigabit Ethernet
[70]† | Scheduling for improved response time | Hadoop 1.2.4 | Dynamic Priority multi-queue scheduler as a plug-in | Text search, Sort, WordCount, PageRank | Cluster-1: 6 nodes (dual core CPU 3.2 GHz, 2GB RAM, 250GB disk), Cluster-2: 10 nodes (dual core CPU 3.2 GHz, 2GB RAM, 250GB disk), Gigabit Ethernet
[71]† | Scheduling for fast completion in MapReduce-like systems | - | 3-approximation algorithms; OFFA, online heuristic ONA | Synthesized workload | Simulations
[72]† | Dynamic Job Ordering and slot configuration | Hadoop 1.x | Greedy algorithms based on Johnson's rule for 2-stage flow shop | PUMA Benchmark (WordCount, Sort, Grep), synthesized Facebook traces | 20 nodes EC2 Extra Large instances (4 CPU cores, 15GB RAM, 4 420GB disks)
[73]† | Reduction of idle energy consumption due to waiting in NoSQL | HDFS 0.20.2, HBase 0.90.3, Hive 0.71, Cassandra 1.0.3, HadoopDB 0.1.1 | Energy consumption model | Loading, Grep, Selection, Aggregation, Join | 12 nodes (Intel i5-2300, 8GB RAM, 1TB disk)
[74]† | RoPE: reduce response time of relational querying | SCOPE | Query optimizer that piggybacks job execution | 80 jobs from major business groups | Bing's production cluster (tens of thousands of 64-bit, multi-core, commodity servers)
[75]† | BEneath the Task Level (BeTL) | Hadoop 2.2.0 | Algorithms to modify the fine-grained checkpoint strategy and workflow | HiBench Benchmark (WordCount, Hive MapReduce queries) | 16 nodes in Windows Azure (1 CPU core, 1.75GB RAM, 60GB disk), Gigabit Ethernet
[76]† | Job Slowdown Variability reduction | Hadoop 1.0.0 | Four algorithms; FBQ, TAGS, SITA, COMP | SWIM traces, Grep, Sort, WordCount | 6 nodes (24 dual-quad core, 24GB RAM, 50TB storage), 1 Gigabit Ethernet and 20 Gbit/s InfiniBand, Mumak simulations
[77]† | Memory-aware reduce tasks configurations | Hadoop 1.2.1 | Mnemonic mechanism to determine the number of reduce tasks | PUMA benchmarks (InvertedIndex, Ranked InvertedIndex, Self Join, Sequence Count, WordCount) | 1 node (4 CPU cores, 7GB RAM)
[78]† | SkewTune: mitigating skew in MapReduce applications | Hadoop 0.21.1 | Modification to job tracker and task tracker | Inverted Index, PageRank, CloudBurst | 20 nodes cluster (2 GHz quad-core CPU, 16GB RAM, 2 750GB disks)
[79]† | Mammoth: global memory management | Hadoop 1.0.1 | Memory management via the public pool | WordCount, Sort, WordCount with Combiner | 17 nodes (2 8 CPU cores 2.6 GHz Intel Xeon E5-2670, 32GB memory, 300GB SAS disk)
*Jobs/data placement, °Jobs Scheduling, †Completion time.
D. Benchmarking Suites, Production Traces, Modelling, Profiling Techniques, and Simulators for Big Data
Applications:
1) Benchmarking Suites:
Understanding the complex characteristics of big data workloads is an essential step toward optimizing the
configurations for the frameworks parameters used and identifying the sources of bottlenecks in the underlying
clusters. As with many legacy applications, MapReduce and other big data frameworks are supported by several
standard benchmarking suites such as [80]-[88]. These benchmarks have been widely utilized to evaluate the
performance of big data applications in different infrastructures either experimentally, analytically or via
simulations as summarized for the optimization studies in Tables I, III, V, and VI. Moreover, they can be used in
production environments for initial tuning and debugging purposes, in addition to stress-testing and bottlenecks
analysis before the actual run of intended commercial services.
The workloads contained in these benchmarks are typically described by their semantics and run on previously
collected or randomly generated datasets. Examples are text retrieval-based (e.g. Word-Count (WC), Word-Count
with Combiner (WCC), and Sort), and web search-based (e.g. Grep, Inverted Index, and Page Rank) workloads.
WC and WCC calculate the occurrences of each word in large distributed documents. WC and WCC differ in
that the reduction is performed entirely at the reduce stage in WC, whereas it is done partially at the map stage with the
aid of a combiner in WCC. Sort generates alphabetically sorted output from input documents.
Grep finds the match of regular expressions (regex) in input files, while Inverted Index generates a word-to-
document indexing for a list of documents. PageRank is a link analysis algorithm that measures the popularity of
web pages based on their referral by other websites. In addition to the above examples, computations related to
graphs and to machine learning such as k-means clustering are also considered in benchmarking big data
applications. These workloads vary in being I/O, memory, or CPU intensive. For example, Grep and
Sort are I/O intensive, while WC, PageRank, and k-means are CPU intensive [8]. This should be considered in
optimization studies to correctly address bottlenecks and targeted resources. For a detailed review of
benchmarking different specialized big data systems, the reader is referred to the survey in [317], where workloads
generation techniques, input data generation, and assessment metrics are extensively reviewed. Here, we briefly
describe some examples of the widely used big data benchmarks and their workloads as summarized in Table II
and listed below:
1) GridMix [80]: GridMix is the standard benchmark included within the Hadoop distribution. GridMix
has three versions and provides a mix of workloads synthesized from traces generated by Rumen [369]
from Hadoop clusters.
2) HiBench1 [81]: HiBench is a comprehensive benchmark suite provided by Intel for big data. HiBench
provides a wide range of workloads to evaluate computations speed, systems throughput, and resources
utilization.
3) HcBench [82]: HcBench is a Hadoop benchmark provided by Intel that includes workloads with a mix
of CPU, storage, and network intensive jobs with Gamma inter-job arrival times for realistic clusters
evaluations.
4) PUMA [83]: PUMA is a MapReduce benchmark developed at Purdue University that contains workloads
with different computational and networking demands.
5) Hive Benchmark [84]: The Hive benchmark contains 4 queries and targets comparing Hadoop with Pig.
6) PigMix [85]: PigMix is a benchmark that contains 12 query types to test the latency and scalability
performance of Apache Pig and MapReduce.
7) BigBench [86]: BigBench is an industry-based benchmark that provides queries over structured, semi-
structured, and unstructured data.
8) TPC-H [87]: The TPC-H benchmark, provided by the Transaction Processing Performance Council
(TPC), allows generating realistic datasets and performing several business oriented ad-hoc queries.
Thus, it can be used to evaluate NoSQL and RDBMS scalability, processing power, and throughput.
9) YCSB2 [88]: The Yahoo Cloud Serving Benchmark (YCSB) targets testing the inserting, reading,
updating and scanning operations in database-like systems. YCSB contains 20 records data sets and
provides a tool to create the workloads.
TABLE II
EXAMPLES OF BENCHMARKING SUITES FOR BIG DATA APPLICATIONS AND THEIR WORKLOADS.
Benchmark | Application | Workloads
GridMix [80] | Hadoop | Synthetic loadjob, Synthetic sleepjob
HiBench [81] | Hadoop, SQL, Kafka, Spark Streaming | Micro Benchmarks (Sort, WordCount, TeraSort, Sleep, enhanced DFSIO to test HDFS throughput), Machine Learning (Bayesian Classification, k-means, Logistic Regression, Alternating Least Squares, Gradient Boosting Trees, Linear Regression, Latent Dirichlet Allocation, Principal Components Analysis, Random Forest, Support Vector Machine, Singular Value Decomposition), SQL (Scan, Join, Aggregate), Websearch Benchmarks (PageRank, Nutch indexing), Graph Benchmark (NWeight), Streaming Benchmarks (Identity, Repartition, Stateful WC, Fixwindow)
HcBench [82] | Hadoop, Hive, Mahout | Telco-CDR (Call Data Records) interactive queries, Hive workloads (PageRank URLs, aggregates by source, average of PageRank), k-means clustering iterative jobs in machine learning, and TeraSort
PUMA [83] | Hadoop | Micro Benchmarks (Grep, WordCount, TeraSort), term vector, inverted index, self-join, adjacency-list, k-means, classification, histogram, histogram ratings, sequence count, ranked inverted index
Hive [84] | Hive | Grep selection, Ranking selection, user visits aggregation, user visits join
PigMix [85] | Pig | Explode, fr join, join, distinct agg, anti-join, large group by key, nested split, group all, order by 1 field, order by multiple fields, distinct + union, multi-store
BigBench [86] | RDBMS, NoSQL | Business queries (cross-selling, customer micro-segmentation, sentiment analysis, enhancing multi-channel customer experience, assortment and pricing optimization, performance transparency, return analysis, inventory management, price comparison)
TPC-H [87] | RDBMS, NoSQL | Ad-hoc queries for New Customer Web Service, Change Payment Method, Create Order, Shipping, Stock Management Process, Order Status, New Products Web Service Interaction, Product Detail, Change Item
YCSB [88] | Cassandra, HBase, Yahoo's PNUTS | Update heavy, Read heavy, Read only, Read latest, Short ranges
2) Production Traces:
1 Available at: https://github.com/intel-hadoop/hibench
2 Available at: https://github.com/brianfrankcooper/YCSB
As the above-mentioned benchmarks are defined by their semantics, where the codes' functionalities are known
and the jobs are submitted by a single user to run on deterministic datasets, they might be incapable of fully
representing production environments' workloads, where realistic mixtures of workloads with different data sizes
and inter-arrival times for multiple users coexist. Information about such realistic workloads can be provided in the
form of traces collected from previously submitted jobs in production clusters. Although sharing such information
for production environments is hindered by confidentiality, legal and business restrictions, several companies
have provided archived jobs traces while normalizing the resources usage. Examples of realistic evaluations based
on publicly available production traces were conducted by Yahoo [89], Facebook [90], [91], Cloudera and
Facebook [92], Google [93], [94], IBM [95], and in clusters with scientific [96], or business-critical workloads
[97]. These traces describe various job features such as number of tasks, data characteristics (i.e. input, shuffle
and output data sizes and their ratios), completion time, in addition to jobs inter arrival times and resources usage
without revealing information about the semantics or users sensitive data. Then, synthesized workloads can be
generated by running dummy codes to artificially generate shuffle and outputs data sizes that match the trace on
randomly generated data. To accurately capture the characteristics in the trace while running in testing clusters
with a scale smaller than production clusters, effective sampling should be performed.
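As a rough illustration of the trace-driven synthesis described above, the following sketch samples job inter-arrival times and data-size ratios from a small, hypothetical pre-parsed trace and emits scaled-down synthetic job specifications; the record fields, the values, and the 10% scaling factor are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class TraceSynthesizer {
  // One entry per historical job: inter-arrival time and input/shuffle/output sizing.
  static class TraceRecord {
    final long interArrivalMs; final double inputGB, shuffleRatio, outputRatio;
    TraceRecord(long t, double in, double sh, double out) {
      interArrivalMs = t; inputGB = in; shuffleRatio = sh; outputRatio = out;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // In a real study these entries would be parsed from published production traces.
    List<TraceRecord> trace = Arrays.asList(
        new TraceRecord(2_000, 1.5, 0.6, 0.1),
        new TraceRecord(500, 0.02, 1.0, 1.0),
        new TraceRecord(15_000, 300.0, 0.3, 0.05));

    double scale = 0.1;   // shrink data sizes to fit a smaller test cluster
    Random rng = new Random(42);

    for (int i = 0; i < 10; i++) {
      // Sample uniformly from the trace to preserve the observed workload mix.
      TraceRecord r = trace.get(rng.nextInt(trace.size()));
      Thread.sleep(r.interArrivalMs);   // reproduce job inter-arrivals (cf. Hadoop sleep jobs)
      double in = r.inputGB * scale;
      System.out.printf("submit dummy job %d: input=%.2f GB, shuffle=%.2f GB, output=%.2f GB%n",
          i, in, in * r.shuffleRatio, in * r.outputRatio);
      // A dummy MapReduce job would then run on randomly generated data of these sizes.
    }
  }
}
```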
Traces based on ten month log files for data intensive MapReduce jobs running in the M45 supercomputing
production Hadoop cluster for Yahoo were characterized in [89] according to utilization, job patterns, and sources
of failures. Facebook traces collected from a 600 nodes for 6 months and Yahoo traces were utilized in [90] to
provide comparisons and insights for MapReduce jobs in production clusters. Both traces were classified by k-
means clustering according to the number of jobs, input, shuffle, and output data sizes, map and reduce tasks, and
jobs durations into 10 and 8 bins for Facebook and Yahoo traces, respectively. Both traces indicated input data
sizes ranging between kBytes and TBytes. The workloads are labeled as small jobs which constitute most of the
jobs, load jobs with only map tasks, in addition to expand, aggregate, and transformation jobs based on input,
shuffle, and output data sizes. A framework that properly samples the traces to synthesize representative,
scaled-down workloads for use in smaller clusters was proposed, and sleep requests in Hadoop were utilized to emulate the inter-
arrivals of jobs. Facebook traces from 3000 machines with a total data size of 12 TBytes for
MapReduce Workloads with Significant Interactive Analysis were utilized in [91] to evaluate the energy
efficiency of allocating more jobs to idle nodes in interactive job clusters. Six Cloudera traces from e-commerce,
telecommunications, media, and retail users’ workloads and Facebook traces collected over a year for 2 Million
jobs were analyzed in [92]. Several insights for production jobs such as the weekly time series, and the burstiness
of submissions were provided. These traces are available in a public repository which also contains a synthesizing
tool; SWIM3 which is integrated with Hadoop. Although previously-mentioned traces contain rich information
about various jobs characteristics, the lack of per job resources utilization information makes them partially
representative for estimating workloads resources demands.
Google workload traces were characterized in [93] and [94] using k-means clustering based on their duration and CPU and memory resource usage per task to aid in capacity planning, forecasting demand growth, and improving task scheduling. Insights such as "most tasks are short", "the duration of long and short tasks follows a bimodal distribution", and "most of the resources are taken by a few extensive jobs" were provided. The traces4 were collected from a 12k-machine cluster in 2011 and include information about scheduling requests, taken actions, task submission times and normalized resource usage, and machine availability [94]. However, disk and networking resources were not covered. The traces collected in [95] from IBM-based private clouds for the banking, communication, e-business, production, and telecommunication industries further considered disk and file system usage in addition to CPU and memory. The inter-dependencies between the resources were measured, and disk and memory resources were found to be negatively correlated, indicating the potential benefit of co-locating memory-intensive and disk-intensive tasks.
A user-centric study was conducted in [96] based on traces5 collected from three research clusters: OPENCLOUD, M45, and WEB MINING. Workloads, configurations, and resource usage and sharing information were used to address the gaps between data scientists' needs and systems design. Evaluations of preferred applications and of the penalties of using default parameters were provided. Two large-scale and long-term
3 Available at: https://github.com/SWIMprojectucb/swim/wiki
4 Available at: https://github.com/google/cluster-data
5 Available at: http://www.pdl.cmu.edu/HLA/
traces6 collected from distributed data centers for business-critical workloads such as financial simulators were utilized in [97] to provide basic statistics, correlations, and time-pattern analysis. Full characteristics of the provisioned and actual usage of CPU, memory, disk I/O, and network I/O throughput resources were presented. However, no information about the inter-arrival times was provided. The Ankus MapReduce workload synthesizer was developed as part of the study in [67] based on e-commerce traces collected from Taobao, a 2k-node production Hadoop cluster. The inter-arrival times in the traces were found to follow a Poisson process.
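As a simple illustration of how such Poisson arrivals can be reproduced in a synthetic workload, the sketch below draws exponential inter-arrival times; the arrival rate is an assumed placeholder, not a value from the Taobao traces.

```python
# Generate job submission times with Poisson arrivals
# (i.e. exponentially distributed inter-arrival times).
import numpy as np

rng = np.random.default_rng(42)
rate_per_s = 0.05                                  # assumed mean of 1 job every 20 s
inter_arrivals = rng.exponential(scale=1.0 / rate_per_s, size=100)
submit_times = np.cumsum(inter_arrivals)           # absolute submission times in seconds
print(submit_times[:5])
```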
3) Modelling and Profiling Techniques:
Statistical characterization and modelling were considered for big data workloads based on production cluster traces as in [98], or on benchmarks as in [99] and [101]. Also, different profiling and workload modelling studies such as [102]-[105] were conducted with the aim of automating cluster configurations or estimating different performance metrics, such as the completion time, based on the selected configurations and resource availability. A statistics-driven workload generator was developed in [98] to evaluate the energy efficiency of MapReduce clusters under different scales, configurations, and scheduling policies. A framework to ease publishing production traces anonymously based on inter-arrival times, job mixes, resource usage, idle time, and data sizes was also proposed. The framework provides non-parametric statistics such as averages and standard deviations, and five-number percentile summaries (1st, 25th, 50th, 75th, and 99th percentiles) for inter-arrival times and data sizes. Statistical modelling of GridMix, Hive, and HiBench workloads, based on principal component analysis of 45 metrics and on regression models, was performed in [99] to provide performance estimations for Hadoop clusters under different workloads and configurations. BigDataBench7 was utilized in [100] to examine the performance of 11 representative big data workloads on modern superscalar out-of-order processors.
Correlation analysis was performed to identify the key factors that affect the Cycles Per Instruction (CPI) count
for each workload. Keddah8 was proposed in [101] as a toolchain to profile, empirically characterize, and reproduce Hadoop traffic based on end-host or switch traffic measurements. Flow-level traffic models can be derived by Keddah for use with network simulators under varied settings that affect networking requirements, such as the replication factor, cluster size, split size, and number of reducers. Results based on TeraSort, PageRank, and k-means workloads in dedicated and cloud-based clusters indicated high correlation with Keddah-based simulation results.
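Returning to the anonymized trace publishing framework of [98], the kind of non-parametric summary it reports can be computed directly; the sketch below is our own illustration with synthetic inter-arrival samples, showing averages, standard deviations, and the 1st/25th/50th/75th/99th percentiles.

```python
# Non-parametric summary of a trace attribute (e.g. inter-arrival times).
import numpy as np

def summarize(samples):
    samples = np.asarray(samples, dtype=float)
    percentiles = np.percentile(samples, [1, 25, 50, 75, 99])
    return {
        "mean": samples.mean(),
        "std": samples.std(ddof=1),
        "p1,p25,p50,p75,p99": percentiles.tolist(),
    }

inter_arrival_s = np.random.default_rng(1).exponential(20.0, size=10_000)  # synthetic
print(summarize(inter_arrival_s))
```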
To automate Hadoop cluster configurations, the authors in [102], [103] developed a cost-based optimization that utilizes an online profiler and a What-if Engine. The profiler uses the BTrace Java-based dynamic instrumentation tool to collect job profiles at run time for the data flows (i.e. input data and shuffle volumes) and to calculate the cost based on the program, input data, resources, and configuration parameters at task granularity. The What-if Engine contains a cost-based optimizer that utilizes a task scheduler simulator and model-based optimization to estimate the costs if different combinations of cost variables are used. This suits just-in-time configuration of computations that run periodically in production clusters, where the profiler can trial the execution on a small subset of the cluster to obtain the cost function, and the What-if Engine then obtains the best configurations to be used for the rest of the workload. In [104], a Hadoop performance model is introduced. It estimates completion time and resource usage based on Locally Weighted Linear Regression and Lagrange Multipliers, respectively. The non-overlapped and overlapped phases of shuffling and the number of reduce waves were carefully considered. Polynomial regression is used in [105] to estimate the CPU usage, in clock cycles, of MapReduce jobs based on the number of map and reduce tasks.
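As a hedged illustration of the latter idea, the sketch below fits a polynomial regression that maps the numbers of map and reduce tasks to a CPU cost; it is not the model of [105], and the data, degree, and coefficients are synthetic assumptions.

```python
# Fit a degree-2 polynomial regression: (num_maps, num_reduces) -> CPU cycles.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
maps = rng.integers(1, 500, size=200)
reduces = rng.integers(1, 50, size=200)
# Assumed ground truth: CPU cycles grow roughly linearly in task counts, plus noise.
cpu_cycles = 2e9 * maps + 8e9 * reduces + rng.normal(0, 1e10, size=200)

X = np.column_stack([maps, reduces])
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, cpu_cycles)
print(model.predict([[100, 10]]))      # estimated CPU cycles for a new job
```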
4) Simulators:
Tuning big data applications in clusters requires time-consuming and error-prone evaluations over a wide range of configurations and parameters. Moreover, the availability of a dedicated cluster or an experimental setup is not always guaranteed due to their high deployment costs. To ease tuning the parameters and to study the behaviour of big data applications in different environments, several simulation tools were proposed [106]-[119], and [148]. These simulators differ in their engines, scalability, flexibility with parameters, level of detail, and the support for additional features such as multiple disks and data skew. Mumak [106] is an Apache Hadoop simulator included in its distribution to simulate the behaviour of large Hadoop clusters by replaying previously generated traces. A built-in tool, Rumen [369], is included in Hadoop to generate these traces by extracting previous jobs' information
6 Available at: http://gwa.ewi.tudelft.nl/datasets/Bitbrains
7 Available at: http://prof.ict.ac.cn/BigDataBench
8 Available at: https://github.com/deng113jie/keddah
from their log files. Rumen collects more than 40 properties of the tasks. In addition, it provides the topology information to Mumak. However, Mumak simplifies the simulations by assuming that the reduce phase starts only after the map phase finishes; thus it does not accurately model shuffling and provides only rough completion time estimates. SimMR is proposed in [107] as a MapReduce simulator that focuses on modelling different resource allocation and scheduling approaches. SimMR is capable of replaying real workload traces as well as synthetic traces based on the statistical properties of the workloads. It relies on a discrete event-based simulator engine that accurately emulates Job Tracker decisions in Hadoop for map/reduce slot allocation, and a pluggable scheduling policy engine that simulates decisions based on the available resources. SimMR was tested on a 66-node cluster and was found to be more accurate and two orders of magnitude faster than Mumak. It simplifies the node modelling by assuming several cores but only one disk. MRSim9 is a discrete event-based MapReduce simulator that relies on SimJava and GridSim to test workload behaviour in terms of completion time and utilization [108]. The user provides cluster topology information and job specifications such as the number of map and reduce tasks, the data layout (i.e. locations and replication factor), and an algorithm description (i.e. number of CPU instructions per record and average record size) for the simulations. MRSim focuses on modelling multiple cores, a single disk, and network traffic, in addition to memory, buffer, merge, parallel copy, and sort parameters. The results of MRSim were validated on a single-rack cluster with four nodes.
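To make the trace-replay idea behind simulators such as Mumak and SimMR concrete, the following is a heavily simplified sketch (our own, not the implementation of [106] or [107]): jobs are replayed against fixed map and reduce slot pools, jobs are treated independently (no inter-job contention), and, as in Mumak, the reduce phase is assumed to start only after all map tasks finish.

```python
# Minimal trace-replaying MapReduce "simulator" sketch.
import heapq

def run_phase(task_durations, slots, start):
    """Greedy list scheduling of tasks onto 'slots' identical slots."""
    free_at = [start] * slots            # min-heap of slot availability times
    heapq.heapify(free_at)
    end = start
    for d in task_durations:
        t = heapq.heappop(free_at)       # earliest available slot
        heapq.heappush(free_at, t + d)
        end = max(end, t + d)
    return end

def simulate(jobs, map_slots, reduce_slots):
    """jobs: list of (submit_time, [map_durations], [reduce_durations])."""
    completion = {}
    for job_id, (submit, maps, reduces) in enumerate(jobs):
        # Jobs are simulated independently, i.e. no contention between jobs.
        map_end = run_phase(maps, map_slots, start=submit)
        reduce_end = run_phase(reduces, reduce_slots, start=map_end)
        completion[job_id] = reduce_end - submit      # job completion time
    return completion

if __name__ == "__main__":
    trace = [(0.0, [30.0] * 8, [60.0] * 2),    # job 0: 8 maps, 2 reduces
             (10.0, [45.0] * 4, [90.0])]       # job 1: 4 maps, 1 reduce
    print(simulate(trace, map_slots=4, reduce_slots=2))
```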
The authors in [109] proposed MR-cloudsim as a simulation tool for MapReduce in cloud computing data centers. MR-cloudsim is based on the widely used open-source event-driven cloud systems simulator CloudSim [110]. In MR-cloudsim, several simplifications are assumed, such as assigning one reduce per map and allowing the reduce phase to start only after the map phase finishes. To assist with MapReduce cluster design and testing, MRPerf10 is proposed in [111] to provide fine-grained simulations focusing on modelling the activities inside the nodes, the disk and data parameters, and the inter- and intra-rack network configurations. MRPerf is based on Network Simulator-2 (ns-2)11, a packet-level simulator, and DiskSim12, an advanced disk simulator. A MapReduce heuristic is used to model Hadoop behaviour and perform scheduling. In MRPerf, the user provides the topology information, the data layout, and the job specifications, and obtains a detailed phase-level trace that contains information about the completion time and the volume of transferred data. However, MRPerf needs several minutes per evaluation and is limited to modelling only one replica and a single disk per node. Also, it does not model speculative execution and simplifies the I/O and computation processes by not overlapping them. MRemu13 in [112] is an emulation-based framework based on Mininet 2.0 for MapReduce performance evaluations in terms of completion time in different data center networks. The user can emulate arbitrary network topologies and assign bandwidth, packet loss ratio, and latency parameters. Moreover, network control based on SDN can be emulated. A layered simulation architecture, CSMethod, is proposed in [113] to map the interactions between different software and hardware entities at the cluster level with big data applications. Detailed models for the JVMs, name nodes, data nodes, JT, TT, and scheduling were developed. Furthermore, diverse hardware choices such as components, specifications, and topologies were considered. Similar approaches were also used in [114] and [115] to simulate NoSQL and Hive applications, respectively. As part of their study, the authors in [136] developed a network flow-level discrete-event simulator named PurSim to aid with simulating MapReduce executions in up to 200 virtual machines.
A few recent articles considered the modelling and simulation of YARN environments [116]-[119]. The YARN Scheduler Load Simulator (SLS) is included in the Hadoop distribution to be used with Rumen to evaluate the performance of different YARN scheduling algorithms with different workloads [116]. SLS utilizes a single JVM to exercise a real RM with thread-based simulators for the NM and AM. However, SLS ignores network effects as the NM and AM simulators interact with the RM only via heartbeat events. The authors in [117] suggested extending SLS with an SDN-based network emulator, MaxiNet, and a data center traffic generator, DCT2, to add realistic modelling for the network and traffic in YARN environments and to emulate the interactions between job scheduling and flow scheduling. The Real-Time ABS language is utilized in [118] to develop the ABS-YARN simulator, which focuses on prototyping YARN and modelling job executions. YARNsim in [119] is a
9 Available at: http://code.google.com/p/mrsim
10 Available at: https://github.com/guanying/mrperf
11 Available at: http://www.isi.edu/nsnam/ns
12 Available at: http://www.pdl.cmu.edu/DiskSim/
13 Available at: https://github.com/mvneves/mremu
parallel discrete-event simulator for YARN that provides comprehensive, protocol-level accurate simulations of task executions and data flow. Detailed modelling of networking, HDFS, data skew, and I/O read and write latencies is provided. However, YARNsim simplifies scheduling policies and fault tolerance. A comprehensive set of Hadoop benchmarks, in addition to bioinformatics clustering applications, was utilized for experimental validation and an average error of 10% was achieved.
Fig. 7. Physical machines with (a) VMs and (b) containers.
Virtualization increases the utilization and energy efficiency of cloud infrastructures since, with proper management, VMs can be efficiently assigned and seamlessly migrated while running, and can thus be consolidated into fewer physical machines. More machines can then be switched to low-power or sleep modes. Managing and monitoring VMs in cloud data centers has been realized through several management platforms (i.e. hypervisors) such as Xen [377], KVM [378], and VMware [379]. Also, virtual infrastructure management tools such as OpenNebula and VMware vSphere were introduced to support the management of virtualized resources in heterogeneous environments such as hybrid clouds [374]. However, compared to ``bare-metal'' implementations (i.e. native use of physical machines), virtualization can add performance overheads related to the additional software layer, memory management requirements, and the virtualization of I/O and networking interfaces [380], [381]. Moreover, additional networking overheads are encountered when migrating VMs between several clouds for load balancing and power saving purposes through commercial or dedicated inter-data-center networks [382]. These migrations, if done concurrently by different service providers, can lead to increased network congestion and exceeded delay limits, and hence to service performance fluctuations and SLA violations [383]. In the context of VMs, several studies considered optimizing their placement and resource allocation to improve big data applications' performance. Overcoming the overheads of using VMs with big data applications is also considered, as detailed in Subsection V-B.
2) Network Virtualization:
To complement the benefits of the mature VM technologies used in cloud computing systems, virtualizing networking resources has also been considered [384], [385]. Network virtualization enables efficient use of cloud networking resources by allowing multiple heterogeneous Virtual Networks (VNs), also known as slices, composed of virtual links and nodes (i.e. switches and routers), to coexist on the same physical (substrate) network. As with VMs, VNs can be created, updated, migrated, and deleted upon need and hence allow customizable and scalable on-demand allocations. The assignment and mapping of VNs onto the physical network resources can be performed through offline or online Virtual Network Embedding (VNE) algorithms [386]. VNE can target different goals such as maximizing revenue and resilience through effective link remapping and virtual node migrations, in addition to energy efficiency by consolidating over-provisioned resources [387], [388]. Figure 8 illustrates the concept of VNs and VNE where three different VNs share the physical network. Several challenges can be encountered with network virtualization due to the large scale and the heterogeneous and autonomous nature of cloud networking infrastructures [384]. Also, different infrastructure providers (InPs) can have conflicting goals and non-unified QoS measures for their network services.
4) Container-based Virtualization:
Container-based virtualization is a recently introduced technology for cloud computing infrastructures proposed to reduce the overheads of VMs, as containers can help provide up to five times better performance [407]. While hypervisors perform the isolation at the hardware level and require an independent guest OS for each VM, containers perform the isolation at the OS level and can thus be regarded as a lightweight alternative to VMs [408]. In each physical machine, the containers share the OS kernel and provide isolation through separate user spaces, for example by using Linux kernel containment (LKC) user-space interfaces. Figure 7 compares the components of physical machines when hosting VMs (Fig. 7(a)) and containers (Fig. 7(b)). In addition to sharing the OS, containers can also share the binary and library files of the applications running on them. With these features, containers can be deployed, migrated, restarted, and terminated faster, and can be deployed in larger numbers on a single machine compared to VMs. However, due to sharing the OS, containers can be less secure than VMs. To improve their security, containers can be deployed inside VMs and share the guest OS [409] at the cost of reduced performance.
Some examples of Linux-based container engines are Linux-VServer, OpenVZ, and Linux Containers (LXC). The performance isolation, ease of resource management, and overheads of these systems for MapReduce clusters have been addressed in [410], and near bare-metal performance was reported. Docker [407] is an open-source and widely used container manager that extends LKC with kernel and application APIs within the containers [411]. Docker has been used to launch the containers within YARN (e.g. in Hadoop 2.7.2) to provide better software environment assembly and consistency, and elasticity in assigning the resources to different components within YARN (e.g. map or reduce containers) [412]. Besides YARN, other cloud resource management platforms such as Mesos and Quasar have successfully adopted containers for resource management [160]. Similar to YARN, V-Hadoop was proposed in [413] to enable the use of Linux containers for Hadoop. V-Hadoop scales the number of containers used according to the resource usage and availability in cloud environments.
5) Software-defined Networking (SDN):
Software-defined networking (SDN) is an evolving paradigm for networking infrastructures that separates the control plane, which generates the rules for where and how data packets are forwarded, from the data plane, which handles the packets received at the device according to those rules [414]. Legacy networks, as depicted in Figure 10(a), contain networking devices with vendor-specific integrated data and control planes that are hard to update and scale, while software-defined networks introduce a broader level of programmability and flexibility in networking operations while providing a unified view of the entire network. SDN architectures can have centralized or semi-centralized controllers distributed across the network [415] to monitor and operate networking devices such as switches and routers while considering them as simple forwarding elements [416], as illustrated in Figure 10(b).
Software-defined networks have three main layers, namely infrastructure, control, and application. The infrastructure layer, which is composed of SDN-enabled devices, interacts with the control layer through a Southbound Interface (SBI), while the application layer connects to the control layer through a Northbound Interface (NBI) [415], [417]. Several protocols for the SBI, such as OpenFlow, were developed as an effort to standardize and realize SDN architectures [418]-[420]. OpenFlow performs the switching at flow granularity (i.e. a group of sequenced packets with a common set of header fields) where each forwarding element contains a flow table that receives rule updates dynamically from the SDN controllers. Examples of commercially available centralized OpenFlow controllers are NOX, POX, Trema, Ryu, FloodLight, Beacon, and Maestro, and of distributed controllers are Onix, ONOS, and HyperFlow [420]-[422]. Software-based OpenFlow switches such as Open vSwitch (OVS) were also introduced to enable SDN in virtualized environments and to ease the interactions between hypervisors, container engines, various application elements, and the software-based SDN controller while connecting several physical machines [423]. An additional tool that aids SDN control at a finer granularity is the Programming Protocol-independent Packet Processors (P4) high-level language, which enables arbitrary modifications to packets in forwarding elements at line rate [424].
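Conceptually, flow-based forwarding can be pictured as a table of match/action rules populated by the controller; the sketch below is an abstract illustration only, not the OpenFlow protocol or any controller API, and the header fields and actions are invented placeholders.

```python
# Abstract flow-table sketch: match packets on header fields, apply an action,
# and punt to the controller on a table miss.
flow_table = []   # ordered list of (match_fields, action), highest priority first

def install_rule(match, action):
    """Invoked by the (hypothetical) controller over the southbound interface."""
    flow_table.append((match, action))

def forward(packet):
    """Return the action of the first rule whose fields all match the packet."""
    for match, action in flow_table:
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    return "send_to_controller"          # table miss: ask the controller

install_rule({"ip_dst": "10.0.1.5", "tcp_dst": 80}, "output:port_3")
install_rule({"ip_dst": "10.0.1.6"}, "drop")

print(forward({"ip_src": "10.0.0.2", "ip_dst": "10.0.1.5", "tcp_dst": 80}))
```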
The flexibility and agility benefits that result from adopting SDN in large-scale networks have been experimentally validated in several testbeds [425] and in commercial Wide Area Networks (WANs) that interconnect geo-distributed cloud data centers, such as Google's B4 WAN [426] and Microsoft's Software-driven WAN (SWAN) [427]. SDN in the WAN supports various Traffic Engineering (TE) operations and improves congestion management, load balancing, and fault tolerance. Also, SDN relaxes the over-provisioning requirements of traditional WANs, which are typically only 30-60% utilized in order to preserve resilience and performance [421], [422].
The programmability of SDN enables fine-grained integration of adaptive routing and flow aggregation protocols with additional resource allocation and energy-aware algorithms across different network layers [428]. This allows dynamic configuration and ease of implementation for networking applications with scalability requirements and dynamic traffic. These concepts have found widespread use in cloud computing services and big data applications [429], [430], NFV [391]-[393], and wireless networks [405], in addition to intra-data-center networking [421], [431], as will be discussed in Subsection VI-E.
Fig. 10. Architectures of (a) legacy and (b) SDN-enabled networks.
C. Challenges and requirements for Big Data Applications deployments in Cloud Environments:
Although Hadoop and other big data frameworks were originally designed and provisioned to run in dedicated clusters under controlled environments, several cloud-based services, enabled by leasing resources from public, private, or hybrid clouds, are offering big data computations to public users, aiming to increase the profit and utilization of cloud infrastructures [451]. Examples of such services are Amazon Elastic MapReduce (EMR)14, Microsoft's Azure HDInsight15, VMware's Serengeti project [452], Cloud MapReduce [120], and Resilin [453]. Cloud-based implementations are increasingly considered as cost-effective, powerful, and scalable delivery models for big data analytics and can be preferred over cluster-based deployments, especially for interactive workloads [454]. However, to suit cloud computing environments, conventional big data applications are required to adopt several changes in their frameworks [124], [148]. Also, to suit big data applications and to meet the SLAs and QoS metrics of the heterogeneous demands of multiple users, cloud computing infrastructures are required to provide on-demand elastic and resilient services with adequate processing, storage, memory, and networking resources [126]. Here we identify and discuss some key challenges and requirements for big data applications in cloud environments:
Framework modifications: Single-site cluster implementations typically couple data and compute nodes (e.g. name nodes and JVMs coexist in the same machines) to realize data locality. In virtualized cloud environments, and because elasticity is favored for computing while resilience is favored for storage, the two entities are typically decoupled or split [142]. For example, VMware follows this split Hadoop architecture, and Amazon EMR utilizes S3 for storage and EC2 instances for computing. This requires an additional step of loading data into the computing VMs before running jobs [148]. Such transfers are typically free of charge within the same geographical zone to promote the use of both services, but are charged if carried out between different zones. For cloud-based RDBMSs, and as discussed in Section II-C, the ACID requirements are replaced by the BASE requirements, where either availability or consistency is relaxed to guarantee partition tolerance, which is a critical requirement in cloud environments [455]. Another challenge associated with the scale and heterogeneity of cloud implementations is that application debugging cannot be performed in smaller configurations as in single-site environments and requires tracing at the actual scale [313].
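A minimal sketch of such a data-loading step, assuming an EMR-style split of S3 storage and compute instances, is shown below using the boto3 client; the bucket, prefix, and local paths are hypothetical placeholders.

```python
# Copy input objects from object storage (S3) to the compute VM's local storage
# before launching the job. Assumes valid AWS credentials are configured.
import os
import boto3

s3 = boto3.client("s3")
bucket, prefix = "my-input-bucket", "logs/2017-03/"   # hypothetical locations
os.makedirs("/data/input", exist_ok=True)

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        local_path = "/data/input/" + obj["Key"].split("/")[-1]
        s3.download_file(bucket, obj["Key"], local_path)
```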
Offer selection and pricing: The availability of a large number of competing Cloud Service Providers (CSPs), each offering VMs with different characteristics, can be confusing to users, who must define their own policies and requirements. To ease optimizing Hadoop-based applications for inexperienced users, some platforms such as
14 Available at: https://aws.amazon.com/emr/
15 Available at: https://azure.microsoft.com/en-gb/services/hdinsight/
Amazon EC216 provide a guiding script to aid in estimating computing requirements and tuning configurations. However, unified methods for estimation and comparison are not available. CSPs are also challenged with resource allocation while meeting revenue goals, as surveyed in [456] at different networking layers, where different economic and pricing models are considered.
Resources allocation: Cloud-based deployments for big data applications require dynamic resource allocation at run-time, as newly generated data arrivals and volumes are unprecedented and production workloads might change periodically or irregularly. Fixed resource allocation or pre-provisioning can lead either to under-provisioning, and hence QoS violations, or to over-provisioning, which leads to increased costs [120]. Moreover, cloud-based computations might experience performance variations due to interference caused by shared resource usage. Such variations require careful resource allocation, especially for scientific workflows, which also require elasticity in resource assignments as the requirements of different stages vary [457]. Data-intensive applications and scientific workflows in clouds face data management challenges (i.e. transfers between storage and compute VMs) and data transfer bottlenecks over WANs [458].
QoS and SLA guarantees: In cloud environments, guaranteeing and maintaining QoS metrics such as response time, processing time, trust, security, failure rate, and maintenance time to meet SLAs, which are the agreements between CSPs and users about service requirements and violation penalties, is a challenging task due to several factors, as discussed in the survey in [459]. With multiple tenants sharing the infrastructure, QoS metrics for each user should be predictable and independent of co-existence with other users, which requires efficient resource allocation and performance monitoring. A QoS metric that directly impacts revenue is the response time of interactive services. It was reported in [460] and [461] that a latency of 100 ms in search results caused a 1% loss in Amazon's sales, while for Google a latency of 500 ms caused a 20% sales drop and a speed-up of 5 seconds resulted in a 10% sales increase. QoE in video services is also impacted by latency. It was measured in [462] that with more than 2 seconds of delay in content delivery, 60% of the users abandon the service. Different interactive services depending on big data analytics are expected to have similar impacts on revenue and customer behavior.
Resilience: Increasing the reliability, availability, and data confidentiality, and ensuring the continuity of the services provided by cloud infrastructures and applications against cyber-attacks or system failures, are critical requirements, especially for sensitive services such as banking and healthcare applications. Resiliency in cloud infrastructures is approached at different layers, including the computing and networking hardware layer, the middleware and virtualization layers, and the applications layer, through efficient replication, checkpointing, and cloud collaboration, as extensively surveyed in [463].
Energy consumption: A drawback of cloud-based applications is that the energy consumption at both the network and end-user devices can exceed the energy consumption of local deployments [464], [465].
D. Options for Big Data Applications deployment in Geo-distributed Cloud Environments:
CSPs place their workloads and content in geo-distributed data centers to improve the quality of their services, and rely on multiple Internet Service Providers (ISPs) or dedicated WANs [426], [427] to connect these data centers. Advantages such as load balancing, increased capacity, availability and resilience against catastrophic failures, and reduced latency by being close to the users are attained [440]. Such environments attract commercial and non-commercial big data analytics as they match the geo-distributed nature of data generation and provide scales beyond single-cluster implementations. Thus, there is recent interest in utilizing geo-distributed data centers for big data applications despite the challenging requirements, as surveyed in [323] and suggested in [466]. Figure 11 illustrates different scenarios for deploying big data applications in geo-distributed data centers and compares single-site clusters (case A) with various geo-distributed infrastructures for big data, including bulk data transfers (case B), cloud bursting (case C), and different implementations of geo-distributed big data frameworks (case D), as outlined in [177]. Case B refers to legacy and big data bulk transfers between geo-distributed data centers, including data backups, content replications, and VM migrations, which can reach between 330 TB and 3.3 PB a month [164] and require cost- and performance-optimized scheduling and routing. Case C is for hybrid frameworks, where a private cloud bursts some of its workloads to public clouds for various goals such as reducing costs or accelerating completion time. Such scenarios require interfacing with
16 Available at: https://wiki.apache.org/hadoop/AmazonEC2
public cloud frameworks, in addition to profiling for performance estimation to ensure a gain from bursting compared to localized computations.
Fig. 11. Different scenarios for deploying big data applications in geo-distributed cloud environments.
Geo-distributed big data frameworks in public clouds have three main deployment modes [177]. Case D-1 copies all input data into a single data center prior to processing. This case is cost-effective and less complex in terms of framework management and computation, but encounters delays due to WAN bottlenecks. Case D-2 distributes the map-like tasks of a single job across geo-distributed locations to locally process input data. Then, intermediate results are shuffled from the geo-distributed locations to a single location for the final reduce-like tasks that compute the final output. This arrangement suits workloads whose intermediate data sizes are much smaller than the input data. Although networking overheads can be reduced compared to case D-1, this case requires complex geo-aware control of the distributed framework components and experiences task performance fluctuations under heterogeneous cloud computing capabilities. The last case, D-3, creates a separate job in each location and transmits the individual output results to a single location, which launches a final job for results aggregation. This realization relaxes the fine-grained control requirements of the distributed framework in D-2, but is considered costly due to the large number of jobs. Also, it only suits associative workloads, such as 'averages' or 'counting', where the computations can be performed progressively in stages.
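A toy example of why case D-3 is restricted to associative computations: each site's job emits a partial (sum, count) pair, and the final job combines the partials into a global average without moving the raw inputs; the per-site values below are invented.

```python
# Each site runs a local job and ships only a small partial result over the WAN.
partials = []
for site_values in ([3.0, 5.0, 7.0],        # site 1 local job output
                    [10.0, 20.0],           # site 2
                    [1.0, 1.0, 1.0, 1.0]):  # site 3
    partials.append((sum(site_values), len(site_values)))   # (sum, count) per site

# Final aggregation job at a single location.
total_sum = sum(s for s, _ in partials)
total_count = sum(c for _, c in partials)
print("global average:", total_sum / total_count)
```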
Although such options provide flexibility in implementing geo-distributed big data applications, some unavoidable obstacles can be encountered. For privacy and governance regulation reasons, copying data beyond their regions might be prohibited [467]. Also, due to HDFS federation (i.e. data nodes cannot register in other locations governed by a different organization) and the possibility of using different DFSs, additional code to unify data retrieval is required.
E. Big Data Implications on the Energy Consumption of Cloud Networking Infrastructures:
Driven by the economic, environmental, and social impacts of the increased CAPEX, OPEX, global Greenhouse Gas (GHG) emissions, and carbon footprints resulting from the expanding demand for Internet-based services, tremendous efforts have been devoted by industry and academia to reduce the power consumption and increase the energy efficiency of transport networks [468]-[472]. These services, empowered by fog, edge, and cloud computing and various big data frameworks, incur huge traffic loads on networking infrastructures and computational loads on hosting data centers, which in turn increase the power consumption and carbon footprint of these infrastructures [1], [473]-[475]. The energy efficiency of utilizing different technologies for wireless access networks has been addressed in [476]-[481], and for wired PONs and hybrid access networks in [482]-[488]. Core networks, which interconnect cloud data centers with metro and access networks containing IoT and user devices, transport huge amounts of aggregated traffic. Therefore, optimizing core networks plays an important role in improving the energy efficiency of the cloud networking infrastructures challenged by big data. The reduction of energy consumption and carbon footprint in core networks, mainly IP over WDM networks, has been widely considered in the literature by optimizing the design of their systems, devices, and/or routing protocols [432], [489]-[506], utilizing renewable energy sources [507]-[513], and optimizing the resource assignment and content placement of different Internet-based applications [387], [514]-[527].
The early positioning study in [489] on greening the Internet addressed the impact of coordinated and uncoordinated sleeps (for line cards, crossbars, and main processors within switches) on switching protocols such as Open Shortest Path First (OSPF) and Internal Border Gateway Protocol (IBGP). Factors such as how, when, and where to put devices to sleep, and the overheads of redirecting traffic and awakening devices, were addressed. The study pointed out that energy savings are feasible but challenging due to the modifications required in devices and protocols. In [490], several energy minimization approaches were proposed, such as Dynamic Voltage Scaling (DVS) and Dynamic Frequency Scaling (DFS) at the circuit level, and efficient routing based on equipment with efficient energy profiles at the network level. The consideration of switching off idle nodes and rate adaptation has also been reported in [491]. The energy efficiency of the bypass and non-bypass virtual topologies and traffic grooming schemes in IP over WDM networks has been assessed in [492] through Mixed Integer Linear Programming (MILP) and heuristic methods. The non-bypass approach requires O/E/O conversion of lightpaths (i.e. traffic carried optically in fiber links and optical devices) at all intermediate nodes, to be processed electronically in the IP layer and routed to the following lightpaths. On the other hand, the bypass approach omits the need for O/E/O conversion at intermediate nodes, and hence reduces the number of IP router ports needed, achieving power consumption savings between 25% and 45% compared to the non-bypass approach. In [493], a joint optimization of the physical topology of core IP over WDM networks, the energy consumption, and the average propagation delay is considered under bypass or non-bypass virtual topologies for symmetric and asymmetric traffic profiles. An additional 10% saving was achieved compared to the work in [492].
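The port-saving argument behind lightpath bypass can be illustrated with a back-of-the-envelope calculation; the sketch below is our own simplification (not the MILP of [492]) that, for a few assumed demands, counts one set of IP router ports per terminated lightpath hop in the non-bypass case versus only at the lightpath ends in the bypass case.

```python
# Simplified comparison of IP router port counts: non-bypass terminates the
# lightpath at every hop, bypass only at the source/destination pair.
import math

port_rate_gbps = 40.0
# (demand in Gbps, number of physical hops on its path) for assumed flows
demands = [(100.0, 3), (60.0, 5), (25.0, 2)]

non_bypass_ports = sum(hops * math.ceil(d / port_rate_gbps) for d, hops in demands)
bypass_ports = sum(1 * math.ceil(d / port_rate_gbps) for d, hops in demands)

print(non_bypass_ports, bypass_ports)   # bypass needs far fewer IP ports
```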
Traffic-focused optimizations for IP over WDM networks were also considered, for example in [494]-[499]. Optimizing static and dynamic traffic scheduling and grooming was considered in [494]-[497] in normal and post-disaster situations to reduce the energy consumption and demand blocking ratio. Techniques such as utilizing excess capacity, traffic filtering, protection path utilization, and service differentiation were examined. To achieve a lossless reduction of the transmitted traffic, the use of Network Coding (NC) in non-bypass IP over WDM networks was proposed in [498], [499]. Network-coded ports encode bidirectional traffic flows via XOR operations, and hence reduce the number of router ports required compared to un-coded ports. The energy efficiency and resilience trade-offs in IP over WDM networks were also addressed, as in [500]-[503]. The impact on energy consumption of restoration after link cuts and core node failures was addressed in [500]. An energy-efficient NC-based 1+1 protection scheme was proposed in [501] and [502], where the encoding of multiple flows sharing protection paths in non-bypass IP over WDM networks was optimized. MILP, heuristics, and closed-form expressions for the networking power consumption as a function of the hop count, network size, and demands indicated power savings of up to 37% compared to conventional 1+1 protection. The authors in [503] optimized the traffic grooming and the assignment of router ports to protection or working links under different protection schemes while considering sleep modes for protection ports and cards. Up to 40% saving in power consumption was achieved.
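The XOR-based port saving of [498], [499] can be illustrated with a toy example: a node carrying two flows in opposite directions transmits their XOR once, and each end recovers the opposite packet using the copy of its own transmission; the payloads below are placeholders.

```python
# Toy XOR network-coding example for two equal-length packets travelling
# in opposite directions (A->B and B->A).
def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

pkt_ab = b"payload-from-A-to-B"
pkt_ba = b"payload-from-B-to-A"

coded = xor_bytes(pkt_ab, pkt_ba)       # transmitted once instead of twice

# Each end knows what it sent, so it can recover the other packet.
assert xor_bytes(coded, pkt_ba) == pkt_ab
assert xor_bytes(coded, pkt_ab) == pkt_ba
```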
Utilizing renewable energy resources such as solar and wind to reduce non-renewable energy usage in IP over WDM networks with data centers was proposed in [507] and [508]. Factors such as the average availability of renewable energy and its transmission losses, regular and inter-data-center traffic, and the network topology were considered to optimize the locations of the data centers, and an average reduction of 73% in non-renewable energy usage was achieved. The work in [509] considered periodic reconfiguration of the virtual topologies in IP over WDM networks based on a "follow the sun, follow the wind" operational strategy. Renewable energy was also considered in IP over WDM networks for cloud computing to green their traffic routing [510], content distribution [511], services migration [512], and VNE assignments [513].
The energy efficiency of Information-Centric Networks (ICNs) and Content Distribution Networks (CDNs) was extensively surveyed in [512] and [514], respectively. CDNs are scalable services that cache popular content throughout ISP infrastructures, while ICNs support name-based routing to ease access to content. The placement of data centers and their content in IP over WDM core nodes was addressed in [516] while considering the energy consumption, propagation delay, and users' upload and download traffic. An early effort to green the
Internet [517] suggested distributing Nano Data Centres (NaDa) next to home gateways to provide various caching
services. In [518], the energy efficiency of Video-on-Demand (VoD) services was examined by evaluating five
strategic caching locations in core, metro, and access networks. The work in [519]-[521] addressed the energy
efficiency of Internet Protocol Television (IPTV) services by optimizing video content caching in IP over WDM
networks while considering the size and power consumption of the caches and the popularity of the content. To maximize the cache hit rate, the dynamics of TV viewing behavior throughout the day were explored. Several optimized content replacement strategies were proposed and up to 89% power consumption reduction was achieved compared to networks with no caching. The energy efficiency of Peer-to-Peer (P2P) protocol-based
CDNs in IP over WDM networks was examined in [522] while considering the network topology, content
replications, and the behaviors of users. In [523], the energy efficiency and performance of various cloud
computing services over non-bypass IP over WDM networks under centralized and distributed computing modes
were considered. Energy-aware MILP models were developed to optimize the number, location, capacity and
contents of the clouds for three cloud services namely; content delivery, Storage-as-a-Service (SaaS), and virtual
machines (VM)-based applications. An energy efficient cloud content delivery heuristic (DEER-CD) and a real-
time VM placement heuristic (DEERVM) were developed to minimize the power consumption of these services.
The results showed that replicating popular contents and services in several clouds yielded 43% power saving
compared to centralized placements. The placement of VMs in IP over WDM networks for cloud computing was optimized in [524] while considering their workloads, intra-VM traffic, number of users, and replica distribution, and an energy saving of 23% was achieved compared to single-location placements. The computing and networking energy efficiency of cloud services realized with VMs and VNs in scenarios using a server, a data center, or multiple geo-distributed data centers was considered in [387]. A real-time heuristic for Energy Optimized Virtual Network Embedding (REOViNE), which considers the delay, client locations, load distribution, and efficient energy profiles for data centers, was proposed, and up to 60% power savings were achieved compared to bandwidth-cost-optimized VNE. Moreover, the spectral and energy efficiencies of O-OFDM with adaptive modulation formats and an ALR power profile were examined.
To bridge the gap between traffic growth and networking energy efficiency in wired access, mobile, and core networks, GreenTouch17, a leading Information and Communication Technology (ICT) research consortium composed of fifty industrial and academic contributors, was formed in 2010 to provide architectures and specifications targeting energy efficiency improvements by a factor of 1000 in 2020 compared to 2010. As part of the GreenTouch recommendations, and to provide ISP operators with a road map for the energy-efficient design of cloud networks, the work in [504], [506] proposed a combined consideration of the IP over WDM design approaches and the cloud networking-focused approaches in [387] and [523]. The design approaches jointly consider optical bypass, sleep modes for components, efficient protection, MLR, optimized topology and routing, in addition to improvements in hardware, where two scenarios, Business-As-Usual (BAU) and BAU with GreenTouch improvements, are examined. Evaluations on the AT&T core network with realistic locations for 7 data centers and 2020 projected traffic, based on the Cisco Visual Networking Index (VNI) forecast and a population-based gravity model, indicated energy efficiency improvements of 315x compared to 2010 core networks. Focusing on big data and its applications, the work in [526], [527] addressed improving the energy efficiency of transport networks while considering the different "5V" characteristics of big data, and suggested progressive processing in intermediate nodes as the data traverse from the sources to central data centers. A tapered network that utilizes limited processing capabilities in core nodes, in addition to 2 optimally selected cloud data centers, is proposed in [527], and an energy consumption reduction of 76% is achieved compared to centralized processing. In [527], the effect of data volumes on the energy consumption is also examined. The work in the above-mentioned two papers is extended in [173], [174] and is further explained in Section V along with other energy efficiency and/or performance-related studies for big data applications in cloud networking environments.
17 Available at: www.greentouch.org
of infrastructures and the uncertainty of resource availability. Thus, a wide range of optimization objectives and utilizations of cloud technologies and infrastructures are proposed. This Section is organized as follows: Subsection V-A focuses on cloud resource management and optimization for applications, while Subsection V-B addresses studies on VM and container placement and resource allocation optimization. Subsection V-C discusses optimizations for big data bulk transfers between geo-distributed data centers and for inter-data-center networking with big data traffic. Finally, Subsection V-D summarizes studies that optimize big data applications while utilizing SDN and NFV. The studies presented in this Section are summarized in Tables III and IV for generic cloud-based and specific big data frameworks, respectively.
TABLE IV
SUMMARY OF CLOUD NETWORKING-FOCUSED OPTIMIZATION STUDIES FOR GENERIC CLOUD APPLICATIONS
Ref Objective Tools Benchmarks/workloads Experimental Setup/Simulation environment
[125]* Energy efficiency in cloud data energy consumption MapReduce synthesized Simulations
centers with different granularities evaluation workloads, video streaming
[126]* Cloud recommendation system based Analytic Hierarchy Process Information about cloud Single machine as master, server from NeCTAR
on multicriteria QoS optimization (AHP) decision making providers (e.g. prices, locations) cloud, small EC2 instance, andC3.8xlarge EC2 instance
[129]* QoS and heterogeneity-aware Application Lightweight controller Single and multi-threaded std Homogeneous 40 nodes clusters with 10 configurations,
to Datacenter Server Mapping (ASDM) in cluster schedulers benchmarks, Microsoft workloads heterogeneous 40 nodes cluster with 10 machine types
[130]* Highly available cloud MILP model for availability- Randomly generated 50 nodes cluster with total
applications and services aware VM placements MTTF and MTTR (32 CPU cores, 30GB RAM)
[131]* CloudMirror: Applications-based network TAG modeling Empirical and synthesized Simulations for tree-based cluster with 2048 servers
abstraction and workloads placements with placement algorithm workloads
high availability
[133]* ElasticSwitch: Work-conserving minimum Guarantee Partitioning (GP), Shuffling traffic 100 nodes testbed (4 3GHz CPU, 8GB RAM)
bandwidth guarantees for cloud computing Rate Allocation (RA), OVS
[134]* EyeQ: Network performance Sender and receiver Shuffling traffic, 16 nodes cluster (Quad core CPU), 10 Gbps NIC
Isolation at the servers EyeQ modules Memcached traffic packet-level simulations
[143]° Multi-resource scheduling in Constrained programming, Google cluster Trace-driven simulations for
cloud data centers with VMs first-fit, best-fit heuristics Traces (1024 nodes, 3-tier tree topology)
[144]° Real-time scheduling for Rolling-horizon Google cloud Simulations via
tasks in virtualized clouds optimization Tracelogs CloudSim toolkit
[145]° Cost and energy aware scheduling VM selection/reuse, tasks Montage, LIGO, Simulations via
For deadline constraint tasks in clouds with merging/slaking heuristics SIPHT, CyberShake CloudSim toolkit
[156]° FlexTuner: Flexible container- Modified Mininet, MPICH2 version 1.3, Simulation for one VM
based tuning system iperf tools NAS Parallel Benchmarks (2GB RAM, 1 CPU core)
[161]† NetStitcher: system for inter Multipath and multi-hop, Video and content Equinix topology, 49 CDN (Quad Xeon
data center bulk transfers store-and-forward algorithms sharing traces CPU, 4GB RAM, 3TB disk), 1 Gbps links
[162]† Costs reduction for Convex optimizations, Uniform distribution Simulations for inter
inter-data cener traffic time-slotted model for file sizes DCN with 20 data centers
[163]† Profit-driven traffic scheduling for inter Lyapunov Uniform distribution Simulations for 7 nodes in EC2
DCN with multi cloud applications optimizations for arrival rate with 20 different cloud applications
[164]† CloudMPcast: Bulk transfers in multi ILP and Heuristic based Data backup, Trace-driven simulations for 14
data centers with CSP pricing models on Steiner Tree Problem video distribution data centers from EC2 and Azure
[165]† Reduce completion time of big data LockStep Broadcast - Numerical evaluations
broadcasting in heterogeneous clouds Tree (LSBT) algorithm
[166]† Optimized regular data backup in ILP and Transfer of Simulations for US backbone
Geographically-distributed data centers heuristics 1.35 PBytes network topology with 6 data centers
[167]† Optimize migration and backups in Greedy anycast Poisson arrival demands, Simulations for
EON-based geo-distributed DCNs algorithms -ve exponential service time NSFNET network
[168]† Energy saving in end- Power modelling by 20 and 100GB data sets Servers (Intel Xeon, 6GB RAM, 500 GB disk), (AMD FX,
to-end data transfers linear regression with different file sizes 10GB RAM, 2TB disk), Yokogawa WT210 power meter
[170]† Reducing costs of migrating geo- An offline and 2 Meteorological 22 nodes to emulate 8 data centers, 8 gateways, and
dispersed big data to the cloud online algorithms data traces 6 user-side gateways, additional node for routing control
[175]† Reduction of electricity and bandwidth Distributed 22k jobs from 4 data centers with varying electricity
costs in inter DCNs with large jobs algorithms Google cluster traces and bandwidth costs throughout the day
[176]† Power cost reduction in distributed data Two time scales Facebook Simulations for 7 geo-distributed data centers
centers for delay-tolerant workloads scheduling algorithm traces
[179]† Tasks scheduling and WAN bandwidth Community Detection 2000 file with Simulations in China-VO
allocation for big data analytics based Scheduling (CDS) different sizes network with 5 data centers
[182]† SAGE: service-oriented architecture Azure Queues 5 streaming services Azure public cloud
for data steaming in public cloud with synthetic benchmark (North and West US and North and West Europe)
[183]† JetStream: execution engine for OLAP cubes, 140 Million HTTP requests (51GB) VICCI testbed to emulate Coral CDN
data streaming in WAN adaptive data filtering
[184]† Networking cost reduction for geo- MILP model, Multiple Streams with four Simulations for
distributed big data stream processing VM Placement algorithm different semantics NSFNET network
[185]† Reduction of streaming workflows MILP and 500 streaming workflows Simulations for
costs in geo-distributed data centers 2 heuristics each with 100 tasks 20 data centers
[186]† Iridium: low-latency geo- Linear program, Bing, Conviva, Facebook, TPC- EC2 instances in 8 worldwide regions,
distributed data analytics online heuristic DS queries, AMPLab Big-Data trace-driven simulations
[188]‡ Application-aware Aggregation and TE OpenFlow, Voice, video, 7 GE Quanta packet switched, 3 Ciena CoreDirector
in converged packet-circuit networks POX controller and web traffic hybrid switch, 6 PCs with random traffic generators
[189]‡ Application-Centric IP/optical SDN Bulk transfers, dynamic Software for controller and use
Network Orchestration (ACINO) orchestrator 5G services, security, CDN cases for ACINO infrastructure
[190]‡ ZeroTouch Provisioning (ZTP) Network automation Bioenformatics, GEANT network and use
for managing multiple clouds with SDN and NFV UHD video editing cases for ZTP/ZTPOM
[191]‡ Routing, Modulation level and Spec- MILP, Bulk data Emulation of NSFNET
trum Allocation for bulk transfers NOX controller transfers network with OF-enabled WSSs
[194]‡ VersaStack: full-stack model-driven Resources modelling and Reading/writing from/to AWS and Google VMs, OpenStack-based VMs
orchestrator for VMs in hybrid clouds orchestration workflow Ceph parallel storage with SR-IOV interfaces, 100G SDN network
[195]‡ Open Transport Switch OTS prototype Bulk data ESnet Long Island Metropolitan
(OTS) for Cloud bursting based on OpenFlow transfers Area Network (LIMAN) testbed
[196]‡ Malleable Reservation for efficient MILP, dynamic Bulk data Simulations on NSFNET
bulk data transfer in EONs programming transfers
[197]‡ SDN-based bulk data Dynamic algorithm, Bulk data 10 data centers (IBM BladeCenter HS23 cluster), 10 OF-
transfers orchestration OpenFlow, Beacon transfers enabled HP3500 switches, 10 server-based gateways
[198]‡ Owan: SDN-based traffic management Simulated Bulk data Prototype, testbed, simulations
system for bulk transfers over WAN annealing transfers for ISP and inter data center WAN
[199]‡ SDN-managed Bulk transfers ABNO, ILP Bulk data OMNeT++ simulations for Telefonica (TEL), British
in Flexgrid inter DCNs heuristics transfers Telecom (BT), and Deutsche Telekom (DT) networks
[200]‡ Genome-Centric cloud-based NetServ, algorithms Genome 3-server clusters with total (160 CPU cores, 498GB
networking and processing off-path signaling data RAM, and 37TB storage), emulating GEANT Network
[201]‡ VNF placement for MILP, Random DAGs 10 nodes network with random
big data processing heuristics for chained NFs communication cost values
[202]‡ NFV-based CDN network for OpenStack-based HD and full Leaf cache on SYNERGY testbed (1 DC with 3 servers
big data video distribution CDN manager HD videos each with Intel i7 CPU, 16GB RAM, 1TB HDD)
[203]‡ NUBOMEDIA: real-time multimedia PaaS-based WebRTC 3 media server on
communication and processing APIs data KVM-based instances
*Cloud resources management, °VM and containers assignments, †Bulk transfers and inter DCN, ‡SDN and NFV-based.
VI. DATA CENTERS TOPOLOGIES, ROUTING PROTOCOLS, TRAFFIC CHARACTERISTICS, AND ENERGY
EFFICIENCY:
Data centers can be classically defined as large warehouses that host thousands of servers, switches, and storage devices to provide various data processing and retrieval services [528]. Intra Data Center Networking (DCN), defined by the topology (i.e. the connections between the servers and switches), the link capacities, the switching technologies utilized, and the routing protocols, is an important design aspect that impacts the performance, power consumption, scalability, resilience, and cost. Data centers have been successfully hosting legacy web applications but are challenged by the need to host an increasing number of big data and cloud-based applications with elastic requirements and multi-tenant, heterogeneous workloads. Such requirements continuously challenge data center architectures to improve their scalability, agility, and energy efficiency while providing high performance and low latency. The rest of this Section is organized as follows: Subsection VI-A reviews electronic switching-based data centers, while Subsection VI-B reviews proposed and demonstrated hybrid electronic/optical and optical switching-based data centers. Subsection VI-C briefly describes HPC clusters and disaggregated data centers. Subsection VI-D presents traffic characteristics in cloud data centers, while Subsection VI-E reviews intra-DCN routing protocols and scheduling mechanisms. Finally, Subsection VI-F addresses energy efficiency in data centers.
A. Electronic Switching Data Centers:
Extensive surveys on the categorization and characterization of different data center topologies and
infrastructures are presented in [529]-[535]. In what follows, we briefly review some of state-of-the-art electronic
switching DCN topologies while emphasizing their suitability for big data and cloud applications. Servers in data
centers are typically organized in “racks” where each rack typically accommodates between 16 and 32 servers. A
Top-of-Rack (ToR) switch (also known as access or edge switch) is used to provide direct connections between
the rack's servers and indirect connections with other racks via higher layer/layers switches according to the DCN
topology. Most of legacy DCNs have a multi-rooted tree structure where the ToR layer is connected either to an
upper core layer (two-tiers) or upper aggregation and core layers (three-tiers) [528]. For various improvement
purposes, alternative designs based on Clos networks, flattened connections with high-radix switches,
unstructured connections, and wireless transceivers were also considered. These architectures can be classified as
switch-centric as the servers are only connected to ToR switches and the routing functionalities are exclusive to
the switches. Another class of DCNs, known as server-centric, utilizes the servers/set of servers with multiport
NIC and software-based routing to aid the process of traffic forwarding. A brief description of some electronic
switching DCNs is provided below and some examples of small topologies are illustrated in Figure 12 showing
the architecture in each case:
• Three-tier data centers [528]: Three-tier designs have access, aggregation, and core layers (Figure 12(a)).
Different subsets of ToR/access switches are connected to aggregation switches which connect to core
switches with higher capacity to ensure all-to-all racks connectivity. This increases the over-subscription
ratio as the bisection bandwidth between different layers varies due to link sharing. Supported by firewall, load balancing, and security features in their expensive switches, three-tier data centers were well suited for legacy Internet-based services with dominant north-south traffic and were widely adopted in production data centers.
• k-ary Fat-tree [536]: Fat-tree was proposed to provide 1:1 oversubscription and multiple equal-cost paths
between servers in a cost-effective manner by utilizing commodity switches with the same number of ports
(k) at all layers. Fat-tree organizes equal sets of edge and aggregation switches into pods and connects each
pod as a complete bipartite graph. Each edge switch is connected to a fixed number of servers and each pod
is connected to all core switches, forming a folded-Clos network (Figure 12(b)); a small sizing sketch is given
after Figure 12. The Fat-tree architecture is widely considered in industry and research [529], indicating its
efficiency with various workloads. However, its wiring complexity increases massively with scale.
• VL2 [537]: VL2 is a three-tier Clos-based topology with 1:1 oversubscription proposed to provide performance
isolation, load balancing, resilience, and agility in workload placements by using a flat layer-2 addressing
scheme with address resolution. VL2 suits virtualization and multi-tenancy; however, its wiring complexities
are high.
• Flattened ButterFLY (FBFLY) [538]: FBFLY is a cost-efficient topology that flattens k-ary n-fly butterfly
networks into k-ary n-flat networks by merging the n switches in each row into a single high-radix switch.
FBFLYs improve the path diversity of butterfly networks, and achieve folded-Clos network performance
under load-balanced traffic with half the costs. However, with random adversarial traffic patterns, both load
balancing and routing become challenging, thus, FBFLY is not widely considered for big data applications.
• HyperX [539]: HyperX is a direct network of switches proposed as an extension to hypercube and flattened
butterfly networks. Further design flexibility is provided as several regular or general configurations are
possible. For load-balanced traffic, HyperX achieved the performance of folded Clos with fewer switches.
Like FBFLY, HyperX did not explicitly target improving big data applications.
• Spine-leaf (e.g. [540], [541]): Spine-leaf DCNs are folded Clos-based architectures that gained widespread
adoption by industry as they utilize commercially-available high-capacity and high-radix switches. Spine-
leaf allows flexibility in the number of spine switches, leaf switches, and servers per leaf, and in the link
capacities at all layers (e.g. Figure 12(c)). Hence, controllable oversubscription according to cost-performance trade-offs can be
attained. Their commercial usage indicates acceptable performance with big data and cloud applications.
However, wiring complexities are still high.
• BCube [542] and MDCube [543]: BCube is a generalized hypercube-based architecture that targets
modular data centers with scales that fit in shipping containers. The scaling in BCube is recursive where the
first building block “BCube0” is composed of n servers and an n-port commodity switch and the kth level
(i.e. BCubek) is composed of n BCubek-1 and nk n-port switches. Figure 12(d) shows a BCube1 with n=4. For
its multipath routing and to provide low latency and high bisection bandwidth and fault-tolerance, BCube
utilizes switches and servers equipped with multiple ports to connect with switches at different levels. BCube
is hence suitable for several traffic patterns such as one-to-one, one-to-many, one-to-all, and all-to-all which
arise in big data workloads. However, at large scales, bottlenecks between the lower and higher levels increase
and the address space has to be overwritten. For larger scales, MDCube [543] was proposed to interconnect BCube containers
by high speed links in 1D or 2D connections.
• CamCube [544]: CamCube is a server-centric architecture that directly connects servers in a 3D torus
topology where each server is connected to its 6 neighbouring servers. In CamCube, switches are not
needed for intra-DCN routing, which reduces costs and energy consumption. A greedy key-based routing
algorithm, symbiotic routing, is utilized at the servers, which enables application-specific routing and
arbitrary in-network functions such as caching and aggregation at each hop that can improve big data
analytics [214]. However, CamCube might not suit delay-sensitive workloads with a high number of servers
due to routing complexities, longer paths, and high store-and-forward delays.
• DCell [545]: DCellk is a recursively-scaled data center that utilizes a commodity switch per DCell0 pod to
connect its servers, and uses the remaining of the (k+1) ports in each server for direct connections with servers
in other pods of the same level and in higher-level pods. Figure 12(e) shows a DCell1 with 4 servers per pod.
DCell provides high bandwidth, scalability, and fault-tolerance at low cost. In addition, under all-to-all, many-
to-one, and one-to-many traffic patterns, DCell achieves balanced routing, which ensures high performance for
big data applications. However, as it scales, longer paths between servers in different levels are required.
• FiConn [546]: FiConn is a server-centric DCN that utilizes switches and dual port servers to recursively
scale while maintaining low diameter and high bandwidth at reduced cost and wiring complexity compared
to BCube and DCell. In FiConn0, a port in each server is connected to the switch and in each level, half of
the remaining ports in the pods are reserved for the connections with servers in the next level. For example,
Figure 12(f) shows a FiConn2 with 4 servers per FiConn0. Real-time requirements are supported by
employing a small diameter and hop-by-hop traffic-aware routing according to the network condition. This
also improves the handling of bursty traffic of big data applications.
• Unstructured data centers with random connections: With the aim of reducing the average path lengths
and easing incremental expansions, unstructured DCNs based on random graphs such as Jellyfish [547],
Scafida [548], and Small-World Data Center (SWDC) [549] were proposed. Jellyfish [547] creates random
connections between homogeneous or heterogeneous ToR switches and connects hosts to the remaining ports
to support incremental scaling while achieving higher throughput due to low average path lengths. Scafida
[548] is an asymmetric scale-free data center that scales incrementally under limits on the longest path length.
In Scafida, two disjoint paths are assigned per switch pair to ensure high resilience. SWDC [549] includes
its servers in the routing and connects them according to small-world-inspired connection distributions. Unstructured
DCNs, however, have routing and wiring complexities and their performance with big data workloads is not
widely addressed.
• Data centers with wireless 60 GHz radio transceivers: To improve the performance of tree-based DCNs
without additional wiring complexity, the use of wireless transceivers at servers or ToR switches was also
proposed [550], [551].
Fig. 12. Examples of electronic switching DCNs (a) Three-tier, (b) Fat-tree, (c) Spine-leaf, (d) BCube, (e)
DCell, and (f) FiConn.
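As referenced in the Fat-tree item above, the following minimal Python sketch (our own illustration; the function name is ours) computes the standard component counts of a k-ary Fat-tree built from k-port commodity switches, which follow directly from the structure described in [536].

def fat_tree_size(k: int) -> dict:
    """Component counts of a k-ary Fat-tree built from k-port switches (k must be even)."""
    assert k % 2 == 0, "k must be even"
    return {
        "pods": k,
        "edge_switches": k * k // 2,         # k/2 edge switches per pod
        "aggregation_switches": k * k // 2,  # k/2 aggregation switches per pod
        "core_switches": (k // 2) ** 2,
        "servers": k ** 3 // 4,              # k/2 servers per edge switch
    }

if __name__ == "__main__":
    # Example: k = 4 yields 16 servers, 4 core, 8 aggregation, and 8 edge switches.
    print(fat_tree_size(4))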
B. Hybrid Electronic/Optical and All Optical Switching Data Centers:
Optical switching technologies have been proposed for full or partial use in DCNs as solutions to overcome
the bandwidth limitations of electronic switching, reduce costs, and to improve the performance and energy
efficiency [552]-[557]. Such technologies eliminate the need for O/E/O conversion at intermediate hops and make
the interconnections data-rate agnostic. Hybrid architectures add Optical Circuit Switching (OCS), typically
realized with Micro-Electro-Mechanical System Switches (MEMSs) or free-space links, to enhance the capacity
of an existing Electronic Packet Switching (EPS) network. To benefit from both technologies, bursty traffic (i.e.
mice flows) is handled by the EPS while bulky traffic (i.e. elephant flows) is offloaded to the OCS. MEMS-
based OCS requires a reconfiguration time on the scale of ms or µs before setting up paths between pairs of ToR
switches, and because packet headers are not processed, external control is needed for the reconfigurations.
Another shortcoming of MEMS is their limited port count. WDM technology can increase the capacity of ports
without a huge increase in the power consumption [270], resolve wavelength contention, and reduce wiring
complexities at the cost of additional devices for multiplexing, de-multiplexing, and fast tuning lasers and tuneable
transceivers at ToRs or servers. In addition, most of the advances in optical networking discussed in Subsection
IV-B6 have also been considered for DCNs such as OFDM [558], PONs technologies [559]-[573], and EONs
[574]-[576].
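At its simplest, the EPS/OCS division of labour described above is a per-flow classification by size. The sketch below is our own simplified illustration (the 10 MB threshold is an assumption; real deployments tune this cut-off) of dispatching elephant flows to the optical circuit network and mice flows to the packet network.

ELEPHANT_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed 10 MB cut-off, deployment-specific

def dispatch(flow_bytes: int) -> str:
    """Return which fabric should carry a flow in a hybrid EPS/OCS data center."""
    # Bulky (elephant) flows amortize the ms/us OCS reconfiguration time,
    # while bursty (mice) flows stay on the always-available packet network.
    return "OCS" if flow_bytes >= ELEPHANT_THRESHOLD_BYTES else "EPS"

if __name__ == "__main__":
    for size in (64 * 1024, 500 * 1024 * 1024):
        print(size, "->", dispatch(size))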
In hybrid and all optical DCNs, both active and passive components were considered. The passive
components including fibers, waveguides, splitters, couplers, Arrayed Waveguide Gratings (AWGs), and Arrayed
Waveguide Grating Routers (AWGRs), do not consume power but have insertion loss, crosstalk, and attenuation
losses. Active components include Wavelength Selective Switches (WSSs), which can be configured to route
different sets of wavelengths out of a total of M wavelengths at an input port to N different output ports (i.e. a 1×N
switch), MEMSs, Semiconductor Optical Amplifiers (SOAs), which can provide switching times in the range of ns,
Tuneable Wavelength Converters (TWCs), and Mach-Zehnder Interferometers (MZIs), which are external
modulators based on controllable phase shifts in split optical signals. In addition to OCS, Optical Packet Switching
(OPS) [577]-[580] was also considered with or without intermediate electronic buffering. Examples of hybrid
electrical/optical and all optical switching DCNs are summarized below, and some are illustrated in Figure 13:
• c-Through [581]: In c-Through, electronic ToR switches are connected to a two-tier EPSs network and a
MEMS-based OCS as depicted in Figure 13(a). The EPS maintains persistent but low bandwidth connections
between all ToRs and handles mice flows, while the OCS must be configured to provide high bandwidth
links between pairs of ToRs at a time to handle elephant flows. As the MEMS used has a millisecond-scale switching
time, c-Through was only shown to improve the performance of workloads with slowly varying traffic.
• Helios [582]: In Helios, electronic ToR switches are connected to a single tier containing an arbitrary number
of EPSs and MEMS-based OCSs as in Figure 13(b). Helios performs WDM multiplexing in the OCS links
and hence requires WDM transceivers in the ToRs. Due to its complex control, Helios was demonstrated to
improve the performance of applications with second-scale traffic stability.
• Mordia [583]: Mordia is a 24-port OCS prototype based on a ring connection between ToRs, each with a 2D
MEMS-based WSS, that provides an 11.5 µs reconfiguration time at 65% of electronic switching efficiency.
Mordia can support unicast, multicast, and broadcast circuits, and enables both long and short flows
offloading, which makes it suitable for big data workloads. However, it has limited scalability as each source-
destination pair needs a dedicated wavelength.
• Optical Switching Architecture (OSA) [584] / Proteus [585]: OSA and Proteus utilize a single MEMS-
based optical switching matrix to dynamically change the physical topology of electronic ToRs connections.
Each ToR is connected to the MEMS via an optical module that contains multiplexers/demultiplexers for
WDM, a WSS, circulators, and couplers as depicted in Figure 13(c). This flexible design allows multiple
connections per ToR to handle elephant flows and eliminates blocking for mice flows by enabling multi-hop
connections via relaying ToRs. OSA was examined with bulk transfers and mice flows and minimal
overheads were reported while achieving 60%-100% non-blocking bisection bandwidth.
• Data center Optical Switch (DOS) [586]: DOS utilizes an (N+1)-port AWGR to connect N ToR electronic
switches through OPS with the aid of optical label extractors as shown in Figure 13(d). Each ToR is
connected via a TWC to the AWGR to enable it to connect to one other ToR at a time. At the same time,
each ToR can receive from multiple ToRs simultaneously. The last ports on the AWGR are connected to an
electronic buffer to resolve contention for transmitting ToRs. DOS suits applications with bursty traffic
patterns, however, its disadvantages include the limited scalability of AWGRs and the power hungry
buffering.
• Petabit [587]: Petabit is a bufferless and high-radix OPS architecture that utilizes three stages of AWGRs
(input, central, and output) in addition to TWCs as depicted in Figure 13(e). At each stage, the wavelength
can be tuned to a different one according to the contention at the next stage. However, electronic buffering
at the ToRs and effective scheduling are required to achieve high throughput. Petabit can scale without
impacting latency and thus can guarantee high performance for applications even at large scales.
• Free-Space Optics (FSO)-based data centers: Using FSO-based interconnections with ceiling mirrors to
link ToRs in DCNs was proposed by several studies such as FireFly [588], and Patch Panels [589].
Fig. 13. Examples of hybrid/all optical switching DCNs (a) c-Through, (b) Helios, (c) OSA/Proteus, (d)
DOS, and (e) Petabit.
VII DATA CENTERS-LEVEL OPTIMIZATIONS
This Section summarizes a number of big data applications optimization studies that consider the
characteristics of their hosting data centers including details such as improving their design and protocols or
analyzing the impact of their computing and networking parameters on the applications’ performance. Subsection
VII-A addresses the performance, scalability, flexibility, and energy consumption improvements and tradeoffs for
big data applications under various data centers topologies and design considerations. Subsection VII-B focuses
on the studies that improve intra data centers routing protocols to enhance the performance of big data applications
while improving the load balancing and utilization of the data centers. Subsection VII-C discusses flows, coflows,
and jobs scheduling optimization studies to achieve different applications and data centers performance goals.
Finally, Subsection VII-D addresses the studies that utilize advanced technologies to scale up big data
infrastructures and improve their performance. The studies presented in this Section are summarized in Tables V,
and VI.
A. Data Center Topology:
Evaluating the performance and energy efficiency of big data applications in different data centers topologies
was considered in [204]-[209]. The authors in [204] modeled Hadoop clusters with up to 4 ToR switches and a
core switch to measure the influence of the network on the performance. Several simplifications such as
homogeneous servers and uniform data distribution were applied and model-based and experimental evaluations
indicated that Hadoop scaled well enough under 9 different cluster configurations. The MRPerf simulator was
utilized in [205] to study the effect of the data center topology on the performance of Hadoop while considering
several parameters related to clusters (e.g. CPU, RAM, and disk resources), configurations (e.g. chunk size,
number of map and reduce slots), and framework (e.g. data placement and task scheduling). DCell was compared
to star and double-rack clusters with 72 nodes under the assumptions of 1 replica and no speculative execution
and was found to improve sorting by 99% compared to double-rack clusters. The authors in [206] extended
CloudSim simulator [110] as CloudSimExMapReduce to estimate the completion time of jobs in different data
center topologies with different workload distributions. Compared to a hypothetically optimal topology for
MapReduce with a dedicated link for each intermediate data shuffling flow, CamCube provided the best
performance. Different levels of intermediate data skew were also examined and worse performance was reported
for all the topologies. In [207], we examined the effects of the data center network topology on the performance
and energy efficiency of shuffling operations in MapReduce with sort workloads in different data centers with
electronic, hybrid and all-optical switching technologies and different rate/server values. The results indicated that
optical switching technologies achieved an average power consumption reduction by 54% compared to electronic
switching data centers with comparable performance. In [208], the Network Power Effectiveness (NPE), defined
as the ratio between the aggregate throughput and the power consumption, was evaluated for six electronic
switching data center topologies under regular and energy-aware routing. The power consumption of the switches,
the servers' NIC ports, and the CPU cores used to process and forward packets in server-centric topologies were
considered. The results indicated that FBFLY achieved the highest NPE followed by the server-centric data
centers, and that NPE is slightly impacted by the topology size as the number of switches scales almost linearly
with the data center size for the topologies examined. Design choices such as link speeds, oversubscription ratio,
and buffer sizes in spine and leaf architectures with realistic web search queries with Poisson arrivals and heavy-
tail size distribution were examined by simulations in [209]. It was found that ECMP is efficient only at link
capacities higher than 10 Gbps, as 10 Gbps links resulted in a 40% performance degradation compared to an ideal
non-blocking switch. Higher oversubscription ratios degraded the performance only at loads of 60% and higher.
Examining the spine and leaf switches' queue sizes revealed that it is better to maintain consistent sizes across both
layers and that additional queue capacity is more beneficial at the leaf switches.
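To make the NPE metric of [208] concrete, the short sketch below simply computes the ratio of aggregate throughput to total network power; the function name and the figures in the example are hypothetical, and the power accounting (switches, NIC ports, CPU cores used for forwarding) mirrors the description above.

def network_power_effectiveness(aggregate_throughput_gbps: float,
                                switch_power_w: float,
                                nic_power_w: float,
                                cpu_forwarding_power_w: float) -> float:
    """NPE = aggregate throughput / power consumed by the network (Gbps per Watt)."""
    total_power_w = switch_power_w + nic_power_w + cpu_forwarding_power_w
    return aggregate_throughput_gbps / total_power_w

if __name__ == "__main__":
    # Hypothetical example: 2 Tbps of aggregate throughput over 25 kW of network power.
    print(network_power_effectiveness(2000.0, 20000.0, 3000.0, 2000.0))  # 0.08 Gbps/W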
Flexible Fat-tree was proposed in [210] as an improvement and generalization of the Fat-tree topology in [536]
to achieve higher aggregate bandwidth and richer paths by allowing an uneven number of aggregation and access
switches in the pods. With more aggregation switches, shuffling results indicated about 50% improvement in the
completion time. As a cost-effective solution to improve oversubscribed production data centers, the concept of
flyways, where additional on-demand wireless or wired links between congested ToRs, was introduced in [211]
and further examined in [212]. Under sparse ToR-to-ToR bandwidth requirements, the results indicated that few
flyways allocated in the right ToR switches improved the performance by up to 50% bringing it closer to 1:1 DCNs
performance. The flyways can be 802.11g, 802.11n, or 60 GHz wireless links, or random wired connections for
subset of the ToR switches via commodity switches. The wired connections, however, cannot help if the
congestion is between unconnected ToRs. A central controller is proposed to gather the demands and utilize MPLS
to forward the traffic over the oversubscribed link or one of the flyways. In [213], a spectrum efficient and failure
tolerant design for wireless data centers with 60 GHz transceivers was examined for data mining applications. A
spherical rack architecture based on bimodal degree distribution for the servers’ connections was proposed to
reduce the hop count and hence reduce the transmission time compared to traditional wireless data center with
cylindrical racks and same degree Cayley graph connections. Challenges related to interference, path loss, and the
optimization of hub servers’ selection were addressed to improve the data transmission rate. Moreover, the
efficiency of executing MapReduce in failure-prone environments (due to software and hardware failures) was
simulated.
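The flyway idea in [211], [212] essentially adds a handful of extra links to the most congested ToR pairs. The following is a minimal greedy sketch (ours, not the authors' optimization formulation; names and the demand units are illustrative) that picks the top-k ToR pairs by estimated demand.

from typing import Dict, List, Tuple

def pick_flyways(demand: Dict[Tuple[str, str], float], budget: int) -> List[Tuple[str, str]]:
    """Greedily select ToR pairs for on-demand flyway links, heaviest demand first."""
    ranked = sorted(demand.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, _ in ranked[:budget]]

if __name__ == "__main__":
    demand = {("ToR1", "ToR4"): 9.5, ("ToR2", "ToR3"): 1.2, ("ToR1", "ToR2"): 6.0}  # Gbps
    print(pick_flyways(demand, budget=2))  # [('ToR1', 'ToR4'), ('ToR1', 'ToR2')]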
Several big data frameworks that tailor their computations to the data center topology or utilize their
properties were proposed as in [214]-[216]. Camdoop in [214] is a MapReduce-like system that runs in CamCube
and exploits its topology by aggregating the intermediate data along the paths to the reduce workers. A window-based
flow control protocol and independent disjoint spanning trees with the same root per reduce worker were used to
provide load balancing. Camdoop achieved improvements over the switch-centric Hadoop and Dryad, and over
Camdoop running over TCP and Camdoop without aggregation. In [215], network-awareness and utilization of existing
or attached networking hardware were proposed to improve the performance of query applications. In-network
processing in ToR switches with attached Network-as-a-Service (NaaS) boxes was examined to partially reduce
the data and hence reduce bandwidth usage and increase the queries throughput. For API transparency, a shim
layer is added to perform software-based custom routing for the traffic through the NaaS boxes. A RAM-based
key-value store in BCube [542], namely RAMCube, was proposed in [216] to address false failure detection in large data
centers caused by network congestion, entire rack blockage due to ToR switch failure, and the traffic congestion
during recovery. A symmetric multi-ring arrangement that restricts failure detection and recovery to one hop in
BCube is proposed to provide fast fault recovery. Experimental evaluation for the throughput under single switch
failure with 1 GbE NIC cards in the servers indicated that a maximum of 8.3 seconds is needed to fully transmit
data from a recovery to a backup server.
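Camdoop's gain comes from combining intermediate key/value pairs at every hop of the aggregation tree rather than only at the reduce workers. The sketch below is a simplified, framework-agnostic illustration (not Camdoop's actual code) of such on-path aggregation for an associative combine function, here word-count summation.

from collections import Counter
from typing import Dict, List

def aggregate_on_path(partitions: List[Dict[str, int]]) -> Dict[str, int]:
    """Combine partial word counts at an intermediate hop before forwarding upstream."""
    merged: Counter = Counter()
    for part in partitions:
        merged.update(part)          # associative/commutative combine step
    return dict(merged)              # forwarded volume <= sum of the input volumes

if __name__ == "__main__":
    children = [{"cloud": 3, "data": 1}, {"cloud": 2, "big": 5}]
    print(aggregate_on_path(children))  # {'cloud': 5, 'data': 1, 'big': 5}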
The topologies of data centers were also considered in optimizing VM assignments for various applications
as in [217]-[220]. A Traffic-aware VM Placement Problem (TVMPP) to improve the scalability of data centers
was proposed in [217]. TVMPP follows a two-tier approximation algorithm that leverages knowledge of traffic
demands and the data center topology to co-allocate VMs with heavy traffic in nearby hosts. First, the hosts and
the VMs are clustered separately and a 1-to-1 mapping that minimizes the aggregated traffic cost is performed.
Then, each VM within each cluster is assigned to a single host. The gain as a result of TVMPP compared to
random placements for different traffic patterns was examined and the results indicated that multi-level
architectures such as BCube benefit more than tree-based architectures and that heterogeneous traffic leads to
more benefits. To tackle intra data center network performance variability in multi-tenant data centers with
network-unaware VM-based assignments, the work in [218] proposed Oktopus as an online network abstraction
and virtualization framework to offer minimum bandwidth guarantees. Oktopus formulates virtual or
oversubscribed virtual clusters to suit different types of cloud applications in terms of bandwidth requirements
between their VMs. These allocations are based on greedy heuristics that are exposed to the data center topology,
residual bandwidths in links, and current VMs allocation. The results showed that allocating VMs while
accounting for the oversubscription ratio improved the completion time and reduced tenant costs by up to 75%
while maintaining the revenue. In [219], a communication cost minimization-based heuristic, Traffic Amount
First (TAF), was proposed for VM-to-PM assignments under architectural and resource constraints and was
examined in three data center topologies. Inter-VM traffic was reduced by placing VMs with higher mutual traffic
in the same PM as much as possible. A topology-independent resource allocation algorithm, namely NetDEO,
was designed in [220] based on swarm optimizations to gradually reallocate existing VMs and allocate newly
accepted VMs based on matching resources and availability. NetDEO maintains the performance during network
and servers upgrades and accounts for the topology in the VM placements.
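As a toy illustration of traffic-aware placement in the spirit of TVMPP [217] (not the authors' two-tier algorithm; function and host names are assumptions), the sketch below greedily co-locates the VM pairs that exchange the most traffic on the same host, subject to per-host slot capacities.

from typing import Dict, List, Tuple

def place_vms(traffic: Dict[Tuple[str, str], float],
              hosts: List[str], slots_per_host: int) -> Dict[str, str]:
    """Greedy traffic-aware VM placement: heaviest-talking pairs share a host when possible."""
    placement: Dict[str, str] = {}
    load = {h: 0 for h in hosts}

    def host_for(vm: str) -> str:
        if vm in placement:
            return placement[vm]
        h = min(hosts, key=lambda x: load[x])       # fall back to the least-loaded host
        placement[vm], load[h] = h, load[h] + 1
        return h

    for (a, b), _ in sorted(traffic.items(), key=lambda kv: kv[1], reverse=True):
        ha = host_for(a)
        if b not in placement and load[ha] < slots_per_host:
            placement[b], load[ha] = ha, load[ha] + 1   # co-locate with its heavy peer
        else:
            host_for(b)
    return placement

if __name__ == "__main__":
    traffic = {("vm1", "vm2"): 8.0, ("vm3", "vm4"): 5.0, ("vm1", "vm3"): 0.5}
    print(place_vms(traffic, hosts=["h1", "h2"], slots_per_host=2))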
The performance of big data applications in SDN-controlled electronic and hybrid electronic/optical
switching data centers topologies was considered in [221]-[228]. To evaluate the impact of networking
configurations on the performance of big data applications in SDN-controlled data centers with multi-racks before
deployments, a Flow Optimized Route Configuration Engine (FORCE) was proposed in [221]. FORCE emulates
building virtual topologies with OVS over the SDN-controlled physical network to enable optimizing the
network and enhancing the applications' performance at run-time, and improvements by up to 2.5 times were
achieved. To address big data applications and their need for frequent reconfigurations, the work in [222] examined
a ToR-level SDN-based topology modification in a hybrid data center with core MEMS switch and electrical
Ethernet-based switches at run-time. Different topology construction and routing mechanisms that jointly
optimize the performance and network utilization were proposed for single aggregation, shuffling, and partially
overlapped aggregation communication patterns. The authors accounted for reconfiguration delays by starting the
applications early and accounted for the consistency in routing tables updates. The work in [223] experimentally
examined the performance of MapReduce in two hybrid electronic/optical switching data centers namely c-
Through and Helios with SDN control. An “observe-analyze-act” control framework based on OpenFlow was
utilized for the configurations of the OCS and the packet networks. The authors addressed the hardware and
software challenges and emphasized the need for near real-time analysis of application requirements to
optimally obtain hybrid switch scheduling decisions. The work in [225] addressed the challenges associated with
handling long-lived elephant flows of background applications while running Hadoop jobs in Helios hybrid data
centers with SDN control. Although SDN control for electronic switches can provide alternative routes to reduce
congestion and allow prioritizing packets in the queues, such techniques largely increase the switches' CPU and
RAM requirements with elephant flows. Alternatively, [224] proposed detecting elephant flows and redirecting
them via the high bandwidth OCS, to improve the performance of Hadoop. To reduce the latency of multi-hop
connections between servers in 2D Torus data centers, the work in [225] proposed the use of SDN-controlled
MEMS to bypass electronic links and directly connect the servers. Results based on an emulation testbed and all-
to-all traffic pattern indicated that optical bypassing can reduce the end-to-end latency for 11 of the 15 hosts by
11%. To improve the efficiency of multicasting and incasting in workloads such as HDFS read, join, VM
provisioning, and in-cluster software updates, a hybrid architecture based on Optical Space Switches (OSS) was
proposed in [226] to establish point-to-point links on-demand to connect passive splitters and combiners. The
splitters transparently duplicate the data optically at the line rate and the combiners aggregate incast traffic under
orchestration system control with TDM. Compared with small scale electronic non-blocking switches, similar
performance was obtained indicating potential gains with the optical accelerators at larger scale, where non-
blocking performance is not attained by electronic switches. Unlike the above work which fully offloads multicast
traffic to the optical layer, HERO in [227] was proposed to integrate optical passive splitters and FSO modules
with electronic switching to handle multicast traffic. During the optical switches configuration, HERO multicasts
through the electronic switches, then migrates to optical multicasting. HERO exhibited a linear increase in the
completion time with increasing flow sizes and significantly outperformed the binomial and ring Message
Passing Interface (MPI) broadcasting algorithms (with electronic switching only) for messages less than or equal
to 12 kBytes, and greater than 12 kBytes in size, respectively.
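Common to several of the hybrid designs above (e.g. c-Through, Helios, and the elephant-flow redirection in [224]) is an observe-analyze-act loop in which the controller estimates ToR-to-ToR demands and configures the OCS to serve the heaviest pairs. The sketch below (ours) shows only the scheduling step as a simple greedy matching over a demand matrix; the cited systems use more sophisticated formulations such as max-weight matching.

from typing import Dict, List, Tuple

def greedy_circuit_matching(demand: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Pick non-overlapping ToR pairs for optical circuits, heaviest demand first."""
    circuits: List[Tuple[str, str]] = []
    busy = set()
    for (src, dst), _ in sorted(demand.items(), key=lambda kv: kv[1], reverse=True):
        if src not in busy and dst not in busy:   # one circuit per ToR optical port at a time
            circuits.append((src, dst))
            busy.update((src, dst))
    return circuits

if __name__ == "__main__":
    demand = {("T1", "T2"): 40.0, ("T1", "T3"): 25.0, ("T3", "T4"): 10.0}  # GB to transfer
    print(greedy_circuit_matching(demand))  # [('T1', 'T2'), ('T3', 'T4')]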
A software-defined and controlled hybrid OPS and OCS data center is proposed and examined in [228] for
multi-tenant dynamic Virtual Data Center (VDC) provisioning. A topology manager was utilized to build the
VDCs with different provisioning granularities (i.e. macro and micro) in an Architecture-on-Demand (AoD) node
with OPS, and OCS modules in addition to ToRs and servers. For VMs placement, a network-aware heuristic that
jointly considers computing and networking resources and takes the wavelengths continuity, optical devices
heterogeneity, and the VM dynamics into account was considered. Improvements by 30.1% and 14.6% were
achieved by the hybrid topology with 8, and 18 wavelengths per ToR switch, respectively, compared to using
OCS only. The work in [229] proposed a 3D MEMS crossbar to connect server blades for scalable stream
processing systems. Software-based control, routing, and scheduling mechanisms were utilized to adapt the
topology to graph computational needs while accounting for MEMS reconfiguration and protocol stack delays.
To overcome the high blocking ratio and scalability limitations of single hop MEMS-based OCS, the authors in
[230] proposed a distributed multi-hop OCS that utilizes WDM and SDM technologies integrated with multi-
rooted tree based data centers. A multi-wavelength optical switch based on Microring Resonators (MRs) was
designed to ensure fast switching. A modification to Fat-tree by replacing half of the electronic core, aggregation,
and access switches with the OCS was proposed and distributed control was utilized at each optical switch with
low bandwidth copper links to interconnect and combine control with EPS. Compared with Hedera and DARD,
much faster optical path setup (i.e. 126.144 µs) was achieved with much lower control messaging overhead.
B. Data Center Routing:
In [231], a “reduce tasks” placement problem was analyzed in multi-rack environments to decrease cross-
rack traffic based on two greedy approaches. Compared to original Hadoop with random reduce task placements,
up to 32% speedup in completion time among different workloads was achieved. A scalable DCN-aware load
balancing technique for key distribution and routing in the shuffling phase of MapReduce was proposed in [232]
while considering DCN bandwidth constraints and addressing data skewness. A centralized heuristic with two
subproblems, network flow and load balancing, was examined and compared to three state-of-the-art techniques:
the load balancing-based LPT, the fairness-based LEEN, and the default routing algorithm with hash-based key
distribution in MapReduce. For synthetic and realistic traffic, the network-aware load balancing algorithm
outperformed the others by 40% in terms of completion time and achieved maximum load per reduce comparable
to that of LPT. To improve shuffling and reduce its routing costs under varying data sizes and data reduction
ratios, a joint intermediate data partitioning and aggregation scheme was proposed in [233]. A decomposition-
based distributed online algorithm was proposed to dynamically adjust data partitioning by assigning keys with
larger data sizes to reduce tasks closer to map tasks while optimizing the placement and migration of aggregators
that merge the same key traffic from multiple map tasks before sending them to remote reduce tasks. For large
scale computations (i.e. 100 keys), and compared to random hash-based partitioning with no aggregation and with
random placement of 6 aggregators, the scheme resulted in 50% and 26% reductions in the completion time,
respectively. DCP was proposed in [234] as an efficient and distributed cache sharing protocol to reduce the intra
data center traffic in Fat-tree data centers by eliminating the need to retransmit redundant packets. It utilized a
packet cache in each switch for the eliminations and a Bloom filter header field to store and share cached packet
information among switches. Simulation results for an 8-ary Fat-tree data center showed that DCP reduced
retransmissions by 40-60%. To effectively use the bandwidth in BCube data centers, the work in [235] proposed
and optimized two schemes for in-network aggregation at the servers and switches. The first, for incast traffic, was
modeled as a minimal incast aggregation tree problem, and the second, for shuffling traffic, was modeled as a minimal
shuffle aggregation subgraph problem. Efficient 2-approximation algorithms, named IRS-based and SRS-based,
were suggested for incast and shuffling traffic, respectively. Moreover, an effective forwarding scheme based on
in-switch and in-packet Bloom filters was utilized at a cost of 10 Bytes per packet to ease the identification of
related flows. The results for the SRS-based scheme revealed traffic reductions of 32.87% for a small BCube, and 53.33%
for a large-scale BCube with 262,144 servers.
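The partitioning schemes in [232], [233] share a common core: assign the heaviest keys first to the reduce task whose placement adds the least cross-rack traffic, while keeping reducer loads balanced. The following greedy sketch (ours, heavily simplified; the single additive cost term and all names are assumptions) captures that intuition.

from typing import Dict, List

def assign_keys(key_sizes: Dict[str, float],
                transfer_cost: Dict[str, Dict[str, float]],
                reducers: List[str]) -> Dict[str, str]:
    """Greedy skew-aware key assignment: heaviest keys first, cheapest (cost + load) reducer."""
    load = {r: 0.0 for r in reducers}
    assignment: Dict[str, str] = {}
    for key in sorted(key_sizes, key=key_sizes.get, reverse=True):
        # score = cross-rack bytes moved for this key + current load of the candidate reducer
        best = min(reducers, key=lambda r: transfer_cost[key][r] + load[r])
        assignment[key] = best
        load[best] += key_sizes[key]
    return assignment

if __name__ == "__main__":
    key_sizes = {"k1": 100.0, "k2": 60.0, "k3": 10.0}
    transfer_cost = {"k1": {"r1": 5.0, "r2": 80.0},
                     "k2": {"r1": 50.0, "r2": 5.0},
                     "k3": {"r1": 1.0, "r2": 1.0}}
    print(assign_keys(key_sizes, transfer_cost, ["r1", "r2"]))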
The optimization of data transfer throughput in scientific applications was addressed in [236] while
considering the impact of combining different levels of TCP flow pipelining, parallelism, and concurrency, and
the heterogeneity in file sizes. Recommendations were presented such as the use of pipelining only for file
sizes less than a threshold related to the bandwidth latency product and with different levels related to file size
ranges. To improve the throughput, routing scalability, and upgrade flexibility in electronic switching data centers
with random topologies, Space Shuffle (S2) was proposed in [237] as a greedy key-based multi-path routing
mechanism that operates in multiple ring spaces. Results based on fine-grained packet-level simulations that
considered the finite sizes of shared buffers and forwarding tables and the acknowledgment (ACK) packets
indicated improved scalability compared to Jellyfish, and higher throughput and shorter paths than SWDC and
Fat-tree. However, the overheads associated with packet reordering were not considered. An oblivious distributed
adaptive routing scheme in Clos-based data centers was proposed in [238] and was proven to converge to non-
blocking assignments with minimal out-of-order packet delivery via approximate Markov models and
simulations when the flows are sent at half the available bandwidth. While transmitting at full bandwidth with
strictly and rearrangeable non-blocking routing resulted in exponential convergence time, the proposed approach
converged in less than 80 µs in a 1152-node cluster and showed weak dependency on the network size at the cost
of minimal delay due to retransmitting first packets of redirected flows. The work in [239] utilized stochastic
permutations to generate bursty traffic flows and statistically evaluated the expected throughput of several layer
2 single path and multi-path routing protocols in oversubscribed spine-leaf data centers. Those included
deterministic single path selections based on hashing source or destination, worst-case optimal single path, ECMP-
like flow-level multipathing, and a stateful Round Robin-based packet-level multipathing (packet spraying).
Simulation results indicated that the throughput of the ECMP-like multipath routing is lower than that of the
deterministic single path routing due to flow collisions, as 40% of the flows experienced a 2-fold slowdown, and
that packet spraying outperformed all examined protocols. The authors in [240] proposed DCMPTCP to improve MPTCP through
three mechanisms; Fallback for rACk-local traffic (FACT) to reduce unnecessary sub-flows creation for rack-
local traffic, ADAptive packet scheduler (ADA) to estimate flow lengths and enhance their division, and Signal
sHARE control (SHARE) to enhance the congestion control for the short flows with many-to-one patterns.
Compared with two congestion control mechanisms typically used with MPTCP, LIA and XMP, a 40% reduction
in inter-rack flows transmission time was achieved.
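The contrast drawn in [239] between flow-level ECMP and packet-level spraying reduces to how the next hop is chosen. The short sketch below (illustrative only; names are ours) shows the two selection rules side by side: ECMP hashes the five-tuple so every packet of a flow follows one uplink (and an unlucky collision pins two large flows together), while spraying spreads consecutive packets over all uplinks at the cost of possible reordering.

import itertools
from typing import List, Tuple

def ecmp_path(five_tuple: Tuple, paths: List[str]) -> str:
    """Flow-level ECMP: every packet of a flow hashes to the same uplink."""
    return paths[hash(five_tuple) % len(paths)]

def make_sprayer(paths: List[str]):
    """Packet spraying: round-robin consecutive packets over all uplinks."""
    rr = itertools.cycle(paths)
    return lambda: next(rr)

if __name__ == "__main__":
    paths = ["up0", "up1", "up2", "up3"]
    flow = ("10.0.0.1", "10.0.1.1", 43512, 5201, "tcp")
    print([ecmp_path(flow, paths) for _ in range(3)])   # same uplink three times
    spray = make_sprayer(paths)
    print([spray() for _ in range(3)])                  # up0, up1, up2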
The energy efficiency of routing big data applications traffic in data centers was considered in [241]-[243].
In [241], preemptive flow scheduling and energy efficient routing were combined to improve the utilization in
Fat-tree data centers by maximizing switches sharing while exclusively assigning each flow to needed links during
its schedule. Compared to links bandwidth sharing with ECMP, and flow preemption with ECMP, additional
energy savings of 40% and 30% were achieved, respectively, at the cost of increased average completion time. To
improve the energy efficiency of MapReduce-like systems, the work in [242] examined combining VM
assignments with TE. A heuristic, GEERA, was designed to first cluster the VMs via minimum k-cut, and then
assign them via local search, while accounting for the traffic patterns. Compared with other load-balancing
techniques (i.e. Randomized Shortest Path (RSP), and integral optimal and fractional solutions), an average
additional 20% energy saving was achieved, with total average savings of 60% in Fat-tree and 30% in BCube
data centers. The GreenDCN framework was proposed in [243] to green switch-centric data center
networks (with Fat-tree as the focus) by optimizing VM placements and TE while considering network features such
as the regularity and role of switches, and the applications' traffic patterns. The energy-efficient VM assignment
algorithm (OptEEA) groups VMs with heavy mutual traffic into super VMs, assigns jobs to different pods through
k-means clustering, and finally assigns super VMs to racks through minimum k-cut. Then, the energy-aware
routing algorithm (EER) utilizes the first fit decreasing algorithm and MPTCP to balance the traffic across a
minimized number of switches. Compared with greedy VM assignments and shortest path routing, GreenDCN
achieved 50% reduction in the energy consumption.
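The energy-aware routing step in [243] (EER) essentially bin-packs flows onto as few links and switches as possible so that idle devices can sleep. The sketch below is a generic first-fit-decreasing consolidation over parallel equal-capacity links (ours, not the authors' algorithm; rates are given as fractions of capacity for simplicity).

from typing import Dict, List

def consolidate_flows(flow_rates: Dict[str, float], link_capacity: float) -> List[Dict[str, float]]:
    """First-fit decreasing: pack flows onto the fewest links; unused links can be powered down."""
    links: List[Dict[str, float]] = []
    for flow in sorted(flow_rates, key=flow_rates.get, reverse=True):
        rate = flow_rates[flow]
        for link in links:
            if sum(link.values()) + rate <= link_capacity:
                link[flow] = rate
                break
        else:
            links.append({flow: rate})      # open a new (powered-on) link
    return links

if __name__ == "__main__":
    flows = {"f1": 0.6, "f2": 0.5, "f3": 0.3, "f4": 0.1}   # rates as fractions of link capacity
    print(consolidate_flows(flows, link_capacity=1.0))     # 2 active links instead of 4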
The optimization of routing traffic between VMs and in multi-tenant data centers was considered in [244]-
[246]. VirtualKnotter in [244] utilized a two-step heuristic to optimize VM placement in virtualized data centers
while accounting for the congestion due to core switch over-subscription and unbalanced workload placements.
While other TE schemes operate for fixed source and destination pairs, VirtualKnotter considered reallocating
them through optimizing VM migration to further reduce the congestion. Compared to a baseline clustering-based
VM placement, a reduction by 53% in congestion was achieved for production data center traffic. To allow
effective multiplexing of applications with different routing requirements, the authors in [245] proposed an online
Topology Switching (TS) abstraction that defines a different logical topology and routing mechanism per
application according to its goals (e.g. bisection bandwidth, isolation, and resilience). Results based on simulations
for an 8-ary Fat-tree indicated that the tasks achieved their goals with TS while a unified routing via ECMP with
shortest path and isolation based on VLAN failed to guarantee the different goals. The work in [246] addressed
the fairness of bandwidth allocation and link sharing between multiple applications in private cloud data centers.
A distributed algorithm based on dual-based decomposition was utilized to assign link bandwidths for flows based
on maximizing the social welfare across the applications while maintaining performance-centric fairness with
controlled relaxation. The authors assumed that the bottlenecks are in the access link of the PM and considered
workloads where half of the tasks communicate with the other half and no data skew. To evaluate the algorithm,
two different scenarios for applications allocation and communication requirements were used and the results
indicated the flexibility and the fast convergence of the proposed algorithm.
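As a much-simplified illustration of fairness-driven bandwidth sharing of the kind targeted in [246] (the cited work uses a dual-decomposition algorithm over the whole fabric), the sketch below allocates a single bottleneck link proportionally to per-application weights, which is the closed-form solution of weighted log-utility maximization on one link; the weights and names are assumptions.

from typing import Dict

def proportional_share(capacity_gbps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted proportional fairness on one link: rate_i = w_i / sum(w) * C."""
    total = sum(weights.values())
    return {app: capacity_gbps * w / total for app, w in weights.items()}

if __name__ == "__main__":
    # Hypothetical weights reflecting each application's performance sensitivity.
    print(proportional_share(10.0, {"mapreduce": 2.0, "web": 1.0, "backup": 1.0}))
    # {'mapreduce': 5.0, 'web': 2.5, 'backup': 2.5}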
SDN-based solutions to optimize the routing of big data applications traffic in various data center topologies
were discussed in [247]-[257]. MILPFlow was developed in [247] as a routing optimization tool set that utilizes
MILP modeling based on an SDN topology description, which defines the characteristics of the data center, to deploy
data path rules in OpenFlow-based controllers. To improve the routing of shuffling traffic in Fat-tree data centers,
an application-aware SDN routing scheme was proposed in [248]. The scheme included a Fat-tree manager,
MapReduce manager, links load monitor, and a routing component. The Fat-tree manager maintains information
about the properties of different connections to prioritize the assignment of flows with less paths flexibility.
Emulation results indicated a reduction in the shuffling time by 20% and 10% compared to Round Robin-based
ECMP under no skew, and with skew, respectively. Compared to Spanning Tree and Floodlight forwarding
module (shortest path-based routing), a reduction around 60% was achieved. To enhance the shuffling between
map and reduce VMs under background traffic in OpenFlow-controlled clusters, the work in [249] suggested
dynamic flows assignment to queues with different rates in OVS and LINC software switches. Results showed
that prioritizing Hadoop traffic and providing more bandwidth to straggler reduce tasks can reduce the completion
time by 42% compared to solely using a 50 Mbps queue. In [250], a dynamic algorithm for workload balancing
between different racks in a Hadoop cluster was proposed. The algorithm estimates the processing capabilities of
each rack and accordingly modifies the allocation of unfinished tasks to racks with least completion time and
higher computing capacity. The proposed algorithm was found to decrease the completion time of the slowest
jobs by 50%. A Network Overlay Framework (NoF) was proposed in [251] to guarantee the bandwidth requirements
of Hadoop traffic at run time. NoF achieves this by defining network topologies, setting communication paths,
and prioritizing traffic. A fully virtualized spine-leaf cluster was utilized to examine the impact on job execution
time when redirecting Hadoop flows through the overlay network controlled by NoF and a reduction in the
completion time by 18-66% was achieved. An Application-Aware Network (AAN) platform with SDN-based
adaptive TE was examined in [252] to control the traffic in Hadoop clusters at a fine-grained level to achieve better
performance and resources allocation. An Address Resolution Protocol (ARP) resolver algorithm for flooding
avoidance was proposed instead of STP to update the routing tables. Compared with MapReduce in
oversubscribed links, SDN-based TE resulted in completion time improvements between 16× and 337× for different
workloads.
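Several of the SDN studies above ([248], [249], [251]) ultimately act by steering or rate-limiting flows per class. The sketch below mimics the queue-assignment idea of [249] in plain Python: flows tagged as Hadoop shuffle traffic, and especially straggler reducers, are mapped to higher-rate queues than background traffic. The queue names and rates are assumptions for illustration, not values from the cited testbed.

from typing import Dict

# Assumed queue plan: identifiers and rates are illustrative only.
QUEUES: Dict[str, int] = {"q_straggler": 800, "q_hadoop": 400, "q_background": 50}  # Mbps

def queue_for(flow: Dict) -> str:
    """Pick an egress queue for a flow based on its class and straggler status."""
    if flow.get("app") == "hadoop-shuffle":
        return "q_straggler" if flow.get("straggler") else "q_hadoop"
    return "q_background"

if __name__ == "__main__":
    flows = [{"app": "hadoop-shuffle", "straggler": True},
             {"app": "hadoop-shuffle", "straggler": False},
             {"app": "iperf"}]
    for f in flows:
        q = queue_for(f)
        print(f, "->", q, f"({QUEUES[q]} Mbps)")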
To dynamically reduce the volume of shuffling traffic and green the data exchange, the work in [253] utilized
spate coding-based middleboxes under SDN control in 2-tier data centers. The scheme uses a sampler to obtain
side information (i.e. traffic from map to reduce tasks residing in the same node) in addition to the coder/decoder
at the middleboxes. Then, OpenFlow is used to multicast the coded packets according to an optimized grouping
of multicast trees. Compared to vanilla Hadoop, reductions in the exchanged volume of 43% and 59% were
achieved when allocating the middlebox at the aggregation switch and at a ToR switch, respectively. Compared to
Camdoop-like in-network aggregation, the reduction percentages were 13%, and 38%. As an improvement of
MPTCP, the authors in [254] proposed a responsive centralized scheme that adds subflows dynamically and
selects the best route according to current traffic conditions in SDN-controlled Fat-tree data centers. A controller that
employs Hedera's demand estimation to perform subflow route calculation and path selection, and a monitor per
server to adjust the number of subflows dynamically were utilized. Compared with ECMP and Hedera under
shuffling with background traffic, the responsive scheme achieved 12% improvement in completion time while
utilizing a lower number of subflows. To overcome the scalability issues of centralized OpenFlow-based
controllers, the work in [255] proposed a distributed algorithm at each switch, LocalFlow, which exploits the
knowledge of its active flows and defines the forwarding rules at flowlet, individual flow, and sub-flow
resolutions. LocalFlow targets data centers with symmetric topologies and can tolerate asymmetries caused by
links and node failures. It allows spatial splitting of flows but reduces it by aggregating flows per destination and
splitting only if the load imbalance exceeds a slack threshold. To avoid the pitfalls of packet reordering, the duplicate
acknowledgment (dup-ACK) duration at end-hosts is slightly increased. LocalFlow improved the throughput by
up to 171%, 19%, and 23% compared to ECMP, MPTCP, and Hedera, respectively. XPath was proposed in [256]
to allow different applications to explicitly route their flows without the overheads of establishing paths and
dynamically adding them to routing tables of commodity switches. A 2-step compression algorithm was utilized
to enable pre-installing a very large number of desired paths into the IP tables of commodity switches in large-scale data
centers with limited table sizes. First, to reduce the number of unique IDs for paths, non-conflicting paths such as
converging and disjoint ones are clustered into path sets and each set is assigned a single ID. Second, the sets are
mapped to IP addresses based on Longest Prefix Matching (LPM) to reduce the number of IP tables entries. Then,
a logically centralized SDN-based controller can be utilized to dynamically update the IDs of the paths instead of
the routing tables, and to handle failures. For MapReduce shuffling traffic, XPath enabled choosing non-
conflicting parallel paths and achieved 3× completion time reduction compared to ECMP. The recent work in
[257] discussed the need for applying Machine Learning (ML) techniques to bridge the gap between large-scale
DCN topological information and various networking applications such as traffic prediction, monitoring, routing,
and workloads placement. A methodology named Topology2Vec was proposed to generate topology
representations in the form of low-dimensional vectors from the actual topologies while accounting for the
dynamics of links and nodes availability due to failures or congestion. To demonstrate the usability of
Topology2Vec, the problem of placing SDN controllers to reduce end-to-end delays was addressed. A summary
of the data center focused optimization studies is given in Table V.
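The first step of XPath [256] groups paths that can safely share one forwarding ID, i.e. paths that are disjoint or that converge (once they meet at a node, they continue identically). A simplified sketch of that clustering test is shown below, with paths given as node-ID sequences; it is illustrative only and omits the second (ID-to-IP prefix mapping) step.

from typing import List

def conflict(p: List[str], q: List[str]) -> bool:
    """Two paths conflict if they share a node but leave it towards different next hops."""
    next_hop_p = {p[i]: p[i + 1] for i in range(len(p) - 1)}
    next_hop_q = {q[i]: q[i + 1] for i in range(len(q) - 1)}
    shared = set(next_hop_p) & set(next_hop_q)
    return any(next_hop_p[v] != next_hop_q[v] for v in shared)

def cluster_paths(paths: List[List[str]]) -> List[List[List[str]]]:
    """Greedily place each path into the first set it does not conflict with (one ID per set)."""
    sets: List[List[List[str]]] = []
    for p in paths:
        for s in sets:
            if not any(conflict(p, q) for q in s):
                s.append(p)
                break
        else:
            sets.append([p])
    return sets

if __name__ == "__main__":
    paths = [["A", "C", "D"], ["B", "C", "D"], ["A", "C", "E"]]
    print(len(cluster_paths(paths)))  # 2: the first two converge, the third diverges at C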
[210]* FFTree: adjustment to pods in Fat-tree for higher bandwidth and flexibility | Google's MapReduce | Flexibility in pods design defined by edge offset, BulkSendApplication | WordCount | Ns3-based simulations for 16 Data Nodes in Linux containers connected by TapBridge NetDevice
[211]*, [212]* Flyways: wireless or wired additional links on-demand between congested ToR switches | MapReduce-like data mining | Central controller, MPLS label switching, optimization problem | Production MapReduce workloads | Simulation for 1500 servers in 75 racks (20 servers per rack) and additional flyways with 0.1, 0.6, 1, and 2 Gbps capacity per flyway
[213]* Failure-tolerant and spectrum-efficient wireless data center for big data applications | - | Bimodal degree distribution, space and time division multiple access scheme | 125 Mbytes input data and 20 Mbytes intermediate data | Simulations for a spherical rack with 200 servers, 10 of them hub servers
[214]* Camdoop: MapReduce-like system in CamCube data centers | Hadoop 0.20.2, Dryad, Camdoop | Key-based routing, in-network processing, spanning tree | WordCount, Sort | 27-server CamCube (Quad Core 2.27 GHz CPU, 12GB RAM, 32GB SSD), 1 Gbps quad-port NIC, two 1 Gbps dual NICs, packet-level simulator for a 512-server CamCube
[215]* In-network distributed query processing system to increase throughput | Apache Solr | NaaS box attached to ToR for in-network processing | Queries from 75 clients | Solr cluster (1 master, 6 workers), 1 Gbps Ethernet for servers, 10 Gbps for NaaS box
[216]* RAMCube: BCube-oriented design for resilient key-value store | Key-value store | RPC through Ethernet, one-hop allocation to recovery server | Key-value workloads with set, get, delete operations | BCube(4,1) with 16 servers (2.27GHz Quad core CPU, 16GB RAM, 7200 RPM 1TB disk), 8 8-port DLink GbE switches, ServerSwitch 1 Gbps NIC in each server
[217]* Traffic-Aware VM Placement Problem (TVMPP) in data centers | - | Two-tier approximate algorithm | VM-to-VM global and partitioned traffic, production traces | Tree, VL2, Fat-tree, and BCube data centers with 1024 VMs
[218]* Oktopus: intra data center network virtualization for predictable performance | - | Greedy allocation algorithms, rate-limiting at end hosts | Symmetric and asymmetric VM-to-VM static and dynamic traffic | 25 nodes with 5 ToRs and a core switch (2.27GHz CPU, 4GB RAM, 1 Gbps), simulations for a multi-tenant data center with 16000 servers in 4 racks and 4 VMs per server
[219]* VM to PM mapping based on architectural and resources constraints | - | Traffic Amount First (TAF) heuristic | Uniformly random VM-to-VM traffic | Simulations for Fat-tree, VL2, and Tree-based data centers in large-scale (60 VMs) and small-scale (10 VMs) settings
[220]* NetDEO: VM placement in data centers and efficient system upgrade | Multi-tier web applications | Meta-heuristic based on simulated annealing | Synthesized traces | Simulations for data center networks with heterogeneous servers and different topologies (non-homogeneous Tree, Fat-tree, BCube)
[221]* Emulator for SDN-based data centers with reconfigurable topologies | Hadoop 1.x | OVS | Simulated Hadoop shuffle traffic generator | Testbed with 1 primary server (2.4 GHz, 4GB RAM), 12 client workstations (dual-core CPU, 2GB RAM), and 2 Pica8 Pronto 3290 SDN-enabled Gigabit Ethernet switches
[222]* SDN approach in Application-Aware Networks (AAN) to improve Hadoop | Hadoop 2.x | Ryu, OVS, lightweight REST-API, ARP resolver | HiBench benchmark (sort, word count, scan, join, PageRank) | 1 master and 8 slaves in GENI testbed (2.67GHz CPU, 8GB RAM), 100 Mbps per link
[223]* Application-aware run-time SDN-based control for data centers with Hadoop | - | 2D Torus topology configuration algorithms | - | Hybrid data center with OpenFlow-enabled ToR electrical switches connected to an electrical core switch and a MEMS-based core optical switch
[224]* Experimental evaluation for MapReduce in c-Through and Helios | Hadoop 1.x | Topology manager, OpenFlow, circuit switch manager | TritonSort (900 GB) and TCP long-lived traffic | 24 servers in 4 racks, 5 Gbps packet links and 10 Gbps optical links, Monaco switches, Glimmerglass MEMS
[225]* Measuring latency in Torus-based hybrid optical/electrical switching data centers | - | Network Orchestrator Module (NOM) | Real-time network traffic via Iperf | 2D Torus network testbed constructed with 4 MEMS sliced from a 96×96 MEMS-based CrossFiber LiteSwitch and two Pica8 OpenFlow switches to emulate 16 NICs
[226]* Acceleration of incast and multicast traffic with on-demand passive optical modules | - | Floodlight, Redis pub/sub messages, integer program to configure the OSS | Two sets of 50 multicast jobs (500MB-5GB) with 4 and 7 groups | 8-node hybrid cluster testbed with optical splitters and combiners connected with a controlled Optical Space Switch (OSS)
[227]* HERO: hybrid accelerated delivery for multicast traffic | MPICH for Message Passing Interface (MPI) | SDN controller, greedy optical links assignment algorithm | 50 multicast flows with uniform flow sizes (100MB-1GB) and groups (10-100) | Small free-space optics multicast testbed with passive optical 1:9 splitters for throughput and delay evaluation, flow-level simulations for 100 racks in a spine and leaf hybrid topology
[228]* Multi-tenant virtual optical data center with network-aware VM placement | Virtual Data Center (VDC) composition | OpenDayLight, network-aware VM placement based on variance | 2 VDCs, randomly generated 500 VDCs with Poisson request arrivals | FPGA optoelectronics 12×10GE ToRs, SOA-based OPS switch, Polatis 192×192 MEMS-based switch as optical back-plane, simulations for 20/40 servers per rack and 8 racks
[229]* Reconfigurable 3D MEMS-based data center for optimized stream computing systems | System S | Scheduling Optimizer for Distributed Applications (SODA) | Multiple jobs for streaming applications | 3 IBM BladeCenter-H chassis, 4 HS21 blade servers per chassis, Calient 320×320-port 3D MEMS
[230]* Hybrid switching architecture for cloud applications in Fat-tree data centers | - | Multi-wavelength optical switch, distributed control for scheduling | 3 synthetic traffic patterns with mice and elephant flows | OPNET simulations for the proposed hybrid OCS and EPS switching in Fat-tree with 128 servers to evaluate the average end-to-end delay and the network throughput
[231]° Reduce tasks placements to reduce cross-rack traffic | Hadoop 1.x | Linear-time greedy algorithm, binary search | WordCount, Grep, PageRank, k-means, Frequency Pattern Match | Cluster with 4 racks (A, B, C, and D) with 7, 5, 4, and 4 servers (Intel Xeon E5504, E5520, E5620, and E5620 CPUs, 8GB RAM), 1 GbE links
[232]° Data center network-aware load balancing to optimize shuffling in MapReduce with skewed data | - | Greedy algorithm | Synthetic and Wikipedia page-to-page link datasets | Simulations for 12-ary Fat-tree with 1 Gbps links
[233]° Joint intermediate data partitioning and aggregation to improve the shuffling phase of MapReduce | - | Profiling, decomposition-based distributed algorithm | Dump files in Wikimedia, WordCount, random shuffling | Simulations for a three-tier data center with 20 nodes
[234]° DCP: distributed cache protocol to reduce redundant packets transmission in Fat-tree | - | In-memory store, Bloom filter header | Randomly generated packets with Zipf distribution | Simulations for Fat-tree with k=16 (1024 servers, 64 core, 128 aggregation, and 128 edge switches)
[235]° In-network aggregation in BCube for scalable and efficient data shuffling | Hadoop 1.x | IRS-based incast, SRS-based shuffle, Bloom filter | WordCount with combiner | 61 VMs in a 6-node emulated BCube (two 8-core CPUs, 24GB RAM, 1TB disk), large-scale simulations for BCube(8,5) with 262,144 servers
[236]° Application-level TCP tuning for data transfers through pipelining, parallelism, and concurrency | - | Recursive Chunk Division, Parallelism-Concurrency Pipelining | Bulk data transfers (512KB-2GB per file) | High-speed networking testbeds and cloud networks, Emulab-based emulations, AWS EC2 instances
[237]° Space Shuffle (S2): greedy routing on multiple rings to improve throughput, scalability, and flexibility | - | Greediest routing, MILP | Random permutation traffic | Fine-grained packet-level event-based simulations for the proposed data center, Jellyfish, SWDC, and Fat-tree
[238]° Non-blocking distributed oblivious adaptive routing in Clos-based data centers for big data applications | - | Approximate Markov chain models to predict convergence time | Random permutation traffic of 245KB flows | OMNet++-based simulations for an InfiniBand network (three-level Clos DCN with 24 input, 24 output, and 48 intermediate switches and 1152 nodes), 40 Gbps links
[239]° Evaluation of different routing protocols on the performance of spine and leaf data centers | - | Stochastic permutations for traffic generation | Bursty traffic, delay-sensitive workloads | Flow-level simulations for a spine and leaf data center with 8 spine switches, 16 leaf switches, and 128 end nodes
[240]° DCMPTCP: improved MPTCP for load balancing in data centers with rack-local and inter-rack traffic | - | Fallback for rACk-local Traffic, ADAptive packet scheduler | Many-to-one traffic, data mining and web search traffic | Ns3-based simulations for a spine and leaf data center with 8 spine switches, 8 leaf switches, and 64 nodes per leaf switch, 10 and 40 Gbps links
[241]° Greening data centers by Flow Preemption (FP) and Energy-Aware Routing (EAR) | - | Algorithm for the FP and EAR scheme | 10k flows with exponential distribution with a mean of 64MB | Simulations for 24-ary Fat-tree
[242]° Improving the energy efficiency of routing in data centers by joint VM assignments and TE | - | Approximate algorithm (GEERA) | Uniform random traffic and number of VMs | Fat-tree (4-ary and 8-ary), BCube(2,2) and BCube(8,2)
[243]° GreenDCN: scalable framework to green data center networks through optimized VM placement and TE | - | Time-slotted algorithms: optEEA, EER | Synthetic jobs with normal distribution for number of servers | Simulations for 8-ary and 12-ary Fat-tree data centers, 2 VMs per server, identical switches with 300W max power and max processing speed of 1 Tbps
[244]° VirtualKnotter: efficient online VM placement and migration to reduce congestion in data centers | - | Multiway θ-Kernighan-Lin and simulated annealing | Synthetic and realistic traffic | Dynamic simulations
[245]° Topology switching to allow multiple applications to use different routing schemes | - | Allocator, centralized topology server | Randomly generated routing tasks | Simulations for 16-ary Fat-tree with 1 Gbps links
[246]° Performance-centric fairness in links bandwidth allocation in private cloud data centers | - | Gradient projection-based distributed algorithm | Two scenarios for applications traffic | Simulations for a private data center with homogeneous nodes
[247]° MILPFlow: a tool set for modeling and data paths deployment in OpenFlow-based data centers | - | MILP, OVS | Video streaming, Iperf traffic | Mininet-based emulation for 4-ary Fat-tree in VirtualBox-based VMs, 1 GbE NIC
[248]° Application-aware SDN routing in data centers | Hadoop 1.x | Floodlight controller, managers, links monitor, routing component | WordCount | EstiNet-based emulation for 20 OpenFlow switches in 4-ary Fat-tree with 16 nodes
[249]° Hadoop acceleration in an OpenFlow-based cluster | Cloudera distribution of Hadoop | Floodlight, OVS and LINC switches | Sort (0.4 MB - 4GB), Iperf for background traffic | 10 VMs in 3 nodes in Cloudera connected by a physical switch
[250]° SDN-controlled dynamic workload balancing to improve completion time of Hadoop jobs | - | Balancing algorithm based on estimation and prediction | WordCount | Mumak-based simulations for a data center with three racks
[251]° SDN-based Network Overlay Framework (NoF) to define networks based on applications requirements | Hadoop 2.3.0 | Configuration engine, OVS, POX controller | TeraGen, TeraSort, iperf for background traffic | Virtualized testbed with spine-leaf virtual switches, VirtualBox for VMs
[252]° SDN approach in Application-Aware Networks (AAN) to improve Hadoop | Hadoop 1.x | Ryu, OVS, lightweight REST-API, ARP resolver | HiBench benchmark (sort, word count, scan, join, PageRank) | 1 master and 8 slaves in GENI testbed (2.67GHz CPU, 8GB RAM), 100 Mbps per link
[253]° Dynamic control for data volumes through SDN control and spate coding in cloud data centers | Vanilla Hadoop | Sampler, spate coding-based middleboxes for coding/decoding | TeraSort, Grep | Prototype: 2-tier data center with 8 nodes (12-core CPU, 128GB RAM, 1TB disk); testbed: 8 VMs controlled by Citrix XenServer and connected with OVS
[254]° Responsive multipath TCP for optimized shuffling in SDN-based data centers | - | Demand estimation, route calculation, and path selection algorithms | Random, permutation, and shuffling traffic | NS3-based simulations for 8-ary Fat-tree with 1 Gbps links and a customized SDN controller
[255]° LocalFlow: local link balancing for scalable and optimal flow routing in data centers | - | Switch-local algorithm | pcap packet traces, MapReduce-style flows | Packet-level network simulations (stand-alone and htsim-based) for 8-ary and 16-ary Fat-tree, VL2, and oversubscribed variations
[256]° XPath: explicit flow-level path control in commodity switches in data centers | - | 2-step compression algorithm | Random TCP connections, sequential read/write, shuffling | Testbed: 6-ary Fat-tree with Pronto Broadcom 48-port Ethernet switches; algorithm evaluation for BCube, Fat-tree, HyperX, and VL2 with different scales
[257]° Topology2Vec: DCN representation learning for networking applications | - | Biased random walk sampling, ML-based network partitioning | Real-world Internet topologies from the Topology Zoo | Simulations
* Data center topology, ° Data center routing.
[272]* Scheduling algorithms in hybrid - MILP, heuristics 10k-20k random 1024 nodes in two-tier network
packet optical data centers requests (4 core MEMS switches, and 64 ToRs)
[273]* Scheduling in packet-switched optical DCNs - Mahout, C4.5, and Naïve Random traffic Simulations for 80 ToRs connected by two 40×40 AWGRs
Bayes Discretization and two 1 × 2 space switches
(NBD)
[274]* Max-throughput data transfer scheduling - MILP, heuristic 1-1, many-1 Bulk Simulations for three-tier and Fat-tree DCNs with 128 nodes and
to minimize data retrieval time transfers VL2 with 160 nodes.
[275]* Topology-aware data placement Hadoop 2.2.0 Heuristic WordCount, TeraSort, 18 nodes, TopoSim MapReduce simulator
in geo-distributed data centers scheduler k-means
[276]* CoMan: Global in-network management in - 3/2 approximation 55 flows from Emulation for a Fat-tree DCN with 10 switches and 8 servers
multiplexed data centres algorithm 10 applications (using Pica8 3297), trace-driven simulations
[277]* Network-aware MapReduce tasks placement Hadoop 1.2.1 Probabilistic tasks Wordcount, Grep from Palmetto HPC platform with 1978 slave nodes (16 cores CPU, 16GB
to reduce transmission costs in DCNs scheduling algorithm, BigDataBench, Terasort RAM, 3GB disk), 10 GbE
[278]* Network-aware Scheduler (NAS) Hadoop 2.x MTS, CA-RTS, and CR- Facebook traces 40 nodes in 8 racks cluster with 1 GbE for cross-rack links
for high performance shuffling RTS scheduling Trace-driven event-based simulations for 600 nodes in 30 racks with
algorithms 200 users
[279]* PROTEUS: system for Temporally Interleaved Hive, Profiling, Sort, WordCount, Join, Profiling: 33 nodes (4 cores 2.4GHz CPU, 4GB RAM), Gigabit
Virtual Cluster (TIVC) abstraction for multi- Hadoop dynamic programming aggregation switch
tenant data centers 0.20.0 prototype: 18-nodes in three-tier data center with NetFPGA switches,
simulations
[280]* Corral: offline scheduling framework to reduce Hadoop 2.4 Offline planner, SWIM and Microsoft 210 nodes (32 cores) in 7 racks, 10GbE
cross-rack traffic for production workloads cluster scheduler traces, TPC-H large-scale simulations for 2000 nodes in 50 racks
[281]* Mercury: Fine-grained hybrid resources Hadoop 2.4.1 Daemon in each node, Gridmix and Microsoft’s 256 nodes cluster (2 8-core CPU, 128GB RAM, 10 3TB disks),
management and scheduling extensions to YARN production traces 10 Gbps in rack, 6 Gbps between racks
[282]* Optimizing containers allocation to reduce Hadoop 2.7, Simulated annealing, k-means, 8 nodes (8-core CPU, 32GB RAM, SATA disk) in a Fat-tree topology
congestion and improve data locality Apache Flink modification to AM connected components 1 GbE, OpenFlow 1.1
0.9
[283]* Examining the impact of the Energy Hadoop 1.x EEE in switches, TeraSort, Search, MRperf-based packet-level simulations
Efficient Ethernet on MapReduce workloads packet coalescing Index for two racks cluster with up to 80 nodes
[284]* Energy-aware hierarchical applications - k-th max node sorting, Uniform / normal random C++-based simulations for up to 4096 nodes
scheduling in large DCNs dynamic max node applications demands with 32, 64, 128, and 256 ports switches
sorting algorithms
[285]* Willow: Energy-efficient SDN-based flows Hadoop 1.x Online greedy Locally generated 16 nodes (AMD CPU, 2GB RAM), 1 GbE, Simulations for Fat-tree
scheduling in data centers with network-limited approximation algorithm MapReduce traces and Fat-tree with disabled core switches, 1 GbE
workloads
[286]* JouleMR: Cost-effective and energy-aware Hadoop 2.6.0 Two-phase heuristic TeraSort, GridMix 10 nodes cluster (2 6-cores CPUs, 24GB RAM, 500GB disk), 10 GbE
MapReduce jobs and tasks scheduling Facebook traces Simulations for 600 nodes in tree-structured DCN, with 1 GbE links
[287]* Evaluation of scale-up vs scale-out Vanilla Parameters optimization Log processing, select, 16 nodes (Quad-core CPU, 12GB RAM, 160GB HDD, 32GB SSD), 1
systems for Hadoop workloads Hadoop to Hadoop in scale-up aggregate, join, TeraSort, GbE, Dell PowerEdge910 (4 8-core CPU, 512GB RAM, 2 600GB
servers k-means, indexing HDD, 8 240GB SSD)
[288]° Hybrid scale-out and scale-up Hadoop 1.2.1 Automatic Hadoop WordCount, Grep, 12 nodes (2 4-core CPU, 16GB RAM, 193GB HDD), 10 Gbps
Hadoop cluster parameters optimization, Terasort, TestDFSIO, Myrinet, 2 nodes (4 6-core CPU, 505GB RAM, 91GB HDD), 10 Gbps
scheduler to select cluster Facebook traces Myrinet, OrangeFS
[289]° Measuring the impact of InfiniBand and Hadoop Testing different network Sort, random write, 10 nodes cluster (Quad-core CPU, 6GB RAM, 250GB HDD or up to
10 Gigabit Ethernet on HDFS 0.20.0 interfaces and sequential write 4 64GB SSDs), InfiniBand DDR 16 Gbps links and 10GbE
technologies
[290]° Optimizing HPC clusters for dual compute- Spark 0.7.0 Enhanced Load Balancer, GroupBy, Grep, Logistic Hyperion cluster with 101 nodes ( 2 8-core CPUs, 64GB RAM,
centric and data-centric computations Congestion-Aware Task Regression (LR) 128GB SSD) in 2 racks, InfiniBand QDR 32 Gbps links
Dispatching
[291]° Cost-per-performance evaluation of MapReduce Hadoop 2.x Standalone evaluation Teragen, Terasort, (8 core CPU, 48GB RAM) 10 GbE, single rack, SSDs 1.3TB, HDDs
clusters with SSDs and HDDs with different storage and Teravalidate, WordCount, 2TB; setups: 6 HDDs, 11 HDDs, 1 SSD, (6 HDDs + 1 SSD)
application shuffle, HDFS
configurations
[292]° Improving Hadoop framework for Hadoop Modified map data Terasort, DFSIO 16 nodes (8 core CPU, 256GB RAM, 600GB HDD, 512GB SATA
better utilization of SSDs 2.6.0, handling, pre-read in SSD, 10GbE) 8 nodes (6 core CPU, 256GB RAM, 250GB HDD,
Spark 1.3 HDFS, reduce scheduler, 800GB NVMe SSD, 10GbE) Ceres-II 39 nodes (6 core CPU, 64GB
placement policy RAM, 960GB NVMe SSD, 40GbE)
[293]° mpCache: SSD-based acceleration for Hadoop 2.2.0 Modified HDFS, Grep, WC - Wikipedia 7 nodes (2 8-core CPUs, 20MB cache, 32GB RAM, 2TB SATA
MapReduce workloads in commodity clusters admission control classification - Netflix, HDD, 2 160GB SATA SSD)
policy, main cache PUMA
replacement scheme,
[294]° Evaluation of IO-intensive applications Cassandra Docker’s data volume Cassandra-stress, (32 core hyper-threaded CPU, 20480KB cache, 128GB RAM,
with Docker containers 3.0, TPC-C benchmark and 3 960GB NVMe SSD)
MySQL 5.7,
FIO 2.2
[295]° Examining the performance of different Hadoop - Modification to MRPerf Synthesized Google MRperf-based simulations for 2 and 4 racks clusters
schedulers and impact of NAS on Hadoop to implement Fair share traces for 9174 jobs
scheduler and Quincy
[296]° Improving the flexibility of direct attached Hadoop FlexDAS switch, SAS TeraSort for 1TB, YCSB 12 nodes (4 core CPU, 48GB RAM), 10GbE and 48 external HDDs
storage via a Disk Area Network (DAN) 2.0.5, expanders, Host Bus COSBench 0.3.0.0
Cassandra Adapters (HBAs)
Swift 1.10.0
[297]° Optimizing bulk data transfers from parallel Scientific Layout-aware source Offline Storage Tables Spider storage system at the Oak Ridge Leadership computing
file systems via layout-aware I/O operations applications algorithm, layout-aware (OSTs) with 20MB and Facility with 96 DataDirect Network (DDN) S2A9900 RAID
sink algorithm 256MB files controllers for 13400 1TB SATA HDDs
[298]° FaRM: Fast Remote Memory system using Key-value Circular buffer for YCSB 20 nodes (2 8-core CPUs, 128GB RAM, 240GB SSD), 40Gbps
RDMA to improve in-memory key-value store and graph RDMA messaging, RDMA over Converged Ethernet (RoCE)
stores PhyCo kernel driver,
ZooKeeper
[299]° HadoopA: Virtual shuffling for efficient Hadoop virtual shuffling through TeraSort, Grep, 21-nodes cluster (dual socket quad-core 2.13 GHz Intel Xeon, 8 GB
data movement in MapReduce 0.20.0 three-level segment WordCount, and RAM, 500 GB disk, 8 PCI-Express 2.0 bus), InfiniBand QDR switch,
near-demand merging, Hive workloads 48-port 10GigE
and merging sub-trees
[300]° Micro-architectural characterization of scale- - Intel VTune for Caching, NoSQL, PowerEdge M1000e (2 Intel 6-core CPUs, 32KB
out workloads characterization MapReduce, media L1 cache, 256KB L2 cache, 12MB L3 cache, 24GB RAM)
PARSEC, SPEC2006,
TPC-E
[301]° A 2 PetaFLOP, 3 Petabyte, 9 TB/s - Hoffman-Singleton - 2,550 nodes (64-bit Boston chip, 64GB DDR3 SDRAM, 1TB NAND
and 90 kW cabinet for exascale computations topology flash)
[302]° Mars: Accelerating MapReduce Hadoop 1.x CUDA, Hadoop String match, matrix (240-core NVIDIA GTX280 + 4-core CPU), (128-core NVIDIA
with GPUs streaming, GPU Prefix multiplication, MC, 8800GTX + 4-core CPU), (320-core ATI Radeon HD 3870 + 2-core
Sum routine, GPU Black-Scholes, similarity CPU)
Bitonic Sort routine score, PCA
[303]° Using GPUs and MPTCP to Hadoop 1.x CUDA, Hadoop pipes, Terasort, WordCount and Node1 (Intel i7 920, 24GB RAM, 4 1TB HDD), node2 (Intel Quad
improve Hadoop performance MPTCP PiEstimate DataGen Q9400, NVIDIA GT530 GPU, 4GB RAM, 500GB HDD),
heterogeneous 5 nodes
[304]° FPMR: a MapReduce framework - On-chip processor RankBoost 1 node (Intel Pentium 4, 4GB RAM), Altera Stratix II EP2S180F1508
on FPGA to accelerate RankBoost scheduler, Common Data FPGA, Quartus II 8.1 and ModelSim 6.1-based simulations
Path (CDP)
[305]° SODA: software defined acceleration for big - Vivado high level Constrained Shortest Path Xilinx Zynq FPGA, ARM Cortex processor
data with heterogeneous reconfigurable synthesis tools, out-of- Finding (CSPF) for SDN
multicore FPGA resources order scheduling
algorithm
[306]° HLSMapReduceFlow: Synthesizable - High-Level Synthesis- WordCount, histogram, Virtex-7 FPGA
MapReduce Accelerator for FPGA-coupled Data based MapReduce Matrix multiplication,
Centers dataflow linear regression, PCA, k-
means
[307]° FPGA-based in-NIC (NetFPGA) and NoSQL DRAM and multi- USR, SYS, APP, NetFPGA-10G (Xilinx Virtex-5 XC5VTX240T)
software-based in-kernel caches for NoSQL processing elements in ETC, VAR
NetFPGA, Netfilter
framework for kernel
cache
[308]° FPGA-based processing in NIC Spark 1.6.0, one-at-a-time WordCount, change NetFPGA-10G (Xilinx Virtex-5 XC5VTX240TFFG1759-2) as NIC in
for Spark streaming Scala 2.10.5 methodology operations point detection server node (Intel core i5 CPU, 8GB DRAM) and a client node
[309]° FPGA-acceleration for large - Graph Processing Graphlet counting Convey HC-1 server with Virtex-5 LX330s FPGA
graph processing Elements (GPEs), algorithm
memory interconnect
network, run-time
management unit
[310]° Network requirements for big data Hadoop, Page-level memory WordCount, sort, EC2 instances (m3.2xlarge, c3.4xlarge), with virtual private network
application in disaggregated data centers Spark, access, block-level collaborative (VPC), simulations and emulations
GraphLab, distributed data filtering, YCSB
Timely placement RDMA and
dataflow, integrated NICs
Memcached,
HERD,
SparkSQL
[311]° A composable architecture for rack-scale Memcached, Empirical approaches for 100k Memcached H3 RAID array with 12 SATA drives and single PCIe Gen3 × 4 port,
disaggregation for big data computing Giraph, resources provisioning, operations, PCIe switch with 16 Gen2 × 4 port, host bus adapter connected to
Cassandra PCIe switches TopKPagerank, 10k IBM 3650M4
Cassandra operations
* Scheduling in data centers, ° Performance improvements based on advances in technologies.
VIII. SUMMARY OF OPTIMIZATION STUDIES AND FUTURE RESEARCH DIRECTIONS
Tremendous efforts were devoted in the last decade to address various challenges in optimizing the
deployment of big data applications and their infrastructures. The ever increasing data volumes and the
heterogeneous requirements of available and proposed frameworks in dedicated and multi-tenant clusters are still
calling for further improvements of different aspects of these infrastructures. We categorized the optimization
studies of big data applications into three broad categories which are at the applications-level, cloud networking-
level, and at the data center-level. Application-level studies focused on the efforts to improve the configuration or
structure of the frameworks. These studies were further categorized into optimizing jobs and data placements,
jobs scheduling, and completion time, in addition to benchmarking, production traces, modeling, profiling and
simulators for big data applications. Studies at the cloud networking level addressed the optimization beyond
single cluster deployments, for example for transporting big data and/or for in-network processing of big data.
These studies were further categorized into cloud resource management, virtual machine and container
assignments, bulk transfers and inter-data center networking, and SDN, EON, and NFV optimization studies.
Finally, the data center-level studies focused on optimizing the design of the clusters to improve the performance
of the applications. These studies were further categorized into topology design, routing, scheduling of flows, coflows,
and jobs, and studies of advances in computing, storage, and networking technologies. In what follows, we highlight
key challenges for big data application deployments and provide some insights into research gaps and future
directions.
Big Data Volumes: In most studies there is a huge gap between the actual volumes of big data and the tested
traffic or workload volumes. The main reason is the relatively high cost of experimenting in large
clusters or renting IaaS. This calls for improving existing simulators or performance models to enable accurate
testing of systems at large scale. Data volumes are continuing to grow beyond the capabilities of existing systems
due to many bottlenecks in computing and networking. This will require continuing the investment in scale-up
systems to incorporate different technologies such as SDN and advanced optical networks for intra and inter data
center networking. Another key challenge with big data is related to the veracity and value of big data which calls
for cleansing techniques prior to processing to eliminate unnecessary computations.
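As an illustration of the kind of lightweight performance model that could complement full-scale simulators, the sketch below estimates the completion time of a single MapReduce job from a handful of cluster and workload parameters. The wave-based map/shuffle/reduce decomposition, the function name estimate_job_time_s, and all default parameter values are illustrative assumptions rather than results from any of the surveyed studies.

# A minimal sketch of a coarse-grained MapReduce completion-time model.
# Assumptions (illustrative, not taken from any surveyed study): map and
# reduce tasks run in "waves" over a fixed number of slots, and the shuffle
# phase is bottlenecked by an aggregate network bandwidth.
import math

def estimate_job_time_s(input_gb, map_slots, reduce_slots,
                        map_gb_per_s_per_slot=0.05,
                        reduce_gb_per_s_per_slot=0.05,
                        shuffle_ratio=0.5,      # fraction of the input shuffled
                        net_gb_per_s=1.25,      # aggregate shuffle bandwidth
                        block_gb=0.128):        # 128 MB HDFS block
    n_maps = math.ceil(input_gb / block_gb)
    map_waves = math.ceil(n_maps / map_slots)
    map_time = map_waves * (block_gb / map_gb_per_s_per_slot)
    shuffle_gb = input_gb * shuffle_ratio
    shuffle_time = shuffle_gb / net_gb_per_s
    reduce_time = (shuffle_gb / reduce_slots) / reduce_gb_per_s_per_slot
    return map_time + shuffle_time + reduce_time

if __name__ == "__main__":
    # A 1 TB job on a cluster with 400 map slots and 100 reduce slots.
    print(round(estimate_job_time_s(1000, 400, 100), 1), "seconds (rough estimate)")

Such closed-form estimators are far less accurate than packet-level simulation, but they can be evaluated for thousands of scenarios at scales that are otherwise too costly to test.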
Workload characteristics and their modeling: Big data workload characteristics and available frameworks will
keep changing and evolving. Most big data studies address only the MapReduce framework, while few consider
other variations such as key-value stores, streaming and graph processing applications, or a mixture of
applications. Also, several studies have utilized scaled or outdated production traces for which only high-level
statistics are available, or evaluated only a subset of workloads from micro-benchmarks, which might
not be very representative. Thus, there is a need for more comprehensive, enhanced, and updated benchmarking
suites and production traces to reflect more recent frameworks and workload characteristics.
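Where only high-level statistics of a production trace are published, one workaround is to resynthesize a workload from those statistics. The short sketch below draws job inter-arrival times and input sizes from exponential and lognormal distributions; the choice of distributions, the function synth_trace, and all parameter values are assumptions made for illustration and should be fitted to a real trace before use.

# Minimal sketch: regenerate a synthetic job trace from summary statistics.
# The chosen distributions (exponential inter-arrivals, lognormal input sizes)
# are illustrative assumptions; real traces should be fitted before use.
import random

def synth_trace(n_jobs, mean_interarrival_s=30.0,
                size_mu=4.0, size_sigma=2.0, seed=42):
    rng = random.Random(seed)
    t = 0.0
    jobs = []
    for j in range(n_jobs):
        t += rng.expovariate(1.0 / mean_interarrival_s)
        input_gb = rng.lognormvariate(size_mu, size_sigma) / 1024.0  # MB -> GB
        jobs.append({"job": j, "arrival_s": round(t, 1),
                     "input_gb": round(input_gb, 3)})
    return jobs

if __name__ == "__main__":
    for job in synth_trace(5):
        print(job)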
Resources allocation: Different workloads can have uncorrelated CPU, I/O, memory, and networking resource
requirements, and placing them together can improve the overall infrastructure utilization. However, isolating
resources such as cache, network, and I/O via recent management platforms at large scale is still challenging.
Improving some of the resources can change the optimum configurations for applications. For example, improving
the networking can reduce the need to maintain data locality, and hence, the stress on tasks scheduling is reduced.
Also, improving the CPU can change CPU-bound workloads into I/O- or memory-bound workloads. Another
challenge is that there is still a gap between the users' knowledge of their requirements and the resources they
lease, which leads to non-optimal configurations and hence wasted resources, delayed responses, and higher
energy consumption. More tools to aid users in understanding their requirements and workloads are required
for better resource utilization.
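To make the co-placement argument concrete, the following sketch packs workloads with multi-dimensional (CPU, memory, network) demands onto nodes using a simple first-fit heuristic over normalized demand vectors. The node capacities, demand figures, and the heuristic itself are illustrative assumptions and not a policy taken from any surveyed scheduler.

# Minimal sketch of multi-resource (vector) first-fit placement that tends to
# co-locate workloads with complementary CPU/memory/network demands.
# All capacities and demands are illustrative assumptions.

NODE_CAPACITY = {"cpu": 32.0, "mem_gb": 128.0, "net_gbps": 10.0}

def fits(node_load, demand):
    return all(node_load[r] + demand[r] <= NODE_CAPACITY[r] for r in NODE_CAPACITY)

def place(workloads, n_nodes):
    nodes = [dict(cpu=0.0, mem_gb=0.0, net_gbps=0.0) for _ in range(n_nodes)]
    placement = {}
    # Place the workloads with the largest normalized demand first.
    ordered = sorted(workloads.items(),
                     key=lambda kv: max(kv[1][r] / NODE_CAPACITY[r] for r in kv[1]),
                     reverse=True)
    for name, demand in ordered:
        for i, load in enumerate(nodes):
            if fits(load, demand):
                for r in demand:
                    load[r] += demand[r]
                placement[name] = i
                break
        else:
            placement[name] = None   # would need another node
    return placement

if __name__ == "__main__":
    workloads = {
        "cpu_bound_job": {"cpu": 24.0, "mem_gb": 16.0, "net_gbps": 1.0},
        "shuffle_heavy": {"cpu": 4.0,  "mem_gb": 32.0, "net_gbps": 8.0},
        "memory_cache":  {"cpu": 4.0,  "mem_gb": 96.0, "net_gbps": 1.0},
    }
    print(place(workloads, n_nodes=2))

In this toy example the CPU-bound and shuffle-heavy workloads share one node because their dominant demands are on different resources, which is precisely the effect the paragraph above describes.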
Performance in proposed clusters: Most of the research that enhances big data frameworks, reported in
Section III, was carried out on scale-out clusters while overlooking the physical layer characteristics and focusing on
the framework details. Conversely, most of the studies in Sections V and VII have considered custom-built
simulators, SDN switch-based emulations, or small scale-up/out clusters while focusing on the hardware impact,
considering only a subset of configurations, and oversimplifying the frameworks' characteristics. Although it
might sound more economical to scale out infrastructures as the computational requirements increase, this might
not be sufficient for applications with strict QoS requirements, as scaling out depends on a high level of parallelism
which is constrained by the network between the nodes. Hence, improving the networking and scaling up the data
centers are required and are gaining research and industrial attention. When improving DCNs, many trade-offs
should be considered, including scalability, agility, and end-to-end delays, in addition to the complexity of the
routing and scheduling mechanisms required to fully exploit the increased link bandwidth. This will mostly be
satisfied by application-centric DCN architectures that utilize SDN to dynamically vary the topologies at run-time
to match the requirements of deployed applications.
Clusters heterogeneity: The potential existence of hardware with different specifications, for example due to
replacements in very large clusters, can lead to completion time imbalance among tasks. This requires more
attention in big data studies and accurate profiling of the performance of all nodes so that task scheduling can
reduce the imbalance.
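One simple way to exploit per-node profiles is to split a job's tasks in proportion to each node's measured throughput so that all nodes finish at roughly the same time. The sketch below does exactly that; the throughput figures are hypothetical profiling results used only for illustration.

# Minimal sketch: heterogeneity-aware task allocation proportional to the
# measured per-node task throughput (tasks/minute). The profile values are
# hypothetical; in practice they would come from continuous profiling.

def proportional_split(total_tasks, node_throughput):
    total_rate = sum(node_throughput.values())
    shares = {n: int(total_tasks * r / total_rate) for n, r in node_throughput.items()}
    # Hand out any remainder to the fastest nodes first.
    remainder = total_tasks - sum(shares.values())
    for n, _ in sorted(node_throughput.items(), key=lambda kv: kv[1], reverse=True):
        if remainder == 0:
            break
        shares[n] += 1
        remainder -= 1
    return shares

if __name__ == "__main__":
    profile = {"old_node": 6.0, "new_node": 14.0, "gpu_node": 20.0}  # tasks/min
    print(proportional_split(total_tasks=100, node_throughput=profile))
    # Nodes receive 15, 35, and 50 tasks respectively, so all finish together.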
Multi-tenancy environments: Multi-tenancy is a key requirement enabled by virtualization of cloud data centers
where workloads of different users are multiplexed in the same infrastructure to improve the utilization. In current
cloud services, static cluster configurations and manual adjustments are still the norm, but they are not optimal. There
is still a lack of dynamic job allocation and scheduling for multiple users. In such environments, performance
isolation between users sharing resources such as the network and I/O is not yet widely addressed. Also,
fairness between users and optimal pricing while maintaining acceptable QoS for all users is still a challenging
research topic requiring more comprehensive multi-objective studies.
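Fairness across tenants with heterogeneous demands is often reasoned about using dominant resource fairness (DRF). The sketch below implements the basic DRF loop, repeatedly granting one task to the tenant with the smallest dominant share; the cluster capacities and per-task demands are chosen purely for illustration.

# Minimal sketch of the dominant-resource fairness (DRF) allocation loop for
# two tenants sharing CPU and memory. Capacities and demands are illustrative.

CAPACITY = {"cpu": 90.0, "mem_gb": 180.0}

def drf(demands, rounds=100):
    used = {r: 0.0 for r in CAPACITY}
    alloc = {t: 0 for t in demands}            # tasks granted per tenant
    for _ in range(rounds):
        # Dominant share = max over resources of (tenant usage / capacity).
        def dominant_share(t):
            return max(alloc[t] * demands[t][r] / CAPACITY[r] for r in CAPACITY)
        tenant = min(demands, key=dominant_share)
        d = demands[tenant]
        if any(used[r] + d[r] > CAPACITY[r] for r in CAPACITY):
            break                               # no room for the next task
        for r in CAPACITY:
            used[r] += d[r]
        alloc[tenant] += 1
    return alloc

if __name__ == "__main__":
    demands = {"tenant_A": {"cpu": 1.0, "mem_gb": 4.0},   # memory-heavy
               "tenant_B": {"cpu": 3.0, "mem_gb": 1.0}}   # CPU-heavy
    print(drf(demands))

With these numbers the loop terminates with 30 tasks for the memory-heavy tenant and 20 for the CPU-heavy tenant, equalizing their dominant shares at about two thirds.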
Geo-distributed frameworks: The challenges faced in geo-distributed frameworks were summarized in
Subsection IV-C. Those include the need for modifying the frameworks which were originally designed for single
cluster deployments. There is a need for new frameworks for offers and pricing, optimal resource allocation in
heterogeneous clusters, QoS guarantees, resilience, and energy consumption minimization. A key challenge with
workloads in geo-distributed frameworks is that not all workloads can be divided, as the whole data set may need
to reside in a single cluster; moreover, transporting some data sets from remote locations can incur high data-access
latency. Extended SDN control across transport networks and within data centers is a promising research area
for jointly optimizing path computation, provisioning distributed resources, and reducing job completion times,
and thereby improving big data applications and frameworks in geo-distributed environments.
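As a toy illustration of the joint placement problem, the sketch below selects, for each job, the cluster that minimizes an assumed cost combining WAN transfer and compute charges. The cost model, the cluster names, and all figures are assumptions made only for the example; real deployments would also need to account for data-access latency and divisibility constraints.

# Minimal sketch: pick a cluster for a geo-distributed job by minimizing an
# assumed cost = WAN transfer cost + compute cost. All figures are illustrative.

CLUSTERS = {
    "eu":   {"compute_cost_per_gb": 0.020},
    "us":   {"compute_cost_per_gb": 0.015},
    "asia": {"compute_cost_per_gb": 0.025},
}
WAN_COST_PER_GB = 0.05   # assumed cost of moving 1 GB between regions

def choose_cluster(job):
    best, best_cost = None, float("inf")
    for name, c in CLUSTERS.items():
        # Data already resident in the chosen region needs no WAN transfer.
        transfer_gb = sum(gb for region, gb in job["data_gb"].items() if region != name)
        cost = transfer_gb * WAN_COST_PER_GB + job["total_gb"] * c["compute_cost_per_gb"]
        if cost < best_cost:
            best, best_cost = name, cost
    return best, round(best_cost, 2)

if __name__ == "__main__":
    job = {"data_gb": {"eu": 400.0, "us": 100.0, "asia": 20.0}, "total_gb": 520.0}
    print(choose_cluster(job))   # data gravity pulls the job to the "eu" cluster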
Power consumption: Current infrastructures exhibit a trade-off between energy efficiency and application
performance. Most cloud providers still favor over-provisioning to meet SLAs over reducing the power
consumption. Reducing the power consumption while maintaining the performance is an area that should be
explored further in designing future systems. For example, it is attractive to consider more energy-efficient
components and more efficient geo-distributed frameworks that reduce the need to transport big data. It is also
attractive to perform progressive computations in the network as the data is in transit, while considering agile
technologies such as SDN, VNE, and NFV, which can improve the energy efficiency of big data applications;
however, the impact of these strategies on application performance should be comprehensively addressed.
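The energy/performance trade-off can be made explicit with the commonly used linear server power model, P(u) = P_idle + (P_max - P_idle)·u. The sketch below compares running the same workload, within the same deadline, on a small fully loaded cluster and on a larger half-loaded one; all power and workload figures are illustrative assumptions.

# Minimal sketch: linear server power model P(u) = P_idle + (P_max - P_idle)*u,
# used to compare consolidating a workload on few servers vs spreading it out.
# Power figures and workload size are illustrative assumptions.

P_IDLE_W, P_MAX_W = 100.0, 250.0

def power_w(utilization):
    return P_IDLE_W + (P_MAX_W - P_IDLE_W) * utilization

def cluster_energy_kwh(total_work_core_hours, servers, duration_h, cores_per_server=16):
    # Assume the work is spread evenly over all servers for the whole duration.
    utilization = min(1.0, total_work_core_hours /
                      (servers * cores_per_server * duration_h))
    return servers * power_w(utilization) * duration_h / 1000.0

if __name__ == "__main__":
    work, deadline_h = 1600.0, 10.0   # core-hours of work, deadline in hours
    for servers in (10, 20):
        e = cluster_energy_kwh(work, servers, deadline_h)
        print(f"{servers} servers for {deadline_h:.0f} h: {e:.1f} kWh")
    # 10 fully loaded servers use ~25 kWh; 20 half-loaded servers use ~35 kWh
    # for the same deadline, because idle power dominates at low utilization.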
IX. CONCLUSIONS
Improving the performance and efficiency of big data frameworks and applications is an ongoing critical
research area as they are becoming the mainstream for implementing various services including data analytics and
machine learning at large scale. These are services that continue to grow in importance. Support should also be
developed for other services deployed fully or partially in cloud and fog computing environments with ever
increasing volumes of generated data. Big data interacts with systems at different levels, starting from the
acquisition of data through wireless and wired access networks from users and IoT devices, to transmission through
wide area networks, into different types of data centers for storage and processing via different frameworks. This
paper surveyed big data applications and the technologies and network infrastructure needed to implement
them. It identified a number of key challenges and research gaps in optimizing big data applications and
infrastructures. It has also comprehensively summarized early and recent efforts towards improving the
performance and/or energy efficiency of such big data applications at different layers. For the convenience of
readers with different backgrounds, brief descriptions of big data applications and frameworks, cloud computing
and related emerging technologies, and data centers are provided in Sections II, IV, and VI, respectively. The
optimization studies that appear in Section III focus on the frameworks, those in Section V focus on
cloud networking, and those in Section VII focus on data centers; comprehensive summaries are given in
Tables I-VI. The survey paid attention to a range of existing and proposed technologies and focused on different
frameworks and applications including MapReduce, data streaming and graph processing. The survey considered
different optimization metrics (e.g. completion time, fairness, cost, profit, and energy consumption), reported
studies that considered different representative workloads, optimization tools and mathematical techniques, and
considered simulation-based and experimental evaluations in clouds and prototypes. We provided some future
research directions in Section VIII to aid researchers in identifying the limitations of current solutions and hence
determine the areas where future technologies should be developed in order to improve big data applications, their
infrastructures, and their performance.
ACKNOWLEDGEMENTS
Sanaa Hamid Mohamed would like to acknowledge Doctoral Training Award (DTA) funding from the UK
Engineering and Physical Sciences Research Council (EPSRC). This work was supported by the Engineering and
Physical Sciences Research Council, INTERNET (EP/H040536/1), STAR (EP/K016873/1) and TOWS
(EP/S016570/1) projects. All data are provided in full in the results section of this paper.
REFERENCES
[1] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward Scalable Systems for Big Data Analytics: A Technology
Tutorial,” Access, IEEE, vol. 2, pp. 652–687, 2014.
[2] Y. Demchenko, C. de Laat, and P. Membrey, “Defining architecture components of the Big Data Ecosystem,”
in Collaboration Technologies and Systems (CTS), 2014 International Conference on, May 2014, pp. 104–112.
[3] X. Yi, F. Liu, J. Liu, and H. Jin, “Building a network highway for big data: architecture and challenges,”
Network, IEEE, vol. 28, no. 4, pp. 5–13, July 2014.
[4] H. Fang, Z. Zhang, C. J. Wang, M. Daneshmand, C. Wang, and H. Wang, “A survey of big data research,”
Network, IEEE, vol. 29, no. 5, pp. 6–9, September 2015.
[5] W. Tan, M. Blake, I. Saleh, and S. Dustdar, “Social-Network-Sourced Big Data Analytics,” Internet
Computing, IEEE, vol. 17, no. 5, pp. 62–69, Sept 2013.
[6] C. Fang, J. Liu, and Z. Lei, “Parallelized user clicks recognition from massive HTTP data based on dependency
graph model,” Communications, China, vol. 11, no. 12, pp. 13–25, Dec 2014.
[7] H. Hu, Y. Wen, Y. Gao, T.-S. Chua, and X. Li, “Toward an SDN-enabled big data platform for social TV
analytics,” Network, IEEE, vol. 29, no. 5, pp. 43–49, September 2015.
[8] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Commun. ACM,
vol. 51, no. 1, pp. 107–113, Jan. 2008.
[9] C. P. Chen and C.-Y. Zhang, “Data-intensive applications, challenges, techniques and technologies: A survey
on big data,” Information Sciences, vol. 275, pp. 314 – 347, 2014.
[10] J. Hack and M. Papka, “Big Data: Next-Generation Machines for Big Science,” Computing in Science
Engineering, vol. 17, no. 4, pp. 63–65, July 2015.
[11] Y. Xu and S. Mao, “A survey of mobile cloud computing for rich media applications,” Wireless
Communications, IEEE, vol. 20, no. 3, pp. 46–53, June 2013.
[12] “Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016-2021,” White Paper,
Cisco, March 2017.
[13] X. He, K. Wang, H. Huang, and B. Liu, “QoE-Driven Big Data Architecture for Smart City,” IEEE
Communications Magazine, vol. 56, no. 2, pp. 88–93, Feb 2018.
[14] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of Things: A Survey on
Enabling Technologies, Protocols, and Applications,” Communications Surveys Tutorials, IEEE, vol. 17, no. 4,
pp. 2347–2376, Fourthquarter 2015.
[15] E. Marín-Tordera, X. Masip-Bruin, J. G. Almiñana, A. Jukan, G. Ren, J. Zhu, and J. Farre, “What is a Fog
Node A Tutorial on Current Concepts towards a Common Definition,” CoRR, vol. abs/1611.09193, 2016.
[16] P. Mach and Z. Becvar, “Mobile Edge Computing: A Survey on Architecture and Computation Offloading,”
IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1628–1656, thirdquarter 2017.
[17] I. Stojmenovic, “Fog computing: A cloud to the ground support for smart things and machine-to-machine
networks,” in 2014 Australasian Telecommunication Networks and Applications Conference (ATNAC), Nov
2014, pp. 117–122.
[18] S. Wang, X. Wang, J. Huang, R. Bie, and X. Cheng, “Analyzing the potential of mobile opportunistic
networks for big data applications,” IEEE Network, vol. 29, no. 5, pp. 57–63, September 2015.
[19] Z. T. Al-Azez, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient IoT
virtualization framework with passive optical access networks,” in 2016 18th International Conference on
Transparent Optical Networks (ICTON), July 2016, pp. 1–4.
[20] Z. T. Al-Azez and A. Q. Lawey and T. E. H. El-Gorashi and J. M. H. Elmirghani, “Virtualization framework
for energy efficient IoT networks,” in 2015 IEEE 4th International Conference on Cloud Networking (CloudNet),
Oct 2015, pp. 74–77.
[21] A. A. Alahmadi, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Distributed processing in
vehicular cloud networks,” in 2017 8th International Conference on the Network of the Future (NOF), Nov 2017,
pp. 22–26.
[22] H. Q. Al-Shammari, A. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient service embedding
in IoT networks,” in 2018 27th Wireless and Optical Communication Conference (WOCC), April 2018, pp. 1–5.
[23] B. Yosuf, M. Musa, T. Elgorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy Efficient Service
Distribution in Internet of Things,” in 2018 20th International Conference on Transparent Optical Networks
(ICTON), July 2018, pp. 1–4.
[24] I. S. M. Isa, M. O. I. Musa, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy Efficiency
of Fog Computing Health Monitoring Applications,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–5.
[25] M. B. A. Halim, S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Fog-assisted caching
employing solar renewable energy for delivering video on demand service,” CoRR, vol. abs/1903.10250, 2019.
[26] F. S. Behbehani, M. O. I. Musa, T. Elgorashi, and J. M. H. Elmirghani, “Energy-efficient distributed
processing in vehicular cloud architecture,” CoRR, vol. abs/1903.12451, 2019.
[27] B. Yosuf, M. N. Musa, T. Elgorashi, and J. M. H. Elmirghani, “Impact of distributed processing on power
consumption for iot based surveillance applications,” 2019.
[28] I. S. M. Isa, M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient and Resilient
Infrastructure for Fog Computing Health Monitoring Applications,” arXiv e-prints, p. arXiv:1904.01732, Apr
2019.
[29] R. Ma, A. A. Alahmadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient Software Matching
in Vehicular Fog,” arXiv e-prints, p. arXiv:1904.02592, Apr 2019.
[30] Z. T. Al-Azez, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient IoT
Virtualization Framework with Peer to Peer Networking and Processing,” IEEE Access, pp. 1–1, 2019.
[31] M. S. H. Graduate, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Patient-Centric Cellular
Networks Optimization using Big Data Analytics,” IEEE Access, pp. 1–1, 2019.
[32] K. Dolui and S. K. Datta, “Comparison of edge computing implementations: Fog computing, cloudlet and
mobile edge computing,” in 2017 Global Internet of Things Summit (GIoTS), June 2017, pp. 1–6.
[33] J. Andreu-Perez, C. Poon, R. Merrifield, S.Wong, and G.-Z. Yang, “Big Data for Health,” Biomedical and
Health Informatics, IEEE Journal of, vol. 19, no. 4, pp. 1193–1208, July 2015.
[34] X. Xu, Q. Sheng, L.-J. Zhang, Y. Fan, and S. Dustdar, “From Big Data to Big Service,” Computer, vol. 48,
no. 7, pp. 80–83, July 2015.
[35] S. Mazumdar and S. Dhar, “Hadoop as Big Data Operating System– The Emerging Approach for Managing
Challenges of Enterprise Big Data Platform,” in Big Data Computing Service and Applications (BigDataService),
2015 IEEE First International Conference on, March 2015, pp. 499–505.
[36] J. Xie, S. Yin, X. Ruan, Z. Ding, Y. Tian, J. Majors, A. Manzanares, and X. Qin, “Improving MapReduce
performance through data placement in heterogeneous Hadoop clusters,” in Parallel Distributed Processing,
Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, April 2010, pp. 1–9.
[37] J. Leverich and C. Kozyrakis, “On the Energy (in)Efficiency of Hadoop Clusters,” SIGOPS Oper. Syst. Rev.,
vol. 44, no. 1, pp. 61–65, Mar. 2010.
[38] Y. Ying, R. Birke, C. Wang, L. Chen, and N. Gautam, “Optimizing Energy, Locality and Priority in a
MapReduce Cluster,” in Autonomic Computing (ICAC), 2015 IEEE International Conference on, July 2015, pp.
21–30.
[39] X. Ma, X. Fan, J. Liu, and D. Li, “Dependency-aware Data Locality for MapReduce,” Cloud Computing,
IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[40] J. Tan, S. Meng, X. Meng, and L. Zhang, “Improving ReduceTask data locality for sequential MapReduce
jobs,” in INFOCOM, 2013 Proceedings IEEE, April 2013, pp. 1627–1635.
[41] B. Arres, N. Kabachi, O. Boussaid, and F. Bentayeb, “Intentional Data Placement Optimization for
Distributed Data Warehouses,” in Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference
on, Oct 2015, pp. 80–86.
[42] X. Bao, L. Liu, N. Xiao, F. Liu, Q. Zhang, and T. Zhu, “HConfig: Resource adaptive fast bulk loading in
HBase,” in Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2014
International Conference on, Oct 2014, pp. 215–224.
[43] Y. Elshater, P. Martin, D. Rope, M. McRoberts, and C. Statchuk, “A Study of Data Locality in YARN,” in
2015 IEEE International Congress on Big Data, June 2015, pp. 174–181.
[44] Z. Liu, Q. Zhang, R. Ahmed, R. Boutaba, Y. Liu, and Z. Gong, “Dynamic Resource Allocation for
MapReduce with Partitioning Skew,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.
[45] X. Ding, Y. Liu, and D. Qian, “JellyFish: Online Performance Tuning with Adaptive Configuration and
Elastic Container in Hadoop Yarn,” in Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International
Conference on, Dec 2015, pp. 831–836.
[46] M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, and I. Stoica, “Delay Scheduling: A
Simple Technique for Achieving Locality and Fairness in Cluster Scheduling,” in Proceedings of the 5th European
Conference on Computer Systems, ser. EuroSys ’10. New York, NY, USA: ACM, 2010, pp. 265–278.
[47] M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg, “Quincy: Fair Scheduling for
Distributed Computing Clusters,” in Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems
Principles, ser. SOSP ’09. New York, NY, USA: ACM, 2009, pp. 261–276.
[48] S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu, and S. Wu, “Maestro: Replica-Aware Map Scheduling for
MapReduce,” in Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium
on, May 2012, pp. 435–442.
[49] W. Wang, K. Zhu, L. Ying, J. Tan, and L. Zhang, “MapTask Scheduling in MapReduce With Data Locality:
Throughput and Heavy-Traffic Optimality,” Networking, IEEE/ACM Transactions on, vol. PP, no. 99,
pp. 1–1, 2014.
[50] J. Polo, C. Castillo, D. Carrera, Y. Becerra, I. Whalley, M. Steinder, J. Torres, and E. Ayguadé, “Resource-
aware Adaptive Scheduling for Mapreduce Clusters,” in Proceedings of the 12th ACM/IFIP/USENIX
International Conference on Middleware, ser. Middleware’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp.
187–207.
[51] J. Wolf, D. Rajan, K. Hildrum, R. Khandekar, V. Kumar, S. Parekh, K.-L. Wu, and A. Balmin, “FLEX: A
Slot Allocation Scheduling Optimizer for MapReduce Workloads,” in Proceedings of the ACM/IFIP/USENIX
11th International Conference on Middleware, ser. Middleware ’10. Berlin, Heidelberg: Springer-Verlag, 2010,
pp. 1–20.
[52] M. Pastorelli, D. Carra, M. Dell’Amico, and P. Michiardi, “HFSP: Bringing Size-Based Scheduling To
Hadoop,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[53] Y. Yao, J. Tai, B. Sheng, and N. Mi, “LsPS: A Job Size-Based Scheduler for Efficient Task Assignments in
Hadoop,” Cloud Computing, IEEE Transactions on, vol. 3, no. 4, pp. 411–424, Oct 2015.
[54] Y. Yuan, D. Wang, and J. Liu, “Joint scheduling of MapReduce jobs with servers: Performance bounds and
experiments,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, April 2014, pp. 2175–
2183.
[55] F. Chen, M. Kodialam, and T. V. Lakshman, “Joint scheduling of processing and Shuffle phases in
MapReduce systems,” in INFOCOM, 2012 Proceedings IEEE, March 2012, pp. 1143–1151.
[56] S. Kurazumi, T. Tsumura, S. Saito, and H. Matsuo, “Dynamic Processing Slots Scheduling for I/O Intensive
Jobs of Hadoop MapReduce,” in Proceedings of the 2012 Third International Conference on Networking and
Computing, ser. ICNC ’12. Washington, DC, USA: IEEE Computer Society, 2012, pp. 288–292.
[57] E. Bampis, V. Chau, D. Letsios, G. Lucarelli, I. Milis, and G. Zois, “Energy Efficient Scheduling of
MapReduce Jobs,” in Euro-Par 2014 Parallel Processing, ser. Lecture Notes in Computer Science, F. Silva, I.
Dutra, and V. Santos Costa, Eds. Springer International Publishing, 2014, vol. 8632, pp. 198–209.
[58] T. Wirtz and R. Ge, “Improving MapReduce Energy Efficiency for Computation Intensive Workloads,” in
Proceedings of the 2011 International Green Computing Conference and Workshops, ser. IGCC ’11. Washington,
DC, USA: IEEE Computer Society, 2011, pp. 1–8.
[59] L. Mashayekhy, M. Nejad, D. Grosu, Q. Zhang, and W. Shi, “Energy- Aware Scheduling of MapReduce
Jobs for Big Data Applications,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 10, pp.
2720–2733, Oct 2015.
[60] S. Tang, B.-S. Lee, and B. He, “DynamicMR: A Dynamic Slot Allocation Optimization Framework for
MapReduce Clusters,” Cloud Computing, IEEE Transactions on, vol. 2, no. 3, pp. 333–347, July 2014.
[61] Q. Zhang, M. Zhani, Y. Yang, R. Boutaba, and B. Wong, “PRISM: Fine-Grained Resource-Aware
Scheduling for MapReduce,” Cloud Computing, IEEE Transactions on, vol. 3, no. 2, pp. 182–194, April 2015.
[62] Y. Yao, J. Wang, B. Sheng, J. Lin, and N. Mi, “HaSTE: Hadoop YARN Scheduling Based on Task-
Dependency and Resource-Demand,” in 2014 IEEE 7th International Conference on Cloud Computing, June
2014, pp. 184–191.
[63] P. Li, L. Ju, Z. Jia, and Z. Sun, “SLA-Aware Energy-Efficient Scheduling Scheme for Hadoop YARN,” in
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on
Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and
Systems (ICESS), 2015 IEEE 17th International Conference on, Aug 2015, pp. 623–628.
[64] P. Bellavista, A. Corradi, A. Reale, and N. Ticca, “Priority-Based Resource Scheduling in Distributed Stream
Processing Systems for Big Data Applications,” in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th
International Conference on, Dec 2014, pp. 363–370.
[65] K. Xiong and Y. He, “Power-efficient resource allocation in MapReduce clusters,” in Integrated Network
Management (IM 2013), 2013 IFIP/IEEE International Symposium on, May 2013, pp. 603–608.
[66] D. Cheng, J. Rao, C. Jiang, and X. Zhou, “Resource and Deadline- Aware Job Scheduling in Dynamic
Hadoop Clusters,” in Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International,
May 2015, pp. 956–965.
[67] Z. Ren, J. Wan, W. Shi, X. Xu, and M. Zhou, “Workload Analysis, Implications, and Optimization on a
Production Hadoop Cluster: A Case Study on Taobao,” IEEE Transactions on Services Computing,
vol. 7, no. 2, pp. 307–321, April 2014.
[68] C. Chen, J. Lin, and S. Kuo, “MapReduce Scheduling for Deadline- Constrained Jobs in Heterogeneous
Cloud Computing Systems,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[69] A. Verma, L. Cherkasova, and R. H. Campbell, “ARIA: Automatic Resource Inference and Allocation for
Mapreduce Environments,” in Proceedings of the 8th ACM International Conference on Autonomic Computing,
ser. ICAC ’11. New York, NY, USA: ACM, 2011, pp. 235–244.
[70] X. Dai and B. Bensaou, “Scheduling for response time in Hadoop MapReduce,” in 2016 IEEE International
Conference on Communications (ICC), May 2016, pp. 1–6.
[71] H. Chang, M. Kodialam, R. R. Kompella, T. V. Lakshman, M. Lee, and S. Mukherjee, “Scheduling in
mapreduce-like systems for fast completion time,” in INFOCOM, 2011 Proceedings IEEE, April 2011,
pp. 3074–3082.
[72] S. Tang, B. Lee, and B. He, “Dynamic Job Ordering and Slot Configurations for MapReduce Workloads,”
Services Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[73] T. Li, G. Yu, X. Liu, and J. Song, “Analyzing the Waiting Energy Consumption of NoSQL Databases,” in
Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on, Aug 2014,
pp. 277–282.
[74] S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou, “Re-optimizing Data-parallel
Computing,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation,
ser. NSDI’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 21–21.
[75] H. Wang, H. Chen, Z. Du, and F. Hu, “BeTL: MapReduce Checkpoint Tactics Beneath the Task Level,”
Services Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[76] B. Ghit and D. Epema, “Reducing Job Slowdown Variability for Data-Intensive Workloads,” in Modeling,
Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd
International Symposium on, Oct 2015, pp. 61–70.
[77] S. M. Nabavinejad, M. Goudarzi, and S. Mozaffari, “The Memory Challenge in Reduce Phase of MapReduce
Applications,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2016.
[78] X. Shi, M. Chen, L. He, X. Xie, L. Lu, H. Jin, Y. Chen, and S. Wu, “Mammoth: Gearing Hadoop Towards
Memory-Intensive MapReduce Applications,” Parallel and Distributed Systems, IEEE Transactions on,
vol. 26, no. 8, pp. 2300–2315, Aug 2015.
[79] Y. Kwon, M. Balazinska, B. Howe, and J. Rolia, “SkewTune: Mitigating Skew in Mapreduce Applications,”
in Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD ’12.
New York, NY, USA: ACM, 2012, pp. 25–36.
[80] Apache Hadoop: Gridmix. (Cited on 2018, Jan). [Online]. Available:
https://hadoop.apache.org/docs/current/hadoop-gridmix/GridMix.html
[81] S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang, “The HiBench benchmark suite: Characterization of the
MapReduce-based data analysis,” in Data Engineering Workshops (ICDEW), 2010 IEEE 26th International
Conference on, March 2010, pp. 41–51.
[82] V. A. Saletore, K. Krishnan, V. Viswanathan, and M. E. Tolentino, “HcBench: Methodology, development,
and characterization of a customer usage representative big data/Hadoop benchmark,” in 2013 IEEE International
Symposium on Workload Characterization (IISWC), Sept 2013, pp. 77–86.
[83] F. Ahmad, S. Lee, M. Thottethodi, and T. Vijaykumar, “Puma: Purdue mapreduce benchmarks suite,” 2012.
[84] Hive performance benchmarks. (Cited on 2018, Jan). [Online]. Available:
https://issues.apache.org/jira/browse/HIVE-396
[85] Apache Hadoop: Pigmix. (Cited on 2018, Jan). [Online]. Available:
https://cwiki.apache.org/confluence/display/PIG/PigMix
[86] A. Ghazal, T. Rabl, M. Hu, F. Raab, M. Poess, A. Crolotte, and H.- A. Jacobsen, “BigBench: Towards an
Industry Standard Benchmark for Big Data Analytics,” in Proceedings of the 2013 ACM SIGMOD International
Conference on Management of Data, ser. SIGMOD ’13. New York, NY, USA: ACM, 2013, pp. 1197–1208.
[87] Transaction Processing Performance Council (TPC-H). (Cited on 2018, Jan). [Online]. Available:
http://www.tpc.org/tpch/
[88] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking Cloud Serving Systems
with YCSB,” in Proceedings of the 1st ACM Symposium on Cloud Computing, ser. SoCC ’10. New York, NY,
USA: ACM, 2010, pp. 143–154.
[89] S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, “An Analysis of Traces from a Production MapReduce
Cluster,” in Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid
Computing, ser. CCGRID ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 94–103.
[90] Y. Chen, A. Ganapathi, R. Griffith, and R. Katz, “The Case for Evaluating MapReduce Performance Using
Workload Suites,” in Modeling, Analysis Simulation of Computer and Telecommunication Systems
(MASCOTS), 2011 IEEE 19th International Symposium on, July 2011, pp. 390–399.
[91] Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, “Energy Efficiency for Large-scale MapReduce Workloads
with Significant Interactive Analysis,” in Proceedings of the 7th ACM European Conference on
Computer Systems, ser. EuroSys ’12. New York, NY, USA: ACM, 2012, pp. 43–56.
[92] Y. Chen, S. Alspaugh, and R. H. Katz, “Design Insights for MapReduce from Diverse Production
Workloads,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2012-17, Jan 2012.
[93] A. K. Mishra, J. L. Hellerstein, W. Cirne, and C. R. Das, “Towards characterizing cloud backend workloads:
Insights from google compute clusters,” SIGMETRICS Perform. Eval. Rev., vol. 37, no. 4, pp. 34–41, Mar. 2010.
[94] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch, “Heterogeneity and Dynamicity of
Clouds at Scale: Google Trace Analysis,” in Proceedings of the Third ACM Symposium on Cloud Computing,
ser. SoCC ’12. New York, NY, USA: ACM, 2012, pp. 7:1–7:13.
[95] R. Birke, L. Y. Chen, and E. Smirni, “Multi-resource characterization and their (in)dependencies in
production datacenters,” in 2014 IEEE Network Operations and Management Symposium (NOMS), May 2014,
pp. 1–6.
[96] K. Ren, Y. Kwon, M. Balazinska, and B. Howe, “Hadoop’s Adolescence: An Analysis of Hadoop Usage in
Scientific Workloads,” Proc. VLDB Endow., vol. 6, no. 10, pp. 853–864, Aug. 2013.
[97] S. Shen, V. v. Beek, and A. Iosup, “Statistical Characterization of Business-Critical Workloads Hosted in
Cloud Datacenters,” in Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International
Symposium on, May 2015, pp. 465–474.
[98] Y. Chen, A. S. Ganapathi, A. Fox, R. H. Katz, and D. A. Patterson, “Statistical Workloads for Energy
Efficient MapReduce,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS- 2010-6,
Jan 2010.
[99] H. Yang, Z. Luan, W. Li, and D. Qian, “Mapreduce workload modelling with statistical approach,” Journal
of Grid Computing, vol. 10, no. 2, pp. 279–310, Jun 2012.
[100] Z. Jia, J. Zhan, L. Wang, C. Luo, W. Gao, Y. Jin, R. Han, and L. Zhang, “Understanding Big Data Analytics
Workloads on Modern Processors,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp.
1–1, 2016.
[101] J. Deng, G. Tyson, F. Cuadrado, and S. Uhlig, “Keddah: Capturing Hadoop Network Behaviour,” in 2017
IEEE 37th International Conference on Distributed Computing Systems (ICDCS), June 2017, pp. 2143–2150.
[102] H. Herodotou and S. Babu, “Profiling, What-if Analysis, and Cost-based Optimization of MapReduce
Programs.” PVLDB, vol. 4, no. 11, pp. 1111–1122, 2011.
[103] H. Herodotou, H. Lim, G. Luo, N. Borisov, L. Dong, F. B. Cetin, and S. Babu, “Starfish: A Self-tuning
System for Big Data Analytics,” in In CIDR, 2011, pp. 261–272.
[104] M. Khan, Y. Jin, M. Li, Y. Xiang, and C. Jiang, “Hadoop Performance Modeling for Job Estimation and
Resource Provisioning,” Parallel and Distributed Systems, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[105] N. B. Rizvandi, J. Taheri, R. Moraveji, and A. Y. Zomaya, “On Modelling and Prediction of Total CPU
Usage for Applications in Mapreduce Environments,” in Proceedings of the 12th International Conference on
Algorithms and Architectures for Parallel Processing - Volume Part I, ser. ICA3PP’12. Berlin, Heidelberg:
Springer-Verlag, 2012, pp. 414–427.
[106] Apache Hadoop MapReduce Mumak: Map-Reduce simulator. (Cited on 2016, Dec). [Online]. Available:
https://issues.apache.org/jira/browse/MAPREDUCE-728l
[107] A. Verma, L. Cherkasova, and R. H. Campbell, “Play It Again, SimMR!” in 2011 IEEE International
Conference on Cluster Computing, Sept 2011, pp. 253–261.
[108] S. Hammoud, M. Li, Y. Liu, N. K. Alham, and Z. Liu, “MRSim: A discrete event based MapReduce
simulator,” in 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery, vol. 6, Aug
2010, pp. 2993–2997.
[109] J. Jung and H. Kim, “MR-CloudSim: Designing and implementing MapReduce computing model on
CloudSim,” in 2012 International Conference on ICT Convergence (ICTC), Oct 2012, pp. 504–509.
[110] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya, “CloudSim: A Toolkit for
Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning
Algorithms,” Softw. Pract. Exper., vol. 41, no. 1, pp. 23–50, Jan. 2011.
[111] G. Wang, A. R. Butt, P. Pandey, and K. Gupta, “Using Realistic Simulation for Performance Analysis of
Mapreduce Setups,” in Proceedings of the 1st ACM Workshop on Large-Scale System and Application
Performance, ser. LSAP ’09. New York, NY, USA: ACM, 2009, pp. 19–26.
[112] M. V. Neves, C. A. F. D. Rose, and K. Katrinis, “MRemu: An Emulation-Based Framework for Datacenter
Network Experimentation Using Realistic MapReduce Traffic,” in Modeling, Analysis and Simulation of
Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd International Symposium on, Oct
2015, pp. 174–177.
[113] Z. Bian, K. Wang, Z. Wang, G. Munce, I. Cremer, W. Zhou, Q. Chen, and G. Xu, “Simulating Big Data
Clusters for System Planning, Evaluation, and Optimization,” in 2014 43rd International Conference on Parallel
Processing, Sept 2014, pp. 391–400.
[114] J. Liu, B. Bian, and S. S. Sury, “Planning Your SQL-on-Hadoop Deployment Using a Low-Cost Simulation-
Based Approach,” in 2016 28th International Symposium on Computer Architecture and High Performance
Computing (SBAC-PAD), Oct 2016, pp. 182–189.
[115] K. Wang, Z. Bian, Q. Chen, R. Wang, and G. Xu, “Simulating Hive Cluster for Deployment Planning,
Evaluation and Optimization,” in 2014 IEEE 6th International Conference on Cloud Computing Technology and
Science, Dec 2014, pp. 475–482.
[116] Yarn Scheduler Load Simulator (SLS). (Cited on 2018, Jan). [Online]. Available:
https://hadoop.apache.org/docs/r2.4.1/hadoop-sls/SchedulerLoadSimulator.html
[117] P. Wette, A. Schwabe, M. Splietker, and H. Karl, “Extending Hadoop’s Yarn Scheduler Load Simulator
with a highly realistic network & traffic model,” in Network Softwarization (NetSoft), 2015 1st IEEE
Conference on, April 2015, pp. 1–2.
[118] J.-C. Lin, I. C. Yu, E. B. Johnsen, and M.-C. Lee, “ABS-YARN: A Formal Framework for Modeling
Hadoop YARN Clusters,” in Proceedings of the 19th International Conference on Fundamental Approaches to
Software Engineering - Volume 9633. New York, NY, USA: Springer-Verlag New York, Inc., 2016, pp. 49–65.
[119] N. Liu, X. Yang, X. H. Sun, J. Jenkins, and R. Ross, “YARNsim: Simulating Hadoop YARN,” in Cluster,
Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on, May 2015, pp. 637–
646.
[120] X. Xu, M. Tang, and Y. C. Tian, “Theoretical Results of QoS-Guaranteed Resource Scaling for Cloud-based
MapReduce,” IEEE Transactions on Cloud Computing, vol. PP, no. 99, pp. 1–1, 2016.
[121] M. Mattess, R. N. Calheiros, and R. Buyya, “Scaling MapReduce Applications Across Hybrid Clouds to
Meet Soft Deadlines,” in 2013 IEEE 27th International Conference on Advanced Information Networking and
Applications (AINA), March 2013, pp. 629–636.
[122] C.-W. Tsai, W.-C. Huang, M.-H. Chiang, M.-C. Chiang, and C.-S. Yang, “A Hyper-Heuristic Scheduling
Algorithm for Cloud,” IEEE Transactions on Cloud Computing, vol. 2, no. 2, pp. 236–250, April 2014.
[123] N. Lim, S. Majumdar, and P. Ashwood-Smith, “MRCP-RM: a Technique for Resource Allocation and
Scheduling of MapReduce Jobs with Deadlines,” IEEE Transactions on Parallel and Distributed Systems, vol. PP,
no. 99, pp. 1–1, 2016.
[124] K. Chen, J. Powers, S. Guo, and F. Tian, “CRESP: Towards Optimal Resource Provisioning for MapReduce
Computing in Public Clouds,” Parallel and Distributed Systems, IEEE Transactions on, vol. 25, no. 6, pp. 1403–
1412, June 2014.
[125] L. Sharifi, L. Cerdà-Alabern, F. Freitag, and L. Veiga, “Energy efficient cloud service provisioning:
Keeping data center granularity in perspective,” Journal of Grid Computing, vol. 14, no. 2, pp. 299–325, 2016.
[126] M. Zhang, R. Ranjan, M. Menzel, S. Nepal, P. Strazdins, W. Jie, and L. Wang, “An Infrastructure Service
Recommendation System for Cloud Applications with Real-time QoS Requirement Constraints,” IEEE Systems
Journal, vol. PP, no. 99, pp. 1–11, 2015.
[127] V. Jalaparti, H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, “Bridging the Tenant-provider Gap in
Cloud Services,” in Proceedings of the Third ACM Symposium on Cloud Computing, ser. SoCC ’12. New York,
NY, USA: ACM, 2012, pp. 10:1–10:14.
[128] D. Xie, Y. C. Hu, and R. R. Kompella, “On the performance projectability of MapReduce,” in Cloud
Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on, Dec 2012, pp.
301–308.
[129] C. Delimitrou and C. Kozyrakis, “The Netflix Challenge: Datacenter Edition,” IEEE Computer Architecture
Letters, vol. 12, no. 1, pp. 29– 32, January 2013.
[130] M. Jammal, A. Kanso, and A. Shami, “High availability-aware optimization digest for applications
deployment in cloud,” in Communications (ICC), 2015 IEEE International Conference on, June 2015, pp. 6822–
6828.
[131] J. Lee, Y. Turner, M. Lee, L. Popa, S. Banerjee, J.-M. Kang, and P. Sharma, “Application-driven Bandwidth
Guarantees in Datacenters,” SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 467–478, Aug. 2014.
[132] J. Guo, F. Liu, J. C. S. Lui, and H. Jin, “Fair Network Bandwidth Allocation in IaaS Datacenters via a
Cooperative Game Approach,” IEEE/ACM Transactions on Networking, vol. 24, no. 2, pp. 873–886, April 2016.
[133] L. Popa, P. Yalagandula, S. Banerjee, J. C. Mogul, Y. Turner, and J. R. Santos, “Elasticswitch: Practical
work-conserving bandwidth guarantees for cloud computing,” SIGCOMM Comput. Commun. Rev., vol. 43, no.
4, pp. 351–362, Aug. 2013.
[134] V. Jeyakumar, M. Alizadeh, D. Mazières, B. Prabhakar, C. Kim, and A. Greenberg, “EyeQ: Practical
Network Performance Isolation at the Edge,” in Proceedings of the 10th USENIX Conference on Networked
Systems Design and Implementation, ser. nsdi’13. Berkeley, CA, USA: USENIX Association, 2013, pp. 297–
312.
[135] A. Antoniadis, Y. Gerbessiotis, M. Roussopoulos, and A. Delis, “Tossing NoSQL-Databases Out to Public
Clouds,” in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on, Dec 2014,
pp. 223–232.
[136] T. Z. J. Fu, J. Ding, R. T. B. Ma, M. Winslett, Y. Yang, and Z. Zhang, “DRS: Dynamic Resource Scheduling
for Real-Time Analytics over Fast Streams,” in Distributed Computing Systems (ICDCS), 2015 IEEE 35th
International Conference on, June 2015, pp. 411–420.
[137] L. Chen, S. Liu, B. Li, and B. Li, “Scheduling jobs across geodistributed datacenters with max-min fairness,”
in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, May 2017, pp. 1–9.
[138] S. Tang, B. S. Lee, and B. He, “Fair Resource Allocation for Data- Intensive Computing in the Cloud,”
IEEE Transactions on Services Computing, vol. PP, no. 99, pp. 1–1, 2016.
[139] H. Won, M. C. Nguyen, M. S. Gil, and Y. S. Moon, “Advanced resource management with access control
for multitenant Hadoop,” Journal of Communications and Networks, vol. 17, no. 6, pp. 592–601, Dec 2015.
[140] H. Herodotou, F. Dong, and S. Babu, “No One (Cluster) Size Fits All: Automatic Cluster Sizing for Data-
intensive Analytics,” in Proceedings of the 2Nd ACM Symposium on Cloud Computing, ser. SOCC ’11. New
York, NY, USA: ACM, 2011, pp. 18:1–18:14.
[141] B. Palanisamy, A. Singh, and L. Liu, “Cost-Effective Resource Provisioning for MapReduce in a Cloud,”
Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 5, pp. 1265–1279, May 2015.
[142] B. Sharma, T. Wood, and C. R. Das, “HybridMR: A Hierarchical MapReduce Scheduler for Hybrid Data
Centers,” in Proceedings of the 2013 IEEE 33rd International Conference on Distributed Computing Systems, ser.
ICDCS ’13. Washington, DC, USA: IEEE Computer Society, 2013, pp. 102–111.
[143] Y. Zhang, X. Fu, and K. K. Ramakrishnan, “Fine-grained multiresource scheduling in cloud datacenters,”
in Local Metropolitan Area Networks (LANMAN), 2014 IEEE 20th International Workshop on, May 2014, pp.
1–6.
[144] X. Zhu, L. Yang, H. Chen, J. Wang, S. Yin, and X. Liu, “Real- Time Tasks Oriented Energy-Aware
Scheduling in Virtualized Clouds,” Cloud Computing, IEEE Transactions on, vol. 2, no. 2, pp. 168–180, April
2014.
[145] Z. Li, J. Ge, H. Hu, W. Song, H. Hu, and B. Luo, “Cost and Energy Aware Scheduling Algorithm for
Scientific Workflows with Deadline Constraint in Clouds,” Services Computing, IEEE Transactions on, vol. PP,
no. 99, pp. 1–1, 2015.
[146] M. Cardosa, A. Singh, H. Pucha, and A. Chandra, “Exploiting Spatio-Temporal Tradeoffs for Energy-
Aware MapReduce in the Cloud,” Computers, IEEE Transactions on, vol. 61, no. 12, pp. 1737–1751, Dec 2012.
[147] F. Teng, D. Deng, L. Yu, and F. Magoulès, “An Energy-Efficient VM Placement in Cloud Datacenter,” in
High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and
Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), 2014 IEEE Intl Conf
on, Aug 2014, pp. 173–180.
[148] B. Palanisamy, A. Singh, L. Liu, and B. Jain, “Purlieus: Locality-aware Resource Allocation for MapReduce
in a Cloud,” in Proceedings of 2011 International Conference for High Performance Computing, Networking,
Storage and Analysis, ser. SC ’11. New York, NY, USA: ACM, 2011, pp. 58:1–58:11.
[149] J. Park, D. Lee, B. Kim, J. Huh, and S. Maeng, “Locality-aware Dynamic VM Reconfiguration on
MapReduce Clouds,” in Proceedings of the 21st International Symposium on High-Performance Parallel and
Distributed Computing, ser. HPDC ’12. New York, NY, USA: ACM, 2012, pp. 27–36.
[150] X. Bu, J. Rao, and C.-z. Xu, “Interference and Locality-aware Task Scheduling for MapReduce Applications
in Virtual Clusters,” in Proceedings of the 22nd International Symposium on High-performance Parallel and
Distributed Computing, ser. HPDC ’13. New York, NY, USA: ACM, 2013, pp. 227–238.
[151] M. Li, D. Subhraveti, A. R. Butt, A. Khasymski, and P. Sarkar, “CAM: A Topology Aware Minimum Cost
Flow Based Resource Manager for MapReduce Applications in the Cloud,” in Proceedings of the 21st
International Symposium on High-Performance Parallel and Distributed Computing, ser. HPDC ’12. New York,
NY, USA: ACM, 2012, pp. 211–222.
[152] V. van Beek, J. Donkervliet, T. Hegeman, S. Hugtenburg, and A. Iosup, “Self-Expressive Management of
Business-Critical Workloads in Virtualized Datacenters,” Computer, vol. 48, no. 7, pp. 46–54, July 2015.
[153] D. Tsoumakos, I. Konstantinou, C. Boumpouka, S. Sioutas, and N. Koziris, “Automated, Elastic Resource
Provisioning for NoSQL Clusters Using TIRAMOLA,” in Cluster, Cloud and Grid Computing (CCGrid), 2013
13th IEEE/ACM International Symposium on, May 2013, pp. 34–41.
[154] H. Kang, Y. Chen, J. L. Wong, R. Sion, and J. Wu, “Enhancement of Xen’s Scheduler for MapReduce
Workloads,” in Proceedings of the 20th International Symposium on High Performance Distributed Computing,
ser. HPDC ’11. New York, NY, USA: ACM, 2011, pp. 251–262.
[155] B. M. Ko, J. Lee, and H. Jo, “Toward Enhancing Block I/O Performance for Virtualized Hadoop Cluster,”
in Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on, Dec 2014, pp. 481–
482.
[156] Y. Yu, H. Zou, W. Tang, L. Liu, and F. Teng, “FlexTuner: A Flexible Container-Based Tuning System for
Cloud Applications,” in Cloud Engineering (IC2E), 2015 IEEE International Conference on, March 2015, pp.
145–154.
[157] R. Zhang, M. Li, and D. Hildebrand, “Finding the Big Data Sweet Spot: Towards Automatically
Recommending Configurations for Hadoop Clusters on Docker Containers,” in Cloud Engineering (IC2E), 2015
IEEE International Conference on, March 2015, pp. 365–368.
[158] Y. Kang and R. Y. C. Kim, “Twister Platform for MapReduce Applications on a Docker Container,” in
2016 International Conference on Platform Technology and Service (PlatCon), Feb 2016, pp. 1–3.
[159] C. Rista, D. Griebler, C. A. F. Maron, and L. G. Fernandes, “Improving the Network Performance of a
Container-Based Cloud Environment for Hadoop Systems,” in 2017 International Conference on High
Performance Computing Simulation (HPCS), July 2017, pp. 619–626.
[160] L. Yazdanov, M. Gorbunov, and C. Fetzer, “EHadoop: Network I/O Aware Scheduler for Elastic
MapReduce Cluster,” in 2015 IEEE 8th International Conference on Cloud Computing, June 2015, pp. 821–828.
[161] N. Laoutaris, M. Sirivianos, X. Yang, and P. Rodriguez, “Inter-Datacenter Bulk Transfers with NetStitcher,”
SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 74–85, Aug. 2011.
[162] Y. Feng, B. Li, and B. Li, “Postcard: Minimizing Costs on Inter-Datacenter Traffic with Store-and-
Forward,” in Distributed Computing Systems Workshops (ICDCSW), 2012 32nd International Conference on,
June 2012, pp. 43–50.
[163] P. Lu, K. Wu, Q. Sun, and Z. Zhu, “Toward online profit-driven scheduling of inter-DC data-transfers for
cloud applications,” in Communications (ICC), 2015 IEEE International Conference on, June 2015, pp. 5583–
5588.
[164] J. Garcia-Dorado and S. Rao, “Cost-aware Multi Data-Center Bulk Transfers in the Cloud from a Customer-
Side Perspective,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[165] C. Wu, C. Ku, J. Ho, and M. Chen, “A Novel Pipeline Approach for Efficient Big Data Broadcasting,”
Knowledge and Data Engineering, IEEE Transactions on, vol. 28, no. 1, pp. 17–28, Jan 2016.
[166] J. Yao, P. Lu, L. Gong, and Z. Zhu, “On Fast and Coordinated Data Backup in Geo-Distributed Optical
Inter-Datacenter Networks,” Journal of Lightwave Technology, vol. 33, no. 14, pp. 3005–3015, July 2015.
[167] P. Lu, L. Zhang, X. Liu, J. Yao, and Z. Zhu, “Highly efficient data migration and backup for big data
applications in elastic optical inter-data-center networks,” Network, IEEE, vol. 29, no. 5, pp. 36–42,
September 2015.
[168] I. Alan, E. Arslan, and T. Kosar, “Energy-Aware Data Transfer Tuning,” in Cluster, Cloud and Grid
Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on, May 2014, pp. 626–634.
[169] Y. Koshiba, W. Chen, Y. Yamada, T. Tanaka, and I. Paik, “Investigation of network traffic in geo-
distributed data centers,” in Awareness Science and Technology (iCAST), 2015 IEEE 7th International
Conference on, Sept 2015, pp. 174–179.
[170] L. Zhang, C. Wu, Z. Li, C. Guo, M. Chen, and F. Lau, “Moving Big Data to The Cloud: An Online Cost-
Minimizing Approach,” Selected Areas in Communications, IEEE Journal on, vol. 31, no. 12, pp. 2710–2721,
December 2013.
[171] P. Li, S. Guo, S. Yu, and W. Zhuang, “Cross-cloud MapReduce for Big Data,” Cloud Computing, IEEE
Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[172] P. Li, S. Guo, T. Miyazaki, X. Liao, H. Jin, A. Y. Zomaya, and K. Wang, “Traffic-Aware Geo-Distributed
Big Data Analytics with Predictable Job Completion Time,” IEEE Transactions on Parallel and Distributed
Systems, vol. 28, no. 6, pp. 1785–1796, June 2017.
[173] A. M. Al-Salim, A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficient Big Data
Networks: Impact of Volume and Variety,” IEEE Transactions on Network and Service Management, vol. PP,
no. 99, pp. 1–1, 2017.
[174] A. M. Al-Salim, T. E. El-Gorashi, A. Q. Lawey, and J. M. Elmirghani, “Greening big data networks: velocity
impact,” IET Optoelectronics, November 2017.
[175] C. Joe-Wong, I. Kamitsos, and S. Ha, “Interdatacenter Job Routing and Scheduling With Variable Costs
and Deadlines,” IEEE Transactions on Smart Grid, vol. 6, no. 6, pp. 2669–2680, Nov 2015.
[176] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, “Power Cost Reduction in Distributed Data
Centers: A Two-Time-Scale Approach for Delay Tolerant Workloads,” Parallel and Distributed Systems, IEEE
Transactions on, vol. 25, no. 1, pp. 200–211, Jan 2014.
[177] C. Jayalath, J. Stephen, and P. Eugster, “From the Cloud to the Atmosphere: Running MapReduce across
Data Centers,” Computers, IEEE Transactions on, vol. 63, no. 1, pp. 74–87, Jan 2014.
[178] Q. Zhang, L. Liu, K. Lee, Y. Zhou, A. Singh, N. Mandagere, S. Gopisetty, and G. Alatorre, “Improving
Hadoop Service Provisioning in a Geographically Distributed Cloud,” in Cloud Computing (CLOUD), 2014 IEEE
7th International Conference on, June 2014, pp. 432–439.
[179] Y. Li, L. Zhao, C. Cui, and C. Yu, “Fast Big Data Analysis in Geo-Distributed Cloud,” in 2016 IEEE
International Conference on Cluster Computing (CLUSTER), Sept 2016, pp. 388–391.
[180] F. J. Clemente-Castelló, B. Nicolae, R. Mayo, and J. C. Fernández, “Performance Model of MapReduce
Iterative Applications for Hybrid Cloud Bursting,” IEEE Transactions on Parallel and Distributed Systems, vol.
29, no. 8, pp. 1794–1807, Aug 2018.
[181] S. Kailasam, P. Dhawalia, S. J. Balaji, G. Iyer, and J. Dharanipragada, “Extending MapReduce across
Clouds with BStream,” IEEE Transactions on Cloud Computing, vol. 2, no. 3, pp. 362–376, July 2014.
[182] R. Tudoran, G. Antoniu, and L. Bougé, “SAGE: Geo-Distributed Streaming Data Analysis in Clouds,” in
2013 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum, May 2013,
pp. 2278–2281.
[183] A. Rabkin, M. Arye, S. Sen, V. S. Pai, and M. J. Freedman, “Aggregation and Degradation in JetStream:
Streaming Analytics in the Wide Area,” in Proceedings of the 11th USENIX Conference on Networked Systems
Design and Implementation, ser. NSDI’14. Berkeley, CA, USA: USENIX Association, 2014, pp. 275–288.
[184] L. Gu, D. Zeng, S. Guo, Y. Xiang, and J. Hu, “A General Communication Cost Optimization Framework
for Big Data Stream Processing in Geo-Distributed Data Centers,” Computers, IEEE Transactions on, vol. 65, no.
1, pp. 19–29, Jan 2016.
[185] W. Chen, I. Paik, and Z. Li, “Cost-Aware Streaming Workflow Allocation on Geo-Distributed Data
Centers,” IEEE Transactions on Computers, vol. 66, no. 2, pp. 256–271, Feb 2017.
[186] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica, “Low Latency Geo-
distributed Data Analytics,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 421–434, Aug. 2015.
[187] A. C. Zhou, S. Ibrahim, and B. He, “On Achieving Efficient Data Transfer for Graph Processing in Geo-
Distributed Datacenters,” in 2017 IEEE 37th International Conference on Distributed Computing Systems
(ICDCS), June 2017, pp. 1397–1407.
[188] S. Das, Y. Yiakoumis, G. Parulkar, N. McKeown, P. Singh, D. Getachew, and P. D. Desai, “Application-
aware aggregation and traffic engineering in a converged packet-circuit network,” in Optical Fiber
Communication Conference and Exposition (OFC/NFOEC), 2011 and the National Fiber Optic Engineers
Conference, March 2011, pp. 1–3.
[189] V. Lopez, J. M. Gran, J. P. Fernandez-Palacios, D. Siracusa, F. Pederzolli, O. Gerstel, Y. Shikhmanter, J.
Mårtensson, P. Sköldström, T. Szyrkowiec, M. Chamania, A. Autenrieth, I. Tomkos, and D. Klonidis, “The role
of SDN in application centric IP and optical networks,” in 2016 European Conference on Networks and
Communications (EuCNC), June 2016, pp. 138–142.
[190] Y. Demchenko, P. Grosso, C. de Laat, S. Filiposka, and M. de Vos, “Zero-touch provisioning (ZTP) model
and infrastructure components for multi-provider cloud services provisioning,” CoRR, vol. abs/1611.02758, 2016.
[191] S. Wang, X. Zhang, W. Hou, X. Yang, and L. Guo, “SDNyquist platform for big data transmission,” in
2016 15th International Conference on Optical Communications and Networks (ICOCN), Sept 2016, pp. 1–3.
[192] S. Narayan, S. Bailey, A. Daga, M. Greenway, R. Grossman, A. Heath, and R. Powell, “OpenFlow Enabled
Hadoop over Local and Wide Area Clusters,” in 2012 SC Companion: High Performance Computing, Networking
Storage and Analysis, Nov 2012, pp. 1625–1628.
[193] Z. Yu, M. Li, X. Yang, and X. Li, “Palantir: Reseizing Network Proximity in Large-Scale Distributed
Computing Frameworks Using SDN,” in Cloud Computing (CLOUD), 2014 IEEE 7th International Conference
on, June 2014, pp. 440–447.
[194] X. Yang and T. Lehman, “Model Driven Advanced Hybrid Cloud Services for Big Data: Paradigm and
Practice,” in 2016 Seventh International Workshop on Data-Intensive Computing in the Clouds (DataCloud), Nov
2016, pp. 32–36.
[195] A. Sadasivarao, S. Syed, P. Pan, C. Liou, I. Monga, C. Guok, and A. Lake, “Bursting Data between Data
Centers: Case for Transport SDN,” in High-Performance Interconnects (HOTI), 2013 IEEE 21st Annual
Symposium on, Aug 2013, pp. 87–90.
[196] W. Lu and Z. Zhu, “Malleable Reservation Based Bulk-Data Transfer to Recycle Spectrum Fragments in
Elastic Optical Networks,” Journal of Lightwave Technology, vol. 33, no. 10, pp. 2078–2086, May 2015.
[197] Y. Wu, Z. Zhang, C. Wu, C. Guo, Z. Li, and F. Lau, “Orchestrating Bulk Data Transfers across Geo-
Distributed Datacenters,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[198] X. Jin, Y. Li, D. Wei, S. Li, J. Gao, L. Xu, G. Li, W. Xu, and J. Rexford, “Optimizing bulk transfers with
software-defined optical WAN,” in Proceedings of the 2016 ACM SIGCOMM Conference, ser. SIGCOMM ’16.
New York, NY, USA: ACM, 2016, pp. 87–100.
[199] A. Asensio and L. Velasco, “Managing transfer-based datacenter connections,” IEEE/OSA Journal of
Optical Communications and Networking, vol. 6, no. 7, pp. 660–669, July 2014.
[200] M. Femminella, G. Reali, and D. Valocchi, “Genome centric networking: A network function virtualization
solution for genomic applications,” in 2017 IEEE Conference on Network Softwarization (NetSoft), July 2017,
pp. 1–9.
[201] L. Gu, S. Tao, D. Zeng, and H. Jin, “Communication cost efficient virtualized network function placement
for big data processing,” in 2016 IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS), April 2016, pp. 604–609.
[202] L. Gifre, M. Ruiz, and L. Velasco, “Experimental assessment of Big Data-backed video distribution in the
telecom cloud,” in 2017 19th International Conference on Transparent Optical Networks (ICTON), July 2017, pp.
1–4.
[203] B. García, M. Gallego, L. López, G. A. Carella, and A. Cheambe, “NUBOMEDIA: An Elastic PaaS
Enabling the Convergence of Real-Time and Big Data Multimedia,” in 2016 IEEE International Conference on
Smart Cloud (SmartCloud), Nov 2016, pp. 45–56.
[204] J. Han, M. Ishii, and H. Makino, “A Hadoop performance model for multi-rack clusters,” in Computer
Science and Information Technology (CSIT), 2013 5th International Conference on, March 2013, pp. 265–274.
[205] G. Wang, A. R. Butt, P. Pandey, and K. Gupta, “A simulation approach to evaluating design decisions in
MapReduce setups,” in Modeling, Analysis Simulation of Computer and Telecommunication Systems, 2009.
MASCOTS ’09. IEEE International Symposium on, Sept 2009, pp. 1–11.
[206] Z. Kouba, O. Tomanek, and L. Kencl, “Evaluation of Datacenter Network Topology Influence on Hadoop
MapReduce Performance,” in 2016 5th IEEE International Conference on Cloud Networking (Cloudnet), Oct
2016, pp. 95–100.
[207] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On the energy efficiency of MapReduce
shuffling operations in data centers,” in 2017 19th International Conference on Transparent Optical Networks
(ICTON), July 2017, pp. 1–5.
[208] Y. Shang, D. Li, J. Zhu, and M. Xu, “On the Network Power Effectiveness of Data Center Architectures,”
Computers, IEEE Transactions on, vol. 64, no. 11, pp. 3237–3248, Nov 2015.
[209] M. Alizadeh and T. Edsall, “On the Data Path Performance of Leaf-Spine Datacenter Fabrics,” in High-
Performance Interconnects (HOTI), 2013 IEEE 21st Annual Symposium on, Aug 2013, pp. 71–74.
[210] J. Duan and Y. Yang, “FFTree: A flexible architecture for data center networks towards configurability and
cost efficiency,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), June 2017,
pp. 1–10.
[211] S. Kandula, J. Padhye, and V. Bahl, “Flyways To De-Congest Data Center Networks,” Tech. Rep., August
2009.
[212] D. Halperin, S. Kandula, J. Padhye, P. Bahl, and D. Wetherall, “Augmenting data center networks with
multi-gigabit wireless links,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 38–49, Aug. 2011.
[213] K. Suto, H. Nishiyama, N. Kato, T. Nakachi, T. Sakano, and A. Takahara, “A Failure-Tolerant and
Spectrum-Efficient Wireless Data Center Network Design for Improving Performance of Big Data Mining,” in
Vehicular Technology Conference (VTC Spring), 2015 IEEE 81st, May 2015, pp. 1–5.
[214] P. Costa, A. Donnelly, A. Rowstron, and G. O’Shea, “Camdoop: Exploiting In-network Aggregation for
Big Data Applications,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and
Implementation, ser. NSDI’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 3–3.
[215] L. Rupprecht, “Exploiting In-network Processing for Big Data Management,” in Proceedings of the 2013
SIGMOD/PODS Ph.D. Symposium, ser. SIGMOD’13 PhD Symposium. New York, NY, USA: ACM, 2013, pp.
1–6.
[216] Y. Zhang, C. Guo, R. Chu, G. Lu, Y. Xiong, and H. Wu, “RAMCube: Exploiting Network Proximity for
RAM-Based Key-Value Store,” in 4th USENIX Workshop on Hot Topics in Cloud Computing, HotCloud’12,
Boston, MA, USA, June 12-13, 2012, 2012.
[217] X. Meng, V. Pappas, and L. Zhang, “Improving the Scalability of Data Center Networks with Traffic-aware
Virtual Machine Placement,” in Proceedings of the 29th Conference on Information Communications, ser.
INFOCOM’10. Piscataway, NJ, USA: IEEE Press, 2010, pp. 1154–1162.
[218] H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, “Towards predictable datacenter networks,”
SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 242–253, Aug. 2011.
[219] D. Zeng, S. Guo, H. Huang, S. Yu, and V. C. M. Leung, “Optimal VM Placement in Data Centres with
Architectural and Resource Constraints,” Int. J. Auton. Adapt. Commun. Syst., vol. 8, no. 4, pp. 392–406, Nov.
2015.
[220] Z. Wu, Y. Zhang, V. Singh, G. Jiang, and H. Wang, “Automating Cloud Network Optimization and
Evolution,” Selected Areas in Communications, IEEE Journal on, vol. 31, no. 12, pp. 2620–2631, December 2013.
[221] W. C. Moody, J. Anderson, K.-C. Wang, and A. Apon, “Reconfigurable Network Testbed for Evaluation
of Datacenter Topologies,” in Proceedings of the Sixth International Workshop on Data Intensive Distributed
Computing, ser. DIDC ’14. New York, NY, USA: ACM, 2014, pp. 11–20.
[222] G. Wang, T. E. Ng, and A. Shaikh, “Programming Your Network at Run-time for Big Data Applications,”
in Proceedings of the First Workshop on Hot Topics in Software Defined Networks, ser. HotSDN ’12. New York,
NY, USA: ACM, 2012, pp. 103–108.
[223] H. H. Bazzaz, M. Tewari, G. Wang, G. Porter, T. S. E. Ng, D. G. Andersen, M. Kaminsky, M. A. Kozuch,
and A. Vahdat, “Switching the Optical Divide: Fundamental Challenges for Hybrid Electrical/Optical Datacenter
Networks,” in Proceedings of the 2nd ACM Symposium on Cloud Computing, ser. SOCC ’11. New York, NY,
USA: ACM, 2011, pp. 30:1–30:8.
[224] M. Channegowda, T. Vlachogiannis, R. Nejabati, and D. Simeonidou, “Optical flyways for handling
elephant flows to improve big data performance in SDN enabled Datacenters,” in 2016 Optical Fiber
Communications Conference and Exhibition (OFC), March 2016, pp. 1–3.
[225] Y. Yin, K. Kanonakis, and P. N. Ji, “Hybrid optical/electrical switching in directly connected datacenter
networks,” in Communications in China (ICCC), 2014 IEEE/CIC International Conference on, Oct 2014, pp. 102–
106.
[226] P. Samadi, V. Gupta, B. Birand, H. Wang, G. Zussman, and K. Bergman, “Accelerating Incast and Multicast
Traffic Delivery for Data-intensive Applications Using Physical Layer Optics,” SIGCOMM Comput. Commun.
Rev., vol. 44, no. 4, pp. 373–374, Aug. 2014.
[227] J. Bao, B. Zhao, D. Dong, and Z. Gong, “HERO: A Hybrid Electrical and Optical Multicast for Accelerating
High-Performance Data Center Applications,” in Proceedings of the SIGCOMM Posters and Demos, ser.
SIGCOMM Posters and Demos ’17. New York, NY, USA: ACM, 2017, pp. 17–18.
[228] S. Peng, B. Guo, C. Jackson, R. Nejabati, F. Agraz, S. Spadaro, G. Bernini, N. Ciulli, and D. Simeonidou,
“Multi-tenant software-defined hybrid optical switched data centre,” Lightwave Technology, Journal of, vol. 33,
no. 15, pp. 3224–3233, Aug 2015.
[229] L. Schares, X. J. Zhang, R. Wagle, D. Rajan, P. Selo, S. P. Chang, J. Giles, K. Hildrum, D. Kuchta, J. Wolf,
and E. Schenfeld, “A reconfigurable interconnect fabric with optical circuit switch and software optimizer for
stream computing systems,” in 2009 Conference on Optical Fiber Communication - includes post deadline papers,
March 2009, pp. 1–3.
[230] X. Yu, H. Gu, K. Wang, and G. Wu, “Enhancing Performance of Cloud Computing Data Center Networks
by Hybrid Switching Architecture,” Lightwave Technology, Journal of, vol. 32, no. 10, pp. 1991–1998, May
2014.
[231] L. Y. Ho, J. J. Wu, and P. Liu, “Optimal Algorithms for Cross-Rack Communication Optimization in
MapReduce Framework,” in 2011 IEEE 4th International Conference on Cloud Computing, July 2011, pp. 420–
427.
[232] Y. Le, F. Wang, J. Liu, and F. Ergün, “On Datacenter-Network-Aware Load Balancing in MapReduce,” in
Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on, June 2015, pp. 485–492.
[233] H. Ke, P. Li, S. Guo, and M. Guo, “On Traffic-Aware Partition and Aggregation in MapReduce for Big
Data Applications,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 3, pp. 818–828, March
2016.
[234] Z. Jiang, Z. Ding, X. Gao, and G. Chen, “DCP: An efficient and distributed data center cache protocol with
Fat-Tree topology,” in Network Operations and Management Symposium (APNOMS), 2014 16th Asia-Pacific,
Sept 2014, pp. 1–4.
[235] D. Guo, J. Xie, X. Zhou, X. Zhu, W. Wei, and X. Luo, “Exploiting Efficient and Scalable Shuffle Transfers
in Future Data Center Networks,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 4, pp. 997–
1009, April 2015.
[236] E. Yildirim, E. Arslan, J. Kim, and T. Kosar, “Application-Level Optimization of Big Data Transfers
through Pipelining, Parallelism and Concurrency,” IEEE Transactions on Cloud Computing, vol. 4, no. 1,
pp. 63–75, Jan 2016.
[237] Y. Yu and C. Qian, “Space Shuffle: A Scalable, Flexible, and High-Performance Data Center Network,”
IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1, 2016.
[238] E. Zahavi, I. Keslassy, and A. Kolodny, “Distributed Adaptive Routing Convergence to Non-Blocking DCN
Routing Assignments,” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 1, pp. 88–101,
January 2014.
[239] N. Chrysos, M. Gusat, F. Neeser, C. Minkenberg, W. Denzel, and C. Basso, “High performance multipath
routing for datacenters,” in High Performance Switching and Routing (HPSR), 2014 IEEE 15th International
Conference on, July 2014, pp. 70–75.
[240] E. Dong, X. Fu, M. Xu, and Y. Yang, “DCMPTCP: Host-Based Load Balancing for Datacenters,” in 2018
IEEE 38th International Conference on Distributed Computing Systems (ICDCS), July 2018, pp. 622–633.
[241] Y. Shang, D. Li, and M. Xu, “Greening data center networks with flow preemption and energy-aware
routing,” in Local Metropolitan Area Networks (LANMAN), 2013 19th IEEE Workshop on, April 2013, pp. 1–
6.
[242] L. Wang, F. Zhang, and Z. Liu, “Improving the Network Energy Efficiency in MapReduce Systems,” in
Computer Communications and Networks (ICCCN), 2013 22nd International Conference on, July 2013, pp. 1–7.
[243] L. Wang, F. Zhang, J. Arjona Aroca, A. Vasilakos, K. Zheng, C. Hou, D. Li, and Z. Liu, “GreenDCN: A
General Framework for Achieving Energy Efficiency in Data Center Networks,” Selected Areas in
Communications, IEEE Journal on, vol. 32, no. 1, pp. 4–15, January 2014.
[244] X. Wen, K. Chen, Y. Chen, Y. Liu, Y. Xia, and C. Hu, “VirtualKnotter: Online Virtual Machine Shuffling
for Congestion Resolving in Virtualized Datacenter,” in Distributed Computing Systems (ICDCS), 2012 IEEE
32nd International Conference on, June 2012, pp. 12–21.
[245] K. C. Webb, A. C. Snoeren, and K. Yocum, “Topology Switching for Data Center Networks,” in
Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise
Networks and Services, ser. Hot-ICE’11. Berkeley, CA, USA: USENIX Association, 2011, pp. 14–14.
[246] L. Chen, Y. Feng, B. Li, and B. Li, “Towards performance-centric fairness in datacenter networks,” in
INFOCOM, 2014 Proceedings IEEE, April 2014, pp. 1599–1607.
[247] L. A. Rocha and F. L. Verdi, “MILPFlow: A toolset for integration of computational modelling and
deployment of data paths for SDN,” in Integrated Network Management (IM), 2015 IFIP/IEEE International
Symposium on, May 2015, pp. 750–753.
[248] L. W. Cheng and S. Y. Wang, “Application-Aware SDN Routing for Big Data Networking,” in 2015 IEEE
Global Communications Conference (GLOBECOM), Dec 2015, pp. 1–6.
[249] S. Narayan, S. Bailey, and A. Daga, “Hadoop Acceleration in an OpenFlow-Based Cluster,” in High
Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:, Nov 2012, pp.
535–538.
[250] X. Hou, A. K. T. K, J. P. Thomas, and V. Varadharajan, “Dynamic Workload Balancing for Hadoop
MapReduce,” in Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on, Dec
2014, pp. 56–62.
[251] C. Trois, M. Martinello, L. C. E. de Bona, and M. D. Del Fabro, “From Software Defined Network to
Network Defined for Software,” in Proceedings of the 30th Annual ACM Symposium on Applied Computing,
ser. SAC ’15. New York, NY, USA: ACM, 2015, pp. 665–668.
[252] S. Zhao and D. Medhi, “Application-Aware Network Design for Hadoop MapReduce Optimization Using
Software-Defined Networking,” IEEE Transactions on Network and Service Management, vol. 14, no. 4, pp. 804–
816, Dec 2017.
[253] Z. Asad, M. Chaudhry, and D. Malone, “Greener Data Exchange in the Cloud: A Coding Based
Optimization for Big Data Processing,” Selected Areas in Communications, IEEE Journal on, vol. PP, no. 99,
pp. 1–1, 2016.
[254] J. Duan, Z. Wang, and C. Wu, “Responsive multipath TCP in SDN based datacenters,” in Communications
(ICC), 2015 IEEE International Conference on, June 2015, pp. 5296–5301.
[255] S. Sen, D. Shue, S. Ihm, and M. J. Freedman, “Scalable, Optimal Flow Routing in Datacenters via Local
Link Balancing,” in Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and
Technologies, ser. CoNEXT ’13. New York, NY, USA: ACM, 2013, pp. 151–162.
[256] S. Hu, K. Chen, H. Wu, W. Bai, C. Lan, H. Wang, H. Zhao, and C. Guo, “Explicit path control in commodity
data centers: Design and applications,” Networking, IEEE/ACM Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[257] Z. Xie, L. Hu, K. Zhao, F. Wang, and J. Pang, “Topology2Vec: Topology Representation Learning For Data
Center Networking,” IEEE Access, vol. 6, pp. 33840–33848, 2018.
[258] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, “Managing Data Transfers in Computer
Clusters with Orchestra,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 98–109, Aug. 2011.
[259] A. Shieh, S. Kandula, A. Greenberg, C. Kim, and B. Saha, “Sharing the Data Center Network,” in
Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’11.
Berkeley, CA, USA: USENIX Association, 2011, pp. 309–322.
[260] A. Das, C. Lumezanu, Y. Zhang, V. Singh, G. Jiang, and C. Yu, “Transparent and flexible network
management for big data processing in the cloud,” in Presented as part of the 5th USENIX Workshop on Hot
Topics in Cloud Computing. Berkeley, CA: USENIX, 2013.
[261] W. Cui and C. Qian, “DiFS: Distributed flow scheduling for adaptive routing in hierarchical data center
networks,” in 2014 ACM/IEEE Symposium on Architectures for Networking and Communications Systems
(ANCS), Oct 2014, pp. 53–64.
[262] M. Chowdhury, Y. Zhong, and I. Stoica, “Efficient Coflow Scheduling with Varys,” SIGCOMM Comput.
Commun. Rev., vol. 44, no. 4, pp. 443–454, Aug. 2014.
[263] F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, “Decentralized Task-aware Scheduling for Data
Center Networks,” SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 431–442, Aug. 2014.
[264] S. Luo, H. Yu, Y. Zhao, S. Wang, S. Yu, and L. Li, “Towards Practical and Near-optimal Coflow Scheduling
for Data Center Networks,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1,
2016.
[265] Y. Zhao, K. Chen, W. Bai, M. Yu, C. Tian, Y. Geng, Y. Zhang, D. Li, and S. Wang, “Rapier: Integrating
routing and scheduling for coflow-aware data center networks,” in 2015 IEEE Conference on Computer
Communications (INFOCOM), April 2015, pp. 424–432.
[266] Z. Guo, J. Duan, and Y. Yang, “On-Line Multicast Scheduling with Bounded Congestion in Fat-Tree Data
Center Networks,” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 1, pp. 102–115, January
2014.
[267] M. V. Neves, C. A. F. D. Rose, K. Katrinis, and H. Franke, “Pythia: Faster Big Data in Motion through
Predictive Software-Defined Network Optimization at Runtime,” in 2014 IEEE 28th International Parallel and
Distributed Processing Symposium, May 2014, pp. 82–90.
[268] W. Hong, K. Wang, and Y.-H. Hsu, “Application-Aware Resource Allocation for SDN-based Cloud
Datacenters,” in Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on, Dec 2013,
pp. 106–110.
[269] P. Qin, B. Dai, B. Huang, and G. Xu, “Bandwidth-Aware Scheduling With SDN in Hadoop: A New Trend
for Big Data,” Systems Journal, IEEE, vol. PP, no. 99, pp. 1–8, 2015.
[270] H. Rodrigues, R. Strong, A. Akyurek, and T. Rosing, “Dynamic optical switching for latency sensitive
applications,” in Architectures for Networking and Communications Systems (ANCS), 2015 ACM/IEEE
Symposium on, May 2015, pp. 75–86.
[271] K. Kontodimas, K. Christodoulopoulos, E. Zahavi, and E. Varvarigos, “Resource allocation in slotted
optical data center networks,” in 2018 International Conference on Optical Network Design and Modeling
(ONDM), May 2018, pp. 248–253.
[272] G. C. Sankaran and K. M. Sivalingam, “Design and Analysis of Scheduling Algorithms for Optically
Groomed Data Center Networks,” IEEE/ACM Transactions on Networking, vol. 25, no. 6, pp. 3282–3293,
Dec 2017.
[273] L. Wang, X. Wang, M. Tornatore, K. J. Kim, S. M. Kim, D. Kim, K. Han, and B. Mukherjee, “Scheduling
with machine-learning-based flow detection for packet-switched optical data center networks,” IEEE/OSA
Journal of Optical Communications and Networking, vol. 10, no. 4, pp. 365–375, April 2018.
[274] R. Xie and X. Jia, “Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud
Systems,” Cloud Computing, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[275] I. Paik, W. Chen, and Z. Li, “Topology-Aware Optimal Data Placement Algorithm for Network Traffic
Optimization,” Computers, IEEE Transactions on, vol. PP, no. 99, pp. 1–1, 2015.
[276] W. Li, D. Guo, A. X. Liu, K. Li, H. Qi, S. Guo, A. Munir, and X. Tao, “CoMan: Managing Bandwidth
Across Computing Frameworks in Multiplexed Datacenters,” IEEE Transactions on Parallel and Distributed
Systems, vol. 29, no. 5, pp. 1013–1029, May 2018.
[277] H. Shen, A. Sarker, L. Yu, and F. Deng, “Probabilistic Network-Aware Task Placement for MapReduce
Scheduling,” in 2016 IEEE International Conference on Cluster Computing (CLUSTER), Sept 2016, pp. 241–
250.
[278] Z. Li, H. Shen, and A. Sarker, “A Network-Aware Scheduler in Data-Parallel Clusters for High
Performance,” in 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
(CCGRID), May 2018, pp. 1–10.
[279] D. Xie, N. Ding, Y. C. Hu, and R. Kompella, “The Only Constant is Change: Incorporating Time-varying
Network Reservations in Data Centers,” in Proceedings of the ACM SIGCOMM 2012 Conference on
Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12.
New York, NY, USA: ACM, 2012, pp. 199–210.
[280] V. Jalaparti, P. Bodik, I. Menache, S. Rao, K. Makarychev, and M. Caesar, “Network-Aware Scheduling
for Data-Parallel Jobs: Plan When You Can,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 407–420,
Aug. 2015.
[281] K. Karanasos, S. Rao, C. Curino, C. Douglas, K. Chaliparambil, G. M. Fumarola, S. Heddaya, R.
Ramakrishnan, and S. Sakalanaga, “Mercury: Hybrid Centralized and Distributed Scheduling in Large
Shared Clusters,” in Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference, ser.
USENIX ATC ’15. Berkeley, CA, USA: USENIX Association, 2015, pp. 485–497.
[282] T. Renner, L. Thamsen, and O. Kao, “Network-aware resource management for scalable data analytics
frameworks,” in Big Data (Big Data), 2015 IEEE International Conference on, Oct 2015, pp. 2793–2800.
[283] R. F. e Silva and P. M. Carpenter, “Energy Efficient Ethernet on MapReduce Clusters: Packet Coalescing
To Improve 10GbE Links,” IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2731–2742, Oct 2017.
[284] G. Wen, J. Hong, C. Xu, P. Balaji, S. Feng, and P. Jiang, “Energy-aware hierarchical scheduling of
applications in large scale data centers,” in Cloud and Service Computing (CSC), 2011 International Conference
on, Dec 2011, pp. 158–165.
[285] D. Li, Y. Yu, W. He, K. Zheng, and B. He, “Willow: Saving Data Center Network Energy for Network-
Limited Flows,” Parallel and Distributed Systems, IEEE Transactions on, vol. 26, no. 9, pp. 2610–2620, Sept
2015.
[286] Z. Niu, B. He, and F. Liu, “JouleMR: Towards Cost-Effective and Green-Aware Data Processing
Frameworks,” IEEE Transactions on Big Data, vol. 4, no. 2, pp. 258–272, June 2018.
[287] R. Appuswamy, C. Gkantsidis, D. Narayanan, O. Hodson, and A. Rowstron, “Scale-up vs Scale-out for
Hadoop: Time to rethink?” ACM Symposium on Cloud Computing, October 2013.
[288] Z. Li, H. Shen, W. Ligon, and J. Denton, “An Exploration of Designing a Hybrid Scale-Up/Out Hadoop
Architecture Based on Performance Measurements,” IEEE Transactions on Parallel and Distributed Systems,
vol. 28, no. 2, pp. 386–400, Feb 2017.
[289] S. Sur, H. Wang, J. Huang, X. Ouyang, and D. Panda, “Can High-Performance Interconnects Benefit
Hadoop Distributed File System?” 2010.
[290] Y. Wang, R. Goldstone, W. Yu, and T. Wang, “Characterization and Optimization of Memory-Resident
MapReduce on HPC Systems,” in Parallel and Distributed Processing Symposium, 2014 IEEE 28th International,
May 2014, pp. 799–808.
[291] K. Kambatla and Y. Chen, “The Truth About MapReduce Performance on SSDs,” in Proceedings of the
28th USENIX Conference on Large Installation System Administration, ser. LISA’14. Berkeley, CA, USA:
USENIX Association, 2014, pp. 109–117.
[292] J. Hong, L. Li, C. Han, B. Jin, Q. Yang, and Z. Yang, “Optimizing Hadoop Framework for Solid State
Drives,” in 2016 IEEE International Congress on Big Data (BigData Congress), June 2016, pp. 9–17.
[293] B. Wang, J. Jiang, Y. Wu, G. Yang, and K. Li, “Accelerating MapReduce on Commodity Clusters: An SSD-
Empowered Approach,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2016.
[294] J. Bhimani, J. Yang, Z. Yang, N. Mi, Q. Xu, M. Awasthi, R. Pandurangan, and V. Balakrishnan,
“Understanding performance of I/O intensive containerized applications for NVMe SSDs,” in 2016 IEEE 35th
International Performance Computing and Communications Conference (IPCCC), Dec 2016, pp. 1–8.
[295] G. Wang, A. R. Butt, H. Monti, and K. Gupta, “Towards Synthesizing Realistic Workload Traces for
Studying the Hadoop Ecosystem,” in Proceedings of the 2011 IEEE 19th Annual International Symposium on
Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, ser. MASCOTS ’11.
Washington, DC, USA: IEEE Computer Society, 2011, pp. 400–408.
[296] T. Ono, Y. Konishi, T. Tanimoto, N. Iwamatsu, T. Miyoshi, and J. Tanaka, “FlexDAS: A flexible direct
attached storage for I/O intensive applications,” in 2014 IEEE International Conference on Big Data (Big Data),
Oct 2014, pp. 147–152.
[297] Y. Kim, S. Atchley, G. R. Vallee, and G. M. Shipman, “Layout-aware I/O Scheduling for terabits data
movement,” in 2013 IEEE International Conference on Big Data, Oct 2013, pp. 44–51.
[298] A. Dragojević, D. Narayanan, M. Castro, and O. Hodson, “FaRM: Fast Remote Memory,” in 11th USENIX
Symposium on Networked Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX
Association, 2014, pp. 401–414.
[299] W. Yu, Y. Wang, X. Que, and C. Xu, “Virtual Shuffling for Efficient Data Movement in MapReduce,”
Computers, IEEE Transactions on, vol. 64, no. 2, pp. 556–568, Feb 2015.
[300] M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A.
Ailamaki, and B. Falsafi, “A Case for Specialized Processors for Scale-Out Workloads,” IEEE Micro, vol. 34, no.
3, pp. 31–42, May 2014.
[301] B. Jacob, “The 2 PetaFLOP, 3 Petabyte, 9 TB/s, 90 kW Cabinet: A System Architecture for Exascale and
Big Data,” IEEE Computer Architecture Letters, vol. PP, no. 99, pp. 1–1, 2015.
[302] W. Fang, B. He, Q. Luo, and N. K. Govindaraju, “Mars: Accelerating MapReduce with Graphics
Processors,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 4, pp. 608–620, April 2011.
[303] C. Wang, C. Yang, W. Liao, R. Chang, and T. Wei, “Coupling GPU and MPTCP to improve
Hadoop/MapReduce performance,” in 2016 2nd International Conference on Intelligent Green Building and
Smart Grid (IGBSG), June 2016, pp. 1–6.
[304] Y. Shan, B. Wang, J. Yan, Y. Wang, N. Xu, and H. Yang, “FPMR: MapReduce framework on FPGA,” in
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays. New
York, NY, USA: ACM, January 2010, pp. 93–102.
[305] C. Wang, X. Li, and X. Zhou, “SODA: Software defined FPGA based accelerators for big data,” in 2015
Design, Automation Test in Europe Conference Exhibition (DATE), March 2015, pp. 884–887.
[306] D. Diamantopoulos and C. Kachris, “High-level synthesizable dataflow MapReduce accelerator for FPGA-
coupled data centers,” in Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS),
2015 International Conference on, July 2015, pp. 26–33.
[307] Y. Tokusashi and H. Matsutani, “Multilevel NoSQL Cache Combining In-NIC and In-Kernel Approaches,”
IEEE Micro, vol. 37, no. 5, pp. 44–51, September 2017.
[308] K. Nakamura, A. Hayashi, and H. Matsutani, “An FPGA-based low-latency network processing for spark
streaming,” in 2016 IEEE International Conference on Big Data (Big Data), Dec 2016, pp. 2410–2415.
[309] B. Betkaoui, D. B. Thomas, W. Luk, and N. Przulj, “A framework for FPGA acceleration of large graph
problems: Graphlet counting case study,” in 2011 International Conference on Field-Programmable Technology,
Dec 2011, pp. 1–8.
[310] P. X. Gao, A. Narayan, S. Karandikar, J. Carreira, S. Han, R. Agarwal, S. Ratnasamy, and S. Shenker,
“Network Requirements for Resource Disaggregation,” in 12th USENIX Symposium on Operating Systems
Design and Implementation (OSDI 16). Savannah, GA: USENIX Association, 2016, pp. 249–264.
[311] C.-S. Li, H. Franke, C. Parris, B. Abali, M. Kesavan, and V. Chang, “Composable architecture for rack scale
big data computing,” Future Generation Computer Systems, vol. 67, pp. 180–193, 2017.
[312] M. Chen, S. Mao, and Y. Liu, “Big Data: A Survey,” Mob. Netw. Appl., vol. 19, no. 2, pp. 171–209, Apr.
2014.
[313] S. Sakr, A. Liu, D. Batista, and M. Alomari, “A Survey of Large Scale Data Management Approaches in
Cloud Environments,” Communications Surveys Tutorials, IEEE, vol. 13, no. 3, pp. 311–336, Third 2011.
[314] L. Jiamin and F. Jun, “A Survey of MapReduce Based Parallel Processing Technologies,” Communications,
China, vol. 11, no. 14, pp. 146–155, Supplement 2014.
[315] Y. Zhang, T. Cao, S. Li, X. Tian, L. Yuan, H. Jia, and A. V. Vasilakos, “Parallel Processing Systems for
Big Data: A Survey,” Proceedings of the IEEE, vol. 104, no. 11, pp. 2114–2136, Nov 2016.
[316] H. Zhang, G. Chen, B. C. Ooi, K. L. Tan, and M. Zhang, “In-Memory Big Data Management and Processing:
A Survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 7, pp. 1920–1948, July 2015.
[317] R. Han, L. K. John, and J. Zhan, “Benchmarking Big Data Systems: A Review,” IEEE Transactions on
Services Computing, vol. PP, no. 99, pp. 1–1, 2017.
[318] G. Rumi, C. Colella, and D. Ardagna, “Optimization Techniques within the Hadoop Eco-system: A Survey,”
in Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2014 16th International Symposium
on, Sept 2014, pp. 437–444.
[319] B. T. Rao and L. S. S. Reddy, “Survey on Improved Scheduling in Hadoop MapReduce in Cloud
Environments,” CoRR, vol. abs/1207.0780, 2012.
[320] R. Li, H. Hu, H. Li, Y. Wu, and J. Yang, “Mapreduce parallel programming model: A state-of-the-art
survey,” International Journal of Parallel Programming, vol. 44, no. 4, pp. 832–866, Aug 2016.
[321] J. Wu, S. Guo, J. Li, and D. Zeng, “Big Data Meet Green Challenges: Big Data Toward Green Applications,”
IEEE Systems Journal, vol. 10, no. 3, pp. 888–900, Sept 2016.
[322] P. Derbeko, S. Dolev, E. Gudes, and S. Sharma, “Security and privacy aspects in mapreduce on clouds: A
survey,” Computer Science Review, vol. 20, pp. 1–28, 2016.
[323] S. Dolev, P. Florissi, E. Gudes, S. Sharma, and I. Singer, “A Survey on Geographically Distributed Big-
Data Processing using MapReduce,” IEEE Transactions on Big Data, vol. PP, no. 99, pp. 1–1, 2017.
[324] M. Hadi, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Big Data Analytics for Wireless and Wired Network
Design: A Survey,” Computer Networks, vol. 132, pp. 180–199, February 2018.
[325] J. Wang, Y. Wu, N. Yen, S. Guo, and Z. Cheng, “Big Data Analytics for Emergency Communication
Networks: A Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1758–1778, thirdquarter 2016.
[326] X. Cao, L. Liu, Y. Cheng, and X. Shen, “Towards Energy-Efficient Wireless Networking in the Big Data
Era: A Survey,” IEEE Communications Surveys Tutorials, vol. PP, no. 99, pp. 1–1, 2017.
[327] S. Yu, M. Liu, W. Dou, X. Liu, and S. Zhou, “Networking for Big Data: A Survey,” IEEE Communications
Surveys Tutorials, vol. 19, no. 1, pp. 531–549, Firstquarter 2017.
[328] S. Wang, J. Zhang, T. Huang, J. Liu, T. Pan, and Y. Liu, “A Survey of Coflow Scheduling Schemes for
Data Center Networks,” IEEE Communications Magazine, vol. 56, no. 6, pp. 179–185, June 2018.
[329] K. Wang, Q. Zhou, S. Guo, and J. Luo, “Cluster Frameworks for Efficient Scheduling and Resource
Allocation in Data Center Networks: A Survey,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[330] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed Data-parallel Programs from
Sequential Building Blocks,” SIGOPS Oper. Syst. Rev., vol. 41, no. 3, pp. 59–72, Mar. 2007.
[331] T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma, R. Lax, S. McVeety, D.
Mills, F. Perry, E. Schmidt, and S. Whittle, “The Dataflow Model: A Practical Approach to Balancing Correctness,
Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing,” Proceedings of the VLDB
Endowment, vol. 8, pp. 1792–1803, 2015.
[332] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google File System,” SIGOPS Oper. Syst. Rev., vol. 37,
no. 5, pp. 29–43, Oct. 2003.
[333] V. Kalavri and V. Vlassov, “MapReduce: Limitations, Optimizations and Open Issues,” in 2013 12th IEEE
International Conference on Trust, Security and Privacy in Computing and Communications, July 2013, pp. 1031–
1038.
[334] S. Babu, “Towards Automatic Optimization of MapReduce Programs,” in Proceedings of the 1st ACM
Symposium on Cloud Computing, ser. SoCC ’10. New York, NY, USA: ACM, 2010, pp. 137–142.
[335] P. Lama and X. Zhou, “AROMA: Automated Resource Allocation and Configuration of MapReduce
Environment in the Cloud,” in Proceedings of the 9th International Conference on Autonomic Computing, ser.
ICAC ’12. New York, NY, USA: ACM, 2012, pp. 63–72.
[336] A. Rabkin and R. Katz, “How Hadoop Clusters Break,” Software, IEEE, vol. 30, no. 4, pp. 88–94, July
2013.
[337] D. Cheng, J. Rao, Y. Guo, C. Jiang, and X. Zhou, “Improving Performance of Heterogeneous MapReduce
Clusters with Adaptive Task Tuning,” IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 3, pp.
774–786, March 2017.
[338] T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy, and R. Sears, “MapReduce Online,” in
Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’10.
Berkeley, CA, USA: USENIX Association, 2010, pp. 21–21.
[339] T. White, Hadoop: The Definitive Guide, 1st ed. O’Reilly Media, Inc., 2009.
[340] C. Ji, Y. Li, W. Qiu, U. Awada, and K. Li, “Big Data Processing in Cloud Computing Environments,” in
Pervasive Systems, Algorithms and Networks (ISPAN), 2012 12th International Symposium on, Dec 2012, pp.
17–23.
[341] V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah,
S. Seth, B. Saha, C. Curino, O. O’Malley, S. Radia, B. Reed, and E. Baldeschwieler, “Apache Hadoop YARN:
Yet Another Resource Negotiator,” in Proceedings of the 4th Annual Symposium on Cloud Computing, ser.
SOCC ’13. New York, NY, USA: ACM, 2013, pp. 5:1–5:16.
[342] I. Polato, D. Barbosa, A. Hindle, and F. Kon, “Hadoop branching: Architectural impacts on energy and
performance,” in Green Computing Conference and Sustainable Computing Conference (IGSC), 2015 Sixth
International, Dec 2015, pp. 1–4.
[343] C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, “Pig Latin: A Not-so-foreign Language for
Data Processing,” in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data,
ser. SIGMOD ’08. New York, NY, USA: ACM, 2008, pp. 1099–1110.
[344] B. Saha, H. Shah, S. Seth, G. Vijayaraghavan, A. Murthy, and C. Curino, “Apache Tez: A Unifying
Framework for Modeling and Building Data Processing Applications,” in Proceedings of the 2015 ACM
SIGMOD International Conference on Management of Data, ser. SIGMOD ’15. New York, NY, USA: ACM,
2015, pp. 1357–1369.
[345] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy, “Hive:
A Warehousing Solution over a Map-reduce Framework,” Proc. VLDB Endow., vol. 2, no. 2, pp. 1626–1629,
Aug. 2009.
[346] A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J. M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, J.
Donham, N. Bhagat, S. Mittal, and D. Ryaboy, “Storm@twitter,” in Proceedings of the 2014 ACM SIGMOD
International Conference on Management of Data, ser. SIGMOD ’14. New York, NY, USA: ACM, 2014, pp.
147–156.
[347] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R.
E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,” ACM Trans. Comput. Syst., vol. 26,
no. 2, pp. 4:1–4:26, Jun. 2008.
[348] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D.
Weaver, and R. Yerneni, “PNUTS: Yahoo!’s Hosted Data Serving Platform,” Proc. VLDB Endow., vol. 1, no. 2,
pp. 1277–1288, Aug. 2008.
[349] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P.
Vosshall, and W. Vogels, “Dynamo: Amazon’s Highly Available Key-value Store,” SIGOPS Oper. Syst. Rev.,
vol. 41, no. 6, pp. 205–220, Oct. 2007.
[350] A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin, “HadoopDB: An
Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads,” Proc. VLDB Endow.,
vol. 2, no. 1, pp. 922–933, Aug. 2009.
[351] A. Lakshman and P. Malik, “Cassandra: A Decentralized Structured Storage System,” SIGOPS Oper. Syst.
Rev., vol. 44, no. 2, pp. 35–40, Apr. 2010.
[352] J. Dittrich, J.-A. Quiané-Ruiz, A. Jindal, Y. Kargin, V. Setty, and J. Schad, “Hadoop++: Making a Yellow
Elephant Run Like a Cheetah (Without It Even Noticing),” Proc. VLDB Endow., vol. 3, no. 1-2, pp. 515–529,
Sep. 2010.
[353] J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G.
Parulkar, M. Rosenblum, S. M. Rumble, E. Stratmann, and R. Stutsman, “The Case for RAMClouds: Scalable
High-performance Storage Entirely in DRAM,” SIGOPS Oper. Syst. Rev., vol. 43, no. 4, pp. 92–105, Jan. 2010.
[354] F. Färber, S. K. Cha, J. Primsch, C. Bornhövd, S. Sigg, and W. Lehner, “SAP HANA Database: Data
Management for Modern Business Applications,” SIGMOD Rec., vol. 40, no. 4, pp. 45–51, Jan. 2012.
[355] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: Cluster Computing with
Working Sets,” in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, ser.
HotCloud’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 10–10.
[356] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica,
“Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing,” in Proceedings
of the 9th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’12. Berkeley, CA,
USA: USENIX Association, 2012, pp. 2–2.
[357] A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan, “One Trillion Edges: Graph
Processing at Facebook-scale,” Proc. VLDB Endow., vol. 8, no. 12, pp. 1804–1815, Aug. 2015.
[358] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: A
System for Large-scale Graph Processing,” in Proceedings of the 2010 ACM SIGMOD International Conference
on Management of Data, ser. SIGMOD ’10. New York, NY, USA: ACM, 2010, pp. 135–146.
[359] B. Shao, H. Wang, and Y. Li, “Trinity: A Distributed Graph Engine on a Memory Cloud,” in Proceedings
of SIGMOD 2013. ACM SIGMOD, June 2013.
[360] Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein, “Distributed GraphLab: A
Framework for Machine Learning and Data Mining in the Cloud,” Proc. VLDB Endow., vol. 5, no. 8, pp. 716–
727, Apr. 2012.
[361] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, “PowerGraph: Distributed Graph-Parallel
Computation on Natural Graphs,” in Presented as part of the 10th USENIX Symposium on Operating Systems
Design and Implementation (OSDI 12). Hollywood, CA: USENIX, 2012, pp. 17–30.
[362] D. Bernstein, “The Emerging Hadoop, Analytics, Stream Stack for Big Data,” Cloud Computing, IEEE,
vol. 1, no. 4, pp. 84–86, Nov 2014.
[363] A. M. Aly, A. Sallam, B. M. Gnanasekaran, L. V. Nguyen-Dinh, W. G. Aref, M. Ouzzani, and A. Ghafoor,
“M3: Stream Processing on Main-Memory MapReduce,” in 2012 IEEE 28th International Conference on Data
Engineering, April 2012, pp. 1253–1256.
[364] T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P.
Nordstrom, and S. Whittle, “MillWheel: Fault-Tolerant Stream Processing at Internet Scale,” in Very Large Data
Bases, 2013, pp. 734–746.
[365] L. Neumeyer, B. Robbins, A. Nair, and A. Kesari, “S4: Distributed Stream Computing Platform,” in
Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, ser. ICDMW ’10.
Washington, DC, USA: IEEE Computer Society, 2010, pp. 170–177.
[366] M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica, “Discretized Streams: An Efficient and Fault-tolerant
Model for Stream Processing on Large Clusters,” in Proceedings of the 4th USENIX Conference on Hot Topics
in Cloud Computing, ser. HotCloud’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 10–10.
[367] N. Marz and J. Warren, Big Data: Principles and Best Practices of Scalable Realtime Data Systems, 1st ed.
Greenwich, CT, USA: Manning Publications Co., 2015.
[368] J. Kreps, N. Narkhede, and J. Rao, “Kafka: A distributed messaging system for log processing,” in
Proceedings of 6th International Workshop on Networking Meets Databases (NetDB), Athens, Greece, 2011.
[369] Apache Hadoop Rumen. (Cited on 2016, Dec). [Online]. Available:
https://hadoop.apache.org/docs/stable/hadoop-rumen/Rumen.html
[370] M. Sadiku, S. Musa, and O. Momoh, “Cloud Computing: Opportunities and Challenges,” Potentials, IEEE,
vol. 33, no. 1, pp. 34–36, Jan 2014.
[371] B. Biocic, D. Tomic, and D. Ogrizovic, “Economics of the cloud computing,” in MIPRO, 2011 Proceedings
of the 34th International Convention, May 2011, pp. 1438–1442.
[372] N. da Fonseca and R. Boutaba, Cloud Architectures, Networks, Services, and Management. Wiley-IEEE
Press, 2015, p. 432.
[373] J. E. Smith and R. Nair, “The architecture of virtual machines,” Computer, vol. 38, no. 5, pp. 32–38, May
2005.
[374] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, “Virtual Infrastructure Management in Private
and Hybrid Clouds,” IEEE Internet Computing, vol. 13, no. 5, pp. 14–22, Sept 2009.
[375] Amazon EC2. (Cited on 2017, Dec). [Online]. Available: https://aws.amazon.com/ec2
[376] Google Compute Engine Pricing. (Cited on 2017, Dec). [Online]. Available:
https://cloud.google.com/compute/pricing
[377] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield,
“Xen and the Art of Virtualization,” SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 164–177, Oct. 2003.
[378] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, “kvm: the Linux Virtual Machine Monitor,” in
Proceedings of the Linux Symposium, vol. 1, Ottawa, Ontario, Canada, Jun. 2007, pp. 225–230.
[379] F. Guthrie, S. Lowe, and K. Coleman, VMware vSphere Design, 2nd ed. Alameda, CA, USA: SYBEX Inc.,
2013.
[380] T. Kooburat and M. Swift, “The Best of Both Worlds with On-demand Virtualization,” in Proceedings of
the 13th USENIX Conference on Hot Topics in Operating Systems, ser. HotOS’13. Berkeley, CA, USA: USENIX
Association, 2011, pp. 4–4.
[381] D. Venzano and P. Michiardi, “A Measurement Study of Data-Intensive Network Traffic Patterns in a
Private Cloud,” in 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, Dec 2013, pp.
476–481.
[382] F. Xu, F. Liu, H. Jin, and A. Vasilakos, “Managing Performance Overhead of Virtual Machines in Cloud
Computing: A Survey, State of the Art, and Future Directions,” Proceedings of the IEEE, vol. 102, no. 1, pp. 11–
31, Jan 2014.
[383] G. Wang and T. S. E. Ng, “The Impact of Virtualization on Network Performance of Amazon EC2 Data
Center,” in Proceedings of the 29th Conference on Information Communications, ser. INFOCOM’10. Piscataway,
NJ, USA: IEEE Press, 2010, pp. 1163–1171.
[384] Q. Duan, Y. Yan, and A. V. Vasilakos, “A Survey on Service-Oriented Network Virtualization Toward
Convergence of Networking and Cloud Computing,” IEEE Transactions on Network and Service Management,
vol. 9, no. 4, pp. 373–392, December 2012.
[385] R. Jain and S. Paul, “Network virtualization and software defined networking for cloud computing: a
survey,” Communications Magazine, IEEE, vol. 51, no. 11, pp. 24–31, November 2013.
[386] A. Fischer, J. F. Botero, M. T. Beck, H. de Meer, and X. Hesselbach, “Virtual Network Embedding: A
Survey,” IEEE Communications Surveys Tutorials, vol. 15, no. 4, pp. 1888–1906, Fourth 2013.
[387] L. Nonde, T. El-Gorashi, and J. Elmirghani, “Energy Efficient Virtual Network Embedding for Cloud
Networks,” Lightwave Technology, Journal of, vol. 33, no. 9, pp. 1828–1849, May 2015.
[388] L. Nonde, T. E. H. Elgorashi, and J. M. H. Elmirghani, “Cloud Virtual Network Embedding: Profit, Power
and Acceptance,” in 2015 IEEE Global Communications Conference (GLOBECOM), Dec 2015, pp. 1–6.
[389] R. Mijumbi, J. Serrat, J. Gorricho, N. Bouten, F. De Turck, and R. Boutaba, “Network Function
Virtualization: State-of-the-Art and Research Challenges,” Communications Surveys Tutorials, IEEE, vol. 18, no.
1, pp. 236–262, Firstquarter 2016.
[390] H. Hawilo, A. Shami, M. Mirahmadi, and R. Asal, “NFV: state of the art, challenges, and implementation
in next generation mobile networks (vEPC),” IEEE Network, vol. 28, no. 6, pp. 18–26, Nov 2014.
[391] V. Nguyen, A. Brunstrom, K. Grinnemo, and J. Taheri, “SDN/NFV-Based Mobile Packet Core Network
Architectures: A Survey,” IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1567–1602, thirdquarter
2017.
[392] D. A. Temesgene, J. Núñez-Martínez, and P. Dini, “Softwarization and Optimization for Sustainable Future
Mobile Networks: A Survey,” IEEE Access, vol. 5, pp. 25421–25436, 2017.
[393] I. Afolabi, T. Taleb, K. Samdanis, A. Ksentini, and H. Flinck, “Network Slicing & Softwarization: A Survey
on Principles, Enabling Technologies & Solutions,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[394] J. G. Herrera and J. F. Botero, “Resource Allocation in NFV: A Comprehensive Survey,” IEEE Transactions
on Network and Service Management, vol. 13, no. 3, pp. 518–532, Sept 2016.
[395] L. Peterson, A. Al-Shabibi, T. Anshutz, S. Baker, A. Bavier, S. Das, J. Hart, G. Palukar, and W. Snow,
“Central office re-architected as a data center,” IEEE Communications Magazine, vol. 54, no. 10, pp. 96–101,
October 2016.
[396] A. N. Al-Quzweeni, A. Q. Lawey, T. E. H. Elgorashi, and J. M. H. Elmirghani, “Optimized Energy Aware
5G Network Function Virtualization,” IEEE Access, pp. 1–1, 2019.
[397] A. Al-Quzweeni, A. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “A framework for energy efficient
NFV in 5G networks,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July
2016, pp. 1–4.
[398] A. Al-Quzweeni, T. E. H. El-Gorashi, L. Nonde, and J. M. H. Elmirghani, “Energy efficient network
function virtualization in 5G networks,” in 2015 17th International Conference on Transparent Optical Networks
(ICTON), July 2015, pp. 1–4.
[399] J. Zhang, Y. Ji, X. Xu, H. Li, Y. Zhao, and J. Zhang, “Energy efficient baseband unit aggregation in cloud
radio and optical access networks,” IEEE/OSA Journal of Optical Communications and Networking, vol. 8, no.
11, pp. 893–901, Nov 2016.
[400] M. Peng, Y. Li, Z. Zhao, and C. Wang, “System architecture and key technologies for 5G heterogeneous
cloud radio access networks,” IEEE Network, vol. 29, no. 2, pp. 6–14, March 2015.
[401] M. Peng, Y. Sun, X. Li, Z. Mao, and C. Wang, “Recent Advances in Cloud Radio Access Networks: System
Architectures, Key Techniques, and Open Issues,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp.
2282–2308, thirdquarter 2016.
[402] A. Checko, H. L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. S. Berger, and L. Dittmann, “Cloud
RAN for Mobile Networks; A Technology Overview,” IEEE Communications Surveys Tutorials, vol. 17, no. 1,
pp. 405–426, Firstquarter 2015.
[403] L. Velasco, L. M. Contreras, G. Ferraris, A. Stavdas, F. Cugini, M. Wiegand, and J. P. Fernandez-Palacios,
“A service-oriented hybrid access network and clouds architecture,” IEEE Communications Magazine, vol. 53,
no. 4, pp. 159–165, April 2015.
[404] M. Kalil, A. Al-Dweik, M. F. A. Sharkh, A. Shami, and A. Refaey, “A Framework for Joint Wireless
Network Virtualization and Cloud Radio Access Networks for Next Generation Wireless Networks,” IEEE
Access, vol. 5, pp. 20814–20827, 2017.
[405] M. Richart, J. Baliosian, J. Serrat, and J. L. Gorricho, “Resource Slicing in Virtual Wireless Networks: A
Survey,” IEEE Transactions on Network and Service Management, vol. 13, no. 3, pp. 462–476, Sept 2016.
[406] R. Bolla, R. Bruschi, F. Davoli, C. Lombardo, J. F. Pajo, and O. R. Sanchez, “The dark side of network
functions virtualization: A perspective on the technological sustainability,” in 2017 IEEE International
Conference on Communications (ICC), May 2017, pp. 1–7.
[407] D. Bernstein, “Containers and Cloud: From LXC to Docker to Kubernetes,” IEEE Cloud Computing, vol. 1, no. 3, pp. 81–84, Sept 2014.
[408] C. Pahl and B. Lee, “Containers and Clusters for Edge Cloud Architectures – A Technology Review,” in
2015 3rd International Conference on Future Internet of Things and Cloud, Aug 2015, pp. 379–386.
[409] I. Mavridis and H. Karatza, “Performance and Overhead Study of Containers Running on Top of Virtual
Machines,” in 2017 IEEE 19th Conference on Business Informatics (CBI), vol. 02, July 2017, pp. 32–38.
[410] M. G. Xavier, M. V. Neves, and C. A. F. D. Rose, “A Performance Comparison of Container-Based
Virtualization Systems for MapReduce Clusters,” in 2014 22nd Euromicro International Conference on Parallel,
Distributed, and Network-Based Processing, Feb 2014, pp. 299–306.
[411] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual
machines and Linux containers,” in 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2015, pp. 171–172.
[412] Docker Container Executor. (Cited on 2017, Mar). [Online]. Available:
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
[413] S. Radhakrishnan, B. J. Muscedere, and K. Daudjee, “V-Hadoop: Virtualized Hadoop using containers,” in
2016 IEEE 15th International Symposium on Network Computing and Applications (NCA), vol. 00, Oct. 2016,
pp. 237–241.
[414] D. Kreutz, F. M. V. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, “Software-
Defined Networking: A Comprehensive Survey,” Proceedings of the IEEE, vol. 103, no. 1, pp. 14–76, Jan 2015.
[415] F. Bannour, S. Souihi, and A. Mellouk, “Distributed SDN Control: Survey, Taxonomy, and Challenges,”
IEEE Communications Surveys Tutorials, vol. 20, no. 1, pp. 333–354, Firstquarter 2018.
[416] B. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka, and T. Turletti, “A Survey of Software-Defined
Networking: Past, Present, and Future of Programmable Networks,” IEEE Communications Surveys Tutorials, vol. 16, no. 3, pp. 1617–1634, Third 2014.
[417] W. Xia, Y. Wen, C. H. Foh, D. Niyato, and H. Xie, “A Survey on Software-Defined Networking,” IEEE
Communications Surveys Tutorials, vol. 17, no. 1, pp. 27–51, Firstquarter 2015.
[418] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner,
“OpenFlow: Enabling innovation in campus networks,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp.
69–74, Mar. 2008.
[419] F. Hu, Q. Hao, and K. Bao, “A Survey on Software-Defined Network and OpenFlow: From Concept to
Implementation,” IEEE Communications Surveys Tutorials, vol. 16, no. 4, pp. 2181–2206, Fourthquarter 2014.
[420] A. Lara, A. Kolasani, and B. Ramamurthy, “Network Innovation using OpenFlow: A Survey,” IEEE
Communications Surveys Tutorials, vol. 16, no. 1, pp. 493–512, First 2014.
[421] A. Mendiola, J. Astorga, E. Jacob, and M. Higuero, “A Survey on the Contributions of Software-Defined
Networking to Traffic Engineering,” IEEE Communications Surveys Tutorials, vol. 19, no. 2, pp. 918–953,
Secondquarter 2017.
[422] O. Michel and E. Keller, “SDN in wide-area networks: A survey,” in 2017 Fourth International Conference
on Software Defined Systems (SDS), May 2017, pp. 37–42.
[423] B. Pfaff, J. Pettit, T. Koponen, E. J. Jackson, A. Zhou, J. Rajahalme, J. Gross, A. Wang, J. Stringer, P.
Shelar, K. Amidon, and M. Casado, “The Design and Implementation of Open vSwitch,” in Proceedings of the
12th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’15. Berkeley, CA,
USA: USENIX Association, 2015, pp. 117–130.
[424] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat,
G. Varghese, and D. Walker, “P4: Programming Protocol-independent Packet Processors,” SIGCOMM Comput.
Commun. Rev., vol. 44, no. 3, pp. 87–95, Jul. 2014.
[425] T. Huang, F. R. Yu, C. Zhang, J. Liu, J. Zhang, and Y. Liu, “A Survey on Large-Scale Software Defined
Networking (SDN) Testbeds: Approaches and Challenges,” IEEE Communications Surveys Tutorials, vol. 19,
no. 2, pp. 891–917, Secondquarter 2017.
[426] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu,
J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat, “B4: Experience with a Globally-deployed Software Defined WAN,”
SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 3–14, Aug. 2013.
[427] C.-Y. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill, M. Nanduri, and R. Wattenhofer, “Achieving High
Utilization with Software-driven WAN,” SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 15–26, Aug.
2013.
[428] J. Wang, Y. Yan, and L. Dittmann, “Design of energy efficient optical networks with software enabled
integrated control plane,” IET Networks, vol. 4, no. 1, pp. 30–36, 2015.
[429] L. Cui, F. R. Yu, and Q. Yan, “When big data meets software-defined networking: SDN for big data and
big data for SDN,” IEEE Network, vol. 30, no. 1, pp. 58–65, January 2016.
[430] H. Huang, H. Yin, G. Min, H. Jiang, J. Zhang, and Y. Wu, “Data-Driven Information Plane in Software-
Defined Networking,” IEEE Communications Magazine, vol. 55, no. 6, pp. 218–224, 2017.
[431] T. Hafeez, N. Ahmed, B. Ahmed, and A. W. Malik, “Detection and Mitigation of Congestion in SDN
Enabled Data Center Networks: A Survey,” IEEE Access, vol. 6, pp. 1730–1740, 2018.
[432] Y. Zhang, P. Chowdhury, M. Tornatore, and B. Mukherjee, “Energy Efficiency in Telecom Optical
Networks,” IEEE Communications Surveys Tutorials, vol. 12, no. 4, pp. 441–458, Fourth 2010.
[433] R. Ramaswami, K. N. Sivarajan, and G. H. Sasaki, Optical Networks, 3rd ed. Morgan Kaufmann, 2010.
[434] H. Yin, Y. Jiang, C. Lin, Y. Luo, and Y. Liu, “Big data: transforming the design philosophy of future
internet,” IEEE Network, vol. 28, no. 4, pp. 14–19, July 2014.
[435] K.-I. Kitayama, A. Hiramatsu, M. Fukui, T. Tsuritani, N. Yamanaka, S. Okamoto, M. Jinno, and M. Koga,
“Photonic Network Vision 2020 - Toward Smart Photonic Cloud,” Journal of Lightwave Technology, vol. 32,
no. 16, pp. 2760–2770, Aug 2014.
[436] A. S. Thyagaturu, A. Mercian, M. P. McGarry, M. Reisslein, and W. Kellerer, “Software Defined Optical
Networks (SDONs): A Comprehensive Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 4, pp.
2738–2786, Fourthquarter 2016.
[437] Y. Yin, L. Liu, R. Proietti, and S. J. B. Yoo, “Software Defined Elastic Optical Networks for Cloud
Computing,” IEEE Network, vol. 31, no. 1, pp. 4–10, January 2017.
[438] A. Nag, M. Tornatore, and B. Mukherjee, “Optical Network Design With Mixed Line Rates and Multiple
Modulation Formats,” Journal of Lightwave Technology, vol. 28, no. 4, pp. 466–475, Feb 2010.
[439] Y. Ji, J. Zhang, Y. Zhao, H. Li, Q. Yang, C. Ge, Q. Xiong, D. Xue, J. Yu, and S. Qiu, “All Optical Switching
Networks With Energy-Efficient Technologies From Components Level to Network Level,” IEEE Journal on
Selected Areas in Communications, vol. 32, no. 8, pp. 1600–1614, Aug 2014.
[440] X. Zhao, V. Vusirikala, B. Koley, V. Kamalov, and T. Hofmeister, “The prospect of inter-data-center optical
networks,” IEEE Communications Magazine, vol. 51, no. 9, pp. 32–38, September 2013.
[441] G. Tzimpragos, C. Kachris, I. B. Djordjevic, M. Cvijetic, D. Soudris, and I. Tomkos, “A Survey on FEC
Codes for 100 G and Beyond Optical Networks,” IEEE Communications Surveys Tutorials, vol. 18, no. 1, pp.
209–221, Firstquarter 2016.
[442] D. M. Marom, P. D. Colbourne, A. D’Errico, N. K. Fontaine, Y. Ikuma, R. Proietti, L. Zong, J. M. Rivas-
Moscoso, and I. Tomkos, “Survey of photonic switching architectures and technologies in support of spatially and
spectrally flexible optical networking [invited],” IEEE/OSA Journal of Optical Communications and Networking,
vol. 9, no. 1, pp. 1–26, Jan 2017.
[443] X. Yu, M. Tornatore, M. Xia, J. Wang, J. Zhang, Y. Zhao, J. Zhang, and B. Mukherjee, “Migration from
fixed grid to flexible grid in optical networks,” IEEE Communications Magazine, vol. 53, no. 2, pp. 34–43, Feb
2015.
[444] M. Jinno, H. Takara, B. Kozicki, Y. Tsukishima, Y. Sone, and S. Matsuoka, “Spectrum-efficient and
scalable elastic optical path network: architecture, benefits, and enabling technologies,” IEEE Communications
Magazine, vol. 47, no. 11, pp. 66–73, November 2009.
[445] O. Gerstel, M. Jinno, A. Lord, and S. J. B. Yoo, “Elastic optical networking: a new dawn for the optical
layer?” IEEE Communications Magazine, vol. 50, no. 2, pp. s12–s20, February 2012.
[446] B. C. Chatterjee, N. Sarma, and E. Oki, “Routing and Spectrum Allocation in Elastic Optical Networks: A
Tutorial,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1776–1800, thirdquarter 2015.
[447] G. Zhang, M. D. Leenheer, A. Morea, and B. Mukherjee, “A Survey on OFDM-Based Elastic Core Optical
Networking,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 65–87, First 2013.
[448] A. Klekamp, U. Gebhard, and F. Ilchmann, “Energy and Cost Efficiency of Adaptive and Mixed-Line-Rate
IP Over DWDM Networks,” Journal of Lightwave Technology, vol. 30, no. 2, pp. 215–221, Jan 2012.
[449] T. E. El-Gorashi, X. Dong, and J. M. Elmirghani, “Green optical orthogonal frequency-division
multiplexing networks,” IET Optoelectronics, vol. 8, pp. 137–148(11), June 2014.
[450] H. Harai, H. Furukawa, K. Fujikawa, T. Miyazawa, and N. Wada, “Optical Packet and Circuit Integrated
Networks and Software Defined Networking Extension,” Journal of Lightwave Technology, vol. 32, no. 16, pp.
2751–2759, Aug 2014.
[451] IT center, Intel, “Big Data in the Cloud: Converging Technologies,” Solution Brief, 2015.
[452] Project Serengeti: There’s a Virtual Elephant in my Datacenter. (Cited on 2018, May). [Online]. Available:
https://octo.vmware.com/project-serengeti-theres-a-virtual-elephant-in-my-datacenter/
[453] A. Iordache, C. Morin, N. Parlavantzas, E. Feller, and P. Riteau, “Resilin: Elastic MapReduce over Multiple
Clouds,” in 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, May 2013,
pp. 261–268.
[454] D. Wang and J. Liu, “Optimizing big data processing performance in the public cloud: opportunities and
approaches,” IEEE Network, vol. 29, no. 5, pp. 31–35, September 2015.
[455] D. Agrawal, S. Das, and A. El Abbadi, “Big Data and Cloud Computing: Current State and Future
Opportunities,” in Proceedings of the 14th International Conference on Extending Database Technology, ser.
EDBT/ICDT ’11. New York, NY, USA: ACM, 2011, pp. 530–533.
[456] N. C. Luong, P. Wang, D. Niyato, Y. Wen, and Z. Han, “Resource Management in Cloud Networking Using
Economic Analysis and Pricing Models: A Survey,” IEEE Communications Surveys Tutorials, vol. 19, no. 2, pp.
954–1001, Secondquarter 2017.
[457] Y. Zhao, X. Fei, I. Raicu, and S. Lu, “Opportunities and Challenges in Running Scientific Workflows on
the Cloud,” in 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge
Discovery, Oct 2011, pp. 455–462.
[458] E.-S. Jung and R. Kettimuthu, “Challenges and Opportunities for Data-Intensive Computing in the Cloud,”
Computer, vol. 47, no. 12, pp. 82–85, 2014.
[459] M. H. Ghahramani, M. Zhou, and C. T. Hon, “Toward cloud computing QoS architecture: analysis of cloud
systems and cloud services,” IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 1, pp. 6–18, Jan 2017.
[460] Latency is Everywhere and it Costs you Sales - How to Crush it. (Cited on 2017, Dec). [Online]. Available:
http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
[461] R. Kohavi, R. M. Henne, and D. Sommerfield, “Practical Guide to Controlled Experiments on the Web:
Listen to Your Customers Not to the Hippo,” in Proceedings of the 13th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, ser. KDD ’07. New York, NY, USA: ACM, 2007, pp. 959–967.
[462] S. S. Krishnan and R. K. Sitaraman, “Video Stream Quality Impacts Viewer Behavior: Inferring Causality
Using Quasi-experimental Designs,” in Proceedings of the 2012 Internet Measurement Conference, ser. IMC ’12.
New York, NY, USA: ACM, 2012, pp. 211–224.
[463] C. Colman-Meixner, C. Develder, M. Tornatore, and B. Mukherjee, “A Survey on Resiliency Techniques
in Cloud Computing Infrastructures and Applications,” IEEE Communications Surveys Tutorials, vol. 18, no. 3,
pp. 2244–2281, thirdquarter 2016.
[464] A. Vishwanath, F. Jalali, K. Hinton, T. Alpcan, R. W. A. Ayre, and R. S. Tucker, “Energy Consumption
Comparison of Interactive Cloud-Based and Local Applications,” IEEE Journal on Selected Areas in
Communications, vol. 33, no. 4, pp. 616–626, April 2015.
[465] J. Baliga, R. W. A. Ayre, K. Hinton, and R. S. Tucker, “Green Cloud Computing: Balancing Energy in
Processing, Storage, and Transport,” Proceedings of the IEEE, vol. 99, no. 1, pp. 149–167, Jan 2011.
[466] H. Zhang, Q. Zhang, Z. Zhou, X. Du, W. Yu, and M. Guizani, “Processing geo-dispersed big data in an
advanced mapreduce framework,” IEEE Network, vol. 29, no. 5, pp. 24–30, September 2015.
[467] A. Vulimiri, C. Curino, P. B. Godfrey, T. Jungblut, J. Padhye, and G. Varghese, “Global Analytics in the
Face of Bandwidth and Regulatory Constraints,” in Proceedings of the 12th USENIX Conference on Networked
Systems Design and Implementation, ser. NSDI’15. Berkeley, CA, USA: USENIX Association, 2015, pp. 323–
336.
[468] F. Idzikowski, L. Chiaraviglio, A. Cianfrani, J. L. Vizcaíno, M. Polverini, and Y. Ye, “A Survey on Energy-
Aware Design and Operation of Core Networks,” IEEE Communications Surveys Tutorials, vol. 18, no. 2, pp.
1453–1499, Secondquarter 2016.
[469] W. V. Heddeghem, B. Lannoo, D. Colle, M. Pickavet, and P. Demeester, “A Quantitative Survey of the
Power Saving Potential in IP-Over-WDM Backbone Networks,” IEEE Communications Surveys Tutorials, vol.
18, no. 1, pp. 706–731, Firstquarter 2016.
[470] M. N. Dharmaweera, R. Parthiban, and Y. A. Şekercioğlu, “Toward a Power-Efficient Backbone Network:
The State of Research,” IEEE Communications Surveys Tutorials, vol. 17, no. 1, pp. 198–227, Firstquarter 2015.
[471] R. S. Tucker, “Green Optical Communications—Part I: Energy Limitations in Transport,” IEEE Journal of
Selected Topics in Quantum Electronics, vol. 17, no. 2, pp. 245–260, March 2011.
[472] R. S. Tucker, “Green Optical Communications—Part II: Energy Limitations in Networks,” IEEE Journal of
Selected Topics in Quantum Electronics, vol. 17, no. 2, pp. 261–274, March 2011.
[473] W. V. Heddeghem, M. D. Groote, W. Vereecken, D. Colle, M. Pickavet, and P. Demeester, “Energy-
efficiency in telecommunications networks: Link-by-link versus end-to-end grooming,” in 2010 14th Conference
on Optical Network Design and Modeling (ONDM), Feb 2010, pp. 1–6.
[474] A. Fehske, G. Fettweis, J. Malmodin, and G. Biczok, “The global footprint of mobile communications: The
ecological and economic perspective,” IEEE Communications Magazine, vol. 49, no. 8, pp. 55–62, August 2011.
[475] H. A. Alharbi, M. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Real-Time Emissions of Telecom
Core Networks,” in 2018 20th International Conference on Transparent Optical Networks (ICTON), July 2018,
pp. 1–4.
[476] D. Feng, C. Jiang, G. Lim, L. J. Cimini, G. Feng, and G. Y. Li, “A survey of energy-efficient wireless
communications,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 167–178, First 2013.
[477] L. Budzisz, F. Ganji, G. Rizzo, M. A. Marsan, M. Meo, Y. Zhang, G. Koutitas, L. Tassiulas, S. Lambert, B.
Lannoo, M. Pickavet, A. Conte, I. Haratcherev, and A. Wolisz, “Dynamic Resource Provisioning for Energy
Efficiency in Wireless Access Networks: A Survey and an Outlook,” IEEE Communications Surveys Tutorials,
vol. 16, no. 4, pp. 2259–2285, Fourthquarter 2014.
[478] M. Ismail, W. Zhuang, E. Serpedin, and K. Qaraqe, “A Survey on Green Mobile Networking: From The
Perspectives of Network Operators and Mobile Users,” IEEE Communications Surveys Tutorials, vol. 17, no. 3,
pp. 1535–1556, thirdquarter 2015.
[479] K. Gomez, R. Riggio, T. Rasheed, and F. Granelli, “Analysing the energy consumption behaviour of WiFi
networks,” in 2011 IEEE Online Conference on Green Communications, Sept 2011, pp. 98–104.
[480] S. Xiao, X. Zhou, D. Feng, Y. Yuan-Wu, G. Y. Li, and W. Guo, “Energy-Efficient Mobile Association in
Heterogeneous Networks With Device-to-Device Communications,” IEEE Transactions on Wireless
Communications, vol. 15, no. 8, pp. 5260–5271, Aug 2016.
[481] A. Abrol and R. K. Jha, “Power Optimization in 5G Networks: A Step Towards GrEEn Communication,”
IEEE Access, vol. 4, pp. 1355–1374, 2016.
[482] L. Valcarenghi, D. P. Van, P. G. Raponi, P. Castoldi, D. R. Campelo, S. Wong, S. Yen, L. G. Kazovsky,
and S. Yamashita, “Energy efficiency in passive optical networks: where, when, and how?” IEEE Network, vol.
26, no. 6, pp. 61–68, November 2012.
[483] J. Kani, S. Shimazu, N. Yoshimoto, and H. Hadama, “Energy-efficient optical access networks: issues and
technologies,” IEEE Communications Magazine, vol. 51, no. 2, pp. S22–S26, February 2013.
[484] P. Vetter, D. Suvakovic, H. Chow, P. Anthapadmanabhan, K. Kanonakis, K. Lee, F. Saliou, X. Yin, and B.
Lannoo, “Energy-efficiency improvements for optical access,” IEEE Communications Magazine, vol. 52, no. 4,
pp. 136–144, April 2014.
[485] B. Skubic, E. I. de Betou, T. Ayhan, and S. Dahlfort, “Energy-efficient next-generation optical access
networks,” IEEE Communications Magazine, vol. 50, no. 1, pp. 122–127, January 2012.
[486] E. Goma, M. Canini, A. Lopez, N. Laoutaris, D. Kostic, P. Rodriguez, R. Stanojevic, and P. Yague,
“Insomnia in the Access (or How to Curb Access Network Related Energy Consumption),” Proceedings of the
ACM SIGCOMM 2011 Conference on Applications, Technologies, Architectures, and Protocols for Computer
Communications, 2011.
[487] A. S. Gowda, A. R. Dhaini, L. G. Kazovsky, H. Yang, S. T. Abraha, and A. Ng’oma, “Towards Green
Optical/Wireless In-Building Networks: Radio-Over-Fiber,” Journal of Lightwave Technology, vol. 32, no. 20,
pp. 3545–3556, Oct 2014.
[488] B. Kantarci and H. T. Mouftah, “Energy efficiency in the extended-reach fiber-wireless access networks,”
IEEE Network, vol. 26, no. 2, pp. 28–35, March 2012.
[489] M. Gupta and S. Singh, “Greening of the Internet,” in Proceedings of the 2003 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications, ser. SIGCOMM ’03. New York, NY,
USA: ACM, 2003, pp. 19–26.
[490] J. C. C. Restrepo, C. G. Gruber, and C. M. Machuca, “Energy Profile Aware Routing,” in 2009 IEEE
International Conference on Communications Workshops, June 2009, pp. 1–5.
[491] S. Nedevschi, L. Popa, G. Iannaccone, S. Ratnasamy, and D. Wetherall, “Reducing Network Energy
Consumption via Sleeping and Rate-adaptation,” in Proceedings of the 5th USENIX Symposium on Networked
Systems Design and Implementation, ser. NSDI’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 323–
336.
[492] G. Shen and R. Tucker, “Energy-Minimized Design for IP Over WDM Networks,” IEEE/OSA Journal of Optical Communications and Networking, vol. 1, no. 1, pp. 176–186, June 2009.
[493] X. Dong, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On the Energy Efficiency of Physical Topology
Design for IP Over WDM Networks,” Journal of Lightwave Technology, vol. 30, no. 12, pp. 1931–1942, June
2012.
[494] S. Zhang, D. Shen, and C. K. Chan, “Energy-Efficient Traffic Grooming in WDM Networks With Scheduled
Time Traffic,” Journal of Lightwave Technology, vol. 29, no. 17, pp. 2577–2584, Sept 2011.
[495] Z. H. Nasralla, T. E. H. El-Gorashi, M. O. I. Musa, and J. M. H. Elmirghani, “Energy-Efficient Traffic
Scheduling in IP over WDM Networks,” in 2015 9th International Conference on Next Generation Mobile
Applications, Services and Technologies, Sept 2015, pp. 161–164.
[496] Z. H. Nasralla, T. E. H. El-Gorashi, M. O. I. Musa, and J. M. H. Elmirghani, “Routing post-disaster
traffic floods in optical core networks,” in 2016 International Conference on Optical Network Design and
Modeling (ONDM), May 2016, pp. 1–5.
[497] Z. H. Nasralla, M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Routing post-disaster traffic
floods heuristics,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July 2016,
pp. 1–4.
[498] M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Network coding for energy efficiency in
bypass IP/WDM networks,” in 2016 18th International Conference on Transparent Optical Networks (ICTON),
July 2016, pp. 1–3.
[499] M. O. I. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient core networks using
network coding,” in 2015 17th International Conference on Transparent Optical Networks (ICTON), July 2015,
pp. 1–4.
[500] T. E. H. El-Gorashi, X. Dong, A. Lawey, and J. M. H. Elmirghani, “Core network physical topology design
for energy efficiency and resilience,” in 2013 15th International Conference on Transparent Optical Networks
(ICTON), June 2013, pp. 1–7.
[501] M. Musa, T. Elgorashi, and J. Elmirghani, “Energy efficient survivable IP-over-WDM networks with
network coding,” IEEE/OSA Journal of Optical Communications and Networking, vol. 9, no. 3, pp. 207–217,
March 2017.
[502] M. Musa, T. Elgorashi, and J. Elmirghani, “Bounds for energy efficient survivable IP over WDM
networks with network coding,” IEEE/OSA Journal of Optical Communications and Networking, vol. 10, no. 5,
pp. 471–481, May 2018.
[503] Y. Li, L. Zhu, S. K. Bose, and G. Shen, “Energy-Saving in IP Over WDM Networks by Putting Protection
Router Cards to Sleep,” Journal of Lightwave Technology, vol. 36, no. 14, pp. 3003–3017, July 2018.
[504] J. M. H. Elmirghani, L. Nonde, A. Q. Lawey, T. E. H. El-Gorashi, M. O. I. Musa, X. Dong, K. Hinton, and
T. Klein, “Energy efficiency measures for future core networks,” in 2017 Optical Fiber Communications
Conference and Exhibition (OFC), March 2017, pp. 1–3.
[505] J. M. H. Elmirghani, T. Klein, K. Hinton, L. Nonde, A. Q. Lawey, T. E. H. El-Gorashi, M. O. I. Musa, and
X. Dong, “GreenTouch GreenMeter core network energy-efficiency improvement measures and optimization,”
IEEE/OSA Journal of Optical Communications and Networking, vol. 10, no. 2, pp. A250–A269, Feb 2018.
[506] M. O. I. Musa, T. El-Gorashi, and J. M. H. Elmirghani, “Bounds on GreenTouch GreenMeter Network
Energy Efficiency,” Journal of Lightwave Technology, pp. 1–1, 2018.
[507] X. Dong, T. El-Gorashi, and J. Elmirghani, “IP Over WDM Networks Employing Renewable Energy
Sources,” Journal of Lightwave Technology, vol. 29, no. 1, pp. 3–14, Jan 2011.
[508] X. Dong, T. El-Gorashi, and J. M. H. Elmirghani, “Green IP over WDM Networks: Solar and Wind
Renewable Sources and Data Centres,” in 2011 IEEE Global Telecommunications Conference – GLOBECOM
2011, Dec 2011, pp. 1–6.
[509] G. Shen, Y. Lui, and S. K. Bose, ““Follow the Sun, Follow the Wind” Lightpath Virtual Topology
Reconfiguration in IP Over WDM Network,” Journal of Lightwave Technology, vol. 32, no. 11, pp. 2094–2105,
June 2014.
[510] M. Gattulli, M. Tornatore, R. Fiandra, and A. Pattavina, “Low-Emissions Routing for Cloud Computing in
IP-over-WDM Networks with Data Centers,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 1,
pp. 28–38, January 2014.
[511] A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Renewable energy in distributed energy
efficient content delivery clouds,” in 2015 IEEE International Conference on Communications (ICC), June 2015,
pp. 128–134.
[512] S. K. Dey and A. Adhya, “Delay-aware green service migration schemes for data center traffic,” IEEE/OSA
Journal of Optical Communications and Networking, vol. 8, no. 12, pp. 962–975, December 2016.
[513] L. Nonde, T. E. H. Elgorashi, and J. M. H. Elmirghani, “Virtual Network Embedding Employing Renewable
Energy Sources,” in 2016 IEEE Global Communications Conference (GLOBECOM), Dec 2016, pp. 1–6.
[514] C. Ge, Z. Sun, and N. Wang, “A Survey of Power-Saving Techniques on Data Centers and Content Delivery
Networks,” IEEE Communications Surveys Tutorials, vol. 15, no. 3, pp. 1334–1354, Third 2013.
[515] C. Fang, F. R. Yu, T. Huang, J. Liu, and Y. Liu, “A Survey of Green Information-Centric Networking:
Research Issues and Challenges,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1455–1472,
thirdquarter 2015.
[516] X. Dong, T. El-Gorashi, and J. Elmirghani, “Green IP Over WDM Networks With Data Centers,” Journal of Lightwave Technology, vol. 29, no. 12, pp. 1861–1880, June 2011.
[517] V. Valancius, N. Laoutaris, L. Massoulié, C. Diot, and P. Rodriguez, “Greening the Internet with Nano Data
Centers,” in Proceedings of the 5th International Conference on Emerging Networking Experiments and
Technologies, ser. CoNEXT ’09. New York, NY, USA: ACM, 2009, pp. 37–48.
[518] C. Jayasundara, A. Nirmalathas, E. Wong, and C. Chan, “Improving Energy Efficiency of Video on Demand
Services,” IEEE/OSA Journal of Optical Communications and Networking, vol. 3, no. 11, pp. 870–880,
November 2011.
[519] N. I. Osman, T. El-Gorashi, and J. M. H. Elmirghani, “Reduction of energy consumption of Video-on-
Demand services using cache size optimization,” in 2011 Eighth International Conference on Wireless and Optical
Communications Networks, May 2011, pp. 1–5.
[520] N. I. Osman, T. El-Gorashi, and J. M. H. Elmirghani, “The impact of content popularity distribution on
energy efficient caching,” in 2013 15th International Conference on Transparent Optical Networks (ICTON), June
2013, pp. 1–6.
[521] N. I. Osman, T. El-Gorashi, L. Krug, and J. M. H. Elmirghani, “Energy-Efficient Future High-Definition
TV,” Journal of Lightwave Technology, vol. 32, no. 13, pp. 2364–2381, July 2014.
[522] A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “BitTorrent Content Distribution in Optical
Networks,” Journal of Lightwave Technology, vol. 32, no. 21, pp. 4209–4225, Nov 2014.
[523] A. Lawey, T. El-Gorashi, and J. Elmirghani, “Distributed Energy Efficient Clouds Over Core Networks,”
Journal of Lightwave Technology, vol. 32, no. 7, pp. 1261–1281, April 2014.
[524] H. A. Alharbi, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Energy efficient virtual
machines placement in IP over WDM networks,” in 2017 19th International Conference on Transparent Optical
Networks (ICTON), July 2017, pp. 1–4.
[525] U. Wajid, C. Cappiello, P. Plebani, B. Pernici, N. Mehandjiev, M. Vitali, M. Gienger, K. Kavoussanakis, D.
Margery, D. Perez, and P. Sampaio, “On Achieving Energy Efficiency and Reducing CO2 Footprint in Cloud
Computing,” IEEE Transactions on Cloud Computing, vol. PP, no. 99, pp. 1–1, 2015.
[526] A. Al-Salim, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Energy Efficient Tapered Data Networks for
Big Data processing in IP/WDM networks,” in 2015 17th International Conference on Transparent Optical Networks (ICTON), July 2015, pp. 1–5.
[527] A. M. Al-Salim, H. M. M. Ali, A. Q. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Greening big data
networks: Volume impact,” in 2016 18th International Conference on Transparent Optical Networks (ICTON),
July 2016, pp. 1–6.
[528] L. A. Barroso and U. Hoelzle, The Datacenter As a Computer: An Introduction to the Design of Warehouse-
Scale Machines, 1st ed. Morgan and Claypool Publishers, 2009.
[529] T. Wang, Z. Su, Y. Xia, and M. Hamdi, “Rethinking the Data Center Networking: Architecture, Network
Protocols, and Resource Sharing,” IEEE Access, vol. 2, pp. 1481–1496, 2014.
[530] A. Hammadi and L. Mhamdi, “A survey on architectures and energy efficiency in data center networks,”
Computer Communications, vol. 40, pp. 1 – 21, 2014.
[531] W. Xia, P. Zhao, Y. Wen, and H. Xie, “A Survey on Data Center Networking (DCN): Infrastructure and
Operations,” IEEE Communications Surveys Tutorials, vol. PP, no. 99, pp. 1–1, 2016.
[532] T. Chen, X. Gao, and G. Chen, “The features, hardware, and architectures of data center networks: A
survey,” Journal of Parallel and Distributed Computing, vol. 96, pp. 45 – 74, 2016.
[533] Y. Liu, J. K. Muppala, M. Veeraraghavan, D. Lin, and M. Hamdi, “Data Center Networks,” in
SpringerBriefs in Computer Science, 2013.
[534] K. Bilal, S. U. R. Malik, O. Khalid, A. Hameed, E. Alvarez, V. Wijaysekara, R. Irfan, S. Shrestha, D.
Dwivedy, M. Ali, U. S. Khan, A. Abbas, N. Jalil, and S. U. Khan, “A taxonomy and survey on green data center
networks,” Future Generation Computer Systems, vol. 36, pp. 189 – 208, 2014.
[535] B. Wang, Z. Qi, R. Ma, H. Guan, and A. V. Vasilakos, “A Survey on Data Center Networking for Cloud
Computing,” Comput. Netw., vol. 91, no. C, pp. 528–547, Nov. 2015.
[536] M. Al-Fares, A. Loukissas, and A. Vahdat, “A Scalable, Commodity Data Center Network Architecture,”
SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 63–74, Aug. 2008.
[537] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta,
“VL2: A Scalable and Flexible Data Center Network,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp.
51–62, Aug. 2009.
[538] J. Kim, W. J. Dally, and D. Abts, “Flattened Butterfly: A Cost-efficient Topology for High-radix Networks,”
SIGARCH Comput. Archit. News, vol. 35, no. 2, pp. 126–137, Jun. 2007.
[539] J. H. Ahn, N. Binkert, A. Davis, M. McLaren, and R. S. Schreiber, “HyperX: topology, routing, and
packaging of efficient large-scale networks,” in Proceedings of the Conference on High Performance Computing
Networking, Storage and Analysis, Nov 2009, pp. 1–11.
[540] P. Beck, P. Clemens, S. Freitas, J. Gatz, M. Girola, J. Gmitter, H. Mueller, R. O’Hanlon, V. Para, J. Robinson, A. Sholomon, J. Walker, and J. Tate, IBM and Cisco: Together for a World Class Data Center. IBM Redbooks, 2013.
[541] A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Felderman,
P. Germano, A. Kanagala, J. Provost, J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, and A. Vahdat,
“Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network,” in
SIGCOMM ’15, 2015.
[542] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, “BCube: A High Performance,
Server-centric Network Architecture for Modular Data Centers,” SIGCOMM Comput. Commun.
Rev., vol. 39, no. 4, pp. 63–74, Aug. 2009.
[543] H. Wu, G. Lu, D. Li, C. Guo, and Y. Zhang, “MDCube: A High Performance Network Structure for Modular
Data Center Interconnection,” in Proceedings of the 5th International Conference on Emerging Networking
Experiments and Technologies, ser. CoNEXT ’09. New York, NY, USA: ACM, 2009, pp. 25–36.
[544] H. Abu-Libdeh, P. Costa, A. Rowstron, G. O’Shea, and A. Donnelly, “Symbiotic Routing in Future Data
Centers,” SIGCOMM Comput. Commun. Rev., vol. 40, no. 4, pp. 51–62, Aug. 2010.
[545] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “DCell: A Scalable and Fault-Tolerant Network
Structure for Data Centers,” in SIGCOMM ’08. Association for Computing Machinery, Inc., August 2008.
[546] D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, and S. Lu, “FiConn: Using Backup Port for Server Interconnection
in Data Centers,” in IEEE INFOCOM 2009, April 2009, pp. 2276–2285.
[547] A. Singla, C. Hong, L. Popa, and P. B. Godfrey, “Jellyfish: Networking Data Centers Randomly,” CoRR,
vol. abs/1110.1687, 2011.
[548] L. Gyarmati and T. A. Trinh, “Scafida: A Scale-free Network Inspired Data Center Architecture,”
SIGCOMM Comput. Commun. Rev., vol. 40, no. 5, pp. 4–12, Oct. 2010.
[549] J.-Y. Shin, B. Wong, and E. G. Sirer, “Small-world Datacenters,” in Proceedings of the 2nd ACM
Symposium on Cloud Computing, ser. SOCC ’11. New York, NY, USA: ACM, 2011, pp. 2:1–2:13.
[550] E. Baccour, S. Foufou, R. Hamila, and M. Hamdi, “A survey of wireless data center networks,” in 2015
49th Annual Conference on Information Sciences and Systems (CISS), March 2015, pp. 1–6.
[551] A. S. Hamza, J. S. Deogun, and D. R. Alexander, “Wireless Communication in Data Centers: A Survey,”
IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1572–1595, thirdquarter 2016.
[552] C. Kachris and I. Tomkos, “A survey on optical interconnects for data centers,” IEEE Communications Surveys Tutorials, vol. 14, no. 4, pp. 1021–1036, Fourth 2012.
[553] M. Chen, H. Jin, Y. Wen, and V. Leung, “Enabling technologies for future data center networking: a
primer,” IEEE Network, vol. 27, no. 4, pp. 8–15, July 2013.
[554] L. Schares, D. M. Kuchta, and A. F. Benner, “Optics in Future Data Center Networks,” in 2010 18th IEEE
Symposium on High Performance Interconnects, Aug 2010, pp. 104–108.
[555] H. Ballani, P. Costa, I. Haller, K. Jozwik, K. Shi, B. Thomsen, and H. Williams, “Bridging the Last Mile
for Optical Switching in Data Centers,” in 2018 Optical Fiber Communications Conference and Exposition (OFC),
March 2018, pp. 1–3.
[556] G. Papen, “Optical components for datacenters,” in 2017 Optical Fiber Communications Conference and
Exhibition (OFC), March 2017, pp. 1–53.
[557] L. Chen, E. Hall, L. Theogarajan, and J. Bowers, “Photonic Switching for Data Center Applications,” IEEE
Photonics Journal, vol. 3, no. 5, pp. 834–844, Oct 2011.
[558] P. N. Ji, D. Qian, K. Kanonakis, C. Kachris, and I. Tomkos, “Design and Evaluation of a Flexible-Bandwidth
OFDM-Based Intra-Data Center Interconnect,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 19,
no. 2, pp. 3700310–3700310, March 2013.
[559] C. Kachris and I. Tomkos, “Power consumption evaluation of hybrid WDM PON networks for data centers,”
in 2011 16th European Conference on Networks and Optical Communications, July 2011, pp. 118–121.
[560] Y. Cheng, M. Fiorani, R. Lin, L. Wosinska, and J. Chen, “POTORI: a passive optical top-of-rack
interconnect architecture for data centers,” IEEE/OSA Journal of Optical Communications and Networking, vol.
9, no. 5, pp. 401–411, May 2017.
[561] H. Liu, F. Lu, A. Forencich, R. Kapoor, M. Tewari, G. M. Voelker, G. Papen, A. C. Snoeren, and G. Porter,
“Circuit Switching Under the Radar with REACToR,” in 11th USENIX Symposium on Networked
Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX Association, 2014, pp. 1–15.
[562] J. Elmirghani, T. El-Gorashi, and A. Hammadi, “Passive optical-based data center networks,” WO Patent App. PCT/GB2015/053,604, 2016. [Online]. Available:
http://google.com/patents/WO2016083812A1?cl=und
[563] A. Hammadi, T. El-Gorashi, and J. Elmirghani, “High performance AWGR PONs in data centre networks,”
in 2015 17th International Conference on Transparent Optical Networks (ICTON), July 2015,
pp. 1–5.
[564] R. Alani, A. Hammadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “PON data centre design with AWGR
and server based routing,” in 2017 19th International Conference on Transparent Optical Networks
(ICTON), July 2017, pp. 1–4.
[565] A. Hammadi, T. E. H. El-Gorashi, M. O. I. Musa, and J. M. H. Elmirghani, “Server-centric PON data center
architecture,” in 2016 18th International Conference on Transparent Optical Networks (ICTON), July 2016, pp.
1–4.
[566] A. Hammadi, M. Musa, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Resource provisioning for cloud PON
AWGR-based data center architecture,” in 2016 21st European Conference on Networks and Optical
Communications (NOC), June 2016, pp. 178–182.
[567] A. Hammadi, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy-efficient software-defined AWGR-
based PON data center network,” in 2016 18th International Conference on Transparent Optical Networks
(ICTON), July 2016, pp. 1–5.
[568] A. E. A. Eltraify, M. O. I. Musa, A. Al-Quzweeni, and J. M. H. Elmirghani, “Experimental Evaluation of
Passive Optical Network Based Data Centre Architecture,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–4.
[569] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Energy Efficiency of Server-Centric PON
Data Center Architecture for Fog Computing,” in 2018 20th International Conference on Transparent Optical
Networks (ICTON), July 2018, pp. 1–4.
[570] S. H. Mohamed, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Impact of Link Failures on the Performance
of MapReduce in Data Center Networks,” in 2018 20th International Conference on Transparent
Optical Networks (ICTON), July 2018, pp. 1–4.
[571] R. A. T. Alani, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Virtual Machines Embedding for Cloud PON
AWGR and Server Based Data Centres,” arXiv e-prints, p. arXiv:1904.03298, Apr 2019.
[572] A. E. A. Eltraify, M. O. I. Musa, A. Al-Quzweeni, and J. M. H. Elmirghani, “Experimental Evaluation of
Server Centric Passive Optical Network Based Data Centre Architecture,” arXiv e-prints, p. arXiv:1904.04580,
Apr 2019.
[573] A. E. A. Eltraify, M. O. I. Musa, and J. M. H. Elmirghani, “TDM/WDM over AWGR Based Passive Optical
Network Data Centre Architecture,” arXiv e-prints, p. arXiv:1904.04581, Apr 2019.
[574] H. Yang, J. Zhang, Y. Zhao, J. Han, Y. Lin, and Y. Lee, “SUDOI: software defined networking for
ubiquitous data center optical interconnection,” IEEE Communications Magazine, vol. 54, no. 2, pp. 86–95,
February 2016.
[575] M. Fiorani, S. Aleksic, M. Casoni, L. Wosinska, and J. Chen, “Energy-Efficient Elastic Optical Interconnect
Architecture for Data Centers,” IEEE Communications Letters, vol. 18, no. 9, pp. 1531–1534, Sep.
2014.
[576] Z. Cao, R. Proietti, M. Clements, and S. J. B. Yoo, “Experimental Demonstration of Flexible Bandwidth
Optical Data Center Core Network With All-to-All Interconnectivity,” Journal of Lightwave Technology, vol. 33,
no. 8, pp. 1578–1585, April 2015.
[577] S. J. B. Yoo, Y. Yin, and K. Wen, “Intra and inter datacenter networking: The role of optical packet
switching and flexible bandwidth optical networking,” in 2012 16th International Conference on Optical Network
Design and Modelling (ONDM), April 2012, pp. 1–6.
[578] H. J. S. Dorren, S. Di Lucente, J. Luo, O. Raz, and N. Calabretta, “Scaling photonic packet switches to a
large number of ports [invited],” IEEE/OSA Journal of Optical Communications and Networking, vol. 4, no. 9,
pp. A82–A89, Sep. 2012.
[579] F. Yan, W. Miao, O. Raz, and N. Calabretta, “Opsquare: A flat DCN architecture based on flow-controlled
optical packet switches,” IEEE/OSA Journal of Optical Communications and Networking, vol. 9, no. 4, pp. 291–
303, April 2017.
[580] X. Yu, H. Gu, K. Wang, and S. Ma, “Petascale: A Scalable Buffer-Less All-Optical Network for Cloud
Computing Data Center,” IEEE Access, vol. 7, pp. 42596–42608, 2019.
[581] G. Wang, D. G. Andersen, M. Kaminsky, K. Papagiannaki, T. E. Ng, M. Kozuch, and M. Ryan, “c-Through:
part-time optics in data centers,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, Aug. 2010.
[582] N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A.
Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” SIGCOMM Comput.
Commun. Rev., vol. 41, no. 4, Aug. 2010.
[583] G. Porter, R. Strong, N. Farrington, A. Forencich, P. Chen-Sun, T. Rosing, Y. Fainman, G. Papen, and A.
Vahdat, “Integrating Microsecond Circuit Switching into the Data Center,” SIGCOMM Comput. Commun.
Rev., vol. 43, no. 4, pp. 447–458, Aug. 2013.
[584] A. Singla, A. Singh, and Y. Chen, “OSA: An Optical Switching Architecture for Data Center Networks
with Unprecedented Flexibility,” in Presented as part of the 9th USENIX Symposium on Networked Systems
Design and Implementation (NSDI 12). San Jose, CA: USENIX, 2012, pp. 239–252.
[585] A. Singla, A. Singh, K. Ramachandran, L. Xu, and Y. Zhang, “Proteus: A Topology Malleable Data Center
Network,” in Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks, ser. Hotnets-IX.
New York, NY, USA: ACM, 2010, pp. 8:1–8:6.
[586] X. Ye, Y. Yin, S. Yoo, P. Mejia, R. Proietti, and V. Akella, “DOS – A scalable optical switch for
datacenters,” in 2010 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), Oct 2010, pp. 1–12.
[587] K. Xia, Y. H. Kao, M. Yang, and H. J. Chao, “Petabit optical switch for data center networks,”
Polytechnic Institute of NYU, Tech. Rep., 2010.
[588] N. Hamedazimi, Z. Qazi, H. Gupta, V. Sekar, S. R. Das, J. P. Longtin, H. Shah, and A. Tanwer, “FireFly:
A Reconfigurable Wireless Data Center Fabric Using Free-space Optics,” SIGCOMM Comput. Commun.
Rev., vol. 44, no. 4, pp. 319–330, Aug. 2014.
[589] N. Hamedazimi, H. Gupta, V. Sekar, and S. R. Das, “Patch Panels in the Sky: A Case for Free-space Optics
in Data Centers,” in Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks, ser. HotNets-XII.
New York, NY, USA: ACM, 2013, pp. 23:1–23:7.
[590] A. Roozbeh, J. Soares, G. Q. Maguire, F. Wuhib, C. Padala, M. Mahloo, D. Turull, V. Yadhav, and D.
Kostić, “Software-Defined “Hardware” Infrastructures: A Survey on Enabling Technologies and Open Research
Directions,” IEEE Communications Surveys Tutorials, vol. 20, no. 3, pp. 2454–2485, thirdquarter 2018.
[591] S. Rumley, D. Nikolova, R. Hendry, Q. Li, D. Calhoun, and K. Bergman, “Silicon Photonics for Exascale
Systems,” Journal of Lightwave Technology, vol. 33, no. 3, pp. 547–562, Feb 2015.
[592] M. A. Taubenblatt, “Optical Interconnects for High-Performance Computing,” Journal of Lightwave
Technology, vol. 30, no. 4, pp. 448–457, Feb 2012.
[593] G. Zervas, H. Yuan, A. Saljoghei, Q. Chen, and V. Mishra, “Optically disaggregated data centers with
minimal remote memory latency: Technologies, architectures, and resource allocation [Invited],” IEEE/OSA
Journal of Optical Communications and Networking, vol. 10, no. 2, pp. A270–A285, Feb 2018.
[594] G. M. Saridis, Y. Yan, Y. Shu, S. Yan, M. Arslan, T. Bradley, N. V. Wheeler, N. H. L. Wong, F. Poletti,
M. N. Petrovich, D. J. Richardson, S. Poole, G. Zervas, and D. Simeonidou, “EVROS: All-optical programmable
disaggregated data centre interconnect utilizing hollow-core bandgap fibre,” in 2015 European Conference on
Optical Communication (ECOC), Sep. 2015, pp. 1–3.
[595] O. O. Ajibola, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “Disaggregation for Improved Efficiency in
Fog Computing Era,” arXiv e-prints, p. arXiv:1904.01311, Apr 2019.
[596] O. O. Ajibola, T. E. H. El-Gorashi, and J. M. H. Elmirghani, “On Energy Efficiency of Networks for
Composable Datacentre Infrastructures,” in 2018 20th International Conference on Transparent Optical Networks
(ICTON), July 2018, pp. 1–5.
[597] H. M. M. Ali, T. E. H. El-Gorashi, A. Q. Lawey, and J. M. H. Elmirghani, “Future Energy Efficient Data
Centers with Disaggregated Servers,” Journal of Lightwave Technology, vol. PP, no. 99, pp. 1–1, 2017.
[598] H. Mohammad Ali, A. Lawey, T. El-Gorashi, and J. Elmirghani, “Energy efficient disaggregated servers
for future data centers,” in 2015 20th European Conference on Networks and Optical Communications (NOC),
June 2015, pp. 1–6.
[599] H. M. M. Ali, A. M. Al-Salim, A. Q. Lawey, T. El-Gorashi, and J. M. H. Elmirghani, “Energy efficient
resource provisioning with VM migration heuristic for Disaggregated Server design,” in 2016 18th International
Conference on Transparent Optical Networks (ICTON), July 2016, pp. 1–5.
[600] S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken, “The Nature of Data Center Traffic:
Measurements & Analysis,” in Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement
Conference, ser. IMC ’09. New York, NY, USA: ACM, 2009, pp. 202–208.
[601] T. Benson, A. Anand, A. Akella, and M. Zhang, “Understanding Data Center Traffic Characteristics,”
SIGCOMM Comput. Commun. Rev., vol. 40, no. 1, pp. 92–99, Jan. 2010.
[602] T. Benson, A. Akella, and D. A. Maltz, “Network Traffic Characteristics of Data Centers in the Wild,” in
Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, ser. IMC ’10. New York, NY,
USA: ACM, 2010, pp. 267–280.
[603] A. Roy, H. Zeng, J. Bagga, G. Porter, and A. C. Snoeren, “Inside the Social Network’s (Datacenter)
Network,” SIGCOMM Comput. Commun. Rev., vol. 45, no. 5, pp. 123–137, Aug. 2015.
[604] Q. Zhang, V. Liu, H. Zeng, and A. Krishnamurthy, “High-resolution Measurement of Data Center
Microbursts,” in Proceedings of the 2017 Internet Measurement Conference, ser. IMC ’17. New York, NY, USA:
ACM, 2017, pp. 78–85.
[605] D. A. Popescu and A. W. Moore, “A First Look at Data Center Network Condition Through The Eyes of
PTPmesh,” in 2018 Network Traffic Measurement and Analysis Conference (TMA), June 2018, pp. 1–8.
[606] Y. Peng, K. Chen, G. Wang, W. Bai, Y. Zhao, H. Wang, Y. Geng, Z. Ma, and L. Gu, “Towards
Comprehensive Traffic Forecasting in Cloud Computing: Design and Application,” IEEE/ACM Transactions on Networking, vol. PP, no. 99, pp. 1–1, 2015.
[607] C. H. Liu, A. Kind, and A. V. Vasilakos, “Sketching the data center network traffic,” IEEE Network, vol.
27, no. 4, pp. 33–39, July 2013.
[608] Z. Hu, Y. Qiao, J. Luo, P. Sun, and Y. Wen, “CREATE: Correlation enhanced traffic matrix estimation in
Data Center Networks,” in 2014 IFIP Networking Conference, June 2014, pp. 1–9.
[609] Y. Han, J. Yoo, and J. W. Hong, “Poisson shot-noise process based flow-level traffic matrix generation for
data center networks,” in 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM),
May 2015, pp. 450–457.
[610] C. Delimitrou, S. Sankar, A. Kansal, and C. Kozyrakis, “ECHO: Recreating network traffic maps for
datacenters with tens of thousands of servers,” in 2012 IEEE International Symposium on Workload
Characterization (IISWC), Nov 2012, pp. 14–24.
[611] M. Noormohammadpour and C. S. Raghavendra, “Datacenter Traffic Control: Understanding Techniques
and Tradeoffs,” IEEE Communications Surveys Tutorials, vol. 20, no. 2, pp. 1492–1525, Secondquarter 2018.
[612] K. Chen, C. Hu, X. Zhang, K. Zheng, Y. Chen, and A. V. Vasilakos, “Survey on routing in data centers:
insights and future directions,” IEEE Network, vol. 25, no. 4, pp. 6–10, July 2011.
[613] R. Rojas-Cessa, Y. Kaymak, and Z. Dong, “Schemes for Fast Transmission of Flows in Data Center
Networks,” IEEE Communications Surveys Tutorials, vol. 17, no. 3, pp. 1391–1422, thirdquarter 2015.
[614] J. Qadir, A. Ali, K. A. Yau, A. Sathiaseelan, and J. Crowcroft, “Exploiting the Power of Multiplicity: A
Holistic Survey of Network-Layer Multipath,” IEEE Communications Surveys Tutorials, vol. 17, no. 4, pp. 2176–
2213, Fourthquarter 2015.
[615] J. Zhang, F. R. Yu, S. Wang, T. Huang, Z. Liu, and Y. Liu, “Load Balancing in Data Center Networks: A
Survey,” IEEE Communications Surveys Tutorials, pp. 1–1, 2018.
[616] Y. Zhang and N. Ansari, “On Architecture Design, Congestion Notification, TCP Incast and Power
Consumption in Data Centers,” IEEE Communications Surveys Tutorials, vol. 15, no. 1, pp. 39–64, First 2013.
[617] M. Alizadeh, T. Edsall, S. Dharmapurikar, R. Vaidyanathan, K. Chu, A. Fingerhut, V. T. Lam, F. Matus,
R. Pan, N. Yadav, and G. Varghese, “CONGA: Distributed Congestion-aware Load Balancing for Datacenters,”
SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 503–514, Aug. 2014.
[618] X. Wu and X. Yang, “DARD: Distributed Adaptive Routing for Datacenter Networks,” in 2012 IEEE 32nd International Conference on Distributed Computing Systems (ICDCS), June 2012, pp. 32–41.
[619] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan,
“Data center TCP (DCTCP),” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, Aug. 2010.
[620] C. Raiciu, S. Barre, C. Pluntke, A. Greenhalgh, D. Wischik, and M. Handley, “Improving Datacenter
Performance and Robustness with Multipath TCP,” in Proceedings of the ACM SIGCOMM 2011 Conference,
ser. SIGCOMM ’11. New York, NY, USA: ACM, 2011, pp. 266–277.
[621] B. Vamanan, J. Hasan, and T. Vijaykumar, “Deadline-aware Datacenter TCP (D2TCP),” in Proceedings of
the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for
Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: ACM, 2012, pp. 115–126.
[622] C. Wilson, H. Ballani, T. Karagiannis, and A. Rowtron, “Better never than late: Meeting deadlines in
datacenter networks,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 50–61, Aug. 2011.
[623] M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker, “pFabric: Minimal
Near-optimal Datacenter Transport,” SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 435–446, Aug.
2013.
[624] C.-Y. Hong, M. Caesar, and P. B. Godfrey, “Finishing Flows Quickly with Preemptive Scheduling,” in
Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and
Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: ACM, 2012, pp. 127–138.
[625] D. Zats, T. Das, P. Mohan, D. Borthakur, and R. Katz, “DeTail: Reducing the Flow Completion Time Tail
in Datacenter Networks,” in Proceedings of the ACM SIGCOMM 2012 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY,
USA: ACM, 2012, pp. 139–150.
[626] M. Bari, R. Boutaba, R. Esteves, L. Granville, M. Podlesny, M. Rabbani, Q. Zhang, and M. Zhani, “Data
Center Network Virtualization: A Survey,” IEEE Communications Surveys Tutorials, vol. 15, no. 2, pp. 909–
928, Second 2013.
[627] V. D. Piccolo, A. Amamou, K. Haddadou, and G. Pujolle, “A Survey of Network Isolation Solutions for
Multi-Tenant Data Centers,” IEEE Communications Surveys Tutorials, vol. 18, no. 4, pp. 2787–2821,
Fourthquarter 2016.
[628] S. Raghul, T. Subashri, and K. R. Vimal, “Literature survey on traffic-based server load balancing using
SDN and open flow,” in 2017 Fourth International Conference on Signal Processing, Communication and
Networking (ICSCN), March 2017, pp. 1–6.
[629] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, and Y. Zhang, “SecondNet: A Data Center
Network Virtualization Architecture with Bandwidth Guarantees,” in Proceedings of the 6th International
COnference, ser. Co-NEXT ’10. New York, NY, USA: ACM, 2010, pp. 15:1–15:12.
[630] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, “Hedera: Dynamic Flow
Scheduling for Data Center Networks,” in Proceedings of the 7th USENIX Conference on Networked Systems
Design and Implementation, ser. NSDI’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 19–19.
[631] B. Heller, S. Seetharaman, P. Mahadevan, Y. Yiakoumis, P. Sharma, S. Banerjee, and N. McKeown,
“ElasticTree: Saving Energy in Data Center Networks,” in Proceedings of the 7th USENIX Conference on
Networked Systems Design and Implementation, ser. NSDI’10. Berkeley, CA, USA: USENIX Association, 2010,
pp. 17–17.
[632] M. Dayarathna, Y. Wen, and R. Fan, “Data Center Energy Consumption Modeling: A Survey,” IEEE
Communications Surveys Tutorials, vol. 18, no. 1, pp. 732–794, Firstquarter 2016.
[633] D. Çavdar and F. Alagoz, “A survey of research on greening data centers,” in 2012 IEEE Global
Communications Conference (GLOBECOM), Dec 2012, pp. 3237–3242.
[634] A. C. Riekstin, B. B. Rodrigues, K. K. Nguyen, T. C. M. de Brito Carvalho, C. Meirosu, B. Stiller, and M.
Cheriet, “A Survey on Metrics and Measurement Tools for Sustainable Distributed Cloud Networks,” IEEE
Communications Surveys Tutorials, vol. 20, no. 2, pp. 1244–1270, Secondquarter 2018.
[635] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The Cost of a Cloud: Research Problems in Data
Center Networks,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 1, pp. 68–73, Dec. 2008.
[636] W. Zhang, Y. Wen, Y. W. Wong, K. C. Toh, and C. H. Chen, “Towards Joint Optimization Over ICT and
Cooling Systems in Data Centre: A Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1596–
1616, thirdquarter 2016.
[637] K. Bilal, S. U. R. Malik, S. U. Khan, and A. Y. Zomaya, “Trends and challenges in cloud datacenters,”
IEEE Cloud Computing, vol. 1, no. 1, pp. 10–20, May 2014.
[638] C. Kachris and I. Tomkos, “Power consumption evaluation of all-optical data center networks,” Cluster
Computing, vol. 16, no. 3, pp. 611–623, 2013.
[639] K. Christensen, P. Reviriego, B. Nordman, M. Bennett, M. Mostowfi, and J. Maestro, “IEEE 802.3az: the
road to energy efficient ethernet,” IEEE Communications Magazine, vol. 48, no. 11, pp. 50–56, November 2010.
Taisir E. H. El-Gorashi
received the B.S. degree (first-class Hons.) in electrical and electronic engineering from the University of Khartoum, Khartoum, Sudan, in 2004, the M.Sc. degree (with distinction) in photonic and communication systems from the University of Wales, Swansea, UK, in 2005, and the Ph.D. degree in optical networking from the University of Leeds, Leeds, UK, in 2010. She is currently a Lecturer in optical networks in the School of Electrical and Electronic Engineering, University of Leeds. Previously, she held a postdoctoral research post at the University of Leeds (2010–2014), where she focused on the energy efficiency of optical networks, investigating the use of renewable energy in core networks, green IP over WDM networks with data centers, energy-efficient physical topology design, the energy efficiency of content distribution networks, distributed cloud computing, network virtualization, and big data. In 2012, she was a BT Research Fellow, during which she developed energy-efficient hybrid wireless-optical broadband access networks and explored the dynamics of TV viewing behavior and program popularity. The energy-efficiency techniques developed during her postdoctoral research contributed three of the eight carefully chosen core network energy-efficiency improvement measures recommended by the GreenTouch consortium for every operator network worldwide. Her work has led to several invited talks at GreenTouch, Bell Labs, the Optical Network Design and Modelling conference, the Optical Fiber Communications conference, the International Conference on Computer Communications, the EU Future Internet Assembly, the IEEE Sustainable ICT Summit, and the IEEE 5G World Forum, as well as collaborations with Nokia and Huawei.