Abstract
A collection of virtual machines (VMs) interconnected by an overlay network that presents a layer 2 abstraction has proven to be a powerful, unifying abstraction for adaptive distributed and parallel computing in loosely-coupled environments. It is now feasible to allow VMs hosting high performance computing (HPC) applications to seamlessly bridge distributed cloud resources and tightly-coupled supercomputing and cluster resources. However, to achieve the application performance that the tightly-coupled resources are capable of, the overlay network must not introduce significant overhead relative to the native hardware. Current user-level tools, including our own existing VNET/U system, do not meet this requirement. In response, we describe the design, implementation, and evaluation of a virtual networking system that has negligible latency and bandwidth overheads in 1–10 Gbps networks. Our system, VNET/P, is directly embedded into our publicly available Palacios virtual machine monitor (VMM). VNET/P achieves native performance on 1 Gbps Ethernet networks and very high performance on 10 Gbps Ethernet networks. The NAS benchmarks generally achieve over 95% of their native performance on both 1 and 10 Gbps networks. We have further demonstrated that VNET/P can operate successfully over more specialized tightly-coupled networks, such as Infiniband and Cray Gemini. Our results suggest it is feasible to extend a software-based overlay network designed for computing at wide-area scales into tightly-coupled environments.
Notes
This may be expanded in the future. Currently, it has been sized to support the largest possible IPv4 packet size.
Acknowledgements
We would like to thank Kevin Pedretti and Kyle Hale for their efforts in bringing up Palacios and VNET/P under CNL on the Cray XK6. This project is made possible by support from the United States National Science Foundation (NSF) via grants CNS-0709168 and CNS-0707365, and by the Department of Energy (DOE) via grant DE-SC0005343. Yuan Tang’s visiting scholar position at Northwestern was supported by the China Scholarship Council.
Cite this article
Xia, L., Cui, Z., Lange, J. et al. Fast VMM-based overlay networking for bridging the cloud and high performance computing. Cluster Comput 17, 39–59 (2014). https://doi.org/10.1007/s10586-013-0274-7
DOI: https://doi.org/10.1007/s10586-013-0274-7