Improving High Performance Networking Technologies for Data Center Clusters
Grant, Ryan Eric
This dissertation demonstrates new methods for increasing the performance and scalability of high performance networking technologies for use in clustered computing systems, concentrating on the convergence of Ethernet and high-speed networking. It first discusses the motivation for improving high performance networking technologies and their importance to the viability of modern data centers. It then introduces the concepts of high performance networking in both commercial data center and high performance computing (HPC) contexts, and describes some of the most important challenges facing such networks in the future. It reviews the current relevant literature and discusses problems that are not yet solved. Through a study of existing high performance networks, the most promising features for future networks are identified. Sockets Direct Protocol (SDP) is shown to have unexpected performance issues for commercial applications, due to inefficiencies in handling large numbers of simultaneous connections. The first implementation of SDP over eXtended Reliable Connections is developed to reduce connection management overhead, demonstrating that the performance issues are related to protocol overhead at the SDP level. Datagram offloading for IP over InfiniBand (IPoIB) is found to work well. In the first work of its kind, hybrid high-speed/Ethernet networks are shown to resolve the issues of SDP underperformance and to demonstrate the potential of combining high-speed local area Remote Direct Memory Access (RDMA) technologies with Ethernet wide area networking in data centers. Given the promising results from these studies, a set of solutions to enhance Ethernet performance at the local and wide area network level is introduced, providing a scalable, connectionless, socket-compatible, fully RDMA-capable networking technology: datagram-iWARP.
A novel method of performing RDMA Write operations (called RDMA Write-Record) and RDMA Read operations over unreliable datagrams on Ethernet is designed, implemented and tested. The method is shown to be applicable in both scientific and commercial application spaces, and it can also be applied to other verbs-based networking interfaces such as InfiniBand. The newly proposed RDMA methods, both for send/recv and RDMA Write-Record, are supplemented with interfaces for both socket-based applications and Message Passing Interface (MPI) applications. An MPI implementation is adapted to support datagram-iWARP. Both scalability and performance improvements are demonstrated for HPC and commercial applications.