vSphere, the next generation of VMware Infrastructure 3, will include Cisco Systems Inc.'s Nexus 1000V virtual switch with every VMware ESX and ESXi 4.0 server, but you'll need a license key to unlock it. Two experts explain how Cisco's network switch can relieve central performance bottlenecks in a virtual data center.
The virtualization revolution steamrolls ahead, flattening former silos of IT functionality into aggregated resource pools of disk, CPU and RAM. A virtualized data center is far more agile than the physical model. It is more reliable, easier to manage and much less expensive to build and operate. But the revolution has a hitch: When supporting the expanded bandwidth, security and management that consolidated virtual machines (VMs) require, a network can become a bottleneck. The new Cisco Nexus switch family remedies these limitations and facilitates adoption of virtualization as the data center standard.
How virtualization changes data centers
Physical servers are typically so underutilized that most of the time they do nothing but take up rack space and consume electricity. In this physical environment, adding resiliency requires duplicate machines and clustering software, which requires still more rack space, power, network ports, UPS slices, maintenance, periodic upgrades and so on. Shared storage area network (SAN) storage and disaster recovery servers are expensive to implement and manage and, hence, typically relegated to only a small subset of so-called critical servers.
But with virtualization, a hypervisor abstracts an OS from the hardware and the applications from the OS. Virtual machines are load-balanced across a data center, essentially aggregating resources. Downtime from server failure becomes only a memory. Virtualizing data centers also alleviates cost and management burdens in disaster recovery, shared storage and desktop virtualization. Breaking down the former silos of IT functionality and enhancing hardware utilization creates substantial return on investment (ROI) with a short payback period. Still, the high utilization and dense infrastructure create a new set of challenges for network performance, security and management. Forrester Research reports that organizations commonly experience networks as a bottleneck once they have virtualized about 25% of production servers.
Virtualized data center infrastructure challenges
Physical servers are virtualized and consolidated onto VMware ESX host servers at ratios often as high as 50 to 1 on a four-CPU six-core box. Appliances such as backup servers or Web load-balancing appliances, which were traditionally physical, are increasingly deployed as virtual machines on these same hosts.
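As a rough check on the density that ratio implies, the arithmetic can be sketched in a few lines of Python. The figures come from the paragraph above; the variable names are illustrative only, and real sizing would also depend on memory, I/O and workload mix.

```python
# Figures taken from the text: a four-CPU, six-core host
# consolidating as many as 50 former physical servers.
cpu_sockets = 4
cores_per_socket = 6
vms_per_host = 50  # upper end of the consolidation ratio cited above

total_cores = cpu_sockets * cores_per_socket
vms_per_core = vms_per_host / total_cores

print(f"Physical cores per host: {total_cores}")
print(f"Virtual machines per core: {vms_per_core:.2f}")
```

At that ratio each physical core carries roughly two virtual machines, which is why the network and storage paths out of the host become the next constraint.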
Virtualization has become commonplace for even high-I/O and mission-critical servers such as SQL Server and Microsoft Exchange. This puts a strain not only on networks but also on storage. Since all virtual machines run on shared storage, it is imperative that a SAN is high-performing.
Originally, Fibre Channel SANs were the go-to choice to ensure adequate virtual data center storage performance. Fibre Channel provides 4 Gb or 8 Gb of throughput and deterministic packet transmission, so packets are guaranteed to reach their destination. But Fibre Channel requires a separate storage network comprising expensive host bus adapters (HBAs) and switches. It also means that IT staff comfortable only with Ethernet must learn to manage an additional protocol.
Lower-cost and more easily managed iSCSI storage technologies have become an increasingly popular alternative to Fibre Channel SANs. But iSCSI is subject to the speed of the underlying network, which is typically 1 Gb, and is further constrained by the nondeterministic nature of Ethernet: packets are not guaranteed to reach their destination. Dropped packets can lead to data corruption or storage time-outs. A separate and isolated storage Ethernet fabric adds some protection, but it also adds cost and complexity while still leaving a measure of vulnerability.
VMware ESX hosts, configured in clusters, enable automatic virtual machine failover should a host server crash. Consequently, all former physical servers now compete for a limited number of ports to access both high-performance clustering and storage along with the Ethernet fabric. This bandwidth requirement further escalates as hundreds or thousands of desktops are virtualized and moved into a data center.
Consolidating servers and storage makes cable management considerably more difficult as well. For instance, a single four-CPU ESX host might have eight Ethernet connections along with four Fibre Channel connections. Multiply this by 30 servers, and the cabling issues become significant to the point of generating concern about airflow for cooling. The industry has rallied around blade servers as a way to eliminate dedicated Ethernet and Fibre Channel cables along with dedicated uplink ports. But virtual interconnects add another layer of virtualization that can degrade performance on both Fibre Channel and Ethernet networks. Blade servers also entail their own challenges, such as proprietary chassis that limit scalability, difficulties in troubleshooting and a higher cost model than standalone servers. (For more, see Virtualization and the Cisco Nexus switch combine to kill blade servers.)
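To put a number on the cabling example, here is a minimal Python sketch using only the figures quoted above (eight Ethernet and four Fibre Channel connections per host, 30 hosts). The variable names are illustrative.

```python
# Per-host connection counts and host count from the example in the text.
ethernet_per_host = 8
fibre_channel_per_host = 4
hosts = 30

cables_per_host = ethernet_per_host + fibre_channel_per_host
total_cables = cables_per_host * hosts

print(f"Cables per host: {cables_per_host}")
print(f"Cables for {hosts} hosts: {total_cables}")
```

Twelve cables per host works out to 360 cable runs for the 30-server example, each needing a switch port at the other end.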
VMware's virtual switch
While the VMware virtual switch has achieved much faster inter-virtual machine communication on an ESX host, it limits the enforcement of security best practices, which in turn affects the environments that can be virtualized. The VMware virtual switch is rudimentary: it is a basic Layer 2 switch that does not route, does not support quality of service (QoS) and provides no visibility into individual virtual machines.
Because virtual machines running on an ESX host communicate with one another without going through the physical network, administrators cannot apply traditional security practices, such as access control lists and private VLANs. Administrators also have no easy way to isolate traffic to an individual virtual machine, and even if they manage to, a VMotion or Distributed Resource Scheduler (DRS) migration leaves them in the dark once again. Managing virtual switches is painful because each one must be administered individually. Since the server administrator rather than the network team does the configuration, improper configurations and inconsistent policies often result.
The Nexus family of Cisco switches
A reputed billion dollars went into developing the Cisco Nexus 7000, and more than 1,500 patents were filed. The Nexus family consists of four product lines: the Nexus 1000V (available with the release of ESX 4, which is due out in late spring 2009), the Nexus 2000 fabric extender, the Nexus 5000 switch and the Nexus 7000 switch chassis.
The Nexus family of switches runs Cisco NX-OS, which brings the best of both Cisco Internetwork Operating System (IOS) and SAN-OS under a common code base. The Nexus 1000V runs inside a VMware ESX hypervisor and uses the VMware application programming interface to replace the existing VMware virtual switch. It provides the security and QoS capabilities commonly found in IOS, which expand the reach of what can be virtualized. In addition, network policies are created by the network administrator and applied by the server administrator in vCenter Server, which allows for consistent policy even in the event of VMotion or DRS.
Enhancing network bandwidth
The Nexus family was designed for high-speed data center Ethernet (DCE). While 10 Gb Ethernet is already here, the future will clearly feature 40 Gb and even 100 Gb. DCE is a new standard, promoted by Cisco, that enables lossless Ethernet. DCE combines the low cost of Ethernet with the deterministic reliability of Fibre Channel. Existing Fibre Channel SANs can connect to the Nexus family of switches and still maintain their defining deterministic qualities with a protocol dubbed Fibre Channel over Ethernet (FCoE), which is available today in the Nexus 5000.
The Nexus 5000 series allows a host to use a converged network adapter (CNA) that replaces separate Ethernet adapters and Fibre Channel HBAs in physical ESX hosts, thereby dramatically reducing cabling and network adapter requirements. iSCSI, Network File System (NFS) and Fibre Channel can coexist as part of a unified network fabric. Because a separate switching and cabling plant is no longer needed, the barrier to entry for Fibre Channel performance is removed. Nexus enables storage to truly become a function of the network for Fibre Channel, iSCSI and NFS protocols, leaving the architect to decide which will provide the best performance for a given environment.
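The scale of that cabling reduction can be illustrated with a short Python sketch. It reuses the earlier example of 30 hosts with eight Ethernet and four Fibre Channel connections each. The converged figure of two CNA ports per host is purely an assumption for illustration, not a number from Cisco or from this article; actual designs vary.

```python
def cable_count(hosts: int, ports_per_host: int) -> int:
    """Total cable runs for a group of identically cabled hosts."""
    return hosts * ports_per_host

HOSTS = 30                    # from the earlier cabling example
SEPARATE_PORTS = 8 + 4        # Ethernet plus Fibre Channel, from the text
CNA_PORTS_PER_HOST = 2        # ASSUMPTION: one redundant pair of converged ports

separate = cable_count(HOSTS, SEPARATE_PORTS)
converged = cable_count(HOSTS, CNA_PORTS_PER_HOST)
reduction = 1 - converged / separate

print(f"Separate fabrics: {separate} cables")
print(f"Converged fabric: {converged} cables")
print(f"Reduction: {reduction:.0%}")
```

Under that assumption, the 30-host example drops from 360 cables to 60. A different number of converged ports per host changes the result proportionally.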
Enhancing network management
The Nexus 1000V resolves the management limitations of the VMware virtual switch by opening a hypervisor to a Cisco-based switch architecture, enabling configuration of up to 64 ESX hosts on a single switch.
The Nexus 1000V looks, feels and acts just like any other Cisco NX-OS switch and gives a network administrator visibility into a VM's virtual network interface cards (NICs). The physical NICs of the ESX host also become managed Ethernet ports on the Nexus 1000V. Each ESX host looks like a line card in a larger switch, and the entire system is managed from the Cisco Virtual Supervisor Module (VSM).
The Cisco Nexus 1000V enables assignment of network characteristics to a particular server type through VMware's vCenter Server. In this manner, the correct security policies and QoS characteristics are automatically applied to a virtual machine. Further, a VM maintains its network characteristics as it "VMotions," or live-migrates, across ESX hosts.
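The idea that policy belongs to the virtual machine rather than to the host can be shown with a toy Python model. This is a conceptual sketch only: the class names, fields and the migrate function are hypothetical and do not represent the Nexus 1000V, NX-OS or vCenter interfaces.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PortProfile:
    """Hypothetical policy bundle defined once by the network team."""
    name: str
    vlan: int
    qos_class: str
    acl: str

@dataclass
class VirtualMachine:
    name: str
    profile: PortProfile  # policy is bound to the VM, not to the host

@dataclass
class EsxHost:
    name: str
    vms: dict = field(default_factory=dict)

def migrate(vm_name: str, source: EsxHost, target: EsxHost) -> VirtualMachine:
    """Move a VM between hosts; its profile travels with it unchanged."""
    vm = source.vms.pop(vm_name)
    target.vms[vm_name] = vm
    return vm

# The network administrator defines the profile once (values are made up).
web_profile = PortProfile("web-tier", vlan=100, qos_class="gold", acl="permit-http-only")

host_a = EsxHost("esx-a")
host_b = EsxHost("esx-b")
host_a.vms["web01"] = VirtualMachine("web01", web_profile)

moved = migrate("web01", host_a, host_b)
print(f"{moved.name} on {host_b.name} keeps profile {moved.profile.name}")
```

Because the profile is attached to the virtual machine object, nothing has to be reconfigured on the destination host, which is the behavior the article attributes to the Nexus 1000V during a live migration.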
Cisco Nexus switches versus Cisco Catalyst switches
Cisco resellers, used to selling in terms of speeds and feeds, commonly promote the Nexus family as if it were the same as the Catalyst family of switches. While the Nexus family has desirable top-of-rack capabilities, this approach may not be compelling enough for CFOs to free budget dollars in today's constrained economic climate. Ultimately, the real power of the Nexus family is its game-changing ability to extend the network from a unified 10 Gb fabric all the way to a hypervisor. This fabric enables the high bandwidth, performance, reliability and management necessary for a successful virtualized data center.
ABOUT THE AUTHORS: Steve Kaplan is the vice president of the Data Center Virtualization Practice at INX and a VMware vExpert. Kaplan can be reached at firstname.lastname@example.org and followed on Twitter at http://www.twitter.com/roidude.
Gary Lamb is the senior director of the Data Center Virtualization Practice at INX and a member of the VMware Partner Technical Advisory Board. Lamb can be reached at email@example.com.