

    Virtualization Overview

    In the MetaFabric 1.0 solution, all compute nodes run in a virtual environment built on the VMware ESXi 5.1 hypervisor. ESXi is the virtualization server: all virtual machines (guest operating systems) are installed on it, and it provides the foundation for building a reliable data center. ESXi, the vSphere Client, and vCenter Server are all components of the vSphere suite, with ESXi as the core component.

    To install, manage, and access the virtual machines that run on an ESXi server, you need another part of the vSphere suite: the vSphere Client or vCenter Server. The vSphere Client runs on a client machine and lets administrators connect to ESXi servers to access and manage virtual machines and perform other management tasks.

    The VMware vCenter Server is similar to the vSphere Client, but it is a server with more capabilities. vCenter Server is a centralized management application that lets you manage virtual machines and ESXi hosts centrally. It is installed on a Windows or Linux server; in this solution, it is installed on a Windows 2008 server that runs as a virtual machine (VM). The vSphere Client is used to access vCenter Server and, through it, manage ESXi servers (Figure 1). vCenter Server is required for enterprise features such as vMotion, VMware High Availability, VMware Update Manager, and VMware Distributed Resource Scheduler (DRS). For example, you can easily clone an existing virtual machine by using vCenter Server.

    Figure 1: VMware vSphere Client Manages vCenter Server Which in Turn Manages Virtual Machines in the Data Center


    In Figure 2, all compute nodes are part of a data center, and a VMware HA cluster is configured on them. All compute nodes run ESXi 5.1, which hosts all the data center VMs running business-critical applications. With the vSphere Client, you can connect either directly to ESXi hosts or to the vCenter Server; connecting to the vCenter Server lets you manage VMware enterprise features.

    A vSphere Distributed Switch (VDS) functions as a single virtual switch across all associated hosts (Figure 2). This enables you to set network configurations that span all member hosts, so virtual machines keep a consistent network configuration as they migrate between hosts. Each vSphere Distributed Switch is a network hub that virtual machines can use. It can forward traffic internally between virtual machines or link to an external network by connecting to physical Ethernet adapters, also known as uplink adapters.

    Each vSphere Distributed Switch can also have one or more dvPort groups assigned to it. A dvPort group gathers multiple ports under a common configuration and provides a stable anchor point for virtual machines connecting to labeled networks. Each dvPort group is identified by a network label that is unique within the current data center. A VLAN ID, which restricts port group traffic to a logical Ethernet segment within the physical network, is optional. VLANs, defined by the IEEE 802.1Q standard, allow a single physical LAN segment to be further segmented so that groups of ports are isolated from one another as if they were on physically different segments.

    Figure 2: VMware vSphere Distributed Switch Topology

    A vSphere distributed switch can be divided into two logical areas of operation: the data plane and the management plane. The data plane implements packet switching, filtering, and tagging. The management plane is the control structure that the operator uses to configure data plane functionality from the vCenter Server. The VDS eases the burden of managing each host's virtual switch separately by treating the network as an aggregated resource: individual host-level virtual switches are abstracted into one large VDS that spans multiple hosts at the data center level. In this design, the data plane remains local to each host, but the management plane is centralized.

    The first step in configuration is to create a vSphere distributed switch on a vCenter Server. After you have created a vSphere distributed switch, you must add hosts, create dvPort groups, and edit vSphere distributed switch properties and policies.

    With the distributed switch feature, VMware vSphere supports provisioning, administering, and monitoring of virtual networking across multiple hosts, including the following functionalities:

    • Central control of the virtual switch port configuration, port group naming, filter settings, and so on.
    • Link Aggregation Control Protocol (LACP) that negotiates and automatically configures link aggregation between vSphere hosts and access layer switches.
    • Network health-check capabilities to verify that the vSphere configuration is consistent with the physical network configuration.

    Additionally, the distributed switch functionality supports (Figure 2):

    • Distributed port — A port on a vSphere distributed switch that connects to a host’s VMkernel or to a virtual machine’s network adapter.
    • Distributed virtual port groups (dvPortgroups) — Port groups that specify port configuration options for each member port. A dvPortgroup is a set of distributed ports, and it inherits its configuration from the dvSwitch.
    • Distributed virtual uplinks (dvUplinks) — dvUplinks provide a level of abstraction for the physical NICs (vmnics) on each host.
    • Private VLANs (PVLANs) — PVLAN support enables broader compatibility with existing networking environments using the technology.
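
    The inheritance from dvSwitch to dvPortgroup described above can be pictured with a small model. This is only an illustrative sketch, not VMware's implementation: the class names, the setting keys (`load_balancing`, `vlan_id`), and the override mechanism are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DistributedSwitch:
    """Switch-level defaults, configured centrally (hypothetical model)."""
    name: str
    defaults: Dict[str, object] = field(default_factory=dict)


@dataclass
class DistributedPortGroup:
    """A named set of distributed ports that share one configuration."""
    name: str
    switch: DistributedSwitch
    overrides: Dict[str, object] = field(default_factory=dict)

    def effective_setting(self, key: str) -> Optional[object]:
        # A port group override wins; otherwise the value is inherited
        # from the distributed switch. Unknown keys yield None.
        if key in self.overrides:
            return self.overrides[key]
        return self.switch.defaults.get(key)


# Illustrative values only; the setting keys are assumptions.
vds = DistributedSwitch("dvSwitch", {"load_balancing": "ip_hash", "vlan_id": None})
pg = DistributedPortGroup("PG-XCHG-104", vds, {"vlan_id": 104})
```

    Here `pg.effective_setting("vlan_id")` comes from the port group itself, while `pg.effective_setting("load_balancing")` falls back to the switch-level default.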

    Figure 3: VMware vSphere Distributed Switch Topology


    Figure 3 shows two compute nodes running ESXi 5.1 with multiple VMs deployed on the ESXi hosts. Notice that both physical compute nodes are running VMs in this topology, and the vSphere distributed switch (VDS) is virtually extended across all ESXi hosts managed by the vCenter Server. The configuration of the VDS is centralized on the vCenter Server.

    A LAG bundle is configured between the access switches and ESXi hosts. As mentioned in the compute node section, an RSNG configuration is required on the QFX3000-M QFabric systems.

    ESXi 5.1 supports LACP for the LAG. LACP can be enabled only through the Web interface of the vCenter Server.

    Note: Link Aggregation Control Protocol (LACP) can only be configured via the vSphere Web Client.

    Configuring LACP

    To enable or disable LACP on an uplink port group:

    Note: All port groups using the Uplink Port Group enabled with LACP must have the load-balancing policy set to IP hash load balancing, network failure detection policy set to link status only, and all uplinks set to active.

    1. Log in to the vCenter Server Web interface (vSphere Web Client) on port 9443.

      Figure 4: Log In to vCenter Server

    2. Select vCenter under the Home radio button from the left tab.

      Figure 5: vCenter Web Client

    3. Click Networking under vCenter on the left side.

      Figure 6: Click Networking

    4. Locate an Uplink Port Group in the vSphere Web Client. To locate an uplink port group:
      1. Select a distributed switch and click the Related Objects tab.

        Figure 7: Click Related Objects

      2. Click Uplink Port Groups and select an uplink port group from the list.

        Figure 8: Click Uplink Ports and Select a Port

    5. Select the dvSwitch-DVUplinks and click settings from the Actions tab.
    6. Click Edit.
    7. In the LACP section, use the drop-down box to enable or disable LACP.

      Figure 9: Enable LACP Mode

    8. When you enable LACP, a Mode drop-down menu appears with these options:
      • Active — The port is in an active negotiating state, in which the port initiates negotiations with remote ports by sending LACP packets.
      • Passive — The port is in a passive negotiating state, in which the port responds to LACP packets it receives but does not initiate LACP negotiation.

      Set the mode to Passive (the port only responds to LACP negotiation) or Active (the port initiates LACP negotiation). The default setting is Passive.

      Note: Step 8 is optional.

    9. Click OK.
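
    The teaming prerequisites in the note above, and the Active/Passive behavior described in step 8, can be expressed as a simple pre-check. The following is a minimal sketch under stated assumptions: the `PortGroupTeaming` record, its field names, and the string values are hypothetical and do not correspond to a vSphere API. The negotiation rule follows from the mode descriptions: because a passive port never initiates, at least one side must be active.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PortGroupTeaming:
    """Hypothetical snapshot of a port group's teaming policy."""
    name: str
    load_balancing: str        # assumed value for IP hash: "ip_hash"
    failure_detection: str     # assumed value: "link_status_only"
    active_uplinks: List[str] = field(default_factory=list)
    standby_uplinks: List[str] = field(default_factory=list)


def lacp_policy_violations(pg: PortGroupTeaming) -> List[str]:
    """Return the LACP prerequisites that the port group does not meet."""
    problems = []
    if pg.load_balancing != "ip_hash":
        problems.append("load balancing must be IP hash")
    if pg.failure_detection != "link_status_only":
        problems.append("failure detection must be link status only")
    if pg.standby_uplinks or not pg.active_uplinks:
        problems.append("all uplinks must be active")
    return problems


def lag_negotiates(local_mode: str, remote_mode: str) -> bool:
    """True if LACP negotiation can start between the two ends."""
    modes = {"active", "passive"}
    if local_mode not in modes or remote_mode not in modes:
        raise ValueError("mode must be 'active' or 'passive'")
    # A passive port only responds, so one end has to be active.
    return "active" in (local_mode, remote_mode)
```

    For example, leaving the host side at the default passive mode works only if the access switch side of the LAG is configured as active.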

    Configuring VMware Clusters, High Availability, and Distributed Resource Scheduler

    VMware clusters enable the management of multiple host systems as a single, logical entity, combining standalone hosts into a single virtual device with pooled resources and higher availability. VMware clusters aggregate the hardware resources of individual ESXi hosts but manage the resources as if they resided on a single host. When you power on a virtual machine, it can be given resources from anywhere in the cluster, rather than from a specific physical ESXi host.

    VMware high availability (HA) allows virtual machines running on a failed host to be restarted automatically using other host resources in the cluster. VMware HA continuously monitors all ESXi hosts in a cluster and detects failures. The VMware HA agent placed on each host maintains a heartbeat with the other hosts in the cluster. Each server sends heartbeats to the other servers in the cluster at 5-second intervals. If a server loses heartbeat over three consecutive heartbeat intervals, VMware HA initiates the failover action of restarting all affected virtual machines on other hosts. VMware HA also monitors whether sufficient resources are available in the cluster at all times so that virtual machines can be restarted on different physical hosts in the event of host failure. Safe restart of virtual machines is made possible by the locking technology in the ESXi storage stack, which allows multiple ESXi hosts to have simultaneous access to the same virtual machine files.
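
    The heartbeat rule above amounts to a timeout of 3 × 5 = 15 seconds. As a rough sketch of that arithmetic (not the actual HA agent logic), the function below flags hosts whose last heartbeat is at least three intervals old. The function name, the host names, and the choice of an inclusive comparison are assumptions for illustration.

```python
from typing import Dict, List

HEARTBEAT_INTERVAL_S = 5      # heartbeat interval stated in the text
MISSED_BEFORE_FAILOVER = 3    # consecutive missed intervals before failover


def failed_hosts(last_heartbeat: Dict[str, float], now: float) -> List[str]:
    """Return hosts whose most recent heartbeat is three or more intervals old.

    last_heartbeat maps a host name to the time (in seconds) at which its
    last heartbeat was received; now is the current time on the same clock.
    """
    timeout = HEARTBEAT_INTERVAL_S * MISSED_BEFORE_FAILOVER
    return sorted(
        host for host, seen in last_heartbeat.items() if now - seen >= timeout
    )
```

    With this rule, a host last heard from 14 seconds ago is still considered alive, while one silent for 15 seconds would have its virtual machines restarted elsewhere.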

    VMware Distributed Resource Scheduler (DRS) automatically provides initial virtual machine placement and makes automatic resource relocation and optimization decisions as hosts are added to or removed from the cluster. DRS also optimizes based on virtual machine load, managing resources when the load on individual virtual machines goes up or down. VMware DRS also makes cluster-wide resource pools possible.

    For more information on configuration of VMware HA clusters, see:

    VMware vSphere 5.1 HA Documentation

    The MetaFabric 1.0 solution utilized VMware clusters in both POD1 and POD2. Below are overview screenshots that illustrate the use of clusters in the solution.

    The MetaFabric 1.0 solution test bed contains three clusters: Infra (Figure 10), POD1 (Figure 11), and POD2 (Figure 12). All clusters are configured with HA and DRS.

    Figure 10: Infra Cluster Hosts Detail


    Figure 11: POD1 Cluster Hosts Detail


    Figure 12: POD2 Cluster Hosts Detail


    The Infra cluster (Figure 13) is running all VMs required to support the data center infrastructure. The Infra cluster is hosted on two standalone servers (IBM System x3750 M4). The VMs hosted on the Infra cluster are:

    • Windows 2K8 Server with vCenter Server VM
    • Windows 2K8 domain controller VM
    • Windows 2K8 SQL database server VM
    • Junos Space Network Director
    • Remote Secure Access (SA)
    • Firefly Host Management (also referred to as vGW Management)
    • Firefly Host SVM – Hosts (also referred to as vGW SVM – Hosts)
    • Windows 7 VM - For NOC (Jump station)

    Figure 13: INFRA Cluster VMs


    The POD1 cluster (Figure 14) hosts the VMs that run all enterprise business-critical applications in the test bed. POD1 is hosted on one IBM Flex pass-thru chassis and one 40-Gb CNA module chassis. POD1 contains the following applications/VMs:

    • Windows Server 2012 domain controller
    • Exchange Server 2012 CAS
    • Exchange Server 2012 CAS
    • Exchange Server 2012 CAS
    • Exchange Mailbox server
    • Exchange Mailbox server
    • Exchange Mailbox server
    • MediaWiki Server
    • vGW SVM - All compute nodes

    Figure 14: POD1 Cluster


    The POD2 cluster (Figure 15) hosts the VMs that run all enterprise business-critical applications in the test bed. POD2 has one IBM Flex pass-thru chassis and one 10-Gb CNA module chassis. POD2 contains the following applications/VMs:

    • Windows Server 2012 secondary domain controller
    • SharePoint Server (Web-front end, six total VMs)
    • SharePoint Application Server (two of these)
    • SharePoint Database Server
    • vGW SVM – All compute nodes

    Figure 15: POD2 Cluster


    Configuring VMware Enhanced vMotion Compatibility

    VMware Enhanced vMotion Compatibility (EVC) configures a cluster and its hosts to maximize vMotion compatibility. Once enabled, EVC will ensure that only hosts that are compatible with those in the cluster can be added to the cluster. This solution uses the Intel Sandy Bridge Generation option for enhanced vMotion compatibility that supports the baseline feature set.
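
    The admission rule described above can be thought of as a subset check: a host can join only if its CPU offers every feature in the cluster's baseline. The sketch below illustrates that idea only. The feature names are examples chosen for illustration and are not the actual definition of the Intel Sandy Bridge EVC baseline, and the function names are hypothetical.

```python
from typing import Iterable, List


def missing_features(host_features: Iterable[str], baseline: Iterable[str]) -> List[str]:
    """Return baseline CPU features that the host does not provide."""
    return sorted(set(baseline) - set(host_features))


def can_join_evc_cluster(host_features: Iterable[str], baseline: Iterable[str]) -> bool:
    """A host is admissible only if it supports the whole baseline."""
    return not missing_features(host_features, baseline)


# Illustrative feature sets (assumed, not taken from VMware documentation).
BASELINE = {"sse4.2", "aes", "avx"}
newer_host = {"sse4.2", "aes", "avx", "avx2"}
older_host = {"sse4.2"}
```

    In this model the newer host is admitted and presents only the baseline features to its VMs, while the older host is rejected because it lacks part of the baseline.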

    To configure a vSphere distributed switch on a vCenter Server, perform the following steps:

    • Add a vSphere distributed switch
    • Add hosts to a vSphere distributed switch
    • Add a distributed port group (dvPG) configuration

    For more details on configuration of VMware EVC, see:

    VMware vSphere 5.1 Documentation - Enable EVC on an Existing Cluster

    In the MetaFabric 1.0 solution, EVC is configured as directed in the link provided. A short overview of the configuration follows.

    Each ESXi host in the POD hosts multiple VMs and is part of a different port group. VMs running on the PODs include Microsoft Exchange, MediaWiki, Microsoft SharePoint, MySQL database, and Firefly Host (VM security). Because traffic is flowing to and from many different VMs, multiple port groups are defined on the distributed switch:

    • Infra = PG-INFRA-101
    • SharePoint = PG-SP-102
    • MediaWiki = PG-WM-103
    • Exchange = PG-XCHG-104
    • MySQL Database for SharePoint = PG-SQL-105
    • vMotion = PG-vMotion-106
    • Fault Tolerance = PG-Fault Tolerance-107
    • Exchange Cluster = PG-Exchange-Cluster-109
    • iSCSI POD1 = PG-STORAGE-108
    • iSCSI POD2 = PG-STORAGE-208
    • Network MGMT = PG-MGMT-800
    • Security (vGW) = PG-Security-801
    • Remote Access = PG-Remote-Access-810

    These port groups are configured as shown in Figure 16. In this scenario, a port group naming convention is used to make it easy to identify a VM and its function (for example, Exchange or SharePoint) and to map them to a VLAN ID. For instance, one VM running an Exchange application is connected to PG-XCHG-104, while another VM on the same ESXi host running a MediaWiki application is connected to PG-WM-103. The naming convention also identifies the VLAN ID to which the host belongs. For instance, PG-XCHG-104 uses VLAN ID 104 on the network. (The 104 in the name is the same as the host VLAN ID.) The use of different port groups and VLANs enables the use of vMotion, which in turn enables fault tolerance in the data center.
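
    Because the VLAN ID is the numeric suffix of each port group name, it can be recovered mechanically. The following sketch assumes the names follow the `PG-<label>-<vlan>` pattern seen in the list above; the function name and the regular expression are illustrative, not part of the solution.

```python
import re

# "PG-", a free-form label (may contain hyphens or spaces), then "-<digits>".
_PORT_GROUP_NAME = re.compile(r"^PG-(?P<label>.+)-(?P<vlan>\d+)$")


def vlan_id_from_port_group(name: str) -> int:
    """Extract the VLAN ID encoded as the numeric suffix of a port group name."""
    match = _PORT_GROUP_NAME.match(name)
    if match is None:
        raise ValueError(f"name does not follow PG-<label>-<vlan>: {name!r}")
    vlan = int(match.group("vlan"))
    # Valid 802.1Q VLAN IDs are 1 through 4094.
    if not 1 <= vlan <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan}")
    return vlan
```

    The label part is matched greedily, so names with embedded hyphens or spaces, such as PG-Exchange-Cluster-109 or PG-Fault Tolerance-107, still resolve to the trailing number.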

    Figure 16: Port Groups


    NIC teaming is also deployed in the solution. NIC teaming is a configuration of multiple uplink adapters that connect to a single switch to form a team. A NIC team can either share the load of traffic between physical and virtual networks among some or all of its members, or provide passive failover in the event of a hardware failure or a network outage. All the port groups (PG) except for iSCSI protocol storage groups are configured with a NIC teaming policy for failover and redundancy. All the compute nodes have four active adapters as dvUplink in the NIC teaming policy. This configuration enables load balancing and resiliency. The IBM Pure Flex System with a 10-Gb CNA card has two network adapters on each ESXi host. Consequently, that system has only two dvUplink adapters per ESXi host. Figure 17 is an example of one port group configuration. Other port groups are configured similarly (with the exception being the storage port group).

    Figure 17: Port Group and NIC Teaming Example


    Figure 18: Configure Teaming and Failover


    Note: An exception to the use of NIC teaming is an iSCSI port group. The iSCSI protocol does not support multi-channeling or bundling (LAG). When deploying iSCSI, use a single dvUplink instead of configuring four active dvUplinks. In this solution, QFX3000-M QFabric POD1 uses one port group (PG-STORAGE-108) and QFX3000-M QFabric POD2 uses another port group (PG-STORAGE-208). These port groups are connected to the storage array using the iSCSI protocol. Figure 18 shows the iSCSI port group (PG-STORAGE-108). PG-STORAGE-208 is configured in the same way.

    The VMkernel TCP/IP networking stack supports iSCSI, NFS, vMotion, and fault tolerance logging. The VMkernel port enables these services on the ESX server. Virtual machines run their own system TCP/IP stacks and connect to the VMkernel at the Ethernet level through standard and distributed switches. In ESXi, the VMkernel networking interface provides network connectivity for the ESXi host and handles vMotion and IP storage. Moving a virtual machine from one host to another is called migration. VMware vMotion enables the migration of active virtual machines with no down time.

    Management of iSCSI, vMotion, and fault tolerance is enabled by the creation of four virtual kernel adapters. These adapters are bound to their respective distributed port group. For more information on creating and binding virtual kernel adapters to distributed port groups, see:

    Mounting Storage Using the iSCSI Protocol

    To mount the storage using the iSCSI protocol, perform the following steps:

    • Create a single VMkernel adapter for iSCSI.
    • Change the port group policy for the iSCSI VMkernel adapter.
    • Bind iSCSI adapters with VMkernel adapters.
    • Set up Jumbo frames with iSCSI.
    • Configure dynamic discovery addresses for iSCSI adapters.
    • Re-scan storage on iSCSI adapters.

    Note: The ESXi host must have permission to access the storage array. This is discussed further in the storage section of this guide.

    For information about configuring and mounting an iSCSI storage connection to a virtual switch (either a standard vSwitch or a distributed switch), see:

    Configuring and Troubleshooting iSCSI Storage

    Figure 19 shows an example of an ESXi host deployed in POD1. The port groups PG-STORAGE-108 and PG-STORAGE-208 have been created for POD1 and POD2, respectively. (The example shows PG-STORAGE-108.) VMkernel is configured to use the subnet for hosts in POD1 and the subnet for hosts in POD2 to bind with the respective storage port group to access the EMC storage.

    • EMC storage iSCSI IP for POD1 = and
    • EMC storage iSCSI IP for POD2 = and

    As mentioned earlier, the iSCSI protocol does not support multichannel (LAG) but can support multipath, so you will see only one physical interface bound to the storage port group. To achieve multipath, a separate storage port group and network subnet are required to access the EMC storage as a backup link.
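
    The separate-subnet requirement for multipath can be checked with Python's standard `ipaddress` module. This is a small sketch only: the function name is hypothetical, and the addresses in the usage note are placeholders from the documentation range, not the storage addresses used in the solution.

```python
import ipaddress
from typing import Iterable


def on_distinct_subnets(interfaces: Iterable[str]) -> bool:
    """True if every interface (given as 'address/prefix') is in its own subnet."""
    networks = [ipaddress.ip_interface(item).network for item in interfaces]
    return len(set(networks)) == len(networks)
```

    For instance, `on_distinct_subnets(["192.0.2.10/25", "192.0.2.138/25"])` holds because the two placeholder VMkernel addresses fall in different /25 networks, whereas two addresses in the same /24 would not qualify as separate paths under this rule.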

    Figure 19: POD1 PG-STORAGE-108 Created for iSCSI


    Configuring Fault Tolerance

    VMware vSphere fault tolerance provides continuous availability for virtual machines by creating and maintaining a secondary VM that is identical to, and continuously available to replace, the primary VM in the event of a failure. The feature is enabled on a per-virtual-machine basis. The secondary virtual machine resides on a different host in the cluster and runs in virtual lockstep with the primary virtual machine. When a failure is detected, the secondary virtual machine takes the place of the primary one with the least possible interruption of service. Because the secondary VM is in virtual lockstep with the primary VM, it can take over execution at any point without interruption, thereby providing fault-tolerant protection.

    Figure 20: VMware Fault Tolerance


    The primary and secondary VMs continuously exchange heartbeats. This exchange allows the virtual machine pair to monitor the status of one another to ensure that fault tolerance is continually maintained. A transparent failover occurs if the host running the primary VM fails, in which case the secondary VM is immediately activated to replace the primary VM. A new secondary VM is started and fault tolerance redundancy is reestablished within a few seconds. If the host running the secondary VM fails, it is also immediately replaced. In either case, users experience no interruption in service and no loss of data. VMware vSphere HA must be enabled before you can power on fault tolerant virtual machines or add a host to a cluster that already supports fault tolerant virtual machines. Only virtual machines with a single vCPU are compatible with fault tolerance.
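
    The two prerequisites stated at the end of the paragraph above (vSphere HA enabled, and a single vCPU per VM) lend themselves to a quick eligibility check. The sketch below covers only those two conditions from the text and is not a complete list of fault tolerance requirements; the `VirtualMachine` record and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualMachine:
    """Hypothetical minimal description of a VM."""
    name: str
    vcpus: int


def fault_tolerance_blockers(vm: VirtualMachine, cluster_ha_enabled: bool) -> List[str]:
    """Return the reasons, from the two rules in the text, why FT cannot be used."""
    blockers = []
    if not cluster_ha_enabled:
        blockers.append("vSphere HA is not enabled on the cluster")
    if vm.vcpus != 1:
        blockers.append("VM must have exactly one vCPU")
    return blockers
```

    An empty result means neither of the two stated rules prevents fault tolerance from being turned on for the VM.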

    For configuration instructions for VMware fault tolerance, see:

    Preparing Your Cluster and Hosts for Fault Tolerance

    The MetaFabric 1.0 solution test bed features VMware fault tolerance (Figure 21). This was tested as part of the solution on the port group PG-Fault Tolerance-107. VMkernel is bound to this port group. Once fault tolerance is enabled on a VM, a secondary VM is automatically created.

    Fault tolerance is also enabled on the Windows domain controller VM (running on one compute node in the POD1 cluster).

    Figure 21: VMware Fault Tolerance on POD1


    Configuring VMware vMotion

    The VMware vMotion feature, part of vCenter Server, allows you to migrate running virtual machines from one physical machine to another with no perceivable impact to the end user (Figure 22). You can use vMotion to upgrade and repair servers without downtime or disruption, and to optimize resource pools dynamically, which improves the overall efficiency of a data center. To ensure successful migration and subsequent functioning of the virtual machine, you must respect certain compatibility constraints. Because all components of a machine (CPU, BIOS, storage disks, networking, and memory) are completely virtualized, the entire state of a virtual machine can be captured in a set of data files. Moving a virtual machine from one host to another is therefore essentially a data transfer between two hosts.

    Figure 22: VMware vMotion Enables Virtual Machine Mobility


    VMware vMotion benefits data center administrators in critical situations, such as:

    • Hardware maintenance: vMotion allows you to repair or upgrade the underlying hardware without scheduling any downtime or disrupting business operations.
    • Optimizing hardware resources: vMotion lets you move virtual machines away from failing or underperforming hosts.
      • This can be done automatically in combination with VMware Distributed Resource Scheduler (DRS). VMware DRS continuously monitors utilization across resource pools and allocates resources among virtual machines based on current needs and priorities. When virtual machine resources are constrained, DRS makes additional capacity available by migrating live virtual machines to a less-utilized host using vMotion.
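
    The idea of moving load from a constrained host to a less-utilized one can be illustrated with a deliberately simple selection rule. This is not the DRS algorithm, which weighs many more factors; the function name, the 0.8 threshold, and the single utilization figure per host are all assumptions made for the sketch.

```python
from typing import Dict, Optional


def pick_migration_target(
    host_utilization: Dict[str, float],
    source: str,
    threshold: float = 0.8,   # hypothetical "constrained" level
) -> Optional[str]:
    """Pick the least-utilized other host if the source host is constrained.

    host_utilization maps host names to utilization between 0.0 and 1.0.
    Returns None when the source is below the threshold or when no other
    host is less utilized than the source.
    """
    if host_utilization[source] < threshold:
        return None
    candidates = {h: u for h, u in host_utilization.items() if h != source}
    if not candidates:
        return None
    target = min(candidates, key=candidates.get)
    if candidates[target] >= host_utilization[source]:
        return None
    return target
```

    A real scheduler would also account for memory, reservations, affinity rules, and the cost of the migration itself.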

    The requirements for vMotion include:

    • Datastore compatibility: The source and destination hosts must use shared storage. You can implement this shared storage using a SAN or iSCSI. The shared storage can use VMFS or shared NAS. Disks of all virtual machines using VMFS must be available to both source and target hosts.
    • Network compatibility: vMotion itself requires a Gigabit Ethernet network. Additionally, virtual machines on source and destination hosts must have access to the same subnets, implying that network labels for each virtual Ethernet adapter should match. You should configure these networks on each ESXi host.
    • CPU compatibility: The source and destination hosts must have compatible sets of CPUs.
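
    The three requirements above can be combined into one pre-migration check. The sketch below is a simplified model under stated assumptions: the `Host` record and its fields are hypothetical, and CPU compatibility is reduced to comparing a single `cpu_family` label, which is far coarser than the real compatibility test.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Host:
    """Hypothetical summary of what a host can reach and run."""
    name: str
    datastores: FrozenSet[str]
    network_labels: FrozenSet[str]
    cpu_family: str


def vmotion_blockers(
    vm_datastores: Set[str], vm_networks: Set[str], src: Host, dst: Host
) -> List[str]:
    """Return the requirements from the list above that a migration would violate."""
    blockers = []
    # Datastore compatibility: VM disks must be visible to both hosts.
    if not vm_datastores <= (src.datastores & dst.datastores):
        blockers.append("VM disks are not on storage shared by both hosts")
    # Network compatibility: the destination needs the same network labels.
    if not vm_networks <= dst.network_labels:
        blockers.append("destination host lacks a network label used by the VM")
    # CPU compatibility: simplified here to an exact family match.
    if src.cpu_family != dst.cpu_family:
        blockers.append("source and destination CPUs are not compatible")
    return blockers
```

    An empty list means none of the three modeled requirements blocks the migration.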

    VMware vMotion is configured on all MetaFabric 1.0 hosts (Figure 23). vMotion uses a separate port group called PG-vMotion-106, and VMkernel is bound to this port group. Network and storage configuration is consistent across all hosts, which is a requirement for vMotion. Once vMotion is configured, active VMs can be moved to any available host where resources are free. DRS can also trigger vMotion if one of the ESXi hosts shows high resource utilization (CPU, memory). You can also trigger vMotion manually if you need to move a VM within the data center.

    Figure 23: VMware vMotion Configured in the Test Lab


    For more information on configuration of VMware vMotion, see:

    Creating a VMkernel port and enabling vMotion on an ESXi/ESX host

    Published: 2015-04-20