Deploy DPDK vRouter for Optimal Container Networking

DPDK Overview

Cloud-Native Contrail® Networking supports the Data Plane Development Kit (DPDK). DPDK is an open-source set of libraries and drivers for rapid packet processing. Cloud-Native Contrail Networking accelerates container networking with DPDK vRouter technology. DPDK enables fast packet processing by allowing network interface cards (NICs) to transfer packets by direct memory access (DMA) directly into an application's address space. The application then polls for packets, which avoids the overhead of NIC interrupts.

Utilizing DPDK enables the Cloud-Native Contrail vRouter to process more packets per second than it can when running as a kernel module. Cloud-Native Contrail Networking leverages this processing power of the DPDK vRouter to support high-demand container service functions.

When you provision a Contrail compute node with DPDK, the corresponding YAML file specifies the following (see the example after this list):

  • Number of CPU cores to use for forwarding packets.

  • Number of huge pages to allocate for DPDK.

  • UIO driver to use for DPDK.
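
The following sketch illustrates the kind of provisioning values involved. It is only an illustration: the apiVersion, field names, and values are assumptions and vary by release; the point is the three kinds of settings listed above.

    apiVersion: dataplane.juniper.net/v1alpha1   # assumption; check your release
    kind: Vrouter
    metadata:
      name: contrail-vrouter-nodes
    spec:
      agentModeType: dpdk
      cpu_core_mask: "4-7"      # CPU cores used for forwarding packets
      huge_pages_1G: 4          # number of 1-GB huge pages for DPDK (field name is an assumption)
      uio_driver: vfio-pci      # UIO driver used by DPDK (field name is an assumption)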

DPDK vRouter Support for DPDK and Non-DPDK Workloads

When a container or pod needs access to the DPDK vRouter, one of the following workload types applies:

  1. Non-DPDK workload (pod): This workload includes non-DPDK pod applications that are unaware of the underlying DPDK vRouter. These applications are not designed for DPDK and do not use DPDK capabilities. In Cloud-Native Contrail Networking, this workload type functions normally in a DPDK vRouter-enabled cluster.
  2. Containerized DPDK workload: These workloads are built on the DPDK platform. DPDK interfaces are brought up using vHost protocol, which acts as a datapath for management and control functions. Pods act as the vHost Server, and the underlying DPDK vRouter acts as the vHost Client.
  3. Mix of Non-DPDK and DPDK workloads: The management or control channel on an application in this pod might be non-DPDK (Veth pair), and the datapath might be a DPDK interface.

Non-DPDK Pod Overview

A virtual Ethernet (Veth) pair plumbs the networking of a non-DPDK pod. One end of the Veth pair attaches to the pod's namespace. The other end attaches to the kernel of the host machine. The Container Networking Interface (CNI) establishes the Veth pair and allocates IP addresses using IP Address Management (IPAM).
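
As an illustration of this plumbing, the following commands approximate by hand what the CNI does when it sets up a non-DPDK pod. The namespace name, interface names, and address are hypothetical.

    ip netns add pod-ns                                   # pod's network namespace
    ip link add veth-host type veth peer name eth0-pod    # create the Veth pair
    ip link set eth0-pod netns pod-ns                     # move one end into the pod namespace
    ip link set veth-host up                              # the other end stays in the host kernel
    ip netns exec pod-ns ip addr add 10.244.1.10/24 dev eth0-pod   # IPAM-assigned address
    ip netns exec pod-ns ip link set eth0-pod up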

DPDK Pod Overview

A DPDK pod contains a vhost interface and a virtio interface. The pod uses the vhost interface for management purposes and the virtio interface for high-throughput packet processing applications. A DPDK application in the pod uses the vhost protocol to establish communication with the DPDK vRouter in the host. The DPDK application receives an argument to establish a filepath for a UNIX socket. The vRouter uses this socket to establish the control channel, run negotiations, and create vrings over huge pages of shared memory for high-speed datapaths.
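
For illustration, a stock DPDK application such as dpdk-testpmd can be attached to a vhost-user UNIX socket through a virtio_user virtual device, which is the mechanism described above. The socket path and core list below are assumptions; in a real pod the platform supplies the socket path.

    dpdk-testpmd -l 0-1 --in-memory \
        --vdev=virtio_user0,path=/var/run/vrouter/uvh_vif_net1,server=1,queues=1 \
        -- --forward-mode=io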

Mix of Non-DPDK and DPDK Pod Overview

This pod might contain non-DPDK and DPDK applications. A non-DPDK application uses a non-DPDK interface (Veth pair), and the DPDK application uses the DPDK interfaces (vhost, virtio). These two workloads occur simultaneously.

DPDK vRouter Architecture

The Contrail DPDK vRouter is a container that runs inside the Contrail compute node. The vRouter runs as either a Linux kernel module or a user space DPDK process. The vRouter is responsible for transmitting packets between virtual workloads (tenants, guests) on physical devices. The vRouter also transmits packets between virtual interfaces and physical interfaces.

The Cloud-Native Contrail vRouter supports the following encapsulation protocols:

  • MPLS over UDP (MPLSoUDP)
  • MPLS over GRE (MPLSoGRE)
  • Virtual Extensible LAN (VXLAN)

Compared with the traditional Linux kernel deployment, deploying the vRouter as a user space DPDK process drastically increases the performance and processing speed of the vRouter application. This increase in performance is the result of the following factors:

  • The virtual network functions (VNFs) operating in user space are built for DPDK and designed to take advantage of DPDK’s packet processing power.
  • DPDK's poll mode drivers (PMDs) use the physical interface (NIC) of a VM's host instead of the Linux kernel's interrupt-based drivers. The NIC’s registers operate in user space, which makes them accessible by DPDK’s PMDs.

As a result, the Linux OS does not need to manage the NIC's registers. This means that the DPDK application manages all packet polling, packet processing, and packet forwarding of a NIC. Instead of waiting for an I/O interrupt to occur, a DPDK application constantly polls for packets and processes these packets immediately upon receiving them.

DPDK Interface Support for Containers

The benefits and architecture of DPDK usually optimize VM networking. Cloud-Native Contrail Networking lets your Kubernetes containers take full advantage of these features. In Kubernetes, a containerized DPDK pod typically contains two or more interfaces. The following interfaces form the backbone of a DPDK pod:

  • Vhost user protocol (for management): The vhost user protocol is a backend component that interfaces with the host. In Cloud-Native Contrail Networking, the vhost interface acts as a datapath for management and control functions between the pod and vRouter. This protocol comprises the following two planes:
    • The control plane exchanges information (memory mapping for DMA, capability negotiation for establishing and terminating the data plane) between a pod and vRouter through a Unix socket.
    • The data plane is implemented through direct memory access and transmits data packets between a pod and vRouter.
  • Virtio interface (for high-throughput applications): At a high level, virtio is a virtual device that transmits packets between a pod and vRouter. The virtio interface is a shared memory (shm) solution that lets pods access DPDK libraries and features.

These interfaces enable the DPDK vRouter to transmit packets between pods. The interfaces give pods access to advanced networking features provided by the vRouter (huge pages, lockless ring buffers, poll mode drivers). For more information about these features, visit A journey to the vhost-users realm.

Applications use DPDK to create vhost and virtio interfaces. The application or pod then uses DPDK libraries directly to establish control channels using Unix domain sockets. The interfaces establish datapaths between a pod and vRouter using shared memory vrings.

DPDK vRouter Host Prerequisites

In order to deploy a DPDK vRouter, you must configure huge pages, IOMMU, NIC drivers, and the PCI driver on the host node as follows:

  • Huge pages configuration: Specify the percentage of host memory to be reserved for the DPDK huge pages. The examples after this list show huge pages set at 2 MB and an allocation of four 1-GB huge pages plus 1024 2-MB huge pages.

    Note:

    We recommend that you use a huge page size of 1 GB.

  • Enable the input-output memory management unit (IOMMU): DPDK applications require IOMMU support. Configure IOMMU settings and enable IOMMU in the BIOS. Apply IOMMU flags as kernel boot parameters, as shown in the example after this list.
  • Ensure that the kernel driver is bound to port 0 of the host's NIC and that the DPDK PMD driver is bound to port 1 of the host's NIC.
    Note:

    In an environment where both DPDK and kernel drivers use different ports of a common NIC, we strongly recommend that you deploy a DPDK node with kernel drivers bound to port 0 on the NIC. We further recommend that you deploy DPDK PMD drivers bound to port 1 of that NIC. Other port assignment configurations might cause performance issues. For more information, see section 24.9.11 of the following DPDK documentation: I40E Poll Mode Driver.

  • PCI driver (vfio-pci, uio_pci_generic): Specify which PCI driver to use based on NIC type.
    Note:

    The vfio-pci driver is built into the kernel.

      • Manually load the uio_pci_generic module if needed, and verify that it is loaded, as shown in the example after this list.
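
The console sketches below illustrate the prerequisites in this list. They assume an x86 host whose kernel boot parameters are managed through GRUB; the values are examples, and your deployment might set them through different mechanisms.

    # Allocate 2-MB huge pages at runtime (the count is an example):
    sysctl -w vm.nr_hugepages=1024

    # Allocate four 1-GB huge pages and 1024 2-MB huge pages at boot by appending
    # these parameters to the kernel command line (for example, GRUB_CMDLINE_LINUX):
    default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024

    # Enable IOMMU with kernel boot parameters (Intel example; use amd_iommu=on on AMD hosts):
    intel_iommu=on iommu=pt

    # Load the uio_pci_generic module if needed, then verify that it is loaded:
    modprobe uio_pci_generic
    lsmod | grep uio_pci_generic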

Deploy a Kubernetes Cluster with DPDK vRouter in Compute Node

Cloud-Native Contrail Networking utilizes a DPDK deployer to launch a Kubernetes cluster with DPDK compatibility. This deployer performs lifecycle management functions and applies DPDK vRouter prerequisites. A custom resource (CR) for the DPDK vRouter is a subset of the deployer. The CR contains the following:

  • Controllers for deploying Cloud-Native Contrail Networking resources

  • Built-in controller logic for the vRouter

Apply the DPDK deployer YAML file, and deploy the DPDK vRouter CR with agentModeType: dpdk using the following commands:
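
The filenames below are placeholders for the deployer and CR manifests supplied with your release:

    kubectl apply -f contrail-deployer.yaml   # DPDK deployer YAML (example filename)
    kubectl apply -f vrouter-dpdk-cr.yaml     # vRouter CR with agentModeType: dpdk (example filename)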

After applying the CR YAML file, the deployer creates a daemonset for the vRouter. This daemonset spins up a pod with a DPDK container.

If you get an error message, ensure that your cluster has the custom resource definition (CRD) for the vRouter using the following command:
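
A generic way to check (the exact CRD name to look for depends on your release):

    kubectl get crds | grep -i vrouter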

The following is an example of the output you receive:
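
The output resembles the following. The CRD name and API group shown here are assumptions and depend on your release:

    NAME                             CREATED AT
    vrouters.dataplane.juniper.net   2022-06-14T09:30:27Z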

If no CRD is present in the cluster, check the deployer using the following command:
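
For example; the deployer's namespace and deployment name are placeholders to adjust for your installation:

    kubectl get deployments -A | grep -i contrail
    kubectl -n <contrail-namespace> get deployment <contrail-deployer> -o yaml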

Check the image used by the contrail-k8s-crdloader container. This image should be the latest image the deployer uses. Update the image and ensure that your new pod uses this image.
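
One way to inspect that image (the pod name and namespace are placeholders):

    kubectl -n <contrail-namespace> get pod <deployer-pod> \
        -o jsonpath='{.spec.containers[?(@.name=="contrail-k8s-crdloader")].image}'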

After you verify that your new pod is running the latest image, use the following command to verify that the CRD for the vRouter is present:
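
Again, a generic check:

    kubectl get crds | grep -i vrouter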

After you verify that the CRD for the vRouter is present, use the following command to apply the vRouter CR:
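
The filename is a placeholder for your vRouter CR manifest:

    kubectl apply -f vrouter-dpdk-cr.yaml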

DPDK vRouter Custom Resource Settings

You can configure the following settings in the vRouter's CR (see the example after this list):

  • service_core_mask: Specify a service core mask. The service core mask enables you to dynamically allocate CPU cores for services.

    You can enter the following input formats:

    • Hexadecimal (for example, 0xf)

    • List of CPUs separated by commas (for example, 1,2,4)

    • Range of CPUs separated by a dash (for example, 1-4)

    Note:

    PMDs require the bulk of your available CPU cores for packet processing. As a result, we recommend that you reserve a maximum of 1 to 2 CPU cores for service_core_mask and dpdk_ctrl_thread_mask. These two masks can share the same CPU cores.

  • cpu_core_mask: Specify a CPU core mask. DPDK's PMDs use these cores for high-throughput packet-processing applications.

    The following are supported input formats:

    • Hexadecimal (for example, 0xf)

    • List of CPUs separated by commas (for example, 1,2,4)

    • Range of CPUs separated by a dash (for example, 1-4)

  • dpdk_ctrl_thread_mask: Specify a control thread mask. DPDK uses these core threads for internal processing.

    The following are supported input formats:

    • Hexadecimal (for example, 0xf)

    • List of CPUs separated by commas (for example, 1,2,4)

    • Range of CPUs separated by a dash (for example, 1-4)

    Note:

    PMDs require the bulk of your available CPU cores for packet processing. As a result, we recommend that you reserve a maximum of 1 to 2 CPU cores for service_core_mask and dpdk_ctrl_thread_mask. These two masks can share the same CPU cores.

  • dpdk_command_additional_args: Specify DPDK vRouter settings that are not default settings. Arguments that you enter here are appended to the DPDK PMD command line.

    The following is an example argument: --yield_option 0
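
The following sketch shows where these settings might sit in the vRouter CR. The apiVersion and surrounding structure are assumptions; the setting names and value formats are the ones documented in this list.

    apiVersion: dataplane.juniper.net/v1alpha1    # assumption; check your release
    kind: Vrouter
    metadata:
      name: contrail-vrouter-nodes
    spec:
      agentModeType: dpdk
      service_core_mask: "0x3"                    # 1-2 cores reserved for service threads
      dpdk_ctrl_thread_mask: "0x3"                # control threads can share the same cores
      cpu_core_mask: "4-7"                        # PMD cores for high-throughput packet processing
      dpdk_command_additional_args: "--yield_option 0"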