Manage Single Cluster CN2

SUMMARY Learn how to perform life cycle management tasks in a single cluster installation.

Overview

The way that you manage a Kubernetes cluster does not change when CN2 is the CNI plug-in. Once CN2 is installed, CN2 components work seamlessly with other Kubernetes components to provide the networking infrastructure.

The Contrail controller is constantly watching and reacting to cluster events as they occur. When you add a new node, the Contrail data plane components are automatically deployed. When you delete a node, the Contrail controller automatically deletes networking resources associated with that node. CN2 works seamlessly with kubectl and other tools such as Prometheus and Grafana.

In addition to standard Kubernetes management tools, you can use tools and procedures that are specific to CN2. This section covers these tools and procedures.

Run Preflight and Postflight Checks in Release 23.1

Use this procedure to run preflight or postflight checks on all cluster nodes.

Preflight checks allow you to verify that your cluster nodes can support CN2. The checks test for resource capacity, kernel compatibility, network reachability, and other infrastructure requirements. You typically run preflight checks prior to installing CN2, but you can run these checks after installing CN2 as well.

Postflight checks allow you to verify that your CN2 installation is working properly. The checks test for status, pod-to-pod communication, API server reachability, and other basic functions. You run postflight checks after installing CN2.

You must create the ContrailReadiness controller prior to running this procedure. See Install ContrailReadiness Controller in Release 23.1.

  1. Locate the contrail-tools/contrail-readiness directory from the downloaded CN2 Tools package.
  2. If you haven't already done so, ensure you've populated the manifests with your repository login credentials. See Configure Repository Credentials for one way to do this.
  3. To run the preflight checks:
    You typically run preflight checks after you create the cluster but before you install CN2.
    Note:

    In a multi-cluster deployment, run preflight checks from the central cluster only.

  4. To run the postflight checks:
    You run postflight checks after you install CN2.
    Note:

    In a multi-cluster deployment, run postflight checks from the central cluster only.

  5. Read the preflight and postflight check results as applicable.

    Address any errors before proceeding.

    Note:

    The preflight and postflight checks do not automatically rerun after you've fixed any errors. The output will continue to show errors even after you've fixed them.

Back Up the Contrail Etcd Database in Release 23.1

Use this example procedure in release 23.1 to back up the Contrail etcd database.

Note:

The following steps refer to a Contrail controller node. A Contrail controller node is a worker node that is running a Contrail controller.

  1. Install etcdctl on all Contrail controller nodes.
    1. Log in to one of the Contrail controller nodes.
    2. Download etcd. This example downloads to the /tmp directory.
    3. Untar and move the etcdctl executable to a directory in your path (for example /usr/local/bin).
    4. Check that you've installed etcdctl.
    5. Repeat on all the Contrail controller nodes.
  2. Get a list of the contrail-etcd pods.
    Take note of the contrail-etcd pod names, the IP addresses, and the nodes they're running on. You'll need this information in the next few steps.
  3. Copy the etcd certificate and key files from the pods to the Contrail controller nodes.
    We run kubectl on the Contrail controller nodes in this step. We assume you've set up kubeconfig on these nodes in its default location (~/.kube/config).
    1. Pick a contrail-etcd pod (for example, contrail-etcd-0) and log in to the Contrail controller node that's hosting that pod.
    2. Copy the certificate and key files from that contrail-etcd pod to the hosting Contrail controller node.
      In this example, we copy the certificate and key files from the contrail-etcd-0 pod to ca.crt, tls.crt, and tls.key in the current directory on this Contrail controller node.
    3. Repeat for each contrail-etcd pod.
  4. Back up the etcd database on one of the Contrail controller nodes. You only need to back up the database on one node.
    1. Log back in to one of the Contrail controller nodes.
    2. Back up the etcd database.
      This example saves the database to /tmp/etcdbackup.db on this Contrail controller node. In the backup command, <etcd-pod-ip> is the IP address of the contrail-etcd pod on this node and <etcd-port> is the port that etcd is listening on (by default, 12379).
  5. Copy the database to a safe location.
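
As a minimal sketch of steps 1 through 4, the helper functions below wrap the curl, kubectl, and etcdctl calls involved. Several details are assumptions rather than values stated in this procedure: the contrail-etcd pods are taken to run in the contrail-system namespace, the etcd release is taken to be the linux-amd64 build, and the certificate directory inside the pod is site-specific, so it is passed in as an argument instead of being hard-coded. The function names are illustrative only.

```sh
# Namespace of the contrail-etcd pods (assumption; override if yours differs).
ETCD_NAMESPACE="${ETCD_NAMESPACE:-contrail-system}"

# Step 1: download an etcd release to /tmp and put etcdctl on the PATH.
# $1 = etcd release tag (choose one that is compatible with your cluster).
install_etcdctl() {
    ver="$1"
    tarball="/tmp/etcd-${ver}-linux-amd64.tar.gz"
    curl -fL -o "$tarball" \
        "https://github.com/etcd-io/etcd/releases/download/${ver}/etcd-${ver}-linux-amd64.tar.gz" &&
    tar -xzf "$tarball" -C /tmp &&
    sudo mv "/tmp/etcd-${ver}-linux-amd64/etcdctl" /usr/local/bin/ &&
    etcdctl version
}

# Step 2: list the contrail-etcd pods with their IP addresses and nodes.
list_etcd_pods() {
    kubectl get pods -n "$ETCD_NAMESPACE" -o wide | grep contrail-etcd
}

# Step 3: copy ca.crt, tls.crt, and tls.key from a pod to the current directory.
# $1 = pod name, $2 = certificate directory inside the pod (site-specific).
copy_etcd_certs() {
    pod="$1"
    cert_dir="$2"
    for f in ca.crt tls.crt tls.key; do
        kubectl exec -n "$ETCD_NAMESPACE" "$pod" -- cat "$cert_dir/$f" > "./$f" || return 1
    done
}

# Step 4: save a snapshot to /tmp/etcdbackup.db.
# $1 = etcd pod IP on this node, $2 = etcd port (defaults to 12379).
backup_etcd() {
    pod_ip="$1"
    port="${2:-12379}"
    etcdctl snapshot save /tmp/etcdbackup.db \
        --endpoints="$pod_ip:$port" \
        --cacert=ca.crt --cert=tls.crt --key=tls.key
}
```

On the node hosting contrail-etcd-0, for example, you might run copy_etcd_certs contrail-etcd-0 <cert-dir-in-pod> followed by backup_etcd <etcd-pod-ip>, run from the directory that holds the copied certificate files.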

Restore the Contrail Etcd Database in Release 23.1

Use this example procedure in release 23.1 to restore the Contrail etcd database from a snapshot on an Amazon EKS cluster.

Note:

The following steps refer to a Contrail controller node. A Contrail controller node is a worker node that is running a Contrail controller.

  1. Copy the snapshot you want to restore to all the Contrail controller nodes.
    The steps below assume you've copied the snapshot to /tmp/etcdbackup.db on all the Contrail controller nodes.
  2. Restore the snapshot.
    1. Log in to one of the Contrail controller nodes. In this example, we're logging in to the Contrail controller node that is hosting contrail-etcd-0.
    2. Restore the etcd database to the contrail-etcd-0 pod on this Contrail controller node.
      This creates a contrail-etcd-0.etcd directory on the node. In the restore command, --name=contrail-etcd-0 specifies that the command is restoring the database to contrail-etcd-0, --initial-cluster=... lists all the contrail-etcd members in the cluster, and --initial-advertise-peer-urls=... specifies the IP address and port number that the contrail-etcd-0 pod is listening on.
    3. Repeat for the other contrail-etcd pods on their respective Contrail controller nodes, substituting the --name and --initial-advertise-peer-urls values with the respective contrail-etcd pod name and IP address.
  3. Stop the contrail-etcd pods.
    This sets the replicas to 0, which effectively stops the pods.
  4. Replace contrail-etcd data with the data from the snapshot.
    1. SSH into one of the Contrail controller nodes.
    2. Replace the data. Recall that the restored snapshot data is in the contrail-etcd-<xxx>.etcd directory,
      where contrail-etcd-<xxx> is the name of the contrail-etcd pod on the Contrail controller node that you logged in to.
    3. Repeat for the other Contrail controller nodes.
  5. Start the contrail-etcd pods.
    This sets the replicas to 3, which effectively starts the pods.
  6. Restart the contrail-system apiserver and controller.
    Delete all the contrail-k8s-apiserver and contrail-k8s-controller pods. These pods will automatically restart.
  7. Restart the vrouters.
    Delete all the contrail-vrouter-nodes pods. These pods will automatically restart.
  8. Check that all pods are in the Running state.
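
The following is a hedged sketch of steps 2, 3, and 5 through 8, not the exact commands for this release. It assumes the contrail-etcd pods run in the contrail-system namespace and that their replica count can be changed with kubectl scale on whatever resource owns them; if your installation manages replicas through a custom resource, change the replica count there instead. The resource passed to scale_etcd, the namespaces, and the peer URLs are placeholders that you supply. Step 4 is left out because the etcd data directory location on the node is installation-specific.

```sh
# Namespace of the contrail-etcd pods (assumption; override if yours differs).
ETCD_NAMESPACE="${ETCD_NAMESPACE:-contrail-system}"

# Step 2: restore the snapshot for one member. etcdctl writes the restored
# data to <member-name>.etcd in the current directory.
# $1 = member name, $2 = that member's peer URL, $3 = full --initial-cluster list.
restore_snapshot() {
    etcdctl snapshot restore /tmp/etcdbackup.db \
        --name="$1" \
        --initial-cluster="$3" \
        --initial-advertise-peer-urls="$2"
}

# Steps 3 and 5: set the replica count (0 to stop the pods, 3 to start them).
# $1 = TYPE/NAME of the resource that owns the pods (placeholder), $2 = replicas.
scale_etcd() {
    kubectl scale -n "$ETCD_NAMESPACE" "$1" --replicas="$2"
}

# Steps 6 and 7: delete every pod in a namespace whose name matches a pattern.
# The owning controller recreates the pods automatically.
# $1 = namespace, $2 = pattern (for example contrail-k8s-apiserver).
delete_pods_matching() {
    ns="$1"
    pattern="$2"
    pods="$(kubectl get pods -n "$ns" -o name | grep "$pattern")" || return 1
    # Intentionally unquoted so that each pod name becomes its own argument.
    kubectl delete -n "$ns" $pods
}

# Step 8: list all pods so you can confirm they are Running.
show_pods() {
    kubectl get pods --all-namespaces -o wide
}
```

A typical sequence would be restore_snapshot on each Contrail controller node, scale_etcd <owner> 0, the data replacement in step 4, scale_etcd <owner> 3, and then delete_pods_matching for the contrail-k8s-apiserver, contrail-k8s-controller, and contrail-vrouter-nodes pods in their respective namespaces.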

Upgrade CN2

Use this procedure to upgrade CN2.

The Contrail controller consists of Deployments and StatefulSets, which are configured for rolling updates. During the upgrade, the pods in each Deployment and StatefulSet are upgraded one at a time. The remaining pods in that Deployment or StatefulSet remain operational. This enables Contrail controller upgrades to be hitless.

The CN2 data plane consists of a DaemonSet that runs a single vRouter pod on each node. During the upgrade procedure, the vRouter pod on a node is taken down and upgraded. Because of this, CN2 data plane upgrades are not hitless. If desired, migrate traffic off the node being upgraded prior to performing the upgrade.

You upgrade CN2 software by porting the contents of your existing manifests to the new manifests, and then applying the new manifests. All CN2 manifests must reference the same software version.

Note:

Before you upgrade, check to make sure that each node has at least one allocatable pod available. The upgrade procedure temporarily allocates an additional pod, which means that your node cannot be running at maximum pod capacity when you perform the upgrade. You can check pod capacity on a node by using the kubectl describe node command.
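
As an alternative to reading the kubectl describe node output by eye, the sketch below compares a node's allocatable pod count with the number of non-terminated pods scheduled on it. The function name is illustrative, and the count assumes that pods in the Succeeded or Failed phase do not occupy capacity.

```sh
# Print allocatable pods, pods in use, and remaining headroom for one node.
# $1 = node name.
node_pod_headroom() {
    node="$1"
    allocatable="$(kubectl get node "$node" -o jsonpath='{.status.allocatable.pods}')" || return 1
    in_use="$(kubectl get pods --all-namespaces -o name \
        --field-selector "spec.nodeName=$node,status.phase!=Succeeded,status.phase!=Failed" \
        | wc -l | tr -d ' ')"
    echo "$node allocatable=$allocatable in_use=$in_use free=$((allocatable - in_use))"
}
```

Run node_pod_headroom <node-name> for each node and confirm that the free value is at least 1 before you start the upgrade.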

  1. Download the manifests for the new release.
  2. Locate the (old) manifest(s) that you used to create the existing CN2 installation. In this procedure, we assume it's single_cluster_deployer_example.yaml and cert-manager.yaml.
  3. Port over any changes from the old manifest(s) to the new manifest(s).
    The new manifests can contain constructs that are specific to the new release. Identify all changes that you've made to the old manifests and copy them over to the new manifests. This includes repository credentials, network configuration changes, and other customizations.
    Note:

    If you have a large number of nodes, use node selectors to group your upgrades to a more manageable number.

  4. Upgrade CN2.

    The pods in each Deployment and StatefulSet will upgrade one at a time. The vRouter DaemonSet will go down and come back up.

  5. Use standard kubectl commands to check on the upgrade.

    Check the status of the nodes.

    Check the status of the pods.

    If some pods remain down, debug the installation as you normally do. Use the kubectl describe command to see why a pod is not coming up. A common error is a network or firewall issue preventing the node from reaching the Juniper Networks repository.
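
The steps above can be sketched with standard tools as follows. This is an illustration under assumptions: the manifest file names are the ones this procedure uses as its example, the order in which the two manifests are applied is a guess, and the directory arguments and function names are placeholders.

```sh
# Step 3: show what differs between an old manifest and its new counterpart,
# to help identify customizations that need porting.
# $1 = directory with old manifests, $2 = directory with new manifests, $3 = file name.
show_manifest_changes() {
    diff -u "$1/$3" "$2/$3"
}

# Step 4: apply the new manifests (after porting your changes into them).
# $1 = directory with the new manifests.
apply_cn2_manifests() {
    kubectl apply -f "$1/cert-manager.yaml" &&
    kubectl apply -f "$1/single_cluster_deployer_example.yaml"
}

# Step 5: check node status, pod status, and pods that are not Running.
# The last command also lists pods that completed successfully.
check_upgrade() {
    kubectl get nodes
    kubectl get pods --all-namespaces -o wide
    kubectl get pods --all-namespaces --field-selector 'status.phase!=Running'
}

# Step 5: see why a particular pod is not coming up.
# $1 = namespace, $2 = pod name.
why_pod_down() {
    kubectl describe pod -n "$1" "$2"
}
```

In the why_pod_down output, the Events section at the end is usually the quickest place to spot image pull failures caused by repository reachability or credential problems.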

Uninstall CN2 in Release 23.1

Use this procedure to uninstall CN2. You must install the ContrailReadiness controller prior to running this procedure. See Install ContrailReadiness Controller in Release 23.1.

This tool removes the following:

  • contrail namespace and resources that belong to that namespace
  • contrail-system namespace and resources that belong to that namespace
  • contrail-deploy namespace and resources that belong to that namespace
  • default-global-vrouter-config and default-global-system-config

Note:

Since there are interdependencies between CN2 components, don't try to delete CN2 components individually. The provided tool uninstalls CN2 components gracefully and in the proper sequence.

  1. Locate the contrail-tools/contrail-readiness directory from the downloaded CN2 Tools package.
  2. If you haven't already done so, ensure you've populated the manifests with your repository login credentials. See Configure Repository Credentials for one way to do this.
  3. If you've installed Contrail Analytics, uninstall it now. The uninstall script does not uninstall resources in namespaces other than those listed above.
  4. Delete any other resources and namespaces (for example, overlay networks) that you created after you installed CN2.
  5. Uninstall CN2.
  6. Query the uninstall results.
  7. Finally, delete the contrail-readiness namespace.
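
As a small sketch for the end of this procedure, the helpers below confirm that the namespaces listed earlier are gone and then perform step 7. They rely only on standard kubectl commands; the function names are illustrative, and the commands for steps 5 and 6 are not reproduced here.

```sh
# Print any of the CN2 namespaces that still exist after the uninstall.
# Prints nothing when all three have been removed.
cn2_namespaces_remaining() {
    for ns in contrail contrail-system contrail-deploy; do
        if kubectl get namespace "$ns" >/dev/null 2>&1; then
            echo "$ns"
        fi
    done
}

# Step 7: delete the contrail-readiness namespace.
delete_readiness_namespace() {
    kubectl delete namespace contrail-readiness
}
```

If cn2_namespaces_remaining prints a namespace, review the uninstall results before you delete the contrail-readiness namespace.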