Understanding the Design of the Midsize Enterprise Campus Solution
The design of the midsize enterprise campus solution is guided by several high-level goals aimed primarily at solving the business problems presented by the proliferation of IP devices and applications in the enterprise. The solution design goals are to:
Support up to 10,000 users and devices
Ensure uninterrupted voice and video sessions by providing sub-second recovery from network failures
Provide secure, flexible access to the network while protecting critical data from unauthorized access
Provide a consistent and high quality of experience for applications, such as voice, video, and mission-critical data applications, through the use of a robust quality-of-service (QoS) feature set and policy
This section describes the overall solution design, the design considerations, and how the components of the design function together to meet the solution goals.
Basic Functional Design
Figure 1 shows the basic topology used in the midsize enterprise campus solution. This topology was chosen to provide a general and flexible example that can be modified to apply to different Enterprise vertical markets and physical facilities. The physical topology is typically based on several factors including availability of cable plant and layout of the building or campus.
Two physical locations are defined:
Location A—High-density location that serves as the campus network core: the core switches, edge devices, and services are located here.
Location B—Low- to medium-density location that is geographically separate from location A.
The topology follows the hierarchical design commonly used in today’s campus networks in which the campus LAN is divided into layers: access, aggregation, core, and edge. This type of design allows the components in each layer to assume a distinct role in the network. This in turn facilitates optimizing each component for its role and troubleshooting the network because it is easier to isolate problems. It also permits a modular approach, in which operational changes can be constrained to a subset of the network and in which design elements can be replicated for easy scaling as the network grows.
In addition, the solution design takes advantage of Juniper Networks® Virtual Chassis technology and the high port density of the EX9200/EX9250 switches to simplify and flatten the network. The use of Virtual Chassis greatly reduces the number of devices to be managed, and the high port density of the EX9200/EX9250 switches permits the core and aggregation layers to be collapsed in location A. The resulting network design also eliminates the need for the Spanning Tree Protocol (STP).
The access layer provides network connectivity to end-user devices, such as personal computers, VoIP phones, and printers, and connects wireless LAN (WLAN) access points to the network. Access switches typically reside in the wiring closets of each floor in a campus facility.
The access layer should provide end users with a consistent access experience regardless of their location or device. As the first layer of the network to provide user access control, the access layer plays a critical role in protecting the network from malicious attacks.
A well-designed access layer should provide:
High port density and high-bandwidth uplink ports—Access layer devices must provide high port density for client devices and high-bandwidth uplink ports to reduce the client-to-uplink subscription ratio.
Reliable connections with high quality of service—Access layer devices must support high availability through redundant hardware. They should also provide traffic management features. They must be able to classify, mark, and prioritize traffic in support of end-to-end quality of service (QoS). With the increasing use of multicast applications, support of multicast snooping is important to limit multicast packet flooding.
Secure access—Access layer devices must provide access control services, such as 802.1x, and integrate with security infrastructure services. They must support segmentation of traffic through VLANs. In addition, they must provide security from malicious attacks by supporting techniques such as DHCP snooping, dynamic ARP inspection, and IP source guard.
Simplified deployment and management—Because of the large number of devices deployed in the access layer, simplified management of the devices is a must. To simplify the deployment of IP phones, CCTV, and access points and to reduce capital expenditures, Power over Ethernet (PoE) is also required. When deploying PoE and PoE+, pay careful attention to the PoE power requirements of the powered devices and the overall PoE power budget of the access switch.
Scalability—The access layer must be able to scale flexibly to reduce capital and operating expenses as users and devices grow.
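The PoE budget check mentioned in the list above can be sketched as a worst-case sum. This is a minimal illustration, not a sizing tool: the per-port figures are the maximum power a port sources under IEEE 802.3af (15.4 W) and 802.3at (30 W), while the function name, the device mix, and the 740 W switch budget are hypothetical values chosen for the example.

```python
# Maximum power a switch port must source per standard, in watts
# (IEEE 802.3af for PoE, IEEE 802.3at for PoE+).
PSE_WATTS_BY_STANDARD = {"poe": 15.4, "poe+": 30.0}


def poe_budget_headroom(switch_budget_watts, powered_devices):
    """Return the PoE budget left after a worst-case allocation.

    powered_devices is a list of "poe" / "poe+" labels, one per device.
    A negative result means the switch budget is oversubscribed.
    """
    required = sum(PSE_WATTS_BY_STANDARD[d] for d in powered_devices)
    return switch_budget_watts - required


# Hypothetical wiring closet: 24 PoE phones and 8 PoE+ access points
# on a switch with an assumed 740 W PoE budget.
devices = ["poe"] * 24 + ["poe+"] * 8
headroom = poe_budget_headroom(740.0, devices)
print(f"Remaining PoE budget: {headroom:.1f} W")
```

Real powered devices often draw less than the class maximum, so a worst-case sum like this is conservative.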
For the access layer, the solution design uses EX Series Ethernet Switches. EX Series switches have the necessary port density, port types, and features to meet the access layer requirements. The access layer design also takes advantage of Juniper Networks Virtual Chassis technology, which allows interconnected switches to behave, operate, and be managed as a single device with high port density. The use of Virtual Chassis in the access layer:
Simplifies network management by reducing the number of managed devices by a factor of 4 to 10, depending on the models of switch being used
Enables the network to grow port count without increasing operational overhead
Provides control plane redundancy—one switch acts as the primary Routing Engine and another as the backup
Preserves bandwidth—inter-switch traffic is routed over the Virtual Chassis backplane at line rates for all packet sizes
The aggregation layer acts as a multiplexing point between the access layer and the campus network core. The aggregation layer combines a large number of smaller interfaces from the access switches into high-bandwidth trunk ports that can be more easily consumed by the core switch. The aggregation layer also provides Layer 3 routing services to the access layer.
Because all traffic to and from the access layer flows through the aggregation layer, the aggregation layer needs to provide high availability and resiliency, including hardware redundancy and the ability to upgrade the software while the devices are in service. It also must have high throughput, providing wire-rate forwarding and a nonblocking architecture.
Scalability is also a key consideration for the aggregation layer. Scale requirements increase linearly for every port added to the access layer. For example, if an access switch supports 10,000 MAC addresses and an aggregation switch consolidates 100 access switches, the MAC scale requirement for the aggregation switch is 1,000,000 MAC addresses (10,000 x 100).
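The arithmetic in this example can be written as a short calculation. The function name is hypothetical; the figures are the ones used in the paragraph above.

```python
def aggregation_mac_requirement(macs_per_access_switch, access_switch_count):
    """MAC table entries an aggregation switch needs in the worst case,
    where every access switch fills its own MAC table."""
    return macs_per_access_switch * access_switch_count


required = aggregation_mac_requirement(10_000, 100)
print(required)
```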
For location A, the solution design uses EX9200/EX9250 switches to aggregate the traffic from the access switches in location A. These switches have the feature set, high port density, and scalability that enable them to function simultaneously as aggregation switches and core switches. This allows the core and aggregation layers to be collapsed into a single set of devices in location A. Collapsing the core and aggregation layers has these advantages:
Decreased number of devices
Decreased complexity and management overhead
The EX9200/EX9250 switches are in a multichassis link aggregation (MC-LAG) configuration, which provides high availability and a nonblocking architecture for the aggregation layer by eliminating the need for STP. High Availability Design describes MC-LAG in more detail.
For location B, two EX4600 switches serve as aggregation switches. The EX9200/EX9250 switches in location A could aggregate the traffic from location B; however, this option would increase cabling costs. EX4600 switches deliver highly available, simple, and scalable 10 GbE connectivity in a compact and power-efficient platform. The switches are in a Virtual Chassis configuration to provide the required redundancy and nonblocking architecture.
The core layer is at the heart of the campus network—every network element ultimately converges at the core. The core is usually configured as a Layer 3 device that provides high-speed packet switching between multiple sets of aggregation and/or access devices and that connects them to the perimeter or WAN edge network. Essential design features of the core layer include:
High port speed (1 GbE, 10 GbE, 40 GbE) and the expected ability to support higher speeds in the future
High port density and the ability to scale to support future network expansion
High throughput, providing wire-rate forwarding and a nonblocking architecture
Robust Layer 2 and Layer 3 feature set
The EX9200/EX9250 switches used in the solution design support:
Any combination of 1 GbE, 10 GbE, and 40 GbE line cards, with support of expected future 100 GbE line cards
Up to 240 10 GbE ports at line-rate speeds
MC-LAG for high availability
Redundant power supplies, fan modules, control modules, and switching fabrics
Up to 1,000,000 unicast routes
Up to 256K firewall filters (ACLs)
Up to 1,000,000 MAC addresses
The edge layer is the gateway for remote access to the campus network. It handles all Internet traffic into and out of the campus network. As a result, the edge can be a choke point for Internet traffic, making high availability and redundancy a vital aspect of edge design.
In addition, the edge is the first line of defense against attacks coming from the Internet and must provide robust security.
The midsize enterprise campus solution uses the following devices to fulfill the requirements of the edge network:
The edge firewall provides perimeter security services such as traffic inspection, security policies, NAT, and IPsec. Given the edge firewall’s important role in protecting the campus network from attacks, the campus network must be designed so that all Internet traffic entering and exiting the campus network passes through the firewall.
The solution design uses two SRX650 Services Gateways that are clustered for node redundancy. An SRX650 Services Gateway supports up to 7.0 Gbps firewall, 1.5 Gbps IPsec VPN, and 900 Mbps IPS, making it suitable for a medium to large campus. The SRX650 gateways are physically connected to the core switch and edge routers, ensuring that edge traffic must pass through them.
The edge router connects the campus network to the Internet service provider. To support any edge interconnect offered, the edge router must support the IPv4, IPv6, ISO, and MPLS protocols. It must also support widely deployed routing protocols in campus networks, such as static routes, OSPF, OSPF-TE, OSPFv3, IS-IS, and BGP.
Because it is directly connected to the Internet, the edge router must provide the following security and tunneling features:
The ability to limit what type of traffic accesses the control plane and enforce packets per second (pps) limitations
The ability to police traffic, penalizing or discarding traffic that exceeds a set bandwidth
Support for granular access control lists that can match on Layer 2 through Layer 4 fields
Support for the following unicast reverse path forwarding modes: loose, strict, and VRF
Support for Secure Shell (SSH)
Support for IPsec, GRE, and IP tunneling
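The policing requirement in the list above is commonly implemented with a token bucket. The following is a minimal sketch of a single-rate policer that discards nonconforming packets. It is not the router's implementation; the class name, parameters, and the rate and burst values are assumptions for illustration.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets exceeding the configured rate are discarded."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last_time = 0.0

    def allow(self, packet_bytes, now):
        """Return True if the packet conforms, False if it must be dropped."""
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last_time
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last_time = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical policer: 8 kbps (1000 bytes per second) with a 1500-byte burst.
policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
first = policer.allow(1500, now=0.0)    # conforms: uses the full burst
second = policer.allow(1500, now=0.0)   # exceeds: the bucket is empty
third = policer.allow(1000, now=1.0)    # conforms: 1000 bytes refilled
```

A policer that penalizes rather than discards would mark the excess packet (for example, with a higher loss priority) instead of returning False.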
To resolve IP address conflicts and bridge IPv6 islands, the edge router must support a wide variety of NAT techniques such as:
NAT44 (static translation of source IPv4 without port mapping)
NAT64 (translation of IPv6 addresses to IPv4 addresses and vice versa)
NAPT44 and NAPT66 (static translation of source IPv4 and IPv6 addresses with port mapping)
Twice NAT44 (static translation of both source and destination IPv4 addresses)
NAPT-PT (bind addresses in an IPv6 network with addresses in an IPv4 network and vice versa)
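To make the difference between address-only translation and translation with port mapping concrete, the following sketch models a toy NAPT44 table in which several private hosts share one public IPv4 address. The class and method names are hypothetical, the addresses come from documentation ranges, and ports are allocated on first use for brevity; a static deployment would preload the mappings instead.

```python
class Napt44:
    """Toy NAPT44 table: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Translate the source of an outbound packet."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outbound[key])

    def translate_in(self, public_port):
        """Translate the destination of a returning packet, or None if unmapped."""
        return self.inbound.get(public_port)


nat = Napt44("203.0.113.1")
a = nat.translate_out("10.0.0.5", 40000)
b = nat.translate_out("10.0.0.6", 40000)
```

Here two hosts that use the same private source port are kept apart by distinct public ports, which address-only NAT44 cannot do with a single public address.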
Finally, because the edge router is the ingress and egress point of the campus, edge routers must support account data collection, such as average traffic flows and statistics or the number of bytes or packets received and transmitted per application.
The midsize enterprise campus solution uses two MX240 3D Universal Edge Routers in a redundant MC-LAG configuration to meet the edge routing requirements. Because the MX240 router offers dual Routing Engines and unified in-service software upgrade (ISSU) at a reasonable price point, it is the preferred option over the smaller MX80 router.
High Availability Design
The midsize enterprise campus solution is designed to provide users with uninterrupted network access during hardware or software failures. Voice and video users have particularly demanding requirements—for a user to perceive no loss of service, the network must recover from failures in less than 1 second. In addition, users expect 24x7 access to the network—downtime because of planned maintenance must be minimal. This section discusses how the solution design meets these requirements.
Control Plane Redundancy
The midsize enterprise campus solution is designed so that all devices in the wired network have redundant control planes. The techniques to achieve redundancy vary according to the type of device:
Dual Routing Engines in a single physical device—Each EX9200/EX9250 switch and MX240 router has two Routing Engines. One Routing Engine acts as the primary Routing Engine for the switch, while the other acts as the backup Routing Engine. If the primary Routing Engine fails, the backup Routing Engine takes over.
Virtual Chassis—All access switches and the paired aggregation switches in location B are in Virtual Chassis configurations. In a Virtual Chassis, one member acts as the primary Routing Engine for the Virtual Chassis, while another member acts as the backup Routing Engine. The remaining members take a line card role. If the Virtual Chassis primary member fails, the backup takes over.
Chassis cluster—The SRX650 Services Gateways achieve control plane redundancy by being in a chassis cluster configuration. In a chassis cluster configuration, one of the gateways acts as the primary Routing Engine. If the Routing Engine in the primary gateway fails, the Routing Engine in the standby gateway takes over.
High Availability Software
The following high availability software features are enabled on the switches and routers:
Graceful Routing Engine switchover (GRES)—When GRES is enabled on switches and routers, the backup Routing Engine automatically synchronizes with the primary Routing Engine to preserve kernel state information and forwarding state. This synchronization enables the backup Routing Engine to continue to forward traffic, if the primary Routing Engine fails, without having to relearn routes or port states.
Nonstop active routing (NSR) and nonstop bridging (NSB)—NSR and NSB prevent service interruptions during the brief period when the backup Routing Engine takes over from a failed primary switch or router. Normally, the absence of the primary device would cause routing and switching protocols to begin the process of reconverging network paths to route around what they determine to be a failed device. NSR and NSB prevent such a reconvergence from occurring, thus maintaining service continuity.
Although you can use GRES with graceful protocol restart instead of NSR, this solution uses NSR because it can result in faster convergence after a control plane failure, supports unified in-service software upgrades, and does not rely on helper routers to assist in restoring routing protocol information.
On the SRX Series gateways, which do not support NSR, graceful protocol restart is enabled.
All traffic traveling in and out of the campus flows through the core and edge layers. It is therefore essential that these layers do not have a single point of failure. The solution design uses redundant devices at each of these layers so that, if one device fails, the other device can continue to forward traffic. The following techniques are used to achieve node redundancy at the core and edge:
Multichassis link aggregation (MC-LAG) configuration for edge routers and core switches
Chassis cluster for edge firewalls
Multichassis Link Aggregation Design
MC-LAG is configured on the core switches and edge routers to provide node redundancy at the core level. MC-LAG supports link aggregation groups (LAGs) that are spread across more than one device. Thus, if one of the switches fails, the other switch continues to forward the traffic on its MC-LAG link.
The client device—the device on the other end of the MC-LAG—does not need to be aware of MC-LAG. From its perspective, it is connecting to a single device through a LAG.
To support LAGs across two devices, both devices in an MC-LAG configuration must be able to synchronize their Link Aggregation Control Protocol (LACP) configurations, learned MAC addresses, and Address Resolution Protocol (ARP) entries. MC-LAG uses the following mechanisms to do so:
Inter-chassis Control Protocol (ICCP)—Control plane protocol that synchronizes configurations and operational states between two MC-LAG peers. It uses TCP as a transport protocol and requires Bidirectional Forwarding Detection (BFD) for fast convergence.
Interchassis link (ICL)—Layer 2 link that is used to replicate forwarding information across peers.
Figure 2 illustrates the MC-LAG configuration that is used by the core switches.
MC-LAG Design Considerations
MC-LAG can be configured in active/standby mode, in which only one device actively forwards traffic, or in active/active mode, in which both devices actively forward traffic. Figure 3 illustrates the difference between active/standby and active/active.
This solution uses active/active as the preferred mode for the following reasons:
Traffic is load-balanced in active/active mode, resulting in link-level efficiency of 100 percent.
Convergence is faster in active/active mode than in active/standby mode. In active/active mode, information is exchanged between devices during operations. After a failure, the operational switch or router does not need to relearn any routes and continues to forward traffic.
It enables you to configure Layer 3 protocols on integrated routing and bridging (IRB) interfaces, providing a hybrid Layer 2 and Layer 3 environment on the core switch.
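The load balancing described above relies on the client device spreading flows across the member links of its LAG. The sketch below shows the general idea with a flow hash. The hash inputs, the use of CRC32, and the interface names are assumptions; actual hashing is platform specific.

```python
import zlib


def pick_member_link(src_ip, dst_ip, src_port, dst_port, member_links):
    """Map a flow to one LAG member link.

    Hashing on flow fields keeps all packets of a flow on the same link,
    which preserves packet order while spreading flows across links.
    """
    flow_key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return member_links[zlib.crc32(flow_key) % len(member_links)]


# Hypothetical LAG with one member link to each MC-LAG peer.
links = ["xe-0/0/0 (to core-1)", "xe-0/0/1 (to core-2)"]
chosen = pick_member_link("10.1.1.10", "10.2.2.20", 51000, 443, links)

# If core-2 fails, the LAG shrinks and every flow hashes onto the surviving link.
after_failure = pick_member_link("10.1.1.10", "10.2.2.20", 51000, 443, links[:1])
```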
MC-LAG is used in conjunction with the Virtual Router Redundancy Protocol (VRRP) both on the core switches and on the edge routers. VRRP permits redundant routers to appear as a single virtual router to the other devices. In a VRRP implementation, each VRRP peer shares a common virtual IP address and virtual MAC address in addition to its unique physical IP address and MAC address. Thus, each IRB configured on the core switches must have a virtual IP address.
Typically, VRRP implementations are active/passive implementations, in which only one peer forwards traffic while the other peer is in standby. However, in the Junos® operating system (Junos OS), the VRRP forwarding logic has been modified when both VRRP and active/active MC-LAG are configured. In this case, both VRRP peers forward traffic and load-balance the traffic between them. As shown in Figure 4, data packets received by the backup peer on the MC-LAG member link are forwarded by the backup peer rather than being sent to the primary peer for forwarding.
Firewall Chassis Cluster Design
SRX Series Services Gateways achieve node redundancy through chassis clustering. In the solution design, two SRX650 gateways are clustered to provide stateful failover of processes, services, and traffic flow.
Creating a chassis cluster requires configuring the following interfaces on the SRX650 Services Gateways:
Control link—Link between the cluster nodes that transmits session state, configuration, and aliveness signals.
Fabric link—Link between the cluster nodes that transmits network traffic between the nodes and synchronizes the data plane software’s dynamic runtime state.
Redundant Ethernet interface—Virtual interface that is active on one node at a time and can fail over to the other node. Each redundant Ethernet (reth) interface consists of at least one interface from each cluster node. The redundant Ethernet interface has its own MAC address, which is different from the physical interface MAC addresses of its members. When a redundant Ethernet interface fails over, the connecting devices are updated with the MAC address of the new physical interface in use. Because the redundant Ethernet interface continues to use the same virtual MAC address and IP address, Layer 3 operations continue to work with no need for user intervention.
Figure 5 illustrates the chassis cluster topology.
In this topology, two redundant Ethernet interfaces are configured:
reth0, which connects to the core switches
reth1, which connects to the edge routers
To increase redundancy and bandwidth, the redundant Ethernet interfaces are configured as redundant Ethernet LAGs, with two physical interfaces bundled into each LAG on each cluster node. These physical interfaces permit each cluster node to have a physical connection to each core switch and edge router.
Firewall Chassis Cluster Design Considerations
SRX Series chassis clusters support both active/active and active/backup clustering modes. Because the additional scale provided by active/active mode is not required by this solution, the design uses the simpler and more commonly implemented active/backup mode. In active/backup mode, only the LAG member links on the active cluster node are active and forward data traffic.
The active node uses gratuitous ARP to advertise to the connecting devices that it is the next-hop gateway. If a failover occurs, the backup node uses gratuitous ARP to announce that it is now the next-hop gateway. As a result, for failover to work, the redundant Ethernet interface members and their connecting interfaces on the other devices must belong to the same bridge domain, as shown in Figure 5. These bridge domains result in an OSPF broadcast network.
The core switches and edge routers must be configured with OSPF priorities of 255 and 254 to ensure that they are always elected the designated router and backup designated router for their bridge domain.
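The effect of these priorities can be illustrated with a simplified model of designated router election on a broadcast segment. This sketch covers only the initial election (highest priority wins, ties broken by the highest router ID, priority 0 is ineligible) and ignores the non-preemptive behavior of OSPF. The router IDs, the firewall priority of 128, and the assignment of 255 and 254 to particular switches are assumptions.

```python
def elect_dr_bdr(routers):
    """Return (dr_id, bdr_id) for a list of (router_id, priority) tuples.

    Simplified initial election: routers with priority 0 are ineligible,
    the highest priority wins, and ties are broken by the highest router ID.
    """
    eligible = [r for r in routers if r[1] > 0]
    ranked = sorted(
        eligible,
        key=lambda r: (r[1], tuple(int(octet) for octet in r[0].split("."))),
        reverse=True,
    )
    dr = ranked[0][0] if ranked else None
    bdr = ranked[1][0] if len(ranked) > 1 else None
    return dr, bdr


# Hypothetical segment between the core switches and the firewall cluster.
segment = [
    ("10.0.0.1", 255),  # core switch 1
    ("10.0.0.2", 254),  # core switch 2
    ("10.0.0.3", 128),  # firewall cluster, assumed priority
]
dr, bdr = elect_dr_bdr(segment)
```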
Switching and Routing Design
In the switching and routing design, the aggregation layer forms the boundary between Layer 2 and Layer 3, as illustrated in Figure 6.
The following summarizes the basic switching and routing design:
The devices in the access layer are configured as Layer 2 switches that forward user traffic on high-speed trunk ports to the aggregation layer.
The switches in the aggregation layer provide the boundary between Layer 2 and Layer 3. They are configured to provide Layer 2 switching on their downstream trunk ports to the access switches and Layer 3 routing on their upstream ports to the core. They act as the default gateways for the access devices.
The devices in the core and edge layers are primarily Layer 3 devices, routing traffic between the aggregation layer devices and between the internal campus network and the external Enterprise WAN and Internet.
Important considerations for the design of the switching network are:
Separation of Layer 2 Traffic
The access layer of the campus network provides network access to a wide variety of devices and users. The traffic generated by these devices and users often has different management or security requirements and thus needs to be separated. For example, voice traffic generated by VoIP phones requires different quality-of-service parameters than data traffic generated by laptops. Or users from the finance department might need to be granted access to a server that no other users can access.
Typically in campus networks, this traffic separation is achieved through the use of virtual LANs (VLANs) in the access and aggregation layers. Each organization will have its own requirements for separating user traffic using VLANs. In testing, this solution deployed a VLAN design that is optimized for management simplicity and that can be easily adapted to other organization environments. In the solution design, user traffic is separated into VLANs based on:
Traffic type—Voice and data traffic are carried on separate VLANs.
Department—Each functional group, or department, has its own VLAN. For example, there are different VLANs for Engineering, Marketing, Sales, Finance, and Executive personnel.
Access method—Wired traffic and wireless traffic are separated into different VLANs.
Wired data traffic is dynamically assigned to a port data VLAN as a result of the user authentication process.
For wired voice traffic, this solution takes advantage of the voice VLAN feature supported on EX Series switches. This feature enables otherwise standard access ports to accept both untagged (data) and tagged (voice) traffic and separate these traffic streams into separate VLANs. This in turn allows a VoIP phone and an end-host machine to share a single port while enabling the application of different quality-of-service parameters to the voice traffic.
In this solution, then, each user access port is associated with two VLANs—a data VLAN, which is dynamically assigned as a result of the authentication process, and a voice VLAN, which is statically configured on the port. A single voice VLAN can be used for all wired voice traffic because voice traffic typically has the same security requirements regardless of user role.
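The per-port behavior described above can be sketched as a small classification function. It is a simplified model, not switch code: the function name and VLAN IDs are hypothetical, and dropping frames tagged with any other VLAN is an assumption of the sketch.

```python
def classify_frame(vlan_tag, voice_vlan_id, data_vlan_id):
    """Return the VLAN a frame is placed in on a voice-VLAN-enabled access port.

    vlan_tag is None for untagged frames (the attached end host);
    the VoIP phone tags its frames with the voice VLAN ID.
    """
    if vlan_tag is None:
        return data_vlan_id      # untagged data traffic
    if vlan_tag == voice_vlan_id:
        return voice_vlan_id     # tagged voice traffic
    return None                  # any other tag is dropped in this sketch


# Hypothetical port: voice VLAN 200 is static, and data VLAN 110 was
# assigned to this port when the user authenticated.
laptop_vlan = classify_frame(None, voice_vlan_id=200, data_vlan_id=110)
phone_vlan = classify_frame(200, voice_vlan_id=200, data_vlan_id=110)
```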
Layer 2 Loop Prevention
In campus architectures, each access switch is typically connected to two aggregation switches for reliability and high availability. The aggregation switches in turn have a Layer 2 connection to each other. This topology can create a Layer 2 loop.
Traditionally, a Spanning Tree Protocol (STP) is used to prevent Layer 2 loops. STP exchanges information with other switches to prune specific redundant links, creating a loop-free topology with a single active Layer 2 data path between any two switches.
However, STP adds latency to the network. Although more recent versions of STP have reduced convergence time after a failure to a few seconds, STP still has not achieved the sub-second convergence that Layer 3 protocols have achieved. Real-time applications, such as voice or video, experience disruptions when STP is used in campus networks. In addition, STP results in inefficient use of network resources because it blocks all but one of the redundant paths.
To prevent the creation of Layer 2 loops, this solution uses MC-LAG in the aggregation layer of location A and Virtual Chassis in the aggregation layer of location B. Each technology creates a single virtual device from two or more physical devices. From the point of view of the connecting access switch, the switch has multiple links to a single device through an aggregated Ethernet interface. STP is unnecessary when these technologies are incorporated into the Layer 2 network. This improves network performance by reducing latency and improves network efficiency by enabling all links to forward traffic.
In this solution, Layer 3 routing starts at the aggregation layer. Figure 7 provides more detail on the routing design in the aggregation, core, and edge layers.
Elements of the routing design include:
Integrated Routing and Bridging
To provide Layer 3 routing capabilities for the user VLANs, the core switches in location A and the aggregation switches in location B are configured with integrated routing and bridging (IRB) interfaces on the user VLANs. IRB interfaces are also known as routed VLAN interfaces (RVIs). IRB interfaces:
Function as the gateway router IP addresses for the hosts on the VLAN subnet
Provide Layer 3 interfaces for routing traffic between VLANs
The core and aggregation switches advertise the network prefixes to the edge firewalls to allow the edge firewalls to provide services such as Network Address Translation (NAT) and encryption.
Interior Gateway Protocol
For the interior gateway protocol (IGP), we recommend the use of a link-state protocol such as IS-IS or OSPF rather than a distance vector protocol such as RIP. Although distance vector protocols are generally easier to configure and to maintain than link-state protocols, link-state protocols feature improved scaling and quicker convergence times, features that are critical in larger networks. The solution design uses OSPF because it is the most commonly used IGP in campus networks.
As shown in Figure 7, OSPF routing occurs between two major sections of the campus network: the perimeter and the core. These sections have differing traffic profiles and flows. Traffic traveling to or from the Internet must always pass through the perimeter, while local campus traffic stays entirely within the core.
To limit link-state advertisement (LSA) flooding to within each section, the solution implements two OSPF areas:
Area 0, the core section, contains the core switches and the location B aggregation switch.
Area 1, the perimeter section, contains the edge firewalls and routers.
All IRB and VRRP interfaces are configured as passive OSPF interfaces. This enables them to advertise their addresses into OSPF while preventing end devices from receiving LSAs and creating an adjacency with the core switches.
The edge routers have the responsibility of advertising external reachability to the other OSPF nodes. To provide Internet access to the campus network, the routers in this solution export a dynamic, condition-based, default route to the Internet into OSPF towards the edge firewalls and core switches. The edge routers export this default route only when they receive a route through an external BGP (EBGP) advertisement from the Internet service provider. If an edge router does not receive a route advertisement from its EBGP neighbor, it stops exporting the default route.
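The condition-based export described above reduces to a simple decision: advertise the default route into OSPF only while the expected route is learned through EBGP. The sketch below models that decision. The function name, the route representation, and the condition prefix are hypothetical; the source does not state which ISP route is used as the condition.

```python
def should_export_default(routing_table, condition_prefix):
    """Return True while the condition prefix is present as an EBGP route.

    When this returns False, the edge router withdraws the default route
    from OSPF so that traffic is drawn to the other edge router instead.
    """
    return any(
        route["prefix"] == condition_prefix and route["protocol"] == "ebgp"
        for route in routing_table
    )


# Hypothetical condition: a prefix the ISP is expected to advertise.
CONDITION_PREFIX = "198.51.100.0/24"

isp_up = [{"prefix": "198.51.100.0/24", "protocol": "ebgp"}]
isp_down = [{"prefix": "10.0.0.0/8", "protocol": "ospf"}]

export_when_up = should_export_default(isp_up, CONDITION_PREFIX)
export_when_down = should_export_default(isp_down, CONDITION_PREFIX)
```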
The following configuration is implemented on each OSPF node:
Authentication—MD5 authentication is enabled to prevent unauthorized or accidental adjacencies.
Reference bandwidth—OSPF uses a reference bandwidth to calculate the cost of using an interface. This reference bandwidth should be the same on all nodes. We recommend using a reference bandwidth large enough to accommodate expected near-future increases in Ethernet interface speeds. This solution uses a reference bandwidth of 1000 Gbps.
Loop-free alternate (LFA) feature—The LFA feature enables fast OSPF network restoration and convergence after network faults, which minimizes disruptions to real-time applications such as VoIP and video. It works by preprogramming the Packet Forwarding Engine with loop-free backup paths for known prefixes.
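To see why the reference bandwidth matters, recall that OSPF derives an interface cost by dividing the reference bandwidth by the interface bandwidth, with a minimum cost of 1. The sketch below compares the 1000 Gbps value used in this solution with a 100 Mbps reference, which is assumed here as a typical default. The function name is hypothetical.

```python
def ospf_cost(reference_bw_mbps, interface_bw_mbps):
    """OSPF interface cost: reference bandwidth / interface bandwidth, minimum 1."""
    return max(1, reference_bw_mbps // interface_bw_mbps)


SOLUTION_REFERENCE_MBPS = 1_000_000  # 1000 Gbps, as used in this solution
ASSUMED_DEFAULT_MBPS = 100           # assumed default for comparison

for name, speed_mbps in [("1 GbE", 1_000), ("10 GbE", 10_000),
                         ("40 GbE", 40_000), ("100 GbE", 100_000)]:
    print(name,
          ospf_cost(SOLUTION_REFERENCE_MBPS, speed_mbps),
          ospf_cost(ASSUMED_DEFAULT_MBPS, speed_mbps))
```

With the small reference value, every interface from 1 GbE upward collapses to a cost of 1, so OSPF cannot prefer a faster link. With 1000 Gbps, each speed gets a distinct cost.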
Network Address Translation
The SRX650 Services Gateways provide Network Address Translation (NAT) services. NAT protects the campus private address space by mapping the private IP addresses to routable, public IP addresses. For more information about the perimeter security features provided by the SRX650 Services Gateways, see Security Design.
Exterior Gateway Protocol
The solution design uses BGP4 as its exterior gateway protocol for Internet connectivity. The MX240 edge routers use external BGP (EBGP) to peer with the ISPs. In addition, the routers are configured to:
Use internal BGP (IBGP) to peer with each other and use a next-hop-self export policy.
Advertise the campus public IP address space to the external peers. To support redundancy, each router uses the same prefix for the campus public IP address to external peers.
Give ISP-1 a higher local preference because it is the preferred exit to the Internet.
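The effect of local preference on exit selection can be sketched as follows. Only the local preference comparison is modeled; later BGP tie-breakers are ignored. The function name, the second provider label (ISP-2), and the preference values are assumptions for illustration.

```python
def preferred_exit(routes):
    """Pick the route with the highest local preference.

    Later steps of BGP path selection (AS path length, origin, MED, and
    so on) are not modeled in this sketch.
    """
    return max(routes, key=lambda route: route["local_pref"])


# Hypothetical default routes learned from two providers.
routes = [
    {"isp": "ISP-1", "local_pref": 200},
    {"isp": "ISP-2", "local_pref": 100},
]
best = preferred_exit(routes)

# If the ISP-1 route is withdrawn, the remaining route is used.
fallback = preferred_exit([r for r in routes if r["isp"] != "ISP-1"])
```

Because the edge routers peer with each other over IBGP, both learn the local preference values and agree on the same exit.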
Bidirectional Forwarding Detection Protocol
To enable faster detection of link failures than the failure-detection mechanisms of OSPF and BGP deliver, this solution enables the Bidirectional Forwarding Detection (BFD) protocol on all OSPF and BGP links. The BFD protocol is a simple hello mechanism that works at the link level. A pair of routing devices exchanges BFD packets. Hello packets are sent at a specified, regular interval. A neighbor failure is detected when the routing device stops receiving a reply after a specified interval. Because the BFD failure detection timers have shorter time limits than the OSPF and BGP failure detection mechanisms, BFD provides faster detection of link failures.
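As a rough illustration of how BFD timers relate to the sub-second recovery goal, the detection time is approximately the negotiated packet interval multiplied by the detection multiplier. The function name and the timer values below are assumptions, not values taken from the tested solution.

```python
def bfd_detection_time_ms(interval_ms, multiplier):
    """Approximate BFD failure detection time: interval x multiplier."""
    return interval_ms * multiplier


# Hypothetical timers: 300 ms interval with a multiplier of 3.
detection_ms = bfd_detection_time_ms(300, 3)
meets_subsecond_goal = detection_ms < 1000
print(detection_ms, meets_subsecond_goal)
```

Detection is only one part of the recovery budget. The time needed to reroute traffic after the failure is detected adds to it.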
Multicast Routing and Snooping Design
An increasing number of applications in enterprise networks, such as audio and video conferencing, software distribution, stock quotes, and distance learning, use multicast forwarding. For the midsize enterprise campus solution, support for multicast is based on the most common protocols used in enterprise networks for multicast signaling, multicast group management, and Layer 2 multicast snooping.
Multicast Signaling Protocol
Protocol Independent Multicast (PIM) is used as the multicast routing protocol. It is the predominant multicast protocol used on the Internet.
PIM has several modes of operation, the most common of which are:
Dense mode (PIM-DM)—Uses a flood-and-prune mechanism to build a source-based distribution tree. A router receives the multicast traffic on the interface closest to the source and floods the traffic to all other interfaces. Routers with no multicast receivers must prune back unnecessary branches.
Sparse mode (PIM-SM)—Uses reverse path forwarding (RPF) to create a path from a multicast source to the multicast receiver when the receiver issues an explicit join request. A single router called a rendezvous point (RP) is initially selected in each multicast domain to be the connection point between multicast sources and interested receivers. Traffic flows are then rooted at the RP along the rendezvous-point tree. The rendezvous-point tree is later replaced by an optimized shortest-path tree.
Source-specific multicast (PIM-SSM)—Uses a subset of PIM sparse mode and IGMP version 3 (IGMPv3) to enable a receiver to receive multicast traffic directly from the source. PIM-SSM builds a shortest-path tree between the receiver and the source without the help of an RP.
Table 1 summarizes the pros and cons of each PIM mode.
Table 1: Comparison of PIM Modes
The solution design uses PIM-SM. By not requiring IGMPv3, PIM-SM enables better interoperability with vendor equipment that does not support IGMPv3 and simplifies configuration.
In implementing PIM-SM, you should choose your RPs carefully to improve the performance and fault-tolerance of the network. Table 2 lists the options for RP selection and compares them.
Table 2: Comparison of RP Selection Options
RP Selection Option
Bootstrap Router (BSR)
Because fault tolerance is a necessity in an enterprise network, we recommend that you use one of the dynamic methods of RP selection. The Bootstrap Router method is an industry standard, preferred by Junos OS. As a result, this solution uses the Bootstrap Router method.
Multicast Group Management Protocol
The Internet Group Management Protocol (IGMP) is used for the multicast group management protocol. IGMP manages multicast receiver groups for IPv4 multicast traffic. IGMP enables a router to detect when a host on a directly attached subnet, typically a LAN, wants to receive traffic from a certain multicast group. Even if more than one host on the LAN wants to receive traffic for that multicast group, the router sends only one copy of each packet for that multicast group out on that interface, because of the inherent broadcast nature of LANs. When IGMP informs the router that there are no interested hosts on the subnet, the packets are withheld and that leaf is pruned from the distribution tree.
There are three versions of IGMP, all of which are supported by Junos OS:
IGMP version 1 (IGMPv1)—In this, the original protocol, all multicast routers send periodic membership queries to an all-host group address. Hosts reply with explicit join messages, but the protocol uses a timeout to determine when hosts leave a group.
IGMP version 2 (IGMPv2)—In IGMPv2, an election process results in one router in a network sending membership queries. Group-specific queries are supported, and hosts can send explicit leave-group messages.
IGMP version 3 (IGMPv3)—In IGMPv3, hosts can specify the source from which they want to receive group multicast content. This means that IGMPv3 can be used with PIM-SSM to create a shortest-path tree between receiver and source.
Table 3 lists the pros and cons of each IGMP version.
Table 3: Comparison of IGMP Versions
Junos OS defaults to IGMPv2. Because this solution does not require PIM-SSM, the solution design uses IGMPv2. IGMPv2 can interoperate with devices running IGMPv1.
By default, a switch floods multicast traffic to all interfaces in a Layer 2 broadcast domain or VLAN. This behavior increases bandwidth consumption. By examining (snooping) IGMP messages between hosts and multicast routers, a switch can learn which hosts are interested in receiving traffic for a multicast group. Based on what it learns, the switch then forwards multicast traffic only to those interfaces in the VLAN that are connected to interested receivers instead of flooding the traffic to all interfaces.
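The difference between flooding and snooping-based forwarding can be sketched as a small membership table. This is a simplified model under stated assumptions: it tracks only host membership reports and leave messages for one VLAN, it ignores multicast-router ports and membership timeouts, and the class name, port names, and group address are hypothetical.

```python
from collections import defaultdict


class IgmpSnoopingTable:
    """Toy membership table for one VLAN, built from snooped IGMP messages."""

    def __init__(self, vlan_ports):
        self.vlan_ports = set(vlan_ports)
        self.members = defaultdict(set)  # group address -> interested ports

    def on_report(self, group, port):
        """A host on `port` reported membership in `group`."""
        self.members[group].add(port)

    def on_leave(self, group, port):
        """A host on `port` left `group`; forget empty groups."""
        self.members[group].discard(port)
        if not self.members[group]:
            del self.members[group]

    def egress_ports(self, group, ingress_port, snooping_enabled=True):
        """Ports that receive a multicast frame for `group`."""
        if snooping_enabled:
            ports = self.members.get(group, set())
        else:
            ports = self.vlan_ports  # default behavior: flood the VLAN
        return sorted(set(ports) - {ingress_port})


# Hypothetical port names and group address.
table = IgmpSnoopingTable(["ge-0/0/1", "ge-0/0/2", "ge-0/0/3", "ge-0/0/4"])
table.on_report("239.1.1.1", "ge-0/0/2")
print(table.egress_ports("239.1.1.1", "ge-0/0/1"))
print(table.egress_ports("239.1.1.1", "ge-0/0/1", snooping_enabled=False))
```

With snooping enabled, only the port with an interested receiver gets the traffic. Without it, every other port in the VLAN does.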
In this solution design, IGMP snooping is enabled on the Layer 2 devices in the access and aggregation layers, including the collapsed core/aggregation switches. In the MC-LAG configuration of the core switches, IGMP snooping membership information is automatically synchronized between the two core switches.
Networks are subject to attacks from various malicious sources. These attacks can be passive, where an intruder intercepts data traveling through the network, or active, where an intruder initiates commands to disrupt the normal operation of the network (for example, denial-of-service attacks or address spoofing). Security for a campus network involves preventing and monitoring unauthorized access, network misuse, unauthorized network modification, or attacks that result in the denial of network services or network accessible resources.
This section discusses the following elements of the security design of the midsize enterprise campus solution: access control, access port security, remote access security, and Internet edge security.
With the proliferation of user devices on the campus, effective access control should support role-based policy orchestration. Together, access control and policy orchestration must be able to:
Identify the user and the user’s role
Authenticate the user and authorize the user to access resources on the network
Identify the type of device, operating system, and ownership (corporate-owned or employee-owned)
Quarantine a device if necessary
Detect the location of the entry point and the traffic encryption requirements
For access control, the solution design uses the 802.1X port-based network access control standard in the access layer, which integrates with and supports role-based policy orchestration.
802.1X Network Access Control Protocol
EX Series switches support endpoint access control through the 802.1X port-based network access control standard. When 802.1X authentication is enabled on a port, the switch (known as the authenticator) blocks all traffic to and from the end device (known as a supplicant) until the supplicant’s credentials are presented and matched on an authentication server, typically a RADIUS server. After the supplicant is authenticated, the switch opens the port to the supplicant.
Figure 8 illustrates the authentication process. The supplicant and authenticator communicate with each other by exchanging Extensible Authentication Protocol (EAP) packets carried by the 802.1X protocol. The authenticator and the RADIUS server communicate by exchanging EAP packets carried by the RADIUS protocol.
The 802.1X protocol supports a number of different EAP methods. This solution uses EAP-TTLS. EAP-TTLS is an “outer” protocol: it sets up a secure tunnel in which another authentication protocol, the “inner” protocol, handles the communication between the supplicant and the authentication server. The authentication server must present a valid certificate, which EAP-TTLS uses to form the tunnel. Verifying the identity of the authentication server ensures that a user connects to the intended network, and not to an access point that is pretending to be the network. For the inner protocol, the design uses the Password Authentication Protocol (PAP) or, when Junos Pulse is the supplicant, JUAC (a proprietary Juniper Networks protocol).
This solution uses EAP-TTLS because it:
Provides strong security
Is supported by the Junos Pulse client
Does not require client-side certificates, which simplifies the management of client devices
Is an industry standard
In a bring-your-own-device environment, 802.1X supplicants can be a wide variety of devices running a variety of supplicant software. The campus design must support these native supplicants, while continuing to provide secure connections to the campus network.
The campus design must also support:
Devices that do not have an 802.1X supplicant, such as printers, VoIP phones, and security cameras. For these devices, this solution uses MAC authentication, in which the device is authenticated by its MAC address.
Multiple supplicants on one port. Many organizations connect both a computer and an IP phone to a single port. The solution design supports separate authentication of both devices, using any combination of 802.1X or MAC authentication.
In addition to configuring both MAC authentication and 802.1X authentication on the same port, you can also restrict a port to performing MAC authentication only—for example, if you have a port that connects only to a device without an 802.1X supplicant, such as a video camera.
Authenticators and Policy Enforcement
In this design, the EX switches in the access layer act as 802.1X authenticators. They receive authentication requests from client supplicants, forward the authentication requests to the authentication server, and open or close the ports to traffic depending on the results of the authentication request.
The switches also act as policy enforcement points. Based on the information returned from the authentication server, they dynamically assign user traffic to VLANs and apply firewall filters (ACLs) to the traffic, restricting or allowing access to network resources as required by the user role.
At minimum, this solution design requires a RADIUS server that acts as an authentication server, providing authentication and returning information such as the VLAN and name of the firewall filter associated with the user. In addition, the authentication server might provide other services such as client compliance checking, device profiling, or mobile device management, either directly or indirectly through integration with other servers.
As an example, as part of testing this solution, Juniper Networks used Junos Pulse Access Control Service as the authentication server. This service was integrated with:
An LDAP server to validate user credentials. The same LDAP server was used to validate the credentials of users connecting remotely.
Host-checking software on the Junos Pulse client that acted as an 802.1X supplicant on Windows laptops. The host checker determined the status of the antivirus program on the laptop. If the laptop was not running the correct antivirus program or did not have the latest antivirus definitions, the Access Control Service returned to the switch the remediation VLAN ID and associated firewall filter. The filter restricted the user to accessing only the Access Control Service or a remediation server, from which the user could download and install the antivirus program.
A mobile device management service that acted as an authorization server for mobile devices and pushed profiles to the devices to provision them after the devices were successfully authorized. The purpose of provisioning the devices was to ensure that they would use EAP-TTLS for authentication through the Access Control Service.
Traffic Flows During 802.1X Authentication
Figure 9 illustrates the traffic flow during 802.1X authentication.
As illustrated in Figure 9, the steps involved in granting an 802.1X supplicant access to the wired network are:
- The employee connects the device to the access switch. The switch port blocks all traffic other than 802.1X traffic.
- The switch begins 802.1X communications with the device, requesting the user credentials, while blocking all traffic other than 802.1X traffic.
If the 802.1X supplicant is nonresponsive or not enabled on the end device, the switch puts the port in the guest VLAN and assigns the firewall filter associated with the guest VLAN. The guest firewall filter allows access only to the authentication server and the remediation server. In this solution, the purpose of the guest VLAN is to quarantine the device; it does not provide guest users with access to the Internet.
- When the switch receives credentials from the supplicant, it uses the RADIUS protocol to communicate the user credentials to the authentication server.
- The authentication server validates the user credentials and returns a RADIUS response to the switch containing the VLAN and firewall filter associated with the user.
If the client device fails host checking—for example, it does not have the correct antivirus program installed—the authentication server returns a RADIUS response containing the remediation VLAN and firewall filter. The remediation filter permits the user access only to the authentication server and remediation server, quarantining the device.
- The switch assigns the VLAN to the port and opens the port for user traffic, applying the firewall filter to the traffic. If the user was successfully validated and the user device passed host checking, the user can now access the Internet or the protected resources permitted by the firewall filter.
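The outcome of the steps above can be summarized as a small decision function. This is a minimal sketch, not the switch or RADIUS server logic: the VLAN and filter names are hypothetical, and the behavior for invalid credentials (the port stays closed) is an assumption, because the steps above do not describe that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortAssignment:
    vlan: str
    firewall_filter: str


# Hypothetical VLAN and filter names.
GUEST = PortAssignment("guest", "guest-filter")
REMEDIATION = PortAssignment("remediation", "remediation-filter")


def authorize_port(supplicant_responds, credentials_valid,
                   host_check_passed, user_assignment) -> Optional[PortAssignment]:
    """Return the VLAN and filter applied to the port, or None if it stays closed."""
    if not supplicant_responds:
        return GUEST          # quarantine: no 802.1X supplicant answered
    if not credentials_valid:
        return None           # assumption: the port remains blocked
    if not host_check_passed:
        return REMEDIATION    # quarantine until the device is remediated
    return user_assignment    # VLAN and filter returned for the user role


employee = PortAssignment("employee", "employee-filter")  # hypothetical role
print(authorize_port(True, True, True, employee))
print(authorize_port(True, True, False, employee))
print(authorize_port(False, False, False, employee))
```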
Headless devices, such as printers, security cameras, and VoIP phones, usually do not have an 802.1X supplicant. For such devices, you can enable MAC authentication on an 802.1X port, allowing connecting devices to be authenticated by their MAC addresses.
Headless devices are granted access to the network as follows:
- The device is connected to the network.
- The switch blocks any traffic other than 802.1X traffic on the port and waits for a response from an 802.1X supplicant on the device.
- When the switch receives no response after a set timeout period, it sends the device’s MAC address to the authentication server for authentication.
- If the MAC address is registered with the authentication server, the server authenticates the device.
- The authenticator opens the port and allows traffic on it.
It is also possible to configure a port so that only MAC authentication, and not 802.1X authentication, is allowed on the port. In this case, the switch does not wait for a response from an 802.1X supplicant on the connecting device—instead, it sends the MAC address directly to the authentication server for authentication.
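The two port behaviors described above (802.1X with MAC authentication as a fallback, and MAC authentication only) can be sketched as follows. The function, the step descriptions, and the MAC addresses are hypothetical, and a simple set stands in for the authentication server's list of registered addresses.

```python
def mac_authenticate(mac, registered_macs, mac_only_port):
    """Return (port_open, steps) for a device without an 802.1X supplicant."""
    steps = []
    if not mac_only_port:
        # The switch first waits for an 802.1X supplicant that never answers.
        steps.append("wait for 802.1X response until timeout")
    steps.append("send MAC address to authentication server")
    normalized = mac.lower().replace("-", ":")
    port_open = normalized in registered_macs
    steps.append("open port" if port_open else "keep port closed")
    return port_open, steps


# Hypothetical registered device (for example, a printer).
REGISTERED = {"00:11:22:33:44:55"}

print(mac_authenticate("00-11-22-33-44-55", REGISTERED, mac_only_port=False))
print(mac_authenticate("00:11:22:33:44:55", REGISTERED, mac_only_port=True))
print(mac_authenticate("66:77:88:99:aa:bb", REGISTERED, mac_only_port=True))
```

The only difference between the two port modes in this sketch is the initial wait, which is why a MAC-only port can admit a headless device sooner.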
Access Port Security
In addition to preventing unauthorized access, security design includes preventing various attacks, such as Layer 2 DoS attacks and address spoofing. DoS attacks can be prevented through ingress firewall filters (ACLs) and rate limiting. For address spoofing, we recommend that you enable the following security measures on access switches:
DHCP snooping—Filters and blocks ingress Dynamic Host Configuration Protocol (DHCP) server messages on untrusted ports; builds and maintains an IP-address/MAC-address binding database (the DHCP snooping database)
Dynamic ARP inspection (DAI)—Prevents Address Resolution Protocol (ARP) spoofing attacks. ARP requests and replies are compared against entries in the DHCP snooping database, and filtering decisions are made based on the results of those comparisons
IP source guard—Mitigates the effects of IP address spoofing attacks. The source IP address in a packet that is sent from an untrusted access interface is validated against the source MAC address in the DHCP snooping database. The packet is allowed for further processing if the source IP address to source MAC address binding is valid; if the binding is not valid, the packet is discarded.
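The three measures above share one data structure, the DHCP snooping database. The following sketch shows how such a database could be built and consulted. It is a simplified model: the class and method names are hypothetical, the bindings are keyed by IP address only, and lease expiry and VLAN information are ignored.

```python
class DhcpSnoopingSwitch:
    """Toy model of DHCP snooping, dynamic ARP inspection, and IP source guard."""

    def __init__(self, trusted_ports):
        self.trusted_ports = set(trusted_ports)
        self.bindings = {}  # client IP address -> client MAC address

    def on_dhcp_ack(self, ingress_port, client_mac, client_ip):
        """DHCP server messages are accepted only on trusted ports."""
        if ingress_port not in self.trusted_ports:
            return False  # rogue server message on an untrusted port: drop
        self.bindings[client_ip] = client_mac
        return True

    def _binding_valid(self, ip, mac):
        return self.bindings.get(ip) == mac

    def permit_arp(self, sender_ip, sender_mac):
        """Dynamic ARP inspection: the ARP sender must match a binding."""
        return self._binding_valid(sender_ip, sender_mac)

    def permit_ip_packet(self, src_ip, src_mac):
        """IP source guard: the source IP and MAC must match a binding."""
        return self._binding_valid(src_ip, src_mac)


# Hypothetical uplink port and client addresses.
switch = DhcpSnoopingSwitch(trusted_ports=["xe-0/1/0"])
switch.on_dhcp_ack("xe-0/1/0", "00:11:22:33:44:55", "10.1.1.20")
print(switch.permit_ip_packet("10.1.1.20", "00:11:22:33:44:55"))
print(switch.permit_arp("10.1.1.20", "de:ad:be:ef:00:01"))
```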
Remote Access Security
Users expect to be able to access the campus network remotely, from anywhere, at any time, using any device. In this solution, an SSL VPN provides remote access security for web-capable devices by intermediating the data that flows between external users and the enterprise’s internal resources. During intermediation, the SSL VPN receives secure requests from the external, authenticated users and then makes requests to the internal resources on behalf of those users. The SSL VPN used in testing was the Junos Pulse Secure Access Service, which was integrated with an LDAP server for authenticating users.
Internet Edge Security
The SRX650 Services Gateways provide perimeter security services, stateful policy enforcement, and Network Address Translation (NAT) for Internet traffic that is entering or exiting the campus network.
Security Zones and Security Policies
The SRX Series Services Gateways are zone-based firewalls, enabling you to group interfaces with similar security requirements into security zones. You can then apply security policies to traffic as it traverses from one zone to another zone.
In the solution design, two security zones are defined:
The trust zone, which contains the interfaces that connect to the core switch
The untrust zone, which contains the interfaces that connect to the edge router
These zones are configured on the redundant Ethernet interfaces as shown in Figure 10.
Table 4 describes the security policies that govern traffic passing between zones.
Table 4: Zone Policies
Employee remote access
Permit all public source addresses to access the SSL VPN service using HTTP and HTTPS only. All other inbound traffic is denied and logged.
Employee Internet access
Permit all private source addresses within the trust zone to access the Internet with HTTP, HTTPS, DNS, NTP, UDP, and PING. All other outbound traffic is denied and logged.
In some campus environments, additional security policies might be needed. For example, you might require a security policy that allows external access to public domain servers, such as webservers.
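The two zone policies in Table 4 can be modeled as a first-match evaluation with a default action of deny and log. This is a sketch of the policy logic only, not SRX Series configuration: the private address range, the service label, and the function name are assumptions made for illustration.

```python
import ipaddress

CAMPUS_PRIVATE = ipaddress.ip_network("10.0.0.0/8")  # assumed private address space
REMOTE_ACCESS_APPS = {"http", "https"}
INTERNET_APPS = {"http", "https", "dns", "ntp", "udp", "ping"}


def evaluate_policy(from_zone, to_zone, src_ip, dst_service, app):
    """Return 'permit' or 'deny-and-log' for one flow between zones."""
    if (from_zone, to_zone) == ("untrust", "trust"):
        # Employee remote access: any source, SSL VPN service only.
        if dst_service == "ssl-vpn" and app in REMOTE_ACCESS_APPS:
            return "permit"
    elif (from_zone, to_zone) == ("trust", "untrust"):
        # Employee Internet access: private sources, listed applications only.
        if ipaddress.ip_address(src_ip) in CAMPUS_PRIVATE and app in INTERNET_APPS:
            return "permit"
    return "deny-and-log"


print(evaluate_policy("untrust", "trust", "198.51.100.7", "ssl-vpn", "https"))
print(evaluate_policy("trust", "untrust", "10.1.2.3", "internet", "dns"))
print(evaluate_policy("trust", "untrust", "10.1.2.3", "internet", "ssh"))
```

An additional policy, such as one that permits external access to a public web server, would be one more branch ahead of the default action.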
Network Address Translation
To protect the internal IP address space, the SRX650 Services Gateways perform Network Address Translation (NAT).
In this solution:
Outbound traffic uses source NAT. Source NAT translates private IP addresses to public IP addresses selected from a configurable pool. For the public IP addresses, we recommend using a source NAT pool instead of an interface IP pool because it provides more scale.
Inbound traffic uses destination NAT. Destination NAT translates the public IP address of the Secure Access Service server to its private IP address.
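The two translation directions can be sketched as follows. This is a deliberately simplified model: all addresses are hypothetical (documentation and private ranges), and pool addresses are shared in round-robin order, whereas a real gateway would typically also translate ports so that many private hosts can share one public address.

```python
import ipaddress


class SourceNatPool:
    """Toy source NAT: map each private source to an address from a pool."""

    def __init__(self, pool_cidr):
        self._pool = [str(a) for a in ipaddress.ip_network(pool_cidr).hosts()]
        self._next = 0
        self._map = {}  # private IP -> public IP

    def translate(self, private_ip):
        if private_ip not in self._map:
            self._map[private_ip] = self._pool[self._next % len(self._pool)]
            self._next += 1
        return self._map[private_ip]


# Toy destination NAT: public address of the SSL VPN service -> private address.
DESTINATION_NAT = {"203.0.113.20": "10.10.10.20"}  # hypothetical addresses


def translate_destination(public_ip):
    """Return the private address for a known public address, else unchanged."""
    return DESTINATION_NAT.get(public_ip, public_ip)


pool = SourceNatPool("203.0.113.8/30")  # hypothetical pool with two usable addresses
print(pool.translate("10.1.1.20"))
print(pool.translate("10.1.1.21"))
print(translate_destination("203.0.113.20"))
```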
Quality of Service Design
EX Series switches are designed to provide high quality of service (QoS) to end users and applications in the campus. Many EX Series features contribute to delivering high QoS—features such as high-bandwidth links, reduced latency through Virtual Chassis technology, fast route convergence, and so on. Nevertheless, it is still important to implement specific QoS policies. QoS is the manipulation of aggregates of traffic such that each aggregate is forwarded in a fashion that is consistent with the required behaviors of the application generating that traffic. QoS is mandatory for any campus deployment where there is potential for congestion or contention for resources.
QoS and Service-Level Agreements
QoS policies are typically based on application. Each application has specific service-level agreements (SLAs) that must be considered when determining QoS policies for the campus. Table 5 gives some baseline guidance for SLAs.
Table 5: Baseline SLAs for Campus Networks
Low latency—Less than 150 ms.
Low latency—Less than 150 ms.
Mission-critical data—Should be given higher priority in queuing and policing policies.
Other data—Usually treated as best-effort traffic.
The above-mentioned SLAs are generic in nature and might not completely satisfy the requirements of your applications. Use these SLAs as a starting point for determining the SLAs required for your applications.
Overview of QoS in the Campus LAN
Junos OS provides the class-of-service (CoS) feature to allow you to configure an individual node to handle traffic in a way that is consistent with the end-to-end QoS policy. CoS consists of the following components:
A forwarding class is a means of aggregating traffic that has the same characteristics and that requires the same behavior as it flows through a network node. To share a forwarding class, traffic does not have to belong to the same application—it must merely require the same behavior.
A forwarding class is a label used entirely within a network node. A forwarding class does not explicitly appear outside a node. However, forwarding classes are usually implemented consistently across nodes in a campus network.
The forwarding classes used in a campus network depend, of course, on the applications supported and their SLAs. For this configuration example, five forwarding classes are used:
Network control—For protocol control packets, which generally have a high priority.
Voice—For voice traffic, which requires low loss, low latency, low jitter, assured bandwidth, and end-to-end service.
Video—For video traffic. Video traffic is similar to voice traffic in its SLA requirements, but video traffic is bursty and requires more bandwidth to be allocated per stream.
Mission critical—For data traffic that requires higher QoS than best effort, such as mission-critical applications or transactional applications.
Best effort—All other traffic.
Some campus environments place video and voice into the same forwarding class; however, the bursty nature of video generally requires a different CoS policy than does voice.
Traffic must be classified before it can be assigned to a forwarding class. Junos OS supports three methods of classifying traffic:
Interface-based—Traffic is classified by the interface it arrives on. Although interface-based classification is the simplest method, this configuration example does not use it because it means that all traffic arriving on an interface must require the same behavior.
Behavior Aggregate (BA)—BA classification relies on markings placed in the headers of incoming frames or packets. Ethernet frames and IP packets include a field in their headers that indicates the class of the frame or packet—for example, Ethernet frames use three 802.1p bits while IPv4 packets use the 6-bit DiffServ Code Point (DSCP) field.
This configuration example uses the DiffServ code points shown in Table 6 to map packets to their forwarding class.
Table 6: DiffServ Code Points Mapped to Forwarding Class
DiffServ Code Point
Multifield—Multifield classification uses ingress firewall filters to classify traffic based on Layer 2, Layer 3, or Layer 4 information. Multifield classifier filters can be applied to Layer 2 or Layer 3 interfaces, to VLANs, or to some combination of these. Because multifield classifier filters are stored in ternary content-addressable memory (TCAM), applying the same multifield classifier to multiple interfaces consumes additional TCAM space. You can reduce TCAM consumption by applying the multifield classifier to VLANs instead. This configuration example uses multifield classification on access switches to classify packets on VLANs for client traffic.
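Behavior aggregate classification can be illustrated with a short lookup. In an IPv4 header, the DSCP value occupies the upper six bits of the type-of-service byte. Because the contents of Table 6 are not reproduced here, the code point values in this sketch are commonly used DiffServ markings chosen as assumptions, and the forwarding class names are illustrative.

```python
# Assumed DSCP values (common DiffServ markings), not the contents of Table 6.
DSCP_TO_CLASS = {
    48: "network-control",   # CS6
    46: "voice",             # EF
    34: "video",             # AF41
    18: "mission-critical",  # AF21
    0: "best-effort",
}


def dscp_from_tos(tos_byte):
    """Extract the 6-bit DSCP value from the 8-bit IPv4 ToS byte."""
    return (tos_byte >> 2) & 0x3F


def classify(tos_byte):
    """Map a packet to a forwarding class; unknown markings get best effort."""
    return DSCP_TO_CLASS.get(dscp_from_tos(tos_byte), "best-effort")


print(classify(0xB8))  # ToS byte 0xB8 carries DSCP 46
print(classify(0x88))  # ToS byte 0x88 carries DSCP 34
print(classify(0x04))  # DSCP 1 is not in the table
```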
Queues and Schedulers
You can configure each port on a switch to use up to 8 or up to 12 egress queues, depending on the switch model. The forwarding class of a packet determines which queue it is sent to for transmission.
Each queue has one or more schedulers associated with it—different schedulers can be applied to different interfaces. Schedulers determine when packets are placed on the interface from the queue in which they are waiting. When you define a scheduler, you can specify scheduling priority, buffer size, queue shaping, transmit rate, and drop profile, as described here:
Scheduling priority—Priority can be either strict-high or shaped-deficit weighted round-robin (SDWRR). With strict-high priority scheduling, packets in higher priority queues are always transmitted before packets in lower priority queues. As long as the higher priority queue has packets waiting, the lower priority queues will not be serviced. Queue priority is determined by queue number—higher numbered queues always have a higher priority than lower numbered queues (for example, queue 7 has a higher priority than queue 6). Strict-high priority is used for queues that process traffic that is sensitive to delays, such as voice traffic.
All other priorities result in the queues being serviced in an SDWRR fashion, with packets being transmitted sequentially, starting with the highest priority queue.
Buffer size—Buffer size refers to the amount of buffer space allocated to a queue. Consider the following when configuring buffer size:
Because strict-high priority queues have a high transmit rate, they require smaller buffers. We recommend reserving a small percentage for strict-high priority queues.
SDWRR queues, in contrast, require larger buffers. The buffer size required can vary based on application load and requirements. A common practice is to match buffer size to transmit rate.
Voice traffic should not be buffered over a long period, because that increases latency and jitter. Instead, packets should be dropped. To achieve this, you can specify that the buffer size is exact, which prevents any excess voice packets from being buffered in the shared buffer.
Queue shaping—Shaping limits the rate at which traffic can be transmitted. Traffic that does not conform to the shaper’s criteria is held in the queue until it does conform. No explicit constraint is placed on more traffic entering the queue, as long as the queue is not full.
Because packets in a strict-high priority queue are always transmitted before packets in a lower priority queue, a strict-high priority queue can potentially consume all the bandwidth and starve lower priority queues. We recommend that you use queue shaping on strict-high priority queues to prevent this situation from occurring.
Transmit rate—Transmit rate specifies the portion of the total interface bandwidth that is allocated to the queue. This rate can be specified as a fixed value, as a percentage of the total bandwidth, or as the rest of the available bandwidth. Transmit rate is not applicable to strict-high priority queues, because these queues are always serviced when there are packets in the queue.
Drop profile—Tail drop profile is a congestion management mechanism that allows a switch to drop arriving packets when queue buffers become full or begin to overflow. EX Series switches support either weighted tail drop (WTD) or weighted random early detection (WRED). If you do not explicitly configure a drop profile, a default tail drop profile is used.
We recommend that you do not use WRED on queues that handle UDP traffic. UDP is often used by applications that are intolerant of loss, latency, and jitter. In addition, UDP has no built-in mechanism for detecting the loss of a packet and adjusting its rate of transmission. A dropped packet is therefore either simply lost, which reduces the perceived QoS without significantly affecting throughput, or, worse, the application detects the loss and requests retransmission, so the packet is sent twice, potentially increasing the congestion.
For this solution, the default drop profile is used. Specific drop profiles are not used because each type of traffic within the campus has its own queue and there is no need to differentially drop packets if the queue becomes congested.
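The interaction between a shaped strict-high queue and the remaining queues can be sketched with a simplified scheduler. This is an illustrative model, not the EX Series implementation: it uses a plain deficit weighted round-robin for the non-strict queues, treats the shaper as a byte budget per round, and all queue names, weights, and packet sizes are hypothetical.

```python
from collections import deque


def schedule_round(queues, strict_high, weights, deficits,
                   quantum=1500, strict_budget=1000):
    """Run one scheduling round and return the (queue, size) pairs transmitted."""
    sent = []
    # 1. The strict-high queue is served first, limited by its shaped budget
    #    so that it cannot starve the other queues.
    budget = strict_budget
    queue = queues[strict_high]
    while queue and queue[0] <= budget:
        size = queue.popleft()
        budget -= size
        sent.append((strict_high, size))
    # 2. The remaining queues share the link by deficit weighted round-robin.
    for name, weight in weights.items():
        queue = queues[name]
        if not queue:
            deficits[name] = 0
            continue
        deficits[name] += quantum * weight
        while queue and queue[0] <= deficits[name]:
            size = queue.popleft()
            deficits[name] -= size
            sent.append((name, size))
        if not queue:
            deficits[name] = 0
    return sent


# Hypothetical queues holding packet sizes in bytes.
queues = {
    "voice": deque([200] * 20),
    "video": deque([1500] * 10),
    "best-effort": deque([1500] * 10),
}
weights = {"video": 2, "best-effort": 1}
deficits = {"video": 0, "best-effort": 0}
first_round = schedule_round(queues, "voice", weights, deficits)
print(first_round)
```

In each round of this sketch, voice is transmitted first but only up to its shaped budget, and video receives twice the bandwidth share of best-effort traffic.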
Policing, or rate limiting, lets you control the amount of traffic that enters an interface. You can achieve policing by including policers in firewall filter configurations. A firewall filter configured with a policer permits only traffic within a specified set of rate limits, thereby providing protection from denial-of-service (DoS) attacks. Traffic that exceeds the rate limits specified by the policer is either discarded immediately or is marked as lower priority than traffic that is within the rate limits. The lower priority traffic is discarded when there is traffic congestion.
Hard-drop behavior can have a negative impact, particularly on TCP traffic and when the policer is run consistently at its limit. While it is possible to reclassify packets based on a policer, it is important to avoid reordering packets in applications that are sensitive to the order in which packets are received, such as voice, video, and other real-time traffic.
If traffic rate limiting is required in your implementation, policing should be done at the edge to control the load entering the network. This network configuration example does not implement policers.
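Although this configuration example does not implement policers, the behavior described above can be illustrated with a token-bucket model, which is one common way to express a rate limit with a burst allowance. The class name, parameters, and action labels are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
class Policer:
    """Toy single-rate token-bucket policer."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last = 0.0

    def police(self, now, size_bytes, action="discard"):
        """Return 'forward' for conforming packets, else the out-of-profile action."""
        elapsed = now - self.last
        self.last = now
        # Refill the bucket for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "forward"
        return action  # "discard" or "mark-low-priority"


policer = Policer(rate_bps=8000, burst_bytes=1500)  # 1000 bytes per second
print(policer.police(0.0, 1500))
print(policer.police(0.0, 100))
print(policer.police(0.5, 500))
print(policer.police(0.6, 500, action="mark-low-priority"))
```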
A rewrite rule sets the appropriate CoS bits in the outgoing packet, thus allowing the next downstream device to classify the packet into the appropriate service group. Rewriting, or remarking, outbound packets is generally done by edge devices. In this network configuration example, rewriting is done by the switches in the access layer.
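A rewrite rule for IPv4 traffic can be sketched as the reverse of classification: the forwarding class selects a DSCP value, which is written into the upper six bits of the type-of-service byte while the lower two bits are preserved. The class names and code point values are assumptions chosen for illustration, not values taken from this configuration example.

```python
# Assumed forwarding class to DSCP mapping (common DiffServ markings).
CLASS_TO_DSCP = {
    "network-control": 48,
    "voice": 46,
    "video": 34,
    "mission-critical": 18,
    "best-effort": 0,
}


def rewrite_tos(tos_byte, forwarding_class):
    """Set the DSCP bits for the forwarding class, keeping the low two bits."""
    dscp = CLASS_TO_DSCP[forwarding_class]
    return (dscp << 2) | (tos_byte & 0x03)


print(hex(rewrite_tos(0x00, "voice")))
print(hex(rewrite_tos(0x03, "video")))
```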
Deploying CoS in the Campus LAN
CoS components are implemented on a per-hop basis, with each device being separately configured for CoS. The user, on the other hand, evaluates quality of experience based on the end-to-end traffic flow. Even though CoS is implemented on a per-hop basis, you must consider the end-to-end traffic flow when configuring CoS so that the resulting quality of experience is consistent with the desired end-to-end user experience or application behavior. Bear in mind that a single congested hop can destroy the end-to-end experience, and subsequent nodes can do nothing to recover the end-to-end quality of experience for the user.
This network configuration example implements CoS at the access, aggregation, and core layers. Your organization might want to extend the CoS implementation to include the edge firewalls and routers.
In developing your implementation strategy, it is useful to divide your network into trusted and untrusted domains. Trust and untrust are commonly used terms in a security context, but they can also be used in QoS. An edge device (such as an access switch or a router connecting to the Internet) resides at the boundary between the trusted and untrusted domains. Edge devices are the first points of entry into, and the last points of exit from, the campus network. The CoS markings on packets coming from the untrusted domain might not conform to the campus QoS policy, but once packets enter the campus LAN, network administrators have complete control and can manipulate packets so that they comply with the established QoS strategy.
For this solution, traffic within the campus LAN is trusted, while traffic arriving at the access layer is untrusted.