MPLS Working Group                                            Richard Li
Internet-Draft                                            Katherine Zhao
Intended status: Informational                                  Robin Li
Expires: December 25, 2015                           Huawei Technologies
                                                           June 25, 2014


    MPLS Deployments and Use Cases That Cannot be Solved By Using
                             20-bit Label
           draft-lzl-mpls-ucase-n-20bit-label-limitation-00

Abstract

   The MPLS label format and encoding method are specified in
   [RFC3032]. The label value is carried in a 20-bit field and
   therefore supports up to 2^20 (about 1 million) instances. As
   Internet growth continues, there are specific network deployment
   scenarios with a clear need to represent more than one million
   entities. This document describes the MPLS deployment scenarios,
   use cases, and requirements for which the current 20-bit label
   will no longer be sufficient.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on [Date].

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s)
   controlling the copyright in such materials, this document may not
   be modified outside the IETF Standards Process, and derivative
   works of it may not be created outside the IETF Standards Process,
   except to format it for publication as an RFC or to translate it
   into languages other than English.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  VPN Use Cases
     2.1.  L3VPN with VXLAN and NVGRE Use Cases
       2.1.1.  The Argument for >1 Million Identifiers
     2.2.  Use Case of L2 EVPN over MPLS
   3.  Use Case of MRT MT and FRR
   4.  NVO3 Use Case
   5.  General Requirements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.
Introduction

   The MPLS label format and encoding method have been specified in
   [RFC3032] for more than 10 years. The MPLS label value is carried
   in a 20-bit field, which supports up to 2^20 (about 1 million)
   instances. This proven technology has been widely deployed and has
   worked well for years. In recent years, however, network
   virtualization and SDN technologies have been widely adopted in
   data center and cloud networks, creating a new set of requirements
   in which an MPLS label must represent more than one million
   instances.

   This document identifies the deployment scenarios and use cases
   that require more than 2^20 identifiers in the same MPLS label
   space and therefore cannot be supported by the existing 20-bit
   label.

1.1.  Terminology

   The following terms are used in this document:

   CE    - Customer Edge
   DC    - Data Center
   MPLS  - Multiprotocol Label Switching
   NVE   - Network Virtualization Edge
   NVGRE - Network Virtualization using Generic Routing Encapsulation
   NVO3  - Network Virtualization Over Layer 3
   PE    - Provider Edge
   VLAN  - Virtual Local Area Network
   VNI   - VXLAN Network Identifier (VXLAN)
   VNID  - Virtual Network ID (NVO3)
   VPN   - Virtual Private Network
   VRF   - Virtual Routing and Forwarding
   VSID  - Virtual Subnet ID (NVGRE)
   VTEP  - VXLAN Tunnel End Point
   VXLAN - Virtual eXtensible Local Area Network

2.  VPN Use Cases

   VPN technology allows customers to connect geographically diverse
   sites and data centers across core networks with assured
   performance and security. Many methods exist for VPN connectivity;
   the following subsections discuss use cases of BGP MPLS/IP based
   L3 and L2 VPNs, respectively.

2.1.  L3VPN with VXLAN and NVGRE Use Cases

   In the BGP MPLS based VPN reference model, each site has one or
   more Customer Edge (CE) devices, each of which is attached to one
   or more Provider Edge (PE) routers via an attachment circuit such
   as Ethernet or a VLAN.
   When the VPN reference model is extended to connect virtual
   networks, the CE and PE devices at the data center site can be
   physically combined into a single device, which performs the PE
   function with respect to the VPN model and the Network
   Virtualization Edge (NVE) function of the data center. Data
   centers may use different network protocols, such as VXLAN or
   NVGRE. The encoding differs between the two protocols, but both
   use a 24-bit field to represent up to 16 million virtual network
   instances.

   Figure 1 shows a sample topology where customer sites are
   connected to a data center via L3VPN over an MPLS core network.
   Customer sites are L3 networks; the VXLAN protocol
   [I-D.mahalingam-dutt-dcops-vxlan] is used in the data center.
   Between the data center and the MPLS core is a new device that
   combines the functions of a PE and a VTEP (VXLAN Tunnel End
   Point), named PE-VTEP in Figure 1. When a network entity at a CE
   site sends packets to a VM in the data center, the PE-VTEP device
   works like a gateway: the L3 VPN ID is carried across the core
   network in an MPLS label. When packets enter the data center, the
   VPN ID is mapped to a VNI (VXLAN Network Identifier) of the VXLAN
   network. In this way, customer L3 networks are connected to the
   data center via an L3 VPN across the core, while the traffic
   remains private within the VPN.

                 .......................   ..................
                 .                     .   .                .
   CE1-|     +-------+               +-------+    VXLAN     .
   CE2-|-----|  PE   |     MPLS      |PE-VTEP|   Network    .
   CE3-|     +-------+    Network    +-------+      in      .
   Customer      .                     .   .   DataCenter   .
   L3 Networks   .                     .   .                .
                 .......................   ..................

   +-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
   |  Payload  |     |     Payload     |     |    Payload    |
   +-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
   |    IP     |     |   dest VM IP    |     |     VM IP     |
   +-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
                     | L3VPN(VNI)Label |     | VXLAN Header  |
                     +-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
                     |    LSP Label    |     |    UDP/IP     |
                     +-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+

 <--------------->|<-------------------->|<---------------->
   Original Packet   Packet format between  Packet format
   in Customer site  PE and PE-VTEP         in data center
   L3 network        in MPLS core           VXLAN network

        Figure 1. Interconnecting Customer Sites with VXLAN in DC

   The PE and PE-VTEP devices on the MPLS core perform the following
   functions:

   VPN PE functions: The device uses BGP to distribute VPN routes,
   maintains VRFs, and uses MPLS to receive packets from and forward
   packets to the MPLS network.

   VXLAN VTEP functions: The device originates and terminates VXLAN
   tunnels, runs all the protocols necessary to build and tear down
   the VXLAN tunnels, and maintains the VXLAN tunnel forwarding
   state, including the MAC table.

   L3VPN-VXLAN inter-working functions: The device maintains the
   mapping between L3VPN labels and VXLAN VNIs. This mapping is used
   to receive packets from the MPLS network and forward them to the
   VXLAN network, and to receive packets from the VXLAN network and
   forward them to the MPLS network.

   VXLAN uses a 24-bit segment identifier, the VNI (VXLAN Network
   Identifier), in the data center. It resolves the 4096-segment
   limit of VLANs (IEEE 802.1Q), which does not provide enough
   segments for scalable cloud deployments.

   NVGRE is another use case that is architecturally similar to
   VXLAN. NVGRE uses GRE to tunnel L2 packets across an MPLS/IP
   network, together with a 24-bit segment identifier in the form of
   a VSID (Virtual Subnet ID). NVGRE allows LAN segments to scale to
   16 million in each data center; each LAN segment can be extended
   across the MPLS/IP core network.
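
   The L3VPN-VXLAN inter-working function described above can be
   sketched as a lookup table. The following is a minimal
   illustration, not a description of any implementation: the class
   name InterworkingTable, its method bind, and the sequential
   allocation policy are assumptions made for this example. It shows
   that a one-to-one binding of 24-bit VNIs to 20-bit labels runs out
   of labels long before the VNI space is used up.

```python
# Sketch only: all names below are hypothetical, chosen for illustration.
LABEL_BITS = 20                  # MPLS label field width [RFC3032]
VNI_BITS = 24                    # VXLAN VNI / NVGRE VSID width
MAX_LABELS = 1 << LABEL_BITS     # 1,048,576 label values
MAX_VNIS = 1 << VNI_BITS         # 16,777,216 VNI values
FIRST_UNRESERVED_LABEL = 16      # label values 0-15 are reserved [RFC3032]


class InterworkingTable:
    """One-to-one VNI <-> L3VPN label bindings kept by a PE-VTEP."""

    def __init__(self):
        self.vni_to_label = {}
        self.label_to_vni = {}
        self._next_label = FIRST_UNRESERVED_LABEL

    def bind(self, vni):
        """Return the label bound to vni, allocating one if needed."""
        if not 0 <= vni < MAX_VNIS:
            raise ValueError("VNI does not fit in 24 bits")
        if vni in self.vni_to_label:
            return self.vni_to_label[vni]
        if self._next_label >= MAX_LABELS:
            # Every 20-bit label is in use; further VNIs cannot be mapped.
            raise OverflowError("20-bit label space exhausted")
        label = self._next_label
        self._next_label += 1
        self.vni_to_label[vni] = label   # VXLAN -> MPLS direction
        self.label_to_vni[label] = vni   # MPLS -> VXLAN direction
        return label
```

   Because 2^24 / 2^20 = 16, at most one sixteenth of the possible
   VNIs can hold a dedicated label in such a table at any one time.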

   Figure 2 shows a topology similar to Figure 1, except that VXLAN
   is replaced by NVGRE within the data center, and PE-VTEP is
   replaced by PE-NVE, which works like a gateway that maps between
   the VPN header and the NVGRE header.

                 .......................   ..................
                 .                     .   .                .
   CE1-|     +-------+               +-------+    NVGRE     .
       |-----|  PE   |     MPLS      |PE-NVE |   Network    .
   CE2-|     +-------+    Network    +-------+      in      .
                 .                     .   .   DataCenter   .
   Customer      .                     .   .                .
   Sites         .......................   ..................

   +-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+
   |   Payload   |    |    Payload    |    |   UDP (IP)    |
   +-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+
   |     IP      |    |  dest VM IP   |    |NVGRE Hdr(VSID)|
   +-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+
                      | VPN(VSID)Label|    |      IP       |
                      +-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+
                      |   LSP Label   |
                      +-+-+-+-+-+-+-+-+

 <--------------->|<------------------------>|<--------------->
   IP Packet         Packet format out of     Packet format out
   in L3 Network     PE to MPLS network       of PE-NVE to
                                              NVGRE network

        Figure 2. Interconnecting Customer Sites with NVGRE in DC

   Based on the scaling considerations outlined above for the VXLAN
   and NVGRE use cases, the current 20-bit MPLS label space, which
   supports up to 1 million virtual network segments, will not be
   sufficient for future DC-based MPLS deployments and use cases.

2.1.1.  The Argument for >1 Million Identifiers

   VMs need to be on the same VN to communicate with each other,
   which ensures information privacy and traffic isolation from
   outside the VN. In the past, a virtual network might be shared by
   an enterprise or a large group of people; the number of VNIs was
   only in the hundreds or thousands, and a data center held only
   several tens of thousands of servers. VLAN technology (supporting
   4096 segments) was sufficient for many years.
   However, as new technologies such as cloud computing, the mobile
   Internet, and the Internet of Things have matured and become
   reality, the world is in the midst of a dramatic transformation
   toward Internet-enabled devices, and data center scale has grown
   explosively. According to StorageServers.com, Microsoft built its
   own 500,000 sq ft data center in 2006, which holds about 100,000
   servers today. According to Forbes.com, the world's largest data
   center will reach 6.3 million sq ft (completion in 2016, Langfang,
   China), an increase of about 12 times (6.3 million sq ft /
   500,000 sq ft). It is safe to assume that such a data center can
   hold 6 times more servers, e.g., 600,000 servers in 2016. Assuming
   conservatively that each server holds 16 VMs and uses 2 VNIs on
   average, the largest data center could need 1.2 million VNIs
   (600,000 servers * 2 VNIs) after 2016, and many other data centers
   will also need more than 1 million VNIs based on this trend. The
   current 20-bit label cannot support this, so a solution is needed
   to meet the demand.

   Another question is why a one-to-one mapping between a VPN label
   and a virtual network identifier (VNI) is needed. It is true that
   a label-sharing method can aggregate multiple VPN IDs into a
   single MPLS label, which the PE-VTEP then decodes and maps to the
   corresponding VNIs in the data center. Such methods use
   one-to-many mapping to reduce the number of MPLS labels, so the
   20-bit label space need not be expanded. However, one-to-many
   aggregation always involves extra processing, such as hashing or
   lookups, which adds overhead to packet encapsulation and affects
   traffic performance. Moreover, some deployments use specific flow
   identification that does not allow label sharing or aggregation.
   To ensure efficient traffic distribution and packet encapsulation
   in the virtual network, one-to-one mapping has a clear advantage
   in performance and network scalability.

2.2.
Use Case of L2 EVPN over MPLS

   BGP MPLS based EVPN (Ethernet VPN) introduces a new model for
   delivering Ethernet services, which extends L2 connectivity to all
   connected customer sites and data centers. EVPN was driven by
   market requirements such as MAC address scalability, load
   balancing and all-active redundancy, optimal forwarding, and fast
   convergence. The IETF draft [draft-ietf-l2vpn-evpn-07] specifies
   the BGP MPLS based Ethernet VPN protocol.

   Figure 3 illustrates a sample EVPN deployment. As with L3 VPNs, an
   L2 EVPN network comprises customer edge (CE) devices (hosts,
   routers, or switches) connected to provider edge (PE) devices. The
   difference is that EVPN forwarding is based on the destination MAC
   address rather than the IP address used in the L3 VPN
   architecture.

                 .......................
                 .                     .
   CE1-|     +-------+              +------+     |-CE3
       |-----|  PE   |              |  PE  |-----|-CE4
   CE2-|     +-------+              +------+     |-CE5
                 .    MPLS Network     .
   Customer      .                     .      Customer
   Sites         .......................      Sites

   +-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+
   |    Ethernet     |   |    Ethernet     |   |    Ethernet     |
   |      Frame      |   |      Frame      |   |      Frame      |
   +-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+
                         | EVPN Label(EVI) |
                         +-+-+-+-+-+-+-+-+-+
                         |    LSP Label    |
                         +-+-+-+-+-+-+-+-+-+

 <-------------------->|<-------------------->|<------------------->
   Original Packet        Packet format          Packet format
   in Customer site       between PEs            in Customer site
   L2 network             in MPLS core           L2 Network

           Figure 3. L2 networks via BGP/MPLS based L2 EVPN

   The EVPN protocol specifies an ESI (Ethernet Segment Identifier)
   to identify an Ethernet segment; a segment can contain one or many
   VLANs. The ESI is a ten-octet integer, while the VID (VLAN ID) is
   a 12-bit field. EVPN therefore allows a very large number of EVPNs
   to be deployed over a service provider's MPLS network, each
   providing network connectivity to customers while ensuring that
   the traffic remains private within a VPN.
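
   The two-label stack drawn in Figure 3 (an outer LSP label and an
   inner EVPN label) can be sketched using the label stack entry
   layout of [RFC3032]: a 20-bit label, a 3-bit experimental field, a
   1-bit bottom-of-stack flag, and an 8-bit TTL. The function names
   encode_lse and encode_stack and the default TTL of 64 are
   assumptions made for this illustration only. The sketch makes the
   limit concrete: a label value of 2^20 or more has no valid
   encoding.

```python
import struct

# Sketch only: function names and default values are hypothetical.
LABEL_BITS = 20  # width of the label field in an RFC 3032 stack entry


def encode_lse(label, exp=0, s=0, ttl=64):
    """Pack one 32-bit label stack entry: label(20) exp(3) S(1) TTL(8)."""
    if not 0 <= label < (1 << LABEL_BITS):
        raise ValueError("label does not fit in 20 bits")
    if not (0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("exp, S, or TTL out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def encode_stack(lsp_label, evpn_label, ttl=64):
    """Outer LSP label (S=0) followed by inner EVPN label (S=1)."""
    outer = encode_lse(lsp_label, s=0, ttl=ttl)
    inner = encode_lse(evpn_label, s=1, ttl=ttl)
    return outer + inner
```

   Calling encode_stack(100, 200) yields 8 octets, one 4-octet entry
   per label, whereas encode_lse(1 << 20) raises ValueError because
   the value needs 21 bits.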

   When an ingress PE floods multi-destination traffic, an EVI (EVPN
   identifier, see Figure 3 above) label is mapped to the ESI, VID,
   and destination MAC address in the customer's L2 network. If a
   customer site has more than 1 million connected network devices or
   VMs, each needing an EVPN ID, the current 20-bit MPLS label cannot
   support this demand.

3.  Use Case of MRT MT and FRR

   In some Fast Reroute (FRR) protection deployments, global label
   allocation (GLA) is used to represent an LSP for fast switchover.
   In the event of a link failure, traffic on an LSP is rerouted to
   the next hop using a pre-configured backup LSP identified by a
   global MPLS label. With GLA, the protected LSP and the backup LSP
   can be set up through the same interface, and any received MPLS
   frame is switched based on its global label regardless of its
   incoming interface. Combined with the make-before-break method,
   this achieves fast traffic recovery.

   When the same FRR mechanism is applied to the MRT MT (Maximally
   Redundant Trees Multi-topology) scenario, the 20-bit MPLS label
   space will not be sufficient. For example, assume that three
   topologies are configured in an MPLS core network, colored yellow
   (default), red, and blue, respectively. When network-wide FRR is
   enabled by incremental deployment of LDP MT in the IP network, the
   MPLS label has to be globally unique in order to achieve fast
   reroute. Since the number of Internet routes is around 500,000
   according to some statistics, allocating MPLS labels
   simultaneously in the yellow (default), blue, and red topologies
   requires at least 3 * 500,000 = 1.5 million labels, which exceeds
   the existing 20-bit MPLS label range of about 1 million labels.
   Use cases of this kind impose an additional requirement for the
   MPLS big label.

4.
NVO3 Use Case

   NVO3 is an ongoing effort to standardize solutions for data center
   virtualization, with the goal of providing viable data
   encapsulation and protocols across a scaling range from a few
   thousand VMs to several million VMs running on more than one
   hundred thousand physical servers. NVO3 considers approaches to
   multi-tenancy that reside at the network layer rather than
   traditional isolation mechanisms that rely on the underlying
   layer 2 technology (e.g., VLANs).

   Based on the NVO3 framework and problem statement, NVO3 will
   deliver 16 million virtual networks in a physical data center,
   similar to VXLAN and NVGRE. As described for the L3 and L2 VPN use
   cases, MPLS labels will need to be associated with NVO3 VNIDs,
   which makes this a potential use case for the MPLS big label.

5.  General Requirements

   When designing the MPLS big label for the use cases described in
   this document, the following requirements should also be
   considered.

   1.  An extension to the MPLS label format of [RFC3032] should be
       specified for the big label.

   2.  The big label needs a 24-bit space with 16 million labels to
       support inter-working with VXLAN/NVGRE/NVO3. The same
       requirement applies to L2VPN when Q-in-Q is used.

   3.  The big label needs a 32-bit space with about 4 billion (2^32)
       labels in order to support the number of SIDs specified by
       Segment Routing.

   4.  A PE device in the MPLS core network must advertise its big
       label processing capability to other devices in the same
       routing domain.

   5.  Every PE device must enable the big label capability in order
       to distribute traffic using the big label in the same routing
       domain. A mechanism needs to be specified if some explicit
       nodes are allowed not to use the big label in this routing
       domain.

   6.  PE devices must be capable of supporting an expanded number of
       virtual interfaces, forwarding entries, and so on, which could
       reach 16 million or about 4 billion in the worst case.

   7.  The big label framework needs to be backward compatible.
       A network device supporting the big label must also be able to
       process the regular 20-bit label.

   8.  The big label should work with the existing MPLS hardware
       architecture.

6.  IANA Considerations

   This document makes no requests for IANA actions.

7.  Security Considerations

   This draft does not introduce any additional security implications
   for BGP/MPLS IP VPNs. All existing authentication and security
   mechanisms for MPLS still apply.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2547]  Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547,
              March 1999.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information
              in BGP-4", RFC 3107, May 2001.

8.2.  Informative References

   [1]  Li, R. and M. Li, "Encoding of Big Labels in MPLS Label
        Stacks", draft-renwei-mpls-big-label-00, June 2013.

   [2]  Li, Z. and L. Zheng, "Mega Label - Expansion of MPLS Label
        Range", draft-li-mpls-mega-label-00, July 2013.

   [3]  Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream Label
        Assignment and Context-Specific Label Space", RFC 5331,
        August 2008.

   [I-D.mahalingam-dutt-dcops-vxlan]
        Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L.,
        Sridhar, T., Bursell, M., and C. Wright, "VXLAN: A Framework
        for Overlaying Virtualized Layer 2 Networks over Layer 3
        Networks", draft-mahalingam-dutt-dcops-vxlan-03 (work in
        progress), February 2013.

   [I-D.sridharan-virtualization-nvgre]
        Sridharan, M., Greenberg, A., Venkataramaiah, N., Wang, Y.,
        Duda, K., Ganga, I., Lin, G., Pearson, M., Thaler, P., and
        C. Tumuluri, "NVGRE: Network Virtualization using Generic
        Routing Encapsulation",
        draft-sridharan-virtualization-nvgre-02 (work in progress),
        February 2013.

Authors' Addresses

   Renwei Li
   Huawei Technologies
   2330 Central Expressway
   Santa Clara, CA 95050
   USA

   Email: renwei.li@huawei.com


   Katherine Zhao
   Huawei Technologies
   2330 Central Expressway
   Santa Clara, CA 95050
   USA

   Email: katherine.zhao@huawei.com


   Zhenbin Li
   Huawei Technologies
   Huawei Campus, No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: lizhenbin@huawei.com