Internet Engineering Task Force (IETF)                   J. Rabadan, Ed.
Request for Comments: 8584                                         Nokia
Updates: 7432                                            S. Mohanty, Ed.
Category: Standards Track                                     A. Sajassi
ISSN: 2070-1721                                                    Cisco
                                                                J. Drake
                                                                 Juniper
                                                              K. Nagaraj
                                                            S. Sathappan
                                                                   Nokia
                                                              April 2019

 Framework for Ethernet VPN Designated Forwarder Election Extensibility

Abstract

   An alternative to the default Designated Forwarder (DF) selection
   algorithm in Ethernet VPNs (EVPNs) is defined.  The DF is the
   Provider Edge (PE) router responsible for sending Broadcast, Unknown
   Unicast, and Multicast (BUM) traffic to a multihomed Customer Edge
   (CE) device on a given VLAN on a particular Ethernet Segment (ES).
   In addition, the ability to influence the DF election result for a
   VLAN based on the state of the associated Attachment Circuit (AC) is
   specified.  This document clarifies the DF election Finite State
   Machine in EVPN services.  Therefore, it updates the EVPN
   specification (RFC 7432).

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   https://www.rfc-editor.org/info/rfc8584.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Conventions and Terminology
      1.2. Default Designated Forwarder (DF) Election in EVPN Services
      1.3. Problem Statement
           1.3.1. Unfair Load Balancing and Service Disruption
           1.3.2. Traffic Black-Holing on Individual AC Failures
      1.4. The Need for Extending the Default DF Election in EVPN
           Services
   2. Designated Forwarder Election Protocol and BGP Extensions
      2.1. The DF Election Finite State Machine (FSM)
      2.2. The DF Election Extended Community
           2.2.1. Backward Compatibility
   3. The Highest Random Weight DF Election Algorithm
      3.1. HRW and Consistent Hashing
      3.2. HRW Algorithm for EVPN DF Election
   4. The AC-Influenced DF Election Capability
      4.1. AC-Influenced DF Election Capability for VLAN-Aware Bundle
           Services
   5. Solution Benefits
   6. Security Considerations
   7. IANA Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   Acknowledgments
   Contributors
   Authors' Addresses

1.  Introduction

   The Designated Forwarder (DF) in Ethernet VPNs (EVPNs) is the
   Provider Edge (PE) router responsible for sending Broadcast, Unknown
   Unicast, and Multicast (BUM) traffic to a multihomed Customer Edge
   (CE) device on a given VLAN on a particular Ethernet Segment (ES).
   The DF is elected from the set of multihomed PEs attached to a given
   ES, each of which advertises an ES route for the ES as identified by
   its Ethernet Segment Identifier (ESI).  By default, EVPN uses a
   DF election algorithm referred to as "service carving".  The DF
   election algorithm is based on a modulus function (V mod N) that
   takes the number of PEs in the ES (N) and the VLAN value (V) as
   input.  This document addresses inefficiencies in the default DF
   election algorithm by defining a new DF election algorithm and an
   ability to influence the DF election result for a VLAN, depending on
   the state of the associated Attachment Circuit (AC).  In order to
   avoid any ambiguity with the identifier used in the DF election
   algorithm, this document uses the term "Ethernet Tag" instead of
   "VLAN".  This document also creates a registry with IANA for future
   DF election algorithms and capabilities (see Section 7).  It also
   presents a formal definition and clarification of the DF election
   Finite State Machine (FSM).  Therefore, this document updates
   [RFC7432], and EVPN implementations MUST conform to the
   prescribed FSM.

   The procedures described in this document apply to DF election in all
   EVPN solutions, including those described in [RFC7432] and [RFC8214].
   Apart from the formal description of the FSM, this document does not
   intend to update other procedures described in [RFC7432]; it only
   aims to improve the behavior of the DF election on PEs that are
   upgraded to follow the procedures described in this document.

1.1.  Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   o  AC: Attachment Circuit.  An AC has an Ethernet Tag associated
      with it.

   o  ACS: Attachment Circuit Status.

   o  BUM: Broadcast, unknown unicast, and multicast.

   o  DF: Designated Forwarder.

   o  NDF: Non-Designated Forwarder.

   o  BDF: Backup Designated Forwarder.

   o  Ethernet A-D per ES route: Refers to Route Type 1 as defined in
      [RFC7432] or to Auto-discovery per Ethernet Segment route.

   o  Ethernet A-D per EVI route: Refers to Route Type 1 as defined in
      [RFC7432] or to Auto-discovery per EVPN Instance route.

   o  ES: Ethernet Segment.

   o  ESI: Ethernet Segment Identifier.

   o  EVI: EVPN Instance.

   o  MAC-VRF: A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE.

   o  BD: Broadcast Domain.  An EVI may be comprised of one BD
      (VLAN-based or VLAN Bundle services) or multiple BDs (VLAN-aware
      Bundle services).

   o  Bridge table: An instantiation of a BD on a MAC-VRF.

   o  HRW: Highest Random Weight.

   o  VID: VLAN Identifier.

   o  CE-VID: Customer Edge VLAN Identifier.

   o  Ethernet Tag: Used to represent a BD that is configured on a given
      ES for the purpose of DF election.  Note that any of the following
      may be used to represent a BD: VIDs (including Q-in-Q tags),
      configured IDs, VNIs (Virtual Extensible Local Area Network
      (VXLAN) Network Identifiers), normalized VIDs, I-SIDs (Service
      Instance Identifiers), etc., as long as the representation of the
      BDs is configured consistently across the multihomed PEs attached
      to that ES.  The Ethernet Tag value MUST be different from zero.

   o  Ethernet Tag ID: Refers to the identifier used in the EVPN routes
      defined in [RFC7432].  Its value may be the same as the Ethernet
      Tag value (see the definition for Ethernet Tag) when advertising
      routes for VLAN-aware Bundle services.  Note that in the case of
      VLAN-based or VLAN Bundle services, the Ethernet Tag ID is zero.

   o  DF election procedure: Also called "DF election".  Refers to the
      process in its entirety, including the discovery of the PEs in the
      ES, the creation and maintenance of the PE candidate list, and the
      selection of a PE.

   o  DF algorithm: A component of the DF election procedure.  Strictly
      refers to the selection of a PE for a given <ES, Ethernet Tag>.

   o  RR: Route Reflector.  A network routing component for BGP
      [RFC4456].  It offers an alternative to the logical full-mesh
      requirement of the Internal Border Gateway Protocol (IBGP).  The
      purpose of the RR is concentration.  Multiple BGP routers can peer
      with a central point, the RR (acting as a route reflector server),
      rather than peer with every other router in a full mesh.  This
      results in an O(N) peering as opposed to O(N^2).

   o  TTL: Time To Live.

   This document also assumes that the reader is familiar with the
   terminology provided in [RFC7432].

1.2.  Default Designated Forwarder (DF) Election in EVPN Services

   [RFC7432] defines the DF as the EVPN PE responsible for:

   o  Flooding BUM traffic on a given Ethernet Tag on a particular ES to
      the CE.  This is valid for Single-Active and All-Active EVPN
      multihoming.

   o  Sending unicast traffic on a given Ethernet Tag on a particular ES
      to the CE.  This is valid for Single-Active multihoming.

   Figure 1 illustrates an example that we will use to explain the DF
   function.

                        +---------------+
                        |   IP/MPLS     |
                        |   Core        |
          +----+ ES1 +----+           +----+
          | CE1|-----|    |           |    |____ES2
          +----+     | PE1|           | PE2|    \
                     |    |           +----+     \+----+
                     +----+             |         | CE2|
                        |             +----+     /+----+
                        |             |    |____/   |
                        |             | PE3|    ES2 /
                        |             +----+       /
                        |               |         /
                        +-------------+----+     /
                                      | PE4|____/ES2
                                      |    |
                                      +----+

                        Figure 1: EVPN Multihoming

   Figure 1 illustrates a case where there are two ESes: ES1 and ES2.
   PE1 is attached to CE1 via ES1, whereas PE2, PE3, and PE4 are
   attached to CE2 via ES2, i.e., PE2, PE3, and PE4 form a redundancy
   group.  Since CE2 is multihomed to different PEs on the same ES, it
   is necessary for PE2, PE3, and PE4 to agree on a DF to satisfy the
   above-mentioned requirements.

   The effect of forwarding loops in a Layer 2 network is particularly
   severe because of the broadcast nature of Ethernet traffic and the
   lack of a TTL.  Therefore, it is very important that, in the case of
   a multihomed CE, only one of the PEs be used to send BUM traffic
   to it.

   One of the prerequisites for this support is that participating PEs
   must agree amongst themselves as to who would act as the DF.  This
   needs to be achieved through a distributed algorithm in which each
   participating PE independently and unambiguously selects one of the
   participating PEs as the DF, and the result should be consistent and
   unanimous.

   The default algorithm for DF election defined by [RFC7432] at the
   granularity of (ESI, EVI) is referred to as "service carving".  In
   this document, service carving and the default DF election algorithm
   are used interchangeably.  With service carving, it is possible to
   elect multiple DFs per ES (one per EVI) in order to perform load
   balancing of traffic destined to a given ES.  The objective is that
   the load-balancing procedures should carve up the BD space among the
   redundant PE nodes evenly, in such a way that every PE is the DF for
   a distinct set of EVIs.

   The DF election algorithm (as described in [RFC7432], Section 8.5) is
   based on a modulus operation.  The PEs to which the ES (for which DF
   election is to be carried out per EVI) is multihomed form an ordered
   (ordinal) list in ascending order by PE IP address value.  For
   example, there are N PEs: PE0, PE1,... PE(N-1) ranked as per
   increasing IP addresses in the ordinal list; then, for each VLAN with
   Ethernet Tag V, configured on ES1, PEx is the DF for VLAN V on ES1
   when x equals (V mod N).  In the case of a VLAN Bundle, only the
   lowest VLAN is used.  In the case when the planned density is high
   (meaning there are a significant number of VLANs and the Ethernet
   Tags are uniformly distributed), the thinking is that the DF election
   will be spread across the PEs hosting that ES and good load balancing
   can be achieved.
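
   As an illustration only, the modulus-based selection described above
   can be sketched in Python.  The function name default_df and the PE
   addresses are hypothetical and do not come from [RFC7432]; the sketch
   also assumes that all candidate PEs use IP addresses of the same
   family.

```python
import ipaddress


def default_df(pe_addresses, ethernet_tag):
    """Return the DF for one Ethernet Tag using (V mod N) service carving."""
    # Ordinal list: candidate PEs in ascending order of IP address value.
    # Assumption: all addresses belong to the same address family.
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    # PEx is the DF when x equals (V mod N).
    return ordered[ethernet_tag % len(ordered)]


# Hypothetical addresses for three PEs attached to the same ES.
es2 = ["192.0.2.4", "192.0.2.2", "192.0.2.3"]
for tag in (10, 11, 12):
    print(tag, default_df(es2, tag))
```

   With three candidate PEs, Ethernet Tags 10, 11, and 12 map to the
   second, third, and first PE in the ordinal list, respectively.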

   However, the described default DF election algorithm has some
   undesirable properties and, in some cases, can be somewhat disruptive
   and unfair.  This document describes some of those issues and defines
   a mechanism for dealing with them.  These mechanisms do involve
   changes to the default DF election algorithm, but they do not require
   any changes to the EVPN route exchange, and changes in the EVPN
   routes will be minimal.

   In addition, there is a need to extend the DF election procedures so
   that new algorithms and capabilities are possible.  A single
   algorithm (the default DF election algorithm) may not meet the
   requirements in all the use cases.

   Note that while [RFC7432] elects a DF per <ES, EVI>, this document
   elects a DF per <ES, BD>.  This means that unlike [RFC7432], where
   for a VLAN-aware Bundle service EVI there is only one DF for the EVI,
   this document specifies that there will be multiple DFs, one for each
   BD configured in that EVI.

1.3.  Problem Statement

   This section describes some potential issues with the default DF
   election algorithm.

1.3.1.  Unfair Load Balancing and Service Disruption

   There are three fundamental problems with the current default DF
   election algorithm.

   1.  The algorithm will not perform well when the Ethernet Tag follows
       a non-uniform distribution, for instance, when the Ethernet Tags
       are all even or all odd.  In such a case, let us assume that the
       ES is multihomed to two PEs; one of the PEs will be elected as
       the DF for all of the VLANs.  This is very suboptimal.  It
       defeats the purpose of service carving, as the DFs are not really
       evenly spread across the PEs hosting the ES.  In fact, in this
       particular case, one of the PEs does not get elected as the DF at
       all, so it does not participate in DF responsibilities at all.
       Consider another example where, referring to Figure 1, let's
       assume that (1) PE2, PE3, and PE4 are listed in ascending order
       by IP address and (2) each VLAN configured on ES2 is associated
       with an Ethernet Tag of the form (3x+1), where x is an integer.
       This will result in PE3 always being selected as the DF.

   2.  The Ethernet Tag that identifies the BD can be as large as 2^24;
       however, it is not guaranteed that the tenant BD on the ES will
       conform to a uniform distribution.  In fact, it is up to the
       customer what BDs they will configure on the ES.  Quoting
       [Knuth]:

          In general, we want to avoid values of M that divide r^k+a or
          r^k-a, where k and a are small numbers and r is the radix of
          the alphabetic character set (usually r=64, 256 or 100), since
          a remainder modulo such a value of M tends to be largely a
          simple superposition of key digits.  Such considerations
          suggest that we choose M to be a prime number such that
          r^k != +/-a (modulo M) for small k and a.

       In our case, N is the number of PEs (Section 8.5 of [RFC7432]).
       N corresponds to M above.  Since N, N-1, or N+1 need not satisfy
       the primality properties of M, as per the modulo-based DF
       assignment [RFC7432], whenever a PE goes down or a new PE boots
       up (attached to the same ES), the modulo scheme will not
       necessarily map BDs to PEs uniformly.

   3.  Disruption is another problem.  Consider a case when the same ES
       is multihomed to a set of PEs.  When the ES is DOWN in one of the
       PEs, say PE1, or PE1 itself reboots, or the BGP process goes down
       or the connectivity between PE1 and an RR goes down, the
       effective number of PEs in the system now becomes N-1, and DFs
       are computed for all the VLANs that are configured on that ES.
       In general, if the DF for a VLAN V happens not to be PE1, but
       some other PE, say PE2, it is likely that some other PE
       (different from PE1 and PE2) will become the new DF.  This is not
       desirable.  Similarly, when a new PE hosts the same ES, the
       mapping again changes because of the modulus operation.  This
       results in needless churn.  Again referring to Figure 1, say V1,
       V2, and V3 are VLANs configured on ES2 with associated Ethernet
       Tags of values 999, 1000, and 1001, respectively.  So, PE1, PE2,
       and PE3 are the DFs for V1, V2, and V3, respectively.  Now when
       PE3 goes down, PE2 will become the DF for V1 and PE1 will become
       the DF for V2.

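   One point to note is that the default DF election algorithm assumes
   that all the PEs who are multihomed to the same ES (and interested in
   the DF election by exchanging EVPN routes) use an Originating
   Router's IP address [RFC7432] of the same family.  This does not need
   to be the case, as the EVPN address family can be carried over an
   IPv4 or IPv6 peering, and the PEs attached to the same ES may use an
   address of either family.

   Mathematically, a conventional hash function maps a key k to a number
   i representing one of m hash buckets through a function h(k), i.e.,
   i = h(k).  In the EVPN case, h is simply a modulo-m hash function,
   viz. h(V) = V mod N, where N is the number of PEs that are multihomed
   to the ES in question.  It is well known that for good hash
   distribution using the modulus operation, the modulus N should be a
   prime number not too close to a power of 2 [CLRS2009].  When the
   effective number of PEs changes from N to N-1 (or vice versa), all
   the objects (VLAN V) will be remapped except those for which V mod N
   and V mod (N-1) refer to the same PE in the previous and subsequent
   ordinal rankings, respectively.  From a forwarding perspective, this
   is a churn, as it results in reprogramming the PE ports as either
   blocking or non-blocking at the PEs where the DF state changes.

   This document addresses this problem and furnishes a solution to this
   undesirable behavior.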

                               +---+
                               |CE4|
                               +---+
                                 |
                            PE4  |
                           +-----+-----+
           +---------------|  +-----+  |---------------+
           |               |  | BD-1|  |               |
           |               +-----------+               |
           |                                           |
           |
   address of either family.

   Mathematically, a conventional hash function maps a key k to a number
   i representing one of m hash buckets through a function h(k), i.e.,
   i = h(k).  In the EVPN                    |
           |                                           |
           | PE1               PE2                PE3  |
           | (NDF)             (DF)               (NDF)|
       +-----------+       +-----------+       +-----------+
       |  | BD-1|  |       |  | BD-1|  |       |  | BD-1|  |
       |  +-----+  |-------|  +-----+  |-------|  +-----+  |
       +-----------+       +-----------+       +-----------+
              AC1\   ES12   /AC2  AC3\   ES23   /AC4
                  \        /          \        /
                   \      /            \      /
                    +----+              +----+
                    |CE12|              |CE23|
                    +----+              +----+

          Figure 2 Default DF Election and Traffic Black-Holing

   BD-1 is defined in PE1, PE2, PE3 and PE4. CE12 case, h is simply a multi-homed CE
   connected modulo-m hash function
   viz. h(V) = V mod N, where N is the number of PEs that are multihomed
   to ES12 the ES in PE1 and PE2. Similarly CE23 question.  It is multi-homed to
   PE2 and PE3 well known that for good hash
   distribution using ES23. Both, CE12 and CE23, are connected the modulus operation, the modulus N should be a
   prime number not too close to BD-1
   through VLAN-based service interfaces: CE12-VID 1 (VLAN ID 1 on CE12)
   is associated a power of 2 [CLRS2009].  When the
   effective number of PEs changes from N to AC1 N-1 (or vice versa), all
   the objects (VLAN V) will be remapped except those for which V mod N
   and AC2 in BD-1, whereas CE23-VID 1 is
   associated V mod (N-1) refer to AC3 the same PE in the previous and AC4 subsequent
   ordinal rankings, respectively.  From a forwarding perspective, this
   is a churn, as it results in BD-1. Assume that, although not
   represented, there are other ACs defined on these ES mapped to
   different BDs.

   After executing reprogramming the PE ports as either
   blocking or non-blocking at the PEs where the [RFC7432] Default DF election algorithm, PE2
   turns out state changes.

   This document addresses this problem and furnishes a solution to be this
   undesirable behavior.
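
   The remapping caused by the modulus operation can be illustrated
   with a minimal sketch.  It reuses the Ethernet Tag values 999, 1000,
   and 1001 from the example above and assumes that the candidate list
   is already ordered so that PE1, PE2, and PE3 hold ordinals 0, 1, and
   2.  The helper name default_df and the PE labels are hypothetical and
   serve only for illustration.

```python
# Illustrative sketch only: models h(V) = V mod N over an ordered
# candidate list.  The function name and PE labels are hypothetical.

def default_df(ethernet_tag, candidates):
    """Return the PE whose ordinal equals (V mod N)."""
    return candidates[ethernet_tag % len(candidates)]


vlans = {"V1": 999, "V2": 1000, "V3": 1001}

# DF assignment while PE1, PE2, and PE3 are all attached to the ES.
before = {name: default_df(tag, ["PE1", "PE2", "PE3"])
          for name, tag in vlans.items()}

# DF assignment after PE3 goes down (N changes from 3 to 2).
after = {name: default_df(tag, ["PE1", "PE2"])
         for name, tag in vlans.items()}

# VLANs whose DF moved although their previous DF is still alive.
needless_moves = [name for name in vlans
                  if before[name] != "PE3" and before[name] != after[name]]

print(before)
print(after)
print(needless_moves)
```

   In this sketch, only V3 needs a new DF when PE3 fails, yet V1 and V2
   are remapped as well; this is the churn described above.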

1.3.2.  Traffic Black-Holing on Individual AC Failures

   The default DF election algorithm defined by [RFC7432] takes into
   account only two variables in the modulus function for a given ES:
   the existence of the PE's IP address in the candidate list and the
   locally provisioned Ethernet Tags.

   If the DF for an <ESI, EVI> fails (due to physical link/node
   failures), an ES route withdrawal will make the NDF PEs re-elect the
   DF for that <ESI, EVI> and the service will be recovered.

   However, the default DF election procedure does not provide
   protection against "logical" failures or human errors that may occur
   at the service level on the DF, while the list of active PEs for a
   given ES does not change.  These failures may have an impact not only
   on the local PE where the issue happens but also on the rest of the
   PEs of the ES.  Some examples of such logical failures are listed
   below:

   (a)  A given individual AC defined in an ES is accidentally shut down
        or is not provisioned yet (hence, the ACS is DOWN), while the ES
        is operationally active (since the ES route is active).

   (b)  A given MAC-VRF with a defined ES is either shut down or not
        provisioned yet, while the ES is operationally active (since the
        ES route is active).  In this case, the ACS of all the ACs
        defined in that MAC-VRF is considered to be DOWN.

   Neither (a) nor (b) will trigger the DF re-election on the remote
   multihomed PEs for a given ES, since the ACS is not taken into
   account in the DF election procedures.  While the ACS is used as a DF
   election tiebreaker and trigger in Virtual Private LAN Service (VPLS)
   multihoming procedures [VPLS-MH], there is no procedure defined in
   the EVPN specification [RFC7432] to trigger the DF re-election based
   on the ACS change on the DF.

   Figure 2 shows an example of logical AC failure.

                               +---+
                               |CE4|
                               +---+
                                 |
                            PE4  |
                           +-----+-----+
           +---------------|  +-----+  |---------------+
           |               |  | BD-1|  |               |
           |               +-----------+               |
           |                                           |
           |                   EVPN                    |
           |                                           |
           | PE1               PE2                PE3  |
           | (NDF)             (DF)               (NDF)|
       +-----------+       +-----------+       +-----------+
       |  | BD-1|  |       |  | BD-1|  |       |  | BD-1|  |
       |  +-----+  |-------|  +-----+  |-------|  +-----+  |
       +-----------+       +-----------+       +-----------+
              AC1\   ES12   /AC2  AC3\   ES23   /AC4
                  \        /          \        /
                   \      /            \      /
                    +----+              +----+
                    |CE12|              |CE23|
                    +----+              +----+

          Figure 2: Default DF Election and Traffic Black-Holing

   BD-1 is defined in PE1, PE2, PE3, and PE4.  CE12 is a multihomed CE
   connected to ES12 in PE1 and PE2.  Similarly, CE23 is multihomed to
   PE2 and PE3 using ES23.  Both CE12 and CE23 are connected to BD-1
   through VLAN-based service interfaces: CE12-VID 1 (VID 1 on CE12) is
   associated with AC1 and AC2 in BD-1, whereas CE23-VID 1 is associated
   with AC3 and AC4 in BD-1.  Assume that, although not represented,
   there are other ACs defined on these ESes mapped to different BDs.

   After executing the default DF election algorithm as described in
   [RFC7432], PE2 turns out to be the DF for ES12 and ES23 in BD-1.  The
   following issues may arise:

   (a)  If AC2 is accidentally shut down or is not configured yet, CE12
        traffic will be impacted.  In the case of All-Active
        multihoming, the BUM traffic to CE12 will be "black-holed",
        whereas for Single-Active multihoming, all the traffic to/from
        CE12 will be discarded.  This is because a logical failure in
        PE2's AC2 may not trigger an ES route withdrawal for ES12 (since
        there are still other ACs active on ES12); therefore, PE1 will
        not rerun the DF election procedures.

   (b)  If the bridge table for BD-1 is administratively shut down or is
        not configured yet on PE2, CE12 and CE23 will both be impacted:
        BUM traffic to both CEs will be discarded in the case of
        All-Active multihoming, and all traffic will be discarded
        to/from the CEs in the case of Single-Active multihoming.  This
        is because PE1 and PE3 will not rerun the DF election procedures
        and will keep assuming that PE2 is the DF.

   Quoting [RFC7432], "When an Ethernet tag is decommissioned on an
   Ethernet segment, then the PE MUST withdraw the Ethernet A-D per EVI
   route(s) announced for the <ESI, Ethernet tags> that are impacted by
   the decommissioning."  However, while this A-D per EVI route
   withdrawal is used at the remote PEs performing aliasing or backup
   procedures, it is not used to influence the DF election for the
   affected EVIs.

   This document adds an optional modification of the DF election
   procedure so that the ACS may be taken into account as a variable in
   the DF election; therefore, EVPN can provide protection against
   logical failures.

1.4.  The Need for Extending the Default DF Election in EVPN Services

   Section 1.3 describes some of the issues that exist in the default DF
   election procedures.  In order to address those issues, this document
   introduces a new DF election framework.  This framework allows the
   PEs to agree on a common DF election algorithm, as well as the
   capabilities to enable during the DF election procedure.  Generally,
   "DF election algorithm" refers to the algorithm by which a number of
   input parameters are used to determine the DF PE, while "DF election
   capability" refers to an additional feature that can be used prior to
   the invocation of the DF election algorithm, such as modifying the
   inputs (or list of candidate PEs).

   Within this framework, this document defines a new DF election
   algorithm and a new capability that can influence the DF election
   result:

   o  The new DF election algorithm is referred to as "Highest Random
      Weight" (HRW).  The HRW procedures are described in Section 3.

   o  The new DF election capability is referred to as "AC-Influenced DF
      election" (AC-DF).  The AC-DF procedures are described in
      Section 4.

   o  HRW and AC-DF mechanisms are independent of each other.
      Therefore, a PE may support either HRW or AC-DF independently or
      may support both of them together.  A PE may also support the
      AC-DF capability along with the default DF election algorithm per
      [RFC7432].

   In addition, this document defines a way to indicate the support of
   HRW and/or AC-DF along with the EVPN ES routes advertised for a given
   ES.  Refer to Section 2.2 for more details.

2.  Designated Forwarder Election Protocol and BGP Extensions

   This section describes the BGP extensions required to support the new
   DF election procedures.  In addition, since the EVPN specification
   [RFC7432] leaves several questions open as to the precise FSM
   behavior of the DF election, Section 2.1 precisely describes the
   intended behavior.

2.1.  The DF Election Finite State Machine (FSM)

   Per [RFC7432], the FSM shown in Figure 3 is executed per <ES, VLAN>
   in the case of VLAN-based service or <ES, [VLANs in VLAN Bundle]> in
   the case of a VLAN Bundle on each participating PE.  Note that the
   FSM is conceptual.  Any design or implementation MUST comply with
   behavior that is equivalent to the behavior outlined in this FSM.

                    VLAN_CHANGE                VLAN_CHANGE
                    RCVD_ES                    RCVD_ES
                    LOST_ES                    LOST_ES
                    +----+                     +-------+
                    |    v                     |       v
                    |  +-+----+   ES_UP       ++-------++
                    +->+ INIT +-------------->+ DF_WAIT |
                       ++-----+               +-------+-+
                        ^                             |
    +-----------+       |                             |DF_TIMER
    | ANY_STATE +-------+         VLAN_CHANGE         |
    +-----------+ ES_DOWN    +-----------------+      |
                             |    RCVD_ES      v      v
                    +--------++   LOST_ES     ++------+-+
                    | DF_DONE +<--------------+ DF_CALC +<-+
                    +---------+   CALCULATED  +-------+-+  |
                                                      |    |
                                                      +----+
                                                      VLAN_CHANGE
                                                      RCVD_ES
                                                      LOST_ES

               Figure 3: DF Election Finite State Machine

   Observe that each EVI is locally configured on each of the multihomed
   PEs attached to a given ES and that the FSM does not provide any
   protection against inconsistent configuration between these PEs.
   That is, for a given EVI, one or more of the PEs are inadvertently
   configured with a different set of VLANs for a VLAN-aware Bundle
   service or with different VLANs for a VLAN-based service.

   The states and events shown in Figure 3 are defined as follows.

   States:

   1.  INIT: Initial state.

   2.  DF_WAIT: State in which the participant waits for enough
       information to perform the DF election for the EVI/ESI/VLAN
       combination.

   3.  DF_CALC: State in which the new DF is recomputed.

   4.  DF_DONE: State in which the corresponding DF for the EVI/ESI/VLAN
       combination has been elected.

   5.  ANY_STATE: Refers to any of the above states.

   Events:

   1.  ES_UP: The ES has been locally configured as "UP".

   2.  ES_DOWN: The ES has been locally configured as "DOWN".

   3.  VLAN_CHANGE: The VLANs configured in a bundle (that uses the ES)
       changed.  This event is necessary for VLAN Bundles only.

   4.  DF_TIMER: DF Wait timer [RFC7432] (referred to as "Wait timer" in this
       document) has expired.

   5.  RCVD_ES: A new or changed ES route is received in an Update
       message with an MP_REACH_NLRI.  Receiving an unchanged Update
       MUST NOT trigger this event.

   6.  LOST_ES: An Update message with an MP_UNREACH_NLRI for a
       previously received ES route has been received.  If such a
       message is seen for a route that has not been advertised
       previously, the event MUST NOT be triggered.

   7.  CALCULATED: DF has been successfully calculated.

   Corresponding actions when transitions are performed or states are
   entered/exited:

   1.   ANY_STATE on ES_DOWN:
        (i) Stop the DF Wait timer.
        (ii) Assume an NDF for the local PE.

   2.   INIT on ES_UP: Transition to DF_WAIT.

   3.   INIT on VLAN_CHANGE, RCVD_ES, or LOST_ES: Do nothing.

   4.   DF_WAIT on entering the state:
        (i) Start the DF Wait timer if not started already or expired.
        (ii) Assume an NDF for the local PE.

   5.   DF_WAIT on VLAN_CHANGE, RCVD_ES, or LOST_ES: Do nothing.

   6.   DF_WAIT on DF_TIMER: Transition to DF_CALC.

   7.   DF_CALC on entering or re-entering the state:
        (i) Rebuild the candidate list, perform a hash, and perform the
        election.
        (ii) Afterwards, the FSM generates a CALCULATED event against
        itself.

   8.   DF_CALC on VLAN_CHANGE, RCVD_ES, or LOST_ES: Do as prescribed in
        Transition 7.

   9.   DF_CALC on CALCULATED: Mark the election result for the VLAN or
        bundle, and transition to DF_DONE.

   10.  DF_DONE on exiting the state: If a new DF election is triggered
        and the current DF is lost, then assume an NDF for the local PE
        for the VLAN or VLAN Bundle.

   11.  DF_DONE on VLAN_CHANGE, RCVD_ES, or LOST_ES: Transition to
        DF_CALC.

   The above events and transitions are defined for the default DF
   election algorithm.  As described in Section 4, the use of the AC-DF
   capability introduces additional events and transitions.
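
   The states, events, and transitions listed above can be sketched as
   a small event handler.  This is a conceptual illustration, not a
   normative implementation: the class name, the attribute names, and
   the elect callback are hypothetical, and the election is run
   synchronously, so DF_CALC is only a momentary state and the
   re-entry case (Transition 8) and the interim NDF assumption on
   leaving DF_DONE (Transition 10) are not modeled.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    DF_WAIT = auto()
    DF_CALC = auto()
    DF_DONE = auto()


RECALC_EVENTS = {"VLAN_CHANGE", "RCVD_ES", "LOST_ES"}


class DfElectionFsm:
    """Conceptual per-<ES, VLAN> FSM; names are illustrative only."""

    def __init__(self, elect):
        # elect: hypothetical callback that rebuilds the candidate list,
        # runs the election, and returns True if the local PE is the DF.
        self._elect = elect
        self.state = State.INIT
        self.is_df = False           # the local PE starts as NDF
        self.timer_running = False

    def handle(self, event):
        if event == "ES_DOWN":                   # Transition 1
            self.timer_running = False
            self.is_df = False
            self.state = State.INIT
        elif self.state is State.INIT:
            if event == "ES_UP":                 # Transition 2
                self.state = State.DF_WAIT
                self.timer_running = True        # Transition 4 (i)
                self.is_df = False               # Transition 4 (ii)
            # Transition 3: all other events are ignored.
        elif self.state is State.DF_WAIT:
            if event == "DF_TIMER":              # Transition 6
                self.timer_running = False
                self._calculate()
            # Transition 5: all other events are ignored.
        elif self.state is State.DF_DONE:
            if event in RECALC_EVENTS:           # Transition 11
                self._calculate()

    def _calculate(self):
        self.state = State.DF_CALC               # Transition 7
        self.is_df = self._elect()
        self.state = State.DF_DONE               # Transition 9


fsm = DfElectionFsm(elect=lambda: True)
for ev in ("ES_UP", "RCVD_ES", "DF_TIMER"):
    fsm.handle(ev)
print(fsm.state, fsm.is_df)
```

   The RCVD_ES event received while in DF_WAIT is deliberately ignored:
   the election runs only once the Wait timer has expired.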

2.2.  The DF Election Extended Community

   For the DF election procedures to be consistent and unanimous, it is
   necessary that all the participating PEs agree on the DF election
   algorithm and capabilities to be used.  For instance, it is not
   possible for some PEs to continue to use the default DF election
   algorithm while some PEs use HRW.  For brownfield deployments and for
   interoperability with legacy PEs, it is important that all PEs have
   the ability to fall back on the default DF election.  A PE can
   indicate its willingness to support HRW and/or AC-DF by signaling a
   DF Election Extended Community along with the ES route (Route
   Type 4).

   The DF Election Extended Community is a new BGP transitive Extended
   Community attribute [RFC4360] that is defined to identify the DF
   election procedure to be used for the ES.  Figure 4 shows the
   encoding of the DF Election Extended Community.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Type = 0x06   | Sub-Type(0x06)| RSV |  DF Alg |    Bitmap     ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~     Bitmap    |            Reserved                           |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 4: DF Election Extended Community

   Where:

   o  Type: 0x06, as registered with IANA (Section 7) for EVPN Extended
      Communities.

   o  Sub-Type: 0x06.  "DF Election Extended Community", as registered
      with IANA.

   o  RSV/Reserved: Reserved bits for information that is specific to
      DF Alg.

   o  DF Alg (5 bits): Encodes the DF election algorithm values (between
      0 and 31) that the advertising PE desires to use for the ES.  This
      document creates an IANA registry called "DF Alg" (Section 7),
      which contains the following values:

      -  Type 0: Default DF election algorithm, or modulus-based
         algorithm as defined in [RFC7432].

      -  Type 1: HRW Algorithm (Section 3).

      -  Types 2-30: Unassigned.

      -  Type 31: Reserved for Experimental Use.

   o  Bitmap (2 octets): Encodes "capabilities" to use with the DF
      election algorithm in the DF Alg field.  This document creates an
      IANA registry (Section 7) for the Bitmap field, with values 0-15.
      This registry is called "DF Election Capabilities" and includes
      the bit values listed below.

                              1 1 1 1 1 1
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | |A|                           |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 5: Bitmap Field in the DF Election Extended Community

      -  Bit 0 (corresponds to Bit 24 of the DF Election Extended
         Community): Unassigned.

      -  Bit 1: AC-DF Capability (AC-Influenced DF election; see
         Section 4).  When set to 1, it indicates the desire to use
         AC-DF with the rest of the PEs in the ES.

      -  Bits 2-15: Unassigned.
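
   As an illustration of the layout in Figures 4 and 5, the following
   sketch packs and unpacks the 8-octet Extended Community.  The
   function and constant names are hypothetical.  The sketch assumes
   that the bit numbering in Figure 5 runs from the most significant
   bit (Bit 0) to the least significant bit (Bit 15) and that the RSV
   and Reserved fields are sent as zero.

```python
import struct

EVPN_EC_TYPE = 0x06          # Type field from Figure 4
DF_ELECTION_SUB_TYPE = 0x06  # Sub-Type field from Figure 4
AC_DF_BIT = 1 << (15 - 1)    # Bitmap Bit 1, counted from the MSB (0x4000)

# Type (1 octet), Sub-Type (1), RSV + DF Alg (1), Bitmap (2), Reserved (3)
_LAYOUT = "!BBBH3s"


def encode_df_election_ec(df_alg, bitmap):
    """Build the 8-octet community; RSV and Reserved are left as zero."""
    if not 0 <= df_alg <= 31:
        raise ValueError("DF Alg is a 5-bit field")
    if not 0 <= bitmap <= 0xFFFF:
        raise ValueError("Bitmap is a 2-octet field")
    return struct.pack(_LAYOUT, EVPN_EC_TYPE, DF_ELECTION_SUB_TYPE,
                       df_alg, bitmap, b"\x00\x00\x00")


def decode_df_election_ec(data):
    """Return (df_alg, bitmap); the RSV and Reserved bits are ignored."""
    ec_type, sub_type, rsv_alg, bitmap, _reserved = struct.unpack(_LAYOUT, data)
    if (ec_type, sub_type) != (EVPN_EC_TYPE, DF_ELECTION_SUB_TYPE):
        raise ValueError("not a DF Election Extended Community")
    return rsv_alg & 0x1F, bitmap


community = encode_df_election_ec(df_alg=1, bitmap=AC_DF_BIT)
print(community.hex())
print(decode_df_election_ec(community))
```

   Here DF Alg 1 (HRW) is signaled together with the AC-DF capability.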

   The DF Election Extended Community is used as follows:

   o  A PE SHOULD attach the DF Election Extended Community to any
      advertised ES route, and the Extended Community MUST be sent if
      the ES is locally configured with a DF election algorithm other
      than the default DF election algorithm or if a capability is
      required to be used.  In the Extended Community, the PE indicates
      the desired "DF Alg" algorithm and "Bitmap" capabilities to be
      used for the ES.

      -  Only one DF Election Extended Community can be sent along with
         an ES route.  Note that the intent is not for the advertising
         PE to indicate all the supported DF election algorithms and
         capabilities but to signal the preferred one.

      -  DF Alg values 0 and 1 can both be used with Bit 1 (AC-DF) set
         to 0 or 1.

      -  In general, a specific DF Alg SHOULD determine the use of the
         reserved bits in the Extended Community, which may be used in a
         different way for a different DF Alg.  In particular, for DF
         Alg values 0 and 1, the reserved bits are not set by the
         advertising PE and SHOULD be ignored by the receiving PE.

   o  When a PE receives the ES routes from all the other PEs for the ES
      in question, it checks to see if all the advertisements have the
      Extended Community with the same DF Alg and Bitmap:

      -  If they do, this particular PE MUST follow the procedures for
         the advertised DF Alg and capabilities.  For instance, if all
         ES routes for a given ES indicate DF Alg HRW and AC-DF set
         to 1, then the receiving PE and all the other PEs attached to
         the ES will perform the DF election as per the HRW algorithm
         and following the AC-DF procedures.

      -  Otherwise, if even a single advertisement for Route Type 4 is
         received without the locally configured DF Alg and capability,
         the default DF election algorithm MUST be used as prescribed in
         [RFC7432].  This procedure handles the case where participating
         PEs in the ES disagree about the DF algorithm and capability to
         be applied.

      -  The absence of the DF Election Extended Community or the
         presence of multiple DF Election Extended Communities (in the
         same route) MUST be interpreted by a receiving PE as an
         indication of the default DF election algorithm on the sending
         PE; that is, DF Alg 0 and no DF election capabilities.

   o  When all the PEs in an ES advertise DF Type 31, they will rely on
      the local policy to decide how to proceed with the DF election.

   o  For any new capability defined in the future, the applicability/
      compatibility of this new capability to/with the existing DF Alg
      values must be assessed on a case-by-case basis.

   o  Likewise, for any new DF Alg defined in the future, its
      applicability/compatibility to/with the existing capabilities must
      be assessed on a case-by-case basis.
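
   The agreement check described in the list above can be sketched as
   follows.  The function name and the data representation are
   assumptions made for illustration: each received ES route is modeled
   as the list of (DF Alg, Bitmap) pairs carried in its DF Election
   Extended Communities, and the local-policy handling for DF Alg 31 is
   not modeled.

```python
DEFAULT_ELECTION = (0, 0)  # DF Alg 0 (modulus-based), no capabilities


def agreed_df_election(local, received_routes):
    """Return the (df_alg, bitmap) pair that this PE should apply.

    local: the locally configured (df_alg, bitmap) pair.
    received_routes: one list per ES route received from another PE,
        holding the (df_alg, bitmap) pairs of the DF Election Extended
        Communities attached to that route.
    """
    for communities in received_routes:
        if len(communities) == 1:
            advertised = communities[0]
        else:
            # An absent community, or more than one in the same route,
            # is read as DF Alg 0 with no capabilities.
            advertised = DEFAULT_ELECTION
        if advertised != local:
            # A single mismatch makes the PE fall back to the default.
            return DEFAULT_ELECTION
    return local


HRW_WITH_AC_DF = (1, 0x4000)

print(agreed_df_election(HRW_WITH_AC_DF, [[(1, 0x4000)], [(1, 0x4000)]]))
print(agreed_df_election(HRW_WITH_AC_DF, [[(1, 0x4000)], []]))
```

   In the second call, one peer advertises no Extended Community (for
   instance, an implementation that supports [RFC7432] only), so the
   sketch falls back to the default election.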

2.2.1.  Backward Compatibility

   Implementations that comply with [RFC7432] only (i.e., those
   implementations that predate this specification) will not advertise
   the DF Election Extended Community.  That means that all other
   participating PEs in the ES will not receive DF preferences and will
   revert to the default DF election algorithm without AC-DF.

   Similarly, an implementation that complies with [RFC7432] only and
   that receives a DF Election Extended Community will ignore it and
   will continue to use the default DF election algorithm.

2.3.  Auto-Derivation of ES-Import Route Target

   Section 7.6 of [RFC7432] describes how the value of the ES-Import
   Route Target for ESI types 1, 2, and 3 can be auto-derived by using
   the high-order six bytes of the nine byte ESI value. The same auto-
   derivation procedure can be extended to ESI types 0, 4, and 5 as long
   as it is ensured that the auto-derived values for ES-Import RT among
   different ES types don't overlap. As in [RFC7432], the mechanism to
   guarantee that the auto-derived ESI or ES-import RT values for
   different ESIs do not match is out of scope of this document.

3.  The Highest Random Weight DF Election Algorithm

   The procedure discussed in this section is applicable to the DF
   election in EVPN services [RFC7432] and the EVPN Virtual Private Wire
   Service (VPWS) [RFC8214].

   HRW as defined in [HRW1999] is originally proposed in the context of
   Internet caching and proxy server load balancing.  Given an object
   name and a set of servers, HRW maps a request to a server using the
   object-name (object-id) and server-name (server-id) rather than the
   server states.  HRW forms a hash out of the server-id and the
   object-id and forms an ordered list of the servers for the particular
   object-id.  The server for which the hash value is highest serves as
   the primary server responsible for that particular object, and the
   server with the next-highest value in that hash serves as the backup
   server.  HRW always maps a given object name to the same server
   within a given cluster; consequently, it can be used at client sites
   to achieve global consensus on object-to-server mappings.  When that
   server goes down, the backup server becomes the responsible
   designate.

   Choosing an appropriate hash function that is statistically oblivious
   to the key distribution and imparts a good uniform distribution of
   the hash output is an important aspect of the algorithm.
   Fortunately, many such hash functions exist.  [HRW1999] provides
   pseudorandom functions based on the Unix utilities rand and srand and
   easily constructed XOR functions that satisfy the desired hashing
   properties.  HRW already finds use in multicast and ECMP [RFC2991]
   [RFC2992].

3.1.  HRW and Consistent Hashing

   HRW is not the only algorithm that addresses the object-to-server
   mapping problem with goals of fair load distribution, redundancy, and
   fast access.  There is another family of algorithms that also
   addresses this problem; these fall under the umbrella of the
   Consistent Hashing Algorithms [CHASH].  These will not be considered
   here.

3.2.  HRW Algorithm for EVPN DF Election

   This section describes the application of HRW to DF election.  Let
   DF(V) denote the DF and BDF(V) denote the BDF for the Ethernet Tag V;
   Si is the IP address of PE i; Es is the ESI; and Weight is a function
   of V, Si, and Es.

   Note that while the DF election algorithm provided in [RFC7432] uses
   a PE address and VLAN as inputs, this document uses an Ethernet Tag,
   PE address, and ESI as inputs.  This is because if the same set of
   PEs are multihomed to the same set of ESes, then the DF election
   algorithm used in [RFC7432] would result in the same PE being elected
   DF for the same set of BDs on each ES; this could have adverse
   side effects on both load balancing and redundancy.  Including an ESI
   in the DF election algorithm introduces additional entropy, which
   significantly reduces the probability of the same PE being elected DF
   for the same set of BDs on each ES.  Therefore, when using the HRW
   algorithm for EVPN DF election, the ESI value in the Weight function
   below SHOULD be set to that of the corresponding ES.

   In the case of a VLAN Bundle service, V denotes the lowest VLAN,
   similar to the "lowest VLAN in bundle" logic of [RFC7432].

   1.  DF(V) = Si| Weight(V, Es, Si) >= Weight(V, Es, Sj), for all j.
       In the case of a tie, choose the PE whose IP address is
       numerically the least.  Note that 0 <= i,j < number of PEs in the
       redundancy group.

   2.  BDF(V) = Sk| Weight(V, Es, Si) >= Weight(V, Es, Sk), and
       Weight(V, Es, Sk) >= Weight(V, Es, Sj).  In the case of a tie,
       choose the PE whose IP address is numerically the least.

   Where:

   o  DF(V) is defined to be the address Si (index i) for which
      Weight(V, Es, Si) is the highest; 0 <= i <= N-1.

   o  BDF(V) is defined as that PE with address Sk for which the
      computed Weight is the next highest after the Weight of the DF.
      j is the running index from 0 to N-1; i and k are selected values.
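
   The two selection rules above can be sketched as follows.  This is a
   minimal illustration, not a reference implementation: the elect
   function and the Weight type alias are hypothetical names, the
   Weight function is supplied by the caller so that the sketch does
   not depend on a particular hash, and PE addresses are compared by
   their integer value when breaking ties.

```python
import ipaddress
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (V, Es, Si as an integer) -> weight
Weight = Callable[[int, bytes, int], int]


def elect(v: int, es: bytes, pes: List[str],
          weight: Weight) -> Tuple[str, Optional[str]]:
    """Return (DF, BDF) for Ethernet Tag v on the ES identified by es.

    `pes` is the non-empty list of candidate PE IP addresses.
    """
    def rank(pe: str) -> Tuple[int, int]:
        si = int(ipaddress.ip_address(pe))
        # Highest weight first; on a tie, numerically lowest address.
        return (-weight(v, es, si), si)

    ranked = sorted(pes, key=rank)
    df = ranked[0]
    bdf = ranked[1] if len(ranked) > 1 else None
    return df, bdf
```

   With a constant weight function, every PE ties, so the sketch picks
   the numerically lowest address as the DF and the next lowest as the
   BDF.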

   Since the Weight is a pseudorandom function with the domain as the
   three-tuple (V, Es, S), it is an efficient and deterministic
   algorithm that is independent of the Ethernet Tag V sample space
   distribution.  Choosing a good hash function for the pseudorandom
   function is an important consideration for this algorithm to perform
   better than the default algorithm.  As mentioned previously, such
   functions are described in [HRW1999].  We take as a candidate hash
   function the first one out of the two that are listed as preferred in
   [HRW1999]:

      Wrand(V, Es, Si) = (1103515245((1103515245.Si+12345) XOR
      D(V, Es))+12345)(mod 2^31)

   Here, D(V, Es) is the 31-bit digest (CRC-32 and discarding the
   most significant bit (MSB), as noted in [HRW1999]) of the 14-octet
   stream (the 4-octet Ethernet Tag V followed by the 10-octet ESI).  It
   is mandated that the 14-octet stream be formed by the concatenation
   of the Ethernet Tag and the ESI in network byte order.  The CRC
   should proceed as if the stream is in network byte order
   (big-endian).  Si is the address of the ith server.  The server's
   IP address length does not matter, as only the low-order 31 bits are
   modulo significant.
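
   The Wrand computation can be sketched in Python as follows.  The
   sketch rests on several assumptions that the text above does not
   settle: it uses the CRC-32 variant implemented by zlib.crc32,
   it interprets "discarding the MSB" as clearing bit 31 of the CRC,
   and it takes Si to be the integer value of the PE's IP address.  The
   names digest31 and wrand are hypothetical.

```python
import ipaddress
import struct
import zlib

MASK31 = 0x7FFFFFFF  # keeps the low-order 31 bits


def digest31(ethernet_tag: int, esi: bytes) -> int:
    """D(V, Es): CRC-32 of the 14-octet stream with the MSB cleared."""
    if len(esi) != 10:
        raise ValueError("ESI must be exactly 10 octets")
    # 4-octet Ethernet Tag followed by the 10-octet ESI, big-endian.
    stream = struct.pack("!I", ethernet_tag) + esi
    return zlib.crc32(stream) & MASK31


def wrand(ethernet_tag: int, esi: bytes, pe_address: str) -> int:
    """Wrand(V, Es, Si), following the formula term by term."""
    si = int(ipaddress.ip_address(pe_address))
    inner = (1103515245 * si + 12345) ^ digest31(ethernet_tag, esi)
    return (1103515245 * inner + 12345) % (1 << 31)
```

   Because the final result is reduced modulo 2^31, two addresses that
   share the same low-order 31 bits (for example, 0.0.0.1 and
   128.0.0.1) yield the same weight in this sketch, which matches the
   remark that only those bits are significant.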

   A point to note is that the Weight function takes into consideration
   the combination of the Ethernet Tag, the ES, and the PE IP address,
   and the actual length of the server IP address (whether IPv4 or IPv6)
   is not really relevant.  The default algorithm defined in [RFC7432]
   cannot employ both IPv4 and IPv6 PE addresses, since [RFC7432] does
   not specify how to decide on the ordering (the ordinal list) when
   both IPv4 and IPv6 PEs are present.

   HRW solves the disadvantages pointed out in Section 1.3.1 of this
   document and ensures that:

   o  With very high probability, the task of DF election for the VLANs
      configured on an ES is more or less equally distributed among the
      PEs, even in the case of two PEs (see the first fundamental
      problem listed in Section 1.3.1).

   o  If a PE that is not the DF or the BDF for that VLAN goes down or
      its connection to the ES goes down, it does not result in a DF or
      BDF reassignment.  This saves computation, especially in the case
      when the connection flaps.

   o  More importantly, it avoids the third fundamental problem listed
      in Section 1.3.1 (needless disruption) that is inherent in the
      existing default DF election.

   o  In addition to the DF, the algorithm also furnishes the BDF, which
      would be the DF if the current DF fails.

4.  The AC-Influenced DF Election Capability

   The procedure discussed in this section is applicable to the DF
   election in EVPN services [RFC7432] and EVPN VPWS [RFC8214].

   The AC-DF capability is expected to be generally applicable to any
   future DF algorithm.  It modifies the DF election procedures by
   removing from consideration any candidate PE in the ES that cannot
   forward traffic on the AC that belongs to the BD.  This section is
   applicable to VLAN-based and VLAN Bundle service interfaces.
   Section 4.1 describes the procedures for VLAN-aware Bundle service
   interfaces.

   In particular, when used with the default DF algorithm, the AC-DF
   capability modifies Step 3 in the DF election procedure described in
   [RFC7432], Section 8.5, as follows:

   3. When the timer expires, each PE builds an ordered candidate list
      of the IP addresses of all the PE nodes attached to the ES
      (including itself), in increasing numeric value.  The candidate
      list is based on the Originating Router's IP addresses of the ES
      routes but excludes any PE from whom no Ethernet A-D per ES route
      has been received or from whom the route has been withdrawn.
      Afterwards, the DF election algorithm is applied on a
      per-<ES, Ethernet Tag> basis; however, the IP address for a PE
      will not be considered to be a candidate for a given
      <ES, Ethernet Tag> until the corresponding Ethernet A-D per EVI
      route has been received from that PE.  In other words, the ACS on
      the ES for a given PE must be UP so that the PE is considered to
      be a candidate for a given BD.

      If the default DF algorithm is used, every PE in the resulting
      candidate list is then given an ordinal indicating its position in
      the ordered list, starting with 0 as the ordinal for the PE with
      the numerically lowest IP address.  The ordinals are used to
      determine which PE node will be the DF for a given Ethernet Tag on
      the ES, using the following rule:

      Assuming a redundancy group of N PE nodes, for VLAN-based service,
      the PE with ordinal i is the DF for an <ES, Ethernet Tag V> when
      (V mod N) = i.  In the case of a VLAN (-aware) Bundle service,
      the numerically lowest VLAN value in that bundle on that ES MUST
      be used in the modulo function as the Ethernet Tag.

      It should be noted that using the Originating Router's IP Address
      field [RFC7432] in the ES route to get the PE IP address needed
      for the ordered list allows for a CE to be multihomed across
      different Autonomous Systems (ASes) if such a need ever arises.

   The modified Step 3, above, differs from [RFC7432], Section 8.5,
   Step 3 in two ways:

   o  Any DF algorithm can be used, not only the described modulus-based
      DF Alg (referred to as the default DF election or "DF Alg 0" in
      this document).

   o  The candidate list is pruned based upon non-receipt of Ethernet
      A-D routes: a PE's IP address MUST be removed from the ES
      candidate list if its Ethernet A-D per ES route is withdrawn.  A
      PE's IP address MUST NOT be considered to be a candidate DF for an
      <ES, Ethernet Tag> if its Ethernet A-D per EVI route for the
      <ES, Ethernet Tag> is withdrawn.

   The following example illustrates the AC-DF behavior applied to the
   default DF election algorithm, assuming the network in Figure 2:

   (a)  When PE1 and PE2 discover ES12, they advertise an ES route for
        ES12 with the associated ES-Import Extended Community and the DF
        Election Extended Community indicating AC-DF = 1; they start a
        DF Wait timer (independently).  Likewise, PE2 and PE3 advertise
        an ES route for ES23 with AC-DF = 1 and start a DF Wait timer.

   (b)  PE1 and PE2 advertise an Ethernet A-D per ES route for ES12.
        PE2 and PE3 advertise an Ethernet A-D per ES route for ES23.

   (c)  In addition, PE1, PE2, and PE3 advertise an Ethernet A-D per EVI
        route for AC1, AC2, AC3, and AC4 as soon as the ACs are enabled.
        Note that the AC can be associated with a single customer VID
        (e.g., VLAN-based service interfaces) or a bundle of customer
        VIDs (e.g., VLAN Bundle service interfaces).

   (d)  When the timer expires, each PE builds an ordered candidate list
        of the IP addresses of all the PE nodes attached to the ES
        (including itself) as explained in the modified Step 3 above.
        Any PE from which an Ethernet A-D per ES route has not been
        received is pruned from the list.

   (e)  When electing the DF for a given BD, a PE will not be considered
        to be a candidate until an Ethernet A-D per EVI route has been
        received from that PE.  In other words, the ACS on the ES for a
        given PE must be UP so that the PE is considered to be a
        candidate for a given BD.  For example, PE1 will not consider
        PE2 as a candidate for DF election for <ES12, VLAN-1> until an
        Ethernet A-D per EVI route is received from PE2 for
        <ES12, VLAN-1>.

   (f)  Once the PEs with ACS = DOWN for a given BD have been removed
        from the candidate list, the DF election can be applied for the
        remaining N candidates.
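
   Steps (d) through (f), combined with the default modulo rule, can be
   sketched as follows.  This is an illustrative sketch only: the
   function name and its three route-set parameters are hypothetical
   representations of the routes a PE has received, all PE addresses
   are assumed to belong to the same address family, and the sketch
   returns None when pruning leaves no candidate.

```python
import ipaddress
from typing import Optional, Sequence, Set


def default_df_with_ac_df(vlan: int,
                          es_route_origins: Sequence[str],
                          ad_per_es: Set[str],
                          ad_per_evi: Set[str]) -> Optional[str]:
    """Elect the DF for one <ES, Ethernet Tag> with DF Alg 0 and AC-DF.

    es_route_origins: Originating Router's IP addresses taken from the
                      ES routes (including the local PE).
    ad_per_es:        PEs with an active Ethernet A-D per ES route.
    ad_per_evi:       PEs with an active Ethernet A-D per EVI route for
                      this <ES, Ethernet Tag>, i.e., whose ACS is UP.
    """
    # Prune PEs lacking either route, then order by numeric address.
    candidates = sorted(
        (pe for pe in set(es_route_origins)
         if pe in ad_per_es and pe in ad_per_evi),
        key=lambda pe: int(ipaddress.ip_address(pe)),
    )
    if not candidates:
        return None
    # The PE with ordinal i is the DF when (V mod N) = i.
    return candidates[vlan % len(candidates)]
```

   With two candidates and VLAN 1, the PE with ordinal 1 is elected; if
   that PE's Ethernet A-D per EVI route is absent, the remaining PE
   becomes the DF because N drops to 1.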

   Note that this procedure only modifies the existing EVPN control
   plane by adding and processing the DF Election Extended Community
   and by pruning the candidate list of PEs that take part in the DF
   election.

   In addition to the events defined in the FSM in Section 2.1, the
   following events SHALL modify the candidate PE list and trigger the
   DF re-election in a PE for a given <ES, Ethernet Tag>.  In the FSM
   shown in Figure 3, the events below MUST trigger a transition from
   DF_DONE to DF_CALC:

   1.  Local AC going DOWN/UP.

   2.  Reception of a new Ethernet A-D per EVI route update/withdrawal
       for the <ES, Ethernet Tag>.

   3.  Reception of a new Ethernet A-D per ES route update/withdrawal
       for the ES.

4.1.  AC-Influenced DF Election Capability for VLAN-Aware Bundle
      Services

   The procedure described in Section 4 works for VLAN-based and VLAN
   Bundle service interfaces because, for those service types, a PE
   advertises only one Ethernet A-D per EVI route per <ES, VLAN> or
   <ES, VLAN Bundle>.  In Section 4, an Ethernet Tag represents a given
   VLAN or VLAN Bundle for the purpose of DF election.  The withdrawal
   of such a route means that the PE cannot forward traffic on that
   particular <ES, VLAN> or <ES, VLAN Bundle>; therefore, the PE can be
   removed from consideration for DF election.

   According to [RFC7432], in VLAN-aware Bundle services, the PE
   advertises multiple Ethernet A-D per EVI routes per <ES, VLAN Bundle>
   (one route per Ethernet Tag), while the DF election is still
   performed per <ES, VLAN Bundle>.  The withdrawal of an individual
   route only indicates the unavailability of a specific AC and not
   necessarily all the ACs in the <ES, VLAN Bundle>.

   This document modifies the DF election for VLAN-aware Bundle services
   in the following ways:

   o  After confirming that all the PEs in the ES advertise the AC-DF
      capability, a PE will perform a DF election per <ES, VLAN>, as
      opposed to per <ES, VLAN Bundle> as described in [RFC7432].  Now,
      the withdrawal of an Ethernet A-D per EVI route for a VLAN will
      indicate that the advertising PE's ACS is DOWN and the rest of the
      PEs in the ES can remove the PE from consideration for DF election
      in the <ES, VLAN>.

   o  The PEs will now follow the procedures in Section 4.

   For example, assuming three bridge tables in PE1 for the same MAC-VRF
   (each one associated with a different Ethernet Tag, e.g., VLAN-1,
   VLAN-2, and VLAN-3), PE1 will advertise three Ethernet A-D per EVI
   routes for ES12.  Each of the three routes will indicate the status
   of each of the three ACs in ES12.  PE1 will be considered to be a
   valid candidate PE for DF election in <ES12, VLAN-1>, <ES12, VLAN-2>,
   and <ES12, VLAN-3> as long as its three routes are active.  For
   instance, if PE1 withdraws the Ethernet A-D per EVI routes for
   <ES12, VLAN-1>, the PEs in ES12 will not consider PE1 as a suitable
   DF candidate for <ES12, VLAN-1>.  PE1 will still be considered for
   <ES12, VLAN-2> and <ES12, VLAN-3>, since its routes for those VLANs
   are still active.
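
   The per-<ES, VLAN> candidacy in this example can be sketched as
   follows.  The Route tuple and the candidates_per_vlan function are
   hypothetical; the sketch assumes that all PEs in the ES advertise
   the AC-DF capability, that each PE's Ethernet A-D per ES route is
   active, and that each tuple stands for one active Ethernet A-D per
   EVI route.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical model of one active Ethernet A-D per EVI route.
Route = Tuple[str, str, int]  # (PE, ES, VLAN)


def candidates_per_vlan(active_routes: Iterable[Route],
                        es: str) -> Dict[int, List[str]]:
    """Map each VLAN on `es` to the PEs that remain DF candidates."""
    result: Dict[int, Set[str]] = {}
    for pe, route_es, vlan in active_routes:
        if route_es == es:
            result.setdefault(vlan, set()).add(pe)
    return {vlan: sorted(pes) for vlan, pes in sorted(result.items())}


# PE1 and PE2 each advertise one route per VLAN on ES12.
routes = {(pe, "ES12", vlan) for pe in ("PE1", "PE2") for vlan in (1, 2, 3)}

# PE1 withdraws its route for <ES12, VLAN-1>.
routes.discard(("PE1", "ES12", 1))
after_withdrawal = candidates_per_vlan(routes, "ES12")
```

   After the withdrawal, PE1 is no longer a candidate for VLAN 1 but is
   still listed for VLANs 2 and 3.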

5.  Solution Benefits

   The solution described in this document provides the following
   benefits:

   (a)  It extends the DF election as defined in [RFC7432] to address
        the unfair load balancing and potential black-holing issues with
        the default DF election algorithm.  The solution is applicable
        to the DF election in EVPN services [RFC7432] and EVPN VPWS
        [RFC8214].

   (b)  It defines a way to signal the DF election algorithm and
        capabilities intended by the advertising PE.  This is done by
        defining the DF Election Extended Community, which allows the
        advertising PE to indicate its support for the capabilities
        defined in this document as well as any subsequently defined DF
        election algorithms or capabilities.

   (c)  It is backwards compatible with the procedures defined in
        [RFC7432].  If one or more PEs in the ES do not support the new
        procedures, they will all follow the DF election as defined in
        [RFC7432].

6.  Security Considerations

   This document addresses some identified issues in the DF election
   procedures described in [RFC7432] by defining a new DF election
   framework.  In general, this framework allows the PEs that are part
   of the same ES to exchange additional information and agree on the DF
   election type and capabilities to be used.

   By following the procedures in this document, the operator will
   minimize such undesirable situations as unfair load balancing,
   service disruption, and traffic black-holing.  Because such
   situations could be purposely created by a malicious user with access
   to the configuration of one PE, this document also enhances the
   security of the network.  Note that the network will not benefit from
   the new procedures if the DF election algorithm is not consistently
   configured on all the PEs in the ES (if there is no unanimity among
   all the PEs, the DF election algorithm falls back to the default DF
   election as provided in [RFC7432]).  This behavior could be exploited
   by an attacker that manages to modify the configuration of one PE in
   the ES so that the DF election algorithm and capabilities in all the
   PEs in the ES fall back to the default DF election.  If that is the
   case, the PEs will be exposed to the unfair load balancing, service
   disruption, and black-holing that were mentioned earlier.

   In addition, the new framework is extensible and allows for new
   security enhancements in the future.  Note that such enhancements are
   out of scope for this document.  Finally, since this document extends
   the procedures in [RFC7432], the same security considerations as
   those described in [RFC7432] are valid for this document.

7.  IANA Considerations

   IANA has:

   o  Allocated Sub-Type value 0x06 in the "EVPN Extended Community
      Sub-Types" registry defined in [RFC7153] as follows:

      Sub-Type Value    Name                             Reference
      --------------    ------------------------------   -------------
      0x06              DF Election Extended Community   This document

   o  Set up a registry called "DF Alg" for the DF Alg field in the
      Extended Community.  New registrations will be made through the
      "RFC Required" procedure defined in [RFC8126].  Value 31 is for
      experimental use and does not require any other RFC than this
      document.  The following initial values in that registry exist:

      Alg         Name                               Reference
      ----        -----------------------------      -------------
      0           Default DF Election                This document
      1           HRW Algorithm                      This document
      2-30        Unassigned
      31          Reserved for Experimental Use      This document

   o  Set up a registry called "DF Election Capabilities" for the
      2-octet Bitmap field in the Extended Community.  New registrations
      will be made through the "RFC Required" procedure defined in
      [RFC8126].  The following initial value in that registry exists:

      Bit         Name                             Reference
      ----        ----------------                 -------------
      0           Unassigned
      1           AC-DF Capability                 This document
      2-15        Unassigned

8.  References

8.1.  Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432,
              February 2015, <https://www.rfc-editor.org/info/rfc7432>.

   [RFC8214]  Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
              Rabadan, "Virtual Private Wire Service Support in Ethernet
              VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017,
              <https://www.rfc-editor.org/info/rfc8214>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in
              RFC 2119 Key Words", BCP 14, RFC 8174,
              DOI 10.17487/RFC8174, May 2017,
              <https://www.rfc-editor.org/info/rfc8174>.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006, <https://www.rfc-editor.org/info/rfc4360>.

   [RFC7153]  Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
              March 2014, <https://www.rfc-editor.org/info/rfc7153>.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017,
              <https://www.rfc-editor.org/info/rfc8126>.

8.2.  Informative References

   [VPLS-MH]  Kothari, B., Kompella, K., Henderickx, W., Balus, F., and
              J. Uttaro, "BGP based Multi-homing in Virtual Private LAN
              Service", Work in Progress, draft-ietf-bess-vpls-
              multihoming-03, March 2019.

   [CHASH]    Karger, D., Lehman, E., Leighton, T., Panigrahy, R.,
              Levine, M., and D. Lewin, "Consistent Hashing and Random
              Trees: Distributed Caching Protocols for Relieving Hot
              Spots on the World Wide Web", ACM Symposium on Theory of
              Computing, ACM Press, New York, DOI 10.1145/258533.258660,
              May 1997.

   [CLRS2009] Cormen, T., Leiserson, C., Rivest, R., and C. Stein,
              "Introduction to Algorithms (3rd Edition)", MIT Press,
              ISBN 978-0-262-03384-8, 2009.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991,
              DOI 10.17487/RFC2991, November 2000,
              <https://www.rfc-editor.org/info/rfc2991>.

   [RFC2992]  Hopps, C., "Analysis of an Equal-Cost Multi-Path
              Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000,
              <https://www.rfc-editor.org/info/rfc2992>.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,
              <https://www.rfc-editor.org/info/rfc4456>.

   [HRW1999]  Thaler, D. and C. Ravishankar, "Using Name-Based Mappings
              to Increase Hit Rates", IEEE/ACM Transactions on
              Networking, Volume 6, No. 1, February 1998,
              <https://www.microsoft.com/en-us/research/wp-content/
              uploads/2017/02/HRW98.pdf>.

   [Knuth]    Knuth, D., "The Art of Computer Programming: Volume 3:
              Sorting and Searching", 2nd Edition, Addison-Wesley,
              Page 516, 1998.

Acknowledgments

   The authors want to thank Ranganathan Boovaraghavan, Sami Boutros,
   Luc Andre Burdet, Anoop Ghanwani, Mrinmoy Ghosh, Jakob Heitz, Leo
   Mermelstein, Mankamana Mishra, Tamas Mondal, Laxmi Padakanti, Samir
   Thoria, and Sriram Venkateswaran for their review and contributions.
   Special thanks to Stephane Litkowski for his thorough review and
   detailed contributions.

   They would also like to thank their working group chairs, Matthew
   Bocci and Stephane Litkowski, and their AD, Martin Vigoureux, for
   their guidance and support.

   Finally, they would like to thank the Directorate reviewers and the
   ADs for their thorough reviews and probing questions, the answers to
   which have substantially improved the quality of the document.

Contributors

   The following people have contributed substantially to this document
   and should be considered coauthors:

   Antoni Przygienda
   Juniper Networks, Inc.
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   United States of America

   Email: prz@juniper.net

   Vinod Prabhu
   Nokia

   Email: vinod.prabhu@nokia.com

   Wim Henderickx
   Nokia

   Email: wim.henderickx@nokia.com

   Wen Lin
   Juniper Networks, Inc.

   Email: wlin@juniper.net

   Patrice Brissette
   Cisco Systems

   Email: pbrisset@cisco.com

   Keyur Patel
   Arrcus, Inc.

   Email: keyur@arrcus.com

   Autumn Liu
   Ciena

   Email: hliu@ciena.com

Authors' Addresses

   Jorge Rabadan (editor)
   Nokia
   777 E. Middlefield Road
   Mountain View, CA  94043
   United States of America

   Email: jorge.rabadan@nokia.com

   Satya Mohanty (editor)
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA  95134
   United States of America

   Email: satyamoh@cisco.com

   Ali Sajassi
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA  95134
   United States of America

   Email: sajassi@cisco.com

   John Drake
   Juniper Networks, Inc.
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   United States of America

   Email: jdrake@juniper.net

   Kiran Nagaraj
   Nokia
   701 E. Middlefield Road
   Mountain View, CA  94043
   United States of America

   Email: kiran.nagaraj@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA  94043
   United States of America

   Email: senthil.sathappan@nokia.com