rfc9494.original | rfc9494.txt | |||
---|---|---|---|---|
Internet Engineering Task Force J. Uttaro | Internet Engineering Task Force (IETF) J. Uttaro | |||
Internet-Draft Independent Contributor | Request for Comments: 9494 Independent Contributor | |||
Updates: 6368 (if approved) E. Chen | Updates: 6368 E. Chen | |||
Intended status: Standards Track Palo Alto Networks | Category: Standards Track Palo Alto Networks | |||
Expires: 13 January 2024 B. Decraene | ISSN: 2070-1721 B. Decraene | |||
Orange | Orange | |||
J. G. Scudder | J. Scudder | |||
Juniper Networks | Juniper Networks | |||
12 July 2023 | November 2023 | |||
Support for Long-lived BGP Graceful Restart | Long-Lived Graceful Restart for BGP | |||
draft-ietf-idr-long-lived-gr-06 | ||||
Abstract | Abstract | |||
In this document, we introduce a new BGP capability termed "Long- | This document introduces a BGP capability called the "Long-Lived | |||
lived Graceful Restart Capability" so that stale routes can be | Graceful Restart Capability" (or "LLGR Capability"). The benefit of | |||
retained for a longer time upon session failure than is provided for | this capability is that stale routes can be retained for a longer | |||
by BGP Graceful Restart (RFC 4724). A well-known BGP community | time upon session failure than is provided for by BGP Graceful | |||
"LLGR_STALE" is introduced for marking stale routes retained for a | Restart (as described in RFC 4724). A well-known BGP community | |||
longer time. A second well-known BGP community, "NO_LLGR", is | called "LLGR_STALE" is introduced for marking stale routes retained | |||
introduced to mark routes for which these procedures should not be | for a longer time. A second well-known BGP community called | |||
applied. We also specify that such long-lived stale routes be | "NO_LLGR" is introduced for marking routes for which these procedures | |||
treated as the least-preferred, and their advertisements be limited | should not be applied. We also specify that such long-lived stale | |||
to BGP speakers that have advertised the new capability. Use of this | routes be treated as the least preferred and that their | |||
extension is not advisable in all cases, and we provide guidelines to | advertisements be limited to BGP speakers that have advertised the | |||
help determine if it is. | capability. Use of this extension is not advisable in all cases, and | |||
we provide guidelines to help determine if it is. | ||||
This memo updates RFC 6368 by specifying that the LLGR_STALE | This memo updates RFC 6368 by specifying that the LLGR_STALE | |||
community must be propagated into, or out of, the path attributes | community must be propagated into, or out of, the path attributes | |||
exchanged between PE and CE. | exchanged between the Provider Edge (PE) and Customer Edge (CE) | |||
routers. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | This document is a product of the Internet Engineering Task Force | |||
Task Force (IETF). Note that other groups may also distribute | (IETF). It represents the consensus of the IETF community. It has | |||
working documents as Internet-Drafts. The list of current Internet- | received public review and has been approved for publication by the | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | Information about the current status of this document, any errata, | |||
and may be updated, replaced, or obsoleted by other documents at any | and how to provide feedback on it may be obtained at | |||
time. It is inappropriate to use Internet-Drafts as reference | https://www.rfc-editor.org/info/rfc9494. | |||
material or to cite them other than as "work in progress." | ||||
This Internet-Draft will expire on 13 January 2024. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2023 IETF Trust and the persons identified as the | Copyright (c) 2023 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Requirements Language | 2. Terminology | |||
2. Definitions | 2.1. Definitions | |||
3. Protocol Extensions | 2.2. Abbreviations | |||
3.1. Long-lived Graceful Restart Capability | 2.3. Requirements Language | |||
3.2. LLGR_STALE Community | 3. Protocol Extensions | |||
3.3. NO_LLGR Community | 3.1. Long-Lived Graceful Restart Capability | |||
4. Theory of Operation | 3.2. LLGR_STALE Community | |||
4.1. Use of Graceful Restart Capability | 3.3. NO_LLGR Community | |||
4.2. Session Resets | 4. Theory of Operation | |||
4.3. Processing LLGR_STALE Routes | 4.1. Use of the Graceful Restart Capability | |||
4.4. Route Selection | 4.2. Session Resets | |||
4.5. Errors | 4.3. Processing LLGR_STALE Routes | |||
4.6. Optional Partial Deployment Procedure | 4.4. Route Selection | |||
4.7. Procedures when BGP is the PE-CE Protocol in a VPN | 4.5. Errors | |||
4.7.1. Procedures when EBGP is the PE-CE Protocol in a | 4.6. Optional Partial Deployment Procedure | |||
VPN | 4.7. Procedures When BGP Is the PE-CE Protocol in a VPN | |||
4.7.2. Procedures when IBGP is the PE-CE Protocol in a | 4.7.1. Procedures When EBGP Is the PE-CE Protocol in a VPN | |||
VPN | 4.7.2. Procedures When IBGP Is the PE-CE Protocol in a VPN | |||
5. Deployment Considerations | 5. Deployment Considerations | |||
5.1. When BGP is the PE-CE Protocol in a VPN | 5.1. When BGP Is the PE-CE Protocol in a VPN | |||
5.2. Risks of Depreferencing Routes | 5.2. Risks of Depreferencing Routes | |||
6. Security Considerations | 6. Security Considerations | |||
7. Examples of Operation | 7. Examples of Operation | |||
8. Acknowledgements | 8. IANA Considerations | |||
9. Contributors | 9. References | |||
10. IANA Considerations | 9.1. Normative References | |||
11. References | 9.2. Informative References | |||
11.1. Normative References | Acknowledgements | |||
11.2. Informative References | Contributors | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
Historically, routing protocols in general, and BGP in particular, | Routing protocols in general, and BGP in particular, have | |||
have been designed with a focus on correctness, where a key part of | historically been designed with a focus on "correctness", where a key | |||
"correctness" is for each network element's forwarding state to | part of correctness is for each network element's forwarding state to | |||
converge toward the current state of the network as quickly as | converge to the current state of the network as quickly as possible. | |||
possible. For this reason, the protocol was designed to remove state | For this reason, the protocol was designed to remove state advertised | |||
advertised by routers that went down (from a BGP perspective) as | by routers that went down (from a BGP perspective) as quickly as | |||
quickly as possible. Over time, this has been relaxed somewhat, | possible. Over time, this has been relaxed somewhat, notably by BGP | |||
notably by BGP Graceful Restart (GR) [RFC4724]; however, the paradigm | Graceful Restart (GR) [RFC4724]; however, the paradigm has remained | |||
has remained one of attempting to rapidly remove "stale" state from | one of attempting to rapidly remove stale state from the network. | |||
the network. | ||||
Over time, two phenomena have arisen that call into question the | Over time, two phenomena have arisen that call into question the | |||
underlying assumptions of this paradigm. The first is the widespread | underlying assumptions of this paradigm. | |||
adoption of tunneled forwarding infrastructures, for example, MPLS. | ||||
Such infrastructures eliminate the risk of some types of forwarding | 1. The widespread adoption of tunneled forwarding infrastructures | |||
loops that can arise in hop-by-hop forwarding and thus reduce one of | (for example, MPLS). Such infrastructures eliminate the risk of | |||
the motivations for strong consistency between forwarding elements. | some types of forwarding loops that can arise in hop-by-hop | |||
The second is the increasing use of BGP as a transport for data which | forwarding; thus, they reduce one of the motivations for strong | |||
is less closely associated with packet forwarding than was originally | consistency between forwarding elements. | |||
the case. Examples include the use of BGP for autodiscovery (VPLS | ||||
[RFC4761]) and filter programming (FLOWSPEC [RFC8955]). In these | 2. The increasing use of BGP as a transport for data that is less | |||
cases, BGP data takes on a character more akin to configuration than | closely associated with packet forwarding than was originally the | |||
to traditional routing. | case. Examples include the use of BGP for auto-discovery | |||
(Virtual Private LAN Service (VPLS) [RFC4761]) and filter | ||||
programming (Flow Specification (FLOWSPEC) [RFC8955]). In these | ||||
cases, BGP data takes on a character more akin to configuration | ||||
than to conventional routing. | ||||
The observations above motivate a desire to offer network operators | The observations above motivate a desire to offer network operators | |||
the ability to choose to retain BGP data for a longer period than has | the ability to choose to retain BGP data for a longer period than has | |||
hitherto been possible when the BGP control plane fails for some | hitherto been possible when the BGP control plane fails for some | |||
reason. Although the semantics of BGP Graceful Restart [RFC4724] are | reason. Although the semantics of BGP Graceful Restart [RFC4724] are | |||
close to those desired, several gaps exist, most notably in the | close to those desired, several gaps exist, most notably in the | |||
maximum time for which "stale" information can be retained -- | maximum time for which stale information can be retained: Graceful | |||
Graceful Restart imposes a 4095-second upper bound. | Restart imposes a 4095-second upper bound. | |||
In this document, we introduce a new BGP capability termed "Long- | In this document, we introduce a BGP capability called the "Long- | |||
lived Graceful Restart Capability" so that stale information can be | Lived Graceful Restart Capability". The goal of this capability is | |||
retained for a longer time across a session reset. We also introduce | that stale information can be retained for a longer time across a | |||
two new BGP well-known communities, "LLGR_STALE", to mark such | session reset. We also introduce two BGP well-known communities: | |||
information, and "NO_LLGR", to indicate that these procedures should | ||||
not be applied to the marked route. Long-lived stale information is | * LLGR_STALE to mark such information, and | |||
to be treated as least-preferred, and its advertisement limited to | ||||
BGP speakers that support the new capability. Where possible, we | * NO_LLGR to indicate that these procedures should not be applied to | |||
reference the semantics of BGP Graceful Restart [RFC4724] rather than | the marked route. | |||
specifying similar semantics in this document. | ||||
Long-lived stale information is to be treated as least preferred, and | ||||
its advertisement limited to BGP speakers that support the | ||||
capability. Where possible, we reference the semantics of BGP | ||||
Graceful Restart [RFC4724] rather than specifying similar semantics | ||||
in this document. | ||||
The expected deployment model for this extension is that it will only | The expected deployment model for this extension is that it will only | |||
be invoked for certain address families. This is discussed in more | be invoked for certain address families. This is discussed in more | |||
detail in the Deployment Considerations section (Section 5). When | detail in Section 5. The use of this extension may be combined with | |||
used, its use may be combined with that of traditional Graceful | that of conventional Graceful Restart; in such a case, it is invoked | |||
Restart, in which case it is invoked only after the traditional | after the conventional Graceful Restart interval has elapsed. When | |||
Graceful Restart interval has elapsed, or it may be invoked | not combined, LLGR is invoked immediately. Apart from the potential | |||
immediately. Apart from the potential to greatly extend the timer, | to greatly extend the timer, the most obvious difference between LLGR | |||
the most obvious difference between Long-Lived and traditional | and conventional Graceful Restart is that in LLGR, routes are | |||
Graceful Restart is that in the Long-Lived version, routes are | "depreferenced"; that is, they are treated as least preferred. | |||
"depreferenced", that is, treated as least-preferred, whereas in the | Contrarily, in conventional GR, route preference is not affected. | |||
traditional version, route preference is not affected. The design | The design choice to treat long-lived stale routes as least preferred | |||
choice to treat Long-Lived Stale routes as least-preferred was | was informed by the expectation that they might be retained for | |||
informed by the expectation that they might be retained for a | (potentially) an almost unbounded period of time; whereas, in the | |||
(potentially) almost unbounded period of time, whereas in the | conventional Graceful Restart case, stale routes are retained for | |||
traditional Graceful Restart case, stale routes are retained for only | only a brief interval. In the case of Graceful Restart, the trade- | |||
a brief interval. In the Graceful Restart case, the tradeoff between | off between advertising new route status (at the cost of routing | |||
advertising new route status (at the cost of routing churn) and not | churn) and not advertising it (at the cost of suboptimal or incorrect | |||
advertising it (at the cost of suboptimal or incorrect route | route selection) is resolved in favor of not advertising. In the | |||
selection) is resolved in favor of not advertising. In the LLGR | case of LLGR, it is resolved in favor of advertising new state, using | |||
case, it is resolved in favor of advertising new state, and using | ||||
stale information only as a last resort. | stale information only as a last resort. | |||
Section 7 provides some simple examples illustrating the operation of | Section 7 provides some simple examples illustrating the operation of | |||
this extension. | this extension. | |||
1.1. Requirements Language | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | 2.1. Definitions | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
2. Definitions | Depreference: A route is said to be depreferenced if it has its | |||
route selection preference reduced in reaction to some event. | ||||
CE: A Customer Edge router. [RFC4364] | Helper: Sometimes referred to as "helper router". During Graceful | |||
Restart or Long-Lived Graceful Restart, the router that detects a | ||||
session failure and applies the listed procedures. [RFC4724] | ||||
refers to this as the "receiving speaker". | ||||
Depreference, Depreferenced: A route is said to be depreferenced if | Route: In this document, "route" means any information encoded as | |||
it has its route selection preference reduced in reaction to some | BGP Network Layer Reachability Information (NLRI) and a set of path | |||
event. | attributes. As discussed above, the connection between such routes | |||
and the installation of forwarding state may be quite remote. | ||||
EoR: Marker for End-of-RIB, defined in [RFC4724] Section 2. | Further note that, for brevity, in this document when we reference | |||
conventional Graceful Restart, we cite its base specification, | ||||
[RFC4724]. That specification has been updated by [RFC8538]. The | ||||
citation to [RFC4724] is not intended to be limiting. | ||||
GR: Abbreviation for "Graceful Restart" [RFC4724], also sometimes | 2.2. Abbreviations | |||
referred to herein as "conventional Graceful Restart" or | ||||
"conventional GR" to distinguish it from the "Long-lived Graceful | ||||
Restart" defined by this document. | ||||
Helper: Or "helper router". During Graceful Restart or Long-lived | CE: Customer Edge (See [RFC4364] for more information on Customer | |||
Graceful Restart, the router that detects a session failure and | Edge routers.) | |||
applies the listed procedures. [RFC4724] refers to this as the | ||||
"receiving speaker". | ||||
LLGR: Abbreviation for "Long-lived Graceful Restart". | EoR: End-of-RIB (See Section 2 of [RFC4724] for more information on | |||
End-of-RIB markers.) | ||||
LLST: Abbreviation for "Long-lived Stale Time". | GR: Graceful Restart (See [RFC4724] for more information on GR.) | |||
This term is also sometimes referred to herein as "conventional | ||||
Graceful Restart" or "conventional GR" to distinguish it from the | ||||
"Long-Lived Graceful Restart" or "LLGR" defined by this document. | ||||
PE: A Provider Edge router. [RFC4364] | LLGR: Long-Lived Graceful Restart | |||
Route: We use "route" to mean any information encoded as a BGP NLRI | LLST: Long-Lived Stale Time | |||
and set of path attributes. As discussed above, the connection | ||||
between such routes and the installation of forwarding state may be | ||||
quite remote. | ||||
VRF: VPN Routing and Forwarding table. [RFC4364] | PE: Provider Edge (See [RFC4364] for more information on Provider | |||
Edge routers.) | ||||
VRF: VPN Routing and Forwarding (See [RFC4364] for more information | ||||
on VRF tables.) | ||||
2.3. Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in | ||||
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
3. Protocol Extensions | 3. Protocol Extensions | |||
A new BGP capability and two new BGP communities are introduced. | A BGP capability and two BGP communities are introduced in the | |||
subsections that follow. | ||||
3.1. Long-lived Graceful Restart Capability | 3.1. Long-Lived Graceful Restart Capability | |||
The "Long-lived Graceful Restart Capability", or "LLGR Capability" | The "Long-Lived Graceful Restart Capability", or "LLGR Capability", | |||
(value: 71) is a BGP capability [RFC5492] that can be used by a BGP | (value: 71) is a BGP capability [RFC5492] that can be used by a BGP | |||
speaker to indicate its ability to preserve its state according to | speaker to indicate its ability to preserve its state according to | |||
the procedures of this document. This capability MUST be advertised | the procedures of this document. If the LLGR capability is | |||
in conjunction with the Graceful Restart capability [RFC4724], see | advertised, the Graceful Restart capability [RFC4724] MUST also be | |||
the "Use of Graceful Restart Capability" section (Section 4.1). | advertised; see Section 4.1. | |||
The capability value consists of zero or more tuples <AFI, SAFI, | The capability value consists of zero or more tuples <AFI, SAFI, | |||
Flags, Long-lived Stale Time> as follows: | Flags, LLST> as follows: | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Address Family Identifier (16 bits) | | | Address Family Identifier (16 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Subsequent Address Family Identifier (8 bits) | | | Subsequent Address Family Identifier (8 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Flags for Address Family (8 bits) | | | Flags for Address Family (8 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Long-lived Stale Time (24 bits) | | | Long-Lived Stale Time (24 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| ... | | | ... | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Address Family Identifier (16 bits) | | | Address Family Identifier (16 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Subsequent Address Family Identifier (8 bits) | | | Subsequent Address Family Identifier (8 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Flags for Address Family (8 bits) | | | Flags for Address Family (8 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
| Long-lived Stale Time (24 bits) | | | Long-Lived Stale Time (24 bits) | | |||
+--------------------------------------------------+ | +--------------------------------------------------+ | |||
The meanings of the fields are as follows: | The meanings of the fields are as follows: | |||
Address Family Identifier (AFI), Subsequent Address Family | Address Family Identifier (AFI), Subsequent Address Family | |||
Identifier (SAFI): | Identifier (SAFI): | |||
The AFI and SAFI, taken in combination, indicate that the BGP | ||||
speaker has the ability to preserve its forwarding state for the | ||||
address family during a subsequent BGP restart. Routes may be | ||||
either: | ||||
The AFI and SAFI, taken in combination, indicate that the BGP | * explicitly associated with a particular AFI and SAFI if using | |||
speaker has the ability to preserve its forwarding state for | the encoding described in [RFC4760], or | |||
the address family during a subsequent BGP restart. Routes may | ||||
be explicitly associated with a particular AFI and SAFI using | ||||
the encoding of [RFC4760] or implicitly associated with | ||||
<AFI=IPv4, SAFI=Unicast> if using the encoding of [RFC4271]. | ||||
Flags for Address Family: | * implicitly associated with <AFI=IPv4, SAFI=Unicast> if using | |||
the encoding described in [RFC4271]. | ||||
This field contains bit flags relating to routes that were | Flags for Address Family: | |||
advertised with the given AFI and SAFI. | This field contains bit flags relating to routes that were | |||
advertised with the given AFI and SAFI. | ||||
0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | |||
+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
|F| Reserved | | |F| Reserved | | |||
+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
The most significant bit is used to indicate whether the state | The most significant bit is used to indicate whether the state for | |||
for routes that were advertised with the given AFI and SAFI has | routes that were advertised with the given AFI and SAFI has indeed | |||
indeed been preserved during the previous BGP restart. When | been preserved during the previous BGP restart. When set (value | |||
set (value 1), the bit indicates that the state has been | 1), the bit indicates that the state has been preserved. This bit | |||
preserved. This bit is called the "F bit" since it was | is called the "F bit" since it was historically used to indicate | |||
historically used to indicate the preservation of Forwarding | the preservation of forwarding state. Use of the F bit is | |||
State. Use of the F bit is detailed in the Session Resets | detailed in Section 4.2. The remaining bits are reserved and MUST | |||
section (Section 4.2). | be set to zero by the sender and ignored by the receiver. | |||
The remaining bits are reserved and MUST be set to zero by the | ||||
sender and ignored by the receiver. | Long-Lived Stale Time: | |||
Long-lived Stale Time: | This time (in seconds) specifies how long stale information (for | |||
This time (in seconds) specifies how long stale information | this AFI/SAFI) may be retained by the receiver (in addition to the | |||
(for this AFI/SAFI) may be retained by the receiver (in | period specified by the "Restart Time" in the Graceful Restart | |||
addition with the period specified by the "Restart Time" in the | Capability). Because the potential use cases for this extension | |||
Graceful Restart Capability). Because the potential use cases | vary widely, there is no suggested default value for the LLST. | |||
for this extension vary widely, there is no suggested default | ||||
value for the LLST. | ||||
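The tuple layout described above can be illustrated with a minimal Python sketch. It assumes the capability value has already been extracted from the OPEN message, and it uses network byte order (big-endian), as is usual for BGP fields. The names `LlgrTuple`, `encode_llgr_tuple`, and `parse_llgr_value` are hypothetical and do not appear in the specification. Raising `ValueError` on a malformed length is likewise an assumption made for illustration, not the error handling the specification prescribes.

```python
from typing import NamedTuple

LLGR_CAPABILITY_CODE = 71   # capability value given in Section 3.1
TUPLE_LEN = 7               # 2 (AFI) + 1 (SAFI) + 1 (Flags) + 3 (LLST) octets
F_BIT = 0x80                # most significant bit of the Flags octet


class LlgrTuple(NamedTuple):
    """One <AFI, SAFI, Flags, LLST> tuple (hypothetical helper type)."""
    afi: int
    safi: int
    f_bit: bool
    llst: int  # Long-Lived Stale Time, in seconds


def encode_llgr_tuple(afi: int, safi: int, f_bit: bool, llst: int) -> bytes:
    """Encode one tuple; reserved flag bits are sent as zero."""
    if not 0 <= llst < (1 << 24):
        raise ValueError("LLST must fit in 24 bits")
    flags = F_BIT if f_bit else 0
    return afi.to_bytes(2, "big") + bytes([safi, flags]) + llst.to_bytes(3, "big")


def parse_llgr_value(value: bytes) -> list[LlgrTuple]:
    """Parse a capability value made of zero or more 7-octet tuples."""
    if len(value) % TUPLE_LEN != 0:
        # Assumption: this sketch simply rejects a value of the wrong length.
        raise ValueError("capability value is not a multiple of 7 octets")
    tuples = []
    for off in range(0, len(value), TUPLE_LEN):
        afi = int.from_bytes(value[off:off + 2], "big")
        safi = value[off + 2]
        flags = value[off + 3]
        llst = int.from_bytes(value[off + 4:off + 7], "big")
        # Only the F bit is examined; reserved bits are ignored on receipt.
        tuples.append(LlgrTuple(afi, safi, bool(flags & F_BIT), llst))
    return tuples
```

For example, `parse_llgr_value(encode_llgr_tuple(1, 1, True, 86400))` yields a single tuple for <AFI=IPv4, SAFI=Unicast> with the F bit set and an LLST of one day. Because the LLST field is 24 bits wide, the largest encodable value is 16,777,215 seconds (roughly 194 days).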
3.2. LLGR_STALE Community | 3.2. LLGR_STALE Community | |||
The well-known BGP community [RFC1997] "LLGR_STALE" (value: | The well-known BGP community LLGR_STALE (value: 0xFFFF0006) can be | |||
0xFFFF0006) can be used to mark stale routes retained for a longer | used to mark stale routes retained for a longer period of time (see | |||
period of time. Such long-lived stale routes are to be handled | [RFC1997] for more information on BGP communities). Such long-lived | |||
according to the procedures specified in the Theory of Operation | stale routes are to be handled according to the procedures specified | |||
section (Section 4). | in Section 4. | |||
An implementation MAY allow users to configure policies that accept, | An implementation MAY allow users to configure policies that accept, | |||
reject, or modify routes based on the presence or absence of this | reject, or modify routes based on the presence or absence of this | |||
community. | community. | |||
3.3. NO_LLGR Community | 3.3. NO_LLGR Community | |||
The well-known BGP community "NO_LLGR" (value: 0xFFFF0007) can be | The well-known BGP community NO_LLGR (value: 0xFFFF0007) can be used | |||
used to mark routes that a BGP speaker does not want to be treated | to mark routes that a BGP speaker does not want to be treated | |||
according to these procedures, as detailed in the Operation section | according to these procedures, as detailed in Section 4. | |||
(Section 4). | ||||
An implementation MAY allow users to configure policies that accept, | An implementation MAY allow users to configure policies that accept, | |||
reject, or modify routes based on the presence or absence of this | reject, or modify routes based on the presence or absence of this | |||
community. | community. | |||
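As an illustration of how the two community values might be handled in software, the following Python sketch defines them as 32-bit constants and shows the kind of presence checks a policy could build on. The function names are hypothetical, and the representation of a route's communities as a set of integers is an assumption made for this sketch.

```python
LLGR_STALE = 0xFFFF0006  # well-known community value from Section 3.2
NO_LLGR = 0xFFFF0007     # well-known community value from Section 3.3


def format_community(value: int) -> str:
    """Render a 32-bit community in the common high:low notation."""
    return f"{value >> 16}:{value & 0xFFFF}"


def is_llgr_stale(communities: set[int]) -> bool:
    """True if the route is marked as a long-lived stale route."""
    return LLGR_STALE in communities


def llgr_permitted(communities: set[int]) -> bool:
    """False if the route is marked NO_LLGR, i.e. excluded from LLGR."""
    return NO_LLGR not in communities
```

In this notation, LLGR_STALE is written 65535:6 and NO_LLGR is written 65535:7.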
4. Theory of Operation | 4. Theory of Operation | |||
If A BGP speaker is configured to support the procedures of this | If a BGP speaker is configured to support the procedures of this | |||
document, it MUST use BGP Capabilities Advertisement [RFC5492] to | document, it MUST use BGP Capabilities Advertisement [RFC5492] to | |||
advertise the "Long-lived Graceful Restart Capability". The setting | advertise the Long-Lived Graceful Restart Capability. The setting of | |||
of the parameters for an AFI/SAFI depends on the properties of the | the parameters for an AFI/SAFI depends on the properties of the BGP | |||
BGP speaker, network scale, and local configuration. | speaker, network scale, and local configuration. | |||
In the presence of the "Long-lived Graceful Restart Capability", the | In the presence of the Long-Lived Graceful Restart Capability, the | |||
procedures specified in [RFC4724] and [RFC8538] continue to apply | procedures specified in [RFC4724] continue to apply unless explicitly | |||
unless explicitly revised by this document. | revised by this document. | |||
4.1. Use of Graceful Restart Capability | 4.1. Use of the Graceful Restart Capability | |||
The Graceful Restart capability MUST be advertised in conjunction | If the LLGR Capability is advertised, the Graceful Restart capability | |||
with the LLGR capability. If it is not so advertised, the LLGR | MUST also be advertised. If it is not so advertised, the LLGR | |||
capability MUST be disregarded. The purpose for mandating that both | Capability MUST be disregarded. The purpose for mandating this is to | |||
be used in conjunction is to enable the reuse of certain base | enable the reuse of certain base mechanisms that are common to both | |||
mechanisms that are common to both "flavors", notably origination, | "flavors" notably: origination, collection, and processing of EoR as | |||
collection, and processing of EoR, as well as the finite state | well as the finite-state-machine modifications and connection-reset | |||
machine modifications and connection reset logic introduced by GR. | logic introduced by GR. | |||
We observe that if support for conventional Graceful Restart is not | We observe that, if support for conventional Graceful Restart is not | |||
desired for the session, the conventional GR phase can be skipped by | desired for the session, the conventional GR phase can be skipped by | |||
omitting all AFI/SAFI from the GR capability, advertising a Restart | omitting all AFIs/SAFIs from the GR Capability, advertising a Restart | |||
Time of zero, or both. The Session Resets section (Section 4.2) | Time of zero, or both. Section 4.2 discusses the interaction of | |||
discusses the interaction of conventional and long-lived GR. | conventional and LLGR. | |||
4.2. Session Resets | 4.2. Session Resets | |||
BGP Graceful Restart [RFC4724], updated by [RFC8538], defines | BGP Graceful Restart [RFC4724] defines conditions under which a BGP | |||
conditions under which a BGP session can reset and have its | session can reset and have its associated routes retained. If such a | |||
associated routes retained. If such a reset occurs for a session for | reset occurs for a session in which the LLGR Capability has also been | |||
which the LLGR Capability has also been exchanged, the following | exchanged, the following procedures apply: | |||
procedures apply. | ||||
If the Graceful Restart Capability that was received does not list | * If the Graceful Restart Capability that was received does not list | |||
all AFI/SAFI supported by the session, then for those non-listed AFI/ | all AFIs/SAFIs supported by the session, then the GR Restart Time | |||
SAFI the GR "Restart Time" shall be deemed zero. Similarly, if the | shall be deemed zero for those AFIs/SAFIs that are not listed. | |||
received LLGR Capability does not list all AFI/SAFI supported by the | ||||
session, then for those non-listed AFI/SAFI the "Long-lived Stale | ||||
Time" shall be deemed zero. | ||||
The following text in Section 4.2 of the GR specification [RFC4724] | * Similarly, if the received LLGR Capability does not list all AFIs/ | |||
no longer applies: | SAFIs supported by the session, then the Long-Lived Stale Time | |||
shall be deemed zero for those AFIs/SAFIs that are not listed. | ||||
If the session does not get re-established within the "Restart | The following text in Section 4.2 of [RFC4724] no longer applies: | |||
Time" that the peer advertised previously, the Receiving Speaker | ||||
MUST delete all the stale routes from the peer that it is | | If the session does not get re-established within the "Restart | |||
retaining. | | Time" that the peer advertised previously, the Receiving Speaker | |||
| MUST delete all the stale routes from the peer that it is | ||||
| retaining. | ||||
and the following procedures are specified instead: | and the following procedures are specified instead: | |||
After the session goes down, and before the session is re- | After the session goes down, and before the session is re- | |||
established, the stale routes for an AFI/SAFI MUST be retained. The | established, the stale routes for an AFI/SAFI MUST be retained. The | |||
interval for which they are retained is limited by the sum of the | interval for which they are retained is limited by the sum of the | |||
"Restart Time" in the received Graceful Restart Capability and the | Restart Time in the received Graceful Restart Capability and the | |||
"Long-lived Stale Time" in the received Long-lived Graceful Restart | Long-Lived Stale Time in the received Long-Lived Graceful Restart | |||
Capability. The timers received in the Long-lived Graceful Restart | Capability. The timers received in the Long-Lived Graceful Restart | |||
Capability SHOULD be modifiable by local configuration, which may | Capability SHOULD be modifiable by local configuration, which may | |||
impose either an upper or a lower bound, or both, on their respective | impose an upper bound, a lower bound, or both on their respective | |||
values. | values. | |||
If the value of the "Restart Time" or the "Long-lived Stale Time" is | If the value of the Restart Time or the Long-Lived Stale Time is | |||
zero, the duration of the corresponding period would be zero seconds. | zero, the duration of the corresponding period would be zero seconds. | |||
For example, if the "Restart Time" is zero and the "Long-lived Stale | For example, if the Restart Time is zero and the Long-Lived Stale | |||
Time" is nonzero, only the procedures particular to LLGR would apply. | Time is nonzero, only the procedures particular to LLGR would apply. | |||
Conversely, if the "Long-lived Stale Time" is zero and the "Restart | Conversely, if the Long-Lived Stale Time is zero and the Restart Time | |||
Time" is nonzero, only the procedures of GR would apply. If both are | is nonzero, only the procedures of GR would apply. If both are zero, | |||
zero, none of these procedures would apply, only those of the base | none of these procedures would apply, only those of the base BGP | |||
BGP specification (although EoR would still be used as detailed in | specification [RFC4271] (although EoR would still be used as detailed | |||
[RFC4724]). And finally, if both are nonzero, then the procedures | in [RFC4724]). And finally, if both are nonzero, then the procedures | |||
would be applied serially -- first those of GR, then those of LLGR. | would be applied serially: first those of GR and then those of LLGR. | |||
We observe that during the first interval, while the procedures of GR | During the first interval, we observe that, while the procedures of | |||
are in effect, route preference would not be affected. During the | GR are in effect, route preference would not be affected. During the | |||
second interval, while LLGR procedures are in effect, routes would be | second interval, while LLGR procedures are in effect, routes would be | |||
treated as least-preferred as specified elsewhere in this document. | treated as least preferred as specified elsewhere in this document. | |||
Once the "Restart Time" period ends (including the case that the | Once the Restart Time period ends (including the case in which the | |||
"Restart Time" is zero), the LLGR period is said to have begun and | Restart Time is zero), the LLGR period is said to have begun and the | |||
the following procedures MUST be performed: | following procedures MUST be performed: | |||
* For each AFI/SAFI for which it has received a nonzero "Long-lived | * For each AFI/SAFI for which it has received a nonzero Long-Lived | |||
Stale Time", the helper router MUST start a timer for that "Long- | Stale Time, the helper router MUST start a timer for that Long- | |||
lived Stale Time". If the timer for the "Long-lived Stale Time" | Lived Stale Time. If the timer for the Long-Lived Stale Time for | |||
for a given AFI/SAFI expires before the session is re-established, | a given AFI/SAFI expires before the session is re-established, the | |||
the helper MUST delete all stale routes of that AFI/SAFI from the | helper MUST delete all stale routes of that AFI/SAFI from the | |||
neighbor that it is retaining. | neighbor that it is retaining. | |||
* The helper router MUST attach the LLGR_STALE community to the | * The helper router MUST attach the LLGR_STALE community to the | |||
stale routes being retained. Note that this requirement implies | stale routes being retained. Note that this requirement implies | |||
that the routes would need to be readvertised, to disseminate the | that the routes would need to be readvertised in order to | |||
modified community. | disseminate the modified community. | |||
* If any of the routes from the peer have been marked with the | * If any of the routes from the peer have been marked with the | |||
NO_LLGR community, either as sent by the peer, or as the result of | NO_LLGR community, either as sent by the peer or as the result of | |||
a configured policy, they MUST NOT be retained, but MUST be | a configured policy, they MUST NOT be retained and MUST be removed | |||
removed as per the normal operation of [RFC4271]. | as per the normal operation of [RFC4271]. | |||
* The helper router MUST perform the procedures listed under | * The helper router MUST perform the procedures listed in | |||
Section 4.3. | Section 4.3. | |||
Once the session is re-established, the procedures specified in | Once the session is re-established, the procedures specified in | |||
[RFC4724] apply for the stale routes irrespective of whether the | [RFC4724] apply for the stale routes irrespective of whether the | |||
stale routes are retained during the "Restart Time" period or the | stale routes are retained during the Restart Time period or the Long- | |||
"Long-lived Stale Time" period. However, in the case of consecutive | Lived Stale Time period. However, in the case of consecutive | |||
restarts, the previously marked stale routes MUST NOT be deleted | restarts, the previously marked stale routes MUST NOT be deleted | |||
before the timer for the "Long-lived Stale Time" expires. | before the timer for the Long-Lived Stale Time expires. | |||
Similarly to [RFC4724], once the session is re-established, if the F | Similar to [RFC4724], once the session is re-established, the Helper MUST | |||
bit for a specific address family is not set in the newly received | immediately remove all the stale routes from the peer that it is | |||
LLGR Capability, or if a specific address family is not included in | retaining for that address family if any of the following occur: | |||
the newly received LLGR Capability, or if the LLGR and accompanying | ||||
GR Capability are not received in the re-established session at all, | ||||
then the Helper MUST immediately remove all the stale routes from the | ||||
peer that it is retaining for that address family. | ||||
If a "Long-lived Stale Time" timer is running for routes with a given | * the F bit for a specific address family is not set in the newly | |||
received LLGR Capability, or | ||||
* a specific address family is not included in the newly received | ||||
LLGR Capability, or | ||||
* the LLGR and accompanying GR Capability are not received in the | ||||
re-established session at all. | ||||
If a Long-Lived Stale Time timer is running for routes with a given | ||||
AFI/SAFI received from a peer, it MUST NOT be updated (other than by | AFI/SAFI received from a peer, it MUST NOT be updated (other than by | |||
manual operator intervention) until the peer has established and | manual operator intervention) until the peer has established and | |||
synchronized a new session. The session is termed "synchronized" for | synchronized a new session. The session is termed "synchronized" for | |||
a given AFI/SAFI once the EoR for that AFI/SAFI has been received | a given AFI/SAFI once the EoR for that AFI/SAFI has been received | |||
from the peer, or once the Selection_Deferral_Timer discussed in | from the peer or once the Selection_Deferral_Timer discussed in | |||
[RFC4724] expires. | [RFC4724] expires. | |||
The value of a "Long-lived Stale Time" in the capability received | The value of a Long-Lived Stale Time in the capability received from | |||
from a neighbor MAY be reduced by local configuration. | a neighbor MAY be reduced by local configuration. | |||
While the session is down, the expiration of a "Long-lived Stale | While the session is down, the expiration of a Long-Lived Stale Time | |||
Time" timer is treated analogously to the expiration of the "Restart | timer is treated analogously to the expiration of the Restart Time | |||
Time" timer in Graceful Restart, other than applying only to the AFI/ | timer in [RFC4724], other than applying only to the AFI/SAFI it | |||
SAFI it accompanies. However, the timer continues to run once the | accompanies. However, the timer continues to run once the session | |||
session has re-established. The timer is neither stopped nor updated | has re-established. The timer is neither stopped nor updated until | |||
until EoR is received for the relevant AFI/SAFI from the peer. If | the EoR marker is received for the relevant AFI/SAFI from the peer. | |||
the timer expires during synchronization with the peer, any stale | If the timer expires during synchronization with the peer, any stale | |||
routes that the peer has not refreshed, are removed. If the session | routes that the peer has not refreshed are removed. If the session | |||
subsequently resets prior to becoming synchronized, any remaining | subsequently resets prior to becoming synchronized, any remaining | |||
routes (for the AFI/SAFI whose LLST timer expired) MUST be removed | routes (for the AFI/SAFI whose LLST timer expired) MUST be removed | |||
immediately. | immediately. | |||
4.3. Processing LLGR_STALE Routes | 4.3. Processing LLGR_STALE Routes | |||
A BGP speaker that has advertised the "Long-lived Graceful Restart | A BGP speaker that has advertised the Long-Lived Graceful Restart | |||
Capability" to a neighbor MUST perform the following upon receiving a | Capability to a neighbor MUST perform the following upon receiving a | |||
route from that neighbor with the "LLGR_STALE" community, or upon | route from that neighbor with the LLGR_STALE community or upon | |||
attaching the "LLGR_STALE" community itself per Section 4.2: | attaching the LLGR_STALE community itself per Section 4.2: | |||
* Treat the route as the least-preferred in route selection (see | * Treat the route as the least preferred in route selection (see | |||
below). See the Risks of Depreferencing Routes section | below). See Section 5.2 for a discussion of potential risks | |||
(Section 5.2) for a discussion of potential risks inherent in | inherent in doing this. | |||
doing this. | ||||
* The route SHOULD NOT be advertised to any neighbor from which the | * The route SHOULD NOT be advertised to any neighbor from which the | |||
Long-lived Graceful Restart Capability has not been received. The | Long-Lived Graceful Restart Capability has not been received. The | |||
exception is described in the Optional Partial Deployment | exception is described in Section 4.6. Note that this requirement | |||
Procedure section (Section 4.6). Note that this requirement | ||||
implies that such routes should be withdrawn from any such | implies that such routes should be withdrawn from any such | |||
neighbor. | neighbor. | |||
* The "LLGR_STALE" community MUST NOT be removed when the route is | * The LLGR_STALE community MUST NOT be removed when the route is | |||
further advertised. | further advertised. | |||
4.4. Route Selection | 4.4. Route Selection | |||
A "least-preferred" route MUST be treated as less preferred than any | A least preferred route MUST be treated as less preferred than any | |||
other route that is not also least-preferred. When performing route | other route that is not also least preferred. When performing route | |||
selection between two routes both of which are least-preferred, | selection between two routes when both are least preferred, normal | |||
normal tie-breaking applies. Note that this would only be expected | tiebreaking applies. Note that this would only be expected to happen | |||
to happen if the only routes available for selection were least- | if the only routes available for selection were least preferred; in | |||
preferred -- in all other cases, such routes would have been | all other cases, such routes would have been eliminated from | |||
eliminated from consideration. | consideration. | |||
4.5. Errors | 4.5. Errors | |||
If the LLGR capability is received without an accompanying GR | If the LLGR Capability is received without an accompanying GR | |||
capability, the LLGR capability MUST be ignored, that is, the | Capability, the LLGR Capability MUST be ignored, that is, the | |||
implementation MUST behave as though no LLGR capability had been | implementation MUST behave as though no LLGR Capability has been | |||
received. | received. | |||
4.6. Optional Partial Deployment Procedure | 4.6. Optional Partial Deployment Procedure | |||
Ideally, all routers in an Autonomous System would support this | Ideally, all routers in an Autonomous System (AS) would support this | |||
specification before it was enabled. However, to facilitate | specification before it were enabled. However, to facilitate | |||
incremental deployment, stale routes MAY be advertised to neighbors | incremental deployment, stale routes MAY be advertised to neighbors | |||
that have not advertised the Long-lived Graceful Restart Capability | that have not advertised the Long-Lived Graceful Restart Capability | |||
under the following conditions: | under the following conditions: | |||
* The neighbors MUST be internal (IBGP or Confederation) neighbors. | * The neighbors MUST be internal (Internal BGP (IBGP) or | |||
Confederation) neighbors. | ||||
* The NO_EXPORT community [RFC1997] MUST be attached to the stale | * The NO_EXPORT community [RFC1997] MUST be attached to the stale | |||
routes. | routes. | |||
* The stale routes MUST have their LOCAL_PREF set to zero. See the | * The stale routes MUST have their LOCAL_PREF set to zero. See | |||
Risks of Depreferencing Routes section (Section 5.2) for a | Section 5.2 for a discussion of potential risks inherent in doing | |||
discussion of potential risks inherent in doing this. | this. | |||
If this strategy for partial deployment is used, the network operator | If this strategy for partial deployment is used, the network operator | |||
should set LOCAL_PREF to zero for all long-lived stale routes | should set the LOCAL_PREF to zero for all long-lived stale routes | |||
throughout the Autonomous System. This trades off a small reduction | throughout the Autonomous System. This trades off a small reduction | |||
in flexibility (ordering may not be preserved between competing long- | in flexibility (ordering may not be preserved between competing long- | |||
lived stale routes) for consistency between routers that do, and do | lived stale routes) for consistency between routers that do, and do | |||
not, support this specification. Since the consistency of route | not, support this specification. Since the consistency of route | |||
selection can be important for preventing forwarding loops, the | selection can be important for preventing forwarding loops, the | |||
latter consideration dominates. | latter consideration dominates. | |||
4.7. Procedures when BGP is the PE-CE Protocol in a VPN | 4.7. Procedures When BGP Is the PE-CE Protocol in a VPN | |||
4.7.1. Procedures when EBGP is the PE-CE Protocol in a VPN | 4.7.1. Procedures When EBGP Is the PE-CE Protocol in a VPN | |||
In VPN deployments, for example [RFC4364], EBGP is often used as a | In VPN deployments (for example, [RFC4364]), External BGP (EBGP) is | |||
PE-CE protocol. It may be a practical necessity in such deployments | often used as a PE-CE protocol. It may be a practical necessity in | |||
to accommodate interoperation with peer routers that cannot easily be | such deployments to accommodate interoperation with peer routers that | |||
upgraded to support specifications such as this one. This leads to a | cannot easily be upgraded to support specifications such as this one. | |||
problem: in this specification, we take pains to ensure that "stale" | This leads to a problem: the procedures defined elsewhere in this | |||
routing information will not leak beyond the perimeter of routers | document generally prevent LLGR stale routes from being sent across | |||
that support these procedures so that it can be depreferenced as | EBGP sessions that don't support LLGR, but this could prevent the VPN | |||
expected, and we provide a workaround (Section 4.6) for the case | routes from being used for their intended purpose. | |||
where one or more IBGP routers are not upgraded. However, in the VPN | ||||
PE-CE case, the protocol in use is EBGP, and our workaround does not | ||||
work since it relies on the use of LOCAL_PREF, an IBGP-only path | ||||
attribute. | ||||
We observe that the principal motivation for restricting the | We observe that the principal motivation for restricting the | |||
propagation of "stale" routing information is the desire to prevent | propagation of "stale" routing information is the desire to prevent | |||
it from spreading without limit once it exits the "safe" perimeter. | it from spreading without limit once it exits the "safe" perimeter. | |||
We further observe that VPN deployments are typically topologically | We further observe that VPN deployments are typically topologically | |||
constrained, making this concern moot. For this reason, an | constrained, making this concern moot. For this reason, an | |||
implementation MAY advertise stale routes over a PE-CE session, when | implementation MAY advertise stale routes over a PE-CE session, when | |||
explicitly configured to do so. That is, the second rule listed in | explicitly configured to do so. That is, the second rule listed in | |||
Section 4.3 MAY be disregarded in such cases. All other rules | Section 4.3 MAY be disregarded in such cases. All other rules | |||
continue to apply. Finally, if this exception is used, the | continue to apply. Finally, if this exception is used, the | |||
implementation SHOULD by default attach the NO_EXPORT community to | implementation SHOULD, by default, attach the NO_EXPORT community to | |||
the routes in question, as an additional protection against stale | the routes in question, as an additional protection against stale | |||
routes spreading without limit. Attachment of the NO_EXPORT | routes spreading without limit. Attachment of the NO_EXPORT | |||
community MAY be disabled by explicit configuration, to accommodate | community MAY be disabled by explicit configuration in order to | |||
exceptional cases. | accommodate exceptional cases. | |||
See further discussion of using explicitly configured policy to | See further discussion of using an explicitly configured policy to | |||
mitigate this issue in Section 5.1. | mitigate this issue in Section 5.1. | |||
4.7.2. Procedures when IBGP is the PE-CE Protocol in a VPN | 4.7.2. Procedures When IBGP Is the PE-CE Protocol in a VPN | |||
If IBGP is used as the PE-CE protocol, following the procedures of | If IBGP is used as the PE-CE protocol, following the procedures of | |||
[RFC6368], then when a PE router imports a VPN route that contains | [RFC6368], then when a PE router imports a VPN route that contains | |||
the ATTR_SET attribute into a destination VRF and subsequently | the ATTR_SET attribute into a destination VRF and subsequently | |||
advertises that route to a CE router, | advertises that route to a CE router: | |||
* If the CE router does support the procedures of this document (in | * If the CE router supports the procedures of this document (in | |||
other words, if the CE router has advertised the LLGR Capability): | other words, if the CE router has advertised the LLGR Capability): | |||
In addition to including in the advertised route the path | ||||
attributes derived from the ATTR_SET as per [RFC6368], the PE | ||||
router MUST also include the LLGR_STALE community if it is present | ||||
in the path attributes of the imported route, even if it is not | ||||
present in the ATTR_SET attribute. | ||||
* If the CE router does not support the procedures of this document, | In addition to including the path attributes derived from the | |||
then the optional procedures of Section 4.6 MAY be followed, | ATTR_SET attribute in the advertised route as per [RFC6368], | |||
attaching the NO_EXPORT community and setting the value of | the PE router MUST also include the LLGR_STALE community if it | |||
LOCAL_PREF to zero, overriding the value found in the ATTR_SET. | is present in the path attributes of the imported route, even | |||
if it is not present in the ATTR_SET attribute. | ||||
* If the CE router does not support the procedures of this document: | ||||
Then the optional procedures of Section 4.6 MAY be followed, | ||||
attaching the NO_EXPORT community and setting the value of | ||||
LOCAL_PREF to zero, overriding the value found in the ATTR_SET. | ||||
Similarly, when a PE router receives a route from a CE into its VRF | Similarly, when a PE router receives a route from a CE into its VRF | |||
and subsequently exports that route to a VPN address family, | and subsequently exports that route to a VPN address family: | |||
* If the PE router does support the procedures of this document (in | * If the PE router supports the procedures of this document (in | |||
other words, if the PE router has advertised the LLGR Capability): | other words, if the PE router has advertised the LLGR Capability): | |||
In addition to including in the VPN route the ATTR_SET derived | ||||
from the path attributes as per [RFC6368], the PE router MUST also | ||||
include the LLGR_STALE community in the VPN route if it is present | ||||
in the path attributes of the route as received from the CE. | ||||
* If the PE router does not support the procedures of this document, | In addition to including in the VPN route the ATTR_SET derived | |||
there exists no ideal solution. The CE could advertise a route | from the path attributes as per [RFC6368], the PE router MUST | |||
with LLGR_STALE, with the understanding that the LLGR_STALE | also include the LLGR_STALE community in the VPN route if it is | |||
marking will only be honored by the provider network if | present in the path attributes of the route as received from | |||
appropriate policy configuration exists on the PE (see | the CE. | |||
Section 5.1). It is at least guaranteed that LLGR_STALE will be | ||||
propagated when the route is propagated beyond the provider | * If the PE router does not support the procedures of this document: | |||
network. Or, the CE could refrain from advertising the LLGR_STALE | ||||
route to the incapable PE. | There exists no ideal solution. The CE could advertise a route | |||
with LLGR_STALE, with the understanding that the LLGR_STALE | ||||
marking will only be honored by the provider network if | ||||
appropriate policy configuration exists on the PE (see | ||||
Section 5.1). It is at least guaranteed that LLGR_STALE will | ||||
be propagated when the route is propagated beyond the provider | ||||
network, or the CE could refrain from advertising the | ||||
LLGR_STALE route to the incapable PE. | ||||
5. Deployment Considerations | 5. Deployment Considerations | |||
The deployment considerations discussed in [RFC4724] apply to this | The deployment considerations discussed in [RFC4724] apply to this | |||
document. In addition, network operators are cautioned to carefully | document. In addition, network operators are cautioned to carefully | |||
consider the potential disadvantages of deploying these procedures | consider the potential disadvantages of deploying these procedures | |||
for a given AFI/SAFI. Most notably, if used for an AFI/SAFI that | for a given AFI/SAFI. Most notably, if used for an AFI/SAFI that | |||
conveys traditional reachability information, the use of a long-lived | conveys conventional reachability information, the use of a long- | |||
stale route could result in a loss of connectivity for the covered | lived stale route could result in a loss of connectivity for the | |||
prefix. This specification takes pains to mitigate this risk where | covered prefix. This specification takes pains to mitigate this risk | |||
possible, by making such routes least-preferred and by restricting | where possible by making such routes least preferred and by | |||
the scope of such routes to routers that support these procedures | restricting the scope of such routes to routers that support these | |||
(or, optionally, a single Autonomous System, see "Optional Partial | procedures (or, optionally, a single Autonomous System, see | |||
Deployment Procedure" (Section 4.6)). However, according to the | Section 4.6). However, if a stale route is chosen as best for a | |||
normal rules of IP forwarding a stale more-specific route, that has | given prefix, then according to the normal rules of IP forwarding, | |||
no non-stale alternate paths available, will still be used instead of | that route will be used for matching destinations, even if a non- | |||
a non-stale less-specific route. Networks in which the deployment of | stale less specific matching route is also available. Networks in | |||
these procedures would be especially concerning include those which | which the deployment of these procedures would be especially | |||
do not use "tunneled" forwarding (in other words, those using | concerning include those that do not use "tunneled" forwarding (in | |||
traditional hop-by-hop forwarding). | other words, those using conventional hop-by-hop forwarding). | |||
Implementations MUST NOT enable these procedures by default. They | Implementations MUST NOT enable these procedures by default. They | |||
MUST require affirmative configuration per AFI/SAFI in order to | MUST require affirmative configuration per AFI/SAFI in order to | |||
enable them. | enable them. | |||
The procedures of this document do not alter the route resolvability | The procedures of this document do not alter the route resolvability | |||
requirement of Section 9.1.2.1 of [RFC4271]. Because of this, it | requirement of Section 9.1.2.1 of [RFC4271]. Because of this, it | |||
will commonly be the case that "stale" IBGP routes will only continue | will commonly be the case that "stale" IBGP routes will only continue | |||
to be used if the router depicted in the next hop remains resolvable, | to be used if the router depicted in the next hop remains resolvable, | |||
even if its BGP component is down. Details of IGP fault-tolerance | even if its BGP component is down. Details of IGP fault-tolerance | |||
strategies are beyond the scope of this document. In addition to the | strategies are beyond the scope of this document. In addition to the | |||
foregoing, it may be advisable to check the viability of the next hop | foregoing, it may be advisable to check the viability of the next hop | |||
through other means, for example, BFD [RFC5880]. This may be | through other means, for example, Bidirectional Forwarding Detection | |||
especially useful in cases where the next hop is known directly at | (BFD) [RFC5880]. This may be especially useful in cases where the | |||
the network layer, notably EBGP. | next hop is known directly at the network layer, notably EBGP. | |||
As discussed in this document, after a BGP session goes down and | As discussed in this document, after a BGP session goes down and | |||
before the session is re-established, stale routes may be retained | before the session is re-established, stale routes may be retained | |||
for up to two consecutive periods, controlled by the "Restart Time" | for up to two consecutive periods, controlled by the Restart Time and | |||
and the "Long-lived Stale Time", respectively. During the first | the Long-Lived Stale Time, respectively: | |||
period routing churn would be prevented but with potential | ||||
blackholing of traffic. During the second period potential | ||||
blackholing of traffic may be reduced but routing churn would be | ||||
visible throughout the network. The setting of the relevant | ||||
parameters for a particular application should take into account the | ||||
tradeoffs, the network dynamics, and potential failure scenarios. If | ||||
needed, the first period can be bypassed either by local | ||||
configuration or by setting the "Restart Time" in the Graceful | ||||
Restart Capability to zero and/or not listing the AFI/SAFI in that | ||||
Capability. | ||||
The setting of the F bit (and the "Forwarding State" bit of the | * During the first period, routing churn would be prevented, but | |||
accompanying GR capability) depends in part on deployment | with potential persistent packet loss. | |||
* During the second period, potential persistent packet loss may be | ||||
reduced, but routing churn would be visible throughout the | ||||
network. | ||||
The setting of the relevant parameters for a particular application | ||||
should take into account trade-offs, network dynamics, and potential | ||||
failure scenarios. If needed, the first period can be bypassed | ||||
either by local configuration or by setting the Restart Time in the | ||||
Graceful Restart Capability to zero and/or not listing the AFI/SAFI | ||||
in that capability. | ||||
The setting of the F bit (and the Forwarding State bit of the | ||||
accompanying GR Capability) depends, in part, on deployment | ||||
considerations. The F bit can be understood as an indication that | considerations. The F bit can be understood as an indication that | |||
the Helper should flush associated routes (if the bit is left clear). | the Helper should flush associated routes (if the bit is left clear). | |||
As discussed in the Introduction (Section 1), an important use case | As discussed in Section 1, an important use case for LLGR is for | |||
for LLGR is for routes that are more akin to configuration than to | routes that are more akin to configuration than to conventional | |||
traditional routing. For such routes, it may make sense to always | routing. For such routes, it may make sense to always set the F bit, | |||
set the F bit, regardless of other considerations. Likewise, for | regardless of other considerations. Likewise, for control-plane-only | |||
control-plane-only entities such as dedicated route reflectors, that | entities, such as dedicated route reflectors that do not participate | |||
do not participate in the forwarding plane, it makes sense to always | in the forwarding plane, it makes sense to always set the F bit. | |||
set the F bit. Overall, the rule of thumb is that if loss of state | Overall, the rule of thumb is that if loss of state on the restarting | |||
on the restarting router can reasonably be expected to cause a | router can reasonably be expected to cause a forwarding loop or | |||
forwarding loop or black hole, the F bit should be set scrupulously | persistent packet loss, the F bit should be set scrupulously | |||
according to whether state has been retained. Specifics of when the | according to whether state has been retained. Specifics of whether | |||
F bit is, and is not, set are implementation-dependent and may also | or not the F bit is set are implementation dependent and may also be | |||
be controlled by configuration. Also, for every AFI/SAFI represented | controlled by configuration. Also, for every AFI/SAFI represented in | |||
in the LLGR capability that is also represented in the GR capability, | the LLGR Capability that is also represented in the GR Capability, | |||
there will be two corresponding F bits -- the LLGR F bit and the GR F | there will be two corresponding F bits: the LLGR F bit and the GR F | |||
bit. If the LLGR F bit is set, the corresponding GR F bit should | bit. If the LLGR F bit is set, the corresponding GR F bit should | |||
also be set, since to do otherwise would cause the state to be | also be set, since to do otherwise would cause the state to be | |||
cleared on the Receiving Router per the normal rules of GR, violating | cleared on the Receiving Router per the normal rules of GR, violating | |||
the intent of the set LLGR bit. | the intent of the set LLGR bit. | |||
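The pairing rule for the two F bits can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name and the dictionary representation (AFI/SAFI pair mapped to the F bit value) are assumptions made for illustration. It reports every AFI/SAFI that appears in both capabilities with the LLGR F bit set and the GR F bit clear.

```python
from typing import Dict, List, Tuple

# (AFI, SAFI) pair, e.g. (1, 1) for IPv4 unicast. Hypothetical representation.
AfiSafi = Tuple[int, int]


def inconsistent_f_bits(gr_f: Dict[AfiSafi, bool],
                        llgr_f: Dict[AfiSafi, bool]) -> List[AfiSafi]:
    """Return AFI/SAFIs listed in both capabilities where the LLGR F bit
    is set but the corresponding GR F bit is clear."""
    return sorted(
        family
        for family, f_bit in llgr_f.items()
        if f_bit and family in gr_f and not gr_f[family]
    )


gr = {(1, 1): True, (1, 128): False}
llgr = {(1, 1): True, (1, 128): True, (2, 1): True}
print(inconsistent_f_bits(gr, llgr))
```

An AFI/SAFI that is listed only in the LLGR Capability is not reported, since there is no GR F bit for it to disagree with.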
5.1. When BGP is the PE-CE Protocol in a VPN | 5.1. When BGP Is the PE-CE Protocol in a VPN | |||
As discussed in Section 4.7, it may be necessary for a PE to | As discussed in Section 4.7, it may be necessary for a PE to | |||
advertise stale routes to a CE in some VPN deployments, even if the | advertise stale routes to a CE in some VPN deployments, even if the | |||
CE does not support this specification. In that case, the operator | CE does not support this specification. In that case, the operator | |||
configuring their PE to advertise such routes should notify the | configuring their PE to advertise such routes should notify the | |||
operator of the CE receiving the routes, and the CE should be | operator of the CE receiving the routes, and the CE should be | |||
configured to depreference the routes. | configured to depreference the routes. | |||
Similarly, it may be necessary for a CE to advertise stale routes to | Similarly, it may be necessary for a CE to advertise stale routes to | |||
a PE, even if the PE does not support this specification. In that | a PE, even if the PE does not support this specification. In that | |||
skipping to change at page 16, line 15 | skipping to change at line 724 | |||
Consistent route selection is a fundamental tenet of IBGP correctness | Consistent route selection is a fundamental tenet of IBGP correctness | |||
and safe operation in hop-by-hop routed networks. When routers | and safe operation in hop-by-hop routed networks. When routers | |||
within an AS apply different criteria in selecting routes, they can | within an AS apply different criteria in selecting routes, they can | |||
arrive at inconsistent route selections. This can lead to the | arrive at inconsistent route selections. This can lead to the | |||
formation of forwarding loops unless some form of tunneled forwarding | formation of forwarding loops unless some form of tunneled forwarding | |||
is used to prevent "core" routers from making a (potentially | is used to prevent "core" routers from making a (potentially | |||
inconsistent) forwarding decision based on the IP header. | inconsistent) forwarding decision based on the IP header. | |||
This specification uses the state of a peering session as an input to | This specification uses the state of a peering session as an input to | |||
the selection criteria, depreferencing routes that are associated | the selection criteria, depreferencing routes that are associated | |||
with a session that has gone down but have not yet aged out. Since | with a session that has gone down but that have not yet aged out. | |||
different routers within an AS might have different notions as to | Since different routers within an AS might have different notions as | |||
whether their respective sessions with a given peer are up or down, | to whether their respective sessions with a given peer are up or | |||
they might apply different selection criteria to routes from that | down, they might apply different selection criteria to routes from | |||
peer. This could result in a forwarding loop forming between such | that peer. This could result in a forwarding loop forming between | |||
routers. | such routers. | |||
For an example of such a forwarding loop, consider the following | For an example of such a forwarding loop, consider the following | |||
simple topology: | simple topology: | |||
A ---- B ---- C ------------------------- D | A ---- B ---- C ------------------------- D | |||
^ ^ | ^ ^ | |||
| | | | | | |||
R1 R2 | R1 R2 | |||
Figure 1 | ||||
In this example, A - D are routers with a full mesh of IBGP sessions | In this example, A - D are routers with a full mesh of IBGP sessions | |||
between them (the sessions are not shown). The short links have unit | between them (the sessions are not shown). The short links have unit | |||
cost, the long link has cost 5. Routers A and D are AS border | cost, the long link has cost 5. Routers A and D are AS border | |||
routers, each advertising some route, R, into the AS -- these are | routers, each advertising some route, R, with the same LOCAL_PREF | |||
denoted R1 and R2 in the diagram. In ordinary operation, it can be | into the AS: denoted R1 and R2 in the diagram. In ordinary | |||
seen that routers B and C will select R1 for forwarding, and will | operation, it can be seen that routers B and C will select R1 for | |||
forward toward A. | forwarding and will forward toward A. | |||
Suppose that the session between A and B goes down for some reason, | Suppose that the session between A and B goes down for some reason, | |||
and stays down long enough for LLGR processing to be invoked on B. | and it stays down long enough for LLGR processing to be invoked on B. | |||
Then on B, route R1 will be depreferenced, leading to the selection | Then, on B, route R1 will be depreferenced, leading to the selection | |||
of R2 by B. However, C will continue to prefer R1. It can be seen | of R2 by B. However, C will continue to prefer R1. In this case, it | |||
that in this case, a forwarding loop for packets destined to R would | can be seen that a forwarding loop for packets destined to R would | |||
form between B and C. (We note that other forwarding loop scenarios | form between B and C. (We note that other forwarding loop scenarios | |||
can be constructed for traditional GR, but are generally considered | can be constructed for conventional GR, but these are generally | |||
less severe since GR can remain in effect for a much more limited | considered less severe since GR can remain in effect for a much more | |||
interval.) | limited interval.) | |||
The potential benefits of this specification can outweigh the risks | The potential benefits of this specification can outweigh the risks | |||
discussed above, as long as care is exercised in deployment. The | discussed above, as long as care is exercised in deployment. The | |||
cardinal rule to be followed is, if a given set of routes are being | cardinal rule to be followed is that if a given set of routes is | |||
used within an AS for hop-by-hop forwarding, it is not recommended to | being used within an AS for hop-by-hop forwarding, enabling LLGR | |||
enable LLGR procedures. If tunneled forwarding (such as MPLS) is | procedures is not recommended. If tunneled forwarding (such as MPLS) | |||
used within the AS, or if routes are being used for purposes other | is used within the AS, or if routes are being used for purposes other | |||
than hop-by-hop forwarding, less caution is needed, though the | than hop-by-hop forwarding, less caution is needed; however, the | |||
operator should still carefully consider the consequences of enabling | operator should still carefully consider the consequences of enabling | |||
LLGR. | LLGR. | |||
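The forwarding loop in the example topology can be reproduced with a small model. This is an illustrative sketch only: it assumes that both routes have equal LOCAL_PREF, that a router prefers any non-stale route over a long-lived stale one, and that ties are otherwise broken by the lowest IGP cost to the egress router. The names POS, ORDER, EGRESS, select, and next_hop are hypothetical.

```python
# Positions along the chain A - B - C - D: short links cost 1, the long link costs 5.
POS = {"A": 0, "B": 1, "C": 2, "D": 7}
ORDER = ["A", "B", "C", "D"]
# R1 is advertised by border router A, R2 by border router D.
EGRESS = {"R1": "A", "R2": "D"}


def select(router, stale):
    """Pick a route: non-stale routes first (the LLGR depreference),
    then the lowest IGP cost to the egress router."""
    return min(EGRESS, key=lambda r: (r in stale, abs(POS[router] - POS[EGRESS[r]])))


def next_hop(router, stale):
    """Return the neighbor that `router` forwards to for destination R."""
    egress = EGRESS[select(router, stale)]
    if egress == router:
        return router
    i = ORDER.index(router)
    return ORDER[i + 1] if POS[egress] > POS[router] else ORDER[i - 1]


# Ordinary operation: B and C both select R1 and forward toward A.
print(select("B", set()), next_hop("B", set()))
print(select("C", set()), next_hop("C", set()))

# The A-B session is down: only B treats R1 as long-lived stale.
print(next_hop("B", {"R1"}), next_hop("C", set()))
```

In the last case, B forwards to C while C forwards to B, which is the loop described above.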
6. Security Considerations | 6. Security Considerations | |||
The security implications of the LLGR mechanism defined in this | The security implications of the LLGR mechanism defined in this | |||
document are akin to those incurred by the maintenance of stale | document are akin to those incurred by the maintenance of stale | |||
routing information within a network. However, since the retention | routing information within a network. However, since the retention | |||
time may potentially be much longer, the window during which certain | time may be much longer, the window during which certain attacks are | |||
attacks are feasible may be substantially increased. This is | feasible may substantially increase. This is particularly relevant | |||
particularly relevant when considering the maintenance of routing | when considering the maintenance of routing information that is used | |||
information that is used for service segregation - such as MPLS label | for service segregation, such as MPLS label entries. | |||
entries. | ||||
For MPLS VPN services, the effectiveness of the traffic isolation | For MPLS VPN services, the effectiveness of the traffic isolation | |||
between VPNs relies on the correctness of the MPLS labels between | between VPNs relies on the correctness of the MPLS labels between | |||
ingress and egress PEs. In particular, when an egress PE withdraws a | ingress and egress PEs. In particular, when an egress PE withdraws a | |||
label L1 allocated to a VPN1 route, this label must not be assigned | label L1 allocated to a VPN1 route, this label must not be assigned | |||
to a VPN route of a different VPN until all ingress PEs stop using | to a VPN route of a different VPN until all ingress PEs stop using | |||
the old VPN1 route using L1. | the old VPN1 route using L1. | |||
Such a corner case may happen today if the propagation of VPN routes | Such a corner case may happen today if the propagation of VPN routes | |||
by BGP messages between PEs takes more time than the label re- | by BGP messages between PEs takes more time than the label | |||
allocation delay on a PE. Given that we can generally bound the | reallocation delay on a PE. Given that we can generally bound the | |||
worst-case BGP propagation time to a few minutes (for example 2-5), | worst-case BGP propagation time to a few minutes (for example, 2-5 | |||
the security breach will not occur if PEs are designed to not | minutes), the security breach will not occur if PEs are designed to | |||
reallocate a previously used and withdrawn label before a few | not reallocate a previously used and withdrawn label before a few | |||
minutes. | minutes. | |||
The problem is made worse with BGP GR between PEs as VPN routes can | The problem is made worse with BGP GR between PEs because VPN routes | |||
be stalled for a longer period of time (for example 20 minutes). | can be stalled for a longer period of time (for example, 20 minutes). | |||
This is further aggravated by the BGP LLGR extension proposed in this | This is further aggravated by the LLGR extension specified in this | |||
document as VPN routes can be stalled for a much longer period of | document because VPN routes can be stalled for a much longer period | |||
time (for example 2 hours, 1 day). | of time (for example, 2 hours, 1 day). | |||
In order to exploit the vulnerability described above, there is a | In order to exploit the vulnerability described above, an attacker | |||
requirement to engineer a specific LLGR state between two PE devices, | needs to engineer a specific LLGR state between two PE devices and | |||
whilst engineering label reallocation to occur in a manner that | also cause the label reallocation to occur such that the two | |||
results in the two topologies overlapping. Therefore, to avoid the | topologies overlap. To avoid the potential for a VPN breach, the | |||
potential for a VPN breach, before enabling BGP LLGR for a VPN | operator should ensure that the lower bound for label reuse is | |||
address family, the operator should endeavor to ensure that the lower | greater than the upper bound on the LLST before enabling LLGR for a | |||
bound on when a label might be reused is greater than the upper bound | VPN address family. Section 4.2 discusses the provision of an upper | |||
on LLST. Section 4.2 discusses the provision of an upper bound on | bound on LLST. Details of features for setting a lower bound on | |||
LLST. Details of features for setting a lower bound on label reuse | label reuse time are beyond the scope of this document; however, | |||
time are beyond the scope of this document; however, factors that | factors that might need to be taken into account when setting this | |||
might need to be taken into account when setting this value include: | value include: | |||
* The load of the BGP route churn on a PE (in terms of the number of | * The load of the BGP route churn on a PE (in terms of the number of | |||
VPN labels advertised and the churn rate). | VPN labels advertised and the churn rate). | |||
* The label allocation policy on the PE (possibly depending upon the | * The label allocation policy on the PE, which possibly depends upon | |||
size of the pool of the VPN labels (which can be restricted by | the size of the pool of the VPN labels (which can be restricted by | |||
hardware considerations or other MPLS usages), the label | hardware considerations or other MPLS usages), the label | |||
allocation scheme (for example per route or per VRF/CE), the re- | allocation scheme (for example, per route or per VRF/CE), and the | |||
allocation policy (for example least recently used label). | reallocation policy (for example, least recently used label). | |||
Note that [RFC4781] which defines Graceful Restart Mechanism for BGP | Note that [RFC4781], which defines the Graceful Restart Mechanism for | |||
with MPLS is also applicable to BGP LLGR. | BGP with MPLS, is also applicable to LLGR. | |||
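The condition on label reuse can be expressed as a simple pre-deployment check. The sketch below is illustrative and assumes the operator knows the minimum label reuse delay of the PE and the LLST upper bounds configured for the relevant sessions, all in seconds. The function and parameter names are hypothetical.

```python
from typing import Iterable


def label_reuse_is_safe(min_label_reuse_seconds: int,
                        llst_upper_bounds_seconds: Iterable[int]) -> bool:
    """True if the lower bound for label reuse is strictly greater than
    the largest upper bound on LLST among the given sessions."""
    return min_label_reuse_seconds > max(llst_upper_bounds_seconds, default=0)


# One session capped at 2 hours and one at 1 day; labels are reused after 25 hours.
print(label_reuse_is_safe(25 * 3600, [2 * 3600, 24 * 3600]))
# Labels are reused after 1 hour while LLST may reach 2 hours.
print(label_reuse_is_safe(3600, [2 * 3600]))
```

A more conservative check might also add the Restart Time to each bound, since stale routes can be retained through both periods.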
7. Examples of Operation | 7. Examples of Operation | |||
For illustrative purposes, we present a few examples of how this | For illustrative purposes, we present a few examples of how this | |||
specification might be used in practice. These examples are neither | specification might be used in practice. These examples are neither | |||
exhaustive nor normative. | exhaustive nor normative. | |||
Consider the following scenario: A border router, ASBR1, has an IBGP | Consider the following scenario: A border router, ASBR1, has an IBGP | |||
peering with a route reflector, RR1, from which it learns routes. It | peering with a route reflector, RR1, from which it learns routes. It | |||
has an EBGP peering with an external peer, EXT, to which it | has an EBGP peering with an external peer, EXT, to which it | |||
advertises those routes. The external peer has advertised the GR and | advertises those routes. The external peer has advertised the GR and | |||
LLGR Capabilities to ASBR1. ASBR1 is configured to support GR and | LLGR Capabilities to ASBR1. ASBR1 is configured to support GR and | |||
LLGR on its sessions with RR1 and EXT. RR1 advertises a GR Restart | LLGR on its sessions with RR1 and EXT. RR1 advertises a GR Restart | |||
Time of 1 (second) and an LLST of 3600 (seconds): | Time of 1 (second) and an LLST of 3600 (seconds): | |||
+==========+=====================================================+ | +==========+=====================================================+ | |||
| Time | Event | | | Time | Event | | |||
+==========+=====================================================+ | +==========+=====================================================+ | |||
| t | ASBR1's IBGP session with RR fails. ASBR1 retains | | | t | ASBR1's IBGP session with RR fails. ASBR1 retains | | |||
| | RR's routes according to the rules of GR [RFC4724] | | | | RR's routes according to the rules of GR [RFC4724]. | | |||
+----------+-----------------------------------------------------+ | +----------+-----------------------------------------------------+ | |||
| t+1 | GR Restart Time expires. ASBR1 transitions RR's | | | t+1 | GR Restart Time expires. ASBR1 transitions RR's | | |||
| | routes to long-lived stale by attaching the | | | | routes to long-lived stale routes by attaching the | | |||
| | LLGR_STALE community and depreferencing them. | | | | LLGR_STALE community and depreferencing them. | | |||
| | However, since it has no backup routes, it | | | | However, since it has no backup routes, it | | |||
| | continues to make use of them. It re-announces | | | | continues to make use of them. It re-announces | | |||
| | them to EXT with the LLGR_STALE community attached. | | | | them to EXT with the LLGR_STALE community attached. | | |||
+----------+-----------------------------------------------------+ | +----------+-----------------------------------------------------+ | |||
| t+1+3600 | LLST expires. ASBR1 removes RR's stale routes from | | | t+1+3600 | LLST expires. ASBR1 removes RR's stale routes from | | |||
| | its own RIB and sends BGP updates to withdraw them | | | | its own RIB and sends BGP updates to withdraw them | | |||
| | from EXT. | | | | from EXT. | | |||
+----------+-----------------------------------------------------+ | +----------+-----------------------------------------------------+ | |||
Table 1 | Table 1 | |||
Next, imagine the same scenario but suppose RR1 advertised a GR | Next, imagine the same scenario, but suppose RR1 advertised a GR | |||
Restart Time of zero, effectively disabling GR. Equally, ASBR1 could | Restart Time of zero, effectively disabling GR. Equally, ASBR1 could | |||
have used local configuration to override RR1's offered Restart Time, | have used a local configuration to override RR1's offered Restart | |||
setting it to a locally-configured value of zero: | Time, setting it to a locally configured value of zero: | |||
+==========+=======================================================+ | +==========+=======================================================+ | |||
| Time | Event | | | Time | Event | | |||
+==========+=======================================================+ | +==========+=======================================================+ | |||
| t | ASBR1's IBGP session with RR fails. ASBR1 | | | t | ASBR1's IBGP session with RR fails. ASBR1 | | |||
| | transitions RR's routes to long-lived stale by | | | | transitions RR's routes to long-lived stale routes by | | |||
| | attaching the LLGR_STALE community and depreferencing | | | | attaching the LLGR_STALE community and depreferencing | | |||
| | them. However, since it has no backup routes, it | | | | them. However, since it has no backup routes, it | | |||
| | continues to make use of them. It re-announces them | | | | continues to make use of them. It re-announces them | | |||
| | to EXT with the LLGR_STALE community attached. | | | | to EXT with the LLGR_STALE community attached. | | |||
+----------+-------------------------------------------------------+ | +----------+-------------------------------------------------------+ | |||
| t+0+3600 | LLST expires. ASBR1 removes RR's stale routes from | | | t+0+3600 | LLST expires. ASBR1 removes RR's stale routes from | | |||
| | its own RIB and sends BGP updates to withdraw them | | | | its own RIB and sends BGP updates to withdraw them | | |||
| | from EXT. | | | | from EXT. | | |||
+----------+-------------------------------------------------------+ | +----------+-------------------------------------------------------+ | |||
Table 2 | Table 2 | |||
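The timelines in the first two tables follow from the two consecutive retention periods. The following sketch is a simplification: it assumes that the session stays down and that times are whole seconds measured from the failure. The function name and the state labels are hypothetical.

```python
def stale_route_state(elapsed: int, restart_time: int, llst: int) -> str:
    """Classify routes from a failed session `elapsed` seconds after the failure."""
    if elapsed < restart_time:
        return "gr-stale"      # first period: retained per GR [RFC4724]
    if elapsed < restart_time + llst:
        return "llgr-stale"    # second period: LLGR_STALE attached, depreferenced
    return "removed"           # LLST expired: routes removed from the RIB


# Table 1: Restart Time 1, LLST 3600.
for t in (0, 1, 1 + 3600):
    print(t, stale_route_state(t, restart_time=1, llst=3600))

# Table 2: Restart Time 0, so the first period is bypassed.
for t in (0, 0 + 3600):
    print(t, stale_route_state(t, restart_time=0, llst=3600))
```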
Next, imagine the original scenario, but consider that the ASBR1-RR1 | Next, imagine the original scenario, but consider that the ASBR1-RR1 | |||
session comes back up and becomes synchronized 180 seconds after the | session comes back up and becomes synchronized 180 seconds after the | |||
failure was detected: | failure was detected: | |||
+=========+=====================================================+ | +=========+=====================================================+ | |||
| Time | Event | | | Time | Event | | |||
+=========+=====================================================+ | +=========+=====================================================+ | |||
| t | ASBR1's IBGP session with RR fails. ASBR1 retains | | | t | ASBR1's IBGP session with RR fails. ASBR1 retains | | |||
| | RR's routes according to the rules of GR [RFC4724] | | | | RR's routes according to the rules of GR [RFC4724]. | | |||
+---------+-----------------------------------------------------+ | +---------+-----------------------------------------------------+ | |||
| t+1 | GR Restart Time expires. ASBR1 transitions RR's | | | t+1 | GR Restart Time expires. ASBR1 transitions RR's | | |||
| | routes to long-lived stale by attaching the | | | | routes to long-lived stale routes by attaching the | | |||
| | LLGR_STALE community and depreferencing them. | | | | LLGR_STALE community and depreferencing them. | | |||
| | However, since it has no backup routes, it | | | | However, since it has no backup routes, it | | |||
| | continues to make use of them. It re-announces | | | | continues to make use of them. It re-announces | | |||
| | them to EXT with the LLGR_STALE community attached. | | | | them to EXT with the LLGR_STALE community attached. | | |||
+---------+-----------------------------------------------------+ | +---------+-----------------------------------------------------+ | |||
| t+1+179 | Session is reestablished and resynchronized. ASBR1 | | | t+1+179 | Session is re-established and resynchronized. | | |||
| | removes the LLGR_STALE community from RR1's routes | | | | ASBR1 removes the LLGR_STALE community from RR1's | | |||
| | and re-announces them to EXT with the LLGR_STALE | | | | routes and re-announces them to EXT with the | | |||
| | community removed. | | | | LLGR_STALE community removed. | | |||
+---------+-----------------------------------------------------+ | +---------+-----------------------------------------------------+ | |||
Table 3 | Table 3 | |||
Finally, imagine the original scenario, but consider that EXT has not | Finally, imagine the original scenario, but consider that EXT has not | |||
advertised the LLGR Capability to ASBR1: | advertised the LLGR Capability to ASBR1: | |||
+==========+======================================================+ | +==========+======================================================+ | |||
| Time | Event | | | Time | Event | | |||
+==========+======================================================+ | +==========+======================================================+ | |||
| t | ASBR1's IBGP session with RR fails. ASBR1 retains | | | t | ASBR1's IBGP session with RR fails. ASBR1 retains | | |||
| | RR's routes according to the rules of GR [RFC4724] | | | | RR's routes according to the rules of GR [RFC4724]. | | |||
+----------+------------------------------------------------------+ | +----------+------------------------------------------------------+ | |||
| t+1 | GR Restart Time expires. ASBR1 transitions RR's | | | t+1 | GR Restart Time expires. ASBR1 transitions RR's | | |||
| | routes to long-lived stale by attaching the | | | | routes to long-lived stale routes by attaching the | | |||
| | LLGR_STALE community and depreferencing them. | | | | LLGR_STALE community and depreferencing them. | | |||
| | However, since it has no backup routes, it continues | | | | However, since it has no backup routes, it continues | | |||
| | to make use of them. It withdraws them from EXT. | | | | to make use of them. It withdraws them from EXT. | | |||
+----------+------------------------------------------------------+ | +----------+------------------------------------------------------+ | |||
| t+1+3600 | LLST expires. ASBR1 removes RR's stale routes from | | | t+1+3600 | LLST expires. ASBR1 removes RR's stale routes from | | |||
| | its own RIB. | | | | its own RIB. | | |||
+----------+------------------------------------------------------+ | +----------+------------------------------------------------------+ | |||
Table 4 | Table 4 | |||
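The difference between the first and the last table is what ASBR1 sends to EXT once the routes become long-lived stale. A minimal sketch of that decision follows, assuming communities are held as a set of 32-bit integers. It ignores the partial-deployment cases of Section 5.1, where stale routes may be advertised even to a peer that does not support this specification. The function name is hypothetical.

```python
from typing import Optional, Set

LLGR_STALE = 0xFFFF0006  # well-known community value assigned by IANA


def export_long_lived_stale(communities: Set[int],
                            peer_has_llgr: bool) -> Optional[Set[int]]:
    """Return the community set to re-announce the route with,
    or None if the route is withdrawn from this peer."""
    if not peer_has_llgr:
        return None                      # Table 4: withdraw from EXT
    return communities | {LLGR_STALE}    # Table 1: re-announce, marked stale


print(export_long_lived_stale({0x00640001}, peer_has_llgr=True))
print(export_long_lived_stale({0x00640001}, peer_has_llgr=False))
```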
8. Acknowledgements | 8. IANA Considerations | |||
We would like to thank Nabil Bitar, Martin Djernaes, Roberto | ||||
Fragassi, Jeffrey Haas, Jakob Heitz, Daniam Henriques, Nicolai | ||||
Leymann, Mike McBride, Paul Mattes, John Medamana, Pranav Mehta, Han | ||||
Nguyen, Saikat Ray, Valery Smyslov, and Bo Wu for their valuable | ||||
input and contributions to the discussion and solution. | ||||
9. Contributors | ||||
Clarence Filsfils | ||||
Cisco Systems | ||||
Brussels 1150 | ||||
Belgium | ||||
Email: cf@cisco.com | ||||
Pradosh Mohapatra | ||||
Sproute Networks | ||||
Email: mpradosh@yahoo.com | ||||
Yakov Rekhter | ||||
Eric Rosen | ||||
Email: erosen52@gmail.com | ||||
Rob Shakir | ||||
Google, Inc. | ||||
1600 Amphitheatre Parkway | ||||
Mountain View, CA 94043 | ||||
United States of America | ||||
Email: robjs@google.com | ||||
Adam Simpson | This document defines a BGP capability called the "Long-Lived | |||
Nokia | Graceful Restart Capability". IANA has assigned a value of 71 from | |||
the "Capability Codes" registry. | ||||
Email: adam.1.simpson@nokia.com | This document introduces two BGP well-known communities: | |||
10. IANA Considerations | * the first called "LLGR_STALE" for marking long-lived stale routes, | |||
and | ||||
This document defines a new BGP capability - Long-lived Graceful | * the second called "NO_LLGR" for marking routes that should not be | |||
Restart Capability. IANA has assigned a Capability Code of 71, from | retained if stale. | |||
the "Capability Codes" registry. | ||||
This document introduces a new BGP well-known community "LLGR_STALE" | IANA has assigned these well-known community values 0xFFFF0006 and | |||
for marking long-lived stale routes, and another well-known community | ||||
"NO_LLGR" to mark routes that should not be retained if stale. IANA | ||||
has assigned these well-known community values 0xFFFF0006 and | ||||
0xFFFF0007, respectively, from the "BGP Well-known Communities" | 0xFFFF0007, respectively, from the "BGP Well-known Communities" | |||
registry. | registry. | |||
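For reference, the assigned values can be written as constants. The sketch below also renders a community in the "high:low" decimal form that operators commonly use for the BGP Communities attribute [RFC1997]. That notation is a display convention rather than something defined in this document, and the helper name is hypothetical.

```python
CAPABILITY_CODE_LLGR = 71   # Long-Lived Graceful Restart Capability
LLGR_STALE = 0xFFFF0006     # marks long-lived stale routes
NO_LLGR = 0xFFFF0007        # marks routes that should not be retained if stale


def community_to_text(value: int) -> str:
    """Format a 32-bit community as 'high16:low16' in decimal."""
    return f"{value >> 16}:{value & 0xFFFF}"


print(community_to_text(LLGR_STALE))
print(community_to_text(NO_LLGR))
```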
For each of these three registrations, IANA is requested to update | IANA has established a registry called the "Long-Lived Graceful | |||
the reference to refer to this document. | Restart Flags for Address Family" registry under the "Border Gateway | |||
Protocol (BGP) Parameters" group. The registration procedures are | ||||
IANA is requested to establish a new registry called "Long-lived | Standards Action (see [RFC8126]). The registry is initially | |||
Graceful Restart Flags for Address Family" under the Border Gateway | populated as follows: | |||
Protocol (BGP) Parameters group. The registration procedures are | ||||
Standards Action. The registry should initially be populated as | ||||
follows: | ||||
+==============+=======================+============+===============+ | +==============+=======================+============+===========+ | |||
| Bit Position | Name | Short Name | Reference | | | Bit Position | Name | Short Name | Reference | | |||
+==============+=======================+============+===============+ | +==============+=======================+============+===========+ | |||
| 0 | Preservation of state | F | This | | | 0 | Preservation of state | F | RFC 9494 | | |||
| | | | document | | +--------------+-----------------------+------------+-----------+ | |||
+--------------+-----------------------+------------+---------------+ | | 1-7 | Unassigned | | | | |||
| 1-7 | Unassigned | | | | +--------------+-----------------------+------------+-----------+ | |||
+--------------+-----------------------+------------+---------------+ | ||||
Table 5 | Table 5 | |||
11. References | 9. References | |||
11.1. Normative References | 9.1. Normative References | |||
[RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities | [RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities | |||
Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996, | Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996, | |||
<https://www.rfc-editor.org/info/rfc1997>. | <https://www.rfc-editor.org/info/rfc1997>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
skipping to change at page 23, line 34 | skipping to change at line 1008 | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC8538] Patel, K., Fernando, R., Scudder, J., and J. Haas, | [RFC8538] Patel, K., Fernando, R., Scudder, J., and J. Haas, | |||
"Notification Message Support for BGP Graceful Restart", | "Notification Message Support for BGP Graceful Restart", | |||
RFC 8538, DOI 10.17487/RFC8538, March 2019, | RFC 8538, DOI 10.17487/RFC8538, March 2019, | |||
<https://www.rfc-editor.org/info/rfc8538>. | <https://www.rfc-editor.org/info/rfc8538>. | |||
11.2. Informative References | 9.2. Informative References | |||
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | |||
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | |||
2006, <https://www.rfc-editor.org/info/rfc4364>. | 2006, <https://www.rfc-editor.org/info/rfc4364>. | |||
[RFC4761] Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private | [RFC4761] Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private | |||
LAN Service (VPLS) Using BGP for Auto-Discovery and | LAN Service (VPLS) Using BGP for Auto-Discovery and | |||
Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007, | Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007, | |||
<https://www.rfc-editor.org/info/rfc4761>. | <https://www.rfc-editor.org/info/rfc4761>. | |||
[RFC4781] Rekhter, Y. and R. Aggarwal, "Graceful Restart Mechanism | [RFC4781] Rekhter, Y. and R. Aggarwal, "Graceful Restart Mechanism | |||
for BGP with MPLS", RFC 4781, DOI 10.17487/RFC4781, | for BGP with MPLS", RFC 4781, DOI 10.17487/RFC4781, | |||
January 2007, <https://www.rfc-editor.org/info/rfc4781>. | January 2007, <https://www.rfc-editor.org/info/rfc4781>. | |||
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection | [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection | |||
(BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010, | (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010, | |||
<https://www.rfc-editor.org/info/rfc5880>. | <https://www.rfc-editor.org/info/rfc5880>. | |||
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | ||||
Writing an IANA Considerations Section in RFCs", BCP 26, | ||||
RFC 8126, DOI 10.17487/RFC8126, June 2017, | ||||
<https://www.rfc-editor.org/info/rfc8126>. | ||||
[RFC8955] Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M. | [RFC8955] Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M. | |||
Bacher, "Dissemination of Flow Specification Rules", | Bacher, "Dissemination of Flow Specification Rules", | |||
RFC 8955, DOI 10.17487/RFC8955, December 2020, | RFC 8955, DOI 10.17487/RFC8955, December 2020, | |||
<https://www.rfc-editor.org/info/rfc8955>. | <https://www.rfc-editor.org/info/rfc8955>. | |||
Acknowledgements | ||||
We would like to thank Nabil Bitar, Martin Djernaes, Roberto | ||||
Fragassi, Jeffrey Haas, Jakob Heitz, Daniam Henriques, Nicolai | ||||
Leymann, Mike McBride, Paul Mattes, John Medamana, Pranav Mehta, Han | ||||
Nguyen, Saikat Ray, Valery Smyslov, and Bo Wu for their valuable | ||||
input and contributions to the discussion and solution. | ||||
Contributors | ||||
Clarence Filsfils | ||||
Cisco Systems | ||||
1150 Brussels | ||||
Belgium | ||||
Email: cf@cisco.com | ||||
Pradosh Mohapatra | ||||
Sproute Networks | ||||
Email: mpradosh@yahoo.com | ||||
Yakov Rekhter | ||||
Eric Rosen | ||||
Email: erosen52@gmail.com | ||||
Rob Shakir | ||||
Google, Inc. | ||||
1600 Amphitheatre Parkway | ||||
Mountain View, CA 94043 | ||||
United States of America | ||||
Email: robjs@google.com | ||||
Adam Simpson | ||||
Nokia | ||||
Email: adam.1.simpson@nokia.com | ||||
Authors' Addresses | Authors' Addresses | |||
James Uttaro | James Uttaro | |||
Independent Contributor | Independent Contributor | |||
Email: juttaro@ieee.org | Email: juttaro@ieee.org | |||
Enke Chen | Enke Chen | |||
Palo Alto Networks | Palo Alto Networks | |||
Email: enchen@paloaltonetworks.com | Email: enchen@paloaltonetworks.com | |||
End of changes. 140 change blocks. | ||||
563 lines changed or deleted | 591 lines changed or added | |||