rfc8405.original | rfc8405.txt | |||
---|---|---|---|---|
Network Working Group B. Decraene | Internet Engineering Task Force (IETF) B. Decraene | |||
Internet-Draft Orange | Request for Comments: 8405 Orange | |||
Intended status: Standards Track S. Litkowski | Category: Standards Track S. Litkowski | |||
Expires: September 20, 2018 Orange Business Service | ISSN: 2070-1721 Orange Business Service | |||
H. Gredler | H. Gredler | |||
RtBrick Inc | RtBrick Inc. | |||
A. Lindem | A. Lindem | |||
Cisco Systems | Cisco Systems | |||
P. Francois | P. Francois | |||
C. Bowers | C. Bowers | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
March 19, 2018 | June 2018 | |||
SPF Back-off Delay algorithm for link state IGPs | Shortest Path First (SPF) Back-Off Delay Algorithm for Link-State IGPs | |||
draft-ietf-rtgwg-backoff-algo-10 | ||||
Abstract | Abstract | |||
This document defines a standard algorithm to temporarily postpone or | This document defines a standard algorithm to temporarily postpone or | |||
'back-off' link-state IGP Shortest Path First (SPF) computations. | "back off" link-state IGP Shortest Path First (SPF) computations. | |||
This reduces the computational load and churn on IGP nodes when | This reduces the computational load and churn on IGP nodes when | |||
multiple temporally close network events trigger multiple SPF | multiple temporally close network events trigger multiple SPF | |||
computations. | computations. | |||
Having one standard algorithm improves interoperability by reducing | Having one standard algorithm improves interoperability by reducing | |||
the probability and/or duration of transient forwarding loops during | the probability and/or duration of transient forwarding loops during | |||
the IGP convergence when the IGP reacts to multiple temporally close | the IGP convergence when the IGP reacts to multiple temporally close | |||
IGP events. | IGP events. | |||
Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in | ||||
[BCP14] [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on September 20, 2018. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc8405. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
2. High level goals | 1.1. Requirements Language | |||
3. Definitions and parameters | 2. High-Level Goals | |||
4. Principles of SPF delay algorithm | 3. Definitions and Parameters | |||
5. Specification of the SPF delay state machine | 4. Principles of the SPF Delay Algorithm | |||
5. Specification of the SPF Delay State Machine | ||||
5.1. State Machine | 5.1. State Machine | |||
5.2. State | 5.2. State | |||
5.3. Timers | 5.3. Timers | |||
5.4. FSM Events | 5.4. FSM Events | |||
6. Parameters | 6. Parameters | |||
7. Partial Deployment | 7. Partial Deployment | |||
8. Impact on micro-loops | 8. Impact on Micro-loops | |||
9. IANA Considerations | 9. IANA Considerations | |||
10. Security considerations | 10. Security Considerations | |||
11. Acknowledgements | 11. References | |||
12. References | 11.1. Normative References | |||
12.1. Normative References | 11.2. Informative References | |||
12.2. Informative References | Acknowledgements | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
Link state IGPs, such as IS-IS [ISO10589-Second-Edition], OSPF | Link-state IGPs, such as IS-IS [ISO10589], OSPF [RFC2328], and OSPFv3 | |||
[RFC2328] and OSPFv3 [RFC5340], perform distributed route computation | [RFC5340], perform distributed route computation on all routers in | |||
on all routers in the area/level. In order to have consistent | the area/level. In order to have consistent routing tables across | |||
routing tables across the network, such distributed computation | the network, such distributed computation requires that all routers | |||
requires that all routers have the same version of the network | have the same version of the network topology (Link-State Database | |||
topology (Link State DataBase (LSDB)) and perform their computation | (LSDB)) and perform their computation essentially at the same time. | |||
essentially at the same time. | ||||
In general, when the network is stable, there is a desire to trigger | In general, when the network is stable, there is a desire to trigger | |||
a new Shortest Path First (SPF) computation as soon as a failure is | a new Shortest Path First (SPF) computation as soon as a failure is | |||
detected in order to quickly route around the failure. However, when | detected in order to quickly route around the failure. However, when | |||
the network is experiencing multiple failures over a short period of | the network is experiencing multiple failures over a short period of | |||
time, there is a conflicting desire to limit the frequency of SPF | time, there is a conflicting desire to limit the frequency of SPF | |||
computations, which would allow a reduction in control plane | computations, which would allow a reduction in control plane | |||
resources used by IGPs and all protocols/subsystems reacting on the | resources used by IGPs and all protocols/subsystems reacting on the | |||
attendant route change, such as LDP [RFC5036], RSVP-TE [RFC3209], BGP | attendant route change, such as LDP [RFC5036], RSVP-TE [RFC3209], BGP | |||
[RFC4271], Fast ReRoute computations (e.g., Loop Free Alternates | [RFC4271], Fast Reroute computations (e.g., Loop-Free Alternates | |||
(LFA) [RFC5286]), FIB updates, etc. This also reduces network churn | (LFAs) [RFC5286]), FIB updates, etc. This also reduces network churn | |||
and, in particular, reduces the side effects such as micro-loops | and, in particular, reduces side effects (such as micro-loops | |||
[RFC5715] that ensue during IGP convergence. | [RFC5715]) that ensue during IGP convergence. | |||
To allow for this, IGPs usually implement an SPF Back-off Delay | To allow for this, IGPs usually implement an SPF Back-Off Delay | |||
algorithm that postpones or backs-off the SPF computation. However, | algorithm that postpones or backs off the SPF computation. However, | |||
different implementations have chosen different algorithms. Hence, | different implementations chose different algorithms. Hence, in a | |||
in a multi-vendor network, it's not possible to ensure that all | multi-vendor network, it's not possible to ensure that all routers | |||
routers trigger their SPF computation after the same delay. This | trigger their SPF computation after the same delay. This situation | |||
situation increases the average and maximum differential delay | increases the average and maximum differential delay between routers | |||
between routers completing their SPF computation. It also increases | completing their SPF computation. It also increases the probability | |||
the probability that different routers compute their FIBs based on | that different routers compute their FIBs based on different LSDB | |||
different LSDB versions. Both factors increase the probability and/ | versions. Both factors increase the probability and/or duration of | |||
or duration of micro-loops as discussed in Section 8. | micro-loops as discussed in Section 8. | |||
To allow multi-vendor networks to have all routers delay their SPF | This document specifies a standard algorithm to allow multi-vendor | |||
computations for the same duration, this document specifies a | networks to have all routers delay their SPF computations for the | |||
standard algorithm. | same duration. | |||
2. High level goals | 1.1. Requirements Language | |||
The high level goals of this algorithm are the following: | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in | ||||
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
o Very fast convergence for a single event (e.g., link failure). | 2. High-Level Goals | |||
o Paced fast convergence for multiple temporally close IGP events | The high-level goals of this algorithm are the following: | |||
while IGP stability is considered acceptable. | ||||
o Delayed convergence when IGP stability is problematic. This will | o very fast convergence for a single event (e.g., link failure), | |||
o paced fast convergence for multiple temporally close IGP events | ||||
while IGP stability is considered acceptable, | ||||
o delayed convergence when IGP stability is problematic (this will | ||||
allow the IGP and related processes to conserve resources during | allow the IGP and related processes to conserve resources during | |||
the period of instability. | the period of instability), and | |||
o Always try to avoid different SPF_DELAY Section 3 timer values | o always try to avoid different SPF_DELAY (Section 3) timer values | |||
across different routers in the area/level. This requires | across different routers in the area/level. This requires | |||
specific consideration as different routers may receive IGP | specific consideration as different routers may receive IGP | |||
messages at different interval or even order, due to differences | messages at different intervals, or even orders, due to | |||
both in the distance from the originator of the IGP event and in | differences both in the distance from the originator of the IGP | |||
flooding implementations. | event and in flooding implementations. | |||
3. Definitions and parameters | 3. Definitions and Parameters | |||
IGP events: The reception or origination of an IGP LSDB change | IGP events: The reception or origination of an IGP LSDB change | |||
requiring a new routing table computation. Examples are a topology | requiring a new routing table computation. Some examples are a | |||
change, a prefix change and a metric change on a link or prefix. | topology change, a prefix change, and a metric change on a link or | |||
Note that locally triggering a routing table computation is not | prefix. Note that locally triggering a routing table computation is | |||
considered as an IGP event since other IGP routers are unaware of | not considered an IGP event since other IGP routers are unaware of | |||
this occurrence. | this occurrence. | |||
Routing table computation, in this document, is scoped to the IGP. | Routing table computation, in this document, is scoped to the IGP; | |||
So this is the computation of the IGP RIB, performed by the IGP, | so, this is the computation of the IGP RIB, performed by the IGP, | |||
using the IGP LSDB. No distinction is made between the type of | using the IGP LSDB. No distinction is made between the type of | |||
computation performed. e.g., full SPF, incremental SPF, Partial Route | computation performed, e.g., full SPF, incremental SPF, or Partial | |||
Computation (PRC): the type of computation is a local consideration. | Route Computation (PRC); the type of computation is a local | |||
This document may interchangeably use the terms routing table | consideration. This document may interchangeably use the terms | |||
computation and SPF computation. | "routing table computation" and "SPF computation". | |||
SPF_DELAY: The delay between the first IGP event triggering a new | SPF_DELAY: The delay between the first IGP event triggering a new | |||
routing table computation and the start of that routing table | routing table computation and the start of that routing table | |||
computation. It can take the following values: | computation. It can take the following values: | |||
INITIAL_SPF_DELAY: A very small delay to quickly handle a single | INITIAL_SPF_DELAY: A very small delay to quickly handle a single | |||
isolated link failure, e.g., 0 milliseconds. | isolated link failure, e.g., 0 milliseconds. | |||
SHORT_SPF_DELAY: A small delay to provide fast convergence in the | SHORT_SPF_DELAY: A small delay to provide fast convergence in the | |||
case of a single component failure (node, Shared Risk Link Group | case of a single component failure (such as the node failure or | |||
(SRLG)..) that leads to multiple IGP events, e.g., 50-100 | Shared Risk Link Group (SRLG) failure) that leads to multiple IGP | |||
milliseconds. | events, e.g., 50-100 milliseconds. | |||
LONG_SPF_DELAY: A long delay when the IGP is unstable, e.g., 2 | LONG_SPF_DELAY: A long delay when the IGP is unstable, e.g., 2 | |||
seconds. Note that this allows the IGP network to stabilize. | seconds. Note that this allows the IGP network to stabilize. | |||
TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed | TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed | |||
to learn all the IGP events related to a single component failure | to learn all the IGP events related to a single component failure | |||
(e.g., router failure, SRLG failure), e.g., 1 second. It's mostly | (such as router failure or SRLG failure), e.g., 1 second. It's | |||
dependent on failure detection time variation between all routers | mostly dependent on failure detection time variation between all | |||
that are adjacent to the failure. Additionally, it may depend on the | routers that are adjacent to the failure. Additionally, it may | |||
different IGP implementations/parameters across the network, related | depend on the different IGP implementations/parameters across the | |||
to origination and flooding of their link state advertisements. | network, related to origination and flooding of their link-state | |||
advertisements. | ||||
HOLDDOWN_INTERVAL: The time required with no received IGP events | HOLDDOWN_INTERVAL: The time required with no received IGP events | |||
before considering the IGP to be stable again and allowing the | before considering the IGP to be stable again and allowing the | |||
SPF_DELAY to be restored to INITIAL_SPF_DELAY. e.g. a | SPF_DELAY to be restored to INITIAL_SPF_DELAY, e.g., a | |||
HOLDDOWN_INTERVAL of 3 seconds. The HOLDDOWN_INTERVAL MUST be | HOLDDOWN_INTERVAL of 3 seconds. The HOLDDOWN_INTERVAL MUST be | |||
defaulted and configured to be longer than the | defaulted and configured to be longer than the | |||
TIME_TO_LEARN_INTERVAL. | TIME_TO_LEARN_INTERVAL. | |||
4. Principles of SPF delay algorithm | 4. Principles of the SPF Delay Algorithm | |||
For this first IGP event, we assume that there has been a single | For this first IGP event, we assume that there has been a single | |||
simple change in the network which can be taken into account using a | simple change in the network, which can be taken into account using a | |||
single routing computation (e.g., link failure, prefix (metric) | single routing computation (e.g., link failure, prefix (metric) | |||
change) and we optimize for very fast convergence, delaying the | change), and we optimize for very fast convergence, which delays the | |||
routing computation by INITIAL_SPF_DELAY. Under this assumption, | routing computation by INITIAL_SPF_DELAY. Under this assumption, | |||
there is no benefit in delaying the routing computation. In a | there is no benefit in delaying the routing computation. In a | |||
typical network, this is the most common type of IGP event. Hence, | typical network, this is the most common type of IGP event. Hence, | |||
it makes sense to optimize this case. | it makes sense to optimize this case. | |||
If subsequent IGP events are received in a short period of time | If subsequent IGP events are received in a short period of time | |||
(TIME_TO_LEARN_INTERVAL), we then assume that a single component | (TIME_TO_LEARN_INTERVAL), we then assume that a single component | |||
failed, but that this failure requires the knowledge of multiple IGP | failed, but that this failure requires the knowledge of multiple IGP | |||
events in order for IGP routing to converge. Under this assumption, | events in order for IGP routing to converge. Under this assumption, | |||
we want fast convergence since this is a normal network situation. | we want fast convergence since this is a normal network situation. | |||
However, there is a benefit in waiting for all IGP events related to | However, there is a benefit in waiting for all IGP events related to | |||
this single component failure so that the IGP can compute the post- | this single component failure so that the IGP can compute the post- | |||
failure routing table in a single additional route computation. In | failure routing table in a single additional route computation. In | |||
this situation, we delay the routing computation by SHORT_SPF_DELAY. | this situation, we delay the routing computation by SHORT_SPF_DELAY. | |||
If IGP events are still received after TIME_TO_LEARN_INTERVAL from | If IGP events are still received after TIME_TO_LEARN_INTERVAL from | |||
the initial IGP event received in QUIET state Section 5.1, then the | the initial IGP event received in QUIET state (see Section 5.1), then | |||
network is presumably experiencing multiple independent failures. In | the network is presumably experiencing multiple independent failures. | |||
this case, while waiting for network stability, the computations are | In this case, while waiting for network stability, the computations | |||
delayed for a longer time represented by LONG_SPF_DELAY. This SPF | are delayed for a longer time, which is represented by | |||
delay is kept until no IGP events are received for HOLDDOWN_INTERVAL. | LONG_SPF_DELAY. This SPF delay is kept until no IGP events are | |||
received for HOLDDOWN_INTERVAL. | ||||
Note that in order to increase the consistency network wide, the | Note that in order to increase the consistency network wide, the | |||
algorithm uses a delay (TIME_TO_LEARN_INTERVAL) from the initial IGP | algorithm uses a delay (TIME_TO_LEARN_INTERVAL) from the initial IGP | |||
event, rather than the number of SPF computation performed. Indeed, | event rather than the number of SPF computations performed. Indeed, | |||
as all routers may receive the IGP events at different times, we | as all routers may receive the IGP events at different times, we | |||
cannot assume that all routers will perform the same number of SPF | cannot assume that all routers will perform the same number of SPF | |||
computations. For example, assuming that the SPF delay is 50 ms, | computations. For example, assuming that the SPF delay is 50 | |||
router R1 may receive 3 IGP events (E1, E2, E3) in those 50 ms and | milliseconds, router R1 may receive three IGP events (E1, E2, E3) in | |||
hence will perform a single routing computation. While another | those 50 milliseconds and hence will perform a single routing | |||
router R2 may only receive 2 events (E1, E2) in those 50 ms and hence | computation, while another router R2 may only receive two events (E1, | |||
will schedule another routing computation when receiving E3. | E2) in those 50 milliseconds and hence will schedule another routing | |||
computation when receiving E3. | ||||
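
The R1/R2 example above can be reproduced with a small sketch. It is an illustration only, not part of the specification: the function name `count_spf_runs`, the arrival times, and the rule that an event arriving exactly at timer expiry starts a new computation are all assumptions made here.

```python
def count_spf_runs(arrival_times_ms, spf_delay_ms):
    """Count routing computations for one router under a fixed SPF delay.

    Hypothetical helper: the first event starts a timer of spf_delay_ms;
    events arriving while the timer runs are absorbed by that computation;
    an event arriving at or after expiry starts a new timer.
    """
    runs = 0
    timer_expiry = None
    for t in sorted(arrival_times_ms):
        if timer_expiry is None or t >= timer_expiry:
            runs += 1
            timer_expiry = t + spf_delay_ms
    return runs


# Assumed arrival times (ms) for events E1, E2, E3 at two routers.
r1_runs = count_spf_runs([0, 10, 40], spf_delay_ms=50)  # all three inside 50 ms
r2_runs = count_spf_runs([0, 20, 70], spf_delay_ms=50)  # E3 arrives after expiry
```

With these assumed arrival times, R1 performs one computation and R2 performs two, which is why counting SPF computations is not a reliable network-wide reference while a delay measured from the initial IGP event is.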
5. Specification of the SPF delay state machine | 5. Specification of the SPF Delay State Machine | |||
This section specifies the finite state machine (FSM) intended to | This section specifies the Finite State Machine (FSM) intended to | |||
control the timing of the execution of SPF calculations in response | control the timing of the execution of SPF calculations in response | |||
to IGP events. | to IGP events. | |||
5.1. State Machine | 5.1. State Machine | |||
The FSM is initialized to the QUIET state with all three timers | The FSM is initialized to the QUIET state with all three timers | |||
timers (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER) deactivated. | (SPF_TIMER, HOLDDOWN_TIMER, and LEARN_TIMER) deactivated. | |||
The events which may change the FSM states are an IGP event or the | The events that may change the FSM states are an IGP event or the | |||
expiration of one timer (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER). | expiration of one timer (SPF_TIMER, HOLDDOWN_TIMER, or LEARN_TIMER). | |||
The following diagram briefly describes the state transitions. | The following diagram briefly describes the state transitions. | |||
+-------------------+ | +-------------------+ | |||
+---->| |<-------------------+ | +---->| |<-------------------+ | |||
| | QUIET | | | | | QUIET | | | |||
+-----| |<---------+ | | +-----| |<---------+ | | |||
7: +-------------------+ | | | 7: +-------------------+ | | | |||
SPF_TIMER | | | | SPF_TIMER | | | | |||
expiration | | | | expiration | | | | |||
| 1: IGP event | | | | 1: IGP event | | | |||
| | | | | | | | |||
v | | | v | | | |||
+-------------------+ | | | +-------------------+ | | | |||
+---->| | | | | +---->| | | | | |||
| | SHORT_WAIT |----->----+ | | | | SHORT_WAIT |----->----+ | | |||
+-----| | | | +-----| | | | |||
2: +-------------------+ 6: HOLDDOWN_TIMER | | 2: +-------------------+ 6: HOLDDOWN_TIMER | | |||
IGP event | expiration | | IGP event | expiration | | |||
8: SPF_TIMER | | | 8: SPF_TIMER | | | |||
expiration | | | expiration | | | |||
| 3: LEARN_TIMER | | | 3: LEARN_TIMER | | |||
| expiration | | | expiration | | |||
| | | | | | |||
v | | v | | |||
+-------------------+ | | +-------------------+ | | |||
+---->| | | | +---->| | | | |||
| | LONG_WAIT |------------>-------+ | | | LONG_WAIT |------------>-------+ | |||
+-----| | | +-----| | | |||
4: +-------------------+ 5: HOLDDOWN_TIMER | 4: +-------------------+ 5: HOLDDOWN_TIMER | |||
IGP event expiration | IGP event expiration | |||
9: SPF_TIMER expiration | 9: SPF_TIMER expiration | |||
Figure 1: State Machine | Figure 1: State Machine | |||
5.2. State | 5.2. State | |||
The naming and semantics of each state corresponds directly to the | The naming and semantics of each state corresponds directly to the | |||
SPF delay used for IGP events received in that state. Three states | SPF delay used for IGP events received in that state. Three states | |||
are defined: | are defined: | |||
QUIET: This is the initial state, when no IGP events have occurred | QUIET: This is the initial state, when no IGP events have occurred | |||
for at least HOLDDOWN_INTERVAL since the previous routing table | for at least HOLDDOWN_INTERVAL since the previous routing table | |||
computation. The state is meant to handle link failures very | computation. The state is meant to handle link failures very | |||
quickly. | quickly. | |||
SHORT_WAIT: State entered when an IGP event has been received in | SHORT_WAIT: This is the state entered when an IGP event has been | |||
QUIET state. This state is meant to handle single component failure | received in QUIET state. This state is meant to handle single | |||
requiring multiple IGP events (e.g., node, SRLG). | component failure requiring multiple IGP events (e.g., node, SRLG). | |||
LONG_WAIT: State reached after TIME_TO_LEARN_INTERVAL. In other | LONG_WAIT: This is the state reached after TIME_TO_LEARN_INTERVAL. | |||
words, state reached after TIME_TO_LEARN_INTERVAL in state | In other words, this is the state reached after | |||
SHORT_WAIT. This state is meant to handle multiple independent | TIME_TO_LEARN_INTERVAL in state SHORT_WAIT. This state is meant to | |||
component failures during periods of IGP instability. | handle multiple independent component failures during periods of IGP | |||
instability. | ||||
5.3. Timers | 5.3. Timers | |||
SPF_TIMER: The FSM timer that uses the computed SPF delay. Upon | SPF_TIMER: This is the FSM timer that uses the computed SPF delay. | |||
expiration, the Route Table Computation (as defined in Section 3) is | Upon expiration, the routing table computation (as defined in | |||
performed. | Section 3) is performed. | |||
HOLDDOWN_TIMER: The FSM timer that is (re)started whan an IGP event | HOLDDOWN_TIMER: This is the FSM timer that is (re)started when an IGP | |||
is received and set to HOLDDOWN_INTERVAL. Upon expiration, the FSM | event is received and set to HOLDDOWN_INTERVAL. Upon expiration, the | |||
is moved to the QUIET state. | FSM is moved to the QUIET state. | |||
LEARN_TIMER: The FSM timer that is started when an IGP event is | LEARN_TIMER: This is the FSM timer that is started when an IGP event | |||
recevied while the FSM is in the QUIET state. Upon expiration, the | is received while the FSM is in the QUIET state. Upon expiration, | |||
FSM is moved to the LONG_WAIT state. | the FSM is moved to the LONG_WAIT state. | |||
5.4. FSM Events | 5.4. FSM Events | |||
This section describes the events and the actions performed in | This section describes the events and the actions performed in | |||
response. | response. | |||
Transition 1: IGP event, while in QUIET state. | Transition 1: IGP event while in QUIET state | |||
Actions on event 1: | Actions on event 1: | |||
o If SPF_TIMER is not already running, start it with value | o If SPF_TIMER is not already running, start it with value | |||
INITIAL_SPF_DELAY. | INITIAL_SPF_DELAY. | |||
o Start LEARN_TIMER with TIME_TO_LEARN_INTERVAL. | o Start LEARN_TIMER with TIME_TO_LEARN_INTERVAL. | |||
o Start HOLDDOWN_TIMER with HOLDDOWN_INTERVAL. | o Start HOLDDOWN_TIMER with HOLDDOWN_INTERVAL. | |||
o Transition to SHORT_WAIT state. | o Transition to SHORT_WAIT state. | |||
Transition 2: IGP event, while in SHORT_WAIT. | Transition 2: IGP event while in SHORT_WAIT | |||
Actions on event 2: | Actions on event 2: | |||
o Reset HOLDDOWN_TIMER to HOLDDOWN_INTERVAL. | o Reset HOLDDOWN_TIMER to HOLDDOWN_INTERVAL. | |||
o If SPF_TIMER is not already running, start it with value | o If SPF_TIMER is not already running, start it with value | |||
SHORT_SPF_DELAY. | SHORT_SPF_DELAY. | |||
o Remain in current state. | o Remain in current state. | |||
Transition 3: LEARN_TIMER expiration. | Transition 3: LEARN_TIMER expiration | |||
Actions on event 3: | Actions on event 3: | |||
o Transition to LONG_WAIT state. | o Transition to LONG_WAIT state. | |||
Transition 4: IGP event, while in LONG_WAIT. | Transition 4: IGP event while in LONG_WAIT | |||
Actions on event 4: | Actions on event 4: | |||
o Reset HOLDDOWN_TIMER to HOLDDOWN_INTERVAL. | o Reset HOLDDOWN_TIMER to HOLDDOWN_INTERVAL. | |||
o If SPF_TIMER is not already running, start it with value | o If SPF_TIMER is not already running, start it with value | |||
LONG_SPF_DELAY. | LONG_SPF_DELAY. | |||
o Remain in current state. | o Remain in current state. | |||
Transition 5: HOLDDOWN_TIMER expiration, while in LONG_WAIT. | Transition 5: HOLDDOWN_TIMER expiration while in LONG_WAIT | |||
Actions on event 5: | Actions on event 5: | |||
o Transition to QUIET state. | o Transition to QUIET state. | |||
Transition 6: HOLDDOWN_TIMER expiration, while in SHORT_WAIT. | Transition 6: HOLDDOWN_TIMER expiration while in SHORT_WAIT | |||
Actions on event 6: | Actions on event 6: | |||
o Deactivate LEARN_TIMER. | o Deactivate LEARN_TIMER. | |||
o Transition to QUIET state. | o Transition to QUIET state. | |||
Transition 7: SPF_TIMER expiration, while in QUIET. | Transition 7: SPF_TIMER expiration while in QUIET | |||
Actions on event 7: | Actions on event 7: | |||
o Compute SPF. | o Compute SPF. | |||
o Remain in current state. | o Remain in current state. | |||
Transition 8: SPF_TIMER expiration, while in SHORT_WAIT. | Transition 8: SPF_TIMER expiration while in SHORT_WAIT | |||
Actions on event 8: | Actions on event 8: | |||
o Compute SPF. | o Compute SPF. | |||
o Remain in current state. | o Remain in current state. | |||
Transition 9: SPF_TIMER expiration, while in LONG_WAIT. | Transition 9: SPF_TIMER expiration while in LONG_WAIT | |||
Actions on event 9: | Actions on event 9: | |||
o Compute SPF. | o Compute SPF. | |||
o Remain in current state. | o Remain in current state. | |||
6. Parameters | 6. Parameters | |||
All the parameters MUST be configurable at the protocol instance | All the parameters MUST be configurable at the protocol instance | |||
granularity. They MAY be configurable at the area/level granularity. | granularity. They MAY be configurable at the area/level granularity. | |||
All the delays (INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY, | All the delays (INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY, | |||
TIME_TO_LEARN_INTERVAL, HOLDDOWN_INTERVAL) SHOULD be configurable at | TIME_TO_LEARN_INTERVAL, and HOLDDOWN_INTERVAL) SHOULD be configurable | |||
the millisecond granularity. They MUST be configurable at least at | at the millisecond granularity. They MUST be configurable at least | |||
the tenth of second granularity. The configurable range for all the | at the tenth of a second granularity. The configurable range for all | |||
parameters SHOULD at least be from 0 milliseconds to 60 seconds. The | the parameters SHOULD at least be from 0 milliseconds to 60 seconds. | |||
HOLDDOWN_INTERVAL MUST be defaulted or configured to be longer than | The HOLDDOWN_INTERVAL MUST be defaulted or configured to be longer | |||
the TIME_TO_LEARN_INTERVAL. | than the TIME_TO_LEARN_INTERVAL. | |||
If this SPF backoff algorithm is enabled by default, then in order to | If this SPF Back-Off algorithm is enabled by default, then in order | |||
have consistent SPF delays between implementations with default | to have consistent SPF delays between implementations with default | |||
configuration, the following default values SHOULD be implemented: | configuration, the following default values SHOULD be implemented: | |||
INITIAL_SPF_DELAY 50 ms, SHORT_SPF_DELAY 200ms, LONG_SPF_DELAY: 5 | ||||
000ms, TIME_TO_LEARN_INTERVAL 500ms, HOLDDOWN_INTERVAL 10 000ms. | INITIAL_SPF_DELAY 50 ms | |||
SHORT_SPF_DELAY 200 ms | ||||
LONG_SPF_DELAY 5000 ms | ||||
TIME_TO_LEARN_INTERVAL 500 ms | ||||
HOLDDOWN_INTERVAL 10000 ms | ||||
In order to satisfy the goals stated in Section 2, operators are | In order to satisfy the goals stated in Section 2, operators are | |||
RECOMMENDED to configure delay intervals such that INITIAL_SPF_DELAY | RECOMMENDED to configure delay intervals such that INITIAL_SPF_DELAY | |||
<= SHORT_SPF_DELAY and SHORT_SPF_DELAY <= LONG_SPF_DELAY. | <= SHORT_SPF_DELAY and SHORT_SPF_DELAY <= LONG_SPF_DELAY. | |||
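The constraints on the parameters can be expressed as a small configuration check. The following is a hedged sketch under stated assumptions: the names SpfDelayConfig and check are hypothetical, values are integer milliseconds, the MUST on HOLDDOWN_INTERVAL is reported as an error, and the RECOMMENDED ordering of the delays is reported only as a warning. Rejecting negative values is an assumption of this sketch rather than a requirement stated in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpfDelayConfig:
    # Defaults are the values listed in Section 6, in milliseconds.
    initial_spf_delay_ms: int = 50
    short_spf_delay_ms: int = 200
    long_spf_delay_ms: int = 5000
    time_to_learn_interval_ms: int = 500
    holddown_interval_ms: int = 10000


def check(cfg: SpfDelayConfig) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for a hypothetical configuration."""
    errors: List[str] = []
    warnings: List[str] = []

    # Assumption of this sketch: negative delays are never meaningful.
    for name, value in vars(cfg).items():
        if value < 0:
            errors.append(f"{name} must not be negative")

    # MUST: HOLDDOWN_INTERVAL longer than TIME_TO_LEARN_INTERVAL.
    if cfg.holddown_interval_ms <= cfg.time_to_learn_interval_ms:
        errors.append(
            "HOLDDOWN_INTERVAL must be longer than TIME_TO_LEARN_INTERVAL")

    # RECOMMENDED: INITIAL_SPF_DELAY <= SHORT_SPF_DELAY <= LONG_SPF_DELAY.
    if cfg.initial_spf_delay_ms > cfg.short_spf_delay_ms:
        warnings.append("INITIAL_SPF_DELAY exceeds SHORT_SPF_DELAY")
    if cfg.short_spf_delay_ms > cfg.long_spf_delay_ms:
        warnings.append("SHORT_SPF_DELAY exceeds LONG_SPF_DELAY")

    return errors, warnings


print(check(SpfDelayConfig()))  # ([], [])
```

Separating errors from warnings mirrors the difference between the MUST and RECOMMENDED levels used in the text; an implementation could equally refuse both.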
When setting (default) values, one should consider the customers and | When setting (default) values, one should consider the customers and | |||
their application requirements, the computational power of the | their application requirements, the computational power of the | |||
routers, the size of the network, and, in particular, the number of | routers, the size of the network, and, in particular, the number of | |||
IP prefixes advertised in the IGP, the frequency and number of IGP | IP prefixes advertised in the IGP, the frequency and number of IGP | |||
events, the number of protocols reactions/computations triggered by | events, and the number of protocol reactions/computations triggered | |||
IGP SPF computation (e.g., BGP, PCEP, Traffic Engineering CSPF, Fast | by IGP SPF computation (e.g., BGP, Path Computation Element | |||
ReRoute computations). Note that some or all of these factors may | Communication Protocol (PCEP), Traffic Engineering Constrained SPF | |||
change over the life of the network. In case of doubt, it's | (CSPF), and Fast Reroute computations). Note that some or all of | |||
RECOMMENDED that timer intervals should be chosen conservatively | these factors may change over the life of the network. In case of | |||
(i.e., longer timer values). | doubt, it's RECOMMENDED that timer intervals should be chosen | |||
conservatively (i.e., longer timer values). | ||||
For the standard algorithm to be effective in mitigating micro-loops, | For the standard algorithm to be effective in mitigating micro-loops, | |||
it is RECOMMENDED that all routers in the IGP domain, or at least all | it is RECOMMENDED that all routers in the IGP domain, or at least all | |||
the routers in the same area/level, have exactly the same configured | the routers in the same area/level, have exactly the same configured | |||
values. | values. | |||
7. Partial Deployment | 7. Partial Deployment | |||
In general, the SPF Back-off Delay algorithm is only effective in | In general, the SPF Back-Off Delay algorithm is only effective in | |||
mitigating micro-loops if it is deployed, with the same parameters, | mitigating micro-loops if it is deployed with the same parameters on | |||
on all routers in the IGP domain or, at least, all routers in an IGP | all routers in the IGP domain or, at least, all routers in an IGP | |||
area/level. The impact of partial deployment is dependent on the | area/level. The impact of partial deployment is dependent on the | |||
particular event, topology, and the algorithm(s) used on other | particular event, the topology, and the algorithm(s) used on other | |||
routers in the IGP area/level. In cases where the previous SPF Back- | routers in the IGP area/level. In cases where the previous SPF Back- | |||
off Delay algorithm was implemented uniformly, partial deployment | Off Delay algorithm was implemented uniformly, partial deployment | |||
will increase the frequency and duration of micro-loops. Hence, it | will increase the frequency and duration of micro-loops. Hence, it | |||
is RECOMMENDED that all routers in the IGP domain or at least within | is RECOMMENDED that all routers in the IGP domain, or at least within | |||
the same area/level be migrated to the SPF algorithm described herein | the same area/level, be migrated to the SPF algorithm described | |||
at roughly the same time. | herein at roughly the same time. | |||
Note that this is not a new consideration as over times, network | Note that this is not a new consideration; over time, network | |||
operators have changed SPF delay parameters in order to accommodate | operators have changed SPF delay parameters in order to accommodate | |||
new customer requirements for fast convergence, as permitted by new | new customer requirements for fast convergence, as permitted by new | |||
software and hardware. They may also have progressively replaced an | software and hardware. They may also have progressively replaced an | |||
implementation with a given SPF Back-off Delay algorithm by another | implementation with a given SPF Back-Off Delay algorithm by another | |||
implementation with a different one. | implementation with a different one. | |||
8. Impact on micro-loops | 8. Impact on Micro-loops | |||
Micro-loops during IGP convergence are due to a non-synchronized or | Micro-loops during IGP convergence are due to a non-synchronized or | |||
non-ordered update of the forwarding information tables (FIB) | non-ordered update of FIBs [RFC5715] [RFC6976] [SPF-MICRO]. FIBs are | |||
[RFC5715] [RFC6976] [I-D.ietf-rtgwg-spf-uloop-pb-statement]. FIBs | installed after multiple steps, such as flooding of the IGP event | |||
are installed after multiple steps such as flooding of the IGP event | ||||
across the network, SPF wait time, SPF computation, FIB distribution | across the network, SPF wait time, SPF computation, FIB distribution | |||
across line cards, and FIB update. This document only addresses the | across line cards, and FIB update. This document only addresses the | |||
contribution from the SPF wait time. This standardized procedure | contribution from the SPF wait time. This standardized procedure | |||
reduces the probability and/or duration of micro-loops when IGPs | reduces the probability and/or duration of micro-loops when IGPs | |||
experience multiple temporally close events. It does not prevent all | experience multiple temporally close events. It does not prevent all | |||
micro-loops. However, it is beneficial and is less complex and | micro-loops; however, it is beneficial and is less complex and costly | |||
costly to implement when compared to full solutions such as [RFC5715] | to implement when compared to full solutions such as [RFC5715] or | |||
or [RFC6976]. | [RFC6976]. | |||
9. IANA Considerations | 9. IANA Considerations | |||
No IANA actions required. | This document has no IANA actions. | |||
10. Security considerations | 10. Security Considerations | |||
The algorithm presented in this document does not compromise IGP | The algorithm presented in this document does not compromise IGP | |||
security. An attacker having the ability to generate IGP events | security. An attacker having the ability to generate IGP events | |||
would be able to delay the IGP convergence time. The LONG_SPF_DELAY | would be able to delay the IGP convergence time. The LONG_SPF_DELAY | |||
state may help mitigate the effects of Denial-of-Service (DOS) | state may help mitigate the effects of Denial-of-Service (DoS) | |||
attacks generating many IGP events. | attacks generating many IGP events. | |||
11. Acknowledgements | 11. References | |||
We would like to acknowledge Les Ginsberg, Uma Chunduri, Mike Shand | ||||
and Alexander Vainshtein for the discussions and comments related to | ||||
this document. | ||||
12. References | ||||
12.1. Normative References | 11.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
12.2. Informative References | 11.2. Informative References | |||
[I-D.ietf-rtgwg-spf-uloop-pb-statement] | ||||
Litkowski, S., Decraene, B., and M. Horneffer, "Link State | ||||
protocols SPF trigger and delay algorithm impact on IGP | ||||
micro-loops", draft-ietf-rtgwg-spf-uloop-pb-statement-06 | ||||
(work in progress), January 2018. | ||||
[ISO10589-Second-Edition] | [ISO10589] | |||
International Organization for Standardization, | International Organization for Standardization, | |||
"Intermediate system to Intermediate system intra-domain | "Information technology -- Telecommunications and | |||
routeing information exchange protocol for use in | information exchange between systems -- Intermediate | |||
conjunction with the protocol for providing the | System to Intermediate System intra-domain routeing | |||
connectionless-mode Network Service (ISO 8473)", ISO/ | information exchange protocol for use in conjunction with | |||
IEC 10589:2002, Second Edition, Nov 2002. | the protocol for providing the connectionless-mode network | |||
service (ISO 8473)", ISO/IEC 10589:2002, Second Edition, | ||||
November 2002. | ||||
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | |||
DOI 10.17487/RFC2328, April 1998, | DOI 10.17487/RFC2328, April 1998, | |||
<https://www.rfc-editor.org/info/rfc2328>. | <https://www.rfc-editor.org/info/rfc2328>. | |||
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., | [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., | |||
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP | and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP | |||
Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, | Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, | |||
<https://www.rfc-editor.org/info/rfc3209>. | <https://www.rfc-editor.org/info/rfc3209>. | |||
skipping to change at page 13, line 38 | skipping to change at page 13, line 5 | |||
[RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free | [RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free | |||
Convergence", RFC 5715, DOI 10.17487/RFC5715, January | Convergence", RFC 5715, DOI 10.17487/RFC5715, January | |||
2010, <https://www.rfc-editor.org/info/rfc5715>. | 2010, <https://www.rfc-editor.org/info/rfc5715>. | |||
[RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C., | [RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C., | |||
Francois, P., and O. Bonaventure, "Framework for Loop-Free | Francois, P., and O. Bonaventure, "Framework for Loop-Free | |||
Convergence Using the Ordered Forwarding Information Base | Convergence Using the Ordered Forwarding Information Base | |||
(oFIB) Approach", RFC 6976, DOI 10.17487/RFC6976, July | (oFIB) Approach", RFC 6976, DOI 10.17487/RFC6976, July | |||
2013, <https://www.rfc-editor.org/info/rfc6976>. | 2013, <https://www.rfc-editor.org/info/rfc6976>. | |||
[SPF-MICRO] | ||||
Litkowski, S., Decraene, B., and M. Horneffer, "Link State | ||||
protocols SPF trigger and delay algorithm impact on IGP | ||||
micro-loops", Work in Progress, draft-ietf-rtgwg-spf- | ||||
uloop-pb-statement-07, May 2018. | ||||
Acknowledgements | ||||
We would like to acknowledge Les Ginsberg, Uma Chunduri, Mike Shand, | ||||
and Alexander Vainshtein for the discussions and comments related to | ||||
this document. | ||||
Authors' Addresses | Authors' Addresses | |||
Bruno Decraene | Bruno Decraene | |||
Orange | Orange | |||
Email: bruno.decraene@orange.com | Email: bruno.decraene@orange.com | |||
Stephane Litkowski | Stephane Litkowski | |||
Orange Business Service | Orange Business Service | |||
Email: stephane.litkowski@orange.com | Email: stephane.litkowski@orange.com | |||
Hannes Gredler | Hannes Gredler | |||
RtBrick Inc | RtBrick Inc. | |||
Email: hannes@rtbrick.com | Email: hannes@rtbrick.com | |||
Acee Lindem | Acee Lindem | |||
Cisco Systems | Cisco Systems | |||
301 Midenhall Way | 301 Midenhall Way | |||
Cary, NC 27513 | Cary, NC 27513 | |||
USA | United States of America | |||
Email: acee@cisco.com | Email: acee@cisco.com | |||
Pierre Francois | Pierre Francois | |||
Email: pfrpfr@gmail.com | Email: pfrpfr@gmail.com | |||
Chris Bowers | Chris Bowers | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
1194 N. Mathilda Ave. | 1194 N. Mathilda Ave. | |||
Sunnyvale, CA 94089 | Sunnyvale, CA 94089 | |||
US | United States of America | |||
Email: cbowers@juniper.net | Email: cbowers@juniper.net | |||
End of changes. 83 change blocks. | ||||
233 lines changed or deleted | 239 lines changed or added | |||