PANET Working Group Shankar Raman INTERNET-DRAFT Balaji Venkat Venkataswami Intended Status: Experimental RFC Kamakoti Veezhinathan Gaurav Raina Expires: November 5, 2013 IIT Madras May 4, 2013 TCAM power reduction and optimization in Routers draft-mjsraman-panet-tcam-power-efficiency-02 Abstract This specification entails enabling power and performance management in Routers (multi-chassis and single chassis) with respect to TCAM and SRAM usage on intelligent router line cards that implement Virtual Aggregation of routes. The version 00 of this document had several errata that have been addressed in this version. This scheme relates to unicast alone since multicast entries are programmed only after consultation with the unicast entries in the software plane in the Route processor. Hence this scheme does not apply to multicast. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Shankar Raman et.al Expires November 5, 2013 [Page 1] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Methodology . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1 Packet flow and its consequences . . . . . . . . . . . . . . 6 2.2 Switching off the TCAM banks which are unused . . . . . . . 7 2.2.1 Algorithm for switching off and switching on TCAM banks . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2.2 Prefix entry population latency . . . . . . . . . . . . 10 2.3 Using Aggregate routes . . . . . . . . . . . . . . . . . . . 10 2.4 Route Processor as the DLC/BDLC . . . . . . . . . . . . . . 10 2.5 Advantages of this scheme . . . . . . . . . . . . . . . . . 10 3 Security Considerations . . . . . . . . . . . . . . . . . . . . 11 4 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 11 5 References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 5.1 Normative References . . . . . . . . . . . . . . . . . . . 11 5.2 Informative References . . . . . . . . . . . . . . . . . . 11 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 11 Shankar Raman et.al Expires November 5, 2013 [Page 2] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 1 Introduction This specification / draft entails enabling power and performance management in Routers (multi-chassis and single chassis) with respect to TCAM / SRAM usage on intelligent router line cards that implement Virtual Aggregation. Distributed line cards in a routers have intelligence to forward packet without interference from the central control plane, once their TCAMs and associated SRAMs are populated. Since memory and TCAM are the most power hungry components in these line cards, we should effectively manage the line card power usage by optimally using their SRAM memory and TCAM banks. When a packet enters a distributed line card, the packet switching logic extracts information from the header. It then looks up the entry in the appropriate TCAM (CAM in L2 switches or both in Multi- layer switches), and then passes the packet to the outgoing line card. The packet is then placed onto the outgoing port. The TCAM banks used in the line cards typically carry the entire forwarding table as tabulated by the central control processor card/cards (when in plurality used for redundancy). All line cards do not need the entire forwarding table as each line card may serve a set of sources and destinations. This is true especially in smaller networks but may also apply to routers in the Internet, especially ASBRs and POP border routers facing the customers of the ISP. Switching on the entire set of TCAMs (in the worst case) and downloading the entire forwarding table atop each of the line cards in the chassis leads to a sub-optimal way of switching packets Current status quo on this (apart from VPN route localization) is a proof that as far as Internet destinations go, all line cards on the backbone routers or even within a small campus or a medium sized ISP, carry the entire forwarding table built by the routing table manager on these routers. In order to obviate the necessity of carrying routes which are unused (by this term. we mean those that are unreferenced for making forwarding decisions) and to reduce TCAM space (applicable to switching as well which involves CAMs), we suggest a solution where a couple of Linecards are considered to be Designated and Backup Designated linecards. Designated Linecards are filled with all the entries from the control plane. The rest of the Linecards are provided with entries leftover from aging other entries which were not used over a period of time. An entry may not be available in these Linecards after they have been aged out (because of not being referenced and/or modified), these non- DLC linecards (as they are called), refer to the DLC/BDLC linecards for the first packet of a flow and retrieve the prefixes Shankar Raman et.al Expires November 5, 2013 [Page 3] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 associated with the hit on the DLC/BDLC. They populate these retrieved entries in their respective TCAM banks. Periodically on the non-DLC linecards a Route aggregation similar to the S-VA or the simple Virtual Aggregate with exception entries is generated to further reduce the TCAM occupation. The TCAM banks which are not occupied as a result of this scheme may be switched off. This in turn can switch off their associated SRAM banks as well which contain the rewrite information. The document also opens up the specification to simulations which help strengthen the argument for this scheme. 1.1 Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. 2. Methodology Distributed line cards in a routers have intelligence to forward packet without interference from the central control plane, once their TCAMs and associated SRAMs are populated. Since memory and TCAM are the most power hungry components in these line cards, we should effectively manage the line card power usage by optimally using their SRAM memory and TCAM banks. When a packet enters a distributedline card, the packet switching logic extracts information from the header. It then looks up the entry in the appropriate TCAM (CAM in L2 switches or both in Multi- layer switches), and then passes the packet to the outgoing line card. It is then placed onto the outgoing port. The TCAM banks used in the line cards typically carry the entire forwarding table as tabulated by the central control processor card/cards (when in plurality used for redundancy). All line cards do not need the entire forwarding table as each line card may serve a set of sources and destinations. This is true especially in smaller networks but may also apply to routers in the internet, especially ASBRs and POP border routers facing the customers of the ISP. Switching on the entire set of TCAMs (in the worst case) and downloading the entire forwarding table atop each of the line cards in the chassis leads to a sub-optimal way of switching packets. Current status quo on this (apart from VPN route localization) is a proof that as far as internet destinations go, all line cards on the backbone routers or even within a small campus or a medium sized ISP, carry the entire forwarding table built by the routing table manager on these routers. In order to obviate the necessity of Shankar Raman et.al Expires November 5, 2013 [Page 4] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 carrying routes which are unused (by this term. we mean those that are unreferenced for making forwarding decisions) and to reduce TCAM space (applicable to switching as well which involves CAMs), we suggest the following solution: a) At initialization time the entire forwarding table that is constructed in the control processor is downloaded to all line cards. This is the current implementation in most routers today. b) Begin a clock algorithm on these routes using a referenced bit and modified bit that is appended to each TCAM entry in the line card. a. When a TCAM entry is accessed for forwarding decision in the router on a specific line card, the referenced bit associated with that entry is set. b. When a TCAM entry is modified as a result of a routing entry being modified either with respect to its next-hop or any other reason the modified bit is set. c. A timer called the ReferenceTimer is started with a suitable interval. d. When the ReferenceTimer clock ticks down to 0, the TCAM entries in the line card are scanned and those that have the following values are not disturbed. 1. Referenced bit = 1, Modified bit = 1 2. Referenced bit = 0, Modified bit = 1 e. A timer called ModifiedDecayTimer is also started using a suitable time interval. f. The ModifiedDecayTimer is a single global timer which is started whenever a route change that modifies the nexthop to a set of route entries is downloaded to the line card (which is the status quo as of now). Each line card has its own respective ModifiedDecayTimer. g. On expiry of the ModifiedDecayTimer the TCAM entries with modified bit = 1 are set to modified bit = 0. h. Similarly each TCAM entry has a 32 bit counter that counts down to 0, which is in effect a ReferencedDecayTimer. A suitable value may be appended to this counter at the time of initialization and when the TCAM entry is referenced for forwarding. This counter is active only for the active banks of TCAM. A suitable alternative with or without the counter would be an LRU policy that removes and adds entries as appropriate. Shankar Raman et.al Expires November 5, 2013 [Page 5] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 i. When the ReferencedDecayTimer counts down to 0, the referenced bit for that TCAM entry is cleared. j. Continuing from (d) the TCAM entries with Modified bit = 0 and Referenced bit = 0 are removed from the TCAM when the ReferencedTimer expires. Further optimization on the TCAM may be done at regular intervals whenever the ReferencedTimer expires. This would involve re-arranging the TCAM to optimize on storage. The specific method relates to understanding the spread of TCAM entries with different mask lengths. Usually the TCAM entries are arranged in descending order of mask lengths with enough gaps to accommodate for different prefixes in their respective mask lengths. The algorithms in current software for populating the TCAM entries with enough gaps to accommodate for new additions are many. These however are out of scope of this document. Such algorithms allow for a fair amount of spread of the TCAM entries across the TCAM space thus allowing for several banks of TCAMs to be left empty if not used. 2.1 Packet flow and its consequences We will consider the initial condition on how these entries get populated. If a packet enters a line card on one of its ports and the switching logic finds that the TCAM entry is absent or has been cleared from its entry in the TCAM. We use a mechanism by which such a packet is first forwarded to one of the DLC (or designated Line cards which may be elected from the set of line cards in the chassis) where the entire forwarding table is always stored. There may be a Primary DLC and a backup DLC for redundancy purposes. When the packet hits the ingress line card which does not have the required TCAM entry for forwarding, it routes the packet within the chassis to the DLC. The DLC receives the packet or just the packet header and does the following: i) Looks up its full forwarding table, ii) Picks up the result and along with it the TCAM entry or entries which accumulate to all the set of related routes (perhaps all the routes with the same prefix that is rounded off to the nearest major class network boundary) and iii) sends them across to the ingress line card in question. The ingress line card uses the result to forward the packet and populates the TCAM with the group of entries dispatched to it by the DLC or the Backup DLC. One could use per-destination load-balancing within the ingress line card to distribute the lookup in such a way that the load is effectively shared by picking the DLC OR the Backup DLC. Once the set of routes sent back from the DLC is accumulated Shankar Raman et.al Expires November 5, 2013 [Page 6] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 and loaded into the TCAM, it may be periodically scanned and compressed even at a later point in time using the S-VA mechanism as explained in later sections. Now the ingress line card has the required entries. It is thus made possible that the flows that go to the desired destinations from then on would hit the TCAM entries that are populated according to the method discussed above. Thus, long lived TCP / UDP flows would have TCAM entries populated for the duration of their existence in the respective scheme-deployed linecards. Smaller timed flows too would have entries populated but this could be throttled by the timer mechanisms that have been provided. With respect to multicast routes, all multicast routing entries are programmed in the hardware after consultation with their respective unicast companions in the Route Processor. So the lookup for programming the (*,G) and (S,G) entries in the hardware are done in the software plane in the Route Processor. Hence the multicast entries are not affected by this scheme. This specification relates to unicast routing alone and TCAM entries that relate to it. Additionally a set threshold called QueryThreshold is configured on the DLC slots. If this is exceeded in terms of rate of packets coming to DLCs from other linecards in the form of the first packet query for a TCAM entry lookup, a periodic full download of the entire forwarding table is done to the those linecards which deploy this scheme. A further purge is then done using the Timers mentioned above. It is also possible to think of deploying this scheme selectively on a set of line cards. In this method, the rest of the line cards can act as TCAM entry route servers. Such an arrangement can help in load balancing the packets involved in the first query process on to the route server line cards in a sticky fashion for DLC lookup. A suitable hashing algorithm can be used to do this load- balancing. 2.2 Switching off the TCAM banks which are unused The TCAM banks which are not occupied as a result of this scheme may be switched off completely along with their associated SRAM banks as well which contain the rewrite information. The specification opens up the idea to simulations which help strengthen the argument for this scheme. 2.2.1 Algorithm for switching off and switching on TCAM banks Shankar Raman et.al Expires November 5, 2013 [Page 7] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 It is important to note that one cannot switch off all TCAM banks that are empty. If there is a sudden rush / burst of traffic at points in time which may be pretty regular for most deployments if the exact number of used TCAM banks alone are set on there would be a problem with respect to not being able to accommodate all the new flows in the switched on TCAM banks. Thus there would be an overflow and that would mean switching TCAM banks that are switched off to ON when it is too late. The latency of switching ON and off TCAMs will be an issue. If it takes too much time to do it packet loss could occur owing to non-availability of TCAM banks. In order to alleviate the problem specified above the following algorithm among several methods can be used. Shankar Raman et.al Expires November 5, 2013 [Page 8] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 Begin 1 // At Initialization time 2 All TCAM banks are filled with entries in the non-DLC linecards; 2.1 // When Timers start working 3 Timer mechanisms are used to age out entries that are not used; 4 When TCAM banks become empty NOT all of them are switched off; 5 Say X % of the TCAM banks are switched off; 6 X is determined by the number of active TCAM banks used; 7 if (number of TCAM banks used is greater than say 75%) 8 then 9 X = 0; // This would mean switching on all TCAM banks // even those that are empty 10 elseif (number of TCAM banks is lesser than 75 but 11 closer to 50%) 12 X = 25%; // This would mean switching on the rest // of the empty TCAM banks // TCAM banks switched off = literal 25%. Rest of the // TCAM banks are switched ON. 13 elseif (number of TCAM banks is much lesser than 50% 14 but greater than 35%) 15 X = 30%; // TCAM banks switched off = literal 30%. Rest of the // TCAM banks are switched ON. 16 elseif (number of TCAM banks is lesser than 35%) 17 X = 45%; // TCAM banks switched off = literal 45%. Rest of the // TCAM banks are switched ON. 18 endif End The percentages are rounded off to the ceiling of the TCAM banks Shankar Raman et.al Expires November 5, 2013 [Page 9] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 existent on the linecard. Note : TCAM banks that are empty but switched ON are ready to be populated with entries if the bursty traffic trigger the first-packet lookup the DLC/BDLC linecards. This could be seen as a conservative approach in making sure there are no major events owing to overflow. Underflow is always to be considered to be a good thing. This loop is run periodically and more frequently when the first packet re-direction to the DLC/BDLC is happening frequently. Thus there is always a buffer of TCAM banks that are ready to be populated in this conservative approach. Simulation results will be published in the next version of the draft. 2.2.2 Prefix entry population latency When the first-packet is sent to the DLC/BDLC linecards it is possible that there is a latency in populating the resultant prefix + mask entry in the linecard requesting the lookup. It is to be understood that till the lookup entry is populated for the requesting linecard all preceding lookups are forwarded to the DLC/BDLC linecards. 2.3 Using Aggregate routes Also periodically on the non-DLC linecards a Route aggregation process similar to the S-VA or the simple Virtual Aggregate with exception entries is used to further reduce the TCAM occupation. 2.4 Route Processor as the DLC/BDLC It is possible that the linecard which is designated as the DLC/ Backup DLC can be the Route processor itself. In such a case the same Network Processor Unit chipset that is available on the linecard would be available on the Route Processor itself. There could be more than one Route Processor each with the NPU chipset. Thus the DLC / BDLC could be the Route Processor and backup Route Processor. 2.5 Advantages of this scheme The TCAM banks unused as a result of this scheme save power when they are empty since they are switched off. Existing commercial chipsets provide the capability of banks of TCAM to be switched off. Also the upcoming chipsets provide for TCAM banks which are smaller than a set of a few large TCAM banks. Shankar Raman et.al Expires November 5, 2013 [Page 10] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 Power consumption of related SRAM banks that map to these TCAM banks , which store datum connected to route entries represented in the respective TCAM entries are also saved from being refreshed. The advantage is realized if the SRAM banks are also granularized to be in the form of sets of smaller banks than a set of few large banks. A finer granularity of TCAM banks with respect to their size and the number of TCAM entries that occupy these banks can be achieved to derive maximum benefit from this scheme. 3 Security Considerations There are no security considerations with regard to this specification. 4 IANA Considerations No IANA considerations need to be considered as part of this document. 5 References 5.1 Normative References None. 5.2 Informative References None. Authors' Addresses Shankar Raman Department of Computer Science and Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: mjsraman@cse.iitm.ac.in Shankar Raman et.al Expires November 5, 2013 [Page 11] INTERNET DRAFT TCAM power reduction methods in Routers May 4, 2013 Balaji Venkat Venkataswami Department of Electrical Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: balajivenkat299@gmail.com Prof.Kamakoti Veezhinathan Department of Computer Science and Engineering IIT Madras Chennai - 600036 TamilNadu India Email: kama@cse.iitm.ac.in Prof.Gaurav Raina Department of Electrical Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: gaurav@ee.iitm.ac.in Shankar Raman et.al Expires November 5, 2013 [Page 12]