<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rfc [
 <!ENTITY nbsp    "&#160;">
 <!ENTITY reg     "&#174;">
 <!ENTITY zwsp    "&#8203;">
 <!ENTITY nbhy    "&#8209;">
 <!ENTITY wj      "&#8288;">
]>

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-briscoe-docsis-q-protection-07" number="9957" ipr="trust200902" obsoletes="" updates="" submissionType="independent" xml:lang="en" tocInclude="true" tocDepth="4" symRefs="true" sortRefs="true" version="3">
<!-- xml2rfc v2v3 conversion 3.18.2 -->

<front>

<!--[rfced] May the "®" be removed from the title? We note that
previous RFCs with DOCSIS in the title do not use this. Also, on
this topic, the Chicago Manual of Style says that it is not
necessary in "publications that are not advertising or sales
materials".

Original:
   The DOCSIS® Queue Protection Algorithm to Preserve Low Latency

Suggested:
   The DOCSIS Queue Protection Algorithm to Preserve Low Latency
-->

<title abbrev="Queue Protection to Preserve Low Latency">The DOCSIS&reg; Queue Protection Algorithm to Preserve Low Latency</title>
<seriesInfo name="RFC" value="9957"/>
<author fullname="Bob Briscoe" initials="B." role="editor" surname="Briscoe">
  <organization>Independent</organization>
  <address>
    <postal>
      <street/>
      <country>United Kingdom</country>
    </postal>
    <email>ietf@bobbriscoe.net</email>
    <uri>https://bobbriscoe.net/</uri>
  </address>
</author>
<author fullname="Greg White" initials="G." surname="White">
  <organization>CableLabs</organization>
  <address>
    <postal>
      <street/>
      <country>United States of America</country>
    </postal>
    <email>G.White@CableLabs.com</email>
  </address>
</author>
<date month="April" year="2026"/>
<area>WIT</area>
<workgroup>tsvwg</workgroup>
<keyword>Independent Submission Stream</keyword>
<keyword>ISE</keyword>
<keyword>Latency</keyword>
<keyword>Policing</keyword>
<abstract>
<t>This Informational RFC explains the specification of the queue protection algorithm used in Data-Over-Cable Service Interface Specification (DOCSIS) technology since version 3.1.
A shared low-latency queue relies on the non-queue-building behavior of every traffic flow using it. However, some flows might not take such care, either accidentally or maliciously. If a queue is about to exceed a threshold level of delay, the queue protection algorithm can rapidly detect the flows most likely to be responsible. It can then prevent harm to other traffic in the low-latency queue by ejecting selected packets (or all packets) of these flows. This document is designed for four audiences: a) congestion control designers who need to understand how to keep on the "good" side of the algorithm; b) implementers of the algorithm who want to understand it in more depth; c) designers of algorithms with similar goals, perhaps for non-DOCSIS scenarios; and d) researchers interested in evaluating the algorithm.</t>
</abstract>
</front>

<middle>
<section anchor="qp_intro" numbered="true" toc="default">
<name>Introduction</name>
<t>This Informational RFC explains the specification of the queue protection (QProt) algorithm used in DOCSIS technology since version 3.1 <xref target="DOCSIS" format="default"/>.</t>
<t>Although the algorithm is defined in Annex P of <xref target="DOCSIS" format="default"/>, it relies on cross references to other parts of the set of specifications. This document pulls all the strands together into one self-contained document.
The core of the document is a similar pseudocode walk-through to that in the DOCSIS specification, but it also includes additional material:</t>
<ol spacing="normal" type="i">
<li>a brief overview,</li>
<li>a definition of how a data sender needs to behave to avoid triggering queue protection, and</li>
<li>a section giving the rationale for the design choices.</li>
</ol>
<t>Low queuing delay depends on hosts sending their data smoothly, either at a low rate or responding to Explicit Congestion Notifications (ECNs) (see <xref target="RFC8311" format="default"/> and <xref target="RFC9331" format="default"/>). So, low-latency queuing is something hosts create themselves, not something the network gives them. This tends to ensure that self-interest alone does not drive flows to mis-mark their packets for the low-latency queue. However, traffic from an application that does not behave in a non-queue-building way might erroneously be classified into a low-latency queue, whether accidentally or maliciously. QProt protects other traffic in the low-latency queue from the harm due to excess queuing that would otherwise be caused by such anomalous behavior.</t>
<t>In normal scenarios without misclassified traffic, QProt is not expected to intervene at all in the classification or forwarding of packets.</t>
<t>An overview of how low-latency support has been added to DOCSIS technology is given in <xref target="LLD" format="default"/>. In each direction of a DOCSIS link (upstream and downstream), there are two queues: one for Low-Latency (LL) and one for Classic traffic, in an arrangement similar to the IETF's Coupled DualQ Active Queue Management (AQM) <xref target="RFC9332" format="default"/>.
The two queues enable a transition from "Classic" to "Scalable" congestion control so that low latency can become the norm for any application, including ones seeking both full throughput and low latency, not just low-rate applications that have been more traditionally associated with a low-latency service.

<!--[rfced] We note that the companion document RFC-to-be 9956
(draft-ietf-tsvwg-nab-33) cites more information for Reno and Cubic.
Should citations be added here as well for the ease of the reader?

Original:
   The Classic queue is only necessary for traffic such as
   traditional (Reno/Cubic) TCP that needs about a round trip of
   buffering to fully utilize the link, and therefore has no
   incentive to mismark itself as low latency.

Perhaps:
   The Classic queue is only necessary for traffic such as
   traditional (Reno [RFC5681] / Cubic [RFC9438]) TCP that needs
   about a round trip of buffering to fully utilize the link;
   therefore, this traffic has no incentive to mismark itself as
   low latency.
-->

The Classic queue is only necessary for traffic such as traditional (Reno/Cubic) TCP that needs about a round trip of buffering to fully utilize the link, and therefore has no incentive to mismark itself as low latency. The QProt function is located at the ingress to the Low-Latency queue. Therefore, in the upstream, QProt is located on the cable modem (CM); in the downstream, it is located on the CM Termination System (CMTS).
If an arriving packet triggers queue protection, the QProt algorithm ejects the packet from the Low-Latency queue and reclassifies it into the Classic queue.</t>
<t>If QProt is used in settings other than DOCSIS links, it would be a simple matter to detect queue-building flows by using slightly different conditions and/or to trigger a different action as a consequence, as appropriate for the scenario, e.g., dropping instead of reclassifying packets or perhaps accumulating a second per-flow score to decide whether to redirect a whole flow rather than just certain packets. Such work is for future study and out of scope of the present document.</t>
<t>The QProt algorithm is based on a rigorous approach to quantifying how much each flow contributes to congestion, which is used in economics to allocate responsibility for the cost of one party's behavior on others (the economic externality). Another important feature of the approach is that the metric used for the queuing score is based on the same variable that determines the level of ECN signalling seen by the sender (see <xref target="RFC8311" format="default"/> and <xref target="RFC9331" format="default"/>). This makes the internal queuing score visible externally as ECN markings. This transparency is necessary to be able to objectively state (in <xref target="qp_nec_flow_behavior" format="default"/>) how a flow can keep on the "good" side of the algorithm.</t>
<section numbered="true" toc="default">
<name>Document Roadmap</name>
<t>The core of the document is the walk-through of the DOCSIS QProt algorithm's pseudocode in <xref target="qp_walk-through" format="default"/>.</t>
<t>Prior to that, <xref target="qp_approach" format="default"/> summarizes the approach used in the algorithm.
Then, <xref target="qp_nec_flow_behavior" format="default"/> considers QProt from the perspective of the end-system by defining the behavior that a flow needs to comply with to avoid the QProt algorithm ejecting its packets from the low-latency queue.</t>
<t><xref target="qp_rationale" format="default"/> gives deeper insight into the principles and rationale behind the algorithm. Then, <xref target="qp_limitations" format="default"/> explains the limitations of the approach. The usual closing sections follow.</t>
</section>
<section anchor="l4sds_Terminology" numbered="true" toc="default">
<name>Terminology</name>
<t>The normative language for the DOCSIS QProt algorithm is in the DOCSIS specifications <xref target="DOCSIS" format="default"/>, <xref target="DOCSIS-CM-OSS" format="default"/>, and <xref target="DOCSIS-CCAP-OSS" format="default"/>: not in this Informational RFC. If there is any inconsistency, the DOCSIS specifications take precedence.</t>
<t>The following terms and abbreviations are used:</t>
<dl newline="false" spacing="normal">
<dt>CM:</dt>
<dd>Cable Modem</dd>
<dt>CMTS:</dt>
<dd>CM Termination System</dd>
<dt>Congestion-rate:</dt>
<dd>The transmission rate of bits or bytes contained within packets of a flow that have the CE codepoint set in the IP-ECN field <xref target="RFC3168" format="default"/> (including IP headers unless specified otherwise). Congestion-bit-rate and congestion-volume were introduced in <xref target="RFC7713" format="default"/> and <xref target="RFC6789" format="default"/>.</dd>
<dt>DOCSIS:</dt>
<dd>Data-Over-Cable Service Interface Specification. "DOCSIS" is a registered trademark of Cable Television Laboratories, Inc.
("CableLabs").</dd>
<dt>Non-queue-building:</dt>
<dd>A flow that tends not to build a queue.</dd>
<dt>Queue-building:</dt>
<dd>A flow that builds a queue. If it is classified into the Low-Latency queue, it is therefore a candidate for the queue protection algorithm to detect and sanction.</dd>
<dt>ECN:</dt>
<dd>Explicit Congestion Notification</dd>
<dt>QProt:</dt>
<dd>The Queue Protection function</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>Copyright Material</name>
<t>Parts of this document are reproduced from <xref target="DOCSIS" format="default"/> with kind permission of the copyright holder, Cable Television Laboratories, Inc. ("CableLabs").</t>
</section>
</section>
<section anchor="qp_approach" numbered="true" toc="default">
<name>Approach (In Brief)</name>
<t>The algorithm is divided into mechanism and policy. There is only a tiny amount of policy code, but policy might need to be changed in the future. So, where hardware implementation is being considered, it would be advisable to implement the policy aspects in firmware or software:</t>
<ul spacing="normal">
<li><t>The mechanism aspects identify flows, maintain flow-state, and accumulate per-flow queuing scores;</t></li>
<li><t>The policy aspects can be divided into conditions and actions:</t>
<ul spacing="normal">
<li><t>The conditions are the logic that determines when action should be taken to avert the risk of queuing delay becoming excessive;</t></li>
<li><t>The actions determine how this risk is averted, e.g., by redirecting packets from a flow into another queue or by reclassifying a whole flow that seems to be misclassified.</t></li>
</ul>
</li>
</ul>
<section anchor="qp_approach_mechanism" numbered="true" toc="default">
<name>Mechanism</name>
<t>The algorithm maintains per-flow-state, where "flow" usually means an end-to-end (Layer 4) 5-tuple.
The flow-state consists of a queuing score that decays over time. Indeed, it is transformed into time units so that it represents the flow-state's own expiry time (explained in <xref target="qp_rationale_normalize" format="default"/>). A higher queuing score pushes out the expiry time further.</t>
<t>Non-queue-building flows tend to release their flow-state rapidly: it usually expires reasonably early in the gap between the packets of a normal flow. Then, the memory can be recycled for packets from other flows that arrive in-between. So, only queue-building flows hold flow state persistently.</t>
<t>The simplicity and effectiveness of the algorithm is due to the definition of the queuing score. The queuing score represents the share of blame for queuing that each flow bears. The scoring algorithm uses the same internal variable, probNative, that the AQM for the low-latency queue uses to ECN-mark packets. (The other two forms of marking, Classic and coupled, are driven by Classic traffic; therefore, they are not relevant to protection of the LL queue.) In this way, the queuing score accumulates the size of each arriving packet of a flow but scaled by the value of probNative (in the range 0 to 1) at the instant the packet arrives. So, a flow's score accumulates faster:</t>
<ul>
<li>the higher the degree of queuing and</li>
<li>the faster that the flow's packets arrive when there is queuing.</li>
</ul>
<t><xref target="qp_rationale_not_throughput" format="default"/> explains further why this score represents blame for queuing.</t>
<t>The algorithm, as described so far, would accumulate a number that would rise at the so-called congestion-rate of the flow (see <xref target="l4sds_Terminology" format="default"/>), i.e., the rate at which the flow is contributing to congestion or the rate at which the AQM is forwarding bytes of the flow that are ECN-marked. However, rather than growing continually, the queuing score is also reduced (or "aged") at a constant rate. This is because it is unavoidable for capacity-seeking flows to induce a continuous low level of congestion as they track available capacity. <xref target="qp_rationale_aging" format="default"/> explains why this allowance can be set to the same constant for any scalable flow, whatever its bit rate.</t>
<t>For implementation efficiency, the queuing score is transformed into time units; this is so that it represents the expiry time of the flow state (as already discussed above). Then, it does not need to be explicitly aged because the natural passage of time implicitly "ages" an expiry time. The transformation into time units simply involves dividing the queuing score of each packet by the constant aging rate (this is explained further in <xref target="qp_rationale_normalize" format="default"/>).</t>
</section>
<section anchor="qp_approach_policy" numbered="true" toc="default">
<name>Policy</name>
<section numbered="true" toc="default">
<name>Policy Conditions</name>
<t>The algorithm uses the queuing score to determine whether to eject each packet only at the time it first arrives. This limits the policies available.
For instance, when queuing delay exceeds a threshold, it is not possible to eject a packet from the flow with the highest queuing score because that would involve searching the queue for such a packet (if, indeed, one were still in the queue). Nonetheless, it is still possible to develop a policy that protects the low latency of the queue by making the queuing score threshold stricter the greater the excess of queuing delay relative to the threshold (this is explained in <xref target="qp_rationale_conditions" format="default"/>).</t>
</section>
<section numbered="true" toc="default">
<name>Policy Action</name>
<t>At the time of writing, the DOCSIS QProt specification states that when the policy conditions are met, the action taken to protect the low-latency queue is to reclassify a packet into the Classic queue (this is justified in <xref target="qp_rationale_reclassify" format="default"/>).</t>
</section>
</section>
</section>
<section anchor="qp_nec_flow_behavior" numbered="true" toc="default">
<name>Necessary Flow Behavior</name>
<t>The QProt algorithm described here can be used for responsive and/or unresponsive flows.</t>
<ul spacing="normal">
<li><t>It is possible to objectively describe the least responsive way that a flow will need to respond to congestion signals in order to avoid triggering queue protection, no matter the link capacity and no matter how much other traffic there is.</t></li>
<li><t>It is not possible to describe how fast or smooth an unresponsive flow should be to avoid queue protection because this depends on how much other traffic there is and the capacity of the link, which an application is unable to know.
However, the more smoothly an unresponsive flow paces its packets and the lower its rate relative to typical broadband link capacities, the less likely it is to cause enough queuing to trigger queue protection.</t></li>
</ul>
<t>Responsive low-latency flows can use a Low Latency, Low Loss, and Scalable throughput (L4S) ECN codepoint <xref target="RFC9331" format="default"/> to get classified into the low-latency queue.</t>
<t>A sender can arrange for flows that are smooth but do not respond to ECN marking to be classified into the low-latency queue by using the Non-Queue-Building (NQB) Diffserv codepoint <xref target="RFC9956" format="default"/>, which the DOCSIS specifications support, or an operator can use various other local classifiers.</t>
<t>As already explained in <xref target="qp_approach_mechanism" format="default"/>, the QProt algorithm is driven from the same variable that drives the ECN-marking probability in the low-latency or "LL" queue (the "Native" AQM of the LL queue is defined in the Immediate Active Queue Management Annex of <xref target="DOCSIS" format="default"/>). The algorithm that calculates this internal variable is run on the arrival of every packet, whether or not it is ECN-capable, so that it can be used by the QProt algorithm. But the variable is only used to ECN-mark packets that are ECN-capable.</t>
<t>Not only does this dual use of the variable improve processing efficiency, but it also makes the basis of the QProt algorithm visible and transparent, at least for responsive ECN-capable flows. Then, it is possible to state objectively that a flow can avoid triggering queue protection by keeping the bit rate of ECN-marked packets (the congestion-rate) below AGING, which is a configured constant of the algorithm (default 2^19 B/s ~= 4 Mb/s).
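</t>
<t>To make the default allowance concrete, the following worked example shows the arithmetic behind the approximation above (this example is illustrative only; it is not part of the DOCSIS pseudocode):</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Illustrative arithmetic only; not part of the DOCSIS algorithm.
// Default congestion-rate allowance, from LG_AGING = 19:
//   2^19 B/s = 524288 B/s
//            = 524288 * 8 b/s ~= 4.2 Mb/s
// Example flow: 1500 B packets, ECN-marked at 100 packet/s:
//   congestion-rate = 1500 * 100 = 150000 B/s << 524288 B/s
// So, on average, this flow's queuing score decays faster than it
// accumulates, and its flow-state keeps expiring.
]]></sourcecode>
<t>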
Note that it is in a congestion controller's own interest to keep its average congestion-rate well below this level (e.g., ~1 Mb/s) to ensure that it does not trigger queue protection during transient dynamics.</t>
<t>If the QProt algorithm is used in other settings, it would still need to be based on the visible level of congestion signalling, in a similar way to the DOCSIS approach. Without transparency of the basis of the algorithm's decisions, end-systems would not be able to avoid triggering queue protection on an objective basis.</t>
</section>
<section anchor="qp_walk-through" numbered="true" toc="default">
<name>Pseudocode Walk-Through</name>
<section anchor="qp_header_file" numbered="true" toc="default">
<name>Input Parameters, Constants, and Variables</name>
<t>The operator input parameters that set the parameters in the first two blocks of pseudocode below are defined for cable modems (CMs) in <xref target="DOCSIS-CM-OSS" format="default"/> and for CMTSs in <xref target="DOCSIS-CCAP-OSS" format="default"/>. Then, further constants are either derived from the input parameters or hard-coded.</t>
<t>Defaults and units are shown in square brackets. Defaults (or indeed any aspect of the algorithm) are subject to change, so the latest DOCSIS specifications are the definitive references. Also, any operator might set certain parameters to non-default values.</t>
<!--[rfced] FYI, "us" has been updated to "µs" in three instances
where it follows numerals in comments in the pseudocode. This is in
keeping with using µs for microseconds in RFC-to-be 9956.
Original:
   4000us
   1000us
   525 us

Current:
   4000 µs
   1000 µs
   525 µs
-->
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Input Parameters
MAX_RATE;          // Configured maximum sustained rate [b/s]
QPROTECT_ON;       // Queue Protection is enabled [Default: TRUE]
CRITICALqL_us;     // LL queue threshold delay [us] Default: MAXTH_us
CRITICALqLSCORE_us;// The threshold queuing score [Default: 4000 µs]
LG_AGING;          // The aging rate of the q'ing score [Default: 19]
                   // as log base 2 of the congestion-rate [lg(B/s)]

// Input Parameters for the calcProbNative() algorithm:
MAXTH_us;          // Max LL AQM marking threshold [Default: 1000 µs]
LG_RANGE;          // Log base 2 of the range of ramp [lg(ns)]
                   // Default: 2^19 = 524288 ns (roughly 525 µs)
]]></sourcecode>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Constants, either derived from input parameters or hard-coded
T_RES;             // Resolution of t_exp [ns]
// Convert units (approx)
AGING = pow(2, (LG_AGING-30) ) * T_RES;   // lg([B/s]) to [B/T_RES]
CRITICALqL = CRITICALqL_us * 1000;        // [us] to [ns]
CRITICALqLSCORE = CRITICALqLSCORE_us * 1000/T_RES; // [us] to [T_RES]
// Threshold for the q'ing score condition
CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE;
qLSCORE_MAX = 5E9 / T_RES;   // Max queuing score = 5 s
ATTEMPTS = 2;      // Max attempts to pick a bucket (vendor-specific)
BI_SIZE = 5;       // Bit-width of index number for non-default buckets
NBUCKETS = pow(2, BI_SIZE);  // No. of non-default buckets
MASK = NBUCKETS-1; // convenient constant, with BI_SIZE LSBs set

// Queue Protection exit states
EXIT_SUCCESS = 0;  // Forward the packet
EXIT_SANCTION = 1; // Redirect the packet

MAX_PROB = 1;      // For integer arithmetic, would use a large int
                   // e.g., 2^31, to allow space for overflow
MAXTH = MAXTH_us * 1000;     // Max marking threshold [ns]
MAX_FRAME_SIZE = 2000;       // DOCSIS-wide constant [B]
// Minimum marking threshold of 2 MTU for slow links [ns]
FLOOR = 2 * 8 * MAX_FRAME_SIZE * 10^9 / MAX_RATE;
RANGE = (1 << LG_RANGE);     // Range of ramp [ns]
MINTH = max ( MAXTH - RANGE, FLOOR);
MAXTH = MINTH + RANGE;       // Max marking threshold [ns]
]]></sourcecode>
<!--[rfced] Please review and rephrase the following sentence with
regard to the clause that begins "but in the floating..." as the
sentence does not seem to parse as is.

Original:
   The actual DOCSIS QProt algorithm is defined using integer
   arithmetic, but in the floating point arithmetic used in this
   document, (0 <= probNative <= 1).

Perhaps:
   The actual DOCSIS QProt algorithm is defined using integer
   arithmetic, but in the floating-point arithmetic used in this
   document, the native marking probability is between 0 and 1
   (inclusive), i.e., 0 <= probNative <= 1.
-->
<t>Throughout the pseudocode, most variables are integers. The only exceptions are floating-point variables representing probabilities (MAX_PROB and probNative) and the AGING parameter. The actual DOCSIS QProt algorithm is defined using integer arithmetic, but in the floating-point arithmetic used in this document, (0 &lt;= probNative &lt;= 1).
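</t>
<t>For orientation, the following sketch shows the general shape that the ramp constants above imply for calcProbNative(): a linear ramp from 0 to MAX_PROB as qdelay rises from MINTH to MAXTH. This is an illustrative assumption for this walk-through, not a reproduction of the definitive algorithm in the DOCSIS specification:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Illustrative sketch only; the definitive calcProbNative() is
// defined in the DOCSIS specification
calcProbNative(qdelay) {
    if (qdelay < MINTH)
        probNative = 0;
    else if (qdelay >= MAXTH)
        probNative = MAX_PROB;
    else   // linear ramp over RANGE, between MINTH and MAXTH
        probNative = MAX_PROB * (qdelay - MINTH) / RANGE;
    return probNative;
}
]]></sourcecode>
<t>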
Also, the pseudocode omits overflow checking, and it would need to be made robust to non-default input parameters.</t>
<t>The resolution for expressing time, T_RES, needs to be chosen to ensure that expiry times for buckets can represent times that are a fraction (e.g., 1/10) of the expected packet interarrival time for the system.</t>
<t>The following definitions explain the purpose of important variables and functions.</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Public variables:
qdelay;        // The current queuing delay of the LL queue [ns]
probNative;    // Native marking probability of LL queue within [0,1]

// External variables
packet;        // The structure holding packet header fields
packet.size;   // The size of the current packet [B]
packet.uflow;  // The flow identifier of the current packet
               // (e.g., 5-tuple or 4-tuple if IPsec)

// Irrelevant details of DOCSIS function to return qdelay are removed
qdelayL(...)   // Returns current delay of the low-latency Q [ns]
]]></sourcecode>
<t>Pseudocode for how the algorithm categorizes packets by flow ID to populate the variable packet.uflow is not given in detail here.
The application's flow ID is usually defined by a common 5-tuple (or 4-tuple) of:</t>
<ul spacing="normal">
<li><t>source and destination IP addresses of the innermost IP header found;</t></li>
<li><t>the protocol (IPv4) or next header (IPv6) field in this IP header;</t></li>
<li><t>either of:</t>
<ul spacing="normal">
<li><t>source and destination port numbers, for TCP, UDP, UDP-Lite, Stream Control Transmission Protocol (SCTP), Datagram Congestion Control Protocol (DCCP), etc.;</t></li>
<li><t>Security Parameters Index (SPI) for IPsec Encapsulating Security Payload (ESP) <xref target="RFC4303" format="default"/>.</t></li>
</ul>
</li>
</ul>
<t>The Microflow Classification section of the Queue Protection Annex of the DOCSIS specification <xref target="DOCSIS" format="default"/> defines various strategies to find these headers by skipping extension headers or encapsulations. If they cannot be found, the specification defines various less-specific 3-tuples that would be used.
The DOCSIS specification should be referred to for all these strategies, which will not be repeated here.</t>
<t>The array of bucket structures defined below is used by all the Queue Protection functions:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
struct bucket { // The leaky bucket structure to hold per-flow state
    id;         // identifier (e.g., 5-tuple) of flow using bucket
    t_exp;      // expiry time in units of T_RES
                // (t_exp - now) = flow's transformed q'ing score
};
struct bucket buckets[NBUCKETS+1];
]]></sourcecode>
</section>
<section anchor="qp_data_path" numbered="true" toc="default">
<name>Queue Protection Data Path</name>
<t>All the functions of Queue Protection operate on the data path, driven by packet arrivals.</t>
<t>The following functions that maintain per-flow queuing scores and manage per-flow state are considered primarily as mechanism:</t>
<ul spacing="normal">
<li><t>pick_bucket(uflow_id); // Returns bucket identifier</t></li>
<li><t>fill_bucket(bucket_id, pkt_size, probNative); // Returns queuing score</t></li>
<li><t>calcProbNative(qdelay) // Returns ECN-marking probability of the native LL AQM</t></li>
</ul>
<t>The following function is primarily concerned with policy:</t>
<ul spacing="normal">
<li><t>qprotect(packet, ...); // Returns exit status to either forward or redirect the packet</t></li>
</ul>
<t>('...' suppresses distracting detail.)</t>
<t>Future modifications to policy aspects are more likely than modifications to mechanisms.
Therefore, policy aspects would be less appropriate candidates for any hardware acceleration.</t>
<t>The entry point to these functions is qprotect(), which is called from packet classification before each packet is enqueued into the appropriate queue, queue_id, as follows:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
classifier(packet) {
    // Determine which queue using ECN, DSCP, and any local-use fields
    queue_id = classify(packet);
    // LQ & CQ are macros for valid queue IDs returned by classify()
    // if packet classified to Low-Latency Service Flow
    if (queue_id == LQ) {
        if (QPROTECT_ON) {
            if (qprotect(packet, ...) == EXIT_SANCTION) {
                // redirect packet to Classic Service Flow
                queue_id = CQ;
            }
        }
    }
    return queue_id;
}
]]></sourcecode>
<section anchor="qp_qprotect" numbered="true" toc="default">
<name>The qprotect() Function</name>
<t>On each packet arrival at the LL queue, qprotect() measures the current delay of the LL queue and derives the native LL marking probability from it. Then, it uses pick_bucket to find the bucket already holding the flow's state or to allocate a new bucket if the flow is new or its state has expired (the most likely case). Then, the queuing score is updated by the fill_bucket() function. That completes the mechanism aspects.</t>
<t>The comments against the subsequent policy conditions and actions should be self-explanatory at a superficial level. The deeper rationale for these conditions is given in <xref target="qp_rationale_conditions" format="default"/>.</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Per-packet queue protection
qprotect(packet, ...)
{
    bckt_id;   // bucket index
    qLscore;   // queuing score of pkt's flow in units of T_RES

    qdelay = qL.qdelay(...);
    probNative = calcProbNative(qdelay);
    bckt_id = pick_bucket(packet.uflow);
    qLscore = fill_bucket(buckets[bckt_id], packet.size, probNative);

    // Determine whether to sanction packet
    if ( ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold...
           // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
           // ...exceeds CRITICALqLSCORE
           && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
         // or qLSCORE_MAX reached
         || ( qLscore >= qLSCORE_MAX ) )
        return EXIT_SANCTION;
    else
        return EXIT_SUCCESS;
}
]]></sourcecode>
</section>
<section anchor="qp_pick_bucket" numbered="true" toc="default">
<name>The pick_bucket() Function</name>
<t>The pick_bucket() function is optimized for flow-state that will normally have expired from packet to packet of the same flow. It is just one way of finding the bucket associated with the flow ID of each packet: it might be possible to develop more efficient alternatives.</t>
<t>The algorithm is arranged so that the bucket holding any live (non-expired) flow-state associated with a packet will always be found before a new bucket is allocated. The constant ATTEMPTS, defined earlier, determines how many hashes are used to find a bucket for each flow. (Actually, only one hash is generated; then, by default, 5 bits of it at a time are used as the hash value because, by default, there are 2^5 = 32 buckets.)</t>
<t>The algorithm stores the flow's own ID in its flow-state. So, when a packet of a flow arrives, the algorithm tries up to ATTEMPTS times to hash to a bucket, looking for the flow's own ID.
If found, it uses that bucket, first resetting the expiry time to "now" if it has expired.</t> <t>If it does not find the flow's ID, and the expiry time is still current, the algorithm can tell that another flow is using that bucket, and it continues to look for a bucket for the flow. Even if it finds another flow's bucket where the expiry time has passed, it doesn't immediately use it. It merely remembers it as the potential bucket to use. But first it runs through all the ATTEMPTS hashes to look for a bucket assigned to the flow ID. Then, if a live bucket is not already associated with the packet's flow, the algorithm should have already set aside an existing bucket with a score that has aged out. Given that this bucket is no longer necessary to hold state for its previous flow, it can be recycled for use by the present packet's flow.</t> <t>If all else fails, there is one additional bucket (called the dregs) that can be used. If the dregs is still in live use by another flow, subsequent flows that cannot find a bucket of their own all share it, adding their score to the one in the dregs. A flow might get away with using the dregs on its own, but when there are many mis-marked flows, multiple flows are more likely to collide in the dregs, including innocent flows.
The choice of the number of buckets and the number of hash attempts determines how likely it will be that this undesirable scenario will occur.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Pick the bucket associated with flow uflw
pick_bucket(uflw) {
  now;   // current time
  j;     // loop counter
  h32;   // holds hash of the packet's flow IDs
  h;     // bucket index being checked
  hsav;  // interim chosen bucket index

  h32 = hash32(uflw);    // 32-bit hash of flow ID
  hsav = NBUCKETS;       // Default bucket
  now = get_time_now();  // in units of T_RES

  // The for loop checks ATTEMPTS buckets for ownership by flow ID
  // It also records the 1st bucket, if any, that could be recycled
  // because it's expired.
  // Must not recycle a bucket until all ownership checks completed
  for (j=0; j<ATTEMPTS; j++) {
    // Use least signif. BI_SIZE bits of hash for each attempt
    h = h32 & MASK;
    if (buckets[h].id == uflw) {      // Once uflw's bucket found...
      if (buckets[h].t_exp <= now)    // ...if bucket has expired...
        buckets[h].t_exp = now;       // ...reset it
      return h;                       // Either way, use it
    } else if ( (hsav == NBUCKETS)    // If not seen expired bucket yet
                // and this bucket has expired
                && (buckets[h].t_exp <= now) ) {
      hsav = h;                       // set it as the interim bucket
    }
    h32 >>= BI_SIZE;  // Bit-shift hash for next attempt
  }
  // If reached here, no tested bucket was owned by the flow ID
  if (hsav != NBUCKETS) {
    // If here, found an expired bucket within the above for loop
    buckets[hsav].t_exp = now;        // Reset expired bucket
  } else {
    // If here, we're having to use the default bucket (the dregs)
    if (buckets[hsav].t_exp <= now) { // If dregs has expired...
      buckets[hsav].t_exp = now;      // ...reset it
    }
  }
  buckets[hsav].id = uflw;  // In either case, claim for recycling
  return hsav;
}
]]></sourcecode> </section> <section anchor="qp_fill_bucket" numbered="true" toc="default"> <name>The fill_bucket() Function</name> <t>The fill_bucket() function both accumulates and ages the queuing score over time, as outlined in <xref target="qp_approach_mechanism" format="default"/>. To make aging the score efficient, the increment of the queuing score is transformed into units of time by dividing by AGING so that the result represents the new expiry time of the flow.</t> <t>Given that probNative is already used to select which packets to ECN-mark, it might be thought that the queuing score could just be incremented by the full size of each selected packet, instead of incrementing it by the product of every packet's size (pkt_sz) and probNative. However, the unpublished experience of one of the authors with other congestion policers has found that the score then increments far too jumpily, particularly when probNative is low.</t> <t>A deeper explanation of the queuing score is given in <xref target="qp_rationale" format="default"/>.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
fill_bucket(bckt_id, pkt_sz, probNative) {
  now;  // current time
  now = get_time_now();  // in units of T_RES

  // Add packet's queuing score
  // For integer arithmetic, a bit-shift can replace the division
  qLscore = min(buckets[bckt_id].t_exp - now
                + probNative * pkt_sz / AGING, qLSCORE_MAX);
  buckets[bckt_id].t_exp = now + qLscore;
  return qLscore;
}
]]></sourcecode> </section> <section anchor="qp_calcProbNative" numbered="true" toc="default"> <name>The calcProbNative() Function</name> <t>To derive this queuing score, the QProt algorithm uses the linear ramp function calcProbNative() to normalize the instantaneous queuing delay of the LL queue into a probability in the range [0,1],
which it assigns to probNative.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
calcProbNative(qdelay) {
  if ( qdelay >= MAXTH ) {
    probNative = MAX_PROB;
  } else if ( qdelay > MINTH ) {
    probNative = MAX_PROB * (qdelay - MINTH) / RANGE;
    // In practice, the * and the / would use a bit-shift
  } else {
    probNative = 0;
  }
  return probNative;
}
]]></sourcecode> </section> </section> </section> <section anchor="qp_rationale" numbered="true" toc="default"> <name>Rationale</name><t/><section anchor="qp_rationale_not_throughput" numbered="true" toc="default"> <name>Rationale: Blame for Queuing, Not for Rate in Itself</name> <t><xref target="qp_fig_blame_cbr_v_burst" format="default"/> shows the bit rates of two flows as stacked areas. It poses the question of which flow is more to blame for queuing delay: the unresponsive constant bit rate flow (c) that is consuming about 80% of the capacity or the flow sending regular short unresponsive bursts (b)? The smoothness of c seems better for avoiding queuing, but its high rate does not. However, if flow c were not there, or ran slightly more slowly, b would not cause any queuing.</t> <figure anchor="qp_fig_blame_cbr_v_burst"> <name>Which is more to blame for queuing delay?</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 ^ bit rate (stacked areas)
 |  ,-.          ,-.          ,-.          ,-.          ,-.
 |--|b|----------|b|----------|b|----------|b|----------|b|---Capacity
 |__|_|__________|_|__________|_|__________|_|__________|_|_____
 |
 |                            c
 |
 |
 |
 +---------------------------------------------------------------->
                                                               time]]></artwork> </figure> <t>To explain queuing scores, in the following it will initially be assumed that the QProt algorithm is accumulating queuing scores but not taking any action as a result.</t> <t>To quantify the responsibility that each flow bears for queuing delay, the QProt algorithm accumulates the product of the rate of each flow and the level of congestion, both measured at the instant each packet arrives. The instantaneous flow rate is represented at each discrete event when a packet arrives by the packet's size, which accumulates faster the more packets arrive within each unit of time. The level of congestion is normalized to a dimensionless number between 0 and 1 (probNative).
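</t> <t>In discrete-event form, the accumulation just described can be sketched as follows (an illustrative sketch only; the algorithm as specified folds this increment into the transformed, self-aging form used by fill_bucket()):</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative sketch only: untransformed score accumulation
// On each arrival of a packet pkt belonging to flow f:
probNative = calcProbNative(qL.qdelay(...));  // congestion in [0,1]
score[f] += probNative * pkt.size;            // "congested-bytes"
// While the queue is negligible, probNative == 0, so even a
// high-rate flow accumulates no score
]]></sourcecode> <t>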
This fractional congestion level is used in preference to a direct dependence on queuing delay for two reasons:</t> <ul spacing="normal"> <li> <t>to be able to ignore very low levels of queuing that contribute insignificantly to delay</t> </li> <li> <t>to be able to erect a steep barrier against excessive queuing delay</t> </li> </ul> <t>The unit of the resulting queue score is "congested-bytes" per second, which distinguishes it from just bytes per second.</t> <t>Then, during the periods between bursts (b), neither flow accumulates any queuing score: the high rate of c is benign. But, during each burst, if we say the rates of c and b are 80% and 45% of capacity, thus causing 25% overload, they each bear (80/125)% and (45/125)% of the responsibility for the queuing delay (64% and 36%). The algorithm does not explicitly calculate these percentages. They are just the outcome of the number of packets arriving from each flow during the burst.</t> <t>To summarize, the queuing score never sanctions rate solely on its own account. It only sanctions rate inasmuch as it causes queuing.</t> <figure anchor="qp_fig_blame_scenario"> <name>Responsibility for Queuing: More-Complex Scenario</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 ^ bit rate (stacked areas)                              ,
 |  ,-.                        |\                       ,-
 |------Capacity-|b|----------,-.----------|b|----------|b\-----
 |    __|_|_______            |b|    /``\|  _...-._-':
 |                                                       ,.--
 |  ,-.      __/ \__|_|_   _/        |/ \|/
 | |b| ___/             \___/                     __ r
 | |_|/    v                  \__/ \_______    _/\____/
 |      _/                                  \__/
 |
 +---------------------------------------------------------------->
                                                               time]]></artwork> </figure> <t><xref target="qp_fig_blame_scenario" format="default"/> gives a more-complex illustration of the way the queuing score assigns responsibility for queuing (limited to the precision that ASCII art can illustrate). The figure shows the bit rates of three flows represented as stacked areas labelled b, v, and r.
The unresponsive bursts (b) are the same as in the previous example, but a variable-rate video (v) replaces flow c. Its rate varies as the complexity of the video scene varies. Also, on a slower timescale, in response to the level of congestion, the video adapts its quality. However, on a short timescale it appears to be unresponsive to small amounts of queuing. Also, partway through, a low-latency responsive flow (r) joins in, aiming to fill the balance of capacity left by the other two.</t> <t>The combination of the first burst and the low application-limited rate of the video causes neither flow to accumulate queuing score. In contrast, the second burst causes similar excessive overload (125%) to the example in <xref target="qp_fig_blame_cbr_v_burst" format="default"/>. Then, the video happens to reduce its rate (probably due to a less-complex scene) so the third burst causes only a little congestion. If we assume the resulting queue causes probNative to rise to just 1%, then the queuing score will only accumulate 1% of the size of each packet of flows v and b during this burst.</t> <t>The fourth burst happens to arrive just as the new responsive flow (r) has filled the available capacity, so it leads to very rapid growth of the queue. After a round trip, the responsive flow rapidly backs off, and the adaptive video also backs off more rapidly than it would normally because of the very high congestion level. The rapid response to congestion of flow r reduces the queuing score that all three flows accumulate, but they each still bear the cost in proportion to the product of the rates at which their packets arrive at the queue and the value of probNative when they do so.
Thus, during the fifth burst, they all accumulate a lower score than in the fourth because the queuing delay is not as excessive.</t> </section> <section anchor="qp_rationale_aging" numbered="true" toc="default"> <name>Rationale for Constant Aging of the Queuing Score</name> <t>Even well-behaved flows will not always be able to respond fast enough to dynamic events. Also, well-behaved flows, e.g., Data Center TCP (DCTCP) <xref target="RFC8257" format="default"/>, TCP Prague <xref target="I-D.briscoe-iccrg-prague-congestion-control" format="default"/>, Bottleneck Bandwidth and Round-trip propagation time version 3 (BBRv3) <xref target="BBRv3" format="default"/>, or the L4S variant of SCReAM <xref target="SCReAM" format="default"/> for real-time media <xref target="RFC8298" format="default"/>, can maintain a very shallow queue by continual careful probing for more while also continually subtracting a little from their rate (or congestion window) in response to low levels of ECN signalling. Therefore, the QProt algorithm needs to continually offer a degree of forgiveness to age out the queuing score as it accumulates.</t> <t>Scalable congestion controllers, such as those above, maintain their congestion window in inverse proportion to the congestion level, probNative. That leads to the important property that, on average, a scalable flow holds the product of its congestion window and the congestion level constant, no matter the capacity of the link or how many other flows it competes with. For instance, if the link capacity doubles, a scalable flow induces half the congestion probability. Or, if three scalable flows compete for the capacity, each flow will reduce to one-third of the capacity it would use on its own and increase the congestion level by 3x.
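</t> <t>The following worked figures (invented purely for illustration) show this invariance for a hypothetical scalable controller that targets a constant two congestion signals per round trip, i.e., cwnd * probNative ~= 2:</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative numbers only; cwnd in packets per round trip
// 1 flow,  capacity C   : cwnd = 100, probNative = 2% -> product 2
// 1 flow,  capacity 2*C : cwnd = 200, probNative = 1% -> product 2
// 3 flows, capacity C   : cwnd =  33, probNative = 6% -> product 2
]]></sourcecode> <t>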
Therefore, in steady state, a scalable flow will induce the same constant amount of "congested-bytes" per round trip, whatever the link capacity and no matter how many flows are sharing the capacity.</t> <t>This suggests that the QProt algorithm will not sanction a well-behaved scalable flow if it ages out the queuing score at a sufficient constant rate. The constant will need to be somewhat above the average of a well-behaved scalable flow to allow for normal dynamics.</t> <t>Relating QProt's aging constant to a scalable flow does not mean that a flow has to behave like a scalable flow: it can be less aggressive but not more aggressive. For instance, a longer RTT flow can run at a lower congestion-rate than the aging rate, but it can also increase its aggressiveness to equal the rate of short RTT scalable flows <xref target="ScalingCC" format="default"/>. The constant aging of QProt also means that a long-running unresponsive flow will be prone to trigger QProt if it runs faster than a competing responsive scalable flow would. And, of course, if a flow causes excessive queuing in the short term, its queuing score will still rise faster than the constant aging process will decrease it. Then, QProt will still eject the flow's packets before they harm the low latency of the shared queue.</t> </section> <section anchor="qp_rationale_normalize" numbered="true" toc="default"> <name>Rationale for Transformed Queuing Score</name> <t>The QProt algorithm holds a flow's queuing score state in a structure called a "bucket".
This is because of its similarity to a classic leaky bucket (except that the contents of the bucket do not represent bytes).</t> <figure anchor="qp_fig_qscore_normalize"> <name>Transformation of Queuing Score</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 probNative * pkt_sz          probNative * pkt_sz / AGING
          |                             |
          V                             V
       |     |                       |     |
       |  :  |                       |  :  |
       | ___ |                       | ___ |
       |_____|                       |_____|
       |__ __|                       |__ __|
          |                             |
          V                             V
      AGING * Dt                        Dt]]></artwork> </figure> <t>The accumulation and aging of the queuing score is shown on the left of <xref target="qp_fig_qscore_normalize" format="default"/> in token bucket form. Dt is the difference between the times when the scores of the current and previous packets were processed.</t> <t>A transformed equivalent of this token bucket is shown on the right of <xref target="qp_fig_qscore_normalize" format="default"/>, dividing both the input and output by the constant AGING rate. The result is a bucket-depth that represents time, and it drains at the rate that time passes.</t> <t>As a further optimization, the time the bucket was last updated is not stored with the flow-state. Instead, when the bucket is initialized, the queuing score is added to the system time "now" and the resulting expiry time is written into the bucket. Subsequently, if the bucket has not expired, the incremental queuing score is added to the time already held in the bucket. Then, the queuing score always represents the expiry time of the flow-state itself. This means that the queuing score does not need to be aged explicitly because it ages itself implicitly.</t> </section> <section anchor="qp_rationale_conditions" numbered="true" toc="default"> <name>Rationale for Policy Conditions</name> <t>Pseudocode for the QProt policy conditions is given in <xref target="qp_header_file" format="default"/> within the second half of the qprotect() function.
When each packet arrives, after finding its flow state and updating the queuing score of the packet's flow, the algorithm checks whether the shared queue delay exceeds a constant threshold CRITICALqL (e.g., 2 ms), as repeated below for convenience:</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
  if ( ( qdelay > CRITICALqL )  // Test if qdelay over threshold...
         // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
         // ...exceeds CRITICALqLSCORE
       && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
  // Recall that CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE
]]></sourcecode> <t>If the queue delay threshold is exceeded, the flow's queuing score is temporarily scaled up by the ratio of the current queue delay to the threshold queuing delay, CRITICALqL (the reason for the scaling is given next). If this scaled-up score exceeds another constant threshold CRITICALqLSCORE, the packet is ejected. The actual last line of code above multiplies both sides of the second condition by CRITICALqL to avoid a costly division.</t> <t>This approach allows each packet to be assessed once, as it arrives. Once queue delay exceeds the threshold, it has two implications:</t> <ul spacing="normal"> <li> <t>The current packet might be ejected, even though there are packets already in the queue from flows with higher queuing scores. However, any flow that continues to contribute to the queue will have to send further packets, giving an opportunity to eject them as well, as they subsequently arrive.</t> </li> <li> <t>The next packets to arrive might not be ejected because they might belong to flows with low queuing scores. In this case, queue delay could continue to rise with no opportunity to eject a packet. This is why the queuing score is scaled up by the current queue delay.
Then, the more the queue has grown without ejecting a packet, the more the algorithm "raises the bar" to further packets.</t> </li> </ul> <t>The above approach is preferred over the extra per-packet processing cost of searching the buckets for the flow with the highest queuing score and searching the queue for one of its packets to eject (if one is still in the queue).</t> <t>Note that, by default, CRITICALqL_us is set to the maximum threshold of the ramp marking algorithm, MAXTH_us. However, there is some debate as to whether setting it to the minimum threshold instead would improve QProt performance. This would roughly double the ratio of qdelay to CRITICALqL, which is compared against the CRITICALqLSCORE threshold. So, the threshold would have to be roughly doubled accordingly.</t> <t><xref target="qp_fig_policy_conditions" format="default"/> explains this approach graphically. On the horizontal axis, it shows actual harm, meaning the queuing delay in the shared queue. On the vertical axis, it shows the behavior record of the flow associated with the currently arriving packet, represented in the algorithm by the flow's queuing score.
The shaded region represents the combination of actual harm and behavior record that will lead to the packet being ejected.</t> <figure anchor="qp_fig_policy_conditions"> <name>Graphical Explanation of the Policy Conditions</name> <artwork name="" type="" align="left" alt=""><![CDATA[
Behavior Record: Queuing Score of Arriving Packet's Flow
 ^
 |           + |/ / / / / / / / / / / / / / / / / / /
 |         + N | / / / / / / / / / / / / / / / / / / /
 |           + |/ / / / / / / / / /
 |         +   | / / / /  E (Eject packet)  / / / / /
 |           + |/ / / / / / / / / /
 |         +   | / / / / / / / / / / / / / / / / / / /
 |           + |/ / / / / / / / / / / / / / / / / / /
 |            +| / / / / / / / / / / / / / / / / / / /
 |             |+ / / / / / / / / / / / / / / / / / /
 |     N       | + / / / / / / / / / / / / / / / / /
 | (No actual  |  +/ / / / / / / / / / / / / / /
 |    harm)    |    + / / / / / / / / / / / /
 |             | P (Pass over) + ,/ / / / / / / /
 |             | ^                + /./ /_/
 +-------------+-------------------------------------------->
          CRITICALqL        Actual Harm: Shared Queue Delay]]></artwork> </figure> <t>The regions labelled "N" represent cases where the first condition is not met (no actual harm); queue delay is below the critical threshold, CRITICALqL.</t> <t>The region labelled "E" represents cases where there is actual harm (queue delay exceeds CRITICALqL) and the queuing score associated with the arriving packet is high enough to be able to eject it with certainty.</t> <t>The region labelled "P" represents cases where there is actual harm, but the queuing score of the arriving packet is insufficient to eject it, so it has to be passed over. This adds to queuing delay, but the alternative would be to sanction an innocent flow.
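</t> <t>A worked example might help (the values are invented for illustration and are not defaults from the DOCSIS specifications). Suppose CRITICALqL = 2 ms and CRITICALqLSCORE = 4 ms, so that CRITICALqLPRODUCT = 8 ms^2:</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative values only
// CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE = 2 * 4 = 8
// qdelay = 2.5, qLscore = 4: 2.5 * 4 = 10  > 8 -> eject (E)
// qdelay = 2.5, qLscore = 3: 2.5 * 3 = 7.5 < 8 -> pass over (P)
// qdelay = 4,   qLscore = 3: 4 * 3   = 12  > 8 -> eject (E)
// A score of 3 is passed over at 2.5 ms of queue but ejected at
// 4 ms: the deeper queue has "raised the bar" to further packets
]]></sourcecode> <t>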
It can be seen that, as actual harm increases, the judgment of innocence becomes increasingly stringent; the behavior record of the next packet's flow does not have to be as bad to eject it.</t> <t>Conditioning ejection on actual harm helps prevent VPN packets from being ejected unnecessarily. VPNs consisting of multiple flows can tend to accumulate queuing score faster than it is aged out because the aging rate is intended for a single flow. However, whether or not some traffic is in a VPN, the queue delay threshold (CRITICALqL) will be no more likely to be exceeded. So, conditioning ejection on actual harm helps reduce the chance that VPN traffic will be ejected by the QProt function.</t> </section> <section anchor="qp_rationale_reclassify" numbered="true" toc="default"> <name>Rationale for Reclassification as the Policy Action</name> <t>When the DOCSIS QProt algorithm deems that it is necessary to eject a packet to protect the Low-Latency queue, it redirects the packet to the Classic queue. In the Low-Latency DOCSIS architecture (as in Coupled DualQ AQMs generally), the Classic queue is expected to frequently have a larger backlog of packets, which is caused by classic congestion controllers interacting with a classic AQM (which has a latency target of 10 ms) as well as other bursty traffic.</t> <t>Therefore, typically, an ejected packet will experience higher queuing delay than it would otherwise, and it could be reordered within its flow (assuming QProt does not eject all packets of an anomalous flow). The mild harm caused to the performance of the ejected packet's flow is deliberate.
It gives senders a slight incentive to identify their packets correctly.</t> <t>If there were no such harm, there would be nothing to prevent all flows from identifying themselves as suitable for classification into the low-latency queue and just letting QProt sort the resulting aggregate into queue-building and non-queue-building flows. This might seem like a useful alternative to requiring senders to correctly identify their flows. However, handling of mis-classified flows is not without a cost. The more packets that have to be reclassified, the more often the delay of the low-latency queue would exceed the threshold. Also, more memory would be required to hold the extra flow state.</t> <t>When a packet is redirected into the Classic queue, an operator might want to alter the identifier(s) that originally caused it to be classified into the Low-Latency queue so that the packet will not be classified into another low-latency queue further downstream. However, redirection of occasional packets can be due to unusually high transient load just at the specific bottleneck, not necessarily at any other bottleneck and not necessarily due to bad flow behavior. Therefore, <xref target="RFC9331" section="5.4.1.2"/> precludes a network node from altering the end-to-end ECN field to exclude traffic from L4S treatment. Instead, a local-use identifier ought to be used (e.g., Diffserv Codepoint or VLAN tag) so that each operator can apply its own policy, without prejudging what other operators ought to do.</t> <t>Although not supported in the DOCSIS specifications, QProt could be extended to recognize that large numbers of redirected packets belong to the same flow. This might be detected when the bucket expiry time t_exp exceeds a threshold.
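</t> <t>For instance, such a detection check might be sketched as follows (a hypothetical extension; the constant PERSIST_THRESH and the helper function are invented here and are not part of the DOCSIS specifications):</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Hypothetical sketch only; not part of the DOCSIS specifications
// Run after fill_bucket() has updated the flow's expiry time
if (buckets[bckt_id].t_exp - now > PERSIST_THRESH) {
  // Queuing score has stayed high for a long period; policy
  // might now treat the whole flow as a persistent offender
  flag_persistent_offender(packet.uflow);
}
]]></sourcecode> <t>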
Depending on policy and implementation capabilities, QProt could then install a classifier to redirect a whole flow into the Classic queue, with an idle timeout to remove stale classifiers. In these "persistent offender" cases, QProt might also overwrite each redirected packet's DSCP or clear its ECN field to Not-ECT, in order to protect other potential L4S queues downstream. The DOCSIS specifications do not discuss sanctioning whole flows; further discussion is beyond the scope of the present document.</t> </section> </section> <section anchor="qp_limitations" numbered="true" toc="default"> <name>Limitations</name> <t>The QProt algorithm groups packets with common Layer 4 flow identifiers. It then uses this grouping to accumulate queuing scores and to sanction packets.</t> <t>This choice of identifier for grouping is pragmatic, with no scientific basis. All the packets of a flow certainly pass between the same two endpoints. However, some applications might initiate multiple flows between the same endpoints, e.g., for media, control, data, etc. Others might use common flow identifiers for all these streams. Also, a user might group multiple application flows within the same encrypted VPN between the same Layer 4 tunnel endpoints.
And, even if there were a one-to-one mapping between flows and applications, there is no reason to believe that the rate at which congestion can be caused ought to be allocated on a per-application-flow basis.</t> <t>The use of a queuing score that excludes those aspects of flow rate that do not contribute to queuing (<xref target="qp_rationale_not_throughput" format="default"/>) goes some way to mitigating this limitation because the algorithm does not judge responsibility for queuing delay primarily on the combined rate of a set of flows grouped under one flow ID.</t> </section> <section anchor="l4sds_IANA" numbered="true" toc="default"> <name>IANA Considerations</name> <t>This document has no IANA actions.</t> </section> <section anchor="qp_impl_status" numbered="true" toc="default"> <name>Implementation Status</name> <table align="center"> <thead> <tr> <th align="left">Implementation name:</th> <th align="left">DOCSIS models for ns-3</th> </tr> </thead> <tbody> <tr> <td align="left">Organization</td> <td align="left">CableLabs</td> </tr> <tr> <td align="left">Web page</td> <td align="left">https://apps.nsnam.org/app/docsis-ns3/</td> </tr> <tr> <td align="left">Description</td> <td align="left">ns-3 simulation models developed and used in support of the Low-Latency DOCSIS development, including models of Dual Queue Coupled AQM, Queue Protection, and the DOCSIS MAC</td> </tr> <tr> <td align="left">Maturity</td> <td align="left">Simulation models that can also be used in emulation mode in a testbed context</td> </tr> <tr> <td align="left">Coverage</td> <td align="left">Complete implementation of Annex P of DOCSIS 3.1</td> </tr> <tr> <td align="left">Version</td> <td align="left">DOCSIS 3.1, version I21; https://www.cablelabs.com/specifications/CM-SP-MULPIv3.1?v=I21</td> </tr> <tr> <td align="left">Licence</td> <td align="left">GPLv2</td> </tr> <tr> <td align="left">Contact</td> <td align="left">via web page</td> </tr> <tr> <td align="left">Last Implementation Update</td> <td
align="left">Mar 2022</td> </tr> <tr> <td align="left">Information valid at</td> <td align="left">7 Mar 2022</td> </tr> </tbody> </table> <t>There are also a number of closed source implementations, including two cable modem implementations written by different chipset manufacturers and several CMTS implementations by other manufacturers. These, as well as the ns-3 implementation, have passed the full suite of compliance tests developed by CableLabs.</t> </section> <section anchor="l4sds_Security_Considerations" numbered="true" toc="default"> <name>Security Considerations</name> <t>The whole of this document concerns traffic security. It considers the security question of how to identify and eject traffic that does not comply with the non-queue-building behavior required to use a shared low-latency queue, whether accidentally or maliciously.</t> <t><xref target="RFC9330" section="8.2"/> (the L4S architecture) introduces the problem of maintaining low latency by either self-restraint or enforcement and places DOCSIS queue protection in context within a wider set of approaches to the problem.</t> <section anchor="qp_resource_exhaust" numbered="true" toc="default"> <name>Resource Exhaustion Attacks</name> <t>The algorithm has been designed to fail gracefully in the face of traffic crafted to overrun the resources used for the algorithm's own processing and flow state. This means that non-queue-building flows will always be less likely to be sanctioned than queue-building flows.
But an attack could be contrived to deplete resources in such a way that the proportion of innocent (non-queue-building) flows that are incorrectly sanctioned could increase.</t>
<t>Incorrect sanctioning is intended not to be catastrophic; it results in more packets from well-behaved flows being redirected into the Classic queue, which introduces more reordering into innocent flows.</t>
<section anchor="qp_flow-state_exhaust" numbered="true" toc="default">
<name>Exhausting Flow-State Storage</name>
<t>To exhaust the number of buckets, the most efficient attack is to send enough long-running attack flows to increase the chance that an arriving flow will not find an available bucket and will, therefore, have to share the "dregs" bucket. For instance, if ATTEMPTS=2 and NBUCKETS=32, it requires about 94 attack flows, each using different port numbers, to increase the probability to 99% that an arriving flow will have to share the dregs, where it will share a high degree of redirection into the C queue with the remainder of the attack flows.</t>
<t>For an attacker to keep buckets busy, it is more efficient to hold onto them by cycling regularly through a set of port numbers (94 in the above example) rather than to keep occupying and releasing buckets with single-packet flows across a much larger number of ports.</t>
<t>During such an attack, the coupled marking probability will have saturated at 100%. So, to hold a bucket, the rate of an attack flow needs to be no less than the AGING rate of each bucket: 4 Mb/s by default. However, for an attack flow to be sure to hold on to its bucket, it would need to send somewhat faster. Thus, an attack with 100 flows would need a total force of more than 100 * 4 Mb/s = 400 Mb/s.</t>
<t>This attack can be mitigated (but not prevented) by increasing the number of buckets. The required attack force scales linearly with the number of buckets, NBUCKETS.
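As an illustration, the figures above can be reproduced with a short Monte Carlo sketch. This is a simplified model, not the DOCSIS hash: it assumes each flow draws ATTEMPTS independent, uniformly distributed bucket indices and claims the first free one, and the helper names below are hypothetical.

```python
import random

def dregs_probability(attack_flows, nbuckets=32, attempts=2, trials=5000):
    """Estimate the chance that a newly arriving flow finds all of its
    hashed bucket choices occupied and so must share the 'dregs' bucket.
    Simplified model: each flow draws `attempts` independent uniform
    bucket indices and claims the first free one."""
    hits = 0
    for _ in range(trials):
        occupied = [False] * nbuckets
        for _ in range(attack_flows):
            for b in random.choices(range(nbuckets), k=attempts):
                if not occupied[b]:
                    occupied[b] = True
                    break
        # A fresh arrival shares the dregs if every one of its choices
        # lands on an occupied bucket.
        if all(occupied[b] for b in random.choices(range(nbuckets), k=attempts)):
            hits += 1
    return hits / trials

AGING_RATE_MBPS = 4  # default per-bucket AGING rate quoted above

# With ATTEMPTS=2 and NBUCKETS=32, roughly 94 attack flows drive the dregs
# probability close to 99%, and holding those buckets costs the attacker
# on the order of 94 * 4 Mb/s of sustained traffic.
print(dregs_probability(94))
print(94 * AGING_RATE_MBPS, "Mb/s")
```

In this sketch, doubling `nbuckets` to 64 roughly doubles the number of flows (and hence the total bit rate) needed to reach the same dregs probability, matching the linear-scaling argument in the text.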
So, if NBUCKETS were doubled to 64, twice as many 4 Mb/s flows would be needed to maintain the same impact on innocent flows.</t>
<t>Probably the most effective mitigation would be to implement redirection of whole flows once enough of the individual packets of a certain offending flow had been redirected. This would free up the buckets used to maintain the per-packet queuing score of persistent offenders. Whole-flow redirection is outside the scope of the current version of the QProt algorithm specified here, but it is briefly discussed at the end of <xref target="qp_rationale_reclassify" format="default"/>.</t>
<t>It might be considered that all the packets of persistently offending flows ought to be discarded rather than redirected. However, this is not recommended because attack flows might be able to pervert whole-flow discard, turning it onto at least some innocent flows, thus amplifying an attack that causes reordering into total deletion of some innocent flows.</t>
</section>
<section anchor="qp_proc_exhaust" numbered="true" toc="default">
<name>Exhausting Processing Resources</name>
<t>The processing time needed to apply the QProt algorithm to each LL packet is small and intended not to take all the time available between each of a run of fairly small packets. However, an attack could use minimum-sized packets launched from multiple input interfaces into a lower capacity output interface. Whether the QProt algorithm is vulnerable to processor exhaustion will depend on the specific implementation.</t>
<t>Addition of a capability to redirect persistently offending flows from LL to C would be the most effective way to reduce the per-packet processing cost of the QProt algorithm when under attack. As already mentioned in <xref target="qp_flow-state_exhaust" format="default"/>, this would also be an effective way to mitigate flow-state exhaustion attacks.
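To put rough numbers on the minimum-sized-packet concern above: the per-packet processing budget shrinks linearly with packet size. The link rates and Ethernet framing figures below are illustrative assumptions, not values from the DOCSIS specification.

```python
MIN_FRAME_BYTES = 64       # illustrative minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20   # preamble + inter-frame gap on the wire

def packet_budget_ns(link_gbps):
    """Nanoseconds available to process each packet if minimum-size
    frames arrive back to back at the given line rate."""
    bits_per_packet = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    # seconds per packet, converted to nanoseconds
    return bits_per_packet / (link_gbps * 1e9) * 1e9

print(round(packet_budget_ns(1)))   # 672 ns per packet at 1 Gb/s
print(round(packet_budget_ns(10)))  # 67 ns per packet at 10 Gb/s
```

Several input interfaces feeding one lower-capacity output interface multiply the arrival rate accordingly, which is why per-packet processing cost, not just aggregate throughput, determines whether an implementation is exposed.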
Further discussion of whole-flow redirection is outside the scope of the present document but is briefly discussed at the end of <xref target="qp_rationale_reclassify" format="default"/>.</t>
</section>
</section>
</section>
<section numbered="true" toc="default">
<name>Comments Solicited</name>
<t>Evaluation and assessment of the algorithm by researchers is solicited. Comments and questions are also encouraged and welcome. They can be addressed to the authors.</t>
</section>
</middle>
<!-- ***** BACK MATTER ***** -->
<back>
<displayreference target="I-D.briscoe-iccrg-prague-congestion-control" to="PRAGUE-CC"/>
<references>
<name>References</name>
<references>
<name>Normative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8311.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9331.xml"/>
<!-- [I-D.ietf-tsvwg-nqb] -> [RFC9956] companion doc; RFC 9956 = draft-ietf-tsvwg-nqb-33 -->
<reference anchor="RFC9956" target="https://www.rfc-editor.org/info/rfc9956">
<front>
<title>A Non-Queue-Building Per-Hop Behavior (NQB PHB) for Differentiated Services</title>
<author initials="G." surname="White" fullname="Greg White">
<organization>CableLabs</organization>
</author>
<author initials="T." surname="Fossati" fullname="Thomas Fossati">
<organization>Linaro</organization>
</author>
<author initials="R." surname="Geib" fullname="Ruediger Geib">
<organization>Deutsche Telekom</organization>
</author>
<date month="April" year="2026"/>
</front>
<seriesInfo name="RFC" value="9956"/>
<seriesInfo name="DOI" value="10.17487/RFC9956"/>
</reference>
<!-- [rfced] Please review the following and let us know if any further updates are necessary: The original URLs for [DOCSIS], [DOCSIS-CCAP-OSS], and [DOCSIS-CM-OSS] resolved to a blank search results page. We found more-direct URLs for these CableLabs specifications and updated the references accordingly. Note that we also updated the date for [DOCSIS-CCAP-OSS] from "21 January 2019" to "7 February 2019" to match the information provided at that URL. -->
<reference anchor="DOCSIS" target="https://www.cablelabs.com/specifications/CM-SP-MULPIv3.1">
<front>
<title>MAC and Upper Layer Protocols Interface (MULPI) Specification, CM-SP-MULPIv3.1</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="21" month="January" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I17 or later"/>
</reference>
<reference anchor="DOCSIS-CM-OSS" target="https://www.cablelabs.com/specifications/CM-SP-CM-OSSIv3.1">
<front>
<title>Cable Modem Operations Support System Interface Specification</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="21" month="January" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I14 or later"/>
</reference>
<reference anchor="DOCSIS-CCAP-OSS" target="https://www.cablelabs.com/specifications/CM-SP-CCAP-OSSIv3.1">
<front>
<title>CCAP Operations Support System Interface Specification</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="7" month="February" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I14 or later"/>
</reference>
</references>
<references>
<name>Informative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4303.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6789.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7713.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8257.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8298.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9332.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9330.xml"/>
<!-- [I-D.briscoe-iccrg-prague-congestion-control] draft-briscoe-iccrg-prague-congestion-control-04; IESG State: Expired as of 1/5/25 -->
<xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.briscoe-iccrg-prague-congestion-control.xml"/>
<reference anchor="LLD" target="https://cablela.bs/low-latency-docsis-technology-overview-february-2019">
<front>
<title>Low Latency DOCSIS: Technology Overview</title>
<author fullname="Greg White" initials="G." surname="White">
<organization>CableLabs</organization>
</author>
<author fullname="Karthik Sundaresan" initials="K." surname="Sundaresan">
<organization>CableLabs</organization>
</author>
<author fullname="Bob Briscoe" initials="B." surname="Briscoe">
<organization>CableLabs</organization>
</author>
<date month="February" year="2019"/>
</front>
<refcontent>CableLabs White Paper</refcontent>
</reference>
<reference anchor="ScalingCC" target="https://arxiv.org/abs/1904.07605">
<front>
<title>Resolving Tensions between Congestion Control Scaling Requirements</title>
<author fullname="Bob Briscoe" initials="B." surname="Briscoe">
<organization>Simula Research Lab</organization>
</author>
<author fullname="Koen De Schepper" initials="K." surname="De Schepper">
<organization>Nokia Bell Labs</organization>
</author>
<date month="July" year="2017"/>
</front>
<seriesInfo name="Simula Technical Report" value="TR-CS-2016-001"/>
<seriesInfo name="DOI" value="10.48550/arXiv.1904.07605"/>
<refcontent>arXiv:1904.07605</refcontent>
</reference>
<!-- [rfced] FYI: We updated the [BBRv3] and [SCReAM] references to match current style guidance for references to web-based public code repositories: https://www.rfc-editor.org/styleguide/part2/#ref_repo -->
<reference anchor="BBRv3" target="https://github.com/google/bbr/blob/v3/README.md">
<front>
<title>TCP BBR v3 Release</title>
<author/>
<date day="18" month="March" year="2025"/>
</front>
<refcontent>commit 90210de</refcontent>
</reference>
<reference anchor="SCReAM" target="https://github.com/EricssonResearch/scream/blob/master/README.md">
<front>
<title>SCReAM</title>
<author/>
<date day="10" month="November" year="2025"/>
</front>
<refcontent>commit 0208f59</refcontent>
</reference>
</references>
</references>
<section numbered="false" toc="default">
<name>Acknowledgements</name>
<t>Thanks to <contact fullname="Tom Henderson"/>, <contact fullname="Magnus Westerlund"/>, <contact fullname="David Black"/>, <contact fullname="Adrian Farrel"/>, and <contact fullname="Gorry Fairhurst"/> for their reviews of this document. The design of the QProt algorithm and the settings of the parameters benefited from discussion and critique from the participants of the cable industry working group on Low-Latency DOCSIS. CableLabs funded <contact fullname="Bob Briscoe"/>'s initial work on this document.</t>
</section>
<!--[rfced] We had the following questions related to terminology used throughout the document:

a) Several sections use "the algorithm" in an opening statement while other sections say "The QProt algorithm". Would it be easier for the reader to call it "The QProt algorithm" in first mentions in a section (and use "the algorithm" thereafter in the section)? Thinking of readers that may not read the entire RFC, but instead jump to a section from a reference link.

b) We have updated to use the form on the right throughout. Please let us know any objections.

IPSec / IPsec (to match RFC 4303)
flow-ID / flow ID

c) How may we make the following terms consistent throughout?

Congestion-rate vs. congestion-rate
Coupled DualQ AQM vs. Dual Queue Coupled AQM (companion uses "IETF's Coupled DualQ AQM")
Diffserv Codepoint vs. Diffserv codepoint (companion uses Diffserv Code Point and Differentiated Services Code Point)
flow state vs. flow-state
Native vs. native vs. "Native"
per-flow-state vs. per-flow state
queue protection vs. Queue Protection
-->
<!--[rfced] We had the following questions related to abbreviations used throughout the document:

a) FYI - We have added expansions for abbreviations upon first use per Section 3.6 of RFC 7322 ("RFC Style Guide"). Please review each expansion in the document carefully to ensure correctness.

b) We see that the companion document (draft-ietf-tsvwg-nqb-33) uses the following abbreviations:

NQB - Non-Queue-Building
QB - Queue-Building

We see that this document only uses NQB when mentioning the Diffserv codepoint. Can NQB be introduced earlier in the document and be used to refer to the general concept?

c) We see that [DOCSIS] uses "Queue Protection" rather than "queue protection". We see both the capped and lowercase versions used in this document. May we update to simply QProt (after first expansion) when referring to the algorithm? And/or are there places where capping or lowercasing this term is necessary? If not, please let us know how we may make this consistent. Further, is it QProt algorithm or DOCSIS QProt algorithm?

d) FYI - We have updated the expansion of DOCSIS to use hyphenation (i.e., Data-Over-Cable) to match the use in [DOCSIS] and the companion document. Please let us know any objections.

e) How may we expand the following abbreviations? CE, MAC

f) We will update to use the abbreviated forms of the following after expansion on first use (per the guidance at https://www.rfc-editor.org/styleguide/part2/#exp_abbrev): LL, CM

g) We note that this document uses LL queue as an abbreviation for low-latency queue. However, we see RFC 9332 uses "low-latency (L) queue". Please review this discrepancy and let us know if any further updates are necessary. Further, please note that we have hyphenated low latency when it appears in the attributive position to match its use in RFCs 9330-9332.
-->
<!-- [rfced] Please review the "Inclusive Language" portion of the online Style Guide <https://www.rfc-editor.org/styleguide/part2/#inclusive_language> and let us know if any changes are needed. Updates of this nature typically result in more precise language, which is helpful for readers. For example, please consider whether the following should be updated: native

In addition, please consider whether uses of "tradition" should be updated for clarity. While the NIST website <https://web.archive.org/web/20250214092458/https://www.nist.gov/nist-research-library/nist-technical-series-publications-author-instructions#table1> indicates that this term is potentially biased, it is also ambiguous. "Tradition" is a subjective term, as it is not the same for everyone.
-->
</back>
</rfc>