<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rfc [
 <!ENTITY nbsp    "&#160;">
 <!ENTITY reg     "&#174;">
 <!ENTITY zwsp    "&#8203;">
 <!ENTITY nbhy    "&#8209;">
 <!ENTITY wj      "&#8288;">
]>

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-briscoe-docsis-q-protection-07" number="9957" ipr="trust200902" obsoletes="" updates="" submissionType="independent" xml:lang="en" tocInclude="true" tocDepth="4" symRefs="true" sortRefs="true" version="3">
<!-- xml2rfc v2v3 conversion 3.18.2 -->

<front>

<!--[rfced] May the "®" be removed from the title? We note that
previous RFCs with DOCSIS in the title do not use this. Also, on
this topic, the Chicago Manual of Style says that it is not
necessary in "publications that are not advertising or sales
materials".

Original:
   The DOCSIS® Queue Protection Algorithm to Preserve Low Latency

Suggested:
   The DOCSIS Queue Protection Algorithm to Preserve Low Latency
-->

<title abbrev="Queue Protection to Preserve Low Latency">The DOCSIS&reg; Queue Protection Algorithm to Preserve Low Latency</title>
<seriesInfo name="RFC" value="9957"/>
<author fullname="Bob Briscoe" initials="B." role="editor" surname="Briscoe">
  <organization>Independent</organization>
  <address>
    <postal>
      <street/>
      <country>United Kingdom</country>
    </postal>
    <email>ietf@bobbriscoe.net</email>
    <uri>https://bobbriscoe.net/</uri>
  </address>
</author>
<author fullname="Greg White" initials="G." surname="White">
  <organization>CableLabs</organization>
  <address>
    <postal>
      <street/>
      <country>United States of America</country>
    </postal>
    <email>G.White@CableLabs.com</email>
  </address>
</author>
<date month="April" year="2026"/>
<area>WIT</area>
<workgroup>tsvwg</workgroup>
<keyword>Independent Submission Stream</keyword>
<keyword>ISE</keyword>
<keyword>Latency</keyword>
<keyword>Policing</keyword>
<abstract>
<t>This Informational RFC explains the specification of the queue protection algorithm used in Data-Over-Cable Service Interface Specification (DOCSIS) technology since version 3.1.
A shared low-latency queue relies on the non-queue-building behavior of every traffic flow using it. However, some flows might not take such care, either accidentally or maliciously. If a queue is about to exceed a threshold level of delay, the queue protection algorithm can rapidly detect the flows most likely to be responsible. It can then prevent harm to other traffic in the low-latency queue by ejecting selected packets (or all packets) of these flows. This document is designed for four audiences: a) congestion control designers who need to understand how to keep on the "good" side of the algorithm; b) implementers of the algorithm who want to understand it in more depth; c) designers of algorithms with similar goals, perhaps for non-DOCSIS scenarios; and d) researchers interested in evaluating the algorithm.</t>
</abstract>
</front>

<middle>
<section anchor="qp_intro" numbered="true" toc="default">
<name>Introduction</name>
<t>This Informational RFC explains the specification of the queue protection (QProt) algorithm used in DOCSIS technology since version 3.1 <xref target="DOCSIS" format="default"/>.</t>
<t>Although the algorithm is defined in Annex P of <xref target="DOCSIS" format="default"/>, it relies on cross references to other parts of the set of specifications. This document pulls all the strands together into one self-contained document.
The core of the document is a similar pseudocode walk-through to that in the DOCSIS specification, but it also includes additional material:</t>
<ol spacing="normal" type="i">
<li>a brief overview,</li>
<li>a definition of how a data sender needs to behave to avoid triggering queue protection, and</li>
<li>a section giving the rationale for the design choices.</li>
</ol>
<t>Low queuing delay depends on hosts sending their data smoothly, either at a low rate or responding to Explicit Congestion Notifications (ECNs) (see <xref target="RFC8311" format="default"/> and <xref target="RFC9331" format="default"/>). So, low-latency queuing is something hosts create themselves, not something the network gives them. This tends to ensure that self-interest alone does not drive flows to mis-mark their packets for the low-latency queue. However, traffic from an application that does not behave in a non-queue-building way might erroneously be classified into a low-latency queue, whether accidentally or maliciously. QProt protects other traffic in the low-latency queue from the harm due to excess queuing that would otherwise be caused by such anomalous behavior.</t>
<t>In normal scenarios without misclassified traffic, QProt is not expected to intervene at all in the classification or forwarding of packets.</t>
<t>An overview of how low-latency support has been added to DOCSIS technology is given in <xref target="LLD" format="default"/>. In each direction of a DOCSIS link (upstream and downstream), there are two queues: one for Low-Latency (LL) and one for Classic traffic, in an arrangement similar to the IETF's Coupled DualQ Active Queue Management (AQM) <xref target="RFC9332" format="default"/>.
The two queues enable a transition from "Classic" to "Scalable" congestion control so that low latency can become the norm for any application, including ones seeking both full throughput and low latency, not just low-rate applications that have been more traditionally associated with a low-latency service.

<!--[rfced] We note that the companion document RFC-to-be 9956
(draft-ietf-tsvwg-nab-33) cites more information for Reno and Cubic.
Should citations be added here as well for the ease of the reader?

Original:
   The Classic queue is only necessary for traffic such as
   traditional (Reno/Cubic) TCP that needs about a round trip of
   buffering to fully utilize the link, and therefore has no
   incentive to mismark itself as low latency.

Perhaps:
   The Classic queue is only necessary for traffic such as
   traditional (Reno [RFC5681] / Cubic [RFC9438]) TCP that needs
   about a round trip of buffering to fully utilize the link;
   therefore, this traffic has no incentive to mismark itself as
   low latency.
-->

The Classic queue is only necessary for traffic such as traditional (Reno/Cubic) TCP that needs about a round trip of buffering to fully utilize the link, and therefore has no incentive to mismark itself as low latency. The QProt function is located at the ingress to the Low-Latency queue. Therefore, in the upstream, QProt is located on the cable modem (CM); in the downstream, it is located on the CM Termination System (CMTS).
If an arriving packet triggers queue protection, the QProt algorithm ejects the packet from the Low-Latency queue and reclassifies it into the Classic queue.</t>
<t>If QProt is used in settings other than DOCSIS links, it would be a simple matter to detect queue-building flows by using slightly different conditions and/or to trigger a different action as a consequence, as appropriate for the scenario, e.g., dropping instead of reclassifying packets or perhaps accumulating a second per-flow score to decide whether to redirect a whole flow rather than just certain packets. Such work is for future study and out of scope of the present document.</t>
<t>The QProt algorithm is based on a rigorous approach to quantifying how much each flow contributes to congestion, which is used in economics to allocate responsibility for the cost of one party's behavior on others (the economic externality). Another important feature of the approach is that the metric used for the queuing score is based on the same variable that determines the level of ECN signalling seen by the sender (see <xref target="RFC8311" format="default"/> and <xref target="RFC9331" format="default"/>). This makes the internal queuing score visible externally as ECN markings. This transparency is necessary to be able to objectively state (in <xref target="qp_nec_flow_behavior" format="default"/>) how a flow can keep on the "good" side of the algorithm.</t>
<section numbered="true" toc="default">
<name>Document Roadmap</name>
<t>The core of the document is the walk-through of the DOCSIS QProt algorithm's pseudocode in <xref target="qp_walk-through" format="default"/>.</t>
<t>Prior to that, <xref target="qp_approach" format="default"/> summarizes the approach used in the algorithm.
Then, <xref target="qp_nec_flow_behavior" format="default"/> considers QProt from the perspective of the end-system by defining the behavior that a flow needs to comply with to avoid the QProt algorithm ejecting its packets from the low-latency queue.</t>
<t><xref target="qp_rationale" format="default"/> gives deeper insight into the principles and rationale behind the algorithm. Then, <xref target="qp_limitations" format="default"/> explains the limitations of the approach. The usual closing sections follow.</t>
</section>
<section anchor="l4sds_Terminology" numbered="true" toc="default">
<name>Terminology</name>
<t>The normative language for the DOCSIS QProt algorithm is in the DOCSIS specifications <xref target="DOCSIS" format="default"/>, <xref target="DOCSIS-CM-OSS" format="default"/>, and <xref target="DOCSIS-CCAP-OSS" format="default"/>: not in this Informational RFC. If there is any inconsistency, the DOCSIS specifications take precedence.</t>
<t>The following terms and abbreviations are used:</t>
<dl newline="false" spacing="normal">
<dt>CM:</dt>
<dd>Cable Modem</dd>
<dt>CMTS:</dt>
<dd>CM Termination System</dd>
<dt>Congestion-rate:</dt>
<dd>The transmission rate of bits or bytes contained within packets of a flow that have the CE codepoint set in the IP-ECN field <xref target="RFC3168" format="default"/> (including IP headers unless specified otherwise). Congestion-bit-rate and congestion-volume were introduced in <xref target="RFC7713" format="default"/> and <xref target="RFC6789" format="default"/>.</dd>
<dt>DOCSIS:</dt>
<dd>Data-Over-Cable Service Interface Specification. "DOCSIS" is a registered trademark of Cable Television Laboratories, Inc.
("CableLabs").</dd>
<dt>Non-queue-building:</dt>
<dd>A flow that tends not to build a queue.</dd>
<dt>Queue-building:</dt>
<dd>A flow that builds a queue. If it is classified into the Low-Latency queue, it is therefore a candidate for the queue protection algorithm to detect and sanction.</dd>
<dt>ECN:</dt>
<dd>Explicit Congestion Notification</dd>
<dt>QProt:</dt>
<dd>The Queue Protection function</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>Copyright Material</name>
<t>Parts of this document are reproduced from <xref target="DOCSIS" format="default"/> with kind permission of the copyright holder, Cable Television Laboratories, Inc. ("CableLabs").</t>
</section>
</section>
<section anchor="qp_approach" numbered="true" toc="default">
<name>Approach (In Brief)</name>
<t>The algorithm is divided into mechanism and policy. There is only a tiny amount of policy code, but policy might need to be changed in the future. So, where hardware implementation is being considered, it would be advisable to implement the policy aspects in firmware or software:</t>
<ul spacing="normal">
<li><t>The mechanism aspects identify flows, maintain flow-state, and accumulate per-flow queuing scores;</t></li>
<li><t>The policy aspects can be divided into conditions and actions:</t>
<ul spacing="normal">
<li><t>The conditions are the logic that determines when action should be taken to avert the risk of queuing delay becoming excessive;</t></li>
<li><t>The actions determine how this risk is averted, e.g., by redirecting packets from a flow into another queue or by reclassifying a whole flow that seems to be misclassified.</t></li>
</ul>
</li>
</ul>
<section anchor="qp_approach_mechanism" numbered="true" toc="default">
<name>Mechanism</name>
<t>The algorithm maintains per-flow-state, where "flow" usually means an end-to-end (Layer 4) 5-tuple.
The flow-state consists of a queuing score that decays over time. Indeed, it is transformed into time units so that it represents the flow-state's own expiry time (explained in <xref target="qp_rationale_normalize" format="default"/>). A higher queuing score pushes out the expiry time further.</t>
<t>Non-queue-building flows tend to release their flow-state rapidly: it usually expires reasonably early in the gap between the packets of a normal flow. Then, the memory can be recycled for packets from other flows that arrive in-between. So, only queue-building flows hold flow state persistently.</t>
<t>The simplicity and effectiveness of the algorithm is due to the definition of the queuing score. The queuing score represents the share of blame for queuing that each flow bears. The scoring algorithm uses the same internal variable, probNative, that the AQM for the low-latency queue uses to ECN-mark packets. (The other two forms of marking, Classic and coupled, are driven by Classic traffic; therefore, they are not relevant to protection of the LL queue.) In this way, the queuing score accumulates the size of each arriving packet of a flow but scaled by the value of probNative (in the range 0 to 1) at the instant the packet arrives. So, a flow's score accumulates faster:</t>
<ul>
<li>the higher the degree of queuing and</li>
<li>the faster that the flow's packets arrive when there is queuing.</li>
</ul>
<t><xref target="qp_rationale_not_throughput" format="default"/> explains further why this score represents blame for queuing.</t>
<t>The algorithm, as described so far, would accumulate a number that would rise at the so-called congestion-rate of the flow (see <xref target="l4sds_Terminology" format="default"/>), i.e., the rate at which the flow is contributing to congestion or the rate at which the AQM is forwarding bytes of the flow that are ECN-marked. However, rather than growing continually, the queuing score is also reduced (or "aged") at a constant rate. This is because it is unavoidable for capacity-seeking flows to induce a continuous low level of congestion as they track available capacity. <xref target="qp_rationale_aging" format="default"/> explains why this allowance can be set to the same constant for any scalable flow, whatever its bit rate.</t>
<t>For implementation efficiency, the queuing score is transformed into time units; this is so that it represents the expiry time of the flow state (as already discussed above). Then, it does not need to be explicitly aged because the natural passage of time implicitly "ages" an expiry time. The transformation into time units simply involves dividing the queuing score of each packet by the constant aging rate (this is explained further in <xref target="qp_rationale_normalize" format="default"/>).</t>
</section>
<section anchor="qp_approach_policy" numbered="true" toc="default">
<name>Policy</name>
<section numbered="true" toc="default">
<name>Policy Conditions</name>
<t>The algorithm uses the queuing score to determine whether to eject each packet only at the time it first arrives. This limits the policies available.
For instance, when queuing delay exceeds a threshold, it is not possible to eject a packet from the flow with the highest queuing score because that would involve searching the queue for such a packet (if, indeed, one were still in the queue). Nonetheless, it is still possible to develop a policy that protects the low latency of the queue by making the queuing score threshold stricter the greater the excess of queuing delay relative to the threshold (this is explained in <xref target="qp_rationale_conditions" format="default"/>).</t>
</section>
<section numbered="true" toc="default">
<name>Policy Action</name>
<t>At the time of writing, the DOCSIS QProt specification states that when the policy conditions are met, the action taken to protect the low-latency queue is to reclassify a packet into the Classic queue (this is justified in <xref target="qp_rationale_reclassify" format="default"/>).</t>
</section>
</section>
</section>
<section anchor="qp_nec_flow_behavior" numbered="true" toc="default">
<name>Necessary Flow Behavior</name>
<t>The QProt algorithm described here can be used for responsive and/or unresponsive flows.</t>
<ul spacing="normal">
<li><t>It is possible to objectively describe the least responsive way that a flow will need to respond to congestion signals in order to avoid triggering queue protection, no matter the link capacity and no matter how much other traffic there is.</t></li>
<li><t>It is not possible to describe how fast or smooth an unresponsive flow should be to avoid queue protection because this depends on how much other traffic there is and the capacity of the link, which an application is unable to know.
However, the more smoothly an unresponsive flow paces its packets and the lower its rate relative to typical broadband link capacities, the less likely it is to cause enough queuing to trigger queue protection.</t></li>
</ul>
<t>Responsive low-latency flows can use a Low Latency, Low Loss, and Scalable throughput (L4S) ECN codepoint <xref target="RFC9331" format="default"/> to get classified into the low-latency queue.</t>
<t>A sender can arrange for flows that are smooth but do not respond to ECN marking to be classified into the low-latency queue by using the Non-Queue-Building (NQB) Diffserv codepoint <xref target="RFC9956" format="default"/>, which the DOCSIS specifications support, or an operator can use various other local classifiers.</t>
<t>As already explained in <xref target="qp_approach_mechanism" format="default"/>, the QProt algorithm is driven from the same variable that drives the ECN-marking probability in the low-latency or "LL" queue (the "Native" AQM of the LL queue is defined in the Immediate Active Queue Management Annex of <xref target="DOCSIS" format="default"/>). The algorithm that calculates this internal variable is run on the arrival of every packet, whether or not it is ECN-capable, so that it can be used by the QProt algorithm. But the variable is only used to ECN-mark packets that are ECN-capable.</t>
<t>Not only does this dual use of the variable improve processing efficiency, but it also makes the basis of the QProt algorithm visible and transparent, at least for responsive ECN-capable flows. Then, it is possible to state objectively that a flow can avoid triggering queue protection by keeping the bit rate of ECN-marked packets (the congestion-rate) below AGING, which is a configured constant of the algorithm (default 2^19 B/s ~= 4 Mb/s).
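</t>
<t>To make the default allowance concrete, the following worked example shows the arithmetic behind the approximation above (this example is illustrative only; it is not part of the DOCSIS pseudocode):</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Illustrative arithmetic only; not part of the DOCSIS algorithm.
// Default congestion-rate allowance, from LG_AGING = 19:
//   2^19 B/s = 524288 B/s
//            = 524288 * 8 b/s ~= 4.2 Mb/s
// Example flow: 1500 B packets, ECN-marked at 100 packet/s:
//   congestion-rate = 1500 * 100 = 150000 B/s << 524288 B/s
// So, on average, this flow's queuing score decays faster than it
// accumulates, and its flow-state keeps expiring.
]]></sourcecode>
<t>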
Note that it is in a congestion controller's own interest to keep its average congestion-rate well below this level (e.g., ~1 Mb/s) to ensure that it does not trigger queue protection during transient dynamics.</t>
<t>If the QProt algorithm is used in other settings, it would still need to be based on the visible level of congestion signalling, in a similar way to the DOCSIS approach. Without transparency of the basis of the algorithm's decisions, end-systems would not be able to avoid triggering queue protection on an objective basis.</t>
</section>
<section anchor="qp_walk-through" numbered="true" toc="default">
<name>Pseudocode Walk-Through</name>
<section anchor="qp_header_file" numbered="true" toc="default">
<name>Input Parameters, Constants, and Variables</name>
<t>The operator input parameters that set the parameters in the first two blocks of pseudocode below are defined for cable modems (CMs) in <xref target="DOCSIS-CM-OSS" format="default"/> and for CMTSs in <xref target="DOCSIS-CCAP-OSS" format="default"/>. Then, further constants are either derived from the input parameters or hard-coded.</t>
<t>Defaults and units are shown in square brackets. Defaults (or indeed any aspect of the algorithm) are subject to change, so the latest DOCSIS specifications are the definitive references. Also, any operator might set certain parameters to non-default values.</t>
<!--[rfced] FYI, "us" has been updated to "µs" in three instances
where it follows numerals in comments in the pseudocode. This is in
keeping with using µs for microseconds in RFC-to-be 9956.
Original:
   4000us
   1000us
   525 us

Current:
   4000 µs
   1000 µs
   525 µs
-->
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Input Parameters
MAX_RATE;          // Configured maximum sustained rate [b/s]
QPROTECT_ON;       // Queue Protection is enabled [Default: TRUE]
CRITICALqL_us;     // LL queue threshold delay [us] Default: MAXTH_us
CRITICALqLSCORE_us;// The threshold queuing score [Default: 4000 µs]
LG_AGING;          // The aging rate of the q'ing score [Default: 19]
                   // as log base 2 of the congestion-rate [lg(B/s)]

// Input Parameters for the calcProbNative() algorithm:
MAXTH_us;          // Max LL AQM marking threshold [Default: 1000 µs]
LG_RANGE;          // Log base 2 of the range of ramp [lg(ns)]
                   // Default: 2^19 = 524288 ns (roughly 525 µs)
]]></sourcecode>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Constants, either derived from input parameters or hard-coded
T_RES;             // Resolution of t_exp [ns]
// Convert units (approx)
AGING = pow(2, (LG_AGING-30) ) * T_RES;   // lg([B/s]) to [B/T_RES]
CRITICALqL = CRITICALqL_us * 1000;        // [us] to [ns]
CRITICALqLSCORE = CRITICALqLSCORE_us * 1000/T_RES; // [us] to [T_RES]
// Threshold for the q'ing score condition
CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE;
qLSCORE_MAX = 5E9 / T_RES;   // Max queuing score = 5 s
ATTEMPTS = 2;      // Max attempts to pick a bucket (vendor-specific)
BI_SIZE = 5;       // Bit-width of index number for non-default buckets
NBUCKETS = pow(2, BI_SIZE);  // No. of non-default buckets
MASK = NBUCKETS-1; // convenient constant, with BI_SIZE LSBs set

// Queue Protection exit states
EXIT_SUCCESS = 0;  // Forward the packet
EXIT_SANCTION = 1; // Redirect the packet

MAX_PROB = 1;      // For integer arithmetic, would use a large int
                   // e.g., 2^31, to allow space for overflow
MAXTH = MAXTH_us * 1000;     // Max marking threshold [ns]
MAX_FRAME_SIZE = 2000;       // DOCSIS-wide constant [B]
// Minimum marking threshold of 2 MTU for slow links [ns]
FLOOR = 2 * 8 * MAX_FRAME_SIZE * 10^9 / MAX_RATE;
RANGE = (1 << LG_RANGE);     // Range of ramp [ns]
MINTH = max ( MAXTH - RANGE, FLOOR);
MAXTH = MINTH + RANGE;       // Max marking threshold [ns]
]]></sourcecode>
<!--[rfced] Please review and rephrase the following sentence with
regard to the clause that begins "but in the floating..." as the
sentence does not seem to parse as is.

Original:
   The actual DOCSIS QProt algorithm is defined using integer
   arithmetic, but in the floating point arithmetic used in this
   document, (0 <= probNative <= 1).

Perhaps:
   The actual DOCSIS QProt algorithm is defined using integer
   arithmetic, but in the floating-point arithmetic used in this
   document, the native marking probability is between 0 and 1
   (inclusive), i.e., 0 <= probNative <= 1.
-->
<t>Throughout the pseudocode, most variables are integers. The only exceptions are floating-point variables representing probabilities (MAX_PROB and probNative) and the AGING parameter. The actual DOCSIS QProt algorithm is defined using integer arithmetic, but in the floating-point arithmetic used in this document, (0 &lt;= probNative &lt;= 1).
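</t>
<t>For orientation, the following sketch shows the general shape that the ramp constants above imply for calcProbNative(): a linear ramp from 0 to MAX_PROB as qdelay rises from MINTH to MAXTH. This is an illustrative assumption for this walk-through, not a reproduction of the definitive algorithm in the DOCSIS specification:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Illustrative sketch only; the definitive calcProbNative() is
// defined in the DOCSIS specification
calcProbNative(qdelay) {
    if (qdelay < MINTH)
        probNative = 0;
    else if (qdelay >= MAXTH)
        probNative = MAX_PROB;
    else   // linear ramp over RANGE, between MINTH and MAXTH
        probNative = MAX_PROB * (qdelay - MINTH) / RANGE;
    return probNative;
}
]]></sourcecode>
<t>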
Also, the pseudocode omits overflow checking, and it would need to be made robust to non-default input parameters.</t>
<t>The resolution for expressing time, T_RES, needs to be chosen to ensure that expiry times for buckets can represent times that are a fraction (e.g., 1/10) of the expected packet interarrival time for the system.</t>
<t>The following definitions explain the purpose of important variables and functions.</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Public variables:
qdelay;        // The current queuing delay of the LL queue [ns]
probNative;    // Native marking probability of LL queue within [0,1]

// External variables
packet;        // The structure holding packet header fields
packet.size;   // The size of the current packet [B]
packet.uflow;  // The flow identifier of the current packet
               // (e.g., 5-tuple or 4-tuple if IPsec)

// Irrelevant details of DOCSIS function to return qdelay are removed
qdelayL(...)   // Returns current delay of the low-latency Q [ns]
]]></sourcecode>
<t>Pseudocode for how the algorithm categorizes packets by flow ID to populate the variable packet.uflow is not given in detail here.
The application's flow ID is usually defined by a common 5-tuple (or 4-tuple) of:</t>
<ul spacing="normal">
<li><t>source and destination IP addresses of the innermost IP header found;</t></li>
<li><t>the protocol (IPv4) or next header (IPv6) field in this IP header;</t></li>
<li><t>either of:</t>
<ul spacing="normal">
<li><t>source and destination port numbers, for TCP, UDP, UDP-Lite, Stream Control Transmission Protocol (SCTP), Datagram Congestion Control Protocol (DCCP), etc.;</t></li>
<li><t>Security Parameters Index (SPI) for IPsec Encapsulating Security Payload (ESP) <xref target="RFC4303" format="default"/>.</t></li>
</ul>
</li>
</ul>
<t>The Microflow Classification section of the Queue Protection Annex of the DOCSIS specification <xref target="DOCSIS" format="default"/> defines various strategies to find these headers by skipping extension headers or encapsulations. If they cannot be found, the specification defines various less-specific 3-tuples that would be used.
The DOCSIS specification should be referred to for all these strategies, which will not be repeated here.</t>
<t>The array of bucket structures defined below is used by all the Queue Protection functions:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
struct bucket { // The leaky bucket structure to hold per-flow state
    id;         // identifier (e.g., 5-tuple) of flow using bucket
    t_exp;      // expiry time in units of T_RES
                // (t_exp - now) = flow's transformed q'ing score
};
struct bucket buckets[NBUCKETS+1];
]]></sourcecode>
</section>
<section anchor="qp_data_path" numbered="true" toc="default">
<name>Queue Protection Data Path</name>
<t>All the functions of Queue Protection operate on the data path, driven by packet arrivals.</t>
<t>The following functions that maintain per-flow queuing scores and manage per-flow state are considered primarily as mechanism:</t>
<ul spacing="normal">
<li><t>pick_bucket(uflow_id); // Returns bucket identifier</t></li>
<li><t>fill_bucket(bucket_id, pkt_size, probNative); // Returns queuing score</t></li>
<li><t>calcProbNative(qdelay) // Returns ECN-marking probability of the native LL AQM</t></li>
</ul>
<t>The following function is primarily concerned with policy:</t>
<ul spacing="normal">
<li><t>qprotect(packet, ...); // Returns exit status to either forward or redirect the packet</t></li>
</ul>
<t>('...' suppresses distracting detail.)</t>
<t>Future modifications to policy aspects are more likely than modifications to mechanisms.
Therefore, policy aspects would be less appropriate candidates for any hardware acceleration.</t>
<t>The entry point to these functions is qprotect(), which is called from packet classification before each packet is enqueued into the appropriate queue, queue_id, as follows:</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
classifier(packet) {
    // Determine which queue using ECN, DSCP, and any local-use fields
    queue_id = classify(packet);
    // LQ & CQ are macros for valid queue IDs returned by classify()
    // if packet classified to Low-Latency Service Flow
    if (queue_id == LQ) {
        if (QPROTECT_ON) {
            if (qprotect(packet, ...) == EXIT_SANCTION) {
                // redirect packet to Classic Service Flow
                queue_id = CQ;
            }
        }
    }
    return queue_id;
}
]]></sourcecode>
<section anchor="qp_qprotect" numbered="true" toc="default">
<name>The qprotect() Function</name>
<t>On each packet arrival at the LL queue, qprotect() measures the current delay of the LL queue and derives the native LL marking probability from it. Then, it uses pick_bucket to find the bucket already holding the flow's state or to allocate a new bucket if the flow is new or its state has expired (the most likely case). Then, the queuing score is updated by the fill_bucket() function. That completes the mechanism aspects.</t>
<t>The comments against the subsequent policy conditions and actions should be self-explanatory at a superficial level. The deeper rationale for these conditions is given in <xref target="qp_rationale_conditions" format="default"/>.</t>
<sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Per-packet queue protection
qprotect(packet, ...)
{
    bckt_id;   // bucket index
    qLscore;   // queuing score of pkt's flow in units of T_RES

    qdelay = qL.qdelay(...);
    probNative = calcProbNative(qdelay);
    bckt_id = pick_bucket(packet.uflow);
    qLscore = fill_bucket(buckets[bckt_id], packet.size, probNative);

    // Determine whether to sanction packet
    if ( ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold...
           // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
           // ...exceeds CRITICALqLSCORE
           && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
         // or qLSCORE_MAX reached
         || ( qLscore >= qLSCORE_MAX ) )
        return EXIT_SANCTION;
    else
        return EXIT_SUCCESS;
}
]]></sourcecode>
</section>
<section anchor="qp_pick_bucket" numbered="true" toc="default">
<name>The pick_bucket() Function</name>
<t>The pick_bucket() function is optimized for flow-state that will normally have expired from packet to packet of the same flow. It is just one way of finding the bucket associated with the flow ID of each packet: it might be possible to develop more efficient alternatives.</t>
<t>The algorithm is arranged so that the bucket holding any live (non-expired) flow-state associated with a packet will always be found before a new bucket is allocated. The constant ATTEMPTS, defined earlier, determines how many hashes are used to find a bucket for each flow. (Actually, only one hash is generated; then, by default, 5 bits of it at a time are used as the hash value because, by default, there are 2^5 = 32 buckets.)</t>
<t>The algorithm stores the flow's own ID in its flow-state. So, when a packet of a flow arrives, the algorithm tries up to ATTEMPTS times to hash to a bucket, looking for the flow's own ID.
If found, it uses that bucket, first resetting the expiry time to "now" if it has expired.</t> <t>If it does not find the flow's ID, and the expiry time is still current, the algorithm can tell that another flow is using that bucket, and it continues to look for a bucket for the flow. Even if it finds another flow's bucket where the expiry time has passed, it doesn't immediately use it. It merely remembers it as the potential bucket to use. But first it runs through all the ATTEMPTS hashes to look for a bucket assigned to the flow ID. Then, if a live bucket is not already associated with the packet's flow, the algorithm should have already set aside an existing bucket with a score that has aged out. Given that this bucket is no longer necessary to hold state for its previous flow, it can be recycled for use by the present packet's flow.</t> <t>If all else fails, there is one additional bucket (called the dregs) that can be used. If the dregs is still in live use by another flow, subsequent flows that cannot find a bucket of their own all share it, adding their score to the one in the dregs. A flow might get away with using the dregs on its own, but when there are many mis-marked flows, multiple flows are more likely to collide in the dregs, including innocent flows.
The choice of the number of buckets and the number of hash attempts determines how likely it will be that this undesirable scenario will occur.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
// Pick the bucket associated with flow uflw
pick_bucket(uflw) {
  now;   // current time
  j;     // loop counter
  h32;   // holds hash of the packet's flow IDs
  h;     // bucket index being checked
  hsav;  // interim chosen bucket index

  h32 = hash32(uflw);    // 32-bit hash of flow ID
  hsav = NBUCKETS;       // Default bucket
  now = get_time_now();  // in units of T_RES

  // The for loop checks ATTEMPTS buckets for ownership by flow ID
  // It also records the 1st bucket, if any, that could be recycled
  // because it's expired.
  // Must not recycle a bucket until all ownership checks completed
  for (j=0; j<ATTEMPTS; j++) {
    // Use least signif. BI_SIZE bits of hash for each attempt
    h = h32 & MASK;
    if (buckets[h].id == uflw) {      // Once uflw's bucket found...
      if (buckets[h].t_exp <= now)    // ...if bucket has expired...
        buckets[h].t_exp = now;       // ...reset it
      return h;                       // Either way, use it
    } else if ( (hsav == NBUCKETS)    // If not seen expired bucket yet
                // and this bucket has expired
                && (buckets[h].t_exp <= now) ) {
      hsav = h;                       // set it as the interim bucket
    }
    h32 >>= BI_SIZE;  // Bit-shift hash for next attempt
  }
  // If reached here, no tested bucket was owned by the flow ID
  if (hsav != NBUCKETS) {
    // If here, found an expired bucket within the above for loop
    buckets[hsav].t_exp = now;        // Reset expired bucket
  } else {
    // If here, we're having to use the default bucket (the dregs)
    if (buckets[hsav].t_exp <= now) { // If dregs has expired...
      buckets[hsav].t_exp = now;      // ...reset it
    }
  }
  buckets[hsav].id = uflw;  // In either case, claim for recycling
  return hsav;
}
]]></sourcecode> </section> <section anchor="qp_fill_bucket" numbered="true" toc="default"> <name>The fill_bucket() Function</name> <t>The fill_bucket() function both accumulates and ages the queuing score over time, as outlined in <xref target="qp_approach_mechanism" format="default"/>. To make aging the score efficient, the increment of the queuing score is transformed into units of time by dividing by AGING so that the result represents the new expiry time of the flow.</t> <t>Given that probNative is already used to select which packets to ECN-mark, it might be thought that the queuing score could just be incremented by the full size of each selected packet, instead of incrementing it by the product of every packet's size (pkt_sz) and probNative. However, the unpublished experience of one of the authors with other congestion policers has found that the score then increments far too jumpily, particularly when probNative is low.</t> <t>A deeper explanation of the queuing score is given in <xref target="qp_rationale" format="default"/>.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
fill_bucket(bckt_id, pkt_sz, probNative) {
  now;  // current time
  now = get_time_now();  // in units of T_RES

  // Add packet's queuing score
  // For integer arithmetic, a bit-shift can replace the division
  qLscore = min(buckets[bckt_id].t_exp - now
                + probNative * pkt_sz / AGING, qLSCORE_MAX);
  buckets[bckt_id].t_exp = now + qLscore;
  return qLscore;
}
]]></sourcecode> </section> <section anchor="qp_calcProbNative" numbered="true" toc="default"> <name>The calcProbNative() Function</name> <t>To derive this queuing score, the QProt algorithm uses the linear ramp function calcProbNative() to normalize the instantaneous queuing delay of the LL queue into a probability in the range [0,1],
which it assigns to probNative.</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
calcProbNative(qdelay) {
  if ( qdelay >= MAXTH ) {
    probNative = MAX_PROB;
  } else if ( qdelay > MINTH ) {
    probNative = MAX_PROB * (qdelay - MINTH) / RANGE;
    // In practice, the * and the / would use a bit-shift
  } else {
    probNative = 0;
  }
  return probNative;
}
]]></sourcecode> </section> </section> </section> <section anchor="qp_rationale" numbered="true" toc="default"> <name>Rationale</name><t/><section anchor="qp_rationale_not_throughput" numbered="true" toc="default"> <name>Rationale: Blame for Queuing, Not for Rate in Itself</name> <t><xref target="qp_fig_blame_cbr_v_burst" format="default"/> shows the bit rates of two flows as stacked areas. It poses the question of which flow is more to blame for queuing delay: the unresponsive constant bit rate flow (c) that is consuming about 80% of the capacity or the flow sending regular short unresponsive bursts (b)? The smoothness of c seems better for avoiding queuing, but its high rate does not. However, if flow c were not there, or ran slightly more slowly, b would not cause any queuing.</t> <figure anchor="qp_fig_blame_cbr_v_burst"> <name>Which is more to blame for queuing delay?</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 ^ bit rate (stacked areas)
 |  ,-.          ,-.          ,-.          ,-.          ,-.
 |--|b|----------|b|----------|b|----------|b|----------|b|---Capacity
 |__|_|__________|_|__________|_|__________|_|__________|_|_____
 |
 |                            c
 |
 |
 |
 +---------------------------------------------------------------->
                                                               time]]></artwork> </figure> <t>To explain queuing scores, in the following it will initially be assumed that the QProt algorithm is accumulating queuing scores but not taking any action as a result.</t> <t>To quantify the responsibility that each flow bears for queuing delay, the QProt algorithm accumulates the product of the rate of each flow and the level of congestion, both measured at the instant each packet arrives. The instantaneous flow rate is represented at each discrete event when a packet arrives by the packet's size, which accumulates faster the more packets arrive within each unit of time. The level of congestion is normalized to a dimensionless number between 0 and 1 (probNative).
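</t> <t>In discrete-event form, the accumulation just described can be sketched as follows (an illustrative sketch only; the algorithm as specified folds this increment into the transformed, self-aging form used by fill_bucket()):</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative sketch only: untransformed score accumulation
// On each arrival of a packet pkt belonging to flow f:
probNative = calcProbNative(qL.qdelay(...));  // congestion in [0,1]
score[f] += probNative * pkt.size;            // "congested-bytes"
// While the queue is negligible, probNative == 0, so even a
// high-rate flow accumulates no score
]]></sourcecode> <t>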
This fractional congestion level is used in preference to a direct dependence on queuing delay for two reasons:</t> <ul spacing="normal"> <li> <t>to be able to ignore very low levels of queuing that contribute insignificantly to delay</t> </li> <li> <t>to be able to erect a steep barrier against excessive queuing delay</t> </li> </ul> <t>The unit of the resulting queue score is "congested-bytes" per second, which distinguishes it from just bytes per second.</t> <t>Then, during the periods between bursts (b), neither flow accumulates any queuing score: the high rate of c is benign. But, during each burst, if we say the rates of c and b are 80% and 45% of capacity, thus causing 25% overload, they each bear (80/125)% and (45/125)% of the responsibility for the queuing delay (64% and 36%). The algorithm does not explicitly calculate these percentages. They are just the outcome of the number of packets arriving from each flow during the burst.</t> <t>To summarize, the queuing score never sanctions rate solely on its own account. It only sanctions rate inasmuch as it causes queuing.</t> <figure anchor="qp_fig_blame_scenario"> <name>Responsibility for Queuing: More-Complex Scenario</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 ^ bit rate (stacked areas)                              ,
 |  ,-.                        |\                       ,-
 |------Capacity-|b|----------,-.----------|b|----------|b\-----
 |    __|_|_______            |b|    /``\|  _...-._-':
 |                                                       ,.--
 |  ,-.      __/ \__|_|_   _/        |/ \|/
 | |b| ___/             \___/                     __ r
 | |_|/    v                  \__/ \_______    _/\____/
 |      _/                                  \__/
 |
 +---------------------------------------------------------------->
                                                               time]]></artwork> </figure> <t><xref target="qp_fig_blame_scenario" format="default"/> gives a more-complex illustration of the way the queuing score assigns responsibility for queuing (limited to the precision that ASCII art can illustrate). The figure shows the bit rates of three flows represented as stacked areas labelled b, v, and r.
The unresponsive bursts (b) are the same as in the previous example, but a variable-rate video (v) replaces flow c. Its rate varies as the complexity of the video scene varies. Also, on a slower timescale, in response to the level of congestion, the video adapts its quality. However, on a short timescale it appears to be unresponsive to small amounts of queuing. Also, partway through, a low-latency responsive flow (r) joins in, aiming to fill the balance of capacity left by the other two.</t> <t>The combination of the first burst and the low application-limited rate of the video causes neither flow to accumulate queuing score. In contrast, the second burst causes similar excessive overload (125%) to the example in <xref target="qp_fig_blame_cbr_v_burst" format="default"/>. Then, the video happens to reduce its rate (probably due to a less-complex scene) so the third burst causes only a little congestion. If we assume the resulting queue causes probNative to rise to just 1%, then the queuing score will only accumulate 1% of the size of each packet of flows v and b during this burst.</t> <t>The fourth burst happens to arrive just as the new responsive flow (r) has filled the available capacity, so it leads to very rapid growth of the queue. After a round trip, the responsive flow rapidly backs off, and the adaptive video also backs off more rapidly than it would normally because of the very high congestion level. The rapid response to congestion of flow r reduces the queuing score that all three flows accumulate, but they each still bear the cost in proportion to the product of the rates at which their packets arrive at the queue and the value of probNative when they do so.
Thus, during the fifth burst, they all accumulate a lower score than in the fourth because the queuing delay is not as excessive.</t> </section> <section anchor="qp_rationale_aging" numbered="true" toc="default"> <name>Rationale for Constant Aging of the Queuing Score</name> <t>Even well-behaved flows will not always be able to respond fast enough to dynamic events. Also, well-behaved flows, e.g., Data Center TCP (DCTCP) <xref target="RFC8257" format="default"/>, TCP Prague <xref target="I-D.briscoe-iccrg-prague-congestion-control" format="default"/>, Bottleneck Bandwidth and Round-trip propagation time version 3 (BBRv3) <xref target="BBRv3" format="default"/>, or the L4S variant of SCReAM <xref target="SCReAM" format="default"/> for real-time media <xref target="RFC8298" format="default"/>, can maintain a very shallow queue by continual careful probing for more while also continually subtracting a little from their rate (or congestion window) in response to low levels of ECN signalling. Therefore, the QProt algorithm needs to continually offer a degree of forgiveness to age out the queuing score as it accumulates.</t> <t>Scalable congestion controllers, such as those above, maintain their congestion window in inverse proportion to the congestion level, probNative. That leads to the important property that, on average, a scalable flow holds the product of its congestion window and the congestion level constant, no matter the capacity of the link or how many other flows it competes with. For instance, if the link capacity doubles, a scalable flow induces half the congestion probability. Or, if three scalable flows compete for the capacity, each flow will reduce to one-third of the capacity it would use on its own and increase the congestion level by 3x.
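</t> <t>The following worked figures (invented purely for illustration) show this invariance for a hypothetical scalable controller that targets a constant two congestion signals per round trip, i.e., cwnd * probNative ~= 2:</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative numbers only; cwnd in packets per round trip
// 1 flow,  capacity C   : cwnd = 100, probNative = 2% -> product 2
// 1 flow,  capacity 2*C : cwnd = 200, probNative = 1% -> product 2
// 3 flows, capacity C   : cwnd =  33, probNative = 6% -> product 2
]]></sourcecode> <t>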
Therefore, in steady state, a scalable flow will induce the same constant amount of "congested-bytes" per round trip, whatever the link capacity and no matter how many flows are sharing the capacity.</t> <t>This suggests that the QProt algorithm will not sanction a well-behaved scalable flow if it ages out the queuing score at a sufficient constant rate. The constant will need to be somewhat above the average of a well-behaved scalable flow to allow for normal dynamics.</t> <t>Relating QProt's aging constant to a scalable flow does not mean that a flow has to behave like a scalable flow: it can be less aggressive but not more aggressive. For instance, a longer RTT flow can run at a lower congestion-rate than the aging rate, but it can also increase its aggressiveness to equal the rate of short RTT scalable flows <xref target="ScalingCC" format="default"/>. The constant aging of QProt also means that a long-running unresponsive flow will be prone to trigger QProt if it runs faster than a competing responsive scalable flow would. And, of course, if a flow causes excessive queuing in the short term, its queuing score will still rise faster than the constant aging process will decrease it. Then, QProt will still eject the flow's packets before they harm the low latency of the shared queue.</t> </section> <section anchor="qp_rationale_normalize" numbered="true" toc="default"> <name>Rationale for Transformed Queuing Score</name> <t>The QProt algorithm holds a flow's queuing score state in a structure called a "bucket".
This is because of its similarity to a classic leaky bucket (except that the contents of the bucket do not represent bytes).</t> <figure anchor="qp_fig_qscore_normalize"> <name>Transformation of Queuing Score</name> <artwork name="" type="" align="left" alt=""><![CDATA[
 probNative * pkt_sz          probNative * pkt_sz / AGING
          |                             |
          V                             V
       |     |                       |     |
       |  :  |                       |  :  |
       | ___ |                       | ___ |
       |_____|                       |_____|
       |__ __|                       |__ __|
          |                             |
          V                             V
      AGING * Dt                        Dt]]></artwork> </figure> <t>The accumulation and aging of the queuing score is shown on the left of <xref target="qp_fig_qscore_normalize" format="default"/> in token bucket form. Dt is the difference between the times when the scores of the current and previous packets were processed.</t> <t>A transformed equivalent of this token bucket is shown on the right of <xref target="qp_fig_qscore_normalize" format="default"/>, dividing both the input and output by the constant AGING rate. The result is a bucket-depth that represents time, and it drains at the rate that time passes.</t> <t>As a further optimization, the time the bucket was last updated is not stored with the flow-state. Instead, when the bucket is initialized, the queuing score is added to the system time "now" and the resulting expiry time is written into the bucket. Subsequently, if the bucket has not expired, the incremental queuing score is added to the time already held in the bucket. Then, the queuing score always represents the expiry time of the flow-state itself. This means that the queuing score does not need to be aged explicitly because it ages itself implicitly.</t> </section> <section anchor="qp_rationale_conditions" numbered="true" toc="default"> <name>Rationale for Policy Conditions</name> <t>Pseudocode for the QProt policy conditions is given in <xref target="qp_header_file" format="default"/> within the second half of the qprotect() function.
When each packet arrives, after finding its flow state and updating the queuing score of the packet's flow, the algorithm checks whether the shared queue delay exceeds a constant threshold CRITICALqL (e.g., 2 ms), as repeated below for convenience:</t> <sourcecode name="" type="pseudocode" markers="true"><![CDATA[
  if ( ( qdelay > CRITICALqL )  // Test if qdelay over threshold...
         // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
         // ...exceeds CRITICALqLSCORE
       && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
  // Recall that CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE
]]></sourcecode> <t>If the queue delay threshold is exceeded, the flow's queuing score is temporarily scaled up by the ratio of the current queue delay to the threshold queuing delay, CRITICALqL (the reason for the scaling is given next). If this scaled-up score exceeds another constant threshold CRITICALqLSCORE, the packet is ejected. The actual last line of code above multiplies both sides of the second condition by CRITICALqL to avoid a costly division.</t> <t>This approach allows each packet to be assessed once, as it arrives. Once queue delay exceeds the threshold, it has two implications:</t> <ul spacing="normal"> <li> <t>The current packet might be ejected, even though there are packets already in the queue from flows with higher queuing scores. However, any flow that continues to contribute to the queue will have to send further packets, giving an opportunity to eject them as well, as they subsequently arrive.</t> </li> <li> <t>The next packets to arrive might not be ejected because they might belong to flows with low queuing scores. In this case, queue delay could continue to rise with no opportunity to eject a packet. This is why the queuing score is scaled up by the current queue delay.
Then, the more the queue has grown without ejecting a packet, the more the algorithm "raises the bar" to further packets.</t> </li> </ul> <t>The above approach is preferred over the extra per-packet processing cost of searching the buckets for the flow with the highest queuing score and searching the queue for one of its packets to eject (if one is still in the queue).</t> <t>Note that, by default, CRITICALqL_us is set to the maximum threshold of the ramp marking algorithm, MAXTH_us. However, there is some debate as to whether setting it to the minimum threshold instead would improve QProt performance. This would roughly double the ratio of qdelay to CRITICALqL, which is compared against the CRITICALqLSCORE threshold. So, the threshold would have to be roughly doubled accordingly.</t> <t><xref target="qp_fig_policy_conditions" format="default"/> explains this approach graphically. On the horizontal axis, it shows actual harm, meaning the queuing delay in the shared queue. On the vertical axis, it shows the behavior record of the flow associated with the currently arriving packet, represented in the algorithm by the flow's queuing score.
The shaded region represents the combination of actual harm and behavior record that will lead to the packet being ejected.</t> <figure anchor="qp_fig_policy_conditions"> <name>Graphical Explanation of the Policy Conditions</name> <artwork name="" type="" align="left" alt=""><![CDATA[
Behavior Record: Queuing Score of Arriving Packet's Flow
 ^
 |           + |/ / / / / / / / / / / / / / / / / / /
 |         + N | / / / / / / / / / / / / / / / / / / /
 |           + |/ / / / / / / / / /
 |         +   | / / / /  E (Eject packet)  / / / / /
 |           + |/ / / / / / / / / /
 |         +   | / / / / / / / / / / / / / / / / / / /
 |           + |/ / / / / / / / / / / / / / / / / / /
 |            +| / / / / / / / / / / / / / / / / / / /
 |             |+ / / / / / / / / / / / / / / / / / /
 |     N       | + / / / / / / / / / / / / / / / / /
 | (No actual  |  +/ / / / / / / / / / / / / / /
 |    harm)    |    + / / / / / / / / / / / /
 |             | P (Pass over) + ,/ / / / / / / /
 |             | ^                + /./ /_/
 +-------------+-------------------------------------------->
          CRITICALqL        Actual Harm: Shared Queue Delay]]></artwork> </figure> <t>The regions labelled "N" represent cases where the first condition is not met (no actual harm); queue delay is below the critical threshold, CRITICALqL.</t> <t>The region labelled "E" represents cases where there is actual harm (queue delay exceeds CRITICALqL) and the queuing score associated with the arriving packet is high enough to be able to eject it with certainty.</t> <t>The region labelled "P" represents cases where there is actual harm, but the queuing score of the arriving packet is insufficient to eject it, so it has to be passed over. This adds to queuing delay, but the alternative would be to sanction an innocent flow.
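</t> <t>A worked example might help (the values are invented for illustration and are not defaults from the DOCSIS specifications). Suppose CRITICALqL = 2 ms and CRITICALqLSCORE = 4 ms, so that CRITICALqLPRODUCT = 8 ms^2:</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Illustrative values only
// CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE = 2 * 4 = 8
// qdelay = 2.5, qLscore = 4: 2.5 * 4 = 10  > 8 -> eject (E)
// qdelay = 2.5, qLscore = 3: 2.5 * 3 = 7.5 < 8 -> pass over (P)
// qdelay = 4,   qLscore = 3: 4 * 3   = 12  > 8 -> eject (E)
// A score of 3 is passed over at 2.5 ms of queue but ejected at
// 4 ms: the deeper queue has "raised the bar" to further packets
]]></sourcecode> <t>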
It can be seen that, as actual harm increases, the judgment of innocence becomes increasingly stringent; the behavior record of the next packet's flow does not have to be as bad to eject it.</t> <t>Conditioning ejection on actual harm helps prevent VPN packets from being ejected unnecessarily. VPNs consisting of multiple flows can tend to accumulate queuing score faster than it is aged out because the aging rate is intended for a single flow. However, whether or not some traffic is in a VPN, the queue delay threshold (CRITICALqL) will be no more likely to be exceeded. So, conditioning ejection on actual harm helps reduce the chance that VPN traffic will be ejected by the QProt function.</t> </section> <section anchor="qp_rationale_reclassify" numbered="true" toc="default"> <name>Rationale for Reclassification as the Policy Action</name> <t>When the DOCSIS QProt algorithm deems that it is necessary to eject a packet to protect the Low-Latency queue, it redirects the packet to the Classic queue. In the Low-Latency DOCSIS architecture (as in Coupled DualQ AQMs generally), the Classic queue is expected to frequently have a larger backlog of packets, which is caused by classic congestion controllers interacting with a classic AQM (which has a latency target of 10 ms) as well as other bursty traffic.</t> <t>Therefore, typically, an ejected packet will experience higher queuing delay than it would otherwise, and it could be reordered within its flow (assuming QProt does not eject all packets of an anomalous flow). The mild harm caused to the performance of the ejected packet's flow is deliberate.
It gives senders a slight incentive to identify their packets correctly.</t> <t>If there were no such harm, there would be nothing to prevent all flows from identifying themselves as suitable for classification into the low-latency queue and just letting QProt sort the resulting aggregate into queue-building and non-queue-building flows. This might seem like a useful alternative to requiring senders to correctly identify their flows. However, handling of mis-classified flows is not without a cost. The more packets that have to be reclassified, the more often the delay of the low-latency queue would exceed the threshold. Also, more memory would be required to hold the extra flow state.</t> <t>When a packet is redirected into the Classic queue, an operator might want to alter the identifier(s) that originally caused it to be classified into the Low-Latency queue so that the packet will not be classified into another low-latency queue further downstream. However, redirection of occasional packets can be due to unusually high transient load just at the specific bottleneck, not necessarily at any other bottleneck and not necessarily due to bad flow behavior. Therefore, <xref target="RFC9331" section="5.4.1.2"/> precludes a network node from altering the end-to-end ECN field to exclude traffic from L4S treatment. Instead, a local-use identifier ought to be used (e.g., Diffserv Codepoint or VLAN tag) so that each operator can apply its own policy, without prejudging what other operators ought to do.</t> <t>Although not supported in the DOCSIS specifications, QProt could be extended to recognize that large numbers of redirected packets belong to the same flow. This might be detected when the bucket expiry time t_exp exceeds a threshold.
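</t> <t>For instance, such a detection check might be sketched as follows (a hypothetical extension; the constant PERSIST_THRESH and the helper function are invented here and are not part of the DOCSIS specifications):</t> <sourcecode name="" type="pseudocode" markers="false"><![CDATA[
// Hypothetical sketch only; not part of the DOCSIS specifications
// Run after fill_bucket() has updated the flow's expiry time
if (buckets[bckt_id].t_exp - now > PERSIST_THRESH) {
  // Queuing score has stayed high for a long period; policy
  // might now treat the whole flow as a persistent offender
  flag_persistent_offender(packet.uflow);
}
]]></sourcecode> <t>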
Depending on policy and implementation capabilities, QProt could then install a classifier to redirect a whole flow into the Classic queue, with an idle timeout to remove stale classifiers. In these "persistent offender" cases, QProt might also overwrite each redirected packet's DSCP or clear its ECN field to Not-ECT, in order to protect other potential L4S queues downstream. The DOCSIS specifications do not discuss sanctioning whole flows; further discussion is beyond the scope of the present document.</t> </section> </section> <section anchor="qp_limitations" numbered="true" toc="default"> <name>Limitations</name> <t>The QProt algorithm groups packets with common Layer 4 flow identifiers. It then uses this grouping to accumulate queuing scores and to sanction packets.</t> <t>This choice of identifier for grouping is pragmatic, with no scientific basis. All the packets of a flow certainly pass between the same two endpoints. However, some applications might initiate multiple flows between the same endpoints, e.g., for media, control, data, etc. Others might use common flow identifiers for all these streams. Also, a user might group multiple application flows within the same encrypted VPN between the same Layer 4 tunnel endpoints.
And, even if there were a one-to-one mapping between flows and applications, there is no reason to believe that the rate at which congestion can be caused ought to be allocated on a per-application-flow basis.</t> <t>The use of a queuing score that excludes those aspects of flow rate that do not contribute to queuing (<xref target="qp_rationale_not_throughput" format="default"/>) goes some way to mitigating this limitation because the algorithm does not judge responsibility for queuing delay primarily on the combined rate of a set of flows grouped under one flow ID.</t> </section> <section anchor="l4sds_IANA" numbered="true" toc="default"> <name>IANA Considerations</name> <t>This document has no IANA actions.</t> </section> <section anchor="qp_impl_status" numbered="true" toc="default"> <name>Implementation Status</name> <table align="center"> <thead> <tr> <th align="left">Implementation name:</th> <th align="left">DOCSIS models for ns-3</th> </tr> </thead> <tbody> <tr> <td align="left">Organization</td> <td align="left">CableLabs</td> </tr> <tr> <td align="left">Web page</td> <td align="left">https://apps.nsnam.org/app/docsis-ns3/</td> </tr> <tr> <td align="left">Description</td> <td align="left">ns-3 simulation models developed and used in support of the Low-Latency DOCSIS development, including models of Dual Queue Coupled AQM, Queue Protection, and the DOCSIS MAC</td> </tr> <tr> <td align="left">Maturity</td> <td align="left">Simulation models that can also be used in emulation mode in a testbed context</td> </tr> <tr> <td align="left">Coverage</td> <td align="left">Complete implementation of Annex P of DOCSIS 3.1</td> </tr> <tr> <td align="left">Version</td> <td align="left">DOCSIS 3.1, version I21; https://www.cablelabs.com/specifications/CM-SP-MULPIv3.1?v=I21</td> </tr> <tr> <td align="left">Licence</td> <td align="left">GPLv2</td> </tr> <tr> <td align="left">Contact</td> <td align="left">via web page</td> </tr> <tr> <td align="left">Last Implementation Update</td> <td
align="left">Mar 2022</td> </tr> <tr> <td align="left">Information valid at</td> <td align="left">7 Mar 2022</td> </tr> </tbody> </table> <t>There are also a number of closed source implementations, including two cable modem implementations written by different chipset manufacturers and several CMTS implementations by other manufacturers. These, as well as the ns-3 implementation, have passed the full suite of compliance tests developed by CableLabs.</t> </section> <section anchor="l4sds_Security_Considerations" numbered="true" toc="default"> <name>Security Considerations</name> <t>The whole of this document concerns traffic security. It considers the security question of how to identify and eject traffic that does not comply with the non-queue-building behavior required to use a shared low-latency queue, whether accidentally or maliciously.</t> <t><xref target="RFC9330" section="8.2"/> (the L4S architecture) introduces the problem of maintaining low latency by either self-restraint or enforcement and places DOCSIS queue protection in context within a wider set of approaches to the problem.</t> <section anchor="qp_resource_exhaust" numbered="true" toc="default"> <name>Resource Exhaustion Attacks</name> <t>The algorithm has been designed to fail gracefully in the face of traffic crafted to overrun the resources used for the algorithm's own processing and flow state. This means that non-queue-building flows will always be less likely to be sanctioned than queue-building flows.
But an attack could be contrived to deplete resources in such a way that the proportion of innocent (non-queue-building) flows that are incorrectly sanctioned could increase.</t>
<t>Incorrect sanctioning is intended not to be catastrophic; it results in more packets from well-behaved flows being redirected into the Classic queue, which introduces more reordering into innocent flows.</t>
<section anchor="qp_flow-state_exhaust" numbered="true" toc="default">
<name>Exhausting Flow-State Storage</name>
<t>To exhaust the number of buckets, the most efficient attack is to send enough long-running attack flows to increase the chance that an arriving flow will not find an available bucket and will, therefore, have to share the "dregs" bucket. For instance, if ATTEMPTS=2 and NBUCKETS=32, it requires about 94 attack flows, each using different port numbers, to increase the probability to 99% that an arriving flow will have to share the dregs, where it will share a high degree of redirection into the C queue with the remainder of the attack flows.</t>
<t>For an attacker to keep buckets busy, it is more efficient to hold onto them by cycling regularly through a set of port numbers (94 in the above example) rather than to keep occupying and releasing buckets with single-packet flows across a much larger number of ports.</t>
<t>During such an attack, the coupled marking probability will have saturated at 100%. So, to hold a bucket, the rate of an attack flow needs to be no less than the AGING rate of each bucket: 4 Mb/s by default. However, for an attack flow to be sure to hold on to its bucket, it would need to send somewhat faster. Thus, an attack with 100 flows would need a total force of more than 100 * 4 Mb/s = 400 Mb/s.</t>
<t>This attack can be mitigated (but not prevented) by increasing the number of buckets. The required attack force scales linearly with the number of buckets, NBUCKETS.
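As an illustration, the figures above can be reproduced with a short Monte Carlo sketch. This is a simplified model, not the DOCSIS hash: it assumes each flow draws ATTEMPTS independent, uniformly distributed bucket indices and claims the first free one, and the helper names below are hypothetical.

```python
import random

def dregs_probability(attack_flows, nbuckets=32, attempts=2, trials=5000):
    """Estimate the chance that a newly arriving flow finds all of its
    hashed bucket choices occupied and so must share the 'dregs' bucket.
    Simplified model: each flow draws `attempts` independent uniform
    bucket indices and claims the first free one."""
    hits = 0
    for _ in range(trials):
        occupied = [False] * nbuckets
        for _ in range(attack_flows):
            for b in random.choices(range(nbuckets), k=attempts):
                if not occupied[b]:
                    occupied[b] = True
                    break
        # A fresh arrival shares the dregs if every one of its choices
        # lands on an occupied bucket.
        if all(occupied[b] for b in random.choices(range(nbuckets), k=attempts)):
            hits += 1
    return hits / trials

AGING_RATE_MBPS = 4  # default per-bucket AGING rate quoted above

# With ATTEMPTS=2 and NBUCKETS=32, roughly 94 attack flows drive the dregs
# probability close to 99%, and holding those buckets costs the attacker
# on the order of 94 * 4 Mb/s of sustained traffic.
print(dregs_probability(94))
print(94 * AGING_RATE_MBPS, "Mb/s")
```

In this sketch, doubling `nbuckets` to 64 roughly doubles the number of flows (and hence the total bit rate) needed to reach the same dregs probability, matching the linear-scaling argument in the text.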
So, if NBUCKETS were doubled to 64, twice as many 4 Mb/s flows would be needed to maintain the same impact on innocent flows.</t>
<t>Probably the most effective mitigation would be to implement redirection of whole flows once enough of the individual packets of a certain offending flow had been redirected. This would free up the buckets used to maintain the per-packet queuing score of persistent offenders. Whole-flow redirection is outside the scope of the current version of the QProt algorithm specified here, but it is briefly discussed at the end of <xref target="qp_rationale_reclassify" format="default"/>.</t>
<t>It might be considered that all the packets of persistently offending flows ought to be discarded rather than redirected. However, this is not recommended because attack flows might be able to pervert whole-flow discard, turning it onto at least some innocent flows, thus amplifying an attack that causes reordering into total deletion of some innocent flows.</t>
</section>
<section anchor="qp_proc_exhaust" numbered="true" toc="default">
<name>Exhausting Processing Resources</name>
<t>The processing time needed to apply the QProt algorithm to each LL packet is small and intended not to take all the time available between each of a run of fairly small packets. However, an attack could use minimum-sized packets launched from multiple input interfaces into a lower capacity output interface. Whether the QProt algorithm is vulnerable to processor exhaustion will depend on the specific implementation.</t>
<t>Addition of a capability to redirect persistently offending flows from LL to C would be the most effective way to reduce the per-packet processing cost of the QProt algorithm when under attack. As already mentioned in <xref target="qp_flow-state_exhaust" format="default"/>, this would also be an effective way to mitigate flow-state exhaustion attacks.
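To put rough numbers on the minimum-sized-packet concern above: the per-packet processing budget shrinks linearly with packet size. The link rates and Ethernet framing figures below are illustrative assumptions, not values from the DOCSIS specification.

```python
MIN_FRAME_BYTES = 64       # illustrative minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20   # preamble + inter-frame gap on the wire

def packet_budget_ns(link_gbps):
    """Nanoseconds available to process each packet if minimum-size
    frames arrive back to back at the given line rate."""
    bits_per_packet = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    # seconds per packet, converted to nanoseconds
    return bits_per_packet / (link_gbps * 1e9) * 1e9

print(round(packet_budget_ns(1)))   # 672 ns per packet at 1 Gb/s
print(round(packet_budget_ns(10)))  # 67 ns per packet at 10 Gb/s
```

Several input interfaces feeding one lower-capacity output interface multiply the arrival rate accordingly, which is why per-packet processing cost, not just aggregate throughput, determines whether an implementation is exposed.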
Further discussion of whole-flow redirection is outside the scope of the present document but is briefly discussed at the end of <xref target="qp_rationale_reclassify" format="default"/>.</t>
</section>
</section>
</section>
<section numbered="true" toc="default">
<name>Comments Solicited</name>
<t>Evaluation and assessment of the algorithm by researchers is solicited. Comments and questions are also encouraged and welcome. They can be addressed to the authors.</t>
</section>
</middle>
<!-- ***** BACK MATTER ***** -->
<back>
<displayreference target="I-D.briscoe-iccrg-prague-congestion-control" to="PRAGUE-CC"/>
<references>
<name>References</name>
<references>
<name>Normative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8311.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9331.xml"/>
<!-- [I-D.ietf-tsvwg-nqb] -> [RFC9956] companion doc; RFC 9956 = draft-ietf-tsvwg-nqb-33 -->
<reference anchor="RFC9956" target="https://www.rfc-editor.org/info/rfc9956">
<front>
<title>A Non-Queue-Building Per-Hop Behavior (NQB PHB) for Differentiated Services</title>
<author initials="G." surname="White" fullname="Greg White">
<organization>CableLabs</organization>
</author>
<author initials="T." surname="Fossati" fullname="Thomas Fossati">
<organization>Linaro</organization>
</author>
<author initials="R." surname="Geib" fullname="Ruediger Geib">
<organization>Deutsche Telekom</organization>
</author>
<date month="April" year="2026"/>
</front>
<seriesInfo name="RFC" value="9956"/>
<seriesInfo name="DOI" value="10.17487/RFC9956"/>
</reference>
<!-- [rfced] Please review the following and let us know if any further updates are necessary: The original URLs for [DOCSIS], [DOCSIS-CCAP-OSS], and [DOCSIS-CM-OSS] resolved to a blank search results page. We found more-direct URLs for these CableLabs specifications and updated the references accordingly. Note that we also updated the date for [DOCSIS-CCAP-OSS] from "21 January 2019" to "7 February 2019" to match the information provided at that URL. -->
<reference anchor="DOCSIS" target="https://www.cablelabs.com/specifications/CM-SP-MULPIv3.1">
<front>
<title>MAC and Upper Layer Protocols Interface (MULPI) Specification, CM-SP-MULPIv3.1</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="21" month="January" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I17 or later"/>
</reference>
<reference anchor="DOCSIS-CM-OSS" target="https://www.cablelabs.com/specifications/CM-SP-CM-OSSIv3.1">
<front>
<title>Cable Modem Operations Support System Interface Specification</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="21" month="January" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I14 or later"/>
</reference>
<reference anchor="DOCSIS-CCAP-OSS" target="https://www.cablelabs.com/specifications/CM-SP-CCAP-OSSIv3.1">
<front>
<title>CCAP Operations Support System Interface Specification</title>
<author>
<organization>CableLabs</organization>
</author>
<date day="7" month="February" year="2019"/>
</front>
<seriesInfo name="Data-Over-Cable Service Interface Specifications DOCSIS(r) 3.1" value="Version I14 or later"/>
</reference>
</references>
<references>
<name>Informative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4303.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6789.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7713.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8257.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8298.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9332.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9330.xml"/>
<!-- [I-D.briscoe-iccrg-prague-congestion-control] draft-briscoe-iccrg-prague-congestion-control-04; IESG State: Expired as of 1/5/25 -->
<xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.briscoe-iccrg-prague-congestion-control.xml"/>
<reference anchor="LLD" target="https://cablela.bs/low-latency-docsis-technology-overview-february-2019">
<front>
<title>Low Latency DOCSIS: Technology Overview</title>
<author fullname="Greg White" initials="G." surname="White">
<organization>CableLabs</organization>
</author>
<author fullname="Karthik Sundaresan" initials="K." surname="Sundaresan">
<organization>CableLabs</organization>
</author>
<author fullname="Bob Briscoe" initials="B." surname="Briscoe">
<organization>CableLabs</organization>
</author>
<date month="February" year="2019"/>
</front>
<refcontent>CableLabs White Paper</refcontent>
</reference>
<reference anchor="ScalingCC" target="https://arxiv.org/abs/1904.07605">
<front>
<title>Resolving Tensions between Congestion Control Scaling Requirements</title>
<author fullname="Bob Briscoe" initials="B." surname="Briscoe">
<organization>Simula Research Lab</organization>
</author>
<author fullname="Koen De Schepper" initials="K." surname="De Schepper">
<organization>Nokia Bell Labs</organization>
</author>
<date month="July" year="2017"/>
</front>
<seriesInfo name="Simula Technical Report" value="TR-CS-2016-001"/>
<seriesInfo name="DOI" value="10.48550/arXiv.1904.07605"/>
<refcontent>arXiv:1904.07605</refcontent>
</reference>
<!-- [rfced] FYI: We updated the [BBRv3] and [SCReAM] references to match current style guidance for references to web-based public code repositories: https://www.rfc-editor.org/styleguide/part2/#ref_repo -->
<reference anchor="BBRv3" target="https://github.com/google/bbr/blob/v3/README.md">
<front>
<title>TCP BBR v3 Release</title>
<author/>
<date day="18" month="March" year="2025"/>
</front>
<refcontent>commit 90210de</refcontent>
</reference>
<reference anchor="SCReAM" target="https://github.com/EricssonResearch/scream/blob/master/README.md">
<front>
<title>SCReAM</title>
<author/>
<date day="10" month="November" year="2025"/>
</front>
<refcontent>commit 0208f59</refcontent>
</reference>
</references>
</references>
<section numbered="false" toc="default">
<name>Acknowledgements</name>
<t>Thanks to <contact fullname="Tom Henderson"/>, <contact fullname="Magnus Westerlund"/>, <contact fullname="David Black"/>, <contact fullname="Adrian Farrel"/>, and <contact fullname="Gorry Fairhurst"/> for their reviews of this document. The design of the QProt algorithm and the settings of the parameters benefited from discussion and critique from the participants of the cable industry working group on Low-Latency DOCSIS. CableLabs funded <contact fullname="Bob Briscoe"/>'s initial work on this document.</t>
</section>
<!--[rfced] We had the following questions related to terminology used throughout the document:

a) Several sections use "the algorithm" in an opening statement while other sections say "The QProt algorithm". Would it be easier for the reader to call it "The QProt algorithm" in first mentions in a section (and use "the algorithm" thereafter in the section)? Thinking of readers that may not read the entire RFC, but instead jump to a section from a reference link.

b) We have updated to use the form on the right throughout. Please let us know any objections.

IPSec / IPsec (to match RFC 4303)
flow-ID / flow ID

c) How may we make the following terms consistent throughout?

Congestion-rate vs. congestion-rate
Coupled DualQ AQM vs. Dual Queue Coupled AQM (companion uses "IETF's Coupled DualQ AQM")
Diffserv Codepoint vs. Diffserv codepoint (companion uses Diffserv Code Point and Differentiated Services Code Point)
flow state vs. flow-state
Native vs. native vs. "Native"
per-flow-state vs. per-flow state
queue protection vs. Queue Protection
-->
<!--[rfced] We had the following questions related to abbreviations used throughout the document:

a) FYI - We have added expansions for abbreviations upon first use per Section 3.6 of RFC 7322 ("RFC Style Guide"). Please review each expansion in the document carefully to ensure correctness.

b) We see that the companion document (draft-ietf-tsvwg-nqb-33) uses the following abbreviations:

NQB - Non-Queue-Building
QB - Queue-Building

We see that this document only uses NQB when mentioning the Diffserv codepoint. Can NQB be introduced earlier in the document and be used to refer to the general concept?

c) We see that [DOCSIS] uses "Queue Protection" rather than "queue protection". We see both the capped and lowercase versions used in this document. May we update to simply QProt (after first expansion) when referring to the algorithm? And/or are there places where capping or lowercasing this term is necessary? If not, please let us know how we may make this consistent. Further, is it QProt algorithm or DOCSIS QProt algorithm?

d) FYI - We have updated the expansion of DOCSIS to use hyphenation (i.e., Data-Over-Cable) to match the use in [DOCSIS] and the companion document. Please let us know any objections.

e) How may we expand the following abbreviations? CE, MAC

f) We will update to use the abbreviated forms of the following after expansion on first use (per the guidance at https://www.rfc-editor.org/styleguide/part2/#exp_abbrev): LL, CM

g) We note that this document uses LL queue as an abbreviation for low-latency queue. However, we see RFC 9332 uses "low-latency (L) queue". Please review this discrepancy and let us know if any further updates are necessary. Further, please note that we have hyphenated low latency when it appears in the attributive position to match its use in RFCs 9330-9332.
-->
<!-- [rfced] Please review the "Inclusive Language" portion of the online Style Guide <https://www.rfc-editor.org/styleguide/part2/#inclusive_language> and let us know if any changes are needed. Updates of this nature typically result in more precise language, which is helpful for readers. For example, please consider whether the following should be updated: native

In addition, please consider whether uses of "tradition" should be updated for clarity. While the NIST website <https://web.archive.org/web/20250214092458/https://www.nist.gov/nist-research-library/nist-technical-series-publications-author-instructions#table1> indicates that this term is potentially biased, it is also ambiguous. "Tradition" is a subjective term, as it is not the same for everyone.
-->
</back>
</rfc>