rfc9623.original   rfc9623.txt 
TAPS Working Group A. Brunstrom, Ed. Internet Engineering Task Force (IETF) A. Brunstrom, Ed.
Internet-Draft Karlstad University Request for Comments: 9623 Karlstad University
Intended status: Informational T. Pauly, Ed. Category: Informational T. Pauly, Ed.
Expires: 16 June 2024 Apple Inc. ISSN: 2070-1721 Apple Inc.
R. Enghardt R. Enghardt
Netflix Netflix
P. Tiesel P.S. Tiesel
SAP SE SAP SE
M. Welzl M. Welzl
University of Oslo University of Oslo
14 December 2023 January 2025
Implementing Interfaces to Transport Services Implementing Interfaces to Transport Services
draft-ietf-taps-impl-18
Abstract Abstract
The Transport Services system enables applications to use transport The Transport Services System enables applications to use transport
protocols flexibly for network communication and defines a protocol- protocols flexibly for network communication and defines a protocol-
independent Transport Services Application Programming Interface independent Transport Services Application Programming Interface
(API) that is based on an asynchronous, event-driven interaction (API) that is based on an asynchronous, event-driven interaction
pattern. This document serves as a guide to implementing such a pattern. This document serves as a guide to implementing such a
system. system.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are candidates for any level of Internet
Standard; see Section 2 of RFC 7841.
This Internet-Draft will expire on 16 June 2024. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9623.
Copyright Notice Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document. Code Components extracted from this document must
described in Section 4.e of the Trust Legal Provisions and are include Revised BSD License text as described in Section 4.e of the
provided without warranty as described in the Revised BSD License. Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Implementing Connection Objects 2. Implementing Connection Objects
3. Implementing Pre-Establishment 3. Implementing Preestablishment
3.1. Configuration-time errors 3.1. Configuration-Time Errors
3.2. Role of system policy 3.2. Role of System Policy
4. Implementing Connection Establishment 4. Implementing Connection Establishment
4.1. Structuring Candidates as a Tree 4.1. Structuring Candidates as a Tree
4.1.1. Branch Types 4.1.1. Branch Types
4.1.2. Branching Order-of-Operations 4.1.2. Branching Order-of-Operations
4.1.3. Sorting Branches 4.1.3. Sorting Branches
4.2. Candidate Gathering 4.2. Candidate Gathering
4.2.1. Gathering Endpoint Candidates 4.2.1. Gathering Endpoint Candidates
4.3. Candidate Racing 4.3. Candidate Racing
4.3.1. Simultaneous 4.3.1. Simultaneous
4.3.2. Staggered 4.3.2. Staggered
4.3.3. Failover 4.3.3. Failover
4.4. Completing Establishment 4.4. Completing Establishment
4.4.1. Determining Successful Establishment 4.4.1. Determining Successful Establishment
4.5. Establishing multiplexed connections 4.5. Establishing Multiplexed Connections
4.6. Handling connectionless protocols 4.6. Handling Connectionless Protocols
4.7. Implementing Listeners 4.7. Implementing Listeners
4.7.1. Implementing Listeners for Connected Protocols 4.7.1. Implementing Listeners for Connected Protocols
4.7.2. Implementing Listeners for Connectionless 4.7.2. Implementing Listeners for Connectionless Protocols
Protocols 4.7.3. Implementing Listeners for Multiplexed Protocols
4.7.3. Implementing Listeners for Multiplexed Protocols 5. Implementing Sending and Receiving Data
5. Implementing Sending and Receiving Data 5.1. Sending Messages
5.1. Sending Messages 5.1.1. Message Properties
5.1.1. Message Properties 5.1.2. Send Completion
5.1.2. Send Completion 5.1.3. Batching Sends
5.1.3. Batching Sends 5.2. Receiving Messages
5.2. Receiving Messages 5.3. Handling of Data for Fast-Open Protocols
5.3. Handling of data for fast-open protocols 6. Implementing Message Framers
6. Implementing Message Framers 6.1. Defining Message Framers
6.1. Defining Message Framers 6.2. Sender-Side Message Framing
6.2. Sender-side Message Framing 6.3. Receiver-Side Message Framing
6.3. Receiver-side Message Framing 7. Implementing Connection Management
7. Implementing Connection Management 7.1. Pooled Connection
7.1. Pooled Connection 7.2. Handling Path Changes
7.2. Handling Path Changes 8. Implementing Connection Termination
8. Implementing Connection Termination 9. Cached State
9. Cached State 9.1. Protocol State Caches
9.1. Protocol state caches 9.2. Performance Caches
9.2. Performance caches 10. Specific Transport Protocol Considerations
10. Specific Transport Protocol Considerations 10.1. TCP
10.1. TCP 10.2. MPTCP
10.2. MPTCP 10.3. UDP
10.3. UDP 10.4. UDP-Lite
10.4. UDP-Lite 10.5. UDP Multicast Receive
10.5. UDP Multicast Receive 10.6. SCTP
10.6. SCTP 11. IANA Considerations
11. IANA Considerations 12. Security Considerations
12. Security Considerations 12.1. Considerations for Candidate Gathering
12.1. Considerations for Candidate Gathering 12.2. Considerations for Candidate Racing
12.2. Considerations for Candidate Racing 13. References
13. Acknowledgements 13.1. Normative References
14. References 13.2. Informative References
14.1. Normative References Appendix A. API Mapping Template
14.2. Informative References Appendix B. Reasons for Errors
Appendix A. API Mapping Template Appendix C. Existing Implementations
Appendix B. Reasons for errors Acknowledgements
Appendix C. Existing Implementations Authors' Addresses
Authors' Addresses
1. Introduction 1. Introduction
The Transport Services architecture [I-D.ietf-taps-arch] defines a The Transport Services Architecture [RFC9621] defines a system that
system that allows applications to flexibly use transport networking allows applications to flexibly use transport networking protocols.
protocols. The API that such a system exposes to applications is The API that such a system exposes to applications is defined as the
defined as the Transport Services API [I-D.ietf-taps-interface]. Transport Services API [RFC9622]. This API is designed to be generic
This API is designed to be generic across multiple transport across multiple transport protocols and sets of protocol features.
protocols and sets of protocol features.
This document serves as a guide to implementing a system that This document serves as a guide to implementing a system that
provides a Transport Services API. This guide offers suggestions to provides a Transport Services API. This guide offers suggestions to
developers, but it is not prescriptive: implementations are free to developers, but it is not prescriptive: implementations are free to
take any desired form as long as the API specification in take any desired form as long as the API specification defined in
[I-D.ietf-taps-interface] is honored. It is the job of an [RFC9622] is honored. It is the job of an implementation of a
implementation of a Transport Services system to turn the requests of Transport Services System to turn the requests of an application into
an application into decisions on how to establish connections, and decisions on how to establish connections and how to transfer data
how to transfer data over those connections once established. The over those connections once established. The terminology used in
terminology used in this document is based on the Transport Services this document is based on the terminology defined in the Transport
architecture [I-D.ietf-taps-arch]. Services Architecture [RFC9621].
2. Implementing Connection Objects 2. Implementing Connection Objects
The connection objects that are exposed to applications for Transport The Connection objects that are exposed to applications for Transport
Services are: Services are:
* the Preconnection, the bundle of properties that describes the * the Preconnection, the bundle of Properties that describes the
application constraints on, and preferences for, the transport; application constraints on, and preferences for, the transport;
* the Connection, the basic object that represents a flow of data as * the Connection, the basic object that represents a flow of data as
Messages in either direction between the Local and Remote Messages in either direction between the Local and Remote
Endpoints; Endpoints;
* and the Listener, a passive waiting object that delivers new * and the Listener, a passive waiting object that delivers new
Connections. Connections.
Preconnection objects should be implemented as bundles of properties Preconnection objects should be implemented as bundles of Properties
that an application can both read and write. A Preconnection object that an application can both read and write. A Preconnection object
influences a Connection only at one point in time: when the influences a Connection only at one point in time: when the
Connection is created. Connection objects represent the interface Connection is created. Connection objects represent the interface
between the application and the implementation to manage transport between the application and the implementation to manage transport
state, and conduct data transfer. During the process of state and conduct data transfer. During the process of establishment
establishment (Section 4), the Connection will not necessarily be (Section 4), the Connection will not necessarily be immediately bound
immediately bound to a transport protocol instance, since multiple to a transport protocol instance, since multiple candidate Protocol
candidate Protocol Stacks might be raced. Stacks might be raced.
Once a Preconnection has been used to create an outbound Connection Once a Preconnection has been used to create an outbound Connection
or a Listener, the implementation should ensure that the copy of the or a Listener, the implementation should ensure that the copy of the
properties held by the Connection or Listener cannot be mutated by Properties held by the Connection or Listener cannot be mutated by
the application making changes to the original Preconnection object. the application making changes to the original Preconnection object.
This may involve the implementation performing a deep-copy, copying This may involve the implementation performing a deep-copy, copying
the object with all the objects that it references. the object with all the objects that it references.
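The deep-copy behavior described above can be illustrated with a minimal sketch. It assumes a hypothetical Python implementation in which Properties are stored in nested dictionaries; the class names, method names, and the dictionary layout are illustrative assumptions and are not defined by the Transport Services API.

```python
import copy


class Connection:
    """Hypothetical Connection that keeps a private snapshot of Properties."""

    def __init__(self, properties):
        self._properties = properties

    def get_property(self, group, name):
        return self._properties[group][name]


class Preconnection:
    """Hypothetical Preconnection: a readable and writable Property bundle."""

    def __init__(self):
        # Assumed layout: one nested dictionary per Property group.
        self.properties = {"selection": {}, "connection": {}}

    def set_property(self, group, name, value):
        self.properties[group][name] = value

    def initiate(self):
        # Deep-copy the bundle, including every nested object, so that
        # later changes to this Preconnection cannot reach the Connection.
        return Connection(copy.deepcopy(self.properties))


pre = Preconnection()
pre.set_property("selection", "reliability", "require")
conn = pre.initiate()

# Mutating the original Preconnection after Initiate has no effect
# on the Connection that was created from it.
pre.set_property("selection", "reliability", "prohibit")
print(conn.get_property("selection", "reliability"))  # prints "require"
```

A shallow copy would not be sufficient in this layout, because the nested dictionaries would still be shared between the Preconnection and the Connection.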
Once the Connection is established, the Transport Services Once the Connection is established, the Transport Services
Implementation maps actions and events to the details of the chosen Implementation maps actions and events to the details of the chosen
Protocol Stack. For example, the same Connection object may Protocol Stack. For example, the same Connection object may
ultimately represent a single transport protocol instance (e.g., a ultimately represent a single transport protocol instance (e.g., a
TCP connection, a TLS session over TCP, a UDP flow with fully- TCP connection, a TLS session over TCP, a UDP flow with fully
specified Local and Remote Endpoint Identifiers, a DTLS session, a specified Local and Remote Endpoint Identifiers, a DTLS session, a
SCTP stream, a QUIC stream, or an HTTP/2 stream). The Connection Stream Control Transmission Protocol (SCTP) stream, a QUIC stream, or
Properties held by a Connection or Listener are independent of other an HTTP/2 stream). The Connection Properties held by a Connection or
Connections that are not part of the same Connection Group. Listener are independent of other Connections that are not part of
the same Connection Group.
Connection establishment is only a local operation for a Connection establishment is only a local operation for connectionless
connectionless protocols, which serves to simplify the local send/ protocols, which serves to simplify the local send/receive functions
receive functions and to filter the traffic for the specified and to filter the traffic for the specified addresses and ports
addresses and ports [RFC8085] (for example using UDP or UDP-Lite [RFC8085] (for example, using UDP or UDP-Lite transport without a
transport without a connection handshake procedure). connection handshake procedure).
Once Initiate has been called, the Selection Properties and Endpoint Once Initiate has been called, the Selection Properties and Endpoint
information of the created Connection are immutable (i.e, an information of the created Connection are immutable (i.e., an
application is not able to later modify the properties of a application is not able to later modify the Properties of a
Connection by manipulating the original Preconnection object). Connection by manipulating the original Preconnection object).
Listener objects are created with a Preconnection, at which point Listener objects are created with a Preconnection, at which point
their configuration should be considered immutable by the their configuration should be considered immutable by the
implementation. The process of listening is described in implementation. The process of listening is described in
Section 4.7. Section 4.7.
3. Implementing Pre-Establishment 3. Implementing Preestablishment
The pre-establishment phase allows applications to specify properties The preestablishment phase allows applications to specify Properties
for the Connections that they are about to make, or to query the API for the Connections that they are about to make or to query the API
about potential Connections they could make. about potential Connections they could make.
During pre-establishment the application specifies one or more During preestablishment, the application specifies one or more
Endpoints to be used for communication as well as protocol Endpoints to be used for communication as well as protocol
preferences and constraints via Selection Properties and, if desired, preferences and constraints via Selection Properties and, if desired,
also Connection Properties. Section 4 of [I-D.ietf-taps-interface] also Connection Properties. Section 4 of [RFC9622] states that
states that Connection Properties should preferably be configured Connection Properties should preferably be configured during
during pre-establishment, because they can serve as input to preestablishment because they can serve as input to decisions that
decisions that are made by the implementation (e.g., the capacity are made by the implementation (e.g., the capacity profile can guide
profile can guide usage of a protocol offering scavenger-type usage of a protocol offering scavenger-type congestion control).
congestion control).
The implementation stores these properties as a part of the The implementation stores these Properties as a part of the
Preconnection object for use during connection establishment. For Preconnection object for use during Connection establishment. For
Selection Properties that are not provided by the application, the Selection Properties that are not provided by the application, the
implementation uses the default values specified in the Transport implementation uses the default values specified in the Transport
Services API ([I-D.ietf-taps-interface]). Services API ([RFC9622]).
3.1. Configuration-time errors 3.1. Configuration-Time Errors
The Transport Services system should have a list of supported The Transport Services System should have a list of supported
protocols available, which each have transport features reflecting protocols available, each of which has transport features reflecting
the capabilities of the protocol. Once an application specifies its the capabilities of the protocol. Once an application specifies its
Transport Properties, the Transport Services system matches the Transport Properties, the Transport Services System matches the
required and prohibited properties against the transport features of required and prohibited Properties against the transport features of
the available protocols (see Section 6.2 of [I-D.ietf-taps-interface] the available protocols (see Section 6.2 of [RFC9622] for the
for the definition of property preferences). definition of Property Preferences).
In the following cases, failure should be detected during pre- In the following cases, failure should be detected during
establishment: preestablishment:
* A request by an application for properties that cannot be * A request by an application for Properties that cannot be
satisfied by any of the available protocols. For example, if an satisfied by any of the available protocols. For example, if an
application requires perMsgReliability, but no such feature is application requires perMsgReliability, but no such feature is
available in any protocol on the host running the Transport available in any protocol on the host running the Transport
Services system this should result in an error. Services System, this should result in an error.
* A request by an application for properties that are in conflict * A request by an application for Properties that are in conflict
with each other, such as specifying required and prohibited with each other, such as specifying required and prohibited
properties that cannot be satisfied by any protocol. For example, Properties that cannot be satisfied by any protocol. For example,
if an application prohibits reliability but then requires if an application prohibits reliability but then requires
perMsgReliability, this mismatch should result in an error. perMsgReliability, this mismatch should result in an error.
To avoid allocating resources that are not finally needed, it is To avoid allocating resources that are not needed, it is important
important that configuration-time errors fail as early as possible. that configuration-time errors fail as early as possible.
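As a rough illustration of such an early check, the following sketch matches required and prohibited Properties against a table of protocol features. The feature table, the ConfigurationError exception, and the function name are assumptions made for this example; a real system would derive the table from the protocols actually available on the host.

```python
# Hypothetical feature table (illustrative only, not a statement about
# what any particular protocol implementation offers).
PROTOCOL_FEATURES = {
    "TCP": {"reliability", "preserveOrder", "congestionControl"},
    "UDP": {"preserveMsgBoundaries"},
    "SCTP": {"reliability", "perMsgReliability",
             "preserveMsgBoundaries", "congestionControl"},
}


class ConfigurationError(Exception):
    """Hypothetical error raised during preestablishment."""


def eligible_protocols(required, prohibited, features=PROTOCOL_FEATURES):
    """Return protocols that offer all required and no prohibited features."""
    conflict = required & prohibited
    if conflict:
        # The same Property is both required and prohibited.
        raise ConfigurationError(f"conflicting Properties: {sorted(conflict)}")
    matches = [
        name for name, offered in features.items()
        if required <= offered and not (prohibited & offered)
    ]
    if not matches:
        # No available protocol can satisfy the request: fail early,
        # before any resources are allocated.
        raise ConfigurationError("no available protocol satisfies the request")
    return matches


print(eligible_protocols({"reliability"}, set()))

try:
    # Mirrors the example in the text: reliability is prohibited, but
    # perMsgReliability is required.
    eligible_protocols({"perMsgReliability"}, {"reliability"})
except ConfigurationError as err:
    print("configuration-time error:", err)
```

With this assumed table, the second call fails because the only protocol offering perMsgReliability also offers the prohibited reliability feature.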
3.2. Role of system policy 3.2. Role of System Policy
The properties specified during pre-establishment have a close The Properties specified during preestablishment have a close
relationship to system policy. The implementation is responsible for relationship to System Policy. The implementation is responsible for
combining and reconciling several different sources of preferences combining and reconciling several different sources of preferences
when establishing Connections. These include, but are not limited when establishing Connections. These include, but are not limited
to: to:
1. Application preferences, i.e., preferences specified during the 1. Application preferences, i.e., preferences specified during
pre-establishment via Selection Properties. preestablishment via Selection Properties.
2. Dynamic system policy, i.e., policy compiled from internally and 2. Dynamic System Policy, i.e., policy compiled from internally and
externally acquired information about available network externally acquired information about available network
interfaces, supported transport protocols, and current/previous interfaces, supported transport protocols, and current/previous
Connections. Examples of ways to externally retrieve policy- Connections. Examples of ways to externally retrieve policy-
support information are through OS-specific statistics/ support information are through OS-specific statistics/
measurement tools and tools that reside on middleboxes and measurement tools and tools that reside on middleboxes and
routers. routers.
3. Default implementation policy, i.e., predefined policy by OS or 3. Default implementation policy, i.e., predefined policy by the OS
application. or application.
In general, any protocol or path used for a Connection must conform In general, any protocol or path used for a Connection must conform
to all three sources of constraints. A violation that occurs at any to all three sources of constraints. A violation that occurs at any
of the policy layers should cause a protocol or path to be considered of the policy layers should cause a protocol or path to be considered
ineligible for use. If such a violation prevents a Connection from ineligible for use. If such a violation prevents a Connection from
being established, this should be communicated to the application, being established, this should be communicated to the application,
e.g. via the EstablishmentError event. For an example of application e.g., via the EstablishmentError event. For an example of
preferences leading to constraints, an application may prohibit the application preferences leading to constraints, an application may
use of metered network interfaces for a given Connection to avoid prohibit the use of metered network interfaces for a given Connection
user cost. Similarly, the system policy at a given time may prohibit to avoid user cost. Similarly, the System Policy at a given time may
the use of such a metered network interface from the application's prohibit the use of such a metered network interface from the
process. Lastly, the implementation itself may default to application's process. Lastly, the implementation itself may default
disallowing certain network interfaces unless explicitly requested by to disallowing certain network interfaces unless explicitly requested
the application. by the application.
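The rule that a path must conform to all three sources of constraints can be sketched as follows. The Path type, the three predicate functions, the interface names, and the EstablishmentError exception (standing in for delivery of the EstablishmentError event) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class EstablishmentError(Exception):
    """Stand-in for delivering the EstablishmentError event."""


@dataclass(frozen=True)
class Path:
    interface: str
    metered: bool


def application_allows(path):
    # Assumed application preference: prohibit metered interfaces.
    return not path.metered


def system_allows(path):
    # Assumed dynamic System Policy: only these interfaces may be used.
    return path.interface in {"wifi0", "lte0", "vpn0"}


def default_allows(path):
    # Assumed implementation default: VPN interfaces need an explicit request.
    return not path.interface.startswith("vpn")


POLICY_LAYERS = (
    ("application", application_allows),
    ("system", system_allows),
    ("default", default_allows),
)


def violated_layers(path, layers=POLICY_LAYERS):
    """Name every policy layer that the given path violates."""
    return [name for name, allows in layers if not allows(path)]


def eligible_paths(paths, layers=POLICY_LAYERS):
    """Keep only the paths that conform to all policy layers."""
    eligible = [p for p in paths if not violated_layers(p, layers)]
    if not eligible:
        raise EstablishmentError("no path satisfies all policy layers")
    return eligible


paths = [Path("wifi0", False), Path("lte0", True), Path("vpn0", False)]
print([p.interface for p in eligible_paths(paths)])
print(violated_layers(Path("lte0", True)))
```

Because the system's policy may change over time, an implementation following this pattern would evaluate the predicates at establishment time instead of caching their results indefinitely.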
It is expected that the database of system policies and the method of It is expected that the database of system policies and the method of
looking up these policies will vary across various platforms. An looking up these policies will vary across various platforms. An
implementation should attempt to look up the relevant policies for implementation should attempt to look up the relevant policies for
the system in a dynamic way to make sure it is reflecting an accurate the system in a dynamic way to make sure it reflects an accurate
version of the system policy, since the system's policy regarding the version of the System Policy, since the system's policy regarding the
application's traffic may change over time due to user or application's traffic may change over time due to user or
administrative changes. administrative changes.
4. Implementing Connection Establishment 4. Implementing Connection Establishment
The process of establishing a network connection begins when an The process of establishing a network connection begins when an
application expresses intent to communicate with a Remote Endpoint by application expresses intent to communicate with a Remote Endpoint by
calling Initiate, at which point the Preconnection object contains calling Initiate, at which point the Preconnection object contains
all constraints or requirements the application has configured. The all constraints or requirements the application has configured. The
establishment process can be considered complete once there is at establishment process can be considered complete once there is at
least one Protocol Stack that has completed any required setup to the least one Protocol Stack that has completed any required setup to the
point that it can transmit and receive the application's data. point that it can transmit and receive the application's data.
Connection establishment is divided into two top-level steps: Connection establishment is divided into two top-level steps:
Candidate Gathering (defined in Section 4.2.1 of
[I-D.ietf-taps-arch]), to identify the paths, protocols, and * Candidate Gathering (defined in Section 4.2.1 of [RFC9621]) to
endpoints to use (see Section 4.2); and Candidate Racing (defined in identify the paths, protocols, and endpoints to use (see
Section 4.2.2 of [I-D.ietf-taps-arch]), in which the necessary Section 4.2) and
protocol handshakes are conducted so that the Transport Services
system can select which set to use (see Section 4.3). Candidate * Candidate Racing (defined in Section 4.2.2 of [RFC9621]), in which
Racing involves attempting multiple options for connection the necessary protocol handshakes are conducted so that the
establishment, and choosing the first option to succeed as the Transport Services System can select which set to use (see
Protocol Stack to use for the connection. These attempts are usually Section 4.3).
staggered, starting each next option after a delay, but they can also
be performed in parallel or only after waiting for failures. Candidate Racing involves attempting multiple options for Connection
establishment and choosing the first option to succeed as the
Protocol Stack to use for the Connection. These attempts are usually
staggered, with each next option starting after a delay; however,
they can also be performed in parallel or after failures occur.
For ease of illustration, this document structures the candidates for For ease of illustration, this document structures the candidates for
racing as a tree (see Section 4.1). This is not meant to restrict racing as a tree (see Section 4.1). This is not meant to restrict
implementations from structuring racing candidates differently. implementations from structuring racing candidates differently.
The most simple example of this process might involve identifying the The simplest example of this process might involve identifying the
single IP address to which the implementation wishes to connect, single IP address to which the implementation wishes to connect,
using the system's current default path (i.e., using the default using the system's current default path (i.e., using the default
interface), and starting a TCP handshake to establish a stream to the interface), and starting a TCP handshake to establish a stream to the
specified IP address. However, each step may also differ depending specified IP address. However, each step may also differ depending
on the requirements of the connection: if the Endpoint Identifier is on the requirements of the connection:
a hostname and port, then there may be multiple resolved addresses
that are available; there may also be multiple paths available, (in
this case using an interface other than the default system
interface); and some protocols may not need any transport handshake
to be considered "established" (such as UDP), while other connections
may utilize layered protocol handshakes, such as TLS over TCP.
Whenever an implementation has multiple options for connection * if the Endpoint Identifier is a hostname and port, then there may
establishment, it can view the set of all individual connection be multiple resolved addresses that are available;
establishment options as a single, aggregate connection
establishment. The aggregate set conceptually includes every valid
combination of endpoints, paths, and protocols. As an example,
consider an implementation that initiates a TCP connection to a
hostname + port Endpoint Identifier, and has two valid interfaces
available (Wi-Fi and LTE). The hostname resolves to a single IPv4
address on the Wi-Fi network, and resolves to the same IPv4 address
on the LTE network, as well as a single IPv6 address. The aggregate
set of connection establishment options can be viewed as follows:
Aggregate [Endpoint Identifier: www.example.com:443] [Interface: Any] [Protocol: TCP] * there may also be multiple paths available (in this case using an
|-> [Endpoint Identifier: [2001:db8:23::1]:443] [Interface: Wi-Fi] [Protocol: TCP] interface other than the default system interface); and
|-> [Endpoint Identifier: 192.0.2.1:443] [Interface: LTE] [Protocol: TCP]
|-> [Endpoint Identifier: [2001:db8:42::1]:443] [Interface: LTE] [Protocol: TCP]
Any one of these sub-entries on the aggregate connection attempt * some protocols may not need any transport handshake to be
would satisfy the original application intent. The concern of this considered "established" (such as UDP), while other connections
section is the algorithm defining which of these options to try, may utilize layered protocol handshakes, such as TLS over TCP.
when, and in what order.
Whenever an implementation has multiple options for Connection
establishment, it can view the set of all individual Connection
establishment options as a single aggregate Connection establishment.
The aggregate set conceptually includes every valid combination of
endpoints, paths, and protocols. As an example, consider an
implementation that initiates a TCP connection to a hostname + port
Endpoint Identifier and that has two valid interfaces available (Wi-
Fi and LTE). The hostname resolves to a single IPv4 address on the
Wi-Fi network, to the same IPv4 address on the LTE network, and to a
single IPv6 address. The aggregate set of Connection establishment
options can be viewed as follows, with the Endpoint Identifier
abbreviated as "EId":
Aggregate [EId: example.com:443] [Interface: Any] [Protocol: TCP]
|-> [EId: [3fff:23::1]:443] [Interface: Wi-Fi] [Protocol: TCP]
|-> [EId: 192.0.2.1:443] [Interface: LTE] [Protocol: TCP]
|-> [EId: [3fff:42::1]:443] [Interface: LTE] [Protocol: TCP]
Any one of these subentries on the aggregate connection attempt would
satisfy the original application intent. The concern of this section
is the algorithm defining which of these options to try, when to try
them, and in what order.
During Candidate Gathering (Section 4.2), an implementation prunes During Candidate Gathering (Section 4.2), an implementation prunes
and sorts branches according to the Selection Property preferences and sorts branches according to the Selection Property Preferences
(Section 6.2 of [I-D.ietf-taps-interface]. It first excludes all (Section 6.2 of [RFC9622]). First, it excludes all protocols and
protocols and paths that match a Prohibit property or do not match paths that match a prohibited Property or do not match all required
all Require properties. Then it will sort branches according to Properties. Then, it will sort branches according to preferred
Preferred properties, Avoided properties, and possibly other Properties, avoided Properties, and, possibly, other criteria.
criteria.
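As a rough illustration of the aggregate view described in this section, the following Python sketch flattens per-interface resolution results into individual candidates. The Candidate type, the aggregate_candidates function, and the resolution data are hypothetical names chosen for this example; they are not part of the Transport Services API.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    endpoint: str   # resolved address and port
    interface: str  # network path, labeled here by interface
    protocol: str


def aggregate_candidates(resolved: Dict[str, List[str]],
                         protocols: List[str]) -> List[Candidate]:
    """Flatten per-interface resolution results into single candidates."""
    return [
        Candidate(endpoint, interface, protocol)
        for interface, endpoints in resolved.items()
        for endpoint in endpoints
        for protocol in protocols
    ]


# Assumed resolution results, one list of addresses per interface.
resolved = {
    "Wi-Fi": ["[3fff:23::1]:443"],
    "LTE": ["192.0.2.1:443", "[3fff:42::1]:443"],
}
candidates = aggregate_candidates(resolved, ["TCP"])
for candidate in candidates:
    print(candidate)
```

Any one of the resulting candidates would satisfy the application's request; pruning and ordering them is a separate step.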
4.1. Structuring Candidates as a Tree 4.1. Structuring Candidates as a Tree
As noted above, the consideration of multiple candidates in a As noted above, the consideration of multiple candidates in a
gathering and racing process can be conceptually structured as a gathering and racing process can be conceptually structured as a
tree; this terminological convention is used throughout this tree; this terminological convention is used throughout this
document. document.
Each leaf node of the tree represents a single, coherent connection Each leaf node of the tree represents a single coherent connection
attempt, with an endpoint, a network path, and a set of protocols attempt with an endpoint, a network path, and a set of protocols that
that can directly negotiate and send data on the network. Each node can directly negotiate and send data on the network. Each node in
in the tree that is not a leaf represents a connection attempt that the tree that is not a leaf represents a connection attempt that is
is either underspecified, or else includes multiple distinct options. either underspecified or includes multiple distinct options. For
For example, when connecting on an IP network, a connection attempt example, when connecting on an IP network, a connection attempt to a
to a hostname and port is underspecified, because the connection hostname and port is underspecified because the connection attempt
attempt requires a resolved IP address as its Remote Endpoint requires a resolved IP address as its Remote Endpoint Identifier. In
Identifier. In this case, the node represented by the connection this case, the node represented by the connection attempt to the
attempt to the hostname is a parent node, with child nodes for each hostname is a parent node with child nodes for each IP address.
IP address. Similarly, an implementation that is allowed to connect Similarly, an implementation that is allowed to connect using
using multiple interfaces will have a parent node of the tree for the multiple interfaces will have a parent node of the tree for the
decision between the network paths, with a branch for each interface. decision between the network paths with a branch for each interface.
The example aggregate connection attempt above can be drawn as a tree The example aggregate connection attempt above can be drawn as a tree
by grouping the addresses resolved on the same interface into by grouping the addresses resolved on the same interface into
branches: branches:
                                   ||
                    +==============================+
                    | www.example.com:443/any path |
                    +==============================+
                      //                        \\
+===========================+               +===========================+
| www.example.com:443/Wi-Fi |               | www.example.com:443/LTE   |
+===========================+               +===========================+
              ||                              //                    \\
+============================+  +=====================+  +==========================+
| [2001:db8:23::1]:443/Wi-Fi |  |  192.0.2.1:443/LTE  |  | [2001:db8:42::1]:443/LTE |
+============================+  +=====================+  +==========================+

                             ||
               +============================+
                www.example.com:443/any path
               +============================+
                 //                      \\
+=========================+        +=======================+
 www.example.com:443/Wi-Fi          www.example.com:443/LTE
+=========================+        +=======================+
            ||                       //                 \\
+======================+  +=================+  +====================+
 [3fff:23::1]:443/Wi-Fi    192.0.2.1:443/LTE    [3fff:42::1]:443/LTE
+======================+  +=================+  +====================+
The rest of this section will use a notation scheme to represent this The rest of this section will use a notation scheme to represent this
tree. The root node (or parent node) of the tree will be represented tree. The root node (or parent node) of the tree will be represented
by a single integer, such as "1". ("1" is used assuming that this is by a single integer, such as "1". ("1" is used assuming that this is
the first connection made by the system; future connections created the first connection made by the system; future connections created
by the application would allocate numbers in an increasing manner.) by the application would allocate numbers in an increasing manner.)
Each child of that node will have an integer that identifies it, from Each child of that node will have an integer that identifies it, from
1 to the number of children. That child node will be uniquely 1 to the number of children. That child node will be uniquely
identified by concatenating its integer to its parent's identifier identified by concatenating its integer to its parent's identifier
with a dot in between, such as "1.1" and "1.2". Each node will be with a dot character (".") in between, such as "1.1" and "1.2". Each
summarized by a tuple of three elements: endpoint, path (labeled here node will be summarized by a tuple of three elements: endpoint, path
by interface), and protocol. In Protocol Stacks, the layers are (labeled here by interface), and protocol. In Protocol Stacks, the
separated by '/' and ordered with the protocol closest to the layers are separated by a slash character ("/") and ordered with the
application first. The above example can now be written more protocol closest to the application first. The above example can now
succinctly as: be written more succinctly as:
1 [www.example.com:443, any path, TCP] 1 [www.example.com:443, any path, TCP]
1.1 [www.example.com:443, Wi-Fi, TCP] 1.1 [www.example.com:443, Wi-Fi, TCP]
1.1.1 [[2001:db8:23::1]:443, Wi-Fi, TCP] 1.1.1 [[2001:db8:23::1]:443, Wi-Fi, TCP]
1.2 [www.example.com:443, LTE, TCP] 1.2 [www.example.com:443, LTE, TCP]
1.2.1 [192.0.2.1:443, LTE, TCP] 1.2.1 [192.0.2.1:443, LTE, TCP]
1.2.2 [[2001:db8:42::1]:443, LTE, TCP] 1.2.2 [[2001:db8:42::1]:443, LTE, TCP]
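The dotted notation can be generated mechanically from a tree. The sketch below shows one possible way to do so in Python; the Node class and the render function are illustrative assumptions, not structures defined by this document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    endpoint: str
    path: str      # labeled here by interface
    protocol: str  # Protocol Stack, layers separated by "/"
    children: List["Node"] = field(default_factory=list)


def render(node: Node, ident: str = "1") -> List[str]:
    """Return one line per node, each parent before its children."""
    lines = [f"{ident} [{node.endpoint}, {node.path}, {node.protocol}]"]
    for index, child in enumerate(node.children, start=1):
        # A child's identifier is its parent's identifier, a dot, and
        # the child's position among its siblings (counted from 1).
        lines.extend(render(child, f"{ident}.{index}"))
    return lines


name = "www.example.com:443"
root = Node(name, "any path", "TCP", [
    Node(name, "Wi-Fi", "TCP", [
        Node("[2001:db8:23::1]:443", "Wi-Fi", "TCP"),
    ]),
    Node(name, "LTE", "TCP", [
        Node("192.0.2.1:443", "LTE", "TCP"),
        Node("[2001:db8:42::1]:443", "LTE", "TCP"),
    ]),
])
print("\n".join(render(root)))
```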
When an implementation is asked to establish a single connection, When an implementation is asked to establish a single connection,
only one of the leaf nodes in the candidate set is needed to transfer only one of the leaf nodes in the candidate set is needed to transfer
data. Thus, once a single leaf node becomes ready to use, then the data. Thus, once a single leaf node becomes ready to use, the
connection establishment tree is considered ready. One way to Connection establishment tree is considered ready. One way to
implement this is by having every leaf node update the state of its implement this is by having every leaf node update the state of its
parent node when it becomes ready, until the root node of the tree is parent node when it becomes ready until the root node of the tree is
ready, which then notifies the application that the Connection as a ready, which then notifies the application that the Connection as a
whole is ready to use. whole is ready to use.
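The readiness propagation described here can be sketched as follows. This is a minimal Python illustration under assumed names (AttemptNode, mark_ready, and the on_ready callback are all hypothetical); a real implementation would also need to handle failures and cancel the remaining attempts.

```python
class AttemptNode:
    """One node in a connection establishment tree (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.ready = False

    def mark_ready(self, on_ready):
        """Mark this node ready and propagate the state toward the root.

        Propagation stops at the first ancestor that is already ready,
        so on_ready is invoked at most once per tree.
        """
        node = self
        while node is not None and not node.ready:
            node.ready = True
            if node.parent is None:
                on_ready(node)  # the root is ready: notify the application
            node = node.parent


root = AttemptNode("1")
wifi = AttemptNode("1.1", root)
lte = AttemptNode("1.2", root)
leaf_a = AttemptNode("1.1.1", wifi)
leaf_b = AttemptNode("1.2.1", lte)

notified = []
leaf_a.mark_ready(notified.append)
leaf_b.mark_ready(notified.append)  # a later leaf does not notify again
```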
A connection establishment tree may consist of only a single node, A Connection establishment tree may consist of only a single node,
such as a connection attempt to an IP address over a single interface such as a connection attempt to an IP address over a single interface
with a single protocol. with a single protocol.
1 [[2001:db8:23::1]:443, Wi-Fi, TCP] 1 [[2001:db8:23::1]:443, Wi-Fi, TCP]
A root node may also only have one child (or leaf) node, such as A root node may also only have one child (or leaf) node, such as
when a hostname resolves to only a single IP address. when a hostname resolves to only a single IP address.
1 [www.example.com:443, Wi-Fi, TCP] 1 [www.example.com:443, Wi-Fi, TCP]
1.1 [[2001:db8:23::1]:443, Wi-Fi, TCP] 1.1 [[2001:db8:23::1]:443, Wi-Fi, TCP]
4.1.1. Branch Types 4.1.1. Branch Types
There are three types of branching from a parent node into one or There are three types of branching from a parent node into one or
more child nodes. Any parent node of the tree must only use one type more child nodes: Derived Endpoints, network paths, and protocol
of branching. options. Any parent node of the tree must use only one type of
branching.
4.1.1.1. Derived Endpoints 4.1.1.1. Derived Endpoints
If a connection originally targets a single Endpoint Identifer, there If a connection originally targets a single Endpoint Identifier,
may be multiple endpoint candidates of different types that can be there may be multiple endpoint candidates of different types that can
derived from the original. This creates an ordered list of the be derived from the original. This creates an ordered list of the
derived endpoint candidates according to application preference, derived endpoint candidates according to application preference,
system policy and expected performance. System Policy, and expected performance.
DNS hostname-to-address resolution is the most common method of DNS hostname-to-address resolution is the most common method of
endpoint derivation. When trying to connect to a hostname Endpoint endpoint derivation. When trying to connect to a hostname Endpoint
Identifer on a traditional IP network, the implementation should send Identifier on an IP network, the implementation should send all
all applicable DNS queries. Commonly, this will include both A applicable DNS queries. Commonly, this will include both A (IPv4)
(IPv4) and AAAA (IPv6) records if both address families are supported and AAAA (IPv6) records if both address families are supported on the
on the local interface. This can also include SRV records [RFC2782], local interface. This can also include SRV records [RFC2782], SVCB
SVCB and HTTPS records [I-D.ietf-dnsop-svcb-https], or other future and HTTPS records [RFC9460], or other future record types. The
record types. The algorithm for ordering and racing these addresses algorithm for ordering and racing these addresses should follow the
should follow the recommendations in Happy Eyeballs [RFC8305]. recommendations in Happy Eyeballs [RFC8305].
1 [www.example.com:443, Wi-Fi, TCP] 1 [www.example.com:443, Wi-Fi, TCP]
1.1 [[2001:db8::1]:443, Wi-Fi, TCP] 1.1 [[2001:db8::1]:443, Wi-Fi, TCP]
1.2 [192.0.2.1:443, Wi-Fi, TCP] 1.2 [192.0.2.1:443, Wi-Fi, TCP]
1.3 [[2001:db8::2]:443, Wi-Fi, TCP] 1.3 [[2001:db8::2]:443, Wi-Fi, TCP]
1.4 [[2001:db8::3]:443, Wi-Fi, TCP] 1.4 [[2001:db8::3]:443, Wi-Fi, TCP]
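The ordering in this example alternates between address families. The Python sketch below shows a simplified interleaving in that spirit. It is not the full Happy Eyeballs algorithm: it assumes the addresses are already sorted within each family and ignores resolution timing. The function name interleave_families is hypothetical.

```python
import ipaddress
from itertools import zip_longest


def interleave_families(addresses, preferred_version=6):
    """Alternate between address families, preferred family first."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    first = [a for a in parsed if a.version == preferred_version]
    second = [a for a in parsed if a.version != preferred_version]
    ordered = []
    for a, b in zip_longest(first, second):
        if a is not None:
            ordered.append(str(a))
        if b is not None:
            ordered.append(str(b))
    return ordered


ordered = interleave_families(
    ["192.0.2.1", "2001:db8::1", "2001:db8::2", "2001:db8::3"]
)
print(ordered)
```

With these inputs, the first IPv6 address is followed by the single IPv4 address and then by the remaining IPv6 addresses, which matches the order of the child nodes in the listing.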
DNS-Based Service Discovery [RFC6763] can also provide an endpoint DNS-Based Service Discovery [RFC6763] can also provide an endpoint
derivation step. When trying to connect to a named service, the derivation step. When trying to connect to a named service, the
client may discover one or more hostname and port pairs on the local client may discover one or more hostname and port pairs on the local
skipping to change at page 11, line 37 skipping to change at line 496
addresses, which would create multiple layers of branching. addresses, which would create multiple layers of branching.
1 [term-printer._ipp._tcp.meeting.example.com, Wi-Fi, TCP] 1 [term-printer._ipp._tcp.meeting.example.com, Wi-Fi, TCP]
1.1 [term-printer.meeting.example.com:631, Wi-Fi, TCP] 1.1 [term-printer.meeting.example.com:631, Wi-Fi, TCP]
1.1.1 [31.133.160.18:631, Wi-Fi, TCP] 1.1.1 [31.133.160.18:631, Wi-Fi, TCP]
Applications can influence which derived Endpoints are allowed and Applications can influence which derived Endpoints are allowed and
preferred via Selection Properties set on the Preconnection. For preferred via Selection Properties set on the Preconnection. For
example, setting a preference for useTemporaryLocalAddress would example, setting a preference for useTemporaryLocalAddress would
prefer the use of IPv6 over IPv4, and requiring prefer the use of IPv6 over IPv4, and requiring
useTemporaryLocalAddress would eliminate IPv4 options, since IPv4 useTemporaryLocalAddress would eliminate IPv4 options since IPv4 does
does not support temporary addresses. not support temporary addresses.
4.1.1.2. Network Paths 4.1.1.2. Network Paths
If a client has multiple network paths available to it, e.g., a If a client has multiple network paths available to it, e.g., a
mobile client with interfaces for both Wi-Fi and Cellular mobile client with interfaces for both Wi-Fi and Cellular
connectivity, it can attempt a connection over any of the paths. connectivity, it can attempt a connection over any of the paths.
This represents a branch point in the connection establishment. This represents a branch point in the Connection establishment.
Similar to a derived endpoint, the paths should be ranked based on Similar to a derived endpoint, the paths should be ranked based on
preference, system policy, and performance. Attempts should be preference, policy, and performance. Attempts should be started on
started on one path (e.g., a specific interface), and then one path (e.g., a specific interface) and then successively on other
successively on other paths (or interfaces) after delays based on the paths (or interfaces) after delays based on the expected path RTT or
expected path round-trip-time or other available metrics. other available metrics.
1 [192.0.2.1:443, any path, TCP] 1 [192.0.2.1:443, any path, TCP]
1.1 [192.0.2.1:443, Wi-Fi, TCP] 1.1 [192.0.2.1:443, Wi-Fi, TCP]
1.2 [192.0.2.1:443, LTE, TCP] 1.2 [192.0.2.1:443, LTE, TCP]
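One way to derive staggered start times for ranked paths is sketched below in Python. The policy used here, waiting a multiple of the previous path's expected RTT before starting the next attempt, is only an assumption made for illustration; the function name attempt_schedule and the multiplier are hypothetical.

```python
def attempt_schedule(ranked_paths, rtt_multiplier=2):
    """Return (path, start offset in ms) pairs for ranked paths.

    ranked_paths is a list of (name, expected RTT in ms), most
    preferred first. Each later attempt is delayed by a multiple of
    the expected RTT of the path attempted just before it.
    """
    schedule = []
    start_ms = 0
    for name, expected_rtt_ms in ranked_paths:
        schedule.append((name, start_ms))
        start_ms += rtt_multiplier * expected_rtt_ms
    return schedule


schedule = attempt_schedule([("Wi-Fi", 20), ("LTE", 60)])
print(schedule)
```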
The same approach applies to any situation in which the client is The same approach applies to any situation in which the client is
aware of multiple links or views of the network. A single interface aware of multiple links or views of the network. A single interface
may be shared by multiple network paths, each with a coherent set of may be shared by multiple network paths, each with a coherent set of
addresses, routes, DNS server, and more. A path may also represent a addresses, routes, DNS server, and more. A path may also represent a
virtual interface service such as a Virtual Private Network (VPN). virtual interface service such as a Virtual Private Network (VPN).
The list of available paths should be constrained by any requirements The list of available paths should be constrained by any requirements
the application sets, as well as by the system policy. the application sets as well as by the System Policy.
4.1.1.3. Protocol Options 4.1.1.3. Protocol Options
Differences in possible protocol compositions and options can also Differences in possible protocol compositions and options can also
provide a branching point in connection establishment. This allows provide a branching point in Connection establishment. This allows
clients to be resilient to situations in which a certain protocol is clients to be resilient to situations in which a certain protocol is
not functioning on a server or network. not functioning on a server or network.
This approach is commonly used for connections with optional proxy This approach is commonly used for connections with optional proxy
server configurations. A single connection might have several server configurations. A single connection might have several
options available: an HTTP-based proxy, a SOCKS-based proxy, or no options available: an HTTP-based proxy, a SOCKS-based proxy, or no
proxy. As above, these options should be ranked based on preference, proxy. As above, these options should be ranked based on preference,
system policy, and performance and attempted in succession. System Policy, and performance, and should be attempted in
succession.
1 [www.example.com:443, any path, HTTP/TCP] 1 [www.example.com:443, any path, HTTP/TCP]
1.1 [192.0.2.8:443, any path, HTTP/HTTP Proxy/TCP] 1.1 [192.0.2.8:443, any path, HTTP/HTTP Proxy/TCP]
1.2 [192.0.2.7:10234, any path, HTTP/SOCKS/TCP] 1.2 [192.0.2.7:10234, any path, HTTP/SOCKS/TCP]
1.3 [www.example.com:443, any path, HTTP/TCP] 1.3 [www.example.com:443, any path, HTTP/TCP]
1.3.1 [192.0.2.1:443, any path, HTTP/TCP] 1.3.1 [192.0.2.1:443, any path, HTTP/TCP]
This approach also allows a client to attempt different sets of This approach also allows a client to attempt different sets of
application and transport protocols that, when available, could application and transport protocols that, when available, could
provide preferable features. For example, the protocol options could provide preferable features. For example, the protocol options could
involve QUIC [RFC9000] over UDP on one branch, and HTTP/2 [RFC7540] involve QUIC [RFC9000] over UDP on one branch and HTTP/2 [RFC9113]
over TLS over TCP on the other: over TLS over TCP on the other:
1 [www.example.com:443, any path, HTTP] 1 [www.example.com:443, any path, HTTP]
1.1 [www.example.com:443, any path, HTTP3/QUIC/UDP] 1.1 [www.example.com:443, any path, HTTP3/QUIC/UDP]
1.1.1 [192.0.2.1:443, any path, HTTP3/QUIC/UDP] 1.1.1 [192.0.2.1:443, any path, HTTP3/QUIC/UDP]
1.2 [www.example.com:443, any path, HTTP2/TLS/TCP] 1.2 [www.example.com:443, any path, HTTP2/TLS/TCP]
1.2.1 [192.0.2.1:443, any path, HTTP2/TLS/TCP] 1.2.1 [192.0.2.1:443, any path, HTTP2/TLS/TCP]
Another example is racing SCTP with TCP: Another example is racing SCTP with TCP:
1 [www.example.com:4740, any path, reliable-inorder-stream] 1 [www.example.com:4740, any path, reliable-inorder-stream]
1.1 [www.example.com:4740, any path, SCTP] 1.1 [www.example.com:4740, any path, SCTP]
1.1.1 [192.0.2.1:4740, any path, SCTP] 1.1.1 [192.0.2.1:4740, any path, SCTP]
1.2 [www.example.com:4740, any path, TCP] 1.2 [www.example.com:4740, any path, TCP]
1.2.1 [192.0.2.1:4740, any path, TCP] 1.2.1 [192.0.2.1:4740, any path, TCP]
Implementations that support racing protocols and protocol options Implementations that support racing protocols and protocol options
should maintain a history of which protocols and protocol options should maintain a history of which protocols and protocol options
were successfully established, on a per-network and per-endpoint were successfully established on a per-network and per-endpoint basis
basis (see Section 9.2). This information can influence future (see Section 9.2). This information can influence future racing
racing decisions to prioritize or prune branches. decisions to prioritize or prune branches.
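A history of this kind could be kept in a small cache keyed by network and endpoint. The following Python sketch is one possible shape for it; the ProtocolHistory class and its methods are hypothetical, and a real cache would also need expiry and persistence policies.

```python
from collections import defaultdict


class ProtocolHistory:
    """Remember which protocols worked per (network, endpoint) pair."""

    def __init__(self):
        self._results = defaultdict(dict)

    def record(self, network, endpoint, protocol, succeeded):
        self._results[(network, endpoint)][protocol] = succeeded

    def prioritize(self, network, endpoint, protocols):
        """Order protocols: known good, then unknown, then known bad."""
        known = self._results.get((network, endpoint), {})

        def rank(protocol):
            outcome = known.get(protocol)
            if outcome is True:
                return 0
            return 1 if outcome is None else 2

        # sorted() is stable, so the incoming order is kept within a rank.
        return sorted(protocols, key=rank)


history = ProtocolHistory()
history.record("Wi-Fi", "www.example.com:443", "QUIC", False)
history.record("Wi-Fi", "www.example.com:443", "TCP", True)
order = history.prioritize("Wi-Fi", "www.example.com:443",
                           ["QUIC", "SCTP", "TCP"])
print(order)
```

An implementation could also drop the known-bad entries entirely, which corresponds to pruning rather than reprioritizing the branch.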
4.1.2. Branching Order-of-Operations 4.1.2. Branching Order-of-Operations
Branch types ought to occur in a specific order relative to one Branch types ought to occur in a specific order relative to one
another to avoid creating leaf nodes with invalid or incompatible another to avoid creating leaf nodes with invalid or incompatible
settings. In the example above, it would be invalid to branch for settings. In the example above, it would be invalid to branch for
derived endpoints (the DNS results for www.example.com) before derived endpoints (the DNS results for www.example.com) before
branching between interface paths, since there are situations when branching between interface paths since there are situations when the
the results will be different across networks due to private names or results will be different across networks due to private names or
different supported IP versions. Implementations need to be careful different supported IP versions. Implementations need to be careful
to branch in a consistent order that results in usable leaf nodes to branch in a consistent order that results in usable leaf nodes
whenever there are multiple branch types that could be used from a whenever there are multiple branch types that could be used from a
single node. single node.
This document recommends the following order of operations for This document recommends the following order of operations for
branching: branching:
1. Network Paths 1. Network paths
2. Protocol Options 2. Protocol options
3. Derived Endpoints 3. Derived Endpoints
where a lower number indicates higher precedence and therefore higher where a lower number indicates higher precedence and, therefore,
placement in the tree. Branching between paths is the first in the higher placement in the tree. Branching between paths is the first
list because results across multiple interfaces are likely not in the list because results across multiple interfaces are likely not
related to one another: endpoint resolution may return different related to one another: endpoint resolution may return different
results, especially when using locally resolved host and service results, especially when using locally resolved host and service
names, and which protocols are supported and preferred may differ names and the protocols that are supported and preferred may differ
across interfaces. Thus, if multiple paths are attempted, the across interfaces. Thus, if multiple paths are attempted, the
overall connection establishment process can be seen as a race overall Connection establishment process can be seen as a race
between the available paths or interfaces. between the available paths or interfaces.
Protocol options are next checked in order. Whether or not a set of Protocol options are next checked in order. Whether or not a set of
protocols, or protocol-specific options, can successfully connect is protocols, or protocol-specific options, can successfully connect is
generally not dependent on which specific IP address is used. generally not dependent on which specific IP address is used.
Furthermore, the Protocol Stacks being attempted may influence or Furthermore, the Protocol Stacks being attempted may influence or
altogether change the Endpoint Identifers being used. Adding a proxy altogether change the Endpoint Identifiers being used. Adding a
to a connection's branch will change the Endpoint Identifer to the proxy to a connection's branch will change the Endpoint Identifier to
proxy's IP address or hostname. Choosing an alternate protocol may the proxy's IP address or hostname. Choosing an alternate protocol
also modify the ports that should be selected. may also modify the ports that should be selected.
Branching for derived endpoints is the final step, and may have Branching for derived endpoints is the final step and may have
multiple layers of derivation or resolution, such as DNS service multiple layers of derivation or resolution, such as DNS service
resolution and DNS hostname resolution. resolution and DNS hostname resolution.
For example, if the application has indicated both a preference for For example, if the application has indicated both a preference for
WiFi over LTE and for a feature only available in SCTP, branches will Wi-Fi over LTE and for a feature only available in SCTP, branches
be first sorted accord to path selection, with WiFi attempted first. will first be sorted according to path selection, with Wi-Fi
Then, branches with SCTP will be attempted first within their subtree attempted as the first path. Then, branches with SCTP will be
according to the properties influencing protocol selection. However, attempted within their subtree according to the Properties
if the implementation has current cache information that SCTP is not influencing protocol selection. However, if the implementation has
available on the path over WiFi, there would be no SCTP node in the current cache information that SCTP is not available on the path over
WiFi subtree. Here, the path over WiFi will be attempted first, and, Wi-Fi, there would be no SCTP node in the Wi-Fi subtree. Here, the
if connection establishment succeeds, TCP will be used. Thus, the path over Wi-Fi will be attempted first, and, if connection
Selection Property preferring WiFi takes precedence over the Property establishment succeeds, TCP will be used. Thus, the Selection
that led to a preference for SCTP. Property preferring Wi-Fi takes precedence over the Property that led
to a preference for SCTP.
1 [www.example.com:443, any path, reliable-inorder-stream] 1 [www.example.com:443, any path, reliable-inorder-stream]
1.1 [192.0.2.1:443, Wi-Fi, reliable-inorder-stream] 1.1 [192.0.2.1:443, Wi-Fi, reliable-inorder-stream]
1.1.1 [192.0.2.1:443, Wi-Fi, TCP] 1.1.1 [192.0.2.1:443, Wi-Fi, TCP]
1.2 [192.0.3.1:443, LTE, reliable-inorder-stream] 1.2 [192.0.3.1:443, LTE, reliable-inorder-stream]
1.2.1 [192.0.3.1:443, LTE, SCTP] 1.2.1 [192.0.3.1:443, LTE, SCTP]
1.2.2 [192.0.3.1:443, LTE, TCP] 1.2.2 [192.0.3.1:443, LTE, TCP]
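The recommended branching order can be expressed as three nested loops. The Python sketch below reproduces the leaf nodes of this example under stated assumptions: the function build_leaves and its two callbacks are hypothetical, the inputs are already sorted by preference, and the endpoint is assumed not to depend on the chosen protocol (which would not hold for proxies).

```python
def build_leaves(paths, protocols_for, resolve):
    """Enumerate leaves in the order paths, protocols, endpoints."""
    leaves = []
    for path in paths:                         # 1. network paths
        for protocol in protocols_for(path):   # 2. protocol options
            for endpoint in resolve(path):     # 3. derived endpoints
                leaves.append((endpoint, path, protocol))
    return leaves


# Assumed cache state: SCTP is known not to work over Wi-Fi.
protocol_options = {"Wi-Fi": ["TCP"], "LTE": ["SCTP", "TCP"]}
# Assumed per-path resolution results.
addresses = {"Wi-Fi": ["192.0.2.1:443"], "LTE": ["192.0.3.1:443"]}

leaves = build_leaves(["Wi-Fi", "LTE"],
                      protocol_options.__getitem__,
                      addresses.__getitem__)
for leaf in leaves:
    print(leaf)
```

Because paths form the outermost loop, the Wi-Fi leaf using TCP is attempted before any SCTP leaf on LTE.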
4.1.3. Sorting Branches 4.1.3. Sorting Branches
Implementations should sort the branches of the tree of connection Implementations should sort the branches of the tree of connection
options in order of their preference rank, from most preferred to options in order of their preference rank from most preferred to
least preferred as specified by Selection Properties least preferred as specified by Selection Properties [RFC9622]. Leaf
[I-D.ietf-taps-interface]. Leaf nodes on branches with higher nodes on branches with higher rankings represent connection attempts
rankings represent connection attempts that will be raced first. that will be raced first.
In addition to the properties provided by the application, an In addition to the Properties provided by the application, an
implementation may include additional criteria such as cached implementation may include additional criteria such as cached
performance estimates, see Section 9.2, or system policy, see performance estimates (see Section 9.2) or System Policy (see
Section 3.2, in the ranking. Two examples of how Selection and Section 3.2) in the ranking. Two examples of how Selection and
Connection Properties may be used to sort branches are provided Connection Properties may be used to sort branches are provided
below: below:
* "Interface Instance or Type" (property name interface): If the "Interface Instance or Type" (Property name interface):
application specifies an interface type to be preferred or If the application specifies an interface type to be preferred or
avoided, implementations should accordingly rank the paths. If avoided, implementations should accordingly rank the paths. If
the application specifies an interface type to be required or the application specifies an interface type to be required or
prohibited, an implementation is expected to exclude the non- prohibited, an implementation is expected to exclude the
conforming paths. nonconforming paths.
* "Capacity Profile" (property name connCapacityProfile): An "Capacity Profile" (Property name connCapacityProfile):
implementation can use the capacity profile to prefer paths that An implementation can use the capacity profile to prefer paths
match an application's expected traffic profile. This match will that match an application's expected traffic profile. This match
use cached performance estimates, see Section 9.2. Some examples will use cached performance estimates; see Section 9.2. Some
of path preferences based on capacity profiles include: examples of path preferences based on capacity profiles include:
- Low Latency/Interactive: Prefer paths with the lowest expected Low Latency/Interactive: Prefer paths with the lowest expected
Round Trip Time, based on observed Round Trip Time estimates; Round-Trip Time (RTT), based on observed RTT estimates;
- Low Latency/Non-Interactive: Prefer paths with a low expected Low Latency/Non-Interactive: Prefer paths with a low expected
Round Trip Time, but can tolerate delay variation; Round-Trip Time (RTT) and possible delay variation;
- Constant-Rate Streaming: Prefer paths that are expected to Constant-Rate Streaming: Prefer paths that are expected to
satisfy the requested stream send or receive bitrate, based on satisfy the requested stream send or receive bitrate based on
the observed maximum throughput; the observed maximum throughput;
- Capacity-Seeking: Prefer adapting to paths to determine the Capacity-Seeking: Prefer adapting to paths to determine the
highest available capacity, based on the observed maximum highest available capacity based on the observed maximum
throughput. throughput.
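The capacity profile preferences listed above could be applied to cached estimates roughly as follows. This Python sketch uses made-up profile labels and field names (rtt_ms, max_mbps) as stand-ins for whatever an implementation actually caches; none of them are defined by this document.

```python
def rank_paths(paths, profile, required_mbps=0):
    """Order paths for a capacity profile using cached estimates."""
    if profile == "low-latency":
        # Lowest expected RTT first.
        return sorted(paths, key=lambda p: p["rtt_ms"])
    if profile == "constant-rate":
        # Paths expected to satisfy the requested bitrate come first.
        return sorted(paths, key=lambda p: p["max_mbps"] < required_mbps)
    if profile == "capacity-seeking":
        # Highest observed maximum throughput first.
        return sorted(paths, key=lambda p: -p["max_mbps"])
    return list(paths)


paths = [
    {"name": "Wi-Fi", "rtt_ms": 15, "max_mbps": 50},
    {"name": "LTE", "rtt_ms": 40, "max_mbps": 80},
]
by_latency = [p["name"] for p in rank_paths(paths, "low-latency")]
by_rate = [p["name"] for p in rank_paths(paths, "constant-rate", 60)]
by_capacity = [p["name"] for p in rank_paths(paths, "capacity-seeking")]
```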
As another example, branch sorting can also be influenced by bounds As another example, branch sorting can also be influenced by bounds
on the send or receive rate (Selection Properties minSendRate / on the send or receive rate (Selection Properties minSendRate /
minRecvRate / maxSendRate / maxRecvRate): if the application minRecvRate / maxSendRate / maxRecvRate): if the application
indicates a bound on the expected send or receive bitrate, an indicates a bound on the expected send or receive bitrate, an
implementation may prefer a path that can likely provide the desired implementation may prefer a path that can likely provide the desired
bandwidth, based on cached maximum throughput, see Section 9.2. The bandwidth, based on cached maximum throughput (see Section 9.2). The
application may know the send or receive bitrate from metadata in application may know the send or receive bitrate from metadata in
adaptive HTTP streaming, such as MPEG-DASH. adaptive HTTP streaming, such as MPEG-DASH.
Implementations process the Properties (Section 6.2 of Implementations process the Properties (Section 6.2 of [RFC9622]) in
[I-D.ietf-taps-interface]) in the following order: Prohibit, Require, the following order: Prohibit, Require, Prefer, Avoid. If Selection
Prefer, Avoid. If Selection Properties contain any prohibited Properties contain any prohibited Properties, the implementation
properties, the implementation should first purge branches containing should first purge branches containing nodes with these Properties.
nodes with these properties. For required properties, it should only For required Properties, it should only keep branches that satisfy
keep branches that satisfy these requirements. Finally, it should these requirements. Finally, it should order the branches according
order the branches according to the preferred properties, and finally to the preferred Properties and use any avoided Properties as a
use any avoided properties as a tiebreaker. When ordering branches, tiebreaker. When ordering branches, an implementation can give more
an implementation can give more weight to properties that the weight to Properties that the application has explicitly set rather
application has explicitly set, than to the properties that are than to the Properties that are set by default.
default.
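The processing order described in this paragraph can be sketched in Python as shown below. The representation is a deliberate simplification: each branch is modeled as a set of property labels, and the labels and the select_branches function are hypothetical. Weighting of explicitly set versus default Properties is not modeled.

```python
def select_branches(branches, preferences):
    """Apply Prohibit, Require, Prefer, Avoid, in that order.

    branches: list of (name, set of property labels).
    preferences: dict mapping a property label to one of
    "prohibit", "require", "prefer", or "avoid".
    """
    def labels(level):
        return {p for p, lvl in preferences.items() if lvl == level}

    prohibited, required = labels("prohibit"), labels("require")
    preferred, avoided = labels("prefer"), labels("avoid")

    kept = [
        (name, props) for name, props in branches
        if not (props & prohibited)   # purge prohibited branches
        and required <= props         # keep only branches meeting requirements
    ]
    # More preferred matches rank first; avoided matches break ties.
    return sorted(
        kept,
        key=lambda b: (-len(b[1] & preferred), len(b[1] & avoided)),
    )


branches = [
    ("A", {"reliable", "wifi"}),
    ("B", {"reliable", "cellular"}),
    ("C", {"wifi"}),
    ("D", {"reliable", "wifi", "metered"}),
]
preferences = {"reliable": "require", "wifi": "prefer", "metered": "avoid"}
ranked = [name for name, _ in select_branches(branches, preferences)]
print(ranked)
```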
The available protocols and paths on a specific system and in a The available protocols and paths on a specific system and in a
specific context can change; therefore, the result of sorting and the specific context can change; therefore, the result of sorting and the
outcome of racing may vary, even when using the same Selection and outcome of racing may vary, even when using the same Selection and
Connection Properties. However, an implementation ought to provide a Connection Properties. However, an implementation ought to provide a
consistent outcome to applications, e.g., by preferring protocols and consistent outcome to applications, e.g., by preferring protocols and
paths that are already used by existing Connections that specified paths that are already used by existing Connections that specified
similar Properties. similar Properties.
4.2. Candidate Gathering 4.2. Candidate Gathering
The step of gathering candidates involves identifying which paths, The step of gathering candidates involves identifying which paths,
protocols, and endpoints may be used for a given Connection. This protocols, and endpoints may be used for a given Connection. This
list is determined by the requirements, prohibitions, and preferences list is determined by the requirements, prohibitions, preferences,
of the application as specified in the Selection Properties. and avoidances of the application as specified in the Selection
Properties.
4.2.1. Gathering Endpoint Candidates 4.2.1. Gathering Endpoint Candidates
Both Local and Remote Endpoint Candidates must be discovered during Both Local and Remote Endpoint Candidates must be discovered during
connection establishment. To support Interactive Connectivity Connection establishment. To support Interactive Connectivity
Establishment (ICE) [RFC8445], or similar protocols that involve out- Establishment (ICE) [RFC8445], or similar protocols that involve out-
of-band indirect signalling to exchange candidates with the Remote of-band indirect signaling to exchange candidates with the Remote
Endpoint, it is important to query the set of candidate Local Endpoint, it is important to query the set of candidate Local
Endpoints, and provide the Protocol Stack with a set of candidate Endpoints and provide the Protocol Stack with a set of candidate
Remote Endpoints, before the Local Endpoint attempts to establish Remote Endpoints before the Local Endpoint attempts to establish
connections. connections.
4.2.1.1. Local Endpoint candidates 4.2.1.1. Local Endpoint Candidates
The set of possible Local Endpoints is gathered. In a simple case, The set of possible Local Endpoints is gathered. In a simple case,
this merely enumerates the local interfaces and protocols, and this merely enumerates the local interfaces and protocols and
allocates ephemeral source ports. For example, a system that has allocates ephemeral source ports. For example, a system that has Wi-
WiFi and Ethernet and supports IPv4 and IPv6 might gather four Fi and Ethernet and supports IPv4 and IPv6 might gather four
candidate Local Endpoints (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 candidate Local Endpoints (IPv4 on Ethernet, IPv6 on Ethernet, IPv4
on WiFi, and IPv6 on WiFi) that can form the source for a transient. on Wi-Fi, and IPv6 on Wi-Fi) that can form the source for a
transient.
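As a rough sketch of the simple case, the candidate set is the cross product of interfaces and IP versions. The function name and tuple layout below are assumptions made for illustration, and port 0 is used only as a conventional placeholder for "let the system allocate an ephemeral port".

```python
from itertools import product


def gather_local_candidates(interfaces, ip_versions):
    """Return one (ip_version, interface, port) candidate per combination."""
    # Port 0 is a placeholder meaning "ephemeral port chosen by the system".
    return [(ip, iface, 0) for iface, ip in product(interfaces, ip_versions)]


candidates = gather_local_candidates(["Ethernet", "Wi-Fi"], ["IPv4", "IPv6"])
```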
If NAT traversal is required, the process of gathering Local If NAT traversal is required, the process of gathering Local
Endpoints becomes broadly equivalent to the ICE Candidate Gathering Endpoints becomes broadly equivalent to the ICE Candidate Gathering
phase (see Section 5.1.1 of [RFC8445]). The endpoint determines its phase (see Section 5.1.1 of [RFC8445]). The endpoint determines its
server reflexive Local Endpoints (i.e., the translated address of a server-reflexive Local Endpoints (i.e., the translated address of a
Local Endpoint, on the other side of a NAT, e.g via a STUN sever Local Endpoint, on the other side of a NAT, e.g., via a STUN server
[RFC5389]) and relayed Local Endpoints (e.g., via a TURN server [RFC8489]) and relayed Local Endpoints (e.g., via a TURN server
[RFC5766] or other relay), for each interface and network protocol. [RFC8656] or other relay) for each interface and network protocol.
These are added to the set of candidate Local Endpoint Identifers for These are added to the set of candidate Local Endpoint Identifiers
this connection. for this connection.
Gathering Local Endpoints is primarily a local operation, although it Gathering Local Endpoints is primarily a local operation, although it
might involve exchanges with a STUN server to derive server reflexive might involve exchanges with a STUN server to derive server-reflexive
Local Endpoints, or with a TURN server or other relay to derive Local Endpoints or with a TURN server or other relay to derive
relayed Local Endpoints. However, it does not involve communication relayed Local Endpoints. However, it does not involve communication
with the Remote Endpoint. with the Remote Endpoint.
4.2.1.2. Remote Endpoint Candidates 4.2.1.2. Remote Endpoint Candidates
The Remote Endpoint Identifer is typically a name that needs to be The Remote Endpoint Identifier is typically a name that needs to be
resolved into a set of possible addresses that can be used for resolved into a set of possible addresses that can be used for
communication. Resolving the Remote Endpoint is the process of communication. Resolving the Remote Endpoint is the process of
recursively performing such name lookups, until fully resolved, to recursively performing such name lookups, until fully resolved, to
return the set of candidates for the Remote Endpoint of this return the set of candidates for the Remote Endpoint of this
Connection. Connection.
How this resolution is done will depend on the type of the Remote How this resolution is done will depend on the type of the Remote
Endpoint, and can also be specific to each Local Endpoint. A common Endpoint and can also be specific to each Local Endpoint. A common
case is when the Remote Endpoint Identifer is a DNS name, in which case is when the Remote Endpoint Identifier is a DNS name, in which
case it is resolved to give a set of IPv4 and IPv6 addresses case, it is resolved to give a set of IPv4 and IPv6 addresses
representing that name. Some types of Remote Endpoint Identifers representing that name. Some types of Remote Endpoint Identifiers
might require more complex resolution. Resolving the Remote Endpoint might require more complex resolution. Resolving the Remote Endpoint
for a peer-to-peer connection might involve communication with a for a peer-to-peer connection might involve communication with a
rendezvous server, which in turn contacts the peer to gain consent to rendezvous server. The server, in turn, contacts the peer to gain
communicate and retrieve its set of candidate Local Endpoints, which consent to communicate and retrieve its set of candidate Local
are returned and form the candidate remote addresses for contacting Endpoints. These Endpoints are returned and form the candidate
that peer. remote addresses for contacting that peer.
Resolving the Remote Endpoint is not a local operation. It will Resolving the Remote Endpoint is not a local operation. It will
involve a directory service, and can require communication with the involve a directory service and can require communication between the
Remote Endpoint to rendezvous and exchange peer addresses. This can Remote Endpoint and a rendezvous server as well as the exchange of
expose some or all of the candidate Local Endpoints to the Remote peer addresses. This can expose some or all of the candidate Local
Endpoint. Endpoints to the Remote Endpoint.
4.3. Candidate Racing 4.3. Candidate Racing
The primary goal of the Candidate Racing process is to successfully The primary goal of the Candidate Racing process is to successfully
negotiate a Protocol Stack to an endpoint over an interface to negotiate a Protocol Stack to an Endpoint over an interface to
connect a single leaf node of the tree with as little delay and as connect a single leaf node of the tree with as little delay and as
few unnecessary connections attempts as possible. Optimizing these few unnecessary connection attempts as possible. Optimizing these
two factors improves the user experience, while minimizing network two factors improves the user experience, while minimizing network
load. load.
This section covers the dynamic aspect of connection establishment. This section covers the dynamic aspect of Connection establishment.
The tree described above is a useful conceptual and architectural The tree described above is a useful conceptual and architectural
model. However, an implementation is unable to know all of the nodes model. However, an implementation is unable to know all of the nodes
that will be used until steps like name resolution have occurred, and that will be used until steps like name resolution have occurred;
many of the possible branches ultimately might not be attempted. many of the possible branches ultimately might not be attempted.
There are three different approaches to racing the attempts for There are three different approaches to racing the attempts for
different nodes of the connection establishment tree: different nodes of the Connection establishment tree:
1. Simultaneous 1. Simultaneous
2. Staggered 2. Staggered
3. Failover 3. Failover
Each approach is appropriate in different use-cases and branch types. Each approach is appropriate in different use cases and branch types.
However, to avoid consuming unnecessary network resources, However, to avoid consuming unnecessary network resources,
implementations should not use simultaneous racing as a default implementations should not use simultaneous racing as a default
approach. approach.
The timing algorithms for racing should remain independent across The timing algorithms for racing should remain independent across
branches of the tree. Any timer or racing logic is isolated to a branches of the tree. Any timer or racing logic is isolated to a
given parent node, and is not ordered precisely with regards to given parent node and is not ordered precisely with regard to
children of other nodes. children of other nodes.
4.3.1. Simultaneous 4.3.1. Simultaneous
Simultaneous racing is when multiple alternate branches are started Simultaneous racing is when multiple alternate branches are started
without waiting for any one branch to make progress before starting without waiting for any one branch to make progress before starting
the next alternative. This means the attempts are effectively the next alternative. This means the attempts are effectively
simultaneous. Simultaneous racing should be avoided by simultaneous. Simultaneous racing should be avoided by
implementations, since it consumes extra network resources and implementations since it consumes extra network resources and
establishes state that might not be used. establishes state that might not be used.
4.3.2. Staggered 4.3.2. Staggered
Staggered racing can be used whenever a single node of the tree has Staggered racing can be used whenever a single node of the tree has
multiple child nodes. Based on the order determined when building multiple child nodes. Based on the order determined when building
the tree, the first child node will be initiated immediately, the tree, the first child node will be initiated immediately,
followed by the next child node after some delay. Once that second followed by the next child node after some delay. Once that second
child node is initiated, the third child node (if present) will begin child node is initiated, the third child node (if present) will begin
after another delay, and so on until all child nodes have been after another delay, and so on until all child nodes have been
initiated, or one of the child nodes successfully completes its initiated or one of the child nodes successfully completes its
negotiation. negotiation.
Staggered racing attempts can proceed in parallel. Implementations Staggered racing attempts can proceed in parallel. Implementations
should not terminate an earlier child connection attempt upon should not terminate an earlier child connection attempt upon
starting a secondary child. starting a secondary child.
If a child node fails to establish connectivity (as in Section 4.4.1) If a child node fails to establish connectivity (as in Section 4.4.1)
before the delay time has expired for the next child, the next child before the delay time has expired for the next child, the next child
should be started immediately. should be started immediately.
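The staggered behavior described above can be sketched with Python's asyncio. This is a simplified illustration, not a complete Happy Eyeballs implementation: staggered_race is a hypothetical helper, each attempt is assumed to be a zero-argument coroutine function that either returns a connection or raises, and a single fixed delay is used for every child.

```python
import asyncio


async def staggered_race(attempts, delay):
    """Race attempts in preference order; return the first success or None."""
    remaining = list(attempts)
    running = set()
    try:
        while remaining or running:
            if remaining:
                # Start the next child; earlier children keep running.
                running.add(asyncio.create_task(remaining.pop(0)()))
            # Wait for the stagger delay, or indefinitely once every
            # child has been started.
            timeout = delay if remaining else None
            done, running = await asyncio.wait(
                running, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
            # Either the delay expired or a child failed early: loop
            # around and start the next child immediately.
        return None
    finally:
        # Attempts that lost the race are no longer eligible for use.
        for task in running:
            task.cancel()
```

In this sketch, a failure of any running child triggers the next start, which is a simplification of the per-child timing described above.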
Staggered racing between IP addresses for a generic Connection should Staggered racing between IP addresses for a generic Connection should
follow the Happy Eyeballs algorithm described in [RFC8305]. follow the Happy Eyeballs algorithm described in [RFC8305]. Guidance
[RFC8421] provides guidance for racing when performing Interactive for racing when performing ICE can be found in [RFC8421].
Connectivity Establishment (ICE).
Generally, the delay before starting a given child node ought to be Generally, the delay before starting a given child node ought to be
based on the length of time the previously started child node is based on the length of time the previously started child node is
expected to take before it succeeds or makes progress in connection expected to take before it succeeds or makes progress in connection
establishment. Algorithms like Happy Eyeballs choose a delay based establishment. Algorithms like Happy Eyeballs choose a delay based
on how long the transport connection handshake is expected to take. on how long the transport connection handshake is expected to take.
When performing staggered races in multiple branch types (such as When performing staggered races in multiple branch types (such as
racing between network interfaces, and then racing between IP racing between network interfaces and then racing between IP
addresses), a longer delay may be chosen for some branch types. For addresses), a longer delay may be chosen for some branch types. For
example, when racing between network interfaces, the delay should example, when racing between network interfaces, the delay should
also take into account the amount of time it takes to prepare the also take into account the amount of time it takes to prepare the
network interface (such as radio association) and name resolution network interface (such as radio association) and name resolution
over that interface, in addition to the delay that would be added for over that interface in addition to the delay that would be added for
a single transport connection handshake. a single transport connection handshake.
Since the staggered delay can be chosen based on dynamic information, Since the staggered delay can be chosen based on dynamic information,
such as predicted Round Trip Time, implementations should define such as predicted RTT, implementations should define upper and lower
upper and lower bounds for delay times. These bounds are bounds for delay times. These bounds are implementation specific and
implementation-specific, and may differ based on which branch type is may differ based on which branch type is being used.
being used.
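A minimal sketch of bounding a dynamically derived delay follows. The function name, the multiplier, and the bound values are placeholders chosen for illustration; the text above leaves the actual bounds to the implementation.

```python
def stagger_delay(predicted_rtt, lower=0.1, upper=2.0, multiplier=2.0):
    """Derive a stagger delay (seconds) from a predicted RTT, clamped."""
    # All default values here are illustrative assumptions.
    return max(lower, min(upper, predicted_rtt * multiplier))
```

An implementation could keep a separate (lower, upper) pair per branch type, for example, using larger bounds when racing between network interfaces than when racing between IP addresses.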
4.3.3. Failover 4.3.3. Failover
If an implementation or application has a strong preference for one If an implementation or application has a strong preference for one
branch over another, the branching node may choose to wait until one branch over another, the branching node may choose to wait until one
child has failed before starting the next. Failure of a leaf node is child has failed before starting the next. Failure of a leaf node is
determined by its protocol negotiation failing or timing out; failure determined by its protocol negotiation failing or timing out; failure
of a parent branching node is determined by all of its children of a parent branching node is determined by all of its children
failing. failing.
An example in which failover is recommended is a race between a An example in which failover is recommended is a race between a
preferred Protocol Stack that uses a proxy and an alternate Protocol preferred Protocol Stack that uses a proxy and an alternate Protocol
Stack that bypasses the proxy. Failover is useful in case the proxy Stack that bypasses the proxy. Failover is useful if the proxy is
is down or misconfigured, but any more aggressive type of racing may down or misconfigured, but any more aggressive type of racing may end
end up unnecessarily avoiding a proxy that was preferred by policy. up unnecessarily avoiding a proxy that was preferred by policy.
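Failover can be sketched as a strictly sequential loop. The failover function below is hypothetical, and it assumes each attempt is a callable that returns a connection on success or raises an OSError subclass when negotiation fails or times out.

```python
def failover(attempts):
    """Try each branch in order; start the next only after a failure."""
    errors = []
    for attempt in attempts:
        try:
            return attempt()
        except OSError as err:
            # This child failed; only now is the next one started.
            errors.append(err)
    # The parent node fails once all of its children have failed.
    raise ConnectionError(f"all {len(errors)} branches failed")
```

With the proxy example above, the attempts list would hold the proxied Protocol Stack first and the direct one second, so the direct path is tried only if the proxy attempt fails.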
4.4. Completing Establishment 4.4. Completing Establishment
The process of connection establishment completes when one leaf node The process of Connection establishment completes when one leaf node
of the tree has successfully completed negotiation with the Remote of the tree has successfully completed negotiation with the Remote
Endpoint, or else all nodes of the tree have failed to connect. The Endpoint or when all nodes of the tree have failed to connect. The
first leaf node to complete its connection is then used by the first leaf node to complete its connection is then used by the
application to send and receive data. This is signalled to the application to send and receive data. This is signaled to the
application using the Ready event in the API (Section 7.1 of application using the Ready event in the API (Section 7.1 of
[RFC9622]).
[I-D.ietf-taps-interface]).
Successes and failures of a given attempt should be reported up to Successes and failures of a given attempt should be reported up to
parent nodes (towards the root of the tree). For example, in the parent nodes (toward the root of the tree). For example, in the
following case, if 1.1.1 fails to connect, it reports the failure to following case, if 1.1.1 fails to connect, it reports the failure to
1.1. Since 1.1 has no other child nodes, it also has failed and 1.1. Since 1.1 has no other child nodes, it also has failed and
reports that failure to 1. Because 1.2 has not yet failed, 1 is not reports that failure to 1. Because 1.2 has not yet failed, 1 is not
considered to have failed. Since 1.2 has not yet started, it is considered to have failed. Since 1.2 has not yet started, it is
started and the process continues. Similarly, if 1.1.1 successfully started and the process continues. Similarly, if 1.1.1 successfully
connects, then it marks 1.1 as connected, which propagates to the connects, then it marks 1.1 as connected, which propagates to the
root node 1. At this point, the Connection as a whole is considered root node 1. At this point, the Connection as a whole is considered
to be successfully connected and ready to process application data. to be successfully connected and ready to process application data.
1 [www.example.com:443, Any, TCP] 1 [www.example.com:443, Any, TCP]
1.1 [www.example.com:443, Wi-Fi, TCP] 1.1 [www.example.com:443, Wi-Fi, TCP]
1.1.1 [192.0.2.1:443, Wi-Fi, TCP] 1.1.1 [192.0.2.1:443, Wi-Fi, TCP]
1.2 [www.example.com:443, LTE, TCP] 1.2 [www.example.com:443, LTE, TCP]
... ...
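The reporting of results toward the root can be sketched as follows. The Node class and its state strings are hypothetical and model only the propagation rule: a success propagates upward unconditionally, whereas a failure propagates only once every child of the parent has failed.

```python
class Node:
    """Hypothetical node in the Connection establishment tree."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        self.state = "pending"
        if parent is not None:
            parent.children.append(self)

    def report_success(self):
        # A connected child marks each ancestor up to the root as connected.
        self.state = "connected"
        if self.parent is not None:
            self.parent.report_success()

    def report_failure(self):
        self.state = "failed"
        parent = self.parent
        # A parent fails only when all of its children have failed.
        if parent is not None and all(
                child.state == "failed" for child in parent.children):
            parent.report_failure()
```

Applied to the tree above, a failure of 1.1.1 marks 1.1 as failed but leaves 1 pending because 1.2 has not failed.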
If a leaf node has successfully completed its connection, all other If a leaf node has successfully completed its connection, all other
attempts should be made ineligible for use by the application for the attempts should be made ineligible for use by the application for the
original request. New connection attempts that involve transmitting original request. New connection attempts that involve transmitting
data on the network ought not to be started after another leaf node data on the network ought not to be started after another leaf node
has already successfully completed, because the Connection as a whole has already successfully completed because the Connection as a whole
has now been established. An implementation could choose to let has now been established. An implementation could choose to let
certain handshakes and negotiations complete to gather metrics that certain handshakes and negotiations complete to gather metrics that
influence future connections. Keeping additional connections is influence future connections. Keeping additional connections is
generally not recommended, because those attempts were slower to generally not recommended because those attempts were slower to
connect and may exhibit less desirable properties. connect and may exhibit less desirable properties.
4.4.1. Determining Successful Establishment 4.4.1. Determining Successful Establishment
On a per-protocol basis, implementations may select different On a per-protocol basis, implementations may select different
criteria by which a leaf node is considered to be successfully criteria by which a leaf node is considered to be successfully
connected. If the only protocol being used is a transport protocol connected. If the only protocol being used is a transport protocol
with a clear handshake, like TCP, then the obvious choice is to with a clear handshake, like TCP, then the obvious choice is to
declare that node "connected" when the three-way handshake has been declare that node "connected" when the three-way handshake completes.
completed. If the only protocol being used is an connectionless If the only protocol being used is a connectionless protocol, like
protocol, like UDP, the implementation may consider the node fully UDP, the implementation may consider the node fully "connected" the
"connected" the moment it determines a route is present, before moment it determines a route is present, before sending any packets
sending any packets on the network, see further Section 4.6. on the network, see further in Section 4.6.
When the Initiate action is called without any Messages being sent at Depending on the protocols involved, there is no guarantee that the
the same time, depending on the protocols involved, it is not Remote Endpoint will be notified when the Initiate action is called
guaranteed that the Remote Endpoint will be notified of this, and without any Messages being sent at the same time. Therefore, a
hence a passive endpoint's application may not receive a passive Endpoint's application may not receive a ConnectionReceived
ConnectionReceived event until it receives the first Message on the event until it receives the first Message on the new Connection.
new Connection.
For Protocol Stacks with multiple handshakes, the decision becomes For Protocol Stacks with multiple handshakes, the decision becomes
more nuanced. If the Protocol Stack involves both TLS and TCP, an more nuanced. If the Protocol Stack involves both TLS and TCP, an
implementation could determine that a leaf node is connected after implementation could determine that a leaf node is connected after
the TCP handshake is complete, or it can wait for the TLS handshake the TCP handshake is complete, or it can wait for the TLS handshake
to complete as well. The benefit of declaring completion when the to complete as well. The benefit of declaring completion when the
TCP handshake finishes, and thus stopping the race for other branches TCP handshake finishes, and thus stopping the race for other branches
of the tree, is reduced burden on the network and Remote Endpoints of the tree, is reduced burden on the network and Remote Endpoints
from further connection attempts that are likely to be abandoned. On from further connection attempts that are likely to be abandoned. On
the other hand, by waiting until the TLS handshake is complete, an the other hand, by waiting until the TLS handshake is complete, an
implementation avoids the scenario in which a TCP handshake completes implementation avoids the scenario in which a TCP handshake completes
quickly, but TLS negotiation is either very slow or fails altogether quickly, but TLS negotiation is either very slow or fails altogether
in particular network conditions or to a particular endpoint. To in particular network conditions or to a particular endpoint. To
avoid the issue of TLS possibly failing, the implementation should avoid the issue of TLS possibly failing, the implementation should
not generate a Ready event for the Connection until the TLS handshake not generate a Ready event for the Connection until the TLS handshake
is complete. is complete.
If all of the leaf nodes fail to connect during racing, i.e. none of If all of the leaf nodes fail to connect during racing, i.e., none of
the configurations that satisfy all requirements given in the the configurations that satisfy all requirements given in the
Transport Properties actually work over the available paths, then the Transport Properties actually work over the available paths, then the
Transport Services system should report an EstablishmentError to the Transport Services System should report an EstablishmentError to the
application. An EstablishmentError event should also be generated in application. An EstablishmentError event should also be generated if
case the Transport Services system finds no usable candidates to the Transport Services System finds no usable candidates to race.
race.
4.5. Establishing multiplexed connections 4.5. Establishing Multiplexed Connections
Multiplexing several Connections over a single underlying transport Multiplexing several Connections over a single underlying transport
connection requires that the Connections to be multiplexed belong to connection requires that the multiplexed Connections belong to the
the same Connection Group (as is indicated by the application using same Connection Group (as is indicated by the application using the
the Clone action). When the underlying transport connection supports Clone action). When the underlying transport connection supports
multi-streaming, the Transport Services System can map each multistreaming, the Transport Services System can map each Connection
Connection in the Connection Group to a different stream of this in the Connection Group to a different stream of this connection.
connection.
For such streams, there is often no explicit connection establishment For such streams, there is often no explicit connection establishment
procedure for the new stream prior to sending data on it (e.g., with procedure for the new stream prior to sending data on it (e.g., with
SCTP). In this case, the same considerations apply to determining SCTP). In this case, the same considerations apply to determining
stream establishment as apply to establishing a UDP connection, as stream establishment as apply to establishing a UDP connection, as
discussed in Section 4.4.1. This means that there might not be any discussed in Section 4.4.1. This means that there might not be any
"establishment" message (like a TCP SYN). "establishment" message (like a TCP SYN).
4.6. Handling connectionless protocols 4.6. Handling Connectionless Protocols
While protocols that use an explicit handshake to validate a While protocols that use an explicit handshake to validate a
connection to a peer can be used for racing multiple establishment connection to a peer can be used for racing multiple establishment
attempts in parallel, connectionless protocols such as raw UDP do not attempts in parallel, connectionless protocols such as raw UDP do not
offer a way to validate the presence of a peer or the usability of a offer a way to validate the presence of a peer or the usability of a
Connection without application feedback. An implementation should Connection without application feedback. An implementation should
consider such a Protocol Stack to be established as soon as the consider such a Protocol Stack to be established as soon as the
Transport Services system has selected a path on which to send data. Transport Services System has selected a path on which to send data.
However, this can cause a problem if a specific peer is not reachable However, this can cause a problem if a specific peer is not reachable
over the network using the connectionless protocol, or data cannot be over the network using the connectionless protocol or data cannot be
exchanged with the peer for any other reason. To handle the lack of exchanged with the peer for any other reason. To handle the lack of
an explicit handshake in the underlying protocol, an application can an explicit handshake in the underlying protocol, an application can
use a Message Framer (Section 6) on top of a connectionless protocol use a Message Framer (Section 6) on top of a connectionless protocol
to only mark a specific connection attempt as ready when some data to only mark a specific connection attempt as ready when some data
has been received, or after some application-level handshake has been has been received or after some application-level handshake has been
performed by the Message Framer. performed by the Message Framer.
4.7. Implementing Listeners 4.7. Implementing Listeners
When an implementation is asked to Listen, it registers with the When an implementation is asked to Listen, it registers with the
system to wait for incoming traffic to the Local Endpoint. If no system to wait for incoming traffic to the Local Endpoint. If no
Local Endpoint Identifer is specified, the implementation should use Local Endpoint Identifier is specified, the implementation should use
an ephemeral port. an ephemeral port.
If the Selection Properties do not require a single network interface If the Selection Properties do not require a single network interface
or path, but allow the use of multiple paths, the Listener object or path but allow the use of multiple paths, the Listener object
should register for incoming traffic on all of the network interfaces should register for incoming traffic on all of the network interfaces
or paths that conform to the Properties. The set of available paths or paths that conform to the Properties. The set of available paths
can change over time, so the implementation should monitor network can change over time, so the implementation should monitor network
path changes, and change the registration of the Listener across all path changes and change the registration of the Listener across all
usable paths as appropriate. When using multiple paths, the Listener usable paths as appropriate. When using multiple paths, the Listener
is generally expected to use the same port for listening on each. is generally expected to use the same port for listening on each.
If the Selection Properties allow multiple protocols to be used for If the Selection Properties allow multiple protocols to be used for
listening, and the implementation supports it, the Listener object listening and the implementation supports it, the Listener object
should support receiving inbound connections for each eligible should support receiving inbound connections for each eligible
protocol on each eligible path. protocol on each eligible path.
4.7.1. Implementing Listeners for Connected Protocols 4.7.1. Implementing Listeners for Connected Protocols
Connected protocols such as TCP and TLS-over-TCP have a strong Connected protocols such as TCP and TLS-over-TCP have a strong
mapping between the Local and Remote Endpoint Identifers (four-tuple) mapping between the Local and Remote Endpoint Identifiers (four-
and their protocol connection state. These map into Connection tuple) and their protocol connection state. These map to Connection
objects. Whenever a new inbound handshake is being started, the objects. Whenever a new inbound handshake is being started, the
Listener should generate a new Connection object and pass it to the Listener should generate a new Connection object and pass it to the
application. application.
4.7.2. Implementing Listeners for Connectionless Protocols 4.7.2. Implementing Listeners for Connectionless Protocols
Connectionless protocols such as UDP and UDP-lite generally do not Connectionless protocols such as UDP and UDP-Lite generally do not
provide the same mechanisms that connected protocols do to offer provide the same mechanisms that connected protocols do to offer
Connection objects. Implementations should wait for incoming packets Connection objects. Implementations should wait for incoming packets
for connectionless protocols on a listening port and should perform for connectionless protocols on a listening port and should perform
four-tuple matching of packets to existing Connection objects if four-tuple matching of packets to existing Connection objects if
possible. If a matching Connection object does not exist, an possible. If a matching Connection object does not exist, an
incoming packet from a connectionless protocol should cause a new incoming packet from a connectionless protocol should cause a new
Connection object to be created. Connection object to be created.
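The four-tuple matching described above can be sketched as follows. This is a minimal illustration, not a prescribed design: the DatagramListener and Connection classes, the on_packet method, and the new_connections list (standing in for the event that hands a new Connection to the application) are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# (local address, local port, remote address, remote port)
FourTuple = tuple[str, int, str, int]


@dataclass
class Connection:
    """Hypothetical stand-in for a Transport Services Connection object."""
    four_tuple: FourTuple
    received: list[bytes] = field(default_factory=list)


class DatagramListener:
    """Hypothetical Listener for a connectionless protocol such as UDP."""

    def __init__(self, local_addr: str, local_port: int):
        self.local = (local_addr, local_port)
        self.connections: dict[FourTuple, Connection] = {}
        # Stands in for delivering new Connection objects to the application.
        self.new_connections: list[Connection] = []

    def on_packet(self, remote_addr: str, remote_port: int,
                  payload: bytes) -> Connection:
        key = (*self.local, remote_addr, remote_port)
        conn = self.connections.get(key)
        if conn is None:
            # No existing Connection matches the four-tuple: create one.
            conn = Connection(key)
            self.connections[key] = conn
            self.new_connections.append(conn)
        conn.received.append(payload)
        return conn
```

Two packets from the same remote address and port land on the same Connection object, while a packet from a different remote endpoint creates a second one.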
4.7.3. Implementing Listeners for Multiplexed Protocols 4.7.3. Implementing Listeners for Multiplexed Protocols
Protocols that provide multiplexing of streams can listen for Protocols that provide multiplexing of streams can listen for
entirely new connections as well as for new sub-connections (streams entirely new connections as well as for new subconnections (streams
of an already existing connection). A new stream arrival on an of an already-existing connection). A new stream arrival on an
existing connection is presented to the application as a new existing connection is presented to the application as a new
Connection. This new Connection is grouped with all other Connection. This new Connection is grouped with all other
Connections that are multiplexed via the same protocol. Connections that are multiplexed via the same protocol.
5. Implementing Sending and Receiving Data 5. Implementing Sending and Receiving Data
The most basic mapping for sending a Message is an abstraction of The most basic mapping for sending a Message is an abstraction of
datagrams, in which the transport protocol naturally deals in datagrams, in which the transport protocol naturally deals in
discrete packets (such as UDP). Each Message here corresponds to a discrete packets (such as UDP). Each Message here corresponds to a
single datagram. single datagram.
skipping to change at page 23, line 42 skipping to change at line 1058
For protocols that expose byte-streams (such as TCP), the only For protocols that expose byte-streams (such as TCP), the only
delineation provided by the protocol is the end of the stream in a delineation provided by the protocol is the end of the stream in a
given direction. Each Message in this case corresponds to the entire given direction. Each Message in this case corresponds to the entire
stream of bytes in a direction. These Messages may be quite long, in stream of bytes in a direction. These Messages may be quite long, in
which case they can be sent in multiple parts. which case they can be sent in multiple parts.
Protocols that provide framing (such as length-value protocols, or Protocols that provide framing (such as length-value protocols, or
protocols that use delimiters like HTTP/1.1) may support Message protocols that use delimiters like HTTP/1.1) may support Message
sizes that do not fit within a single datagram. Each Message for sizes that do not fit within a single datagram. Each Message for
framing protocols corresponds to a single frame, which may be sent framing protocols corresponds to a single frame, which may be sent
either as a complete Message in the underlying protocol, or in either as a complete Message in the underlying protocol or in
multiple parts. multiple parts.
Messages themselves generally consist of bytes passed in the Messages themselves generally consist of bytes passed in the
messageData parameter intended to be processed at an application messageData parameter intended to be processed at an application
layer. However, Message objects presented through the API can carry layer. However, Message objects presented through the API can carry
associated Message Properties passed through the messageContext associated Message Properties passed through the messageContext
parameter. When these are Protocol Specific Properties, they can parameter. When these are Protocol-specific Properties, they can
include metadata that exists separately from a byte encoding. For include metadata that exists separately from a byte encoding. For
example, these Properties can include name-value pairs of example, these Properties can include name-value pairs of
information, like HTTP header fields. In such cases, Messages might information, like HTTP header fields. In such cases, Messages might
be "empty", insofar as they contain zero bytes in the messageData be "empty" insofar as they contain zero bytes in the messageData
parameter, but can still include data in the messageContext that is parameter, but they can still include data in the messageContext that
interpreted by the Protocol Stack. is interpreted by the Protocol Stack.
5.1. Sending Messages 5.1. Sending Messages
The effect of the application sending a Message is determined by the The effect of the application sending a Message is determined by the
top-level protocol in the established Protocol Stack. That is, if top-level protocol in the established Protocol Stack. That is, if
the top-level protocol provides an abstraction of framed Messages the top-level protocol provides an abstraction of framed Messages
over a connection, the receiving application will be able to obtain over a connection, the receiving application will be able to obtain
multiple Messages on that connection, even if the framing protocol is multiple Messages on that connection, even if the framing protocol is
built on a byte-stream protocol like TCP. built on a byte-stream protocol like TCP.
5.1.1. Message Properties 5.1.1. Message Properties
The API allows various properties to be associated with each Message, The API allows various Properties to be associated with each Message,
which should be implemented as discussed below. which should be implemented as discussed below.
* msgLifetime: this should be implemented by removing the Message msgLifetime: This should be implemented by removing the Message from
from the queue of pending Messages after the Lifetime has expired. the queue of pending Messages after the Lifetime has expired. A
A queue of pending Messages within the Transport Services queue of pending Messages within the Transport Services
Implementation that have yet to be handed to the Protocol Stack Implementation that have yet to be handed to the Protocol Stack
can always support this property, but once a Message has been sent can always support this Property, but once a Message has been sent
into the send buffer of a protocol, only certain protocols may into the send buffer of a protocol, only certain protocols may
support removing it from their send buffer. For example, a support removing it from their send buffer. For example, a
Transport Services Implementation cannot remove bytes from a TCP Transport Services Implementation cannot remove bytes from a TCP
send buffer, while it can remove data from a SCTP send buffer send buffer, while it can remove data from an SCTP send buffer
using the partial reliability extension [RFC8303]. When there is using the partial reliability extension [RFC8303]. When there is
no standing queue of Messages within the system, and the Protocol no standing queue of Messages within the system, and the Protocol
Stack does not support the removal of a Message from the stack's Stack does not support the removal of a Message from the stack's
send buffer, this property may be ignored. send buffer, this Property may be ignored.
* msgPriority: this represents the ability to prioritize a Message msgPriority: This represents the ability to prioritize a Message
over other Messages. This can be implemented by the Transport over other Messages. This can be implemented by the Transport
Services system by re-ordering Messages that have yet to be handed Services System by reordering Messages that have yet to be handed
to the Protocol Stack, or by giving relative priority hints to to the Protocol Stack or by giving relative priority hints to
protocols that support priorities per Message. For example, an protocols that support priorities per Message. For example, an
implementation of HTTP/2 could choose to send Messages of implementation of HTTP/2 could choose to send Messages of
different priority on streams of different priority. different priority on streams of different priority.
* msgOrdered: when this is false, this disables the requirement of msgOrdered: When this is false, it disables the requirement of in-
in-order-delivery for protocols that support configurable order delivery for protocols that support configurable ordering.
ordering. When the Protocol Stack does not support configurable When the Protocol Stack does not support configurable ordering,
ordering, this property may be ignored. this Property may be ignored.
* safelyReplayable: when this is true, this means that the Message safelyReplayable: When this is true, it means that the Message can
can be used by a transport mechanism that might deliver it be used by a transport mechanism that might deliver it multiple
multiple times -- e.g., as a result of racing multiple transports times -- e.g., as a result of racing multiple transports or as
or as part of TCP Fast Open. Also, protocols that do not protect part of TCP Fast Open (TFO). Also, protocols that do not protect
against duplicated Messages, such as UDP (when used directly, against duplicated Messages, such as UDP (when used directly,
without a protocol layered atop), can only be used with Messages without a protocol layered atop), can only be used with Messages
that are Safely Replayable. When a Transport Services system is that are safely replayable. When a Transport Services System is
permitted to replay Messages, replay protection could be provided permitted to replay Messages, replay protection could be provided
by the application. by the application.
* final: when this is true, this means that the sender will not send final: When this is true, it means that the sender will not send any
any further Messages. The Connection need not be closed (in case further Messages. The Connection need not be closed (if the
the Protocol Stack supports half-close operation, like TCP). Any Protocol Stack supports half-closed operations, like TCP). Any
Messages sent after a Message marked final will result in a Messages sent after a Message marked Final will result in a
SendError. SendError.
* msgChecksumLen: when this is set to any value other than Full msgChecksumLen: When this is set to any value other than Full
Coverage, it sets the minimum protection in protocols that allow Coverage, it sets the minimum protection in protocols that allow
limiting the checksum length (e.g. UDP-Lite). If the Protocol limiting the checksum length (e.g., UDP-Lite). If the Protocol
Stack does not support checksum length limitation, this property Stack does not support checksum length limitation, this Property
may be ignored. may be ignored.
* msgReliable: When true, the property specifies that the Message msgReliable: When true, this Property specifies that the Message
must be reliably transmitted. When false, and if unreliable must be reliably transmitted. When false, and if unreliable
transmission is supported by the underlying protocol, then the transmission is supported by the underlying protocol, then the
Message should be unreliably transmitted. If the underlying Message should be unreliably transmitted. If the underlying
protocol does not support unreliable transmission, the Message protocol does not support unreliable transmission, the Message
should be reliably transmitted. should be reliably transmitted.
* msgCapacityProfile: When true, this expresses a wish to override msgCapacityProfile: When true, this expresses a wish to override the
the Generic Connection Property connCapacityProfile for this Generic Connection Property connCapacityProfile for this Message.
Message. Depending on the value, this can, for example, be Depending on the value, this can, for example, be implemented by
implemented by changing the DSCP value of the associated packet changing the Differentiated Services Code Point (DSCP) value of
(note that the guidelines in Section 6 of [RFC7657] apply; e.g., the associated packet (note that the guidelines in Section 6 of
the DSCP value should not be changed for different packets within [RFC7657] apply; for example, the DSCP value should not be changed
a reliable transport protocol session or DCCP connection). for different packets within a reliable transport protocol session
or DCCP connection).
* noFragmentation: Setting this avoids network-layer fragmentation. noFragmentation: Setting this avoids network-layer fragmentation.
Messages exceeding the transport’s current estimate of its maximum Messages exceeding the transport's current estimate of its maximum
packet size (the singularTransmissionMsgMaxLen Connection packet size (the singularTransmissionMsgMaxLen Connection
Property) can result in transport segmentation when permitted, or Property) can result in transport segmentation when permitted or
generate an error. When used with transports running over IP generate an error. When used with transports running over IPv4,
version 4, the Don't Fragment bit should be set to avoid on-path the Don't Fragment (DF) bit should be set to avoid on-path IP
IP fragmentation ([RFC8304]). fragmentation [RFC8304].
* noSegmentation: When set, this property limits the Message size to noSegmentation: When set, this Property limits the Message size to
the transport’s current estimate of its maximum packet size (the the transport's current estimate of its maximum packet size (the
singularTransmissionMsgMaxLen Connection Property). Messages singularTransmissionMsgMaxLen Connection Property). Messages
larger than this size generate an error. Setting this avoids larger than this size generate an error. Setting this avoids
transport-layer segmentation and network-layer fragmentation. transport-layer segmentation and network-layer fragmentation.
When used with transports running over IP version 4, the Don't When used with transports running over IPv4, the DF bit should be
Fragment bit should be set to avoid on-path IP fragmentation set to avoid on-path IP fragmentation ([RFC8304]).
([RFC8304]).
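As an illustration of the msgLifetime and msgPriority handling described above, the following sketch shows one way a queue of pending Messages inside the implementation could behave before Messages are handed to the Protocol Stack. The PendingQueue and PendingMessage names, the use of an explicit "now" argument, and the choice to treat lower priority values as more urgent are assumptions made for this example.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass
class PendingMessage:
    data: bytes
    priority: int    # assumed here: lower value is more urgent
    deadline: float  # absolute expiry time derived from msgLifetime


class PendingQueue:
    """Hypothetical queue of Messages not yet handed to the Protocol Stack."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority
        self.expired = []              # stands in for Expired events

    def enqueue(self, data: bytes, priority: int, lifetime: float, now: float):
        msg = PendingMessage(data, priority, now + lifetime)
        heapq.heappush(self._heap, (priority, next(self._seq), msg))

    def next_to_send(self, now: float):
        """Return the most urgent unexpired Message, or None if the queue is empty."""
        while self._heap:
            _, _, msg = heapq.heappop(self._heap)
            if msg.deadline <= now:
                # Lifetime has passed: drop the Message instead of sending it.
                self.expired.append(msg)
                continue
            return msg
        return None
```

This only covers Messages still held by the implementation. Once data has entered a protocol's own send buffer, removal depends on what that protocol supports, as noted in the msgLifetime description.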
5.1.2. Send Completion 5.1.2. Send Completion
The application should be notified (using a Sent, Expired or The application should be notified (using a Sent, Expired, or
SendError event) whenever a Message or partial Message has been SendError event) whenever a Message or partial Message has been
consumed by the Protocol Stack, or has failed to send. The time at consumed by the Protocol Stack or has failed to send. The time at
which a Message is considered to have been consumed by the Protocol which a Message is considered to have been consumed by the Protocol
Stack may vary depending on the protocol. For example, for a basic Stack may vary depending on the protocol. For example, for a basic
datagram protocol like UDP, this may correspond to the time when the datagram protocol like UDP, this may correspond to the time when the
packet is sent into the interface driver. For a protocol that packet is sent into the interface driver. For a protocol that
buffers data in queues, like TCP, this may correspond to when the buffers data in queues, like TCP, this may correspond to when the
data has entered the send buffer. The time at which a Message failed data has entered the send buffer. The time at which a Message failed
to send is when the Transport Services Implementation (including the to send is when the Transport Services Implementation (including the
Protocol Stack) has experienced a failure related to sending; this Protocol Stack) has experienced a failure related to sending; this
can depend on protocol-specific timeouts. can depend on protocol-specific timeouts.
skipping to change at page 26, line 43 skipping to change at line 1197
switch between the application and the Transport Services System). switch between the application and the Transport Services System).
To avoid this, the application can indicate a batch of Send actions To avoid this, the application can indicate a batch of Send actions
through the API. When this is used, the implementation can defer the through the API. When this is used, the implementation can defer the
processing of Messages until the batch is complete. processing of Messages until the batch is complete.
5.2. Receiving Messages 5.2. Receiving Messages
Similar to sending, receiving a Message is determined by the top- Similar to sending, receiving a Message is determined by the top-
level protocol in the established Protocol Stack. The main level protocol in the established Protocol Stack. The main
difference with receiving is that the size and boundaries of the difference with receiving is that the size and boundaries of the
Message are not known beforehand. The application can communicate in Message are not known beforehand. The application can communicate
its Receive action the parameters for the Message, which can help the the parameters for the Message in its Receive action, which can help
Transport Services Implementation know how much data to deliver and the Transport Services Implementation know how much data to deliver
when. For example, if the application only wants to receive a and when. For example, if the application only wants to receive a
complete Message, the implementation should wait until an entire complete Message, the implementation should wait until an entire
Message (datagram, stream, or frame) is read before delivering any Message (datagram, stream, or frame) is read before delivering any
Message content to the application. This requires the implementation Message content to the application. This requires the implementation
to understand where Messages end, either via a supplied Message to understand where Messages end, either via a supplied Message
Framer or because the top-level protocol in the established Protocol Framer or because the top-level protocol in the established Protocol
Stack preserves message boundaries. The application can also control Stack preserves Message boundaries. The application can also control
the flow of received data by specifying the minimum and maximum the flow of received data by specifying the minimum and maximum
number of bytes of Message content it wants to receive at one time. number of bytes of Message content it wants to receive at one time.
If a Connection finishes before a requested Receive action can be If a Connection finishes before a requested Receive action can be
satisfied, the Transport Services system should deliver any partial satisfied, the Transport Services System should deliver any
Message content outstanding, or if none is available, an indication outstanding partial Message content; if none is available, the system
that there will be no more received Messages. should indicate that there will be no additional received Messages.
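The interaction between complete-Message delivery and the minimum and maximum byte counts can be sketched as below. The ReceiveBuffer class and its method names are hypothetical, and the min_incomplete_length and max_length parameters are assumed names for the minimum and maximum values mentioned above. For brevity, the sketch tracks a single Message at a time.

```python
class ReceiveBuffer:
    """Hypothetical per-Connection buffer for one inbound Message."""

    def __init__(self):
        self._buf = bytearray()
        self._message_complete = False

    def on_data(self, data: bytes, end_of_message: bool = False):
        """Called when the Protocol Stack (or a Framer) yields Message bytes."""
        self._buf += data
        self._message_complete = self._message_complete or end_of_message

    def try_deliver(self, min_incomplete_length=None, max_length=None):
        """Return (chunk, is_end_of_message), or None if delivery must wait."""
        if not self._message_complete:
            if min_incomplete_length is None:
                return None  # application asked for complete Messages only
            if not self._buf or len(self._buf) < min_incomplete_length:
                return None  # not enough partial data yet
        n = len(self._buf)
        if max_length is not None:
            n = min(n, max_length)
        chunk = bytes(self._buf[:n])
        del self._buf[:n]
        end = self._message_complete and not self._buf
        if end:
            self._message_complete = False  # ready for the next Message
        return chunk, end
```

With no parameters, nothing is delivered until the Message boundary is known. With a minimum of 4 and a maximum of 3, a buffered b"hello" yields the partial chunk b"hel".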
5.3. Handling of data for fast-open protocols 5.3. Handling of Data for Fast-Open Protocols
Several protocols allow sending higher-level protocol or application Several protocols allow sending higher-level protocol or application
data during their protocol establishment, such as TCP Fast Open data during their protocol establishment, such as TFO [RFC7413] and
[RFC7413] and TLS 1.3 [RFC8446]. This approach is referred to as TLS 1.3 [RFC8446]. This approach is referred to as sending Zero-RTT
sending Zero-RTT (0-RTT) data. This is a desirable feature, but (0-RTT) data. This is a desirable feature, but it poses challenges
poses challenges to an implementation that uses racing during to an implementation that uses racing during Connection
connection establishment. establishment.
The application can express its preference for sending messagess as The application can express its preference for sending Messages as
0-RTT data by using the zeroRttMsg Selection Property on the 0-RTT data by using the zeroRttMsg Selection Property on the
Preconnection. Then, the application can provide the message to send Preconnection. Then, the application can provide the Message to send
as 0-RTT data via the InitiateWithSend action. In order to be sent as 0-RTT data via the InitiateWithSend action. In order to be sent
as 0-RTT data, the message needs to be marked with the as 0-RTT data, the Message needs to be marked with the
safelyReplayable send paramteter. In general, 0-RTT data may be safelyReplayable Property. In general, 0-RTT data may be replayed
replayed (for example, if a TCP SYN contains data, and the SYN is (for example, if a TCP SYN contains data, and the SYN is
retransmitted, the data will be retransmitted as well but may be retransmitted, the data will be retransmitted as well but may be
considered as a new connection instead of a retransmission). When considered a new connection instead of a retransmission). When
racing connections, different leaf nodes have the opportunity to send racing connections, different leaf nodes have the opportunity to send
the same data independently. If data is truly safely replayable, the same data independently. If data is truly safely replayable,
this is permissible. this is permissible.
Once the application has provided its 0-RTT data, a Transport Once the application has provided its 0-RTT data, a Transport
Services Implementation should keep a copy of this data and provide Services Implementation should keep a copy of this data and provide
it to each new leaf node that is started and for which a protocol it to each new leaf node that is started and for which a protocol
instance supporting 0-RTT is being used. Note that the amount of instance supporting 0-RTT is being used. Note that the amount of
data that can actually be sent as 0-RTT data varies by protocol, so data that can actually be sent as 0-RTT data varies by protocol, so
any given Protocol Stack might only consume part of the saved data any given Protocol Stack might only consume part of the saved data
prior to becoming established. The implementation needs to keep prior to becoming established. The implementation needs to keep
track of how much data a particular Protocol Stack has consumed, and track of how much data a particular Protocol Stack has consumed and
ensure that any pending 0-RTT-eligible data from the application is ensure that any pending 0-RTT-eligible data from the application is
handled before subsequent Messages. handled before subsequent Messages.
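The bookkeeping described above (one retained copy of the 0-RTT data, offered to each leaf node, with per-stack tracking of how much was consumed) might look like the following sketch. ZeroRttState, offer, remainder, and the leaf identifiers are hypothetical names; protocol_limit stands for however many bytes a given protocol can carry as 0-RTT data.

```python
class ZeroRttState:
    """Hypothetical holder for the application's retained 0-RTT data."""

    def __init__(self, early_data: bytes):
        self._early_data = early_data
        self._consumed = {}  # leaf id -> number of bytes already consumed

    def offer(self, leaf_id: str, protocol_limit: int) -> bytes:
        """Give a leaf as much retained data as its protocol can send as 0-RTT."""
        already = self._consumed.get(leaf_id, 0)
        chunk = self._early_data[already:already + protocol_limit]
        self._consumed[leaf_id] = already + len(chunk)
        return chunk

    def remainder(self, winning_leaf: str) -> bytes:
        """Data the winning leaf has not yet sent; it goes out before later Messages."""
        return self._early_data[self._consumed.get(winning_leaf, 0):]
```

Each leaf reads from the same retained copy independently, so a leaf whose protocol accepts only a few bytes of early data does not affect what another leaf can send.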
It is also possible for Protocol Stacks within a particular leaf node It is also possible for Protocol Stacks within a particular leaf node
to use a 0-RTT handshakes in a lower-level protocol without any to use a 0-RTT handshake in a lower-level protocol without any safely
safely replayable application data if a higher-level protocol in the replayable application data if a higher-level protocol in the stack
stack has idempotent handshake data to send. For example, TCP Fast has idempotent handshake data to send. For example, TFO could use a
Open could use a Client Hello from TLS as its 0-RTT data, without any Client Hello from TLS as its 0-RTT data without any data being
data being provided by the application. provided by the application.
0-RTT handshakes often rely on previous state, such as TCP Fast Open 0-RTT handshakes often rely on previous state, such as TFO cookies,
cookies, previously established TLS tickets, or out-of-band previously established TLS tickets, or out-of-band distributed pre-
distributed pre-shared keys (PSKs). Implementations should be aware shared keys (PSKs). Implementations should be aware of security
of security concerns around using these tokens across multiple concerns around using these tokens across multiple addresses or paths
addresses or paths when racing. In the case of TLS, any given ticket when racing. In the case of TLS, any given ticket or PSK should only
or PSK should only be used on one leaf node, since servers will be used on one leaf node, since servers will likely reject duplicate
likely reject duplicate tickets in order to prevent replays (see tickets in order to prevent replays (see Section 8.1 of [RFC8446]).
Section 8.1 of [RFC8446]). If implementations have multiple tickets If implementations have multiple tickets available from a previous
available from a previous connection, each leaf node attempt can use connection, each leaf node attempt can use a different ticket. In
a different ticket. In effect, each leaf node will send the same effect, each leaf node will send the same early application data, but
early application data, yet encoded (encrypted) differently on the the data will be encoded (encrypted) differently on the wire.
wire.
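A small sketch of the one-ticket-per-leaf rule follows. The assign_tickets function is hypothetical, and the choice to give None to leaves that exceed the ticket supply (so that they would fall back to a handshake without early data) is an assumption of this example rather than something the text above requires.

```python
def assign_tickets(leaf_ids, tickets):
    """Map each leaf node to a distinct ticket; extra leaves get None."""
    pool = list(tickets)
    assignment = {}
    for leaf in leaf_ids:
        # Each ticket is handed out at most once, so no two leaves replay it.
        assignment[leaf] = pool.pop(0) if pool else None
    return assignment
```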
6. Implementing Message Framers 6. Implementing Message Framers
Message Framers are functions that define simple transformations Message Framers are functions that define simple transformations
between application Message data and raw transport protocol data. between application Message data and raw transport protocol data.
Generally, a Message Framer implements a simple application protocol Generally, a Message Framer implements a simple application protocol
that can either be provided by the Transport Services implementation that can be provided either by the Transport Services implementation
or by the application. It is optional for Transport Services system or by the application. It is optional for Transport Services
implementations to provide Message Framers: the specification Implementations to provide Message Framers: the API specification
[I-D.ietf-taps-interface] does not prescribe any particular Message [RFC9622] does not prescribe any particular Message Framers to be
Framers to be implemented. A Framer can encapsulate or encode implemented. A Framer can encapsulate or encode outbound Messages,
outbound Messages, decapsulate or decode inbound data into Messages, decapsulate or decode inbound data into Messages, and implement parts
and implement parts of protocols that do not directly map to of protocols that do not directly map to application Messages (such
application Messages (such as protocol handshakes or preludes before as protocol handshakes or preludes before Message exchange).
Message exchange).
While many protocols can be represented as Message Framers, for the While many protocols can be represented as Message Framers, for the
purposes of the Transport Services API, these are ways for purposes of the Transport Services API, these are ways for
applications or application frameworks to define their own Message applications or application frameworks to define their own Message
parsing to be included within a Connection's Protocol Stack. As an parsing to be included within a Connection's Protocol Stack. As an
example, TLS is a protocol that is by default built into the example, TLS is a protocol that is by default built into the
Transport Services API, even though it could also serve the purpose Transport Services API, even though it could also serve the purpose
of framing data over TCP. of framing data over TCP.
Most Message Framers fall into one of two categories: Most Message Framers fall into one of two categories:
* Header-prefixed record formats, such as a basic Type-Length-Value * Header-prefixed record formats, such as a basic Type-Length-Value
(TLV) structure (TLV) structure
* Delimiter-separated formats, such as HTTP/1.1 * Delimiter-separated formats, such as HTTP/1.1
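As a concrete illustration of the first category, the sketch below frames and parses a basic Type-Length-Value record. The header layout (a 1-byte type followed by a 2-byte big-endian length) and the frame and parse function names are assumptions chosen for the example, not a format defined by this document.

```python
import struct

# Assumed layout: 1-byte type, 2-byte big-endian length, then the value.
HEADER = struct.Struct("!BH")


def frame(msg_type: int, value: bytes) -> bytes:
    """Encode one outbound Message as a TLV record."""
    return HEADER.pack(msg_type, len(value)) + value


def parse(buffer: bytearray):
    """Extract every complete record from buffer, leaving any partial record."""
    messages = []
    while len(buffer) >= HEADER.size:
        msg_type, length = HEADER.unpack_from(buffer)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # value not fully received yet; wait for more bytes
        messages.append((msg_type, bytes(buffer[HEADER.size:end])))
        del buffer[:end]
    return messages
```

Because parse leaves incomplete records in the buffer, it can be called each time new bytes arrive from a byte-stream protocol such as TCP.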
Common Message Framers can be provided by a Transport Services Common Message Framers can be provided by a Transport Services
Implementation, but an implementation ought to allow custom Message Implementation, but an implementation ought to allow custom Message
Framers to be defined by the application or some other piece of Framers to be defined by the application or some other piece of
software. This section describes one possible API for defining software. This section describes one possible API for defining
Message Framers, as an example. Message Framers as an example.
6.1. Defining Message Framers 6.1. Defining Message Framers
A Message Framer is primarily defined by the code that handles events A Message Framer is primarily defined by the code that handles events
for a framer implementation, specifically how it handles inbound and for a Framer implementation, specifically how it handles inbound and
outbound data parsing. The function that implements custom framing outbound data parsing. The function that implements custom framing
logic will be referred to as the "framer implementation", which may logic will be referred to as the "Framer Implementation", which may
be provided by a Transport Services implementation or the application be provided by a Transport Services Implementation or the application
itself. The Message Framer refers to the object or function within itself. The Message Framer holds a reference to the object or
the main Connection implementation that delivers events to the custom function within the main Connection implementation that delivers
framer implementation whenever data is ready to be parsed or framed. events to the custom Framer implementation whenever data is ready to
be parsed or framed.
The API examples in this section use the notation conventions for the The API examples in this section use the notation conventions for the
Transport Services API defined in Section 1.1 of Transport Services API defined in Section 1.1 of [RFC9622].
[I-D.ietf-taps-interface].
The Transport Services Implementation needs to ensure that all of the The Transport Services Implementation needs to ensure that all of the
events and actions taken on a Message Framer are synchronized to events and actions taken on a Message Framer are synchronized to
ensure consistent behavior. For example, some of the actions defined ensure consistent behavior. For example, some of the actions defined
below (such as PrependFramer and StartPassthrough) modify how data below (such as PrependFramer and StartPassthrough) modify how data
flows in a protocol stack, and require synchronization with sending flows in a Protocol Stack and require synchronization with sending
and parsing data in the Message Framer. and parsing data in the Message Framer.
When a Connection establishment attempt begins, an event can be When a Connection establishment attempt begins, an event can be
delivered to notify the framer implementation that a new Connection delivered to notify the Framer implementation that a new Connection
is being created. Similarly, a stop event can be delivered when a is being created. Similarly, a Stop event can be delivered when a
Connection is being torn down. The framer implementation can use the Connection is being torn down. The Framer implementation can use the
Connection object to look up specific properties of the Connection or Connection object to look up specific Properties of the Connection or
the network being used that may influence how to frame Messages. the network being used that may influence how to frame Messages.
MessageFramer -> Start<connection> MessageFramer -> Start<connection>
MessageFramer -> Stop<connection> MessageFramer -> Stop<connection>
When a Message Framer generates a Start event, the framer When a Message Framer generates a Start event, the Framer
implementation has the opportunity to start writing some data prior implementation has the opportunity to start writing some data prior
to the Connection delivering its Ready event. This allows the to the Connection delivering its Ready event. This allows the
implementation to communicate control data to the Remote Endpoint implementation to communicate control data to the Remote Endpoint
that can be used to parse Messages. that can be used to parse Messages.
Once the framer implementation has completed its setup or handshake, Once the Framer implementation has completed its setup or handshake,
it can indicate to the application that it is ready for handling data it can indicate to the application that it is ready for handling data
with this call. with this call.
MessageFramer.MakeConnectionReady(connection) MessageFramer.MakeConnectionReady(connection)
Similarly, when a Message Framer generates a Stop event, the framer Similarly, when a Message Framer generates a Stop event, the Framer
implementation has the opportunity to write some final data or clear implementation has the opportunity to write some final data or clear
up its local state before the Closed event is delivered to the up its local state before the Closed event is delivered to the
Application. The framer implementation can indicate that it has application. The Framer implementation can indicate that it has
finished with this call. finished with this call.
MessageFramer.MakeConnectionClosed(connection) MessageFramer.MakeConnectionClosed(connection)
At any time if the implementation encounters a fatal error, it can If the implementation encounters a fatal error at any time, it can
also cause the Connection to fail and provide an error. also cause the Connection to fail and provide an error.
MessageFramer.FailConnection(connection, error) MessageFramer.FailConnection(connection, error)
Should the framer implementation deem the candidate selected during Should the Framer implementation deem the candidate selected during
racing unsuitable, it can signal this to the Transport Services API racing unsuitable, it can signal this to the Transport Services API
by failing the Connection prior to marking it as ready. If there are by failing the Connection prior to marking it as ready. If there are
no other candidates available, the Connection will fail. Otherwise, no other candidates available, the Connection will fail. Otherwise,
the Connection will select a different candidate and the Message the Connection will select a different candidate and the Message
Framer will generate a new Start event. Framer will generate a new Start event.
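The interaction between Start events, MakeConnectionReady, and FailConnection during racing can be sketched as follows. This is a minimal illustration and not a definitive implementation: the Connection and MessageFramer classes are hypothetical stand-ins, and the candidate list, the "state" strings, and the on_start callback name are assumptions made for this example.

```python
class Connection:
    """Hypothetical stand-in for a Connection during racing."""

    def __init__(self, candidates):
        self.candidates = list(candidates)  # remaining candidates, best first
        self.current = None
        self.state = "establishing"
        self.error = None


class MessageFramer:
    """Hypothetical stand-in for the system-side Message Framer."""

    def __init__(self, implementation):
        self.implementation = implementation

    def start(self, connection):
        # Select the next candidate and generate a Start event for it.
        # Assumes at least one candidate remains.
        connection.current = connection.candidates.pop(0)
        self.implementation.on_start(self, connection)

    def make_connection_ready(self, connection):
        connection.state = "ready"

    def fail_connection(self, connection, error):
        connection.error = error
        if connection.state == "establishing" and connection.candidates:
            self.start(connection)  # another candidate remains: new Start event
        else:
            connection.state = "failed"


class PickyFramerImplementation:
    """Example Framer implementation that accepts only some candidates."""

    def __init__(self, acceptable):
        self.acceptable = set(acceptable)

    def on_start(self, framer, connection):
        if connection.current in self.acceptable:
            framer.make_connection_ready(connection)
        else:
            framer.fail_connection(connection, "unsuitable candidate")
```

In this sketch, failing before the Connection is marked ready moves on to the next candidate, while failing with no candidates left fails the Connection.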
Before an implementation marks a Message Framer as ready, it can also Before an implementation marks a Message Framer as ready, it can also
dynamically add a protocol or framer above it in the stack. This dynamically add a protocol or Framer above it in the stack. This
allows protocols that need to add TLS conditionally, like STARTTLS allows protocols that need to add TLS conditionally, like STARTTLS
[RFC3207], to modify the Protocol Stack based on a handshake result. [RFC3207], to modify the Protocol Stack based on a handshake result.
otherFramer := NewMessageFramer() otherFramer := NewMessageFramer()
MessageFramer.PrependFramer(connection, otherFramer) MessageFramer.PrependFramer(connection, otherFramer)
A Message Framer might also choose to go into a passthrough mode once A Message Framer might also choose to go into a passthrough mode once
an initial exchange or handshake has been completed, such as the an initial exchange or handshake has been completed, such as the
STARTTLS case mentioned above. This can also be useful for proxy STARTTLS case mentioned above. This can also be useful for proxy
protocols like SOCKS [RFC1928] or HTTP CONNECT [RFC7230]. In such protocols like SOCKS [RFC1928] or HTTP CONNECT [RFC9110]. In such
cases, a Message Framer implementation can intercept sending and cases, a Message Framer implementation can initially intercept
receiving of Messages at first, but then indicate that no more Messages being sent and received and subsequently indicate that no
processing is needed. further processing is needed.
MessageFramer.StartPassthrough() MessageFramer.StartPassthrough()
6.2. Sender-side Message Framing 6.2. Sender-Side Message Framing
Message Framers generate an event whenever a Connection sends a new Message Framers generate an event whenever a Connection sends a new
Message. The parameters to the event align with the Send action in Message. The parameters to the event align with the Send action in
the API (Section 9.2 of [I-D.ietf-taps-interface]). the API (Section 9.2 of [RFC9622]).
MessageFramer MessageFramer
| |
V V
NewSentMessage<connection, messageData, messageContext, endOfMessage> NewSentMessage<connection, messageData, messageContext, endOfMessage>
Upon receiving this event, a framer implementation is responsible for
Upon receiving this event, a Framer implementation is responsible for
performing any necessary transformations and sending the resulting performing any necessary transformations and sending the resulting
data back to the Message Framer, which will in turn send it to the data back to the Message Framer, which, in turn, will send it to the
next protocol. To improve performance, implementations should ensure next protocol. To improve performance, implementations should ensure
that there is a way to pass the original data through without that there is a way to pass the original data through without
copying. copying.
MessageFramer.Send(connection, messageData) MessageFramer.Send(connection, messageData)
To provide an example, a simple protocol that adds the length of the To provide an example, a simple protocol that adds the length of the
Message data as a header would receive the NewSentMessage event, Message data as a header would receive the NewSentMessage event,
create a data representation of the length of the Message data, and create a data representation of the length of the Message data, and
then send a block of data that is the concatenation of the length then send a block of data that is the concatenation of the length
header and the original Message data. header and the original Message data.
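The length-prefix example above could look roughly like the following sketch. The 4-byte big-endian header, the RecordingFramer class, and the snake_case function names are assumptions made for illustration; the Transport Services API does not define a wire format or a language binding.

```python
import struct

# Assumed header layout: a 4-byte unsigned length in network byte order.
HEADER = struct.Struct("!I")


class RecordingFramer:
    """Hypothetical stand-in for MessageFramer; it records the data that
    would be handed to the next protocol in the stack."""

    def __init__(self):
        self.sent = []

    def send(self, connection, message_data):
        self.sent.append(bytes(message_data))


def new_sent_message(framer, connection, message_data,
                     message_context=None, end_of_message=True):
    """Handle a NewSentMessage event by prepending a length header."""
    header = HEADER.pack(len(message_data))
    # Concatenation copies the payload. An implementation concerned with
    # performance would pass the original buffer through without copying.
    framer.send(connection, header + message_data)
```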
6.3. Receiver-side Message Framing 6.3. Receiver-Side Message Framing
In order to parse a received flow of data into Messages, the Message In order to parse a received flow of data into Messages, the Message
Framer notifies the framer implementation whenever new data is Framer notifies the Framer implementation whenever new data is
available to parse. available to parse.
The parameters to the events and calls for receiving data with a The parameters to the events and calls for receiving data with a
framer align with the Receive action in the API (Section 9.3 of Framer align with the Receive action in the API (Section 9.3 of
[I-D.ietf-taps-interface]). [RFC9622]).
MessageFramer -> HandleReceivedData<connection> MessageFramer -> HandleReceivedData<connection>
Upon receiving this event, the framer implementation can inspect the Upon receiving this event, the Framer implementation can inspect the
inbound data. The data is parsed from a particular cursor inbound data. The data is parsed from a particular cursor
representing the unprocessed data. The application requests a representing the unprocessed data. The application requests a
specific amount of data it needs to have available in order to parse. specific amount of data it needs to have available in order to parse.
If the data is not available, the parse fails. If the data is not available, the parse fails.
MessageFramer.Parse(connection, minimumIncompleteLength, maximumLength) MessageFramer.Parse(connection, minimumIncompleteLength, maximumLength)
| |
V V
(messageData, messageContext, endOfMessage) (messageData, messageContext, endOfMessage)
The framer implementation can directly advance the receive cursor The Framer implementation can directly advance the receive cursor
once it has parsed data to effectively discard data (for example, once it has parsed data to effectively discard data (for example,
discard a header once the content has been parsed). discard a header once the content has been parsed).
To deliver a Message to the application, the framer implementation To deliver a Message to the application, the Framer implementation
can either directly deliver data that it has allocated, or deliver a can either directly deliver data that it has allocated or deliver a
range of data directly from the underlying transport and range of data directly from the underlying transport and
simultaneously advance the receive cursor. simultaneously advance the receive cursor.
MessageFramer.AdvanceReceiveCursor(connection, length) MessageFramer.AdvanceReceiveCursor(connection, length)
MessageFramer.DeliverAndAdvanceReceiveCursor(connection, messageContext, length, endOfMessage) MessageFramer.DeliverAndAdvanceReceiveCursor(connection, messageContext,
MessageFramer.Deliver(connection, messageContext, messageData, endOfMessage) length, endOfMessage)
MessageFramer.Deliver(connection, messageContext, messageData,
endOfMessage)
Note that MessageFramer.DeliverAndAdvanceReceiveCursor allows the Note that MessageFramer.DeliverAndAdvanceReceiveCursor allows the
framer implementation to earmark bytes as part of a Message even Framer implementation to earmark bytes as part of a Message even
before they are received by the transport. This allows the delivery before they are received by the transport. This allows the delivery
of very large Messages without requiring the implementation to of very large Messages without requiring the implementation to
directly inspect all of the bytes. directly inspect all of the bytes.
To provide an example, a simple protocol that parses the length of To provide an example, a simple protocol that parses the length of
the Message data as a header value would receive the the Message data as a header value would receive the
HandleReceivedData event, and call Parse with a minimum and maximum HandleReceivedData event and call Parse with a minimum and maximum
set to the length of the header field. Once the parse succeeded, it set to the length of the header field. Once the parse succeeded, it
would call AdvanceReceiveCursor with the length of the header field, would call AdvanceReceiveCursor with the length of the header field
and then call DeliverAndAdvanceReceiveCursor with the length of the and then call DeliverAndAdvanceReceiveCursor with the length of the
body that was parsed from the header, marking the new Message as body that was parsed from the header, marking the new Message as
complete. complete.
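The receive-side counterpart of the length-header example can be sketched as shown below. The BufferedFramer class is a hypothetical stand-in that keeps unprocessed data in a local buffer, the 4-byte big-endian header is an assumption, and a failed parse is represented by returning None. As a simplification, this sketch waits until the whole body has arrived before delivering it, whereas an implementation could earmark bytes that have not yet been received.

```python
import struct

HEADER = struct.Struct("!I")  # assumed 4-byte big-endian length header


class BufferedFramer:
    """Hypothetical stand-in for MessageFramer on the receive side."""

    def __init__(self):
        self.buffer = bytearray()   # unprocessed data after the cursor
        self.delivered = []         # (message_data, end_of_message) pairs

    def parse(self, connection, minimum_incomplete_length, maximum_length):
        if len(self.buffer) < minimum_incomplete_length:
            return None             # not enough data: the parse fails
        return bytes(self.buffer[:maximum_length])

    def advance_receive_cursor(self, connection, length):
        del self.buffer[:length]

    def deliver_and_advance_receive_cursor(self, connection, message_context,
                                           length, end_of_message):
        self.delivered.append((bytes(self.buffer[:length]), end_of_message))
        del self.buffer[:length]


def handle_received_data(framer, connection):
    """Handle a HandleReceivedData event for a length-prefixed protocol."""
    while True:
        header = framer.parse(connection, HEADER.size, HEADER.size)
        if header is None:
            return
        (body_length,) = HEADER.unpack(header)
        total = HEADER.size + body_length
        if framer.parse(connection, total, total) is None:
            return                  # simplification: wait for the full body
        framer.advance_receive_cursor(connection, HEADER.size)
        framer.deliver_and_advance_receive_cursor(
            connection, None, body_length, True)
```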
7. Implementing Connection Management 7. Implementing Connection Management
Once a Connection is established, the Transport Services API allows Once a Connection is established, the Transport Services API allows
applications to interact with the Connection by modifying or applications to interact with the Connection by modifying or
inspecting Connection Properties. A Connection can also generate inspecting Connection Properties. A Connection can also generate
error events in the form of SoftError events. error events in the form of SoftError events.
The set of Connection Properties that are supported for setting and The set of Connection Properties that are supported for setting and
getting on a Connection are described in [I-D.ietf-taps-interface]. getting on a Connection are described in [RFC9622]. For any
For any properties that are generic, and thus could apply to all Properties that are generic and, thus, could apply to all protocols
protocols being used by a Connection, the Transport Services being used by a Connection, the Transport Services Implementation
Implementation should store the properties in storage common to all should store the Properties in storage common to all protocols and
protocols, and notify the Protocol Stack as a whole whenever the notify the Protocol Stack as a whole whenever the Properties have
properties have been modified by the application. [RFC8303] and been modified by the application. [RFC8303] and [RFC8304] offer
[RFC8304] offer guidance on how to do this for TCP, MPTCP, SCTP, UDP guidance on how to do this for TCP, Multipath TCP (MPTCP), SCTP, UDP,
and UDP-Lite; see Section 10 for a description of a back-tracking and UDP-Lite; see Section 10 for a description of a backtracking
method to find the relevant protocol primitives using these method to find the relevant protocol primitives using these
documents. For Protocol-specific Properties, such as the User documents. For Protocol-specific Properties, such as the User
Timeout that applies to TCP, the Transport Services Implementation Timeout that applies to TCP, the Transport Services Implementation
only needs to update the relevant protocol instance. only needs to update the relevant protocol instance.
Some Connection Properties might apply to multiple protocols within a Some Connection Properties might apply to multiple protocols within a
Protocol Stack. Depending on the specific property, it might be Protocol Stack. Depending on the specific Property, it might be
appropriate to apply the property across multiple protocols appropriate to apply the Property across multiple protocols
simultaneously, or else only apply it to one protocol. In general, simultaneously or only apply it to one protocol. In general, the
the Transport Services Implementation should allow the protocol Transport Services Implementation should allow the protocol closest
closest to the application to interpret Connection Properties, and to the application to interpret Connection Properties and,
potentially modify the set of Connection Properties passed down to potentially, modify the set of Connection Properties passed down to
the next protocol in the stack. For example, if the application has the next protocol in the stack. For example, if the application has
requested to use keepalives with the keepAlive property, and the requested to use keep-alives with the keepAlive Property, and the
Protocol Stack contains both HTTP/2 and TCP, the HTTP/2 protocol can Protocol Stack contains both HTTP/2 and TCP, the HTTP/2 protocol can
choose to enable its own keepalives to satisfy the application choose to enable its own keep-alives to satisfy the application
request, and disable TCP-level keepalives. For cases where the request and disable TCP-level keep-alives. For cases where the
application needs to have fine-grained per-protocol control, the application needs to have fine-grained per-protocol control, the
Transport Services Implementation can expose Protocol-specific Transport Services Implementation can expose Protocol-specific
Properties. Properties.
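The keep-alive example can be illustrated with a small sketch in which each protocol interprets the Connection Properties and returns the set to pass down to the next protocol. The class names, the interpret method, and the per-protocol setting names (ping_keepalive, so_keepalive) are hypothetical and exist only for this illustration.

```python
class Http2:
    name = "HTTP/2"

    def interpret(self, properties):
        settings, passed_down = {}, dict(properties)
        if properties.get("keepAlive"):
            settings["ping_keepalive"] = True  # satisfy the request here
            passed_down["keepAlive"] = False   # disable lower-level keep-alives
        return settings, passed_down


class Tcp:
    name = "TCP"

    def interpret(self, properties):
        settings = {"so_keepalive": bool(properties.get("keepAlive"))}
        return settings, dict(properties)


def apply_properties(stack, properties):
    """Apply Properties to a stack ordered from the protocol closest to
    the application down to the protocol closest to the network."""
    remaining = dict(properties)
    applied = {}
    for protocol in stack:
        settings, remaining = protocol.interpret(remaining)
        applied[protocol.name] = settings
    return applied
```

With both protocols in the stack, HTTP/2 handles the request and TCP-level keep-alives stay off; with TCP alone, TCP enables its own keep-alives.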
If an error is encountered in setting a property (for example, if the If an error is encountered in setting a Property (for example, if the
application tries to set a TCP-specific property on a Connection that application tries to set a TCP-specific Property on a Connection that
is not using TCP), the action must fail gracefully. The application is not using TCP), the action must fail gracefully. The application
must be informed of the error, but the Connection itself must not be must be informed of the error but the Connection itself must not be
terminated. terminated.
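A graceful failure of this kind might be sketched as follows. The Connection class, the PropertyError exception, and the convention that a Protocol-specific Property name starts with a protocol prefix followed by a dot are assumptions made for this example; an actual implementation could report the error through a different mechanism.

```python
class PropertyError(Exception):
    """Reported to the application when a Property cannot be set."""


class Connection:
    """Hypothetical stand-in for an established Connection."""

    def __init__(self, protocols):
        self.protocols = set(protocols)  # e.g. {"tcp"} or {"udp"}
        self.properties = {}
        self.state = "established"

    def set_property(self, name, value):
        # Assumed naming: "<protocol>.<property>" marks a
        # Protocol-specific Property; other names are generic.
        if "." in name:
            protocol = name.split(".", 1)[0]
            if protocol not in self.protocols:
                # Inform the application, but leave the Connection intact.
                raise PropertyError(
                    f"{name} does not apply: {protocol} is not in use")
        self.properties[name] = value
```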
When protocol instances in the Protocol Stack report generic or When protocol instances in the Protocol Stack report generic or
protocol-specific errors, the API will deliver them to the protocol-specific errors, the API will deliver them to the
application as SoftError events. These allow the application to be application as SoftError events. These allow the application to be
informed of ICMP errors, and other similar events. informed of ICMP errors and other similar events.
7.1. Pooled Connection 7.1. Pooled Connection
For applications that do not need in-order delivery of Messages, the For applications that do not need in-order delivery of Messages, the
Transport Services Implementation may distribute Messages of a single Transport Services Implementation may distribute Messages of a single
Connection across several underlying transport connections or Connection across several underlying transport connections or
multiple streams of multi-streaming connections between endpoints, as multiple streams of multistreaming connections between endpoints, as
long as all of these satisfy the Selection Properties. The Transport long as all of these satisfy the Selection Properties. The Transport
Services Implementation will then hide this connection management and Services Implementation will then hide this connection management and
only expose a single Connection object, which we here call a "Pooled only expose a single Connection object, which we call a Pooled
Connection". This is in contrast to Connection Groups, which Connection. This is in contrast to Connection Groups, which
explicitly expose combined treatment of Connections, giving the explicitly expose combined treatment of Connections, giving the
application control over multiplexing, for example. application control over multiplexing, for example.
Pooled Connections can be useful when the application using the Pooled Connections can be useful when the application using the
Transport Services system implements a protocol such as HTTP, which Transport Services System implements a protocol such as HTTP, which
employs request/response pairs and does not require in-order delivery employs request/response pairs and does not require in-order delivery
of responses. This enables implementations of Transport Services of responses. This enables implementations of Transport Services
systems to realize transparent connection coalescing, connection Systems to realize transparent connection coalescing and connection
migration, and to perform per-message endpoint and path selection by migration and to perform per-Message endpoint and path selection by
choosing among multiple underlying connections. choosing among multiple underlying connections.
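As a rough illustration, a Pooled Connection could distribute Messages over its underlying transport connections as in the sketch below. Round-robin distribution is only one possible strategy and is chosen here for brevity; the Transport and PooledConnection classes are hypothetical, and all transports in the pool are assumed to already satisfy the Selection Properties.

```python
class Transport:
    """Hypothetical stand-in for one underlying transport connection."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def send(self, message):
        self.sent.append(message)


class PooledConnection:
    """Single object exposed to the application; hides the pool."""

    def __init__(self, transports):
        self._transports = list(transports)
        self._next = 0

    def send(self, message):
        if not self._transports:
            raise ConnectionError("no underlying transport available")
        # Round-robin choice among the transports currently in the pool.
        transport = self._transports[self._next % len(self._transports)]
        self._next += 1
        transport.send(message)

    def remove_transport(self, transport):
        # For example, after the path used by this transport is lost.
        # The Pooled Connection itself stays usable.
        self._transports.remove(transport)
```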
7.2. Handling Path Changes 7.2. Handling Path Changes
When a path change occurs, e.g., when the IP address of an interface When a path change occurs, e.g., when the IP address of an interface
changes or a new interface becomes available, the Transport Services changes or a new interface becomes available, the Transport Services
Implementation is responsible for notifying the Protocol Instance of Implementation is responsible for notifying the protocol instance of
the change. The path change may interrupt connectivity on a path for the change. The path change may interrupt connectivity on a path for
an active Connection or provide an opportunity for a transport that an active Connection or provide an opportunity for a transport that
supports multipath or migration to adapt to the new paths. Note supports multipath or migration to adapt to the new paths. Note
that, in the model of the Transport Services API, migration is that, in the model of the Transport Services API, migration is
considered a part of multipath connectivity; it is just a limiting considered a part of multipath connectivity; it is just a limiting
policy on multipath usage. If the multipath Selection Property is policy on multipath usage. If the multipath Selection Property is
set to Disabled, migration is disallowed. set to Disabled, migration is disallowed.
For protocols that do not support multipath or migration, the For protocols that do not support multipath or migration, the
Protocol Instances should be informed of the path change, but should protocol instances should be informed of the path change but should
not be forcibly disconnected if the previously used path becomes not be forcibly disconnected if the previously used path becomes
unavailable. There are many common usage scenarios that can lead to unavailable. There are many common usage scenarios that can lead to
a path becoming temporarily unavailable, and then recovering before a path becoming temporarily unavailable and then recovering before
the transport protocol reaches a timeout error. These are the transport protocol reaches a timeout error. These are
particularly common using mobile devices. Examples include: an particularly common using mobile devices. Examples include:
Ethernet cable becoming unplugged and then plugged back in; a device
losing a Wi-Fi signal while a user is in an elevator, and reattaching * an Ethernet cable becoming unplugged and then plugged back in;
when the user leaves the elevator; and a user losing the radio signal
while riding a train through a tunnel. If the device is able to * a device losing a Wi-Fi signal while a user is in an elevator and
rejoin a network with the same IP address, a stateful transport reattaching when the user leaves the elevator; and
connection can generally resume. Thus, while it is useful for a
Protocol Instance to be aware of a temporary loss of connectivity, * a user losing the radio signal while riding a train through a
the Transport Services Implementation should not aggressively close tunnel.
Connections in these scenarios.
If the device is able to rejoin a network with the same IP address, a
stateful transport connection can generally resume. Thus, while it
is useful for a protocol instance to be aware of a temporary loss of
connectivity, the Transport Services Implementation should not
aggressively close Connections in these scenarios.
If the Protocol Stack includes a transport protocol that supports If the Protocol Stack includes a transport protocol that supports
multipath connectivity, the Transport Services Implementation should multipath connectivity, the Transport Services Implementation should
also inform the Protocol Instance about potentially new paths that also inform the protocol instance about potentially new paths that
become permissible based on the multipath Selection Property and the become permissible based on the multipath Selection Property and the
multipathPolicy Connection Property choices made by the application. multipathPolicy Connection Property choices made by the application.
A protocol can then establish new subflows over new paths while an A protocol can then establish new subflows over new paths while an
active path is still available or, if migration is supported, also active path is still available or after a break has been detected,
after a break has been detected, and should attempt to tear down and it should attempt to tear down subflows over paths that are no
subflows over paths that are no longer used. The Connection Property longer used. The Connection Property multipathPolicy of the
multipathPolicy of the Transport Services API allows an application Transport Services API allows an application to indicate when and how
to indicate when and how different paths should be used. However, different paths should be used. However, detailed handling of these
detailed handling of these policies is implementation-specific. For policies is implementation specific. For example, if the multipath
example, if the multipath Selection Property is set to active, the Selection Property is set to Active, the decision about when to
decision about when to create a new path or to announce a new path or create a new path or to announce a new path or set of paths to the
set of paths to the Remote Endpoint, e.g., in the form of additional Remote Endpoint, e.g., in the form of additional IP addresses, is
IP addresses, is implementation-specific. If the Protocol Stack implementation specific. If the Protocol Stack includes a transport
includes a transport protocol that does not support multipath, but protocol that does not support multipath but does support migrating
does support migrating between paths, the update to the set of between paths, the update to the set of available paths can trigger
available paths can trigger the connection to be migrated. the connection to be migrated.
In the case of a Pooled Connection Section 7.1, the Transport In the case of a Pooled Connection (Section 7.1), the Transport
Services Implementation may add connections over new paths to the Services Implementation may add connections over new paths to the
pool if permissible based on the multipath policy and Selection pool if permissible based on the multipathPolicy and Selection
Properties. In the case that a previously used path becomes Properties. If a previously used path becomes unavailable, the
unavailable, the Transport Services system may disconnect all Transport Services System may disconnect all connections that require
connections that require this path, but should not disconnect the this path, but it should not disconnect the Pooled Connection object
pooled Connection object exposed to the application. The strategy to exposed to the application. The strategy to do so is implementation
do so is implementation-specific, but should be consistent with the specific, but it should be consistent with the behavior of multipath
behavior of multipath transports. transports.
8. Implementing Connection Termination 8. Implementing Connection Termination
For Close (which leads to a Closed event) and Abort (which leads to a For Close (which leads to a Closed event) and Abort (which leads to a
ConnectionError event), the application might find it useful to be ConnectionError event), the application might find it useful to be
informed when a peer closes or aborts a Connection. Whether this is informed when a peer closes or aborts a Connection. Whether this is
possible depends on the underlying protocol, and no guarantees can be possible depends on the underlying protocol, and no guarantees can be
given. When an underlying transport connection supports multi- given. When an underlying transport connection supports
streaming (such as SCTP), the Transport Services system can use a multistreaming (such as SCTP), the Transport Services System can use
stream reset procedure to cause a Finish event upon a Close action a stream reset procedure to cause a Finish event upon a Close action
from the peer [NEAT-flow-mapping]. from the peer [NEAT-flow-mapping].
9. Cached State 9. Cached State
Beyond a single Connection's lifetime, it is useful for an Beyond a single Connection's lifetime, it is useful for an
implementation to keep state and history. This cached state can help implementation to keep state and history. This cached state can help
improve future Connection establishment due to re-using results and improve future Connection establishment due to reusing results and
credentials, and favoring paths and protocols that performed well in credentials and favoring paths and protocols that performed well in
the past. the past.
Cached state may be associated with different endpoints for the same Cached state may be associated with different endpoints for the same
Connection, depending on the protocol generating the cached content. Connection, depending on the protocol generating the cached content.
For example, session tickets for TLS are associated with specific For example, session tickets for TLS are associated with specific
endpoints, and thus should be cached based on a connection's hostname endpoints; thus, they should be cached based on a connection's
Endpoint Identifer (if applicable). However, performance hostname Endpoint Identifier (if applicable). However, performance
characteristics of a path are more likely tied to the IP address and characteristics of a path are more likely tied to the IP address and
subnet being used. subnet being used.
9.1. Protocol state caches 9.1. Protocol State Caches
Some protocols will have long-term state to be cached in association Some protocols will have long-term state to be cached in association
with endpoints. This state often has some time after which it is with endpoints. This state often has some time after which it is
expired, so the implementation should allow each protocol to specify expired, so the implementation should allow each protocol to specify
an expiration for cached content. an expiration for cached content.
Examples of cached protocol state include: Examples of cached protocol state include:
* The DNS protocol can cache resolved addresses (such as those * The DNS protocol can cache resolved addresses (such as those
retrieved from A and AAAA queries), associated with a Time To Live retrieved from A and AAAA queries) associated with a Time To Live
(TTL) to be used for future hostname resolutions without requiring (TTL) to be used for future hostname resolutions without requiring
asking the DNS resolver again. asking the DNS resolver again.
* TLS caches session state and tickets based on a hostname, which * TLS caches session state and tickets based on a hostname, which
can be used for resuming sessions with a server. can be used for resuming sessions with a server.
* TCP can cache cookies for use in TCP Fast Open. * TCP can cache cookies for use in TFO.
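A cache in which each protocol specifies an expiration for its content could be sketched as follows. The ProtocolStateCache class, its method names, and the (protocol, endpoint) key are assumptions made for illustration; the clock is injectable only so that expiry can be exercised without waiting.

```python
import time


class ProtocolStateCache:
    """Hypothetical cache keyed by (protocol, endpoint) with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, protocol, endpoint, value, lifetime):
        # The protocol supplies the lifetime in seconds, e.g. a DNS TTL.
        expires_at = self._clock() + lifetime
        self._entries[(protocol, endpoint)] = (value, expires_at)

    def get(self, protocol, endpoint):
        entry = self._entries.get((protocol, endpoint))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[(protocol, endpoint)]  # expired: drop it
            return None
        return value
```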
Cached protocol state is primarily used during Connection Cached protocol state is primarily used during Connection
establishment for a single Protocol Stack, but may be used to establishment for a single Protocol Stack, but it may be used to
influence an implementation's preference between several candidate influence an implementation's preference between several Candidate
Protocol Stacks. For example, if two IP address Endpoint Identifers Protocol Stacks. For example, if two IP address Endpoint Identifiers
are otherwise equally preferred, an implementation may choose to are otherwise equally preferred, an implementation may choose to
attempt a connection to an address for which it has a TCP Fast Open attempt a connection to an address for which it has a TFO cookie.
cookie.
Applications can use the Transport Services API to request that a Applications can use the Transport Services API to request that a
Connection Group maintain a separate cache for protocol state. Connection Group maintain a separate cache for protocol state.
Connections in the group will not use cached state from Connections Connections in the group will not use Cached State from Connections
outside the group, and Connections outside the group will not use outside the group, and Connections outside the group will not use
state cached from Connections inside the group. This may be state cached from Connections inside the group. This may be
necessary, for example, if application-layer identifiers rotate and necessary, for example, if application-layer identifiers rotate and
clients wish to avoid linkability via trackable TLS tickets or TFO clients wish to avoid linkability via trackable TLS tickets or TFO
cookies. cookies.
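The isolation between a Connection Group's cache and the cache used by other Connections can be sketched as below. The GroupedStateCache class and the use of None to mean "no separate cache requested" are assumptions made for this example.

```python
class GroupedStateCache:
    """Hypothetical cache that keeps a separate store for each
    Connection Group that requested one, plus a shared store."""

    _SHARED = object()  # key for Connections without a separate cache

    def __init__(self):
        self._stores = {}

    def _store_for(self, group):
        key = self._SHARED if group is None else group
        return self._stores.setdefault(key, {})

    def put(self, protocol, endpoint, value, group=None):
        self._store_for(group)[(protocol, endpoint)] = value

    def get(self, protocol, endpoint, group=None):
        # State cached under one group is never visible to another group
        # or to the shared store, and vice versa.
        return self._store_for(group).get((protocol, endpoint))
```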
9.2. Performance caches 9.2. Performance Caches
In addition to protocol state, Protocol Instances should provide data In addition to protocol state, protocol instances should provide data
into a performance-oriented cache to help guide future protocol and into a performance-oriented cache to help guide future protocol and
path selection. Some performance information can be gathered path selection. Some performance information can be gathered
generically across several protocols to allow predictive comparisons generically across several protocols to allow predictive comparisons
between protocols on given paths: between protocols on given paths:
* Observed Round Trip Time * Observed RTT
* Connection establishment latency * Connection establishment latency
* Connection establishment success rate * Connection establishment success rate
These items can be cached on a per-address and per-subnet These items can be cached on a per-address and per-subnet granularity
granularity, and averaged between different values. The information and averaged between different values. The information should be
should be cached on a per-network basis, since it is expected that cached on a per-network basis since it is expected that different
different network attachments will have different performance network attachments will have different performance characteristics.
characteristics. Besides Protocol Instances, other system entities Besides protocol instances, other system entities may also provide
may also provide data into performance-oriented caches. This could data into performance-oriented caches. This could for instance be
for instance be signal strength information reported by radio modems signal strength information reported by radio modems like Wi-Fi and
like Wi-Fi and mobile broadband or information about the battery- mobile broadband or information about the battery level of the
level of the device. Furthermore, the system may cache the observed device. Furthermore, the system may cache the observed maximum
maximum throughput on a path as an estimate of the available throughput on a path as an estimate of the available bandwidth.
bandwidth.
An implementation should use this information, when possible, to
influence preference between Candidate Paths, endpoints, and protocol
options. Eligible options that historically had significantly better
performance than others should be selected first when gathering
candidates (see Section 4.2) to ensure better performance for the
application.
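The preference described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory cache keyed by (network, candidate); the names CachedPerformance and rank_candidates, the candidate labels, and the sort order (success rate first, then expected setup cost) are assumptions made for illustration and are not defined by this document.

```python
from dataclasses import dataclass


@dataclass
class CachedPerformance:
    """Hypothetical per-network cache entry for one candidate."""
    rtt_ms: float        # averaged observed RTT
    establish_ms: float  # averaged connection establishment latency
    success_rate: float  # fraction of successful establishments, 0.0..1.0


def rank_candidates(candidates, cache, network):
    """Order candidates so that historically better ones are tried first.

    Candidates without cached data keep their relative order and are
    placed after candidates with cached data (an arbitrary choice here).
    """
    def key(item):
        index, candidate = item
        entry = cache.get((network, candidate))
        if entry is None:
            return (1, 0.0, 0.0, index)
        # Higher success rate first, then lower expected setup cost.
        return (0, -entry.success_rate,
                entry.establish_ms + entry.rtt_ms, index)

    return [c for _, c in sorted(enumerate(candidates), key=key)]


# Illustrative data only; addresses are from documentation ranges.
cache = {
    ("wifi-home", "192.0.2.10:443/tcp"): CachedPerformance(40.0, 90.0, 0.99),
    ("wifi-home", "192.0.2.10:443/sctp"): CachedPerformance(35.0, 45.0, 0.70),
    ("cellular", "192.0.2.10:443/sctp"): CachedPerformance(80.0, 95.0, 0.99),
}
candidates = [
    "192.0.2.10:443/sctp",
    "198.51.100.7:443/tcp",
    "192.0.2.10:443/tcp",
]
ordered = rank_candidates(candidates, cache, "wifi-home")
```

Because the cache key includes the network, the same candidate list can be ordered differently on a different network attachment.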
The reasonable lifetime for cached performance values will vary
depending on the nature of the value. Certain information, like the
connection establishment success rate to a Remote Endpoint using a
given Protocol Stack, can be stored for a long period of time (hours
or longer) since it is expected that the capabilities of the Remote
Endpoint are not changing very quickly. On the other hand, the RTT
observed by TCP over a particular network path may vary over a
relatively short time interval. For such values, the implementation
should remove them from the cache more quickly or treat older values
with less confidence/weight.

[RFC9040] provides guidance about sharing of TCP Control Block
information between connections on initialization.
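One possible way to treat older values with less weight is an age-weighted average with a per-value half-life. The sketch below is only an illustration: the class name DecayingEstimate, the exponential weighting, and the eviction threshold are assumptions, not something this document prescribes.

```python
import time


class DecayingEstimate:
    """Weighted average in which a sample's weight halves every half_life_s."""

    def __init__(self, half_life_s):
        self.half_life_s = half_life_s
        self.samples = []  # list of (timestamp, value)

    def add(self, value, now=None):
        timestamp = time.monotonic() if now is None else now
        self.samples.append((timestamp, value))

    def value(self, now=None, min_weight=0.01):
        """Return the age-weighted average, or None if nothing usable remains."""
        now = time.monotonic() if now is None else now
        total = 0.0
        weight_sum = 0.0
        kept = []
        for timestamp, sample in self.samples:
            weight = 0.5 ** ((now - timestamp) / self.half_life_s)
            if weight < min_weight:
                continue  # too old to be trusted: drop it from the cache
            kept.append((timestamp, sample))
            total += weight * sample
            weight_sum += weight
        self.samples = kept
        return total / weight_sum if weight_sum else None


# Short half-life for a fast-changing value such as RTT (seconds are
# illustrative); a success rate could use hours instead.
rtt_ms = DecayingEstimate(half_life_s=60)
rtt_ms.add(100.0, now=0.0)
rtt_ms.add(50.0, now=60.0)
recent = rtt_ms.value(now=60.0)      # older sample counts half as much
stale = rtt_ms.value(now=10_000.0)   # everything has aged out
```

A long-lived value such as the establishment success rate could use the same structure with a half-life of several hours.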
10. Specific Transport Protocol Considerations

Each protocol that is supported by a Transport Services
Implementation should have a well-defined API mapping. API mappings
for a protocol are important for Connections in which a given
protocol is the "top" of the Protocol Stack. For example, the
mapping of the Send action for TCP applies to Connections in which
the application directly sends over TCP.

Each protocol has a notion of "Connectedness". Possible definitions
of Connectedness for various types of protocols are:

Connectionless: Connectionless protocols do not establish explicit
   state between endpoints and do not perform a handshake during
   connection establishment.

Connected: Connected (also called "connection-oriented") protocols
   establish state between endpoints and perform a handshake during
   connection establishment. The handshake may be 0-RTT to send data
   or resume a session, but bidirectional traffic is required to
   confirm Connectedness.

Multiplexing connected: Multiplexing connected protocols share
   properties with connected protocols but also explicitly support
   opening multiple application-level flows. This means that they
   can support cloning new Connection objects without a new explicit
   handshake.

Protocols also have a notion of "Data Unit". Possible values for
Data Unit are:

Byte-stream: Byte-stream protocols do not define any message
   boundaries of their own apart from the end of a stream in each
   direction.

Datagram: Datagram protocols define message boundaries at the same
   level of transmission, such that only complete (not partial)
   messages are supported.

Message: Message protocols support message boundaries that can be
   sent and received either as complete or partial messages. Maximum
   message lengths can be defined, and messages can be partially
   reliable.
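The two classifications above can be captured as small enumerations. The sketch below tabulates the Connectedness and Data Unit values that this document assigns to each protocol in Sections 10.1 through 10.6; the Python names (Connectedness, DataUnit, PROTOCOLS, and the helper functions) are assumptions chosen for illustration.

```python
from enum import Enum


class Connectedness(Enum):
    CONNECTIONLESS = "Connectionless"
    CONNECTED = "Connected"
    MULTIPLEXING_CONNECTED = "Multiplexing connected"


class DataUnit(Enum):
    BYTE_STREAM = "Byte-stream"
    DATAGRAM = "Datagram"
    MESSAGE = "Message"


# Values as listed in the per-protocol mappings of this document.
PROTOCOLS = {
    "TCP": (Connectedness.CONNECTED, DataUnit.BYTE_STREAM),
    "MPTCP": (Connectedness.CONNECTED, DataUnit.BYTE_STREAM),
    "UDP": (Connectedness.CONNECTIONLESS, DataUnit.DATAGRAM),
    "UDP-Lite": (Connectedness.CONNECTIONLESS, DataUnit.DATAGRAM),
    "UDP Multicast Receive": (Connectedness.CONNECTIONLESS, DataUnit.DATAGRAM),
    "SCTP": (Connectedness.CONNECTED, DataUnit.MESSAGE),
}


def performs_handshake(protocol):
    """Connected and multiplexing connected protocols perform a handshake."""
    return PROTOCOLS[protocol][0] is not Connectedness.CONNECTIONLESS


def only_complete_messages(protocol):
    """Datagram protocols support only complete (not partial) messages."""
    return PROTOCOLS[protocol][1] is DataUnit.DATAGRAM
```

An implementation could consult such a table when deciding, for example, whether a Ready event has to wait for a handshake.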
Below, terms in capitals with a dot character (".") (e.g.,
"CONNECT.SCTP") refer to the primitives with the same name in
Section 4 of [RFC8303]. For further implementation details, the
description of these primitives in [RFC8303] points to Section 3 of
[RFC8303] and Section 3 of [RFC8304], which refers back to the
relevant specifications for each protocol. This applies to all
elements of [RFC8923] (see Appendix C of [RFC9622]): they are listed
in Appendix A of [RFC8923] with an implementation hint in the same
style, pointing back to Section 4 of [RFC8303].

This document presents the protocol mappings defined in [RFC8923].
Other protocol mappings can be provided as separate documents,
following the mapping template in Appendix A.
10.1. TCP

Connectedness: Connected

Data Unit: Byte-stream

Connection Object: TCP connections between two hosts map directly to
   Connection objects.

Initiate: CONNECT.TCP. Calling Initiate on a TCP connection causes
   it to reserve a local port and send a SYN to the Remote Endpoint.

InitiateWithSend: CONNECT.TCP with parameter user message. Early
   safely replayable data is sent on a TCP connection in the SYN, as
   TFO data.

Ready: A TCP connection is ready once the three-way handshake is
   complete.

EstablishmentError: Failure of CONNECT.TCP. TCP can throw various
   errors during connection setup. Specifically, it is important to
   handle a RST being sent by the peer during the handshake.

ConnectionError: Once established, TCP throws errors whenever the
   connection is disconnected, such as due to receiving a RST from
   the peer.

Listen: LISTEN.TCP. Calling Listen for TCP binds a local port and
   prepares it to receive inbound SYN packets from peers.

ConnectionReceived: TCP Listeners will deliver new connections once
   they have replied to an inbound SYN with a SYN-ACK.

Clone: Calling Clone on a TCP connection creates a new TCP
   connection with equivalent parameters. The two associated
   Connection objects, and Connections generated via later calls to
   Clone on an Established Connection, form a Connection Group. To
   realize entanglement for these Connections, with the exception of
   connPriority, changing a Connection Property on one of them must
   affect the Connection Properties of the others too. No guarantees
   of honoring the connPriority Connection Property are given; thus,
   it is safe for an implementation of a Transport Services System to
   ignore this Property. When it is reasonable to assume that
   Connections traverse the same path (e.g., when they share the same
   encapsulation), support for it can also experimentally be
   implemented using a congestion control coupling mechanism (for
   example, see [TCP-COUPLING] or [RFC3124]).

Send: SEND.TCP. On its own, TCP does not preserve Message
   boundaries. Calling Send on a TCP connection lays out the bytes
   on the TCP send stream without any other delineation. Any Message
   marked as Final will cause TCP to send a FIN once the Message has
   been completely written, by calling CLOSE.TCP immediately upon
   successful termination of SEND.TCP. Note that transmitting a
   Message marked as Final should not cause the Closed event to be
   delivered to the application as it will still be possible to
   receive data until the peer closes or aborts the TCP connection.

Receive: With RECEIVE.TCP, TCP delivers a stream of bytes without
   any Message delineation. All data delivered in the Received or
   ReceivedPartial event will be part of a single stream-wide Message
   that is marked Final (unless a Message Framer is used). The value
   of the endOfMessage Property will be delivered when the TCP
   connection has received a FIN (CLOSE-EVENT.TCP) from the peer.
   Note that reception of a FIN should not cause the Closed event to
   be delivered to the application, as it will still be possible for
   the application to send data.

Close: Calling Close on a TCP connection indicates that the TCP
   connection should be gracefully closed (CLOSE.TCP) by sending a
   FIN to the peer. It will then still be possible to receive data
   until the peer closes or aborts the TCP connection. The Closed
   event will be issued upon reception of a FIN.

Abort: Calling Abort on a TCP connection indicates that the TCP
   connection should be immediately closed by sending a RST to the
   peer (ABORT.TCP).

CloseGroup: Calling CloseGroup on a TCP connection (CLOSE.TCP) is
   identical to calling Close on its Connection object and on all
   Connections in the same ConnectionGroup.

AbortGroup: Calling AbortGroup on a TCP connection (ABORT.TCP) is
   identical to calling Abort on its Connection object and on all
   Connections in the same ConnectionGroup.
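The interaction between Final Messages, FIN reception, and the Closed event in the Send, Receive, and Close entries above can be modeled with a toy state machine. This is a sketch under one reading of the text: it assumes that the Closed event is delivered once a FIN has been both sent and received, regardless of order. The class TcpConnectionModel and the event strings are hypothetical and exist only for illustration; no real TCP traffic is generated.

```python
class TcpConnectionModel:
    """Toy model of FIN and Closed-event handling in the TCP mapping."""

    def __init__(self):
        self.local_fin_sent = False
        self.peer_fin_received = False
        self.wire = []    # primitives that would be invoked on TCP, in order
        self.events = []  # events delivered to the application

    def send(self, data, final=False):
        if self.local_fin_sent:
            raise RuntimeError("cannot send after a FIN has been sent")
        self.wire.append(("SEND.TCP", data))
        if final:
            # A Final Message triggers CLOSE.TCP right after SEND.TCP.
            self._send_fin()

    def close(self):
        if not self.local_fin_sent:
            self._send_fin()

    def on_peer_fin(self):
        """Called when CLOSE-EVENT.TCP reports a FIN from the peer."""
        self.peer_fin_received = True
        self.events.append("Received(endOfMessage=True)")
        self._maybe_closed()

    def _send_fin(self):
        self.wire.append(("CLOSE.TCP", None))
        self.local_fin_sent = True
        self._maybe_closed()

    def _maybe_closed(self):
        # Assumption: Closed is issued only when both directions are done.
        done = self.local_fin_sent and self.peer_fin_received
        if done and "Closed" not in self.events:
            self.events.append("Closed")


conn = TcpConnectionModel()
conn.send(b"request", final=True)
events_after_final = list(conn.events)  # sending FIN alone is not Closed
conn.on_peer_fin()                      # now both directions are finished
```

In this model, sending a Final Message leaves the event list empty, and the Closed event only appears after the peer's FIN has also been processed.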
10.2. MPTCP

Connectedness: Connected

Data Unit: Byte-stream

The Transport Services API mappings for MPTCP are identical to TCP.
MPTCP adds support for multipath Properties, such as multipath and
multipathPolicy, and actions for managing paths, such as AddRemote
and RemoveRemote.
10.3. UDP

Connectedness: Connectionless

Data Unit: Datagram

Connection Object: UDP connections represent a pair of specific IP
   addresses and ports on two hosts.

Initiate: CONNECT.UDP. Calling Initiate on a UDP connection causes
   it to reserve a local port but does not generate any traffic.

InitiateWithSend: Early data on a UDP connection does not have any
   special meaning. The data is sent whenever the connection is
   Ready.

Ready: A UDP connection is ready once the system has reserved a
   local port and has a path to send to the Remote Endpoint.

EstablishmentError: UDP connections can only generate errors on
   initiation due to port conflicts on the local system.

ConnectionError: UDP connections can only generate Connection errors
   in response to Abort actions. (Once in use, UDP connections can
   also generate SoftError events (ERROR.UDP) upon receiving ICMP
   notifications indicating failures in the network.)

Listen: LISTEN.UDP. Calling Listen for UDP binds a local port and
   prepares it to receive inbound UDP datagrams from peers.

ConnectionReceived: UDP Listeners will deliver new Connections once
   they have received traffic from a new Remote Endpoint.

Clone: Calling Clone on a UDP connection creates a new connection
   with equivalent parameters. The two Connection objects are
   otherwise independent.

Send: SEND.UDP. Calling Send on a UDP connection sends the data as
   the payload of a complete UDP datagram. Marking Messages as Final
   does not change anything in the datagram's contents. Upon sending
   a UDP datagram, some relevant fields and flags in the IP header
   can be controlled: DSCP (SET_DSCP.UDP), DF in IPv4 (SET_DF.UDP),
   and ECN flag (SET_ECN.UDP).

Receive: RECEIVE.UDP. UDP only delivers complete Messages to
   Received, each of which represents a single datagram received in a
   UDP packet. Upon receiving a UDP datagram, the ECN flag from the
   IP header can be obtained (GET_ECN.UDP).

Close: Calling Close on a UDP connection (ABORT.UDP) releases the
   local port reservation. A Closed event is then issued.

Abort: Calling Abort on a UDP connection (ABORT.UDP) is identical to
   calling Close except that a ConnectionError event rather than a
   Closed event is issued.

CloseGroup: Calling CloseGroup on a UDP connection (ABORT.UDP) is
   identical to calling Close on its Connection object and on all
   Connections in the same ConnectionGroup.

AbortGroup: Calling AbortGroup on a UDP connection (ABORT.UDP) is
   identical to calling Close on its Connection object and on all
   Connections in the same ConnectionGroup.
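Since UDP has no handshake, a Listener has to infer new Connections from the source of incoming datagrams, as described in the ConnectionReceived entry above. The following sketch shows one way to do that demultiplexing; the class UdpListenerModel, its event tuples, and the example addresses are hypothetical, and datagrams are passed in directly rather than read from a socket.

```python
class UdpListenerModel:
    """Toy demultiplexer: one Connection per remote (address, port) pair."""

    def __init__(self, local_endpoint):
        self.local_endpoint = local_endpoint
        self.connections = {}  # remote endpoint -> datagrams received so far
        self.events = []       # events delivered to the application

    def on_datagram(self, remote_endpoint, payload):
        if remote_endpoint not in self.connections:
            # First traffic from this Remote Endpoint: deliver a Connection.
            self.connections[remote_endpoint] = []
            self.events.append(("ConnectionReceived", remote_endpoint))
        # Each datagram is delivered as one complete Message.
        self.connections[remote_endpoint].append(payload)
        self.events.append(("Received", remote_endpoint, payload))


listener = UdpListenerModel(("192.0.2.1", 5000))
listener.on_datagram(("198.51.100.7", 40000), b"hello")
listener.on_datagram(("198.51.100.7", 40000), b"again")
listener.on_datagram(("203.0.113.9", 40001), b"hi")
```

Only the first datagram from each remote address and port pair produces a ConnectionReceived event; later datagrams are delivered on the existing Connection.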
10.4. UDP-Lite

Connectedness: Connectionless

Data Unit: Datagram

The Transport Services API mappings for UDP-Lite are identical to
UDP. In addition, UDP-Lite supports the msgChecksumLen and
recvChecksumLen Properties that allow an application to specify the
minimum number of bytes in a Message that need to be covered by a
checksum.

This includes: CONNECT.UDP-Lite; LISTEN.UDP-Lite; SEND.UDP-Lite;
RECEIVE.UDP-Lite; ABORT.UDP-Lite; ERROR.UDP-Lite; SET_DSCP.UDP-Lite;
SET_DF.UDP-Lite; SET_ECN.UDP-Lite; GET_ECN.UDP-Lite.
10.5. UDP Multicast Receive

Connectedness: Connectionless

Data Unit: Datagram

Connection Object: Established UDP Multicast Receive connections
   represent a pair of specific IP addresses and ports. The
   direction Selection Property must be set to Unidirectional
   receive, and the Local Endpoint must be configured with a group IP
   address and a port.

Initiate: Calling Initiate on a UDP Multicast Receive connection
   causes an immediate EstablishmentError. This is an unsupported
   operation.

InitiateWithSend: Calling InitiateWithSend on a UDP Multicast
   Receive connection causes an immediate EstablishmentError. This
   is an unsupported operation.

Ready: A UDP Multicast Receive connection is ready once the system
   has received traffic for the appropriate group and port.

EstablishmentError: UDP Multicast Receive connections cause an
   EstablishmentError indicating that joining a multicast group
   failed if Initiate is called.

ConnectionError: The only ConnectionError generated by a UDP
   Multicast Receive connection is in response to an Abort action.

Listen: LISTEN.UDP. Calling Listen for UDP Multicast Receive binds
   a local port, prepares it to receive inbound UDP datagrams from
   peers, and issues a multicast host join. If a Remote Endpoint
   Identifier with an address is supplied, the join is
   Source-Specific Multicast, and the path selection is based on the
   route to the Remote Endpoint. If a Remote Endpoint Identifier is
   not supplied, the join is Any-Source Multicast, and the path
   selection is based on the outbound route to the group supplied in
   the Local Endpoint.

There are cases where it is required to open multiple connections for
the same address(es). For example, one Connection might be opened
for a multicast group used for a shared control bus, and another
application later opens a separate Connection to the same group to
send signals to and/or receive signals from the common bus. In such
cases, the Transport Services System needs to explicitly enable reuse
of the same set of addresses (equivalent to setting SO_REUSEADDR in
the Socket API).

ConnectionReceived: UDP Multicast Receive Listeners will deliver new
   Connections once they have received traffic from a new Remote
   Endpoint.

Clone: Calling Clone on a UDP Multicast Receive connection creates a
   new UDP Multicast Receive connection with equivalent parameters.
   The two associated Connection objects are otherwise independent.

Send: SEND.UDP. Calling Send on a UDP Multicast Receive connection
   causes an immediate SendError. This is an unsupported operation.

Receive: RECEIVE.UDP. UDP Multicast Receive only delivers complete
   Messages to Received, each of which represents a single datagram
   received in a UDP packet. Upon receiving a UDP datagram, the ECN
   flag from the IP header can be obtained (GET_ECN.UDP).

Close: Calling Close on a UDP Multicast Receive connection
   (ABORT.UDP) releases the local port reservation and leaves the
   group. A Closed event is then issued.

Abort: Calling Abort on a UDP Multicast Receive connection
   (ABORT.UDP) is identical to calling Close except that a
   ConnectionError event rather than a Closed event is issued.

CloseGroup: Calling CloseGroup on a UDP Multicast Receive connection
   (ABORT.UDP) is identical to calling Close on its Connection object
   and on all Connections in the same ConnectionGroup.

AbortGroup: Calling AbortGroup on a UDP Multicast Receive connection
   (ABORT.UDP) is identical to calling Close on its Connection object
   and on all Connections in the same ConnectionGroup.
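The Listen behavior described above (choosing the join type, enabling address reuse, and joining the group) might look roughly as follows on top of the Socket API for IPv4. This is a hedged sketch: the function names are hypothetical, the group and source addresses are examples, and only the Any-Source join is implemented, because the socket option and structure layout for source-specific joins can vary between platforms.

```python
import socket
import struct


def join_kind(group, source=None):
    """Pick the join type from the presence of a Remote Endpoint address."""
    if source is not None:
        return ("source-specific", group, source)
    return ("any-source", group, None)


def any_source_membership(group, interface="0.0.0.0"):
    """Pack the IPv4 ip_mreq structure used with IP_ADD_MEMBERSHIP.

    The structure holds the group address followed by the local
    interface address; 0.0.0.0 lets the system choose the interface.
    """
    return struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(interface))


def open_any_source_receiver(group, port):
    """Bind a reusable UDP socket and join the group (Any-Source only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow several receivers to share the same group address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    any_source_membership(group))
    return sock


membership = any_source_membership("233.252.0.1")
asm = join_kind("233.252.0.1")
ssm = join_kind("232.1.1.1", source="192.0.2.5")
```

Calling open_any_source_receiver requires a multicast-capable interface, so it is defined here but not invoked; closing the returned socket releases the port and leaves the group.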
10.6. SCTP 10.6. SCTP
Connectedness: Connected Connectedness: Connected
Data Unit: Message Data Unit: Message
Connection Object: Connection objects can be mapped to an SCTP Connection Object: Connection objects can be mapped to an SCTP
association or a stream in an SCTP association. Mapping association or a stream in an SCTP association. Mapping
Connection objects to SCTP streams is called "stream mapping" and Connection objects to SCTP streams is called "stream mapping" and
has additional requirements as follows. The following explanation has additional requirements as follows. The following explanation
assumes a client-server communication model. assumes a client-server communication model.
Stream mapping requires an association to already be in place between Stream mapping requires an association to already be in place
the client and the server, and it requires the server to understand between the client and the server, and it requires the server to
that a new incoming stream should be represented as a new Connection understand that a new incoming stream should be represented as a
object by the Transport Services system. A new SCTP stream is new Connection object by the Transport Services System. A new
created by sending an SCTP message with a new stream id. Thus, to SCTP stream is created by sending an SCTP message with a new
implement stream mapping, the Transport Services API must provide a stream id. Thus, to implement stream mapping, the Transport
newly created Connection object to the application upon the reception Services API must provide a newly created Connection object to the
of such a message. The necessary semantics to implement a Transport application upon the reception of such a message. The necessary
Services system's Close and Abort primitives are provided by the semantics to implement a Transport Services System's Close and
stream reconfiguration (reset) procedure described in [RFC6525]. Abort primitives are provided by the stream reconfiguration
This also allows to re-use a stream id after resetting ("closing") (reset) procedure described in [RFC6525]. This also allows a
the stream. To implement this functionality, SCTP stream stream id to be reused after resetting ("closing") the stream. To
reconfiguration [RFC6525] must be supported by both the client and implement this functionality, SCTP stream reconfiguration
the server side. [RFC6525] must be supported by both the client and the server
side.
To avoid head-of-line blocking, stream mapping should only be To avoid head-of-line blocking, stream mapping should only be
implemented when both sides support message interleaving [RFC8260]. implemented when both sides support message interleaving
This allows a sender to schedule transmissions between multiple [RFC8260]. This allows a sender to schedule transmissions between
streams without risking that transmission of a large message on one multiple streams without risking that transmission of a large
stream might block transmissions on other streams for a long time. message on one stream will block transmissions on other streams
for a long time.
To avoid conflicts between stream ids, the following procedure is To avoid conflicts between stream ids, the following procedure is
recommended: the first Connection, for which the SCTP association has recommended: the first Connection, for which the SCTP association
been created, must always use stream id zero. All additional has been created, must always use stream id zero. All additional
Connections are assigned to unused stream ids in growing order. To Connections are assigned to unused stream ids in ascending order.
avoid a conflict when both endpoints map new Connections To avoid a conflict when both endpoints map new Connections
simultaneously, the peer which initiated association must use even simultaneously, the peer that initiated association must use even
stream ids whereas the remote side must map its Connections to odd stream ids whereas the remote side must map its Connections to odd
stream ids. Both sides maintain a status map of the assigned stream stream ids. Both sides maintain a status map of the assigned
ids. Generally, new streams should consume the lowest available stream ids. Generally, new streams should consume the lowest
(even or odd, depending on the side) stream id; this rule is relevant available (even or odd, depending on the side) stream id; this
when lower ids become available because Connection objects associated rule is relevant when lower stream ids become available because
with the streams are closed. Connection objects associated with the streams are closed.
SCTP stream mapping as described here has been implemented in a SCTP stream mapping as described here has been implemented in a
research prototype; a desription of this implementation is given in research prototype; a description of this implementation is given
[NEAT-flow-mapping]. in [NEAT-flow-mapping].
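The stream id assignment procedure described above can be sketched roughly as follows. This is only an illustration and is not taken from the NEAT prototype: the class name StreamIdAllocator, its methods, and the max_streams parameter are hypothetical, and the sketch assumes that the number of negotiated streams is known locally.

```python
class StreamIdAllocator:
    """Hypothetical helper tracking stream ids on one SCTP association."""

    def __init__(self, is_initiator, max_streams):
        # The peer that initiated the association uses even stream ids,
        # the remote side uses odd stream ids.
        self.parity = 0 if is_initiator else 1
        # Assumption: number of streams negotiated by the prior CONNECT.SCTP.
        self.max_streams = max_streams
        # Status map of the stream ids currently assigned by this side.
        self.assigned = set()

    def allocate(self):
        """Return the lowest unused stream id of this side's parity.

        Returns None when no stream is left; the caller would then have
        to request more streams (ADD_STREAM.SCTP).
        """
        for stream_id in range(self.parity, self.max_streams, 2):
            if stream_id not in self.assigned:
                self.assigned.add(stream_id)
                return stream_id
        return None

    def release(self, stream_id):
        """Mark a stream id as reusable once its Connection is closed."""
        self.assigned.discard(stream_id)
```

With this sketch, the first Connection on the initiating side receives stream id zero, and an id released by a closed Connection is handed out again before any higher id of the same parity.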
Initiate: If this is the only Connection object that is assigned to Initiate: If this is the only Connection object that is assigned to
the SCTP Association or stream mapping is not used, CONNECT.SCTP the SCTP association or stream mapping is not used, CONNECT.SCTP
is called. Else, unless the Selection Property is called. Else, unless the Selection Property
activeReadBeforeSend is Preferred or Required, a new stream is activeReadBeforeSend is preferred or required, a new stream is
used: if there are enough streams available, Initiate is a local used: if there are enough streams available, Initiate is a local
operation that assigns a new stream id to the Connection object. operation that assigns a new stream id to the Connection object.
The number of streams is negotiated as a parameter of the prior The number of streams is negotiated as a parameter of the prior
CONNECT.SCTP call, and it represents a trade-off between local CONNECT.SCTP call, and it represents a trade-off between local
resource usage and the number of Connection objects that can be resource usage and the number of Connection objects that can be
mapped without requiring a reconfiguration signal. When running mapped without requiring a reconfiguration signal. When running
out of streams, ADD_STREAM.SCTP must be called. out of streams, ADD_STREAM.SCTP must be called.
InitiateWithSend: If this is the only Connection object that is InitiateWithSend: If this is the only Connection object that is
assigned to the SCTP association or stream mapping is not used, assigned to the SCTP association or stream mapping is not used,
CONNECT.SCTP is called with the "user message" parameter. Else, a CONNECT.SCTP is called with the user message parameter. Else, a
new stream is used (see Initiate for how to handle running out of new stream is used (see Initiate for how to handle running out of
streams), and this just sends the first message on a new stream. streams), and this just sends the first message on a new stream.
Ready: Initiate or InitiateWithSend returns without an error, i.e. Ready: Initiate or InitiateWithSend returns without an error, i.e.,
SCTP's four-way handshake has completed. If an association with SCTP's four-way handshake has completed. If an association with
the peer already exists, stream mapping is used and enough streams the peer already exists, stream mapping is used, and enough
are available, a Connection object instantly becomes Ready after streams are available, a Connection object instantly becomes Ready
calling Initiate or InitiateWithSend. after calling Initiate or InitiateWithSend.
EstablishmentError: Failure of CONNECT.SCTP. EstablishmentError: Failure of CONNECT.SCTP.
ConnectionError: TIMEOUT.SCTP or ABORT-EVENT.SCTP. ConnectionError: TIMEOUT.SCTP or ABORT-EVENT.SCTP.
Listen: LISTEN.SCTP. If an association with the peer already exists Listen: LISTEN.SCTP. If an association with the peer already exists
and stream mapping is used, Listen just expects to receive a new and stream mapping is used, Listen just expects to receive a new
message with a new stream id (chosen in accordance with the stream message with a new stream id (chosen in accordance with the stream
id assignment procedure described above). id assignment procedure described above).
ConnectionReceived: LISTEN.SCTP returns without an error (a result ConnectionReceived: LISTEN.SCTP returns without an error (a result
of successful CONNECT.SCTP from the peer), or, in case of stream of successful CONNECT.SCTP from the peer) or, in the case of
mapping, the first message has arrived on a new stream (in this stream mapping, the first message has arrived on a new stream (in
case, Receive is also invoked). this case, Receive is also invoked).
Clone: Calling Clone on an SCTP association creates a new Connection Clone: Calling Clone on an SCTP association creates a new Connection
object and assigns it a new stream id in accordance with the object and assigns it a new stream id in accordance with the
stream id assignment procedure described above. If there are not stream id assignment procedure described above. If there are not
enough streams available, ADD_STREAM.SCTP must be called. enough streams available, ADD_STREAM.SCTP must be called.
Send: SEND.SCTP. Message Properties such as msgLifetime and Send: SEND.SCTP. Message Properties such as msgLifetime and
msgOrdered map to parameters of this primitive. msgOrdered map to parameters of this primitive.
Receive: RECEIVE.SCTP. The "partial flag" of RECEIVE.SCTP invokes a Receive: RECEIVE.SCTP. The "partial flag" of RECEIVE.SCTP invokes a
ReceivedPartial event. ReceivedPartial event.
Close: If this is the only Connection object that is assigned to the Close: If this is the only Connection object that is assigned to the
SCTP association, CLOSE.SCTP is called, and the Closed event will be SCTP association, CLOSE.SCTP is called and the Closed event will
delivered to the application upon the ensuing CLOSE-EVENT.SCTP. be delivered to the application upon the ensuing CLOSE-EVENT.SCTP.
Else, the Connection object is one out of several Connection objects Else, the Connection object is one out of several Connection
that are assigned to the same SCTP assocation, and RESET_STREAM.SCTP objects that are assigned to the same SCTP association, and
must be called, which informs the peer that the stream will no longer RESET_STREAM.SCTP must be called, which informs the peer that the
be used for mapping and can be used by future Initiate, stream will no longer be used for mapping and can be used by a
InitiateWithSend or Listen calls. At the peer, the event future Initiate, InitiateWithSend, or Listen action. At the peer,
RESET_STREAM-EVENT.SCTP will fire, which the peer must answer by the event RESET_STREAM-EVENT.SCTP will be initiated, which the
issuing RESET_STREAM.SCTP too. The resulting local RESET_STREAM- peer must answer by issuing RESET_STREAM.SCTP too. The resulting
EVENT.SCTP informs the Transport Services system that the stream id local RESET_STREAM-EVENT.SCTP informs the Transport Services
can now be re-used by the next Initiate, InitiateWithSend or Listen System that the stream id can now be reused by the next Initiate,
calls, and invokes a Closed event towards the application. InitiateWithSend, or Listen action, and invokes a Closed event
toward the application.
Abort: If this is the only Connection object that is assigned to the Abort: If this is the only Connection object that is assigned to the
SCTP association, ABORT.SCTP is called. Else, the Connection object SCTP association, ABORT.SCTP is called. Else, the Connection
is one out of several Connection objects that are assigned to the object is one out of several Connection objects that are assigned
same SCTP assocation, and shutdown proceeds as described under Close. to the same SCTP association, and shutdown proceeds as described
under Close.
CloseGroup: Calling CloseGroup calls CLOSE.SCTP, closing all CloseGroup: Calling CloseGroup calls CLOSE.SCTP, which closes all
Connections in the SCTP association. Connections in the SCTP association.
AbortGroup: Calling AbortGroup calls ABORT.SCTP, immediately closing AbortGroup: Calling AbortGroup calls ABORT.SCTP, which immediately
all Connections in the SCTP association. closes all Connections in the SCTP association.
In addition to the API mappings described above, when there are In addition to the API mappings described above, when there are
multiple Connection objects assigned to the same SCTP association, multiple Connection objects assigned to the same SCTP association,
SCTP can support Connection properties such as connPriority and SCTP can support Connection Properties such as connPriority and
connScheduler where CONFIGURE_STREAM_SCHEDULER.SCTP can be called to connScheduler where CONFIGURE_STREAM_SCHEDULER.SCTP can be called to
adjust the priorities of streams in the SCTP association. adjust the priorities of streams in the SCTP association.
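As a rough illustration of the termination mappings listed above, the choice of SCTP primitive can be written as a small decision function. The function name map_termination and its parameters are hypothetical; the returned strings are the primitive names used in the text, not calls into a real SCTP API.

```python
def map_termination(action, connections_on_association):
    """Return the SCTP primitive a termination action maps to.

    action: one of "Close", "Abort", "CloseGroup", "AbortGroup".
    connections_on_association: number of Connection objects currently
    assigned to the same SCTP association (hypothetical bookkeeping value).
    """
    # Group actions always act on the whole association.
    if action == "CloseGroup":
        return "CLOSE.SCTP"
    if action == "AbortGroup":
        return "ABORT.SCTP"

    only_connection = connections_on_association == 1
    if action == "Close":
        # With stream mapping, only the stream is reset; the association stays.
        return "CLOSE.SCTP" if only_connection else "RESET_STREAM.SCTP"
    if action == "Abort":
        # With several Connections, Abort proceeds as described under Close.
        return "ABORT.SCTP" if only_connection else "RESET_STREAM.SCTP"
    raise ValueError("unknown action: " + action)
```

For example, map_termination("Abort", 3) yields "RESET_STREAM.SCTP", reflecting that aborting one of several mapped Connections does not tear down the shared association.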
11. IANA Considerations 11. IANA Considerations
This document has no actions for IANA. This document has no IANA actions.
12. Security Considerations 12. Security Considerations
[I-D.ietf-taps-arch] outlines general security consideration and [RFC9621] outlines general security considerations and requirements
requirements for any system that implements the Transport Services for any system that implements the Transport Services Architecture.
archtecture. [I-D.ietf-taps-interface] provides further discussion [RFC9622] provides further discussion on security and privacy
on security and privacy implications of the Transport Services API. implications of the Transport Services API. This document provides
This document provides additional guidance on implementation additional guidance on implementation specifics for the Transport
specifics for the Transport Services API and as such the security Services API; as such, the security considerations in both of these
considerations in both of these documents apply. The next two documents apply. The next two subsections discuss further
subsections discuss further considerations that are specific to considerations that are specific to mechanisms specified in this
mechanisms specified in this document. document.
12.1. Considerations for Candidate Gathering 12.1. Considerations for Candidate Gathering
The Security Considerations of the Transport Services Architecture As discussed in Sections 3 and 6 of [RFC9621], gathering and racing
[I-D.ietf-taps-arch] forbids gathering and racing with Protocol with Protocol Stacks that do not have equivalent security properties
Stacks that do not have equivalent security properties. Therefore, ought not be attempted. Therefore, implementations need to avoid
implementations need to avoid downgrade attacks that allow network downgrade attacks that allow network interference to cause the
interference to cause the implementation to select less secure, or implementation to select less secure, or entirely insecure,
entirely insecure, combinations of paths and protocols. combinations of paths and protocols.
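One possible way to apply this during gathering is to filter candidates before any racing starts. The sketch below is an assumption-laden illustration: the Candidate type, the security labels, and the choice to model "equivalent security properties" as equality of label sets are all hypothetical and are not defined by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str            # illustrative label for a path/protocol combination
    security: frozenset  # hypothetical set of security property labels


def racing_set(candidates, reference_security):
    """Keep only candidates whose security properties match the reference.

    Assumption: equivalence is modeled as equality of the label sets.
    Candidates that are weaker or merely different are never raced, so
    blocking the preferred candidate cannot force a less secure fallback.
    """
    reference = frozenset(reference_security)
    return [c for c in candidates if c.security == reference]
```

A caller would pass the security properties implied by the application's configuration as reference_security and race only the returned candidates.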
12.2. Considerations for Candidate Racing 12.2. Considerations for Candidate Racing
See Section 5.3 for security considerations around racing with 0-RTT See Section 5.3 for security considerations around racing with 0-RTT
data. data.
An attacker that knows a particular device is racing several options An attacker that knows a particular device is racing several options
during connection establishment may be able to block packets for the during Connection establishment may be able to block packets for the
first connection attempt, thus inducing the device to fall back to a first connection attempt, thus inducing the device to fall back to a
secondary attempt. This is a problem if the secondary attempts have secondary attempt. This is a problem if the secondary attempts have
worse security properties that enable further attacks. worse security properties that enable further attacks.
Implementations should ensure that all options have equivalent Implementations should ensure that all options have equivalent
security properties to avoid incentivizing attacks. security properties to avoid incentivizing attacks.
Since results from the network can determine how a connection attempt Since results from the network can determine how a connection attempt
tree is built, such as when DNS returns a list of resolved endpoints, tree is built, such as when DNS returns a list of resolved endpoints,
it is possible for the network to cause an implementation to consume it is possible for the network to cause an implementation to consume
significant on-device resources. Implementations should limit the significant on-device resources. Implementations should limit the
maximum amount of state allowed for any given node, including the maximum amount of state allowed for any given node, including the
number of child nodes, especially when the state is based on results number of child nodes, especially when the state is based on results
from the network. from the network.
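A minimal sketch of such a limit is shown below. The CandidateNode class and the MAX_CHILDREN value are hypothetical; an actual implementation would choose its own limits and might also bound tree depth or total state.

```python
MAX_CHILDREN = 8  # hypothetical per-node limit, not a recommended value


class CandidateNode:
    """Hypothetical node in a connection attempt tree."""

    def __init__(self, label):
        self.label = label
        self.children = []

    def add_children(self, labels, limit=MAX_CHILDREN):
        """Attach child nodes for labels, never exceeding limit children.

        Returns the number of entries that were dropped, so the caller
        can log or otherwise account for truncated network results.
        """
        room = max(0, limit - len(self.children))
        accepted = labels[:room]
        self.children.extend(CandidateNode(label) for label in accepted)
        return len(labels) - len(accepted)
```

For instance, if a DNS response resolves to 20 addresses and the limit is 5, only the first 5 become child nodes and the remaining 15 are reported as dropped.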
13. Acknowledgements 13. References
This work has received funding from the European Union's Horizon 2020
research and innovation programme under grant agreement No. 644334
(NEAT) and No. 815178 (5GENESIS).
This work has been supported by Leibniz Prize project funds of DFG -
German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ
FE 570/4-1).
This work has been supported by the UK Engineering and Physical
Sciences Research Council under grant EP/R04144X/1.
This work has been supported by the Research Council of Norway under
its "Toppforsk" programme through the "OCARINA" project.
Thanks to Colin Perkins, Tom Jones, Karl-Johan Grinnemo, Gorry
Fairhurst, for their contributions to the design of this
specification. Thanks also to Stuart Cheshire, Josh Graessley, David
Schinazi, and Eric Kinnear for their implementation and design
efforts, including Happy Eyeballs, that heavily influenced this work.
14. References
14.1. Normative References
[I-D.ietf-taps-arch]
Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., and
C. Perkins, "Architecture and Requirements for Transport
Services", Work in Progress, Internet-Draft, draft-ietf-
taps-arch-19, 9 November 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-taps-
arch-19>.
[I-D.ietf-taps-interface] 13.1. Normative References
Trammell, B., Welzl, M., Enghardt, R., Fairhurst, G.,
Kühlewind, M., Perkins, C., Tiesel, P. S., and T. Pauly,
"An Abstract Application Layer Interface to Transport
Services", Work in Progress, Internet-Draft, draft-ietf-
taps-interface-23, 14 November 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-taps-
interface-23>.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
<https://www.rfc-editor.org/rfc/rfc7413>. <https://www.rfc-editor.org/info/rfc7413>.
[RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
DOI 10.17487/RFC7540, May 2015,
<https://www.rfc-editor.org/rfc/rfc7540>.
[RFC8303] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of [RFC8303] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
Transport Features Provided by IETF Transport Protocols", Transport Features Provided by IETF Transport Protocols",
RFC 8303, DOI 10.17487/RFC8303, February 2018, RFC 8303, DOI 10.17487/RFC8303, February 2018,
<https://www.rfc-editor.org/rfc/rfc8303>. <https://www.rfc-editor.org/info/rfc8303>.
[RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the
User Datagram Protocol (UDP) and Lightweight UDP (UDP- User Datagram Protocol (UDP) and Lightweight UDP (UDP-
Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018,
<https://www.rfc-editor.org/rfc/rfc8304>. <https://www.rfc-editor.org/info/rfc8304>.
[RFC8305] Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2: [RFC8305] Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2:
Better Connectivity Using Concurrency", RFC 8305, Better Connectivity Using Concurrency", RFC 8305,
DOI 10.17487/RFC8305, December 2017, DOI 10.17487/RFC8305, December 2017,
<https://www.rfc-editor.org/rfc/rfc8305>. <https://www.rfc-editor.org/info/rfc8305>.
[RFC8421] Martinsen, P., Reddy, T., and P. Patil, "Guidelines for [RFC8421] Martinsen, P., Reddy, T., and P. Patil, "Guidelines for
Multihomed and IPv4/IPv6 Dual-Stack Interactive Multihomed and IPv4/IPv6 Dual-Stack Interactive
Connectivity Establishment (ICE)", BCP 217, RFC 8421, Connectivity Establishment (ICE)", BCP 217, RFC 8421,
DOI 10.17487/RFC8421, July 2018, DOI 10.17487/RFC8421, July 2018,
<https://www.rfc-editor.org/rfc/rfc8421>. <https://www.rfc-editor.org/info/rfc8421>.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
<https://www.rfc-editor.org/rfc/rfc8446>. <https://www.rfc-editor.org/info/rfc8446>.
[RFC8923] Welzl, M. and S. Gjessing, "A Minimal Set of Transport [RFC8923] Welzl, M. and S. Gjessing, "A Minimal Set of Transport
Services for End Systems", RFC 8923, DOI 10.17487/RFC8923, Services for End Systems", RFC 8923, DOI 10.17487/RFC8923,
October 2020, <https://www.rfc-editor.org/rfc/rfc8923>. October 2020, <https://www.rfc-editor.org/info/rfc8923>.
14.2. Informative References [RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113,
DOI 10.17487/RFC9113, June 2022,
<https://www.rfc-editor.org/info/rfc9113>.
[I-D.ietf-dnsop-svcb-https] [RFC9621] Pauly, T., Ed., Trammell, B., Ed., Brunstrom, A.,
Schwartz, B. M., Bishop, M., and E. Nygren, "Service Fairhurst, G., and C. S. Perkins, "Architecture and
Binding and Parameter Specification via the DNS (SVCB and Requirements for Transport Services", RFC 9621,
HTTPS Resource Records)", Work in Progress, Internet- DOI 10.17487/RFC9621, January 2025,
Draft, draft-ietf-dnsop-svcb-https-12, 11 March 2023, <https://www.rfc-editor.org/info/rfc9621>.
<https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-
svcb-https-12>. [RFC9622] Trammell, B., Ed., Welzl, M., Ed., Enghardt, R.,
Fairhurst, G., Kühlewind, M., Perkins, C. S., Tiesel, P.
S., and T. Pauly, "An Abstract Application Programming
Interface (API) for Transport Services", RFC 9622,
DOI 10.17487/RFC9622, January 2025,
<https://www.rfc-editor.org/info/rfc9622>.
13.2. Informative References
[NEAT-flow-mapping] [NEAT-flow-mapping]
"Transparent Flow Mapping for NEAT", IFIP NETWORKING 2017 Weinrank, F. and M. Tuxen, "Transparent flow mapping for
Workshop on Future of Internet Transport (FIT 2017) , NEAT", 2017 IFIP Networking Conference (IFIP Networking)
2017. and Workshops, DOI 10.23919/IFIPNetworking.2017.8264876,
June 2017, <https://ieeexplore.ieee.org/document/8264876>.
[RFC1928] Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D., and [RFC1928] Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D., and
L. Jones, "SOCKS Protocol Version 5", RFC 1928, L. Jones, "SOCKS Protocol Version 5", RFC 1928,
DOI 10.17487/RFC1928, March 1996, DOI 10.17487/RFC1928, March 1996,
<https://www.rfc-editor.org/rfc/rfc1928>. <https://www.rfc-editor.org/info/rfc1928>.
[RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
specifying the location of services (DNS SRV)", RFC 2782, specifying the location of services (DNS SRV)", RFC 2782,
DOI 10.17487/RFC2782, February 2000, DOI 10.17487/RFC2782, February 2000,
<https://www.rfc-editor.org/rfc/rfc2782>. <https://www.rfc-editor.org/info/rfc2782>.
[RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager",
RFC 3124, DOI 10.17487/RFC3124, June 2001, RFC 3124, DOI 10.17487/RFC3124, June 2001,
<https://www.rfc-editor.org/rfc/rfc3124>. <https://www.rfc-editor.org/info/rfc3124>.
[RFC3207] Hoffman, P., "SMTP Service Extension for Secure SMTP over [RFC3207] Hoffman, P., "SMTP Service Extension for Secure SMTP over
Transport Layer Security", RFC 3207, DOI 10.17487/RFC3207, Transport Layer Security", RFC 3207, DOI 10.17487/RFC3207,
February 2002, <https://www.rfc-editor.org/rfc/rfc3207>. February 2002, <https://www.rfc-editor.org/info/rfc3207>.
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
"Session Traversal Utilities for NAT (STUN)", RFC 5389,
DOI 10.17487/RFC5389, October 2008,
<https://www.rfc-editor.org/rfc/rfc5389>.
[RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
Relays around NAT (TURN): Relay Extensions to Session
Traversal Utilities for NAT (STUN)", RFC 5766,
DOI 10.17487/RFC5766, April 2010,
<https://www.rfc-editor.org/rfc/rfc5766>.
[RFC6525] Stewart, R., Tuexen, M., and P. Lei, "Stream Control [RFC6525] Stewart, R., Tuexen, M., and P. Lei, "Stream Control
Transmission Protocol (SCTP) Stream Reconfiguration", Transmission Protocol (SCTP) Stream Reconfiguration",
RFC 6525, DOI 10.17487/RFC6525, February 2012, RFC 6525, DOI 10.17487/RFC6525, February 2012,
<https://www.rfc-editor.org/rfc/rfc6525>. <https://www.rfc-editor.org/info/rfc6525>.
[RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762, [RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
DOI 10.17487/RFC6762, February 2013, DOI 10.17487/RFC6762, February 2013,
<https://www.rfc-editor.org/rfc/rfc6762>. <https://www.rfc-editor.org/info/rfc6762>.
[RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service [RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service
Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013, Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013,
<https://www.rfc-editor.org/rfc/rfc6763>. <https://www.rfc-editor.org/info/rfc6763>.
[RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Protocol (HTTP/1.1): Message Syntax and Routing",
RFC 7230, DOI 10.17487/RFC7230, June 2014,
<https://www.rfc-editor.org/rfc/rfc7230>.
[RFC7657] Black, D., Ed. and P. Jones, "Differentiated Services [RFC7657] Black, D., Ed. and P. Jones, "Differentiated Services
(Diffserv) and Real-Time Communication", RFC 7657, (Diffserv) and Real-Time Communication", RFC 7657,
DOI 10.17487/RFC7657, November 2015, DOI 10.17487/RFC7657, November 2015,
<https://www.rfc-editor.org/rfc/rfc7657>. <https://www.rfc-editor.org/info/rfc7657>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
March 2017, <https://www.rfc-editor.org/rfc/rfc8085>. March 2017, <https://www.rfc-editor.org/info/rfc8085>.
[RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann,
"Stream Schedulers and User Message Interleaving for the "Stream Schedulers and User Message Interleaving for the
Stream Control Transmission Protocol", RFC 8260, Stream Control Transmission Protocol", RFC 8260,
DOI 10.17487/RFC8260, November 2017, DOI 10.17487/RFC8260, November 2017,
<https://www.rfc-editor.org/rfc/rfc8260>. <https://www.rfc-editor.org/info/rfc8260>.
[RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive [RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive
Connectivity Establishment (ICE): A Protocol for Network Connectivity Establishment (ICE): A Protocol for Network
Address Translator (NAT) Traversal", RFC 8445, Address Translator (NAT) Traversal", RFC 8445,
DOI 10.17487/RFC8445, July 2018, DOI 10.17487/RFC8445, July 2018,
<https://www.rfc-editor.org/rfc/rfc8445>. <https://www.rfc-editor.org/info/rfc8445>.
[RFC8489] Petit-Huguenin, M., Salgueiro, G., Rosenberg, J., Wing,
D., Mahy, R., and P. Matthews, "Session Traversal
Utilities for NAT (STUN)", RFC 8489, DOI 10.17487/RFC8489,
February 2020, <https://www.rfc-editor.org/info/rfc8489>.
[RFC8656] Reddy, T., Ed., Johnston, A., Ed., Matthews, P., and J.
Rosenberg, "Traversal Using Relays around NAT (TURN):
Relay Extensions to Session Traversal Utilities for NAT
(STUN)", RFC 8656, DOI 10.17487/RFC8656, February 2020,
<https://www.rfc-editor.org/info/rfc8656>.
[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", RFC 9000, Multiplexed and Secure Transport", RFC 9000,
DOI 10.17487/RFC9000, May 2021, DOI 10.17487/RFC9000, May 2021,
<https://www.rfc-editor.org/rfc/rfc9000>. <https://www.rfc-editor.org/info/rfc9000>.
[RFC9040] Touch, J., Welzl, M., and S. Islam, "TCP Control Block [RFC9040] Touch, J., Welzl, M., and S. Islam, "TCP Control Block
Interdependence", RFC 9040, DOI 10.17487/RFC9040, July Interdependence", RFC 9040, DOI 10.17487/RFC9040, July
2021, <https://www.rfc-editor.org/rfc/rfc9040>. 2021, <https://www.rfc-editor.org/info/rfc9040>.
[RFC9110] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "HTTP Semantics", STD 97, RFC 9110,
DOI 10.17487/RFC9110, June 2022,
<https://www.rfc-editor.org/info/rfc9110>.
[RFC9460] Schwartz, B., Bishop, M., and E. Nygren, "Service Binding
and Parameter Specification via the DNS (SVCB and HTTPS
Resource Records)", RFC 9460, DOI 10.17487/RFC9460,
November 2023, <https://www.rfc-editor.org/info/rfc9460>.
[TCP-COUPLING] [TCP-COUPLING]
"ctrlTCP: Reducing Latency through Coupled, Heterogeneous Islam, S., Welzl, M., Hiorth, K., Hayes, D., Armitage, G.,
Multi-Flow TCP Congestion Control", IEEE INFOCOM Global and S. Gjessing, "ctrlTCP: Reducing latency through
Internet Symposium (GI) workshop (GI 2018) , n.d.. coupled, heterogeneous multi-flow TCP congestion control",
IEEE INFOCOM 2018 - IEEE Conference on Computer
Communications Workshops (INFOCOM WKSHPS),
DOI 10.1109/INFCOMW.2018.8406887, 2018,
<https://ieeexplore.ieee.org/document/8406887>.
Appendix A. API Mapping Template Appendix A. API Mapping Template
Any protocol mapping for the Transport Services API should follow a Any protocol mapping for the Transport Services API should follow a
common template. common template.
Connectedness: (Connectionless/Connected/Multiplexing Connected) Connectedness: (Connectionless/Connected/Multiplexing Connected)
Data Unit: (Byte-stream/Datagram/Message) Data Unit: (Byte-stream/Datagram/Message)
skipping to change at page 52, line 15 skipping to change at line 2393
Receive: Receive:
Close: Close:
Abort: Abort:
CloseGroup: CloseGroup:
AbortGroup: AbortGroup:
Appendix B. Reasons for errors Appendix B. Reasons for Errors
The Transport Services API [I-D.ietf-taps-interface] allows for the The Transport Services API [RFC9622] allows for several generic error
several generic error types to specify a more detailed reason about types to specify a more detailed reason about why an error occurred.
why an error occurred. This appendix lists some of the possible This appendix lists some of the possible reasons.
reasons.
* InvalidConfiguration: The transport properties and Endpoint InvalidConfiguration: The Properties and Endpoint Identifiers
Identifers provided by the application are either contradictory or provided by the application are either contradictory or
incomplete. Examples include the lack of a Remote Endpoint incomplete. Examples include the lack of a Remote Endpoint
Identifer on an active open or using a multicast group address Identifier on an active open or using a multicast group address
while not requesting a unidirectional receive. while not requesting a Unidirectional receive.
* NoCandidates: The configuration is valid, but none of the NoCandidates: The configuration is valid, but none of the available
available transport protocols can satisfy the transport properties transport protocols can satisfy the Properties provided by the
provided by the application. application.
* ResolutionFailed: The remote or local specifier provided by the ResolutionFailed: The remote or local specifier provided by the
application can not be resolved. application cannot be resolved.
* EstablishmentFailed: The Transport Services system was unable to EstablishmentFailed: The Transport Services System was unable to
establish a transport-layer connection to the Remote Endpoint establish a transport-layer connection to the Remote Endpoint
specified by the application. specified by the application.
* PolicyProhibited: The system policy prevents the Transport PolicyProhibited: The System Policy prevents the Transport Services
Services system from performing the action requested by the System from performing the action requested by the application.
application.
* NotCloneable: The Protocol Stack is not capable of being cloned. NotCloneable: The Protocol Stack is not capable of being cloned.
* MessageTooLarge: The Message is too big for the Transport Services MessageTooLarge: The Message is too big for the Transport Services
system to handle. System to handle.
* ProtocolFailed: The underlying Protocol Stack failed. ProtocolFailed: The underlying Protocol Stack failed.
* InvalidMessageProperties: The Message Properties either contradict InvalidMessageProperties: The Message Properties either contradict
the Transport Properties or they can not be satisfied by the the Transport Properties or cannot be satisfied by the Transport
Transport Services system. Services System.
* DeframingFailed: The data that was received by the underlying DeframingFailed: The data that was received by the underlying
Protocol Stack could not be processed by the Message Framer. Protocol Stack could not be processed by the Message Framer.
* ConnectionAborted: The connection was aborted by the peer. ConnectionAborted: The connection was aborted by the peer.
* Timeout: Delivery of a Message was not possible after a timeout. Timeout: Delivery of a Message was not possible after a timeout.
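An implementation might represent these reasons as an enumeration attached to the generic error events. The sketch below is illustrative only: the ErrorReason and TransportServicesError names are hypothetical, and the pairing of a particular event with a particular reason in the usage note is an assumption, not something this appendix specifies.

```python
from enum import Enum


class ErrorReason(Enum):
    """Reasons listed in this appendix, as a hypothetical enumeration."""
    INVALID_CONFIGURATION = "InvalidConfiguration"
    NO_CANDIDATES = "NoCandidates"
    RESOLUTION_FAILED = "ResolutionFailed"
    ESTABLISHMENT_FAILED = "EstablishmentFailed"
    POLICY_PROHIBITED = "PolicyProhibited"
    NOT_CLONEABLE = "NotCloneable"
    MESSAGE_TOO_LARGE = "MessageTooLarge"
    PROTOCOL_FAILED = "ProtocolFailed"
    INVALID_MESSAGE_PROPERTIES = "InvalidMessageProperties"
    DEFRAMING_FAILED = "DeframingFailed"
    CONNECTION_ABORTED = "ConnectionAborted"
    TIMEOUT = "Timeout"


class TransportServicesError(Exception):
    """Hypothetical carrier for a generic error event plus a detailed reason."""

    def __init__(self, event, reason, detail=""):
        super().__init__("%s: %s %s" % (event, reason.value, detail))
        self.event = event      # name of the generic error event
        self.reason = reason    # ErrorReason member
        self.detail = detail    # optional free-form text
```

An application handler could then branch on err.reason, for example treating ErrorReason.RESOLUTION_FAILED differently from ErrorReason.POLICY_PROHIBITED when deciding whether a retry makes sense.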
Appendix C. Existing Implementations Appendix C. Existing Implementations
This appendix gives an overview of existing implementations, at the This appendix gives an overview of existing implementations, at the
time of writing, of Transport Services systems that are (to some time of writing, of Transport Services Systems that are (to some
degree) in line with this document. degree) in line with this document.
* Apple's Network.framework: * Apple's Network.framework:
- Network.framework is a transport-level API built for C, - Network.framework is a transport-level API built for C,
Objective-C, and Swift. It a connect-by-name API that supports Objective-C, and Swift. It is a connect-by-name API that
transport security protocols. It provides userspace supports transport security protocols. It provides user-space
implementations of TCP, UDP, TLS, DTLS, proxy protocols, and implementations of TCP, UDP, TLS, DTLS, and proxy protocols,
allows extension via custom framers. and it allows extension via custom Framers.
- Documentation: https://developer.apple.com/documentation/ - Documentation: https://developer.apple.com/documentation/
network network
* NEAT and NEATPy: * NEAT and NEATPy:
- NEAT is the output of the European H2020 research project - NEAT is the output of the European H2020 research project
"NEAT"; it is a user-space library for protocol-independent "NEAT"; it is a user-space library for protocol-independent
communication on top of TCP, UDP and SCTP, with many more communication on top of TCP, UDP, and SCTP, with many more
features, such as a policy manager. features, such as a policy manager.
- Code: https://github.com/NEAT-project/neat (https://github.com/ - Code: https://github.com/NEAT-project/neat
NEAT-project/neat)
- Code at the Software Heritage Archive: - Code at the Software Heritage Archive:
https://archive.softwareheritage.org/swh:1:dir:737820840f83c4ec https://archive.softwareheritage.org/swh:1:dir:737820840f83c4ec
9493a8c0cc89b3159e2e1a57;origin=https://github.com/NEAT- 9493a8c0cc89b3159e2e1a57;origin=https://github.com/NEAT-
project/neat;visit=swh:1:snp:bbb611b04e355439d47e426e8ad5d07cdb project/neat;visit=swh:1:snp:bbb611b04e355439d47e426e8ad5d07cdb
f647e0;anchor=swh:1:rev:652ee991043ce3560a6e5715fa2a5c211139d15 f647e0;anchor=swh:1:rev:652ee991043ce3560a6e5715fa2a5c211139d15
c (https://archive.softwareheritage.org/swh:1:dir:737820840f83c c
4ec9493a8c0cc89b3159e2e1a57;origin=https://github.com/NEAT-
project/neat;visit=swh:1:snp:bbb611b04e355439d47e426e8ad5d07cdb
f647e0;anchor=swh:1:rev:652ee991043ce3560a6e5715fa2a5c211139d15
c)
- NEAT project: https://www.neat-project.org (https://www.neat-
project.org)
- NEATPy is a Python shim over NEAT which updates the NEAT API to - NEATPy is a Python shim over NEAT that updates the NEAT API to
be in line with version 6 of the Transport Services API draft. be in line with version 6 of the Transport Services API
[RFC9622].
- Code: https://github.com/theagilepadawan/NEATPy - Code: https://github.com/theagilepadawan/NEATPy
(https://github.com/theagilepadawan/NEATPy)
- Code at the Software Heritage Archive: - Code at the Software Heritage Archive:
https://archive.softwareheritage.org/swh:1:dir:295ccd148cf918cc https://archive.softwareheritage.org/swh:1:dir:295ccd148cf918cc
b9ed7ad14b5ae968a8d2c370;origin=https://github.com/ b9ed7ad14b5ae968a8d2c370;origin=https://github.com/
theagilepadawan/NEATPy;visit=swh:1:snp:6e1a3a9dd4c532ba6c0f52c8 theagilepadawan/NEATPy;visit=swh:1:snp:6e1a3a9dd4c532ba6c0f52c8
f734c1256a06cedc;anchor=swh:1:rev:cd0788d7f7f34a0e9b8654516da7c f734c1256a06cedc;anchor=swh:1:rev:cd0788d7f7f34a0e9b8654516da7c
002c44d2e95 (https://archive.softwareheritage.org/swh:1:dir:295 002c44d2e95
ccd148cf918ccb9ed7ad14b5ae968a8d2c370;origin=https://github.com
/theagilepadawan/NEATPy;visit=swh:1:snp:6e1a3a9dd4c532ba6c0f52c
8f734c1256a06cedc;anchor=swh:1:rev:cd0788d7f7f34a0e9b8654516da7
c002c44d2e95)
* PyTAPS: * PyTAPS:
- A TAPS implementation based on Python asyncio, offering - A Transport Services (TAPS) implementation based on Python
protocol-independent communication to applications on top of asyncio, offering protocol-independent communication to
TCP, UDP and TLS, with support for multicast. applications on top of TCP, UDP, and TLS, with support for
multicast.
- Code: https://github.com/fg-inet/python-asyncio-taps - Code: https://github.com/fg-inet/python-asyncio-taps
(https://github.com/fg-inet/python-asyncio-taps)
- Code at the Software Heritage Archive: - Code at the Software Heritage Archive:
https://archive.softwareheritage.org/swh:1:dir:a7151096d91352b4 https://archive.softwareheritage.org/swh:1:dir:a7151096d91352b4
39b092ef116d04f38e52e556;origin=https://github.com/fg-inet/ 39b092ef116d04f38e52e556;origin=https://github.com/fg-inet/
python-asyncio-taps;visit=swh:1:snp:4841e59b53b28bb385726e7d3a5 python-asyncio-taps;visit=swh:1:snp:4841e59b53b28bb385726e7d3a5
69bee0fea7fc4;anchor=swh:1:rev:63571fd7545da25142bc1a6371b8f130 69bee0fea7fc4;anchor=swh:1:rev:63571fd7545da25142bc1a6371b8f130
97cba38e (https://archive.softwareheritage.org/swh:1:dir:a71510 97cba38e
96d91352b439b092ef116d04f38e52e556;origin=https://github.com/
fg-inet/python-asyncio-taps;visit=swh:1:snp:4841e59b53b28bb3857
26e7d3a569bee0fea7fc4;anchor=swh:1:rev:63571fd7545da25142bc1a63
71b8f13097cba38e)
Acknowledgements
This work has received funding from the European Union's Horizon 2020
research and innovation programme under grant agreement No. 644334
(NEAT) and No. 815178 (5GENESIS).
This work has been supported by:
* Leibniz Prize project funds from the DFG - German Research
Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ FE 570/4-1).
* the UK Engineering and Physical Sciences Research Council under
grant EP/R04144X/1.
* the Research Council of Norway under its "Toppforsk" programme
through the "OCARINA" project.
Thanks to Colin S. Perkins, Tom Jones, Karl-Johan Grinnemo, and Gorry
Fairhurst for their contributions to the design of this
specification. Thanks also to Stuart Cheshire, Josh Graessley, David
Schinazi, and Eric Kinnear for their implementation and design
efforts, including Happy Eyeballs, that heavily influenced this work.
Authors' Addresses Authors' Addresses
Anna Brunstrom (editor) Anna Brunstrom (editor)
Karlstad University Karlstad University
Universitetsgatan 2 Universitetsgatan 2
651 88 Karlstad 651 88 Karlstad
Sweden Sweden
Email: anna.brunstrom@kau.se Email: anna.brunstrom@kau.se
Tommy Pauly (editor) Tommy Pauly (editor)
Apple Inc. Apple Inc.
One Apple Park Way One Apple Park Way
Cupertino, California 95014, Cupertino, CA 95014
United States of America United States of America
Email: tpauly@apple.com Email: tpauly@apple.com
Reese Enghardt Reese Enghardt
Netflix Netflix
121 Albright Way 121 Albright Way
Los Gatos, CA 95032, Los Gatos, CA 95032
United States of America United States of America
Email: ietf@tenghardt.net Email: ietf@tenghardt.net
Philipp S. Tiesel Philipp S. Tiesel
SAP SE SAP SE
George-Stephenson-Straße 7-13 George-Stephenson-Str. 7-13
10557 Berlin 10557 Berlin
Germany Germany
Email: philipp@tiesel.net Email: philipp@tiesel.net
Michael Welzl Michael Welzl
University of Oslo University of Oslo
PO Box 1080 Blindern PO Box 1080 Blindern
0316 Oslo 0316 Oslo
Norway Norway
Email: michawe@ifi.uio.no Email: michawe@ifi.uio.no
 End of changes. 381 change blocks. 
1050 lines changed or deleted 1041 lines changed or added
