Internet Engineering Task Force Jun-ichiro itojun Hagino INTERNET-DRAFT IIJ Research Laboratory Expires: April 16, 2001 October 16, 2000 Socket API for IPv6 traffic class field draft-itojun-ipv6-tclass-api-01.txt Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.'' The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Distribution of this memo is unlimited. The internet-draft will expire in 6 months. The date of expiration will be April 16, 2001. Abstract The draft outlines a socket API proposal for controlling the traffic class field in the IPv6 header. The API uses ancillary data stream to manipulate the traffic class field, following practice in the IPv6 advanced API. The draft is, at this moment, written separately from the IPv6 basic/advanced API RFCs [Gilligan, 2000; Stevens, 1999] , as there can be many discussion items. The ultimate goal of the draft is to be a part of the IPv6 basic/advanced API. 1. Background The IPv6 traffic class field is a 8bit field in the IPv6 header. The field serves just like the IPv4 type of service (TOS) field. There are two types of proposed use of the field: (1) topmost 6 bits for the differentiated services (diffserv) field [Nichols, 1998] , and (2) lowermost 2 bits for explicit congestion notification (ECN) HAGINO Expires: April 16, 2001 [Page 1] DRAFT API for IPv6 traffic class field October 2000 [Ramakrishnan, 1999] . Those two proposals plan to rewrite the field at intermediate routers. There is a certain set of applications which need to manipulate and inspect the traffic class field. Here are some examples. o ECN implementations outside of the kernel (like UDP ECN). o A diffserv-aware application, which tries to mark low-priority traffic (such as non-important packets in a video traffic) on its own. In this case, the application does not need to inspect the field on outbound traffic. o Debugging tools for differentiated services. 2. Inbound traffic When an application is interested in inspecting the traffic class field on packets, the application should set the IPV6_RECVTCLASS socket option to 1: /* enable */ const int on = 1; setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &on, sizeof(on)); Subsequent incoming traffic will be accompanied with an ancillary data item that carries an unsigned octet value. The ancillary data item will be tagged with the level IPPROTO_IPV6 and type IPV6_TCLASS. An application can obatain the value of the traffic class field by the following operation, after the recvmsg(2) system call: struct cmsghdr *cm; u_int8_t tclass; if (cm->cmsg_len == CMSG_LEN(sizeof(u_int8_t)) && cm->cmsg_level == IPPROTO_IPV6 && cm->cmsg_type == IPV6_TCLASS) tclass = *(u_int8_t *)CMSG_DATA(cm); else tclass = 0x00; /* could not obtain traffic class value */ By setting the socket option to 0, the behavior is disabled: /* disable */ const int off = 0; setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &off, sizeof(off)); For TCP sockets, an ancillary data item will be present only when the traffic class value is changed. See section 4.1 (TCP Implications) of [Stevens, 1999] for details. HAGINO Expires: April 16, 2001 [Page 2] DRAFT API for IPv6 traffic class field October 2000 3. Outbound traffic To control the value of the traffic class field for a single packet transmission, you can use an ancillary data item, just like presented above, with a sendmsg(2) system call. The level of the ancillary data item must be IPPROTO_IPV6, and the type must be IPV6_TCLASS. int s; /* socket */ u_int8_t tclass; struct sockaddr_in6 *dst; struct msghdr m; struct cmsghdr *cm; struct iovec iov[2]; u_char cmsgbuf[256]; /* must be > CMSG_SPACE(sizeof(tclass)) */ /* set the data buffer to send */ memset(m, 0, sizeof(m)); memset(iov, 0, sizeof(iov)); m.msg_name = (caddr_t)dst; m.msg_namelen = sizeof(dst); iov[0].iov_base = buf; iov[0].iov_len = len; m.msg_iov = iov; m.msg_iovlen = 1; /* set ancillary data for the traffic class field */ memset(cmsgbuf, 0, sizeof(cmsgbuf)); cm = (struct cmsghdr *)cmsgbuf; m.msg_control = cm; m.msg_controllen = CMSG_SPACE(sizeof(tclass)); cm->cmsg_len = CMSG_LEN(sizeof(tclass)); cm->cmsg_level = IPPROTO_IPV6; cm->cmsg_type = IPV6_TCLASS; memcpy(CMSG_DATA(cm), &tclass, sizeof(tclass)); sendmsg(s, &m, 0); If you want to put specific value to the traffic class field on multiple packets, you can use a "sticky" option: u_int8_t tclass; setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, sizeof(tclass)); 4. Conflict resolution There are two entities which may modify the traffic class field, in the kernel of the originating node: a kernel IPv6 code with diffserv marking enabled, and an ECN-capable TCP stack. Those entities may modify the traffic class field, even if an application tries to manipulate the value. It may present a difficult constraint to the API. For outbound traffic, even if an application specifies the value to be put into the HAGINO Expires: April 16, 2001 [Page 3] DRAFT API for IPv6 traffic class field October 2000 traffic class field, in-kernel mechanism(s) may need to modify the field. The specified value may not be reflected into the packet on the wire (example: outbound processing in an ECN-capable TCP stack). For inbound traffic, even if the kernel presents the value on the field to the application, the value may not be the same as the value on the packet on the wire, due to manipulation in the kernel (example: traffic received by a diffserv egress node itself). The following text proposes a suggested behavior. One of the goals of the suggestion is to allow applications to implement UDP ECN by themselves. The behavior may need more discussions: Outbound traffic If there is no conflict (for example, the TCP stack is not ECN- capable), the kernel should honor the value an application specified, and put the specified value into the traffic class field as is. If there is a conflict, the kernel should override the value specified by the application, for the part of the field (bits) the kernel is using. For example, if the kernel has an ECN- capable TCP stack but does not support diffserv, the kernel should override ECN bits only. Inbound traffic Kernel should present the traffic class value appeared on the wire as is to applications. Note that, in some cases, the kernel may want to alter specific bits in the field, before presenting the value to the userland. For example, if the kernel implements TCP ECN and would like to make it transparent to the user programs, the kernel may want to hide ECN bits >From diffserv and ECN protocol specifications, the traffic class field may be rewritten by intermediate routers. So even if the sender specifies a value, the value may be altered before the packet reaches the final destination. 5. Issues o Revise conflict resolution rule? 6. Security consideration The API could be used for attempted theft of service. An attacker may try to inject packets, with some specific value in traffic class field, into a diffserv cloud. Refer to RFC2474 [Nichols, 1998] section 7.1 for detail. Note that the theft of diffserv service is possible even without the API. HAGINO Expires: April 16, 2001 [Page 4] DRAFT API for IPv6 traffic class field October 2000 References Gilligan, 2000. R. Gilligan, S. Thomson, J. Bound, and W. Stevens, "Basic Socket Interface Extensions for IPv6," internet draft (May 2000). work in progress material. Stevens, 1999. W. Richard Stevens, Matt Thomas, and Eric Nordmark, "Advanced Sockets API for IPv6," internet draft (October 1999). work in progress material. Nichols, 1998. K. Nichols, S. Blake, F. Baker, and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers" in RFC2474 (December 1998). ftp://ftp.isi.edu/in-notes/rfc2474.txt. Ramakrishnan, 1999. K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit Congestion Notification (ECN) to IP" in RFC2481 (January 1999). ftp://ftp.isi.edu/in-notes/rfc2481.txt. Acknowledgements The document was made possible by numerous invaluable comments from members of WIDE research group and KAME team. Here are people gave comments on the draft: Brian Carpenter (in no particular order). Change history 00 -> 01 Improve the section on security consideration. Author's address Jun-ichiro itojun HAGINO Research Laboratory, Internet Initiative Japan Inc. Takebashi Yasuda Bldg., 3-13 Kanda Nishiki-cho, Chiyoda-ku, Tokyo 101-0054, JAPAN Tel: +81-3-5259-6350 Fax: +81-3-5259-6351 Email: itojun@iijlab.net HAGINO Expires: April 16, 2001 [Page 5]