Network Working Group R. Ullmann Request for Comments: 1475 Process Software Corporation June 1993
This memo defines an Experimental Protocol for the Internet community. It does not specify an Internet standard. Discussion and suggestions for improvement are requested. Please refer to the current edition of the "IAB Official Protocol Standards" for the standardization state and status of this protocol. Distribution of this memo is unlimited.
The first version of this memo, describing a possible next generation of Internet protocols, was written by the present author in the summer and fall of 1989, and circulated informally, including to the IESG, in December 1989. A further informal note on the addressing, called "Toasternet Part II", was circulated on the IETF mail list during March of 1992.
2. Internet numbers
2.1 Is 64 Bits Enough?
2.2 Why version 7?
2.3 The version 7 IP address
2.4 AD numbers
2.5 Mapping of version 4 numbers
3. IP: Internet datagram protocol
3.1 IP datagram header format
3.1.2 Header length
3.1.3 Time to live
3.1.4 Total datagram length
3.1.5 Forward route identifier
3.2 Option Format
3.2.1 Class (C)
3.2.2 Copy on fragmentation (F)
3.2.5 Option data
3.3 IP options
3.3.3 Last Fragment
3.3.4 Don't Fragment
3.3.5 Don't Convert
3.4 Forward route identifier
3.4.1 Procedure description
3.4.3 Mobile hosts
4. TCP: Transport protocol
4.1 TCP segment header format
4.1.1 Data offset
4.1.5 Source port
4.1.6 Destination port
4.2 Port numbers
4.3 TCP options
4.3.1 Option Format
4.3.3 Maximum Segment Size
4.3.4 Urgent Pointer
4.3.5 32 Bit rollover
5. UDP: User Datagram protocol
5.1 UDP header format
5.1.1 Data offset
5.1.4 Source port
5.1.5 Destination port
6.1 ICMP header format
6.2 Conversion failed ICMP message
7. Notes on the domain system
7.1 A records
7.2 PTR zone
8. Conversion between version 4 and version 7
8.1 Version 4 IP address extension option
8.1.1 Option format
8.2 Fragmented datagrams
8.3 Where does the conversion happen?
8.4 Hybrid IPv4 systems
8.5 Maximum segment size in TCP
8.6 Forwarding and redirects
8.7 Design considerations
8.8 Conversion from IPv4 to IPv7
8.9 Conversion from IPv7 to IPv4
8.10 Conversion from TCPv4 to TCPv7
8.11 Conversion from TCPv7 to TCPv4
8.12 ICMP conversion
11. Security Considerations
12. Author's Address
This memo presents the specification for version 7 of the Internet Protocol, as well as version 7 of the TCP and the user datagram protocol. Version 7 has been designed to address several major problems that have arisen as version 4 has evolved and been deployed, and to make a major step forward in the datagram switching and forwarding architecture of the Internet.
The major problems are threefold. First, the address space of version 4 is now seen to be too small. While it was viewed as being almost impossibly large when version 4 was designed, two things have occurred to create a problem. The first is a success crisis: the internet protocols have been more widely used and accepted than their designers anticipated. Also, technology has moved forward, putting microprocessors into devices not anticipated except as future dreams a decade ago.
The second major problem is a perceived routing explosion. The present routing architecture of the internet calls for routing each organization's network independently. It is becoming increasingly clear that this does not scale to a universal internet. While it is possible to route several billion networks in a flat, structureless domain, it is not desireable.
There is also the political administrative issue of assigning network numbers to organizations. The version 4 administrative system calls for organizations to request network assignments from a single
authority. While to some extent this has been alleviated by reserving blocks to delegated assignments, the address space is not large enough to do this in the necessary general case, with large blocks allocated to (e.g.) national authority.
The third problem is the increasing bandwidth of the networks and of the applications possible on the network. The TCP, while having proven useful on an unprecedented range of network speeds, is now the limiting factor at the highest speeds, due to restrictions of window size, sequence-space, and port numbers. These limitations can all be addressed by increasing the sizes of the relevant fields. See [RFC1323].
There is also an opportunity to move the technology forward, and take advantage of a combination of the best features of the hop-by-hop connectionless forwarding of version 4 (and CLNP) as well as the pre-established paths of version 5 (and, e.g., the OSI CONS).
Internet Version 7 includes four major areas of improvement, while at the same time retaining interoperation with version 4 with a small amount of conversion knowledge imposed on version 7 hosts and routers.
The result is not just a movement sideways, deploying a new network layer protocol to patch current problems. It is a significant step forward for network layer technology,
The following are some of the objectives of the design.
I believe all of these are accomplishable in a consistent, well- engineered solution, and all are essential to the survival of the Internet.
Protocols should become simpler as they evolve.
The version 4 numbering system has proven to be very flexible, (mostly) expandable, and simple. In short: it works. There are two problems, neither serious when this specification was first developed in 1988 and 1989, but have as expected become more serious:
Consider: (thought experiment) 32 bits presently numbers "all" of the computers in the world, and another 32 bits could be used to number all of the bytes of on-line storage on each computer. (Most have a lot less than 4 gigabytes on-line, the ones that have more could be notionally assigned more than one address.)
So: 64 bits is enough to number every byte of online storage in existence today, in a hierarchical structured numbering plan.
Another way of looking at 64 bits: it is more than 2 billion
addresses for each person on the planet. Even if I have
microprocessors in my shirt buttons I'm not going to have that many. 32 bits, on the other hand, was never going to be sufficient: there are more than 2^32 people.
It was clearly recognized at the start of this project in 1988 that making the address 64 bits implies a new IP header format, which was called either "TP/IX" or "IP version 7"; there wasn't anything magic about the number 7, I made it up. Version 4 is the familiar current version of IP. Version 5 is the experimental ST (Stream) protocol. ST-II, a newer version of ST, uses the same version number, something I was not aware of until recently; I suspected it might have been allocated 6. Besides, I liked 7.
Apparently (as reported by Bob Braden) the IAB followed much the same logic, and may have had the idea planted by the mention of version 7 in the "Toasternet Part II" memo. The IAB in June 1992 floated a proposal that CLNP, or a CLNP-based design, be Internet Version 7. (And promptly got themselves toasted.) However, close inspection of the bits shows that CLNP is clearly version 8.
The Version 7 IP 64 bit address looks like:
+-------+-------+-------+-------+-------+-------+-------+-------+ | Admin Domain | Network | Host | +-------+-------+-------+-------+-------+-------+-------+-------+
Note: the boundary between "network" and "host" is no more fixed than it is today; each (sub)network will have its own mask. Just as the mask today can be anywhere from FF00 0000 (8/24) to FFFF FFFC (30/2), the mask for the 64 bit address can reasonably be FFFF FF00 0000 0000 (24/40) to FFFF FFFF FFFF FFFC (62/2).
The AD (Administrative Domain), identifies an administration which may be a service provider, a national administration, or a large multi-organization (e.g. a government). The idea is that there should not be more than a few hundred of these at first, and eventually thousands or tens of thousands at most. (But note that we do not introduce a hard limit of 2^16 here; this estimate may be off by a few orders of magnitude.) Since only 1/4th of the address space is initially used (first two bits are 01), the remainder can then be allocated in the future with more information available.
Most individual organizations would not be ADs. In the short term, ADs are known to the "core routing"; it pays to keep the number smallish, a few thousand given current routing technology. In the long term, this is not necessary. Big administrations (i.e., with tens of millions of networks) get small blocks where needed, or additional single AD numbers when needed.
While the AD may be used for last resort routing (with a 24/40 mask), it is primarily only an administrative device. Most routing will be done on the entire 48 bit AD+network number, or sub and super-sets of those numbers. (I.e., masks between about 32/32 and 56/8.)
Some ADs (e.g., NSF) may make permanent assignments; others (such as a telephone company defining a network number for each subscriber line) may tie the assignment to such a subscription. But in no case does this require traffic to be routed via the AD.
AD numbers are allocated out of the same numbering space as network numbers. This means that a version 4 address can be distinguished from the first 32 bits of a version 7 address. This is useful to help prevent the inadvertent use of the first half of the longer address by a version 4 host.
There is a non-trivial amount of software that assumes that an "int" is the same size and shape as an IP address, and does things like "ipaddr = *(int *)ptr". This usage has always been incorrect, but does occur with disturbing frequency. As IPv7 8 byte addresses appear in the application layers, this software will find those addresses unreachable; this is preferable to interacting with a random host.
One possible method would be to allocate ADs in the range 96.0.0 to 192.255.255, using the top 1/4 of the version 4 class A space. It is probably best to allocate the first component downwards from 192, so that the boundary between class A and AD can be moved if desired later. This initial allocation provides for 2031616 ADs, many more than there should be even in full deployment.
Eventually, both AD and network will use the full 24 bit space available to them. Knowledge of the AD range should not be coded into software. If it was coded in, that software would break when the entire 24 bit space is used for ADs. (This lesson should have been learned from CIDR.)
Initially, all existing Internet numbers are defined as belonging to the NSF/Internet AD, number 192.0.0.
The mapping from/to version 4 IP addresses:
+-------+-------+-------+-------+-------+-------+-------+-------+ | Admin Domain | Network | Host | +-------+-------+-------+-------+-------+-------+-------+-------+ [ fixed at A0 00 00 ] [ 1st 24 bits of V4 IP]  [last 8]
So, for example, 220.127.116.11 (V4) becomes 18.104.22.168.22.214.171.124.
And the "standard" loopback interface address becomes
The present protocol multicast (126.96.36.199.x.y.1.z) and loopback addresses are permanently allocated in the NSF AD.
The Internet datagram protocol is revised to expand some fields (most notably the addresses), while removing and relegating to options all fields not universally useful (imperative) in every datagram in every environment.
This results in some simplification, a length less than twice the size of IPv4 even though most fields are doubled in size, and an expanded space for options.
There is also a change in the option philosophy from IPv4: it specified that implementation of options was not optional, what was optional was the existence of options in any given datagram. This is changed in IPv7: no option need be implemented to be fully conformant. However, implementations must understand the option classes; and a future Host Requirements specification for hosts and routers used in the "connected Internet" may require some options in its profile, e.g., Fragment would probably be required.
Digression: In IPv4, options are often "considered harmful". It is the opinion of the present author that this is because they are rarely needed, and not designed to be processed rapidly on most architectures. This leads to little or no attempt to improve performance in implementations, while at the same time enormous effort is dedicated to optimization of the no-option case. IPv7 is expected to be different on both counts.
Fields are always aligned on their own size; the 64 bit fields on 64 bit intervals from the start of the datagram.
Options are all 32 bit aligned, and the null option can be used to
push a subsequent option (or the transport layer header) into 64 bit or 64+32 off-phase alignment as desired.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |version| header length | time to live | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | total datagram length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + forward route identifier + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + destination address + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + source address + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | protocol | checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | options | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A description of each field follows.
This document describes version 7 of the protocol.
The header length is a 12 bit count of the number of 32 bit words in the IPv7 header. This allows a header to be (theoretically at least) up to 16380 bytes in length.
The time to live is a 16 bit count, nominally in 1/16 seconds. Each hop is required to decrement TTL by at least one.
This definition should allow continuation of the useful (even though not entirely valid) interpretation of TTL as a hop count, while we
move to faster networks and routers. (The most familiar use is by "traceroute", which really ought to be directly implemented by one or more ICMP messages.)
The scale factor converts the usual version 4 default TTL into a larger number of hops. This is desireable because the forward route architecture of version 7 enables the construction of simpler, faster switches, and this may cause the network diameter to increase.
The 32 bit length of the entire datagram in octets. A datagram can therefore be up to 4294967295 bytes in overall length. Particular networks will normally impose lower limits.
The identifier from the routing protocol to be used by the next hop router to find its next hop. (A more complete description is given below.)
The 64 bit IPv7 destination address.
The 64 bit IPv7 source address.
The transport layer protocol, e.g., TCP is 6. The present code space for this layer of demultiplexing is about half full. Expanding it to 16 bits, allowing 65535 registered "transport" layers seems prudent.
The checksum is a 16 bit checksum of the entire IP header, using the familiar algorithm used in IPv4.
Options may follow. They are variable length, and always 32 bit aligned, as discussed previously.
Each option begins with a 32 bit header:
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | C |F| type | length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | option data ... | padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A description of each field:
This field tells implementations what to do with datagrams containing options they do not understand. No implementation is required to implement (i.e., understand) any given option by the TCP/IP specification itself.
0 use or forward and include this option unmodified 1 use this datagram, but do not forward the datagram 2 discard, or forward and include this option unmodified 3 discard this datagram
A host receiving a datagram addressed to itself will use it if there are no unknown options of class 2 or 3. A router receiving a datagram not addressed to it will forward the datagram if and only if there are no unknown options of class 1 or 3. (The astute reader will note that the bits can also be seen as having individual interpretations, one allowing use even if unknown, one allowing forwarding if unknown.)
Note that classes 0 and 2 are imperative: if the datagram is forwarded, the unknown option must be included.
Class and type are entirely orthogonal, different implementations might use different classes for the same option, except where restricted by the option definition.
Also note that for options that are known (implemented by) the host or router, the class has no meaning; the option definition totally determines the behavior. (Although it should be noted that the option might explicitly define a class dependent behavior.)
If the F bit is set, this option must be copied into all fragments when a datagram is fragmented. If the F bit is reset (zero), the option must only be copied into the first (zero-offset) fragment.
The type field identifies the particular option, types being registered as well known values in the internet. A few of the options with their types are described below.
Length of the option data, in bytes.
Variable length specified by the length field, plus 0-3 bytes of zeros to pad to a 32 bit boundary. Fields within the option data that are 64 bits long are normally placed on the assumption that the option header is off-phase aligned, the usual case when the option is the only one present, and immediately follows the IP header.
The following sections describe the options defined to emulate IPv4 features, or necessary in the basic structure of the protocol.
The null option, type 0, provides for a space filler in the option area. The data may be of any size, including 0 bytes (perhaps the most useful case.)
It may be used to change alignment of the following options or to replace an option being deleted, by setting type to 0 and class to 0, leaving the length and content of the data unmodified. (Note that this implies that options must not contain "secret" data, relying on class 3 to prevent the data from leaving the domain of routers that understand the option.)
Null is normally class 0, and need not be implemented to serve its function.
Fragment (type 1) indicates that the datagram is part of a complete IP datagram. It is always class 2.
The data consists of (one of) the 64 bit IP address(es) of the router doing the fragmentation, a 64 bit datagram ID generated by that router, and a 32 bit fragment offset. The IDs should be generated so as to be very likely unique over a period of time larger than the TCP MSL (maximum segment lifetime). (The TCP ISN (initial sequence number) generator might be used to initialize the ID generator in a router.)
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | C |F| type | length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + fragmenting router IP address + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + datagram ID + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
If a datagram must be refragmented, the original 128 bit address+ID is preserved, so that the datagram can be reassembled from any sufficient set of the resulting fragments. The 64 bits fields are positioned so that they are aligned in the usual case of the fragment option following the IP header.
A router implementing Fragment (doing fragmentation) must recognize the Don't Fragment option.
Last Fragment (type 2) has the same format as Fragment, but implies that this datagram is the last fragment needed to reassemble the original datagram.
Note that an implementation can reasonably add arriving datagrams with Fragment to a cache, and then attempt a reassembly when a datagram with Last Fragment arrives (and the the total length is known); this will work well when datagrams are not reordered in the
This option (type 3, class 0) indicates that the datagram may not be fragmented. If it can not be forwarded without fragmentation, it is discarded, and the appropriate ICMP message sent. (Unless, of course, the datagram is an ICMP message.) There is no data present.
The Don't Convert option prohibits conversion from IPv7 to IPv4 protocol, requiring instead that the datagram be discarded and an ICMP message sent (conversion failed/don't convert set). It is type 4, usually class 0, and must be implemented by any router implementing conversion. A host is under no such constraint; like any protocol specification, only the "bits on the wire" can be specified, the host receiving the datagram may convert it as part of its procedure. There is no data present in this option.
Each IP datagram carries a 64 bit field, called "forward route identifier", that is updated (if the information is available) at each hop. This field's value is derived from the routing protocol (e.g., RAP [RFC1476]). It is used to expedite routing decisions by preserving knowledge where possible between consecutive routers. It can also be used to make datagrams stay within reserved flows and mobile-host tunnels where required.
Consider 3 routers, A, B, and C. Traffic is passing through them, between two other hosts (or networks), X and Y, packets are going XABCY and YCBAX. Consider only one direction: routing info flowing from C to A, to provide a route from A to C. The same thing will be happening in the other direction.
An explanation of the notation:
R(r,d,i,h) A route that means: "from router r, to go toward final destination d, replace the forward route identifier in the packet with i, and take next hop h." Ri(r,d) An opaque (outside of router r) identifier, that can be used by r to find R(r,d,...).
Flowi(r,rt) An opaque (outside of router r) identifier, that router r can use to find a flow or tunnel with which the datagram is associated, and from that the route rt on which the flow or tunnel is built, as well as the Flowi() for the subsequent hop. Ri(Dgram) The forward route identifier in a datagram.
Router C announces a route R(C,Y,0,Y) to router B. It includes in it an identifier Ri(C,Y) internal to C, that will allow C to find the route rapidly. (A table index, or an actual memory address.)
Router B creates a route R(B,Y,Ri(C,Y),C) via router C, it announces it to A, including an identifier Ri(B,Y), internal to B, and used by A as an opaque object.
Router A creates a route R(A,Y,Ri(B,Y),B) via router B. It has no one to announce it to.
Now: X originates a datagram addressed to Y. It has no routing information, and sets Ri(Dgram) to zero. It forwards the datagram to router A (X's default gateway).
A finds no valid Ri(Dgram), and looks up the destination (Y) in its routing tables. It finds R(A,Y,Ri(B,Y),B), sets Ri(Dgram) <- Ri(B,Y), and forwards the datagram to B.
Router B looks at Ri(Dgram) which directly identifies the next hop route R(B,Ri(C,Y),C), sets Ri(Dgram) <- Ri(C,Y) and forwards it to router C.
Router C looks at Ri(Dgram) which directly locates R(C,0,Y), sets Ri(Dgram) <- 0 and forwards to Y.
Y recognizes its own address in Dest(Dgram), ignores Ri(Dgram).
Of course, the routers will validate the Ri's received, particularily if they are memory addresses (e.g., M(a) < Ri < M(b), Ri mod N == 0), and probably check that the route in fact describes the destination of the datagram. If the Ri is invalid, the router must use the ordinary method of finding a route (i.e., what it would have done if Ri was 0), and silently ignore the invalid Ri.
When a route has been aggregated at some router, implicitly or explicitly, it will find that the incoming Ri(Dgram) at most can identify the aggregation, and it must make a decision; the forwarded datagram then contains the Ri for the specific route. (Note this may happen well upstream of the point at which the routes actually
This allows all cooperating routers to make immediate forwarding decisions, without any searching of tables or caches once the datagram has entered the routing domain. If the host participates in the routing, at least to the extent of acquiring the initial Ri required from the first router, then only routers that have done aggregations need make decisions. (If the routing changes with datagrams in flight, some router will be required to make a decision to re-rail each datagram.)
If a "flow" is to be set up, the identifiers are replaced by Flowi(router,route), where each router's structure for the flow contains a pointer to the route on which the flow is built. Datagrams can drop out of the flow at some point, and can be inserted either by the originating host or by a cooperating router near the originator. Since the forward route identifier field is opaque to the sending router, and implicitly meaningful only to the next hop router, use for flows (or similar optimizations) need not be otherwise defined by the protocol. (One presumes that a router issuing both Ri's and Flowi's will take care to make sure that it can distinguish them by some private method.)
If a flow has been set up by a restricted target RAP route announcement, it is no different from a route in the implementation. If this announcement originates from the host itself, the Ri in incoming datagrams can be used to determine whether they followed the flow, or to optimize delivery of the datagrams to the next layer protocol.
First, a definition: A "mobile host" is a host that can move around, connecting via different networks at different times, while maintaining open TCP connections. It is distinguished from a "portable host", which is simply a host that can appear in various places in the net, without continuity. A portable host can be implemented by assigning a new address for each location (more or less automatically), and arranging to update the domain system. Supporting truly mobile hosts is the more interesting problem.
To implement mobile host support in a general way, either some layer of the protocol suite must provide network-wide routing, or the datagrams must be tunnelled from the "home" network of the host to its present location. In the real network, some combination of these is probable: most of the net will forward datagrams toward the home
network, and then the datagrams will follow a specific host route to the mobile host.
The requirement on the routing system is that it must be able to propagate a host route at least to the home network; any other distribution is useful optimization. When a host route is propagated by RAP as a targeted route, and the routers use the resulting Ri's, the datagram follows an effective tunnel to the mobile host. (Not a real tunnel, in the strict sense; the datagrams are following an actual route at the network protocol layer.)
As explained in RAP [RFC14XX-RAP], a targeted route can be issued when desired; in particular, it can be triggered by the establishment of a TCP connection or by the arrival of datagrams that do not carry an Ri indicating that they have followed a (non-tunnel) route.
Internet version 7 expands the sizes of the sequence and
acknowledgement fields, the window, and the port numbers. This is to remove limitations in version 4 that begin to restrict throughput at (for example) the bandwidth of FDDI and round trip delay of more than 60 milliseconds. At gigabit speeds and delays typical of international links, the version 4 TCP would be a serious limitation. See [RFC1323].
The port numbers are also expanded. This alleviates the problem of going through the entire port number range with a rapid sequence of transactions in less than the lifetime of datagrams in the network.
The 64 bit fields (sequence and acknowledgement) in the TCP header are off-phase aligned, in anticipation of the usual case of the TCP header following the 9 32-bit word IP header. If IP options add up to an odd number of 32 bit words, a null option may be added to push the transport header to off-phase alignment.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | data offset | MBZ |A|P|R|S|F| checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | source port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | destination port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + sequence number + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + acknowledgement number + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | window | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | options ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A description of each field:
An 8 bit count of the number of 32 bit words in the TCP header, including any options.
Reserved bits, must be zero, and must be ignored.
These are the protocol state flags, use exactly as in TCPv4, except that there is no urgent data flag.
This is a 16 bit checksum of the segment. The pseudo-header used in the checksum consists of the destination address, the source address, the protocol field (constant 6 for TCP), and the 32 bit length of the TCP segment.
The source port number, a 32 bit identifier. See the section on port numbers below.
The 32 bit destination port number.
A 64 bit sequence number, the sequence number of the first octet of user data in the segment.
The ISN (Initial Sequence Number) generator used in TCPv4 is used in TCPv7, with the 32 bit value loaded into both the high and low 32 bits of the TCPv7 sequence number. This provides reasonable behavior when the 32 bit rollover option is used (see below) for TCPv4 interoperation. V7 hosts must implement the full 64 bit sequence number rollover.
The 64 bit acknowledgement number, acknowledging receipt of octets up to but not including the octet identified. Valid if the A flag is set, if A is reset (0), this field should be zero, and must be ignored.
The 32 bit offered window.
TCP options, some of which are documented below.
Port numbers are divided into several ranges: (all numbers are decimal)
0 reserved 1-32767 Internet registered ("well-known") protocols 32768-98303 reserved, to allow TCPv7-TCPv4 conversion 98304 up dynamic assignment
It must also be remembered that hosts are free to dynamically assign for active connections any port not actually in use by that host:
hosts must not reject connections because the "client" port is in the registered range.
Each option begins with a 32 bit header:
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type | length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | option data ... | padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The null option (type = 0), is to be ignored.
Maximum segment size (type = 1) specifies the largest segment that the other TCP should send, in terms of the number of data octets. When sent on a SYN segment, it is mandatory; if sent on any other segment it is advisory.
Data is one 32 bit word specifying the size in octets.
The urgent pointer (type = 2) emulates the urgent field in TCPv4. Its presence is equivalent to the U flag being set. The data is a 64 bit sequence number identifying the last octet of urgent data. (Not an offset, as in v4.)
The 32 bit rollover option (type = 3) indicates that only the low order 32 bits of the sequence and acknowledgement packets are significant in the packet.
This is necessary because a converting internet layer gateway has no retained state, and cannot properly set the high order bits. This option must be implemented by version 7 hosts that want to interoperate with version 4 hosts.
The user datagram protocol is also expanded to include larger port numbers, for reasons similar to the TCP.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | data offset | MBZ | checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | source port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | destination port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | options ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A description of each field:
An 8 bit count of the number of 32 bit words in the UDP header, including any options.
Reserved bits, must be zero, and must be ignored.
This is a 16 bit checksum of the datagram. The pseudo-header used in the checksum consists of the destination address, the source, address, and the protocol field (constant 17 for UDP), and the 32 bit length of the user datagram.
The source port number, a 32 bit identifier. See the section on TCP port numbers above.
The 32 bit destination port number.
UDP options, none are presently defined.
The ICMP protocol is very similar to ICMPv4, in some cases not requiring any conversion.
The complication is that IP datagrams are nested within ICMP messages, and must be converted. This is discussed later.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type | code | checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type-specific parameter | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type-specific data ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type and code are well-known values, defined in [RFC792]. The codes have meaning only within a particular type, they are not orthogonal.
The next 32 bit word is usually defined for the specific type, sometimes it is unused.
For many types, the data consists of a nested IP datagram, usually truncated, which is a copy of the datagram causing the event being reported. In IPv4, the nested datagram consists of the IP header, and another 64 bits (at least) of the original datagram.
For IPv7, the nested datagram must include the IP header plus 96 bits of the remaining datagram (thus including the port numbers within TCP and UDP), and should include the first 256 bytes of the datagram. I.e., in most cases where the original datagram was not large, it will return the entire datagram.
The introduction of network layer conversion requires a new message type, to report conversion errors. Note that an invalid datagram should result in the sending of some other ICMP message (e.g., parameter problem) or the silent discarding of the datagram. This message is only sent when a valid datagram cannot be converted.
Note: implementations are not expected to, and should not, check the validity of incoming datagrams just to accomplish this; it simply means that an error detected during conversion that is known to be an actual error in the incoming datagram should be reported as such, not as a conversion failure.
Note that the conversion failed ICMP message may be sent in either the IPv4 or IPv7 domain; it is a valid ICMP message type for IPv4.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type | code | checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | pointer to problem area | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | copy of datagram that could not be converted .... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The type for Conversion Failed is 31.
The codes are:
0 Unknown/unspecified error 1 Don't Convert option present 2 Unknown mandatory option present 3 Known unsupported option present 4 Unsupported transport protocol 5 Overall length exceeded 6 IP header length exceeded 7 Transport protocol > 255 8 Port conversion out of range 9 Transport header length exceeded 10 32 Bit Rollover missing and ACK set 11 Unknown mandatory transport option present
The use of code 0 should be avoided, any other condition found by implementors should be assigned a new code requested from IANA. When code 0 is used, it is particularily important that the pointer be set properly.
The pointer is an offset from the start of the original datagram to the beginning of the offending field.
The data is part of the datagram that could not be converted. It must be at least the IP and transport headers, and must include the field pointed to by the previous parameter. For code 4, the transport header is probably not identifiable; the data should
include 256 bytes of the original datagram.
Address records will be added to the IN (Internet) zone with IPv7
addresses for all hosts as IPv7 is deployed. Eventually the IPv4
addresses will be removed. As mentioned above, the AD
(Administrative Domain) space is initially assigned so that the first 4 octets of a v7 address cannot be confused with a v4 address (or, rather, the confusion will be to no effect.)
DELTA.Process.COM. A 188.8.131.52 A 184.108.40.206.220.127.116.11
It is important that the A record be used, to avoid the cache consistancy problem that would arise when different records had different remaining TTLs.
Note that if an unmodified version of the more popular public domain nameserver is a secondary for a zone containing IPv7 addresses, it will erroneously issue RRs with only the first four bytes. (I.e., 18.104.22.168 in the example.) This is another reason to ensure that the AD numbers are initially reserved out of the IPv4 network number space. Eventually, zones with IPv7 addresses would be expected to be served only by upgraded servers.
The inverse (PTR) zone is .#, with the IPv7 address (reversed). I.e., just like .IN-ADDR.ARPA, but with .# instead.
This respects the difference in actual authority: the NSF/DDN NIC is the authority for the entire space rooted in .IN-ADDR.ARPA. in the v4 Internet, while in the new Internet it holds the authority only for the AD 0.0.192.#. (Plus, of course, any other ADs assigned to it over time.)
As noted in the description of datagram format, it is possible to provide a mostly-transparent bridge between version 4 and version 7.
This discusses TCP and ICMP at the session/transport layer; UDP is a subset of the TCP conversion. Most protocols at this layer will
probably need no translation; however it will probably be necessary to specify exactly which will have translations done.
New protocols at the session/transport layer defined over IPv7 should have protocol numbers greater than 255, and will not be translated to IPv4.
Most of the translations should consist of copying various fields, verifying fixed values in the datagram being translated, and setting fixed values in the datagram being produced. In general, the checksum must be verified first, and then a new checksum computed for the generated datagram.
A new option is defined for IP version 4, to carry the extended addresses of IPv7. This will be particularily useful in the initial testing of IPv7, during a time when most of the fabric of the internet is IPv4. An IPv7 host will be able to connect to another IPv7 host anywhere in the internet even though most of the paths and routers are IPv4, and still use the full addressing. This will continue to work until non-unique network numbers are assigned, by which time most of the infrastructure should be IPv7.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type (147) | length = 10 | source IPv7 AD number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | src 7th octet | destination IPv7 AD | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | number ... | dst 7th octet | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The source and destination are in IPv4 order (source first), for consistancy. The type code is 147.
Datagrams that have been fragmented must be reassembled by the converting host or router before conversion. Where the conversion is being done by the destination host (i.e., the case of a v7 host receiving v4 datagrams), this is similar to the present fragmentation model.
When it is being done by an intermediate router (acting as an internetwork layer gateway) the router should use all of source, destination, and datagram ID for identification of IPv4 fragments;
note that destination is used implicitly in the usual reassembly at the destination. When reassembling an IPv7 datagram, the 128 bit fragment ID is used as usual.
If the fragments take different paths through the net, and arrive at different conversion points, the datagram is lost.
The objective of conversion is to be able to upgrade systems, both hosts and routers, in whatever order desired by their owners. Organizations must be able to upgrade any given system without reconfiguration or modification of any other; and IPv4 hosts must be able to interoperate essentially forever. (IPv4 routers will probably be effectively eliminated at some point, except where they exist in their own remote or isolated corners.)
Each TCP/IP v7 system, whether host or router, must be able to recognize adjacent systems in the topology that are (only) v4, and call the appropriate conversion routine just before sending the datagram.
Digression: I believe v7 hosts will get much better performance by doing everything internally in v7, and using conversion to filter datagrams when necessary. This keeps the usual code path simple, with only a "hook" right after receiving to convert incoming IPv4 datagrams, and just before sending to convert to IPv4. Routers may prefer to keep datagrams in their incoming version, at least until after the routing decision is made, and then doing the conversion only if necessary. In either case, this is an implementation specific decision.
It must be noted that any forwarding system may convert datagrams to IPv7, then back to IPv4, even if that loses information such as unknown options. The reverse is not acceptable: a system that receives an IPv7 datagram should not convert it to IPv4, then back to IPv7 on forwarding.
The preferred method for identifying which hosts require conversion is to ARP first for the IPv7 address, and then again if no response is received, for the IPv4 address. The reservation of ADs out of the v4 network number space is useful again here, protecting hosts that fail to properly use the ARP address length fields.
On networks where ARP is not normally used, the method is to assume that a remote system is v7. If an IPv7 datagram is received from it, the assumption is confirmed. If, after a short time, no IPv7 datagram is received, a v7 ICMP echo is sent. If a reply is received
(in either version) the assumption is confirmed.
If no reply is recieved, the remote system is assumed not to understand IPv7, and datagrams are converted to IPv4 just before transmitting them.
Implementations should also provide for explicit configuration where desired.
In the course of implementing IPv7, especially in constrained environments such as small terminal servers, it may be useful to implement the IPv4 address extension option directly, thereby regaining universal connectivity.
This may also be a useful interim step for vendors not prepared to do a major rework of an implementation; but it is important not to get stalled in this step.
A hybrid IPv4 + address extension system does not have to implement the conversion, it places this onus on its neighbors. It may itself have an address with the subnet extension (7th byte) not equal to 1.
The implication of hybrid systems is that it is not valid to assume that a host with a IPv7 address is a native IPv7 implementation.
It is probably advisable for IPv4 implementations to reduce the MSS offered by a small amount where possible, to avoid fragmentation when datagrams are converted to IPv7. This arises when IPv4 hosts are communicating through an IPv7 infrastructure, with the same MTU as the local networks of the hosts.
It may be important for a router to not send ICMP redirects when it finds that it must do a conversion as part of forwarding the datagram. In this case, the hosts involved may not be able to interact directly. The IPv7 host could ignore the redirect, but this results in an unpleasant level of noise as the sequence continually recurs.
The conversion is designed to be fairly efficient in implementation, especially on RISC architectures, assuming they can either do a
conditional move (or store), or do a short forward branch without losing the instruction cache. The other conditional branches in the body of the code are usually not-taken out to the failure/discard case.
Handling options does involve a loop and a dispatch (case) operation. The options in IPv4 are more difficult to handle, not being designed for speed on a 32 bit aligned RISCish architecture, but they do not occur often, except perhaps the address extension option.
For CISC machines, the same considerations will lead to fairly efficient code.
The conversion code must be extremely careful to be robust when presented with invalid input; in particular, it may be presented with truncated transport layer headers when called recursively from the ICMP conversion.
Individual steps in the conversion; the order is in most cases not significant.
with copy-on-fragment is found, fail. If copy-on-fragment is not set, ignore the option. I.e., the flag is (ab)used as an indicator of whether the option is mandatory.
The steps to convert IPv7 to IPv4 follow. Note that the converting router or host is partly in the role of destination host; it checks both bits of class in IP options, and (as in the other direction) must reassemble fragmented datagrams.
probably should be a configuration option, should default to on.)
ICMP messages are converted by copying the type and code into the new packet, and copying the other type-specific fields directly.
If the message contains an encapsulated, and usually truncated, IP datagram, the conversion routine is called recursively to translate it as far as possible. There are some special considerations:
It may be best in a given implementation to have a separate code path for the nested conversion, that handles these issues out of the optimized usual path.
The present version of TCP/IP has been a success partly by accident, for reasons that weren't really designed in. Perhaps the most significant is the low level of network integration required to make it work.
We must be careful to retain the successful ingredients, even where we may be unaware of them. Tread lightly, and use all that we have learned, especially about not changing things that work.
This document has described a fairly conservative step forward, with clear extensibility for future developments, but without jumping into the abyss.
[RFC768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, USC/Information Sciences Institute, August 1980. [RFC791] Postel, J., "Internet Protocol - DARPA Internet Program Protocol Specification", STD 5, RFC 791, DARPA, September 1981. [RFC792] Postel, J., "Internet Control Message Protocol - DARPA Internet Program Protocol Specification" STD 5, RFC 792, USC/Information Sciences Institute, September 1981. [RFC793] Postel, J., "Transmission Control Protocol - DARPA Internet Program Protocol Specification", STD 7, RFC 793, USC/Information Sciences Institute, September 1981. [RFC801] Postel, J., "NCP/TCP Transition Plan", USC/Information Sciences Institute, November 1981. [RFC1287] Clark, D., Chapin, L., Cerf, V., Braden, R., and R. Hobby, "Towards the Future Internet Architecture", RFC 1287, MIT, BBN, CNRI, ISI, UCDavis, December 1991. [RFC1323] Jacobson, V., Braden, R, and D. Borman, "TCP Extensions for High Performance", RFC 1323, LBL, USC/Information Sciences Institute, Cray Research, May 1992. [RFC1335] Wang, Z., and J. Crowcroft, A Two-Tier Address Structure for the Internet: A Solution to the Problem of Address Space Exhaustion", RFC 1335, University College London, May 1992. [RFC1338] Fuller, V., Li, T., Yu, J., and K. Varadhan, "Supernetting: an Address Assignment and Aggregation Strategy", RFC 1338, BARRNet, cicso, Merit, OARnet, June 1992.
[RFC1347] Callon, R., "TCP and UDP with Bigger Addresses (TUBA), A Simple Proposal for Internet Addressing and Routing", RFC 1347, DEC, June 1992. [RFC1476] Ullmann, R., "RAP: Internet Route Access Protocol", RFC 1476, Process Software Corporation, June 1993. [RFC1379] Braden, R., "Extending TCP for Transactions -- Concepts", RFC 1379, USC/Information Sciences Institute, November 1992.
Security issues are not discussed in this memo.
Process Software Corporation
959 Concord Street
Framingham, MA 01701
Phone: +1 508 879 6994 x226