Internet Router: Return to Sender, Address Unknown

by Doctor Electron

It may not be crying time when internet routers sing the Elvis Presley hit, "Return to sender, address unknown." The song consists of returned packets which, like ordinary mail, are deemed to be undeliverable according to the routing table of the router. Simple "hello" messages called ping or echo requests may be returned to the sender in the form of time-to-live (TTL) expired ICMP packets [RFC 792], each associated with an undeliverable destination IP address.

According to RFC 1812, a router "must" return host or network unreachable messages, but the lab received TTL expired messages instead, so those were used to create a negative image of the routing table. More on this in the router vulnerability discussion below. By the way, RFCs often say that internet hosts "must" reply, and then qualify that with something like, "but only if the boss says so."

This paper illustrates how this information may be used to construct a general routing table for the worldwide internet. This table could be practically useful by indicating which address ranges do not contain hosts. A similar procedure could be applied to smaller IP address spaces to obtain more detailed packet deliverability information. Unroutable addresses may be archived for that purpose. In some cases, recent BGP routing tables [RFC 1267] may be downloaded (see below).

ICMP echo requests with maximum time-to-live (255) were sent to randomly generated IP addresses ranging from 1.0.0.1 to 219.255.255.254, excluding the local host prefix of 127. Echo responses were tabulated as well as ICMP error packets, mostly from routers. Data was collected with an automated program written by the author.
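The address selection described above can be sketched as follows. This is only an illustration, not the author's program: the function name is hypothetical, and it assumes each x/8 block is sampled over x.0.0.1 to x.255.255.254 with the first byte drawn uniformly from 1 to 219, skipping 127.

```python
import ipaddress
import random

# Allowed first bytes: 1..219, excluding the local host prefix 127.
FIRST_BYTES = [x for x in range(1, 220) if x != 127]

def random_test_address(rng):
    """Return one random IPv4 address string for an echo request.

    Assumption (not stated in detail by the source): the low 24 bits are
    drawn from 1 .. 2**24 - 2, i.e. x.0.0.1 through x.255.255.254.
    """
    first = rng.choice(FIRST_BYTES)
    rest = rng.randint(1, 2**24 - 2)
    return str(ipaddress.IPv4Address((first << 24) | rest))

if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        print(random_test_address(rng))
```

Passing in a seeded random.Random instance makes a run repeatable, which may help when archiving the list of addresses tested.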

Of 96,261 randomly selected addresses tested, only 29.7% were deliverable, as defined by absence of TTL expired ICMP error replies. This observed percent routable corresponds almost exactly to the percent of address space represented by routing table entries, as reported by several web sites analyzing routing tables. This suggests that a previous estimate of about 40% usage was too high, as could be surmised by the number of address ranges counted as responsive but using only a fraction of their address capacity [1].

If one determines the addresses to which internet routers cannot or will not deliver packets, the inverse is the list of deliverable addresses, presented in Table 1 as percent of all addresses in x/8 IP address sub-spaces consisting of x.0.0.1 to x.255.255.254 [RFC 1519]. Table 1 below may be compared with Table 1 in the previous report [1].

Table 1: Percent Deliverable by X/8 IP Address Prefix
       0     1     2     3     4     5     6     7     8     9
  0    NA   0.4   0.0 100.0  95.8   0.2   0.5   0.2   0.2   0.2
 10   0.3   0.4  83.0  98.4   0.0 100.0 100.0 100.0  99.8   0.2
 20  28.1   0.4   0.0   0.0  42.2   1.6   0.2   0.0   0.0   0.2
 30   0.2   0.0  98.9  99.6 100.0  99.8   0.0   0.0  99.1   0.0
 40  74.5   0.4   0.2   7.1  99.0 100.0   0.2  19.4   0.0   0.5
 50   0.2   0.9   0.2 100.0   0.5 100.0   0.5 100.0   0.2   0.9
 60   0.2  63.8  69.1  84.4  61.9  70.6  57.1  44.1  33.3   0.0
 70   0.2   0.4   0.0   0.2   0.4   0.5   0.2   0.7   0.0   0.2
 80  67.7  26.4   0.4   0.2   0.2   0.0   0.2   0.0   0.0   0.5
 90   0.4   0.5   0.4   0.0   0.5   0.2   0.2   0.0   0.9   0.7
100   0.6   0.2   0.4   0.2   0.2   0.8   0.0   0.5   0.0   0.2
110   0.7   0.0   0.2   0.2   0.2   0.0   0.2   0.6   0.7   0.0
120   0.5   0.0   0.7   0.7   0.0   0.0   0.2    NA  70.5  67.8
130  69.7  80.4  86.9  48.5  63.4  41.9  27.0  61.9  65.0  46.4
140  57.0  69.9  51.0  63.1  57.3  54.3  54.6  62.3  46.5  49.0
150  54.8  60.5  58.1  17.7   5.5  72.2  26.6  55.9  63.5  46.5
160  54.8  45.4  31.3  53.6  47.3  48.8  52.0  48.9  59.5  24.6
170  50.2  31.6  20.3   0.0   0.0   0.0   0.2   0.0   0.2   0.0
180   0.0   0.0   0.0   0.6   0.7   0.5   0.0   0.2   0.0   0.4
190   0.0   0.2  30.8  67.1  68.7  78.4  10.2   0.5  48.0  54.7
200  54.0   0.0  62.3  60.6  66.7  60.2  73.6  76.3  64.1  68.9
210  71.0  71.3  77.9  66.5  99.8  50.2  66.4  72.0  63.0  54.4
Legend: Each entry equals 100 percent minus the percent of TTL expired ICMP error reports received, for an x/8 IP address range [RFC 1519] where x is the row label plus the column label. n = 96,261. Average n for each entry is about 441 for the randomly generated IP addresses. A BGP table (see text) also shows routes to 220/8 prefixes.
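As a rough sketch of how entries like those in Table 1 could be tabulated from a log of probe results: the record layout and the outcome label "ttl_expired" below are assumptions made for illustration, not the format used by the author's program.

```python
from collections import Counter

def percent_deliverable(results):
    """Tabulate percent deliverable per x/8 prefix.

    results: iterable of (address, outcome) pairs, where address is a
    dotted-quad string and outcome is a label such as "ttl_expired",
    "timeout", "unreachable" or "reply" (hypothetical labels).
    Returns {first_byte: 100 - 100 * TTL_expired / N}, rounded to 0.1.
    """
    sent = Counter()
    ttl_expired = Counter()
    for address, outcome in results:
        prefix = int(address.split(".")[0])
        sent[prefix] += 1
        if outcome == "ttl_expired":
            ttl_expired[prefix] += 1
    return {
        p: round(100.0 - 100.0 * ttl_expired[p] / sent[p], 1)
        for p in sent
    }

if __name__ == "__main__":
    # Synthetic log: 425 probes into 4/8, 18 of them TTL expired.
    log = [("4.0.0.1", "ttl_expired")] * 18 + [("4.0.0.1", "timeout")] * 407
    print(percent_deliverable(log))
```

Grouping on the first two bytes instead of the first would give the finer per-prefix breakdown for a single x value.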

A next step might be to specify a general functional routing table in more detail (like a photograph) based on the complete archived list of individual addresses found to be unroutable (the negative for the photograph based on router TTL expired replies). Lists like Table 1 could be created for single x values, e.g., x = 61, for all variations y (0 to 255) of the second byte of the IP address (e.g., 61.y.0.1 to 61.y.255.254).

The distribution of these results (Table 1 and histogram below) suggests three categories of x/8 address spaces: extreme values near 100% or 0% and intermediate values.

(1) Low values at or near 0.0% suggest that all or almost all of the packets drew TTL expired messages from routers and were hence undeliverable.

In fact, almost all of these TTL expired error reports were received from two routers only 5 hops upstream from the Net Census lab. A "whois" lookup indicated that these two machines are operated by the parent company of the ISP of the lab.

Instead of traveling to far off lands, it appears that our echo requests with unknown addresses were bouncing back and forth between two near routers in the same x.y.z/24 domain, each perhaps assuming that the other would know how to deliver the packet, until it expired. If these routers could talk, one might hear each saying to the other, "Since I don't know where this packet should go, you must know." and "Who is the jerk sending all these packets to nowhere?" Therein lies the value of Table 1. The Net Census research question addressed could not be simpler. The zero values indicate "nowhere," according to these two routers. This anecdotal observation suggests that these two routers have the final word only because the TTL expires, each thinking the other is more authoritative or may have a routing table entry for the destination address.

Since 5 hops of the 255 TTL value were used to travel to this pair of routers, it appears that each machine in this router pair processed the same packet 125 times ((255 - 5) / 2 routers). Might it be a vulnerability that a single packet addressed to nowhere can create router work equivalent to 250 packets with routable addresses? The author has not researched the literature and so makes no claim here to discovery of this apparent vulnerability. A google.com search for "router vulnerability" turned up a number of cases, but none like what was observed in this study. It does seem from our observations, however, that a malicious party could flood router pairs like those described above, knowing that each packet with a non-routable address would multiply router processing load by a factor of about 250. This apparent router flaw essentially multiplies by 250 the rate at which a malevolent host can issue packets designed to congest and hamper router processing of other internet traffic.
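The arithmetic above can be checked with a toy model. This sketch assumes that each router handling the packet decrements the TTL by one and hands it to the other router until the TTL reaches zero; the exact count on a real network may differ by one depending on where the first hop is counted, and the function name is hypothetical.

```python
def simulate_loop(initial_ttl=255, hops_to_loop=5, routers=2):
    """Count how often each looping router handles one packet.

    Toy model: the packet arrives at the loop with
    initial_ttl - hops_to_loop remaining, and every handling
    costs one unit of TTL.
    """
    ttl = initial_ttl - hops_to_loop
    handled = [0] * routers
    current = 0
    while ttl > 0:
        handled[current] += 1
        ttl -= 1
        current = (current + 1) % routers  # pass to the other router
    return handled

if __name__ == "__main__":
    counts = simulate_loop()
    print(counts, "total:", sum(counts))
```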

Of course, this assumes that each packet was processed about 250 times. If the router software has a bug which codes replies as TTL expired when they are actually network unreachable, then that would be a quite different story. We plan to capture the packets in such a transaction to see more specifically if the TTL in the sent packet actually is listed as zero. Stay tuned for an answer on that. Also, the router administrator may figure out this puzzle and inform us.

The reader can check this at his/her location with a console command (Windows ping syntax): ping -n 1 -i 255 1.1.1.1 where "-n 1" specifies one packet, "-i 255" specifies maximum TTL and "1.1.1.1" is a valid but unroutable IP address. The book [RFC 1812] states the router should process the packet once and reply with "network unreachable." From our location, as stated above, the TTL expired reply is contrary to specifications and suggests the packet was processed by one or more routers multiple times. The reader might do a good deed by emailing the result (unreachable or TTL expired) from their location to the author for tabulation. Thanks in advance.
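For readers who capture the reply packet rather than read the ping output, the outcomes discussed here map onto ICMP type and code values defined in RFC 792. The lookup below is a minimal sketch covering only the cases mentioned in this paper; the function and table names are hypothetical.

```python
# (ICMP type, ICMP code) -> outcome, per RFC 792.
ICMP_OUTCOMES = {
    (0, 0): "echo reply",
    (3, 0): "network unreachable",
    (3, 1): "host unreachable",
    (11, 0): "TTL expired in transit",
}

def classify_icmp(icmp_type, icmp_code):
    """Name the outcome for one ICMP reply; anything else is 'other'."""
    return ICMP_OUTCOMES.get((icmp_type, icmp_code), "other")

if __name__ == "__main__":
    for pair in [(11, 0), (3, 0), (3, 1), (0, 0), (5, 1)]:
        print(pair, classify_icmp(*pair))
```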

Non-zero values in Table 1 less than 0.5% could also represent unroutable packets or the presence of small networks in the space. In these cases the alternative outcome observed was a time out, which is subject to multiple interpretations. For example, many of the very low values in Table 1 may indicate lost packets or replies. That is, they may have been undeliverable, but the lab was not notified. In any case, with the small sample size in this data set, it is not possible to conclude that sub-spaces with low percents in Table 1 are void of systems connected to the internet. Indeed, previous reports [1, 2, 3] have itemized internet responses from some of these address spaces.

(2) Values at or near 100% routability may represent situations where routers deliver the packets to the particular x/8 prefix networks, plain and simple. The x/8 IP address ranges where the highest percent of packets were apparently considered by routers to be deliverable are listed in Table 2.

Table 2: x/8 Prefixes With High Deliverability
x/8 IANA allocation                 N   TO TTL UNR  ACT notTTL
  3 General Electric Company       424 423   0   1    0 100.0
 15 Hewlett-Packard Company        441 441   0   0    1 100.0
 16 Digital Equipment Corporation  426   5   0 421    0 100.0
 17 Apple Computer Inc.            455 454   0   1    0 100.0
 34 Halliburton Company            440 440   0   0    0 100.0
 45 Interop Show Network           448 448   0   0    0 100.0
 53 Cap Debis CCS                  442 442   0   0    4 100.0
 55 Boeing Computer Services .mil  435 435   0   0 4954 100.0
 57 SITA (French)                  461 461   0   0   57 100.0
 18 MIT                            447 445   1   1   83  99.8
 35 MERIT Computer Network         455 453   1   1   41  99.8
214 US-DOD                         415 414   1   0    4  99.8
 33 DLA Systems Automation Center  473 470   2   1   14  99.6
 38 Performance Systems Internat'l 438 426   4   8   62  99.1
 44 Amateur Radio Digital Com.     417 413   4   0    1  99.0
 32 Norsk Informasjonsteknologi    444 439   5   0   30  98.9
 13 Xerox Corporation              439 431   7   1    3  98.4
  4 Bolt Beranek and Newman Inc.   425 407  18   0  362  95.8
Legend: N, number of echo requests to random addresses (TO + TTL + UNR). TO, number timed out. TTL, number TTL expired responses. UNR, number host unreachable responses. ACT, number of other internet responses (e.g., ICMP responses, TCP connections) [1,2, unpublished data]. notTTL, 100% - (100 x TTL / N), as shown in Table 1.

A prominent feature in this group of sub-spaces (highest values in Table 2) is that the most probable outcome is time out or no response. This is expected since only about 1% of echo requests to randomly generated addresses elicit an echo reply from the specified host [4]. The ACT column lists internet response counts from other studies by Net Census confirming that a high percent of time out outcomes does not establish the absence of hosts. Also, the time out outcome suggests the presence of a firewall, especially in cases where no responses are seen (e.g., 34 and 45 in Table 2).

The 214 US-DOD (Dept. of Defense) entry may be of interest. Apparently, the routers just 5 hops upstream act as if essentially all of the packets destined for 214/8 addresses are deliverable, while only about 50% of those destined for the 215/8 prefix, also allocated to the US-DOD, are deliverable. This seems to indicate that the latter prefix is where growth of activity might be expected and that the end-user in cases like this (US-DOD) keeps routers well informed on what addresses are accepting packet delivery to maintain a high level of efficiency in router usage. Thus, resources are not wasted by needless transmission of packets that will be undeliverable.

(3) The address prefixes in Table 1 showing intermediate values for the percent of deliverable packets (e.g., 128 - 172) probably comprise most of what might be called the public internet occupied by internet subscribers and servers of many varieties. The intermediate values indicate that even this address space is not fully occupied, and that probably the router table has much more detailed entries. The archived list of unroutable addresses (TTL expired replies) provides a negative image, from which the active networks can be listed, as described above.

This little study was put together over several days by modifying an existing program to tabulate additional variables, when it was noticed that essentially all of the TTL expired replies were originating from only two routers. In contrast, ICMP host unreachable replies to echo requests to random addresses mostly arise from other addresses. This list of address pairs -- echo request destination and address of host unreachable reporter -- appears to exhibit distinct patterns but has not yet been analyzed.

The automated data collection program was written to check if the address of a host reporting an ICMP error was already in a list of such "volunteers" collected previously [1,2,3], and if not, to add it. As this list lengthened the console display would pause long enough so that the high percent of error reports from these two routers could be noticed. This suggested the idea of a "dump" of the routing table(s) as reported above.

Since the method used may multiply the work of these routers for the roughly 70% of randomly selected addresses that are unroutable, it is suggested that this method not be implemented in an automated program. The data reported above was collected before we were able to locate and download an example of a recent routing table.

In this case, data collection was intentionally paced with only one ICMP echo request outstanding at any time consisting of a single packet and a 3500 msec time out. Even so, over 96 thousand cases were logged in a few days. It would have been easy enough to increase data collection rate by a factor of 100 for a goal of 10 million cases. If something like this were done and if the two upstream routers were indeed looping packets as described above (the ISP has been informed and may figure that out), such an "experiment" might have caused decreased router performance. In short, researchers need not hurry data collection, even if the bandwidth is available, especially if custom programs are written for a project.
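The pacing figures above can be sanity-checked with a little arithmetic. This sketch gives only an upper bound, assuming every request waits out the full 3500 msec time out; requests that draw a reply finish sooner. The function name and the "outstanding" parameter are hypothetical.

```python
SECONDS_PER_DAY = 86400

def worst_case_days(requests, timeout_s=3.5, outstanding=1):
    """Upper bound on collection time if every request times out.

    outstanding: number of echo requests allowed in flight at once.
    """
    return requests * timeout_s / outstanding / SECONDS_PER_DAY

if __name__ == "__main__":
    print(round(worst_case_days(96261), 1), "days, one request at a time")
    print(round(worst_case_days(10_000_000, outstanding=100), 1),
          "days, 10 million cases at 100 times the rate")
```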

What sample size does one need? The desired degree of accuracy is expressed by a confidence interval. A rule of thumb is that the lower and upper limits of this interval are the percent observed minus and plus respectively twice the standard error. All we need is the standard error. Easy. The 29.7% value reported above is a probability p expressed as 0.297. The standard error is simply the square root of p (1 - p) / N where N was 96,261 or SQR(0.297 x 0.703 / 96261) which is 0.00147 or about 0.15%. Doubling this and applying the rule of thumb above, we have a decent confidence interval of 29.4% - 30.0%. So the size of this interval depends to a great extent on the sample size. One can reverse the process above and ask what sample size is required to achieve an acceptably small range in the confidence interval.
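The calculation above, and its reverse, can be written out as a short sketch. The multiplier of 2 is the rule of thumb from the text; the function names are hypothetical, and the sample size formula is simply the standard error expression solved for N.

```python
import math

def standard_error(p, n):
    """Standard error of an observed proportion p from n cases."""
    return math.sqrt(p * (1.0 - p) / n)

def confidence_interval(p, n, z=2.0):
    """Rule-of-thumb interval: p minus and plus z standard errors."""
    half_width = z * standard_error(p, n)
    return p - half_width, p + half_width

def required_sample_size(p, half_width, z=2.0):
    """Smallest n for which z standard errors fit within half_width."""
    return math.ceil(z * z * p * (1.0 - p) / half_width ** 2)

if __name__ == "__main__":
    p, n = 0.297, 96261
    low, high = confidence_interval(p, n)
    print("standard error: %.5f" % standard_error(p, n))
    print("interval: %.1f%% - %.1f%%" % (100 * low, 100 * high))
    # Cases needed for plus or minus 1 percentage point at p = 0.297.
    print("n for +/- 1%:", required_sample_size(p, 0.01))
```

Because the required N grows with the inverse square of the interval half-width, halving the interval takes about four times as many cases.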

A brief effort using internet search engines was made to simply locate a functional worldwide routing table for reference and several were found. A BGP table as seen by KPNQwest is downloadable as bgp-table.txt 26 megabytes. This table illustrates the principle -- be careful what you ask for, you might get it -- so please note its length. It is more than three times longer than needed as a reference since each line is repeated three times. For a reference of routable addresses, the author wrote a utility program, which can be downloaded as bgp_trim.zip 29k, to extract the routable prefixes to form a 1.7 megabyte file, which is much shorter zipped, downloadable as bgptable.zip 316k. [Any text file where the desired strings (IP address prefixes) start on the fourth character of a line and end with one or more space characters separating them from other text is suitable for the bgp_trim program.] General inspection of this routing table suggests excellent agreement with the empirical data reported above. Other "snapshots" of routing tables are available, e.g., at www.merit.edu, but these appear to be incomplete or are not intended to be complete, leaving out entire x/8 prefixes that definitely have thousands of on-line hosts.

In any case, routing tables are updated regularly. Hence, the downloadable utility program (bgp_trim) was posted so that future versions of the KPNQwest table, or others with similar text format, might be reduced to a list of routable prefixes in a more manageable file length. The next hop address and other data in the table are specific to a particular router and may not be of general interest. This utility may help others to update lists of functioning address prefixes used for research purposes.
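For readers without the utility, the extraction rule described above (prefix starting on the fourth character of a line and ending at the first space) might be approximated as follows. This is not the bgp_trim program itself, only a sketch of the stated rule; the sample lines are made up, and skipping fields that do not begin with a digit is an added assumption.

```python
def extract_prefixes(lines):
    """Return unique prefixes in first-seen order.

    Rule from the text: the prefix starts on the fourth character of a
    line and ends at the first space. Lines with no prefix in that
    position are skipped (assumption), as are repeats.
    """
    seen = set()
    prefixes = []
    for line in lines:
        field = line[3:].split(" ", 1)[0].strip()
        if not field or not field[0].isdigit():
            continue
        if field not in seen:
            seen.add(field)
            prefixes.append(field)
    return prefixes

if __name__ == "__main__":
    # Hypothetical sample lines in a BGP-table-like text layout.
    sample = [
        "*> 192.0.2.0/24     198.51.100.1        0 64500 i",
        "*  192.0.2.0/24     198.51.100.2        0 64501 i",
        "*                   203.0.113.9         0 64502 i",
        "*> 10.0.0.0         198.51.100.1        0 64500 i",
    ]
    for prefix in extract_prefixes(sample):
        print(prefix)
```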

The IANA has many informative links and may be used to reach regional registries and their lists of IP address allocations. Whois lookups may provide information on entities to which address sub-spaces have been allocated. But none of this information indicates in a definite manner what addresses are actually routable and in use at a particular time. Companies providing internet infrastructure products and services, like Cisco, provide detailed technical information on the subject of internet routing.

References

[1] Doctor Electron, "Computers Connected to IPv4 Address Space", June, 2002.
[2] Doctor Electron, "A TCP Ping Reveals Hosts by Connection Refused Error", August, 2002.
[3] Doctor Electron, "Network Profiling with Randomly Sampled Data", August, 2002.
[4] Doctor Electron, "Internet Host Behavior Statistics by Port", August, 2002.
[IANA] Internet Assigned Numbers Authority, "Internet Protocol v4 Address Space", December, 2001.
[RFC 792] Postel, J., "Internet Control Message Protocol", September, 1981.
[RFC 1267] Lougheed, K., and Y. Rekhter, "A Border Gateway Protocol 3 (BGP-3)", October, 1991.
[RFC 1519] Fuller, V. et al., "Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy", September, 1993.
[RFC 1812] Baker, F., "Requirements for IP Version 4 Routers", June, 1995.

Copyright © 2002 Global Services
Original publication: September 13, 2002
