Rfc | 1306 |
Title | Experiences Supporting By-Request Circuit-Switched T3 Networks |
Author | A. Nicholson, J. Young |
Date | March 1992 |
Format: | TXT, HTML |
Status: | INFORMATIONAL |
Network Working Group A. Nicholson
Request for Comments: 1306 J. Young
Cray Research, Inc.
March 1992
Experiences Supporting By-Request Circuit-Switched T3 Networks
Status of this Memo
This RFC provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.
Abstract
This memo describes the experiences of a project team at Cray
Research, Inc., in implementing support for circuit-switched T3
services. While the issues discussed may not be directly relevant to
the research problems of the Internet, they may be interesting to a
number of researchers and implementers.
Developers at Cray Research, Inc. were presented with an opportunity
to use a circuit-switched T3 network for wide area networking. They
devised an architectural model for using this new resource. This
involves activating the circuit-switched connection when an
application program engages in a bulk data transfer, and releasing
the connection when the transfer is complete.
Three software implementations for this feature have been tested, and
the results documented here. A variety of issues are involved, and
further research is necessary. Network users are beginning to
recognize the value of this service, and are planning to make use of
by-request circuit-switched networks. A standard method of access
will be needed to ensure interoperability among vendors of circuit-
switched network support products.
Acknowledgements
The authors thank the T3 project team and other members of the
Networking Group at Cray Research, Inc., for their efforts: Wayne
Roiger, Gary Klesk, Joe Golio, John Renwick, Dave Borman and Craig
Alesso.
Overview
Users of wide-area networks often must make a compromise between low
cost and high speed when accessing long haul connections. The high
monetary cost of dedicated high speed connections makes them
uneconomical for scientists and engineers with limited budgets. For
many traditional applications this has not been a problem. Datasets
could be maintained on the remote computer and results presented in
a text-only form, for which a low-speed connection would suffice.
However, for visualization and other data transfer intensive
applications, this limitation can severely impact the usability of
high performance computing tools which are available only through
long-haul network connections.
Supercomputers are one such high performance tool. Many users who
can benefit from access to supercomputers are limited by slow network
connections to a centrally located supercomputer. A solution to this
problem is to use a circuit-switched network to provide high speed
network connectivity at a reduced cost by allocating the network only
when it is needed.
Consider how a researcher using a visualization application might
efficiently use a dedicated low speed link and a circuit switched
high speed link. The researcher logs in to the remote supercomputer
over the low speed link. After running whatever programs are
necessary to prepare the visualization, the high speed connection is
activated and used to transfer the graphics data to the researcher's
workstation.
We built and demonstrated this capability in September 1990 at the
Telecommunications Association (TCA) show in San Diego, using this
type of visualization application. It will also be available in a
forthcoming release of our system software.
Architectural Model
We developed our support for circuit switched services around a
simple model of a switched network. At some point in the path
between two hosts, there is a switched network connection. This
connection is likely to connect two enterprise networks operated by
the same organization. Administrative overlap between the two
networks is useful for accounting and configuration purposes. We
believe that with further investigation circuit switched network
support could be extended to multiple switched links in an internet
environment.
The switch which makes the network connection operates on a
"by-request" basis (also called "on-demand"). When it receives a
request to make a network connection it will do so (if possible), and
it will break the connection when requested. The switch will not activate
automatically if there is an attempt to transfer data over an
incomplete connection.
We also made the assumption that the circuit would be switched on a
connection basis rather than a packet basis. When an application
begins sending data utilizing the switched connection, it will send
all the data it has, without stopping, until it is finished. At this
time it will release the connection. It is assumed that the quantity
of data will be large enough that the circuit setup time is
negligible relative to the duration of the transfer. For small data
transfers the setup delay would dominate, and the circuit switched
network would not be worth using.
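The trade-off between setup time and transfer size can be made concrete
with a little arithmetic. The sketch below is only illustrative: it
uses the nominal T1 and T3 line rates, ignores protocol overhead, and
assumes a 30-second setup delay, a figure chosen for the example and
not stated in this memo. The function names are hypothetical.

```python
# Nominal line rates in bits per second (standard T1 and T3 rates).
T1_BPS = 1_544_000
T3_BPS = 44_736_000


def transfer_seconds(size_bytes, rate_bps, setup_seconds=0.0):
    """Time to move size_bytes at rate_bps, plus any circuit setup delay."""
    return setup_seconds + (size_bytes * 8) / rate_bps


def break_even_bytes(setup_seconds, slow_bps=T1_BPS, fast_bps=T3_BPS):
    """Transfer size at which the switched link starts to win.

    Solves size*8/slow == setup + size*8/fast for size.
    """
    return setup_seconds * slow_bps * fast_bps / (8 * (fast_bps - slow_bps))


# Assumed 30-second setup delay; the real figure depends on the switch.
size = break_even_bytes(30.0)
print(f"break-even size: {size / 1e6:.1f} MB")
```

Transfers well above the break-even size are the ones for which the
setup delay becomes negligible.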
This model requires that just before the application begins a large
bulk transfer of data, a request message is sent to the switch asking
that the switched network connection be activated. Once the link is
up, the application begins sending data, and the network routes all
the data from the application through the switched network. As soon
as all the data has been sent, a message is sent to the switch to
turn off the switched link, and the network returns to routing data
through the slower link.
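The request, transfer, release sequence described above might be
sketched as follows. This is a minimal illustration, not the code we
ran: FakeSwitchController, switched_link and bulk_transfer are
hypothetical names, and the controller here is an in-memory stand-in
for whatever entity actually drives the switch.

```python
from contextlib import contextmanager


class FakeSwitchController:
    """In-memory stand-in for the entity controlling the switch."""

    def __init__(self, will_accept=True):
        self.will_accept = will_accept
        self.link_up = False
        self.log = []

    def request(self):
        self.log.append("request")
        self.link_up = self.will_accept
        return self.link_up

    def release(self):
        self.log.append("release")
        self.link_up = False


@contextmanager
def switched_link(controller):
    """Hold the switched link for the duration of one bulk transfer."""
    if not controller.request():
        raise ConnectionError("switch refused the request")
    try:
        yield controller
    finally:
        # Release even if the transfer fails, so the link is not left up.
        controller.release()


def bulk_transfer(controller, chunks, send):
    """Send every chunk without pausing, then release the link."""
    with switched_link(controller):
        for chunk in chunks:
            send(chunk)
```

Releasing in a finally clause reflects the cost concern: a link left
up after a failed transfer would continue to be billed.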
The prototype system we built for the TCA show was designed around
this model of circuit switched services. We connected an FDDI
backbone at Cray Research in Eagan, Minnesota to the TCA show's FDDI
network through two NSC 703 FDDI/T1/T3 routers. MCI provided a
dedicated T1 line and a switched T3 line, using a DSC DS3 T3 switch
located in Dallas, Texas. These networks provided connectivity
between a Cray Research computer in Eagan and a Sun workstation on
the show floor in San Diego.
Alternative Solution Strategies
The first aspect of using the switched services involved the circuit
switch. The DS3 switch available to us was accessed via a dial up
modem, and it communicated using a subset of the CCITT Q.295
protocol. Activating the switch required a 4 message exchange and
deactivation required a 3 message exchange. We felt the protocol was
awkward and might be different for other switch hardware.
Furthermore, we believed that the dial up aspect of communicating
with the switch suffered from the same drawbacks. A good solution
would require a cleaner method of controlling the switch from the
source host requesting the switched line.
The next aspect of using switched services involves the source host
software which requests and releases the switched network. Ideally,
the switched network is activated just before data transfer takes
place and it is released as soon as all data has been sent. We
considered using special utility programs which a user could execute
to control the link, special system libraries which application
programs could call, or building the capability into the kernel. We
also considered the possibility that these methods could send
messages to a daemon running on the source host which would then
communicate with another entity actually controlling the switch.
The last aspect of using switched services we considered is selection
of the switch controlled network. This involves both policy issues
and routing issues. Policy issues include which users running which
applications will be able to use higher cost switched links. Routing
issues concern how packets are routed among multiple connections
offering varying levels of service after they leave their source.
Implementations
We have developed a model for switch control through the internetwork
which we believe to be reasonable. For the source host, however, we
have experimented with three different implementations. The switch
control model and the source host implementations are detailed here.
Switch control
Our simplest design decisions involved the switch itself. We decided
that the complex protocol and dial up line must be hidden from the
source host requesting the switched link. We decided that the source
host would use a simple request/release protocol with messages sent
through the regular network (as opposed to dial up lines or other
connections). Some host accessible through the local network would
run a program translating the simple request and release messages
into the more complicated switch protocol and also have the modem to
handle the dial up connection.
This has a variety of advantages. First, it isolates differences in
switch hardware. Second, multiple hosts may access the switch
without requiring multiple modems for the dial up line. Third, it
provides a central point of control for switch access. We did not
consider any alternatives to this model of switch control.
Our initial implementation used a simple translator daemon running on
a Sun workstation. Listening on a raw IP port, this program would
wait for switch control messages. Upon receipt of such a message, it
would dial up the switch and attempt to handle the request. It would
then send back a success or failure response. This host, in
conjunction with the translator daemon software, is referred to as
the switch controller. The switch controller we used was local to
our enterprise network; however, it could reside anywhere in the
Internet.
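The translation performed by the switch controller could look roughly
like the sketch below. The step names are placeholders: this memo
gives only the number of messages in each switch exchange (four to
activate, three to deactivate), not their contents, and the
REQUEST/RELEASE/SUCCESS/FAILURE strings are likewise assumptions made
for illustration. The switch_exchange callable stands in for the
dial up conversation with the switch.

```python
# Placeholder step names; the actual switch messages are not given here.
ACTIVATE_STEPS = ["activate-1", "activate-2", "activate-3", "activate-4"]
DEACTIVATE_STEPS = ["deactivate-1", "deactivate-2", "deactivate-3"]


class Translator:
    """Maps simple request/release messages onto the switch exchange."""

    def __init__(self, switch_exchange):
        # switch_exchange(step) -> bool carries out one step of the
        # conversation with the switch and reports whether it succeeded.
        self.switch_exchange = switch_exchange

    def handle(self, message):
        if message == "REQUEST":
            steps = ACTIVATE_STEPS
        elif message == "RELEASE":
            steps = DEACTIVATE_STEPS
        else:
            return "FAILURE"
        for step in steps:
            if not self.switch_exchange(step):
                return "FAILURE"
        return "SUCCESS"
```

Because the source host sees only the two simple messages and the
success or failure reply, a different switch would require changes
only to the step lists and the exchange routine.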
Later we designed a simple protocol for switch control, which was
implemented in the translator daemon. This protocol is documented in
RFC 1307, "Dynamically Switched Link Control Protocol".
Source Control of the Switched Link
This problem involves a decision regarding what entity on the source
host will issue the switch request and release messages to the switch
controller, and when those messages will be issued. Because we do
not have very much field experience with this service, we do not feel
that it is appropriate to recommend one method over the others. They
all have advantages and disadvantages.
We made three different implementations of the request software and
can report our experiences with each: one set of special utility
programs which communicate with the switch controller, and two
kernel implementations. We did not experiment with
special libraries, nor did we implement a daemon for switch control
messages on the source host.
Switch control user programs
This implementation of source host control of the switch is the
simplest. Two programs were written which would communicate requests
to the switch controller; one for activating the connection, and
another to deactivate the connection. The applications using this
feature were then put into shell scripts with the switch control
programs for simple execution.
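We used shell scripts for this wrapping; the same idea can be sketched
in a few lines as below. The function name and the command names in
the usage note are hypothetical, and the run parameter exists only so
that the sketch can be exercised without real programs.

```python
import subprocess


def run_with_switch(activate_cmd, app_cmd, deactivate_cmd, run=subprocess.call):
    """Run app_cmd between the activate and deactivate programs.

    Returns the application's exit status, or None if activation failed.
    """
    if run(activate_cmd) != 0:
        return None
    try:
        return run(app_cmd)
    finally:
        # Always deactivate, even if the application fails to start.
        run(deactivate_cmd)
```

A call such as run_with_switch(["t3up"], ["ftp", "remotehost"],
["t3down"]), with made-up program names, keeps the link up for the
whole life of the application, which is the source of the dead time
discussed next.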
This approach has the significant advantage of not requiring any
kernel modifications to any machine. Furthermore, application
programs do not need to be modified to access this feature. In
addition, access to the circuit-switched links can be controlled
using the access permissions for the switch-control programs.
However, there are disadvantages as well. First, there is
significant potential for the switch to be active (and billing the
user) for the dead time while the application program is doing tasks
other than transferring bulk data. The granularity of turning the
switch on and off is limited to a per-application basis.
Another disadvantage is that most applications use only the
destination host's address for transfer, and this is the only
information available to the transport and network layers for routing
data packets. Some other method must be used to distinguish between
traffic which should use the circuit-switched connection and lower-
priority traffic. This problem can be addressed using route aliases,
described below.
Kernel switch control
We have made two different implementations of switch control
facilities within the operating system kernel. Both rely upon the
routing lookup code in the kernel to send switch connect and tear
down messages. The difference is in how the time delay between
request of the switch and a response is handled.
For starters, routing table entries were expanded to include the
internet address of the switch controller and state information for
the switched connection. If there is a switch controller address
specified, then the connection must be set up before packets may be
sent on this route. We also added a separate module to handle the
sending and receiving of the switch control messages.
When a routing lookup is satisfied, the routing code would check
whether the routing table entry specified a switch controller. If
so, then the routine requesting switch setup would be called. This
would send a message on the Internet to the switch controller to
set up the connection.
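The expanded routing table entry and the check made at lookup time
might be pictured as follows. This is a simplified sketch and not the
UNICOS data structures: the field names, the three link states and
the setup_reply method are assumptions, and the addresses are
documentation examples.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LinkState(Enum):
    DOWN = "down"
    REQUESTED = "requested"
    UP = "up"


@dataclass
class RouteEntry:
    destination: str
    gateway: str
    switch_controller: Optional[str] = None  # None means an ordinary route
    state: LinkState = LinkState.DOWN


class RoutingTable:
    def __init__(self, send_setup):
        self.entries = {}
        self.send_setup = send_setup  # callable(controller_address)

    def add(self, entry):
        self.entries[entry.destination] = entry

    def lookup(self, destination):
        entry = self.entries.get(destination)
        if entry is None:
            return None
        # A controlled route that is down triggers one setup request.
        if entry.switch_controller and entry.state is LinkState.DOWN:
            self.send_setup(entry.switch_controller)
            entry.state = LinkState.REQUESTED
        return entry

    def setup_reply(self, destination, success):
        """Record the switch controller's answer for a controlled route."""
        entry = self.entries[destination]
        entry.state = LinkState.UP if success else LinkState.DOWN
```

How the caller behaves between the request and the reply is exactly
where the two kernel implementations differ.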
In our first implementation, the routing lookup call would return
immediately after sending the switch connection request message. It
would be the responsibility of the transport protocol to deal with
the time delay while the connection is set up, and to tear down if the
switched connection could not be made. This has significant
ramifications. In the case of UDP and IP, packets must be buffered
for later transmission or face almost certain extermination as they
will probably start arriving at the switched connection before it is
ready to carry traffic. Because of this problem, we decided that
this feature would not be available for UDP or IP traffic.
We did make this work for TCP. Since TCP is already designed to work
so that it buffers all data for possible later retransmission, this
was not a problem. Our first cut was to change TCP to check that the
route it was using was up if it is a switch controlled route. TCP
would not send any data until the route was complete, and it would
close the connection if the switch did not come up.
This did not work well at first because every time TCP tried to send
data before the switch came up, the retransmit timer would be reset
and backed off. The RTT estimate, retransmit timeouts and the
congestion control mechanism were seriously skewed before any data
was ever sent. The retransmit timer would expire as many as 3 times
before data could be transmitted. We solved this problem by adding
another timer for handling the delay while the route came up, and not
allowing the delay to affect any of the normal RTT timers.
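The idea of a separate setup timer can be illustrated with the toy
model below. It is not TCP: the class, the timeout values and the
simple doubling of the retransmit timeout are assumptions chosen to
show only that waiting for the route leaves the retransmit state
untouched.

```python
class Sender:
    """Toy sender with separate route-setup and retransmit timers."""

    def __init__(self, initial_rto=1.0, setup_timeout=30.0):
        self.rto = initial_rto            # retransmit timeout, seconds
        self.setup_timeout = setup_timeout
        self.setup_waited = 0.0
        self.route_up = False
        self.closed = False

    def on_setup_tick(self, elapsed):
        """Separate timer: runs only while waiting for the route."""
        if self.route_up or self.closed:
            return
        self.setup_waited += elapsed
        if self.setup_waited >= self.setup_timeout:
            self.closed = True  # the switch never came up; give up

    def on_retransmit_timeout(self):
        """Normal retransmit timer with exponential backoff."""
        if not self.route_up:
            return  # nothing has been sent yet, so nothing to back off
        self.rto *= 2
```

With this separation the first data segment goes out with the initial
timeout intact, however long the switch took to come up.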
Our experiences with this approach were not particularly positive,
and we decided to try another. We also felt that unreliable datagram
protocols should be able to use the service without excessive
reworking. Our alternative still sends the switch control message
when a routing lookup finds a controlled route. However, we now
suspend execution of the thread of control until a response comes
back from the switch controller.
This proved to be easier to implement in many ways. However, there
were two major areas requiring changes outside the routing code.
First, we decided that if the switch refused to activate the
connection, it was pointless to try again. So we changed the routing
lookup interface so that it could return an error specifying a
permanent error condition. The transport layer could then return an
appropriate error such as a host unreachable condition.
The other, more complex issue deals with the suspension of the thread
of execution. Our operating system, UNICOS, is an AT&T System V
derivative, and our networking subsystem is based on the BSD Tahoe
and Reno releases. The only way to suspend execution is to sleep.
This is fine, as long as there is a user context to put to sleep.
However, it is not a good idea to go to sleep when processing network
interrupts, as when forwarding a packet.
We solved this problem by using a global flag indicating whether it
is permissible for the switch control message code to sleep. If it is
necessary to send a message and sleep, then the flag must be set;
otherwise an error is returned. User system calls
which might cause a switch control message to be sent set and clear
the flag upon entrance and exit. We also made it impossible to
forward packets on a switch controlled route. We feel that this is
reasonable since the overhead of switch control should be incurred
only when an application program has made an explicit request to
begin transfer of data.
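The interaction between the sleep flag, the permanent error and the
forwarding path can be sketched as below. This is a user-level
illustration only, with hypothetical names throughout; in a kernel
the wait_for_reply call would be a sleep until the switch
controller's response arrives.

```python
class WouldSleep(Exception):
    """Raised when a switch request is needed but sleeping is not allowed."""


class SwitchRefused(Exception):
    """Permanent error: the switch declined to activate the connection."""


may_sleep = False  # global flag, set only while inside a user system call


def request_switch(wait_for_reply):
    if not may_sleep:
        raise WouldSleep("no user context to put to sleep")
    if not wait_for_reply():  # blocks until the controller answers
        raise SwitchRefused("switch refused the connection")


def syscall_send(wait_for_reply):
    """User system call path: set the flag on entry, clear it on exit."""
    global may_sleep
    may_sleep = True
    try:
        request_switch(wait_for_reply)
        return "sent"
    except SwitchRefused:
        return "host unreachable"
    finally:
        may_sleep = False


def forward_packet(wait_for_reply):
    """Interrupt path: never sleeps, so controlled routes are unusable."""
    try:
        request_switch(wait_for_reply)
    except WouldSleep:
        return "dropped"
    return "forwarded"
```

The forwarding path never sets the flag, which is how forwarding on a
switch controlled route ends up disabled in this sketch.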
The one other change we made was to make sure that TCP freed the
route it was using upon entering TIME_WAIT state. There is no point
in holding the circuit open for two minutes in case we need to
retransmit the final ACK. Of course, this assumes that an alternate
path exists for the peer to retransmit its FIN.
The advantage of building this facility into the kernel is that it
allows a fine degree of control over when the switch will and will
not need to be activated. Many applications which open a data
connection, transmit their bulk data, and then close the connection
will not require modifications and will make efficient use of the
resource. It also opens the possibility that applications written to
use type-of-service can use the same network connection for low-
bandwidth interactive traffic, change the type-of-service (thus
activating the switched connection) for bulk transfers, and then
release the switch upon returning to interactive traffic.
Putting this feature into the kernel also allows strong control over
when and how the switched link can be used. It makes it possible to
keep accounting information and to limit multiple use access to the
switched link.
The disadvantage is that significant kernel modifications are
required, and some implementation details can be very difficult to
handle.
Switch control libraries
The switch control programs we used were built on a library of simple
switch control routines; however, we did not alter any standard
applications to use this library. We did consider some advantages
and disadvantages. On the plus side, it is possible to achieve a
satisfactory degree of switch control without requiring any kernel
modifications.
The primary disadvantage of this approach is that all applications
must be altered and recompiled. This is particularly inconvenient
when source is not available.
Link Selection
When an application wishes to send data over a circuit-switched
connection, it will be necessary to select the switched link over
other links. This selection process may need to take place many
times, depending on the local network between the source host and the
bridge to the circuit switched connection.
For example, if the kernel routing code is controlling the link, then
there must be a way to choose a controlled route over another route.
Further downstream, there must be a way to route packets to the
switched link rather than other links.
This issue has the potential for great complexity, and we avoided as
much of the complexity as possible. Policy routing and local routing
across multiple connections are fertile areas for work and it is
outside the scope of this work to address those issues. Instead we
opted for simple answers to difficult questions.
First of all, we added no special policies to link accessibility
beyond that already found in UNICOS. We handled local routing
issues to the NSC FDDI/T1/T3 routers with routing table manipulation
and IP Type-of-Service.
We came up with three solutions for selecting a routing table entry.
The first possibility is to use the type-of-service bits, which
seemed natural to us. We changed the routing table to include type-
of-service values associated with routing entries, and the routing
lookups would select using the type-of-service. UNICOS already
supports a facility to mark connections with a type-of-service value.
A controlled route could be marked with high throughput type-of-
service and an application wishing to transfer bulk data could set
the socket for high throughput before making the connection. It
could also be possible to change the type-of-service on an existing
connection and start using the switched link if one is available.
Using the type-of-service bits has the advantage that downstream
routers can also use this information. In our demonstration system,
the NSC FDDI/T1/T3 routers were configured to transfer packets with
high throughput type-of-service over the T3 connection and all others
over the T1 connection.
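Selection of a routing entry by type-of-service can be sketched as
follows. The table layout and function name are assumptions made for
illustration; 0x08 is the high throughput bit of the IP
type-of-service octet, and the address is a documentation example.

```python
IPTOS_THROUGHPUT = 0x08  # high throughput bit in the IP ToS octet

# Hypothetical routing entries for one destination.
ROUTES = [
    {"dest": "192.0.2.10", "tos": IPTOS_THROUGHPUT, "link": "T3", "controlled": True},
    {"dest": "192.0.2.10", "tos": 0, "link": "T1", "controlled": False},
]


def select_route(routes, dest, tos=0):
    """Prefer an entry whose ToS matches; else fall back to the ToS 0 entry."""
    fallback = None
    for route in routes:
        if route["dest"] != dest:
            continue
        if route["tos"] == tos:
            return route
        if route["tos"] == 0:
            fallback = route
    return fallback
```

On BSD-style socket interfaces an application would typically ask for
this treatment by setting the IP_TOS socket option before connecting;
traffic with any other type-of-service falls through to the T1 entry.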
Another possibility is to take advantage of the multiple addresses of
a multi-homed host. Routing tables could be set up so that packets
for one of the addresses get special treatment by traveling over the
switched link. The routing table in the source host would have an
entry for accessing the switch controller when sending to the high
throughput destination address.
We also derived a method we call route aliasing. Route aliasing
involves associating extra addresses to a single host. However,
rather than the destination being an actual multi-homed host, the
alias is known only to the source host and is used as an alternative
lookup key. When an application tries to connect to the alias
address the routing lookup returns an aliased route. The route alias
contains the actual address of the host, but because the lookup used
the special address, the switch is activated. The alias could also
specify a type-of-service value to send in the packets so that
downstream routers could properly route the packets to the switched
link. We realize that some may bemoan the waste of the limited
Internet address space for aliases; however, only the source host is
aware of the alias, and the primary shortage is with Internet network
addresses rather than host addresses. In fact, we argue that this is
a more efficient use of the already sparse allocation of host
addresses available with each network address.
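A route alias lookup might be sketched as below. The table contents,
field names and resolve function are hypothetical, and the addresses
are documentation examples; the point is only that the alias exists
on the source host alone and that packets leave carrying the real
destination address.

```python
# Alias table known only to the source host (hypothetical contents).
ALIASES = {
    "192.0.2.200": {
        "real": "192.0.2.10",          # actual address of the remote host
        "tos": 0x08,                   # high throughput type-of-service
        "switch_controller": "192.0.2.1",
    },
}


def resolve(dest, aliases, activate):
    """Return (address to send to, ToS), activating the switch for aliases."""
    alias = aliases.get(dest)
    if alias is None:
        return dest, 0
    activate(alias["switch_controller"])
    return alias["real"], alias["tos"]
```

An unmodified application selects the switched link simply by being
given the alias address instead of the real one.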
Future considerations
We believe that by-request services will become increasingly
important to certain classes of users. Many data centers make high
performance resources available over a wide area, and these will be
the first users to take advantage of wide-area circuit-switched
networks. Some users, such as CICNet ([2]), are already interested
in deploying this capability and telecom vendors are working to
satisfy this need. However, there are a lot of issues involved in
providing this functionality. We are working to involve others in
this process.
References
[1] Nicholson, et al., "High Speed Networking at Cray Research",
Computer Communications Review, January 1991.
[2] CICNet DS3 Working Group, "High Performance Applications on
CICNet: Impact on Design and Capacity", public report, CICNet,
Inc., June 1991.
[3] Young, J., and A. Nicholson, "Dynamically Switched Link Control
Protocol", RFC 1307, Cray Research, Inc., March 1992.
Security Considerations
Security issues are not discussed in this memo.
Authors' Addresses
Andy Nicholson
Cray Research, Inc.
655F Lone Oak Drive
Eagan, MN 55123
Phone: (612) 452-6650
EMail: droid@cray.com
Jeff Young
Cray Research, Inc.
655F Lone Oak Drive
Eagan, MN 55123
Phone: (612) 452-6650
EMail: jsy@cray.com