
Scaling Internet Routers Using Optics
by Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, Nick McKeown
Public comments
#1 posted on Feb 11 2008, 23:21 in collection CMU 15-744: Computer Networks -- Spring 08
Summary:

This paper sets out to design a blazing-fast router, using an optical interconnect and technology FROM THE FUTURE (actually, since this paper was written in 2003, technology of last year). Their proposed design routes packets between 640 linecards (each running at 160 Gb/s) for a total throughput of 100Tb/s. As a point of reference, this would be equivalent to every person in the United States transmitting data at 333kbps, or every person on earth transmitting at 18kbps. Obviously, you aren't going to need many of these routers.

The core of the router design is a passive interconnect that carefully distributes packets between line cards. Basically, incoming packets on each line card are distributed evenly across different sets of intermediate queues, which are, in turn, serviced in a round-robin fashion. This is a "load-balanced switch" -- an existing design prior to this paper -- and can guarantee 100% throughput as long as packet destinations are distributed uniformly.
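The spray-then-serve behavior described above can be sketched as a toy model (my own illustration; the class and method names are invented, and this ignores the optics entirely):

```python
from collections import deque

class LoadBalancedSwitch:
    """Toy model of a two-stage load-balanced switch.

    Stage 1 sprays each arriving packet across the N intermediate
    queues in round-robin order, regardless of destination; stage 2
    services those queues in round-robin order toward each output.
    """

    def __init__(self, n_ports):
        self.n = n_ports
        # voq[i][j]: packets at intermediate linecard i destined for output j.
        self.voq = [[deque() for _ in range(n_ports)] for _ in range(n_ports)]
        self.spray_ptr = 0   # stage-1 round-robin pointer
        self.serve_ptr = 0   # stage-2 round-robin pointer

    def enqueue(self, packet, output):
        # Stage 1: load-balance across intermediate linecards.
        self.voq[self.spray_ptr][output].append(packet)
        self.spray_ptr = (self.spray_ptr + 1) % self.n

    def service(self, output):
        # Stage 2: the output takes from intermediate linecards in turn.
        for step in range(self.n):
            i = (self.serve_ptr + step) % self.n
            if self.voq[i][output]:
                self.serve_ptr = (i + 1) % self.n
                return self.voq[i][output].popleft()
        return None

sw = LoadBalancedSwitch(4)
for k in range(8):
    sw.enqueue(f"pkt{k}", output=0)   # all traffic aimed at one output
served = [sw.service(0) for _ in range(8)]
```

Even with every packet headed to the same output, the stage-1 spraying spreads the queueing load evenly over all four intermediate linecards, which is the whole trick.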

This paper makes this load-balanced switch idea more feasible by lowering the power consumption -- using passive optics for the connecting fabric instead of active electronics -- as well as by addressing other faults:
- Throughput. The paper adds an additional queue to make sure that packets get distributed "uniformly" enough to guarantee good throughput (without this, one could get quite unlucky and always hit just one intermediate queue, getting only 1/N of your link bandwidth).
- Packet Mis-sequencing. The throughput queue also helps keep packet mis-sequencing errors small, so an output buffer of small size can re-order effectively. (They call this scheme 'Full Ordered Frames First'.)
- Linecard Placement. In the original design, one needs exactly N linecards connected in a uniform way. In a real system, one probably wants to be able to fill in racks piecemeal or take broken linecards offline. The authors address this by partitioning the switch into two stages, and adding some reconfigurability to the "local" switches to keep packet distribution uniform.
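The 1/N worst case mentioned in the throughput bullet is easy to see with a tiny simulation (my own toy model, not the paper's analysis): each intermediate queue gets one service opportunity every N time slots, so if all arrivals pile into one queue, only one opportunity in N does any work.

```python
N = 640            # number of intermediate queues (linecards)
slots = 10 * N     # simulated time slots

def throughput(target_queue_for):
    """Fraction of slots that deliver a packet, given an arrival policy."""
    queues = [0] * N
    served = 0
    for t in range(slots):
        queues[target_queue_for(t)] += 1   # one arrival per slot
        q = t % N                          # round-robin service
        if queues[q] > 0:
            queues[q] -= 1
            served += 1
    return served / slots

uniform = throughput(lambda t: t % N)   # spread across all queues: ~100%
skewed  = throughput(lambda t: 0)       # always hit queue 0: ~1/N
```

With uniform spreading every service opportunity is used; with the unlucky skewed pattern throughput collapses to 1/640 of the link rate, which is exactly the failure mode the extra queue guards against.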

The paper concludes by noting that there are some other barriers to the creation of such an Ur-Router as this -- notably, getting the electronics to actually run fast enough.

Thoughts:

- Has anyone built one of these? Is there actually any need to put this much routing throughput in one spot?

- This paper trades latency for throughput. As a quick thought experiment: say the router uses 40-byte packets; then it takes 32ns to transmit a packet to a queue. In an "empty" router, the packet will land in a queue and about 320 steps in the future that queue will be serviced, for an end-to-end delay of about 10 microseconds. In a full router, things are a bit more complicated. The packet will be queued on input in a queue of length 640, queued in a VOQ of length 640 (not sure on this one), and finally queued in a reorder queue of, potentially, length 640^2. This works out to about a 13 millisecond delay. Even though the final delay isn't very significant, is this sort of trade-off the right one to make?
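The arithmetic in the thought experiment above can be reproduced directly (the 32 ns packet time and the queue lengths are the commenter's assumptions, not numbers from the paper):

```python
PKT_TIME = 32e-9   # assumed time to transmit one 40-byte packet
N = 640            # number of linecards

# Empty router: the packet waits roughly 320 service steps
# for its queue's turn to come around.
empty_delay = 320 * PKT_TIME         # about 10 microseconds

# Full router: worst-case input queue (N) + VOQ (N) + reorder queue (N^2),
# each slot taking one packet time to drain.
full_delay = (N + N + N**2) * PKT_TIME   # about 13 milliseconds
```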
#2 posted on Feb 12 2008, 12:10 in collection CMU 15-744: Computer Networks -- Spring 08
To be honest, I did not enjoy reading this paper. It introduces a particular technology for routers, but most of the theoretical topics were already covered in the other reading.

There are some interesting ideas, such as the absence of centralized scheduling and the re-sequencing stage using FOFF.
#3 posted on Feb 12 2008, 13:32 in collection CMU 15-744: Computer Networks -- Spring 08
Even though I didn't fully understand this paper, I feel that the four problems of the load-balanced switch addressed in the paper are quite practical matters. Did anybody really implement a high-speed router with this technique yet? :-)

As for the throughput problem, I couldn't follow the proof from just the outline provided in the paper.
#4 posted on Feb 12 2008, 13:40 in collection CMU 15-744: Computer Networks -- Spring 08
I liked the idea of the load-balanced switch even though it was not the contribution of this paper.
I think their main contribution is the flexible line card placement.
Using mechanisms to partition switches, they made the router more robust to linecard failures. A linecard failure can be isolated once it is detected and the interconnecting switches are reconfigured.
#5 posted on Feb 12 2008, 14:34 in collection CMU 15-744: Computer Networks -- Spring 08
This paper describes a way to design an extremely high-speed router using optical switches. It is largely a description of a design that could work if certain technologies develop in the future. However, I found it surprising that the paper was published without even simulation results showing the effects of their design choices. It seems to me that, purely on the basis of their theoretical results, it will be hard to convince a hardware manufacturer to spend significant resources manufacturing such a router.
#6 posted on Feb 12 2008, 14:38 in collection CMU 15-744: Computer Networks -- Spring 08
This paper proposes a 100Tb/s load-balanced switch architecture that incorporates optics. I like the "vision" of this paper but I disagree that they "achieved" their original goal in the conclusion. Their assumptions seem predicated on "what will happen 3 years from now", which is fine, but I don't think they should overstate the claim. As with all paper designs and thought experiments, actual implementation details ultimately decide whether they properly accounted for all the area/power costs of the switching architecture.
#7 posted on Feb 12 2008, 15:20 in collection CMU 15-744: Computer Networks -- Spring 08
The article presents a design for a 100Tb/s router. This goal, as the authors mention, is used primarily as motivation for the study of Chang's load-balanced switch architecture and of the integration of optical components in routers.

Like other commenters, I too am curious as to whether a router has been built according to their design; I am also curious as to whether some of their contributions have been implemented in other (perhaps less ambitious) router designs.
#8 posted on Feb 12 2008, 15:27 in collection CMU 15-744: Computer Networks -- Spring 08
This was a very nice reading for understanding the basic requirements and challenges that high-capacity routers face as Internet traffic grows explosively. I could easily follow the traffic control issues such as load-balancing, fault-tolerance, and throughput. However, the part about the relationship between power consumption and optical fabrics was new and interesting to me.
#9 posted on Feb 12 2008, 16:25 in collection CMU 15-744: Computer Networks -- Spring 08
Sort of a funny paper -- using technology we don't have, can we build a router we don't need?

It's a fun exercise though, and highlights the hardware aspect, which I had never really thought about. It's interesting how the same problem (routing) has a very different character when looked at from the macro scale (when incentives play a huge role), and the micro scale (when we are concerned about heat).
#10 posted on Feb 12 2008, 16:36 in collection CMU 15-744: Computer Networks -- Spring 08
This paper faces an interesting type of technological issue: if it solved the current problem assuming the existing technology, then by the time the new ideas were accepted and developed technology would have changed so much that they might not work anymore. So they are forced to develop a hypothetical model of the future in order to justify their approach (which seems to work well on the hypothetical technology).

One problem that didn't even occur to me when I read the other paper was packet mis-sequencing. It is possible for two consecutive packets arriving from the same application to get placed at different intermediate linecards and for their departure order to get reversed as a result. The authors note that mis-sequencing is allowed and common on the Internet. I'm curious how common it actually is, and whether there have ever been any major issues that resulted from mis-sequencing on the Internet.
#11 posted on Feb 12 2008, 16:44 in collection CMU 15-744: Computer Networks -- Spring 08
I enjoyed reading this paper and especially liked the idea behind the load-balanced router (although it was not first proposed in this paper). Most previous router architectures spent a lot of effort on, and were limited by, very complex scheduling algorithms to accommodate "inconvenient" traffic. Instead, the load-balanced router adds an additional load-balancing switch that essentially redistributes the incoming traffic before it enters the VOQs in the line cards. This makes the scheduling of the second switch much easier, since the traffic it operates on is guaranteed to be uniformly distributed.

I also think that the authors came up with nice ways to get the best out of both the electrical and the optical "switching worlds". For fast switching they avoid directly switching optical links and either use conventional electrical switches or use Wavelength Division Multiplexing (WDM) in the case of optical links. Slow MEMS switches are only used for reconfiguring the router in the rare case when line cards have to be added or removed.

Like other posters, I was also wondering whether a router with this architecture has been built to date. The authors state that a power-efficient version of the proposed router would require successful integration of opto-electronic devices in silicon. I wonder if this technology is mature even today.
#12 posted on Feb 12 2008, 16:45 in collection CMU 15-744: Computer Networks -- Spring 08
For me, this paper was hard to follow, especially since, of the two papers on router design, it was the first one I read.

I think the authors just picked a 100 Tb/s router as an example to show their design concepts for a very high-speed router, although there might be no need for that kind of router for now.

One good point of this paper is that the authors tried to be as practical as possible. However, what I wonder is: in order to deal with the linecard placement problem, instead of using a seemingly very complicated (at least to me) switch fabric design, is it possible to have a few redundant linecards at each router?
#13 posted on Feb 12 2008, 16:48 in collection CMU 15-744: Computer Networks -- Spring 08
According to the paper, the average link utilization is below 10%. However, peer-to-peer traffic, especially file sharing such as BitTorrent, has recently come to dominate Internet traffic, at perhaps 40-60% of overall traffic. If this type of traffic continues to increase, a total throughput of 100Tb/s might not be enough.

Moreover, in a peer-to-peer network, each peer often sends several thousand small packets to hundreds of peers per second, so the forwarding engine might become the bottleneck instead of the bandwidth.

Also, I heard that ISPs currently take a totally different approach. Instead of handling this type of traffic more efficiently, they just try to throttle it. For example, Comcast uses an application from Sandvine to prohibit all seeding in BitTorrent traffic.

My question is: what should we do differently to handle this new type of traffic?
#14 posted on Feb 12 2008, 16:55 in collection CMU 15-744: Computer Networks -- Spring 08
One of the motivations for the proposed design is that "centralized schedulers don't scale with an increase in the number of ports". However, unless I'm misunderstanding something, it seems that the number of ports would correspond to the number of physical long-distance connections to other routers, and that this number is not necessarily going to increase dramatically in the future. More routers might be added, but the number of connections per router won't necessarily increase.
#15 posted on Feb 12 2008, 16:59 in collection CMU 15-744: Computer Networks -- Spring 08
The authors state in Sec. 7 that a 100Tb/s router could be built in 3 years. I wonder when it could really become practical from a business perspective -- say, before 2020?

The flexible line card placement, pointed out by Dongsu as the main contribution of the paper, is achieved by partitioning a large number of line cards into multiple groups. Since spreading traffic uniformly is essential for full throughput, the authors provide a theorem for choosing the number of static paths. Along with the theorem giving a polynomial-time middle-switch configuration algorithm, this makes for a convincing solution to flexible line card placement.
#16 posted on Feb 12 2008, 16:59 in collection CMU 15-744: Computer Networks -- Spring 08
This one was an interesting thought paper. I liked the idea of the load-balanced switch architecture (though as some others pointed out, this is not the contribution of the paper) partly because that was one of the few things I could understand and appreciate :). I must admit I hadn't thought about the issues they raised about the above mentioned architecture.

One thing I am wondering is whether we really need 100 Tbps routers, given the 10% link utilization in core routers. Like others, I was not sure if anyone has made an attempt to build this. Nevertheless, the paper had a cool design using optics for it.
#17 posted on Feb 12 2008, 17:40 in collection CMU 15-744: Computer Networks -- Spring 08
It is quite an interesting idea to have 100Tb/s routers. The researchers claim it is only possible through optical links, with the load-balanced architecture for switching. Is there any scenario for intentionally unbalancing?

Furthermore, the signals on the optical links and fibers must be translated back into electrical signals. How much delay does that add? If we could have a computer that operated on optics, that would be very useful, and we would not need the optoelectronic translation.
#18 posted on Feb 12 2008, 17:43 in collection CMU 15-744: Computer Networks -- Spring 08
Again, the authors mentioned that the techniques would be usable within the next three years, and it has already been 5 years since the publication.
Is any big company using this now?
#19 posted on Feb 12 2008, 22:29 in collection CMU 15-744: Computer Networks -- Spring 08
I can't say that I understand this paper. The part I can understand is that a load-balancing component converts arbitrarily distributed traffic into uniform traffic, so that the switch can handle this uniform traffic stably. I think this idea is very cool. I got lost in the following parts. However, I think it is worthwhile to consider the technology ahead of time.
#20 posted on Feb 13 2008, 00:17 in collection CMU 15-744: Computer Networks -- Spring 08
I didn't understand the claim about 100% throughput. Isn't that simply a function of the amount of input vs. the amount of output? I think what they meant to say is that where other queuing mechanisms fail due to the non-uniformity of packet destinations, this scheme can mitigate that...

Also, is it correct to say that their solutions to the 4 practical problems of the basic load-balanced switch and their use of optics are two orthogonal aspects of the architecture? Or is it that their solutions require overhead, for which the optics are necessary to keep power consumption (and maybe latency) in check? I think the authors do keep mentioning something along these lines, but it's not that convincing to me.
#21 posted on Feb 13 2008, 00:34 in collection CMU 15-744: Computer Networks -- Spring 08
Nevermind the throughput question. It makes much more sense after reading the other paper...