Tuesday, May 29

Spanning The Tree : Dr Radia Perlman & Untangling Networks

As computer networks get bigger, it becomes increasingly hard to keep track of the flow of data over this network. How do you route data, making sure that the data is spread to all parts of the network? You use an algorithm called the spanning tree protocol — just one of the contributions to computer science of a remarkable engineer, Dr. Radia Perlman. But before she created this fundamental Internet protocol, she also worked on LOGO, the first programming language for children, creating a dialect for toddlers.

Born in 1952, Perlman was a prodigy who excelled in math and science, and in her own words, “Every time there was a new subject or a quiz I would be very excited at the opportunity to solve all sorts of puzzles”. She graduated from MIT in 1973 and got her Masters degree in 1976.

While she was working on her Master’s degree, she worked with Seymour Papert at the MIT Artificial Intelligence Lab, which was working on LOGO, the first programming language for children. In the simplest version of this language, kids could learn the fundamentals of programming by writing programs that controlled the motion of an on-screen or motorized turtle. Perlman created Turtles Own Recursive Turtle Interpreter System (TORTIS), a simplified version of LOGO that could be used by pre-school children. This was controlled by buttons that allowed the toddler to experiment with a Logo turtle with a less intimidating interface than existing systems. “Most important of all,” the abstract for her paper describing the project says, “it should teach that learning is fun.”

Spanning Tree

After getting her masters and leaving MIT, Perlman joined BBN, a defense contractor, then moved to Digital Equipment Corporation (DEC) in 1980. At DEC, she was tasked with looking into ways to deal with the increasing complexity of the local area networks (LANs) that the company was creating. Specifically, how do you stop data from getting trapped in a loop?

Imagine a small network of two computers connected to a router. When one device sends out a data packet, the switch (a router that includes the ability to switch data between multiple ports) passes it onto the other device. But LANs are seldom that simple, and with more computers being connected to the nascent Internet, it was getting harder to figure out how to effectively and efficiently route data between multiple switches. To create networks that could survive the loss of a connection, a switch would be connected to multiple other switches. That way, if one connection failed, you had a backup. You also have multiple possible ways to route data, which creates the possibility of loops. A loop is where one switch passes a packet of data to another, which then sends it back to the first. The first switch then resends it to the second, and so on. That can create a broadcast storm, where the links between switches become filled with these echoing packets.

So how do you avoid this? Perlman’s solution was deceptively simple. She created the Spanning Tree Protocol (STP), which allows each switch in the network to figure out its place in the scheme of things. This innovative standard was codified in the IEEE standard 802.1D, last revised in 2004.

Each switch is assigned a number by the network designer, called the Bridge ID (BID) which indicates its prominence in the network. The switch with the lowest BID is designated as the root bridge. If you think of it as the root system of a tree, the root bridge is the base of the trunk, the point where the root system begins. Below this, each switch is a branch of the root system, branching off and reaching the devices at the root tips.

It’s more complex than a tree root, though: there are multiple links between root branches, and sometimes these links get cut. To negotiate this, each of the switches periodically sends out special data packets called Bridge Protocol Data Units (BPDUs) that contain the BID and another number called the path cost for each connection it has available. This number is calculated from the bandwidth that the connection offers: the lower the path cost, the higher the bandwidth the connection offers. As each switch receives the BPDU from its neighbors, it uses the path cost to decide which connection to use: the lower the cost, the more favorable the connection is. It then sends out its own BPDUs to neighbors, which use this information to calculate which connections to use. This then propagates through the network until every switch knows its place, and which routes are most efficient to use.

If a connection fails, the switch sends out a specially flagged BPDU that contained the details of the remaining connections, which the other switches use to recalculate. In effect, the network would route around the damage and just keep working.

It is such a simple solution that it now appears obvious, but Dr Perlman struggled to get her fellow engineers to accept it, because it was so simple. Many engineers struggled to accept that the answer to the problem could be that simple. “My designs were so deceptively simple that it was easy for people to assume I just had easy problems,” she told the Atlantic, “whereas others, who made super-complicated designs (that were technically unsound and never worked) and were able to talk about them in ways that nobody understood, were considered geniuses.”

The beauty of the STP is that none of the routers need to know the details of the entire network to work: all it needs to know is its own BID and the route cost of the connection to its immediate neighbors. The inverted tree of the network is automatically created by each of the switches using the information it has to decide how to route data. It’s perhaps best described by a poem that Perlman herself wrote in the patent application for the standard and the academic paper describing the protocol (PDF link):

Algorhyme

I think that I shall never see
a graph more lovely than a tree.

A tree whose crucial property
is loop-free connectivity.

A tree that must be sure to span
so packet can reach every LAN.

First, the root must be selected.
By ID, it is elected.

Least-cost paths from root are traced.
In the tree, these paths are placed.

A mesh is made by folks like me,
then bridges find a spanning tree.

Although the original STP has since been replaced by enhanced and updated versions such as the Rapid Spanning Tree Protocol (RTSP), the fundamental idea of a self-organizing, adaptive network remains one of the underpinnings of networking and the Internet. Many of these replacement protocols have been created with the input of Perlman, who continues to work on networking standards such as and TRansparent Interconnection of Lots of Links (TRILL). Although the networks of the Internet are now far more complex than those she first thought about in the 1970’s, the survival of these networks owes much to her innovative, and we would say creative, thinking.

No comments:

Post a Comment