New data center network architecture addresses latency and throughput issues

Jun 17, 2020

Data traffic within data centers, and the services that traffic supports, is growing rapidly. Hardware has advanced quickly in response: the Tomahawk 4 switch recently reached 25.6 terabits per second, and transceiver technology recently passed 400 gigabits per second.


By 2021, about 95 percent of all data center traffic is expected to come from the cloud, and in cloud applications most packets are under 500 bytes. As packets get smaller, the network has to switch faster to keep up. Unfortunately, networks are still struggling with latency.


As data center systems expand, packets in electronically switched networks experience a "long tail delay," typically hundreds of milliseconds or more, which is several orders of magnitude higher than the median. Put another way, a worst-case wait that affects 1 in 100 users is tolerable at small scale, but it becomes a real problem when that 1% amounts to thousands of users.
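To make the scale argument concrete, the following minimal Python sketch computes a median and a 99th-percentile latency from a synthetic sample and counts how many users fall into a 1% tail. The sample values, the nearest-rank percentile method, and the function names are illustrative assumptions, not measurements from a real data center.

```python
def nearest_rank_percentile(samples, percent):
    """Return the nearest-rank percentile (percent is an integer 1..100)."""
    ordered = sorted(samples)
    # Integer ceiling of percent * n / 100, which avoids float rounding.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def users_in_tail(total_users, tail_percent=1):
    """Number of users who land in the slowest tail_percent of requests."""
    return total_users * tail_percent // 100


# Hypothetical sample: 980 requests at 10 microseconds, 20 at 300 ms.
latencies_us = [10] * 980 + [300_000] * 20

median_us = nearest_rank_percentile(latencies_us, 50)
p99_us = nearest_rank_percentile(latencies_us, 99)

print("median (us):", median_us)
print("p99 (us):", p99_us)
for population in (100, 10_000, 1_000_000):
    print(population, "users ->", users_in_tail(population), "in the 1% tail")
```

In this made-up sample the 99th percentile sits several orders of magnitude above the median, and the same 1% tail that touches one user out of 100 touches 10,000 users out of a million.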


The recently released PULSE architecture offers an innovative solution. It proposes an optical circuit-switched network controlled by distributed hardware schedulers. When modeled in MATLAB, the architecture has an average delay of about 1 microsecond and a tail delay of about 5 microseconds. Even with tuning overhead taken into account, its throughput is a staggering 25.6 petabits per second, although any single node-to-node connection is limited to 100 Gbps at a given instant.


Several key features of the network make this possible. It uses parallel star couplers, each of which distributes light from any port equally to all the other connected ports. The network has 64 racks of 64 nodes each, and each node has multiple transceivers so that it can join multiple subnets. Each transceiver connects its node to a different subnet through a different star coupler.
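As a rough illustration of what "equally to all ports" implies, the sketch below computes the node count from the figures above and the splitting loss of an ideal star coupler, which delivers 1/N of the input power to each of N outputs (a loss of 10·log10(N) dB). The 64-port coupler size is an assumption made here to match the 64 nodes per rack, and excess loss in a real device is ignored.

```python
import math

RACKS = 64            # from the article
NODES_PER_RACK = 64   # from the article


def ideal_splitting_loss_db(num_ports: int) -> float:
    """Splitting loss of an ideal star coupler with num_ports outputs."""
    return 10 * math.log10(num_ports)


total_nodes = RACKS * NODES_PER_RACK

# Assumption: one coupler port per node in a rack (64 ports).
coupler_ports = NODES_PER_RACK
power_fraction = 1 / coupler_ports
loss_db = ideal_splitting_loss_db(coupler_ports)

print("total nodes:", total_nodes)
print("power fraction per output:", power_fraction)
print("ideal splitting loss (dB):", round(loss_db, 2))
```

Because every output sees every transmitter, a receiver has to select its signal by wavelength and time slot, which is why scheduling and fast tuning matter so much in this design.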


During data transmission, the transmitter and receiver are tuned to the same time slot and wavelength. For each coupler, therefore, a corresponding scheduler in the same rack handles the traffic for one source-destination rack pair. Requests are sent to the scheduler several cycles in advance. The scheduling algorithm computes a new wavelength assignment for each circuit cycle. A key feature of the architecture is its nanosecond reconfiguration speed.
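The constraint described above (a transmitter and receiver sharing one time slot and one wavelength on a broadcast coupler) can be sketched as a small first-fit allocator. This is not the PULSE scheduling algorithm, which the article does not detail; it is a hypothetical, simplified illustration for a single coupler and a single cycle, and every name and parameter in it is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Request = Tuple[int, int]  # (source node, destination node)
Grant = Tuple[int, int]    # (time slot, wavelength)


def schedule_cycle(
    requests: List[Request], num_slots: int, num_wavelengths: int
) -> Dict[int, Optional[Grant]]:
    """First-fit grant of (slot, wavelength) pairs for one coupler.

    Per slot: a source transmits at most once, a destination receives
    at most once, and each wavelength carries at most one transmission
    (the coupler broadcasts, so a shared wavelength would collide).
    """
    busy_tx = [set() for _ in range(num_slots)]
    busy_rx = [set() for _ in range(num_slots)]
    used_wl = [set() for _ in range(num_slots)]
    grants: Dict[int, Optional[Grant]] = {}

    for idx, (src, dst) in enumerate(requests):
        grants[idx] = None  # stays None if nothing fits this cycle
        for slot in range(num_slots):
            if src in busy_tx[slot] or dst in busy_rx[slot]:
                continue
            free = [w for w in range(num_wavelengths) if w not in used_wl[slot]]
            if not free:
                continue
            wavelength = free[0]
            busy_tx[slot].add(src)
            busy_rx[slot].add(dst)
            used_wl[slot].add(wavelength)
            grants[idx] = (slot, wavelength)
            break
    return grants


demo = schedule_cycle([(0, 1), (2, 1), (0, 3), (4, 5)], num_slots=2, num_wavelengths=2)
print(demo)
```

Requests that cannot be placed are left ungranted, which a real scheduler would presumably retry in a later cycle; a hardware implementation would also evaluate requests in parallel rather than in a sequential loop.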


Because the subnets are completely independent, this arrangement allows wavelengths to be reused across them. As a result, the network can support more than 250,000 channels. The system also allows 100% wavelength utilization. The architecture requires no buffering, addressing, or switching inside the network.
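The effect of wavelength reuse on channel count is simple multiplication, sketched below. The article does not state the number of subnets or wavelengths, so both values here are assumptions: one subnet per source-destination rack pair (64 × 64) and 64 wavelengths per subnet, chosen because they are consistent with the "more than 250,000" figure.

```python
RACKS = 64  # from the article


def total_channels(num_subnets: int, wavelengths_per_subnet: int) -> int:
    """Channels available when every subnet can reuse the same wavelengths."""
    return num_subnets * wavelengths_per_subnet


# Assumption: one independent subnet per (source rack, destination rack) pair.
num_subnets = RACKS * RACKS
# Assumption: 64 wavelengths available in each subnet.
wavelengths_per_subnet = 64

channels = total_channels(num_subnets, wavelengths_per_subnet)
print("subnets:", num_subnets)
print("channels with reuse:", channels)
print("channels without reuse:", wavelengths_per_subnet)
```

Without independent subnets, a single shared medium would be limited to the wavelength count alone; reuse multiplies it by the number of subnets.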


However, it does require extremely fast filtering, scheduling, data recovery, tunable wavelength switching, and synchronization. Under this layout, nodes can effectively share resources and minimize bottlenecks.


One of the surprising findings is that PULSE is actually very cost-effective compared with current network architectures, which cost about $5/Gbps. The network also consumes only 82 picojoules per bit. The cost of transceivers is falling, which will further benefit systems such as PULSE. In addition, during a data center refresh cycle, only the transceivers at the end nodes need to be upgraded, resulting in additional cost savings.

