Chair of Communications

Reinforcement Learning for maximizing the system capacity of optical transmission systems




in cooperation with Infinera



To meet ever-growing bandwidth demands while using spectral resources efficiently, network operators deploy elastic optical networks (EONs). These networks rely on flexible devices and reconfigurable optical add/drop multiplexer (ROADM) architectures. A key element of an EON is the bandwidth-variable transponder (BVT). Today's BVTs allow the bit rate (100 Gb/s to 600 Gb/s), the baud rate (approx. 34 to 69 GBd) and the modulation format (QPSK to 64QAM, including bit-interleaved time-domain hybrid QAM) to be varied simultaneously and with fine granularity. Furthermore, current systems are flexigrid-capable, so these signals can be transported on a variable frequency grid.

The resulting high dimensionality of the parameter space makes the optimal use of resources a major challenge. One possible solution is to couple the system with a machine learning algorithm, for example reinforcement learning.

Reinforcement learning is modelled on the natural learning behaviour of humans. In a trial-and-error procedure, an agent learns to repeat actions that are followed by a reward and to avoid actions that are followed by a penalty. After several runs, the agent can develop a strategy that leads to the goal as quickly as possible. This approach is particularly suitable if no "high-quality data" is available in which the desired result is already defined.
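
The trial-and-error principle can be illustrated with a minimal sketch. It assumes a toy problem with three actions, each followed by a reward (+1) or a penalty (-1) with a fixed probability, and an epsilon-greedy agent that keeps a running value estimate per action. The function name `run_bandit`, the reward probabilities and all parameter values are hypothetical and chosen only for illustration; they do not represent the method used in this project.

```python
import random

def run_bandit(reward_probs, episodes=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error on a toy multi-armed bandit.

    reward_probs[i] is the (hypothetical) probability that action i is
    followed by a reward (+1); otherwise it is followed by a penalty (-1).
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    q = [0.0] * n_actions     # running value estimate per action
    counts = [0] * n_actions  # number of times each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)                   # explore
        else:
            action = max(range(n_actions), key=lambda a: q[a])  # exploit
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0
        counts[action] += 1
        # Incremental mean of all rewards observed for this action.
        q[action] += (reward - q[action]) / counts[action]
    return q, counts

q_values, counts = run_bandit([0.2, 0.5, 0.9])
best_action = max(range(len(q_values)), key=lambda a: q_values[a])
print("estimated values:", [round(v, 2) for v in q_values])
print("most rewarding action:", best_action)
```

The agent needs no data set in which the correct answer is given beforehand; it only observes the reward or penalty that follows each of its own actions.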

We investigate whether a reinforcement learning algorithm can achieve a higher-level system goal (e.g. highest system capacity or maximum capacity-reach product) by varying the available system parameters within a reasonable computation time. The algorithm then outputs the BVT settings and the power per wavelength.
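
As a rough sketch of how such a task could be framed, the snippet below defines a small action space (modulation format, baud rate, launch power) and a reward equal to the raw dual-polarization line rate when the link closes and zero otherwise. The SNR thresholds, the parameter grids and the link model `toy_snr_db` are made-up placeholders rather than measured or vendor data, and the exhaustive search at the end is only a baseline that is feasible because this toy space is tiny; it is not reinforcement learning.

```python
import math
from itertools import product

# Hypothetical placeholder values for illustration only (not measured data).
BITS_PER_SYMBOL = {"QPSK": 2, "16QAM": 4, "64QAM": 6}
REQUIRED_SNR_DB = {"QPSK": 7.0, "16QAM": 14.0, "64QAM": 20.0}
BAUD_RATES_GBD = (34, 46, 58, 69)
LAUNCH_POWERS_DBM = (-2.0, 0.0, 2.0)

# The action space is the Cartesian product of all tunable parameters.
ACTIONS = list(product(BITS_PER_SYMBOL, BAUD_RATES_GBD, LAUNCH_POWERS_DBM))

def toy_snr_db(baud_gbd, power_dbm, n_spans):
    """Made-up link model: SNR falls with span count and baud rate and is
    penalized when the launch power moves away from 0 dBm."""
    return (30.0
            - 10.0 * math.log10(n_spans)
            - 10.0 * math.log10(baud_gbd / 34.0)
            - abs(power_dbm))

def reward(action, n_spans):
    """Raw dual-polarization line rate in Gb/s if the link closes, else 0."""
    fmt, baud_gbd, power_dbm = action
    if toy_snr_db(baud_gbd, power_dbm, n_spans) < REQUIRED_SNR_DB[fmt]:
        return 0.0
    return float(2 * BITS_PER_SYMBOL[fmt] * baud_gbd)

# Brute-force baseline over all 36 actions for a link of 8 spans.
best = max(ACTIONS, key=lambda a: reward(a, n_spans=8))
print(len(ACTIONS), "actions; best:", best, "reward:", reward(best, n_spans=8))
```

Each additional tunable parameter or wavelength multiplies the number of actions, which is why exhaustive search stops being practical in a real system and a learning agent becomes attractive.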