Optical nonlinear impairment compensation based on Deep Neural Network (DNN) for coherent modulation systems

Fiber nonlinearity is one of the most important intrinsic challenges facing optical fiber communication systems and a main restriction on system capacity. Classical Nonlinear Impairment Compensation (NLC) techniques are widely used and are based on approximate solutions of the Nonlinear Schrödinger Equation (NLSE); they require excessive signal-processing resources and highly accurate knowledge of the link. In addition, their parameterizations can be numerically unstable. Artificial Intelligence (AI) algorithms are used to identify and resolve these deficiencies by learning from the received information itself. To the best of our knowledge, this approach is novel. This article therefore proposes a single-step nonlinearity compensation algorithm based on a Deep Neural Network (DNN) as an alternative framework for future optical communications. The proposed DNN is applicable to higher-order QAM formats and achieves greater gain in nonlinear impairment compensation than classical NLC techniques based on Digital Back Propagation (DBP). Its performance is evaluated experimentally on a coherent system with a 65536-bit sequence length and 25 Gbaud single-polarization 4-, 16-, and 64-QAM at 50 and 120 Gb/s, with back-to-back measurements using pre-distorted symbols at the transmitter, to show the Q-factor improvement after a 5000 km standard single-mode fiber transmission link. The DNN's weights are trained on data in which the intrachannel cross-phase modulation (XPM) and self-phase modulation (SPM) terms are used as input features.


Introduction
At the beginning of this decade, the world witnessed a rapid increase in the need for high-speed communication (millions of minutes of multimedia transit the IP network per second) [1]. Fiber communication systems form the backbone (core) of the Internet architecture and have evolved to meet this growing traffic demand. To fulfill this demand, three main points for optical transmission in the 2020s have become an important area of research and development; they are illustrated by the triangle shown in Fig. 1 (a). The enhancement of optical transmission capacity can be achieved through the combined use of these aspects. Unfortunately, the optical system is strongly affected by different linear and nonlinear impairments, illustrated in Fig. 2, so it has become necessary to use NLC techniques. Nonlinear impairments have always been a main challenge and are the main effects that limit system performance in terms of reliability, capacity, synchronization, transmission distance, and latency. They are due to the nonlinearity of the fiber transfer characteristics as well as nonlinear impairments from optical and electrical components [2][3][4]. Hence, the highest system performance cannot be achieved without effective NLC techniques.
An efficient technique used to solve the Manakov equation is known as the classical NLC technique. It is based on the approximate solution of the Nonlinear Schrödinger Equation (NLSE) through the Split-Step Fourier Method (SSFM) or Digital Back Propagation (DBP) [5,6]. In general, the Maximum-Likelihood Sequence Equalizer (MLSE), the Volterra Series Transfer Function (VSTF), and DBP are the most common NLC modules. However, these techniques have significant disadvantages and difficulties, and the move toward AI techniques has become a hot research topic given the impressive findings obtained with AI [7][8][9][10]. In complicated systems, AI gives optical communication a flexible statistical analysis that does not rely on any particular model.
Furthermore, it offers high potential for improving signal design, traffic control, and nonlinearity compensation performance [11][12][13]. Fig. 3 presents the AI methods, in terms of decision-making, learning methods, and statistical models, used in different aspects of optical transmission systems. Among these applications, the most significant are the statistics, analysis, and compensation of nonlinear impairments to improve overall system operation. In particular, the contributions are as follows:
▪ Orientation overview: we first give a general overview of modern ML trends and then focus on deep learning in the optical community.
▪ Propose a DNN architecture for NLC to further improve performance.
▪ Results analysis and optimization: finally, we analyze and optimize the results of the proposed DNN, which offers fast convergence, lower complexity, and higher throughput.
In our proposed design, we do not use any digital processing techniques; deep learning based on the DNN is fully relied upon to compensate for linear and nonlinear impairments.

ML algorithms for optical communications
As a rule, ML is used to teach a black box to handle problems whose computational theory is complex and intractable. There are many ML algorithms for optical communication systems (Table 1 [14]), for example for monitoring [15-20, 3, 21-23], although none has received as much attention as the deep neural network (Fig. 4) because of the benchmark results obtained with DNNs [7]. In addition, the DNN needs a large amount of data for learning and consumes a high amount of hardware resources for its calculations, which makes it suitable for future optical communications. This paper shows that the proposed DNN architecture, shown in Fig. 5, can perform NLC, achieves lower complexity than DMP algorithms, and is more robust.

Optical system model design
The proposed optical communication system is shown in Fig. 6. Single-channel SP-nQAM optical signals at a 40 Gbps bit rate are sent over the fiber to the coherent receiver for compensation after 5000 km of SMF transmission. The link consists of N fiber spans of 100 km each. The SMF has a nonlinear coefficient of γ = 2 /W/km, a dispersion parameter of D = 16 ps/nm/km, and an attenuation of 0.2 dB/km.
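The link figures quoted above imply a simple per-span budget. The following Python sketch is only an illustration of that arithmetic using the values stated in the text (5000 km, 100 km spans, 0.2 dB/km, 16 ps/nm/km); the function names are hypothetical and it is not part of the authors' simulation setup.

```python
import math

# Values quoted in the text for the SMF link.
LINK_LENGTH_KM = 5000.0
SPAN_LENGTH_KM = 100.0
ATTENUATION_DB_PER_KM = 0.2
DISPERSION_PS_PER_NM_KM = 16.0


def span_count(link_km, span_km):
    """Number of spans needed to cover the link (rounded up)."""
    return math.ceil(link_km / span_km)


def span_loss_db(span_km, alpha_db_per_km):
    """Fiber loss of one span in dB, which the in-line amplifier must recover."""
    return span_km * alpha_db_per_km


def accumulated_dispersion_ps_per_nm(length_km, d_ps_per_nm_km):
    """Chromatic dispersion accumulated over an uncompensated fiber length."""
    return length_km * d_ps_per_nm_km


if __name__ == "__main__":
    print("spans:", span_count(LINK_LENGTH_KM, SPAN_LENGTH_KM))
    print("loss per span (dB):", span_loss_db(SPAN_LENGTH_KM, ATTENUATION_DB_PER_KM))
    print("dispersion per span (ps/nm):",
          accumulated_dispersion_ps_per_nm(SPAN_LENGTH_KM, DISPERSION_PS_PER_NM_KM))
```

Connector, splice, and filter losses are ignored here, so the real amplifier gain per span would be somewhat higher than the pure fiber loss.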

Figure 6. Proposed optical communications system under consideration
At the receiver front end, a noise figure of 4 dB is assumed, and Erbium-doped fiber amplifiers (EDFAs) compensate the span loss, with all ASE noise included. The EDFA is especially suitable for prompt performance analysis of amplifiers in a long-haul system. An optical filter with a Gaussian frequency transfer function is used at the receiver side only, with a depth of 100 dB (the maximum attenuation of the filter) and a function order of 3. The parameters considered for the proposed setup are given in Table 2. In addition, we do not use any digital processing techniques such as standard phase recovery, MLSE, linear equalization (LE), or decision-feedback equalization; instead, we rely on deep learning based on the DNN to compensate the nonlinear impairments.

DNNs model design and properties setup
The proposed DNN layer setup and its parameters are given in Table 3. Input features: the DNN algorithm needs data to obtain a working model of the nonlinear impairment. It is also necessary to provide the DNN with nonlinear impairment features under sufficient nonlinearity, so the launch power P0 must be larger than the optimum channel power. These features are provided to the DNN through the intrachannel four-wave mixing (IFWM) terms and an initial calculation of the IXPM and SPM terms. The dataset is split into real and imaginary parts before entering the DNN. A nonlinear activation function is used because it performs better than a linear function, and the Leaky Rectified Linear Unit (Leaky ReLU) is the optimum activation function [25], as confirmed by the study shown in Fig. 7; we therefore use Leaky ReLU to maximize the NLC gain. Training uses the Adam algorithm with an initial learning rate of 0.001, a maximum of b = 1000 epochs, a mini-batch size of 250 for 64-QAM and 125 for 4- and 16-QAM, and a learning rate drop factor of 0.9.
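The input preparation, activation function, and learning-rate drop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the negative slope of 0.01 and the drop period are not given in the text, and all function names are hypothetical rather than the authors' implementation.

```python
def iq_features(symbols):
    """Split complex received symbols into [real, imag] pairs (input size 2)."""
    return [[s.real, s.imag] for s in symbols]


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, small linear slope for x < 0.

    The slope value 0.01 is an assumption; the text does not state it.
    """
    return x if x >= 0.0 else negative_slope * x


def dropped_learning_rate(initial_rate, drop_factor, epoch, drop_period):
    """Piecewise-constant schedule: multiply the rate by drop_factor
    every drop_period epochs (drop_period is an assumed parameter)."""
    return initial_rate * drop_factor ** (epoch // drop_period)


if __name__ == "__main__":
    # Two example 4-QAM symbols with a small distortion added.
    print(iq_features([0.98 + 1.03j, -1.01 - 0.97j]))
    print(leaky_relu(-2.0), leaky_relu(2.0))
    # Initial rate 0.001 and drop factor 0.9 come from the text.
    print(dropped_learning_rate(0.001, 0.9, epoch=25, drop_period=10))
```

Unlike the plain ReLU, the Leaky ReLU keeps a nonzero gradient for negative inputs, which is one common reason it is preferred when training deeper networks.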
In ML, the NLE can be treated as a supervised nonlinear problem. A DNN learns through the training process on the dataset. A single neuron is a unit comprising the inputs, the weights (W), a bias, and the Activation Function (AF), which is applied to produce the required output [14,16]; each neuron has a number of inputs (each associated with a weight) and one output. To obtain its output, the neuron applies the AF to the sum of the weighted inputs and the bias [26]. The AF at the output must match the application and the general dynamics of the target signal (it differs from the AF in the hidden layers). At the DNN output, the NLC signal is related to the input through the nonlinear functions. To evaluate the performance of the DNN, a loss function must be introduced to find the best estimates of the neurons' weights (W), so that the outputs are close to the target outcomes; here the cross-entropy function is used as the loss function. In addition, the error between the required values and the DNN output is described by a Gaussian error distribution with the cross-entropy function. Based on the maximum-likelihood principle, we aim to find the best estimates of the neurons' weights (W) for the DNN. This is done by finding the maximum of the joint conditional probability density function, which corresponds to the minimum of the loss function; these values give the ideal maximum-likelihood estimates of the neurons' weights (W) [27]. To reduce the loss function and obtain the best performance of the DNN [21], the Gradient Descent Algorithm is applied to find the weight estimates. This common and efficient method is known as the Backpropagation Algorithm [28].
Table 3.

Sequence Input
The sequence input layer feeds sequence data into the network.
o Input size = 2.
o Output size = Num. of Classes.
o Input size = Hidden Size.

SoftMax & Classification Output
A SoftMax layer applies a SoftMax function to its input. A classification layer computes the cross-entropy loss for multi-class classification tasks with mutually exclusive classes. The layer obtains the number of classes from the output size of the preceding layer.
o Output size = Num. of Classes.
o Loss function = cross-entropy.
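The neuron computation, the SoftMax output, and the cross-entropy loss described in this section can be sketched as follows. This is a minimal pure-Python illustration, not the authors' network: the function names and the example values are hypothetical, and a four-class output is assumed only to mirror 4-QAM.

```python
import math


def neuron_output(inputs, weights, bias, activation):
    """Single neuron: activation applied to the weighted input sum plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def softmax(logits):
    """Map raw scores to class probabilities (max shift for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, target_class):
    """Cross-entropy loss for one sample with a single correct class."""
    return -math.log(probabilities[target_class])


def logit_gradient(probabilities, target_class):
    """Gradient of the cross-entropy loss w.r.t. the SoftMax inputs: p - onehot."""
    return [p - (1.0 if k == target_class else 0.0)
            for k, p in enumerate(probabilities)]


if __name__ == "__main__":
    # Hypothetical scores for the four 4-QAM classes; class 2 is the target.
    scores = [0.2, -1.0, 2.5, 0.1]
    probs = softmax(scores)
    print("probabilities:", probs)
    print("loss:", cross_entropy(probs, 2))
    print("gradient:", logit_gradient(probs, 2))
```

The gradient returned by `logit_gradient` is the quantity that backpropagation passes from the output layer toward the earlier layers before the gradient descent update of the weights.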

Results and discussions
If the power in an optical fiber is small, the fiber can be treated as a linear medium. If the power is high, however, the nonlinear effects must be taken into account. Intrachannel SPM and XPM affect the signals and cause spectral broadening, which increases the dispersion penalties. Since there is a negative correlation between the fiber loss and the signal power, sufficient nonlinearity is required. Thus, the launch power P0 should be larger than the optimum channel power. In addition, dispersion is critical in reducing the impact of nonlinearities. Yet dispersion itself may lead to intersymbol interference. It also broadens the optical pulses and chirps them, as the frequency components accumulate a delay or phase shift between them when propagating at different speeds. We apply dispersion compensation while leaving some residual dispersion after each span, which adds ~900 ps/nm of total residual dispersion to the system. The transmission link has 10 spans at a 50 Gbps bit rate; each 100 km SMF span has a dispersion of 16 ps/nm/km and an effective area of 72 µm². Each span uses 18 km of DCF with a dispersion of -80 ps/nm/km and an effective area of 30 µm². The SMF and DCF losses are compensated by an EDFA with 25 dB gain. Based on the nonlinear Schrödinger equation, the nonlinear parameter can be written as $\gamma = 2\pi n_2 / (\lambda A_{\mathrm{eff}})$, with $A_{\mathrm{eff}}$ the effective area, $n_2$ the nonlinear index, and $\lambda$ the carrier wavelength. Two quantities can be defined to describe the relative importance of dispersion and nonlinearity: the dispersion length, $L_D = T_0^2 / |\beta_2|$, and the nonlinear length, $L_{NL} = 1 / (\gamma P_0)$ [29,30].
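To make these length scales concrete, the sketch below evaluates the nonlinear coefficient, the dispersion length, and the nonlinear length in Python. Only the 72 µm² effective area, D = 16 ps/nm/km, and γ = 2 /W/km come from the text. The nonlinear index n2 = 2.6e-20 m²/W, the 1550 nm wavelength, the pulse width T0 = 40 ps, and the 1 mW launch power are assumed typical values, and the function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def nonlinear_coefficient_per_w_km(n2_m2_per_w, wavelength_m, a_eff_m2):
    """gamma = 2*pi*n2 / (lambda * A_eff), returned in 1/(W km)."""
    gamma_per_w_m = 2.0 * math.pi * n2_m2_per_w / (wavelength_m * a_eff_m2)
    return gamma_per_w_m * 1e3


def beta2_ps2_per_km(d_ps_per_nm_km, wavelength_m):
    """Group-velocity dispersion beta2 = -D * lambda^2 / (2*pi*c), in ps^2/km."""
    d_si = d_ps_per_nm_km * 1e-6  # ps/(nm km) -> s/m^2
    beta2_si = -d_si * wavelength_m ** 2 / (2.0 * math.pi * C_M_PER_S)  # s^2/m
    return beta2_si * 1e27  # s^2/m -> ps^2/km


def dispersion_length_km(t0_ps, beta2_ps2_km):
    """L_D = T0^2 / |beta2|."""
    return t0_ps ** 2 / abs(beta2_ps2_km)


def nonlinear_length_km(gamma_per_w_km, p0_w):
    """L_NL = 1 / (gamma * P0)."""
    return 1.0 / (gamma_per_w_km * p0_w)


if __name__ == "__main__":
    # Assumed: n2 = 2.6e-20 m^2/W and a 1550 nm carrier; A_eff is from the text.
    print("gamma (1/W/km):", nonlinear_coefficient_per_w_km(2.6e-20, 1550e-9, 72e-12))
    beta2 = beta2_ps2_per_km(16.0, 1550e-9)
    print("beta2 (ps^2/km):", beta2)
    # Assumed: T0 = 40 ps (one symbol period at 25 Gbaud) and P0 = 1 mW.
    print("L_D (km):", dispersion_length_km(40.0, beta2))
    print("L_NL (km):", nonlinear_length_km(2.0, 1e-3))
```

When L_NL is much longer than L_D, dispersion dominates the pulse evolution; raising the launch power shortens L_NL and makes the nonlinear effects comparatively stronger.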
The obtained results are shown in Fig. 8. The output pulse is affected by SPM, which induces chirping and spectral broadening that grow with propagation distance. The DNN receiver has 10 hidden layers with 32 and 128 hidden nodes. All networks are trained using sigmoid activation functions, an LSTM layer, a ReLU layer that performs a threshold operation, and a SoftMax and classification output layer, with the cross-entropy loss function and the Adam optimization algorithm. The dataset contains 262,144 samples. The transmission is repeated 128 times for the same input sequence to train the network, with 85% of the dataset used for training and 15% for testing. The final results after DNN learning are shown in Table 4, with a maximum Q factor of 12.04518896 for the 4-QAM signal. The results in Fig. 10, obtained after training and testing the network, show that the DNN based on a classification output layer is capable of classifying the received symbols for the 4- and 16-QAM reference targets. At iteration 2000 the constellation is already very clear, but the loss is still high (~0.2) with a BER of 2.1x10^-6: although the network classifies the symbols correctly, the bit locations are still incorrect. With further training, the DNN loss decreases toward ~0, as shown in Fig. 11 and Fig. 12, and the validation accuracy rises to nearly 100% at iteration 5000. The final Q factor of the system at testing is ~12 dB for a 5000 km link distance and a 50 Gbps bit rate.
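For readers who want to relate a Q factor in dB to a bit error rate, the following Python sketch uses the common textbook relation BER = 0.5·erfc(Q/√2) with Q_dB = 20·log10(Q). Both the Gaussian-noise, binary-decision model and the dB convention are assumptions made here for illustration; the text does not state how its Q factor and BER values were computed, so the output need not match the reported figures.

```python
import math


def q_db_to_linear(q_db):
    """Convert a Q factor in dB to linear scale, assuming Q_dB = 20*log10(Q)."""
    return 10.0 ** (q_db / 20.0)


def ber_from_q(q_linear):
    """BER under a Gaussian-noise, binary-decision assumption:
    BER = 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q_linear / math.sqrt(2.0))


if __name__ == "__main__":
    q_db = 12.0  # approximate final Q factor quoted in the text
    q_lin = q_db_to_linear(q_db)
    print("Q (linear):", q_lin)
    print("estimated BER:", ber_from_q(q_lin))
```

For multilevel formats such as 16- and 64-QAM, this relation is only a rough guide, since the symbol-to-bit mapping and the per-level noise statistics also affect the error rate.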

Conclusion
The suggested DNN learns the features of an optical fiber channel model, which is computationally complex and expensive to estimate. A large, sparse data set enables the network to extract the most significant properties that lead to the interference effect; thus, the network is capable of predicting these features.