Serial link speeds have increased 25X in under 20 years, and the IBIS Algorithmic Modeling Interface (IBIS-AMI) models used to simulate these links have grown correspondingly complex. With the increased speed and complexity of designs, it is crucial to analyze channels to ensure sufficient margin for error-free data transmission. An exhaustive manual search is typically used to find the best set of parameters for a given channel but, given the increased number of model parameters and ranges, this approach can quickly become computationally expensive, even with parallel execution.

Machine learning (ML) techniques [1] have proven effective in modeling complex systems with numerous interacting components and nonlinear relationships. Some of these techniques can be employed to optimize the parameters of a complex system more efficiently than the exhaustive search method typically used in serial link simulations.

This article describes the use of the ML optimization algorithm in Cadence’s Sigrity™ signal and power integrity solution to quickly and efficiently converge on the best set of parameters in a set of IBIS-AMI models, refining the parameter values to maximize a specific metric.

Reference Designs
As the speeds of serial link standards have increased, so has the complexity of the designs needed to transmit and receive data. This is reflected in the amount of transmitter (Tx) and receiver (Rx) equalization specified in the standards, which is needed to overcome the increased channel losses that come with faster data rates.

In the past, IBIS-AMI models based on Electronic Industries Association serial standards, called “reference designs,” have been swept across all possible Tx and Rx parameter combinations to find the best solution for a given channel. While this was a reasonable approach in the past, it has become more difficult to execute. Even with many cores and tools to manage the parameter sweeps, it has become computationally expensive and very time consuming.

Simulations can be run more efficiently using statistical methods; however, statistical models lack nonlinear effects, such as noise and signal clipping, and can therefore give less than optimal results. As a result, time domain channel simulation was used for this study, combined with a potentially more computationally efficient approach: an ML optimization algorithm to find the best set of IBIS-AMI model parameters for a given channel.

ML Optimization
The ML optimization process can be approached in many ways. One method is an exhaustive search: a brute-force algorithm that systematically enumerates all possible solutions to a problem and checks each one to see if it is a valid solution. This algorithm is typically used for problems that have a small and well-defined search space, where it is feasible to check all possible solutions [2]. However, for cases where the search space is extremely large, a different approach is needed. This is where ML can be used to find an optimal solution more efficiently.
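As a point of comparison, the exhaustive search just described can be sketched as follows; the parameter grid and objective here are toy stand-ins for an actual channel simulation:

```python
# Exhaustive (brute-force) parameter search: enumerate every combination
# in the grid and keep the best. Parameter names and the objective are
# illustrative only, not taken from the article.
from itertools import product

def exhaustive_search(objective, param_grid):
    """Evaluate the objective at every grid point; return the best."""
    names = list(param_grid)
    best_params, best_value = None, float("-inf")
    for combo in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        value = objective(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value

# Toy objective with a unique maximum at x = 2, y = -1
grid = {"x": range(-5, 6), "y": range(-5, 6)}
best, val = exhaustive_search(
    lambda p: -(p["x"] - 2) ** 2 - (p["y"] + 1) ** 2, grid
)
# best == {"x": 2, "y": -1}; even this tiny grid takes 121 evaluations,
# and the count grows multiplicatively with each added parameter.
```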

There are areas in the ML field that specifically deal with optimizing the parameters of a function. This function, often called an objective function, is unknown (sometimes called a “black box”). The goal is to find the set of parameters that will provide the maximum (or minimum) value of the objective function. A model of this unknown objective function (sometimes called a surrogate model) is built and updated based on evaluations of the objective function; the update of the surrogate model is guided by an acquisition function. While there are several types of surrogate models and acquisition functions from which to choose, the most common surrogate model is a Gaussian process, and the most common acquisition function is expected improvement.

This optimization process (see Figure 1) can be detailed as follows [3]:

  1. Collect an initial random sampling by applying random parameter values to the objective function. The size of this sampling can be scaled with the number of parameters.
  2. Train the surrogate model on the initial random sampling.
  3. Create updated sets of parameters that either:
    1. Move closer to the maximum (or minimum) of the acquisition function, or
    2. Further randomly sample the acquisition function.
  4. Evaluate the objective function with the updated parameters from the surrogate model.
  5. Update the surrogate model based on the latest samples of the objective function.
  6. Repeat Steps 3-5 until a stopping criterion has been met.
Figure 1. Machine learning optimization process.
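The loop in Figure 1 can be sketched with a Gaussian process surrogate and the expected-improvement acquisition; the 1-D objective, the library choices, and the iteration counts below are illustrative assumptions, not the Sigrity implementation:

```python
# Minimal Bayesian-optimization sketch of Steps 1-6 above.
# The "black box" objective stands in for a channel simulation.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Unknown objective; a real run would launch a channel simulation
    return -(x - 0.6) ** 2

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement (for maximization) over the best value so far."""
    sigma = np.maximum(sigma, 1e-9)          # avoid divide-by-zero
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 5).reshape(-1, 1)  # Step 1: initial random sampling
y = objective(X).ravel()
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                              normalize_y=True, alpha=1e-6)
candidates = np.linspace(0.0, 1.0, 201).reshape(-1, 1)

for _ in range(15):                          # Step 6: repeat Steps 3-5
    gp.fit(X, y)                             # Steps 2/5: (re)train surrogate
    mu, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.vstack([X, x_next])               # Step 3: updated parameter set
    y = np.append(y, objective(x_next[0]))   # Step 4: evaluate objective

best_x = float(X[np.argmax(y), 0])           # should approach the true maximum
```

The surrogate is cheap to query, so each iteration spends one expensive objective evaluation where the acquisition function predicts the most improvement.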
When creating the updated set of parameters in Step 3, there is a tradeoff between the number of simulations that exploit the acquisition function, called exploitation, and the number of simulations that explore the acquisition function, called exploration. This tradeoff is important because too much exploitation risks never finding the best set of parameters, while too much exploration spends too little time refining toward the best set of parameters. This tradeoff is sometimes called the “exploration-exploitation dilemma.” Typically, more exploration is done early in the optimization process before it switches to more exploitation towards the end of the process.
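For reference, the expected-improvement acquisition balances this tradeoff through an explicit exploration parameter ξ (this standard form is supplied here for context, not taken from the article):

```latex
% Expected improvement for maximization, with exploration parameter \xi.
% \mu(x), \sigma(x): surrogate (Gaussian process) mean and standard
% deviation; f^{+}: best objective value observed so far;
% \Phi, \varphi: standard normal CDF and PDF.
\mathrm{EI}(x) = \left(\mu(x) - f^{+} - \xi\right)\Phi(Z) + \sigma(x)\,\varphi(Z),
\qquad Z = \frac{\mu(x) - f^{+} - \xi}{\sigma(x)}
```

Larger values of ξ discount the predicted improvement, pushing the search toward high-uncertainty (exploratory) regions; smaller values favor exploitation near the current best.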

For this article, the objective function was the simulation of a specific channel with IBIS-AMI models at the Tx and Rx that, together, have many equalization parameters. From the simulation, an output needed to be chosen for the surrogate model. Typical outputs for this type of simulation include eye height, eye width, eye jitter, and channel operating margin (COM). A single output, or a combination of several outputs, could be chosen to maximize or minimize. In most cases this would be reliable but, as will be discussed in a later section, there are situations where these outputs do not work and a different measurement is needed.

IBIS-AMI Model Generation
For this study, heavily parameterized IBIS-AMI models were needed to leverage the Sigrity ML optimization. A model based on the OIF-CEI-112G-LR standard was selected due to its large number of parameter combinations, shown in Table 1 [4].


Table 1.


The Tx AMI model has three taps of pre-cursor and one tap of post-cursor, with the ranges shown in Table 1. Note that not all combinations of values are valid for the Tx, as the main cursor cannot be lower than 0.5. This issue, its impact on the optimization algorithm, and the workaround are discussed in a later section.

The Rx AMI model is a two-stage continuous time linear equalization (CTLE) receiver with a 12-tap decision feedback equalizer (DFE). The two stages of CTLE, shown in Figure 2, were based on the pole-zero pairs shown in Table 1, along with the various gain steps. The taps of the DFE were adapted and limited based on the command parameters. In total, six different parameters could be optimized between the Rx and Tx models; had the brute-force method been used, complete coverage of the solution space would have required 636,804 simulations.

Figure 2. Bode plots of the two stages of CTLE.

Simulation Setup
A testbench was built in an IBIS-AMI simulator, as shown in Figure 3. The simulation was run long enough to allow the DFE to adapt to the incoming data (~70 kbits), and to accumulate ~100 kbits for making measurements. The simulation was run with 64 steps per unit interval (UI) and a vertical resolution of 2048, allowing sufficient detail for taking accurate measurements.
Figure 3. Testbench schematic.

To ensure that the optimization algorithm would perform for different setups, three channels were selected based on the insertion loss mask from the OIF-CEI-112G-LR specification [4], shown in Figure 4. The channels were selected as short, medium, and long, based on the loss at the fundamental frequency (28 GHz), as labeled in the figure.

Figure 4. Insertion loss mask.

For the optimization to run correctly, the algorithm needed to know which parameters could be adjusted and which outputs could be maximized for the best result. The CTLE control was set up as a list of numbers, while the Tx pre- and post-cursors were set up as ranges of floating-point numbers, with the upper and lower bounds based on values from Table 1. Figure 5 shows the parameter setup details. It was assumed that each parameter, when increased or decreased, would have a continuous effect on the simulation output; placing the parameter values in a random order could create discontinuities in the output, possibly preventing the algorithm from converging on a solution.

Figure 5. Optimization parameter setup details.

As stated in the previous section, the sum of the Tx pre- and post-cursors needed to be 0.5 or less. The main cursor was calculated by summing the pre- and post-cursor values together and subtracting that number from 1.0. During the optimization process, it was not possible to have the ML algorithm skip these invalid simulations. This issue was solved by adding custom code to the Tx AMI model that set all pre- and post-cursor coefficients to zero and set the main cursor to a very small value (0.01) whenever the sum of the coefficients was more than 0.5. This presented a very poor result to the ML algorithm, forcing the adaptation of the parameters away from the invalid set.

The output to be maximized should be one that increases as the simulation result improves. This could be a combination of eye height and eye width, or a relative measurement like COM. However, these measurements suffer from the fact that they are always zero for a closed-eye simulation result. This is not an issue if, across the global solution space, there are few closed-eye results; for non-return-to-zero (NRZ) signaling simulations, this turns out to be true for all but the most difficult channels. For four-level pulse amplitude modulation (PAM4) signaling simulations, however, the opposite is true: there are many more closed-eye simulation results than open-eye results.
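The Tx-coefficient workaround described above can be sketched as follows. The function and parameter names are hypothetical (the actual Tx model code is not shown here), and treating the tap magnitudes via absolute values is an assumption:

```python
# Sketch of the Tx-coefficient guard: when the pre- and post-cursor
# coefficients sum past 0.5 (leaving a main cursor below 0.5), replace
# the invalid set with a deliberately terrible one (main = 0.01) so the
# optimizer is steered away from the invalid region.
def apply_tx_guard(pre_cursors, post_cursors):
    """Return (pre, post, main) with invalid combinations penalized."""
    total = sum(abs(c) for c in pre_cursors) + sum(abs(c) for c in post_cursors)
    main = 1.0 - total            # main cursor = 1.0 minus the tap sum
    if main < 0.5:                # invalid: tap sum exceeds 0.5
        return [0.0] * len(pre_cursors), [0.0] * len(post_cursors), 0.01
    return list(pre_cursors), list(post_cursors), main

# Valid set: three pre-cursor taps plus one post-cursor tap sum to 0.35,
# so the main cursor is about 0.65 and the taps pass through unchanged.
pre, post, main = apply_tx_guard([-0.05, 0.0, -0.1], [-0.2])

# Invalid set: taps sum to 0.6, so everything is zeroed and main forced
# to 0.01, producing a very poor simulation result for the optimizer.
pre, post, main = apply_tx_guard([-0.2, -0.1, -0.1], [-0.2])
```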

This is an important detail, as the ML algorithm needs a gradient of results to help find a maximum/minimum value; this is the idea behind gradient descent [5], a common optimization algorithm in the ML field. If the algorithm observes mostly zeros from the simulation results, it will see a gradient of zero and will be unable to find the best solution. Therefore, a measurement is needed that provides a non-zero result for all simulations, including closed-eye results, and that works for any signaling type (NRZ, PAM4, etc.). The signal-to-noise ratio (SNR) measurement meets these requirements. An SNR measurement is defined as the ratio of the desired signal level to the level of background noise plus any distortion [6]. For serial link simulations, the desired signal is defined as the difference between the high and low levels of an eye diagram. The background noise and any distortions are measured as the sum of the one sigma of the high level and the one sigma of the low level. This definition works for an NRZ eye diagram as well as a PAM4 eye diagram. The details of this measurement can be found in Figure 6.
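The SNR measurement defined above can be sketched as follows. The sample data is synthetic, and applying the same ratio per eye for PAM4 is an assumption about the measurement's extension:

```python
# SNR of an eye: level separation divided by the sum of the one-sigma
# spreads of the high and low levels. Unlike eye height/width, this
# stays non-zero even when the eye is closed.
import numpy as np

def eye_snr(high_samples, low_samples):
    """SNR = (mean(high) - mean(low)) / (sigma(high) + sigma(low))."""
    high = np.asarray(high_samples, dtype=float)
    low = np.asarray(low_samples, dtype=float)
    return (high.mean() - low.mean()) / (high.std() + low.std())

rng = np.random.default_rng(1)
# Wide level separation, tight distributions: an open eye
open_eye = eye_snr(rng.normal(0.4, 0.01, 10_000),
                   rng.normal(-0.4, 0.01, 10_000))
# Overlapping, noisy levels: a closed eye, yet the SNR is still non-zero,
# giving the ML algorithm a usable gradient
closed_eye = eye_snr(rng.normal(0.05, 0.05, 10_000),
                     rng.normal(-0.05, 0.05, 10_000))
```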

Figure 6. SNR measurement details.

Figure 7 shows the SNR measurement for various PAM4 eye density plots. While the lower plots have some eye opening, the upper plots have no eye opening and return a zero for typical eye measurements, providing useless information for the ML algorithm. The SNR measurement, in contrast, increased as the eye plot improved, even for the closed-eye cases.

Figure 7. SNR comparison for various PAM4 eye density plots.

As stated earlier, the optimization process can find a minimum or maximum value of the evaluation function. For this study, the ML algorithm was set up to find a minimum value, so the SNR value was inverted. The ML algorithm also needed a stopping criterion. This could be either an output value reaching a target, in which case the simulations would stop once the target was reached or a maximum number of simulations had run, or simply the best result found after a specific number of simulations. For this study, the best SNR value was to be found within 100 simulation runs.

Parallel execution of the simulations was used but, due to the way the ML model is updated, the number of parallel simulations was limited to four. This may seem a disadvantage compared to the almost unlimited number of cores that could be used for an exhaustive search; however, the results showed that this limitation had minimal impact on the time to convergence when measured by the number of simulations needed.

The optimization was run on the three channels shown in Figure 4. For each channel, the following steps were performed:

  1. Based on the number of parameters to be optimized (six), 30 initial simulations were run with random parameter settings, and the SNR of each simulation was recorded.
  2. The surrogate model was trained on the SNR output from the initial set of simulations.
  3. The surrogate model was then queried for the next set of simulations.
  4. The results from the next set of simulations were used to update the surrogate model.
  5. Steps 3 and 4 were repeated until no parameter sets remained or the total number of simulations (100) had been reached.
Simulation Results
Figure 8 shows the optimization convergence for the three channels detailed in Figure 4.
Figure 8. A) short channel results, B) medium channel results, and C) long channel results.

Based on the flow described in the previous section, it can be observed that after the initial 30 simulations, the algorithm quickly converged to a good result, with the subsequent simulations refining it to a better answer. In addition, the algorithm was able to converge for three different lossy channels, showing the robustness of the process.

The results clearly show that the ML optimization found parameters that resulted in good open eye results at the Rx. While these results were good, it was necessary to verify that they were the best results, or at a minimum, close to the best results. To confirm that the algorithm converged on the best set of parameters, the parameters from the best result of the medium channel were locally swept around to see if a better result could be found. Due to the difficulty of showing more than four parameter sweeps in a single graph, it was decided to keep the values for C(-3) and C(-2) constant, while C(-1), C(+1), and the two CTLE stage settings were swept around the best result found by the ML algorithm. Assuming the algorithm came close to the best result, this sweeping methodology should confirm the algorithm’s result.

Figure 9 shows a grid of heatmaps of this manual sweep. The title for each heatmap shows the C(-1), C(+1) setting. Each heatmap shows the sweep values of the two CTLE stages. The best result from the ML algorithm is highlighted in blue, while the best result from the manual sweeping is highlighted in white. While the maximum SNR value found by the manual sweeping is much higher than that found by the algorithm, the parameters for the maximum SNR are only one step away from those found by the algorithm.
Figure 9. Heatmap grid of the parameter sweep.
Conclusion
The application of the ML optimization technology in Cadence’s Sigrity SI/PI solution was investigated for refining IBIS-AMI parameters, with the goal of quickly and efficiently converging on the best set of parameters in a serial link simulation. The results show that the algorithm successfully found good parameter sets for three different channels, and in fewer simulations than a traditional manual sweep would have required, conserving limited human and computing resources. In most test cases, only 100 simulations were needed to find the best set of parameters.

 

REFERENCES
  1. M. Kashyap, K. Keshavan, and A. Varma, “A novel use of deep learning to optimize solution space exploration for signal integrity analysis,” IEEE, April 5, 2018.
  2. “Implementation of Exhaustive Search Algorithm for Set Packing.”
  3. W. Wang, “Bayesian Optimization Concept Explained in Layman Terms,” March 18, 2020, https://medium.com/data-science/bayesian-optimization-concept-explained-in-layman-terms-1d2bcdeaf12f.
  4. “Common Electrical I/O (CEI) - Electrical and Jitter Interoperability agreements for 6G+ bps, 11G+ bps, 25G+ bps, 56G+ bps and 112G+ bps I/O,” Optical Internetworking Forum, May 5, 2022.
  5. J. Brownlee, “Gradient Descent for Machine Learning,” Machine Learning Algorithms, August 12, 2019.
  6. “Understanding Eye Pattern Measurements,” Anritsu.