1. Introduction
Accurate device modeling is essential for circuit simulation and design. In 1996, BSIM3 Version 3 (commonly abbreviated as BSIM3v3) was established by SEMATECH as the first industry-wide standard of its kind[1]. It has since been widely used by most semiconductor and IC design companies worldwide for device modeling and CMOS IC design. Though the BSIM model is accurate, adjusting it for non-ideal effects takes a long time. Meanwhile, Moore's law is approaching its end and many new devices that need to be studied have emerged[2–4]. It is unwise to invest a huge amount of resources in modeling a new device when we only want to examine its circuit characteristics rather than put it to commercial use. Efficient modeling methods are therefore needed for these purposes.
Many methods have been proposed for similar purposes. Gustavsen proposed a number of black-box macro-modeling methods for large complex systems such as high-voltage and electromagnetic systems[5]. The whole system is viewed as a black box with inputs and outputs, and statistical regression models are built to describe it. By using this method, the complexity of the system is greatly reduced[6–8]. The same idea can be used in semiconductor device modeling, since a semiconductor device can also be viewed as a black box. If an ideal statistical regression model is found, we can save a lot of time and money. For example, the carbon nanotube field-effect transistor (CNT-FET) is a promising successor to the MOS-FET, generating much less heat while running as fast[9]. But CNT-FETs show different I–V characteristics under different manufacturing processes, which makes it hard to build a universal physical model for all of them. Black-box statistical modeling is a practical choice for efficiently modeling different CNT-FETs[10, 11].
The keys to statistical modeling are the choice of model and the extent to which the model can fit the true data. Numerous statistical models have been proposed for this task. Ordinary least squares (OLS) and its regularized variants (Ridge and LASSO) are the most commonly used regression approaches[12]. But they are mainly used for linear regression and may not suit semiconductor device modeling, since devices are highly nonlinear. In this paper, we propose two numerical methods for semiconductor device modeling, single-pole denominator–numerator fitting (single-pole MRR) and double-pole MRR. They have strong nonlinear curve fitting ability and good numerical stability, both of which are critical in semiconductor device modeling.
2. Background
Many methods have been proposed for the curve fitting task. Suppose we have $P$-dimensional input attributes and one-dimensional outputs observed from the curve, i.e. data points with $x_i \in \mathbb{R}^P$, $y_i \in \mathbb{R}$, $P \in \mathbb{N}^+$. Ordinary least squares finds the coefficient vector that minimizes the residual norm,
$$w^* = \arg\min_w \left\| y - X^T w \right\|_2. \tag{1}$$
If $X^TX$ is invertible, the optimal solution of the coefficient $w$ is $w^* = (X^TX)^{-1}X^Ty$[1]. The LASSO is a shrinkage method for OLS, which adds an L1 penalty to the objective function,
$$w^* = \arg\min_w \left\| y - X^T w \right\|_2 + \lambda \sum_{j=1}^{P} |w_j|. \tag{2}$$
It can shrink some of the parameters to exactly zero if the positive penalty parameter $\lambda$ is large enough.
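For concreteness, a minimal numerical sketch of Eqs. (1) and (2) is given below (the synthetic data, variable names and regularization value are illustrative choices of ours, not taken from the cited references); the data matrix stores one sample per row, so $X^T w$ in the text corresponds to `X @ w` in code.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic example: 200 samples, P = 3 input attributes (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

# OLS closed form: w* = (X^T X)^{-1} X^T y, valid when X^T X is invertible.
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# LASSO: the OLS objective plus an L1 penalty that can shrink coefficients to zero.
w_lasso = Lasso(alpha=0.05, fit_intercept=False).fit(X, y).coef_

print(w_ols, w_lasso)
```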
Compared with these linear models, the rational model $y = X^T w_{\rm n} / X^T w_{\rm d}$ has a much stronger nonlinear fitting ability. Its coefficients are obtained by solving
$$w^* = \arg\min_w \left\| y - \frac{N(X)}{D(X)} \right\|_2. \tag{3}$$
However, Eq. (3) cannot be solved directly in closed form as there are unknown parameters in the denominator.
Since the 1950s, considerable effort has been devoted to the development of methods for parameter extraction of rational functions. Levy, Sanathanan and Koerner, Lawrence and Rogers, and Stahl presented various techniques by posing linear least squares problems, and Pintelon and Guillaume analyzed these and several other techniques[13–16]. Vector Fitting (VF), introduced by Gustavsen and Semlyen and based on a partial-fraction basis, has been widely accepted as a robust modeling method for approximating frequency-domain responses[17–19].
In this work, we propose single-pole MRR and double-pole MRR methods based on vector fitting, which are well suited to nonlinear curve fitting tasks. The power of MRR is demonstrated by numerical examples involving an artificially created two-variable function, the SMIC 40 nm NMOS DC characteristics, a CNT-FET, and an LNA performance model.
3. Algorithm
3.1 Single-pole MRR
Consider the rational function approximation,
$$y = \frac{N(X)}{D(X)} = \frac{w_0 + X^T w_{\rm n}}{1 - X^T w_{\rm d}}, \tag{4}$$
where $X \in \mathbb{R}^P$ is the vector of input attributes, $w_0$ is a scalar, and $w_{\rm n}, w_{\rm d} \in \mathbb{R}^P$ are the numerator and denominator coefficient vectors.
We solve Eq. (4) by transforming it into an iterated OLS problem. The coefficients are defined by
$$w_{\rm n}^*, w_{\rm d}^* = \arg\min_{w_{\rm n}, w_{\rm d}} \left\| y - \frac{w_0 + X^T w_{\rm n}}{1 - X^T w_{\rm d}} \right\|_2. \tag{5}$$
Suppose that during the iteration we have a sequence of denominator estimates $\{w_{\rm d}^t\},\ t = 0, 1, \ldots$, with $w_{\rm d}^0 = 0$. Given $w_{\rm d}^{t-1}$ from the previous step, we write $w_{\rm d}^t = w_{\rm d}^{t-1} + w_{\rm d}^+$ and multiply the residual of step $t$, $\left\| y - \frac{w_0 + X^T w_{\rm n}^t}{1 - X^T w_{\rm d}^t} \right\|_2$, by the weighting factor $\left\| \frac{1 - X^T w_{\rm d}^t}{1 - X^T w_{\rm d}^{t-1}} \right\|_2$, which gives
$$\begin{split}
& \left\| \frac{1 - X^T w_{\rm d}^t}{1 - X^T w_{\rm d}^{t-1}} \right\|_2 * \arg\min_{w_{\rm n}, w_{\rm d}} \left\| y - \frac{w_0 + X^T w_{\rm n}^t}{1 - X^T w_{\rm d}^t} \right\|_2 \\
& \qquad = \arg\min_{w_{\rm n}, w_{\rm d}} \left\| \left( 1 - \frac{X^T w_{\rm d}^+}{1 - X^T w_{\rm d}^{t-1}} \right) \left( y - \frac{w_0 + X^T w_{\rm n}^t}{1 - X^T w_{\rm d}^t} \right) \right\|_2 \\
& \qquad = \arg\min_{w_{\rm n}, w_{\rm d}} \left\| y - \frac{X^T w_{\rm d}^+ \, y}{1 - X^T w_{\rm d}^{t-1}} - \frac{w_0 + X^T w_{\rm n}^t}{1 - X^T w_{\rm d}^{t-1}} \right\|_2 \\
& \qquad = \arg\min_{w_{\rm n}, w_{\rm d}} \left\| y - \frac{w_0 + X^T w_{\rm n}^t + X^T y \, w_{\rm d}^+}{1 - X^T w_{\rm d}^{t-1}} \right\|_2 .
\end{split} \tag{6}$$
Here, in the last equation, a new OLS problem is formed whose feature vector is
$$\left[ \frac{1}{1 - X^T w_{\rm d}^{t-1}},\ \frac{X^T}{1 - X^T w_{\rm d}^{t-1}},\ \frac{X^T y}{1 - X^T w_{\rm d}^{t-1}} \right]$$
and whose coefficient vector is $\left[ w_0^t,\ w_{\rm n}^t,\ w_{\rm d}^+ \right]$. Solving this OLS problem yields $w_{\rm d}^+$, and the denominator coefficients are updated as $w_{\rm d}^t = w_{\rm d}^{t-1} + w_{\rm d}^+$. The iteration is repeated with the new $w_{\rm d}^t$ until the sequence of $w_{\rm d}$ converges. When $w_{\rm d}^t$ converges to a constant vector $C$, we have $w_{\rm d}^+ = w_{\rm d}^t - w_{\rm d}^{t-1} = 0$ and $\left\| \frac{1 - X^T w_{\rm d}^t}{1 - X^T w_{\rm d}^{t-1}} \right\|_2 = 1$, so the weighting factor disappears and Eq. (6) reduces to the original problem of Eq. (5). The converged coefficients are then taken as $w_{\rm n}^*, w_{\rm d}^*$ of Eq. (4).
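The derivation above maps directly onto an iterated least-squares loop. The following Python sketch is our own minimal implementation of single-pole MRR under that reading (the function names, convergence test and data layout are our choices, not part of the original formulation); `X` holds one sample per row, e.g. polynomial features of the device voltages, so $X^T w$ in the text becomes `X @ w` here.

```python
import numpy as np

def single_pole_mrr(X, y, n_iter=30, tol=1e-12):
    """Iterated-OLS fit of y ~ (w0 + X w_n) / (1 - X w_d), following Eq. (6).

    X : (m, P) matrix of input features; y : (m,) observed outputs.
    Returns (w0, w_n, w_d).
    """
    m, P = X.shape
    w_d = np.zeros(P)                        # w_d^0 = 0
    w0, w_n = 0.0, np.zeros(P)
    for _ in range(n_iter):
        denom = 1.0 - X @ w_d                # 1 - X^T w_d^{t-1}, one value per sample
        # Linearized feature matrix of Eq. (6): columns [1, X, y*X],
        # each row divided by the previous denominator.
        A = np.hstack([np.ones((m, 1)), X, y[:, None] * X]) / denom[:, None]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        w0, w_n, w_d_plus = coef[0], coef[1:P + 1], coef[P + 1:]
        w_d = w_d + w_d_plus                 # w_d^t = w_d^{t-1} + w_d^+
        if np.linalg.norm(w_d_plus) < tol:   # converged: weighting factor -> 1
            break
    return w0, w_n, w_d

def mrr_predict(X, w0, w_n, w_d):
    """Evaluate the fitted single-pole rational model of Eq. (4)."""
    return (w0 + X @ w_n) / (1.0 - X @ w_d)
```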
3.2 Double-pole MRR
Consider the rational function approximation,
$$y = \frac{N(X)}{D(X)} = \frac{w_{01} + X^T w_{\rm n1}}{1 - X^T w_{\rm d1}} + \frac{w_{02} + X^T w_{\rm n2}}{1 - X^T w_{\rm d2}}. \tag{7}$$
Though we can still multiply the residual by a factor $\left\| \frac{1 - X^T w_{\rm d1}^t}{1 - X^T w_{\rm d1}^{t-1}} * \frac{1 - X^T w_{\rm d2}^t}{1 - X^T w_{\rm d2}^{t-1}} \right\|_2$ as in the single-pole case, the cross term $X^T w_{\rm d1} X^T w_{\rm d2}$ then appears and the problem can no longer be linearized into an OLS problem. Instead, at step $t$ we require
$$\begin{split}
& \frac{w_{01}^t + X^T w_{\rm n1}^t}{1 - X^T w_{\rm d1}^{t-1}} + \frac{w_{02}^t + X^T w_{\rm n2}^t}{1 - X^T w_{\rm d2}^{t-1}} \\
& \quad = \left( 1 - \frac{X^T w_{\rm d1}^+}{1 - X^T w_{\rm d1}^{t-1}} - \frac{X^T w_{\rm d2}^+}{1 - X^T w_{\rm d2}^{t-1}} \right) y,
\end{split} \tag{8}$$
$$\frac{w_{01}^t + X^T w_{\rm n1}^t}{1 - X^T w_{\rm d1}^{t-1}} + \frac{w_{02}^t + X^T w_{\rm n2}^t}{1 - X^T w_{\rm d2}^{t-1}} + \frac{X^T y \, w_{\rm d1}^+}{1 - X^T w_{\rm d1}^{t-1}} + \frac{X^T y \, w_{\rm d2}^+}{1 - X^T w_{\rm d2}^{t-1}} = y. \tag{9}$$
Here $w_{\rm d1}^{t-1} + w_{\rm d1}^+ = w_{\rm d1}^t$ and $w_{\rm d2}^{t-1} + w_{\rm d2}^+ = w_{\rm d2}^t$. Eq. (9) is again an OLS problem whose feature vector is
$$\left[ \frac{1}{1 - X^T w_{\rm d1}^{t-1}},\ \frac{X^T}{1 - X^T w_{\rm d1}^{t-1}},\ \frac{1}{1 - X^T w_{\rm d2}^{t-1}},\ \frac{X^T}{1 - X^T w_{\rm d2}^{t-1}},\ \frac{X^T y}{1 - X^T w_{\rm d1}^{t-1}},\ \frac{X^T y}{1 - X^T w_{\rm d2}^{t-1}} \right]$$
and whose coefficient vector is $\left[ w_{01}^t,\ w_{\rm n1}^t,\ w_{02}^t,\ w_{\rm n2}^t,\ w_{\rm d1}^+,\ w_{\rm d2}^+ \right]$. Solving it yields $w_{\rm d1}^+$ and $w_{\rm d2}^+$, from which $w_{\rm d1}, w_{\rm d2}$ are updated, and the iteration is repeated until $w_{\rm d1}, w_{\rm d2}$ converge. As in the single-pole case, once $w_{\rm d1}, w_{\rm d2}$ converge we have $w_{\rm d1}^+ = w_{\rm d2}^+ = 0$ and Eq. (9) reduces to the original approximation problem of Eq. (7). Unlike single-pole MRR, where $w_{\rm d}^0 = 0$ is a natural starting point, the two denominators must be initialized with different values; otherwise the two partial fractions coincide and the OLS feature matrix becomes rank-deficient.
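A single iteration of the double-pole update can be sketched in the same style (again our own illustrative code, with the same caveats as for the single-pole sketch); note that `w_d1` and `w_d2` must be passed in with different values, e.g. zeros and a small random vector, to avoid the rank deficiency discussed above.

```python
import numpy as np

def double_pole_mrr_step(X, y, w_d1, w_d2):
    """One iteration of the double-pole update of Eq. (9) (illustrative sketch)."""
    m, P = X.shape
    d1 = 1.0 - X @ w_d1                                 # 1 - X^T w_d1^{t-1}
    d2 = 1.0 - X @ w_d2                                 # 1 - X^T w_d2^{t-1}
    # Feature columns in the same order as the text:
    # [1/d1, X/d1, 1/d2, X/d2, y*X/d1, y*X/d2].
    A = np.hstack([
        1.0 / d1[:, None], X / d1[:, None],             # -> w_01, w_n1
        1.0 / d2[:, None], X / d2[:, None],             # -> w_02, w_n2
        (y[:, None] * X) / d1[:, None],                 # -> w_d1^+
        (y[:, None] * X) / d2[:, None],                 # -> w_d2^+
    ])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    w_01, w_n1 = coef[0], coef[1:P + 1]
    w_02, w_n2 = coef[P + 1], coef[P + 2:2 * P + 2]
    w_d1_plus, w_d2_plus = coef[2 * P + 2:3 * P + 2], coef[3 * P + 2:]
    # Return the updated denominators w_d^t = w_d^{t-1} + w_d^+.
    return w_01, w_n1, w_02, w_n2, w_d1 + w_d1_plus, w_d2 + w_d2_plus
```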
3.3 Data preprocessing method
3.3.1 Normalization
The key step of MRR is accurately solving the over-determined least-squares problem arising from Eq. (7). However, because the condition number of such a problem is poor, solving the normal equation has reduced numerical stability and may result in large errors in the solution. Besides, if there are orders-of-magnitude differences between y and x, the feature matrix A may become rank-deficient, which also degrades the solution. To circumvent these cases, normalizing the input attributes x and the target output y before the parameter extraction procedure is usually of great help.
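The exact scaling is not critical; as one plausible choice consistent with the description above (the paper does not specify which normalization it uses), a min-max normalization that also returns the quantities needed to undo the scaling could look as follows.

```python
import numpy as np

def minmax_normalize(A):
    """Scale columns (or a 1-D target) to [0, 1]; also return (lo, span) so
    the fitted model's predictions can be mapped back to the original units."""
    A = np.asarray(A, dtype=float)
    lo, hi = A.min(axis=0), A.max(axis=0)
    span = np.where(hi - lo == 0.0, 1.0, hi - lo)   # guard constant columns
    return (A - lo) / span, (lo, span)

# Typical usage before parameter extraction:
# X_n, (x_lo, x_span) = minmax_normalize(X)
# y_n, (y_lo, y_span) = minmax_normalize(y)
```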
3.3.2 Logarithm transformation
Semiconductor device I–V characteristic values may be very small and vary over a wide range; e.g., the Id of an NMOS-FET can be as small as 1 × 10⁻¹³ A. Fitting the raw current lets the largest values dominate the squared error, so taking the logarithm of the output before parameter extraction, and exponentiating the fitted model afterwards, compresses this range and treats all operating regions more evenly.
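A minimal sketch of this transformation (the current values shown are hypothetical) is:

```python
import numpy as np

# Hypothetical drain-current samples spanning many orders of magnitude (in A).
i_d = np.array([3.2e-13, 8.5e-10, 4.1e-7, 2.3e-5, 9.8e-4])

# Fit the model to log10(Id) instead of Id so that small and large currents
# contribute comparably to the squared error ...
y_train = np.log10(i_d)

# ... and undo the transform when evaluating the fitted model.
i_d_recovered = 10.0 ** y_train
```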
4. Experiment
4.1 Fitting an artificially created function
To illustrate the validity of the proposed method, we first consider an artificially created two-variable function; the original surface, the single-pole MRR fitting result and the step NMSEs of both MRR methods are shown in Fig. 1.
Figure 1. (Color online) The original function and MRR fitting function. (a) Original artificial function. (b) Single-pole MRR fitting result. (c) Step NMSEs of single-pole MRR. (d) Step NMSEs of double-pole MRR.
4.2 Fitting I–V characteristics of SMIC 40 nm NMOS-FET
DC characteristics of the NMOS-FET are used to show the performance of MRR, compared with OLS and LASSO. BSIM has been widely used in industry for years and, to some extent, represents the authentic physical behavior of the NMOS-FET, so we use Cadence and SPICE to obtain DC simulation data of an SMIC 40 nm NMOS-FET with a channel width of 1 μm and a channel length of 40 nm. In this case we choose two independent variables, Vd and Vg, and the dependent variable is Id.
4.2.1 Analysis of fitting result
Fig. 2 shows the fitting results of the different algorithms on this dataset. We plot the Id–Vd and Id–Vg curves to show the fitting performance, and Table 1 lists the number of parameters and the NMSE of each fit. Single-pole MRR with a sextic polynomial, double-pole MRR with a quartic polynomial and OLS with a nonic polynomial are used for contrast, so that the number of parameters in each model is close. Because of its regularization property, we choose the nonic polynomial for LASSO to include as many meaningful features as possible; the optimal regularization parameter is selected by 10-fold cross validation.
Table 1. NMSE and number of parameters of the different algorithms.
Algorithm | Single-pole MRR (sextic polynomial) | Double-pole MRR (quartic polynomial) | OLS (nonic polynomial) | LASSO (nonic polynomial)
Number of parameters | 55 | 58 | 55 | 36
NMSE | 3.0157 × 10⁻⁸ | 2.631 × 10⁻⁶ | 4.357 × 10⁻⁴ | 5.082 × 10⁻⁴
Figure 2. (Color online) The fitting results of different algorithms. (a) Id–Vd curve by single-pole MRR. (b) Id–Vg curve by single-pole MRR. (c) Id–Vd curve by double-pole MRR. (d) Id–Vg curve by double-pole MRR. (e) Id–Vd curve by OLS. (f) Id–Vg curve by OLS. (g) Id–Vd curve by LASSO. (h) Id–Vg curve by LASSO.
Figure 3. (Color online) Zoom-in pictures of Id–Vg curve fitting by different algorithms. (a) Zoom-in of Id–Vg curve by single-pole MRR, Vg ranges from 0.15 to 0.45 V. (b) Zoom-in of Id–Vg curve by single-pole MRR, Vg ranges from 0.65 to 1.1 V. (c) Zoom-in of Id–Vg curve by double-pole MRR, Vg ranges from 0.15 to 0.45 V. (d) Zoom-in of Id–Vg curve by double-pole MRR, Vg ranges from 0.65 to 1.1 V. (e) Zoom-in of Id–Vg curve by OLS, Vg ranges from 0.15 to 0.45 V. (f) Zoom-in of Id–Vg curve by OLS, Vg ranges from 0.65 to 1.1 V. (g) Zoom-in of Id–Vg curve by LASSO, Vg ranges from 0.15 to 0.45 V. (h) Zoom-in of Id–Vg curve by LASSO, Vg ranges from 0.65 to 1.1 V.
Table 1 details the NMSE of the four algorithms. As the data values are very small, the normalized mean square error (NMSE) of Eq. (10) is used, where $t$ denotes the observed data, $y$ the model prediction and $K$ the number of data points:
$$\mathrm{NMSE} = \frac{1}{K} \sum_{i=1}^{K} \left[ \frac{t_i}{\max(t)} - \frac{y_i}{\max(t)} \right]^2. \tag{10}$$
Single-pole MRR with the sextic polynomial achieves the lowest NMSE of 3.0157 × 10⁻⁸.
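Eq. (10) translates directly into a short helper function (a sketch; the argument names are ours):

```python
import numpy as np

def nmse(t, y):
    """Normalized mean square error of Eq. (10): both the observed data t and
    the prediction y are scaled by max(t) before averaging the squared error."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    scale = np.max(t)
    return np.mean(((t - y) / scale) ** 2)
```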
Although single-pole MRR reaches the minimum NMSE in these results, double-pole MRR uses a lower-degree (cubic) polynomial and the fewest parameters to reach the second-best fit. In fact, it is the best model among the low-degree polynomials (degree 1, 2, 3); double-pole MRR with a cubic polynomial reaches an NMSE of 1.593 × 10⁻⁵. This is very helpful when the number of input attributes is large and a polynomial of high degree cannot be used due to the limit of CPU memory, e.g. a sextic polynomial of 12 input attributes requires 1237 GB of CPU memory.
4.2.2 Analysis of convergence
The convergence of the algorithm is critical, since our parameter extraction method is based on iteration. As analyzed above, the stability of every least squares problem is improved after data normalization. Single-pole MRR shows very good numerical stability for polynomials of all degrees, while double-pole MRR shows good numerical stability only for polynomials of low degree; when the polynomial degree is high, double-pole MRR may suffer from rank deficiency and its numerical stability decreases. We record the sequential NMSE of the two MRR iterations and plot the result in Fig. 4 on a log scale. Fig. 4(a) shows the step NMSEs of the globally optimal fitting results for the 40 nm MOSFET, where single-pole MRR adopts a sextic polynomial and double-pole MRR adopts a quartic polynomial. Fig. 4(b) shows the step NMSEs of the two MRR methods when both adopt a quadratic polynomial; the best NMSEs reached with the quadratic polynomial can be read from this figure.
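The step-NMSE curves of Fig. 4 can be reproduced by recording the NMSE of Eq. (10) after every iteration; a sketch based on the single-pole loop above (our own code, with illustrative names) is:

```python
import numpy as np

def single_pole_mrr_step_nmse(X, y, n_iter=30):
    """Run the single-pole MRR iteration and record the NMSE after every step."""
    m, P = X.shape
    w_d = np.zeros(P)
    history = []
    for _ in range(n_iter):
        denom = 1.0 - X @ w_d
        A = np.hstack([np.ones((m, 1)), X, y[:, None] * X]) / denom[:, None]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        w0, w_n, w_d = coef[0], coef[1:P + 1], w_d + coef[P + 1:]
        y_hat = (w0 + X @ w_n) / (1.0 - X @ w_d)
        history.append(np.mean(((y - y_hat) / np.max(y)) ** 2))  # Eq. (10)
    return history
```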
4.3 Fitting CNT-FET
In this section, we apply single-pole MRR to CNT-FET DC behavior modeling. Fig. 5 visualizes the measured Id of the CNT-FET as a function of Vg and Vd. Fig. 6 shows the fitted |Id|–|Vd| and |Id|–|Vg| curves together with the step NMSEs of both MRR methods.
Figure 4. (Color online) Step NMSEs of two MRR methods in different polynomial degrees. (a) Step NMSEs of MRR methods in the situation of a polynomial of high degree. (b) Step NMSEs of MRR methods in the situation of a polynomial of low degree.
The normalized mean square error (NMSE) of Id for the CNT-FET fit is given by the step-NMSE curves in Figs. 6(e) and 6(f).
Figure 5. (Color online) The observed CNT-FET data point surface and MRR fitting surface. (a) Original data of CNT-FET. (b) Single-pole MRR fitting result.
Figure 6. (Color online) |Id|–|Vd|, |Id|–|Vg| curves and step NMSEs by MRR methods. (a) |Id|–|Vd| curve by single-pole MRR. (b) |Id|–|Vg| curve by single-pole MRR. (c) Zoom-in |Id|–|Vg| curve, Vg ranges from 0 to 0.8 V. (d) Zoom-in |Id|–|Vg| curve, Vg ranges from 1 to 2 V. (e) Step NMSEs of single-pole MRR. (f) Step NMSEs of double-pole MRR.
4.4 Fitting LNA performance model
In this section we apply MRR methods to model LNA performance characteristics and demonstrate our algorithm on multi-variable (more than two input variables) regression. Generally, a given circuit performance is obtained by solving the KCL and KVL equations. The solution is accurate but very time-consuming. For behavioral simulation and optimization of an RF circuit, a direct mathematical mapping between design parameters and performance is highly useful.
We chose an LNA working at 4 GHz and used Cadence to obtain simulation data for modeling. Figs. 7(a)–7(c) show the fitting results of three LNA indicators, NF (noise factor), gain and power, by single-pole MRR. Figs. 7(d)–7(f) are the step NMSEs of single-pole MRR and Figs. 7(g)–7(i) are the step NMSEs of double-pole MRR; the performance of the two MRR methods is close. We sort the sample points in increasing order and record the index sequence, then plot the data points, predictions and errors according to the sorted index sequence in one figure. We can see that the training result is good. The NMSEs of NF, gain and power are 0.0031, 0.00710, and 3.5887 × 10⁻⁷, respectively.
Figure 7. (Color online) LNA performance models fitted by MRR methods. (a) Noise factor fitting by single-pole MRR. (b) Gain fitting by single-pole MRR. (c) Power fitting by single-pole MRR. (d) Step NMSEs of single-pole MRR for noise factor. (e) Step NMSEs of single-pole MRR for gain. (f) Step NMSEs of single-pole MRR for power. (g) Step NMSEs of double-pole MRR for noise factor. (h) Step NMSEs of double-pole MRR for gain. (i) Step NMSEs of double-pole MRR for power.
5. Conclusion
This paper proposes a family of numerical methods, MRR, to approximate an unknown system and extract model parameters. We first use single-pole MRR to fit an artificial function, and the result is extremely good. Then we compare the performance of single-pole MRR, double-pole MRR, OLS and LASSO on the SMIC 40 nm NMOS-FET DC characteristics dataset. The results show that single-pole MRR has the highest fitting precision and that double-pole MRR performs better than single-pole MRR when only low-degree polynomials can be used. The MRR methods have a more powerful nonlinear curve fitting ability than OLS and LASSO and are shown to be numerically stable. CNT-FET and LNA performance indicators are also modeled, and their fitting results are good as well. There are, however, two key points in using MRR methods. First, users have to pay close attention to the numerical stability of the MRR methods; we carried out one artificial-function fitting task and three device-model fitting tasks for the convergence analysis of the two MRR methods, and the results show that single-pole MRR has better numerical stability than double-pole MRR. Second, the dataset used for curve fitting should be well-distributed and not sparse; MRR methods are powerful in fitting highly nonlinear functions but can also overfit if the dataset is ill-distributed. Our results show that the MRR methods are good choices for statistical modeling of semiconductor devices as well as other highly nonlinear curve fitting tasks.