Abstract In this paper, based on physics-informed neural networks (PINNs), a deep learning neural network framework that can effectively solve nonlinear evolution partial differential equations (PDEs) and other types of nonlinear physical models, we study the nonlinear Schrödinger equation (NLSE) with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, an important physical model in many fields of nonlinear physics. Firstly, we choose three different initial values together with the same Dirichlet boundary conditions and solve the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via the PINN deep learning method, comparing the obtained results with those derived by traditional numerical methods. Then, we investigate the effects of two factors (optimization steps and activation functions) on the performance of the PINN deep learning method for this equation. Ultimately, the data-driven coefficients of the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, or of the dispersion and nonlinear terms of the NLSE with this potential, can be approximately ascertained by using the PINN deep learning method. Our results may be meaningful for further investigation of the nonlinear Schrödinger equation with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via deep learning. Keywords: nonlinear Schrödinger equation; generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential; physics-informed neural networks; deep learning; initial value and Dirichlet boundary conditions; data-driven coefficient discovery
Jiaheng Li, Biao Li. Solving forward and inverse problems of the nonlinear Schrödinger equation with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via PINN deep learning. Communications in Theoretical Physics, 2021, 73(12): 125001. doi:10.1088/1572-9494/ac2055
1. Introduction
With the rapid development of modern science and technology, many scientific fields and areas of everyday life are producing data at an exponential rate. At present, and for a long time to come, extracting effective information from these huge amounts of data is a major and significant challenge. Recently, some scholars have used machine learning (deep learning) and data analysis to deal with such data, and good results have been achieved in fields including but not limited to data mining, image recognition, cognitive science, genetics, product recommendation systems, natural language processing (NLP), autonomous driving, stock market trading [1–6] and solving equations [7–19]. Different scientific fields may each have unique challenges, but obtaining good results with a small amount of data, or with noisy data, has long been a common problem across all fields that has not been effectively solved. As is well known, accurate and adequate data are critical for deep learning training; a lack of sample data, or the use of erroneous data, leads to poor robustness and usually makes the results of deep learning unsatisfactory. In addition, for solving higher-dimensional nonlinear partial differential equations (PDEs) [18, 19], conventional numerical methods have long been a challenge: finite difference methods become infeasible in higher dimensions due to the explosion in the number of grid points and the demand for reduced time step size. These limitations therefore urge us to develop better deep learning neural network frameworks to improve the accuracy and robustness of solving such linear or nonlinear mathematical physical models.
In recent years, thanks to continuous research by many scholars, new deep learning neural network frameworks [16, 18] and their optimizations [20–22] for solving PDEs have been put forward. The prospect of using structured prior information to build data-efficient, physics-informed machine learning models has been preliminarily demonstrated [9–11]. Using Gaussian process regression to design a functional representation for a given linear operator allows us to accurately infer solutions and provide uncertainty estimates for several classical problems in mathematical physics [12]. An extension to nonlinear problems involving inference and system identification has been proposed [13, 14]. Despite the flexibility and mathematical elegance of Gaussian processes in encoding prior information, their treatment of nonlinear problems introduces two important limitations. Firstly, the user has to locally linearize any potential nonlinear term in time, which limits the applicability of the method in the discrete time domain and affects the accuracy of predictions in strongly nonlinear regions. Secondly, the Bayesian nature of Gaussian process regression requires certain prior assumptions, which may limit the model's representation ability and cause robustness problems, especially for nonlinear problems. The PINN deep learning method [16], however, does a good job of avoiding these limitations. Additionally, it is worth noting that, based on some excellent properties of PINNs, interesting results have been achieved by several groups, mainly on nonlinear wave solutions of nonlinear partial differential equations [23–29].
Generally, the PINN deep learning method can be used to learn the latent (hidden) solution $\hat{w}(t,x)$ of nonlinear evolution PDEs of the following general form [16]:$\begin{eqnarray}{w}_{t}-{ \mathcal N }[x,w;\lambda ]=0,\quad (t,x)\in [0,T]\times {\rm{\Omega }}.\end{eqnarray}$Specifically, in this paper, we consider nonlinear evolution PDEs of the general form with initial value and boundary conditions as follows:$\begin{eqnarray}\left\{\begin{array}{ll}{\rm{i}}{w}_{t}={ \mathcal N }[x,w;{\lambda }_{0}], & (t,x)\in [0,T]\times {\rm{\Omega }},\\ { \mathcal I }[w(t,x)]{| }_{t=0}={w}_{{ \mathcal I }}(x), & x\in {\rm{\Omega }}\ ({\rm{initial}}\ {\rm{value}}\ {\rm{conditions}}),\\ { \mathcal B }[w(t,x)]{| }_{x\in \partial {\rm{\Omega }}}={w}_{{ \mathcal B }}(t), & t\in [0,T]\ ({\rm{boundary}}\ {\rm{conditions}}),\end{array}\right.\end{eqnarray}$where 0 and T represent the lower and upper bounds of the time variable t, respectively, Ω stands for the domain of the spatial variable x, ∂Ω is the boundary of the spatial domain Ω, and the latent solution $\hat{w}(t,x)$ is of course unknown. ${ \mathcal N }[\bullet ;{\lambda }_{0}]$ is a combination of linear and nonlinear operators parameterized by the vector λ0, ${ \mathcal I }[\bullet ]$ and ${ \mathcal B }[\bullet ]$ are the initial value and boundary operators, and ${ \mathcal I }[w(t,x)]{| }_{t\,=\,0}={w}_{{ \mathcal I }}(x)$ and ${ \mathcal B }[w(t,x)]{| }_{x\in \partial {\rm{\Omega }}}={w}_{{ \mathcal B }}(t)$ stand for the initial value conditions and the boundary conditions, respectively. In this paper, we set up a complex-valued (multi-output) physics model f(t, x) as follows:$\begin{eqnarray}f(t,x)={\rm{i}}{w}_{t}-{ \mathcal N }[x,w;{\lambda }_{0}].\end{eqnarray}$
Using the automatic differentiation technique [30], a chain-rule-based derivative technique widely used in the back propagation (BP) [31] of feed-forward neural networks, the derivatives of the latent solution $\hat{w}(t,x)$ with respect to the time variable t and the space variable x can be obtained quickly within the neural network. It is worth noting that this technique for calculating derivatives is superior to traditional numerical or symbolic differentiation. To ensure that the BP, automatic differentiation and related operations in the complex-valued PINN deep learning method perform well, we use TensorFlow [32], a relatively mature, mainstream, open-source deep learning library for scientific computing. Based on the deep learning experimental results (shown in Section 4.2), we select the hyperbolic tangent (tanh) as our nonlinear activation function:$\begin{eqnarray}{H}_{j}=\tanh ({\omega }_{j}\cdot {H}_{j-1}+{b}_{j}),\end{eqnarray}$where the weight ωj is a dim(Hj) × dim(Hj−1) matrix, the bias bj is a dim(Hj)-dimensional vector, Hj is the output vector of the jth layer, and the subscript j runs over the layers, j = 1, 2, 3, ⋯, n.
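The layer recursion (4) is straightforward to sketch outside of TensorFlow; the following minimal NumPy version (layer sizes and weight values here are illustrative, not taken from the paper's network) shows how each hidden state is produced from the previous one:

```python
import numpy as np

def forward(H0, weights, biases):
    """Forward pass of a fully connected tanh network: H_j = tanh(w_j . H_{j-1} + b_j)."""
    H = H0
    for w, b in zip(weights, biases):
        H = np.tanh(w @ H + b)
    return H

# a tiny 2 -> 3 -> 2 network with random fixed weights, for illustration only
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((2, 3))]
biases = [np.zeros(3), np.zeros(2)]
out = forward(np.array([0.5, -0.5]), weights, biases)
```

Since tanh maps into (−1, 1), every component of each hidden state (and of the final output) is strictly bounded in magnitude by 1; in the actual PINNs the inputs H0 are the sampled (t, x) pairs.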
Moreover, the latent solution $\hat{w}(t,x)$ can be learned by minimizing the mean squared error loss:$\begin{eqnarray}\begin{array}{rcl}L(\hat{w})&=&\displaystyle \frac{1}{{N}_{{ \mathcal I }}}\sum _{j=1}^{{N}_{{ \mathcal I }}}{\left|\ { \mathcal I }[\hat{w}({t}_{{ \mathcal I }},{x}_{{ \mathcal I }}^{j})]{| }_{{t}_{{ \mathcal I }}=0}-{w}_{{ \mathcal I }}({x}_{{ \mathcal I }}^{j})\ \right|}^{2}\\ & & +\ \displaystyle \frac{1}{{N}_{{ \mathcal B }}}\sum _{j=1}^{{N}_{{ \mathcal B }}}{\left|\ { \mathcal B }[\hat{w}({t}_{{ \mathcal B }}^{j},{x}_{{ \mathcal B }})]{| }_{{x}_{{ \mathcal B }}\in \partial {\rm{\Omega }}}-{w}_{{ \mathcal B }}({t}_{{ \mathcal B }}^{j})\ \right|}^{2}\\ & & +\ \displaystyle \frac{1}{{N}_{{ \mathcal C }}}\sum _{j=1}^{{N}_{{ \mathcal C }}}{\left|\ f({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j})\ \right|}^{2},\end{array}\end{eqnarray}$where $\{{x}_{{ \mathcal I }}^{j},{w}_{{ \mathcal I }}^{j}\}{}_{j=1}^{{N}_{{ \mathcal I }}}$ and $\{{t}_{{ \mathcal B }}^{j},{w}_{{ \mathcal B }}^{j}\}{}_{j=1}^{{N}_{{ \mathcal B }}}$ stand for the initial and boundary value training sets, respectively, and $\{{t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j},f({x}_{{ \mathcal C }}^{j},{t}_{{ \mathcal C }}^{j})\}{}_{j=1}^{{N}_{{ \mathcal C }}}$ represents the collocation points of f(t, x). The loss function $L(\hat{w})$ measures how well the latent solution $\hat{w}(t,x)$ satisfies the initial value conditions and boundary conditions, and penalizes the PDE for not being satisfied on the collocation points. In addition, all sample points are generated by the classical space-filling Latin hypercube sampling (LHS) technique [33], and the optimization method for all loss functions is the L-BFGS algorithm [34], a mainstream full-batch gradient descent optimization method.
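A compact sketch of how the three mean-squared terms in (5) are assembled (pure NumPy, with placeholder arrays rather than the paper's training data):

```python
import numpy as np

def pinn_loss(w_pred_I, w_data_I, w_pred_B, w_data_B, f_pred_C):
    """Mean squared error loss (5): initial-value misfit + boundary misfit + PDE residual.
    Arrays may be complex; np.abs handles the modulus in each term."""
    loss_I = np.mean(np.abs(w_pred_I - w_data_I) ** 2)
    loss_B = np.mean(np.abs(w_pred_B - w_data_B) ** 2)
    loss_C = np.mean(np.abs(f_pred_C) ** 2)
    return loss_I + loss_B + loss_C

z = np.zeros(8)
perfect = pinn_loss(z, z, z, z, z)       # exact data fit and zero residual
off = pinn_loss(np.ones(8), z, z, z, z)  # a unit initial-value misfit
```

In the actual method these three terms are evaluated on the $N_{\mathcal I}$ initial points, $N_{\mathcal B}$ boundary points and $N_{\mathcal C}$ collocation points, and the sum is handed to the L-BFGS optimizer.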
Using various techniques and strategies, our ultimate goal is to obtain the latent solution $\hat{w}(t,x)$ by driving the loss function $L(\hat{w})$ as close to zero as possible.
In summary, the main steps of the PINN deep learning scheme [16] for solving nonlinear evolution PDEs with initial value and boundary conditions comprise three broad ingredients: define the architecture of the feed-forward neural network by setting its depth (number of layers), width (number of neurons per layer) and activation function (4); prepare the initial value and boundary conditions as the training data sets and select random collocation point sets by the space-filling LHS algorithm [33]; and, after initializing the network parameters via Xavier initialization [35], minimize the loss function $L(\hat{w})$ (5) using the L-BFGS algorithm [34] and BP with the automatic differentiation technique [30] to determine the optimal parameters, namely the weights and biases $(\hat{\omega },\hat{b})$.
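The sampling and initialization ingredients can be sketched as follows. This is a hand-rolled NumPy stand-in for the LHS routine of [33] and for Xavier initialization [35] (the reference PINN code typically calls a library such as pyDOE for LHS); the point counts and layer sizes below are illustrative:

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """Space-filling LHS: one random point in each of n equal strata per dimension."""
    strata = rng.permuted(np.tile(np.arange(n), (d, 1)), axis=1).T  # (n, d) permutations
    return (strata + rng.random((n, d))) / n   # points in [0, 1)^d; rescale to [0, T] x Omega

def xavier_init(n_in, n_out, rng):
    """Xavier (Glorot) initialization: zero-mean normal with std sqrt(2 / (n_in + n_out))."""
    std = np.sqrt(2.0 / (n_in + n_out))
    return std * rng.standard_normal((n_in, n_out))

rng = np.random.default_rng(1)
pts = latin_hypercube(100, 2, rng)   # e.g. 100 collocation points in the (t, x) plane
W = xavier_init(50, 50, rng)         # one 50 x 50 hidden-layer weight matrix
```

The stratification is what distinguishes LHS from plain uniform sampling: projected onto either coordinate, the 100 points land in 100 distinct equal-width bins.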
Since ${ \mathcal P }{ \mathcal T }$ symmetry was proposed in quantum mechanics [36], the study of linear and nonlinear physical models with ${ \mathcal P }{ \mathcal T }$ symmetry has been continuous [37–43]. Recently, the investigation of nonlinear wave equations with the ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential has been an important subject [38, 40, 41]. To our knowledge, deep learning methods have not been used to study nonlinear wave equations with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential. Therefore, in this paper, we investigate the nonlinear Schrödinger equation (NLSE) with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via the PINN deep learning method.
The rest of the paper is arranged as follows. In Section 2, we introduce the scheme for solving the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via the PINN deep learning method. In Section 3, we use the PINNs to study the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential by setting diverse initial value conditions and the same Dirichlet boundary conditions. In Section 4, we investigate the effects of two factors (optimization steps and activation functions) on the performance of the PINN deep learning method for this equation. In Section 5, we use the PINN deep learning method for data-driven coefficient discovery of the ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, or of the dispersion and nonlinear terms of the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential. Finally, our conclusions and discussions are presented in Section 6.
2. The scheme of the PINNs for solving the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential
The nonlinear Schrödinger equation (NLSE) plays an important role both in integrable system theory and in many physical fields, including but not limited to matter physics, plasma physics, and nonlinear optics [44–47]. In 1997, Bang et al found bright spatial solitons in defocusing Kerr media supported by cascaded nonlinearities [48]. Then, Musslimani et al studied optical solitons in ${ \mathcal P }{ \mathcal T }$ periodic potentials [38], Shi et al obtained bright spatial solitons in defocusing Kerr media with ${ \mathcal P }{ \mathcal T }$-symmetric potentials [40], and Yan et al investigated spatial solitons and their stability in self-focusing and defocusing Kerr nonlinear media with generalized parity-time-symmetric Scarf-II potentials [41]. The NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential reads as follows [42]:$\begin{eqnarray}{\rm{i}}\displaystyle \frac{\partial \psi }{\partial t}=-\left[m\displaystyle \frac{{\partial }^{2}}{\partial {x}^{2}}+V(x)+{\rm{i}}W(x)+g| \psi {| }^{2}\right]\psi ,\end{eqnarray}$where ψ = ψ(t, x) denotes a complex field, m is a non-zero constant mass, and g characterizes the self-focusing (g > 0) or defocusing (g < 0) Kerr nonlinearity. V(x) and W(x) are the real and imaginary components of the complex ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, where V(x) is an even function and W(x) is odd. Physically, V(x) is associated with index guiding while W(x) represents the gain or loss distribution of the optical potential [40].
In this paper, we use the complex-valued PINN deep learning method shown above to study the data-driven solutions of the NLSE with the complex generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and the initial-boundary value conditions:$\begin{eqnarray}\begin{array}{l}{\rm{i}}{\psi }_{t}=-{\psi }_{{xx}}-[V(x)+{\rm{i}}W(x)]\psi -g| \psi {| }^{2}\psi ,\\ (t,x)\in [0,T]\times {\rm{\Omega }},\\ \psi (0,x)={\psi }_{{ \mathcal I }}(x),\\ x\in {\rm{\Omega }}\ ({\rm{initial}}\ {\rm{value}}\ {\rm{conditions}}),\\ \psi (t,-x)=\psi (t,x),\\ (t,x)\in [0,T]\times \partial {\rm{\Omega }}\ ({\rm{boundary}}\ {\rm{conditions}}),\end{array}\end{eqnarray}$where ψ = ψ(t, x) is a complex field, and V(x) and W(x) are real-valued functions of the space variable x. Based on the previous description, the real and imaginary components of the ${ \mathcal P }{ \mathcal T }$-symmetric potential satisfy the relations V(−x) = V(x) and W(−x) = −W(x), respectively. Additionally, we choose the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, which is important in nonlinear optical beam dynamics in ${ \mathcal P }{ \mathcal T }$-symmetric complex potentials [38]:$\begin{eqnarray}\begin{array}{l}V(x)={V}_{0}{{\rm{sech}} }^{2}(x),\\ W(x)={W}_{0}{\rm{sech}} (x)\tanh (x),\end{array}\end{eqnarray}$where V0 and W0 are the amplitudes of the real and imaginary parts, subject to the constraint condition ${W}_{0}\leqslant {V}_{0}+\tfrac{1}{4}$. The coefficient of the Kerr nonlinearity g can be chosen as g = ± 1. Moreover, both the ${ \mathcal P }{ \mathcal T }$-symmetric complex potential and the constant g have important physical significance for equation (7).
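The stated symmetries of (8) (V even, W odd) and the amplitude constraint are easy to verify numerically; a minimal sketch using the Case 1 amplitudes employed later in the paper:

```python
import numpy as np

V0, W0 = 1.0, 0.5   # Case 1 amplitudes; the constraint W0 <= V0 + 1/4 holds

def V(x):
    """Real part of the Scarf-II potential, V0 * sech^2(x) (even in x)."""
    return V0 / np.cosh(x) ** 2

def W(x):
    """Imaginary part, W0 * sech(x) * tanh(x) (odd in x)."""
    return W0 * np.tanh(x) / np.cosh(x)

x = np.linspace(-5.0, 5.0, 201)
even_ok = np.allclose(V(-x), V(x))    # V(-x) = V(x)
odd_ok = np.allclose(W(-x), -W(x))    # W(-x) = -W(x)
```

The same two-line check applies unchanged to the Case 2 amplitudes (V0 = 2.91, W0 = 0.3).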
Because $\hat{\psi }(t,x)$ is a latent complex-valued solution of equation (7), we set $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$, where $\hat{u}(t,x)$ and $\hat{v}(t,x)$ are the real and imaginary parts of $\hat{\psi }(t,x)$, respectively, both real-valued functions of t and x. We use the complex-valued PINNs f(t, x) (3) and let f(t, x) = ifu(t, x) − fv(t, x), where fu(t, x) and − fv(t, x) stand for the imaginary and real parts, respectively; f(t, x), fu(t, x) and fv(t, x) satisfy$\begin{eqnarray}\begin{array}{l}f(t,x)={\rm{i}}{\hat{\psi }}_{t}+{\hat{\psi }}_{{xx}}\\ +[V(x)+{\rm{i}}W(x)]\hat{\psi }+g| \hat{\psi }{| }^{2}\hat{\psi },\\ {f}_{u}(t,x)={\hat{u}}_{t}+{\hat{v}}_{{xx}}\\ +W(x)\hat{u}+V(x)\hat{v}+g({\hat{u}}^{2}+{\hat{v}}^{2})\hat{v},\\ {f}_{v}(t,x)={\hat{v}}_{t}-{\hat{u}}_{{xx}}\\ -V(x)\hat{u}+W(x)\hat{v}-g({\hat{u}}^{2}+{\hat{v}}^{2})\hat{u},\end{array}\end{eqnarray}$and we proceed by placing a complex-valued (multi-output) deep neural network prior on ψ(t, x) = [u(t, x), v(t, x)]. To this end, [u(t, x), v(t, x)] can be simply defined as follows.
# To obtain the real and imaginary parts of psi(t, x)=[u(t, x), v(t, x)]
def psi_uv(t, x):
    # forward pass of the feed-forward network (4); its two output neurons
    # are taken as the real part u and the imaginary part v of psi
    uv = neural_net(tf.concat([t, x], 1), weights, biases)
    u = uv[:, 0:1]
    v = uv[:, 1:2]
    return u, v
Correspondingly, the complex-valued (multi-output) PINNs f(t, x) = [fu(t, x), fv(t, x)] take the form
# To obtain the real and imaginary parts of f(t, x)=[f_u, f_v]
def f_uv(t, x, W, V, g):
    u, v = psi_uv(t, x)
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_xx = tf.gradients(u_x, x)[0]
    v_t = tf.gradients(v, t)[0]
    v_x = tf.gradients(v, x)[0]
    v_xx = tf.gradients(v_x, x)[0]
    f_u = u_t + v_xx + W * u + V * v + g * (u ** 2 + v ** 2) * v
    f_v = v_t - u_xx - V * u + W * v - g * (u ** 2 + v ** 2) * u
    return f_u, f_v
The shared parameters of the complex-valued PINNs f(t, x) and latent solution $\hat{\psi }(t,x)$ can be learned by minimizing the loss function (5)$\begin{eqnarray}L(\hat{\psi })={L}_{{ \mathcal I }}(\hat{\psi })+{L}_{{ \mathcal B }}(\hat{\psi })+{L}_{{ \mathcal C }}(\hat{\psi }),\end{eqnarray}$where ${L}_{{ \mathcal I }},{L}_{{ \mathcal B }},{L}_{{ \mathcal C }}$ are defined as follows:$\begin{eqnarray*}\begin{array}{l}{L}_{{ \mathcal I }}(\hat{\psi })=\displaystyle \frac{1}{{N}_{{ \mathcal I }}}\sum _{j=1}^{{N}_{{ \mathcal I }}}{\left|{ \mathcal I }[\hat{\psi }({t}_{{ \mathcal I }},{x}_{{ \mathcal I }}^{j})]{| }_{{t}_{{ \mathcal I }}=0}-{\psi }_{{ \mathcal I }}({x}_{{ \mathcal I }}^{j})\right|}^{2}\\ =\displaystyle \frac{1}{{N}_{{ \mathcal I }}}\sum _{j=1}^{{N}_{{ \mathcal I }}}\left({\left|\hat{u}(0,{x}_{0}^{j})-{u}_{0}({x}_{0}^{j})\right|}^{2}+{\left|\hat{v}(0,{x}_{0}^{j})-{v}_{0}({x}_{0}^{j})\right|}^{2}\right),\\ {L}_{{ \mathcal B }}(\hat{\psi })=\displaystyle \frac{1}{{N}_{{ \mathcal B }}}\sum _{j=1}^{{N}_{{ \mathcal B }}}{\left|{ \mathcal B }[\hat{\psi }({t}_{{ \mathcal B }}^{j},{x}_{{ \mathcal B }})]{| }_{{x}_{{ \mathcal B }}\in \partial {\rm{\Omega }}}-{\psi }_{{ \mathcal B }}({t}_{{ \mathcal B }}^{j})\right|}^{2}\\ =\displaystyle \frac{1}{{N}_{{ \mathcal B }}}\sum _{j=1}^{{N}_{{ \mathcal B }}}\left({\left|\hat{u}({t}_{{ \mathcal B }}^{j},-x)-\hat{u}({t}_{{ \mathcal B }}^{j},x)\right|}_{x\in \partial {\rm{\Omega }}}^{2}+{\left|\hat{v}({t}_{{ \mathcal B }}^{j},-x)-\hat{v}({t}_{{ \mathcal B }}^{j},x)\right|}_{x\in \partial {\rm{\Omega }}}^{2}\right),\\ {L}_{{ \mathcal C }}(\hat{\psi })=\displaystyle \frac{1}{{N}_{{ \mathcal C }}}\sum _{j=1}^{{N}_{{ \mathcal C }}}{\left|f({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j})\right|}^{2}=\displaystyle \frac{1}{{N}_{{ \mathcal C }}}\sum _{j=1}^{{N}_{{ \mathcal C }}}\left({\left|{f}_{u}({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j})\right|}^{2}+{\left|{f}_{v}({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j})\right|}^{2}\right),\end{array}\end{eqnarray*}$with $\hat{\psi }(t,x)$ standing for the approximate latent solution of equation (7), ${\left\{{x}_{{ \mathcal I }}^{j},{u}_{0}^{j},{v}_{0}^{j}\right\}}_{j=1}^{{N}_{{ \mathcal I }}}$ denoting the initial data sets (${\psi }_{{ \mathcal I }}(x)={u}_{0}(x)+{\rm{i}}{v}_{0}(x)$), ${\left\{{t}_{{ \mathcal B }}^{j},u{(\pm x,{t}_{{ \mathcal B }}^{j})}_{x\in \partial {\rm{\Omega }}},v{(\pm x,{t}_{{ \mathcal B }}^{j})}_{x\in \partial {\rm{\Omega }}}\right\}}_{j\,=\,1}^{{N}_{{ \mathcal B }}}$ denoting the boundary data sets, and $\left\{{t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j},{f}_{u}({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j}),{f}_{v}({t}_{{ \mathcal C }}^{j},{x}_{{ \mathcal C }}^{j})\right\}$ standing for the collocation points of f(t, x). Consequently, ${L}_{{ \mathcal I }}(\hat{\psi })$ and ${L}_{{ \mathcal B }}(\hat{\psi })$ represent the losses on the initial and boundary data sets, respectively, and ${L}_{{ \mathcal C }}(\hat{\psi })$ penalizes the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential for not being satisfied on the collocation points.
Before training the complex-valued PINNs, we need to generate and initialize the training data sets for the initial and boundary value conditions by using a spectral Fourier discretization with 256 modes and a fourth-order explicit Runge–Kutta temporal integrator with 201 temporal sampling points on the same space/time interval, and select collocation points of ψ(t, x) = u(t, x) + iv(t, x) on a 201 × 256 grid as the input data sets. All sample points are generated by the space-filling LHS algorithm. Moreover, all the codes in this paper are based on Python 3.7 and TensorFlow 1.14, and the numerical experiments were run on an ACER Aspire E5-571G laptop with a 2.20 GHz i5-5200U CPU (two cores, four threads).
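The workhorse of the spectral Fourier discretization is the FFT-based spatial derivative; a minimal NumPy sketch with 256 modes (the test function below is illustrative; the actual reference solver applies the same operator to the complex field ψ inside the Runge–Kutta time loop):

```python
import numpy as np

N, L = 256, 10.0                             # 256 Fourier modes on x in [-5, 5)
x = -L / 2 + L * np.arange(N) / N            # periodic grid, endpoint excluded
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)   # angular wavenumbers

def d2x(u):
    """Spectral second x-derivative: multiply by (ik)^2 in Fourier space."""
    return np.real(np.fft.ifft(-k ** 2 * np.fft.fft(u)))

# sanity check on a function the grid resolves exactly:
# (sin(2*pi*x/L))'' = -(2*pi/L)^2 sin(2*pi*x/L)
u = np.sin(2 * np.pi * x / L)
err = np.max(np.abs(d2x(u) + (2 * np.pi / L) ** 2 * u))
```

For grid-resolved functions the error is at the level of floating-point round-off, which is why a modest 256-mode grid suffices for the high-resolution reference data.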
3. Data-driven solutions of the NLSE with the different initial value and same Dirichlet boundary conditions
In this section, we use the PINN deep learning method described above to study the soliton solutions of the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential under different initial value conditions and the same Dirichlet boundary conditions.
3.1. The initial value condition comes from an optical soliton
We can obtain the exact optical soliton from equation (7) with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential (8):$\begin{eqnarray}\psi (t,x)=K{\rm{sech}} (x)\exp \left\{{\rm{i}}\left[\mu {\tan }^{-1}\left(\sinh (x)\right)+\rho t\right]\right\},\end{eqnarray}$where K is a non-zero constant and$\begin{eqnarray}K=\sqrt{\displaystyle \frac{1}{g}\left(\displaystyle \frac{{W}_{0}^{2}}{9}+2-{V}_{0}\right)},\quad \mu =\displaystyle \frac{{W}_{0}}{3},\quad \rho =1.\end{eqnarray}$Moreover, when ∣x∣ → ∞ , ∣ψ(t, x)∣ → 0 and ${\int }_{-\infty }^{\infty }| \psi (t,x){| }^{2}{\rm{d}}x\,=2{K}^{2}$ [42].
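Solution (11) can be verified directly: with the time dependence ${{\rm{e}}}^{{\rm{i}}\rho t}$, equation (7) reduces to the stationary equation ρφ = φ″ + [V(x) + iW(x)]φ + g∣φ∣²φ for $\varphi (x)=K{\rm{sech}} (x)\exp [{\rm{i}}\mu {\tan }^{-1}(\sinh (x))]$. A finite-difference check for the Case 1 parameters below (the grid and tolerances are illustrative choices, not the paper's):

```python
import numpy as np

V0, W0, g, rho = 1.0, 0.5, 1.0, 1.0          # Case 1 parameters
K = np.sqrt((W0 ** 2 / 9 + 2 - V0) / g)      # amplitude (12); here K = sqrt(37)/6
mu = W0 / 3

x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
phi = K / np.cosh(x) * np.exp(1j * mu * np.arctan(np.sinh(x)))
V = V0 / np.cosh(x) ** 2
W = W0 * np.tanh(x) / np.cosh(x)

# second derivative by central differences on the interior points
phi_xx = (phi[2:] - 2 * phi[1:-1] + phi[:-2]) / dx ** 2
res = (phi_xx + (V + 1j * W)[1:-1] * phi[1:-1]
       + g * np.abs(phi[1:-1]) ** 2 * phi[1:-1] - rho * phi[1:-1])
max_res = np.max(np.abs(res))                # should be O(dx^2)

# the soliton power: integral of |phi|^2 dx = 2 K^2 (Riemann sum)
norm = np.sum(np.abs(phi) ** 2) * dx
```

The residual vanishes up to the O(dx²) discretization error, and the computed power reproduces the stated value $2{K}^{2}$.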
Case 1 : Selecting the coefficient of the Kerr nonlinearity (self-focusing) g = 1 and the ${ \mathcal P }{ \mathcal T }$-symmetric potential amplitudes V0 = 1, W0 = 0.5.
Considering that the initial value condition ${\psi }_{{ \mathcal I }}(x)$ of equation (7) comes from the optical soliton (11), we use the complex-valued PINN deep learning method to learn equation (7) with the initial value condition$\begin{eqnarray}{\psi }_{{ \mathcal I }}(x)=K{\rm{sech}} (x)\exp \left\{{\rm{i}}\mu {\tan }^{-1}\left[\sinh (x)\right]\right\},\quad x\in {\rm{\Omega }},\end{eqnarray}$where $K=\tfrac{\sqrt{37}}{6},\mu =\tfrac{1}{6}$, and the Dirichlet boundary condition ψ(t, − x) = ψ(t, x), x ∈ ∂Ω. We simulate equation (7) with high-resolution data sets generated by the conventional spectral method. In the data-driven setting, we can take all measurements $\{{x}_{{ \mathcal I }}^{j},{u}_{0}^{j},{v}_{0}^{j}\}{}_{j=1}^{{N}_{{ \mathcal I }}}$ of equation (11) at time t = 0 and ${\left\{{t}_{{ \mathcal B }}^{j},u{(\pm x,{t}_{{ \mathcal B }}^{j})}_{x\in \partial {\rm{\Omega }}},v{(\pm x,{t}_{{ \mathcal B }}^{j})}_{x\in \partial {\rm{\Omega }}}\right\}}_{j=1}^{{N}_{{ \mathcal B }}}$ on the Dirichlet boundaries. Specifically, the training data sets include ${N}_{{ \mathcal I }}=50$ random sample points from ${\psi }_{{ \mathcal I }}(x)$ and ${N}_{{ \mathcal B }}=100$ random sample points from the Dirichlet boundary conditions. Moreover, we generate ${N}_{{ \mathcal C }}=10000$ random sample collocation points to better learn equation (7) inside the solution domain. Here, all random sample points are generated by a space-filling LHS strategy [33].
Based on the training data sets of the initial value and Dirichlet boundary conditions, we approximate the latent solution $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$ via the deep-layer complex-valued PINNs with 12 layers, 50 neurons per hidden layer and a hyperbolic tangent activation function, and we set the space domain Ω = [−5, 5] and the time range [0, 2] (i.e., T = 2).
After training the PINNs, the results for approximating the latent solution of the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential (8) are summarized in Figure 1. Specifically, the top panel of Figure 1 shows the magnitude of the exact optical soliton solution and the approximate predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$ with the locations of the initial and boundary training data sets, and the absolute error between the exact solution ψ and the predicted solution $\hat{\psi }$, respectively. The relative ${{\mathbb{L}}}_{2}$-norm errors of {ψ(t, x), u(t, x), v(t, x)} are {1.068591e − 03, 1.040790e − 03, 1.465346e − 03}, and the absolute error is very small. The whole training time of this case is 20 637.650 s. The bottom panel of Figure 1 shows a more detailed comparison between the exact solution and the predicted solution at the time instants t = 0.5, 1.0 and 1.5, respectively. Obviously, the complex-valued PINNs can accurately capture the complicated nonlinear dynamic behavior of equation (7) with only a small amount of initial data.
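The error metric quoted here and throughout is the relative ${{\mathbb{L}}}_{2}$-norm error, $\parallel \hat{\psi }-\psi {\parallel }_{2}/\parallel \psi {\parallel }_{2}$ evaluated over all grid points; a one-function sketch (the sample arrays are illustrative):

```python
import numpy as np

def rel_l2_error(pred, exact):
    """Relative L2-norm error between predicted and exact fields on a grid.
    Works for real or complex arrays (np.linalg.norm takes the modulus)."""
    return np.linalg.norm(pred - exact) / np.linalg.norm(exact)

exact = np.array([1.0, 2.0, 2.0])            # ||exact||_2 = 3
pred = exact + np.array([0.0, 0.0, 0.3])     # a single-point perturbation
err = rel_l2_error(pred, exact)              # 0.3 / 3 = 0.1
```

For the 2-D fields in the paper one would flatten the 201 × 256 grid before applying the same formula.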
Figure 1. The self-focusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and initial value condition (13). Top: the magnitude of the exact solution ∣ψ(t, x)∣ and the predicted solution $| \hat{\psi }(t,x)| $ with ${N}_{{ \mathcal I }}=50$ random sample points from the initial value condition and ${N}_{{ \mathcal B }}=100$ random sample points from the Dirichlet boundary conditions; ${N}_{{ \mathcal C }}=10000$ collocation points are generated by a space-filling LHS strategy, and the absolute error between the exact and predicted solutions is also shown. The relative ${{\mathbb{L}}}_{2}$-norm error for this case is 1.068591e-03. Bottom: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the dotted black lines in the top panel, corresponding to time instants t = 0.5, 1.0 and 1.5, respectively.
Case 2 : Selecting the coefficient of the Kerr nonlinearity (defocusing) g = −1 and the ${ \mathcal P }{ \mathcal T }$-symmetric potential amplitudes V0 = 2.91, W0 = 0.3.
Similar to Case 1 , we use equation (13) with $K=\sqrt{0.9},\mu =0.1$ and the Dirichlet boundary conditions ψ(t, − x) = ψ(t, x), x ∈ ∂Ω. Specifically, the training data sets include ${N}_{{ \mathcal I }}=50$ random sample points at ψ(0, x) = {u0(x), v0(x)} and ${N}_{{ \mathcal B }}=100$ random sample points from the Dirichlet boundary conditions ${\left\{u(t,\pm x),v(t,\pm x)\right\}}_{x\in \partial {\rm{\Omega }}}$, and we generate ${N}_{{ \mathcal C }}=10000$ random sample collocation points as input data to better learn equation (7) inside the solution domain. Moreover, based on the initial value and boundary conditions described above, we approximate the latent solution $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$ via the deep-layer complex-valued PINNs with 12 layers, 50 neurons per hidden layer and a hyperbolic tangent activation function, and we set the space domain Ω = [−5, 5] and the time range [0, 2] (i.e., T = 2).
After training the PINNs, the results for approximating the latent solution of the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential (8) are summarized in Figure 2. Specifically, the top panel of Figure 2 shows the magnitude of the exact optical soliton solution and the approximate predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$ with the locations of the initial and boundary training data sets, and the absolute error between the exact solution ψ and the predicted solution $\hat{\psi }$, respectively. The relative ${{\mathbb{L}}}_{2}$-norm errors of {ψ(t, x), u(t, x), v(t, x)} are {2.439456e − 03, 1.407065e − 03, 8.380674e − 04}. The whole training time of this case is 20 656.014 s. The bottom panel of Figure 2 presents a more detailed comparison between the exact solution and the predicted solution at the time instants t = 0.5, 1.0 and 1.5, respectively.
Figure 2. The defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and initial value condition (13). Top: the magnitude of the exact solution ∣ψ(t, x)∣ and the predicted solution $| \hat{\psi }(t,x)| $ with ${N}_{{ \mathcal I }}=50$ random sample points from the initial value condition and ${N}_{{ \mathcal B }}=100$ random sample points from the Dirichlet boundary conditions; ${N}_{{ \mathcal C }}=10000$ collocation points are generated by a space-filling LHS strategy, and the absolute error between the exact and predicted solutions is also shown. The relative ${{\mathbb{L}}}_{2}$-norm error for this case is 2.439456e-03. Bottom: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the dotted black lines in the top panel, corresponding to time instants t = 0.5, 1.0 and 1.5.
3.2. The initial value condition comes from an unstable solution
In this subsection, we consider equation (7) with a new initial value condition that does not correspond to a stable solution of equation (7):$\begin{eqnarray}\begin{array}{l}{\psi }_{{ \mathcal I }}(x)=K{\rm{sech}} (x)\exp \left\{-{\rm{i}}\mu {\tan }^{-1}\left[\sinh (x)\right]\right\},\\ x\in {\rm{\Omega }},\end{array}\end{eqnarray}$where K is given by equation (12) with parameters V0 = 1, W0 = 0.6 and g = 1, so that $K=\sqrt{1.04}$ and we obtain$\begin{eqnarray}\begin{array}{l}{\psi }_{{ \mathcal I }}(x)=\sqrt{1.04}{\rm{sech}} (x)\exp \left\{-0.2{\rm{i}}\,{\tan }^{-1}\left[\sinh (x)\right]\right\},\\ x\in {\rm{\Omega }},\end{array}\end{eqnarray}$with the Dirichlet boundary conditions ψ(t, − x) = ψ(t, x), x ∈ ∂Ω. Moreover, in order to study equation (7) with the initial value and boundary conditions shown above via the complex-valued PINNs, we need to obtain the corresponding data sets, including ${N}_{{ \mathcal I }}=50$ randomly distributed points with the initial value condition ${\psi }_{{ \mathcal I }}(x)$ on the space domain Ω = [−5, 5], ${N}_{{ \mathcal B }}=100$ randomly distributed points with the Dirichlet boundary conditions on the time domain t ∈ [0, 2], and ${N}_{{ \mathcal C }}=10000$ random collocation points generated by a space-filling LHS strategy.
In this case, we use the complex-valued PINN deep learning method with five hidden layers, 100 neurons per hidden layer and a hyperbolic tangent activation function to approximate the latent solution $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$. After training the PINNs with the above-mentioned data sets for about 20 371.696 s, we obtain the results shown in Figure 3. Specifically, the top panel of Figure 3 shows the magnitude of the approximate predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$ with the locations of the initial and boundary training data sets. The relative ${{\mathbb{L}}}_{2}$-norm errors of {ψ(t, x), u(t, x), v(t, x)} are {9.409916e − 03, 1.017411e − 02, 1.311260e − 02}, respectively. The bottom panel of Figure 3 presents a more detailed comparison between the exact solution and the predicted solution at the time instants t = 1.5, 3.0 and 4.5, respectively.
Figure 3.
Figure 3. The defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and initial value condition (15). Top: the magnitude of the predicted solution $| \hat{\psi }(t,x)| $ with ${N}_{{ \mathcal I }}=50$ random sample points with the initial value condition and ${N}_{{ \mathcal B }}=100$ random sample points with the Dirichlet boundary conditions. ${N}_{{ \mathcal C }}=10000$ collocation points are generated by a space-filling LHS strategy. The relative ${{\mathbb{L}}}_{2}$-norm error for this case is 9.409916e-03. Bottom: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the three dotted black lines in the top panel, corresponding to time instants t = 1.5, 3.0 and 4.5, respectively.
3.3. The initial value condition comes from a hyperbolic secant function and one constant
In this example, we use equation (7) with the sum of a hyperbolic secant function and a constant as the initial value condition:$\begin{eqnarray}{\psi }_{{ \mathcal I }}(x)=1+{\rm{sech}} (x),\quad x\in {\rm{\Omega }}=[-5.0,5.0],\end{eqnarray}$with the Dirichlet boundary conditions ψ(t, − 5.0) = ψ(t, 5.0), t ∈ [0, 1.4], and the other parameters g = 1, V0 = 1.0, W0 = 0.5 in (12). With these parameters, random collocation points and the initial and boundary value conditions, we obtain high-precision data sets to numerically simulate equation (7).
We use the complex-valued PINNs with five hidden layers, 100 neurons per hidden layer and a hyperbolic tangent activation function to approximate the latent solution ψ(t, x) = u(t, x) + iv(t, x), where u(t, x) and v(t, x) stand for the real and imaginary parts of ψ(t, x), respectively. After training the PINNs with the above-mentioned data sets for about 26 350.433 s, we obtain the results shown in Figure 4. Specifically, the top panel of Figure 4 shows the magnitude of the approximate predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$ together with the locations of the initial and boundary training data. The relative ${{\mathbb{L}}}_{2}$-norm errors of {ψ(t, x), u(t, x), v(t, x)} are {3.110669e − 02, 3.166569e − 02, 4.096653e − 02}, respectively. The bottom panel of Figure 4 gives a more detailed comparison between the exact and predicted solutions presented in the top panel at the time instants t = 0.35, 0.70 and 1.05, respectively.
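The relative ${{\mathbb{L}}}_{2}$-norm errors quoted throughout this section can be computed as follows; this is a sketch with a toy "exact" field of our own choosing (the initial profile (16) on a uniform grid), not the paper's test data.

```python
import numpy as np

# Relative L2-norm error: ||pred - exact||_2 / ||exact||_2 on the test grid.
def rel_l2_error(pred, exact):
    return np.linalg.norm(pred - exact) / np.linalg.norm(exact)

x = np.linspace(-5.0, 5.0, 201)
exact = 1.0 + 1.0 / np.cosh(x)   # profile (16) as a toy "exact" field
pred = exact * 1.01              # a prediction with a uniform 1% deviation
err = rel_l2_error(pred, exact)  # a uniform 1% deviation gives err = 0.01
```

For complex fields such as $\hat{\psi }(t,x)$, the same formula applies with the complex modulus inside the norm, which is how a single scalar error for ψ can coexist with separate errors for its real and imaginary parts.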
Figure 4.
Figure 4. The self-focusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and initial value condition (16). Top: the magnitude of the predicted solution $| \hat{\psi }(t,x)| $ with ${N}_{{ \mathcal I }}=50$ random sample points with the initial value condition and ${N}_{{ \mathcal B }}=100$ random sample points with the Dirichlet boundary conditions. ${N}_{{ \mathcal C }}=10000$ collocation points are generated by a space-filling LHS strategy. The relative ${{\mathbb{L}}}_{2}$-norm error for this case is 3.110669e-02. Bottom: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the three black dotted lines in the top panel, corresponding to time instants t = 0.35, 0.70 and 1.05, respectively.
4. Comparisons of the influencing factors for the PINN deep learning method in the NLSE
In this section, we discuss the influence of two factors on the learning ability of the complex-valued PINNs.
4.1. The influence of optimization steps on the learning ability of the complex-valued PINNs
In the following, we study the influence of optimization steps on the learning ability of the complex-valued PINNs in the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential. To simplify the problem, we choose the same example given in Section 3.2 and study the nonlinear evolution equation (7) with the initial and boundary conditions (14) under four different numbers of optimization steps: N = {20 000, 30 000, 40 000, 50 000}. Here, we use a 10-hidden-layer deep complex-valued PINN with 50 neurons per layer and a hyperbolic tangent activation function to approximate the latent solution, and the training is optimized by the L-BFGS algorithm.
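The 10-hidden-layer, 50-neuron architecture can be sketched as a plain-NumPy forward pass. This is only a structural illustration under our own assumptions (Xavier-style initialization, helper names ours); the paper trains such a network in TensorFlow with L-BFGS, which is not reproduced here.

```python
import numpy as np

# Xavier-style initialization for a fully connected network.
def init_params(layers, seed=0):
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / (n_in + n_out)), (n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

# tanh hidden layers, linear output layer mapping (t, x) -> (u, v).
def forward(params, tx):
    H = tx
    for W, b in params[:-1]:
        H = np.tanh(H @ W + b)
    W, b = params[-1]
    return H @ W + b

layers = [2] + [50] * 10 + [2]   # 10 hidden layers, 50 neurons each
params = init_params(layers)
out = forward(params, np.zeros((5, 2)))   # five (t, x) inputs -> (u, v) pairs
```

The two real outputs are then recombined as $\hat{\psi }=\hat{u}+{\rm{i}}\hat{v}$, and the PDE residuals are formed from derivatives of this map via automatic differentiation.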
In Figure 5, the first column describes the magnitude of the predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$, and the other three columns show the comparisons between the exact and predicted solutions presented in the first column at the time instants t = 1.5, 3.0 and 4.5, respectively. The rows, from first to last, correspond to the optimization steps N = {20 000, 30 000, 40 000, 50 000}, and the training times of these four cases are {8671.709, 12 704.724, 16 640.617, 20 642.990} s, respectively. All of these numerical experiments are summarized in Table 1. As Figure 5 and Table 1 show, the learning ability of the complex-valued PINN deep learning method improves as the number of optimization steps increases up to 50 000. Whether it would improve further with more optimization steps is left for future work; we do not carry out such a detailed study in this paper.
Figure 5.
Figure 5. The influence of optimization steps on the learning ability of the complex-valued PINNs in the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and the initial and boundary value conditions in Section 3.2: the deep learning results for four different optimization steps N = {20 000, 30 000, 40 000, 50 000} from the first row to the last row. Left column: the magnitude of the approximate predicted solution $| \hat{\psi }(t,x)| $ with the initial and Dirichlet boundary training data and 10 000 collocation points. Right three columns: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the three dotted black lines in the first column, corresponding to time instants t = 1.5, 3.0 and 4.5.
Table 1. Comparison of the ${{\mathbb{L}}}_{2}$-norm errors of the predicted solution $\hat{\psi }(t,x)$ for different optimization steps.
4.2. The influence of activation functions on the learning ability of the complex-valued PINN
In this subsection, we will study the influence of activation functions on the complex-valued PINN's learning ability in the self-focusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential. To simplify the problem, we use the same example as Case 1 given in Section 3.1. We compare four common activation functions, ReLU, Leaky ReLU, Sigmoid and Tanh (4), each with 50 000 L-BFGS optimization steps:$\begin{eqnarray}\begin{array}{rcl}{H}_{j}&=&{\rm{relu}}({\omega }_{j}\cdot {H}_{j-1}+{b}_{j}),\\ {H}_{j}&=&{\rm{leaky}}\_{\rm{relu}}({\omega }_{j}\cdot {H}_{j-1}+{b}_{j}),\\ {H}_{j}&=&{\rm{sigmoid}}({\omega }_{j}\cdot {H}_{j-1}+{b}_{j}).\end{array}\end{eqnarray}$Note that the activation functions mentioned above are implemented under the same names in the TensorFlow library and can be used directly. Moreover, we use a complex-valued PINN with 10 hidden layers and 50 neurons per hidden layer to approximate the latent solution.
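For reference, the four activation functions compared here are, elementwise (a NumPy sketch; the slope alpha = 0.2 matches TensorFlow's default for `tf.nn.leaky_relu`, and Tanh is simply `np.tanh`):

```python
import numpy as np

def relu(z):
    # max(0, z)
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.2):
    # z for z > 0, alpha * z otherwise (alpha = 0.2 is TensorFlow's default)
    return np.where(z > 0, z, alpha * z)

def sigmoid(z):
    # 1 / (1 + e^{-z}), bounded in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Tanh, bounded in (-1, 1) and smooth, is np.tanh.
```

The smooth, bounded, zero-centered shape of Tanh is often cited as one reason it suits PINNs, whose loss requires differentiating the network several times; ReLU's second derivative vanishes almost everywhere.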
In Figure 6, the first column describes the magnitude of the predicted solution $| \hat{\psi }(t,x)| =\sqrt{{\hat{u}}^{2}(t,x)+{\hat{v}}^{2}(t,x)}$, and the other three columns show the detailed comparisons between the exact and predicted solutions presented in the first column at the time instants t = 0.5, 1.0 and 1.5, respectively. The rows, from first to last, correspond to the activation functions {ReLU, Leaky ReLU, Sigmoid, Tanh}, and the training times of these four cases are {6487.622, 6498.930, 24 081.133, 21 082.208} s, respectively. All of these numerical experiments are summarized in Table 2. As Figure 6 and Table 2 show, the learning ability of the complex-valued PINN deep learning method with the Tanh activation function is far superior to ReLU and Leaky ReLU in ${{\mathbb{L}}}_{2}$-norm error, and slightly better than Sigmoid in both ${{\mathbb{L}}}_{2}$-norm error and training time.
Figure 6.
Figure 6. The influence of the activation function on the learning ability of the complex-valued PINN in the self-focusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and the initial and boundary value conditions of Case 1 in Section 3.1: the deep learning results for four different activation functions {ReLU, Leaky ReLU, Sigmoid, Tanh} from the first row to the last row. Left column: the magnitude of the approximate predicted solution $| \hat{\psi }(t,x)| $ with the initial and Dirichlet boundary training data and 10 000 collocation points. Right three columns: comparisons of the exact and predicted solutions at the three temporal snapshots marked by the three dotted black lines in the first column, corresponding to time instants t = 0.5, 1.0 and 1.5.
Table 2. Comparison of the ${{\mathbb{L}}}_{2}$-norm errors of the predicted solution $\hat{\psi }(t,x)$ for different activation functions.
5. Data-driven coefficient discovery of the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential
In order to explore the coefficients of the defocusing (g = −1) NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential (6), we use the data-driven discovery method of PDEs that comes from the PINN deep learning framework [16]. The scheme of data-driven coefficient discovery of PDEs is similar to that for obtaining the data-driven solutions of PDEs.
5.1. Explore the coefficients of the dispersive and nonlinear items
In this subsection, we will explore the unknown parameters [m, g], which are the coefficients of the second-order dispersive item and defocusing Kerr nonlinearity, respectively, in the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential:$\begin{eqnarray}\begin{array}{l}{\rm{i}}{\psi }_{t}=-m{\psi }_{{xx}}-[V(x)+{\rm{i}}W(x)]\psi -g| \psi {| }^{2}\psi ,\\ (t,x)\in [0,T]\times {\rm{\Omega }},\end{array}\end{eqnarray}$where$\begin{eqnarray*}\begin{array}{l}V(x)=2.91{{\rm{sech}} }^{2}(x),\\ W(x)=0.3{\rm{sech}} (x)\tanh (x).\end{array}\end{eqnarray*}$
Here, the latent function $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$, where $[\hat{u}(t,x),\hat{v}(t,x)]$ are the real and imaginary parts, respectively. We use the deep complex-valued PINNs f(t, x) = ifu(t, x) − fv(t, x) with f(t, x), fu(t, x) and fv(t, x) satisfying$\begin{eqnarray}\begin{array}{l}f(t,x)={\rm{i}}{\hat{\psi }}_{t}+m{\hat{\psi }}_{{xx}}+[2.91{{\rm{sech}} }^{2}(x)\\ +\,0.3{\rm{i}}\ {\rm{sech}} (x)\tanh (x)]\hat{\psi }+g| \hat{\psi }{| }^{2}\hat{\psi },\\ {f}_{u}(t,x)={\hat{u}}_{t}+m{\hat{v}}_{{xx}}+0.3{\rm{sech}} (x)\tanh (x)\ \hat{u}\\ +\,2.91{{\rm{sech}} }^{2}(x)\ \hat{v}+g({\hat{u}}^{2}+{\hat{v}}^{2})\hat{v},\\ {f}_{v}(t,x)={\hat{v}}_{t}-m{\hat{u}}_{{xx}}-2.91{{\rm{sech}} }^{2}(x)\ \hat{u}\\ +\,0.3{\rm{sech}} (x)\tanh (x)\ \hat{v}-g({\hat{u}}^{2}+{\hat{v}}^{2})\hat{u}.\end{array}\end{eqnarray}$We train on the input data sets [u(t, x), v(t, x)], treating [m, g] as trainable parameters, and the deep complex-valued PINNs approximate the undiscovered solution $\hat{\psi }(t,x)=\hat{u}(t,x)\,+{\rm{i}}\hat{v}(t,x)$ by minimizing the loss function$\begin{eqnarray}\begin{array}{l}L(\hat{\psi })=\displaystyle \frac{1}{{N}_{S}}\sum _{j=1}^{{N}_{S}}\left({\left|\ \hat{u}({t}^{j},{x}^{j})-u({t}^{j},{x}^{j})\ \right|}^{2}\right.\\ +\,{\left|\ \hat{v}({t}^{j},{x}^{j})-v({t}^{j},{x}^{j})\ \right|}^{2}+{\left|{f}_{u}({t}_{S}^{j},{x}_{S}^{j})\right|}^{2}\\ \left.+\,{\left|{f}_{v}({t}_{S}^{j},{x}_{S}^{j})\right|}^{2}\right).\end{array}\end{eqnarray}$
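Given arrays for the network outputs and residuals at the sample points, the loss (20) reduces to a single mean. The sketch below uses placeholder arrays of our own; in the paper, $[\hat u, \hat v]$ come from the network and $[f_u, f_v]$ from automatic differentiation of (19).

```python
import numpy as np

# Loss (20): data misfit on [u, v] plus PDE residuals f_u, f_v,
# averaged over the N_S sample points.
def inverse_loss(u_hat, u, v_hat, v, f_u, f_v):
    return np.mean((u_hat - u) ** 2 + (v_hat - v) ** 2
                   + f_u ** 2 + f_v ** 2)

N_S = 10000
zeros = np.zeros(N_S)
# A perfect fit with vanishing residuals gives zero loss.
loss = inverse_loss(zeros, zeros, zeros, zeros, zeros, zeros)
```

Because [m, g] enter only through the residual terms $f_u, f_v$, minimizing this single scalar simultaneously fits the data and identifies the equation coefficients.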
In order to obtain the unknown parameters [m, g] of equation (18), we randomly choose NS = 10000 sample points from the exact solution ψ(t, x) = u(t, x) + iv(t, x) with m = 1, g = − 1 (Case 2 in Section 3.1) in the spatiotemporal domain [0, 2] × [ − 5, 5]; the data [u(tj, xj), v(tj, xj)] appearing in the loss function (20) are taken from this exact solution ψ(t, x). We then train 10-hidden-layer PINNs with 50 neurons per hidden layer, a hyperbolic tangent activation function and 50 000 L-BFGS optimization steps on these data sets to explore the unknown parameters [m, g] by minimizing the loss function (20).
After training the complex-valued PINNs for 20 609.737 s, we obtain the unknown parameters [m, g]. All of the results are shown in Table 3, including the correct NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and the identified equations obtained with clean data and with 1% noisy data, together with the relative errors; V(x) and W(x) are given by (18).
Table 3. The correct NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, the equations identified by learning m and g, and the relative errors.
5.2. Explore the coefficients of the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential
In the following, we will explore the coefficients of the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential [V0, W0] in the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential:$\begin{eqnarray}\begin{array}{l}{\rm{i}}{\psi }_{t}=-{\psi }_{{xx}}-[{V}_{0}{{\rm{sech}} }^{2}(x)\\ +\,{\rm{i}}{W}_{0}{\rm{sech}} (x)\tanh (x)]\psi +| \psi {| }^{2}\psi ,\\ (t,x)\in [0,T]\times {\rm{\Omega }},\end{array}\end{eqnarray}$where [V0, W0] are the unknown real-valued parameters.
Here, the latent function $\hat{\psi }(t,x)=\hat{u}(t,x)+{\rm{i}}\hat{v}(t,x)$, where $[\hat{u}(t,x),\hat{v}(t,x)]$ are the real and imaginary parts, respectively. We use the complex-valued PINNs f(t, x) = ifu(t, x) − fv(t, x) with f(t, x), fu(t, x) and fv(t, x) satisfying$\begin{eqnarray}\begin{array}{l}f(t,x)={\rm{i}}{\hat{\psi }}_{t}+{\hat{\psi }}_{{xx}}+[{V}_{0}{{\rm{sech}} }^{2}(x)\\ +\,{\rm{i}}{W}_{0}\ {\rm{sech}} (x)\tanh (x)]\hat{\psi }-| \hat{\psi }{| }^{2}\hat{\psi },\\ {f}_{u}(t,x)={\hat{u}}_{t}+{\hat{v}}_{{xx}}+{W}_{0}{\rm{sech}} (x)\tanh (x)\ \hat{u}\\ +\,{V}_{0}{{\rm{sech}} }^{2}(x)\ \hat{v}-({\hat{u}}^{2}+{\hat{v}}^{2})\hat{v},\\ {f}_{v}(t,x)={\hat{v}}_{t}-{\hat{u}}_{{xx}}-{V}_{0}{{\rm{sech}} }^{2}(x)\ \hat{u}\\ +\,{W}_{0}{\rm{sech}} (x)\tanh (x)\ \hat{v}+({\hat{u}}^{2}+{\hat{v}}^{2})\hat{u}.\end{array}\end{eqnarray}$In order to obtain the unknown parameters [V0, W0] of equation (21), we randomly choose NS = 10000 sample points from the exact solution ψ(t, x) = u(t, x) + iv(t, x) with V0 = 2.91, W0 = 0.3 (given by Case 2 in Section 3.1) in the spatiotemporal domain [0, 2] × [ − 5, 5]; the data [u(tj, xj), v(tj, xj)] appearing in the loss function (20) are taken from this exact solution ψ(t, x). We then train 10-hidden-layer PINNs with 50 neurons per hidden layer, a hyperbolic tangent activation function and 50 000 L-BFGS optimization steps on these data sets to explore the unknown parameters [V0, W0] by minimizing the loss function (20).
After training the deep complex-valued PINNs for 20 782.791 s, we obtain the unknown parameters [V0, W0]. All of these results are shown in Table 4, including the correct NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential and the identified equations obtained with clean data and with 1% noisy data, together with the relative errors.
Table 4. The correct NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential, the equations identified by learning V0 and W0, and the relative errors.
In conclusion, we have solved the nonlinear Schrödinger equation (NLSE) with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential via the complex-valued PINN deep learning method under the same Dirichlet boundary conditions and different initial value conditions, including an optical soliton, an unstable solution and a hyperbolic secant function plus a constant. In particular, when selecting the exact optical soliton solution as the initial value condition, we find that the complex-valued PINNs can better learn the nonlinear dynamic behaviors and structures of the corresponding solutions. Moreover, we also examined the learning ability of the complex-valued PINNs for solving the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential under different optimization steps or activation functions, and we find that the complex-valued PINN deep learning method learns better with more optimization steps and a hyperbolic tangent activation function. We also investigated the data-driven coefficient discovery of the defocusing NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential; the results suggest that the complex-valued PINN deep learning method can also be used to learn other nonlinear wave equations in many fields of nonlinear science.
As can be seen from the figures and tables presented here, with a small number of training data the complex-valued PINNs can effectively learn the NLSE with the generalized ${ \mathcal P }{ \mathcal T }$-symmetric Scarf-II potential. However, some questions remain: (1) if we set more hidden layers and more neurons per hidden layer, will the learning ability of the complex-valued PINN deep learning method improve? (2) If more steps are optimized, will the learning ability of the complex-valued PINN deep learning method be stronger? (3) Is there a more generalized loss function for other classical nonlinear partial differential equations? (4) Is there a more effective physical constraint for the complex-valued PINN deep learning method? Meaningful discussions and optimizations of the neural network model will be undertaken in future work.
Conflict of interest
The authors declare that they have no conflict of interest.
Data availability statements
All data generated or analyzed during this study are included in this article.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant Nos. 11775121 and 11435005, and the K. C. Wong Magna Fund of Ningbo University.
Emanuello J et al 2021 Proceedings—AI/ML for Cybersecurity: Challenges, Solutions, and Novel Ideas at SIAM Data Mining arXiv:2104.13254
Krizhevsky A, Sutskever I and Hinton G E 2012 ImageNet classification with deep convolutional neural networks Adv. Neural Inf. Process. Syst. 25 1097–1105 DOI:10.1145/3065386
Lake B M, Salakhutdinov R and Tenenbaum J B 2015 Human-level concept learning through probabilistic program induction Science 350 1332–1338 DOI:10.1126/science.aab3050
Alipanahi B, Delong A, Weirauch M T and Frey B J 2015 Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning Nat. Biotechnol. 33 831–838 DOI:10.1038/nbt.3300
Raissi M, Perdikaris P and Karniadakis G E 2017 Inferring solutions of differential equations using noisy multi-fidelity data J. Comput. Phys. 335 736–746 DOI:10.1016/j.jcp.2017.01.060
Li J and Chen Y 2020 A deep learning method for solving third-order nonlinear evolution equations Commun. Theor. Phys. 72 115003 DOI:10.1088/1572-9494/aba243
Li J and Chen Y 2021 A physics-constrained deep residual network for solving the sine-Gordon equation Commun. Theor. Phys. 73 015001 DOI:10.1088/1572-9494/abc3ad
Zhou Z and Yan Z 2021 Solving forward and inverse problems of the logarithmic nonlinear Schrödinger equation with ${ \mathcal P }{ \mathcal T }$-symmetric harmonic potential via deep learning Phys. Lett. A 387 127010 DOI:10.1016/j.physleta.2020.127010
Wang L and Yan Z 2021 Data-driven rogue waves and parameter discovery in the defocusing nonlinear Schrödinger equation with a potential using the PINN deep learning Phys. Lett. A 404 127408 DOI:10.1016/j.physleta.2021.127408
Zhou Z and Yan Z 2021 Deep learning neural networks for the third-order nonlinear Schrödinger equation: solitons, breathers, and rogue waves arXiv:2104.14809
Wang L and Yan Z 2021 Data-driven peakon and periodic peakon travelling wave solutions of some nonlinear dispersive equations via deep learning arXiv:2101.04371
Baydin A G, Pearlmutter B A, Radul A A and Siskind J M 2018 Automatic differentiation in machine learning: a survey J. Mach. Learn. Res. 18 1–43
Abadi M et al 2016 TensorFlow: a system for large-scale machine learning 12th USENIX Symposium on Operating Systems Design and Implementation pp 265–283
Makris K G, El-Ganainy R, Christodoulides D N and Musslimani Z H 2011 ${ \mathcal P }{ \mathcal T }$-symmetric periodic optical potentials Int. J. Theor. Phys. 50 1019–1041 DOI:10.1007/s10773-010-0625-6
Shi Z, Jiang X, Zhu X and Li H 2011 Bright spatial solitons in defocusing Kerr media with ${ \mathcal P }{ \mathcal T }$-symmetric potentials Phys. Rev. A 84 053855 DOI:10.1103/PhysRevA.84.053855
Yan Z, Wen Z and Hang C 2015 Spatial solitons and stability in self-focusing and defocusing Kerr nonlinear media with generalized parity-time-symmetric Scarf-II potentials Phys. Rev. E 92 022913 DOI:10.1103/PhysRevE.92.022913
Chen Y et al 2017 Families of stable solitons and excitations in the ${ \mathcal P }{ \mathcal T }$-symmetric nonlinear Schrödinger equations with position-dependent effective masses Sci. Rep. 7 DOI:10.1038/s41598-017-01401-3
Yan Z and Chen Y 2017 The nonlinear Schrödinger equation with generalized nonlinearities and ${ \mathcal P }{ \mathcal T }$-symmetric potentials: stable solitons, interactions, and excitations Chaos 27 073114 DOI:10.1063/1.4995363
Peregrine D H 1983 Water waves, nonlinear Schrödinger equations and their solutions J. Aust. Math. Soc. B 25 16–43 DOI:10.1017/S0334270000003891
Zabusky N J and Kruskal M D 1965 Interaction of solitons in a collisionless plasma and the recurrence of initial states Phys. Rev. Lett. 15 240–243 DOI:10.1103/PhysRevLett.15.240
Hasegawa A and Tappert F 1973 Transmission of stationary nonlinear optical pulses in dispersive dielectric fibers: I. Anomalous dispersion Appl. Phys. Lett. 23 142–144 DOI:10.1063/1.1654836
Bang O, Kivshar Y S and Buryak A V 1997 Bright spatial solitons in defocusing Kerr media supported by cascaded nonlinearities Opt. Lett. 22 1680–1682 DOI:10.1364/OL.22.001680