Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves
Zijian Zhou1,2, Zhenya Yan1,2,*
1 Key Laboratory of Mathematics Mechanization, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
2 School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
*Author to whom any correspondence should be addressed.
Received: 2021-06-03; Revised: 2021-07-10; Accepted: 2021-08-12; Online: 2021-09-03
Abstract: The dimensionless third-order nonlinear Schrödinger equation (alias the Hirota equation) is investigated via deep learning neural networks. In this paper, we use the physics-informed neural networks (PINNs) deep learning method to explore the data-driven solutions (e.g. bright soliton, breather, and rogue waves) of the Hirota equation when two types of training data, unperturbed and perturbed (with a 2% noise), are considered. Moreover, we use the PINNs deep learning method to study the data-driven discovery of parameters appearing in the Hirota equation with the aid of bright solitons.
Keywords: third-order nonlinear Schrödinger equation; deep learning; data-driven solitons; data-driven parameter discovery
Cite this article: Zijian Zhou, Zhenya Yan. Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves. Communications in Theoretical Physics, 2021, 73(10): 105006. doi: 10.1088/1572-9494/ac1cd9
1. Introduction
As a fundamental and prototypical physical model, the nonlinear Schrödinger (NLS) equation is$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+\sigma | q{| }^{2}q=0,\quad (x,t)\in {{\mathbb{R}}}^{2},\quad \sigma =\pm 1,\end{eqnarray}$where q=q(x, t) denotes the complex field, the subscripts stand for the partial derivatives with respect to the variables, and σ=1 and σ=−1 correspond to the focusing and defocusing interactions, respectively. Equation (1) can be used to describe wave propagation in many kinds of Kerr nonlinear and dispersive media arising in plasma physics, the deep ocean, nonlinear optics, Bose–Einstein condensates, and even finance (see, e.g. [1–8] and references therein). When the propagation of ultra-short laser pulses (e.g. 100 fs [7]) is considered, the study of higher-order dispersive and nonlinear effects becomes significant, such as third-order dispersion, the self-frequency shift, and self-steepening arising from stimulated Raman scattering [9–11]. The third-order NLS equation (alias the Hirota equation [12]) is also a fundamental physical model. The Hirota equation and its extensions can also be used to describe strongly dispersive ion-acoustic waves in plasmas [13] and broader-banded waves in the deep ocean [14, 15]. The Hirota equation is completely integrable, and can be solved via the bilinear method [12], the inverse scattering transform [16, 17], the Darboux transform (see, e.g. [18–22]), etc. Recently, we numerically studied the spectral signatures of the spatial Lax pair with distinct potentials (e.g. bright solitons, breathers, and rogue waves) of the Hirota equation [23].
Up to now, artificial intelligence and machine learning have been widely used to deal powerfully with big data, and play a more and more important role in various fields, such as language translation, computer vision, speech recognition, and so on [24, 25]. More recently, deep neural networks were presented to study the data-driven solutions and parameter discovery of nonlinear physical models [26–36]. In particular, the physics-informed neural networks (PINNs) technique [28, 32] was developed to study nonlinear partial differential equations. In this paper, we would like to extend the PINNs deep learning method to investigate the data-driven solutions and parameter discovery for the focusing third-order NLS equation (alias the Hirota equation) with initial-boundary value conditions$\begin{eqnarray}\left\{\begin{array}{l}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\\ x\in (-L,L),\quad t\in ({t}_{0},T),\\ q(x,{t}_{0})={q}_{0}(x),\quad x\in [-L,L],\\ q(-L,t)=q(L,t),\quad t\in [{t}_{0},T],\end{array}\right.\end{eqnarray}$where q=q(x, t) is a complex envelope field, and α and β are real constants for the second- and third-order dispersion coefficients, respectively. For β=0, the Hirota equation (2) reduces to the NLS equation (1), whereas for α=0, the Hirota equation (2) reduces to the complex modified KdV equation [12].
2. The PINN scheme for the data-driven solutions
2.1. The PINNs scheme
In this section, we briefly introduce the PINN deep learning method [32] for the data-driven solutions. The main idea of the PINN deep learning method is to use a deep neural network to fit the solutions of equation (2). Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively. The complex-valued PINN $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$, with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, is written as$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)],\end{array}\end{eqnarray}$and one proceeds by approximating q(x, t) with a complex-valued deep neural network. In the PINN scheme, the complex-valued neural network $q(x,t)=(u(x,t),v(x,t))$ can be written as

def q(x, t):
    # two-output network: the first column is the real part u, the second is the imaginary part v
    uv = neural_net(tf.concat([x, t], 1), weights, biases)
    u = uv[:, 0:1]
    v = uv[:, 1:2]
    return u, v
Based on the defined q(x, t), the PINN F(x, t) can be taken as

def F(x, t):
    # residual of the Hirota equation (2), split into its real (F_u) and imaginary (F_v) parts
    u, v = q(x, t)
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_xx = tf.gradients(u_x, x)[0]
    u_xxx = tf.gradients(u_xx, x)[0]
    v_t = tf.gradients(v, t)[0]
    v_x = tf.gradients(v, x)[0]
    v_xx = tf.gradients(v_x, x)[0]
    v_xxx = tf.gradients(v_xx, x)[0]
    F_u = -v_t + alpha*(u_xx + 2*(u**2 + v**2)*u) - beta*(v_xxx + 6*(u**2 + v**2)*v_x)
    F_v = u_t + alpha*(v_xx + 2*(u**2 + v**2)*v) + beta*(u_xxx + 6*(u**2 + v**2)*u_x)
    return F_u, F_v
The shared parameters, weights and biases, between the neural network $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ and $F(x,t)\,={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ can be learned by minimizing the whole training loss (TL), that is, the sum of the ${{\mathbb{L}}}^{2}$-norm TLs of the initial data (${\mathrm{TL}}_{I}$), boundary data (${\mathrm{TL}}_{B}$), and the whole equation F(x, t) (${\mathrm{TL}}_{S}$)$\begin{eqnarray}\mathrm{TL}={\mathrm{TL}}_{I}+{\mathrm{TL}}_{B}+{\mathrm{TL}}_{S},\end{eqnarray}$where the mean squared (i.e. ${{\mathbb{L}}}^{2}$-norm) errors are chosen for them in the forms$\begin{eqnarray}\begin{array}{rcl}{\mathrm{TL}}_{I} & = & \displaystyle \frac{1}{{N}_{I}}\sum _{j=1}^{{N}_{I}}\left({\left|u({x}_{I}^{j},{t}_{0})-{u}_{0}^{j}\right|}^{2}+{\left|v({x}_{I}^{j},{t}_{0})-{v}_{0}^{j}\right|}^{2}\right),\\ {\mathrm{TL}}_{S} & = & \displaystyle \frac{1}{{N}_{S}}\sum _{j=1}^{{N}_{S}}\left({\left|{F}_{u}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}+{\left|{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}\right),\\ {\mathrm{TL}}_{B} & = & \displaystyle \frac{1}{{N}_{B}}\sum _{j=1}^{{N}_{B}}\left({\left|u(-L,{t}_{B}^{j})-u(L,{t}_{B}^{j})\right|}^{2}\right.\\ & & \left.+{\left|v(-L,{t}_{B}^{j})-v(L,{t}_{B}^{j})\right|}^{2}\right),\end{array}\end{eqnarray}$with $\{{x}_{I}^{j},{u}_{0}^{j},{v}_{0}^{j}\}{}_{j=1}^{{N}_{I}}$ denoting the initial data (${q}_{0}(x)={u}_{0}(x)\,+{\rm{i}}{v}_{0}(x)$), $\{{t}_{B}^{j},u(\pm L,{t}_{B}^{j}),v(\pm L,{t}_{B}^{j})\}{}_{j=1}^{{N}_{B}}$ standing for the periodic boundary data, $\{{x}_{S}^{j},{t}_{S}^{j},{F}_{u}({x}_{S}^{j},{t}_{S}^{j}),{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\}{}_{j\,=\,1}^{{N}_{S}}$ representing the collocation points of $F(x,t)={F}_{u}+{\rm{i}}{F}_{v}$ within a spatio-temporal region $(x,t)\in (-L,L)\times ({t}_{0},T]$. All of these sampling points are generated using a space-filling Latin Hypercube Sampling strategy [37].
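As a concrete illustration, a minimal sketch of how the three loss terms in equations (4)–(5) might be assembled in TensorFlow 1.x is given below; the tensor names (x0, u0, f_u_pred, etc.) are placeholders chosen for this sketch, and the helpers q and F refer to the listings above rather than to the authors' released code.

# x0, u0, v0 : N_I initial points and data at t = t_0
# tb         : N_B boundary times; the boundaries are x = -L and x = +L
# xf, tf_    : N_S collocation points, e.g. from Latin Hypercube Sampling
u0_pred, v0_pred = q(x0, t0*tf.ones_like(x0))
ul_pred, vl_pred = q(-L*tf.ones_like(tb), tb)      # left boundary x = -L
ur_pred, vr_pred = q(+L*tf.ones_like(tb), tb)      # right boundary x = +L
f_u_pred, f_v_pred = F(xf, tf_)

TL_I = tf.reduce_mean(tf.square(u0_pred - u0)) + tf.reduce_mean(tf.square(v0_pred - v0))
TL_B = tf.reduce_mean(tf.square(ul_pred - ur_pred)) + tf.reduce_mean(tf.square(vl_pred - vr_pred))
TL_S = tf.reduce_mean(tf.square(f_u_pred)) + tf.reduce_mean(tf.square(f_v_pred))
loss = TL_I + TL_B + TL_S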
We would like to discuss some data-driven solutions of equation (2) by the deep learning method. Here we choose a 5-layer deep neural network with 40 neurons per layer and the hyperbolic tangent activation function $\tanh (\cdot )$, applied component-wise,$\begin{eqnarray}{A}^{j+1}=\tanh \left({W}^{j+1}{A}^{j}+{B}^{j+1}\right)={\left(\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{1s}^{j+1}{a}_{s}^{j}+{b}_{1}^{j+1}\right),\cdots ,\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{{m}_{j+1}s}^{j+1}{a}_{s}^{j}+{b}_{{m}_{j+1}}^{j+1}\right)\right)}^{{\rm{T}}},\end{eqnarray}$to approximate the learning solutions, where ${A}^{j}={({a}_{1}^{j},{a}_{2}^{j},\ldots ,{a}_{{m}_{j}}^{j})}^{{\rm{T}}}$ and ${B}^{j}={({b}_{1}^{j},{b}_{2}^{j},\ldots ,{b}_{{m}_{j}}^{j})}^{{\rm{T}}}$ denote the output and bias column vectors of the jth layer, respectively, ${W}^{j+1}={({w}_{{ks}}^{j+1})}_{{m}_{j+1}\times {m}_{j}}$ stands for the weight matrix connecting the jth and (j+1)th layers, ${A}^{0}={(x,t)}^{{\rm{T}}}$, and ${A}^{M+1}={(u,v)}^{{\rm{T}}}$. The real and imaginary parts, u(x, t) and v(x, t), of the approximated solution $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ are represented by the two outputs of one neural network (see figure 1 for the PINN scheme). In the following, we consider some fundamental solutions (e.g. bright soliton, breather, and rogue wave solutions) of equation (2) by using the PINNs deep learning scheme. For the case $\alpha \beta \ne 0$ in equation (2), without loss of generality, we can take α=1, β=0.01.
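For concreteness, a minimal sketch of such a fully-connected tanh network in TensorFlow 1.x is shown below; the function names neural_net and initialize_NN and the simple normal initializer are illustrative assumptions (the listings above only assume that such a forward pass exists), not the authors' exact implementation.

import tensorflow as tf   # TensorFlow 1.x style, matching the listings above

def initialize_NN(layers):
    # e.g. layers = [2, 40, 40, 40, 40, 40, 2]: input (x, t), five hidden layers of 40 neurons, output (u, v)
    weights, biases = [], []
    for l in range(len(layers) - 1):
        W = tf.Variable(tf.random_normal([layers[l], layers[l+1]], stddev=0.1))
        b = tf.Variable(tf.zeros([1, layers[l+1]]))
        weights.append(W)
        biases.append(b)
    return weights, biases

def neural_net(X, weights, biases):
    # forward pass A^{j+1} = tanh(W^{j+1} A^j + B^{j+1}); the output layer is kept linear
    A = X
    for W, b in zip(weights[:-1], biases[:-1]):
        A = tf.tanh(tf.add(tf.matmul(A, W), b))
    return tf.add(tf.matmul(A, weights[-1]), biases[-1])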
Figure 1. The PINN scheme solving the Hirota equation (2) with the initial and boundary conditions, where the activation function ${ \mathcal T }=\tanh (\cdot )$.
2.2. The data-driven bright soliton
The first example we would like to consider is the fundamental bright soliton of equation (2) [9, 12]$\begin{eqnarray}{q}_{\mathrm{bs}}(x,t)={\rm{sech}} (x-\beta t){{\rm{e}}}^{{\rm{i}}t},\end{eqnarray}$where the third-order dispersion coefficient β stands for the wave velocity, and the sign of β represents the direction of wave propagation [right-going (left-going) travelling wave bright soliton for β > 0 (β < 0)].
We here choose L=10, t0=0, T=5, and will consider this problem by choosing two distinct kinds of initial sample points. In the first case, we choose NI=100 random sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$ with $x\in [-10,10]$. In the second case, we only choose NI=5 sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$, namely the 5 equidistant and symmetric points $x\in \{-5,-2.5,0,2.5,5\}$. In both cases, we use the same NB=200 periodic boundary random sample points and NS=10 000 random sample points in the solution region $\{(x,t,{q}_{\mathrm{bs}}(x,t))| (x,t)\in [-10,10]\times [0,5]\}$. It is worth mentioning that the NS=10 000 sample points are obtained via the Latin Hypercube Sampling strategy [37].
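A minimal sketch of how such a training set might be generated with NumPy and the pyDOE Latin Hypercube Sampling routine is given below; the variable names and the use of pyDOE are illustrative assumptions for this sketch, not the authors' released data pipeline.

import numpy as np
from pyDOE import lhs   # Latin Hypercube Sampling, as in [37]

beta = 0.01
L, t0, T = 10.0, 0.0, 5.0
q_bs = lambda x, t: np.exp(1j*t)/np.cosh(x - beta*t)   # exact bright soliton (7)

N_I, N_B, N_S = 100, 200, 10000
# first case: N_I random initial points on [-L, L] at t = t0
x_I = np.random.uniform(-L, L, (N_I, 1))
q0 = q_bs(x_I, t0)
u0, v0 = np.real(q0), np.imag(q0)

# N_B random boundary times on [t0, T] for the periodic boundaries x = -L and x = +L
t_B = np.random.uniform(t0, T, (N_B, 1))

# N_S collocation points (x, t) in (-L, L) x (t0, T) from Latin Hypercube Sampling
lb, ub = np.array([-L, t0]), np.array([L, T])
X_S = lb + (ub - lb)*lhs(2, samples=N_S)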
We emulate the first case of initial data by using 10 000 steps of Adam and 10 000 steps of L-BFGS optimization, such that figures 2(a1)–(a3) and (b1)–(b3) illustrate the learning results starting from the unperturbed and perturbed (2% noise) training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 9.3183×10−3, 5.3270×10−2, 3.8502×10−2 in figures 2(a1)–(a3), and 7.0707×10−3, 2.4057×10−2, 1.6464×10−2 in figures 2(b1)–(b3). Similarly, we use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization for the second case of initial data, such that figures 2(c1)–(c3) and (d1)–(d3) illustrate the learning results starting from the unperturbed and perturbed training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 1.8822×10−2, 4.9227×10−2, 4.0917×10−2 in figures 2(c1)–(c3), and 2.5427×10−2, 3.4825×10−2, 2.5983×10−2 in figures 2(d1)–(d3). Notice that the total learning times are (a) 717 s, (b) 741 s, (c) 1255 s, and (d) 1334 s, respectively, using a Lenovo notebook with a 2.6 GHz six-core i7 processor and an RTX 2060 graphics processor.
Figure 2. Data-driven bright soliton of the Hirota equation (2): (a1), (a2) and (b1), (b2) the learning solutions arising from the unperturbed and perturbed (2%) training data related to the first case of initial data, respectively; (c1), (c2) and (d1), (d2) the learning solutions arising from the unperturbed and perturbed (2%) training data related to the second case of initial data, respectively; (a3), (b3), (c3), (d3) the absolute values of the errors between the moduli of the exact and learning solutions. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 9.3183×10−3, 5.3270×10−2, 3.8502×10−2, (b1)–(b3) 7.0707×10−3, 2.4057×10−2, 1.6464×10−2, (c1)–(c3) 1.8822×10−2, 4.9227×10−2, 4.0917×10−2, (d1)–(d3) 2.5427×10−2, 3.4825×10−2, 2.5983×10−2.
In each step of the L-BFGS optimization, the program stops when$\begin{eqnarray}\begin{array}{l}| \mathrm{loss}(n)-\mathrm{loss}(n-1)| /\max (| \mathrm{loss}(n)| ,\\ | \mathrm{loss}(n-1)| ,1)\lt 1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps},\end{array}\end{eqnarray}$where loss(n) represents the value of the loss function in the nth step of the L-BFGS optimization, and $1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps}$ represents the machine epsilon. When the relative change between $\mathrm{loss}(n)$ and $\mathrm{loss}(n-1)$ is less than the machine epsilon, the procedure stops. This is why the computation times differ between tests even when the same number of optimization steps is prescribed.
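This matches the form of the ftol convergence test used by SciPy's L-BFGS-B routine; a hedged sketch of how such an optimizer might be configured with scipy.optimize.minimize (rather than through the authors' TensorFlow interface) is:

import numpy as np
from scipy.optimize import minimize

# loss_and_grad(theta) is assumed to return the scalar training loss and its gradient
# with respect to the flattened vector theta of network weights and biases
result = minimize(loss_and_grad, theta0, jac=True, method='L-BFGS-B',
                  options={'maxiter': 50000,
                           'maxfun': 50000,
                           'ftol': 1.0*np.finfo(float).eps})   # stopping criterion (8)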
2.3. The data-driven AKM breather solution
The second example we would like to study is the AKM breather (spatio-temporal periodic pattern) of equation (2) [18]$\begin{eqnarray}{q}_{\mathrm{akm}}(x,t)=\displaystyle \frac{\cosh (\omega t-2{\rm{i}}c)-\cos (c)\cos (p\xi )}{\cosh (\omega t)-\cos (c)\cos (p\xi )}{{\rm{e}}}^{2{\rm{i}}t},\end{eqnarray}$where $\xi =x-2\beta [2+\cos (2c)]t$, $\omega =2\sin (2c)$, $p=2\sin (c)$, and c is a real constant. The wave velocity and wavenumber of this periodic wave are $2\beta [2+\cos (2c)]$ and p, respectively. This AKM breather differs from the Akhmediev breather (spatial periodic pattern) of the NLS equation because equation (2) contains the third-order coefficient β. In this example, we take β=0.01 again. When $t\to \infty $, $| {q}_{\mathrm{akm}}(x,t){| }^{2}\to 1$. If $\beta \to 0$, we have $\xi \to x$, and the AKM breather then reduces to the Akhmediev breather.
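As a quick way to generate reference data, the AKM breather (9) can be evaluated directly with NumPy; the short function below is an illustrative sketch (the default c = 1 is only an example value, not one taken from the paper).

import numpy as np

def q_akm(x, t, c=1.0, beta=0.01):
    # AKM breather (9): xi = x - 2*beta*(2 + cos(2c))*t, omega = 2*sin(2c), p = 2*sin(c)
    xi = x - 2.0*beta*(2.0 + np.cos(2.0*c))*t
    omega = 2.0*np.sin(2.0*c)
    p = 2.0*np.sin(c)
    num = np.cosh(omega*t - 2.0j*c) - np.cos(c)*np.cos(p*xi)
    den = np.cosh(omega*t) - np.cos(c)*np.cos(p*xi)
    return num/den*np.exp(2.0j*t)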
We here choose L=10 and $t\in [-3,3]$, and choose NI=100 random sample points from the initial data ${q}_{\mathrm{akm}}(x,t=0)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-10,10]\times [-3,3]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the solutions from the unperturbed and perturbed (a 2% noise) initial data. As a result, figures 3(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (a 2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, and (b) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2. The learning times are 2268 s and 1848 s, respectively.
Figure 3. Learning breathers related to the AKM breather (9) of the Hirota equation (2). (a1)–(a3) The unperturbed case; (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, (b1)–(b3) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2.
2.4. The data-driven rogue wave solution
The third example is a fundamental rogue wave solution of equation (2), which can be generated when one takes $c\to 0$ in the AKM breather (9) in the form [19]$\begin{eqnarray}{q}_{\mathrm{rw}}(x,t)=\left[1-\displaystyle \frac{4(1+4{\rm{i}}t)}{4{\left(x-6\beta t\right)}^{2}+16{t}^{2}+1}\right]{{\rm{e}}}^{2{\rm{i}}t}.\end{eqnarray}$As $| x| ,| t| \to \infty $, $| {q}_{\mathrm{rw}}| \to 1$, and ${\max }_{x,t}| {q}_{\mathrm{rw}}| =3$.
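As with the breather, the rogue wave (10) is easy to evaluate numerically for generating training and reference data; the following NumPy sketch (with β = 0.01, as elsewhere in section 2) is given only for illustration.

import numpy as np

def q_rw(x, t, beta=0.01):
    # fundamental rogue wave (10); |q_rw| -> 1 as |x|, |t| -> infinity, and max |q_rw| = 3 at (x, t) = (0, 0)
    return (1.0 - 4.0*(1.0 + 4.0j*t)/(4.0*(x - 6.0*beta*t)**2 + 16.0*t**2 + 1.0))*np.exp(2.0j*t)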
We here choose L=2.5 and $t\in [-0.5,0.5]$, and consider ${q}_{\mathrm{rw}}(x,t=-0.5)$ as the initial condition. We still choose NI=100 random sample points from the initial data ${q}_{\mathrm{rw}}(x,t=-0.5)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-2.5,2.5]\times [-0.5,0.5]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the rogue wave solutions from the unperturbed and perturbed (a 2% noise) initial data, respectively. As a result, figures 4(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (a 2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, and (b) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3. The learning times are 1524 s and 1414 s, respectively.
Figure 4. Learning rogue wave solution related to equation (10) of the Hirota equation (2). (a1)–(a3) The unperturbed case; (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, (b1)–(b3) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3.
3. The PINNs scheme for the data-driven parameter discovery
In this section, we apply the PINNs deep learning method to study the data-driven parameter discovery of the Hirota equation (2). In the following, we first use the deep learning method to identify the parameters α and β in the Hirota equation (2), and then use this method to identify the parameters of the higher-order terms of equation (2).
3.1. The data-driven parameter discovery for α and β
Here we would like to use the PINNs deep learning method to identify the coefficients α, β of second- and third-order dispersive terms in the Hirota equation$\begin{eqnarray}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\end{eqnarray}$where α, β are the unknown real-valued parameters.
Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINNs $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and the parameters (α, β) by minimizing the mean squared error loss$\begin{eqnarray}\begin{array}{l}\mathrm{TL}=\displaystyle \frac{1}{{N}_{p}}\sum _{j=1}^{{N}_{p}}\left(| u({x}^{j},{t}^{j})-{u}^{j}{| }^{2}+| v({x}^{j},{t}^{j})\right.\\ \left.-{v}^{j}{| }^{2}+| {F}_{u}({x}^{j},{t}^{j}){| }^{2}+| {F}_{v}({x}^{j},{t}^{j}){| }^{2}\right),\end{array}\end{eqnarray}$where ${\{{x}^{j},{t}^{j},{u}^{j},{v}^{j}\}}_{j=1}^{{N}_{p}}$ represents the training data on the real and imaginary parts of the exact solution given by equation (7) with $\alpha =1,\beta =0.5$ in $(x,t)\in [-8,8]\times [-3,3]$, and $u({x}^{j},{t}^{j}),v({x}^{j},{t}^{j})$ are the real and imaginary parts of the approximate solution $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$.
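Relative to the forward PINN listing in section 2.1, the only change needed for this inverse problem is to promote α and β to trainable variables; the snippet below is an illustrative sketch in the same TensorFlow 1.x style, not the authors' exact code.

# alpha and beta become trainable scalars, initialized at 0 as stated in the text,
# and are updated together with the network weights when minimizing the loss (13)
alpha = tf.Variable(0.0, dtype=tf.float32)
beta = tf.Variable(0.0, dtype=tf.float32)

def F(x, t):
    u, v = q(x, t)
    u_t, v_t = tf.gradients(u, t)[0], tf.gradients(v, t)[0]
    u_x, v_x = tf.gradients(u, x)[0], tf.gradients(v, x)[0]
    u_xx, v_xx = tf.gradients(u_x, x)[0], tf.gradients(v_x, x)[0]
    u_xxx, v_xxx = tf.gradients(u_xx, x)[0], tf.gradients(v_xx, x)[0]
    F_u = -v_t + alpha*(u_xx + 2*(u**2 + v**2)*u) - beta*(v_xxx + 6*(u**2 + v**2)*v_x)
    F_v = u_t + alpha*(v_xx + 2*(u**2 + v**2)*v) + beta*(u_xxx + 6*(u**2 + v**2)*u_x)
    return F_u, F_v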
To study the data-driven parameter discovery of the Hirota equation (2) for α, β, we generate a training data-set by using the Latin Hypercube Sampling strategy to randomly choose Np=10 000 points in the solution region arising from the exact bright soliton (7) with α=1, β=0.5 and $(x,t)\in [-8,8]\times [-3,3]$. Then the obtained data-set is applied to train an 8-layer deep neural network with 20 neurons per layer and the same hyperbolic tangent activation function to approximate the parameters α, β by minimizing the mean squared error loss given by equation (13), starting from α=β=0 in equation (12). We here use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization. Table 1 illustrates the learned parameters α, β in equation (11) for the cases of the data without perturbation and with a 2% perturbation, and the corresponding errors of α, β are 3.85×10−5, 7.48×10−5 and 3.31×10−4, 2.89×10−4, respectively. Figure 5 exhibits the learning solutions and the relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t): (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3; (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3, where the training times are (a1)–(a2) 1510 s and (b1)–(b2) 3572 s, respectively.
Figure 5. Data-driven parameter discovery of α and β in the sense of the bright soliton (7). (a1)–(a2) Bright soliton without perturbation; (b1)–(b2) bright soliton with a 2% noise. (a2), (b2) The absolute value of the difference between the moduli of the exact and learning bright solitons. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3, (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3.
Table 1. Comparisons of the learned α, β and their errors for the different training data-sets via deep learning.
3.2. The data-driven parameter discovery for μ and ν
In what follows, we will study the learning of the coefficients of the higher-order terms in equation (2) via the deep learning method. We consider the Hirota equation (2) with two parameters in the form$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q+\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x})=0,\end{eqnarray}$where μ and ν are the unknown real constants of the higher-order dispersion and nonlinear terms, respectively.
Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINNs $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q\\ +\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and parameters $(\mu ,\nu )$ by minimizing the mean squared error loss given by equation (13).
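The corresponding PINN residual only changes in its F_u, F_v expressions; a hedged sketch with μ and ν as trainable variables (initial values of 0 assumed by analogy with section 3.1, and the name F_hov chosen only for this sketch) is:

mu = tf.Variable(0.0, dtype=tf.float32)   # trainable higher-order dispersion coefficient
nu = tf.Variable(0.0, dtype=tf.float32)   # trainable higher-order nonlinearity coefficient

def F_hov(x, t):
    # residual of equation (15) with q = u + iv; derivatives computed as in the earlier listing
    u, v = q(x, t)
    u_t, v_t = tf.gradients(u, t)[0], tf.gradients(v, t)[0]
    u_x, v_x = tf.gradients(u, x)[0], tf.gradients(v, x)[0]
    u_xx, v_xx = tf.gradients(u_x, x)[0], tf.gradients(v_x, x)[0]
    u_xxx, v_xxx = tf.gradients(u_xx, x)[0], tf.gradients(v_xx, x)[0]
    F_u = -v_t + u_xx + 2*(u**2 + v**2)*u - 0.5*(mu*v_xxx + nu*(u**2 + v**2)*v_x)
    F_v = u_t + v_xx + 2*(u**2 + v**2)*v + 0.5*(mu*u_xxx + nu*(u**2 + v**2)*u_x)
    return F_u, F_v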
To illustrate the learning ability, we still use an 8-layer deep neural network with 20 neurons per layer. We choose Np=10 000 sample points in the same way in the interior of the solution region. 20 000 steps of Adam and 50 000 steps of L-BFGS optimization are used in the training process. Table 2 exhibits the learned values of μ and ν and their errors for the different training data-sets, and the results of the neural network fitting the exact solution are shown in figure 6. The training times are (a1)–(a2) 1971 s and (b1)–(b2) 1990 s, respectively.
Figure 6. Data-driven parameter discovery of μ and ν in the sense of the bright soliton (7). (a1)–(a2) The learning results from the bright soliton data-set without perturbation; (b1)–(b2) the results with a 2% perturbation. (a2), (b2) The absolute value of the difference between the moduli of the exact solution and the learning solution represented by the neural network. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 8.0153×10−4, 1.0792×10−3, 1.2177×10−3, (b1)–(b2) 1.0770×10−3, 1.6541×10−3, 1.3370×10−3.
Table 2. Comparisons of the learned μ, ν and their errors for the different training data-sets via deep learning.