
Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves


Zijian Zhou1,2, Zhenya Yan1,2,*
1Key Laboratory of Mathematics Mechanization, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
2School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China

*Author to whom any correspondence should be addressed.
Received: 2021-06-03; Revised: 2021-07-10; Accepted: 2021-08-12; Published online: 2021-09-03


Abstract
The dimensionless third-order nonlinear Schrödinger equation (alias the Hirota equation) is investigated via deep learning neural networks. In this paper, we use the physics-informed neural networks (PINNs) deep learning method to explore the data-driven solutions (e.g. bright soliton, breather, and rogue waves) of the Hirota equation when two types of training data, unperturbed and perturbed (by a 2% noise), are considered. Moreover, we use the PINNs deep learning method to study the data-driven discovery of the parameters appearing in the Hirota equation with the aid of bright solitons.
Keywords: third-order nonlinear Schrödinger equation; deep learning; data-driven solitons; data-driven parameter discovery


Cite this article
Zijian Zhou, Zhenya Yan. Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves. Communications in Theoretical Physics, 2021, 73(10): 105006. DOI: 10.1088/1572-9494/ac1cd9

1. Introduction

As a fundamental and prototypical physical model, the nonlinear Schrödinger (NLS) equation is$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+\sigma | q{| }^{2}q=0,\quad (x,t)\in {{\mathbb{R}}}^{2},\quad \sigma =\pm 1,\end{eqnarray}$where q=q(x, t) denotes the complex field, the subscripts stand for the partial derivatives with respect to the variables, and σ=1 and σ=−1 correspond to the focusing and defocusing interactions, respectively. Equation (1) can be used to describe wave propagation in many Kerr nonlinear and dispersive media, such as plasma physics, the deep ocean, nonlinear optics, Bose–Einstein condensates, and even finance (see, e.g. [1–8] and references therein). When the propagation of ultra-short laser pulses (e.g. 100 fs [7]) is considered, higher-order dispersive and nonlinear effects, such as third-order dispersion, the self-frequency shift, and self-steepening arising from stimulated Raman scattering, become significant [9–11]. The third-order NLS equation (alias the Hirota equation [12]) is also a fundamental physical model. The Hirota equation and its extensions can also be used to describe the strongly dispersive ion-acoustic wave in plasmas [13] and broader-banded waves on the deep ocean [14, 15]. The Hirota equation is completely integrable, and can be solved via the bilinear method [12], the inverse scattering transform [16, 17], the Darboux transform (see, e.g. [18–22]), and so on. Recently, we numerically studied the spectral signatures of the spatial Lax pair with distinct potentials (e.g. bright solitons, breathers, and rogue waves) of the Hirota equation [23].

Up to now, artificial intelligence and machine learning have been widely used to deal powerfully with big data, and play an increasingly important role in various fields, such as language translation, computer vision, speech recognition, and so on [24, 25]. More recently, deep neural networks were presented to study the data-driven solutions and parameter discovery of nonlinear physical models [26–36]. In particular, the physics-informed neural networks (PINNs) technique [28, 32] was developed to study nonlinear partial differential equations. In this paper, we would like to extend the PINNs deep learning method to investigate the data-driven solutions and parameter discovery for the focusing third-order NLS equation (alias the Hirota equation) with initial-boundary value conditions$\begin{eqnarray}\left\{\begin{array}{l}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\\ x\in (-L,L),\quad t\in ({t}_{0},T),\\ q(x,{t}_{0})={q}_{0}(x),\quad x\in [-L,L],\\ q(-L,t)=q(L,t),\quad t\in [{t}_{0},T],\end{array}\right.\end{eqnarray}$where q=q(x, t) is a complex envelope field, and α and β are real constants standing for the second- and third-order dispersion coefficients, respectively. For β=0, the Hirota equation (2) reduces to the NLS equation (1), whereas for α=0, it reduces to the complex modified KdV equation [12].

2. The PINN scheme for the data-driven solutions

2.1. The PINNs scheme

In this section, we briefly introduce the PINN deep learning method [32] for the data-driven solutions. The main idea of the PINN method is to use a deep neural network to fit the solutions of equation (2). Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively. The complex-valued PINN $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$, with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, is written as$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)],\end{array}\end{eqnarray}$and q(x, t) is approximated by a complex-valued deep neural network. In the PINN scheme, the complex-valued neural network $q(x,t)=(u(x,t),v(x,t))$ can be written as

def q(x, t):
    # Two-column network output: the first column is u = Re(q), the second is v = Im(q).
    Q = neural_net(tf.concat([x, t], 1), weights, biases)
    u = Q[:, 0:1]
    v = Q[:, 1:2]
    return u, v



Based on the defined q(x, t), the PINN F(x, t) can be taken as

def F(x, t):
    # PINN residual of equation (2)/(3), split into real and imaginary parts.
    u, v = q(x, t)
    # First-, second-, and third-order derivatives via automatic differentiation.
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_xx = tf.gradients(u_x, x)[0]
    u_xxx = tf.gradients(u_xx, x)[0]
    v_t = tf.gradients(v, t)[0]
    v_x = tf.gradients(v, x)[0]
    v_xx = tf.gradients(v_x, x)[0]
    v_xxx = tf.gradients(v_xx, x)[0]
    # F_u = Re[F], F_v = Im[F] in equation (3).
    F_u = -v_t + alpha*(u_xx + 2*(u**2 + v**2)*u) - beta*(v_xxx + 6*(u**2 + v**2)*v_x)
    F_v =  u_t + alpha*(v_xx + 2*(u**2 + v**2)*v) + beta*(u_xxx + 6*(u**2 + v**2)*u_x)
    return F_u, F_v



The shared parameters (weights and biases) between the neural network $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ and $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ can be learned by minimizing the whole training loss (TL), that is, the sum of the ${{\mathbb{L}}}^{2}$-norm TLs of the initial data (${\mathrm{TL}}_{I}$), boundary data (${\mathrm{TL}}_{B}$), and the whole equation F(x, t) (${\mathrm{TL}}_{S}$)$\begin{eqnarray}\mathrm{TL}={\mathrm{TL}}_{I}+{\mathrm{TL}}_{B}+{\mathrm{TL}}_{S},\end{eqnarray}$where the mean squared (i.e. ${{\mathbb{L}}}^{2}$-norm) errors are chosen for them in the forms$\begin{eqnarray}\begin{array}{rcl}{\mathrm{TL}}_{I} & = & \displaystyle \frac{1}{{N}_{I}}\sum _{j=1}^{{N}_{I}}\left({\left|u({x}_{I}^{j},{t}_{0})-{u}_{0}^{j}\right|}^{2}+{\left|v({x}_{I}^{j},{t}_{0})-{v}_{0}^{j}\right|}^{2}\right),\\ {\mathrm{TL}}_{S} & = & \displaystyle \frac{1}{{N}_{S}}\sum _{j=1}^{{N}_{S}}\left({\left|{F}_{u}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}+{\left|{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}\right),\\ {\mathrm{TL}}_{B} & = & \displaystyle \frac{1}{{N}_{B}}\sum _{j=1}^{{N}_{B}}\left({\left|u(-L,{t}_{B}^{j})-u(L,{t}_{B}^{j})\right|}^{2}\right.\\ & & \left.+{\left|v(-L,{t}_{B}^{j})-v(L,{t}_{B}^{j})\right|}^{2}\right),\end{array}\end{eqnarray}$with $\{{x}_{I}^{j},{u}_{0}^{j},{v}_{0}^{j}\}{}_{j=1}^{{N}_{I}}$ denoting the initial data (${q}_{0}(x)={u}_{0}(x)+{\rm{i}}{v}_{0}(x)$), $\{{t}_{B}^{j},u(\pm L,{t}_{B}^{j}),v(\pm L,{t}_{B}^{j})\}{}_{j=1}^{{N}_{B}}$ standing for the periodic boundary data, and $\{{x}_{S}^{j},{t}_{S}^{j},{F}_{u}({x}_{S}^{j},{t}_{S}^{j}),{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\}{}_{j=1}^{{N}_{S}}$ representing the collocation points of $F(x,t)={F}_{u}+{\rm{i}}{F}_{v}$ within the spatio-temporal region $(x,t)\in (-L,L)\times ({t}_{0},T]$. All of these sampling points are generated using a space-filling Latin Hypercube Sampling strategy [37].
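For concreteness, a minimal TensorFlow 1.x-style sketch of how the loss (4)–(5) might be assembled from the networks q(x, t) and F(x, t) defined above is given below; the placeholder names (x_I, u_I, x_lb, etc.) are ours and purely illustrative:

import tensorflow as tf

# Illustrative placeholders (names are hypothetical): initial points {x_I^j} with values
# {u_0^j, v_0^j}, boundary times {t_B^j}, and collocation points {x_S^j, t_S^j}.
x_I = tf.placeholder(tf.float32, [None, 1]); t_I = tf.placeholder(tf.float32, [None, 1])  # t_I fed with t_0
u_I = tf.placeholder(tf.float32, [None, 1]); v_I = tf.placeholder(tf.float32, [None, 1])
t_B = tf.placeholder(tf.float32, [None, 1])
x_lb = tf.placeholder(tf.float32, [None, 1])  # fed with -L
x_rb = tf.placeholder(tf.float32, [None, 1])  # fed with +L
x_S = tf.placeholder(tf.float32, [None, 1]); t_S = tf.placeholder(tf.float32, [None, 1])

u_I_pred, v_I_pred = q(x_I, t_I)      # network output at the initial points
u_lb, v_lb = q(x_lb, t_B)             # network output at x = -L
u_rb, v_rb = q(x_rb, t_B)             # network output at x = +L
F_u_pred, F_v_pred = F(x_S, t_S)      # PINN residuals at the collocation points

TL_I = tf.reduce_mean(tf.square(u_I_pred - u_I) + tf.square(v_I_pred - v_I))
TL_B = tf.reduce_mean(tf.square(u_lb - u_rb) + tf.square(v_lb - v_rb))
TL_S = tf.reduce_mean(tf.square(F_u_pred) + tf.square(F_v_pred))

loss = TL_I + TL_B + TL_S             # total training loss (4)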

We would like to discuss some data-driven solutions of equation (2) by the deep learning method. Here we choose a 5-layer deep neural network with 40 neurons per layer and a hyperbolic tangent activation function $\tanh (\cdot )$$\begin{eqnarray}\begin{array}{rcl}{A}^{j+1} & = & \tanh \left({W}^{j+1}{A}^{j}+{B}^{j+1}\right)\\ & = & {\left(\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{1s}^{j+1}{a}_{s}^{j}+{b}_{1}^{j+1}\right),\cdots ,\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{{m}_{j+1}s}^{j+1}{a}_{s}^{j}+{b}_{{m}_{j+1}}^{j+1}\right)\right)}^{{\rm{T}}},\end{array}\end{eqnarray}$to approximate the learning solutions, where ${A}^{j}={({a}_{1}^{j},{a}_{2}^{j},\ldots ,{a}_{{m}_{j}}^{j})}^{{\rm{T}}}$ and ${B}^{j}={({b}_{1}^{j},{b}_{2}^{j},\ldots ,{b}_{{m}_{j}}^{j})}^{{\rm{T}}}$ denote the output and bias column vectors of the jth layer, respectively, ${W}^{j+1}={({w}_{{ks}}^{j+1})}_{{m}_{j+1}\times {m}_{j}}$ stands for the weight matrix connecting the jth and (j+1)th layers, ${A}^{0}={(x,t)}^{{\rm{T}}}$, and ${A}^{M+1}={(u,v)}^{{\rm{T}}}$. The real and imaginary parts, u(x, t) and v(x, t), of the approximated solution $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ are represented by the two outputs of one neural network (see figure 1 for the PINN scheme). In the following, we consider some fundamental solutions (e.g. bright soliton, breather, and rogue wave solutions) of equation (2) by using the PINNs deep learning scheme. For the case $\alpha \beta \ne 0$ in equation (2), without loss of generality, we can take α=1, β=0.01.
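A minimal sketch of how the network (6) might be built in the same TensorFlow style; the Xavier-type initialization, the linear output layer, and the interpretation of "5-layer" as five hidden layers are common choices rather than details taken from the text, and the helper names are ours:

import numpy as np
import tensorflow as tf

# (x, t) -> five hidden layers of 40 tanh neurons -> (u, v); the depth convention is an assumption.
layers = [2, 40, 40, 40, 40, 40, 2]

def init_params(layers):
    # Xavier-type initialization (an assumption; any standard scheme may be used).
    weights, biases = [], []
    for m_in, m_out in zip(layers[:-1], layers[1:]):
        std = np.sqrt(2.0 / (m_in + m_out))
        weights.append(tf.Variable(tf.random_normal([m_in, m_out], stddev=std)))
        biases.append(tf.Variable(tf.zeros([1, m_out])))
    return weights, biases

def neural_net(X, weights, biases):
    # Equation (6): A^{j+1} = tanh(W^{j+1} A^j + B^{j+1}); a linear output layer is used here.
    A = X
    for W, B in zip(weights[:-1], biases[:-1]):
        A = tf.tanh(tf.matmul(A, W) + B)
    return tf.matmul(A, weights[-1]) + biases[-1]   # two columns: [u, v]

weights, biases = init_params(layers)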

Figure 1. The PINN scheme for solving the Hirota equation (2) with the initial and boundary conditions, where the activation function ${ \mathcal T }=\tanh (\cdot )$.


2.2. The data-driven bright soliton

The first example we would like to consider is the fundamental bright soliton of equation (2) [9, 12]$\begin{eqnarray}{q}_{\mathrm{bs}}(x,t)={\rm{sech}} (x-\beta t){{\rm{e}}}^{{\rm{i}}t},\end{eqnarray}$where the third-order dispersion coefficient β stands for the wave velocity, and the sign of β represents the direction of wave propagation [right-going (left-going) travelling wave bright soliton for β > 0 (β < 0)].
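For reference, the exact soliton (7) is easily evaluated, e.g. with NumPy (the function name q_bs is ours):

import numpy as np

def q_bs(x, t, beta=0.01):
    # Exact bright soliton (7): q = sech(x - beta*t) * exp(i*t).
    return np.exp(1j * t) / np.cosh(x - beta * t)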

We here choose L=10, t0=0, T=5, and consider this problem with two distinct kinds of initial sample points. In the first case, we choose NI=100 random sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$ with $x\in [-10,10]$. In the second case, we only choose NI=5 sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$, taken at the 5 equidistant and symmetric points $x\in \{-5,-2.5,0,2.5,5\}$. In both cases, we use the same NB=200 periodic boundary random sample points and NS=10 000 random sample points in the solution region $\{(x,t,{q}_{\mathrm{bs}}(x,t))| (x,t)\in [-10,10]\times [0,5]\}$. It is worth mentioning that the NS=10 000 sample points are obtained via the Latin Hypercube Sampling strategy [37].
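A hedged sketch of how these point sets might be drawn, using pyDOE's lhs for the Latin Hypercube points [37]; the variable names are ours, and the original sampling code may differ in details:

import numpy as np
from pyDOE import lhs

L, t0, T = 10.0, 0.0, 5.0
N_I, N_B, N_S = 100, 200, 10000

# Case 1: N_I random initial points on t = t0; Case 2 would instead use
# the 5 equidistant points {-5, -2.5, 0, 2.5, 5}.
x_I = np.random.uniform(-L, L, (N_I, 1))
q_I = q_bs(x_I, t0)
u_I, v_I = q_I.real, q_I.imag                  # initial training data (u_0^j, v_0^j)

# Periodic boundary pairs: the same N_B random times evaluated at x = -L and x = +L.
t_B = np.random.uniform(t0, T, (N_B, 1))

# N_S collocation points for the residual F(x, t) via Latin Hypercube Sampling.
lb, ub = np.array([-L, t0]), np.array([L, T])
XT_S = lb + (ub - lb) * lhs(2, samples=N_S)     # points in [-L, L] x [t0, T]
x_S, t_S = XT_S[:, 0:1], XT_S[:, 1:2]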

For the first case of initial data, we train with 10 000 steps of the Adam optimizer followed by 10 000 steps of the L-BFGS optimizer; figures 2(a1)–(a3) and (b1)–(b3) illustrate the learning results starting from the unperturbed and perturbed (2% noise) training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 9.3183×10−3, 5.3270×10−2, 3.8502×10−2 in figures 2(a1)–(a2), and 7.0707×10−3, 2.4057×10−2, 1.6464×10−2 in figures 2(b1)–(b2). Similarly, we use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization for the second case of initial data, and figures 2(c1)–(c3) and (d1)–(d3) illustrate the learning results starting from the unperturbed and perturbed training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 1.8822×10−2, 4.9227×10−2, 4.0917×10−2 in figures 2(c1)–(c2), and 2.5427×10−2, 3.4825×10−2, 2.5983×10−2 in figures 2(d1)–(d2). The total learning times are (a) 717 s, (b) 741 s, (c) 1255 s, and (d) 1334 s, respectively, on a Lenovo notebook with a 2.6 GHz six-core i7 processor and an RTX 2060 graphics processor.

Figure 2. Data-driven bright soliton of the Hirota equation (2): (a1), (a2) and (b1), (b2) the learning solutions arising from the unperturbed and perturbed (2% noise) training data related to the first case of initial data, respectively; (c1), (c2) and (d1), (d2) the learning solutions arising from the unperturbed and perturbed (2% noise) training data related to the second case of initial data, respectively; (a3), (b3), (c3), (d3) the absolute values of the errors between the moduli of the exact and learning solutions. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 9.3183×10−3, 5.3270×10−2, 3.8502×10−2, (b1)–(b3) 7.0707×10−3, 2.4057×10−2, 1.6464×10−2, (c1)–(c3) 1.8822×10−2, 4.9227×10−2, 4.0917×10−2, (d1)–(d3) 2.5427×10−2, 3.4825×10−2, 2.5983×10−2.


In each step of the L-BFGS optimization, the program stops when$\begin{eqnarray}\begin{array}{l}| \mathrm{loss}(n)-\mathrm{loss}(n-1)| /\max (| \mathrm{loss}(n)| ,\\ | \mathrm{loss}(n-1)| ,1)\lt 1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps},\end{array}\end{eqnarray}$where loss(n) represents the value of the loss function in the nth L-BFGS step, and $1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps}$ denotes the machine epsilon. That is, the procedure stops once the relative change between $\mathrm{loss}(n)$ and $\mathrm{loss}(n-1)$ falls below the machine epsilon. This is why the computation times differ between tests even though the same numbers of optimization steps are prescribed.
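Criterion (8) corresponds to the relative-decrease test of the L-BFGS-B routine in SciPy when ftol is set to the machine epsilon; a minimal sketch (loss_and_grad and theta0 are hypothetical stand-ins for a wrapper returning the loss and its gradient, and for the flattened initial network parameters):

import numpy as np
from scipy.optimize import minimize

result = minimize(
    fun=loss_and_grad,       # hypothetical: returns (loss, gradient) for the flattened parameters
    x0=theta0,               # hypothetical: initial flattened weights and biases
    jac=True,
    method='L-BFGS-B',
    options={'maxiter': 50000,
             'maxfun': 50000,
             'ftol': 1.0 * np.finfo(float).eps})   # stop as soon as (8) holds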

2.3. The data-driven AKM breather solution

The second example we would like to study is the AKM breather (a spatio-temporally periodic pattern) of equation (2) [18]$\begin{eqnarray}{q}_{\mathrm{akm}}(x,t)=\displaystyle \frac{\cosh (\omega t-2{\rm{i}}c)-\cos (c)\cos (p\xi )}{\cosh (\omega t)-\cos (c)\cos (p\xi )}{{\rm{e}}}^{2{\rm{i}}t},\end{eqnarray}$where $\xi =x-2\beta [2+\cos (2c)]t$, $\omega =2\sin (2c)$, $p=2\sin (c)$, and c is a real constant. The wave velocity and wavenumber of this periodic wave are $2\beta [2+\cos (2c)]$ and p, respectively. This AKM breather differs from the Akhmediev breather (a spatially periodic pattern) of the NLS equation because equation (2) contains the third-order coefficient β. In this example, we again take β=0.01. As $t\to \infty $, $| {q}_{\mathrm{akm}}(x,t){| }^{2}\to 1$. As $\beta \to 0$, we have $\xi \to x$, and the AKM breather essentially reduces to the Akhmediev breather.

We here choose L=10 and $t\in [-3,3]$, and choose NI=100 random sample points from the initial data ${q}_{\mathrm{akm}}(x,t=0)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-10,10]\times [-3,3]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the solutions from the unperturbed and perturbed (2% noise) initial data. As a result, figures 3(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, (b) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2. The learning times are 2268 s and 1848 s, respectively.

Figure 3. Learning breathers related to the AKM breather (9) of the Hirota equation (2): (a1)–(a3) the unperturbed case, (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, (b1)–(b3) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2.


2.4. The data-driven rogue wave solution

The third example is a fundamental rogue wave solution of equation (2), which can be generated when one takes $c\to 0$ in the AKM breather (9) in the form [19]$\begin{eqnarray}{q}_{\mathrm{rw}}(x,t)=\left[1-\displaystyle \frac{4(1+4{\rm{i}}t)}{4{\left(x-6\beta t\right)}^{2}+16{t}^{2}+1}\right]{{\rm{e}}}^{2{\rm{i}}t}.\end{eqnarray}$As $| x| ,| t| \to \infty $, $| {q}_{\mathrm{rw}}| \to 1$, and ${\max }_{x,t}| {q}_{\mathrm{rw}}| =3$.
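A brief check of this limit (keeping terms up to $O(c^{2})$): for small $c$ one has $p=2\sin (c)\approx 2c$, $\omega =2\sin (2c)\approx 4c$, $\cos (c)\approx 1-c^{2}/2$, $\cos (p\xi )\approx 1-2c^{2}{\xi }^{2}$, and $\xi \to x-6\beta t$, so that $\cosh (\omega t-2{\rm{i}}c)-\cos (c)\cos (p\xi )\approx c^{2}(2{\xi }^{2}+8{t}^{2}-8{\rm{i}}t-3/2)$ and $\cosh (\omega t)-\cos (c)\cos (p\xi )\approx c^{2}(2{\xi }^{2}+8{t}^{2}+1/2)$. The ratio of these two expressions tends to $1-4(1+4{\rm{i}}t)/[4{(x-6\beta t)}^{2}+16{t}^{2}+1]$, which recovers equation (10).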

We here choose L=2.5 and $t\in [-0.5,0.5]$, and consider ${q}_{\mathrm{rw}}(x,t=-0.5)$ as the initial condition. We still choose NI=100 random sample points from the initial data ${q}_{\mathrm{rw}}(x,t=-0.5)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-2.5,2.5]\times [-0.5,0.5]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the rogue wave solutions from the unperturbed and perturbed (2% noise) initial data, respectively. As a result, figures 4(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, (b) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3. The learning times are 1524 s and 1414 s, respectively.

Figure 4. Learning rogue wave solution related to equation (10) of the Hirota equation (2): (a1)–(a3) the unperturbed case, (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, (b1)–(b3) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3.


3. The PINNs scheme for the data-driven parameter discovery

In this section, we apply the PINNs deep learning method to the data-driven parameter discovery of the Hirota equation (2). We first use the method to identify the dispersion parameters α and β appearing in equation (2), and then to identify the parameters of the higher-order terms of equation (2).

3.1. The data-driven parameter discovery for α and β

Here we would like to use the PINNs deep learning method to identify the coefficients α, β of second- and third-order dispersive terms in the Hirota equation$\begin{eqnarray}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\end{eqnarray}$where α, β are the unknown real-valued parameters.

Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINN $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and the parameters (α, β) by minimizing the mean squared error loss$\begin{eqnarray}\begin{array}{l}\mathrm{TL}=\displaystyle \frac{1}{{N}_{p}}\sum _{j=1}^{{N}_{p}}\left(| u({x}^{j},{t}^{j})-{u}^{j}{| }^{2}+| v({x}^{j},{t}^{j})\right.\\ \left.-{v}^{j}{| }^{2}+| {F}_{u}({x}^{j},{t}^{j}){| }^{2}+| {F}_{v}({x}^{j},{t}^{j}){| }^{2}\right),\end{array}\end{eqnarray}$where ${\{{x}^{j},{t}^{j},{u}^{j},{v}^{j}\}}_{j=1}^{{N}_{p}}$ represents the training data for the real and imaginary parts of the exact solution given by equation (7) with $\alpha =1,\beta =0.5$ in $(x,t)\in [-8,8]\times [-3,3]$, and $u({x}^{j},{t}^{j}),v({x}^{j},{t}^{j})$ are the real and imaginary parts of the approximate solution $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$.
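In a TensorFlow 1.x-style implementation, the only structural change from the forward problem is that α and β become trainable variables entering the residual (12) and are updated together with the network weights when minimizing (13); a hedged sketch (placeholder and variable names are ours, and the Adam learning rate is an assumption):

import tensorflow as tf

# Unknown equation parameters, initialized at zero as in the text, and learned jointly
# with the network weights and biases.
alpha = tf.Variable(0.0, dtype=tf.float32)
beta  = tf.Variable(0.0, dtype=tf.float32)

# Training points {x^j, t^j, u^j, v^j} sampled from the exact bright soliton (7).
x_p = tf.placeholder(tf.float32, [None, 1]); t_p = tf.placeholder(tf.float32, [None, 1])
u_p = tf.placeholder(tf.float32, [None, 1]); v_p = tf.placeholder(tf.float32, [None, 1])

u_pred, v_pred = q(x_p, t_p)          # same network q(x, t) as before
F_u_pred, F_v_pred = F(x_p, t_p)      # residual (12), now containing the variables alpha, beta

# Mean squared error loss (13): data misfit plus equation residual at the same points.
loss = tf.reduce_mean(tf.square(u_pred - u_p) + tf.square(v_pred - v_p)
                      + tf.square(F_u_pred) + tf.square(F_v_pred))

train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)   # Adam stage; L-BFGS follows as above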

To study the data-driven parameter discovery of the Hirota equation (2) for α and β, we generate a training data-set by using the Latin Hypercube Sampling strategy to randomly choose Nq=10 000 points in the solution region arising from the exact bright soliton (7) with α=1, β=0.5 and $(x,t)\in [-8,8]\times [-3,3]$. The obtained data-set is then used to train an 8-layer deep neural network with 20 neurons per layer and the same hyperbolic tangent activation function to approximate the parameters α, β by minimizing the mean squared error loss given by equation (13), starting from α=β=0 in equation (12). We here use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization. Table 1 lists the learned parameters α, β in equation (11) for the data without perturbation and with a 2% perturbation; the corresponding errors of α, β are 3.85×10−5, 7.48×10−5 and 3.31×10−4, 2.89×10−4, respectively. Figure 5 exhibits the learning solutions and the relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t): (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3; (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3. The training times are (a1)–(a2) 1510 s and (b1)–(b2) 3572 s, respectively.

Figure 5. Data-driven parameter discovery of α and β in the sense of the bright soliton (7): (a1)–(a2) bright soliton without perturbation; (b1)–(b2) bright soliton with a 2% noise; (a2), (b2) the absolute value of the difference between the moduli of the exact and learning bright solitons. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3, (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3.



Table 1. Comparisons of α, β and their errors for the different training data-sets via deep learning.

Case  Solution                                 α          Error of α    β          Error of β
1     Exact bright soliton                     1          0             0.5        0
2     Bright soliton without perturbation      1.000 04   3.85×10−5     0.500 08   7.48×10−5
3     Bright soliton with a 2% perturbation    0.999 67   3.31×10−4     0.500 29   2.89×10−4

3.2. The data-driven parameter discovery for μ and ν

In what follows, we study the learned coefficients of the higher-order terms in equation (2) via the deep learning method. We consider the Hirota equation (2) with two parameters in the form$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q+\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x})=0,\end{eqnarray}$where μ and ν are the unknown real constants of the higher-order dispersion and nonlinear terms, respectively.

Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINNs $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q\\ +\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and parameters $(\mu ,\nu )$ by minimizing the mean squared error loss given by equation (13).
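Splitting (15) into real and imaginary parts gives a residual analogous to equation (3); a hedged sketch in the same TensorFlow style, building on the network q(x, t) above (the name F2 and the zero initial guesses for μ, ν are ours):

import tensorflow as tf

mu = tf.Variable(0.0, dtype=tf.float32)   # unknown higher-order dispersion coefficient (initial guess assumed)
nu = tf.Variable(0.0, dtype=tf.float32)   # unknown higher-order nonlinear coefficient (initial guess assumed)

def F2(x, t):
    # PINN residual of equation (15), split into real and imaginary parts.
    u, v = q(x, t)
    u_t = tf.gradients(u, t)[0]; v_t = tf.gradients(v, t)[0]
    u_x = tf.gradients(u, x)[0]; v_x = tf.gradients(v, x)[0]
    u_xx = tf.gradients(u_x, x)[0]; v_xx = tf.gradients(v_x, x)[0]
    u_xxx = tf.gradients(u_xx, x)[0]; v_xxx = tf.gradients(v_xx, x)[0]
    A2 = u**2 + v**2                      # |q|^2
    F_u = -v_t + u_xx + 2*A2*u - 0.5*(mu*v_xxx + nu*A2*v_x)
    F_v =  u_t + v_xx + 2*A2*v + 0.5*(mu*u_xxx + nu*A2*u_x)
    return F_u, F_v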

To illustrate the learning ability, we still use an 8-layer deep neural network with 20 neurons per layer. We choose Nq=10 000 sample points in the same way in the interior of the solution region. The 20 000 steps of Adam and 50 000 steps of L-BFGS optimization are used in the training process. Table 2 exhibits the learned values and the errors of μ and ν for the different training data-sets, and the results of the neural network fitting the exact solution are shown in figure 6. The training times are (a1)–(a2) 1971 s and (b1)–(b2) 1990 s, respectively.

Figure 6. Data-driven parameter discovery of μ and ν in the sense of the bright soliton (7): (a1)–(a2) the learning results for the bright soliton data set without perturbation; (b1)–(b2) the results with a 2% perturbation; (a2), (b2) the absolute value of the difference between the exact solution and the function represented by the neural network. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 8.0153×10−4, 1.0792×10−3, 1.2177×10−3, (b1)–(b2) 1.0770×10−3, 1.6541×10−3, 1.3370×10−3.



Table 2. Comparisons of μ, ν and their errors for the different training data-sets via deep learning.

Case  Solution                                 μ          Error of μ    ν          Error of ν
1     Exact bright soliton                     1          0             6          0
2     Bright soliton without perturbation      1.003 70   3.69×10−3     6.031 43   3.14×10−2
3     Bright soliton with a 2% perturbation    0.981 59   1.84×10−2     5.887 33   1.13×10−1

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Nos. 11925108 and 11731014).


References

[1] Askar'yan G A 1962 Sov. Phys. JETP 15 1088
[2] Zakharov V E 1968 J. Appl. Mech. Tech. Phys. 9 86–94
[3] Ablowitz M J, Prinari B and Trubatch A D 2003 Discrete and Continuous Nonlinear Schrödinger Systems (Cambridge: Cambridge University Press)
[4] Malomed B A, Mihalache D, Wise F and Torner L 2005 J. Opt. B 7 R53 DOI:10.1088/1464-4266/7/5/R02
[5] Osborne A R 2009 Nonlinear Ocean Waves (New York: Academic)
[6] Yan Z 2010 Commun. Theor. Phys. 54 947–949 DOI:10.1088/0253-6102/54/5/31
[7] Kivshar Y S and Agrawal G P 2013 Optical Solitons: From Fibers to Photonic Crystals (New York: Academic)
[8] Pitaevskii L and Stringari S 2016 Bose–Einstein Condensation and Superfluidity (Oxford: Oxford University Press)
[9] Kodama Y 1985 J. Stat. Phys. 39 597 DOI:10.1007/BF01008354
[10] Kodama Y and Hasegawa A 1987 IEEE J. Quantum Electron. 23 510 DOI:10.1109/JQE.1987.1073392
[11] Yan Z and Dai C 2013 J. Opt. 15 064012 DOI:10.1088/2040-8978/15/6/064012
[12] Hirota R 1973 J. Math. Phys. 14 805 DOI:10.1063/1.1666399
[13] Gogoi R, Kalita L and Devi N 2010 J. Phys.: Conf. Ser. 208 012085 DOI:10.1088/1742-6596/208/1/012085
[14] Trulsen K and Dysthe K B 1996 Wave Motion 24 281–289 DOI:10.1016/S0165-2125(96)00020-0
[15] Craig W, Guyenne P and Sulem C 2012 Eur. J. Mech. B 32 22–31 DOI:10.1016/j.euromechflu.2011.09.008
[16] Dodd R K and Bullough R K 1975 Lett. Nuovo Cimento 13 313–318 DOI:10.1007/BF02746476
[17] Zhang G, Chen S and Yan Z 2020 Commun. Nonlinear Sci. Numer. Simul. 80 104927 DOI:10.1016/j.cnsns.2019.104927
[18] Akhmediev N, Korneev V I and Mitskevich N V 1990 Radiophys. Quantum Electron. 33 95–100 DOI:10.1007/BF01037826
[19] Ankiewicz A, Soto-Crespo J M and Akhmediev N 2010 Phys. Rev. E 81 046602 DOI:10.1103/PhysRevE.81.046602
[20] Tao Y and He J 2012 Phys. Rev. E 85 026601 DOI:10.1103/PhysRevE.85.026601
[21] Yang Y, Yan Z and Malomed B A 2015 Chaos 25 103112 DOI:10.1063/1.4931594
[22] Chen S and Yan Z 2019 Appl. Math. Lett. 95 65 DOI:10.1016/j.aml.2019.03.020
[23] Wang L, Yan Z and Guo B 2020 Chaos 30 013114 DOI:10.1063/1.5129313
[24] LeCun Y, Bengio Y and Hinton G 2015 Nature 521 436 DOI:10.1038/nature14539
[25] Goodfellow I, Bengio Y and Courville A 2016 Deep Learning (Cambridge, MA: MIT Press)
[26] Dissanayake M and Phan-Thien N 1994 Commun. Numer. Methods Eng. 10 195–201 DOI:10.1002/cnm.1640100303
[27] Lagaris I E, Likas A and Fotiadis D I 1998 IEEE Trans. Neural Netw. 9 987–1000 DOI:10.1109/72.712178
[28] Raissi M and Karniadakis G E 2018 J. Comput. Phys. 357 125–141 DOI:10.1016/j.jcp.2017.11.039
[29] Han J, Jentzen A and E W 2018 Proc. Natl Acad. Sci. 115 8505–8510 DOI:10.1073/pnas.1718942115
[30] Pang G, Lu L and Karniadakis G E 2019 SIAM J. Sci. Comput. 41 A2603–A2626 DOI:10.1137/18M1229845
[31] Zhang D, Lu L, Guo L and Karniadakis G E 2019 J. Comput. Phys. 397 108850 DOI:10.1016/j.jcp.2019.07.048
[32] Raissi M, Perdikaris P and Karniadakis G E 2019 J. Comput. Phys. 378 686 DOI:10.1016/j.jcp.2018.10.045
[33] Long Z, Lu Y and Dong B 2019 J. Comput. Phys. 399 108925 DOI:10.1016/j.jcp.2019.108925
[34] Raissi M, Yazdani A and Karniadakis G E 2020 Science 367 1026–1030 DOI:10.1126/science.aaw4741
[35] Zhou Z and Yan Z 2021 Phys. Lett. A 387 127010 DOI:10.1016/j.physleta.2020.127010
[36] Wang L and Yan Z 2021 Phys. Lett. A 404 127408 DOI:10.1016/j.physleta.2021.127408
[37] Stein M 1987 Technometrics 29 143–151 DOI:10.1080/00401706.1987.10488205