Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves
Zijian Zhou1,2, Zhenya Yan1,2,*
1 Key Laboratory of Mathematics Mechanization, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
2 School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
*Author to whom any correspondence should be addressed.
Received: 2021-06-03; Revised: 2021-07-10; Accepted: 2021-08-12; Online: 2021-09-03
Abstract: The dimensionless third-order nonlinear Schrödinger equation (alias the Hirota equation) is investigated via deep learning neural networks. In this paper, we use the physics-informed neural networks (PINNs) deep learning method to explore the data-driven solutions (e.g. bright soliton, breather, and rogue waves) of the Hirota equation when two types of training data, unperturbed and perturbed (with a 2% noise), are considered. Moreover, we use the PINNs deep learning method to study the data-driven discovery of parameters appearing in the Hirota equation with the aid of bright solitons.
Keywords: third-order nonlinear Schrödinger equation; deep learning; data-driven solitons; data-driven parameter discovery
Cite this article: Zijian Zhou, Zhenya Yan. Deep learning neural networks for the third-order nonlinear Schrödinger equation: bright solitons, breathers, and rogue waves. Communications in Theoretical Physics, 2021, 73(10): 105006. doi: 10.1088/1572-9494/ac1cd9
1. Introduction
As a fundamental and prototypical physical model, the nonlinear Schrödinger (NLS) equation is$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+\sigma | q{| }^{2}q=0,\quad (x,t)\in {{\mathbb{R}}}^{2},\quad \sigma =\pm 1,\end{eqnarray}$where q=q(x, t) denotes the complex field, the subscripts stand for the partial derivatives with respect to the variables, and σ=1 and σ=−1 correspond to the focusing and defocusing interactions, respectively. Equation (1) can be used to describe wave propagation in many kinds of Kerr nonlinear and dispersive media arising in plasma physics, the deep ocean, nonlinear optics, Bose–Einstein condensates, and even finance (see, e.g. [1–8] and references therein). When the propagation of ultra-short laser pulses (e.g. 100 fs [7]) is considered, the study of higher-order dispersive and nonlinear effects becomes significant, such as third-order dispersion, the self-frequency shift, and self-steepening arising from stimulated Raman scattering [9–11]. The third-order NLS equation (alias the Hirota equation [12]) is also a fundamental physical model. The Hirota equation and its extensions can also be used to describe strongly dispersive ion-acoustic waves in plasmas [13] and broader-banded waves in the deep ocean [14, 15]. The Hirota equation is completely integrable, and can be solved via the bilinear method [12], the inverse scattering transform [16, 17], the Darboux transform (see, e.g. [18–22]), etc. Recently, we numerically studied the spectral signatures of the spatial Lax pair with distinct potentials (e.g. bright solitons, breathers, and rogue waves) of the Hirota equation [23].
Up to now, artificial intelligence and machine learning have been widely used to deal powerfully with big data, and play a more and more important role in various fields, such as language translation, computer vision, speech recognition, and so on [24, 25]. More recently, deep neural networks were presented to study the data-driven solutions and parameter discovery of nonlinear physical models [26–36]. In particular, the physics-informed neural networks (PINNs) technique [28, 32] was developed to study nonlinear partial differential equations. In this paper, we would like to extend the PINNs deep learning method to investigate the data-driven solutions and parameter discovery for the focusing third-order NLS equation (alias the Hirota equation) with initial-boundary value conditions$\begin{eqnarray}\left\{\begin{array}{l}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\\ x\in (-L,L),\quad t\in ({t}_{0},T),\\ q(x,{t}_{0})={q}_{0}(x),\quad x\in [-L,L],\\ q(-L,t)=q(L,t),\quad t\in [{t}_{0},T],\end{array}\right.\end{eqnarray}$where q=q(x, t) is a complex envelope field, and α and β are real constants for the second- and third-order dispersion coefficients, respectively. For β=0, the Hirota equation (2) reduces to the NLS equation (1), whereas for α=0, the Hirota equation (2) reduces to the complex modified KdV equation [12].
2. The PINN scheme for the data-driven solutions
2.1. The PINNs scheme
In this section, we briefly introduce the PINN deep learning method [32] for the data-driven solutions. The main idea of the PINN deep learning method is to use a deep neural network to fit the solutions of equation (2). Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively. The complex-valued PINN $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$, with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, is written as$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)],\end{array}\end{eqnarray}$and one proceeds by approximating q(x, t) with a complex-valued deep neural network. In the PINN scheme, the complex-valued neural network $q(x,t)=(u(x,t),v(x,t))$ can be written as

def q(x, t):
    # two-output network: the first column is the real part u, the second is the imaginary part v
    uv = neural_net(tf.concat([x, t], 1), weights, biases)
    u = uv[:, 0:1]
    v = uv[:, 1:2]
    return u, v
Based on the defined q(x, t), the PINN F(x, t) can be taken as

def F(x, t):
    # residual of the Hirota equation (2), split into its real (F_u) and imaginary (F_v) parts
    u, v = q(x, t)
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_xx = tf.gradients(u_x, x)[0]
    u_xxx = tf.gradients(u_xx, x)[0]
    v_t = tf.gradients(v, t)[0]
    v_x = tf.gradients(v, x)[0]
    v_xx = tf.gradients(v_x, x)[0]
    v_xxx = tf.gradients(v_xx, x)[0]
    F_u = -v_t + alpha*(u_xx + 2*(u**2 + v**2)*u) - beta*(v_xxx + 6*(u**2 + v**2)*v_x)
    F_v = u_t + alpha*(v_xx + 2*(u**2 + v**2)*v) + beta*(u_xxx + 6*(u**2 + v**2)*u_x)
    return F_u, F_v
The shared parameters, weights and biases, between the neural network $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ and $F(x,t)\,={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ can be learned by minimizing the whole training loss (TL), that is, the sum of the ${{\mathbb{L}}}^{2}$-norm TLs of the initial data (${\mathrm{TL}}_{I}$), boundary data (${\mathrm{TL}}_{B}$), and the whole equation F(x, t) (${\mathrm{TL}}_{S}$)$\begin{eqnarray}\mathrm{TL}={\mathrm{TL}}_{I}+{\mathrm{TL}}_{B}+{\mathrm{TL}}_{S},\end{eqnarray}$where the mean squared (i.e. ${{\mathbb{L}}}^{2}$-norm) errors are chosen for them in the forms$\begin{eqnarray}\begin{array}{rcl}{\mathrm{TL}}_{I} & = & \displaystyle \frac{1}{{N}_{I}}\sum _{j=1}^{{N}_{I}}\left({\left|u({x}_{I}^{j},{t}_{0})-{u}_{0}^{j}\right|}^{2}+{\left|v({x}_{I}^{j},{t}_{0})-{v}_{0}^{j}\right|}^{2}\right),\\ {\mathrm{TL}}_{S} & = & \displaystyle \frac{1}{{N}_{S}}\sum _{j=1}^{{N}_{S}}\left({\left|{F}_{u}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}+{\left|{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\right|}^{2}\right),\\ {\mathrm{TL}}_{B} & = & \displaystyle \frac{1}{{N}_{B}}\sum _{j=1}^{{N}_{B}}\left({\left|u(-L,{t}_{B}^{j})-u(L,{t}_{B}^{j})\right|}^{2}\right.\\ & & \left.+{\left|v(-L,{t}_{B}^{j})-v(L,{t}_{B}^{j})\right|}^{2}\right),\end{array}\end{eqnarray}$with $\{{x}_{I}^{j},{u}_{0}^{j},{v}_{0}^{j}\}{}_{j=1}^{{N}_{I}}$ denoting the initial data (${q}_{0}(x)={u}_{0}(x)\,+{\rm{i}}{v}_{0}(x)$), $\{{t}_{B}^{j},u(\pm L,{t}_{B}^{j}),v(\pm L,{t}_{B}^{j})\}{}_{j=1}^{{N}_{B}}$ standing for the periodic boundary data, $\{{x}_{S}^{j},{t}_{S}^{j},{F}_{u}({x}_{S}^{j},{t}_{S}^{j}),{F}_{v}({x}_{S}^{j},{t}_{S}^{j})\}{}_{j\,=\,1}^{{N}_{S}}$ representing the collocation points of $F(x,t)={F}_{u}+{\rm{i}}{F}_{v}$ within a spatio-temporal region $(x,t)\in (-L,L)\times ({t}_{0},T]$. All of these sampling points are generated using a space-filling Latin Hypercube Sampling strategy [37].
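As a concrete illustration, a minimal sketch of how the three loss terms in equations (4)–(5) might be assembled in TensorFlow 1.x is given below; the tensor names (x0, u0, f_u_pred, etc.) are placeholders chosen for this sketch, and the helpers q and F refer to the listings above rather than to the authors' released code.

# x0, u0, v0 : N_I initial points and data at t = t_0
# tb         : N_B boundary times; the boundaries are x = -L and x = +L
# xf, tf_    : N_S collocation points, e.g. from Latin Hypercube Sampling
u0_pred, v0_pred = q(x0, t0*tf.ones_like(x0))
ul_pred, vl_pred = q(-L*tf.ones_like(tb), tb)      # left boundary x = -L
ur_pred, vr_pred = q(+L*tf.ones_like(tb), tb)      # right boundary x = +L
f_u_pred, f_v_pred = F(xf, tf_)

TL_I = tf.reduce_mean(tf.square(u0_pred - u0)) + tf.reduce_mean(tf.square(v0_pred - v0))
TL_B = tf.reduce_mean(tf.square(ul_pred - ur_pred)) + tf.reduce_mean(tf.square(vl_pred - vr_pred))
TL_S = tf.reduce_mean(tf.square(f_u_pred)) + tf.reduce_mean(tf.square(f_v_pred))
loss = TL_I + TL_B + TL_S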
We would like to discuss some data-driven solutions of equation (2) by the deep learning method. Here we choose a 5-layer deep neural network with 40 neurons per layer and the hyperbolic tangent activation function $\tanh (\cdot )$, applied component-wise,$\begin{eqnarray}{A}^{j+1}=\tanh \left({W}^{j+1}{A}^{j}+{B}^{j+1}\right)={\left(\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{1s}^{j+1}{a}_{s}^{j}+{b}_{1}^{j+1}\right),\cdots ,\tanh \left(\sum _{s=1}^{{m}_{j}}{w}_{{m}_{j+1}s}^{j+1}{a}_{s}^{j}+{b}_{{m}_{j+1}}^{j+1}\right)\right)}^{{\rm{T}}},\end{eqnarray}$to approximate the learning solutions, where ${A}^{j}={({a}_{1}^{j},{a}_{2}^{j},\ldots ,{a}_{{m}_{j}}^{j})}^{{\rm{T}}}$ and ${B}^{j}={({b}_{1}^{j},{b}_{2}^{j},\ldots ,{b}_{{m}_{j}}^{j})}^{{\rm{T}}}$ denote the output and bias column vectors of the jth layer, respectively, ${W}^{j+1}={({w}_{{ks}}^{j+1})}_{{m}_{j+1}\times {m}_{j}}$ stands for the weight matrix connecting the jth and (j+1)th layers, ${A}^{0}={(x,t)}^{{\rm{T}}}$, and ${A}^{M+1}={(u,v)}^{{\rm{T}}}$. The real and imaginary parts, u(x, t) and v(x, t), of the approximated solution $\tilde{q}(x,t)=u(x,t)+{\rm{i}}v(x,t)$ are represented by the two outputs of one neural network (see figure 1 for the PINN scheme). In the following, we consider some fundamental solutions (e.g. bright soliton, breather, and rogue wave solutions) of equation (2) by using the PINNs deep learning scheme. For the case $\alpha \beta \ne 0$ in equation (2), without loss of generality, we can take α=1, β=0.01.
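For concreteness, a minimal sketch of such a fully-connected tanh network in TensorFlow 1.x is shown below; the function names neural_net and initialize_NN and the simple normal initializer are illustrative assumptions (the listings above only assume that such a forward pass exists), not the authors' exact implementation.

import tensorflow as tf   # TensorFlow 1.x style, matching the listings above

def initialize_NN(layers):
    # e.g. layers = [2, 40, 40, 40, 40, 40, 2]: input (x, t), five hidden layers of 40 neurons, output (u, v)
    weights, biases = [], []
    for l in range(len(layers) - 1):
        W = tf.Variable(tf.random_normal([layers[l], layers[l+1]], stddev=0.1))
        b = tf.Variable(tf.zeros([1, layers[l+1]]))
        weights.append(W)
        biases.append(b)
    return weights, biases

def neural_net(X, weights, biases):
    # forward pass A^{j+1} = tanh(W^{j+1} A^j + B^{j+1}); the output layer is kept linear
    A = X
    for W, b in zip(weights[:-1], biases[:-1]):
        A = tf.tanh(tf.add(tf.matmul(A, W), b))
    return tf.add(tf.matmul(A, weights[-1]), biases[-1])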
Figure 1. The PINN scheme solving the Hirota equation (2) with the initial and boundary conditions, where the activation function ${ \mathcal T }=\tanh (\cdot )$.
2.2. The data-driven bright soliton
The first example we would like to consider is the fundamental bright soliton of equation (2) [9, 12]$\begin{eqnarray}{q}_{\mathrm{bs}}(x,t)={\rm{sech}} (x-\beta t){{\rm{e}}}^{{\rm{i}}t},\end{eqnarray}$where the third-order dispersion coefficient β stands for the wave velocity, and the sign of β represents the direction of wave propagation [right-going (left-going) travelling wave bright soliton for β > 0 (β < 0)].
We here choose L=10, t0=0, T=5, and will consider this problem by choosing two distinct kinds of initial sample points. In the first case, we choose NI=100 random sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$ with $x\in [-10,10]$. In the second case, we only choose NI=5 sample points from the initial data ${q}_{\mathrm{bs}}(x,t=0)$, namely the 5 equidistant and symmetric points $x\in \{-5,-2.5,0,2.5,5\}$. In both cases, we use the same NB=200 periodic boundary random sample points and NS=10 000 random sample points in the solution region $\{(x,t,{q}_{\mathrm{bs}}(x,t))| (x,t)\in [-10,10]\times [0,5]\}$. It is worth mentioning that the NS=10 000 sample points are obtained via the Latin Hypercube Sampling strategy [37].
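A minimal sketch of how such a training set might be generated with NumPy and the pyDOE Latin Hypercube Sampling routine is given below; the variable names and the use of pyDOE are illustrative assumptions for this sketch, not the authors' released data pipeline.

import numpy as np
from pyDOE import lhs   # Latin Hypercube Sampling, as in [37]

beta = 0.01
L, t0, T = 10.0, 0.0, 5.0
q_bs = lambda x, t: np.exp(1j*t)/np.cosh(x - beta*t)   # exact bright soliton (7)

N_I, N_B, N_S = 100, 200, 10000
# first case: N_I random initial points on [-L, L] at t = t0
x_I = np.random.uniform(-L, L, (N_I, 1))
q0 = q_bs(x_I, t0)
u0, v0 = np.real(q0), np.imag(q0)

# N_B random boundary times on [t0, T] for the periodic boundaries x = -L and x = +L
t_B = np.random.uniform(t0, T, (N_B, 1))

# N_S collocation points (x, t) in (-L, L) x (t0, T) from Latin Hypercube Sampling
lb, ub = np.array([-L, t0]), np.array([L, T])
X_S = lb + (ub - lb)*lhs(2, samples=N_S)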
We emulate the first case of initial data by using 10 000 steps of Adam and 10 000 steps of L-BFGS optimization, such that figures 2(a1)–(a3) and (b1)–(b3) illustrate the learning results starting from the unperturbed and perturbed (2% noise) training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 9.3183×10−3, 5.3270×10−2, 3.8502×10−2 in figures 2(a1)–(a3), and 7.0707×10−3, 2.4057×10−2, 1.6464×10−2 in figures 2(b1)–(b3). Similarly, we use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization for the second case of initial data, such that figures 2(c1)–(c3) and (d1)–(d3) illustrate the learning results starting from the unperturbed and perturbed training data, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are 1.8822×10−2, 4.9227×10−2, 4.0917×10−2 in figures 2(c1)–(c3), and 2.5427×10−2, 3.4825×10−2, 2.5983×10−2 in figures 2(d1)–(d3). Notice that the total learning times are (a) 717 s, (b) 741 s, (c) 1255 s, and (d) 1334 s, respectively, using a Lenovo notebook with a 2.6 GHz six-core i7 processor and an RTX 2060 graphics processor.
Figure 2. Data-driven bright soliton of the Hirota equation (2): (a1), (a2) and (b1), (b2) the learning solutions arising from the unperturbed and perturbed (2%) training data related to the first case of initial data, respectively; (c1), (c2) and (d1), (d2) the learning solutions arising from the unperturbed and perturbed (2%) training data related to the second case of initial data, respectively; (a3), (b3), (c3), (d3) the absolute values of the errors between the moduli of the exact and learning solutions. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 9.3183×10−3, 5.3270×10−2, 3.8502×10−2, (b1)–(b3) 7.0707×10−3, 2.4057×10−2, 1.6464×10−2, (c1)–(c3) 1.8822×10−2, 4.9227×10−2, 4.0917×10−2, (d1)–(d3) 2.5427×10−2, 3.4825×10−2, 2.5983×10−2.
In each step of the L-BFGS optimization, the program stops when$\begin{eqnarray}\begin{array}{l}| \mathrm{loss}(n)-\mathrm{loss}(n-1)| /\max (| \mathrm{loss}(n)| ,\\ | \mathrm{loss}(n-1)| ,1)\lt 1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps},\end{array}\end{eqnarray}$where loss(n) represents the value of the loss function in the nth step of the L-BFGS optimization, and $1.0\times \mathrm{np}.\mathrm{finfo}(\mathrm{float}).\mathrm{eps}$ represents the machine epsilon. When the relative change between $\mathrm{loss}(n)$ and $\mathrm{loss}(n-1)$ is less than the machine epsilon, the procedure stops. This is why the computation times differ between tests even when the same number of optimization steps is prescribed.
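This matches the form of the ftol convergence test used by SciPy's L-BFGS-B routine; a hedged sketch of how such an optimizer might be configured with scipy.optimize.minimize (rather than through the authors' TensorFlow interface) is:

import numpy as np
from scipy.optimize import minimize

# loss_and_grad(theta) is assumed to return the scalar training loss and its gradient
# with respect to the flattened vector theta of network weights and biases
result = minimize(loss_and_grad, theta0, jac=True, method='L-BFGS-B',
                  options={'maxiter': 50000,
                           'maxfun': 50000,
                           'ftol': 1.0*np.finfo(float).eps})   # stopping criterion (8)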
2.3. The data-driven AKM breather solution
The second example we would like to study is the AKM breather (spatio-temporal periodic pattern) of equation (2) [18]$\begin{eqnarray}{q}_{\mathrm{akm}}(x,t)=\displaystyle \frac{\cosh (\omega t-2{\rm{i}}c)-\cos (c)\cos (p\xi )}{\cosh (\omega t)-\cos (c)\cos (p\xi )}{{\rm{e}}}^{2{\rm{i}}t},\end{eqnarray}$where $\xi =x-2\beta [2+\cos (2c)]t$, $\omega =2\sin (2c)$, $p=2\sin (c)$, and c is a real constant. The wave velocity and wavenumber of this periodic wave are $2\beta [2+\cos (2c)]$ and p, respectively. This AKM breather differs from the Akhmediev breather (spatial periodic pattern) of the NLS equation because equation (2) contains the third-order coefficient β. In this example, we take β=0.01 again. When $t\to \infty $, $| {q}_{\mathrm{akm}}(x,t){| }^{2}\to 1$. If $\beta \to 0$, we have $\xi \to x$, and the AKM breather then reduces to the Akhmediev breather.
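As a quick way to generate reference data, the AKM breather (9) can be evaluated directly with NumPy; the short function below is an illustrative sketch (the default c = 1 is only an example value, not one taken from the paper).

import numpy as np

def q_akm(x, t, c=1.0, beta=0.01):
    # AKM breather (9): xi = x - 2*beta*(2 + cos(2c))*t, omega = 2*sin(2c), p = 2*sin(c)
    xi = x - 2.0*beta*(2.0 + np.cos(2.0*c))*t
    omega = 2.0*np.sin(2.0*c)
    p = 2.0*np.sin(c)
    num = np.cosh(omega*t - 2.0j*c) - np.cos(c)*np.cos(p*xi)
    den = np.cosh(omega*t) - np.cos(c)*np.cos(p*xi)
    return num/den*np.exp(2.0j*t)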
We here choose L=10 and $t\in [-3,3]$, and choose NI=100 random sample points from the initial data ${q}_{\mathrm{akm}}(x,t=0)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-10,10]\times [-3,3]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the solutions from the unperturbed and perturbed (a 2% noise) initial data. As a result, figures 3(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (a 2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, and (b) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2. The learning times are 2268 s and 1848 s, respectively.
Figure 3. Learning breathers related to the AKM breather (9) of the Hirota equation (2). (a1)–(a3) The unperturbed case; (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 1.1011×10−2, 3.5650×10−2, 5.0245×10−2, (b1)–(b3) 1.3458×10−2, 5.1326×10−2, 7.0242×10−2.
2.4. The data-driven rogue wave solution
The third example is a fundamental rogue wave solution of equation (2), which can be generated when one takes $c\to 0$ in the AKM breather (9) in the form [19]$\begin{eqnarray}{q}_{\mathrm{rw}}(x,t)=\left[1-\displaystyle \frac{4(1+4{\rm{i}}t)}{4{\left(x-6\beta t\right)}^{2}+16{t}^{2}+1}\right]{{\rm{e}}}^{2{\rm{i}}t}.\end{eqnarray}$As $| x| ,| t| \to \infty $, $| {q}_{\mathrm{rw}}| \to 1$, and ${\max }_{x,t}| {q}_{\mathrm{rw}}| =3$.
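As with the breather, the rogue wave (10) is easy to evaluate numerically for generating training and reference data; the following NumPy sketch (with β = 0.01, as elsewhere in section 2) is given only for illustration.

import numpy as np

def q_rw(x, t, beta=0.01):
    # fundamental rogue wave (10); |q_rw| -> 1 as |x|, |t| -> infinity, and max |q_rw| = 3 at (x, t) = (0, 0)
    return (1.0 - 4.0*(1.0 + 4.0j*t)/(4.0*(x - 6.0*beta*t)**2 + 16.0*t**2 + 1.0))*np.exp(2.0j*t)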
We here choose L=2.5 and $t\in [-0.5,0.5]$, and consider ${q}_{\mathrm{rw}}(x,t=-0.5)$ as the initial condition. We still choose NI=100 random sample points from the initial data ${q}_{\mathrm{rw}}(x,t=-0.5)$, NB=200 random sample points from the periodic boundary data, and NS=10 000 random sample points in the solution region $(x,t)\in [-2.5,2.5]\times [-0.5,0.5]$. We use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization to learn the rogue wave solutions from the unperturbed and perturbed (a 2% noise) initial data, respectively. As a result, figures 4(a1)–(a3) and (b1)–(b3) exhibit the learning results for the unperturbed and perturbed (a 2% noise) cases, respectively. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, and (b) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3. The learning times are 1524 s and 1414 s, respectively.
Figure 4. Learning rogue wave solution related to equation (10) of the Hirota equation (2). (a1)–(a3) The unperturbed case; (b1)–(b3) the 2% perturbed case. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a3) 6.7597×10−3, 8.8414×10−3, 1.6590×10−2, (b1)–(b3) 3.9537×10−3, 5.8719×10−3, 9.0493×10−3.
3. The PINNs scheme for the data-driven parameter discovery
In this section, we apply the PINNs deep learning method to study the data-driven parameter discovery of the Hirota equation (2). In the following, we first use the deep learning method to identify the parameters α and β in the Hirota equation (2), and then use this method to identify the parameters of the higher-order terms of equation (2).
3.1. The data-driven parameter discovery for α and β
Here we would like to use the PINNs deep learning method to identify the coefficients α, β of second- and third-order dispersive terms in the Hirota equation$\begin{eqnarray}{\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)+{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x})=0,\end{eqnarray}$where α, β are the unknown real-valued parameters.
Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINNs $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+\alpha ({q}_{{xx}}+2| q{| }^{2}q)\\ +{\rm{i}}\beta ({q}_{{xxx}}+6| q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and the parameters (α, β) by minimizing the mean squared error loss$\begin{eqnarray}\begin{array}{l}\mathrm{TL}=\displaystyle \frac{1}{{N}_{p}}\sum _{j=1}^{{N}_{p}}\left(| u({x}^{j},{t}^{j})-{u}^{j}{| }^{2}+| v({x}^{j},{t}^{j})\right.\\ \left.-{v}^{j}{| }^{2}+| {F}_{u}({x}^{j},{t}^{j}){| }^{2}+| {F}_{v}({x}^{j},{t}^{j}){| }^{2}\right),\end{array}\end{eqnarray}$where ${\{{x}^{j},{t}^{j},{u}^{j},{v}^{j}\}}_{j=1}^{{N}_{p}}$ represents the training data on the real and imaginary parts of the exact solution given by equation (7) with $\alpha =1,\beta =0.5$ in $(x,t)\in [-8,8]\times [-3,3]$, and $u({x}^{j},{t}^{j}),v({x}^{j},{t}^{j})$ are the real and imaginary parts of the approximate solution $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$.
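Relative to the forward PINN listing in section 2.1, the only change needed for this inverse problem is to promote α and β to trainable variables; the snippet below is an illustrative sketch in the same TensorFlow 1.x style, not the authors' exact code.

# alpha and beta become trainable scalars, initialized at 0 as stated in the text,
# and are updated together with the network weights when minimizing the loss (13)
alpha = tf.Variable(0.0, dtype=tf.float32)
beta = tf.Variable(0.0, dtype=tf.float32)

def F(x, t):
    u, v = q(x, t)
    u_t, v_t = tf.gradients(u, t)[0], tf.gradients(v, t)[0]
    u_x, v_x = tf.gradients(u, x)[0], tf.gradients(v, x)[0]
    u_xx, v_xx = tf.gradients(u_x, x)[0], tf.gradients(v_x, x)[0]
    u_xxx, v_xxx = tf.gradients(u_xx, x)[0], tf.gradients(v_xx, x)[0]
    F_u = -v_t + alpha*(u_xx + 2*(u**2 + v**2)*u) - beta*(v_xxx + 6*(u**2 + v**2)*v_x)
    F_v = u_t + alpha*(v_xx + 2*(u**2 + v**2)*v) + beta*(u_xxx + 6*(u**2 + v**2)*u_x)
    return F_u, F_v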
To study the data-driven parameter discovery of the Hirota equation (2) for α, β, we generate a training data-set by using the Latin Hypercube Sampling strategy to randomly choose Np=10 000 points in the solution region arising from the exact bright soliton (7) with α=1, β=0.5 and $(x,t)\in [-8,8]\times [-3,3]$. Then the obtained data-set is applied to train an 8-layer deep neural network with 20 neurons per layer and the same hyperbolic tangent activation function to approximate the parameters α, β by minimizing the mean squared error loss given by equation (13), starting from α=β=0 in equation (12). We here use 20 000 steps of Adam and 50 000 steps of L-BFGS optimization. Table 1 illustrates the learned parameters α, β in equation (11) for the cases of the data without perturbation and with a 2% perturbation, and the corresponding errors of α, β are 3.85×10−5, 7.48×10−5 and 3.31×10−4, 2.89×10−4, respectively. Figure 5 exhibits the learning solutions and the relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t): (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3; (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3, where the training times are (a1)–(a2) 1510 s and (b1)–(b2) 3572 s, respectively.
Figure 5. Data-driven parameter discovery of α and β in the sense of the bright soliton (7). (a1)–(a2) Bright soliton without perturbation; (b1)–(b2) bright soliton with a 2% noise. (a2), (b2) The absolute value of the difference between the moduli of the exact and learning bright solitons. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 7.0371×10−4, 1.0894×10−3, 1.0335×10−3, (b1)–(b2) 9.4420×10−4, 1.4055×10−3, 1.2136×10−3.
Table 1. Comparisons of the learned α, β and their errors for the different training data-sets via deep learning.
3.2. The data-driven parameter discovery for μ and ν
In what follows, we will study the learning of the coefficients of the higher-order terms in equation (2) via the deep learning method. We consider the Hirota equation (2) with two parameters in the form$\begin{eqnarray}{\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q+\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x})=0,\end{eqnarray}$where μ and ν are the unknown real constants of the higher-order dispersion and nonlinear terms, respectively.
Let $q(x,t)=u(x,t)+{\rm{i}}v(x,t)$ with $u(x,t),v(x,t)$ being its real and imaginary parts, respectively, and the PINNs $F(x,t)={F}_{u}(x,t)+{\rm{i}}{F}_{v}(x,t)$ with ${F}_{u}(x,t),{F}_{v}(x,t)$ being its real and imaginary parts, respectively, be$\begin{eqnarray}\begin{array}{l}F(x,t):= {\rm{i}}{q}_{t}+{q}_{{xx}}+2| q{| }^{2}q\\ +\displaystyle \frac{{\rm{i}}}{2}(\mu {q}_{{xxx}}+\nu | q{| }^{2}{q}_{x}),\\ {F}_{u}(x,t):= \mathrm{Re}[F(x,t)],\\ {F}_{v}(x,t):= \mathrm{Im}[F(x,t)].\end{array}\end{eqnarray}$Then the deep neural network is used to learn $\{u(x,t),v(x,t)\}$ and parameters $(\mu ,\nu )$ by minimizing the mean squared error loss given by equation (13).
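The corresponding PINN residual only changes in its F_u, F_v expressions; a hedged sketch with μ and ν as trainable variables (initial values of 0 assumed by analogy with section 3.1, and the name F_hov chosen only for this sketch) is:

mu = tf.Variable(0.0, dtype=tf.float32)   # trainable higher-order dispersion coefficient
nu = tf.Variable(0.0, dtype=tf.float32)   # trainable higher-order nonlinearity coefficient

def F_hov(x, t):
    # residual of equation (15) with q = u + iv; derivatives computed as in the earlier listing
    u, v = q(x, t)
    u_t, v_t = tf.gradients(u, t)[0], tf.gradients(v, t)[0]
    u_x, v_x = tf.gradients(u, x)[0], tf.gradients(v, x)[0]
    u_xx, v_xx = tf.gradients(u_x, x)[0], tf.gradients(v_x, x)[0]
    u_xxx, v_xxx = tf.gradients(u_xx, x)[0], tf.gradients(v_xx, x)[0]
    F_u = -v_t + u_xx + 2*(u**2 + v**2)*u - 0.5*(mu*v_xxx + nu*(u**2 + v**2)*v_x)
    F_v = u_t + v_xx + 2*(u**2 + v**2)*v + 0.5*(mu*u_xxx + nu*(u**2 + v**2)*u_x)
    return F_u, F_v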
To illustrate the learning ability, we still use an 8-layer deep neural network with 20 neurons per layer. We choose Np=10 000 sample points in the same way in the interior of the solution region. 20 000 steps of Adam and 50 000 steps of L-BFGS optimization are used in the training process. Table 2 exhibits the learned values of μ and ν and their errors for the different training data-sets, and the results of the neural network fitting the exact solution are shown in figure 6. The training times are (a1)–(a2) 1971 s and (b1)–(b2) 1990 s, respectively.
Figure 6. Data-driven parameter discovery of μ and ν in the sense of the bright soliton (7). (a1)–(a2) The learning results from the bright soliton data-set without perturbation; (b1)–(b2) the results with a 2% perturbation. (a2), (b2) The absolute value of the difference between the moduli of the exact solution and the learning solution represented by the neural network. The relative ${{\mathbb{L}}}^{2}$-norm errors of q(x, t), u(x, t) and v(x, t), respectively, are (a1)–(a2) 8.0153×10−4, 1.0792×10−3, 1.2177×10−3, (b1)–(b2) 1.0770×10−3, 1.6541×10−3, 1.3370×10−3.
Table 2. Comparisons of the learned μ, ν and their errors for the different training data-sets via deep learning.