
Neural-Network Quantum State of Transverse-Field Ising Model


Han-Qing Shi,† Xiao-Yue Sun,‡ Ding-Fang Zeng,§ Theoretical Physics Division, College of Applied Sciences, Beijing University of Technology, Beijing 100124, China

Corresponding authors: † E-mail: hq-shi@emails.bjut.edu.cn; ‡ E-mail: xy_sun@emails.bjut.edu.cn; § E-mail: dfzeng@bjut.edu.cn

Received: 2019-04-28; Online: 2019-11-01
Fund supported: Supported by the Natural Science Foundation of China under Grant No. 11875082


Abstract
Along the way initiated by Carleo and Troyer [G. Carleo and M. Troyer, Science 355 (2017) 602], we construct the neural-network quantum state of the transverse-field Ising model (TFIM) by an unsupervised machine learning method. Such a wave function is a map from the spin-configuration space to the complex numbers, determined by an array of network parameters. To obtain the ground state of the system, the values of the network parameters are calculated by the Stochastic Reconfiguration (SR) method, for which we provide an understanding from the action-principle and information-geometry perspectives. With this quantum state, we calculate key observables of the system: the energy, correlation function, correlation length, magnetic moment, and susceptibility. As an innovation, we provide a high-efficiency method to calculate the entanglement entropy (EE) of the system and obtain results that agree very well with previous work.
Keywords: neural-network quantum state; stochastic reconfiguration method; transverse-field Ising model; quantum phase transition


Cite this article
Han-Qing Shi, Xiao-Yue Sun, Ding-Fang Zeng, Neural-Network Quantum State of Transverse-Field Ising Model, 2019, 71(11): 1379-1387. doi:10.1088/0253-6102/71/11/1379

1 Introduction

In a general quantum many-body system, the dimension of the Hilbert space increases exponentially with the system size. Kohn called this "an exponential wall problem" in his Nobel Prize lecture.[1] This lofty wall prevents physicists from extracting features and information from such systems, and many efforts have been made to bypass it. The most productive and influential are the density matrix renormalization group (DMRG)[2] and quantum Monte Carlo (QMC).[3] But to this day, no universally satisfactory method has been discovered: each has its advantages and disadvantages. For example, DMRG is highly efficient for one-dimensional systems but does not work so well in higher dimensions, while QMC suffers from the notorious sign problem.[4]

However, machine learning is a rather strong method for extracting rules and information from big data sources: a machine can "learn" from data, acquire "intelligence", and then analyze newly input data and make decisions intelligently. Very naturally, we expect that machine learning may also be used to solve problems arising in quantum many-body systems. It has already been applied in condensed matter physics, statistical physics, quantum chromodynamics, AdS/CFT, black hole physics, and so on.[5-10]

In Ref. [13], Carleo and Troyer introduced a variational representation of quantum states for typical spin models in one and two dimensions, which can be considered a combination of Laughlin's idea and neural networks. This neural-network quantum state (NQS) is a map from the spin configuration space to the wave function, i.e. to the complex numbers. In this framework, the neural-network parameters are adjusted so that for each input spin configuration, the output number is proportional to the probability amplitude. In the current work, we apply this NQS representation and machine learning method to reconstruct the ground state of the TFIM, in both one and two dimensions, and calculate its key observables, especially the EE. For the SR method,[14-16] we provide an understanding based on the least action principle and information geometry.

The layout of our paper is as follows. This section covers history and motivation; the next section is a brief introduction to the neural-network quantum state and the TFIM. Section 3 discusses the SR method and its programming implementation. Section 4 presents our calculations of the key observables of the TFIM ground state, using the machine-learned NQS of Sec. 3. Section 5 presents our method for calculating the EE of the TFIM ground state. The last section gives our summary and prospects for future work.

2 Neural-Network Quantum State and Transverse-Field Ising Model

The neural network that Carleo and Troyer proposed to describe spin-${1}/{2}$ quantum systems has only two layers: a visible layer $s=(s_1,s_2,\ldots,s_N)$ corresponding to the real system, and a hidden layer $h=(h_1,h_2,\ldots,h_M)$ corresponding to an auxiliary structure. The connecting lines between the visible and hidden nodes represent interactions between them, but there are no connections within the visible layer or the hidden layer themselves. This type of neural network is termed a Restricted Boltzmann Machine (RBM); its schematic diagram is shown in Fig. 1. In the following, we do not distinguish between neural network and RBM.

Fig. 1

Fig. 1 (Color online) Schematic diagram of the Restricted Boltzmann Machine. This is a two-layer structure: the left is the visible layer, the right the auxiliary hidden layer. The dashed lines between nodes within the left and right layers do not imply interactions; they are drawn only to give a visual impression of a "layer". The lines between visible and hidden nodes represent interactions.



The many-body wave function can be understood as a map from the lattice spin configuration space to the complex numbers. Explicitly, this can be written as

$$ \Psi(s,\mathcal{W})=\!\!\sum_{\{h_j\}}\!\exp{\Big[\sum_{i}a_i s_i+\!\sum_{j}b_j h_j+\!\sum_{i,j}w_{ij}s_i h_j\Big]} \,, $$
where $s=\{s_i\}$ denotes the spin configuration and $\mathcal{W}=\{a,b,w\}$ are the weight parameters of the neural network; adjusting $\mathcal{W}$ is equivalent to adjusting the rules of the map. The $h_{j}\in\{1,-1\}$ are the hidden variables. Since there are no interactions within the visible and hidden layers themselves, the summation over hidden-layer spin configurations can be traced out, so the wave function can be written more simply as

$$ \Psi(s,\mathcal{W})=e^{\sum_{i}a_i s_i}\prod_{j=1}^{M} 2\cosh\Big[b_j+\sum_{i}w_{ij}s_i\Big]\,. $$
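As an illustration, the traced-out amplitude above can be evaluated in a few lines of NumPy. This is our own minimal sketch; the function name `rbm_psi` is ours, not from the original work.

```python
import numpy as np

def rbm_psi(s, a, b, w):
    """Traced-out RBM amplitude of the text:
    Psi(s) = exp(sum_i a_i s_i) * prod_j 2 cosh(b_j + sum_i w_ij s_i).
    s: spins +/-1, shape (N,); a: (N,); b: (M,); w: (N, M)."""
    theta = b + s @ w                      # effective field on each hidden node
    return np.exp(a @ s) * np.prod(2.0 * np.cosh(theta))
```

For all-zero parameters, every configuration receives the same amplitude $2^M$, as the formula requires.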
Mathematically, this NQS representation can be traced back to the work of Kolmogorov and Arnold.[17-18] It is the now-named Kolmogorov-Arnold representation theorem that makes it possible to express complicated higher-dimensional functions as superpositions of lower-dimensional ones.[19]

This work focuses on the TFIM, whose Hamiltonian has the form

$$ \mathcal{H}=-J\sum_{\langle i,j\rangle}\sigma_{i}^{z}\sigma_{j}^{z}-h\sum_{i}\sigma_{i}^{x}\,, $$
where $\sigma_{i}^{z}=\left(\begin{smallmatrix} 1 & 0 \\ 0 & -1 \end{smallmatrix}\right)$ and $\sigma_{i}^{x}=\left(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right)$ are the Pauli matrices, $J$ represents the spin coupling, and $h$ represents the strength of the transverse field. For our purposes the absolute values of $J$ and $h$ do not matter; what matters is their ratio $h/J$. We set $J=1$ and let $h$ vary throughout this work. Interest in this model dates back to the 1960s work of de Gennes and others on the order/disorder transition in some double-well ferroelectric systems.[20-21] Pfeuty's 1970 work[22] is a milestone in this area: there the one-dimensional model is solved exactly by a Jordan-Wigner transformation, and his results provide a reference standard to test the validity of our calculations. Our interest in this work is the neural-network quantum state representation and the corresponding ML method; the TFIM provides working examples to illustrate the ideas behind this method. It is believed that this new method will provide ways to find new physics behind more complicated lattice models.
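As an aside on how this Hamiltonian enters the numerics later (the local energy $E_{\rm loc}$ used in Sec. 3): the $\sigma^z\sigma^z$ term is diagonal in the spin basis, while each $\sigma_i^x$ maps $|s\rangle$ to the configuration with spin $i$ flipped. A minimal sketch for a 1D periodic chain, with `psi` any callable returning a wave-function amplitude (the naming and interface are our assumptions, not the authors' code):

```python
import numpy as np

def local_energy(s, psi, J=1.0, h=1.0):
    """E_loc(s) = <s|H|Psi>/Psi(s) for the 1D TFIM with periodic boundaries."""
    # diagonal z-z bonds: -J * sum_i s_i s_{i+1}  (np.roll closes the ring)
    e = -J * np.sum(s * np.roll(s, -1))
    # transverse term: sigma_i^x flips spin i, giving amplitude ratios
    psi_s = psi(s)
    for i in range(len(s)):
        s_flip = s.copy()
        s_flip[i] = -s_flip[i]
        e += -h * psi(s_flip) / psi_s
    return e
```
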

3 Stochastic Reconfiguration Method for the Ground State

The SR method[14-15] was first proposed by Sorella and his collaborators in studies addressing the sign problem. It was then used as an optimization method for finding goal functions from some general trial-function set, and can be viewed as a variation of the steepest descent (SD) method. Considering its key value for numerical calculations with neural-network quantum states, we provide here a new understanding of it based on the least action principle and information geometry. Information geometry dates back to Rao's work in 1945,[23] in which Rao took the Fisher information metric as the Riemannian metric of a statistical manifold and regarded geodesic distances as the differences between distributions. This discipline matured through the work of Shun'ichi Amari and others.[24] In recent years, it has also attracted attention as a tool to understand gravitation emergence and the AdS/CFT correspondence.[25-26]

The quantum state of our system is a function of the neural-network parameter set $\{\mathcal{W}_{k}\}\equiv\{a_i,b_j,w_{ij}\}$. We start from a trial function $\Psi_{T}$, controlled by the initial parameters $\{\mathcal{W}_k^{0}\}$. Consider a small variation of the parameters, $\mathcal{W}_{k}=\mathcal{W}_{k}^{0}+\delta\mathcal{W}_{k}$; to first order, the new wave function becomes

$$ \Psi_{T}^{\prime}(\mathcal{W})=\Psi_{T}(\mathcal{W}^{0})\Big[1+\sum_{k}\delta\mathcal{W}_{k}\frac{\partial}{\partial\mathcal{W}_{k}}\ln\Psi_{T}(\mathcal{W}^{0})\Big] . $$
Introducing a local operator $\mathcal{O}^{k}$ such that

$$ \mathcal{O}^{k}=\frac{\partial}{\partial\mathcal{W}_{k}}\ln\Psi_{T} , \label{Okdefinition} $$
and setting the identity operator $\mathcal{O}^{0}=1$, $\Psi_{T}^{\prime}$ can be rewritten in the more compact form

$$ \Psi_{T}^{\prime}(\mathcal{W})=\sum_{k}\delta\mathcal{W}_{k}\mathcal{O}^{k}\Psi_{T} \,. $$
Our goal is to find the ground-state wave function, i.e. to minimize the energy expectation value $\langle E\rangle={\langle\Psi|H|\Psi\rangle}/{\langle\Psi|\Psi\rangle}$. Obviously, $E$ depends on the parameters of the neural network, and the search for the ground state is equivalent to adjusting them. The key question is the strategy for updating the parameters from $\{\mathcal{W}_0\}$ to $\{\mathcal{W}\}$. This process is like moving from an initial point to a target point (the ground state in our case) in parameter space, and the parameter path connecting the two is determined by a "least action principle" in parameter space, as we show below.

In the SR method, the parameters are updated by the strategy

$$ \mathcal{W}_{i}\longrightarrow\mathcal{W}_{i}+\Delta t\sum_{k}s_{ik}^{-1}f_{k}\,, $$
where $s_{ik}$ is the metric of the parameter space, which will become clear from the following discussion. Our task here is to show that this strategy follows from a least action principle. For this purpose, we first introduce the generalized forces $f$,

$$ f_{k}=-\frac{\partial E}{\partial \mathcal{W}_{k}}\,. $$
Then variations of the energy $E$ due to changes of $\mathcal{W}$ can be written as

$$ \Delta E =\frac{\Delta E}{\Delta\mathcal{W}_i}\Delta\mathcal{W}_i=\frac{\Delta E}{\Delta\mathcal{W}_i}\,\Delta t \cdot s^{-1}_{ik}f_k =-\Delta t \cdot s^{-1}_{ik}\frac{\Delta E}{\Delta\mathcal{W}_i}\frac{\Delta E}{\Delta\mathcal{W}_k}\,, $$

i.e.

$$ \Delta E=-\Delta t\frac{(\Delta E)^{2}}{s_{ik}\Delta \mathcal{W}_i\Delta\mathcal{W}_k}\,. $$
Now if we define $s_{ik}\Delta\mathcal{W}_i\Delta\mathcal{W}_k\equiv\Delta s$ as the line element in the parameter space, then

$$ \Delta s=-\Delta E\Delta t \,. $$
In integrated form, this is nothing but

$$ \int d s=S=\int d t \mathcal{L}\,, $$
where $S$ is the "action" of the iterative process of seeking the ground state of the system and $\mathcal{L}$ is its corresponding "Lagrangian". The path formed in parameter space as the parameters are updated is determined by the corresponding least action principle; this is the physical meaning of the SR method. The SD method is the special case of SR whose parameter-space metric is simply Cartesian,

$$ \Delta s=\sum_{k}|\mathcal{W}^{\prime}_{k}-\mathcal{W}_{k}|^{2}\,. $$
However, in general we have no reason to take the parameter space to be so simple, so we introduce a metric such that

$$ \Delta s=\sum_{i,j}s_{ij}(\mathcal{W}^{\prime}_{i}-\mathcal{W}_{i})(\mathcal{W}^{\prime}_{j}-\mathcal{W}_{j})\,. $$
This is the reason why $s_{ik}$ appears in Eq. (7).

Obviously, the determination of $s_{ik}$ is the key to the question. On this point, the SR method tells us that

$$ s_{ik}=\langle\mathcal{O}^{i}\mathcal{O}^{k}\rangle- \langle\mathcal{O}^{i}\rangle\langle\mathcal{O}^{k}\rangle . $$
From the information-geometry perspective, this is very natural. Consider a general data distribution $p(x;\theta)$; the Fisher information matrix, or Riemannian metric on the statistical manifold, is defined as

$$ g_{ik}(\theta) =\int p(x;\theta)\frac{\partial \ln p(x;\theta)}{\partial \theta^{i}}\frac{\partial \ln p(x;\theta)}{\partial \theta^{k}}\,d x =\langle\partial_i\ln p\;\partial_k\ln p\rangle\,. $$
In our neural-network quantum state, the probability reads (our wave function is restricted to real values)

$$ p(s,\mathcal{W}_{k})=\frac{|\Psi(s,\mathcal{W}_{k})|^2}{\langle\Psi|\Psi\rangle}=\frac{\Psi^2(s,\mathcal{W}_{k})}{\sum_s \Psi^2(s,\mathcal{W}_{k})}\,. $$
Substituting this into Eq. (16), we find

$$ g_{ik}=\sum_s p(s,\mathcal{W})\frac{\partial\ln p(s,\mathcal{W})}{\partial\mathcal{W}^i}\frac{\partial\ln p(s,\mathcal{W})}{\partial \mathcal{W}^k} =\langle\mathcal{O}^{i}\mathcal{O}^{k}\rangle-\langle\mathcal{O}^{i}\rangle\langle\mathcal{O}^{k}\rangle\,. $$
This is exactly the result we wanted to show. The rationale behind this derivation is that, mathematically, a distribution function determined by its parameter set differs little from a quantum-state wave function determined by the corresponding neural-network parameters.

Now comes our concrete implementation of the ground-state-finding numerical program. The key idea is iterative execution of Eq. (7), starting from some arbitrary point of the $\mathcal{W}\equiv\{a,b,w\}$ parameter space. When the ground state is reached, the generalized force $f=-{\partial E}/{\partial \mathcal{W}}$ tends to zero and the parameters become stable. Due to the exponential size of the Hilbert space, for arbitrarily chosen parameters $\mathcal{W}$ we cannot evaluate expectation values by exhaustively listing all spin configurations; instead, we use the Metropolis-Hastings algorithm to sample the important configurations. The detailed steps are as follows.

$\bullet$ Step 1: starting from arbitrary $a,b,w$, we construct $\Psi_T(s,\mathcal{W})$ and generate $N_s=10^{3}$-$10^{4}$ spin-configuration samples $\{s\}$ through a Markov chain $s\rightarrow s'\rightarrow\cdots\rightarrow s^{(f)}$. The transition probability between two configurations $s$ and $s'$ is

$$ P(s\rightarrow s^{\prime})=\min\Big(1,\Big|\frac{\Psi(s^{\prime})}{\Psi(s)}\Big|^{2}\Big) . $$
$\bullet$ Step 2: for given $a,b,w$, calculate the corresponding $\mathcal{O}^{k}$:

$$ \mathcal{O}^{a_i}=\frac{1}{\Psi(s)}\partial_{a_{i}}\Psi(s)=\sigma_{i}^{z} , $$
$$ \mathcal{O}^{b_j}=\frac{\partial_{b_{j}}\Psi(s)}{\Psi(s)}=\tanh\Big[b_{j}+\sum_{i}w_{ij}\sigma_{i}^{z}\Big] , $$
$$ \mathcal{O}^{w_{ij}}=\frac{\partial_{w_{ij}}\Psi(s)}{\Psi(s)}=\sigma_{i}^{z}\tanh\Big[b_{j}+\sum_{i'}w_{i'j}\sigma_{i'}^{z}\Big] . $$
$\bullet$ Step 3: with $\mathcal{O}^{k}$, calculate $s_{ik}$ according to Eq. (14), where $\langle\cdots\rangle$ means averaging over the $N_s$ samples. Compute its inverse $s_{ik}^{-1}$ and update the parameters $a,b,w$ through Eq. (7).

$\bullet$ Step 4: repeat the above steps enough times, until the generalized force $f_{k}$ tends to zero and the parameters become stable under iteration; this yields the desired ground-state parameters.
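Steps 1 and 2 can be sketched in a few lines of NumPy. This is our own minimal illustration under our own naming (`rbm_psi`, `sample_configs`, `log_derivs`), not the authors' code:

```python
import numpy as np

def rbm_psi(s, a, b, w):
    """RBM amplitude Psi(s) = exp(a.s) * prod_j 2 cosh(b_j + (s w)_j)."""
    return np.exp(a @ s) * np.prod(2.0 * np.cosh(b + s @ w))

def sample_configs(a, b, w, n_samples, rng=None):
    """Step 1: single-spin-flip Metropolis chain with acceptance
    probability min(1, |Psi(s')/Psi(s)|^2); one sweep per stored sample."""
    rng = rng or np.random.default_rng(0)
    N = len(a)
    s = rng.choice([-1.0, 1.0], size=N)
    out = []
    for _ in range(n_samples):
        for _ in range(N):
            i = rng.integers(N)
            s_new = s.copy()
            s_new[i] = -s_new[i]
            ratio = rbm_psi(s_new, a, b, w) / rbm_psi(s, a, b, w)
            if rng.random() < min(1.0, abs(ratio) ** 2):
                s = s_new
        out.append(s.copy())
    return np.array(out)

def log_derivs(s, b, w):
    """Step 2: O^{a_i} = s_i, O^{b_j} = tanh(theta_j),
    O^{w_ij} = s_i tanh(theta_j), with theta_j = b_j + sum_i w_ij s_i."""
    t = np.tanh(b + s @ w)
    return np.concatenate([s, t, np.outer(s, t).ravel()])
```
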

Two points are noteworthy here:

i) In practical calculations, $f_{k}$ takes the form

$$ f_{k}=\langle E_{\rm loc}\rangle\langle\mathcal{O}^{k}\rangle-\langle E_{\rm loc}\mathcal{O}^{k}\rangle . $$
$E_{\rm loc}={\langle s|\mathcal{H}|\psi\rangle}/{\psi(s)}$ is the local energy of Variational Monte Carlo (VMC)[27] for each spin configuration.
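Putting Steps 2-4 together with this expression for $f_k$, one SR iteration can be sketched as below. The function name, the step size $\Delta t$, and the small diagonal shift added to keep $s_{ik}$ invertible are our own choices, not prescriptions from the paper:

```python
import numpy as np

def sr_step(O, E_loc, W, dt=0.05, eps=1e-4):
    """One update W -> W + dt * S^{-1} f  (Eq. (7)), with
    S_ik = <O_i O_k> - <O_i><O_k>        (Eq. (14)),
    f_k  = <E_loc><O_k> - <E_loc O_k>    (Eq. (23)).
    O: (Ns, Npar) log-derivatives per sample; E_loc: (Ns,) local energies.
    eps regularizes S so the linear solve stays well conditioned."""
    O_mean = O.mean(axis=0)
    S = O.T @ O / len(O) - np.outer(O_mean, O_mean)
    f = E_loc.mean() * O_mean - (E_loc[:, None] * O).mean(axis=0)
    return W + dt * np.linalg.solve(S + eps * np.eye(len(W)), f)
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse $s_{ik}^{-1}$, which is both cheaper and numerically safer.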

ii) Symmetries of the model can be used to reduce the number of parameters, as discussed in the supplementary materials of Carleo and Troyer's paper.[28] In our models, we impose periodic boundary conditions on the lattice, so translation symmetry is used in our calculation. Due to this symmetry, the number of free components in $a_i$ is $0$, in $b_j$ it is $M/N$, and in $w_{ij}$ it is $\alpha\times N={M}/{N}\times N=M$ instead of $M\times N$, where $\alpha$ is the ratio of the number of hidden nodes to the number of visible nodes.

The following are our numerical results for the TFIM on both one- and two-dimensional square lattices. Our numerical work divides into three parts:

i) The ground state wave function training.

ii) Key observables' measurement excluding EE.

iii) The EE's measurement.

In the one-dimensional model, we perform the ML and measurements for three different network parameters, $\alpha=1$, $2$, and $4$; almost no advantage is observed for larger $\alpha$. For the non-EE observables, our results are compared with the exact solutions of Ref. [22] and coincide very well. For the EE, we compare our results with Ref. [29]; probably due to finite-size effects, our results agree with the literature only qualitatively.

4 Key Observables of the TFIM Ground State

Our first observable is the per-site ground-state energy $E/N$ of the TFIM for the one- and two-dimensional models, whose dependence on the transverse-field strength is illustrated in Fig. 2.

Fig. 2

Fig. 2 (Color online) The ground-state energy $E/N$ of the TFIM on a one-dimensional 32-site spin chain (a) and a two-dimensional $10\times10$ lattice (b) as functions of the external field strength. The red, green, and blue points in the left panel are for network parameters $\alpha=1,2,4$; they are displaced from each other artificially, otherwise they would coincide almost exactly. The dashed line is the analytic result of Ref. [22]. The two-dimensional result is compared with the real-space renormalization group analysis of Ref. [31].



In the one-dimensional case, our numerical results fit Pfeuty's exact result[22] very well. Many numerical studies of the 1D TFIM have been done; for example, a Monte Carlo method was used in Ref. [30]. In the two-dimensional model, our results coincide with those from real-space renormalization group analyses.[31-32] In Ref. [31] the critical transverse field is $h=3.28$; in Ref. [32] it is $h=3.4351$. From the figure, we easily see that the energy decreases as the field strength increases, because part of the energy arises from the interaction between the spin sites and the external magnetic field. Very importantly, in the one-dimensional case we note that enlarging the network parameter $\alpha\equiv{M}/{N}$, i.e. the ratio of the hidden to visible node numbers, has almost no effect on the value or precision of the per-site energy, while the computation time grows at least linearly with $\alpha$. For this reason, we do not perform the 2D ML and measurements for $\alpha$ greater than 1.

Our second set of observables is the per-site magnetic moment and the corresponding susceptibility of the TFIM ground state,

$$ \langle M_x\rangle=\frac{\sum_i\langle\psi|\sigma_i^x|\psi\rangle}{N}\,, \qquad \chi_x = \lim_{\Delta\rightarrow0} \frac{\langle M_x (h+\Delta)-M_x(h)\rangle}{\Delta}\,. $$
Focusing on the component along the external transverse field, the results are displayed in Fig. 3. For the one-dimensional case, our results coincide very well with the existing literature. From the figure, we see that the susceptibility has a singularity at $h=1$ (1D case) and $h\approx3$ (2D case), corresponding to the quantum phase transition as the external field strength varies. The enlarged inset in this figure seems to indicate that ML with larger $\alpha$ gives an $\langle M_x\rangle$-$h$ line that coincides better with the analytic result.
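As a sketch of how $\langle M_x\rangle$ can be estimated from $|\Psi|^2$-distributed samples: since $\sigma_i^x$ flips spin $i$, its expectation reduces to the average amplitude ratio $\psi(s^{(i)})/\psi(s)$, with $s^{(i)}$ the flipped configuration. The implementation below is our own minimal version; the name and interface are assumptions:

```python
import numpy as np

def magnetization_x(samples, psi):
    """Estimate <M_x> = (1/N) sum_i <sigma_i^x> from |Psi|^2-distributed
    samples: <sigma_i^x> = < psi(s with spin i flipped) / psi(s) >."""
    n_samp, N = samples.shape
    total = 0.0
    for s in samples:
        psi_s = psi(s)
        for i in range(N):
            s_flip = s.copy()
            s_flip[i] = -s_flip[i]
            total += psi(s_flip) / psi_s
    return total / (n_samp * N)
```

The susceptibility $\chi_x$ then follows by a finite difference of this estimate at two nearby field strengths.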

Fig. 3

Fig. 3 (Color online) The per-site magnetic moment $\langle M_{x}\rangle$ and susceptibility $\chi_{x}$ in the one-dimensional (a), (c) and two-dimensional (b), (d) TFIM as functions of the external field strength. The 1D calculation is done for three different $\alpha$'s, while the 2D calculation fixes $\alpha=1$. The dashed line in the upper-left panel is the analytic $\langle M_{x}\rangle$ from Ref. [22]; the inset shows the conformity of the three $\alpha$'s numerics with the analytic result.



Our third set of observables is the spin-$z$ correlation function $\langle\sigma_{i}^{z}\sigma_{j}^{z}\rangle$ and the corresponding correlation length $\xi_{zz}$, the latter defined in our numerical implementation as $\xi_{zz}={\langle \sum_{j}|\vec{r}_{i}-\vec{r}_{j}|\sigma_{i}^{z}\sigma_{j}^{z}\rangle}/{\langle \sum_{j}\sigma_{i}^{z}\sigma_{j}^{z}\rangle}$. Our results are displayed in Fig. 4. From the figure, we easily see that the system exhibits long-range spin $z$-$z$ correlations at small transverse-field strength, while at large $h$ the correlation function decreases quickly. The behavior of the correlation length $\xi_{zz}$ shows this more directly: in the 1D case the jump in the correlation length occurs at $h\approx1$, while in the 2D case it occurs at $h\approx3$. Due to the finite size of our lattice model, $\xi_{zz}$ saturates in the small-$h$ region; this saturation will disappear in the thermodynamic limit, where the correlation length diverges at the critical point.
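A minimal numeric version of this $\xi_{zz}$ estimator for a periodic 1D chain might read as follows (our own sketch; distances are measured around the ring, and the function name is ours):

```python
import numpy as np

def correlation_length(samples, site=0):
    """xi_zz = sum_j |r_0 - r_j| <s_0 s_j> / sum_j <s_0 s_j>
    for a periodic chain, estimated from sampled configurations."""
    N = samples.shape[1]
    # shortest distance from the reference site around the periodic ring
    dist = np.minimum(np.arange(N), N - np.arange(N))
    corr = (samples[:, site:site + 1] * samples).mean(axis=0)   # <s_0 s_j>
    return (dist * corr).sum() / corr.sum()
```
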

Fig. 4

Fig. 4 (Color online) The ground-state spin-$z$ correlation function $\langle\sigma_{i}^{z}\sigma_{i+x}^{z}\rangle$ (a), (b) and the corresponding correlation length (c), (d) of the TFIM. The left panels are for the 1D chain with 32 sites; the right are for the 2D lattice of $10\times10$ sites.



Physically, the spin-spin interaction $J$ and the external field $h$ are two competing factors in the TFIM: the former tends to preserve order in the lattice, while the latter tries to break it. The quantum phase transition occurs at a critical value of $h/J$ (we set $J=1$ in our numerics). The larger critical value $h\approx3$ in 2D, compared with $h=1$ in 1D, is due to the larger number of spin-spin interaction bonds per site in 2D, twice as many as in 1D.

5 EE of the Ground State

Entanglement, a fascinating and spooky quantum feature without classical correspondence, is attracting more and more attention in many areas of physics.[33] It is believed that much important information about a quantum system can be obtained from its bipartite EE and entanglement spectrum.[34] More importantly, the area-law feature of EE sheds new light on our understanding of holographic dualities, quantum gravity, and spacetime emergence.[35-36] Nevertheless, we have very few ways to calculate this quantity efficiently. Our purpose in this section is to provide a new method for its calculation in both the one- and two-dimensional TFIM.

Let the total system be described by a density matrix $\rho$ and divide it into two parts, $A$ and $B$. The EE between the two is defined as

$$ S_{A}=-{\rm tr}[\rho_{A}\log \rho_{A}] \,, $$
where $\rho_{A}$ is the reduced density matrix of $A$, obtained by tracing out the B-part degrees of freedom. The behavior of the EE is regarded as a criterion for quantum phase transitions. Using conformal field theory methods, Calabrese and Cardy[29] calculated the EE of the infinitely long 1D Ising spin chain and showed that it tends to $\log2$ asymptotically as $h\rightarrow0$ and to 0 as $h\rightarrow\infty$; at the quantum critical point $h=1$ it diverges. In Ref. [37], Vidal et al. showed that for a spin chain in the noncritical regime, the EE grows monotonically as a function of the subsystem size $L$ and saturates at a certain subsystem size $L_{0}$, while at the critical value of $h$ it diverges logarithmically with $L$.

For a general state $|\Psi\rangle$ of a system consisting of two parts $A$ and $B$,

$$ |\Psi\rangle=\sum_{i,j}c_{i,j}|i\rangle_{A}\otimes|j\rangle_{B}\,. $$
The matrix coefficient $c_{ij}$ is the probability amplitude of a configuration whose part-A is in the $i$-th spin configuration while part-B is in the $j$-th. With the help of the Singular Value Decomposition (SVD),

$$ c_{ij}=U_{ik}\Sigma_{kk'}V^\dagger_{k'j}\,, $$
where $U$, $V$, and $\Sigma$ are $d_A\times d_A$, $d_B\times d_{B}$, and $d_A\times d_B$ matrices respectively, we can diagonalize $c_{ij}$ into $\Sigma_{kk'}$ and rewrite $|\Psi\rangle$ in the form

$$ |\Psi\rangle=\sum_{k=1}^{\min(d_{A},d_{B})}\sqrt{p_{k}}|\psi_{k}\rangle_{A}\otimes|\psi_{k}\rangle_{B}\,, $$
where $\sqrt{p_{k}}$ are the diagonal elements of $\Sigma_{kk'}$. The EE between $A$ and $B$ can then be calculated as

$$ S_{A}=-\sum_{k=1}^{\min(d_{A},d_{B})}p_{k}\log p_{k}\,. $$
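The SVD steps above translate directly into a short routine. The following is our own sketch, with the coefficient matrix normalized so that the $p_k$ sum to one:

```python
import numpy as np

def entanglement_entropy(c):
    """Von Neumann EE from the coefficient matrix c_ij:
    normalize, square the singular values to get p_k,
    and return S_A = -sum_k p_k log p_k."""
    c = c / np.linalg.norm(c)
    p = np.linalg.svd(c, compute_uv=False) ** 2
    p = p[p > 1e-12]                 # drop numerically zero weights
    return -np.sum(p * np.log(p))
```

For a maximally entangled two-qubit state the routine returns $\log 2$, and for a product state it returns $0$, matching the limits quoted above.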
However, the size of $c_{ij}$ increases exponentially with the number of lattice sites. This exponential devil makes the SVD hard to perform. We hope to get a reduced coefficient matrix that approximates the original $c_{ij}$ but preserves the key features of the system; we want to, and only can, include the important elements of $c_{ij}$. An approximation method to bypass the exponential wall problem is therefore needed. Our idea is as follows.

First, we write the general state of the lattice system with $N$ sites as a superposition of spin configurations in descending order of $|c_{\ell}|$,

$$ |\Psi\rangle=\sum_{\ell}^{2^{N}}c_{\ell}|s_{\ell}\rangle\,, \quad \mathrm{with} \quad \cdots\leqslant |c_3|\leqslant |c_2|\leqslant |c_1|\,. $$
Only the first $q$ configurations with the maximal $|c_\ell|$ are produced by a Monte Carlo sampling algorithm and saved for subsequent computations; for the 1D TFIM with 32 sites, $q=10^4$ is enough. Here $|c_{\ell}|=\psi(s_{\ell})$ is just the value of the NQS wave function obtained from the ML. When we write the subscript of $c_\ell$ as the combination of part-A's configuration $i$ and part-B's configuration $j$, we get a very sparse matrix $c_{ij}\equiv c_\ell$. If we substituted this $c_{ij}$ directly into Eqs. (26)--(28), what we got would be a very poor approximation of the EE. However, if we fill the blank positions of the $c_{ij}$ matrix with NQS wave-function values $\psi[s_{ij\equiv\ell}]$, our results become much better than those following from the original sparse matrix.
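The construction of this reduced, blank-filled $c_{ij}$ might be sketched as follows; the function name and interface are ours, with `samples` the top Monte Carlo configurations and `psi` the trained NQS amplitude:

```python
import numpy as np

def filled_coefficient_matrix(samples, psi, split):
    """Reduced c_ij of the text: rows/columns index the distinct A- and
    B-part configurations among the top sampled configurations, and every
    entry -- including A/B combinations never sampled together -- is
    filled with the NQS amplitude of the glued configuration."""
    A_set = sorted({tuple(s[:split]) for s in samples})
    B_set = sorted({tuple(s[split:]) for s in samples})
    c = np.array([[psi(np.array(a + b)) for b in B_set] for a in A_set])
    return c / np.linalg.norm(c)      # normalize so the p_k sum to one
```

The resulting matrix can then be fed to an SVD-based entropy routine in place of the full exponentially large $c_{ij}$.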

First, we show in Fig. 5 the $h$-dependence of the EE when the system (both the 1D chain and the 2D lattice) is equally bipartitioned. For the 1D chain, three network parameters $\alpha=1$, $2$, and $4$ are studied; all yield equally good results for $S$, but the computation time again grows at least linearly with $\alpha$. For this reason, we do not consider this parameter's effect for the 2D lattice. Our 1D numerical EE is compared with the analytical results of Refs. [29,37]: they have the same small-$h$ limit, $S\rightarrow\log 2$ as $h\rightarrow0$, similar decaying trends in the large-$h$ region, and the same quantum phase transition point $h\approx1$. For the 2D lattice, our EE indicates that the system may experience a quantum phase transition at $h=3\sim5$; combining this with the magnetic susceptibility and correlation length calculations of the previous section, we conclude that the transition occurs at $h\approx3$.

Fig. 5

Fig. 5 (Color online) The equal-size bipartite EE of the TFIM as a function of the external field strength. The left 1D spin chain has 32 sites and the right 2D lattice has $10\times10$. For the 1D chain, three network parameters $\alpha=1,2,4$ (red, green, blue) are tried, but the results exhibit little difference. The dashed gray line is the analytic result of Ref. [29].



Then, in Fig. 6, we study the dependence of the EE on the size of part A when the spin chain is arbitrarily bipartitioned. From the figure we see that this dependence is symmetric in the sizes of A and B. At the critical transverse-field strength, the EE increases monotonically as the A-part size grows to half the chain length, while for field strengths much smaller or larger than the critical value, the EE rises quickly to a saturation value before the A-part reaches half the total system size. These results agree with those of Vidal et al.[37]

Fig. 6

Fig. 6 (Color online) The EE of the TFIM as a function of the size of part A for some typical transverse-field strengths. The upper panel is for the 1D chain with 32 sites, the lower for the 2D lattice with $10\times10$ sites. In the 2D lattice, the bipartition is along the $45^\circ$ diagonal of the lattice square.



For all known quantum many-body systems, calculating the EE is challenging work. References [38] and [39] are two well-known works in this area. The former uses QMC with an improved ratio-trick sampling; its illustrative calculation is done for a 1D spin chain of 32 sites, and only the second Renyi entropy is calculated, with long running times. The latter uses a wave function obtained from RBM + ML and a replica trick; again, only the second Renyi entropy of a 32-site 1D spin chain is calculated as an illustration. By comparison, our method can calculate the von Neumann entropy directly for both the 1D and 2D TFIM. Our method adopts a new approximation in the SVD approach: we preserve the most important configurations of the system, corresponding to the important elements of the coefficient matrix, to represent the full wave function. The key is the reduction of the coefficient matrix $c_{ij}$ and the filling of its blank positions with wave-function values from the RBM. In the 1D case, we obtain results that agree closely with the known analytic CFT results.[29,37] In the 2D case, our EE calculation yields quantum phase transition signals consistent with those from other observables.

6 Conclusion and Discussion

Following the idea of artificial neural networks of Carleo and Troyer, we reconstruct the ground-state quantum wave function of the one- and two-dimensional TFIM through an unsupervised ML method. Based on the resulting wave function, we first calculate most of the key observables, including the ground-state energy, correlation function, correlation length, magnetic moment, and susceptibility of the system, and obtain results consistent with previous works. The stochastic reconfiguration method plays a key role in the ML of the neural-network quantum state representation; we provide in Sec. 3 of this work an intuitive understanding of this method based on the least action principle and information geometry. As a key innovation, we provide a numerical algorithm for calculating the EE within this framework of neural networks and ML methods, and with it we calculate the entanglement entropies of the system in both one and two dimensions.

For almost all quantum many-body systems, the calculation of EE is challenging. Neither DMRG nor QMC solves this problem satisfactorily: the former works well mainly in 1D models, while the latter has difficulty treating large lattice sizes. The method introduced here calculates the EE directly and applies to both 1D and 2D models. On our MacBook with a two-core 2.9 GHz CPU and 8 GB of RAM, all illustrating calculations presented in this work finish in less than five days.

As prospects, we point out that further exploring and revising our numerical algorithm, so that on 2D lattices it gives clearer and more definite EE signals of quantum phase transitions, and using our methods to study time-dependent processes in spin-lattice models[40] are both valuable working directions. On the other hand, exploring the NQS representation and the corresponding ML algorithms for other physical models, such as more general spin-lattice and Hubbard models, is an obviously interesting direction. For these models, more complicated neural networks, such as deep and convolutional ones, may be more powerful. Reference [41] shows that deep neural networks have the potential to represent quantum many-body systems more effectively, while Ref. [42] shows that the combination of convolutional neural networks with QMC works even for systems exhibiting severe sign problems.

References

[1] W. Kohn, Rev. Mod. Phys. 71 (1999) 1253.
[2] S. R. White, Phys. Rev. Lett. 69 (1992) 2863.
[3] D. Ceperley and B. Alder, Science 231 (1986) 555.
[4] M. Troyer and U. Wiese, Phys. Rev. Lett. 94 (2005) 170201, arXiv:cond-mat/0408370.
[5] P. Shanahan, D. Trewartha, and W. Detmold, Phys. Rev. D 97 (2018) 094506, arXiv:1801.05784.
[6] W. C. Gan and F. W. Shu, Int. J. Mod. Phys. D 26 (2017) 1743020, arXiv:1705.05750.
[7] J. Carrasquilla and R. Melko, Nat. Phys. 13 (2017) 431, arXiv:1605.01735.
[8] G. Torlai and R. Melko, Phys. Rev. B 94 (2016) 165134, arXiv:1606.02718.
[9] L. Wang, Phys. Rev. B 94 (2016) 195105, arXiv:1606.00318.
[10] A. Askar et al., Finding Black Holes with Black Boxes -- Using Machine Learning to Identify Globular Clusters with Black Hole Subsystems, arXiv:1811.06473.
[11] R. Laughlin, Phys. Rev. Lett. 50 (1983) 1395.
[12] G. E. Hinton and R. R. Salakhutdinov, Science 313 (2006) 504.
[13] G. Carleo and M. Troyer, Science 355 (2017) 602.
[14] S. Sorella, Phys. Rev. Lett. 80 (1998) 4558, arXiv:cond-mat/9902211.
[15] S. Sorella and L. Capriotti, Phys. Rev. B 61 (2000) 2599, arXiv:cond-mat/9902211.
[16] M. Casula, C. Attaccalite, and S. Sorella, J. Chem. Phys. 121 (2004) 7110, arXiv:cond-mat/0409644.
[17] A. Kolmogorov, Dokl. Akad. Nauk SSSR 108 (1961) 179.
[18] V. Arnold, Dokl. Akad. Nauk SSSR 114 (1957) 679.
[19] V. Kůrková, Neural Networks 5 (1992) 501.
[20] R. Blinc, J. Phys. Chem. Solids 13 (1960) 204.
[21] P. de Gennes, Solid State Commun. 1 (1963) 132.
[22] P. Pfeuty, Ann. Phys. 57 (1970) 79.
[23] C. Rao, Bull. Calcutta Math. Soc. 37 (1945) 81.
[24] S. Amari and H. Nagaoka, Methods of Information Geometry, Translations of Mathematical Monographs, Vol. 191, American Mathematical Society (2007), ISBN-10: 0821843028.
[25] H. Matsueda, Emergent General Relativity from Fisher Information Metric, arXiv:1310.1831.
[26] H. Matsueda, Geometry and Dynamics of Emergent Spacetime from Entanglement Spectrum, arXiv:1408.5589.
[27] W. McMillan, Phys. Rev. 138 (1965) A442.
[28] G. Carleo and M. Troyer, Supplementary Materials for Solving the Quantum Many-Body Problem with Artificial Neural Networks.
[29] P. Calabrese and J. Cardy, J. Stat. Mech. 0406 (2004) P06002, arXiv:hep-th/0405152.
[30] M. J. de Oliveira and J. R. N. Chiappin, Physica A 238 (1997) 307.
[31] D. Mattis and J. Gallardo, J. Phys. C: Solid St. Phys. 13 (1980) 2519.
[32] R. Miyazaki, H. Nishimori, and G. Ortiz, Phys. Rev. E 83 (2011) 051103, arXiv:1012.4557.
[33] N. Laflorencie, Phys. Rep. 646 (2016) 1, arXiv:1512.03388.
[34] H. Li and F. Haldane, Phys. Rev. Lett. 101 (2008) 010504, arXiv:0805.0332.
[35] S. Ryu and T. Takayanagi, Phys. Rev. Lett. 96 (2006) 181602, arXiv:hep-th/0306001.
[36] M. Van Raamsdonk, Gen. Rel. Grav. 42 (2010) 2323, arXiv:1005.3035.
[37] G. Vidal et al., Phys. Rev. Lett. 90 (2003) 227902, arXiv:quant-ph/0211074.
[38] M. Hastings, I. Gonzalez, A. Kallin, and R. Melko, Phys. Rev. Lett. 104 (2010) 157201.
[39] G. Torlai et al., Nature Phys. 14 (2018) 447, arXiv:1703.05334.
[40] S. Czischek, M. Gärttner, and T. Gasenzer, Phys. Rev. B 98 (2018) 024311.
[41] X. Gao and L. Duan, Nature Commun. 8 (2017) 662, arXiv:1701.05039.
[42] P. Broecker et al., Sci. Rep. 7 (2017) 8823, arXiv:1608.07848.