
Self-consistent tomography of temporally correlated errors


Mingxia Huo1,2, Ying Li3
1Beijing Computational Science Research Center, Beijing 100193, China
2Department of Physics and Beijing Key Laboratory for Magneto-Photoelectrical Composite and Interface Science, School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China
3Graduate School of China Academy of Engineering Physics, Beijing 100193, China

First author contact: Author to whom any correspondence should be addressed.
Received: 2021-02-13; Revised: 2021-04-09; Accepted: 2021-04-13; Online: 2021-05-13


Abstract
The error model of a quantum computer is essential for optimizing quantum algorithms to minimize the impact of errors using quantum error correction or error mitigation. Noise with temporal correlations, e.g. low-frequency noise and context-dependent noise, is common in quantum computation devices and sometimes even significant. However, conventional tomography methods have not been developed for obtaining an error model describing temporal correlations. In this paper, we propose self-consistent tomography protocols to obtain a model of temporally correlated errors, and we demonstrate that our protocols are efficient for low-frequency noise and context-dependent noise.
Keywords: self-consistent tomography; temporally correlated errors; quantum computer


Cite this article
Mingxia Huo, Ying Li. Self-consistent tomography of temporally correlated errors. Communications in Theoretical Physics, 2021, 73(7): 075101- doi:10.1088/1572-9494/abf72f

1. Introduction

How to correct errors is one of the most critical issues in practical quantum computation. In the theory of quantum fault tolerance based on quantum error correction (QEC), an arbitrarily high-fidelity quantum computation can be achieved, provided that error rates are below the threshold and sufficient qubits are available [1]. Recently, error rates within or close to the sub-threshold regime have been demonstrated in various platforms [2–6]. These error rates are measured using either randomized benchmarking (RB) [7–13] or quantum process tomography (QPT) [14, 15]. RB only estimates an average effect of the noise, whereas QPT can provide a model of error channels. Rigorously speaking, whether or not a quantum system is in the sub-threshold regime is determined not only by the error rate but also by the detailed error model [16, 17], including correlations between errors [18]. Therefore, an error model describing correlated errors is important for verifying sub-threshold quantum devices. We can also optimize QEC protocols by exploring these correlations [16, 19], which is crucial for the early-stage demonstration of small-scale quantum fault tolerance. Given the limited number of qubits and error rates close to the threshold, we need to carefully choose the protocol to observe any advantage of QEC [20–28].

We may still need many years to realise a fault-tolerant quantum computer [29, 30]; however, noisy intermediate-scale quantum (NISQ) computers are likely to be developed in the near future [31–33]. Quantum error mitigation is an alternative approach to high-fidelity quantum computation [34–37], which does not require encoding and is therefore more promising than QEC on NISQ devices. In quantum error mitigation using error extrapolation, we can increase errors to learn their effect on the observable representing the computation result. Once we know how the observable changes with the level of errors, we can make an extrapolated estimate of the zero-error computation result. This extrapolation can be implemented directly on the final result or on each gate using the quasi-probability decomposition formalism. The effect of errors depends on the error model. Therefore, we need to increase errors according to the model of the original errors in the system, i.e. we first need a proper model of the original errors. The error model can be obtained using gate set tomography (GST) [6, 38–42], a self-consistent QPT protocol. With the self-consistent error model, the effect of errors on the computation result can be eliminated, under the condition that errors are not correlated [36]. However, correlations are common in quantum systems [43–45], e.g. the slow drift of laser frequency can cause time-dependent gate fidelity in ion trap systems [46–50], which limits the fidelity of quantum computation on NISQ devices. Neither RB nor conventional QPT can provide an error model describing temporal correlations [6, 10–13].

In this paper, we propose tomography protocols to obtain the model of temporally correlated errors without using any additional operations accessing the environment. Our protocols reconstruct the error model in a self-consistent manner [6, 38–42], i.e. the model may be different from the actual one but can fit the experimentally measured data. Temporal correlations are caused by the correlations between the system and the environment. Without a set of informationally-complete state preparation and measurement operations, we cannot implement conventional QPT on the environment. We find that quantum gates are fully characterized by a set of linear operators acting on a subspace of Hermitian matrices, which can be measured in the experiment using only the gates themselves, even without information completeness. However, these operators may not be complete completely positive (CP) maps as in conventional QPT. We term our method linear operator tomography (LOT), which can be used to reconstruct an operator representation of quantum gates without using additional operations on the environment. A tremendous amount of experimental data may be required to obtain the exact model of temporally correlated errors. For the practical implementation, we aim at an approximate error model, and we find that efficient approximations for low-frequency time-dependent noise and context-dependent noise exist [51, 52]. Practical protocols are proposed and demonstrated numerically. Error rates estimated using RB and GST may exhibit significant differences due to temporal correlations [6]. In numerical simulations, we show that RB and LOT results coincide with each other even in the presence of temporal correlations.

The paper is organized as follows. In section 2, we first introduce a general model of a quantum computer including the environment, wherein temporal correlations are caused by the environment. In section 3, we show that, in principle, by using only the operations designed for operating the system, we can reconstruct a self-consistent model of both the system and the environment; the exact LOT protocol is also introduced there. In section 4, we present the condition for performing a truncation on the system-environment (SE) state space. In section 5, we discuss low-frequency time-dependent noise and context-dependent noise. We show that a low-dimensional state space can characterize these two types of temporal correlations. In section 6, we give two approximate tomography protocols for the practical implementation. In section 7, we demonstrate the protocol in numerical simulations.

2. The model of a quantum computer

To illustrate how to describe errors with temporal correlations, we start with an example in ion trap systems. If the quantum gate is driven by the laser field, usually the gate fidelity depends on the laser frequency λ [46–50]. The laser frequency drifts with time, and usually we are not aware of its instant value. We use λ to denote a time-dependent random variable, such as the laser frequency. The stochastic process of λ is characterized by the probability distribution $\bar{p}(f)$, i.e. the value of λ at time t is f(t), and $\bar{p}(f)$ is the probability density in the space of functions f(t). We use the superoperator ${{ \mathcal O }}_{{\rm{S}}}(\lambda )$ to denote the gate operation given by the laser frequency λ. With the initial state ρS, the output state after two gates ${{ \mathcal O }}_{{\rm{S}}}$ and ${{ \mathcal O }}_{{\rm{S}}}^{\prime} $ at t and $t^{\prime} $, respectively, reads ${\rho }_{{\rm{S}}}^{(2)}=\int {\rm{d}}f\bar{p}(f){{ \mathcal O }}_{{\rm{S}}}^{\prime} (f(t^{\prime} )){{ \mathcal O }}_{{\rm{S}}}(f(t))({\rho }_{{\rm{S}}})$. In general, the state ${\rho }_{{\rm{S}}}^{(2)}$ cannot be written as two independent operations acting on the initial state. Therefore, conventional QPT cannot be applied [6, 14, 15, 38–42].

For simplification, we assume that λ changes slowly with time, and the typical time that λ changes is much longer than the time scale of a quantum circuit, i.e. the time from the state preparation to the measurement. In this case, we neglect the change of λ within each run of the quantum circuit, i.e. $f(t^{\prime} )=f(t)$ for two gates in the same run. We also assume that the distribution is stationary, then the distribution of the instant value, i.e. $p(\lambda )=\int {\rm{d}}f\bar{p}(f)\delta (\lambda -f(t))$, is time-independent. With the distribution of the instant value, we can rewrite the output state in the form ${\rho }_{{\rm{S}}}^{(2)}=\int {\rm{d}}\lambda p(\lambda ){{ \mathcal O }}_{{\rm{S}}}^{\prime} (\lambda ){{ \mathcal O }}_{{\rm{S}}}(\lambda )({\rho }_{{\rm{S}}})$.

We can factorize multi-gate superoperators by introducing the state space of the laser frequency. We use ${\left|\lambda \right\rangle }_{{\rm{E}}}$ to denote the state corresponding to the laser frequency λ. The state space of ${\left|\lambda \right\rangle }_{{\rm{E}}}$ can be virtual, i.e. it is not necessary that ${\left|\lambda \right\rangle }_{{\rm{E}}}$ corresponds to a pure state in a physical Hilbert space. The initial state of qubits and the laser frequency can be expressed as ${\rho }_{\mathrm{SE}}=\int {\rm{d}}\lambda p(\lambda ){\rho }_{{\rm{S}}}\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$. Then the superoperator of a laser-frequency-dependent gate can be expressed as ${{ \mathcal O }}_{\mathrm{SE}}=\int {\rm{d}}\lambda {{ \mathcal O }}_{{\rm{S}}}(\lambda )\otimes [\left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}]$, where [U] denotes a superoperator [U](ρ) = UρU†. After two gates, the state of the system and the laser frequency reads ${\rho }_{\mathrm{SE}}^{(2)}={{ \mathcal O }}_{\mathrm{SE}}^{\prime} {{ \mathcal O }}_{\mathrm{SE}}({\rho }_{\mathrm{SE}})$, which is in the factorized form. One can find that ${\rho }_{{\rm{S}}}^{(2)}={\mathrm{Tr}}_{{\rm{E}}}({\rho }_{\mathrm{SE}}^{(2)})$.
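The factorization can be checked numerically on a toy model. The following is a minimal sketch, not the authors' code: it assumes a single qubit, a gate that is an X rotation by an angle π/2 + λ, and a hypothetical three-point distribution p(λ). The environment is represented only by its classical label (one block per value of λ), which is the diagonal sector of the SE construction described above.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)

def gate_superop(lam):
    """Superoperator of an X rotation by pi/2 + lam (column-stacking convention)."""
    theta = np.pi / 2 + lam
    U = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * X
    return np.kron(U.conj(), U)  # vec(U rho U^dagger) = (conj(U) kron U) vec(rho)

# Hypothetical discrete distribution p(lambda) of the slowly drifting parameter.
lams = np.array([-0.3, 0.0, 0.3])
probs = np.array([0.25, 0.5, 0.25])
m = len(lams)

rho_s = np.array([[1, 0], [0, 0]], dtype=complex)  # initial system state |0><0|
vec_rho = rho_s.reshape(-1, order="F")

# Correlated two-gate output: the same lambda enters both gates of one run.
exact = sum(p * gate_superop(l) @ gate_superop(l) @ vec_rho
            for p, l in zip(probs, lams))

# Naive factorized guess: average each gate independently, then compose.
avg_gate = sum(p * gate_superop(l) for p, l in zip(probs, lams))
naive = avg_gate @ avg_gate @ vec_rho

# SE embedding: one block per value of lambda (classical environment label).
E = np.eye(m)
O_se = sum(np.kron(np.outer(E[k], E[k]), gate_superop(lams[k])) for k in range(m))
rho_se = np.kron(probs, vec_rho)             # blocks p_k * vec(rho_S)
out_se = O_se @ O_se @ rho_se                # factorized product on SE
traced = out_se.reshape(m, 4).sum(axis=0)    # trace over the environment label

print("SE embedding reproduces the correlated output:", np.allclose(traced, exact))
print("Independent averaging reproduces it:", np.allclose(naive, exact))
```

In this sketch the product of two SE operators reproduces the correlated output after tracing out the environment label, whereas composing the independently averaged gates does not.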

We note that multi-gate superoperators can be factorized following a similar procedure for noise with any spectrum, i.e. the change of the variable λ in the time scale of a quantum circuit can be nonnegligible or even significant, as we will show in section 5.1. It is straightforward to generalize the approach to the case of multiple random variables, e.g. the gate fidelity depends on a set of drifting laser parameters, and the case that the initial state of the system also depends on random variables.

By introducing the environment, which is the frequency space in the example, we can describe temporally correlated errors. Such a formalism has been used in an ion-trap tomography experiment [6], in which a classical bit is introduced to represent the environment memory. With one classical bit and one qubit, the tomography is implemented on an eight-dimensional state space.

Now, we introduce a general model of a quantum computer. For a quantum computer with n qubits, we call the $2^n$-dimensional Hilbert space of qubits the system. Degrees of freedom coupled to the system form the environment, including but not limited to all random variables determining the evolution of the system. Quantum computation is realised by a sequence of quantum gates. The gate sequence is stored in a terminal, e.g. a classical computer, and the evolution of the SE is controlled by the terminal as shown in figure 1. We use χ to denote the state of the terminal indicating ‘Implement the gate χ’ and the superoperator ${{ \mathcal O }}_{\mathrm{SE}}(\chi )$ to denote the corresponding evolution of SE. Here χ is a deterministic parameter rather than a random variable. By setting χ according to the gate sequence, we realise the quantum computation. We assume that the Born-Markov approximation can be applied to the terminal and SE, and operations on SE are Markovian and factorized. For the gate sequence χ1, χ2, …, χN, the overall evolution of SE is ${{ \mathcal O }}_{\mathrm{SE}}({\chi }_{N})\cdots {{ \mathcal O }}_{\mathrm{SE}}({\chi }_{2}){{ \mathcal O }}_{\mathrm{SE}}({\chi }_{1})$.

Figure 1. Quantum computer controlled by the state of a terminal. The state of the terminal χ results in the evolution ${{ \mathcal O }}_{\mathrm{SE}}(\chi )$ of the system and environment. The evolution of the system ${{ \mathcal O }}_{{\rm{S}}}(\chi ,{\rho }_{\mathrm{SE}})$ depends on both the terminal state χ and the state of the system and environment at the beginning of the evolution ρSE.


In this model, the time dependence of operations on the system is not expressed explicitly. Operations on SE are time-independent. However, the corresponding system operations are stochastic and depend on the environment state. When the environment state evolves, driven by SE operations, the system operations evolve accordingly. In section 5.1, we will give an example in which SE operations drive the stochastic process of the environment. In this way, we can describe errors with general temporal correlations, not only correlations caused by classical random variables such as laser frequencies, but also correlations caused by the coupling to a quantum system in the environment.

In general, the evolution of the system ${{ \mathcal O }}_{{\rm{S}}}(\chi ,{\rho }_{\mathrm{SE}})$ depends on not only χ but also the state of SE at the beginning of the evolution ρSE. If the system and environment are correlated in ρSE, the system evolution may not even be CP [53]. If the system evolution ${{ \mathcal O }}_{{\rm{S}}}(\chi ,{\rho }_{\mathrm{SE}})={{ \mathcal O }}_{{\rm{S}}}(\chi )$ does not depend on ρSE, the overall system evolution of a gate sequence is ${{ \mathcal O }}_{{\rm{S}}}({\chi }_{N})\cdots {{ \mathcal O }}_{{\rm{S}}}({\chi }_{2}){{ \mathcal O }}_{{\rm{S}}}({\chi }_{1})$. In this case, conventional QPT can be applied: we can obtain ${{ \mathcal O }}_{{\rm{S}}}(\chi )$ up to a similarity transformation using GST [6, 38–42], and the computation error can be mitigated as proposed in [36]. By introducing the environment, non-Markovian processes can be reconstructed using quantum tomography [54].

From now on, we focus on states and operations of SE and neglect the subscript ‘SE’. All states, operations and observables without a subscript (‘SE’, ‘S’ or ‘E’) correspond to SE; and subscripts ‘S’ and ‘E’ are used to denote the system and the environment, respectively.

We would like to remark that LOT protocols proposed in this paper cannot reconstruct the complete CP maps acting on SE as in conventional QPT protocols. In LOT, we only use the operations designed to operate the system, i.e. quantum gates for the computation, which actually act on SE because of imperfections. Given the limited accessibility to the environment, it is unrealistic to implement informationally-complete state preparation and measurement for the full tomography of SE.

2.1. State, measurement, operations and Pauli transfer matrix representation

A quantum computer is characterized by a set of linear operators: the initial state ρin which is a normalized positive Hermitian operator, the measurement (i.e. measured observable) Qout which is also a Hermitian operator, and a set of elementary computation operations $\{{ \mathcal O }(\chi )\}$ which are CP maps. We remark that ρin, Qout and $\{{ \mathcal O }(\chi )\}$ describe the actual quantum computer (including both the system and environment) rather than an ideal quantum computer; they are all unknown and therefore need to be determined by tomography. The quantum computation is realised by a sequence of elementary operations on the initial state. The set of operation sequences $O=\{{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{2}){ \mathcal O }({\chi }_{1})\}$ includes all operations generated by elementary operations $\{{ \mathcal O }(\chi )\}$.

We focus on the case that the quantum computer only provides one option of the initial state ρin and one option of the observable to be measured Qout. It is straightforward to generalize our results to the case that multiple options of initial states and observables are available.

In this paper, we use the Pauli transfer matrix representation [6, 38–42]: $\left|\rho \right.\unicode{x027EB}$ is a column vector with elements ${\left|\rho \right.\unicode{x027EB}}_{\sigma }=\mathrm{Tr}(\sigma \rho );$ $\left.\unicode{x027EA}Q\right|$ is a row vector with elements ${\left.\unicode{x027EA}Q\right|}_{\sigma }={d}_{{\rm{H}}}^{-1}\mathrm{Tr}(Q\sigma );$ then a quantum operation ${ \mathcal O }$ can be expressed as a square matrix with elements ${{ \mathcal O }}_{\sigma ,\tau }={d}_{{\rm{H}}}^{-1}\mathrm{Tr}[\sigma { \mathcal O }(\tau )]$. Here, σ and τ are Pauli operators or generalized Pauli operators, i.e. Hermitian operators satisfying $\mathrm{Tr}(\sigma \tau )={d}_{{\rm{H}}}{\delta }_{\sigma ,\tau }$, and dH is the dimension of the Hilbert space of SE. All these vectors and matrices are real and ${d}_{{\rm{H}}}^{2}$-dimensional. For a state ρ and an observable Q, $\unicode{x027EA}Q| \rho \unicode{x027EB}=\mathrm{Tr}(Q\rho )$ is the mean of the observable Q in the state ρ. For an operation ${ \mathcal O }$, $\left|{ \mathcal O }(\rho )\right.\unicode{x027EB}={ \mathcal O }\left|\rho \right.\unicode{x027EB}$ is the vector corresponding to the state ${ \mathcal O }(\rho )$. Therefore, the mean of the observable Q in the state ρ after a sequence of quantum operations reads $\mathrm{Tr}[Q{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}(\rho )]=\left.\unicode{x027EA}Q\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|\rho \right.\unicode{x027EB}$.
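As an illustration of these conventions, the sketch below builds the vectors and the transfer matrix for a single qubit (dH = 2) with the ordinary Pauli basis. The choice of a Hadamard gate, the state |0⟩⟨0| and the observable X is an assumption made only for the example; the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
D_H = 2  # Hilbert-space dimension of the example

def state_vector(rho):
    """Column vector with elements Tr(sigma rho)."""
    return np.array([np.trace(s @ rho).real for s in PAULIS])

def observable_vector(Q):
    """Row vector with elements Tr(Q sigma) / d_H."""
    return np.array([np.trace(Q @ s).real / D_H for s in PAULIS])

def transfer_matrix(channel):
    """Matrix with elements Tr[sigma channel(tau)] / d_H."""
    return np.array([[np.trace(s @ channel(t)).real / D_H for t in PAULIS]
                     for s in PAULIS])

H = (X + Z) / np.sqrt(2)

def hadamard(rho):
    return H @ rho @ H.conj().T

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)

lhs = observable_vector(X) @ transfer_matrix(hadamard) @ state_vector(rho0)
rhs = np.trace(X @ hadamard(rho0)).real
print(lhs, rhs)
```

Both numbers are the mean of X after the Hadamard gate, computed once through the vector and matrix representation and once directly from the trace formula.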

3. Self-consistent tomography without information completeness

Information completeness is required by conventional QPT protocols. If we can prepare a complete set of states $\{\left|{\rho }_{i}\right.\unicode{x027EB}={{ \mathcal O }}_{i}\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\}$ and measure a complete set of observables $\{\left.\unicode{x027EA}{Q}_{k}\right|=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{{ \mathcal O }}_{k}^{\prime} \}$, i.e. ${d}_{{\rm{H}}}^{2}$ linearly independent vectors in each set, we can implement QPT on SE to reconstruct the CP maps of quantum gates. Here, the CP maps act on SE. However, information completeness for SE is unrealistic.

In this section, we demonstrate that it is possible to exactly characterize a set of quantum gates using tomography without information completeness. In the LOT formalism, we obtain a set of operators acting on a subspace of Hermitian matrices to represent quantum gates, which may not be complete CP maps without information completeness. Such an operator representation is adequate in the sense that given the initial state, an arbitrary operation sequence and the observable to be measured, the average value of the observable can be computed using these operators.

Because information completeness is not required, we can use LOT to characterize the quantum computer even in the presence of temporal correlations. In this section, the feasibility is not our concern. In the following sections, we will discuss how to adapt the protocol for the purpose of practical implementation.

3.1. Linear operator tomography

With only computation operations, which are designed to operate the system but act on SE because of imperfections, we usually do not have complete sets of states and observables and therefore cannot access the entire space of Hermitian matrices.

Given an initial state ρin, an observable Qout and a set of elementary operations $\{{ \mathcal O }(\chi )\}$, we consider three subspaces of Hermitian matrices as follows. The subspace ${V}_{\mathrm{in}}=\mathrm{span}(\{{ \mathcal O }\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\,| \,{ \mathcal O }\in O\})$ is the span of all states that can be prepared in the quantum computer, and the subspace ${V}_{\mathrm{out}}=\mathrm{span}(\{\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }\,| \,{ \mathcal O }\in O\})$ is the span of all observables that can be effectively measured. Note that O is the set of all operations generated by elementary operations. We use Pin and Pout to denote the orthogonal projections on Vin and Vout, respectively. The third subspace is $V=\mathrm{span}(\{{P}_{\mathrm{out}}{ \mathcal O }\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\,| \,{ \mathcal O }\in O\})$, and we use P to denote the orthogonal projection on V.

The subspace V is the space of Hermitian matrices that the finite set of operations can access. If V is the entire Hermitian-matrix space with the dimension ${d}_{{\rm{H}}}^{2}$, states and observables are complete, and LOT is the same as GST. In general, the completeness is not required in LOT.

Our first result is that in order to fully characterize the quantum computer, we only need to reconstruct $P\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$, $\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|P$ and $\{P{ \mathcal O }(\chi )P\}$ in the tomography. The reason is that, for an arbitrary sequence of operations in O, we have $\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|P{{ \mathcal O }}_{N}P\cdots P{{ \mathcal O }}_{2}P{{ \mathcal O }}_{1}P\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}.\end{array}\end{eqnarray}$ (1) See appendix A for the proof. We would like to remark that the conclusion is the same for the subspace $\mathrm{span}(\{\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }{P}_{\mathrm{in}}\,| \,{ \mathcal O }\in O\})$.

With this result, we can perform the tomography in a similar way to GST. We note that the protocol presented in this section is for the purpose of illustrating the self-consistent formalism rather than the practical implementation, and we discuss the practical implementation later. We need to assume that the dimension of the subspace V is finite and known, see discussions at the end of this section. The dimension of V is ${d}_{V}=\mathrm{Tr}(P)$. The exact LOT protocol is as follows:
(i) Choose a set of states $\{\left|{\rho }_{i}\right.\unicode{x027EB}={{ \mathcal O }}_{i}\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\}$ and a set of observables $\{\left.\unicode{x027EA}{Q}_{k}\right|=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{{ \mathcal O }}_{k}^{\prime} \}$. Here, ${{ \mathcal O }}_{i},{{ \mathcal O }}_{k}^{\prime} \in O$, i, k = 1, ⋯ ,d and we take d = dV. We always take $\left|{\rho }_{1}\right.\unicode{x027EB}=\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ and $\left.\unicode{x027EA}{Q}_{1}\right|=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|$. These states and observables must satisfy the condition that $\{P\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and $\{\left.\unicode{x027EA}{Q}_{k}\right|P\}$ are both linearly independent. According to the definition of the subspace V, states and observables satisfying the condition always exist and can be realised in the quantum computer using the combination of elementary operations.
(ii) Obtain matrices g = MoutMin and $\widetilde{{ \mathcal O }}(\chi )={M}_{\mathrm{out}}{ \mathcal O }(\chi ){M}_{\mathrm{in}}$ for each χ in the experiment. Here, ${M}_{\mathrm{in}}\,=[\ \left|{\rho }_{1}\right.\unicode{x027EB}\ \cdots \ \left|{\rho }_{d}\right.\unicode{x027EB}\ ]$ is the matrix with $\{\left|{\rho }_{i}\right.\unicode{x027EB}\}$ as columns, and ${M}_{\mathrm{out}}={[{\left.\unicode{x027EA}{Q}_{1}\right|}^{{\rm{T}}}\cdots {\left.\unicode{x027EA}{Q}_{d}\right|}^{{\rm{T}}}]}^{{\rm{T}}}$ is the matrix with $\{\left.\unicode{x027EA}{Q}_{k}\right|\}$ as rows. Each matrix element can be measured in the experiment. The element ${g}_{k,i}=\unicode{x027EA}{Q}_{k}| {\rho }_{i}\unicode{x027EB}$ is the mean of Qk in the state ρi. The element ${\widetilde{{ \mathcal O }}}_{k,i}(\chi )\,=\left.\unicode{x027EA}{Q}_{k}\right|{ \mathcal O }(\chi )\left|{\rho }_{i}\right.\unicode{x027EB}$ is the mean of Qk in the state ρi after the operation ${ \mathcal O }(\chi )$. When $\{P\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and $\{\left.\unicode{x027EA}{Q}_{k}\right|P\}$ are both linearly independent, g is invertible. We remark that $\{P\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and $\{\left.\unicode{x027EA}{Q}_{k}\right|P\}$ may not be linearly independent when $\{\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and $\{\left.\unicode{x027EA}{Q}_{k}\right|\}$ are linearly independent because of the projection P.


Data g and $\{\widetilde{{ \mathcal O }}(\chi )\}$ exactly characterize the quantum computer. We have $\begin{eqnarray}{M}_{\mathrm{out}}{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}{M}_{\mathrm{in}}={\widetilde{{ \mathcal O }}}_{N}{g}^{-1}\cdots {\widetilde{{ \mathcal O }}}_{2}{g}^{-1}{\widetilde{{ \mathcal O }}}_{1}\end{eqnarray}$ (2) for an arbitrary sequence of operations in O. See appendix A for the proof.
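The identity relating sequence data to g and the matrices $\widetilde{{ \mathcal O }}(\chi )$ is a statement of linear algebra and can be checked on synthetic data. The sketch below is an illustration under stated assumptions, not a simulation of a physical device: the "gates" are random real matrices standing in for transfer matrices (they are not CP maps), the full space has a hypothetical dimension D = 6, and the matrices are constructed so that the accessible subspace is three-dimensional.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 6, 3  # full vector-space dimension and dimension of the accessible subspace

def random_gate():
    """Random real matrix that keeps span(e_0, ..., e_{d-1}) invariant."""
    G = rng.normal(size=(D, D)) / np.sqrt(D)
    G[d:, :d] = 0.0
    return G

gates = {"a": random_gate(), "b": random_gate()}
rho_in = np.concatenate([rng.normal(size=d), np.zeros(D - d)])  # lies in the subspace
Q_out = rng.normal(size=D)

# Preparation and measurement sequences; the identity gives rho_1 = rho_in, Q_1 = Q_out.
prep = [np.eye(D), gates["a"], gates["b"]]
meas = [np.eye(D), gates["a"], gates["b"]]
M_in = np.column_stack([O @ rho_in for O in prep])   # states as columns, D x d
M_out = np.vstack([Q_out @ O for O in meas])         # observables as rows, d x D

# "Measured" data: the Gram matrix and one d x d matrix per gate.
g = M_out @ M_in
tilde = {name: M_out @ G @ M_in for name, G in gates.items()}

def from_data(sequence):
    """Right-hand side: tilde_N g^-1 ... tilde_2 g^-1 tilde_1, using data only."""
    out = tilde[sequence[0]]
    for name in sequence[1:]:
        out = tilde[name] @ np.linalg.solve(g, out)
    return out

def direct(sequence):
    """Left-hand side: M_out O_N ... O_1 M_in, using the hidden full model."""
    out = M_in
    for name in sequence:
        out = gates[name] @ out
    return M_out @ out

sequence = ["a", "b", "b", "a"]  # applied left to right
print(np.allclose(from_data(sequence), direct(sequence)))
```

Only the d × d matrices enter `from_data`, even though the hidden model acts on a larger space; this is the sense in which the data characterize every sequence.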

Given g and $\{\widetilde{{ \mathcal O }}(\chi )\}$, we can obtain an exact error model of the quantum computer.
(i) Choose a d-dimensional invertible real matrix ${\widehat{M}}_{\mathrm{in}}$, and compute ${\widehat{M}}_{\mathrm{out}}=g{\widehat{M}}_{\mathrm{in}}^{-1}$.
(ii) Take $\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}={\widehat{M}}_{\mathrm{in};\bullet ,1}$ and $\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|={\widehat{M}}_{\mathrm{out};1,\bullet }$, and compute $\widehat{{ \mathcal O }}(\chi )={\widehat{M}}_{\mathrm{out}}^{-1}\widetilde{{ \mathcal O }}(\chi ){\widehat{M}}_{\mathrm{in}}^{-1}$ for each χ.
Here, M•,i and Mk,• denote the ith column and the kth row of the matrix M, respectively. The error model of the quantum computer is formed by $\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}$, $\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|$ and $\{\widehat{{ \mathcal O }}(\chi )\}$, which correspond to $\left|\rho \right.\unicode{x027EB}$, $\left.\unicode{x027EA}Q\right|$ and $\{{ \mathcal O }(\chi )\}$, respectively.

According to equation (2), we have $\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|\widehat{{ \mathcal O }}({\chi }_{N})\cdots \widehat{{ \mathcal O }}({\chi }_{1})\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1})\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}.\end{array}\end{eqnarray}$ (3) The first line is the computation result according to the error model (the tomography result), and the second line is the experimental result of the actual quantum computer, which are equal. In this sense the error model is exact. The exactness of the error model does not rely on how to choose the matrix ${\widehat{M}}_{\mathrm{in}}$. If we choose a different matrix ${\widehat{M}}_{\mathrm{in}}^{\prime} $, then we can obtain another error model $\left|{\widehat{\rho }}_{\mathrm{in}}^{\prime} \right.\unicode{x027EB}=S\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}$, $\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}^{\prime} \right|=\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|{S}^{-1}$ and $\{\widehat{{ \mathcal O }}^{\prime} (\chi )=S\widehat{{ \mathcal O }}(\chi ){S}^{-1}\}$, where $S={\widehat{M}}_{\mathrm{in}}^{\prime} {\widehat{M}}_{\mathrm{in}}^{-1}$. Both error models can exactly characterize the quantum computer, because the difference between two error models is only a similarity transformation [6, 38–42, 55].
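The reconstruction and its gauge freedom can be sketched in a few lines. The example below assumes that g and the matrices $\widetilde{{ \mathcal O }}(\chi )$ have already been measured; here they are replaced by hypothetical random matrices with d = 3, so only the algebra is illustrated. Two different choices of ${\widehat{M}}_{\mathrm{in}}$ are compared with each other and with the prediction obtained directly from the data.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3

# Stand-ins for measured data (hypothetical values, not from an experiment).
g = rng.normal(size=(d, d)) + 3 * np.eye(d)
tilde = {"a": rng.normal(size=(d, d)), "b": rng.normal(size=(d, d))}

def build_model(M_in_hat):
    """Error model for a chosen invertible gauge matrix M_in_hat."""
    M_in_inv = np.linalg.inv(M_in_hat)
    M_out_hat = g @ M_in_inv
    M_out_inv = np.linalg.inv(M_out_hat)
    rho_hat = M_in_hat[:, 0]          # first column
    Q_hat = M_out_hat[0, :]           # first row
    gates_hat = {name: M_out_inv @ T @ M_in_inv for name, T in tilde.items()}
    return rho_hat, Q_hat, gates_hat

def model_prediction(model, sequence):
    """Mean of Q_out after the gate sequence (applied left to right), from the model."""
    rho_hat, Q_hat, gates_hat = model
    v = rho_hat
    for name in sequence:
        v = gates_hat[name] @ v
    return Q_hat @ v

def data_prediction(sequence):
    """Same quantity from the data: (1,1) element of tilde_N g^-1 ... g^-1 tilde_1."""
    out = tilde[sequence[0]]
    for name in sequence[1:]:
        out = tilde[name] @ np.linalg.solve(g, out)
    return out[0, 0]

model_1 = build_model(np.eye(d))
model_2 = build_model(rng.normal(size=(d, d)) + 3 * np.eye(d))

sequence = ["a", "b", "a", "a"]
print(model_prediction(model_1, sequence),
      model_prediction(model_2, sequence),
      data_prediction(sequence))
```

The three printed numbers agree up to rounding, which reflects that the two models differ only by a similarity transformation.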

If states and observables are informationally complete for SE, V is the entire Hermitian-matrix space of SE, and LOT is the same as GST applied on SE. Without temporal correlations, the states and observables are usually informationally complete for the system, i.e. V is the entire Hermitian-matrix space of the system, and LOT is the same as GST applied on the system. With temporal correlations, in general V is neither the entire space of SE nor the entire space of the system, in which case LOT is different from GST.

The error model reconstructed using LOT can only be applied to states in the subspace Vin and observables in the subspace Vout. If the state is not in Vin or the observable is not in Vout, the error model cannot predict the computation result as in equation (3). This is different from GST, in which the reconstructed model can be applied to any state and observable.

In the exact LOT protocol introduced in this section, we have assumed that the dimension of the subspace V is known. If we can collect all the data generated by the operation set O, we can find out the dimension of V by analyzing the number of linearly independent states (in the subspace Vout). In section 6, we introduce approximate LOT protocols for the practical implementation, in which we do not need to assume that the dimension of V is known.
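One way to count linearly independent vectors from data is to inspect the singular values of a Gram matrix built from more states and observables than the expected dimension. The sketch below assumes noiseless synthetic data, random real matrices in place of physical gates, and a hypothetical hidden dimension of 3 inside a six-dimensional space; with finite sampling noise the relative threshold would have to be chosen according to the statistical error.

```python
import numpy as np

rng = np.random.default_rng(2)
D, d_V = 6, 3  # full dimension and hidden dimension of the accessible subspace

def random_gate():
    """Random real matrix that keeps span(e_0, ..., e_{d_V-1}) invariant."""
    G = rng.normal(size=(D, D)) / np.sqrt(D)
    G[d_V:, :d_V] = 0.0
    return G

A, B = random_gate(), random_gate()
rho_in = np.concatenate([rng.normal(size=d_V), np.zeros(D - d_V)])
Q_out = rng.normal(size=D)

# Over-complete choice: five preparation and five measurement sequences.
sequences = [np.eye(D), A, B, A @ B, B @ A]
M_in = np.column_stack([O @ rho_in for O in sequences])
M_out = np.vstack([Q_out @ O for O in sequences])
g = M_out @ M_in  # 5 x 5 Gram matrix of measurable mean values

singular_values = np.linalg.svd(g, compute_uv=False)
estimated_dim = int(np.sum(singular_values > 1e-9 * singular_values[0]))
print(singular_values, estimated_dim)
```

The number of singular values above the threshold gives the estimated dimension; the remaining ones vanish up to rounding because the extra states add no new directions.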

4. Space dimension truncation

Usually, the environment is a high-dimensional Hilbert space. Although only the subspace V is relevant in the exact LOT, its dimension could still be too high to allow the exact LOT to be implemented. Therefore, a practical LOT protocol is approximate and requires that a low-dimensional subspace approximately characterize the quantum computer. Here, we give a sufficient condition for the existence of such a subspace.

We consider the case that d < dV, i.e. states $\{\left|{\rho }_{i}\right.\unicode{x027EB}\,| \,i\,=1,\ldots ,d\}$ and observables $\{\left.\unicode{x027EA}{Q}_{k}\right|\,| \,k=1,\ldots ,d\}$ are not sufficient for implementing the exact LOT. We define a quantum computer to be approximately characterized by ΠinMin, MoutΠin and $\{{{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}\}$ if the subspace spanned by $\{\left|{\rho }_{i}\right.\unicode{x027EB}\}$ is approximately invariant under operations $\{{ \mathcal O }(\chi )\}$. Here, Πin and Πout are orthogonal projections on subspaces $\mathrm{span}(\{\left|{\rho }_{i}\right.\unicode{x027EB}\})$ and $\mathrm{span}(\{\left.\unicode{x027EA}{Q}_{k}\right|\})$, respectively.

If $\parallel \left|{\rho }_{i}\right.\unicode{x027EB}\parallel \leqslant {N}_{\rho }$, $\parallel \left.\unicode{x027EA}{Q}_{k}\right|\parallel \leqslant {N}_{Q}$, $\parallel { \mathcal O }(\chi )\parallel \leqslant {N}_{{ \mathcal O }}$, and $\parallel {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}-{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}\parallel \leqslant \epsilon $, we have $\begin{eqnarray}\begin{array}{rcl} & & \parallel {M}_{\mathrm{out}}{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1}){M}_{\mathrm{in}}\\ & & {-{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }({\chi }_{N}){{\rm{\Pi }}}_{\mathrm{in}}\cdots {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }({\chi }_{1}){{\rm{\Pi }}}_{\mathrm{in}}{M}_{\mathrm{in}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}{N}_{\rho }\left[{\left({N}_{{ \mathcal O }}+\epsilon \right)}^{N}-{N}_{{ \mathcal O }}^{N}\right]\sim {N}_{Q}{N}_{\rho }{N}_{{ \mathcal O }}^{N-1}\times N\epsilon ,\end{array}\end{eqnarray}$ (4) for an arbitrary sequence of elementary operations. See appendix B for the proof. Here, $\parallel \cdot \parallel $ denotes a vector norm satisfying $\left|\unicode{x027EA}A| B\unicode{x027EB}\right|\leqslant \parallel \left.\unicode{x027EA}A\right|\parallel \parallel \left|B\right.\unicode{x027EB}\parallel $ and the submultiplicative matrix norm induced by the vector norm. We can take $\parallel \left|\cdot \right.\unicode{x027EB}\parallel ={\parallel \cdot \parallel }_{1}$, where ${\parallel \cdot \parallel }_{1}$ is the trace norm. Then, $\parallel { \mathcal O }\parallel =1$ if ${ \mathcal O }$ is trace-preserving. ${\parallel \cdot \parallel }_{\max }$ denotes the max norm. We always have ${N}_{{ \mathcal O }}=1$ by taking the trace norm. A small ε means that the subspace is approximately invariant under elementary operations, in which case an approximate tomography is possible.
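The scaling on the right-hand side of the bound follows from a binomial expansion. As a short worked step (at fixed N and to first order in ε, with the trace-norm value ${N}_{{ \mathcal O }}=1$ inserted in the second line):

```latex
\begin{align}
\left(N_{\mathcal O}+\epsilon\right)^{N}-N_{\mathcal O}^{N}
  &= \sum_{k=1}^{N}\binom{N}{k}\,N_{\mathcal O}^{\,N-k}\,\epsilon^{k}
   = N\,N_{\mathcal O}^{\,N-1}\,\epsilon+O(\epsilon^{2}), \\
\left(1+\epsilon\right)^{N}-1
  &\approx N\epsilon \qquad \text{for } N\epsilon\ll 1 .
\end{align}
```

In this regime the truncation error grows at most linearly with the sequence length N, so the approximation is useful for circuits with N much smaller than 1/ε.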

There are various ways to determine the truncated dimension d < dV. For low-frequency noise and context-dependent noise, the dimension can be determined from prior knowledge about the noise, as we show in section 5. We can validate the truncation by computing the spectrum of the Gram matrix ${g}^{{\rm{t}}}$, see section 6.1. Similarly, for operations with temporal correlations, the number of eigenvalues in the spectrum of an operation is more than ${d}_{{\rm{S}}}^{2}$, where dS is the dimension of the Hilbert space of the system. The spectrum of an operation can be measured using spectral quantum tomography [56], which can also be used to determine the truncated dimension. Finally, one can also validate the truncation by comparing the reconstructed error model to the experimental data: the truncated dimension d needs to be increased when the model is not consistent with the data.

5. Approximate models of temporally correlated errors

The practical use of LOT requires that a low-dimensional approximate model exists. In this section, we consider two typical sources of temporal correlations, i.e. low-frequency noise and context-dependent noise. For both of them, low-dimensional approximate models exist.

5.1. Low-frequency noise and classical random variables

A typical source of temporally correlated errors in laboratory systems is the stochastic variation of classical parameters, as discussed in section 2. For instance, in the trapped-ion system, drifts of laser parameters cause time-dependent coherent errors [47]; in the superconducting qubit system, fluctuations in the quasiparticle population lead to temporal variations in the qubit decay rate [57]. If the correlation time of the noise is negligible compared with the time of a quantum gate, temporal correlations in gate errors are insignificant. However, if the correlation time is comparable to or even longer than the gate time, errors are correlated, i.e. a sequence of quantum gates cannot be factorized because of low-frequency noise. We show that errors with this kind of correlation can be efficiently approximated using a low-dimensional model if moments of the parameter distribution converge rapidly. We remark that LOT is not limited to classical correlations and can be applied to general cases as long as the dimension truncation is valid.

As in section 2, we can use$\begin{eqnarray}\rho =\int {\rm{d}}\lambda p(\lambda ){\rho }_{{\rm{S}}}(\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}},\end{eqnarray}$to describe a state that depends on random variables λ. Here, λ is an array with n elements that respectively denote n variables, and ρS(λ) is the state of the system when variables are λ. An operation depending on random variables reads$\begin{eqnarray}{ \mathcal O }(\chi )={ \mathcal T }(\chi )\int {\rm{d}}\lambda {{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )\otimes [\left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}],\end{eqnarray}$where ${{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )$ is the operation on the system when variables are λ. Compared with the expression in section 2, there is an additional operation ${ \mathcal T }(\chi )$ in ${ \mathcal O }(\chi )$, which describes the stochastic evolution of variables during the time of the operation. Here ${ \mathcal T }(\chi )=[{{\mathbb{1}}}_{{\rm{S}}}]\otimes {{ \mathcal T }}_{{\rm{E}}}(\chi )$, ${{ \mathcal T }}_{{\rm{E}}}(\chi )=\int {\rm{d}}\lambda ^{\prime} {\rm{d}}\lambda {t}_{\lambda ^{\prime} ,\lambda }(\chi )[\left|\lambda ^{\prime} \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}]$, ${t}_{\lambda ^{\prime} ,\lambda }(\chi )$ is the transition probability density from λ to $\lambda ^{\prime} $, and ${\mathbb{1}}$ is the identity operator. Similarly, an observable depending on random variables reads$\begin{eqnarray}Q=\int {\rm{d}}\lambda {Q}_{{\rm{S}}}(\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}},\end{eqnarray}$where QS(λ) is the observable of the system when variables are λ.

The approximate model is given by$\begin{eqnarray}{\rho }^{{\rm{a}}}=\sum _{\lambda \in L}{p}^{{\rm{a}}}(\lambda ){\rho }_{{\rm{S}}}(\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}},\end{eqnarray}$$\begin{eqnarray}{{ \mathcal O }}^{{\rm{a}}}(\chi )={{ \mathcal T }}^{{\rm{a}}}(\chi )\sum _{\lambda \in L}{{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )\otimes [\left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}],\end{eqnarray}$$\begin{eqnarray}{Q}^{{\rm{a}}}=\sum _{\lambda \in L}{Q}_{{\rm{S}}}(\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}.\end{eqnarray}$Here, L is a finite subset of random variables. If λ takes m values in L, i.e. $\left|L\right|=m$, the environment in the approximate model is m-dimensional. The transition operation in the approximate model is ${{ \mathcal T }}^{{\rm{a}}}(\chi )=[{{\mathbb{1}}}_{{\rm{S}}}]\otimes {{ \mathcal T }}_{{\rm{E}}}^{{\rm{a}}}(\chi )$, where ${{ \mathcal T }}_{{\rm{E}}}^{{\rm{a}}}(\chi )={\sum }_{\lambda ^{\prime} ,\lambda \in L}{t}_{\lambda ^{\prime} ,\lambda }^{{\rm{a}}}(\chi )[\left|\lambda ^{\prime} \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}]$. We remark that ρS(λ), ${{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )$ and QS(λ) are the same as in equations (5)-(7). By properly choosing the subset of random variables L, the distribution pa(λ) and transition matrices ${t}_{\lambda ^{\prime} ,\lambda }^{{\rm{a}}}(\chi )$, such a model can approximately characterize the quantum computer as we will show next.

We focus on the case of only one random variable, and the generalization to the case of multiple variables is straightforward. In a quantum computation platform, the effect of the noise on quantum operations should be weak, i.e. error rates are low. In this case, only low-order moments are important. Using the Taylor expansion, we have$\begin{eqnarray}{\rho }_{{\rm{S}}}(\lambda )=\sum _{l=0}^{\infty }{\rho }_{{\rm{S}}}^{(l)}{\lambda }^{l},\end{eqnarray}$$\begin{eqnarray}{{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )=\sum _{l=0}^{\infty }{{ \mathcal O }}_{{\rm{S}}}^{(l)}(\chi ){\lambda }^{l},\end{eqnarray}$$\begin{eqnarray}{Q}_{{\rm{S}}}(\lambda )=\sum _{l=0}^{\infty }{Q}_{{\rm{S}}}^{(l)}{\lambda }^{l}.\end{eqnarray}$Then, the quantum computation of a mean value, i.e. the mean of an observable Q in the state ρ after a sequence of operations, can be expressed as$\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}Q\right|{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1})\left|\rho \right.\unicode{x027EB}\\ & = & \sum _{{\bf{l}}}\left.\unicode{x027EA}{Q}_{{\rm{S}}}^{({l}_{N+1})}\right|{{ \mathcal O }}_{{\rm{S}}}^{({l}_{N})}({\chi }_{N})\cdots {{ \mathcal O }}_{{\rm{S}}}^{({l}_{1})}({\chi }_{1})\left|{\rho }_{{\rm{S}}}^{({l}_{0})}\right.\unicode{x027EB}\\ & & \times \overline{{\lambda }_{N+1}^{{l}_{N+1}}{\lambda }_{N}^{{l}_{N}}\cdots {\lambda }_{1}^{{l}_{1}}{\lambda }_{0}^{{l}_{0}}},\end{array}\end{eqnarray}$where l = (l0, l1,…,lN, lN+1), λj is the value of the variable at the time of the jth operation, and the overline denotes the average. Therefore, the behavior of the quantum computer is determined by correlations of random variables. If these correlations can be approximately reconstructed in the approximate model, the model approximately characterizes the quantum computer.

Correlations are formally defined as follows. We introduce the operator $\hat{\lambda }=\int {\rm{d}}\lambda \,\lambda \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$. Then,$\begin{eqnarray}\begin{array}{rcl} & & \overline{{\lambda }_{N+1}^{{l}_{N+1}}{\lambda }_{N}^{{l}_{N}}\cdots {\lambda }_{1}^{{l}_{1}}{\lambda }_{0}^{{l}_{0}}}\\ & = & \mathrm{Tr}\left[{\hat{\lambda }}^{{l}_{N+1}}{{ \mathcal T }}_{{\rm{E}}}({\chi }_{N}){\hat{\lambda }}^{{l}_{N}}\cdots {{ \mathcal T }}_{{\rm{E}}}({\chi }_{1}){\hat{\lambda }}^{{l}_{1}}{\hat{\lambda }}^{{l}_{0}}{\rho }_{{\rm{E}}}\right],\end{array}\end{eqnarray}$where ${\rho }_{{\rm{E}}}=\int {\rm{d}}\lambda p(\lambda )\left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$.

5.1.1. Second-order approximation

First, we consider the case that the contribution of high-order correlations other than $\overline{{\lambda }_{j}}$ and $\overline{{\lambda }_{j^{\prime} }{\lambda }_{j}}$ is negligible, the distribution of λ is stationary, and the correlation time is much longer than the time scale of a quantum circuit. In this case, only $\overline{{\lambda }_{j}}$ and $\overline{{\lambda }_{j^{\prime} }{\lambda }_{j}}$ are important. Without loss of generality, we assume that the distribution is centered at λ = 0, i.e. $\overline{{\lambda }_{j}}=0$. Because of the long correlation time, the change of the random variable is slow, and $\overline{{\lambda }_{j^{\prime} }{\lambda }_{j}}\simeq {\sigma }^{2}$ is approximately a constant. Such correlations can be reconstructed in the approximate model with m = 2. We can take parameters in the approximate model as $L=\{\lambda =\pm \sigma \}$, ${p}^{{\rm{a}}}(\pm \sigma )=1/2$ and ${t}_{\lambda ^{\prime} ,\lambda }^{{\rm{a}}}(\chi )={\delta }_{\lambda ^{\prime} ,\lambda }$.

Next, we consider the case that the correlation is not a constant but decreases with time. We assume that for each operation χ the correlation is reduced by a factor of ${{\rm{e}}}^{-{\rm{\Gamma }}(\chi )}$. If the correlation decreases exponentially with time, Γ(χ) is proportional to the operation time. The correlation reads $\overline{{\lambda }_{j^{\prime} }{\lambda }_{j}}={\sigma }^{2}\exp [-{\sum }_{i=\max \{1,j\}}^{j^{\prime} -1}{\rm{\Gamma }}({\chi }_{i})]$. This two-time correlation can also be reconstructed in the approximate model with m = 2. The only difference is the transition matrix. We take the transition matrix as ${{ \mathcal T }}_{{\rm{E}}}^{{\rm{a}}}(\chi )={{ \mathcal T }}_{{\rm{E}}}^{\prime }({\rm{\Gamma }}(\chi ))$, and$\begin{eqnarray}\begin{array}{rcl}{{ \mathcal T }}_{{\rm{E}}}^{\prime }({\rm{\Gamma }}) & \equiv & \displaystyle \frac{1+{{\rm{e}}}^{-{\rm{\Gamma }}}}{2}([\left|+\right\rangle {\left\langle +\right|}_{{\rm{E}}}^{{\rm{a}}}]+[\left|-\right\rangle {\left\langle -\right|}_{{\rm{E}}}^{{\rm{a}}}])\\ & & +\displaystyle \frac{1-{{\rm{e}}}^{-{\rm{\Gamma }}}}{2}([\left|-\right\rangle {\left\langle +\right|}_{{\rm{E}}}^{{\rm{a}}}]+[\left|+\right\rangle {\left\langle -\right|}_{{\rm{E}}}^{{\rm{a}}}]),\end{array}\end{eqnarray}$where ${\left|\pm \right\rangle }_{{\rm{E}}}^{{\rm{a}}}\equiv {\left|\lambda =\pm \sigma \right\rangle }_{{\rm{E}}}^{{\rm{a}}}$. Then we have ${{ \mathcal T }}_{{\rm{E}}}^{\prime }({\rm{\Gamma }}^{\prime} ){{ \mathcal T }}_{{\rm{E}}}^{\prime }({\rm{\Gamma }})={{ \mathcal T }}_{{\rm{E}}}^{\prime }({\rm{\Gamma }}^{\prime} +{\rm{\Gamma }})$.
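The two-value environment described above can be checked with a few lines of arithmetic. The sketch below represents the transition operation as a 2x2 stochastic matrix acting on the probabilities of λ = +σ and λ = -σ, and computes the two-time correlation for a uniform initial distribution. The function names and the representation as nested lists are assumptions made for illustration.

```python
import math


def transition(gamma):
    """2x2 transition matrix on L = {+sigma, -sigma}; entry [new][old]."""
    stay = (1.0 + math.exp(-gamma)) / 2.0
    flip = (1.0 - math.exp(-gamma)) / 2.0
    return [[stay, flip], [flip, stay]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def two_time_correlation(sigma, gammas):
    """Average of lambda_j' * lambda_j with operations of rates `gammas` in between."""
    values = [sigma, -sigma]
    t = [[1.0, 0.0], [0.0, 1.0]]
    for gamma in gammas:
        t = matmul(transition(gamma), t)
    # p^a(+sigma) = p^a(-sigma) = 1/2; sum over initial and final values.
    return sum(values[new] * t[new][old] * 0.5 * values[old]
               for new in range(2) for old in range(2))


print(two_time_correlation(0.3, [0.1, 0.25]))  # compare with 0.09 * exp(-0.35)
```

With an empty list of rates the function returns σ², which is the constant-correlation case of the model with identity transition matrices.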

5.1.2. High-order approximations and multiple variables

We consider the case that the change of the random variable is negligible in the time scale of a quantum circuit, i.e. ${{ \mathcal T }}_{{\rm{E}}}(\chi )\simeq [{{\mathbb{1}}}_{{\rm{E}}}]$. Then, correlations become $\overline{{\lambda }_{N+1}^{{l}_{N+1}}{\lambda }_{N}^{{l}_{N}}\cdots {\lambda }_{1}^{{l}_{1}}{\lambda }_{0}^{{l}_{0}}}\,\simeq \int {\rm{d}}\lambda \,p(\lambda ){\lambda }^{l}$, where $l={\sum }_{j=0}^{N+1}{l}_{j}$. If the contribution of correlations with $l > {l}_{{\rm{t}}}$ is negligible, we only need to reconstruct correlations with $l\leqslant {l}_{{\rm{t}}}$ in the approximate model, which is always possible by taking $m=\lceil ({l}_{{\rm{t}}}+1)/2\rceil $ [58]. We remark that m is the dimension of the environment in the approximate model.
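As an illustration of the moment-matching statement above, the following sketch assumes, purely for concreteness, that p(λ) is a zero-mean Gaussian with standard deviation σ. In that case a Gauss-Hermite rule with m nodes supplies a subset L and weights that reproduce all moments up to order 2m - 1. The hard-coded nodes and weights for m = 1, 2, 3 and all function names are choices made for this example, not part of the protocol.

```python
import math

# Probabilists' Gauss-Hermite rules for a standard normal variable (nodes, weights).
GAUSS_HERMITE = {
    1: ([0.0], [1.0]),
    2: ([-1.0, 1.0], [0.5, 0.5]),
    3: ([-math.sqrt(3.0), 0.0, math.sqrt(3.0)], [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]),
}


def gaussian_moment(order, sigma):
    """Exact moment of lambda^order for a zero-mean Gaussian: (order-1)!! * sigma^order."""
    if order % 2 == 1:
        return 0.0
    double_factorial = 1
    for k in range(order - 1, 0, -2):
        double_factorial *= k
    return double_factorial * sigma ** order


def discrete_moment(order, sigma, m):
    """Moment of the m-point approximate distribution p^a on the subset L."""
    nodes, weights = GAUSS_HERMITE[m]
    return sum(w * (sigma * x) ** order for x, w in zip(nodes, weights))


def nodes_needed(l_t):
    """m = ceil((l_t + 1) / 2), written with integer arithmetic."""
    return (l_t + 2) // 2


for l_t in range(6):
    m = nodes_needed(l_t)
    errors = [abs(discrete_moment(l, 0.2, m) - gaussian_moment(l, 0.2))
              for l in range(l_t + 1)]
    print(l_t, m, max(errors))
```

The two-point rule (m = 2) is the L = {±σ} model of section 5.1.1; it matches moments up to third order but misses the fourth moment, which is where the three-point rule becomes necessary.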

It is similar for multiple random variables. If moments of the distribution converge rapidly, the Gaussian cubature approximation can be applied [59]. Then, up to ${l}_{{\rm{t}}}^{\mathrm{th}}$-order moments can be reconstructed with $m=\displaystyle \left(\genfrac{}{}{0em}{}{{n}_{\lambda }+{l}_{{\rm{t}}}}{{l}_{{\rm{t}}}}\right)$, where nλ is the number of random variables.
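The environment dimension quoted above equals the number of monomials of total degree at most l_t in n_λ variables, so it can be tabulated directly. This is a small sketch; the function name is hypothetical.

```python
import math


def cubature_points(n_lambda, l_t):
    """m = binomial(n_lambda + l_t, l_t) for n_lambda variables and moments up to order l_t."""
    return math.comb(n_lambda + l_t, l_t)


for n_lambda in (1, 2, 3):
    print(n_lambda, [cubature_points(n_lambda, l_t) for l_t in (1, 2, 3, 4)])
```

The growth is polynomial in l_t for a fixed number of variables, which is why the approximation stays low-dimensional when only a few noise parameters and low-order moments matter.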

5.2. Classical context-dependent noise

Context dependence is the effect that the error in an operation depends on previous operations, i.e. the environment has a memory of previous operations. Here, we consider the case that the environment has a record of the classical information about previous operations. Because this kind of effect is within the scope of our general model of the quantum computer in section 2, our results can be applied to context-dependent noise. A list of previous operations is classical information, so the memory of previous operations can be treated as a set of classical variables whose evolution is operation-dependent. Therefore, we can use the same formalism as for classical random variables to characterize context-dependent noise. We consider two examples as follows.

In the ion trap, the temperature of ions may depend on how many gates have been performed after the last cooling operation, and the fidelity of a gate depends on the temperature. This effect can be characterized using a set of discretized variables λ = (n1, n2, …). Here, ni denotes the phonon number of the ith mode. Because of the low temperature of ions, each ni can be truncated at a small number. If the evolution of the qubit state mainly depends on the distribution at the beginning of the gate, the gate can be expressed in the same form as equation (9), where ${ \mathcal T }(\chi )$ describes the heating process in the gate χ. The cooling operation can also be expressed in this form. Then, we can apply an approximation similar to that for classical random variables.

If the error in an operation only significantly depends on a few previous operations, we can use a low-dimensional environment to characterize the effect. We focus on the case that the error only depends on the last operation; the generalization to the case of depending on multiple previous operations is straightforward. We characterize this effect using one discretized variable λ ∈ {χ}, where {χ} is the list of all possible operations. The state after the operation λ can be expressed in the form $\rho ={\rho }_{{\rm{S}}}\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$. An operation can be expressed in the form ${ \mathcal O }(\chi )={\sum }_{\lambda }{{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )\otimes [\left|\chi \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}]$, where ${{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda )$ is the operation on the system when the last operation is λ. After the operation, the state becomes ${ \mathcal O }(\chi )\rho =[{{ \mathcal O }}_{{\rm{S}}}(\chi ,\lambda ){\rho }_{{\rm{S}}}]\otimes \left|\chi \right\rangle {\left\langle \chi \right|}_{{\rm{E}}}$. We remark that it is not necessary that ${\left|\lambda \right\rangle }_{{\rm{E}}}$ corresponds to a pure state in a physical Hilbert space.
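The last-operation memory model can be sketched with a toy calculation. The example below assumes that every gate error is a depolarizing channel whose rate depends on the gate χ and on the previous operation λ; because a depolarizing channel commutes with unitary gates, it is enough to track the length of the Bloch vector. The gate labels, the "init" context and all rates are hypothetical.

```python
def run_sequence(sequence, rates, first_context="init"):
    """Propagate (polarization, memory) through a gate sequence.

    rates[(chi, lam)] is a hypothetical depolarizing rate of gate chi when the
    previous operation was lam; the memory is overwritten with chi afterwards.
    """
    polarization, memory = 1.0, first_context
    for chi in sequence:
        polarization *= 1.0 - rates[(chi, memory)]
        memory = chi  # the environment now records chi as the last operation
    return polarization, memory


# Hypothetical rates: repeating the same gate twice in a row is noisier.
RATES = {
    ("H", "init"): 0.01, ("S", "init"): 0.01,
    ("H", "H"): 0.03, ("H", "S"): 0.01,
    ("S", "H"): 0.01, ("S", "S"): 0.02,
}

p_grouped, _ = run_sequence("HHSS", RATES)
p_alternating, last = run_sequence("HSHS", RATES)
print(p_grouped, p_alternating, last)
```

Both sequences contain two H and two S gates, yet they end with different polarizations, which is exactly the behavior a memoryless (factorized) error model cannot reproduce.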

6. Approximate quantum tomography

The exact tomography protocol is not practical because of the high-dimensional state space of the environment. In section 5, we show that an effective model with a low-dimensional environment state space can approximately characterize the quantum computer for typical temporally correlated noises. In this section, we discuss how to implement LOT to obtain a low-dimensional approximate model of the quantum computer. There are two approaches to self-consistent tomography, the linear inversion method (LIM) and the maximum likelihood estimation (MLE) [6, 38-42], and we will discuss both of them.

6.1. Linear inversion method

Given sufficient data from the experiment, we can use LIM to obtain an exact model of the quantum computer as discussed in section 3. However, to obtain an approximate model, even if the approximate model exists, LIM does not always work. We suppose that d × d matrices ${M}_{\mathrm{in}}^{{\rm{a}}}$, ${M}_{\mathrm{out}}^{{\rm{a}}}$ and $\{{{ \mathcal O }}^{{\rm{a}}}(\chi )\}$ satisfy$\begin{eqnarray}\begin{array}{rcl} & & \parallel {M}_{\mathrm{out}}{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1}){M}_{\mathrm{in}}\\ & & {-{M}_{\mathrm{out}}^{{\rm{a}}}{{ \mathcal O }}^{{\rm{a}}}({\chi }_{N})\cdots {{ \mathcal O }}^{{\rm{a}}}({\chi }_{1}){M}_{\mathrm{in}}^{{\rm{a}}}\parallel }_{\max }\leqslant {\epsilon }_{N},\end{array}\end{eqnarray}$for any sequence of elementary operations, where εN is a small quantity depending on N, and d < dV. Then, these matrices form a model that approximately characterizes the quantum computer. If the approximate model exists, we only need to obtain d × d matrices ${g}^{{\rm{a}}}={M}_{\mathrm{out}}^{{\rm{a}}}{M}_{\mathrm{in}}^{{\rm{a}}}$ and $\{{\widetilde{{ \mathcal O }}}^{{\rm{a}}}(\chi )={M}_{\mathrm{out}}^{{\rm{a}}}{{ \mathcal O }}^{{\rm{a}}}(\chi ){M}_{\mathrm{in}}^{{\rm{a}}}\}$ in order to approximately characterize the quantum computer. Because ${\parallel g-{g}^{{\rm{a}}}\parallel }_{\max }\leqslant {\epsilon }_{0}$ and ${\parallel \widetilde{{ \mathcal O }}(\chi )-{\widetilde{{ \mathcal O }}}^{{\rm{a}}}(\chi )\parallel }_{\max }\leqslant {\epsilon }_{1}$, we can directly use g and $\{\widetilde{{ \mathcal O }}(\chi )\}$, which can be obtained in the experiment, as estimates of ${g}^{{\rm{a}}}$ and $\{{\widetilde{{ \mathcal O }}}^{{\rm{a}}}(\chi )\}$. However, ${g}^{-1}$ may be very different from ${\left({g}^{{\rm{a}}}\right)}^{-1}$, and in this case equation (2) may not even approximately hold. We remark that equation (2) always exactly holds if d = dV. If equation (2) does not hold, LIM does not work.

LIM works for the approximate model if the following conditions are satisfied. $\{\left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\}$ are columns of ${M}_{\mathrm{in}}^{{\rm{a}}}$, and $\{\left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|\}$ are rows of ${M}_{\mathrm{out}}^{{\rm{a}}}$. Then, if $\parallel \left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \leqslant {N}_{\rho }^{{\rm{a}}}$, $\parallel \left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|\parallel \leqslant {N}_{Q}^{{\rm{a}}}$, $\parallel {{ \mathcal O }}^{{\rm{a}}}(\chi )\parallel \leqslant {N}_{{ \mathcal O }}^{{\rm{a}}}$, $\parallel {M}_{\mathrm{in}}^{{\rm{a}}}{g}^{-1}{M}_{\mathrm{out}}^{{\rm{a}}}-{\mathbb{1}}\parallel \leqslant {\epsilon }_{g}$ and $\parallel {\left({M}_{\mathrm{out}}^{{\rm{a}}}\right)}^{-1}\widetilde{{ \mathcal O }}(\chi ){\left({M}_{\mathrm{in}}^{{\rm{a}}}\right)}^{-1}-{{ \mathcal O }}^{{\rm{a}}}(\chi )\parallel \leqslant {\epsilon }_{{ \mathcal O }}$, we have$\begin{eqnarray}\begin{array}{rcl} & & \parallel \widetilde{{ \mathcal O }}({\chi }_{N}){g}^{-1}\cdots \widetilde{{ \mathcal O }}({\chi }_{2}){g}^{-1}\widetilde{{ \mathcal O }}({\chi }_{1})\\ & & {-{M}_{\mathrm{out}}^{{\rm{a}}}{{ \mathcal O }}^{{\rm{a}}}({\chi }_{N})\cdots {{ \mathcal O }}^{{\rm{a}}}({\chi }_{2}){{ \mathcal O }}^{{\rm{a}}}({\chi }_{1}){M}_{\mathrm{in}}^{{\rm{a}}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}^{{\rm{a}}}{N}_{\rho }^{{\rm{a}}}\left[{\left(1+{\epsilon }_{g}\right)}^{N-1}{\left({N}_{{ \mathcal O }}+{\epsilon }_{{ \mathcal O }}\right)}^{N}-{N}_{{ \mathcal O }}^{N}\right]\\ & \sim & {N}_{Q}^{{\rm{a}}}{N}_{\rho }^{{\rm{a}}}\times {N}_{{ \mathcal O }}^{N-1}\left[(N-1){N}_{{ \mathcal O }}{\epsilon }_{g}+N{\epsilon }_{{ \mathcal O }}\right],\end{array}\end{eqnarray}$for any sequence of elementary operations. See appendix C for the proof. Therefore, LIM works under conditions that ${N}_{{ \mathcal O }}^{{\rm{a}}}\lesssim 1$, and εg and ${\epsilon }_{{ \mathcal O }}$ are small quantities.

We apply this result to the approximate model given by the approximately invariant subspace Πin. We have$\begin{eqnarray}\begin{array}{rcl} & & \parallel \widetilde{{ \mathcal O }}({\chi }_{N}){g}^{-1}\cdots \widetilde{{ \mathcal O }}({\chi }_{2}){g}^{-1}\widetilde{{ \mathcal O }}({\chi }_{1})\\ & & {-{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }({\chi }_{N}){{\rm{\Pi }}}_{\mathrm{in}}\cdots {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }({\chi }_{1}){{\rm{\Pi }}}_{\mathrm{in}}{M}_{\mathrm{in}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}(1+{\epsilon }_{{\rm{\Pi }}}){N}_{\rho }\left[{\left({N}_{{ \mathcal O }}+\displaystyle \frac{\epsilon }{1-{\epsilon }_{{\rm{\Pi }}}}\right)}^{N}-{\left({N}_{{ \mathcal O }}+\epsilon \right)}^{N}\right]\\ & \sim & {N}_{Q}{N}_{\rho }{N}_{{ \mathcal O }}^{N-1}\times \displaystyle \frac{(1+{\epsilon }_{{\rm{\Pi }}}){\epsilon }_{{\rm{\Pi }}}}{1-{\epsilon }_{{\rm{\Pi }}}}N\epsilon ,\end{array}\end{eqnarray}$for any sequence of elementary operations. See appendix C for the proof. Here, ${\epsilon }_{{\rm{\Pi }}}=\parallel {{\rm{\Pi }}}_{\mathrm{in}}-{{\rm{\Pi }}}_{\mathrm{out}}\parallel $, and we have ${N}_{{ \mathcal O }}=1$ by taking the trace norm. Therefore, if $\parallel {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}-{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}\parallel \ll 1$ and $\parallel {{\rm{\Pi }}}_{\mathrm{in}}-{{\rm{\Pi }}}_{\mathrm{out}}\parallel \ll 1$ for the trace norm, LIM can be applied. Here, $\parallel {{\rm{\Pi }}}_{\mathrm{in}}-{{\rm{\Pi }}}_{\mathrm{out}}\parallel \ll 1$ means that two subspaces Πin and Πout are approximately the same.

In order to implement LIM, we need to find states $\{\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and observables $\{\left.\unicode{x027EA}{Q}_{k}\right|\}$ corresponding to an approximately invariant subspace Πin. For this purpose, we choose a set of trial states $\{\left|{\rho }_{i}^{{\rm{t}}}\right.\unicode{x027EB}\}$ and a set of trial observables $\{\left.\unicode{x027EA}{Q}_{k}^{{\rm{t}}}\right|\}$. The most interesting approximate invariant subspace is the subspace containing the initial state $\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$. If such an approximate invariant subspace exists, states in the form ${ \mathcal O }({\chi }_{N})\,\cdots { \mathcal O }({\chi }_{1})\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ are all approximately within the subspace, as long as N is sufficiently small. Therefore, we can choose the initial state $\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ and states in the form ${ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1})\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ as trial states. Similarly, we can choose the observable $\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|$ and effective observables in the form $\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }({\chi }_{N})\cdots { \mathcal O }({\chi }_{1})$ as trial observables.

In the ideal case, Πin is an exactly invariant subspace, and trial states and observables are exactly within Πin, i.e. ${{\rm{\Pi }}}_{\mathrm{in}}\left|{\rho }_{i}^{{\rm{t}}}\right.\unicode{x027EB}=\left|{\rho }_{i}^{{\rm{t}}}\right.\unicode{x027EB}$ and $\left.\unicode{x027EA}{Q}_{k}^{{\rm{t}}}\right|{{\rm{\Pi }}}_{\mathrm{in}}=\left.\unicode{x027EA}{Q}_{k}^{{\rm{t}}}\right|$. Then, the rank of ${g}^{{\rm{t}}}={M}_{\mathrm{out}}^{{\rm{t}}}{M}_{\mathrm{in}}^{{\rm{t}}}$ is not greater than the dimension of the subspace Πin. Here, ${M}_{\mathrm{in}}^{{\rm{t}}}$ and ${M}_{\mathrm{out}}^{{\rm{t}}}$ are matrices corresponding to trial states and observables, respectively. If the subspace Πin is only approximately invariant, the matrix ${g}^{{\rm{t}}}$ should still be close to a matrix with a rank not greater than the subspace dimension. Therefore, we can determine Min and Mout by performing a truncation on the spectrum of singular values of ${g}^{{\rm{t}}}$, i.e. we choose Min and Mout corresponding to the greatest d singular values of ${g}^{{\rm{t}}}$. Suppose the singular value decomposition is $U{g}^{{\rm{t}}}V={\rm{\Lambda }}$, where ${\rm{\Lambda }}=\mathrm{diag}({s}_{1},{s}_{2},\ldots ,{s}_{{d}^{{\rm{t}}}})$ and ${s}_{1}\geqslant {s}_{2}\geqslant \cdots \geqslant {s}_{{d}^{{\rm{t}}}}$, then we use states $\{\left|{\rho }_{i}\right.\unicode{x027EB}={\sum }_{j}\left|{\rho }_{j}^{{\rm{t}}}\right.\unicode{x027EB}{V}_{j,i}\,| \,i=1,\ldots ,d\}$ and observables $\{\left.\unicode{x027EA}{Q}_{k}\right|={\sum }_{j}{U}_{k,j}\left.\unicode{x027EA}{Q}_{j}^{{\rm{t}}}\right|\,| \,k=1,\ldots ,d\}$ to implement LOT.

The approximate LOT protocol using LIM is as follows:
1. Choose a set of states $\{\left|{\rho }_{i}^{{\rm{t}}}\right.\unicode{x027EB}={{ \mathcal O }}_{i}\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\,| \,{{ \mathcal O }}_{i}\in O;i\,=1,\cdots ,{d}^{{\rm{t}}}\}$ and a set of observables $\{\left.\unicode{x027EA}{Q}_{k}^{{\rm{t}}}\right|=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{{ \mathcal O }}_{k}^{\prime} \,| \,{{ \mathcal O }}_{k}^{\prime} \,\in O;k=1,\cdots ,{d}^{{\rm{t}}}\}$. We always take $\left|{\rho }_{1}^{{\rm{t}}}\right.\unicode{x027EB}=\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ and $\left.\unicode{x027EA}{Q}_{1}^{{\rm{t}}}\right|=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|$. Here, O is the set of operation sequences.
2. Obtain matrices ${g}^{{\rm{t}}}={M}_{\mathrm{out}}^{{\rm{t}}}{M}_{\mathrm{in}}^{{\rm{t}}}$ and ${\widetilde{{ \mathcal O }}}^{{\rm{t}}}(\chi )={M}_{\mathrm{out}}^{{\rm{t}}}{ \mathcal O }(\chi ){M}_{\mathrm{in}}^{{\rm{t}}}$ for each χ in the experiment. Here, ${M}_{\mathrm{in}}^{{\rm{t}}}=[\ \left|{\rho }_{1}^{{\rm{t}}}\right.\unicode{x027EB}\ \cdots \ \left|{\rho }_{{d}^{{\rm{t}}}}^{{\rm{t}}}\right.\unicode{x027EB}\ ]$ and ${M}_{\mathrm{out}}^{{\rm{t}}}={[{\left.\unicode{x027EA}{Q}_{1}^{{\rm{t}}}\right|}^{{\rm{T}}}\cdots {\left.\unicode{x027EA}{Q}_{{d}^{{\rm{t}}}}^{{\rm{t}}}\right|}^{{\rm{T}}}]}^{{\rm{T}}}$.
3. Compute the singular value decomposition $U{g}^{{\rm{t}}}V={\rm{\Lambda }}$, where ${\rm{\Lambda }}=\mathrm{diag}({s}_{1},{s}_{2},\ldots ,{s}_{{d}^{{\rm{t}}}})$, and singular values are sorted in descending order ${s}_{1}\geqslant {s}_{2}\geqslant \cdots \geqslant {s}_{{d}^{{\rm{t}}}}$.
4. Choose the dimension d. Compute $g=\mathrm{diag}({s}_{1},{s}_{2},\ldots ,{s}_{d})=D{\rm{\Lambda }}{D}^{\dagger }$ and $\widetilde{{ \mathcal O }}(\chi )=DU{\widetilde{{ \mathcal O }}}^{{\rm{t}}}(\chi )V{D}^{\dagger }$ for each χ. Here, D is a $d\times {d}^{{\rm{t}}}$ matrix, and ${D}_{i,i^{\prime} }={\delta }_{i,i^{\prime} }$.
5. Choose a d-dimensional invertible real matrix ${\widehat{M}}_{\mathrm{in}}$, and compute ${\widehat{M}}_{\mathrm{out}}=g{\widehat{M}}_{\mathrm{in}}^{-1}$.
6. Compute $\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}={\sum }_{i=1}^{d}{\widehat{M}}_{\mathrm{in};\bullet ,i}{V}_{1,i}^{* }={\widehat{M}}_{\mathrm{out}}^{-1}DU{g}_{\bullet ,1}^{{\rm{t}}}$, $\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|={\sum }_{k=1}^{d}{U}_{k,1}^{* }{\widehat{M}}_{\mathrm{out};k,\bullet }={g}_{1,\bullet }^{{\rm{t}}}V{D}^{\dagger }{\widehat{M}}_{\mathrm{in}}^{-1}$, and $\widehat{{ \mathcal O }}(\chi )={\widehat{M}}_{\mathrm{out}}^{-1}\widetilde{{ \mathcal O }}(\chi ){\widehat{M}}_{\mathrm{in}}^{-1}$ for each χ.
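The linear-algebra part of the protocol above can be sketched with NumPy. In this illustration the "experimental" matrices are generated from a small random model, which is purely hypothetical and exactly low-rank, so the truncation is exact; in a real experiment the dimension d would instead be read off from a gap in the singular values. Note that `numpy.linalg.svd` returns factors with `g_t = u @ diag(s) @ vh`, so U and V of the text correspond to the conjugate transposes of `u` and `vh`. All variable names are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_true, d_trial = 3, 5  # hypothetical model dimension and number of trial vectors

# Hypothetical "true" model, used only to generate what an experiment would supply.
ops = {chi: rng.normal(size=(d_true, d_true)) / np.sqrt(d_true) for chi in ("a", "b")}
m_in_t = rng.normal(size=(d_true, d_trial))   # columns: trial states, first is rho_in
m_out_t = rng.normal(size=(d_trial, d_true))  # rows: trial observables, first is Q_out

# Data matrices g^t and O~^t(chi).
g_t = m_out_t @ m_in_t
o_tilde_t = {chi: m_out_t @ op @ m_in_t for chi, op in ops.items()}

# Singular value decomposition and truncation to the d largest singular values.
u, s, vh = np.linalg.svd(g_t)
d = d_true
g = np.diag(s[:d])
proj_u = u[:, :d].conj().T   # plays the role of D U
proj_v = vh[:d, :].conj().T  # plays the role of V D^dagger
o_tilde = {chi: proj_u @ m @ proj_v for chi, m in o_tilde_t.items()}

# Gauge choice: M_in_hat = identity, hence M_out_hat = g.
m_in_hat = np.eye(d)
m_out_hat = g @ np.linalg.inv(m_in_hat)

# Reconstructed initial state, observable and operations.
rho_hat = np.linalg.inv(m_out_hat) @ proj_u @ g_t[:, 0]
q_hat = g_t[0, :] @ proj_v @ np.linalg.inv(m_in_hat)
o_hat = {chi: np.linalg.inv(m_out_hat) @ m @ np.linalg.inv(m_in_hat)
         for chi, m in o_tilde.items()}


def predict(sequence):
    """Mean value predicted by the reconstructed model (first gate applied first)."""
    vec = rho_hat
    for chi in sequence:
        vec = o_hat[chi] @ vec
    return float(q_hat @ vec)


def exact(sequence):
    """Mean value of the hypothetical true model, for comparison."""
    vec = m_in_t[:, 0]
    for chi in sequence:
        vec = ops[chi] @ vec
    return float(m_out_t[0, :] @ vec)


print(s)
print(predict("abba"), exact("abba"))
```

The reconstructed matrices differ from the true ones by a similarity transformation (the gauge fixed by the choice of the estimated input matrix), but predicted mean values are gauge-independent, which is what the final comparison checks.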


6.2. Maximum likelihood estimation

The alternative method for determining the error model is based on MLE. Given a model of the quantum computer with unknown parameters, MLE finds the estimated values of the unknown parameters such that the likelihood of samples observed in the experiment is maximized. Let d-dimensional column vector $\left|{\bar{\rho }}_{\mathrm{in}}({\boldsymbol{x}})\right.\unicode{x027EB}$, row vector $\left.\unicode{x027EA}{\bar{Q}}_{\mathrm{out}}({\boldsymbol{x}})\right|$ and matrices $\{\bar{{ \mathcal O }}(\chi ,{\boldsymbol{x}})\}$, respectively corresponding to the initial state, measured observable and operations, be the theoretical model of the quantum computer depending on parameters x. Our goal is to estimate parameters x based on data from the experiment. The mean of Qout in ρin after a sequence of operations measured in the experiment is $C=\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }({\chi }_{N})\,\cdots { \mathcal O }({\chi }_{1})\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}+\delta $, where δ is the deviation from the actual mean value, and the mean according to the model is $\bar{C}({\boldsymbol{x}})=\left.\unicode{x027EA}{\bar{Q}}_{\mathrm{out}}({\boldsymbol{x}})\right|\bar{{ \mathcal O }}({\chi }_{N},{\boldsymbol{x}})\cdots \bar{{ \mathcal O }}({\chi }_{1},{\boldsymbol{x}})\left|{\bar{\rho }}_{\mathrm{in}}({\boldsymbol{x}})\right.\unicode{x027EB}$. Using the Gaussian approximation, the likelihood function to be maximized is $L({\boldsymbol{x}})=\exp \{-{[\bar{C}({\boldsymbol{x}})-C]}^{2}/{\sigma }^{2}\}$, where σ is the standard deviation of C. In the practical implementation, multiple quantum circuits and corresponding mean values are needed to determine the error model.
The protocol is as follows:
1. Parameterize the d-dimensional column vector $\left|{\bar{\rho }}_{\mathrm{in}}({\boldsymbol{x}})\right.\unicode{x027EB}$, row vector $\left.\unicode{x027EA}{\bar{Q}}_{\mathrm{out}}({\boldsymbol{x}})\right|$ and matrix $\bar{{ \mathcal O }}(\chi ,{\boldsymbol{x}})$ for each χ as functions of parameters x = (x1, x2, …).
2. Choose M circuits $\{{{\boldsymbol{\chi }}}_{1},\ldots ,{{\boldsymbol{\chi }}}_{M}\}$. For each circuit ${{\boldsymbol{\chi }}}_{m}\,=({\chi }_{m,1},\cdots ,{\chi }_{m,{N}_{m}})$, obtain $\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }({\chi }_{m,{N}_{m}})\cdots { \mathcal O }({\chi }_{m,1})\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}$ in the experiment. The result is Cm.
3. Maximize the likelihood function $L({\boldsymbol{x}})={\prod }_{m\,=\,1}^{M}\exp \{-{[{\bar{C}}_{m}({\boldsymbol{x}})-{C}_{m}]}^{2}/{\sigma }_{m}^{2}\}$, where ${\bar{C}}_{m}({\boldsymbol{x}})=\left.\unicode{x027EA}{\bar{Q}}_{\mathrm{out}}({\boldsymbol{x}})\right|\bar{{ \mathcal O }}({\chi }_{m,{N}_{m}},{\boldsymbol{x}})\cdots \bar{{ \mathcal O }}({\chi }_{m,1},{\boldsymbol{x}})\left|{\bar{\rho }}_{\mathrm{in}}({\boldsymbol{x}})\right.\unicode{x027EB}$, and ${\sigma }_{m}^{2}$ is the variance of Cm. The likelihood function is maximized at $\widehat{{\boldsymbol{x}}}=\arg \ {\max }_{{\boldsymbol{x}}}\{L({\boldsymbol{x}})\}$.
4. Compute $\left|{\widehat{\rho }}_{\mathrm{in}}\right.\unicode{x027EB}=\left|{\bar{\rho }}_{\mathrm{in}}(\widehat{{\boldsymbol{x}}})\right.\unicode{x027EB}$, $\left.\unicode{x027EA}{\widehat{Q}}_{\mathrm{out}}\right|=\left.\unicode{x027EA}{\bar{Q}}_{\mathrm{out}}(\widehat{{\boldsymbol{x}}})\right|$, and $\widehat{{ \mathcal O }}(\chi )=\bar{{ \mathcal O }}(\chi ,\widehat{{\boldsymbol{x}}})$ for each χ.
We can parameterize the error model by taking each vector and matrix element as a parameter. If the main source of temporal correlations is low-frequency noise or context-dependent noise as discussed in section 5, we can parameterize the error model according to equations (8)−(10). Choosing the proper initial values of parameters is important in the MLE method, and it is convenient to implement LIM and take the result of LIM as the initial values.
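The MLE steps can be illustrated on a deliberately small problem. The sketch below assumes a hypothetical one-parameter model in which the predicted mean for a circuit of N gates follows the decay law of section 7 with the single unknown x = σ², and it replaces a real optimizer by a grid search over the log-likelihood. The data are synthetic and noise-free; the function names, the grid and the error bars are all assumptions of this example.

```python
import math


def model_mean(n_gates, sigma2):
    """Hypothetical one-parameter model: predicted mean after n_gates gates."""
    return 0.5 * (1.0 + 1.0 / math.sqrt(1.0 + 2.0 * n_gates * sigma2))


def log_likelihood(sigma2, data):
    """log L(x) = -sum_m [C_bar_m(x) - C_m]^2 / sigma_m^2, data = [(N_m, C_m, sigma_m), ...]."""
    return -sum((model_mean(n, sigma2) - c) ** 2 / err ** 2 for n, c, err in data)


def maximize_on_grid(data, grid):
    """Return the grid point with the largest log-likelihood (stand-in for an optimizer)."""
    return max(grid, key=lambda x: log_likelihood(x, data))


# Synthetic, noise-free "measurements" generated with sigma2 = 0.1 (illustration only).
data = [(n, model_mean(n, 0.1), 0.01) for n in (1, 5, 10, 50, 100)]
grid = [k / 1000.0 for k in range(1, 501)]
estimate = maximize_on_grid(data, grid)
print(estimate)
```

Working with the logarithm of the likelihood turns the product over circuits into a sum, so maximizing L(x) is the same as minimizing the weighted sum of squared residuals. For a full matrix-valued model one would use a gradient-based optimizer started from the LIM result, as suggested in the text.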

7. Numerical simulation of low-frequency noise

To demonstrate our protocols numerically, we consider a model of one qubit with time-dependent gate fidelities and implement LOT using the numerical simulation on a classical computer. In the model, gate fidelities depend on a low-frequency time-dependent variable λ, whose distribution is Gaussian. We assume that the change of the variable is negligible in the time scale of a quantum circuit. The initial state and the observable to be measured are error free, which are ${\rho }_{\mathrm{in}}=\left|0\right\rangle {\left\langle 0\right|}_{{\rm{S}}}\otimes {\rho }_{{\rm{E}}}$ and ${Q}_{\mathrm{out}}=\left|0\right\rangle {\left\langle 0\right|}_{{\rm{S}}}\otimes {{\mathbb{1}}}_{{\rm{E}}}$, respectively. Here, the state of the environment is ${\rho }_{{\rm{E}}}=\int {\rm{d}}\lambda \tfrac{1}{\sqrt{2\pi {\sigma }^{2}}}\exp (-\tfrac{{\lambda }^{2}}{2{\sigma }^{2}})\left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$. We remark that LOT can deal with state preparation and measurement errors in the same way as GST. We neglect state preparation and measurement errors in our simulation for simplicity. Errors in single-qubit gates are depolarizing errors, and depolarizing rates depend on λ. For a unitary single-qubit gate G, the actual gate with error is ${{ \mathcal O }}_{{\rm{S}}}(G,\lambda )={ \mathcal E }({\epsilon }_{G}(\lambda ))[G]$, where εG(λ) is the depolarizing rate, ${ \mathcal E }(\epsilon )=(1-\epsilon )[{{\mathbb{1}}}_{{\rm{S}}}]+\epsilon { \mathcal D }$, and ${ \mathcal D }=\tfrac{1}{4}([{{\mathbb{1}}}_{{\rm{S}}}]+[{X}_{{\rm{S}}}]+[{Y}_{{\rm{S}}}]+[{Z}_{{\rm{S}}}])$. Here, X, Y and Z are Pauli operators. Then the operation on SE for the gate G is ${ \mathcal O }(G)=\int {\rm{d}}\lambda {{ \mathcal O }}_{{\rm{S}}}(G,\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$.

We consider two single-qubit gates, the Hadamard gate H and the phase gate S, which can generate all single-qubit Clifford gates. We take ${\epsilon }_{H}(\lambda )={\epsilon }_{S}(\lambda )=\eta [1-\exp (-{\lambda }^{2})]$, so both gates are optimal at λ = 0. Here, η ∈ [0, 1] denotes the strength of the noise. RB is the usual way to verify a quantum computing system [7, 8, 10]. In our simulation, we perform a sequence of H and S gates chosen uniformly at random. We initialize the qubit in the state $\left|0\right\rangle $, perform the random gate sequence and measure the probability of the state $\left|0\right\rangle $. We only take into account gate sequences for which the final state is $\left|0\right\rangle $ in the case of ideal gates without error, so that the probability of the state $\left|0\right\rangle $ is expected to be 1. When errors are switched on, the probability of the state $\left|0\right\rangle $ is $F({N}_{{\rm{g}}})=(1+1/\sqrt{1+2{N}_{{\rm{g}}}{\sigma }^{2}})/2$ if η = 1, where Ng is the number of gates in the random gate sequence. The non-exponential decay of the probability is due to temporal correlations [7–13]. Without temporal correlations, the probability decreases exponentially with the gate number: if depolarizing rates are constants, i.e. εH(λ) = εS(λ) = ε, we have $F({N}_{{\rm{g}}})=[1+{\left(1-\epsilon \right)}^{{N}_{{\rm{g}}}}]/2$.
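
The decay law above can be checked with a short numerical sketch. It is not the simulation code used for the figures; it only assumes that every gate in a sequence sees the same value of λ, that λ is Gaussian with zero mean and variance σ2, and that the ideal sequence returns the qubit to the initial state. The function names and the integration grid are illustrative choices.

```python
import math

def depolarizing_rate(lam, eta):
    """Gate depolarizing rate eps(lambda) = eta * [1 - exp(-lambda^2)]."""
    return eta * (1.0 - math.exp(-lam * lam))

def survival_probability(n_gates, sigma, eta, half_width=10.0, n_steps=20000):
    """Gaussian average of [1 + (1 - eps(lambda))^N_g] / 2.

    The average over lambda ~ N(0, sigma^2) is a trapezoidal sum on
    [-half_width * sigma, +half_width * sigma] (an assumed, generous cutoff).
    """
    start = -half_width * sigma
    step = 2.0 * half_width * sigma / n_steps
    norm = math.sqrt(2.0 * math.pi * sigma**2)
    total = 0.0
    for k in range(n_steps + 1):
        lam = start + k * step
        weight = 0.5 if k in (0, n_steps) else 1.0
        density = math.exp(-lam * lam / (2.0 * sigma**2)) / norm
        decay = (1.0 - depolarizing_rate(lam, eta)) ** n_gates
        total += weight * density * 0.5 * (1.0 + decay)
    return total * step

def closed_form_eta_one(n_gates, sigma):
    """F(N_g) = (1 + 1 / sqrt(1 + 2 N_g sigma^2)) / 2, valid for eta = 1."""
    return 0.5 * (1.0 + 1.0 / math.sqrt(1.0 + 2.0 * n_gates * sigma**2))

def constant_rate_probability(n_gates, eps):
    """Exponential decay [1 + (1 - eps)^N_g] / 2 for a constant depolarizing rate."""
    return 0.5 * (1.0 + (1.0 - eps) ** n_gates)
```

Comparing `survival_probability(n, sigma, 1.0)` with `closed_form_eta_one(n, sigma)` for several gate numbers reproduces the algebraic, non-exponential tail, whereas `constant_rate_probability` decays exponentially towards 1/2.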

In our simulation, we implement both LIM and MLE. We take state-space dimensions d = 4 and d = 7 to compare LOT with conventional GST. In approximate models of classical random variables with stationary distribution (see section 5.1), the state space is $[({d}_{{\rm{S}}}^{2}-1)m+1]$-dimensional when the system and environment Hilbert spaces are respectively dS-dimensional and m-dimensional, as explained in appendix D. Therefore, d = 4 and d = 7 correspond to the m = 1 and m = 2 approximations, respectively. If d = 4, the LOT protocol is equivalent to the conventional GST protocol, because the one-dimensional environment is trivial and does not have any effect. As shown in figure 2, LOT with d = 7 can characterize the behavior of the quantum computer much more accurately than LOT with d = 4 (i.e. conventional GST). We remark that the Hilbert space dimension of the environment is infinite in the error model used in the numerical simulation, and the full tomography of the entire Hilbert space using informationally-complete states and observables is impractical. The number of measurements needed in the tomography increases with the dimension of the space. In our method, when we truncate the space dimension to d = 7, we can already reconstruct an approximate error model that is consistent with the behavior of the quantum computer with temporal correlations.
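
The dimension counting used here is simple enough to write down explicitly. The helper below is only an illustration of the formula quoted from appendix D; its name and arguments are hypothetical.

```python
def state_space_dimension(d_system, m_environment):
    """Dimension [(d_S^2 - 1) m + 1] of the truncated state space.

    d_system      -- Hilbert-space dimension d_S of the system (2 for one qubit)
    m_environment -- number m of values kept for the classical random variable
    """
    return (d_system**2 - 1) * m_environment + 1

# One qubit with a trivial environment (conventional GST) and with a
# two-valued environment (the d = 7 model of this section).
d_gst = state_space_dimension(2, 1)
d_lot = state_space_dimension(2, 2)
```

Since each operation is represented by a d × d matrix, going from m = 1 to m = 2 for one qubit raises the number of matrix elements per gate from 16 to 49.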

Figure 2. Probabilities of the state $\left|0\right\rangle $ after a sequence of randomly chosen Hadamard and phase gates as functions of the gate number. We initialize the qubit in the state $\left|0\right\rangle $, perform the random gate sequence and measure the probability of the state $\left|0\right\rangle $. We only take into account gate sequences for which the final state is $\left|0\right\rangle $ in the case of ideal gates without error, so the probability should be 1 in that case. In our simulation, we take σ = 1 and η = 0.02. In the presence of errors, the probability in the actual quantum computation (QC) decreases with the gate number (black curve). Based on error models obtained in linear operator tomography (LOT) using MLE, we can estimate the decreasing probability, and the results are plotted. We find that the error model with d = 7 (red crosses) fits the actual behavior of the quantum computer much more accurately than the error model with d = 4 (blue squares). When d = 4, LOT is equivalent to conventional GST. Results for the linear inversion method are similar. See appendix E for details of the simulation and more data.


In the simulation of LOT using MLE, we parametrize the state, observable and operations as follows. The state is in the form ${\bar{\rho }}_{\mathrm{in}}={\sum }_{\lambda \in L}{p}^{{\rm{a}}}(\lambda )\left|0\right\rangle {\left\langle 0\right|}_{{\rm{S}}}\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}$. The observable is in the form ${\bar{Q}}_{\mathrm{out}}={\sum }_{\lambda \in L}\left|0\right\rangle {\left\langle 0\right|}_{{\rm{S}}}\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}$. The gate G with error is in the form $\bar{{ \mathcal O }}(G)={\sum }_{\lambda \in L}{ \mathcal E }({\epsilon }_{G}^{{\rm{a}}}(\lambda ))[G]\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}^{{\rm{a}}}$. We take {pa(λ)} and $\{{\epsilon }_{G}^{{\rm{a}}}(\lambda )\}$ as parameters (i.e. x) in MLE. With the error model parametrized in this way, the number of values that λ can take is important, but the value of λ is not important. For the one-dimensional environment approximation, i.e. d = 4, we take L = {1}; and for the two-dimensional environment approximation, i.e. d = 7, we take L = {1, 2}. Using MLE, we obtain pa(1) = 0.5606, pa(2) = 0.4394, ${\epsilon }_{H}^{{\rm{a}}}(1)\simeq {\epsilon }_{S}^{{\rm{a}}}(1)=2.485\,\times \,{10}^{-3}$ and ${\epsilon }_{H}^{{\rm{a}}}(2)\simeq {\epsilon }_{S}^{{\rm{a}}}(2)\,=1.606\,\times \,{10}^{-2}$.
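
To see what the fitted two-value model predicts, one can evaluate the probability of the state |0⟩ directly from the quoted parameters. This is a sketch under stated assumptions rather than the authors' code: it uses a single rate per value of λ for both gates (the fitted H and S rates are approximately equal), assumes that λ is fixed within one sequence, and assumes that the ideal sequence returns the qubit to |0⟩. The function and variable names are illustrative.

```python
def predicted_probability(n_gates, weights, rates):
    """Mixture of exponential decays: sum_lambda p(lambda) [1 + (1 - eps(lambda))^N_g] / 2."""
    return sum(p * 0.5 * (1.0 + (1.0 - eps) ** n_gates)
               for p, eps in zip(weights, rates))

# Parameters quoted in the text for the d = 7 model (lambda = 1, 2).
weights_d7 = [0.5606, 0.4394]
rates_d7 = [2.485e-3, 1.606e-2]

# Predicted probabilities for a few sequence lengths.
lengths = [0, 10, 100, 1000]
predictions = [predicted_probability(n, weights_d7, rates_d7) for n in lengths]
```

A single-valued model (d = 4) has only one term in the sum and therefore always predicts a purely exponential decay, which is why it cannot follow the slower tail produced by low-frequency noise.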

In the case that the random variable takes two values λ = 1, 2, the environment state is ${\rho }_{{\rm{E}}}^{{\rm{a}}}={p}^{{\rm{a}}}(1)\left|1\right\rangle {\left\langle 1\right|}_{{\rm{E}}}^{{\rm{a}}}\,+{p}^{{\rm{a}}}(2)\left|2\right\rangle {\left\langle 2\right|}_{{\rm{E}}}^{{\rm{a}}}$. Because the distribution is stationary, the state of the environment does not evolve. Usually, we can use an eight-dimensional Pauli transfer matrix to represent an operation on a qubit and a classical bit [6]. However, because the component ${I}_{{\rm{S}}}\otimes \left[{p}^{{\rm{a}}}(2)\left|1\right\rangle {\left\langle 1\right|}_{{\rm{E}}}^{{\rm{a}}}-{p}^{{\rm{a}}}(1)\left|2\right\rangle {\left\langle 2\right|}_{{\rm{E}}}^{{\rm{a}}}\right]$ is always zero (see appendix D), the dimension of the state space is effectively seven. For the operation ${ \mathcal O }$, the corresponding seven-dimensional Pauli transfer matrix is ${M}_{{ \mathcal O };\sigma ,\ \tau }\, = \mathrm{Tr}[\sigma { \mathcal O }(\tau )]/2$, where
$\sigma ,\ \ \tau ={I}_{{\rm{S}}}\otimes {\rho }_{{\rm{E}}}^{{\rm{a}}}/\sqrt{a},\ {X}_{{\rm{S}}}\otimes \left|1\right\rangle {\left\langle 1\right|}_{{\rm{E}}}^{{\rm{a}}},\ {Y}_{{\rm{S}}}\otimes \left|1\right\rangle {\left\langle 1\right|}_{{\rm{E}}}^{{\rm{a}}},\ {Z}_{{\rm{S}}}\otimes \left|1\right\rangle {\left\langle 1\right|}_{{\rm{E}}}^{{\rm{a}}},\ {X}_{{\rm{S}}}\otimes \left|2\right\rangle {\left\langle 2\right|}_{{\rm{E}}}^{{\rm{a}}},\ {Y}_{{\rm{S}}}\otimes \left|2\right\rangle {\left\langle 2\right|}_{{\rm{E}}}^{{\rm{a}}},\ {Z}_{{\rm{S}}}\otimes \left|2\right\rangle {\left\langle 2\right|}_{{\rm{E}}}^{{\rm{a}}}$ and $a=\mathrm{Tr}({\rho }_{{\rm{E}}}^{{\rm{a}}2})$. Pauli transfer matrices obtained using LIM are shown in figure 3. In figure 4, we show the error in probabilities of the ideal state $\left|0\right\rangle $ estimated using different methods.
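
The seven-dimensional Pauli transfer matrix defined above can be built explicitly for the two-value model. The following NumPy sketch is an illustration under assumptions, not the authors' implementation: the environment values λ = 1, 2 are stored at indices 0, 1, the classical environment is represented by diagonal 2 × 2 matrices, and all function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
HADAMARD = (X + Z) / np.sqrt(2)

def noisy_gate_on_system(gate, eps, op):
    """E(eps)[G] on a system operator: (1 - eps) G op G^dag + eps D(G op G^dag)."""
    rotated = gate @ op @ gate.conj().T
    depolarized = sum(P @ rotated @ P for P in (I2, X, Y, Z)) / 4
    return (1 - eps) * rotated + eps * depolarized

def operation_on_se(gate, rates, op_se):
    """O(G) = sum_lambda E(eps(lambda))[G] (x) |lambda><lambda| on a 2m x 2m operator."""
    m = len(rates)
    blocks = op_se.reshape(2, m, 2, m)  # indices (s, e, s', e')
    out = np.zeros_like(op_se)
    for lam, eps in enumerate(rates):
        projector = np.zeros((m, m), dtype=complex)
        projector[lam, lam] = 1
        system_part = noisy_gate_on_system(gate, eps, blocks[:, lam, :, lam])
        out += np.kron(system_part, projector)
    return out

def pauli_transfer_basis(weights):
    """Basis I (x) rho_E / sqrt(a), then X, Y, Z (x) |lambda><lambda| for each lambda."""
    m = len(weights)
    rho_env = np.diag(np.array(weights, dtype=complex))
    a = np.trace(rho_env @ rho_env).real
    basis = [np.kron(I2, rho_env) / np.sqrt(a)]
    for lam in range(m):
        projector = np.zeros((m, m), dtype=complex)
        projector[lam, lam] = 1
        basis += [np.kron(P, projector) for P in (X, Y, Z)]
    return basis

def pauli_transfer_matrix(gate, weights, rates):
    """M[sigma, tau] = Tr[sigma O(tau)] / 2 in the basis above."""
    basis = pauli_transfer_basis(weights)
    return np.array([[np.trace(s @ operation_on_se(gate, rates, t)).real / 2
                      for t in basis] for s in basis])

M_H = pauli_transfer_matrix(HADAMARD, [0.5606, 0.4394], [2.485e-3, 1.606e-2])
```

In this model the matrix is block diagonal: a 1 for the identity component, followed by one 3 × 3 block per value of λ equal to the ideal rotation scaled by 1 − ε(λ).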

Figure 3. Pauli transfer matrices obtained using the linear inversion method. The difference between the matrix obtained in tomography and the matrix of the ideal gate, i.e. ${M}_{{ \mathcal O }}-{M}_{{ \mathcal O }}^{\mathrm{ideal}}$, is plotted. See appendix E for ${M}_{{ \mathcal O }}^{\mathrm{ideal}}$.


Figure 4. The difference between probabilities of the ideal state $\left|0\right\rangle $ after a sequence of randomly chosen gates. (a,b) The probability difference ${F}_{\mathrm{tom}}-{F}_{\mathrm{act}}$, where ${F}_{\mathrm{act}}$ is the probability obtained in the actual quantum computation, and ${F}_{\mathrm{tom}}$ is the probability estimated using linear operator tomography (LOT). Error bars denote one standard deviation.


8. Conclusions

We have proposed self-consistent tomography protocols to obtain the model of temporally correlated errors in a quantum computer. Given sufficient data from the experiment, the model obtained in our protocol can be exact. We also propose approximate models for the practical implementation. To obtain approximate models characterizing temporal correlations, more quantities need to be measured compared with conventional QPT and GST, but the overhead is moderate. We can use such approximate models to predict the behavior of a quantum computer much more accurately than the model obtained in GST, for systems with temporally correlated errors. Our protocols provide a way to quantitatively assess temporal correlations in quantum computers.

Appendix A. Exact LOT

We consider two subspaces ${\hat{V}}_{\mathrm{in}}=\mathrm{span}(\{\left.\unicode{x027EA}{Q}_{\mathrm{out}}\right|{ \mathcal O }{P}_{\mathrm{in}}\,| \,{ \mathcal O }\in O\})$ and ${\hat{V}}_{\mathrm{out}}=\mathrm{span}(\{{P}_{\mathrm{out}}{ \mathcal O }\left|{\rho }_{\mathrm{in}}\right.\unicode{x027EB}\,| \,{ \mathcal O }\in O\})$. We use ${\hat{P}}_{\mathrm{in}}$ and ${\hat{P}}_{\mathrm{out}}$ to denote the orthogonal projection on ${\hat{V}}_{\mathrm{in}}$ and ${\hat{V}}_{\mathrm{out}}$, respectively. Here, ${\hat{V}}_{\mathrm{out}}=V$ and ${\hat{P}}_{\mathrm{out}}=P$. ${P}_{\mathrm{in}\cap \overline{\mathrm{out}}}$ is the orthogonal projection on the intersection of Vin and the orthogonal complement of Vout. ${P}_{\mathrm{out}\cap \overline{\mathrm{in}}}$ is the orthogonal projection on the intersection of Vout and the orthogonal complement of Vin. Then, ${\hat{P}}_{\mathrm{in}}={P}_{\mathrm{in}}-{P}_{\mathrm{in}\cap \overline{\mathrm{out}}}$ and ${\hat{P}}_{\mathrm{out}}={P}_{\mathrm{out}}-{P}_{\mathrm{out}\cap \overline{\mathrm{in}}}$.

Let ${ \mathcal O }\in O$, all the following expressions are valid.$\begin{eqnarray}{\hat{P}}_{\mathrm{in}}={\hat{P}}_{\mathrm{in}}{P}_{\mathrm{in}}={P}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}},\end{eqnarray}$$\begin{eqnarray}{\hat{P}}_{\mathrm{out}}={\hat{P}}_{\mathrm{out}}{P}_{\mathrm{out}}={P}_{\mathrm{out}}{\hat{P}}_{\mathrm{out}},\end{eqnarray}$$\begin{eqnarray}{P}_{\mathrm{out}}{P}_{\mathrm{in}}={P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}={\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}},\end{eqnarray}$$\begin{eqnarray}\begin{array}{rcl}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}} & = & {P}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}={P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}={\hat{P}}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}.\end{array}\end{eqnarray}$

We have$\begin{eqnarray}{P}_{\mathrm{in}}{P}_{\mathrm{in}\cap \overline{\mathrm{out}}}={P}_{\mathrm{in}\cap \overline{\mathrm{out}}}{P}_{\mathrm{in}}={P}_{\mathrm{in}\cap \overline{\mathrm{out}}},\end{eqnarray}$$\begin{eqnarray}{P}_{\mathrm{out}}{P}_{\mathrm{in}\cap \overline{\mathrm{out}}}={P}_{\mathrm{in}\cap \overline{\mathrm{out}}}{P}_{\mathrm{out}}=0,\end{eqnarray}$$\begin{eqnarray}{P}_{\mathrm{out}}{P}_{\mathrm{out}\cap \overline{\mathrm{in}}}={P}_{\mathrm{out}\cap \overline{\mathrm{in}}}{P}_{\mathrm{out}}={P}_{\mathrm{out}\cap \overline{\mathrm{in}}},\end{eqnarray}$$\begin{eqnarray}{P}_{\mathrm{in}}{P}_{\mathrm{out}\cap \overline{\mathrm{in}}}={P}_{\mathrm{out}\cap \overline{\mathrm{in}}}{P}_{\mathrm{in}}=0.\end{eqnarray}$Therefore, equation (A1), equation (A2) and equation (A3) are valid.
Using ${ \mathcal O }{P}_{\mathrm{in}}={P}_{\mathrm{in}}{ \mathcal O }{P}_{\mathrm{in}}$ and ${P}_{\mathrm{out}}{ \mathcal O }={P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{out}}$, we have$\begin{eqnarray}\begin{array}{rcl}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}} & = & {P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{out}}{P}_{\mathrm{in}}={P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}\\ & = & {P}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}={P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}}\\ & = & {P}_{\mathrm{out}}{P}_{\mathrm{in}}{ \mathcal O }{P}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}}={P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}{ \mathcal O }{P}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}}\\ & = & {P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}.\end{array}\end{eqnarray}$Similarly,$\begin{eqnarray}\begin{array}{rcl}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}} & = & {P}_{\mathrm{out}}{P}_{\mathrm{in}}{ \mathcal O }{P}_{\mathrm{in}}={\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}{ \mathcal O }{P}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}={\hat{P}}_{\mathrm{out}}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{out}}{P}_{\mathrm{in}}={\hat{P}}_{\mathrm{out}}{P}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}.\end{array}\end{eqnarray}$Therefore, equation (A4) is valid.

Let $\left|\sigma \right.\unicode{x027EB}\in {V}_{\mathrm{in}}$, $\left.\unicode{x027EA}H\right|\in {V}_{\mathrm{out}}$ and ${{ \mathcal O }}_{i}\in O$, where $i=1,2,\ldots ,N$. Then,$\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}H\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{in}}\cdots {\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{out}}\cdots {\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{out}}\left|\sigma \right.\unicode{x027EB}.\,\,\,\end{array}\end{eqnarray}$

Using lemma 1, we have$\begin{eqnarray}\begin{array}{rcl}{P}_{\mathrm{out}}{ \mathcal O }^{\prime} {P}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{in}} & = & {P}_{\mathrm{out}}{ \mathcal O }^{\prime} {P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}}\\ & = & {P}_{\mathrm{out}}{ \mathcal O }^{\prime} {P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}{\hat{P}}_{\mathrm{in}}\\ & = & {P}_{\mathrm{out}}{ \mathcal O }^{\prime} {\hat{P}}_{\mathrm{in}}{ \mathcal O }{\hat{P}}_{\mathrm{in}},\end{array}\end{eqnarray}$ and $\begin{eqnarray}\begin{array}{rcl}{\hat{P}}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}{ \mathcal O }^{\prime} {P}_{\mathrm{in}} & = & {\hat{P}}_{\mathrm{out}}{P}_{\mathrm{out}}{ \mathcal O }{P}_{\mathrm{in}}{ \mathcal O }^{\prime} {P}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{\hat{P}}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}{ \mathcal O }^{\prime} {P}_{\mathrm{in}}\\ & = & {\hat{P}}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}{ \mathcal O }^{\prime} {P}_{\mathrm{in}}.\end{array}\end{eqnarray}$
Using $\left|\sigma \right.\unicode{x027EB}={P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}$ and $\left.\unicode{x027EA}H\right|=\left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}$, we have$\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}H\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}{P}_{\mathrm{out}}\cdots {P}_{\mathrm{out}}{{ \mathcal O }}_{2}{P}_{\mathrm{out}}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}{P}_{\mathrm{out}}\cdots {P}_{\mathrm{out}}{{ \mathcal O }}_{2}{P}_{\mathrm{out}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{in}}\cdots {\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{in}}\cdots {\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{in}}\cdots {\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}.\end{array}\end{eqnarray}$Similarly,$\begin{eqnarray}\begin{array}{rcl} & & \left.\unicode{x027EA}H\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|\sigma 
\right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{P}_{\mathrm{out}}{{ \mathcal O }}_{N}{P}_{\mathrm{in}}\cdots {P}_{\mathrm{in}}{{ \mathcal O }}_{2}{P}_{\mathrm{in}}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{P}_{\mathrm{in}}\cdots {P}_{\mathrm{in}}{{ \mathcal O }}_{2}{P}_{\mathrm{in}}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{out}}\cdots {\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{1}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{out}}\cdots {\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{out}}{P}_{\mathrm{in}}\left|\sigma \right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}H\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{out}}\cdots {\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{out}}\left|\sigma \right.\unicode{x027EB}.\end{array}\end{eqnarray}$

Let $d=\mathrm{Tr}(P)$, and each of $\{P\left|{\rho }_{i}\right.\unicode{x027EB}\}$ and $\{\left.\unicode{x027EA}{Q}_{k}\right|P\}$ be a set of d linearly-independent vectors. Then, $g={M}_{\mathrm{out}}{M}_{\mathrm{in}}={M}_{\mathrm{out}}{{PM}}_{\mathrm{in}}$ is invertible, $\widetilde{{ \mathcal O }}={M}_{\mathrm{out}}{ \mathcal O }{M}_{\mathrm{in}}\,={M}_{\mathrm{out}}P{ \mathcal O }{{PM}}_{\mathrm{in}}$ and$\begin{eqnarray}\begin{array}{rcl}C & = & {M}_{\mathrm{out}}{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}{M}_{\mathrm{in}}\\ & = & {M}_{\mathrm{out}}P{{ \mathcal O }}_{N}P\cdots {{ \mathcal O }}_{2}P{{ \mathcal O }}_{1}{{PM}}_{\mathrm{in}}\\ & = & {\widetilde{{ \mathcal O }}}_{N}{g}^{-1}\cdots {\widetilde{{ \mathcal O }}}_{2}{g}^{-1}{\widetilde{{ \mathcal O }}}_{1}.\end{array}\end{eqnarray}$

We remark that $P={\hat{P}}_{\mathrm{out}}$, and the theorem is also valid for ${\hat{P}}_{\mathrm{in}}$.
According to definitions of ${M}_{\mathrm{out}}$ and ${M}_{\mathrm{in}}$, we have ${g}_{k,i}=\unicode{x027EA}{Q}_{k}| {\rho }_{i}\unicode{x027EB}$. Because $\left|{\rho }_{i}\right.\unicode{x027EB}\in {V}_{\mathrm{in}}$ and $\left.\unicode{x027EA}{Q}_{k}\right|\in {V}_{\mathrm{out}}$, we have ${g}_{k,i}=\left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{in}}\left|{\rho }_{i}\right.\unicode{x027EB}=\left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{out}}\left|{\rho }_{i}\right.\unicode{x027EB}$. Here, we have used theorem 1. Therefore, ${M}_{\mathrm{out}}{M}_{\mathrm{in}}={M}_{\mathrm{out}}{{PM}}_{\mathrm{in}}$.
Similarly, we have ${\widetilde{{ \mathcal O }}}_{k,i}=\left.\unicode{x027EA}{Q}_{k}\right|{ \mathcal O }\left|{\rho }_{i}\right.\unicode{x027EB}=\left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{in}}{ \mathcal O }{\hat{P}}_{\mathrm{in}}\left|{\rho }_{i}\right.\unicode{x027EB}\,=\left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{out}}{ \mathcal O }{\hat{P}}_{\mathrm{out}}\left|{\rho }_{i}\right.\unicode{x027EB}$. Therefore, ${M}_{\mathrm{out}}{ \mathcal O }{M}_{\mathrm{in}}={M}_{\mathrm{out}}P{ \mathcal O }{{PM}}_{\mathrm{in}}$.
We also have$\begin{eqnarray}\begin{array}{rcl} & & {C}_{k,i}=\left.\unicode{x027EA}{Q}_{k}\right|{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{2}{{ \mathcal O }}_{1}\left|{\rho }_{i}\right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{in}}\cdots {\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{in}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{in}}\left|{\rho }_{i}\right.\unicode{x027EB}\\ & = & \left.\unicode{x027EA}{Q}_{k}\right|{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{N}{\hat{P}}_{\mathrm{out}}\cdots {\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{2}{\hat{P}}_{\mathrm{out}}{{ \mathcal O }}_{1}{\hat{P}}_{\mathrm{out}}\left|{\rho }_{i}\right.\unicode{x027EB}.\end{array}\end{eqnarray}$Therefore, the first two lines of equation (A16) are equal.
${{PM}}_{\mathrm{in}}$ is a full rank ${d}_{{\rm{H}}}^{2}\times \mathrm{Tr}(P)$ matrix, and ${M}_{\mathrm{out}}P$ is a full rank $\mathrm{Tr}(P)\times {d}_{{\rm{H}}}^{2}$ matrix. Thus, ${\left({{PM}}_{\mathrm{in}}\right)}^{+}{{PM}}_{\mathrm{in}}={\mathbb{1}}$, ${{PM}}_{\mathrm{in}}{\left({{PM}}_{\mathrm{in}}\right)}^{+}=P$, ${M}_{\mathrm{out}}P{\left({M}_{\mathrm{out}}P\right)}^{+}={\mathbb{1}}$ and ${\left({M}_{\mathrm{out}}P\right)}^{+}{M}_{\mathrm{out}}P\,=P$. Here, ${A}^{+}$ denotes the pseudo inverse of matrix A.
Using pseudo inverses, we have ${g}^{-1}={\left({{PM}}_{\mathrm{in}}\right)}^{+}{\left({M}_{\mathrm{out}}P\right)}^{+}$, i.e. $g={M}_{\mathrm{out}}{{PPM}}_{\mathrm{in}}$ is invertible. Thus,$\begin{eqnarray}\begin{array}{rcl} & & {\widetilde{{ \mathcal O }}}_{N}{g}^{-1}\cdots {\widetilde{{ \mathcal O }}}_{2}{g}^{-1}{\widetilde{{ \mathcal O }}}_{1}\\ & = & {M}_{\mathrm{out}}P{{ \mathcal O }}_{N}{{PM}}_{\mathrm{in}}{\left({{PM}}_{\mathrm{in}}\right)}^{+}{\left({M}_{\mathrm{out}}P\right)}^{+}\cdots \\ & & \times {M}_{\mathrm{out}}P{{ \mathcal O }}_{2}{{PM}}_{\mathrm{in}}{\left({{PM}}_{\mathrm{in}}\right)}^{+}{\left({M}_{\mathrm{out}}P\right)}^{+}{M}_{\mathrm{out}}P{{ \mathcal O }}_{1}{{PM}}_{\mathrm{in}}\\ & = & {M}_{\mathrm{out}}P{{ \mathcal O }}_{N}P\cdots {{ \mathcal O }}_{2}P{{ \mathcal O }}_{1}{{PM}}_{\mathrm{in}}.\end{array}\end{eqnarray}$Therefore, the last two lines of equation (A16) are equal.
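
The identity in equation (A16) can be checked numerically in a simplified setting. The sketch below assumes real matrices, a single subspace V that is invariant under every operation (so that the projector P is the same on the input and output sides), input vectors inside V and generic output vectors. These are simplifying assumptions for illustration, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, N = 6, 3, 4  # full dimension, subspace dimension, sequence length

# Orthonormal basis of the invariant subspace V (columns of B) and its projector P.
B, _ = np.linalg.qr(rng.normal(size=(D, d)))
P = B @ B.T

def random_operation():
    """Random matrix O with O P = P O P, i.e. V is invariant under O."""
    inside = B @ rng.normal(size=(d, d)) @ B.T
    outside = rng.normal(size=(D, D)) @ (np.eye(D) - P)
    return inside + outside

ops = [random_operation() for _ in range(N)]            # ops[0] is O_1
M_in = B @ (np.eye(d) + 0.1 * rng.normal(size=(d, d)))  # columns lie in V
M_out = B.T + 0.1 * rng.normal(size=(d, D))             # rows are generic

# Quantities that are directly measurable in tomography.
g = M_out @ M_in
g_inv = np.linalg.inv(g)
tilde_ops = [M_out @ O @ M_in for O in ops]

# Direct evaluation of M_out O_N ... O_2 O_1 M_in.
direct = M_in
for O in ops:
    direct = O @ direct
direct = M_out @ direct

# Reconstruction from measurable matrices: O~_N g^-1 ... O~_2 g^-1 O~_1.
reconstructed = tilde_ops[0]
for O_tilde in tilde_ops[1:]:
    reconstructed = O_tilde @ g_inv @ reconstructed
```

Here `direct` and `reconstructed` agree up to rounding error even though `g` and the `tilde_ops` are only d × d matrices, which is the content of the theorem in this restricted setting.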

Appendix B. Space dimension truncation

We use $\parallel \cdot \parallel $ to denote a vector norm satisfying $\left|\unicode{x027EA}A| B\unicode{x027EB}\right|\leqslant \parallel \left.\unicode{x027EA}A\right|\parallel \parallel \left|B\right.\unicode{x027EB}\parallel $ and the submultiplicative matrix norm induced by the vector norm, i.e. $\parallel { \mathcal O }\left|B\right.\unicode{x027EB}\parallel \leqslant \parallel { \mathcal O }\parallel \parallel \left|B\right.\unicode{x027EB}\parallel $ and $\parallel {{ \mathcal O }}_{1}{{ \mathcal O }}_{2}\parallel \leqslant \parallel {{ \mathcal O }}_{1}\parallel \parallel {{ \mathcal O }}_{2}\parallel $.

Two examples of such norms are as follows. First, we can take $\parallel \left|B\right.\unicode{x027EB}\parallel =\sqrt{\unicode{x027EA}B| B\unicode{x027EB}}=\sqrt{\mathrm{Tr}({B}^{2})}$. Then, $\parallel \left|B\right.\unicode{x027EB}\parallel =\sqrt{{\sum }_{i}{\sigma }_{i}^{2}}$, where $\{{\sigma }_{i}\}$ are singular values of B. Second, we can take $\parallel \left|B\right.\unicode{x027EB}\parallel ={\parallel B\parallel }_{1}={\sum }_{i}{\sigma }_{i}\geqslant \sqrt{{\sum }_{i}{\sigma }_{i}^{2}}$, where ${\parallel \cdot \parallel }_{1}$ denotes the trace norm.

We use ${\parallel \cdot \parallel }_{\max }$ to denote the max norm.

Let ${N}_{Q}\geqslant \parallel \left.\unicode{x027EA}{Q}_{k}\right|\parallel $ for all k, and ${N}_{\rho }\geqslant \parallel \left|{\rho }_{i}\right.\unicode{x027EB}\parallel $ for all i. Then$\begin{eqnarray}{\parallel {M}_{\mathrm{out}}{{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{1}{M}_{\mathrm{in}}\parallel }_{\max }\leqslant {N}_{Q}{N}_{\rho }\prod _{j=1}^{N}\parallel {{ \mathcal O }}_{j}\parallel .\end{eqnarray}$

According to the property of vector norm, the proof of lemma 2 is straightforward.
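
As a quick numerical illustration of inequality (B1), the sketch below uses the Euclidean vector norm and the induced spectral matrix norm, which is one admissible choice of norms. The random matrices and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_states, n_observables, N = 5, 4, 3, 6

M_in = rng.normal(size=(dim, n_states))        # columns play the role of |rho_i>>
M_out = rng.normal(size=(n_observables, dim))  # rows play the role of <<Q_k|
ops = [rng.normal(size=(dim, dim)) for _ in range(N)]

# Upper bounds N_rho and N_Q on the vector norms (taken to be tight here).
N_rho = np.linalg.norm(M_in, axis=0).max()
N_Q = np.linalg.norm(M_out, axis=1).max()

# Left-hand side: max norm of M_out O_N ... O_1 M_in.
product = M_in
for O in ops:
    product = O @ product
lhs = np.max(np.abs(M_out @ product))

# Right-hand side: N_Q N_rho prod_j ||O_j|| with the spectral norm.
rhs = N_Q * N_rho * np.prod([np.linalg.norm(O, 2) for O in ops])
```

For random matrices the bound is typically far from tight, since it only uses submultiplicativity and the Cauchy-Schwarz inequality.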

Let ${N}_{Q}\geqslant \parallel \left.\unicode{x027EA}{Q}_{k}\right|\parallel $ for all k, and ${N}_{\rho }\geqslant \parallel \left|{\rho }_{i}\right.\unicode{x027EB}\parallel $ for all i. Then, for any sequence of operations,$\begin{eqnarray}\begin{array}{rcl} & & {\parallel {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{1}{M}_{\mathrm{in}}-{M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{n+1}{\overline{{ \mathcal P }}}_{n}{M}_{\mathrm{in}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}{N}_{\rho }\prod _{j=n+1}^{N}\parallel {{ \mathcal O }}_{j}\parallel \\ & & \times \left[\prod _{j=1}^{n}\left(\parallel {{ \mathcal O }}_{j}\parallel +\parallel {\delta }_{{{ \mathcal O }}_{j}}\parallel \right)-\prod _{j=1}^{n}\parallel {{ \mathcal O }}_{j}\parallel \right],\end{array}\end{eqnarray}$where$\begin{eqnarray}{\delta }_{{ \mathcal O }}={{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}-{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}},\end{eqnarray}$$\begin{eqnarray}{\overline{{ \mathcal O }}}_{m}={{ \mathcal O }}_{N}\cdots {{ \mathcal O }}_{m+1}{{ \mathcal O }}_{m},\end{eqnarray}$$\begin{eqnarray}{\overline{{ \mathcal P }}}_{m}={{\rm{\Pi }}}_{\mathrm{in}}{{ \mathcal O }}_{m}{{\rm{\Pi }}}_{\mathrm{in}}\cdots {{\rm{\Pi }}}_{\mathrm{in}}{{ \mathcal O }}_{2}{{\rm{\Pi }}}_{\mathrm{in}}{{ \mathcal O }}_{1}{{\rm{\Pi }}}_{\mathrm{in}}.\end{eqnarray}$

Inequality (B2) holds for n = 0, because$\begin{eqnarray}{\parallel {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{1}{M}_{\mathrm{in}}-{M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{1}{{\rm{\Pi }}}_{\mathrm{in}}{M}_{\mathrm{in}}\parallel }_{\max }=0.\end{eqnarray}$
If inequality (B2) holds for n = m, then it also holds for $n=m+1$. Now we assume that inequality (B2) holds for n = m. Because$\begin{eqnarray}{\parallel {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{1}{M}_{\mathrm{in}}\parallel }_{\max }\leqslant {N}_{Q}{N}_{\rho }\prod _{j=1}^{N}\parallel {{ \mathcal O }}_{j}\parallel ,\end{eqnarray}$we have$\begin{eqnarray}\begin{array}{rcl} & & {\parallel {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{m+1}{\overline{{ \mathcal P }}}_{m}{M}_{\mathrm{in}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}{N}_{\rho }\prod _{j=m+1}^{N}\parallel {{ \mathcal O }}_{j}\parallel \prod _{j=1}^{m}\left(\parallel {{ \mathcal O }}_{j}\parallel +\parallel {\delta }_{{{ \mathcal O }}_{j}}\parallel \right),\end{array}\end{eqnarray}$Because$\begin{eqnarray}\begin{array}{rcl} & & {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{m+2}{\overline{{ \mathcal P }}}_{m+1}{M}_{\mathrm{in}}\\ & = & {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{m+2}{\delta }_{{{ \mathcal O }}_{m+1}}{\overline{{ \mathcal P }}}_{m}{M}_{\mathrm{in}}\\ & & +{M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{m+1}{\overline{{ \mathcal P }}}_{m}{M}_{\mathrm{in}},\end{array}\end{eqnarray}$we have$\begin{eqnarray}\begin{array}{rcl} & & {\parallel {M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{1}{M}_{\mathrm{in}}-{M}_{\mathrm{out}}{\overline{{ \mathcal O }}}_{m+2}{\overline{{ \mathcal P }}}_{m+1}{M}_{\mathrm{in}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}{N}_{\rho }\prod _{j=m+2}^{N}\parallel {{ \mathcal O }}_{j}\parallel \parallel {\delta }_{{{ \mathcal O }}_{m+1}}\parallel \prod _{j=1}^{m}\left(\parallel {{ \mathcal O }}_{j}\parallel +\parallel {\delta }_{{{ \mathcal O }}_{j}}\parallel \right)\\ & & +{N}_{Q}{N}_{\rho }\prod _{j=m+1}^{N}\parallel {{ \mathcal O }}_{j}\parallel \\ & & \times \left[\prod _{j=1}^{m}\left(\parallel {{ \mathcal O }}_{j}\parallel +\parallel {\delta }_{{{ \mathcal O }}_{j}}\parallel \right)-\prod _{j=1}^{m}\parallel {{ \mathcal O }}_{j}\parallel \right]\\ & = & 
{N}_{Q}{N}_{\rho }\prod _{j=m+2}^{N}\parallel {{ \mathcal O }}_{j}\parallel \\ & & \times \left[\prod _{j=1}^{m+1}\left(\parallel {{ \mathcal O }}_{j}\parallel +\parallel {\delta }_{{{ \mathcal O }}_{j}}\parallel \right)-\prod _{j=1}^{m+1}\parallel {{ \mathcal O }}_{j}\parallel \right],\end{array}\end{eqnarray}$i.e. inequality (B2) holds for $n=m+1$.
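
Inequality (B2) can also be checked numerically for every n. The sketch below assumes real matrices, the Euclidean and spectral norms, input vectors that lie in the range of Π_in, and operations that leave this subspace only approximately invariant. The leakage strength and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
D, d, N = 6, 3, 5

B, _ = np.linalg.qr(rng.normal(size=(D, d)))
Pi_in = B @ B.T  # projector on the approximately invariant subspace

def nearly_invariant_operation(leak=0.05):
    """Operation that maps the subspace to itself up to a small leakage term."""
    inside = B @ (rng.normal(size=(d, d)) / np.sqrt(d)) @ B.T
    return inside + leak * rng.normal(size=(D, D))

ops = [nearly_invariant_operation() for _ in range(N)]  # ops[j - 1] is O_j
M_in = B @ rng.normal(size=(d, 4))                      # Pi_in M_in = M_in
M_out = rng.normal(size=(4, D))

op_norms = [np.linalg.norm(O, 2) for O in ops]
delta_norms = [np.linalg.norm(Pi_in @ O @ Pi_in - O @ Pi_in, 2) for O in ops]
N_rho = np.linalg.norm(M_in, axis=0).max()
N_Q = np.linalg.norm(M_out, axis=1).max()

def bar_O(m):
    """Product O_N ... O_{m+1} O_m; the identity for m = N + 1."""
    out = np.eye(D)
    for O in ops[m - 1:]:
        out = O @ out
    return out

def bar_P(n):
    """Projected product Pi O_n Pi ... Pi O_1 Pi; equal to Pi for n = 0."""
    out = Pi_in
    for O in ops[:n]:
        out = Pi_in @ O @ out
    return out

def lhs(n):
    exact = M_out @ bar_O(1) @ M_in
    truncated = M_out @ bar_O(n + 1) @ bar_P(n) @ M_in
    return np.max(np.abs(exact - truncated))

def rhs(n):
    tail = np.prod(op_norms[n:])
    head = (np.prod([o + dl for o, dl in zip(op_norms[:n], delta_norms[:n])])
            - np.prod(op_norms[:n]))
    return N_Q * N_rho * tail * head
```

Evaluating `lhs(n)` and `rhs(n)` for n = 0, ..., N shows how the truncation error grows as more operations are replaced by their projected versions, and that it vanishes when every δ is zero.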

Appendix C. Linear inversion method

$\{\left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\}$ are columns of ${M}_{\mathrm{in}}^{{\rm{a}}}$, and $\{\left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|\}$ are rows of ${M}_{\mathrm{out}}^{{\rm{a}}}$. Let ${N}_{Q}^{{\rm{a}}}\geqslant \parallel \left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|\parallel $ for all k, and ${N}_{\rho }^{{\rm{a}}}\geqslant \parallel \left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\parallel $ for all i. If g, ${M}_{\mathrm{in}}^{{\rm{a}}}$ and ${M}_{\mathrm{out}}^{{\rm{a}}}$ are invertible, then for any sequence of operations in $\{{ \mathcal O }(\chi )\}$,$\begin{eqnarray}\begin{array}{rcl} & & \parallel \widetilde{{ \mathcal O }}({\chi }_{N}){g}^{-1}\cdots \widetilde{{ \mathcal O }}({\chi }_{2}){g}^{-1}\widetilde{{ \mathcal O }}({\chi }_{1})\\ & & {-{M}_{\mathrm{out}}^{{\rm{a}}}{{ \mathcal O }}^{{\rm{a}}}({\chi }_{N})\cdots {{ \mathcal O }}^{{\rm{a}}}({\chi }_{2}){{ \mathcal O }}^{{\rm{a}}}({\chi }_{1}){M}_{\mathrm{in}}^{{\rm{a}}}\parallel }_{\max }\\ & \leqslant & {N}_{Q}^{{\rm{a}}}{N}_{\rho }^{{\rm{a}}}\left[{\left(1+\parallel {\delta }_{g}\parallel \right)}^{N-1}\prod _{j=1}^{N}\left(\parallel {{ \mathcal O }}^{{\rm{a}}}({\chi }_{j})\parallel +\parallel {\delta }_{{\chi }_{j}}\parallel \right)\right.\\ & & -\left.\prod _{j=1}^{N}\parallel {{ \mathcal O }}^{{\rm{a}}}({\chi }_{j})\parallel \right],\end{array}\end{eqnarray}$where$\begin{eqnarray}{\delta }_{g}={M}_{\mathrm{in}}^{{\rm{a}}}{g}^{-1}{M}_{\mathrm{out}}^{{\rm{a}}}-{\mathbb{1}},\end{eqnarray}$$\begin{eqnarray}{\delta }_{\chi }={\left({M}_{\mathrm{out}}^{{\rm{a}}}\right)}^{-1}\widetilde{{ \mathcal O }}(\chi ){\left({M}_{\mathrm{in}}^{{\rm{a}}}\right)}^{-1}-{{ \mathcal O }}^{{\rm{a}}}(\chi ).\end{eqnarray}$

We have$\begin{eqnarray}{g}^{-1}={\left({M}_{\mathrm{in}}^{{\rm{a}}}\right)}^{-1}({\mathbb{1}}+{\delta }_{g}){\left({M}_{\mathrm{out}}^{{\rm{a}}}\right)}^{-1},\end{eqnarray}$$\begin{eqnarray}\widetilde{{ \mathcal O }}(\chi )={M}_{\mathrm{out}}^{{\rm{a}}}[{{ \mathcal O }}^{{\rm{a}}}(\chi )+{\delta }_{\chi }]{M}_{\mathrm{in}}^{{\rm{a}}}.\end{eqnarray}$Then,$\begin{eqnarray}\begin{array}{rcl} & & \widetilde{{ \mathcal O }}({\chi }_{N}){g}^{-1}\cdots \widetilde{{ \mathcal O }}({\chi }_{2}){g}^{-1}\widetilde{{ \mathcal O }}({\chi }_{1})\\ & = & {M}_{\mathrm{out}}^{{\rm{a}}}[{{ \mathcal O }}^{{\rm{a}}}({\chi }_{N})+{\delta }_{{\chi }_{N}}]({\mathbb{1}}+{\delta }_{g})\cdots \\ & & \times [{{ \mathcal O }}^{{\rm{a}}}({\chi }_{2})+{\delta }_{{\chi }_{2}}]({\mathbb{1}}+{\delta }_{g})[{{ \mathcal O }}^{{\rm{a}}}({\chi }_{1})+{\delta }_{{\chi }_{1}}]{M}_{\mathrm{in}}^{{\rm{a}}}.\end{array}\end{eqnarray}$Therefore, inequality (C1) holds.
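
Inequality (C1) can be checked numerically in the same spirit. In the sketch below the "measured" matrices are generated by adding small random perturbations to an arbitrary approximate model. This perturbation model, the Euclidean and spectral norms, and all names are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, N = 4, 5

# An arbitrary approximate model with invertible M_in^a and M_out^a.
M_in_a = np.eye(d) + 0.1 * rng.normal(size=(d, d))
M_out_a = np.eye(d) + 0.1 * rng.normal(size=(d, d))
ops_a = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(N)]

# "Measured" matrices: the model prediction plus a small deviation.
g = M_out_a @ M_in_a + 1e-2 * rng.normal(size=(d, d))
tilde_ops = [M_out_a @ O @ M_in_a + 1e-2 * rng.normal(size=(d, d)) for O in ops_a]

g_inv = np.linalg.inv(g)
M_in_inv = np.linalg.inv(M_in_a)
M_out_inv = np.linalg.inv(M_out_a)

# Deviations delta_g and delta_chi as defined in equations (C2) and (C3).
delta_g = M_in_a @ g_inv @ M_out_a - np.eye(d)
deltas = [M_out_inv @ Ot @ M_in_inv - O for Ot, O in zip(tilde_ops, ops_a)]

# Linear-inversion estimate O~_N g^-1 ... g^-1 O~_1.
lim_estimate = tilde_ops[0]
for Ot in tilde_ops[1:]:
    lim_estimate = Ot @ g_inv @ lim_estimate

# Model prediction M_out^a O^a_N ... O^a_1 M_in^a.
model = M_in_a
for O in ops_a:
    model = O @ model
model = M_out_a @ model

def spec(A):
    """Spectral norm, induced by the Euclidean vector norm."""
    return np.linalg.norm(A, 2)

N_rho_a = np.linalg.norm(M_in_a, axis=0).max()
N_Q_a = np.linalg.norm(M_out_a, axis=1).max()

lhs = np.max(np.abs(lim_estimate - model))
rhs = N_Q_a * N_rho_a * (
    (1 + spec(delta_g)) ** (N - 1)
    * np.prod([spec(O) + spec(dl) for O, dl in zip(ops_a, deltas)])
    - np.prod([spec(O) for O in ops_a])
)
```

When the perturbations are set to zero, `delta_g` and all `deltas` vanish and both sides of the inequality are zero, which recovers the exact linear inversion result.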

We now apply theorem 4 to the approximate model given by the approximate invariant subspace Πin. Let $\{\left|l\right.\unicode{x027EB}\,| \,l=1,2,\ldots ,d\}$ be an orthonormal basis of the subspace Πin, i.e. $\unicode{x027EA}{l}_{1}| {l}_{2}\unicode{x027EB}={\delta }_{{l}_{1},{l}_{2}}$ and ${{\rm{\Pi }}}_{\mathrm{in}}={\sum }_{l=1}^{d}\left|l\right.\unicode{x027EB}\left.\unicode{x027EA}l\right|$, and $\{\left|{l}^{{\rm{a}}}\right.\unicode{x027EB}\,| \,l=1,2,\ldots ,d\}$ be an orthonormal basis of the approximate-model space, i.e. $\unicode{x027EA}{l}_{1}^{{\rm{a}}}| {l}_{2}^{{\rm{a}}}\unicode{x027EB}={\delta }_{{l}_{1},{l}_{2}}$ and ${\mathbb{1}}={\sum }_{l=1}^{d}\left|{l}^{{\rm{a}}}\right.\unicode{x027EB}\left.\unicode{x027EA}{l}^{{\rm{a}}}\right|$. Then, $T\equiv {\sum }_{l=1}^{d}\left|{l}^{{\rm{a}}}\right.\unicode{x027EB}\left.\unicode{x027EA}l\right|$ is the transformation from the actual space to the approximate-model space, and ${T}^{+}={\sum }_{l=1}^{d}\left|l\right.\unicode{x027EB}\left.\unicode{x027EA}{l}^{{\rm{a}}}\right|$ is the inverse transformation. We have TT+ = 1 and T+T = Πin. The approximate model is given by$\begin{eqnarray}{M}_{\mathrm{in}}^{{\rm{a}}}={{TM}}_{\mathrm{in}},\end{eqnarray}$$\begin{eqnarray}{M}_{\mathrm{out}}^{{\rm{a}}}={M}_{\mathrm{out}}{T}^{+},\end{eqnarray}$$\begin{eqnarray}{{ \mathcal O }}^{{\rm{a}}}(\chi )=T{ \mathcal O }(\chi ){T}^{+}.\end{eqnarray}$
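
The maps T and T+ have a simple matrix form when the approximate-model space is identified with ℝ^d in its standard basis. The sketch below assumes real vectors and a randomly chosen subspace; the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
D, d = 6, 3

# Columns of B: an orthonormal basis {|l>>} of the subspace with projector Pi_in.
B, _ = np.linalg.qr(rng.normal(size=(D, d)))
Pi_in = B @ B.T

# With |l^a>> the standard basis of R^d, T = sum_l |l^a>><<l| is simply B^T.
T = B.T
T_plus = B

# Approximate model of an arbitrary operation O: O^a = T O T^+.
O = rng.normal(size=(D, D))
O_a = T @ O @ T_plus
```

In this representation T+ coincides with the Moore-Penrose pseudo inverse of T, and T+ Oa T equals Π_in O Π_in, the relation used in the remainder of this appendix.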

We take the vector norm in the approximate-model space $\parallel \left.\unicode{x027EA}{A}^{{\rm{a}}}\right|\parallel =\parallel \left.\unicode{x027EA}{A}^{{\rm{a}}}\right|T\parallel $ and $\parallel \left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel =\parallel {T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel $. Then, by the Cauchy-Schwarz inequality in the actual space, $\left|\unicode{x027EA}{A}^{{\rm{a}}}| {B}^{{\rm{a}}}\unicode{x027EB}\right|\leqslant \parallel \left.\unicode{x027EA}{A}^{{\rm{a}}}\right|\parallel \parallel \left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel $ is satisfied. We have$\begin{eqnarray}\parallel {{ \mathcal O }}^{{\rm{a}}}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel =\parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel .\end{eqnarray}$Because$\begin{eqnarray}\begin{array}{rcl}\parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel & \leqslant & \parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T\parallel \parallel {T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & = & \parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T\parallel \parallel \left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel ,\end{array}\end{eqnarray}$we have$\begin{eqnarray}\parallel {{ \mathcal O }}^{{\rm{a}}}\parallel \leqslant \parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T\parallel .\end{eqnarray}$Let ${ \mathcal O }$ be an operation satisfying ${{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}={T}^{+}{{ \mathcal O }}^{{\rm{a}}}T$. Then we have$\begin{eqnarray}\begin{array}{rcl} & & \parallel {T}^{+}{{ \mathcal O }}^{{\rm{a}}}T{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & = & \parallel {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & \leqslant & \parallel { \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel +\parallel {\delta }_{{ \mathcal O }}{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & = & \parallel { \mathcal O }{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel +\parallel {\delta }_{{ \mathcal O }}{T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & \leqslant & (\parallel { \mathcal O }\parallel +\parallel {\delta }_{{ \mathcal O }}\parallel )\parallel {T}^{+}\left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \\ & = & (\parallel { \mathcal O }\parallel +\parallel {\delta }_{{ \mathcal O }}\parallel )\parallel \left|{B}^{{\rm{a}}}\right.\unicode{x027EB}\parallel ,\end{array}\end{eqnarray}$where$\begin{eqnarray}{\delta }_{{ \mathcal O }}={{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}-{ \mathcal O }{{\rm{\Pi }}}_{\mathrm{in}}.\end{eqnarray}$Therefore, $\parallel {{ \mathcal O }}^{{\rm{a}}}\parallel \leqslant \parallel { \mathcal O }\parallel +\parallel {\delta }_{{ \mathcal O }}\parallel $. Because ${T}^{+}{{ \mathcal O }}^{{\rm{a}}}(\chi )T={{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}$, we have $\parallel {{ \mathcal O }}^{{\rm{a}}}(\chi )\parallel \leqslant \parallel { \mathcal O }(\chi )\parallel +\parallel {\delta }_{{ \mathcal O }(\chi )}\parallel $.

We have$\begin{eqnarray}\begin{array}{rcl}{\delta }_{g} & = & T{M}_{\mathrm{in}}{g}^{-1}{M}_{\mathrm{out}}{T}^{+}-{\mathbb{1}}\\ & = & T({M}_{\mathrm{in}}{g}^{-1}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}-{{\rm{\Pi }}}_{\mathrm{in}}){T}^{+}.\end{array}\end{eqnarray}$Because $g$ is invertible, ${M}_{\mathrm{in}}$ and ${M}_{\mathrm{out}}$ are full rank. Thus, ${M}_{\mathrm{in}}^{+}{M}_{\mathrm{in}}={\mathbb{1}}$, ${M}_{\mathrm{in}}{M}_{\mathrm{in}}^{+}={{\rm{\Pi }}}_{\mathrm{in}}$, ${M}_{\mathrm{out}}{M}_{\mathrm{out}}^{+}={\mathbb{1}}$ and ${M}_{\mathrm{out}}^{+}{M}_{\mathrm{out}}={{\rm{\Pi }}}_{\mathrm{out}}$. Then,$\begin{eqnarray}\begin{array}{rcl} & & {M}_{\mathrm{in}}{g}^{-1}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}={M}_{\mathrm{in}}{g}^{-1}{M}_{\mathrm{out}}{M}_{\mathrm{in}}{M}_{\mathrm{in}}^{+}\\ & = & {M}_{\mathrm{in}}{g}^{-1}g{M}_{\mathrm{in}}^{+}={M}_{\mathrm{in}}{M}_{\mathrm{in}}^{+}={{\rm{\Pi }}}_{\mathrm{in}}.\end{array}\end{eqnarray}$Therefore, ${\delta }_{g}=0$.

Because $g={M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{M}_{\mathrm{in}}$ is invertible, ${M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}$ is full rank. Thus, ${M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}={\mathbb{1}}$ and ${\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}={{\rm{\Pi }}}_{\mathrm{in}}$. Then, we have ${\left({M}_{\mathrm{out}}{T}^{+}\right)}^{-1}=T{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}$ and ${\left(T{M}_{\mathrm{in}}\right)}^{-1}={M}_{\mathrm{in}}^{+}{T}^{+}$. Therefore,$\begin{eqnarray}\begin{array}{rcl}{\delta }_{\chi } & = & {\left({M}_{\mathrm{out}}{T}^{+}\right)}^{-1}\widetilde{{ \mathcal O }}(\chi ){\left(T{M}_{\mathrm{in}}\right)}^{-1}-T{ \mathcal O }(\chi ){T}^{+}\\ & = & T\left[{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}\widetilde{{ \mathcal O }}(\chi ){M}_{\mathrm{in}}^{+}-{{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}\right]{T}^{+}.\end{array}\end{eqnarray}$We have$\begin{eqnarray}\begin{array}{rcl} & & {\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}\widetilde{{ \mathcal O }}(\chi ){M}_{\mathrm{in}}^{+}\\ & = & {\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{ \mathcal O }(\chi ){M}_{\mathrm{in}}{M}_{\mathrm{in}}^{+}\\ & = & {\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}\\ & = & {\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}-{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{\delta }_{{ \mathcal O }(\chi )}\\ & = & {{\rm{\Pi }}}_{\mathrm{in}}{ \mathcal O }(\chi ){{\rm{\Pi }}}_{\mathrm{in}}-{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{\delta }_{{ \mathcal O }(\chi )}.\end{array}\end{eqnarray}$Then,$\begin{eqnarray}{\delta }_{\chi }=-T{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{\delta }_{{ \mathcal O }(\chi )}{T}^{+}.\end{eqnarray}$

Let $G={{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}$. We have$\begin{eqnarray}{{\rm{\Pi }}}_{\mathrm{in}}G=G,\end{eqnarray}$$\begin{eqnarray}G{{\rm{\Pi }}}_{\mathrm{out}}=G,\end{eqnarray}$$\begin{eqnarray}G{{\rm{\Pi }}}_{\mathrm{in}}={{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}={{\rm{\Pi }}}_{\mathrm{in}},\end{eqnarray}$$\begin{eqnarray}\begin{array}{rcl}{{\rm{\Pi }}}_{\mathrm{out}}G & = & {{\rm{\Pi }}}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}\\ & = & {M}_{\mathrm{out}}^{+}{M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}\\ & = & {M}_{\mathrm{out}}^{+}{M}_{\mathrm{out}}={{\rm{\Pi }}}_{\mathrm{out}}.\end{array}\end{eqnarray}$Then,$\begin{eqnarray}({\mathbb{1}}-{{\rm{\Pi }}}_{\mathrm{in}}+{{\rm{\Pi }}}_{\mathrm{out}})G={{\rm{\Pi }}}_{\mathrm{out}}.\end{eqnarray}$We define ${\delta }_{{\rm{\Pi }}}\equiv {{\rm{\Pi }}}_{\mathrm{in}}-{{\rm{\Pi }}}_{\mathrm{out}}$. If ${\mathbb{1}}-{\delta }_{{\rm{\Pi }}}$ is invertible, we have$\begin{eqnarray}G={\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{{\rm{\Pi }}}_{\mathrm{out}}.\end{eqnarray}$Then,$\begin{eqnarray}\begin{array}{rcl}{T}^{+}{\delta }_{\chi }T & = & -{{\rm{\Pi }}}_{\mathrm{in}}{\left({M}_{\mathrm{out}}{{\rm{\Pi }}}_{\mathrm{in}}\right)}^{+}{M}_{\mathrm{out}}{\delta }_{{ \mathcal O }(\chi )}{{\rm{\Pi }}}_{\mathrm{in}}\\ & = & -G{\delta }_{{ \mathcal O }(\chi )}\\ & = & -{\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{{\rm{\Pi }}}_{\mathrm{out}}{\delta }_{{ \mathcal O }(\chi )}\\ & = & {\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{\delta }_{{\rm{\Pi }}}{\delta }_{{ \mathcal O }(\chi )}\\ & & -{\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{{\rm{\Pi }}}_{\mathrm{in}}{\delta }_{{ \mathcal O }(\chi )}\\ & = & {\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{\delta }_{{\rm{\Pi }}}{\delta }_{{ \mathcal O }(\chi )}.\end{array}\end{eqnarray}$Therefore,$\begin{eqnarray}\begin{array}{rcl}\parallel {\delta }_{\chi }\parallel & \leqslant & \parallel {T}^{+}{\delta }_{\chi }T\parallel =\parallel {\left({\mathbb{1}}-{\delta }_{{\rm{\Pi }}}\right)}^{-1}{\delta }_{{\rm{\Pi }}}{\delta }_{{ \mathcal O }(\chi )}\parallel \\ & \leqslant & {\left(1-\parallel {\delta }_{{\rm{\Pi }}}\parallel \right)}^{-1}\parallel {\delta }_{{\rm{\Pi }}}\parallel \parallel {\delta }_{{ \mathcal O }(\chi )}\parallel .\end{array}\end{eqnarray}$
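The projector identities for G can be checked numerically. The following is a minimal sketch, not part of the original protocol: the dimensions, the random matrices, the perturbation strength `eps` and the function name `projector_identities` are all illustrative assumptions. It takes the columns of `M_in` as trial-state vectors and the rows of `M_out` as observable vectors, chosen so that the two projections are close to each other.

```python
import numpy as np

def projector_identities(D=6, d=3, eps=0.05, seed=0):
    """Build G for random full-rank M_in, M_out (illustrative choice)."""
    rng = np.random.default_rng(seed)
    M_in = rng.normal(size=(D, d))                   # columns: state vectors
    M_out = M_in.T + eps * rng.normal(size=(d, D))   # rows: observable vectors
    Pi_in = M_in @ np.linalg.pinv(M_in)              # projection M_in M_in^+
    Pi_out = np.linalg.pinv(M_out) @ M_out           # projection M_out^+ M_out
    # G = Pi_in (M_out Pi_in)^+ M_out
    G = Pi_in @ np.linalg.pinv(M_out @ Pi_in) @ M_out
    delta_Pi = Pi_in - Pi_out
    # Closed form G = (1 - delta_Pi)^{-1} Pi_out, assuming invertibility
    G_closed = np.linalg.solve(np.eye(D) - delta_Pi, Pi_out)
    return Pi_in, Pi_out, G, G_closed

Pi_in, Pi_out, G, G_closed = projector_identities()
print(np.allclose(Pi_in @ G, G), np.allclose(G @ Pi_out, G))
print(np.allclose(G @ Pi_in, Pi_in), np.allclose(Pi_out @ G, Pi_out))
print(np.allclose(G, G_closed))
```

Choosing `M_out` close to the transpose of `M_in` keeps the difference of the two projections small, which is the regime in which the closed form is used in the bound above.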

Using inequality (C1), we have$\begin{eqnarray}\begin{array}{l}\parallel \widetilde{{ \mathcal O }}({\chi }_{N}){g}^{-1}\cdots \widetilde{{ \mathcal O }}({\chi }_{2}){g}^{-1}\widetilde{{ \mathcal O }}({\chi }_{1})\\ -{M}_{\mathrm{out}}^{{\rm{a}}}{{ \mathcal O }}^{{\rm{a}}}({\chi }_{N})\cdots {{ \mathcal O }}^{{\rm{a}}}({\chi }_{2}){{ \mathcal O }}^{{\rm{a}}}({\chi }_{1}){M}_{\mathrm{in}}^{{\rm{a}}}{\parallel }_{\max }\\ \leqslant {N}_{Q}^{\prime }{N}_{\rho }^{\prime }\left\{\prod _{j=1}^{N}\left[\parallel { \mathcal O }({\chi }_{j})\parallel +\parallel {\delta }_{{ \mathcal O }({\chi }_{j})}\parallel \right.\right.\\ \left.+{\left(1-\parallel {\delta }_{{\rm{\Pi }}}\parallel \right)}^{-1}\parallel {\delta }_{{\rm{\Pi }}}\parallel \parallel {\delta }_{{ \mathcal O }({\chi }_{j})}\parallel \right]\\ \left.-\prod _{j=1}^{N}\left(\parallel { \mathcal O }({\chi }_{j})\parallel +\parallel {\delta }_{{ \mathcal O }({\chi }_{j})}\parallel \right)\right\},\end{array}\end{eqnarray}$where$\begin{eqnarray}\begin{array}{rcl}{N}_{Q}^{\prime } & = & \max \left\{\parallel \left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|\parallel \right\}=\max \left\{\parallel \left.\unicode{x027EA}{Q}_{k}^{{\rm{a}}}\right|T\parallel \right\}\\ & = & \max \left\{\parallel \left.\unicode{x027EA}{Q}_{k}\right|{{\rm{\Pi }}}_{\mathrm{in}}\parallel \right\}\\ & \leqslant & \max \left\{\parallel \left.\unicode{x027EA}{Q}_{k}\right|{{\rm{\Pi }}}_{\mathrm{out}}\parallel +\parallel \left.\unicode{x027EA}{Q}_{k}\right|{\delta }_{{\rm{\Pi }}}\parallel \right\}\\ & \leqslant & \max \left\{\parallel \left.\unicode{x027EA}{Q}_{k}\right|\parallel (1+\parallel {\delta }_{{\rm{\Pi }}}\parallel )\right\},\end{array}\end{eqnarray}$$\begin{eqnarray}\begin{array}{rcl}{N}_{\rho }^{\prime } & = & \max \left\{\parallel \left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \right\}=\max \left\{\parallel {T}^{+}\left|{\rho }_{i}^{{\rm{a}}}\right.\unicode{x027EB}\parallel \right\}\\ & = & \max \left\{\parallel {{\rm{\Pi }}}_{\mathrm{in}}\left|{\rho }_{i}\right.\unicode{x027EB}\parallel \right\}=\max \left\{\parallel \left|{\rho }_{i}\right.\unicode{x027EB}\parallel \right\}.\end{array}\end{eqnarray}$

Appendix D. Vector space dimensions

If Hilbert spaces of the system and environment are respectively dS-dimensional and dE-dimensional, the Hilbert space of SE is (dSdE)-dimensional. Then, a column vector $\left|\rho \right.\unicode{x027EB}$ representing the state of SE is $({d}_{{\rm{S}}}^{2}{d}_{{\rm{E}}}^{2})$-dimensional. We remark that dE = m.

For the classical random variable noise, the state is in the form $\rho ={\sum }_{\lambda }p(\lambda ){\rho }_{{\rm{S}}}(\lambda )\otimes \left|\lambda \right\rangle {\left\langle \lambda \right|}_{{\rm{E}}}$, i.e. the state of the environment (in the reduced density matrix form) only has diagonal elements. Therefore, we can use a $({d}_{{\rm{S}}}^{2}{d}_{{\rm{E}}})$-dimensional vector to represent the state, i.e. take $\left|\rho \right.\unicode{x027EB}={\sum }_{\lambda }p(\lambda ){\left|{\rho }_{{\rm{S}}}(\lambda )\right.\unicode{x027EB}}_{{\rm{S}}}\otimes {\left|\lambda \right.\unicode{x027EB}}_{{\rm{E}}}$, where $\{{\left|{\rho }_{{\rm{S}}}(\lambda )\right.\unicode{x027EB}}_{{\rm{S}}}\}$ are column vectors representing states of the system, and $\{{\left|\lambda \right.\unicode{x027EB}}_{{\rm{E}}}\}$ are column vectors representing states of the environment.

The state of the system can always be expressed in the form ${\rho }_{{\rm{S}}}={d}_{{\rm{S}}}^{-1}{{\mathbb{1}}}_{{\rm{S}}}+{\rho }_{{\rm{S}}}^{\prime} $, where $\mathrm{Tr}({\rho }_{{\rm{S}}}^{\prime} )=0$. Then ${\left|{\rho }_{{\rm{S}}}\right.\unicode{x027EB}}_{{\rm{S}}}={\left|{\mathbb{1}}\right.\unicode{x027EB}}_{{\rm{S}}}+{\left|{\rho }_{{\rm{S}}}^{\prime} \right.\unicode{x027EB}}_{{\rm{S}}}$, where ${\left|{\mathbb{1}}\right.\unicode{x027EB}}_{{\rm{S}}}$ represents the maximally mixed state ${d}_{{\rm{S}}}^{-1}{{\mathbb{1}}}_{{\rm{S}}}$, and ${\left|{\mathbb{1}}\right.\unicode{x027EB}}_{{\rm{S}}}$ and ${\left|{\rho }_{{\rm{S}}}^{\prime} \right.\unicode{x027EB}}_{{\rm{S}}}$ are orthogonal. We focus on the $d_{\rm{E}}$-dimensional subspace $\mathrm{span}(\{{\left|{\mathbb{1}}\right.\unicode{x027EB}}_{{\rm{S}}}\otimes {\left|\lambda \right.\unicode{x027EB}}_{{\rm{E}}}\})$. The orthogonal projection on this subspace is ${P}_{{\mathbb{1}}}=\left|{\mathbb{1}}\right.\unicode{x027EB}{\left.\unicode{x027EA}{\mathbb{1}}\right|}_{{\rm{S}}}\otimes {\sum }_{\lambda }\left|\lambda \right.\unicode{x027EB}{\left.\unicode{x027EA}\lambda \right|}_{{\rm{E}}}$. Then, ${P}_{{\mathbb{1}}}\left|\rho \right.\unicode{x027EB}={\sum }_{\lambda }p(\lambda ){\left|{\mathbb{1}}\right.\unicode{x027EB}}_{{\rm{S}}}\otimes {\left|\lambda \right.\unicode{x027EB}}_{{\rm{E}}}$. If the distribution of λ is stationary, i.e. {p(λ)} are invariant under operations, we have ${P}_{{\mathbb{1}}}{ \mathcal O }\left|\rho \right.\unicode{x027EB}={P}_{{\mathbb{1}}}\left|\rho \right.\unicode{x027EB}$ for any operation ${ \mathcal O }$ that does not change the distribution. Therefore, if the distribution of λ is stationary, ${P}_{{\mathbb{1}}}\left|\rho \right.\unicode{x027EB}$ is the only non-trivial vector in the range of ${P}_{{\mathbb{1}}}$ that contributes to the state, and $\left|\rho \right.\unicode{x027EB}$ is effectively $[({d}_{{\rm{S}}}^{2}-1){d}_{{\rm{E}}}+1]$-dimensional.
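The three dimension counts in this appendix can be tabulated with a short helper. This is only an illustrative sketch; the function name `vector_space_dimensions` is our own and does not appear in the paper.

```python
def vector_space_dimensions(d_S, d_E):
    """Return the three dimension counts discussed in Appendix D."""
    full = d_S**2 * d_E**2             # general state of system + environment
    classical = d_S**2 * d_E           # environment state is diagonal
    stationary = (d_S**2 - 1) * d_E + 1  # stationary distribution of lambda
    return full, classical, stationary

# One qubit (d_S = 2) with a two-valued classical noise variable (d_E = 2)
print(vector_space_dimensions(2, 2))
# One qubit without any environment degree of freedom (d_E = 1)
print(vector_space_dimensions(2, 1))
```

For a single qubit the stationary count is 4 when d_E = 1 and 7 when d_E = 2; these happen to be the same numbers as the values of d quoted in Appendix E, although that correspondence is our reading rather than a statement made in the text.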

Appendix E. Details of the numerical simulation


To simulate the behavior of the actual quantum computer, we use the Gaussian cubature approximation to match up to the 9th-order moment, by taking ${{\rm{e}}}^{-{\lambda }^{2}}$ instead of λ as the random variable. Reducing the precision of the approximation and only matching up to the 7th-order moment, we find that the difference is negligible.


Trial states and observables are generated as follows. We select a set of gate sequences $({G}_{1},{G}_{2},\ldots ,{G}_{N})$, where each ${G}_{j}$ is either H or S. Each gate sequence corresponds to a state ${G}_{N}\cdots {G}_{2}{G}_{1}\left|0\right\rangle $ and an observable ${\left({G}_{1}{G}_{2}\cdots {G}_{N}\right)}^{\dagger }Z({G}_{1}{G}_{2}\cdots {G}_{N})$. These states and observables are realised using H and S gates accordingly.

When d = 4, four gate sequences are used in the simulation: (null), (H), (H, S) and (H, S, H). The four states are $\left|0\right\rangle $, $\left|+\right\rangle =H\left|0\right\rangle $, $\left|y+\right\rangle ={SH}\left|0\right\rangle $ and $\left|y-\right\rangle ={HSH}\left|0\right\rangle $. The four observables are $Z$, $X={HZH}$, $-Y={S}^{\dagger }{HZHS}$ and $Y=H{S}^{\dagger }{HZHSH}$.
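The four states and observables for d = 4 can be reproduced from the general rule, i.e. the state is the sequence applied to the computational-basis state 0 and the observable is the conjugation of Z by the product of the gates. The sketch below uses the standard matrix forms of H, S and the Pauli operators; the helper names `trial_state`, `trial_observable` and `bloch_vector` are our own.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
GATES = {"H": H, "S": S}

def trial_state(seq):
    """Return G_N ... G_2 G_1 |0> for seq = (G_1, ..., G_N)."""
    psi = np.array([1, 0], dtype=complex)
    for name in seq:                 # G_1 acts first
        psi = GATES[name] @ psi
    return psi

def trial_observable(seq):
    """Return (G_1 G_2 ... G_N)^dagger Z (G_1 G_2 ... G_N)."""
    U = np.eye(2, dtype=complex)
    for name in seq:                 # builds the product G_1 G_2 ... G_N
        U = U @ GATES[name]
    return U.conj().T @ Z @ U

def bloch_vector(psi):
    """Expectation values of (X, Y, Z) in the pure state psi."""
    return np.real([psi.conj() @ P @ psi for P in (X, Y, Z)])

for seq in [(), ("H",), ("H", "S"), ("H", "S", "H")]:
    print(seq, np.round(bloch_vector(trial_state(seq)), 6))
```

Comparing `trial_observable(seq)` with `Z`, `X`, `-Y` and `Y` for the four sequences gives agreement up to numerical precision.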

When d = 7, 123 gate sequences are used in the simulation, including all 63 sequences with the gate number N ≤ 5 and four sequences for each gate number 5 < N ≤ 20.
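The count of 123 follows from 1 + 2 + 4 + 8 + 16 + 32 = 63 sequences with N ≤ 5 plus 4 × 15 = 60 longer ones. The sketch below builds such a set. The text does not say how the four sequences per length were chosen, so the seeded random choice here is purely an assumption for illustration, as are the function names.

```python
from itertools import product
import random

GATE_NAMES = ("H", "S")

def all_sequences(max_len):
    """All gate sequences over {H, S} with length 0..max_len."""
    return [seq for n in range(max_len + 1)
            for seq in product(GATE_NAMES, repeat=n)]

def build_sequence_set(short_max=5, long_max=20, per_length=4, seed=0):
    """Exhaustive short sequences plus a few sequences per longer length.

    The random selection of the longer sequences is an assumption;
    the source only states that four are used for each length.
    """
    rng = random.Random(seed)
    seqs = all_sequences(short_max)
    for n in range(short_max + 1, long_max + 1):
        chosen = set()
        while len(chosen) < per_length:   # ensure distinct sequences
            chosen.add(tuple(rng.choice(GATE_NAMES) for _ in range(n)))
        seqs.extend(sorted(chosen))
    return seqs

print(len(all_sequences(5)), len(build_sequence_set()))
```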


Circuits for generating data {Cm} used in MLE are the same as circuits used in LIM simulation.


The seven-dimensional Pauli transfer matrices of ideal gates are$\begin{eqnarray}{M}_{{ \mathcal O }(H)}^{\mathrm{ideal}}=\left(\begin{array}{ccccccc}1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & -1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0\end{array}\right),\end{eqnarray}$and$\begin{eqnarray}{M}_{{ \mathcal O }(S)}^{\mathrm{ideal}}=\left(\begin{array}{ccccccc}1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & -1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1\end{array}\right).\end{eqnarray}$In LIM, we take ${\widehat{M}}_{\mathrm{in}}={\widehat{M}}_{\mathrm{out}}^{-1}g$, and ${\widehat{M}}_{\mathrm{out}}$ is chosen to minimize the difference between ${M}_{{ \mathcal O }}$ and ${M}_{{ \mathcal O }}^{\mathrm{ideal}}$.
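A few properties of the two ideal matrices can be checked directly: both are orthogonal, the H matrix squares to the identity and the S matrix has order four, as expected for H and S gates. The sketch below copies the entries from the equations above. Reading the basis as one identity component followed by two copies of a three-component Pauli block is our interpretation and is not stated explicitly in the text.

```python
import numpy as np

M_H = np.array([
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, -1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, -1, 0],
    [0, 0, 0, 0, 1, 0, 0],
])
M_S = np.array([
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, -1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
])

def is_orthogonal(M):
    """True if M M^T equals the identity."""
    return np.array_equal(M @ M.T, np.eye(M.shape[0], dtype=M.dtype))

def has_repeated_block(M):
    """True if M = 1 (+) R (+) R with a single 3x3 block R (assumed layout)."""
    R = M[1:4, 1:4]
    expected = np.zeros_like(M)
    expected[0, 0] = 1
    expected[1:4, 1:4] = R
    expected[4:7, 4:7] = R
    return np.array_equal(M, expected)

identity = np.eye(7, dtype=int)
print(is_orthogonal(M_H), is_orthogonal(M_S))
print(np.array_equal(M_H @ M_H, identity))
print(np.array_equal(np.linalg.matrix_power(M_S, 4), identity))
print(has_repeated_block(M_H), has_repeated_block(M_S))
```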

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant No. 2016YFA0301200) and the National Basic Research Program of China (Grant No. 2014CB921403). It was also supported by the Science Challenge Project (Grant No. TZ2017003) and the National Natural Science Foundation of China (Grants No. 11774024, No. 11534002, and No. U1530401). YL is supported by the National Natural Science Foundation of China (Grants No. 11875050 and No. 12088101) and NSAF (Grant No. U1930403).


