Estimation of signal parameters via rotational invariance techniques
Estimation of signal parameters via rotational invariance techniques (ESPRIT) is a technique to determine the parameters of a mixture of sinusoids in background noise. The technique was first proposed for frequency estimation;[1] however, with the introduction of phased-array systems in everyday technology, it is also used for angle-of-arrival estimation.[2]
General description
System model
The model under investigation is the following (1-D version):<math display="block">y_m[t] = \sum_{k=1}^K a_{m,k} x_k[t] + n_m[t]</math>
This model describes a system that is fed with <math display="inline">K </math> input signals <math display="inline">x_k[t] \in \mathbb C </math>, with <math display="inline">k = 1, \ldots, K </math>, and produces <math display="inline">M </math> output signals <math display="inline">y_m[t] \in \mathbb C </math>, with <math display="inline">m = 1, \ldots, M </math>. The system's output is sampled at discrete time instances <math>t </math>. All <math display="inline">K </math> input signals are weighted and summed; there is a separate weight <math display="inline">a_{m,k} </math> for each pair of input and output signal. The quantity <math display="inline">n_m[t] \in \mathbb C </math> denotes noise added by the system.
The one-dimensional form of ESPRIT can be applied if the weights have the following form:<math display="block">a_{m,k} = e^{-j(m-1)w_k}</math>That is, the weights are complex exponentials whose phases are integer multiples of some radial frequency <math>w_k </math>. Note that this frequency depends only on the index of the system's input.
The goal of ESPRIT is to estimate the radial frequencies <math>w_k </math> given the outputs<math display="inline">y_m[t] \in \mathbb C </math> and the number of input signals <math display="inline">K </math>.
Since the radial frequencies are the actual objectives, we will change notation from <math display="inline">a_{m,k} </math> to <math display="inline">a_m(w_k) </math>.<math display="block">y_m[t] = \sum_{k=1}^K a_m(w_k) x_k[t] + n_m[t]</math>Let us now change to a vector notation by putting the weights <math display="inline">a_m(w_k) </math> in a column vector <math>\mathbf a(w_k) </math>.<math display="block">\mathbf a(w_k) := [1 \quad e^{-jw_k} \quad e^{-j2w_k} \quad ... \quad e^{-j(M-1)w_k}]^\mathrm T</math>Now, the system model can be rewritten using <math display="inline">\mathbf a(w_k) </math> and the output vector <math display="inline">\mathbf y[t] </math> as follows.<math display="block">\mathbf y[t] = \sum_{k=1}^K \mathbf a(w_k) x_k[t] + \mathbf{n}[t]</math>
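As an illustration, the following NumPy sketch generates synthetic measurements according to this model (the dimensions, frequencies, and noise level are arbitrary example values, not part of the model itself):

 import numpy as np

 rng = np.random.default_rng(0)

 M, K, T = 8, 2, 100                    # outputs, inputs, time samples
 w = np.array([0.7, 1.9])               # example radial frequencies w_k

 # Weight vectors a(w_k) with entries e^{-j(m-1) w_k}, stacked as columns of A
 A = np.exp(-1j * np.arange(M)[:, None] * w[None, :])   # M x K

 # Input signals x_k[t] and additive noise n_m[t]
 X = rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))
 N = 0.1 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))

 Y = A @ X + N                          # columns of Y are the output vectors y[t]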
Dividing into virtual sub-arrays
The basis of ESPRIT is that the weight vector <math display="inline">\mathbf a(w_k) </math> has the property that adjacent entries are related as follows:
<math display="block">[\mathbf{a}(w_k)]_{m+1} = e^{-jw_k} [\mathbf{a}(w_k)]_{m} </math>
In order to write down this property for the whole vector <math>\mathbf a(w_k) </math> we define two selection matrices <math>\mathbf{J}_1 </math>and <math>\mathbf{J}_2 </math>:<math display="block"> \begin{align} \mathbf J_1 &= [\mathbf I_{M-1} \quad \mathbf 0] \\ \mathbf J_2 &= [\mathbf 0 \quad \mathbf I_{M-1}] \end{align} </math>Here, <math> \mathbf I_{M-1} </math> is an identity matrix of size <math display="inline">(M-1) \times (M-1)</math> and <math>\mathbf 0</math> is a vector of zeros. The vector<math>\mathbf J_1 \mathbf a(w_k) </math> contains all elements of <math>\mathbf a(w_k) </math> except the last one. The vector <math>\mathbf J_2 \mathbf a(w_k) </math> contains all elements of <math>\mathbf a(w_k) </math> except the first one. Therefore, we can write:<math display="block">\mathbf J_2 \mathbf a(w_k) = e^{-jw_k} \mathbf J_1 \mathbf a(w_k) </math>In general, we have multiple sinusoids with radial frequencies <math>w_1, w_2, ... w_K </math>. Therefore, we put the corresponding weight vectors <math>\mathbf a(w_1), \mathbf a(w_2), ..., \mathbf a(w_K) </math> into a Vandermonde matrix <math>\mathbf{A}</math>.<math display="block"> \mathbf A := [\mathbf a(w_1) \quad \mathbf a(w_2) \quad ... \quad \mathbf a(w_K)]</math>Moreover, we define a matrix <math> \mathbf H</math> which has complex exponentials on its main diagonal and zero in all other places.
<math display="block"> {\mathbf {H}} := \begin{bmatrix} e^{-jw_1} & \\ & e^{-jw_2} \\ & & \ddots \\ & & & e^{-jw_K} \end{bmatrix}</math>Now, we can write down the property <math>\mathbf a(w_k) </math> for the whole matrix <math>\mathbf{A} </math>.<math display="block"> \mathbf J_2 \mathbf A = \mathbf J_1 \mathbf A \mathbf H</math>Note: <math> \mathbf H</math> is multiplied from the right such that it scales each column of <math>\mathbf A </math> by the same value.
In the next sections, we will use the following matrices:
<math display="block"> \begin{align} \mathbf A_1 &:= \mathbf J_1 \mathbf A \\ \mathbf A_2 &:= \mathbf J_2 \mathbf A \end{align} </math>
Here, <math>\mathbf A_1 </math> contains the first <math>M-1</math> rows of<math>\mathbf A</math>, while <math>\mathbf A_2</math> contains the last <math>M-1</math> rows of <math>\mathbf A</math>.
Hence, the basic property becomes:
<math display="block"> \mathbf A_2 = \mathbf A_1 \mathbf H</math>
Notice that <math>\mathbf H </math> applies a rotation to the matrix <math>\mathbf A_1</math> in the complex plane: each column is multiplied by a unit-magnitude complex exponential. ESPRIT recovers the corresponding rotation from the covariance matrix of the measured data.
Signal subspace
The relation <math> \mathbf A_2 = \mathbf A_1 \mathbf H</math> is the first major observation required for ESPRIT. The second major observation concerns the signal subspace that can be computed from the output signals <math display="inline">\mathbf y[t] </math>.
We will now look at multiple time instances <math display="inline">t = 1, 2, 3, \dots, T </math>. For each time instance, we measure an output vector <math display="inline">\mathbf y[t] </math>. Let <math display="inline">\mathbf Y </math> denote the matrix of size <math>M \times T</math> comprising all of these measurements.<math display="block">\mathbf{Y} := \begin{bmatrix} \mathbf{y}[1] & \mathbf{y}[2] & \dots & \mathbf{y}[ T ] \end{bmatrix}</math>
Likewise, let us put all input signals <math display="inline">x_k[t] </math> into a matrix <math display="inline">\mathbf X </math> of size <math>K \times T</math>.<math display="block">\mathbf{X} := \begin{bmatrix} x_1[1] & x_1[2] & \dots & x_1[T]\\ x_2[1] & x_2[2] & \dots & x_2[T]\\ \vdots & \vdots & \ddots & \vdots \\ x_K[1] & x_K[2] & \dots & x_K[T] \end{bmatrix}</math>
We do the same for the noise components:<math display="block">\mathbf{N} := \begin{bmatrix} \mathbf{n}[1] & \mathbf{n}[2] & \dots & \mathbf{n}[ T ] \end{bmatrix}</math>
The system model can now be written as<math display="block">\mathbf{Y} = \mathbf{A} \mathbf{X} + \mathbf{N}</math>
The singular value decomposition (SVD) of <math display="inline">\mathbf Y </math> is given as<math display="block"> \mathbf{Y} = \mathbf U \mathbf E \mathbf V^*</math>where <math display="inline">\mathbf U </math> and <math display="inline">\mathbf V </math> are unitary matrices of sizes <math display="inline">M \times M</math> and <math display="inline">T \times T</math>, respectively. <math display="inline">\mathbf E </math> is a rectangular diagonal matrix of size <math display="inline">M \times T</math> that holds the singular values in descending order, with the largest value in the top-left entry. The operator * denotes the complex-conjugate (Hermitian) transpose.
Let us assume that<math display="inline">T \geq M</math>, which means that the number of times<math display="inline">T</math> that we conduct a measurement is at least as large as the number of output signals <math display="inline">M</math>.
Notice that in the system model, we have <math display="inline">K</math> input signals. We presume that the <math display="inline">K</math> largest singular values stem from these input signals. All other singular values are presumed to stem from noise. That is, if there were no noise, there would be only <math display="inline">K</math> non-zero singular values. We will now decompose each SVD matrix into submatrices, where some submatrices correspond to the input signals and some correspond to the noise:<math display="block"> \begin{aligned} \mathbf{U} = \begin{bmatrix} \mathbf{U}_\mathrm{S} & \mathbf{U}_\mathrm{N} \end{bmatrix}, & & \mathbf{E} = \begin{bmatrix} \mathbf{E}_\mathrm{S} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{E}_\mathrm{N} & \mathbf{0} \end{bmatrix}, & & \mathbf{V} = \begin{bmatrix} \mathbf{V}_\mathrm{S} & \mathbf{V}_\mathrm{N} & \mathbf{V}_\mathrm{0} \end{bmatrix} \end{aligned}</math>Here, <math display="inline">\mathbf{U}_\mathrm{S} \in \mathbb{C}^{M \times K} </math> and <math display="inline">\mathbf{V}_\mathrm{S} \in \mathbb{C}^{T \times K} </math> contain the first <math display="inline">K</math> columns of <math display="inline">\mathbf U </math> and <math display="inline">\mathbf V </math>, respectively. <math display="inline">\mathbf{E}_\mathrm{S} \in \mathbb{C}^{K \times K} </math> is a diagonal matrix comprising the <math display="inline">K</math> largest singular values. The SVD can be equivalently written as follows.<math display="block"> \mathbf{Y} = \mathbf U_\mathrm{S} \mathbf E_\mathrm{S} \mathbf V_\mathrm{S}^* + \mathbf U_\mathrm{N} \mathbf E_\mathrm{N} \mathbf V_\mathrm{N}^*</math><math display="inline">\mathbf{U}_\mathrm{S} </math>, <math display="inline">\mathbf{V}_\mathrm{S} </math>, and <math display="inline">\mathbf{E}_\mathrm{S} </math> represent the contribution of the input signals <math display="inline">x_k[t] </math> to <math display="inline">\mathbf Y </math>. Therefore, we will call <math display="inline">\mathbf{U}_\mathrm{S} </math> the signal subspace. In contrast, <math display="inline">\mathbf{U}_\mathrm{N} </math>, <math display="inline">\mathbf{V}_\mathrm{N} </math>, and <math display="inline">\mathbf{E}_\mathrm{N} </math> represent the contribution of noise <math display="inline">n_m[t] </math> to <math display="inline">\mathbf Y </math>.
Hence, by using the system model we can write:<math display="block"> \mathbf{A} \mathbf{X} = \mathbf U_\mathrm{S} \mathbf E_\mathrm{S} \mathbf V_\mathrm{S}^* </math>and<math display="block"> \mathbf{N} = \mathbf U_\mathrm{N} \mathbf E_\mathrm{N} \mathbf V_\mathrm{N}^* </math>Rearranging the first of these equations, we get:<math display="block"> \begin{aligned} \mathbf U_\mathrm{S} \mathbf E_\mathrm{S} \mathbf V_\mathrm{S}^* &= \mathbf{A} \mathbf{X} && \cdot \mathbf V_\mathrm{S} \\ \mathbf U_\mathrm{S} \mathbf E_\mathrm{S} \underbrace{\mathbf V_\mathrm{S}^{*} \mathbf V_\mathrm{S}}_{\mathbf{I}} &= \mathbf{A} \mathbf{X} \mathbf V_\mathrm{S} \\ \mathbf U_\mathrm{S} \mathbf E_\mathrm{S} &= \mathbf{A} \mathbf{X} \mathbf V_\mathrm{S} && \cdot \mathbf E_\mathrm{S}^{-1} \\ \mathbf U_\mathrm{S} \underbrace{\mathbf E_\mathrm{S} \mathbf E_\mathrm{S}^{-1}}_{ \mathbf{I} } &= \mathbf{A} \mathbf{X} \mathbf V_\mathrm{S} \mathbf E_\mathrm{S}^{-1} \\ \mathbf U_\mathrm{S} &= \mathbf{A} \underbrace{\mathbf{X} \mathbf V_\mathrm{S} \mathbf E_\mathrm{S}^{-1}}_{ =: \mathbf F } \\ \mathbf U_\mathrm{S} &= \mathbf{A} \mathbf{F} \end{aligned} </math>That is, the signal subspace <math display="inline">\mathbf{U}_\mathrm{S} </math> is the product of the matrix <math display="inline">\mathbf A </math> and some other matrix <math display="inline">\mathbf F </math>. In the following, it is only important that there exists such an invertible matrix <math display="inline">\mathbf F </math>; its actual content will not be important.
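The following sketch illustrates this relation; noise-free data is used so that it holds exactly (with noise, <math display="inline">\mathbf{U}_\mathrm{S} </math> only approximately spans the column space of <math display="inline">\mathbf A </math>):

 import numpy as np

 rng = np.random.default_rng(1)
 M, K, T = 8, 2, 100
 w = np.array([0.7, 1.9])
 A = np.exp(-1j * np.arange(M)[:, None] * w[None, :])
 X = rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))
 Y = A @ X                                  # noise-free measurement matrix

 U, s, Vh = np.linalg.svd(Y)
 Us = U[:, :K]                              # K dominant left singular vectors

 # U_S and A span the same column space, so U_S = A F for an invertible F
 F = np.linalg.lstsq(A, Us, rcond=None)[0]
 assert np.allclose(A @ F, Us)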
Note:
The signal subspace is usually not computed from the measurement matrix <math display="inline">\mathbf Y </math>. Instead, one may use the auto-correlation matrix:
<math display="block"> \begin{aligned} \mathbf{R}_\mathrm{YY} &= \tfrac{1}{T} \sum_{t=1}^T \mathbf{y}[t] \mathbf{y}^*[t] = \tfrac{1}{T} \mathbf{Y} \mathbf{Y}^* = \tfrac{1}{T} \mathbf U \underbrace{\mathbf E \mathbf E^*}_{=: \mathbf{E}'} \mathbf U^* \end{aligned}</math>Hence, <math display="inline">\mathbf{R}_\mathrm{YY} </math> can be decomposed into a signal subspace and a noise subspace:
<math display="block"> \mathbf{R}_\mathrm{YY} = \tfrac{1}{T} \mathbf U_\mathrm{S} \mathbf E_\mathrm{S}' \mathbf U_\mathrm{S}^* + \tfrac{1}{T} \mathbf U_\mathrm{N} \mathbf E_\mathrm{N}' \mathbf U_\mathrm{N}^*</math>
Putting things together
These are the two basic properties established so far:
<math display="block"> \begin{aligned} \mathbf U_\mathrm{S} &= \mathbf{A} \mathbf{F}, & \mathbf J_2 \mathbf A &= \mathbf J_1 \mathbf A \mathbf H \end{aligned}</math>
Let us start with the equation on the right:
<math display="block"> \begin {aligned} \mathbf J_2 \mathbf A &= \mathbf J_1 \mathbf A \mathbf H && \text{using:}\quad \mathbf U_\mathrm{S} = \mathbf{A} \mathbf{F} \\ \mathbf J_2 \mathbf U_\mathrm{S} \mathbf{F}^{-1} &= \mathbf J_1 \mathbf U_\mathrm{S} \mathbf{F}^{-1} \mathbf H && \cdot \mathbf F \\ \mathbf J_2 \mathbf U_\mathrm{S} \underbrace{\mathbf{F}^{-1} \mathbf F}_{\mathbf{I}} &= \mathbf J_1 \mathbf U_\mathrm{S} \mathbf{F}^{-1} \mathbf H \mathbf F \\ \mathbf J_2 \mathbf U_\mathrm{S} &= \mathbf J_1 \mathbf U_\mathrm{S} \mathbf{F}^{-1} \mathbf H \mathbf{F} \end{aligned}</math>
Define these abbreviations for the truncated signal subspaces:<math display="block"> \begin{aligned} \mathbf{S}_1 &:= \mathbf J_1 \mathbf U_\mathrm{S} & \mathbf{S}_2 &:= \mathbf J_2 \mathbf U_\mathrm{S} \end{aligned}</math>Moreover, define this matrix:<math display="block"> \begin{aligned} \mathbf{P} &:= \mathbf{F}^{-1} \mathbf H \mathbf{F} \end{aligned}</math>Note that this definition has the form of an eigenvalue decomposition: the eigenvalues of <math> \mathbf P</math> are the diagonal entries of <math> \mathbf H</math>. As defined earlier, <math> \mathbf H</math> stores complex exponentials on its main diagonal, and their phases are the sought-after radial frequencies <math>w_1, w_2, \ldots, w_K </math>.
Using these abbreviations, the following form is obtained:
<math display="block"> \begin {aligned} \mathbf S_2 &= \mathbf S_1 \mathbf{P} \end{aligned}</math>
The idea is now that, if we could compute <math> \mathbf P</math> from this equation, we would be able to find the eigenvalues of <math> \mathbf P</math>, which in turn would give us the radial frequencies. However, <math> \mathbf S_1</math> is generally not a square matrix and hence not invertible. Instead, a least squares solution can be used:
<math display="block"> {\mathbf P} = (\mathbf S_1^* {\mathbf S_1})^{-1} \mathbf S_1^* {\mathbf S_2}</math>
Estimation of radial frequencies
The eigenvalues <math display="inline">\lambda_1, \lambda_2, \ldots, \lambda_K </math> of <math>\mathbf P</math> are complex numbers:<math display="block">\lambda_k = \alpha_k \mathrm{e}^{-j w_k} </math>The estimated radial frequencies <math>w_1, w_2, \ldots, w_K </math> are obtained from the phases (angles) of the eigenvalues; with the weight convention <math display="inline">a_m(w_k) = e^{-j(m-1)w_k}</math> used above, <math>w_k = -\arg \lambda_k </math>.
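A sketch of this final step; note that, with the sign convention above, the eigenvalue phases must be negated:

 import numpy as np

 def frequencies(P):
     lam = np.linalg.eigvals(P)
     # Eigenvalues are approximately e^{-j w_k}; negate the phases to get w_k
     return -np.angle(lam)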
Algorithm summary
- Collect measurements <math display="inline">\mathbf y[1], \mathbf y[2], \ldots, \mathbf y[T]</math>.
- If not already known: estimate the number of input signals <math display="inline">K</math>.
- Compute the auto-correlation matrix:<math display="block"> \mathbf{R}_\mathrm{YY} = \tfrac{1}{T} \sum_{t=1}^T \mathbf{y}[t] \mathbf{y}^*[t]</math>
- Compute the singular value decomposition (SVD) of <math display="inline">\mathbf{R}_\mathrm{YY}</math> and extract the signal subspace <math display="inline">\mathbf{U}_\mathrm{S} \in \mathbb{C}^{M \times K} </math>:<math display="block"> \mathbf{R}_\mathrm{YY} = \tfrac{1}{T} \mathbf U_\mathrm{S} \mathbf E_\mathrm{S}' \mathbf U_\mathrm{S}^* + \tfrac{1}{T} \mathbf U_\mathrm{N} \mathbf E_\mathrm{N}' \mathbf U_\mathrm{N}^*</math>
- Compute the matrices <math display="inline">\mathbf{S}_1</math> and <math display="inline">\mathbf{S}_2</math>:<math display="block"> \begin{aligned} \mathbf{S}_1 &:= \mathbf J_1 \mathbf U_\mathrm{S} & \mathbf{S}_2 &:= \mathbf J_2 \mathbf U_\mathrm{S} \end{aligned}</math>where <math> \mathbf J_1 = [\mathbf I_{M-1} \quad \mathbf 0] </math> and <math> \mathbf J_2 = [\mathbf 0 \quad \mathbf I_{M-1}] </math>.
- Solve the equation<math display="block"> \mathbf S_2 = \mathbf S_1 \mathbf{P}</math>for <math display="inline">\mathbf{P} </math>. An example would be the least squares solution:<math display="block"> {\mathbf P} = (\mathbf S_1^* {\mathbf S_1})^{-1} \mathbf S_1^* {\mathbf S_2}</math>(Here, * denotes the Hermitian (conjugate) transpose.) An alternative would be the total least squares solution.
- Compute the eigenvalues <math display="inline">\lambda_1, \lambda_2, \ldots, \lambda_K</math> of <math display="inline">\mathbf{P} </math>.
- The phases of the eigenvalues <math display="inline">\lambda_k = \alpha_k \mathrm{e}^{-j \omega_k}</math> yield the sought-after radial frequencies:<math display="block">\omega_k = -\arg \lambda_k </math>
Notes
Choice of selection matrices
In the derivation above, the selection matrices <math display="inline"> \mathbf J_1 </math> and <math> \mathbf{J}_2 </math> were used. For simplicity, they were defined as <math> \mathbf J_1 = [\mathbf I_{M-1} \quad \mathbf 0] </math> and <math> \mathbf J_2 = [\mathbf 0 \quad \mathbf I_{M-1}] </math>. However, at no point during the derivation was it required that <math display="inline"> \mathbf J_1 </math> and <math> \mathbf{J}_2 </math> be defined like this. Indeed, any appropriate matrices may be used as long as the rotational invariance<math display="block">\mathbf J_2 \mathbf a(w_k) = e^{-jw_k} \mathbf J_1 \mathbf a(w_k) </math>(or some generalization of it) is maintained. Accordingly, <math display="inline"> \mathbf A_1 := \mathbf J_1 \mathbf A </math> and <math display="inline"> \mathbf A_2 := \mathbf J_2 \mathbf A </math> may contain any rows of <math display="inline"> \mathbf A </math>.
Generalized rotational invariance
The rotational invariance used in the derivation may be generalized. So far, the matrix <math> \mathbf H</math> has been defined to be a diagonal matrix that stores the sought-after complex exponentials on its main diagonal. However, <math> \mathbf H</math> may also exhibit some other structure.[3] For instance, it may be an upper triangular matrix. In this case, <math display="inline">\mathbf{P} := \mathbf{F}^{-1} \mathbf H \mathbf{F}</math> constitutes a triangularization of <math> \mathbf P</math>, and the eigenvalues of <math> \mathbf P</math> are the diagonal entries of <math> \mathbf H</math>.
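As an illustration of this generalization, a sketch using SciPy (the complex Schur form is one such triangularization; its diagonal again yields the complex exponentials):

 import numpy as np
 from scipy.linalg import schur

 def frequencies_via_schur(P):
     # Complex Schur form P = Z T Z^*, with T upper triangular; the diagonal
     # of T holds the eigenvalues of P, i.e. the complex exponentials e^{-j w_k}
     T, Z = schur(P, output="complex")
     return -np.angle(np.diag(T))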
Algorithm example
Pseudocode for an implementation of the ESPRIT algorithm is given below, applied to direction-of-arrival (DOA) estimation with a uniform linear array; elspacing denotes the element spacing in wavelengths.
 function esprit(y, model_order, number_of_sources, elspacing):
     m = model_order
     n = number_of_sources
     create the covariance matrix R from the noisy measurements y; R is of size m-by-m
     compute the SVD of R: [U, E, V] = svd(R)
     obtain the orthonormal eigenvectors corresponding to the sources: S = U(:, 1:n)
     split the orthonormal eigenvectors in two: S1 = S(1:m-1, :) and S2 = S(2:m, :)
     compute P via least squares (MATLAB's backslash operator): P = S1\S2
     find the angles of the eigenvalues of P: w = angle(eig(P)) / (2*pi*elspacing)
     doa = asind(w)    % take the arcsine to obtain the DOA angles in degrees
     return doa
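For comparison, a direct NumPy translation of this pseudocode might look as follows (an illustrative sketch: the function name, the default half-wavelength element spacing, and the covariance estimate are choices made here, not prescribed by the pseudocode):

 import numpy as np

 def esprit_doa(y, model_order, number_of_sources, elspacing=0.5):
     # y: (model_order x T) array of noisy snapshots
     # elspacing: element spacing in wavelengths (0.5 = half wavelength)
     m, n = model_order, number_of_sources
     R = (y @ y.conj().T) / y.shape[1]            # sample covariance matrix (m x m)
     U, _, _ = np.linalg.svd(R)
     S = U[:, :n]                                 # signal subspace
     S1, S2 = S[:-1, :], S[1:, :]                 # two shifted sub-arrays
     P, *_ = np.linalg.lstsq(S1, S2, rcond=None)  # least squares "rotation"
     w = np.angle(np.linalg.eigvals(P)) / (2 * np.pi * elspacing)
     return np.degrees(np.arcsin(w))              # DOA angles in degrees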
References
- ↑ Template:Citation
- ↑ Volodymyr Vasylyshyn. "Direction of arrival estimation using ESPRIT with sparse arrays." Proc. 2009 European Radar Conference (EuRAD), 30 Sept.–2 Oct. 2009, pp. 246–249.
- ↑ Template:Cite journal
Further reading
- Haardt, M.; Zoltowski, M. D.; Mathews, C. P.; Nossek, J. (1995). "2D unitary ESPRIT for efficient 2D parameter estimation". Proc. ICASSP-95. IEEE, pp. 2096–2099.