English Wikipedia: Gauss–Seidel method
In numerical linear algebra, the Gauss–Seidel method, also known as the Liebmann method or the method of successive displacement, is an iterative method used to solve a system of linear equations. It is named after the German mathematicians Carl Friedrich Gauss and Philipp Ludwig von Seidel, and is similar to the Jacobi method. Though it can be applied to any matrix with non-zero elements on the diagonal, convergence is only guaranteed if the matrix is either strictly diagonally dominant,[1] or symmetric and positive definite. Gauss mentioned the method only in a private letter to his student Gerling in 1823.[2] It was not published until 1874, by Seidel.[3]
Description
Let <math display="inline">A\mathbf x = \mathbf b</math> be a square system of <math>n</math> linear equations, where:
<math display="block">A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x_{1} \\ x_2 \\ \vdots \\ x_n \end{bmatrix} , \qquad \mathbf{b} = \begin{bmatrix} b_{1} \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.</math>
When <math>A</math> and <math>\mathbf b</math> are known, and <math>\mathbf x</math> is unknown, we can use the Gauss–Seidel method to approximate <math>\mathbf x</math>. The vector <math>\mathbf x^{(0)}</math> denotes our initial guess for <math>\mathbf x</math> (often <math>\mathbf x^{(0)}_i=0</math> for <math>i=1,2,\dots,n</math>). We denote by <math>\mathbf{x}^{(k)}</math> the <math>k</math>-th approximation, or iterate, of <math>\mathbf{x}</math>, and by <math>\mathbf{x}^{(k+1)}</math> the next, <math>(k+1)</math>-th, iterate of <math>\mathbf{x}</math>.
Matrix-based formula
The solution is obtained iteratively via <math display="block"> L_* \mathbf{x}^{(k+1)} = \mathbf{b} - U \mathbf{x}^{(k)}, </math> where the matrix <math>A</math> is decomposed into a lower triangular component <math>L_*</math>, and a strictly upper triangular component <math>U</math> such that <math> A = L_* + U </math>.[4] More specifically, the decomposition of <math>A</math> into <math>L_*</math> and <math>U</math> is given by:
<math display="block">A = \underbrace{ \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} }_{\textstyle L_*} + \underbrace{ \begin{bmatrix} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & 0 \end{bmatrix} }_{\textstyle U} . </math>
Why the matrix-based formula works
The system of linear equations may be rewritten as:
<math display="block">\begin{alignat}{1} A\mathbf x &= \mathbf b \\ (L_*+U) \mathbf x &= \mathbf b \\ L_* \mathbf x+U\mathbf x &= \mathbf b \\ L_* \mathbf{x} &= \mathbf{b} - U \mathbf{x} \end{alignat}</math>
The Gauss–Seidel method now solves the left-hand side of this expression for <math>\mathbf{x}</math>, using the previous value of <math>\mathbf{x}</math> on the right-hand side. Analytically, this may be written as: <math display="block"> \mathbf{x}^{(k+1)} = L_*^{-1} \left(\mathbf{b} - U \mathbf{x}^{(k)}\right). </math>
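The matrix-form update can be sketched in a few lines of NumPy. This is only an illustration: the function names `gauss_seidel_split` and `matrix_step` and the 2×2 test system are assumptions made for this example, and the general dense solver `np.linalg.solve` stands in for a dedicated triangular solve.

```python
import numpy as np


def gauss_seidel_split(A):
    """Split A into L_* (lower triangle with diagonal) and U (strictly upper)."""
    L_star = np.tril(A)      # diagonal and everything below it
    U = np.triu(A, k=1)      # only the entries strictly above the diagonal
    return L_star, U


def matrix_step(A, b, x):
    """One iteration: solve L_* x_new = b - U x for x_new."""
    L_star, U = gauss_seidel_split(A)
    # np.linalg.solve does not exploit the triangular structure of L_*;
    # that is acceptable for a small illustration.
    return np.linalg.solve(L_star, b - U @ x)


# Hypothetical, strictly diagonally dominant test system (not from the article).
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])

x = np.zeros(2)
for _ in range(25):
    x = matrix_step(A, b, x)
print(x)
```

In practice the inverse of <math>L_*</math> is never formed explicitly; the element-based formula in the next section performs the same triangular solve by forward substitution.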
Element-based formula
However, by taking advantage of the triangular form of <math>L_*</math>, the elements of <math>\mathbf{x}^{(k+1)}</math> can be computed sequentially for each row <math>i</math> using forward substitution:[5] <math display="block"> x^{(k+1)}_i = \frac{1}{a_{ii}} \left(b_i - \sum_{j=1}^{i-1}a_{ij}x^{(k+1)}_j - \sum_{j=i+1}^{n}a_{ij}x^{(k)}_j \right),\quad i=1,2,\dots,n. </math>
Notice that the two summations in the formula can be combined into a single summation <math>\sum_{j \ne i} a_{ij}x_j</math> that uses the most recently computed value of each <math>x_j</math>. The procedure is generally continued until the changes made by an iteration fall below some tolerance, or until the residual is sufficiently small.
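A minimal sketch of the element-based formula with a stopping test on the largest change per sweep is given below. The function name `gauss_seidel`, the parameters `tol` and `max_sweeps`, and the choice of stopping rule are assumptions made for this illustration; the test system is the four-equation example used later in this article.

```python
import numpy as np


def gauss_seidel(A, b, x0=None, tol=1e-10, max_sweeps=500):
    """Return (x, sweeps) after iterating until the largest change is below tol."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = b.size
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for i in range(n):
            # x[:i] already holds values of the current sweep,
            # x[i+1:] still holds values of the previous sweep.
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            new_value = (b[i] - sigma) / A[i, i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value  # overwrite in place
        if largest_change < tol:
            return x, sweep
    return x, max_sweeps  # tolerance not reached within max_sweeps


A = [[10, -1, 2, 0], [-1, 11, -1, 3], [2, -1, 10, -1], [0, 3, -1, 8]]
b = [6, 25, -11, 15]
solution, sweeps_used = gauss_seidel(A, b)
print(solution, sweeps_used)
```

A residual-based test, such as checking the norm of <math>\mathbf b - A\mathbf x</math>, could replace the change-based test used here.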
Discussion
The element-wise formula for the Gauss–Seidel method is similar to that of the Jacobi method.
The computation of <math>\mathbf{x}^{(k+1)}</math> uses the elements of <math>\mathbf{x}^{(k+1)}</math> that have already been computed, and only those elements of <math>\mathbf{x}^{(k)}</math> that have not yet been updated in the <math>(k+1)</math>-th iteration. This means that, unlike the Jacobi method, only one storage vector is required as elements can be overwritten as they are computed, which can be advantageous for very large problems.
However, unlike the Jacobi method, the computations for each element are generally much harder to implement in parallel, since they can have a very long critical path, and are thus most feasible for sparse matrices. Furthermore, the values at each iteration are dependent on the order of the original equations.
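The storage difference between the two methods can be seen by writing one sweep of each side by side. The sketch below is illustrative only: the function names and the 2×2 system are assumptions made for this example.

```python
import numpy as np


def jacobi_sweep(A, b, x):
    """Return a new vector; every entry is computed from the old x only."""
    x_new = np.empty_like(x)
    for i in range(len(b)):
        sigma = A[i] @ x - A[i, i] * x[i]   # sum over j != i using old values
        x_new[i] = (b[i] - sigma) / A[i, i]
    return x_new


def gauss_seidel_sweep(A, b, x):
    """Overwrite x in place; entry i sees the already-updated entries before it."""
    for i in range(len(b)):
        sigma = A[i] @ x - A[i, i] * x[i]   # x now mixes new and old values
        x[i] = (b[i] - sigma) / A[i, i]
    return x


# Hypothetical test system (not from the article).
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])

jacobi_result = jacobi_sweep(A, b, np.zeros(2))
gs_result = gauss_seidel_sweep(A, b, np.zeros(2))
print(jacobi_result, gs_result)
```

Starting from zero, the Jacobi sweep gives (0.25, 0.4), while the Gauss–Seidel sweep gives (0.25, 0.3) because its second entry already uses the updated first entry. The loop bodies are identical; only the target of the assignment differs, which is also why the Jacobi loop can run its rows independently while the Gauss–Seidel loop cannot.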
Gauss-Seidel is the same as successive over-relaxation with <math>\omega=1</math>.
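To see why, recall the usual element-wise form of successive over-relaxation with relaxation factor <math>\omega</math> (stated here from the standard definition, which this article does not spell out):

<math display="block"> x^{(k+1)}_i = (1-\omega)\,x^{(k)}_i + \frac{\omega}{a_{ii}} \left(b_i - \sum_{j=1}^{i-1}a_{ij}x^{(k+1)}_j - \sum_{j=i+1}^{n}a_{ij}x^{(k)}_j \right),\quad i=1,2,\dots,n. </math>

Setting <math>\omega=1</math> removes the first term and leaves exactly the element-based Gauss–Seidel formula.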
Convergence
The convergence properties of the Gauss–Seidel method are dependent on the matrix <math>A</math>. Namely, the procedure is known to converge if either:
- <math>A</math> is symmetric positive-definite,[6] or
- <math>A</math> is strictly or irreducibly diagonally dominant.[7]
The Gauss–Seidel method sometimes converges even if these conditions are not satisfied.
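Two of these sufficient conditions are easy to test numerically. The sketch below is a hedged illustration: the function names are assumptions, the positive-definiteness test relies on whether a floating-point Cholesky factorization succeeds, and irreducible diagonal dominance is not checked.

```python
import numpy as np


def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum over j != i of |a_ij| holds in every row."""
    abs_A = np.abs(np.asarray(A, dtype=float))
    diagonal = np.diag(abs_A)
    off_diagonal_sums = abs_A.sum(axis=1) - diagonal
    return bool(np.all(diagonal > off_diagonal_sums))


def is_symmetric_positive_definite(A):
    """True if A is symmetric and its Cholesky factorization succeeds."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    try:
        np.linalg.cholesky(A)   # raises LinAlgError if A is not positive definite
    except np.linalg.LinAlgError:
        return False
    return True


# The two matrices used in the matrix-version examples of this article.
A_first = [[16, 3], [7, -11]]
A_second = [[2, 3], [5, 7]]
print(is_strictly_diagonally_dominant(A_first), is_symmetric_positive_definite(A_first))
print(is_strictly_diagonally_dominant(A_second), is_symmetric_positive_definite(A_second))
```

A negative answer from both checks does not prove divergence, since the conditions are sufficient but not necessary.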
Golub and Van Loan give a theorem for an algorithm that splits <math>A</math> into two parts. Suppose <math>A = M - N</math> is nonsingular. Let <math>r = \rho(M^{-1}N)</math> be the spectral radius of <math>M^{-1}N</math>. Then the iterates <math>x^{(k)}</math> defined by <math>Mx^{(k+1)} = Nx^{(k)} + b</math> converge to <math>x = A^{-1}b</math> for any starting vector <math>x^{(0)}</math> if <math>M</math> is nonsingular and <math>r < 1</math>.[8]
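For the Gauss–Seidel method the splitting is <math>M = L_*</math> and <math>N = -U</math>, so the theorem can be checked numerically by computing the spectral radius of <math>M^{-1}N</math>. The following is a minimal sketch; the function name `gauss_seidel_spectral_radius` is an assumption made for this example.

```python
import numpy as np


def gauss_seidel_spectral_radius(A):
    """Spectral radius of M^{-1} N for the splitting M = L_*, N = -U."""
    A = np.asarray(A, dtype=float)
    M = np.tril(A)           # M = L_*
    N = -np.triu(A, k=1)     # N = -U, so that A = M - N
    iteration_matrix = np.linalg.solve(M, N)   # M^{-1} N without forming M^{-1}
    return float(np.max(np.abs(np.linalg.eigvals(iteration_matrix))))


# The two matrices used in the matrix-version examples of this article.
r_first = gauss_seidel_spectral_radius([[16, 3], [7, -11]])
r_second = gauss_seidel_spectral_radius([[2, 3], [5, 7]])
print(r_first, r_second)
```

The first value is 21/176 (about 0.119), below 1, and the second is 15/14 (about 1.071), above 1, which agrees with the behaviour of the two worked examples.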
Algorithm
Since elements can be overwritten as they are computed in this algorithm, only one storage vector is needed, and vector indexing is omitted. The algorithm goes as follows:
    algorithm Gauss–Seidel method is
        inputs: A, b
        output: x

        choose an initial guess x for the solution
        repeat until convergence
            for i from 1 until n do
                σ ← 0
                for j from 1 until n do
                    if j ≠ i then
                        σ ← σ + a_ij x_j
                    end if
                end (j-loop)
                x_i ← (b_i − σ) / a_ii
            end (i-loop)
            check if convergence is reached
        end (repeat)
Examples
An example for the matrix version
A linear system shown as <math>A \mathbf{x} = \mathbf{b}</math> is given by: <math display="block"> A=
\begin{bmatrix} 16 & 3 \\ 7 & -11 \\ \end{bmatrix}
\quad \text{and} \quad
b= \begin{bmatrix} 11 \\ 13 \end{bmatrix}.
</math>
We want to use the equation <math display="block"> \mathbf{x}^{(k+1)} = L_*^{-1} (\mathbf{b} - U \mathbf{x}^{(k)}) </math> in the form <math display="block"> \mathbf{x}^{(k+1)} = T \mathbf{x}^{(k)} + C </math> where:
- <math>T = - L_*^{-1} U \quad \text{and} \quad C = L_*^{-1} \mathbf{b}.</math>
We must decompose <math>A</math> into the sum of a lower triangular component <math>L_*</math> and a strict upper triangular component <math>U</math>: <math display="block"> L_*=
\begin{bmatrix} 16 & 0 \\ 7 & -11 \\ \end{bmatrix}
\quad \text{and} \quad U =
\begin{bmatrix} 0 & 3 \\ 0 & 0 \end{bmatrix}.</math>
The inverse of <math>L_*</math> is: <math display="block"> L_*^{-1} =
\begin{bmatrix} 16 & 0 \\ 7 & -11 \end{bmatrix}^{-1} = \begin{bmatrix} 0.0625 & 0.0000 \\ 0.0398 & -0.0909 \\ \end{bmatrix}.
</math>
Now we can find: <math display="block">\begin{align}
T &= - \begin{bmatrix} 0.0625 & 0.0000 \\ 0.0398 & -0.0909 \end{bmatrix} \begin{bmatrix} 0 & 3 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix},
\\[1ex]
C &= \begin{bmatrix} 0.0625 & 0.0000 \\ 0.0398 & -0.0909 \end{bmatrix} \begin{bmatrix} 11 \\ 13 \end{bmatrix} = \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix}.
\end{align}</math>
Now we have <math>T</math> and <math>C</math> and we can use them to obtain the vectors <math>\mathbf{x}</math> iteratively.
First of all, we have to choose <math>\mathbf{x}^{(0)}</math>: we can only guess. The better the guess, the quicker the algorithm will perform.
We choose a starting point: <math display="block"> x^{(0)} = \begin{bmatrix} 1.0 \\ 1.0 \end{bmatrix}.</math>
We can then calculate: <math display="block">\begin{align}
x^{(1)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 1.0 \\ 1.0 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.5000 \\ -0.8636 \end{bmatrix}.
\\[1ex]
x^{(2)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.5000 \\ -0.8636 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8494 \\ -0.6413 \end{bmatrix}.
\\[1ex]
x^{(3)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.8494 \\ -0.6413 \\ \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8077 \\ -0.6678 \end{bmatrix}.
\\[1ex]
x^{(4)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.8077 \\ -0.6678 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8127 \\ -0.6646 \end{bmatrix}.
\\[1ex]
x^{(5)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.8127 \\ -0.6646 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8121 \\ -0.6650 \end{bmatrix}.
\\[1ex]
x^{(6)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.8121 \\ -0.6650 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8122 \\ -0.6650 \end{bmatrix}.
\\[1ex]
x^{(7)} &= \begin{bmatrix} 0.000 & -0.1875 \\ 0.000 & -0.1193 \end{bmatrix} \begin{bmatrix} 0.8122 \\ -0.6650 \end{bmatrix} + \begin{bmatrix} 0.6875 \\ -0.7443 \end{bmatrix} = \begin{bmatrix} 0.8122 \\ -0.6650 \end{bmatrix}.
\end{align}</math>
As expected, the algorithm converges to the exact solution: <math display="block"> \mathbf{x} = A^{-1} \mathbf{b} \approx \begin{bmatrix} 0.8122\\ -0.6650 \end{bmatrix}. </math>
In fact, the matrix <math>A</math> is strictly diagonally dominant (but not positive definite).
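The iterates of this example can be reproduced with a short NumPy script. This is a sketch for checking the arithmetic; the variable names are chosen for this illustration, and `np.linalg.solve` is used instead of forming <math>L_*^{-1}</math> explicitly.

```python
import numpy as np

A = np.array([[16.0, 3.0], [7.0, -11.0]])
b = np.array([11.0, 13.0])

L_star = np.tril(A)                # lower triangular part, diagonal included
U = np.triu(A, k=1)                # strictly upper triangular part
T = -np.linalg.solve(L_star, U)    # T = -L_*^{-1} U
C = np.linalg.solve(L_star, b)     # C = L_*^{-1} b

x = np.array([1.0, 1.0])           # starting point x^(0) used in the text
history = [x]
for k in range(7):
    x = T @ x + C                  # x^(k+1) = T x^(k) + C
    history.append(x)
    print(k + 1, np.round(x, 4))
```

Working in full double precision avoids the small discrepancies that arise when the four-digit rounded entries of <math>L_*^{-1}</math> are multiplied by hand.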
Another example for the matrix version
Another linear system shown as <math>A \mathbf{x} = \mathbf{b}</math> is given by:
<math display="block"> A=
\begin{bmatrix} 2 & 3 \\ 5 & 7 \\ \end{bmatrix}
\quad \text{and} \quad b =
\begin{bmatrix} 11 \\ 13 \\ \end{bmatrix}.
</math>
We want to use the equation <math display="block"> \mathbf{x}^{(k+1)} = L_*^{-1} (\mathbf{b} - U \mathbf{x}^{(k)}) </math> in the form <math display="block"> \mathbf{x}^{(k+1)} = T \mathbf{x}^{(k)} + C </math> where:
- <math>T = - L_*^{-1} U \quad \text{and} \quad C = L_*^{-1} \mathbf{b}.</math>
We must decompose <math>A</math> into the sum of a lower triangular component <math>L_*</math> and a strict upper triangular component <math>U</math>: <math display="block"> L_*=
\begin{bmatrix} 2 & 0 \\ 5 & 7 \\ \end{bmatrix}
\quad \text{and} \quad U =
\begin{bmatrix} 0 & 3 \\ 0 & 0 \\ \end{bmatrix}.</math>
The inverse of <math>L_*</math> is: <math display="block"> L_*^{-1} =
\begin{bmatrix} 2 & 0 \\ 5 & 7 \\ \end{bmatrix}^{-1} = \begin{bmatrix} 0.500 & 0.000 \\ -0.357 & 0.143 \\ \end{bmatrix}
.</math>
Now we can find: <math display="block">\begin{align}
T &= - \begin{bmatrix} 0.500 & 0.000 \\ -0.357 & 0.143 \\ \end{bmatrix} \begin{bmatrix} 0 & 3 \\ 0 & 0 \\ \end{bmatrix} = \begin{bmatrix} 0.000 & -1.500 \\ 0.000 & 1.071 \\ \end{bmatrix},
\\[1ex]
C &= \begin{bmatrix} 0.500 & 0.000 \\ -0.357 & 0.143 \\ \end{bmatrix} \begin{bmatrix} 11 \\ 13 \\ \end{bmatrix} = \begin{bmatrix} 5.500 \\ -2.071 \\ \end{bmatrix}.
\end{align}</math>
Now we have <math>T</math> and <math>C</math> and we can use them to obtain the vectors <math>\mathbf{x}</math> iteratively.
First of all, we have to choose <math>\mathbf{x}^{(0)}</math>: we can only guess. The better the guess, the quicker the algorithm will perform.
We choose a starting point: <math display="block"> x^{(0)} = \begin{bmatrix} 1.1 \\ 2.3 \end{bmatrix}.</math>
We can then calculate: <math display="block">\begin{align}
x^{(1)} &= \begin{bmatrix} 0 & -1.500 \\ 0 & 1.071 \\ \end{bmatrix} \begin{bmatrix} 1.1 \\ 2.3 \\ \end{bmatrix} + \begin{bmatrix} 5.500 \\ -2.071 \\ \end{bmatrix} = \begin{bmatrix} 2.050 \\ 0.393 \\ \end{bmatrix}.
\\[1ex]
x^{(2)} &= \begin{bmatrix} 0 & -1.500 \\ 0 & 1.071 \\ \end{bmatrix} \begin{bmatrix} 2.050 \\ 0.393 \\ \end{bmatrix} + \begin{bmatrix} 5.500 \\ -2.071 \\ \end{bmatrix} = \begin{bmatrix} 4.911 \\ -1.651 \end{bmatrix}.
\\[1ex]
x^{(3)} &= \cdots.
\end{align}</math>
If we test for convergence, we find that the algorithm diverges. In fact, the matrix <math>A</math> is neither diagonally dominant nor positive definite, and the iteration matrix <math>T</math> has spectral radius <math>15/14 \approx 1.071 > 1</math>. Convergence to the exact solution <math display="block"> \mathbf{x} = A^{-1} \mathbf{b} = \begin{bmatrix} -38\\ 29 \end{bmatrix} </math> is therefore not guaranteed and, in this case, does not occur.
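The divergence can be observed directly by tracking the distance to the exact solution over a number of sweeps. The script below is a sketch under the assumption that 20 sweeps are enough to show the trend; the variable names are chosen for this illustration.

```python
import numpy as np

A = np.array([[2.0, 3.0], [5.0, 7.0]])
b = np.array([11.0, 13.0])
exact = np.linalg.solve(A, b)      # approximately (-38, 29)

x = np.array([1.1, 2.3])           # starting point x^(0) used in the text
errors = []
for k in range(20):
    # One Gauss-Seidel sweep written out for the 2x2 case.
    x[0] = (b[0] - A[0, 1] * x[1]) / A[0, 0]
    x[1] = (b[1] - A[1, 0] * x[0]) / A[1, 1]
    errors.append(np.linalg.norm(x - exact))

print(errors[0], errors[-1])
print(errors[-1] / errors[-2])     # growth factor per sweep
```

The error grows at every sweep, by a factor that equals 15/14, the largest eigenvalue in absolute value of the iteration matrix <math>T</math>.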
An example for the equation version
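Suppose we are given <math>n</math> equations in the unknowns <math>x_1, x_2, \dots, x_n</math>, together with a starting point <math>\mathbf x^{(0)}</math>. From the first equation, solve for <math>x_1</math> in terms of <math>x_2, x_3, \dots, x_n.</math> Do the same for each subsequent equation, solving the <math>i</math>-th equation for <math>x_i</math> and substituting the most recently computed values of the other unknowns.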
To make it clear consider an example. <math display="block">\begin{array}{rrrrl} 10x_1 &- x_2 &+ 2x_3 & & = 6, \\
-x_1 &+ 11x_2 &- x_3 &+ 3x_4 & = 25, \\ 2x_1 &- x_2 &+ 10x_3 &- x_4 & = -11, \\ & 3x_2 &- x_3 &+ 8x_4 & = 15.
\end{array}</math>
Solving for <math>x_1, x_2, x_3</math> and <math>x_4</math> gives: <math display="block">\begin{align} x_1 & = x_2/10 - x_3/5 + 3/5, \\ x_2 & = x_1/11 + x_3/11 - 3x_4/11 + 25/11, \\ x_3 & = -x_1/5 + x_2/10 + x_4/10 - 11/10, \\ x_4 & = -3x_2/8 + x_3/8 + 15/8. \end{align}</math>
Suppose we choose <math>(0, 0, 0, 0)</math> as the initial approximation; then the first approximate solution is given by <math display="block">\begin{align} x_1 & = 3/5 = 0.6, \\ x_2 & = (3/5)/11 + 25/11 = 3/55 + 25/11 \approx 2.3273, \\ x_3 & = -(3/5)/5 +(2.3273)/10-11/10 = -3/25 + 0.23273-1.1 \approx -0.9873,\\ x_4 & = -3(2.3273)/8 +(-0.9873)/8+15/8 \approx 0.8789. \end{align}</math>
Using the approximations obtained, the iterative procedure is repeated until the desired accuracy has been reached. The following are the approximate solutions after each of the first four iterations.
<math>x_1</math> | <math>x_2</math> | <math>x_3</math> | <math>x_4</math> |
---|---|---|---|
0.6 | 2.32727 | −0.987273 | 0.878864 |
1.03018 | 2.03694 | −1.01446 | 0.984341 |
1.00659 | 2.00356 | −1.00253 | 0.998351 |
1.00086 | 2.0003 | −1.00031 | 0.99985 |
The exact solution of the system is <math>(1, 2, -1, 1)</math>.
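The first sweep of this example can also be carried out in exact rational arithmetic, which separates the method itself from rounding in the hand calculation. The sketch below uses Python's standard `fractions` module; the variable names are chosen for this illustration.

```python
from fractions import Fraction as F

# Coefficients and right-hand side of the four-equation example.
A = [
    [F(10), F(-1), F(2), F(0)],
    [F(-1), F(11), F(-1), F(3)],
    [F(2), F(-1), F(10), F(-1)],
    [F(0), F(3), F(-1), F(8)],
]
b = [F(6), F(25), F(-11), F(15)]

x = [F(0)] * 4  # initial approximation (0, 0, 0, 0)
for i in range(4):
    # Entries before i were updated earlier in this sweep; later ones are still 0.
    sigma = sum(A[i][j] * x[j] for j in range(4) if j != i)
    x[i] = (b[i] - sigma) / A[i][i]

print(x)
print([float(value) for value in x])
```

The exact values after one sweep are 3/5, 128/55, −543/550 and 3867/4400, which round to the first row of the table above.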
An example using Python and NumPy
The following numerical procedure simply iterates to produce the solution vector.
import numpy as np

ITERATION_LIMIT = 1000

# initialize the matrix
A = np.array(
    [
        [10.0, -1.0, 2.0, 0.0],
        [-1.0, 11.0, -1.0, 3.0],
        [2.0, -1.0, 10.0, -1.0],
        [0.0, 3.0, -1.0, 8.0],
    ]
)
# initialize the RHS vector
b = np.array([6.0, 25.0, -11.0, 15.0])

print("System of equations:")
for i in range(A.shape[0]):
    row = [f"{A[i,j]:3g}*x{j+1}" for j in range(A.shape[1])]
    print("[{0}] = [{1:3g}]".format(" + ".join(row), b[i]))

# start from the zero vector (np.float_ was removed in NumPy 2.0, so use float)
x = np.zeros_like(b, dtype=float)
for it_count in range(1, ITERATION_LIMIT):
    x_new = np.zeros_like(x, dtype=float)
    print(f"Iteration {it_count}: {x}")
    for i in range(A.shape[0]):
        # entries before i come from the current sweep, entries after i from the previous one
        s1 = np.dot(A[i, :i], x_new[:i])
        s2 = np.dot(A[i, i + 1 :], x[i + 1 :])
        x_new[i] = (b[i] - s1 - s2) / A[i, i]
    # stop once two successive iterates agree to the requested tolerance
    if np.allclose(x, x_new, rtol=1e-8):
        break
    x = x_new

print(f"Solution: {x}")
error = np.dot(A, x) - b
print(f"Error: {error}")
Produces the output:
System of equations:
[ 10*x1 + -1*x2 + 2*x3 + 0*x4] = [ 6]
[ -1*x1 + 11*x2 + -1*x3 + 3*x4] = [ 25]
[ 2*x1 + -1*x2 + 10*x3 + -1*x4] = [-11]
[ 0*x1 + 3*x2 + -1*x3 + 8*x4] = [ 15]
Iteration 1: [ 0. 0. 0. 0.]
Iteration 2: [ 0.6 2.32727273 -0.98727273 0.87886364]
Iteration 3: [ 1.03018182 2.03693802 -1.0144562 0.98434122]
Iteration 4: [ 1.00658504 2.00355502 -1.00252738 0.99835095]
Iteration 5: [ 1.00086098 2.00029825 -1.00030728 0.99984975]
Iteration 6: [ 1.00009128 2.00002134 -1.00003115 0.9999881 ]
Iteration 7: [ 1.00000836 2.00000117 -1.00000275 0.99999922]
Iteration 8: [ 1.00000067 2.00000002 -1.00000021 0.99999996]
Iteration 9: [ 1.00000004 1.99999999 -1.00000001 1. ]
Iteration 10: [ 1. 2. -1. 1.]
Solution: [ 1. 2. -1. 1.]
Error: [ 2.06480930e-08 -1.25551054e-08 3.61417563e-11 0.00000000e+00]
A program to solve an arbitrary number of equations using MATLAB
The following code uses the formula <math display="block">x^{(k+1)}_i = \frac{1}{a_{ii}} \left(b_i - \sum_{j<i}a_{ij}x^{(k+1)}_j - \sum_{j>i}a_{ij}x^{(k)}_j \right),\quad \begin{array}{l} i=1,2,\ldots,n \\ k=0,1,2,\ldots \end{array}</math>
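function x = gauss_seidel(A, b, x, iters)
    % x must be a column vector holding the initial guess
    for i = 1:iters
        for j = 1:size(A,1)
            % subtract the full row sum, then add back the diagonal term
            x(j) = (b(j) - sum(A(j,:)'.*x) + A(j,j)*x(j)) / A(j,j);
        end
    end
end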
See also
- Conjugate gradient method
- Gaussian belief propagation
- Iterative method: Linear systems
- Kaczmarz method (a "row-oriented" method, whereas Gauss-Seidel is "column-oriented." See, for example, this paper.)
- Matrix splitting
- Richardson iteration
Notes
References
External links
- Gauss–Seidel from www.math-linux.com
- Gauss–Seidel From Holistic Numerical Methods Institute
- Gauss Siedel Iteration from www.geocities.com
- The Gauss-Seidel Method
- Bickson
- Matlab code
- C code example