Solving Poisson Equation using
Conjugate Gradient Method
and its implementation
Jongsu Kim
Theoretical Background
From the Basics: Ax = b
Linear Systems
Ax = b   ← the goal of this presentation
What have you learned?
• Direct Method
• Gauss Elimination
• Thomas Algorithm (TDMA) (for tridiagonal matrix only)
• Iterative Method
• Jacobi method
• SOR method
• Conjugate Gradient Method
• Red Black Jacobi Method
Iterative Method
Start with the splitting A = D − E − F (D: the diagonal of A, −E: its strict lower part, −F: its strict upper part) and solve Ax = b iteratively.
Jacobi Method
x^{k+1} = D^{-1}(E + F)x^{k} + D^{-1}b
Gauss-Seidel Method (forward sweep, i = 1, 2, …, n)
x^{k+1} = (D − E)^{-1}F x^{k} + (D − E)^{-1}b
Backward Gauss-Seidel Iteration (backward sweep, i = n, n−1, …, 1)
(D − F)x^{k+1} = E x^{k} + b
Splitting of A matrix
The previous methods share a common form. With
A = D − E − F:
Jacobi:                 x^{k+1} = D^{-1}(E + F)x^{k} + D^{-1}b
Gauss-Seidel:           x^{k+1} = (D − E)^{-1}F x^{k} + (D − E)^{-1}b
Backward Gauss-Seidel:  (D − F)x^{k+1} = E x^{k} + b
All are instances of the splitting A = M − N applied to Ax = b:
M x^{k+1} = N x^{k} + b = (M − A)x^{k} + b
Introducing the SOR (Successive Over-Relaxation) method:
ωA = (D − ωE) − (ωF + (1 − ω)D)
(D − ωE)x^{k+1} = (ωF + (1 − ω)D)x^{k} + ωb
A small numerical example of such a splitting iteration follows.
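To make the splitting concrete, here is a minimal C sketch of the fixed-point iteration M x^{k+1} = N x^{k} + b with the Jacobi choice M = D, N = M − A. The 3×3 SPD matrix, tolerance, and iteration cap are assumptions chosen only for illustration.

/* Minimal sketch: the splitting iteration M x^{k+1} = N x^{k} + b with the
 * Jacobi choice M = D, N = M - A, on an illustrative 3x3 SPD system. */
#include <stdio.h>
#include <math.h>

#define N_DIM 3

int main(void)
{
    double A[N_DIM][N_DIM] = {{ 4.0, -1.0,  0.0},
                              {-1.0,  4.0, -1.0},
                              { 0.0, -1.0,  4.0}};
    double b[N_DIM] = {1.0, 2.0, 3.0};
    double x[N_DIM] = {0.0, 0.0, 0.0};   /* x^{0} */
    double xnew[N_DIM];

    for (int k = 0; k < 1000; k++) {
        /* x^{k+1} = D^{-1}((E + F)x^{k} + b); (E+F)x is minus the off-diagonal part of A times x */
        for (int i = 0; i < N_DIM; i++) {
            double s = b[i];
            for (int j = 0; j < N_DIM; j++)
                if (j != i) s -= A[i][j] * x[j];
            xnew[i] = s / A[i][i];
        }
        /* stop when the update is small */
        double diff = 0.0;
        for (int i = 0; i < N_DIM; i++) {
            diff += fabs(xnew[i] - x[i]);
            x[i] = xnew[i];
        }
        if (diff < 1e-10) break;
    }
    printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
    return 0;
}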
SOR to SSOR
Gauss-Seidel method
(D − E)x^{k+1} = F x^{k} + b,   i.e.   x^{k+1} = (D − E)^{-1}F x^{k} + (D − E)^{-1}b
SOR (Successive Over-Relaxation) method
(D − ωE)x^{k+1} = (ωF + (1 − ω)D)x^{k} + ωb
Backward Gauss-Seidel method
(D − F)x^{k+1} = E x^{k} + b
Backward SOR method
(D − ωF)x^{k+1} = (ωE + (1 − ω)D)x^{k} + ωb
SSOR method
SSOR (Symmetric Successive Over-Relaxation) method: an SOR step followed by a backward SOR step, for a symmetric matrix.
(D − ωE)x^{k+1/2} = (ωF + (1 − ω)D)x^{k} + ωb
(D − ωF)x^{k+1} = (ωE + (1 − ω)D)x^{k+1/2} + ωb
Eliminating x^{k+1/2}:
x^{k+1} = G(ω)x^{k} + f(ω)
G(ω) = (D − ωF)^{-1}(ωE + (1 − ω)D) × (D − ωE)^{-1}(ωF + (1 − ω)D)
f(ω) = ω(D − ωF)^{-1}[ I + (ωE + (1 − ω)D)(D − ωE)^{-1} ] b
Observing that
(ωE + (1 − ω)D)(D − ωE)^{-1} = [ −(D − ωE) + (2 − ω)D ](D − ωE)^{-1} = −I + (2 − ω)D(D − ωE)^{-1}
we get
f(ω) = ω(2 − ω)(D − ωF)^{-1}D(D − ωE)^{-1}b
This form is used as a preconditioner (explained later).
Preconditioned System
We have two forms for an iterative method:
x^{k+1} = G(ω)x^{k} + f(ω)
   Ex)  G_{GS}(A) = I − (D − E)^{-1}A,   G_{JA}(A) = I − D^{-1}A
x^{k+1} = M^{-1}N x^{k} + M^{-1}b
with G = M^{-1}N = M^{-1}(M − A) = I − M^{-1}A  and  f = M^{-1}b, so the fixed point satisfies
(I − G)x = f
Another view:
[ I − (I − M^{-1}A) ]x = f
M^{-1}A x = f
M^{-1}A x = M^{-1}b      Preconditioner M
Preconditioned System
M^{-1}A x = M^{-1}b      with preconditioner M
Jacobi:        M_{JA} = D
Gauss-Seidel:  M_{GS} = D − E
SOR:           M_{SOR} = (1/ω)(D − ωE)
SSOR:          M_{SSOR} = 1/(ω(2 − ω)) (D − ωE) D^{-1} (D − ωF)
M^{-1}A may not be "SPARSE" because of the inverse M^{-1}.
How do we compute w = M^{-1}A v?
Naively: r = A v, then solve M w = r.
But forming A v might be expensive. Can we do better? Using A = M − N:
w = M^{-1}A v = M^{-1}(M − N)v = (I − M^{-1}N)v
  r = N v
  solve M w = r
  w := v − w
N may be sparser than A, so forming N v is less expensive than A v.
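A minimal C sketch of this trick, assuming the Jacobi choice M = D (so N = D − A) and an illustrative 3×3 matrix; any M that is easy to solve with works the same way.

/* Minimal sketch of applying w = M^{-1} A v without forming M^{-1} A,
 * using the splitting A = M - N with M = D (diagonal of A). */
#include <stdio.h>

#define N_DIM 3

int main(void)
{
    double A[N_DIM][N_DIM] = {{ 4.0, -1.0,  0.0},
                              {-1.0,  4.0, -1.0},
                              { 0.0, -1.0,  4.0}};
    double v[N_DIM] = {1.0, 2.0, 3.0};
    double r[N_DIM], w[N_DIM];

    /* r = N v, where N = M - A has zero diagonal here */
    for (int i = 0; i < N_DIM; i++) {
        r[i] = 0.0;
        for (int j = 0; j < N_DIM; j++)
            if (i != j)
                r[i] += (-A[i][j]) * v[j];   /* N[i][j] = -A[i][j] off the diagonal */
    }

    /* solve M w = r (trivial here because M is diagonal) */
    for (int i = 0; i < N_DIM; i++)
        w[i] = r[i] / A[i][i];

    /* w := v - w gives w = (I - M^{-1}N)v = M^{-1} A v */
    for (int i = 0; i < N_DIM; i++) {
        w[i] = v[i] - w[i];
        printf("w[%d] = %f\n", i, w[i]);
    }
    return 0;
}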
Minimization Problem
Forget about Ax = b temporarily and consider a quadratic function f.
Scalar function:  f(x) = (1/2)Ax^2 − bx + c          f'(x) = Ax − b
Matrix form:      f(x) = (1/2)x^T A x − b^T x + c    f'(x) = (1/2)A^T x + (1/2)Ax − b
If the matrix A is symmetric, A^T = A, then
f'(x) = Ax − b
Setting the gradient to zero, we get the linear system we wish to solve.
Our original GOAL!!
[Figure: quadratic forms f(x)]
(a) Quadratic form for a positive-definite matrix
(b) Quadratic form for a negative-definite matrix
(c) Singular (and positive-indefinite) matrix; a line that runs through the bottom of the valley is the set of solutions
(d) Quadratic form for an indefinite matrix: a saddle point
For a symmetric positive-definite matrix, minimizing
f(x) = (1/2)x^T A x − b^T x + c
is equivalent to solving Ax = b: the minimizer is our solution.
Steepest Descent Method
Choose the direction in which f decreases most quickly, which is the direction opposite to f'(x^{(i)}):
−f'(x^{(i)}) = r^{(i)} = b − A x^{(i)}
x^{(1)} = x^{(0)} + α r^{(0)}
To find α, set (d/dα) f(x^{(1)}) = 0:
(d/dα) f(x^{(1)}) = f'(x^{(1)})^T (d/dα) x^{(1)} = f'(x^{(1)})^T r^{(0)} = 0
so f'(x^{(i+1)}) and r^{(i)} are orthogonal!
Since −f'(x^{(i+1)}) = r^{(i+1)}, successive residuals are orthogonal:
r^{(i+1)T} r^{(i)} = 0
and solving for the step length gives
α_{(i)} = (r^{(i)T} r^{(i)}) / (r^{(i)T} A r^{(i)})
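A minimal C sketch of the steepest descent iteration for an illustrative 3×3 SPD system; the matrix, tolerance, and iteration cap are assumptions for illustration only.

/* Minimal sketch of steepest descent: x <- x + alpha*r with
 * alpha = (r^T r)/(r^T A r), on a small dense SPD system. */
#include <stdio.h>
#include <math.h>

#define N 3

int main(void)
{
    double A[N][N] = {{ 4.0, -1.0,  0.0},
                      {-1.0,  4.0, -1.0},
                      { 0.0, -1.0,  4.0}};
    double b[N] = {1.0, 2.0, 3.0};
    double x[N] = {0.0, 0.0, 0.0};
    double r[N], Ar[N];

    for (int iter = 0; iter < 1000; iter++) {
        /* r = b - A x */
        for (int i = 0; i < N; i++) {
            r[i] = b[i];
            for (int j = 0; j < N; j++) r[i] -= A[i][j] * x[j];
        }
        double rr = 0.0;
        for (int i = 0; i < N; i++) rr += r[i] * r[i];
        if (sqrt(rr) < 1e-10) break;          /* converged */

        /* alpha = (r^T r) / (r^T A r) */
        double rAr = 0.0;
        for (int i = 0; i < N; i++) {
            Ar[i] = 0.0;
            for (int j = 0; j < N; j++) Ar[i] += A[i][j] * r[j];
            rAr += r[i] * Ar[i];
        }
        double alpha = rr / rAr;

        /* x = x + alpha * r */
        for (int i = 0; i < N; i++) x[i] += alpha * r[i];
    }
    printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
    return 0;
}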
Conjugate Gradient Method
The steepest descent method does not always converge well.
Worst case of the steepest descent method (figure):
• Solid lines: the worst-case convergence lines
• Dashed line: the steps taken toward convergence
Why doesn't it head straight along that line for fast convergence? → related to the eigenvalue problem
Introducing the Conjugate Gradient method
Conjugate Gradient Method
What is the meaning of "conjugate"?
• Definition: a binomial formed by negating the second term of a binomial
• x + y ← conjugate → x − y
Then, what is the meaning of "conjugate gradient"?
• The steepest descent method often finds itself taking steps in the same direction as earlier steps.
• Wouldn't it be better if we got the step right at every step?
• Here is a step:
  • error e^{(i)} = x^{(i)} − x, residual r^{(i)} = b − A x^{(i)}, and d^{(i)} a set of orthogonal search directions
  • for each step, we choose a point x^{(i+1)} = x^{(i)} + α^{(i)} d^{(i)}
  • to find α, e^{(i+1)} should be orthogonal to d^{(i)}   (note e^{(i+1)} = e^{(i)} + α^{(i)} d^{(i)})
d^{(i)T} e^{(i+1)} = 0
d^{(i)T}(e^{(i)} + α^{(i)} d^{(i)}) = 0
α^{(i)} = −(d^{(i)T} e^{(i)}) / (d^{(i)T} d^{(i)})
But we don't know anything about e^{(i)}; if we knew e^{(i)}, we would already know the answer.
Conjugate Gradient Method
Instead of orthogonality, introduce A-orthogonality:
d^{(i)T} A d^{(j)} = 0   if d^{(i)} and d^{(j)} are A-orthogonal, or conjugate.
Requiring e^{(i+1)} to be A-orthogonal to d^{(i)} is equivalent to finding the minimum point along the search direction d^{(i)}, as in the steepest descent method:
(d/dα) f(x^{(i+1)}) = 0            (α minimizes f when the directional derivative is zero)
f'(x^{(i+1)})^T (d/dα) x^{(i+1)} = 0            (chain rule)
−r^{(i+1)T} d^{(i)} = 0
How can this be the same as the orthogonality used in the steepest descent method? Using f'(x^{(i+1)}) = A x^{(i+1)} − b, r^{(i)} = b − A x^{(i)}, x^{(i+1)} = x^{(i)} + α^{(i)} d^{(i)}, and e^{(i+1)} = x^{(i+1)} − x:
x^{(i+1)T} A^T d^{(i)} − b^T d^{(i)} = 0
x^{(i+1)T} A^T d^{(i)} − x^T A^T d^{(i)} = 0
e^{(i+1)T} A^T d^{(i)} = 0,   and transposing again:   d^{(i)T} A e^{(i+1)} = 0
α^{(i)} = −(d^{(i)T} A e^{(i)}) / (d^{(i)T} A d^{(i)}) = (d^{(i)T} r^{(i)}) / (d^{(i)T} A d^{(i)})      (derivation below)
Conjugate Gradient Method
Start from the A-orthogonality condition and the update:
d^{(i)T} A e^{(i+1)} = 0
x^{(i+1)} = x^{(i)} + α d^{(i)},   so   e^{(i+1)} = (x^{(i)} + α d^{(i)}) − x
d^{(i)T} A e^{(i+1)} = d^{(i)T} A ((x^{(i)} + α d^{(i)}) − x)
d^{(i)T} A x^{(i)} + α d^{(i)T} A d^{(i)} − d^{(i)T} A x = 0
d^{(i)T}(A x^{(i)} − b) = −α d^{(i)T} A d^{(i)}
−d^{(i)T} r^{(i)} = −α d^{(i)T} A d^{(i)}   ⇒   α^{(i)} = (d^{(i)T} r^{(i)}) / (d^{(i)T} A d^{(i)})
How do we find d^{(i)}? Gram-Schmidt process (conjugation): from a set of linearly independent vectors u^{(i)}, build a set of A-orthogonal vectors
d^{(i)} = u^{(i)} + Σ_{k=0}^{i−1} β_{ik} d^{(k)}
with
β_{ij} = −(u^{(i)T} A d^{(j)}) / (d^{(j)T} A d^{(j)})
which follows from requiring d^{(i)T} A d^{(j)} = 0 for i > j.
Conjugate Gradient Method
Overall Algorithm
Initialization:
    i = 0
    r = b − A x
    d = r
    δ_new = r^T r
    δ_0 = δ_new
    ε = 1.0e-6
Iteration check:
    while (i < i_max) and (δ_new > ε^2 δ_0)
Inside the loop:
    q = A d
    α = δ_new / (d^T q)
    x = x + α d
    if i is divisible by 50:
        r = b − A x        (recompute the exact residual to remove accumulated floating-point error)
    else:
        r = r − α q
    end if
    δ_old = δ_new
    δ_new = r^T r
    β = δ_new / δ_old
    d = r + β d
    i = i + 1
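A minimal C sketch of the algorithm above for a small dense SPD system. The 3×3 matrix, i_max, and the restart interval of 50 mirror the pseudocode; a real Poisson solver would use a sparse matrix-vector product instead of the dense one used here.

/* Minimal sketch of the conjugate gradient loop on an illustrative 3x3 SPD system. */
#include <stdio.h>

#define N 3
#define IMAX 1000

static void matvec(const double A[N][N], const double *v, double *out)
{
    for (int i = 0; i < N; i++) {
        out[i] = 0.0;
        for (int j = 0; j < N; j++) out[i] += A[i][j] * v[j];
    }
}

static double dot(const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < N; i++) s += a[i] * b[i];
    return s;
}

int main(void)
{
    double A[N][N] = {{ 4.0, -1.0,  0.0},
                      {-1.0,  4.0, -1.0},
                      { 0.0, -1.0,  4.0}};
    double b[N] = {1.0, 2.0, 3.0};
    double x[N] = {0.0, 0.0, 0.0};
    double r[N], d[N], q[N];
    double eps = 1.0e-6;

    matvec(A, x, q);                       /* r = b - A x, d = r */
    for (int k = 0; k < N; k++) { r[k] = b[k] - q[k]; d[k] = r[k]; }
    double delta_new = dot(r, r);
    double delta0 = delta_new;

    int i = 0;
    while (i < IMAX && delta_new > eps * eps * delta0) {
        matvec(A, d, q);                   /* q = A d */
        double alpha = delta_new / dot(d, q);
        for (int k = 0; k < N; k++) x[k] += alpha * d[k];

        if (i % 50 == 0) {                 /* periodically recompute the true residual */
            matvec(A, x, q);
            for (int k = 0; k < N; k++) r[k] = b[k] - q[k];
        } else {
            for (int k = 0; k < N; k++) r[k] -= alpha * q[k];
        }

        double delta_old = delta_new;
        delta_new = dot(r, r);
        double beta = delta_new / delta_old;
        for (int k = 0; k < N; k++) d[k] = r[k] + beta * d[k];
        i++;
    }
    printf("finished after %d iterations, x = (%f, %f, %f)\n", i, x[0], x[1], x[2]);
    return 0;
}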
Preconditioner Again
M^{-1}A x = M^{-1}b      with preconditioner M
Jacobi:        M_{JA} = D
Gauss-Seidel:  M_{GS} = D − E
SOR:           M_{SOR} = (1/ω)(D − ωE)
SSOR:          M_{SSOR} = 1/(ω(2 − ω)) (D − ωE) D^{-1} (D − ωF)
Incomplete LU decomposition:        A = LU − R      (R: residual error)
Incomplete Cholesky decomposition:  A = LL^T − R
If A is SPD (Symmetric Positive Definite), the two decompositions coincide.
To keep the system sparse, use an incomplete factorization.
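A minimal sketch of how a preconditioner of the form M = LU is applied: solving M w = r means a forward solve L z = r followed by a backward solve U w = z. Dense triangular factors with made-up values are used here only for illustration; in practice L and U come from an ILU(0)/IC(0) factorization stored in a sparse format.

/* Minimal sketch: apply M = L U to a residual r by two triangular solves. */
#include <stdio.h>

#define N 3

int main(void)
{
    /* Unit lower triangular L and upper triangular U (illustrative values only). */
    double L[N][N] = {{1.0,  0.0,  0.0},
                      {0.5,  1.0,  0.0},
                      {0.0,  0.25, 1.0}};
    double U[N][N] = {{4.0, -1.0,  0.0},
                      {0.0,  3.5, -1.0},
                      {0.0,  0.0,  3.75}};
    double r[N] = {1.0, 2.0, 3.0};
    double z[N], w[N];

    /* Forward solve L z = r */
    for (int i = 0; i < N; i++) {
        z[i] = r[i];
        for (int j = 0; j < i; j++) z[i] -= L[i][j] * z[j];
        z[i] /= L[i][i];
    }

    /* Backward solve U w = z */
    for (int i = N - 1; i >= 0; i--) {
        w[i] = z[i];
        for (int j = i + 1; j < N; j++) w[i] -= U[i][j] * w[j];
        w[i] /= U[i][i];
    }

    for (int i = 0; i < N; i++) printf("w[%d] = %f\n", i, w[i]);
    return 0;
}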
Implementation
Implementation Issue
• For the 3D case, the matrix A would be huge: on a 128 × 128 × 128 grid, a dense A has (128 × 128 × 128) × (128 × 128 × 128) entries, about 32 TB in double precision (for 2D it takes only about 2 GB).
• However, almost all entries of A are zero for the Poisson equation ⇒ Sparse Matrix!
How to represent a sparse matrix?
• Simplest thing: store each nonzero value together with its row and column index (Coordinate Format, COO); a small example follows below.
• Drawback: too much duplication, since the row and column indices are repeated for every nonzero.
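A minimal sketch of the COO format in C, using a tiny 4×4 tridiagonal matrix chosen only for illustration; note how the row array repeats the same index for every nonzero in a row.

/* Minimal sketch of the COO (coordinate) sparse format. */
#include <stdio.h>

int main(void)
{
    /* Nonzeros of
     *   [ 4 -1  0  0 ]
     *   [-1  4 -1  0 ]
     *   [ 0 -1  4 -1 ]
     *   [ 0  0 -1  4 ]   stored in row-major order. */
    int    row[10] = {0, 0, 1, 1, 1, 2, 2, 2, 3, 3};
    int    col[10] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
    double val[10] = {4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0};
    int    nnz = 10;

    for (int k = 0; k < nnz; k++)
        printf("A(%d,%d) = %g\n", row[k], col[k], val[k]);
    return 0;
}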
Sparse Matrix Format
Compressed Sparse Row (CSR)
• Stores only the non-zero values
• Uses three (or four) arrays: values, column indices, and row pointers
• Not easy to build algorithms such as an ILU or IC preconditioner on top of it
A small CSR example with a matrix-vector product follows.
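A minimal sketch of the CSR format and a matrix-vector product y = A x, reusing the same illustrative 4×4 matrix as in the COO example. The three arrays are val (nonzero values), col (their column indices), and rowptr (the index in val/col where each row starts).

/* Minimal sketch of CSR storage and sparse matrix-vector product y = A x. */
#include <stdio.h>

#define NROW 4

int main(void)
{
    double val[10]          = {4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0};
    int    col[10]          = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
    int    rowptr[NROW + 1] = {0, 2, 5, 8, 10};   /* row i occupies val[rowptr[i] .. rowptr[i+1]-1] */

    double x[NROW] = {1.0, 2.0, 3.0, 4.0};
    double y[NROW];

    /* y = A x : only the stored nonzeros are touched */
    for (int i = 0; i < NROW; i++) {
        y[i] = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            y[i] += val[k] * x[col[k]];
    }

    for (int i = 0; i < NROW; i++) printf("y[%d] = %g\n", i, y[i]);
    return 0;
}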
Use MKL (Intel Math Kernel Library)
MKL?
• a library of optimized math routines for science, engineering, and financial
applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse
solvers, fast Fourier transforms, and vector math. The routines in MKL are
hand-optimized specifically for Intel processors.
• For my problems, I usually use BLAS and the fast Fourier transforms (for a Poisson equation solver with Neumann, periodic, or Dirichlet BCs).
BLAS?
• A specified set of low-level subroutines that perform common linear algebra operations; widely used, even in MATLAB!
• Typically used for vector and matrix multiplication, dot products, and similar operations (a small example follows after this list).
• Level 1 : vector – vector operation
• Level 2 : matrix – vector operation
• Level 3 : matrix – matrix operation
• Parallelized internally by Intel. Just turn on the option.
• Reference manual : https://software.intel.com/en-us/mkl_11.1_ref
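A minimal sketch of calling two Level-1 BLAS routines through MKL's CBLAS interface, cblas_ddot (dot product) and cblas_daxpy (y := a*x + y), which are exactly the kinds of kernels a CG loop needs. It assumes MKL is installed and the program is compiled and linked with the MKL flags shown on the next slide.

/* Minimal sketch of Level-1 BLAS calls via MKL's CBLAS interface. */
#include <stdio.h>
#include <mkl.h>

int main(void)
{
    double x[3] = {1.0, 2.0, 3.0};
    double y[3] = {4.0, 5.0, 6.0};
    double alpha = 2.0;

    /* dot = x^T y */
    double dot = cblas_ddot(3, x, 1, y, 1);

    /* y := alpha * x + y */
    cblas_daxpy(3, alpha, x, 1, y, 1);

    printf("x.y = %f,  y = (%f, %f, %f)\n", dot, y[0], y[1], y[2]);
    return 0;
}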
How to use Library
For MKL
• For compilation (when compiling the .c files in your makefile):
  • -i8 -openmp -I$(MKLROOT)/include
• For linking (when creating the executable with the -o option):
  • -L$(MKLROOT)/lib/intel64 -lmkl_core -lmkl_intel_thread -lpthread -lm
  • (the exact library list depends on your setup; with -i8 and mkl_intel_thread you typically also need an interface library such as -lmkl_intel_ilp64 and the OpenMP runtime -liomp5; check the link-line advisor)
• https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
A minimal Makefile sketch follows.
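A minimal Makefile sketch putting the flags above together. It assumes the Intel compiler (icc), that MKLROOT is set by the MKL environment script, and a hypothetical single source file cg_poisson.c; adjust the names and the library list to your own setup (e.g. with the link-line advisor).

# Minimal Makefile sketch (assumptions: icc, MKLROOT set, hypothetical cg_poisson.c).
# Note: recipe lines must start with a TAB character.
CC      = icc
CFLAGS  = -i8 -openmp -I$(MKLROOT)/include
LDLIBS  = -L$(MKLROOT)/lib/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core \
          -liomp5 -lpthread -lm

cg_poisson: cg_poisson.o
	$(CC) -o $@ $^ $(LDLIBS)

cg_poisson.o: cg_poisson.c
	$(CC) $(CFLAGS) -c $<

clean:
	rm -f cg_poisson cg_poisson.o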
Library Linking Process
• Compile
  • The -I option indicates where the header files (.h) are, specifying the include path.
• Linking
  • The -L option indicates where the library files (.lib, .dll, .a, .so) are, specifying the library search path.
  • The -l option gives the library name.
Reference
• Shewchuk, Jonathan Richard. "An introduction to the conjugate gradient
method without the agonizing pain." (1994).
• Deepak Chandan, “Using Sparse Matrix and Solver Routines from Intel
MKL”, Scinet User Group Meeting, (2013)
• Saad, Yousef. Iterative Methods for Sparse Linear Systems. SIAM, 2003.
• Akhunov, R. R., et al. "Optimization of the ILU(0) factorization algorithm with
the use of compressed sparse row format." Zapiski Nauchnykh Seminarov POMI
405 (2012): 40-53.
