Sherman–Morrison formula
In mathematics, in particular linear algebra, the Sherman–Morrison formula,[1][2][3] named after Jack Sherman and Winifred J. Morrison, computes the inverse of the sum of an invertible matrix $A$ and the outer product $uv^{T}$ of vectors $u$ and $v$. The Sherman–Morrison formula is a special case of the Woodbury formula. Though named after Sherman and Morrison, it appeared in earlier publications.[4]
Statement
Suppose $A \in \mathbb{R}^{n \times n}$ is an invertible square matrix and $u, v \in \mathbb{R}^{n}$ are column vectors. Then $A + uv^{T}$ is invertible if and only if $1 + v^{T}A^{-1}u \neq 0$. In this case,

$$(A + uv^{T})^{-1} = A^{-1} - \frac{A^{-1}uv^{T}A^{-1}}{1 + v^{T}A^{-1}u}.$$
Here, $uv^{T}$ is the outer product of the two vectors $u$ and $v$. The general form shown here is the one published by Bartlett.[5]
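As a quick numerical sanity check (a NumPy sketch; the matrix and vectors below are arbitrary illustrative choices), the formula can be compared against a direct inversion:

```python
import numpy as np

# Arbitrary invertible matrix and update vectors (illustrative only)
A = np.array([[4.0, 1.0], [2.0, 3.0]])
u = np.array([[1.0], [2.0]])
v = np.array([[3.0], [1.0]])

A_inv = np.linalg.inv(A)
denom = 1.0 + (v.T @ A_inv @ u).item()   # 1 + v^T A^{-1} u, must be nonzero

# Sherman–Morrison: (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / denom
sm_inv = A_inv - (A_inv @ u @ v.T @ A_inv) / denom

direct_inv = np.linalg.inv(A + u @ v.T)
print(np.allclose(sm_inv, direct_inv))  # True
```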
Proof
($\Leftarrow$) To prove that the backward direction ($1 + v^{T}A^{-1}u \neq 0 \Rightarrow A + uv^{T}$ is invertible with inverse given as above) is true, we verify the properties of the inverse. A matrix $Y$ (in this case the right-hand side of the Sherman–Morrison formula) is the inverse of a matrix $X$ (in this case $A + uv^{T}$) if and only if $XY = I$.
We first verify that the right-hand side ($Y$) satisfies $XY = I$:

$$\begin{aligned}
XY &= (A + uv^{T})\left(A^{-1} - \frac{A^{-1}uv^{T}A^{-1}}{1 + v^{T}A^{-1}u}\right)\\
&= AA^{-1} + uv^{T}A^{-1} - \frac{AA^{-1}uv^{T}A^{-1} + uv^{T}A^{-1}uv^{T}A^{-1}}{1 + v^{T}A^{-1}u}\\
&= I + uv^{T}A^{-1} - \frac{uv^{T}A^{-1} + uv^{T}A^{-1}uv^{T}A^{-1}}{1 + v^{T}A^{-1}u}\\
&= I + uv^{T}A^{-1} - \frac{u(1 + v^{T}A^{-1}u)v^{T}A^{-1}}{1 + v^{T}A^{-1}u}\\
&= I + uv^{T}A^{-1} - uv^{T}A^{-1} && 1 + v^{T}A^{-1}u \text{ is a scalar}\\
&= I
\end{aligned}$$
($\Rightarrow$) Conversely, if $1 + v^{T}A^{-1}u = 0$, then letting $x = A^{-1}u$, one has $(A + uv^{T})x = u(1 + v^{T}A^{-1}u) = 0$; thus $A + uv^{T}$ has a non-trivial kernel and is therefore not invertible.
Application
If the inverse of $A$ is already known, the formula provides a numerically cheap way to compute the inverse of $A$ corrected by the matrix $uv^{T}$ (depending on the point of view, the correction may be seen as a perturbation or as a rank-1 update). The computation is relatively cheap because the inverse of $A + uv^{T}$ does not have to be computed from scratch (which in general is expensive), but can be computed by correcting (or perturbing) $A^{-1}$.

Using unit columns (columns from the identity matrix) for $u$ or $v$, individual columns or rows of $A$ may be manipulated and a correspondingly updated inverse computed relatively cheaply in this way.[6] In the general case, where $A^{-1}$ is an $n \times n$ matrix and $u$ and $v$ are arbitrary vectors of dimension $n$, the whole matrix is updated[5] and the computation takes $3n^{2}$ scalar multiplications.[7] If $u$ is a unit column, the computation takes only $2n^{2}$ scalar multiplications; the same holds if $v$ is a unit column. If both $u$ and $v$ are unit columns, the computation takes only $n^{2}$ scalar multiplications.
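The rank-1 update described above can be sketched as follows (a minimal NumPy implementation; the function name `sherman_morrison_update` and the tolerance on the denominator are our own illustrative choices):

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Return (A + u v^T)^{-1}, given A_inv = A^{-1}, in O(n^2) operations."""
    Au = A_inv @ u                 # A^{-1} u
    vA = v @ A_inv                 # v^T A^{-1}
    denom = 1.0 + v @ Au           # the scalar 1 + v^T A^{-1} u
    if abs(denom) < 1e-12:         # then A + u v^T is (numerically) singular
        raise np.linalg.LinAlgError("1 + v^T A^{-1} u is zero")
    return A_inv - np.outer(Au, vA) / denom

# Example: update a known inverse after a rank-1 change
A = np.diag([2.0, 3.0, 4.0])
u = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 1.0, 1.0])

updated = sherman_morrison_update(np.linalg.inv(A), u, v)
print(np.allclose(updated, np.linalg.inv(A + np.outer(u, v))))  # True
```

The point of the function is that it never inverts an $n \times n$ matrix: it only forms matrix–vector products and one outer product, hence the $O(n^{2})$ cost quoted above.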
This formula also has application in theoretical physics. Namely, in quantum field theory, one uses this formula to calculate the propagator of a spin-1 field.[8] The inverse propagator (as it appears in the Lagrangian) has the form $A + uv^{T}$. One uses the Sherman–Morrison formula to calculate the inverse (satisfying certain time-ordering boundary conditions) of the inverse propagator, or simply the (Feynman) propagator, which is needed to perform any perturbative calculation[9] involving the spin-1 field.
Alternative verification
Following is an alternative verification of the Sherman–Morrison formula using the easily verifiable identity

$$(I + wv^{T})^{-1} = I - \frac{wv^{T}}{1 + v^{T}w}.$$

Let

$$u = Aw, \quad \text{and} \quad A + uv^{T} = A\left(I + wv^{T}\right),$$

then

$$(A + uv^{T})^{-1} = (I + wv^{T})^{-1}A^{-1} = \left(I - \frac{wv^{T}}{1 + v^{T}w}\right)A^{-1}.$$

Substituting $w = A^{-1}u$ gives

$$(A + uv^{T})^{-1} = \left(I - \frac{A^{-1}uv^{T}}{1 + v^{T}A^{-1}u}\right)A^{-1} = A^{-1} - \frac{A^{-1}uv^{T}A^{-1}}{1 + v^{T}A^{-1}u}.$$
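The identity this verification starts from can itself be checked numerically (a small NumPy sketch with arbitrarily chosen vectors):

```python
import numpy as np

# Check the identity (I + w v^T)^{-1} = I - w v^T / (1 + v^T w)
w = np.array([1.0, 2.0, -1.0])
v = np.array([0.5, 1.0, 2.0])

lhs = np.linalg.inv(np.eye(3) + np.outer(w, v))
rhs = np.eye(3) - np.outer(w, v) / (1.0 + v @ w)
print(np.allclose(lhs, rhs))  # True
```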
Generalization (Woodbury matrix identity)
Given a square invertible $n \times n$ matrix $A$, an $n \times k$ matrix $U$, and a $k \times n$ matrix $V$, let $B$ be an $n \times n$ matrix such that $B = A + UV$. Then, assuming $\left(I_{k} + VA^{-1}U\right)$ is invertible, we have

$$B^{-1} = A^{-1} - A^{-1}U\left(I_{k} + VA^{-1}U\right)^{-1}VA^{-1}.$$
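A direct numerical check of the Woodbury identity (a NumPy sketch; the matrices below, with $n = 4$ and $k = 2$, are arbitrary illustrative choices) shows why it is useful when $k \ll n$: only a $k \times k$ matrix needs to be inverted.

```python
import numpy as np

n, k = 4, 2
A = np.diag([2.0, 3.0, 4.0, 5.0])                 # invertible test matrix
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])

# Woodbury: B^{-1} = A^{-1} - A^{-1} U (I_k + V A^{-1} U)^{-1} V A^{-1}
A_inv = np.linalg.inv(A)
core = np.linalg.inv(np.eye(k) + V @ A_inv @ U)   # only a k-by-k inversion
B_inv = A_inv - A_inv @ U @ core @ V @ A_inv

print(np.allclose(B_inv, np.linalg.inv(A + U @ V)))  # True
```

Setting $k = 1$, with $U = u$ and $V = v^{T}$, reduces the $k \times k$ inverse to division by the scalar $1 + v^{T}A^{-1}u$ and recovers the Sherman–Morrison formula.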
See also
- The matrix determinant lemma performs a rank-1 update to a determinant.
- Woodbury matrix identity
- Quasi-Newton method
- Binomial inverse theorem
- Bunch–Nielsen–Sorensen formula
References
^ Sherman, Jack; Morrison, Winifred J. (1949). "Adjustment of an Inverse Matrix Corresponding to Changes in the Elements of a Given Column or a Given Row of the Original Matrix (abstract)". Annals of Mathematical Statistics. 20: 621. doi:10.1214/aoms/1177729959.
^ Sherman, Jack; Morrison, Winifred J. (1950). "Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given Matrix". Annals of Mathematical Statistics. 21 (1): 124–127. doi:10.1214/aoms/1177729893. MR 0035118. Zbl 0037.00901.
^ Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (2007), "Section 2.7.1 Sherman–Morrison Formula", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8
^ Hager, William W. (1989). "Updating the inverse of a matrix". SIAM Review. 31 (2): 221–239. doi:10.1137/1031049. JSTOR 2030425. MR 0997457.
^ a b Bartlett, Maurice S. (1951). "An Inverse Matrix Adjustment Arising in Discriminant Analysis". Annals of Mathematical Statistics. 22 (1): 107–111. doi:10.1214/aoms/1177729698. MR 0040068. Zbl 0042.38203.
^ Langville, Amy N.; Meyer, Carl D. (2006). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press. p. 156.
^ Update of the inverse matrix by the Sherman–Morrison formula
^ Propagator#Spin 1
^ [1]
External links
- Weisstein, Eric W. "Sherman–Morrison formula". MathWorld.