Sherman–Morrison formula






Formula computing the inverse of the sum of a matrix with the outer product of two vectors

In mathematics, in particular linear algebra, the Sherman–Morrison formula,[1][2][3] named after Jack Sherman and Winifred J. Morrison, computes the inverse of the sum of an invertible matrix A and the outer product uv^T of vectors u and v. The Sherman–Morrison formula is a special case of the Woodbury formula. Though named after Sherman and Morrison, it appeared already in earlier publications.[4]




Contents

  • 1 Statement
  • 2 Proof
  • 3 Application
  • 4 Alternative verification
  • 5 Generalization (Woodbury matrix identity)
  • 6 See also
  • 7 References
  • 8 External links





Statement


Suppose A ∈ ℝ^{n×n} is an invertible square matrix and u, v ∈ ℝ^n are column vectors. Then A + uv^T is invertible if and only if 1 + v^T A^{-1} u ≠ 0. In this case,


(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}.

Here, uv^T is the outer product of the two vectors u and v. The general form shown here is the one published by Bartlett.[5]
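The statement is easy to check numerically. The following NumPy sketch (the matrix, vectors, and tolerance are arbitrary illustrative choices) compares the Sherman–Morrison update against a direct inversion of A + uv^T:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)  # shifted toward the diagonal, so invertible
u = rng.standard_normal(n)
v = rng.standard_normal(n)

A_inv = np.linalg.inv(A)
denom = 1.0 + v @ A_inv @ u      # the scalar 1 + v^T A^{-1} u
assert abs(denom) > 1e-12        # invertibility condition from the statement

# Sherman–Morrison: (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / denom
sm_inv = A_inv - np.outer(A_inv @ u, v @ A_inv) / denom

# Direct inversion of the updated matrix, for comparison
direct_inv = np.linalg.inv(A + np.outer(u, v))
print(np.allclose(sm_inv, direct_inv))
```

The two results agree to machine precision whenever the condition 1 + v^T A^{-1} u ≠ 0 holds.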



Proof


(⇐) To prove the backward direction (1 + v^T A^{-1} u ≠ 0 implies A + uv^T is invertible with the inverse given above), we verify the properties of the inverse. A matrix Y (in this case the right-hand side of the Sherman–Morrison formula) is the inverse of a square matrix X (in this case A + uv^T) if and only if XY = I.


We first verify that the right-hand side Y satisfies XY = I:


\begin{aligned}
XY &= (A + uv^T)\left(A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}\right) \\
&= AA^{-1} + uv^T A^{-1} - \frac{AA^{-1} u v^T A^{-1} + uv^T A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u} \\
&= I + uv^T A^{-1} - \frac{uv^T A^{-1} + uv^T A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u} \\
&= I + uv^T A^{-1} - \frac{u(1 + v^T A^{-1} u)v^T A^{-1}}{1 + v^T A^{-1} u} && (1 + v^T A^{-1} u \text{ is a scalar}) \\
&= I + uv^T A^{-1} - uv^T A^{-1} \\
&= I
\end{aligned}

(⇒) Conversely, if 1 + v^T A^{-1} u = 0, then letting x = A^{-1} u, one has (A + uv^T)x = u + u(v^T A^{-1} u) = u(1 + v^T A^{-1} u) = 0. Note that x ≠ 0: if u were zero, then 1 + v^T A^{-1} u would equal 1 ≠ 0. Thus A + uv^T has a non-trivial kernel and is therefore not invertible.



Application


If the inverse of A is already known, the formula provides a numerically cheap way to compute the inverse of A corrected by the matrix uv^T (depending on the point of view, the correction may be seen as a perturbation or as a rank-1 update). The computation is relatively cheap because the inverse of A + uv^T does not have to be computed from scratch (which in general is expensive), but can be computed by correcting (or perturbing) A^{-1}.


Using unit columns (columns from the identity matrix) for u or v, individual columns or rows of A may be manipulated and a correspondingly updated inverse computed relatively cheaply in this way.[6] In the general case, where A^{-1} is an n-by-n matrix and u and v are arbitrary vectors of dimension n, the whole matrix is updated[5] and the computation takes 3n^2 scalar multiplications.[7] If u is a unit column, the computation takes only 2n^2 scalar multiplications, and likewise if v is a unit column. If both u and v are unit columns, the computation takes only n^2 scalar multiplications.
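The update just described can be sketched as a small helper (the function name and the singularity tolerance are illustrative choices, not part of any standard API); it costs O(n^2) operations given A^{-1}, versus O(n^3) for inverting A + uv^T from scratch:

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A_inv = A^{-1}, in O(n^2) operations."""
    Au = A_inv @ u                    # n^2 multiplications
    vA = v @ A_inv                    # n^2 multiplications
    denom = 1.0 + v @ Au              # n multiplications
    if abs(denom) < 1e-12:            # 1 + v^T A^{-1} u = 0: update not invertible
        raise np.linalg.LinAlgError("A + uv^T is (numerically) singular")
    return A_inv - np.outer(Au, vA) / denom   # n^2 multiplications

# Example: u = e_1 (a unit column), so u v^T adds v^T to the first row of A;
# in this special case only about 2n^2 multiplications are really needed.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
A_inv = np.linalg.inv(A)
u = np.array([1.0, 0.0])
v = np.array([0.5, 0.5])
updated = sherman_morrison_update(A_inv, u, v)
assert np.allclose(updated, np.linalg.inv(A + np.outer(u, v)))
```

The three n^2-cost lines account for the 3n^2 scalar multiplications cited above for the general case.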


This formula also has application in theoretical physics. Namely, in quantum field theory, one uses this formula to calculate the propagator of a spin-1 field.[8] The inverse propagator (as it appears in the Lagrangian) has the form A + uv^T. One uses the Sherman–Morrison formula to calculate the inverse (satisfying certain time-ordering boundary conditions) of the inverse propagator, that is, the (Feynman) propagator, which is needed to perform any perturbative calculation[9] involving the spin-1 field.



Alternative verification


The following is an alternative verification of the Sherman–Morrison formula, using the easily verified identity

(I + wv^T)^{-1} = I - \frac{wv^T}{1 + v^T w}.

Let

u = Aw, \quad \text{so that} \quad A + uv^T = A(I + wv^T).

Then

(A + uv^T)^{-1} = (I + wv^T)^{-1} A^{-1} = \left(I - \frac{wv^T}{1 + v^T w}\right) A^{-1}.

Substituting w = A^{-1} u gives

(A + uv^T)^{-1} = \left(I - \frac{A^{-1} u v^T}{1 + v^T A^{-1} u}\right) A^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}.


Generalization (Woodbury matrix identity)


Given a square invertible n×n matrix A, an n×k matrix U, and a k×n matrix V, let B be an n×n matrix such that B = A + UV. Then, assuming I_k + V A^{-1} U is invertible, we have


B^{-1} = A^{-1} - A^{-1} U \left(I_k + V A^{-1} U\right)^{-1} V A^{-1}.
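As in the rank-1 case, the Woodbury identity is easy to verify numerically. A NumPy sketch (dimensions and random seed are arbitrary) with a rank-k correction, where only a k-by-k system needs to be inverted:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)   # invertible n-by-n
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))

A_inv = np.linalg.inv(A)
# Only a k-by-k matrix is inverted here; cheap when k << n
inner = np.linalg.inv(np.eye(k) + V @ A_inv @ U)
B_inv = A_inv - A_inv @ U @ inner @ V @ A_inv

print(np.allclose(B_inv, np.linalg.inv(A + U @ V)))
```

Taking k = 1 with U = u and V = v^T recovers the Sherman–Morrison formula, since I_1 + v^T A^{-1} u is then a scalar.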


See also



  • The matrix determinant lemma performs a rank-1 update to a determinant.

  • Woodbury matrix identity

  • Quasi-Newton method

  • Binomial inverse theorem

  • Bunch–Nielsen–Sorensen formula



References





  1. ^ Sherman, Jack; Morrison, Winifred J. (1949). "Adjustment of an Inverse Matrix Corresponding to Changes in the Elements of a Given Column or a Given Row of the Original Matrix (abstract)". Annals of Mathematical Statistics. 20: 621. doi:10.1214/aoms/1177729959.


  2. ^ Sherman, Jack; Morrison, Winifred J. (1950). "Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given Matrix". Annals of Mathematical Statistics. 21 (1): 124–127. doi:10.1214/aoms/1177729893. MR 0035118. Zbl 0037.00901.


  3. ^ Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (2007), "Section 2.7.1 Sherman–Morrison Formula", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8


  4. ^ Hager, William W. (1989). "Updating the inverse of a matrix". SIAM Review. 31 (2): 221–239. doi:10.1137/1031049. JSTOR 2030425. MR 0997457.


  5. ^ ab Bartlett, Maurice S. (1951). "An Inverse Matrix Adjustment Arising in Discriminant Analysis". Annals of Mathematical Statistics. 22 (1): 107–111. doi:10.1214/aoms/1177729698. MR 0040068. Zbl 0042.38203.


  6. ^ Langville, Amy N.; Meyer, Carl D. (2006). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press. p. 156.


  7. ^ Update of the inverse matrix by the Sherman–Morrison formula


  8. ^ Propagator#Spin 1


  9. ^ [1]




External links


  • Weisstein, Eric W. "Sherman–Morrison formula". MathWorld.


