Spectral Clustering

2019-09-18 08:00 CST

2019-12-22 15:33 CST

理论基础

  • 邻接矩阵 $E$:$\omega_{ij}$ 为 $v_i$ 与 $v_j$ 的边权
  • $D$: $di=\sum{j=1}^n\omega_i$

unnormalized graph Laplacian matrix

  • $L=D-E$
  • $\forall f\in\mathbb{R}^n,f^TLf=\frac{1}{2}\sum{i,j=1}^n\omega{ij}(f_i-f_j)^2$
  • $L$ is symmetric and positive semi-definite
  • The smallest eigenvalue of $L$ is $0$, the corresponding eigenvector is the constant one vector $\mathbb{1}$
  • $L$ has $n$ non-negative, real-valued eigenvalues $0=\lambda_1\leq\lambda_2\leq\cdots\leq\lambda_n$
  • self-edges in a graph do no change $L$
  • the multiplicity $k$ of the eigenvalue $0$ of $L$ equals the number of connected components in the graph

normalized graph Laplacians

  • $L_{\text{sym}} = D^{-\frac{1}{2}}LD^{-\frac{1}{2}}=I-D^{-\frac{1}{2}}WD^{-\frac{1}{2}}$
  • $L_{\text{rw}} = D^{-1}L=I-D^{-1}W$
  • $\forall f\in\mathbb{R}^n,f^TL{\text{sym}}f=\frac{1}{2}\sum{i,j=1}^n\omega_{ij}(\frac{f_i}{\sqrt{d_i}}-\frac{f_j}{\sqrt{d_j}})^2$
  • $\lambda$ is an eigenvalue of $L_\text{sym}$ with eigenvector $u$ iff
    • $\lambda$ is an eigenvalue of $L_{\text{sym}}$ with eigenvector $\omega=D^{\frac{1}{2}}u$
    • Or $\lambda$ and $u$ solve the generalized eigenproblem $Lu=\lambda Du$
  • $L{\text{sym}}$ and $L{\text{rw}}$ are positive semi-definite and have $n$ non-negative real-valued eigenvalues $0=\lambda_1\leq\cdots\leq\lambda_n$

算法

Unnormalized spectral clustering

  • Construct a similarity graph
  • Compute first $k$ eigenvalues $u1,\cdots,u_k$ of $L$
    • $U\in\mathbb{R}^{n\times k}$, $U$ contains $u_1,\cdots, u_k$ as column
  • Cluster point $(y_i)$ of $U$ in $\mathbb{R}^k$ with $k$-means into cluster $C_1,\cdots,C_k$
  • $A_i={j|y_j\in C_i}$

Normalized spectral clustering according to Shi and Malik

  • Construct a similarity graph
  • Compute the first $k$ generalized eigenvectors $u_1,\cdots,u_k$ of problem $Lu=\lambda Du$
    • $U\in\mathbb{R}^{n\times k}$, $U$ contains $u_1,\cdots, u_k$ as column
  • Cluster point $(y_i)$ of $U$ in $\mathbb{R}^k$ with $k$-means into cluster $C_1,\cdots,C_k$
  • $A_i={j|y_j\in C_i}$

Normalized spectral clustering according to Ng et al.

  • Construct a similarity graph
  • Compute first $k$ eigenvalues $u1,\cdots,uk$ of $L{\text{sym}}$
  • Form $T\in\mathbb{R}^{n\times k}$ from $U$ by normalizing the rows to norm 1
    • $t{ij}=\frac{u{ij}}{(\sumku{ik}^2)^\frac{1}{2}}$
  • Cluster point $(y_i)$ of $T$ in $\mathbb{R}^k$ with $k$-means into cluster $C_1,\cdots,C_k$
  • $A_i={j|y_j\in C_i}$