基于高斯混合模型的分布拟合与参数估计:高维复数域的详细推导
迪丽瓦拉
2024-03-01 00:52:36
0

文章目录

    • 基于高斯混合模型的分布拟合
      • 考虑Ck=σk2I\boldsymbol{C}_k=\sigma^2_k \boldsymbol{I}Ck​=σk2​I
      • 考虑一般的Ck\boldsymbol{C}_kCk​
    • 小结:转化为参数估计问题
      • 如果考虑Ck=σk2I\boldsymbol{C}_k=\sigma^2_k \boldsymbol{I}Ck​=σk2​I
      • 如果考虑一般的Ck\boldsymbol{C}_kCk​

基于高斯混合模型的分布拟合

实际中,我们不知道一些样本对应的真实分布。比如我们手上有NpN_pNp​个样本{hn}n=1Np\{ \boldsymbol h_n \}_{n=1}^{N_p}{hn​}n=1Np​​,这些样本都对应到某个特定的label,比如特定的数字或者类。令真实的分布为f(h)f(\boldsymbol h)f(h),我们可以通过经验产生大量样本来近似该分布。定义fff的近似为p(h)p(\boldsymbol h)p(h),并且限制分布ppp为Gaussian Mixture, i.e.
p(h)=∑k=1KαkCN(h;μk,Ck)(20)p \left (\boldsymbol h \right) = \sum_{k=1}^K \alpha_k \mathcal{CN}(\boldsymbol h; \boldsymbol \mu_k, \boldsymbol{C}_k) \tag{20} p(h)=k=1∑K​αk​CN(h;μk​,Ck​)(20)

其中h∈CL×1\boldsymbol h \in \mathbb{C}^{L \times 1}h∈CL×1,μk∈CL×1\boldsymbol{\mu_k} \in \mathbb{C}^{L \times 1}μk​∈CL×1,Ck∈CL×L\boldsymbol{C}_k \in \mathbb{C}^{L \times L}Ck​∈CL×L。假设我们固定第ppp个标识label,然后经验生成NpN_pNp​个样本,记为{hn}n=1Np\{ \boldsymbol h_n \}_{n=1}^{N_p}{hn​}n=1Np​​,我们要最大化似然函数:
max⁡{αk,μk,Ck}ln⁡(∏n=1Np∑k=1KαkCN(hn;μk,Ck))=∑n=1Npln⁡(∑k=1KαkCN(hn;μk,Ck))(21)\begin{aligned} \max_{\{ \alpha_k, \boldsymbol{\mu}_k, \boldsymbol{C}_k \}} & \ln \left ( \prod_{n=1}^{N_p} \sum_{k=1}^K \alpha_k \mathcal{CN}(\boldsymbol h_n; \boldsymbol \mu_k, \boldsymbol{C}_k) \right ) \\ = & \sum_{n=1}^{N_p} \ln \left( \sum_{k=1}^K \alpha_k \mathcal{CN}(\boldsymbol h_n; \boldsymbol \mu_k, \boldsymbol{C}_k) \right )\\ \end{aligned} \tag{21} {αk​,μk​,Ck​}max​=​ln⎝⎛​n=1∏Np​​k=1∑K​αk​CN(hn​;μk​,Ck​)⎠⎞​n=1∑Np​​ln(k=1∑K​αk​CN(hn​;μk​,Ck​))​(21)

我们采用EM算法进行相关参数的估计。正式计算之前,先引入一个隐变量z∈{1,⋯,K}z \in \{1,\cdots,K\}z∈{1,⋯,K}表征GMM中的第几个模(mode)。当K→∞K \rightarrow \inftyK→∞时,我们认为GMM能够很好地逼近原始分布。那么反过来,等效地来看,引入的隐变量zzz可以解释为:先根据隐变量zzz的先验p(z=k)p(z=k)p(z=k)确定第kkk个mode,再基于似然p(hn∣zn=k)p(\boldsymbol h_n|z^n=k)p(hn​∣zn=k),由第kkk个mode来产生相应的样本生成(sample)。根据相关概率公式,我们进一步描述其概率意义:
k-th mode: p(hn,zn=k)=p(zn=k)p(hn∣zn=k)all modes: p(hn)=∑zp(hn,zn=k)=∑zp(zn=k)p(hn∣zn=k)posterior of k-th mode: p(zn=k∣hn)=p(zn=k)p(hn∣zn=k)p(hn)=p(zn=k)p(hn∣zn=k)∑zp(zn=k)p(hn∣zn=k)\begin{aligned} \text{k-th mode: } & p(\boldsymbol h_n,z^n=k) = p(z^n=k) p(\boldsymbol h_n|z^n=k) \\ \text{all modes: } & p(\boldsymbol h_n) = \sum_z p(\boldsymbol h_n,z^n=k) = \sum_z p(z^n=k) p(\boldsymbol h_n|z^n=k) \\ \text{posterior of k-th mode: }& p(z^n=k|\boldsymbol h_n) = \frac{p(z^n=k) p(\boldsymbol h_n|z^n=k)}{p(\boldsymbol h_n)} = \frac{p(z^n=k) p(\boldsymbol h_n|z^n=k)}{\sum_z p(z^n=k) p(\boldsymbol h_n|z^n=k)} \end{aligned} k-th mode: all modes: posterior of k-th mode: ​p(hn​,zn=k)=p(zn=k)p(hn​∣zn=k)p(hn​)=z∑​p(hn​,zn=k)=z∑​p(zn=k)p(hn​∣zn=k)p(zn=k∣hn​)=p(hn​)p(zn=k)p(hn​∣zn=k)​=∑z​p(zn=k)p(hn​∣zn=k)p(zn=k)p(hn​∣zn=k)​​

我们将EM算法的步骤描述为:对于第t+1t+1t+1次迭代
(1) E步:固定第ttt次迭代的参数{αkt,μkt,Ckt}k=1K\{\alpha^t_k, \boldsymbol \mu^t_k, \boldsymbol{C}^t_k \}_{k=1}^K{αkt​,μkt​,Ckt​}k=1K​,对∀n,k\forall n,k∀n,k,即第nnn个sample,第kkk个mode,计算后验后验分布,定义为
γnkt+1≐p(zn=k∣hn)=p(zn=k)p(hn∣zn=k)p(hn)=αktCN(hn;μkt,Ckt)∑k=1KαktCN(hn;μkt,Ckt),∀n,k(22)\begin{aligned} \gamma^{t+1}_{nk} & \doteq p(z^n=k| \boldsymbol h_n) \\ &= \frac{p(z^n=k) p(\boldsymbol h_n|z^n=k)}{p(\boldsymbol h_n)} \\ &= \frac{ \alpha^t_k \mathcal{CN}(\boldsymbol h_n; \boldsymbol \mu^t_k, \boldsymbol{C}^t_k) } { \sum_{k=1}^K \alpha^t_k \mathcal{CN}(\boldsymbol h_n; \boldsymbol \mu^t_k, \boldsymbol{C}^t_k) }, \ \ \forall n,k \end{aligned} \tag{22} γnkt+1​​≐p(zn=k∣hn​)=p(hn​)p(zn=k)p(hn​∣zn=k)​=∑k=1K​αkt​CN(hn​;μkt​,Ckt​)αkt​CN(hn​;μkt​,Ckt​)​,  ∀n,k​(22)

经过E步,我们便得到了一组后验分布{γnkt+1}n,k\{\gamma^{t+1}_{nk}\}_{n,k}{γnkt+1​}n,k​

(2) M步:固定后验分布{γnkt+1}n,k\{\gamma^{t+1}_{nk}\}_{n,k}{γnkt+1​}n,k​,通过最大化证据下界(Evidence Lower BOund)(即对数边际似然log⁡p(h;α,μ,C)\log p(\boldsymbol h;\alpha, \boldsymbol \mu, \boldsymbol{C})logp(h;α,μ,C)的下界),来估计第t+1t+1t+1次迭代的参数{αkt+1,μkt+1,Ckt+1}k=1K\{ \alpha^{t+1}_k, \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k \}_{k=1}^K{αkt+1​,μkt+1​,Ckt+1​}k=1K​

ELBO({γnkt+1}n,k,{hn}n;{αkt+1,μkt+1,Ckt+1}k=1K)=∑n=1Np∑k=1Kγnkt+1log⁡p(hn,zn=k)γnkt+1=∑n=1Np∑k=1Kγnkt+1log⁡p(zn=k)p(hn∣zn=k)γnkt+1=∑n=1Np∑k=1Kγnkt+1log⁡αkt+1⋅CN(hn;μkt+1,Ckt+1)γnkt+1=∑n=1Np∑k=1Kγnkt+1(−∥(Ckt+1)−12(hn−μkt+1)∥22+log⁡αkt+1−log⁡∣Ckt+1∣)+C(23)\begin{aligned} & \ \ \ \ ELBO \left (\{\gamma^{t+1}_{nk}\}_{n,k}, \{ \boldsymbol h_n \}_n ; \{ \alpha^{t+1}_k, \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k \}_{k=1}^K \right ) \\ & = \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \log \frac{p(\boldsymbol h_n,z^n=k)}{ \gamma^{t+1}_{nk}} \\ & = \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \log \frac{p(z^n=k) p(\boldsymbol h_n|z^n=k)}{ \gamma^{t+1}_{nk}} \\ & = \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \log \frac{\alpha^{t+1}_k \cdot \mathcal{CN}(\boldsymbol h_n; \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k)}{ \gamma^{t+1}_{nk}} \\ &= \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \left \Vert (\boldsymbol{C}^{t+1}_k)^{-\frac{1}{2}} \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 + \log \alpha^{t+1}_k - \log \left \vert \boldsymbol{{C}}^{t+1}_k \right \vert \right) + C \end{aligned} \tag{23} ​    ELBO({γnkt+1​}n,k​,{hn​}n​;{αkt+1​,μkt+1​,Ckt+1​}k=1K​)=n=1∑Np​​k=1∑K​γnkt+1​logγnkt+1​p(hn​,zn=k)​=n=1∑Np​​k=1∑K​γnkt+1​logγnkt+1​p(zn=k)p(hn​∣zn=k)​=n=1∑Np​​k=1∑K​γnkt+1​logγnkt+1​αkt+1​⋅CN(hn​;μkt+1​,Ckt+1​)​=n=1∑Np​​k=1∑K​γnkt+1​(−∥∥∥​(Ckt+1​)−21​(hn​−μkt+1​)∥∥∥​22​+logαkt+1​−log∣∣​Ckt+1​∣∣​)+C​(23)

其中CCC是与{αkt+1,μkt+1,Ckt+1}k=1K\{ \alpha^{t+1}_k, \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k \}_{k=1}^K{αkt+1​,μkt+1​,Ckt+1​}k=1K​无关的常数。上述最大化ELBO转化为优化问题:
arg max⁡{αkt+1,μkt+1,Ckt+1}∑n=1Np∑k=1Kγnkt+1(−∥(Ckt+1)−12(hn−μkt+1)∥22+log⁡αkt+1−log⁡∣Ckt+1∣)s.t. ∑k=1Kαk=1(24)\begin{aligned} & \argmax_{\{ \alpha^{t+1}_k, \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k \}} \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \left \Vert (\boldsymbol{C}^{t+1}_k)^{-\frac{1}{2}} \left (\boldsymbol h_n- \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 + \log \alpha^{t+1}_k - \log \left \vert \boldsymbol{{C}}^{t+1}_k \right \vert \right) \\ & \text{s.t. } \sum_{k=1}^K \alpha_k = 1 \end{aligned} \tag{24} ​{αkt+1​,μkt+1​,Ckt+1​}argmax​n=1∑Np​​k=1∑K​γnkt+1​(−∥∥∥​(Ckt+1​)−21​(hn​−μkt+1​)∥∥∥​22​+logαkt+1​−log∣∣​Ckt+1​∣∣​)s.t. k=1∑K​αk​=1​(24)

利用拉格朗日函数:
g:∑n=1Np∑k=1Kγnkt+1(−∥(Ckt+1)−12(hn−μkt+1)∥22+log⁡αkt+1−log⁡∣Ckt+1∣)−λ(∑k=1Kαk−1)(25)g: \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \left \Vert (\boldsymbol{C}^{t+1}_k)^{-\frac{1}{2}} \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 + \log \alpha^{t+1}_k - \log \left \vert \boldsymbol{{C}}^{t+1}_k \right \vert \right) - \lambda (\sum_{k=1}^K \alpha_k - 1) \tag{25} g:n=1∑Np​​k=1∑K​γnkt+1​(−∥∥∥​(Ckt+1​)−21​(hn​−μkt+1​)∥∥∥​22​+logαkt+1​−log∣∣​Ckt+1​∣∣​)−λ(k=1∑K​αk​−1)(25)

求关于{αkt+1,μkt+1,Ckt+1}\{ \alpha^{t+1}_k, \boldsymbol \mu^{t+1}_k, \boldsymbol{C}^{t+1}_k \}{αkt+1​,μkt+1​,Ckt+1​}的偏导数:
∂g∂αkt+1=0∂g∂μkt+1=0∂g∂Ckt+1=0(26)\begin{aligned} \frac{\partial g}{\partial \alpha^{t+1}_k} &= 0 \\ \frac{\partial g}{\partial \boldsymbol \mu^{t+1}_k} &= \boldsymbol 0 \\ \frac{\partial g}{\partial \boldsymbol{C}^{t+1}_k} &= \boldsymbol 0 \\ \end{aligned} \tag{26} ∂αkt+1​∂g​∂μkt+1​∂g​∂Ckt+1​∂g​​=0=0=0​(26)

考虑Ck=σk2I\boldsymbol{C}_k=\sigma^2_k \boldsymbol{I}Ck​=σk2​I

在实际计算时,为了简化,可以取Ck=σk2I\boldsymbol{C}_k=\sigma^2_k \boldsymbol{I}Ck​=σk2​I,即等高线为圆的高斯,这时式(25)可以化简为
g:∑n=1Np∑k=1Kγnkt+1(−1σk2∥(hn−μkt+1)∥22+log⁡αkt+1−2NUENBSlog⁡σk)−λ(∑k=1Kαk−1)g: \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \frac{1}{\sigma^2_k} \left \Vert \left (\boldsymbol h_n- \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 + \log \alpha^{t+1}_k - 2 N_{UE} N_{BS} \log \sigma_k \right) - \lambda (\sum_{k=1}^K \alpha_k - 1) g:n=1∑Np​​k=1∑K​γnkt+1​(−σk2​1​∥∥​(hn​−μkt+1​)∥∥​22​+logαkt+1​−2NUE​NBS​logσk​)−λ(k=1∑K​αk​−1)

将式(26)代入到上式
首先对αk\alpha_kαk​求导
∂g∂αk=∑n=1Npγnkt+1⋅1αk−λ=0⇒αk=∑n=1Npγnkt+1λ\begin{aligned} & \frac{\partial g}{\partial \alpha^{}_k} = \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \cdot \frac{1}{\alpha^{}_k} - \lambda = 0 \\ \Rightarrow & \alpha_k = \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}}{ \lambda } \end{aligned} ⇒​∂αk​∂g​=n=1∑Np​​γnkt+1​⋅αk​1​−λ=0αk​=λ∑n=1Np​​γnkt+1​​​

又因为∑k=1Kαk=1\sum_{k=1}^K \alpha_k = 1∑k=1K​αk​=1,因此λ=Np\lambda=N_pλ=Np​,故
αk=∑n=1Npγnkt+1Np\alpha_k = \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}}{ N_p } αk​=Np​∑n=1Np​​γnkt+1​​

然后对μk∈CL×1\boldsymbol{\mu}_k \in \mathbb{C}^{L \times 1}μk​∈CL×1求导
∂g∂μk=∑n=1Npγnkt+1⋅(−1σk2)2(μk−hn)=0⇒∑n=1Npγnkt+1⋅(μk−hn)=0⇒μk⋅∑n=1Npγnkt+1=∑n=1Npγnkt+1hn⇒μk=∑n=1Npγnkt+1hn∑n=1Npγnkt+1\begin{aligned} & \frac{\partial g}{\partial \boldsymbol \mu^{}_k} = \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \cdot ( - \frac{1}{\sigma^2_k}) 2\left (\boldsymbol{\mu}_k - \boldsymbol h_n \right ) = 0 \\ \Rightarrow & \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \cdot \left (\boldsymbol{\mu}_k - \boldsymbol h_n \right ) = 0 \\ \Rightarrow & \boldsymbol{\mu}_k \cdot \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} = \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \boldsymbol h_n \\ \Rightarrow & \boldsymbol{\mu}_k = \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \boldsymbol h_n}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} ⇒⇒⇒​∂μk​∂g​=n=1∑Np​​γnkt+1​⋅(−σk2​1​)2(μk​−hn​)=0n=1∑Np​​γnkt+1​⋅(μk​−hn​)=0μk​⋅n=1∑Np​​γnkt+1​=n=1∑Np​​γnkt+1​hn​μk​=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​hn​​​

最后对σk\sigma_kσk​求导
∂g∂σk=∑n=1Npγnkt+1(1σk3∥(hn−μkt+1)∥22−2NUENBS1σk)=0⇒σk2=12NUENBS⋅∑n=1Npγnkt+1∥(hn−μkt+1)∥22∑n=1Npγnkt+1\begin{aligned} & \frac{\partial g}{\partial \sigma_k} = \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left ( \frac{1}{\sigma^3_k} \left \Vert \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 - 2 N_{UE} N_{BS} \frac{1}{\sigma_k} \right) = 0 \\ \Rightarrow & \sigma^2_k = \frac{1}{2 N_{UE} N_{BS}} \cdot \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left \Vert \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} ⇒​∂σk​∂g​=n=1∑Np​​γnkt+1​(σk3​1​∥∥​(hn​−μkt+1​)∥∥​22​−2NUE​NBS​σk​1​)=0σk2​=2NUE​NBS​1​⋅∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​∥∥​(hn​−μkt+1​)∥∥​22​​​

考虑一般的Ck\boldsymbol{C}_kCk​

首先对αk\alpha_kαk​求导
∂g∂αk=∑n=1Npγnkt+1⋅1αk−λ=0⇒αk=∑n=1Npγnkt+1λ\begin{aligned} & \frac{\partial g}{\partial \alpha^{}_k} = \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \cdot \frac{1}{\alpha^{}_k} - \lambda = 0 \\ \Rightarrow & \alpha_k = \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}}{ \lambda } \end{aligned} ⇒​∂αk​∂g​=n=1∑Np​​γnkt+1​⋅αk​1​−λ=0αk​=λ∑n=1Np​​γnkt+1​​​

又因为∑k=1Kαk=1\sum_{k=1}^K \alpha_k = 1∑k=1K​αk​=1,因此λ=Np\lambda=N_pλ=Np​

然后对μk∗∈CL×1\boldsymbol{\mu}^{*}_k \in \mathbb{C}^{L \times 1}μk∗​∈CL×1求导,根据公式(复矩阵求导):
∂(bHx)∂x∗=0,∂(xHb)∂x∗=b∂(xHAx)∂x∗=Ax\begin{aligned} & \frac{\partial ( \boldsymbol b^H \boldsymbol x )}{ \partial \boldsymbol x^{*} } = \boldsymbol{0}, \ \ \frac{\partial ( \boldsymbol x^H \boldsymbol b )}{ \partial \boldsymbol x^{*} } = \boldsymbol{b} \\ & \frac{\partial ( \boldsymbol x^H \boldsymbol{A} \boldsymbol x )}{ \partial \boldsymbol x^{*} } = \boldsymbol{Ax} \end{aligned} ​∂x∗∂(bHx)​=0,  ∂x∗∂(xHb)​=b∂x∗∂(xHAx)​=Ax​

进一步,
∂g∂μk∗=∂∂μk∗∑n=1Np−γnkt+1∥(Ckt+1)−12(hn−μkt+1)∥22=∑n=1Np−γnkt+1∂∂μk∗(hn−μkt+1)H(Ckt+1)−1(hn−μkt+1)=∑n=1Npγnkt+1(Ckt+1)−1(μkt+1−hn)=(Ckt+1)−1∑n=1Npγnkt+1(μkt+1−hn)=0⇒μk=∑n=1Npγnkt+1hn∑n=1Npγnkt+1\begin{aligned} \frac{ \partial g }{ \partial \boldsymbol{\mu}^{*}_k} &= \frac{ \partial }{ \partial \boldsymbol{\mu}^{*}_k} \sum_{n=1}^{N_p} - \gamma^{t+1}_{nk} \left \Vert (\boldsymbol{C}^{t+1}_k)^{-\frac{1}{2}} \left (\boldsymbol h_n- \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 \\ &= \sum_{n=1}^{N_p} - \gamma^{t+1}_{nk} \frac{ \partial }{ \partial \boldsymbol{\mu}^{*}_k} \left( \boldsymbol h_n- \boldsymbol{\mu}^{t+1}_k \right )^H (\boldsymbol{C}^{t+1}_k)^{-1} \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) \\ &= \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} (\boldsymbol{C}^{t+1}_k)^{-1} \left( \boldsymbol{\mu}^{t+1}_k - \boldsymbol h_n \right ) \\ &= (\boldsymbol{C}^{t+1}_k)^{-1} \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left( \boldsymbol{\mu}^{t+1}_k - \boldsymbol h_n \right ) = \boldsymbol{0} \\ \Rightarrow \boldsymbol{\mu}_k & = \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \boldsymbol h_n}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} ∂μk∗​∂g​⇒μk​​=∂μk∗​∂​n=1∑Np​​−γnkt+1​∥∥∥​(Ckt+1​)−21​(hn​−μkt+1​)∥∥∥​22​=n=1∑Np​​−γnkt+1​∂μk∗​∂​(hn​−μkt+1​)H(Ckt+1​)−1(hn​−μkt+1​)=n=1∑Np​​γnkt+1​(Ckt+1​)−1(μkt+1​−hn​)=(Ckt+1​)−1n=1∑Np​​γnkt+1​(μkt+1​−hn​)=0=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​hn​​​

最后对Ck\boldsymbol C_kCk​求导,但是实际中,为了方便处理,我们是对(Ck−1)∗(\boldsymbol C^{-1}_k)^*(Ck−1​)∗进行求导,在求导之前,我们回顾一些基本概念。

回顾:如果A∈CN×N\boldsymbol{A} \in \mathbb{C}^{N \times N}A∈CN×N是正定矩阵,那么根据det(A)=∏nλn\text{det}(\boldsymbol{A})=\prod_{n} \lambda_ndet(A)=∏n​λn​,我们可以得到det(A)=det(A∗)=det(AT)=det(AH)\text{det}(\boldsymbol{A})=\text{det}(\boldsymbol{A}^*)=\text{det}(\boldsymbol{A}^T)=\text{det}(\boldsymbol{A}^H)det(A)=det(A∗)=det(AT)=det(AH)。另外,det(A)=1det(A−1)\text{det}(\boldsymbol{A})=\frac{1}{\text{det}(\boldsymbol{A}^{-1})}det(A)=det(A−1)1​。因此我们可以把拉格朗日函数ggg重写为

g:∑n=1Np∑k=1Kγnkt+1(−∥(Ckt+1)−12(hn−μkt+1)∥22+log⁡αkt+1−log⁡∣Ckt+1∣)−λ(∑k=1Kαk−1)=∑n=1Np∑k=1Kγnkt+1(−(hn−μkt+1)H(Ckt+1)−1(hn−μkt+1)+log⁡αkt+1+log⁡∣((Ckt+1)−1)∗∣)−λ(∑k=1Kαk−1)\begin{aligned} g:& \ \ \ \ \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \left \Vert (\boldsymbol{C}^{t+1}_k)^{-\frac{1}{2}} \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2 + \log \alpha^{t+1}_k - \log \left \vert \boldsymbol{{C}}^{t+1}_k \right \vert \right) - \lambda (\sum_{k=1}^K \alpha_k - 1) \\ &= \sum_{n=1}^{N_p} \sum_{k=1}^K \gamma^{t+1}_{nk} \left ( - \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right )^H (\boldsymbol{C}^{t+1}_k)^{-1} \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) + \log \alpha^{t+1}_k + \log \left \vert ((\boldsymbol{{C}}^{t+1}_k)^{-1})^* \right \vert \right) - \lambda (\sum_{k=1}^K \alpha_k - 1) \end{aligned} g:​    n=1∑Np​​k=1∑K​γnkt+1​(−∥∥∥​(Ckt+1​)−21​(hn​−μkt+1​)∥∥∥​22​+logαkt+1​−log∣∣​Ckt+1​∣∣​)−λ(k=1∑K​αk​−1)=n=1∑Np​​k=1∑K​γnkt+1​(−(hn​−μkt+1​)H(Ckt+1​)−1(hn​−μkt+1​)+logαkt+1​+log∣∣​((Ckt+1​)−1)∗∣∣​)−λ(k=1∑K​αk​−1)​

根据公式,若A∈CN×N\boldsymbol{A} \in \mathbb{C}^{N \times N}A∈CN×N
∂yHAHx∂A∗=xyH∂det(A)∂A=det(A)⋅(AT)−1\begin{aligned} \frac{ \partial \boldsymbol{y}^H \boldsymbol{A}^H \boldsymbol{x} }{ \partial \boldsymbol{A}^{*} } &= \boldsymbol{x} \boldsymbol{y}^H \\ \frac{\partial \text{det}(\boldsymbol{A})}{ \partial \boldsymbol{A} } &= \text{det}(\boldsymbol{A}) \cdot (\boldsymbol{A}^T)^{-1} \end{aligned} ∂A∗∂yHAHx​∂A∂det(A)​​=xyH=det(A)⋅(AT)−1​

对(Ck−1)∗(\boldsymbol C^{-1}_k)^*(Ck−1​)∗进行求导:
∂g∂(Ck−1)∗=∑n=1Npγnkt+1(−(hn−μkt+1)(hn−μkt+1)H+Ckt+1)=Ckt+1∑n=1Npγnkt+1−∑n=1Npγnkt+1(hn−μkt+1)(hn−μkt+1)H=0⇒Ckt+1=∑n=1Npγnkt+1(hn−μkt+1)(hn−μkt+1)H∑n=1Npγnkt+1\begin{aligned} \frac{\partial g}{ \partial (\boldsymbol C^{-1}_k)^*} &= \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left (- \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right )^H + \boldsymbol{C}^{t+1}_k \right ) \\ &= \boldsymbol{C}^{t+1}_k \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} - \sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right )^H = 0 \\ \Rightarrow \boldsymbol{C}^{t+1}_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right )^H}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} ∂(Ck−1​)∗∂g​⇒Ckt+1​​=n=1∑Np​​γnkt+1​(−(hn​−μkt+1​)(hn​−μkt+1​)H+Ckt+1​)=Ckt+1​n=1∑Np​​γnkt+1​−n=1∑Np​​γnkt+1​(hn​−μkt+1​)(hn​−μkt+1​)H=0=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​(hn​−μkt+1​)(hn​−μkt+1​)H​​

这与我们的直觉是一致的。

小结:转化为参数估计问题

如果考虑Ck=σk2I\boldsymbol{C}_k=\sigma^2_k \boldsymbol{I}Ck​=σk2​I

αk=∑n=1Npγnkt+1Npμk=∑n=1Npγnkt+1hn∑n=1Npγnkt+1σk2=12NUENBS⋅∑n=1Npγnkt+1∥(hn−μkt+1)∥22∑n=1Npγnkt+1\begin{aligned} \alpha_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}}{ N_p } \\ \boldsymbol{\mu}_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \boldsymbol h_n}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \\ \sigma^2_k &= \frac{1}{2 N_{UE} N_{BS}} \cdot \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left \Vert \left ( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right) \right \Vert^2_2}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} αk​μk​σk2​​=Np​∑n=1Np​​γnkt+1​​=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​hn​​=2NUE​NBS​1​⋅∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​∥∥​(hn​−μkt+1​)∥∥​22​​​

如果考虑一般的Ck\boldsymbol{C}_kCk​

αk=∑n=1Npγnkt+1Npμk=∑n=1Npγnkt+1hn∑n=1Npγnkt+1Ck=∑n=1Npγnkt+1(hn−μkt+1)(hn−μkt+1)H∑n=1Npγnkt+1\begin{aligned} \alpha_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}}{ N_p } \\ \boldsymbol{\mu}_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \boldsymbol h_n}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \\ \boldsymbol{C}^{}_k &= \frac{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk} \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right ) \left( \boldsymbol h_n - \boldsymbol{\mu}^{t+1}_k \right )^H}{\sum_{n=1}^{N_p} \gamma^{t+1}_{nk}} \end{aligned} αk​μk​Ck​​=Np​∑n=1Np​​γnkt+1​​=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​hn​​=∑n=1Np​​γnkt+1​∑n=1Np​​γnkt+1​(hn​−μkt+1​)(hn​−μkt+1​)H​​

相关内容