
Let $A$ be a $k \times n$ orthogonal matrix; i.e., $AA^T = I_{k \times k}$. For $1 \leq j \leq n$, let the squared norm of the $j$-th column of $A$ be denoted by $\alpha_j^2$; i.e., $$\sum_{i=1}^k a_{ij}^2 = \alpha_j^2.$$ Let $\Lambda = \operatorname{Diag}(\lambda_1, \lambda_2, \dotsc, \lambda_n)$ be a positive definite matrix. Define a function $F$ from the space of p.d. diagonal matrices to $\mathbb R$ as follows: $$ F(\Lambda) = \log \lvert A \Lambda A^T\rvert-\sum_{j=1}^n \alpha_j^2 \log \lambda_j.$$

To show: $F(\Lambda) \geq 0$, with equality only when $\Lambda$ is a multiple of the identity.

Background: This result is true and follows from an inequality known in information theory as Zamir and Feder's entropy power inequality; it is obtained by applying that inequality to independent Gaussian random variables with variances $\lambda_i$. (A statement of the inequality may be found in Rioul, "Information-theoretic proofs of entropy power inequalities", Proposition 9, (65c).)

I am interested in knowing if there is an alternate proof of this result that does not rely on entropy inequalities, and uses linear algebraic tools instead.
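
(For concreteness, here is a minimal numerical sanity check of the claim; the random instance and variable names below are purely illustrative.)

```python
import numpy as np

# Sanity check of the claim F(Lambda) >= 0 on a random instance.
rng = np.random.default_rng(0)
k, n = 3, 7

# A k x n matrix with orthonormal rows: the first k rows of a random orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:k, :]                                   # A @ A.T == I_k

lam = rng.uniform(0.1, 10.0, size=n)           # diagonal of Lambda (positive)
alpha_sq = (A**2).sum(axis=0)                  # squared column norms alpha_j^2

F = np.linalg.slogdet(A @ np.diag(lam) @ A.T)[1] - alpha_sq @ np.log(lam)
print(F >= 0)                                  # observed: True
# With lam constant, F is 0 (note sum_j alpha_j^2 = k).
```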

  • What do you denote by $|C|$? The determinant of $C$? Something else? (Oct 15, 2018 at 21:56)
  • Yes, that's the determinant. – VSJ (Oct 15, 2018 at 22:23)
  • An equivalent formulation is that if $M$ is a positive definite matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$ and $H$ is the top-left $k \times k$ principal submatrix, then $\det M \le (\det H) \prod_{j=1}^n \lambda_j^{1-\beta_j^2}$, where $\beta_j$ is the norm of the truncation to the last $n-k$ coordinates of the unit eigenvector corresponding to $\lambda_j$. This looks strongly reminiscent of the results in Section 7.8 of Horn & Johnson's book (volume 1, second edition). (Oct 17, 2018 at 19:36)
  • I feel Cauchy–Binet will do the trick; too lazy to try right now. – Suvrit (Oct 18, 2018 at 20:04)
  • @Suvrit: I think you're right! Let me see if the details work out. – VSJ (Oct 18, 2018 at 20:55)

3 Answers


Here's a more general result.

From Choi's inequality, we know that for a unital positive linear map $\Phi$ and an operator convex function $f$, we have $\Phi(f(\Lambda)) \ge f (\Phi(\Lambda))$. Take $\Phi(\Lambda)=A\Lambda A^T$ (which is unital, since $AA^T=I$) and $f(t)=-\log t$; then we obtain \begin{align*} -A\log(\Lambda)A^T \ge -\log(A\Lambda A^T)\\ \Leftrightarrow\qquad\log(A\Lambda A^T) \ge A\log(\Lambda)A^T. \end{align*} This operator inequality is stronger than what we need: taking the trace of both sides gives $F(\Lambda) \ge 0$, since $\operatorname{tr}\log(A\Lambda A^T)=\log\det(A\Lambda A^T)$ and, because $\Lambda$ is diagonal, $\operatorname{tr}\bigl(A\log(\Lambda)A^T\bigr)=\sum_{j=1}^n \alpha_j^2\log\lambda_j$.
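
For illustration, a short numerical sketch of the operator inequality above (the matrix logarithm is computed by eigendecomposition; the random instance is illustrative only):

```python
import numpy as np

def sym_logm(M):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.log(w)) @ V.T

rng = np.random.default_rng(1)
k, n = 3, 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:k, :]                                   # A @ A.T == I_k, so Phi is unital
lam = rng.uniform(0.1, 10.0, size=n)

lhs = sym_logm(A @ np.diag(lam) @ A.T)         # log of the compression
rhs = A @ np.diag(np.log(lam)) @ A.T           # compression of the log

diff = lhs - rhs                               # should be positive semidefinite
print(np.linalg.eigvalsh(diff).min() >= -1e-10)   # True
print(np.trace(diff))                          # equals F(Lambda), nonnegative
```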


Let $B := A \Lambda^{1/2}$, so that $A\Lambda A^T = BB^T$. Using the Cauchy-Binet formula for the determinant of $BB^T$, we obtain

$$|BB^T| = \sum_{1 \leq i_1 < \dots < i_k \leq n} |B_{i_1 i_2 \dots i_k}|| B_{i_1 i_2 \dots i_k}^T|, $$

where $B_{i_1 i_2 \dots i_k}$ denotes the $k \times k$ submatrix of $B$ formed by the columns indexed by $i_1, \dots, i_k$. The right-hand side of the above equality may be written explicitly as

$$ \sum_{1 \leq i_1 < \dots < i_k \leq n} \left({\prod_{j=1}^k \lambda_{i_j}}\right) |A_{i_1i_2\dots i_k}|^2.$$

Noting that $\sum_{1 \leq i_1 < \dots < i_k \leq n}|A_{i_1i_2\dots i_k}|^2 = |AA^T| = |I|= 1 $ (again via the Cauchy–Binet formula), we can take logarithms and apply Jensen's inequality to the concave function $\log$, with the $|A_{i_1i_2\dots i_k}|^2$ as weights, to obtain

$$ \log |A \Lambda A^T| \geq \sum_{1 \leq i_1 < \dots < i_k \leq n} |A_{i_1i_2\dots i_k}|^2 \log \left({\prod_{j=1}^k \lambda_{i_j}}\right). $$

We now gather the coefficients of $\log {\lambda_j}$ for a fixed $j$. Without loss of generality, let $j=1$. The coefficient of $\log {\lambda_1}$ is given by

$$ \sum_{1 = i_1 < \dots < i_k \leq n} |A_{i_1i_2\dots i_k}|^2= 1 - |A_{2,3,\dots, n}A_{2,3,\dots,n}^T| = 1 - |I - A_1 A_1^T| = \alpha_1^2.$$

The first equality follows from the Cauchy–Binet formula again: the squared minors of $A$ sum to $1$, and those avoiding the first column sum to $|A_{2,3,\dots, n}A_{2,3,\dots,n}^T|$, where $A_{2,3,\dots,n}$ is the submatrix of $A$ consisting of columns $2$ through $n$. The second equality follows from the orthogonality of $A$, since $A_{2,3,\dots,n}A_{2,3,\dots,n}^T = AA^T - A_1A_1^T = I - A_1A_1^T$ with $A_1$ the first column of $A$. The third equality holds because $|I-uu^T| = 1- \|u\|^2$ for any vector $u$.
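
As a sanity check, here is a small numerical sketch of the Cauchy–Binet identities used above (the expansion of $|A\Lambda A^T|$, the normalization of the squared minors, and the coefficient of $\log\lambda_1$); the instance is random and illustrative only:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
k, n = 3, 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:k, :]                                   # orthonormal rows
lam = rng.uniform(0.1, 10.0, size=n)

# Determinants of all k x k minors of A (index sets S of size k).
dets = {S: np.linalg.det(A[:, list(S)]) for S in combinations(range(n), k)}

# Expansion of |A Lambda A^T| over k-subsets, and normalization of squared minors.
expand = sum(np.prod(lam[list(S)]) * d**2 for S, d in dets.items())
print(np.isclose(expand, np.linalg.det(A @ np.diag(lam) @ A.T)))   # True
print(np.isclose(sum(d**2 for d in dets.values()), 1.0))           # True

# Coefficient of log(lambda_1): minors whose index set contains column 0
# (Python index 0 plays the role of j = 1) sum to alpha_1^2.
coef = sum(d**2 for S, d in dets.items() if 0 in S)
print(np.isclose(coef, (A[:, 0]**2).sum()))                        # True
```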


This is too long for a comment, and not (yet) a complete answer.


In other words, we have to show $e^{F(\Lambda)}=\prod_{j} \lambda_j^{-\alpha_j^2}\det (A\Lambda A^\top)\geq 1,$ or equivalently $$\prod_{j} \lambda_j^{\alpha_j^2}\leq\det (A\Lambda A^\top).$$ This is trivially true for $k=n$: then $\det (A\Lambda A^\top)=\prod_j \lambda_j$, and by orthogonality of $A$ one has $\alpha_j=1$ for every $j$ in this case.

Similarly, for $k=1$ one has $\det (A\Lambda A^\top)=\sum_j \lambda_j\alpha_j^2\geq \prod_{j} \lambda_j^{\alpha_j^2}$, which is a consequence of Jensen's inequality for the convex function $\exp$. Indeed, the latter gives $$\exp\left(\frac{\sum_j x_j\alpha_j^2}{\sum_j \alpha_j^2}\right)\leq \frac{\sum_j \exp(x_j)\alpha_j^2}{\sum_j \alpha_j^2},$$ and the claim follows by setting $\lambda_j=\exp(x_j)$ and noting that $\sum_j \alpha_j^2=1$ when $k=1$.
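
A quick numerical check of the $k=1$ case (a sketch; `a` below is a random unit row vector playing the role of $A$):

```python
import numpy as np

# k = 1 case: det(A Lambda A^T) = sum_j lam_j * alpha_j^2  >=  prod_j lam_j^(alpha_j^2).
rng = np.random.default_rng(3)
n = 6
a = rng.standard_normal(n)
a /= np.linalg.norm(a)                 # unit row vector, so sum_j alpha_j^2 = 1
lam = rng.uniform(0.1, 10.0, size=n)

lhs = (lam * a**2).sum()               # det(A Lambda A^T) for k = 1
rhs = np.prod(lam ** (a**2))           # prod_j lam_j^(alpha_j^2)
print(lhs >= rhs)                      # True
```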

  • Maybe computing the determinant of the Schur complement of $A\Lambda A^\top$ will give the needed induction step. (Oct 19, 2018 at 9:54)
