
Assume that $f:\mathbb R\to\mathbb R$ is continuous. Given a real symmetric matrix $A\in\text{Sym}(n)$, we can define $f(A)$ by applying $f$ to its spectrum. More explicitly, $$ f(A):=\sum f(\lambda)P_\lambda,\qquad A=\sum\lambda P_\lambda. $$ Here both sums are finite, and the second one is the decomposition of $A$ as a linear combination of orthogonal projections ($P_\lambda$ is the projection onto the eigenspace for the eigenvalue $\lambda$, so that $P_\lambda P_{\lambda'}=0$). Such decomposition exists and is unique by the spectral theorem.
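For concreteness, this definition is easy to compute numerically; here is a minimal numpy sketch (the function name `apply_to_spectrum` is chosen for this illustration, not a library routine):

```python
import numpy as np

def apply_to_spectrum(f, A):
    """f(A) := sum_lambda f(lambda) P_lambda, computed from the
    spectral decomposition A = V diag(w) V^T with V orthogonal."""
    w, V = np.linalg.eigh(A)        # eigenvalues w, orthonormal eigenvectors V
    return V @ np.diag(f(w)) @ V.T

# sanity check: for f(x) = x^2 this must reproduce the matrix product A A
A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(apply_to_spectrum(np.square, A), A @ A)
```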

I guess it is well known that the induced map $f:\text{Sym}(n)\to\text{Sym}(n)$ is continuous.

Assuming $f\in C^\infty(\mathbb R)$, is the induced map $f:\text{Sym}(n)\to\text{Sym}(n)$ also smooth?

I think I can show that it is (Fréchet) differentiable everywhere, but I am wondering whether it is always $C^1$ or even $C^\infty$.


2 Answers


Yes. This can be derived from the resolvent formalism.

I'll just do the $C^1$ case and leave higher derivatives as an exercise; ask if it's not clear how to generalize. I am basically using formula (2.7) of "On differentiability of symmetric matrix valued functions", Alexander Shapiro, http://www.optimization-online.org/DB_HTML/2002/07/499.html. (I think there's a typo in the middle case of the display preceding (2.7): $f(\mu_j)/(\mu_j-\mu_k)$ should be $(f(\mu_j)-f(\mu_k))/(\mu_j-\mu_k).$) Shapiro's paper references "(cf., [4])", where [4] is the 600-page textbook "Perturbation Theory for Linear Operators" by T. Kato, but I don't know if that is helpful for this specific question.

I will call the induced map $f^*$ to distinguish it from $f.$ I'll also call the dimension $p$ instead of $n.$

It suffices to show $f^*$ is $C^1$ for matrices with eigenvalues in a given bounded interval $J.$ Approximate $f$ by polynomials $f_n$ such that $\sup_{x\in J}|f(x)-f_n(x)|\to 0$ and $\sup_{x\in J}|f'(x)-f_n'(x)|\to 0.$ Since $f_n$ is analytic, $f^*_n$ can be evaluated using resolvents:

$$f_n^*(X) = \frac{1}{2\pi i}\int_C f_n(z)(z I_p - X)^{-1} dz$$ where $C$ is an anticlockwise circle in the complex plane with $J$ in its interior. For $H\in\mathrm{Sym}(p),$

\begin{align*} f_n^*(X+H) &= \frac{1}{2\pi i}\int_C f_n(z)(z I_p - X-H)^{-1}\,dz\\ &= \frac{1}{2\pi i}\int_C f_n(z)(z I_p - X)^{-1}+f_n(z)(z I_p - X)^{-1}H(z I_p - X)^{-1}+\dots\,dz\\ &= \frac{1}{2\pi i}\int_C f_n(z)\sum_{\lambda}(z-\lambda)^{-1}P_\lambda+f_n(z)\sum_{\lambda_1,\lambda_2}(z-\lambda_1)^{-1}(z-\lambda_2)^{-1}P_{\lambda_1}HP_{\lambda_2}+\dots\,dz\\ &= f_n^*(X)+\sum_{\lambda_1,\lambda_2} P_{\lambda_1} H P_{\lambda_2}\int_0^1 f'_n(t\lambda_1+(1-t)\lambda_2)\,dt+\dots \end{align*}

The second equality uses the Neumann series $$(A-H)^{-1}=A^{-1}+A^{-1}HA^{-1}+\dots$$ with $A=z I_p-X.$ The third equality uses $(zI_p - X)^{-1}=\sum_\lambda (z-\lambda)^{-1} P_\lambda.$ The fourth equality uses $\frac{1}{2\pi i}\int_C f_n(z)(z-\lambda)^{-1}(z-\mu)^{-1}\,dz =\int_0^1 f'_n(t\lambda+(1-t)\mu)\,dt.$
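The contour integral also makes sense numerically, so the resolvent formula can be sanity-checked with a trapezoidal rule on the circle $C$; a small numpy sketch (all names are illustrative, and the circle is assumed to enclose the spectrum of $X$):

```python
import numpy as np

def f_via_resolvent(f, X, center=0.0, radius=10.0, nodes=400):
    """(1/(2 pi i)) * integral over C of f(z) (z I - X)^{-1} dz, where C is a
    circle enclosing the spectrum of X, by the trapezoidal rule (f analytic
    inside C)."""
    p = X.shape[0]
    total = np.zeros((p, p), dtype=complex)
    for k in range(nodes):
        z = center + radius * np.exp(2j * np.pi * k / nodes)
        dz = (z - center) * 2j * np.pi / nodes   # z'(theta) * dtheta
        total += f(z) * np.linalg.inv(z * np.eye(p) - X) * dz
    return (total / (2j * np.pi)).real

# agrees with the spectral definition; here f(z) = z^3, so f(X) = X X X
X = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(f_via_resolvent(lambda z: z**3, X), X @ X @ X)
```

The trapezoidal rule converges very fast here because the integrand is analytic and periodic on the circle.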

This gives the bound

$$\|Df^*_n(X)H-Df^*_m(X)H\| \leq c_p\|H\|\cdot \sup_{x\in J}|f'_n(x)-f'_m(x)|$$

for some constant $c_p>0,$ where $\|\cdot\|$ is any matrix norm. Together with the uniform convergence $f_n\to f$ and $f'_n\to f'$ on $J,$ this shows that the maps $f^*_n$ form a Cauchy sequence in the $C^1$ norm, so the limit $f^*$ is $C^1.$
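The leading term in the expansion above is the first-derivative formula for $f^*$ (a Daleckii–Krein-type formula). Here is a numerical sanity check against central finite differences, assuming numpy (the helper names are chosen for this sketch):

```python
import numpy as np

def f_of(g, M):
    """Spectral definition: g(M) = V diag(g(w)) V^T for symmetric M."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(g(w)) @ V.T

def derivative(fprime, X, H):
    """Df*(X)H = sum_{i,j} P_i H P_j * int_0^1 f'(t w_i + (1-t) w_j) dt,
    evaluated in the eigenbasis of X, with a trapezoidal rule in t."""
    w, V = np.linalg.eigh(X)
    t = np.linspace(0.0, 1.0, 20001)
    dt = t[1] - t[0]
    D = np.empty((len(w), len(w)))
    for i in range(len(w)):
        for j in range(len(w)):
            v = fprime(t * w[i] + (1.0 - t) * w[j])
            D[i, j] = (0.5 * v[0] + v[1:-1].sum() + 0.5 * v[-1]) * dt
    # Hadamard product D * (V^T H V) realizes sum_{i,j} P_i H P_j * D[i,j]
    return V @ (D * (V.T @ H @ V)) @ V.T

# compare with a symmetric central difference of the induced map, f = exp
X = np.array([[2.0, 1.0], [1.0, 3.0]])
H = np.array([[0.0, 1.0], [1.0, -1.0]])
eps = 1e-5
fd = (f_of(np.exp, X + eps * H) - f_of(np.exp, X - eps * H)) / (2 * eps)
assert np.allclose(derivative(np.exp, X, H), fd, atol=1e-6)
```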

$\endgroup$
  • Excellent! I see how to generalize it to higher derivatives. I am amazed by the speed of your answer :-)
    – Mizar
    Apr 6, 2019 at 18:15
  • Higher derivative estimates follow from the formula $\sum_{j=0}^k\frac{f(\lambda_j)}{\prod_{\ell\neq j}(\lambda_j-\lambda_\ell)}=\frac{1}{k!|\Delta_k|}\int_{\Delta_k}f^{(k)}(\sum_jt_j\lambda_j)\,dt_0\cdots dt_k$, $\Delta_k$ being the standard simplex $\{t_j\ge 0,\sum t_j=1\}$ (assuming wlog the $\lambda_j$'s are distinct).
    – Mizar
    Apr 7, 2019 at 15:21
  • The formula, in turn, is easy to prove by induction: we can subtract $f(\lambda_0)$ from each numerator in the left-hand side (thanks to this: math.stackexchange.com/questions/104262/…) and obtain $LHS=\sum_{j=1}^k\int_0^1\frac{f'(t_0\lambda_0+(1-t_0)\lambda_j)}{\prod_{\ell\neq j,\ell>0}(\lambda_j-\lambda_\ell)}\,dt_0$. We are done applying induction with $g:=f'(t_0\lambda_0+(1-t_0)z)$ in place of $f(z)$ and noticing $g^{(k-1)}(z)=(1-t_0)^{k-1}f^{(k)}(t_0\lambda_0+(1-t_0)z)$.
    – Mizar
    Apr 7, 2019 at 15:25

Yes. To show that $f(A)$ is $n$-times differentiable at $A=B$, simply interpolate $f$ by a polynomial $P$ such that $P$ and its derivatives up to order $n$ agree with those of $f$ on the spectrum of $B$. Clearly $P(A)$ is $n$-times differentiable, and it isn't too much work to show that the induced maps of $f$ and $P$ have the same $n$th derivative at $B$.
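A sketch of this interpolation strategy for $n=1$, assuming scipy (whose `KroghInterpolator` treats repeated nodes as derivative conditions; the matrices below are illustrative):

```python
import numpy as np
from scipy.interpolate import KroghInterpolator

def f_of(g, M):
    """Spectral definition: g(M) = V diag(g(w)) V^T for symmetric M."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(g(w)) @ V.T

# Interpolate f = exp to first order at each eigenvalue of B: repeating a
# node in KroghInterpolator makes the next yi entry a derivative value
# (here f' = f, so the values simply repeat).
B = np.array([[2.0, 1.0], [1.0, 3.0]])
w, _ = np.linalg.eigh(B)
P = KroghInterpolator(np.repeat(w, 2), np.repeat(np.exp(w), 2))

# f and P agree to first order on the spectrum of B, so the induced maps
# differ by O(||H||^2) near B:
H = np.array([[0.0, 1.0], [1.0, -1.0]])
def gap(eps):
    M = B + eps * H
    return np.linalg.norm(f_of(np.exp, M) - f_of(P, M))

assert gap(1e-6) < 1e-3 * gap(1e-3)   # quadratic, not linear, decay
```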

  • Yeah, I thought about this strategy, but it's not so clear why this gives $C^n$ regularity rather than just a Taylor approximation of degree $n$ at each $A$. (If one manages to infer such an approximation exists with coefficients depending continuously on $A$, then one could invoke this result: mathoverflow.net/questions/88501/converse-of-taylors-theorem)
    – Mizar
    Apr 6, 2019 at 23:00
