5
$\begingroup$

I have a function $f:\mathbb{R}^+ \to \mathbb{R}^+$ that I want to prove is eventually concave, i.e. that there exists $\gamma _0 > 0$ such that $f$ is concave on $(\gamma_0,\infty)$. I also know that $f$ is real-analytic, increasing and upper bounded, and it is therefore enough to show that $f''$ has a finite number of zeros. (Indeed, beyond its last zero $f''$ has constant sign; it cannot be positive there, since then $f'$ would be increasing and eventually bounded away from zero, forcing at least linear growth and contradicting the upper bound, so $f''\le 0$ eventually.)

A concrete definition of $f(\gamma)$ is as follows: $$f(\gamma)= \frac{1}{\sqrt{2\pi}}\sum_{x\in\mathcal{X},x'\in\mathcal{X}'}p_{x,x'}\int_{-\infty}^\infty{e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}\log\left(\frac{\sum_{\xi\in\mathcal{X}}p_{\xi|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}\xi)^2}}{\sum_{\xi\in\mathcal{X}}p_{\xi}e^{-\frac{1}{2}(y-\sqrt{\gamma}\xi)^2}}\right)dy}$$ where $\mathcal{X},\mathcal{X}'$ are two finite sets, $p:\mathcal{X}\times\mathcal{X}'\to[0,1]$ is a probability mass function, and $p_{x|x'}=p_{x,x'}/\sum_{\xi\in\mathcal{X}}p_{\xi,x'}$ and $p_x=\sum_{\xi'\in\mathcal{X}'}p_{x,\xi'}$ are the conditional and marginal distributions on $\mathcal{X}$, respectively. In information-theoretic terms, $f(\gamma)$ is the mutual information between $X'$ and $Y_{\gamma}=\sqrt{\gamma}X+N$, where $N \sim \mathcal{N}(0,1)$ is independent of $X'$ and $X$, which are dependent and have finite alphabets. The variable $\gamma$ is the signal-to-noise ratio.

The function is known to be concave in the special case $X=X'$ ($p_{x|x'}=\delta_{x,x'}$). When $X\neq X'$, examples can be found in which $f$ is not always concave, but becomes concave eventually as $\gamma$ grows.
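
For readers who want to experiment with such examples, here is a minimal numerical sketch of how one can probe this behaviour. It evaluates $f(\gamma)$ by Gauss-Hermite quadrature (natural here, since each integral is against a Gaussian weight centred at $\sqrt{\gamma}x$) and prints a central second difference as a stand-in for $f''$. The alphabet `xs` and the joint pmf `p_joint` below are a made-up toy example, not taken from any particular case discussed above.

```python
import numpy as np

# Hypothetical toy example (not from the question): X takes values in xs,
# X' is the column index, and p_joint[i, j] = P(X = xs[i], X' = j).
xs = np.array([-1.0, 1.0])
p_joint = np.array([[0.4, 0.1],
                    [0.1, 0.4]])
p_x = p_joint.sum(axis=1)        # marginal pmf of X
p_xp = p_joint.sum(axis=0)       # marginal pmf of X'
p_cond = p_joint / p_xp          # p_{x|x'}: column j is the conditional pmf given X' = j

# Gauss-Hermite rule for integrals against exp(-(y - sqrt(gamma) x)^2 / 2)
nodes, weights = np.polynomial.hermite.hermgauss(80)

def f(gamma):
    s = np.sqrt(gamma)
    total = 0.0
    for i, x in enumerate(xs):                        # outer sum over (x, x')
        for j in range(p_joint.shape[1]):
            y = s * x + np.sqrt(2.0) * nodes          # substitution y = sqrt(gamma) x + sqrt(2) u
            num = sum(p_cond[k, j] * np.exp(-0.5 * (y - s * xs[k]) ** 2) for k in range(len(xs)))
            den = sum(p_x[k] * np.exp(-0.5 * (y - s * xs[k]) ** 2) for k in range(len(xs)))
            total += p_joint[i, j] * np.dot(weights, np.log(num / den)) / np.sqrt(np.pi)
    return total

# Crude probe of eventual concavity via a central second difference of f
h = 1e-3
for g in [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]:
    fg = f(g)
    f2 = (f(g + h) - 2.0 * fg + f(g - h)) / h ** 2
    print(f"gamma = {g:5.1f}    f = {fg:.6f}    f'' ~ {f2:+.3e}")
```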

Intuitively, $f''$ must have finitely many zeros (i.e. $f$ is eventually concave), since $f$ is a finite combination of "well-behaved" functions. I would like to formalize this by showing that $f$ belongs to a class of functions that have a finite number of zeros and that is closed under differentiation. This would show that $f''$ also has finitely many zeros, which is just what we need.

There exist several such classes of functions, including Hardy's class L of "orders of infinity" and the Pfaffian closure. See also this related question. However, the mathematical machinery here is a bit complicated for me, and I could not show that $f$ belongs to either class, due to the integration with respect to $y$ in its definition. Perhaps this can be shown by recasting $f$ as a solution to an ODE? Or perhaps there exists another, more suitable class of functions that I am not aware of?


There is also an expression for $f''(\gamma)$ which is quite lengthy. Here it is for completeness: $$f''(\gamma) = -\frac{1}{2\sqrt{2\pi}}\sum_{x\in\mathcal{X},x'\in\mathcal{X}'}p_{x,x'}\int_{-\infty}^\infty{e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}[\phi_X^2(y;\gamma)-\phi_{X|X'}^2(y,x';\gamma)]dy}$$ where $$\phi_X(y;\gamma)=\frac{\sum_{x\in\mathcal{X}}x^2p_{x}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}{\sum_{x\in\mathcal{X}}p_{x}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}-\left(\frac{\sum_{x\in\mathcal{X}}xp_{x}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}{\sum_{x\in\mathcal{X}}p_{x}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}\right)^2$$ and similarly $$\phi_{X|X'}(y,x';\gamma)=\frac{\sum_{x\in\mathcal{X}}x^2p_{x|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}{\sum_{x\in\mathcal{X}}p_{x|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}-\left(\frac{\sum_{x\in\mathcal{X}}xp_{x|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}{\sum_{x\in\mathcal{X}}p_{x|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}}\right)^2$$ This has some estimation-theoretic meaning, but not anything that seems useful in establishing my desired result.
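
For what it is worth, the quantities in the brackets have a clean interpretation (this is the estimation-theoretic meaning alluded to above): the weights appearing in $\phi_X$ and $\phi_{X|X'}$ are exactly the posterior probabilities of $X$ given $Y_\gamma=y$ (and, in the second case, also $X'=x'$), so $$ \phi_X(y;\gamma)=\operatorname{Var}\left(X\mid Y_\gamma=y\right),\qquad \phi_{X|X'}(y,x';\gamma)=\operatorname{Var}\left(X\mid Y_\gamma=y,\,X'=x'\right), $$ i.e. the posterior variances of $X$ with and without knowledge of $X'$.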

$\endgroup$
3
  • $\begingroup$ If you do have an ODE for $f$, I think it would help. $\endgroup$ Oct 25, 2012 at 13:13
  • $\begingroup$ Unfortunately I don't. I have a meaningful expression $f'$ (a difference of two MMSEs) but I don't know of a way to relate to $f$ algebraically. Is there no canonical way to turn an integral formula for $f$ like the first equation in my question to an ODE? $\endgroup$ Oct 25, 2012 at 13:28
  • $\begingroup$ The link to springerlink.com is broken. I'm also unable to find any snapshot saved on the Wayback Machine. $\endgroup$ May 1 at 12:14

2 Answers

4
$\begingroup$

Well, I also couldn't understand Thierry's argument, so I decided to use the old "pushing junk down the ladder" trick from classical asymptotic analysis. Note first of all that, looking at the expression for the first derivative, which is an integral of a non-negative quantity, we can easily realize that either $f'(t)$ is identically $0$, or $f'(t)\ge ce^{-Ct}$ (I prefer to use $t$ instead of $\gamma$ for typing reasons). Canceling $e^{-{y^2}/{2}}$ whenever possible and changing the variable $y$ to $\sqrt ty$, we see that, up to some irrelevant factor, $f''(t)$ is the same as $$ F(t)=\int_{-\infty}^\infty e^{-t\frac{y^2}2}\frac{\sum_k\alpha_k e^{t U_k(y)}}{\sum_j\beta_j e^{t V_j(y)}}\,dy $$ where $U_k,V_j$ are some fixed linear functions of $y$ and the sums are finite. Note that $F(t)\le e^{Ct}$ for some $C>0$ (in your case, it has no chance to grow at all, but let's be generous and use the particular structure of your function only when we need it).

Now look at $W(y)=\max_j V_j(y)$. It is a piecewise linear function. Note that the whole real line can be split into finitely many (bounded or unbounded) intervals $J_m$, each interval containing one point where the graph of $W$ has a corner, so that on each interval a few linear forms $V_j$ meet at one point while all the rest stay below the maximum of these meeting forms by some constant. Splitting at that corner point and moving the origin to it, we see that $F(t)$ is a sum of integrals of the form $$ I(t)=\int_0^a e^{-t\frac{y^2}2}\frac{S(t,y)}{R(t,y)+J(t,y)}\,dy $$ where $S(t,y)$ and $J(t,y)$ are of the same form $\sum_k\alpha_k e^{t U_k(y)}$ with arbitrary linear forms $U_k$, while $R(t,y)=1+\sum_j{\beta_je^{-\lambda_j t}}$ with some positive $\lambda_j$, and $J(t,y)\le e^{-ct}R(t,y)$ on $[0,a)$.
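
Where do the exponentials of linear forms in $F(t)$ come from? Simply from expanding the square after the substitution $y\mapsto\sqrt t\,y$: $$ e^{-\frac{1}{2}(\sqrt t\,y-\sqrt t\,x)^2}=e^{-t\frac{y^2}{2}}\,e^{t\left(xy-\frac{x^2}{2}\right)}, $$ so every Gaussian factor becomes $e^{-t\frac{y^2}{2}}$ times $e^{tU(y)}$ with the linear form $U(y)=xy-\frac{x^2}{2}$ (one form for each alphabet point). The common factor $e^{-t\frac{y^2}{2}}$ cancels in the ratios defining $\phi_X$ and $\phi_{X|X'}$, and products and squares of such ratios are again ratios of finite exponential sums, which is exactly the shape of $F(t)$.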

Now it is time to push the junk term $J(t,y)$ down the ladder by writing $$ \frac 1{R(t,y)+J(t,y)}=\frac 1{R(t,y)}\sum_{m=0}^M (-1)^m\left[\frac{J(t,y)}{R(t,y)}\right]^m+\frac 1{R(t,y)+J(t,y)}(-1)^{M+1}\left[\frac{J(t,y)}{R(t,y)}\right]^{M+1} $$ If $M$ is chosen large enough, the last term we want to drop is dominated by $e^{-Ct}$ times the original expression, so, stopping far enough, we can make the error decay exponentially with as large an exponent as we wish. Thus, we can get an approximation, which is a linear combination of terms of the kind $$ T(t)=e^{bt}\int_0^a e^{-t\frac{y^2}2}\frac{e^{\lambda ty}}{1+Q(t,y)}\,dy $$ where $Q(t,y)=\sum_j{\beta_je^{-\lambda_j t}}$ with some positive $\lambda_j$.
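
In detail, the term being dropped is the original integrand times $\left[J(t,y)/R(t,y)\right]^{M+1}$, and the bound $J\le e^{-ct}R$ on $[0,a)$ gives $$ \left[\frac{J(t,y)}{R(t,y)}\right]^{M+1}\le e^{-c(M+1)t}, $$ so its contribution to $I(t)$ is at most $e^{-c(M+1)t}$ times a quantity that is itself at most $e^{Ct}$; taking $M>C/c$ makes the error decay exponentially, with an exponent growing linearly in $M$.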

Now we have to consider 2 cases.

Case 1: There is no $Q(t,y)$ (pure exponent). Here we can shift everything to the point of maximum of $\lambda y-\frac{y^2}2$, split if necessary, and get a linear combination of terms of the kind $$ e^{bt}\int_0^a e^{-t\frac{y^2}2}\,dy=ce^{bt}t^{-1/2}-ce^{b't}\int_0^{\infty}e^{-t\frac{y^2}2}e^{-aty}\,dy $$ (write $\int_0^a=\int_{0}^\infty-\int_{a}^\infty$ and shift the starting point back to the origin in the second one).
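
Explicitly, the shift works as follows: the exponent in a pure-exponent term is $t\left(\lambda y-\frac{y^2}2\right)=\frac{t\lambda^2}2-\frac t2(y-\lambda)^2$, so $$ e^{bt}\int_0^a e^{-t\frac{y^2}2}e^{\lambda ty}\,dy=e^{\left(b+\frac{\lambda^2}2\right)t}\int_{-\lambda}^{a-\lambda} e^{-t\frac{y^2}2}\,dy, $$ and splitting the shifted interval at the origin (when $0<\lambda<a$) leaves integrals of exactly the displayed kind with a new constant in place of $b$.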

Case 2: There is a non-trivial $Q(t,y)$ in the game. Then we want to push the junk down the ladder once again, writing $$ \frac{1}{1+Q(t,y)}=\sum_{m=0}^M (-Q(t,y))^m+\frac{(-Q(t,y))^{M+1}}{1+Q(t,y)} $$ The purpose of the game is to get a lot of pure exponents and an exponentially decaying function $e^{\lambda ty}Q(t,y)^{M+1}$ in the numerator (note that, for all we know, $\lambda$ could be large and positive in the beginning). Moreover, when integrating the last term, we can add the integral from $a$ to $\infty$, because that one can be made to decay exponentially with an arbitrarily large exponent by choosing $M$ large enough.
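
Indeed, each term $(-Q(t,y))^m$ with $1\le m\le M$ is a finite sum of constants times $e^{-\mu t}$, where $\mu>0$ is a sum of $m$ of the $\lambda_j$'s, so multiplying it by $e^{bt}e^{\lambda ty}$ merely replaces $b$ by $b-\mu$ and leaves a pure exponent, i.e. a term of the kind treated in Case 1.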

Now, with all the junk pushed ruthlessly away, we get a sum of terms of two kinds: $e^{bt}t^{-1/2}$ and $e^{bt}\int_0^\infty e^{-t\frac{y^2}2}g(ty)\,dy$, where $g$ is some exponentially decaying function on $(0,\infty)$. Plus, of course, we have a very quickly decaying error. We can combine the terms of each kind with the same $b$, so I'll assume that all $b$'s are different.

In the terms of the second kind, one is tempted to change the variable to $ty$, getting $$ t^{-1}\int_0^\infty e^{-t^{-1}\frac{y^2}2}g(y)\,dy $$ At this point, we can safely write the asymptotic expansion in the inverse powers of $t$, just expanding $e^{-z}$ and integrating formally. Note that it will be merely an asymptotic expansion, not a convergent series. However, it is enough for us to get the asymptotics for the whole thing unless all coefficients miraculously vanish. But the coefficients are even moments of an exponentially decaying function. Reflecting it around the origin, we get a function with all vanishing moments and exponential decay. The Fourier transform of such a function is a real-analytic function with all derivatives at the origin equal to $0$. Thus, it is identically $0$, and so is the original $g$. But then we have a total collapse of all terms and only the error is present, i.e., $F$ decays faster than any exponential. That is probably impossible in general, but in your particular case we have a clear reason: if $f''$ decays superexponentially, so does $f'$, but we ruled that out from the beginning.
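
Written out, the expansion comes from $e^{-z}=\sum_{n\ge 0}\frac{(-z)^n}{n!}$ with $z=\frac{y^2}{2t}$: $$ t^{-1}\int_0^\infty e^{-\frac{y^2}{2t}}g(y)\,dy\;\sim\;\sum_{n\ge 0}\frac{(-1)^n}{2^n\,n!}\,t^{-n-1}\int_0^\infty y^{2n}g(y)\,dy\qquad(t\to\infty), $$ so the coefficients are, up to the harmless constants $\frac{(-1)^n}{2^n n!}$, the even moments of $g$.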

Of course, if we can fix the Pfaffian argument, this "junk" method will become totally obsolete. But until Thierry replies, one can live with it :).

$\endgroup$
2
  • $\begingroup$ Thanks! I must admit that after reading your solution twice some of the steps are still not totally clear to me - especially the part where you switch from $F$ to $I$ using $W$, which makes a very brief appearance. Nonetheless the "junk" approach seems viable and might serve as a substitute if the Pfaffian direction falls through. $\endgroup$ Nov 1, 2012 at 9:12
  • $\begingroup$ Draw a picture! It is, indeed, a bit awkward in words, but if you bother to draw 5-7 lines on the plane and look at them for 5-10 minutes, everything should become clear. $\endgroup$ – fedja Nov 1, 2012 at 12:01
2
$\begingroup$

Your function $f(\gamma)$, and its second derivative, are definable in the Pfaffian closure of $\mathbb{R}_{\exp}$. In particular, since this Pfaffian closure is an o-minimal structure, the set $\lbrace \gamma \mid f''(\gamma)=0\rbrace$ has only finitely many connected components, which might be enough for your purpose.

More details: fix $x\in\mathcal{X}$ and $x'\in\mathcal{X}'$, let $$ \phi(\gamma,y)= e^{-\frac{1}{2}(y-\sqrt{\gamma}x)^2}\log\left(\frac{\sum_{\xi\in\mathcal{X}}p_{\xi|x'}e^{-\frac{1}{2}(y-\sqrt{\gamma}\xi)^2}}{\sum_{\xi\in\mathcal{X}}p_{\xi}e^{-\frac{1}{2}(y-\sqrt{\gamma}\xi)^2}}\right),$$ and let $$ \Phi(\gamma,y)=\int_0^y \phi(\gamma,t)dt.$$

This function $\phi$ is certainly definable in the exponential field, and the function $\Phi$ is definable in its Pfaffian closure. Now, $$ \int_{-\infty}^\infty \phi(\gamma,y)dy = \lim_{A \to \infty} \Phi(\gamma,A) - \lim_{B\to -\infty} \Phi(\gamma,B).$$ Since these limits are first-order definable, these integrals, and thus your function $f(\gamma)$ (a finite linear combination of them), are definable in the Pfaffian closure of $\mathbb{R}_{\exp}$.
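
For instance, the graph of the first limit, as a function of $\gamma$, is cut out by the first-order formula $$ L=\lim_{A\to\infty}\Phi(\gamma,A)\iff\forall\varepsilon>0\;\exists A_0\;\forall A>A_0\;\;|\Phi(\gamma,A)-L|<\varepsilon, $$ which only quantifies over reals and only uses $\Phi$, so it defines a function in any structure in which $\Phi$ is definable.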

$\endgroup$
7
  • $\begingroup$ Sorry for my ignorance, but is there a simple reason why the Pfaffian closure is closed under parametric integration like this? (By the way, there is a confusion of $x$ and $\gamma$ in the definition of $\phi$.) $\endgroup$ Oct 25, 2012 at 15:45
  • $\begingroup$ @Thierry, when you say $\phi$ is definable in the exponential field, do you mean that for every $t$, $\phi(\cdot,t)$ is definable? If so I also don't understand why $\Phi$ is in the Pfaffian closure of $\mathbb{R}_{\exp}$. Otherwise, I don't understand in what sense is $\phi$ definable - is there an extension of the exponential field to functions of more than one variable? Thanks! $\endgroup$ Oct 25, 2012 at 17:10
  • 2
    $\begingroup$ A function $f(x_1,\dots,x_n)\colon\mathbb R^n\to\mathbb R$ is definable if there is a first-order formula $A(x_1,\dots,x_n,y)$ in the language of $\mathbb R_{\exp}$ which describes the graph $\{(x_1,\dots,x_n,y)\in\mathbb R^{n+1}:f(x_1,\dots,x_n)=y\}$. The function does not have to be unary (and it does not have to be total, as I wrote here for simplicity). $\mathbb R_{\exp}$ even has binary functions in its signature: $x+y$ and $x\cdot y$. Here, it suffices to show that $\log x$ and $x/y$ are definable (e.g., $x/y=z$ iff $x=yz\land y\ne0$), and then $\phi$ is constructed by composition. $\endgroup$ Oct 25, 2012 at 17:41
  • $\begingroup$ In this case, doesn't the fact that $\frac{\partial}{\partial y}\Phi(\gamma,y)=\phi(\gamma,y)$ imply that $\Phi$ is in the Pfaffian closure of a set containing $\phi$? $\endgroup$ Oct 25, 2012 at 18:52
  • 3
    $\begingroup$ I'm sorry that I didn't have the time to come back to this. I have been in email contact with the OP. Emil raises some good objections, and I am not sure how to fix them yet or when I'll have the time to tackle this. Sorry about that! $\endgroup$ Nov 1, 2012 at 22:28
