
Assume one can perform measurements of an unknown quantity $\theta$ as $$y = \theta + \epsilon(t),$$ where $\epsilon(t) \sim \mathcal{N}(0,1/t)$ is the measurement error when a time $t$ was spent to collect the observation.

Now, assume that we can actually repeat this several times, but that the observations might be correlated. The goal is to optimally split $T=1$ hour of computation time between observations.

The thing I have in mind is that $\epsilon(t)$ should in fact be deterministic, but we try to model it by a (non-stationary) Gaussian process: $$\epsilon(t) = \frac{1}{\sqrt{t}} Z(t),$$ where $Z(t)$ is a GP with unit variance. For simplicity, let us assume $$\operatorname{cov}\big(Z(t_1),Z(t_2)\big) = e^{-(t_1-t_2)^2}.$$
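Since $\epsilon(t)$ is just a scaled draw from a squared-exponential GP, this model is easy to simulate. A minimal sketch (Python/NumPy; the helper name `sample_errors` is mine) that checks the marginal variance $1/t$ empirically:

```python
import numpy as np

# Sketch of the error model: epsilon(t) = Z(t)/sqrt(t), where Z is a
# zero-mean GP with cov(Z(t1), Z(t2)) = exp(-(t1 - t2)^2).
def sample_errors(times, n_draws=1, seed=None):
    rng = np.random.default_rng(seed)
    t = np.asarray(times, dtype=float)
    K = np.exp(-np.subtract.outer(t, t) ** 2)      # unit-variance SE kernel of Z
    z = rng.multivariate_normal(np.zeros(len(t)), K, size=n_draws)
    return z / np.sqrt(t)                          # non-stationary 1/sqrt(t) scaling

# Empirically, var(epsilon(t)) should be close to 1/t:
draws = sample_errors([0.1, 0.5, 0.9], n_draws=200_000, seed=0)
print(draws.var(axis=0))   # roughly [10, 2, 1.11]
```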

  • If two observations are allowed, with a time $t$ spent on the first observation and $1-t$ on the second, then we have $$ \left[\begin{array}{c}y_1\\y_2\end{array}\right]\sim \mathcal{N}\left( \mathbf{1}\theta,\, \Sigma(t) \right), \ \mathrm{where}\ \Sigma(t)= \left[\begin{array}{cc}\frac{1}{t} & \frac{1}{\sqrt{t(1-t)}} e^{-(1-2t)^2}\\ \frac{1}{\sqrt{t(1-t)}} e^{-(1-2t)^2} & \frac{1}{1-t}\end{array}\right] $$ and $\mathbf{1}$ is the vector of all ones. The variance of the BLUE for $\theta$ is $\big(\mathbf{1}^T \Sigma(t)^{-1} \mathbf{1}\big)^{-1}$, so we want to select $t\in[0,1]$ that maximizes the quantity $\rho(t):= \mathbf{1}^T \Sigma(t)^{-1} \mathbf{1}$. Curiously, the supremum is attained as $t\to 0^+$ (or as $t \to 1^-$), with $\rho(t)\to 1/(1-e^{-2})\simeq 1.156$. This indicates a discontinuity at $t=0$ and $t=1$, where there is a single observation of unit variance, so we would define $\rho(0)=\rho(1)=1$.

  • Even worse: assume we can observe $y_1$ during a time $t=\epsilon$, and $y_2$ during a time $t=2\epsilon$ (for a small $\epsilon>0$). Then both $y_1$ and $y_2$ have a huge variance, but they are extremely correlated (the Pearson correlation coefficient is $e^{-\epsilon^2}$). The variance of the BLUE is equal to $$ \frac{-\sqrt{2}(e^{-2\epsilon^2}-1)}{\epsilon(3\sqrt{2}-4e^{-\epsilon^2})},$$ which tends to $0$ as $\epsilon\to 0$, so $\theta$ can be recovered with arbitrary precision, and moreover for a vanishing computational effort! For reference, in this situation the BLUE for $\theta$ is $$\hat{\theta} = \frac{e^{-\epsilon^2}\sqrt{2}-1}{2e^{-\epsilon^2}\sqrt{2}-3}\, y_1 + \frac{e^{-\epsilon^2}\sqrt{2}-2}{2e^{-\epsilon^2}\sqrt{2}-3}\, y_2 \simeq -2.4142\, y_1 + 3.4142\, y_2.$$
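The first bullet's computation can be checked numerically. For a $2\times 2$ matrix $\Sigma = \big[\begin{smallmatrix} a & c\\ c & b\end{smallmatrix}\big]$, the standard inverse gives $\mathbf{1}^T \Sigma^{-1} \mathbf{1} = (a+b-2c)/(ab-c^2)$, which simplifies to a closed form for $\rho(t)$. A small sketch:

```python
import numpy as np

def rho(t):
    """Precision 1' Sigma(t)^{-1} 1 of the BLUE for two observations with
    times t and 1 - t.  For Sigma = [[a, c], [c, b]] the 2x2 inverse gives
    (a + b - 2c)/(ab - c^2), which simplifies to the expression below."""
    r = np.exp(-(1.0 - 2.0 * t) ** 2)              # corr(Z(t), Z(1 - t))
    return (1.0 - 2.0 * r * np.sqrt(t * (1.0 - t))) / (1.0 - r ** 2)

print(rho(0.25))     # an interior split is worse than a single observation
print(rho(1e-8))     # approaches 1/(1 - e^{-2}) ~ 1.156 as t -> 0+
```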

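The second bullet does not even need the closed form: one can build $\Sigma$ for times $\epsilon$ and $2\epsilon$ and solve for the BLUE weights and variance directly. A sketch (the helper name `blue_eps` is mine):

```python
import numpy as np

def blue_eps(eps):
    """BLUE weights and variance for observations at times eps and 2*eps."""
    t = np.array([eps, 2.0 * eps])
    r = np.exp(-(t[0] - t[1]) ** 2)                 # corr(Z(eps), Z(2*eps))
    c = r / np.sqrt(t[0] * t[1])
    Sigma = np.array([[1.0 / t[0], c], [c, 1.0 / t[1]]])
    s = np.linalg.solve(Sigma, np.ones(2))          # Sigma^{-1} 1
    return s / s.sum(), 1.0 / s.sum()               # weights, BLUE variance

w, v = blue_eps(1e-4)
print(w)    # close to the limiting weights (-2.4142, 3.4142)
print(v)    # close to 2*eps/(3 - 2*sqrt(2)), i.e. it vanishes with eps
```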
So, based on these observations:

  1. Do you have an intuitive explanation for this phenomenon? How could two experiments with times $t_1=\epsilon$ and $t_2=1-\epsilon$ be much better than a single experiment with time $t=1$? My guess is that assuming that the covariance kernel of $Z(t)$ is known is an extremely strong assumption, but I find it counter-intuitive that information on the correlation between two observations can give arbitrarily good precision, even though both observations have a huge variance!

  2. Can you think of a better model, with more realistic assumptions, that allows us to optimally split one unit of computation time?

  • I don't follow your model. I suspect that you are accidentally assuming something strange, and that is why you are getting strange results. If you are sure you haven't made a mistake, please elaborate on how you get a better estimate of $\theta$ when $\epsilon = 10^{-10}$, say. What do you do with the observations $y_1=50{,}000$, $y_2=3$? Aug 29, 2016 at 14:09
  • Yes, I agree this is a strange model, but I wonder what a better model could be. The BLUE for $\theta$ is obtained by WLS: $\hat{\theta} = (\mathbf{1}^T \Sigma^{-1} \mathbf{1})^{-1} \mathbf{1}^T \Sigma^{-1} \mathbf{y}$. For small $\epsilon$, an expansion up to order $\sqrt{\epsilon}$ gives $\hat{\theta} \simeq y_2 - e^{-1} (y_1-y_2) \sqrt{\epsilon}$.
    – guigux
    Aug 31, 2016 at 9:24
  • So for your values, instead of having $\hat{\theta}=3$, we would obtain $\hat{\theta}\simeq 2.816$.
    – guigux
    Aug 31, 2016 at 9:26
  • The (too) strong assumption is that we know that the Pearson correlation between $y_1$ and $y_2$ is $\rho \simeq e^{-1}$. The conditional law of $y_2 \mid y_1$ is normal with variance $\simeq 1-\rho^2 < 1$.
    – guigux
    Aug 31, 2016 at 9:36
  • To me, that confirms your model is wrong, and you should move on if you are actually interested in understanding real problems that could be described as splitting your observation time up with errors that might be correlated. It's as if you were modelling the temperature of a cup of tea and your first guess accidentally produced a vertical asymptote instead of a smooth decay. Do you study when nuclear fusion might be happening in the cup of tea, or do you correct your model? Sep 2, 2016 at 9:55
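For what it's worth, the numbers exchanged in the comments can be reproduced directly. A sketch assuming the setting of the first bullet, i.e. times $t_1=\epsilon$ and $t_2=1-\epsilon$, with the commenter's values $\epsilon=10^{-10}$, $y_1=50{,}000$, $y_2=3$:

```python
import numpy as np

# Commenter's example: eps = 1e-10, observations y1 = 50000, y2 = 3,
# with times t1 = eps and t2 = 1 - eps (the two-observation setting above).
eps, y = 1e-10, np.array([50_000.0, 3.0])
t = np.array([eps, 1.0 - eps])
r = np.exp(-(t[0] - t[1]) ** 2)                     # corr(Z(t1), Z(t2)) ~ e^{-1}
c = r / np.sqrt(t[0] * t[1])
Sigma = np.array([[1.0 / t[0], c], [c, 1.0 / t[1]]])
s = np.linalg.solve(Sigma, np.ones(2))              # Sigma^{-1} 1
theta_hat = s @ y / s.sum()                         # exact BLUE (WLS) estimate
theta_approx = y[1] - np.exp(-1.0) * (y[0] - y[1]) * np.sqrt(eps)  # expansion
print(theta_hat, theta_approx)    # both close to 2.816
```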
