Difficulty of Predicting the Maximum of Gaussians

Suppose that we have a random variable X \in \mathbb{R}^d such that \mathbb{E}[XX^{\top}] = I_{d \times d}. Now take k independent Gaussian random variables Z_1, \ldots, Z_k \sim \mathcal{N}(0, I_{d \times d}), and let J = \arg\max_{1 \le j \le k} Z_j^{\top} X.
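
For concreteness, here is a minimal simulation of this setup. It is a sketch: the distribution of X is left open above, so the code takes X to be standard Gaussian, which satisfies \mathbb{E}[XX^{\top}] = I.

```python
# A minimal sketch of the setup, assuming X ~ N(0, I) (the text only
# requires E[XX^T] = I; the Gaussian choice is ours for illustration).
import numpy as np

rng = np.random.default_rng(0)
d, k = 100, 10

X = rng.standard_normal(d)        # one draw of X, with E[XX^T] = I
Z = rng.standard_normal((k, d))   # k independent N(0, I_{d x d}) vectors
J = int(np.argmax(Z @ X))         # J = argmax_j Z_j^T X
print(f"J = {J}")
```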

It seems that it should be very hard to predict J well, in the following sense: for any fixed function q(j \mid x), the average prediction accuracy \mathbb{E}_{x}[q(J \mid x)] should, with high probability over the randomness in Z, be very close to \frac{1}{k}. In fact, Alex Zhai and I think that for any \epsilon > 0, the probability (over Z) that \mathbb{E}_{x}[q(J \mid x)] exceeds \frac{1}{k} + \epsilon should be at most \exp(-C(\epsilon/k)^2 d) for some constant C. (We can already show this with (\epsilon/k)^2 replaced by (\epsilon/k)^4.) I won't sketch a proof here, but the idea is pretty cool: it basically uses Lipschitz concentration of Gaussian random variables.
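
To make the conjecture concrete, here is a small Monte Carlo sketch. The predictor q below is a hypothetical choice for illustration: it puts all its mass on the argmax under its own Gaussian directions W, drawn independently of Z, so \mathbb{E}_x[q(J \mid x)] is just its prediction accuracy.

```python
# Monte Carlo sketch of the conjecture. The predictor q is hypothetical:
# it places all its mass on argmax_j W_j^T x for directions W drawn
# independently of Z, so E_x[q(J | x)] is its prediction accuracy.
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 200, 10, 20000

Z = rng.standard_normal((k, d))   # draw Z once; the conjecture's probability is over Z
W = rng.standard_normal((k, d))   # q's own directions, fixed independently of Z

X = rng.standard_normal((n, d))   # n samples of x to estimate E_x[.]
J = np.argmax(X @ Z.T, axis=1)    # true labels J(x) = argmax_j Z_j^T x
guess = np.argmax(X @ W.T, axis=1)

acc = (guess == J).mean()         # estimate of E_x[q(J | x)]
print(f"estimated E_x[q(J|x)] = {acc:.4f}   vs   1/k = {1/k:.4f}")
```

Since W is independent of Z, symmetry over the k indices makes the expected accuracy exactly \frac{1}{k}; the conjecture is the stronger statement that, with high probability over Z, the deviation from \frac{1}{k} is exponentially small in d.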

I’m mainly posting this problem because I think it’s pretty interesting, in case anyone else is inspired to work on it. It is closely related to the covering number of exponential families under the KL divergence, where we are interested in coverings at relatively large radii (\log(k) - \epsilon rather than \epsilon).
