3
$\begingroup$

The following problem has come up while working on the relationship between certificate and randomized decision tree complexities of boolean functions. However, I think it is of interest by itself and some variants of it might have already been studied:

Given $k$ finite sets $C_1, ..., C_k$, such that $|C_j| \geq c$ for every $1 \leq j \leq k$. Let $\mathcal{F}$ be a family of sets $S_1, ..., S_m \subseteq \{1,...,n\}$ with $s = \max_{1 \leq j \leq m} |S_j|$ such that for any $\epsilon m$ distinct indexes $i_1 ... i_{\epsilon m}$ there exists a $1 \leq j \leq k$ such that $C_j \subseteq S_{i_1} \cup ... \cup S_{i_{\epsilon m}}$.

What is the minimal $s$ as a function of $c,k,m\;\text{and}\;n$?

Some further (optional or obvious constraints)

  • $c \leq n$ and in general $c$ is a function of $n$.
  • $k$ can be anything and is a function of $n$ and $c$. Usually $k$ is greater than $n$ and $c$, although even the case of $k = 2$ is interesting.
  • $m$ can be anything, but the more interesting case is when $m >> n$.
  • $\epsilon$ can be assumed to be any constant strictly less than $1/2$. Although non-trivial results for $\epsilon = l/m$ with some constant integer $l$ are also interesting.

If $k = 1$ then $s \geq (1 - \epsilon)|C_1|$ by a simple pigeon hole argument. I don't know any non-trivial bound for $k = 2$, but obviously we are interested in the case of $C_1 \cap C_2 = 0$.

Alternatively, if $C_1, ..., C_k$ are all the $(\lfloor n/2 \rfloor + 1)$-element subsets then there exists an $\mathcal{F}$ such that $s = 3$. What happens if $C_1,...,C_k$ are all the $r$-element subsets?

This seems like a relatively natural question, and I would be surprised if something similar has not been asked before. Any pointers to the literature or partial results are appreciated. If this has not been studied, then can you recommend any tools or approaches with which to answer this question?


As per request, a more formal statement:

Let $n,m$ be positive integers, and $0 \leq \epsilon < 1/2$ -- a constant.

Fix $C_1, ..., C_k \subseteq \{1,...,n\}$.

Solve:

$s = \min_{S_1, ..., S_m \subseteq \{1,...,n\}} \max_{1 \leq j \leq m} |S_j|$

such that:

$\forall I \subseteq \{1,...,m\} \; \exists j \in \{1,..,k\} \quad (|I| \geq \epsilon m) \Rightarrow (C_j \subseteq \bigcup_{i \in I} S_i)$


The question also works if we replace $\max_j |S_j|$ by $\mathbb{E}_j[|S_j|]$ or by assuming that $|S_i| = |S_j|$ for any $1\leq i < j \leq m$, but I am not sure how that helps.

$\endgroup$
5
  • 2
    $\begingroup$ With your use of the words "given" and "let" I find it hard to work out exactly how the quantification works in your question. Would it be possible to ask it in a more formal way? $\endgroup$
    – gowers
    Jul 14, 2011 at 19:59
  • 3
    $\begingroup$ I think another way to present it is as follows: Let the C's and epsilon be given. Let m and n be additional parameters large enough to make the following interesting. What is the smallest s among all families such that no S has more than s elements and every union of epsilon times m S's covers one of the C's. One may need large n if epsilon is very small. Gerhard "Email Me About System Design" Paseman, 2011.07.14 $\endgroup$ Jul 14, 2011 at 22:07
  • $\begingroup$ Added a more formal statement, hopefully it is clear. Paseman's more informal restatement also works. $\endgroup$ Jul 15, 2011 at 5:45
  • $\begingroup$ So s depends on $\epsilon$ and the sets $C_1,\dots,C_k$? $\endgroup$
    – gowers
    Jul 15, 2011 at 12:51
  • $\begingroup$ Gowers , $s$ also depends on $m$, but yes. The reason I see for including $n$ is if epsilon is small, we may need some padding to get nontrivial unions with very large $m$. This is one problem where the poster might help out by motivating some of the parameters. I would like to know why the C's are specified before n is mentioned. Gerhard "Email Me About System Design" Paseman, 2011.07.15 $\endgroup$ Jul 15, 2011 at 16:52

1 Answer 1

1
$\begingroup$

This problem has enough parameters to easily confuse. I will add some trivial cases to help motivate the problem, and encourage the poster to provide some more.

Of course, arranging that every set in S contains one of the sets in the given collection C shows that there is a solution with $s > c$; what is wanted is $s <= c$.

For $k=2$ and disjoint sets $C_1$ and $C_2$, both of size $c$, getting $s < c$ seems challenging because you have to choose at most half the sets of S. I can imagine a case where epsilon times $m > 2$, and then reducing to the case of $k=1$ and (epsilon times $m$) also being reduced by 1, but this implies constraints on n,m and epsilon and it is not pretty, at least to me. However,a pigeonhole argument should work for the case of $n$ disjoint equinumerous $C_i$ as it does for $n=1$, provided m and epsilon fit some narrow constraints, again getting $s > c(1-$ epsilon).

If C contains all subsets of size $r$ of an $n$ set, one can choose s=1, as long as $m$ and epsilon play nice. If epsilon is less than 1/2, then $2r < n$. For larger $r$, one can try mostly disjoint sets in S of larger size; the problem reminds one of doing certain kinds of covering designs.

As my understanding of the problem develops, I will add to this. It seems to help if one relaxes the constraints on s and epsilon, and then looks for more challenging bits as you try to impose some constraints on groups of parameters. Even so, I encourage Artem to provide more motivation.

Gerhard "Ask Me About System Design" Paseman, 2011.07.15

$\endgroup$

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge that you have read and understand our privacy policy and code of conduct.

Not the answer you're looking for? Browse other questions tagged or ask your own question.