Sets with Small Intersection

Suppose that we want to construct subsets S_1, \ldots, S_m \subseteq \{1,\ldots,n\} with the following properties:

  1. |S_i| \geq k for all i
  2. |S_i \cap S_j| \leq 1 for all i \neq j

The goal is to construct as large a family of such subsets as possible (i.e., to make m as large as possible). If k \geq 2\sqrt{n}, then up to constants it is not hard to show that the optimal number of sets is \frac{n}{k} (that is, the trivial construction with all sets disjoint is essentially the best we can do).

Here I am interested in the case when k \ll \sqrt{n}. In this case I claim that we can substantially outperform the trivial construction: we can take m = \Omega(n^2 / k^3). The proof is a very nice application of the asymmetric Lovasz Local Lemma. (Readers can refresh their memory here on what the asymmetric LLL says.)

Proof. We will take a randomized construction. For i \in \{1,\ldots,n\}, j \in \{1,\ldots,m\}, let X_{i,a} be the event that i \in S_a. We will take the X_{i,a} to be independent each with probability \frac{2k}{n}. Also define the events

Y_{i,j,a,b} = I[i \in S_a \wedge j \in S_a \wedge i \in S_b \wedge j \in S_b]

Z_{a} = I[|S_a| < k]

It suffices to show that with non-zero probability, all of the Y_{i,j,a,b} and Z_{a} are false. Note that each Y_{i,j,a,b} depends on Y_{i',j,a',b}, Y_{i',j,a,b'}, Y_{i,j',a',b}, Y_{i,j',a,b'}, Z_a, Z_b, and each Z_a depends on Y_{i,j,a,b}. Thus each Y depends on at most 4nm other Y and 2 other Z, and each Z depends on at most n^2m/2 of the Y. Also note that P(Y_{i,j,a,b}) = (k/n)^4 and P(Z_a) \leq \exp(-k/4) (by the Chernoff bound). It thus suffices to find constants y, z such that

(k/n)^4 \leq y(1-y)^{4nm}(1-z)^2

\exp(-k/4) \leq z(1-y)^{n^2m/2}

We will guess y = \frac{k}{n^2m}, z = \frac{1}{2}, in which case the bottom inequality is approximately \exp(-k/4) \leq \frac{1}{2}\exp(-k/2) (which is satisfied for large enough k, and the top inequality is approximately \frac{k^4}{n^4} \leq \frac{k}{4n^2m} \exp(-4k/n), which is satisfied for m \leq \frac{n^2}{4ek^3} (assuming k \leq n/4). Hence in particular we can indeed take m = \Omega(n^2/k^3), as claimed.