I realize this is long, but hopefully I think it may be worth the reading for people interested in combinatorics and it might prove important to Covid-19 testing. Slightly reduced in edit.
The starting point of this question is this important article by Mutesa et al. where a hypercube $\{0,1,2\}^n$ is used for pooling takings for Covid-19 testing. This pooling design is only usable at low prevalence, the main questions are whether it can be improved in its prevalence range and whether one can find good pooling designs usable at higher prevalence.
I have written a draft sketching some possible research directions, and I would like to share here the main point and ask here what seem to me to be the main questions. It might be better to set up a Polymath project, but I don't feel I have the skills (I am not a combinatorician myself) nor the proper network to make it work.
We will model pooled PCR testing for e.g. Covid-19 by a hypergraph, i.e. a pair $(V,E)$ where $V$ is a set (whose elements are called vertices and represent patients) and $E$ is a set of non-empty subsets of $V$ (whose elements are called edges and represent pools). Recall that $v=\lvert V\rvert$ is the order of the hypergraph and $e=\lvert E\rvert$ its size; $v$ is the number of takings analyzed in a batch, and $e$ the number of tests to be run in parallel.
Definition Given a vertex $x\in V$, let $x^*$ be the set of edges containing $x$. Given a subset $X\subset V$ of vertices, let $X^*=\{e\in E \mid \exists x\in X, x\in e\}$ be the set of all edges incident to some element of $X$. Let us define a pooling design as a hypergraph $(V,E)$ satisfying the following property: $$\forall x\in V, \forall X\subset V, \quad x^* = X^* \implies X=\{x\}$$
This condition ensures that, whenever there is at most one positive taking, its uniqueness is guaranteed by the tests and it can be identified.
Given a pooling design $(V,E)$, we define its compression rate $$r=\frac{e}{v}$$ (the smaller, the better), and its detection capacity, i.e. the maximal number of positive taking that can be guaranteed and identified. Formally, letting $\mathcal{P}_{\le n}(V)$ be the set of subsets of $V$ with at most $n$ elements, we set $$c = \max \big\{n\colon \forall X,Y\in \mathcal{P}_{\le n}(V), X^*=Y^*\implies X=Y \big\}.$$ The definition of a pooling design ensures $c\ge 1$, but larger is better.
Proposition. Let $(V,E)$ be a pooling design of order $v$, size $e$ and detection capacity $c$. Then the compression rate satisfies $$r \ge H\big(\frac{c}{v}\big) - o_{v\to\infty}(1) $$
The proof is straightfoward, and sketched in the draft.
Example 1. The individual testing consist in taking $V$ the set of all $N$ takings, and $E=\big\{\{x\} \colon x\in V\big\}$: each edge is a single vertex. We call this the trivial pooling design of order $v$; it has \begin{align*} v &= e = N & r &= 1 & c &= N \end{align*}
Example 2. The hypercube design of (Mutesa et al. 2020) with dimension $D\ge2$ consist in taking $V=\{1,2,3\}^D$ and $E$ the set of coordinate slices, i.e. $$E=\bigcup_{k=1}^D \big\{\{1,2,3\}^{k-1}\times \{i\}\times\{1,2,3\}^{D-k} \colon i\in\{1,2,3\}\big\}.$$ It has \begin{align*} v &= 3^D & e &= 3D & r &= \frac{D}{3^{D-1}} & c &= 1 \end{align*}
Comparing $H(c/v)$ and the actual compression rate for the hypercube design with various values of $D$ show some limited room for improvement (see the draft): the hypercube is off by only a factor less than $2$; these pooling designs are thus not too far from optimal in their prevalence regime.
Example 3. The complete quadrilateral can be described with $V=\{1,2,3,4,5,6\}$ and $E=\big\{ \{1,2,3\}, \{3,4,5\}, \{5,6,2\}, \{1,4,6\} \big\}$. It has \begin{align*} v &= 6 & e &= 4 & r &= \frac23 & c &= 1 \end{align*} For comparison, we note that $H(c/v) \simeq 0.65$, very close to the compression rate: this pooling design is close to optimal in its prevalence regime.
Other examples from incidence geometry are given in the draft.
Example 4. Let $p$ be a prime number (or a primitive number) and $\mathbb{F}_p$ be the Field with $p$ elements. Choose a dimension $D\ge 2$ and a parameter $k\ge D$. We set $V=\mathbb{F}_p^D$ (for $p=3$, we thus have the same vertex set than in the hypercube design). Let $(\phi_1,\dots,\phi_k)$ be linear forms such that any $D$ of them are linearly independent. Without loss of generality, we can assume $(\phi_1,\dots,\phi_D)$ is the canonical dual basis (i.e. $\phi_i(x_1,\dots,x_D) = x_i$). Last, we let $E$ be the set of all levels of all the $\phi_i$: $$ E = \big\{\phi_i^{-1}(y) \colon i\in\{1,\dots, k\}, y\in\mathbb{F}_p \big\}.$$ Let us call the pooling design $(V,E)$ the generalized hybercube design of parameters $(p,D,k)$. It has \begin{align*} v &= p^D & e &= kp & r &= \frac{k}{p^{D-1}} \end{align*} and the remaining question is how large can be $c$.
General Question Which values of $v,r,c$ are realized by a pooling design?
Question 1. Determine $c$ for the generalized hypercube design (it might be that $c$ depends on the specific linear form chosen, although I would bet a low stake that it does not). Given $v_0$, what choice of $p,D,k$ such that $v\simeq v_0$ minimizes $\frac{r}{H(c/v)}$? Given a prevalence, what is the best value of $r$ that can be achieved with a generalized hypercube for which detection capacity is exceeded with probability less than $5\%$?
Question 2. Does there exist pooling designs with $v\gg 1$, $c/v \simeq 1/6$ and compression rate $\simeq2/3$?
Question 3. For small values of $v$, give all pooling designs that are optimal in the sense that no other pooling design with the same order has both better compression rate and better detection capability.
Question 4. Are any of the above question made simpler if we generalize the definitions, and replace the detection capacity $c$ by the set $\mathcal{D}$ of $X\subset V$ such $X^*=Y^* \implies X=Y$ for all $Y\subset V$? (Then the performance of the pooling at prevalence $p$ would be the probability that the set of positives takings is in $\mathcal{D}$, assuming the takings are IID random variables with Bernoulli laws of parameter $p$).