Optimal Rates for Differentially Private Hypothesis Testing with E-values

摘要

E-values have attracted considerable interest in recent years as flexible tools for enabling anytime-valid and adaptive data analysis. Hypothesis testing is at the core of many of these applications, which can often involve private or sensitive data. In this work, we answer a simple but important question: given two distributions $\mathbb{P}$ and $\mathbb{Q}$, what is the maximum achievable e-power when testing $X\sim \mathbb{P}^n$ against $X\sim\mathbb{Q}^n$ with e-values that satisfy $\varepsilon$-differential privacy? We characterize the optimal rate for this problem and provide an algorithm which matches it exactly. In the sequential setting, when observations arrive one-by-one and the analyst chooses when to halt, we give matching upper and lower bounds on the stopping times of any private e-process. Numerical experiments confirm the practicality of our algorithms, which require less data than the recently proposed DP-SPRT across a range of sequential testing problems and privacy levels.

核心问题与主要方法

核心问题

maximize hypothesis-testing power with e-values under differential privacy, in batch and sequential settings

场景：simple testing of $X\sim\mathbb{P}^n$ versus $X\sim\mathbb{Q}^n$ under central $[0m$-DP, with an adaptive sequential extension via private e-processes

主要方法

Decouples privacy from e-value validity by first considering an arbitrary epsilon-DP mechanism and then using the likelihood ratio between M(Q^n) and M(P^n) as the optimal post-processing e-variable. Characterizes the optimal batch rate through an intermediate distribution Q-tilde that bridges P and Q and yields a KL-plus-total-variation expression under central DP constraints. Uses total-variation couplings, shadow datasets, KL chain rules, data processing, joint convexity, and group privacy to prove upper bounds on any private mechanism's e-power. Constructs a clipped likelihood-ratio e-variable with bounded log-sensitivity, then privatizes a transformed aggregate using Laplace noise while compensating for exponentiation bias to preserve the e-variable condition. Reduces sequential private e-process construction to batched releases of bounded-sensitivity e-values, with a calibrated batching schedule trading initial latency for a competitive ratio at later stopping times.

关键贡献与后续阅读

关键贡献

Provides an instance-specific characterization of the maximum e-power achievable by central epsilon-DP e-values for simple hypothesis testing between P and Q. Builds an epsilon-DP e-value for arbitrary distribution pairs that asymptotically matches the optimal batch e-power rate up to lower-order terms. Derives a dual characterization of maximum KL divergence between central DP mechanisms, presented as potentially independently useful. Establishes finite-stopping-time lower bounds for private e-processes, rather than relying only on asymptotic growth-rate metrics that can hide privacy-induced batching latency. Introduces an efficient private e-process construction with a tunable batching schedule that matches the stopping-time lower bound up to a constant factor for sufficiently large stopping times. Provides a distribution-independent bounded e-variable alternative via a truncated/scaled likelihood-ratio statistic, with stated constant-factor guarantees in relevant regimes.

研究启发

What are the exact theorem statements for Theorems 2.2, 2.4, 3.1, and 3.2 after checking the PDF math rendering? How large are the lower-order terms and startup-time thresholds in practical privacy/sample-size regimes? How sensitive is the Algorithm 1 advantage over DP-SPRT beyond the reported Bernoulli experiments and chosen rho=3 setup? Can the distribution-independent statistic match the optimal rate when epsilon=o(1), which the paper leaves open?

限制与不确定性

Structure evidence includes some garbled math tokens, so exact assumptions and parameter regimes may need verification in a full review. Some guarantees appear asymptotic or constant-factor, with noted open questions for distribution-independent optimality in small-epsilon regimes.

参考文献

48 条

[1] J. M. Abowd (2018) The U.S. census bureau adopts differential privacy . In International Conference on Knowledge Discovery & Data Mining , pp. 2867 . External Links: Document , Link Cited by: §1 .
[2] V. Amrhein, S. Greenland, and B. McShane (2019) Scientists rise up against statistical significance . Nature 567 ( 7748 ), pp. 305–307 . Cited by: §1 .
[3] H. Asi, J. C. Duchi, S. Haque, Z. Li, and F. Ruan (2024) Universally instance-optimal mechanisms for private statistical estimation . In Conference on Learning Theory , Proceedings of Machine Learning Research , Vol. 247 , pp. 221–259 . Cited by: §1 .
[4] A. Azize, M. Jourdan, A. A. Marjani, and D. Basu (2026) Differentially private best-arm identification . External Links: 2406.06408 , Link Cited by: §1 .
[5] B. Balle, G. Barthe, and M. Gaboardi (2018) Privacy amplification by subsampling: tight analyses via couplings and divergences . In Conference on Neural Information Processing Systems , Vol. 31 . External Links: Link Cited by: §2.3 .
[6] C. L. Canonne, G. Kamath, A. McMillan, A. Smith, and J. Ullman (2019) The structure of optimal private tests for simple hypotheses . In Symposium on Theory of Computing , pp. 310–321 . External Links: Document Cited by: §1 , §1 , §2.3 .
[7] B. Chugg, S. Cortes-Gomez, B. Wilder, and A. Ramdas (2023) Auditing fairness by betting . In Conference on Neural Information Processing Systems , Vol. 36 , pp. 6070–6091 . External Links: Link Cited by: §1 .
[8] B. Chugg, A. Ramdas, and P. Grünwald (2026) E-values as statistical evidence: a comparison to bayes factors, likelihoods, and p-values . External Links: 2603.24421 , Link , Document Cited by: §1 .
[9] C. Covington, X. He, J. Honaker, and G. Kamath (2025) UNBIASED statistical estimation and valid confidence intervals under differential privacy . Statistica Sinica 35 , pp. 651–670 . Cited by: §1 .
[10] D. Csillag and D. Mesquita (2026) Differentially private e-values . In The 29th International Conference on Artificial Intelligence and Statistics , External Links: Link , Document Cited by: §1 , §2.4.1 , §3.3 .
[11] R. Cummings, S. Krehbiel, Y. Mei, R. Tuo, and W. Zhang (2018) Differentially private change-point detection . In Conference on Neural Information Processing Systems , Red Hook, NY, USA , pp. 10848–10857 . Cited by: §1 .
[12] D. Donoho and R. Liu (1991-06) Geometrizing rates of convergence, ii . The Annals of Statistics 19 . External Links: Document Cited by: §1 .
[13] J. C. Duchi, M. I. Jordan, and M. J. Wainwright (2013) Local privacy and statistical minimax rates . In Symposium on Foundations of Computer Science , Vol. , pp. 429–438 . External Links: Document Cited by: §2.1 .
[14] C. Dwork and A. Roth (2014) The algorithmic foundations of differential privacy . Found. Trends Theor. Comput. Sci. 9 ( 3-4 ), pp. 211–407 . External Links: Document , Link Cited by: §1 , §4 .
[15] F. Fioretto and P. Van Hentenryck (2025) Differential privacy in artificial intelligence: from theory to practice . Emerald Group Publishing . Cited by: §1 .
[16] T. González, M. D. Rubio, A. Ramdas, and M. Ribero (2026) Sequentially auditing differential privacy . In Conference on Neural Information Processing Systems , External Links: Link Cited by: §1 .
[17] P. Grünwald, R. de Heide, and W. M. Koolen (2020-02) Safe Testing . In 2020 Information Theory and Applications Workshop (ITA) , pp. 1–54 . Note: ISSN: 2642-7338 External Links: ISSN 2642-7338 , Link , Document Cited by: §1 .
[18] S. Huang and D. Goretzko (2026) Controlling the false discovery rate in dif detection with e-values: evidence from multidimensional and testlet simulations . Educational and Psychological Measurement , pp. 00131644261433236 . Cited by: §1 .
[19] P. J. Huber and V. Strassen (1973) Minimax tests and the neyman-pearson lemma for capacities . The Annals of Statistics 1 ( 2 ), pp. 251–263 . External Links: ISSN 00905364, 21688966 , Link Cited by: §1 .
[20] P. Kairouz, S. Oh, and P. Viswanath (2016-01) Extremal mechanisms for local differential privacy . Journal of Machine Learning Research 17 ( 1 ), pp. 492–542 . External Links: ISSN 1532-4435 Cited by: §2.1 .
[21] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith (2011) What can we learn privately? . SIAM Journal on Computing 40 ( 3 ), pp. 793–826 . External Links: Document Cited by: §1 .
[22] Z. Kazan, K. Shi, A. Groce, and A. Bray (2023) The test of tests: a framework for differentially private hypothesis testing . In International Conference on Machine Learning , ICML’23 . Cited by: §1 .
[23] D. Martinez-Taboada, T. González, and A. Ramdas (2026) Vector-valued self-normalized concentration inequalities beyond sub-gaussianity . In International Conference on Algorithmic Learning Theory , External Links: Link Cited by: §1 .
[24] A. McMillan, A. Smith, and J. Ullman (2022) Instance-optimal differentially private estimation . External Links: 2210.15819 , Link , Document Cited by: §1 .
[25] T. Michel, D. Basu, and E. Kaufmann (2025) DP-SPRT: differentially private sequential probability ratio tests . CoRR abs/2508.06377 . External Links: Link , Document , 2508.06377 Cited by: 4th item , §1 , Figure 2 , §4 , §4 .
[26] M. Mitzenmacher and E. Upfal (2005) Probability and computing: randomized algorithms and probabilistic analysis . Cambridge University Press . External Links: Document , Link , ISBN 978-0-521-83540-4 Cited by: Appendix A , §3 .
[27] K. E. Nikolakakis, D. S. Kalogerias, O. Sheffet, and A. D. Sarwate (2021) Quantile multi-armed bandits: optimal best-arm identification and a differentially private scheme . IEEE J. Sel. Areas Inf. Theory 2 ( 2 ), pp. 534–548 . External Links: 2006.06792 , Link , Document Cited by: §1 .
[28] Y. Polyanskiy and Y. Wu (2025) Information theory: from coding to learning . Cambridge university press . Cited by: Appendix A .
[29] A. Ramdas, P. Grünwald, V. Vovk, and G. Shafer (2023) Game-Theoretic Statistics and Safe Anytime-Valid Inference . Statistical Science 38 ( 4 ), pp. 576 – 601 . External Links: Document , Link Cited by: §1 , §1 .
[30] A. Ramdas, J. Ruf, M. Larsson, and W. Koolen (2022-11) Admissible anytime-valid sequential inference must rely on nonnegative martingales . arXiv ( en ). Note: arXiv:2009.03167 [math] External Links: Link , Document Cited by: §1 .