Secure Distributed Hypothesis Testing

摘要

In distributed hypothesis testing, a central server performs hypothesis testing based on information received from distributed sensors/clients. We study a secure variant of this problem in which the central server determines the hypothesis class of an underlying distribution without learning any additional information about the distribution itself. We prove that, in its standard form, this is impossible to achieve, even for simple and highly restricted cases. To bypass this impossibility, we augment the model with a shared secret key available to clients but hidden from the server. We show that a single-bit secret key enables perfectly secure testing for simple classes by reducing the test distributions to a symmetric, canonical instance. Finally, for arbitrary hypothesis classes over finite domains, we establish a reduction to standard hypothesis testing using Private Simultaneous Messages (PSM) protocols, achieving polynomial communication and key lengths.

核心问题与主要方法

核心问题

Design distributed hypothesis testing protocols that reveal only the hypothesis class and not extra information about the underlying distribution.

场景：Client-server distributed inference over i.i.d. samples from composite hypotheses on a finite domain, with optional shared secret key hidden from the server.

主要方法

Defines privacy by requiring total-variation closeness of the aggregate message distributions for any two test distributions in the same hypothesis class. Uses a Bernoulli reduction/impossibility argument showing that correctness against the singleton alternative conflicts with privacy between the two same-class Bernoulli distributions in the unkeyed model. Uses shared-key symmetrization: clients flip or keep samples according to a hidden shared bit so same-class distributions induce identical server-visible message distributions while preserving class distinguishability. For broader finite-domain two-vs-one cases, reduces distributions to Bernoulli/canonical instances via channels before applying the one-bit shared-key construction. For arbitrary classes, wraps a standard symmetric hypothesis detector in a Perfectly Private Simultaneous Messages protocol so the server learns the detector output rather than the underlying sample tuple.

关键贡献与后续阅读

关键贡献

Introduces SDHT as a distribution-level privacy model for distributed inference, distinct from protecting individual client samples. Proves an unkeyed impossibility for secure distinguishability of simple Bernoulli composite hypotheses {Ber(p0), Ber(p1)} vs {Ber(p2)} with distinct parameters. Gives a characterization for two-vs-one finite-domain hypotheses without shared key in terms of whether one distribution lies on a convex combination of the other two. Shows that a single hidden shared bit suffices to obtain perfectly private, exponentially accurate SDHT for any two-vs-one finite-domain hypothesis classes. Reduces arbitrary finite-domain SDHT with shared keys to standard hypothesis testing using PSM protocols, yielding polynomial communication and key length for fixed domain size.

研究启发

How tight is the Bernoulli impossibility: does the paper characterize exact privacy-correctness tradeoffs or only non-distinguishability as errors go to zero? Can the polynomial PSM-based construction be made practical for moderate domain sizes, or is it mainly an existence reduction? How does the SDHT privacy notion compare formally with local differential privacy or distribution privacy notions under finite-sample leakage metrics? Are there nontrivial intermediate models with partial key leakage, client collusion, or public correlated randomness that avoid the stated impossibility?

限制与不确定性

Generality may be limited because the single-bit key result only covers simple hypothesis classes. Arbitrary-class construction depends on PSM reductions and may have polynomial overhead that is not practically compelling. Assessment relies on abstract and structure analysis only, not verification of proof strength or comparison to prior work.

参考文献

29 条

[1] Y. Lindell, “Secure multiparty computation,” Commun. ACM , vol. 64, no. 1, p. 86–96, Dec. 2020.
[2] J. N. Tsitsiklis, “Decentralized detection,” in Advances in Statistical Signal Processing , H. V. Poor and J. B. Thomas, Eds., vol. 2. JAI Press, 1993, pp. 297–344.
[3] R. R. Tenney and N. R. Sandell, “Detection with distributed sensors,” IEEE Transactions on Aerospace and Electronic Systems , vol. AES-17, no. 4, pp. 501–510, 1981.
[4] R. Ahlswede and I. Csiszar, “Hypothesis testing with communication constraints,” IEEE Transactions on Information Theory , vol. 32, no. 4, pp. 533–542, 1986.
[5] J. N. Tsitsiklis, “Decentralized detection by a large number of sensors,” Mathematics of Control, Signals and Systems , vol. 1, no. 2, pp. 167–182, Jun 1988.
[6] T. S. Han and S. Amari, “Statistical inference under multiterminal data compression,” IEEE Transactions on Information Theory , vol. 44, no. 6, pp. 2300–2324, 1998.
[7] P. K. Varshney, Distributed Detection and Data Fusion , 1st ed. Berlin, Heidelberg: Springer-Verlag, 1996.
[8] M. S. Rahman and A. B. Wagner, “Optimality of binning for distributed hypothesis testing,” in 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton) , 2010, pp. 828–835.
[9] P. Escamilla, M. Wigger, and A. Zaidi, “Distributed hypothesis testing: cooperation and concurrent detection,” IEEE Transactions on Information Theory , vol. 66, no. 12, pp. 7550–7564, 2020.
[10] U. Hadar, J. Liu, Y. Polyanskiy, and O. Shayevitz, “Error exponents in distributed hypothesis testing of correlations,” in 2019 IEEE International Symposium on Information Theory (ISIT) . IEEE, 2019, pp. 2674–2678.
[11] H. Kazemi, A. Pensia, and J. Varun, “The sample complexity of distributed simple binary hypothesis testing under information constraints,” in The Thirty Eighth Annual Conference on Learning Theory . PMLR, 2025, pp. 3213–3214.
[12] A. Andoni, T. Malkin, and N. S. Nosatzki, “Two party distribution testing: Communication and security,” in International Colloquium on Automata, Languages, and Programming , 2019.
[13] K. R. Sahasranand and H. Tyagi, “Communication complexity of distributed high dimensional correlation testing,” IEEE Transactions on Information Theory , vol. 67, no. 9, pp. 6082–6095, 2021.
[14] J. Acharya, C. Canonne, and H. Tyagi, “Communication-constrained inference and the role of shared randomness,” in International Conference on Machine Learning . PMLR, 2019, pp. 30–39.
[15] U. Feige, J. Killian, and M. Naor, “A minimal model for secure computation (extended abstract),” in Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing , ser. STOC ’94. Association for Computing Machinery, 1994, p. 554–563.
[16] C. Dwork, “Differential privacy,” in Automata, Languages and Programming , M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, Eds. Springer, 2006, pp. 1–12.
[17] ——, “Differential privacy: A survey of results,” in Theory and Applications of Models of Computation , M. Agrawal, D. Du, Z. Duan, and A. Li, Eds. Springer, 2008, pp. 1–19.
[18] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?” SIAM Journal on Computing , vol. 40, no. 3, pp. 793–826, 2011.
[19] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in 2013 IEEE 54th Annual Symposium on Foundations of Computer Science , 2013, pp. 429–438.
[20] J. Acharya, G. Kamath, Z. Sun, and H. Zhang, “Inspectre: Privately estimating the unseen,” in International Conference on Machine Learning . PMLR, 2018, pp. 30–39.
[21] T. Nuradha and Z. Goldfeld, “Pufferfish privacy: An information-theoretic study,” IEEE Transactions on Information Theory , vol. 69, no. 11, pp. 7336–7356, 2023.
[22] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory , vol. 66, no. 3, pp. 1625–1657, 2019.
[23] J. Liao, L. Sankar, F. P. Calmon, and V. Y. F. Tan, “Hypothesis testing under maximal leakage privacy constraints,” in 2017 IEEE International Symposium on Information Theory (ISIT) , 2017, pp. 779–783.
[24] Z. Li, T. J. Oechtering, and D. Gündüz, “Privacy against a hypothesis testing adversary,” IEEE Transactions on Information Forensics and Security , vol. 14, no. 6, pp. 1567–1581, 2019.
[25] S. Agarwal, G. Kamath, M. Majid, A. Mouzakis, R. Silver, and J. Ullman, “Private mean estimation with person-level differential privacy,” in Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) . SIAM, 2025, pp. 2819–2880.
[26] H. Tyagi and S. Watanabe, Information-theoretic Cryptography . Cambridge University Press, 2023.
[27] R. Eriguchi and K. Shinagawa, “Efficient multiparty private simultaneous messages for symmetric functions,” in Advances in Cryptology – EUROCRYPT 2025: 44th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Madrid, Spain, May 4–8, 2025, Proceedings, Part V , 2025, p. 240–269.
[28] Y. Polyanskiy and Y. Wu, “Information theory: From coding to learning.” Cambridge University Press, 2024.
[29] A. Beimel, A. Gabizon, Y. Ishai, E. Kushilevitz, S. Meldgaard, and A. Paskin-Cherniavsky, “Non-interactive secure multiparty computation,” in Advances in Cryptology – CRYPTO 2014 . Springer, 2014, pp. 387–404.