Target-Oriented Statistical Compression: Sufficiency, Reverse Martingales, and Sequential Monitoring
Abstract
Statistical procedures rarely retain all features of the observed data. A sufficient statistic removes information irrelevant to a parameter; a maximum likelihood estimate compresses an empirical objective into an optimizing point; and a hidden state in a sequential model compresses past observations into a learned representation. This article develops these practices under the unified notion of \emph{target-oriented statistical compression}: a useful summary preserves what matters for an inferential, predictive, or decision-relevant target, rather than every detail of the realized data path. The central object is the conditional target process \(M_n=\E(Z\given\G_n)\), where \(Z\) is the target and \(\G_n=\sigma(T_n)\) is the information retained by the compression map \(T_n\). When \((\G_n)\) is a decreasing filtration, \((M_n)\) is a reverse martingale with limit \(M_\infty=\E(Z\given\G_\infty)\). Exact sufficiency corresponds to lossless compression, while approximate summaries such as penalized estimators, principal components, and neural-network hidden states produce reverse quasi-martingale defects measuring coherence loss across compression levels. The diagnostic \(r_n=|M_n-M_{n-1}|\) is treated as an observable stability proxy, not as an unbiased estimator of the theoretical defect. Boundary degeneracy in sequential binary problems is developed as a central application. Practical boundary claims require joint assessment of boundary closeness, uncertainty control, and trajectory stability. The companion paper \citet{chang2025rm} develops the corresponding stopping procedures, finite-sample bounds, and numerical evidence; the present paper provides the broader theoretical infrastructure and extends the framework to Gaussian, Poisson, and quasi-martingale monitoring problems.
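The reverse-martingale statement in the abstract can be checked in a few lines from the tower property. The derivation below is a sketch in standard notation (\(\mathbb{E}\), \(\mid\), and \(\mathcal{G}\) stand in for the abstract's macros), assuming \(Z\) is integrable and \(\mathcal{G}_{n+1}\subseteq\mathcal{G}_n\) for every \(n\), with \(\mathcal{G}_\infty=\bigcap_n \mathcal{G}_n\):

```latex
\begin{align*}
\mathbb{E}(M_n \mid \mathcal{G}_{n+1})
  &= \mathbb{E}\bigl(\mathbb{E}(Z \mid \mathcal{G}_n) \mid \mathcal{G}_{n+1}\bigr) \\
  &= \mathbb{E}(Z \mid \mathcal{G}_{n+1})
     \qquad \text{(tower property, since } \mathcal{G}_{n+1} \subseteq \mathcal{G}_n\text{)} \\
  &= M_{n+1}.
\end{align*}
% Reverse-martingale convergence then gives
% M_n \to \mathbb{E}(Z \mid \mathcal{G}_\infty) almost surely and in L^1.
```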
Relevance assessment
Medium. The paper is mainly statistical theory, but it explicitly frames sufficiency as information compression and is tagged `cs.IT`; the reverse-martingale and sequential-monitoring angle makes it adjacent to information-theoretic review.
Clear theoretical framing connecting statistical sufficiency, compression, reverse martingales, and sequential monitoring. Structure analysis shows coherent technical machinery and explicit cs.IT adjacency, but the primary contribution appears statistical rather than core information theory. Main practical procedures, finite-sample bounds, and numerical evidence are deferred to a companion paper, reducing urgency for deep review of this paper alone.
Core problem and main method
Core problem
Characterize statistical summaries as target-oriented compression and use that structure to decide when a sequential binary process is credibly near a boundary.
Setting: Sequential inference with a compression map T_n, retained sigma-fields G_n, and conditional target processes M_n under decreasing filtrations; includes binary boundary monitoring plus Gaussian, Poisson, logistic, and quasi-martingale examples.
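A standard textbook instance of this setting, offered here as an illustration rather than as the paper's own example, takes exchangeable 0/1 observations X_1, X_2, ..., target Z = X_1, and retained information G_n generated by (S_n, X_{n+1}, X_{n+2}, ...) with S_n = X_1 + ... + X_n. Then M_n = E(Z|G_n) = S_n/n. The sketch below verifies the reverse-martingale identity E(M_n | S_{n+1} = s) = s/(n+1) in exact arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def conditional_mean_of_previous(s_next: int, n: int) -> Fraction:
    """Return E(S_n / n | S_{n+1} = s_next) for exchangeable 0/1 data.

    Exchangeability gives P(X_{n+1} = 1 | S_{n+1} = s_next) = s_next / (n + 1),
    and S_n = S_{n+1} - X_{n+1}.
    """
    p_last_is_one = Fraction(s_next, n + 1)
    # Value of S_n / n under each possible value of the removed observation.
    value_if_one = Fraction(s_next - 1, n)
    value_if_zero = Fraction(s_next, n)
    return p_last_is_one * value_if_one + (1 - p_last_is_one) * value_if_zero


def check_reverse_martingale(max_n: int) -> bool:
    """Check E(M_n | S_{n+1} = s) == s / (n + 1) for all n <= max_n and all s."""
    for n in range(1, max_n + 1):
        for s_next in range(0, n + 2):
            if conditional_mean_of_previous(s_next, n) != Fraction(s_next, n + 1):
                return False
    return True


if __name__ == "__main__":
    print(check_reverse_martingale(30))
```

Using `Fraction` keeps the check exact, so equality is tested without floating-point tolerance.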
Main method
- Define a compression map T_n, retained sigma-field G_n=sigma(T_n), and conditional target projection M_n=E(Z|G_n), separating the statistic from the martingale object.
- Use a decreasing filtration, including a tail sigma-field construction, so that conditional target projections form a reverse martingale and converge to M_infty.
- Model approximate summaries through a reverse quasi-martingale defect delta_n=E(M_n|G_{n+1})-M_{n+1}, with r_n=|M_n-M_{n-1}| serving as an empirical stability signal.
- Declare practical binary boundary behavior only when B_n<=epsilon, W_n<=w, and r_n<=eta hold together.
- Show that when summaries are exactly sufficient and the implemented stability diagnostic is linked to the zero defect, tau_RM reduces to the two-condition rule.
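The three-condition rule can be sketched in code once concrete choices are made for B_n and W_n, which this summary does not define. The sketch below assumes, purely for illustration, that M_n is the running proportion S_n/n, that B_n = min(M_n, 1 - M_n), and that W_n is the width of a fixed-n Wilson interval. A fixed-n interval is not time-uniform, so this is not a substitute for the stopping procedures and error control developed in the companion paper. All function and parameter names are hypothetical.

```python
import math
from typing import Iterable, Optional


def wilson_width(successes: int, n: int, z: float = 1.96) -> float:
    """Width of the Wilson score interval (assumed stand-in for W_n)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    return 2 * half


def first_boundary_time(
    xs: Iterable[int], eps: float, w: float, eta: float
) -> Optional[int]:
    """First n >= 2 with B_n <= eps, W_n <= w, and r_n <= eta, else None."""
    s = 0
    prev_m = None
    for n, x in enumerate(xs, start=1):
        s += x
        m = s / n  # assumed M_n: running proportion of ones
        if prev_m is not None:
            b = min(m, 1 - m)        # assumed boundary closeness B_n
            width = wilson_width(s, n)  # assumed uncertainty width W_n
            r = abs(m - prev_m)      # stability diagnostic r_n = |M_n - M_{n-1}|
            if b <= eps and width <= w and r <= eta:
                return n
        prev_m = m
    return None


if __name__ == "__main__":
    print(first_boundary_time([0] * 200, eps=0.05, w=0.10, eta=0.01))
```

For an all-zero stream the closeness and stability conditions hold from n = 2 onward, so under these assumed definitions the width condition alone determines the stopping time.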
Key contributions and further reading
Key contributions
- Introduces target-oriented statistical compression as a common language for sufficient statistics, MLEs, penalized estimators, risk scores, and learned hidden states.
- Identifies the conditional target process M_n=E(Z|G_n), rather than the statistic itself, as the object governed by reverse-martingale theory under decreasing retained information.
- Provides a framework for approximate sufficiency via reverse quasi-martingale defects that quantify loss of coherence across compression levels.
- Formulates a three-condition sequential boundary scorecard combining boundary closeness, uncertainty width, and trajectory stability.
- Connects exact sufficiency to lossless compression and states a structural reduction in which the stability screen imposes no additional delay when the defect vanishes under suitable diagnostic linkage.
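To make the defect idea concrete, consider a hypothetical shrinkage summary for exchangeable 0/1 data, S_n/(n + lambda), used in place of the exact conditional mean S_n/n. This example is not taken from the paper; it only illustrates how a penalized summary can fail the reverse-martingale identity. Applying the formula delta_n = E(M_n|G_{n+1}) - M_{n+1} to the implemented summary gives delta_n = -lambda * s / ((n+1)(n+lambda)(n+1+lambda)) when S_{n+1} = s, which vanishes exactly when lambda = 0.

```python
from fractions import Fraction


def shrinkage_summary(s: int, n: int, lam: Fraction) -> Fraction:
    """Hypothetical penalized summary S_n / (n + lam); lam = 0 is the exact mean."""
    return Fraction(s) / (n + lam)


def defect(s_next: int, n: int, lam: Fraction) -> Fraction:
    """delta_n = E(M_n | S_{n+1} = s_next) - M_{n+1} for the shrinkage summary.

    Uses P(X_{n+1} = 1 | S_{n+1} = s_next) = s_next / (n + 1) (exchangeability).
    """
    p_one = Fraction(s_next, n + 1)
    expected_prev = (
        p_one * shrinkage_summary(s_next - 1, n, lam)
        + (1 - p_one) * shrinkage_summary(s_next, n, lam)
    )
    return expected_prev - shrinkage_summary(s_next, n + 1, lam)


def closed_form(s_next: int, n: int, lam: Fraction) -> Fraction:
    """Closed-form defect derived by hand for this specific summary."""
    return -lam * s_next / ((n + 1) * (n + lam) * (n + 1 + lam))


if __name__ == "__main__":
    print(defect(3, 4, Fraction(0)), defect(3, 4, Fraction(2)))
```

The defect shrinks at rate roughly 1/n^2 for fixed lambda in this toy case, so it is summable, which is the kind of behavior a quasi-martingale condition asks for.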
Research questions
- How much of the finite-sample error control and numerical evidence is actually proved or reproduced in this paper versus deferred to the companion paper?
- Is the decreasing-filtration tail construction operationally useful for online monitoring, or mainly a retrospective theoretical device?
- Can the relationship between the theoretical defect delta_n and practical diagnostics r_n be sharpened beyond proxy behavior for learned representations or penalized estimators?
- Are the reported simulation claims reproducible from the supplementary scripts, especially the comparisons with boundary-only, two-condition, SPRT, and CUSUM baselines?
Limitations and uncertainty
Assessment relies on abstract and structure analysis only, not full-paper validation. Novelty may be overstated if the framework is mostly a unifying reinterpretation of known sufficiency and martingale tools. Empirical and procedural support appears to be outside this paper.
References
28 entries
- Albert and Anderson (1984) Albert, A. and Anderson, J. A. (1984). On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1), 1–10. doi:10.1093/biomet/71.1.1
- Björk and Johansson (1996) Björk, T. and Johansson, B. (1996). Parameter estimation and reverse martingales. Stochastic Processes and their Applications, 63(2), 235–263. doi:10.1016/0304-4149(96)00080-4
- Chakraborty and Moodie (2013) Chakraborty, B. and Moodie, E. E. M. (2013). Statistical Methods for Dynamic Treatment Regimes. Springer, New York. doi:10.1007/978-1-4614-7428-9
- Chang (2026) Chang, Y.-c. I. (2026). Practical boundary degeneracy and reverse-martingale limits in sequential binary models. Preprint, arXiv:2605.02274 [stat.ME].
- Clopper and Pearson (1934) Clopper, C. J. and Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4), 404–413. doi:10.1093/biomet/26.4.404
- Doob (1953) Doob, J. L. (1953). Stochastic Processes. John Wiley & Sons, New York.
- Durrett (2019) Durrett, R. (2019). Probability: Theory and Examples (5th ed.). Cambridge University Press, Cambridge. doi:10.1017/9781108591034
- Firth (1993) Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika, 80(1), 27–38. doi:10.1093/biomet/80.1.27
- Fisher (1922) Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society A, 222, 309–368.
- Fong et al. (2023) Fong, E., Holmes, C., and Walker, S. G. (2023). Martingale posterior distributions. Journal of the Royal Statistical Society: Series B, 85(5), 1357–1391. doi:10.1093/jrsssb/qkad005
- Gelman et al. (2008) Gelman, A., Jakulin, A., Pittau, M. G., and Su, Y.-S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360–1383. doi:10.1214/08-AOAS191
- Goodfellow et al. (2016) Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press, Cambridge, MA.
- Heinze and Schemper (2002) Heinze, G. and Schemper, M. (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine, 21(16), 2409–2419. doi:10.1002/sim.1047
- Hochreiter and Schmidhuber (1997) Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
- Howard et al. (2021) Howard, S. R., Ramdas, A., McAuliffe, J., and Sekhon, J. (2021). Time-uniform, nonparametric, nonasymptotic confidence sequences. The Annals of Statistics, 49(2), 1055–1080. doi:10.1214/20-AOS1991
- Kallenberg (2002) Kallenberg, O. (2002). Foundations of Modern Probability (2nd ed.). Springer, New York.
- Lehmann and Casella (1998) Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation (2nd ed.). Springer, New York.
- Murphy (2003) Murphy, S. A. (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society: Series B, 65(2), 331–355. doi:10.1111/1467-9868.00389
- NCHS (2020) National Center for Health Statistics (2020). NHANES 2017–2018: Laboratory Procedures Manual. National Center for Health Statistics, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Hyattsville, MD. https://wwwn.cdc.gov/Nchs/Nhanes/2017-2018/PBCD_J.htm
- Robins (2004) Robins, J. M. (2004). Optimal structural nested models for optimal sequential decisions. In D. Y. Lin and P. J. Heagerty (eds.), Proceedings of the Second Seattle Symposium in Biostatistics, Lecture Notes in Statistics, vol. 179, pp. 189–326. Springer, New York. doi:10.1007/978-1-4419-9076-1_11
- Robbins (1970) Robbins, H. (1970). Statistical methods related to the law of the iterated logarithm. The Annals of Mathematical Statistics, 41(5), 1397–1409. doi:10.1214/aoms/1177696786
- Siegmund (1985) Siegmund, D. (1985). Sequential Analysis: Tests and Confidence Intervals. Springer, New York. doi:10.1007/978-1-4613-9549-7
- Ville (1939) Ville, J. (1939). Étude Critique de la Notion de Collectif. Gauthier-Villars, Paris.
- Wald (1945) Wald, A. (1945). Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2), 117–186. doi:10.1214/aoms/1177731118
- Wald (1947) Wald, A. (1947). Sequential Analysis. John Wiley & Sons, New York.
- Wald and Wolfowitz (1948) Wald, A. and Wolfowitz, J. (1948). Optimum character of the sequential probability ratio test. The Annals of Mathematical Statistics, 19(3), 326–339. doi:10.1214/aoms/1177730197
- Waudby-Smith and Ramdas (2023) Waudby-Smith, I. and Ramdas, A. (2023). Estimating means of bounded random variables by betting. Journal of the Royal Statistical Society: Series B, 85(1), 1–26. doi:10.1093/jrsssb/qkac007
- Williams (1991) Williams, D. (1991). Probability with Martingales. Cambridge University Press, Cambridge.