Sequential Neural Probabilistic Amplitude Shaping: Learning the Channel's Language

Abstract

We present the first neural probabilistic amplitude shaping (NPAS) scheme that outperforms existing methods while accounting for all implementation losses. It uses a block-less, easily implementable sequential autoregressive encoder that is compatible with arithmetic distribution matching (ADM), which reduces rate loss and raises the achievable information rate (AIR).

Core Problem and Main Method

Core Problem

Optimize probabilistic amplitude shaping (PAS) for nonlinear channels with memory while accounting for implementation rate loss.

Setting: PAS over M-QAM coherent optical transmission with nonlinear fiber memory, using ADM and bit-metric decoding.
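
For orientation, conventional PAS typically targets a Maxwell-Boltzmann distribution on the unsigned amplitudes of each quadrature. The sketch below computes such a distribution and its entropy in Python with NumPy. The 64-QAM amplitude set {1, 3, 5, 7} and the value of `lam` are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def maxwell_boltzmann(amplitudes, lam):
    """Return P(a) proportional to exp(-lam * a^2) over the given amplitudes."""
    a = np.asarray(amplitudes, dtype=float)
    weights = np.exp(-lam * a ** 2)
    return weights / weights.sum()

def entropy_bits(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Assumed example: unsigned amplitudes of one 64-QAM quadrature.
amps = [1, 3, 5, 7]
p = maxwell_boltzmann(amps, lam=0.05)  # lam is an illustrative value

print("P(a):", np.round(p, 4))
print("H(A) in bits per amplitude:", round(entropy_bits(p), 4))
```

Setting `lam=0` recovers the uniform distribution with 2 bits per amplitude; larger `lam` trades entropy for lower average energy.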

Main Method

Rate-loss-aware optimization adds dependency-induced implementation loss to the neural shaping objective, preventing temporal structure from producing AIR gains that are offset by matcher rate loss. The sequential encoder factorizes unsigned-symbol generation autoregressively with fixed memory, applying the same prediction rule at every symbol position to obtain stationary statistics and cross-boundary dependencies. ADM is used as the practical distribution matcher so generated sequences follow the learned joint distribution rather than only a target marginal distribution. Training uses differentiable sampling through Gumbel-Softmax with a straight-through estimator, a differentiable channel approximation, and a mismatched Gaussian demapper producing LLRs. A Maxwell-Boltzmann marginal regularizer balances preservation of favorable marginal shaping against exploitation of temporal dependencies.
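
The sequential generation step described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: `toy_logits` is a hypothetical stand-in for the neural predictor, the amplitude set is assumed to be that of 64-QAM, and only the forward pass of Gumbel-Softmax with a straight-through hard sample is shown, since NumPy has no automatic differentiation.

```python
import numpy as np

AMPLITUDES = np.array([1, 3, 5, 7])  # assumed unsigned levels of one 64-QAM quadrature

def toy_logits(context, lam=0.05, beta=0.002):
    """Hypothetical stand-in for the neural predictor.

    Maps the last mu amplitudes to logits: a Maxwell-Boltzmann-like term
    whose strength grows with the average energy of the recent context.
    """
    recent_energy = float(np.mean(context.astype(float) ** 2)) if len(context) else 0.0
    return -(lam + beta * recent_energy) * AMPLITUDES.astype(float) ** 2

def gumbel_softmax_st(logits, tau, rng):
    """Forward pass of Gumbel-Softmax with a straight-through hard sample.

    Returns (hard one-hot, soft probabilities). In an autodiff framework the
    hard sample is used in the forward pass and gradients flow through `soft`.
    """
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    soft = y / y.sum()
    hard = np.zeros_like(soft)
    hard[np.argmax(soft)] = 1.0
    return hard, soft

def generate(n_symbols, mu, tau=1.0, seed=0):
    """Generate amplitudes one by one, applying the same rule at every position."""
    rng = np.random.default_rng(seed)
    seq = []
    for _ in range(n_symbols):
        # Fixed memory: only the last mu symbols are visible to the predictor.
        context = np.array(seq[-mu:] if mu > 0 else [], dtype=int)
        hard, _ = gumbel_softmax_st(toy_logits(context), tau, rng)
        seq.append(int(AMPLITUDES[np.argmax(hard)]))
    return np.array(seq)

sequence = generate(n_symbols=200, mu=15)
print(sequence[:20])
```

Because the same rule is applied at every position and the context window slides over the sequence, there is no block boundary at which the statistics reset, which is the property the block-less design aims for.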

Key Contributions and Further Reading

Key Contributions

Introduces a neural PAS training objective that explicitly accounts for implementation rate loss induced by learned symbol dependencies. Recasts NPAS as a block-less sequential autoregressive model over unsigned QAM symbols, removing fixed block boundaries and aiming for stationary symbol statistics. Connects learned joint symbol distributions to practical ADM implementation, including empirical comparison of ADM rate loss against a theoretical lower bound. Demonstrates, in the provided optical WDM simulation setting, that rate-aware joint-distribution learning can outperform ESS and sequence-selection baselines in nonlinear launch-power regimes. Shows that ignoring rate loss during neural shaping can create substantial intrinsic loss, while the proposed rate-aware objective reduces that loss.
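
One way to see why learned symbol dependencies cost rate is to compare the marginal entropy of a stationary source with its entropy rate. The Python sketch below does this for a first-order Markov chain over four amplitudes. The transition matrix `P_dep` is an arbitrary illustrative assumption, and the gap computed here is a generic information-theoretic quantity, not necessarily the exact loss term used in the paper's objective.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

def stationary(P):
    """Stationary distribution of a row-stochastic matrix P (left eigenvector for eigenvalue 1)."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

def dependency_gap(P):
    """Return (marginal entropy, entropy rate, gap) in bits per symbol."""
    pi = stationary(P)
    h_marginal = entropy_bits(pi)
    h_rate = float(sum(pi[i] * entropy_bits(P[i]) for i in range(len(pi))))
    return h_marginal, h_rate, h_marginal - h_rate

# Assumed example: the next amplitude depends on the previous one.
P_dep = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.50, 0.30, 0.15, 0.05],
    [0.70, 0.20, 0.08, 0.02],
])

h_marg, h_rate, gap = dependency_gap(P_dep)
print("marginal entropy:", round(h_marg, 4))
print("entropy rate:    ", round(h_rate, 4))
print("gap:             ", round(gap, 4))
```

A matcher that realizes the joint distribution can carry at most the entropy rate in bits per symbol, so an objective that only tracks the marginal entropy would overstate the achievable rate by this gap. For an i.i.d. source the gap is zero.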

Open Research Questions

How sensitive are the gains to the finite memory parameter mu, especially when the channel memory differs from the selected mu=15 setting? Are the reported AIR gains reproducible across longer-haul links, different modulation orders, or different WDM configurations? What is the practical encoder/decoder complexity and latency of Transformer-based Seq-NPAS with ADM compared with optimized ESS and sequence selection? Do the omitted equations in the HTML extraction materially change the interpretation of the rate-loss lower bound or the proposed objective?

Limitations and Uncertainties

Evidence is based on structure analysis only, not full-paper verification. Reported gains may be specific to the particular WDM fiber simulation and channel-memory regime. The primary category is cs.LG, so broader information-theoretic significance may depend on reproducibility and generality beyond optical simulation.
