Using Set Shaping Theory to Trade RAM Accesses for CPU Computation
摘要
This paper studies Set Shaping Theory (SST) in a database-index setting under a revised interpretation: SST is not treated as a competing hashing method, but as a structural pre processing layer that can be applied before an existing indexing algorithm. The experimental question is therefore whether a method improves when it is used with SST rather than with out it. The study compares linear probing, double hashing, quadratic probing, and Robin Hood hashing against their corresponding SST-augmented variants for shaping orders K = 2,4,8. Beyond mean time, the benchmark reports mean successful probes, 95th and 99th percentile probes, collisions per stored record, and maxi mum cluster length. Experiments cover load factors from 0.75 to 0.95, database sizes from M =5000 to M =500000, query multipliers up to 200 lookups per stored record, and both uniform and hotspot query distributions. The results highlight two fundamental advantages. First, SST reduces the number of RAM accesses required during retrieval. By prevent ing clusters and long probe chains from forming at insertion time, the lookup phase requires fewer memory jumps, lower probe counts, and reduced tail latency. Second, the method introduces a new way of thinking about data storage: the data are not treated as fixed objects that must be placed passively into a table, but as reversible representations that can be struc turally adapted before being written. A small metadata tag records which transformation was selected, allowing the original key to remain recoverable and the lookup process to remain deterministic.This article is connected to the Set Shaping Theory simulator project, available online at https://sst-simulator.github.io/Set-Shaping-Theory-Simulator/ where it is possible to simulate part of the results presented in the article.
相关性判断
lowThe paper is about hash-table indexing and database performance optimization, not information theory, coding theory, or communications. It is only loosely adjacent through the cs.IT category and structural transformation ideas.
尚未形成判断。
核心问题与主要方法
核心问题
尚未提取。
主要方法
尚未提取。
关键贡献与后续阅读
关键贡献
尚未提取。
研究启发
尚未提取。
参考文献
10 条- [1] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal , vol. 27, pp. 379–423 and 623–656, 1948.
- [2] T. M. Cover and J. A. Thomas, Elements of Information Theory , 2nd ed. Wiley, 2006.
- [3] D. E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching , 2nd ed. Addison-Wesley, 1998.
- [4] R. Ramakrishnan and J. Gehrke, Database Management Systems , 3rd ed. McGraw-Hill, 2003.
- [5] P. Celis, P.-A. Larson, and J. I. Munro, “Robin Hood hashing (preliminary report),” in Proceedings of the 26th Annual Symposium on Foundations of Computer Science , 1985, pp. 281–288.
- [6] R. Pagh and F. F. Rodler, “Cuckoo hashing,” Journal of Algorithms , vol. 51, no. 2, pp. 122–144, 2004.
- [7] S. Kozlov, “Introduction to Set Shaping Theory,” arXiv:2111.08369, 2021.
- [8] C. Schmidt, A. Vdberg, and A. Petit, “Practical applications of Set Shaping Theory in Huffman coding,” arXiv:2208.13020, 2022.
- [9] S. Biereagu, “Introducing the role of shaping order K K in Set Shaping Theory,” AfricArXiv, 2023.
- [10] A. Koch, A. Petit, C. Schmidt, A. Vdberg, and L. Lewis, “Overcoming the encoding limit N H 0 ( S ) NH_{0}(S) using Set Shaping Theory,” arXiv:2310.12732, 2023.
底部评论
0 条根评论,可继续回复叠楼