本科生推翻了一个有 40 年历史的数据科学猜想 |《量子杂志》

发布于 2 月 10 日

Main points:
- Andrew Krapivin, an undergraduate at Rutgers University, was inspired by a 2021 paper and rethought a common computer science tool.
- He developed a new kind of hash table that works faster than expected, disproving a 40-year-old conjecture.
- Krapivin, along with Martín Farach-Colton and William Kuszmaul, demonstrated this in a 2025 paper.
- Hash tables are widely used but have open questions about their workings.
- The new hash table has a worst-case query and insertion time proportional to (log x)², faster than the previously believed x.
- It also showed that non-greedy hash tables can have a constant average query time regardless of fullness, contrary to a 1985 result.
Key information:
- The 2021 paper was "Tiny Pointers" which led to Krapivin's work.
- Krapivin was initially unaware of Yao's conjecture.
- Yao asserted that for certain hash tables, the best way to find elements is random probing and the worst-case time is x.
- The new hash table doesn't rely on uniform probing and has faster performance.
- The team's results may not have immediate applications but are important for understanding data structures.
Important details:
- Krapivin's efforts started when he set aside time to go through the "Tiny Pointers" paper.
- Martín Farach-Colton was initially skeptical but asked William Kuszmaul to check the invention.
- Hash tables have been around since the early 1950s and are widely used for simplicity.
- The measure of hash table fullness is described by x, where 100 means 99% full and 1000 means 99.9% full.
- The counterexample for the average query time of non-greedy hash tables is a significant finding.

阅读 13