- Caesar cipher in transformers: A Caesar cipher is a reasonable transformation for transformers to learn in their weights. Frontier models can fluently read and write it with common offsets in training data. They can also infer the correct offset on the fly.
- Testing in frontier models: Decoding cipher without test time thinking tokens was tested. Success rate gets worse as offsets get further from zero. Claude-3.7-Sonnet can infer offset in first forward pass.
- Byzantine music notation Unicode block: Many models including Claude and gpt-4o can read and write hidden messages in this range. It's like a Caesar-like cipher with offset 118784 and near-perfect decoding accuracy.
- Reason for working: In some public tokenizers, addition in certain Unicode ranges commutes with addition in token space. A circuit can implement a shifting cipher. The special offset 118784 maps "a" to a specific character.
- GPT-4o and Claude: GPT-4o retains some deciphering capability with an offset one larger, suggesting commutativity of addition in Unicode-token spaces. Claude handles deciphering well, maybe its tokenizer handles binary strings like o200k.
- Oddity: Unusual shifting algorithm works across multiple model families more consistently than regular Caesar cipher, suggesting it may use circuits from other tasks or have a more fundamental reason for usefulness in next-token prediction.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。