Project Overview: Over the past months, work has been done on libbzip2-rs, a bzip2-compatible implementation written in 100% Rust. The project is funded by the NLnet Foundation and uses `c2rust` for the initial translation from C to Rust. The post describes using `c2rust` and cleaning up its output.

Cleaning up c2rust output:
- Loops: C `for` loops are translated to Rust `while` loops, because that conversion is always valid. Some loops were cleaned up manually, and in some cases smarter iteration methods were used.
- Types and Casting: The output contains many explicit type annotations and casts. Removing them and changing types is manual, time-consuming work.
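
The cleanup described above can be illustrated with a minimal sketch. It is not taken from libbzip2-rs; the function names `sum_translated` and `sum_cleaned` are hypothetical. The first function mimics the shape of mechanically translated code (a `while` loop with a manual counter, explicit annotations, and casts); the second is a hand-cleaned equivalent that uses an iterator.

```rust
/// Shape of a mechanically translated C `for` loop (illustrative only):
/// a manual counter, a `while` loop, and explicit casts everywhere.
fn sum_translated(data: &[u8]) -> u32 {
    let mut sum: u32 = 0 as u32;
    let mut i: i32 = 0 as i32;
    while i < data.len() as i32 {
        sum = sum.wrapping_add(data[i as usize] as u32);
        i += 1;
    }
    sum
}

/// Hand-cleaned version: the iterator removes the counter and the index
/// casts, and `u32::from` replaces the `as` cast with a lossless conversion.
fn sum_cleaned(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &byte| acc.wrapping_add(u32::from(byte)))
}
```

Both functions return the same value for any input. The `while` form is the safe default for a translator because it can express any C `for` loop, including loops whose body modifies the counter.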
Making it safe:
- Insertion sort example: The generated Rust code for insertion sort was initially unsafe because it relied on pointer arithmetic. Converting the pointer into a Rust slice made the code safe.
- Complex control flow: C constructs like `goto` and `switch` cases that fall through don't translate straightforwardly to Rust. The output can be messy; in some cases better names can be used, but some code may need to be duplicated.
- libc functions: Translating functions that call into libc, like `fopen_output_safely`, is challenging. Conditional compilation is lost, and using libc directly is not ergonomic. A Rust standard library call was used instead, but `fdopen` was still needed.
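
The pointer-to-slice conversion from the insertion sort item can be sketched as follows. This is a generic insertion sort written for illustration, not the code from libbzip2-rs; the names `insertion_sort_raw` and `insertion_sort` are hypothetical. The first version mirrors the pointer arithmetic that a mechanical translation tends to produce, and the second expresses the same algorithm over a slice.

```rust
/// Translated-style version: raw pointer plus length, so every access is
/// unchecked and the function must be `unsafe`.
///
/// Safety: `arr` must point to at least `n` valid, writable `i32` values.
unsafe fn insertion_sort_raw(arr: *mut i32, n: i32) {
    let mut i: i32 = 1;
    while i < n {
        let tmp = *arr.offset(i as isize);
        let mut j = i - 1;
        while j >= 0 && *arr.offset(j as isize) > tmp {
            *arr.offset((j + 1) as isize) = *arr.offset(j as isize);
            j -= 1;
        }
        *arr.offset((j + 1) as isize) = tmp;
        i += 1;
    }
}

/// Safe version: the slice carries its own length, and indexing is
/// bounds-checked, so no `unsafe` is needed.
fn insertion_sort(arr: &mut [i32]) {
    for i in 1..arr.len() {
        let tmp = arr[i];
        let mut j = i;
        // Shift larger elements one position to the right.
        while j > 0 && arr[j - 1] > tmp {
            arr[j] = arr[j - 1];
            j -= 1;
        }
        arr[j] = tmp;
    }
}
```

In the safe version an out-of-range index would panic instead of reading or writing arbitrary memory, which is the property the slice conversion buys.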
Testing:
- Original test suite: It's important to maintain compatibility with the original C code. Porting the test cases later was valuable but not fun.
- Fuzzing: Differential fuzzing is useful, but it requires care because older bzip2 versions also have to be handled.
- Benchmarking: A dashboard was set up to track performance. There are some improvements in compression speed and both improvements and regressions in decompression speed. Benchmarks are somewhat unreliable due to bzip2's memory usage and slowness.
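
The idea behind differential testing is to feed the same input to two implementations and flag any disagreement. The sketch below shows that idea only, under stated assumptions: two toy checksum functions stand in for the C and Rust libraries, a small xorshift generator stands in for a real fuzzer, and every name (`checksum_reference`, `checksum_fast`, `differential_check`) is hypothetical rather than part of libbzip2-rs.

```rust
const POLY: u32 = 0x04c1_1db7;

/// Reference implementation: processes the input one bit at a time.
fn checksum_reference(data: &[u8]) -> u32 {
    let mut crc = 0xffff_ffffu32;
    for &byte in data {
        crc ^= u32::from(byte) << 24;
        for _ in 0..8 {
            crc = if crc & 0x8000_0000 != 0 { (crc << 1) ^ POLY } else { crc << 1 };
        }
    }
    !crc
}

/// Precomputes the per-byte table used by the fast implementation.
fn build_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    for (i, entry) in table.iter_mut().enumerate() {
        let mut c = (i as u32) << 24;
        for _ in 0..8 {
            c = if c & 0x8000_0000 != 0 { (c << 1) ^ POLY } else { c << 1 };
        }
        *entry = c;
    }
    table
}

/// Implementation under test: table-driven, one byte at a time.
fn checksum_fast(data: &[u8]) -> u32 {
    let table = build_table();
    let mut crc = 0xffff_ffffu32;
    for &byte in data {
        let index = ((crc >> 24) as u8 ^ byte) as usize;
        crc = (crc << 8) ^ table[index];
    }
    !crc
}

/// Tiny xorshift generator so the sketch needs no external crates.
fn next_random(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Runs both implementations on pseudo-random inputs and returns the
/// first input on which they disagree, or `None` if they always agree.
fn differential_check(iterations: usize, seed: u64) -> Option<Vec<u8>> {
    let mut state = seed.max(1); // xorshift must not start at zero
    for _ in 0..iterations {
        let len = (next_random(&mut state) % 64) as usize;
        let input: Vec<u8> = (0..len).map(|_| next_random(&mut state) as u8).collect();
        if checksum_reference(&input) != checksum_fast(&input) {
            return Some(input);
        }
    }
    None
}
```

Returning the failing input rather than a boolean makes a mismatch reproducible, which matters when the two sides are large libraries instead of toy functions.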

Conclusion: For the library portion, using `c2rust` was successful. For the binary portion, the cleanup effort was not worth it. `c2rust` is better than manual translation and continues to improve. The libbzip2-rs source code is available on GitHub and can be used with the `libbz2-rs-sys` feature gate.

Support: Trifecta Tech Foundation's Data Compression initiative aims to create memory-safe compression libraries and relies on sponsorships.