Statically Cross-Compiling 10,000+ Rust CLI Crates

  • Research Team and AI Tools: The research team, not native English speakers, used AI translation tools to enhance language clarity.
  • Pkgforge and Prebuilt Binaries: Pkgforge hosts the world's largest collection of prebuilt, static binaries that work everywhere without dependencies. Their main repos include hand-picked packages maintained by the team. They had the idea of automatically harvesting CLI tools from ecosystems like Rust's crates.io, building them as static binaries, and making them available.
  • Ingesting Crates.io: Crates.io provides API access for individual crate lookups and bulk operations. The team's script initially iterated through the first 1,000 pages at 100 crates per page, collecting approximately 111,000 crates, but querying each crate individually became a bottleneck. Fortunately, RFC-3463 provided database dumps at static.crates.io/db-dump.tar.gz, so they drafted a CLI to parse the dump and automate the process.
  • Crate Selection: With over 111,000 crates to choose from, they filtered for what they wanted to build: crates in the command-line-utilities category, with a [[bin]] section in the manifest, and updated within the last year. That left ~10,000 crates to compile.
  • Build Constraints: To produce portable, optimized, statically linked binaries, they applied a comprehensive set of build constraints: statically linked (-C target-feature=+crt-static), self-contained (-C link-self-contained=yes), built with all features enabled (--all-features), link-time optimized (-C lto=yes), fully optimized (-C opt-level=3), stripped (-C debuginfo=none -C strip=symbols), and free of system library dependencies.
  • Build Tool: With over 10,000 crates to build, they chose cross-rs/cross as it supported all the targets needed and worked out of the box. They also used jpeddicord/askalono to automatically detect and copy over licenses.
  • Build Targets: Although Soar, Pkgforge's package manager, supports any Unix-based distro, GitHub runners lack CI support for other Unix kernels, so the builds are limited to Linux. The team also refined the target matrix by excluding architectures approaching end-of-life.
  • Build Security: They ensured the build process was secure by downloading crates from crates.io, running CI/CD on GitHub Actions with temporary, scoped tokens per package, generating and verifying checksums for each artifact, and creating and updating artifact attestation and build provenance.
  • Build Workflow: With 10,000 crates multiplied by 4 targets, they needed to run ~40,000 CI instances and handle metadata, sanity checks, and uploads to GHCR (GitHub Container Registry). They also set up a Discord webhook to stream real-time progress updates.
  • Build Success vs. Failure: Out of approximately 10,000 crates queued for building, 5,779 were built successfully and 4,254 failed. The majority of failures were due to system library dependencies and custom build systems. This reinforces the strategy of targeting CLI tools that can be fully statically linked.
  • Crates vs Executables: Many crates produce multiple executables. The ~5,800 crates that built successfully yielded ~21,000 individual executables, which shows how rich the Rust CLI ecosystem is.
  • Native vs Cross: The consistent success rates across architectures demonstrate Rust's excellent cross-platform story, though newer architectures like loongarch64 show slightly lower compatibility rates. Some crates successfully build for non-standard targets but fail for standard architectures due to build hooks and scripts.
  • CI Performance Metrics: The average build time was ~2 minutes.
  • Compilation vs. Prebuilt Distribution: Compilation is slower than fetching prebuilt binaries. cargo-binstall and cargo-quickinstall are excellent tools for speeding up cargo install. Soar takes a different approach, distributing static executables aimed at end users.
  • Conclusion: The Rust CLI ecosystem is rich and diverse. Cross-compilation compatibility has room for improvement, and static linking is both powerful and challenging. This project demonstrates that automated, large-scale binary distribution is feasible and can provide significant value to the developer community.
  • Future Roadmap: Likely additions to the pkgforge-cargo project include automated updates, integration with Cargo, build optimizations, upstream contributions, and continued attention to community feedback, all aimed at improving the project and expanding its reach.
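
The crate selection rule and the build constraints summarized above can be sketched in a few lines of Rust. This is only an illustration, not the team's actual tooling: the `CrateMeta` struct, its fields, the sample crate names, and the 365-day reading of "updated within the last year" are all assumptions made for the example. Only the category name and the compiler flags come from the summary itself.

```rust
/// Hypothetical, simplified view of one crate's metadata (not the real db-dump schema).
#[derive(Debug, Clone)]
struct CrateMeta {
    name: String,
    categories: Vec<String>,
    has_bin_target: bool,   // true if the manifest declares a [[bin]] section
    days_since_update: u32, // assumed stand-in for "updated within the last year"
}

/// Selection rule from the summary: CLI category, a [[bin]] target, updated within a year.
fn is_candidate(krate: &CrateMeta) -> bool {
    krate.categories.iter().any(|c| c == "command-line-utilities")
        && krate.has_bin_target
        && krate.days_since_update <= 365
}

/// The rustc codegen flags listed in the summary, joined into one RUSTFLAGS string.
/// `--all-features` is a cargo flag, so it is passed to the build command instead.
fn static_rustflags() -> String {
    [
        "-C target-feature=+crt-static",
        "-C link-self-contained=yes",
        "-C lto=yes",
        "-C opt-level=3",
        "-C debuginfo=none",
        "-C strip=symbols",
    ]
    .join(" ")
}

/// Made-up sample data: only the first entry satisfies all three constraints.
fn sample_crates() -> Vec<CrateMeta> {
    let mk = |name: &str, category: &str, has_bin: bool, days: u32| CrateMeta {
        name: name.to_string(),
        categories: vec![category.to_string()],
        has_bin_target: has_bin,
        days_since_update: days,
    };
    vec![
        mk("demo-cli", "command-line-utilities", true, 30),
        mk("demo-lib", "command-line-utilities", false, 10),
        mk("old-cli", "command-line-utilities", true, 900),
        mk("other-tool", "parsing", true, 5),
    ]
}

fn main() {
    let flags = static_rustflags();
    for krate in sample_crates().into_iter().filter(is_candidate) {
        println!("build {} with RUSTFLAGS=\"{}\" and --all-features", krate.name, flags);
    }
}
```

Keeping the filter as a pure function over metadata makes it easy to check against a handful of known crates before queuing thousands of CI jobs.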